Louis Dionne [Tue, 15 Jun 2021 22:47:38 +0000 (18:47 -0400)]
[runtimes] Always build libc++, libc++abi and libunwind with -fPIC
Building the libraries with -fPIC ensures that we can link an executable
against the static libraries with -fPIE. Furthermore, there appears to be
little to no downside to building the libraries with position-independent
code, since modern toolchains are sufficiently clever.
This commit enforces that we always build the runtime libraries with -fPIC.
This is another take on D104327, which instead leaves the decision of whether
to build with -fPIC to the build script that drives the runtimes'
build.
Fixes http://llvm.org/PR43604.
Differential Revision: https://reviews.llvm.org/D104328
Louis Dionne [Thu, 15 Jul 2021 22:06:18 +0000 (18:06 -0400)]
[libc++] CI: Run -std=c++03 on Clang ToT
Differential Revision: https://reviews.llvm.org/D106104
Craig Topper [Tue, 27 Jul 2021 16:48:18 +0000 (09:48 -0700)]
[RISCV] Select vector shl by 1 to a vector add.
A vector add may be faster than a vector shift.
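
The equivalence behind this selection can be sketched in plain C++. This is only an illustration on a 4-lane array of unsigned integers, not the RISC-V backend code; the names `Vec4`, `shlByOne` and `addSelf` are made up for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-lane "vector" used only for this illustration.
using Vec4 = std::array<uint32_t, 4>;

// Lane-wise shift left by one.
Vec4 shlByOne(const Vec4 &v) {
  Vec4 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = v[i] << 1;
  return r;
}

// Lane-wise add of the vector to itself.
Vec4 addSelf(const Vec4 &v) {
  Vec4 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = v[i] + v[i];
  return r;
}
```

Both functions wrap modulo 2^32 in the same way, so they agree on every lane, including lanes whose top bit is set.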
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D106689
Melanie Blower [Tue, 27 Jul 2021 17:51:31 +0000 (13:51 -0400)]
[clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp-contract=on
Change -ffp-model=precise to enable -ffp-contract=on (previously
-ffp-model=precise enabled -ffp-contract=fast). This is a follow-up
to Andy Kaylor's comments in the llvm-dev discussion "Floating Point
semantic modes". From the same email thread, I put Andy's distillation
of floating point options and floating point modes into UsersManual.rst
Also fixes bugs.llvm.org/show_bug.cgi?id=50222
I had to revert this a few times because of failures on the x86-64
buildbot but I think we finally have that fixed by LNT/79f2b03c51.
Reviewed By: rjmccall, andrew.kaylor
Differential Revision: https://reviews.llvm.org/D74436
David Green [Tue, 27 Jul 2021 17:48:58 +0000 (18:48 +0100)]
[AArch64] Update and expand min-max cost model test. NFC
This expands the cost model test for min/max to many more types,
including floating point minnum/maxnum and minimum/maximum, and FP16
with and without fullfp16. The old llc run lines are removed, as those
are better tested by CodeGen tests.
Andy Kaylor [Tue, 27 Jul 2021 17:09:30 +0000 (10:09 -0700)]
Enabling the copy-constant-to-alloca optimization in more instances
Patch by Mohammad Fawaz
This patch allows lifetime calls to be ignored (and later erased) if we
know that the copy-constant-to-alloca optimization is going to happen.
The case that is missed is when the global variable is in a different address
space than the alloca (as shown in the example added to the lit test.)
This used to work before https://github.com/llvm/llvm-project/commit/6da31fa4a61d68af21dfa1e144e726ed6d77903e
Differential Revision: https://reviews.llvm.org/D106573
David Sherwood [Fri, 23 Jul 2021 09:52:53 +0000 (10:52 +0100)]
[LoopVectorize] Don't interleave scalar ordered reductions for inner loops
Consider the following loop:
void foo(float *dst, float *src, int N) {
for (int i = 0; i < N; i++) {
dst[i] = 0.0;
for (int j = 0; j < N; j++) {
dst[i] += src[(i * N) + j];
}
}
}
When we are not building with -Ofast we may attempt to vectorise the
inner loop using ordered reductions instead. In addition we also try
to select an appropriate interleave count for the inner loop. However,
when choosing a VF=1 the inner loop will be scalar and there is existing
code in selectInterleaveCount that limits the interleave count to 2
for reductions due to concerns about increasing the critical path.
For ordered reductions this problem is even worse due to the additional
data dependency, and so I've added code to simply disable interleaving
for scalar ordered reductions for now.
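
As a rough illustration of why an ordered reduction forms a single dependency chain, the sketch below compares a strict in-order float sum with a version split across two independent accumulators. It is not the vectorizer's code; `sumInOrder` and `sumInterleaved2` are hypothetical names, and the second function shows the reassociation that is only legal under fast-math style flags.

```cpp
#include <cassert>
#include <cstddef>

// Strict in-order sum: one accumulator, every add depends on the previous one.
float sumInOrder(const float *src, std::size_t n) {
  float acc = 0.0f;
  for (std::size_t j = 0; j < n; ++j)
    acc += src[j];
  return acc;
}

// Two independent accumulators combined at the end. This shortens the
// dependency chain but changes the order of the floating-point additions.
float sumInterleaved2(const float *src, std::size_t n) {
  float acc0 = 0.0f, acc1 = 0.0f;
  std::size_t j = 0;
  for (; j + 1 < n; j += 2) {
    acc0 += src[j];
    acc1 += src[j + 1];
  }
  if (j < n) // odd element count: fold the last element into acc0
    acc0 += src[j];
  return acc0 + acc1;
}
```

With IEEE single-precision evaluation, the input {1e8f, 1.0f, -1e8f, 1.0f} gives 1 for the in-order sum and 2 for the two-accumulator version, because 1e8f + 1.0f rounds back to 1e8f. An ordered reduction has to keep the first behaviour, so extra interleaved copies cannot run independently.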
Test added here:
Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll
Differential Revision: https://reviews.llvm.org/D106646
Anna Thomas [Tue, 27 Jul 2021 16:34:12 +0000 (12:34 -0400)]
Update reduction test. Remove standalone test file
Based on post commit review comments at
68ffed12b.
Eugene Zhulenev [Tue, 27 Jul 2021 16:17:31 +0000 (09:17 -0700)]
[mlir] Math: add algebraic simplification patterns to math transforms
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D106822
Matt Arsenault [Wed, 21 Jul 2021 20:50:49 +0000 (16:50 -0400)]
AMDGPU: Update tests for lower i1 change
I forgot to squash the test updates for
b32d3d9e81cdd9275d19cd2a396c461edc9e7189
Stella Laurenzo [Thu, 22 Jul 2021 19:57:41 +0000 (19:57 +0000)]
Re-engineer MLIR python build support.
* Implements all of the discussed features:
- Links against common CAPI libraries that are self contained.
- Stops using the 'python/' directory at the root for everything, opening the namespace up for multiple projects to embed the MLIR python API.
- Separates declaration of sources (py and C++) needed to build the extension from building, allowing external projects to build custom assemblies from core parts of the API.
- Makes the core python API relocatable (i.e. it could be embedded as something like 'npcomp.ir', 'npcomp.dialects', etc). Still a bit more to do to make it truly isolated but the main structural reset is done.
- When building statically, installed python packages are completely self contained, suitable for direct setup and upload to PyPi, et al.
- Lets external projects assemble their own CAPI common runtime library that all extensions use. No more possibilities for TypeID issues.
- Begins modularizing the API so that external projects that just include a piece pay only for what they use.
* I also rolled in a re-organization of the native libraries that matches how I was packaging these out of tree and is a better layering (i.e. all libraries go into a nested _mlir_libs package). There is some further cleanup that I resisted since it would have required source changes that I'd rather do in a followup once everything stabilizes.
* Note that I made a somewhat odd choice in choosing to recompile all extensions for each project they are included into (as opposed to compiling once and just linking). While not leveraged yet, this will let us set definitions controlling the namespacing of the extensions so that they can be made to not conflict across projects (with preprocessor definitions).
* This will be a relatively substantial breaking change for downstreams. I will handle the npcomp migration and will coordinate with the circt folks before landing. We should stage this and make sure it isn't causing problems before landing.
* Fixed a couple of absolute imports that were causing issues.
Differential Revision: https://reviews.llvm.org/D106520
Aart Bik [Tue, 27 Jul 2021 00:14:30 +0000 (17:14 -0700)]
[mlir][sparse] fixed bug in verification
The order of testing in two sparse tensor ops was incorrect,
which could cause an invalid cast (crashing the compiler instead
of reporting the error). This revision fixes that bug.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D106841
Matt Arsenault [Tue, 9 Mar 2021 21:48:49 +0000 (16:48 -0500)]
AMDGPU: Treat IMPLICIT_DEF like a constant lanemask source
This is partially a workaround. SILowerI1Copies does not understand
unstructured loops. This would result in inserting instructions to
merge a mask register in the same block where it was defined in an
unstructured loop.
Thomas Lively [Tue, 27 Jul 2021 15:41:29 +0000 (08:41 -0700)]
[WebAssembly] Codegen for extmul SIMD instructions
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.
Differential Revision: https://reviews.llvm.org/D106724
Riccardo Mori [Tue, 27 Jul 2021 15:28:09 +0000 (17:28 +0200)]
Update isl to isl-0.24-69-g54aac5ac
This is needed for having the functions isl_{set,map}_n_basic_{set,map}
exported to the C++ interface.
Some tests have been modified to reflect the isl changes.
Anastasia Stulova [Tue, 27 Jul 2021 15:27:36 +0000 (16:27 +0100)]
[OpenCL] NULL redefined as nullptr in C++ mode.
Redefines NULL as nullptr instead of ((void*)0)
in C++ for OpenCL.
Such internal representation of NULL provides
compatibility with C++11 and later language
standards.
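
A small sketch of the practical difference, written in standard C++ rather than C++ for OpenCL, so it shows only the language rule and not the OpenCL headers. The names `Kind`, `describe`, `pickOverload` and `makeNullIntPtr` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

enum class Kind { Int, Pointer };

// Overload pair used to observe which conversion a null constant takes.
inline Kind describe(int) { return Kind::Int; }
inline Kind describe(void *) { return Kind::Pointer; }

// nullptr has its own type, std::nullptr_t, which converts to any pointer type
// but not to int.
static_assert(std::is_same<decltype(nullptr), std::nullptr_t>::value,
              "nullptr is of type std::nullptr_t");

// Selects describe(void *), since std::nullptr_t cannot convert to int.
inline Kind pickOverload() { return describe(nullptr); }

// Well-formed in C++. By contrast, `int *p = (void *)0;` is ill-formed in C++
// because void * does not implicitly convert to int *.
inline int *makeNullIntPtr() {
  int *p = nullptr;
  return p;
}
```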
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D105987
Hans Wennborg [Tue, 27 Jul 2021 15:22:01 +0000 (17:22 +0200)]
Revert "[clang][pp] adds '#pragma include_instead'"
> `#pragma clang include_instead(<header>)` is a pragma that can be used
> by system headers (and only system headers) to indicate to a tool that
> the file containing said pragma is an implementation-detail header and
> should not be directly included by user code.
>
> The library alternative is very messy code that can be seen in the first
> diff of D106124, and we'd rather avoid that with something more
> universal.
>
> This patch takes the first step by warning a user when they include a
> detail header in their code, and suggests alternative headers that the
> user should include instead. Future work will involve adding a fixit to
> automate the process, as well as cleaning up modules diagnostics to not
> suggest said detail headers. Other tools, such as clangd can also take
> advantage of this pragma to add the correct user headers.
>
> Differential Revision: https://reviews.llvm.org/D106394
This caused compiler crashes in Chromium builds involving PCH and an include
directive with macro expansion, when Token::getLiteralData() returned null. See
the code review for details.
This reverts commit
e8a64e5491260714c79dab65d1aa73245931d314.
Anirudh Prasad [Tue, 27 Jul 2021 15:26:00 +0000 (11:26 -0400)]
[SystemZ][z/OS] Initial code to generate assembly files on z/OS
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target
- Further improvements and additions will be made in future patches.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D106380
Anna Thomas [Tue, 27 Jul 2021 13:48:13 +0000 (09:48 -0400)]
Strip undef implying attributes when moving calls
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.
This patch introduces such an API and uses it in relevant passes. See
updated tests.
Fix for PR50744.
Reviewed By: nikic, jdoerfert, lebedev.ri
Differential Revision: https://reviews.llvm.org/D104641
Tres Popp [Tue, 27 Jul 2021 14:49:15 +0000 (16:49 +0200)]
Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI."
This reverts commit
1cfecf4fc4278afb0005923f6dff595cd372da5c.
This commit broke LLVM code generated through XLA by removing a
conditional on Ld->getExtensionType() == ISD::NON_EXTLOAD
This is not a perfect revert. The new function is left as other uses of
it exist now.
Tres Popp [Tue, 27 Jul 2021 14:47:52 +0000 (16:47 +0200)]
Revert "Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI.""
This reverts commit
d7bbb1230a94cb239aa4a8cb896c45571444675d.
There were follow up uses of a deleted method and I didn't run the
tests. Undo the revert, so I can do it properly.
Tres Popp [Tue, 27 Jul 2021 14:21:10 +0000 (16:21 +0200)]
Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI."
This reverts commit
1cfecf4fc4278afb0005923f6dff595cd372da5c.
This commit broke LLVM code generated through XLA by removing a
conditional on Ld->getExtensionType() == ISD::NON_EXTLOAD
Marek Kurdej [Tue, 27 Jul 2021 14:16:21 +0000 (16:16 +0200)]
[libc++] [c++2b] [P2166] Prohibit string and string_view construction from nullptr.
* https://wg21.link/P2166
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D106801
David Spickett [Wed, 31 Mar 2021 13:57:35 +0000 (14:57 +0100)]
[lldb][AArch64] Add memory tag writing to lldb
This adds memory tag writing to Process and the
GDB remote code. Supporting work for the
"memory tag write" command. (to follow)
Process WriteMemoryTags is similar to ReadMemoryTags.
It will pack the tags then call DoWriteMemoryTags.
That function will send the QMemTags packet to the gdb-remote.
The QMemTags packet follows the GDB specification in:
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
Note that lldb-server will be treating partial writes as
complete failures. So lldb doesn't need to handle the partial
write case in any special way.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D105181
Jeremy Morse [Tue, 27 Jul 2021 13:58:49 +0000 (14:58 +0100)]
[DebugInfo][InstrRef] Correctly update DBG_PHIs during instr scheduling
Avoid several crashes when DBG_INSTR_REF and DBG_PHI instructions are fed
to the instruction scheduler. DBG_INSTR_REFs should be treated like
DBG_LABELs, and just ignored for the purpose of scheduling [0].
DBG_PHIs however behave much more like DBG_VALUEs: they refer to register
operands, and if some register defs get shuffled around during instruction
scheduling, there's a risk that the debug instr will refer to the wrong
value. There's already a facility for updating DBG_VALUEs to reflect this;
add DBG_PHI to the list of instructions that it will update.
[0] Suboptimal, but it's what instr scheduling does right now.
Differential Revision: https://reviews.llvm.org/D106663
Vassil Vassilev [Tue, 27 Jul 2021 10:02:13 +0000 (10:02 +0000)]
[clang-repl] Build and install clang-repl by default.
We have the basic infrastructure in place. We can recover from simple errors
(recovering from errors in template instantiations is not yet supported). It
looks like we are in a reasonably functional state for llvm13.
Differential revision: https://reviews.llvm.org/D106813
Louis Dionne [Tue, 27 Jul 2021 14:01:32 +0000 (10:01 -0400)]
[libc++] NFC: Try to trigger Docker image rebuild on CI nodes
Tres Popp [Tue, 27 Jul 2021 13:43:04 +0000 (15:43 +0200)]
Handle unused variable when assertions are disabled
Anna Thomas [Tue, 27 Jul 2021 01:39:18 +0000 (21:39 -0400)]
[IVDescriptors] Fix bug in checkOrderedReduction
The Exit instruction passed in for checking if it's an ordered reduction need not be
an FPAdd operation. We need to bail out at that point instead of
assuming it is an FPAdd (and hence has two operands). See added testcase.
It crashes without the patch because the Exit instruction is a phi with
exactly one operand.
This latent bug was exposed by 95346ba which added support for
multi-exit loops for vectorization.
Reviewed-By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D106843
Chris Jackson [Tue, 27 Jul 2021 12:59:34 +0000 (13:59 +0100)]
[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reapplies commit
76f3ffb2b285998f02639db8fd42fb0de8a540d0 that was
reverted due to buildbot failures.
- Update lit tests with REQUIRES condition.
- Abandon salvage attempt if SCEVUnknown::getValue() returns nullptr.
Differential Revision: https://reviews.llvm.org/D105207
Kadir Cetinkaya [Sun, 25 Jul 2021 18:38:00 +0000 (20:38 +0200)]
Revert "Revert "[clangd] Adjust compile flags to contain only the requested file as input""
This reverts commit
04e21fbc44c145d5599ef8db9aaf66b159107f33.
Kadir Cetinkaya [Mon, 26 Jul 2021 09:20:47 +0000 (11:20 +0200)]
Revert "Revert "[clangd] Canonicalize compile flags before applying edits""
Set driver mode before parsing arglist.
Depends on D106789.
Differential Revision: https://reviews.llvm.org/D106794
Kadir Cetinkaya [Mon, 26 Jul 2021 12:09:36 +0000 (14:09 +0200)]
[clang][Driver] Expose driver mode detection logic
Also use it in other places that performed it on their own.
Differential Revision: https://reviews.llvm.org/D106789
Jeremy Morse [Tue, 27 Jul 2021 12:15:42 +0000 (13:15 +0100)]
[DebugInfo][InstrRef] Handle llvm.frameaddress intrinsics gracefully
When working out which instruction defines a value, the
instruction-referencing variable location code has a few special cases for
physical registers:
* Arguments are never defined by instructions,
* Constant physical registers always read the same value, are never def'd
This patch adds a third case for the llvm.frameaddress intrinsics: you can
read the framepointer in any block if you so choose, and use it as a
variable location, as shown in the added test.
This rather violates one of the assumptions behind instruction referencing,
that LLVM IR shouldn't be able to read from an arbitrary register at some
arbitrary point in the program. The solution for now is to just emit a
DBG_PHI that reads the register value: this works, but if we wanted to do
something clever with DBG_PHIs in the future then this would probably get
in the way. As it stands, this patch avoids a crash.
Differential Revision: https://reviews.llvm.org/D106659
Chris Jackson [Tue, 27 Jul 2021 12:34:20 +0000 (13:34 +0100)]
[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reverts commit
76f3ffb2b285998f02639db8fd42fb0de8a540d0 because
of a failure on sanitizer-X86-64-linux-autoconf.
Sam McCall [Fri, 9 Jul 2021 08:26:44 +0000 (10:26 +0200)]
[clangd] Add platform triple (host & target) to version info
Useful in logs to understand issues around some platforms we don't have much
experience with (e.g. m1, mingw)
Differential Revision: https://reviews.llvm.org/D105681
Andrzej Warzynski [Wed, 21 Jul 2021 08:53:01 +0000 (09:53 +0100)]
[flang][driver] Make `flang` ignore `-Mfree/-Mfixed`
`-Mfixed` is not supported by the new driver and hence
`flang`, the bash wrapper script, forwards it to the host compiler.
The forwarded options are used by the host compiler when compiling the
unparsed files. As the unparsed source files are always in the free
form, forwarding `-Mfixed` is problematic.
With this patch, `-Mfixed` (and `-Mfree` for consistency) will be
ignored altogether. The user will only see a warning. This is not a
particularly sound approach, but `flang` is only a temporary solution
for us and this workaround is a fair compromise.
Differential Revision: https://reviews.llvm.org/D106428
Sam McCall [Tue, 27 Jul 2021 11:52:32 +0000 (13:52 +0200)]
[clangd] Use function pointer instead of function_ref to avoid GCC 5 bug
With GCC <6 constructing a function_ref from a free function reference
leads to it referencing a temporary function pointer. If the lifetime of
that temporary is insufficient it can crash.
Fixes https://github.com/clangd/clangd/issues/800
Chris Jackson [Thu, 22 Jul 2021 08:27:46 +0000 (09:27 +0100)]
[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This patch extends salvaging of debuginfo in the Loop Strength Reduction
(LSR) pass by translating Scalar Evolution (SCEV) expressions into DIExpressions.
The method is as follows:
- Cache dbg.value intrinsics that are salvageable.
- Obtain a loop Induction Variable (IV) from ScalarExpressionExpander or
the loop header.
- Translate the IV SCEV into an expression that recovers the current
loop iteration count. Combine this with the dbg.value's location
op SCEV to create a DIExpression that salvages the value.
Review by: jmorse
Differential Revision: https://reviews.llvm.org/D105207
Raphael Isemann [Tue, 27 Jul 2021 11:58:48 +0000 (13:58 +0200)]
[lldb] Wait in TestGuiBasicDebug for the interface to open before quitting the welcome screen
Speculative fix for the failing lldb-aarch64-ubuntu bot.
Vignesh Balasubramanian [Tue, 27 Jul 2021 10:47:07 +0000 (16:17 +0530)]
Convert the error to warning for enabling OMPD in non-Linux platform
OMPD is enabled by default on Linux machines and disabled on others.
However, if it is explicitly enabled on another platform, configuration fails with an error and exits.
This is reported in https://bugs.llvm.org/show_bug.cgi?id=51121
Instead of raising an error, this patch disables OMPD support with a warning message,
so configuration can continue.
Reviewed By: @protze.joachim
Differential Revision: https://reviews.llvm.org/D106682
Nico Weber [Tue, 27 Jul 2021 11:50:27 +0000 (07:50 -0400)]
[clang/darwin] Pass libclang_rt.profile last on linker command
This reverts the functional change of https://reviews.llvm.org/D35385 because
it sounds like this is no longer necessary
(https://bugs.llvm.org/show_bug.cgi?id=51135#c11) and makes clang's behavior
more uniform across platforms.
Differential Revision: https://reviews.llvm.org/D106733
Chen Zheng [Tue, 27 Jul 2021 11:45:26 +0000 (11:45 +0000)]
[PowerPC] add more testcases for ld_splat; nfc
Fraser Cormack [Tue, 27 Jul 2021 11:04:09 +0000 (12:04 +0100)]
[LangRef][NFC] Fix variable name in llvm.maxnum docs
Simon Pilgrim [Tue, 27 Jul 2021 11:09:10 +0000 (12:09 +0100)]
[X86] Add PR37025 test coverage
David Spickett [Wed, 31 Mar 2021 13:02:34 +0000 (14:02 +0100)]
[lldb][AArch64] Add memory tag writing to lldb-server
This is implemented using the QMemTags packet, as specified
by GDB in:
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
(recall that qMemTags was previously added to read tags)
On receipt of a valid packet lldb-server will:
* align the given address and length to granules
(most of the time lldb will have already done this
but the specification doesn't guarantee it)
* Repeat the supplied tags as many times as needed to cover
the range. (if tags > range we just use as many as needed)
* Call ptrace POKEMTETAGS to write the tags.
The ptrace step will loop just like the tag read does,
until all tags are written or we get an error.
Meaning that if ptrace succeeds it could be a partial write.
So we call it again and if we then get an error, return an error to
lldb.
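
The first two steps above (granule alignment and tag repetition) can be sketched as follows. This is a minimal illustration, not the lldb-server implementation: the helper names `AlignToGranules` and `RepeatTags` are hypothetical, and the 16-byte granule size is an assumption matching AArch64 MTE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Assumed granule size: 16 bytes, as used by AArch64 MTE.
constexpr uint64_t kGranule = 16;

// Expand [addr, addr + len) outward so both ends sit on granule boundaries.
// Returns the aligned start address and the aligned length.
std::pair<uint64_t, uint64_t> AlignToGranules(uint64_t addr, uint64_t len) {
  uint64_t begin = addr & ~(kGranule - 1);
  uint64_t end = (addr + len + kGranule - 1) & ~(kGranule - 1);
  return {begin, end - begin};
}

// Produce exactly one tag per granule in the aligned range: repeat the
// supplied tags if there are too few, and ignore the extras if too many.
std::vector<uint8_t> RepeatTags(const std::vector<uint8_t> &tags,
                                uint64_t aligned_len) {
  std::vector<uint8_t> out;
  if (tags.empty())
    return out;
  std::size_t granules = static_cast<std::size_t>(aligned_len / kGranule);
  out.reserve(granules);
  for (std::size_t i = 0; i < granules; ++i)
    out.push_back(tags[i % tags.size()]);
  return out;
}
```

For example, a request for address 0x1008 and length 0x20 covers three granules starting at 0x1000, so two supplied tags {1, 2} would be written as 1, 2, 1.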
We are not going to attempt to restore tags after a partial
write followed by an error. This matches the behaviour of the
existing memory writes.
The lldb-server tests have been extended to include read and
write in the same test file. With some updated function names
since "qMemTags" vs "QMemTags" isn't very clear when they're
next to each other.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D105180
Sander de Smalen [Mon, 26 Jul 2021 19:54:11 +0000 (20:54 +0100)]
[LV] Disable Scalable VFs when tail folding is enabled b/c of low tripcount.
The loop vectorizer may decide to use tail folding when the trip-count
is low. When that happens, scalable VFs are no longer a candidate,
since tail folding/predication is not yet supported for scalable vectors.
This can be re-enabled in a future patch.
Reviewed By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D106657
Diana Picus [Tue, 27 Jul 2021 10:21:10 +0000 (10:21 +0000)]
[flang] Fix minor style issues. NFC
Diana Picus [Mon, 12 Jul 2021 14:04:34 +0000 (14:04 +0000)]
[flang] Fix thinko in CPU_TIME test
We used to test that end > start, but it can well be >= (otherwise the
loop doesn't make sense).
Jay Foad [Fri, 18 Jun 2021 12:22:11 +0000 (13:22 +0100)]
[GlobalISel] Constant fold G_SITOFP and G_UITOFP in CSEMIRBuilder
Differential Revision: https://reviews.llvm.org/D104528
Benjamin Kramer [Tue, 27 Jul 2021 10:18:54 +0000 (12:18 +0200)]
[mlir] Fix typo s/applyPermuationMap/applyPermutationMap/
Fraser Cormack [Fri, 23 Jul 2021 10:13:08 +0000 (11:13 +0100)]
[SelectionDAG] Support scalable splats in U(ADD|SUB)SAT combines
This patch builds on top of D106575 in which scalable-vector splats were
supported in `ISD::matchBinaryPredicate`. It teaches the DAGCombiner how
to perform a variety of the pre-existing saturating add/sub combines on
scalable-vector types.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106652
Jan Svoboda [Tue, 27 Jul 2021 09:48:25 +0000 (11:48 +0200)]
[clang][tooling] Link LLVMOption to ToolingTests
This fixes a build failure introduced in
11ee699b3c812ebe56ce5d3b14ab7ef16c1e8495.
Dmitry Vyukov [Tue, 27 Jul 2021 09:05:11 +0000 (11:05 +0200)]
Revert "sanitizer_common: split LibIgnore into fast/slow paths"
This reverts commit
1e1f7520279c93a59fa6511028ff40412065985e.
It breaks ignore_noninstrumented_modules=1.
Somehow we did not have any portable tests for this mode before
(only Darwin tests). Add a portable test as well.
Moreover, I think I was too fast uninlining all LibIgnore checks.
For Java, Darwin and OpenMP LibIgnore is always enabled,
so it makes sense to leave it as it was before.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D106855
Hans Wennborg [Mon, 26 Jul 2021 13:46:36 +0000 (15:46 +0200)]
[clang-cl] Expose -fmodules and related flags in the driver (PR43391)
I don't know how well this works with clang-cl, but people want to try
it out, and I think we want to make it work, so exposing the flags seems
reasonable.
Differential revision: https://reviews.llvm.org/D106791
Fraser Cormack [Fri, 23 Jul 2021 10:10:04 +0000 (11:10 +0100)]
[RISCV] Add support for vector saturating add/sub operations
This patch adds support for lowering the saturating vector add/sub
intrinsics to RVV instructions, for both fixed-length and
scalable-vector forms alike.
Note that some of the DAG combines are still not triggering for the
scalable-vector tests. These require a bit more work in the DAGCombiner
itself.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106651
David Green [Tue, 27 Jul 2021 09:11:51 +0000 (10:11 +0100)]
[NFC] Reflow some debug messages.
Jan Svoboda [Tue, 27 Jul 2021 09:09:10 +0000 (11:09 +0200)]
[clang][tooling] Link clangDriver to ToolingTests
This fixes a build failure introduced in
11ee699b3c812ebe56ce5d3b14ab7ef16c1e8495.
Jan Svoboda [Mon, 26 Jul 2021 11:40:43 +0000 (13:40 +0200)]
[clang][tooling] Accept Clang invocations with multiple jobs
When `-fno-integrated-as` is passed to the Clang driver (or set by default by a specific toolchain), it will construct an assembler job in addition to the cc1 job. Similarly, the `-fembed-bitcode` driver flag will create an additional cc1 job that reads an LLVM IR file.
The Clang tooling library only cares about the job that reads a source file. Instead of relying on the fact that the client injected `-fsyntax-only` to the driver invocation to get a single `-cc1` invocation that reads the source file, this patch filters out such jobs from `Compilation` automatically and ignores the rest.
This fixes a test failure in `ClangScanDeps/headerwithname.cpp` and `ClangScanDeps/headerwithnamefollowedbyinclude.cpp` on AIX reported here: https://reviews.llvm.org/D103461#2841918 and `clang-scan-deps` failures with `-fembed-bitcode`.
Depends on D106788.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105695
Cullen Rhodes [Tue, 27 Jul 2021 08:00:49 +0000 (08:00 +0000)]
[AArch64][SME] Add zero instruction
This patch adds the zero instruction for zeroing a list of 64-bit
element ZA tiles. The instruction takes a list of up to eight tiles
ZA0.D-ZA7.D, which must be in order, e.g.
zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d}
zero {za1.d,za3.d,za5.d,za7.d}
The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which
are mapped to corresponding 64-bit element tiles in accordance with the
architecturally defined mapping between different element size tiles,
e.g.
* Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing
all eight 64-bit element tiles ZA0.D to ZA7.D.
* Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D.
The preferred disassembly of this instruction uses the shortest list of
tile names that represent the encoded immediate mask, e.g.
* An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and
ZA5.D is disassembled as {ZA0.S, ZA1.S}.
* An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and
ZA6.D is disassembled as {ZA0.H}.
* An all-ones immediate is disassembled as {ZA}.
* An all-zeros immediate is disassembled as an empty list {}.
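
The tile mapping described above can be sketched as a small helper that turns a tile name into an 8-bit mask of 64-bit element tiles. This is an illustration only, not the assembler's code: `tileCount` and `tileMask` are hypothetical names, and the sketch assumes bit i of the mask stands for ZAi.D.

```cpp
#include <cassert>
#include <cstdint>

// Number of tiles that exist for each element-size suffix:
// b -> 1 (ZA0.B), h -> 2, s -> 4, d -> 8. Returns 0 for an unknown suffix.
unsigned tileCount(char suffix) {
  switch (suffix) {
  case 'b': return 1;
  case 'h': return 2;
  case 's': return 4;
  case 'd': return 8;
  }
  return 0;
}

// Mask of 64-bit element tiles covered by tile ZA<index>.<suffix>.
// Tile n of a kind with `count` tiles covers ZAn.D, ZA(n+count).D, and so on.
uint8_t tileMask(char suffix, unsigned index) {
  unsigned count = tileCount(suffix);
  assert(count != 0 && index < count && "invalid tile name");
  uint8_t mask = 0;
  for (unsigned i = index; i < 8; i += count)
    mask |= static_cast<uint8_t>(1u << i);
  return mask;
}
```

Under that assumption, ZA0.S maps to 0x11 (ZA0.D and ZA4.D), {ZA0.S, ZA1.S} to 0x33, ZA0.H to 0x55 and ZA0.B to 0xFF, which matches the examples in the lists above. A disassembler would go the other way and pick the shortest list of names whose masks combine to the encoded immediate.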
This patch adds the MatrixTileList asm operand and related parsing to support
this.
Depends on D105570.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D105575
Fraser Cormack [Fri, 23 Jul 2021 09:40:29 +0000 (10:40 +0100)]
[RISCV] Add tests showing missed vector saturating add/sub combines
These will be optimized by upcoming patches. The tests are primarily not
being optimized due to the lack of support for saturating vector
arithmetic in the RISC-V backend.
On top of that, however, a large percentage of the scalable-vector tests
are also lacking support in the DAGCombiner: either in
`ISD::matchBinaryPredicate` or due to checks specifically for
`BUILD_VECTOR` and not `SPLAT_VECTOR`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106649
David Green [Tue, 27 Jul 2021 08:11:58 +0000 (09:11 +0100)]
[ARM] Implement isLoad/StoreFromStackSlot for MVE stack stores accesses
This implements isLoadFromStackSlot and isStoreToStackSlot for the MVE
instructions MVE_VSTRWU32 and MVE_VLDRWU32. They behave the same as many
other loads/stores, expecting a FI in Op1 and zero offset in Op2. At the
same time this alters VLDR_P0_off and VSTR_P0_off to use the same code
too, as they too should be returning VPR in Op0, take a FI in Op1 and
zero offset in Op2.
Differential Revision: https://reviews.llvm.org/D106797
Rosie Sumpter [Thu, 22 Jul 2021 16:54:40 +0000 (17:54 +0100)]
[LoopFlatten] Use SCEV and Loop APIs to identify increment and trip count
Replace pattern-matching with existing SCEV and Loop APIs as a more
robust way of identifying the loop increment and trip count. Also
rename 'Limit' as 'TripCount' to be consistent with terminology.
Differential Revision: https://reviews.llvm.org/D106580
Lang Hames [Tue, 27 Jul 2021 07:31:57 +0000 (17:31 +1000)]
[docs] Update release notes with all LLVM-C API changes
Patch by Mats Larsen. Thanks Mats!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D106764
Esme-Yi [Tue, 27 Jul 2021 07:28:59 +0000 (07:28 +0000)]
[Debug-Info][llvm-dwarfdump] Don't try to dump location
list for attributes that don't have the loclist class.
Summary: The overflow error occurs when we try to dump
location list for those attributes that do not have the
loclist class, like DW_AT_count and DW_AT_byte_size.
After re-reviewing the entire list, I sorted those
attributes into two parts, one for dumping location list
and one for dumping the location expression.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D105613
Jan Svoboda [Mon, 26 Jul 2021 11:30:58 +0000 (13:30 +0200)]
[clang][driver] NFC: Expose InputInfo in Job instead of plain filenames
This patch exposes `InputInfo` in `Job` instead of plain filenames. This is useful in a follow-up patch that uses this to recognize `-cc1` commands interesting for Clang tooling.
Depends on D106787.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D106788
Jan Svoboda [Mon, 26 Jul 2021 11:22:27 +0000 (13:22 +0200)]
[clang][driver] NFC: Move InputInfo.h from lib to include
Moving `InputInfo.h` from `lib/Driver/` into `include/Driver` to be able to expose it in an API consumed from outside of `clangDriver`.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D106787
LLVM GN Syncbot [Tue, 27 Jul 2021 06:54:07 +0000 (06:54 +0000)]
[gn build] Port 2487db1f2862
Lang Hames [Tue, 27 Jul 2021 03:50:19 +0000 (13:50 +1000)]
[ORC] Require ExecutorProcessControl when constructing an ExecutionSession.
Wrapper function call and dispatch handler helpers are moved to
ExecutionSession, and existing EPC-based tools are re-written to take an
ExecutionSession argument instead.
Requiring an ExecutorProcessControl instance simplifies existing EPC based
utilities (which only need to take an ES now), and should encourage more
utilities to use the EPC interface. It also simplifies process termination,
since the session can automatically call ExecutorProcessControl::disconnect
(previously this had to be done manually, and carefully ordered with the
rest of JIT tear-down to work correctly).
Johannes Doerfert [Tue, 27 Jul 2021 02:32:10 +0000 (21:32 -0500)]
[OpenMP] Try to simplify all loads in device code
Eliminating loads/stores in the device code is worth the extra effort,
especially for the new device runtime.
At the same time we no longer compute AAExecutionDomain for non-device
code; there is no point.
Differential Revision: https://reviews.llvm.org/D106845
Johannes Doerfert [Tue, 27 Jul 2021 06:35:02 +0000 (01:35 -0500)]
[Attributor][FIX] Copy all members in the assignment operator
Also improve debug output slightly.
Johannes Doerfert [Thu, 15 Jul 2021 23:24:58 +0000 (18:24 -0500)]
[Attributor] Utilize the InstSimplify interface to simplify instructions
When we simplify at least one operand in the Attributor simplification
we can use the InstSimplify to work on the simplified operands. This
allows us to avoid duplication of the logic.
Depends on D106189
Differential Revision: https://reviews.llvm.org/D106190
Johannes Doerfert [Thu, 15 Jul 2021 22:40:24 +0000 (17:40 -0500)]
[InstSimplify] Expose generic interface for replaced operand simplification
Users, especially the Attributor, might replace multiple operands at
once. The actual implementation of simplifyWithOpReplaced is able to
handle that just fine; the interface simply did not allow replacing
more than one operand at a time. This exposes a more generic
interface without intended changes for existing code.
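As a rough illustration of why replacing several operands at once matters, here is a toy Python model. It is not InstSimplify's C++ interface: the tuple-based instruction format and the function name `simplify_with_replacements` are hypothetical. An `add` of two unknown values only folds to a constant when both operands are substituted together.

```python
from typing import Dict, Optional, Tuple, Union

# Hypothetical toy IR: an instruction is (opcode, lhs, rhs); operands are
# either integer constants or symbolic value names. Not LLVM's data model.
Operand = Union[int, str]
Inst = Tuple[str, Operand, Operand]


def simplify_with_replacements(
    inst: Inst, repl: Dict[str, Operand]
) -> Optional[Operand]:
    """Substitute every operand found in `repl`, then try to fold.

    Returns a simplified operand, or None if nothing simpler is known.
    """
    op, lhs, rhs = inst
    lhs = repl.get(lhs, lhs) if isinstance(lhs, str) else lhs
    rhs = repl.get(rhs, rhs) if isinstance(rhs, str) else rhs
    if op == "add":
        if isinstance(lhs, int) and isinstance(rhs, int):
            return lhs + rhs  # both operands constant: fold
        if lhs == 0:
            return rhs  # 0 + x == x
        if rhs == 0:
            return lhs  # x + 0 == x
    return None


inst = ("add", "%a", "%b")
print(simplify_with_replacements(inst, {"%a": 2}))           # one operand replaced
print(simplify_with_replacements(inst, {"%a": 2, "%b": 3}))  # both replaced
```

With only `%a` replaced the toy simplifier has nothing to fold; with both operands replaced in one call it can produce a constant.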
Differential Revision: https://reviews.llvm.org/D106189
Johannes Doerfert [Sun, 25 Jul 2021 18:26:44 +0000 (13:26 -0500)]
[OpenMP] Prototype opt-in new GPU device RTL
The "old" OpenMP GPU device runtime (D14254) has served us well for many
years but modernizing it has caused some pain recently. This patch
introduces an alternative which is mostly written from scratch embracing
OpenMP 5.X, C++, LLVM coding style (where applicable), and conceptual
interfaces. This new runtime is opt-in through a clang flag (D106793).
The new runtime is currently only built for nvptx and has "-new" in its
name.
The design is tailored towards middle-end optimizations rather than
front-end code generation choices, a trend we already started in the old
runtime a while back. In contrast to the old one, state is organized in
a simple manner rather than a "smart" one. While this can induce costs
it helps optimizations. Our expectation is that the majority of codes
can be optimized and a "simple" design is therefore preferable. The new
runtime also avoids making users pay for things they do not use,
especially wrt. memory. The unlikely case of nested parallelism is
supported but costly, so that the more likely case uses fewer resources.
The worksharing and reduction implementations have been taken from the
old runtime and will be rewritten in the future if necessary.
Documentation and debug features are still mostly missing and will be
added over time.
All external symbols start with `__kmpc` for legacy reasons but should
be renamed once we switch over to a single runtime. All internal symbols
are placed in appropriate namespaces (anonymous or `_OMP`) to avoid name
clashes with user symbols.
Differential Revision: https://reviews.llvm.org/D106803
Johannes Doerfert [Tue, 27 Jul 2021 05:53:56 +0000 (00:53 -0500)]
[Attributor] Update check lines for all AMDGPU attributor tests
I thought there was only one when I pushed
cdb4cfe8b3ce2b0c50d4855ec260eab07fe63611, these should be all (in the
CodeGen/AMDGPU folder).
Johannes Doerfert [Tue, 27 Jul 2021 05:21:09 +0000 (00:21 -0500)]
[Attributor][FIX] Update AMDGPU attributor test
The test contains UB and should be improved; for now we update the check
lines so that it passes.
Chuanqi Xu [Tue, 27 Jul 2021 05:13:39 +0000 (13:13 +0800)]
[Coroutine] Record the elided coroutines
Reviewed By: lxfind
Differential Revision: https://reviews.llvm.org/D105606
Tom Stellard [Tue, 27 Jul 2021 01:52:13 +0000 (01:52 +0000)]
Merge all the llvm-exegesis unit tests into a single binary
These tests access private symbols in the backends, so they cannot link
against libLLVM.so and must be statically linked. Linking these tests
can be slow and with debug builds the resulting binaries use a lot of
disk space.
Merging them into a single test binary means we now only need to
statically link 1 test instead of 6, which helps reduce the build
times and saves disk space.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D106464
wlei [Thu, 22 Jul 2021 19:53:42 +0000 (12:53 -0700)]
[CSSPGO] Tweak ICP threshold in top-down inliner
This change slightly relaxes the current ICP threshold in the top-down inliner; specifically, it always allows one ICP. It shows some perf improvements on SPEC and our internal benchmarks. It also renames the previous flag. We can also try to turn off PGO ICP in the future.
Reviewed By: wenlei, hoy, wmi
Differential Revision: https://reviews.llvm.org/D106588
Johannes Doerfert [Mon, 19 Jul 2021 20:31:10 +0000 (15:31 -0500)]
[Local] Do not introduce a new `llvm.trap` before `unreachable`
This is the second attempt to remove the `llvm.trap` insertion after
https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6
reverted the first one. It is not clear what the exact issue was back
then and it might already be gone by now, it has been >5 years after
all.
Replaces D106299.
Differential Revision: https://reviews.llvm.org/D106308
Johannes Doerfert [Fri, 16 Jul 2021 20:40:37 +0000 (15:40 -0500)]
[Attributor] Delete dead stores
D106185 allows us to determine if a store is needed easily. Using that
knowledge we can start to delete dead stores.
In AAIsDead we now track more state as an instruction can be dead (= the
old optimistic state) or just "removable". A store instruction can be
removable while being very much alive, e.g., if it stores a constant
into an alloca or internal global. If we would pretend it was dead
instead of only removable, we would ignore it when we determine what
values a load can see, so that is not what we want.
Differential Revision: https://reviews.llvm.org/D106188
Johannes Doerfert [Mon, 12 Jul 2021 02:04:28 +0000 (21:04 -0500)]
[Attributor] Introduce getPotentialCopiesOfStoredValue and use it
This patch introduces `getPotentialCopiesOfStoredValue` which uses
AAPointerInfo to determine all "aliases" or "potential copies" of a
value that is stored into memory. This operation can fail but if it
succeeds it means we can visit all "uses" of a value even if it is
temporarily stored in memory.
There are two users for the function:
1) `Attributor::checkForAllUses` which will now ignore the value use
in a store if all "potential copies" can be identified and instead
be visited. This allows various AAs, including AAPointerInfo
itself, to look through memory.
2) `AANoCapture` which uses a custom use tracking through the
CaptureTracker interface and therefore needs to be taught
explicitly.
Differential Revision: https://reviews.llvm.org/D106185
Mehdi Amini [Fri, 16 Jul 2021 03:32:59 +0000 (03:32 +0000)]
Build libSupport with -Werror=global-constructors (NFC)
Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.
The -Werror=global-constructors is only added on platforms that have
support for the flag and for which std::mutex does not have a global
destructor. This is ensured by having CMake try to compile a file
with a global mutex before adding the flag to libSupport.
Jianzhou Zhao [Tue, 27 Jul 2021 04:21:41 +0000 (04:21 +0000)]
[dfsan][NFC] Fix doc format
Craig Topper [Tue, 27 Jul 2021 02:04:49 +0000 (19:04 -0700)]
[AArch64] Fix -Wparentheses warning with gcc 5.4. NFC
Jun Ma [Tue, 27 Jul 2021 03:31:32 +0000 (11:31 +0800)]
[NFC][InstCombine] Fix typo
Lang Hames [Tue, 27 Jul 2021 03:03:19 +0000 (13:03 +1000)]
[llvm-jitlink] Don't hardcode LLVM version number into the runtime path.
This should unbreak builders that were failing due to different patch numbers.
Mitch Phillips [Tue, 27 Jul 2021 02:32:49 +0000 (19:32 -0700)]
Revert "[GlobalISel] Add scalar widening for G_MERGE_VALUES destination"
This reverts commit 0a37163d1d855a2db41e1f46ddbc3f4570bd7ca6.
Reason: Broke the sanitizer msan bots. More details are available in the
original Phabricator review: https://reviews.llvm.org/D106814.
Shilei Tian [Tue, 27 Jul 2021 02:45:52 +0000 (22:45 -0400)]
[AbstractAttributor] Fold __kmpc_parallel_level if possible
Similar to D105787, this patch tries to fold `__kmpc_parallel_level` if possible.
Note that `__kmpc_parallel_level` doesn't take activeness into consideration:
based on the current `deviceRTLs`, it returns values such as 0, 1, 2, rather
than 0, 129, 130, etc., which also encode activeness.
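The difference between the two value ranges can be sketched with a small Python model. It assumes the activeness flag is bit 7 (value 128); that is consistent with the values 129 and 130 quoted above but is not taken from the patch itself. The helper names `encode_level` and `parallel_level` are hypothetical.

```python
# Assumption: the activeness flag is bit 7 (value 128). This matches the
# values 129 and 130 quoted in the commit message, nothing more.
ACTIVE_FLAG = 128


def encode_level(depth: int, active: bool) -> int:
    """Pack a nesting depth and an activeness flag into one integer."""
    return depth | (ACTIVE_FLAG if active else 0)


def parallel_level(encoded: int) -> int:
    """Drop the activeness flag and return only the nesting depth."""
    return encoded & (ACTIVE_FLAG - 1)


for encoded in (0, 129, 130):
    print(encoded, "->", parallel_level(encoded))
```

Under this assumption the encoded values 0, 129, 130 map to the plain levels 0, 1, 2.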
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106154
Johannes Doerfert [Tue, 20 Jul 2021 06:58:44 +0000 (01:58 -0500)]
[OpenMP] Run rewriteDeviceCodeStateMachine in the Module not CGSCC pass
While rewriteDeviceCodeStateMachine should probably be folded into
buildCustomStateMachine, we at least need the optimization to happen.
This was not reliably the case in the CGSCC pass but in the Module pass
it seems to work reliably.
This also ports a test to the new kernel encoding (target_init/deinit),
and makes sure we cannot run the kernel in SPMD mode.
Differential Revision: https://reviews.llvm.org/D106345
Johannes Doerfert [Mon, 26 Jul 2021 16:53:43 +0000 (11:53 -0500)]
[Attributor][FIX] Do not return CHANGED unconditionally
This caused us to rerun AAMemoryBehaviorFloating::updateImpl over and
over again. Unfortunately it turned out to be hard to reproduce the
behavior in a reasonable way.
Johannes Doerfert [Mon, 26 Jul 2021 18:33:54 +0000 (13:33 -0500)]
[Attributor][FIX] Track change status for AAIsDead properly
If we add a new live edge we need to indicate a change or otherwise the
new live block is not shown to users. Similarly, new known dead ends and
a changed `ToBeExploredFrom` set need to cause us to return CHANGED.
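Why the change status matters can be seen in a toy fixpoint loop. This is a Python sketch with hypothetical names (`update_live`, `live_blocks`), not the Attributor's C++ code. The driver reruns updates until none reports CHANGED, so an update that grows the state without reporting CHANGED can stop the iteration too early, while one that always reports CHANGED would never let it stop.

```python
from typing import Dict, List, Set

CHANGED, UNCHANGED = "changed", "unchanged"


def update_live(block: str, succs: Dict[str, List[str]], live: Set[str]) -> str:
    """Mark the successors of a live block live; report whether `live` grew."""
    before = len(live)
    live.update(succs.get(block, []))
    return CHANGED if len(live) != before else UNCHANGED


def live_blocks(entry: str, succs: Dict[str, List[str]]) -> Set[str]:
    """Iterate to a fixpoint: rerun updates until none reports CHANGED."""
    live = {entry}
    changed = True
    while changed:
        changed = False
        # sorted() copies the set, so growing `live` during the pass is safe.
        for block in sorted(live):
            if update_live(block, succs, live) == CHANGED:
                changed = True
    return live


cfg = {"entry": ["a"], "a": ["b"], "b": [], "dead": ["b"]}
print(sorted(live_blocks("entry", cfg)))
```

The loop terminates because `live` can only grow and is bounded by the number of blocks; the block named `dead` is never reached from `entry` and stays out of the result.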
Mehdi Amini [Tue, 27 Jul 2021 01:37:35 +0000 (01:37 +0000)]
Define the namespace for the Affine dialect in ODS (NFC)
This aligns the structure of the Affine dialect with all the other dialects.
In particular this makes the ODS C++ generated code independent of the
enclosing namespace.
Nico Weber [Tue, 27 Jul 2021 02:10:46 +0000 (22:10 -0400)]
[gn build] Kind of port c7b3a91017d2 (libclang version script)
libclang is only built as a static library in the GN build at the
moment, which means we now generate a .exports file from a version
script and then link.exe and ld64 inputs from the .exports file
but don't use the version script, but hey.
Jianzhou Zhao [Tue, 27 Jul 2021 02:07:27 +0000 (02:07 +0000)]
[dfsan][NFC] Fix doc format
Tom Stellard [Tue, 27 Jul 2021 01:20:14 +0000 (18:20 -0700)]
libclang: Fixes for the python script that generates the export list
This script was added in 0cf37a3b0617457daaed3224373ffa07724f8482.
Carl Ritson [Tue, 27 Jul 2021 00:36:39 +0000 (09:36 +0900)]
[AMDGPU] Add SelectionDAG support for insert_subvector on v4f64
Enable custom insert_subvector for larger vector types.
This is necessary now that SelectionDAG can attempt v3f64 insert
to v4f64, etc.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D105385
Christopher Di Bella [Mon, 26 Jul 2021 21:00:40 +0000 (21:00 +0000)]
[libcxx][NFC] adjusts 41b17c44 so it meets requested feedback
Feedback requested in D106735 and applied in Diff 3 seems to have been
reverted in Diff 4. This patch fixes that up.
Differential Revision: https://reviews.llvm.org/D106829
Mehdi Amini [Tue, 27 Jul 2021 01:08:18 +0000 (01:08 +0000)]
Revert "Build libSupport with -Werror=global-constructors (NFC)"
This reverts commit beff86e8ff429f11da6fe37efde86d22ea636ed5.
The sanitizer-x86_64-linux bot is still broken.
Walter Erquinigo [Wed, 21 Jul 2021 21:46:51 +0000 (14:46 -0700)]
[trace] Add the definition of a TraceExporter plugin
Copying from the inline documentation:
```
Trace exporter plug-ins operate on traces, converting the trace data provided by an \a lldb_private::TraceCursor into a different format that can be digested by other tools, e.g. Chrome Trace Event Profiler.
Trace exporters are supposed to operate on an architecture-agnostic fashion, as a TraceCursor, which feeds the data, hides the actual trace technology being used.
```
I want to use this to make the code in https://reviews.llvm.org/D105741 a plug-in. I also imagine that there will be more and more exporters being implemented, as an exporter creates something useful out of trace data. And tbh I don't want to keep adding more stuff to the lldb/Target folder.
This is the minimal definition for a TraceExporter plugin. I plan to use this with the following commands:
- thread trace export <plug-in name> [plug-in specific args]
  - This command would support autocompletion of plug-in names
- thread trace export list
  - This command would list the available trace exporter plug-ins
I don't yet plan to create a "process trace export" command because it's easier to start analyzing the trace of a given thread than of the entire process. When we need a process-level command, we can implement it.
I also don't plan to force each "export" command implementation to support multiple threads (for example, "thread trace start 1 2 3" or "thread trace start all" operate on many threads simultaneously). The reason is that the format used by the exporter might or might not support multiple threads, so I'm leaving this decision to each trace exporter plug-in.
Differential Revision: https://reviews.llvm.org/D106501