Craig Topper [Fri, 23 Apr 2021 21:26:32 +0000 (14:26 -0700)]
[RISCV] Move getLMULForFixedLengthVector out of RISCVSubtarget.
Make it a static function in RISCVISelLowering, the only place it
is used.
I think I'm going to make this return fractional LMULs in some
cases, so I'm sorting out where it should live before I start
making changes.
Craig Topper [Fri, 23 Apr 2021 20:05:23 +0000 (13:05 -0700)]
[RISCV] Only expose one interface for getContainerForFixedLengthVector in the RISCVTargetLowering class
We can have RISCVISelDAGToDAG.cpp call the VT only version by
finding the RISCVTargetLowering object via the Subtarget.
Make the static versions just global static functions in
RISCVISelLowering that can be called by static functions in that
file.
Jez Ng [Fri, 23 Apr 2021 22:05:34 +0000 (18:05 -0400)]
[lld-macho] Fix use-after-free in loadDylib()
We were taking a reference to a value in `loadedDylibs` and then
calling `make<DylibFile>()`, which could recursively call
`loadDylib()`, which could in turn resize `loadedDylibs` and
invalidate that reference.
Fixes PR50101.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D101175
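The hazard described above can be sketched outside of lld. This is a minimal illustration, not the actual patch: `DylibTable`, `LoadedDylib` and the dependency graph are hypothetical stand-ins, and a `std::vector` plays the role of a table whose growth invalidates references to its elements. The safe pattern is to keep an index (or key) across the recursive call and look the entry up again afterwards.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for a loaded library; not lld's DylibFile.
struct LoadedDylib {
  std::string path;
  std::vector<std::size_t> deps; // indices of dependencies in the table
};

// Hypothetical table of loaded libraries. `entries` may reallocate when it
// grows, which invalidates any reference to one of its elements.
class DylibTable {
public:
  explicit DylibTable(const std::map<std::string, std::vector<std::string>> &graph)
      : depGraph(graph) {}

  std::size_t load(const std::string &path) {
    auto found = indexByPath.find(path);
    if (found != indexByPath.end())
      return found->second;

    // Register the entry first so that dependency cycles terminate.
    const std::size_t index = entries.size();
    entries.push_back(LoadedDylib{path, {}});
    indexByPath.emplace(path, index);

    // Unsafe variant: `LoadedDylib &slot = entries[index];` taken here would
    // dangle as soon as a recursive load() below makes `entries` reallocate.
    auto deps = depGraph.find(path);
    if (deps != depGraph.end()) {
      for (const std::string &dep : deps->second) {
        // Finish the recursive call first, then index into the table again.
        const std::size_t depIndex = load(dep);
        entries[index].deps.push_back(depIndex);
      }
    }
    return index;
  }

  std::size_t size() const { return entries.size(); }
  const LoadedDylib &at(std::size_t index) const { return entries[index]; }

private:
  std::map<std::string, std::vector<std::string>> depGraph;
  std::map<std::string, std::size_t> indexByPath;
  std::vector<LoadedDylib> entries;
};
```

Splitting the recursive call and the `entries[index]` access into two statements matters: written as one expression, the element could be looked up before the recursive call runs.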
Jez Ng [Fri, 23 Apr 2021 22:05:48 +0000 (18:05 -0400)]
[lld-macho][nfc] Fix some typos + rephrase a comment
I was a bit confused by the comment because I thought that "Tests
that..." was describing the tests contained within the same file.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D101160
Dávid Bolvanský [Fri, 23 Apr 2021 21:45:50 +0000 (23:45 +0200)]
[utils] Disable -Wdeprecated-copy for googlemock/gtest
Simple fix for build breakage. Feel free to fix all places (quite a lot).
wlei [Fri, 23 Apr 2021 19:35:12 +0000 (12:35 -0700)]
[CSSPGO] Fix missing debug info of dangling pseudo probe
The speculative execution optimization conservatively drops the debug info of every instruction in the merged `ThenBB` (see the loop at line 2384), including the dangling probe. The missing debug info of the dangling probe then causes an incorrect inference computation.
So we should avoid dropping the debug info of pseudo probes. This change tries to fix this by moving the to-be-dangling probe to the merge target BB before the debug info is dropped.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D101195
Aaron Puchert [Wed, 21 Apr 2021 15:21:22 +0000 (17:21 +0200)]
Thread safety analysis: Simplify intersectAndWarn (NFC)
Instead of conditionally overwriting a nullptr and then branching on its
nullness, just branch directly on the original condition. Then we can
make both pointers (non-null) references instead.
Stephen Kelly [Fri, 23 Apr 2021 12:30:19 +0000 (13:30 +0100)]
Enable AST introspection on non-X86
Thomas Lively [Fri, 23 Apr 2021 20:37:27 +0000 (13:37 -0700)]
[WebAssembly] Finalize wasm_simd128.h intrinsics
Adds new intrinsics for instructions that are in the final SIMD spec but did not
previously have intrinsics. Also updates the names of existing intrinsics to
reflect the final names of the underlying instructions in the spec. Keeps the
old names as deprecated functions to ease the transition to the new names.
Differential Revision: https://reviews.llvm.org/D101112
Nikita Popov [Fri, 23 Apr 2021 20:05:52 +0000 (22:05 +0200)]
[SCEV] Add loop guard tests for ugt/uge predicates (NFC)
Nemanja Ivanovic [Fri, 23 Apr 2021 16:50:13 +0000 (11:50 -0500)]
[PowerPC] Provide XL-compatible builtins in altivec.h
There are some interfaces in altivec.h that are not compatible
between Clang and XL (although Clang is compatible with GCC).
Currently, we have found 3 but there may be others.
Clang/GCC signatures:
vector double vec_ctf(vector signed long long)
vector double vec_ctf(vector unsigned long long)
vector signed long long vec_cts(vector double)
vector unsigned long long vec_ctu(vector double)
XL signatures:
vector float vec_ctf(vector signed long long)
vector float vec_ctf(vector unsigned long long)
vector signed int vec_cts(vector double)
vector unsigned int vec_ctu(vector double)
This patch provides the XL behaviour under the __XL_COMPAT_ALTIVEC__
macro for users that rely on XL behaviour.
Differential revision: https://reviews.llvm.org/D101130
Rob Suderman [Wed, 21 Apr 2021 07:53:50 +0000 (00:53 -0700)]
[mlir][tosa] Add tosa.resize lowering to linalg generic
Includes tests and implementation for both integer and floating point values.
Both nearest neighbor and bilinear interpolation are included.
Differential Revision: https://reviews.llvm.org/D101009
zoecarver [Fri, 23 Apr 2021 19:37:47 +0000 (12:37 -0700)]
[libcxx][nfc] Add license to `pointer_comparison_test_helper.h`
Christian Kandeler [Fri, 23 Apr 2021 19:17:43 +0000 (21:17 +0200)]
[clangd] Allow AST request without range
If no range is given, return the translation unit AST.
This is useful for tooling operations that require e.g. the full path to
a node.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D101057
Dávid Bolvanský [Fri, 23 Apr 2021 19:12:51 +0000 (21:12 +0200)]
[InstCombine] X - usub.sat(X, Y) => umin(X, Y)
Pattern regressed in LLVM 9 with the introduction of usub.sat.
Fixes https://bugs.llvm.org/show_bug.cgi?id=42178#c2
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101184
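The identity behind this fold can be checked exhaustively for 8-bit values. The sketch below is only an illustration in plain C++, not InstCombine code; `usubSat` and `foldHoldsForAllI8` are hypothetical helper names, with `usubSat` modelling unsigned saturating subtraction (clamping at 0).

```cpp
#include <algorithm>
#include <cstdint>

// Unsigned saturating subtraction on 8-bit values: x - y, clamped at 0.
constexpr std::uint8_t usubSat(std::uint8_t x, std::uint8_t y) {
  return x > y ? static_cast<std::uint8_t>(x - y) : static_cast<std::uint8_t>(0);
}

// Returns true if x - usubSat(x, y) == min(x, y) for every pair of 8-bit inputs.
inline bool foldHoldsForAllI8() {
  for (unsigned xi = 0; xi <= 0xFF; ++xi) {
    for (unsigned yi = 0; yi <= 0xFF; ++yi) {
      const std::uint8_t x = static_cast<std::uint8_t>(xi);
      const std::uint8_t y = static_cast<std::uint8_t>(yi);
      const std::uint8_t lhs = static_cast<std::uint8_t>(x - usubSat(x, y));
      if (lhs != std::min(x, y))
        return false;
    }
  }
  return true;
}
```

The reasoning is a two-case split: when x > y the left-hand side is x - (x - y) = y, otherwise it is x - 0 = x, which is the unsigned minimum in both cases.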
Pooja Yadav [Fri, 23 Apr 2021 18:49:53 +0000 (00:19 +0530)]
[Docs] Updated LLVM_TARGETS_TO_BUILD section in GettingStarted.rst
Updated LLVM_TARGETS_TO_BUILD under https://llvm.org/docs/GettingStarted.html#local-llvm-configuration.
Differential Revision: https://reviews.llvm.org/D101101
Alexander Belyaev [Fri, 23 Apr 2021 17:47:51 +0000 (19:47 +0200)]
[mlir] Add block arguments for input/output operands of `linalg.tiled_loop`.
Differential Revision: https://reviews.llvm.org/D101186
Nico Weber [Fri, 23 Apr 2021 16:49:14 +0000 (12:49 -0400)]
[lld/mac] Support more flags for --reproduce
I went through the callers of `readFile()` and `addFile()` in Driver.cpp
and checked that the options that use them all get rewritten in the
--reproduce response file. -(un)exported_symbols_list and -bundle_loader
weren't, so add them.
Also spruce up the test for reproduce a bit and actually try linking
with the extracted repro archive.
Motivated by the response file in PR50098 complaining about the
-exported_symbols_list path being wrong :)
Differential Revision: https://reviews.llvm.org/D101182
Peter Collingbourne [Fri, 23 Apr 2021 18:25:20 +0000 (11:25 -0700)]
scudo: Work around gcc 8 conversion warning.
Should fix:
https://lab.llvm.org/buildbot#builders/99/builds/2953
Hongtao Yu [Fri, 23 Apr 2021 07:26:11 +0000 (00:26 -0700)]
[CSSPGO] Fix incorrect prorating indirect call distribution factor that leads to target count loss.
The pseudo probe distribution factor is used to scale down profile samples so that the use of "maximum" in `getBlockWeight` does not mislead the counts inference. For callsites, the scaling down can come from code duplication prior to the sample profile loader (prelink or postlink), or from indirect call promotion in the sample loader inliner. This patch fixes an issue in sample loader ICP where the leftover indirect callsite scaling down unexpectedly causes the loss of non-promoted call target samples. While the scaling down favors BFI/BPI with an accurate callsite count, it doesn't fit the current distribution factor, which represents code duplication changes. Ideally, we would need two factors, one for code duplication and the other for ICP. However, this seems overcomplicated. I'm going to trade one usage (callsite counts) for the other (call target counts).
Seeing perf win on one benchmark (mcf) of SPEC2017 with others unchanged.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D100993
Mitch Phillips [Fri, 23 Apr 2021 17:45:44 +0000 (10:45 -0700)]
[hwasan] Remove untagging of kernel-consumed memory
Now that page aliasing for x64 has landed, we don't need to worry about
passing tagged pointers to libc, and thus D98875 removed it.
Unfortunately, we still test on aarch64 devices that don't have the
kernel tagged address ABI (https://reviews.llvm.org/D98875#2649269).
All the memory that we pass to the kernel in these tests is from global
variables. Instead of having architecture-specific untagging mechanisms
for this memory, let's just not tag the globals.
Reviewed By: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D101121
Fangrui Song [Fri, 23 Apr 2021 17:49:19 +0000 (10:49 -0700)]
[OpenMP] Fix -Wdeprecated-copy
Mitch Phillips [Fri, 23 Apr 2021 17:36:56 +0000 (10:36 -0700)]
Revert "[X86][AMX] Try to hoist AMX shapes' def"
This reverts commit
90118563ad0f133c696e070ad72761fa0daa4517.
Reason: Broke the MSan buildbots.
https://lab.llvm.org/buildbot/#/builders/5/builds/6967/steps/9/logs/stdio
More details can be found in the original phabricator review:
https://reviews.llvm.org/D101067
Sanjay Patel [Fri, 23 Apr 2021 17:19:46 +0000 (13:19 -0400)]
[InstCombine] fold 'not' of ctpop in parity pattern
As discussed in https://llvm.org/PR50096 , we could
convert the 'not' into a 'sub' and see the same
fold. That's because we already have another demanded
bits optimization for 'sub'.
We could add a related transform for
odd-number-of-type-bits, but that seems unlikely
to be practical.
https://alive2.llvm.org/ce/z/TWJZXr
Sanjay Patel [Fri, 23 Apr 2021 17:06:02 +0000 (13:06 -0400)]
[InstCombine] add test for ctpop; NFC
Goes with
2912f42a / PR50096.
Mitch Phillips [Fri, 23 Apr 2021 16:47:07 +0000 (09:47 -0700)]
[Scudo] Use GWP-ASan's aligned allocations and fixup postalloc hooks.
This patch does a few cleanup things:
1. The non-standalone scudo has a problem where GWP-ASan allocations
may not meet alignment requirements where Scudo was requested to have
alignment >= 16. Use the new GWP-ASan API to fix this.
2. The standalone variant loses some debugging information inside of
GWP-ASan because we ask GWP-ASan to allocate an aligned size in the
frontend. This means reports end up with 'UaF on a 16-byte allocation'
for a 1-byte allocation with 16-byte alignment. Also use the new API to
fix this.
3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add
stats tracking for GWP-ASan allocations.
4. Add a small test that checks the alignment of the frontend
allocator, so that it can be used under GWP-ASan torture mode.
5. Add GWP-ASan torture mode as a testing configuration to catch these
regressions.
Depends on D94830, D95889.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D95884
Chris Hamilton [Thu, 22 Apr 2021 20:39:36 +0000 (15:39 -0500)]
[PR49761] Fix variadic arg handling in matcher
Mishandling of variadic arguments in a function call caused a crash
(runtime assert fail) in bugprone-infinite-loop tidy checker. Fix
is to limit argument matching to the lesser of the number of variadic
params in the prototype or the number of actual args in the call.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D101108
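The bounding rule described above can be sketched in isolation. This is an assumption-laden illustration, not the clang-tidy matcher code: `pairParamsWithArgs` is a hypothetical helper that pairs declared parameters with call arguments by index and stops at the shorter list, so the extra arguments of a variadic call never index past the prototype.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using ParamArgPair = std::pair<std::string, std::string>;

// Pairs each declared parameter with the argument at the same index.
// Only min(params.size(), args.size()) positions are visited.
inline std::vector<ParamArgPair>
pairParamsWithArgs(const std::vector<std::string> &params,
                   const std::vector<std::string> &args) {
  const std::size_t count = std::min(params.size(), args.size());
  std::vector<ParamArgPair> pairs;
  pairs.reserve(count);
  for (std::size_t i = 0; i < count; ++i)
    pairs.emplace_back(params[i], args[i]);
  return pairs;
}
```

For a call such as printf("%d", x) against a prototype with one named parameter, only the format string is paired; `x` is left unmatched rather than triggering an out-of-range access.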
Teresa Johnson [Fri, 23 Apr 2021 16:30:28 +0000 (09:30 -0700)]
Mark type test intrinsics as speculatable to fix inline cost
There is already code in InlineCost.cpp to identify and ignore ephemeral
values (llvm.assume intrinsics and other side-effect free instructions
only feeding the assumes). However, because llvm.type.test intrinsics
were not marked speculatable, they and any instructions specifically
feeding the type test (typically a bitcast) were being counted towards
the instruction cost when inlining. This was causing profile matching
issues in some cases when enabling -fwhole-program-vtables for whole
program devirtualization.
According to the language reference, the speculatable attribute means:
"the function does not have any effects besides calculating its result
and does not have undefined behavior". I see no reason why type tests
cannot be marked with this attribute.
There are 2 test changes:
llvm/test/Transforms/Inline/ephemeral.ll: I added a type test intrinsic
here to verify the fix. Also, I found the test was not actually testing
what it originally intended. Many of the existing instructions were
optimized away by -Oz, and the cost of inlining was negative due to the
benefit of removing the call. So I changed the test to simply invoke the
inline pass and check the number of instructions computed by InlineCost.
I also fixed an instruction that was not actually used anywhere.
llvm/test/Transforms/SimplifyCFG/no-md-sink.ll needed to be made more
robust to code changes that reordered the metadata.
Differential Revision: https://reviews.llvm.org/D101180
Snehasish Kumar [Fri, 23 Apr 2021 00:38:13 +0000 (17:38 -0700)]
[NFC] Use hasSection instead of getSection().empty()
Use the optimized check hasSection() instead of calling
getSection().empty(). Originally suggested in D101004, but was dropped
in the commit.
Stephen Kelly [Fri, 23 Apr 2021 15:24:14 +0000 (16:24 +0100)]
[AST] Update tests to query for introspection support
Louis Dionne [Wed, 10 Feb 2021 21:19:50 +0000 (16:19 -0500)]
[libc++] Rewrite the tuple constructors to be strictly Standards conforming
This nasty patch rewrites the tuple constructors to match those defined
by the Standard. We were previously providing several extensions in those
constructors - those extensions are removed by this patch.
The issue with those extensions is that we've had numerous bugs filed
against us over the years for problems essentially caused by them. As a
result, people are unable to use tuple in ways that are blessed by the
Standard, all that for the perceived benefit of providing them extensions
that they never asked for.
Since this is an API break, I communicated it in the release notes.
I do not foresee major issues with this break because I don't think the
extensions are too widely relied upon, but we can ship it and see if we
get complaints before the next LLVM release - that will give us some
amount of information regarding how much use these extensions have.
Differential Revision: https://reviews.llvm.org/D96523
Jeremy Morse [Fri, 23 Apr 2021 16:38:44 +0000 (17:38 +0100)]
Drop a REQUIRES: lldb on a dexter regression test
As this is a test that actually gets to operating the debugger, it
needs to be limited to scenarios where the debugger is available.
(We'll file this in the set of things Dexter doesn't handle gracefully..)
Craig Topper [Fri, 23 Apr 2021 16:33:24 +0000 (09:33 -0700)]
[RISCV] Remove GetVRegNoV0 from the output register class of masked compare pseudo instructions.
These instructions are allowed to write v0 when they are masked.
We'll still never use v0 because of the earlyclobber constraint so
this doesn't really help anything. It just makes the definitions
correct.
While I was there remove an unused multiclass I noticed.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D101118
Craig Topper [Fri, 23 Apr 2021 16:33:09 +0000 (09:33 -0700)]
[RISCV] Have assembler check that the temp register is different than dest register for vmsgeu.vx pseudo.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D101015
Peter Collingbourne [Fri, 23 Apr 2021 05:41:01 +0000 (22:41 -0700)]
scudo: Store header on deallocation before retagging memory.
From a cache perspective it's better to store the header immediately
after loading it. If we delay this operation until after we've
retagged it's more likely that our header will have been evicted from
the cache and we'll need to fetch it again in order to perform the
compare-exchange operation.
For similar reasons, store the deallocation stack before retagging
instead of afterwards.
Differential Revision: https://reviews.llvm.org/D101137
Florian Hahn [Fri, 23 Apr 2021 10:33:38 +0000 (11:33 +0100)]
[VPlan] Add GraphTraits impl to traverse through VPRegionBlock.
This patch adds a new iterator to traverse through VPRegionBlocks and a
GraphTraits specialization using the iterator to traverse through
VPRegionBlocks.
Because there is already a GraphTraits specialization for VPBlockBase *
and co, a new VPBlockRecursiveTraversalWrapper helper is introduced.
This allows us to provide a new GraphTraits specialization for that
type. Users can use the new recursive traversal by using this wrapper.
The graph trait visits both the entry block of a region, as well as all
its successors. Exit blocks of a region implicitly have their parent
region's successors. This ensures all blocks in a region are visited
before any blocks in a successor region when doing a reverse post-order
traversal of the graph.
Reviewed By: a.elovikov
Differential Revision: https://reviews.llvm.org/D100175
Johannes Doerfert [Fri, 23 Apr 2021 01:36:18 +0000 (01:36 +0000)]
[OpenMP] Avoid reading uninitialized parallel level values
In a last minute change request for
a2dbfb6b72db we introduced a
read of the uninitialized parallel level value in SPMD-mode.
We go back to initializing the array early and checking for an
adjusted level.
Found by the miniqmc unit tests:
https://cdash.qmcpack.org/CDash/viewTest.php?onlyfailed&buildid=203434
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D101123
Johannes Doerfert [Wed, 21 Apr 2021 07:27:32 +0000 (02:27 -0500)]
[Clang] Allow the combination of loader_uninitialized and address spaces
When an object is allocated in a non-default address space we do not
need to check for a constructor if it is not initialized and has a
trivial constructor (which we won't call then).
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D100929
Dávid Bolvanský [Fri, 23 Apr 2021 16:16:22 +0000 (18:16 +0200)]
[libcxx] Fixed build break on buildbots with -Werror
Sebastian Neubauer [Fri, 23 Apr 2021 14:09:31 +0000 (16:09 +0200)]
[AMDGPU] Save WWM registers in functions
The values of registers in inactive lanes need to be saved during
function calls.
Save all registers used for whole wave mode, similar to how it is done
for VGPRs that are used for SGPR spilling.
Differential Revision: https://reviews.llvm.org/D99429
Reapply with fixed tests on Windows.
Paul C. Anagnostopoulos [Fri, 23 Apr 2021 16:03:48 +0000 (12:03 -0400)]
[TableGen] [docs] Improve BNF for the 'multiclass' statement [NFC]
Nemanja Ivanovic [Fri, 23 Apr 2021 15:29:49 +0000 (10:29 -0500)]
[PowerPC] Add vec_ctsl and vec_ctul to altivec.h
These are added for compatibility with XLC. They are similar to
vec_cts and vec_ctu except that the result is a doubleword vector
regardless of the parameter type.
Dave Lee [Thu, 22 Apr 2021 17:08:53 +0000 (10:08 -0700)]
[cmake] Configure policy CMP0116
Using `cmake` >=3.20 results in many warnings about this new policy. This change silences the warnings by explicitly declaring use of the "OLD" behavior.
This policy currently affects only one place: the `tablegen()` function in `TableGen.cmake`.
Differential Revision: https://reviews.llvm.org/D101083
Simon Pilgrim [Fri, 23 Apr 2021 15:55:59 +0000 (16:55 +0100)]
[CostModel][X86] Improve v2f32 fadd reduction cost
This was being reported as a similar cost to v4f32 when it's a lot cheaper (just a shufps+addps).
Nico Weber [Fri, 23 Apr 2021 15:45:49 +0000 (11:45 -0400)]
fix comment typo to cycle bots
Gabor Marton [Thu, 22 Apr 2021 13:12:40 +0000 (15:12 +0200)]
[Analyzer][StdLibraryFunctionsChecker] Describe arg constraints
In this patch, I provide a detailed explanation for each argument
constraint. This explanation is added in an extra 'note' tag, which is
displayed alongside the warning.
Since these new notes describe clearly the constraint, there is no need
to provide the number of the argument (e.g. 'Arg3') within the warning.
However, I decided to keep the name of the constraint in the warning (but
this could be a subject of discussion) in order to be able to identify
the different kinds of constraint violations easily in a bug database
(e.g. CodeChecker).
Differential Revision: https://reviews.llvm.org/D101060
Stephen Kelly [Thu, 22 Apr 2021 11:53:52 +0000 (12:53 +0100)]
[AST] Sort introspection results without instantiating other data
Avoid string allocation in particular, but also avoid attempting to
impose any particular ordering based on formatted results.
Differential Revision: https://reviews.llvm.org/D101054
Andrzej Warzynski [Fri, 23 Apr 2021 14:49:10 +0000 (14:49 +0000)]
[flang] Switch from %f18 to %flang_fc1 in a test
This patch updates the final test that can be shared between the old and
the new Flang drivers and that has not been ported yet. %f18 (always
expanded as `f18`) is replaced with %flang_fc1 (expanded as either `f18`
or `flang-new -fc1`, depending on `FLANG_BUILD_NEW_DRIVER`).
This test should've been updated in https://reviews.llvm.org/D100309,
but I missed it then. That's because this test contains non-ascii
characters and `grep -I %f18` (as well as other grep-like tools) skips
it because it's interpreted as a data/binary file. In fact, it's just a
text file with non-ascii chars.
Since this is an obvious omission from D100309 (reviewed, accepted and
merged), I'm sending this without a review to reduce the noise on
Phabricator.
Sander de Smalen [Wed, 27 Jan 2021 15:01:16 +0000 (15:01 +0000)]
[TTI] NFC: Change getIntImmCost[Inst|Intrin] to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D100565
Sander de Smalen [Wed, 27 Jan 2021 13:32:39 +0000 (13:32 +0000)]
[TTI] NFC: Change getScalingFactorCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D100564
Sander de Smalen [Wed, 27 Jan 2021 13:25:18 +0000 (13:25 +0000)]
[TTI] NFC: Change getMemcpyCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D100563
Sander de Smalen [Wed, 27 Jan 2021 13:15:21 +0000 (13:15 +0000)]
[TTI] NFC: Change getGEPCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D100562
Sander de Smalen [Wed, 27 Jan 2021 13:12:56 +0000 (13:12 +0000)]
[TTI] NFC: Change getAddressComputationCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D100561
dfukalov [Thu, 22 Apr 2021 11:52:25 +0000 (14:52 +0300)]
[TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes.
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Reviewed By: samparker
Differential Revision: https://reviews.llvm.org/D101151
Daniil Fukalov [Thu, 22 Apr 2021 19:14:49 +0000 (22:14 +0300)]
[TTI] Fix ScalarizationCost initialization.
In cases when ScalarizationCostPassed has no value, UINT_MAX is actually used
for cost estimation in `return ScalarCalls * ScalarCost + ScalarizationCost`.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D101099
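The failure mode named above, a "not passed" sentinel leaking into cost arithmetic, can be sketched as follows. This is not the code from the patch: `kNotPassed`, `scalarizedCost` and `fallbackOverhead` are hypothetical names, and the sketch only assumes that the sentinel is the maximum unsigned value.

```cpp
#include <limits>

// Hypothetical sentinel meaning "the caller did not pass a scalarization cost".
constexpr unsigned kNotPassed = std::numeric_limits<unsigned>::max();

// Resolves the sentinel to a real overhead before it takes part in the sum.
// Adding kNotPassed directly would wrap around and yield a meaningless cost.
constexpr unsigned scalarizedCost(unsigned scalarCalls, unsigned scalarCost,
                                  unsigned scalarizationCostPassed,
                                  unsigned fallbackOverhead) {
  return scalarCalls * scalarCost +
         (scalarizationCostPassed == kNotPassed ? fallbackOverhead
                                                : scalarizationCostPassed);
}
```

Because unsigned arithmetic is modular, adding the sentinel behaves like subtracting one, so an unresolved sentinel silently produces a plausible-looking but wrong estimate instead of an obvious error.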
Joe Ellis [Wed, 14 Apr 2021 09:50:57 +0000 (09:50 +0000)]
[AArch64][SVE] Fix bug in lowering of fixed-length integer vector divides
The function AArch64TargetLowering::LowerFixedLengthVectorIntDivideToSVE
previously assumed the operands were full vectors, but this is not
always true. This function would produce bogus results if the division operands
are not full vectors, resulting in miscompiles when dividing 8-bit or
16-bit vectors.
The fix is to perform an extend + div + truncate for non-full vectors,
instead of the usual unpacking and unzipping logic. This is an additive
change which reduces the non-full integer vector divisions to a pattern
recognised by the existing lowering logic.
For future reference, an example of code that would miscompile before
this patch is below:
  int8_t foo(unsigned N, int8_t *a, int8_t *b, int8_t *c) {
    int8_t result = 0;
    for (int i = 0; i < N; ++i) {
      result += (a[i] / b[i]) / c[i];
    }
    return result;
  }
Differential Revision: https://reviews.llvm.org/D100370
Jay Foad [Wed, 21 Apr 2021 14:32:00 +0000 (15:32 +0100)]
[AMDGPU] Fix typo in implicit operand lists
Several tests had a typo where they mentioned sgpr17 twice instead of
sgpr17 and sgpr27. This had a significant effect on the
"scavenge_sgpr_pei_no_sgprs" tests because there was actually an sgpr
available, namely sgpr27.
Differential Revision: https://reviews.llvm.org/D100960
Sebastian Neubauer [Fri, 23 Apr 2021 14:38:23 +0000 (16:38 +0200)]
Revert "[AMDGPU] Save WWM registers in functions"
This reverts commit
91464c30bfcf731ccb7f9d6ef6d26e8c1657a6e6.
Seems to break tests on Windows.
Piotr Sobczak [Fri, 23 Apr 2021 14:19:04 +0000 (16:19 +0200)]
[AMDGPU][NFC] Update auto-gen test
Most likely the "glc" was not added to the test when
the volatile loads started generating those bits.
Krzysztof Parzyszek [Fri, 23 Apr 2021 13:11:59 +0000 (08:11 -0500)]
[Hexagon] Remove redundant HVX intrinsic selection patterns, NFC
Deleted HexagonMapAsm2IntrinV65.gen.td that wasn't included anywhere,
moved V6_vrmpy*_rtt* patterns to HexagonIntrinsics.td.
Touch CMakeLists.txt to force re-cmake (somehow the unused file was
listed as a dependency in the generated makefiles).
Sebastian Neubauer [Fri, 23 Apr 2021 14:09:31 +0000 (16:09 +0200)]
[AMDGPU] Save WWM registers in functions
The values of registers in inactive lanes need to be saved during
function calls.
Save all registers used for whole wave mode, similar to how it is done
for VGPRs that are used for SGPR spilling.
Differential Revision: https://reviews.llvm.org/D99429
Paul C. Anagnostopoulos [Thu, 22 Apr 2021 17:26:19 +0000 (13:26 -0400)]
[TableGen] Correct some comments in the TableGen parser [NFC]
Differential Revision: https://reviews.llvm.org/D101088
Simon Pilgrim [Fri, 23 Apr 2021 13:51:25 +0000 (14:51 +0100)]
[X86] Add Win32/64 mulo test coverage
Part of an investigation to solve the Windows regressions caused by rG13ec913bdf50
Paul C. Anagnostopoulos [Tue, 20 Apr 2021 17:05:56 +0000 (13:05 -0400)]
[TableGen] [docs] Improve description of NAME in Programmer's Reference
Also use "parent class" consistently and add a note about the term.
Differential Revision: https://reviews.llvm.org/D100867
Joseph Huber [Thu, 22 Apr 2021 19:29:28 +0000 (15:29 -0400)]
[OpenMP] Replace global InfoLevel with a reference to an internal one.
Summary:
This patch improves the implementation of D100774 by replacing the global
variable introduced with a function that returns a reference to an internal
one. This removes the need to define the variable in every plugin that uses it.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D101102
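The general technique, a function that returns a reference to a function-local static instead of a separately defined global, can be sketched like this. The names `infoLevelStorage`, `getInfoLevel` and `setInfoLevel` are hypothetical and are not claimed to match the libomptarget sources.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical accessor. The variable lives inside the inline function, so a
// header can provide it without any translation unit defining a global.
inline std::atomic<std::uint32_t> &infoLevelStorage() {
  static std::atomic<std::uint32_t> level{0};
  return level;
}

// Convenience wrappers around the internal variable.
inline std::uint32_t getInfoLevel() { return infoLevelStorage().load(); }
inline void setInfoLevel(std::uint32_t value) { infoLevelStorage().store(value); }
```

Initialization of a function-local static is thread-safe since C++11. Whether separately linked shared objects end up sharing one instance depends on symbol visibility and how they are linked, so that part is an assumption to verify per platform.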
Anastasia Stulova [Fri, 23 Apr 2021 13:35:44 +0000 (14:35 +0100)]
[OpenCL] Fix typo in the test.
Dávid Bolvanský [Fri, 23 Apr 2021 13:25:27 +0000 (15:25 +0200)]
[InstCombine] Added tests for PR50096; NFC
Jez Ng [Thu, 22 Apr 2021 23:37:47 +0000 (19:37 -0400)]
[lld-macho] Have tests default to targeting macos 10.15
D101114 enforced proper version checks, which exposed a variety of version
mismatch issues in our tests. We previously changed the test inputs to
target 10.0, which was the simpler thing to do, but we should really
just have our lit.local.cfg default to targeting 10.15, which is what is done
here. We're not likely to ever have proper support for the older versions
anyway, as that would require more work for unclear benefit; for instance,
llvm-mc seems to generate a different compact unwind format for older macOS
versions, which would cause our compact-unwind.s test to fail.
Targeting 10.15 by default causes the following behavioral changes:
* `__mh_execute_header` is now a section symbol instead of an absolute symbol
* LC_BUILD_VERSION gets emitted instead of LC_VERSION_MIN_MACOSX. The former is
32 bytes in size whereas the latter is 16 bytes, so a bunch of hardcoded
address offsets in our tests had to be updated.
* >= 10.6 executables are PIE by default
Note that this diff was stacked atop of a local revert of most of the test
changes in rG8c17a875150f8e736e8f9061ddf084397f45f4c5, to make review easier.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D101119
Simon Pilgrim [Fri, 23 Apr 2021 13:19:24 +0000 (14:19 +0100)]
[X86] combineSetCCAtomicArith - pull out repeated ops. NFCI.
Reduces diff in D101074
Matt Arsenault [Fri, 23 Apr 2021 02:34:17 +0000 (22:34 -0400)]
AMDGPU: Fix assert on inline asm on gfx90a
This was assuming all mayLoad instructions have one def.
Timm Bäder [Fri, 23 Apr 2021 12:36:17 +0000 (14:36 +0200)]
[llvm][NFC] Fix assert indentation
This triggers GCC's misleading-indentation checker.
Dávid Bolvanský [Fri, 23 Apr 2021 12:42:37 +0000 (14:42 +0200)]
[InstCombine] Fixed newly added tests; NFC
Dawid Jurczak [Fri, 23 Apr 2021 12:24:19 +0000 (14:24 +0200)]
[InstCombine][NFC] add tests for printf("%s", str) --> puts(str)/noop transformation.
Split off from D100724.
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D101149
Dávid Bolvanský [Fri, 23 Apr 2021 12:05:57 +0000 (14:05 +0200)]
Reland "[Clang] Propagate guaranteed alignment for malloc and others"
This relands commit
6914a0ed2b30924b188968e59a83efa07ac5fe57. Crash in InstCombine was fixed.
Dávid Bolvanský [Fri, 23 Apr 2021 12:03:49 +0000 (14:03 +0200)]
[InstCombine] Fixed crash when setting align attr for memalign
Adam Czachorowski [Thu, 15 Apr 2021 20:33:05 +0000 (22:33 +0200)]
[clang] Do not crash on template specialization following a fatal error
There was a missing isInvalid() check leading to an attempt to
instantiate template with an empty instantiation stack.
Differential Revision: https://reviews.llvm.org/D100675
Fraser Cormack [Thu, 22 Apr 2021 08:55:27 +0000 (09:55 +0100)]
[RISCV] Custom lower vector F(MIN|MAX)NUM to vf(min|max)
This patch adds support for both scalable- and fixed-length vector code
lowering of the llvm.minnum and llvm.maxnum intrinsics to the equivalent
RVV instructions.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D101035
Raphael Isemann [Fri, 23 Apr 2021 11:20:06 +0000 (13:20 +0200)]
[lldb][NFC] Remove a stray unicode character in the LLDB test docs
There was a U+2028 character in this line (a special paragraph separator).
OCHyams [Fri, 23 Apr 2021 11:00:16 +0000 (12:00 +0100)]
[dexter] Add keyword argument 'on_line' to DexLabel
Add optional keyword argument 'on_line' to DexLabel to label the specified line
instead of the line the command is found on.
This will be helpful when used alongside DexDeclareFile (D99651).
Reviewed By: TWeaver
Differential Revision: https://reviews.llvm.org/D101055
Thomas Preud'homme [Thu, 22 Apr 2021 12:17:42 +0000 (13:17 +0100)]
[doc] Clarify constrained fcmps behavior
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D101053
Florian Hahn [Fri, 23 Apr 2021 08:57:03 +0000 (09:57 +0100)]
Recommit "[NewGVN] Track simplification dependencies for phi-of-ops."
This recommits
4f5da356ff35a218f23f0b0c4d08aee90da7de6e, including
explicit implementations of a move constructor and deleted copy
constructors/assignment operators, to fix failures with some compilers.
This reverts the revert
74854d00e854196445727a49df58fe5768d9ed5b.
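A minimal sketch of the pattern the recommit message describes: a type that spells out its move constructor and deletes its copy operations rather than relying on the implicitly generated members. `Tracker`, `add`, `size` and `makeTracker` are hypothetical names for illustration, not the class changed in NewGVN.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical example type; not the class touched by the commit.
class Tracker {
  std::vector<int> Deps;

public:
  Tracker() = default;

  // Move constructor written out explicitly instead of being left implicit.
  Tracker(Tracker &&Other) noexcept : Deps(std::move(Other.Deps)) {}

  // Copying is disallowed outright.
  Tracker(const Tracker &) = delete;
  Tracker &operator=(const Tracker &) = delete;

  void add(int V) { Deps.push_back(V); }
  std::size_t size() const { return Deps.size(); }
};

// The type can be moved but not copied.
static_assert(std::is_move_constructible<Tracker>::value, "movable");
static_assert(!std::is_copy_constructible<Tracker>::value, "not copyable");

// Returning by value only needs the move constructor (or elision).
inline Tracker makeTracker() {
  Tracker T;
  T.add(1);
  T.add(2);
  return T;
}
```

Stating the special members explicitly leaves nothing for individual compilers to decide about which ones get generated.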
Stephen Tozer [Thu, 22 Apr 2021 11:06:52 +0000 (12:06 +0100)]
Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
Previous build failures were caused by an error in bitcode reading and
writing for DIArgList metadata, which has been fixed in
e5d844b587.
There were also some unnecessary asserts that were being triggered on
certain builds, which have been removed.
This reverts commit
dad5caa59e6b2bde8d6cf5b64a972c393c526c82.
Wang, Pengfei [Fri, 23 Apr 2021 09:11:45 +0000 (17:11 +0800)]
[X86][AMX][NFC] Make comparison operators to be complete
The previous D101039 didn't fix the SmallSet insertion issue, because we
always return false for the comparison between 2 different nonnull BBs.
This patch makes the comparison complete by comparing `MBB`
first, so that we can always get an invariant order from a single
operator.
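A minimal sketch of why the comparison has to be complete, assuming a location is a (block, position) pair. `Block`, `Loc` and `countDistinct` are hypothetical stand-ins, and `std::set` is used here in place of SmallSet; this is not the code from the patch.

```cpp
#include <cstddef>
#include <functional>
#include <set>

// Hypothetical stand-ins; not the types used in the patch.
struct Block {
  int Number;
};

struct Loc {
  const Block *MBB; // containing block
  int Pos;          // position inside the block

  // Compare the block first, then the position. Two locations in
  // different blocks are therefore always ordered one way or the other,
  // instead of both comparisons returning false.
  bool operator<(const Loc &RHS) const {
    if (MBB != RHS.MBB)
      return std::less<const Block *>()(MBB, RHS.MBB);
    return Pos < RHS.Pos;
  }
};

// Inserts three distinct locations plus one duplicate into an ordered set
// and returns how many elements the set ends up holding.
inline std::size_t countDistinct() {
  static const Block A{0}, B{1};
  std::set<Loc> S;
  S.insert(Loc{&A, 1});
  S.insert(Loc{&B, 1});
  S.insert(Loc{&A, 2});
  S.insert(Loc{&A, 1}); // duplicate of the first entry
  return S.size();
}
```

If `operator<` returned false in both directions for locations in different blocks, an ordered container would treat them as equivalent and drop one of them on insertion.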
Dávid Bolvanský [Fri, 23 Apr 2021 09:33:12 +0000 (11:33 +0200)]
Revert "[Clang] Propagate guaranteed alignment for malloc and others"
This reverts commit
c2297544c04764237cedc523083c7be2fb3833d4. Some buildbots are broken.
LLVM GN Syncbot [Fri, 23 Apr 2021 09:26:02 +0000 (09:26 +0000)]
[gn build] Port
c623945d707c
Matthias Springer [Fri, 23 Apr 2021 09:11:07 +0000 (18:11 +0900)]
[mlir] Support masked N-D vector transfer ops in ProgressiveVectorToSCF.
Mask vectors are handled similarly to data vectors in N-D TransferWriteOp. They are copied into a temporary memory buffer, which can be indexed into with non-constant values.
Differential Revision: https://reviews.llvm.org/D101136
Tim Northover [Mon, 15 Feb 2021 11:58:35 +0000 (11:58 +0000)]
llvm-objdump: refactor SourcePrinter into separate file. NFC.
Preparatory patch for MachO feature.
Matthias Springer [Fri, 23 Apr 2021 09:04:58 +0000 (18:04 +0900)]
[mlir] Support masked 1D vector transfer ops in ProgressiveVectorToSCF
Support for masked N-D vector transfer ops will be added in a subsequent commit.
Differential Revision: https://reviews.llvm.org/D101132
Dávid Bolvanský [Fri, 23 Apr 2021 08:11:59 +0000 (10:11 +0200)]
[Clang] Propagate guaranteed alignment for malloc and others
LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations.
Originally started as an LLVM patch (https://reviews.llvm.org/D100862), but this logic really belongs in Clang.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D100879
Matthias Springer [Fri, 23 Apr 2021 08:59:46 +0000 (17:59 +0900)]
[mlir] Support broadcast dimensions in ProgressiveVectorToSCF
This commit adds support for broadcast dimensions in permutation maps of vector transfer ops.
Also fixes a bug in VectorToSCF that generated incorrect in-bounds checks for broadcast dimensions.
Differential Revision: https://reviews.llvm.org/D101019
Florian Hahn [Fri, 23 Apr 2021 08:56:17 +0000 (09:56 +0100)]
Revert "[NewGVN] Track simplification dependencies for phi-of-ops."
This reverts commit
4f5da356ff35a218f23f0b0c4d08aee90da7de6e.
This causes some buildbot failures, e.g.
https://lab.llvm.org/buildbot/#/builders/139/builds/3019
Matthias Springer [Thu, 22 Apr 2021 02:33:23 +0000 (11:33 +0900)]
[mlir] Use SCF for loops in ProgressiveVectorToSCF
Use SCF for loops instead of Affine for loops.
Differential Revision: https://reviews.llvm.org/D101013
Marius Brehler [Fri, 23 Apr 2021 08:45:57 +0000 (08:45 +0000)]
[mlir][docs] Update `add_mlir_doc` usage
Updates the docs to reflect the changes introduced to the `add_mlir_doc`
CMake macro with https://reviews.llvm.org/D100517.
Florian Hahn [Fri, 23 Apr 2021 08:27:06 +0000 (09:27 +0100)]
[NewGVN] Track simplification dependencies for phi-of-ops.
If we are using a simplified value, we need to add an extra
dependency on this value, because changes to the class of the
simplified value may require us to invalidate any decision based on
that value.
This is done by adding such values as additional users; however, the
current code does not exclude temporary instructions.
At the moment, this means that we miss those dependencies for
phi-of-ops, because they are temporary instructions at this point. We
instead need to add the extra dependencies to the root instruction of
the phi-of-ops.
This patch pushes the responsibility of adding extra users to the
callers of createExpression & performSymbolicEvaluation. At those
points, it is clearer which real instruction to pick.
Alternatively we could either pass the 'real' instruction as additional
argument or use another map, but I think the approach in the patch makes
things a bit easier to follow.
Fixes PR35074.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D99987
Matthias Springer [Fri, 23 Apr 2021 08:22:40 +0000 (17:22 +0900)]
[mlir] Support dimension permutations in ProgressiveVectorToSCF
This commit adds support for dimension permutations in permutation maps of vector transfer ops.
Differential Revision: https://reviews.llvm.org/D101007
Raphael Isemann [Fri, 23 Apr 2021 08:36:43 +0000 (10:36 +0200)]
[lldb][NFC] Delete a checked-in build log in docs/testsuite
Uday Bondhugula [Thu, 22 Apr 2021 06:35:37 +0000 (12:05 +0530)]
[MLIR][NFC] Fix warning, trim includes + cleanup in AffineOps.h
Fix style/clang-tidy warning, trim stale includes and forward
declarations, and cleanup/fix stale comments.
Differential Revision: https://reviews.llvm.org/D101021
Matthias Springer [Fri, 23 Apr 2021 08:18:26 +0000 (17:18 +0900)]
[mlir] Handle strided 1D vector transfer ops in ProgressiveVectorToSCF
Strided 1D vector transfer ops are 1D transfers operating on a memref dimension different from the last one. Such transfer ops do not access contiguous memory blocks (vectors), but access memory in a strided fashion. In the absence of a mask, strided 1D vector transfer ops can also be lowered using matrix.column.major.* LLVM instructions (in a later commit).
Subsequent commits will extend the pass to handle the remaining missing permutation maps (broadcasts, transposes, etc.).
Differential Revision: https://reviews.llvm.org/D100946
Chen Zheng [Thu, 22 Apr 2021 05:53:41 +0000 (01:53 -0400)]
[Debug-Info] change return type to void for attribute adding functions.
Make the following functions return void:
addLabel()
addSectionLabel()
addSectionDelta()
This aligns with the other attribute-adding functions.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D101022
Jay Foad [Thu, 22 Apr 2021 11:38:02 +0000 (12:38 +0100)]
[GlobalISel] Remove ConstantFoldingMIRBuilder
ConstantFoldingMIRBuilder was an experiment which is not used for
anything. The constant folding functionality is now part of
CSEMIRBuilder.
Differential Revision: https://reviews.llvm.org/D101050