Andrew Wei [Fri, 18 Sep 2020 06:59:51 +0000 (14:59 +0800)]
[AArch64] Add tests for zext pattern match with AssertZext/AssertSext operand, NFC
Serge Pavlov [Wed, 16 Sep 2020 16:27:46 +0000 (23:27 +0700)]
[FPEnv] Use typed accessors in FPOptions
Previously methods `FPOptions::get*` returned unsigned value even if the
corresponding property was represented by specific enumeration type. With
this change such methods return actual type of the property. It also
allows printing value of a property as text rather than integer code.
Differential Revision: https://reviews.llvm.org/D87812
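To illustrate the difference between an untyped and a typed accessor as described above, here is a minimal sketch. The names `PackedOptions`, `RoundingKind` and `toText` are hypothetical stand-ins invented for this example; they are not the actual `FPOptions` declarations.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for illustration; not the real clang declarations.
enum class RoundingKind : uint8_t {
  TowardZero = 0,
  NearestTiesToEven = 1,
  Dynamic = 7
};

class PackedOptions {
  uint32_t Bits = 0; // all properties share one packed word
  static constexpr unsigned RoundingShift = 0;
  static constexpr uint32_t RoundingMask = 0x7u;

public:
  // Untyped style: the caller only gets a bare integer code.
  unsigned getRoundingRaw() const {
    return (Bits >> RoundingShift) & RoundingMask;
  }

  // Typed style: the accessor returns the enumeration itself.
  RoundingKind getRounding() const {
    return static_cast<RoundingKind>(getRoundingRaw());
  }

  void setRounding(RoundingKind K) {
    Bits = (Bits & ~(RoundingMask << RoundingShift)) |
           (static_cast<uint32_t>(K) << RoundingShift);
  }
};

// With a typed value, printing the property as text is a plain switch.
std::string toText(RoundingKind K) {
  switch (K) {
  case RoundingKind::TowardZero:
    return "towardzero";
  case RoundingKind::NearestTiesToEven:
    return "tonearest";
  case RoundingKind::Dynamic:
    return "dynamic";
  }
  return "unknown";
}
```

A caller of the typed accessor can switch over the enumerators directly and let the compiler warn about unhandled cases, which a plain unsigned return value does not allow.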
Artur Bialas [Fri, 18 Sep 2020 06:43:53 +0000 (08:43 +0200)]
Revert "This is a test commit"
This reverts commit
9d54b166c2e59f29e476a6566951b6809fc8808e.
Artur Bialas [Fri, 18 Sep 2020 06:43:18 +0000 (08:43 +0200)]
This is a test commit
Craig Topper [Fri, 18 Sep 2020 05:37:29 +0000 (22:37 -0700)]
[X86] Add some demanded bits test cases for PDEP with constant mask
The number of ones in the mask for the PDEP determines how many
bits of the other operand are used. If the mask is constant we
can use this to build a mask for SimplifyDemandedBits. This can
be used to replace the extends in the test with anyextend.
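The relationship between the mask's population count and the demanded bits of the other operand can be sketched with a software model of PDEP. This is only an illustration; `pdep32`, `popcount32` and `demandedSrcBits` are hypothetical helpers and not code from the X86 backend.

```cpp
#include <cstdint>

// Software model of PDEP: deposit the low bits of Src, in order, into the
// positions where Mask has a set bit.
uint32_t pdep32(uint32_t Src, uint32_t Mask) {
  uint32_t Result = 0;
  for (uint32_t Bit = 1; Mask != 0; Bit <<= 1) {
    uint32_t Lowest = Mask & (~Mask + 1); // isolate lowest set mask bit
    if (Src & Bit)
      Result |= Lowest;
    Mask &= Mask - 1; // clear lowest set mask bit
  }
  return Result;
}

// Count the set bits in V.
unsigned popcount32(uint32_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1)
    ++N;
  return N;
}

// Only the low popcount(Mask) bits of the source can reach the result.
uint32_t demandedSrcBits(uint32_t Mask) {
  unsigned N = popcount32(Mask);
  return N >= 32 ? 0xFFFFFFFFu : ((1u << N) - 1);
}
```

For a mask of 0xF0 only the low four source bits matter, so whatever the upper source bits contain (zero extension, sign extension, or garbage) cannot change the result.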
Andrew Wei [Fri, 18 Sep 2020 04:36:22 +0000 (12:36 +0800)]
[AArch64] Emit zext move when the source of the zext is AssertZext or AssertSext
When the source of the zext is AssertZext or AssertSext, it is hard to know any information about the upper 32 bits,
so we should insert a zext move before emitting SUBREG_TO_REG to define the lower 32 bits.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87771
Teresa Johnson [Fri, 18 Sep 2020 04:18:16 +0000 (21:18 -0700)]
Revert "[sanitizer] Add facility to print the full StackDepot"
This reverts commit
2ffaa9a1732c6f2af514603d25f0e8c238b3dd06.
There were 2 reported bot failures that need more investigation:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/69871/steps/stage%201%20check/logs/stdio
This one is in my new test.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/39187/steps/check-fuzzer/logs/stdio
This one seems completely unrelated.
Tue Ly [Thu, 13 Aug 2020 01:18:28 +0000 (21:18 -0400)]
[libc] Add implementation for hypotf
Truncate the sum of squares, then use a shift-and-add algorithm to compute its square root.
Required MPFR testing infra is updated in https://reviews.llvm.org/D87514
Differential Revision: https://reviews.llvm.org/D87516
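As a rough illustration of the shift-and-add idea mentioned above, the following sketch computes an integer square root one result bit at a time using only shifts, additions and subtractions. It works on plain integers and is not the libc `hypotf` implementation; the function name `isqrt64` is an assumption made for this example.

```cpp
#include <cstdint>

// Computes floor(sqrt(N)) bit by bit, using only shifts, adds and subtracts.
uint32_t isqrt64(uint64_t N) {
  uint64_t Root = 0;
  uint64_t Bit = uint64_t(1) << 62; // largest power of four in 64 bits
  while (Bit > N)
    Bit >>= 2; // start at the largest power of four not exceeding N
  while (Bit != 0) {
    if (N >= Root + Bit) {
      N -= Root + Bit;          // this result bit is set
      Root = (Root >> 1) + Bit;
    } else {
      Root >>= 1;               // this result bit is clear
    }
    Bit >>= 2;
  }
  return static_cast<uint32_t>(Root);
}
```

For example, for the legs 3 and 4 the sum of squares is 25 and `isqrt64(25)` yields 5.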
Teresa Johnson [Wed, 16 Sep 2020 20:47:16 +0000 (13:47 -0700)]
[sanitizer] Add facility to print the full StackDepot
Split out of D87120 (memory profiler). Added unit testing of the new
printing facility.
Differential Revision: https://reviews.llvm.org/D87792
Vitaly Buka [Fri, 18 Sep 2020 01:03:55 +0000 (18:03 -0700)]
[NFC] clang-format one line
Vitaly Buka [Fri, 18 Sep 2020 00:42:33 +0000 (17:42 -0700)]
[NFC][Lsan] Fix zero-sized array compilation error
Roland McGrath [Thu, 17 Sep 2020 19:35:31 +0000 (12:35 -0700)]
[scudo/standalone] Don't define test main function for Fuchsia
Fuchsia's unit test library provides the main function by default.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D87809
Rahul Joshi [Thu, 17 Sep 2020 23:51:20 +0000 (16:51 -0700)]
[MLIR] Fix build failure due to https://reviews.llvm.org/D87059.
- Remove spurious ;
- Make comparison object invokable as const.
Differential Revision: https://reviews.llvm.org/D87872
Sean Silva [Thu, 17 Sep 2020 23:20:47 +0000 (16:20 -0700)]
[mlir][shape] Add `shape.cstr_require %bool`
This op is a catch-all for creating witnesses from various random kinds
of constraints. In particular, when dealing with extents directly,
which are of `index` type, one can directly use std ops for calculating
the predicates, and then use cstr_require for the final conversion to a
witness.
Differential Revision: https://reviews.llvm.org/D87871
Vedant Kumar [Thu, 17 Sep 2020 23:53:17 +0000 (16:53 -0700)]
[lldb] Clarify docstring for SBBlock::IsInlined, NFC
Previously, there was a little ambiguity about whether IsInlined should
return true for an inlined lexical block, since technically the lexical
block would not represent an inlined function (it'd just be contained
within one).
Edit suggested by Jim Ingham.
Amara Emerson [Thu, 17 Sep 2020 23:42:18 +0000 (16:42 -0700)]
[AArch64][GlobalISel] Make G_STORE <8 x s8> legal.
Amara Emerson [Thu, 17 Sep 2020 23:40:36 +0000 (16:40 -0700)]
[AArch64][GlobalISel] clang-format AArch64LegalizerInfo.cpp. NFC.
Amy Kwan [Thu, 17 Sep 2020 23:20:37 +0000 (18:20 -0500)]
[PowerPC] Add Set Boolean Condition Instruction Definitions and MC Tests
This patch adds the instruction definitions and assembly/disassembly tests for
the set boolean condition instructions. This also includes the negative, and
reverse variants of the instruction.
Differential Revision: https://reviews.llvm.org/D86252
Amy Kwan [Wed, 16 Sep 2020 15:03:17 +0000 (10:03 -0500)]
[PowerPC] Implement Vector Count Mask Bits builtins in LLVM/Clang
This patch implements the vec_cntm function prototypes in altivec.h in order to
utilize the vector count mask bits instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D82726
Philip Reames [Thu, 17 Sep 2020 23:07:22 +0000 (16:07 -0700)]
[MemorySSA] Fix an unused variable warning [NFC]
Rahul Joshi [Thu, 17 Sep 2020 20:18:09 +0000 (13:18 -0700)]
[MLIR][TableGen] Automatic detection and elimination of redundant methods
- Change OpClass new method addition to find and eliminate any existing methods that
are made redundant by the newly added method, as well as detect if the newly added
method will be redundant and return nullptr in that case.
- To facilitate that, add the notion of resolved and unresolved parameters, where resolved
parameters have each parameter type known, so that redundancy checks on methods
with same name but different parameter types can be done.
- Eliminate existing code to avoid adding conflicting/redundant build methods and rely
on this new mechanism to eliminate conflicting build methods.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47095
Differential Revision: https://reviews.llvm.org/D87059
Zhaoshi Zheng [Fri, 27 Mar 2020 05:09:31 +0000 (22:09 -0700)]
[RISCV] Support Shadow Call Stack
Currently, x18 is assumed to be the pointer to the shadow call stack. The user
shall pass the flags:
"-fsanitize=shadow-call-stack -ffixed-x18"
Runtime support is needed to set up x18.
If SCS is desired, all parts of the program should be built with -ffixed-x18 to
maintain interoperability.
There's no particular reason that we must use x18 as the SCS pointer. Any register
may be used, as long as it does not already have a designated purpose, like RA or
passing call arguments.
Differential Revision: https://reviews.llvm.org/D84414
Philip Reames [Thu, 17 Sep 2020 22:39:50 +0000 (15:39 -0700)]
[AArch64] Enable implicit null check transformation
This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support:
An implicit null check is the use of a signal handler to catch a null pointer dereference and redirect control to a handler. Specifically, it replaces an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control w/appropriate metadata.
FAULTING_OP is used to wrap the faulting instruction. It is modelled as being a conditional branch to reflect the fact it can transfer control in the CFG.
FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.)
When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction.
As can be seen in the test changes, currently the AArch64 backend does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causes BranchFolding to crash. I haven't yet figured out if it's a latent bug in BranchFolding, or something I'm doing wrong.)
Differential Revision: https://reviews.llvm.org/D87851
Arthur Eubanks [Mon, 24 Aug 2020 20:43:02 +0000 (13:43 -0700)]
[test] Fix FullUnroll.ll
I believe the intention of this test added in
https://reviews.llvm.org/D71687 was to test LoopFullUnrollPass with
clang's -fno-unroll-loops, not its interaction with optnone. Loop
unrolling passes don't run under optnone/-O0.
Also added back unintentionally removed -disable-loop-unrolling from
https://reviews.llvm.org/D85578.
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D86485
Quentin Colombet [Thu, 17 Sep 2020 21:47:12 +0000 (14:47 -0700)]
[TargetRegisterInfo] Add a couple of target hooks for the greedy register allocator
Before this patch, the last chance recoloring and deferred spilling
techniques were solely controlled by command line options.
This patch adds target hooks for these two techniques so that it
is easier for backend writers to override the default behavior.
The default behavior of the hooks preserves the default values of
the related command line options.
NFC
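The pattern described above, a target hook whose default implementation simply forwards the command line option, might look roughly like the sketch below. All names here (`EnableDeferredSpillingOption`, `RegisterInfoBase`, `shouldUseDeferredSpilling`, `MyTargetRegisterInfo`) are hypothetical and chosen for illustration; they are not the actual TargetRegisterInfo interface.

```cpp
// Hypothetical stand-in for a command line option; off by default.
static bool EnableDeferredSpillingOption = false;

// Hypothetical base class playing the role of the generic register info.
struct RegisterInfoBase {
  virtual ~RegisterInfoBase() = default;

  // Default hook: preserve whatever the command line option says.
  virtual bool shouldUseDeferredSpilling() const {
    return EnableDeferredSpillingOption;
  }
};

// A backend can override the hook without touching any command line flags.
struct MyTargetRegisterInfo : RegisterInfoBase {
  bool shouldUseDeferredSpilling() const override { return true; }
};
```

Targets that do not override the hook keep the option-driven behavior, which is why such a change can be NFC.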
Zhaoshi Zheng [Thu, 17 Sep 2020 22:14:14 +0000 (15:14 -0700)]
[NFC] Test Commit
Derek Schuff [Sat, 8 Aug 2020 04:23:11 +0000 (21:23 -0700)]
Support dwarf fission for wasm object files
Initial support for dwarf fission sections (-gsplit-dwarf) on wasm.
The most interesting change is support for writing 2 files (.o and .dwo) in the
wasm object writer. My approach moves object-writing logic into its own function
and calls it twice, swapping out the endian::Writer (W) in between calls.
It also splits the import-preparation step into its own function (and skips it when writing a dwo).
Differential Revision: https://reviews.llvm.org/D85685
Florian Hahn [Thu, 17 Sep 2020 21:09:53 +0000 (22:09 +0100)]
[MemorySSA] Be more conservative when traversing MemoryPhis.
I think we need to be even more conservative when traversing memory
phis, to make sure we catch any loop carried dependences.
This approach updates fillInCurrentPair to use unknown sizes for
locations when we walk over a phi, unless the location is guaranteed to
be loop-invariant for any possible loop. Using an unknown size for
locations should ensure we catch all memory accesses to locations after
the given memory location, which includes loop-carried dependences.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D87778
Arthur Eubanks [Thu, 17 Sep 2020 20:57:28 +0000 (13:57 -0700)]
[NewPM] Fix pr45927.ll under NPM
Alexander Shaposhnikov [Fri, 11 Sep 2020 05:05:20 +0000 (22:05 -0700)]
[llvm-install-name-tool] Update the command-line guide
Nikita Popov [Sat, 5 Sep 2020 09:38:39 +0000 (11:38 +0200)]
[InstCombine] Canonicalize SPF_ABS to abs intrinsic
Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic.
To be conservative, the one-use check on the comparison is retained,
this may be relaxed if all goes well.
It's pretty likely that this will uncover places that are missing
handling for the abs() intrinsic. Please report any seen performance
regressions.
Differential Revision: https://reviews.llvm.org/D87188
Whitney Tsang [Thu, 17 Sep 2020 17:53:26 +0000 (17:53 +0000)]
[LoopUnrollAndJam] Allow unroll and jam loops forced by user.
Summary: Allow unroll and jam loops forced by user.
LoopUnrollAndJamPass is still disabled by default in the NPM pipeline,
and can be controlled by -enable-npm-unroll-and-jam.
Reviewed By: Meinersbur, dmgreen
Differential Revision: https://reviews.llvm.org/D87786
Nikita Popov [Thu, 17 Sep 2020 19:22:37 +0000 (21:22 +0200)]
[GVN] Use that assume(!X) implies X==false (PR47496)
We already use that assume(X) implies X==true, do the same for
assume(!X) implying X==false. This fixes PR47496.
Nikita Popov [Thu, 17 Sep 2020 18:39:29 +0000 (20:39 +0200)]
[GVN] Add additional assume tests (NFC)
The other assume tests seem to be dealing with equalities in
particular. Test implication for the condition itself, especially
the negated case from PR47496.
Florian Hahn [Thu, 17 Sep 2020 15:45:02 +0000 (16:45 +0100)]
[SCEV] Add test cases for max BTC with loop guard info.
This adds test cases for PR40961 and PR47247. They illustrate cases in
which the max backedge-taken count can be improved by information from
the loop guards.
Victor Huang [Thu, 17 Sep 2020 19:13:29 +0000 (14:13 -0500)]
Disable hoisting MI to hotter basic blocks when using pgo
This is a follow up patch for https://reviews.llvm.org/D63676 to
enable the feature when using pgo.
Differential Revision: https://reviews.llvm.org/D85240
Vitaly Buka [Thu, 17 Sep 2020 19:15:00 +0000 (12:15 -0700)]
[Lsan] Use fp registers to search for pointers
X86 can use xmm registers for pointers operations. e.g. for std::swap.
I don't know yet if it's possible on other platforms.
NT_X86_XSTATE includes all registers from NT_FPREGSET so
the latter is used only if the former is not available. I am not sure how
reasonable it is to expect that, but LLDB has such a fallback in
NativeRegisterContextLinux_x86_64::ReadFPR.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D87754
Jon Roelofs [Thu, 17 Sep 2020 19:13:22 +0000 (12:13 -0700)]
AArch64::ArchKind's underlying type is uint64_t
LLVM GN Syncbot [Thu, 17 Sep 2020 19:09:34 +0000 (19:09 +0000)]
[gn build] Port
7e4c6fb8546
Andrew Litteken [Thu, 17 Sep 2020 17:28:09 +0000 (12:28 -0500)]
[IRSim] Adding IR Instruction Mapper
This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.
Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions. If they return true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
with an unsigned integer. The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.
At present, branches and phi nodes are not mapped, and exception handling
is illegal. Debug instructions are not considered.
The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp
Recommit of:
b04c1a9d3127730c05e8a22a0e931a12a39528df
Differential Revision: https://reviews.llvm.org/D86968
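The mapping scheme described above can be sketched as follows: instructions that agree on opcode, result type and operand types receive the same unsigned integer, and the mapper keeps two parallel lists. The `Inst` and `Mapper` types are simplified, hypothetical stand-ins (types are plain strings here) and do not reflect the real IRInstructionMapper or IRInstructionData classes.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified instruction: types are just strings.
struct Inst {
  std::string Opcode;
  std::string ResultType;
  std::vector<std::string> OperandTypes;
};

class Mapper {
  using Key =
      std::tuple<std::string, std::string, std::vector<std::string>>;
  std::map<Key, unsigned> Ids; // key -> assigned integer
  unsigned NextId = 0;

public:
  std::vector<unsigned> IntegerList; // one integer per mapped instruction
  std::vector<Inst> InstList;        // the wrapped instructions, same order

  // Map every instruction in the range, appending to both lists.
  void mapRange(const std::vector<Inst> &Range) {
    for (const Inst &I : Range) {
      Key K(I.Opcode, I.ResultType, I.OperandTypes);
      auto It = Ids.find(K);
      if (It == Ids.end())
        It = Ids.emplace(K, NextId++).first; // first time this shape is seen
      IntegerList.push_back(It->second);
      InstList.push_back(I);
    }
  }
};
```

Two `add i32` instructions map to the same integer, while an `add i64` gets a fresh one because its result and operand types differ.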
Cameron McInally [Thu, 17 Sep 2020 18:54:46 +0000 (13:54 -0500)]
[SVE][WIP] Implement lowering for fixed length VSELECT to Scalable
Map fixed length VSELECT to its Scalable equivalent.
Differential Revision: https://reviews.llvm.org/D85364
Reid Kleckner [Thu, 4 Jun 2020 01:08:55 +0000 (18:08 -0700)]
[PDB] Split TypeServerSource and extend type index map lifetime
Extending the lifetime of these type index mappings does increase memory
usage (+2% in my case), but it decouples type merging from symbol
merging. This is a pre-requisite for two changes that I have in mind:
- parallel type merging: speeds up slow type merging
- deferred symbol merging: avoid heap allocating (relocating) all symbols
This eliminates CVIndexMap and moves its data into TpiSource. The maps
are also split into a SmallVector and ArrayRef component, so that the
ipiMap can alias the tpiMap for /Z7 object files, and so that both maps
can simply alias the PDB type server maps for /Zi files.
Splitting TypeServerSource establishes that all input types to be merged
can be identified with two 32-bit indices:
- The index of the TpiSource object
- The type index of the record
This is useful, because this information can be stored in a single
64-bit atomic word to enable concurrent hashtable insertion.
One last change is that now all object files with debugChunks get a
TpiSource, even if they have no type info. This avoids some null checks
and special cases.
Differential Revision: https://reviews.llvm.org/D87736
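The point about identifying a type with two 32-bit indices that fit in one 64-bit atomic word can be illustrated with the sketch below. It assumes a hypothetical `TypeRef` pair and a single hash table cell claimed with compare-and-swap; it is not the LLD data structure, and the all-ones `EmptyCell` sentinel is an assumption of this example.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical pair: index of the source object and type index in it.
struct TypeRef {
  uint32_t SourceIdx;
  uint32_t TypeIdx;
};

inline uint64_t pack(TypeRef R) {
  return (static_cast<uint64_t>(R.SourceIdx) << 32) | R.TypeIdx;
}

inline TypeRef unpack(uint64_t W) {
  return TypeRef{static_cast<uint32_t>(W >> 32), static_cast<uint32_t>(W)};
}

// Assumed sentinel for an unoccupied cell.
constexpr uint64_t EmptyCell = ~uint64_t(0);

// Try to claim the cell; return whichever entry ends up stored there.
TypeRef insertOrGet(std::atomic<uint64_t> &Cell, TypeRef Candidate) {
  uint64_t Expected = EmptyCell;
  if (Cell.compare_exchange_strong(Expected, pack(Candidate)))
    return Candidate;      // we won the race
  return unpack(Expected); // another entry was already stored
}
```

Because both indices travel in one word, a single compare-and-swap publishes them together, so concurrent inserters never observe a half-written entry.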
Amara Emerson [Thu, 17 Sep 2020 18:17:18 +0000 (11:17 -0700)]
[AArch64][GlobalISel] Widen G_EXTRACT_VECTOR_ELT element types if < 8b.
In order to not unnecessarily promote the source vector to greater than our
native vector size of 128b, I've added some cascading rules to widen based on
the number of elements.
Amara Emerson [Thu, 17 Sep 2020 18:16:02 +0000 (11:16 -0700)]
[AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal for shifts.
Sanjay Patel [Thu, 17 Sep 2020 18:22:05 +0000 (14:22 -0400)]
[VectorCombine] limit load+insert transform to one-use
As discussed in:
https://llvm.org/PR47558
...there are several potential fixes/follow-ups visible
in the test case, but this is the quickest and safest
fix of the perf regression.
Craig Topper [Thu, 17 Sep 2020 17:33:34 +0000 (10:33 -0700)]
[X86] Don't match x87 register inline asm constraints unless the VT is floating point or it's a clobber
The register class picked will be the RFP80 register class which has a f80 VT. The code in SelectionDAGBuilder that generates copies around inline assembly doesn't know how to handle an integer and floating point type of different bit widths.
The test case is derived from this https://godbolt.org/z/sEa659 which gcc accepts but clang crashes on. This patch just gives a more graceful error. I'm not sure if the single element struct case is special in gcc. Adding another field to the struct makes gcc reject it. If we want to support this correctly I think we need a change in the frontend to give us the true element type. Right now the frontend just realizes the constraint can take a memory argument so creates an integer type of the same size and bitcasts.
Differential Revision: https://reviews.llvm.org/D87485
Navdeep Kumar [Thu, 17 Sep 2020 18:07:21 +0000 (23:37 +0530)]
[MLIR][Affine] Add parametric tile size support for affine.for tiling
Add support to tile affine.for ops with parametric sizes (i.e., SSA
values). Currently supports hyper-rectangular loop nests with constant
lower bounds only. Move methods
- moveLoopBody(*)
- getTileableBands(*)
- checkTilingLegality(*)
- tilePerfectlyNested(*)
- constructTiledIndexSetHyperRect(*)
to allow reuse with constant tile size API. Add a test pass
-test-affine-parametric-tile to test parametric tiling.
Differential Revision: https://reviews.llvm.org/D87353
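What tiling with a runtime (parametric) tile size means for a single loop with a constant lower bound can be sketched in plain C++. This only illustrates the iteration-space transformation; `tileLoop` is a hypothetical helper, not the MLIR utility.

```cpp
#include <algorithm>
#include <vector>

// Splits the iteration space [Lower, Upper) into tiles of TileSize
// iterations, where TileSize is only known at run time. The last tile may
// be partial. Returns the indices visited by each tile.
std::vector<std::vector<int>> tileLoop(int Lower, int Upper, int TileSize) {
  std::vector<std::vector<int>> Tiles;
  if (TileSize <= 0)
    return Tiles; // reject non-positive tile sizes
  for (int TileStart = Lower; TileStart < Upper; TileStart += TileSize) {
    // Inter-tile loop: one iteration per tile.
    int TileEnd = std::min(TileStart + TileSize, Upper);
    std::vector<int> Tile;
    for (int I = TileStart; I < TileEnd; ++I)
      Tile.push_back(I); // intra-tile loop: the original loop body
    Tiles.push_back(Tile);
  }
  return Tiles;
}
```

With bounds 0 to 10 and a tile size of 4 this produces three tiles, the last of which holds only the two remaining iterations.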
Abhishek Varma [Thu, 17 Sep 2020 18:00:47 +0000 (23:30 +0530)]
[MLIR] Support for return values in Affine.For yield
Add support for return values in affine.for yield along the same lines
as scf.for and affine.parallel.
Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D87437
Yaxun (Sam) Liu [Thu, 17 Sep 2020 17:53:38 +0000 (13:53 -0400)]
Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic"
This reverts commit
ee5519d323571c4a9a7d92cb817023c9b95334cd.
Yaxun (Sam) Liu [Thu, 17 Sep 2020 17:53:25 +0000 (13:53 -0400)]
Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This reverts commit
7f1f89ec8d9944559042bb6d3b1132eabe3409de.
This reverts commit
40df06cdafc010002fc9cfe1dda73d689b7d27a6.
Sanjay Patel [Thu, 17 Sep 2020 17:49:48 +0000 (13:49 -0400)]
[VectorCombine] rearrange bailouts for load insert for efficiency; NFC
Sanjay Patel [Thu, 17 Sep 2020 17:21:58 +0000 (13:21 -0400)]
[VectorCombine] add test for multi-use load (PR47558); NFC
Jinsong Ji [Thu, 17 Sep 2020 17:43:41 +0000 (17:43 +0000)]
[PowerPC][AIX] Don't hardcode python invoke command line
We shouldn't assume python exists; we should let lit
decide whether it is python or python3 and expand the path.
Adrian Prantl [Thu, 17 Sep 2020 17:46:03 +0000 (10:46 -0700)]
Add missing include
jerryyin [Thu, 17 Sep 2020 15:47:33 +0000 (08:47 -0700)]
[AMDGPU] Fix ROCm unit test memref initialization
Raul Tambre [Fri, 4 Sep 2020 16:10:09 +0000 (19:10 +0300)]
[Sema] Introduce BuiltinAttr, per-declaration builtin-ness
Instead of relying on whether a certain identifier is a builtin, introduce BuiltinAttr to specify a declaration as having builtin semantics.
This fixes incompatible redeclarations of builtins, as reverting the identifier as being builtin due to one incompatible redeclaration would have broken the rest of the builtin calls.
Mostly-compatible redeclarations of builtins also no longer have builtin semantics. They don't call the builtin nor inherit their attributes.
A long-standing FIXME regarding builtins inside a namespace enclosed in extern "C" not being recognized is also addressed.
Due to the more correct handling, attributes for builtin functions are added in more places, resulting in more useful warnings.
Tests are updated to reflect that.
Intrinsics without an inline definition in intrin.h had `inline` and `static` removed as they had no effect and caused them to no longer be recognized as builtins otherwise.
A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name.
The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc.
It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable.
Fixes PR45410.
Reviewed By: rsmith, yutsumi
Differential Revision: https://reviews.llvm.org/D77491
Matt Morehouse [Thu, 17 Sep 2020 16:23:35 +0000 (09:23 -0700)]
[DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87801
Eduardo Caldas [Thu, 17 Sep 2020 09:32:46 +0000 (09:32 +0000)]
[SyntaxTree][Synthesis] Fix allocation in `createTree` for more general use
Prior to this change `createTree` could not create arbitrary syntax
trees. Now it dispatches to the constructor of the concrete syntax tree
according to the `NodeKind` passed as argument. This allows reuse inside
the Synthesis API.
Differential Revision: https://reviews.llvm.org/D87820
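Dispatching to the constructor of a concrete node type based on a kind value, as described above, can be sketched like this. The `Kind` enumeration, the node classes and `createNode` are hypothetical and much simpler than the clang syntax tree API.

```cpp
#include <memory>

// Hypothetical node kinds and classes, for illustration only.
enum class Kind { Leaf, BinaryExpr, Unknown };

struct Node {
  virtual ~Node() = default;
  virtual Kind kind() const = 0;
};

struct LeafNode : Node {
  Kind kind() const override { return Kind::Leaf; }
};

struct BinaryExprNode : Node {
  Kind kind() const override { return Kind::BinaryExpr; }
};

struct GenericNode : Node {
  Kind kind() const override { return Kind::Unknown; }
};

// Pick the concrete class from the kind passed as argument.
std::unique_ptr<Node> createNode(Kind K) {
  switch (K) {
  case Kind::Leaf:
    return std::unique_ptr<Node>(new LeafNode());
  case Kind::BinaryExpr:
    return std::unique_ptr<Node>(new BinaryExprNode());
  case Kind::Unknown:
    return std::unique_ptr<Node>(new GenericNode());
  }
  return nullptr;
}
```

Callers receive a node whose dynamic type matches the requested kind, so later casts to the concrete class succeed.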
Bogdan Graur [Thu, 17 Sep 2020 16:04:21 +0000 (18:04 +0200)]
[amdgpu] Compilation fix for Release
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D87838
Sanjay Patel [Thu, 17 Sep 2020 13:02:26 +0000 (09:02 -0400)]
[InstSimplify] add tests for FP constant miscompile; NFC (PR43907)
David Green [Thu, 17 Sep 2020 15:58:35 +0000 (16:58 +0100)]
[ARM] Expand distributing increments to also handle existing pre/post inc instructions.
This extends the distributing postinc code in load/store optimizer to
also handle the case where there is an existing pre/post inc instruction,
where subsequent instructions can be modified to use the adjusted
offset from the increment. This can save us having to keep the old
register live past the increment instruction.
Differential Revision: https://reviews.llvm.org/D83377
Amara Emerson [Wed, 16 Sep 2020 19:14:40 +0000 (12:14 -0700)]
[AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors.
For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break
down the source vector into the final source vectors of <2 x s64> using unmerge.
This fixes a crash due to using the wrong number of elements for the breakdown
type.
Also add some legalizer tests explicitly for G_FPTRUNC, which we didn't have.
Differential Revision: https://reviews.llvm.org/D87814
Hanhan Wang [Thu, 17 Sep 2020 15:54:16 +0000 (08:54 -0700)]
[mlir][Vector] Add a folder for vector.broadcast
Fold the operation if the source is a scalar constant or splat constant.
Update transform-patterns-matmul-to-vector.mlir because the broadcast ops are folded in the conversion.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D87703
Yaxun (Sam) Liu [Thu, 17 Sep 2020 15:51:09 +0000 (11:51 -0400)]
Fix build failure in clangd
Simon Pilgrim [Thu, 17 Sep 2020 15:00:02 +0000 (16:00 +0100)]
ModuloSchedule.cpp - remove unnecessary includes. NFCI.
Already included in ModuloSchedule.h
Matt Morehouse [Thu, 17 Sep 2020 15:43:26 +0000 (08:43 -0700)]
Revert "[DFSan] Add bcmp wrapper."
This reverts commit
559f9198125392bfa8e7d462aa8e87fcf5030185 due to bot
failure.
Max Kazantsev [Thu, 17 Sep 2020 15:36:41 +0000 (22:36 +0700)]
[Test] Add tests showing that IndVars cannot prove (X + 1 > X)
Valentin Clement [Thu, 17 Sep 2020 15:34:28 +0000 (11:34 -0400)]
[flang][openacc] Lower clauses on loop construct to OpenACC dialect
Lower OpenACCLoopConstruct and most of the clauses to the OpenACC acc.loop operation in MLIR.
This patch reflects what can be upstreamed from PR flang-compiler/f18-llvm-project#419
Reviewed By: SouraVX
Differential Revision: https://reviews.llvm.org/D87389
Valentin Clement [Thu, 17 Sep 2020 15:33:31 +0000 (11:33 -0400)]
[mlir][openacc] Change operand type from index to AnyInteger in parallel op
This patch changes the type of operands async, wait, numGangs, numWorkers and vectorLength from index
to AnyInteger to fit with acc.loop and the OpenACC specification.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D87712
David Green [Thu, 17 Sep 2020 15:33:03 +0000 (16:33 +0100)]
[ARM] Add more MVE postinc distribution tests. NFC
Yaxun (Sam) Liu [Wed, 16 Sep 2020 19:42:08 +0000 (15:42 -0400)]
[CUDA][HIP] Defer overloading resolution diagnostics for host device functions
In CUDA/HIP a function may become implicit host device function by
pragma or constexpr. A host device function is checked in both
host and device compilation. However it may be emitted only
on host or device side, therefore the diagnostics should be
deferred until it is known to be emitted.
Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.
This patch lets clang defer all overloading resolution diagnostics for host device functions.
An option -fgpu-defer-diag is added to control this behavior. By default
it is off.
It is NFC for other languages.
Differential Revision: https://reviews.llvm.org/D84364
Sanne Wouda [Sat, 12 Sep 2020 00:17:42 +0000 (01:17 +0100)]
[AArch64] Match pairwise add/fadd pattern
D75689 turns the faddp pattern into a shuffle with vector add.
Match this new pattern in target-specific DAG combine, rather than ISel,
because legalization (for v2f32) turns it into a bit of a mess.
- extended to cover f16, f32, f64 and i64
Sanne Wouda [Fri, 4 Sep 2020 15:58:02 +0000 (16:58 +0100)]
Precommit test updates
Matt Morehouse [Thu, 17 Sep 2020 15:22:54 +0000 (08:22 -0700)]
[DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87801
Alexey Bataev [Wed, 16 Sep 2020 16:19:06 +0000 (12:19 -0400)]
[OpenMP 5.0] Fix user-defined mapper privatization in tasks
This patch fixes the problem that the user-defined mapper array is not correctly privatized inside a task. This problem causes openmp/libomptarget/test/offloading/target_depend_nowait.cpp to fail.
Differential Revision: https://reviews.llvm.org/D84470
Xun Li [Thu, 17 Sep 2020 15:12:46 +0000 (08:12 -0700)]
[Coroutine] Fix a bug where Coroutine incorrectly spills phi and invoke defs before CoroBegin
When a spill definition is before CoroBegin, we cannot spill it to the frame immediately after the definition. We have to spill it after the frame is ready.
The current implementation handles this properly for all kinds of instructions except PHINode and InvokeInst, which can also be defined before CoroBegin.
This patch fixes it by moving the CoroBegin dominance check earlier, so that it covers all cases.
Added a test.
Differential Revision: https://reviews.llvm.org/D87810
Louis Dionne [Thu, 30 Jul 2020 14:00:53 +0000 (10:00 -0400)]
[libc++] Remove some workarounds for missing variadic templates
We don't support GCC in C++03 mode, and Clang provides variadic templates
even in C++03 mode. So there's effectively no supported compiler that
doesn't support variadic templates.
This effectively gets rid of all uses of _LIBCPP_HAS_NO_VARIADICS, but
some workarounds for the lack of variadics remain.
Michael Liao [Wed, 9 Sep 2020 20:48:03 +0000 (16:48 -0400)]
[amdgpu] Lower SGPR-to-VGPR copy in the final phase of ISel.
- Need to lower COPY from SGPR to VGPR to a real instruction as the
standard COPY is used where the source and destination are from the
same register bank so that we potentially coalesce them together and
save one COPY. Considering that, backend optimizations, such as CSE,
won't handle them. However, the copy from SGPR to VGPR always needs
materializing to a native instruction, it should be lowered into a
real one before other backend optimizations.
Differential Revision: https://reviews.llvm.org/D87556
David Green [Thu, 17 Sep 2020 15:00:51 +0000 (16:00 +0100)]
[ARM] Sink splats to MVE intrinsics
The predicated MVE intrinsics are generated as, for example,
llvm.arm.mve.add.predicated(x, splat(y), p). We need to sink the splat
value back into the loop, like we do for other instructions, so we can
re-select qr variants.
Differential Revision: https://reviews.llvm.org/D87693
Kamil Rytarowski [Thu, 17 Sep 2020 14:57:30 +0000 (16:57 +0200)]
[compiler-rt] [scudo] Fix typo in function attribute
Fixes the build after landing https://reviews.llvm.org/D87562
Stephan Herhut [Wed, 16 Sep 2020 08:01:54 +0000 (10:01 +0200)]
[mlir][Standard] Canonicalize chains of tensor_cast operations
Adds a pattern that replaces a chain of two tensor_cast operations by a single tensor_cast operation if doing so will not remove constraints on the shapes.
Kamil Rytarowski [Thu, 17 Sep 2020 14:46:32 +0000 (16:46 +0200)]
[compiler-rt] [hwasan] Replace INLINE with inline
Fixes the build after landing D87562.
Kamil Rytarowski [Thu, 17 Sep 2020 14:34:59 +0000 (16:34 +0200)]
[compiler-rt] [netbsd] Include <sys/dkbad.h>
Fixes build on NetBSD/sparc64.
alex-t [Wed, 16 Sep 2020 16:54:29 +0000 (19:54 +0300)]
[AMDGPU] should expand ROTL i16 to shifts.
Instruction combining pass turns library rotl implementation to llvm.fshl.i16.
In the selection dag the intrinsic is turned to ISD::ROTL node that cannot be selected.
Need to expand it to shifts again.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D87618
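Expanding a 16-bit rotate-left into shifts, as mentioned above, amounts to the following. This is a plain C++ sketch of the arithmetic, not the SelectionDAG expansion code; `rotl16` is a name chosen for this example.

```cpp
#include <cstdint>

// Rotate a 16-bit value left by Amount using two shifts and an OR.
uint16_t rotl16(uint16_t Value, unsigned Amount) {
  Amount &= 15;          // the rotate amount is taken modulo the bit width
  unsigned Wide = Value; // widen so the left shift does not lose bits early
  // The right-shift amount is also masked so that Amount == 0 stays valid.
  unsigned Rotated = (Wide << Amount) | (Wide >> ((16 - Amount) & 15));
  return static_cast<uint16_t>(Rotated);
}
```

Rotating 0x1234 left by 4 moves the top nibble to the bottom and gives 0x2341.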
Kamil Rytarowski [Thu, 17 Sep 2020 14:27:48 +0000 (16:27 +0200)]
[compiler-rt] [tsan] [netbsd] Catch unsupported LONG_JMP_SP_ENV_SLOT
Error out during build for unsupported CPU.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87602
Kamil Rytarowski [Thu, 17 Sep 2020 14:04:50 +0000 (16:04 +0200)]
[compiler-rt] Replace INLINE with inline
This fixes the clash with BSD headers.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87562
Simon Pilgrim [Thu, 17 Sep 2020 14:05:45 +0000 (15:05 +0100)]
LiveDebugVariables.cpp - remove unnecessary Compiler.h include. NFCI.
Already included in LiveDebugVariables.h
Simon Pilgrim [Thu, 17 Sep 2020 14:03:53 +0000 (15:03 +0100)]
DwarfExpression.cpp - remove unnecessary includes. NFCI.
Already included in DwarfExpression.h
Simon Pilgrim [Thu, 17 Sep 2020 14:00:11 +0000 (15:00 +0100)]
ValueList.cpp - remove unnecessary includes. NFCI.
Already included in ValueList.h
Kamil Rytarowski [Thu, 17 Sep 2020 14:02:59 +0000 (16:02 +0200)]
[compiler-rt] Avoid pulling libatomic to sanitizer tests
Avoid falling back to software emulated compiler atomics, that are usually
provided by libatomic, which is not always present.
This fixes the test on NetBSD, which does not provide libatomic in base.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87568
Simon Pilgrim [Thu, 17 Sep 2020 13:45:46 +0000 (14:45 +0100)]
SafeStackLayout.cpp - remove unnecessary StackLifetime.h include. NFCI.
Already included in SafeStackLayout.h
jerryyin [Wed, 16 Sep 2020 15:57:37 +0000 (08:57 -0700)]
[AMDGPU] Bump to ROCm 3.7 dependency hip_hcc->amdhip64
Differential Revision: https://reviews.llvm.org/D87773
Simon Pilgrim [Thu, 17 Sep 2020 13:27:15 +0000 (14:27 +0100)]
InstCombiner.h - remove unnecessary KnownBits.h include. NFCI.
Move the include down to cpp files with an implicit dependency.
Yvan Roux [Thu, 17 Sep 2020 13:13:55 +0000 (15:13 +0200)]
[ARM][MachineOutliner] Add missing testcase for calls.
Florian Hahn [Wed, 16 Sep 2020 17:44:40 +0000 (18:44 +0100)]
[MemorySSA] Add another loop clobber test case.
Kerry McLaughlin [Thu, 17 Sep 2020 10:52:14 +0000 (11:52 +0100)]
[SVE][CodeGen] Lower floating point -> integer conversions
This patch adds new ISD nodes, FCVTZS_MERGE_PASSTHRU &
FCVTZU_MERGE_PASSTHRU, which are used to lower scalable vector
FP_TO_SINT/FP_TO_UINT operations and the following intrinsics:
- llvm.aarch64.sve.fcvtzu
- llvm.aarch64.sve.fcvtzs
Reviewed By: efriedma, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D87232
Georgii Rymar [Thu, 17 Sep 2020 12:36:06 +0000 (15:36 +0300)]
[obj2yaml] - Don't emit EM_NONE.
When ELF header's `e_machine == 0`, we emit:
```
Machine: EM_NONE
```
We can avoid doing this, because yaml2obj sets the
`e_machine` field to `EM_NONE` by default.
Differential revision: https://reviews.llvm.org/D87829
Georgii Rymar [Tue, 15 Sep 2020 13:17:08 +0000 (16:17 +0300)]
[llvm-readelf/obj][test] - Document what we print in various places for unnamed section symbols.
We have an issue with `ELFDumper<ELFT>::getSymbolSectionName`:
1) It is used deeply for both LLVM/GNU styles and might return LLVM-style only
values to describe symbols: "Undefined", "Processor Specific", "Absolute", etc.
2) `getSymbolSectionName` is used by `getFullSymbolName` and these special values
might appear instead of symbol names in many places.
This occurs for unnamed section symbols.
It was not noticed because for most cases I've found it is unexpected to have an
unnamed section symbol. This patch documents the existing behavior, adds tests and FIXMEs.
Differential revision: https://reviews.llvm.org/D87763
Sanjay Patel [Thu, 17 Sep 2020 12:39:23 +0000 (08:39 -0400)]
[SLP] sort candidates to increase chance of optimal compare reduction
This is one (small) part of improving PR41312:
https://llvm.org/PR41312
As shown there and in the smaller tests here, if we have some member of the
reduction values that does not match the others, we want to push it to the
end (bring the matching members forward and together).
In the regression tests, we have 5 candidates for the 4 slots of the reduction.
If the one "wrong" compare is grouped with the others, it prevents forming the
ideal v4i1 compare reduction.
Differential Revision: https://reviews.llvm.org/D87772
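The idea of bringing matching members forward and pushing the odd one out to the end can be sketched with a stable partition. The `Candidate` type, the choice of "most common predicate" as the matching criterion, and `groupMatchingFirst` are assumptions made for this illustration; the SLP vectorizer's actual ordering logic may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reduction candidate: a compare with a predicate and an id.
struct Candidate {
  std::string Predicate;
  int Id;
};

// Move candidates with the most common predicate to the front, keeping
// their relative order; everything else ends up at the back.
void groupMatchingFirst(std::vector<Candidate> &Cands) {
  std::string Best;
  std::size_t BestCount = 0;
  for (const Candidate &C : Cands) {
    std::size_t N = static_cast<std::size_t>(
        std::count_if(Cands.begin(), Cands.end(), [&](const Candidate &O) {
          return O.Predicate == C.Predicate;
        }));
    if (N > BestCount) {
      BestCount = N;
      Best = C.Predicate;
    }
  }
  std::stable_partition(
      Cands.begin(), Cands.end(),
      [&](const Candidate &C) { return C.Predicate == Best; });
}
```

With five candidates of which four share a predicate, the four matching ones become adjacent at the front, so they can fill the four slots of the reduction together.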
Jessica Clarke [Thu, 17 Sep 2020 12:44:01 +0000 (13:44 +0100)]
[clang][docs] Fix documentation of -O
D79916 changed the behaviour from -O2 to -O1 but the documentation was
not updated to reflect this.