Abhishek Varma [Thu, 17 Sep 2020 18:00:47 +0000 (23:30 +0530)]
[MLIR] Support for return values in Affine.For yield
Add support for return values in affine.for yield along the same lines
as scf.for and affine.parallel.
Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D87437
Yaxun (Sam) Liu [Thu, 17 Sep 2020 17:53:38 +0000 (13:53 -0400)]
Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic"
This reverts commit
ee5519d323571c4a9a7d92cb817023c9b95334cd.
Yaxun (Sam) Liu [Thu, 17 Sep 2020 17:53:25 +0000 (13:53 -0400)]
Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This reverts commit
7f1f89ec8d9944559042bb6d3b1132eabe3409de.
This reverts commit
40df06cdafc010002fc9cfe1dda73d689b7d27a6.
Sanjay Patel [Thu, 17 Sep 2020 17:49:48 +0000 (13:49 -0400)]
[VectorCombine] rearrange bailouts for load insert for efficiency; NFC
Sanjay Patel [Thu, 17 Sep 2020 17:21:58 +0000 (13:21 -0400)]
[VectorCombine] add test for multi-use load (PR47558); NFC
Jinsong Ji [Thu, 17 Sep 2020 17:43:41 +0000 (17:43 +0000)]
[PowerPC][AIX] Don't hardcode python invoke command line
We shouldn't assume python exists; we should let lit
decide whether it is python or python3 and expand the path.
Adrian Prantl [Thu, 17 Sep 2020 17:46:03 +0000 (10:46 -0700)]
Add missing include
jerryyin [Thu, 17 Sep 2020 15:47:33 +0000 (08:47 -0700)]
[AMDGPU] Fix ROCm unit test memref initialization
Raul Tambre [Fri, 4 Sep 2020 16:10:09 +0000 (19:10 +0300)]
[Sema] Introduce BuiltinAttr, per-declaration builtin-ness
Instead of relying on whether a certain identifier is a builtin, introduce BuiltinAttr to specify a declaration as having builtin semantics.
This fixes incompatible redeclarations of builtins, as reverting the identifier as being builtin due to one incompatible redeclaration would have broken the rest of the builtin calls.
Mostly-compatible redeclarations of builtins also no longer have builtin semantics. They don't call the builtin nor inherit their attributes.
A long-standing FIXME regarding builtins inside a namespace enclosed in extern "C" not being recognized is also addressed.
Due to the more correct handling, attributes for builtin functions are added in more places, resulting in more useful warnings.
Tests are updated to reflect that.
Intrinsics without an inline definition in intrin.h had `inline` and `static` removed as they had no effect and caused them to no longer be recognized as builtins otherwise.
A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name.
The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc.
It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable.
Fixes PR45410.
Reviewed By: rsmith, yutsumi
Differential Revision: https://reviews.llvm.org/D77491
Matt Morehouse [Thu, 17 Sep 2020 16:23:35 +0000 (09:23 -0700)]
[DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87801
Eduardo Caldas [Thu, 17 Sep 2020 09:32:46 +0000 (09:32 +0000)]
[SyntaxTree][Synthesis] Fix allocation in `createTree` for more general use
Prior to this change `createTree` could not create arbitrary syntax
trees. Now it dispatches to the constructor of the concrete syntax tree
according to the `NodeKind` passed as argument. This allows reuse inside
the Synthesis API.
Differential Revision: https://reviews.llvm.org/D87820
Bogdan Graur [Thu, 17 Sep 2020 16:04:21 +0000 (18:04 +0200)]
[amdgpu] Compilation fix for Release
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D87838
Sanjay Patel [Thu, 17 Sep 2020 13:02:26 +0000 (09:02 -0400)]
[InstSimplify] add tests for FP constant miscompile; NFC (PR43907)
David Green [Thu, 17 Sep 2020 15:58:35 +0000 (16:58 +0100)]
[ARM] Expand distributing increments to also handle existing pre/post inc instructions.
This extends the distributing postinc code in load/store optimizer to
also handle the case where there is an existing pre/post inc instruction,
where subsequent instructions can be modified to use the adjusted
offset from the increment. This can save us having to keep the old
register live past the increment instruction.
Differential Revision: https://reviews.llvm.org/D83377
Amara Emerson [Wed, 16 Sep 2020 19:14:40 +0000 (12:14 -0700)]
[AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors.
For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break
down the source vector into the final source vectors of <2 x s64> using unmerge.
This fixes a crash due to using the wrong number of elements for the breakdown
type.
Also add some legalizer tests explicitly for G_FPTRUNC, which we didn't have before.
Differential Revision: https://reviews.llvm.org/D87814
Hanhan Wang [Thu, 17 Sep 2020 15:54:16 +0000 (08:54 -0700)]
[mlir][Vector] Add a folder for vector.broadcast
Fold the operation if the source is a scalar constant or splat constant.
Update transform-patterns-matmul-to-vector.mlir because the broadcast ops are folded in the conversion.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D87703
Yaxun (Sam) Liu [Thu, 17 Sep 2020 15:51:09 +0000 (11:51 -0400)]
Fix build failure in clangd
Simon Pilgrim [Thu, 17 Sep 2020 15:00:02 +0000 (16:00 +0100)]
ModuloSchedule.cpp - remove unnecessary includes. NFCI.
Already included in ModuloSchedule.h
Matt Morehouse [Thu, 17 Sep 2020 15:43:26 +0000 (08:43 -0700)]
Revert "[DFSan] Add bcmp wrapper."
This reverts commit
559f9198125392bfa8e7d462aa8e87fcf5030185 due to bot
failure.
Max Kazantsev [Thu, 17 Sep 2020 15:36:41 +0000 (22:36 +0700)]
[Test] Add tests showing that IndVars cannot prove (X + 1 > X)
Valentin Clement [Thu, 17 Sep 2020 15:34:28 +0000 (11:34 -0400)]
[flang][openacc] Lower clauses on loop construct to OpenACC dialect
Lower OpenACCLoopConstruct and most of the clauses to the OpenACC acc.loop operation in MLIR.
This patch reflects what can be upstreamed from PR flang-compiler/f18-llvm-project#419
Reviewed By: SouraVX
Differential Revision: https://reviews.llvm.org/D87389
Valentin Clement [Thu, 17 Sep 2020 15:33:31 +0000 (11:33 -0400)]
[mlir][openacc] Change operand type from index to AnyInteger in parallel op
This patch changes the type of operands async, wait, numGangs, numWorkers and vectorLength from index
to AnyInteger to fit with acc.loop and the OpenACC specification.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D87712
David Green [Thu, 17 Sep 2020 15:33:03 +0000 (16:33 +0100)]
[ARM] Add more MVE postinc distribution tests. NFC
Yaxun (Sam) Liu [Wed, 16 Sep 2020 19:42:08 +0000 (15:42 -0400)]
[CUDA][HIP] Defer overloading resolution diagnostics for host device functions
In CUDA/HIP a function may become an implicit host device function by
pragma or constexpr. A host device function is checked in both
host and device compilation. However it may be emitted only
on host or device side, therefore the diagnostics should be
deferred until it is known to be emitted.
Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.
This patch lets clang defer all overloading resolution diagnostics for host device functions.
An option -fgpu-defer-diag is added to control this behavior. By default
it is off.
It is NFC for other languages.
Differential Revision: https://reviews.llvm.org/D84364
Sanne Wouda [Sat, 12 Sep 2020 00:17:42 +0000 (01:17 +0100)]
[AArch64] Match pairwise add/fadd pattern
D75689 turns the faddp pattern into a shuffle with vector add.
Match this new pattern in target-specific DAG combine, rather than ISel,
because legalization (for v2f32) turns it into a bit of a mess.
- extended to cover f16, f32, f64 and i64
Sanne Wouda [Fri, 4 Sep 2020 15:58:02 +0000 (16:58 +0100)]
Precommit test updates
Matt Morehouse [Thu, 17 Sep 2020 15:22:54 +0000 (08:22 -0700)]
[DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87801
Alexey Bataev [Wed, 16 Sep 2020 16:19:06 +0000 (12:19 -0400)]
[OpenMP 5.0] Fix user-defined mapper privatization in tasks
This patch fixes the problem that a user-defined mapper array is not correctly privatized inside a task. This problem causes openmp/libomptarget/test/offloading/target_depend_nowait.cpp to fail.
Differential Revision: https://reviews.llvm.org/D84470
Xun Li [Thu, 17 Sep 2020 15:12:46 +0000 (08:12 -0700)]
[Coroutine] Fix a bug where Coroutine incorrectly spills phi and invoke defs before CoroBegin
When a spill definition is before CoroBegin, we cannot spill it to the frame immediately after the definition. We have to spill it after the frame is ready.
The current implementation handles this properly for all kinds of instructions except PHINode and InvokeInst, which can also be defined before CoroBegin.
This patch fixes it by moving the CoroBegin dominance check earlier, so that it covers all cases.
Added a test.
Differential Revision: https://reviews.llvm.org/D87810
Louis Dionne [Thu, 30 Jul 2020 14:00:53 +0000 (10:00 -0400)]
[libc++] Remove some workarounds for missing variadic templates
We don't support GCC in C++03 mode, and Clang provides variadic templates
even in C++03 mode. So there's effectively no supported compiler that
doesn't support variadic templates.
This effectively gets rid of all uses of _LIBCPP_HAS_NO_VARIADICS, but
some workarounds for the lack of variadics remain.
Michael Liao [Wed, 9 Sep 2020 20:48:03 +0000 (16:48 -0400)]
[amdgpu] Lower SGPR-to-VGPR copy in the final phase of ISel.
- Need to lower COPY from SGPR to VGPR to a real instruction, as the
standard COPY is used where the source and destination are from the
same register bank so that we can potentially coalesce them together and
save one COPY. Because of that, backend optimizations, such as CSE,
won't handle them. However, a copy from SGPR to VGPR always needs
materializing to a native instruction, so it should be lowered into a
real one before other backend optimizations.
Differential Revision: https://reviews.llvm.org/D87556
David Green [Thu, 17 Sep 2020 15:00:51 +0000 (16:00 +0100)]
[ARM] Sink splats to MVE intrinsics
The predicated MVE intrinsics are generated as, for example,
llvm.arm.mve.add.predicated(x, splat(y), p). We need to sink the splat
value back into the loop, like we do for other instructions, so we can
re-select qr variants.
Differential Revision: https://reviews.llvm.org/D87693
Kamil Rytarowski [Thu, 17 Sep 2020 14:57:30 +0000 (16:57 +0200)]
[compiler-rt] [scudo] Fix typo in function attribute
Fixes the build after landing https://reviews.llvm.org/D87562
Stephan Herhut [Wed, 16 Sep 2020 08:01:54 +0000 (10:01 +0200)]
[mlir][Standard] Canonicalize chains of tensor_cast operations
Adds a pattern that replaces a chain of two tensor_cast operations by a single tensor_cast operation if doing so will not remove constraints on the shapes.
Kamil Rytarowski [Thu, 17 Sep 2020 14:46:32 +0000 (16:46 +0200)]
[compiler-rt] [hwasan] Replace INLINE with inline
Fixes the build after landing D87562.
Kamil Rytarowski [Thu, 17 Sep 2020 14:34:59 +0000 (16:34 +0200)]
[compiler-rt] [netbsd] Include <sys/dkbad.h>
Fixes build on NetBSD/sparc64.
alex-t [Wed, 16 Sep 2020 16:54:29 +0000 (19:54 +0300)]
[AMDGPU] should expand ROTL i16 to shifts.
The instruction combining pass turns the library rotl implementation into llvm.fshl.i16.
In the selection DAG the intrinsic is turned into an ISD::ROTL node that cannot be selected,
so it needs to be expanded to shifts again.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D87618
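The expansion described above can be illustrated in plain C++. This is only a sketch of the arithmetic behind "rotate expanded to shifts", not the AMDGPU lowering code; `rotl16` is a hypothetical helper name.

```cpp
#include <cstdint>

// Rotate a 16-bit value left using only two shifts and an OR.
// rotl16 is a hypothetical helper for illustration, not an LLVM API.
inline std::uint16_t rotl16(std::uint16_t value, unsigned amount) {
  amount &= 15u;  // rotation amounts are taken modulo the bit width
  // Widen to 32 bits so that the right shift by 16 (when amount == 0)
  // is well defined and simply yields zero.
  std::uint32_t wide = value;
  return static_cast<std::uint16_t>((wide << amount) | (wide >> (16u - amount)));
}
```

The final cast drops the bits that were shifted past bit 15, which is what makes the two shifts behave like a rotate.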
Kamil Rytarowski [Thu, 17 Sep 2020 14:27:48 +0000 (16:27 +0200)]
[compiler-rt] [tsan] [netbsd] Catch unsupported LONG_JMP_SP_ENV_SLOT
Error out during build for unsupported CPU.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87602
Kamil Rytarowski [Thu, 17 Sep 2020 14:04:50 +0000 (16:04 +0200)]
[compiler-rt] Replace INLINE with inline
This fixes the clash with BSD headers.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87562
Simon Pilgrim [Thu, 17 Sep 2020 14:05:45 +0000 (15:05 +0100)]
LiveDebugVariables.cpp - remove unnecessary Compiler.h include. NFCI.
Already included in LiveDebugVariables.h
Simon Pilgrim [Thu, 17 Sep 2020 14:03:53 +0000 (15:03 +0100)]
DwarfExpression.cpp - remove unnecessary includes. NFCI.
Already included in DwarfExpression.h
Simon Pilgrim [Thu, 17 Sep 2020 14:00:11 +0000 (15:00 +0100)]
ValueList.cpp - remove unnecessary includes. NFCI.
Already included in ValueList.h
Kamil Rytarowski [Thu, 17 Sep 2020 14:02:59 +0000 (16:02 +0200)]
[compiler-rt] Avoid pulling libatomic to sanitizer tests
Avoid falling back to software-emulated compiler atomics, which are usually
provided by libatomic, which is not always present.
This fixes the test on NetBSD, which does not provide libatomic in base.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87568
Simon Pilgrim [Thu, 17 Sep 2020 13:45:46 +0000 (14:45 +0100)]
SafeStackLayout.cpp - remove unnecessary StackLifetime.h include. NFCI.
Already included in SafeStackLayout.h
jerryyin [Wed, 16 Sep 2020 15:57:37 +0000 (08:57 -0700)]
[AMDGPU] Bump to ROCm 3.7 dependency hip_hcc->amdhip64
Differential Revision: https://reviews.llvm.org/D87773
Simon Pilgrim [Thu, 17 Sep 2020 13:27:15 +0000 (14:27 +0100)]
InstCombiner.h - remove unnecessary KnownBits.h include. NFCI.
Move the include down to cpp files with an implicit dependency.
Yvan Roux [Thu, 17 Sep 2020 13:13:55 +0000 (15:13 +0200)]
[ARM][MachineOutliner] Add missing testcase for calls.
Florian Hahn [Wed, 16 Sep 2020 17:44:40 +0000 (18:44 +0100)]
[MemorySSA] Add another loop clobber test case.
Kerry McLaughlin [Thu, 17 Sep 2020 10:52:14 +0000 (11:52 +0100)]
[SVE][CodeGen] Lower floating point -> integer conversions
This patch adds new ISD nodes, FCVTZS_MERGE_PASSTHRU &
FCVTZU_MERGE_PASSTHRU, which are used to lower scalable vector
FP_TO_SINT/FP_TO_UINT operations and the following intrinsics:
- llvm.aarch64.sve.fcvtzu
- llvm.aarch64.sve.fcvtzs
Reviewed By: efriedma, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D87232
Georgii Rymar [Thu, 17 Sep 2020 12:36:06 +0000 (15:36 +0300)]
[obj2yaml] - Don't emit EM_NONE.
When ELF header's `e_machine == 0`, we emit:
```
Machine: EM_NONE
```
We can avoid doing this, because yaml2obj sets the
`e_machine` field to `EM_NONE` by default.
Differential revision: https://reviews.llvm.org/D87829
Georgii Rymar [Tue, 15 Sep 2020 13:17:08 +0000 (16:17 +0300)]
[llvm-readelf/obj][test] - Document what we print in various places for unnamed section symbols.
We have an issue with `ELFDumper<ELFT>::getSymbolSectionName`:
1) It is used deeply for both LLVM/GNU styles and might return LLVM-style only
values to describe symbols: "Undefined", "Processor Specific", "Absolute", etc.
2) `getSymbolSectionName` is used by `getFullSymbolName` and these special values
might appear instead of symbol names in many places.
This occurs for unnamed section symbols.
It was not noticed because, in most cases I've found, it is unexpected to have an
unnamed section symbol. This patch documents the existing behavior, adds tests and FIXMEs.
Differential revision: https://reviews.llvm.org/D87763
Sanjay Patel [Thu, 17 Sep 2020 12:39:23 +0000 (08:39 -0400)]
[SLP] sort candidates to increase chance of optimal compare reduction
This is one (small) part of improving PR41312:
https://llvm.org/PR41312
As shown there and in the smaller tests here, if we have some member of the
reduction values that does not match the others, we want to push it to the
end (bring the matching members forward and together).
In the regression tests, we have 5 candidates for the 4 slots of the reduction.
If the one "wrong" compare is grouped with the others, it prevents forming the
ideal v4i1 compare reduction.
Differential Revision: https://reviews.llvm.org/D87772
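The idea of bringing the matching members forward and pushing the odd one to the end can be sketched with a stable partition. This is an illustration under simplifying assumptions, not the SLP vectorizer's actual sort: `Candidate` and `groupMatchingFirst` are hypothetical names, and "matching" is reduced to having the most common compare predicate.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a reduction candidate: only the compare
// predicate matters for this sketch.
struct Candidate {
  std::string predicate;  // e.g. "slt" or "sgt"
  int id;
};

// Moves candidates that share the most common predicate to the front,
// keeping the relative order inside each group.
void groupMatchingFirst(std::vector<Candidate> &candidates) {
  if (candidates.empty())
    return;
  std::map<std::string, int> counts;
  for (const Candidate &c : candidates)
    ++counts[c.predicate];
  auto most = std::max_element(
      counts.begin(), counts.end(),
      [](const auto &a, const auto &b) { return a.second < b.second; });
  const std::string majority = most->first;
  std::stable_partition(
      candidates.begin(), candidates.end(),
      [&](const Candidate &c) { return c.predicate == majority; });
}
```

With five candidates and four reduction slots, the four matching compares end up adjacent in the first four positions, so the mismatched one no longer splits the group.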
Jessica Clarke [Thu, 17 Sep 2020 12:44:01 +0000 (13:44 +0100)]
[clang][docs] Fix documentation of -O
D79916 changed the behaviour from -O2 to -O1 but the documentation was
not updated to reflect this.
Simon Pilgrim [Thu, 17 Sep 2020 12:28:14 +0000 (13:28 +0100)]
Remove unnecessary forward declarations. NFCI.
All of these forward declarations are fully defined in headers that are directly included.
Mikael Holmen [Thu, 17 Sep 2020 12:20:34 +0000 (14:20 +0200)]
[ConstraintSystem] Remove local variable that is set but not read [NFC]
gcc 7.4 warns about it.
mydeveloperday [Thu, 17 Sep 2020 12:22:26 +0000 (13:22 +0100)]
[clang-format][regression][PR47461] ifdef causes catch to be seen as a function
https://bugs.llvm.org/show_bug.cgi?id=47461
The following change {D80940} caused a regression in which ifdef's around the try and catch blocks cause incorrect brace placement around the catch
```
try
{
}
catch (...) {
// This is not a small function
bar = 1;
}
}
```
The brace after the catch will be placed on a newline
Reviewed By: curdeius
Differential Revision: https://reviews.llvm.org/D87291
Simon Pilgrim [Thu, 17 Sep 2020 12:08:42 +0000 (13:08 +0100)]
MetadataLoader.cpp - remove unnecessary StringRef include. NFCI.
Already included in MetadataLoader.h
Simon Pilgrim [Thu, 17 Sep 2020 11:52:23 +0000 (12:52 +0100)]
SymbolizableObjectFile.h - remove unnecessary includes. NFCI.
Use forward declarations where possible, move includes down to SymbolizableObjectFile.cpp and avoid duplicate includes.
Sam Parker [Thu, 17 Sep 2020 12:07:46 +0000 (13:07 +0100)]
[NFC][ARM] Tail fold test changes
Run update script on one test and add another.
David Spickett [Thu, 17 Sep 2020 12:07:44 +0000 (13:07 +0100)]
Revert "[lldb] Don't send invalid region addresses to lldb server"
This reverts commit
c687af0c30b4dbdc9f614d5e061c888238e0f9c5
due to a test failure on Windows.
David Green [Thu, 17 Sep 2020 11:39:21 +0000 (12:39 +0100)]
[ARM] Additional tests for qr intrinsics in loops. NFC
Simon Pilgrim [Thu, 17 Sep 2020 11:18:27 +0000 (12:18 +0100)]
DwarfStringPool.cpp - remove unnecessary StringRef include. NFCI.
Already included in DwarfStringPool.h
Simon Pilgrim [Thu, 17 Sep 2020 11:12:00 +0000 (12:12 +0100)]
DwarfFile.h - remove unnecessary includes. NFCI.
Use forward declarations where possible, move includes down to DwarfFile.cpp and avoid duplicate includes.
David Green [Thu, 17 Sep 2020 11:10:23 +0000 (12:10 +0100)]
[ARM] Extra fp16 bitcast tests. NFC
Alex Zinenko [Thu, 17 Sep 2020 10:59:57 +0000 (12:59 +0200)]
[mlir] turn clang-format back on in C API test
C API test uses FileCheck comments inside C code and needs to
temporarily switch off clang-format to prevent it from messing with
FileCheck directives. A recently landed commit forgot to turn it back on
after a block of FileCheck comments. Fix that.
Nico Weber [Thu, 17 Sep 2020 10:33:24 +0000 (06:33 -0400)]
[gn build] (manually) port
c9af34027bc
Vincent Zhao [Wed, 16 Sep 2020 15:04:09 +0000 (16:04 +0100)]
[MLIR] Turns swapId into a FlatAffineConstraints member func
`swapId` used to be a static function in `AffineStructures.cpp`. This diff makes it accessible from the external world by turning it into a member function of `FlatAffineConstraints`. This will be very helpful for other projects that need to manipulate the content of `FlatAffineConstraints`.
Differential Revision: https://reviews.llvm.org/D87766
Simon Pilgrim [Wed, 16 Sep 2020 18:02:20 +0000 (19:02 +0100)]
[AsmPrinter] DwarfDebug - use DebugLoc const references where possible. NFC.
Avoid unnecessary copies.
Simon Pilgrim [Wed, 16 Sep 2020 18:01:42 +0000 (19:01 +0100)]
[AMDGPU] Remove orphan SITargetLowering::LowerINT_TO_FP declaration. NFCI.
Method implementation no longer exists.
Simon Pilgrim [Wed, 16 Sep 2020 17:52:28 +0000 (18:52 +0100)]
[AsmPrinter] Remove orphan DwarfUnit::shareAcrossDWOCUs declaration. NFCI.
Method implementation no longer exists.
Jakub Lichman [Thu, 17 Sep 2020 09:26:30 +0000 (09:26 +0000)]
[mlir][Linalg] Convolution tiling added to ConvOp vectorization pass
ConvOp vectorization currently supports only convolutions of static shapes with dimensions
of size either 3(vectorized) or 1(not) as underlying vectors have to be of static
shape as well. In this commit we add support for convolutions of any size as well as
dynamic shapes by leveraging existing matmul infrastructure for tiling of both input
and kernel to sizes accepted by the previous version of ConvOp vectorization.
In the future this pass can be extended to take "tiling mask" as a user input which
will enable vectorization of user specified dimensions.
Differential Revision: https://reviews.llvm.org/D87676
Cullen Rhodes [Fri, 11 Sep 2020 15:18:44 +0000 (15:18 +0000)]
[clang][aarch64] ACLE: Support implicit casts between GNU and SVE vectors
This patch adds support for implicit casting between GNU vectors and SVE
vectors when `__ARM_FEATURE_SVE_BITS==N`, as defined by the Arm C
Language Extensions (ACLE, version 00bet5, section 3.7.3.3) for SVE [1].
This behavior makes it possible to use GNU vectors with ACLE functions
that operate on VLAT. For example:
typedef int8_t vec __attribute__((vector_size(32)));
vec f(vec x) { return svasrd_x(svptrue_b8(), x, 1); }
Tests are also added for implicit casting between GNU and fixed-length
SVE vectors created by the 'arm_sve_vector_bits' attribute. This
behavior makes it possible to use VLST with existing interfaces that
operate on GNUT. For example:
typedef int8_t vec1 __attribute__((vector_size(32)));
void f(vec1);
#if __ARM_FEATURE_SVE_BITS==256 && __ARM_FEATURE_SVE_VECTOR_OPERATORS
typedef svint8_t vec2 __attribute__((arm_sve_vector_bits(256)));
void g(vec2 x) { f(x); } // OK
#endif
The `__ARM_FEATURE_SVE_VECTOR_OPERATORS` feature macro indicates
interoperability with the GNU vector extension. This is the first patch
providing support for this feature, which once complete will be enabled
by the `-msve-vector-bits` flag, as the `__ARM_FEATURE_SVE_BITS` feature
currently is.
[1] https://developer.arm.com/documentation/100987/latest
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87607
David Spickett [Tue, 15 Sep 2020 13:49:48 +0000 (14:49 +0100)]
[lldb] Don't send invalid region addresses to lldb server
Previously when <addr> in "memory region <addr>" didn't
parse correctly, we'd print an error then also ask lldb-server
for a region containing LLDB_INVALID_ADDRESS.
(lldb) memory region not_an_address
error: invalid address argument "not_an_address"...
error: Server returned invalid range
Only send the command to lldb-server if the address
parsed correctly.
(lldb) memory region not_an_address
error: invalid address argument "not_an_address"...
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D87694
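The control flow of the fix can be sketched as an early return on a failed parse. This is a simplified stand-in, not lldb's code: `parseAddress`, `memoryRegion` and `RegionResult` are hypothetical names, and the server query is reduced to a flag.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string>
#include <vector>

// Parses text as an address; returns false if it is not a complete number.
// Hypothetical helper, not an lldb API.
bool parseAddress(const std::string &text, std::uint64_t &out) {
  try {
    std::size_t used = 0;
    // Base 0 lets std::stoull accept both decimal and 0x-prefixed input.
    unsigned long long value = std::stoull(text, &used, 0);
    if (used != text.size())
      return false;  // trailing garbage after the number
    out = value;
    return true;
  } catch (const std::exception &) {
    return false;  // not a number at all, or out of range
  }
}

struct RegionResult {
  std::vector<std::string> errors;
  bool queriedServer = false;
};

RegionResult memoryRegion(const std::string &arg) {
  RegionResult result;
  std::uint64_t addr = 0;
  if (!parseAddress(arg, addr)) {
    result.errors.push_back("invalid address argument \"" + arg + "\"");
    return result;  // stop here: never ask the server about an invalid address
  }
  result.queriedServer = true;  // a real client would send the query here
  return result;
}
```

Returning right after the parse error is what removes the second, misleading error about an invalid range.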
Rainer Orth [Thu, 17 Sep 2020 09:17:11 +0000 (11:17 +0200)]
[X86] Fix stack alignment on 32-bit Solaris/x86
On Solaris/x86, several hundred 32-bit tests `FAIL`, all in the same way:
env ASAN_OPTIONS=halt_on_error=false ./halt_on_error_suppress_equal_pcs.cpp.tmp
Segmentation Fault (core dumped)
They segfault during startup:
Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x080f21f0 in __sanitizer::internal_mmap(void*, unsigned long, int, int, int, unsigned long long) () at /vol/llvm/src/llvm-project/dist/compiler-rt/lib/sanitizer_common/sanitizer_solaris.cpp:65
65 int prot, int flags, int fd, OFF_T offset) {
1: x/i $pc
=> 0x80f21f0 <_ZN11__sanitizer13internal_mmapEPvmiiiy+16>: movaps 0x30(%esp),%xmm0
(gdb) p/x $esp
$3 = 0xfeffd488
The problem is that `movaps` expects 16-byte alignment, while 32-bit Solaris/x86
only guarantees 4-byte alignment following the i386 psABI.
This patch updates `X86Subtarget::initSubtargetFeatures` accordingly,
handles Solaris/x86 in the corresponding testcase, and allows for some
variation in address alignment in
`compiler-rt/test/ubsan/TestCases/TypeCheck/vptr.cpp`.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D87615
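The misalignment can be checked by hand from the gdb session above: the faulting `movaps` reads from `0x30(%esp)` with `%esp` at 0xfeffd488. A small sketch of that arithmetic, where `isMultipleOf` is a hypothetical helper name:

```cpp
#include <cstdint>

// True when addr is a multiple of alignment; alignment must be a power of two.
// Hypothetical helper for illustration.
constexpr bool isMultipleOf(std::uint64_t addr, std::uint64_t alignment) {
  return (addr & (alignment - 1)) == 0;
}

// Values taken from the gdb session quoted in the commit message.
constexpr std::uint64_t Esp = 0xfeffd488;
constexpr std::uint64_t Operand = Esp + 0x30;  // effective address of 0x30(%esp)

static_assert(!isMultipleOf(Operand, 16), "movaps operand is not 16-byte aligned");
static_assert(isMultipleOf(Operand, 4), "but it does satisfy 4-byte alignment");
```

The operand address is 0xfeffd4b8, which is 8 bytes past a 16-byte boundary: acceptable under a 4-byte alignment guarantee, but a fault for `movaps`.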
Douglas Yung [Thu, 17 Sep 2020 08:28:32 +0000 (01:28 -0700)]
Revert "Re-land: Add new hidden option -print-changed which only reports changes to IR"
The test added in this commit is failing on Windows bots:
http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/1269
This reverts commit
f9e6d1edc0dad9afb26e773aa125ed62c58f7080 and follow-up commit
6859d95ea2d0f3fe0de2923a3f642170e66a1a14.
Roman Lebedev [Thu, 17 Sep 2020 08:08:26 +0000 (11:08 +0300)]
[NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%)
This is functionally equivalent to the old implementation.
As per https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=4739e6e4eb54d3736e6457249c0919b30f6c855a&stat=instructions
this is a clear geomean compile-time regression-free win with overall geomean of `-0.08%`
32 PHI's appears to be the sweet spot; both the 16 and 64 performed worse:
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=c4efe1fbbfdf0305ac26cd19eacb0c7774cdf60e&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=e4989d1c67010d3339d1a40ff5286a31f10cfe82&stat=instructions
If we have more PHI's than that, we fall-back to the original DenseSet-based implementation,
so the not-so-fast cases will still be handled.
However compile-time isn't the main motivation here.
I can name at least 3 limitations of this CSE:
1. Assumes that all PHI nodes have incoming basic blocks in the same order (can be fixed while keeping the DenseMap)
2. Does not special-handle `undef` incoming values (I don't see how we can do this with hashing)
3. Does not special-handle backedge incoming values (maybe can be fixed by hashing backedge as some magical value)
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87408
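The small-size strategy described above can be sketched outside LLVM as follows. This is an illustration under simplifying assumptions, not the code from the patch: `Phi` is a hypothetical stand-in holding only incoming values, `dedupPhis` is a made-up name, and `std::set` replaces the DenseSet used for the large case.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a PHI node: just its list of incoming values.
// Real PHI nodes also carry their incoming blocks.
using Phi = std::vector<int>;

// Threshold reported in the commit message as the sweet spot.
constexpr std::size_t SmallSizeLimit = 32;

// Removes duplicates while keeping the first occurrence of each element.
std::vector<Phi> dedupPhis(const std::vector<Phi> &phis) {
  std::vector<Phi> unique;
  if (phis.size() <= SmallSizeLimit) {
    // Small input: pairwise comparison, O(n^2) but with no hashing at all.
    for (const Phi &candidate : phis) {
      bool seen = false;
      for (const Phi &kept : unique) {
        if (kept == candidate) {
          seen = true;
          break;
        }
      }
      if (!seen)
        unique.push_back(candidate);
    }
    return unique;
  }
  // Large input: fall back to a set-based lookup.
  std::set<Phi> seen;
  for (const Phi &candidate : phis)
    if (seen.insert(candidate).second)
      unique.push_back(candidate);
  return unique;
}
```

Both paths return the same result; only the cost model differs, which matches the commit's claim of functional equivalence.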
Jay Foad [Wed, 16 Sep 2020 10:13:45 +0000 (11:13 +0100)]
[SplitKit] Only copy live lanes
When splitting a live interval with subranges, only insert copies for
the lanes that are live at the point of the split. This avoids some
unnecessary copies and fixes a problem where copying dead lanes was
generating MIR that failed verification. The test case for this is
test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir.
Without this fix, some earlier live range splitting would create %430:
%430 [256r,848r:0)[848r,2584r:1) 0@256r 1@848r L0000000000000003 [848r,2584r:0) 0@848r L0000000000000030 [256r,2584r:0) 0@256r weight:1.480938e-03
...
256B undef %430.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec
...
848B %430.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec
...
2584B %431:vreg_128 = COPY %430:vreg_128
Then RAGreedy::tryLocalSplit would split %430 into %432 and %433 just
before 848B giving:
%432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03
%433 [844r,848r:0)[848r,2584r:1) 0@844r 1@848r L0000000000000030 [844r,2584r:0) 0@844r L0000000000000003 [844r,844d:0)[848r,2584r:1) 0@844r 1@848r weight:2.831776e-03
...
256B undef %432.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec
...
844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 {
internal %433.sub2:vreg_128 = COPY %432.sub2:vreg_128
848B }
%433.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec
...
2584B %431:vreg_128 = COPY %433:vreg_128
Note that the copy from %432 to %433 at 844B is a curious
bundle-without-a-BUNDLE-instruction that SplitKit creates deliberately,
and it includes a copy of .sub0 which is not live at this point, and
that causes it to fail verification:
*** Bad machine code: No live subrange at use ***
- function: zextload_global_v64i16_to_v64i64
- basic block: %bb.0 (0x7faed48) [0B;2848B)
- instruction: 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128
- operand 1: %432.sub0:vreg_128
- interval: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03
- at: 844B
Using real bundles with a BUNDLE instruction might also fix this
problem, but the current fix is less invasive and also avoids some
unnecessary copies.
https://bugs.llvm.org/show_bug.cgi?id=47492
Differential Revision: https://reviews.llvm.org/D87757
Jay Foad [Wed, 16 Sep 2020 19:28:02 +0000 (20:28 +0100)]
[AMDGPU] Generate test checks for splitkit-copy-bundle.mir
This is a pre-commit for D87757 "[SplitKit] Only copy live lanes".
Sjoerd Meijer [Thu, 17 Sep 2020 07:47:39 +0000 (08:47 +0100)]
[Lint] Add check for intrinsic get.active.lane.mask
As @efriedma pointed out in D86301, this "not equal to 0 check" of
get.active.lane.mask's second operand needs to live here in Lint and not the
Verifier.
Differential Revision: https://reviews.llvm.org/D87228
Qiu Chaofan [Thu, 17 Sep 2020 08:00:54 +0000 (16:00 +0800)]
[SelectionDAG] Check any use of negation result before removal
2508ef01 fixed a bug in constant removal during negation, but sanitizer
checks showed that an issue remained, so it was
reverted.
Temporary nodes are removed during negation if they turn out to be useless. Before the
removal, they are checked for uses by any other node, so the removal
was moved after getNode. However, in rare cases the node to be removed is
the same as the result of getNode. We missed that case; this patch
fixes it.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D87614
Tres Popp [Tue, 15 Sep 2020 16:28:59 +0000 (18:28 +0200)]
[mlir] Remove redundant shape.cstr_broadcastable canonicalization.
These canonicalizations are already handled by folding which will occur
in a superset of situations, so they are being removed.
Differential Revision: https://reviews.llvm.org/D87706
Fangrui Song [Thu, 17 Sep 2020 06:18:46 +0000 (23:18 -0700)]
[llvm-cov gcov] Add --demangled-names (-m)
gcov 4.9 introduced the option.
Artur Bialas [Thu, 17 Sep 2020 05:53:52 +0000 (22:53 -0700)]
[mlir][spirv] Add GroupNonUniformBroadcastOp
Added GroupNonUniformBroadcastOp to spirv dialect.
Differential Revision: https://reviews.llvm.org/D87688
Igor Kudrin [Thu, 17 Sep 2020 05:47:38 +0000 (12:47 +0700)]
[DebugInfo] Simplify DIEInteger::SizeOf().
An AsmPrinter should always be provided to the method because some forms
depend on its parameters. The only place in the codebase which passed
a nullptr value was found in the unit tests, so the patch updates it to
use some dummy AsmPrinter instead.
Differential Revision: https://reviews.llvm.org/D85293
Fangrui Song [Thu, 17 Sep 2020 05:41:30 +0000 (22:41 -0700)]
[llvm-cov gcov][test] Move tests to gcov/
And rename llvm-cov.test (misnomer) to basic.test
Craig Topper [Thu, 17 Sep 2020 04:56:01 +0000 (21:56 -0700)]
Add __divmodti4 to match libgcc.
gcc has used this on x86-64 since at least version 7.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D80506
Jonas Devlieghere [Thu, 17 Sep 2020 04:11:40 +0000 (21:11 -0700)]
[lldb] Return FileSP and StreamFileSP by value in IOHandler (NFC)
Smart pointers should be returned by value.
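A minimal sketch of why returning a shared pointer by value is the safer shape. The `File`, `FileSP` and `Handler` definitions here are hypothetical stand-ins, not lldb's classes.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration only.
struct File {
  int descriptor;
};
using FileSP = std::shared_ptr<File>;

class Handler {
public:
  explicit Handler(FileSP file) : m_file(std::move(file)) {}

  // Returning by value gives the caller its own reference, so the File
  // stays alive even if the handler later resets or replaces its member.
  FileSP GetFile() const { return m_file; }

  void Reset() { m_file.reset(); }

private:
  FileSP m_file;
};
```

Had `GetFile` returned a reference to `m_file`, a caller holding that reference would observe an empty pointer after `Reset()`.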
Jianzhou Zhao [Thu, 17 Sep 2020 04:02:19 +0000 (04:02 +0000)]
Fix the arguments of std::min
fixing
https://github.com/llvm/llvm-project/commit/
11201315d5881a135faa5aa87f415ce03f99eb96
Jianzhou Zhao [Thu, 17 Sep 2020 03:48:36 +0000 (03:48 +0000)]
Add the header of std::min
fixing
https://github.com/llvm/llvm-project/commit/
11201315d5881a135faa5aa87f415ce03f99eb96
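The two follow-up fixes above do not say what exactly went wrong, so the following is only a sketch of two common `std::min` pitfalls, assuming a missing header and mismatched argument types. `limitedSize` is a hypothetical helper.

```cpp
#include <algorithm>  // declares std::min
#include <cstddef>
#include <cstdint>

// Clamps a 64-bit size to a limit given as std::size_t.
// Hypothetical helper for illustration.
inline std::uint64_t limitedSize(std::uint64_t size, std::size_t limit) {
  // std::min(size, limit) can fail to compile on platforms where the two
  // types differ, because the single template parameter cannot be deduced.
  // Naming the type explicitly makes both arguments convert to it.
  return std::min<std::uint64_t>(size, limit);
}
```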
Jianzhou Zhao [Sat, 12 Sep 2020 19:35:17 +0000 (19:35 +0000)]
Flush bitcode incrementally for LTO output
The bitcode writer does not flush its buffer until the end by default. This is
fine for small bitcode files. When -flto,--plugin-opt=emit-llvm,-gmlt are
used, the final bitcode file is large, for example, >8G. Keeping all
data in memory consumes a lot of memory.
This change allows the bitcode writer to flush data to disk early when the buffered
data size is above some threshold. This is only enabled when lld emits
LLVM bitcode.
One issue to address is backpatching bitcode: subblock lengths, function
body indexes, and metadata indexes need to be backfilled. Because the buffer can be
flushed partially, we introduce raw_fd_stream, which supports
read/seek/write and enables backpatching bitcode already flushed to disk.
Reviewed-by: tejohnson, MaskRay
Differential Revision: https://reviews.llvm.org/D86905
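The backpatching idea can be sketched with a standard `std::fstream`, which, like the stream described above, supports read, seek, and write. This is an illustration under assumptions, not the raw_fd_stream API or the bitcode writer: the function name, the file layout (a 32-bit length followed by the payload), and the use of host byte order are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Writes [u32 length placeholder][payload], flushes to disk, then seeks back
// and overwrites the placeholder with the real length. Returns the length
// read back from the file.
std::uint32_t writeBlockWithBackpatch(const std::string &path,
                                      const std::string &payload) {
  std::fstream fs(path, std::ios::in | std::ios::out | std::ios::binary |
                            std::ios::trunc);
  assert(fs.is_open());

  const std::uint32_t placeholder = 0;
  const std::streampos lengthPos = fs.tellp();
  fs.write(reinterpret_cast<const char *>(&placeholder), sizeof placeholder);
  fs.write(payload.data(), static_cast<std::streamsize>(payload.size()));
  fs.flush();  // the data no longer needs to be kept in memory

  // Backpatch: the length is only known after the payload was written.
  const std::uint32_t length = static_cast<std::uint32_t>(payload.size());
  fs.seekp(lengthPos);
  fs.write(reinterpret_cast<const char *>(&length), sizeof length);
  fs.flush();

  std::uint32_t readBack = 0;
  fs.seekg(lengthPos);
  fs.read(reinterpret_cast<char *>(&readBack), sizeof readBack);
  return readBack;
}
```

A write-only stream could not perform the final patch once the placeholder had left the buffer, which is why a seekable, readable stream is needed.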
LLVM GN Syncbot [Thu, 17 Sep 2020 03:02:00 +0000 (03:02 +0000)]
[gn build] Port a895040eb02
Stella Stamenova [Thu, 17 Sep 2020 03:00:43 +0000 (20:00 -0700)]
Revert "[IRSim] Adding IR Instruction Mapper"
This reverts commit b04c1a9d3127730c05e8a22a0e931a12a39528df.
David Blaikie [Tue, 15 Sep 2020 19:49:53 +0000 (12:49 -0700)]
debug_rnglists/symbolizing: reduce memory usage by not caching rnglists
This matches the debug_ranges behavior - though is currently implemented
differently. (the debug_ranges parsing was handled by creating a new
ranges parser during DIE address querying, and just destroying it after
the query - whereas the rnglists parser is a member of the DWARFUnit
currently - so the API doesn't cache anymore)
I think this could/should be improved by not parsing debug_rnglists
headers at all when dumping debug_info or symbolizing - do it the way
DWARF (roughly) intended: take the rnglists_base, add addr*index to it,
read the offset, parse the list at rnglists_base+offset. This would have
no error checking for valid index (because the number of valid indexes
is stored in the header, which has a negative offset from rnglists_base
- and is sort of only intended for use by dumpers, not by parsers going
from debug_info to a rnglist) or out of contribution bounds access
(since it wouldn't know the length of the contribution, also in the
header) - nor any error-checking that the rnglist contribution was using
the same properties as the debug_info (version, DWARF32/64, address
size, etc).
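The header-free lookup proposed above can be sketched as follows. This is a hypothetical helper, not LLVM's DWARF parser. It assumes the per-index multiplier is the size of one offset-table entry (passed in as `offsetSize`) and that the section data is little-endian. The bounds assertion exists only to keep the sketch safe; as the message notes, a real header-free parser would not know the contribution length.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the section offset of the range list selected by `index`:
// read the entry at rnglistsBase + offsetSize * index, then add the value
// found there to rnglistsBase.
std::uint64_t rangeListOffset(const std::vector<std::uint8_t> &section,
                              std::uint64_t rnglistsBase, std::uint64_t index,
                              unsigned offsetSize) {
  assert(offsetSize == 4 || offsetSize == 8);
  const std::uint64_t entryPos = rnglistsBase + offsetSize * index;
  assert(entryPos + offsetSize <= section.size());

  // Decode a little-endian integer of offsetSize bytes (assumed encoding).
  std::uint64_t offset = 0;
  for (unsigned i = 0; i < offsetSize; ++i)
    offset |= static_cast<std::uint64_t>(section[entryPos + i]) << (8 * i);

  return rnglistsBase + offset;
}
```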
Eric Christopher [Wed, 16 Sep 2020 22:52:50 +0000 (15:52 -0700)]
Use zu rather than llu format specifier for size_t (-Wformat warning fix).
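A minimal sketch of the format-specifier point; `formatSize` is a hypothetical helper, not code from the patch. `%zu` is the specifier defined for `size_t`, whereas `%llu` expects `unsigned long long` and draws a -Wformat warning wherever `size_t` is a different type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a size_t with the matching length modifier.
std::string formatSize(std::size_t n) {
  char buf[32];
  const int written = std::snprintf(buf, sizeof buf, "%zu", n);
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof buf);
  return std::string(buf);
}
```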
Qiu Chaofan [Thu, 17 Sep 2020 02:19:09 +0000 (10:19 +0800)]
[PowerPC] Fix store-fptoi combine of f128 on Power8
llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8. We should
not generate FP_TO_(S|U)INT_IN_VSR for f128 types at this time. This
patch fixes it.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D86686
Chen Zheng [Thu, 17 Sep 2020 01:51:53 +0000 (21:51 -0400)]
[MachineSink] add one more mir case - nfc
Ryan Prichard [Wed, 16 Sep 2020 08:22:55 +0000 (01:22 -0700)]
[libunwind][DWARF] Fix end of .eh_frame calculation
* When .eh_frame is located using .eh_frame_hdr (PT_GNU_EH_FRAME), the
start of .eh_frame is known, but not the size. In this case, the
unwinder must rely on a terminator present at the end of .eh_frame.
Set dwarf_section_length to UINTPTR_MAX to indicate this.
* Add a new field, text_segment_length, that the FrameHeaderCache uses
to track the size of the PT_LOAD segment indicated by dso_base.
* Compute ehSectionEnd by adding sectionLength to ehSectionStart,
never to fdeHint.
Fixes PR46829.
Differential Revision: https://reviews.llvm.org/D87750
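The end-of-section rule from the list above can be sketched as a small helper. The names follow the commit message, but the function itself is hypothetical and is not libunwind's code. Treating an unknown length as "no upper bound" by returning UINTPTR_MAX is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Computes the end of .eh_frame from its start and length only; a lookup
// hint such as fdeHint plays no part in the calculation.
std::uintptr_t computeEhSectionEnd(std::uintptr_t ehSectionStart,
                                   std::uintptr_t sectionLength) {
  // UINTPTR_MAX marks an unknown length: the caller must rely on the
  // terminator at the end of .eh_frame instead of a computed bound.
  if (sectionLength == UINTPTR_MAX)
    return UINTPTR_MAX;
  assert(sectionLength <= UINTPTR_MAX - ehSectionStart);  // no wrap-around
  return ehSectionStart + sectionLength;
}
```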
LLVM GN Syncbot [Thu, 17 Sep 2020 01:54:10 +0000 (01:54 +0000)]
[gn build] Port b04c1a9d312
zhanghb97 [Mon, 14 Sep 2020 14:52:22 +0000 (22:52 +0800)]
[mlir] expose affine map to C API
This patch provides C API for MLIR affine map.
- Implement C API for AffineMap class.
- Add Utils.h to include/mlir/CAPI/, and move the definition of the CallbackOstream to Utils.h to make sure mlirAffineMapPrint works correctly.
- Add TODO for exposing the C API related to AffineExpr and mutable affine map.
Differential Revision: https://reviews.llvm.org/D87617
Andrew Litteken [Thu, 17 Sep 2020 01:24:29 +0000 (20:24 -0500)]
[IRSim] Adding IR Instruction Mapper
This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.
Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions. If it returns true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
to an unsigned integer. The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.
At present, branches and phi nodes are not mapped, and exception handling
is illegal. Debug instructions are not considered.
The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp
Differential Revision: https://reviews.llvm.org/D86968
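The mapping scheme can be illustrated with a simplified sketch. `SimpleInst` and `SimpleMapper` are hypothetical stand-ins for illustration, not the IRInstructionData or IRInstructionMapper classes, and an ordered map keyed on (opcode, result type, operand types) replaces the hashing described above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical simplified instruction, not llvm::Instruction.
struct SimpleInst {
  std::string opcode;
  std::string resultType;
  std::vector<std::string> operandTypes;
};

class SimpleMapper {
public:
  // Instructions with the same opcode, result type, and operand types
  // receive the same unsigned integer.
  unsigned map(const SimpleInst &inst) {
    Key key{inst.opcode, inst.resultType, inst.operandTypes};
    auto result = ids_.emplace(key, next_);
    if (result.second)
      ++next_;  // a new kind of instruction takes the next free integer
    return result.first->second;
  }

  // Maps a range of instructions to a list of integers, in order.
  std::vector<unsigned> mapRange(const std::vector<SimpleInst> &insts) {
    std::vector<unsigned> out;
    for (const SimpleInst &inst : insts)
      out.push_back(map(inst));
    assert(out.size() == insts.size());
    return out;
  }

private:
  using Key =
      std::tuple<std::string, std::string, std::vector<std::string>>;
  std::map<Key, unsigned> ids_;
  unsigned next_ = 0;
};
```

Two sequences of instructions that map to the same integer list are then candidates for being structurally similar.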