David Green [Tue, 30 Mar 2021 10:19:16 +0000 (11:19 +0100)]
[ARM] Handle Splats in MVE lane interleaving
As another addition to MVE lane interleaving, this handles Splat shuffle
vectors, as the shuffle of a splat is a splat.
Differential Revision: https://reviews.llvm.org/D97291
Serguei Katkov [Tue, 30 Mar 2021 06:21:03 +0000 (13:21 +0700)]
[RegAlloc] Add a test with use in statepoint expected to be on stack.
The test shows that RA computes the spill weight without taking into
account that a statepoint instruction can accept its var operands on
the stack. As a result, the corresponding virtual register evicts
another register whose use actually requires a register.
This causes a redundant fill operation.
David Sherwood [Wed, 10 Mar 2021 17:06:47 +0000 (17:06 +0000)]
[LoopVectorize] Add support for scalable vectorization of induction variables
This patch adds support for the vectorization of induction variables when
using scalable vectors, which required the following changes:
1. Removed assert from InnerLoopVectorizer::getStepVector.
2. Modified InnerLoopVectorizer::createVectorIntOrFpInductionPHI to use
a runtime determined value for VF and removed an assert.
3. Modified InnerLoopVectorizer::buildScalarSteps to work for scalable
vectors. I did this by calculating the full vector value for each Part
of the unroll factor (UF) and caching this in the VP state. This means
that we are always able to extract an arbitrary element from the vector
if necessary. In addition to this, I also permitted the caching of the
individual lane values themselves for the known minimum number of elements
in the same way we do for fixed width vectors. This is a further
optimisation that improves the code quality since it avoids unnecessary
extractelement operations when extracting the first lane.
4. Added an assert to InnerLoopVectorizer::widenPHIInstruction, since while
testing some code paths I noticed this is currently broken for scalable
vectors.
Various tests to support different cases have been added here:
Transforms/LoopVectorize/AArch64/sve-inductions.ll
Differential Revision: https://reviews.llvm.org/D98715
Stefan Gränitz [Tue, 30 Mar 2021 09:57:38 +0000 (11:57 +0200)]
Re-apply "[lli] Make -jit-kind=orc the default JIT engine"
MCJIT served well as the default JIT engine in lli for a long time, but the code is getting old and maintenance efforts don't seem to be in sight. In the meantime Orc became mature enough to fill that gap. The newly added greedy mode is very similar to the execution model of MCJIT. It should work as a drop-in replacement for common JIT tasks.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D98931
Gabor Marton [Fri, 26 Mar 2021 16:26:10 +0000 (17:26 +0100)]
[ASTImporter] Import member specialization/instantiation of enum decls
We import the member enum specialization in the same way as we do
the member CXXRecordDecl specialization.
Differential Revision: https://reviews.llvm.org/D99421
Krasimir Georgiev [Tue, 30 Mar 2021 09:44:05 +0000 (11:44 +0200)]
Revert "[loop-idiom] Hoist loop memcpys to loop preheader"
This reverts commit 92ddd3c1b6cd8f01f39dfd716cf3e976de126e66.
Causes multistage clang crashes, e.g.:
https://lab.llvm.org/buildbot/#/builders/36/builds/6678
Pavel Labath [Tue, 30 Mar 2021 09:42:19 +0000 (11:42 +0200)]
[lldb] Change CreateHostNativeRegisterContextLinux argument type
to NativeThreadLinux. This avoids casts down the line.
Joe Ellis [Mon, 29 Mar 2021 10:11:42 +0000 (10:11 +0000)]
[AArch64][SVE] Lower fixed length INSERT_VECTOR_ELT
Differential Revision: https://reviews.llvm.org/D98496
Joe Ellis [Tue, 23 Mar 2021 14:53:19 +0000 (14:53 +0000)]
[AArch64][SVE] Lower fixed length EXTRACT_VECTOR_ELT
Differential Revision: https://reviews.llvm.org/D98625
Kadir Cetinkaya [Fri, 12 Mar 2021 18:49:40 +0000 (19:49 +0100)]
[clangd] Perform merging for stale symbols in MergeIndex
Clangd drops symbols from the static index whenever the dynamic index is
authoritative for the file. This results in regressions when the static and
dynamic indexes contain different sets of information, e.g.
IncludeHeaders.
After this patch, we'll choose to merge symbols from the static index with
the dynamic one rather than just dropping them. This implies correctness
problems when the definition/documentation of the symbol is deleted, but it
seems worth having in more cases.
We still drop symbols if the dynamic index owns the file and didn't report
the symbol, which means the symbol was deleted.
Differential Revision: https://reviews.llvm.org/D98538
Raphael Isemann [Tue, 30 Mar 2021 09:08:08 +0000 (11:08 +0200)]
[lldb] Add a test for Obj-C properties with conflicting names
This is apparently allowed in Objective-C so we should test this in LLDB.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D99513
Raphael Isemann [Tue, 30 Mar 2021 09:07:04 +0000 (11:07 +0200)]
[ObjC][CodeGen] Fix missing debug info in situations where an instance and class property have the same identifier
Since the introduction of class properties in Objective-C it is possible to declare a class and an instance
property with the same identifier in an interface/protocol.
Right now Clang just generates debug information for whatever property comes first in the source file.
The second property is ignored as it's filtered out by the set of already emitted properties (which is just
using the identifier of the property to check for equivalence). I don't think generating debug info in this case
was ever supported, as the identifier filter has been in place since 7123bca7fb6e1dde51be8329cfb523d2bb9ffadf
(which precedes the introduction of class properties).
This patch expands the filter to take into account the identifier plus whether the property is class/instance. This
ensures that both properties are emitted in this special situation.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D99512
Nuno Lopes [Tue, 30 Mar 2021 09:00:31 +0000 (10:00 +0100)]
[docs] remove references to checking out svn repos
Bing1 Yu [Tue, 30 Mar 2021 08:33:07 +0000 (16:33 +0800)]
Revert "[X86] Pass to transform tdpbsud&tdpbusd&tdpbuud intrinsics to scalar operation"
This reverts commit 275df61f043ccf86a9c17957379bff9434da1489.
Sander de Smalen [Tue, 30 Mar 2021 07:54:59 +0000 (08:54 +0100)]
[InstructionCost] Don't conflate Invalid costs with Unknown costs.
We previously made a change to getUserCost to return an Invalid cost
when one of the TTI costs returned '-1' (meaning 'unknown' or
'infinitely expensive'). It makes no sense to say that:
shufflevector <2 x i8> %x, <2 x i8> %y, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
has an invalid cost. Perhaps the cost is not known, but the IR is valid
and can be code-generated. Invalid should only be used for IR that
cannot possibly be code-generated and where a cost is nonsensical.
With more passes now asserting that the cost must be valid, it is possible
that those assertions will fail for perfectly valid IR. An incomplete
cost-model probably shouldn't be a reason for the compiler to break.
It's better to consider these costs as 'very expensive' and ignore them
for other reasons. At some point, we should consider replacing -1 with
some other mechanism.
Reviewed By: paulwalker-arm, dmgreen
Differential Revision: https://reviews.llvm.org/D99502
Bing1 Yu [Wed, 24 Mar 2021 08:53:12 +0000 (16:53 +0800)]
[X86] Pass to transform tdpbsud&tdpbusd&tdpbuud intrinsics to scalar operation
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D99244
Pavel Labath [Tue, 30 Mar 2021 07:57:32 +0000 (09:57 +0200)]
Revert "[lldb/DWARF] Simplify DIE extraction code slightly"
This reverts commit 1b96e133cf5215cb9ebfe7f14630f479c1611f22 due to
failures on windows.
Tim Renouf [Tue, 30 Mar 2021 07:33:07 +0000 (08:33 +0100)]
[AMDGPU] Update AMDGPU PAL usage documentation
Change-Id: I65f3edcfe5063551cad5aab0da1374c3a6ccd3a2
Stefan Gränitz [Tue, 30 Mar 2021 07:24:46 +0000 (09:24 +0200)]
[lli] Add option -lljit-platform=Inactive to disable platform support explicitly
This option tells LLJIT to disable platform support explicitly: JITDylibs aren't scanned for special init/deinit symbols and no runtime API interposes are injected.
It's useful in two cases: for platforms that don't have such requirements and platforms for which we have no explicit support yet and that don't work well with the generic IR platform.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D99416
Vitaly Buka [Tue, 30 Mar 2021 07:21:20 +0000 (00:21 -0700)]
[NFC][scudo] Produce debug info
Markus Böck [Tue, 30 Mar 2021 06:52:28 +0000 (08:52 +0200)]
[llvm-profdata] Make sure to consume Error on the error path of setIsIRLevelProfile
Encountered a crash while running a debug build, where this code path would be taken due to a mismatch in profile coverage data versions. Without consuming the error, an assert would be triggered inside the destructor of Error.
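The "must consume the error" rule described above can be sketched with a toy class. This is only an illustration and not llvm::Error itself: ToyError, consume() and droppedCount() are hypothetical names, and the destructor counts unhandled errors instead of asserting so that the effect can be observed without aborting.

```cpp
#include <string>
#include <utility>

// Toy stand-in for a "must be handled" error type. This is NOT llvm::Error:
// instead of asserting, the destructor counts errors that were dropped
// without being handled.
class ToyError {
public:
  explicit ToyError(std::string Msg) : Msg(std::move(Msg)) {}
  ToyError(const ToyError &) = delete;
  ToyError &operator=(const ToyError &) = delete;

  ~ToyError() {
    // A checked error type would assert here in a debug build.
    if (!Handled)
      ++droppedCount();
  }

  // Marks the error as handled and returns its message.
  std::string consume() {
    Handled = true;
    return Msg;
  }

  // Number of ToyError objects destroyed without being consumed.
  static int &droppedCount() {
    static int Count = 0;
    return Count;
  }

private:
  std::string Msg;
  bool Handled = false;
};
```

An error path that returns early without calling consume() increments the counter here; in the real code that is the point where the destructor assert fired.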
Differential Revision: https://reviews.llvm.org/D99457
Pavel Labath [Sat, 27 Mar 2021 20:21:30 +0000 (21:21 +0100)]
[lldb] Remove ScriptInterpreterLuaTest.Plugin unittest
This test is not useful as the functions it's testing are just returning
a constant. It also fails in unoptimized builds as it's comparing
character strings by address.
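The address-comparison pitfall mentioned above can be illustrated with a minimal sketch. The helper names are hypothetical and are not taken from the removed unittest; the point is only that `==` on `const char *` compares pointers, while a content comparison needs std::strcmp or std::string_view.

```cpp
#include <cstring>
#include <string_view>

// Hypothetical helpers, for illustration only.

// Compares the *addresses* of two C strings. Whether two identical string
// literals share an address depends on the compiler and optimization level,
// so this is not a reliable content check.
inline bool sameAddress(const char *A, const char *B) { return A == B; }

// Compares the *contents* of two C strings.
inline bool sameContents(const char *A, const char *B) {
  return std::strcmp(A, B) == 0;
}

// Equivalent content comparison using std::string_view.
inline bool sameContentsView(std::string_view A, std::string_view B) {
  return A == B;
}
```

Two distinct buffers holding the same text compare equal with the content helpers but not with sameAddress.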
Pavel Labath [Tue, 30 Mar 2021 06:46:36 +0000 (08:46 +0200)]
[lldb] Add a dwarf unit test for null unit dies
This is the test I mentioned in the previous commit (1b96e133), but
forgot to add.
Pavel Labath [Sat, 27 Mar 2021 20:00:59 +0000 (21:00 +0100)]
[lldb/DWARF] Simplify DIE extraction code slightly
Remove the "depth" variable, as the same information can be obtained
through die_index_stack.size().
Also add a test case for one tricky case I noticed -- a unit
containing only a null unit die.
Han Zhu [Tue, 9 Feb 2021 01:24:25 +0000 (17:24 -0800)]
[loop-idiom] Hoist loop memcpys to loop preheader
For a simple loop like:
```
struct S {
int x;
int y;
char b;
};
unsigned foo(S* __restrict__ a, S* b, int n) {
for (int i = 0; i < n; i++)
a[i] = b[i];
return sizeof(a[0]);
}
```
We could eliminate the loop and convert it to a large memcpy of 12*n bytes. Currently this is not handled. Output of `opt -loop-idiom -S < memcpy_before.ll`
```
%struct.S = type { i32, i32, i8 }
define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr {
entry:
%cmp7 = icmp sgt i32 %n, 0
br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup
for.body.preheader: ; preds = %entry
br label %for.body
for.cond.cleanup.loopexit: ; preds = %for.body
br label %for.cond.cleanup
for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
ret i32 12
for.body: ; preds = %for.body, %for.body.preheader
%i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ]
%idxprom = zext i32 %i.08 to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom
%arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom
%0 = bitcast %struct.S* %arrayidx2 to i8*
%1 = bitcast %struct.S* %arrayidx to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 4 dereferenceable(12) %0, i8* nonnull align 4 dereferenceable(12) %1, i64 12, i1 false)
%inc = add nuw nsw i32 %i.08, 1
%cmp = icmp slt i32 %inc, %n
br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
}
; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0
attributes #0 = { argmemonly nofree nosync nounwind willreturn }
```
The loop idiom pass currently only handles load and store instructions. Since struct S is too big to fit in a register, the loop body contains a memcpy intrinsic.
With this change, re-run `opt -loop-idiom -S < memcpy_before.ll`. The loop memcpy is promoted to loop preheader. For this trivial case, the loop is dead and will be removed by another pass.
```
%struct.S = type { i32, i32, i8 }
define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr {
entry:
%a1 = bitcast %struct.S* %a to i8*
%b2 = bitcast %struct.S* %b to i8*
%cmp7 = icmp sgt i32 %n, 0
br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup
for.body.preheader: ; preds = %entry
%0 = zext i32 %n to i64
%1 = mul nuw nsw i64 %0, 12
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %a1, i8* align 4 %b2, i64 %1, i1 false)
br label %for.body
for.cond.cleanup.loopexit: ; preds = %for.body
br label %for.cond.cleanup
for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
ret i32 12
for.body: ; preds = %for.body, %for.body.preheader
%i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ]
%idxprom = zext i32 %i.08 to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom
%arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom
%2 = bitcast %struct.S* %arrayidx2 to i8*
%3 = bitcast %struct.S* %arrayidx to i8*
%inc = add nuw nsw i32 %i.08, 1
%cmp = icmp slt i32 %inc, %n
br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
}
; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0
attributes #0 = { argmemonly nofree nosync nounwind willreturn }
```
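The source-level equivalence that the transformation relies on can be sketched as follows. This is an illustration under the assumption that the two arrays do not overlap (which `__restrict__` expresses in the original example); copyLoop and copyBulk are hypothetical names, and this is not the pass implementation.

```cpp
#include <cstddef>
#include <cstring>

struct S {
  int x;
  int y;
  char b;
};

// Element-by-element copy, as written in the original loop.
inline void copyLoop(S *a, const S *b, int n) {
  for (int i = 0; i < n; i++)
    a[i] = b[i];
}

// One bulk copy of n * sizeof(S) bytes. Valid only if a and b do not
// overlap, which is an assumption of this sketch.
inline void copyBulk(S *a, const S *b, int n) {
  if (n > 0)
    std::memcpy(a, b, static_cast<std::size_t>(n) * sizeof(S));
}
```

Both functions leave the destination array with the same field values, which is why the per-iteration memcpy can be replaced by a single one in the preheader.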
Reviewed By: zino
Differential Revision: https://reviews.llvm.org/D97667
Han Zhu [Tue, 30 Mar 2021 06:35:10 +0000 (23:35 -0700)]
Revert "[loop-idiom] Hoist loop memcpys to loop preheader"
This reverts commit deb5095833a834e0ef5f784138da53e66febff05.
Bad commit message.
Fangrui Song [Tue, 30 Mar 2021 06:31:14 +0000 (23:31 -0700)]
[DebugInfo][unittest] Fix heap-use-after-free after D76115
Han Zhu [Tue, 9 Feb 2021 01:24:25 +0000 (17:24 -0800)]
[loop-idiom] Hoist loop memcpys to loop preheader
Summary:
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
Blame Revision:
Differential Revision: https://phabricator.intern.facebook.com/D26380397
Johannes Doerfert [Mon, 29 Mar 2021 01:13:38 +0000 (20:13 -0500)]
[OpenMP][NFC] Move the `noinline` to the parallel entry point
The `noinline` for non-SPMD parallel functions is probably not necessary
but as long as we use it we should put it on the outermost parallel
function, which is the wrapper, not the actual outlined function.
Resolves PR49752
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D99506
Max Kazantsev [Tue, 30 Mar 2021 05:27:33 +0000 (12:27 +0700)]
[Test] Add a test demonstrating a missing opportunity to PRE a load
Fangrui Song [Tue, 30 Mar 2021 05:14:29 +0000 (22:14 -0700)]
[sanitizer] Improve accuracy of GetTls on x86/s390
The previous code may underestimate the static TLS surplus part, which may cause
false positives to LeakSanitizer if a dynamically loaded module uses the surplus
and there is an allocation only referenced by a thread's TLS.
Vitaly Buka [Tue, 30 Mar 2021 02:41:47 +0000 (19:41 -0700)]
[NFC][scudo] Sort sources in CMake file
Vitaly Buka [Tue, 30 Mar 2021 02:39:35 +0000 (19:39 -0700)]
[NFC][scudo] Add memtag.h into CMake file
Alok Kumar Sharma [Thu, 25 Mar 2021 11:04:57 +0000 (16:34 +0530)]
[DebugInfo] Upgrade DISubrange::count to accept DIExpression also
This is needed for Fortran assumed shape arrays whose dimensions are
defined as,
- 'count' is taken from array descriptor passed as parameter by
caller, access from descriptor is defined by type DIExpression.
- 'lowerBound' is defined by callee.
The current alternative uses upperBound in place of count, where
upperBound is calculated in the callee in a temp variable using
lowerBound and count.
Representation with count (DIExpression) is not only clearer than
upperBound (DIVariable); it also has the advantage that count, being
accessed through a parameter, has a better chance of surviving at
higher optimization levels than upperBound, which is a local
variable.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D99335
Rahman Lavaee [Fri, 26 Mar 2021 01:44:00 +0000 (18:44 -0700)]
[Propeller] Do not generate the BB address map for empty functions.
Empty functions (functions with no real code) are irrelevant for propeller optimizations and their addresses sometimes conflict with other functions which obfuscates the analysis.
This simple change skips the BB address map emission for such functions.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D99395
Stella Stamenova [Tue, 30 Mar 2021 03:06:31 +0000 (20:06 -0700)]
Revert "Add missing dependency to fix building the jit tests"
This breaks the windows bots because the dependency does not exist on Windows.
Per the cmake file:
if(CMAKE_HOST_UNIX)
add_subdirectory(LLJITWithRemoteDebugging)
endif()
This reverts commit bd56e91fdbc65053dd08cca1f2c9e15087c062eb.
Jun Ma [Mon, 29 Mar 2021 06:24:47 +0000 (14:24 +0800)]
[NFC][SVE] Remove redundant pattern
Jun Ma [Mon, 29 Mar 2021 06:19:13 +0000 (14:19 +0800)]
[AArch64][SVE] Codegen dup_lane for dup(vector_extract)
Differential Revision: https://reviews.llvm.org/D99324
Jun Ma [Fri, 26 Mar 2021 07:55:46 +0000 (15:55 +0800)]
[AArch64][SVEIntrinsicOpts] Optimize tbl+dup into dup+extractelement
Differential Revision: https://reviews.llvm.org/D99412
Amy Huang [Tue, 30 Mar 2021 02:10:09 +0000 (19:10 -0700)]
Revert "[COFF] Only consider associated EH sections during ICF"
This change causes an asan error for ODR violation.
This reverts commit 7ce9a3e9a91bb0c71cd3560079ff4c31d5dade1b.
Louis Dionne [Thu, 25 Mar 2021 18:14:20 +0000 (14:14 -0400)]
[libc++] Re-enable macOS back-deployment testing
Download older roots from Dropbox instead of Green Dragon, which is too
unreliable. Also XFAIL tests that were broken for back-deployment
configurations by D98097.
Differential Revision: https://reviews.llvm.org/D99359
Hsiangkai Wang [Mon, 29 Mar 2021 10:27:27 +0000 (18:27 +0800)]
[RISCV] Add inline asm constraint 'vr' and 'vm' in Clang for RISC-V 'V'.
Add asm constraint 'vr' for vector registers.
Add asm constraint 'vm' for vector mask registers.
Differential Revision: https://reviews.llvm.org/D98616
Evandro Menezes [Tue, 30 Mar 2021 01:02:05 +0000 (20:02 -0500)]
[RISCV] Move scheduling resources for B into a separate file (NFC)
Differential Revision: https://reviews.llvm.org/D99557
Adrian Prantl [Mon, 29 Mar 2021 22:02:25 +0000 (15:02 -0700)]
Add debug support for set types
This commit adds debugging support for set types defined in languages
such as Pascal and Modula-2.
Patch by Peter McKinna!
Differential Revision: https://reviews.llvm.org/D76115
Dave Lee [Mon, 29 Mar 2021 22:59:05 +0000 (15:59 -0700)]
[llvm][utils] Fix handling of llvm::None
David Blaikie [Mon, 29 Mar 2021 23:11:38 +0000 (16:11 -0700)]
Add missing dependency to fix building the jit tests
Thomas Lively [Tue, 30 Mar 2021 00:23:15 +0000 (17:23 -0700)]
[WebAssembly] Fix i8x16.popcnt opcode
When I updated the SIMD opcodes in f5764a8654e3, I accidentally missed
updating i8x16.popcnt. This patch fixes the omission.
Differential Revision: https://reviews.llvm.org/D99536
Jonas Devlieghere [Tue, 30 Mar 2021 00:14:35 +0000 (17:14 -0700)]
[dsymutil] s/dwarfdump/llvm-dwarfdump/ in test
Huihui Zhang [Mon, 29 Mar 2021 23:37:01 +0000 (16:37 -0700)]
[IPO][SampleContextTracker] Use SmallVector to track context profiles to prevent non-determinism.
Use SmallVector instead of SmallSet to track the context profiles mapped. Doing this
can help avoid non-determinism caused by iterating over unordered containers.
This bug was found with reverse iteration turned on,
--extra-llvm-cmake-variables="-DLLVM_REVERSE_ITERATION=ON".
The failing LLVM test is profile-context-tracker-debug.ll.
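The idea of trading a set for an insertion-ordered vector can be sketched with standard containers. This is an assumption-laden illustration: the patch itself uses llvm::SmallVector, and whether it deduplicates entries this way is not stated above; insertUnique is a hypothetical helper.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: append Name only if it is not already present, so the
// vector behaves like a small set whose iteration order is always the
// insertion order (unlike an unordered container).
inline void insertUnique(std::vector<std::string> &Names,
                         const std::string &Name) {
  if (std::find(Names.begin(), Names.end(), Name) == Names.end())
    Names.push_back(Name);
}
```

Iterating the vector then visits entries in the same order on every run, which is the property the commit is after.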
Reviewed By: MaskRay, wenlei
Differential Revision: https://reviews.llvm.org/D99547
Jessica Paquette [Mon, 29 Mar 2021 23:29:10 +0000 (16:29 -0700)]
[AArch64][GlobalISel] NFC: Replace IR regbankselect test with MIR test
regbank-ceil.ll -> regbank-ceil.mir
The IR test was intended to only check register banks. This makes it brittle,
especially as we improve load/store combines in GlobalISel.
Rewriting this as a MIR test also makes it more consistent with the rest of
the testcases in GlobalISel.
Jonas Devlieghere [Mon, 29 Mar 2021 19:35:17 +0000 (12:35 -0700)]
[dsymutil] Relocate DW_TAG_label
dsymutil is not relocating the DW_AT_low_pc for a DW_TAG_label. This
patch fixes that and adds a test.
Differential revision: https://reviews.llvm.org/D99534
Jonas Devlieghere [Mon, 29 Mar 2021 22:29:28 +0000 (15:29 -0700)]
[lldb] Prints error using WithColor::error in lldb-platform
Greg Clayton [Fri, 26 Mar 2021 07:48:49 +0000 (00:48 -0700)]
Fix .debug_aranges parsing issues.
When LLVM error handling was introduced to the parsing of the .debug_aranges it would cause major issues if any DWARFDebugArangeSet::extract() calls returned any errors. The code in DWARFDebugInfo::GetCompileUnitAranges() would end up calling DWARFDebugAranges::extract() which would return an error if _any_ DWARFDebugArangeSet had any errors, but it default constructed a DWARFDebugAranges object into DWARFDebugInfo::m_cu_aranges_up and populated it partially, and returned an error prior to finishing much needed functionality in the DWARFDebugInfo::GetCompileUnitAranges() function. Subsequent callers to this function would see that the DWARFDebugInfo::m_cu_aranges_up was actually valid and return this partially populated DWARFDebugAranges reference _and_ it would not be sorted or minimized.
The above bugs would cause incomplete .debug_aranges parsing: it would skip manually parsing any compile units for ranges, and would not sort the DWARFDebugAranges in m_cu_aranges_up.
This bug would also cause breakpoints set by file and line to fail to set correctly if a symbol context for an address could not be resolved properly: the incomplete and unsorted DWARFDebugAranges object that DWARFDebugInfo::GetCompileUnitAranges() returned would cause symbol context lookups resolved by address (breakpoint address) to fail to find any DWARF debug info for a given address.
This patch fixes all of the issues that I found:
- DWARFDebugInfo::GetCompileUnitAranges() no longer returns a "llvm::Expected<DWARFDebugAranges &>", but just returns a "const DWARFDebugAranges &". Why? Because this code contained a fallback that would parse all of the valid DWARFDebugArangeSet objects, and would check which compile units had valid .debug_aranges set entries, and manually build an address ranges table using DWARFUnit::BuildAddressRangeTable(). If we return an error because any DWARFDebugArangeSet has any errors, then we don't do any of this code. Now we parse all DWARFDebugArangeSet objects that have no errors, if any calls to DWARFDebugArangeSet::extract() return errors, we skip that DWARFDebugArangeSet so that we can use the fallback call to DWARFUnit::BuildAddressRangeTable(). Since DWARFDebugInfo::GetCompileUnitAranges() needs to parse what it can from the .debug_aranges and build address ranges tables for any compile units that don't have any .debug_aranges sets, everything now works as expected.
- Fix an issue where a DWARFDebugArangeSet contains multiple terminator entries. The LLVM parser and llvm-dwarfdump properly warn about this because it happens with linux compilers and linkers and was the original cause of the bug I am fixing here. We now correctly warn about this issue if "log enable dwarf info" is enabled, but we continue to parse the DWARFDebugArangeSet correctly so we don't lose data that is contained in the .debug_aranges section.
- DWARFDebugAranges::extract() no longer returns an llvm::Error because we need to be able to parse all of the valid DWARFDebugArangeSet objects. It will also correctly skip a DWARFDebugArangeSet object that has errors in the middle of the stream: the start offset of each DWARFDebugArangeSet is taken from the offset that the previous DWARFDebugArangeSet::extract() call calculated using the header, which contains the length of the DWARFDebugArangeSet. This means that if we do run into real errors while parsing individual DWARFDebugArangeSet objects, we can continue to parse the rest of the validly encoded DWARFDebugArangeSet objects in the .debug_aranges section. This will allow LLDB to parse DWARF that contains a possibly newer .debug_aranges set format than LLDB currently supports because we will error out for the parsing of the DWARFDebugArangeSet, but be able to skip to the next DWARFDebugArangeSet object using the "DWARFDebugArangeSet.m_header.length" field to calculate the next starting offset.
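The skip-by-header-length strategy in the last item can be sketched on a toy format. Everything here is an assumption made for illustration: the byte layout is not the DWARF encoding, and ToySet and parseAllSets are hypothetical names rather than LLDB code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy format (NOT the .debug_aranges encoding):
//   [length: 1 byte][version: 1 byte][length - 1 bytes of data]
// Only version 2 is "supported" by this toy parser.
struct ToySet {
  std::size_t Offset;             // where the set starts in the buffer
  std::vector<std::uint8_t> Data; // payload after the version byte
};

inline std::vector<ToySet> parseAllSets(const std::vector<std::uint8_t> &Buf) {
  std::vector<ToySet> Sets;
  std::size_t Offset = 0;
  while (Offset < Buf.size()) {
    const std::size_t Length = Buf[Offset];
    const std::size_t Next = Offset + 1 + Length;
    // A zero length, or a length that runs past the buffer, cannot be
    // skipped reliably, so stop here.
    if (Length == 0 || Next > Buf.size())
      break;
    const std::uint8_t Version = Buf[Offset + 1];
    if (Version == 2) {
      Sets.push_back(
          {Offset, std::vector<std::uint8_t>(
                       Buf.begin() + static_cast<std::ptrdiff_t>(Offset + 2),
                       Buf.begin() + static_cast<std::ptrdiff_t>(Next))});
    }
    // Valid or not, continue at the offset computed from the header length.
    Offset = Next;
  }
  return Sets;
}
```

A set with an unsupported version in the middle of the buffer is dropped, but the sets after it are still parsed because the next offset depends only on the header length.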
Tests were added to cover all new functionality.
Differential Revision: https://reviews.llvm.org/D99401
LLVM GN Syncbot [Mon, 29 Mar 2021 22:12:00 +0000 (22:12 +0000)]
[gn build] Port 5178ffc7cf92
Gulfem Savrun Yeniceri [Tue, 29 Dec 2020 21:32:13 +0000 (21:32 +0000)]
[Passes] Add relative lookup table converter pass
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244
This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.
Differential Revision: https://reviews.llvm.org/D94355
Florian Hahn [Mon, 29 Mar 2021 19:19:45 +0000 (20:19 +0100)]
[AArch64] Remove custom zext/sext legalization code.
Currently performExtendCombine assumes that the src-element bitwidth * 2
is a valid MVT. But this is not the case for i1 and it causes a crash on
the v64i1 test cases added in this patch.
It turns out that this code appears to not be needed; the same patterns are
handled by other code and we end up with the same results, even without the
custom lowering. I also added additional test cases in a50037aaa6d5df.
Let's just remove the unneeded code.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D99437
Jonas Devlieghere [Mon, 29 Mar 2021 21:19:03 +0000 (14:19 -0700)]
[lldb] Print stack trace when lldb-vscode crashes
Print LLVM's pretty stack trace when lldb-vscode crashes. Also removes
the unnecessary call to PrintStackTraceOnErrorSignal in lldb-server as
it's already part of InitLLVM.
Differential revision: https://reviews.llvm.org/D99535
Nikita Popov [Sun, 14 Mar 2021 15:47:41 +0000 (16:47 +0100)]
[X86][FastISel] Fix with.overflow eflags clobber (PR49587)
If the successor block has a phi node, then additional moves may
be inserted into predecessors, which may clobber eflags. Don't try
to fold the with.overflow result into the branch in that case.
This is done by explicitly checking for any phis in successor
blocks, not sure if there's some more principled way to address
this. Other fused compare and branch patterns avoid the issue by
emitting the comparison when handling the branch, so that no
instructions may be inserted in between. In this case, the
with.overflow call is emitted separately (and I don't think this
is avoidable, as it will generally have at least two users).
Fixes https://bugs.llvm.org/show_bug.cgi?id=49587.
Differential Revision: https://reviews.llvm.org/D98600
Fanbo Meng [Mon, 29 Mar 2021 20:47:57 +0000 (16:47 -0400)]
[NFC] clang-formatting zos-alignment.c
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D99514
Fangrui Song [Mon, 29 Mar 2021 20:35:10 +0000 (13:35 -0700)]
[lsan] realloc: don't deallocate if requested size is too large
This is the behavior required by the standards.
Differential Revision: https://reviews.llvm.org/D99480
Petr Hosek [Mon, 29 Mar 2021 20:07:39 +0000 (13:07 -0700)]
Revert "[CMake] Use write_basic_package_version_file for LLVM"
This reverts commit 3001d080c813da20b329303bf8f45451480e5905, which
seems to have introduced a race condition that's failing the build
in some cases.
MaheshRavishankar [Mon, 29 Mar 2021 19:33:26 +0000 (12:33 -0700)]
Fix broken build for commit 9b0517035faee275ce1feabb03d0c7606ea7f819
Differential Revision: https://reviews.llvm.org/D99533
Nico Weber [Mon, 29 Mar 2021 19:47:13 +0000 (15:47 -0400)]
fix comment typo to cycle bots
Stanislav Mekhanoshin [Mon, 29 Mar 2021 18:37:07 +0000 (11:37 -0700)]
[AMDGPU] Fix "Sequence" spelling. NFC.
Samuel [Mon, 29 Mar 2021 04:18:45 +0000 (21:18 -0700)]
[llvm-reduce] Remove dso_local when possible
Add a new delta pass to llvm-reduce that removes dso_local when possible
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D98673
Petr Hosek [Mon, 29 Mar 2021 17:49:55 +0000 (10:49 -0700)]
[libcxx] Use integer division
In Python 3, math.floor returns an int.
In Python 2, math.floor returns a float. This leads to a failure
because the result of math.floor is used as an array index. While
Python 2 is on its way out, it's still used in some places so use
an integer division instead.
Differential Revision: https://reviews.llvm.org/D99520
Joe Nash [Mon, 29 Mar 2021 18:43:19 +0000 (14:43 -0400)]
Revert "[AMDGPU] Mark additional VOP3 as commutable"
This reverts commit d35d8da7d6ac6c08578ec0569b072292631691e0.
Nico Weber [Mon, 29 Mar 2021 18:40:43 +0000 (14:40 -0400)]
fix comment typo to cycle bots
Fangrui Song [Mon, 29 Mar 2021 18:41:07 +0000 (11:41 -0700)]
[lsan][test] Add malloc(0) and realloc(p, 0) tests
MaheshRavishankar [Mon, 29 Mar 2021 17:57:23 +0000 (10:57 -0700)]
[mlir] Enhance InferShapedTypeOpInterface and move LinalgOps to use them.
A new `InterfaceMethod` is added to `InferShapedTypeOpInterface` that
allows an operation to return the `Value`s for each dim of its
results. It is intended for the case where the `Value` returned for
each dim is computed using the operands and operation attributes. This
interface method is for cases where the result dim of an operation can
be computed independently, and it avoids the need to aggregate all
dims of a result into a single shape value. This also implies that
this is not suitable for cases where the result type is unranked (for
which the existing interface methods are to be used).
Also added is a canonicalization pattern that uses this interface and
resolves the shapes of the output in terms of the shapes of the
inputs. Moving Linalg ops to use this interface, so that many
canonicalization patterns implemented for individual linalg ops to
achieve the same result can be removed in favor of the added
canonicalization pattern.
Differential Revision: https://reviews.llvm.org/D97887
Nico Weber [Mon, 29 Mar 2021 18:35:57 +0000 (14:35 -0400)]
fix comment typo to cycle bots
Stella Laurenzo [Mon, 29 Mar 2021 18:30:50 +0000 (18:30 +0000)]
NFC: Update MLIR python bindings docs to install deps via requirements.txt.
* Also adds some verbiage about upgrading `pip` itself, since this is a
common source of issues.
Differential Revision: https://reviews.llvm.org/D99522
Joe Nash [Tue, 23 Mar 2021 15:33:38 +0000 (11:33 -0400)]
[AMDGPU] Mark additional VOP3 as commutable
Note, only src0 and src1 will be commuted if the isCommutable flag
is set. This patch does not change that, it just makes it possible
to commute src0 and src1 of more instructions.
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.org/D99376
Change-Id: I61e20490962d95ea429beb355c55f55c024dafdc
Jez Ng [Mon, 29 Mar 2021 18:08:12 +0000 (14:08 -0400)]
[lld-macho] Implement -segprot
Addresses llvm.org/PR49405.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D99389
Florian Hahn [Mon, 29 Mar 2021 17:19:24 +0000 (18:19 +0100)]
[AArch64] Add a few more vector extension tests.
Raphael Isemann [Mon, 29 Mar 2021 17:47:17 +0000 (19:47 +0200)]
[lldb][NFC] Fix -Wdocumentation issue in ModuleSpec.h/ThreadTrace.h
Raphael Isemann [Mon, 29 Mar 2021 17:40:41 +0000 (19:40 +0200)]
[lldb][NFC] Fix -Wdocumentation issue in ProcessMinidump
Roger Ferrer Ibanez [Mon, 29 Mar 2021 13:30:48 +0000 (13:30 +0000)]
[PrologEpilogInserter][AMDGPU] Only adjust offset for emergency spill slots if the stack grows down
D89239 adjusts the stack offset of emergency spill slots for overaligned
stacks. However the adjustment is not valid for targets whose stack
grows up (such as AMDGPU).
This change applies the adjustment only to those targets whose
stack grows down.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49686
Differential Revision: https://reviews.llvm.org/D99504
Craig Topper [Mon, 29 Mar 2021 17:11:18 +0000 (10:11 -0700)]
[RISCV] When custom iseling masked loads/stores, copy the mask into V0 instead of virtual register.
This matches what we do in our isel patterns. In our internal
testing we've found this is needed to make the fast register
allocator happy at -O0. Otherwise it may assign V0 to an earlier
operand and find itself with no registers left when it reaches
the mask operand. By using V0 explicitly, the fast register allocator
will see it when it checks for phys register usages before it
starts allocating vregs. I'll try to update this with a test case.
Unfortunately, this does appear to prevent some instruction reordering
by the pre-RA scheduler which leads to the increased spills seen in
some tests. I suspect that problem could already occur for other
instructions that already used V0 directly.
There's a lot of repeated code here that could do with some
wrapper functions. Not sure if that should be at the level of the
new code that deals with V0. That would require multiple output
parameters to pass the glue, chain and register back. Maybe it
should be at a higher level over the entire set of push_backs.
Reviewed By: frasercrmck, HsiangKai
Differential Revision: https://reviews.llvm.org/D99367
Peter Steinfeld [Thu, 25 Mar 2021 15:04:19 +0000 (08:04 -0700)]
[flang] Fix CHECK() calls on erroneous procedure declarations
When writing tests for a previous problem, I ran across situations where the
compiler was failing calls to CHECK(). In these situations, the compiler had
inconsistent semantic information because the programs were erroneous. This
inconsistent information was causing the calls to CHECK() to fail.
I fixed this by avoiding the code that ended up making the failed calls to
CHECK() and making sure that we were only avoiding these situations when the
associated symbols were erroneous.
I also added tests that would cause the calls to CHECK() without these changes.
Differential Revision: https://reviews.llvm.org/D99342
Craig Topper [Mon, 29 Mar 2021 16:54:26 +0000 (09:54 -0700)]
[X86] Always use rip-relative addressing on 64-bit when rematerializing all zeros/ones registers using a folded load.
Previously we only used RIP-relative addressing when PIC was enabled. But
we know we're in small/kernel code model here so we should
be able to always use RIP-relative which will give a smaller
encoding.
Here's a godbolt link that demonstrates the current codegen https://godbolt.org/z/j3158o
Note in the non-PIC version the load from .LCPI0_0 doesn't use
RIP-relative addressing, but if you change the constant in the
source from 0.0 to 1.0 it will become RIP-relative.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D97208
Roger Ferrer Ibanez [Thu, 18 Mar 2021 10:26:33 +0000 (10:26 +0000)]
[RISCV] Fix offset computation for RVV
In D97111 we changed the RVV frame layout when using sp or bp to address
the stack slots so we could address the emergency stack slot. The idea
is to put the RVV objects as far as possible (in offset terms) from the
frame reference register (sp / fp / bp).
When using fp this happens naturally because the RVV objects are already
at the top of the stack and due to the constraints of RVV (VLENB being a
power of two >= 128) the stack remains aligned. The rest of this summary
does not apply to this case.
When using sp / bp we need to skip the non-RVV stack slots. The size of
the non-RVV objects is computed by subtracting the callee-saved
register size (whose computation is added in D97111 itself) from the total
size of the stack (which does not account for RVV stack slots). However,
when doing so we round to 16 bytes when computing that size and we end
up emitting a smaller offset that may belong to a scalar stack slot (see
D98801). So this change removes that rounding.
Also, because we want the RVV objects to be between the non-RVV stack slots
and the callee-saved register slots, we need to make sure the RVV
objects are properly aligned to 8 bytes. Adding a padding of 8 would
render the stack unaligned. So when allocating space for RVV (only when
we don't use fp) we need to have extra padding that preserves the stack
alignment. This way we can round to 8 bytes the offset that skips the
non-RVV objects and we do not misalign the whole stack along the way. In
some circumstances this means that the RVV objects may have padding
before (=lower offsets from sp/bp) and after (before the CSR stack
slots).
Differential Revision: https://reviews.llvm.org/D98802
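The padding argument in the summary above can be illustrated with a little arithmetic. This is only a sketch, not the code from the patch: the 16-byte stack alignment, the helper names `align_to`, `rvv_padding` and `rvv_start_offset`, and the example sizes are all assumptions made for illustration.

```python
STACK_ALIGN = 16  # assumed stack alignment in bytes (not stated in the summary)
RVV_ALIGN = 8     # alignment the RVV objects need, per the summary


def align_to(value, align):
    """Round value up to the next multiple of align (align must be > 0)."""
    return (value + align - 1) // align * align


def rvv_padding():
    """Hypothetical padding: the 8-byte need rounded up to the stack alignment."""
    return align_to(RVV_ALIGN, STACK_ALIGN)


def rvv_start_offset(non_rvv_size):
    """Offset from sp/bp that skips the non-RVV slots, rounded up to 8 bytes."""
    return align_to(non_rvv_size, RVV_ALIGN)


frame = 48  # hypothetical frame size, already a multiple of STACK_ALIGN
print((frame + RVV_ALIGN) % STACK_ALIGN)      # 8: a bare 8-byte pad misaligns the frame
print((frame + rvv_padding()) % STACK_ALIGN)  # 0: the rounded-up pad keeps it aligned
print(rvv_start_offset(20))                   # 24
```

Rounding the padding up to the stack alignment is what lets the skip offset be rounded to 8 bytes without disturbing the frame as a whole; whether the patch reserves exactly this amount is an assumption here.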
Roger Ferrer Ibanez [Thu, 18 Mar 2021 10:37:18 +0000 (10:37 +0000)]
[NFC][RISCV] Add test showing wrong stack slot for GPR and RVV spilled registers
This testcase shows that we attempt to assign the same offset sp + 16 to
two different stack objects.
The fix will come in a later change.
Differential Revision: https://reviews.llvm.org/D98801
Roger Ferrer Ibanez [Wed, 17 Mar 2021 18:24:58 +0000 (18:24 +0000)]
[NFC][RISCV] Pass file through update_llc_tests to fix whitespace issues
While addressing RVV frame layout issues I found this file had
whitespace differences that made diffs noisier than they should be.
Differential Revision: https://reviews.llvm.org/D98800
Wenlei He [Fri, 5 Mar 2021 15:50:36 +0000 (07:50 -0800)]
[CSSPGO][llvm-profgen] Context-sensitive global pre-inliner
This change sets up a framework in llvm-profgen to estimate inline decisions and adjust the context-sensitive profile based on them. We call it a global pre-inliner in llvm-profgen.
It will serve two purposes:
1) Since context profile for not inlined context will be merged into base profile, if we estimate a context will not be inlined, we can merge the context profile in the output to save profile size.
2) For thinLTO, when a context involving functions from different modules is not inlined, we can't merge function profiles across modules, leading to suboptimal post-inline count quality. By estimating some inline decisions, we would be able to adjust/merge context profiles beforehand as a mitigation.
Compiler inline heuristic uses inline cost which is not available in llvm-profgen. But since inline cost is closely related to size, we could get an estimate through function size from debug info. Because the size we have in llvm-profgen is the final size, it could also be more accurate than the inline cost estimation in the compiler.
This change only has the framework, with a few TODOs left for follow up patches for a complete implementation:
1) We need to retrieve size for function/inlinee from debug info for inlining estimation. Currently we use number of samples in a profile as placeholder for size estimation.
2) Currently the thresholds are using the values used by sample loader inliner. But they need to be tuned since the size here is fully optimized machine code size, instead of inline cost based on not yet fully optimized IR.
Differential Revision: https://reviews.llvm.org/D99146
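The merging idea described above can be sketched roughly as follows. This is an illustrative sketch only, not llvm-profgen's implementation: the function `pre_inline`, the `SIZE_THRESHOLD` value, and the tuple-based context representation are hypothetical, and the size estimate is simply supplied as an input.

```python
from collections import defaultdict

SIZE_THRESHOLD = 100  # hypothetical limit; the commit notes real thresholds still need tuning


def pre_inline(context_profiles, size_threshold=SIZE_THRESHOLD):
    """Split context profiles into kept contexts and merged base profiles.

    context_profiles maps a context tuple such as ("main", "foo") to
    (sample_count, estimated_size); the last element is the callee.
    """
    kept = {}
    base = defaultdict(int)
    for context, (samples, size) in context_profiles.items():
        callee = context[-1]
        if len(context) > 1 and size <= size_threshold:
            # Estimated to be inlined: keep the context-sensitive profile.
            kept[context] = samples
        else:
            # Estimated not to be inlined: merge into the callee's base profile.
            base[callee] += samples
    return kept, dict(base)


profiles = {
    ("main", "foo"): (500, 40),
    ("bar", "foo"): (20, 400),
    ("foo",): (10, 400),
}
kept, base = pre_inline(profiles)
print(kept)  # {('main', 'foo'): 500}
print(base)  # {'foo': 30}
```

Contexts estimated as not inlined collapse into a single base entry per function, which is where the profile size saving mentioned in purpose 1) would come from.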
Florian Hahn [Mon, 29 Mar 2021 16:37:48 +0000 (17:37 +0100)]
[Clang] Fix line numbers in CHECK lines.
Wei Mi [Thu, 25 Mar 2021 23:59:10 +0000 (16:59 -0700)]
[SampleFDO] Do not scale the magic number NOMORE_ICP_MAGICNUM in value profile
during profile update.
When we inline a function and update the profile, the value profiles of the
indirect call in the inliner and inlinee will be scaled. In
https://reviews.llvm.org/D96806 and https://reviews.llvm.org/D97350, we start
using the magic number NOMORE_ICP_MAGICNUM (-1) to mark targets which have
been promoted. The magic number shouldn't be scaled during the profile update.
Although the problem has been suppressed by https://reviews.llvm.org/D98187
for SampleFDO, which stops profile update for inlining in sampleFDO, the patch
is still wanted since it will be more consistent to handle the magic number
properly in profile update.
Differential Revision: https://reviews.llvm.org/D99394
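A minimal sketch of the intended behaviour, under stated assumptions: counts are treated as unsigned 64-bit values so that -1 becomes the all-ones pattern, and the helper `scale_value_profile` with its (target, count) pairs is hypothetical rather than LLVM's actual API.

```python
NOMORE_ICP_MAGICNUM = (1 << 64) - 1  # -1 viewed as an unsigned 64-bit count (assumption)


def scale_value_profile(entries, numerator, denominator):
    """Scale (target, count) pairs by numerator/denominator.

    Entries whose count is the magic number mark already-promoted targets
    and are passed through unchanged.
    """
    scaled = []
    for target, count in entries:
        if count != NOMORE_ICP_MAGICNUM:
            count = count * numerator // denominator
        scaled.append((target, count))
    return scaled


profile = [("callee_a", 100), ("callee_b", NOMORE_ICP_MAGICNUM)]
print(scale_value_profile(profile, 1, 4))
```

Scaling the marker like an ordinary count would turn it into a normal-looking value, and the record that the target had been promoted would be lost.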
Florian Hahn [Mon, 29 Mar 2021 16:27:01 +0000 (17:27 +0100)]
[Clang] Only run test when X86 backend is built.
After c773d0f97304 the remark is only emitted if the loop is profitable
to vectorize, but cannot be vectorized. Hence, it depends on
X86-specific cost-modeling.
Jonas Devlieghere [Mon, 29 Mar 2021 16:14:06 +0000 (09:14 -0700)]
[lldb] Move UpdateISAToDescriptorMap into ClassInfoExtractor (NFC)
Move UpdateISAToDescriptorMap into ClassInfoExtractor so that all the
formerly public functions can be private and remain an implementation
detail of the extractor.
Differential revision: https://reviews.llvm.org/D99448
Joseph Huber [Mon, 29 Mar 2021 15:00:39 +0000 (11:00 -0400)]
[OpenMP] Trim error messages in CUDA plugin
Summary:
Remove some of the error messages printed when the CUDA plugin fails. The current error messages can be confusing because they are the first error messages printed after the async stream finds an error. This means that the printed values aren't related to what caused the issue, but are simply the last asynchronous operation that succeeded on the device. Remove these as they can be misleading.
Reviewers: jdoerfert
Differential Revision: https://reviews.llvm.org/D99510
MaheshRavishankar [Mon, 29 Mar 2021 16:18:43 +0000 (09:18 -0700)]
[mlir][Linalg] Rewrite SubTensors that take a slice out of a unit-extent dimension.
Subtensor operations that are taking a slice out of a tensor that is
unit-extent along a dimension can be rewritten to drop that dimension.
Differential Revision: https://reviews.llvm.org/D99226
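The effect of the rewrite on shapes can be sketched as below. This models only the shape bookkeeping, not the MLIR pattern itself, which operates on IR; the function name `drop_unit_extent_dims` and the list-based shape representation are hypothetical.

```python
def drop_unit_extent_dims(source_shape, offsets, sizes):
    """Return (shape, offsets, sizes) with every unit-extent source dimension removed."""
    # A slice along a dimension of extent 1 can only select that one element,
    # so the dimension carries no information and can be dropped.
    keep = [i for i, extent in enumerate(source_shape) if extent != 1]
    return (
        [source_shape[i] for i in keep],
        [offsets[i] for i in keep],
        [sizes[i] for i in keep],
    )


shape, offsets, sizes = drop_unit_extent_dims([1, 8, 1, 4], [0, 2, 0, 0], [1, 4, 1, 2])
print(shape, offsets, sizes)  # [8, 4] [2, 0] [4, 2]
```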
Asher Mancinelli [Mon, 29 Mar 2021 15:56:43 +0000 (16:56 +0100)]
[flang] Update output format test to use GTest
Better document each test in the output formatting tests. Use GTest primitives and infrastructure in the same
spirit as [[ https://reviews.llvm.org/D97403 | D97403 ]]. [[ https://github.com/flang-compiler/f18/issues/995#issuecomment-790737912 | See legacy github issue linked here ]]
for additional context. Reorganize long test cases to be more readable.
Reviewed By: awarzynski, klausler
Differential Revision: https://reviews.llvm.org/D98303
MaheshRavishankar [Mon, 29 Mar 2021 16:16:06 +0000 (09:16 -0700)]
[mlir][Linalg] Drop spurious error message
Drop usage of `emitRemark` and use `notifyMatchFailure` instead to
avoid unnecessary spew during compilation.
Differential Revision: https://reviews.llvm.org/D99485
Christopher Di Bella [Thu, 18 Mar 2021 17:21:35 +0000 (17:21 +0000)]
[libcxx] adds std::identity to <functional>
Implements parts of:
- P0898R3 Standard Library Concepts
Differential Revision: https://reviews.llvm.org/D98151
Fanbo Meng [Mon, 29 Mar 2021 16:06:12 +0000 (12:06 -0400)]
[SystemZ][z/OS] Add test of leading zero length bitfield in const/volatile struct
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D99508
Jonas Devlieghere [Mon, 29 Mar 2021 15:55:58 +0000 (08:55 -0700)]
[lldb] Include llvm-config.h instead of config.h
This distinction doesn't matter for an in-tree build, but when building
against an installed llvm, only the former is present.
This should fix the LLDB Standalone bot:
http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake-standalone/
thomasraoux [Wed, 24 Mar 2021 16:53:53 +0000 (09:53 -0700)]
[mlir][vector] Add lowering of Transfer_read with broadcast and permutation map
Convert transfer_read ops with permutation maps into simpler
transfer_read with minor identity map + vector.broadcast and vector.transpose.
And transfer_read with leading dimensions broadcast into transfer_read of
lower rank.
Differential Revision: https://reviews.llvm.org/D99019
Christopher Di Bella [Fri, 26 Mar 2021 03:26:22 +0000 (03:26 +0000)]
[libcxx] reworks invocable and regular_invocable tests
The tests for `std::invocable` and `std::regular_invocable` were
woefully incomplete. This patch closes many of the gaps (though some
probably remain).
Differential Revision: https://reviews.llvm.org/D99398
Florian Hahn [Mon, 29 Mar 2021 14:16:03 +0000 (15:16 +0100)]
Recommit "[LV] Move runtime pointer size check to LVP::plan()."
Re-apply 25fbe803d4db, with a small update to emit the right remark
class.
Original message:
[LV] Move runtime pointer size check to LVP::plan().
This removes the need for the remaining doesNotMeet check and instead
directly checks if there are too many runtime checks for vectorization
in the planner.
A subsequent patch will adjust the logic used to decide whether to
vectorize with runtime checks to consider their cost more accurately.
Reviewed By: lebedev.ri
Bradley Smith [Thu, 18 Mar 2021 15:52:48 +0000 (15:52 +0000)]
[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps
This is currently performed in SelectionDAGLegalize; here we make it also
happen in LegalizeVectorOps, allowing a target to lower the SETCC condition
codes first in LegalizeVectorOps and then lower to a custom node afterwards,
without having to duplicate all of the SETCC condition legalization in the
target specific lowering.
As a result of this, fixed length floating point SETCC nodes can now be
properly lowered for SVE.
Differential Revision: https://reviews.llvm.org/D98939