Florian Hahn [Tue, 7 Jul 2020 22:15:01 +0000 (23:15 +0100)]
Revert "[SLP] Make sure instructions are ordered when computing spill cost."
This seems to break http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/24371
This reverts commit eb46137daa92723b75d828f2db959f2061612622.
Fangrui Song [Tue, 7 Jul 2020 22:04:48 +0000 (15:04 -0700)]
[RuntimeDyld][test] Fix ExecutionEngine/RuntimeDyld/X86/ELF_x86-64_none.yaml after D60122
*.yaml tests don't currently run, so we failed to notice it.
Davide Italiano [Tue, 7 Jul 2020 22:03:08 +0000 (15:03 -0700)]
[dotest] Log a warning when --server and --out-of-tree-debugserver are set
Suggested by Vedant.
Davide Italiano [Tue, 7 Jul 2020 22:00:08 +0000 (15:00 -0700)]
Do not set LLDB_DEBUGSERVER_PATH if --out-of-tree-debugserver is passed.
This gets rid of some surprising interplay between the flags.
Mainly needed because of Rosetta debugserver & Apple Silicon.
Differential Revision: https://reviews.llvm.org/D82804
Fangrui Song [Tue, 7 Jul 2020 22:00:29 +0000 (15:00 -0700)]
[llvm-readobj][test] Fix ELF/verneed-flags.yaml
*.yaml tests don't currently run, so we failed to update it.
Christopher Tetreault [Tue, 7 Jul 2020 21:44:40 +0000 (14:44 -0700)]
[SVE] Remove calls to VectorType::getNumElements from AsmParserTest
Reviewers: efriedma, c-rhodes, david-arm, kmclaughlin, fpetrogalli, sdesmalen
Reviewed By: efriedma
Subscribers: tschuett, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83339
Med Ismail Bennani [Tue, 7 Jul 2020 20:08:20 +0000 (22:08 +0200)]
[lldb/api] Add checks for StackFrame::GetRegisterContext calls (NFC)
This patch fixes a crash that is happening because of a null pointer
dereference in SBFrame.
StackFrame::GetRegisterContext says explicitly that you might not get
a valid RegisterContext back but the pointer wasn't tested before,
resulting in crashes. This should solve the issue.
rdar://54462095
Differential Revision: https://reviews.llvm.org/D83343
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
cgyurgyik [Sun, 28 Jun 2020 19:22:41 +0000 (14:22 -0500)]
[libc] Add memchr implementation.
Eric Astor [Tue, 7 Jul 2020 21:01:10 +0000 (17:01 -0400)]
[ms] [llvm-ml] Add initial MASM STRUCT/UNION support
Summary:
Add support for user-defined types to MasmParser, including initialization and field access.
Known issues:
- Omitted entry initializers (e.g., <,0>) do not work consistently for nested structs/arrays.
- Size checking/inference for values with known types is not yet implemented.
- Some ml64.exe syntaxes for accessing STRUCT fields are not recognized:
  - `[<register>.<struct name>].<field>`
  - `[<register>[<struct name>.<field>]]`
  - `(<struct name> PTR [<register>]).<field>`
  - `[<variable>.<struct name>].<field>`
  - `(<struct name> PTR <variable>).<field>`
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D75306
Nicolas Vasilache [Tue, 7 Jul 2020 20:44:24 +0000 (16:44 -0400)]
[mlir][Vector] Add ExtractOp folding
This revision adds foldings for ExtractOp operations that come from previous InsertOp.
InsertOps have cumulative semantics, where multiple chained inserts are necessary to produce the final value from which the extracts are obtained.
Additionally, TransposeOps may be interleaved and need to be tracked in order to follow the producer-consumer relationships and properly compute positions.
Differential revision: https://reviews.llvm.org/D83150
Christopher Tetreault [Tue, 7 Jul 2020 20:16:00 +0000 (13:16 -0700)]
[SVE] Make Constant::getSplatValue work for scalable vector splats
Summary:
Make Constant::getSplatValue recognize scalable vector splats of the
form created by ConstantVector::getSplat. Add unit test to verify that
C == ConstantVector::getSplat(C)->getSplatValue() for fixed width and
scalable vector splats
Reviewers: efriedma, spatel, fpetrogalli, c-rhodes
Reviewed By: efriedma
Subscribers: sdesmalen, tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82416
Matt Arsenault [Tue, 7 Jul 2020 19:21:13 +0000 (15:21 -0400)]
GlobalISel: Handle EVT argument lowering correctly
handleAssignments was assuming every argument type is an MVT, and
assignArg would always fail. This fixes one of the hacks in the
current AMDGPU calling convention code that pre-processes the
arguments.
Matt Arsenault [Tue, 7 Jul 2020 17:57:09 +0000 (13:57 -0400)]
AMDGPU/GlobalISel: Fix skipping unused kernel arguments
The tests in a5b9ad7e9aca1329ba310e638dafa58c47468a58 actually failed
the verifier, which for some reason is not the default. Also add tests
for 0-sized function arguments, which do not add entries to the
expected register lists.
Philip Reames [Tue, 7 Jul 2020 19:20:34 +0000 (12:20 -0700)]
[Statepoint] Factor out logic for non-stack non-vreg lowering [almost NFC]
This is inspired by D81648. The basic idea is to have the set of SDValues which are lowered as either constants or direct frame references explicit in one place, and to separate them clearly from the spilling logic.
This is not NFC in that the handling of constants larger than 64 bits has changed. The old lowering would crash on values which could not be encoded as a sign-extended 64-bit value. The new lowering just spills all constants larger than 64 bits. We could be consistent about doing the sext(Con64) optimization, but I happen to know that this code path is utterly unexercised in practice, so simple is better for now.
Zola Bridges [Wed, 13 May 2020 18:25:08 +0000 (11:25 -0700)]
[x86][seses] Add clang flag; Use lvi-cfi with seses
This patch creates a clang flag to enable SESES. This flag also ensures that
lvi-cfi is on when using seses via clang.
SESES should use lvi-cfi to mitigate returns and indirect branches.
The flag to enable the SESES functionality only without lvi-cfi is now
-x86-seses-enable-without-lvi-cfi to warn users part of the mitigation is not
enabled if they use this flag. This is useful in case folks want to see the
cost of SESES separate from the LVI-CFI.
Reviewed By: sconstab
Differential Revision: https://reviews.llvm.org/D79910
Muhammad Omair Javaid [Tue, 7 Jul 2020 19:27:10 +0000 (00:27 +0500)]
Minor fixups to LLDB AArch64 register infos macros for SVE register infos
Summary:
This patch adds some cosmetic changes to LLDB AArch64 register infos macros in order to use them in SVE register infos struct in follow up patches.
This patch initially added invalidate lists to the register infos struct, but that is no longer needed; the problem disappeared after updating the qemu testing environment.
Old headline comments for reference:
AArch64 X and V registers are the primary GPR and vector registers respectively. If these registers are modified, their corresponding child w regs or s/d regs should be invalidated. In particular, when a register write fails it is important that the failure is reflected in all the registers which draw their value from a particular value register.
Reviewers: labath, rengolin
Reviewed By: labath
Subscribers: tschuett, kristof.beyls, danielkiss, lldb-commits
Differential Revision: https://reviews.llvm.org/D77045
Arthur Eubanks [Thu, 2 Jul 2020 18:17:21 +0000 (11:17 -0700)]
[Inliner] Don't skip inlining alwaysinline in optnone functions
Previously the NPM inliner would skip all potential inlines in an
optnone function, but alwaysinline callees should be inlined regardless
of optnone.
Fixes inline-optnone.ll under NPM.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D83021
aartbik [Tue, 7 Jul 2020 19:34:38 +0000 (12:34 -0700)]
[mlir] [VectorOps] [integration-test] Add i64 typed outer product
Yields proper SIMD vpmullq/vpaddq on x86.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D83328
Zachary Selk [Tue, 7 Jul 2020 19:33:31 +0000 (12:33 -0700)]
[flang] Added missing runtime I/O definitions
Added runtime function definitions for 32-bit real I/O and 32-bit complex output
Differential Revision: https://reviews.llvm.org/D83112
Katherine Rasmussen [Tue, 7 Jul 2020 19:31:06 +0000 (12:31 -0700)]
[flang] Make 'num_images()' intrinsic
I added 'num_images()' to the list of functions that are evaluated as intrinsic. I also added a test file in flang/test/Semantics to test calls to 'num_images()'. A call to 'num_images()' in flang/test/Semantics/call10.f90 used to expect an error but no longer produces one, so I edited that file accordingly. I also edited the intrinsics unit test to add further testing of 'num_images()'.
Differential Revision: https://reviews.llvm.org/D83142
Nikita Popov [Sun, 5 Jul 2020 19:31:06 +0000 (21:31 +0200)]
[SCCP] Use range metadata for loads and calls
When all else fails, use range metadata to constrain the result
of loads and calls. It should also be possible to use !nonnull,
but that would require some general support for inequalities in
SCCP first.
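As a rough illustration of how a known range can decide a comparison, here is
a minimal Python sketch. It is not SCCP's implementation: `fold_ult` is a
hypothetical helper, and it assumes an unsigned, non-wrapping half-open range
[lo, hi) of the kind a !range annotation describes.

```python
def fold_ult(lo, hi, c):
    """Fold the unsigned comparison `v < c` for a value v known to be in [lo, hi).

    Returns True or False when the range decides the comparison, and None
    when the range is not precise enough. Assumes lo < hi (no wrap-around).
    """
    if hi <= c:
        # The largest possible value is hi - 1, which is below c.
        return True
    if lo >= c:
        # The smallest possible value already reaches c.
        return False
    return None


# A load annotated with the range [0, 10) always compares below 10.
print(fold_ult(0, 10, 10))
# A range that straddles the constant cannot be folded.
print(fold_ult(0, 20, 10))
```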
Differential Revision: https://reviews.llvm.org/D83179
Michał Górny [Sat, 4 Jul 2020 17:23:48 +0000 (19:23 +0200)]
[llvm] [docs] Do not require recommonmark for manpage build
Do not enforce recommonmark dependency if sphinx is called to build
manpages. In order to do this, try to import recommonmark first
and do not configure it if it's not available. Additionally, declare
a custom tag for the selected builder via CMake, and ignore
recommonmark import failure when 'man' target is used.
This will permit us to avoid the problematic recommonmark dependency
for the majority of Gentoo users that do not need to locally build
the complete documentation but want to have tool manpages.
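The import fallback described above might look roughly like the following
sketch. The function name `source_suffixes`, the `module_name` parameter and
the tag name `builder-man` are assumptions made for illustration; in a real
conf.py Sphinx supplies a `tags` object rather than a plain set.

```python
import importlib


def source_suffixes(tags, module_name="recommonmark"):
    """Return the source suffixes to configure for this build.

    `tags` is a plain set of builder tags here (an assumption for the sketch).
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Only the manpage build may proceed without the Markdown parser.
        if "builder-man" not in tags:
            raise
        return [".rst"]
    # The parser is available, so Markdown sources can be configured too.
    return [".rst", ".md"]
```

With the module missing, a manpage build falls back to reStructuredText only,
while any other build still fails with the original ImportError.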
Differential Revision: https://reviews.llvm.org/D83161
Stanislav Mekhanoshin [Tue, 30 Jun 2020 21:51:07 +0000 (14:51 -0700)]
LIS: fix handleMove to properly extend main range
handleMoveDown or handleMoveUp cannot properly repair the main
range of a LiveInterval since they only get a LiveRange. There
is a problem if a use has moved a few segments away and
there is a hole in the main range between these two
locations. We may get a SubRange with a very extended Segment
spanning several Segments of the main range and also spanning
that hole. If that happens, we end up with the main range
not covering its SubRange, which is an error.
It might be possible to fix the main range in place
just between the old and new index by extending all of its
Segments in between, but it is unclear whether this logic would be
faster than a straight constructMainRangeFromSubranges,
which itself is pretty cheap since it only contains interval
logic. That would also require a shrinkToUses() call afterwards, which
is probably even more expensive.
In the test, the second move is from 64B to 92B for sub1.
The subrange is correctly fixed:
L000000000000000C [16r,32B:0)[32B,92r:1) 0@16r 1@32B-phi
But the main range has a hole in between 80d and 88r after
updateRange():
%1 [16r,32B:0)[32B,80r:4)[80r,80d:3)[88r,96r:1)[96r,160B:2)
Since source position is 64B this segment is not even considered
by the updateRange().
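The invariant that is violated here (the main range must cover every
SubRange) can be checked with a small Python sketch. This is a toy model, not
LiveIntervals code: `key` and `covers` are hypothetical helpers, slot indexes
are kept as strings such as "80d", and the slot letters are assumed to order
as B < e < r < d.

```python
SLOT_ORDER = {"B": 0, "e": 1, "r": 2, "d": 3}


def key(index):
    """Turn a slot index such as '80d' into a sortable (number, slot) pair."""
    return int(index[:-1]), SLOT_ORDER[index[-1]]


def covers(main, sub):
    """Return True if every half-open segment of `sub` lies inside `main`."""
    segments = sorted((key(a), key(b)) for a, b in main)
    for start, end in sub:
        pos, stop = key(start), key(end)
        for seg_start, seg_end in segments:
            # Advance through each main segment that contains the position.
            if seg_start <= pos < seg_end:
                pos = seg_end
        if pos < stop:
            # A hole in the main range was reached before the segment ended.
            return False
    return True


# Segments taken from the dump above (value numbers dropped).
sub1 = [("16r", "32B"), ("32B", "92r")]
broken_main = [("16r", "32B"), ("32B", "80r"), ("80r", "80d"),
               ("88r", "96r"), ("96r", "160B")]
print(covers(broken_main, sub1))  # the hole between 80d and 88r is detected
```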
Differential Revision: https://reviews.llvm.org/D82916
Vy Nguyen [Tue, 7 Jul 2020 18:39:16 +0000 (14:39 -0400)]
Clang crashed while checking for deletion of copy and move ctors
Crash:
@ 0x559d129463fc clang::CXXRecordDecl::defaultedCopyConstructorIsDeleted()
@ 0x559d1288d3e5 clang::Sema::checkIllFormedTrivialABIStruct()::$_7::operator()()
@ 0x559d12884c34 clang::Sema::checkIllFormedTrivialABIStruct()
@ 0x559d1288412e clang::Sema::CheckCompletedCXXClass()
@ 0x559d1288d843 clang::Sema::ActOnFinishCXXMemberSpecification()
@ 0x559d12020109 clang::Parser::ParseCXXMemberSpecification()
@ 0x559d1201e80c clang::Parser::ParseClassSpecifier()
@ 0x559d1204e807 clang::Parser::ParseDeclarationSpecifiers()
@ 0x559d120e9aa9 clang::Parser::ParseSingleDeclarationAfterTemplate()
@ 0x559d120e8f21 clang::Parser::ParseTemplateDeclarationOrSpecialization()
@ 0x559d120e8886 clang::Parser::ParseDeclarationStartingWithTemplate()
@ 0x559d1204a1d4 clang::Parser::ParseDeclaration()
@ 0x559d12004b1d clang::Parser::ParseExternalDeclaration()
@ 0x559d12017689 clang::Parser::ParseInnerNamespace()
@ 0x559d12017024 clang::Parser::ParseNamespace()
@ 0x559d1204a29b clang::Parser::ParseDeclaration()
@ 0x559d12004c74 clang::Parser::ParseExternalDeclaration()
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83263
Med Ismail Bennani [Tue, 7 Jul 2020 16:45:30 +0000 (18:45 +0200)]
[lldb/Core] Fix crash in ValueObject::CreateChildAtIndex
The patch fixes a crash in ValueObject::CreateChildAtIndex caused by a
null pointer dereference. This is a corner case that happens when
trying to dereference a variable with an incomplete type, and this same
variable doesn't have a synthetic value to get the child ValueObject.
If this happens, lldb will now return a null pointer that will result
in an error message.
rdar://65181171
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Nikita Popov [Mon, 6 Jul 2020 20:17:16 +0000 (22:17 +0200)]
[SCCP] Handle assume predicates
Take assume predicates into account when visiting ssa.copy. The
handling is the same as for branch predicates, with the difference
that we're always on the true edge.
Differential Revision: https://reviews.llvm.org/D83257
Simon Pilgrim [Tue, 7 Jul 2020 18:08:15 +0000 (19:08 +0100)]
[X86][AVX] Don't fold PEXTR(VBROADCAST_LOAD(X)) -> LOAD(X).
We were checking the VBROADCAST_LOAD element size against the extraction destination size instead of the extracted vector element size - PEXTRW/PEXTB have implicit zext'ing so have i32 destination sizes for v8i16/v16i8 vectors, resulting in us extracting from the wrong part of a load.
This patch bails from the fold if the vector element sizes don't match, and we now use the target constant extraction code later on like the pre-AVX2 targets, fixing the test case.
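The element-size mismatch can be reproduced with plain byte manipulation. The
Python sketch below is only a model of the situation, not the X86 lowering
code: `broadcast_i32`, `extract_i16` and `folded_load_i16` are hypothetical
helpers, and the memory is assumed to be little-endian.

```python
import struct


def broadcast_i32(mem4, lanes=4):
    """Model a broadcast load: repeat one 32-bit memory element per lane."""
    return mem4 * lanes


def extract_i16(vec_bytes, idx):
    """Extract 16-bit element `idx` from the vector viewed as 16-bit lanes."""
    return struct.unpack_from("<H", vec_bytes, 2 * idx)[0]


def folded_load_i16(mem4):
    """Model the incorrect fold: always load the low word of the source."""
    return struct.unpack_from("<H", mem4, 0)[0]


# A non-uniform pair of 16-bit constants stored as one 32-bit element.
mem = struct.pack("<HH", 0x1111, 0x2222)
vec = broadcast_i32(mem)

# Element 1 of the 16-bit view is the high word, so the fold is wrong here.
print(hex(extract_i16(vec, 1)), hex(folded_load_i16(mem)))
```

For even element indices the two agree, which is why the problem only shows
up for some extractions.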
Found by internal fuzzing tests.
Zola Bridges [Wed, 17 Jun 2020 18:12:52 +0000 (11:12 -0700)]
[x86][lvi][seses] Use SESES at O0 for LVI mitigation
Use SESES as the fallback at O0 where the optimized LVI pass isn't desired due
to its effect on build times at O0.
I updated the LVI tests since this changes the code gen for the tests touched in the parent revision.
This is a follow up to the comments I made here: https://reviews.llvm.org/D80964
Hopefully we can continue the discussion here.
Also updated SESES to handle LFENCE instructions properly instead of adding
redundant LFENCEs. In particular, 1) no longer add LFENCE if the current
instruction being processed is an LFENCE and 2) no longer add LFENCE if the
instruction right before the instruction being processed is an LFENCE
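The two rules can be sketched on a flat list of mnemonics. This is a toy
model in Python, not the SESES pass: `insert_fences` and the `needs_fence`
predicate are hypothetical, and which instructions actually receive a fence
is left to that predicate.

```python
def insert_fences(instrs, needs_fence):
    """Insert 'lfence' before selected instructions without duplicating fences.

    `needs_fence` is a caller-supplied predicate (an assumption of this
    sketch) that says whether an instruction should be preceded by a fence.
    """
    out = []
    for ins in instrs:
        if ins == "lfence":
            # Rule 1: the current instruction is already an LFENCE.
            out.append(ins)
            continue
        prev_is_fence = bool(out) and out[-1] == "lfence"
        # Rule 2: skip the fence if the preceding instruction is an LFENCE.
        if needs_fence(ins) and not prev_is_fence:
            out.append("lfence")
        out.append(ins)
    return out


result = insert_fences(["load", "lfence", "load", "add"],
                       lambda ins: ins == "load")
print(result)
```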
Reviewed By: sconstab
Differential Revision: https://reviews.llvm.org/D82037
Ulrich Weigand [Tue, 7 Jul 2020 17:52:38 +0000 (19:52 +0200)]
[SystemZ ABI] Allow class types in GetSingleElementType
The SystemZ ABI specifies that aggregate types with just a single
member of floating-point type shall be passed as if they were just
a scalar of that type. This applies to both struct and class types
(but not unions).
However, the current ABI support code in clang only checks this
case for struct types, which means that for class types, generated
code does not adhere to the platform ABI.
Fixed by accepting both struct and class types in the
SystemZABIInfo::GetSingleElementType routine.
Aaron Ballman [Tue, 7 Jul 2020 17:54:02 +0000 (13:54 -0400)]
Speculatively fix the sphinx build.
LLVM GN Syncbot [Tue, 7 Jul 2020 17:49:12 +0000 (17:49 +0000)]
[gn build] Port dfa0db79d0e
Thomas Lively [Tue, 7 Jul 2020 17:45:26 +0000 (10:45 -0700)]
[WebAssembly] Avoid scalarizing vector shifts in more cases
Since WebAssembly's vector shift instructions take a scalar shift
amount rather than a vector shift amount, we have to check in ISel
that the vector shift amount is a splat. Previously, we were checking
explicitly for splat BUILD_VECTOR nodes, but this change uses the
standard utilities for detecting splat values that can handle more
complex splat patterns. Since the C++ ISel lowering is now more
general than the ISel patterns, this change also simplifies shift
lowering by using the C++ lowering for all SIMD shifts rather than
mixing C++ and normal pattern-based lowering.
This change improves ISel for shifts to the point that the
simd-shift-unroll.ll regression test no longer tests the code path it
was originally meant to test. The bug corresponding to that regression
test is no longer reproducible with its original reported reproducer,
so rather than try to fix the regression test, this change just
removes it.
Differential Revision: https://reviews.llvm.org/D83278
Arthur Eubanks [Tue, 7 Jul 2020 17:43:40 +0000 (10:43 -0700)]
[BasicAA] Remove -basicaa alias
Follow up of https://reviews.llvm.org/D82607.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D83067
Alexander Belyaev [Tue, 7 Jul 2020 17:36:48 +0000 (19:36 +0200)]
[mlir] Support unranked types in func signature conversion in BufferPlacement.
Currently, only ranked tensor args and results can be converted to memref types.
Differential Revision: https://reviews.llvm.org/D83324
Arthur Eubanks [Tue, 7 Jul 2020 17:42:33 +0000 (10:42 -0700)]
[NewPM][LoopFusion] Rename loop-fuse -> loop-fusion
The legacy pass name is "loop-fusion".
Fixes most tests under Transforms/LoopFusion under NPM.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D83066
Sean Silva [Mon, 6 Jul 2020 23:49:52 +0000 (16:49 -0700)]
[mlir] Convert function signatures before converting globals
Summary: This allows global initializers to reference functions.
Differential Revision: https://reviews.llvm.org/D83266
Simon Pilgrim [Tue, 7 Jul 2020 17:29:58 +0000 (18:29 +0100)]
[X86][AVX] Add test case showing incorrect extraction from VBROADCAST_LOAD on AVX2 targets
On AVX2 we tend to lower BUILD_VECTOR of constants as broadcasts if we can, in this case a <2 x i16> non-uniform constant has been lowered as a <4 x i32> broadcast.
The test case shows that the extraction folding code has incorrectly extracted the wrong part (lower WORD) of the resulting i32 memory source.
Found by internal fuzzing tests.
Simon Pilgrim [Tue, 7 Jul 2020 17:19:58 +0000 (18:19 +0100)]
[X86][AVX] Add AVX2 tests to extractelement-load.ll
Ellis Hoag [Tue, 7 Jul 2020 17:30:36 +0000 (13:30 -0400)]
Warn pointer captured in async block
The block arguments in dispatch_async() and dispatch_after() are
guaranteed to escape. If those blocks capture any pointers with the
noescape attribute then it is an error.
Chris Lattner [Tue, 7 Jul 2020 17:28:14 +0000 (10:28 -0700)]
Expand the LLVM Developer Policy to include new sections on adding
a project to the LLVM Monorepo, and a second about the LLVM
Incubator projects.
Differential Revision: https://reviews.llvm.org/D83182
Erik Pilkington [Tue, 7 Jul 2020 15:13:47 +0000 (11:13 -0400)]
[SemaObjC] Fix a -Wobjc-signed-char-bool false-positive with binary conditional operator
We were previously bypassing the conditional expression special case for binary
conditional expressions.
rdar://64134411
Differential revision: https://reviews.llvm.org/D81751
Erik Pilkington [Thu, 25 Jun 2020 20:10:46 +0000 (16:10 -0400)]
[SemaObjC] Add a warning for @selector expressions that potentially refer to objc_direct methods
By default, only warn when the selector matches a direct method in the current
class. This commit also adds a more strict off-by-default warning when there
isn't a non-direct method in the current class.
rdar://64621668
Differential revision: https://reviews.llvm.org/D82611
Biplob Mishra [Tue, 7 Jul 2020 15:28:08 +0000 (10:28 -0500)]
[PowerPC] Implement Vector Replace Builtins in LLVM
Provide the LLVM intrinsics needed to implement vector replace element
builtins in altivec.h which will be added in a subsequent patch.
Differential Revision: https://reviews.llvm.org/D83308
Jennifer Yu [Tue, 7 Jul 2020 16:27:20 +0000 (09:27 -0700)]
Correctly generate invert xor value for Binary Atomics of int size > 64
When using __sync_nand_and_fetch with __int128, the wrong value for the
'invert' operand gets emitted to the xor when the int size is greater
than 64 bits.
This is because llvm::ConstantInt::get zero-extends its 64-bit argument
for types wider than 64 bits, so instead of the -1 that we require, we
end up getting 18446744073709551615.
This patch replaces the call to llvm::ConstantInt::get with the call
to llvm::Constant::getAllOnesValue which works for all integer types.
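The arithmetic behind the wrong constant can be checked with a few lines of
Python. This is only a numeric sketch, not the clang code: `zext` and
`all_ones` are hypothetical helpers that model zero-extension and an all-ones
value of a given bit width.

```python
def zext(value, from_bits):
    """Zero-extend: keep the low `from_bits` bits, higher bits become zero."""
    return value & ((1 << from_bits) - 1)


def all_ones(bits):
    """The all-ones value of the given width, i.e. -1 in two's complement."""
    return (1 << bits) - 1


# -1 passed as a 64-bit quantity and zero-extended to a 128-bit type.
wrong_mask = zext(-1, 64)
right_mask = all_ones(128)

x = 1 << 100  # a 128-bit value with a bit set above bit 63
print(wrong_mask)
print(hex(x ^ wrong_mask))  # only the low 64 bits are flipped
print(hex(x ^ right_mask))  # every bit is flipped
```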
Reviewers: jfp, erichkeane, rjmccall, hfinkel
Differential Revision: https://reviews.llvm.org/D82832
Dan Liew [Tue, 7 Jul 2020 17:12:18 +0000 (10:12 -0700)]
Revert "Temporarily disable the following failing tests on Darwin:"
This reverts commit f3a089506fdcc4a1d658697009572c93e00c4373.
888951aaca583bcce85b42ea6166416db8f96fe0 introduced a fix that
should make the disabled tests work again.
rdar://problem/62141412
Dan Liew [Fri, 26 Jun 2020 23:14:22 +0000 (16:14 -0700)]
Disable interception of sigaltstack on i386 macOS.
Summary:
28c91219c7e introduced an interceptor for `sigaltstack`. It turns out this
broke `setjmp` on i386 macOS. This is because the implementation of `setjmp` on
i386 macOS is written in assembly and makes the assumption that the call to
`sigaltstack` does not clobber any registers. Presumably that assumption was
made because it's a system call. In particular `setjmp` assumes that before
and after the call `%ecx` will contain a pointer to the `jmp_buf`. The
current interceptor breaks this assumption because it's written in C++ and
`%ecx` is not a callee-saved register. This could be fixed by writing a
trampoline interceptor to the existing interceptor in assembly that
ensures all the registers are preserved. However, this is a lot of work
for very little gain. Instead this patch just disables the interceptor
on i386 macOS.
For other Darwin architectures it currently appears to be safe to intercept
`sigaltstack` using the current implementation because:
* `setjmp` for x86_64 saves the `jmp_buf` pointer to the stack before calling `sigaltstack`.
* `setjmp` for armv7/arm64/arm64_32/arm64e appears to not call `sigaltstack` at all.
This patch should unbreak (once they are re-enabled) the following
tests:
```
AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.LongJmpTest
AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.SigLongJmpTest
AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.LongJmpTest
AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.SigLongJmpTest
AddressSanitizer-i386-darwin :: TestCases/longjmp.cpp
```
This patch introduces a `SANITIZER_I386` macro for convenience.
rdar://problem/62141412
Reviewers: kubamracek, yln, eugenis
Subscribers: kristof.beyls, #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D82691
Jonas Devlieghere [Tue, 7 Jul 2020 17:12:17 +0000 (10:12 -0700)]
[lldb] Fix unaligned load in DataExtractor
Somehow UBSan would only report the unaligned load in TestLinuxCore.py
when running the tests with reproducers. This patch fixes the issue by
using a memcpy in the GetDouble and the GetFloat method.
Differential revision: https://reviews.llvm.org/D83256
Hans Wennborg [Tue, 7 Jul 2020 12:51:34 +0000 (14:51 +0200)]
[GlobalOpt] Don't remove inalloca from musttail-called functions
Otherwise the verifier complains about the mismatching function ABIs.
Differential revision: https://reviews.llvm.org/D83300
Sanjay Patel [Tue, 7 Jul 2020 16:55:07 +0000 (12:55 -0400)]
[x86] fix miscompile in buildvector v16i8 lowering
In the test based on PR46586:
https://bugs.llvm.org/show_bug.cgi?id=46586
...we are inserting 16-bits into the high element of the vector, shuffling it
to element 0, and extracting 32-bits. But xmm1 was never initialized, so the
top 16-bits of the extract are undef without this patch.
(It seems like we could do better than this by recognizing that we only demand
a subsection of the build vector, but I want to make sure we fix the
miscompile 1st.)
This path is only used for pre-SSE4.1, and simpler patterns get squashed
somewhere along the way, so the test still includes a 'urem' as it did in the
original test from the bug report.
Differential Revision: https://reviews.llvm.org/D83319
Amy Huang [Tue, 16 Jun 2020 00:26:42 +0000 (17:26 -0700)]
[NativeSession] Add column numbers to NativeLineNumber.
Summary:
This adds column numbers if they are present, and otherwise
sets the column number to be zero.
Bug: https://bugs.llvm.org/show_bug.cgi?id=41795
Reviewers: amccarth
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81950
Fangrui Song [Tue, 7 Jul 2020 16:47:08 +0000 (09:47 -0700)]
[ELF] Ignore --no-relax for RISC-V
In GNU ld, --no-relax can disable x86-64 GOTPCRELX relaxation.
It is not useful, so we don't implement it.
For RISC-V, --no-relax disables linker relaxations which have larger
impact.
Linux kernel specifies --no-relax when CONFIG_DYNAMIC_FTRACE is specified
(since http://git.kernel.org/linus/a1d2a6b4cee858a2f27eebce731fbf1dfd72cb4e ).
LLD has not implemented the relaxations, so this option is a no-op.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81359
Aaron En Ye Shi [Thu, 2 Jul 2020 20:13:19 +0000 (20:13 +0000)]
[HIP] Use default triple in llvm-mc for system ld
The Ubuntu system ld does not recognize the amdgcn-amd-amdhsa target.
The host object with the embedded device fat binary should therefore not
be assembled with that triple. It should use the default triple, so that
the object is compatible with the system ld.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D83145
Sanjay Patel [Tue, 7 Jul 2020 15:47:20 +0000 (11:47 -0400)]
[x86] add test for buildvector lowering miscompile (PR46586); NFC
Mehdi Amini [Tue, 7 Jul 2020 15:46:06 +0000 (15:46 +0000)]
Revert "Create the framework and testing environment for MLIR Reduce - a tool"
This reverts commit 28a45d54a7fe722248233165fc7fdbd18d18d233.
Windows bot is broken with:
LLVM ERROR: Error running interestingness test: posix_spawn failed: Permission denied
Muhammad Omair Javaid [Tue, 7 Jul 2020 15:24:14 +0000 (20:24 +0500)]
Combine multiple defs of arm64 register sets
Summary:
This patch aims to combine similar arm64 register set definitions defined in NativeRegisterContextLinux_arm64 and RegisterContextPOSIX_arm64.
I have implemented a register set interface out of RegisterInfoInterface class and moved arm64 register sets into RegisterInfosPOSIX_arm64 which is similar to Utility/RegisterContextLinux_* implemented by various other targets. This will help in managing register sets of new ARM64 architecture features in one place.
Built and tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabihf targets.
Reviewers: labath
Reviewed By: labath
Subscribers: mhorne, emaste, kristof.beyls, atanasyan, danielkiss, lldb-commits
Differential Revision: https://reviews.llvm.org/D80105
Shuhong Liu [Tue, 7 Jul 2020 15:10:15 +0000 (11:10 -0400)]
[Clang] Handle AIX Include management in the driver
Summary: Modify the AIX clang toolchain to include AIX dependencies in the search path
Reviewers: daltenty, stevewan, hubert.reinterpretcast
Reviewed By: daltenty, stevewan, hubert.reinterpretcast
Subscribers: ormris, hubert.reinterpretcast, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82677
Nathan James [Tue, 7 Jul 2020 15:05:09 +0000 (16:05 +0100)]
[ASTMatchers] Added hasDirectBase Matcher
Adds a matcher called `hasDirectBase` for matching the `CXXBaseSpecifier` of a class that directly derives from another class.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D81552
Pavel Labath [Tue, 7 Jul 2020 14:56:05 +0000 (16:56 +0200)]
[lldb/Utility] Fix float->integral conversions in Scalar APInt getters
These functions were doing a bitcast on the float value, which is not
consistent with the other getters, which were doing a numeric conversion
(47.0 -> 47). Change these to do numeric conversions too.
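The difference between the two conversions is easy to see outside of lldb.
The Python sketch below is only an illustration with hypothetical helper
names; it assumes a 32-bit IEEE 754 float and truncation toward zero for the
numeric conversion.

```python
import struct


def bitcast_f32_to_u32(x):
    """Reinterpret the bytes of a 32-bit float as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def numeric_f32_to_int(x):
    """Convert the numeric value, truncating toward zero."""
    return int(x)


print(hex(bitcast_f32_to_u32(47.0)))  # the raw IEEE 754 bit pattern
print(numeric_f32_to_int(47.0))       # the value a caller usually expects
```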
SharmaRithik [Tue, 7 Jul 2020 14:26:34 +0000 (19:56 +0530)]
[CodeMoverUtils] Make specific analysis dependent checks optional
Summary: This patch makes optional the code motion checks that depend on
specific analyses, for example the dominator tree, post dominator tree and
dependence info. The aim is to make the adoption of CodeMoverUtils easier for
clients that don't use the analyses which were strictly required by
CodeMoverUtils. This will also help in diversifying code motion checks using
other analyses, for example MSSA.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D82566
Guillaume Chatelet [Tue, 7 Jul 2020 14:35:12 +0000 (14:35 +0000)]
[Bitfields][NFC] Make sure bitfields are contiguous
Differential Revision: https://reviews.llvm.org/D83202
Eric Schweitz [Wed, 1 Jul 2020 19:32:44 +0000 (12:32 -0700)]
[flang] Add lowering of I/O statements.
The IO module is where I/O related statements are lowered to calls to the runtime library.
Differential revision: https://reviews.llvm.org/D83267
Balázs Kéri [Tue, 7 Jul 2020 12:21:18 +0000 (14:21 +0200)]
[ASTImporter] Corrected import of repeated friend declarations.
Summary:
Import declarations in correct order if a class contains
multiple redundant friend (type or decl) declarations.
If the order is incorrect this could cause false structural
equivalences and wrong declaration chains after import.
Reviewers: a.sidorin, shafik, a_sidorin
Reviewed By: shafik
Subscribers: dkrupp, Szelethus, gamesh411, teemperor, martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75740
Ye Luo [Tue, 7 Jul 2020 14:13:37 +0000 (10:13 -0400)]
[OpenMP] Use primary context in CUDA plugin
Summary:
Retaining per device primary context is preferred to creating a context owned by the plugin.
From the CUDA documentation:
1. "Note that the use of multiple CUcontexts per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used." from https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DRIVER.html
2. Right under cuCtxCreate. In most cases it is recommended to use cuDevicePrimaryCtxRetain. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf
3. The primary context is unique per device and shared with the CUDA runtime API. These functions allow integration with other libraries using CUDA. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PRIMARY__CTX.html#group__CUDA__PRIMARY__CTX
Two issues are addressed by this patch:
1. Not using the primary context caused interoperability issues with libraries like cublas and cusolver (CUBLAS_STATUS_EXECUTION_FAILED and cudaErrorInvalidResourceHandle).
2. On OLCF summit, "Error returned from cuCtxCreate" and "CUDA error is: invalid device ordinal"
Regarding the flags of the primary context: if it is inactive, we set CU_CTX_SCHED_BLOCKING_SYNC. If it is already active, we respect the current flags.
Reviewers: grokos, ABataev, jdoerfert, protze.joachim, AndreyChurbanov, Hahnfeld
Reviewed By: jdoerfert
Subscribers: openmp-commits, yaxunl, guansong, sstefan1, tianshilei1992
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D82718
Alexey Bataev [Tue, 7 Jul 2020 13:33:06 +0000 (09:33 -0400)]
[DEBUGINFO]Add dwarf versions to the test, NFC.
Pavel Labath [Tue, 7 Jul 2020 13:23:49 +0000 (15:23 +0200)]
[lldb/test] Fix lldbutil.run_to_***_breakpoint for shared libraries
Even non-remote targets may need to set the launch environment
((DY)LD_LIBRARY_PATH, specifically) to run successfully.
Also, add an assertion to better detect the case when launching a target
fails and the breakpoint is never hit.
Roman Lebedev [Tue, 7 Jul 2020 13:53:19 +0000 (16:53 +0300)]
[Scalarizer] When gathering scattered scalar, don't replace it with itself
The (previously-crashing) test-case would cause us to seemingly-harmlessly
replace some use with something else, but we can't replace it with itself,
so we would crash.
Liu, Chen3 [Tue, 7 Jul 2020 13:22:27 +0000 (21:22 +0800)]
[X86] Fix a bug when lowering byval argument
When an argument has the 'byval' attribute and should be
passed on the stack according to the calling convention,
a stack copy would be emitted twice. This causes
the real value to be put on the stack where the pointer
should be passed.
Differential Revision: https://reviews.llvm.org/D83175
Joel E. Denny [Tue, 7 Jul 2020 13:48:22 +0000 (09:48 -0400)]
[OpenMP][NFC] Remove hard-coded line numbers from more tests
This is a continuation of D82224.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D83057
Georgii Rymar [Mon, 6 Jul 2020 13:43:01 +0000 (16:43 +0300)]
[llvm-readobj] - Refactor the MipsGOTParser<ELFT> to stop using report_fatal_error().
`MipsGOTParser` is a helper class that is used to dump MIPS GOT and PLT.
There is a problem with it: it might call report_fatal_error() on invalid input.
When this happens, the tool reports a crash:
```
# command stderr:
LLVM ERROR: Cannot find PLTGOT dynamic table tag.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
...
```
Such errors were not tested. In this patch I've refactored `MipsGOTParser`:
I've split the handling of GOT and PLT into separate methods. This allows propagating
any possible errors to the caller and should allow dumping the PLT when something is wrong
with the GOT and vice versa in the future.
I've added tests for each `report_fatal_error()`
call and now call `reportError` instead. In the future we might want to switch to
reporting warnings, but that requires additional testing and should
be performed independently.
I've kept `unwrapOrError` calls untouched for now as I'd like to focus on eliminating
`report_fatal_error` calls in this patch only.
Differential revision: https://reviews.llvm.org/D83225
Nathan James [Tue, 7 Jul 2020 13:30:52 +0000 (14:30 +0100)]
[NFC] Use hasAnyName matcher in place of anyOf(hasName()...)
Georgii Rymar [Tue, 7 Jul 2020 13:22:10 +0000 (16:22 +0300)]
[llvm-readobj] - Fix indentation in broken-dynamic-reloc.test. NFC.
Fix broken indentation introduced by myself in rG4a3c3d741a17.
Georgii Rymar [Mon, 6 Jul 2020 15:07:35 +0000 (18:07 +0300)]
[llvm-readobj] - Don't abort when dumping dynamic relocations when an object has both REL and RELA.
Currently, llvm-readobj calls `report_fatal_error` when an object has
both REL and RELA dynamic relocations.
llvm-readelf is able to handle this case properly. This patch adds such a test case
and adjusts the llvm-readobj code to follow (and be consistent with its own RELR and PLTREL cases).
Differential revision: https://reviews.llvm.org/D83232
Sam McCall [Thu, 2 Jul 2020 21:51:26 +0000 (23:51 +0200)]
[clangd] Store index in '.cache/clangd/index' instead of '.clangd/index'
Summary:
.clangd/index was well-intentioned in 2754942cbaef, but `.clangd` is the best
filename for the clangd config file (matching .clang-format and .clang-tidy).
And of course we can't have both .clangd/index and .clangd...
There are a few overlapping goals to satisfy:
- it should be clear from the directory name that this is transient
data that is safe to delete at the cost of recomputation, i.e. a cache
- it should be easy and self-documenting to blacklist these files in .gitignore
- we should have some consistency between filenames in-tree and
corresponding files in user storage (e.g. under XDG's ~/.cache/)
- we should be consistent across platforms (including windows, which
doesn't have distinct cache vs config directories)
So the plan is:
$PROJECT/.clangd (project config)
$PROJECT/.cache/clangd/index/ (project index)
$PROJECT/.cache/clangd/modules/ (maybe in future)
$XDG_CONFIG_HOME/clangd/config.yaml (user config)
$XDG_CACHE_HOME/clangd/index/ (index of non-project files)
$XDG_CACHE_HOME/clangd/modules/ (maybe in future)
This is sensible if XDG_{CONFIG,CACHE}_HOME coincide, and has a simple
.gitignore rule going forward: `.cache/`.
The monorepo gitignore is updated to reflect the backwards-compatible practice:
ignore .clangd/ (with trailing slash) matching index files from clangd 9/10
ignore .cache matching index from clangd 11+, and potentially other tools.
The entries from llvm-project/llvm gitignore are removed (obsolete).
Reviewers: kadircet, hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, omtcyfz, arphaman, usaxena95, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D83099
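The directory plan in the clangd commit above can be sketched as a small path-resolution helper. This is only an illustration, not clangd's code: `xdgDir`, `ClangdPaths` and `layoutFor` are hypothetical names, and the `$HOME/.config` and `$HOME/.cache` fallbacks are the XDG Base Directory defaults, assumed here for the case where the variables are unset.

```cpp
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: use the XDG variable if it is set and non-empty,
// otherwise fall back to the XDG default directory under $HOME.
fs::path xdgDir(const std::optional<std::string> &EnvValue,
                const fs::path &Home, const char *Fallback) {
  if (EnvValue && !EnvValue->empty())
    return fs::path(*EnvValue);
  return Home / Fallback;
}

// The four locations that exist today according to the plan; the
// "modules" directories are marked "maybe in future" and left out.
struct ClangdPaths {
  fs::path ProjectConfig; // $PROJECT/.clangd
  fs::path ProjectIndex;  // $PROJECT/.cache/clangd/index/
  fs::path UserConfig;    // $XDG_CONFIG_HOME/clangd/config.yaml
  fs::path UserIndex;     // $XDG_CACHE_HOME/clangd/index/
};

// Environment values are passed in explicitly so the sketch stays testable.
ClangdPaths layoutFor(const fs::path &Project, const fs::path &Home,
                      const std::optional<std::string> &XdgConfigHome,
                      const std::optional<std::string> &XdgCacheHome) {
  ClangdPaths P;
  P.ProjectConfig = Project / ".clangd";
  P.ProjectIndex = Project / ".cache" / "clangd" / "index";
  P.UserConfig =
      xdgDir(XdgConfigHome, Home, ".config") / "clangd" / "config.yaml";
  P.UserIndex = xdgDir(XdgCacheHome, Home, ".cache") / "clangd" / "index";
  return P;
}
```

With both XDG variables unset, the user index lands under `$HOME/.cache/clangd/index`, which mirrors the in-tree `.cache/clangd/index` name.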
Benjamin Kramer [Tue, 7 Jul 2020 10:49:32 +0000 (12:49 +0200)]
[mlir][VectorOps] Lower vector.outerproduct of int vectors
vector.fma and mulf don't work on integers. Use a muli/addi pair or
plain muli instead.
Differential Revision: https://reviews.llvm.org/D83292
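For reference, the integer semantics that the vector.outerproduct commit above lowers to can be written as a plain scalar loop. This is a sketch of the arithmetic only, not of the MLIR lowering; `outerProduct` and `Matrix` are illustrative names.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Scalar reference for an integer outer product with an optional accumulator.
// With an accumulator each element is a multiply followed by an add (the
// muli/addi pair); without one it is a plain multiply (muli).
Matrix outerProduct(const std::vector<int> &A, const std::vector<int> &B,
                    const Matrix *Acc = nullptr) {
  Matrix Result(A.size(), std::vector<int>(B.size()));
  for (std::size_t I = 0; I < A.size(); ++I) {
    for (std::size_t J = 0; J < B.size(); ++J) {
      int Product = A[I] * B[J];                        // "muli"
      Result[I][J] = Acc ? Product + (*Acc)[I][J]       // "addi"
                         : Product;
    }
  }
  return Result;
}
```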
Lei Zhang [Tue, 7 Jul 2020 12:28:25 +0000 (08:28 -0400)]
[mlir][spirv] Introduce OwningSPIRVModuleRef for ownership
Similar to OwningModuleRef, OwningSPIRVModuleRef signals ownership
transfer clearly. This is useful for APIs like spirv::deserialize,
where a spirv::ModuleOp is returned by deserializing SPIR-V binary
module.
This addresses the ASAN error as reported in
https://bugs.llvm.org/show_bug.cgi?id=46272
Differential Revision: https://reviews.llvm.org/D81652
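The ownership idea behind the OwningSPIRVModuleRef commit above can be illustrated with a generic move-only handle. This is a simplified sketch under assumptions: `OwningRef`, `FakeModule` and `deserializeSketch` are hypothetical, and the handle uses `delete` where the real MLIR wrappers release their operation in their own way.

```cpp
#include <utility>

// Stand-in for a module object; counts live instances so leaks are visible.
struct FakeModule {
  inline static int Live = 0;
  FakeModule() { ++Live; }
  ~FakeModule() { --Live; }
};

// Move-only owning handle: whoever holds it is responsible for destruction.
template <typename T> class OwningRef {
public:
  explicit OwningRef(T *Raw = nullptr) : Ptr(Raw) {}
  OwningRef(OwningRef &&Other) noexcept : Ptr(Other.release()) {}
  OwningRef &operator=(OwningRef &&Other) noexcept {
    if (this != &Other) {
      reset();
      Ptr = Other.release();
    }
    return *this;
  }
  OwningRef(const OwningRef &) = delete;
  OwningRef &operator=(const OwningRef &) = delete;
  ~OwningRef() { reset(); }

  T *get() const { return Ptr; }
  explicit operator bool() const { return Ptr != nullptr; }
  // Give up ownership without destroying the object.
  T *release() { return std::exchange(Ptr, nullptr); }

private:
  void reset() {
    delete Ptr;
    Ptr = nullptr;
  }
  T *Ptr;
};

// A deserialize-style API returning the handle makes the transfer explicit.
OwningRef<FakeModule> deserializeSketch() {
  return OwningRef<FakeModule>(new FakeModule());
}
```

Returning the handle by value means a caller that ignores the result cannot leak the module, which is the kind of error the ASAN report pointed at.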
Ayal Zaks [Sun, 7 Jun 2020 08:36:57 +0000 (11:36 +0300)]
[LV] Vectorize without versioning-for-unit-stride under -Os/-Oz
If a loop is in a function marked OptSize, Loop Access Analysis should refrain
from generating runtime checks for unit strides that will version the loop.
If a loop is in a function marked OptSize and its vectorization is enabled, it
should be vectorized w/o any versioning.
Fixes PR46228.
Differential Revision: https://reviews.llvm.org/D81345
Raphael Isemann [Tue, 7 Jul 2020 11:30:52 +0000 (13:30 +0200)]
[lldb] Make TestIOHandlerResizeNoEditline pass with Python 2
io.BytesIO seems to produce a stream in Python 2 which isn't recognized
as a file object in the SWIG API, so this test fails for Python 2 (and I assume
also an old SWIG version needs to be involved).
Instead just open an empty input file which is a file object in all Python
versions to make this test pass everywhere.
Georgii Rymar [Tue, 7 Jul 2020 11:43:34 +0000 (14:43 +0300)]
[llvm-readobj] - Add prepending # to mips-got.test and mips-plt.test. NFC.
It was requested in D83225 review to do it separately.
Haojian Wu [Tue, 7 Jul 2020 11:35:22 +0000 (13:35 +0200)]
[clang-tidy] Fix an unused-raii check crash on objective-c++.
Differential Revision: https://reviews.llvm.org/D83293
Georgii Rymar [Fri, 3 Jul 2020 14:24:09 +0000 (17:24 +0300)]
[llvm-readobj] - Refine the error reporting in LLVMStyle<ELFT>::printELFLinkerOptions.
It is possible to:
1) Avoid using the `unwrapOrError` calls and hence allow dumping to continue even when
something is not OK with one of the SHT_LLVM_LINKER_OPTIONS sections.
2) Replace `reportWarning` with `reportUniqueWarning` calls. In this method it is a no-op,
because it is not possible to have duplicated warnings anyway, but since we probably
want to switch to `reportUniqueWarning` globally, this is a good thing to do.
This patch addresses both these points.
Differential revision: https://reviews.llvm.org/D83131
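The difference between a plain and a unique warning, as discussed in the printELFLinkerOptions commit above, can be sketched as follows. This is not llvm-readobj's implementation: `WarningReporter` is a hypothetical class, and deduplicating on the message text is an assumption made for the example.

```cpp
#include <iostream>
#include <string>
#include <unordered_set>

// Hypothetical reporter that prints each distinct warning only once.
class WarningReporter {
public:
  // Returns true if the warning was printed, false if it was a duplicate.
  bool reportUniqueWarning(const std::string &Msg) {
    // insert() reports via .second whether the message was new.
    if (!Seen.insert(Msg).second)
      return false;
    std::cerr << "warning: " << Msg << '\n';
    return true;
  }

private:
  std::unordered_set<std::string> Seen;
};
```

When every call site produces a distinct message, the set never filters anything, which matches the "no-op" remark in the commit message.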
Georgii Rymar [Thu, 2 Jul 2020 12:01:35 +0000 (15:01 +0300)]
[llvm-readobj] - Split the printHashSymbols. NFCI.
This introduces `printHashTableSymbols` and
`printGNUHashTableSymbols` to split the `printHashSymbols`.
It makes the code more readable and consistent.
Differential revision: https://reviews.llvm.org/D83040
Kerry McLaughlin [Tue, 7 Jul 2020 10:29:12 +0000 (11:29 +0100)]
[SVE][CodeGen] Legalisation of unpredicated store instructions
Summary:
When splitting a store of a scalable type, the new address is
calculated in SplitVecOp_STORE using a vscale and an add instruction.
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83041
Georgii Rymar [Mon, 6 Jul 2020 10:45:49 +0000 (13:45 +0300)]
[llvm-readobj] - Refactor ELFDumper<ELFT>::getStaticSymbolName.
This is a followup for D83129.
It is possible to make `getStaticSymbolName` report warnings internally
and return "<?>" on an error. This encapsulates the error handling
and slightly simplifies the logic in the callers' code.
Differential revision: https://reviews.llvm.org/D83208
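The pattern described in the getStaticSymbolName commit above, warning inside the helper and handing back a placeholder, might look roughly like this. `symbolNameOrPlaceholder` and its string-table argument are hypothetical and stand in for the real ELF lookup; the warning text is made up for the example.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical lookup: on failure, warn here and return "<?>" so that
// callers do not need their own error handling.
std::string symbolNameOrPlaceholder(const std::vector<std::string> &Names,
                                    std::size_t Index) {
  if (Index >= Names.size()) {
    std::cerr << "warning: unable to read the name of symbol with index "
              << Index << '\n';
    return "<?>";
  }
  return Names[Index];
}
```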
Georgii Rymar [Fri, 3 Jul 2020 13:23:46 +0000 (16:23 +0300)]
[llvm-readobj] - Allow dumping partially corrupted SHT_LLVM_CALL_GRAPH_PROFILE sections.
The code we have currently reports an error if something is not right with the
profile section. Instead we can report a warning and continue dumping when it is possible.
This patch does it.
Differential revision: https://reviews.llvm.org/D83129
Kerry McLaughlin [Tue, 7 Jul 2020 09:35:41 +0000 (10:35 +0100)]
[SVE][CodeGen] Legalisation of unpredicated load instructions
Summary:
When splitting a load of a scalable type, the new address is
calculated in SplitVecRes_LOAD using a vscale and an add instruction.
This patch also adds a DAG combiner fold to visitADD for vscale:
- Fold (add (vscale(C0)), (vscale(C1))) to (vscale(C0 + C1))
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82792
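The vscale fold in the commit above rests on plain distributivity: if vscale(C) denotes C multiplied by the runtime vector-length multiple, then adding two such terms equals one term with the constants summed. A small sketch, with the runtime multiple passed in as an ordinary integer (an assumption for illustration; in the DAG it is not a compile-time value):

```cpp
#include <cstdint>

// vscale(C) modelled as C * VScale, where VScale is the runtime multiple.
constexpr std::int64_t vscaleTimes(std::int64_t C, std::int64_t VScale) {
  return C * VScale;
}

// Before the fold: two vscale nodes and an add.
constexpr std::int64_t unfolded(std::int64_t C0, std::int64_t C1,
                                std::int64_t VScale) {
  return vscaleTimes(C0, VScale) + vscaleTimes(C1, VScale);
}

// After the fold: a single vscale node with the constants added.
constexpr std::int64_t folded(std::int64_t C0, std::int64_t C1,
                              std::int64_t VScale) {
  return vscaleTimes(C0 + C1, VScale);
}

static_assert(unfolded(16, 16, 4) == folded(16, 16, 4),
              "C0*vscale + C1*vscale == (C0 + C1)*vscale");
```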
Manuel Klimek [Mon, 6 Jul 2020 12:02:54 +0000 (14:02 +0200)]
Hand Allocator and IdentifierTable into FormatTokenLexer.
This allows us to share the allocator in the future so we can create tokens while parsing.
Differential Revision: https://reviews.llvm.org/D83218
Guillaume Chatelet [Tue, 7 Jul 2020 09:54:13 +0000 (09:54 +0000)]
[NFC] Adding the align attribute on Atomic{CmpXchg|RMW}Inst
This is the first step to add support for the align attribute to AtomicRMWInst and AtomicCmpXchgInst.
Next step is to add support in IRBuilder and BitcodeReader.
Bug: https://bugs.llvm.org/show_bug.cgi?id=27168
Differential Revision: https://reviews.llvm.org/D83136
Pavel Labath [Mon, 6 Jul 2020 09:04:58 +0000 (11:04 +0200)]
[lldb/DWARF] Add a utility function for (forceful) completion of types
Summary:
Unify the code for requiring a complete type and move it into a single
place. The only functional change is that the "cannot start a definition
of an incomplete type" message is upgraded from a runtime error/warning to an
lldbassert. A plain assert might also be fine, since (AFAICT) this can
only happen in case of a programmer error.
Reviewers: teemperor, aprantl, shafik
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83199
Georgii Rymar [Thu, 2 Jul 2020 10:38:42 +0000 (13:38 +0300)]
[llvm-readobj] - Fix a crash scenario in GNUStyle<ELFT>::printHashSymbols().
We might crash when the dynamic symbols table is empty (or not found)
and --hash-symbols is requested. Both the .hash and .gnu.hash logic are affected.
The patch fixes this issue.
Differential revision: https://reviews.llvm.org/D83037
Kiran Kumar T P [Tue, 7 Jul 2020 08:56:22 +0000 (14:26 +0530)]
[flang][OpenMP] Enhance parser support for flush construct to OpenMP 5.0
Summary:
This patch enhances parser support for flush construct to OpenMP 5.0 by including memory-order-clause.
2.18.8 flush Construct
!$omp flush [memory-order-clause] [(list)]
where memory-order-clause is
acq_rel
release
acquire
The patch includes code changes and testcase modifications.
Reviewed By: klausler, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D82177
River Riddle [Tue, 7 Jul 2020 08:35:23 +0000 (01:35 -0700)]
[mlir][NFC] Remove usernames and google bug numbers from TODO comments.
These were largely leftover from when MLIR was a google project, and don't really follow LLVM guidelines.
David Sherwood [Fri, 19 Jun 2020 08:06:21 +0000 (09:06 +0100)]
[SVE] Add more warnings checks to clang and LLVM SVE tests
There are now more SVE tests in LLVM and Clang that do not
emit warnings related to invalid use of EVT::getVectorNumElements()
and VectorType::getNumElements(). For these tests I have added
additional checks that there are no warnings in order to prevent
any future regressions.
Differential Revision: https://reviews.llvm.org/D82943
David Sherwood [Thu, 25 Jun 2020 07:19:49 +0000 (08:19 +0100)]
[SVE][CodeGen] Fix bug when falling back to DAG ISel
In an earlier commit 584d0d5c1749c13625a5d322178ccb4121eea610 I
added functionality to allow AArch64 CodeGen support for falling
back to DAG ISel when Global ISel encounters scalable vector
types. However, it seems that we were not falling back early
enough as llvm::getLLTForType was still being invoked for scalable
vector types.
I've added a new fallback function to the call lowering class in
order to catch this problem early enough, rather than wait for
lowerFormalArguments to reject scalable vector types.
Differential Revision: https://reviews.llvm.org/D82524
David Sherwood [Mon, 29 Jun 2020 08:39:22 +0000 (09:39 +0100)]
[CodeGen] Fix warnings in sve-vector-splat.ll and sve-trunc.ll
This patch fixes all remaining warnings in:
llvm/test/CodeGen/AArch64/sve-trunc.ll
llvm/test/CodeGen/AArch64/sve-vector-splat.ll
I hit some warnings related to getCopyPartsToVector. I fixed two
issues:
1. In widenVectorToPartType() we assumed that we'd always be
using BUILD_VECTOR nodes to expand from one vector type to another,
which is incorrect for scalable vector types. I've fixed this for now
by simply bailing out immediately for scalable vectors.
2. In getCopyToPartsVector() I've changed the code to compare
the element counts of different types.
Differential Revision: https://reviews.llvm.org/D83028
Craig Topper [Tue, 7 Jul 2020 07:27:50 +0000 (00:27 -0700)]
[X86] Add 64bit and retpoline-external-thunk to list of features in X86TargetParser.def.
'64bit' shows up from -march=native on 64-bit capable CPUs.
'retpoline-external-thunk' isn't a real feature but shows up
when -mretpoline-external-thunk is passed to clang.
Craig Topper [Tue, 7 Jul 2020 07:17:59 +0000 (00:17 -0700)]
[X86] Remove assert for missing features from X86::getImpliedFeatures
This is failing on the bots. Remove while I try to figure out
what feature I missed in the table.
Craig Topper [Tue, 7 Jul 2020 06:50:41 +0000 (23:50 -0700)]
[X86] Merge X86TargetInfo::setFeatureEnabled and X86TargetInfo::setFeatureEnabledImpl. NFC
setFeatureEnabled is a virtual function. setFeatureEnabledImpl
was its implementation. This split was to avoid virtual calls
when we need to call setFeatureEnabled in initFeatureMap.
With C++11 we can use 'final' on setFeatureEnabled to enable
the compiler to perform de-virtualization for the initFeatureMap
calls.
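The `final` approach described in the commit above can be illustrated with a reduced example. The class names and the feature map type are placeholders, not the real clang interfaces; the point is only that a call to a `final` override from within the same class can be resolved statically.

```cpp
#include <map>
#include <string>
#include <vector>

using FeatureMap = std::map<std::string, bool>;

// Placeholder base class with the virtual hook.
struct TargetInfoSketch {
  virtual ~TargetInfoSketch() = default;
  virtual void setFeatureEnabled(FeatureMap &Features, const std::string &Name,
                                 bool Enabled) const = 0;
};

struct X86TargetInfoSketch : TargetInfoSketch {
  // 'final' tells the compiler no further override can exist, so no
  // separate non-virtual "Impl" function is needed to avoid virtual calls.
  void setFeatureEnabled(FeatureMap &Features, const std::string &Name,
                         bool Enabled) const final {
    Features[Name] = Enabled;
  }

  FeatureMap initFeatureMap(const std::vector<std::string> &Enabled) const {
    FeatureMap Features;
    for (const std::string &Name : Enabled)
      setFeatureEnabled(Features, Name, true); // may be called directly
    return Features;
  }
};
```

Callers that only hold a `TargetInfoSketch` reference still go through the virtual dispatch, so behaviour is unchanged.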
Carl Ritson [Tue, 7 Jul 2020 06:40:35 +0000 (15:40 +0900)]
[AMDGPU] Update isFMAFasterThanFMulAndFAdd assumptions
MAD/MAC is no longer always available.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D83207
Saiyedul Islam [Tue, 7 Jul 2020 06:15:26 +0000 (06:15 +0000)]
[libomptarget] Implement atomic inc and fence functions for AMDGCN using clang builtins
This function uses __builtin_amdgcn_atomic_inc32():
uint32_t atomicInc(uint32_t *address, uint32_t max);
These functions use __builtin_amdgcn_fence():
__kmpc_impl_threadfence()
__kmpc_impl_threadfence_block()
__kmpc_impl_threadfence_system()
They will take the place of the current mechanism of directly calling IR functions.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D83132
Saiyedul Islam [Tue, 7 Jul 2020 06:13:43 +0000 (06:13 +0000)]
[AMDGPU] Change Clang AMDGCN atomic inc/dec builtins to take unsigned values
builtin_amdgcn_atomic_inc32(uint *Ptr, uint Val, unsigned MemoryOrdering, const char *SyncScope)
builtin_amdgcn_atomic_inc64(uint64_t *Ptr, uint64_t Val, unsigned MemoryOrdering, const char *SyncScope)
builtin_amdgcn_atomic_dec32(uint *Ptr, uint Val, unsigned MemoryOrdering, const char *SyncScope)
builtin_amdgcn_atomic_dec64(uint64_t *Ptr, uint64_t Val, unsigned MemoryOrdering, const char *SyncScope)
As the AMDGCN IR intrinsic for atomic inc/dec does an unsigned comparison,
these clang builtins should also take unsigned types instead of signed
int types.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D83121
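To see why the unsigned comparison in the commit above matters, here is a host-side emulation of a wrapping atomic increment. It assumes CUDA-style `atomicInc` semantics (store 0 if the old value is greater than or equal to the limit, otherwise old + 1, and return the old value) and uses `std::atomic` rather than the AMDGCN builtins; `atomicIncSketch` is a hypothetical name.

```cpp
#include <atomic>
#include <cstdint>

// Emulates a wrapping increment: returns the previous value and stores
// 0 if Old >= Max, otherwise Old + 1. The comparison is unsigned.
std::uint32_t atomicIncSketch(std::atomic<std::uint32_t> &Address,
                              std::uint32_t Max) {
  std::uint32_t Old = Address.load();
  std::uint32_t New;
  do {
    New = (Old >= Max) ? 0u : Old + 1u;
    // On failure compare_exchange_weak reloads Old, so New is recomputed.
  } while (!Address.compare_exchange_weak(Old, New));
  return Old;
}
```

With a limit of 0x80000000, a counter at 0x7fffffff advances to 0x80000000. If the operands were treated as signed, that limit would be negative and the counter would wrap to 0 instead.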