Benjamin Kramer [Wed, 6 Nov 2019 11:27:11 +0000 (12:27 +0100)]
Silence warning, PyMODINIT_FUNC already contains extern "C"
PythonReadline.h:22:12: warning: duplicate 'extern' declaration specifier [-Wduplicate-decl-specifier]
dfukalov [Tue, 5 Nov 2019 14:30:52 +0000 (17:30 +0300)]
[AMDGPU] Improve code size cost model (part 2)
Summary: Added estimations for ShuffleVector, some cast and arithmetic instructions
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69629
Tim Northover [Wed, 6 Nov 2019 10:22:00 +0000 (10:22 +0000)]
NeonEmitter: remove special 'a' type modifier.
'a' used to implement a splat in C++ code in NeonEmitter.cpp, but this
can be done directly from .td expansions now (and most ops already did).
So removing it simplifies the overall code.
https://reviews.llvm.org/D69716
Sjoerd Meijer [Wed, 6 Nov 2019 09:58:36 +0000 (09:58 +0000)]
[TTI][LV] preferPredicateOverEpilogue
We have two ways to steer creating a predicated vector body over creating a
scalar epilogue. To force this, we have 1) a command line option and 2) a
pragma available. This adds a third: a target hook to TargetTransformInfo that
can be queried whether predication is preferred or not, which allows the
vectoriser to make the decision without forcing it.
While this change behaves as a non-functional change for now, it shows the
required TTI plumbing, usage of this new hook in the vectoriser, and the
beginning of an ARM MVE implementation. I will follow up on this with:
- a complete MVE implementation, see D69845.
- a patch to disable this, i.e. we should respect "vector_predicate(disable)"
and its corresponding loophint.
Differential Revision: https://reviews.llvm.org/D69040
Tim Northover [Wed, 6 Nov 2019 09:47:07 +0000 (09:47 +0000)]
NeonEmitter: switch to enum for internal Type representation.
Previously we had a handful of bools (Signed, Floating, ...) that could
easily end up in an inconsistent state. This adds an enum Kind which
holds the mutually exclusive states a type might be in, retaining some
of the bools that modified an underlying type.
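The hazard described above can be sketched in a few lines. This is not the NeonEmitter code: `TypeSketch`, its `Kind` enumerators, and the `Pointer`/`Constant` modifiers are names assumed purely for illustration.

```cpp
// Illustrative only: independent bools such as Signed and Floating could
// both be true at once; a single enum makes that state unrepresentable.
struct TypeSketch {
  enum class Kind { Void, Float, SInt, UInt, Poly };

  Kind K = Kind::Void;   // exactly one of the mutually exclusive states
  bool Pointer = false;  // modifiers of the underlying type stay as bools
  bool Constant = false;

  bool isInteger() const { return K == Kind::SInt || K == Kind::UInt; }
  bool isSigned() const { return K == Kind::SInt; }
  bool isFloating() const { return K == Kind::Float; }
};
```

Queries such as `isInteger()` are derived from the one `Kind` field, so they cannot disagree with each other.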
https://reviews.llvm.org/D69715
Ilya Biryukov [Wed, 6 Nov 2019 09:56:05 +0000 (10:56 +0100)]
[Syntax] Add nodes for most common statements
Summary:
Most of the statements mirror the ones provided by clang AST.
Major differences are:
- expressions are wrapped into 'ExpressionStatement' instead of being
a subclass of statement,
- semicolons are always consumed by the leaf expressions (return,
expression statement, etc.),
- some clang statements are not handled yet, we wrap those into an
UnknownStatement class, which is not present in clang.
We also define 'Expression' and 'UnknownExpression' classes in order
to produce 'ExpressionStatement' where needed. The actual implementation
of expressions is not yet ready, it will follow later.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63835
Simon Tatham [Thu, 31 Oct 2019 17:02:07 +0000 (17:02 +0000)]
[ARM,MVE] Add intrinsics for gather/scatter load/stores.
This patch adds two new families of intrinsics, both of which are
memory accesses taking a vector of locations to load from / store to.
The vldrq_gather_base / vstrq_scatter_base intrinsics take a vector of
base addresses, and an immediate offset to be added consistently to
each one. vldrq_gather_offset / vstrq_scatter_offset take a scalar
base address, and a vector of offsets to add to it. The
'shifted_offset' variants also multiply each offset by the element
size type, so that the vector is effectively of array indices.
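As a rough scalar model of the two addressing schemes just described (not the intrinsics themselves), the per-lane addresses could be computed as below. The lane count, the 32-bit address type, and the function names are assumptions made for the sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t Lanes = 4;
using Vec = std::array<std::uint32_t, Lanes>;

// gather_base / scatter_base style: a vector of base addresses plus one
// immediate offset that is added to every lane.
Vec baseAddresses(const Vec &Bases, std::uint32_t Imm) {
  Vec Out{};
  for (std::size_t I = 0; I < Lanes; ++I)
    Out[I] = Bases[I] + Imm;
  return Out;
}

// gather_offset / scatter_offset style: one scalar base plus a vector of
// offsets. Passing ElemSize > 1 models the shifted_offset variants, where
// the offsets behave like array indices.
Vec offsetAddresses(std::uint32_t Base, const Vec &Offsets,
                    std::uint32_t ElemSize = 1) {
  Vec Out{};
  for (std::size_t I = 0; I < Lanes; ++I)
    Out[I] = Base + Offsets[I] * ElemSize;
  return Out;
}
```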
At the IR level, these operations are represented by a single set of
four IR intrinsics: {gather,scatter} × {base,offset}. The other
details (signed/unsigned, shift, and memory element size as opposed to
vector element size) are all specified by IR intrinsic polymorphism
and immediate operands, because that made the selection job easier
than making a huge family of similarly named intrinsics.
I considered using the standard IR representations such as
llvm.masked.gather, but they're not a good fit. In order to use
llvm.masked.gather to represent a gather_offset load with element size
smaller than a pointer, you'd have to expand the <8 x i16> vector of
offsets into an <8 x i16*> vector of pointers, which would be split up
during legalization, so you'd spend most of your time undoing the mess
it had made. Also, ISel support for llvm.masked.gather would be easy
enough in a trivial way (you can expand it into a gather-base load
with a zero immediate offset), but instruction-selecting lots of
fiddly idioms back into all the _other_ MVE load instructions would be
much more work. So I think dedicated IR intrinsics are the more
sensible approach, at least for the moment.
On the clang tablegen side, I've added two new features to the
Tablegen source accepted by MveEmitter: a 'CopyKind' type node for
defining a type that varies with the parameter type (it lets you ask
for an unsigned integer type of the same width as the parameter), and
an 'unsignedflag' value node for passing an immediate IR operand which
is 0 for a signed integer type or 1 for an unsigned one. That lets me
write each kind of intrinsic just once and get all its subtypes and
immediate arguments generated automatically.
Also I've tweaked the handling of pointer-typed values in the code
generation part of MveEmitter: they're generated as Address rather
than Value (i.e. including an alignment) so that they can be given to
the ordinary IR load and store operations, but I'd omitted the code to
convert them back to Value when they're going to be used as an
argument to an IR intrinsic.
On the MC side, I've enhanced MVEVectorVTInfo so that it can tell you
not only the full assembly-language suffix for a given vector type
(like 's32' or 'u16') but also the numeric-only one used by store
instructions (just '32' or '16').
Reviewers: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69791
Simon Tatham [Mon, 4 Nov 2019 12:40:25 +0000 (12:40 +0000)]
[ARM,MVE] Integer-type nitpicks in MVE intrinsics.
A few integer types in the ACLE definitions of MVE intrinsics are
given as 'int' or 'unsigned' instead of <stdint.h> fixed-size types
like uint32_t. Usually these are the ones where the size isn't that
important, such as immediate offsets in loads (which have a range
limited by the instruction encoding) or the carry flag in vadcq which
can only be 0 or 1 anyway.
With this change, <arm_mve.h> follows that exact type naming, so that
the function prototypes look identical to the ones in ACLE, instead of
replacing int and unsigned with int32_t and uint32_t.
Reviewers: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69790
Simon Tatham [Thu, 31 Oct 2019 17:02:42 +0000 (17:02 +0000)]
[clang,MveEmitter] Fix sign/zero extension in range limits.
In the code that generates Sema range checks on constant arguments, I
had a piece of code that checks the bounds specified in the Tablegen
intrinsic description against the range of the integer type being
tested. If the bounds are large enough to permit any value of the
integer type, you can omit the compile-time range check. (This case is
expected to come up in some of the bitwise operation intrinsics.)
But somehow I got my signed/unsigned check backwards (asking for the
signed min/max of an unsigned type and vice versa), and also made a
sign extension error in which a signed negative value gets
zero-extended. Now rewritten more sensibly, and it should get its
first sensible test from the next batch of intrinsics I'm planning to
add in D69791.
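The two mistakes named above can be illustrated with a standalone sketch. It is not the MveEmitter code: `boundsCoverType` and `signExtend` are hypothetical helpers, and the sketch assumes widths of at most 32 bits so that everything fits in 64-bit arithmetic.

```cpp
#include <cstdint>

// Returns true when [Lo, Hi] admits every value of an integer type with the
// given width and signedness, so a compile-time range check could be skipped.
// The limits must match the signedness of the type being tested.
// Assumes 1 <= Bits <= 32.
bool boundsCoverType(std::int64_t Lo, std::int64_t Hi, unsigned Bits,
                     bool IsSigned) {
  std::int64_t TypeMin = IsSigned ? -(std::int64_t{1} << (Bits - 1)) : 0;
  std::int64_t TypeMax = IsSigned ? (std::int64_t{1} << (Bits - 1)) - 1
                                  : (std::int64_t{1} << Bits) - 1;
  return Lo <= TypeMin && Hi >= TypeMax;
}

// Sign-extends the low Bits bits of Raw. Zero-extending instead would turn
// a negative bound such as 0xFF (8-bit -1) into 255.
// Assumes 1 <= Bits <= 32.
std::int64_t signExtend(std::uint64_t Raw, unsigned Bits) {
  std::uint64_t Mask = (std::uint64_t{1} << Bits) - 1;
  std::uint64_t SignBit = std::uint64_t{1} << (Bits - 1);
  Raw &= Mask;
  return static_cast<std::int64_t>(Raw ^ SignBit) -
         static_cast<std::int64_t>(SignBit);
}
```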
Reviewers: dmgreen
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69789
Simon Tatham [Thu, 31 Oct 2019 17:00:39 +0000 (17:00 +0000)]
[ARM MVE] Remove accidental 64-bit vst2/vld2 intrinsics.
ACLE defines no such intrinsic as vst2q_u64, and the MVE instruction
set has no corresponding instruction. But I had accidentally added
them to the fledgling <arm_mve.h> anyway, and if you used them, you'd
get a compiler crash.
Reviewers: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69788
Haojian Wu [Wed, 30 Oct 2019 12:21:47 +0000 (13:21 +0100)]
[clangd] Implement a function to lex the file to find candidate occurrences.
Summary:
This will be used for the upcoming cross-file rename (to detect index
staleness issues).
Reviewers: ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69615
paulhoad [Wed, 6 Nov 2019 09:50:54 +0000 (09:50 +0000)]
clang-format: Add a fallback style to Emacs mode
Summary:
This allows one to enable `clang-format-buffer` on file save and avoid
reformatting files that are outside of any project with .clang-format style.
Reviewers: djasper, klimek, sammccall, owenpan, mitchell-stellar, MyDeveloperDay
Reviewed By: MyDeveloperDay
Subscribers: cfe-commits
Patch By: dottedmag
Tags: #clang, #clang-format
Differential Revision: https://reviews.llvm.org/D69752
paulhoad [Wed, 6 Nov 2019 09:34:01 +0000 (09:34 +0000)]
[clang-format] [PR35518] C++17 deduction guides are wrongly formatted
Summary:
see https://bugs.llvm.org/show_bug.cgi?id=35518
clang-format removes spaces around deduction guides but not trailing return types; make them consistent.
```
template <typename T> S(T)->S<T>;
auto f(int, int) -> double;
```
becomes
```
template <typename T> S(T) -> S<T>;
auto f(int, int) -> double;
```
Reviewers: klimek, mitchell-stellar, owenpan, sammccall, lichray, curdeius, KyrBoh
Reviewed By: curdeius
Subscribers: merge_guards_bot, hans, lichray, cfe-commits
Tags: #clang-format, #clang-tools-extra, #clang
Differential Revision: https://reviews.llvm.org/D69577
LLVM GN Syncbot [Wed, 6 Nov 2019 08:29:28 +0000 (08:29 +0000)]
gn build: Merge 24130d661ed
Matthias Gehre [Sun, 22 Sep 2019 21:19:41 +0000 (23:19 +0200)]
[clang-tidy] Add readability-make-member-function-const
Summary:
Finds non-static member functions that can be made ``const``
because the functions don't use ``this`` in a non-const way.
The check conservatively tries to preserve logical constness in favor of
physical constness. See readability-make-member-function-const.rst for more
details.
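A small sketch of the distinction, under the assumption that the check behaves as the summary suggests (the .rst file is the authoritative description); `Buffer` and its members are invented for illustration:

```cpp
class Buffer {
public:
  // Only reads a builtin member: logical and physical constness agree,
  // so this is the kind of function that could be made const.
  int size() { return Size; }

  // Would still compile if marked const, because only the pointer itself
  // is read. The caller can modify the pointee, though, so a check that
  // favors logical constness would be expected to leave this alone.
  int *data() { return Data; }

private:
  int Storage[3] = {1, 2, 3};
  int *Data = Storage;
  int Size = 3;
};
```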
Reviewers: aaron.ballman, gribozavr, hokein, alexfh
Subscribers: mgorny, xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68074
Thomas Finch [Wed, 6 Nov 2019 05:51:04 +0000 (21:51 -0800)]
YAML parser robustness improvements
Summary: This patch fixes a number of bugs found in the YAML parser
through fuzzing. In general, this makes the parser more robust against
malformed inputs.
The fixes are mostly improved null checking and returning errors in
more cases. In some cases, asserts were changed to regular errors,
this provides the same robustness but also protects release builds
from the triggering conditions. This also improves the fuzzability of
the YAML parser since asserts can act as a roadblock to further
fuzzing once they're hit.
Each fix has a corresponding test case:
- TestAnchorMapError - Added proper null pointer handling in
`Stream::printError` if N is null and `KeyValueNode::getValue` if
getKey returns null, `Input::createHNodes` `dyn_casts` changed to
`dyn_cast_or_null` so the null pointer checks are actually able to
fail
- TestFlowSequenceTokenErrors - Added case in
`Document::parseBlockNode` for FlowMappingEnd, FlowSequenceEnd, or
FlowEntry tokens outside of mappings or sequences
- TestDirectiveMappingNoValue - Changed assert to regular error
return in `Scanner::scanValue`
- TestUnescapeInfiniteLoop - Fixed infinite loop in
`ScalarNode::unescapeDoubleQuoted` by returning an error for
unrecognized escape codes
- TestScannerUnexpectedCharacter - Changed asserts to regular error
returns in `Scanner::consume`
- TestUnknownDirective - For both of the inputs the stream doesn't
fail and correctly returns TK_Error, but there is no valid root
node for the document. There's no reasonable way to make the
scanner fail for unknown directives without breaking the YAML spec
(see spec-07-01.test). I think the assert is unnecessary given
that an error is still generated for this case.
The `SimpleKeys.clear()` line fixes a bug found by AddressSanitizer
triggered by multiple test cases - when TokenQueue is cleared
SimpleKeys is still holding dangling pointers into it, so SimpleKeys
should be cleared as well.
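The dangling-pointer situation can be modelled without the YAML parser. In this sketch `ScannerSketch`, `push`, and `reset` are assumed names; only `TokenQueue` and `SimpleKeys` are taken from the message above, and their real types may differ.

```cpp
#include <list>
#include <string>
#include <utility>
#include <vector>

struct ScannerSketch {
  std::list<std::string> TokenQueue;           // owns the tokens
  std::vector<const std::string *> SimpleKeys; // points into TokenQueue

  void push(std::string Tok, bool MayBeKey) {
    TokenQueue.push_back(std::move(Tok));
    if (MayBeKey)
      SimpleKeys.push_back(&TokenQueue.back());
  }

  // Clearing only TokenQueue would leave SimpleKeys holding pointers to
  // destroyed elements; both containers have to be cleared together.
  void reset() {
    TokenQueue.clear();
    SimpleKeys.clear();
  }
};
```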
Patch by Thomas Finch!
Reviewers: chandlerc, Bigcheese, hintonda
Reviewed By: Bigcheese, hintonda
Subscribers: hintonda, kristina, beanz, dexonsmith, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61608
Yevgeny Rouban [Wed, 6 Nov 2019 04:17:51 +0000 (11:17 +0700)]
[ADT] Add equality operator for SmallPtrSet
Reviewed By: tellenbach
Differential Revision: https://reviews.llvm.org/D69429
QingShan Zhang [Wed, 6 Nov 2019 02:46:37 +0000 (02:46 +0000)]
[PowerPC] Fix the incorrect 'RM' flag set on load/store instr
The 'RM' flag models the "Rounding Mode" and has nothing to do with load/store instructions.
Differential Revision: https://reviews.llvm.org/D69551
Chris Bieneman [Wed, 30 Oct 2019 19:50:04 +0000 (12:50 -0700)]
Implement `sys::getHostCPUName()` for Darwin ARM
Summary: Currently there is no implementation of `sys::getHostCPUName()` for Darwin ARM targets. This patch makes it so that LLVM running on ARM makes reasonable guesses about the CPU features of the host CPU.
Reviewers: t.p.northover, lhames, efriedma
Reviewed By: efriedma
Subscribers: rjmccall, efriedma, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69597
Vladimir Vereschaka [Wed, 6 Nov 2019 01:09:50 +0000 (17:09 -0800)]
Fixed a profdata file size detection on Windows system.
Spaces are allowed in group names on Windows (for example:
Domain Users). In that case the test extracts the wrong field
from the output when reading the size of the profdata file.
This patch avoids printing the group names in the test output and
extracts the proper field as the file size.
Differential Revision: https://reviews.llvm.org/D69317
Teresa Johnson [Tue, 5 Nov 2019 22:00:58 +0000 (14:00 -0800)]
[IRMover] Set Address Space for moved global values
Summary:
Set Address Space when creating a new function (from another).
Fix PR41154.
Patch by Ehud Katz <ehudkatz@gmail.com>
Reviewers: tejohnson, chandlerc
Reviewed By: tejohnson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69361
Daniel Sanders [Tue, 29 Oct 2019 02:10:26 +0000 (19:10 -0700)]
[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
It looks like I pushed an older version of this commit without the review
fixups earlier. This applies the review changes
Differential Revision: https://reviews.llvm.org/D69545
Daniel Sanders [Tue, 5 Nov 2019 23:10:00 +0000 (15:10 -0800)]
[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
Summary:
Rework the GMIR documentation to focus more on the end user than the
implementation and tie it in to the MIR document. There was also some
out-of-date information which has been removed.
The quality of the GenericOpcode reference is highly variable and drops
sharply as I worked through them all but we've got to start somewhere :-).
It would be great if others could expand on this too as there is an awful
lot to get through.
Also fix a typo in the definition of G_FLOG. Previously, the comments said
we had two base-2's (G_FLOG and G_FLOG2).
Reviewers: aemerson, volkan, rovka, arsenm
Reviewed By: rovka
Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69545
James Molloy [Tue, 5 Nov 2019 22:53:56 +0000 (22:53 +0000)]
[Automaton] Make Automaton thread-safe
In an optimization to improve performance (rL375240) we added a std::shared_ptr
around the main table map. This is safe, but we also ended up making the
transcriber object a std::shared_ptr too. This has mutable state, so must be
copied when we copy the Automaton object. This is very cheap; the main optimization
was about the map `M` only.
Reported by Dan Palermo. No test as triggering this is rather hard from a unit test.
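The ownership split described above might look roughly like this. It is a simplified stand-in, not the Automaton class: the table type, `step`, `visited`, and `sharesTableWith` are assumptions, and only the name `M` for the shared map comes from the message.

```cpp
#include <map>
#include <memory>
#include <utility>
#include <vector>

class AutomatonSketch {
public:
  using Table = std::map<std::pair<int, char>, int>;

  explicit AutomatonSketch(Table T)
      : M(std::make_shared<const Table>(std::move(T))) {}

  // Follows one transition from the current state; returns false if the
  // table has no entry for (state, C).
  bool step(char C) {
    auto It = M->find(std::make_pair(State, C));
    if (It == M->end())
      return false;
    State = It->second;
    Visited.push_back(State);
    return true;
  }

  const std::vector<int> &visited() const { return Visited; }
  bool sharesTableWith(const AutomatonSketch &O) const { return M == O.M; }

private:
  std::shared_ptr<const Table> M; // immutable, so copies can share it
  int State = 0;                  // mutable, copied by value
  std::vector<int> Visited;       // mutable, copied by value
};
```

The implicitly generated copy constructor shares `M` but gives each copy its own `State` and `Visited`, so two copies can be advanced independently.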
Daniel Sanders [Wed, 30 Oct 2019 21:47:36 +0000 (14:47 -0700)]
[globalisel][docs] Add a section about debugging with the block extractor
Summary: Depends on D69644
Reviewers: rovka, volkan, arsenm
Subscribers: wdng, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69645
Stanislav Mekhanoshin [Tue, 5 Nov 2019 22:15:08 +0000 (14:15 -0800)]
[AMDGPU] Add missing flags to DS_Real
Differential Revision: https://reviews.llvm.org/D69867
Sanjay Patel [Tue, 5 Nov 2019 22:18:03 +0000 (17:18 -0500)]
[SLP] add tests for 2-wide reductions; NFC
Alex Langford [Tue, 5 Nov 2019 22:11:24 +0000 (14:11 -0800)]
[TestMTCSimple] Disable the test if you don't have libMTC
If you are running on macOS and have the Xcode CommandLineTools
installed, this test will fail because CommandLineTools doesn't ship with
libMainThreadChecker. Skip the test if you don't have it installed.
Volodymyr Sapsai [Tue, 5 Nov 2019 22:03:36 +0000 (14:03 -0800)]
Revert "[analyzer] Add test directory for scan-build."
This reverts commit 0aba69eb1a01c44185009f50cc633e3c648e9950 with
subsequent changes to test files.
It caused test failures on GreenDragon, e.g.,
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/
Teresa Johnson [Tue, 5 Nov 2019 21:07:20 +0000 (13:07 -0800)]
[IRMover] Use GlobalValue::getAddressSpace instead of directly from its type [NFC]
Summary: Change the old form of G->getType()->getAddressSpace() to the new G->getAddressSpace() (underneath does the same).
Patch by Ehud Katz <ehudkatz@gmail.com>
Reviewers: tejohnson, chandlerc
Reviewed By: tejohnson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69550
Simon Atanasyan [Tue, 5 Nov 2019 21:12:39 +0000 (00:12 +0300)]
[mips] Fix `getRegForInlineAsmConstraint` to not crash on an empty Constraint
Kelvin Li [Tue, 5 Nov 2019 18:44:46 +0000 (13:44 -0500)]
[CMake] Prevent adding lld to test dependency (TEST_DEPS) when lld project is not built
D69405 causes failure if running LIT when the compiler was built without lld.
Patch by Anh Tuyen Tran (anhtuyen)
Differential Revision: https://reviews.llvm.org/D69685
Alina Sbirlea [Tue, 5 Nov 2019 21:37:23 +0000 (13:37 -0800)]
[LoopRotationUtils] Check values are newly inserted into maps.
This is a cleanup that came up in D63680.
All values added to the ValueMaps should be newly added.
Simon Pilgrim [Tue, 5 Nov 2019 21:25:55 +0000 (21:25 +0000)]
[Hexagon] getCompoundCandidateGroup - fix 'false' value is implicitly cast to unsigned warning. NFCI.
Consistently return HexagonII::HCG_None.
Haibo Huang [Tue, 5 Nov 2019 01:04:54 +0000 (17:04 -0800)]
[lldb] Add a install target for lldb python on darwin
Summary: Similar to D68370 but for darwin framework build.
Reviewers: aadsm
Subscribers: mgorny, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D69834
Philip Reames [Tue, 5 Nov 2019 21:17:01 +0000 (13:17 -0800)]
[X86/Atomics] Correct a few transforms for new atomic lowering
This is a partial fix for the issues described in the commit message of 027aa27 (the revert of G24609). Unfortunately, I can't provide test coverage for it on its own, as the only (known) wrong example is still wrong, but due to a separate issue.
These fixes are cases where when performing unrelated DAG combines, we were dropping the atomicity flags entirely.
Bill Wendling [Tue, 5 Nov 2019 21:09:42 +0000 (13:09 -0800)]
Fix typo so that '-O0' is correctly specified
Alexey Bataev [Tue, 5 Nov 2019 20:33:18 +0000 (15:33 -0500)]
[OPENMP50]Simplify processing of context selector scores.
If the context selector score was not specified, its value must be set
to 0. Simplify the processing of unspecified scores + save memory in
attribute representation.
Amy Huang [Tue, 5 Nov 2019 18:54:50 +0000 (10:54 -0800)]
[MIR] Add MIR parsing for heap alloc site instruction markers
Summary:
This patch adds MIR parsing and printing for heap alloc markers, which were
added in D69136. They are printed as an operand similar to pre-/post-instr
symbols, with a heap-alloc-marker token and a metadata node.
Reviewers: rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69864
Mark de Wever [Tue, 5 Nov 2019 19:39:55 +0000 (20:39 +0100)]
[Sema] Fixes templated friend member assertion
Fixes PR41792: Clang assertion failure on templated friend member function
Differential Revision: https://reviews.llvm.org/D69481
Adrian Prantl [Tue, 5 Nov 2019 20:43:00 +0000 (12:43 -0800)]
[ValueObject] Upstream early exit from swift-lldb. (NFC)
Adrian Prantl [Tue, 5 Nov 2019 19:09:27 +0000 (11:09 -0800)]
[ValueObject] Upstream initialization from swift-lldb.
This is a non-Swift-specific change in swift-lldb that seems to be
useful for remote debugging. If it does in fact turn out to be redundant
we can remove it from llvm.org and then it will disappear in
swift-lldb, too.
Jonas Devlieghere [Tue, 5 Nov 2019 20:28:25 +0000 (12:28 -0800)]
[Reproducer] Add test case for expression evaluation
Benjamin Kramer [Tue, 5 Nov 2019 20:21:29 +0000 (21:21 +0100)]
[X86] Gate select->fmin/fmax transform on NoSignedZeros instead of UnsafeFPMath
Fred Riss [Tue, 5 Nov 2019 19:14:38 +0000 (11:14 -0800)]
TestBatchMode.py: add missing @skipIfRemote
All the tests in this file were already marked as skipped for remote tests
except for this one.
Fred Riss [Tue, 5 Nov 2019 19:10:21 +0000 (11:10 -0800)]
testsuite: skipIfNoSBHeaders should skip when running remotely
The LLDB dylib/framework will not be available on the remote host, it makes
no sense to try to run those tests in a remote scenario.
Fred Riss [Tue, 5 Nov 2019 18:56:29 +0000 (10:56 -0800)]
Modernize add-dsym test Makefile
Julian Lettner [Tue, 5 Nov 2019 20:10:43 +0000 (12:10 -0800)]
Revert "[lit] Better/earlier errors when no tests are executed"
This reverts commit d8f2bff75126c6dde694ad245f9807fa12ad5630.
Stanislav Mekhanoshin [Tue, 5 Nov 2019 19:22:07 +0000 (11:22 -0800)]
[AMDGPU] Removed dead code from R600ISelLowering.cpp
This was added to inhibit a warning from gcc 7.3 according to
the comment. However, it triggers a warning from PVS. In addition
I cannot reproduce it with gcc 7.4 and I also cannot reproduce
it with gcc 7.3 using compiler explorer.
Differential Revision: https://reviews.llvm.org/D69863
Philip Reames [Tue, 5 Nov 2019 19:15:09 +0000 (11:15 -0800)]
[X86/Atomics] (Semantically) revert G246098, switch back to the old atomic example
When writing an email for a follow up proposal, I realized one of the diffs in the committed change was incorrect. Digging into it revealed that the fix is complicated enough to require some thought, so reverting in the meantime.
The problem is visible in this diff (from the revert):
; X64-SSE-LABEL: store_fp128:
; X64-SSE: # %bb.0:
-; X64-SSE-NEXT: movaps %xmm0, (%rdi)
+; X64-SSE-NEXT: subq $24, %rsp
+; X64-SSE-NEXT: .cfi_def_cfa_offset 32
+; X64-SSE-NEXT: movaps %xmm0, (%rsp)
+; X64-SSE-NEXT: movq (%rsp), %rsi
+; X64-SSE-NEXT: movq {{[0-9]+}}(%rsp), %rdx
+; X64-SSE-NEXT: callq __sync_lock_test_and_set_16
+; X64-SSE-NEXT: addq $24, %rsp
+; X64-SSE-NEXT: .cfi_def_cfa_offset 8
; X64-SSE-NEXT: retq
store atomic fp128 %v, fp128* %fptr unordered, align 16
ret void
The problem here is three fold:
1) x86-64 doesn't guarantee atomicity of anything larger than 8 bytes. Some platforms observably break this guarantee, others don't, but the codegen isn't considering this, so it's wrong on at least some platforms.
2) When I started to track down the problem, I discovered that DAGCombiner had stripped the atomicity off the store entirely. This comes down to idiomatic usage of DAG.getStore passing all MMO components separately as opposed to just passing the MMO.
3) On x86 (not -64), there are cases where 8 byte atomicity is supported, but only for floating point operations. This would seem to imply that operation typing matters for correctness, and DAGCombine happily folds away bitcasts. I'm not 100% sure there's a problem here, but I'm not entirely sure there isn't either.
I plan on returning to each issue in turn; sorry for the churn here.
Michael Liao [Fri, 7 Jun 2019 19:08:29 +0000 (15:08 -0400)]
[HIP] Fix visibility for 'extern' device variables.
Summary:
- Fix a bug which misses the change for a variable to be set with
target-specific attributes.
Reviewers: yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63020
Sid Manning [Tue, 5 Nov 2019 19:13:18 +0000 (11:13 -0800)]
[llvm-objdump] Fix spurious "The end of the file was unexpectedly encountered" if a SHT_NOBITS sh_offset is larger than the file size
llvm-objdump -D this file:
int a[100000];
int main() { return 0; }
Will produce an error: "The end of the file was unexpectedly encountered".
This happens because of a check in Binary.h checkOffset. (Addr + Size > M.getBufferEnd()).
The sh_offset and sh_size fields can be ignored for SHT_NOBITS sections.
Fix the error by changing ELFObjectFile<ELFT>::getSectionContents to use
the file base for SHT_NOBITS sections.
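The reasoning can be sketched as a bounds check that skips sections with no file contents. This is a simplified model rather than the `ELFObjectFile` change itself: `SectionSketch`, `inFile`, and `contentsAreValid` are hypothetical names, and the only value taken from the ELF specification is `SHT_NOBITS == 8`.

```cpp
#include <cstdint>

constexpr std::uint32_t ShtNobits = 8; // SHT_NOBITS in the ELF specification

struct SectionSketch {
  std::uint32_t Type;   // sh_type
  std::uint64_t Offset; // sh_offset
  std::uint64_t Size;   // sh_size
};

// Overflow-safe form of "Offset + Size <= FileSize".
bool inFile(std::uint64_t Offset, std::uint64_t Size, std::uint64_t FileSize) {
  return Offset <= FileSize && Size <= FileSize - Offset;
}

bool contentsAreValid(const SectionSketch &S, std::uint64_t FileSize) {
  // A SHT_NOBITS section (such as .bss) occupies no bytes in the file, so
  // its offset and size must not be checked against the file size.
  if (S.Type == ShtNobits)
    return true;
  return inFile(S.Offset, S.Size, FileSize);
}
```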
Reviewed By: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D69192
Joel E. Denny [Tue, 5 Nov 2019 15:05:10 +0000 (10:05 -0500)]
[lit] Fix `not` calling internal commands
Without this patch, when using lit's internal shell, if `not` on a lit
RUN line calls `env`, `diff`, or any of the other in-process shell
builtins that lit implements, lit accidentally searches for the latter
as an external executable. What's worse is that this works fine when a
developer is testing on a platform where those executables are
available and behave as expected, but it then breaks on other
platforms.
`not` seems useful for some builtins, such as `diff`, so this patch
supports such uses. `not --crash` does not seem useful for builtins,
so this patch diagnoses such uses. In all cases, this patch ensures
shell builtins are found behind any sequence of `env` and `not`
commands.
`not` calling `env` calling an external command appears useful when
the `env` and external command are part of a lit substitution, as in
D65156. This patch supports that by looking through any sequence of
`env` and `not` commands, building the environment from the `env`s,
and storing the `not`s. The `not`s are then added back to the command
line without the `env`s to execute externally. This avoids the need
to replicate the `not` implementation, in particular the `--crash`
option, in lit.
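lit itself is written in Python; purely to illustrate the "look through any sequence of `env` and `not`" idea, here is a self-contained sketch of the peeling step. `Peeled` and `peelEnvAndNot` are invented names, and the sketch only understands `NAME=VALUE` arguments to `env` and the `--crash` option of `not`.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Peeled {
  std::map<std::string, std::string> Env; // variables collected from `env`
  std::vector<std::string> Nots;          // `not` / `--crash` words, in order
  std::vector<std::string> Command;       // what remains to be executed
};

Peeled peelEnvAndNot(const std::vector<std::string> &Args) {
  Peeled Out;
  std::size_t I = 0;
  while (I < Args.size()) {
    if (Args[I] == "env") {
      ++I;
      // Consume the NAME=VALUE assignments that follow `env`.
      while (I < Args.size()) {
        std::size_t Eq = Args[I].find('=');
        if (Eq == std::string::npos || Eq == 0)
          break;
        Out.Env[Args[I].substr(0, Eq)] = Args[I].substr(Eq + 1);
        ++I;
      }
    } else if (Args[I] == "not") {
      Out.Nots.push_back("not");
      ++I;
      if (I < Args.size() && Args[I] == "--crash") {
        Out.Nots.push_back("--crash");
        ++I;
      }
    } else {
      break; // first word that is neither `env` nor `not`
    }
  }
  Out.Command.assign(Args.begin() + static_cast<std::ptrdiff_t>(I),
                     Args.end());
  return Out;
}
```

A caller could then decide whether `Command[0]` is a builtin, and otherwise prepend `Nots` to `Command` and run the result externally with `Env` applied.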
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D66531
Stanislav Mekhanoshin [Tue, 5 Nov 2019 18:54:03 +0000 (10:54 -0800)]
[AMDGPU] Removed dead code handling M0CopyReg
Static analyzer complains about always false condition.
See https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69860
Adrian Prantl [Tue, 5 Nov 2019 18:53:01 +0000 (10:53 -0800)]
ValueObject: Upstream early-exit from swift-lldb. (NFC)
Reid Kleckner [Fri, 1 Nov 2019 18:47:53 +0000 (11:47 -0700)]
[dexter] Fix feature tests on Windows
First, add LLD as a dependency on Windows. The windows batch scripts
pass -fuse-ld=lld, so they need it.
Second, decode builder stdout/stderr even if the command fails.
Otherwise it gets printed as b'line 1\n\rline 2\n\r'.
Last, make the batch script one line less noisy. We might want to try to
do more here, though. It would be nice if we could get as close as
possible to lit, where you can literally copy & paste the failing
command to re-run it.
With the two changes above, now the feature tests that use clang++.bat
pass for me. The clang-cl_vs2015 ones still fail, and I'll fix them
separately.
Reviewers: jmorse
Differential Revision: https://reviews.llvm.org/D69725
Reid Kleckner [Fri, 1 Nov 2019 18:34:02 +0000 (11:34 -0700)]
[dexter] Remove lit check for python 3
This is checking the version of Python used to run lit, which is not
necessarily the same as the version used to run the dexter tests. If
the tests are run via the build/bin/llvm-lit[.py] helper script, then
that is likely to pick up whatever version of Python is on PATH.
Conventionally, this will find Python 2. CMake already checks that
Python 3 is in use and puts the path to it in the lit site config, so
this check is redundant, and Python 3 will ultimately be used to run
dexter.
Reviewers: jmorse
Differential Revision: https://reviews.llvm.org/D69724
Benjamin Kramer [Tue, 5 Nov 2019 18:12:44 +0000 (19:12 +0100)]
[X86] Specifically limit fmin/fmax commutativity to NoNaNs + NoSignedZeros
The backend UnsafeFPMath flag is not a superset of all the others, so
limit it to the exact bits needed.
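Why exactly those two flags matter can be seen from a select-style minimum written in plain C++. This sketch models the `a < b ? a : b` pattern rather than the backend code; `selectMin` and `commutes` are names chosen here.

```cpp
#include <cmath>

// Select-based minimum: yields B whenever the comparison is false, which
// includes the NaN cases and the case where both operands are zeros.
double selectMin(double A, double B) { return A < B ? A : B; }

// True when swapping the operands produces the same value, treating NaN as
// equal to NaN and distinguishing +0.0 from -0.0.
bool commutes(double A, double B) {
  double X = selectMin(A, B);
  double Y = selectMin(B, A);
  if (std::isnan(X) || std::isnan(Y))
    return std::isnan(X) && std::isnan(Y);
  return X == Y && std::signbit(X) == std::signbit(Y);
}
```

For ordinary values the operation commutes; it stops commuting only when a NaN or a pair of differently signed zeros is involved, which matches the NoNaNs + NoSignedZeros condition. Note that building this sketch with fast-math options would defeat the demonstration.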
Daniel Sanders [Fri, 1 Nov 2019 20:18:00 +0000 (13:18 -0700)]
[globalisel] Rename G_GEP to G_PTR_ADD
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
Stanislav Mekhanoshin [Mon, 4 Nov 2019 20:41:31 +0000 (12:41 -0800)]
[AMDGPU] return Fail instead of SoftFail from addOperand()
addOperand() method of AMDGPU disassembler returns SoftFail
on error. All instances which may lead to that place are
impossible encodings, not something which is possible to
encode but semantically incorrect, as described for SoftFail.
Then tablegen generates a check of the following form:
if (Decode...(..) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
Since we can only return Success and SoftFail that is dead
code as detected by the static code analyzer.
Solution: return Fail as it should be.
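The dead check can be reproduced in miniature. The enum below is a local stand-in modelled on the three outcomes named above, and `addOperandOld`, `addOperandNew`, and `decode` are hypothetical functions, not the AMDGPU disassembler or the tablegen output.

```cpp
enum class DecodeStatus { Fail, SoftFail, Success };

// Old behaviour: an impossible encoding is reported as SoftFail.
DecodeStatus addOperandOld(bool Valid) {
  return Valid ? DecodeStatus::Success : DecodeStatus::SoftFail;
}

// New behaviour: an impossible encoding is reported as Fail.
DecodeStatus addOperandNew(bool Valid) {
  return Valid ? DecodeStatus::Success : DecodeStatus::Fail;
}

// Same shape as the generated check quoted above: only Fail is propagated.
template <typename Fn> DecodeStatus decode(Fn AddOperand, bool Valid) {
  if (AddOperand(Valid) == DecodeStatus::Fail)
    return DecodeStatus::Fail;
  return DecodeStatus::Success;
}
```

With `addOperandOld` the `== Fail` branch can never be taken, so an invalid operand slips through as a success; with `addOperandNew` the failure is propagated.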
See https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69819
Ilya Biryukov [Tue, 5 Nov 2019 18:06:12 +0000 (19:06 +0100)]
[clangd] Implement semantic highlightings via findExplicitReferences
Summary:
To keep the logic of finding locations of interesting AST nodes in one
place.
The advantage is better coverage of various AST nodes, both now and in
the future: as new nodes get added to `findExplicitReferences`, semantic
highlighting will automatically pick them up.
The drawback of this change is that we have to traverse declarations
inside our file twice in order to highlight dependent names, 'auto'
and 'decltype'. Hopefully, this should not affect the actual latency
too much, most time should be spent in building the AST and not
traversing it.
Reviewers: hokein
Reviewed By: hokein
Subscribers: nridge, merge_guards_bot, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69673
Jonas Devlieghere [Tue, 5 Nov 2019 18:12:05 +0000 (10:12 -0800)]
[lldb] Fix Python 3 incompatibility in API/lit.cfg.py
This code path is only taken on the sanitized bot, where it caused a
TypeError: "Can't mix strings and bytes in path components".
Michael Liao [Mon, 4 Nov 2019 16:41:07 +0000 (11:41 -0500)]
[hip] Enable pointer argument lowering through coercing type.
Reviewers: tra, rjmccall, yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69826
Sergey Dmitriev [Tue, 5 Nov 2019 16:58:18 +0000 (08:58 -0800)]
[SLP] - Add couple safety checks to TreeEntry::dump(). NFC
Summary: Check MainOp and AltOp for NULL before dereferencing them; print NULL when they are null.
Reviewers: Vasilis, dtemirbulatov, RKSimon, ABataev
Reviewed By: ABataev
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69812
Daniel Sanders [Wed, 30 Oct 2019 21:25:56 +0000 (14:25 -0700)]
[globalisel][docs] Add KnownBits Analysis documentation
Summary:
This is largely based off of the slides from the keynote
Depends on D69545
Reviewers: volkan, rovka, arsenm
Subscribers: wdng, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69644
Kazu Hirata [Tue, 5 Nov 2019 17:46:57 +0000 (09:46 -0800)]
[JumpThreading] Factor out code to merge basic blocks (NFC)
Summary:
This patch factors out code to merge a basic block with its sole
successor -- partly for readability and partly to facilitate an
upcoming patch of my own.
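A toy model of merging a block with its sole successor, written as a sketch rather than as the JumpThreading code. The dictionary-based CFG, the function name `merge_with_sole_successor`, and the condition that the successor has no other predecessor are assumptions made for illustration; terminators are modelled only through the `succs` list.

```python
def merge_with_sole_successor(cfg, name):
    """Merge block `name` with its only successor; return True on success.

    cfg maps a block name to {"insts": [...], "succs": [...]}.
    """
    block = cfg[name]
    if len(block["succs"]) != 1:
        return False
    succ_name = block["succs"][0]
    if succ_name == name:
        return False  # a self-loop cannot be merged into itself
    # Assumption: the successor must not be reachable from any other block.
    preds = [n for n, b in cfg.items() if succ_name in b["succs"]]
    if preds != [name]:
        return False
    succ = cfg.pop(succ_name)
    block["insts"] = block["insts"] + succ["insts"]
    block["succs"] = list(succ["succs"])
    return True


cfg = {
    "entry": {"insts": ["a"], "succs": ["mid"]},
    "mid": {"insts": ["b"], "succs": ["exit"]},
    "exit": {"insts": ["c"], "succs": []},
}
merged = merge_with_sole_successor(cfg, "entry")
```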
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69852
Steven Wu [Tue, 5 Nov 2019 17:34:26 +0000 (09:34 -0800)]
Revert "[Object][MachO] Rewrite macho-invalid-fat-arch-size into YAML"
The invalid binary that the test tries to construct triggers an assertion.
Simon Pilgrim [Tue, 5 Nov 2019 16:50:46 +0000 (16:50 +0000)]
Remove redundant assignment. NFCI.
Fixes cppcheck warning.
Simon Pilgrim [Tue, 5 Nov 2019 16:46:10 +0000 (16:46 +0000)]
Use iterator prefix increment. NFCI.
Simon Pilgrim [Tue, 5 Nov 2019 15:58:04 +0000 (15:58 +0000)]
[MachineOutliner] Reduce scope of variable and stop duplicate getMF() calls. NFCI.
Steven Wu [Tue, 5 Nov 2019 16:57:34 +0000 (08:57 -0800)]
[Object][MachO] Rewrite macho-invalid-fat-arch-size into YAML
Rewrite one of the invalid macho test input files as a YAML file. The
original invalid macho is breaking our internal test infrastructure
because it is too broken to be copied around.
rdar://problem/56879982
Fangrui Song [Thu, 24 Oct 2019 22:48:32 +0000 (15:48 -0700)]
[llvm-objcopy][ELF] Implement --only-keep-debug
--only-keep-debug produces a debug file as the output that only
preserves contents of sections useful for debugging purposes (the
binutils implementation preserves SHT_NOTE and non-SHF_ALLOC sections),
by changing their section types to SHT_NOBITS and rewriting file
offsets.
See https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
The intended use case is:
```
llvm-objcopy --only-keep-debug a a.dbg
llvm-objcopy --strip-debug a b
llvm-objcopy --add-gnu-debuglink=a.dbg b
```
The current layout algorithm is incapable of deleting contents and
shrinking segments, so it is not suitable for implementing the
functionality.
This patch adds a new algorithm which assigns sh_offset to sections
first, then modifies p_offset/p_filesz of program headers. It bears a
resemblance to lld/ELF/Writer.cpp.
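The two-step order described above can be sketched as follows. This is a heavily simplified model, not the llvm-objcopy implementation: the `Section` and `Segment` classes, the `layout` function, and the rule that a segment's file extent is derived from the sections it contains are all assumptions for illustration. Only the SHT_PROGBITS and SHT_NOBITS values come from the ELF specification.

```python
from dataclasses import dataclass
from typing import List

SHT_PROGBITS = 1
SHT_NOBITS = 8


@dataclass
class Section:
    name: str
    sh_type: int
    size: int
    align: int = 1
    sh_offset: int = 0


@dataclass
class Segment:
    sections: List[Section]
    p_offset: int = 0
    p_filesz: int = 0


def align_to(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def layout(sections, segments, start=0):
    # Step 1: assign sh_offset. SHT_NOBITS sections occupy no file space.
    offset = start
    for sec in sections:
        offset = align_to(offset, sec.align)
        sec.sh_offset = offset
        if sec.sh_type != SHT_NOBITS:
            offset += sec.size
    # Step 2: derive p_offset/p_filesz from the sections in each segment.
    for seg in segments:
        if not seg.sections:
            continue
        seg.p_offset = min(s.sh_offset for s in seg.sections)
        end = max(
            s.sh_offset + (0 if s.sh_type == SHT_NOBITS else s.size)
            for s in seg.sections
        )
        seg.p_filesz = end - seg.p_offset
    return offset


# .text was turned into SHT_NOBITS, so its segment shrinks to zero file size.
text = Section(".text", SHT_NOBITS, 0x100, align=16)
debug = Section(".debug_info", SHT_PROGBITS, 0x40)
load = Segment([text])
file_end = layout([text, debug], [load], start=0x40)
```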
Reviewed By: jhenderson, jakehehrlich
Differential Revision: https://reviews.llvm.org/D67137
Fangrui Song [Fri, 1 Nov 2019 20:49:42 +0000 (13:49 -0700)]
[llvm-objcopy][ELF] Add OriginalType & OriginalFlags
`llvm::objcopy::elf::*Section::classof` matches Type and Flags, yet Type
and Flags are mutable (by setSectionFlagsAndTypes and upcoming
--only-keep-debug feature). Add OriginalType & OriginalFlags to be used
in classof, to prevent classof results from changing.
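A small sketch of why classof should look at an immutable copy of the type. The Python classes below are stand-ins, not the llvm-objcopy classes, and `classof_mutable` is a hypothetical helper that shows the fragile variant.

```python
SHT_SYMTAB = 2
SHT_NOBITS = 8


class SectionBase:
    def __init__(self, sec_type, flags):
        # The original values are recorded once and never modified.
        self.type = self.original_type = sec_type
        self.flags = self.original_flags = flags


class SymbolTableSection(SectionBase):
    @staticmethod
    def classof(sec):
        # Stable: depends only on the type the section was created with.
        return sec.original_type == SHT_SYMTAB


def classof_mutable(sec):
    # Fragile variant: the answer changes when the type is rewritten.
    return sec.type == SHT_SYMTAB


sec = SectionBase(SHT_SYMTAB, 0)
sec.type = SHT_NOBITS  # e.g. what a type-rewriting option might do
stable = SymbolTableSection.classof(sec)
fragile = classof_mutable(sec)
```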
Reviewed By: jakehehrlich, jhenderson, alexshap
Differential Revision: https://reviews.llvm.org/D69739
David Green [Tue, 5 Nov 2019 15:59:31 +0000 (15:59 +0000)]
[ARM] Multi-vector MVE spill test
This is a test from D67169, that can now be added after the vld2
intrinsics were committed upstream.
Michał Górny [Tue, 5 Nov 2019 15:29:46 +0000 (16:29 +0100)]
[lldb] [Python] Build readline override module only on Linux
Restrict building the readline override to Linux only. It both does not
build on *BSD systems, and is largely irrelevant since they default to
using libedit over readline anyway. This restores the behavior
of the old readline override that also was built only on Linux.
Differential Revision: https://reviews.llvm.org/D69846
jmolloy [Mon, 4 Nov 2019 19:25:13 +0000 (19:25 +0000)]
[DFAPacketizer] Allow up to 64 functional units
Summary:
To drive the automaton we used a uint64_t as an action type. This
contained the transition's resource requirements as a conjunction:
(a OR b) AND (b OR c)
We encoded this conjunction as a sequence of four 16-bit bitmasks.
This limited the number of addressable functional units to 16, which
is quite low and has bitten many people in the past.
Instead, the DFAEmitter now generates a lookup table from InstrItinerary
class (index of the ItinData inside the ProcItineraries) to an internal
action index which is essentially a dense embedding of the conjunctive
form. Because we never materialize the conjunctive form, we no longer
have the 16 FU restriction.
In this patch we limit to 64 functional units due to using a uint64_t
bitmask in the DFAEmitter. Now that we've decoupled these representations
we can increase this in future.
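To illustrate the two representations, here is a sketch under stated assumptions. `pack_action` models the old scheme (four 16-bit masks in one 64-bit value; which term lands in the low bits is an assumption), and `build_action_table` models a lookup from itinerary class index to a dense action index. Both function names are hypothetical and neither is the DFAEmitter code.

```python
def pack_action(terms):
    """Pack up to four OR-terms, each a 16-bit unit mask, into 64 bits."""
    if len(terms) > 4:
        raise ValueError("at most four terms fit in a 64-bit action")
    action = 0
    for i, mask in enumerate(terms):
        if mask >= 1 << 16:
            raise ValueError("functional unit index above 15 does not fit")
        action |= mask << (16 * i)
    return action


def build_action_table(itineraries):
    """Map each itinerary class index to a dense action index."""
    index_of = {}
    table = []
    for terms in itineraries:
        key = tuple(terms)
        # Identical conjunctions share one action index.
        table.append(index_of.setdefault(key, len(index_of)))
    return table, list(index_of)


A, B, C = 1 << 0, 1 << 1, 1 << 2
packed = pack_action([A | B, B | C])  # (a OR b) AND (b OR c)

# A unit with index 40 cannot be packed, but the table handles it.
try:
    pack_action([1 << 40])
    fits = True
except ValueError:
    fits = False

table, actions = build_action_table([[A | B, B | C], [1 << 40], [A | B, B | C]])
```

Because the table stores only an index, the width of the per-unit mask is no longer tied to the width of the action type.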
Reviewers: ThomasRaoux, kparzysz, majnemer
Reviewed By: ThomasRaoux
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69110
Alexey Bataev [Tue, 5 Nov 2019 15:10:50 +0000 (10:10 -0500)]
[OPENMP]Improve diagnostics for unsupported unified addressing.
Improved diagnostics for better user experience.
Gil Rapaport [Mon, 7 Oct 2019 14:24:33 +0000 (17:24 +0300)]
[LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)
This recommits
2be17087f8c38934b7fc9208ae6cf4e9b4d44f4b (reverted in
d3ec06d219788801380af1948c7f7ef9d3c6100b for heap-use-after-free) with a fix
in IAI's reset() which was not clearing the set of interleave groups after
deleting them.
Simon Pilgrim [Tue, 5 Nov 2019 15:14:22 +0000 (15:14 +0000)]
Fix uninitialized variable warning. NFCI.
Simon Pilgrim [Tue, 5 Nov 2019 15:13:28 +0000 (15:13 +0000)]
[MCObjectFileInfo] Fix uninitialized variable warnings. NFCI.
Simon Pilgrim [Tue, 5 Nov 2019 15:08:21 +0000 (15:08 +0000)]
[MachineOutliner] Fix uninitialized variable warnings. NFCI.
Alexey Bataev [Tue, 5 Nov 2019 15:13:16 +0000 (10:13 -0500)]
[OPENMP][DOCS]Fix coloring of the implemented features status, NFC.
Francis Visoiu Mistrih [Tue, 5 Nov 2019 00:28:23 +0000 (16:28 -0800)]
[ObjC][ARC] Ignore lifetime markers between *ReturnValue calls
When eliminating a pair of
`llvm.objc.autoreleaseReturnValue`
followed by
`llvm.objc.retainAutoreleasedReturnValue`
we need to make sure that the instructions in between are safe to
ignore.
Other than bitcasts and useless GEPs, it's also safe to ignore lifetime
markers for both static allocas (lifetime.start/lifetime.end) and dynamic
allocas (stacksave/stackrestore).
These get added by the inliner as part of the return sequence and can
prevent the transformation from happening in practice.
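The check on the instructions in between can be sketched as a simple scan. This is a toy model with instructions reduced to opcode strings; the opcode names and the function `safe_to_pair` are hypothetical, not the ObjC ARC optimizer's API.

```python
# Opcodes assumed safe to skip between the two *ReturnValue calls.
IGNORABLE = {
    "bitcast",
    "gep_all_zero",      # stands in for a "useless" GEP
    "lifetime.start",
    "lifetime.end",
    "stacksave",
    "stackrestore",
}


def safe_to_pair(between):
    """Return True if every instruction between the calls can be ignored."""
    return all(op in IGNORABLE for op in between)


with_markers = safe_to_pair(["bitcast", "lifetime.end", "stackrestore"])
with_store = safe_to_pair(["bitcast", "store"])
```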
Differential Revision: https://reviews.llvm.org/D69833
Francis Visoiu Mistrih [Tue, 5 Nov 2019 00:45:21 +0000 (16:45 -0800)]
[NFC][ObjC][ARC] Add tests for OptimizeRetainRVCall
Add tests for bitcasts + zero GEPs, and pre-commit tests for lifetime
markers.
Kazu Hirata [Mon, 4 Nov 2019 18:10:34 +0000 (10:10 -0800)]
[JumpThreading] Factor out common code to update the SSA form (NFC)
Summary:
This patch factors out common code to update the SSA form in
JumpThreading.cpp -- partly for readability and partly to facilitate
an upcoming patch of my own.
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69811
Simon Pilgrim [Tue, 5 Nov 2019 14:10:32 +0000 (14:10 +0000)]
[GVN] Fix uninitialized variable warnings. NFCI.
Simon Pilgrim [Tue, 5 Nov 2019 13:41:31 +0000 (13:41 +0000)]
Add missing GVN =operator. NFCI.
Fixes PVS Studio warning that the 'ValueTable' class implements a copy constructor, but lacks the '=' operator.
Sanjay Patel [Tue, 5 Nov 2019 13:16:48 +0000 (08:16 -0500)]
[InstCombine] add tests for shift-logic-shift; NFC
This is based on existing CodeGen test files for x86 and AArch64.
The corresponding potential transform is shown in:
rL370617
serge-sans-paille [Tue, 5 Nov 2019 13:15:09 +0000 (14:15 +0100)]
[lldb] Fix readline/libedit compat patch for py2
This is a follow-up to https://reviews.llvm.org/D69793
Dávid Bolvanský [Tue, 5 Nov 2019 12:55:46 +0000 (13:55 +0100)]
[AtomicExpandPass] Silence static analyzer warnings about operator priority. NFCI.
David Green [Tue, 5 Nov 2019 11:54:22 +0000 (11:54 +0000)]
[MachineScheduler] Enable AA in PostRA Machine scheduler
This adds AA to Post-RA Machine Scheduling, allowing the pass more
freedom when handling memory operations.
My understanding is that this was just never done, not that it is
inherently incorrect to do so. The older PostRA List scheduler already
makes use of AA, it's just that the MI PostRA Scheduler was never taught
to use it.
Differential Revision: https://reviews.llvm.org/D69814
Nuno Lopes [Tue, 5 Nov 2019 11:32:56 +0000 (11:32 +0000)]
[Docs] Add LangRef documentation for freeze instruction
Summary:
- Describe the new freeze instruction
- Make it explicit that branch on undef/poison is UB
Reviewers: chandlerc, majnemer, efriedma, nikic, reames, jdoerfert, lebedev.ri, regehr
Subscribers: fhahn, bollu, lebedev.ri, delcypher, spatel, filcab, llvm-commits, aqjune
Differential Revision: https://reviews.llvm.org/D29121
Jonas Paulsson [Tue, 5 Nov 2019 10:44:04 +0000 (11:44 +0100)]
[Clang FE] Recognize -mnop-mcount CL option (SystemZ only).
Recognize -mnop-mcount from the command line and add a function attribute
"mnop-mcount"="true" when passed.
When this option is used, a nop is added instead of a call to fentry. This
is used when building the Linux Kernel.
If this option is passed for any other target than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D67763
Thomas Preud'homme [Thu, 3 Oct 2019 16:00:37 +0000 (17:00 +0100)]
Fix PR40644: miscompile indexed FP constant store
Summary:
Functions replaceStoreOfFPConstant() and OptimizeFloatStore() both
replace store of float by a store of an integer unconditionally. However
this generates wrong code when the store that is replaced is an indexed
or truncating store. This commit solves this issue by adding an early
return in these functions when the store being considered is not a
normal store.
Bug was only observed on out of tree targets, hence the lack of testcase
in this commit.
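The early return can be sketched as below. This is a toy model rather than the LLVM functions: the `Store` class and `fp_store_as_int` are hypothetical, and the float is reinterpreted as a 32-bit integer with the standard `struct` module.

```python
import struct
from dataclasses import dataclass


@dataclass
class Store:
    value: float
    indexed: bool = False
    truncating: bool = False


def is_normal_store(st):
    """A normal store is neither indexed nor truncating."""
    return not st.indexed and not st.truncating


def fp_store_as_int(st):
    """Return the integer bit pattern to store, or None to leave it alone."""
    if not is_normal_store(st):
        return None  # early return: do not rewrite indexed/truncating stores
    # Reinterpret the 32-bit float as an unsigned 32-bit integer.
    return struct.unpack("<I", struct.pack("<f", st.value))[0]


plain = fp_store_as_int(Store(1.0))
indexed = fp_store_as_int(Store(1.0, indexed=True))
```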
Reviewers: efriedma
Subscribers: hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68420
David Green [Tue, 5 Nov 2019 10:46:56 +0000 (10:46 +0000)]
[ARM] Always enable UseAA in the arm backend
This feature controls whether AA is used in the backend, and was
previously turned on for certain subtargets to help create less
constrained scheduling graphs. This patch turns it on for all
subtargets, so that they can all make use of the extra information to
produce better code.
Differential Revision: https://reviews.llvm.org/D69796
David Green [Tue, 5 Nov 2019 09:10:58 +0000 (09:10 +0000)]
[Scheduling][ARM] Consistently enable PostRA Machine scheduling
In the ARM backend, for historical reasons we have only some targets
using Machine Scheduling. The rest use the old list scheduler as they
are using itineraries and the list scheduler seems to produce better code
(and not crash running out of registers on v6m codes). So whether to use
the MIScheduler or not is checked at runtime from the subtarget
features.
This is fine, except for post-ra scheduling. Whether to use the old
post-ra list scheduler or the post-ra machine schedule is decided as the
pass manager is set up, in ARM's case from a newly constructed subtarget.
Under some situations, like LTO, this won't include the correct cpu so
can pick the wrong option. This can have a surprising effect on
performance.
To fix that, this patch overrides targetSchedulesPostRAScheduling and
addPreSched2 in the ARM backend, adding _both_ post-ra schedulers and
picking at runtime which to execute. To pick between the two I've had to
add an enablePostRAMachineScheduler() method that normally returns
enableMachineScheduler() && enablePostRAScheduler(), which can be
overridden to enable just one of PostRAMachineScheduler vs
PostRAScheduler.
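A sketch of the run-time selection, assuming both passes are always added to the pipeline and each one checks the subtarget when it runs. The Python class, the pass names, and the exact gating of the list scheduler are assumptions; only the default `enableMachineScheduler() && enablePostRAScheduler()` comes from the description above.

```python
class Subtarget:
    def __init__(self, machine_sched, post_ra_sched):
        self._machine_sched = machine_sched
        self._post_ra_sched = post_ra_sched

    def enable_machine_scheduler(self):
        return self._machine_sched

    def enable_post_ra_scheduler(self):
        return self._post_ra_sched

    def enable_post_ra_machine_scheduler(self):
        # Default described in the commit; a target may override it.
        return self.enable_machine_scheduler() and self.enable_post_ra_scheduler()


def run_post_ra_passes(subtarget):
    """Both passes are in the pipeline; each decides at run time."""
    ran = []
    if subtarget.enable_post_ra_machine_scheduler():
        ran.append("post-ra-machine-scheduler")
    # Assumed gating: the list scheduler runs only when the other does not.
    if (subtarget.enable_post_ra_scheduler()
            and not subtarget.enable_post_ra_machine_scheduler()):
        ran.append("post-ra-list-scheduler")
    return ran


new_style = run_post_ra_passes(Subtarget(True, True))
old_style = run_post_ra_passes(Subtarget(False, True))
```

Deciding per function at run time means the choice follows the subtarget actually in use, rather than one constructed when the pass manager was set up.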
Thanks to David Penry for identifying this problem.
Differential Revision: https://reviews.llvm.org/D69775
Pavel Labath [Tue, 5 Nov 2019 10:37:59 +0000 (11:37 +0100)]
lldb/breakpad: add support for the "x86_64h" architecture
serge-sans-paille [Tue, 5 Nov 2019 10:38:39 +0000 (11:38 +0100)]
Revert and patch "[Python] Remove readline module"
Fix https://bugs.llvm.org/show_bug.cgi?id=43830 while avoiding polluting the
global Python namespace.
This reverts r357277 to rebundle a version of Python's readline module
based on libedit.
However, this patch also provides three improvements over the previous
implementation:
1. use PyMem_RawMalloc instead of PyMem_Malloc, as expected by PyOS_Readline
(prevents a segfault upon exit of an interactive session)
2. patch the readline module upon embedded interpreter loading, instead of
patching it globally, which should prevent any side effect on other
modules/packages
3. only activate the patched module if libedit is actually linked in lldb
Differential Revision: https://reviews.llvm.org/D69793
Sven van Haastregt [Tue, 5 Nov 2019 10:16:45 +0000 (10:16 +0000)]
[OpenCL] Group builtin functions by prototype
The TableGen-generated file containing the function definitions can be
reorganized to save some memory in the Clang binary. Functions having
the same prototype(s) will point to a shared list of prototype(s).
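The sharing idea can be sketched as a deduplicating table. This is an illustration only: the function `group_by_prototype`, the prototype strings, and the builtin names used as input are hypothetical and do not reflect the layout of the TableGen-generated file.

```python
def group_by_prototype(builtins):
    """Return (shared prototype lists, name -> index into that list)."""
    shared = []     # each distinct prototype list is stored once
    index_of = {}
    table = {}
    for name, protos in builtins.items():
        key = tuple(protos)
        if key not in index_of:
            index_of[key] = len(shared)
            shared.append(key)
        table[name] = index_of[key]
    return shared, table


shared, table = group_by_prototype({
    "sin": ["float(float)", "double(double)"],
    "cos": ["float(float)", "double(double)"],
    "abs": ["uint(int)"],
})
```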
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D63557
Sven van Haastregt [Tue, 5 Nov 2019 10:07:43 +0000 (10:07 +0000)]
[OpenCL] Add builtin function attribute handling
Add handling for the "pure", "const" and "convergent" function
attributes for OpenCL builtin functions.
Patch by Pierre Gondois and Sven van Haastregt.
Differential Revision: https://reviews.llvm.org/D64319