Serge Guelton [Thu, 21 Feb 2019 04:55:50 +0000 (04:55 +0000)]
[NFC] Always initialize all members in ABIArgInfo
Differential Revision: https://reviews.llvm.org/D57523
llvm-svn: 354546
Douglas Yung [Thu, 21 Feb 2019 04:55:31 +0000 (04:55 +0000)]
Attempt to fix VS2015 build breakage from r354517. NFCI.
llvm-svn: 354545
Sam Clegg [Thu, 21 Feb 2019 03:27:00 +0000 (03:27 +0000)]
[WebAssembly] Default to something reasonable in WebAssemblyAddMissingPrototypes
Previously if we couldn't derive a prototype for a "no-prototype"
function from C we would leave it as is:
void foo(...)
With this change we instead give it an empty signature and remove
the "no-prototype" attribute.
This fixes the current wasm waterfall test failure.
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58488
llvm-svn: 354544
Stanislav Mekhanoshin [Thu, 21 Feb 2019 02:58:00 +0000 (02:58 +0000)]
[AMDGPU] fix commuted case of sub combine
Differential Revision: https://reviews.llvm.org/D58481
llvm-svn: 354543
Wei Mi [Thu, 21 Feb 2019 02:57:52 +0000 (02:57 +0000)]
[Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled
is false.
Right now for inliner and partial inliner, we always pass the address of a
valid ORE object to getInlineCost even if RemarkEnabled is false because
no -Rpass is specified. Since ComputeFullInlineCost will be set to true if
ORE is non-null in getInlineCost, this introduces the problem that in
getInlineCost we cannot return early even if we already know the cost is
definitely higher than the threshold. It is a general problem for compile
time.
This patch fixes that by passing nullptr as the ORE argument if RemarkEnabled is
false.
Differential Revision: https://reviews.llvm.org/D58399
llvm-svn: 354542
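The early-exit effect described in r354542 can be illustrated with a minimal sketch. This is not LLVM's getInlineCost; `RemarkSink`, `computeCost`, and `analyze` are hypothetical names, and the per-instruction costs are made-up inputs. It only shows why a non-null emitter pointer forces the full walk while a null one allows bailing out once the threshold is exceeded.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an optimization remark emitter (not LLVM's ORE).
struct RemarkSink {
  int Emitted = 0;
};

struct CostResult {
  int Cost;         // cost accumulated before the walk stopped
  int InstsVisited; // number of per-instruction costs examined
};

// Sums per-instruction costs. The full walk is only forced when a sink is
// supplied, on the assumption that a remark wants to report the full cost.
CostResult computeCost(const std::vector<int> &InstCosts, int Threshold,
                       RemarkSink *Sink) {
  assert(Threshold >= 0 && "threshold must be non-negative");
  const bool ComputeFullCost = Sink != nullptr;
  CostResult R{0, 0};
  for (int C : InstCosts) {
    R.Cost += C;
    ++R.InstsVisited;
    if (!ComputeFullCost && R.Cost > Threshold)
      break; // already known to be too expensive, stop early
  }
  if (Sink)
    ++Sink->Emitted;
  return R;
}

// Caller-side pattern: only hand over the sink when remarks are enabled.
CostResult analyze(const std::vector<int> &InstCosts, int Threshold,
                   bool RemarksEnabled, RemarkSink &Sink) {
  return computeCost(InstCosts, Threshold, RemarksEnabled ? &Sink : nullptr);
}
```

With remarks disabled, the walk stops at the first instruction that pushes the cost over the threshold; with remarks enabled, every instruction is visited.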
Xin Tong [Thu, 21 Feb 2019 02:11:06 +0000 (02:11 +0000)]
Add skipFunction to PostRA machine sinking pass.
Summary: Add skipFunction to PostRA machine sinking pass.
Reviewers: junbuml
Subscribers: arsenm, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57847
llvm-svn: 354541
Davide Italiano [Thu, 21 Feb 2019 01:55:31 +0000 (01:55 +0000)]
Revert "[lldb-mi] Move TestMIPrompt away from pexpect()."
I see a test failing on the macOS bots. I can't reproduce
locally, so try to get the bots green before I can investigate.
llvm-svn: 354540
Sam Clegg [Thu, 21 Feb 2019 01:33:26 +0000 (01:33 +0000)]
[WebAssembly] Remove redundant code added in rL354538. NFC.
The code for encoding the symbol's signature into its name
was not actually being used in the final version of this change.
Differential Revision: https://reviews.llvm.org/D58482
llvm-svn: 354539
Ahmed Bougacha [Thu, 21 Feb 2019 01:13:27 +0000 (01:13 +0000)]
[AArch64] Change size suffix for FP16FML intrinsics.
These currently use _u32, but they should instead use _f16, the
types of the multiplication (matching the various integer vmlal
variants).
Differential Revision: https://reviews.llvm.org/D58306
llvm-svn: 354538
Louis Dionne [Thu, 21 Feb 2019 00:53:26 +0000 (00:53 +0000)]
[NFC] Fix incorrect comment in std::function test
llvm-svn: 354537
Kostya Serebryany [Thu, 21 Feb 2019 00:43:46 +0000 (00:43 +0000)]
[libFuzzer] fix the docs
llvm-svn: 354536
Richard Trieu [Thu, 21 Feb 2019 00:36:14 +0000 (00:36 +0000)]
Fix unused variable warning.
llvm-svn: 354535
Stephane Moore [Thu, 21 Feb 2019 00:34:01 +0000 (00:34 +0000)]
[clang-tidy] Make google-objc-function-naming ignore implicit functions 🙈
Summary:
Implicit functions are outside the control of source authors and should
be exempt from style restrictions.
Tested via running clang tools tests.
This is an amended followup to https://reviews.llvm.org/D57207
Reviewers: aaron.ballman
Reviewed By: aaron.ballman
Subscribers: jdoerfert, xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58095
llvm-svn: 354534
Kostya Serebryany [Thu, 21 Feb 2019 00:32:30 +0000 (00:32 +0000)]
[libFuzzer] document -fork=N
llvm-svn: 354533
Amara Emerson [Thu, 21 Feb 2019 00:31:13 +0000 (00:31 +0000)]
Revert "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR"
This reverts r354521 because it broke the bots, but passes on Darwin somehow.
llvm-svn: 354532
Artem Dergachev [Thu, 21 Feb 2019 00:19:24 +0000 (00:19 +0000)]
[attributes] Fix buildbot after r354530.
Update the test after adding more attribute subjects.
Differential Revision: https://reviews.llvm.org/D58365
llvm-svn: 354531
Artem Dergachev [Thu, 21 Feb 2019 00:01:02 +0000 (00:01 +0000)]
[attributes] Add an attribute for server routines in Mach kernel and extensions.
The new __attribute__ ((mig_server_routine)) is going to be used for annotating
Mach Interface Generator (MIG) callback functions as such, so that additional
static analysis could be applied to their implementations. It can also be
applied to regular functions whose behavior is supposed to be identical to
that of a MIG server routine.
Differential Revision: https://reviews.llvm.org/D58365
llvm-svn: 354530
Amara Emerson [Wed, 20 Feb 2019 23:22:15 +0000 (23:22 +0000)]
[GlobalISel] Add -O0 to some tests to see if it fixes them. I can't reproduce the failures locally,
and greendragon also passes, but some other bots fail for reasons I don't understand.
The only difference I can see between these tests is it's missing an -O0
If this doesn't work I'll revert and continue investigating.
llvm-svn: 354529
Sam Clegg [Wed, 20 Feb 2019 23:19:31 +0000 (23:19 +0000)]
[WebAssembly] Don't generate invalid modules when function signatures mismatch
Previously we could emit a warning and generate a potentially invalid
wasm module (due to call sites and functions having conflicting
signatures). Now, rather than create invalid binaries we handle such
cases by creating stub functions containing unreachable, effectively
turning these into runtime errors rather than validation failures.
Differential Revision: https://reviews.llvm.org/D57909
llvm-svn: 354528
Shoaib Meenai [Wed, 20 Feb 2019 23:16:15 +0000 (23:16 +0000)]
[clang] Add CMake target for installing clang's CMake exports
This mirrors LLVM's install-cmake-exports target.
Differential Revision: https://reviews.llvm.org/D58480
llvm-svn: 354527
Alex Langford [Wed, 20 Feb 2019 23:12:56 +0000 (23:12 +0000)]
Merge target triple into module triple when constructing module from memory
Summary:
While debugging an android process remotely from a windows machine, I
noticed that the modules constructed from an object file in memory only had
information about the architecture. Without knowledge of the OS or environment,
expression evaluation sometimes leads to incorrectly generated code or a
debugger crash. While we cannot know for certain what triple a module
constructed from an in-memory object file will have, we can use the
triple from the target to try and fill in the missing details.
Reviewers: clayborg, zturner, JDevlieghere, compnerd, aprantl, labath
Subscribers: jdoerfert, lldb-commits
Differential Revision: https://reviews.llvm.org/D58405
llvm-svn: 354526
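A rough sketch of the merging idea in r354526, under stated assumptions: `SimpleTriple` and `mergeFrom` are hypothetical (LLVM's real type is llvm::Triple, which is not used here), and the rule of borrowing only when the architectures agree is an assumption made for this illustration, not something the commit message states.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified triple; an empty string means "unknown component".
struct SimpleTriple {
  std::string Arch, Vendor, OS, Env;
};

// Fills components the module does not know from the target's triple.
// Components the module already has are never overwritten.
SimpleTriple mergeFrom(SimpleTriple Module, const SimpleTriple &Target) {
  assert(!Module.Arch.empty() && "module is assumed to know its architecture");
  if (Module.Arch != Target.Arch)
    return Module; // assumption: do not borrow from a mismatched target
  if (Module.Vendor.empty())
    Module.Vendor = Target.Vendor;
  if (Module.OS.empty())
    Module.OS = Target.OS;
  if (Module.Env.empty())
    Module.Env = Target.Env;
  return Module;
}
```

A module that only knows `aarch64` would pick up the OS and environment of an `aarch64-unknown-linux-android` target, while a vendor the module already carries is kept.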
Shoaib Meenai [Wed, 20 Feb 2019 23:08:43 +0000 (23:08 +0000)]
[clang] Switch to LLVM_ENABLE_IDE
r344555 switched LLVM to guarding install targets with LLVM_ENABLE_IDE
instead of CMAKE_CONFIGURATION_TYPES, which expresses the intent more
directly and can be overridden by a user. Make the corresponding change
in clang. LLVM_ENABLE_IDE is computed by HandleLLVMOptions, so it should
be available for both standalone and integrated builds.
Differential Revision: https://reviews.llvm.org/D58284
llvm-svn: 354525
Petr Hosek [Wed, 20 Feb 2019 23:06:10 +0000 (23:06 +0000)]
[CMake][runtimes] Set clang-header dependency for builtins
compiler-rt builtins depend on clang headers, but that dependency
wasn't explicitly stated in the build system and we were relying
on the transitive dependency via clang. However, when we're
cross-compiling clang, we'll be using host compiler instead and
that dependency is missing, breaking the build.
Differential Revision: https://reviews.llvm.org/D58471
llvm-svn: 354524
Sam Clegg [Wed, 20 Feb 2019 22:40:57 +0000 (22:40 +0000)]
[WebAssembly] Don't error on conflicting uses of prototype-less functions
When we can't determine with certainty the signature of a function
import we pick the first signature we find rather than erroring out.
The resulting program might not do what is expected since we might pick
the wrong signature. However, since it is undefined behavior in C to use the
same function with different signatures this seems better than refusing
to compile such programs.
Fixes PR40472
Differential Revision: https://reviews.llvm.org/D58304
llvm-svn: 354523
Julian Lettner [Wed, 20 Feb 2019 22:28:11 +0000 (22:28 +0000)]
[LSan] Fix `__sanitizer_print_stack_trace` via fast unwinder
Summary: Quick follow-up to: https://reviews.llvm.org/D58156
Reviewers: vitalybuka
Differential Revision: https://reviews.llvm.org/D58358
llvm-svn: 354522
Amara Emerson [Wed, 20 Feb 2019 22:11:39 +0000 (22:11 +0000)]
[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR
This change makes some basic type combinations for G_SHUFFLE_VECTOR legal, and
implements them with a very pessimistic TBL2 instruction in the selector.
For TBL2, support is also needed to generate constant pool entries and load from
them in order to materialize the mask register.
Currently supports <2 x s64> and <4 x s32> result types.
Differential Revision: https://reviews.llvm.org/D58466
llvm-svn: 354521
Craig Topper [Wed, 20 Feb 2019 21:35:05 +0000 (21:35 +0000)]
[X86] Add test cases to show missed opportunities to remove AND mask from BTC/BTS/BTR instructions when LHS of AND has known zeros.
We can currently remove the mask if the immediate has all ones in the LSBs, but if the LHS of the AND is known zero, then the immediate might have had bits removed.
A similar issue also occurs with shifts and rotates. I'm preparing a common fix for all of them.
llvm-svn: 354520
Sanjay Patel [Wed, 20 Feb 2019 21:23:04 +0000 (21:23 +0000)]
[CGP] match a special-case of unsigned subtract overflow
This is the 'sub0' (negate) pattern from PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
llvm-svn: 354519
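As a small source-level illustration of the negate case mentioned in r354519: an unsigned subtraction `0 - x` wraps exactly when `x != 0`. The names `SubOverflow`, `usubo`, and `negateWithOverflow` are hypothetical helpers for this sketch, not the CodeGenPrepare code.

```cpp
#include <cassert>
#include <cstdint>

struct SubOverflow {
  uint32_t Value; // wrapped difference
  bool Overflow;  // true if the unsigned subtraction borrowed
};

// Generic unsigned subtract-with-overflow: A - B borrows exactly when A < B.
SubOverflow usubo(uint32_t A, uint32_t B) {
  return {A - B, A < B};
}

// Special case with A == 0: the overflow bit reduces to (X != 0).
SubOverflow negateWithOverflow(uint32_t X) {
  SubOverflow R{0u - X, X != 0};
  assert(R.Overflow == usubo(0, X).Overflow && "special case must agree");
  return R;
}
```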
Nirav Dave [Wed, 20 Feb 2019 21:07:50 +0000 (21:07 +0000)]
[DAGCombine] Generalize Dead Store to overlapping stores.
Summary:
Remove stores that are immediately overwritten by larger
stores.
Reviewers: courbet, rnk
Reviewed By: rnk
Subscribers: javed.absar, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58467
llvm-svn: 354518
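The condition behind r354518 can be sketched as a byte-range check. This assumes both stores use the same base address and that nothing reads the memory between them; `Store` and `isCoveredBy` are hypothetical names, not DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a store: byte offset from a shared base, and size.
struct Store {
  uint64_t Offset;
  uint64_t Size;
};

// The earlier store is dead if the later store overwrites every byte it wrote.
bool isCoveredBy(const Store &Earlier, const Store &Later) {
  assert(Earlier.Size > 0 && Later.Size > 0 && "stores must write something");
  return Later.Offset <= Earlier.Offset &&
         Earlier.Offset + Earlier.Size <= Later.Offset + Later.Size;
}
```

For example, a 4-byte store at offset 4 is covered by a following 8-byte store at offset 0, but a store that only partially overlaps is not.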
Jonas Toth [Wed, 20 Feb 2019 21:04:36 +0000 (21:04 +0000)]
[clang-tidy] refactor ExceptionAnalyzer further to give ternary answer
Summary:
The analysis of the throwing behaviour of functions and statements gave only
a binary answer whether an exception could occur and if yes which types are
thrown.
This refactoring allows keeping track if there is an unknown factor, because the
code calls to some functions with unavailable source code with no `noexcept`
information.
This 'potential Unknown' information is propagated properly and can be queried
separately.
Reviewers: lebedev.ri, aaron.ballman, baloghadamsoftware, alexfh
Reviewed By: lebedev.ri, baloghadamsoftware
Subscribers: xazax.hun, rnkovacs, a.sidorin, Szelethus, donat.nagy, dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57883
llvm-svn: 354517
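One way to picture the ternary answer from r354517 is a three-valued state with a merge rule. The enum, the function names, and the precedence chosen here (a definite throw wins, then unknown, then not throwing) are assumptions for illustration and are not taken from the clang-tidy sources.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical three-valued answer for "can this code throw?".
enum class ThrowState { NotThrowing, Throwing, Unknown };

// Combines the answers for two pieces of code that both execute.
ThrowState merge(ThrowState A, ThrowState B) {
  if (A == ThrowState::Throwing || B == ThrowState::Throwing)
    return ThrowState::Throwing; // a definite throw dominates
  if (A == ThrowState::Unknown || B == ThrowState::Unknown)
    return ThrowState::Unknown; // otherwise uncertainty propagates
  return ThrowState::NotThrowing;
}

// Folds merge() over several callees or statements.
ThrowState mergeAll(std::initializer_list<ThrowState> States) {
  assert(States.size() > 0 && "need at least one state to merge");
  ThrowState R = ThrowState::NotThrowing;
  for (ThrowState S : States)
    R = merge(R, S);
  return R;
}
```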
Tom Stellard [Wed, 20 Feb 2019 21:02:37 +0000 (21:02 +0000)]
AMDGPU/GlobalISel: Move SMRD selection logic to TableGen
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52922
llvm-svn: 354516
Dimitry Andric [Wed, 20 Feb 2019 21:01:31 +0000 (21:01 +0000)]
Fix the build with gcc when `-Wredundant-decls` is passed
Summary:
gcc warns that `__throw_runtime_error` is declared both in `<__locale>`
and `<stdexcept>`, if `-Wredundant-decls` is passed on the command
line; this is the case with FreeBSD when ${WARNS} == 6.
Since `<__locale>` gets its first declaration via a transitive include
of `<stdexcept>`, and the second declaration is after the first
invocation of `__throw_runtime_error`, delete that second declaration.
Signed-off-by: Enji Cooper <yaneurabeya@gmail.com>
Reviewers: kristina, MaskRay, EricWF, ldionne, ngie
Reviewed By: EricWF
Subscribers: krytarowski, brooks, emaste, dim, christof, jdoerfert, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D58425
llvm-svn: 354515
Craig Topper [Wed, 20 Feb 2019 20:52:26 +0000 (20:52 +0000)]
[SelectionDAG] Teach GetDemandedBits to look at the known zeros of the LHS when handling ISD::AND
If the LHS has known zeros, then the RHS immediate mask might have been simplified to remove those bits.
This patch adds a call to computeKnownBits to get the known zeroes to handle that possibility. I left an early out to skip the call if all of the demanded bits are set in the mask.
Differential Revision: https://reviews.llvm.org/D58464
llvm-svn: 354514
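The reasoning in r354514 can be checked with plain bit arithmetic. In this sketch (hypothetical helper names, 32-bit values assumed), an AND is redundant for the demanded bits if every demanded bit is either kept by the mask or already known to be zero in the LHS.

```cpp
#include <cassert>
#include <cstdint>

// True if (LHS & Mask) and LHS agree on all demanded bits, given the bits of
// LHS that are known to be zero.
bool andIsRedundant(uint32_t Demanded, uint32_t Mask, uint32_t KnownZeroLHS) {
  return (Demanded & ~(Mask | KnownZeroLHS)) == 0;
}

// Evaluates the demanded bits of (LHS & Mask), skipping the AND when the
// check above proves it unnecessary.
uint32_t evalDemanded(uint32_t LHS, uint32_t Mask, uint32_t Demanded,
                      uint32_t KnownZeroLHS) {
  assert((LHS & KnownZeroLHS) == 0 && "value contradicts its known-zero bits");
  if (andIsRedundant(Demanded, Mask, KnownZeroLHS))
    return LHS & Demanded;
  return (LHS & Mask) & Demanded;
}
```

For instance, if the LHS is `y << 2`, a mask of 0xFF may have been simplified to 0xFC; looking at the mask alone, the low two demanded bits appear uncovered, but the known zeros of the LHS cover them.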
Nikita Popov [Wed, 20 Feb 2019 20:41:44 +0000 (20:41 +0000)]
[SDAG] Support vector UMULO/SMULO
Second part of https://bugs.llvm.org/show_bug.cgi?id=40442.
This adds an extra UnrollVectorOverflowOp() method to SDAG, because
the general UnrollOverflowOp() method can't deal with multiple results.
Additionally we need to expand UMULO/SMULO during vector op
legalization, as it may result in unrolling, which may need additional
type legalization.
Differential Revision: https://reviews.llvm.org/D57997
llvm-svn: 354513
Nemanja Ivanovic [Wed, 20 Feb 2019 20:27:33 +0000 (20:27 +0000)]
Make predefined FLT16 macros conditional on support for the type
We unconditionally predefine these macros. However, they may be used to
determine if the type is supported. In that case, there are unnecessary
failures to compile the code.
This is the proposed fix for https://bugs.llvm.org/show_bug.cgi?id=40559
Differential revision: https://reviews.llvm.org/D57577
llvm-svn: 354512
Craig Topper [Wed, 20 Feb 2019 20:18:20 +0000 (20:18 +0000)]
[X86] Add more load folding patterns for blend instructions as a follow up to r354363.
This avoids depending on the peephole pass to do load folding.
Also adds some load folding for some insert_subvector patterns that use blend.
All of this was found by temporarily adding TB_NO_FORWARD to the blend immediate entries in the load folding tables.
I've added -disable-peephole to some of the affected tests from that experiment to ensure we're testing isel patterns.
llvm-svn: 354511
Tom Stellard [Wed, 20 Feb 2019 19:43:47 +0000 (19:43 +0000)]
Add support for pointer types in patterns
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
  def : Pat<(load GPR32:$src),
            (p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
Alexey Bataev [Wed, 20 Feb 2019 19:37:17 +0000 (19:37 +0000)]
[OPENMP] Use targetDiag for diagnostics of unsupported exceptions, NFC.
llvm-svn: 354509
Nirav Dave [Wed, 20 Feb 2019 19:26:47 +0000 (19:26 +0000)]
Fix testcase.
llvm-svn: 354508
Ilya Biryukov [Wed, 20 Feb 2019 19:26:39 +0000 (19:26 +0000)]
[clangd] Fix a crash in Selection
Summary:
The assertion checking that a range of a node is a token range does
not hold in case of "split" tokens, e.g. between two closing template
argument lists (`vector<vector<int>>`).
Reviewers: kadircet, sammccall
Reviewed By: kadircet
Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58447
llvm-svn: 354507
Davide Italiano [Wed, 20 Feb 2019 19:25:12 +0000 (19:25 +0000)]
[lldb-mi] Move TestMIPrompt away from pexpect().
llvm-svn: 354506
Ilya Biryukov [Wed, 20 Feb 2019 19:08:06 +0000 (19:08 +0000)]
[clangd] Store index in '.clangd/index' instead of '.clangd-index'
Summary: To take up the .clangd folder for other potential uses in the future.
Reviewers: kadircet, sammccall
Reviewed By: kadircet
Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58440
llvm-svn: 354505
Nirav Dave [Wed, 20 Feb 2019 19:07:55 +0000 (19:07 +0000)]
Add test case.
llvm-svn: 354504
Gabor Marton [Wed, 20 Feb 2019 19:07:36 +0000 (19:07 +0000)]
Fix remaining semicolon pedantic errors for intel
llvm-svn: 354503
Siva Chandra [Wed, 20 Feb 2019 19:07:04 +0000 (19:07 +0000)]
[Clang Driver] Add support for "-static-pie" argument to the Clang driver.
Summary: This change mimics GCC's support for the "-static-pie" argument.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58307
llvm-svn: 354502
Craig Topper [Wed, 20 Feb 2019 19:02:01 +0000 (19:02 +0000)]
[X86] Add test case to show missed opportunity to remove an explicit AND on the bit position from BT when it has known zeros. NFC
If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits.
This can prevent GetDemandedBits from recognizing that the AND is unnecessary.
llvm-svn: 354501
Vitaly Buka [Wed, 20 Feb 2019 18:55:52 +0000 (18:55 +0000)]
Fix license headers
llvm-svn: 354500
Craig Topper [Wed, 20 Feb 2019 18:47:26 +0000 (18:47 +0000)]
Revert r354498 "[X86] Add test case to show missed opportunity to remove an explicit AND on the bit position from BT when it has known zeros."
I accidentally committed more than just the test.
llvm-svn: 354499
Craig Topper [Wed, 20 Feb 2019 18:45:38 +0000 (18:45 +0000)]
[X86] Add test case to show missed opportunity to remove an explicit AND on the bit position from BT when it has known zeros.
If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits.
This can prevent GetDemandedBits from recognizing that the AND is unnecessary.
llvm-svn: 354498
Tom Stellard [Wed, 20 Feb 2019 18:43:45 +0000 (18:43 +0000)]
AArch64/test: Add check for function name to machine-outliner-bad-adrp.mir
Summary:
This test was failing in one of our setups because the generated ModuleID
had the full path of the test file and that path contained the string
BL.
Reviewers: t.p.northover, jpaquette, paquette
Reviewed By: paquette
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58217
llvm-svn: 354497
Puyan Lotfi [Wed, 20 Feb 2019 18:30:44 +0000 (18:30 +0000)]
Fixing NDEBUG typo in include/llvm/Support/raw_ostream.h
NDEBUG is misspelled as NDBEBUG in include/llvm/Support/raw_ostream.h.
llvm-svn: 354495
Davide Italiano [Wed, 20 Feb 2019 18:27:29 +0000 (18:27 +0000)]
[lldb-mi] Remove a test that uses pexpect().
Summary:
Its functionality is entirely covered by exec-run.test (which
doesn't use pexpect)
Reviewers: serge-sans-paille
Subscribers: ki.stfu, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58459
llvm-svn: 354494
Andrea Di Biagio [Wed, 20 Feb 2019 18:23:19 +0000 (18:23 +0000)]
[MCA][Scheduler] Correctly initialize field NumDispatchedToThePendingSet.
This should have been part of r354490.
llvm-svn: 354493
Daniel Sanders [Wed, 20 Feb 2019 18:08:48 +0000 (18:08 +0000)]
Add partial implementation of std::to_address() as llvm::to_address()
Summary:
Following on from the review for D58088, this patch provides the
prerequisite to_address() implementation that's needed to have
pointer_iterator support unique_ptr.
The late bound return should be removed once we move to C++14 to better
align with the C++20 declaration. Also, this implementation can be removed
once we move to C++20 where it's defined as std::to_address()
The std::pointer_traits<>::to_address(p) variations of these overloads have
not been implemented.
Reviewers: dblaikie, paquette
Reviewed By: dblaikie
Subscribers: dexonsmith, kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58421
llvm-svn: 354491
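A minimal sketch of the idea in r354491, written for C++11 with a trailing return type as the message describes. The name `toAddressSketch` is deliberately different from both std::to_address and llvm::to_address, and `readThrough` is a hypothetical usage helper; this is an illustration, not the code that was committed.

```cpp
#include <cassert>
#include <memory>

// Raw pointers are already addresses.
template <class T> constexpr T *toAddressSketch(T *P) { return P; }

// Pointer-like objects (for example std::unique_ptr) expose the address
// through operator->. Raw pointers drop out of this overload via SFINAE.
template <class Ptr>
auto toAddressSketch(const Ptr &P) -> decltype(P.operator->()) {
  return P.operator->();
}

// Example use: read an int through either kind of pointer with one code path.
template <class Ptr> int readThrough(const Ptr &P) {
  int *Raw = toAddressSketch(P);
  assert(Raw && "expected a non-null pointer");
  return *Raw;
}
```

As in the commit message, the `std::pointer_traits<>::to_address(p)` customization point is not modelled here.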
Andrea Di Biagio [Wed, 20 Feb 2019 18:01:49 +0000 (18:01 +0000)]
[MCA][Scheduler] Collect resource pressure and memory dependency bottlenecks.
Every cycle, the Scheduler checks if instructions in the ReadySet can be issued
to the underlying pipelines. If an instruction cannot be issued because one or
more pipeline resources are unavailable, then field
Instruction::CriticalResourceMask is updated with the resource identifier of the
unavailable resources.
If an instruction cannot be promoted from the PendingSet to the ReadySet because
of a memory dependency, then field Instruction::CriticalMemDep is updated with
the identifier of the depending memory instruction.
Bottleneck information is collected after every cycle for instructions that are
waiting to execute. The idea is to help identify causes of bottlenecks; this
information can be used in future to implement a bottleneck analysis.
llvm-svn: 354490
Simon Pilgrim [Wed, 20 Feb 2019 17:58:29 +0000 (17:58 +0000)]
[X86][SSE] combineX86ShufflesRecursively - begin generalizing the number of shuffle inputs. NFCI.
We currently bail if the target shuffle decodes to more than 2 input vectors, this is some initial cleanup that still has the limit but generalizes the opindices to an array that will be necessary when we drop the limit.
llvm-svn: 354489
Jonas Devlieghere [Wed, 20 Feb 2019 17:43:34 +0000 (17:43 +0000)]
[TestModuleCXX] Make this test Darwin-only.
Apparently this functionality is not expected to work on non-Darwin
systems. I should've checked the decorator on the original test.
llvm-svn: 354487
Alexey Bataev [Wed, 20 Feb 2019 17:42:57 +0000 (17:42 +0000)]
[OPENMP] Delay emission of the asm target-specific error messages.
Summary:
Added the ability to emit target-specific builtin assembler error
messages only if the function is really going to be emitted
for the device.
Reviewers: rjmccall
Subscribers: guansong, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58243
llvm-svn: 354486
Yan Zhang [Wed, 20 Feb 2019 17:32:41 +0000 (17:32 +0000)]
Update property prefix regex to allow numbers.
Subscribers: jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D56896
llvm-svn: 354485
James Henderson [Wed, 20 Feb 2019 17:21:38 +0000 (17:21 +0000)]
[llvm-readelf]Test a couple of corner-cases for --section-mapping
This patch adds two new tests for edge-case behaviour for --section-
mapping, namely when there are no program headers, and when there are no
section headers.
Reviewed by: mattd
Differential Revision: https://reviews.llvm.org/D58456
llvm-svn: 354484
Michal Gorny [Wed, 20 Feb 2019 17:10:34 +0000 (17:10 +0000)]
[lldb] [test] Fix expected netbsd output for TestImageListMultiArchitecture
llvm-svn: 354483
Gabor Marton [Wed, 20 Feb 2019 16:57:41 +0000 (16:57 +0000)]
Fix compile error with Intel's compiler (-Werror=pedantic)
An extra semicolon at the end of macro invocations caused a build bot
failure for Intel's compiler when pedantic is turned on.
llvm-svn: 354482
Petr Hosek [Wed, 20 Feb 2019 16:53:08 +0000 (16:53 +0000)]
[CodeGen] Enable the complex-math test for arm
This test wasn't running due to a missing : after the RUN statement.
Enabling this test revealed that it's actually broken.
Differential Revision: https://reviews.llvm.org/D58429
llvm-svn: 354481
Matt Arsenault [Wed, 20 Feb 2019 16:42:52 +0000 (16:42 +0000)]
GlobalISel: Fix fewerElementsVector for ctlz with different result type
Also complete the set of related operations.
llvm-svn: 354480
Alexey Bataev [Wed, 20 Feb 2019 16:36:22 +0000 (16:36 +0000)]
[OPENMP][NVPTX]Use faster teams reduction algorithm.
A faster way to reduce the values in teams reductions was found; the
codegen is updated to use this faster algorithm and new runtime functions.
llvm-svn: 354479
Matt Arsenault [Wed, 20 Feb 2019 16:11:22 +0000 (16:11 +0000)]
GlobalISel: Implement moreElementsVector for g_insert results
llvm-svn: 354477
Clement Courbet [Wed, 20 Feb 2019 15:45:58 +0000 (15:45 +0000)]
Re-land the refactoring part of r354244 "[DAGCombiner] Eliminate dead stores to stack."
This is an NFC.
llvm-svn: 354476
Sanjay Patel [Wed, 20 Feb 2019 15:40:58 +0000 (15:40 +0000)]
[CGP][x86] add tests for usubo special-case; NFC
This is another example from PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
llvm-svn: 354475
James Henderson [Wed, 20 Feb 2019 15:13:44 +0000 (15:13 +0000)]
[obj2yaml][yaml2obj]Locate all .yaml and .test tests
A number of the obj2yaml tests end in .yaml, but .yaml is not a default
file type picked up by lit, so these tests weren't being run when
running the testsuite as a whole (they could be run explicitly still).
This change adds a lit local config file to specify the known file types
for obj2yaml tests (.yaml and .test). Additionally, it fixes the
yaml2obj config file to allow both .test and .yaml suffixed tests
(previously, the two tests ending in '.test' were not being run).
Reviewed by: grimar
Differential Revision: https://reviews.llvm.org/D58439
llvm-svn: 354474
Krzysztof Parzyszek [Wed, 20 Feb 2019 15:05:19 +0000 (15:05 +0000)]
[Hexagon] Split vector pairs for ISD::SIGN_EXTEND and ISD::ZERO_EXTEND
llvm-svn: 354473
Hans Wennborg [Wed, 20 Feb 2019 14:56:31 +0000 (14:56 +0000)]
Speculative buildfix for Mac
Our builds were failing with
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/ARMBuildAttrs.cpp.o
[..]
In file included from /b/c/b/ToTMac/src/third_party/llvm/lib/Support/ARMBuildAttrs.cpp:9:
In file included from /b/c/b/ToTMac/src/third_party/llvm/include/llvm/ADT/StringRef.h:12:
In file included from /b/c/b/ToTMac/src/third_party/llvm/include/llvm/ADT/STLExtras.h:19:
/b/c/b/ToTMac/src/third_party/llvm/include/llvm/ADT/Optional.h:88:25: error: no member named 'addressof' in namespace 'std'
::new ((void *)std::addressof(value)) T(std::forward<Args>(args)...);
~~~~~^
Try to fix by including <memory>
llvm-svn: 354472
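For context on r354472: std::addressof is declared in `<memory>`, so code using it should include that header directly rather than rely on a transitive include. The sketch below uses hypothetical types (`Slot`, `Tricky`) that loosely mimic an optional-like storage; it is not llvm::Optional.

```cpp
#include <cassert>
#include <memory>  // declares std::addressof
#include <new>     // placement new
#include <utility> // std::forward

// A type with an overloaded unary operator&, which is why addressof matters.
struct Tricky {
  int V;
  explicit Tricky(int V) : V(V) {}
  Tricky *operator&() { return nullptr; }
};

// Hypothetical single-value storage that constructs its value in place.
template <class T> class Slot {
  union {
    char Empty;
    T Value;
  };
  bool HasValue = false;

public:
  Slot() : Empty() {}
  ~Slot() { reset(); }

  template <class... Args> void emplace(Args &&...As) {
    reset();
    // addressof yields the real address even though T overloads operator&.
    ::new ((void *)std::addressof(Value)) T(std::forward<Args>(As)...);
    HasValue = true;
  }

  void reset() {
    if (HasValue) {
      Value.~T();
      HasValue = false;
    }
  }

  T &get() {
    assert(HasValue && "slot is empty");
    return Value;
  }
};
```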
Gheorghe-Teodor Bercea [Wed, 20 Feb 2019 14:55:55 +0000 (14:55 +0000)]
[OpenMP][libomptarget] New reduction scheme for team reductions
Summary:
This patch adds a more sophisticated team reduction scheme to the OpenMP libomptarget-nvptx runtime.
The scheme uses a fixed size global memory buffer whose length can be adjusted via compiler flag:
```
-fopenmp-cuda-teams-reduction-recs-num=1024
```
The global buffer is a structure of arrays (with default size of 1024 each and controlled by the above flag), one array for each reduction variable.
Values in the buffer are processed by the last team to finish executing the body of the target region.
In addition to adding support for the new flag, the compiler also emits special functions used for the reduction of the intermediate reduction values. These changes will be added in a separate compiler patch following this one.
Reviewers: ABataev, caomhin
Reviewed By: ABataev
Subscribers: guansong, jfb, jdoerfert, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D58409
llvm-svn: 354471
Andrea Di Biagio [Wed, 20 Feb 2019 14:53:18 +0000 (14:53 +0000)]
[MCA][ResourceManager] Add a table that maps processor resource indices to processor resource identifiers.
This patch adds a lookup table to speed up resource queries in the ResourceManager.
This patch also moves helper function 'getResourceStateIndex()' from
ResourceManager.cpp to Support.h, so that we can reuse that logic in the
SummaryView (and potentially other views in llvm-mca).
No functional change intended.
llvm-svn: 354470
Hans Wennborg [Wed, 20 Feb 2019 14:50:08 +0000 (14:50 +0000)]
Fix the build with gcc/libstdc++ 4.8.2 after r354441
llvm-svn: 354469
Simon Atanasyan [Wed, 20 Feb 2019 14:47:02 +0000 (14:47 +0000)]
[mips] Put some MIPS-specific sections to separate segments
Three MIPS-specific sections `.reginfo`, `.MIPS.options`, and `.MIPS.abiflags`
are used by the loader to read their contents and set up the environment for
running a program. The loader looks up these data in the corresponding segments:
`PT_MIPS_REGINFO`, `PT_MIPS_OPTIONS`, and `PT_MIPS_ABIFLAGS` respectively.
This patch puts these sections into separate segments like we do already
for ARM `SHT_ARM_EXIDX` section.
Differential Revision: http://reviews.llvm.org/D58381
llvm-svn: 354468
Sanjay Patel [Wed, 20 Feb 2019 14:34:00 +0000 (14:34 +0000)]
[InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201
The previous attempt at this in rL354406 had a logic bug that
actually triggered a regression test failure, but I failed to
notice it the first time.
llvm-svn: 354467
Michal Gorny [Wed, 20 Feb 2019 14:31:06 +0000 (14:31 +0000)]
[lldb] [ObjectFile/ELF] Fix recognizing NetBSD images
Split the recognition into NetBSD executables & shared libraries
and core(5) files.
Introduce new owner type: "NetBSD-CORE", as core(5) files are not tagged
in the same way as regular NetBSD executables.
Stop incorrectly using ABI_TAG and ABI_SIZE. Introduce IDENT_TAG,
IDENT_DECSZ, IDENT_NAMESZ and PROCINFO.
The new values correctly detect NetBSD images.
The patch has been originally written by Kamil Rytarowski. I've added
tests and applied minor code changes per review. The work has been
sponsored by the NetBSD Foundation.
Differential Revision: https://reviews.llvm.org/D42870
llvm-svn: 354466
George Rimar [Wed, 20 Feb 2019 14:01:02 +0000 (14:01 +0000)]
[yaml2elf] - Rename a variable. NFC.
Was suggested during review of D58441.
llvm-svn: 354463
George Rimar [Wed, 20 Feb 2019 13:58:43 +0000 (13:58 +0000)]
[yaml2obj] - Simplify implementation. NFCI.
Knowing about how types are declared for 32/64 bit platforms:
https://github.com/llvm-mirror/llvm/blob/master/include/llvm/BinaryFormat/ELF.h#L28
it is possible to simplify code that writes a binary a bit.
The patch does that.
Differential revision: https://reviews.llvm.org/D58441
llvm-svn: 354462
Petar Avramovic [Wed, 20 Feb 2019 13:42:44 +0000 (13:42 +0000)]
[MIPS MSA] Avoid some DAG combines for vector shifts
DAG combiner combines two shifts into shift + and with bitmask.
Avoid such combines for vectors since leaving two vector shifts
as they are produces better end results.
Differential Revision: https://reviews.llvm.org/D58225
llvm-svn: 354461
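A scalar illustration of the kind of rewrite r354461 refers to: a right shift followed by a smaller left shift equals a single shift plus an AND with a bitmask. The function names are hypothetical, and the exact forms the DAG combiner produces may differ from this one instance.

```cpp
#include <cassert>
#include <cstdint>

// Two-shift form: shift right by C1, then left by C2 (with C2 <= C1 < 32).
uint32_t twoShifts(uint32_t X, unsigned C1, unsigned C2) {
  assert(C2 <= C1 && C1 < 32 && "sketch only covers C2 <= C1 < 32");
  return (X >> C1) << C2;
}

// Equivalent shift-plus-mask form: one shift by the difference, then clear
// the low C2 bits that the left shift would have zeroed.
uint32_t shiftAndMask(uint32_t X, unsigned C1, unsigned C2) {
  assert(C2 <= C1 && C1 < 32 && "sketch only covers C2 <= C1 < 32");
  return (X >> (C1 - C2)) & (~uint32_t(0) << C2);
}
```

On scalars the two forms are interchangeable; the commit's point is that for MSA vectors, keeping the two vector shifts gives better final code.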
Ilya Biryukov [Wed, 20 Feb 2019 12:31:44 +0000 (12:31 +0000)]
[clangd] Fix a typo. NFC
The documentation for -index-file mentioned clang-index instead of
clangd-indexer.
llvm-svn: 354456
Petar Avramovic [Wed, 20 Feb 2019 12:13:11 +0000 (12:13 +0000)]
[MIPS MSA] Add test for vector shift combines
Add test for vector shift combines.
llvm-svn: 354455
Simon Pilgrim [Wed, 20 Feb 2019 12:04:54 +0000 (12:04 +0000)]
[SLPVectorizer][X86] Add add/sub/mul overflow tests
Baseline tests - overflow intrinsics aren't flagged as vectorizable yet
llvm-svn: 354454
Kadir Cetinkaya [Wed, 20 Feb 2019 11:45:20 +0000 (11:45 +0000)]
[clangd] Revert r354442 and r354444
Looks like sysroot is only working on linux.
llvm-svn: 354453
Krasimir Georgiev [Wed, 20 Feb 2019 11:44:21 +0000 (11:44 +0000)]
[clang-format] Do not emit replacements if Java imports are OK
Summary:
Currently clang-format would always emit a replacement for a block of Java imports even if it is correctly formatted:
```
% cat /tmp/Aggregator.java
import X;
% clang-format /tmp/Aggregator.java
import X;
% clang-format -output-replacements-xml /tmp/Aggregator.java
<?xml version='1.0'?>
<replacements xml:space='preserve' incomplete_format='false'>
<replacement offset='0' length='9'>import X;</replacement>
</replacements>
%
```
This change makes clang-format not emit replacements in this case. Note that
there is logic to not emit replacements in this case for C++.
Reviewers: ioeric
Reviewed By: ioeric
Subscribers: jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58436
llvm-svn: 354452
H.J. Lu [Wed, 20 Feb 2019 11:43:43 +0000 (11:43 +0000)]
[sanitizers] Restore internal_readlink for x32
r316591 has
@@ -389,13 +383,11 @@ uptr internal_dup2(int oldfd, int newfd) {
}
uptr internal_readlink(const char *path, char *buf, uptr bufsize) {
-#if SANITIZER_NETBSD
- return internal_syscall_ptr(SYSCALL(readlink), path, buf, bufsize);
-#elif SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
+#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
return internal_syscall(SYSCALL(readlinkat), AT_FDCWD,
(uptr)path, (uptr)buf, bufsize);
#else
- return internal_syscall(SYSCALL(readlink), (uptr)path, (uptr)buf, bufsize);
+ return internal_syscall_ptr(SYSCALL(readlink), path, buf, bufsize);
#endif
}
which dropped the (uptr) cast and broke x32. This patch puts back the
(uptr) cast to restore x32 and fixes:
https://bugs.llvm.org/show_bug.cgi?id=40783
Differential Revision: https://reviews.llvm.org/D58413
llvm-svn: 354451
Fangrui Song [Wed, 20 Feb 2019 11:34:18 +0000 (11:34 +0000)]
ELF: Remove field for .gdb_index in InStruct. NFC.
Summary: This field is unreferenced outside of createSyntheticSections.
Reviewers: ruiu, pcc, espindola, grimar
Reviewed By: grimar
Subscribers: grimar, emaste, arichardson, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58423
llvm-svn: 354449
Kadir Cetinkaya [Wed, 20 Feb 2019 10:32:04 +0000 (10:32 +0000)]
[clangd] Try to fix windows build bots
llvm-svn: 354444
David Green [Wed, 20 Feb 2019 10:22:18 +0000 (10:22 +0000)]
[Codegen] Remove dead flags on Physical Defs in machine cse
We may leave behind incorrect dead flags on instructions that are CSE'd. Make
sure we remove the dead flags on physical registers to prevent other incorrect
code motion.
Differential Revision: https://reviews.llvm.org/D58115
llvm-svn: 354443
Kadir Cetinkaya [Wed, 20 Feb 2019 09:41:26 +0000 (09:41 +0000)]
[clangd] Testcase for bug 39811
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58133
llvm-svn: 354442
Roman Lebedev [Wed, 20 Feb 2019 09:14:04 +0000 (09:14 +0000)]
[llvm-exegesis] Opcode stabilization / reclusterization (PR40715)
Summary:
Given an instruction `Opcode`, we can make benchmarks (measurements) of the
instruction characteristics/performance. Then, to facilitate further analysis
we group the benchmarks with *similar* characteristics into clusters.
Now, this is all not entirely deterministic. Some instructions have variable
characteristics, depending on their arguments. And thus, if we do several
benchmarks of the same instruction `Opcode`, we may end up with *different*
performance characteristics measurements. And when we then do clustering,
these several benchmarks of the same instruction `Opcode` may end up being
clustered into *different* clusters. This is not great for further analysis.
We shall find every `Opcode` with benchmarks not in just one cluster, and move
*all* the benchmarks of said `Opcode` into one new unstable cluster per `Opcode`.
I have solved this by making `ClusterId` a bit field, adding an `IsUnstable` bit,
and introducing the `-analysis-display-unstable-clusters` switch to toggle between
displaying stable-only clusters and unstable-only clusters.
The reclusterization is deterministic and produces identical reports
between runs. (Or at least that is what I'm seeing; maybe it isn't.)
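The stabilization step described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from the patch: `Point`, `stabilize`, and the plain integer cluster ids are hypothetical names chosen for illustration, and a `bool` stands in for the `IsUnstable` bit.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical model of one benchmark point after the initial clustering.
struct Point {
  unsigned Opcode;
  int Cluster;   // cluster id assigned by the initial clustering
  bool Unstable; // stands in for the IsUnstable bit
};

// Moves all points of every opcode that spans more than one cluster into a
// fresh unstable cluster (one per opcode). Returns how many were created.
inline int stabilize(std::vector<Point> &Points) {
  std::map<unsigned, std::set<int>> ClustersOfOpcode;
  int NextCluster = 0;
  for (const Point &P : Points) {
    assert(P.Cluster >= 0 && "cluster ids are assumed to be non-negative");
    ClustersOfOpcode[P.Opcode].insert(P.Cluster);
    if (P.Cluster >= NextCluster)
      NextCluster = P.Cluster + 1;
  }

  // std::map iterates in opcode order, so new ids are assigned deterministically.
  std::map<unsigned, int> UnstableClusterOf;
  for (const auto &Entry : ClustersOfOpcode)
    if (Entry.second.size() > 1)
      UnstableClusterOf[Entry.first] = NextCluster++;

  for (Point &P : Points) {
    auto It = UnstableClusterOf.find(P.Opcode);
    if (It == UnstableClusterOf.end())
      continue;
    P.Cluster = It->second;
    P.Unstable = true;
  }
  return static_cast<int>(UnstableClusterOf.size());
}
```

In this model, opcodes whose points already share a single cluster are left untouched, so only the unstable opcodes change cluster ids.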
Timings/comparisons:
old (current trunk/head) {F8303582}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs):
6624.73 msec task-clock # 0.999 CPUs utilized ( +- 0.53% )
172 context-switches # 25.965 M/sec ( +- 29.89% )
0 cpu-migrations # 0.042 M/sec ( +- 56.54% )
31073 page-faults # 4690.754 M/sec ( +- 0.08% )
26538711696 cycles # 4006230.292 GHz ( +- 0.53% ) (83.31%)
2017496807 stalled-cycles-frontend # 7.60% frontend cycles idle ( +- 0.93% ) (83.32%)
13403650062 stalled-cycles-backend # 50.51% backend cycles idle ( +- 0.33% ) (33.37%)
19770706799 instructions # 0.74 insn per cycle
# 0.68 stalled cycles per insn ( +- 0.04% ) (50.04%)
4419821812 branches # 667207369.714 M/sec ( +- 0.03% ) (66.69%)
121741669 branch-misses # 2.75% of all branches ( +- 0.28% ) (83.34%)
6.6283 +- 0.0358 seconds time elapsed ( +- 0.54% )
```
patch, with reclustering but without filtering (i.e. outputting all the stable *and* unstable clusters) {F8303586}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html' (25 runs):
6475.29 msec task-clock # 0.999 CPUs utilized ( +- 0.31% )
213 context-switches # 32.952 M/sec ( +- 23.81% )
1 cpu-migrations # 0.130 M/sec ( +- 43.84% )
31287 page-faults # 4832.057 M/sec ( +- 0.08% )
25939086577 cycles # 4006160.279 GHz ( +- 0.31% ) (83.31%)
1958812858 stalled-cycles-frontend # 7.55% frontend cycles idle ( +- 0.68% ) (83.32%)
13218961512 stalled-cycles-backend # 50.96% backend cycles idle ( +- 0.29% ) (33.37%)
19752995402 instructions # 0.76 insn per cycle
# 0.67 stalled cycles per insn ( +- 0.04% ) (50.04%)
4417079244 branches # 682195472.305 M/sec ( +- 0.03% ) (66.70%)
121510065 branch-misses # 2.75% of all branches ( +- 0.19% ) (83.34%)
6.4832 +- 0.0229 seconds time elapsed ( +- 0.35% )
```
Funnily, *this* measurement shows that said reclustering actually improved performance.
patch, with reclustering, only the stable clusters {F8303594}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html' (25 runs):
6387.71 msec task-clock # 0.999 CPUs utilized ( +- 0.13% )
133 context-switches # 20.792 M/sec ( +- 23.39% )
0 cpu-migrations # 0.063 M/sec ( +- 61.24% )
31318 page-faults # 4903.256 M/sec ( +- 0.08% )
25591984967 cycles # 4006786.266 GHz ( +- 0.13% ) (83.31%)
1881234904 stalled-cycles-frontend # 7.35% frontend cycles idle ( +- 0.25% ) (83.33%)
13209749965 stalled-cycles-backend # 51.62% backend cycles idle ( +- 0.16% ) (33.36%)
19767554347 instructions # 0.77 insn per cycle
# 0.67 stalled cycles per insn ( +- 0.04% ) (50.03%)
4417480305 branches # 691618858.046 M/sec ( +- 0.03% ) (66.68%)
118676358 branch-misses # 2.69% of all branches ( +- 0.07% ) (83.33%)
6.3954 +- 0.0118 seconds time elapsed ( +- 0.18% )
```
Performance improved even further?! Makes sense I guess, fewer clusters to print.
patch, with reclustering, only the unstable clusters {F8303601}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters' (25 runs):
6124.96 msec task-clock # 1.000 CPUs utilized ( +- 0.20% )
194 context-switches # 31.709 M/sec ( +- 20.46% )
0 cpu-migrations # 0.039 M/sec ( +- 49.77% )
31413 page-faults # 5129.261 M/sec ( +- 0.06% )
24536794267 cycles # 4006425.858 GHz ( +- 0.19% ) (83.31%)
1676085087 stalled-cycles-frontend # 6.83% frontend cycles idle ( +- 0.46% ) (83.32%)
13035595603 stalled-cycles-backend # 53.13% backend cycles idle ( +- 0.16% ) (33.36%)
18260877653 instructions # 0.74 insn per cycle
# 0.71 stalled cycles per insn ( +- 0.05% ) (50.03%)
4112411983 branches # 671484364.603 M/sec ( +- 0.03% ) (66.68%)
114066929 branch-misses # 2.77% of all branches ( +- 0.11% ) (83.32%)
6.1278 +- 0.0121 seconds time elapsed ( +- 0.20% )
```
This tells us that the actual `-analysis-inconsistencies-output-file=` outputting only takes ~0.4 sec for 43970 benchmark points (3 whole sweeps)
(Also, wow this is fast, it used to take several minutes originally)
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40715 | PR40715 ]].
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, jdoerfert, llvm-commits, RKSimon
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58355
llvm-svn: 354441
Mikael Holmen [Wed, 20 Feb 2019 07:14:39 +0000 (07:14 +0000)]
[RegAllocGreedy] Take last chance recoloring into account in split and assign
Summary:
This is a follow-up to r353988 where tryEvict was extended to take last
chance recoloring into account. Now we do the same thing for trySplit and
tryAssign.
Now we always pass a "FixedRegisters" argument to canEvictInterference and
tryEvict so it doesn't need to have a default value anymore.
The need for this was found long ago in an out-of-tree target.
Unfortunately I don't have a reproducer for an in-tree target.
Reviewers: qcolombet, rudkx
Reviewed By: qcolombet, rudkx
Subscribers: rudkx, MatzeB, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58376
llvm-svn: 354439
Chen Zheng [Wed, 20 Feb 2019 07:01:04 +0000 (07:01 +0000)]
[NFC] add/modify wrapper function for findRegisterDefOperand().
llvm-svn: 354438
Chijun Sima [Wed, 20 Feb 2019 05:49:01 +0000 (05:49 +0000)]
[DTU] Refine the document of mutation APIs [NFC] (PR40528)
Summary:
It was pointed out in [[ https://bugs.llvm.org/show_bug.cgi?id=40528 | Bug 40528 ]] that it is not clear whether insert/deleteEdge can be used to perform multiple updates and [[ https://reviews.llvm.org/D57316#1388344 | a comment in D57316 ]] reveals that the difference between several ways to update the DominatorTree is confusing.
This patch tries to address the issues above.
Reviewers: mkazantsev, kuhar, asbirlea, chandlerc, brzycki
Reviewed By: mkazantsev, kuhar, brzycki
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57881
llvm-svn: 354437
Craig Topper [Wed, 20 Feb 2019 05:39:11 +0000 (05:39 +0000)]
[X86] Remove FeatureSlowIncDec from Sandy Bridge and later Intel Core CPUs
Summary:
Inc and Dec were at one point slow on Intel CPUs due to their tendency to cause partial flag stalls on P6 derived CPU cores. This is because these instructions are defined to preserve the carry flag. This partial flag stall issue persisted until Sandy Bridge when flag merging was changed to be handled as a data dependency instead of as a stall until retirement. Sandy Bridge and later CPUs rename the C flag separately from OSPAZ so there is no flag merge needed on INC/DEC to preserve the C flag.
Given these improvements I don't know why INC/DEC was ever considered slow on Sandy Bridge. If anything they should have been disabled on the earlier CPUs instead.
Note that after this patch, INC/DEC are still considered slow on Silvermont, Goldmont, Knights Landing and our generic "x86-64" CPU.
Reviewers: spatel, RKSimon, chandlerc
Reviewed By: chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D58412
llvm-svn: 354436
Leonard Chan [Wed, 20 Feb 2019 05:07:14 +0000 (05:07 +0000)]
Limit new PM tests to X86 registered targets.
llvm-svn: 354435
Eric Christopher [Wed, 20 Feb 2019 04:42:07 +0000 (04:42 +0000)]
Temporarily Revert "[X86][SLP] Enable SLP vectorization for 128-bit horizontal X86 instructions (add, sub)"
This has broken the LTO bootstrap build for 3 days and is
showing a significant regression on the Dither_benchmark results (from
the LLVM benchmark suite), specifically on
BENCHMARK_FLOYD_DITHER_128, BENCHMARK_FLOYD_DITHER_256, and
BENCHMARK_FLOYD_DITHER_512; the others are unchanged. These have
regressed by about 28% on Skylake, 34% on Haswell, and over 40% on
Sandy Bridge.
This reverts commit r353923.
llvm-svn: 354434
Fangrui Song [Wed, 20 Feb 2019 04:39:42 +0000 (04:39 +0000)]
[Dominators] Simplify and optimize path compression used in link-eval forest.
Summary:
* NodeToInfo[*] have been allocated so the addresses are stable. We can store them instead of NodePtr to save NumToNode lookups.
* Nodes are traversed twice. Using `Visited` to check the traversal number is expensive and obscure. Just split the two traversals into two loops explicitly.
* The check `VInInfo.DFSNum < LastLinked` is redundant as it is implied by `VInInfo->Parent < LastLinked`
* `VLabelInfo` and `PLabelInfo` are used to save a NodeToInfo lookup in the second traversal.
Also add some comments explaining eval().
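The two-loop structure described in the list above can be sketched as follows. This is a simplified model and not the code from the patch: the `Info` struct, the plain vector indexed by DFS number, and the `eval` signature are assumptions made for illustration. The first loop collects the ancestors to compress, and the second loop re-parents them and propagates the label with the smallest `Semi`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-vertex record, indexed by DFS number.
struct Info {
  unsigned Parent; // DFS number of the parent in the link-eval forest
  unsigned Semi;   // semidominator number
  unsigned Label;  // vertex with the smallest Semi seen on the path so far
};

// Returns the label of V after compressing the path from V towards the root.
// Infos is not resized here, so pointers into it stay valid throughout.
inline unsigned eval(std::vector<Info> &Infos, unsigned V, unsigned LastLinked) {
  assert(V < Infos.size() && "vertex out of range");
  Info *VInfo = &Infos[V];
  if (VInfo->Parent < LastLinked)
    return VInfo->Label;

  // First traversal: collect V and its ancestors, excluding the last one
  // (the root of the virtual tree).
  std::vector<Info *> Stack;
  do {
    Stack.push_back(VInfo);
    VInfo = &Infos[VInfo->Parent];
  } while (VInfo->Parent >= LastLinked);

  // Second traversal: walk back down, pointing each vertex at the root's
  // parent and keeping the label with the smaller Semi.
  const Info *PInfo = VInfo;
  const Info *PLabelInfo = &Infos[PInfo->Label];
  do {
    VInfo = Stack.back();
    Stack.pop_back();
    VInfo->Parent = PInfo->Parent;
    const Info *VLabelInfo = &Infos[VInfo->Label];
    if (PLabelInfo->Semi < VLabelInfo->Semi)
      VInfo->Label = PInfo->Label;
    else
      PLabelInfo = VLabelInfo;
    PInfo = VInfo;
  } while (!Stack.empty());
  return VInfo->Label;
}
```

Splitting the work into two explicit loops avoids a visited set, and after the call a second `eval` on the same vertex returns immediately through the early exit.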
This shows a ~4.5% improvement (9.8444s -> 9.3996s) on
perf stat -r 10 taskset -c 0 opt -passes=$(printf '%.0srequire<domtree>,invalidate<domtree>,' {1..1000})'require<domtree>' -disable-output sqlite-autoconf-3270100/sqlite3.bc
Reviewers: kuhar, sanjoy, asbirlea
Reviewed By: kuhar
Subscribers: brzycki, NutshellySima, kristina, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58327
llvm-svn: 354433
Leonard Chan [Wed, 20 Feb 2019 04:35:28 +0000 (04:35 +0000)]
Remove test on incompatible MIPS target.
llvm-svn: 354432
Leonard Chan [Wed, 20 Feb 2019 03:50:11 +0000 (03:50 +0000)]
[NewPM] Add other sanitizers at O0
This allows MSan and TSan to be used without requiring optimizations.
Differential Revision: https://reviews.llvm.org/D58424
llvm-svn: 354431