platform/upstream/llvm.git
4 years ago[InstCombine] add more tests for mul of bools; NFC
Sanjay Patel [Fri, 3 Jul 2020 19:27:43 +0000 (15:27 -0400)]
[InstCombine] add more tests for mul of bools; NFC

4 years ago[libcxx] Put clang::trivial_abi on std::unique_ptr, std::shared_ptr, and std::weak_ptr
Vy Nguyen [Wed, 24 Jun 2020 19:03:08 +0000 (15:03 -0400)]
[libcxx] Put clang::trivial_abi on std::unique_ptr, std::shared_ptr, and std::weak_ptr

Reviewers: jyknight, EricWF, #libc!

Subscribers: arphaman, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D82490

4 years ago[clangd] Fix hover crash on invalid decls
Kadir Cetinkaya [Fri, 3 Jul 2020 18:52:41 +0000 (20:52 +0200)]
[clangd] Fix hover crash on invalid decls

Summary: This also changes the way we display Size and Offset to be independent.

Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83143

4 years ago[PowerPC] Implement Vector Insert Builtins in LLVM/Clang
Biplob Mishra [Fri, 3 Jul 2020 17:45:27 +0000 (12:45 -0500)]
[PowerPC] Implement Vector Insert Builtins in LLVM/Clang

Implements vec_insertl() and vec_inserth().

Differential Revision: https://reviews.llvm.org/D82365

4 years agoRevert AST Matchers default to AsIs mode
Stephen Kelly [Thu, 2 Jul 2020 19:46:27 +0000 (20:46 +0100)]
Revert AST Matchers default to AsIs mode

Reviewers: aaron.ballman, klimek

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83076

4 years ago[flang] Track known file size, add IsATerminal (ext. I/O work part 5)
peter klausler [Fri, 3 Jul 2020 17:56:08 +0000 (10:56 -0700)]
[flang] Track known file size, add IsATerminal (ext. I/O work part 5)

Add a data member knownSize_ and an accessor to allow the size of
an external file to be tracked when known.  Also add a wrapper for
::isatty() here in the filesystem encapsulation module.  These
features are needed for the external I/O rework changes still
to come.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D83141

4 years ago[flang] Define new runtime error IOSTAT values (I/O runtime work part 4)
peter klausler [Fri, 3 Jul 2020 17:47:02 +0000 (10:47 -0700)]
[flang] Define new runtime error IOSTAT values (I/O runtime work part 4)

Add more IOSTAT= values for errors that can arise in external I/O.

Reviewed By: sscalpone

Differential Revision: https://reviews.llvm.org/D83140

4 years ago[InstCombine] Try to narrow expr if trunc cannot be removed.
Florian Hahn [Fri, 3 Jul 2020 19:22:51 +0000 (20:22 +0100)]
[InstCombine] Try to narrow expr if trunc cannot be removed.

Narrowing an input expression of a truncate to a type larger than the
result of the truncate won't allow removing the truncate, but it may
enable further optimizations, e.g. allowing for larger vectorization
factors.

For now this is intentionally limited to integer types only, to avoid
producing new vector ops that might not be suitable for the target.

If we know that the only user is a trunc, we can also allow more
cases, e.g. also shortening expressions with some additional shifts.

I would appreciate feedback on the best place to do such a narrowing.

This fixes PR43580.

Reviewers: spatel, RKSimon, lebedev.ri, xbolva00

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D82973

4 years ago[libc++/libc++abi] Automatically detect whether exceptions are enabled
Louis Dionne [Fri, 3 Jul 2020 17:46:41 +0000 (13:46 -0400)]
[libc++/libc++abi] Automatically detect whether exceptions are enabled

Instead of detecting it automatically (in libc++) and relying on
_LIBCXXABI_NO_EXCEPTIONS being set explicitly (in libc++abi), always
detect whether exceptions are enabled automatically.

This commit also removes support for specifying -D_LIBCPP_NO_EXCEPTIONS
and -D_LIBCXXABI_NO_EXCEPTIONS explicitly -- those should just be inferred
from using -fno-exceptions (or an equivalent flag).

Allowing both -D_FOO_NO_EXCEPTIONS to be provided explicitly and trying
to detect it automatically is just confusing, especially since we did
specify it explicitly when building libc++abi. We should have only one
way to detect whether exceptions are enabled, but it should be robust.

4 years ago[flang] Add FIRBuilder.cpp
Eric Schweitz [Fri, 3 Jul 2020 00:44:51 +0000 (17:44 -0700)]
[flang] Add FIRBuilder.cpp

The FIR builder is a helper class that manages the creation of MLIR
operations from the bridge. The focus of the builder is the creation of
Operations, Types, etc.

Differential revision: https://reviews.llvm.org/D83107

4 years ago[XCOFF][AIX] Use 'L..' instead of '.L' for getPrivateGlobalPrefix in DataLayout
jasonliu [Thu, 2 Jul 2020 22:45:59 +0000 (22:45 +0000)]
[XCOFF][AIX] Use 'L..' instead of '.L' for getPrivateGlobalPrefix in DataLayout

Summary:
D80831 changed part of the prefix usage for AIX.
But there are other places getting prefix from DataLayout.
This patch intends to make prefix usage consistent on AIX.

Reviewed by: hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D81270

4 years ago[llvm-ar][test] Unsupport error-opening-directory.test on FreeBSD
sameerarora101 [Fri, 3 Jul 2020 17:57:32 +0000 (10:57 -0700)]
[llvm-ar][test] Unsupport error-opening-directory.test on FreeBSD

Differential Revision: https://reviews.llvm.org/D82786

4 years ago[InstCombine] fold mul of zext bools to 'and'
Sanjay Patel [Fri, 3 Jul 2020 17:08:59 +0000 (13:08 -0400)]
[InstCombine] fold mul of zext bools to 'and'

The base case only works because we are relying on a
poison-unsafe select transform; if that is fixed, we
would regress on patterns like this.

The extra use tests show that the select transform can't
be applied consistently. So it may be a regression to have
an extra instruction on 1 test, but that result was not
created safely and does not happen reliably.
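
For readers less familiar with the IR, the fold in the title rests on a
small identity: multiplying two zero-extended bools gives the same value
as zero-extending their 'and'. A minimal C++ sketch of that identity
follows; the function names are hypothetical and the 32-bit result width
is an arbitrary assumption, not something taken from the patch.

```cpp
#include <cstdint>

// Models `mul (zext i1 a), (zext i1 b)` with a 32-bit result.
// (Hypothetical helper, for illustration only.)
constexpr std::uint32_t mulOfZextBools(bool a, bool b) {
  return static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
}

// Models `zext (and i1 a, b)` with a 32-bit result.
constexpr std::uint32_t zextOfAnd(bool a, bool b) {
  return static_cast<std::uint32_t>(a && b);
}

// The two forms agree on every input, which is what makes the fold legal.
static_assert(mulOfZextBools(false, false) == zextOfAnd(false, false), "");
static_assert(mulOfZextBools(false, true) == zextOfAnd(false, true), "");
static_assert(mulOfZextBools(true, false) == zextOfAnd(true, false), "");
static_assert(mulOfZextBools(true, true) == zextOfAnd(true, true), "");
```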

4 years ago[InstCombine] add tests for mul of bools; NFC
Sanjay Patel [Fri, 3 Jul 2020 16:52:34 +0000 (12:52 -0400)]
[InstCombine] add tests for mul of bools; NFC

4 years ago[NFC][InstCombine] Add some more tests for select based on non-canonical bit-test
Roman Lebedev [Fri, 3 Jul 2020 17:12:46 +0000 (20:12 +0300)]
[NFC][InstCombine] Add some more tests for select based on non-canonical bit-test

4 years ago[InstSimplify] Fold icmp with dominating assume
Nikita Popov [Sun, 28 Jun 2020 13:59:56 +0000 (15:59 +0200)]
[InstSimplify] Fold icmp with dominating assume

If we assume(x > y), then we should be able to fold the basic
implications of that, like x >= y. This already happens if either
one of the operands is constant (LVI) or if the conditions are
exactly the same (GVN), but not if we have an implication with
non-constant operands. Support this by querying AssumptionCache.
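
As a toy model of the implications involved (not the AssumptionCache
query itself), the sketch below tabulates what each signed comparison of
the same two operands must evaluate to once x > y is assumed, and checks
the table by brute force. The enum and function names are hypothetical.

```cpp
// Hypothetical predicate set; only signed comparisons are modelled here.
enum class Pred { EQ, NE, SGT, SGE, SLT, SLE };

// Evaluates `x <pred> y` directly.
bool evaluate(Pred pred, int x, int y) {
  switch (pred) {
  case Pred::EQ:  return x == y;
  case Pred::NE:  return x != y;
  case Pred::SGT: return x > y;
  case Pred::SGE: return x >= y;
  case Pred::SLT: return x < y;
  case Pred::SLE: return x <= y;
  }
  return false;
}

// The value `x <pred> y` must have whenever the assumption x > y holds.
bool valueGivenSgt(Pred pred) {
  return pred == Pred::SGT || pred == Pred::SGE || pred == Pred::NE;
}

// Checks the table against direct evaluation on a small range of values.
bool tableIsSound() {
  const Pred all[] = {Pred::EQ,  Pred::NE,  Pred::SGT,
                      Pred::SGE, Pred::SLT, Pred::SLE};
  for (Pred pred : all)
    for (int x = -4; x <= 4; ++x)
      for (int y = -4; y <= 4; ++y)
        if (x > y && evaluate(pred, x, y) != valueGivenSgt(pred))
          return false;
  return true;
}
```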

Fixes https://bugs.llvm.org/show_bug.cgi?id=40149.

Differential Revision: https://reviews.llvm.org/D82717

4 years ago[ELF] Resolve R_DTPREL in .debug_* referencing discarded symbols to -1
Fangrui Song [Fri, 3 Jul 2020 16:50:30 +0000 (09:50 -0700)]
[ELF] Resolve R_DTPREL in .debug_* referencing discarded symbols to -1

The location of a TLS variable is encoded as a DW_OP_const4u/DW_OP_const8u
followed by a DW_OP_push_tls_address (or DW_OP_GNU_push_tls_address https://sourceware.org/bugzilla/show_bug.cgi?id=11616 ).

This change follows up to D81784 and makes relocation types generalized as
R_DTPREL (e.g. R_X86_64_DTPOFF{32,64}, R_PPC64_DTPREL64) use -1 as the
tombstone value as well. This works for both TLS Variant I and Variant II
architectures.

* arm: .long tls(tlsldo)   # not working currently (R_ARM_TLS_LDO32 is R_ABS)
* mips64: .dtpreldword tls+32768
* ppc64: .quad tls@DTPREL+0x8000
* riscv: neither GCC nor clang has implemented DW_AT_location. It is likely .long/.quad tls@dtprel+0x800
* x86-32: .long tls@DTPOFF
* x86-64: .long tls@DTPOFF; .quad tls@DTPOFF

tls has a non-negative st_value, so such relocations (st_value+addend)
never resolve to -1 in a normal (not discarded) case.

```
// clang -fuse-ld=lld -g -ffunction-sections a.c -Wl,--gc-sections
// foo and tls will be discarded by --gc-sections.
// DW_AT_location [DW_FORM_exprloc] (DW_OP_const8u 0xffffffffffffffff, DW_OP_GNU_push_tls_address)
thread_local int tls;
int foo() { return ++tls; }
int main() {}
```

Also, drop logic added in D26201 intended to address PR30793. It added a test
(gc-debuginfo-tls.s) using a non-SHF_ALLOC section and a local symbol, which
does not reflect the intended scenario: a relocation in a SHF_ALLOC section
referencing a discarded non-local symbol. For such a non .debug_* section, just
emit an error.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82899

4 years ago[SLP] Make sure instructions are ordered when computing spill cost.
Florian Hahn [Fri, 3 Jul 2020 16:28:45 +0000 (17:28 +0100)]
[SLP] Make sure instructions are ordered when computing spill cost.

The entries in VectorizableTree are not necessarily ordered by their
position in basic blocks. Collect them and order them by dominance so
later instructions are guaranteed to be visited first. For instructions
in different basic blocks, we only scan to the beginning of the block,
so their order does not matter, as long as all instructions in a basic
block are grouped together. Using dominance ensures a deterministic order.

The modified test case contains an example where we compute a wrong
spill cost (2) without this patch, even though there is no call between
any instruction in the bundle.

This seems to have limited practical impact, e.g. on X86 with a recent
Intel Xeon CPU with -O3 -march=native -flto on MultiSource,SPEC2000,SPEC2006
there are no binary changes.

Reviewers: craig.topper, RKSimon, xbolva00, ABataev, spatel

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D82444

4 years ago[ARM][HWLoops] Create hardware loops for sibling loops
David Green [Fri, 3 Jul 2020 13:18:32 +0000 (14:18 +0100)]
[ARM][HWLoops] Create hardware loops for sibling loops

Given a loop with two subloops, it should be possible for both to be
converted to hardware loops. That's what this patch does, simply enough.
It slightly alters the loop iterating order to try and convert all
subloops. If one (or more) succeeds, it stops as before.

Differential Revision: https://reviews.llvm.org/D78502

4 years ago[SLP] Precommit test for which spill cost is computed incorrectly.
Florian Hahn [Tue, 23 Jun 2020 18:07:30 +0000 (19:07 +0100)]
[SLP] Precommit test for which spill cost is computed incorrectly.

Test for D82444.

4 years ago[InstCombine] Precommit tests for PR43580.
Florian Hahn [Wed, 1 Jul 2020 12:54:24 +0000 (13:54 +0100)]
[InstCombine] Precommit tests for PR43580.

4 years agoEnable basepointer for AIX.
Sean Fertile [Fri, 3 Jul 2020 15:55:49 +0000 (11:55 -0400)]
Enable basepointer for AIX.

Differential Revision: https://reviews.llvm.org/D82030

4 years ago[InstCombine] add one-use check to cast+select narrowing transform
Sanjay Patel [Fri, 3 Jul 2020 15:52:52 +0000 (11:52 -0400)]
[InstCombine] add one-use check to cast+select narrowing transform

Prevent increasing the instruction count.

4 years ago[InstCombine] add tests to show missing one-use checks; NFC
Sanjay Patel [Fri, 3 Jul 2020 15:28:03 +0000 (11:28 -0400)]
[InstCombine] add tests to show missing one-use checks; NFC

4 years ago[DWARFYAML][test] Use --ignore-case to suppress errors.
Xing GUO [Fri, 3 Jul 2020 15:44:36 +0000 (23:44 +0800)]
[DWARFYAML][test] Use --ignore-case to suppress errors.

This patch is to fix build bot failure (http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/553).

4 years ago[flang] Improve API for runtime allocator (I/O runtime work part 3)
peter klausler [Fri, 3 Jul 2020 01:35:20 +0000 (18:35 -0700)]
[flang] Improve API for runtime allocator (I/O runtime work part 3)

New<A> used to return an A&; now it returns an OwningPtr<A>
to force better ownership tracking of allocations.  Its API
has also been split into New<A> and SizedNew<A> to allow
allocations with a size override.
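
To illustrate the ownership point, here is a rough sketch of an
allocator pair that hands back an owning pointer instead of a reference.
It is not the flang runtime's code: the names newOwned and sizedNewOwned,
the std::unique_ptr-based OwningPtr alias, the malloc/free backing store
and the use of std::bad_alloc are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <utility>

// Hypothetical deleter: runs the destructor, then frees the raw storage.
struct DestroyAndFree {
  template <typename A> void operator()(A *object) const {
    object->~A();
    std::free(object);
  }
};

// Assumed shape of an owning pointer; the real OwningPtr may differ.
template <typename A> using OwningPtr = std::unique_ptr<A, DestroyAndFree>;

// Allocates at least `bytes` (never less than sizeof(A)) and constructs an A
// at the start of that storage, e.g. for objects with trailing payload.
template <typename A, typename... Args>
OwningPtr<A> sizedNewOwned(std::size_t bytes, Args &&...args) {
  void *storage = std::malloc(bytes < sizeof(A) ? sizeof(A) : bytes);
  if (storage == nullptr)
    throw std::bad_alloc();
  return OwningPtr<A>(new (storage) A(std::forward<Args>(args)...));
}

// The common case: exactly sizeof(A) bytes.
template <typename A, typename... Args>
OwningPtr<A> newOwned(Args &&...args) {
  return sizedNewOwned<A>(sizeof(A), std::forward<Args>(args)...);
}

// Small sample type used to exercise the helpers.
struct Record {
  int id;
  explicit Record(int i) : id(i) {}
};
```

Because the result is move-only, a caller cannot silently drop the
allocation the way it could with a bare reference.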

Reviewed By: tskeith

Differential Revision: https://reviews.llvm.org/D83108

4 years ago[clang][NFC] Removed unused parameters in InitializeSourceManager
Andrzej Warzynski [Fri, 3 Jul 2020 14:44:06 +0000 (15:44 +0100)]
[clang][NFC] Removed unused parameters in InitializeSourceManager

4 years ago[InstCombine] canEvaluateTruncated - use KnownBits to check for inrange shift amounts
Simon Pilgrim [Fri, 3 Jul 2020 14:51:06 +0000 (15:51 +0100)]
[InstCombine] canEvaluateTruncated - use KnownBits to check for inrange shift amounts

Currently canEvaluateTruncated can only attempt to truncate shifts if they are scalar/uniform constant amounts that are in range.

This patch replaces the constant extraction code with KnownBits handling, using KnownBits::getMaxValue to check that the amounts are in range.

This enables support for nonuniform constant cases, and also variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs as KnownBits returns nothing in those cases, but it's a definite improvement on what we currently have.
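
The idea can be sketched with a simplified stand-in for known-bits
information. The struct below is hypothetical and much smaller than
llvm::KnownBits; it only shows why a masked variable shift amount can be
proven in range from its maximum possible value.

```cpp
#include <cstdint>

// Hypothetical, simplified known-bits record for a 32-bit shift amount:
// a set bit in knownZero means that bit of the value is known to be 0,
// a set bit in knownOne means that bit is known to be 1.
struct KnownShiftAmount {
  std::uint32_t knownZero;
  std::uint32_t knownOne;
};

// Largest value consistent with the known bits: every bit that is not
// known to be zero is assumed to be one.
constexpr std::uint32_t maxValue(KnownShiftAmount known) {
  return ~known.knownZero;
}

// A shift of a bitWidth-wide value is in range if even the largest possible
// amount is below bitWidth.
constexpr bool shiftAlwaysInRange(KnownShiftAmount known, unsigned bitWidth) {
  return maxValue(known) < bitWidth;
}

// Example: an amount computed as `y & 7` has all bits above bit 2 known zero,
// so it is always a valid shift amount for an 8-bit value.
static_assert(shiftAlwaysInRange(KnownShiftAmount{~7u, 0u}, 8), "");
```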

Differential Revision: https://reviews.llvm.org/D83127

4 years ago[AMDGPU] Added support of new inline assembler constraints
Dmitry Preobrazhensky [Fri, 3 Jul 2020 11:36:57 +0000 (14:36 +0300)]
[AMDGPU] Added support of new inline assembler constraints

Added support for constraints 'I', 'J', 'L', 'B', 'C', 'Kf', 'DA', 'DB'.

See https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81657

4 years ago[ARM] Generate [SU]RHADD from (b - (~a)) >> 1
Petre-Ionut Tudor [Wed, 3 Jun 2020 14:28:43 +0000 (15:28 +0100)]
[ARM] Generate [SU]RHADD from (b - (~a)) >> 1

Summary:
Teach LLVM to recognize the above pattern, which is usually a
transformation of (a + b + 1) >> 1, where the operands are either
signed or unsigned types.
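
The equivalence follows from ~a == -a - 1 in two's complement, so
b - (~a) == a + b + 1. A small C++ check of the unsigned 8-bit case is
sketched below; the function names are hypothetical, and the operands
are widened to 32 bits so that the intermediate sum cannot overflow.

```cpp
#include <cstdint>

// Reference form: (a + b + 1) >> 1, computed in a wider type.
std::uint8_t roundingHalvingAdd(std::uint8_t a, std::uint8_t b) {
  const std::uint32_t wideA = a;
  const std::uint32_t wideB = b;
  return static_cast<std::uint8_t>((wideA + wideB + 1u) >> 1);
}

// Transformed form: (b - (~a)) >> 1 on the widened operands.
// Unsigned arithmetic wraps modulo 2^32, so b - (~a) equals a + b + 1.
std::uint8_t roundingHalvingAddViaNot(std::uint8_t a, std::uint8_t b) {
  const std::uint32_t wideA = a;
  const std::uint32_t wideB = b;
  return static_cast<std::uint8_t>((wideB - ~wideA) >> 1);
}

// Exhaustively compares both forms over all pairs of 8-bit inputs.
bool formsAgree() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (roundingHalvingAdd(static_cast<std::uint8_t>(a),
                             static_cast<std::uint8_t>(b)) !=
          roundingHalvingAddViaNot(static_cast<std::uint8_t>(a),
                                   static_cast<std::uint8_t>(b)))
        return false;
  return true;
}
```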

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82669

4 years ago[lldb/DWARF] Look for complete member definitions in other modules
Pavel Labath [Wed, 1 Jul 2020 14:46:24 +0000 (16:46 +0200)]
[lldb/DWARF] Look for complete member definitions in other modules

With -flimit-debug-info, we can have a definition of a class, but no
definition for some of its members. This extends the same logic we were
using for incomplete base classes to cover incomplete members too.

Test forward-declarations.s is removed as it is no longer applicable --
we don't warn anymore when encountering incomplete members as they could
be completed elsewhere. New checks added to TestLimitDebugInfo cover the
handling of incomplete members more thoroughly.

4 years ago[mlir] Add check for ViewLikeOpInterface that creates additional aliases.
Julian Gross [Thu, 2 Jul 2020 13:10:24 +0000 (15:10 +0200)]
[mlir] Add check for ViewLikeOpInterface that creates additional aliases.

ViewLikeOpInterfaces introduce new aliases that need to be added to the alias
list. This is necessary to place deallocs in the right positions.

Differential Revision: https://reviews.llvm.org/D83044

4 years ago[ObjectYAML][ELF] Add support for emitting the .debug_gnu_pubnames/pubtypes sections.
Xing GUO [Fri, 3 Jul 2020 10:13:49 +0000 (18:13 +0800)]
[ObjectYAML][ELF] Add support for emitting the .debug_gnu_pubnames/pubtypes sections.

This patch helps add support for emitting the .debug_gnu_pubnames and .debug_gnu_pubtypes sections.

The .debug_gnu_pub* sections are verified by llvm-dwarfdump.

Known issues:
- Doesn't support emitting multiple pub-tables.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82367

4 years ago[lldb/Utility] Simplify more Scalar methods
Pavel Labath [Fri, 3 Jul 2020 14:31:50 +0000 (16:31 +0200)]
[lldb/Utility] Simplify more Scalar methods

A lot of the methods handle all integral and all floating point types
the same way. They can be changed to switch on the category of the type,
instead of the actual type, saving a lot of boilerplate.

This patch does that for the methods where I could be reasonably certain
of their expected semantics.

4 years ago[DWARFYAML][unittest] Use parseDWARFYAML() in unit test. NFC.
Xing GUO [Fri, 3 Jul 2020 14:34:27 +0000 (22:34 +0800)]
[DWARFYAML][unittest] Use parseDWARFYAML() in unit test. NFC.

4 years ago[mlir] Add redundant copy removal transform
Ehsan Toosi [Thu, 25 Jun 2020 15:02:11 +0000 (17:02 +0200)]
[mlir] Add redundant copy removal transform

This pass removes redundant dialect-independent Copy operations in different
situations like the following:

%from = ...
%to = ...
... (no user/alias for %to)
copy(%from, %to)
... (no user/alias for %from)
dealloc %from
use(%to)

Differential Revision: https://reviews.llvm.org/D82757

4 years ago[NFC][SimplifyCFG] Move X86 tests into subdir
Sam Parker [Fri, 3 Jul 2020 13:20:57 +0000 (14:20 +0100)]
[NFC][SimplifyCFG] Move X86 tests into subdir

4 years ago[llvm-readobj] - Use cantFail() for all `Obj->sections()` calls. NFCI.
Georgii Rymar [Fri, 3 Jul 2020 11:17:08 +0000 (14:17 +0300)]
[llvm-readobj] - Use cantFail() for all `Obj->sections()` calls. NFCI.

`ELFDumper<ELFT>::ELFDumper` calls `Obj->sections()` in its constructor:
https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-readobj/ELFDumper.cpp#L2046

this means that all subsequent calls can't fail and can be
wrapped into `cantFail` instead of `unwrapOrError` for simplicity.

Actually we already do it in a few places. In this patch I've fixed all
other places I've found.

Differential revision: https://reviews.llvm.org/D83126

4 years ago[IR] Short-circuit comparison with itself for Attributes
Danila Malyutin [Tue, 23 Jun 2020 11:43:02 +0000 (14:43 +0300)]
[IR] Short-circuit comparison with itself for Attributes

Differential Revision: https://reviews.llvm.org/D82295

4 years ago[clang][NFC] Add a missing /dev/null in test/AST/ast-dump-lambda.cpp
Bruno Ricci [Fri, 3 Jul 2020 12:58:59 +0000 (13:58 +0100)]
[clang][NFC] Add a missing /dev/null in test/AST/ast-dump-lambda.cpp

4 years ago[clang][NFC] Also test for serialization in test/AST/ast-dump-comment.cpp
Bruno Ricci [Fri, 3 Jul 2020 12:58:19 +0000 (13:58 +0100)]
[clang][NFC] Also test for serialization in test/AST/ast-dump-comment.cpp

4 years ago[clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper
Bruno Ricci [Fri, 3 Jul 2020 12:54:10 +0000 (13:54 +0100)]
[clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper

In general there is no way to get to the ASTContext from most AST nodes
(Decls are one of the exceptions). This will be a problem when implementing
the rest of APValue::dump since we need the ASTContext to dump some kinds of
APValues.

The ASTContext* in ASTDumper and TextNodeDumper is not always non-null.
This is because we still want to be able to use the various dump() functions
in a debugger.

No functional changes intended.

Reverted in fcf4d5e4499a391dff42ea1a096f146db44147b6 since a few dump()
functions in lldb were missed.

4 years agoAdd tests for trunc(shl/lshr/ashr(*ext(x),zext(and(y,c)))) patterns with variable...
Simon Pilgrim [Fri, 3 Jul 2020 12:39:16 +0000 (13:39 +0100)]
Add tests for trunc(shl/lshr/ashr(*ext(x),zext(and(y,c)))) patterns with variable shifts with clamped shift amounts

4 years agoAdd vector trunc(or(shl(zext(x),c1),zext(x))) tests
Simon Pilgrim [Fri, 3 Jul 2020 10:51:59 +0000 (11:51 +0100)]
Add vector trunc(or(shl(zext(x),c1),zext(x))) tests

4 years ago[LLD][ELF][Windows] Allow LLD to overwrite existing output files that are in use
Ben Dunbobbin [Fri, 3 Jul 2020 11:54:24 +0000 (12:54 +0100)]
[LLD][ELF][Windows] Allow LLD to overwrite existing output files that are in use

On Windows co-operative programs can be expected to open LLD's
output in FILE_SHARE_DELETE mode. This allows us to delete the
file (by moving it to a temporary filename and then deleting
it) so that we can link another output file that overwrites
the existing file, even if the current file is in use.

A similar strategy is documented here:
https://boostgsoc13.github.io/boost.afio/doc/html/afio/FAQ/deleting_open_files.html

Differential Revision: https://reviews.llvm.org/D82567

4 years ago[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction
vpykhtin [Thu, 25 Jun 2020 14:24:58 +0000 (17:24 +0300)]
[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction

Reviewers: arsenm, rampitec, foad

Reviewed By: rampitec, foad

Subscribers: wuzish, kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82551

4 years ago[ARM] Add Cortex-A77 Support for Clang and LLVM
Luke Geeson [Tue, 30 Jun 2020 15:45:36 +0000 (16:45 +0100)]
[ARM] Add Cortex-A77 Support for Clang and LLVM

This patch upstreams support for the Arm-v8 Cortex-A77
processor for AArch64 and ARM.

In detail:
- Adding cortex-a77 as a cpu option for aarch64 and arm targets in clang
- Cortex-A77 CPU name and ProcessorModel in llvm

details of the CPU can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a77

and a similar submission to GCC can be found here:
https://github.com/gcc-mirror/gcc/commit/e0664b7a63ed8305e9f8539309df7fb3eb13babe

The following people contributed to this patch:
- Luke Geeson
- Mikhail Maltsev

Reviewers: t.p.northover, dmgreen, ostannard, SjoerdMeijer

Reviewed By: dmgreen

Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82887

4 years agoRevert RecursiveASTVisitor fixes.
Dmitri Gribenko [Fri, 3 Jul 2020 11:46:59 +0000 (13:46 +0200)]
Revert RecursiveASTVisitor fixes.

This reverts commit 8bf4c40af813e73de77739b33b8808f6bd13497b.
This reverts commit 7b0be962d681c408c8ecf7180c6ad8f9fbcdaf2d.
This reverts commit 94454442c3c15a67ae70ef3a73616632968973fc.

Some compilers on some buildbots didn't accept the specialization of
is_same_method_impl in a non-namespace scope.

4 years agoMake RecursiveASTVisitor call WalkUpFrom for operators when the data recursion queue...
Dmitri Gribenko [Fri, 3 Jul 2020 10:39:14 +0000 (12:39 +0200)]
Make RecursiveASTVisitor call WalkUpFrom for operators when the data recursion queue is absent

Reviewers: eduucaldas, ymandel, rsmith

Reviewed By: eduucaldas

Subscribers: gribozavr2, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82889

4 years agoMake RecursiveASTVisitor call WalkUpFrom for unary and binary operators in post-order...
Dmitri Gribenko [Fri, 3 Jul 2020 10:39:03 +0000 (12:39 +0200)]
Make RecursiveASTVisitor call WalkUpFrom for unary and binary operators in post-order traversal mode

Reviewers: ymandel, eduucaldas, rsmith

Reviewed By: eduucaldas, rsmith

Subscribers: gribozavr2, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82787

4 years agoRecursiveASTVisitor: don't call WalkUp unnecessarily in post-order traversal
Dmitri Gribenko [Fri, 3 Jul 2020 10:38:45 +0000 (12:38 +0200)]
RecursiveASTVisitor: don't call WalkUp unnecessarily in post-order traversal

Summary:
How does RecursiveASTVisitor call the WalkUp callback for expressions?

* In pre-order traversal mode, RecursiveASTVisitor calls the WalkUp
  callback from the default implementation of Traverse callbacks.

* In post-order traversal mode when we don't have a DataRecursionQueue,
  RecursiveASTVisitor also calls the WalkUp callback from the default
  implementation of Traverse callbacks.

* However, in post-order traversal mode when we have a DataRecursionQueue,
  RecursiveASTVisitor calls the WalkUp callback from PostVisitStmt.

As a result, when the user overrides the Traverse callback, in pre-order
traversal mode they never get the corresponding WalkUp callback. However
in the post-order traversal mode the WalkUp callback is invoked or not
depending on whether the data recursion optimization could be applied.

I had to adjust the implementation of TraverseCXXForRangeStmt in the
syntax tree builder to call the WalkUp method directly, as it was
relying on this behavior. There is an existing test for this
functionality and it prompted me to make this extra fix.

In addition, I had to fix the default implementation of
RecursiveASTVisitor::TraverseSynOrSemInitListExpr to call WalkUpFrom in
the same manner as the implementation generated by the DEF_TRAVERSE_STMT
macro. Without this fix, the InitListExprIsPostOrderNoQueueVisitedTwice
test was failing because WalkUpFromInitListExpr was never called.

Reviewers: eduucaldas, ymandel

Reviewed By: eduucaldas, ymandel

Subscribers: gribozavr2, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82486

4 years agoAdded tests for RecursiveASTVisitor for AST nodes that are special cased
Dmitri Gribenko [Fri, 3 Jul 2020 10:38:35 +0000 (12:38 +0200)]
Added tests for RecursiveASTVisitor for AST nodes that are special cased

Summary:
RecursiveASTVisitor has special code for handling operator AST nodes,
specifically, unary, binary, and compound assignment operators. In this
change I'm adding tests for operator AST nodes that follow the existing
pattern of tests for the CallExpr node (an AST node that triggers the
common code path).

Reviewers: ymandel, eduucaldas

Reviewed By: ymandel, eduucaldas

Subscribers: gribozavr2, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82875

4 years ago[libcxx testing] Remove ALLOW_RETRIES from another test
David Zarzycki [Fri, 3 Jul 2020 10:53:44 +0000 (06:53 -0400)]
[libcxx testing] Remove ALLOW_RETRIES from another test

4 years ago[DebugInfo] Use Cursor to detect errors in debug line prologue parser
James Henderson [Thu, 2 Jul 2020 13:13:43 +0000 (14:13 +0100)]
[DebugInfo] Use Cursor to detect errors in debug line prologue parser

Previously, the debug line parser would keep attempting to read data
even if it had run out of data to read. This meant errors in parsing
would often end up being reported as something else, such as an unknown
version or malformed directory/filename table. This patch fixes the
issues by using the Cursor API to capture errors.
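
The general pattern can be sketched as follows. This is a hypothetical
miniature, not the interface of the Cursor class used by the patch: once
a read runs past the end of the data, the cursor records the failure and
every later read becomes a no-op, so the first error is the one that
gets reported.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sticky-error cursor over a byte buffer.
class ByteCursor {
public:
  explicit ByteCursor(const std::vector<std::uint8_t> &data) : data_(data) {}

  // Returns the next byte, or 0 if the cursor has already failed or the
  // data is exhausted. A failed read never advances the offset.
  std::uint8_t readU8() {
    if (failed_ || offset_ >= data_.size()) {
      failed_ = true;
      return 0;
    }
    return data_[offset_++];
  }

  bool failed() const { return failed_; }
  std::size_t offset() const { return offset_; }

private:
  const std::vector<std::uint8_t> &data_;
  std::size_t offset_ = 0;
  bool failed_ = false;
};
```

A parser built this way can read a whole prologue unconditionally and
test failed() once at the end, instead of misreading zeros as, say, an
unknown version number.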

Reviewed by: labath

Differential Revision: https://reviews.llvm.org/D83043

4 years agoRegenerate apint-cast tests and replace %tmp variable names to silence update_test_ch...
Simon Pilgrim [Fri, 3 Jul 2020 10:41:21 +0000 (11:41 +0100)]
Regenerate apint-cast tests and replace %tmp variable names to silence update_test_checks warnings

4 years agoAdd nonuniform vector trunc(or(shl(zext(x),c1),srl(zext(x),c2))) tests
Simon Pilgrim [Fri, 3 Jul 2020 10:40:17 +0000 (11:40 +0100)]
Add nonuniform vector trunc(or(shl(zext(x),c1),srl(zext(x),c2))) tests

4 years agoRegenerate mul-trunc tests, add vector variants and replace %tmp variable names to...
Simon Pilgrim [Fri, 3 Jul 2020 10:37:26 +0000 (11:37 +0100)]
Regenerate mul-trunc tests, add vector variants and replace %tmp variable names to silence update_test_checks warnings

4 years ago[lldb] Fix missing characters when autocompleting LLDB commands in REPL
Martin Svensson [Fri, 3 Jul 2020 09:36:34 +0000 (11:36 +0200)]
[lldb] Fix missing characters when autocompleting LLDB commands in REPL

Summary:

When tabbing to complete LLDB commands in REPL, characters would at best be
missing but at worst cause the REPL to crash due to out of range string access.
This patch appends the command character to the completion results to fulfill
the assumption that all matches are prefixed by the request's cursor argument
prefix.

Bug report for the Swift REPL
https://bugs.swift.org/browse/SR-12867

Reviewers: teemperor

Reviewed By: teemperor

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D82835

4 years ago[DWARFYAML][debug_gnu_*] Add the missing context `IsGNUStyle`. NFC.
Xing GUO [Fri, 3 Jul 2020 09:56:02 +0000 (17:56 +0800)]
[DWARFYAML][debug_gnu_*] Add the missing context `IsGNUStyle`. NFC.

This patch helps add the missing context `IsGNUStyle`. Before this patch, yaml2obj cannot parse the YAML description of 'debug_gnu_pubnames' and 'debug_gnu_pubtypes' correctly due to the missing context.

In other words, if we have

```
DWARF:
  debug_gnu_pubtypes:
    Length:
      TotalLength: 0x1234
    Version:    2
    UnitOffset: 0x1234
    UnitSize:   0x4321
    Entries:
      - DieOffset:  0x12345678
        Name:       abc
        Descriptor: 0x00      ## Descriptor can never be mapped into Entry.Descriptor
```

yaml2obj will complain that "error: unknown key 'Descriptor'".

This patch helps resolve this problem.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82435

4 years agoFix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning.
Simon Pilgrim [Fri, 3 Jul 2020 09:42:54 +0000 (10:42 +0100)]
Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning.

4 years ago[clangd] Improve hover on arguments to function call
Adam Czachorowski [Fri, 3 Jul 2020 09:20:22 +0000 (11:20 +0200)]
[clangd] Improve hover on arguments to function call

Summary:
In cases like:
  foo(a, ^b);
We now additionally show the name and type of the parameter to foo that
"b" is passed as.

The name should help with understanding what it's used for and type can
be useful to find out if call to foo() can mutate variable "b" or not
(i.e. if it is pass by value, reference, const reference, etc).

Patch By: adamcz@ !

Reviewers: kadircet

Reviewed By: kadircet

Subscribers: nridge, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81169

4 years ago[InstCombine] Add sext(ashr(shl(trunc(x),c),c)) folding support for vectors
Simon Pilgrim [Thu, 2 Jul 2020 18:24:38 +0000 (19:24 +0100)]
[InstCombine] Add sext(ashr(shl(trunc(x),c),c)) folding support for vectors

Replacing m_ConstantInt with m_Constant permits folding of vectors as well as scalars.

Differential Revision: https://reviews.llvm.org/D83058

4 years agoRegenerate PR19420 tests
Simon Pilgrim [Thu, 2 Jul 2020 17:30:34 +0000 (18:30 +0100)]
Regenerate PR19420 tests

4 years ago[llvm-readelf] - Do not report a misleading warning when there is no string table.
Georgii Rymar [Thu, 2 Jul 2020 13:06:26 +0000 (16:06 +0300)]
[llvm-readelf] - Do not report a misleading warning when there is no string table.

This is a follow-up for D82955, which allows dumping to continue when a symbol table is broken.
When we are unable to get the string table and try to print symbols,
the existing tool logic together with D82955 reports an error:

"st_name (0x??) is past the end of the string table of size 0x??"

Though, when there is no string table, this message becomes misleading and excessive.
It is easy to fix it though and that is what this patch does.

Differential revision: https://reviews.llvm.org/D83042

4 years ago[llvm-readelf] - Do not error out when dumping symbols.
Georgii Rymar [Wed, 1 Jul 2020 12:25:33 +0000 (15:25 +0300)]
[llvm-readelf] - Do not error out when dumping symbols.

When the --symbols or --dyn-symbols option is given we might report an
error and exit when something goes wrong, e.g. when the SHT_SYMTAB
section is broken. In many cases we could instead report a warning and
try to continue dumping.

This patch removes `unwrapOrErr` calls from the code involved in the
flow described.

Differential revision: https://reviews.llvm.org/D82955

4 years ago[Alignment][NFC] Use 5 bits to store Instructions Alignment
Guillaume Chatelet [Fri, 3 Jul 2020 08:54:27 +0000 (08:54 +0000)]
[Alignment][NFC] Use 5 bits to store Instructions Alignment

As per [MaxAlignmentExponent]{https://github.com/llvm/llvm-project/blob/b7338fb1a6a464472850211165391983d2c8fdf3/llvm/include/llvm/IR/Value.h#L688} alignment is not allowed to be more than 2^29.
Encoded as Log2, this means that storing alignment uses 5 bits.
This patch makes sure all instructions store their alignment in a consistent way, encoded as Log2 and using 5 bits.

Differential Revision: https://reviews.llvm.org/D83119
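
The arithmetic behind the 5-bit claim can be sketched as follows: the largest exponent is 29, and 5 bits hold values 0 through 31. The constant and function names here are hypothetical and are not the helpers the patch itself uses.

```cpp
#include <cassert>
#include <cstdint>

// Per the commit message, alignment is at most 2^29.
constexpr unsigned kMaxAlignmentExponent = 29;
// Number of bits reserved for the Log2-encoded alignment.
constexpr unsigned kAlignmentBits = 5;

// 5 bits can represent 0..31, which covers every exponent up to 29.
static_assert(kMaxAlignmentExponent < (1u << kAlignmentBits),
              "the largest exponent must fit in the bit field");

// Encode a power-of-two alignment as its base-2 logarithm.
inline unsigned encodeAlignment(uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 && "must be a power of two");
  unsigned log2 = 0;
  while ((align >> log2) != 1)
    ++log2;
  assert(log2 <= kMaxAlignmentExponent && "alignment too large");
  return log2;
}

// Decode the stored exponent back into the alignment in bytes.
inline uint64_t decodeAlignment(unsigned encoded) {
  assert(encoded <= kMaxAlignmentExponent && "invalid encoded alignment");
  return uint64_t{1} << encoded;
}
```

For example, an alignment of 16 bytes is stored as 4, and the stored value 29 decodes to 536870912 bytes.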

4 years ago[flang][NFC] Move and rework pgmath description used in folding
Jean Perier [Fri, 3 Jul 2020 08:01:35 +0000 (10:01 +0200)]
[flang][NFC] Move and rework pgmath description used in folding

This change prepares usage of the libpgmath description in lowering.
- Removes the static variable templates that were used to abstract
  libpgmath description
- Move the description to pgmath.h.inc header and rework the macros
  so that they can both be used to declare pgmath functions and use
  them.
  The way they are to be used is left to pgmath.h.inc user that
  must define PGMATH_USE_XX macros that will be called for all pgmath
  functions in pgmath.h.inc.
- In intrinsic-library.cpp define PGMATH_USE_XX macro callbacks in
  order to capture function pointers to pgmath functions as well as
  a description of their type. This will be used for constant folding
  using pgmath.
- Change atan/atan2 handling to use atan2 instead of atan when there are two
  arguments, because it is easier to handle in the runtime description.

Also fixes libpgmath linking regression after D78215 cmake changes.

This change is motivated by the need to use a similar pgmath
description in lowering. The difference is that no function pointers will
be taken there, and instead only the function name and type are needed.

Reviewed By: schweitz, sscalpone

Differential Revision: https://reviews.llvm.org/D83051

4 years ago[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and Constan...
Guillaume Chatelet [Fri, 3 Jul 2020 08:06:43 +0000 (08:06 +0000)]
[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082

4 years ago[OpenMP][OMPT]Add event callbacks for taskwait with depend
Joachim Protze [Mon, 15 Jun 2020 22:31:14 +0000 (00:31 +0200)]
[OpenMP][OMPT]Add event callbacks for taskwait with depend

This adds the missing event callbacks to express dependencies on included tasks
and taskwait with depend clause.

The test fails for GCC, see bug report:
https://bugs.llvm.org/show_bug.cgi?id=46573

Reviewed by: hbae

Differential Revision: https://reviews.llvm.org/D81891

4 years ago[Attributor] Create getName() method for abstract attribute
Luofan Chen [Fri, 3 Jul 2020 07:22:18 +0000 (15:22 +0800)]
[Attributor] Create getName() method for abstract attribute

Summary: The `getName()` method returns the name of the abstract attribute

Reviewers: jdoerfert, sstefan1, uenoku, homerdin, baziotis

Reviewed By: sstefan1

Subscribers: uenoku, kuter, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83109

4 years agoFix stack-clash probing for large static alloca
serge-sans-paille [Tue, 30 Jun 2020 12:04:00 +0000 (14:04 +0200)]
Fix stack-clash probing for large static alloca

Differential Revision: https://reviews.llvm.org/D82867

4 years ago[NFC] Use ADT/Bitfields in Instructions
Guillaume Chatelet [Fri, 3 Jul 2020 07:20:22 +0000 (07:20 +0000)]
[NFC] Use ADT/Bitfields in Instructions

This is an example patch for D81580.

Differential Revision: https://reviews.llvm.org/D81662

4 years ago[X86] Remove MODRM_SPLITREGM from the disassembler tables.
Craig Topper [Fri, 3 Jul 2020 07:13:59 +0000 (00:13 -0700)]
[X86] Remove MODRM_SPLITREGM from the disassembler tables.

This offers a very minor table size reduction due to only being
used for one AMX opcode.

4 years ago[clang] Check ValueDependent instead of InstantiationDependent before executing the...
Haojian Wu [Fri, 3 Jul 2020 06:54:36 +0000 (08:54 +0200)]
[clang] Check ValueDependent instead of InstantiationDependent before executing the align expr for builtin align functions.

In general, value dependent is a subset of instantiation dependent. This
allows us to produce diagnostics for the align expression (which
is instantiation dependent but not value dependent).

Differential Revision: https://reviews.llvm.org/D83074

4 years ago[CostModel] Fix cast crash
Sam Parker [Thu, 2 Jul 2020 11:13:23 +0000 (12:13 +0100)]
[CostModel] Fix cast crash

Don't presume instruction operands while matching reductions.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46430

Differential Revision: https://reviews.llvm.org/D82453

4 years ago[PowerPC] Implement probing for dynamic stack allocation
Kai Luo [Fri, 3 Jul 2020 05:27:25 +0000 (05:27 +0000)]
[PowerPC] Implement probing for dynamic stack allocation

This patch is part of supporting `-fstack-clash-protection`. Compared to the
existing `lowerDynamicAlloc`, it mainly does the following:

- Adds a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get the
  actual frame pointer and final stack pointer.
- Synthesizes a loop to probe by blocks.
- Uses DYNAREAOFFSET to get MaxCallFrameSize, which is calculated in
  prologepilog.

Differential Revision: https://reviews.llvm.org/D81358

4 years ago[X86] Add back support for matching VPTERNLOG from back to back logic ops.
Craig Topper [Fri, 3 Jul 2020 05:11:52 +0000 (22:11 -0700)]
[X86] Add back support for matching VPTERNLOG from back to back logic ops.

I think this mostly looks ok. The only weird thing I noticed was
a couple rotate vXi8 tests picked up an extra logic op where we have

(and (or (and), (andn)), X). Previously we matched the (or (and), (andn))
to vpternlog, but now we match the (and (or), X) and leave the and/andn
unmatched.

4 years ago[AMDGPU] Insert PS early exit at end of control flow
Carl Ritson [Fri, 3 Jul 2020 03:25:33 +0000 (12:25 +0900)]
[AMDGPU] Insert PS early exit at end of control flow

Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82737

4 years ago[PowerPC][NFC] Prevent unused error when assertion is disabled.
Kai Luo [Fri, 3 Jul 2020 04:17:01 +0000 (04:17 +0000)]
[PowerPC][NFC] Prevent unused error when assertion is disabled.

4 years ago[lld-macho] Support binding dysyms to any section
Jez Ng [Fri, 3 Jul 2020 04:19:55 +0000 (21:19 -0700)]
[lld-macho] Support binding dysyms to any section

Previously, we only supported binding dysyms to the GOT. This
diff adds support for binding them to any arbitrary section. C++
programs appear to use this, I believe for vtables and type_info.

This diff also makes our bind opcode encoding a bit smarter -- we now
encode just the differences between bindings, which will make things
more compact.

I was initially concerned about the performance overhead of iterating
over these relocations, but it turns out that the number of such
relocations is small. A quick analysis of my llvm-project build
directory showed that < 1.3% out of ~7M relocations are RELOC_UNSIGNED
bindings to symbols (including both dynamic and static symbols).

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D83103
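
The idea of encoding only the differences between consecutive bindings can be sketched generically as below. The `Binding` fields and the textual opcode names are hypothetical and chosen for illustration; they are not lld's data structures or the real Mach-O bind opcodes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one binding: which symbol, and where to bind it.
struct Binding {
  int symbol;       // symbol index (illustrative)
  unsigned segment; // target segment index
  uint64_t offset;  // offset within that segment
};

// Emit an opcode only for fields that changed since the previous binding.
inline std::vector<std::string>
encodeBindings(const std::vector<Binding> &bindings) {
  std::vector<std::string> ops;
  bool first = true;
  Binding last{};
  for (const Binding &b : bindings) {
    if (first || b.symbol != last.symbol)
      ops.push_back("SET_SYMBOL " + std::to_string(b.symbol));
    if (first || b.segment != last.segment)
      ops.push_back("SET_SEGMENT " + std::to_string(b.segment));
    if (first || b.offset != last.offset)
      ops.push_back("SET_OFFSET " + std::to_string(b.offset));
    ops.push_back("BIND"); // perform the bind with the current state
    last = b;
    first = false;
  }
  return ops;
}
```

With three bindings that share a segment and mostly share a symbol, this emits 8 opcodes, where restating every field for every binding would emit 12.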

4 years agoRevert "[AMDGPU] Insert PS early exit at end of control flow"
Carl Ritson [Fri, 3 Jul 2020 04:03:33 +0000 (13:03 +0900)]
Revert "[AMDGPU] Insert PS early exit at end of control flow"

This reverts commit 2bfcacf0ad362956277a1c2c9ba00ddc453a42ce.

There appears to be an issue with analysis preservation.

4 years ago[PowerPC][NFC] Refactor lowerDynamicAlloc
Kai Luo [Fri, 3 Jul 2020 03:30:38 +0000 (03:30 +0000)]
[PowerPC][NFC] Refactor lowerDynamicAlloc

When performing dynamic stack allocation, calculation of frame pointer
and actual negsize can be separated. This patch refactors
`lowerDynamicAlloc` in preparation of supporting
`-fstack-clash-protection` which also has to calculate actual frame
pointer and negsize.

Differential Revision: https://reviews.llvm.org/D81354

4 years ago[AMDGPU] Insert PS early exit at end of control flow
Carl Ritson [Fri, 3 Jul 2020 03:25:33 +0000 (12:25 +0900)]
[AMDGPU] Insert PS early exit at end of control flow

Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82737

4 years ago[AMDGPU] Unify early PS termination blocks
Carl Ritson [Fri, 3 Jul 2020 00:57:42 +0000 (09:57 +0900)]
[AMDGPU] Unify early PS termination blocks

Generate a single early exit block out-of-line and branch to this
if all lanes are killed. This avoids branching if lanes are active.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82641

4 years ago[flang] External I/O runtime work, repackaged (part 2)
peter klausler [Thu, 2 Jul 2020 22:51:07 +0000 (15:51 -0700)]
[flang] External I/O runtime work, repackaged (part 2)

Clean up the input editing path so external input works better
when combined with further changes.  List-directed input needed
to allow for advancement to following records.

Reviewed By: tskeith, sscalpone

Differential Revision: https://reviews.llvm.org/D83104

4 years ago[NFC][Scalarizer] Also scalarize loads in newly-added tests
Roman Lebedev [Thu, 2 Jul 2020 23:37:29 +0000 (02:37 +0300)]
[NFC][Scalarizer] Also scalarize loads in newly-added tests

Should help better showcase improvements

4 years ago[NFC][Scalarizer] Add some insertelement/extractelement tests
Roman Lebedev [Thu, 2 Jul 2020 22:53:13 +0000 (01:53 +0300)]
[NFC][Scalarizer] Add some insertelement/extractelement tests

See D82961/D82970/D83101/D83102.

4 years ago[gn build] get everything to build when llvm_targets_to_build is just AArch64
Nico Weber [Thu, 2 Jul 2020 22:52:05 +0000 (18:52 -0400)]
[gn build] get everything to build when llvm_targets_to_build is just AArch64

4 years ago[X86] Teach lower512BitShuffle to try bitmask and bitblend before splitting v32i16...
Craig Topper [Thu, 2 Jul 2020 22:35:47 +0000 (15:35 -0700)]
[X86] Teach lower512BitShuffle to try bitmask and bitblend before splitting v32i16/v64i8 on avx512f-only targets.

We consider v32i16/v64i8 to be legal types on avx512f, but we
don't have most operations until avx512bw. But we can use
and/or/xor operations. So try those before splitting.

This is especially helpful since we turn some ands with constant
masks into shuffles in early DAG combines. So we should make sure
we recover those back to AND.
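
A minimal sketch of the bitmask case only, assuming a single-input shuffle in which every lane either keeps its own element or is zeroed: such a shuffle is equivalent to an AND with a constant mask. The `kZeroLane` marker and the function names are hypothetical and do not mirror the X86 lowering code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical marker for "this lane is zero" in a shuffle mask.
constexpr int kZeroLane = -1;

// If every lane is either the identity index or kZeroLane, fill andMask with
// the equivalent per-lane AND constant and return true; otherwise return false.
template <std::size_t N>
bool shuffleToAndMask(const std::array<int, N> &shuffle,
                      std::array<uint16_t, N> &andMask) {
  for (std::size_t i = 0; i < N; ++i) {
    if (shuffle[i] == kZeroLane)
      andMask[i] = 0x0000; // lane is cleared
    else if (shuffle[i] == static_cast<int>(i))
      andMask[i] = 0xFFFF; // lane passes through unchanged
    else
      return false;        // lane moves, so a plain AND cannot express it
  }
  return true;
}

// Apply the mask lane by lane, as a vector AND would.
template <std::size_t N>
std::array<uint16_t, N> applyAnd(const std::array<uint16_t, N> &v,
                                 const std::array<uint16_t, N> &mask) {
  std::array<uint16_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = static_cast<uint16_t>(v[i] & mask[i]);
  return out;
}
```

A shuffle that swaps two lanes is rejected by `shuffleToAndMask`, which is the situation where other strategies, such as the bitblend mentioned above or splitting, would be needed.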

4 years ago[flang] External I/O runtime work, repackaged (part 1)
peter klausler [Thu, 2 Jul 2020 21:11:14 +0000 (14:11 -0700)]
[flang] External I/O runtime work, repackaged (part 1)

Add an isFixedRecordLength flag member to Connection to
disambiguate the state of "record has known variable length"
from "record has fixed length".  Code that sets and tests this
flag will appear in later patches.  Rearrange data members to
reduce storage requirements, since Connection might indirectly
end up on a program stack frame.  Add a utility member function
BeginRecord(); use it in internal I/O processing.

Reviewed By: tskeith, sscalpone

Differential Revision: https://reviews.llvm.org/D83098

4 years ago[PowerPC] Implement Vector Blend Builtins in LLVM/Clang
Biplob Mishra [Thu, 2 Jul 2020 21:14:27 +0000 (16:14 -0500)]
[PowerPC] Implement Vector Blend Builtins in LLVM/Clang

Implements vec_blendv()

Differential Revision: https://reviews.llvm.org/D82774

4 years agoFix typo and check commit access.
Sameer Arora [Thu, 2 Jul 2020 21:42:01 +0000 (14:42 -0700)]
Fix typo and check commit access.

4 years ago[x86] remove redundant tests with no check lines; NFC
Sanjay Patel [Thu, 2 Jul 2020 21:45:57 +0000 (17:45 -0400)]
[x86] remove redundant tests with no check lines; NFC

These were accidentally included with:
rGb93e6650c8ac

4 years ago[SelectionDAG] don't split branch on logic-of-vector-compares
Sanjay Patel [Thu, 2 Jul 2020 20:48:09 +0000 (16:48 -0400)]
[SelectionDAG] don't split branch on logic-of-vector-compares

SelectionDAGBuilder converts logic-of-compares into multiple branches based
on a boolean TLI setting in isJumpExpensive(). But that probably never
considered the pattern of extracted bools from a vector compare - it seems
unlikely that we would want to turn vector logic into control-flow.

The motivating x86 reduction case is shown in PR44565:
https://bugs.llvm.org/show_bug.cgi?id=44565
...and that test shows the expected improvement from using pmovmsk codegen.

For AArch64, I modified the test to include an extra op because the simpler
test gets transformed by a codegen invocation of SimplifyCFG.

Differential Revision: https://reviews.llvm.org/D82602

4 years ago[PowerPC]Add Vector Insert Instruction Definitions and MC Test
Amy Kwan [Thu, 2 Jul 2020 20:15:14 +0000 (15:15 -0500)]
[PowerPC]Add Vector Insert Instruction Definitions and MC Test

Adds td definitions and asm/disasm tests for the following instructions:

  VINSBVLX
  VINSBVRX
  VINSHVLX
  VINSHVRX
  VINSWVLX
  VINSWVRX
  VINSBLX
  VINSBRX
  VINSHLX
  VINSHRX
  VINSWLX
  VINSWRX
  VINSDLX
  VINSDRX
  VINSW
  VINSD

Differential Revision: https://reviews.llvm.org/D83052

4 years ago[X86] Add vpternlog to the broadcast unfolding table.
Craig Topper [Thu, 2 Jul 2020 20:43:27 +0000 (13:43 -0700)]
[X86] Add vpternlog to the broadcast unfolding table.

4 years ago[X86] Add test case for unfolding broadcast load from vpternlog.
Craig Topper [Thu, 2 Jul 2020 20:29:30 +0000 (13:29 -0700)]
[X86] Add test case for unfolding broadcast load from vpternlog.

4 years ago[test] Deflake test/profile/ContinuousSyncMode/online-merging.c
Vedant Kumar [Thu, 2 Jul 2020 20:28:48 +0000 (13:28 -0700)]
[test] Deflake test/profile/ContinuousSyncMode/online-merging.c

This test spawns 32 child processes which race to update counters on
shared memory pages. On some Apple-internal machines, two processes race
to perform an update in approximately 0.5% of the test runs, leading to
dropped counter updates. Deflake the test by using atomic increments.

Tested with:

```
$ for I in $(seq 1 1000); do echo ":: Test run $I..."; ./bin/llvm-lit projects/compiler-rt/test/profile/Profile-x86_64h/ContinuousSyncMode/online-merging.c -av || break; done
```

rdar://64956774
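
The effect of switching to atomic increments can be sketched as follows. This version uses threads within one process rather than child processes updating shared memory pages, so it is an analogy for the fix and not the test's actual code; the function name is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Each worker bumps a shared counter. fetch_add is an atomic read-modify-write,
// so concurrent increments cannot overwrite each other and none are dropped.
inline uint64_t countWithAtomicIncrements(unsigned workers,
                                          unsigned incrementsPerWorker) {
  std::atomic<uint64_t> counter{0};
  std::vector<std::thread> threads;
  threads.reserve(workers);
  for (unsigned w = 0; w < workers; ++w) {
    threads.emplace_back([&counter, incrementsPerWorker] {
      for (unsigned i = 0; i < incrementsPerWorker; ++i)
        counter.fetch_add(1, std::memory_order_relaxed);
    });
  }
  for (std::thread &t : threads)
    t.join();
  return counter.load();
}
```

With a plain `counter++` on a non-atomic variable, two workers can read the same old value and both write back old + 1, which loses an update. That is the kind of dropped counter update the commit message describes.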

4 years ago[InstSimplify] Add test for sext/zext comparisons (NFC)
Nikita Popov [Thu, 2 Jul 2020 20:21:23 +0000 (22:21 +0200)]
[InstSimplify] Add test for sext/zext comparisons (NFC)

4 years ago[mlir] [VectorOps] Add choice between dot and axpy lowering of vector.contract
aartbik [Thu, 2 Jul 2020 20:21:14 +0000 (13:21 -0700)]
[mlir] [VectorOps] Add choice between dot and axpy lowering of vector.contract

Default vector.contract lowering essentially yields a series of sdot/ddot
operations. However, for some layouts a series of saxpy/daxpy operations,
chained through fma, is more efficient. This CL introduces a choice between
the two lowering paths. A default heuristic is to follow.

Some preliminary avx2 performance numbers for matrix-times-vector.
Here, dot performs best for 64x64 A x b and saxpy for 64x64 A^T x b.

```
------------------------------------------------------------
            A x b                          A^T x b
------------------------------------------------------------
GFLOPS    sdot (reassoc)    saxpy    sdot (reassoc)    saxpy
------------------------------------------------------------
1x1        0.6               0.9       0.6             0.9
2x2        2.5               3.2       2.4             3.5
4x4        6.4               8.4       4.9             11.8
8x8       11.7               6.1       5.0             29.6
16x16     20.7              10.8       7.3             43.3
32x32     29.3               7.9       6.4             51.8
64x64     38.9                                         79.3
128x128   32.4                                         40.7
------------------------------------------------------------
```

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D83012
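
The difference between the two lowering strategies can be sketched in plain scalar code for a matrix-times-vector product. This only illustrates the loop structure of each approach; it is not the MLIR lowering itself, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // row-major, rows x cols

// Dot style: each output element is one dot product of a row of A with b.
inline std::vector<float> matVecDot(const Matrix &a,
                                    const std::vector<float> &b) {
  std::vector<float> y(a.size(), 0.0f);
  for (std::size_t i = 0; i < a.size(); ++i) {
    float acc = 0.0f;
    for (std::size_t j = 0; j < b.size(); ++j)
      acc += a[i][j] * b[j]; // reduction along the row
    y[i] = acc;
  }
  return y;
}

// Axpy style: for each column j, update the whole result with
// y += b[j] * A[:, j]. Each update is a chain of multiply-adds on y.
inline std::vector<float> matVecAxpy(const Matrix &a,
                                     const std::vector<float> &b) {
  std::vector<float> y(a.size(), 0.0f);
  for (std::size_t j = 0; j < b.size(); ++j)
    for (std::size_t i = 0; i < a.size(); ++i)
      y[i] += b[j] * a[i][j]; // scaled column added into y
  return y;
}
```

Both functions compute the same result. They differ in which dimension is reduced in the inner loop, which is why the memory layout of A can favor one over the other.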