platform/upstream/llvm.git
3 years agoUse zu rather than llu format specifier for size_t (-Wformat warning fix).
Eric Christopher [Wed, 16 Sep 2020 22:52:50 +0000 (15:52 -0700)]
Use zu rather than llu format specifier for size_t (-Wformat warning fix).
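
A minimal sketch of the kind of warning this fixes, assuming a plain `snprintf` call: `size_t` is not guaranteed to be `unsigned long long`, so `%llu` can trip -Wformat, while `%zu` is the portable length modifier. The helper name `formatCount` is hypothetical and does not come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a size_t with the portable %zu specifier.
// Using %llu here would warn on targets where size_t is not
// unsigned long long.
std::string formatCount(std::size_t Count) {
  char Buffer[32];
  int Written = std::snprintf(Buffer, sizeof(Buffer), "%zu", Count);
  // snprintf returns the number of characters it wanted to write.
  assert(Written > 0 && static_cast<std::size_t>(Written) < sizeof(Buffer));
  return std::string(Buffer, static_cast<std::size_t>(Written));
}
```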

3 years ago[PowerPC] Fix store-fptoi combine of f128 on Power8
Qiu Chaofan [Thu, 17 Sep 2020 02:19:09 +0000 (10:19 +0800)]
[PowerPC] Fix store-fptoi combine of f128 on Power8

llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8. We should
not generate FP_TO_(S|U)INT_IN_VSR for f128 types at this time. This
patch fixes it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D86686

3 years ago[MachineSink] add one more mir case - nfc
Chen Zheng [Thu, 17 Sep 2020 01:51:53 +0000 (21:51 -0400)]
[MachineSink] add one more mir case - nfc

3 years ago[libunwind][DWARF] Fix end of .eh_frame calculation
Ryan Prichard [Wed, 16 Sep 2020 08:22:55 +0000 (01:22 -0700)]
[libunwind][DWARF] Fix end of .eh_frame calculation

 * When .eh_frame is located using .eh_frame_hdr (PT_GNU_EH_FRAME), the
   start of .eh_frame is known, but not the size. In this case, the
   unwinder must rely on a terminator present at the end of .eh_frame.
   Set dwarf_section_length to UINTPTR_MAX to indicate this.

 * Add a new field, text_segment_length, that the FrameHeaderCache uses
   to track the size of the PT_LOAD segment indicated by dso_base.

 * Compute ehSectionEnd by adding sectionLength to ehSectionStart,
   never to fdeHint.

Fixes PR46829.

Differential Revision: https://reviews.llvm.org/D87750
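
The end-of-section rule in the bullets above can be sketched as follows. This is a simplified illustration, not the libunwind code: the struct `EhFrameInfo` and the function `ehSectionEnd` are hypothetical names, and the only behavior taken from the commit message is that an unknown length is represented by UINTPTR_MAX and that the end is computed from the section start rather than from a hint address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for what the unwinder knows about
// .eh_frame.
struct EhFrameInfo {
  std::uintptr_t Start;  // start address of .eh_frame
  std::uintptr_t Length; // UINTPTR_MAX: size unknown, rely on the terminator
};

// Computes the section end from the section start and length. When the
// length is unknown, the end saturates so that parsing continues until the
// terminator is found.
std::uintptr_t ehSectionEnd(const EhFrameInfo &Info) {
  if (Info.Length == UINTPTR_MAX)
    return UINTPTR_MAX;
  assert(Info.Start <= UINTPTR_MAX - Info.Length && "section would wrap");
  return Info.Start + Info.Length;
}
```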

3 years ago[gn build] Port b04c1a9d312
LLVM GN Syncbot [Thu, 17 Sep 2020 01:54:10 +0000 (01:54 +0000)]
[gn build] Port b04c1a9d312

3 years ago[mlir] expose affine map to C API
zhanghb97 [Mon, 14 Sep 2020 14:52:22 +0000 (22:52 +0800)]
[mlir] expose affine map to C API

This patch provides C API for MLIR affine map.
- Implement C API for AffineMap class.
- Add Utils.h to include/mlir/CAPI/, and move the definition of the CallbackOstream to Utils.h to make sure mlirAffineMapPrint works correctly.
- Add TODO for exposing the C API related to AffineExpr and mutable affine map.

Differential Revision: https://reviews.llvm.org/D87617

3 years ago[IRSim] Adding IR Instruction Mapper
Andrew Litteken [Thu, 17 Sep 2020 01:24:29 +0000 (20:24 -0500)]
[IRSim] Adding IR Instruction Mapper

This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.

Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions.  If it returns true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
to an unsigned integer.  The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.

At present, branches and phi nodes are not mapped, and exception handling
is illegal.  Debug instructions are not considered.

The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp

Differential Revision: https://reviews.llvm.org/D86968
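
The mapping scheme described above can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyInst` and `ToyInstructionMapper` are hypothetical types, and an ordered map keyed on opcode, result type, and operand types stands in for the real hashing and the "isSameOperationAs" comparison.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an IR instruction, holding only the properties
// the commit message names as inputs to the mapping.
struct ToyInst {
  unsigned Opcode;
  std::string ResultType;
  std::vector<std::string> OperandTypes;
};

class ToyInstructionMapper {
  using Key = std::tuple<unsigned, std::string, std::vector<std::string>>;
  std::map<Key, unsigned> Ids; // structurally equal instructions share an id

public:
  std::vector<unsigned> IntegerList;    // one integer per mapped instruction
  std::vector<ToyInst> InstructionList; // wrapped instructions, kept separately

  // Maps every instruction in the range and appends to both lists.
  void mapRange(const std::vector<ToyInst> &Range) {
    for (const ToyInst &I : Range) {
      Key K{I.Opcode, I.ResultType, I.OperandTypes};
      // emplace keeps the existing id if the key was seen before.
      auto It = Ids.emplace(K, static_cast<unsigned>(Ids.size())).first;
      IntegerList.push_back(It->second);
      InstructionList.push_back(I);
    }
    assert(IntegerList.size() == InstructionList.size());
  }
};
```

Two instructions with the same opcode and types receive the same integer, so a sequence of instructions becomes a sequence of integers that can be compared cheaply.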

3 years ago[NewPM] Port -print-alias-sets to NPM
Arthur Eubanks [Tue, 15 Sep 2020 02:01:38 +0000 (19:01 -0700)]
[NewPM] Port -print-alias-sets to NPM

Really it should be named print<alias-sets>, but for the sake of
changing fewer tests, added a TODO to rename after NPM switch and test
cleanup.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87713

3 years agoPR47555: Inheriting constructors are implicitly definable.
Richard Smith [Thu, 17 Sep 2020 01:08:03 +0000 (18:08 -0700)]
PR47555: Inheriting constructors are implicitly definable.

Don't forget to define them if they're constexpr and used inside a
template; we might try to evaluate a call to them before the template is
instantiated.

3 years agoCanonicalize declaration pointers when forming APValues.
Richard Smith [Wed, 2 Sep 2020 21:42:37 +0000 (14:42 -0700)]
Canonicalize declaration pointers when forming APValues.

References to different declarations of the same entity aren't different
values, so shouldn't have different representations.

Recommit of e6393ee813178e9d3306b8e3c6949a4f32f8a2cb with fixed
handling for weak declarations. We now look for attributes on the most
recent declaration when determining whether a declaration is weak.

3 years ago[MemorySSA] Rename uses in blocks with Phis.
Alina Sbirlea [Tue, 15 Sep 2020 01:07:44 +0000 (18:07 -0700)]
[MemorySSA] Rename uses in blocks with Phis.

Renaming should include blocks with existing Phis.

Resolves PR45927.

Differential Revision: https://reviews.llvm.org/D87661

3 years ago[DAGCombiner] Teach visitMSTORE to replace an all ones mask with an unmasked store.
Craig Topper [Wed, 16 Sep 2020 23:37:36 +0000 (16:37 -0700)]
[DAGCombiner] Teach visitMSTORE to replace an all ones mask with an unmasked store.

Similar to what was done in D87788 for MLOAD.

Again I've skipped indexed, truncating, and compressing stores.

3 years ago[AArch64] Add -mmark-bti-property flag.
Daniel Kiss [Wed, 16 Sep 2020 21:55:46 +0000 (23:55 +0200)]
[AArch64] Add -mmark-bti-property flag.

Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.

This patch refactors the .note.gnu.property handling.

Reviewed By: chill, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D81930

Reland with test dependency on aarch64 target.

3 years agoRevert "[AArch64] Add -mmark-bti-property flag."
Daniel Kiss [Wed, 16 Sep 2020 23:17:23 +0000 (01:17 +0200)]
Revert "[AArch64] Add -mmark-bti-property flag."

This reverts commit 95e43f84b7b9c61011aece7583c0367297dd67d8.

3 years agoCommenting out atomics with padding to unbreak MSAN tests
ogiroux [Wed, 16 Sep 2020 23:12:10 +0000 (16:12 -0700)]
Commenting out atomics with padding to unbreak MSAN tests

3 years ago[Flang] Fixed installation permission of the "binary" flang
Shilei Tian [Wed, 16 Sep 2020 22:54:11 +0000 (18:54 -0400)]
[Flang] Fixed installation permission of the "binary" flang

Under the current configuration, the permission of `flang` after installation is 700.
This is a problem for system administrators who build and install flang
for other users: only the user who built LLVM can execute it, and others
cannot. This patch removes the explicit permission setting in the `install` command
and lets CMake determine which permission to use, as for other components.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D87783

3 years ago[EarlyCSE] Simplify max/min pattern matching. NFC.
Michael Liao [Wed, 16 Sep 2020 22:21:10 +0000 (18:21 -0400)]
[EarlyCSE] Simplify max/min pattern matching. NFC.

3 years ago[gn build] (manually) port 1321160a2
Nico Weber [Wed, 16 Sep 2020 22:28:51 +0000 (18:28 -0400)]
[gn build] (manually) port 1321160a2

3 years ago[AArch64] Add -mmark-bti-property flag.
Daniel Kiss [Wed, 16 Sep 2020 21:55:46 +0000 (23:55 +0200)]
[AArch64] Add -mmark-bti-property flag.

Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.

This patch refactors the .note.gnu.property handling.

Reviewed By: chill, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D81930

3 years agoDisable a large test for EXPENSIVE_CHECKS and debug build
jasonliu [Wed, 16 Sep 2020 21:51:41 +0000 (21:51 +0000)]
Disable a large test for EXPENSIVE_CHECKS and debug build

Summary:
When running a large test in LLVM_ENABLE_EXPENSIVE_CHECKS=ON mode,
buildbot could hit timeout.
Disable the test when this mode is on.
Also disable it for debug so that the test won't hang for too long.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D87794

3 years ago[flang] Substrings with lower bound greater than upper bound
Peter Steinfeld [Wed, 16 Sep 2020 21:42:30 +0000 (14:42 -0700)]
[flang] Substrings with lower bound greater than upper bound

According to section 9.4.1, paragraph 3,
 If the starting point is greater than the ending point, the substring has
 length zero

But the compiler's code for substring processing was failing a call to `CHECK()`
in this case.  I fixed this by just setting the number of items in the
resulting string to 0 for this situation.

Differential Revision: https://reviews.llvm.org/D87799
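
The rule quoted above can be written as a small length computation. This is a sketch under the assumption of 1-based, inclusive bounds; the function name `substringLength` is hypothetical and is not taken from the flang sources.

```cpp
#include <cassert>
#include <cstdint>

// Length of a substring with inclusive bounds Lower:Upper. When the starting
// point is greater than the ending point, the substring has length zero
// instead of a negative length.
std::int64_t substringLength(std::int64_t Lower, std::int64_t Upper) {
  if (Lower > Upper)
    return 0;
  return Upper - Lower + 1;
}

// Small self-check that exercises both cases.
void checkSubstringLength() {
  assert(substringLength(2, 5) == 4);
  assert(substringLength(5, 2) == 0);
}
```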

3 years ago[libunwind] Support for leaf function unwinding.
Daniel Kiss [Wed, 16 Sep 2020 21:03:19 +0000 (23:03 +0200)]
[libunwind] Support for leaf function unwinding.

Unwinding a leaf function is useful when the backtrace finds a
leaf function, for example when it caused a signal.
This patch also adds support for DW_CFA_undefined because it marks
the end of the frames.

Ryan Prichard provided code for the tests.

Reviewed By: #libunwind, mstorsjo

Differential Revision: https://reviews.llvm.org/D83573

3 years ago[NFC] Refactor DiagnosticBuilder and PartialDiagnostic
Yaxun (Sam) Liu [Wed, 22 Jul 2020 19:31:53 +0000 (15:31 -0400)]
[NFC] Refactor DiagnosticBuilder and PartialDiagnostic

PartialDiagnostic misses some functions compared to DiagnosticBuilder.

This patch refactors DiagnosticBuilder and PartialDiagnostic, extracts
the common functionality so that the streaming << operators are
shared.

Differential Revision: https://reviews.llvm.org/D84362

3 years ago[lldb/test] Enable faulthandler in dotest
Jordan Rupprecht [Wed, 16 Sep 2020 21:26:40 +0000 (14:26 -0700)]
[lldb/test] Enable faulthandler in dotest

Register the `faulthandler` module so we can see what lldb tests are doing when they misbehave (e.g. run under a test runner that sets a timeout). This will print a stack trace for the following signals:

- `SIGSEGV`, `SIGFPE`, `SIGABRT`, `SIGBUS`, and `SIGILL` (via `faulthandler.enable()`)
- `SIGTERM` (via `faulthandler.register(SIGTERM)`) [This is what our test runner sends when it times out].

The only signal we currently handle is `SIGINT` (via `unittest2.signals.installHandler()`) so there should be no overlap added by this patch.

Because this import is not available until python3, and the `register()` method is not available on Windows, this is enabled defensively.

This should have absolutely no effect when tests are passing (or even normally failing), but can be observed by running this while ninja is running:

```
kill -s SIGTERM $(ps aux | grep dotest.py | head -1 | awk '{print $2}')
```

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D87637

3 years ago[obj2yaml] - Match ".stack_size" with the original section name, and not the uniquifi...
Rahman Lavaee [Wed, 16 Sep 2020 21:17:02 +0000 (14:17 -0700)]
[obj2yaml] - Match ".stack_size" with the original section name, and not the uniquified name.

Without this patch, obj2yaml decodes the content of only one ".stack_size" section. Other sections are dumped with their full contents.

Reviewed By: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D87727

3 years ago[NFC][regalloc] type LiveInterval::reg() as Register
Mircea Trofin [Wed, 16 Sep 2020 15:36:58 +0000 (08:36 -0700)]
[NFC][regalloc] type LiveInterval::reg() as Register

We have the Register type which precisely captures the role of this
member. Storage-wise, it's an unsigned.

This helps readability & maintainability.

Differential Revision: https://reviews.llvm.org/D87768
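
As a rough illustration of why a dedicated type helps readability while costing nothing in storage, here is a toy wrapper. `ToyRegister` and `ToyLiveInterval` are hypothetical names, and treating 0 as "no register" is an assumption of this sketch, not something stated in the commit message.

```cpp
#include <cassert>

// Hypothetical thin wrapper around an unsigned register number.
class ToyRegister {
  unsigned Reg;

public:
  constexpr ToyRegister(unsigned Val = 0) : Reg(Val) {}
  constexpr unsigned id() const { return Reg; }
  // Assumption of this sketch: 0 means "no register".
  constexpr bool isValid() const { return Reg != 0; }
};

// Storage-wise the wrapper is still just an unsigned.
static_assert(sizeof(ToyRegister) == sizeof(unsigned),
              "wrapper must not add storage");

// The accessor now documents its role through its return type.
class ToyLiveInterval {
  ToyRegister Reg;

public:
  explicit ToyLiveInterval(ToyRegister R) : Reg(R) {}
  ToyRegister reg() const { return Reg; }
};
```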

3 years ago[ELF] Bump the limit of thunk creation passes from 10 to 15
Fangrui Song [Wed, 16 Sep 2020 21:03:34 +0000 (14:03 -0700)]
[ELF] Bump the limit of thunk creation passes from 10 to 15

I have noticed that a 374MiB powerpc64le 'ld.lld' requires 11 passes to link.
There is a ThunkSection (whose parent OutputSection is ".text" of 169MiB) with 12867 thunks.

3 years ago[NFC][LSAN] Change SuspendedThreadsList interface
Vitaly Buka [Wed, 16 Sep 2020 08:14:55 +0000 (01:14 -0700)]
[NFC][LSAN] Change SuspendedThreadsList interface

Remove RegisterCount and let GetRegistersAndSP resize the buffer as needed.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D87747

3 years ago[AMDGPU] gfx1030 test update. NFC.
Stanislav Mekhanoshin [Wed, 16 Sep 2020 20:43:45 +0000 (13:43 -0700)]
[AMDGPU] gfx1030 test update. NFC.

3 years agoRevert "Do not apply calling conventions to MSVC entry points"
Amy Huang [Wed, 16 Sep 2020 20:51:36 +0000 (13:51 -0700)]
Revert "Do not apply calling conventions to MSVC entry points"

This reverts commit 4cff1b40dacf6a5489b09657d94ea4757b8cd3b0.

Caused "undefined symbol: _WinMain@16" link errors.

3 years ago[ORC] Add operations to create and lookup JITDylibs to OrcV2 C bindings.
Lang Hames [Wed, 16 Sep 2020 20:46:55 +0000 (13:46 -0700)]
[ORC] Add operations to create and lookup JITDylibs to OrcV2 C bindings.

3 years agoRevert "[lsan] Share platform allocator settings between ASan and LSan"
Petr Hosek [Wed, 16 Sep 2020 20:48:19 +0000 (13:48 -0700)]
Revert "[lsan] Share platform allocator settings between ASan and LSan"

This reverts commit c57df3dc09e8b59c55c83ba5c354569a82a5c3b8 which broke
Windows sanitizer bots.

3 years ago[lsan] Share platform allocator settings between ASan and LSan
Petr Hosek [Wed, 16 Sep 2020 20:18:41 +0000 (13:18 -0700)]
[lsan] Share platform allocator settings between ASan and LSan

This moves the platform-specific parameter logic from asan into
sanitizer_common so lsan can reuse it.

Patch By: mcgrathr

Differential Revision: https://reviews.llvm.org/D85930

3 years ago[DAGCombiner] Teach visitMLOAD to replace an all ones mask with an unmasked load
Craig Topper [Wed, 16 Sep 2020 20:21:15 +0000 (13:21 -0700)]
[DAGCombiner] Teach visitMLOAD to replace an all ones mask with an unmasked load

If we have an all ones mask, we can just use a regular unmasked load. InstCombine already gets this in IR. But the all ones mask can appear after type legalization.

Only avx512 test cases are affected because X86 backend already looks for element 0 and the last element being 1. It replaces this with an unmasked load and blend. The all ones mask is a special case of that where the blend will be removed. That transform is only enabled on avx2 targets. I believe that's because a non-zero passthru on avx2 already requires a separate blend so it's more profitable to handle mixed constant masks.

This patch adds a dedicated all ones handling to the target independent DAG combiner. I've skipped extending, expanding, and index loads for now. X86 doesn't use index so I don't know much about it. Extending made me nervous because I wasn't sure I could trust the memory VT had the right element count due to some weirdness in vector splitting. For expanding I wasn't sure if we needed different undef handling.

Differential Revision: https://reviews.llvm.org/D87788
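
The idea behind the combine can be modelled outside the DAG with plain arrays. This is a toy sketch, not SelectionDAG code: `maskedLoad`, `isAllOnes`, and `combinedLoad` are hypothetical names, and the model ignores the extending, expanding, and indexed variants that the patch also skips.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy masked load: lane I comes from memory when Mask[I] is set, otherwise
// from the pass-through vector.
template <std::size_t N>
std::array<int, N> maskedLoad(const std::array<int, N> &Memory,
                              const std::array<bool, N> &Mask,
                              const std::array<int, N> &PassThru) {
  std::array<int, N> Result{};
  for (std::size_t I = 0; I < N; ++I)
    Result[I] = Mask[I] ? Memory[I] : PassThru[I];
  return Result;
}

template <std::size_t N> bool isAllOnes(const std::array<bool, N> &Mask) {
  for (bool Lane : Mask)
    if (!Lane)
      return false;
  return true;
}

// The combine in miniature: with an all ones mask every lane comes from
// memory, so the pass-through is dead and a plain load is equivalent.
template <std::size_t N>
std::array<int, N> combinedLoad(const std::array<int, N> &Memory,
                                const std::array<bool, N> &Mask,
                                const std::array<int, N> &PassThru) {
  if (isAllOnes(Mask))
    return Memory; // unmasked load
  return maskedLoad(Memory, Mask, PassThru);
}
```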

3 years ago[X86] Add test case for a masked load mask becoming all ones after type legalization.
Craig Topper [Wed, 16 Sep 2020 19:20:38 +0000 (12:20 -0700)]
[X86] Add test case for a masked load mask becoming all ones after type legalization.

We should be able to turn this into an unmasked load. X86 has an
optimization to detect that the first and last element aren't masked
and then turn the whole thing into an unmasked load and a blend.
That transform is disabled on avx512 though.

But if we know the blend isn't needed, then the unmasked load by
itself should always be profitable.

3 years ago [flang][msvc] Work around if constexpr (false) evaluation. NFC.
Michael Kruse [Wed, 16 Sep 2020 19:58:29 +0000 (14:58 -0500)]
 [flang][msvc] Work around if constexpr (false) evaluation. NFC.

MSVC tries to expand templates that are in the false-branch of an `if constexpr` construct. In this case, the condition checks whether a tuple has at least one element and then tries to access it using `std::get<0>`, which fails when the tuple has 0 elements.

The workaround is to extract that case into a separate method.

This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D87728
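
A sketch of the general shape of such a workaround, assuming C++17: the `std::get<0>` access lives in a separate function template, so the body of the `if constexpr` branch contains only a call. The names `headOf` and `firstOrDefault` are hypothetical and are not taken from the flang sources.

```cpp
#include <cassert>
#include <tuple>

// Separate function: std::get<0> is only instantiated when this template is
// actually instantiated for a non-empty tuple.
template <typename Tuple> int headOf(const Tuple &T) {
  return static_cast<int>(std::get<0>(T));
}

// Returns the first element of the tuple, or Default for an empty tuple.
template <typename... Ts>
int firstOrDefault(const std::tuple<Ts...> &T, int Default) {
  if constexpr (sizeof...(Ts) > 0) {
    return headOf(T);
  } else {
    return Default;
  }
}
```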

3 years ago[aarch64][tests] Add tests which show current lack of implicit null support
Philip Reames [Wed, 16 Sep 2020 19:54:15 +0000 (12:54 -0700)]
[aarch64][tests] Add tests which show current lack of implicit null support

I will be posting a patch which adds appropriate target support shortly; landing the tests so that the diffs are clear.

3 years ago[UpdateTestChecks] Allow $ in function names
David Greene [Thu, 23 Jan 2020 20:30:32 +0000 (14:30 -0600)]
[UpdateTestChecks] Allow $ in function names

Some compilers generate functions with '$' in their names, so recognize those
functions.

This also requires recognizing function names inside quotes in some contexts in
order to escape certain characters.

Differential Revision: https://reviews.llvm.org/D82995

3 years ago[gn build] Port 56069b5c71c
LLVM GN Syncbot [Wed, 16 Sep 2020 19:03:25 +0000 (19:03 +0000)]
[gn build] Port 56069b5c71c

3 years agoReapply [InstCombine] Simplify select operand based on equality condition
Nikita Popov [Thu, 10 Sep 2020 16:45:53 +0000 (18:45 +0200)]
Reapply [InstCombine] Simplify select operand based on equality condition

Reapply after fixing SimplifyWithOpReplaced() to never return
the original value, which would lead to an infinite loop in this
transform.

-----

For selects of the type X == Y ? A : B, check if we can simplify A
by using the X == Y equality and replace the operand if that's
possible. We already try to do this in InstSimplify, but will only
fold if the result of the simplification is the same as B, in which
case the select can be dropped entirely. Here the select will be
retained, just one operand simplified.

As we are performing an actual replacement here, we don't have
problems with refinement / poison values.

Differential Revision: https://reviews.llvm.org/D87480
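
The effect of the transform can be illustrated at source level. This is only an analogy written in C++ rather than LLVM IR, with hypothetical function names: in the true arm X may be replaced by Y, after which `(Y ^ Y) + 5` folds to `5`, and the select itself stays.

```cpp
#include <cassert>

// Original select: the true arm still mentions X.
unsigned selectOriginal(unsigned X, unsigned Y, unsigned B) {
  return X == Y ? (X ^ Y) + 5u : B;
}

// After substituting Y for X in the true arm and simplifying. The select is
// retained; only one operand became simpler.
unsigned selectSimplified(unsigned X, unsigned Y, unsigned B) {
  return X == Y ? 5u : B;
}

// Small self-check comparing both forms on equal and unequal inputs.
void checkSelectForms() {
  assert(selectOriginal(3, 3, 8) == selectSimplified(3, 3, 8));
  assert(selectOriginal(3, 4, 8) == selectSimplified(3, 4, 8));
}
```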

3 years ago[InstSimplify] Clarify SimplifyWithOpReplaced() return value
Nikita Popov [Wed, 16 Sep 2020 18:49:08 +0000 (20:49 +0200)]
[InstSimplify] Clarify SimplifyWithOpReplaced() return value

If SimplifyWithOpReplaced() cannot simplify the value, null should
be returned. Make sure this really does happen in all cases,
including those where SimplifyBinOp() returns the original value.

This does not matter for existing users, but does matter for
D87480, which would go into an infinite loop otherwise.

3 years ago[InstCombine] Add test for infinite combine loop (NFC)
Nikita Popov [Wed, 16 Sep 2020 16:27:55 +0000 (18:27 +0200)]
[InstCombine] Add test for infinite combine loop (NFC)

Test courtesy of bkramer for the infinite combine loop introduced
by D87480.

3 years agoFix build.
Michael Liao [Wed, 16 Sep 2020 18:43:08 +0000 (14:43 -0400)]
Fix build.

3 years ago[gn build] unconfuse sync script about "sources = []" in clang/lib/Headers/BUILD.gn
Nico Weber [Wed, 16 Sep 2020 18:50:29 +0000 (14:50 -0400)]
[gn build] unconfuse sync script about "sources = []" in clang/lib/Headers/BUILD.gn

3 years ago[SystemZ][z/OS] Set aligned allocation unavailable by default for z/OS
Fanbo Meng [Wed, 16 Sep 2020 17:52:28 +0000 (13:52 -0400)]
[SystemZ][z/OS] Set aligned allocation unavailable by default for z/OS

Aligned allocation is not supported on z/OS. This patch sets -faligned-alloc-unavailable as default in z/OS toolchain.

Reviewed By: abhina.sreeskantharajan, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D87611

3 years agoRevert "[obj2yaml] - Match ".stack_size" with the original section name, and not...
Rahman Lavaee [Wed, 16 Sep 2020 18:41:54 +0000 (11:41 -0700)]
Revert "[obj2yaml] - Match ".stack_size" with the original section name, and not the uniquified name."

This reverts commit 14e55f82980cf1342d4d3eea4885a5375e829496.

3 years ago[AMDGPU] gfx1030 RT support
Stanislav Mekhanoshin [Wed, 16 Sep 2020 18:09:25 +0000 (11:09 -0700)]
[AMDGPU] gfx1030 RT support

Differential Revision: https://reviews.llvm.org/D87782

3 years ago[OpenMP] Support `std::complex` math functions in target regions
Johannes Doerfert [Thu, 6 Aug 2020 20:46:44 +0000 (15:46 -0500)]
[OpenMP] Support `std::complex` math functions in target regions

The last (big) missing piece to get "math" working in OpenMP target
regions (that I know of) was complex math functions, e.g.,
`std::sin(std::complex<double>)`. With this patch we overload the system
template functions for these operations with versions that have been
distilled from `libcxx/include/complex`. We use the same
  `omp begin/end declare variant`
mechanism we have used for other math functions before, except that this
time we overload templates (via D85735).

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85777

3 years ago[OpenMP] Context selector extensions for template functions
Johannes Doerfert [Sun, 31 May 2020 16:40:09 +0000 (11:40 -0500)]
[OpenMP] Context selector extensions for template functions

With this extension the effects of `omp begin declare variant` will be
applied to template function declarations. The behavior is opt-in and
controlled by the `extension(allow_templates)` trait. While generally
useful, this will enable us to implement complex math function calls by
overloading the templates of the standard library with the ones in
libc++.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85735

3 years ago[OpenMP] Overload `std::isnan` and friends multiple times for the GPU
Johannes Doerfert [Wed, 12 Aug 2020 21:45:46 +0000 (16:45 -0500)]
[OpenMP] Overload `std::isnan` and friends multiple times for the GPU

`std::isnan` and friends can be found in two variants in the wild, one
returns `bool`, as the standard defines it, one returns `int`, as the C
macros do. So far we kinda hoped the system versions of these functions
would work for people, i.e., that they are definitions that can be compiled for
the target. We know that is not always the case, so we leverage the
`disable_implicit_base` OpenMP context extension to specialize both
versions of these functions without causing an invalid redeclaration.

Reviewed By: JonChesterfield, tra

Differential Revision: https://reviews.llvm.org/D85879

3 years ago[OpenMP] Context selector extensions for return value overloading
Johannes Doerfert [Wed, 12 Aug 2020 21:49:10 +0000 (16:49 -0500)]
[OpenMP] Context selector extensions for return value overloading

This extension allows declaring variants in between `omp begin/end
declare variant` that do not match the type of the existing function
with that name. Without this extension we would not find a base function
(with a compatible type), therefore create a new one, which would
cause conflicting declarations. With this extension we will not create
"missing" base functions, which basically renders these specializations
harmless. They will be generated but never called.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85878

3 years ago[OpenMP] Support nested OpenMP context selectors (declare variant)
Johannes Doerfert [Thu, 13 Aug 2020 06:05:51 +0000 (01:05 -0500)]
[OpenMP] Support nested OpenMP context selectors (declare variant)

Due to `omp begin/end declare variant`, OpenMP context selectors can be
nested. This patch adds initial support for this so we can use it for
target math variants. We should improve the detection of "equivalent"
scores and user conditions, and we should also revisit the data structures
of the OMPTraitInfo object; however, neither is a pressing issue right
now.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85877

3 years ago[OpenMP][FIX] Do not drop a '$' while demangling declare variant names
Johannes Doerfert [Thu, 13 Aug 2020 06:12:31 +0000 (01:12 -0500)]
[OpenMP][FIX] Do not drop a '$' while demangling declare variant names

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D85876

3 years ago[OpenMP][FIX] Do not crash trying to print a missing (demangled) user condition
Johannes Doerfert [Thu, 13 Aug 2020 00:44:25 +0000 (19:44 -0500)]
[OpenMP][FIX] Do not crash trying to print a missing (demangled) user condition

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D85875

3 years ago[UpdateTestChecks][NFC] Fix spelling
Johannes Doerfert [Fri, 14 Aug 2020 01:25:02 +0000 (20:25 -0500)]
[UpdateTestChecks][NFC] Fix spelling

3 years agoAdd '<' meta command to read in code from external file
Patrick Beard [Thu, 30 Jul 2020 21:43:46 +0000 (14:43 -0700)]
Add '<' meta command to read in code from external file

Perform all error handling in ReadCode()

Add :help text describing “< path”, add extra line before Commands

Differential Revision: https://reviews.llvm.org/D87640

3 years ago[obj2yaml] - Match ".stack_size" with the original section name, and not the uniquifi...
Rahman Lavaee [Wed, 16 Sep 2020 18:31:21 +0000 (11:31 -0700)]
[obj2yaml] - Match ".stack_size" with the original section name, and not the uniquified name.

Without this patch, obj2yaml decodes the content of only one ".stack_size" section. Other sections are dumped with their full contents.

Reviewed By: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D87727

3 years agoGlobalISel: Lift store value widening restriction
Matt Arsenault [Sat, 22 Aug 2020 16:34:38 +0000 (12:34 -0400)]
GlobalISel: Lift store value widening restriction

This doesn't change the memory size and doesn't need to worry about
non-power-of-2 sizes.

3 years ago[gn build] make "all" target build
Nico Weber [Wed, 16 Sep 2020 18:21:14 +0000 (14:21 -0400)]
[gn build] make "all" target build

If you want to build everything, building the default target
via just `ninja` is better, but `ninja all` shouldn't give you
compile errors -- this fixes that.

3 years ago[AArch64][GlobalISel] Make G_BUILD_VECTOR of <16 x s8> legal.
Amara Emerson [Wed, 16 Sep 2020 18:19:08 +0000 (11:19 -0700)]
[AArch64][GlobalISel] Make G_BUILD_VECTOR of <16 x s8> legal.

3 years ago[clang][codegen] Skip adding default function attributes on intrinsics.
Michael Liao [Wed, 16 Sep 2020 12:52:02 +0000 (08:52 -0400)]
[clang][codegen] Skip adding default function attributes on intrinsics.

- After loading builtin bitcode for linking, skip adding default
  function attributes on LLVM intrinsics as their attributes are
  well-defined and retrieved directly from internal definitions. Adding
  extra attributes on intrinsics results in inconsistent results when
  `-save-temps` is present. Also, that makes a few optimizations
  conservative.

Differential Revision: https://reviews.llvm.org/D87761

3 years agofix test no-rtti.cpp
Zequan Wu [Wed, 16 Sep 2020 18:03:04 +0000 (11:03 -0700)]
fix test no-rtti.cpp

3 years agoSema: add support for `__attribute__((__swift_bridge__))`
Saleem Abdulrasool [Wed, 9 Sep 2020 22:43:37 +0000 (22:43 +0000)]
Sema: add support for `__attribute__((__swift_bridge__))`

This extends semantic analysis of attributes for Swift interoperability
by introducing the `swift_bridge` attribute.  This attribute enables
bridging Objective-C types to Swift specific types.

This is based on the work of the original changes in
https://github.com/llvm/llvm-project-staging/commit/8afaf3aad2af43cfedca7a24cd817848c4e95c0c

Differential Revision: https://reviews.llvm.org/D87532
Reviewed By: Aaron Ballman

3 years ago[libFuzzer] Enable entropic by default.
Matt Morehouse [Tue, 15 Sep 2020 17:33:23 +0000 (10:33 -0700)]
[libFuzzer] Enable entropic by default.

Entropic has performed at least on par with vanilla scheduling on
Clusterfuzz, and has shown a slight coverage improvement on FuzzBench:
https://www.fuzzbench.com/reports/2020-08-31/index.html

Reviewed By: Dor1s

Differential Revision: https://reviews.llvm.org/D87476

3 years ago[Sema][MSVC] warn at dynamic_cast/typeid when /GR- is given
Zequan Wu [Tue, 15 Sep 2020 20:44:22 +0000 (13:44 -0700)]
[Sema][MSVC] warn at dynamic_cast/typeid when /GR- is given

Differential Revision: https://reviews.llvm.org/D86369

3 years ago[GISel] Add new combines for unary FP instrs with constant operand
Michael Kitzan [Sat, 22 Aug 2020 06:11:22 +0000 (23:11 -0700)]
[GISel] Add new combines for unary FP instrs with constant operand

https://reviews.llvm.org/D86393

Patch adds five new `GICombinerRules`, one for each of the following unary
FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The
combine rules perform the FP operation on the constant operand and replace
the original instr with the result. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
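
What "perform the FP operation on the constant operand" amounts to can be sketched with ordinary host arithmetic. This is an illustration only: the enum `UnaryFPOp` and the function `foldUnaryFP` are hypothetical, `double` stands in for an arbitrary-precision constant, and FPTrunc is modelled as a round trip through `float`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mirror of the five opcodes named in the commit message.
enum class UnaryFPOp { FNeg, FAbs, FPTrunc, FSqrt, FLog2 };

// Evaluates the operation on a constant operand; the result would replace
// the original instruction.
double foldUnaryFP(UnaryFPOp Op, double Constant) {
  switch (Op) {
  case UnaryFPOp::FNeg:
    return -Constant;
  case UnaryFPOp::FAbs:
    return std::fabs(Constant);
  case UnaryFPOp::FPTrunc:
    return static_cast<double>(static_cast<float>(Constant));
  case UnaryFPOp::FSqrt:
    return std::sqrt(Constant);
  case UnaryFPOp::FLog2:
    return std::log2(Constant);
  }
  assert(false && "unknown opcode");
  return Constant;
}
```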

3 years agoDwarfUnit.h - remove unnecessary includes. NFCI.
Simon Pilgrim [Wed, 16 Sep 2020 17:32:03 +0000 (18:32 +0100)]
DwarfUnit.h - remove unnecessary includes. NFCI.

3 years agoraw_ostream.cpp - remove duplicate includes. NFCI.
Simon Pilgrim [Wed, 16 Sep 2020 17:11:39 +0000 (18:11 +0100)]
raw_ostream.cpp - remove duplicate includes. NFCI.

Remove headers already included in raw_ostream.h

3 years agoInterferenceCache.cpp - remove duplicate includes. NFCI.
Simon Pilgrim [Wed, 16 Sep 2020 17:09:30 +0000 (18:09 +0100)]
InterferenceCache.cpp - remove duplicate includes. NFCI.

Remove headers already included in InterferenceCache.h

3 years agoValueEnumerator.cpp - remove duplicate includes. NFCI.
Simon Pilgrim [Wed, 16 Sep 2020 17:08:32 +0000 (18:08 +0100)]
ValueEnumerator.cpp - remove duplicate includes. NFCI.

Remove headers already included in ValueEnumerator.h

3 years ago[SLP] add tests for reduction ordering; NFC
Sanjay Patel [Wed, 16 Sep 2020 14:59:30 +0000 (10:59 -0400)]
[SLP] add tests for reduction ordering; NFC

3 years ago[llvm-nm] Use aggregate initialization instead of memset zero
Fangrui Song [Wed, 16 Sep 2020 17:24:58 +0000 (10:24 -0700)]
[llvm-nm] Use aggregate initialization instead of memset zero
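
A minimal sketch of the two styles the title contrasts, using a hypothetical struct `ToySymbol` rather than the actual llvm-nm type. Both functions produce an all-zero object; the aggregate form does so without a byte-wise `memset` and keeps working if a member with a non-trivial type is added later.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical record type used only for this illustration.
struct ToySymbol {
  std::uint64_t Address;
  std::uint64_t Size;
  char TypeChar;
};

// Old style: zero the bytes after declaring the object.
ToySymbol makeWithMemset() {
  ToySymbol S;
  std::memset(&S, 0, sizeof(S));
  return S;
}

// New style: empty-brace initialization value-initializes every member.
ToySymbol makeWithAggregateInit() {
  ToySymbol S = {};
  return S;
}
```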

3 years agoRe-land: Add new hidden option -print-changed which only reports changes to IR
Jamie Schmeiser [Wed, 16 Sep 2020 17:25:13 +0000 (17:25 +0000)]
Re-land: Add new hidden option -print-changed which only reports changes to IR

A new hidden option -print-changed is added along with code to support
printing the IR as it passes through the opt pipeline in the new pass
manager. Only those passes that change the IR are reported, with others
only having the banner reported, indicating that they did not change the
IR, were filtered out or ignored. Filtering of output via the
-filter-print-funcs is supported and a new supporting hidden option
-filter-passes is added. The latter takes a comma separated list of pass
names and filters the output to only show those passes in the list that
change the IR. The output can also be modified via the -print-module-scope
option.

The code introduces a template base class that generalizes the comparison
of IRs that takes an IR representation as template parameter. The
constructor takes a series of lambdas that provide an event based API
for generalized reporting of IRs as they are changed in the opt pipeline
through the new pass manager.

The first of several instantiations is provided that prints the IR
in a form similar to that produced by -print-after-all with the above
mentioned filtering capabilities. This version, and the others to
follow will be introduced at the upcoming developer's conference.

Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen)

Differential Revision: https://reviews.llvm.org/D86360

3 years ago[libc++] Ensure streams are initialized early
Louis Dionne [Thu, 14 May 2020 13:56:35 +0000 (09:56 -0400)]
[libc++] Ensure streams are initialized early

When statically linking libc++ on some systems, the streams are not
initialized early enough, which causes all kinds of issues. This was
reported e.g. in http://llvm.org/PR28954, but also in various open
source projects that use libc++.

Fixes http://llvm.org/PR28954.

Differential Revision: https://reviews.llvm.org/D31413

3 years agoRegAllocFast: Make self loop live-out heuristic more aggressive
Matt Arsenault [Mon, 31 Aug 2020 19:09:50 +0000 (15:09 -0400)]
RegAllocFast: Make self loop live-out heuristic more aggressive

This currently has no impact on code, but prevents sizeable code size
regressions after D52010. This prevents spilling and reloading all
values inside blocks that loop back. Add a baseline test which would
regress without this patch.

3 years agoInclude (Type|Symbol)Record.h less
Reid Kleckner [Wed, 16 Sep 2020 16:55:22 +0000 (09:55 -0700)]
Include (Type|Symbol)Record.h less

Most clients only need CVType and CVSymbol, not structs for every type
and symbol. Move CVSymbol and CVType to CVRecord.h to accomplish this.
Update some of the common headers that need CVSymbol and CVType to use
the new location.

3 years agoAMDGPU: Clear offset register when using local stack area
Matt Arsenault [Thu, 10 Sep 2020 16:11:53 +0000 (12:11 -0400)]
AMDGPU: Clear offset register when using local stack area

eliminateFrameIndex won't fix up the offset register when the direct
frame index reference is moved to a separate move instruction. Switch
the offset to a base 0 (which it probably should be to begin with).

3 years agoAMDGPU: Add baseline test for incorrect SP access
Matt Arsenault [Thu, 10 Sep 2020 17:06:12 +0000 (13:06 -0400)]
AMDGPU: Add baseline test for incorrect SP access

3 years agoLocalStackSlotAllocation: Swap order of check
Matt Arsenault [Thu, 10 Sep 2020 16:08:41 +0000 (12:08 -0400)]
LocalStackSlotAllocation: Swap order of check

3 years agoDo not apply calling conventions to MSVC entry points
Elizabeth Andrews [Mon, 14 Sep 2020 21:33:01 +0000 (14:33 -0700)]
Do not apply calling conventions to MSVC entry points

Fix link error for MSVC entry points when calling conventions
are specified. MSVC entry points should have the default calling
convention.

Differential Revision: https://reviews.llvm.org/D87701

3 years ago[libfuzzer] Reduce default verbosity when printing large mutation sequences
mhl [Wed, 16 Sep 2020 15:02:34 +0000 (08:02 -0700)]
[libfuzzer] Reduce default verbosity when printing large mutation sequences

When using a custom mutator (e.g. thrift mutator, similar to LPM)
that calls back into libfuzzer's mutations via `LLVMFuzzerMutate`, the mutation
sequences needed to achieve new coverage can get prohibitively large.

Printing these large sequences has two downsides:

1) It makes the logs hard to understand for a human.
2) The performance cost slows down fuzzing.

In this patch I change the `PrintMutationSequence` function to take a max
number of entries, to achieve this goal. I also update `PrintStatusForNewUnit`
to default to printing only 10 entries, in the default verbosity level (1),
requiring the user to set verbosity to 2 if they want the full mutation
sequence.
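
The capping behaviour described above can be sketched as follows. This is not the libFuzzer code: `formatMutationSequence` and `maxEntriesForVerbosity` are hypothetical helpers written for this illustration, and the sketch builds a string instead of printing. It assumes, as the sample logs in this message suggest, that the `MS:` field reports the total number of mutations even when only the first few names are shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for a capped mutation-sequence printer.
// Emits "MS: <total> " followed by at most MaxEntries names, each ending in '-'.
inline std::string formatMutationSequence(const std::vector<std::string> &Seq,
                                          std::size_t MaxEntries) {
  std::string Out = "MS: " + std::to_string(Seq.size()) + " ";
  const std::size_t N = std::min(Seq.size(), MaxEntries);
  for (std::size_t I = 0; I < N; ++I) {
    Out += Seq[I];
    Out += '-';
  }
  return Out;
}

// Assumed policy from the description: cap at 10 entries at the default
// verbosity level, no cap once verbosity is 2 or higher.
inline std::size_t maxEntriesForVerbosity(int Verbosity) {
  return Verbosity >= 2 ? std::numeric_limits<std::size_t>::max() : 10;
}
```

A caller would pass `maxEntriesForVerbosity(Verbosity)` as the second argument, so the full sequence is only produced when the user explicitly asks for it.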

For our use case, turning off verbosity is not an option, as that would also
disable `PrintStats()` which is very useful for infrastructure that analyzes
the logs in realtime. I imagine most users of libfuzzer always want those logs
in the default.

I built a fuzzer locally with this patch applied to libfuzzer.

When running with the default verbosity, I see logs like this:

    #65 NEW    cov: 4799 ft: 10443 corp: 41/1447Kb lim: 64000 exec/s: 1 rss: 575Mb L: 28658/62542 MS: 196 Custom-CrossOver-ChangeBit-EraseBytes-ChangeBit-ChangeBit-ChangeBit-CrossOver-ChangeBit-CrossOver- DE: "\xff\xff\xff\x0e"-"\xfe\xff\xff\x7f"-"\xfe\xff\xff\x7f"-"\x17\x00\x00\x00\x00\x00\x00\x00"-"\x00\x00\x00\xf9"-"\xff\xff\xff\xff"-"\xfa\xff\xff\xff"-"\xf7\xff\xff\xff"-"@\xff\xff\xff\xff\xff\xff\xff"-"E\x00"-
    #67 NEW    cov: 4810 ft: 10462 corp: 42/1486Kb lim: 64000 exec/s: 1 rss: 577Mb L: 39823/62542 MS: 135 Custom-CopyPart-ShuffleBytes-ShuffleBytes-ChangeBit-ChangeBinInt-EraseBytes-ChangeBit-ChangeBinInt-ChangeBit- DE: "\x01\x00\x00\x00\x00\x00\x01\xf1"-"\x00\x00\x00\x07"-"\x00\x0d"-"\xfd\xff\xff\xff"-"\xfe\xff\xff\xf4"-"\xe3\xff\xff\xff"-"\xff\xff\xff\xf1"-"\xea\xff\xff\xff"-"\x00\x00\x00\xfd"-"\x01\x00\x00\x05"-

Staring hard at the logs it's clear that the cap of 10 is applied.

When running with verbosity level 2, the logs look like the below:

    #66    NEW    cov: 4700 ft: 10188 corp: 37/1186Kb lim: 64000 exec/s: 2 rss: 509Mb L: 47616/61231 MS: 520 Custom-CopyPart-ChangeBinInt-ChangeBit-ChangeByte-EraseBytes-PersAutoDict-CopyPart-ShuffleBytes-ChangeBit-ShuffleBytes-CopyPart-EraseBytes-CopyPart-ChangeBinInt-CopyPart-ChangeByte-ShuffleBytes-ChangeBinInt-ShuffleBytes-ChangeBit-CMP-ShuffleBytes-ChangeBit-CrossOver-ChangeBinInt-ChangeByte-ShuffleBytes-CrossOver-EraseBytes-ChangeBinInt-InsertRepeatedBytes-PersAutoDict-InsertRepeatedBytes-InsertRepeatedBytes-CrossOver-ChangeByte-ShuffleBytes-CopyPart-ShuffleBytes-CopyPart-CrossOver-ChangeBit-ShuffleBytes-CrossOver-PersAutoDict-ChangeByte-ChangeBit-ShuffleBytes-CrossOver-ChangeByte-EraseBytes-CopyPart-ChangeBinInt-PersAutoDict-CrossOver-ShuffleBytes-CrossOver-CrossOver-EraseBytes-CrossOver-EraseBytes-CrossOver-ChangeBit-ChangeBinInt-ChangeByte-EraseBytes-ShuffleBytes-ShuffleBytes-ChangeBit-EraseBytes-ChangeBinInt-ChangeBit-ChangeBinInt-CopyPart-EraseBytes-PersAutoDict-EraseBytes-CopyPart-ChangeBinInt-ChangeByte-CrossOver-ChangeBinInt-ShuffleBytes-PersAutoDict-PersAutoDict-ChangeBinInt-CopyPart-ChangeBinInt-CrossOver-ChangeBit-ChangeBinInt-CopyPart-ChangeByte-ChangeBit-CopyPart-CrossOver-ChangeByte-ChangeBit-ChangeByte-ShuffleBytes-CMP-ChangeBit-CopyPart-ChangeBit-ChangeByte-ChangeBinInt-PersAutoDict-ChangeBinInt-CrossOver-ChangeBinInt-ChangeBit-ChangeBinInt-ChangeBinInt-PersAutoDict-ChangeBinInt-ChangeBinInt-ChangeByte-CopyPart-ShuffleBytes-ChangeByte-ChangeBit-ChangeByte-ChangeByte-EraseBytes-CrossOver-ChangeByte-ChangeByte-EraseBytes-EraseBytes-InsertRepeatedBytes-ShuffleBytes-CopyPart-CopyPart-ChangeBit-ShuffleBytes-PersAutoDict-ShuffleBytes-ChangeBit-ChangeByte-ChangeBit-ShuffleBytes-ChangeByte-ChangeBinInt-CrossOver-ChangeBinInt-ChangeBit-EraseBytes-CopyPart-ChangeByte-CrossOver-EraseBytes-CrossOver-ChangeByte-ShuffleBytes-ChangeByte-ChangeBinInt-CrossOver-ChangeByte-InsertRepeatedBytes-InsertByte-ShuffleBytes-PersAutoDict-ChangeBit-ChangeByte-ChangeBit-Sh
uffleBytes-ShuffleBytes-CopyPart-ShuffleBytes-EraseBytes-ShuffleBytes-ShuffleBytes-CrossOver-ChangeBinInt-CopyPart-CopyPart-CopyPart-EraseBytes-EraseBytes-ChangeByte-ChangeBinInt-ShuffleBytes-CMP-InsertByte-EraseBytes-ShuffleBytes-CopyPart-ChangeBit-CrossOver-CopyPart-CopyPart-ShuffleBytes-ChangeByte-ChangeByte-ChangeBinInt-EraseBytes-ChangeByte-ChangeBinInt-ChangeBit-ChangeBit-ChangeByte-ShuffleBytes-PersAutoDict-PersAutoDict-CMP-ChangeBit-ShuffleBytes-PersAutoDict-ChangeBinInt-EraseBytes-EraseBytes-ShuffleBytes-ChangeByte-ShuffleBytes-ChangeBit-EraseBytes-CMP-ShuffleBytes-ChangeByte-ChangeBinInt-EraseBytes-ChangeBinInt-ChangeByte-EraseBytes-ChangeByte-CrossOver-ShuffleBytes-EraseBytes-EraseBytes-ShuffleBytes-ChangeBit-EraseBytes-CopyPart-ShuffleBytes-ShuffleBytes-CrossOver-CopyPart-ChangeBinInt-ShuffleBytes-CrossOver-InsertByte-InsertByte-ChangeBinInt-ChangeBinInt-CopyPart-EraseBytes-ShuffleBytes-ChangeBit-ChangeBit-EraseBytes-ChangeByte-ChangeByte-ChangeBinInt-CrossOver-ChangeBinInt-ChangeBinInt-ShuffleBytes-ShuffleBytes-ChangeByte-ChangeByte-ChangeBinInt-ShuffleBytes-CrossOver-EraseBytes-CopyPart-CopyPart-CopyPart-ChangeBit-ShuffleBytes-ChangeByte-EraseBytes-ChangeByte-InsertRepeatedBytes-InsertByte-InsertRepeatedBytes-PersAutoDict-EraseBytes-ShuffleBytes-ChangeByte-ShuffleBytes-ChangeBinInt-ShuffleBytes-ChangeBinInt-ChangeBit-CrossOver-CrossOver-ShuffleBytes-CrossOver-CopyPart-CrossOver-CrossOver-CopyPart-ChangeByte-ChangeByte-CrossOver-ChangeBit-ChangeBinInt-EraseBytes-ShuffleBytes-EraseBytes-CMP-PersAutoDict-PersAutoDict-InsertByte-ChangeBit-ChangeByte-CopyPart-CrossOver-ChangeByte-ChangeBit-ChangeByte-CopyPart-ChangeBinInt-EraseBytes-CrossOver-ChangeBit-CrossOver-PersAutoDict-CrossOver-ChangeByte-CrossOver-ChangeByte-ChangeByte-CrossOver-ShuffleBytes-CopyPart-CopyPart-ShuffleBytes-ChangeByte-ChangeByte-ChangeBinInt-ChangeBinInt-ChangeBinInt-ChangeBinInt-ShuffleBytes-CrossOver-ChangeBinInt-ShuffleBytes-ChangeBit-PersAutoDict-ChangeBinInt-ShuffleBytes-ChangeBi
nInt-ChangeByte-CrossOver-ChangeBit-CopyPart-ChangeBit-ChangeBit-CopyPart-ChangeByte-PersAutoDict-ChangeBit-ShuffleBytes-ChangeByte-ChangeBit-CrossOver-ChangeByte-CrossOver-ChangeByte-CrossOver-ChangeBit-ChangeByte-ChangeBinInt-PersAutoDict-CopyPart-ChangeBinInt-ChangeBit-CrossOver-ChangeBit-PersAutoDict-ShuffleBytes-EraseBytes-CrossOver-ChangeByte-ChangeBinInt-ShuffleBytes-ChangeBinInt-InsertRepeatedBytes-PersAutoDict-CrossOver-ChangeByte-Custom-PersAutoDict-CopyPart-CopyPart-ChangeBinInt-ShuffleBytes-ChangeBinInt-ChangeBit-ShuffleBytes-CrossOver-CMP-ChangeByte-CopyPart-ShuffleBytes-CopyPart-CopyPart-CrossOver-CrossOver-CrossOver-ShuffleBytes-ChangeByte-ChangeBinInt-ChangeBit-ChangeBit-ChangeBit-ChangeByte-EraseBytes-ChangeByte-ChangeBit-ChangeByte-ChangeByte-CopyPart-PersAutoDict-ChangeBinInt-PersAutoDict-PersAutoDict-PersAutoDict-CopyPart-CopyPart-CrossOver-ChangeByte-ChangeBinInt-ShuffleBytes-ChangeBit-CopyPart-EraseBytes-CopyPart-CopyPart-CrossOver-ChangeByte-EraseBytes-ShuffleBytes-ChangeByte-CopyPart-EraseBytes-CopyPart-CrossOver-ChangeBinInt-ChangeBinInt-InsertByte-ChangeBinInt-ChangeBit-ChangeByte-CopyPart-ChangeByte-EraseBytes-ChangeByte-ChangeBit-ChangeByte-ShuffleBytes-CopyPart-ChangeBinInt-EraseBytes-CrossOver-ChangeBit-ChangeBit-CrossOver-EraseBytes-ChangeBinInt-CopyPart-CopyPart-ChangeBinInt-ChangeBit-EraseBytes-InsertRepeatedBytes-EraseBytes-ChangeBit-CrossOver-CrossOver-EraseBytes-EraseBytes-ChangeByte-CopyPart-CopyPart-ShuffleBytes-ChangeByte-ChangeBit-ChangeByte-EraseBytes-ChangeBit-ChangeByte-ChangeByte-CrossOver-CopyPart-EraseBytes-ChangeByte-EraseBytes-ChangeByte-ShuffleBytes-ShuffleBytes-ChangeByte-CopyPart-ChangeByte-ChangeByte-ChangeBit-CopyPart-ChangeBit-ChangeBinInt-CopyPart-ShuffleBytes-ChangeBit-ChangeBinInt-ChangeBit-EraseBytes-CMP-CrossOver-CopyPart-ChangeBinInt-CrossOver-CrossOver-CopyPart-CrossOver-CrossOver-InsertByte-InsertByte-CopyPart-Custom- DE: 
"warn"-"\x00\x00\x00\x80"-"\xfe\xff\xff\xfb"-"\xff\xff"-"\x10\x00\x00\x00"-"\xfe\xff\xff\xff"-"\xff\xff\xff\xf6"-"U\x01\x00\x00\x00\x00\x00\x00"-"\xd9\xff\xff\xff"-"\xfe\xff\xff\xea"-"\xf0\xff\xff\xff"-"\xfc\xff\xff\xff"-"warn"-"\xff\xff\xff\xff"-"\xfe\xff\xff\xfb"-"\x00\x00\x00\x80"-"\xfe\xff\xff\xf1"-"\xfe\xff\xff\xea"-"\x00\x00\x00\x00\x00\x00\x012"-"\xe2\x00"-"\xfb\xff\xff\xff"-"\x00\x00\x00\x00"-"\xe9\xff\xff\xff"-"\xff\xff"-"\x00\x00\x00\x80"-"\x01\x00\x04\xc9"-"\xf0\xff\xff\xff"-"\xf9\xff\xff\xff"-"\xff\xff\xff\xff\xff\xff\xff\x12"-"\xe2\x00"-"\xfe\xff\xff\xff"-"\xfe\xff\xff\xea"-"\xff\xff\xff\xff"-"\xf4\xff\xff\xff"-"\xe9\xff\xff\xff"-"\xf1\xff\xff\xff"-
    #48    NEW    cov: 4502 ft: 9151 corp: 27/750Kb lim: 64000 exec/s: 2 rss: 458Mb L: 50772/50772 MS: 259 ChangeByte-ShuffleBytes-ChangeBinInt-ChangeByte-ChangeByte-ChangeByte-ChangeByte-ChangeBit-CopyPart-CrossOver-CopyPart-ChangeByte-CrossOver-CopyPart-ChangeBit-ChangeByte-EraseBytes-ChangeByte-CopyPart-CopyPart-CopyPart-ChangeBit-EraseBytes-ChangeBinInt-CrossOver-CopyPart-CrossOver-CopyPart-ChangeBit-ChangeByte-ChangeBit-InsertByte-CrossOver-InsertRepeatedBytes-InsertRepeatedBytes-InsertRepeatedBytes-ChangeBinInt-EraseBytes-InsertRepeatedBytes-InsertByte-ChangeBit-ShuffleBytes-ChangeBit-ChangeBit-CopyPart-ChangeBit-ChangeByte-CrossOver-ChangeBinInt-ChangeByte-CrossOver-CMP-ChangeByte-CrossOver-ChangeByte-ShuffleBytes-ShuffleBytes-ChangeByte-ChangeBinInt-CopyPart-EraseBytes-CrossOver-ChangeBit-ChangeBinInt-InsertByte-ChangeBit-CopyPart-ChangeBinInt-ChangeByte-CrossOver-ChangeBit-EraseBytes-CopyPart-ChangeBinInt-ChangeBit-ChangeBit-ChangeByte-CopyPart-ChangeBinInt-CrossOver-PersAutoDict-ChangeByte-ChangeBit-ChangeByte-ChangeBinInt-ChangeBinInt-EraseBytes-CopyPart-CopyPart-ChangeByte-ChangeByte-EraseBytes-PersAutoDict-CopyPart-ChangeByte-ChangeByte-EraseBytes-CrossOver-CopyPart-CopyPart-CopyPart-ChangeByte-ChangeBit-CMP-CopyPart-ChangeBinInt-ChangeBinInt-CrossOver-ChangeBit-ChangeBit-EraseBytes-ChangeByte-ShuffleBytes-ChangeBit-ChangeBinInt-CMP-InsertRepeatedBytes-CopyPart-Custom-ChangeByte-CrossOver-EraseBytes-ChangeBit-CopyPart-CrossOver-CMP-ShuffleBytes-EraseBytes-CrossOver-PersAutoDict-ChangeByte-CrossOver-CopyPart-CrossOver-CrossOver-ShuffleBytes-ChangeBinInt-CrossOver-ChangeBinInt-ShuffleBytes-PersAutoDict-ChangeByte-EraseBytes-ChangeBit-CrossOver-EraseBytes-CrossOver-ChangeBit-ChangeBinInt-EraseBytes-InsertByte-InsertRepeatedBytes-InsertByte-InsertByte-ChangeByte-ChangeBinInt-ChangeBit-CrossOver-ChangeByte-CrossOver-EraseBytes-ChangeByte-ShuffleBytes-ChangeBit-ChangeBit-ShuffleBytes-CopyPart-ChangeByte-PersAutoDict-ChangeBit-ChangeByte-InsertRepeatedBytes-CM
P-CrossOver-ChangeByte-EraseBytes-ShuffleBytes-CrossOver-ShuffleBytes-ChangeBinInt-ChangeBinInt-CopyPart-PersAutoDict-ShuffleBytes-ChangeBit-CopyPart-ShuffleBytes-CopyPart-EraseBytes-ChangeByte-ChangeBit-ChangeBit-ChangeBinInt-ChangeByte-CopyPart-EraseBytes-ChangeBinInt-EraseBytes-EraseBytes-PersAutoDict-CMP-PersAutoDict-CrossOver-CrossOver-ChangeBit-CrossOver-PersAutoDict-CrossOver-CopyPart-ChangeByte-EraseBytes-ChangeByte-ShuffleBytes-ChangeByte-ChangeByte-CrossOver-ChangeBit-EraseBytes-ChangeByte-EraseBytes-ChangeBinInt-CrossOver-CrossOver-EraseBytes-ChangeBinInt-CrossOver-ChangeBit-ShuffleBytes-ChangeBit-ChangeByte-EraseBytes-ChangeBit-CrossOver-CrossOver-CrossOver-ChangeByte-ChangeBit-ShuffleBytes-ChangeBit-ChangeBit-EraseBytes-CrossOver-CrossOver-CopyPart-ShuffleBytes-ChangeByte-ChangeByte-CopyPart-CrossOver-CopyPart-CrossOver-CrossOver-EraseBytes-EraseBytes-ShuffleBytes-InsertRepeatedBytes-ChangeBit-CopyPart-Custom- DE: "\xfe\xff\xff\xfc"-"\x00\x00\x00\x00"-"F\x00"-"\xf3\xff\xff\xff"-"St9exception"-"_\x00\x00\x00"-"\xf6\xff\xff\xff"-"\xfe\xff\xff\xff"-"\x00\x00\x00\x00"-"p\x02\x00\x00\x00\x00\x00\x00"-"\xfe\xff\xff\xfb"-"\xff\xff"-"\xff\xff\xff\xff"-"\x01\x00\x00\x07"-"\xfe\xff\xff\xfe"-

These are prohibitively large and of limited value in the default case (when
someone is running the fuzzer, not debugging it), in my opinion.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D86658

3 years ago[Coro][NewPM] Handle llvm.coro.prepare.retcon in NPM coro-split pass
Arthur Eubanks [Sat, 12 Sep 2020 02:57:17 +0000 (19:57 -0700)]
[Coro][NewPM] Handle llvm.coro.prepare.retcon in NPM coro-split pass

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D87731

3 years ago[ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled
Sjoerd Meijer [Wed, 16 Sep 2020 15:35:53 +0000 (16:35 +0100)]
[ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled

Additional sanity checks were added to get.active.lane.mask's second argument,
the loop tripcount/elementcount, in rG635b87511ec3. Like the other (overflow)
checks, skip this if tail-predication is forced.

Differential Revision: https://reviews.llvm.org/D87769

3 years ago[AMDGPU] Remove obsolete comment
Jay Foad [Wed, 16 Sep 2020 16:02:42 +0000 (17:02 +0100)]
[AMDGPU] Remove obsolete comment

Obsoleted by e4464bf3d45848461630e3771d66546d389f1ed5 "AMDGPU/GlobalISel: Select scalar v2s16 G_BUILD_VECTOR"

3 years ago[llvm][CodeGen] Do not scalarize `llvm.masked.[gather|scatter]` operating on scalable...
Francesco Petrogalli [Tue, 8 Sep 2020 08:08:59 +0000 (08:08 +0000)]
[llvm][CodeGen] Do not scalarize `llvm.masked.[gather|scatter]` operating on scalable vectors.

This patch prevents the `llvm.masked.gather` and `llvm.masked.scatter` intrinsics from being scalarized when invoked on scalable vectors.

The change in `Function.cpp` is needed to prevent the warning that is raised when `getNumElements` is used in place of `getElementCount` on `VectorType` instances. The tests guard against regressions on this change.

The tests make sure that calls to `llvm.masked.[gather|scatter]` are still scalarized when:

  # the intrinsics are operating on fixed size vectors, and
  # the compiler is not targeting fixed length SVE code generation.

Reviewed By: efriedma, sdesmalen

Differential Revision: https://reviews.llvm.org/D86249

3 years ago[NPM] Translate alias analysis into require<> as well
Arthur Eubanks [Wed, 16 Sep 2020 05:06:50 +0000 (22:06 -0700)]
[NPM] Translate alias analysis into require<> as well

'require<globals-aa>' is needed to make globals-aa work in NPM, since
globals-aa is a module analysis but function passes cannot run module
analyses on demand.
So don't skip translating alias analyses to 'require<>'.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87743

3 years ago[AMDGPU] Corrected directive to use for ELF weak refs
Dmitry Preobrazhensky [Wed, 16 Sep 2020 15:51:26 +0000 (18:51 +0300)]
[AMDGPU] Corrected directive to use for ELF weak refs

WeakRefDirective should specify a directive to declare "a global as being a weak undefined symbol".
The directive used by AMDGPU was incorrect - ".weakref" was intended for other purposes.
The correct directive is ".weak" and it is already defined as default for ELF.
So the redefinition was removed.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D87762

3 years ago[ASTMatchers] Add missing definition for decompositionDecl
Benjamin Kramer [Wed, 16 Sep 2020 15:28:59 +0000 (17:28 +0200)]
[ASTMatchers] Add missing definition for decompositionDecl

Otherwise we'd get a linker error whenever decompositionDecl is ODR
used.

3 years ago[X86] EmitInstrWithCustomInserter - remove redundant getDebugLoc() calls. NFCI.
Simon Pilgrim [Wed, 16 Sep 2020 15:26:13 +0000 (16:26 +0100)]
[X86] EmitInstrWithCustomInserter - remove redundant getDebugLoc() calls. NFCI.

Use the same DebugLoc that is called at the top of the method.

Fixes some Wshadow static analyzer warnings.

3 years ago[NFC][Regalloc] accessors for 'reg' and 'weight'
Mircea Trofin [Tue, 15 Sep 2020 21:54:38 +0000 (14:54 -0700)]
[NFC][Regalloc] accessors for 'reg' and 'weight'

Also renamed the fields to follow style guidelines.

Accessors help with readability - weight mutation, in particular,
is easier to follow this way.

Differential Revision: https://reviews.llvm.org/D87725

3 years agoAMDGPU: Improve <2 x i24> arguments and return value handling
Matt Arsenault [Sun, 30 Aug 2020 21:28:48 +0000 (17:28 -0400)]
AMDGPU: Improve <2 x i24> arguments and return value handling

This was asserting for GlobalISel. For SelectionDAG, the value was
passed on the stack. Instead, scalarize this as if it were a
32-bit vector.

3 years ago[AMDGPU] Add v3f16/v3i16 support to SDag
Sebastian Neubauer [Thu, 23 Jul 2020 14:59:00 +0000 (16:59 +0200)]
[AMDGPU] Add v3f16/v3i16 support to SDag

Fix lowering and instruction selection for v3x16 types
and enable InstCombine to emit them.

This patch only implements it for the selection dag.
GlobalISel tests in GlobalISel/llvm.amdgcn.image.load.1d.d16.ll and
GlobalISel/llvm.amdgcn.image.store.2d.d16.ll still don't work.

Differential Revision: https://reviews.llvm.org/D84420

3 years ago[X86] Assert that we've found a terminator instruction. NFCI.
Simon Pilgrim [Wed, 16 Sep 2020 15:17:35 +0000 (16:17 +0100)]
[X86] Assert that we've found a terminator instruction. NFCI.

Fixes clang static analyzer null dereference warning.

3 years ago[AMDGPU] Enable scheduling around FP MODE-setting instructions
Jay Foad [Wed, 9 Sep 2020 16:21:36 +0000 (17:21 +0100)]
[AMDGPU] Enable scheduling around FP MODE-setting instructions

Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is
marked as having unmodeled side effects, which makes the machine
scheduler treat it as a barrier. Now that we have proper implicit $mode
operands we can use a no-side-effects S_SETREG_B32_mode pseudo instead
for setregs that only touch the FP MODE bits, to give the scheduler more
freedom.

Differential Revision: https://reviews.llvm.org/D87446

3 years ago[AMDGPU] Add -show-mc-encoding to setreg tests
Jay Foad [Tue, 15 Sep 2020 11:00:38 +0000 (12:00 +0100)]
[AMDGPU] Add -show-mc-encoding to setreg tests

This is a pre-commit for D87446 "[AMDGPU] Enable scheduling around FP MODE-setting instructions"

3 years ago[X86][SSE] Move VZEXT_MOVL(INSERT_SUBVECTOR(UNDEF,X,0)) handling into combineTargetSh...
Simon Pilgrim [Wed, 16 Sep 2020 14:46:23 +0000 (15:46 +0100)]
[X86][SSE] Move VZEXT_MOVL(INSERT_SUBVECTOR(UNDEF,X,0)) handling into combineTargetShuffle.

Now that we're getting better at combining shuffles of different vector widths, this can be performed as part of the standard target shuffle combines and isn't required for cleanup.

Exposed a minor issue in combineX86ShufflesRecursively where we failed to check if a shuffle's src ops were simple types.

3 years ago[mlir][openacc] Add missing operands for acc.parallel operation
Valentin Clement [Wed, 16 Sep 2020 14:48:51 +0000 (10:48 -0400)]
[mlir][openacc] Add missing operands for acc.parallel operation

Add missing operands to represent copyin with readonly modifier, copyout with zero
modifier, create with zero modifier and default clause.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D87733

3 years agoEnable inlining for Linalg dialect
Eugene Zhulenev [Wed, 16 Sep 2020 14:03:35 +0000 (10:03 -0400)]
Enable inlining for Linalg dialect

Enable inlining for Linalg dialect.

Differential Revision: https://reviews.llvm.org/D87567

3 years ago[Partial Inliner] Compute intrinsic cost through TTI
Dangeti Tharun kumar [Wed, 16 Sep 2020 14:11:24 +0000 (15:11 +0100)]
[Partial Inliner] Compute intrinsic cost through TTI

https://bugs.llvm.org/show_bug.cgi?id=45932

assert(OutlinedFunctionCost >= Cloner.OutlinedRegionCost && "Outlined function cost should be no less than the outlined region") getting triggered in computeBBInlineCost.

Intrinsics like "assume" are considered regular function calls while computing costs.
This patch enables computeBBInlineCost to query TTI for intrinsic call cost.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D87132

3 years ago[mlir] Model StringRef in C API
Alex Zinenko [Tue, 15 Sep 2020 10:04:59 +0000 (12:04 +0200)]
[mlir] Model StringRef in C API

Numerous MLIR functions return instances of `StringRef` to refer to a
non-owning fragment of a string (usually owned by the context). This is a
relatively simple class that is defined in LLVM. Provide a simple wrapper in
the MLIR C API that contains the pointer and length of the string fragment and
use it for Standard attribute functions that return StringRef instead of the
previous, callback-based mechanism.
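
A pointer-plus-length wrapper of the kind described above could be sketched like this. The names `ExampleStringRef`, `exampleStringRefCreate`, `exampleStringRefToOwned`, and `demoFragment` are assumptions made for this illustration and are not necessarily the identifiers the MLIR C API uses.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names: this struct and the helpers below are illustrative only.
struct ExampleStringRef {
  const char *data;   // First character of the fragment; not owned.
  std::size_t length; // Character count; no null terminator is implied.
};

// Wrap an existing buffer without copying it.
inline ExampleStringRef exampleStringRefCreate(const char *Data,
                                               std::size_t Length) {
  return ExampleStringRef{Data, Length};
}

// Copy the fragment into an owning string when the caller must keep it.
inline std::string exampleStringRefToOwned(ExampleStringRef Ref) {
  return std::string(Ref.data, Ref.length);
}

// Refer to the first five characters of a longer buffer. The byte that
// follows the fragment is ',' rather than '\0', so length must be respected.
inline std::string demoFragment() {
  static const char Buffer[] = "hello, world";
  return exampleStringRefToOwned(exampleStringRefCreate(Buffer, 5));
}
```

Because the fragment is non-owning and not null-terminated, callers should always use the length field and should copy the data if it has to outlive its owner.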

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D87677