platform/upstream/llvm.git
3 years ago[NVPTX] Add selp.f32 checks to select(cond,fpbinop(),fpbinop()) tests
Simon Pilgrim [Thu, 15 Jul 2021 11:42:29 +0000 (12:42 +0100)]
[NVPTX] Add selp.f32 checks to select(cond,fpbinop(),fpbinop()) tests

Will help show codegen diffs in an upcoming patch

3 years ago[InstCombine] Strip inbounds from (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (selec...
Simon Pilgrim [Thu, 15 Jul 2021 11:19:10 +0000 (12:19 +0100)]
[InstCombine] Strip inbounds from (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) fold

As discussed on rGd561b6fbdbe6, we can't guarantee that the new gep is inbounds
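
As a rough illustration of the fold (plain C++ pointer arithmetic, not LLVM IR; both function names are made up for this sketch), the two forms compute the same address. The sketch does not model inbounds or poison; it only shows that in the folded form the address computation runs on both paths, which is presumably why an inbounds guarantee from the original gep cannot simply be carried over.

```cpp
#include <cstddef>

// Original form: select C, (gep Ptr, Idx), Ptr
// The pointer arithmetic only happens on the "true" path.
inline const int *selectOfGep(bool c, const int *ptr, std::ptrdiff_t idx) {
  return c ? ptr + idx : ptr;
}

// Folded form: gep Ptr, (select C, Idx, 0)
// The pointer arithmetic now happens on both paths (with offset 0 when !c).
inline const int *gepOfSelect(bool c, const int *ptr, std::ptrdiff_t idx) {
  return ptr + (c ? idx : 0);
}
```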

3 years ago[MIPS] Refresh ashr test checks. NFCI.
Simon Pilgrim [Thu, 15 Jul 2021 11:05:33 +0000 (12:05 +0100)]
[MIPS] Refresh ashr test checks. NFCI.

3 years ago[mlir][nvvm]: Add math::Exp2Op lowering to NVVM.
Adrian Kuegel [Thu, 15 Jul 2021 10:04:25 +0000 (12:04 +0200)]
[mlir][nvvm]: Add math::Exp2Op lowering to NVVM.

Differential Revision: https://reviews.llvm.org/D106050

3 years ago[AArch64][GlobalISel] Optimise lowering for some vector types for min/max
Irina Dobrescu [Fri, 9 Jul 2021 12:09:06 +0000 (13:09 +0100)]
[AArch64][GlobalISel] Optimise lowering for some vector types for min/max

Differential Revision: https://reviews.llvm.org/D105696

3 years ago[AMDGPU] Use isMetaInstruction for instruction size
Sebastian Neubauer [Thu, 15 Jul 2021 08:21:33 +0000 (10:21 +0200)]
[AMDGPU] Use isMetaInstruction for instruction size

Meta instructions have a size of 0. Use isMetaInstruction instead of
listing them explicitly.

Differential Revision: https://reviews.llvm.org/D106043

3 years ago[TSan] Add SystemZ SANITIZER_GO support
Ilya Leoshkevich [Fri, 2 Jul 2021 00:49:30 +0000 (02:49 +0200)]
[TSan] Add SystemZ SANITIZER_GO support

Define the address ranges (similar to the C/C++ ones, but with the heap
range merged into the app range) and enable the sanity check.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Enable SystemZ support
Ilya Leoshkevich [Fri, 2 Jul 2021 00:43:00 +0000 (02:43 +0200)]
[TSan] Enable SystemZ support

Enable building the runtime and enable -fsanitize=thread in clang.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Adjust tests for SystemZ
Ilya Leoshkevich [Fri, 2 Jul 2021 00:43:49 +0000 (02:43 +0200)]
[TSan] Adjust tests for SystemZ

XFAIL map32bit, define the maximum possible allocation size in
mmap_large.cpp.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Intercept __tls_get_addr_internal and __tls_get_offset on SystemZ
Ilya Leoshkevich [Fri, 2 Jul 2021 14:59:32 +0000 (16:59 +0200)]
[TSan] Intercept __tls_get_addr_internal and __tls_get_offset on SystemZ

Reuse the assembly glue code from sanitizer_common_interceptors.inc and
the handling logic from the __tls_get_addr interceptor.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Disable __TSAN_HAS_INT128 on SystemZ
Ilya Leoshkevich [Fri, 2 Jul 2021 00:47:11 +0000 (02:47 +0200)]
[TSan] Disable __TSAN_HAS_INT128 on SystemZ

SystemZ does not have 128-bit atomics.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Add SystemZ longjmp support
Ilya Leoshkevich [Fri, 2 Jul 2021 00:46:21 +0000 (02:46 +0200)]
[TSan] Add SystemZ longjmp support

Implement the interceptor and stack pointer demangling.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Define C/C++ address ranges for SystemZ
Ilya Leoshkevich [Fri, 2 Jul 2021 00:44:43 +0000 (02:44 +0200)]
[TSan] Define C/C++ address ranges for SystemZ

The kernel supports a full 64-bit VMA, but we can use only 48 bits due
to the limitation imposed by SyncVar::GetId(). So define the address
ranges similar to the other architectures, except that the address
space "tail" needs to be made inaccessible in CheckAndProtect(). Since
it's for only one architecture, don't make an abstraction for this.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Define PTHREAD_ABI_BASE for SystemZ
Ilya Leoshkevich [Fri, 2 Jul 2021 00:47:25 +0000 (02:47 +0200)]
[TSan] Define PTHREAD_ABI_BASE for SystemZ

SystemZ's glibc symbols use version 2.3.2.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Build ignore_lib{0,1,5} tests with -fno-builtin
Ilya Leoshkevich [Thu, 8 Jul 2021 13:09:10 +0000 (15:09 +0200)]
[TSan] Build ignore_lib{0,1,5} tests with -fno-builtin

These tests depend on TSan seeing the intercepted memcpy(), so they
break when the compiler chooses the builtin version.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Use zeroext for function parameters
Ilya Leoshkevich [Fri, 2 Jul 2021 00:42:24 +0000 (02:42 +0200)]
[TSan] Use zeroext for function parameters

SystemZ ABI requires zero-extending function parameters to 64-bit. The
compiler is free to optimize the code around this assumption, e.g.
failing to zero-extend __tsan_atomic32_load()'s morder may cause
crashes in to_mo() switch table lookup.

Fix by adding zeroext attributes to TSan's FunctionCallees, similar to
how it was done in commit 3bc439bdff8b ("[MSan] Add instrumentation for
SystemZ"). This is a no-op on arches that don't need it.
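
A minimal sketch of the failure mode, assuming a 64-bit argument register that carries a 32-bit parameter. The helper names and the table size are hypothetical and only model the register contents; they are not TSan code.

```cpp
#include <cstdint>

// Model of a caller that did NOT zero-extend: the upper half of the
// 64-bit register still holds whatever was there before.
inline std::uint64_t passWithoutExtension(std::uint32_t value,
                                          std::uint64_t staleUpperBits) {
  return (staleUpperBits << 32) | value;
}

// Model of what a zeroext parameter guarantees: upper 32 bits are zero.
inline std::uint64_t passZeroExtended(std::uint32_t value) {
  return static_cast<std::uint64_t>(value);
}

// Illustrative table size; a callee that trusts the ABI indexes a table
// with the full register and no masking.
constexpr std::uint64_t kTableSize = 6;

inline bool indexIsInRange(std::uint64_t reg) { return reg < kTableSize; }
```

With stale upper bits the same logical value 5 becomes an out-of-range index, which is the kind of crash the commit describes for the switch table lookup.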

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[TSan] Align thread_registry_placeholder
Ilya Leoshkevich [Tue, 13 Jul 2021 13:51:47 +0000 (15:51 +0200)]
[TSan] Align thread_registry_placeholder

s390x requires ThreadRegistry.mtx_.opaque_storage_ to be 4-byte
aligned. Since other architectures may have similar requirements, use
the maximum thread_registry_placeholder alignment from other
sanitizers, which is 64 (LSan).
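
The pattern can be sketched as follows. FakeThreadRegistry, registryPlaceholder, and both helpers are stand-ins invented for this example; only the idea of over-aligning raw placeholder storage used with placement new is taken from the message.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Stand-in for the real ThreadRegistry; only size and alignment matter here.
struct FakeThreadRegistry {
  std::uint32_t mutexStorage;
  int threadCount;
};

// Raw storage for placement new, over-aligned to 64 bytes so that any
// member with a smaller alignment requirement is automatically satisfied.
alignas(64) static char registryPlaceholder[sizeof(FakeThreadRegistry)];

inline FakeThreadRegistry *createRegistry() {
  return new (registryPlaceholder) FakeThreadRegistry{0, 0};
}

inline bool isAligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```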

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[sanitizer] Force TLS allocation on s390
Ilya Leoshkevich [Tue, 13 Jul 2021 17:19:42 +0000 (19:19 +0200)]
[sanitizer] Force TLS allocation on s390

When running with an old glibc, CollectStaticTlsBlocks() calls
__tls_get_addr() in order to force TLS allocation. This function is not
available on s390 and the code simply does nothing in this case,
so all the resulting static TLS blocks end up being incorrect.

Fix by calling __tls_get_offset() on s390.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[sanitizer] Fix __sanitizer_kernel_sigset_t endianness issue
Ilya Leoshkevich [Fri, 2 Jul 2021 00:42:38 +0000 (02:42 +0200)]
[sanitizer] Fix __sanitizer_kernel_sigset_t endianness issue

setuid(0) hangs on SystemZ under TSan because TSan's BackgroundThread
ignores SIGSETXID. This in turn happens because internal_sigdelset()
messes up the mask bits on big-endian systems due to how
__sanitizer_kernel_sigset_t is defined.

Commit d9a1a53b8d80 ("[ESan] [MIPS] Fix workingset-signal-posix.cpp on
MIPS") fixed this for MIPS by adjusting the __sanitizer_kernel_sigset_t
definition. Generalize this by defining __SANITIZER_KERNEL_NSIG based
on kernel's _NSIG and using uptr[] for __sanitizer_kernel_sigset_t.sig
on all platforms.
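
A minimal sketch of word-based signal-set manipulation, assuming the set is stored as an array of unsigned long words. The type alias, kNSig value, and function names are hypothetical and are not the sanitizer's actual definitions. Because each bit is addressed within a whole word, the result does not depend on byte order.

```cpp
#include <climits>
#include <cstddef>

using uptr = unsigned long;  // assumption: stand-in for the sanitizer's uptr

constexpr std::size_t kNSig = 64;  // illustrative; the real value follows the kernel's _NSIG
constexpr std::size_t kBitsPerWord = sizeof(uptr) * CHAR_BIT;

struct KernelSigset {
  uptr sig[kNSig / kBitsPerWord];
};

// Signals are numbered from 1, so signal N lives at bit N - 1.
inline void sigAdd(KernelSigset &set, int signum) {
  std::size_t bit = static_cast<std::size_t>(signum - 1);
  set.sig[bit / kBitsPerWord] |= uptr{1} << (bit % kBitsPerWord);
}

inline void sigDel(KernelSigset &set, int signum) {
  std::size_t bit = static_cast<std::size_t>(signum - 1);
  set.sig[bit / kBitsPerWord] &= ~(uptr{1} << (bit % kBitsPerWord));
}

inline bool sigIsMember(const KernelSigset &set, int signum) {
  std::size_t bit = static_cast<std::size_t>(signum - 1);
  return (set.sig[bit / kBitsPerWord] >> (bit % kBitsPerWord)) & 1;
}
```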

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629

3 years ago[Test] We can benefit from pipelining of ymm load/stores
Max Kazantsev [Thu, 15 Jul 2021 09:40:34 +0000 (16:40 +0700)]
[Test] We can benefit from pipelining of ymm load/stores

This patch demonstrates a scenario when we need to load/store a single
64-byte value, which is done by 2 ymm loads and stores in AVX. The current
codegen chooses the following sequence:

  load ymm0
  load ymm1
  store ymm1
  store ymm0

If we instead stored ymm0 before ymm1, we could execute 2nd load and 1st store
in parallel.

3 years ago[AArch64][SME] Add outer product instructions
Cullen Rhodes [Thu, 15 Jul 2021 08:41:08 +0000 (08:41 +0000)]
[AArch64][SME] Add outer product instructions

This patch adds support for the following outer product instructions:

  * BFMOPA, BFMOPS, FMOPA, FMOPS, SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA,
    UMOPS, USMOPA, USMOPS.

Depends on D105570.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105571

3 years ago[NFC] [hwasan] Split argument logic into functions.
Florian Mayer [Wed, 14 Jul 2021 11:50:50 +0000 (12:50 +0100)]
[NFC] [hwasan] Split argument logic into functions.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105971

3 years agoFixes memory sanitizer 'use-of-uninitialized-value' diagnostic.
Bogdan Graur [Thu, 15 Jul 2021 09:15:17 +0000 (11:15 +0200)]
Fixes memory sanitizer 'use-of-uninitialized-value' diagnostic.

Differential Revision: https://reviews.llvm.org/D106047

3 years agoFix undeduced type assert
serge-sans-paille [Mon, 7 Jun 2021 15:14:43 +0000 (17:14 +0200)]
Fix undeduced type assert

If the instantiation of a member variable makes it possible to
compute a previously undeduced type, we should use that piece of
information.

Fix bug#50590

Differential Revision: https://reviews.llvm.org/D103849

3 years ago[llvm][tools] Hide unrelated llvm-bcanalyzer options
Timm Bäder [Tue, 13 Jul 2021 14:37:26 +0000 (16:37 +0200)]
[llvm][tools] Hide unrelated llvm-bcanalyzer options

They otherwise show up when we link against the dynamic libLLVM.so.

Differential Revision: https://reviews.llvm.org/D105893

3 years ago[mlir][crunner] fix bug in memref copy for rank 0
Aart Bik [Thu, 15 Jul 2021 04:01:00 +0000 (21:01 -0700)]
[mlir][crunner] fix bug in memref copy for rank 0

While replacing linalg.copy with the more desired memref.copy
I found a bug in the support library for rank 0 memref copying.
The code would loop for something like the following, since there
is code for no-rank and rank > 0, but rank == 0 was unexpected.

  memref.copy %0, %1: memref<f32> to memref<f32>

Note that a "regression test" for this will follow using the
sparse compiler migration to memref.copy which exercises this
case many times.
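
For illustration only, a strided copy with an explicit rank-0 case might look like the sketch below. This is not the support library's code; stridedCopy and its parameters are hypothetical. Without the rank == 0 guard, the index-advance loop would never run and the outer loop would never terminate.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies a strided float buffer into another strided buffer of the same shape.
inline void stridedCopy(const float *src, float *dst,
                        const std::vector<std::int64_t> &sizes,
                        const std::vector<std::int64_t> &srcStrides,
                        const std::vector<std::int64_t> &dstStrides) {
  const std::size_t rank = sizes.size();
  if (rank == 0) {  // a rank-0 memref holds exactly one element
    *dst = *src;
    return;
  }
  for (std::int64_t s : sizes)
    if (s == 0) return;  // empty shape: nothing to copy

  std::vector<std::int64_t> index(rank, 0);
  std::int64_t srcOff = 0, dstOff = 0;
  for (;;) {
    dst[dstOff] = src[srcOff];
    // Advance the multi-dimensional index, innermost dimension first.
    std::size_t dim = rank;
    while (dim > 0) {
      --dim;
      ++index[dim];
      srcOff += srcStrides[dim];
      dstOff += dstStrides[dim];
      if (index[dim] < sizes[dim]) break;  // still inside this dimension
      // Wrap this dimension back to 0 and carry into the next outer one.
      srcOff -= index[dim] * srcStrides[dim];
      dstOff -= index[dim] * dstStrides[dim];
      index[dim] = 0;
      if (dim == 0) return;  // outermost dimension wrapped: copy is done
    }
  }
}
```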

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D106036

3 years ago[gn build] Port b0d38ad0bc25
LLVM GN Syncbot [Thu, 15 Jul 2021 07:50:35 +0000 (07:50 +0000)]
[gn build] Port b0d38ad0bc25

3 years ago[clang][Analyzer] Add symbol uninterestingness to bug report.
Balázs Kéri [Thu, 15 Jul 2021 06:34:59 +0000 (08:34 +0200)]
[clang][Analyzer] Add symbol uninterestingness to bug report.

`PathSensitiveBugReport` has a function to mark a symbol as interesting but
it was not possible to clear this flag. This can be useful in some cases,
so the functionality is added.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D105637

3 years ago[2/2][RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs
Djordje Todorovic [Thu, 15 Jul 2021 06:45:19 +0000 (23:45 -0700)]
[2/2][RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs

This patch adds the forward scan for finding redundant DBG_VALUEs.

This analysis aims to remove redundant DBG_VALUEs by scanning forward
through the basic block, treating the first DBG_VALUE as valid
until its first (location) operand is clobbered/modified.
For example:

(1) DBG_VALUE $edi, !"var1", ...
(2) <block of code that does not affect $edi>
(3) DBG_VALUE $edi, !"var1", ...
 ...
in this case, we can remove (3).
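
A toy model of such a forward scan, assuming a basic block is just a list of instructions that either describe a variable's register or clobber a register. ToyInstr and findRedundantForward are invented for this sketch and are unrelated to the real MachineInstr API.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy instruction: either a DBG_VALUE (reg + variable) or an ordinary
// instruction that clobbers `reg`.
struct ToyInstr {
  bool isDebugValue;
  std::string reg;
  std::string var;  // only meaningful for DBG_VALUEs
};

// Returns the indices of DBG_VALUEs that repeat a location which is
// still valid, i.e. the register was not clobbered since the last one.
inline std::vector<std::size_t>
findRedundantForward(const std::vector<ToyInstr> &block) {
  std::map<std::string, std::string> liveLocation;  // variable -> register
  std::vector<std::size_t> redundant;
  for (std::size_t i = 0; i < block.size(); ++i) {
    const ToyInstr &mi = block[i];
    if (mi.isDebugValue) {
      auto it = liveLocation.find(mi.var);
      if (it != liveLocation.end() && it->second == mi.reg)
        redundant.push_back(i);  // same variable, same untouched register
      else
        liveLocation[mi.var] = mi.reg;
      continue;
    }
    // A clobber invalidates every variable currently described by this register.
    for (auto it = liveLocation.begin(); it != liveLocation.end();) {
      if (it->second == mi.reg)
        it = liveLocation.erase(it);
      else
        ++it;
    }
  }
  return redundant;
}
```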

Differential Revision: https://reviews.llvm.org/D105280

3 years ago[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x44
Tony Tye [Wed, 14 Jul 2021 03:31:04 +0000 (03:31 +0000)]
[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x44

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D106034

3 years ago[Coroutines] Run coroutine passes by default
Chuanqi Xu [Thu, 15 Jul 2021 06:31:31 +0000 (14:31 +0800)]
[Coroutines] Run coroutine passes by default

This patch makes the coroutine passes run by default in the LLVM pipeline. Now
clang and opt can handle IR inputs containing coroutine intrinsics
without special options.
This should be safe. On the one hand, the coroutine passes seem to be stable,
since many projects already use the coroutine feature.
On the other hand, the coroutine passes should do nothing for IR that doesn't
contain coroutine intrinsics.

Test Plan: check-llvm

Reviewed by: lxfind, aeubanks

Differential Revision: https://reviews.llvm.org/D105877

3 years agoDefend early against operation created without a registered dialect
Mehdi Amini [Thu, 15 Jul 2021 02:13:30 +0000 (02:13 +0000)]
Defend early against operation created without a registered dialect

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105961

3 years ago[MLIR] [Python] Add `owner` to PyValue and fix its parent reference
John Demme [Thu, 15 Jul 2021 03:19:27 +0000 (20:19 -0700)]
[MLIR] [Python] Add `owner` to PyValue and fix its parent reference

Adds `owner` python call to `mlir.ir.Value`.

Assuming that `PyValue.parentOperation` is intended to be the value's owner, this fixes the construction of it from `PyOpOperandList`.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D103853

3 years agoRevert "Defend early against operation created without a registered dialect"
Mehdi Amini [Thu, 15 Jul 2021 03:31:19 +0000 (03:31 +0000)]
Revert "Defend early against operation created without a registered dialect"

This reverts commit 58018858e887320e2432e2e00ace13273b8a1f29.

The Python bindings tests are broken.

3 years ago[Attributor] AACallEdges, Add a way to ask nonasm unknown callees
Kuter Dinel [Wed, 14 Jul 2021 15:42:51 +0000 (18:42 +0300)]
[Attributor] AACallEdges, Add a way to ask nonasm unknown callees

This patch adds a feature to the AACallEdges AbstractAttribute that allows
users to ask whether there is an unknown callee that isn't inline assembly.
This feature is needed by some of its users.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105992

3 years agoDefend early against operation created without a registered dialect
Mehdi Amini [Thu, 15 Jul 2021 02:13:30 +0000 (02:13 +0000)]
Defend early against operation created without a registered dialect

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105961

3 years ago[PowerPC][NFC] add testcase for update-form preparation with non-const increment
Chen Zheng [Thu, 15 Jul 2021 02:20:01 +0000 (02:20 +0000)]
[PowerPC][NFC] add testcase for update-form preparation with non-const increment

3 years ago[mlir][linalg] Improve codegen when tiling PadTensor evenly
Matthias Springer [Thu, 15 Jul 2021 02:27:52 +0000 (11:27 +0900)]
[mlir][linalg] Improve codegen when tiling PadTensor evenly

Produce simpler IR with more static type information and fewer affine expressions.

Differential Revision: https://reviews.llvm.org/D105530

3 years ago[mlir][linalg] Improve codegen of ExtractSliceOfPadTensorSwapPattern
Matthias Springer [Thu, 15 Jul 2021 02:05:12 +0000 (11:05 +0900)]
[mlir][linalg] Improve codegen of ExtractSliceOfPadTensorSwapPattern

Generate simpler code in case low/high padding of the PadTensorOp is statically zero.

Differential Revision: https://reviews.llvm.org/D105529

3 years ago[mlir][linalg] Fix Windows build
Matthias Springer [Thu, 15 Jul 2021 01:55:22 +0000 (10:55 +0900)]
[mlir][linalg] Fix Windows build

The build failure was introduced by D105458. (Linux builds were not affected.)

Differential Revision: https://reviews.llvm.org/D106029

3 years ago[mlir][linalg] Tile PadTensorOp
Matthias Springer [Thu, 15 Jul 2021 01:35:46 +0000 (10:35 +0900)]
[mlir][linalg] Tile PadTensorOp

Tiling can be enabled with `linalg-tile-pad-tensor-ops`. Only scf::ForOp can be generated at the moment.

Differential Revision: https://reviews.llvm.org/D105460

3 years ago[mlir][NFC] Move asOpFoldResult helper functions to StaticValueUtils
Matthias Springer [Thu, 15 Jul 2021 01:28:25 +0000 (10:28 +0900)]
[mlir][NFC] Move asOpFoldResult helper functions to StaticValueUtils

Differential Revision: https://reviews.llvm.org/D105602

3 years ago[mlir][linalg] Add optional output operand to PadTensorOp
Matthias Springer [Thu, 15 Jul 2021 01:20:00 +0000 (10:20 +0900)]
[mlir][linalg] Add optional output operand to PadTensorOp

This optional operand will be used for tiling in a subsequent commit.

Differential Revision: https://reviews.llvm.org/D105459

3 years ago[mlir][linalg][NFC] Factor out tile generation in makeTiledShapes
Matthias Springer [Thu, 15 Jul 2021 01:11:35 +0000 (10:11 +0900)]
[mlir][linalg][NFC] Factor out tile generation in makeTiledShapes

Factor out the functionality into a new function, so that it can be used for creating PadTensorOp tiles.

Differential Revision: https://reviews.llvm.org/D105458

3 years ago[gn build] Port b9c3941cd61d
LLVM GN Syncbot [Thu, 15 Jul 2021 01:12:36 +0000 (01:12 +0000)]
[gn build] Port b9c3941cd61d

3 years ago[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
Kai Luo [Thu, 15 Jul 2021 00:49:42 +0000 (00:49 +0000)]
[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand

This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expands atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D103614

3 years ago[OpenCL] opencl-c.h: CL3.0 generic address space
Dave Airlie [Thu, 15 Jul 2021 00:51:01 +0000 (10:51 +1000)]
[OpenCL] opencl-c.h: CL3.0 generic address space

This is one of the easier pieces of adding CL3.0 support.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D105526

3 years ago[OpenCL][NFC] opencl-c.h: reorder atomic operations
Dave Airlie [Thu, 15 Jul 2021 00:48:19 +0000 (10:48 +1000)]
[OpenCL][NFC] opencl-c.h: reorder atomic operations

This just reorders the atomics; it doesn't change anything except their layout in the header.

This is a prep patch for adding some conditionals around these for CL3.0 but that patch is much easier to review if all the atomic operations are grouped together like this.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D105601

3 years agolibclc: Add -cl-no-stdinc to clang flags on clang >=13
Jan Vesely [Thu, 15 Jul 2021 00:41:50 +0000 (10:41 +1000)]
libclc: Add -cl-no-stdinc to clang flags on clang >=13

cf3ef15a6ec5e5b45c6c54e8fbe3769255e815ce ("[OpenCL] Add builtin
declarations by default.")
switched behaviour to include "opencl-c-base.h". We don't want or need
that for libclc so pass the flag to revert to old behaviour.

Fixes build since cf3ef15a6ec5e5b45c6c54e8fbe3769255e815ce

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D99794

3 years ago[AMDGPU] Use update_test_checks.py script for annotate kernel features tests.
Kuter Dinel [Tue, 13 Jul 2021 02:14:50 +0000 (05:14 +0300)]
[AMDGPU] Use update_test_checks.py script for annotate kernel features tests.

This patch makes the annotate kernel features tests use the update_test_checks.py
script, which makes the tests easy to update.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105864

3 years ago[libc++] NFCI: Restore code duplication in wrap_iter, with test.
Arthur O'Dwyer [Wed, 14 Jul 2021 04:01:47 +0000 (00:01 -0400)]
[libc++] NFCI: Restore code duplication in wrap_iter, with test.

It turns out that D105040 broke `std::rel_ops`; we actually do need
both a one-template-parameter and a two-template-parameter version of
all the comparison operators, because if we have only the heterogeneous
two-parameter version, then `x > x` is ambiguous:

    template<class T, class U> int f(S<T>, S<U>) { return 1; }
    template<class T> int f(T, T) { return 2; }  // rel_ops
    S<int> s; f(s,s);  // ambiguous between #1 and #2

Adding the one-template-parameter version fixes the ambiguity:

    template<class T, class U> int f(S<T>, S<U>) { return 1; }
    template<class T> int f(T, T) { return 2; }  // rel_ops
    template<class T> int f(S<T>, S<T>) { return 3; }
    S<int> s; f(s,s);  // #3 beats both #1 and #2

We have the same problem with `reverse_iterator` as with `__wrap_iter`.
But so do libstdc++ and Microsoft, so we're not going to worry about it.

Differential Revision: https://reviews.llvm.org/D105894

3 years ago[clang] Refactor AST printing tests to share more infrastructure
Nathan Ridge [Tue, 6 Jul 2021 05:40:24 +0000 (01:40 -0400)]
[clang] Refactor AST printing tests to share more infrastructure

Differential Revision: https://reviews.llvm.org/D105457

3 years ago[WebAssembly] Codegen for v128.storeX_lane instructions
Thomas Lively [Wed, 14 Jul 2021 23:15:24 +0000 (16:15 -0700)]
[WebAssembly] Codegen for v128.storeX_lane instructions

Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.

Differential Revision: https://reviews.llvm.org/D106019

3 years ago[GlobalOpt] Fix a miscompile when evaluating struct initializers.
Jon Roelofs [Mon, 12 Jul 2021 19:43:45 +0000 (12:43 -0700)]
[GlobalOpt] Fix a miscompile when evaluating struct initializers.

The bug was that evaluateBitcastFromPtr attempts a narrowing to a struct's 0th
element of a store that covers other elements. While this is okay on the load
side, applying it to stores causes us to miss the writes to the additionally
covered elements.

rdar://79503568

Differential revision: https://reviews.llvm.org/D105838

3 years ago[Support] Turn on SupportTest for Apple Silicon
Steven Wu [Wed, 14 Jul 2021 22:23:37 +0000 (15:23 -0700)]
[Support] Turn on SupportTest for Apple Silicon

Follow up for D106012, turn on unittest for Host on Apple Silicon.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106020

3 years ago[libomptarget] Keep the Shadow Pointer Map up-to-date
George Rokos [Tue, 13 Jul 2021 22:33:50 +0000 (15:33 -0700)]
[libomptarget] Keep the Shadow Pointer Map up-to-date

D105812 introduced a regression where if a PTR_AND_OBJ entry was mapped on the device, then the OBJ was deallocated and then reallocated at a different address, the Shadow Pointer Map would still contain an entry for the PTR but pointing to the old address. This caused test `env/base_ptr_ref_count.c` to fail.

Differential Revision: https://reviews.llvm.org/D105947

3 years ago[clang-format] Make BreakAfterReturnType work with K&R C functions
owenca [Wed, 14 Jul 2021 06:49:48 +0000 (23:49 -0700)]
[clang-format] Make BreakAfterReturnType work with K&R C functions

This fixes PR50999.

Differential Revision: https://reviews.llvm.org/D105964

3 years ago[docs][OpaquePtr] Remove finished task
Arthur Eubanks [Wed, 14 Jul 2021 21:36:07 +0000 (14:36 -0700)]
[docs][OpaquePtr] Remove finished task

3 years ago[ARM] Fix RELA relocations for 32bit ARM.
Wolfgang Pieb [Fri, 25 Jun 2021 17:54:24 +0000 (10:54 -0700)]
[ARM] Fix RELA relocations for 32bit ARM.

RELA relocations for 32 bit ARM ignored the addend. Some tools generate
them instead of REL type relocations. This fixes PR50473.
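
To illustrate the difference, here is a hypothetical sketch of an absolute 32-bit relocation computed as S + A. The struct layouts follow the ELF32 Rel/Rela records; the apply functions are made up for this example and are not the patched code.

```cpp
#include <cstdint>

// ELF32 relocation records: REL has no addend field, RELA carries one.
struct Rel32 {
  std::uint32_t offset;
  std::uint32_t info;
};
struct Rela32 {
  std::uint32_t offset;
  std::uint32_t info;
  std::int32_t addend;
};

// REL: the addend A is whatever is already stored at the relocated location.
inline std::uint32_t applyAbs32Rel(std::uint32_t symbolValue,
                                   std::uint32_t inPlaceAddend) {
  return symbolValue + inPlaceAddend;
}

// RELA: the addend A comes from the record itself and must not be dropped.
inline std::uint32_t applyAbs32Rela(std::uint32_t symbolValue,
                                    const Rela32 &rela) {
  return symbolValue + static_cast<std::uint32_t>(rela.addend);
}
```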

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D105214

3 years ago[Polly] Fix misleading debug message. NFC.
Michael Kruse [Wed, 14 Jul 2021 21:21:45 +0000 (16:21 -0500)]
[Polly] Fix misleading debug message. NFC.

The number of parameters can be the reason for aliasing checks not being
generated, but most of the time it is for other reasons.

3 years ago[llvm-strip][WebAssembly] Support strip flags
Derek Schuff [Fri, 31 Jan 2020 23:55:47 +0000 (15:55 -0800)]
[llvm-strip][WebAssembly] Support strip flags

Summary:
Add support for the basic section stripping (and keeping) flags for wasm:
strip with no flags, --strip-all, --strip-debug,
--only-section, --keep-section, and --only-keep-debug.

Factor section removal into a function and use a predicate chain like
the ELF implementation.
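
A predicate chain of this kind can be sketched as below. StripOptions, buildRemovePredicate, and the ".debug_" prefix check are assumptions made for the example, not the tool's real option handling. Each flag wraps the previous predicate, so later flags can extend or veto earlier decisions.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

using SectionPred = std::function<bool(const std::string &)>;

// Hypothetical subset of the strip flags.
struct StripOptions {
  bool stripDebug = false;
  std::vector<std::string> onlySection;
  std::vector<std::string> keepSection;
};

inline bool contains(const std::vector<std::string> &names,
                     const std::string &name) {
  return std::find(names.begin(), names.end(), name) != names.end();
}

// Assumption for this sketch: debug sections are named ".debug_*".
inline bool isDebugSection(const std::string &name) {
  return name.rfind(".debug_", 0) == 0;
}

// Builds "should this section be removed?" by chaining one lambda per flag.
inline SectionPred buildRemovePredicate(const StripOptions &opts) {
  SectionPred removePred = [](const std::string &) { return false; };
  if (opts.stripDebug) {
    removePred = [removePred](const std::string &name) {
      return removePred(name) || isDebugSection(name);
    };
  }
  if (!opts.onlySection.empty()) {
    removePred = [removePred, opts](const std::string &name) {
      return removePred(name) || !contains(opts.onlySection, name);
    };
  }
  if (!opts.keepSection.empty()) {
    // Keep flags are applied last so they override every removal above.
    removePred = [removePred, opts](const std::string &name) {
      return !contains(opts.keepSection, name) && removePred(name);
    };
  }
  return removePred;
}
```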

Reviewers: jhenderson, sbc100

Differential Revision: https://reviews.llvm.org/D73820

3 years agoPrecommit test for D106017
Arthur Eubanks [Wed, 14 Jul 2021 20:56:59 +0000 (13:56 -0700)]
Precommit test for D106017

3 years ago[SimpleLoopUnswitch] Don't non-trivially unswitch loops with catchswitch exits
Arthur Eubanks [Thu, 8 Jul 2021 22:05:50 +0000 (15:05 -0700)]
[SimpleLoopUnswitch] Don't non-trivially unswitch loops with catchswitch exits

SplitBlock() can't handle catchswitch.

Fixes PR50973.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D105672

3 years ago[AArch64] Fix selection of G_UNMERGE <2 x s16>
Jon Roelofs [Wed, 14 Jul 2021 19:45:28 +0000 (12:45 -0700)]
[AArch64] Fix selection of G_UNMERGE <2 x s16>

Differential revision: https://reviews.llvm.org/D106007

3 years ago[mlir][affine] Add single result affine.min/max -> affine.apply canonicalization.
Nicolas Vasilache [Wed, 14 Jul 2021 20:33:29 +0000 (20:33 +0000)]
[mlir][affine] Add single result affine.min/max -> affine.apply canonicalization.

Differential Revision: https://reviews.llvm.org/D106014

3 years ago[tests] Stabilize tests for possible change in deref semantics
Philip Reames [Wed, 14 Jul 2021 20:35:18 +0000 (13:35 -0700)]
[tests] Stabilize tests for possible change in deref semantics

This is conceptually part of e75a2dfe.  This file contains both tests whose results don't change (with the right attributes added), and tests which fundamentally regress with the current proposal.  Doing the update took some care, thus the separate change.

Here's the e75a2dfe context repeated:

There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which cannot free that memory.  There are a couple of other tests which need more one-off attention; they'll be handled in another change.

3 years ago[asan][clang] Add flag to outline instrumentation
Kirill Stoimenov [Wed, 14 Jul 2021 19:31:49 +0000 (12:31 -0700)]
[asan][clang] Add flag to outline instrumentation

Summary: This option can be used to reduce the size of the
binary. The trade-off in this case would be the run-time
performance.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105726

3 years ago[lldb] Make TargetList iterable (NFC)
Jonas Devlieghere [Tue, 13 Jul 2021 17:34:27 +0000 (10:34 -0700)]
[lldb] Make TargetList iterable (NFC)

Make it possible to iterate over the TargetList.

Differential revision: https://reviews.llvm.org/D105914

3 years ago[lldb] Always call DestroyImpl from Process::Finalize
Jonas Devlieghere [Wed, 14 Jul 2021 20:21:36 +0000 (13:21 -0700)]
[lldb] Always call DestroyImpl from Process::Finalize

Always destroy the process, regardless of its private state. This will
call the virtual function DoDestroy under the hood, giving our derived
class a chance to do the necessary tear down, including what to do when
the private state is eStateExited.

Differential revision: https://reviews.llvm.org/D106004

3 years ago[Support] Get correct number of physical cores on Apple Silicon
Steven Wu [Wed, 14 Jul 2021 20:29:15 +0000 (13:29 -0700)]
[Support] Get correct number of physical cores on Apple Silicon

Fix a bug where `computeHostNumPhysicalCores` falls back to the default
unknown value when building for Apple Silicon Macs.

rdar://80533675

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D106012

3 years ago[mlir] NFC - Add AffineMap::replace variant with dim/symbol inference
Nicolas Vasilache [Wed, 14 Jul 2021 20:06:16 +0000 (20:06 +0000)]
[mlir] NFC - Add AffineMap::replace variant with dim/symbol inference

3 years agoGlobal variables with strong definitions cannot be freed
Philip Reames [Wed, 14 Jul 2021 20:19:48 +0000 (13:19 -0700)]
Global variables with strong definitions cannot be freed

With the current deref semantics, this is redundant - since we assume that anything which is dereferenceable (ever) can't be freed - but it becomes necessary for the deref-at-point semantics.

Testing wise, this is covered by test/CodeGen/X86/hoist-invariant-load.ll when -use-dereferenceable-at-point-semantics is active.  I didn't bother duplicating the command line since a) it's an in-development mode, and b) the change is pretty obvious.

3 years ago[libcxx] [test] Remove a LIBCXX-WINDOWS-FIXME in trivial_abi/unique_ptr_ret
Martin Storsjö [Wed, 14 Jul 2021 06:40:25 +0000 (09:40 +0300)]
[libcxx] [test] Remove a LIBCXX-WINDOWS-FIXME in trivial_abi/unique_ptr_ret

This is the same thing that was clarified in D105906 for weak_ptr_ret.

Differential Revision: https://reviews.llvm.org/D105965

3 years ago[tests] Stabilize tests for possible change in deref semantics
Philip Reames [Wed, 14 Jul 2021 19:35:23 +0000 (12:35 -0700)]
[tests] Stabilize tests for possible change in deref semantics

There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which cannot free that memory.  There are a couple of other tests which need more one-off attention; they'll be handled in another change.

3 years ago[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization
Stanislav Mekhanoshin [Mon, 12 Jul 2021 19:27:34 +0000 (12:27 -0700)]
[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization

Any def of EXEC prevents rematerialization of any VOP instruction
because of the physreg use. Create a callback to check if the
physreg use can be ignored to allow rematerialization.

Differential Revision: https://reviews.llvm.org/D105836

3 years ago[SLP][NFC]Fix variables names, NFC.
Alexey Bataev [Wed, 14 Jul 2021 19:42:51 +0000 (12:42 -0700)]
[SLP][NFC]Fix variables names, NFC.

3 years ago[docs] Fix :option:`--file-header` reference in llvm-readelf.rst after D105532
Fangrui Song [Wed, 14 Jul 2021 19:39:22 +0000 (12:39 -0700)]
[docs] Fix :option:`--file-header` reference in llvm-readelf.rst after D105532

3 years ago[SLP] Fix case of variable name. NFCI.
Simon Pilgrim [Wed, 14 Jul 2021 19:16:56 +0000 (20:16 +0100)]
[SLP] Fix case of variable name. NFCI.

3 years ago[Bazel] Uniformly export all MLIR td files
Geoffrey Martin-Noble [Wed, 14 Jul 2021 19:09:41 +0000 (12:09 -0700)]
[Bazel] Uniformly export all MLIR td files

CMake would have no restrictions on this and the custom list is a pain
to maintain.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D106003

3 years ago[runtimes] Bring back TARGET_TRIPLE
Louis Dionne [Wed, 14 Jul 2021 19:13:19 +0000 (15:13 -0400)]
[runtimes] Bring back TARGET_TRIPLE

This commit reverts 5099e01568 and 77396bbc98, which broke the build
in various ways. I'm reverting until I can investigate, since that
change appears to be way more subtle than it seemed.

3 years ago[NFC] Drop redundant check prefixes in newly added test file
Roman Lebedev [Wed, 14 Jul 2021 19:14:05 +0000 (22:14 +0300)]
[NFC] Drop redundant check prefixes in newly added test file

3 years ago[Attributes] Use single method to fetch type from AttributeSet (NFC)
Nikita Popov [Wed, 14 Jul 2021 19:09:06 +0000 (21:09 +0200)]
[Attributes] Use single method to fetch type from AttributeSet (NFC)

While it is nice to have separate methods in the public AttributeSet
API, we can fetch the type from the internal AttributeSetNode
using a generic API for all type attribute kinds.

3 years ago[NFC][PhaseOrdering] Add test for the lack of CSE after SimplifyCFG (PR51092)
Roman Lebedev [Wed, 14 Jul 2021 18:54:04 +0000 (21:54 +0300)]
[NFC][PhaseOrdering] Add test for the lack of CSE after SimplifyCFG (PR51092)

3 years ago[ARM] Move add(VMLALVA(A, X, Y), B) to VMLALVA(add(A, B), X, Y)
David Green [Wed, 14 Jul 2021 19:06:49 +0000 (20:06 +0100)]
[ARM] Move add(VMLALVA(A, X, Y), B) to VMLALVA(add(A, B), X, Y)

For i64 reductions we currently try to convert add(VMLALV(X, Y), B) to
VMLALVA(B, X, Y), incorporating the addition into the VMLALVA. If we
have an add of an existing VMLALVA, this patch pushes the add up above
the VMLALVA so that it may potentially be simplified further, for
example being folded into another VMLALV.
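
The rewrite relies on the accumulator being a plain i64 addend. A minimal
scalar sketch of that identity follows; the vmlalv/vmlalva helpers and the
V4i32 alias are illustrative models assumed here, not the MVE instruction
definitions or code from the patch:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<std::int32_t, 4>;

// Scalar model of VMLALVA (an assumption for illustration): accumulate the
// widened lane products of X and Y onto the i64 value A.
std::int64_t vmlalva(std::int64_t A, const V4i32 &X, const V4i32 &Y) {
  for (std::size_t I = 0; I < X.size(); ++I)
    A += static_cast<std::int64_t>(X[I]) * Y[I];
  return A;
}

// VMLALV is modelled as the non-accumulating form: VMLALVA starting at zero.
std::int64_t vmlalv(const V4i32 &X, const V4i32 &Y) {
  return vmlalva(0, X, Y);
}

// Checks both steps on one small input. Values are kept small so the signed
// i64 arithmetic in this model cannot overflow.
void checkAddPushedAboveVmlalva() {
  const V4i32 X{1, -2, 3, 4}, Y{5, 6, -7, 8};
  const V4i32 P{2, 2, 2, 2}, Q{1, 0, -1, 3};
  const std::int64_t A = 100, B = -9;
  // add(VMLALVA(A, X, Y), B) == VMLALVA(add(A, B), X, Y)
  assert(vmlalva(A, X, Y) + B == vmlalva(A + B, X, Y));
  // With B = VMLALV(P, Q), the hoisted add folds into another VMLALVA.
  assert(vmlalva(A, X, Y) + vmlalv(P, Q) == vmlalva(vmlalva(A, P, Q), X, Y));
}
```

The second assertion shows why pushing the add up can help: once the add
feeds the accumulator, it is in the shape the existing add(VMLALV) fold
already handles.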

Differential Revision: https://reviews.llvm.org/D105686

3 years ago[scudo] Don't enabled MTE for small alignment
Vitaly Buka [Wed, 14 Jul 2021 01:11:57 +0000 (18:11 -0700)]
[scudo] Don't enabled MTE for small alignment

Differential Revision: https://reviews.llvm.org/D105954

3 years agoRemove uses of deprecated target AllPassesAndDialectsNoRegistration in Bazel (NFC)
Mehdi Amini [Wed, 14 Jul 2021 19:01:34 +0000 (19:01 +0000)]
Remove uses of deprecated target AllPassesAndDialectsNoRegistration in Bazel (NFC)

It was an alias for a long time.

3 years ago[Verifier] Improve incompatible attribute type check
Nikita Popov [Wed, 14 Jul 2021 18:58:52 +0000 (20:58 +0200)]
[Verifier] Improve incompatible attribute type check

A couple of attributes had explicit checks for incompatibility
with pointer types. However, this is already handled generically
by the typeIncompatible() check. We can drop these after adding
SwiftError to typeIncompatible().

However, the previous implementation of the check prints out all
attributes that are incompatible with a given type, even though
those attributes aren't actually used. This has the annoying
result that the error message changes every time a new attribute
is added to the list. Improve this by explicitly finding which
attribute isn't compatible and printing just that.

3 years agoDemangle: correct swift_async demangling for Microsoft scheme
Saleem Abdulrasool [Wed, 14 Jul 2021 18:42:24 +0000 (11:42 -0700)]
Demangle: correct swift_async demangling for Microsoft scheme

The emission was corrected for the swift_async calling convention but
the demangling support was not.  This repairs the demangling support as
well.

3 years ago[SelectionDAG] Add an overload of getStepVector that assumes step 1.
Eli Friedman [Mon, 12 Jul 2021 22:11:01 +0000 (15:11 -0700)]
[SelectionDAG] Add an overload of getStepVector that assumes step 1.

This is mostly a minor convenience, but the pattern seems frequent
enough to be worthwhile (and we'll probably add more uses in the
future).

Differential Revision: https://reviews.llvm.org/D105850

3 years ago[WebAssembly] Codegen for v128.loadX_lane instructions
Thomas Lively [Wed, 14 Jul 2021 18:31:53 +0000 (11:31 -0700)]
[WebAssembly] Codegen for v128.loadX_lane instructions

Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.

Differential Revision: https://reviews.llvm.org/D105950

3 years ago[runtimes] Inherit the TARGET_TRIPLE that may be set by LLVM
Louis Dionne [Wed, 14 Jul 2021 18:25:13 +0000 (14:25 -0400)]
[runtimes] Inherit the TARGET_TRIPLE that may be set by LLVM

3 years ago[WebAssembly] Remove datalayout strings from llc tests
Thomas Lively [Wed, 14 Jul 2021 18:17:08 +0000 (11:17 -0700)]
[WebAssembly] Remove datalayout strings from llc tests

The data layout strings do not have any effect on llc tests and will become
misleadingly out of date as we continue to update the canonical data layout, so
remove them from the tests.

Differential Revision: https://reviews.llvm.org/D105842

3 years ago[ELF] --fortran-common: prefer STB_WEAK to COMMON
Fangrui Song [Wed, 14 Jul 2021 17:18:30 +0000 (10:18 -0700)]
[ELF] --fortran-common: prefer STB_WEAK to COMMON

The ELF specification says "The link editor honors the common definition and
ignores the weak ones." GNU ld and our Symbol::compare follow this, but the
--fortran-common code (D86142) made a mistake on the precedence.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51082

Reviewed By: peter.smith, sfertile

Differential Revision: https://reviews.llvm.org/D105945

3 years ago[ARM] Lower v16i8 -> i64 VMLA reductions.
David Green [Wed, 14 Jul 2021 17:11:32 +0000 (18:11 +0100)]
[ARM] Lower v16i8 -> i64 VMLA reductions.

MVE does not have a VMLALV instruction that can perform v16i8 -> i64
reductions, like it does for v8i16->i64 and v4i32->i64 reductions. That
means that the pattern to create them will be split up by type
legalization, creating a lot of instructions.

This extends the patterns for matching i64 reductions a little to handle
the v16i8->i64 case. We need to turn them into a pair of v8i16->i64
VMLALVs that each perform half of the reduction and are summed together
(so the latter is a VMLALVA). The order of the lanes does not matter for
the reduction so we generate a MVEEXT for the extension, that will
either be folded into an extending load or can be optimized to a
VREV/VMOVL. Some of the resulting codegen isn't optimal, but will be
improved in a later patch.
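
As a rough scalar sketch of the split (not the lowering code itself):
extendHalf is a hypothetical stand-in for the MVEEXT step, the extension is
assumed to be signed, and both a low/high and an even/odd lane split are
modelled to show that the lane order does not change the result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<std::int8_t, 16>;
using V8i16 = std::array<std::int16_t, 8>;

// Reference: the full v16i8 -> i64 multiply-accumulate reduction.
std::int64_t reduceV16i8(const V16i8 &X, const V16i8 &Y) {
  std::int64_t Acc = 0;
  for (std::size_t I = 0; I < 16; ++I)
    Acc += static_cast<std::int64_t>(X[I]) * Y[I];
  return Acc;
}

// Hypothetical model of the extension: sign-extend 8 of the 16 lanes.
// EvenOdd selects lanes {0,2,4,...} / {1,3,5,...}; otherwise the low / high
// half is taken.
V8i16 extendHalf(const V16i8 &V, bool Second, bool EvenOdd) {
  V8i16 Out{};
  for (std::size_t I = 0; I < 8; ++I) {
    std::size_t Lane = EvenOdd ? 2 * I + (Second ? 1 : 0)
                               : I + (Second ? 8 : 0);
    Out[I] = V[Lane];
  }
  return Out;
}

// Scalar model of a v8i16 -> i64 VMLALVA: A + sum(X[i] * Y[i]).
std::int64_t vmlalvaV8i16(std::int64_t A, const V8i16 &X, const V8i16 &Y) {
  for (std::size_t I = 0; I < 8; ++I)
    A += static_cast<std::int64_t>(X[I]) * Y[I];
  return A;
}

// Two half reductions: the first starts from zero (VMLALV), the second
// accumulates onto it (VMLALVA).
std::int64_t reduceSplit(const V16i8 &X, const V16i8 &Y, bool EvenOdd) {
  std::int64_t First = vmlalvaV8i16(0, extendHalf(X, false, EvenOdd),
                                    extendHalf(Y, false, EvenOdd));
  return vmlalvaV8i16(First, extendHalf(X, true, EvenOdd),
                      extendHalf(Y, true, EvenOdd));
}
```

Both lane splits give the same sum as reduceV16i8, as long as the same lanes
are picked from both operands, which is what leaves the backend free to
choose whichever extension is cheapest.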

Differential Revision: https://reviews.llvm.org/D105680

3 years ago[InstCombine] reorder icmp with offset folds for better results
Sanjay Patel [Wed, 14 Jul 2021 15:57:36 +0000 (11:57 -0400)]
[InstCombine] reorder icmp with offset folds for better results

This set of folds was added recently with:
c7b658aeb526
0c400e895306
40b752d28d95

...and I noted that this wasn't likely to fire in code derived
from C/C++ source because of nsw in particular. But I didn't
notice that I had placed the code above the no-wrap block
of transforms.

This is likely the cause of regressions noted from the previous
commit because -- as shown in the test diffs -- we may have
transformed into a compare with an arbitrary constant rather
than a simpler signbit test.
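
To illustrate why the no-wrap transforms give the simpler result, here is a
small exhaustive check over i8. It models only the general identity
(X + C1) <s C2 --> X <s (C2 - C1) under nsw, not the exact patterns from the
commits listed above; wrapToI8 and foldHolds are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Wrap an int to the two's-complement i8 range, returned as an int.
int wrapToI8(int V) {
  int U = ((V % 256) + 256) % 256; // reduce to 0..255
  return U >= 128 ? U - 256 : U;
}

// Returns true if (X + C1) <s C2 agrees with X <s (C2 - C1) for every i8 X.
// With RequireNoWrap set, inputs where X + C1 leaves the i8 range are
// skipped, modelling `add nsw` (such inputs are poison in IR).
bool foldHolds(std::int8_t C1, std::int8_t C2, bool RequireNoWrap) {
  int Diff = int(C2) - int(C1);
  if (Diff < -128 || Diff > 127)
    return false; // this sketch only handles a representable new constant
  for (int X = -128; X <= 127; ++X) {
    int Sum = X + C1;
    bool Wraps = Sum < -128 || Sum > 127;
    if (RequireNoWrap && Wraps)
      continue;
    bool Original = wrapToI8(Sum) < C2;
    bool Folded = X < Diff;
    if (Original != Folded)
      return false;
  }
  return true;
}
```

For example, foldHolds(5, 5, true) holds and the folded compare is X <s 0,
a signbit test, while foldHolds(5, 5, false) fails because X = 127 wraps.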

3 years ago[InstCombine] add tests for icmp with constant offset and no-wrap flags; NFC
Sanjay Patel [Wed, 14 Jul 2021 15:35:23 +0000 (11:35 -0400)]
[InstCombine] add tests for icmp with constant offset and no-wrap flags; NFC

3 years ago[LV] Print remark when loop cannot be vectorized due to invalid costs.
Sander de Smalen [Wed, 14 Jul 2021 15:45:07 +0000 (16:45 +0100)]
[LV] Print remark when loop cannot be vectorized due to invalid costs.

This patch emits remarks for instructions that have invalid costs for
a given set of vectorization factors. Some example output:

  t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load
      dst[i] = sinf(src[i]);
                    ^
  t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32
      dst[i] = sinf(src[i]);
               ^
  t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store
      dst[i] = sinf(src[i]);
             ^

Reviewed By: fhahn, kmclaughlin

Differential Revision: https://reviews.llvm.org/D105806

3 years agoGlobalISel: Handle lowering non-power-of-2 extloads
Matt Arsenault [Thu, 10 Jun 2021 13:28:20 +0000 (09:28 -0400)]
GlobalISel: Handle lowering non-power-of-2 extloads

3 years ago[CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> invalid.
Sander de Smalen [Wed, 14 Jul 2021 08:43:30 +0000 (09:43 +0100)]
[CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> invalid.

At the moment, <vscale x 1 x eltty> are not yet fully handled by the
code-generator, so to avoid vectorizing loops with that VF, we mark the
cost for these types as invalid.
The reason for not adding a new "TTI::getMinimumScalableVF" is that
the type is supposed to be a type that can be legalized. It partially is,
although the support for these types needs some more work.

Reviewed By: paulwalker-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D103882

3 years agoCombine two diagnostics into one and correct grammar
Aaron Ballman [Wed, 14 Jul 2021 15:40:37 +0000 (11:40 -0400)]
Combine two diagnostics into one and correct grammar

The anonymous and non-anonymous bit-field diagnostics are easily
combined into one diagnostic. However, the diagnostic was missing a
"the" that is present in the almost-identically worded
warn_bitfield_width_exceeds_type_width diagnostic, hence the changes to
test cases.