platform/upstream/llvm.git
2 years ago[SLP][X86] Regenerate intrinsic.ll test checks
Simon Pilgrim [Thu, 19 Aug 2021 17:35:14 +0000 (18:35 +0100)]
[SLP][X86] Regenerate intrinsic.ll test checks

2 years ago[libc] Add a trivial implementation for bcmp
Guillaume Chatelet [Thu, 19 Aug 2021 17:55:16 +0000 (17:55 +0000)]
[libc] Add a trivial implementation for bcmp

Differential Revision: https://reviews.llvm.org/D108225

2 years ago[libomptarget][nfc] Move lanemask_t type into target_impl.h
Jon Chesterfield [Thu, 19 Aug 2021 17:42:23 +0000 (18:42 +0100)]
[libomptarget][nfc] Move lanemask_t type into target_impl.h

2 years agoAArch64: copy all parts of the mem operand across when combining a store
Tim Northover [Thu, 19 Aug 2021 14:15:37 +0000 (15:15 +0100)]
AArch64: copy all parts of the mem operand across when combining a store

In particular we were dropping volatility, which can lead to unwanted
transformations.

2 years ago[CostModel][X86] Add isnan half/float/double costs tests
Simon Pilgrim [Thu, 19 Aug 2021 17:06:52 +0000 (18:06 +0100)]
[CostModel][X86] Add isnan half/float/double costs tests

2 years ago[InstCombine] Avoid folding GEPs across loop boundaries
Chang-Sun Lin, Jr [Thu, 19 Aug 2021 17:01:34 +0000 (20:01 +0300)]
[InstCombine] Avoid folding GEPs across loop boundaries

Folding a GEP from outside to inside a loop will materialize an add where there wasn't an equivalent operation before. Check the containing loops before making this fold.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107935

2 years ago[NFC][InstCombine] Add test for one-use one-index geps in different loops
Chang-Sun Lin, Jr [Thu, 19 Aug 2021 16:53:48 +0000 (19:53 +0300)]
[NFC][InstCombine] Add test for one-use one-index geps in different loops

2 years ago[OpaquePtr][Inline] Use byval type instead of pointee type
Arthur Eubanks [Fri, 9 Jul 2021 16:37:50 +0000 (09:37 -0700)]
[OpaquePtr][Inline] Use byval type instead of pointee type

Reviewed By: #opaque-pointers, dblaikie

Differential Revision: https://reviews.llvm.org/D105711

2 years agoUse v16i8 rather than v2i64 as the VT for memset expansion on AArch64.
Owen Anderson [Thu, 19 Aug 2021 08:00:29 +0000 (08:00 +0000)]
Use v16i8 rather than v2i64 as the VT for memset expansion on AArch64.

This allows the instruction selector to realize that it can directly
broadcast the low byte of the memset value, rather than replicating
it to a 64-bit GPR before broadcasting.

This fixes PR50985.

Differential Revision: https://reviews.llvm.org/D108354

2 years agoFix unknown parameter Wdocumentation warnings. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 16:45:26 +0000 (17:45 +0100)]
Fix unknown parameter Wdocumentation warnings. NFC.

2 years ago[clang] Do not warn unused -enable-trivial-auto-var-init-zero-knowing-it-will-be...
Yi Kong [Wed, 18 Aug 2021 08:24:04 +0000 (16:24 +0800)]
[clang] Do not warn unused -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang

Android enables zero initialisation globally by default, but also allows
subprojects to override it with a different option. Clang complains that
the above flag is unused in this case.

Instead of adding a 75-character-long -no-* flag, don't warn about an
unused argument for this flag.

Differential Revision: https://reviews.llvm.org/D108278

2 years agoMemoryBuiltins: trailing , on collection literal
Augie Fackler [Thu, 19 Aug 2021 15:17:39 +0000 (11:17 -0400)]
MemoryBuiltins: trailing , on collection literal

This was probably bugging me more than is reasonable, but having the
trailing comma here makes merging changes in this file slightly less
annoying. I only noticed this because Rust is currently carrying a patch
to this file and it kept making life a little difficult.

2 years agoFix CodeGen/X86/fsafdo_test2.ll fail in release
Thomas Preud'homme [Thu, 19 Aug 2021 11:04:42 +0000 (12:04 +0100)]
Fix CodeGen/X86/fsafdo_test2.ll fail in release

Require debug build for CodeGen/X86/fsafdo_test2.ll since it checks for
messages only printed in debug mode.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D108364

2 years agoFix empty paragraph passed to parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 15:48:09 +0000 (16:48 +0100)]
Fix empty paragraph passed to parameter Wdocumentation warning. NFC.

2 years agoRevert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand."
Craig Topper [Thu, 19 Aug 2021 15:42:05 +0000 (08:42 -0700)]
Revert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand."

This reverts commit add08c874147638e52d89eb07e40797dbc98d73b.

There was a compile time jump on tramp3d-v4 on https://llvm-compile-time-tracker.com/
Want to see if it goes away with this reverted.

2 years ago[CRT][LIT] build the target_cflags for Popen properly
Jinsong Ji [Thu, 19 Aug 2021 15:37:50 +0000 (15:37 +0000)]
[CRT][LIT] build the target_cflags for Popen properly

We recently enabled crt for powerpc in
https://reviews.llvm.org/rGb7611ad0b16769d3bf172e84fa9296158f8f1910.

And we started to see some unexpected error messages when running
check-runtimes.

eg:
https://lab.llvm.org/buildbot/#/builders/57/builds/9488/steps/6/logs/stdio
line 100 - 103:

"
clang-14: error: unknown argument: '-m64 -fno-function-sections'
clang-14: error: unknown argument: '-m64 -fno-function-sections'
clang-14: error: unknown argument: '-m64 -fno-function-sections'
clang-14: error: unknown argument: '-m64 -fno-function-sections'
"

Looks like we shouldn't strip the space at the beginning,
or else the command line passed to subprocess won't work well.

Reviewed By: phosek, MaskRay

Differential Revision: https://reviews.llvm.org/D108329

2 years ago[Clang][AST][NFC] Resolve FIXME: Make CXXRecordDecl *Record const.
Alfsonso Gregory [Thu, 19 Aug 2021 15:36:05 +0000 (16:36 +0100)]
[Clang][AST][NFC] Resolve FIXME: Make CXXRecordDecl *Record const.

Differential Revision: https://reviews.llvm.org/D107477

2 years ago[docs] Document how to install sphinx and recommonmark on Ubuntu
Yaron Keren [Thu, 19 Aug 2021 13:57:15 +0000 (16:57 +0300)]
[docs] Document how to install sphinx and recommonmark on Ubuntu

Differential Revision: https://reviews.llvm.org/D108374

2 years ago[ISel] Expand saddsat and ssubsat via asr and xor
David Green [Thu, 19 Aug 2021 15:08:07 +0000 (16:08 +0100)]
[ISel] Expand saddsat and ssubsat via asr and xor

This changes the lowering of saddsat and ssubsat from using:
  r,o = saddo x, y
  c = setcc r < 0
  s = c ? INTMAX : INTMIN
  ret o ? s : r
to using asr and xor to materialize the INTMAX/INTMIN constants:
  r,o = saddo x, y
  s = ashr r, BW-1
  x = xor s, INTMIN
  ret o ? x : r
https://alive2.llvm.org/ce/z/TYufgD

This seems to reduce the instruction count in most testcases across most
architectures. X86 has some custom lowering added to compensate for
cases where it can increase instruction count.
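
The two expansions can be compared in plain C++. This is only a sketch of
the arithmetic for 32-bit saddsat, not the SelectionDAG code: the function
names are made up for illustration, __builtin_add_overflow is a GCC/Clang
builtin standing in for saddo, and >> on a negative value is assumed to be
an arithmetic shift (true on GCC/Clang, guaranteed from C++20).

```cpp
#include <cstdint>

// Old expansion: select INTMAX or INTMIN based on the sign of the wrapped sum.
inline int32_t saddsat_select(int32_t x, int32_t y) {
  int32_t r;
  bool o = __builtin_add_overflow(x, y, &r); // r holds the wrapped sum
  int32_t s = (r < 0) ? INT32_MAX : INT32_MIN;
  return o ? s : r;
}

// New expansion: ashr smears the sign bit, xor with INTMIN turns that into
// the saturation constant.
inline int32_t saddsat_ashr_xor(int32_t x, int32_t y) {
  int32_t r;
  bool o = __builtin_add_overflow(x, y, &r);
  int32_t s = r >> 31;         // 0 if r >= 0, -1 if r < 0
  int32_t sat = s ^ INT32_MIN; // INTMIN if r >= 0, INTMAX if r < 0
  return o ? sat : r;
}
```

On overflow the wrapped sum has the opposite sign of the true result, which
is why a negative r maps to INTMAX and a non-negative r maps to INTMIN.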

Differential Revision: https://reviews.llvm.org/D105853

2 years ago[AIX] Remove XFAIL from macro-same-context
Jinsong Ji [Thu, 19 Aug 2021 14:52:41 +0000 (14:52 +0000)]
[AIX] Remove XFAIL from macro-same-context

We have enabled inline asm integrated assembler support,
so this test is passing now.

2 years agoFix unknown parameter Wdocumentation warnings. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:39:53 +0000 (15:39 +0100)]
Fix unknown parameter Wdocumentation warnings. NFC.

2 years agoFix unknown parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:30:06 +0000 (15:30 +0100)]
Fix unknown parameter Wdocumentation warning. NFC.

2 years agoFix unknown parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:23:53 +0000 (15:23 +0100)]
Fix unknown parameter Wdocumentation warning. NFC.

2 years agoFix unknown parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:22:05 +0000 (15:22 +0100)]
Fix unknown parameter Wdocumentation warning. NFC.

2 years ago[CostModel][X86] Add VPOPCNTDQ/BITALG ctpop costs
Simon Pilgrim [Thu, 19 Aug 2021 13:25:17 +0000 (14:25 +0100)]
[CostModel][X86] Add VPOPCNTDQ/BITALG ctpop costs

VPOPCNTDQ + BITALG add ctpop instructions for vXi64/vXi32 + vXi16/vXi8 vector types respectively

2 years ago[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand.
Craig Topper [Thu, 19 Aug 2021 14:18:30 +0000 (07:18 -0700)]
[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand.

Previously we pre-calculated this and cached it for every
instruction in the function. Most of the calculated results will
never be used. So instead calculate it only on the first use, and
then cache it.
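
The compute-on-first-use idea can be sketched as follows. This is a generic
memoization sketch, not the SelectionDAGBuilder code: LazyExtendCache,
ExtendKind and the int key are hypothetical stand-ins for the real
instruction and extend-type classes.

```cpp
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a preferred extend type.
enum class ExtendKind { Any, Sign, Zero };

class LazyExtendCache {
public:
  explicit LazyExtendCache(std::function<ExtendKind(int)> Compute)
      : Compute(std::move(Compute)) {}

  // Computes the value on the first query for Key and caches it afterwards.
  ExtendKind get(int Key) {
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;
    ++NumComputed;
    ExtendKind K = Compute(Key);
    Cache.emplace(Key, K);
    return K;
  }

  // Number of times the underlying computation actually ran.
  unsigned numComputed() const { return NumComputed; }

private:
  std::function<ExtendKind(int)> Compute;
  std::unordered_map<int, ExtendKind> Cache;
  unsigned NumComputed = 0;
};
```

Keys that are never queried never pay for the computation, which is the
saving described above.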

The cache was originally added to fix a compile time issue which
caused r216066 to be reverted.

This change exposed that we weren't pre-computing the Value for
Arguments. I've explicitly disabled that for now as it seemed to
regress some tests on AArch64 which has sext built into its compare
instructions.

Spotted while investigating how to improve heuristics to work better
with RISCV preferring sign extend for unsigned compares for i32 on RV64.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D107976

2 years ago[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC
Craig Topper [Wed, 18 Aug 2021 22:02:33 +0000 (15:02 -0700)]
[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC

This matches how they are called and allows some isa/cast/dyn_cast
to be removed.

Differential Revision: https://reviews.llvm.org/D108333

2 years ago[RISCV] Reduce duplicate code for calling SimplifyDemandedBits.
Craig Topper [Wed, 18 Aug 2021 19:21:04 +0000 (12:21 -0700)]
[RISCV] Reduce duplicate code for calling SimplifyDemandedBits.

This encapsulates the APInt creation and worklist management into
a helper function.

To keep one common interface I've used Log2_32 in places that
previously created a mask by subtracting 1 from a power of 2.

Differential Revision: https://reviews.llvm.org/D108324

2 years ago[ARM] Add MVE min/max intrinsic tests. NFC
David Green [Thu, 19 Aug 2021 13:33:34 +0000 (14:33 +0100)]
[ARM] Add MVE min/max intrinsic tests. NFC

2 years ago[DWARF][Verifier][NFC] Use reference to DWARFAddressRangesVector to avoid copying.
Alexey Lapshin [Thu, 19 Aug 2021 11:19:07 +0000 (14:19 +0300)]
[DWARF][Verifier][NFC] Use reference to DWARFAddressRangesVector to avoid copying.

Avoid copying when accessing RangesOrError.get().

2 years ago[CostModel][X86] Add VPOPCNT/BITALG test coverage for ctpop/cttz costs
Simon Pilgrim [Thu, 19 Aug 2021 12:49:50 +0000 (13:49 +0100)]
[CostModel][X86] Add VPOPCNT/BITALG test coverage for ctpop/cttz costs

2 years ago[RISCV][test] Improve tests for (add (mul x, c1), c2)
Ben Shi [Thu, 19 Aug 2021 13:03:46 +0000 (21:03 +0800)]
[RISCV][test] Improve tests for (add (mul x, c1), c2)

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107710

2 years ago[mlir][SparseTensor] Split scf.for loop into masked/unmasked parts
Matthias Springer [Thu, 19 Aug 2021 12:46:12 +0000 (21:46 +0900)]
[mlir][SparseTensor] Split scf.for loop into masked/unmasked parts

Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores.
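
As a rough illustration of loop peeling in general (not of the MLIR pattern
itself), the scalar C++ sketch below splits a strided loop into full
iterations that need no clamping and one partial iteration that does. The
function names are hypothetical, the std::min clamp plays the role of the
mask, and Step is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Unpeeled: every iteration clamps the chunk length (the "masked" form).
inline long sumMasked(const std::vector<int> &V, std::size_t Step) {
  long Sum = 0;
  for (std::size_t I = 0; I < V.size(); I += Step) {
    std::size_t Len = std::min(Step, V.size() - I);
    for (std::size_t J = 0; J < Len; ++J)
      Sum += V[I + J];
  }
  return Sum;
}

// Peeled: full iterations use Step directly, only the remainder is clamped.
inline long sumPeeled(const std::vector<int> &V, std::size_t Step) {
  long Sum = 0;
  std::size_t Split = V.size() - V.size() % Step; // end of the full iterations
  std::size_t I = 0;
  for (; I < Split; I += Step)
    for (std::size_t J = 0; J < Step; ++J) // no clamp needed here
      Sum += V[I + J];
  if (I < V.size()) { // at most one partial iteration remains
    std::size_t Len = std::min(Step, V.size() - I);
    for (std::size_t J = 0; J < Len; ++J)
      Sum += V[I + J];
  }
  return Sum;
}
```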

Differential Revision: https://reviews.llvm.org/D107733

2 years agoRevert "[CVP] processSwitch: Remove default case when switch cover all possible values."
Sanjay Patel [Thu, 19 Aug 2021 12:43:51 +0000 (08:43 -0400)]
Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."

This reverts commit 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e.
This patch may cause miscompiles because it missed a constraint
as shown in the examples from:
https://llvm.org/PR51531

2 years ago[InstCombine] add min/max intrinsics as freely invertible candidates
Sanjay Patel [Wed, 18 Aug 2021 22:45:51 +0000 (18:45 -0400)]
[InstCombine] add min/max intrinsics as freely invertible candidates

In the optimized test, we are able to peek through the
min/max that has 2 min/max operands and invert them all:
https://alive2.llvm.org/ce/z/7gYMN5

2 years ago[InstCombine] add tests for min/max with inverts; NFC
Sanjay Patel [Wed, 18 Aug 2021 22:27:15 +0000 (18:27 -0400)]
[InstCombine] add tests for min/max with inverts; NFC

2 years ago[InstCombine] add one-use check for min/max fold with not operands; NFC
Sanjay Patel [Wed, 18 Aug 2021 21:02:55 +0000 (17:02 -0400)]
[InstCombine] add one-use check for min/max fold with not operands; NFC

This makes the intrinsic logic match the cmp+select idiom folds
just below. It's not clearly a win either way unless we think
that a 'not' op costs more than min/max.

The cmp+select folds on these patterns are more extensive than
the intrinsics currently and may have some complicated interactions,
so I'm trying to make those line up and bring the optimizations
for intrinsics up to parity.

2 years ago[openmp][nfc] Replace OMPGridValues array with struct
Jon Chesterfield [Thu, 19 Aug 2021 12:25:41 +0000 (13:25 +0100)]
[openmp][nfc] Replace OMPGridValues array with struct

[nfc] Replaces enum indices into an array with a struct. Named the
fields to match the enum, leaves memory layout and initialization unchanged.

Motivation is to later safely remove dead fields and replace redundant ones
with (compile time) computation. It should also be possible to factor some
common fields into a base and introduce a gfx10 amdgpu instance with less
duplication than the arrays of integers require.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D108339

2 years ago[LoopFlatten] Fix assertion failure
Rosie Sumpter [Tue, 17 Aug 2021 10:43:34 +0000 (11:43 +0100)]
[LoopFlatten] Fix assertion failure

There is an assertion failure in computeOverflowForUnsignedMul
(used in checkOverflow) due to the inner and outer trip counts
having different types. This occurs when the IV has been widened,
but the loop components are not successfully rediscovered.
This is fixed by some refactoring of the code in findLoopComponents
which identifies the trip count of the loop.

Differential Revision: https://reviews.llvm.org/D108107

2 years ago[LegalizeTypes][VP] Add widening support for binary VP ops
Fraser Cormack [Wed, 11 Aug 2021 12:09:14 +0000 (13:09 +0100)]
[LegalizeTypes][VP] Add widening support for binary VP ops

This patch adds the beginnings of more thorough support in the
legalizers for vector-predicated (VP) operations.

The first step is the ability to widen illegal vectors. The more
complicated scenario in which the result/operands need widening but the
mask doesn't has not been handled here. That would require a lot of code
without an in-tree target on which to test it.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D107904

2 years ago[CodeCompletion] Provide placeholders for known attribute arguments
Sam McCall [Fri, 13 Aug 2021 08:17:59 +0000 (10:17 +0200)]
[CodeCompletion] Provide placeholders for known attribute arguments

Completion now looks more like function/member completion:

  used
  alias(Aliasee)
  abi_tag(Tags...)

Differential Revision: https://reviews.llvm.org/D108109

2 years ago[AArch64][SVE] Teach cost model that masked loads/stores are cheap
Matthew Devereau [Thu, 19 Aug 2021 10:42:20 +0000 (11:42 +0100)]
[AArch64][SVE] Teach cost model that masked loads/stores are cheap

Reduce the cost of VLS masked loads/stores to make the vectorizer emit them more frequently.

2 years ago[RISCV][test] Add new tests for add optimization in the zba extension
Ben Shi [Tue, 17 Aug 2021 06:21:24 +0000 (14:21 +0800)]
[RISCV][test] Add new tests for add optimization in the zba extension

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D108188

2 years ago[X86] Regenerate store_op_load_fold.ll test checks
Simon Pilgrim [Thu, 19 Aug 2021 11:30:46 +0000 (12:30 +0100)]
[X86] Regenerate store_op_load_fold.ll test checks

2 years ago[CodeComplete] Only complete attributes that match the current LangOpts
Sam McCall [Mon, 16 Aug 2021 09:40:24 +0000 (11:40 +0200)]
[CodeComplete] Only complete attributes that match the current LangOpts

Differential Revision: https://reviews.llvm.org/D108111

2 years ago[tsan] Fix pthread_once() on Mac OS X
Marco Elver [Thu, 19 Aug 2021 11:17:45 +0000 (13:17 +0200)]
[tsan] Fix pthread_once() on Mac OS X

Change 636428c727cd enabled BlockingRegion hooks for pthread_once().
Unfortunately, on Mac OS X pthread_once() is used from locations where
this seems to result in crashes:

| ThreadSanitizer:DEADLYSIGNAL
| ==31465==ERROR: ThreadSanitizer: stack-overflow on address 0x7ffee73fffd8 (pc 0x00010807fd2a bp 0x7ffee7400050 sp 0x7ffee73fffb0 T93815)
|     #0 __tsan::MetaMap::GetSync(__tsan::ThreadState*, unsigned long, unsigned long, bool, bool) tsan_sync.cpp:195 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x78d2a)
|     #1 __tsan::MutexPreLock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned int) tsan_rtl_mutex.cpp:143 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x6cefc)
|     #2 wrap_pthread_mutex_lock sanitizer_common_interceptors.inc:4240 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x3dae0)
|     #3 flockfile <null>:2 (libsystem_c.dylib:x86_64+0x38a69)
|     #4 puts <null>:2 (libsystem_c.dylib:x86_64+0x3f69b)
|     #5 wrap_puts sanitizer_common_interceptors.inc (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x34d83)
|     #6 __tsan::OnPotentiallyBlockingRegionBegin() cxa_guard_acquire.cpp:8 (foo:x86_64+0x100000e48)
|     #7 wrap_pthread_once tsan_interceptors_posix.cpp:1512 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x2f6e6)

From the stack trace it can be seen that the caller is unknown, and the
resulting stack-overflow seems to indicate that whoever the caller is
does not have enough stack space or otherwise is running in a limited
environment not yet ready for full instrumentation.

Fix it by reverting behaviour on Mac OS X to not call BlockingRegion
hooks from pthread_once().

Reported-by: azharudd
Reviewed By: glider

Differential Revision: https://reviews.llvm.org/D108305

2 years agoAvoid unused variable when NDEBUG
Frederik Gossen [Thu, 19 Aug 2021 10:51:14 +0000 (12:51 +0200)]
Avoid unused variable when NDEBUG

2 years ago[OpenCL] Fix as_type(vec3) invalid store creation
Sven van Haastregt [Thu, 19 Aug 2021 10:57:09 +0000 (11:57 +0100)]
[OpenCL] Fix as_type(vec3) invalid store creation

With -fpreserve-vec3-type enabled, a cast was not created when
converting from a vec3 type to a non-vec3 type, even though a
conversion to vec4 was performed.  This resulted in creation of
invalid store instructions.

Differential Revision: https://reviews.llvm.org/D107963

2 years ago[NewPM] Make some sanitizer passes parameterized in the PassRegistry
Bjorn Pettersson [Mon, 28 Jun 2021 09:16:40 +0000 (11:16 +0200)]
[NewPM] Make some sanitizer passes parameterized in the PassRegistry

Refactored the implementation of AddressSanitizerPass and
HWAddressSanitizerPass to use pass options, similar to passes like
MemorySanitizerPass. This makes sure that there is a single mapping
from class name to pass name (needed by D108298), and options like
-debug-only and -print-after make a bit more sense (although it is
the unparameterized pass name that should be used in those options).

A result of the above is that some pass names are removed in favor
of the parameterized versions:
- "khwasan" is now "hwasan<kernel;recover>"
- "kasan" is now "asan<kernel>"
- "kmsan" is now "msan<kernel>"

Differential Revision: https://reviews.llvm.org/D105007

2 years ago[docs] Document that psutil should be installed in non-user location
Yaron Keren [Thu, 19 Aug 2021 09:27:49 +0000 (12:27 +0300)]
[docs] Document that psutil should be installed in non-user location

Differential Revision: https://reviews.llvm.org/D108356

2 years agoUpdate {Small}BitVector size_type definition
Renato Golin [Wed, 18 Aug 2021 10:50:15 +0000 (11:50 +0100)]
Update {Small}BitVector size_type definition

SmallBitVector implements a level of indirection over BitVector by
storing a smaller bit-vector in a pointer-sized element, or in case the
number of elements exceeds the bucket size, it creates a new pointer to
a BitVector and uses that as its storage.

However, the functions returning the vector size were using `unsigned`,
which is ok for BitVector, but not for SmallBitVector, which is actually
`uintptr_t`.

This commit reuses the `size_type` definition to more than just `count`
and propagates them into range iteration, size calculation, etc.

This is a continuation of D108124.

I haven't changed all occurrences of `unsigned` or `uintptr_t` to
`size_type`, just those that were directly related.

Following directions from clang-tidy on case of variables.

Differential Revision: https://reviews.llvm.org/D108290

2 years ago[OptTable] Refine how `printHelp` treats empty help texts
Andrzej Warzynski [Thu, 5 Aug 2021 11:42:30 +0000 (11:42 +0000)]
[OptTable] Refine how `printHelp` treats empty help texts

Currently, `printHelp` behaves differently for options that:
  * do not define `HelpText` (such options _are not printed_), and
  * define their `HelpText` as `HelpText<"">` (such options _are printed_).
In practice, both approaches lead to no help text and `printHelp` should
treat them consistently. This patch addresses that by making
`printHelp` check the length of the help text to be printed.

All affected tests have been updated accordingly. The option definitions
for llvm-cvtres have been updated with a short description or
"Not implemented" for options that are ignored by the tool.
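
A minimal sketch of the consistent treatment, assuming it means skipping
options whose help text is either missing or empty. OptionInfo,
hasHelpText and helpLines are hypothetical names for illustration and are
not part of the OptTable API.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical option record; HelpText is nullptr when none is defined.
struct OptionInfo {
  const char *Name;
  const char *HelpText;
};

// A missing help text and an empty help text are treated the same way.
inline bool hasHelpText(const OptionInfo &O) {
  return O.HelpText != nullptr && std::strlen(O.HelpText) != 0;
}

// Builds one help line per option that has a non-empty help text.
inline std::vector<std::string> helpLines(const std::vector<OptionInfo> &Opts) {
  std::vector<std::string> Lines;
  for (const OptionInfo &O : Opts)
    if (hasHelpText(O))
      Lines.push_back(std::string(O.Name) + "  " + O.HelpText);
  return Lines;
}
```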

Differential Revision: https://reviews.llvm.org/D107557

2 years ago[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64
Martin Storsjö [Fri, 23 Jul 2021 21:04:10 +0000 (00:04 +0300)]
[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64

The code is based on the same __mulh and __umulh intrinsics for
x86.
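
For reference, the semantics of the two intrinsics (the high 64 bits of a
128-bit product) can be sketched in portable-ish C++. The names mulhSketch
and umulhSketch are made up to avoid clashing with the real builtins, and
__int128 is a GCC/Clang extension, so this is an illustration of the
arithmetic rather than the code added by the patch.

```cpp
#include <cstdint>

// High 64 bits of the signed 128-bit product A * B.
inline int64_t mulhSketch(int64_t A, int64_t B) {
  return static_cast<int64_t>((static_cast<__int128>(A) * B) >> 64);
}

// High 64 bits of the unsigned 128-bit product A * B.
inline uint64_t umulhSketch(uint64_t A, uint64_t B) {
  return static_cast<uint64_t>((static_cast<unsigned __int128>(A) * B) >> 64);
}
```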

This should fix PR51128.

Differential Revision: https://reviews.llvm.org/D106721

2 years ago[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64
David Sherwood [Fri, 2 Jul 2021 10:12:16 +0000 (11:12 +0100)]
[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64

I have added a new TTI interface called enableOrderedReductions() that
controls whether or not ordered reductions should be enabled for a
given target. By default this returns false, whereas for AArch64 it
returns true and we rely upon the cost model to make sensible
vectorisation choices. It is still possible to override the new TTI
interface by setting the command line flag:

  -force-ordered-reductions=true|false

I have added a new RUN line to show that we use ordered reductions by
default for SVE and Neon:

  Transforms/LoopVectorize/AArch64/strict-fadd.ll
  Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll

Differential Revision: https://reviews.llvm.org/D106653

2 years ago[flang][driver] Add print function name Plugin example
Stuart Ellis [Thu, 19 Aug 2021 08:07:45 +0000 (08:07 +0000)]
[flang][driver] Add print function name Plugin example

Replacing Hello World example Plugin with one that counts and prints the names of
functions and subroutines.
This involves changing the `PluginParseTreeAction` Plugin base class to
inherit from `PrescanAndSemaAction` class to get access to the Parse Tree
so that the Plugin can walk it.
Additionally, there are tests of this new Plugin to check it prints the correct
things in different circumstances.

Depends on: D106137

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D107089

2 years ago[mlir][scf] Simplify affine.min ops after loop peeling
Matthias Springer [Thu, 19 Aug 2021 08:08:21 +0000 (17:08 +0900)]
[mlir][scf] Simplify affine.min ops after loop peeling

Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.

affine.min ops such as:
```
#map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #map(%iv)[%step, %ub]
```
are rewritten into (in the case of the peeled loop):
```
%r = %step
```

To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.
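
The reason the rewrite to %step is sound inside the peeled loop can be
checked numerically. The sketch below is a brute-force check, not the
symbolic reasoning that FlatAffineConstraints performs. It assumes the
peeled main loop runs over [Lb, Split) with Split = Ub - (Ub - Lb) % Step,
that Lb <= Ub and Step > 0, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Returns true if min(Step, Ub - Iv) == Step for every induction value Iv
// visited by the peeled main loop.
inline bool minIsStepInMainLoop(int64_t Lb, int64_t Ub, int64_t Step) {
  int64_t Split = Ub - (Ub - Lb) % Step; // end of the full iterations
  for (int64_t Iv = Lb; Iv < Split; Iv += Step)
    if (std::min(Step, Ub - Iv) != Step)
      return false;
  return true;
}
```

Every Iv in the main loop satisfies Iv + Step <= Split <= Ub, so Ub - Iv is
never smaller than Step and the min always picks Step.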

Differential Revision: https://reviews.llvm.org/D107222

2 years ago[flang] Add POSIX implementation for SYSTEM_CLOCK
Diana Picus [Tue, 13 Jul 2021 11:37:43 +0000 (11:37 +0000)]
[flang] Add POSIX implementation for SYSTEM_CLOCK

This is very similar to CPU_TIME, except that we return nanoseconds
rather than seconds. This means we're potentially dealing with rather
large numbers, so we'll have to wrap around to avoid overflows.
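
One way to wrap a nanosecond count so that it never overflows a signed
64-bit result is sketched below. It assumes a POSIX clock_gettime with
CLOCK_REALTIME. The function names, the choice of clock, the wrap modulus
(2^63) and the -1 error value are all assumptions of this sketch and may
differ from what the runtime does.

```cpp
#include <cstdint>
#include <limits>
#include <time.h> // POSIX clock_gettime

// Combines seconds and nanoseconds, wrapping into [0, INT64_MAX].
inline int64_t wrapNanoseconds(uint64_t Sec, uint64_t Nsec) {
  constexpr uint64_t Max =
      static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
  uint64_t Count = Sec * 1000000000ULL + Nsec; // unsigned: wraps mod 2^64
  return static_cast<int64_t>(Count & Max);    // keep the low 63 bits
}

// Reads the clock and returns the wrapped nanosecond count.
inline int64_t systemClockCountSketch() {
  timespec Ts;
  if (clock_gettime(CLOCK_REALTIME, &Ts) != 0)
    return -1; // sketch-only error value
  return wrapNanoseconds(static_cast<uint64_t>(Ts.tv_sec),
                         static_cast<uint64_t>(Ts.tv_nsec));
}
```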

Differential Revision: https://reviews.llvm.org/D105970

2 years agoSimplify setting up LLVM as bazel external repo
Christian Sigg [Wed, 18 Aug 2021 07:14:42 +0000 (09:14 +0200)]
Simplify setting up LLVM as bazel external repo

Only require one intermediate repository instead of two.
Fewer parameters in llvm_config.

Second attempt of https://reviews.llvm.org/D107714, this time also updating `third_party_build` and `deps_impl` paths.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D108274

2 years ago[MLIR] [Python] Add `owner` to `mlir.ir.Block`
John Demme [Thu, 19 Aug 2021 07:02:09 +0000 (00:02 -0700)]
[MLIR] [Python] Add `owner` to `mlir.ir.Block`

Provides a way for python users to access the owning Operation from a Block.

2 years ago[mlir][linalg] Set result types in all builders.
Tobias Gysi [Thu, 19 Aug 2021 06:17:41 +0000 (06:17 +0000)]
[mlir][linalg] Set result types in all builders.

Add code to set the result types in all yaml op builders.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D108273

2 years ago[CSSPGO] Track and use context-sensitive post-optimization function size to drive...
Wenlei He [Tue, 17 Aug 2021 01:29:07 +0000 (18:29 -0700)]
[CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen

This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions.

To do this, BinarySizeContextTracker is introduced to track function byte size under different inline contexts during disassembling. In the preinliner, we can now query context byte size under the switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep the size of functions under different contexts (callee as parent, caller as child), and it can give the best (longest) possible matching context size for a given input context.
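
The reverse trie with longest-match lookup can be sketched as below. This
is a simplified illustration, not the BinarySizeContextTracker code: the
class name, the use of function names as keys and the leaf-first context
vectors are assumptions made for the example (C++17).

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <vector>

class ContextSizeTrieSketch {
  struct Node {
    std::optional<uint32_t> Size;                         // size for this context
    std::map<std::string, std::unique_ptr<Node>> Callers; // callee -> callers
  };
  Node Root;

public:
  // Context is ordered leaf first: {callee, caller, caller's caller, ...}.
  void addSize(const std::vector<std::string> &Context, uint32_t Size) {
    Node *N = &Root;
    for (const std::string &Frame : Context) {
      std::unique_ptr<Node> &Child = N->Callers[Frame];
      if (!Child)
        Child = std::make_unique<Node>();
      N = Child.get();
    }
    N->Size = Size;
  }

  // Returns the size recorded for the longest matching leaf-first prefix.
  std::optional<uint32_t>
  lookup(const std::vector<std::string> &Context) const {
    const Node *N = &Root;
    std::optional<uint32_t> Best;
    for (const std::string &Frame : Context) {
      auto It = N->Callers.find(Frame);
      if (It == N->Callers.end())
        break;
      N = It->second.get();
      if (N->Size)
        Best = N->Size;
    }
    return Best;
  }
};
```

Keying the first level on the callee means a query with a deeper or
unknown caller chain still falls back to the most specific size recorded
for that callee.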

The new size cost is off by default. There are a few TODOs that need to be addressed: 1) avoid dangling strings from `Offset2LocStackMap`, which will be addressed in the split context work; 2) use the inlinee's entry probe to make sure we have the correct zero size for an inlinee that is completely optimized away after inlining. Some tuning is also needed.

Differential Revision: https://reviews.llvm.org/D108180

2 years agoRevert "[HIP] Allow target addr space in target builtins"
Anshil Gandhi [Thu, 19 Aug 2021 03:37:53 +0000 (21:37 -0600)]
Revert "[HIP] Allow target addr space in target builtins"

This reverts commit a35008955fa606487f79a050f5cc80fc7ee84dda.

2 years ago[examples] Fix Kaleidoscope for Windows
Lang Hames [Thu, 19 Aug 2021 03:17:35 +0000 (13:17 +1000)]
[examples] Fix Kaleidoscope for Windows

This fixes "Resolving symbol with incorrect flags" errors when running the
Kaleidoscope tutorials on Windows.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108348

2 years ago[WebAssembly] Avoid unused function imports in PIC mode
Sam Clegg [Thu, 19 Aug 2021 02:20:55 +0000 (22:20 -0400)]
[WebAssembly] Avoid unused function imports in PIC mode

In PIC mode we import function address via `GOT.mem` imports but for
direct function calls we still import the first class function.
However, if the function is never directly called we can avoid the first
class import completely.

Differential Revision: https://reviews.llvm.org/D108345

2 years ago[JITLink] Optimize GOTPCRELX Relocations
luxufan [Thu, 19 Aug 2021 02:13:40 +0000 (10:13 +0800)]
[JITLink] Optimize GOTPCRELX Relocations

This patch optimizes the GOTPCRELX relocations, as described in chapter B.2 of the X86-64 psABI. Not all optimizations from that chapter are implemented:

1. Converting call and jmp has been implemented.
2. Converting mov has been implemented, except for the optimization that converts the memory operand of `mov` into an immediate operand when the symbol is defined in the lower 32-bit address space.
3. Converting test and binop has not been implemented.

A new test file named ELF_got_plt_optimizations.s has been added, and I moved some test cases about optimization of got/plt from ELF_x86_64_small_pic_relocations.s to the new test file.

Following lld, the `Convert call and jmp` optimization is not the same as what the psABI describes; this is explained in a code comment.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108280

2 years ago[mlir][linalg] Canonicalize dim ops of tiled_loop block args
Matthias Springer [Thu, 19 Aug 2021 02:23:36 +0000 (11:23 +0900)]
[mlir][linalg] Canonicalize dim ops of tiled_loop block args

E.g.:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %x, %c0 : tensor<...>
}
```

is rewritten to:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %y, %c0 : tensor<...>
}
```

Differential Revision: https://reviews.llvm.org/D108272

2 years ago[ORC] Handle void and no-argument async wrapper calls.
Lang Hames [Thu, 19 Aug 2021 02:19:36 +0000 (12:19 +1000)]
[ORC] Handle void and no-argument async wrapper calls.

2 years ago[WebAssembly][lld] Convert signature-mismatch.ll test to asm. NFC
Sam Clegg [Thu, 19 Aug 2021 00:30:58 +0000 (20:30 -0400)]
[WebAssembly][lld] Convert signature-mismatch.ll test to asm. NFC

Differential Revision: https://reviews.llvm.org/D108346

2 years ago[sanitizer] Use TMPDIR in Android test
Vitaly Buka [Thu, 19 Aug 2021 02:02:02 +0000 (19:02 -0700)]
[sanitizer] Use TMPDIR in Android test

TMPDIR was added a long time ago, so there is no need to use EXTERNAL_STORAGE.

2 years ago[mlir][linalg] Remove ConstraintsSet class
Matthias Springer [Thu, 19 Aug 2021 01:47:17 +0000 (10:47 +0900)]
[mlir][linalg] Remove ConstraintsSet class

The same functionality can be implemented with FlatAffineValueConstraints.

Differential Revision: https://reviews.llvm.org/D108179

2 years ago[gn build] Port 5fdaaf7fd8f3
LLVM GN Syncbot [Thu, 19 Aug 2021 01:52:47 +0000 (01:52 +0000)]
[gn build] Port 5fdaaf7fd8f3

2 years agoStackLifetime: Remove asserts for multiple lifetime intrinsics.
Peter Collingbourne [Wed, 18 Aug 2021 22:03:03 +0000 (15:03 -0700)]
StackLifetime: Remove asserts for multiple lifetime intrinsics.

According to the langref, it is valid to have multiple consecutive
lifetime start or end intrinsics on the same object.

For llvm.lifetime.start:
"If ptr [...] is a stack object that is already alive, it simply
fills all bytes of the object with poison."

For llvm.lifetime.end:
"Calling llvm.lifetime.end on an already dead alloca is no-op."

However, we currently fail an assertion in such cases. I've observed
the assertion failure when the loop vectorization pass duplicates
the intrinsic.

We can conservatively handle these intrinsics by ignoring all but
the first one, which can be implemented by removing the assertions.
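
The conservative handling can be pictured with a small sketch that keeps
only the first marker of each consecutive run for a single object. This is
an illustration of the idea, not the StackLifetime analysis: Marker and
dropRedundantMarkers are hypothetical names, and the sketch looks at one
object along one straight-line sequence only.

```cpp
#include <vector>

enum class Marker { Start, End };

// Keeps the first marker of every run of identical consecutive markers.
inline std::vector<Marker>
dropRedundantMarkers(const std::vector<Marker> &Markers) {
  std::vector<Marker> Out;
  for (Marker M : Markers)
    if (Out.empty() || Out.back() != M)
      Out.push_back(M);
  return Out;
}
```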

Differential Revision: https://reviews.llvm.org/D108337

2 years ago[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
Rong Xu [Wed, 18 Aug 2021 23:59:02 +0000 (16:59 -0700)]
[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader

This patch implements Flow Sensitive Sample FDO (FSAFDO) profile
loader. We have two profile loaders for FS profile,
one before RegAlloc and one before BlockPlacement.

To enable it, when -fprofile-sample-use=<profile> is specified,
add "-enable-fs-discriminator=true \
     -disable-ra-fsprofile-loader=false \
     -disable-layout-fsprofile-loader=false"
to turn on the FS profile loaders.

Differential Revision: https://reviews.llvm.org/D107878

2 years ago[mlir][Analysis][NFC] FlatAffineConstraints: Use BoundType enum in functions
Matthias Springer [Thu, 19 Aug 2021 00:53:39 +0000 (09:53 +0900)]
[mlir][Analysis][NFC] FlatAffineConstraints: Use BoundType enum in functions

Differential Revision: https://reviews.llvm.org/D108185

2 years ago[scudo] Don't build SCUDO for Android
Vitaly Buka [Thu, 19 Aug 2021 01:22:28 +0000 (18:22 -0700)]
[scudo] Don't build SCUDO for Android

Android 11 uses scudo_standalone as the default
allocator, making it difficult to test legacy scudo.

2 years ago[openmp] Annotate tmp variables with omp_thread_mem_alloc
Jon Chesterfield [Thu, 19 Aug 2021 01:22:10 +0000 (02:22 +0100)]
[openmp] Annotate tmp variables with omp_thread_mem_alloc

Fixes miscompile of calls into ocml. Bug 51445.

The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.

This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107971

2 years ago[NFC][DebugInfo] getDwarfCompileUnitID
Kyungwoo Lee [Thu, 19 Aug 2021 00:14:46 +0000 (17:14 -0700)]
[NFC][DebugInfo] getDwarfCompileUnitID

This is a refactoring for the use in https://reviews.llvm.org/D108261

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D108271

2 years ago[libomptarget] Apply D106710 to amdgcn devicertl
Jon Chesterfield [Thu, 19 Aug 2021 00:34:33 +0000 (01:34 +0100)]
[libomptarget] Apply D106710 to amdgcn devicertl

2 years ago[mlir][sparse] use shared util for DimOp generation
Aart Bik [Wed, 18 Aug 2021 17:39:14 +0000 (10:39 -0700)]
[mlir][sparse] use shared util for DimOp generation

This shares more code with existing utilities. Also, to be consistent,
we moved dimension permutation on the DimOp to the tensor lowering phase.
This way, both pre-existing DimOps on sparse tensors (not likely but
possible) as well as compiler generated DimOps are handled consistently.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108309

2 years ago[libomptarget][nfc][devicertl] Delete unused enums
Jon Chesterfield [Wed, 18 Aug 2021 23:12:33 +0000 (00:12 +0100)]
[libomptarget][nfc][devicertl] Delete unused enums

2 years ago[NFC][libcxxabi] Run clang-format on libcxxabi/src/cxa_guard_impl.h
Daniel McIntosh [Wed, 4 Aug 2021 17:37:57 +0000 (13:37 -0400)]
[NFC][libcxxabi] Run clang-format on libcxxabi/src/cxa_guard_impl.h

I'm about to submit a change which involves re-writing most of
cxa_guard_impl.h. Running clang-format on the whole file first seems like a
good idea.

Reviewed By: ldionne, #libc_abi

Differential Revision: https://reviews.llvm.org/D108231

2 years ago[mlir] Fix typo in SuperVectorizer
Diego Caballero [Wed, 18 Aug 2021 22:21:23 +0000 (22:21 +0000)]
[mlir] Fix typo in SuperVectorizer

NFC.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D108334

2 years ago[LLDB][GUI] Add Process Launch form
Omar Emara [Wed, 18 Aug 2021 18:59:57 +0000 (11:59 -0700)]
[LLDB][GUI] Add Process Launch form

This patch adds a process launch form. Additionally, a LazyBoolean field
was implemented and numerous utility methods were added to various
fields to get the launch form working.

Differential Revision: https://reviews.llvm.org/D107869

2 years ago[clang-format] Improve detection of parameter declarations in K&R C
owenca [Sun, 15 Aug 2021 21:03:17 +0000 (14:03 -0700)]
[clang-format] Improve detection of parameter declarations in K&R C

Clean up the detection of parameter declarations in K&R C function
definitions. Also make it more precise by requiring the second
token after the r_paren to be either a star or keyword/identifier.

Differential Revision: https://reviews.llvm.org/D108094

2 years ago[LLDB][GUI] Fix text field incorrect key handling
Omar Emara [Wed, 18 Aug 2021 22:06:05 +0000 (15:06 -0700)]
[LLDB][GUI] Fix text field incorrect key handling

The isprint libc function was used to determine if the key code
represents a printable character. The problem is that the specification
leaves the behavior undefined if the key is not representable as an
unsigned char, which is the case for many ncurses keys. This patch adds
an explicit check for this undefined behavior and makes it consistent.

The llvm::isPrint function didn't work correctly for some reason, most
likely because it takes a char instead of an int, which I guess makes it
unsuitable for checking ncurses key codes.
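
A minimal sketch of such a guard, assuming key codes arrive as plain `int` values. The helper name `isPrintableKey` is hypothetical and is not taken from the patch:

```cpp
#include <cctype>
#include <climits>

// Hypothetical helper: returns true only for key codes that isprint can
// legally classify and that it reports as printable.
bool isPrintableKey(int Key) {
  // isprint is only defined for EOF and values representable as
  // unsigned char, so reject everything outside [0, UCHAR_MAX] first.
  if (Key < 0 || Key > UCHAR_MAX)
    return false;
  return std::isprint(Key) != 0;
}
```

Key codes above `UCHAR_MAX`, which is where function and arrow keys typically land, are rejected before `isprint` ever sees them.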

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D108327

2 years ago[gn build] Port d8bbfe8a4897
LLVM GN Syncbot [Wed, 18 Aug 2021 21:58:30 +0000 (21:58 +0000)]
[gn build] Port d8bbfe8a4897

2 years ago[DWARF] Expose raw bytes in DWARFExpression
Rafael Auler [Thu, 5 Aug 2021 00:15:29 +0000 (17:15 -0700)]
[DWARF] Expose raw bytes in DWARFExpression

This information is necessary for clients of DebugInfo that
do not want to process a DWARF expression, but just treat it as a blob
of data. In BOLT, for example, we need to read these expressions in
CFIs and write them back to the binary, unchanged, so having access to
the original expression encoding is a shortcut to avoid the need to
re-encode the entire expression when re-writing exception handling
info (CFIs).

This patch is an alternative to https://reviews.llvm.org/D98301, in
which we implement the support to re-encode these expressions. But
since we don't really need to change anything in these expressions,
we can just copy their bytes.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D107515

2 years agoEnables inferring return types for Shape op if possible
Chia-hung Duan [Wed, 18 Aug 2021 20:46:26 +0000 (20:46 +0000)]
Enables inferring return types for Shape op if possible

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D102565

2 years ago[AArch64][GlobalISel] Don't allow s128 for G_ISNAN
Jessica Paquette [Wed, 18 Aug 2021 20:57:42 +0000 (13:57 -0700)]
[AArch64][GlobalISel] Don't allow s128 for G_ISNAN

getAPFloatFromSize doesn't support s128, so we can't lower this without
asserting right now.

To fix the buildbots, don't allow any scalars other than s16, s32, and s64.

2 years agogn build: Build libclang.so and libLTO.so on ELF platforms.
Peter Collingbourne [Mon, 23 Nov 2020 19:45:06 +0000 (11:45 -0800)]
gn build: Build libclang.so and libLTO.so on ELF platforms.

This requires changing the ELF build to enable -fPIC, consistent
with other platforms.

Differential Revision: https://reviews.llvm.org/D108223

2 years ago[AArch64][GlobalISel] Mark G_FMINNUM/G_FMAXNUM as floating point opcodes
Jessica Paquette [Wed, 18 Aug 2021 00:40:23 +0000 (17:40 -0700)]
[AArch64][GlobalISel] Mark G_FMINNUM/G_FMAXNUM as floating point opcodes

We need to ensure that these end up on FPR to allow imported patterns to
select them.

This will also ensure that we get good regbank selection when dealing with
instructions like G_PHI/G_LOAD/G_STORE which deduce their banks from their
uses/users.

Differential Revision: https://reviews.llvm.org/D108260

2 years ago[AArch64][GlobalISel] Legalize scalar G_FMINNUM + G_FMAXNUM
Jessica Paquette [Wed, 18 Aug 2021 00:26:48 +0000 (17:26 -0700)]
[AArch64][GlobalISel] Legalize scalar G_FMINNUM + G_FMAXNUM

For subtargets with full FP16, this is legal for s16, s32, and s64. Without
full FP16, it's legal for s32 and s64.

For s128, this is a libcall.

We also support some vector types, but for now, let's just support scalars.

Differential Revision: https://reviews.llvm.org/D108259

2 years ago[libomptarget][devicertl] Replace lanemask with uint64 at interface
Jon Chesterfield [Wed, 18 Aug 2021 19:47:33 +0000 (20:47 +0100)]
[libomptarget][devicertl] Replace lanemask with uint64 at interface

Use uint64_t for lanemask on all GPU architectures at the interface
with clang. Updates tests. The deviceRTL is always linked as IR so the zext
and trunc introduced for wave32 architectures will fold after inlining.

Simplification partly motivated by amdgpu gfx10 which will be wave32 and
is awkward to express in the current arch-dependent typedef interface.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108317

2 years ago[AggressiveInstCombine] Add logical shift right instr to `TruncInstCombine` DAG
Anton Afanasyev [Tue, 17 Aug 2021 10:49:53 +0000 (13:49 +0300)]
[AggressiveInstCombine] Add logical shift right instr to `TruncInstCombine` DAG

Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia
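
The two conditions can also be checked by brute force for one concrete pair of widths. The sketch below assumes a 16-bit source narrowed to 8 bits. The function names are hypothetical, and this only illustrates the arithmetic; it is not the pass's implementation.

```cpp
#include <cstdint>

// True when "shift in 16 bits, then truncate to 8 bits" gives the same
// result as "truncate to 8 bits, then shift in 8 bits".
bool narrowShiftMatches(std::uint16_t X, unsigned Shift) {
  std::uint8_t Wide = static_cast<std::uint8_t>(X >> Shift);
  std::uint8_t Narrow =
      static_cast<std::uint8_t>(static_cast<std::uint8_t>(X) >> Shift);
  return Wide == Narrow;
}

// Exhaustively checks every X whose truncated (high) 8 bits are zero,
// combined with every shift amount below the target bitwidth of 8.
bool holdsForAllValidInputs() {
  for (unsigned X = 0; X <= 0xFF; ++X)
    for (unsigned Shift = 0; Shift < 8; ++Shift)
      if (!narrowShiftMatches(static_cast<std::uint16_t>(X), Shift))
        return false;
  return true;
}
```

When a truncated bit is set, the equality can fail: for `X = 0x100` and a shift of 1, the wide form yields `0x80` while the narrow form yields `0`.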

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201

2 years ago[Test][AggressiveInstCombine] Add one more tests for shifts
Anton Afanasyev [Wed, 18 Aug 2021 14:23:09 +0000 (17:23 +0300)]
[Test][AggressiveInstCombine] Add one more tests for shifts

2 years ago[mlir][tosa] Fix clamp to restrict only within valid bitwidth range
Robert Suderman [Wed, 18 Aug 2021 18:55:54 +0000 (11:55 -0700)]
[mlir][tosa] Fix clamp to restrict only within valid bitwidth range

It's possible for the clamp to have invalid min/max values on its range. To fix
this we validate the range of the min/max and clamp to a valid range.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108256

2 years ago[Polly] Introduce caching for the isErrorBlock function. NFC.
Michael Kruse [Wed, 18 Aug 2021 18:36:17 +0000 (13:36 -0500)]
[Polly] Introduce caching for the isErrorBlock function. NFC.

Compilation of the file insn-attrtab.c of the SPEC CPU 2017 502.gcc_r
benchmark takes excessive time (> 30min) with Polly enabled. Most time
is spent in the isErrorBlock function querying the DominatorTree.
The isErrorBlock is invoked redundantly over the course of ScopDetection
and ScopBuilder. This patch introduces a caching mechanism for its
result.
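
The general shape of such a cache can be sketched as follows. This is an assumption-laden illustration: `ErrorBlockCache`, the integer block ids, and the injected `Compute` callback are all hypothetical stand-ins, not Polly's ScopDetection API.

```cpp
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an analysis object that owns a per-block cache.
class ErrorBlockCache {
public:
  explicit ErrorBlockCache(std::function<bool(int)> Compute)
      : Compute(std::move(Compute)) {}

  // Non-const on purpose: a lookup may insert a new entry into the cache.
  bool isErrorBlock(int BlockId) {
    auto It = Cache.find(BlockId);
    if (It != Cache.end())
      return It->second; // answered from the cache, no recomputation
    bool Result = Compute(BlockId); // the expensive query runs once per block
    Cache.emplace(BlockId, Result);
    return Result;
  }

private:
  std::function<bool(int)> Compute;
  std::unordered_map<int, bool> Cache;
};
```

Because the member function mutates the map, callers need a non-const object.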

Instead of a free function, isErrorBlock is moved to ScopDetection where
its cache map resides. This also means that many functions directly or
indirectly calling isErrorBlock are not "const" anymore. The
DetectionContextMap was marked as "mutable", but IMHO it never should
have been since it stores the detection result.

502.gcc_r only takes excessive time with the new pass manager. The
reason seems to be that it invalidates the ScopDetection analysis more
often than the legacy pass manager, for unknown reasons.

2 years agoReapply: [NFC] factor out unrolling decision logic
Ali Sedaghati [Wed, 18 Aug 2021 18:57:56 +0000 (11:57 -0700)]
Reapply: [NFC] factor out unrolling decision logic

reverting ffd8a268bdc518f87e9ba7524aba0458f4b9979c (reapplying
4d559837e887c278d7c27274f4f6b1b78b97c00d) - removed spurious inclusion
of <optional>

Differential Revision: https://reviews.llvm.org/D106001

2 years ago[X86][NFC] Pre-commit tests for PR51494
Andrea Di Biagio [Wed, 18 Aug 2021 18:40:35 +0000 (19:40 +0100)]
[X86][NFC] Pre-commit tests for PR51494

2 years ago[PowerPC] Regenerate 2007-09-08-unaligned.ll test checks
Simon Pilgrim [Wed, 18 Aug 2021 18:53:57 +0000 (19:53 +0100)]
[PowerPC] Regenerate 2007-09-08-unaligned.ll test checks