review.tizen.org Git - platform/upstream/llvm.git/log

Matthias Springer [Thu, 15 Jul 2021 02:27:52 +0000 (11:27 +0900)]

[mlir][linalg] Improve codegen when tiling PadTensor evenly

Produce simpler IR with more static type information and fewer affine expressions.

Differential Revision: https://reviews.llvm.org/D105530

Matthias Springer [Thu, 15 Jul 2021 02:05:12 +0000 (11:05 +0900)]

[mlir][linalg] Improve codegen of ExtractSliceOfPadTensorSwapPattern

Generate simpler code in case low/high padding of the PadTensorOp is statically zero.

Differential Revision: https://reviews.llvm.org/D105529

Matthias Springer [Thu, 15 Jul 2021 01:55:22 +0000 (10:55 +0900)]

[mlir][linalg] Fix Windows build

The build failure was introduced by D105458. (Linux builds were not affected.)

Differential Revision: https://reviews.llvm.org/D106029

Matthias Springer [Thu, 15 Jul 2021 01:35:46 +0000 (10:35 +0900)]

[mlir][linalg] Tile PadTensorOp

Tiling can be enabled with `linalg-tile-pad-tensor-ops`. Only scf::ForOp can be generated at the moment.

Differential Revision: https://reviews.llvm.org/D105460

Matthias Springer [Thu, 15 Jul 2021 01:28:25 +0000 (10:28 +0900)]

[mlir][NFC] Move asOpFoldResult helper functions to StaticValueUtils

Differential Revision: https://reviews.llvm.org/D105602

Matthias Springer [Thu, 15 Jul 2021 01:20:00 +0000 (10:20 +0900)]

[mlir][linalg] Add optional output operand to PadTensorOp

This optional operand will be used for tiling in a subsequent commit.

Differential Revision: https://reviews.llvm.org/D105459

Matthias Springer [Thu, 15 Jul 2021 01:11:35 +0000 (10:11 +0900)]

[mlir][linalg][NFC] Factor out tile generation in makeTiledShapes

Factor out the functionality into a new function, so that it can be used for creating PadTensorOp tiles.

Differential Revision: https://reviews.llvm.org/D105458

LLVM GN Syncbot [Thu, 15 Jul 2021 01:12:36 +0000 (01:12 +0000)]

[gn build] Port b9c3941cd61d

Kai Luo [Thu, 15 Jul 2021 00:49:42 +0000 (00:49 +0000)]

[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand

This patch uses AtomicExpandPass to implement quadword lock-free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expands atomic operations post-RA to avoid spilling that might prevent LL/SC progress.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D103614

Dave Airlie [Thu, 15 Jul 2021 00:51:01 +0000 (10:51 +1000)]

[OpenCL] opencl-c.h: CL3.0 generic address space

This is one of the easier pieces of adding CL3.0 support.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D105526

Dave Airlie [Thu, 15 Jul 2021 00:48:19 +0000 (10:48 +1000)]

[OpenCL][NFC] opencl-c.h: reorder atomic operations

This just reorders the atomics; it doesn't change anything except their layout in the header.

This is a prep patch for adding some conditionals around these for CL3.0, but that patch is much easier to review if all the atomic operations are grouped together like this.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D105601

Jan Vesely [Thu, 15 Jul 2021 00:41:50 +0000 (10:41 +1000)]

libclc: Add -cl-no-stdinc to clang flags on clang >=13

cf3ef15a6ec5e5b45c6c54e8fbe3769255e815ce ("[OpenCL] Add builtin
declarations by default.") switched behaviour to include
"opencl-c-base.h". We don't want or need that for libclc, so pass the
flag to revert to the old behaviour.

Fixes build since cf3ef15a6ec5e5b45c6c54e8fbe3769255e815ce

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D99794

Kuter Dinel [Tue, 13 Jul 2021 02:14:50 +0000 (05:14 +0300)]

[AMDGPU] Use update_test_checks.py script for annotate kernel features tests.

This patch makes the annotate kernel features tests use the update_test_checks.py
script, which makes it easy to update the tests.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105864

Arthur O'Dwyer [Wed, 14 Jul 2021 04:01:47 +0000 (00:01 -0400)]

[libc++] NFCI: Restore code duplication in wrap_iter, with test.

It turns out that D105040 broke `std::rel_ops`; we actually do need
both a one-template-parameter and a two-template-parameter version of
all the comparison operators, because if we have only the heterogeneous
two-parameter version, then `x > x` is ambiguous:

    template<class T, class U> int f(S<T>, S<U>) { return 1; }
    template<class T> int f(T, T) { return 2; }  // rel_ops
    S<int> s; f(s,s);  // ambiguous between #1 and #2

Adding the one-template-parameter version fixes the ambiguity:

    template<class T, class U> int f(S<T>, S<U>) { return 1; }
    template<class T> int f(T, T) { return 2; }  // rel_ops
    template<class T> int f(S<T>, S<T>) { return 3; }
    S<int> s; f(s,s);  // #3 beats both #1 and #2

We have the same problem with `reverse_iterator` as with `__wrap_iter`.
But so do libstdc++ and Microsoft, so we're not going to worry about it.
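
The overload-resolution behaviour described above can be reproduced in isolation. The following is a minimal sketch, not libc++ code: `S`, `pick`, and the helper functions are illustrative names that mirror the snippet in the commit message, with `S` standing in for a wrapper such as `__wrap_iter`.

```cpp
#include <cassert>

// Illustrative stand-in for an iterator wrapper (hypothetical name).
template <class T> struct S {};

// #1: heterogeneous two-template-parameter overload.
template <class T, class U> constexpr int pick(S<T>, S<U>) { return 1; }
// #2: generic same-type overload, playing the role of std::rel_ops.
template <class T> constexpr int pick(T, T) { return 2; }
// #3: homogeneous one-template-parameter overload. Partial ordering ranks it
// as more specialized than both #1 and #2, which removes the ambiguity.
template <class T> constexpr int pick(S<T>, S<T>) { return 3; }

// Same wrapper type on both sides: all three are viable, #3 wins.
constexpr int sameType() { return pick(S<int>{}, S<int>{}); }
// Different wrapper types: only #1 is viable.
constexpr int mixedType() { return pick(S<int>{}, S<long>{}); }
// Plain values: only #2 is viable.
constexpr int plainValues() { return pick(1, 1); }

// Runtime check of the three cases above.
inline void checkOverloads() {
  assert(sameType() == 3);
  assert(mixedType() == 1);
  assert(plainValues() == 2);
}
```

Deleting overload #3 from the sketch makes `sameType()` fail to compile with an ambiguity error, which is the situation the commit message describes.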

Differential Revision: https://reviews.llvm.org/D105894

Nathan Ridge [Tue, 6 Jul 2021 05:40:24 +0000 (01:40 -0400)]

[clang] Refactor AST printing tests to share more infrastructure

Differential Revision: https://reviews.llvm.org/D105457

Thomas Lively [Wed, 14 Jul 2021 23:15:24 +0000 (16:15 -0700)]

[WebAssembly] Codegen for v128.storeX_lane instructions

Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.

Differential Revision: https://reviews.llvm.org/D106019

Jon Roelofs [Mon, 12 Jul 2021 19:43:45 +0000 (12:43 -0700)]

[GlobalOpt] Fix a miscompile when evaluating struct initializers.

The bug was that evaluateBitcastFromPtr attempts a narrowing to a struct's 0th
element of a store that covers other elements. While this is okay on the load
side, applying it to stores causes us to miss the writes to the additionally
covered elements.
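
Why narrowing is safe for loads but not for stores can be seen in a small model. This is a hedged sketch in plain C++, not the GlobalOpt evaluator: `Pair` and the three helper functions are hypothetical names, and `memcpy` stands in for the evaluated load and store.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical two-field struct standing in for the global being initialized.
struct Pair {
  std::uint32_t first;
  std::uint32_t second;
};

// Correct evaluation: the store covers the whole struct, so both fields change.
inline Pair storeWhole(Pair dst, const Pair &src) {
  std::memcpy(&dst, &src, sizeof(Pair));
  return dst;
}

// Buggy evaluation: the store is narrowed to element 0, so the bytes that
// belong to `second` are never written.
inline Pair storeNarrowedToFirst(Pair dst, const Pair &src) {
  std::memcpy(&dst.first, &src.first, sizeof(dst.first));
  return dst;
}

// Narrowing a load is harmless when only element 0 is being read.
inline std::uint32_t loadNarrowedToFirst(const Pair &src) {
  std::uint32_t value = 0;
  std::memcpy(&value, &src, sizeof(value));
  return value;
}

// Demonstrates the lost write: `second` keeps its old value after narrowing.
inline void checkNarrowing() {
  const Pair zero{0, 0};
  const Pair src{7, 9};
  assert(storeWhole(zero, src).second == 9);
  assert(storeNarrowedToFirst(zero, src).second == 0);
  assert(loadNarrowedToFirst(src) == 7);
}
```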

rdar://79503568

Differential revision: https://reviews.llvm.org/D105838

Steven Wu [Wed, 14 Jul 2021 22:23:37 +0000 (15:23 -0700)]

[Support] Turn on SupportTest for Apple Silicon

Follow-up to D106012: turn on the Host unit test on Apple Silicon.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106020

George Rokos [Tue, 13 Jul 2021 22:33:50 +0000 (15:33 -0700)]

[libomptarget] Keep the Shadow Pointer Map up-to-date

D105812 introduced a regression where if a PTR_AND_OBJ entry was mapped on the device, then the OBJ was deallocated and then reallocated at a different address, the Shadow Pointer Map would still contain an entry for the PTR but pointing to the old address. This caused test `env/base_ptr_ref_count.c` to fail.

Differential Revision: https://reviews.llvm.org/D105947

owenca [Wed, 14 Jul 2021 06:49:48 +0000 (23:49 -0700)]

[clang-format] Make BreakAfterReturnType work with K&R C functions

This fixes PR50999.

Differential Revision: https://reviews.llvm.org/D105964

Arthur Eubanks [Wed, 14 Jul 2021 21:36:07 +0000 (14:36 -0700)]

[docs][OpaquePtr] Remove finished task

Wolfgang Pieb [Fri, 25 Jun 2021 17:54:24 +0000 (10:54 -0700)]

[ARM] Fix RELA relocations for 32bit ARM.

RELA relocations for 32 bit ARM ignored the addend. Some tools generate
them instead of REL type relocations. This fixes PR50473.

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D105214

Michael Kruse [Wed, 14 Jul 2021 21:21:45 +0000 (16:21 -0500)]

[Polly] Fix misleading debug message. NFC.

The number of parameters can be the reason for aliasing checks not being
generated, but most of the time it is for other reasons.

Derek Schuff [Fri, 31 Jan 2020 23:55:47 +0000 (15:55 -0800)]

[llvm-strip][WebAssembly] Support strip flags

Summary:
Add support for the basic section stripping (and keeping) flags for wasm:
strip with no flags, --strip-all, --strip-debug,
--only-section, --keep-section, and --only-keep-debug.

Factor section removal into a function and use a predicate chain like
the ELF implementation.

Reviewers: jhenderson, sbc100

Differential Revision: https://reviews.llvm.org/D73820

Arthur Eubanks [Wed, 14 Jul 2021 20:56:59 +0000 (13:56 -0700)]

Precommit test for D106017

Arthur Eubanks [Thu, 8 Jul 2021 22:05:50 +0000 (15:05 -0700)]

[SimpleLoopUnswitch] Don't non-trivially unswitch loops with catchswitch exits

SplitBlock() can't handle catchswitch.

Fixes PR50973.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D105672

Jon Roelofs [Wed, 14 Jul 2021 19:45:28 +0000 (12:45 -0700)]

[AArch64] Fix selection of G_UNMERGE <2 x s16>

Differential revision: https://reviews.llvm.org/D106007

Nicolas Vasilache [Wed, 14 Jul 2021 20:33:29 +0000 (20:33 +0000)]

[mlir][affine] Add single result affine.min/max -> affine.apply canonicalization.

Differential Revision: https://reviews.llvm.org/D106014

Philip Reames [Wed, 14 Jul 2021 20:35:18 +0000 (13:35 -0700)]

[tests] Stabilize tests for possible change in deref semantics

This is conceptually part of e75a2dfe.  This file contains both tests whose results don't change (with the right attributes added), and tests which fundamentally regress with the current proposal.  Doing the update took some care, thus the separate change.

Here's the e75a2dfe context repeated:

There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which cannot free that memory.  A couple of other tests need more one-off attention; they'll be handled in another change.

Kirill Stoimenov [Wed, 14 Jul 2021 19:31:49 +0000 (12:31 -0700)]

[asan][clang] Add flag to outline instrumentation

Summary: This option can be used to reduce the size of the
binary. The trade-off in this case would be the run-time
performance.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105726

Jonas Devlieghere [Tue, 13 Jul 2021 17:34:27 +0000 (10:34 -0700)]

[lldb] Make TargetList iterable (NFC)

Make it possible to iterate over the TargetList.

Differential revision: https://reviews.llvm.org/D105914

Jonas Devlieghere [Wed, 14 Jul 2021 20:21:36 +0000 (13:21 -0700)]

[lldb] Always call DestroyImpl from Process::Finalize

Always destroy the process, regardless of its private state. This will
call the virtual function DoDestroy under the hood, giving our derived
class a chance to do the necessary tear down, including what to do when
the private state is eStateExited.

Differential revision: https://reviews.llvm.org/D106004

Steven Wu [Wed, 14 Jul 2021 20:29:15 +0000 (13:29 -0700)]

[Support] Get correct number of physical cores on Apple Silicon

Fix a bug where `computeHostNumPhysicalCores` falls back to the default
"unknown" result when building for Apple Silicon Macs.

rdar://80533675

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D106012

Nicolas Vasilache [Wed, 14 Jul 2021 20:06:16 +0000 (20:06 +0000)]

[mlir] NFC - Add AffineMap::replace variant with dim/symbol inference

Philip Reames [Wed, 14 Jul 2021 20:19:48 +0000 (13:19 -0700)]

Global variables with strong definitions cannot be freed

With the current deref semantics, this is redundant - since we assume that anything which is dereferenceable (ever) can't be freed - but it becomes necessary for the deref-at-point semantics.

Testing-wise, this is covered by test/CodeGen/X86/hoist-invariant-load.ll when -use-dereferenceable-at-point-semantics is active. I didn't bother duplicating the command line since a) it's an in-development mode, and b) the change is pretty obvious.

Martin Storsjö [Wed, 14 Jul 2021 06:40:25 +0000 (09:40 +0300)]

[libcxx] [test] Remove a LIBCXX-WINDOWS-FIXME in trivial_abi/unique_ptr_ret

This is the same thing that was clarified in D105906 for weak_ptr_ret.

Differential Revision: https://reviews.llvm.org/D105965

Philip Reames [Wed, 14 Jul 2021 19:35:23 +0000 (12:35 -0700)]

[tests] Stabilize tests for possible change in deref semantics

There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which cannot free that memory.  A couple of other tests need more one-off attention; they'll be handled in another change.

Stanislav Mekhanoshin [Mon, 12 Jul 2021 19:27:34 +0000 (12:27 -0700)]

[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization

Any def of EXEC prevents rematerialization of any VOP instruction
because of the physreg use. Create a callback to check if the
physreg use can be ignored to allow rematerialization.

Differential Revision: https://reviews.llvm.org/D105836

Alexey Bataev [Wed, 14 Jul 2021 19:42:51 +0000 (12:42 -0700)]

[SLP][NFC] Fix variable names, NFC.

Fangrui Song [Wed, 14 Jul 2021 19:39:22 +0000 (12:39 -0700)]

[docs] Fix :option:`--file-header` reference in llvm-readelf.rst after D105532

Simon Pilgrim [Wed, 14 Jul 2021 19:16:56 +0000 (20:16 +0100)]

[SLP] Fix case of variable name. NFCI.

Geoffrey Martin-Noble [Wed, 14 Jul 2021 19:09:41 +0000 (12:09 -0700)]

[Bazel] Uniformly export all MLIR td files

CMake would have no restrictions on this and the custom list is a pain
to maintain.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D106003

Louis Dionne [Wed, 14 Jul 2021 19:13:19 +0000 (15:13 -0400)]

[runtimes] Bring back TARGET_TRIPLE

This commit reverts 5099e01568 and 77396bbc98, which broke the build
in various ways. I'm reverting until I can investigate, since that
change appears to be way more subtle than it seemed.

Roman Lebedev [Wed, 14 Jul 2021 19:14:05 +0000 (22:14 +0300)]

[NFC] Drop redundant check prefixes in newly added test file

Nikita Popov [Wed, 14 Jul 2021 19:09:06 +0000 (21:09 +0200)]

[Attributes] Use single method to fetch type from AttributeSet (NFC)

While it is nice to have separate methods in the public AttributeSet
API, we can fetch the type from the internal AttributeSetNode
using a generic API for all type attribute kinds.

Roman Lebedev [Wed, 14 Jul 2021 18:54:04 +0000 (21:54 +0300)]

[NFC][PhaseOrdering] Add test for the lack of CSE after SimplifyCFG (PR51092)

David Green [Wed, 14 Jul 2021 19:06:49 +0000 (20:06 +0100)]

[ARM] Move add(VMLALVA(A, X, Y), B) to VMLALVA(add(A, B), X, Y)

For i64 reductions we currently try and convert add(VMLALV(X, Y), B) to
VMLALVA(B, X, Y), incorporating the addition into the VMLALVA. If we
have an add of an existing VMLALVA, this patch pushes the add up above
the VMLALVA so that it may potentially be simplified further, for
example being folded into another VMLALV.
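
The rewrite relies on the accumulator addition being reassociable. Below is a scalar sketch of that identity, assuming a simplified 4-lane model: `Lanes`, `vmlalv`, `vmlalva`, `addAfter`, and `addBefore` are illustrative names, not the ARM backend's node types, and the sample values are kept small so that 64-bit overflow is not in play.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::int32_t, 4>;

// Scalar model of VMLALV: sum of lane-wise products, widened to 64 bits.
inline std::int64_t vmlalv(const Lanes &x, const Lanes &y) {
  std::int64_t sum = 0;
  for (std::size_t i = 0; i < x.size(); ++i)
    sum += static_cast<std::int64_t>(x[i]) * y[i];
  return sum;
}

// Scalar model of VMLALVA: the same reduction added to an accumulator.
inline std::int64_t vmlalva(std::int64_t acc, const Lanes &x, const Lanes &y) {
  return acc + vmlalv(x, y);
}

// Shape before the patch: add(VMLALVA(A, X, Y), B).
inline std::int64_t addAfter(std::int64_t a, std::int64_t b, const Lanes &x,
                             const Lanes &y) {
  return vmlalva(a, x, y) + b;
}

// Shape after the patch: VMLALVA(add(A, B), X, Y).
inline std::int64_t addBefore(std::int64_t a, std::int64_t b, const Lanes &x,
                              const Lanes &y) {
  return vmlalva(a + b, x, y);
}

// Both shapes compute A + B + sum(X[i] * Y[i]).
inline void checkReassociation() {
  const Lanes x{{1, -2, 3, 4}};
  const Lanes y{{5, 6, -7, 8}};
  assert(addAfter(100, -30, x, y) == addBefore(100, -30, x, y));
}
```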

Differential Revision: https://reviews.llvm.org/D105686

Vitaly Buka [Wed, 14 Jul 2021 01:11:57 +0000 (18:11 -0700)]

[scudo] Don't enable MTE for small alignment

Differential Revision: https://reviews.llvm.org/D105954

Mehdi Amini [Wed, 14 Jul 2021 19:01:34 +0000 (19:01 +0000)]

Remove uses of deprecated target AllPassesAndDialectsNoRegistration in Bazel (NFC)

It was an alias for a long time.

Nikita Popov [Wed, 14 Jul 2021 18:58:52 +0000 (20:58 +0200)]

[Verifier] Improve incompatible attribute type check

A couple of attributes had explicit checks for incompatibility
with pointer types. However, this is already handled generically
by the typeIncompatible() check. We can drop these after adding
SwiftError to typeIncompatible().

However, the previous implementation of the check prints out all
attributes that are incompatible with a given type, even though
those attributes aren't actually used. This has the annoying
result that the error message changes every time a new attribute
is added to the list. Improve this by explicitly finding which
attribute isn't compatible and printing just that.

Saleem Abdulrasool [Wed, 14 Jul 2021 18:42:24 +0000 (11:42 -0700)]

Demangle: correct swift_async demangling for Microsoft scheme

The emission was corrected for the swift_async calling convention but
the demangling support was not. This repairs the demangling support as
well.

Eli Friedman [Mon, 12 Jul 2021 22:11:01 +0000 (15:11 -0700)]

[SelectionDAG] Add an overload of getStepVector that assumes step 1.

This is mostly a minor convenience, but the pattern seems frequent
enough to be worthwhile (and we'll probably add more uses in the
future).

Differential Revision: https://reviews.llvm.org/D105850

Thomas Lively [Wed, 14 Jul 2021 18:31:53 +0000 (11:31 -0700)]

[WebAssembly] Codegen for v128.loadX_lane instructions

Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.

Differential Revision: https://reviews.llvm.org/D105950

Louis Dionne [Wed, 14 Jul 2021 18:25:13 +0000 (14:25 -0400)]

[runtimes] Inherit the TARGET_TRIPLE that may be set by LLVM

Thomas Lively [Wed, 14 Jul 2021 18:17:08 +0000 (11:17 -0700)]

[WebAssembly] Remove datalayout strings from llc tests

The data layout strings do not have any effect on llc tests and will become
misleadingly out of date as we continue to update the canonical data layout, so
remove them from the tests.

Differential Revision: https://reviews.llvm.org/D105842

Fangrui Song [Wed, 14 Jul 2021 17:18:30 +0000 (10:18 -0700)]

[ELF] --fortran-common: prefer STB_WEAK to COMMON

The ELF specification says "The link editor honors the common definition and
ignores the weak ones." GNU ld and our Symbol::compare follow this, but the
--fortran-common code (D86142) made a mistake on the precedence.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51082

Reviewed By: peter.smith, sfertile

Differential Revision: https://reviews.llvm.org/D105945

David Green [Wed, 14 Jul 2021 17:11:32 +0000 (18:11 +0100)]

[ARM] Lower v16i8 -> i64 VMLA reductions.

MVE does not have a VMLALV instruction that can perform v16i8 -> i64
reductions, like it does for v8i16->i64 and v4i32->i64 reductions. That
means that the pattern to create them will be split up by type
legalization, creating a lot of instructions.

This extends the patterns for matching i64 reductions a little to handle
the v16i8->i64 case. We need to turn them into a pair of v8i16->i64
VMLALVs that each perform half of the reduction and are summed together
(so the latter is a VMLALVA). The order of the lanes does not matter for
the reduction so we generate a MVEEXT for the extension, that will
either be folded into an extending load or can be optimized to a
VREV/VMOVL. Some of the resulting codegen isn't optimal, but will be
improved in a later patch.
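
The split described above can be checked with a scalar model. This is a sketch under stated assumptions: `Bytes16`, `reduceAll`, `reduceEightLanes`, `reduceSplit`, and `ramp` are hypothetical names, and the even/odd lane partition is an arbitrary choice made here, relying only on the commit's statement that lane order does not matter for the reduction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::int8_t, 16>;

// Reference: reduce all 16 sign-extended products straight into an i64.
inline std::int64_t reduceAll(const Bytes16 &x, const Bytes16 &y) {
  std::int64_t sum = 0;
  for (std::size_t i = 0; i < x.size(); ++i)
    sum += static_cast<std::int64_t>(x[i]) * y[i];
  return sum;
}

// One v8i16 -> i64 half: widen every second lane to 16 bits, multiply, and
// accumulate onto `acc` (acc == 0 models VMLALV, nonzero models VMLALVA).
inline std::int64_t reduceEightLanes(const Bytes16 &x, const Bytes16 &y,
                                     std::size_t first, std::int64_t acc) {
  for (std::size_t i = first; i < x.size(); i += 2) {
    const std::int16_t wideX = x[i];
    const std::int16_t wideY = y[i];
    acc += static_cast<std::int64_t>(wideX) * wideY;
  }
  return acc;
}

// Two halves summed together: the second reduction accumulates onto the first.
inline std::int64_t reduceSplit(const Bytes16 &x, const Bytes16 &y) {
  const std::int64_t evenLanes = reduceEightLanes(x, y, 0, 0);
  return reduceEightLanes(x, y, 1, evenLanes);
}

// Test helper: lane i holds start + step * i (caller keeps it within int8).
inline Bytes16 ramp(int start, int step) {
  Bytes16 v{};
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = static_cast<std::int8_t>(start + step * static_cast<int>(i));
  return v;
}

inline void checkSplit() {
  assert(reduceSplit(ramp(-8, 1), ramp(3, -1)) ==
         reduceAll(ramp(-8, 1), ramp(3, -1)));
}
```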

Differential Revision: https://reviews.llvm.org/D105680

Sanjay Patel [Wed, 14 Jul 2021 15:57:36 +0000 (11:57 -0400)]

[InstCombine] reorder icmp with offset folds for better results

This set of folds was added recently with:
c7b658aeb526
0c400e895306
40b752d28d95

...and I noted that this wasn't likely to fire in code derived
from C/C++ source because of nsw in particular. But I didn't
notice that I had placed the code above the no-wrap block
of transforms.

This is likely the cause of regressions noted from the previous
commit because -- as shown in the test diffs -- we may have
transformed into a compare with an arbitrary constant rather
than a simpler signbit test.

Sanjay Patel [Wed, 14 Jul 2021 15:35:23 +0000 (11:35 -0400)]

[InstCombine] add tests for icmp with constant offset and no-wrap flags; NFC

Sander de Smalen [Wed, 14 Jul 2021 15:45:07 +0000 (16:45 +0100)]

[LV] Print remark when loop cannot be vectorized due to invalid costs.

This patch emits remarks for instructions that have invalid costs for
a given set of vectorization factors. Some example output:

  t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load
      dst[i] = sinf(src[i]);
                    ^
  t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32
      dst[i] = sinf(src[i]);
               ^
  t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store
      dst[i] = sinf(src[i]);
             ^

Reviewed By: fhahn, kmclaughlin

Differential Revision: https://reviews.llvm.org/D105806

Matt Arsenault [Thu, 10 Jun 2021 13:28:20 +0000 (09:28 -0400)]

GlobalISel: Handle lowering non-power-of-2 extloads

Sander de Smalen [Wed, 14 Jul 2021 08:43:30 +0000 (09:43 +0100)]

[CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> invalid.

At the moment, <vscale x 1 x eltty> types are not yet fully handled by the
code generator, so to avoid vectorizing loops with that VF, we mark the
cost for these types as invalid.
The reason for not adding a new "TTI::getMinimumScalableVF" is that
the type is supposed to be a type that can be legalized. It partially is,
although the support for these types needs some more work.

Reviewed By: paulwalker-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D103882

Aaron Ballman [Wed, 14 Jul 2021 15:40:37 +0000 (11:40 -0400)]

Combine two diagnostics into one and correct grammar

The anonymous and non-anonymous bit-field diagnostics are easily
combined into one diagnostic. However, the diagnostic was missing a
"the" that is present in the almost-identically worded
warn_bitfield_width_exceeds_type_width diagnostic, hence the changes to
test cases.

Jay Foad [Tue, 13 Jul 2021 13:30:54 +0000 (14:30 +0100)]

[AMDGPU] Check llc-pipeline.ll with -match-full-lines -strict-whitespace

This prevents breaking the indentation that shows the structure of the
pass managers.

Differential Revision: https://reviews.llvm.org/D105891

Alexey Bataev [Mon, 12 Jul 2021 17:44:36 +0000 (10:44 -0700)]

[SLP] Workaround for InsertSubVector cost.

The cost of the InsertSubvector shuffle kind is not complete and
may end up with just extracts + inserts costs in many cases. Added
a workaround to represent it as a generic PermuteSingleSrc, which is
still pessimistic but better than InsertSubvector.

Differential Revision: https://reviews.llvm.org/D105827

Louis Dionne [Wed, 14 Jul 2021 14:49:28 +0000 (10:49 -0400)]

[runtimes] NFCI: Drop intermediate CMake variable TARGET_TRIPLE

We might as well use the various XXX_TARGET_TRIPLE variables directly.

Yitzhak Mandelbaum [Fri, 2 Jul 2021 18:53:10 +0000 (18:53 +0000)]

[Lexer] Fix bug in `makeFileCharRange` called on split tokens.

When the end loc of the specified range is a split token, `makeFileCharRange`
does not process it correctly. This patch adds proper support for split tokens.

Differential Revision: https://reviews.llvm.org/D105365

Peixin Qiao [Wed, 14 Jul 2021 13:42:26 +0000 (09:42 -0400)]

[flang][OpenMP] Fix semantic check of test case in taskloop simd construct

The following semantic check is removed in OpenMP Version 5.0:
```
Taskloop simd construct restrictions: No reduction clause can be specified.
```

Also fix several typos.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D105874

Jinsong Ji [Wed, 14 Jul 2021 13:37:50 +0000 (13:37 +0000)]

[AIX] Enable dollar sign as PC in inlineasm

$ is used as the PC in PowerPC inline asm. ELF already uses it, so
enable it for AIX XCOFF as well.

Reviewed By: #powerpc, amyk, nemanjai

Differential Revision: https://reviews.llvm.org/D105956

Matthias Springer [Wed, 14 Jul 2021 13:14:05 +0000 (22:14 +0900)]

[mlir][linalg] Fix typo in ExtractSliceOfPadTensorSwapPattern

Differential Revision: https://reviews.llvm.org/D105607

oToToT [Wed, 14 Jul 2021 13:14:12 +0000 (21:14 +0800)]

[docs] Update CMake cross compiling guide link

The CMake community Wiki has been moved to the Kitware GitLab Instance (https://gitlab.kitware.com/cmake/community/wikis/home).
Also, the original anchor for the `Information how to set up various cross compiling toolchains` section might not work as expected. The original content is now collapsed, so the browser won't navigate to the right section directly.

Hence, I think it might be better to provide the section name instead of `this section`, with a link, to help readers find the right section by themselves.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D104996

Tim Northover [Wed, 14 Jul 2021 13:11:20 +0000 (14:11 +0100)]

ARM: reuse existing libcall global variable if possible.

If we try to create a new GlobalVariable on each iteration, the Module will
detect the name collision and "helpfully" rename later iterations by appending
".1" etc. But "___udivsi3.1" doesn't exist and we definitely don't want to try
to call it.

So instead check whether there's already a global with the right name in the
module and use that if so.
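
The renaming problem and the lookup-first fix can be illustrated with a toy symbol table. This is a hedged sketch, not LLVM's `Module` class: `ToyModule`, `createGlobal`, `getOrCreateGlobal`, and the two helper functions are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy symbol table that mimics the behaviour described above: creating a
// symbol whose name is already taken yields "name.1", "name.2", and so on.
class ToyModule {
public:
  // Always creates a new global, uniquifying the name on collision.
  std::string createGlobal(const std::string &name) {
    std::string unique = name;
    for (int suffix = 1; globals_.count(unique) != 0; ++suffix)
      unique = name + "." + std::to_string(suffix);
    globals_.insert(unique);
    return unique;
  }

  // Reuses a global with that exact name, creating one only if it is absent.
  std::string getOrCreateGlobal(const std::string &name) {
    if (globals_.count(name) != 0)
      return name;
    return createGlobal(name);
  }

private:
  std::set<std::string> globals_;
};

// Name obtained on the second iteration when a new global is created each time.
inline std::string nameOnSecondCreate() {
  ToyModule module;
  module.createGlobal("___udivsi3");
  return module.createGlobal("___udivsi3");
}

// Name obtained on the second iteration when an existing global is reused.
inline std::string nameOnSecondLookup() {
  ToyModule module;
  module.getOrCreateGlobal("___udivsi3");
  return module.getOrCreateGlobal("___udivsi3");
}

inline void checkNames() {
  assert(nameOnSecondCreate() == "___udivsi3.1");
  assert(nameOnSecondLookup() == "___udivsi3");
}
```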

Sanjay Patel [Wed, 14 Jul 2021 13:02:31 +0000 (09:02 -0400)]

[SLP] match logical and/or as reduction candidates

This has been a work-in-progress for a long time...we finally have all of
the pieces in place to handle vectorization of compare code as shown in:
https://llvm.org/PR41312

To do this (see PhaseOrdering tests), we converted SimplifyCFG and
InstCombine to the poison-safe (select) forms of the logic ops, so now we
need to have SLP recognize those patterns and insert a freeze op to make
a safe reduction:
https://alive2.llvm.org/ce/z/NH54Ah

We get the minimal patterns with this patch, but the PhaseOrdering tests
show that we still need adjustments to get the ideal IR in some or all of
the motivating cases.

Differential Revision: https://reviews.llvm.org/D105730

Gabor Marton [Wed, 9 Jun 2021 15:03:47 +0000 (17:03 +0200)]

[Analyzer][solver] Add dump methods for (dis)equality classes.

This proved to be very useful during debugging.

Differential Revision: https://reviews.llvm.org/D103967

Alexander Shaposhnikov [Wed, 14 Jul 2021 11:33:09 +0000 (04:33 -0700)]

[lld][MachO] Code cleanup

Make use of ArgList::getLastArgValue. NFC.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D105452

Djordje Todorovic [Mon, 28 Jun 2021 12:15:31 +0000 (05:15 -0700)]

[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs

This new MIR pass removes redundant DBG_VALUEs.

After the register allocator is done, more precisely, after
the Virtual Register Rewriter, we end up having duplicated
DBG_VALUEs, since some virtual registers are being rewritten
into the same physical register as some of the existing DBG_VALUEs.
Each DBG_VALUE should indicate (at least before LiveDebugValues)
a variable assignment, but this is being clobbered for function
parameters during SelectionDAG, since it generates new DBG_VALUEs
after COPY instructions even though the parameter has no assignment.
For example, if we had a DBG_VALUE $regX as an entry debug value
representing the parameter, then a COPY followed by a
DBG_VALUE $virt_reg, and virtregrewrite rewrites $virt_reg
into $regX, we'd end up with a redundant DBG_VALUE.

This breaks the definition of the DBG_VALUE, and some analysis passes
might be built on top of that premise, so this patch tries to fix
the MIR with respect to that.

This first patch performs a backward scan, trying to detect a sequence of
consecutive DBG_VALUEs and to remove all DBG_VALUEs describing one
variable except the last one.

For example:

(1) DBG_VALUE $edi, !"var1", ...
(2) DBG_VALUE $esi, !"var2", ...
(3) DBG_VALUE $edi, !"var1", ...
...

in this case, we can remove (1).
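
The backward scan over one run of consecutive DBG_VALUEs can be sketched as follows. This is a simplified model and not the pass itself: `DbgValue`, `removeRedundant`, and `commitExample` are hypothetical names, and a debug value is reduced to a register/variable pair.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for a DBG_VALUE: the register that holds the value and
// the variable it describes.
struct DbgValue {
  std::string reg;
  std::string var;
};

// Walk the run from the end and keep only the last DBG_VALUE seen for each
// variable; earlier ones for the same variable are redundant.
inline std::vector<DbgValue> removeRedundant(const std::vector<DbgValue> &run) {
  std::set<std::string> seenVars;
  std::vector<DbgValue> keptReversed;
  for (auto it = run.rbegin(); it != run.rend(); ++it)
    if (seenVars.insert(it->var).second)
      keptReversed.push_back(*it);
  // Restore the original program order.
  return std::vector<DbgValue>(keptReversed.rbegin(), keptReversed.rend());
}

// The three-instruction example from the commit message.
inline std::vector<DbgValue> commitExample() {
  const std::vector<DbgValue> run = {DbgValue{"$edi", "var1"},
                                     DbgValue{"$esi", "var2"},
                                     DbgValue{"$edi", "var1"}};
  return removeRedundant(run);
}

inline void checkExample() {
  const std::vector<DbgValue> kept = commitExample();
  assert(kept.size() == 2); // entry (1) was dropped
  assert(kept[0].var == "var2" && kept[1].var == "var1");
}
```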

Combined with the forward scan that will be introduced in the next patch
(from this stack), the statistics show that RemoveRedundantDebugValues
removes 15032 instructions when using gdb-7.11 as a testbed.

Differential Revision: https://reviews.llvm.org/D105279

Simon Pilgrim [Wed, 14 Jul 2021 11:20:47 +0000 (12:20 +0100)]

[InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183) (REAPPLIED)

As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep':

define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
  %gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3
  %sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1
  ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
  %sel = select i1 %a2, i64 %a1, i64 %a3
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  ret <4 x i32>* %gep
}

This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address:

define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep
  ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %sel = select i1 %a2, i64 0, i64 %a1
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  ret <4 x i32>* %gep
}

Reapplied with a fix for the bpf "-bpf-disable-avoid-speculation" tests
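
The fold in the title has a direct analogue in ordinary pointer arithmetic. The following is an illustrative sketch in C++, not InstCombine code: `selectOfGep`, `gepOfSelect`, and `foldPreservesAddress` are hypothetical names, and the 8-element array is an arbitrary test fixture.

```cpp
#include <cassert>
#include <cstddef>

// Before the fold: select between the offset address and the base address,
// i.e. (select C, (gep Ptr, Idx), Ptr).
inline const int *selectOfGep(bool c, const int *ptr, std::ptrdiff_t idx) {
  return c ? ptr + idx : ptr;
}

// After the fold: select the index first, then do one address computation,
// i.e. (gep Ptr, (select C, Idx, 0)).
inline const int *gepOfSelect(bool c, const int *ptr, std::ptrdiff_t idx) {
  return ptr + (c ? idx : 0);
}

// Both forms yield the same address for either value of the condition.
// `idx` must stay within [0, 8] for the pointer arithmetic to be valid.
inline bool foldPreservesAddress(std::ptrdiff_t idx) {
  static const int data[8] = {};
  return selectOfGep(true, data, idx) == gepOfSelect(true, data, idx) &&
         selectOfGep(false, data, idx) == gepOfSelect(false, data, idx);
}

inline void checkFold() {
  assert(foldPreservesAddress(0));
  assert(foldPreservesAddress(5));
}
```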

Differential Revision: https://reviews.llvm.org/D105901

Chuanqi Xu [Wed, 14 Jul 2021 11:12:57 +0000 (19:12 +0800)]

[NFC] [Coroutines] Remove unused CoroFree

Bruce Mitchener [Wed, 14 Jul 2021 10:59:08 +0000 (10:59 +0000)]

[lldb][docs] Remove mention of subversion. NFC.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D103744

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jul 2021 11:03:16 +0000 (12:03 +0100)]

[X86] Implement smarter instruction lowering for FP_TO_UINT from f32/f64 to i32/i64 and vXf32/vXf64 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction.

We know that "CVTTPS2SI" returns 0x80000000 for out of range inputs (and for FP_TO_UINT, negative float values are undefined). We can use this to make unsigned conversions from vXf32 to vXi32 more efficient, particularly on targets without blend using the following logic:

small := CVTTPS2SI(x);
fp_to_ui(x) := small | (CVTTPS2SI(x - 2^31) & ARITHMETIC_RIGHT_SHIFT(small, 31))

Even on targets where "PBLENDVPS"/"PBLENDVB" exists, it is often a latency 2, low throughput instruction so this logic is applied there too (in particular for AVX2 also). It furthermore gets rid of one high latency floating point comparison in the previous lowering.

@TomHender checked the correctness of this for all possible floats between -1 and 2^32 (both ends excluded).
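
The formula above can be emulated for a single scalar lane. This is a minimal sketch, not the X86 lowering itself: `cvttps2si` here is a Python stand-in for one lane of the instruction (truncate toward zero, return 0x80000000 when the result does not fit in a signed 32-bit integer), and `to_f32` is a hypothetical helper that rounds a Python float to single precision.

```python
import math
import struct

INT_MIN_BITS = 0x80000000  # value returned for out-of-range inputs
MASK32 = 0xFFFFFFFF


def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def cvttps2si(x):
    """Stand-in for one lane of CVTTPS2SI, returning a 32-bit pattern."""
    if math.isnan(x) or math.isinf(x):
        return INT_MIN_BITS
    truncated = math.trunc(x)
    if truncated < -2**31 or truncated > 2**31 - 1:
        return INT_MIN_BITS
    return truncated & MASK32


def arithmetic_right_shift_31(bits):
    """Replicate the sign bit of a 32-bit pattern across all 32 bits."""
    return MASK32 if bits & INT_MIN_BITS else 0


def fp_to_ui(x):
    """small | (CVTTPS2SI(x - 2^31) & ARITHMETIC_RIGHT_SHIFT(small, 31))."""
    x = to_f32(x)
    small = cvttps2si(x)
    big = cvttps2si(to_f32(x - 2.0**31))
    return small | (big & arithmetic_right_shift_31(small))
```

For inputs below 2^31 the sign bit of `small` is clear, so the mask is zero and `small` is the result. For inputs in [2^31, 2^32) `small` is 0x80000000, the mask is all ones, and OR-ing in the converted `x - 2^31` restores the low 31 bits.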

Original Patch by @TomHender (Tom Hender)

Differential Revision: https://reviews.llvm.org/D89697

commit | commitdiff | tree

LLVM GN Syncbot [Wed, 14 Jul 2021 10:49:08 +0000 (10:49 +0000)]

[gn build] Port c08dabb0f476

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jul 2021 10:48:22 +0000 (11:48 +0100)]

Revert rGb803294cf78714303db2d3647291a2308347ef23 : "[InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183)"

Missed some BPF test changes that need addressing

commit | commitdiff | tree

Nico Weber [Wed, 14 Jul 2021 10:43:23 +0000 (06:43 -0400)]

[gn build] (manually) merge 462d4de35b0c

commit | commitdiff | tree

Stefan Pintilie [Wed, 14 Jul 2021 02:15:30 +0000 (21:15 -0500)]

[NFC][PowerPC] Added test to check register allocation for ACC registers

ACC registers are a combination of 4 consecutive vector registers and therefore
sometimes require special treatment for register allocation. This patch only
adds a test.

commit | commitdiff | tree

Stephen Tozer [Tue, 13 Jul 2021 12:31:11 +0000 (13:31 +0100)]

[DebugInfo] Correctly update dbg.values with duplicated location ops

This patch fixes code that incorrectly handled dbg.values with duplicate
location operands, i.e. !DIArgList(i32 %a, i32 %a). The errors in
question were caused by either applying an update to dbg.value multiple
times when the update is only valid once, or by updating the
DIExpression for only the first instance of a value that appears
multiple times.

Differential Revision: https://reviews.llvm.org/D105831

commit | commitdiff | tree

Simon Pilgrim [Tue, 13 Jul 2021 18:06:13 +0000 (19:06 +0100)]

[InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183)

As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep':

define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
  %gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3
  %sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1
  ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
  %sel = select i1 %a2, i64 %a1, i64 %a3
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  ret <4 x i32>* %gep
}

This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address:

define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep
  ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %sel = select i1 %a2, i64 0, i64 %a1
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  ret <4 x i32>* %gep
}

Differential Revision: https://reviews.llvm.org/D105901

commit | commitdiff | tree

Butygin [Tue, 6 Jul 2021 16:11:16 +0000 (19:11 +0300)]

[mlir][SCF] populateSCFStructuralTypeConversionsAndLegality WhileOp support

Differential Revision: https://reviews.llvm.org/D105923

commit | commitdiff | tree

Fraser Cormack [Tue, 13 Jul 2021 16:08:05 +0000 (17:08 +0100)]

[RISCV] Fix the neutral element in vector 'fadd' reductions

Using positive zero as the neutral element in 'fadd' reductions, while
it generates better code, is incorrect. The correct neutral element is
negative zero: 0.0 + -0.0 = 0.0, whereas -0.0 + -0.0 = -0.0.
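
A short check of why the start value matters, using ordinary IEEE 754 doubles rather than RISC-V vector code. The helper names `reduce_fadd` and `is_negative_zero` are hypothetical and exist only for this illustration.

```python
import math
from functools import reduce


def reduce_fadd(values, start):
    """Sequential floating-point add reduction seeded with a start value."""
    return reduce(lambda acc, v: acc + v, values, start)


def is_negative_zero(x):
    """True only for -0.0 (copysign exposes the sign of a zero)."""
    return x == 0.0 and math.copysign(1.0, x) < 0
```

Seeding with +0.0 turns an all-negative-zero input into +0.0, which changes the result; seeding with -0.0 leaves every input unchanged.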

There are perhaps more optimal lowerings of negative zero avoiding
constant-pool loads which could be left as future work.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D105902

commit | commitdiff | tree

Sebastian Neubauer [Wed, 14 Jul 2021 08:03:54 +0000 (10:03 +0200)]

[AMDGPU] Init scratch only if necessary

If no scratch or flat instructions are used, we do not need to
initialize the flat scratch hardware register.

Differential Revision: https://reviews.llvm.org/D105920

commit | commitdiff | tree

Sebastian Neubauer [Tue, 13 Jul 2021 14:37:15 +0000 (16:37 +0200)]

[AMDGPU] Precommit flat-scratch-init.ll test

commit | commitdiff | tree

Cullen Rhodes [Wed, 14 Jul 2021 08:01:19 +0000 (08:01 +0000)]

[AArch64][SME] Add matrix register definitions and parsing support

SME introduces the ZA array, a new piece of architectural register state
consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the
implementation defined Streaming SVE vector length and SVLb is the
number of 8-bit elements in a vector of SVL bits.

SME instructions consist of three types of matrix operands:

  * Tiles: a ZA tile is a square, two-dimensional sub-array of elements
  within the ZA array. These tiles make up the larger accumulator array
  and the granularity varies based on the element size, i.e.
    - ZAQ0..ZAQ15 (smallest tile granule)
    - ZAD0..ZAD7
    - ZAS0..ZAS3
    - ZAH0..ZAH1
    or ZAB0       (largest tile granule, single tile)
  * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v'
  to tell how the vector at [reg+offset] is laid out in the tile,
  horizontally or vertically. E.g. za1h.h or za15v.q, which corresponds
  to vectors in registers ZAH1 and ZAQ15, respectively.
  * Accumulator matrix: this is the entire accumulator array ZA.
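
The naming scheme in the list above can be sketched as follows. This is an illustration, not the AArch64 assembly parser added by the patch: the function names are hypothetical, and the tile counts are simply read off the ranges listed above (one tile for bytes up to sixteen for quadwords, which matches the element size in bytes).

```python
import re

# Element-size suffix -> element size in bytes. Per the list above, the
# number of tiles for each suffix equals this size (ZAB0, ZAH0..1, ...).
ELEMENT_BYTES = {"b": 1, "h": 2, "s": 4, "d": 8, "q": 16}

_TILE_VECTOR = re.compile(r"^za(\d+)([hv])\.([bhsdq])$")


def tile_names(suffix):
    """List the tile register names for one element-size suffix."""
    return [f"ZA{suffix.upper()}{i}" for i in range(ELEMENT_BYTES[suffix])]


def parse_tile_vector(operand):
    """Map a tile vector such as 'za1h.h' to (tile register, direction)."""
    match = _TILE_VECTOR.match(operand.lower())
    if match is None:
        raise ValueError(f"not a tile vector operand: {operand!r}")
    index, direction, suffix = int(match.group(1)), match.group(2), match.group(3)
    if index >= ELEMENT_BYTES[suffix]:
        raise ValueError(f"tile index {index} out of range for .{suffix}")
    layout = "horizontal" if direction == "h" else "vertical"
    return f"ZA{suffix.upper()}{index}", layout
```

With this model, `za1h.h` maps to tile ZAH1 read horizontally and `za15v.q` maps to tile ZAQ15 read vertically, as in the examples above.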

This patch adds the register classes and related operands and parsing
for SME instructions operating on the accumulator array.

The ADDHA and ADDVA instructions, which operate on tiles, are also added
in this patch to make some use of the code added; later patches will
make use of the other operands introduced here.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Co-authored by: Sander de Smalen (@sdesmalen)

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105570

commit | commitdiff | tree

Sam McCall [Fri, 9 Jul 2021 07:40:18 +0000 (09:40 +0200)]

[clangd] Add CMake option to (not) link in clang-tidy checks

This reduces the size of the dependency graph and makes incremental
development a little more pleasant (less rebuilding).

This introduces a bit of complexity/fragility as some tests verify
clang-tidy behavior. I attempted to isolate these and build/run as much
of the tests as possible in both configs to prevent rot.

Expectation is that (some) developers will use this locally, but
buildbots etc will keep testing clang-tidy.

Fixes https://github.com/clangd/clangd/issues/233

Differential Revision: https://reviews.llvm.org/D105679

commit | commitdiff | tree

Ruiling Song [Thu, 8 Jul 2021 01:42:06 +0000 (09:42 +0800)]

[AMDGPU] Don't handle export done when unify exit nodes

This patch aims to revert the changes introduced by D70781, D71192 and D76364.

D70781 was introduced to fix a hardware hang where we did not insert
exp-null-done for a kill inside an infinite loop. At that time we had not added
exp-null-done for kill early termination, but I believe that we now
always add exp-null-done for the early termination case in SILateBranchLowering.

D71192 was introduced to handle the only_kill case, which is also
handled by the kill early termination work.

D76364 was used to fix a regression caused by D71192, where we cleared the done
bit of the export in the existing program and did not let the normal return
block branch to the new unified return block.

With this change, we just trust that frontends have set up exp-done correctly,
which is true for all existing frontends. The backend only inserts
exp-null-done for the kill cases, which are handled in SILateBranchLowering.cpp.

Reviewed by: critson

Differential Revision: https://reviews.llvm.org/D105610

commit | commitdiff | tree

Ruiling Song [Thu, 8 Jul 2021 03:09:33 +0000 (11:09 +0800)]

[NFC][AMDGPU] autogenerate kill-infinite-loop.ll checks

This would help us to track the assembly changes to these tests.

Reviewed by: foad

Differential Revision: https://reviews.llvm.org/D105609

commit | commitdiff | tree

Ruiling Song [Thu, 17 Jun 2021 22:40:44 +0000 (06:40 +0800)]

[RegisterCoalescer] Resolve conflict based on liveness of subregister

Currently we resolve lane/subregister conflicts by visiting
instructions sequentially in the current block to see whether there is any
use of the tainted lanes. To save compile time, we do not check
successor blocks. This is reasonable without subregister liveness.

But since we have added subregister liveness tracking to the
register coalescer, we can easily determine whether there is a subregister
liveness conflict by checking subranges. This helps coalesce more
COPYs for targets that enable subregister liveness tracking.

Reviewed by: arsenm, qcolombet

Differential Revision: https://reviews.llvm.org/D104509

commit | commitdiff | tree

Kito Cheng [Tue, 29 Jun 2021 07:23:55 +0000 (15:23 +0800)]

[RISCV] Pass -u to linker correctly.

`-u` is a linker option used to pretend a symbol is undefined;
it is commonly used to force archive member extraction.

This option should be passed to `ld`; many other toolchains in Clang,
such as `tools::gnutools`, already pass it.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D105091

commit | commitdiff | tree

Martin Storsjö [Tue, 13 Jul 2021 12:39:54 +0000 (12:39 +0000)]

[libcxx] [test] Clarify weak_ptr_ret on Windows, remove a LIBCXX-WINDOWS-FIXME

On Windows, structs with a destructor are always returned indirectly;
add this to the list of known exceptions in the test where the class
isn't returned in registers as expected.

Differential Revision: https://reviews.llvm.org/D105906

commit | commitdiff | tree

Dmitry Vyukov [Mon, 12 Jul 2021 19:06:28 +0000 (12:06 -0700)]

sanitizer_common: add simpler ThreadRegistry ctor

Currently ThreadRegistry is overcomplicated because of tsan,
which needs tid quarantine and reuse counters. Other sanitizers
don't need that. It also seems that no other sanitizer now
needs a max number of threads. Asan used to need a 2^24 limit,
but it does not seem to be needed now. Other sanitizers blindly
copy-pasted that without reason. Lsan also uses quarantine,
but I don't see why it would be needed.

Add a ThreadRegistry ctor that does not require any sizes
and use it in all sanitizers except for tsan.
This is in preparation for the new tsan runtime, which won't need
any of these parameters either.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105713

commit | commitdiff | tree

Yuichi Yoshida [Wed, 14 Jul 2021 05:47:31 +0000 (05:47 +0000)]

Reformulate OrcJIT tutorial doc to make it more clear.

Fixed a minor writing error. The text was hard to understand.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D105899

commit | commitdiff | tree

Zakk Chen [Wed, 14 Jul 2021 03:32:55 +0000 (20:32 -0700)]

[RISCV] Support overloading for RVV miscellaneous functions.

Based on this update to the intrinsic doc
https://github.com/riscv/rvv-intrinsic-doc/pull/103

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D105611

Domain: System / Toolchain;
