review.tizen.org Git - platform/upstream/llvm.git/log

Ye Luo [Mon, 6 Sep 2021 04:41:53 +0000 (23:41 -0500)]

[OpenMP][libomptarget][NFC] Change checkDeviceAndCtors return type to bool.

Only a boolean is needed here. Pulling in OFFLOAD_SUCCESS/FAIL only adds confusion.

Differential Revision: https://reviews.llvm.org/D109303

Nikita Popov [Sat, 4 Sep 2021 20:38:46 +0000 (22:38 +0200)]

[UseListOrder] Fix use list order for function operands

Functions can have a personality function, as well as prefix and
prologue data as additional operands. Unused operands are assigned
a dummy value of i1* null. This patch addresses multiple issues in
use-list order preservation for these:

* Fix verify-uselistorder to also enumerate the dummy values.
   This means that the use-list orders of these values are now
   shuffled even if there is no other mention of i1* null in the
   module. This results in failures of Assembler/call-arg-is-callee.ll,
   Assembler/opaque-ptr.ll and Bitcode/use-list-order2.ll.
* The use-list order prediction in ValueEnumerator does not take
   into account the fact that a global may use a value more than
   once and leaves uses in the same global effectively unordered.
   We should be comparing the operand number here, as we do for
   the more general case.
* While we enumerate all operands of a function together (which
   seems sensible to me), the bitcode reader would first resolve
   prefix data for all functions, then prologue data for all
   functions, then personality functions for all functions. Change
   this to resolve all operands for a given function together
   instead.

Differential Revision: https://reviews.llvm.org/D109282

Arthur Eubanks [Tue, 7 Sep 2021 18:50:01 +0000 (11:50 -0700)]

Add missing overloads for Function::addRetAttr(s)

Nikita Popov [Tue, 20 Jul 2021 15:19:52 +0000 (17:19 +0200)]

[ConstFold] Support opaque pointers in constexpr GEPs

Support opaque pointers in SymbolicallyEvaluateGEP() by using the
value type of a GlobalValue base or falling back to i8 if there
isn't one. We don't unconditionally generate i8 GEPs here because
that would lose inrange attributes, and because some optimizations
on globals currently rely on GEP types (e.g. the globals SROA
mentioned in the comment).

Differential Revision: https://reviews.llvm.org/D109297

Andy Kaylor [Sat, 4 Sep 2021 01:24:09 +0000 (18:24 -0700)]

Copy Elementtype Attribute to IR at Link step

Copying IR during linking causes a type mismatch because the field is missing in IRMover/ValueMapper. This patch adds the full range of typed attributes, including the elementtype attribute, to the copy functions.

Patch by Chenyang Liu

Differential Revision: https://reviews.llvm.org/D108796

Fangrui Song [Tue, 7 Sep 2021 18:38:43 +0000 (11:38 -0700)]

[ELF][test] Improve getBitcodeMachineKind tests

Maksim Panchenko [Tue, 31 Aug 2021 18:53:54 +0000 (11:53 -0700)]

[llvm-objdump] Fix 'llvm-objdump -dr' for executables with relocations

Print relocations interleaved with disassembled instructions for
executables with relocatable sections, e.g. those built with "-Wl,-q".

Differential Revision: https://reviews.llvm.org/D109016

Arthur Eubanks [Mon, 6 Sep 2021 20:02:36 +0000 (13:02 -0700)]

[NFC][InstCombine] Make check for sret in a vararg function clearer

We're trying to get the parameter index of sret and see if it's part of
a function's varargs.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D109335

Roman Lebedev [Tue, 7 Sep 2021 17:48:39 +0000 (20:48 +0300)]

Reland "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as a signed multiplication overflow check (PR48769)"

This reverts commit 91f7a4fff75179e75d38b692715ae69471668b5e,
relanding commit 13ec913bdf500e2354cc55bf29e2f5d99e0c709e.

The original commit was reverted because of (essentially)
https://bugs.llvm.org/show_bug.cgi?id=35922
which has now been addressed by d0eeb64be5848a7832d13db9d69904db281d02e8.
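
As an illustration of the idiom being recognized, here is a minimal C++ sketch. The function names are invented for this example and are not part of LLVM. The wrapping multiply is done in unsigned arithmetic to avoid undefined behaviour in C++, and the `x == 0` and `x == -1` cases are guarded because the division itself would be invalid or would overflow there.

```cpp
#include <cstdint>
#include <limits>

// Wrapping 32-bit multiply, mirroring an LLVM `mul` without nsw.
// The unsigned-to-signed cast is modular on mainstream compilers
// (and guaranteed to be so since C++20).
inline std::int32_t wrapping_mul(std::int32_t x, std::int32_t y) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) *
                                   static_cast<std::uint32_t>(y));
}

// The division idiom: ((x * y) s/ x) != y.
inline bool overflows_by_division(std::int32_t x, std::int32_t y) {
  if (x == 0) return false;  // 0 * y never overflows; division is invalid
  if (x == -1)               // INT32_MIN / -1 would itself overflow
    return y == std::numeric_limits<std::int32_t>::min();
  return wrapping_mul(x, y) / x != y;
}

// Reference answer computed in 64 bits, where the product cannot overflow.
inline bool overflows_by_widening(std::int32_t x, std::int32_t y) {
  const std::int64_t p = static_cast<std::int64_t>(x) * y;
  return p < std::numeric_limits<std::int32_t>::min() ||
         p > std::numeric_limits<std::int32_t>::max();
}
```

Both functions should agree for every input pair, which is what allows a compiler to replace the division form with a dedicated overflow check.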

Arthur O'Dwyer [Mon, 6 Sep 2021 16:40:05 +0000 (12:40 -0400)]

[libc++] Remove a stray `const` on ranges::data and ranges::ssize. NFCI.

These are specced as `inline constexpr auto`; the extra `const`
isn't doing anything except being inconsistent with the other CPOs.
Now all the implemented CPOs can be detected by
git grep 'inline constexpr auto.*fn' ../libcxx/include/
and I think that's beautiful.
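
For readers unfamiliar with the pattern, a customization point object (CPO) is a function object exposed as an `inline constexpr auto` variable. The sketch below is a simplified, hypothetical illustration; the namespace `demo` and the names `ssize_fn` and `ssize` are invented here and do not reflect libc++'s actual implementation.

```cpp
#include <cstddef>
#include <vector>

namespace demo {

// Hypothetical function-object type; real CPOs do considerably more dispatch.
struct ssize_fn {
  template <class Container>
  constexpr std::ptrdiff_t operator()(const Container &c) const {
    return static_cast<std::ptrdiff_t>(c.size());
  }
};

// `constexpr` on a variable already implies `const`, so spelling it
// `inline constexpr const auto` would add nothing.
inline constexpr auto ssize = ssize_fn{};

} // namespace demo
```

Calling `demo::ssize(v)` on a `std::vector<int>` with three elements yields the signed value 3.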

Arthur O'Dwyer [Mon, 6 Sep 2021 18:11:45 +0000 (14:11 -0400)]

[libc++] Fix std::to_address(array).

There were basically two bugs here:

When C++20 `to_address` is called on `int arr[10]`, then `const _Ptr&` becomes
a reference to a const array, and then we dispatch to `__to_address<const int(&)[10]>`,
which, oops, gives us a `const int*` result instead of an `int*` result.
Solution: We need to provide the two standard-specified overloads of
`std::to_address` in exactly the same way that we provide two overloads
of `__to_address`.

When `__to_address` is called on a pointer type, `__to_address(const _Ptr&)`
is disabled so we successfully avoid trying to instantiate pointer_traits of
that pointer type. But when it's called on an array type, it's not disabled
for array types, so we go ahead and instantiate pointer_traits<int[10]>,
which goes boom. Solution: We need to disable `__to_address(const _Ptr&)`
for both pointer and array types. Also disable it for function types,
so that they get the nice error message; and put a test on it.
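
The two-overload shape described above can be sketched as follows. This is a simplified stand-in in a made-up `demo` namespace, not libc++'s code: it omits `pointer_traits` and the enable-if machinery, and only shows why an array argument should select the `T*` overload.

```cpp
#include <type_traits>
#include <utility>

namespace demo {

// Overload for raw pointers. Arrays decay to pointers and land here too.
template <class T>
constexpr T *to_address(T *p) noexcept {
  static_assert(!std::is_function<T>::value, "T must not be a function type");
  return p;
}

// Overload for fancy pointers (anything with operator->). This sketch assumes
// operator-> returns a raw pointer.
template <class Ptr>
constexpr auto to_address(const Ptr &p) noexcept {
  return demo::to_address(p.operator->());
}

} // namespace demo

// Partial ordering prefers the T* overload for an array argument, so the
// result is int*, not const int*.
static_assert(
    std::is_same<decltype(demo::to_address(std::declval<int (&)[10]>())),
                 int *>::value,
    "array argument must yield a pointer to non-const element");
```

With only the `const Ptr&` overload available, the same call would see the array as `const int[10]` and produce `const int*`, which is the first bug described above.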

Differential Revision: https://reviews.llvm.org/D109331

Joe Loser [Tue, 7 Sep 2021 17:48:10 +0000 (13:48 -0400)]

[libc++][NFC] Test span is nothrow trivially destructible

Add tests showing `span` is trivially_destructible and nothrow_destructible.
Note that we do not need to explicitly default the destructor in `span`.

Reviewed By: ldionne, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D109286

Nick Desaulniers [Tue, 7 Sep 2021 17:26:22 +0000 (10:26 -0700)]

[X86ISelLowering] avoid emitting libcalls to __mulodi4()

Similar to D108842, D108844, and D108926.

__has_builtin(__builtin_mul_overflow) returns true for 32b x86 targets,
but Clang is deferring to compiler-rt when encountering long long types.
This breaks ARCH=i386 + CONFIG_BLK_DEV_NBD=y builds of the Linux kernel
that are using __builtin_mul_overflow with these types for these targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support these builds of the Linux kernel for this
target with older releases of clang.
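
For context, the kind of source code affected looks like the sketch below: a checked 64-bit multiply through the GCC/Clang builtin `__builtin_mul_overflow`. The wrapper name `mul_overflows` is invented for this example. Per the message above, on 32-bit x86 Clang could previously lower this to a compiler-rt libcall.

```cpp
#include <limits>

// Returns true when a * b does not fit in long long. The (possibly wrapped)
// product is stored in *out. __builtin_mul_overflow is a GCC/Clang builtin,
// so this sketch assumes one of those compilers.
inline bool mul_overflows(long long a, long long b, long long *out) {
  return __builtin_mul_overflow(a, b, out);
}
```

A freestanding environment such as the kernel does not link compiler-rt, which is why an emitted libcall turns into a link failure there.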

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://bugs.llvm.org/show_bug.cgi?id=35922
Link: https://github.com/ClangBuiltLinux/linux/issues/1438
Reviewed By: lebedev.ri, RKSimon

Differential Revision: https://reviews.llvm.org/D108928

peter klausler [Mon, 30 Aug 2021 16:36:33 +0000 (09:36 -0700)]

[flang] evaluate: Fold SQRT, HYPOT, & CABS

Implement IEEE Real::SQRT() operation, then use it to
also implement Real::HYPOT(), which can then be used directly
to implement Complex::ABS().

Differential Revision: https://reviews.llvm.org/D109250

Nico Weber [Tue, 7 Sep 2021 15:23:52 +0000 (11:23 -0400)]

[lldb] Alphabetize some CMake files a bit better

No observable behavior change, but makes the generated Plugins.def a bit easier
to read.

Differential Revision: https://reviews.llvm.org/D109367

Alex Zinenko [Tue, 7 Sep 2021 12:27:48 +0000 (14:27 +0200)]

[mlir] Fix SplatOp lowering to the LLVM dialect

The lowering has been incorrectly using the operands of the original op instead
of rewritten operands provided to matchAndRewrite call. This may lead to
spurious materializations and generally invalid IR.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D109355

Alexandre Rames [Tue, 7 Sep 2021 16:42:46 +0000 (09:42 -0700)]

[Support] Automatically support `hash_value` when `HashBuilder` support is available.

Use the `HBuilder` interface to provide default implementations of `llvm::hash_value`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109024

aristotelis [Tue, 7 Sep 2021 16:25:44 +0000 (09:25 -0700)]

Greedy set cover implementation of `Merger::Merge`

Extend the existing single-pass algorithm for `Merger::Merge` with an algorithm that gives better results. This new implementation can be used with a new **set_cover_merge=1** flag.

This greedy set cover implementation gives a substantially smaller final corpus (40%-80% fewer test cases) while preserving the same features/coverage. At the same time, the execution time penalty is not that significant (+50% for ~1M corpus files and far less for smaller corpora). These results were obtained by comparing several targets with varying size corpora.

Change `Merger::CrashResistantMergeInternalStep` to collect all features from each file and not just unique ones. This is needed for the set cover algorithm to work correctly. The implementation of the algorithm in `Merger::SetCoverMerge` uses a bitvector to store features that are covered by a file while performing the pass. Collisions while indexing the bitvector are ignored similarly to the fuzzer.
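
The greedy set cover idea can be sketched as follows. This is a generic illustration, not the code in `Merger::SetCoverMerge`: the function name `greedy_set_cover` is invented, and it tracks covered features in a `std::set` for clarity where the commit describes a bitvector.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Greedy set cover: repeatedly pick the file that adds the most features not
// yet covered, until no remaining file adds anything new.
// Returns the indices of the chosen files in the order they were picked.
inline std::vector<std::size_t>
greedy_set_cover(const std::vector<std::vector<std::uint32_t>> &file_features) {
  std::set<std::uint32_t> covered;
  std::vector<bool> taken(file_features.size(), false);
  std::vector<std::size_t> chosen;
  for (;;) {
    std::size_t best = file_features.size();
    std::size_t best_gain = 0;
    for (std::size_t i = 0; i < file_features.size(); ++i) {
      if (taken[i]) continue;
      // Count distinct features of file i that are still uncovered.
      std::set<std::uint32_t> fresh;
      for (std::uint32_t f : file_features[i])
        if (!covered.count(f)) fresh.insert(f);
      if (fresh.size() > best_gain) {
        best_gain = fresh.size();
        best = i;
      }
    }
    if (best_gain == 0) break;  // nothing left to gain
    taken[best] = true;
    chosen.push_back(best);
    covered.insert(file_features[best].begin(), file_features[best].end());
  }
  return chosen;
}
```

The sketch needs every feature of every file, not just the features that were new when the file was first seen, which matches the reason given above for changing `Merger::CrashResistantMergeInternalStep`.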

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105284

Alexandre Rames [Thu, 2 Sep 2021 23:13:28 +0000 (16:13 -0700)]

[NFC][support] Extract `IsHashableData` out of class

Extract `HashBuilder::IsHashableData` out of class; it does not depend on
template parameters.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109205

Simon Pilgrim [Tue, 7 Sep 2021 16:09:53 +0000 (17:09 +0100)]

[X86] X86InstrAVX512.td - remove unused template parameters. NFC.

Identified in D109359

Hansang Bae [Fri, 11 Jun 2021 22:35:28 +0000 (17:35 -0500)]

[OpenMP] Add interface for 5.1 scope construct

The new interface only marks begin/end of a scope construct for
corresponding OMPT events, and we can use existing interfaces for
reduction operations.

Differential Revision: https://reviews.llvm.org/D108062

Kazu Hirata [Tue, 7 Sep 2021 16:19:33 +0000 (09:19 -0700)]

[Analysis, Target, Transforms] Construct SmallVector with iterator ranges (NFC)

Kazu Hirata [Tue, 7 Sep 2021 16:19:31 +0000 (09:19 -0700)]

[RISCV] Fix "set but not used" warnings

peter klausler [Fri, 3 Sep 2021 20:55:18 +0000 (13:55 -0700)]

[flang] Fix GetHostProcedure() for main program

It only worked for internal procedures of subprograms,
but must also allow for internal procedures of the
main program. This broke the use of host-associated
implicitly-typed symbols in specification expressions
of internal procedures.

Differential Revision: https://reviews.llvm.org/D109262

Dávid Bolvanský [Tue, 7 Sep 2021 16:04:38 +0000 (18:04 +0200)]

[InstCombine] ror/rol(X, RotAmt) == C --> X == rol/ror(C, RotAmt)   (PR51567)

```
----------------------------------------
define i1 @src(i32 %0) {
%1:
  %2 = fshl i32 %0, i32 %0, i32 25
  %3 = icmp eq i32 %2, 5
  ret i1 %3
}
=>
define i1 @tgt(i32 %0) {
%1:
  %2 = icmp eq i32 %0, 640
  ret i1 %2
}
Transformation seems to be correct!
```

https://alive2.llvm.org/ce/z/GdY8Jm

Solves PR51567
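
The same equivalence can be written in C++ as a sanity check. This is only an illustrative sketch: the helper names are invented, and `src`/`tgt` mirror the two IR functions above for the specific constants 25 and 5.

```cpp
#include <cstdint>

// Rotate helpers; the amount is reduced modulo 32 so shifts stay in range.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
  return (n % 32 == 0) ? x : (x << (n % 32)) | (x >> (32 - n % 32));
}
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n) {
  return (n % 32 == 0) ? x : (x >> (n % 32)) | (x << (32 - n % 32));
}

// Before the fold: rotate the variable, then compare with the constant.
constexpr bool src(std::uint32_t x) { return rotl32(x, 25) == 5; }
// After the fold: rotate the constant once, then compare directly.
constexpr bool tgt(std::uint32_t x) { return x == rotr32(5, 25); }

static_assert(rotr32(5, 25) == 640, "matches the constant in @tgt");
```

Because a rotate is a bijection on 32-bit values, applying the inverse rotate to the constant preserves the comparison for every `x`.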

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D109283

Andrew Litteken [Wed, 28 Jul 2021 14:59:37 +0000 (07:59 -0700)]

[IROutliner] Adding outlining for single entry/single exit multiblock regions

Using the similarity found from the IRSimilarity Identifier, we take regions with structural similarity, and deduplicate them into a separate function. The Code Extractor is able to provide most of this functionality.

For simplicity, we start by only outlining regions with a single entry and single exit branch; this reduces the complexity in handling phi nodes outside the region, and handling many sets of outputs for each of the different exit blocks.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D106990

Victor Huang [Mon, 30 Aug 2021 15:52:06 +0000 (10:52 -0500)]

[PowerPC] Fixed the crash due to early if conversion with fixed CR fields

This patch restricts early if-conversion to select to the case where the
conditional branch does not use a physical register, which prevents the
crash when expanding the ISEL instruction.

Reviewed By: lei, kamaub, PowerPC

Differential revision: https://reviews.llvm.org/D108302

Vladimir Vereschaka [Fri, 3 Sep 2021 19:25:16 +0000 (15:25 -0400)]

[libc++] Provide `buildhost=<platform>` feature for the tests.

The target platform could differ from the host platform for
cross-platform builds. Some tests depend on build host features and
need to determine the proper platform environment.

This commit adds a build host platform name feature for the libc++ tests
in format `buildhost=<platform>`, such as `buildhost=linux`, `buildhost=darwin`,
`buildhost=windows`, etc.

The Windows host gets two features: one `buildhost=windows` and another based
on Windows "sub-system", such as `buildhost=win32`, `buildhost=cygwin`, etc.

Differential Revision: https://reviews.llvm.org/D102045

David Spickett [Tue, 7 Sep 2021 15:48:35 +0000 (15:48 +0000)]

[lldb] Add missing newline to stderr output on failed attach

Sanjay Patel [Tue, 7 Sep 2021 14:34:15 +0000 (10:34 -0400)]

[InstCombine] add tests for smear-a-set-bit; NFC

Possible follow-ups from patterns discussed in D109155.

Jonas Devlieghere [Tue, 7 Sep 2021 15:36:58 +0000 (08:36 -0700)]

[lldb] Update crashlog.py to accept multiple results from mdfind

mdfind can return multiple results, some of which are not even dSYM
bundles, but Xcode archives (.xcarchive).

Currently, we end up concatenating the paths, which is obviously bogus.
This patch not only fixes that, but now also skips paths that don't have
a Contents/Resources/DWARF subdirectory.

rdar://81270312

Differential revision: https://reviews.llvm.org/D109263

Simon Pilgrim [Tue, 7 Sep 2021 14:54:12 +0000 (15:54 +0100)]

[X86] Add missing domain to avx512_ord_cmp_sae comis sae patterns

It doesn't appear to be possible to generate this from tests atm, but it matches what we do in sse12_ord_cmp

Fixes unused template arg identified in D109359

Jinsong Ji [Tue, 7 Sep 2021 15:16:36 +0000 (15:16 +0000)]

[PowerPC] Guard XSRSP in P8 for FastISel

This is exposed by enabling FastISel on 64-bit AIX.
We are generating XSRSP regardless of the arch,
which may be wrong when -mcpu=pwr7.

The fix is to guard the generation in P8 only.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D109365

Jingu Kang [Tue, 7 Sep 2021 15:01:18 +0000 (16:01 +0100)]

[test] precommit a test for D109354

Hans Wennborg [Tue, 7 Sep 2021 12:34:32 +0000 (14:34 +0200)]

Add llvm-ml to LLVM_TOOLCHAIN_TOOLS (PR50536)

so that it gets installed in LLVM_INSTALL_TOOLCHAIN_ONLY builds,
such as used by the Windows installer.

Differential revision: https://reviews.llvm.org/D109358

Roman Lebedev [Tue, 7 Sep 2021 14:09:58 +0000 (17:09 +0300)]

[Exegesis] Native clusterization: sub-partition by sched class id

Currently native clusterization simply groups all benchmarks
by the opcode of key instruction, but that is suboptimal in certain cases,
e.g. where we can already tell that the particular instructions
already resolve into different sched classes.
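
A rough sketch of the sub-partitioning idea: key each group on the pair (opcode, sched class id) rather than on the opcode alone. `BenchmarkPoint` and `partition_points` are hypothetical names made up for this example, not llvm-exegesis types.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a benchmark point; not llvm-exegesis's own type.
struct BenchmarkPoint {
  unsigned opcode;
  unsigned sched_class_id;
};

// Group point indices by (opcode, sched class id) instead of by opcode alone,
// so instructions known to resolve to different sched classes stay apart.
inline std::map<std::pair<unsigned, unsigned>, std::vector<std::size_t>>
partition_points(const std::vector<BenchmarkPoint> &points) {
  std::map<std::pair<unsigned, unsigned>, std::vector<std::size_t>> clusters;
  for (std::size_t i = 0; i < points.size(); ++i)
    clusters[std::make_pair(points[i].opcode, points[i].sched_class_id)]
        .push_back(i);
  return clusters;
}
```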

Roman Lebedev [Tue, 7 Sep 2021 14:43:35 +0000 (17:43 +0300)]

[NFC][exegesis] Add test for the following patch

Alex Zinenko [Tue, 7 Sep 2021 12:28:23 +0000 (14:28 +0200)]

[mlir] Fix GPU LaunchFunc conversion to the LLVM dialect

The conversion has been incorrectly using the operands of the original
operation instead of the converted operands provided to the matchAndRewrite
call. This may lead to spurious materializations and generally invalid IR if
the producer of the original operands is deleted in the process of conversion.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D109356

Sander de Smalen [Tue, 7 Sep 2021 13:29:48 +0000 (14:29 +0100)]

[AArch64][SVE] Improve extract_subvector for predicates.

Using PUNPKLO/HI instead of ZIP1/ZIP2, because that avoids
having to generate a predicate with all lanes inactive (PFALSE).

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D109312

Peter Smith [Mon, 9 Aug 2021 10:40:22 +0000 (11:40 +0100)]

[MC] Use local MCSubtargetInfo in writeNops

On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.

On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.

For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.

This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.

I've attempted to take into account the in tree experimental backends.

Differential Revision: https://reviews.llvm.org/D45962

Peter Smith [Fri, 6 Aug 2021 16:42:12 +0000 (17:42 +0100)]

[MC] Add MCSubtargetInfo to MCAlignFragment

In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.

There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.

For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.

This is a prerequisite for D45962 which uses the STI to emit the
appropriate NOP for the STI. Which can differ per fragment.

Note that involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.

Differential Revision: https://reviews.llvm.org/D45961

Michael Liao [Mon, 30 Aug 2021 05:42:18 +0000 (01:42 -0400)]

[amdgpu] Enable selection of `s_cselect_b64`.

Differential Revision: https://reviews.llvm.org/D109159

Mirko Brkusanin [Tue, 7 Sep 2021 14:25:04 +0000 (16:25 +0200)]

[AMDGPU][GlobalISel] Legalize G_MUL for non-standard types

Legalizing G_MUL for non-standard types (like i33) generated an error. Use
minScalar and maxScalar instead of clampScalar. Also add a new rule: instead
of widening to the next power of 2, widen to the next multiple of the passed
argument (32 in this case), so instead of widening i65 to i128, we widen it to
i96.
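
The arithmetic behind the two widening rules can be checked with a small sketch. The function names are invented for this example and are not the GlobalISel legalizer API.

```cpp
// Round a bit width up to the next multiple of `step` (step > 0).
constexpr unsigned widen_to_multiple(unsigned bits, unsigned step) {
  return (bits + step - 1) / step * step;
}

// Round a bit width up to the next power of two (bits > 0).
constexpr unsigned widen_to_pow2(unsigned bits, unsigned w = 1) {
  return w >= bits ? w : widen_to_pow2(bits, w * 2);
}

static_assert(widen_to_multiple(65, 32) == 96, "i65 widens to i96");
static_assert(widen_to_pow2(65) == 128, "i65 would otherwise widen to i128");
```

For i33 both rules give i64; they diverge once the width exceeds 64 bits.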

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D109228

Mirko Brkusanin [Tue, 7 Sep 2021 14:18:19 +0000 (16:18 +0200)]

[AMDGPU][GlobalISel] Legalization of G_ROTL and G_ROTR

Add implementation for the legalization of G_ROTL and G_ROTR machine
instructions. They are very similar to funnel shift instructions, the only
difference is funnel shifts have 3 operands, whereas rotate instructions have
two operands, the first being the register that is being rotated and the second
being the number of shifts. The legalization of G_ROTL/G_ROTR is just lowering
them into funnel shift instructions if they are legal.
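
The relationship between rotates and funnel shifts can be sketched in C++ as follows. The helpers model 32-bit funnel shifts with the shift amount taken modulo 32; all names are invented for illustration.

```cpp
#include <cstdint>

// fshl: concatenate a (high half) and b (low half), shift left by s mod 32,
// and keep the high 32 bits.
constexpr std::uint32_t fshl32(std::uint32_t a, std::uint32_t b, unsigned s) {
  return (s % 32 == 0) ? a : (a << (s % 32)) | (b >> (32 - s % 32));
}

// fshr: same concatenation, shift right by s mod 32, keep the low 32 bits.
constexpr std::uint32_t fshr32(std::uint32_t a, std::uint32_t b, unsigned s) {
  return (s % 32 == 0) ? b : (b >> (s % 32)) | (a << (32 - s % 32));
}

// A rotate is a funnel shift whose two data operands are the same value.
constexpr std::uint32_t rotl_via_fshl(std::uint32_t x, unsigned s) {
  return fshl32(x, x, s);
}
constexpr std::uint32_t rotr_via_fshr(std::uint32_t x, unsigned s) {
  return fshr32(x, x, s);
}
```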

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D105347

Simon Pilgrim [Tue, 7 Sep 2021 14:13:05 +0000 (15:13 +0100)]

[X86] X86InstrSSE.td - remove unused template parameters. NFC.

Identified in D109359

Simon Pilgrim [Tue, 7 Sep 2021 13:45:55 +0000 (14:45 +0100)]

[X86] X86InstrVecCompiler.td - remove unused template parameters. NFC.

Identified in D109359

Simon Pilgrim [Tue, 7 Sep 2021 13:45:25 +0000 (14:45 +0100)]

[X86] X86InstrFMA.td - remove unused template parameters. NFC.

Identified in D109359

Anton Afanasyev [Sun, 5 Sep 2021 11:00:04 +0000 (14:00 +0300)]

[AggressiveInstCombine] Add `AssumptionCache` to aggressive instcombine

Add support for @llvm.assume() to TruncInstCombine allowing
optimizations based on these intrinsics while computing known bits.

Anton Afanasyev [Wed, 1 Sep 2021 22:00:37 +0000 (01:00 +0300)]

[AggressiveInstCombine][Test] Add test for assumptions

Anton Afanasyev [Sun, 5 Sep 2021 07:19:43 +0000 (10:19 +0300)]

[AggressiveInstCombine] Add wrapper calls for `KnownBits` computing

Precommit before adding `AssumptionCache`: reviews.llvm.org/D109141

Differential Revision: https://reviews.llvm.org/D109288

Simon Pilgrim [Tue, 7 Sep 2021 13:33:03 +0000 (14:33 +0100)]

[llvm-exegesis][x86] Limit llvm-exegesis analysis tests to x86_64 triple hosts

Attempting to fix an issue with test failures on Apple M1 (Arm) Macs reported on D109353

Kadir Cetinkaya [Tue, 7 Sep 2021 13:15:21 +0000 (15:15 +0200)]

[clang][Driver] Pick the last --driver-mode in case of multiple ones

This was an accidental behaviour change in D106789 and this patch
restores it back to original state.

Differential Revision: https://reviews.llvm.org/D109361

Sander de Smalen [Tue, 7 Sep 2021 12:11:42 +0000 (13:11 +0100)]

[AArch64][SVE] Implement all-inactive predicate with PFALSE.

Instead of using a WHILE XZR, XZR instruction, just emit a PFALSE.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D109311

Nawrin Sultana [Tue, 31 Aug 2021 21:35:16 +0000 (16:35 -0500)]

[OpenMP] Change monotonicity of dynamic schedule

This patch changes the default monotonicity of dynamic schedule from
monotonic to non-monotonic when no modifier is specified.

Differential Revision: https://reviews.llvm.org/D109026

David Sherwood [Wed, 1 Sep 2021 12:09:49 +0000 (13:09 +0100)]

[SVE][NFC] Add SVE cost model tests for gathers/scatters

We previously didn't have any tests to defend the cost model
for gathers and scatters using SVE without a vscale_range
attribute. I've added tests to existing files:

Analysis/CostModel/AArch64/sve-gather.ll
Analysis/CostModel/AArch64/sve-scatter.ll

Differential Revision: https://reviews.llvm.org/D109055

Simon Pilgrim [Tue, 7 Sep 2021 12:57:49 +0000 (13:57 +0100)]

[llvm-exegesis] Analysis tests should run even without libpfm (PR51687)

Move inverse_throughput, latency and uops to sub-directories (like we already do for lbr), which require libpfm, so we can relax the lit limits for analysis tests in the x86 root directory.

Differential Revision: https://reviews.llvm.org/D109353

Dávid Bolvanský [Tue, 7 Sep 2021 12:29:59 +0000 (14:29 +0200)]

[NFC] Added test for stpcpy -> strcpy transformation with AS != 0

Brad Smith [Tue, 7 Sep 2021 11:54:23 +0000 (07:54 -0400)]

Mention OpenBSD in the documentation

Simon Pilgrim [Tue, 7 Sep 2021 10:43:26 +0000 (11:43 +0100)]

[KnownBits] Add support for X*X self-multiplication

Add KnownBits handling and unit tests for X*X self-multiplication cases which guarantee that bit1 of their results will be zero - see PR48683.
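
The underlying fact is easy to check: if x is even, x = 2k and x*x = 4k*k, so bits 0 and 1 are zero; if x is odd, x = 2k+1 and x*x = 4k(k+1) + 1, so bit 1 is again zero. Wrapping does not change the low bits. A minimal sketch, with an invented helper name:

```cpp
#include <cstdint>

// Returns bit 1 of x*x computed with wrapping 32-bit arithmetic.
// By the argument above this is false for every x.
constexpr bool bit1_of_square(std::uint32_t x) {
  return ((x * x) >> 1) & 1u;
}

static_assert(!bit1_of_square(3), "3*3 = 9 = 0b1001, bit 1 is clear");
```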

https://alive2.llvm.org/ce/z/NN_eaR

The next step will be to add suitable test coverage so this can be enabled in ValueTracking/DAG/GlobalISel - currently only a single Analysis/ScalarEvolution test is affected.

Differential Revision: https://reviews.llvm.org/D108992

Mirko Brkusanin [Tue, 7 Sep 2021 09:30:11 +0000 (11:30 +0200)]

[AMDGPU][GlobalISel] Legalize memcpy family of intrinsics

Legalize G_MEMCPY, G_MEMMOVE, G_MEMSET and G_MEMCPY_INLINE.

Corresponding intrinsics are replaced by a loop that uses loads/stores in the
AMDGPULowerIntrinsics pass unless their length is a constant lower than
MemIntrinsicExpandSizeThresholdOpt (default 1024). Any G_MEM* instruction that
reaches the legalizer should have a constant length argument and should be
expanded into the appropriate number of loads and stores.

Differential Revision: https://reviews.llvm.org/D108357

Fraser Cormack [Tue, 31 Aug 2021 14:29:47 +0000 (15:29 +0100)]

[RISCV][VP] Custom lower VP_STORE and VP_LOAD

This patch adds support for the vector-predicated `VP_STORE` and
`VP_LOAD` nodes. We do this in the same way we lower `MSTORE` and
`MLOAD`: to regular load/store instructions via intrinsics.

One necessary change was made to `SelectionDAGLegalize` so that
`VP_STORE` nodes' operation actions are taken from the stored "value"
operands, in the same vein as `STORE` or `MSTORE`.

Reviewed By: craig.topper, rogfer01

Differential Revision: https://reviews.llvm.org/D108999

Fraser Cormack [Tue, 31 Aug 2021 11:43:12 +0000 (12:43 +0100)]

[RISCV][VP] Custom lower VP_SCATTER and VP_GATHER

This patch adds support for the `VP_SCATTER` and `VP_GATHER` nodes by
lowering them to RVV's `vsox`/`vlux` instructions, respectively. This
process is almost identical to the existing `MSCATTER`/`MGATHER` support.

One extra change was made to `SelectionDAGLegalize` so that
`VP_SCATTER`'s operation action is derived from its stored "value"
operand rather than its return type (which is always the chain).

Reviewed By: craig.topper, rogfer01

Differential Revision: https://reviews.llvm.org/D108987

Roman Lebedev [Tue, 7 Sep 2021 08:47:20 +0000 (11:47 +0300)]

[exegesis][X86] ParallelSnippetGenerator: don't accidentally create serialized instructions

In the case of no tied variables, we pick random defs, and then random uses that don't alias with defs we just picked.
Sounds good, except that an X86 instruction may have implicit reg uses,
e.g. for `MULX` it's `EDX`/`RDX`: `Intel SDM, 4-162 Vol. 2B MULX — Unsigned Multiply Without Affecting Flags`
> Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand
> (the third operand) and stores the low half of the result in the second destination (second operand), the high half
> of the result in the first destination operand (first operand), without reading or writing the arithmetic flags.

And indeed, every once in a while `llvm-exegesis` happened to pick EDX as a def while measuring throughput,
and producing garbage output:
```
$ ./bin/llvm-exegesis -num-repetitions=1000000 -mode=inverse_throughput -repetition-mode=min --loop-body-size=4096 -dump-object-to-disk=false -opcode-name=MULX32rr --max-configs-per-opcode=65536
---
mode:            inverse_throughput
key:
  instructions:
    - 'MULX32rr EDX R11D R12D'
  config:          ''
  register_initial_values:
    - 'R12D=0x0'
    - 'EDX=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 4.00014, per_snippet_value: 4.00014 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 415441BC00000000BA00000000C4C223F6D4C4C223F6D4C4C223F6D4C4C223F6D4415CC3415441BC00000000BA0000000049B80200000000000000C4C223F6D4C4C223F6D44983C0FF75F0415CC3
...
```
```
$ ./bin/llvm-exegesis -num-repetitions=1000000 -mode=inverse_throughput -repetition-mode=min --loop-body-size=4096 -dump-object-to-disk=false -opcode-name=MULX32rr --max-configs-per-opcode=65536
---
mode:            inverse_throughput
key:
  instructions:
    - 'MULX32rr R13D EDX ECX'
  config:          ''
  register_initial_values:
    - 'ECX=0x0'
    - 'EDX=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 3.00013, per_snippet_value: 3.00013 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 4155B900000000BA00000000C4626BF6E9C4626BF6E9C4626BF6E9C4626BF6E9415DC34155B900000000BA0000000049B80200000000000000C4626BF6E9C4626BF6E94983C0FF75F0415DC3
...
```
Oops! Not only does that not look fun, i did hit that pitfall during AMD Zen 3 enablement.
While i have since then addressed this in rGd4d459e7475b4bb0d15280f12ed669342fa5edcd,
i suspect there may be other buggy results lying around, so we should at least stop producing them.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D109275

Justas Janickas [Thu, 2 Sep 2021 10:51:39 +0000 (11:51 +0100)]

[OpenCL] Disallows static kernel functions in C++ for OpenCL

It is disallowed in OpenCL C to declare static kernel functions and
C++ for OpenCL is expected to inherit such behaviour. Error is now
correctly reported in C++ for OpenCL when declaring a static kernel
function.

Differential Revision: https://reviews.llvm.org/D109150

Andrew Wei [Tue, 7 Sep 2021 09:05:39 +0000 (17:05 +0800)]

[AArch64] Avoid adding duplicate implicit operands when expanding pseudo insts.

When expanding pseudo insts, in order to create a new machine instr, we use BuildMI,
which will add implicit operands by default. And transferImpOps will also copy implicit
operands from old ones. Finally, duplicate implicit operands are added to the same inst.
Sometimes this can cause correctness issues. For example, the instruction
renamable $w18 = nsw SUBSWrr renamable $w30, renamable $w14, implicit-def dead $nzcv
After expanding, it will become
$w18 = SUBSWrs renamable $w13, renamable $w14, 0, implicit-def $nzcv, implicit-def dead $nzcv
A redundant implicit-def $nzcv is added, but the dead flag is missing.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D109069

Fraser Cormack [Mon, 6 Sep 2021 09:23:56 +0000 (10:23 +0100)]

[SelectionDAG][VP] Fix MemSDNode::getBasePtr

Found while working on D108987. When interpreting VP nodes as
`MemSDNode` nodes, this function would return the incorrect indices.
This was due to `VP_GATHER` having no "passthru", and both
`VP_GATHER` and `VP_SCATTER` having their mask operands *after* the base
pointer, unlike `MGATHER` and `MSCATTER`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D109308

luxufan [Mon, 6 Sep 2021 02:48:56 +0000 (10:48 +0800)]

[RuntimeDyld] Don't use bitwise operation on SymbolRef::Type

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D109292

commit | commitdiff | tree

Brad Smith [Tue, 7 Sep 2021 08:38:52 +0000 (04:38 -0400)]

Mention OpenBSD in the documentation

commit | commitdiff | tree

Frederic Cambus [Tue, 7 Sep 2021 08:25:12 +0000 (04:25 -0400)]

[compiler-rt] Document that builtins is known to work on OpenBSD.

Differential Revision: https://reviews.llvm.org/D109346

commit | commitdiff | tree

Ben Shi [Tue, 7 Sep 2021 02:21:38 +0000 (10:21 +0800)]

[ARM] Implement target hook function to decide folding (mul (add x, c1), c2)

Prevent the folding in DAGCombine if it leads to worse code.
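
As a rough illustration of the fold in question (plain integer arithmetic, not DAGCombine code), the sketch below shows that `(x + c1) * c2` and `x * c2 + c1 * c2` compute the same value under wrapping 32-bit arithmetic. The function names are made up for this example. Which form yields better ARM code depends on the constants; one plausible factor, stated here only as an assumption, is how cheaply `c1 * c2` can be materialized compared with `c1`.

```cpp
#include <cstdint>

// Original form: (x + c1) * c2.
constexpr std::uint32_t mulOfAdd(std::uint32_t X, std::uint32_t C1,
                                 std::uint32_t C2) {
  return (X + C1) * C2;
}

// Folded form: x * c2 + c1 * c2. The product c1 * c2 becomes a new constant.
constexpr std::uint32_t addOfMul(std::uint32_t X, std::uint32_t C1,
                                 std::uint32_t C2) {
  return X * C2 + C1 * C2;
}

// Both forms agree, including when the arithmetic wraps modulo 2^32.
static_assert(mulOfAdd(7, 3, 5) == addOfMul(7, 3, 5), "forms must agree");
static_assert(mulOfAdd(0xFFFFFFFFu, 2, 3) == addOfMul(0xFFFFFFFFu, 2, 3),
              "forms must agree under wraparound");
```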

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D109124

commit | commitdiff | tree

Ben Shi [Wed, 1 Sep 2021 13:19:22 +0000 (21:19 +0800)]

[ARM][test] Add new tests for (mul (add r, c0), c1)

Reviewed By: RKSimon, dmgreen

Differential Revision: https://reviews.llvm.org/D109123

commit | commitdiff | tree

Clement Courbet [Tue, 7 Sep 2021 07:06:18 +0000 (09:06 +0200)]

[llvm-exegesis] Add unit test in preparation for D109275

commit | commitdiff | tree

Nathan Ridge [Tue, 31 Aug 2021 08:34:09 +0000 (04:34 -0400)]

[clangd] Omit default template arguments from type hints

Differential Revision: https://reviews.llvm.org/D108975

commit | commitdiff | tree

Nathan Ridge [Tue, 31 Aug 2021 07:42:16 +0000 (03:42 -0400)]

[clangd] Omit type hints that are too long

Differential Revision: https://reviews.llvm.org/D108972

commit | commitdiff | tree

Ye Luo [Sat, 4 Sep 2021 19:07:41 +0000 (14:07 -0500)]

[OpenMP][libomptarget] Change device vector elements to unique_ptr type

Using std::vector<DeviceTy> requires implementing a copy constructor and a copy assignment operator for DeviceTy.
However, DeviceTy should never be copied. After changing to std::vector<std::unique_ptr<DeviceTy>>,
all the unsafe copy constructor and copy assignment operator implementations can be removed.
Compilers mark them deleted due to the mutex or underlying objects, and this is the desired behavior.
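
A minimal sketch of the pattern described above, using a hypothetical `Device` struct as a stand-in for DeviceTy (the real class has many more members). The `std::mutex` member makes the implicit copy operations deleted, and storing `std::unique_ptr<Device>` lets the vector grow without ever copying or moving a device object.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for DeviceTy. The mutex member causes the
// implicitly declared copy constructor and copy assignment to be deleted.
struct Device {
  explicit Device(int ID) : ID(ID) {}
  int ID;
  std::mutex Lock;
};

static_assert(!std::is_copy_constructible<Device>::value,
              "Device must not be copyable");

// The vector owns the devices through unique_ptr, so reallocation only
// moves pointers and never touches the Device objects themselves.
std::vector<std::unique_ptr<Device>> makeDevices(int Count) {
  std::vector<std::unique_ptr<Device>> Devices;
  Devices.reserve(static_cast<std::size_t>(Count));
  for (int I = 0; I < Count; ++I)
    Devices.push_back(std::make_unique<Device>(I));
  return Devices;
}
```

Elements are then accessed through the pointer, for example `Devices[I]->ID`.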

Differential Revision: https://reviews.llvm.org/D109276

commit | commitdiff | tree

oToToT [Tue, 7 Sep 2021 02:39:01 +0000 (10:39 +0800)]

[clang] Add '-ast-dump-filter=' support

Before this patch, we only supported syntax like
`clang -cc1 -ast-dump -ast-dump-filter main a.c`
or
`clang -Xclang -ast-dump -Xclang -ast-dump-filter -Xclang main a.c`
when using ast-dump-filter.

It is helpful to also support `-ast-dump-filter=` syntax, so we can do
something like
`clang -cc1 -ast-dump -ast-dump-filter=main a.c`
or
`clang -Xclang -ast-dump -Xclang -ast-dump-filter=main a.c`

It is cleaner when passing arguments through `-Xclang` in this case.

Also, **clang-check** does support this syntax, and I think people might
be confused when they find they can't use `ast-dump-filter=` with
clang.

commit | commitdiff | tree

Ye Luo [Tue, 7 Sep 2021 02:27:12 +0000 (21:27 -0500)]

[OpenMP][libomptarget] Change synchronize_ty return type to int32_t

Plugins always return int32_t. Stay consistent with other functions which return error status.

Differential Revision: https://reviews.llvm.org/D109341

commit | commitdiff | tree

Jinsong Ji [Tue, 7 Sep 2021 01:20:35 +0000 (01:20 +0000)]

[RuntimeDyld] Guard UsedTLSStorage to x86 ELF only

UsedTLSStorage is only used in allocateTLSSection,
which is guarded to x86 ELF only.
So clang will emit an error on other targets with -Werror on.

.../llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp:288:12:
error: private field 'UsedTLSStorage' is not used
[-Werror,-Wunused-private-field]
unsigned UsedTLSStorage = 0;
^

commit | commitdiff | tree

Matthias Springer [Tue, 7 Sep 2021 00:40:04 +0000 (09:40 +0900)]

[mlir][linalg] linalg.tiled_loop peeling

Differential Revision: https://reviews.llvm.org/D108270

commit | commitdiff | tree

Craig Topper [Tue, 7 Sep 2021 00:44:51 +0000 (17:44 -0700)]

[X86] Handle inverted inputs when matching VPTERNLOG from 2 binary ops.

This is a more general version of D109273, though it doesn't
peek through bitcasts or rearrange broadcasts.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D109295

commit | commitdiff | tree

Fangrui Song [Mon, 6 Sep 2021 22:54:02 +0000 (15:54 -0700)]

[X86] Simplify condition guarding emitCalleeSavedFrameMoves. NFC

commit | commitdiff | tree

Fangrui Song [Mon, 6 Sep 2021 22:47:40 +0000 (15:47 -0700)]

[X86] Simplify two hasFP(F). NFC

commit | commitdiff | tree

David Green [Mon, 6 Sep 2021 21:03:32 +0000 (22:03 +0100)]

[ARM] Add tests for MVE narrowing intrinsic demand bits.

commit | commitdiff | tree

Nikita Popov [Mon, 6 Sep 2021 20:18:11 +0000 (22:18 +0200)]

[SCEV] Fix applyLoopGuards() with range check idiom (PR51760)

Due to a typo, this replaced %x with umax(C1, umin(C2, %x + C3))
rather than umax(C1, umin(C2, %x)). This didn't make a difference
for the existing tests, because the result is only used for range
calculation, and %x will usually have an unknown starting range,
and the additional offset keeps it unknown. However, if %x already
has a known range, we may compute a result range that is too
small.
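
To make the difference concrete, here is a small sketch in plain unsigned arithmetic (not SCEV code). The constants C1, C2 and C3 are hypothetical values chosen only for illustration. If %x is already known to lie in [10, 12], the intended rewrite leaves it unchanged, while the rewrite with the stray offset yields values in [5, 7], a range that no longer describes %x.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constants, chosen only for this illustration.
constexpr std::uint64_t C1 = 5;                              // lower bound
constexpr std::uint64_t C2 = 14;                             // upper bound
constexpr std::uint64_t C3 = static_cast<std::uint64_t>(-5); // wrapping offset

// Intended rewrite: umax(C1, umin(C2, x)).
std::uint64_t rewriteFixed(std::uint64_t X) {
  return std::max(C1, std::min(C2, X));
}

// Rewrite with the typo: umax(C1, umin(C2, x + C3)).
std::uint64_t rewriteBuggy(std::uint64_t X) {
  return std::max(C1, std::min(C2, X + C3));
}
```

With these constants, `rewriteFixed(11)` is 11 and `rewriteBuggy(11)` is 6.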

commit | commitdiff | tree

Sanjay Patel [Mon, 6 Sep 2021 18:47:02 +0000 (14:47 -0400)]

[DAGCombine] Prevent the transform of combine for multi-use operand

The test is based on a miscompile example in:
https://llvm.org/PR51321

Differential Revision: https://reviews.llvm.org/D107692

commit | commitdiff | tree

Benjamin Kramer [Mon, 6 Sep 2021 19:17:29 +0000 (21:17 +0200)]

[lldb] Fix pessimizing move warning

lldb/source/Core/PluginManager.cpp:695:21: warning: moving a temporary object prevents copy elision [-Wpessimizing-move]
      return Status(std::move(ret.takeError()));
                    ^
lldb/source/Core/PluginManager.cpp:695:21: note: remove std::move call here
      return Status(std::move(ret.takeError()));
                    ^~~~~~~~~~               ~
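
The warning can be reproduced with a small sketch. `Error`, `Status` and `takeError` below are simplified, hypothetical stand-ins and not the lldb or LLVM classes. The move constructor counts its invocations, so the effect of wrapping a temporary in `std::move` is visible: it forces one move construction that copy elision would otherwise avoid.

```cpp
#include <utility>

static int MoveCount = 0;

// Hypothetical move-only error type that counts move constructions.
struct Error {
  Error() = default;
  Error(Error &&) noexcept { ++MoveCount; }
  Error(const Error &) = delete;
};

Error takeError() { return Error(); }

// Hypothetical status type taking the error by value.
struct Status {
  explicit Status(Error E) { (void)E; }
};

// std::move turns the temporary into an xvalue, so the parameter must be
// move-constructed. Compilers may emit -Wpessimizing-move for this line.
int movesWithStdMove() {
  MoveCount = 0;
  Status S{std::move(takeError())};
  (void)S;
  return MoveCount;
}

// Passing the temporary directly lets the parameter be initialized in place.
int movesWithoutStdMove() {
  MoveCount = 0;
  Status S{takeError()};
  (void)S;
  return MoveCount;
}
```

With C++17 guaranteed copy elision, the first function performs one move and the second performs none.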

commit | commitdiff | tree

Andrew Litteken [Wed, 28 Jul 2021 14:02:00 +0000 (07:02 -0700)]

[IRSim] Adding support for recognizing branch similarity

The current IRSimilarityIdentifier does not try to find similarity across blocks. This patch provides a mechanism to compare two branches against one another, to find similarity across basic blocks rather than just within them.

This adds a step in the similarity identification process that labels all of the basic blocks so that we can identify the relative branching locations. Within an IRSimilarityCandidate we use these relative locations to determine whether the branching to other relative locations in the same region is the same between branches. If it is, we consider them similar.

We do not consider the relative location of the branch if the branch target is outside of the region. In this case, both branches must exit to a location outside the region, but the exact relative location does not matter.

Reviewers: paquette, yroux

Differential Revision: https://reviews.llvm.org/D106989

commit | commitdiff | tree

Dávid Bolvanský [Mon, 6 Sep 2021 17:40:52 +0000 (19:40 +0200)]

[NFC] Added tests for D109283

commit | commitdiff | tree

Craig Topper [Mon, 6 Sep 2021 17:22:39 +0000 (10:22 -0700)]

[X86] Pre-commit test cases for D109295. NFC

commit | commitdiff | tree

David Blaikie [Mon, 6 Sep 2021 17:20:39 +0000 (10:20 -0700)]

DebugInfo: Add a FIXME/suggestion about using sibling/parent index to DWARFDebugInfoEntry

As a reminder if someone comes looking to improve iteration or parent
navigation performance of DWARFDebugInfoEntry.

commit | commitdiff | tree

Michał Górny [Mon, 26 Apr 2021 20:47:05 +0000 (22:47 +0200)]

[lldb] Support SaveCore() from gdb-remote client

Extend PluginManager::SaveCore() to support saving core dumps
via Process plugins. Implement the client-side part of qSaveCore
request in the gdb-remote plugin, that creates the core dump
on the remote host and then uses vFile packets to transfer it.

Differential Revision: https://reviews.llvm.org/D101329

commit | commitdiff | tree

Kazu Hirata [Mon, 6 Sep 2021 16:10:07 +0000 (09:10 -0700)]

[Support] Qualify auto (NFC)

Identified with readability-qualified-auto.
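
For readers unfamiliar with the check, the sketch below shows the style it asks for: when `auto` deduces to a pointer, the pointer (and any const) is spelled out. The function is a made-up example, not code from the Support library.

```cpp
#include <vector>

// Requires Values to hold at least two elements.
int sumFirstTwo(const std::vector<int> &Values) {
  // Unqualified form that the check flags:  auto First = Values.data();
  // Qualified form: the pointer and constness are visible at the use site.
  const auto *First = Values.data();
  return First[0] + First[1];
}
```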

commit | commitdiff | tree

Andrzej Warzynski [Sun, 22 Aug 2021 16:32:44 +0000 (16:32 +0000)]

[flang][plugins] Make `PluginParseTreeAction` an abstract class

There's no point in providing a default implementation for
`PluginParseTreeAction`. This patch makes it abstract forcing users to
specialise it in order to use it.
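
A minimal sketch of the idea, with hypothetical class names that are not Flang's actual API: a base class with a pure virtual member function cannot be instantiated, so users have to derive from it and provide the implementation.

```cpp
#include <string>
#include <type_traits>

// Hypothetical abstract action: no default implementation is provided.
class ParseTreeActionBase {
public:
  virtual ~ParseTreeActionBase() = default;
  virtual std::string run() = 0;
};

// A user-provided specialisation supplies the behaviour.
class PrintHelloAction : public ParseTreeActionBase {
public:
  std::string run() override { return "hello"; }
};

static_assert(std::is_abstract<ParseTreeActionBase>::value,
              "the base class cannot be instantiated");
static_assert(!std::is_abstract<PrintHelloAction>::value,
              "the derived class can be instantiated");
```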

Differential Revision: https://reviews.llvm.org/D108518

commit | commitdiff | tree

Jonas Paulsson [Sun, 5 Sep 2021 15:27:22 +0000 (17:27 +0200)]

[SelectionDAGBuilder] Bugfix in visitInlineAsm()

In case of a virtual register tied to a phys-def, the register class needs to
be computed. Make sure that this works generally also with fast regalloc by
using TLI.getRegClassFor() whenever possible, and make only the case of
'Untyped' use getMinimalPhysRegClass().

Fixes https://bugs.llvm.org/show_bug.cgi?id=51699.

Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109291

commit | commitdiff | tree

Sanjay Patel [Mon, 6 Sep 2021 15:08:17 +0000 (11:08 -0400)]

[InstCombine] fix infinite loop from shift transform

I'm not sure if there is a better way or another bug
still here, but this is enough to avoid the loop from:
https://llvm.org/PR51657

The test requires multiple blocks and datalayout to
trigger the problem path.

commit | commitdiff | tree

Sanjay Patel [Mon, 6 Sep 2021 14:40:52 +0000 (10:40 -0400)]

[InstCombine] refactor to reduce indent; NFC

This transform should be updated to use better
variable names and code comments. It could
also create the shift-of-shift directly instead
of relying on another combine for that.

commit | commitdiff | tree

Sanjay Patel [Mon, 6 Sep 2021 14:22:24 +0000 (10:22 -0400)]

[InstCombine] fix one-use condition for shift transform

This transform is written in a confusing style,
and I suspect it is at fault for a more serious
bug noted in PR51567.

But it's been around forever, so I'm making the
minimal change to fix another bug - it could
increase instructions because it was not checking
uses.

commit | commitdiff | tree

Sanjay Patel [Mon, 6 Sep 2021 14:14:50 +0000 (10:14 -0400)]

[InstCombine] early exit to reduce indentation; NFC

commit | commitdiff | tree

Sanjay Patel [Mon, 6 Sep 2021 13:30:44 +0000 (09:30 -0400)]

[InstCombine] add test for shift-trunc-shift with extra uses; NFC

The transform doesn't check for extra uses, so we
have more instructions than we started with.

commit | commitdiff | tree

Ivan Zhechev [Mon, 6 Sep 2021 13:57:14 +0000 (13:57 +0000)]

[Flang] Port test_modfile.sh to Python

To enable Flang testing on Windows, shell scripts have
to be ported to Python. The following changes have been made:
"test_modfile.sh" has been ported to Python, and
the tests relying on it have been updated accordingly.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D107956
