Jon Chesterfield [Wed, 26 May 2021 18:25:24 +0000 (19:25 +0100)]
[libomptarget][nfc][amdgpu] Refactor uses of KernelInfoTable
Suggested in D103059. Use a single lookup instead of two, more const, less mutation.
Reviewed By: dhruvachak
Differential Revision: https://reviews.llvm.org/D103093
Philip Reames [Wed, 26 May 2021 18:16:11 +0000 (11:16 -0700)]
[SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops
This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context.
I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns. There are two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable. If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired.
Differential Revision: https://reviews.llvm.org/D103182
thomasraoux [Wed, 26 May 2021 17:28:45 +0000 (10:28 -0700)]
[mlir] Make StripDebugInfo strip out block arguments locs
Differential Revision: https://reviews.llvm.org/D103187
Mitch Phillips [Wed, 26 May 2021 17:50:26 +0000 (10:50 -0700)]
Revert "[Scudo] Make -fsanitize=scudo use standalone. Migrate tests."
This reverts commit 6911114d8cbed06a8a809c34ae07f4e3e89ab252.
Broke the QEMU sanitizer bots due to a missing header dependency. This
actually needs to be fixed on the bot-side, but for now reverting this
patch until I can fix up the bot.
Fangrui Song [Wed, 26 May 2021 17:43:32 +0000 (10:43 -0700)]
[llvm-mc] Add -M to replace -riscv-no-aliases and -riscv-arch-reg-names
In objdump, many targets support `-M no-aliases`. Instead of having a
`-*-no-aliases` for each target when LLVM adds the support, it makes more sense
to introduce objdump style `-M`.
-riscv-arch-reg-names is removed. -riscv-no-aliases has too many uses and thus is retained for now.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D103004
Philip Reames [Wed, 26 May 2021 17:40:25 +0000 (10:40 -0700)]
[SCEV] Add a utility for converting from "exit count" to "trip count"
(Mostly as a logical place to put a comment since this is a recurring confusion.)
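A minimal sketch of the distinction, not the SCEV utility itself: the exit count is how many times the backedge is taken, and the trip count is one more than that. The name `tripCountFromExitCount` and the use of `std::optional` to report overflow are assumptions made for illustration.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: converts an exit count (number of backedges taken)
// into a trip count (number of times the loop header executes).
// Returns std::nullopt when exitCount + 1 does not fit in 64 bits.
std::optional<std::uint64_t> tripCountFromExitCount(std::uint64_t exitCount) {
  if (exitCount == std::numeric_limits<std::uint64_t>::max())
    return std::nullopt; // the trip count would wrap around to 0
  return exitCount + 1;
}
```

A loop whose backedge is never taken still runs its body once, which is why an exit count of 0 maps to a trip count of 1.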
Craig Topper [Wed, 26 May 2021 17:23:30 +0000 (10:23 -0700)]
[RISCV] Optimize SEW=64 shifts by splat on RV32.
SEW=64 shifts only use the low log2(64) bits of the shift amount. If we're
splatting a 64 bit value in 2 parts, we can avoid splatting the
upper bits and just let the low bits be sign extended. They won't
be read anyway.
For the purposes of SelectionDAG semantics of the generic ISD opcodes,
if hi was non-zero or bit 31 of the low is 1, the shift was already
undefined so it should be ok to replace high with sign extend of low.
In order to be able to find the split i64 value before it becomes
a stack operation, I added a new ISD opcode that will be expanded
to the stack spill in PreprocessISelDAG. This new node is conceptually
similar to BuildPairF64, but I expanded earlier so that we could
go through regular isel to get the right VLSE opcode for the LMUL.
BuildPairF64 is expanded in a CustomInserter.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D102521
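The reasoning above can be sketched in plain C++, assuming a simplified model of the shift rather than the actual RVV lowering: because only the low 6 bits of the amount are read, building the amount from both halves or from the sign-extended low half gives the same result. All function names here are hypothetical.

```cpp
#include <cstdint>

// Models a SEW=64 logical left shift: only the low log2(64) = 6 bits of the
// shift amount are read.
std::uint64_t shiftLeftSEW64(std::uint64_t value, std::uint64_t amount) {
  return value << (amount & 63);
}

// Builds the 64-bit shift amount from its two 32-bit halves (the full splat).
std::uint64_t fromHalves(std::uint32_t hi, std::uint32_t lo) {
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Builds the 64-bit shift amount by sign-extending only the low half.
std::uint64_t signExtendLow(std::uint32_t lo) {
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(lo)));
}
```

The upper 58 bits differ between `fromHalves(hi, lo)` and `signExtendLow(lo)`, but `shiftLeftSEW64` never looks at them.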
Philip Reames [Wed, 26 May 2021 17:08:53 +0000 (10:08 -0700)]
[SCEV] Extract out a helper for computing trip multiples
Mitch Phillips [Wed, 26 May 2021 17:03:10 +0000 (10:03 -0700)]
[Scudo] Make -fsanitize=scudo use standalone. Migrate tests.
This patch moves -fsanitize=scudo to link the standalone scudo library,
rather than the original compiler-rt based library. This is one of the
major remaining roadblocks to deleting the compiler-rt based scudo,
which should not be used any more. The standalone Scudo is better in
pretty much every way and is much more suitable for production usage.
As well as patching the litmus tests for checking that the
scudo_standalone lib is linked instead of the scudo lib, this patch also
ports all the scudo lit tests to run under scudo standalone.
This patch also adds a feature to scudo standalone that was under test
in the original scudo - that arguments passed to an aligned operator new
were checked that the alignment was a power of two.
Some lit tests could not be migrated, due to the following issues:
1. Features that aren't supported in scudo standalone, like the rss
limit.
2. Different quarantine implementation where the test needs some more
thought.
3. Small bugs in scudo standalone that should probably be fixed, like
the Secondary allocator having a full page on the LHS of an allocation
that only contains the chunk header, so underflows by <= a page aren't
caught.
4. Slight differences in behaviour that's technically correct, like
'realloc(malloc(1), 0)' returns nullptr in standalone, but a real
pointer in old scudo.
5. Some tests that might be migratable, but not easily.
Tests that are obviously not applicable to scudo standalone (like
testing that no sanitizer symbols made it into the DSO) have been
deleted.
After this patch, the remaining work is:
1. Update the Scudo documentation. The flags have changed, etc.
2. Delete the old version of scudo.
3. Patch up the tests in lit-unmigrated, or fix Scudo standalone.
Reviewed By: cryptoad, vitalybuka
Differential Revision: https://reviews.llvm.org/D102543
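For the aligned operator new check mentioned above, a power-of-two test can be sketched as follows. This is an illustration of the kind of check involved, not the Scudo code; the helper name `isPowerOfTwo` is an assumption.

```cpp
#include <cstddef>

// Hypothetical helper: an alignment passed to an aligned operator new is
// expected to be a non-zero power of two, i.e. to have exactly one bit set.
constexpr bool isPowerOfTwo(std::size_t alignment) {
  return alignment != 0 && (alignment & (alignment - 1)) == 0;
}
```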
Jessica Clarke [Wed, 26 May 2021 16:59:10 +0000 (17:59 +0100)]
[RISCV] Remove --riscv-no-aliases from RVV tests
This serves no useful purpose other than to clutter things up. Diff
summary as the real diff is extremely unwieldy:
24844 -; CHECK-NEXT: jalr zero, 0(ra)
24844 +; CHECK-NEXT: ret
8 -; CHECK-NEXT: vl4re8.v v28, (a0)
8 +; CHECK-NEXT: vl4r.v v28, (a0)
64 -; CHECK-NEXT: vl8re8.v v24, (a0)
64 +; CHECK-NEXT: vl8r.v v24, (a0)
392 -; RUN: --riscv-no-aliases < %s | FileCheck %s
392 +; RUN: < %s | FileCheck %s
1 -; RUN: -verify-machineinstrs --riscv-no-aliases < %s \
1 +; RUN: -verify-machineinstrs < %s \
As discussed in D103004.
Craig Topper [Wed, 26 May 2021 16:31:01 +0000 (09:31 -0700)]
[RISCV] Don't propagate VL/VTYPE across inline assembly in the Insert VSETVLI pass.
It's conceivable someone could put a vsetvli in inline assembly
so it's safer to consider them as barriers. The alternative would
be to trust that the user marks VL and VTYPE registers as clobbers
of the inline assembly if they do that, but that seems error prone.
I'm assuming inline assembly in vector code is going to be rare.
Reviewed By: frasercrmck, HsiangKai
Differential Revision: https://reviews.llvm.org/D103126
Kostya Kortchinsky [Tue, 25 May 2021 22:00:58 +0000 (15:00 -0700)]
[scudo] Get rid of initLinkerInitialized
Now that everything is forcibly linker initialized, it feels like a
good time to get rid of the `init`/`initLinkerInitialized` split.
This allows us to get rid of various `memset` constructs in `init` that
gcc complains about (this fixes a Fuchsia open issue).
I added various `DCHECK`s to ensure that we would get a zero-inited
object when entering `init`, which required ensuring that
`unmapTestOnly` leaves the object in a good state (tests are currently
the only location where an allocator can be "de-initialized").
Running the tests with `--gtest_repeat=` showed no issue.
Differential Revision: https://reviews.llvm.org/D103119
Alexey Bataev [Wed, 26 May 2021 13:18:34 +0000 (06:18 -0700)]
[SLP]Fix vectorization of insertelements with multiple uses.
SLP vectorizer should not consider insertelements with multiple uses as
a part of high level build vector, it must be considered as
a terminating insertelement in the vector build, otherwise it may
produce incorrect code.
Differential Revision: https://reviews.llvm.org/D103164
Stephen Tozer [Wed, 26 May 2021 14:27:57 +0000 (15:27 +0100)]
[DebugInfo] Limit the number of values that may be referenced by a dbg.value
Following the addition of salvaging dbg.values using DIArgLists to
reference multiple values, a case has been found where excessively large
DIArgLists are produced as a result of this salvaging, resulting in
large enough performance costs to effectively freeze the compiler.
This patch introduces an upper bound of 16 to the number of values that
may be salvaged into a dbg.value, to limit the impact of these extreme
cases to performance.
Differential Revision: https://reviews.llvm.org/D103162
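A rough model of the bound, assuming a debug value is just a list of referenced values: a salvage step that would push the list past 16 entries is refused. The names `kMaxDebugArgs` and `trySalvage` are hypothetical and do not correspond to LLVM identifiers.

```cpp
#include <cstddef>
#include <vector>

// Upper bound on the values referenced by one debug value, as stated above.
constexpr std::size_t kMaxDebugArgs = 16;

// Hypothetical model: tries to append the extra values a salvage step needs.
// Returns false and leaves `args` untouched if the bound would be exceeded.
bool trySalvage(std::vector<int> &args, const std::vector<int> &extra) {
  if (args.size() + extra.size() > kMaxDebugArgs)
    return false;
  args.insert(args.end(), extra.begin(), extra.end());
  return true;
}
```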
Richard Howell [Wed, 26 May 2021 16:32:57 +0000 (09:32 -0700)]
[lldb] add LLDB_SKIP_DSYM option
Add an option to skip generating a dSYM when installing the LLDB framework on Darwin.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D103124
Shoaib Meenai [Mon, 24 May 2021 03:41:57 +0000 (20:41 -0700)]
[libunwind] Inform ASan that resumption is noreturn
If you're building libunwind instrumented with ASan, `_Unwind_RaiseException`
will poison the stack and then transfer control in a manner which isn't
understood by ASan, so the stack will remain poisoned. This can cause
false positives, e.g. if you call an uninstrumented function (so it
doesn't re-poison the stack) after catching an exception. Add a call to
`__asan_handle_no_return` inside `__unw_resume` to get ASan to unpoison
the stack and avoid this.
`__unw_resume` seems like the appropriate place to make this call, since
it's used for resumption by all unwind implementations except SJLJ. SJLJ
uses `__builtin_longjmp` to handle resumption, which is already
recognized as noreturn (and therefore ASan adds the `__asan_handle_no_return`
call itself), so it doesn't need any special handling.
PR32434 is somewhat similar (in particular needing a component built
without ASan to trigger the bug), and rG781ef03e1012, the fix for that
bug, adds an interceptor for `_Unwind_RaiseException`. This interceptor
won't always be triggered though, e.g. if you statically link the
unwinder into libc++abi in a way that prevents interposing the unwinder
functions (e.g. marking the symbols as hidden, using `--exclude-libs`,
or using `-Bsymbolic`). rG53335d6d86d5 makes `__cxa_throw` call
`__asan_handle_no_return` explicitly, to similarly avoid relying on
interception.
Reviewed By: #libunwind, compnerd
Differential Revision: https://reviews.llvm.org/D103002
Raphael Isemann [Wed, 26 May 2021 15:52:51 +0000 (17:52 +0200)]
[lldb] Remove cache in get_demangled_name_without_arguments
This function has a single-value caching based on function local static variables.
This causes two problems:
* There is no synchronization, so this function randomly returns the demangled
name of other functions that are demangled at the same time.
* The 1-element cache is not very effective (the cache rate is around 0% when
running the LLDB test suite that calls this function around 30k times).
I would propose just removing it.
To save anyone else the git archaeology: the static result variables were
originally added as this returned a ConstString reference, but that has since
been changed so that this returns by value.
Reviewed By: #lldb, JDevlieghere, shafik
Differential Revision: https://reviews.llvm.org/D103107
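The pattern being removed can be sketched as follows, assuming a stand-in `stripArguments` in place of the real demangling step. None of these names are LLDB functions; the point is only the contrast between a cache in function-local statics and a plain return by value.

```cpp
#include <string>

// Hypothetical stand-in for the real work: drops everything from the first
// '(' onward.
std::string stripArguments(const std::string &demangled) {
  return demangled.substr(0, demangled.find('('));
}

// Problematic pattern: a 1-element cache in function-local statics. The
// statics are shared by all threads and accessed without a lock, so
// concurrent callers can observe each other's results.
const std::string &cachedNameUnsafe(const std::string &demangled) {
  static std::string lastInput;
  static std::string lastResult;
  if (demangled != lastInput) {
    lastInput = demangled;
    lastResult = stripArguments(demangled);
  }
  return lastResult;
}

// Replacement pattern: no shared state, result returned by value.
std::string nameWithoutArguments(const std::string &demangled) {
  return stripArguments(demangled);
}
```

The cached variant only helps when the same name is requested twice in a row, which matches the near-0% hit rate reported above.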
Craig Topper [Tue, 25 May 2021 23:28:34 +0000 (16:28 -0700)]
[RISCV] Enable cross basic block aware vsetvli insertion
This patch extends D102737 to allow VL/VTYPE changes to be taken
into account before adding an explicit vsetvli.
We do this by using a data flow analysis to propagate VL/VTYPE
information from predecessors until we've determined a value for
every value in the function.
We use this information to determine if a vsetvli needs to be
inserted before the first vector instruction in the block.
Differential Revision: https://reviews.llvm.org/D102739
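The data flow idea can be sketched with a small worklist algorithm. This is a simplified model under stated assumptions, not the pass itself: a (VL, VTYPE) pair is reduced to an opaque integer `config`, each block either sets one configuration or passes its input through, and the names `State`, `Block`, `propagate` and `needsVsetvli` are all hypothetical.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <vector>

// Abstract state: not yet computed, one known configuration, or unknown
// because predecessors disagree (or nothing is known at function entry).
struct State {
  enum Kind { Uninit, Known, Unknown } kind = Uninit;
  int config = 0; // hypothetical opaque id for a (VL, VTYPE) pair

  bool operator==(const State &o) const {
    return kind == o.kind && (kind != Known || config == o.config);
  }
};

// Meet of two states flowing into the same block.
State meet(const State &a, const State &b) {
  if (a.kind == State::Uninit) return b;
  if (b.kind == State::Uninit) return a;
  if (a == b) return a;
  return State{State::Unknown, 0};
}

struct Block {
  std::vector<std::size_t> preds;
  std::vector<std::size_t> succs;
  std::optional<int> setsConfig; // configuration left behind, if changed
  State entry, exit;
};

// Propagates exit states to successors until nothing changes.
void propagate(std::vector<Block> &blocks) {
  std::deque<std::size_t> worklist;
  for (std::size_t i = 0; i < blocks.size(); ++i) worklist.push_back(i);
  while (!worklist.empty()) {
    std::size_t i = worklist.front();
    worklist.pop_front();
    Block &b = blocks[i];
    State in;
    if (b.preds.empty()) in = State{State::Unknown, 0};
    for (std::size_t p : b.preds) in = meet(in, blocks[p].exit);
    State out = b.setsConfig ? State{State::Known, *b.setsConfig} : in;
    b.entry = in;
    if (!(out == b.exit)) {
      b.exit = out;
      for (std::size_t s : b.succs) worklist.push_back(s);
    }
  }
}

// A block needs an explicit vsetvli before its first vector instruction
// unless the required configuration is already known on entry.
bool needsVsetvli(const Block &b, int required) {
  return !(b.entry == State{State::Known, required});
}
```

In a diamond-shaped CFG where only the top block sets a configuration, the bottom block inherits it from both predecessors and needs no new vsetvli; if one side changes the configuration, the merge becomes unknown and one is required.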
Sebastian Neubauer [Wed, 26 May 2021 16:20:33 +0000 (18:20 +0200)]
[AMDGPU][NFC] Remove non-existing function header
Jon Chesterfield [Wed, 26 May 2021 16:02:19 +0000 (17:02 +0100)]
[libomptarget][nfc][amdgpu] Remove atmi_status_t type
ATMI_STATUS_UNKNOWN was unused, deleted references to it.
Replaced ATMI_STATUS_{SUCCESS,ERROR} with HSA_STATUS_{SUCCESS,ERROR}
Replaced atmi_status_t with hsa_status_t
Otherwise no change. In particular, conversions between atmi_status_t and
hsa_status_t will now be conversions between hsa_status_t and itself.
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D103115
LLVM GN Syncbot [Wed, 26 May 2021 15:57:01 +0000 (15:57 +0000)]
[gn build] Port de9df3f5b952
Mark de Wever [Tue, 18 May 2021 18:00:22 +0000 (20:00 +0200)]
[libc++][format] Adds availability macros for std::format.
This prevents std::format from being available until there's an ABI stable
version. (This only impacts the Apple platform.)
Depends on D102703
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D102705
Alexander Belyaev [Wed, 26 May 2021 12:36:35 +0000 (14:36 +0200)]
[mlir] Add `distributionTypes` to LinalgTilingOptions.
Differential Revision: https://reviews.llvm.org/D103161
Mark de Wever [Sun, 25 Apr 2021 16:23:42 +0000 (18:23 +0200)]
[libc++][NFC] Move basic_format_parse_context to its own header.
This is a preparation to split the format header in smaller parts for the
upcoming patches.
Depends on D101723
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D102703
LLVM GN Syncbot [Wed, 26 May 2021 15:45:57 +0000 (15:45 +0000)]
[gn build] Port 16342e39947b
Mark de Wever [Sun, 25 Apr 2021 15:58:03 +0000 (17:58 +0200)]
[libc++][NFC] Move format_error to its own header.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D101723
Valentin Clement [Wed, 26 May 2021 15:38:49 +0000 (11:38 -0400)]
[mlir][openacc] Translate UpdateOp to LLVM IR
Add translation to LLVM IR for the UpdateOp with host and device operands.
Translation is done with call using the runtime. This is done in a similar way as
D101504 and D102381.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D102382
Philip Reames [Wed, 26 May 2021 15:40:01 +0000 (08:40 -0700)]
[unroll] Use value domain for symbolic execution based cost model
The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost.
By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928.
Differential Revision: https://reviews.llvm.org/D102934
Louis Dionne [Wed, 26 May 2021 15:21:33 +0000 (11:21 -0400)]
[libc++] Fix concepts tests with GCC
Jonas Paulsson [Mon, 19 Apr 2021 19:31:01 +0000 (21:31 +0200)]
[SystemZ] Support i128 inline asm operands.
Support virtual, physical and tied i128 register operands in inline assembly.
i128 is not really supported on SystemZ and is not a legal type, and generally
such a value will be split into two i64 parts. There are however some
instructions that require a pair of two GPR64 registers contained in the GR128
bit reg class, which is untyped.
For inline assembly operands, it proved to be very cumbersome to first follow
the general behavior of splitting an i128 operand into two parts and then
later rebuild the INLINEASM MI to have one GR128 register. Instead, some
minor common code changes were made to SelectionDAGBuilder to only create one
GR128 register part to begin with. In particular:
- getNumRegisters() now has an optional parameter "RegisterVT" which is
passed by AddInlineAsmOperands() and GetRegistersForValue().
- The bitcasting in GetRegistersForValue is not performed if RegVT is
Untyped.
- The RC for a tied use in AddInlineAsmOperands() is now computed either from
the tied def (virtual register), or by getMinimalPhysRegClass() (physical
register).
- InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register
class (DstRC) can also be computed for an illegal type.
In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and
joinRegisterPartsIntoValue() have been implemented to handle i128 operands.
Differential Revision: https://reviews.llvm.org/D100788
Review: Ulrich Weigand
Kadir Cetinkaya [Wed, 19 May 2021 11:45:44 +0000 (13:45 +0200)]
[clangd] New ParsingCallback for semantics changes
Previously notification of the Server about semantic changes happened strictly
before notification of the AST thread.
Hence a racy Server could make a request (like semantic tokens) after
the notification, with the assumption that it'll be served fresh
content. But it wasn't true if AST thread wasn't notified about the
change yet.
This change reverses the order of those notifications to prevent racy
interactions.
Differential Revision: https://reviews.llvm.org/D102761
Andrea Di Biagio [Wed, 26 May 2021 14:38:45 +0000 (15:38 +0100)]
[MCA] Add a test for PR50483. NFC
Anirudh Prasad [Wed, 26 May 2021 14:49:39 +0000 (10:49 -0400)]
[SystemZ][z/OS] Enable the AllowAtInName attribute for the HLASM dialect
- Currently, LLVM supports symbols of the name "token1@token2".
- "token2" is used to identify whether an appropriate symbol reference can be used for the symbol.
- Now, if the symbol reference couldn't be found, the AsmParser usually emits an error, unless the backend is configured to accept the "@" in a symbol name
- Thus, this patch aims to do that. It sets the `AllowAtInName` attribute in the SystemZ backend for the HLASM dialect.
- Setting this attribute ensures that, if a particular symbol reference is found, it uses that. If it doesn't, and there exists an "@" in the symbol name, it will use that instead of explicitly erroring out.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103111
jweightma [Wed, 26 May 2021 14:33:33 +0000 (16:33 +0200)]
[AMDGPU] Fix function pointer argument bug in AMDGPU Propagate Attributes pass.
This patch fixes a bug in the AMDGPU Propagate Attributes pass where a call
instruction with a function pointer argument is identified as a user of the
passed function, and illegally replaces the called function of the
instruction with the function argument.
For example, given functions f and g with appropriate types, the following
illegal transformation could occur without this fix:
call void @f(void ()* @g)
-->
call void @g(void ()* @g.1)
The solution introduced in this patch is to prevent the cloning and
substitution if the instruction's called function and the function which
might be cloned do not match.
Reviewed By: arsenm, madhur13490
Differential Revision: https://reviews.llvm.org/D101847
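The guard described above can be sketched with a toy model of a call instruction. `Call` and `redirectToClone` are hypothetical names; the real pass works on LLVM IR uses, which this sketch does not attempt to reproduce.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a call instruction.
struct Call {
  std::string callee;
  std::vector<std::string> args;
};

// Redirects `call` to `clone` only when `original` is the called function.
// A call that merely passes `original` as an argument is left untouched.
bool redirectToClone(Call &call, const std::string &original,
                     const std::string &clone) {
  if (call.callee != original)
    return false;
  call.callee = clone;
  return true;
}
```

With `call void @f(void ()* @g)`, asking to redirect users of `g` leaves the call alone because the callee is `f`.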
Anirudh Prasad [Wed, 26 May 2021 14:36:50 +0000 (10:36 -0400)]
[SystemZ][z/OS] Validate symbol names for z/OS for printing without quotes
- Currently, before printing a label in MCSymbol.cpp (MCSymbol::print), the current code "validates" the label that is to be printed.
- If it fails the validation step, then it prints the label within double quotes.
- However, the validation is provided as a virtual function in MCAsmInfo.h (i.e. isAcceptableChar() function). So we can override this for the AD_HLASM dialect in SystemZMCAsmInfo.cpp.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103091
Hans Wennborg [Wed, 26 May 2021 09:30:03 +0000 (11:30 +0200)]
[clang-cl] Add driver support for /std:c++20 and bump /std:c++latest (PR50465)
VS 2019 16.11 (just released in Preview) is adding support for the
/std:c++20 option and bumping /std:c++latest to "post-c++20". This
updates clang-cl to match.
Differential revision: https://reviews.llvm.org/D103155
Luo, Yuanke [Wed, 26 May 2021 09:41:49 +0000 (17:41 +0800)]
[X86][AMX] Fix a bug on tile config.
The previous code detected whether an MBB is a bottom block to determine
if it is a backedge of a loop. We should check the latch block instead of
the bottom block, and we should check that the header and the bottom block
are in the same loop.
Differential Revision: https://reviews.llvm.org/D103145
Sjoerd Meijer [Wed, 26 May 2021 13:33:40 +0000 (14:33 +0100)]
[CostModel][AArch64] Add tests for bitreverse. NFC.
David Green [Wed, 26 May 2021 13:54:36 +0000 (14:54 +0100)]
[ARM] Extra test for reverted WLS memset. NFC
Simon Pilgrim [Wed, 26 May 2021 13:50:47 +0000 (14:50 +0100)]
[X86][SSE] Regenerate some tests to expose the rip relative vector/broadcast loads
Andrea Di Biagio [Wed, 26 May 2021 12:58:49 +0000 (13:58 +0100)]
[MCA][InOrderIssueStage] Fix LastWriteBackCycle computation.
Conservatively use the instruction latency to compute the last write-back cycle.
Before this patch, the last write cycle computation was incorrect for store
instructions that didn't declare any register writes.
Alexey Bataev [Wed, 26 May 2021 13:14:59 +0000 (06:14 -0700)]
[SLP][NFC]Add a test for multiple uses of insertelement instruction,
NFC.
Kerry McLaughlin [Wed, 26 May 2021 10:59:04 +0000 (11:59 +0100)]
[LoopVectorize] Enable strict reductions when allowReordering() returns false
When loop hints are passed via metadata, the allowReordering function
in LoopVectorizationLegality will allow the order of floating point
operations to be changed:
bool allowReordering() const {
// When enabling loop hints are provided we allow the vectorizer to change
// the order of operations that is given by the scalar loop. This is not
// enabled by default because can be unsafe or inefficient.
The -enable-strict-reductions flag introduced in D98435 will currently only
vectorize reductions in-loop if hints are used, since canVectorizeFPMath()
will return false if reordering is not allowed.
This patch changes canVectorizeFPMath() to query whether it is safe to
vectorize the loop with ordered reductions if no hints are used. For
testing purposes, an additional flag (-hints-allow-reordering) has been
added to disable the reordering behaviour described above.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D101836
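Why reordering matters for floating point reductions can be shown with a small standalone example. This is only an illustration of ordered versus reordered summation, with a two-lane split standing in for what a vectorized loop might do; the function names are made up.

```cpp
#include <cstddef>
#include <vector>

// In-order ("strict") reduction: adds elements exactly as the scalar loop
// would.
double orderedSum(const std::vector<double> &v) {
  double acc = 0.0;
  for (double x : v) acc += x;
  return acc;
}

// Reordered reduction, as a two-lane vectorized loop might compute it:
// even and odd elements are accumulated separately and combined at the end.
double twoLaneSum(const std::vector<double> &v) {
  double lane[2] = {0.0, 0.0};
  for (std::size_t i = 0; i < v.size(); ++i) lane[i % 2] += v[i];
  return lane[0] + lane[1];
}
```

For the input `{1e20, 1.0, -1e20, 1.0}` the two functions disagree: in the ordered sum the first `1.0` is absorbed by `1e20`, while the two-lane version adds the two `1.0` values together in their own lane.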
Max Kazantsev [Wed, 26 May 2021 12:39:19 +0000 (19:39 +0700)]
Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration" (try 2)
The patch was reverted due to compile time impact of contextual SCEV
queries. It also appeared that it introduced a miscompile on irreducible CFG.
Changes made:
1. isKnownPredicateAt is replaced with more lightweight isKnownPredicate;
2. Irreducible CFG in live code is now detected and excluded from processing.
Differential Revision: https://reviews.llvm.org/D102615
Sanjay Patel [Wed, 26 May 2021 12:19:44 +0000 (08:19 -0400)]
[InstCombine] add fmul tests with shared operand; NFC
Baseline tests for: D102698
Sanjay Patel [Wed, 26 May 2021 12:15:22 +0000 (08:15 -0400)]
[InstCombine] avoid 'tmp' usage in test files; NFC
The update script ( utils/update_test_checks.py ) warns against this.
Sanjay Patel [Wed, 26 May 2021 12:11:17 +0000 (08:11 -0400)]
[InstCombine] avoid 'tmp' usage in test file; NFC
The update script ( utils/update_test_checks.py ) warns against this.
Max Kazantsev [Wed, 26 May 2021 12:29:07 +0000 (19:29 +0700)]
Revert "Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration""
This reverts commit 43d2e51c2e86788b9e2a582fdd3d8ffa7829328a.
Committed wrong version.
Max Kazantsev [Wed, 26 May 2021 09:52:57 +0000 (16:52 +0700)]
Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration"
The patch was reverted due to compile time impact of contextual SCEV
queries. It also appeared that it introduced a miscompile on irreducible CFG.
Changes made:
1. isKnownPredicateAt is replaced with more lightweight isKnownPredicate;
2. Irreducible CFG in live code is now detected and excluded from processing.
Differential Revision: https://reviews.llvm.org/D102615
Adrian Kuegel [Wed, 26 May 2021 10:28:14 +0000 (12:28 +0200)]
[mlir] Fold complex.create(complex.re(op), complex.im(op))
Differential Revision: https://reviews.llvm.org/D103148
Andrew Savonichev [Wed, 5 May 2021 19:18:02 +0000 (22:18 +0300)]
[AArch64] Generate LD1 for anyext i8 or i16 vector load
The existing LD1 patterns do not cover cases where result type does
not match the memory type. This happens when illegal vector types are
extended and scalarized, for example:
load <2 x i16>* %v2i16
is lowered into:
// first element
(v4i32 (insert_subvector (v2i32 (scalar_to_vector (load anyext from i16)))))
// other elements
(v4i32 (insert_vector_elt (i32 (load anyext from i16)) idx))
Before this patch these patterns were compiled into LDR + INS.
Now they are compiled into LD1.
The problem was reported in
PR24820: LLVM Generates abysmal code in simple situation.
Differential Revision: https://reviews.llvm.org/D102938
Max Kazantsev [Wed, 26 May 2021 11:35:30 +0000 (18:35 +0700)]
[Test] Add Loop Deletion test with irreducible CFG
Authored by Mikael Holmén. It demonstrated miscompile on irreducible
CFG with patch "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration".
The patch is reverted. Checking in the test to make sure this bug
does not return.
Sven van Haastregt [Wed, 26 May 2021 11:32:07 +0000 (12:32 +0100)]
[OpenCL] Include header for atomic-ops test
Avoid duplicating the memory_order and memory_scope enum definitions.
Tomas Matheson [Wed, 26 May 2021 11:27:25 +0000 (12:27 +0100)]
[MC] Move elf-unique-sections-by-flags.ll to X86/
pooja2299 [Wed, 26 May 2021 10:39:36 +0000 (16:09 +0530)]
[Docs] Updated the content of getting started documentation under llvm/lib/MC
Wrote about llvm/lib/MC subproject on https://llvm.org/docs/GettingStarted.html page.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D101047
Tomas Matheson [Wed, 12 May 2021 17:56:43 +0000 (18:56 +0100)]
[MC][ELF] Emit unique sections for different flags
Global values imply flags such as readable, writable, executable for the
sections that they will be placed in. Currently MC places all such
entries into the same section, using the first set of flags seen. This
can lead to situations in LTO where a writable global is placed in the
same named section as a readable global from another file, and the
section may not be marked writable.
D72194 ensures that mergeable globals with explicit sections are placed
in separate sections with compatible entry size, by emitting the
`unique` assembly syntax where appropriate. This change extends that
approach to include section flags, so that globals with different
section flags are emitted in separate unique sections.
Differential revision: https://reviews.llvm.org/D100944
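The approach can be sketched as a lookup keyed on both the section name and its flags. This is an illustrative model only: the flag values and the `UniqueSectionIDs` class are assumptions, not the MC data structures.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical flag bits for illustration only.
enum SectionFlags : unsigned { Read = 1, Write = 2, Exec = 4 };

// Hands out one id per distinct (section name, flags) pair, so globals that
// share a section name but need different flags land in separate sections.
class UniqueSectionIDs {
  std::map<std::pair<std::string, unsigned>, unsigned> ids;

public:
  unsigned get(const std::string &name, unsigned flags) {
    std::pair<std::string, unsigned> key(name, flags);
    auto it = ids.find(key);
    if (it != ids.end())
      return it->second;
    unsigned id = static_cast<unsigned>(ids.size());
    ids.emplace(key, id);
    return id;
  }
};
```

A read-only global and a writable global that both name the same section now receive different ids instead of sharing whichever flags were seen first.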
Tomas Matheson [Thu, 22 Apr 2021 14:41:33 +0000 (15:41 +0100)]
[MC][NFCI] Factor out ELF section unique ID calculation
Precursor to D100944. The logic for determining the unique ID had become
quite difficult to reason about, so I have factored this out into a
separate function.
Differential Revision: https://reviews.llvm.org/D102336
Pushpinder Singh [Tue, 25 May 2021 07:57:10 +0000 (07:57 +0000)]
[AMDGPU][Libomptarget] Inline atmi_init/atmi_finalize
After D102847, these functions can be inlined.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D103075
Pushpinder Singh [Tue, 25 May 2021 07:29:09 +0000 (07:29 +0000)]
[AMDGPU][Libomptarget] Delete g_atmi_initialized
This patch drops g_atmi_initialized and inlines the Initialize &
Finalize methods from Runtime class.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102847
Raphael Isemann [Wed, 26 May 2021 10:19:37 +0000 (12:19 +0200)]
[lldb][NFC] Use C++ versions of the deprecated C standard library headers
The C headers are deprecated so as requested in D102845, this is replacing them
all with their (not deprecated) C++ equivalent.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D103084
Simon Pilgrim [Wed, 26 May 2021 10:07:22 +0000 (11:07 +0100)]
[X86][SLM] Fix vector PSHUFB + variable shift resource/throughputs
Match what's documented in the Intel AOM (+Agner) - PSHUFB xmm is really slow, and mmx/xmm vector shifts are half rate.
Noticed while working to get the cost tables to more closely match llvm-mca analysis, in this case for shifts and truncations.
Florian Hahn [Tue, 25 May 2021 16:34:53 +0000 (17:34 +0100)]
[SCEV] Add tests with signed predicates for applyLoopGuards.
Pushpinder Singh [Tue, 25 May 2021 07:08:53 +0000 (07:08 +0000)]
[AMDGPU][Libomptarget] Move Kernel/Symbol info tables to RTLDeviceInfoTy
Two globals KernelInfoTable & SymbolInfoTable are moved
into RTLDeviceInfoTy class.
This builds on the top of D102691.
[2/2]
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102692
Kerry McLaughlin [Wed, 26 May 2021 09:27:32 +0000 (10:27 +0100)]
[NFC] Add CHECK lines for unordered FP reductions
An additional RUN line has been added to both strict-fadd.ll &
scalable-strict-fadd.ll to ensure the correct behaviour of these
tests where `-enable-strict-reductions` is false.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D103015
Mirko Brkusanin [Wed, 26 May 2021 09:49:05 +0000 (11:49 +0200)]
[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks
This function can change regbank for registers which already have a selected
bank. Depending on the instruction where these registers were used it can
cause instruction selection to fail.
Differential Revision: https://reviews.llvm.org/D98515
Mirko Brkusanin [Wed, 26 May 2021 09:47:21 +0000 (11:47 +0200)]
Revert "[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks"
This reverts commit 18c5444702893fd63b0a99ec7133dd714284f9d2.
Fraser Cormack [Wed, 26 May 2021 09:30:13 +0000 (10:30 +0100)]
[RISCV] Pre-commit fixed-length mask vselect tests
These are default-expanded but later unrolled due to RISC-V's vector
boolean content policy. A patch to improve this codegen will follow
shortly.
Max Kazantsev [Wed, 26 May 2021 09:38:10 +0000 (16:38 +0700)]
[Test] Add simplified versions of tests for loop deletion that don't need context
Tim Northover [Wed, 26 May 2021 08:10:40 +0000 (09:10 +0100)]
AArch64: support post-indexed stores to bfloat types.
Simon Pilgrim [Tue, 25 May 2021 17:42:01 +0000 (18:42 +0100)]
[CostModel][X86] Remove old testshift* tests
The vector shift cost tests are better covered (more cpu/sse levels) by the vshift-*-*cost files, and we're trying to avoid codegen tests in here as it makes it harder to maintain the test files.
Simon Pilgrim [Tue, 25 May 2021 17:00:53 +0000 (18:00 +0100)]
[X86][Atom] Fix vector variable shift resource/throughputs
Match what's documented in the Intel AOM - the non-immediate variants of the PSLL*/PSRA*/PSRL* shift instructions require BOTH ports - this was being incorrectly modelled as EITHER port.
Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.
Max Kazantsev [Wed, 26 May 2021 09:25:08 +0000 (16:25 +0700)]
[Test] Add test on unrolling to make sure it won't fail
Initially it failed an assertion with "Do actual DCE in LoopUnroll (try 2)"
which was later reverted. Make sure that when this patch is returned, the
test works fine.
Roman Lebedev [Wed, 26 May 2021 09:17:44 +0000 (12:17 +0300)]
[NFC][X86] clang-format X86TTIImpl::getInterleavedMemoryOpCostAVX2()
I plan to make changes to it, and undoing formatting each time is not going to be fun.
David Sherwood [Wed, 26 May 2021 08:59:45 +0000 (09:59 +0100)]
Bjorn Pettersson [Wed, 26 May 2021 09:07:45 +0000 (11:07 +0200)]
[HIP] Adjust check in hip-include-path.hip test case
The changes in commit 722c39fef5ab6 caused the test case to fail
when building with -DLLVM_LIBDIR_SUFFIX=64. This patch makes the
checks a bit more relaxed to support libdir suffixes again.
Also adjusting the regular expressions to avoid matches including
double quotes.
Butygin [Sat, 10 Apr 2021 16:38:11 +0000 (19:38 +0300)]
[mlir] LocalAliasAnalysis: Assume allocation scope to function scope if cannot determine better
It helps when checking aliasing between AllocOp result and function arguments.
Differential Revision: https://reviews.llvm.org/D102557
Adrian Kuegel [Wed, 26 May 2021 08:59:09 +0000 (10:59 +0200)]
[mlir] Simplify folding code (NFC)
David Sherwood [Tue, 4 May 2021 12:58:02 +0000 (13:58 +0100)]
[InstCombine] Fold extractelement + vector GEP with one use
We sometimes see code like this:
Case 1:
%gep = getelementptr i32, i32* %a, <2 x i64> %splat
%ext = extractelement <2 x i32*> %gep, i32 0
or this:
Case 2:
%gep = getelementptr i32, <4 x i32*> %a, i64 1
%ext = extractelement <4 x i32*> %gep, i32 0
where there is only one use of the GEP. In such cases it makes
sense to fold the two together such that we create a scalar GEP:
Case 1:
%ext = extractelement <2 x i64> %splat, i32 0
%gep = getelementptr i32, i32* %a, i64 %ext
Case 2:
%ext = extractelement <4 x i32*> %a, i32 0
%gep = getelementptr i32, i32* %ext, i64 1
This may create further folding opportunities as a result, i.e.
the extract of a splat vector can be completely eliminated. Also,
even for the general case where the vector operand is not a splat
it seems beneficial to create a scalar GEP and extract the scalar
element from the operand. Therefore, in this patch I've assumed
that a scalar GEP is always preferable to a vector GEP and have
added code to unconditionally fold the extract + GEP.
I haven't added folds for the case when we have both a vector of
pointers and a vector of indices, since this would require
generating an additional extractelement operation.
Tests have been added here:
Transforms/InstCombine/gep-vector-indices.ll
Differential Revision: https://reviews.llvm.org/D101900
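The equivalence behind the fold can be sketched outside of LLVM. The following is only a model, assuming fixed-size vectors represented as std::array and in-bounds pointer arithmetic on an i32 array; the function names are hypothetical and none of this is the InstCombine code itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Case 1, unfolded: vector GEP (scalar base, vector of indices), then extract.
template <std::size_t N>
const std::int32_t *case1Unfolded(const std::int32_t *base,
                                  const std::array<std::int64_t, N> &idx,
                                  std::size_t lane) {
  assert(lane < N);
  std::array<const std::int32_t *, N> gep{};
  for (std::size_t i = 0; i < N; ++i)
    gep[i] = base + idx[i]; // one address per lane
  return gep[lane];
}

// Case 1, folded: extract the index first, then a single scalar GEP.
template <std::size_t N>
const std::int32_t *case1Folded(const std::int32_t *base,
                                const std::array<std::int64_t, N> &idx,
                                std::size_t lane) {
  assert(lane < N);
  return base + idx[lane];
}

// Case 2, unfolded: vector GEP (vector of pointers, scalar index), then extract.
template <std::size_t N>
const std::int32_t *
case2Unfolded(const std::array<const std::int32_t *, N> &ptrs,
              std::int64_t idx, std::size_t lane) {
  assert(lane < N);
  std::array<const std::int32_t *, N> gep{};
  for (std::size_t i = 0; i < N; ++i)
    gep[i] = ptrs[i] + idx;
  return gep[lane];
}

// Case 2, folded: extract the pointer first, then a single scalar GEP.
template <std::size_t N>
const std::int32_t *
case2Folded(const std::array<const std::int32_t *, N> &ptrs,
            std::int64_t idx, std::size_t lane) {
  assert(lane < N);
  return ptrs[lane] + idx;
}
```

In both cases the folded form computes one address instead of N, which is the intuition for treating the scalar GEP as the preferred form when the vector GEP has a single use.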
Adrian Kuegel [Wed, 26 May 2021 07:43:26 +0000 (09:43 +0200)]
[mlir] Fold complex.re(complex.create) and complex.im(complex.create)
This extends the folding we already have. A test needs to be adjusted.
Differential Revision: https://reviews.llvm.org/D103141
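As a rough picture of what such a fold does, the sketch below uses a hypothetical miniature IR (the Value struct and fold function are invented for illustration and are not MLIR's API): projecting the real or imaginary part of a freshly created complex value just returns the corresponding operand.

```cpp
#include <cassert>

// Hypothetical miniature IR: a value is a leaf, a "create" of two values,
// or the real/imaginary projection of one value.
struct Value {
  enum Kind { Leaf, Create, Re, Im } kind;
  const Value *lhs; // Create: real part; Re/Im: the complex operand
  const Value *rhs; // Create: imaginary part; otherwise unused
};

// Returns the simpler value if the fold applies, otherwise `v` itself.
const Value *fold(const Value *v) {
  bool isProjection = v->kind == Value::Re || v->kind == Value::Im;
  if (isProjection && v->lhs->kind == Value::Create)
    return v->kind == Value::Re ? v->lhs->lhs : v->lhs->rhs;
  return v;
}
```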
Esme-Yi [Wed, 26 May 2021 08:47:53 +0000 (08:47 +0000)]
[NFC][object] Change the input parameter of the method isDebugSection.
Summary: This is an NFC patch to change the input parameter of the method SectionRef::isDebugSection(), by replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine if a section is debug type in more ways than just by section name.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102601
David Green [Wed, 26 May 2021 08:22:12 +0000 (09:22 +0100)]
[ARM] Add patterns for vmulh
Now that vmulh can be selected, this adds the MVE patterns to make it
legal and generate instructions.
Differential Revision: https://reviews.llvm.org/D88011
Björn Schäpers [Tue, 25 May 2021 15:55:12 +0000 (17:55 +0200)]
[clang-format][NFC] correctly sort StatementAttributeLike-macros' IO.map
LLVM GN Syncbot [Wed, 26 May 2021 04:31:12 +0000 (04:31 +0000)]
[gn build] Port
36d0fdf9ac3b
Christopher Di Bella [Wed, 5 May 2021 07:14:08 +0000 (07:14 +0000)]
[libcxx][iterator] adds `std::ranges::advance`
Implements part of P0896 'The One Ranges Proposal'.
Implements [range.iter.op.advance].
Differential Revision: https://reviews.llvm.org/D101922
Arthur Eubanks [Tue, 25 May 2021 19:36:25 +0000 (12:36 -0700)]
[OpaquePtr] Make atomicrmw work with opaque pointers
FullTy is only necessary when we need to figure out what type an
instruction works with given a pointer's pointee type. However, we just
end up using the value operand's type, so FullTy isn't necessary.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102788
Jonas Devlieghere [Wed, 26 May 2021 00:21:01 +0000 (17:21 -0700)]
Revert "[lldb] Avoid format string in LLDB_SCOPED_TIMER"
Right after pushing, I remembered that this was added to silence a GCC
warning (https://reviews.llvm.org/D99120). This reverts my patch and
adds a comment.
Jonas Devlieghere [Wed, 26 May 2021 00:12:28 +0000 (17:12 -0700)]
[lldb] Avoid format string in LLDB_SCOPED_TIMER
Pass LLVM_PRETTY_FUNCTION directly for the no-argument macro.
Teresa Johnson [Tue, 25 May 2021 05:02:44 +0000 (22:02 -0700)]
[LTT] Handle merged llvm.assume when dropping type tests
When the lower type test pass is invoked a second time with
DropTypeTests set to true, it expects that all remaining type tests feed
assume instructions, which are removed along with the type tests.
In some cases the llvm.assume might have been merged with another one,
i.e. from a builtin_assume instruction, in which case the type test
would actually feed a phi that in turn feeds the merged assume
instruction. In this case we can simply replace that operand of the phi
with "true" before removing the type test.
Differential Revision: https://reviews.llvm.org/D103073
Arthur Eubanks [Tue, 25 May 2021 22:31:38 +0000 (15:31 -0700)]
[OpaquePtr] Create new bitcode encoding for atomicrmw
Since the opaque pointer type won't contain the pointee type, we need to
separately encode the value type for an atomicrmw.
Emit this new code for atomicrmw.
Handle this new code and the old one in the bitcode reader.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103123
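The reader-side idea of accepting both encodings can be sketched as follows. Everything here is hypothetical (the struct names, the kNoType marker, and the record shape are assumptions for illustration) and does not reflect the real LLVM bitcode record layout.

```cpp
#include <cassert>

// Hypothetical type ids; kNoType marks an opaque pointer with no pointee.
constexpr int kNoType = -1;

struct PointerType {
  int pointeeTypeId; // kNoType when the pointer is opaque
};

// Hypothetical decoded record: the new encoding carries the value type
// explicitly, the old encoding relies on the pointer's pointee type.
struct AtomicRMWRecord {
  bool hasExplicitValueType;
  int valueTypeId; // only meaningful when hasExplicitValueType is true
};

// Value type the instruction operates on, or kNoType if it cannot be
// determined (old encoding combined with an opaque pointer).
int valueTypeOf(const AtomicRMWRecord &rec, const PointerType &ptr) {
  if (rec.hasExplicitValueType)
    return rec.valueTypeId; // new encoding: independent of the pointer type
  return ptr.pointeeTypeId; // old encoding: derived from the pointee type
}
```

The last case in the function is the motivation for the new code: with an opaque pointer and the old encoding there is nothing to derive the value type from.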
Fangrui Song [Tue, 25 May 2021 23:28:17 +0000 (16:28 -0700)]
[sanitizer] Let glibc aarch64 use O(1) GetTls
The generic approach can still be used by musl and FreeBSD. Note: on glibc
2.31, TLS_PRE_TCB_SIZE is 0x700, larger than ThreadDescriptorSize() by 16, but
this is benign: as long as the range includes pthread::{specific_1stblock,specific}
pthread_setspecific will not cause false positives.
Note: the state before
afec953857ffd682cb4119e7950f3593efbaaa81 underestimated
the TLS size a lot (nearly ThreadDescriptorSize() = 1776).
That may explain why
afec953857ffd682cb4119e7950f3593efbaaa81 actually made some
tests pass.
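The size arithmetic quoted above can be checked directly. The constant names below are hypothetical labels for the two numbers in the message; they are not definitions taken from glibc or the sanitizer runtime.

```cpp
#include <cassert>

// Values as quoted in the commit message (glibc 2.31, aarch64).
constexpr unsigned kTlsPreTcbSize = 0x700;        // 1792 in decimal
constexpr unsigned kThreadDescriptorSize = 1776;

static_assert(kTlsPreTcbSize == 1792, "0x700 is 1792");
static_assert(kTlsPreTcbSize - kThreadDescriptorSize == 16,
              "TLS_PRE_TCB_SIZE exceeds ThreadDescriptorSize() by 16");
```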
Kevin Athey [Thu, 13 May 2021 18:41:43 +0000 (11:41 -0700)]
LLVM Detailed IR tests for introduction of flag -fsanitize-address-detect-stack-use-after-return-mode.
Rework all tests that interact with use after return to correctly handle the case where the mode has been explicitly set to Never or Always.
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102462
Alexandre Ganea [Tue, 25 May 2021 22:03:55 +0000 (18:03 -0400)]
[benchmark] Silence 'suggest override' and 'missing override' warnings
When building with Clang 11 on Windows, silence the following:
F:\aganea\llvm-project\llvm\utils\benchmark\include\benchmark/benchmark.h(955,8): warning: 'Run' overrides a member function but is not marked 'override' [-Wsuggest-override]
void Run(State& st);
^
F:\aganea\llvm-project\llvm\utils\benchmark\include\benchmark/benchmark.h(895,16): note: overridden virtual function is here
virtual void Run(State& state) = 0;
^
1 warning generated.
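For context, the usual source-level way to satisfy -Wsuggest-override is to mark the overriding function with `override`. The sketch below uses hypothetical stand-in classes, not the actual declarations in benchmark.h, and the log does not say whether the commit changes the source this way or suppresses the warning in the build.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only.
struct State {
  int iterations = 0;
};

class BenchmarkBase {
public:
  virtual ~BenchmarkBase() = default;
  virtual void Run(State &state) = 0;
};

class CountingBenchmark : public BenchmarkBase {
public:
  // `override` asks the compiler to verify that this really overrides a
  // virtual function of the base class; without it, Clang emits the
  // -Wsuggest-override warning shown above.
  void Run(State &st) override { ++st.iterations; }
};
```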
Alexandre Ganea [Tue, 25 May 2021 21:22:08 +0000 (17:22 -0400)]
[gcov] Silence warning: comparison of integers of different signs
When building with Clang 11 on Windows, silence the following:
[432/5643] Building C object projects\compiler-rt\lib\profile\CMakeFiles\clang_rt.profile-x86_64.dir\GCDAProfiling.c.obj
F:\aganea\llvm-project\compiler-rt\lib\profile\GCDAProfiling.c(464,13): warning: comparison of integers of different signs: 'uint32_t' (aka 'unsigned int') and 'int' [-Wsign-compare]
if (val != (gcov_version >= 90 ? GCOV_TAG_OBJECT_SUMMARY
~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
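The reason -Wsign-compare exists is that the signed operand is converted to unsigned before the comparison. A small sketch, with hypothetical function names and no claim about how the commit itself resolves the warning:

```cpp
#include <cassert>
#include <cstdint>

// What a mixed comparison effectively does: the int is converted to
// uint32_t first, so -1 compares equal to 0xFFFFFFFF. The cast here only
// makes that implicit conversion visible.
bool convertedEqual(std::uint32_t val, int tag) {
  return val == static_cast<std::uint32_t>(tag);
}

// A sign-aware alternative: negative tags never match an unsigned value.
bool signAwareEqual(std::uint32_t val, int tag) {
  return tag >= 0 && val == static_cast<std::uint32_t>(tag);
}
```

When both operands are known to be non-negative, as is typically the case for tag constants, an explicit cast is enough to state the intent and silence the warning.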
Rob Suderman [Tue, 25 May 2021 22:27:11 +0000 (15:27 -0700)]
[NFC][MLIR][TOSA] Replaced tosa linalg.indexed_generic lowerings with linalg.index
Indexed Generic should be going away in the future. Migrate to linalg.index.
Reviewed By: NatashaKnk, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D103110
Vitaly Buka [Tue, 25 May 2021 22:25:05 +0000 (15:25 -0700)]
[NFC][SCUDO] Fix unittest for -gtest_repeat=10
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D103122
Chris Lattner [Tue, 25 May 2021 21:38:01 +0000 (14:38 -0700)]
[MLIR Core] Cache the empty StringAttr like we do for empty dictionaries. NFC.
MLIRContext holds a few special case values that occur frequently like empty
dictionary and NoneType, which allow us to avoid taking locks to get an instance
of them. Give the empty StringAttr this treatment as well. This cuts several
percent off compile time for CIRCT.
Differential Revision: https://reviews.llvm.org/D103117
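The caching pattern described here can be sketched in isolation. The Context class below is a hypothetical stand-in, not MLIRContext: it uniques strings behind a mutex but creates the empty string once at construction so the common empty lookup never takes the lock.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical uniquing context for illustration only.
class Context {
public:
  // Members are initialized in declaration order, so the mutex and pool
  // already exist when the empty string is interned here.
  Context() : emptyString(intern("")) {}

  // Returns the unique instance for `s`. The empty string is served from
  // the cached pointer without locking.
  const std::string *get(const std::string &s) {
    if (s.empty())
      return emptyString;
    return intern(s);
  }

private:
  const std::string *intern(const std::string &s) {
    std::lock_guard<std::mutex> guard(mutex);
    std::unique_ptr<std::string> &slot = pool[s];
    if (!slot)
      slot.reset(new std::string(s));
    return slot.get();
  }

  std::mutex mutex;
  std::unordered_map<std::string, std::unique_ptr<std::string>> pool;
  const std::string *emptyString;
};
```

The cached pointer is written once in the constructor and only read afterwards, which is what makes the lock-free fast path safe in this sketch.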
Chris Lattner [Tue, 25 May 2021 21:50:35 +0000 (14:50 -0700)]
[Toy] Update tests to pass with top-down canonicalize pass. NFC
Jon Chesterfield [Tue, 25 May 2021 21:43:16 +0000 (22:43 +0100)]
[libomptarget][nfc] Move hostcall required test to rtl
Remove a global, fix minor race. First of N patches to bring up hostcall.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D103058
Louis Dionne [Tue, 25 May 2021 21:34:57 +0000 (17:34 -0400)]
[libc++] Install GCC 11 on CI builders
David Green [Tue, 25 May 2021 21:24:06 +0000 (22:24 +0100)]
[ARM] Extra predicated tests for VMULH. NFC