platform/upstream/llvm.git
3 years ago[InstCombine] avoid 'tmp' usage in test files; NFC
Sanjay Patel [Wed, 26 May 2021 12:15:22 +0000 (08:15 -0400)]
[InstCombine] avoid 'tmp' usage in test files; NFC

The update script ( utils/update_test_checks.py ) warns against this.

3 years ago[InstCombine] avoid 'tmp' usage in test file; NFC
Sanjay Patel [Wed, 26 May 2021 12:11:17 +0000 (08:11 -0400)]
[InstCombine] avoid 'tmp' usage in test file; NFC

The update script ( utils/update_test_checks.py ) warns against this.

3 years agoRevert "Return "[LoopDeletion] Break backedge if we can prove that the loop is exited...
Max Kazantsev [Wed, 26 May 2021 12:29:07 +0000 (19:29 +0700)]
Revert "Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration""

This reverts commit 43d2e51c2e86788b9e2a582fdd3d8ffa7829328a.

Committed the wrong version.

3 years agoReturn "[LoopDeletion] Break backedge if we can prove that the loop is exited on...
Max Kazantsev [Wed, 26 May 2021 09:52:57 +0000 (16:52 +0700)]
Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration"

The patch was reverted due to compile time impact of contextual SCEV
queries. It also appeared that it introduced a miscompile on irreducible CFG.

Changes made:
1. isKnownPredicateAt is replaced with more lightweight isKnownPredicate;
2. Irreducible CFG in live code is now detected and excluded from processing.

Differential Revision: https://reviews.llvm.org/D102615

3 years ago[mlir] Fold complex.create(complex.re(op), complex.im(op))
Adrian Kuegel [Wed, 26 May 2021 10:28:14 +0000 (12:28 +0200)]
[mlir] Fold complex.create(complex.re(op), complex.im(op))

Differential Revision: https://reviews.llvm.org/D103148

3 years ago[AArch64] Generate LD1 for anyext i8 or i16 vector load
Andrew Savonichev [Wed, 5 May 2021 19:18:02 +0000 (22:18 +0300)]
[AArch64] Generate LD1 for anyext i8 or i16 vector load

The existing LD1 patterns do not cover cases where the result type does
not match the memory type. This happens when illegal vector types are
extended and scalarized, for example:

  load <2 x i16>* %v2i16

is lowered into:

  // first element
  (v4i32 (insert_subvector (v2i32 (scalar_to_vector (load anyext from i16)))))
  // other elements
  (v4i32 (insert_vector_elt (i32 (load anyext from i16)) idx))

Before this patch these patterns were compiled into LDR + INS.
Now they are compiled into LD1.

The problem was reported in
PR24820: LLVM Generates abysmal code in simple situation.

Differential Revision: https://reviews.llvm.org/D102938

3 years ago[Test] Add Loop Deletion test with irreducible CFG
Max Kazantsev [Wed, 26 May 2021 11:35:30 +0000 (18:35 +0700)]
[Test] Add Loop Deletion test with irreducible CFG

Authored by Mikael Holmén. It demonstrated miscompile on irreducible
CFG with patch "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration".
The patch is reverted. Checking in the test to make sure this bug
does not return.

3 years ago[OpenCL] Include header for atomic-ops test
Sven van Haastregt [Wed, 26 May 2021 11:32:07 +0000 (12:32 +0100)]
[OpenCL] Include header for atomic-ops test

Avoid duplicating the memory_order and memory_scope enum definitions.

3 years ago[MC] Move elf-unique-sections-by-flags.ll to X86/
Tomas Matheson [Wed, 26 May 2021 11:27:25 +0000 (12:27 +0100)]
[MC] Move elf-unique-sections-by-flags.ll to X86/

3 years ago[Docs] Updated the content of getting started documentation under llvm/lib/MC
pooja2299 [Wed, 26 May 2021 10:39:36 +0000 (16:09 +0530)]
[Docs] Updated the content of getting started documentation under llvm/lib/MC

Wrote about llvm/lib/MC subproject on https://llvm.org/docs/GettingStarted.html page.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D101047

3 years ago[MC][ELF] Emit unique sections for different flags
Tomas Matheson [Wed, 12 May 2021 17:56:43 +0000 (18:56 +0100)]
[MC][ELF] Emit unique sections for different flags

Global values imply flags such as readable, writable, executable for the
sections that they will be placed in. Currently MC places all such
entries into the same section, using the first set of flags seen. This
can lead to situations in LTO where a writable global is placed in the
same named section as a readable global from another file, and the
section may not be marked writable.

D72194 ensures that mergeable globals with explicit sections are placed
in separate sections with compatible entry size, by emitting the
`unique` assembly syntax where appropriate. This change extends that
approach to include section flags, so that globals with different
section flags are emitted in separate unique sections.

Differential revision: https://reviews.llvm.org/D100944
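
The idea of keying unique sections on flags as well as name can be sketched
roughly as follows. This is a hypothetical model, not the MC implementation:
the class SectionUniquer, its method getUniqueID and the flag values are all
invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical model: hands out one unique ID per (section name, flags) pair,
// so two globals that share a name but differ in flags land in separate
// sections instead of inheriting the first set of flags seen.
class SectionUniquer {
public:
  unsigned getUniqueID(const std::string &name, unsigned flags) {
    assert(!name.empty() && "sketch only models explicitly named sections");
    const auto key = std::make_pair(name, flags);
    const auto it = ids_.find(key);
    if (it != ids_.end())
      return it->second; // same name and flags: reuse the existing section
    const unsigned id = next_id_++;
    ids_.emplace(key, id);
    return id;
  }

private:
  std::map<std::pair<std::string, unsigned>, unsigned> ids_;
  unsigned next_id_ = 0;
};
```

With this model, asking for ".data.foo" twice with the same flags returns the
same ID, while asking with different flags returns a new one.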

3 years ago[MC][NFCI] Factor out ELF section unique ID calculation
Tomas Matheson [Thu, 22 Apr 2021 14:41:33 +0000 (15:41 +0100)]
[MC][NFCI] Factor out ELF section unique ID calculation

Precursor to D100944. The logic for determining the unique ID had become
quite difficult to reason about, so I have factored this out into a
separate function.

Differential Revision: https://reviews.llvm.org/D102336

3 years ago[AMDGPU][Libomptarget] Inline atmi_init/atmi_finalize
Pushpinder Singh [Tue, 25 May 2021 07:57:10 +0000 (07:57 +0000)]
[AMDGPU][Libomptarget] Inline atmi_init/atmi_finalize

After D102847, these functions can be inlined.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103075

3 years ago[AMDGPU][Libomptarget] Delete g_atmi_initialized
Pushpinder Singh [Tue, 25 May 2021 07:29:09 +0000 (07:29 +0000)]
[AMDGPU][Libomptarget] Delete g_atmi_initialized

This patch drops g_atmi_initialized and inlines the Initialize &
Finalize methods from Runtime class.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102847

3 years ago[lldb][NFC] Use C++ versions of the deprecated C standard library headers
Raphael Isemann [Wed, 26 May 2021 10:19:37 +0000 (12:19 +0200)]
[lldb][NFC] Use C++ versions of the deprecated C standard library headers

The C headers are deprecated, so, as requested in D102845, this replaces them
all with their (non-deprecated) C++ equivalents.

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D103084
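
As a small illustration of the kind of change involved (not code from the
patch; the function format_pair is made up for this example), the C++ headers
are included in place of <stdio.h> and <string.h>, and the functions are used
through the std:: namespace:

```cpp
#include <cassert>
#include <cstdio>  // instead of <stdio.h>
#include <cstring> // instead of <string.h>
#include <string>

// Formats "name=value" using the std::-qualified C library functions
// declared by the <c...> headers.
std::string format_pair(const char *name, int value) {
  assert(name != nullptr);
  char buffer[64];
  std::snprintf(buffer, sizeof(buffer), "%s=%d", name, value);
  return std::string(buffer, std::strlen(buffer));
}
```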

3 years ago[X86][SLM] Fix vector PSHUFB + variable shift resource/throughputs
Simon Pilgrim [Wed, 26 May 2021 10:07:22 +0000 (11:07 +0100)]
[X86][SLM] Fix vector PSHUFB + variable shift resource/throughputs

Match what's documented in the Intel AOM (+Agner) - PSHUFB xmm is really slow, and mmx/xmm vector shifts are half rate.

Noticed while working to get the cost tables to more closely match llvm-mca analysis, in this case for shifts and truncations.

3 years ago[SCEV] Add tests with signed predicates for applyLoopGuards.
Florian Hahn [Tue, 25 May 2021 16:34:53 +0000 (17:34 +0100)]
[SCEV] Add tests with signed predicates for applyLoopGuards.

3 years ago[AMDGPU][Libomptarget] Move Kernel/Symbol info tables to RTLDeviceInfoTy
Pushpinder Singh [Tue, 25 May 2021 07:08:53 +0000 (07:08 +0000)]
[AMDGPU][Libomptarget] Move Kernel/Symbol info tables to RTLDeviceInfoTy

Two globals KernelInfoTable & SymbolInfoTable are moved
into RTLDeviceInfoTy class.
This builds on top of D102691.
[2/2]

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102692

3 years ago[NFC] Add CHECK lines for unordered FP reductions
Kerry McLaughlin [Wed, 26 May 2021 09:27:32 +0000 (10:27 +0100)]
[NFC] Add CHECK lines for unordered FP reductions

An additional RUN line has been added to both strict-fadd.ll &
scalable-strict-fadd.ll to ensure the correct behaviour of these
tests where `-enable-strict-reductions` is false.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D103015

3 years ago[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks
Mirko Brkusanin [Wed, 26 May 2021 09:49:05 +0000 (11:49 +0200)]
[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks

This function can change regbank for registers which already have a selected
bank. Depending on the instruction where these registers were used it can
cause instruction selection to fail.

Differential Revision: https://reviews.llvm.org/D98515

3 years agoRevert "[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks"
Mirko Brkusanin [Wed, 26 May 2021 09:47:21 +0000 (11:47 +0200)]
Revert "[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks"

This reverts commit 18c5444702893fd63b0a99ec7133dd714284f9d2.

3 years ago[RISCV] Pre-commit fixed-length mask vselect tests
Fraser Cormack [Wed, 26 May 2021 09:30:13 +0000 (10:30 +0100)]
[RISCV] Pre-commit fixed-length mask vselect tests

These are default-expanded but later unrolled due to RISC-V's vector
boolean content policy. A patch to improve this codegen will follow
shortly.

3 years ago[Test] Add simplified versions of tests for loop deletion that don't need context
Max Kazantsev [Wed, 26 May 2021 09:38:10 +0000 (16:38 +0700)]
[Test] Add simplified versions of tests for loop deletion that don't need context

3 years agoAArch64: support post-indexed stores to bfloat types.
Tim Northover [Wed, 26 May 2021 08:10:40 +0000 (09:10 +0100)]
AArch64: support post-indexed stores to bfloat types.

3 years ago[CostModel][X86] Remove old testshift* tests
Simon Pilgrim [Tue, 25 May 2021 17:42:01 +0000 (18:42 +0100)]
[CostModel][X86] Remove old testshift* tests

The vector shift cost tests are better covered (more cpu/sse levels) by the vshift-*-*cost files, and we're trying to avoid codegen tests in here as it makes it harder to maintain the test files.

3 years ago[X86][Atom] Fix vector variable shift resource/throughputs
Simon Pilgrim [Tue, 25 May 2021 17:00:53 +0000 (18:00 +0100)]
[X86][Atom] Fix vector variable shift resource/throughputs

Match what's documented in the Intel AOM - the non-immediate variants of the PSLL*/PSRA*/PSRL* shift instructions require BOTH ports - this was being incorrectly modelled as EITHER port.

Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.

3 years ago[Test] Add test on unrolling to make sure it won't fail
Max Kazantsev [Wed, 26 May 2021 09:25:08 +0000 (16:25 +0700)]
[Test] Add test on unrolling to make sure it won't fail

Initially it failed an assertion with "Do actual DCE in LoopUnroll (try 2)"
which was later reverted. Make sure that when this patch is returned, the
test works fine.

3 years ago[NFC][X86] clang-format X86TTIImpl::getInterleavedMemoryOpCostAVX2()
Roman Lebedev [Wed, 26 May 2021 09:17:44 +0000 (12:17 +0300)]
[NFC][X86] clang-format X86TTIImpl::getInterleavedMemoryOpCostAVX2()

I plan to make changes to it, and undoing formatting each time is not going to be fun.

3 years agoFix warning introduced by 9c766f4090d19e3e2f56e87164177f8c3eba4b96
David Sherwood [Wed, 26 May 2021 08:59:45 +0000 (09:59 +0100)]
Fix warning introduced by 9c766f4090d19e3e2f56e87164177f8c3eba4b96

3 years ago[HIP] Adjust check in hip-include-path.hip test case
Bjorn Pettersson [Wed, 26 May 2021 09:07:45 +0000 (11:07 +0200)]
[HIP] Adjust check in hip-include-path.hip test case

The changes in commit 722c39fef5ab6 caused the test case to fail
when building with -DLLVM_LIBDIR_SUFFIX=64. This patch makes the
checks a bit more relaxed to support libdir suffixes again.

Also adjusting the regular expressions to avoid matches including
double quotes.

3 years ago[mlir] LocalAliasAnalysis: Assume allocation scope to function scope if cannot determ...
Butygin [Sat, 10 Apr 2021 16:38:11 +0000 (19:38 +0300)]
[mlir] LocalAliasAnalysis: Assume allocation scope to function scope if cannot determine better

It helps when checking aliasing between AllocOp result and function arguments.

Differential Revision: https://reviews.llvm.org/D102557

3 years ago[mlir] Simplify folding code (NFC)
Adrian Kuegel [Wed, 26 May 2021 08:59:09 +0000 (10:59 +0200)]
[mlir] Simplify folding code (NFC)

3 years ago[InstCombine] Fold extractelement + vector GEP with one use
David Sherwood [Tue, 4 May 2021 12:58:02 +0000 (13:58 +0100)]
[InstCombine] Fold extractelement + vector GEP with one use

We sometimes see code like this:

Case 1:
  %gep = getelementptr i32, i32* %a, <2 x i64> %splat
  %ext = extractelement <2 x i32*> %gep, i32 0

or this:

Case 2:
  %gep = getelementptr i32, <4 x i32*> %a, i64 1
  %ext = extractelement <4 x i32*> %gep, i32 0

where there is only one use of the GEP. In such cases it makes
sense to fold the two together such that we create a scalar GEP:

Case 1:
  %ext = extractelement <2 x i64> %splat, i32 0
  %gep = getelementptr i32, i32* %a, i64 %ext

Case 2:
  %ext = extractelement <4 x i32*> %a, i32 0
  %gep = getelementptr i32, i32* %ext, i64 1

This may create further folding opportunities as a result, i.e.
the extract of a splat vector can be completely eliminated. Also,
even for the general case where the vector operand is not a splat
it seems beneficial to create a scalar GEP and extract the scalar
element from the operand. Therefore, in this patch I've assumed
that a scalar GEP is always preferable to a vector GEP and have
added code to unconditionally fold the extract + GEP.

I haven't added folds for the case when we have both a vector of
pointers and a vector of indices, since this would require
generating an additional extractelement operation.

Tests have been added here:

  Transforms/InstCombine/gep-vector-indices.ll

Differential Revision: https://reviews.llvm.org/D101900

3 years ago[mlir] Fold complex.re(complex.create) and complex.im(complex.create)
Adrian Kuegel [Wed, 26 May 2021 07:43:26 +0000 (09:43 +0200)]
[mlir] Fold complex.re(complex.create) and complex.im(complex.create)

This extends the folding we already have. A test needs to be adjusted.

Differential Revision: https://reviews.llvm.org/D103141

3 years ago[NFC][object] Change the input parameter of the method isDebugSection.
Esme-Yi [Wed, 26 May 2021 08:47:53 +0000 (08:47 +0000)]
[NFC][object] Change the input parameter of the method isDebugSection.

Summary: This is an NFC patch that changes the input parameter of SectionRef::isDebugSection(), replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine whether a section is a debug section in more ways than just by section name.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D102601

3 years ago[ARM] Add patterns for vmulh
David Green [Wed, 26 May 2021 08:22:12 +0000 (09:22 +0100)]
[ARM] Add patterns for vmulh

Now that vmulh can be selected, this adds the MVE patterns to make it
legal and generate instructions.

Differential Revision: https://reviews.llvm.org/D88011

3 years ago[clang-format][NFC] correctly sort StatementAttributeLike-macros' IO.map
Björn Schäpers [Tue, 25 May 2021 15:55:12 +0000 (17:55 +0200)]
[clang-format][NFC] correctly sort StatementAttributeLike-macros' IO.map

3 years ago[gn build] Port 36d0fdf9ac3b
LLVM GN Syncbot [Wed, 26 May 2021 04:31:12 +0000 (04:31 +0000)]
[gn build] Port 36d0fdf9ac3b

3 years ago[libcxx][iterator] adds `std::ranges::advance`
Christopher Di Bella [Wed, 5 May 2021 07:14:08 +0000 (07:14 +0000)]
[libcxx][iterator] adds `std::ranges::advance`

Implements part of P0896 'The One Ranges Proposal'.
Implements [range.iter.op.advance].

Differential Revision: https://reviews.llvm.org/D101922
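
[range.iter.op.advance] specifies three forms: advance(i, n), advance(i, bound)
and advance(i, n, bound), where the last one returns the part of n that could
not be covered before reaching bound. The sketch below models only that bounded
form for non-negative n. It is a simplified illustration, not the libc++
implementation, and advance_bounded is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper modelling the bounded form for n >= 0:
// step `it` forward at most n times, stopping early at `bound`.
// Returns the number of requested steps that could not be taken.
template <typename Iter>
std::ptrdiff_t advance_bounded(Iter &it, std::ptrdiff_t n, Iter bound) {
  assert(n >= 0 && "this sketch only models forward movement");
  while (n > 0 && it != bound) {
    ++it;
    --n;
  }
  return n;
}
```

For a three-element list, advancing the begin iterator by 2 leaves nothing
uncovered, while then asking for 5 more steps stops at the end after one step
and reports the remaining 4.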

3 years ago[OpaquePtr] Make atomicrmw work with opaque pointers
Arthur Eubanks [Tue, 25 May 2021 19:36:25 +0000 (12:36 -0700)]
[OpaquePtr] Make atomicrmw work with opaque pointers

FullTy is only necessary when we need to figure out what type an
instruction works with given a pointer's pointee type. However, we just
end up using the value operand's type, so FullTy isn't necessary.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102788

3 years agoRevert "[lldb] Avoid format string in LLDB_SCOPED_TIMER"
Jonas Devlieghere [Wed, 26 May 2021 00:21:01 +0000 (17:21 -0700)]
Revert "[lldb] Avoid format string in LLDB_SCOPED_TIMER"

Right after pushing, I remembered that this was added to silence a GCC
warning (https://reviews.llvm.org/D99120). This reverts my patch and
adds a comment.

3 years ago[lldb] Avoid format string in LLDB_SCOPED_TIMER
Jonas Devlieghere [Wed, 26 May 2021 00:12:28 +0000 (17:12 -0700)]
[lldb] Avoid format string in LLDB_SCOPED_TIMER

Pass LLVM_PRETTY_FUNCTION directly for the no-argument macro.

3 years ago[LTT] Handle merged llvm.assume when dropping type tests
Teresa Johnson [Tue, 25 May 2021 05:02:44 +0000 (22:02 -0700)]
[LTT] Handle merged llvm.assume when dropping type tests

When the lower type test pass is invoked a second time with
DropTypeTests set to true, it expects that all remaining type tests feed
assume instructions, which are removed along with the type tests.

In some cases the llvm.assume might have been merged with another one,
i.e. from a builtin_assume instruction, in which case the type test
would actually feed a phi that in turn feeds the merged assume
instruction. In this case we can simply replace that operand of the phi
with "true" before removing the type test.

Differential Revision: https://reviews.llvm.org/D103073

3 years ago[OpaquePtr] Create new bitcode encoding for atomicrmw
Arthur Eubanks [Tue, 25 May 2021 22:31:38 +0000 (15:31 -0700)]
[OpaquePtr] Create new bitcode encoding for atomicrmw

Since the opaque pointer type won't contain the pointee type, we need to
separately encode the value type for an atomicrmw.

Emit this new code for atomicrmw.

Handle this new code and the old one in the bitcode reader.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103123

3 years ago[sanitizer] Let glibc aarch64 use O(1) GetTls
Fangrui Song [Tue, 25 May 2021 23:28:17 +0000 (16:28 -0700)]
[sanitizer] Let glibc aarch64 use O(1) GetTls

The generic approach can still be used by musl and FreeBSD. Note: on glibc
2.31, TLS_PRE_TCB_SIZE is 0x700, larger than ThreadDescriptorSize() by 16, but
this is benign: as long as the range includes pthread::{specific_1stblock,specific}
pthread_setspecific will not cause false positives.

Note: the state before afec953857ffd682cb4119e7950f3593efbaaa81 underestimated
the TLS size a lot (nearly ThreadDescriptorSize() = 1776).
That may explain why afec953857ffd682cb4119e7950f3593efbaaa81 actually made some
tests pass.

3 years agoLLVM Detailed IR tests for introduction of flag -fsanitize-address-detect-stack-use...
Kevin Athey [Thu, 13 May 2021 18:41:43 +0000 (11:41 -0700)]
LLVM Detailed IR tests for introduction of flag -fsanitize-address-detect-stack-use-after-return-mode.

Rework all tests that interact with use after return to correctly handle the case where the mode has been explicitly set to Never or Always.

for issue: https://github.com/google/sanitizers/issues/1394

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D102462

3 years ago[benchmark] Silence 'suggest override' and 'missing override' warnings
Alexandre Ganea [Tue, 25 May 2021 22:03:55 +0000 (18:03 -0400)]
[benchmark] Silence 'suggest override' and 'missing override' warnings

When building with Clang 11 on Windows, silence the following:

F:\aganea\llvm-project\llvm\utils\benchmark\include\benchmark/benchmark.h(955,8): warning: 'Run' overrides a member function but is not marked 'override' [-Wsuggest-override]
  void Run(State& st);
       ^
F:\aganea\llvm-project\llvm\utils\benchmark\include\benchmark/benchmark.h(895,16): note: overridden virtual function is here
  virtual void Run(State& state) = 0;
               ^
1 warning generated.
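
For reference, the warning fires on code shaped like the following. One common
source-level fix is to mark the overriding function with `override`; whether
this commit does that or suppresses the warning some other way is not stated
here. The class names are stand-ins, not the benchmark library's own.

```cpp
#include <cassert>

// Stand-in base class with a pure virtual function.
struct Benchmark {
  virtual ~Benchmark() = default;
  virtual int Run(int iterations) = 0;
};

struct FunctionBenchmark : Benchmark {
  // Without `override`, -Wsuggest-override reports this declaration.
  // With it, the compiler also verifies that a base function is overridden.
  int Run(int iterations) override {
    assert(iterations >= 0);
    return iterations * 2;
  }
};

// Calls Run through the base interface to show that dispatch is unchanged.
int run_through_base(Benchmark &b, int iterations) { return b.Run(iterations); }
```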

3 years ago[gcov] Silence warning: comparison of integers of different signs
Alexandre Ganea [Tue, 25 May 2021 21:22:08 +0000 (17:22 -0400)]
[gcov] Silence warning: comparison of integers of different signs

When building with Clang 11 on Windows, silence the following:

[432/5643] Building C object projects\compiler-rt\lib\profile\CMakeFiles\clang_rt.profile-x86_64.dir\GCDAProfiling.c.obj
F:\aganea\llvm-project\compiler-rt\lib\profile\GCDAProfiling.c(464,13): warning: comparison of integers of different signs: 'uint32_t' (aka 'unsigned int') and 'int' [-Wsign-compare]
    if (val != (gcov_version >= 90 ? GCOV_TAG_OBJECT_SUMMARY
        ~~~ ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
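
A -Wsign-compare warning of this shape is usually silenced by making the
conversion explicit. The sketch below is written in C++ rather than C and uses
made-up tag constants; it shows the general technique, not necessarily the
exact change made to GCDAProfiling.c.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag values, chosen only for the example.
constexpr int kTagOld = 0xa1;
constexpr int kTagNew = 0xa5;

// Compares an unsigned value read from a file against a signed tag constant.
bool tag_matches(std::uint32_t val, int version) {
  const int expected = version >= 90 ? kTagNew : kTagOld;
  assert(expected >= 0 && "tags are non-negative, so the cast is lossless");
  // The explicit cast puts both operands in the same unsigned type, which is
  // what the implicit conversion did anyway, but without the warning.
  return val == static_cast<std::uint32_t>(expected);
}
```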

3 years ago[NFC][MLIR][TOSA] Replaced tosa linalg.indexed_generic lowerings with linalg.index
Rob Suderman [Tue, 25 May 2021 22:27:11 +0000 (15:27 -0700)]
[NFC][MLIR][TOSA] Replaced tosa linalg.indexed_generic lowerings with linalg.index

Indexed Generic should be going away in the future. Migrate to linalg.index.

Reviewed By: NatashaKnk, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D103110

3 years ago[NFC][SCUDO] Fix unittest for -gtest_repeat=10
Vitaly Buka [Tue, 25 May 2021 22:25:05 +0000 (15:25 -0700)]
[NFC][SCUDO] Fix unittest for -gtest_repeat=10

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D103122

3 years ago[MLIR Core] Cache the empty StringAttr like we do for empty dictionaries. NFC.
Chris Lattner [Tue, 25 May 2021 21:38:01 +0000 (14:38 -0700)]
[MLIR Core] Cache the empty StringAttr like we do for empty dictionaries. NFC.

MLIRContext holds a few special case values that occur frequently like empty
dictionary and NoneType, which allow us to avoid taking locks to get an instance
of them.  Give the empty StringAttr this treatment as well.  This cuts several
percent off compile time for CIRCT.

Differential Revision: https://reviews.llvm.org/D103117
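
The caching pattern described above can be sketched as follows. This is a
hypothetical stand-alone interner, not MLIRContext: the class StringInterner
and its get method are invented for illustration, and the real code uniques
attributes rather than std::string objects.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical interner: every distinct string is stored once and callers
// receive a stable pointer to the unique copy.
class StringInterner {
public:
  const std::string *get(const std::string &text) {
    if (text.empty())
      return &empty_; // pre-built and read-only, so no lock is taken
    std::lock_guard<std::mutex> lock(mutex_);
    const auto it = pool_.insert(text).first;
    assert(!it->empty());
    return &*it; // element addresses in unordered_set survive rehashing
  }

private:
  const std::string empty_; // the frequently requested special case
  std::mutex mutex_;
  std::unordered_set<std::string> pool_;
};
```

Only the general path pays for the mutex; requests for the empty string return
immediately.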

3 years ago[Toy] Update tests to pass with top-down canonicalize pass. NFC
Chris Lattner [Tue, 25 May 2021 21:50:35 +0000 (14:50 -0700)]
[Toy] Update tests to pass with top-down canonicalize pass. NFC

3 years ago[libomptarget][nfc] Move hostcall required test to rtl
Jon Chesterfield [Tue, 25 May 2021 21:43:16 +0000 (22:43 +0100)]
[libomptarget][nfc] Move hostcall required test to rtl

Remove a global, fix minor race. First of N patches to bring up hostcall.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103058

3 years ago[libc++] Install GCC 11 on CI builders
Louis Dionne [Tue, 25 May 2021 21:34:57 +0000 (17:34 -0400)]
[libc++] Install GCC 11 on CI builders

3 years ago[ARM] Extra predicated tests for VMULH. NFC
David Green [Tue, 25 May 2021 21:24:06 +0000 (22:24 +0100)]
[ARM] Extra predicated tests for VMULH. NFC

3 years ago[Internalize] Rename instead of removal if a to-be-internalized comdat has more than...
Fangrui Song [Tue, 25 May 2021 21:15:27 +0000 (14:15 -0700)]
[Internalize] Rename instead of removal if a to-be-internalized comdat has more than one member

Besides the `comdat any` deduplication feature, instrumentations use comdat to
establish dependencies among a group of sections, to prevent section based
linker garbage collection from discarding some members without discarding all.
LangRef acknowledges this usage with the following wording:

> All global objects that specify this key will only end up in the final object file if the linker chooses that key over some other key.

On ELF, for PGO instrumentation, a `__llvm_prf_cnts` section and its associated
`__llvm_prf_data` section are placed in the same GRP_COMDAT group.  A
`__llvm_prf_data` is usually not referenced and expects the liveness of its
associated `__llvm_prf_cnts` to retain it.

The `setComdat(nullptr)` code (added by D10679) in InternalizePass can break the
use case (a `__llvm_prf_data` may be dropped with its associated `__llvm_prf_cnts` retained).
The main goal of this patch is to fix the dependency relationship.

I think it makes sense for InternalizePass to internalize a comdat and thus
suppress the deduplication feature, e.g. a relocatable link of a regular LTO can
create an object file affected by InternalizePass.
If a non-internal comdat in a.o is prevailed by an internal comdat in b.o, the
a.o references to the comdat definitions will be non-resolvable (references
cannot bind to STB_LOCAL definitions in b.o).

On PE-COFF, for a non-external selection symbol, deduplication is naturally
suppressed with link.exe and lld-link. However, this is fuzzy on ELF and I tend
to believe the spec creator has not thought about this use case (see D102973).

GNU ld and gold are still using the "signature is name based" interpretation.
So even if D102973 for ld.lld is accepted, for portability, a better approach is
to rename the comdat. A comdat with a single member is the common case;
leaving the comdat can waste (sizeof(Elf64_Shdr)+4*2) bytes, so we optimize by
deleting the comdat. Otherwise we rename the comdat.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D103043

3 years agoRevert "[LoopDeletion] Break backedge if we can prove that the loop is exited on...
Matt Morehouse [Tue, 25 May 2021 20:58:13 +0000 (13:58 -0700)]
Revert "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration"

This reverts commit 2531fd70d19aa5d61feb533bbdeee7717a4129eb due to
performance regression on the PPC buildbot.

3 years ago[libc++] [P0619] Hide not1 and not2 under _LIBCPP_ENABLE_CXX20_REMOVED_NEGATORS.
Arthur O'Dwyer [Mon, 24 May 2021 22:36:17 +0000 (18:36 -0400)]
[libc++] [P0619] Hide not1 and not2 under _LIBCPP_ENABLE_CXX20_REMOVED_NEGATORS.

This also provides some of the scaffolding needed by D102992 and D101729, and mops up after D101730 etc.

Differential Revision: https://reviews.llvm.org/D103055
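
Code that still uses std::not1 or std::not2 can either define
_LIBCPP_ENABLE_CXX20_REMOVED_NEGATORS or migrate away from them. The standard
replacement is std::not_fn (C++17); a plain lambda also works and is what the
sketch below uses. The function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

bool is_even(int x) { return x % 2 == 0; }

// Counts odd values by negating is_even with a lambda, where older code
// might have used a std::not1-wrapped function object.
int count_odd(const std::vector<int> &values) {
  const auto n = std::count_if(values.begin(), values.end(),
                               [](int x) { return !is_even(x); });
  assert(n >= 0);
  return static_cast<int>(n);
}
```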

3 years ago[libcxx] Fix the function name in exceptions from create_directories
Martin Storsjö [Mon, 9 Nov 2020 09:48:21 +0000 (11:48 +0200)]
[libcxx] Fix the function name in exceptions from create_directories

If the nested create_directory call fails, we'd still want to
re-report the errors with the create_directories function name,
which is what the caller called.

This fixes one aspect from MS STL's tests for std::filesystem.

Differential Revision: https://reviews.llvm.org/D102365

3 years ago[Canonicalize] Switch the default setting to "top down".
Chris Lattner [Mon, 24 May 2021 22:45:58 +0000 (15:45 -0700)]
[Canonicalize] Switch the default setting to "top down".

This provides a sizable compile time improvement by seeding
the worklist in an order that leads to fewer iterations of the
worklist.

This patch only changes the behavior of the Canonicalize pass
itself; it does not affect other passes that use the
GreedyPatternRewrite driver.

Differential Revision: https://reviews.llvm.org/D103053

3 years ago[flang] Implement checks for defined input/output procedures
Peter Steinfeld [Mon, 24 May 2021 20:12:19 +0000 (13:12 -0700)]
[flang] Implement checks for defined input/output procedures

Defined input/output procedures are specified in 12.6.4.8.  There are different
versions for read versus write and formatted versus unformatted, but they all
share the same basic set of dummy arguments.

I added several checking functions to check-declarations.cpp along with a test.

In the process of implementing this, I noticed and fixed a typo in
.../lib/Evaluate/characteristics.cpp.

Differential Revision: https://reviews.llvm.org/D103045

3 years ago[Canonicalize] Fully parameterize the pass based on config options. NFC.
Chris Lattner [Tue, 25 May 2021 04:23:16 +0000 (21:23 -0700)]
[Canonicalize] Fully parameterize the pass based on config options. NFC.

This allows C++ clients of the Canonicalize pass to specify their own
Config option struct to control how Canonicalize works, increasing reusability.

This also allows controlling these settings for the default Canonicalize pass
using command line options.  This is useful for testing and for playing with
things on the command line.

Differential Revision: https://reviews.llvm.org/D103069

3 years ago[libcxxabi] Use ASan interface header for declaration. NFC
Shoaib Meenai [Mon, 24 May 2021 04:03:12 +0000 (21:03 -0700)]
[libcxxabi] Use ASan interface header for declaration. NFC

This was changed from using the header to using a forward declaration in
c4600ccf891c, since older versions of the header didn't declare the
function. At this point, it's been declared for ~3.5 years, and it
should be pretty safe to assume that we can rely on the ASan interface
header to provide a declaration instead of needing to write our own.

Reviewed By: #libc_abi, ldionne

Differential Revision: https://reviews.llvm.org/D103003

3 years ago[libcxx] [test] Explain an XFAIL LIBCXX-WINDOWS-FIXME and convert into UNSUPPORTED
Martin Storsjö [Wed, 12 May 2021 19:43:13 +0000 (22:43 +0300)]
[libcxx] [test] Explain an XFAIL LIBCXX-WINDOWS-FIXME and convert into UNSUPPORTED

This particular test relies on internal details from the libc++
filesystem implementation header, and those details are structured
differently in the implementation for Windows.

Differential Revision: https://reviews.llvm.org/D102357

3 years ago[libcxx] Make the visibility attributes consistent for __narrow_to_utf8/__widen_from_utf8
Martin Storsjö [Tue, 18 May 2021 14:45:08 +0000 (14:45 +0000)]
[libcxx] Make the visibility attributes consistent for __narrow_to_utf8/__widen_from_utf8

Use the same visibility attributes as for all other template
specializations in the same file; declare the specialization itself
using _LIBCPP_TYPE_VIS, and don't use _LIBCPP_EXPORTED_FROM_ABI on
the destructor. Methods that are excluded from the ABI are marked
with _LIBCPP_INLINE_VISIBILITY.

This makes the vtable exported from DLL builds of libc++. Practically,
it doesn't make any difference for the CI configuration, but it
can make a difference in mingw setups.

Differential Revision: https://reviews.llvm.org/D102717

3 years ago[docs] [CMake] Change recommendations for how to use LLVM_DEFINITIONS
Martin Storsjö [Mon, 24 May 2021 20:16:10 +0000 (23:16 +0300)]
[docs] [CMake] Change recommendations for how to use LLVM_DEFINITIONS

LLVM_DEFINITIONS is a string variable containing a list of arguments
to pass to the compiler. When CMake's add_definitions is passed a
string variable, this is interpreted as one argument. To make it
behave properly, the string variable needs to be split into a list.

Despite the fact that add_definitions isn't supposed to be used like
the LLVM docs recommended, it worked fine in practice in many cases.
If the first argument in LLVM_DEFINITIONS is of the form -DFOO=42
instead of plain -DFOO, the rest of the string is treated as the value
of this define. I.e. if LLVM_DEFINITIONS consists of `-DFOO=42 -DBAR`,
CMake ended up passing `-DFOO="42 -DBAR"` to the compiler.

See https://gitlab.kitware.com/cmake/cmake/-/issues/22162
for discussion on the matter.

Changing LLVM_DEFINITIONS to be a list variable would possibly be
more disruptive; instead keep the variable defined as before but
change the recommendation for how to use it. Then projects using it
can gradually be updated to follow the new recommendation.

Differential Revision: https://reviews.llvm.org/D103044

3 years ago[Hexagon] Remove unused function from HexagonISelDAGToDAGHVX.cpp
Krzysztof Parzyszek [Tue, 25 May 2021 19:27:02 +0000 (14:27 -0500)]
[Hexagon] Remove unused function from HexagonISelDAGToDAGHVX.cpp

It will be reintroduced shortly with an actual use.  This change is
simply to eliminate a compilation warning.

3 years ago[sanitizer][test] s/A<10>/A<7>/ to fix "WARNING: Symbolizer buffer too small" which...
Fangrui Song [Tue, 25 May 2021 19:41:07 +0000 (12:41 -0700)]
[sanitizer][test] s/A<10>/A<7>/ to fix "WARNING: Symbolizer buffer too small" which is somehow a hard error on s390x

https://reviews.llvm.org/D102046#2766553

3 years ago[docs] Explain address spaces a bit more in opaque pointers doc
Arthur Eubanks [Fri, 14 May 2021 19:13:53 +0000 (12:13 -0700)]
[docs] Explain address spaces a bit more in opaque pointers doc

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D102523

3 years ago[TSAN][CMake] Add support to run lit on individual tests
Bruno Cardoso Lopes [Mon, 24 May 2021 22:43:56 +0000 (15:43 -0700)]
[TSAN][CMake] Add support to run lit on individual tests

Handy when testing specific files, already supported in other components.

Example:
cd build; ./bin/llvm-lit ../compiler-rt/test/tsan/ignore_free.cpp

Differential Revision: https://reviews.llvm.org/D103054

3 years ago[AMDGPU] Fix unused variable warning. NFC.
Stanislav Mekhanoshin [Tue, 25 May 2021 19:25:19 +0000 (12:25 -0700)]
[AMDGPU] Fix unused variable warning. NFC.

3 years ago[NFC] Fix 'unused' warning
Vitaly Buka [Tue, 25 May 2021 19:17:33 +0000 (12:17 -0700)]
[NFC] Fix 'unused' warning

3 years ago[JITLink][MachO][arm64] Build GOT entries for defined symbols too.
Lang Hames [Tue, 25 May 2021 17:58:21 +0000 (10:58 -0700)]
[JITLink][MachO][arm64] Build GOT entries for defined symbols too.

During the generic x86-64 support refactor in ecf6466f01c52 the implementation
of MachO_arm64_GOTAndStubsBuilder::isGOTEdgeToFix was altered to only return
true for external symbols. This behavior is incorrect: GOT entries may be
required for defined symbols (e.g. in the large code model).

This patch fixes the bug and adds a test case for it (renaming an old test
case to avoid any ambiguity).

3 years ago[JITLink][MachO][arm64] Use a more descriptive test name.
Lang Hames [Tue, 25 May 2021 17:34:23 +0000 (10:34 -0700)]
[JITLink][MachO][arm64] Use a more descriptive test name.

3 years ago[mlir] Add a copy constructor to FailureOr
Mathieu Fehr [Tue, 25 May 2021 19:10:31 +0000 (12:10 -0700)]
[mlir] Add a copy constructor to FailureOr

The copy constructor was missing from FailureOr.

Note that I do not have commit access.

Differential Revision: https://reviews.llvm.org/D98955

3 years ago[Matrix] Use LLVM_DEBUG for a debug flag
Benjamin Kramer [Tue, 25 May 2021 19:07:14 +0000 (21:07 +0200)]
[Matrix] Use LLVM_DEBUG for a debug flag

dump() doesn't exist in release builds.

ld.lld: error: undefined symbol: llvm::Value::dump() const
>>> referenced by LowerMatrixIntrinsics.cpp
>>>               LowerMatrixIntrinsics.o:((anonymous namespace)::LowerMatrixIntrinsics::Visit())

3 years agoRevert "[AIX] Avoid structor alias; die before bad alias codegen"
Jake Egan [Tue, 25 May 2021 19:06:59 +0000 (15:06 -0400)]
Revert "[AIX] Avoid structor alias; die before bad alias codegen"

Avoiding structor alias is no longer needed because AIX now has an alias implementation here: https://reviews.llvm.org/D83252.

This reverts commit b116ded57da3530e661f871f4191c59cd9e091cd.

Reviewed By: jasonliu

Differential Revision: https://reviews.llvm.org/D102724

3 years ago[SCEV] Cache operands used in BEInfo (NFC)
Nikita Popov [Tue, 18 May 2021 22:00:17 +0000 (00:00 +0200)]
[SCEV] Cache operands used in BEInfo (NFC)

When memoized values for a SCEV expression are dropped, we also
drop all BECounts that make use of the SCEV expression. This is done
by iterating over all the ExitNotTaken counts and (recursively)
checking whether they use the SCEV expression. If there are many
exits, this will take a lot of time.

This patch improves the situation by pre-computing a set of all
used operands, so that we can determine whether a certain BEInfo
needs to be invalidated using a simple set lookup. We still need
to loop over all BEInfos, though.
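
The idea can be sketched in standalone C++ as follows. This is an
illustration only, not the ScalarEvolution code: Expr and CountInfo are
hypothetical stand-ins for a SCEV expression and a BEInfo, and the set is
filled once when the record is built.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical expression node; stands in for a SCEV expression.
struct Expr {
  std::vector<const Expr *> operands;
};

// Old approach: walk the expression tree on every query.
bool usesRecursive(const Expr *root, const Expr *target) {
  assert(root && "expression must not be null");
  if (root == target)
    return true;
  for (const Expr *op : root->operands)
    if (usesRecursive(op, target))
      return true;
  return false;
}

// Hypothetical cached record; stands in for a BEInfo.
struct CountInfo {
  std::unordered_set<const Expr *> usedOperands;

  // Pre-compute the set of all expressions reachable from the exit counts.
  explicit CountInfo(const std::vector<const Expr *> &exitCounts) {
    for (const Expr *e : exitCounts)
      collect(e);
  }

  // New approach: a single set lookup per query.
  bool uses(const Expr *target) const {
    return usedOperands.count(target) != 0;
  }

private:
  void collect(const Expr *e) {
    if (!usedOperands.insert(e).second)
      return; // already visited
    for (const Expr *op : e->operands)
      collect(op);
  }
};
```

The trade-off is extra memory per record in exchange for a cheap lookup
when deciding whether a record has to be invalidated.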

This makes for a mild improvement on non-degenerate cases:
https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions

For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384,
for n=128 I'm seeing run time drop from 1.6s to 1.1s.

Differential Revision: https://reviews.llvm.org/D102796

3 years ago[gn build] Port 33706191d88d
LLVM GN Syncbot [Tue, 25 May 2021 18:58:50 +0000 (18:58 +0000)]
[gn build] Port 33706191d88d

3 years ago[lld-macho][nfc] Remove unnecessary parameterization of section sort
Jez Ng [Tue, 25 May 2021 18:57:18 +0000 (14:57 -0400)]
[lld-macho][nfc] Remove unnecessary parameterization of section sort

As @alexshap pointed out [here](https://reviews.llvm.org/D102972#inline-975208),
it's a bit confusing to have the option to sort OutputSections with any
comparator when in practice we only use one.

Reviewed By: #lld-macho, alexshap, thakis

Differential Revision: https://reviews.llvm.org/D102974

3 years ago[lld-macho][nfc] Sort OutputSections based on explicit order of command-line inputs
Jez Ng [Tue, 25 May 2021 18:57:17 +0000 (14:57 -0400)]
[lld-macho][nfc] Sort OutputSections based on explicit order of command-line inputs

This diff paves the way for {D102964} which adds a new kind of
InputSection.

We previously maintained section ordering implicitly: we created
InputSections as we parsed each file in command-line order, and passed
on this ordering when we created OutputSections and OutputSegments by
iterating over these InputSections. The implicitness of the ordering
made it difficult to refactor the code to e.g. handle a new type of
InputSection. As such, I've codified the ordering explicitly via
`inputOrder` fields. This also allows us to use `sort` instead of
`stable_sort`.
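
A minimal sketch of why an explicit order field makes `sort` sufficient,
assuming a hypothetical Section record (only the `inputOrder` name comes
from the description above; `priority` is an invented primary key):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical section record for illustration.
struct Section {
  std::string name;
  int priority;   // invented primary sort key
  int inputOrder; // position in which the section was first seen
};

void sortSections(std::vector<Section> &sections) {
  // inputOrder breaks ties between equal priorities. If every section has a
  // unique inputOrder, no two elements compare equivalent, so std::sort
  // gives a deterministic result without the guarantees of std::stable_sort.
  auto byPriorityThenInput = [](const Section &a, const Section &b) {
    if (a.priority != b.priority)
      return a.priority < b.priority;
    return a.inputOrder < b.inputOrder;
  };
  std::sort(sections.begin(), sections.end(), byPriorityThenInput);
  assert(std::is_sorted(sections.begin(), sections.end(), byPriorityThenInput));
}
```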

Benchmarking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.23          4.35          4.27         4.274   0.030157481
  +  20          4.24          4.38          4.27        4.2815   0.033759989
  No difference proven at 95.0% confidence

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D102972

3 years ago[lld-macho][nfc] Rename MergedOutputSection to ConcatOutputSection
Jez Ng [Tue, 25 May 2021 18:57:16 +0000 (14:57 -0400)]
[lld-macho][nfc] Rename MergedOutputSection to ConcatOutputSection

The ELF format has the concept of merge sections (marked by SHF_MERGE),
which contain data that can be safely deduplicated. The Mach-O
equivalents are called literal sections (marked by S_CSTRING_LITERALS or
S_{4,8,16}BYTE_LITERALS). While the Mach-O format doesn't use the word
'merge', to avoid confusion, I've renamed our MergedOutputSection to
ConcatOutputSection. I believe it's a more descriptive name too.

This renaming sets the stage for {D102964}.

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D102971

3 years ago[lld-macho][nfc] clang-format everything
Jez Ng [Tue, 25 May 2021 18:58:06 +0000 (14:58 -0400)]
[lld-macho][nfc] clang-format everything

3 years ago[lld-macho][nfc] Misc code cleanup
Jez Ng [Tue, 25 May 2021 18:57:58 +0000 (14:57 -0400)]
[lld-macho][nfc] Misc code cleanup

* Move `static_asserts` into cpp instead of header file. I noticed they
  had been separated from the main class definition in the header, so I
  set about to clean that up, then figured it made more sense as part of
  the cpp file so as not to incur unnecessary compile-time overhead.

* Remove unnecessary `virtual`s

* Remove unnecessary comment / reword another comment

3 years agoRevert "[NFC][scudo] Let disableMemoryTagChecksTestOnly to fail"
Vitaly Buka [Tue, 25 May 2021 18:35:12 +0000 (11:35 -0700)]
Revert "[NFC][scudo] Let disableMemoryTagChecksTestOnly to fail"

This reverts commit 2c212db4ea42fbbc0e83647da4f62261f775388b.

It's not needed.

3 years ago[CVP] Guard against poison in common phi value transform (PR50399)
Nikita Popov [Sat, 22 May 2021 09:19:30 +0000 (11:19 +0200)]
[CVP] Guard against poison in common phi value transform (PR50399)

The common phi value transform replaces constants with values that
have the same value as the constant on a given edge. However, LVI
generally only provides information that is correct up to poison,
so this can end up replacing a well-defined value with poison.
D69442 addressed an instance of this problem by clearing poison
flags on the generating instruction, which was sufficient at the
time. rGa917fb89dc28 made LVI's edge value analysis slightly more
powerful, and clearing poison flags is no longer sufficient.

This patch changes the transform to explicitly guard against
a poison value instead. This should be satisfied in most cases due
to a prior branch on poison.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50399.

Differential Revision: https://reviews.llvm.org/D102966

3 years ago[SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics.
Michael Liao [Thu, 6 May 2021 05:10:28 +0000 (01:10 -0400)]
[SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics.

- When memory intrinsics, such as memcpy, are lowered, the attached
  scoped AA metadata is not passed down to the backend. As a result, the
  backend cannot schedule relevant memory operations around them following
  that hint. In this patch, SelectionDAG is enhanced to propagate that
  metadata (scoped AA only) when they are lowered into loads and stores.

Differential Revision: https://reviews.llvm.org/D102215

3 years agoAdd pre-commit tests for [D102215](https://reviews.llvm.org/D102215).
Michael Liao [Tue, 25 May 2021 17:56:43 +0000 (13:56 -0400)]
Add pre-commit tests for [D102215](https://reviews.llvm.org/D102215).

3 years ago[mlir] Use unique_function in AbstractOperation fields
Mathieu Fehr [Tue, 25 May 2021 18:36:04 +0000 (11:36 -0700)]
[mlir] Use unique_function in AbstractOperation fields

Currently, AbstractOperation fields are function pointers.
Modifying them to unique_function allows them to contain
runtime information.

For instance, this allows operations to be defined at runtime.
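
The difference can be sketched in standalone C++. std::function is used
here as a standard-library stand-in (llvm::unique_function additionally
accepts move-only callables), and StaticOpInfo, DynamicOpInfo and
makeOpWithMaxOperands are hypothetical names, not MLIR APIs.

```cpp
#include <cassert>
#include <functional>

// Before: a plain function pointer field. Only captureless callables fit.
struct StaticOpInfo {
  bool (*verify)(int);
};

// After: an owning callable wrapper that can carry captured state.
struct DynamicOpInfo {
  std::function<bool(int)> verify;
};

// Hypothetical helper: describe an operation whose operand limit is only
// known at runtime. The capturing lambda cannot convert to a function
// pointer, but it can be stored in the wrapper.
DynamicOpInfo makeOpWithMaxOperands(int maxOperands) {
  return DynamicOpInfo{[maxOperands](int n) { return n <= maxOperands; }};
}

// Hypothetical helper: run the stored callback.
bool runVerify(const DynamicOpInfo &info, int numOperands) {
  assert(info.verify && "callback must be set");
  return info.verify(numOperands);
}
```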

Differential Revision: https://reviews.llvm.org/D103031

3 years ago[AMDGPU] Lower kernel LDS into a sorted structure
Stanislav Mekhanoshin [Wed, 19 May 2021 20:39:55 +0000 (13:39 -0700)]
[AMDGPU] Lower kernel LDS into a sorted structure

Differential Revision: https://reviews.llvm.org/D102954

3 years ago[InstSimplify] allow undef element match in vector select condition value
Sanjay Patel [Tue, 25 May 2021 18:18:07 +0000 (14:18 -0400)]
[InstSimplify] allow undef element match in vector select condition value

The semantics of select with undefined/poison condition
are not explicitly stated in the LangRef, but this matches
comments in the code and Alive2 appears to concur:
https://alive2.llvm.org/ce/z/KXytmd

We can find this pattern after demanded elements transforms.

As noted in D101191, fuzzers are finding infinite loops because
we may not account for this pattern in other passes.

3 years ago[mlir][doc] Fix links and references in documentation of Tutorials
Markus Böck [Tue, 25 May 2021 18:18:15 +0000 (20:18 +0200)]
[mlir][doc] Fix links and references in documentation of Tutorials

This patch is the third in a series of patches fixing markdown links and references inside the mlir documentation.

This patch addresses all broken references to other markdown files and sections inside the Tutorials folder.

Differential Revision: https://reviews.llvm.org/D103017

3 years ago[Matrix] Factor and distribute transposes across multiplies
Adam Nemet [Tue, 18 May 2021 16:59:07 +0000 (09:59 -0700)]
[Matrix] Factor and distribute transposes across multiplies

Now that we can fold some transposes into multiplies (CM: A * B^t and RM:
A^t * B), we want to move them around to create the optimal expressions:

* fold away double transposes while still using them to assert the shape
* sink transposes hoping they cancel out
* lift transposes when both operands are transposed
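
The algebraic identities behind these rewrites can be checked with a small
standalone C++ sketch. The Matrix alias and the helpers are hypothetical and
unrelated to the LowerMatrixIntrinsics code; they assume non-empty,
rectangular inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Transpose of a non-empty rectangular matrix.
Matrix transpose(const Matrix &m) {
  Matrix t(m[0].size(), std::vector<int>(m.size()));
  for (std::size_t i = 0; i < m.size(); ++i)
    for (std::size_t j = 0; j < m[0].size(); ++j)
      t[j][i] = m[i][j];
  return t;
}

// Plain matrix product of non-empty rectangular matrices.
Matrix multiply(const Matrix &a, const Matrix &b) {
  assert(a[0].size() == b.size() && "inner dimensions must match");
  Matrix c(a.size(), std::vector<int>(b[0].size(), 0));
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t k = 0; k < b.size(); ++k)
      for (std::size_t j = 0; j < b[0].size(); ++j)
        c[i][j] += a[i][k] * b[k][j];
  return c;
}
```

With these helpers, transpose(transpose(A)) equals A (double transposes fold
away), transpose(multiply(A, B)) equals multiply(transpose(B), transpose(A))
(sinking), and multiply(transpose(A), transpose(B)) equals
transpose(multiply(B, A)) (lifting when both operands are transposed).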

This also modifies the matrix remarks to include the number of exposed
transposes (i.e. transposes that we couldn't fold into a multiply).

The adjustment to the test remarks-inlining is a bit subtle: I am changing the
double transpose to a single transpose so that we don't remove it completely.
More importantly, this changes some of the total instruction counts, most
notably stores, because we can no longer use a vector store.

Differential Revision: https://reviews.llvm.org/D102733

3 years ago[mlir] Add an optional distributionTypes attribute to TiledLoopOp.
Alexander Belyaev [Tue, 25 May 2021 17:53:02 +0000 (19:53 +0200)]
[mlir] Add an optional distributionTypes attribute to TiledLoopOp.

Differential Revision: https://reviews.llvm.org/D103104

3 years ago[LoopIdiom] 'arithmetic right-shift until zero': don't turn potentially infinite...
Roman Lebedev [Tue, 25 May 2021 17:47:53 +0000 (20:47 +0300)]
[LoopIdiom] 'arithmetic right-shift until zero': don't turn potentially infinite loops into finite ones

Nowadays LLVM does not assume that all loops are finite,
so if we want to produce a finite loop from a potentially-infinite one,
we must ensure that the original loop is known to be a finite one.

For this transform, it only matters for arithmetic right-shifts.
For them, either the function or the loop must be known to
be `mustprogress`, or the original value being shifted must be known
to be non-negative (because iff the sign bit was set,
it will never become zero, but will become `-1` in the "end").
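
A small standalone C++ sketch of the sign-bit argument. reachesZero is a
hypothetical helper that simulates the loop `while (x != 0) x >>= 1;` with
an iteration cap. It assumes an arithmetic right shift for negative values,
which C++20 guarantees and which mainstream compilers implement for earlier
standards.

```cpp
#include <cassert>
#include <cstdint>

// Simulate `while (x != 0) x >>= 1;` for at most maxSteps iterations and
// report whether x reached zero within that budget.
bool reachesZero(std::int32_t x, int maxSteps) {
  assert(maxSteps >= 0 && "step budget must be non-negative");
  for (int i = 0; i < maxSteps && x != 0; ++i)
    x >>= 1; // arithmetic shift: the sign bit is replicated
  return x == 0;
}
```

A non-negative input reaches zero after at most 31 shifts, while a negative
input gets stuck at -1, so the original loop never terminates for it.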

It would be really good for alive2 to actually complain about this,
but it currently does not: https://github.com/AliveToolkit/alive2/issues/726

3 years ago[scudo] Fix CHECK implementation
Vitaly Buka [Tue, 25 May 2021 01:12:08 +0000 (18:12 -0700)]
[scudo] Fix CHECK implementation

Cast of signed types to u64 breaks comparison.
Also remove double () around operands.
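
The comparison problem can be reproduced in a few lines of standalone C++.
lessViaU64 and lessDirect are hypothetical helpers, not the scudo macros;
they only contrast casting both operands to an unsigned 64-bit type with
comparing the operands in their own types.

```cpp
#include <cassert>
#include <cstdint>

// Problematic pattern: cast both operands to an unsigned 64-bit type first.
// A negative value wraps around to a huge unsigned value.
template <typename A, typename B> bool lessViaU64(A a, B b) {
  return static_cast<std::uint64_t>(a) < static_cast<std::uint64_t>(b);
}

// Comparing in the operands' own types keeps the sign.
template <typename A, typename B> bool lessDirect(A a, B b) { return a < b; }
```

For the signed operands -1 and 1, lessDirect reports that -1 is smaller,
whereas lessViaU64 does not, because -1 converts to the largest 64-bit
unsigned value.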

Reviewed By: cryptoad, hctim

Differential Revision: https://reviews.llvm.org/D103060

3 years ago[scudo] Consistent setting of SCUDO_DEBUG
Vitaly Buka [Tue, 25 May 2021 01:35:14 +0000 (18:35 -0700)]
[scudo] Consistent setting of SCUDO_DEBUG

Make sure that if SCUDO_DEBUG=1 in tests,
then we have the same in the scudo
library itself.

Reviewed By: cryptoad, hctim

Differential Revision: https://reviews.llvm.org/D103061

3 years ago[Hexagon] Improve argument packing in vector shuffle selection
Krzysztof Parzyszek [Tue, 4 May 2021 14:26:31 +0000 (09:26 -0500)]
[Hexagon] Improve argument packing in vector shuffle selection

3 years ago[mlir][linalg] Update Linalg.md (NFC).
Tobias Gysi [Tue, 25 May 2021 17:23:45 +0000 (17:23 +0000)]
[mlir][linalg] Update Linalg.md (NFC).

Update the paragraph on generic / indexed_generic to reflect the unification of these operations.

Differential Revision: https://reviews.llvm.org/D102775

3 years ago[CSSPGO][llvm-profgen] Change default cold threshold for context merging
Wenlei He [Tue, 25 May 2021 04:17:17 +0000 (21:17 -0700)]
[CSSPGO][llvm-profgen] Change default cold threshold for context merging

llvm-profgen uses a profile-summary-based cold threshold to merge and trim cold context profiles. This is to strike a good balance between profile size and performance.

We've been using 99.9% as the cutoff to save profile size without affecting performance. This change switches to using 99.9% instead of 99.9999% as the default cold threshold cutoff for llvm-profgen.

The redundant switch csprof-cold-thres is also removed and the tests are cleaned up.

Differential Revision: https://reviews.llvm.org/D103071