platform/upstream/llvm.git
3 years ago[mlir] Normalize dynamic memrefs with a map of tiled-layout.
Haruki Imai [Mon, 24 May 2021 03:04:45 +0000 (08:34 +0530)]
[mlir] Normalize dynamic memrefs with a map of tiled-layout.

Steps for normalizing dynamic memrefs for tiled layout map
1. Check if original map is tiled layout. Only tiled layout is supported.
2. Create normalized memrefType. Dimensions that include dynamic dimensions
   in the map output will be dynamic dimensions.
3. Create new maps to calculate each dimension size of new memref.
   In tiled layout, the dimension size can be calculated by replacing
    "floordiv <tile size>" with "ceildiv <tile size>" and
    "mod <tile size>" with "<tile size>".
4. Create AffineApplyOp to apply the new maps. The output of AffineApplyOp is
   dynamicSizes for new AllocOp.
5. Add the new dynamic sizes in new AllocOp.
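
The size rule in step 3 can be sketched in a few lines of Python. This is only an illustration, not the MLIR implementation: the layout map `(d0, d1) -> (d0, d1 floordiv 32, d1 mod 32)`, the tile size 32, and the helper names are assumptions chosen for the example.

```python
def ceildiv(a, b):
    """Ceiling division for non-negative a and positive b."""
    return -(-a // b)


def normalized_tiled_shape(d0, d1, tile=32):
    # Hypothetical tiled layout map:
    #   (d0, d1) -> (d0, d1 floordiv tile, d1 mod tile)
    # Size rule from the commit message:
    #   "floordiv <tile size>" becomes "ceildiv <tile size>"
    #   "mod <tile size>"      becomes "<tile size>"
    return (d0, ceildiv(d1, tile), tile)


# A 4 x 100 memref needs 4 tiles of 32 along the second dimension.
print(normalized_tiled_shape(4, 100))
```

With dynamic dimensions the same arithmetic would be evaluated at run time, which is the role the message gives to AffineApplyOp.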

This patch also sets the MemRefsNormalizable trait on CastOp and DimOp since
they are used with dynamic memrefs.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D97655

3 years ago[Attributor][FIX] Account for undef in the constant value lattice
Johannes Doerfert [Sat, 8 May 2021 04:45:07 +0000 (23:45 -0500)]
[Attributor][FIX] Account for undef in the constant value lattice

The constant value lattice looks like this

```
  <None>
     |
  <undef>
  /  |   \
... <0>  ...
 \   |   /
 <unknown>
```
We did not account for the undef and assumed a value meant we could not
change anymore. Now we actually check if we have the same value as
before, which will signal CHANGED to the users when we go from undef to
a specific constant.
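
A toy model of the lattice may make the fix easier to follow. It is a sketch under assumptions, not Attributor code: the state names, the `merge` function, and the use of plain integers for constants are all hypothetical.

```python
NONE, UNDEF, UNKNOWN = "none", "undef", "unknown"


def merge(state, value):
    """Fold `value` (UNDEF, UNKNOWN, or an int constant) into `state`.

    Returns (new_state, changed). `changed` compares the old and new
    state instead of assuming that any value is already final.
    """
    if state == UNKNOWN or value == UNKNOWN:
        new = UNKNOWN
    elif state in (NONE, UNDEF):
        new = value            # undef may still become a constant
    elif value == UNDEF or value == state:
        new = state            # undef folds into the known constant
    else:
        new = UNKNOWN          # two different constants
    return new, new != state


print(merge(UNDEF, 0))  # moving from undef to a constant is a change
```

The point of interest is the second branch: a state of undef is not final, so seeing a constant afterwards must be reported as a change.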

This fixes, among other things, the bug exposed by @ipccp4 in
`value-simplify.ll`.

3 years ago[Attributor][FIX] Ensure we replace undef if we see the first "real" value
Johannes Doerfert [Thu, 6 May 2021 21:51:37 +0000 (16:51 -0500)]
[Attributor][FIX] Ensure we replace undef if we see the first "real" value

The state of AAPotentialValues tracks if undef is contained. It should
fold undef into the first non-undef value. However we missed a case
before. There was also a shadowing definition of two variables that
caused trouble. The test exposes both problems.

3 years ago[Attributor][NFC] Precommit test case with branch on undef
Johannes Doerfert [Fri, 21 May 2021 17:43:15 +0000 (12:43 -0500)]
[Attributor][NFC] Precommit test case with branch on undef

This test exposes a bug in the module pass as it simplifies ipccp4 to
unreachable, which is unfortunately wrong.

3 years ago[Attributor][NFC] Add helpful debug outputs
Johannes Doerfert [Sun, 16 May 2021 01:16:39 +0000 (20:16 -0500)]
[Attributor][NFC] Add helpful debug outputs

3 years ago[Attributor][NFC] Clang format the Attributor source files
Johannes Doerfert [Thu, 6 May 2021 16:27:06 +0000 (11:27 -0500)]
[Attributor][NFC] Clang format the Attributor source files

3 years ago[Attributor][NFC] Rerun update_test_checks script on Attributor tests
Johannes Doerfert [Sun, 16 May 2021 01:32:59 +0000 (20:32 -0500)]
[Attributor][NFC] Rerun update_test_checks script on Attributor tests

3 years ago[Debug-Info] handle DW_TAG_rvalue_reference_type at strict DWARF.
Chen Zheng [Thu, 20 May 2021 09:52:34 +0000 (05:52 -0400)]
[Debug-Info] handle DW_TAG_rvalue_reference_type at strict DWARF.

When -gstrict-dwarf is specified, generate DW_TAG_rvalue_reference_type
only at DWARF 4 or above.

Reviewed By: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D100630

3 years ago[NFC] Removing leftover debug code
Fady Ghanim [Sun, 23 May 2021 23:13:09 +0000 (19:13 -0400)]
[NFC] Removing leftover debug code

Removing a missed value::dump() used to debug during development of
OMPBuilder atomic.

3 years ago[AArch64] Delete unneeded fixup_aarch64_ldr_pcrel_imm19 VK_GOT special case
Fangrui Song [Sun, 23 May 2021 22:20:56 +0000 (15:20 -0700)]
[AArch64] Delete unneeded fixup_aarch64_ldr_pcrel_imm19 VK_GOT special case

An AArch64 VK_GOT fixup must have a symbol. MCAssembler::evaluateFixup considers
such a fixup not resolved. The code path cannot trigger.

3 years ago[OpenMP][OMPIRBuilder]Adding support for `omp atomic`
Fady Ghanim [Thu, 6 May 2021 21:23:28 +0000 (17:23 -0400)]
[OpenMP][OMPIRBuilder]Adding support for `omp atomic`

This patch adds support for generating `omp atomic` for all the different
atomic clauses.

3 years ago[NFC][scudo] Enforce header size alignment
Vitaly Buka [Sun, 23 May 2021 21:12:49 +0000 (14:12 -0700)]
[NFC][scudo] Enforce header size alignment

As is, this should not change the struct size, but it will
help keep the size correct if more fields are added.

3 years ago[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFil...
Philipp Krones [Sun, 23 May 2021 21:15:23 +0000 (14:15 -0700)]
[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo

This makes it possible for targets to define their own MCObjectFileInfo.
This MCObjectFileInfo is then used to determine things like section alignment.

This is a follow up to D101462 and prepares for the RISCV backend defining the
text section alignment depending on the enabled extensions.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101921

3 years ago[libc++] use more early returns for consistency
Joerg Sonnenberger [Sun, 23 May 2021 20:55:45 +0000 (22:55 +0200)]
[libc++] use more early returns for consistency

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D96983

3 years ago[LoopUnroll] Add test for partial unrolling against non-latch exit (NFC)
Nikita Popov [Sun, 23 May 2021 21:08:32 +0000 (23:08 +0200)]
[LoopUnroll] Add test for partial unrolling against non-latch exit (NFC)

This test case would get miscompiled by the current version of
D102982, because unrolling does not respect the PreserveCondBr
flag for partial unrolling.

3 years ago[IR] Add a Location to BlockArgument
Chris Lattner [Sun, 23 May 2021 21:08:31 +0000 (14:08 -0700)]
[IR] Add a Location to BlockArgument

This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).

This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).

The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode).  This is complete for generic operations, but requires manual
adoption for custom ops.

I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor.  I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.

This is a reapply of the patch here: https://reviews.llvm.org/D102567
with an additional change: we now never defer block argument locations,
guaranteeing that we can round trip correctly.

This isn't required in all cases, but allows us to hill climb here and
works around unrelated bugs like https://bugs.llvm.org/show_bug.cgi?id=50451

Differential Revision: https://reviews.llvm.org/D102991

3 years ago[AArch64][MC] Remove unneeded "in .xxx directive" from diagnostics
Fangrui Song [Sun, 23 May 2021 20:58:16 +0000 (13:58 -0700)]
[AArch64][MC] Remove unneeded "in .xxx directive" from diagnostics

The prevailing style does not add the message. The directive name is not useful
because the next line replicates the error line which includes the directive.

3 years ago[SPARC] recognize the "rd %pc, reg" special form
Joerg Sonnenberger [Sun, 23 May 2021 20:52:59 +0000 (22:52 +0200)]
[SPARC] recognize the "rd %pc, reg" special form

Differential Revision: https://reviews.llvm.org/D96312

3 years agoNFC: cleaned up and renamed scalable-vf-analysis.ll -> scalable-vectorization.ll
Sander de Smalen [Sun, 23 May 2021 13:41:13 +0000 (14:41 +0100)]
NFC: cleaned up and renamed scalable-vf-analysis.ll -> scalable-vectorization.ll

* Removes unnecessary loop hints.
* Use RUN line with '-scalable-vectorization=preferred' instead of 'on'
  for the maximize-bandwidth behaviour. This prepares the test for enabling
  scalable vectorization; With a forced instruction-cost of 1, 'on' will
  always favour fixed-width VF to be chosen, whereas with 'preferred'
  we can check that the maximize-bandwidth option in combination with
  scalable-vectorization=preferred actually picks a scalable VF.
* Renamed to scalable-vectorization.ll, because a follow-up patch will
  test more than just analysis.

3 years ago[NFC][X86][Costmodel] Add tests with masked loads/stores w/non-power-of-two...
Roman Lebedev [Sun, 23 May 2021 18:42:30 +0000 (21:42 +0300)]
[NFC][X86][Costmodel] Add tests with masked loads/stores w/non-power-of-two vectors

3 years ago[AArch64] Use \t in AsmStreamer to match the prevailing style
Fangrui Song [Sun, 23 May 2021 18:35:42 +0000 (11:35 -0700)]
[AArch64] Use \t in AsmStreamer to match the prevailing style

3 years ago[mlir][doc] Fix links and indentation of mlir::ModuleOp description
Markus Böck [Sun, 23 May 2021 18:00:44 +0000 (20:00 +0200)]
[mlir][doc] Fix links and indentation of mlir::ModuleOp description

All lines after the first are currently indented by one char further to the left than the first line. This leads to the first character of each sentence being cut from the resulting Markdown file after compilation. The text also contains 3 references to sections of other markdown files. One was missing the file, while the other two had outdated files, leading to 404 errors in the documentation.

Differential Revision: https://reviews.llvm.org/D102983

3 years agoFix bugs URL for PR relocations
Simon Pilgrim [Sun, 23 May 2021 16:19:36 +0000 (17:19 +0100)]
Fix bugs URL for PR relocations

The PR works from llvm.org, not bugs.llvm.org

3 years ago[CostModel][X86] Align v2i64 MUL costs on SSE42+ targets with worst case
Simon Pilgrim [Sun, 23 May 2021 09:29:34 +0000 (10:29 +0100)]
[CostModel][X86] Align v2i64 MUL costs on SSE42+ targets with worst case

Based on worst case of sandybridge (which seems to match nehalem for this SSE sequence) (vs btver2 + bdver2) llvm-mca analysis

3 years ago[gn build] (semi-manually) port 0bccdf82f705
Nico Weber [Sun, 23 May 2021 14:01:06 +0000 (10:01 -0400)]
[gn build] (semi-manually) port 0bccdf82f705

3 years ago[InstSimplify] add more tests for rem-mul-div; NFC
Sanjay Patel [Sun, 23 May 2021 13:42:35 +0000 (09:42 -0400)]
[InstSimplify] add more tests for rem-mul-div; NFC

See D102864 for discussion.

3 years ago[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
maekawatoshiki [Sun, 23 May 2021 13:32:01 +0000 (22:32 +0900)]
[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass

This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149

3 years ago[LoopUnroll] Add test for unrollable non-latch multi-exit (NFC)
Nikita Popov [Sun, 23 May 2021 08:49:11 +0000 (10:49 +0200)]
[LoopUnroll] Add test for unrollable non-latch multi-exit (NFC)

This test case requires unrolling against a non-latch exit in
a multiple-exit loop with an exiting latch. It's not covered by
existing heuristics or the extension in D102635.

3 years ago[ARM] Add extra debug messages for gather/scatter lowering. NFC
David Green [Sun, 23 May 2021 07:52:13 +0000 (08:52 +0100)]
[ARM] Add extra debug messages for gather/scatter lowering. NFC

3 years ago[NFC][scudo] Replace size_t with uptr
Vitaly Buka [Sun, 23 May 2021 05:55:53 +0000 (22:55 -0700)]
[NFC][scudo] Replace size_t with uptr

3 years ago[NFC][scudo] Add releasePagesToOS test
Vitaly Buka [Sun, 23 May 2021 05:30:03 +0000 (22:30 -0700)]
[NFC][scudo] Add releasePagesToOS test

3 years ago[NFC][scudo] Move SKIP_ON_FUCHSIA to common header
Vitaly Buka [Sun, 23 May 2021 05:28:54 +0000 (22:28 -0700)]
[NFC][scudo] Move SKIP_ON_FUCHSIA to common header

3 years ago[ELF][test] Avoid local signature symbols for section groups to match reality
Fangrui Song [Sun, 23 May 2021 00:48:45 +0000 (17:48 -0700)]
[ELF][test] Avoid local signature symbols for section groups to match reality

If we support local signature symbols (PR43094), these tests would fail.

When the support is added, new tests (local signature symbol specific) should be developed.

3 years agoRevert "[Driver] Support libc++ in MSVC"
Petr Hosek [Sat, 22 May 2021 22:49:46 +0000 (15:49 -0700)]
Revert "[Driver] Support libc++ in MSVC"

This reverts commit b604301be3559fb85a11779db79fc9bda4b62bce since
it caused compilation failure in sanitizer_unwind_win.cpp when using
the runtimes build.

3 years ago[Windows] Use TerminateProcess to exit without running destructors
Martin Storsjö [Fri, 21 May 2021 20:50:01 +0000 (23:50 +0300)]
[Windows] Use TerminateProcess to exit without running destructors

If exiting using _Exit or ExitProcess, DLLs are still unloaded
cleanly before exiting, running destructors and other cleanup in those
DLLs. When the caller expects to exit without cleanup, running
destructors in some loaded DLLs (which can be either libLLVM.dll or
e.g. libc++.dll) can cause deadlocks occasionally.

This is an alternative to D102684.

Differential Revision: https://reviews.llvm.org/D102944

3 years ago[MinGW] Mark a number of library functions unavailable for mingw targets
Martin Storsjö [Thu, 20 May 2021 21:47:39 +0000 (00:47 +0300)]
[MinGW] Mark a number of library functions unavailable for mingw targets

These functions were marked unavailable for MSVC targets before,
within a "T.isOSWindows() && !T.isOSCygMing()" block, but these ones
are unavailable on MinGW targets too.

This avoids generating calls to stpcpy for MinGW targets, which has
been happening since 6dbf0cfcf789365493f70ae69df8a7a59be41c75 (in
some cases).

This fixes https://github.com/mstorsjo/llvm-mingw/issues/201.

Differential Revision: https://reviews.llvm.org/D102946

3 years ago[Driver] Support libc++ in MSVC
Petr Hosek [Wed, 28 Apr 2021 18:23:54 +0000 (11:23 -0700)]
[Driver] Support libc++ in MSVC

This implements support for using libc++ headers and library in the MSVC
toolchain.  We only support libc++ that is a part of the toolchain, and
not headers installed elsewhere on the system.

Differential Revision: https://reviews.llvm.org/D101479

3 years ago[CostModel][X86] Align v4i64 MUL costs on AVX1 targets with worst case
Simon Pilgrim [Sat, 22 May 2021 19:07:55 +0000 (20:07 +0100)]
[CostModel][X86] Align v4i64 MUL costs on AVX1 targets with worst case

Based on worst case of sandybridge (vs btver2 + bdver2) llvm-mca analysis - which is a lot less than what we were predicting (I think based off total uop count).

3 years ago[IR] Optimize no-op removal from AttributeList (NFC)
Nikita Popov [Sat, 22 May 2021 13:03:29 +0000 (15:03 +0200)]
[IR] Optimize no-op removal from AttributeList (NFC)

When removing an AttrBuilder from an index of an AttributeList,
directly return the original list if no attributes were actually
removed.

3 years ago[IR] Optimize no-op removal from AttributeSet (NFC)
Nikita Popov [Sat, 22 May 2021 16:53:17 +0000 (18:53 +0200)]
[IR] Optimize no-op removal from AttributeSet (NFC)

When removing an AttrBuilder from an AttributeSet, first check
whether there is any overlap. If nothing is being removed, we can
directly return the original set.
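
The early-return idea can be sketched as follows. This is an illustration only: attributes are modelled as strings in a `frozenset`, and `remove_attributes` is a hypothetical name, not the LLVM API.

```python
def remove_attributes(attrs, to_remove):
    """Return `attrs` without the members of `to_remove`.

    If the two sets do not overlap, nothing would be removed, so the
    original object is returned and no new set is built.
    """
    if attrs.isdisjoint(to_remove):
        return attrs
    return attrs - to_remove


original = frozenset({"nounwind", "readonly"})
same = remove_attributes(original, frozenset({"noreturn"}))
print(same is original)  # the no-op removal hands back the same object
```

Returning the original object also lets callers detect the no-op case with a cheap identity comparison.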

3 years ago[CostModel][X86] Pull out X86/X64 scalar int arithmetic costs from SSE tables. NFCI.
Simon Pilgrim [Fri, 21 May 2021 18:02:45 +0000 (19:02 +0100)]
[CostModel][X86] Pull out X86/X64 scalar int arithmetic costs from SSE tables. NFCI.

These aren't dependent on any SSE level (and don't tend to get quicker either).

3 years ago[ORC] Add more synchronization to TestLookupWithUnthreadedMaterialization.
Lang Hames [Sat, 22 May 2021 14:55:12 +0000 (07:55 -0700)]
[ORC] Add more synchronization to TestLookupWithUnthreadedMaterialization.

Don't run tasks until their corresponding thread has been added to the running
threads vector. This is an extension to fda4300da82, which doesn't seem to have
been enough to fix the synchronization issues on its own.

3 years ago[JITLink] Move some Block bitfields into Addressable to improve packing.
Lang Hames [Sat, 22 May 2021 04:48:28 +0000 (21:48 -0700)]
[JITLink] Move some Block bitfields into Addressable to improve packing.

Moving these bitfields from Block to Addressable allows them to be packed with
the bitfields at the end of Addressable, reducing the size of Block by eight
bytes.

3 years ago[HIP] support ThinLTO
Yaxun (Sam) Liu [Mon, 29 Mar 2021 19:09:00 +0000 (15:09 -0400)]
[HIP] support ThinLTO

Add options -f[no-]offload-lto and -foffload-lto=[thin,full] for controlling
LTO for offload compilation. Allow LTO for AMDGPU target.

The AMDGPU target does not support codegen of object files containing
calls to external functions, therefore the LLVM module passed to the
AMDGPU backend needs to contain definitions of all the callees.
An LLVM option is added to allow the function importer to import
functions with the noinline attribute.

HIP toolchain passes proper LLVM options to lld to make sure
function importer imports definitions of all the callees.

Reviewed by: Teresa Johnson, Artem Belevich

Differential Revision: https://reviews.llvm.org/D99683

3 years ago[mlir][linalg][nfc] Fix signed/unsigned comparison warning in header
Butygin [Wed, 19 May 2021 19:04:29 +0000 (22:04 +0300)]
[mlir][linalg][nfc] Fix signed/unsigned comparison warning in header

Differential Revision: https://reviews.llvm.org/D102968

3 years agoReapply [InstCombine] Fold multiuse shr eq zero
Nikita Popov [Mon, 19 Apr 2021 20:09:15 +0000 (22:09 +0200)]
Reapply [InstCombine] Fold multiuse shr eq zero

This was reverted due to performance regressions in ARM benchmarks,
which have since been addressed by D101196 (SCEV analysis improvement)
and D101778 (CGP reverse transform).

-----

The single-use case is handled implicitly by converting the icmp
into a mask check first. When comparing with zero in particular,
we don't need the one-use restriction, as we only produce a single
icmp.
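
The equivalence behind the fold can be checked exhaustively for a small bit width. The sketch below assumes a logical (unsigned) right shift and assumes that the single icmp is an unsigned `x < (1 << c)` comparison; both are stated here for illustration and are not taken from the patch itself.

```python
BITS = 8  # small width so that every input can be enumerated


def shr_eq_zero(x, c):
    # Original pattern: (x >> c) == 0, with a logical shift.
    return (x >> c) == 0


def single_compare(x, c):
    # Assumed folded form: one unsigned comparison against 2**c.
    return x < (1 << c)


# Check every 8-bit value against every in-range shift amount.
equivalent = all(
    shr_eq_zero(x, c) == single_compare(x, c)
    for x in range(1 << BITS)
    for c in range(BITS)
)
print(equivalent)
```

Because the folded form needs only the comparison, the shift can stay in place for its other users without adding instructions.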

https://alive2.llvm.org/ce/z/MSixcm
https://alive2.llvm.org/ce/z/GwpG0M

3 years ago[ARM] Clean up some tests, removing dead instructions. NFC
David Green [Sat, 22 May 2021 12:38:00 +0000 (13:38 +0100)]
[ARM] Clean up some tests, removing dead instructions. NFC

3 years ago[mlir][SCF] Canonicalize nested ParallelOp's
Butygin [Wed, 19 May 2021 19:04:29 +0000 (22:04 +0300)]
[mlir][SCF] Canonicalize nested ParallelOp's

Differential Revision: https://reviews.llvm.org/D102799

3 years ago[CostModel][X86] vXi8 MUL is always promoted to vXi16
Simon Pilgrim [Sat, 22 May 2021 10:56:38 +0000 (11:56 +0100)]
[CostModel][X86] vXi8 MUL is always promoted to vXi16

3 years ago[MLIR][GPU] Add CUDA Tensor core WMMA test
Navdeep Kumar [Sat, 22 May 2021 10:46:08 +0000 (16:16 +0530)]
[MLIR][GPU] Add CUDA Tensor core WMMA test

Add a test case to test the complete execution of WMMA ops on an NVIDIA
GPU with tensor cores. These tests are enabled under
MLIR_RUN_CUDA_TENSOR_CORE_TESTS.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D95334

3 years ago[MLIR] Drop stale reference to mlir-edsc-builder-api-test
Uday Bondhugula [Sat, 22 May 2021 10:37:30 +0000 (16:07 +0530)]
[MLIR] Drop stale reference to mlir-edsc-builder-api-test

Drop stale reference to mlir-edsc-builder-api-test.

Differential Revision: https://reviews.llvm.org/D102967

3 years ago[Matrix] Bail out early if there are no matrix intrinsics.
Florian Hahn [Sat, 22 May 2021 10:26:19 +0000 (11:26 +0100)]
[Matrix] Bail out early if there are no matrix intrinsics.

If there are no matrix intrinsics in a function, we can directly bail
out, as there's nothing left to do.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D102931

3 years ago[CostModel][X86] Add test coverage for sub-64bit vXi8 multiplication costs
Simon Pilgrim [Sat, 22 May 2021 10:33:36 +0000 (11:33 +0100)]
[CostModel][X86] Add test coverage for sub-64bit vXi8 multiplication costs

These can be cheaply promoted to a single v8i16 vector for multiplication

3 years ago[CostModel][X86] Improve v8i32 MUL costs on AVX1 targets to account for slower btver2
Simon Pilgrim [Sat, 22 May 2021 10:11:38 +0000 (11:11 +0100)]
[CostModel][X86] Improve v8i32 MUL costs on AVX1 targets to account for slower btver2

BTVER2 has a 2 cycle throughput for v4i32 multiplies (same as SSE41 targets), which is only partially hidden by the subvector extracts/insert when splitting v8i32.

3 years ago[mlir] ConvertStandardToLLVM: make AllocLikeOpLowering public
Butygin [Wed, 19 May 2021 19:04:29 +0000 (22:04 +0300)]
[mlir] ConvertStandardToLLVM: make AllocLikeOpLowering public

It is useful for someone who wants to implement a custom AllocOp LLVM lowering.

Differential Revision: https://reviews.llvm.org/D102932

3 years ago[Demangle][Rust] Parse function signatures
Tomasz Miąsko [Sat, 22 May 2021 09:36:59 +0000 (11:36 +0200)]
[Demangle][Rust] Parse function signatures

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102581

3 years ago[Demangle][Rust] Parse references
Tomasz Miąsko [Sat, 22 May 2021 09:36:53 +0000 (11:36 +0200)]
[Demangle][Rust] Parse references

Reviewed By: dblaikie

Part of https://reviews.llvm.org/D102580

3 years ago[Demangle][Rust] Parse raw pointers
Tomasz Miąsko [Sat, 22 May 2021 09:36:40 +0000 (11:36 +0200)]
[Demangle][Rust] Parse raw pointers

Reviewed By: dblaikie

Part of https://reviews.llvm.org/D102580

3 years ago[CVP] Add test for PR50399 (NFC)
Nikita Popov [Sat, 22 May 2021 09:21:02 +0000 (11:21 +0200)]
[CVP] Add test for PR50399 (NFC)

3 years agoReland [X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()
Roman Lebedev [Sat, 22 May 2021 08:47:08 +0000 (11:47 +0300)]
Reland [X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()

Now that getMemoryOpCost() correctly handles all the vector variants,
we should no longer hand-roll our own version of it, but use it directly.

The AVX512 variant probably needs a similar change,
but there it is less obvious.

This was initially landed in 69ed93a4355123a45c1d7216aea7cd53d07a361b,
but was reverted in 6b95fd199d96e3ba5c28a23b17b74203522bdaa8
because the patch it depends on was reverted.

3 years agoReland [X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again
Roman Lebedev [Sat, 22 May 2021 08:40:58 +0000 (11:40 +0300)]
Reland [X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again

Instead of handling power-of-two sized vector chunks,
try handling the large vector in a stream mode,
decreasing the operational vector size
once it no longer works for the elements left to process.
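
The "stream mode" described above can be sketched as a greedy split. This models only the chunking, not any cost: the function name, the byte-based interface, and the 32-byte starting size (standing in for a YMM register) are assumptions for the example and do not reproduce the X86TTIImpl code.

```python
def stream_chunks(total_bytes, max_chunk=32):
    """Split `total_bytes` into power-of-two chunks, largest first.

    The operational size starts at `max_chunk` (a power of two) and is
    halved only when it no longer fits the bytes left to process.
    """
    chunks = []
    size = max_chunk
    remaining = total_bytes
    while remaining > 0:
        while size > remaining:
            size //= 2
        chunks.append(size)
        remaining -= size
    return chunks


print(stream_chunks(56))  # [32, 16, 8]
```

A 56-byte vector is handled as one 32-byte, one 16-byte, and one 8-byte operation, so the size only ever shrinks as the stream runs out of elements.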

Notably, this improves costs for overaligned loads - loading padding is fine.
This more directly tracks when we need to insert/extract the YMM/XMM subvector,
some costs fluctuate because of that.

This was initially landed in c02476f3158f2908ef0a6f628210b5380bd33695,
but reverted in 5fddc3312bad7e62493f1605385fad5e589e6450,
because the code made some very optimistic assumptions about invariants
that didn't hold in practice.

Reviewed By: RKSimon, ABataev

Differential Revision: https://reviews.llvm.org/D100684

3 years ago[SelectionDAG] Fix argument copy elision with irregular types
LemonBoy [Sat, 22 May 2021 07:40:00 +0000 (09:40 +0200)]
[SelectionDAG] Fix argument copy elision with irregular types

D29668 made it possible to avoid a useless copy of the argument value into an alloca when the caller places it in memory (as often happens on x86), by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced, the upper bits are filled with garbage and the resulting code misbehaves at runtime.
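
As a rough illustration of what "irregular" means here, the sketch below flags integer widths whose in-memory size is larger than their bit width. It models bit-level padding only, the helper names are hypothetical, and it is not the check used in the patch.

```python
def store_size_in_bits(bit_width):
    # Memory is written in whole bytes, so round up to a multiple of 8.
    return (bit_width + 7) // 8 * 8


def has_padding_bits(bit_width):
    # A type is irregular in this sketch if storing it writes more
    # bits than the value itself carries.
    return store_size_in_bits(bit_width) != bit_width


for width in (1, 17, 32):
    print(width, has_padding_bits(width))
```

For a type such as i1 or i17, the extra bits in memory are not guaranteed to hold anything meaningful, which is why forwarding the caller's pointer is unsafe.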

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D102153

3 years ago[ConstantFolding] Use APFloat for constant folding. NFC
Serge Pavlov [Tue, 4 May 2021 04:46:54 +0000 (11:46 +0700)]
[ConstantFolding] Use APFloat for constant folding. NFC

Replace uses of host floating-point types with operations on APFloat where
possible. Using APFloat makes analysis more convenient and facilitates
constant folding under a non-default FP environment.

Differential Revision: https://reviews.llvm.org/D102672

3 years ago[Polly] Avoid compiler warning. NFC.
Michael Kruse [Fri, 21 May 2021 22:38:25 +0000 (17:38 -0500)]
[Polly] Avoid compiler warning. NFC.

Avoid the warning

    /polly/lib/Support/RegisterPasses.cpp:833:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
      default:
      ^

since all cases are now handled.

Thanks to Luke Benes for reporting.

3 years ago[ORC] Check for underflow on SymbolStringPtr ref-counts.
Lang Hames [Sat, 22 May 2021 04:10:56 +0000 (21:10 -0700)]
[ORC] Check for underflow on SymbolStringPtr ref-counts.

3 years ago[ORC] Fix debugging output: printDescription should not have a newline.
Lang Hames [Sat, 22 May 2021 03:51:33 +0000 (20:51 -0700)]
[ORC] Fix debugging output: printDescription should not have a newline.

3 years ago[ORC] Fix race condition in CoreAPIsTest.
Lang Hames [Sat, 22 May 2021 03:50:22 +0000 (20:50 -0700)]
[ORC] Fix race condition in CoreAPIsTest.

This test has been failing intermittently on some builders, probably due to a
race on the WorkThreads vector. This patch should fix that.

3 years ago[docs] ld.lld.1: Mention -z nostart-stop-gc
Fangrui Song [Sat, 22 May 2021 02:57:51 +0000 (19:57 -0700)]
[docs] ld.lld.1: Mention -z nostart-stop-gc

3 years ago[UpdateTestChecks] Default --x86_scrub_rip to False
Fangrui Song [Sat, 22 May 2021 02:26:15 +0000 (19:26 -0700)]
[UpdateTestChecks] Default --x86_scrub_rip to False

True is a bad default: the useful symbol names and `@GOTPCREL` are scrubbed.

Change the default and add global variable tests to x86-basic.ll
(renamed from x86_function_name.ll since we now also test variables).
I updated some tests to show the differences.

Updated LCPI regex to include Darwin style `LCPI_[0-9]+_[0-9]+` (no
leading dot).

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D102588

3 years ago[flang] Fix symbol table bugs with ENTRY statements
peter klausler [Fri, 21 May 2021 21:50:29 +0000 (14:50 -0700)]
[flang] Fix symbol table bugs with ENTRY statements

Dummy arguments of ENTRY statements in execution parts were
not being created as objects, nor were they being implicitly
typed.

When the symbol corresponding to an alternate ENTRY point
already exists (by that name) due to having been referenced
in an earlier call, name resolution used to delete the extant
symbol.  This isn't the right thing to do -- the extant
symbol will be pointed to by parser::Name nodes in the parse
tree while no longer being part of any Scope.

Differential Revision: https://reviews.llvm.org/D102948

3 years ago[ORC][C-bindings] Replace LLVMOrcJITTargetMachineBuilderDisposeTargetTriple.
Lang Hames [Sat, 22 May 2021 00:35:23 +0000 (17:35 -0700)]
[ORC][C-bindings] Replace LLVMOrcJITTargetMachineBuilderDisposeTargetTriple.

The implementation and intent behind freeing the triple string here are the same
as for LLVMGetDefaultTargetTriple (and any other owned C string returned from the C
API), so we should use LLVMDisposeMessage to free the string for
consistency.

Patch by Mats Larsen -- thanks Mats!

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D102957

3 years agoRevert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes"
Arthur Eubanks [Fri, 21 May 2021 23:14:08 +0000 (16:14 -0700)]
Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes"

This reverts commit d14d84af2f5ebb8ae2188ce6884a29a586dc0a40.

Causes unacceptable memory regressions.

3 years agoRevert "[NPM] Do not run function simplification pipeline unnecessarily"
Arthur Eubanks [Fri, 21 May 2021 23:12:03 +0000 (16:12 -0700)]
Revert "[NPM] Do not run function simplification pipeline unnecessarily"

This reverts commit 97ab068034161fb35e5c9a7b293bf1e569cf077b.

Depends on D100917, which is to be reverted.

3 years ago[NewPM] Mark BitcodeWriter as required.
Eli Friedman [Fri, 21 May 2021 22:43:23 +0000 (15:43 -0700)]
[NewPM] Mark BitcodeWriter as required.

The textual IR writer has an equivalent marking.  It looks like this got
missed in e6ea877.

3 years ago[NFC][sanitizer] Remove unused variable
Vitaly Buka [Fri, 21 May 2021 23:02:06 +0000 (16:02 -0700)]
[NFC][sanitizer] Remove unused variable

3 years ago[lit] Print full googletest command line
Vitaly Buka [Fri, 21 May 2021 18:27:44 +0000 (11:27 -0700)]
[lit] Print full googletest command line

Similar to regular output of LIT tests:
https://github.com/llvm/llvm-project/blob/c162f086ba632ffaedfe92d63bf21571bc8ae4da/llvm/utils/lit/lit/TestRunner.py#L1569

Differential Revision: https://reviews.llvm.org/D102899

3 years ago[IR] make stack-protector-guard-* flags into module attrs
Nick Desaulniers [Fri, 21 May 2021 22:53:21 +0000 (15:53 -0700)]
[IR] make stack-protector-guard-* flags into module attrs

D88631 added initial support for:

- -mstack-protector-guard=
- -mstack-protector-guard-reg=
- -mstack-protector-guard-offset=

flags, and D100919 extended these to AArch64. Unfortunately, these flags
aren't retained for LTO. Make them module attributes rather than
TargetOptions.

Link: https://github.com/ClangBuiltLinux/linux/issues/1378
Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D102742

3 years ago[mlir][docs] Add memref and sparse_tensor to Passes.md
Andrew Young [Fri, 21 May 2021 21:33:34 +0000 (14:33 -0700)]
[mlir][docs] Add memref and sparse_tensor to Passes.md

These pass documents belong on the main pass page, and not generated as
top level pages.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D102947

3 years ago[lld][WebAssembly] Fix for PIC output + TLS + non-shared-memory
Sam Clegg [Fri, 21 May 2021 16:58:21 +0000 (09:58 -0700)]
[lld][WebAssembly] Fix for PIC output + TLS + non-shared-memory

Prior to this change, building with `-shared/-pie` and using TLS (but
without -shared-memory) would hit this assert:

  "Currenly only a single data segment is supported in PIC mode"

This is because we were not including TLS data when merging data
segments.  However, when we build without shared-memory (i.e.  without
threads) we effectively lower away TLS into a normal active data
segment, so we were ending up with two active data segments: the merged
data, and the lowered TLS data.

To fix this problem we can instead avoid combining data segments at
all when running in shared memory mode (because in this case all
segment initialization is passive).  And then in non-shared memory
mode we know that TLS has been lowered and therefore we can
and should combine all segments.

So with this new behavior we have two different modes:

1. With shared memory / multi-threaded: Never combine data segments
   since it is not necessary.  (All data segments are passive already).

2. Without shared memory / single-threaded: Combine *all* data segments
   since we treat TLS as normal data.  (We end up with a single
   active data segment).
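
The two modes can be sketched as a single decision. This is a simplified model, not lld code: segments are represented by name only, and `layout_data_segments` and the joined-name convention are hypothetical.

```python
def layout_data_segments(segments, shared_memory):
    """Return (name, kind) pairs for the output data segments."""
    if shared_memory:
        # Mode 1: never combine; every segment is already passive.
        return [(name, "passive") for name in segments]
    # Mode 2: TLS has been lowered to normal data, so combine
    # everything into a single active segment.
    return [("+".join(segments), "active")]


inputs = [".data", ".tdata"]
print(layout_data_segments(inputs, shared_memory=True))
print(layout_data_segments(inputs, shared_memory=False))
```

Either way, the PIC restriction of at most one active data segment holds: zero active segments in the first mode, exactly one in the second.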

Differential Revision: https://reviews.llvm.org/D102937

3 years ago[compiler-rt][profile] Explicitly specify PROFILE_SOURCES extensions. NFC
Jon Roelofs [Fri, 21 May 2021 21:43:34 +0000 (14:43 -0700)]
[compiler-rt][profile] Explicitly specify PROFILE_SOURCES extensions. NFC

3 years ago[NFC][HIP] fix comments in __clang_hip_cmath.h
Yaxun (Sam) Liu [Fri, 21 May 2021 21:43:28 +0000 (17:43 -0400)]
[NFC][HIP] fix comments in __clang_hip_cmath.h

3 years ago[NFC][sanitizer] Fix android bot after D102815
Vitaly Buka [Fri, 21 May 2021 21:04:36 +0000 (14:04 -0700)]
[NFC][sanitizer] Fix android bot after D102815

https://lab.llvm.org/buildbot/#/builders/77/builds/6519

3 years ago[clang] Don't pass multiple backend options if mixing -mimplicit-it and -Wa,-mimplicit-it
Martin Storsjö [Wed, 19 May 2021 21:11:52 +0000 (00:11 +0300)]
[clang] Don't pass multiple backend options if mixing -mimplicit-it and -Wa,-mimplicit-it

If multiple instances of the -arm-implicit-it option are passed to
the backend, it errors out.

Also fix cases where there are multiple -Wa,-mimplicit-it; the existing
tests indicate that the last one specified takes effect, while in
practice it passed double options, which didn't work as intended.
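
A minimal sketch of the "last one wins, emit a single backend option" idea
is shown below. It is not the clang driver's code: the helper name
`implicitItBackendOptions` is hypothetical, it only handles the simple
`-Wa,-mimplicit-it=<value>` form (not longer comma-separated -Wa lists),
and letting the last occurrence win across both spellings is an assumption
made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: collapses every implicit-it request on the command
// line into at most one -arm-implicit-it backend option.
std::vector<std::string>
implicitItBackendOptions(const std::vector<std::string> &args) {
  const std::string direct = "-mimplicit-it=";
  const std::string viaAssembler = "-Wa,-mimplicit-it=";

  bool seen = false;
  std::string value;
  for (const std::string &arg : args) {
    if (arg.compare(0, viaAssembler.size(), viaAssembler) == 0) {
      value = arg.substr(viaAssembler.size()); // later occurrences override
      seen = true;
    } else if (arg.compare(0, direct.size(), direct) == 0) {
      value = arg.substr(direct.size());
      seen = true;
    }
  }

  std::vector<std::string> backendOptions;
  if (seen)
    backendOptions.push_back("-arm-implicit-it=" + value);
  return backendOptions;
}
```

Collecting the value first and emitting the option once afterwards is what
keeps the backend from seeing the option twice.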

Differential Revision: https://reviews.llvm.org/D102812

3 years agoRISCV: add a few deprecated aliases for CSRs
Saleem Abdulrasool [Wed, 5 May 2021 16:14:51 +0000 (09:14 -0700)]
RISCV: add a few deprecated aliases for CSRs

This adds the {s,u,m}badaddr CSR aliases as well as the sptbr alias.
These are for compatibility with binutils.  Furthermore, these are used
by the RISC-V Proxy Kernel and are required to enable building the Proxy
Kernel with the LLVM IAS.

The aliases here are deprecated.  These are being introduced in order to
provide a compatibility story for the existing GNU toolchain, which
still supports the deprecated spelling in the assembler.  However, in
order to encourage the migration of existing code, we provide warnings
indicating that the aliased CSRs are deprecated and should be replaced.
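
As a rough illustration of "accept the old spelling, but warn", a lookup
could look like the sketch below. This is not the LLVM assembler's
implementation: `CsrLookup` and `resolveCsrName` are hypothetical names,
the warning text is made up, and the replacement names (utval, stval,
mtval, satp) reflect the renames in the RISC-V privileged specification
as understood here, not anything stated in this commit message.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>

// Hypothetical result type: the CSR name to use, plus whether the spelling
// that was written is a deprecated alias.
struct CsrLookup {
  std::string name;
  bool deprecated;
};

CsrLookup resolveCsrName(const std::string &spelling) {
  // Deprecated alias -> current name (assumed mapping, see lead-in).
  static const std::map<std::string, std::string> deprecatedAliases = {
      {"ubadaddr", "utval"},
      {"sbadaddr", "stval"},
      {"mbadaddr", "mtval"},
      {"sptbr", "satp"},
  };

  auto it = deprecatedAliases.find(spelling);
  if (it == deprecatedAliases.end())
    return {spelling, false};

  // Still accepted for compatibility, but nudge users to migrate.
  std::cerr << "warning: '" << spelling << "' is a deprecated alias for '"
            << it->second << "'\n";
  return {it->second, true};
}
```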

Differential Revision: https://reviews.llvm.org/D101919
Reviewed By: Craig Topper

3 years ago[OpenMP] libomp: move warnings to after library initialization
AndreyChurbanov [Fri, 21 May 2021 20:46:21 +0000 (23:46 +0300)]
[OpenMP] libomp: move warnings to after library initialization

Warnings on deprecated API cannot be suppressed if the library is not initialized.
With this change it is possible to set KMP_WARNINGS=false to suppress the warnings.

Differential Revision: https://reviews.llvm.org/D102676

3 years ago[LLD][COFF] PR49068: Include the IMAGE_REL_BASED_HIGHLOW relocation base type when...
Axel Y. Rivera [Fri, 21 May 2021 20:31:03 +0000 (23:31 +0300)]
[LLD][COFF] PR49068: Include the IMAGE_REL_BASED_HIGHLOW relocation base type when the machine is 64 bits and the relocation type is ADDR32

The COFF driver produces an ABSOLUTE relocation base for an ADDR32
relocation type when the system is 64 bits (machine=AMD64). The
relocation information won't be added to the output, which could
produce an incorrect address access at run time. This change
checks whether the relocation type is IMAGE_REL_AMD64_ADDR32 and,
if so, adds the relocated symbol as an IMAGE_REL_BASED_HIGHLOW base.
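
The selection described above can be sketched as below. This is an
illustration rather than the lld source: `baseRelocTypeForAmd64` is a
hypothetical name, the constants are written out with the values given in
the PE/COFF specification, and the ADDR64 case is included only as an
assumption about the surrounding logic.

```cpp
#include <cassert>
#include <cstdint>

// Relocation types for AMD64 objects (PE/COFF specification values).
constexpr std::uint16_t IMAGE_REL_AMD64_ADDR64 = 0x0001;
constexpr std::uint16_t IMAGE_REL_AMD64_ADDR32 = 0x0002;

// Base relocation types (PE/COFF specification values).
constexpr std::uint8_t IMAGE_REL_BASED_ABSOLUTE = 0; // no fixup is applied
constexpr std::uint8_t IMAGE_REL_BASED_HIGHLOW = 3;  // 32-bit fixup
constexpr std::uint8_t IMAGE_REL_BASED_DIR64 = 10;   // 64-bit fixup

// Hypothetical helper: picks the base relocation type for an AMD64
// relocation.
std::uint8_t baseRelocTypeForAmd64(std::uint16_t relocType) {
  switch (relocType) {
  case IMAGE_REL_AMD64_ADDR64:
    return IMAGE_REL_BASED_DIR64;
  case IMAGE_REL_AMD64_ADDR32:
    // The case this change adds: a 32-bit absolute address needs a fixup.
    return IMAGE_REL_BASED_HIGHLOW;
  default:
    return IMAGE_REL_BASED_ABSOLUTE;
  }
}
```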

Differential Revision: https://reviews.llvm.org/D96619

3 years ago[Verifier] Move some atomicrmw/cmpxchg checks to instruction creation
Arthur Eubanks [Wed, 19 May 2021 20:02:29 +0000 (13:02 -0700)]
[Verifier] Move some atomicrmw/cmpxchg checks to instruction creation

These checks already exist as asserts when creating the corresponding
instruction. Anybody creating these instructions already needs to take
care to not break these checks.

Move the checks for success/failure ordering in cmpxchg from the
verifier to the LLParser and BitcodeReader plus an assert.

Add some tests for cmpxchg ordering. The .bc files are created from the
.ll files with an llvm-as with these checks disabled.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102803

3 years ago[libcxx][gardening] Re-order includes across libcxx.
zoecarver [Fri, 21 May 2021 20:13:17 +0000 (13:13 -0700)]
[libcxx][gardening] Re-order includes across libcxx.

This commit alphabetizes all includes in libcxx. This is an NFC change.

This can also serve as a pseudo "announcement" for how we should order these headers going forward (note: double underscores go before other headers).

Differential Revision: https://reviews.llvm.org/D102941

3 years ago[InstSimplify] add tests for rem-of-mul; NFC
David Goldblatt [Fri, 21 May 2021 19:13:56 +0000 (15:13 -0400)]
[InstSimplify] add tests for rem-of-mul; NFC

These are baseline tests for D102864

3 years ago[mlir][sparse] add full dimension ordering support
Aart Bik [Fri, 21 May 2021 18:52:34 +0000 (11:52 -0700)]
[mlir][sparse] add full dimension ordering support

This revision completes the "dimension ordering" feature
of sparse tensor types that enables the programmer to
define a preferred order on dimension access (other than
the default left-to-right order). This enables e.g. selection
of column-major over row-major storage for sparse matrices,
but generalized to any rank, as in:

dimOrdering = affine_map<(i,j,k,l,m,n,o,p) -> (p,o,j,k,i,l,m,n)>
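
To make the effect of such a map concrete, the sketch below applies the
ordering above to an index tuple. It is not MLIR code: `kDimOrdering` and
`applyOrdering` are hypothetical names, and the permutation is written by
hand as "result r reads input dimension perm[r]", with i..p numbered 0..7.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// (i,j,k,l,m,n,o,p) -> (p,o,j,k,i,l,m,n), with i=0, j=1, ..., p=7.
constexpr std::array<std::size_t, 8> kDimOrdering = {7, 6, 1, 2, 0, 3, 4, 5};

// Reorders an index tuple so that position r holds the index of the
// dimension that the ordering places at level r.
template <std::size_t N>
std::array<long, N> applyOrdering(const std::array<std::size_t, N> &perm,
                                  const std::array<long, N> &indices) {
  std::array<long, N> ordered{};
  for (std::size_t r = 0; r < N; ++r) {
    assert(perm[r] < N && "ordering must be a permutation of 0..N-1");
    ordered[r] = indices[perm[r]];
  }
  return ordered;
}
```

For a matrix, the ordering {1, 0} corresponds to (i,j) -> (j,i), i.e.
visiting the column index first.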

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102856

3 years ago[NFC][lit] Add missing UNRESOLVED test
Vitaly Buka [Fri, 21 May 2021 18:27:44 +0000 (11:27 -0700)]
[NFC][lit] Add missing UNRESOLVED test

D102899 will change its behaviour.

3 years ago[NFC][lit] Add skipped test into upstream format
Vitaly Buka [Fri, 21 May 2021 18:22:31 +0000 (11:22 -0700)]
[NFC][lit] Add skipped test into upstream format

Missing from D102694

3 years ago[nfc][lit] Relax spacing check
Vitaly Buka [Fri, 21 May 2021 18:20:34 +0000 (11:20 -0700)]
[nfc][lit] Relax spacing check

3 years ago[gn build] Port 9db55b314b5b
LLVM GN Syncbot [Fri, 21 May 2021 18:10:35 +0000 (18:10 +0000)]
[gn build] Port 9db55b314b5b

3 years ago[libcxx][ranges] Add ranges::data CPO.
zoecarver [Fri, 23 Apr 2021 22:22:18 +0000 (15:22 -0700)]
[libcxx][ranges] Add ranges::data CPO.

This is the second to last one! Based on D101396. Depends on D100255. Refs D101079 and D101193.

Differential Revision: https://reviews.llvm.org/D101476

3 years ago[Matrix] Remove unused matrix-propagate-shape option.
Florian Hahn [Fri, 21 May 2021 18:01:54 +0000 (19:01 +0100)]
[Matrix] Remove unused matrix-propagate-shape option.

The option was used during the initial bringup, but it does not add any
value at this point. Remove it.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D102930

3 years agoprecommit tests for D102934 and D102928
Philip Reames [Fri, 21 May 2021 17:22:11 +0000 (10:22 -0700)]
precommit tests for D102934 and D102928

3 years ago[scudo] Try to re-enable the test on arm
Vitaly Buka [Thu, 20 May 2021 23:32:09 +0000 (16:32 -0700)]
[scudo] Try to re-enable the test on arm

It's probably fixed by D102886.

Builder to watch https://lab.llvm.org/buildbot/#/builders/clang-cmake-armv7-full

Reviewed By: hctim, cryptoad

Differential Revision: https://reviews.llvm.org/D102887

3 years ago[libomptarget] Fix a bug whereby firstprivates are not copied over to the device
George Rokos [Thu, 20 May 2021 23:25:54 +0000 (16:25 -0700)]
[libomptarget] Fix a bug whereby firstprivates are not copied over to the device

The check for the TO flag when processing firstprivates is missing. As a result,
sometimes the device copy of a firstprivate never gets initialized. Currently we
try to force lambda structs to be allocated immediately by marking them as a
non-firstprivate, so that PrivateArgumentManagerTy::addArg allocates memory for
them immediately. However, calling addArg with IsFirstPrivate=false makes the
function skip initializing the device copy. Whether an argument is firstprivate
and whether we need to allocate memory immediately are not synonyms, so this
patch introduces one more control variable for immediate allocation and sets it
apart from initialization.
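
The separation of the two decisions can be sketched as below. This is a
toy model, not libomptarget's code: `ArgPlan` and `planPrivateArg` are
hypothetical names, and the real PrivateArgumentManagerTy::addArg takes
more parameters than shown.

```cpp
#include <cassert>

// Hypothetical summary of what happens to one private argument.
struct ArgPlan {
  bool allocateNow;  // allocate device memory immediately
  bool copyHostData; // initialize the device copy from the host value
};

// Before the fix, a single flag drove both decisions, so forcing immediate
// allocation (by passing isFirstPrivate=false) also skipped initialization.
// With a separate control variable the two no longer interfere.
ArgPlan planPrivateArg(bool isFirstPrivate, bool allocImmediately) {
  ArgPlan plan;
  plan.allocateNow = !isFirstPrivate || allocImmediately;
  plan.copyHostData = isFirstPrivate;
  return plan;
}
```

A lambda struct would then be passed as firstprivate with immediate
allocation requested, which gives both an immediate allocation and an
initialized device copy.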

Differential Revision: https://reviews.llvm.org/D102890

3 years ago[ORC-RT] Add missing headers to CMakeLists.txt.
Lang Hames [Fri, 21 May 2021 17:17:47 +0000 (10:17 -0700)]
[ORC-RT] Add missing headers to CMakeLists.txt.