platform/upstream/llvm.git
2 years ago[mlir] Lower complex.power and complex.rsqrt to standard dialect.
bixia1 [Wed, 8 Jun 2022 16:11:28 +0000 (09:11 -0700)]
[mlir] Lower complex.power and complex.rsqrt to standard dialect.

Add conversion tests and correctness tests.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127255

2 years agoAdd Python bindings for the OpaqueType
dime10 [Wed, 8 Jun 2022 17:50:12 +0000 (19:50 +0200)]
Add Python bindings for the OpaqueType

Implement the C-API and Python bindings for the builtin opaque type, which was previously missing.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127303

2 years ago[docs][clang] Minor typo fix
Jose Manuel Monsalve Diaz [Wed, 8 Jun 2022 17:41:04 +0000 (17:41 +0000)]
[docs][clang] Minor typo fix

Changing "iamge" to "image"

2 years ago[WebAssembly] Implement remaining relaxed SIMD instructions
Thomas Lively [Wed, 8 Jun 2022 17:32:10 +0000 (10:32 -0700)]
[WebAssembly] Implement remaining relaxed SIMD instructions

Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s,
i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are
the last instructions from the relaxed SIMD proposal[1] that had not been
implemented.

[1]:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.

Differential Revision: https://reviews.llvm.org/D127170

2 years ago[CodeView] Fix incorrect CodeView encoding of signed integer constants
Steve Merritt [Mon, 23 May 2022 19:41:58 +0000 (15:41 -0400)]
[CodeView] Fix incorrect CodeView encoding of signed integer constants

Add proper CodeView encoding for positive constant integer values greater than
127.  In addition, use the two byte encoding form for positive values less
than LF_NUMERIC.
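
As a rough illustration of the rule described above, the sketch below encodes an integer constant as a CodeView numeric leaf. The `LF_*` values are the usual CodeView numeric leaf identifiers. The helper name `encode_numeric` and the "smallest leaf that fits" policy are assumptions made for illustration, not the code from this patch.

```python
import struct

# CodeView numeric leaf identifiers.
LF_NUMERIC = 0x8000
LF_CHAR = 0x8000
LF_SHORT = 0x8001
LF_USHORT = 0x8002
LF_LONG = 0x8003
LF_ULONG = 0x8004
LF_QUADWORD = 0x8009
LF_UQUADWORD = 0x800A

# (leaf id, struct format, bit width) candidates, smallest first.
SIGNED_LEAVES = ((LF_CHAR, "b", 8), (LF_SHORT, "h", 16),
                 (LF_LONG, "i", 32), (LF_QUADWORD, "q", 64))
UNSIGNED_LEAVES = ((LF_USHORT, "H", 16), (LF_ULONG, "I", 32),
                   (LF_UQUADWORD, "Q", 64))


def encode_numeric(value):
    """Encode an integer as a little-endian numeric leaf (hypothetical helper)."""
    if 0 <= value < LF_NUMERIC:
        # Non-negative values below LF_NUMERIC are stored directly in two bytes.
        return struct.pack("<H", value)
    if value < 0:
        # Negative values use the smallest signed leaf that can hold them.
        for leaf, fmt, bits in SIGNED_LEAVES:
            if value >= -(1 << (bits - 1)):
                return struct.pack("<H" + fmt, leaf, value)
    else:
        # Larger positive values use the smallest unsigned leaf.
        for leaf, fmt, bits in UNSIGNED_LEAVES:
            if value < (1 << bits):
                return struct.pack("<H" + fmt, leaf, value)
    raise ValueError("value does not fit in 64 bits")
```

Under these assumptions, a positive value such as 200 is emitted in the two-byte form instead of as a one-byte signed leaf, where it would read back as a negative number.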

Differential Revision: https://reviews.llvm.org/D126968

2 years ago[X86] Regenerate slow-pmulld.ll with common SSE check prefixes
Simon Pilgrim [Wed, 8 Jun 2022 17:17:50 +0000 (18:17 +0100)]
[X86] Regenerate slow-pmulld.ll with common SSE check prefixes

Add back some unused check prefixes to simplify the D127115 regeneration

2 years agoRevert "[libc++][CI] Updates Docker image."
Mark de Wever [Wed, 8 Jun 2022 17:16:02 +0000 (19:16 +0200)]
Revert "[libc++][CI] Updates Docker image."

This reverts commit f2f0dba818a50fc17ed309823b2fdb72cb725eec.

This Docker file doesn't work on the CI. It fails to clone the checkout.
This seems like an issue with a newer glibc on an older Docker where the
clone3() call fails.

This needs further investigation before relanding.

2 years ago[mlir] Fix handling of some region branch terminator successors
Mogball [Wed, 8 Jun 2022 00:01:44 +0000 (00:01 +0000)]
[mlir] Fix handling of some region branch terminator successors

When `RegionBranchOpInterface::getSuccessorRegions` is called for anything other than the parent op, it expects the operands of the terminator of the source region to be passed, not the operands of the parent op. This was not always respected.

This fixes a bug in integer range inference and ForwardDataFlowSolver and changes `scf.while` to allow narrowing of successors using constant inputs.

Fixes #55873

Reviewed By: mehdi_amini, krzysz00

Differential Revision: https://reviews.llvm.org/D127261

2 years ago[Clang] Fix memory leak due to TemplateArgumentListInfo used in AST node.
Andrew Browne [Fri, 3 Jun 2022 00:42:54 +0000 (17:42 -0700)]
[Clang] Fix memory leak due to TemplateArgumentListInfo used in AST node.

It looks like the leak is rooted at the allocation here:
https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp#L3857

The VarTemplateSpecializationDecl is allocated using placement new which uses the AST structure for ownership: https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/lib/AST/DeclBase.cpp#L99

The problem is the TemplateArgumentListInfo inside https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/include/clang/AST/DeclTemplate.h#L2721
This object contains a vector which does not use placement new: https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/include/clang/AST/TemplateBase.h#L564

Apparently ASTTemplateArgumentListInfo should be used instead https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/include/clang/AST/TemplateBase.h#L575

https://reviews.llvm.org/D125802#3551305

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D126944

2 years ago[libc++][NFC] Add missing 'return 0'
Louis Dionne [Wed, 8 Jun 2022 16:56:13 +0000 (12:56 -0400)]
[libc++][NFC] Add missing 'return 0'

2 years ago[mlir][sparse] Add F16 and BF16.
bixia1 [Tue, 7 Jun 2022 23:07:13 +0000 (16:07 -0700)]
[mlir][sparse] Add F16 and BF16.

This is the first PR to add `F16` and `BF16` support to the sparse codegen. Some problems remain in supporting these two data types; for example, `BF16` is not quite working yet.

Add test cases.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127010

2 years ago[X86] combineMOVMSK - constant fold with getTargetConstantBitsFromNode not just BUILD...
Simon Pilgrim [Wed, 8 Jun 2022 16:48:47 +0000 (17:48 +0100)]
[X86] combineMOVMSK - constant fold with getTargetConstantBitsFromNode not just BUILD_VECTOR

Help avoid a regression in D127115

2 years ago[libc++][NFC] Simplify enable_if for std::copy optimization
Louis Dionne [Tue, 7 Jun 2022 17:16:52 +0000 (13:16 -0400)]
[libc++][NFC] Simplify enable_if for std::copy optimization

Get rid of the __is_trivially_copy_assignable_unwrapped helper, which
is only used in one place, and use __iter_value_type instead of
iterator_traits<T>::value_type.

Differential Revision: https://reviews.llvm.org/D127230

2 years ago[flang] Add one missed semantic check for named constant in common block
PeixinQiao [Wed, 8 Jun 2022 16:43:30 +0000 (00:43 +0800)]
[flang] Add one missed semantic check for named constant in common block

Per Fortran 2018 R874, a common block object must be a variable name, which
cannot be a named constant. Add this check.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D126762

2 years ago[flang] Add one semantic check for procedure bind(C) interface-name
PeixinQiao [Wed, 8 Jun 2022 16:38:14 +0000 (00:38 +0800)]
[flang] Add one semantic check for procedure bind(C) interface-name

Per Fortran 2018 C1521, in a procedure declaration statement, if a
proc-language-binding-spec (bind(c)) is specified, the proc-interface
shall appear, it shall be an interface-name, and the interface-name shall
be declared with a proc-language-binding-spec.

Reviewed By: klausler, Jean Perier

Differential Revision: https://reviews.llvm.org/D127121

2 years ago[LIBOMPTARGET] Adding AMD to llvm-omp-device-info
Jose Manuel Monsalve Diaz [Wed, 1 Jun 2022 21:49:23 +0000 (21:49 +0000)]
[LIBOMPTARGET] Adding AMD to llvm-omp-device-info

Adding device information print for AMD devices on the
`llvm-omp-device-info` command line tool. The output is inspired by
the rocminfo command line tool.

This commit adds missing HSA functions, enums and structs
needed to query additional information from the HSA agents.
A generic message for the `generic-elf-64bit` plugin is also added.

Example of an output:
```
llvm-omp-device-info
Device (0):
    This is a generic-elf-64bit device

Device (1):
    This is a generic-elf-64bit device

Device (2):
    This is a generic-elf-64bit device

Device (3):
    This is a generic-elf-64bit device

Device (4):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           0
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (5):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           1
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (6):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           2
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (7):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           3
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE
```

Differential Revision: https://reviews.llvm.org/D126836

2 years ago[NFC][Flang][OpenMP] Refactor getting ompobject symbol
PeixinQiao [Wed, 8 Jun 2022 16:29:07 +0000 (00:29 +0800)]
[NFC][Flang][OpenMP] Refactor getting ompobject symbol

Getting ompobject symbol is needed in multiple places and will be
needed later for the lowering of other constructs/clauses such as
copyin clause. Extract them into one function.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D127280

2 years ago[libc++] Make sure we add /llvm to the list of safe directories
Louis Dionne [Wed, 8 Jun 2022 16:25:05 +0000 (12:25 -0400)]
[libc++] Make sure we add /llvm to the list of safe directories

With the new version of Git in Ubuntu Jammy (which is now what we use in
our Docker image), we need to add `/llvm` to the list of safe directories
to avoid failures.

2 years ago[AArch64] Remove ToBeRemoved from AArch64MIPeepholeOpt
David Green [Wed, 8 Jun 2022 16:26:07 +0000 (17:26 +0100)]
[AArch64] Remove ToBeRemoved from AArch64MIPeepholeOpt

ToBeRemoved is used to remove any MachineInstructions that are no
longer needed, making sure we don't invalidate the iterator that is
currently in use by erasing the instruction straight away. This causes
problems for keeping the code in SSA form though, as subsequent
transforms that require SSA form may have been broken by previous
peepholes.

If, instead, we use make_early_inc_range, the iteration issue shouldn't
be present, so long as we do not remove the subsequent instruction in
the peephole optimizations. That way the code between transforms is kept
in SSA form, hopefully meaning fewer things that can go wrong.
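
The early-increment idea can be sketched outside of LLVM. The Python below is only an analogy for `make_early_inc_range`, not LLVM code: `Node`, `InstrList`, `early_inc_range` and `erase_even` are hypothetical names, and the linked list stands in for a basic block of instructions.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class InstrList:
    """Tiny doubly linked list standing in for a basic block (hypothetical)."""

    def __init__(self, values=()):
        self.head = None
        self.tail = None
        for value in values:
            self.append(value)

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def erase(self, node):
        # Unlink the node and clear its links, so a stale reference to it
        # can no longer be used to continue iterating.
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        node.prev = node.next = None

    def values(self):
        out = []
        node = self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


def early_inc_range(lst):
    """Yield each node only after the iterator has stepped past it."""
    node = lst.head
    while node is not None:
        following = node.next  # advance first ...
        yield node             # ... then hand out the current node
        node = following


def erase_even(lst):
    for node in early_inc_range(lst):
        if node.value % 2 == 0:
            lst.erase(node)  # safe: iteration no longer depends on this node
```

As in the description above, this is only safe while the loop body never erases the node the iterator has already advanced to (`following` in the sketch).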

Differential Revision: https://reviews.llvm.org/D127296

2 years ago[libc] Fix build when __FE_DENORM is defined
Alex Brachet [Wed, 8 Jun 2022 16:21:53 +0000 (16:21 +0000)]
[libc] Fix build when __FE_DENORM is defined

Differential revision: https://reviews.llvm.org/D127222

2 years ago[flang][NFC] Move genMaxWithZero into fir::factory
jeanPerier [Wed, 8 Jun 2022 16:01:50 +0000 (18:01 +0200)]
[flang][NFC] Move genMaxWithZero into fir::factory

Move the function to allow its usage in the Optimizer/Builder functions.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D127295

2 years ago[lldb] Update TestMultithreaded to report FAIL for a non-zero exit code
Jonas Devlieghere [Wed, 8 Jun 2022 15:45:53 +0000 (08:45 -0700)]
[lldb] Update TestMultithreaded to report FAIL for a non-zero exit code

A non-zero exit code from the test binary results in a
CalledProcessError. Without catching the exception, that would result in
an error (unresolved test) instead of a failure. This patch fixes that.
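
A minimal sketch of the pattern, assuming the test binary is launched with `subprocess.check_output`. The helper name `run_test_binary` and its return values are hypothetical and are not taken from TestMultithreaded itself.

```python
import subprocess
import sys


def run_test_binary(argv):
    """Run a test executable and classify the outcome (hypothetical helper)."""
    try:
        subprocess.check_output(argv, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as err:
        # A non-zero exit code is a test failure, not an infrastructure error.
        return ("FAIL", err.returncode)
    return ("PASS", 0)


if __name__ == "__main__":
    # Use the Python interpreter as a stand-in for a failing test binary.
    print(run_test_binary([sys.executable, "-c", "raise SystemExit(3)"]))
```

In a unittest-based test, the `except` branch would typically call `self.fail(...)` so that the runner records a failure instead of an unhandled exception.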

2 years ago[lldb] Parse the dotest output to determine the most appropriate result code
Jonas Devlieghere [Wed, 8 Jun 2022 15:35:38 +0000 (08:35 -0700)]
[lldb] Parse the dotest output to determine the most appropriate result code

Currently we look for keywords in the dotest.py output to determine the
lit result code. This binary approach of a keyword being present works
for PASS and FAIL, where having at least one test pass or fail
respectively results in that exit code. Things are more complicated
for tests that neither passed nor failed, but report a combination of
(un)expected failures, skips or unresolved tests.

This patch changes the logic to parse the number of tests with a
particular result from the dotest.py output. For tests that did not PASS
or FAIL, we now report the lit result code for the one that occurred the
most. For example, if we had a test with 3 skips and 4 expected
failures, we report the test as XFAIL.

We're still mapping multiple tests to one result code, so some loss of
information is inevitable.
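
The counting logic can be sketched as follows. The summary format (`skipped=3, expected failures=4`) and the key-to-code mapping in `LIT_CODE` are assumptions for illustration; the actual dotest.py output and the mapping used by the patch may differ.

```python
import re

# Assumed mapping from summary keys to lit result codes.
LIT_CODE = {
    "skipped": "UNSUPPORTED",
    "expected failures": "XFAIL",
    "unexpected successes": "XPASS",
    "errors": "UNRESOLVED",
}


def most_common_result(summary):
    """Return the lit code with the highest count, or None if none is found."""
    counts = {}
    for key, number in re.findall(r"([a-z ]+?)=(\d+)", summary):
        code = LIT_CODE.get(key.strip())
        if code is not None:
            counts[code] = counts.get(code, 0) + int(number)
    if not counts:
        return None
    return max(counts, key=counts.get)
```

With the example from the description, a summary reporting 3 skips and 4 expected failures yields `XFAIL`.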

Differential revision: https://reviews.llvm.org/D127258

2 years ago[WebAssembly] Regenerate simd-build-vector.ll to show full codegen
Simon Pilgrim [Wed, 8 Jun 2022 15:54:26 +0000 (16:54 +0100)]
[WebAssembly] Regenerate simd-build-vector.ll to show full codegen

2 years ago[flang] Add proper todo in BoxValue
Valentin Clement [Wed, 8 Jun 2022 15:50:49 +0000 (17:50 +0200)]
[flang] Add proper todo in BoxValue

Switch debug messages to proper TODOs.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D127282

2 years agoReland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
Joe Nash [Mon, 23 May 2022 14:26:02 +0000 (10:26 -0400)]
Reland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support

The reverted dependent commit is now relanded, so reland this.
Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917

2 years agoAdd a parameter to LoadFromASTFile that accepts a file system and defaults to the...
Andy Soffer [Wed, 8 Jun 2022 15:17:22 +0000 (15:17 +0000)]
Add a parameter to LoadFromASTFile that accepts a file system and defaults to the real file-system.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D126888

2 years agoAdd an error message to the default SIGPIPE handler
Tim Northover [Wed, 11 May 2022 08:52:10 +0000 (09:52 +0100)]
Add an error message to the default SIGPIPE handler

UNIX03 conformance requires utilities to flush stdout before exiting and raise
an error if writing fails. Flushing already happens on a call to exit
and thus automatically on a return from main. Write failure is then
detected by LLVM's default SIGPIPE handler. The handler already exits with
a non-zero code, but conformance additionally requires an error message.

2 years ago[Dexter] Use PurePath to compare paths in Dexter commands
Stephen Tozer [Mon, 6 Jun 2022 10:20:05 +0000 (11:20 +0100)]
[Dexter] Use PurePath to compare paths in Dexter commands

Prior to this patch, when comparing the paths of source files in Dexter
commands, we would use os.samefile. This function performs actual file
operations and requires the files to exist on the current system; this
is suitable when running the test for the first time, but renders the
DextIR output files non-portable, and unusable if the source files no
longer exist in their original location.
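
The difference can be sketched with the standard library alone. `same_source_file` is a hypothetical helper, not the function used in Dexter.

```python
from pathlib import PurePath


def same_source_file(path_a, path_b):
    """Compare two paths lexically, without touching the file system."""
    return PurePath(path_a) == PurePath(path_b)
```

Unlike `os.path.samefile`, this comparison never raises when the files are missing. The trade-off is that it is purely lexical: symlinks are not followed and `..` components are not resolved.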

Differential Revision: https://reviews.llvm.org/D127099

2 years ago[RISCV] Support (addi (addi globaladdr, C1), C2) in RISCVMergeBaseOffset.
Craig Topper [Wed, 8 Jun 2022 15:20:34 +0000 (08:20 -0700)]
[RISCV] Support (addi (addi globaladdr, C1), C2) in RISCVMergeBaseOffset.

An ADD with an immediate in the range [-4096, -2049] or [2048, 4095] gets
converted to two ADDIs. Teach RISCVMergeBaseOffset to recognize this
pattern as well.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D126843

2 years ago[RISCV] Support LUI+ADDIW in RISCVMergeBaseOffsetOpt::matchLargeOffset.
Craig Topper [Wed, 8 Jun 2022 15:06:56 +0000 (08:06 -0700)]
[RISCV] Support LUI+ADDIW in RISCVMergeBaseOffsetOpt::matchLargeOffset.

LUI+ADDIW always produces a simm32. This allows us to always
fold it into a global offset.
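
To see why the result always fits in a signed 32-bit value, the arithmetic of the two instructions on RV64 can be modelled in a few lines. This is only a sketch of the instruction semantics. `sign_extend` and `lui_addiw` are hypothetical helpers, and the operands are passed as raw 20-bit and 12-bit immediate fields.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def lui_addiw(hi20, lo12):
    """Model LUI followed by ADDIW on RV64."""
    # LUI places the 20-bit immediate in bits 31:12 and sign-extends to 64 bits.
    upper = sign_extend((hi20 & 0xFFFFF) << 12, 32)
    # ADDIW adds the sign-extended 12-bit immediate, then sign-extends
    # the low 32 bits of the sum.
    return sign_extend(upper + sign_extend(lo12, 12), 32)
```

Because ADDIW re-extends from bit 31, even a sum that overflows 32 bits wraps back into the simm32 range.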

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D126729

2 years ago[mlir][spirv] NFC: fix typo in UnifyAliasedResourcePass pass
Lei Zhang [Wed, 8 Jun 2022 15:17:41 +0000 (08:17 -0700)]
[mlir][spirv] NFC: fix typo in UnifyAliasedResourcePass pass

Reviewed By: ThomasRaoux, hanchung

Differential Revision: https://reviews.llvm.org/D127265

2 years ago[DAG] visitVSELECT - don't wait for truncation of sub before attempting to match...
Simon Pilgrim [Wed, 8 Jun 2022 15:16:26 +0000 (16:16 +0100)]
[DAG] visitVSELECT - don't wait for truncation of sub before attempting to match with getTruncatedUSUBSAT

Fixes some X86 PSUBUS regressions encountered in D127115 where the truncate was being replaced with a PACKSS/PACKUS before the fold got called again

2 years ago[DA] Handle mismatching loop levels by considering them non-linear
Bardia Mahjour [Wed, 8 Jun 2022 15:15:37 +0000 (11:15 -0400)]
[DA] Handle mismatching loop levels by considering them non-linear

To represent various loop levels within a nest, DA implements a special
numbering scheme (see comment atop establishNestingLevels). The goal of
this numbering scheme appears to be representing each unique loop
distinctively by using as little memory as possible. This numbering
scheme is simple when the source and destination of the dependence are
in the same loop. In such cases the level is simply the depth of the
loop in which src and dst reside. When the src and dst are not in the
same loop, we can run into the situation exposed by
https://reviews.llvm.org/D71539. This patch fixes this by detecting
such cases in checkSubscripts and treating them as non-linear/non-affine.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D110973

2 years ago[SystemZ] Use STDY/STEY/LDY/LEY for VR32/VR64 in eliminateFrameIndex().
Jonas Paulsson [Wed, 8 Dec 2021 00:34:26 +0000 (18:34 -0600)]
[SystemZ] Use STDY/STEY/LDY/LEY for VR32/VR64 in eliminateFrameIndex().

When e.g. a VR64 register is spilled to a stack slot requiring a long
(20-bit) displacement, it is possible to use an FP opcode if the allocated
phys reg allows it. This eliminates the use of a separate LAY instruction.

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D115406

2 years ago[RISCV] Untangle instruction properties from VSETVLIInfo [NFC]
Philip Reames [Wed, 8 Jun 2022 15:08:03 +0000 (08:08 -0700)]
[RISCV] Untangle instruction properties from VSETVLIInfo [NFC]

The abstract state used in the data flow should not know anything about the instructions which produced the abstract states. Instead, when comparing two states, we can simply use information about the machine instr at that time.

In the old design, basically any use of the instruction flags on the current (as opposed to a "Require" - aka upcoming state) would be a bug. We don't seem to actually have any such bugs, but we can make this much more obvious with code structure.

Differential Revision: https://reviews.llvm.org/D126921

2 years ago[clang] co_return cleanup
Nathan Sidwell [Fri, 13 May 2022 12:04:48 +0000 (05:04 -0700)]
[clang] co_return cleanup

There's no need for the CoreturnStmt getChildren member to deal with
the presence or absence of the operand member. All users already deal
with null children.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125542

2 years ago[clang][NFC][SVE] Add tests for operators on VLS vectors
David Truby [Tue, 7 Jun 2022 15:10:23 +0000 (16:10 +0100)]
[clang][NFC][SVE] Add tests for operators on VLS vectors

This patch adds codegen tests for operators on SVE VLS vector types

2 years ago[Dexter] Catch value error when encountering invalid address
Stephen Tozer [Mon, 6 Jun 2022 11:22:24 +0000 (12:22 +0100)]
[Dexter] Catch value error when encountering invalid address

The DexDeclareAddress command checks the value of a variable at a
certain point in the debugged program, and saves that value to be used
in other commands. If the value at that point is not a valid address
however, it currently causes an error in Dexter when we try to cast it -
this is fixed in this patch by catching the error and leaving the
address value unresolved.
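
A minimal sketch of the idea, assuming the debugger reports the value as a hexadecimal string. `resolve_address` is a hypothetical helper, and how Dexter actually performs the cast is not shown here.

```python
def resolve_address(raw_value):
    """Return the integer address, or None if the value is not a valid address."""
    try:
        return int(raw_value, 16)
    except ValueError:
        # Not an address: leave it unresolved instead of failing.
        return None
```

Returning `None` lets callers treat the address as unresolved and carry on with the remaining commands.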

Differential Revision: https://reviews.llvm.org/D127101

2 years agoRestore isa<Ty>(X) asserts inside cast<Ty>(X)
Philip Reames [Wed, 8 Jun 2022 14:11:12 +0000 (07:11 -0700)]
Restore isa<Ty>(X) asserts inside cast<Ty>(X)

PLEASE DO NOT REVERT without careful consideration, and preferably prior
discussion.

cast<Ty>(X) is a "checked cast". Its entire purpose is explicitly documented
(https://llvm.org/docs/ProgrammersManual.html#the-isa-cast-and-dyn-cast-templates)
as catching bad casts by asserting that the cast is valid.
Unfortunately, in a recent rewrite of our casting infrastructure about three
months back, these asserts got dropped.

This is discussed in more detail on discourse in https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033.

Differential Revision: https://reviews.llvm.org/D127231

2 years ago[AArch64] Add tests for bitcast high register extracts. NFC
David Green [Wed, 8 Jun 2022 14:26:31 +0000 (15:26 +0100)]
[AArch64] Add tests for bitcast high register extracts. NFC

2 years ago[RISCV] Add ISD::EH_DWARF_CFA
Shao-Ce SUN [Thu, 26 May 2022 18:42:00 +0000 (02:42 +0800)]
[RISCV] Add ISD::EH_DWARF_CFA

Based on D24038.
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D126181

2 years ago[Target] Remove `startswith` for adding `SHF_EXCLUDE` to offload section
Joseph Huber [Wed, 8 Jun 2022 13:54:08 +0000 (09:54 -0400)]
[Target] Remove `startswith` for adding `SHF_EXCLUDE` to offload section

Summary:
We use the special section name `.llvm.offloading` to store device
images in the host object file. We want these to be stripped by the
linker as they are not used after linking, so we use the `SHF_EXCLUDE`
flag to instruct the linker to drop them. We used to do this for all
sections that started with `.llvm.offloading` when we encoded metadata
in the section name itself. Now that we embed a special binary containing
the metadata, we should only add the flag to this name specifically.
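
The change in the predicate can be sketched as follows. This is an illustration in Python, not the code from the target lowering: `section_flags` and `OFFLOAD_SECTION` are hypothetical names, while `SHF_EXCLUDE` uses the standard ELF flag value.

```python
SHF_EXCLUDE = 0x80000000  # ELF section flag: exclude from the linked output

OFFLOAD_SECTION = ".llvm.offloading"


def section_flags(name, flags=0):
    """Add SHF_EXCLUDE only for the exact offloading section name."""
    # Previously the check was a prefix match: name.startswith(OFFLOAD_SECTION)
    if name == OFFLOAD_SECTION:
        flags |= SHF_EXCLUDE
    return flags
```

With the exact match, a section that merely shares the prefix keeps its original flags.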

2 years ago[Libomptarget] Add missing include to define `printf`
Joseph Huber [Wed, 8 Jun 2022 13:11:04 +0000 (09:11 -0400)]
[Libomptarget] Add missing include to define `printf`

Summary:
This test was failing because of an implicit declaration of `printf`,
which isn't legal in newer C standards. This patch just adds the
necessary header.

2 years ago[libc] Add expm1f function to bazel's build overlay.
Tue Ly [Wed, 8 Jun 2022 13:36:07 +0000 (09:36 -0400)]
[libc] Add expm1f function to bazel's build overlay.

Add expm1f function to bazel's build overlay.

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D127298

2 years agoM68k: Fix build
Matt Arsenault [Wed, 8 Jun 2022 13:25:57 +0000 (09:25 -0400)]
M68k: Fix build

2 years agoRevert "[RISCV] Testcase to show wrong register allocation result of subreg liveness"
Kito Cheng [Wed, 8 Jun 2022 13:19:27 +0000 (21:19 +0800)]
Revert "[RISCV] Testcase to show wrong register allocation result of subreg liveness"

Reverted due to failures with LLVM_ENABLE_EXPENSIVE_CHECKS.

This reverts commit cbe22c794348a1962af8a5d21fbedbb65974d94c.

2 years agoRecommit "[VPlan] Remove uneeded needsVectorIV check."
Florian Hahn [Wed, 8 Jun 2022 13:06:45 +0000 (14:06 +0100)]
Recommit "[VPlan] Remove uneeded needsVectorIV check."

This reverts commit 266ea446ab747671eb6c736569c3c9c5f3c53d11.

The reasons for the revert have been addressed by cleaning up condition
handling in VPlan and properly marking VPBranchOnMaskRecipe as using
scalars.

The test case for the revert from D123720 has been added in 3d663308a5d.

2 years agoCorrecting some links in the C status page
Aaron Ballman [Wed, 8 Jun 2022 12:51:52 +0000 (08:51 -0400)]
Correcting some links in the C status page

The paper titles were correct, but the document number and links were
incorrect (typo'ed numbers).

2 years ago[sanitizer] Fix shift UB in LEB128 test
Nikita Popov [Wed, 8 Jun 2022 12:19:07 +0000 (14:19 +0200)]
[sanitizer] Fix shift UB in LEB128 test

If u64 and uptr have the same size, then this will perform a shift
by the bitwidth, which is UB. We only need this code if uptr is
smaller than u64.
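
The guard described above can be sketched as follows. This is only an illustration of the idea, not the code from the test: `BitsAbove` is a hypothetical helper, and `u64` mirrors the sanitizer type name.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Hypothetical helper: returns the bits of `value` that lie above the low
// `narrow_bits` bits. Shifting a 64-bit value by 64 is undefined behaviour,
// so the shift is only performed when the narrow type is really narrower.
constexpr u64 BitsAbove(u64 value, unsigned narrow_bits) {
  return narrow_bits < 64 ? (value >> narrow_bits) : 0;
}
```

With a 32-bit `uptr` the call would be `BitsAbove(v, 32)`; with a 64-bit `uptr` the shift is skipped entirely.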

2 years agoAdd the 2022 papers to the C status tracking page
Aaron Ballman [Wed, 8 Jun 2022 11:41:34 +0000 (07:41 -0400)]
Add the 2022 papers to the C status tracking page

This adds the papers from Feb 2022 (parts 1 and 2) and May 2022.

2 years ago[LV] Add test that caused revert of D123720.
Florian Hahn [Wed, 8 Jun 2022 11:25:17 +0000 (12:25 +0100)]
[LV] Add test that caused revert of D123720.

2 years ago[BOLT] Set valid index for functions with profiles
Vladislav Khmelevsky [Tue, 7 Jun 2022 15:40:04 +0000 (18:40 +0300)]
[BOLT] Set valid index for functions with profiles

Some of the passes that calculate a tentative layout, such as LongJmp and
Golang, expect that only functions with a valid index are located in the
hot text section. Currently, functions that have a valid profile but no
index set break this logic. To fix this, move the hasValidProfile()
condition from the AssignSections pass to ReorderFunctions.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D127223

2 years ago[MLIR][Math] Add round operation
lorenzo chelini [Thu, 2 Jun 2022 14:49:23 +0000 (16:49 +0200)]
[MLIR][Math] Add round operation

Introduce RoundOp in the math dialect. The operation rounds the operand to the
nearest integer value in floating-point format. RoundOp lowers to the LLVM
intrinsic 'llvm.intr.round' or to a function call to libm (round or roundf).
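
As a rough illustration of the rounding behaviour, assuming the semantics of libm `round` (halfway cases round away from zero, result stays a floating-point value), here is a small C++ sketch. `RoundToNearest` is a hypothetical wrapper, not part of MLIR.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrapper around libm's round: the result is still a float,
// and ties such as 2.5 are rounded away from zero.
inline float RoundToNearest(float x) { return std::round(x); }
```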

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127286

2 years ago[Hexagon] Regenerate build-vector-v4i8-zext.ll to show full codegen
Simon Pilgrim [Wed, 8 Jun 2022 10:41:54 +0000 (11:41 +0100)]
[Hexagon] Regenerate build-vector-v4i8-zext.ll to show full codegen

2 years ago[AST] Make header self-contained
Benjamin Kramer [Wed, 8 Jun 2022 10:35:24 +0000 (12:35 +0200)]
[AST] Make header self-contained

There's a dependency in AbstractTypeReader.inc that becomes an error
after D127231.

2 years ago[gn build] Port 916e9052ba95
LLVM GN Syncbot [Wed, 8 Jun 2022 10:19:18 +0000 (10:19 +0000)]
[gn build] Port 916e9052ba95

2 years ago[CMake] Improve support for ASAN on Windows with MSVC cl & clang-cl
Andrew Ng [Tue, 31 May 2022 14:13:24 +0000 (15:13 +0100)]
[CMake] Improve support for ASAN on Windows with MSVC cl & clang-cl

Tested with MSVC 2019 (19.29) and LLVM 14.0.4.

Differential Revision: https://reviews.llvm.org/D126706

2 years ago[libc++] Implement ranges::adjacent_find
Nikolas Klauser [Wed, 8 Jun 2022 10:14:12 +0000 (12:14 +0200)]
[libc++] Implement ranges::adjacent_find

Reviewed By: Mordante, var-const, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D126610

2 years ago[Docs] Add version support information for opaque pointers (NFC)
Nikita Popov [Wed, 8 Jun 2022 09:51:28 +0000 (11:51 +0200)]
[Docs] Add version support information for opaque pointers (NFC)

I've seen a few people try to enable opaque pointers with LLVM 14
already. While LLVM 14 has pretty good baseline support, there are
enough missing pieces that you're definitely going to hit assertion
failures if you try this.

Add some wording to make it clear what the support (or planned
support) for opaque/typed pointers is across LLVM 14, 15, and 16.

2 years ago[SelectionDAG] Remove invalid TypeSize conversion from PromoteIntRes_BITCAST.
Paul Walker [Mon, 6 Jun 2022 15:23:48 +0000 (16:23 +0100)]
[SelectionDAG] Remove invalid TypeSize conversion from PromoteIntRes_BITCAST.

Extend the TypeWidenVector case of PromoteIntRes_BITCAST to work
with TypeSize directly rather than silently casting to unsigned.

To accomplish this I've extended TypeSize with an interface that
essentially allows TypeSize division when both operands have the
same number of dimensions.
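
A minimal sketch of the idea, under the assumption that "same number of dimensions" means both sizes are fixed or both are scalable. The `Size` struct and `KnownScalarFactor` function are hypothetical stand-ins, not the TypeSize interface added by this patch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of a size: a known minimum value plus a flag saying
// whether it is multiplied by the runtime vscale.
struct Size {
  std::uint64_t min_value;
  bool scalable;
};

// Divide lhs by rhs only when the quotient is a plain scalar: both sizes
// must be of the same kind and the division must be exact.
inline std::optional<std::uint64_t> KnownScalarFactor(Size lhs, Size rhs) {
  if (lhs.scalable != rhs.scalable || rhs.min_value == 0 ||
      lhs.min_value % rhs.min_value != 0)
    return std::nullopt;
  return lhs.min_value / rhs.min_value;
}
```

Returning no value for mixed fixed/scalable operands avoids the silent cast to unsigned that the message describes.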

There still exist combinations of scalable vector bitcasts that
cause compiler crashes. I call these out by adding "is missing"
entries to sve-bitcast.

Depends on D126957.
Fixes: #55114

Differential Revision: https://reviews.llvm.org/D127126

2 years ago[SVE] Fix incorrect code generation for bitcasts of unpacked vector types.
Paul Walker [Tue, 31 May 2022 09:59:05 +0000 (10:59 +0100)]
[SVE] Fix incorrect code generation for bitcasts of unpacked vector types.

Bitcasting between unpacked scalable vector types of different
element counts is not a NOP because the live elements are laid out
differently.
               01234567
e.g. nxv2i32 = XX??XX??
     nxv4f16 = X?X?X?X?

Differential Revision: https://reviews.llvm.org/D126957

2 years ago[Bitcode] Re-enable verify-uselistorder test (NFC)
Nikita Popov [Wed, 8 Jun 2022 09:28:57 +0000 (11:28 +0200)]
[Bitcode] Re-enable verify-uselistorder test (NFC)

This issue has since been fixed, so re-enable the commented RUN
line.

2 years ago[Test] Add XFAIL test for PR55689
Max Kazantsev [Wed, 8 Jun 2022 08:59:50 +0000 (15:59 +0700)]
[Test] Add XFAIL test for PR55689

SCEV issues in dynamically unreached code, see details at https://github.com/llvm/llvm-project/issues/55689

1st reduced test by Nikic!

2 years ago[doc] Add release notes about SEH unwind information on ARM
Martin Storsjö [Mon, 6 Jun 2022 20:56:31 +0000 (23:56 +0300)]
[doc] Add release notes about SEH unwind information on ARM

Differential Revision: https://reviews.llvm.org/D127150

2 years ago[mlir][bufferize] Improve buffer writability analysis
Matthias Springer [Tue, 7 Jun 2022 22:04:54 +0000 (00:04 +0200)]
[mlir][bufferize] Improve buffer writability analysis

Find writability conflicts (writes to buffers that are not allowed to be written to) by checking SSA use-def chains. This is better than the current writability analysis, which is too conservative and finds false positives.

Differential Revision: https://reviews.llvm.org/D127256

2 years ago[NFC] Remove commented cerr debugging loggings
Chuanqi Xu [Wed, 8 Jun 2022 07:45:21 +0000 (15:45 +0800)]
[NFC] Remove commented cerr debugging loggings

There is some unused, commented-out cerr debug logging in the code. It is
odd to keep such commented-out debug helpers in the product.

2 years ago[Sanitizers] intercept FreeBSD procctl
David CARLIER [Wed, 8 Jun 2022 07:55:10 +0000 (08:55 +0100)]
[Sanitizers] intercept FreeBSD procctl

Reviewers: vitalybuka, emaster

Reviewed-By: vitalybuka
Differential Revision: https://reviews.llvm.org/D127069

2 years ago[mlir][MemRef] Fix a crash when expanding a scalar shape
Benjamin Kramer [Tue, 7 Jun 2022 17:30:10 +0000 (19:30 +0200)]
[mlir][MemRef] Fix a crash when expanding a scalar shape

In this case the reassociation is empty, yielding no strides for the
result type.

Differential Revision: https://reviews.llvm.org/D127232

2 years ago[gn build] Port 638b0fb4d651
LLVM GN Syncbot [Wed, 8 Jun 2022 07:20:40 +0000 (07:20 +0000)]
[gn build] Port 638b0fb4d651

2 years ago[ADT][NFC] Early bail out for ComputeEditDistance
Nathan James [Wed, 8 Jun 2022 07:20:28 +0000 (08:20 +0100)]
[ADT][NFC] Early bail out for ComputeEditDistance

The minimum bound for the number of edits is the size difference between the two arrays.
If MaxEditDistance is smaller than this, we can bail out early without needing to traverse either array.
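
The early bail-out can be sketched as below. This is a simplified stand-in, not the body of ComputeEditDistance: `EditDistance` is a hypothetical function, and the convention that a zero limit means "no limit" and that `max_distance + 1` signals "limit exceeded" is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical Levenshtein distance with an optional limit.
inline unsigned EditDistance(std::string_view a, std::string_view b,
                             unsigned max_distance = 0) {
  const std::size_t m = a.size(), n = b.size();
  if (max_distance != 0) {
    // Early bail-out: at least |m - n| insertions or deletions are needed,
    // so nothing has to be traversed if that already exceeds the limit.
    const std::size_t diff = m > n ? m - n : n - m;
    if (diff > max_distance) return max_distance + 1;
  }
  std::vector<unsigned> row(n + 1);
  for (std::size_t j = 0; j <= n; ++j) row[j] = static_cast<unsigned>(j);
  for (std::size_t i = 1; i <= m; ++i) {
    unsigned diag = row[0];  // distance for prefixes (i-1, j-1)
    row[0] = static_cast<unsigned>(i);
    for (std::size_t j = 1; j <= n; ++j) {
      const unsigned up = row[j];  // distance for prefixes (i-1, j)
      const unsigned subst = diag + (a[i - 1] == b[j - 1] ? 0u : 1u);
      row[j] = std::min({subst, up + 1, row[j - 1] + 1});
      diag = up;
    }
  }
  const unsigned result = row[n];
  if (max_distance != 0 && result > max_distance) return max_distance + 1;
  return result;
}
```

For example, `EditDistance("a", "abcdef", 2)` returns immediately because the length difference of 5 already exceeds the limit of 2.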

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D127070

2 years ago[MLIR][SCF] Improve doc (NFC)
lorenzo chelini [Wed, 8 Jun 2022 06:46:36 +0000 (08:46 +0200)]
[MLIR][SCF] Improve doc (NFC)

2 years agoRevert "[SplitKit] Handle early clobber + tied to def correctly"
Kito Cheng [Wed, 8 Jun 2022 05:05:35 +0000 (13:05 +0800)]
Revert "[SplitKit] Handle early clobber + tied to def correctly"

Reverted due to a failure with LLVM_ENABLE_EXPENSIVE_CHECKS.

This reverts commit e14d04909df4e52e531f6c2e045c3cf9638dd817.

2 years ago[CMake] Enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR by default on Linux
Fangrui Song [Wed, 8 Jun 2022 04:22:38 +0000 (21:22 -0700)]
[CMake] Enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR by default on Linux

This makes the LLVM_ENABLE_PROJECTS mode (supported for compiler-rt, deprecated
(D112724) for libcxx/libcxxabi/libunwind) closer to
https://libcxx.llvm.org/BuildingLibcxx.html#bootstrapping-build .
The layout is arguably superior because libraries for different target triples
are in different directories, similar to GCC/Debian multiarch.

When LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu,
`lib/clang/15.0.0/lib/libclang_rt.asan-x86_64.a`
is moved to
`lib/clang/15.0.0/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.a`.

In addition, if the host compiler supports -m32 (multilib),
`lib/clang/15.0.0/lib/libclang_rt.asan-i386.a`
is moved to
`lib/clang/15.0.0/lib/i386-unknown-linux-gnu/libclang_rt.asan.a`.

Reviewed By: mstorsjo, ldionne, #libc

Differential Revision: https://reviews.llvm.org/D107799

2 years ago[DirectX][Fail crash in DXILPrepareModule pass when input has typed ptr.
python3kgae [Wed, 8 Jun 2022 01:38:29 +0000 (18:38 -0700)]
[DirectX][Fail crash in DXILPrepareModule pass when input has typed ptr.

Check supportsTypedPointers instead of hasSetOpaquePointersValue when querying whether the input has typed pointers.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D127268

2 years ago[SplitKit] Handle early clobber + tied to def correctly
Kito Cheng [Thu, 26 May 2022 10:43:04 +0000 (18:43 +0800)]
[SplitKit] Handle early clobber + tied to def correctly

The splitter will try to extend a live range into the `r` slot for a use operand.
That works in most situations, but it does not work correctly when the operand
is tied to a def and the def operand is early-clobber.

An example to demonstrate what goes wrong:
  0  %0 = ...
 16  early-clobber %0 = Op %0 (tied-def 0), ...
 32  ... = Op %0

Before extend:
 %0 = [0r, 0d) [16e, 32d)

In this case we want to extend from 0d to 16e, not to 16r. If we use 16r
here we extend nothing, because 16r is already contained in
[16e, 32d).

This patch adds a check to detect such cases and adjust the extend point.

Detailed explanation for testcase: https://reviews.llvm.org/D126047

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D126048

2 years ago[RISCV] Testcase to show wrong register allocation result of subreg liveness
Kito Cheng [Wed, 8 Jun 2022 02:23:35 +0000 (10:23 +0800)]
[RISCV] Testcase to show wrong register allocation result of subreg liveness

This testcase shows that the live range isn't constructed correctly when
subreg liveness is enabled.

In the testcase `early-clobber-tied-def-subreg-liveness.ll`, the first operand of
`vsext.vf2 v8, v16, v0.t` is both a def and a use. The use comes from
the memory location of `.L__const._Z3foov.var_49`; it is loaded and spilled
to the stack, and then v8 is overwritten by another instruction.

```
lui a0, %hi(.L__const._Z3foov.var_49)
addi a0, a0, %lo(.L__const._Z3foov.var_49)
...
vle16.v v8, (a0) # Load value from var_49
...
addi a0, sp, 16
...
vs2r.v v8, (a0) # Spill
...
vl2r.v v8, (a1) # Reload
...
lui a0, %hi(.L__const._Z3foov.var_40)
addi a0, a0, %lo(.L__const._Z3foov.var_40)
vle16.v v8, (a0)     # Load value...into v8???
vmsbc.vx v0, v8, a0  # And use that.
...
vsext.vf2 v8, v16, v0.t # But v8 is used here, where the reloaded value is expected
```

The `early-clobber-tied-def-subreg-liveness.mir` testcase has more detailed
information: `%25.sub_vrm2_0` is defined at 64, used at 464, and defined
again at 464. An inline asm that clobbers all vector registers is used to
trigger the splitter.

```
0B      bb.0.entry:
16B       %0:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_49
32B       %1:gpr = ADDI %0:gpr, target-flags(riscv-lo) @__const._Z3foov.var_49
48B       dead $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
64B       undef %25.sub_vrm2_0:vrn4m2nov0 = PseudoVLE16_V_M2 %1:gpr, 2, 4, implicit $vl, implicit $vtype
80B       %3:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_48
96B       %4:gpr = ADDI %3:gpr, target-flags(riscv-lo) @__const._Z3foov.var_48
112B      %5:vr = PseudoVLE8_V_M1 %4:gpr, 2, 3, implicit $vl, implicit $vtype
128B      %6:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_46
144B      %7:gpr = ADDI %6:gpr, target-flags(riscv-lo) @__const._Z3foov.var_46
160B      %25.sub_vrm2_1:vrn4m2nov0 = PseudoVLE16_V_M2 %7:gpr, 2, 4, implicit $vl, implicit $vtype
176B      %9:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_45
192B      %10:gpr = ADDI %9:gpr, target-flags(riscv-lo) @__const._Z3foov.var_45
208B      %25.sub_vrm2_2:vrn4m2nov0 = PseudoVLE16_V_M2 %10:gpr, 2, 4, implicit $vl, implicit $vtype
224B      INLINEASM &"" [sideeffect] [attdialect], $0:[clobber], ...
240B      %12:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_44
256B      %13:gpr = ADDI %12:gpr, target-flags(riscv-lo) @__const._Z3foov.var_44
272B      dead $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
288B      %25.sub_vrm2_3:vrn4m2nov0 = PseudoVLE16_V_M2 %13:gpr, 2, 4, implicit $vl, implicit $vtype
304B      $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
320B      %16:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_40
336B      %17:gpr = ADDI %16:gpr, target-flags(riscv-lo) @__const._Z3foov.var_40
352B      %18:vrm2 = PseudoVLE16_V_M2 %17:gpr, 2, 4, implicit $vl, implicit $vtype
368B      $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
384B      %20:gpr = LUI 1048572
400B      %21:gpr = ADDIW %20:gpr, 928
416B      early-clobber %22:vr = PseudoVMSBC_VX_M2 %18:vrm2, %21:gpr, 2, 4, implicit $vl, implicit $vtype
432B      $x0 = PseudoVSETIVLI 2, 9, implicit-def $vl, implicit-def $vtype
448B      $v0 = COPY %22:vr
464B      early-clobber %25.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %25.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, killed $v0, 2, 4, 0, implicit $vl, implicit $vtype
480B      %26:gpr = LUI target-flags(riscv-hi) @var_47
496B      %27:gpr = ADDI %26:gpr, target-flags(riscv-lo) @var_47
512B      PseudoVSSEG4E16_V_M2 %25:vrn4m2nov0, %27:gpr, 2, 4, implicit $vl, implicit $vtype
528B      PseudoRET
```

When the splitter tries to split %25:

```
selectOrSplit VRN4M2NoV0:%25 [64r,160r:4)[160r,208r:0)[208r,288r:1)[288r,464e:2)[464e,512r:3) 0@160r 1@208r 2@288r 3@464e 4@64r  L0000000000000030 [160r,512r:0) 0@160r  L00000000000000C0 [208r,512r:0) 0@208r  L0000000000000300 [288r,512r:0) 0@288r  L000000000000000C [64r,464e:1)[464e,512r:0) 0@464e 1@64r  weight:1.179245e-02 w=1.179245e-02
```

```
Best local split range: 64r-208r, 6.999861e-03, 3 instrs
    enterIntvBefore 64r: not live
    leaveIntvAfter 208r: valno 1
    useIntv [64B;216r): [64B;216r):1
  blit [64r,160r:4): [64r;160r)=1(%29)(recalc)
  blit [160r,208r:0): [160r;208r)=1(%29)(recalc)
  blit [208r,288r:1): [208r;216r)=1(%29)(recalc) [216r;288r)=0(%28)(recalc)
  blit [288r,464e:2): [288r;464e)=0(%28)(recalc)
  blit [464e,512r:3): [464e;512r)=0(%28)(recalc)
  rewr %bb.0    464e:0  early-clobber %28.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %25.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, $v0, 2, 4, 0, implicit $vl, implicit $vtype
  rewr %bb.0    288r:0  %28.sub_vrm2_3:vrn4m2nov0 = PseudoVLE16_V_M2 %13:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    208r:1  %29.sub_vrm2_2:vrn4m2nov0 = PseudoVLE16_V_M2 %10:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    160r:1  %29.sub_vrm2_1:vrn4m2nov0 = PseudoVLE16_V_M2 %7:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    64r:1   undef %29.sub_vrm2_0:vrn4m2nov0 = PseudoVLE16_V_M2 %1:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    464B:0  early-clobber %28.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %28.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, $v0, 2, 4, 0, implicit $vl, implicit $vtype
  rewr %bb.0    512B:0  PseudoVSSEG4E16_V_M2 %28:vrn4m2nov0, %27:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    216B:1  undef %28.sub_vrm1_0_sub_vrm1_1_sub_vrm1_2_sub_vrm1_3_sub_vrm1_4_sub_vrm1_5:vrn4m2nov0 = COPY %29.sub_vrm1_0_sub_vrm1_1_sub_vrm1_2_sub_vrm1_3_sub_vrm1_4_sub_vrm1_5:vrn4m2nov0
queuing new interval: %28 [216r,288r:0)[288r,464e:1)[464e,512r:2) 0@216r 1@288r 2@464e  L000000000000000C [216r,216d:0)[464e,512r:1) 0@216r 1@464e  L0000000000000300 [288r,512r:0) 0@288r  L00000000000000C0 [216r,512r:0) 0@216r  L0000000000000030 [216r,512r:0) 0@216r  weight:8.706897e-03
Enqueuing %28
queuing new interval: %29 [64r,160r:0)[160r,208r:1)[208r,216r:2) 0@64r 1@160r 2@208r  L000000000000000C [64r,216r:0) 0@64r  L00000000000000C0 [208r,216r:0) 0@208r  L0000000000000030 [160r,216r:0) 0@160r  weight:1.097826e-02
Enqueuing %29
```

The live range of the first subreg part of %25 becomes [216r,216d:0)[464e,512r:1).
However, the first segment should be live until 464e rather than covering
only [216r,216d:0).

The register allocator then produces a wrong allocation according to this
live range info.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126047

2 years ago[MLIR] Add an install target for mlir-libraries
Nathan Lanza [Wed, 8 Jun 2022 02:55:05 +0000 (22:55 -0400)]
[MLIR] Add an install target for mlir-libraries

This is required for the distribution system for installing the
mlir-libraries component. This is copied from clang's equivalent
feature.

Differential Revision: https://reviews.llvm.org/D126837

2 years ago[Debug] [Coroutines] Add deref operator for non complex expression
Chuanqi Xu [Tue, 24 May 2022 06:09:18 +0000 (14:09 +0800)]
[Debug] [Coroutines] Add deref operator for non complex expression

Background:

When we construct the coroutine frame, we insert a dbg.declare
intrinsic for it:
```
%hdl = call void @llvm.coro.begin() ; would return coroutine handle
call void @llvm.dbg.declare(metadata ptr %hdl, metadata
![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression())
```

And in the split coroutine, it looks like:
```
define void @coro_func.resume(ptr %hdl) {
entry.resume:
    call void @llvm.dbg.declare(metadata ptr %hdl, metadata
![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression())
}
```

And we would salvage the debug info by inserting a new alloca here:
```
define void @coro_func.resume(ptr %hdl) {
entry.resume:
    %frame.debug = alloca ptr
    call void @llvm.dbg.declare(metadata ptr %frame.debug, metadata
![[DEBUG_VARIABLE: __coro_frame]], metadata !DIExpression())
    store ptr %hdl, ptr %frame.debug
}
```

The problem is that the `dbg.declare` now refers to the address of that
alloca instead of the actual coroutine handle. There is existing code that
solves this problem, but it applies only to complex expressions. This
patch relaxes the condition so that it also works for
`__coro_frame`.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D126277

2 years agoUpdate the ProgrammersManual explanation for ilist and iplist
Nathan Lanza [Wed, 8 Jun 2022 02:48:24 +0000 (22:48 -0400)]
Update the ProgrammersManual explanation for ilist and iplist

They are now `using` aliases and thus the comments about iplist are now
incorrect. Remove them here.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D95210

2 years ago[JITLink][ELF][AArch64] Implement ADR_GOT_PAGE, LD64_GOT_LO12_NC.
Sunho Kim [Wed, 8 Jun 2022 01:17:26 +0000 (18:17 -0700)]
[JITLink][ELF][AArch64] Implement ADR_GOT_PAGE, LD64_GOT_LO12_NC.

This patch implements the two most commonly used Global Offset Table relocations in
ELF/AARCH64: R_AARCH64_ADR_GOT_PAGE and R_AARCH64_LD64_GOT_LO12_NC. It
implements the GOT table manager by extending the existing
PerGraphGOTAndPLTStubsBuilder. A future patch will unify this with the MachO
implementation to produce a generic aarch64 GOT table manager.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127057

2 years agoReland "[NFC][compiler-rt][asan] Unify asan and lsan allocator settings"
Leonard Chan [Wed, 8 Jun 2022 01:09:48 +0000 (18:09 -0700)]
Reland "[NFC][compiler-rt][asan] Unify asan and lsan allocator settings"

I believe this should've been fixed with 4b15e665f8d99d3b67b30e615544279654392745
which landed after this initial patch, but I reverted too early before I
saw the builder turn green again.

2 years agoRevert "[libc++][test] Mark ranges.transform.pass.cpp UNSUPPORTED for AIX"
Joe Loser [Tue, 7 Jun 2022 18:54:57 +0000 (12:54 -0600)]
Revert "[libc++][test] Mark ranges.transform.pass.cpp UNSUPPORTED for AIX"

This reverts commit 3583826bb52a7f129b55df043e29860aeab9906d.

Instead of marking the test unsupported for AIX, the choice is to bump the
timeout for CI as done in 76c7e1f2a8820b057de1a241422294bf25fdea2d and
222bd83d505728fca2bbe16cef8b93c321dd8c13

Differential Revision: https://reviews.llvm.org/D127242

2 years ago[InstCombine] decomposeSimpleLinearExpr should bail out on negative operands.
Wael Yehia [Fri, 27 May 2022 03:09:54 +0000 (03:09 +0000)]
[InstCombine] decomposeSimpleLinearExpr should bail out on negative operands.

InstCombine tries to rewrite

  %prod = mul nsw i64 %X,   Scale
  %acc = add nsw i64 %prod,   Offset
  %0 = alloca i8, i64 %acc, align 4
  %1 = bitcast i8* %0 to i32*
  Use ( %1 )

into

  %prod = mul nsw i64 %X,   Scale/4
  %acc = add nsw i64 %prod,   Offset/4
  %0 = alloca i32, i64 %acc, align 4
  Use (%0)

But it assumes Scale is unsigned, and performs an unsigned division.
So we should bail out if Scale cannot safely be interpreted as an unsigned value.
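
To see why a negative Scale is a problem, here is a small C++ sketch of the arithmetic, using 64-bit integers and a divisor of 4 as in the example above. `UDiv4` and `SafeToTreatAsUnsigned` are hypothetical helpers for illustration, not InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Divide the 64-bit pattern of `scale` by 4 as if it were unsigned,
// which is what an unsigned division does to a negative Scale.
constexpr std::uint64_t UDiv4(std::int64_t scale) {
  return static_cast<std::uint64_t>(scale) / 4;
}

// Hypothetical guard: only non-negative values have the same meaning
// under signed and unsigned interpretation.
constexpr bool SafeToTreatAsUnsigned(std::int64_t value) { return value >= 0; }
```

For Scale = -8 the signed quotient is -2, but the unsigned division yields 0x3FFFFFFFFFFFFFFE, so the rewritten alloca would use a completely different element count.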

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126546

2 years ago[JITLink][ELF][AArch64] Implement R_AARCH64_ABS64 relocation type.
Lang Hames [Wed, 8 Jun 2022 00:48:09 +0000 (17:48 -0700)]
[JITLink][ELF][AArch64] Implement R_AARCH64_ABS64 relocation type.

Implement R_AARCH64_ABS64 relocation entry. This relocation type is generated
when creating a static function pointer to a symbol.

Reviewed By: lhames, sgraenitz

Differential Revision: https://reviews.llvm.org/D126658

2 years ago[MSAN] exclude android from pthread_getaffinity_np interceptor
Kevin Athey [Wed, 8 Jun 2022 00:52:03 +0000 (17:52 -0700)]
[MSAN] exclude android from pthread_getaffinity_np interceptor

Depends on https://reviews.llvm.org/D127185.

Differential Revision: https://reviews.llvm.org/D127264

2 years agoRevert "[NFC][compiler-rt][asan] Unify asan and lsan allocator settings"
Leonard Chan [Wed, 8 Jun 2022 00:33:24 +0000 (17:33 -0700)]
Revert "[NFC][compiler-rt][asan] Unify asan and lsan allocator settings"

This reverts commit dd045ddffc51db0a5ff9e00b1d55220af20614be.

This broke the sanitizer-windows builder at https://lab.llvm.org/buildbot/#/builders/127/builds/30751.

2 years ago[JITLink][ELF][AArch64] Implement R_AARCH64_LDST*_ABS_LO12_NC relocation types.
Sunho Kim [Wed, 8 Jun 2022 00:16:41 +0000 (17:16 -0700)]
[JITLink][ELF][AArch64] Implement R_AARCH64_LDST*_ABS_LO12_NC relocation types.

Implement R_AARCH64_LDST*_ABS_LO12_NC relocation entries by reusing the PageOffset21
generic relocation edge. The difference from the MachO backend is that in ELF
the shift value is explicitly given by the relocation type. lld generates the
relocation type that matches the instruction bitwidth, so getting the shift
value implicitly from the instruction bytes should be fine in typical use cases.
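
The relationship between the access size named in the relocation type and the shift can be sketched as follows. The helper names are hypothetical and this is not JITLink code; it only illustrates that the low 12 bits of the target address are scaled by the access size (LDST8 uses shift 0, LDST16 shift 1, LDST32 shift 2, LDST64 shift 3, LDST128 shift 4).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: shift implied by the access width in bits,
// i.e. log2 of the access size in bytes.
constexpr unsigned ShiftForAccessBits(unsigned access_bits) {
  unsigned shift = 0;
  while ((8u << shift) < access_bits) ++shift;
  return shift;
}

// Hypothetical helper: the scaled 12-bit page offset that ends up in the
// immediate field of the load/store instruction.
constexpr std::uint32_t ScaledLo12(std::uint64_t target, unsigned access_bits) {
  return static_cast<std::uint32_t>(target & 0xFFF) >>
         ShiftForAccessBits(access_bits);
}
```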

2 years agoFix for e1d84c421df1bd496918bc4dd30f040d47906a77
Leonard Chan [Wed, 8 Jun 2022 00:28:24 +0000 (17:28 -0700)]
Fix for e1d84c421df1bd496918bc4dd30f040d47906a77

One of the checks in realloc_too_big.c actually printed a regular warning
and not an OOM error, so the check shouldn't be updated.

2 years ago[compiler-rt][lsan] Choose lsan allocator via SANITIZER_CAN_USE_ALLOCATOR64
Leonard Chan [Wed, 1 Jun 2022 21:12:13 +0000 (14:12 -0700)]
[compiler-rt][lsan] Choose lsan allocator via SANITIZER_CAN_USE_ALLOCATOR64

Rather than checking a bunch of individual platforms.

Differential Revision: https://reviews.llvm.org/D126825

2 years ago[NFC][compiler-rt][asan] Unify asan and lsan allocator settings
Leonard Chan [Fri, 3 Jun 2022 21:35:47 +0000 (14:35 -0700)]
[NFC][compiler-rt][asan] Unify asan and lsan allocator settings

This updates existing asan allocator settings to use the same allocator settings as what lsan uses for platforms where they already match.

Differential Revision: https://reviews.llvm.org/D126927

2 years ago[WebAssembly][Objcopy] Check that --only-keep-debug removes known sections
Derek Schuff [Tue, 7 Jun 2022 23:45:23 +0000 (16:45 -0700)]
[WebAssembly][Objcopy] Check that --only-keep-debug removes known sections

NFC; Just update the test to ensure that both known and custom sections
are removed.
Review left over from https://reviews.llvm.org/D126509

2 years ago[compiler-rt][sanitizer] Have all OOM-related error messages start with the same...
Leonard Chan [Tue, 7 Jun 2022 23:45:01 +0000 (16:45 -0700)]
[compiler-rt][sanitizer] Have all OOM-related error messages start with the same format

This way downstream tools that read sanitizer output can differentiate between OOM errors
reported by sanitizers from other sanitizer errors.

Changes:

- Introduce ErrorIsOOM for checking if a platform-specific error code from an "mmap" is an OOM err.
- Add ReportOOMError which just prepends this error message to the start of a Report call.
- Replace some Reports for OOMs with calls to ReportOOMError.
- Update necessary tests.
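
The changes listed above could look roughly like this sketch. It is a simplified, hypothetical model: the function names, the use of `ENOMEM` as the only OOM code, and the exact prefix text are assumptions for illustration, not the sanitizer runtime's implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical check: treat ENOMEM from an mmap-like call as out-of-memory.
inline bool IsOutOfMemoryError(int error_code) { return error_code == ENOMEM; }

// Hypothetical formatter: every OOM report starts with the same prefix so
// that downstream tools can match on it.
inline std::string FormatOOMReport(const std::string &tool,
                                   const std::string &detail) {
  return "ERROR: " + tool + ": out of memory: " + detail;
}
```

A downstream tool would then only need to search for the shared prefix instead of matching each sanitizer's individual message.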

Differential Revision: https://reviews.llvm.org/D127161

2 years ago[ORC-RT] Remove a stale comment.
Lang Hames [Tue, 7 Jun 2022 23:42:10 +0000 (16:42 -0700)]
[ORC-RT] Remove a stale comment.

2 years ago[JITLink][ELF][AArch64] Implement ADR_PREL_PG_HI21, ADD_ABS_LO12_NC.
Sunho Kim [Tue, 7 Jun 2022 21:03:52 +0000 (14:03 -0700)]
[JITLink][ELF][AArch64] Implement ADR_PREL_PG_HI21, ADD_ABS_LO12_NC.

Implements R_AARCH64_ADR_PREL_PG_HI21 and R_AARCH64_ADD_ABS_LO12_NC fixup edges
using the generic aarch64 patch edges.
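
For orientation, the values these two relocations encode can be sketched as below, following the AArch64 ELF ABI definitions Page(S+A) - Page(P) and (S+A) & 0xFFF with 4 KiB pages. The helper names are hypothetical, and instruction encoding and range checks are omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for R_AARCH64_ADR_PREL_PG_HI21: the signed distance,
// in 4 KiB pages, from the page of the fixup to the page of the target.
constexpr std::int64_t PageDelta(std::uint64_t target, std::uint64_t fixup) {
  return static_cast<std::int64_t>(target >> 12) -
         static_cast<std::int64_t>(fixup >> 12);
}

// Hypothetical helper for R_AARCH64_ADD_ABS_LO12_NC: the low 12 bits of
// the target address, added to the page address computed by ADRP.
constexpr std::uint32_t PageOffset(std::uint64_t target) {
  return static_cast<std::uint32_t>(target & 0xFFF);
}
```

An ADRP/ADD pair reconstructs the target as (page of fixup + PageDelta) * 4096 + PageOffset.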

Reviewed By: lhames, sgraenitz

Differential Revision: https://reviews.llvm.org/D126287

2 years ago[MSAN] Add interceptor for pthread_getaffinity_np.
Kevin Athey [Tue, 7 Jun 2022 07:00:20 +0000 (00:00 -0700)]
[MSAN] Add interceptor for pthread_getaffinity_np.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D127185

2 years agoAdd checks for -lresolv to sanitizer-ld test.
Kevin Athey [Tue, 7 Jun 2022 03:50:19 +0000 (20:50 -0700)]
Add checks for -lresolv to sanitizer-ld test.

These were missed in https://reviews.llvm.org/D127145.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D127177

2 years agoRevert "[Metadata] Add a resize capability to MDNodes and add a push_back interface...
Wolfgang Pieb [Tue, 7 Jun 2022 22:13:29 +0000 (15:13 -0700)]
Revert "[Metadata] Add a resize capability to MDNodes and add a push_back interface to MDNodes"

This reverts commit e3f6eda8c6ebc567755377911746d4ca2367e649.

Failure in unittest on https://lab.llvm.org/buildbot/#/builders/171/builds/15666

2 years ago[InstCombine] [InstCombine] reduce left-shift-of-right-shifted constant via demanded...
Sanjay Patel [Tue, 7 Jun 2022 20:54:11 +0000 (16:54 -0400)]
[InstCombine] [InstCombine] reduce left-shift-of-right-shifted constant via demanded bits

If we don't demand low bits and it is valid to pre-shift a constant:
(C2 >> X) << C1 --> (C2 << C1) >> X

https://alive2.llvm.org/ce/z/_UzTMP
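
The fold can be checked on concrete values with the sketch below. It assumes 32-bit values, a logical right shift, and that "valid to pre-shift" means `C2 << C1` loses no set bits; the helper names are hypothetical and this is not the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (C2 >> X) << C1.
constexpr std::uint32_t ShlOfLshr(std::uint32_t c2, unsigned x, unsigned c1) {
  return (c2 >> x) << c1;
}

// Folded form: (C2 << C1) >> X.
constexpr std::uint32_t LshrOfShl(std::uint32_t c2, unsigned x, unsigned c1) {
  return (c2 << c1) >> x;
}

// Assumed precondition: shifting the constant left by C1 drops no set bits.
constexpr bool PreShiftIsLossless(std::uint32_t c2, unsigned c1) {
  return ((c2 << c1) >> c1) == c2;
}

// The two forms only need to agree on the demanded bits, here every bit
// except the low C1 bits, for each shift amount X in [0, 32). Needs c1 < 32.
inline bool AgreeOnHighBits(std::uint32_t c2, unsigned c1) {
  const std::uint32_t mask = ~0u << c1;
  for (unsigned x = 0; x < 32; ++x)
    if ((ShlOfLshr(c2, x, c1) & mask) != (LshrOfShl(c2, x, c1) & mask))
      return false;
  return true;
}
```

With C2 = 0xFF, C1 = 4 and X = 2 the two forms give 0x3F0 and 0x3FC: they differ only in the low four bits, which is why the fold requires that those bits are not demanded.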

This is the reverse-order shift sibling to 82040d414b3c ( D127122 ).
It seems likely that we would want to add this to the SDAG version of
the code too to keep it on par with IR.

2 years ago[InstCombine] add tests for left-shift-of-right-shifted constant; NFC
Sanjay Patel [Tue, 7 Jun 2022 19:49:22 +0000 (15:49 -0400)]
[InstCombine] add tests for left-shift-of-right-shifted constant; NFC

The tests are adapted from the sibling folds' tests (see D127122).