platform/upstream/llvm.git
2 years ago[RISCV] Add a test showing overlapping stack offsets with RVV
Fraser Cormack [Thu, 19 May 2022 09:58:17 +0000 (10:58 +0100)]
[RISCV] Add a test showing overlapping stack offsets with RVV

This test (and its forthcoming fix) was split off from D125787. It shows
that the logic we use to determine when we need to add extra RVV padding
is insufficient.

In this example, we may have a situation involving dynamic stack
alignment -- but no variable-sized objects -- where we have no FP but
must still use SP to index objects. In this case we also need the
extra RVV padding, otherwise objects may overlap. Specifically, the test
shows that the RVV vector object may clobber the lowest callee-save.

    |------------------------------| -- <-- Incoming SP
    | 4-byte callee-save (ra)      |
    |------------------------------| -- <-- SP + VLENB*2 + 60
    | 4-byte callee-save (s0)      |
    |------------------------------| -- <-- SP + VLENB*2 + 56  --
    | 4-byte callee-save (s9)      |                            |
    |------------------------------| -- <-- SP + VLENB*2 + 52   | RVV object(!!)
    | VLENB*2 RVV object           |                            |
    |------------------------------| -- <-- SP + 56            --
    | 4-byte local object          |
    |------------------------------| -- <-- SP + 32
    | Dead area                    |
    |------------------------------| -- <-- InSP - 2*VLENB - 64
    | Possibly-zero realignment    |
    |------------------------------| -- <-- SP (realigned to 32)

This diagram should help show that when SP==InSP -- e.g., when the incoming SP
is 32-byte aligned, subtracting 2*VLENB+64 may keep it that way -- the RVV
object clobbers the spill of s9.
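
The overlap can be checked with plain interval arithmetic. The following is
a minimal sketch, not code from the patch: the helper names and the sample
VLENB values are assumptions for illustration, and the offsets are read off
the diagram relative to the final SP.

```python
def overlaps(a, b):
    """Return True if the half-open byte ranges a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def layout(vlenb):
    """Byte ranges relative to SP, as drawn in the diagram (hypothetical helper)."""
    rvv = (56, 56 + 2 * vlenb)             # VLENB*2 RVV object
    s9 = (2 * vlenb + 52, 2 * vlenb + 56)  # 4-byte callee-save (s9)
    s0 = (2 * vlenb + 56, 2 * vlenb + 60)  # 4-byte callee-save (s0)
    return rvv, s9, s0


for vlenb in (16, 32, 64):  # sample VLENB values, chosen arbitrarily
    rvv, s9, s0 = layout(vlenb)
    # The RVV object covers the s9 slot but stops exactly where s0 begins.
    print(vlenb, overlaps(rvv, s9), overlaps(rvv, s0))
```

For every VLENB the RVV object ends at SP + VLENB*2 + 56, so it covers the
4 bytes of the s9 slot and nothing above it.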

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D125962

2 years agoMinutes for pauth sync-ups have moved to Discourse.
Kristof Beyls [Fri, 20 May 2022 12:00:53 +0000 (14:00 +0200)]
Minutes for pauth sync-ups have moved to Discourse.

2 years ago[AMDGPU] Add a test case for an SIFoldOperands bug
Jay Foad [Fri, 20 May 2022 11:19:59 +0000 (12:19 +0100)]
[AMDGPU] Add a test case for an SIFoldOperands bug

2 years agotsan: add lock free stack pattern test
Alexey Katranov [Fri, 20 May 2022 09:03:54 +0000 (11:03 +0200)]
tsan: add lock free stack pattern test

Add a set of tests that iterate over possible combinations of
memory orders for lock free stack implementation.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D110552

2 years ago[X86][AMX][NFC] Refactor X86LowerAMXCast.cpp
Luo, Yuanke [Fri, 20 May 2022 10:26:13 +0000 (18:26 +0800)]
[X86][AMX][NFC] Refactor X86LowerAMXCast.cpp

Change static function to X86LowerAMXCast member function.

Differential Revision: https://reviews.llvm.org/D126058

2 years ago[AMDGPU][MC][GFX8+] Correct SMEM offset parsing
Dmitry Preobrazhensky [Fri, 20 May 2022 10:58:54 +0000 (13:58 +0300)]
[AMDGPU][MC][GFX8+] Correct SMEM offset parsing

Differential Revision: https://reviews.llvm.org/D125907

2 years ago[mlir] do not elide dialect prefix for ops with dots in the name
Alex Zinenko [Thu, 19 May 2022 14:13:51 +0000 (16:13 +0200)]
[mlir] do not elide dialect prefix for ops with dots in the name

For the hypothetical "a.b.c" op printed within a region that declares "a" as
the default dialect, MLIR would currently elide the "a." prefix and only print
"b.c". However, this becomes ambiguous while parsing as "b.c" may exist as
the "c" op in the "b" dialect. If it does not, the parsing currently fails. Do
not elide the default dialect if the op name contains further dots to avoid the
ambiguity.
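
The rule can be modelled in a few lines. This is a simplified sketch of the
decision described above, not MLIR's printer; the function name and its
string-based interface are assumptions made for illustration.

```python
def printed_name(op_name: str, default_dialect: str) -> str:
    """Return the name an op would be printed with inside a region."""
    prefix = default_dialect + "."
    if op_name.startswith(prefix):
        rest = op_name[len(prefix):]
        # Elide the prefix only when the remainder has no further dots;
        # otherwise "b.c" could be parsed as op "c" in dialect "b".
        if "." not in rest:
            return rest
    return op_name


print(printed_name("a.b", "a"))    # b
print(printed_name("a.b.c", "a"))  # a.b.c
```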

See https://discourse.llvm.org/t/dropping-dialect-prefix-for-ops-with-multiple-dots-in-the-name/62562

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125975

2 years ago[ARM] Cost modelling for MVE vector fptoi_sat
David Green [Fri, 20 May 2022 10:00:34 +0000 (11:00 +0100)]
[ARM] Cost modelling for MVE vector fptoi_sat

Building on top of D125665, this adds MVE costs for fptosi.sat and
fptoui.sat, providing MVE is available and the types are legal.

Differential Revision: https://reviews.llvm.org/D125666

2 years ago[AArch64][SME]Tied up ZA operand for accumulate instructions
Caroline Concatto [Thu, 12 May 2022 12:19:50 +0000 (13:19 +0100)]
[AArch64][SME]Tied up ZA operand for accumulate instructions

This patch updates SMEInstrFormats.td to tie up ZA operand for instructions
that accumulate their results into ZA or use part of ZA as input.

Depends on: D125534

Differential Revision: https://reviews.llvm.org/D125537

2 years ago[AArch64][SME][NFC] Add implicit operands for SME instructions in the disassembly.
Caroline Concatto [Fri, 13 May 2022 10:04:08 +0000 (11:04 +0100)]
[AArch64][SME][NFC] Add implicit operands for SME instructions in the disassembly.

This patch simplifies the switch statement in getInstruction to add
implicit operands (register ZA and an immediate equal to zero)
to the SME operands when disassembling.

The register ZA and the zero immediate can be added by checking the operand
in MCInstDesc.

Differential Revision: https://reviews.llvm.org/D125534

2 years ago[flang] Fix use-associated false-positive error
Daniil Dudkin [Fri, 20 May 2022 09:11:36 +0000 (12:11 +0300)]
[flang] Fix use-associated false-positive error

For the program provided as the test case flang fired the following
error:

    error: Semantic errors in main.f90
    error: 'foo' is not a procedure

This change fixes the error by postponing handling of `UseErrorDetails`
from `CharacterizeProcedure` to a later stage.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D125791

2 years ago[LV] Do not LoopSimplify/LCSSA after generating main vector loop.
Florian Hahn [Fri, 20 May 2022 08:58:40 +0000 (09:58 +0100)]
[LV] Do not LoopSimplify/LCSSA after generating main vector loop.

At the moment LV runs LoopSimplify and reconstructs LCSSA form after
generating the main vector loop and before generating the epilogue
vector loop.

In practice, this adds a new exit block for the scalar loop because the
middle block now also branches to the original exit block of the scalar
loop. It also requires adding a new LCSSA phi in the newly created exit
block.

This complicates things when modeling exit values in VPlan, because we
would need to update the VPlan for the epilogue loop to update the newly
created LCSSA phi node.

But none of that should be necessary, as all analysis requiring
loop-simplify form is already done at this point and LCSSA form of the
original loop is not broken.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D125810

2 years ago[NFC][test] Fix the line num of expected-error for CSKY at builtin-alloca-with-align.c
Zi Xuan Wu [Fri, 20 May 2022 08:54:23 +0000 (16:54 +0800)]
[NFC][test] Fix the line num of expected-error for CSKY at builtin-alloca-with-align.c

2 years ago[AArch64] Fix the generation of BE Nops
David Green [Fri, 20 May 2022 08:31:00 +0000 (09:31 +0100)]
[AArch64] Fix the generation of BE Nops

Big-endian NOPs were being generated as d5 03 20 1f, which disassembles as
fnmadd s21, s30, s0, s0, because the bytes of the NOP were emitted in the
wrong order. This makes the emitted bytes independent of the endianness.
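
The byte swap is easy to reproduce. This is a small sketch using Python's
struct module rather than the LLVM code; it assumes only that the A64 NOP
encoding is 0xd503201f and that instruction words are stored little-endian.

```python
import struct

NOP = 0xD503201F  # A64 encoding of NOP

little = struct.pack("<I", NOP)  # byte order an instruction word must have
big = struct.pack(">I", NOP)     # byte order produced by the big-endian path

print(little.hex(" "))  # 1f 20 03 d5
print(big.hex(" "))     # d5 03 20 1f

# Reading the swapped bytes back as a little-endian word yields a different
# instruction word, which is what showed up as fnmadd in the disassembly.
print(hex(struct.unpack("<I", big)[0]))  # 0x1f2003d5
```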

Differential Revision: https://reviews.llvm.org/D125980

2 years ago[amdgpu] Add amdgpu_kernel calling conv attribute to clang
Jon Chesterfield [Fri, 20 May 2022 07:50:36 +0000 (08:50 +0100)]
[amdgpu] Add amdgpu_kernel calling conv attribute to clang

Allows emitting define amdgpu_kernel void @func() IR from C or C++.

This replaces the current workflow, which is to write a stub in OpenCL that
calls an external C function implemented in C++, combined through llvm-link.

Calling the resulting function still requires a manual implementation of the
ABI from the host side. The primary application is for more rapid debugging
of the amdgpu backend by permuting a C or C++ test file instead of manually
updating an IR file.

Implementation closely follows D54425. Non-amd reviewers from there.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D125970

2 years agoMake CompoundStmtBitfields::NumStmts not a bit-field
Serge Pavlov [Sun, 15 May 2022 15:35:46 +0000 (22:35 +0700)]
Make CompoundStmtBitfields::NumStmts not a bit-field

The number of statements in a CompoundStmt is kept in a bit-field in the
common part of Stmt. The field has 24 bits for the number. To allocate a new
bit-field (as attempted in https://reviews.llvm.org/D123952), this
width must be reduced, so the maximum number of statements in a compound
statement becomes smaller. That can cause compilation errors in some
programs.

With this change the number of statements is kept in a field of type
'unsigned int' rather than in a bit-field. To make room in
CompoundStmtBitfields, LBraceLoc is moved to the fields of CompoundStmt.
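
The limits involved are simple to work out. The snippet below is only
illustrative arithmetic: the 23-bit case is a hypothetical narrower field,
and the last line assumes a 32-bit 'unsigned int'.

```python
def max_count(width_bits: int) -> int:
    """Largest unsigned value representable in width_bits bits."""
    return (1 << width_bits) - 1


print(max_count(24))  # 16777215: limit of the 24-bit bit-field
print(max_count(23))  # 8388607: limit if one bit were given up (hypothetical)
print(max_count(32))  # 4294967295: assumes a 32-bit 'unsigned int'
```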

Differential Revision: https://reviews.llvm.org/D125635

2 years ago[flang][OpenMP] Fix the types of worksharing-loop variables
Peixin-Qiao [Fri, 20 May 2022 07:16:03 +0000 (15:16 +0800)]
[flang][OpenMP] Fix the types of worksharing-loop variables

The types of lower bound, upper bound, and step are converted into the
type of the loop variable if necessary. OpenMP runtime requires 32-bit
or 64-bit loop variables. An OpenMP loop iteration variable cannot be
wider than 64 bits; a wider one will be narrowed.

This patch is part of upstreaming code from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project. (#1256)

Co-authored-by: kiranchandramohan <kiranchandramohan@gmail.com>
Reviewed By: kiranchandramohan, shraiysh

Differential Revision: https://reviews.llvm.org/D125740

2 years ago[RISCV] Add VL patterns for vector widening floating-point fused multiply-add instruc...
jacquesguan [Wed, 27 Apr 2022 07:20:31 +0000 (07:20 +0000)]
[RISCV] Add VL patterns for vector widening floating-point fused multiply-add instructions.

This patch adds VL patterns for vector widening floating-point fused multiply-add instructions to support fixed length vector type.

Differential Revision: https://reviews.llvm.org/D124505

2 years ago[RISCV][NFC] Remove `*=` operator for LMULType
eopXD [Fri, 20 May 2022 06:23:34 +0000 (23:23 -0700)]
[RISCV][NFC] Remove `*=` operator for LMULType

LMULType always operates on Log2LMUL, so let all manipulations go
through LMULType::MulLog2LMUL.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D126042

2 years ago[MIR] Provide location of extra instruction operand when diagnosing it.
Ivan Kosarev [Fri, 20 May 2022 04:55:36 +0000 (05:55 +0100)]
[MIR] Provide location of extra instruction operand when diagnosing it.

Also resolves misspelled FileCheck directives caught with D125604.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D125965

2 years ago[lldb] Update test_software_breakpoint_set_and_remove_work for AS
Jonas Devlieghere [Thu, 19 May 2022 20:38:18 +0000 (13:38 -0700)]
[lldb] Update test_software_breakpoint_set_and_remove_work for AS

On Apple Silicon the platform arch is arm64 rather than AArch64.

2 years ago[lldb] Fix 'ptsname_r' is only available on macOS 10.13.4 or newer
Jonas Devlieghere [Thu, 19 May 2022 20:07:20 +0000 (13:07 -0700)]
[lldb] Fix 'ptsname_r' is only available on macOS 10.13.4 or newer

A deployment target less than 10.13.4 causes an error saying that
'ptsname_r' is only available on macOS 10.13.4 or newer. The current
logic only checks if the symbol is available and doesn't account for the
deployment target. This patch fixes that by adding an availability
check.

Differential revision: https://reviews.llvm.org/D125995

2 years ago[RISCV] Add test showing codegen for unaligned loads and stores of scalar types
Philip Reames [Fri, 20 May 2022 03:59:26 +0000 (20:59 -0700)]
[RISCV] Add test showing codegen for unaligned loads and stores of scalar types

2 years ago[ASan] Add sleep_before_init flag
Julian Lettner [Fri, 20 May 2022 00:46:42 +0000 (17:46 -0700)]
[ASan] Add sleep_before_init flag

Also do a little bit of refactoring instead of just copy&paste.

Differential Revision: https://reviews.llvm.org/D126037

2 years ago[InstCombine] [NFC] Use a pattern matcher for ExtractElementInst
Chenbing Zheng [Fri, 20 May 2022 01:56:33 +0000 (09:56 +0800)]
[InstCombine] [NFC] Use a pattern matcher for ExtractElementInst

Reviewed By: RKSimon, rampitec

Differential Revision: https://reviews.llvm.org/D125857

2 years ago[lit] Fix setup of sanitizer environment
Vitaly Buka [Fri, 20 May 2022 02:22:02 +0000 (19:22 -0700)]
[lit] Fix setup of sanitizer environment

Not all options were propagated into tests.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D122869

2 years ago[mlir][Arithmetic] fold overlapping negf.
jacquesguan [Thu, 19 May 2022 09:25:22 +0000 (09:25 +0000)]
[mlir][Arithmetic] fold overlapping negf.

This patch folds negf(negf(x)) to x.
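
As an illustration of the rewrite, here is a toy expression folder. It is a
sketch only: the Var and Negf classes are hypothetical stand-ins, not MLIR
operations, and the real fold lives in the Arithmetic dialect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Negf:
    operand: object


def fold(expr):
    """Fold negf(negf(x)) to x, working from the innermost operand outwards."""
    if isinstance(expr, Negf):
        inner = fold(expr.operand)
        if isinstance(inner, Negf):
            return inner.operand  # the two negations cancel
        return Negf(inner)
    return expr


x = Var("x")
print(fold(Negf(Negf(x))))        # Var(name='x')
print(fold(Negf(Negf(Negf(x)))))  # Negf(operand=Var(name='x'))
```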

Differential Revision: https://reviews.llvm.org/D125955

2 years ago[AArch64] Add support for -fzero-call-used-regs
Bill Wendling [Thu, 19 May 2022 23:57:40 +0000 (16:57 -0700)]
[AArch64] Add support for -fzero-call-used-regs

Support the "-fzero-call-used-regs" option on AArch64. This involves much less
specialized code than the X86 version. Most of the checks can be done with
TableGen.

Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D124836

2 years ago[mlir][sparse] Adding x-macros for OverheadType
wren romano [Thu, 19 May 2022 23:00:39 +0000 (16:00 -0700)]
[mlir][sparse] Adding x-macros for OverheadType

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126026

2 years ago[Serialization] Delta encode locations in expansion sloc entries
Sam McCall [Thu, 19 May 2022 08:01:44 +0000 (10:01 +0200)]
[Serialization] Delta encode locations in expansion sloc entries

This is a 1.9% reduction in PCH size in my measurements.

In abbreviated records, VBR6 seems to be slightly better than VBR8 for locations
that may be delta-encoded (i.e. not the first).
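
To see why the chunk width matters for small deltas, the cost of a VBR field
can be computed directly. This is a sketch of the bitstream VBR scheme (each
chunk carries width-1 payload bits plus one continuation bit); the sample
delta values are made up and are not measurements from the patch.

```python
def vbr_bits(value: int, width: int) -> int:
    """Number of bits used to encode a non-negative value as VBR<width>."""
    payload = width - 1  # one bit of every chunk is the continuation flag
    chunks = 1
    while value >= (1 << payload):
        value >>= payload
        chunks += 1
    return chunks * width


for delta in (20, 100, 1000):  # hypothetical deltas
    print(delta, vbr_bits(delta, 6), vbr_bits(delta, 8))
```

Deltas below 32 cost 6 bits as VBR6 against 8 bits as VBR8, while values
from 32 to 127 favour VBR8, so which width wins depends on how the deltas
are distributed.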

Differential Revision: https://reviews.llvm.org/D125952

2 years ago[bazel][libc] Fix bazel build
Alex Brachet [Thu, 19 May 2022 22:57:59 +0000 (22:57 +0000)]
[bazel][libc] Fix bazel build

Differential revision: https://reviews.llvm.org/D126028

2 years ago[mlir][sparse] Factored out a "FATAL" macro for unrecoverable assertion failure
wren romano [Thu, 19 May 2022 22:01:23 +0000 (15:01 -0700)]
[mlir][sparse] Factored out a "FATAL" macro for unrecoverable assertion failure

Depends On D126019

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126022

2 years ago[TableGen] Add generation of argument register lists
Bill Wendling [Thu, 19 May 2022 20:45:43 +0000 (13:45 -0700)]
[TableGen] Add generation of argument register lists

There are cases, like with -fzero-call-used-regs, where we need to know
which registers can be used by a certain calling convention. This change
generates a list of registers used by each calling convention defined in
*CallingConv.td.

Calling conventions that use registers conditioned on Swift have those
registers placed in a separate list. This allows us to be flexible about
whether to use the Swift registers or not.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D125421

2 years agoRevert "[TableGen] Add generation of argument register lists"
Bill Wendling [Thu, 19 May 2022 22:16:08 +0000 (15:16 -0700)]
Revert "[TableGen] Add generation of argument register lists"

There are build bot failures.

This reverts commit 3fa1b6557d08a148ef853c2a761f1c43e09fef5e.

2 years ago[mlir][sparse] Simplifying closure
wren romano [Thu, 19 May 2022 21:04:35 +0000 (14:04 -0700)]
[mlir][sparse] Simplifying closure

By closing over the `rank` itself rather than `this`, we save a method call on each iteration. A minor optimization, but one that adds up.

Depends On D126016

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126019

2 years ago[mlir][sparse] Using the name "dimSizes" more consistently
wren romano [Thu, 19 May 2022 20:47:36 +0000 (13:47 -0700)]
[mlir][sparse] Using the name "dimSizes" more consistently

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126016

2 years ago[TableGen] Add generation of argument register lists
Bill Wendling [Thu, 19 May 2022 20:45:43 +0000 (13:45 -0700)]
[TableGen] Add generation of argument register lists

There are cases, like with -fzero-call-used-regs, where we need to know
which registers can be used by a certain calling convention. This change
generates a list of registers used by each calling convention defined in
*CallingConv.td.

Calling conventions that use registers conditioned on Swift have those
registers placed in a separate list. This allows us to be flexible about
whether to use the Swift registers or not.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D125421

2 years ago[DirectX backend] When cleanup module flags only remove unused flags.
python3kgae [Tue, 17 May 2022 23:57:18 +0000 (16:57 -0700)]
[DirectX backend] When cleanup module flags only remove unused flags.

Only remove dx.valver from module flags when cleaning up module flags in DXILTranslateMetadataPass.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D125842

2 years ago[libc] Add strlcat
Alex Brachet [Thu, 19 May 2022 21:47:28 +0000 (21:47 +0000)]
[libc] Add strlcat

Differential Revision: https://reviews.llvm.org/D125978

2 years ago[lldb/test] Fix PExpect.launch issue when disabling color support
Med Ismail Bennani [Thu, 19 May 2022 21:47:04 +0000 (14:47 -0700)]
[lldb/test] Fix PExpect.launch issue when disabling color support

This patch should fix a bug in PExpect.launch that happened when color
support is not enabled.

In that case, we need to add the `--no-use-colors` flag to lldb's launch
argument list. However, previously, each character of the string was
appended separately to the `args` list. This patch solves that by adding
the whole string to the list.
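
One common way this happens in Python is extending a list with a string.
The sketch below shows that pitfall in isolation; it is an assumption about
the shape of the bug, not the actual PExpect.launch code, and the initial
`args` contents are illustrative.

```python
args = ["--no-lldbinit"]  # illustrative starting arguments

buggy = list(args)
buggy += "--no-use-colors"        # a str is iterable: adds one item per character

fixed = list(args)
fixed.append("--no-use-colors")   # adds the flag as a single argument

print(len(buggy))  # 16
print(fixed)       # ['--no-lldbinit', '--no-use-colors']
```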

This should fix the TestIOHandlerResize failure on GreenDragon.

Differential Revision: https://reviews.llvm.org/D126021

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2 years agoRevert "[RISCV] Add test cases showing failure to remove mask on rotate amounts."
Craig Topper [Thu, 19 May 2022 21:33:02 +0000 (14:33 -0700)]
Revert "[RISCV] Add test cases showing failure to remove mask on rotate amounts."

This reverts commit e2f410feeab27a8bb2c015fc02bb8527702e401f.

This exposes a pre-existing bug in type legalization that is failing
expensive checks.

2 years agoRevert "[RISCV] Use selectShiftMaskXLen ComplexPattern for isel of rotates."
Craig Topper [Thu, 19 May 2022 21:32:15 +0000 (14:32 -0700)]
Revert "[RISCV] Use selectShiftMaskXLen ComplexPattern for isel of rotates."

This reverts commit 86f7d7074a0129955aa2f5c82fe8c383eb17a35a.

The test cases added for this exposed a pre-existing bug that is failing
the expensive checks bot. Reverting so I can revert that patch.

2 years ago[ConstantRange] Improve the implementation of binaryOr
Alexander Shaposhnikov [Thu, 19 May 2022 20:25:47 +0000 (20:25 +0000)]
[ConstantRange] Improve the implementation of binaryOr

This diff adjusts binaryOr to take advantage of the analysis
based on KnownBits.

Differential revision: https://reviews.llvm.org/D125933

Test plan:
1/ ninja check-llvm
2/ ninja check-llvm-unit

2 years ago[bazel] Add lib/Basic/BuiltinTargetFeatures.h to clang:basic `hdrs`.
Jorge Gorbe Moya [Thu, 19 May 2022 21:17:36 +0000 (14:17 -0700)]
[bazel] Add lib/Basic/BuiltinTargetFeatures.h to clang:basic `hdrs`.

This header is included by
clang/unittests/CodeGen/CheckTargetFeaturesTest.cpp
so it needs to be exposed here to make it visible.

2 years ago[Flang][OpenMP] Upstream the lowering of the parallel do combined construct
Kiran Chandramohan [Thu, 19 May 2022 20:23:04 +0000 (20:23 +0000)]
[Flang][OpenMP] Upstream the lowering of the parallel do combined construct

When parallel is used in a combined construct, then use a separate
function to create the parallel operation. It handles the parallel
specific clauses and leaves the rest for handling at the inner
operations.

Reviewed By: peixin, shraiysh

Differential Revision: https://reviews.llvm.org/D125465

Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>
2 years agoHandle instrumentation of scalar single-precision (_ss) intrinsics
Nicolas Capens [Thu, 19 May 2022 20:46:49 +0000 (13:46 -0700)]
Handle instrumentation of scalar single-precision (_ss) intrinsics

Instrumentation of scalar double-precision intrinsics such as
x86_sse41_round_sd was already handled by https://reviews.llvm.org/D82398,
but not their single-precision counterparts.

https://issuetracker.google.com/172238865

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D124871

2 years ago[mlir][sparse] fix unsigned comparison bug in assert
Aart Bik [Thu, 19 May 2022 19:23:52 +0000 (12:23 -0700)]
[mlir][sparse] fix unsigned comparison bug in assert

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D126007

2 years ago[AMDGPU] Mark s_get_waveid_in_workgroup as not reading memory
Jay Foad [Thu, 19 May 2022 15:52:41 +0000 (16:52 +0100)]
[AMDGPU] Mark s_get_waveid_in_workgroup as not reading memory

It is already marked as having side effects, at least in MIR. It does
not interact with anything else that is modelled as a memory access
either in IR or MachineIR.

Differential Revision: https://reviews.llvm.org/D125985

2 years ago[AMDGPU] Mark s_getreg as having side effects instead of reading memory
Jay Foad [Thu, 19 May 2022 12:43:43 +0000 (13:43 +0100)]
[AMDGPU] Mark s_getreg as having side effects instead of reading memory

s_getreg does not interact with anything else that is modelled as a
memory access either in IR or MachineIR.

Differential Revision: https://reviews.llvm.org/D125968

2 years ago[mlir] Remove unused properties from the standalone example's lit configuration
Stella Stamenova [Thu, 19 May 2022 19:51:37 +0000 (12:51 -0700)]
[mlir] Remove unused properties from the standalone example's lit configuration

Since these are unused, I've removed them from the configuration, so that it can be easier to read and follow.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D125132

2 years ago[Clang][[OpenMP5.1] Initial parser/sema for default(private) clause
Jennifer Yu [Tue, 17 May 2022 21:17:32 +0000 (14:17 -0700)]
[Clang][[OpenMP5.1] Initial parser/sema for default(private) clause

This implements the default(private) clause as defined in OMP5.1

Differential Revision: https://reviews.llvm.org/D125912

2 years ago[LV] Drop wrap flags for reductions using VP def-use chain.
Florian Hahn [Thu, 19 May 2022 19:36:46 +0000 (20:36 +0100)]
[LV] Drop wrap flags for reductions using VP def-use chain.

Update clearReductionWrapFlags to use the VPlan def-use chain from the
reduction phi recipe to drop reduction wrap flags.

This addresses an existing FIXME and fixes a crash when instructions in
the reduction chain are not used and have been removed before VPlan
code generation.

Fixes #55540.

2 years ago[gn build] (manually) port 505ddb6b7450 (remove Unit/lit.site.cfg.py)
Nico Weber [Thu, 19 May 2022 19:18:56 +0000 (15:18 -0400)]
[gn build] (manually) port 505ddb6b7450 (remove Unit/lit.site.cfg.py)

2 years ago[lld][test] Delete empty Unit test directory
Keith Smiley [Thu, 12 May 2022 00:12:03 +0000 (17:12 -0700)]
[lld][test] Delete empty Unit test directory

This became empty when we removed the legacy macho lld. This results in
a warning when running `check-lld`. We can revert this in the future if
we want unit tests.

Differential Revision: https://reviews.llvm.org/D125436

2 years agoRevert "[ValueTracking] Added support to deduce PHI Nodes values being a power of 2"
Nico Weber [Thu, 19 May 2022 19:05:30 +0000 (15:05 -0400)]
Revert "[ValueTracking] Added support to deduce PHI Nodes values being a power of 2"

This reverts commit d5c130f17e503e128b8a413c2ce0e522987d2a16.
Breaks tests, see https://reviews.llvm.org/D125332#3525819

2 years ago[OpenMP][libomp] Fix accidental removal of else for core attributes
Jonathan Peyton [Thu, 19 May 2022 18:57:02 +0000 (13:57 -0500)]
[OpenMP][libomp] Fix accidental removal of else for core attributes

2 years ago[ARM] Cost modelling for scalar fptoi_sat
David Green [Thu, 19 May 2022 18:53:21 +0000 (19:53 +0100)]
[ARM] Cost modelling for scalar fptoi_sat

Similar to D124357, this adds some cost modelling for fptoi_sat for Arm
targets. Where VFP2 is available (and FP64/FP16 for the relevant types),
the operations are legal as the Arm instructions naturally saturate.
Otherwise they will need an extra smin/smax clamp, similar to AArch64.
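
For reference, the operation being costed behaves roughly as follows. This
is a sketch of the saturating conversion semantics (NaN maps to zero,
out-of-range values clamp); the function name and the Python modelling are
assumptions, and it says nothing about the cost values chosen in the patch.

```python
import math


def fptosi_sat(x: float, bits: int) -> int:
    """Saturating float to signed integer conversion for a given bit width."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x <= lo:
        return lo  # clamp, as an smax against the minimum would
    if x >= hi:
        return hi  # clamp, as an smin against the maximum would
    return math.trunc(x)  # in range: round towards zero


print(fptosi_sat(300.7, 8))         # 127
print(fptosi_sat(-1e9, 8))          # -128
print(fptosi_sat(float("nan"), 8))  # 0
print(fptosi_sat(-3.9, 32))         # -3
```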

Differential Revision: https://reviews.llvm.org/D125665

2 years ago[InstCombine] NEW Baseline tests for InstCombine optimization to merge GEP instructio...
William Huang [Thu, 12 May 2022 00:32:03 +0000 (00:32 +0000)]
[InstCombine] NEW Baseline tests for InstCombine optimization to merge GEP instructions with constant indices

Split the merge constant-indexed GEP optimization into two smaller transformations: 1. Merging GEP of GEP if both are constant-indexed. 2. Swapping a constant-indexed GEP in a chain of (non-constant) GEPs to the end, so that 1 can be applied repeatedly.
There is existing code that partially handles transformation 1, but it only deals with limited cases.

Unit tests are broken down into two parts for the 2 transformations.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D125438

2 years ago[Office Hours] add initial guidance for hosts
Kristof Beyls [Thu, 19 May 2022 10:25:01 +0000 (12:25 +0200)]
[Office Hours] add initial guidance for hosts

This includes adding guidance to announce an office hours session on the
Discord channel and/or IRC, as discussed at the office hours round table at
EuroLLVM 2022, see
https://discourse.llvm.org/t/office-hours-eurollvm-round-table-summary/62480.

Fixes #55423

2 years ago[ValueTracking] Added support to deduce PHI Nodes values being a power of 2
William Huang [Tue, 10 May 2022 19:47:10 +0000 (19:47 +0000)]
[ValueTracking] Added support to deduce PHI Nodes values being a power of 2

Add ValueTracking support to deduce that an induction variable is a power of 2, enabling urem optimizations.
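
The reason a known power of 2 helps urem is the usual mask identity. The
snippet is a sketch of that identity only, with a hypothetical helper name;
it does not model the PHI analysis itself.

```python
def urem_pow2(x: int, d: int) -> int:
    """Unsigned remainder of x by d, valid only when d is a power of 2."""
    assert d > 0 and d & (d - 1) == 0, "d must be a power of 2"
    return x & (d - 1)  # the low bits are the remainder


print(urem_pow2(1234, 16), 1234 % 16)  # 2 2
```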

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D125332

2 years ago[llvm-dis][test] Fix error case on Windows
Keith Smiley [Thu, 19 May 2022 18:29:37 +0000 (11:29 -0700)]
[llvm-dis][test] Fix error case on Windows

The `N` case in the error differs across platforms.

2 years ago[ValueTracking] Baseline tests for Power-of-2 value tracking on PHI nodes
William Huang [Tue, 3 May 2022 22:19:01 +0000 (22:19 +0000)]
[ValueTracking] Baseline tests for Power-of-2 value tracking on PHI nodes

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D124885

2 years ago[Libomptarget] Add `leaf` attribute to `vprintf` declaration
Joseph Huber [Thu, 19 May 2022 18:18:51 +0000 (14:18 -0400)]
[Libomptarget] Add `leaf` attribute to `vprintf` declaration

Summary:
This patch adds the `leaf` attribute to the `vprintf` declaration in the
OpenMP runtime. This attribute allows us to determine that the `vprintf`
function will not call any functions within the translation unit,
allowing us to deduce `norecurse` attributes on the caller.

2 years ago[WebAssembly] Fix register use-def in FixIrreducibleControlFlow
Heejin Ahn [Fri, 13 May 2022 01:46:53 +0000 (18:46 -0700)]
[WebAssembly] Fix register use-def in FixIrreducibleControlFlow

FixIrreducibleControlFlow pass adds dispatch blocks with a `br_table`
that has multiple predecessors and successors, because it serves as
something like a traffic hub for BBs. As a result of this, there can be
register uses that are not dominated by a def in every path from the
entry block. For example, suppose register %a is defined in BB1 and used
in BB2, and there is a single path from BB1 to BB2:
```
BB1 -> ... -> BB2
```
After FixIrreducibleControlFlow runs, there can be a dispatch block
between these two BBs:
```
BB1 -> ... -> Dispatch -> ... -> BB2
```
Because this dispatch block has multiple predecessors, there is now
a path to BB2 that does not first visit BB1, and on that path
%a is no longer dominated by a def.
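
The loss of dominance can be demonstrated on a toy CFG. This is a sketch
under assumptions: the block names other than BB1, BB2 and Dispatch, and
both graphs, are hypothetical and only mimic the shape described above.

```python
def reachable_avoiding(cfg, entry, target, avoid):
    """True if target is reachable from entry without passing through avoid."""
    seen, stack = {avoid}, [entry]
    while stack:
        bb = stack.pop()
        if bb in seen:
            continue
        seen.add(bb)
        if bb == target:
            return True
        stack.extend(cfg.get(bb, []))
    return False


# Before: the only way into BB2 is through BB1.
before = {"entry": ["BB1", "Other"], "BB1": ["BB2"], "Other": ["BB3"]}
# After: both BB1 and Other feed the dispatch block, which can branch to BB2.
after = {"entry": ["BB1", "Other"], "BB1": ["Dispatch"],
         "Other": ["Dispatch"], "Dispatch": ["BB2", "BB3"]}

print(reachable_avoiding(before, "entry", "BB2", "BB1"))  # False
print(reachable_avoiding(after, "entry", "BB2", "BB1"))   # True
```

A result of True means BB1 no longer dominates BB2, so a def of %a in BB1
does not cover every path to the use.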

To fix this problem, we have been adding `IMPLICIT_DEF`s to all
registers in PrepareForLiveIntervals pass, and then removing unnecessary
ones in OptimizeLiveIntervals pass after computing `LiveIntervals`. But
FixIrreducibleControlFlow pass itself ends up violating register use-def
relationship, resulting in invalid code. This was OK so far because
MIR verifier apparently didn't check this in validation. But @arsenm
fixed this and it caught this bug in validation
(https://github.com/llvm/llvm-project/issues/55249).

This CL moves the `IMPLICIT_DEF` adding routine from
PrepareForLiveIntervals to FixIrreducibleControlFlow. We only run it
when FixIrreducibleControlFlow changes the code. And then
PrepareForLiveIntervals doesn't do anything other than setting
`TracksLiveness` property, which is a prerequisite for running
`LiveIntervals` analysis, which is required by the next pass
OptimizeLiveIntervals.

But in our backend we don't seem to do anything that invalidates this up
until OptimizeLiveIntervals, and I'm not sure why we are calling
`invalidateLiveness` in ReplacePhysRegs pass, because what that pass
does is to replace physical registers with virtual ones 1-to-1. I
deleted the `invalidateLiveness` call there and we don't need to set
that flag explicitly, which removes any need for
PrepareForLiveIntervals.

(By the way, this 'Liveness' here is different from `LiveIntervals`
analysis. Setting this only means BBs' live-in info is correct, all uses
are dominated by defs, and the `kill` flag is conservatively correct, which
means if there is a `kill` flag set it should be the last use. See
https://github.com/llvm/llvm-project/blob/2a0837aab1489c88efb03784e34c4dc9f2e28302/llvm/include/llvm/CodeGen/MachineFunction.h#L125-L134
for details.)

So this CL removes PrepareForLiveIntervals pass altogether. Something
similar to this was attempted by D56091 long ago but that fell short of
actually removing the pass, and I couldn't land it because
FixIrreducibleControlFlow violated use-def relationship, which this CL
fixes.

This doesn't change output in any meaningful way. All test changes
except `irreducible-cfg.mir` are register numbering.

Also this will likely reduce compilation time, because we have been
adding `IMPLICIT_DEF` for all registers every time `-O2` is given, but
now we do that only when there is irreducible control flow, which is
rare.

Fixes https://github.com/llvm/llvm-project/issues/55249.

Reviewed By: dschuff, kripken

Differential Revision: https://reviews.llvm.org/D125515

2 years ago[WebAssembly] Use CHECK-NEXT for irreducible-cfg.mir
Heejin Ahn [Fri, 13 May 2022 02:30:55 +0000 (19:30 -0700)]
[WebAssembly] Use CHECK-NEXT for irreducible-cfg.mir

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D125514

2 years ago[llvm-dis] Improve missing file error message
Keith Smiley [Tue, 15 Mar 2022 04:50:46 +0000 (21:50 -0700)]
[llvm-dis] Improve missing file error message

Previously the error message didn't include the failing path, which made
it hard to tell what went wrong.

Differential Revision: https://reviews.llvm.org/D121665

2 years ago[docs][tools] Remove old llvm-bcanalyzer options
Keith Smiley [Wed, 23 Mar 2022 23:29:07 +0000 (16:29 -0700)]
[docs][tools] Remove old llvm-bcanalyzer options

These no longer exist. A few have been added since but I'm not enough of
an expert to provide a useful blurb on them outside of what you see with
`--help`.

Differential Revision: https://reviews.llvm.org/D122361

2 years ago[Object] Fix updating darwin archives
Keith Smiley [Wed, 4 May 2022 01:43:46 +0000 (18:43 -0700)]
[Object] Fix updating darwin archives

When creating an archive, llvm-ar looks at the host to determine the
archive format to use; on Apple platforms this means it uses the
K_DARWIN format. K_DARWIN is _virtually_ equivalent to K_BSD, except for
some very slight differences around padding, timestamps in deterministic
mode, and 64-bit formats. When updating an archive using llvm-ar or
llvm-objcopy, Archive would try to determine the kind, but it was not
possible to get K_DARWIN in the initialization of the archive, because
the two are virtually indistinguishable, especially since the
slight differences only apply in very specific cases. This leads to
linker failures when the alignment workaround is not applied to an
archive copied with llvm-objcopy. This change teaches Archive to infer
the K_DARWIN type in the cases where it's possible and the first object
in the archive is a macho object. This avoids using the host triple to
determine the kind, so cross compiling is not affected.

Ideally we would eliminate the separate K_DARWIN type entirely since
it's not a truly separate archive type, but then we'd have to force the
macho workarounds on the BSD format generally. This might be acceptable
but then it would be unclear how to handle this case without forcing the
K_DARWIN64 format on all BSD users:

```
if (LastOffset >= Sym64Threshold) {
  if (Kind == object::Archive::K_DARWIN)
    Kind = object::Archive::K_DARWIN64;
  else
    Kind = object::Archive::K_GNU64;
}
```

The logic used to determine if the object is macho is derived from the
logic llvm-ar uses.
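
As a rough illustration of the inference described above, the sketch below
promotes a BSD-looking archive to Darwin when its first member starts with a
Mach-O magic number. `ArchiveKind`, `looksLikeMachO` and `inferKind` are
hypothetical names invented for this sketch, and the magic-number check is an
assumption about what "is a macho object" means; the real check in llvm-ar may
cover more cases.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for object::Archive::Kind; only the two values
// needed for this sketch.
enum class ArchiveKind { BSD, Darwin };

// Returns true if the buffer starts with one of the four thin Mach-O magic
// numbers (32-bit or 64-bit, in either byte order).
bool looksLikeMachO(const unsigned char *Data, std::size_t Size) {
  if (Size < sizeof(std::uint32_t))
    return false;
  std::uint32_t Magic;
  std::memcpy(&Magic, Data, sizeof(Magic));
  return Magic == 0xfeedfaceu || Magic == 0xcefaedfeu ||
         Magic == 0xfeedfacfu || Magic == 0xcffaedfeu;
}

// Promotes an archive parsed as BSD to Darwin when its first member is
// Mach-O. The host triple is never consulted.
ArchiveKind inferKind(ArchiveKind Parsed, const unsigned char *FirstMember,
                      std::size_t Size) {
  if (Parsed == ArchiveKind::BSD && looksLikeMachO(FirstMember, Size))
    return ArchiveKind::Darwin;
  return Parsed;
}
```

Because only the archive contents are inspected, the result is the same
whether the tool runs on an Apple host or not.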

Previous context:

111cd669e90e5b2132187d36f8b141b11a671a8b
23a76be5adcaa768ba538f8a4514a7afccf61988

Differential Revision: https://reviews.llvm.org/D124895

2 years ago[ORC] Avoid more SymbolStringPtr copies.
Lang Hames [Thu, 19 May 2022 02:19:57 +0000 (19:19 -0700)]
[ORC] Avoid more SymbolStringPtr copies.

2 years ago[ORC] Add a FIXME.
Lang Hames [Thu, 19 May 2022 01:58:14 +0000 (18:58 -0700)]
[ORC] Add a FIXME.

2 years ago[ORC] Add missing std::moves, pass SymbolLookupSet by value.
Lang Hames [Thu, 19 May 2022 01:39:33 +0000 (18:39 -0700)]
[ORC] Add missing std::moves, pass SymbolLookupSet by value.

Avoids some unnecessary SymbolStringPtr copies.

2 years ago[llvm-jitlink] Print session report even if entry-point lookup errors out.
Lang Hames [Sun, 17 Apr 2022 23:38:01 +0000 (16:38 -0700)]
[llvm-jitlink] Print session report even if entry-point lookup errors out.

2 years ago[mlir][vector] Fix crash in DropInnerMostUnitDims pattern
Thomas Raoux [Thu, 19 May 2022 17:38:04 +0000 (17:38 +0000)]
[mlir][vector] Fix crash in DropInnerMostUnitDims pattern

Fix number of dimensions when incrementally replacing dimensions in
affine map.

Differential Revision: https://reviews.llvm.org/D125984

2 years ago[mlir][tensor] Add canonicalization for tensor.cast from extract_slice
Thomas Raoux [Thu, 19 May 2022 13:55:18 +0000 (13:55 +0000)]
[mlir][tensor] Add canonicalization for tensor.cast from extract_slice

Propagate static size information into extract_slice producer if
possible.

Differential Revision: https://reviews.llvm.org/D125972

2 years ago[NFC] Fix a couple of whitespace issues.
Paul Walker [Thu, 19 May 2022 11:09:36 +0000 (11:09 +0000)]
[NFC] Fix a couple of whitespace issues.

2 years agoDrop qualifiers from return types in C (DR423)
Aaron Ballman [Thu, 19 May 2022 17:05:34 +0000 (13:05 -0400)]
Drop qualifiers from return types in C (DR423)

WG14 DR423 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2148.htm#dr_423),
resolved during the C11 time frame, changed the way qualifiers are
handled on function return types and in cast expressions after it was
noted that these types are now directly observable via generic
selection expressions. In C, the function declarator is adjusted to
ignore all qualifiers (including _Atomic qualifiers).

Clang already handles the cast expression case correctly (by performing
the lvalue conversion, which drops the qualifiers as well), but with
these changes it will now also handle function declarations
appropriately.

Fixes #39595

Differential Revision: https://reviews.llvm.org/D125919

2 years ago[DeadArgElim] Use poison instead of undef as placeholder for dead arguments
Nuno Lopes [Thu, 19 May 2022 17:00:24 +0000 (18:00 +0100)]
[DeadArgElim] Use poison instead of undef as placeholder for dead arguments

It doesn't matter which value we use for dead args, so let's switch
to poison, so we can eventually kill undef.

Reviewed By: aeubanks, fhahn

Differential Revision: https://reviews.llvm.org/D125983

2 years ago[gn build] Port ca7c307d1816
LLVM GN Syncbot [Thu, 19 May 2022 16:42:14 +0000 (16:42 +0000)]
[gn build] Port ca7c307d1816

2 years ago[SelectOpti][1/5] Setup new select-optimize pass
Sotiris Apostolakis [Fri, 13 May 2022 22:29:21 +0000 (22:29 +0000)]
[SelectOpti][1/5] Setup new select-optimize pass

This is the first commit for the cmov-vs-branch optimization pass.
The goal is to develop a new profile-guided and target-independent cost/benefit analysis
for selecting conditional moves over branches when optimizing for performance.

Initially, this new pass is expected to be enabled only for instrumentation-based PGO.

RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D120230

2 years ago[NVVM] Update intrinsic definitions to include the `nocallback` attribute
Joseph Huber [Wed, 18 May 2022 23:33:17 +0000 (19:33 -0400)]
[NVVM] Update intrinsic definitions to include the `nocallback` attribute

This patch adds the `nocallback` attribute to the NVVM intrinsics that
did not use the `DefaultAttrsIntrinsic` method that includes it already.
The `nocallback` attribute states that the intrinsic function cannot
enter back into the caller's translation unit. This allows us to
determine that a function calling a `nocallback` function can have the
`norecurse` attribute. This should be safe for all the NVVM intrinsics
because they do not call other functions within the translation unit.
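
The reasoning can be sketched with a toy model. `Callee` and
`canInferNoRecurse` are hypothetical names for this illustration, not LLVM
APIs, and the sketch assumes every callee is an external declaration such as
an intrinsic, so the only way back into the caller would be a callback.

```cpp
#include <vector>

// Toy description of a call target (hypothetical, not an LLVM type).
struct Callee {
  bool NoCallback; // true if the callee cannot re-enter the caller's TU
};

// Sufficient condition under the stated assumption: if no callee can call
// back into the translation unit, the caller cannot be re-entered through
// any of its calls, so it can be marked norecurse.
bool canInferNoRecurse(const std::vector<Callee> &Callees) {
  for (const Callee &C : Callees)
    if (!C.NoCallback)
      return false;
  return true;
}
```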

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125937

2 years ago[PowerPC] Implement XL compat __fnabs and __fnabss builtins.
Amy Kwan [Thu, 19 May 2022 14:38:34 +0000 (09:38 -0500)]
[PowerPC] Implement XL compat __fnabs and __fnabss builtins.

This patch implements the following floating point negative absolute value
builtins that are required for compatibility with the XL compiler:
```
double __fnabs(double);
float __fnabss(float);
```

These builtins will emit:
- fnabs on PWR6 and below, or if VSX is disabled.
- xsnabsdp on PWR7 and above, if VSX is enabled.
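
For readers unfamiliar with the operation, a portable reference for "negative
absolute value" might look like the sketch below. `fnabsReference` and
`fnabssReference` are hypothetical helpers that only model the arithmetic
result; they are not the builtins and say nothing about which instruction is
selected.

```cpp
#include <cmath>

// Reference semantics only: negate the absolute value, so the result is
// never positive.
double fnabsReference(double X) { return -std::fabs(X); }

// Single-precision counterpart.
float fnabssReference(float X) { return -std::fabs(X); }
```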

Differential Revision: https://reviews.llvm.org/D125506

2 years ago[AMDGPU] emit macro __GFX9__ etc
Yaxun (Sam) Liu [Wed, 18 May 2022 16:44:59 +0000 (12:44 -0400)]
[AMDGPU] emit macro __GFX9__ etc

Emit predefined macros for the GPU family, e.g.
for gfx9xx GPUs emit __GFX9__, etc.
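
A minimal sketch of how device code could branch on such a macro. Only
`__GFX9__` is taken from the text above; `gpuFamily` is a hypothetical helper
for illustration.

```cpp
// Returns a label for the GPU family this translation unit is compiled for.
// Falls back to "other" when __GFX9__ is not predefined, e.g. on the host.
const char *gpuFamily() {
#if defined(__GFX9__)
  return "gfx9";
#else
  return "other";
#endif
}
```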

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D125909

2 years ago[SimpleLoopUnswitch] Skip trivial selects during trivial unswitching.
Florian Hahn [Thu, 19 May 2022 16:01:11 +0000 (17:01 +0100)]
[SimpleLoopUnswitch] Skip trivial selects during trivial unswitching.

Update the remaining places in unswitchTrivialBranch to properly skip
trivial selects.

Fixes #55526.

2 years ago[AMDGPU] Allow multiple uses of the same literal in SOP2/SOPC
Jay Foad [Thu, 19 May 2022 14:23:10 +0000 (15:23 +0100)]
[AMDGPU] Allow multiple uses of the same literal in SOP2/SOPC

AMDGPUAsmParser::validateSOPLiteral already knew about this but
SIInstrInfo::verifyInstruction did not.

Differential Revision: https://reviews.llvm.org/D125976

2 years ago[lldb] Add non-address bit improvements to release notes
David Spickett [Thu, 19 May 2022 15:35:46 +0000 (15:35 +0000)]
[lldb] Add non-address bit improvements to release notes

This summarises the changes made by d9398a91e2a6b8837a47a5fda2164c9160e86199,
which forms the bulk of the fixes needed for non-address bit handling.

Note that in the previous releases we noted memory tagging support,
which is a subset of non-address bits. The recent changes enable
debugging of programs using memory tagging, pointer authentication
and top byte ignore (all at once) on AArch64.

2 years ago[clang] Fix __has_builtin
Yaxun (Sam) Liu [Tue, 17 May 2022 19:12:03 +0000 (15:12 -0400)]
[clang] Fix __has_builtin

Fix __has_builtin to return 1 only if the requested target features
of a builtin are enabled. This is done by refactoring the code that
checks the required target features of a builtin and using it in the
evaluation of __has_builtin.
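
The guard pattern that relies on this behaviour can be sketched as follows.
`countBits` is a hypothetical helper, and `__builtin_popcount` is used only
because it is widely available; it has no target-feature requirement, so it
does not itself exercise the fixed case.

```cpp
// Fallback for compilers that do not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Counts set bits, using the builtin only when the compiler reports that
// it is usable and a portable loop otherwise.
unsigned countBits(unsigned V) {
#if __has_builtin(__builtin_popcount)
  return static_cast<unsigned>(__builtin_popcount(V));
#else
  unsigned N = 0;
  for (; V != 0; V &= V - 1) // clear the lowest set bit each iteration
    ++N;
  return N;
#endif
}
```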

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D125829

2 years ago[LoopVectorize] Don't interleave when the number of runtime checks exceeds the threshold
Tiehu Zhang [Thu, 19 May 2022 15:24:14 +0000 (23:24 +0800)]
[LoopVectorize] Don't interleave when the number of runtime checks exceeds the threshold

The runtime check threshold should also restrict interleave count.
Otherwise, too many runtime checks will be generated for some cases.
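
The intent can be sketched as a simple clamp. `clampInterleaveCount` and its
parameters are hypothetical names for illustration and do not mirror the
LoopVectorize sources.

```cpp
// Returns the interleave count to use: when the number of runtime checks
// exceeds the threshold, fall back to 1 (no interleaving); otherwise keep
// the requested count.
unsigned clampInterleaveCount(unsigned Requested, unsigned NumRuntimeChecks,
                              unsigned Threshold) {
  return NumRuntimeChecks > Threshold ? 1u : Requested;
}
```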

Reviewed By: fhahn, dmgreen

Differential Revision: https://reviews.llvm.org/D122126

2 years ago[LoopVectorize] Precommit a test for D122126
Tiehu Zhang [Thu, 19 May 2022 15:14:59 +0000 (23:14 +0800)]
[LoopVectorize] Precommit a test for D122126

2 years ago[VPlan] Update VPWidenMemoryInstruction to not inherit from VPValue.
Florian Hahn [Thu, 19 May 2022 15:24:38 +0000 (16:24 +0100)]
[VPlan] Update VPWidenMemoryInstruction to not inherit from VPValue.

VPWidenMemoryInstruction also models stores which may not produce a value.
This can trip over analyses. Improve the modeling by only adding
VPValues for VPWidenMemoryInstructionRecipes modeling loads.

2 years ago[libc++] Override the value of LIBCXX_CXX_ABI in the cache
Louis Dionne [Thu, 19 May 2022 15:20:26 +0000 (11:20 -0400)]
[libc++] Override the value of LIBCXX_CXX_ABI in the cache

This will allow us to remove this entirely once the commit has propagated
through all CI and hence changed the value in the cache.

2 years ago[NFC] Fix typos in X86CmovConversion
Sotiris Apostolakis [Thu, 19 May 2022 05:37:22 +0000 (05:37 +0000)]
[NFC] Fix typos in X86CmovConversion

2 years ago[libunwind] Remove unused _LIBUNWIND_HAS_NO_THREADS macro in tests
Louis Dionne [Thu, 19 May 2022 14:57:13 +0000 (10:57 -0400)]
[libunwind] Remove unused _LIBUNWIND_HAS_NO_THREADS macro in tests

The _LIBUNWIND_HAS_NO_THREADS macro is only picked up by libunwind
inside its sources, so it is only required when it builds. It doesn't
need to be defined when running the tests.

2 years ago[AMDGPU] gfx11 scalar memory instructions
Joe Nash [Mon, 25 Apr 2022 17:33:24 +0000 (13:33 -0400)]
[AMDGPU] gfx11 scalar memory instructions

Contributors:
Mirko Brkusanin <Mirko.Brkusanin@amd.com>

Patch 9/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125820

Reviewed By: kosarev, #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125822

2 years ago[runtimes] Fix the build of merged ABI/unwinder libraries
Louis Dionne [Wed, 18 May 2022 16:05:45 +0000 (12:05 -0400)]
[runtimes] Fix the build of merged ABI/unwinder libraries

Also, add a CI job that tests this configuration. The exact configuration
is that we build a shared libc++ and merge objects for the ABI library
and the unwinder library into it.

Differential Revision: https://reviews.llvm.org/D125903

2 years ago[flang][driver] Add support for generating executables on MacOSX/Darwin
Andrzej Warzynski [Sun, 15 May 2022 11:35:37 +0000 (12:35 +0100)]
[flang][driver] Add support for generating executables on MacOSX/Darwin

This patch basically extends https://reviews.llvm.org/D122008 with
support for MacOSX/Darwin.

To facilitate this, I've added `MacOSX` to the list of supported OSes in
Target.cpp. Flang already supports `Darwin` and it doesn't really do
anything OS-specific there (it could probably safely skip checking the
OS for now).

Note that generating executables remains hidden behind the
`-flang-experimental-exec` flag. Also, we don't need to add `-lm` on
MacOSX as `libm` is effectively included in `libSystem` (which is linked
in unconditionally).

Differential Revision: https://reviews.llvm.org/D125628

2 years ago[flang][OpenMP] Support for Collapse
Mats Petersson [Wed, 7 Jul 2021 15:58:32 +0000 (16:58 +0100)]
[flang][OpenMP] Support for Collapse

Convert Fortran parse-tree into MLIR for collapse-clause.

Includes simple Fortran to LLVM-IR test, with auto-generated
check-lines (some of which have been edited by hand).

Reviewed By: kiranchandramohan, shraiysh, peixin

Differential Revision: https://reviews.llvm.org/D125302

2 years ago[AMDGPU] gfx11 LDSDIR instructions MC support
Joe Nash [Fri, 22 Apr 2022 19:18:40 +0000 (15:18 -0400)]
[AMDGPU] gfx11 LDSDIR instructions MC support

Contributors:
Carl Ritson <carl.ritson@amd.com>

Patch 8/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125498

Reviewed By: critson, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D125820

2 years ago[libc++] Granularize algorithm benchmarks
Nikolas Klauser [Thu, 19 May 2022 10:50:02 +0000 (12:50 +0200)]
[libc++] Granularize algorithm benchmarks

Reviewed By: ldionne, #libc

Spies: libcxx-commits, mgorny, mgrang

Differential Revision: https://reviews.llvm.org/D124740

2 years ago[flang][NFC] Allow whitespaces before `ERROR`
Daniil Dudkin [Thu, 19 May 2022 14:11:51 +0000 (17:11 +0300)]
[flang][NFC] Allow whitespaces before `ERROR`

This change allows whitespace before the `ERROR` keyword in semantic
tests, for consistency with other testing infrastructure.

Also, one test is changed in order to test if the change works
correctly.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D125884

2 years ago[libc++] Enable move semantics for vector in C++03
Nikolas Klauser [Thu, 19 May 2022 10:46:09 +0000 (12:46 +0200)]
[libc++] Enable move semantics for vector in C++03

We require move semantics in C++03 anyway, so let's enable them for the containers.
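
A small sketch of what this permits: moving a vector transfers its buffer
instead of copying the elements. `takeOwnership` is a hypothetical helper;
under `-std=c++03` this relies on libc++ providing `std::move` as an
extension.

```cpp
#include <utility>
#include <vector>

// Move-constructs a new vector from Source, leaving Source empty.
std::vector<int> takeOwnership(std::vector<int> &Source) {
  return std::vector<int>(std::move(Source));
}
```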

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D123802