platform/upstream/llvm.git
2 years ago[RISCV] Fold (sra (add (shl X, 32), C1), 32 - C) -> (shl (sext_inreg (add X, C1)), C)
Craig Topper [Thu, 30 Jun 2022 15:52:57 +0000 (08:52 -0700)]
[RISCV] Fold (sra (add (shl X, 32), C1), 32 - C) -> (shl (sext_inreg (add X, C1)), C)

Similar for a subtract with a constant left hand side.

(sra (add (shl X, 32), C1<<32), 32) is the canonical IR from InstCombine
for (sext (add (trunc X to i32), C1) to i64).

For RISCV, we should lower this as addiw which means turning it into
(sext_inreg (add X, C1)).
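
The equivalence behind this fold can be checked with plain 64-bit integer arithmetic. The following is only a minimal sketch, not the DAG combine itself: the names `shiftedForm`, `addiwForm` and `checkFold` are hypothetical, C1 is assumed to fit in 32 bits, and right shifts of negative values are assumed to be arithmetic (true of mainstream compilers and guaranteed from C++20).

```cpp
#include <cassert>
#include <cstdint>

// Original pattern: (sra (add (shl X, 32), C1 << 32), 32).
// The left shifts are done on unsigned values to avoid signed overflow.
int64_t shiftedForm(int64_t x, int32_t c1) {
  const uint64_t shiftedX = static_cast<uint64_t>(x) << 32;
  const uint64_t shiftedC =
      static_cast<uint64_t>(static_cast<int64_t>(c1)) << 32;
  return static_cast<int64_t>(shiftedX + shiftedC) >> 32;
}

// Folded pattern: (sext_inreg (add X, C1)), i.e. add, keep the low
// 32 bits, then sign-extend them, which is what a W-form add computes.
int64_t addiwForm(int64_t x, int32_t c1) {
  const uint64_t sum =
      static_cast<uint64_t>(x) + static_cast<uint64_t>(static_cast<int64_t>(c1));
  return static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(sum)));
}

// Usage: both forms should agree for any X, including X with junk upper bits.
void checkFold(int64_t x, int32_t c1) {
  assert(shiftedForm(x, c1) == addiwForm(x, c1));
}
```

Both forms depend only on the low 32 bits of X + C1, which is why the upper bits of X do not matter.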

There is an existing DAG combine to convert back to (sext (add (trunc X
to i32), C1) to i64), but it requires isTruncateFree to return true
and i32 to be a legal type, as it uses sign_extend and truncate
nodes. So that doesn't work for RISCV.

If the outer sra happens to be used by a shl by constant, it will be
folded and the shift amount of the sra will be changed before we
can do our own DAG combine. This requires us to match the more
general pattern and restore the shl.

I had wanted to do this as a separate (add (shl X, 32), C1<<32) ->
(shl (add X, C1), 32) combine, but that hit an infinite loop for some
values of C1.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D128869

2 years ago[RISCV] DAG combine (sra (shl X, 32), 32 - C) -> (shl (sext_inreg X, i32), C).
Craig Topper [Thu, 30 Jun 2022 15:52:43 +0000 (08:52 -0700)]
[RISCV] DAG combine (sra (shl X, 32), 32 - C) -> (shl (sext_inreg X, i32), C).

The sext_inreg can often be folded into an earlier instruction by
using a W instruction. The sext_inreg also works better with our ABI.
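
As a rough illustration of why the two forms agree, here is a small sketch in ordinary integer arithmetic. The function names are hypothetical, C is assumed to be in the range 0..31, and right shifts of negative values are assumed to be arithmetic (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Original pattern: (sra (shl X, 32), 32 - C).
int64_t sraOfShl(int64_t x, unsigned c) {
  assert(c < 32 && "sketch only covers 0 <= C < 32");
  const uint64_t shifted = static_cast<uint64_t>(x) << 32;
  return static_cast<int64_t>(shifted) >> (32 - c);
}

// Folded pattern: (shl (sext_inreg X, i32), C).
int64_t shlOfSext(int64_t x, unsigned c) {
  assert(c < 32 && "sketch only covers 0 <= C < 32");
  // sext_inreg: keep the low 32 bits and sign-extend them to 64 bits.
  const int64_t sext =
      static_cast<int32_t>(static_cast<uint32_t>(static_cast<uint64_t>(x)));
  // Shift as unsigned to stay clear of signed-shift pitfalls before C++20.
  return static_cast<int64_t>(static_cast<uint64_t>(sext) << c);
}
```

Both functions compute the sign-extended low 32 bits of X multiplied by 2^C; because C is below 32 the product always fits in 64 bits.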

This is one of the steps to improving the generated code for this https://godbolt.org/z/hssn6sPco

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D128843

2 years ago[RISCV] Pre-commit tests for D128869. NFC
Craig Topper [Wed, 29 Jun 2022 23:23:41 +0000 (16:23 -0700)]
[RISCV] Pre-commit tests for D128869. NFC

2 years ago[llvm] Fix the modules build
Jonas Devlieghere [Thu, 30 Jun 2022 15:55:51 +0000 (08:55 -0700)]
[llvm] Fix the modules build

Fixes error: missing '#include "llvm/IR/FMF.h"'; 'FastMathFlags' must be
defined before it is used in llvm/include/llvm/IR/NoFolder.h.

2 years ago[llvm-reduce] Add support for LTO bitcode files
Matthew Voss [Thu, 30 Jun 2022 15:53:00 +0000 (08:53 -0700)]
[llvm-reduce] Add support for LTO bitcode files

Adds support for reading and writing LTO bitcode files.

  - Emit a summary if the original bitcode file had a summary
  - Use split LTO units if the original bitcode file used them.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D127168

2 years ago[flang] Fix one corner case in reshape intrinsic
Peixin Qiao [Thu, 30 Jun 2022 15:52:44 +0000 (23:52 +0800)]
[flang] Fix one corner case in reshape intrinsic

Per Fortran 2018 16.9.163, reshape is the only intrinsic that
requires the shape argument to be a rank-one integer array whose SIZE
is a constant expression. The current expression lowering converts a
shape expression containing a slice into a box value whose element
type has unknown extent. However, genReshape requires the box element
type to have constant size. So, convert the box value into one whose
box element type is a sequence of 1 x constant. This corner case was
found in cam4 in SPEC 2017:
https://github.com/llvm/llvm-project/issues/56140.

Reviewed By: Jean Perier

Differential Revision: https://reviews.llvm.org/D128597

2 years ago[ARM] Add Thumb-1 CTTZ codegen tests. NFC
David Green [Thu, 30 Jun 2022 15:45:00 +0000 (16:45 +0100)]
[ARM] Add Thumb-1 CTTZ codegen tests. NFC

2 years ago[AMDGPU] gfx11 WMMA instruction support
Piotr Sobczak [Tue, 28 Jun 2022 18:00:03 +0000 (14:00 -0400)]
[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756

2 years ago[flang][NFC] Fix warning
Valentin Clement [Thu, 30 Jun 2022 14:56:29 +0000 (16:56 +0200)]
[flang][NFC] Fix warning

2 years ago[pseudo] Forest dump ascii art isn't broken by large indices
Sam McCall [Thu, 30 Jun 2022 14:52:55 +0000 (16:52 +0200)]
[pseudo] Forest dump ascii art isn't broken by large indices

2 years ago[libc++] Remove dead code and unneeded C++03 specializations from type_traits
Nikolas Klauser [Thu, 30 Jun 2022 12:11:26 +0000 (14:11 +0200)]
[libc++] Remove dead code and unneeded C++03 specializations from type_traits

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D128906

2 years ago[libc++] Implement P0618R0 (Deprecating <codecvt>)
Nikolas Klauser [Thu, 30 Jun 2022 11:47:26 +0000 (13:47 +0200)]
[libc++] Implement P0618R0 (Deprecating <codecvt>)

Reviewed By: ldionne, #libc

Spies: cfe-commits, llvm-commits, libcxx-commits

Differential Revision: https://reviews.llvm.org/D127313

2 years ago[libc][Obvious] Do not add __NO_ to targets with FLAG__NO suffix.
Tue Ly [Thu, 30 Jun 2022 14:44:08 +0000 (10:44 -0400)]
[libc][Obvious] Do not add __NO_ to targets with FLAG__NO suffix.

2 years ago[lldb] Fix libc++ string formatter for the "unstable" layout
Pavel Labath [Thu, 30 Jun 2022 14:30:51 +0000 (16:30 +0200)]
[lldb] Fix libc++ string formatter for the "unstable" layout

D128285 only changed the stable (v1) layout, so the matching change in
D128694 broke the formatting of the unstable strings. This fixes that,
and ensures compatibility with all older layouts as well.

2 years ago[IRBuilder] Migrate all binops to folding API
Nikita Popov [Thu, 30 Jun 2022 10:52:31 +0000 (12:52 +0200)]
[IRBuilder] Migrate all binops to folding API

Migrate all binops to use FoldXYZ rather than CreateXYZ APIs,
which are compatible with InstSimplifyFolder and fallible constant
folding.

Rather than continuing to add one method for every single operator,
add a generic FoldBinOp (plus variants for nowrap, exact and fmf
operators), which we would need anyway for CreateBinaryOp.

This change is not NFC because IRBuilder with InstSimplifyFolder
may perform more folding. However, this patch changes SCEVExpander
to not use the folder in InsertBinOp to minimize practical impact
and keep this change as close to NFC as possible.

2 years agoFix PDB/func-symbols.test for Arm/Windows
Muhammad Omair Javaid [Wed, 29 Jun 2022 09:22:56 +0000 (13:22 +0400)]
Fix PDB/func-symbols.test for Arm/Windows

PDB/func-symbols.test was originally written for 32-bit x86, keeping in
mind the cdecl and stdcall calling conventions, which mangle names, for
example by adding an "_" (underscore) before the function name.
This is x86-specific, and the purpose of pointers.test is NOT to test
calling conventions.
I have made a minor change to make this test pass on Windows/Arm.

2 years agoadd testcases for D128647, NFC
Chen Zheng [Mon, 27 Jun 2022 10:24:41 +0000 (06:24 -0400)]
add testcases for D128647, NFC

2 years agoFix TestCommandScript.py for Arm/Windows
Muhammad Omair Javaid [Wed, 29 Jun 2022 09:34:36 +0000 (13:34 +0400)]
Fix TestCommandScript.py for Arm/Windows

TestCommandScript.py fails on Arm/Windows due to the following issues:
https://llvm.org/pr56288
https://llvm.org/pr56292

LLDB fails to skip the prologue, and stepping over library functions or
nodebug functions also fails due to a PDB/DWARF mismatch.

This patch replaces the function breakpoint with a line breakpoint so that
we can expect LLDB to stop on the desired line. It also replaces DWARF with
PDB debug info for this test only.

2 years agoDeferred Concept Instantiation Implementation
Erich Keane [Thu, 19 May 2022 13:44:34 +0000 (06:44 -0700)]
Deferred Concept Instantiation Implementation

This is a continuation of D119544. Based on @rsmith's feedback
pointing me to https://eel.is/c++draft/temp#friend-9, we should properly
handle friend functions now.

Differential Revision: https://reviews.llvm.org/D126907

2 years ago[flang] Convert assertion to a TODO
Valentin Clement [Thu, 30 Jun 2022 13:45:39 +0000 (15:45 +0200)]
[flang] Convert assertion to a TODO

The original assertion is not necessarily correct since the shape
argument may involve a slice of an array (an expression) and not a whole
vector with constant length. In the presence of a slice operation, the
size must be computed (left as a TODO for now).

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128894

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
2 years ago[gn build] Port a591c7ca0d9f
LLVM GN Syncbot [Thu, 30 Jun 2022 13:27:00 +0000 (13:27 +0000)]
[gn build] Port a591c7ca0d9f

2 years ago[VNCoercion] Separate constant/non-constant mem intrinsic implementations (NFCI)
Nikita Popov [Thu, 30 Jun 2022 13:19:26 +0000 (15:19 +0200)]
[VNCoercion] Separate constant/non-constant mem intrinsic implementations (NFCI)

This means we no longer need to have the same API between IRBuilder
and IRBuilderFolder.

The constant case is substantially simpler, so implementing it
separately isn't an undue burden.

2 years ago[HLSL] Change WaveActiveCountBits to wrapper of __builtin_hlsl_wave_active_count_bits
Xiang Li [Wed, 29 Jun 2022 21:00:28 +0000 (14:00 -0700)]
[HLSL] Change WaveActiveCountBits to wrapper of __builtin_hlsl_wave_active_count_bits

Change WaveActiveCountBits from builtin into wrapper of __builtin_hlsl_wave_active_count_bits.
For comment at
https://reviews.llvm.org/D126857#inline-1235949

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D128855

2 years agoCorrect -Winfinite-recursion warning on potentially-unevaluated operand
Prathit Aswar [Thu, 30 Jun 2022 13:07:49 +0000 (09:07 -0400)]
Correct -Winfinite-recursion warning on potentially-unevaluated operand

Fixing issue "incorrect -Winfinite-recursion warning on potentially-
unevaluated operand".

We add a dedicated visit function (VisitCXXTypeidExpr) for typeid,
instead of using the default (VisitStmt). In this new function we skip
over building the CFG for unevaluated operands of typeid.
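
To make the false positive concrete, here is a small example of the kind of code involved. It is an illustration written for this note, not the reproducer from the issue; `Plain` and `makePlain` are hypothetical names. Because `Plain` is not a polymorphic class, the operand of `typeid` is unevaluated, so the function never actually calls itself.

```cpp
#include <cassert>
#include <typeinfo>

struct Plain {
  int value;
};

// The self-reference below appears only inside an unevaluated typeid
// operand, so calling makePlain() terminates normally.
Plain makePlain() {
  const std::type_info &info = typeid(makePlain());
  assert(info == typeid(Plain));
  return Plain{1};
}
```

Had `Plain` declared a virtual function, the operand would still be a prvalue here and so remain unevaluated; only a glvalue of polymorphic class type is evaluated by `typeid`.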

Fixes #21668

Differential Revision: https://reviews.llvm.org/D128747

2 years ago[VNCoercion] Use ConstantFoldLoadFromConst API (NFCI)
Nikita Popov [Thu, 30 Jun 2022 12:49:44 +0000 (14:49 +0200)]
[VNCoercion] Use ConstantFoldLoadFromConst API (NFCI)

Nowadays we have a generic constant folding API to load a type from
an offset. It should be able to do anything that VNCoercion can do.

This avoids the weird templating between IRBuilder and ConstantFolder
in one function, which will stop working as the IRBuilderFolder
moves from CreateXYZ to FoldXYZ APIs.

Unfortunately, this doesn't eliminate this pattern from VNCoercion
entirely yet.

2 years ago[libTooling][NFC] Add a comment about comment parsing to getAssociatedRange.
Aaron Jacobs [Thu, 30 Jun 2022 12:45:42 +0000 (12:45 +0000)]
[libTooling][NFC] Add a comment about comment parsing to getAssociatedRange.

It took me multiple hours of debugging plus asking an expert for help to
figure out why this function didn't do what it promised to do. It turns
out there is a flag that needs to be set. Document this, in an attempt
to save the next person the surprise.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D128774

2 years ago[libc++] Disentangle _If, _Or and _And
Nikolas Klauser [Thu, 30 Jun 2022 10:57:51 +0000 (12:57 +0200)]
[libc++] Disentangle _If, _Or and _And

Reviewed By: ldionne, #libc, EricWF

Spies: EricWF, libcxx-commits

Differential Revision: https://reviews.llvm.org/D127919

2 years ago[LV] Move LoopVersioning creation to LVP::execute.
Florian Hahn [Thu, 30 Jun 2022 11:14:31 +0000 (12:14 +0100)]
[LV] Move LoopVersioning creation to LVP::execute.

At the moment LoopVersioning is only created for inner-loop
vectorization. This patch moves it to LVP::execute, which means it will
also be added for epilogue vectorization. As a consequence, the proper
noalias metadata is now also added to epilogue vector loops.

LVer will be moved to VPTransformState as follow-up.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127966

2 years ago[test] Add a lit test fshl-splat-undef.ll
Xiang1 Zhang [Thu, 30 Jun 2022 10:25:01 +0000 (18:25 +0800)]
[test] Add a lit test fshl-splat-undef.ll

2 years ago[NFC][XCOFF] remove an unused global variable.
esmeyi [Thu, 30 Jun 2022 10:55:49 +0000 (06:55 -0400)]
[NFC][XCOFF] remove an unused global variable.

2 years agoUglify __support/xlocale
Michael Platings [Tue, 28 Jun 2022 09:42:20 +0000 (10:42 +0100)]
Uglify __support/xlocale

This allows including the headers without risk of conflict with
user-defined macros e.g. max

Differential Revision: https://reviews.llvm.org/D128728

2 years ago[IR] Fix typo in comment. NFC
Fraser Cormack [Thu, 30 Jun 2022 10:30:12 +0000 (11:30 +0100)]
[IR] Fix typo in comment. NFC

2 years ago[mlir][Linalg] Uniformize SplitReduction transforms and add option to use Bufferizati...
Nicolas Vasilache [Tue, 28 Jun 2022 12:17:32 +0000 (05:17 -0700)]
[mlir][Linalg] Uniformize SplitReduction transforms and add option to use Bufferization::AllocTensor

This revision merges the 2 split_reduction transforms and adds extra control by using attributes.

SplitReduction is known to require a concrete additional buffer to store temporary information.
Add an option to introduce a `bufferization.alloc_tensor` instead of `linalg.init_tensor`.
This behaves better with subset-based tiling and bufferization.

Differential Revision: https://reviews.llvm.org/D128722

2 years ago[InstCombine] fix overzealous assert in icmp-shr fold
Sanjay Patel [Thu, 30 Jun 2022 10:14:30 +0000 (06:14 -0400)]
[InstCombine] fix overzealous assert in icmp-shr fold

The assert was added with 0399473de886595d and is correct for that
pattern, but it is off-by-1 with the enhancement in d4f39d833332.

The transforms are still correct with the new pre-condition:
https://alive2.llvm.org/ce/z/6_6ghm
https://alive2.llvm.org/ce/z/_GTBUt

And as shown in the new test, the transform is expected with
'ult' - in that case, the icmp reduces to test if the shift
amount is 0.

2 years ago[ConstantFold] Support loads in ConstantFoldInstOperands()
Nikita Popov [Thu, 30 Jun 2022 10:16:57 +0000 (12:16 +0200)]
[ConstantFold] Support loads in ConstantFoldInstOperands()

This allows all constant folding to happen through a single
function, without requiring special handling for loads at each
call-site.

This may not be NFC because some callers currently don't do that
special handling.

2 years ago[gn build] Port cfb7ffdec0eb
LLVM GN Syncbot [Thu, 30 Jun 2022 10:11:58 +0000 (10:11 +0000)]
[gn build] Port cfb7ffdec0eb

2 years ago[gn build] Port 72cd6b6c8356
LLVM GN Syncbot [Thu, 30 Jun 2022 10:11:58 +0000 (10:11 +0000)]
[gn build] Port 72cd6b6c8356

2 years ago[LLDB] Fix TestSTL.py Makefile to remove -gdwarf O0
Muhammad Omair Javaid [Thu, 30 Jun 2022 10:01:30 +0000 (14:01 +0400)]
[LLDB] Fix TestSTL.py Makefile to remove -gdwarf O0

This is a follow up to my previous commit where TestSTL.py got broken
due to 9c6e04359282e9051f7b2744b99266ece32db001.
Now that we force dwarf symbols by default on windows we don't need to
specifically put -gdwarf O0 in debug flags for this test.

2 years ago[OpenCL] Remove half scalar vload/vstore builtins
Sven van Haastregt [Thu, 30 Jun 2022 10:01:19 +0000 (11:01 +0100)]
[OpenCL] Remove half scalar vload/vstore builtins

These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.

Differential Revision: https://reviews.llvm.org/D128434

2 years ago[Pipelines] Add a test how DCE works after ArgumentPromotion
Pavel Samolysov [Thu, 30 Jun 2022 09:54:02 +0000 (12:54 +0300)]
[Pipelines] Add a test how DCE works after ArgumentPromotion

The ArgumentPromotion pass uses Mem2Reg promotion at the end to cut
down generated alloca instructions as well as meaningless stores, and
this behavior can leave unused (dead) arguments.

The test shows that the arguments are not removed in the current
optimization pipeline.

2 years ago[Evaluator] Add missing LLVM_DEBUG()
Nikita Popov [Thu, 30 Jun 2022 09:54:47 +0000 (11:54 +0200)]
[Evaluator] Add missing LLVM_DEBUG()

Missed these in 41f0b6a78143776d673565cfa830849e3b468b8e, resulting
in unconditional debug output.

2 years ago[InlineCost] Simplify constant folding
Nikita Popov [Thu, 30 Jun 2022 09:14:01 +0000 (11:14 +0200)]
[InlineCost] Simplify constant folding

Use a common ConstantFoldInstOperands-based constant folding
implementation, instead of specifying the folding function for
each function individually. Going through the generic handling
doesn't appear to have any significant compile-time impact.

As the test change shows, this is not NFC, because we now use
DataLayout-aware constant folding, which can do slightly better
in some cases (e.g. those involving GEPs).

2 years agoadd testcase for D127202, NFC
Chen Zheng [Thu, 23 Jun 2022 08:35:16 +0000 (04:35 -0400)]
add testcase for D127202, NFC

2 years ago[InlineFunction] Only check pointer arguments for a call
Chen Zheng [Thu, 30 Jun 2022 09:26:27 +0000 (05:26 -0400)]
[InlineFunction] Only check pointer arguments for a call

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128529

2 years ago[LLDB] Fix TestSTL.py on Windows
Muhammad Omair Javaid [Thu, 30 Jun 2022 08:25:43 +0000 (12:25 +0400)]
[LLDB] Fix TestSTL.py on Windows

TestSTL.py was broken by 9c6e04359282e9051f7b2744b99266ece32db001.
This patch fixes it with changes to its Makefile.

2 years ago[X86] Support `_Float16` on SSE2 and up
Phoebe Wang [Thu, 30 Jun 2022 08:40:29 +0000 (16:40 +0800)]
[X86] Support `_Float16` on SSE2 and up

This is split from D113107 to address #56204 and https://discourse.llvm.org/t/how-to-build-compiler-rt-for-new-x86-half-float-abi/63366

Reviewed By: zahiraam, rjmccall, bkramer, MaskRay

Differential Revision: https://reviews.llvm.org/D128571

2 years ago[Evaluator] Use ConstantFoldInstOperands()
Nikita Popov [Thu, 30 Jun 2022 08:44:08 +0000 (10:44 +0200)]
[Evaluator] Use ConstantFoldInstOperands()

For instructions that don't need any special handling, use
ConstantFoldInstOperands(), rather than re-implementing individual
cases.

This is probably not NFC because it can handle cases the previous
code missed (e.g. vector operations).

2 years ago[ConstantFold] Supports compares in ConstantFoldInstOperands()
Nikita Popov [Thu, 30 Jun 2022 09:02:36 +0000 (11:02 +0200)]
[ConstantFold] Supports compares in ConstantFoldInstOperands()

Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).

This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.

2 years ago[LoongArch] Fix wrong function names in bstrpick_w.ll. NFC
Weining Lu [Thu, 30 Jun 2022 08:58:52 +0000 (16:58 +0800)]
[LoongArch] Fix wrong function names in bstrpick_w.ll. NFC

2 years ago[RISCV] Add a test covering a (reverted) codegen issue
Fraser Cormack [Thu, 30 Jun 2022 08:25:35 +0000 (09:25 +0100)]
[RISCV] Add a test covering a (reverted) codegen issue

This test checks one of the problematic cases outlined in D128006, leading
to the patch's reversal. I thought it best to add a test just in case
this sort of optimization is attempted again in the future in some
fashion.

2 years ago[flang] Fix for array upper bounds with *
Valentin Clement [Thu, 30 Jun 2022 08:36:47 +0000 (10:36 +0200)]
[flang] Fix for array upper bounds with *

Even though the array is declared with '*' upper bounds, it has an
initial value that has a statically known shape. Use the shape from
the type of the initializer when the declared size is '*'.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128889

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
2 years ago[flang][NFC] Add FIR array test
Valentin Clement [Thu, 30 Jun 2022 08:35:43 +0000 (10:35 +0200)]
[flang][NFC] Add FIR array test

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128888

Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
2 years ago[BOLT] Fix getDynoStats to handle BCs with no functions
Amir Ayupov [Thu, 30 Jun 2022 08:15:49 +0000 (01:15 -0700)]
[BOLT] Fix getDynoStats to handle BCs with no functions

Address fuzzer crash

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120696

2 years ago[VPlan] Make sure optimizeInductions removes wide ind from scalar plan.
Florian Hahn [Thu, 30 Jun 2022 08:08:33 +0000 (09:08 +0100)]
[VPlan] Make sure optimizeInductions removes wide ind from scalar plan.

In some cases, there may be widened users of inductions even though the
plan includes the scalar VF. In those cases, make sure we still replace
the VPWidenIntOrFpInductionRecipe with scalar steps, as otherwise we may
try to execute a VPWidenIntOrFpInductionRecipe with a scalar VF.

Alternatively the patch could also split the range if needed.

This fixes a crash exposed by D123720.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D128755

2 years ago[flang] Correct bug in literal CHARACTER constant names
Valentin Clement [Thu, 30 Jun 2022 08:09:47 +0000 (10:09 +0200)]
[flang] Correct bug in literal CHARACTER constant names

The names of CHARACTER strings were being truncated leading to invalid
collisions and other failures. This change makes sure to use the entire
string as the seed for the unique name.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D128884

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
2 years ago[mlir][SCF][bufferize][NFC] Utilize recently added helper function
Matthias Springer [Thu, 30 Jun 2022 07:31:45 +0000 (09:31 +0200)]
[mlir][SCF][bufferize][NFC] Utilize recently added helper function

This should have been part of D128666.

Differential Revision: https://reviews.llvm.org/D128885

2 years ago[NFC] [Modules] Add test for inherit default arguments
Chuanqi Xu [Thu, 30 Jun 2022 07:48:22 +0000 (15:48 +0800)]
[NFC] [Modules] Add test for inherit default arguments

2 years ago[X86][BOLT] Use getOperandType to determine memory access size
Amir Ayupov [Mon, 13 Jun 2022 21:46:43 +0000 (14:46 -0700)]
[X86][BOLT] Use getOperandType to determine memory access size

Generate INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc.

This diff adds support for instructions that were previously reported as having
memory access size 0. It replaces the heuristic of looking at instruction
register width to determine memory access width by instead checking the memory
operand type using tablegen-provided tables.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D126116

2 years ago[SCCP] Simplify CFG in SCCP as well
Nikita Popov [Mon, 27 Jun 2022 15:14:41 +0000 (17:14 +0200)]
[SCCP] Simplify CFG in SCCP as well

Currently, we only remove dead blocks and non-feasible edges in
IPSCCP, but not in SCCP. I'm not aware of any strong reason for
that difference, so this patch updates SCCP to perform the CFG
cleanup as well.

Compile-time impact seems to be pretty minimal, in the 0.05%
geomean range on CTMark.

For the test case from https://reviews.llvm.org/D126962#3611579
the result after -sccp now looks like this:

    define void @test(i1 %c) {
    entry:
      br i1 %c, label %unreachable, label %next
    next:
      unreachable
    unreachable:
      call void @bar()
      unreachable
    }

-jump-threading does nothing on this, but -simplifycfg will produce
the optimal result.

Differential Revision: https://reviews.llvm.org/D128796

2 years ago[flang] SELECT CASE constructs with character selectors that require a temp
Valentin Clement [Thu, 30 Jun 2022 07:03:49 +0000 (09:03 +0200)]
[flang] SELECT CASE constructs with character selectors that require a temp

Here is a character SELECT CASE construct that requires a temp to hold the
result of the TRIM intrinsic call:

```
module m
      character(len=6) :: s
    contains
      subroutine sc
        n = 0
        if (lge(s,'00')) then
          select case(trim(s))
          case('11')
             n = 1
          case default
             continue
          case('22')
             n = 2
          case('33')
             n = 3
          case('44':'55','66':'77','88':)
             n = 4
          end select
        end if
        print*, n
      end subroutine
    end module m
```

This SELECT CASE construct is implemented as an IF/ELSE-IF/ELSE comparison
sequence.  The temp must be retained until some comparison is successful.
At that point the temp may be freed.  Generalize statement context processing
to allow multiple finalize calls to do this, such that the program always
executes exactly one freemem call.
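
The control-flow shape described above can be sketched outside of flang as follows. This is only an analogy under stated assumptions: `trimToTemp`, `releaseTemp`, `selectCase` and the `freeCount` counter are hypothetical stand-ins, only a few of the cases are modelled, and plain `strcmp` replaces Fortran character comparison. Each branch of the IF/ELSE-IF/ELSE chain carries its own release call, yet exactly one of them runs per execution.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

static int freeCount = 0; // counts releases so the invariant can be observed

// Hypothetical stand-in for TRIM: copies s without trailing blanks
// into a heap-allocated temporary.
static char *trimToTemp(const std::string &s) {
  const std::size_t end = s.find_last_not_of(' ');
  const std::string trimmed =
      (end == std::string::npos) ? std::string() : s.substr(0, end + 1);
  char *temp = static_cast<char *>(std::malloc(trimmed.size() + 1));
  assert(temp && "allocation failed");
  std::memcpy(temp, trimmed.c_str(), trimmed.size() + 1);
  return temp;
}

static void releaseTemp(char *temp) {
  ++freeCount;
  std::free(temp);
}

// The temp stays alive until one comparison succeeds (or all fail),
// then it is released on that path only.
int selectCase(const std::string &s) {
  char *temp = trimToTemp(s);
  if (std::strcmp(temp, "11") == 0) {
    releaseTemp(temp);
    return 1;
  } else if (std::strcmp(temp, "22") == 0) {
    releaseTemp(temp);
    return 2;
  } else if (std::strcmp(temp, "33") == 0) {
    releaseTemp(temp);
    return 3;
  }
  releaseTemp(temp); // default case
  return 0;
}
```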

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: klausler, vdonaldson

Differential Revision: https://reviews.llvm.org/D128852

Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
2 years ago[flang] Fix error message in test
Valentin Clement [Thu, 30 Jun 2022 06:54:16 +0000 (08:54 +0200)]
[flang] Fix error message in test

2 years ago[clang][dataflow] Handle `for` statements without conditions
Stanislav Gatev [Wed, 29 Jun 2022 15:56:53 +0000 (15:56 +0000)]
[clang][dataflow] Handle `for` statements without conditions

Handle `for` statements without conditions.

Differential Revision: https://reviews.llvm.org/D128833

Reviewed-by: xazax.hun, gribozavr2, li.zhe.hua
2 years ago[flang][NFC] Revert message to not implemented yet
Valentin Clement [Thu, 30 Jun 2022 06:36:10 +0000 (08:36 +0200)]
[flang][NFC] Revert message to not implemented yet

2 years ago[AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra
Carl Ritson [Thu, 30 Jun 2022 03:26:47 +0000 (12:26 +0900)]
[AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra

Follow up to D127894, new liveness update code needs to handle
the case where S_ANDN2 input must be extended through loops when
V_CNDMASK_B32 has been hoisted.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D128800

2 years ago[flang][test] Remove RUN COMMAND/EXPECTED OUTPUT/INPUT markers from other directories
Fangrui Song [Thu, 30 Jun 2022 05:10:59 +0000 (22:10 -0700)]
[flang][test] Remove RUN COMMAND/EXPECTED OUTPUT/INPUT markers from other directories

2 years ago[flang][test] Remove RUN LINES?/EXPECTED OUTPUT.*/INPUT markers from test/Driver
Fangrui Song [Thu, 30 Jun 2022 05:08:02 +0000 (22:08 -0700)]
[flang][test] Remove RUN LINES?/EXPECTED OUTPUT.*/INPUT markers from test/Driver

Follow-up to D128763.

2 years agoUse value_or instead of getValueOr. NFC
Fangrui Song [Thu, 30 Jun 2022 04:55:02 +0000 (21:55 -0700)]
Use value_or instead of getValueOr. NFC

2 years ago[lldb] Fix unused variable warning in TraceHTR (NFC)
Kevin Cadieux [Thu, 30 Jun 2022 04:29:18 +0000 (21:29 -0700)]
[lldb] Fix unused variable warning in TraceHTR (NFC)

A warning was recently introduced in [D128576](https://reviews.llvm.org/D128576) due to now unused lambda `function_name_from_load_address`. This warning causes build failures when treating warnings as errors. This change expands the comment to also include the definition of this lambda, fixing the warning.

Error:
```
[3809/6000] Building CXX object tools/lldb/source/Plugins/TraceExporter/common/CMakeFiles/lldbPluginTraceExporterCommon.dir/TraceHTR.cpp.o
FAILED: tools/lldb/source/Plugins/TraceExporter/common/CMakeFiles/lldbPluginTraceExporterCommon.dir/TraceHTR.cpp.o
/usr/bin/clang++ -DHAVE_ROUND -DLLDB_CONFIGURATION_DEBUG -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/__w/1/b/llvm/Debug/tools/lldb/source/Plugins/TraceExporter/common -I/__w/1/llvm-project/lldb/source/Plugins/TraceExporter/common -I/__w/1/llvm-project/lldb/include -I/__w/1/b/llvm/Debug/tools/lldb/include -I/__w/1/b/llvm/Debug/include -I/__w/1/llvm-project/llvm/include -I/__w/1/llvm-project/llvm/../clang/include -I/__w/1/b/llvm/Debug/tools/lldb/../clang/include -I/__w/1/llvm-project/lldb/source -I/__w/1/b/llvm/Debug/tools/lldb/source -isystem /usr/include/libxml2 -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -Wno-deprecated-declarations -Wno-unknown-pragmas -Wno-strict-aliasing -Wno-deprecated-register -Wno-vla-extension -g  -fno-exceptions -gsplit-dwarf -std=c++14 -MD -MT tools/lldb/source/Plugins/TraceExporter/common/CMakeFiles/lldbPluginTraceExporterCommon.dir/TraceHTR.cpp.o -MF tools/lldb/source/Plugins/TraceExporter/common/CMakeFiles/lldbPluginTraceExporterCommon.dir/TraceHTR.cpp.o.d -o tools/lldb/source/Plugins/TraceExporter/common/CMakeFiles/lldbPluginTraceExporterCommon.dir/TraceHTR.cpp.o -c /__w/1/llvm-project/lldb/source/Plugins/TraceExporter/common/TraceHTR.cpp
/__w/1/llvm-project/lldb/source/Plugins/TraceExporter/common/TraceHTR.cpp:136:8: error: unused variable 'function_name_from_load_address' [-Werror,-Wunused-variable]
  auto function_name_from_load_address =
```

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D128874

2 years ago[lld-macho] Initial support for Linker Optimization Hints
Daniel Bertalan [Fri, 17 Jun 2022 15:21:59 +0000 (17:21 +0200)]
[lld-macho] Initial support for Linker Optimization Hints

Linker optimization hints mark a sequence of instructions used for
synthesizing an address, like ADRP+ADD. If the referenced symbol ends up
close enough, it can be replaced by a faster sequence of instructions
like ADR+NOP.

This commit adds support for 2 of the 7 defined ARM64 optimization
hints:
- LOH_ARM64_ADRP_ADD, which transforms a pair of ADRP+ADD into ADR+NOP
  if the referenced address is within +/- 1 MiB
- LOH_ARM64_ADRP_ADRP, which transforms two ADRP instructions into
  ADR+NOP if they reference the same page
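
The two conditions in the list above come down to simple address arithmetic. The sketch below is not lld's code; the helper names are hypothetical, and it assumes the A64 encoding facts that ADR carries a signed 21-bit byte offset and that ADRP works on 4 KiB pages.

```cpp
#include <cassert>
#include <cstdint>

// True if targetAddr is reachable from instrAddr with a single ADR,
// i.e. the byte offset fits in a signed 21-bit immediate (+/- 1 MiB).
bool fitsAdr(uint64_t instrAddr, uint64_t targetAddr) {
  const int64_t delta = static_cast<int64_t>(targetAddr - instrAddr);
  const int64_t limit = int64_t{1} << 20;
  return delta >= -limit && delta < limit;
}

// True if both addresses lie in the same 4 KiB page, which is the
// granularity an ADRP result refers to.
bool samePage(uint64_t a, uint64_t b) { return (a >> 12) == (b >> 12); }

// Byte offset that the replacement ADR would encode.
int64_t adrImmediate(uint64_t instrAddr, uint64_t targetAddr) {
  assert(fitsAdr(instrAddr, targetAddr) && "target out of ADR range");
  return static_cast<int64_t>(targetAddr - instrAddr);
}
```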

These two kinds already cover more than 50% of all LOHs in
chromium_framework.

Differential Review: https://reviews.llvm.org/D128093

2 years ago[MC] Skip lower-case integer suffixes
Keegan Saunders [Thu, 30 Jun 2022 03:55:05 +0000 (20:55 -0700)]
[MC] Skip lower-case integer suffixes

`mov x0, 1024u` is permitted in binutils but rejected by the integrated
assembler. Support this case. This is especially important when using the C
pre-processor with the assembler: some code shared between C and assembler may
use lower-case suffixes.
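
The idea can be sketched with a toy literal parser. This is not the integrated assembler's lexer: `parseIntWithSuffix` is a hypothetical helper that handles only decimal digits and assumes the suffix letters of interest are `U` and `L` in either case.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Parses a decimal literal and skips a trailing run of U/L suffix
// letters, accepting upper and lower case alike.
bool parseIntWithSuffix(const std::string &token, uint64_t &value) {
  std::size_t i = 0;
  value = 0;
  while (i < token.size() &&
         std::isdigit(static_cast<unsigned char>(token[i]))) {
    value = value * 10 + static_cast<uint64_t>(token[i] - '0');
    ++i;
  }
  if (i == 0)
    return false; // no digits at all
  for (; i < token.size(); ++i) {
    const int upper = std::toupper(static_cast<unsigned char>(token[i]));
    if (upper != 'U' && upper != 'L')
      return false; // something other than a suffix follows the digits
  }
  return true;
}

// Usage: the lower-case form is accepted just like the upper-case one.
void demoSuffixes() {
  uint64_t value = 0;
  assert(parseIntWithSuffix("1024u", value) && value == 1024);
  assert(parseIntWithSuffix("1024UL", value) && value == 1024);
}
```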

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D128871

2 years ago[Coroutines] Add REQUIRES clause to skip unsupported targets
Chuanqi Xu [Thu, 30 Jun 2022 03:37:28 +0000 (11:37 +0800)]
[Coroutines] Add REQUIRES clause to skip unsupported targets

2 years ago[Driver] Always use --as-needed with libunwind
Petr Hosek [Wed, 29 Jun 2022 17:57:51 +0000 (17:57 +0000)]
[Driver] Always use --as-needed with libunwind

With libgcc, we follow the behavior of GCC for backwards compatibility,
only using --as-needed in the non-C++ mode.

With libunwind, there are no backward compatibility requirements so we
can always use --as-needed on all supported platforms.

Differential Revision: https://reviews.llvm.org/D128841

2 years ago[WebAssembly] Don't set musttail for coroutines when tail-call is not
Chuanqi Xu [Wed, 29 Jun 2022 04:48:48 +0000 (12:48 +0800)]
[WebAssembly] Don't set musttail for coroutines when tail-call is not
enabled

C++20 coroutines could not be compiled to WebAssembly because an
optimization named symmetric transfer requires support for musttail
calls, which WebAssembly does not support yet.

This patch tries to fix the problem by adding a supportsTailCalls
method to TargetTransformImpl so that symmetric transfer is skipped when
the tail-call feature is not supported.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D128794

2 years ago[greedyalloc] Return early when there is no register to allocate.
Luo, Yuanke [Wed, 29 Jun 2022 11:13:16 +0000 (19:13 +0800)]
[greedyalloc] Return early when there is no register to allocate.

On X86 we split greedy register allocation into two passes. The first
pass allocates tile registers, and the second pass allocates the rest of
the virtual registers. In most cases there are no tile registers, so the
first pass is unnecessary. To improve compile time, we check whether any
register needs to be allocated by invoking the `ShouldAllocateClass`
callback. If there is no register to allocate, the pass just returns
false. This speeds up the first greedy RA pass in the common case.
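
The early exit can be pictured with the following sketch. It is not the LLVM pass: `run_greedy_pass` and the register class names are hypothetical, and only the role of the `ShouldAllocateClass` callback is taken from the description above.

```python
from typing import Callable, Iterable


def run_greedy_pass(
    virt_reg_classes: Iterable[str],
    should_allocate_class: Callable[[str], bool],
) -> bool:
    """Return False without doing any work if no register qualifies."""
    if not any(should_allocate_class(rc) for rc in virt_reg_classes):
        return False  # nothing to allocate: skip the expensive setup
    # The actual allocation work would happen here.
    return True
```

For a function with no tile registers, a tile-only filter makes the first pass return immediately.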

Differential Revision: https://reviews.llvm.org/D128804

2 years ago[RISCV][NFC] Move static global variables into static variable in function.
Kito Cheng [Thu, 30 Jun 2022 02:28:21 +0000 (10:28 +0800)]
[RISCV][NFC] Move static global variables into static variable in function.

This violates the LLVM coding standard[1]: the initialization order is nondeterministic, and static constructors may increase the launch time of programs.

However, these variables are only used to cache query results, so we can move them into the function, which resolves both issues: 1. they are initialized in a deterministic order, and 2. they are initialized on first use.

[1] https://llvm.org/docs/CodingStandards.html#do-not-use-static-constructors

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D128726

2 years ago[ELF] Move InputFiles global variables (memoryBuffers, objectFiles, etc) into Ctx...
Fangrui Song [Thu, 30 Jun 2022 01:53:38 +0000 (18:53 -0700)]
[ELF] Move InputFiles global variables (memoryBuffers, objectFiles, etc) into Ctx. NFC

2 years ago[InstCombine] Use known bits to determine exact int->fp cast
zhongyunde [Thu, 30 Jun 2022 01:43:43 +0000 (09:43 +0800)]
[InstCombine] Use known bits to determine exact int->fp cast

Reviewed By: spatel, nikic

Differential Revision: https://reviews.llvm.org/D127854

2 years ago[clang][BPF] Update comment to include TYPE_MATCH
Daniel Müller [Thu, 30 Jun 2022 01:31:02 +0000 (18:31 -0700)]
[clang][BPF] Update comment to include TYPE_MATCH

D126838 added support for the TYPE_MATCH compile-once run-everywhere
relocation to LLVM proper. On the clang side no changes are necessary,
other than the adjustment of a comment to mention this relocation as well.
This change takes care of that.

Differential Revision: https://reviews.llvm.org/D126839

2 years ago[mlir][sparse] auto-insertion of conversion to resolve cycles
Aart Bik [Wed, 29 Jun 2022 19:33:01 +0000 (12:33 -0700)]
[mlir][sparse] auto-insertion of conversion to resolve cycles

When the iteration graph is cyclic (even after several attempts using fewer and fewer constraints), the current sparse compiler bails out, and no rewriting happens. However, this revision adds some new logic where the sparse compiler tries to find a single input sparse tensor that breaks the cycle, and then adds a proper sparse conversion operation. This way, more incoming kernels can be handled!

Note, the resulting code is not optimal (although it keeps more or less proper "sparse" complexity), and more improvements should be added (especially when the kernel directly yields without computation, such as the transpose example). However, handling is better than not handling ;-)

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128847

2 years ago[BPF] Introduce support for type match relocations
Daniel Müller [Thu, 30 Jun 2022 01:20:37 +0000 (18:20 -0700)]
[BPF] Introduce support for type match relocations

Among others, BPF currently supports the type-exists CO-RE relocation
(e.g., see D83878 & D83242). Its intention, as the name tries to convey,
is to be used for checking existence of a type in a target.
While that check is useful and has its place, we would also like to
be able to perform stricter type queries: instead of just checking mere
existence, we want to make sure that members match up in composite
types, that enum variants are present, etc. We refer to this as "type
match".

This change proposes the addition of a new relocation variant/value that
we intend to use for establishing this match relation.

Differential Revision: https://reviews.llvm.org/D126838

2 years agoRevert "[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef."
Vitaly Buka [Thu, 30 Jun 2022 00:52:47 +0000 (17:52 -0700)]
Revert "[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef."

Breaks msan bot, see D126883

This reverts commit 77df3be0dee415713cf5c79543f00532674f428b.

2 years ago[ELF] Move whyExtract/backwardReferences from LinkerDriver to Ctx. NFC
Fangrui Song [Thu, 30 Jun 2022 00:34:30 +0000 (17:34 -0700)]
[ELF] Move whyExtract/backwardReferences from LinkerDriver to Ctx. NFC

Ctx was recently added as a more suitable place for such singletons.

2 years ago[CodeView] Call llvm::codeview::visitMemberRecordStream with the deserialized CVType...
Zequan Wu [Thu, 30 Jun 2022 00:09:40 +0000 (17:09 -0700)]
[CodeView] Call llvm::codeview::visitMemberRecordStream with the deserialized CVType whose kind is FieldListRecord.

llvm::codeview::visitMemberRecordStream expects to receive an array ref over the FieldListRecord's Data, not over a CVType's data, which has 4 more bytes preceding it. The first 2 bytes indicate the size of the FieldListRecord, and the following 2 bytes are always 0x1203. Inside llvm::codeview::visitMemberRecordStream, it iterates over the data to check whether the first two bytes match some type record kind. If the size coincidentally matches a type kind, it starts parsing from there and crashes.
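
The layout described here (a 2-byte size, a 2-byte kind of 0x1203, then the field list data) can be sketched as follows. This is an illustration under those stated assumptions and not LLVM code; `field_list_payload` is a hypothetical helper, and little-endian encoding is assumed.

```python
import struct

LF_FIELDLIST = 0x1203  # kind value quoted in the commit message


def field_list_payload(record: bytes) -> bytes:
    """Strip the 4-byte (size, kind) prefix and return the member data."""
    if len(record) < 4:
        raise ValueError("record too short for a size/kind prefix")
    _size, kind = struct.unpack_from("<HH", record, 0)
    if kind != LF_FIELDLIST:
        raise ValueError(f"not a field list record: kind 0x{kind:04x}")
    # Only these bytes should be handed to the member record visitor.
    return record[4:]
```

Passing the whole record instead of the stripped payload is the mistake the commit fixes: the visitor would then interpret the size field as a member record kind.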

2 years ago[clangd] Also mark output arguments of operator call expressions
Christian Kandeler [Thu, 30 Jun 2022 00:12:36 +0000 (20:12 -0400)]
[clangd] Also mark output arguments of operator call expressions

There's no reason that arguments to e.g. lambda calls should be treated
differently than those to "real" functions.

Reviewed By: nridge

Differential Revision: https://reviews.llvm.org/D128329

2 years ago[lldb] Use assertState in even more tests (NFC)
Jonas Devlieghere [Thu, 30 Jun 2022 00:01:36 +0000 (17:01 -0700)]
[lldb] Use assertState in even more tests (NFC)

Followup to D127355 and D127378, converting more instances of
assertEqual to assertState.

2 years ago[BOLT] Respect shouldPrint in dump-dot-all
Amir Ayupov [Thu, 30 Jun 2022 00:01:02 +0000 (17:01 -0700)]
[BOLT] Respect shouldPrint in dump-dot-all

Don't dump dot CFG graph for functions that should not be printed.

Reviewed By: rafauler, maksfb

Differential Revision: https://reviews.llvm.org/D128699

2 years ago[lldb] Skip instead of XFAIL TestGdbRemote_vContThreads on Darwin
Jonas Devlieghere [Wed, 29 Jun 2022 23:34:28 +0000 (16:34 -0700)]
[lldb] Skip instead of XFAIL TestGdbRemote_vContThreads on Darwin

The two XFAILed tests started timing out after D126983. Given that they
were XFAILed anyway I didn't investigate and just skipped them.

2 years ago[Lex] Make sure to notify `MultipleIncludeOpt` for "read tokens" during fast dependen...
Argyrios Kyrtzidis [Tue, 28 Jun 2022 23:54:42 +0000 (16:54 -0700)]
[Lex] Make sure to notify `MultipleIncludeOpt` for "read tokens" during fast dependency directive lexing

Otherwise a header may be erroneously marked as having a header macro guard and won't get re-included.

Differential Revision: https://reviews.llvm.org/D128772

2 years ago[Polly][MatMul] Abandon dependence analysis.
Michael Kruse [Wed, 29 Jun 2022 21:44:57 +0000 (16:44 -0500)]
[Polly][MatMul] Abandon dependence analysis.

The copy statements inserted by the matrix-multiplication optimization
introduce new dependencies between the copy statements and other
statements. As a result, the DependenceInfo must be recomputed.

Not recomputing it caused IslAstInfo to deduce that some loops are
parallel even though running them in parallel causes race conditions
when accessing the packed arrays. As a result, matrix-matrix
multiplication currently cannot be parallelized.

Also see discussion at https://reviews.llvm.org/D125202

2 years ago[lldb] Skip TestAppleSimulatorOSType is simulator isn't available
Jonas Devlieghere [Wed, 29 Jun 2022 22:10:14 +0000 (15:10 -0700)]
[lldb] Skip TestAppleSimulatorOSType is simulator isn't available

Skip TestAppleSimulatorOSType if the simulator isn't available for the
given platform.

2 years ago[lldb] XFAIL TestVSCode_breakpointEvents.py on Ventura
Jonas Devlieghere [Wed, 29 Jun 2022 21:58:16 +0000 (14:58 -0700)]
[lldb] XFAIL TestVSCode_breakpointEvents.py on Ventura

TestVSCode_breakpointEvents.py is failing on macOS Ventura because we
receive 3 breakpoint events instead of one. This is likely the result of
dyld moving into the shared cache.

2 years ago[ODRHash diagnostics] Fix typos. NFC.
Volodymyr Sapsai [Wed, 29 Jun 2022 21:59:21 +0000 (14:59 -0700)]
[ODRHash diagnostics] Fix typos. NFC.

2 years ago[ThinLTO][test] Add tests for emitting files in-process
Jin Xin Ng [Tue, 28 Jun 2022 23:27:09 +0000 (16:27 -0700)]
[ThinLTO][test] Add tests for emitting files in-process

Completes D127777 by adding llvm-side tests for emitting index and imports files from in-process ThinLTO

Differential Revision: https://reviews.llvm.org/D128771

2 years agoRevert "[Driver] Always use --as-needed with libunwind"
Petr Hosek [Wed, 29 Jun 2022 21:41:47 +0000 (21:41 +0000)]
Revert "[Driver] Always use --as-needed with libunwind"

This reverts commit 2483b3a9679ed2d92abbdbae6927e022903acc70 since
it broke clang-armv7-vfpv3-full-2stage builder:

  https://lab.llvm.org/buildbot#builders/190/builds/887

2 years ago[BOLT] Fix EH trampoline backout code
Maksim Panchenko [Fri, 24 Jun 2022 23:51:46 +0000 (16:51 -0700)]
[BOLT] Fix EH trampoline backout code

When the SplitFunctions pass adds trampoline code for exception landing
pads (limited to shared objects), it may increase the size of the hot
fragment making it larger than the whole function pre-split. When this
happens, the pass reverts the splitting action by restoring the original
block order and marking all blocks hot.

However, if createEHTrampolines() added new blocks to the CFG and
modified invoke instructions, simply restoring the original block layout
will not suffice as the new CFG has more blocks.

For proper backout of the split, modify the original layout by merging
in trampoline blocks immediately before their matching targets. As a
result, the number of blocks increases, but the number of instructions
and the function size remain the same as pre-split.
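
The layout repair described above can be sketched on a list of block names. This is an illustrative sketch rather than BOLT code: `backout_layout` and the mapping from a landing-pad block to its trampoline are hypothetical.

```python
def backout_layout(original_layout, trampoline_for):
    """Rebuild the pre-split order, placing each trampoline block
    immediately before the target block it jumps to.

    trampoline_for maps a target block name to its trampoline block name.
    """
    layout = []
    for block in original_layout:
        if block in trampoline_for:
            layout.append(trampoline_for[block])
        layout.append(block)
    return layout
```

The resulting layout has one more block per trampoline than the original, which is the property the new assertion on block counts would check.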

Add an assertion for the number of blocks when updating a function
layout.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128696

2 years ago[test] Add REQUIRES: zlib to zdebug.yaml
Fangrui Song [Wed, 29 Jun 2022 21:33:21 +0000 (14:33 -0700)]
[test] Add REQUIRES: zlib to zdebug.yaml

2 years agoFix the eh-filter.ll test.
Stefan Pintilie [Wed, 29 Jun 2022 21:13:57 +0000 (16:13 -0500)]
Fix the eh-filter.ll test.

Forgot to add that this test requires asserts.

2 years ago[mlir][LLVMIR] Apply SubElementTypeInterface on suitable types
Min-Yih Hsu [Thu, 19 May 2022 19:07:18 +0000 (12:07 -0700)]
[mlir][LLVMIR] Apply SubElementTypeInterface on suitable types

This feature is tested by unit test since not many places in the codebase
use SubElementTypeInterface.

Differential Revision: https://reviews.llvm.org/D127539

2 years ago[mlir] Prevent SubElementInterface from going into infinite recursion
Min-Yih Hsu [Sat, 21 May 2022 04:52:49 +0000 (21:52 -0700)]
[mlir] Prevent SubElementInterface from going into infinite recursion

Since only mutable types and attributes can go into infinite recursion
inside SubElementInterface::walkSubElement, and there are only a few of
them (mutable types and attributes), we introduce new traits for Type
and Attribute: TypeTrait::IsMutable and AttributeTrait::IsMutable,
respectively. They indicate whether a type or attribute is mutable.
Such traits are required if the ImplType defines a `mutate` function.

Then, inside SubElementInterface, we use a set to record mutable types
and attributes that have already been visited.
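
A minimal sketch of the visited-set idea, using a hypothetical `Node` class in place of MLIR types and attributes. Only mutable nodes are tracked, since only they can take part in a cycle; the walk order is an assumption.

```python
class Node:
    """Stand-in for a type or attribute; `mutable` mirrors the IsMutable trait."""

    def __init__(self, name, mutable=False, children=None):
        self.name = name
        self.mutable = mutable
        self.children = children if children is not None else []


def walk(node, callback, visited=None):
    """Visit node and its sub-elements, entering each mutable node once."""
    if visited is None:
        visited = set()
    if node.mutable:
        if id(node) in visited:
            return  # already walked: stop here to break the cycle
        visited.add(id(node))
    callback(node)
    for child in node.children:
        walk(child, callback, visited)
```

For a self-referential struct (a struct containing a pointer back to itself), the walk terminates after visiting each node once.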

Differential Revision: https://reviews.llvm.org/D127537

2 years ago[pseudo] Fix bugs/inconsistencies in forest dump.
Sam McCall [Wed, 29 Jun 2022 11:37:55 +0000 (13:37 +0200)]
[pseudo] Fix bugs/inconsistencies in forest dump.

- when printing a shared node for the second time, don't print its children
  (This keeps output proportional to the size of the structure)
- when printing a shared node for the second time, print its type only, not rule
  (for consistency with above: don't dump details of nodes twice)
- don't abbreviate shared nodes, to ensure we can prune the tree there
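
The first two points can be sketched as below. This is not the clang pseudo-parser's dump code: `ForestNode`, `dump` and the output format are hypothetical, and a node is simply treated as shared when it is reached a second time.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class ForestNode:
    type: str
    rule: str
    children: list = field(default_factory=list)


def dump(root):
    """Return dump lines; a node reached again prints its type only."""
    lines = []
    seen = set()

    def visit(node, depth):
        indent = "  " * depth
        if id(node) in seen:
            # Second visit: no rule and no children, so output stays
            # proportional to the size of the structure.
            lines.append(f"{indent}{node.type} (shared, printed above)")
            return
        seen.add(id(node))
        lines.append(f"{indent}{node.type} := {node.rule}")
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return lines
```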

Differential Revision: https://reviews.llvm.org/D128805