platform/upstream/llvm.git
2 years ago[PowerPC] Support huge frame size for PPC64
Kai Luo [Mon, 6 Jun 2022 09:08:02 +0000 (09:08 +0000)]
[PowerPC] Support huge frame size for PPC64

Support allocation of huge stack frames (>2G) on PPC64.

For the ELFv2 ABI on Linux, quoting section 2.2.3.1, General Stack Frame Requirements, of the spec:
> There is no maximum stack frame size defined.

On AIX, XL allows such huge frames.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D107886

2 years ago[VPlan] Replace BranchOnCount with BranchOnCond if TC <= UF * VF.
Florian Hahn [Mon, 6 Jun 2022 08:38:53 +0000 (09:38 +0100)]
[VPlan] Replace BranchOnCount with BranchOnCond if TC <= UF * VF.

Try to simplify BranchOnCount to `BranchOnCond true` if TC <= UF * VF.
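
The condition can be checked outside of VPlan. The following is a minimal stand-alone sketch, not LLVM code: `countVectorIterations` and `backedgeNeverTaken` are hypothetical names, and the sketch assumes a vector loop that advances by VF * UF per iteration with the remainder left to a scalar epilogue.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how often the vector loop body runs, assuming each
// vector iteration covers VF * UF scalar iterations and the remaining
// TC % (VF * UF) iterations are handled by a scalar epilogue.
std::uint64_t countVectorIterations(std::uint64_t TC, std::uint64_t VF,
                                    std::uint64_t UF) {
  assert(VF > 0 && UF > 0 && "the vector step must be non-zero");
  const std::uint64_t Step = VF * UF;
  const std::uint64_t VectorTC = TC - TC % Step;
  std::uint64_t Iterations = 0;
  for (std::uint64_t Index = 0; Index < VectorTC; Index += Step)
    ++Iterations;
  return Iterations;
}

// Hypothetical predicate mirroring the check in the commit title: when
// TC <= UF * VF the body runs at most once, so the latch always exits.
bool backedgeNeverTaken(std::uint64_t TC, std::uint64_t VF, std::uint64_t UF) {
  return TC <= VF * UF;
}
```

Under these assumptions, whenever `backedgeNeverTaken` holds, `countVectorIterations` returns 0 or 1, which is why the counted branch can become an unconditional exit.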

This is an alternative to D121899 which simplifies the VPlan directly
instead of doing so late in code-gen.

The potential benefit of doing this in VPlan is that this may help
cost-modeling in the future. The reason this is done in prepareToExecute
at the moment is that a single plan may be used for multiple VFs/UFs.

There are further simplifications that can be applied as follow ups:

1. Replace inductions with constants
2. Replace vector region with regular block.

Fixes #55354.

Depends on D126679.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D126680

2 years agoRevert "[libcxx] Temporarily skip Arm configs"
David Spickett [Mon, 6 Jun 2022 08:25:51 +0000 (08:25 +0000)]
Revert "[libcxx] Temporarily skip Arm configs"

This reverts commit d4220af52723e76973723d3089c6fe2527fd704d.

Linaro bots are back online.

2 years ago[RISCV] Use check-prefixes to reduce check lines
Shao-Ce SUN [Fri, 3 Jun 2022 17:27:15 +0000 (01:27 +0800)]
[RISCV] Use check-prefixes to reduce check lines

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125083

2 years ago[NFC][RISCV][format] Blank line between functions, remove unnecessary semicolon.
yanming [Mon, 6 Jun 2022 07:32:12 +0000 (15:32 +0800)]
[NFC][RISCV][format] Blank line between functions, remove unnecessary semicolon.

2 years ago[Scalar] Use llvm::make_early_inc_range (NFC)
Kazu Hirata [Mon, 6 Jun 2022 06:53:18 +0000 (23:53 -0700)]
[Scalar] Use llvm::make_early_inc_range (NFC)

2 years ago[RISCV] Define risc-v's own register class to model FP Register.
yanming [Thu, 2 Jun 2022 04:52:50 +0000 (12:52 +0800)]
[RISCV] Define risc-v's own register class to model FP Register.

The default RegisterClass is not enough to model RISC-V registers.
We define RISC-V's own register class to model FP registers.
This helps to better estimate register pressure in the loop vectorizer.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126854

2 years ago[AST] Fix clang RecursiveASTVisitor for definition of XXXTemplateSpecializationDecl
Qingyuan Zheng [Mon, 6 Jun 2022 05:58:00 +0000 (01:58 -0400)]
[AST] Fix clang RecursiveASTVisitor for definition of XXXTemplateSpecializationDecl

Fixes https://github.com/clangd/clangd/issues/1132
where clangd's semantic highlighting is missing for symbols of a template
specialization definition. It turns out the visitor didn't traverse the
base classes of Class/Var##TemplateSpecializationDecl, i.e.
CXXRecordDecl/VarDecl. This patch adds them back, as is done in
DEF_TRAVERSE_TMPL_PART_SPEC_DECL.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D126757

2 years ago[PDB] Remove truncate* (NFC)
Kazu Hirata [Mon, 6 Jun 2022 06:33:50 +0000 (23:33 -0700)]
[PDB] Remove truncate* (NFC)

- truncateQuotedNameFront: The last use was removed on Jul 10, 2017 in
  commit a9d944fd6fd19ac377b5ebea9272676642b7ceaa.

- truncateQuotedNameBack: The last use was removed on Mar 26, 2018 in
  commit 7b84b678a993c8a8236868f65d1d4c2b3e29fb3d.

- truncateStringMiddle: The last use was removed on Mar 26, 2018 in
  commit 7b84b678a993c8a8236868f65d1d4c2b3e29fb3d.

- truncateStringBack: The last use is in truncateQuotedNameBack being
  removed above.

- truncateStringFront: The last use is in truncateQuotedNameFront
  being removed above.

2 years ago[NFC] [Coroutines] Add test for ambiguous allocation functions in
Chuanqi Xu [Mon, 6 Jun 2022 06:11:59 +0000 (14:11 +0800)]
[NFC] [Coroutines] Add test for ambiguous allocation functions in
promise_type

Address the post-commit comment in
https://reviews.llvm.org/D125517#inline-1217244

2 years agoLLVM Driver Multicall tool
Chris Bieneman [Mon, 6 Jun 2022 04:25:55 +0000 (04:25 +0000)]
LLVM Driver Multicall tool

This patch adds an llvm-driver multicall tool that can combine multiple
LLVM-based tools. The build infrastructure is enabled for a tool by
adding the GENERATE_DRIVER option to the add_llvm_executable CMake
call, and changing the tool's main function to a canonicalized
tool_name_main format (e.g. llvm_ar_main, clang_main, etc.).
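
The dispatch idea can be sketched in stand-alone C++. This is an illustration only, not the llvm-driver implementation: `toolNameFromPath`, `canonicalMainName`, and `dispatchByArgv0` are hypothetical helpers, and the two `*_main` functions are stubs that merely return distinct codes.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Stub entry points standing in for real tools (hypothetical bodies).
int llvm_ar_main(int, char **) { return 10; }
int clang_main(int, char **) { return 20; }

// Strip any leading directories from the invoked path.
std::string toolNameFromPath(const std::string &Path) {
  const std::size_t Slash = Path.find_last_of("/\\");
  return Slash == std::string::npos ? Path : Path.substr(Slash + 1);
}

// Map a tool name such as "llvm-ar" to the form "llvm_ar_main".
std::string canonicalMainName(std::string Tool) {
  std::replace(Tool.begin(), Tool.end(), '-', '_');
  return Tool + "_main";
}

// Pick the entry point from argv[0]; returns 1 if the tool is unknown.
int dispatchByArgv0(int Argc, char **Argv) {
  using MainFn = int (*)(int, char **);
  static const std::map<std::string, MainFn> Tools = {
      {"llvm_ar_main", &llvm_ar_main},
      {"clang_main", &clang_main},
  };
  if (Argc < 1)
    return 1;
  const auto It = Tools.find(canonicalMainName(toolNameFromPath(Argv[0])));
  if (It == Tools.end())
    return 1;
  return It->second(Argc, Argv);
}
```

With such a table, invoking the binary through a symlink named `llvm-ar` would reach `llvm_ar_main`. A name with no table entry, such as `llvm-ranlib` here, falls through to the error path, matching limitation (1) described further down.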

As currently implemented llvm-driver contains dsymutil, llvm-ar,
llvm-cxxfilt, llvm-objcopy, and clang (if clang is included in the
build).

llvm-driver can be enabled from builds by setting
LLVM_TOOL_LLVM_DRIVER_BUILD=On.

There are several limitations in the current implementation, which can
be addressed in subsequent patches:

(1) the multicall binary cannot currently properly handle
multi-dispatch tools. This means symlinking llvm-ranlib to llvm-driver
will not properly result in llvm-ar's main being called.
(2) the multicall binary cannot be comprised of tools containing
conflicting cl::opt options as the global cl::opt option list cannot
contain duplicates.

These limitations can be addressed in subsequent patches.

Differential revision: https://reviews.llvm.org/D109977

2 years ago[BPF] Enable IAS in backend
Brad Smith [Mon, 6 Jun 2022 03:28:53 +0000 (23:28 -0400)]
[BPF] Enable IAS in backend

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D123845

2 years ago[CodeGen] Remove splitCanCauseEvictionChain and its helpers (NFC)
Kazu Hirata [Mon, 6 Jun 2022 03:22:47 +0000 (20:22 -0700)]
[CodeGen] Remove splitCanCauseEvictionChain and its helpers (NFC)

The last use was removed on Mar 7, 2022 in commit
294eca35a00f89dff474044ebd478a7f83ccc310.

2 years ago[mlir][NFC] Replace some llvm::find with llvm::is_contained.
jacquesguan [Mon, 6 Jun 2022 02:31:12 +0000 (02:31 +0000)]
[mlir][NFC] Replace some llvm::find with llvm::is_contained.

This patch replaces some uses of llvm::find with llvm::is_contained, which should be clearer.
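
For readers without an LLVM checkout, the readability gain can be shown with a stand-alone analogue. `isContained` below is a hypothetical stand-in written with the standard library; it is not the llvm::is_contained source.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a range-based membership test: the intent
// ("is Value in R?") is stated directly rather than by comparing an
// iterator against end().
template <typename Range, typename T>
bool isContained(const Range &R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value) != std::end(R);
}

// Before: the find-and-compare idiom spelled out at the call site.
bool hasValueVerbose(const std::vector<int> &V, int Value) {
  return std::find(V.begin(), V.end(), Value) != V.end();
}

// After: the same check through the helper.
bool hasValueConcise(const std::vector<int> &V, int Value) {
  return isContained(V, Value);
}
```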

Differential Revision: https://reviews.llvm.org/D127077

2 years ago[GlobalISel] Remove widenWithUnmerge (NFC)
Kazu Hirata [Mon, 6 Jun 2022 02:58:18 +0000 (19:58 -0700)]
[GlobalISel] Remove widenWithUnmerge (NFC)

The last use was removed on Dec 23, 2021 in commit
29f88b93fdbe3e20c35842ca3a6c2a3f1a81cfce.

2 years ago[GlobalISel] Remove valueIsSplit (NFC)
Kazu Hirata [Mon, 6 Jun 2022 02:51:02 +0000 (19:51 -0700)]
[GlobalISel] Remove valueIsSplit (NFC)

The last use was removed on Jun 27, 2019 in commit
8138996128cd17d78d9d3e6ef7b49987565cb310.

2 years ago[LegalizeTypes][VP] Add widen and split support for vp.fptrunc and vp.fpext
Lian Wang [Thu, 26 May 2022 02:50:42 +0000 (02:50 +0000)]
[LegalizeTypes][VP] Add widen and split support for vp.fptrunc and vp.fpext

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D126439

2 years ago[InstCombine] Add more tests for shl+lshr transforms; NFC
chenglin.bi [Mon, 6 Jun 2022 02:15:48 +0000 (10:15 +0800)]
[InstCombine] Add more tests for shl+lshr transforms; NFC

2 years ago[InstCombine] Fix tests const value for shl+lshr transforms; NFC
chenglin.bi [Mon, 6 Jun 2022 02:08:56 +0000 (10:08 +0800)]
[InstCombine] Fix tests const value for shl+lshr transforms; NFC

2 years ago[InstCombine] Add more tests for shl+lshr transforms; NFC
chenglin.bi [Mon, 6 Jun 2022 01:59:41 +0000 (09:59 +0800)]
[InstCombine] Add more tests for shl+lshr transforms; NFC

2 years ago[Clang][FP16] Add 4 builtins for _Float16
Phoebe Wang [Mon, 6 Jun 2022 00:31:44 +0000 (08:31 +0800)]
[Clang][FP16] Add 4 builtins for _Float16

We lack builtin support for `_Float16`. In most cases, we can use other floating-type builtins and truncate the result to `_Float16`.
But this is a problem for SNaN, e.g., https://gcc.godbolt.org/z/cqr5nG1jh
This patch adds `__builtin_nansf16` support, as well as 3 other builtins, since they are usually used together.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D127050

2 years ago[clang] Use llvm::is_contained (NFC)
Kazu Hirata [Mon, 6 Jun 2022 00:56:40 +0000 (17:56 -0700)]
[clang] Use llvm::is_contained (NFC)

2 years ago[InstCombine] fold mul with masked low bit operand to trunc+select
Sanjay Patel [Sun, 5 Jun 2022 21:55:09 +0000 (17:55 -0400)]
[InstCombine] fold mul with masked low bit operand to trunc+select

https://alive2.llvm.org/ce/z/o7rQ5q

This shows an extra instruction in some cases, but that is
caused by an existing canonicalization of trunc -> and+icmp.

Codegen should be better for any target where a multiply is
more costly than the most simple ALU op.

This ends up producing the requested x86 asm from issue #55618,
but it's not the same IR. We are missing a canonicalization
from the negate+mask pattern to the trunc+select created here.
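
The arithmetic identity behind the fold can be sketched in plain C++ on 32-bit unsigned values. The function names are hypothetical and the sketch only mirrors the scalar case; the Alive2 link above remains the authoritative proof.

```cpp
#include <cstdint>

// Original form: multiply by an operand masked down to its low bit.
std::uint32_t mulByLowBit(std::uint32_t X, std::uint32_t Y) {
  return X * (Y & 1u);
}

// Folded form: test the low bit (the "trunc to i1") and select X or 0.
std::uint32_t selectByLowBit(std::uint32_t X, std::uint32_t Y) {
  const bool LowBit = (Y & 1u) != 0;
  return LowBit ? X : 0u;
}
```

Since `Y & 1` is either 0 or 1, the product is either 0 or `X`, so both functions agree for every input.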

2 years ago[ConstProp] add tests for APFloat truncate miscompile; NFC
Sanjay Patel [Fri, 3 Jun 2022 19:47:14 +0000 (15:47 -0400)]
[ConstProp] add tests for APFloat truncate miscompile; NFC

issue #55838

2 years ago[Driver][test] Remove unneeded -no-canonical-prefixes and -o %t.o
Fangrui Song [Sun, 5 Jun 2022 23:06:09 +0000 (16:06 -0700)]
[Driver][test] Remove unneeded -no-canonical-prefixes and -o %t.o

Similar to 980679981fbc311bc07f8cd23e3739fd56c22d2a

2 years ago[clang-format] Handle attributes for for/while loops
owenca [Sat, 4 Jun 2022 19:26:07 +0000 (12:26 -0700)]
[clang-format] Handle attributes for for/while loops

Fixes #55853.

Differential Revision: https://reviews.llvm.org/D127054

2 years ago[MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"
Fangrui Song [Sun, 5 Jun 2022 22:11:01 +0000 (15:11 -0700)]
[MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"

2 years ago[ARM][MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected...
Fangrui Song [Sun, 5 Jun 2022 21:53:59 +0000 (14:53 -0700)]
[ARM][MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"

The directive name is not useful because the next line replicates the error line
which includes the directive. The prevailing style uses "expected newline".

2 years ago[NFC] Make comment consistent with allow|ignore list renamings
Aditya Kumar [Sun, 5 Jun 2022 21:48:00 +0000 (14:48 -0700)]
[NFC] Make comment consistent with allow|ignore list renamings

Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D123640

2 years ago[AArch64][MC] Change "unexpected tokens in .xxx directive " to "expected newline"
Fangrui Song [Sun, 5 Jun 2022 21:32:31 +0000 (14:32 -0700)]
[AArch64][MC] Change "unexpected tokens in .xxx directive " to "expected newline"

The directive name is not useful because the next line replicates the error line
which includes the directive. The prevailing style uses "expected newline".

2 years agogn build: Fix build when not building the native target.
Peter Collingbourne [Sun, 5 Jun 2022 06:42:00 +0000 (23:42 -0700)]
gn build: Fix build when not building the native target.

Differential Revision: https://reviews.llvm.org/D127068

2 years ago[bolt] Remove unneeded cl::ZeroOrMore for cl::opt options
Fangrui Song [Sun, 5 Jun 2022 20:29:49 +0000 (13:29 -0700)]
[bolt] Remove unneeded cl::ZeroOrMore for cl::opt options

2 years ago[libc++][test] Mark ranges.transform.pass.cpp UNSUPPORTED for AIX
Joe Loser [Sat, 4 Jun 2022 16:28:17 +0000 (10:28 -0600)]
[libc++][test] Mark ranges.transform.pass.cpp UNSUPPORTED for AIX

The `ranges.transform.pass.cpp` often times out on CI for AIX (32-bit and 64-bit)
only. Mark the test as `UNSUPPORTED` for `AIX` for now. It should be looked into in
the future.

Differential Revision: https://reviews.llvm.org/D127051

2 years ago[mlir] Tunnel LLVM_USE_LINKER through to the standalone example build.
Stella Laurenzo [Sun, 5 Jun 2022 19:30:28 +0000 (12:30 -0700)]
[mlir] Tunnel LLVM_USE_LINKER through to the standalone example build.

When building in debug mode, the link time of the standalone sample is excessive, taking upwards of a minute if using BFD. This at least allows lld to be used if the main invocation was configured that way. On my machine, this gets a standalone test that requires a relink to run in ~13s for Debug mode. This is still a lot, but better than it was. I think we may want to do something about this test: it adds a lot of latency to a normal compile/test cycle and requires a bunch of arg fiddling to exclude.

I think we may end up wanting a `check-mlir-heavy` target that can be used just prior to submit, and then make `check-mlir` just run unit/lite tests. More just thoughts for the future (none of that is done here).

Reviewed By: bondhugula, mehdi_amini

Differential Revision: https://reviews.llvm.org/D126585

2 years ago[llvm] Convert for_each to range-based for loops (NFC)
Kazu Hirata [Sun, 5 Jun 2022 19:07:14 +0000 (12:07 -0700)]
[llvm] Convert for_each to range-based for loops (NFC)

2 years ago[Sparc] Fix a warning
Kazu Hirata [Sun, 5 Jun 2022 18:49:13 +0000 (11:49 -0700)]
[Sparc] Fix a warning

This patch fixes:

  llvm/lib/Target/Sparc/AsmParser/SparcAsmParser.cpp:910:5: error:
  default label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
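
For context, a generic illustration of the diagnostic follows. It is not the SparcAsmParser code: `Kind` and `kindName` are hypothetical. When every enumerator has its own case, a `default:` label triggers the warning, so one common remedy is to drop the label and handle the impossible case after the switch.

```cpp
#include <cstring>

// Hypothetical enumeration with two values.
enum class Kind { Register, Immediate };

// Every enumerator has a case, so no default label is needed; adding one
// would trigger -Wcovered-switch-default under Clang.
const char *kindName(Kind K) {
  switch (K) {
  case Kind::Register:
    return "register";
  case Kind::Immediate:
    return "immediate";
  }
  // Reached only if K holds a value outside the enumeration.
  return "unknown";
}
```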

2 years ago[NFC] Add test cases reported in PR54341
Matheus Izvekov [Sun, 5 Jun 2022 17:00:57 +0000 (19:00 +0200)]
[NFC] Add test cases reported in PR54341

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D127074

2 years ago[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef - remove isIndexed().
Alexey Lapshin [Fri, 3 Jun 2022 11:11:43 +0000 (14:11 +0300)]
[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef - remove isIndexed().

This patch is extracted from https://reviews.llvm.org/D126883.
It removes DwarfStringPoolEntryRef::isIndexed() and isIndexed bit
since they are not used.

Differential Revision: https://reviews.llvm.org/D126958

2 years ago[SPARC][MC] Support more relocation types
LemonBoy [Sun, 5 Jun 2022 18:06:50 +0000 (14:06 -0400)]
[SPARC][MC] Support more relocation types

This patch introduces support for %hix, %lox, %gdop_hix22, %gdop_lox10 and %gdop.

An extra test is introduced to make sure the fixups are correctly applied.

Reviewed By: dcederman

Differential Revision: https://reviews.llvm.org/D102575

2 years ago[flang][runtime] Use __float128 where possible & needed in runtime
Peter Klausler [Tue, 31 May 2022 21:06:11 +0000 (14:06 -0700)]
[flang][runtime] Use __float128 where possible & needed in runtime

On targets with __float128 available and distinct from long double,
use it to support more kind=16 entry points.  This affects mostly
x86-64 targets.  This means that more runtime entry points are
defined for lowering to call.

Delete Common/long-double.h and its LONG_DOUBLE macro in favor of
testing the standard macro LDBL_MANT_DIG.

Differential Revision: https://reviews.llvm.org/D127025

2 years ago[Scalar] Remove isValidSingle (NFC)
Kazu Hirata [Sun, 5 Jun 2022 15:45:11 +0000 (08:45 -0700)]
[Scalar] Remove isValidSingle (NFC)

The last use was removed on Feb 18, 2022 in commit
00ab91b70d21f72af59e4e198c6dc819452405af.

2 years ago[ADT] Add edit_distance_insensitive to StringRef
Nathan James [Sun, 5 Jun 2022 11:03:08 +0000 (12:03 +0100)]
[ADT] Add edit_distance_insensitive to StringRef

In some instances it's advantageous to calculate edit distances without worrying about casing.
Currently, to achieve this, both strings need to be converted to the same case first; then the edit distance can be calculated.
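
The idea can be sketched without LLVM as a Levenshtein distance that lowercases characters during comparison, so no converted copies of the strings are needed. `editDistanceInsensitive` below is a hypothetical free function on `std::string`; it is not the StringRef member added by this patch and it omits any optional parameters that API may take.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance (insert, delete, replace each cost 1) where two
// characters match if they are equal after ASCII lowercasing.
std::size_t editDistanceInsensitive(const std::string &A,
                                    const std::string &B) {
  auto Lower = [](char C) {
    return std::tolower(static_cast<unsigned char>(C));
  };
  // Row[J] holds the distance between the current prefix of A and B[0..J).
  std::vector<std::size_t> Row(B.size() + 1);
  for (std::size_t J = 0; J <= B.size(); ++J)
    Row[J] = J;
  for (std::size_t I = 1; I <= A.size(); ++I) {
    std::size_t Diag = Row[0]; // value for (I-1, J-1)
    Row[0] = I;
    for (std::size_t J = 1; J <= B.size(); ++J) {
      const std::size_t Up = Row[J]; // value for (I-1, J)
      const std::size_t Cost = Lower(A[I - 1]) == Lower(B[J - 1]) ? 0 : 1;
      Row[J] = std::min({Up + 1, Row[J - 1] + 1, Diag + Cost});
      Diag = Up;
    }
  }
  return Row[B.size()];
}
```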

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D126159

2 years agoRemove unneeded cl::ZeroOrMore for cl::opt/cl::list options
Fangrui Song [Sun, 5 Jun 2022 08:07:50 +0000 (01:07 -0700)]
Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options

2 years agoRemove unneeded cl::ZeroOrMore for cl::opt/cl::list options
Fangrui Song [Sun, 5 Jun 2022 07:31:44 +0000 (00:31 -0700)]
Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options

2 years ago[Transforms/Utils] Use predecessors (NFC)
Kazu Hirata [Sun, 5 Jun 2022 07:16:14 +0000 (00:16 -0700)]
[Transforms/Utils] Use predecessors (NFC)

2 years agoRecommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication...
Christian Sigg [Sat, 4 Jun 2022 10:33:42 +0000 (12:33 +0200)]
Recommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration."

This change rolls bcfc0a9051014437b55ab932d9aca5ecdca6776b forward (i.e., reverting 369ce54bb302f209239b8ebc77ad824add9df089) with fixed CMakeLists.txt.

2 years agoRemove unneeded cl::ZeroOrMore for cl::list options
Fangrui Song [Sun, 5 Jun 2022 06:51:12 +0000 (23:51 -0700)]
Remove unneeded cl::ZeroOrMore for cl::list options

2 years agoUse llvm::less_second (NFC)
Kazu Hirata [Sun, 5 Jun 2022 05:48:32 +0000 (22:48 -0700)]
Use llvm::less_second (NFC)

2 years ago[Target] Use MachineBasicBlock::erase (NFC)
Kazu Hirata [Sun, 5 Jun 2022 05:41:24 +0000 (22:41 -0700)]
[Target] Use MachineBasicBlock::erase (NFC)

2 years ago[CodeGen] Use a range-based for loop (NFC)
Kazu Hirata [Sun, 5 Jun 2022 05:26:55 +0000 (22:26 -0700)]
[CodeGen] Use a range-based for loop (NFC)

2 years agoUse static_cast from SmallString to std::string (NFC)
Kazu Hirata [Sun, 5 Jun 2022 05:09:27 +0000 (22:09 -0700)]
Use static_cast from SmallString to std::string (NFC)

2 years agoUse llvm::less_first (NFC)
Kazu Hirata [Sun, 5 Jun 2022 04:23:18 +0000 (21:23 -0700)]
Use llvm::less_first (NFC)

2 years ago[CodeGen] Use StringRef::contains (NFC)
Kazu Hirata [Sun, 5 Jun 2022 03:58:58 +0000 (20:58 -0700)]
[CodeGen] Use StringRef::contains (NFC)

2 years ago[Transforms] Use llvm::is_contained (NFC)
Kazu Hirata [Sun, 5 Jun 2022 03:48:26 +0000 (20:48 -0700)]
[Transforms] Use llvm::is_contained (NFC)

2 years ago[SPARC] Fix type for i64 inline asm operands
LemonBoy [Sat, 4 Jun 2022 22:26:33 +0000 (18:26 -0400)]
[SPARC] Fix type for i64 inline asm operands

Differential Revision: https://reviews.llvm.org/D101694

2 years ago[VPlan] Update vector latch terminator edge to exit block after execution.
Florian Hahn [Sat, 4 Jun 2022 20:22:32 +0000 (21:22 +0100)]
[VPlan] Update vector latch terminator edge to exit block after execution.

Instead of setting the successor to the exit using CFG.ExitBB, set it to
nullptr initially. The successor to the exit block is later set either
through createEmptyBasicBlock or after VPlan execution (because at the
moment, no block is created by VPlan for the exit block, the existing
one is reused).

This also enables BranchOnCond to be used as terminator for the exiting
block of the topmost vector region.

Depends on D126618.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D126679

2 years ago[mlir] Use context provided rather than getContext
Jacques Pienaar [Sat, 4 Jun 2022 19:18:51 +0000 (12:18 -0700)]
[mlir] Use context provided rather than getContext

Avoids "pass state was never initialized" assertion failure.

2 years ago[flang][runtime] Catch OPEN of connected file
Peter Klausler [Fri, 3 Jun 2022 20:44:13 +0000 (13:44 -0700)]
[flang][runtime] Catch OPEN of connected file

Diagnose OPEN(FILE=f) when f is already connected by the same name to
a distinct external I/O unit.

Differential Revision: https://reviews.llvm.org/D127035

2 years ago[flang][runtime] Emit error message rather than crashing for MOD(ULO)(x,P=0)
Peter Klausler [Thu, 2 Jun 2022 21:06:57 +0000 (14:06 -0700)]
[flang][runtime] Emit error message rather than crashing for MOD(ULO)(x,P=0)

Add extra arguments and checks to the runtime support library so that
a call to the intrinsic functions MOD and MODULO with "denominator"
argument P of zero will cause a crash with a source location rather
than an uninformative floating-point error or integer division by
zero signal.

Additional work is required in lowering to (1) pass source file path and
source line number arguments and (2) actually call these runtime
library APIs instead of emitting inline code for MOD &/or MODULO.

Differential Revision: https://reviews.llvm.org/D127034

2 years ago[flang][runtime] Fix deadlock in error recovery
Peter Klausler [Thu, 2 Jun 2022 20:33:10 +0000 (13:33 -0700)]
[flang][runtime] Fix deadlock in error recovery

When an external I/O statement is in a recoverable error
state before any data transfers take place (for example,
an unformatted transfer with ERR=/IOSTAT=/IOMSG= attempted on
a formatted unit), ensure that the unit's mutex is still
released at the end of the statement.

Differential Revision: https://reviews.llvm.org/D127032

2 years ago[flang] When folding FINDLOC, convert operands to a common type
Peter Klausler [Thu, 2 Jun 2022 00:06:01 +0000 (17:06 -0700)]
[flang] When folding FINDLOC, convert operands to a common type

For example, FINDLOC(A,X) should convert both A and X to COMPLEX(8)
if the operands are REAL(8) and COMPLEX(4), so that comparisons
can be done without losing information.  The current implementation
unconditionally converts X to the type of the array A.
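
A C++ analogue of the problem, using `double` for REAL(8) and `std::complex<float>` for COMPLEX(4); the function names are hypothetical and this is not flang's folding code. Converting the complex value to the array's real type discards the imaginary part and can report a match that does not exist.

```cpp
#include <complex>

// Old behaviour, by analogy: convert X to the array element type (real),
// which drops the imaginary part before comparing.
bool equalInArrayType(double A, std::complex<float> X) {
  return A == static_cast<double>(X.real());
}

// Behaviour described in the commit, by analogy: convert both operands to
// a common type (complex double) and compare there.
bool equalInCommonType(double A, std::complex<float> X) {
  const std::complex<double> WideA(A, 0.0);
  const std::complex<double> WideX(X.real(), X.imag());
  return WideA == WideX;
}
```

For `A = 1.0` and `X = (1.0, 2.0)`, the first comparison reports a match while the second correctly does not.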

Differential Revision: https://reviews.llvm.org/D127030

2 years ago[flang][runtime] Fix WRITE after OPEN(.., ACCESS="APPEND")
Peter Klausler [Wed, 1 Jun 2022 22:32:08 +0000 (15:32 -0700)]
[flang][runtime] Fix WRITE after OPEN(.., ACCESS="APPEND")

The initial size of the file was not being captured as the file position
on which the first output buffer should be framed.

Differential Revision: https://reviews.llvm.org/D127029

2 years ago[flang][runtime] Fix edge case discrepancies with EN output editing
Peter Klausler [Wed, 1 Jun 2022 21:39:56 +0000 (14:39 -0700)]
[flang][runtime] Fix edge case discrepancies with EN output editing

The "engineering" ENw.d output editing descriptor has some difficult
edge case behavior for values that might format into a bunch of 9's
or round up to a 1 for a given scale factor.  Fix the algorithm,
and add tests to protect against regressions.

Differential Revision: https://reviews.llvm.org/D127028

2 years ago[flang] Don't crash on initialization with a zero-sized derived type
Peter Klausler [Wed, 1 Jun 2022 19:16:35 +0000 (12:16 -0700)]
[flang] Don't crash on initialization with a zero-sized derived type

Avoid calls to memcpy with zero byte counts if their address argument
calculations may not be valid expressions.
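
A minimal sketch of the guard, assuming a hypothetical wrapper `copyBytes` (this is not the flang code): the copy, and any address computation feeding it, is skipped when the byte count is zero.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: only touch the pointers when there is something to
// copy, since passing null or otherwise invalid pointers to memcpy is
// undefined behaviour even for a zero byte count.
void *copyBytes(void *Dest, const void *Src, std::size_t Bytes) {
  if (Bytes == 0)
    return Dest;
  return std::memcpy(Dest, Src, Bytes);
}
```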

Differential Revision: https://reviews.llvm.org/D127027

2 years ago[flang][runtime] Don't crash after surviving internal output overflow
Peter Klausler [Tue, 31 May 2022 17:17:20 +0000 (10:17 -0700)]
[flang][runtime] Don't crash after surviving internal output overflow

After the program has survived its attempt to overflow the output buffer
with an internal WRITE using ERR=, IOSTAT=, &/or IOMSG=, don't crash
by accidentally blank-filling the next record that usually doesn't exist.

Differential Revision: https://reviews.llvm.org/D127024

2 years ago[flang][runtime] Don't let random seed queries change the sequence
Peter Klausler [Sat, 4 Jun 2022 05:58:45 +0000 (22:58 -0700)]
[flang][runtime] Don't let random seed queries change the sequence

When the current seed of the pseudo-random generator is queried
with CALL RANDOM_SEED(GET=n), that query should not change the
stream of pseudo-random numbers produced by CALL RANDOM_NUMBER().

Differential Revision: https://reviews.llvm.org/D127023

2 years agoRevert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with...
Mehdi Amini [Sat, 4 Jun 2022 08:35:45 +0000 (08:35 +0000)]
Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration."

This reverts commit bcfc0a9051014437b55ab932d9aca5ecdca6776b.

The build is broken with shared library enabled.

2 years agoRemove unneeded cl::ZeroOrMore for cl::opt options
Fangrui Song [Sat, 4 Jun 2022 07:10:42 +0000 (00:10 -0700)]
Remove unneeded cl::ZeroOrMore for cl::opt options

Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.

2 years ago[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal...
Christian Sigg [Fri, 3 Jun 2022 21:15:52 +0000 (23:15 +0200)]
[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.

This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% on average, sometimes more) because:

- it performs fewer Newton iterations
- it avoids the slow path for e.g. denormals
- it allows reuse of the reciprocal for multiple divisions by the same divisor

Test program:
```
#include <stdio.h>
#include "cuda_fp16.h"

// This is a variant of CUDA's own __hdiv which is faster than hdiv_promote below
// and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values.
__device__ half hdiv_newton(half a, half b) {
  float fa = __half2float(a);
  float fb = __half2float(b);

  float rcp;
  asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb));

  float result = fa * rcp;
  auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000;
  if (exponent != 0 && exponent != 0x7f800000) {
    float err = __fmaf_rn(-fb, result, fa);
    result = __fmaf_rn(rcp, err, result);
  }

  return __float2half(result);
}

// Surprisingly, this is faster than CUDA's own __hdiv.
__device__ half hdiv_promote(half a, half b) {
  return __float2half(__half2float(a) / __half2float(b));
}

// This is an approximation that is accurate up to 1 ulp.
__device__ half hdiv_approx(half a, half b) {
  float fa = __half2float(a);
  float fb = __half2float(b);

  float result;
  asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb));
  return __float2half(result);
}

__global__ void CheckCorrectness() {
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  half x = reinterpret_cast<const half&>(i);
  for (int j = 0; j < 65536; ++j) {
    half y = reinterpret_cast<const half&>(j);
    half d1 = hdiv_newton(x, y);
    half d2 = hdiv_promote(x, y);
    auto s1 = reinterpret_cast<const short&>(d1);
    auto s2 = reinterpret_cast<const short&>(d2);
    if (s1 != s2) {
      printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n",
             __half2float(x), i, __half2float(y), j, __half2float(d1), s1,
             __half2float(d2), s2);
      //__trap();
    }
  }
}

__device__ half dst;

__global__ void ProfileBuiltin(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = x / x;
  }
  dst = x;
}

__global__ void ProfilePromote(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_promote(x, x);
  }
  dst = x;
}

__global__ void ProfileNewton(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_newton(x, x);
  }
  dst = x;
}

__global__ void ProfileApprox(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_approx(x, x);
  }
  dst = x;
}

int main() {
  CheckCorrectness<<<256, 256>>>();
  half one = __float2half(1.0f);
  ProfileBuiltin<<<1, 1>>>(one);  // 1.001s
  ProfilePromote<<<1, 1>>>(one);  // 0.560s
  ProfileNewton<<<1, 1>>>(one);   // 0.508s
  ProfileApprox<<<1, 1>>>(one);   // 0.304s
  auto status = cudaDeviceSynchronize();
  printf("%s\n", cudaGetErrorString(status));
}
```

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D126158

2 years ago[flang][runtime] Signal new I/O error on floating-point input overflow
Peter Klausler [Fri, 3 Jun 2022 20:26:28 +0000 (13:26 -0700)]
[flang][runtime] Signal new I/O error on floating-point input overflow

Besides raising the IEEE floating-point overflow exception, treat
a floating-point overflow on input as an I/O error catchable with
ERR=, IOSTAT=, &/or IOMSG=.
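
The general pattern of turning an overflowing conversion into a reportable status can be sketched with the C library. `ReadStatus` and `readReal` are hypothetical names, and the flang runtime uses its own decimal conversion rather than `strtod`.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical status codes for a single numeric input field.
enum class ReadStatus { Ok, Overflow, Invalid };

// Parse Text as a double; report overflow as a distinct, recoverable
// status instead of silently producing infinity.
ReadStatus readReal(const char *Text, double &Out) {
  errno = 0;
  char *End = nullptr;
  const double Value = std::strtod(Text, &End);
  if (End == Text)
    return ReadStatus::Invalid; // no characters were consumed
  if (errno == ERANGE && std::isinf(Value))
    return ReadStatus::Overflow; // magnitude too large for double
  Out = Value;
  return ReadStatus::Ok;
}
```

A caller can then map `ReadStatus::Overflow` to whatever error channel it has, much as ERR=, IOSTAT=, and IOMSG= do for a Fortran program.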

Differential Revision: https://reviews.llvm.org/D127022

2 years ago[BOLT][UTILS] Usability improvements for nfc-check-setup
Amir Ayupov [Sat, 4 Jun 2022 05:54:32 +0000 (22:54 -0700)]
[BOLT][UTILS] Usability improvements for nfc-check-setup

1. Stash local changes before checkout.
2. Print a message that the source repository revision has been changed, with
   instructions to switch back.
3. Make the script executable.
4. Print sample instructions how to run bolt tests.
5. Assume that llvm-bolt-wrapper script is in the same source directory.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126941

2 years ago[flang] Don't discard lower bounds of implicit-shape named constants
Peter Klausler [Mon, 30 May 2022 23:27:49 +0000 (16:27 -0700)]
[flang] Don't discard lower bounds of implicit-shape named constants

F18 preserves lower bounds of explicit-shape named constant arrays, but
failed to also do so for implicit-shape named constants.  Fix.

Differential Revision: https://reviews.llvm.org/D127021

2 years ago[flang][runtime] Ensure that 0. <= RANDOM_NUMBER() < 1.
Peter Klausler [Mon, 30 May 2022 23:13:48 +0000 (16:13 -0700)]
[flang][runtime] Ensure that 0. <= RANDOM_NUMBER() < 1.

It was possible for RANDOM_NUMBER() to return 1.0.
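
One way such a result can arise, and one common way to rule it out, is sketched below. This is an illustration under assumptions, not the flang runtime code: both function names are hypothetical and the sketch works on 32 random bits and single precision.

```cpp
#include <cstdint>

// Naive scaling: converting 2^32 - 1 to float rounds up to 2^32, so the
// quotient can be exactly 1.0f.
float naiveUnitFloat(std::uint32_t Bits) {
  return static_cast<float>(Bits) / 4294967296.0f;
}

// Safer scaling: keep only the top 24 bits, which a float significand
// represents exactly, then scale by 2^-24. The maximum is 1 - 2^-24.
float unitIntervalFloat(std::uint32_t Bits) {
  return static_cast<float>(Bits >> 8) * (1.0f / 16777216.0f);
}
```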

Differential Revision: https://reviews.llvm.org/D127020

2 years agoRevert D126950 "[lld][WebAssembly] Retain data segments referenced via __start/__stop"
Fangrui Song [Sat, 4 Jun 2022 05:18:06 +0000 (22:18 -0700)]
Revert D126950 "[lld][WebAssembly] Retain data segments referenced via __start/__stop"

This reverts commit dcf3368e33c3a01bd21b692d3be5dc1ecee587f4.

It breaks -DLLVM_ENABLE_ASSERTIONS=on builds. In addition, the description is
incorrect about ld.lld behavior. For wasm, there should be justification to add
the new mode.

2 years ago[flang] Distinguish intrinsic module USE in module files; correct search paths
Peter Klausler [Mon, 30 May 2022 19:47:32 +0000 (12:47 -0700)]
[flang] Distinguish intrinsic module USE in module files; correct search paths

In the USE statements that f18 emits to module files, ensure that symbols
from intrinsic modules are marked as such on their USE statements.  And
ensure that the current working directory (".") cannot override the intrinsic
module search path when trying to locate an intrinsic module.

Differential Revision: https://reviews.llvm.org/D127019

2 years ago[Hexagon][bolt] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Fangrui Song [Sat, 4 Jun 2022 05:04:57 +0000 (22:04 -0700)]
[Hexagon][bolt] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC

Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e

2 years ago[clang-link-wrapper] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Fangrui Song [Sat, 4 Jun 2022 05:02:11 +0000 (22:02 -0700)]
[clang-link-wrapper] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC

Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e

2 years ago[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Fangrui Song [Sat, 4 Jun 2022 04:59:05 +0000 (21:59 -0700)]
[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC

Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.

2 years ago[RISCV] Add more patterns for FNMADD
LiaoChunyu [Thu, 2 Jun 2022 03:50:54 +0000 (11:50 +0800)]
[RISCV] Add more patterns for FNMADD

D54205 handles fnmadd: -rs1 * rs2 - rs3
This patch adds fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA)
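
The two forms agree except for the sign of a zero result, which is why the second pattern needs nsz. A stand-alone sketch with `std::fma` shows the difference; the function names are hypothetical and `double` stands in for the register type.

```cpp
#include <cmath>

// Form handled by D54205: (-A * B) - C, computed as one fused operation.
double negMulSub(double A, double B, double C) {
  return std::fma(-A, B, -C);
}

// Form added here: -(A * B + C), a negated fused multiply-add.
double negOfMulAdd(double A, double B, double C) {
  return -std::fma(A, B, C);
}
```

For ordinary inputs both return the same value, e.g. -10 for (2, 3, 4). For `A = 0.0`, `B = 1.0`, `C = -0.0` under the default rounding mode, the first yields +0.0 and the second yields -0.0, so the rewrite is only valid when signed zeros may be ignored.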

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126852

2 years ago[libc++][ranges][NFC] Fix a patch link in ranges status.
varconst [Sat, 4 Jun 2022 03:39:00 +0000 (20:39 -0700)]
[libc++][ranges][NFC] Fix a patch link in ranges status.

2 years ago[libc++][ranges][NFC] Mark range algorithms that are in progress.
varconst [Sat, 4 Jun 2022 03:02:46 +0000 (20:02 -0700)]
[libc++][ranges][NFC] Mark range algorithms that are in progress.

2 years ago[lld][WebAssembly] Retain data segments referenced via __start/__stop
Yuta Saito [Sat, 4 Jun 2022 02:28:31 +0000 (02:28 +0000)]
[lld][WebAssembly] Retain data segments referenced via __start/__stop

As the ELF linker does, retain all data segments named X that are referenced
through `__start_X` or `__stop_X`.

For example, `FOO_MD` should not be stripped in the case below, but it is currently stripped by mistake:

```llvm
@FOO_MD  = global [4 x i8] c"bar\00", section "foo_md", align 1
@__start_foo_md = external constant i8*
@__stop_foo_md = external constant i8*
@llvm.used = appending global [1 x i8*] [i8* bitcast (i32 ()* @foo_md_size to i8*)], section "llvm.metadata"

define i32 @foo_md_size()  {
entry:
  ret i32 sub (
    i32 ptrtoint (i8** @__stop_foo_md to i32),
    i32 ptrtoint (i8** @__start_foo_md to i32)
  )
}
```
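
At the source level, the same pattern looks roughly like the following C++ sketch. It assumes a GNU-compatible toolchain (the `section` and `used` attributes) and a linker that defines `__start_X`/`__stop_X` for sections whose names are valid C identifiers; it has only been thought through for ELF targets, and the function name mirrors the IR above.

```cpp
#include <cstddef>
#include <cstdint>

// Place a 4-byte payload in a custom section named "foo_md". The `used`
// attribute keeps the compiler from dropping the unreferenced object.
__attribute__((section("foo_md"), used))
static const char FOO_MD[4] = {'b', 'a', 'r', '\0'};

// Boundary symbols provided by the linker, not defined in this file.
extern "C" const char __start_foo_md[];
extern "C" const char __stop_foo_md[];

// Size of the section, computed from the boundary symbols (the source-level
// counterpart of the ptrtoint subtraction in the IR).
std::size_t foo_md_size() {
  return static_cast<std::size_t>(
      reinterpret_cast<std::uintptr_t>(__stop_foo_md) -
      reinterpret_cast<std::uintptr_t>(__start_foo_md));
}
```

Nothing references `FOO_MD` by name; only the boundary symbols reach it, which is why the linker has to treat a reference to `__start_foo_md` or `__stop_foo_md` as keeping the whole segment alive.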

This fixes https://github.com/llvm/llvm-project/issues/55839

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D126950

2 years ago[flang] Correct folding of CSHIFT and EOSHIFT for DIM>1
Peter Klausler [Sun, 29 May 2022 21:18:51 +0000 (14:18 -0700)]
[flang] Correct folding of CSHIFT and EOSHIFT for DIM>1

The algorithm was wrong for higher dimensions, and so were
the expected test results.  Rework.
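
For reference, the semantics that the folded result must match can be sketched for a rank-2 array as below. This is only an illustration of what CSHIFT computes along a chosen dimension, not flang's folding code; the `Matrix` alias and the `cshift` helper are hypothetical, and indices are zero-based.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Circular shift of a rank-2 array along dimension `dim` (1 = first index,
// 2 = second index). Element i of the result along `dim` comes from element
// modulo(i + shift, extent) of the source, as in Fortran's CSHIFT.
Matrix cshift(const Matrix &a, long shift, int dim) {
  Matrix result(a);
  const long rows = static_cast<long>(a.size());
  const long cols = rows ? static_cast<long>(a[0].size()) : 0;
  if (rows == 0 || cols == 0) return result;
  for (long i = 0; i < rows; ++i) {
    for (long j = 0; j < cols; ++j) {
      long si = i, sj = j;
      if (dim == 1) {
        si = ((i + shift) % rows + rows) % rows;  // non-negative modulo
      } else {
        sj = ((j + shift) % cols + cols) % cols;
      }
      result[static_cast<std::size_t>(i)][static_cast<std::size_t>(j)] =
          a[static_cast<std::size_t>(si)][static_cast<std::size_t>(sj)];
    }
  }
  return result;
}
```

For the 2x3 array with rows (1, 2, 3) and (4, 5, 6), a shift of 1 along dimension 2 gives rows (2, 3, 1) and (5, 6, 4), while the same shift along dimension 1 swaps the two rows.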

Differential Revision: https://reviews.llvm.org/D127018

2 years ago[pseudo] Fix leaks after D126731
Fangrui Song [Sat, 4 Jun 2022 01:43:15 +0000 (18:43 -0700)]
[pseudo] Fix leaks after D126731

Array operator new cookies help lsan find allocations, while std::array
can't.

2 years ago[flang][runtime] Signal format error when input field width is zero
Peter Klausler [Sun, 29 May 2022 17:12:57 +0000 (10:12 -0700)]
[flang][runtime] Signal format error when input field width is zero

A data edit descriptor for input may not have a zero field width.

Differential Revision: https://reviews.llvm.org/D127017

2 years ago[flang][runtime] OPEN write-only files
Peter Klausler [Sun, 29 May 2022 15:36:57 +0000 (08:36 -0700)]
[flang][runtime] OPEN write-only files

If a file being opened with no ACTION= is write-only then cope with
it rather than defaulting prematurely to treating it as read-only.

Differential Revision: https://reviews.llvm.org/D127015

2 years ago[RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI.
Craig Topper [Sat, 4 Jun 2022 00:58:22 +0000 (17:58 -0700)]
[RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI.

This fixes an inconsistency between RV32 and RV64. Still considering
trying to do this peephole during isel, but wanted to fix the
inconsistency first.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126986

2 years ago[flang][runtime] INQUIRE(FILE="...",SIZE=nbytes)
Peter Klausler [Sun, 29 May 2022 15:21:59 +0000 (08:21 -0700)]
[flang][runtime] INQUIRE(FILE="...",SIZE=nbytes)

Implement inquire-by-file SIZE= specifier.

Differential Revision: https://reviews.llvm.org/D127014

2 years ago[clang][test] Mark test arm-float-abi-lto.c unsupported on AIX
Jake Egan [Sat, 4 Jun 2022 01:00:47 +0000 (21:00 -0400)]
[clang][test] Mark test arm-float-abi-lto.c unsupported on AIX

This test has been failing since the introduction of opaque pointers (https://reviews.llvm.org/D125847). The test is flaky and fails with a segmentation fault, but it's unclear why. So, mark this test unsupported while it's investigated.

2 years ago[test] Modify test to verify D126396 (Clean "./" from __FILE__ expansion)
Paul Pluzhnikov [Sat, 4 Jun 2022 00:54:02 +0000 (17:54 -0700)]
[test] Modify test to verify D126396 (Clean "./" from __FILE__ expansion)

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D127009

2 years ago[flang][runtime] Allow extra character for E0.0 output editing
Peter Klausler [Sat, 28 May 2022 23:11:43 +0000 (16:11 -0700)]
[flang][runtime] Allow extra character for E0.0 output editing

When the digit count ('d') is zero in E0 editing, allow for one more
output character; otherwise, any - or + sign in the output causes
an output field overflow.

Differential Revision: https://reviews.llvm.org/D127013

2 years ago[mlir][sparse] Adding IsSparseTensorPred and updating ops to use it
wren romano [Fri, 3 Jun 2022 23:41:02 +0000 (16:41 -0700)]
[mlir][sparse] Adding IsSparseTensorPred and updating ops to use it

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126994

2 years ago[flang][runtime] Fix bug with extra leading zero in octal output
Peter Klausler [Sat, 28 May 2022 22:36:02 +0000 (15:36 -0700)]
[flang][runtime] Fix bug with extra leading zero in octal output

Octal (O) output editing often emits an extra leading 0 digit
due to the total digit count being off by one since word sizes
aren't multiples of three bits.
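
The digit arithmetic behind this can be sketched as follows. The helper names are hypothetical and the code is not taken from the flang runtime; it only shows that the digit count is a rounded-up division and that the leading digit carries fewer than three bits.

```cpp
#include <cstdint>
#include <string>

// Octal digits needed to print a value that is `bits` bits wide: ceil(bits / 3).
constexpr int octal_digits(int bits) { return (bits + 2) / 3; }

// Significant bits carried by the leading octal digit (1 or 2 when the word
// size is not a multiple of three).
constexpr int leading_digit_bits(int bits) {
  return bits - 3 * (octal_digits(bits) - 1);
}

// Full-width octal rendering of a 32-bit value, zero-padded to 11 digits.
std::string to_octal(std::uint32_t value) {
  const int digits = octal_digits(32);
  std::string out(static_cast<std::size_t>(digits), '0');
  for (int i = digits - 1; i >= 0; --i) {
    out[static_cast<std::size_t>(i)] = static_cast<char>('0' + (value & 7u));
    value >>= 3;
  }
  return out;
}
```

A 32-bit word needs 11 digits with a 2-bit leading digit, and a 64-bit word needs 22 digits with a 1-bit leading digit, so a count that is off by one shows up as a spurious leading zero.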

Differential Revision: https://reviews.llvm.org/D127012

2 years ago[flang] Fix crash in IsSaved()
Peter Klausler [Sat, 28 May 2022 18:54:57 +0000 (11:54 -0700)]
[flang] Fix crash in IsSaved()

Code was accessing ProcEntityDetails in a symbol that didn't have them.

Differential Revision: https://reviews.llvm.org/D127011

2 years ago[NFC] [libunwind] turn assert into static_assert
Florian Mayer [Fri, 3 Jun 2022 18:45:04 +0000 (11:45 -0700)]
[NFC] [libunwind] turn assert into static_assert

Reviewed By: #libunwind, MaskRay

Differential Revision: https://reviews.llvm.org/D126987

2 years ago[tools] Forward declare classes & remove includes
Clemens Wasser [Fri, 3 Jun 2022 23:32:04 +0000 (16:32 -0700)]
[tools] Forward declare classes & remove includes

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D120208

2 years ago[mlir][linalg] fix crash in vectorization of elementwise operations
Christopher Bate [Fri, 3 Jun 2022 20:23:45 +0000 (14:23 -0600)]
[mlir][linalg] fix crash in vectorization of elementwise operations

The current vectorization logic implicitly expects "elementwise"
linalg ops to have projected permutations for indexing maps, but
the precondition logic misses this check. This can result in a
crash when executing the generic vectorization transform on an op
with a non-projected permutation input indexing map. This change
fixes the logic and adds a test (which crashes without this fix).

Differential Revision: https://reviews.llvm.org/D127000

2 years ago[DWARF] Show which augmentation character was unrecognized.
Florian Mayer [Fri, 3 Jun 2022 21:09:34 +0000 (14:09 -0700)]
[DWARF] Show which augmentation character was unrecognized.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D127003

2 years ago[Hexagon] Enable IAS in the Hexagon backend
Brad Smith [Fri, 3 Jun 2022 22:15:12 +0000 (18:15 -0400)]
[Hexagon] Enable IAS in the Hexagon backend

Reviewed By: kparzysz

Differential Revision: https://reviews.llvm.org/D123096

2 years ago[clang] Allow const variables with weak attribute to be overridden
Anders Waldenborg [Tue, 24 May 2022 19:46:32 +0000 (21:46 +0200)]
[clang] Allow const variables with weak attribute to be overridden

A variable with the `weak` attribute signifies that it can be replaced with
a "strong" symbol at link time. Therefore it must not be emitted with
"weak_odr" linkage, as that allows the backend to use its value in
optimizations.

The frontend already considers weak const variables as
non-constant (note_constexpr_var_init_weak diagnostic) so this change
makes frontend and backend consistent.
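
A minimal sketch of the kind of declaration affected, assuming a GNU-compatible compiler on a target that supports weak symbols (the variable and function names are hypothetical):

```cpp
// Default value. A strong definition of `feature_level` in another
// translation unit would replace this one at link time.
extern "C" __attribute__((weak)) const int feature_level = 1;

// Because the definition above may be overridden, the value has to be loaded
// from the symbol rather than folded to the constant 1.
int read_feature_level() { return feature_level; }
```

If no other object file defines `feature_level`, `read_feature_level()` returns 1. If one does, for example `extern "C" const int feature_level = 2;`, the strong definition wins, which is only observable when the compiler has not already substituted the weak initializer.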

This commit reverses the
  f49573d1 weak globals that are const should get weak_odr linkage.
commit from 2009-08-05 which introduced this behavior. Unfortunately
that commit doesn't provide any details on why the change was made.

This was discussed in
https://discourse.llvm.org/t/weak-attribute-semantics-on-const-variables/62311

Differential Revision: https://reviews.llvm.org/D126324