platform/upstream/llvm.git
21 months ago[DebugInfo][test] XFAIL DebugInfo/Generic/missing-abstract-variable.ll on LoongArch
Weining Lu [Fri, 30 Sep 2022 01:29:43 +0000 (09:29 +0800)]
[DebugInfo][test] XFAIL DebugInfo/Generic/missing-abstract-variable.ll on LoongArch

The same as SPARC and RISCV. See D119122.

Differential Revision: https://reviews.llvm.org/D134932

21 months ago[flang][NFC] Use prefixed accessors for fircg dialect
Valentin Clement [Mon, 3 Oct 2022 09:02:23 +0000 (11:02 +0200)]
[flang][NFC] Use prefixed accessors for fircg dialect

The raw accessor is going away soon so switch to prefixed accessors in the
fircg dialect. The main dialect was switched some months ago.

https://github.com/llvm/llvm-project/issues/58090

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D135061

21 months ago[Clang][MinGW][cygwin] Fix __declspec with -fdeclspec enabled
Alvin Wong [Mon, 3 Oct 2022 07:40:16 +0000 (10:40 +0300)]
[Clang][MinGW][cygwin] Fix __declspec with -fdeclspec enabled

Fixes https://github.com/llvm/llvm-project/issues/49958

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D135027

21 months ago[LLD][COFF] Reduce chance of symbol name collision with delay-load
Alvin Wong [Mon, 3 Oct 2022 07:40:01 +0000 (10:40 +0300)]
[LLD][COFF] Reduce chance of symbol name collision with delay-load

Delay-loaded imports create a load thunk with a symbol name. Before this
change, the name uses a `__imp_load_` prefix. On the other hand, normal
import uses the `__imp_` prefix for the import address pointer. If an
import symbol named `load_func` is imported normally and another named
`func` is imported using delay-load, this can cause a symbol name
collision.

This patch changes delay-load imports to use `__imp___load_` prefix.
Because it is less likely for normal imports to have a name starting in
`__load_` this should reduce the chance of a name collision.
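
The collision described above can be reproduced with plain string concatenation. This is only an illustrative sketch: the helper names are hypothetical and are not LLD functions; only the three prefixes come from the commit message.

```cpp
#include <string>

// Hypothetical helpers (not LLD code): build symbol names by prefixing.
std::string normalImportSymbol(const std::string &name) {
  return "__imp_" + name; // import address pointer of a normal import
}

std::string oldDelayLoadThunk(const std::string &name) {
  return "__imp_load_" + name; // prefix used before this change
}

std::string newDelayLoadThunk(const std::string &name) {
  return "__imp___load_" + name; // prefix used after this change
}
```

With these helpers, `normalImportSymbol("load_func")` and `oldDelayLoadThunk("func")` produce the same string, while `newDelayLoadThunk("func")` does not.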

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D134464

21 months ago[LLD][COFF] Set OrdinalBase to 1 for export table
Alvin Wong [Mon, 3 Oct 2022 07:39:48 +0000 (10:39 +0300)]
[LLD][COFF] Set OrdinalBase to 1 for export table

Before this, LLD sets OrdinalBase to 0, which deviates from usual
practices. This technically would allow LLD to export a symbol using
ordinal 0; however, LLD never uses export ordinal 0, which results in
binaries with export tables always having an empty export at ordinal 0.

This change makes LLD set OrdinalBase to 1 and not create the empty
export with ordinal 0, which makes its behaviour more in line with both
the MSVC linker and the GNU linker.
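
A rough model of why the base matters, assuming the usual PE convention that slot k of the export address table corresponds to ordinal OrdinalBase + k. The function and parameter names are hypothetical and this is not LLD code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not LLD code. exportRvas[i] is the address of the
// export that was assigned ordinal i + 1 (ordinals start at 1). Slot k of
// the returned table corresponds to ordinal ordinalBase + k, and a value of
// 0 marks an empty slot. ordinalBase must be 0 or 1 in this sketch.
std::vector<uint32_t> buildExportAddressTable(
    const std::vector<uint32_t> &exportRvas, uint32_t ordinalBase) {
  std::vector<uint32_t> table(exportRvas.size() + 1 - ordinalBase, 0);
  for (std::size_t i = 0; i < exportRvas.size(); ++i)
    table[(i + 1) - ordinalBase] = exportRvas[i];
  return table;
}
```

With a base of 0 the table gains one leading empty slot; with a base of 1 every slot is used.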

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D134140

21 months ago[libc++] Remove a part of reverted D131898 or D130695
Vitaly Buka [Mon, 3 Oct 2022 07:49:52 +0000 (00:49 -0700)]
[libc++] Remove a part of reverted D131898 or D130695

21 months agoRevert "[libc++] Updates generated transitve includes."
Vitaly Buka [Mon, 3 Oct 2022 07:39:19 +0000 (00:39 -0700)]
Revert "[libc++] Updates generated transitve includes."

Looks like a part of reverted D131898.

This reverts commit cfd5b8f11195f5c23ae6d4aa2df2d95cae9b976e.

21 months ago[flang] Make real type of kind 10 target dependent
Peixin Qiao [Mon, 3 Oct 2022 07:24:39 +0000 (15:24 +0800)]
[flang] Make real type of kind 10 target dependent

The real(10) is supported on x86_64. On aarch64, the value of
selected_real_kind(16) should be 16 rather than 10 since real(10)
is not supported on aarch64. Previously, the real type support check
was not target dependent. Support it now through the target triple
information.
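
As a sketch of the intended behaviour (written in C++ for illustration, not flang's actual implementation): the helper names are hypothetical, and the decimal precisions assumed for each kind are 6, 15, 18 and 33 for kinds 4, 8, 10 and 16.

```cpp
#include <string>

// Hypothetical check: only x86_64 triples are treated as having real(10).
bool tripleHasReal10(const std::string &triple) {
  return triple.rfind("x86_64", 0) == 0; // true if triple starts with x86_64
}

// Hypothetical helper: smallest real kind offering at least `precision`
// decimal digits, or -1 if no kind does. Assumed precisions per kind:
// 4 -> 6, 8 -> 15, 10 -> 18, 16 -> 33.
int selectedRealKind(int precision, bool targetHasReal10) {
  if (precision <= 6)
    return 4;
  if (precision <= 15)
    return 8;
  if (precision <= 18 && targetHasReal10)
    return 10;
  if (precision <= 33)
    return 16;
  return -1;
}
```

Under these assumptions a request for 16 digits yields kind 10 on an x86_64 triple and kind 16 on an aarch64 triple.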

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D134021

21 months ago[Bazel] fixes for 9f77909.
Christian Sigg [Mon, 3 Oct 2022 07:06:42 +0000 (09:06 +0200)]
[Bazel] fixes for 9f77909.

21 months ago[mlir][bufferize][NFC] Fix FileCheck capture
Matthias Springer [Mon, 3 Oct 2022 07:05:39 +0000 (16:05 +0900)]
[mlir][bufferize][NFC] Fix FileCheck capture

One of the test cases matched IR from a subsequent test case. For this reason, the test case appeared to pass while it is actually broken.

This change does not fix the test case itself. It will be fixed when we overhaul the buffer deallocation implementation. (The memory leak in this test case is an edge case.)

Differential Revision: https://reviews.llvm.org/D135046

21 months ago[GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo.
Amara Emerson [Sun, 2 Oct 2022 21:28:29 +0000 (22:28 +0100)]
[GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo.

Before, the isPreLegalize() query in CombinerHelper only checked for the
presence of a LegalizerInfo object. This is problematic when we want to have
a combine actually check for legality in a pre-legalizer combine pass, since
if we pass a LegalizerInfo object to the constructor it causes the combines to
think that we're running *post* legalizer, which isn't true.

This change fixes it to instead check an explicit bool that is passed to signal
whether the pass will be run before or after legalization.

Doing so exposed a bug in the extending loads combine, which tried to check for
legality of candidate extending loads if LegalizerInfo was present. Since we
only ran it pre-legalizer and therefore with a null LegalizerInfo, it never
actually ran. Also fixes the legality checks to keep the tests passing.
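
The difference between the two queries can be sketched with stand-in types. The classes below are hypothetical and much smaller than the real CombinerHelper; they only show why inferring the phase from the pointer breaks once a pre-legalizer pass also wants legality information.

```cpp
// Hypothetical stand-ins; these are not the real GlobalISel classes.
struct LegalizerInfoSketch {};

class CombinerHelperSketch {
  const LegalizerInfoSketch *LI; // may be non-null even before legalization
  bool IsPreLegalize;            // explicit phase flag

public:
  CombinerHelperSketch(bool IsPreLegalize,
                       const LegalizerInfoSketch *LI = nullptr)
      : LI(LI), IsPreLegalize(IsPreLegalize) {}

  // Old behaviour as described above: phase inferred from LI's presence.
  bool isPreLegalizeOld() const { return LI == nullptr; }

  // New behaviour: the phase comes from the explicit flag.
  bool isPreLegalize() const { return IsPreLegalize; }

  // Legality can be queried whenever a LegalizerInfo was supplied.
  bool canQueryLegality() const { return LI != nullptr; }
};
```

A pre-legalizer helper constructed with a LegalizerInfo now reports the right phase and can still check legality, whereas the old query would have claimed it was running post-legalizer.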

Differential Revision: https://reviews.llvm.org/D135044

21 months ago[mlir][interfaces] Add ShapedDimOpInterface
Matthias Springer [Mon, 3 Oct 2022 04:53:16 +0000 (13:53 +0900)]
[mlir][interfaces] Add ShapedDimOpInterface

This interface is implemented by memref.dim and tensor.dim. This change makes it possible to remove a build dependency of the Affine dialect on the Tensor dialect (and maybe also the MemRef dialect in the future).

Differential Revision: https://reviews.llvm.org/D133595

21 months ago[ELF] Replace some config->ekind with file->ekind. NFC
Fangrui Song [Mon, 3 Oct 2022 04:27:41 +0000 (21:27 -0700)]
[ELF] Replace some config->ekind with file->ekind. NFC

21 months agoRevert "Add APFloat and MLIR type support for fp8 (e5m2)."
Vitaly Buka [Mon, 3 Oct 2022 04:21:51 +0000 (21:21 -0700)]
Revert "Add APFloat and MLIR type support for fp8 (e5m2)."

Breaks bots https://lab.llvm.org/buildbot/#/builders/37/builds/17086

This reverts commit 2dc68b5398258c7a0cf91f10192d058e787afcdf.

21 months ago[ELF] Move init from ELFFileBase constructor to a separate function. NFC
Fangrui Song [Mon, 3 Oct 2022 04:10:28 +0000 (21:10 -0700)]
[ELF] Move init from ELFFileBase constructor to a separate function. NFC

21 months ago[mlir][shape] add outline-shape-computation pass
Yuanqiang Liu [Mon, 3 Oct 2022 03:24:49 +0000 (20:24 -0700)]
[mlir][shape] add outline-shape-computation pass

Add outline-shape-computation pass. This pass outlines the
shape computation part in high level IR by adding shape.func and
populating the corresponding mapping information in ShapeMappingAnalysis.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D131810

21 months ago[ELF] Remove redundant getELFKind call. NFC
Fangrui Song [Mon, 3 Oct 2022 03:16:13 +0000 (20:16 -0700)]
[ELF] Remove redundant getELFKind call. NFC

21 months ago[ELF] Simplify addFile. NFC
Fangrui Song [Mon, 3 Oct 2022 02:49:17 +0000 (19:49 -0700)]
[ELF] Simplify addFile. NFC

21 months ago[mlir][linalg][NFC] Drop emitAccessorPrefix from Linalg dialect
Matthias Springer [Mon, 3 Oct 2022 02:29:47 +0000 (11:29 +0900)]
[mlir][linalg][NFC] Drop emitAccessorPrefix from Linalg dialect

Differential Revision: https://reviews.llvm.org/D135048

21 months ago[gn build] Port 71410fd2c065
LLVM GN Syncbot [Mon, 3 Oct 2022 01:41:14 +0000 (01:41 +0000)]
[gn build] Port 71410fd2c065

21 months agoRevert "[libc++] Implement P0591R4 (Utility functions to implement uses-allocator...
Vitaly Buka [Mon, 3 Oct 2022 01:39:11 +0000 (18:39 -0700)]
Revert "[libc++] Implement P0591R4 (Utility functions to implement uses-allocator construction)"

Breaks ubsan tests https://lab.llvm.org/buildbot/#/builders/85/builds/11131

This reverts commit 099384dcea49f5f4b0dc7e615c9845bf9baad4bc.

21 months agoAdd APFloat and MLIR type support for fp8 (e5m2).
Stella Laurenzo [Wed, 27 Jul 2022 02:02:37 +0000 (19:02 -0700)]
Add APFloat and MLIR type support for fp8 (e5m2).

This is a first step towards high level representation for fp8 types
that have been built in to hardware with near term roadmaps. Like the
BFLOAT16 type, the family of fp8 types are inspired by IEEE-754 binary
floating point formats but, due to the size limits, have been tweaked in
various ways in order to maximally use the range/precision in various
scenarios. The list of variants is small/finite and bounded by real
hardware.

This patch introduces the E5M2 FP8 format as proposed by Nvidia, ARM,
and Intel in the paper: https://arxiv.org/pdf/2209.05433.pdf
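
For orientation, a minimal decoder for a single E5M2 byte, assuming the IEEE-style layout of 1 sign bit, 5 exponent bits (bias 15) and 2 mantissa bits, with subnormals, infinities and NaNs encoded as in FP16. The function name is hypothetical and the sketch does not use APFloat.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: decode one E5M2 byte into a float.
// Assumed layout: [sign:1][exponent:5, bias 15][mantissa:2].
float decodeE5M2(uint8_t bits) {
  const int sign = bits >> 7;
  const int exponent = (bits >> 2) & 0x1F;
  const int mantissa = bits & 0x3;
  float magnitude;
  if (exponent == 0x1F) {
    // All-ones exponent: infinity when the mantissa is zero, NaN otherwise.
    magnitude = mantissa == 0 ? std::numeric_limits<float>::infinity()
                              : std::numeric_limits<float>::quiet_NaN();
  } else if (exponent == 0) {
    // Zero or subnormal: (mantissa / 4) * 2^-14 == mantissa * 2^-16.
    magnitude = std::ldexp(static_cast<float>(mantissa), -16);
  } else {
    // Normal: (1 + mantissa / 4) * 2^(exponent - 15).
    magnitude = std::ldexp(static_cast<float>(4 + mantissa), exponent - 17);
  }
  return sign ? -magnitude : magnitude;
}
```

Under these assumptions `0x3C` decodes to 1.0 and `0x7B` to 57344, the largest finite value.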

As the more conformant of the two implemented datatypes, we are plumbing
it through LLVM's APFloat type and MLIR's type system first as a
template. It will be followed by the range optimized E4M3 FP8 format
described in the paper. Since that format deviates further from the
IEEE-754 norms, it may require more debate and implementation
complexity.

Given that we see two parts of the FP8 implementation space represented
by these cases, we are recommending naming of:

* `F8M<N>` : For FP8 types that can be conceived of as following the
  same rules as FP16 but with a smaller number of mantissa/exponent
  bits. Including the number of mantissa bits in the type name is enough
  to fully specify the type. This naming scheme is used to represent
  the E5M2 type described in the paper.
* `F8M<N>F` : For FP8 types such as E4M3 which only support finite
  values.

The first of these (this patch) seems fairly non-controversial. The
second is previewed here to illustrate options for extending to the
other known variant (but can be discussed in detail in the patch
which implements it).

Many conversations about these types focus on the Machine-Learning
ecosystem where they are used to represent mixed-datatype computations
at a high level. At that level (which is why we also expose them in
MLIR), it is important to retain the actual type definition so that when
lowering to actual kernels or target specific code, the correct
promotions, casts and rescalings can be done as needed. We expect that
most LLVM backends will only experience these types as opaque `I8`
values that are applicable to some instructions.

MLIR does not make it particularly easy to add new floating point types
(i.e. the FloatType hierarchy is not open). Given the need to fully
model FloatTypes and make them interop with tooling, such types will
always be "heavy-weight" and it is not expected that a highly open type
system will be particularly helpful. There are also a bounded number of
floating point types in use for current and upcoming hardware, and we
can just implement them like this (perhaps looking for some cosmetic
ways to reduce the number of places that need to change). Creating a
more generic mechanism for extending floating point types seems like it
wouldn't be worth it and we should just deal with defining them one by
one on an as-needed basis when real hardware implements a new scheme.
Hopefully, with some additional production use and complete software
stacks, hardware makers will converge on a set of such types that is not
terribly divergent at the level that the compiler cares about.

(I cleaned up some old formatting and sorted some items for this case:
If we converge on landing this in some form, I will NFC commit format
only changes as a separate commit)

Differential Revision: https://reviews.llvm.org/D133823

21 months ago[RISCV] Add a LocalStackSlotAllocation test
luxufan [Wed, 28 Sep 2022 23:32:34 +0000 (23:32 +0000)]
[RISCV] Add a LocalStackSlotAllocation test

Differential Revision: https://reviews.llvm.org/D134884

21 months ago[gn build] Port a6e1080b87db
LLVM GN Syncbot [Sun, 2 Oct 2022 23:53:57 +0000 (23:53 +0000)]
[gn build] Port a6e1080b87db

21 months agoRevert "[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`"
Vitaly Buka [Sun, 2 Oct 2022 23:23:35 +0000 (16:23 -0700)]
Revert "[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`"

Breaks msan, asan

https://lab.llvm.org/buildbot/#/builders/5/builds/27904

This reverts commit 005916de58f73aa5c4264c084ba7b0e21040d88f.

21 months ago[ELF] Add LLVM_LIBRARY_VISIBILITY to some global variables. NFC
Fangrui Song [Sun, 2 Oct 2022 20:23:52 +0000 (13:23 -0700)]
[ELF] Add LLVM_LIBRARY_VISIBILITY to some global variables. NFC

21 months ago[flang] Introduce fir.class type
Valentin Clement [Sun, 2 Oct 2022 18:11:57 +0000 (20:11 +0200)]
[flang] Introduce fir.class type

Introduce a new ClassType for polymorphic
entities. A fir.class type is similar to a fir.box type in
many ways and is also based on the BaseBoxType.

This patch is part of the implementation of the polymorphic
entities.
https://github.com/llvm/llvm-project/blob/main/flang/docs/PolymorphicEntities.md

Depends on D134956

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D134957

21 months ago[flang] Introduce BaseBoxType
Valentin Clement [Sun, 2 Oct 2022 18:07:18 +0000 (20:07 +0200)]
[flang] Introduce BaseBoxType

Introduce a BaseBoxType to be used by BoxType and
a new ClassType that is introduced in a follow-up patch.

This patch is part of the implementation of the polymorphic
entities.
https://github.com/llvm/llvm-project/blob/main/flang/docs/PolymorphicEntities.md

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D134956

21 months ago[llvm-objdump][test] Improve address test
Fangrui Song [Sun, 2 Oct 2022 17:49:52 +0000 (10:49 -0700)]
[llvm-objdump][test] Improve address test

21 months ago[libc++] Updates generated transitve includes.
Mark de Wever [Sun, 2 Oct 2022 17:37:21 +0000 (19:37 +0200)]
[libc++] Updates generated transitve includes.

This should fix the CI.

21 months ago[InstCombine] convert mul by negative-pow2 to negate and shift
Sanjay Patel [Sun, 2 Oct 2022 16:22:25 +0000 (12:22 -0400)]
[InstCombine] convert mul by negative-pow2 to negate and shift

This is an unusual canonicalization because we create an extra instruction,
but it's likely better for analysis and codegen (similar reasoning as D133399).

InstCombine::Negator may create this kind of multiply from negate and shift,
but this should not conflict because of the narrow negation.

I don't know how to create a fully general proof for this kind of transform in
Alive2, but here's an example with bitwidths similar to one of the regression
tests:
https://alive2.llvm.org/ce/z/J3jTjR
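
The underlying arithmetic identity can be checked with wrapping 32-bit integers. This is only a sketch of the identity, not the IR pattern the patch matches; the function names are hypothetical and `k` is assumed to be less than 32.

```cpp
#include <cstdint>

// x * -(2^k), computed directly with wrapping unsigned arithmetic.
constexpr uint32_t mulByNegPow2(uint32_t x, unsigned k) {
  return x * (0u - (uint32_t{1} << k));
}

// The same value computed as a negate followed by a left shift.
constexpr uint32_t negateThenShift(uint32_t x, unsigned k) {
  return (0u - x) << k;
}

// Compile-time spot check for one input.
static_assert(mulByNegPow2(7, 3) == negateThenShift(7, 3),
              "mul by -8 should equal negate then shift by 3");
```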

Differential Revision: https://reviews.llvm.org/D133667

21 months ago[ValueTracking] peek through fpext in isKnownNeverInfinity()
Sanjay Patel [Sun, 2 Oct 2022 15:19:05 +0000 (11:19 -0400)]
[ValueTracking] peek through fpext in isKnownNeverInfinity()

https://alive2.llvm.org/ce/z/BkNoRW

21 months ago[InstSimplify] add tests for FP infinity compare with fpext; NFC
Sanjay Patel [Sun, 2 Oct 2022 14:49:56 +0000 (10:49 -0400)]
[InstSimplify] add tests for FP infinity compare with fpext; NFC

21 months ago[ARM] Add lowering for bf16 neon vtrn, vzup and vuzp.
David Green [Sun, 2 Oct 2022 14:34:37 +0000 (15:34 +0100)]
[ARM] Add lowering for bf16 neon vtrn, vzup and vuzp.

These go via Dag2Dag, where it is better to select based on element size
rather than on the exact element type.

21 months ago[ARM] More bf16 shuffle handling, including perfect shuffles.
David Green [Sun, 2 Oct 2022 13:31:51 +0000 (14:31 +0100)]
[ARM] More bf16 shuffle handling, including perfect shuffles.

21 months ago[ConstraintElimination] Update Changed status in ssub simplification.
Florian Hahn [Sun, 2 Oct 2022 13:25:51 +0000 (14:25 +0100)]
[ConstraintElimination] Update Changed status in ssub simplification.

Update tryToSimplifyOverflowMath to indicate whether the function made
any changes to the IR.

21 months ago[ARM] Add tablegen patterns for bf16 vrev
David Green [Sun, 2 Oct 2022 12:42:14 +0000 (13:42 +0100)]
[ARM] Add tablegen patterns for bf16 vrev

21 months ago[ARM] Add tablegen patterns for bf16 vext
David Green [Sun, 2 Oct 2022 11:45:58 +0000 (12:45 +0100)]
[ARM] Add tablegen patterns for bf16 vext

This adds missing tablegen patterns for VEXT, identical to the fp16
patterns as they only use baseline Neon operations.
Part of fixing #57770.

21 months ago[ARM][DAG] BF16 constant handling.
David Green [Sun, 2 Oct 2022 10:51:08 +0000 (11:51 +0100)]
[ARM][DAG] BF16 constant handling.

Much like f16 and f32, we shouldn't try to shrink bf16 to a smaller fp
constant.  The code may not be optimal, but this allows us to legalize
bf16 constants under Arm without errors.

21 months agoRevert "[flang] Make real type of kind 10 target dependent"
Peixin Qiao [Sun, 2 Oct 2022 09:45:03 +0000 (17:45 +0800)]
Revert "[flang] Make real type of kind 10 target dependent"

This reverts commit d11e406e369fc90be5e2e2a0798ea7b7d2625882.

21 months ago[test] Make Linux/sem_init_glibc.cpp robust
Fangrui Song [Sun, 2 Oct 2022 07:47:10 +0000 (00:47 -0700)]
[test] Make Linux/sem_init_glibc.cpp robust

and fix it for 32-bit ports defining sem_init@GLIBC_2.0 (i386, mips32, powerpc32) for glibc>=2.36.

Fix https://github.com/llvm/llvm-project/issues/58079

Reviewed By: mgorny

Differential Revision: https://reviews.llvm.org/D135023

21 months ago[flang][OpenMP] Fix resolve common block in data-sharing clauses
Peixin Qiao [Sun, 2 Oct 2022 02:38:27 +0000 (10:38 +0800)]
[flang][OpenMP] Fix resolve common block in data-sharing clauses

The previous resolve only creates the host associated variables for
common block members, but does not replace the original objects with
the newly created ones. Fix it and also compute the sizes and offsets
for the host common block members if they are host associated.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D127214

21 months ago[flang] Make real type of kind 10 target dependent
Peixin Qiao [Sun, 2 Oct 2022 02:26:55 +0000 (10:26 +0800)]
[flang] Make real type of kind 10 target dependent

The real(10) is supported on x86_64. On aarch64, the value of
selected_real_kind(16) should be 16 rather than 10 since real(10)
is not supported on aarch64. Previously, the real type support check
was not target dependent. Support it now through the target triple
information.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D134021

21 months ago[clang][auto-init] Deprecate -enable-trivial-auto-var-init-zero-knowing-it-will-be...
Kees Cook [Fri, 6 May 2022 19:47:43 +0000 (12:47 -0700)]
[clang][auto-init] Deprecate -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang

GCC 12 has been released and contains unconditional support for
-ftrivial-auto-var-init=zero:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-ftrivial-auto-var-init

Maintain compatibility with GCC, and remove the -enable flag for "zero"
mode. The flag is left to generate an "unused" warning, though, to not
break all the existing users. The flag will be fully removed in Clang 17.

Link: https://github.com/llvm/llvm-project/issues/44842
Reviewed By: nickdesaulniers, MaskRay, srhines, xbolva00

Differential Revision: https://reviews.llvm.org/D125142

21 months ago[gn build] Port 005916de58f7
LLVM GN Syncbot [Sun, 2 Oct 2022 00:35:45 +0000 (00:35 +0000)]
[gn build] Port 005916de58f7

21 months ago[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`
Konstantin Varlamov [Sun, 2 Oct 2022 00:28:57 +0000 (17:28 -0700)]
[libc++][ranges]Refactor `copy{,_backward}` and `move{,_backward}`

Instead of using `reverse_iterator`, share the optimization between the 4 algorithms. The key observation here is that `memmove` applies to both `copy` and `move` identically, and to their `_backward` versions very similarly. All algorithms now follow the same pattern along the lines of:
```
if constexpr (can_memmove<InIter, OutIter>) {
  memmove(first, last, out);
} else {
  naive_implementation(first, last, out);
}
```
A follow-up will delete `unconstrained_reverse_iterator`.
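
The shared pattern from the snippet above could look roughly like this as compilable C++17. The trait and function names are hypothetical, and the condition used here (both iterators are raw pointers to the same trivially copyable type) is a deliberate simplification of whatever libc++ actually checks.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical trait: false by default.
template <class In, class Out>
inline constexpr bool can_memmove_v = false;

// True only for two raw pointers to the same trivially copyable type.
template <class T>
inline constexpr bool can_memmove_v<T *, T *> =
    std::is_trivially_copyable_v<T>;

// Copies [first, last) to out and returns the end of the output range.
template <class In, class Out>
Out copy_sketch(In first, In last, Out out) {
  if constexpr (can_memmove_v<In, Out>) {
    // Fast path: one bulk memmove over the whole range.
    const auto n = static_cast<std::size_t>(last - first);
    if (n != 0)
      std::memmove(out, first, n * sizeof(*first));
    return out + n;
  } else {
    // Fallback: element-by-element assignment.
    for (; first != last; ++first, ++out)
      *out = *first;
    return out;
  }
}
```

Both branches must produce the same observable result; the tests mentioned below are about making sure the fast path is taken only when that holds.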

This patch removes duplication and divergence between `std::copy`, `std::move` and `std::move_backward`. It also improves testing:
- the test for whether the optimization is used only applied to `std::copy` and, more importantly, was essentially a no-op because it would still pass if the optimization was not used;
- there were no tests to make sure the optimization is not used when the effect would be visible.

Differential Revision: https://reviews.llvm.org/D130695

21 months ago[mlir] Use std::enable_if_t (NFC)
Kazu Hirata [Sun, 2 Oct 2022 00:24:56 +0000 (17:24 -0700)]
[mlir] Use std::enable_if_t (NFC)

21 months ago[clang] Use std::enable_if_t (NFC)
Kazu Hirata [Sun, 2 Oct 2022 00:24:54 +0000 (17:24 -0700)]
[clang] Use std::enable_if_t (NFC)

21 months ago[ADT] Use std::common_type_t (NFC)
Kazu Hirata [Sun, 2 Oct 2022 00:24:52 +0000 (17:24 -0700)]
[ADT] Use std::common_type_t (NFC)

21 months ago[mlir] Remove ReferTo attr constraint
Jacques Pienaar [Sun, 2 Oct 2022 00:19:14 +0000 (17:19 -0700)]
[mlir] Remove ReferTo attr constraint

The current generation is unsafe as it is evaluated during verify
invocation rather than during verifySymbolUses. Remove until this is
safely generated.

Differential Revision: https://reviews.llvm.org/D134558

21 months ago[llvm] Migrate PAEval to new pass manager
Arthur Eubanks [Sat, 1 Oct 2022 23:40:58 +0000 (16:40 -0700)]
[llvm] Migrate PAEval to new pass manager

21 months ago[RISCV] Use _TIED form of VWADD(U)_WX/VWSUB(U)_WX to avoid early clobber.
Craig Topper [Sat, 1 Oct 2022 23:31:23 +0000 (16:31 -0700)]
[RISCV] Use _TIED form of VWADD(U)_WX/VWSUB(U)_WX to avoid early clobber.

One of the sources is the same size as the destination so that source
doesn't have an overlap with the destination register. By using the _TIED
form we avoid an early clobber constraint for that source.

This matches what was already done for intrinsics. ConvertToThreeAddress
will fix it if it can't stay tied.

21 months ago[RISCV] Minor tablegen formatting cleanup. NFC
Craig Topper [Sat, 1 Oct 2022 22:59:25 +0000 (15:59 -0700)]
[RISCV] Minor tablegen formatting cleanup. NFC

21 months ago[ELF] --check-sections: allow address 0xffffffff for ELFCLASS32
Fangrui Song [Sat, 1 Oct 2022 22:37:07 +0000 (15:37 -0700)]
[ELF] --check-sections: allow address 0xffffffff for ELFCLASS32

Fix https://github.com/llvm/llvm-project/issues/58101

21 months ago[ELF] Rename LinkerScript::ctx to state. NFC
Fangrui Song [Sat, 1 Oct 2022 22:27:39 +0000 (15:27 -0700)]
[ELF] Rename LinkerScript::ctx to state. NFC

To avoid name conflict with `elf::ctx`.

21 months ago[GlobalISel] Combine abs(undef) -> 0
Jessica Paquette [Sat, 1 Oct 2022 21:13:28 +0000 (14:13 -0700)]
[GlobalISel] Combine abs(undef) -> 0

SDAG does this, GISel doesn't.

See https://gcc.godbolt.org/z/sqjMx3Tfv

More context:
https://github.com/llvm/llvm-project/issues/57256

Differential Revision: https://reviews.llvm.org/D135021

21 months ago[ELF] Move driver into ctx and remove indirection. NFC
Fangrui Song [Sat, 1 Oct 2022 22:12:50 +0000 (15:12 -0700)]
[ELF] Move driver into ctx and remove indirection. NFC

This removes one global variable and removes GOT and unique_ptr indirection.

21 months ago[ELF] Remove symtab indirection. NFC
Fangrui Song [Sat, 1 Oct 2022 21:46:49 +0000 (14:46 -0700)]
[ELF] Remove symtab indirection. NFC

Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.

21 months agoUpdate missing test after 24553df57dcc7bb2567697d8697b37ffbbac
Jessica Paquette [Sat, 1 Oct 2022 21:00:01 +0000 (14:00 -0700)]
Update missing test after 24553df57dcc7bb2567697d8697b37ffbbac

21 months ago[libc++] Enable libc++-specific tests for constexpr string
Nikolas Klauser [Sat, 1 Oct 2022 13:37:24 +0000 (15:37 +0200)]
[libc++] Enable libc++-specific tests for constexpr string

Reviewed By: ldionne, Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D128578

21 months ago[libc++][NFC] Prefer type aliases over structs
Nikolas Klauser [Sat, 1 Oct 2022 13:42:00 +0000 (15:42 +0200)]
[libc++][NFC] Prefer type aliases over structs

Reviewed By: ldionne, #libc

Spies: sstefan1, libcxx-commits, jeroen.dobbelaere

Differential Revision: https://reviews.llvm.org/D134901

21 months ago[GlobalISel] Combine `undef / X -> 0` and `undef % X -> 0`
Jessica Paquette [Sat, 1 Oct 2022 20:36:39 +0000 (13:36 -0700)]
[GlobalISel] Combine `undef / X -> 0` and `undef % X -> 0`

This fixes the `urem_undef_lhs` case in the following:

https://gcc.godbolt.org/z/Wo9x7o679

Also see https://github.com/llvm/llvm-project/issues/57256 for more related
bugs.

This is equivalent to the undef bits in `simplifyDivRem` in the DAGCombiner.

Differential Revision: https://reviews.llvm.org/D135020

21 months ago[llvm-driver] Support single distributions
Alex Brachet [Sat, 1 Oct 2022 20:20:28 +0000 (20:20 +0000)]
[llvm-driver] Support single distributions

`LLVM_DISTRIBUTION_COMPONENTS` now influences the llvm binary in the
normal cmake output directory when it is set. This allows for
distribution targets to only include tools they want in the llvm
binary. It must be done this way because only one target can be
associated with a specific output name.

Differential Revision: https://reviews.llvm.org/D131310

21 months ago[llvm-driver][NFC] Simplify handling of tool symlinks
Alex Brachet [Sat, 1 Oct 2022 20:18:49 +0000 (20:18 +0000)]
[llvm-driver][NFC] Simplify handling of tool symlinks

Differential Revision: https://reviews.llvm.org/D134979

21 months ago[ELF] Remove ctx indirection. NFC
Fangrui Song [Sat, 1 Oct 2022 19:06:33 +0000 (12:06 -0700)]
[ELF] Remove ctx indirection. NFC

Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection. We can move other global variables into ctx without
indirection concern. In the long term we may consider passing Ctx
as a parameter to various functions and eliminate global state as
much as possible and then remove `Ctx::reset`.

21 months ago[ELF] Remove elf::config indirection. NFC
Fangrui Song [Sat, 1 Oct 2022 18:39:45 +0000 (11:39 -0700)]
[ELF] Remove elf::config indirection. NFC

`config` has 1000+ uses so we try to avoid changing `config->foo`. Define a
wrapper with LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection.
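
A rough sketch of the wrapper idea, assuming LLVM_LIBRARY_VISIBILITY expands to hidden visibility on GCC-compatible compilers. All names below are hypothetical stand-ins for the lld ones: the wrapper holds the object by value and overloads `operator->`, so existing `config->foo` uses keep compiling.

```cpp
// Stand-in for LLVM_LIBRARY_VISIBILITY (assumed to mean hidden visibility).
#if defined(__GNUC__)
#define SKETCH_LIBRARY_VISIBILITY __attribute__((visibility("hidden")))
#else
#define SKETCH_LIBRARY_VISIBILITY
#endif

// Hypothetical configuration struct with a single field.
struct ConfigSketch {
  bool verbose = false;
};

// Wrapper that stores the configuration by value (no unique_ptr) but still
// supports the pointer-like `config->field` syntax.
struct ConfigWrapperSketch {
  ConfigSketch c;
  ConfigSketch *operator->() { return &c; }
};

// Hidden visibility lets the compiler address the global directly instead
// of going through the GOT in position-independent code.
SKETCH_LIBRARY_VISIBILITY ConfigWrapperSketch config;
```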

My x86-64 lld executable is 11+KiB smaller.

21 months ago[Clang] Move ParsedTargetAttr to TargetInfo.h
David Green [Sat, 1 Oct 2022 17:26:42 +0000 (18:26 +0100)]
[Clang] Move ParsedTargetAttr to TargetInfo.h

This moves the struct, as it is now parsed by TargetInfo, so avoiding
some includes of AST in Basic.

21 months ago[lldb] Remove scoped timer from high firing and fast running ExtractUnitDIENoDwoIfNeeded
Dave Lee [Thu, 29 Sep 2022 23:01:49 +0000 (16:01 -0700)]
[lldb] Remove scoped timer from high firing and fast running ExtractUnitDIENoDwoIfNeeded

Profiles show that `DWARFUnit::ExtractUnitDIENoDwoIfNeeded` is both high firing (tens of thousands of calls) and fast running (15 µs mean).

Timers like this are noise and load for profiling systems, and can be removed.

rdar://100326595

Differential Revision: https://reviews.llvm.org/D134920

21 months ago[lldb] Remove scoped timer from high firing and fast running SymbolFileDWARF::FindFun...
Dave Lee [Thu, 29 Sep 2022 23:23:22 +0000 (16:23 -0700)]
[lldb] Remove scoped timer from high firing and fast running SymbolFileDWARF::FindFunctions

Profiles show that `SymbolFileDWARF::FindFunctions` is both high firing (many thousands of calls) and fast running (35 µs mean).

Timers like this are noise and load for profiling systems, and can be removed.

rdar://100326595

Differential Revision: https://reviews.llvm.org/D134922

21 months ago[SimpleLoopUnswitch] Pass -verify-cfg-preserved to test.
Florian Hahn [Sat, 1 Oct 2022 16:19:02 +0000 (17:19 +0100)]
[SimpleLoopUnswitch] Pass -verify-cfg-preserved to test.

This ensures PreservedCFGCheckerAnalysis is always added, independent of
whether opt was built with assertions enabled or not.

This fixes a few buildbot failures for bots that don't have assertions
enabled.

21 months agoFoward declare ParsedTargetAttr as a struct.
David Green [Sat, 1 Oct 2022 15:14:00 +0000 (16:14 +0100)]
Foward declare ParsedTargetAttr as a struct.

21 months ago[DAGCombine] Add tests for D57317
Paweł Bylica [Sat, 1 Oct 2022 14:42:30 +0000 (14:42 +0000)]
[DAGCombine] Add tests for D57317

Add two tests for D57317: Deduplicate addcarry node using commutativity.
https://reviews.llvm.org/D57317

21 months ago[LAA] Change to function analysis for new PM.
Florian Hahn [Sat, 1 Oct 2022 14:44:26 +0000 (15:44 +0100)]
[LAA] Change to function analysis for new PM.

At the moment, LoopAccessAnalysis is a loop analysis for the new pass
manager. The issue with that is that LAI caches SCEV expressions and
modifications in a loop may impact SCEV expressions in other loops, but
we do not have a convenient way to invalidate LAI for other loops
within a loop pipeline.

To avoid this issue, turn it into a function analysis which returns a
manager object that keeps track of the individual LAI objects per loop.
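
The manager-object idea can be sketched as a per-loop cache owned by a single function-level result. The types below are hypothetical placeholders, not the classes added by the patch.

```cpp
#include <cstddef>
#include <map>
#include <memory>

// Hypothetical placeholders for a loop and its access information.
struct LoopSketch {
  int id;
};

struct LoopAccessInfoSketch {
  explicit LoopAccessInfoSketch(const LoopSketch &L) : loopId(L.id) {}
  int loopId;
};

// Function-level manager: computes info lazily and caches it per loop.
class LoopAccessInfoManagerSketch {
  std::map<const LoopSketch *, std::unique_ptr<LoopAccessInfoSketch>> cache;

public:
  const LoopAccessInfoSketch &getInfo(const LoopSketch &L) {
    auto &slot = cache[&L];
    if (!slot)
      slot = std::make_unique<LoopAccessInfoSketch>(L);
    return *slot;
  }

  // Dropping everything at once invalidates the info for all loops.
  void clear() { cache.clear(); }

  std::size_t size() const { return cache.size(); }
};
```

Because one object owns the info for every loop in the function, a change made while processing one loop can invalidate the cached info of the others, which is the gap a per-loop analysis left open.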

Fixes #50940.

Fixes #51669.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134606

21 months ago[Clang][AArch64] Support AArch64 target(..) attribute formats.
David Green [Sat, 1 Oct 2022 14:40:59 +0000 (15:40 +0100)]
[Clang][AArch64] Support AArch64 target(..) attribute formats.

This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped; this patch attempts to bring it in line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.

The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
  function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
  attributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu for a function as
  per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
  compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enables/disables the specific feature, for
  backward compatibility with previous releases.
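
The formats listed above can be told apart by prefix. The classifier below is a hypothetical sketch, not Clang's parser: it looks at a single comma-separated item and ignores the ambiguity of features whose own name starts with "no".

```cpp
#include <string>

// One kind per supported format in the list above.
enum class TargetAttrKind {
  Arch,           // "arch=<arch>"
  Cpu,            // "cpu=<cpu>"
  Tune,           // "tune=<cpu>"
  PlusFeature,    // "+<feature>"
  PlusNoFeature,  // "+no<feature>"
  LegacyFeature,  // "<feature>"
  LegacyNoFeature // "no-<feature>"
};

// Hypothetical classifier for one item of a target("...") string.
TargetAttrKind classifyTargetAttr(const std::string &item) {
  auto startsWith = [&](const char *prefix) {
    return item.rfind(prefix, 0) == 0;
  };
  if (startsWith("arch="))
    return TargetAttrKind::Arch;
  if (startsWith("cpu="))
    return TargetAttrKind::Cpu;
  if (startsWith("tune="))
    return TargetAttrKind::Tune;
  if (startsWith("+no")) // checked before the plain "+" form
    return TargetAttrKind::PlusNoFeature;
  if (startsWith("+"))
    return TargetAttrKind::PlusFeature;
  if (startsWith("no-"))
    return TargetAttrKind::LegacyNoFeature;
  return TargetAttrKind::LegacyFeature;
}
```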

To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.

Differential Revision: https://reviews.llvm.org/D133848

21 months ago[AArch64] Lower multiplication by a negative constant to shl+sub+shl
zhongyunde [Sat, 1 Oct 2022 07:36:46 +0000 (15:36 +0800)]
[AArch64] Lower multiplication by a negative constant to shl+sub+shl

Change the cost model to lower a = b * C where C = -(2^n - 2^m) to

            lsl     w8, w0, m
            sub     w0, w8, w0, lsl n

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D134934
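
The identity behind the lsl/sub sequence, b * -(2^n - 2^m) == (b << m) - (b << n), can be checked with wrapping 32-bit arithmetic. The function names are hypothetical, and both shift amounts are assumed to be less than 32.

```cpp
#include <cstdint>

// b * C with C = -(2^n - 2^m), computed as a plain multiplication.
constexpr uint32_t mulDirect(uint32_t b, unsigned n, unsigned m) {
  return b * (0u - ((uint32_t{1} << n) - (uint32_t{1} << m)));
}

// The same value as two shifts and a subtract, mirroring
//   lsl w8, w0, m
//   sub w0, w8, w0, lsl n
constexpr uint32_t shiftSubShift(uint32_t b, unsigned n, unsigned m) {
  return (b << m) - (b << n);
}

// Compile-time spot check: C = -(8 - 2) = -6.
static_assert(mulDirect(5, 3, 1) == shiftSubShift(5, 3, 1),
              "5 * -6 should equal (5 << 1) - (5 << 3)");
```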

21 months ago[gn build] Port 099384dcea49
LLVM GN Syncbot [Sat, 1 Oct 2022 13:18:43 +0000 (13:18 +0000)]
[gn build] Port 099384dcea49

21 months ago[libc++] Implement P0591R4 (Utility functions to implement uses-allocator construction)
Nikolas Klauser [Fri, 30 Sep 2022 09:42:25 +0000 (11:42 +0200)]
[libc++] Implement P0591R4 (Utility functions to implement uses-allocator construction)

Reviewed By: ldionne, #libc, huixie90

Spies: huixie90, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D131898

21 months ago[lldb] Fix warnings about unused variables when building without asserts. NFC.
Martin Storsjö [Sat, 1 Oct 2022 11:27:48 +0000 (14:27 +0300)]
[lldb] Fix warnings about unused variables when building without asserts. NFC.

21 months ago[ARM] Support all versions of AND, ORR, EOR and BIC in optimizeCompareInstr
Filipp Zhinkin [Fri, 12 Aug 2022 16:09:03 +0000 (19:09 +0300)]
[ARM] Support all versions of AND, ORR, EOR and BIC in optimizeCompareInstr

Combine cmp with zero and all versions of AND, ORR, EOR and BIC instructions into S-suffixed versions.

Related issue: https://github.com/llvm/llvm-project/issues/57122

Reviewed By: efriedma, samtebbs

Differential Revision: https://reviews.llvm.org/D131786

21 months ago[AMDGPU][GFX11] Mitigate VALU mask write hazard
Carl Ritson [Sat, 1 Oct 2022 00:17:42 +0000 (09:17 +0900)]
[AMDGPU][GFX11] Mitigate VALU mask write hazard

VALU use of an SGPR (pair) as mask followed by SALU write to the
same SGPR can cause incorrect execution of subsequent SALU reads
of the SGPR.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D134151

21 months ago[mlir] Allow DenseElementsAttr to use any shaped type
Jeff Niu [Sat, 1 Oct 2022 00:32:13 +0000 (17:32 -0700)]
[mlir] Allow DenseElementsAttr to use any shaped type

This patch allows the type of DenseElementsAttr to be any shaped type.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D135002

21 months ago[clang-format] Correctly indent closing brace of compound requires
Emilia Dreamer [Sat, 1 Oct 2022 05:16:45 +0000 (08:16 +0300)]
[clang-format] Correctly indent closing brace of compound requires

When a compound requirement is too long to fit onto a single line, the
braces are split apart onto separate lines, and the contained expression
is indented. However, this indentation would also apply to the closing
brace and the trailing return type requirement thereof.
This was because the indentation level was being restored after all
trailing things were already read.

With this change, the initial level of the opening brace is set before
attempting to read any trailing return type requirements.

Fixes https://github.com/llvm/llvm-project/issues/57108

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D134626

21 months ago[RISCV] Prevent performCombineVMergeAndVOps from creating cycles in the DAG.
Craig Topper [Sat, 1 Oct 2022 03:01:44 +0000 (20:01 -0700)]
[RISCV] Prevent performCombineVMergeAndVOps from creating cycles in the DAG.

If True has a Chain result, the other operands of the vmerge may
depend on it through that Chain. We need to ensure it isn't a
predecessor of those operands.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D134980

21 months ago[RISCV] Update cost of vector roundeven to match round which uses the same sequence...
Craig Topper [Sat, 1 Oct 2022 03:01:34 +0000 (20:01 -0700)]
[RISCV] Update cost of vector roundeven to match round which uses the same sequence but a different FRM value.

Reviewed By: reames, eopXD

Differential Revision: https://reviews.llvm.org/D134978

21 months ago[MemProf] Update metadata during inlining
Teresa Johnson [Thu, 30 Jun 2022 21:49:44 +0000 (14:49 -0700)]
[MemProf] Update metadata during inlining

Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original).
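
A rough sketch of the two updates described above, for illustration only.
Call stacks are modelled as lists of frame ids; the helper names
inline_callsite and split_mibs are hypothetical, and treating "context
matches" as a prefix comparison is an assumption of this sketch rather than
a description of the actual implementation. Context trimming is not shown.

```python
def inline_callsite(cloned_stack, inlined_callsite_stack):
    """Callsite metadata: concatenate the cloned call's stack with the
    stack of the callsite that was inlined."""
    return list(cloned_stack) + list(inlined_callsite_stack)


def split_mibs(mibs, refined_stack):
    """Memprof metadata: move each MIB to the cloned allocation call if its
    context starts with the refined stack, otherwise leave it on the
    original allocation call."""
    moved, kept = [], []
    for context, alloc_type in mibs:
        matches = list(context[:len(refined_stack)]) == list(refined_stack)
        (moved if matches else kept).append((context, alloc_type))
    return moved, kept


# Hypothetical frame ids: the allocation call has stack [1] and is inlined
# through a callsite whose stack is [2].
refined = inline_callsite([1], [2])
mibs = [([1, 2, 3], "cold"), ([1, 2, 4], "notcold"), ([1, 5], "cold")]
moved, kept = split_mibs(mibs, refined)
print(refined, moved, kept)
```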

Depends on D128142.

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D128143

21 months ago[VP][RISCV] Add vp.copysign and RISC-V support.
Yeting Kuo [Fri, 30 Sep 2022 02:28:40 +0000 (10:28 +0800)]
[VP][RISCV] Add vp.copysign and RISC-V support.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134935

21 months agoUpdate reference-log-noml.txt as well to adapt for D133902
Matthias Braun [Sat, 1 Oct 2022 01:03:28 +0000 (18:03 -0700)]
Update reference-log-noml.txt as well to adapt for D133902

21 months ago[OpenMP] [OMPT] [1/8] Create separate categories for host, device, [no]emi events
Dhruva Chakrabarti [Fri, 30 Sep 2022 04:41:29 +0000 (04:41 +0000)]
[OpenMP] [OMPT] [1/8] Create separate categories for host, device, [no]emi events

In preparation for OMPT target changes, create separate categories of events that will be used by OMPT target support.

Split up existing macro FOREACH_OMPT_EVENT into new ones. There is no change to the original macro. Created new macros FOREACH_OMPT_HOST_EVENT, FOREACH_OMPT_DEVICE_EVENT, FOREACH_OMPT_NOEMI_EVENT, FOREACH_OMPT_EMI_EVENT, and a few other sub-categories that can be used as required. One such use is in D123974 which uses events selectively.

Patch from John Mellor-Crummey <johnmc@rice.edu>

Reviewed By: dreachem

Differential Revision: https://reviews.llvm.org/D123429

21 months agoAdapt dev-mode-logging.ll test to D133902
Matthias Braun [Sat, 1 Oct 2022 00:45:19 +0000 (17:45 -0700)]
Adapt dev-mode-logging.ll test to D133902

21 months ago[mlir][sparse] Improving error messages for MLIR_SPARSETENSOR_FATAL
wren romano [Fri, 30 Sep 2022 23:55:53 +0000 (16:55 -0700)]
[mlir][sparse] Improving error messages for MLIR_SPARSETENSOR_FATAL

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135000

21 months agoRevert "[mlgo] Fix tests post D133902"
Mircea Trofin [Sat, 1 Oct 2022 00:30:08 +0000 (17:30 -0700)]
Revert "[mlgo] Fix tests post D133902"

This reverts commit 25d65b545530f7155734a06ef0e5143b4edb8ff9.

There's a more thorough fix in f9317bf0bed0e0f248c18114afa24dcd56d727ae

21 months ago[mlgo] Fix tests post D133902
Mircea Trofin [Sat, 1 Oct 2022 00:26:33 +0000 (17:26 -0700)]
[mlgo] Fix tests post D133902

The breaks were expected, except for the dev-mode-extra-features-logging
one. XFAIL-ing to unblock bots, investigating further.

21 months agoFix tied operands in phi-coalescing.mir test; try to adapt MLRegalloc tests
Matthias Braun [Fri, 30 Sep 2022 23:44:07 +0000 (16:44 -0700)]
Fix tied operands in phi-coalescing.mir test; try to adapt MLRegalloc tests

Fix a test that used invalid MIR, with different VRegs for the tied
operands of ADD64rr, which happened to trigger an assertion after my
latest changes.

Also attempting to adjust the MLRegalloc tests to the adjusted regalloc
(though I don't have a 100% working setup for them even without my
changes)

21 months agoRevert "[MemProf] Update metadata during inlining" and preceding commit
Teresa Johnson [Fri, 30 Sep 2022 23:56:57 +0000 (16:56 -0700)]
Revert "[MemProf] Update metadata during inlining" and preceding commit

This reverts commit 0d7f3464ce0ba3a97df73e08ee0acd4e33adbe9b and
commit f9403ca41e5f3dab60cd6e5de26eea65dcab01a4. The latter was
"Profile matching and IR annotation for memprof profiles." and was left
from a bad rebase from a commit already pushed upstream.

21 months ago[mlir] Flip Async/GPU/MemRef/OpenACC/OpenMP/PDL dialects to prefixed
River Riddle [Fri, 30 Sep 2022 23:06:29 +0000 (16:06 -0700)]
[mlir] Flip Async/GPU/MemRef/OpenACC/OpenMP/PDL dialects to prefixed

This flips all of the remaining dialects to prefixed except for linalg, which
will be done in a followup.

Differential Revision: https://reviews.llvm.org/D134995

21 months ago[MemProf] Update metadata during inlining
Teresa Johnson [Thu, 30 Jun 2022 22:28:10 +0000 (15:28 -0700)]
[MemProf] Update metadata during inlining

Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.

Depends on D128142.

Differential Revision: https://reviews.llvm.org/D128143

21 months agoProfile matching and IR annotation for memprof profiles.
Teresa Johnson [Thu, 30 Jun 2022 21:49:44 +0000 (14:49 -0700)]
Profile matching and IR annotation for memprof profiles.

See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]*

* Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceding patch adding the basic
metadata and verification support.

The matching is performed during the normal PGO annotation phase, to
ensure that the inlines applied in the IR at that point are a subset
of the inlines in the profiled binary and thus reflected in the
profile's call stacks. This is important because the call frames are
associated with functions in the profile based on the inlining in the
symbolized call stacks, and this simplifies locating the subset of
profile data relevant for matching onto each function's IR.

The PGOInstrumentationUse pass is enhanced to perform matching for
whatever combination of memprof and regular PGO profile data exists in
the profile.

Using the utilities introduced in D128854:
The memprof profile data for each context is converted to "cold" or
"notcold" based on parameterized thresholds for size, access count, and
lifetime. The memprof allocation contexts are trimmed to the minimal
amount of context required to uniquely identify whether the context is
cold or not cold. For allocations where all profiled contexts have the
same allocation type, no memprof metadata is attached and instead the
allocation call is directly annotated with an attribute specifying the
allocation type. This is the same attribute that will be applied to
allocation calls once cloned for different contexts, and later used
during LibCall simplification to emit allocation hints [4].
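
The trimming step described above can be sketched as follows. This is an
illustration under stated assumptions, not the utilities from D128854:
contexts are modelled as non-empty tuples of frame ids, the function name
trim_contexts and its return shape are hypothetical, and identical contexts
with conflicting allocation types are not handled.

```python
def trim_contexts(contexts):
    """contexts: list of (frames, alloc_type) pairs, frames a non-empty tuple.

    If every context has the same allocation type, return a single
    attribute. Otherwise keep, for each context, the shortest prefix whose
    matching contexts all share one allocation type.
    """
    types = {alloc_type for _, alloc_type in contexts}
    if len(types) == 1:
        return ("attribute", types.pop())

    trimmed = {}
    for frames, alloc_type in contexts:
        prefix = frames
        for depth in range(1, len(frames) + 1):
            prefix = frames[:depth]
            sharing = {t for f, t in contexts if f[:depth] == prefix}
            if len(sharing) == 1:
                break  # this prefix already identifies cold vs. notcold
        trimmed[prefix] = alloc_type
    return ("metadata", trimmed)


# Hypothetical frame ids; two cold contexts share the prefix (1, 2).
profile = [((1, 2, 3), "cold"), ((1, 2, 4), "cold"), ((1, 5, 6), "notcold")]
print(trim_contexts(profile))
print(trim_contexts([((1, 2), "cold"), ((1, 3), "cold")]))
```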

Depends on D128141 and D128854.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
[2] https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html
[3] https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165
[4] https://github.com/google/tcmalloc/commit/ab87cf382dc56784f783f3aaa43d6d0465d5f385

Differential Revision: https://reviews.llvm.org/D128142

21 months ago[mlir][ods] Allow references to the self type
Jeff Niu [Fri, 30 Sep 2022 23:08:01 +0000 (16:08 -0700)]
[mlir][ods] Allow references to the self type

The self type is always "bound" since it is provided to the attribute
parser hook. Allow custom directives to reference it.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D134997

21 months agoX86: Stop assigning register costs for longer encodings.
Matthias Braun [Tue, 16 Aug 2022 17:35:33 +0000 (10:35 -0700)]
X86: Stop assigning register costs for longer encodings.

This stops reporting CostPerUse 1 for `R8`-`R15` and `XMM8`-`XMM31`.
This was previously done because instruction encodings require a REX
prefix when using them, resulting in longer instruction encodings. I
found that this regresses the quality of the register allocation as the
costs impose an ordering on eviction candidates. I also feel that there
is a bit of an impedance mismatch as the actual costs occur when
encoding instructions using those registers, but the order of VReg
assignments is not primarily ordered by number of Defs+Uses.

I did extensive measurements with the llvm-test-suite with SPEC2006 +
SPEC2017 included; internal services showed similar patterns. Generally
there are a lot of improvements but also a lot of regressions. But on
average the allocation quality seems to improve at the cost of a small
code size regression.

Results for measuring static and dynamic instruction counts:

Dynamic Counts (scaled by execution frequency) / Optimization Remarks:
    Spills+FoldedSpills   -5.6%
    Reloads+FoldedReloads -4.2%
    Copies                -0.1%

Static / LLVM Statistics:
    regalloc.NumSpills    mean -1.6%, geomean -2.8%
    regalloc.NumReloads   mean -1.7%, geomean -3.1%
    size..text            mean +0.4%, geomean +0.4%

Static / LLVM Statistics:
    regalloc.NumSpills    mean -2.2%, geomean -3.1%
    regalloc.NumReloads   mean -2.6%, geomean -3.9%
    size..text            mean +0.6%, geomean +0.6%

Static / LLVM Statistics:
    regalloc.NumSpills   mean -3.0%
    regalloc.NumReloads  mean -3.3%
    size..text           mean +0.3%, geomean +0.3%

Differential Revision: https://reviews.llvm.org/D133902

21 months ago[libc] disable syscall test without fullbuild
Michael Jones [Fri, 30 Sep 2022 22:55:28 +0000 (15:55 -0700)]
[libc] disable syscall test without fullbuild

Our syscall implementation depends on a specific macro that's only
defined in our headers. If we're not using our headers, then the test
doesn't work. I've disabled the test in this case because there's no
point in testing the system libc's syscall implementation.

Differential Revision: https://reviews.llvm.org/D134994