review.tizen.org Git - platform/upstream/llvm.git/log

mgrzywac [Thu, 29 Jun 2023 07:41:08 +0000 (07:41 +0000)]

[libunwind] Add cached compile and link flags to libunwind

Add flags that allow libunwind to use compile flags and libraries provided in the cache.
Similar flags are already present in libc++ and libc++abi CMakeLists files.

Differential Revision: https://reviews.llvm.org/D150252

Craig Topper [Thu, 29 Jun 2023 07:20:47 +0000 (00:20 -0700)]

[RISCV] Do a more complete job of disabling extending loads and truncating stores for fixed vector types.

We weren't marking some combinations as Expand if one of the
types wasn't legal.

Fixes #63596.

Hanbum Park [Thu, 29 Jun 2023 07:13:18 +0000 (09:13 +0200)]

[InstSimplify] Fold icmp of allocas based on offset difference

Strengthen the fold for icmps of non-overlapping storage, by
working on the difference of offsets, rather than considering
both offsets independently. In particular, this allows handling
comparisons of pointers to the end of equal-sized allocations.
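
The offset-difference reasoning can be sketched in a few lines. This is an illustration derived from the description above, not the code in the patch; the function name is hypothetical, both allocations are assumed to have positive size, and address-space wraparound is ignored.

```python
def icmp_eq_is_impossible(size_a: int, off_a: int, size_b: int, off_b: int) -> bool:
    """Return True if A + off_a == B + off_b can never hold.

    A and B are the unknown base addresses of two distinct, non-overlapping
    allocations of size_a and size_b bytes.
    """
    if size_a <= 0 or size_b <= 0:
        return False  # stay conservative for zero-sized allocations
    dist = off_a - off_b
    # Equality requires A - B == -dist. Non-overlap means A - B <= -size_a
    # or A - B >= size_b, so equality is impossible when
    # -size_b < dist < size_a.
    return -size_b < dist < size_a
```

For two 8-byte allocations, comparing the two end pointers gives `dist == 0`, so the comparison can be folded. Looking at each offset on its own would not allow this, because an end pointer is not inside its own allocation.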

Proofs: https://alive2.llvm.org/ce/z/Po2nL4

Differential Revision: https://reviews.llvm.org/D153752

Martin Braenne [Thu, 29 Jun 2023 06:39:39 +0000 (06:39 +0000)]

[clang][dataflow] Don't crash when creating pointers to members.

The newly added tests crash without the other changes in this patch.

Reviewed By: sammccall, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D153960

Nikita Popov [Fri, 23 Jun 2023 10:50:27 +0000 (12:50 +0200)]

[SCEV] Make use of non-null pointers for range calculation

We know that certain pointers (e.g. non-extern-weak globals or
allocas in default address space) are not null, in which case the
lowest address they can be allocated at is their alignment.

This allows us to calculate better exit counts for loops that have
an additional null check in the guarding condition
(see alloca_icmp_null_exit_count).
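
As a rough illustration of the arithmetic only (the helper names are hypothetical and this is not SCEV's implementation): an address that is non-zero and a multiple of the alignment cannot be smaller than the alignment itself, which gives a tighter unsigned lower bound than 0.

```python
def min_nonnull_address(alignment: int) -> int:
    """Smallest address that is non-zero and a multiple of `alignment`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return alignment


def unsigned_lower_bound(alignment: int, known_nonnull: bool) -> int:
    # Without the non-null fact, address 0 is still a candidate.
    return min_nonnull_address(alignment) if known_nonnull else 0
```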

Differential Revision: https://reviews.llvm.org/D153624

Tobias Gysi [Thu, 29 Jun 2023 06:31:03 +0000 (06:31 +0000)]

[mlir][llvm] Add debug label intrinsic

This revision adds support for the llvm.dbg.label intrinsic
and the corresponding DILabel metadata.

Reviewed By: Dinistro

Differential Revision: https://reviews.llvm.org/D153975

Jianjian GUAN [Thu, 29 Jun 2023 03:15:14 +0000 (11:15 +0800)]

[RISCV] Update computeKnownBitsForTargetNode for FPCLASS.

The fclass instruction only sets one of the low 10 bits.
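
A minimal sketch of what this means for known bits, assuming an XLEN-wide result register. The function name is hypothetical and this is not the code in computeKnownBitsForTargetNode: every bit above the low 10 can be treated as known zero.

```python
FCLASS_RESULT_BITS = 10  # fclass reports one of 10 classes, one bit each


def fclass_known_zero_mask(xlen: int = 64) -> int:
    """Mask of result bits that are always zero for an xlen-bit register."""
    all_bits = (1 << xlen) - 1
    low_bits = (1 << FCLASS_RESULT_BITS) - 1
    return all_bits & ~low_bits
```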

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154040

Mikael Holmen [Thu, 29 Jun 2023 05:51:15 +0000 (07:51 +0200)]

[StructuralHash] Ignore global variable declarations

Ignore declarations of global variables, just as we do with declarations
of functions.

Done as a follow up to the comments in https://reviews.llvm.org/D149209

Differential Revision: https://reviews.llvm.org/D153855

# Conflicts:
# llvm/lib/IR/StructuralHash.cpp

Han Shen [Thu, 29 Jun 2023 05:18:53 +0000 (22:18 -0700)]

[Analysis] Refactor MBB hotness/coldness into templated PSI functions.

Currently, to use PSI->isFunctionHotInCallGraph, we first need to
calculate BPI->BFI, which is expensive. Instead, we can implement this
directly with MBFI. Also, as @wenlei mentioned in another patch review,
MachineSizeOpts already implements isFunctionColdInCallGraph,
isFunctionHotInCallGraphNthPercentile, etc. These can be refactored so
that they can be reused across the MachineFunctionSplitting and
MachineSizeOpts passes.

This CL does this - it refactors out those internal static functions
into PSI as templated functions, so they can be accessed easily.

Differential Revision: https://reviews.llvm.org/D153927

Freddy Ye [Thu, 29 Jun 2023 05:29:00 +0000 (13:29 +0800)]

[NFC] Add missing cpu tests in predefined-arch-macros.c

Added tests for penryn, nehalem, westmere, sandybridge, ivybridge,
haswell, bonnell, silvermont.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D153714

Martin Braenne [Wed, 28 Jun 2023 08:38:00 +0000 (08:38 +0000)]

[clang][dataflow] Make `getThisPointeeStorageLocation()` return an `AggregateStorageLocation`.

This avoids the need for casts at callsites.

Depends On D153852

Reviewed By: sammccall, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D153854

Martin Braenne [Wed, 28 Jun 2023 08:36:06 +0000 (08:36 +0000)]

[clang][dataflow] Initialize fields of anonymous records correctly.

Previously, the newly added test would crash.

Depends On D153851

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D153852

luxufan [Wed, 28 Jun 2023 15:01:09 +0000 (23:01 +0800)]

[ValueTracking] Guaranteed well-defined if parameter has a dereferenceable_or_null attribute

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153945

zhanglimin [Thu, 29 Jun 2023 03:40:22 +0000 (11:40 +0800)]

[sanitizer][msan] The LLVM part of the LoongArch memory sanitizer implementation

This patch enables msan in LLVM and fixes all failing tests in
check-msan.

It does not add a VarArgHelper implementation for LoongArch, which
will be done separately later. It also adds a test for VarArgNoOpHelper,
which is based on the X86 one.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D152692

Manna, Soumi [Thu, 29 Jun 2023 02:51:15 +0000 (19:51 -0700)]

[analyzer] Refactor code in findMethodDecl()

In findMethodDecl(clang::ObjCMessageExpr const *, clang::ObjCObjectPointerType const *, clang::ASTContext &), if MessageExpr->getReceiverKind() is not Instance or Class, we never dereference the pointer “ReceiverObjectPtrType”. We also don't dereference “ReceiverObjectPtrType” if ReceiverType is ObjCIdType or ObjCClassType. So the pointer “ReceiverObjectPtrType” is only used in this branch, and its declaration should be here.

This patch directly uses ReceiverType->castAs<ObjCObjectPointerType>() instead of ReceiverObjectPtrType when calling canAssignObjCInterfaces() to express the intent more clearly.

Reviewed By: erichkeane, steakhal

Differential Revision: https://reviews.llvm.org/D152194

zhanglimin [Thu, 29 Jun 2023 01:46:02 +0000 (09:46 +0800)]

[MSan] Enable MSAN for loongarch64

This patch adds basic memory sanitizer support for loongarch64
with 47-bit VMA, whose memory layout is based on x86_64.

The LLVM part of the LoongArch memory sanitizer implementation will
be done separately, which will fix the failing tests in check-msan.
These tests all fail with the same error: "error in
backend: unsupported architecture".

Reviewed By: #sanitizers, vitalybuka, MaskRay

Differential Revision: https://reviews.llvm.org/D140528

Wang, Xin10 [Thu, 29 Jun 2023 03:23:06 +0000 (23:23 -0400)]

[NFC] Fix possible nullptr dereference in ComplexDeinterleavingPass.cpp

Fix an issue reported by the static analyzer by adding an assert that avoids the analyzer report.

Reviewed By: igor.kirillov

Differential Revision: https://reviews.llvm.org/D153942

Weining Lu [Wed, 28 Jun 2023 00:25:11 +0000 (08:25 +0800)]

[LoongArch] Emit R_LARCH_64_PCREL relocation for FK_Data_8 when IsPCRel is true

Reviewed By: xen0n, MaskRay, hev

Differential Revision: https://reviews.llvm.org/D153872

4vtomat [Tue, 27 Jun 2023 07:06:40 +0000 (00:06 -0700)]

[RISCV] Bump vector crypto to v1.0.0-rc1

Differential Revision: https://reviews.llvm.org/D153836

Manna, Soumi [Thu, 29 Jun 2023 02:25:46 +0000 (19:25 -0700)]

[CLANG] Fix potential integer overflow value in getRVVTypeSize()

In getRVVTypeSize(clang::ASTContext &, clang::BuiltinType const *), a potential integer overflow occurs: the expression VScale->first * MinElts has type unsigned int (32 bits, unsigned), so it is evaluated using 32-bit arithmetic and only then used in a context that expects an expression of type uint64_t (64 bits, unsigned).

To avoid integer overflow, this patch changes the types of the variables MinElts and EltSize from unsigned to uint64_t instead of adding a cast.
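
The effect of multiplying in 32 bits before widening can be simulated as follows. This is a sketch with made-up operand values chosen only so that the product exceeds 32 bits; the helper names are hypothetical and Python integers stand in for the C++ types.

```python
U32_MASK = (1 << 32) - 1
U64_MASK = (1 << 64) - 1


def mul_u32(a: int, b: int) -> int:
    """Multiply as 32-bit unsigned arithmetic would: wraps modulo 2**32."""
    return (a * b) & U32_MASK


def mul_u64(a: int, b: int) -> int:
    """Multiply as 64-bit unsigned arithmetic would: wraps modulo 2**64."""
    return (a * b) & U64_MASK


# Hypothetical operands, not values taken from the patch.
vscale_first = 1 << 20
min_elts = 1 << 13
narrow = mul_u32(vscale_first, min_elts)  # wraps before it is widened
wide = mul_u64(vscale_first, min_elts)    # keeps the full product
```

Widening the result afterwards cannot recover the lost high bits, which is why the operand types have to be 64-bit before the multiplication happens.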

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D153146

Hideto Ueno [Thu, 29 Jun 2023 01:56:48 +0000 (18:56 -0700)]

[mlir][IR] clang-format OperationSupport.cpp, NFC

Follow-up to D154015

Hideto Ueno [Thu, 29 Jun 2023 01:47:21 +0000 (18:47 -0700)]

[mlir][IR] Combine location hash if required in OperationEquivalence::computeHash

This fixes a bug where `OperationEquivalence::computeHash` doesn't
combine the hash of operation locations even when `IgnoreLocations` is false.
Added a unit test which fails on the current trunk.
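
The intended behaviour can be sketched like this. It is a simplified model with hypothetical names and a hypothetical hashing scheme, not MLIR's implementation: the location contributes to the hash exactly when locations are not being ignored.

```python
import hashlib


def compute_hash(op_name: str, operand_ids: tuple, location: str,
                 ignore_locations: bool) -> str:
    """Hash an operation; mix in the location unless it is ignored."""
    h = hashlib.sha256()
    h.update(op_name.encode())
    for operand in operand_ids:
        h.update(b"\0" + str(operand).encode())
    if not ignore_locations:
        h.update(b"\0loc:" + location.encode())
    return h.hexdigest()
```

Two otherwise identical operations at different locations then hash equal only when `ignore_locations` is true.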

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D154015

Kai Sasaki [Thu, 29 Jun 2023 01:04:35 +0000 (10:04 +0900)]

[mlir][memref] Make result normalization aware of the number of symbols

Memref normalization fails to recognize the non-zero symbols used in the memref type itself with strided, offset information. This causes a crash with a type like `memref<128x512xf32, strided<[?, ?], offset: ?>>`. The original issue is https://github.com/llvm/llvm-project/issues/61345

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D150250

Amir Ayupov [Thu, 29 Jun 2023 00:53:54 +0000 (17:53 -0700)]

[BOLT] Add -dump-cg option to dump call graph

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D153994

Amir Ayupov [Thu, 29 Jun 2023 00:52:35 +0000 (17:52 -0700)]

[BOLT][NFC] Add extra debug logging to buildCallGraph

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D153987

Amir Ayupov [Thu, 29 Jun 2023 00:50:39 +0000 (17:50 -0700)]

[BOLT][NFC] Print functions after attaching profile (-print-profile)

Add an extra point of dumping functions: immediately after attaching the profile information.
This dumping is enabled by newly introduced `-print-profile` and `-print-all`.

The reason is that in `aggregate-only`/perf2bolt mode BOLT may not reach the point of
printing the function after CFG is constructed (`-print-cfg`), while we may still want to inspect
the attached profile, especially for diff'ing purposes.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D153996

Tue Ly [Thu, 29 Jun 2023 00:30:54 +0000 (20:30 -0400)]

[libc][NFC] Set rounding mode for sincosf exhaustive test.

Christopher Ferris [Wed, 28 Jun 2023 21:24:38 +0000 (14:24 -0700)]

[scudo] Use fast get time in secondary.

When I moved the primary to use the faster get time syscall, I missed
the secondary use. Now fix the secondary to use this function too.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D154012

Florian Hahn [Wed, 28 Jun 2023 22:12:05 +0000 (23:12 +0100)]

[ConstraintElim] Add tests with phis and different alloc sizes/end ptrs.

Extra tests for D152730

varconst [Tue, 27 Jun 2023 23:40:39 +0000 (16:40 -0700)]

[libc++][hardening][NFC] Introduce `_LIBCPP_ASSERT_UNCATEGORIZED`.

Replace most uses of `_LIBCPP_ASSERT` with
`_LIBCPP_ASSERT_UNCATEGORIZED`.

This is done as a prerequisite to introducing hardened mode to libc++.
The idea is to make enabling assertions an opt-in with (somewhat)
fine-grained controls over which categories of assertions are enabled.
The vast majority of assertions are currently uncategorized; the new
macro will allow turning on `_LIBCPP_ASSERT` (the underlying mechanism
for all kinds of assertions) without enabling all the uncategorized
assertions (in the future; this patch preserves the current behavior).

Differential Revision: https://reviews.llvm.org/D153816

Luke Lau [Tue, 20 Jun 2023 14:23:00 +0000 (15:23 +0100)]

[RISCV] Add test cases for vmv.v.vs which could be combined

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153350

Luke Lau [Tue, 13 Jun 2023 13:51:49 +0000 (13:51 +0000)]

[RISCV] Add test cases for insert subvector shuffles for fixed vectors

These cases could have the vmv.v.v folded into the VL of the previous
instruction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153030

Luke Lau [Tue, 6 Jun 2023 11:16:13 +0000 (13:16 +0200)]

[DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x)

If we have a store of a load with no other uses in between it, it's
considered dead and is removed. So sometimes when legalizing a fixed
length vector store of an insert, we end up producing better code
through scalarization than without.
An example follows:

  %a = load <4 x i64>, ptr %x
  %b = insertelement <4 x i64> %a, i64 %y, i32 2
  store <4 x i64> %b, ptr %x

If this is scalarized, then DAGCombine successfully removes 3 of the 4
stores which are considered dead, and on RISC-V we get:

  sd a1, 16(a0)

However if we make the vector type legal (-mattr=+v), then we lose the
optimisation because we don't scalarize it.

This patch attempts to recover the optimisation for vectors by
identifying patterns where we store a load with a single insert
in between, replacing it with a scalar store of the inserted element.
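
Why the two forms are equivalent can be checked with a small memory model. This is a sketch with hypothetical helper names, where memory is a list of i64 lanes rather than bytes; it is not the DAGCombine code itself.

```python
ELT_BYTES = 8  # size of one i64 element


def vector_store_of_insert(memory: list, base: int, index: int, value: int,
                           lanes: int = 4) -> None:
    vec = memory[base:base + lanes]   # %a = load <4 x i64>, ptr %x
    vec[index] = value                # %b = insertelement %a, %y, index
    memory[base:base + lanes] = vec   # store <4 x i64> %b, ptr %x


def scalar_store(memory: list, base: int, index: int, value: int) -> int:
    memory[base + index] = value      # store only the inserted element
    return index * ELT_BYTES          # byte offset of that element
```

For index 2 the byte offset is 16, which matches the `sd a1, 16(a0)` shown in the scalarized output, and both functions leave memory in the same state.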

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D152276

Luke Lau [Wed, 28 Jun 2023 11:57:32 +0000 (11:57 +0000)]

[RISCV] Add fixed vector insert tests that are pass by value

So we can still test insert_vector_elt lowering with D152276

Reviewed By: frasercrmck, craig.topper

Differential Revision: https://reviews.llvm.org/D153964

LLVM GN Syncbot [Wed, 28 Jun 2023 21:38:12 +0000 (21:38 +0000)]

[gn build] Port 75a1797044fc

Paul Kirth [Wed, 28 Jun 2023 15:33:24 +0000 (15:33 +0000)]

Reland [llvm] Preliminary fat-lto-objects support

Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this is normally the point at which the compiler would emit an
object file containing the bitcode and metadata.

After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.

Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in `.text` are congruent with the existing output produced by the
default and LTO pipelines.

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Earlier versions of this patch were missing REQUIRES lines for llc
related tests in Transforms/EmbedBitcode. Those tests are now under
CodeGen/X86, which should avoid running the check on unsupported
platforms.

The EmbedBitcodePass also returned PreservedAnalyses::all when adding a
metadata section, which failed expensive checks, since it modified the
module. This is now corrected.

Reviewed By: tejohnson, MaskRay, nikic

Differential Revision: https://reviews.llvm.org/D146776

Peiming Liu [Wed, 28 Jun 2023 19:56:38 +0000 (19:56 +0000)]

[mlir][sparse] admit un-sparsifiable operations if all its operands are loaded from dense input

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D153998

Fangrui Song [Wed, 28 Jun 2023 21:01:08 +0000 (14:01 -0700)]

[Object] Add ELF section type SHT_LLVM_BITCODE for LLVM bitcode

clang -ffat-lto-objects can use this new ELF section type for the .llvm.lto
section for fat LTO support (D146776).

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D153215

Matt Arsenault [Sun, 20 Nov 2022 16:50:13 +0000 (08:50 -0800)]

HIP: Directly call floor builtins

Alexey Bataev [Wed, 28 Jun 2023 19:53:39 +0000 (12:53 -0700)]

[SLP]Fix emission of buildvectors with full match.

If the buildvector node is a full match of another node, we need to
correctly build the mask for the original vector value and build a common
mask for the emitted node.

Wenlei He [Wed, 28 Jun 2023 18:54:39 +0000 (11:54 -0700)]

[NFC][Sample PGO] Avoid non-const accessor for CallsiteSamples

Exposing a non-const accessor for clearing CallsiteSamples during flattening is a bit of an overkill. Replace the non-const accessor with removeAllCallsiteSamples.

Differential Revision: https://reviews.llvm.org/D153995

Nikolas Klauser [Wed, 28 Jun 2023 20:33:40 +0000 (13:33 -0700)]

[clang] Fix checking the equality comparator of base classes in __is_trivially_equality_comparable

Fixes #63192

Reviewed By: cor3ntin

Spies: cfe-commits

Differential Revision: https://reviews.llvm.org/D153890

Ethan Luis McDonough [Wed, 28 Jun 2023 20:02:37 +0000 (15:02 -0500)]

[flang][openmp] Fortran offloading test

Flang currently supports offloading for AMD GPUs. This patch establishes a test structure for Fortran offloading tests in libomptarget.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D148778

Florian Hahn [Wed, 28 Jun 2023 20:10:43 +0000 (21:10 +0100)]

[ConstraintElim] Add additional induction phi tests with end argument.

Extra tests for D152730 with different GEP step sizes and the end pointer
being an argument.

David Green [Wed, 28 Jun 2023 20:02:29 +0000 (21:02 +0100)]

[SLP] Use vector types for cmp alt instructions costs

Similar to the other code that costs main/alt instructions, the cmp should be
using the VecTy for the costs, not the ScalarTy.

One of the tests looks like it gets worse just because it is not simplified to
0.

Differential Revision: https://reviews.llvm.org/D153507

Serge Pavlov [Wed, 28 Jun 2023 19:04:31 +0000 (02:04 +0700)]

Revert "[Clang] Reset FP options before function instantiations"

This reverts commit 98390ccb80569e8fbb20e6c996b4b8cff87fbec6.
It caused issue #63542.

Matt Arsenault [Tue, 6 Jun 2023 22:06:38 +0000 (18:06 -0400)]

HIP: Use frexp builtins in math headers

Matt Arsenault [Wed, 28 Jun 2023 19:04:08 +0000 (15:04 -0400)]

LangRef: Fix sphinx build error

root [Fri, 23 Jun 2023 17:59:22 +0000 (10:59 -0700)]

adding bf16 support to NVPTX

Currently, bf16 support is only scattered through the PTX codegen. This patch aims to complete the set of instructions and code paths required to support the bf16 data type.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D144911

Co-authored-by: Artem Belevich <tra@google.com>

Matt Arsenault [Tue, 2 May 2023 13:07:47 +0000 (09:07 -0400)]

clang: Use new frexp intrinsic for builtins and add f16 version

Matt Arsenault [Thu, 27 Apr 2023 01:57:10 +0000 (21:57 -0400)]

IR: Add llvm.frexp intrinsic

Add an intrinsic which returns the two pieces as multiple return
values. Alternatively, we could introduce a pair of intrinsics to
separately return the fractional and exponent parts.

AMDGPU has native instructions to return the two halves, but could use
some generic legalization and optimization handling. For example, we
should be able to handle legalization of f16 on older targets, and for
bf16. Additionally antique targets need a hardware workaround which
would be better handled in the backend rather than in library code
where it is now.
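
For readers unfamiliar with frexp, the two pieces are the fraction and the exponent of a floating-point value. The sketch below uses Python's `math.frexp`, which follows the C library function; that the intrinsic returns the same pair is an assumption based on its name, and the wrapper name is hypothetical.

```python
import math


def frexp_parts(x: float):
    """Split x into (fraction, exponent) with x == fraction * 2**exponent."""
    fraction, exponent = math.frexp(x)
    return fraction, exponent
```

For a non-zero finite input the fraction has magnitude in [0.5, 1), so `frexp_parts(8.0)` yields a fraction of 0.5 and an exponent of 4.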

Caroline Tice [Tue, 27 Jun 2023 07:18:33 +0000 (00:18 -0700)]

[LLDB] Fix buffer overflow problem in DWARFExpression::Evaluate.

In two calls to ReadMemory in DWARFExpression.cpp, the buffer size
passed to ReadMemory is not actually the size of the buffer (I suspect
a copy/paste error where the variable name was not properly
updated). This caused a buffer overflow bug, which we found through
Address Sanitizer. This patch fixes the problem by passing the
correct buffer size to the calls to ReadMemory (and to the
DataExtractor).

Differential Revision: https://reviews.llvm.org/D153840

Nikolas Klauser [Wed, 28 Jun 2023 18:22:11 +0000 (11:22 -0700)]

[libc++] Add missing _LIBCPP_HIDE_FROM_ABI in uninitialized_buffer.h

Tue Ly [Sat, 24 Jun 2023 04:08:31 +0000 (00:08 -0400)]

[libc][math] Implement erff function correctly rounded to all rounding modes.

Implement correctly rounded `erff` functions.

For `x >= 4`, `erff(x) = 1` for `FE_TONEAREST` or `FE_UPWARD`, `0x1.fffffep-1` for `FE_DOWNWARD` or `FE_TOWARDZERO`.

For `0 <= x < 4`, we divide into 32 sub-intervals of length `1/8`, and use a degree-15 odd polynomial to approximate `erff(x)` in each sub-interval:
```
  erff(x) ~ x * (c0 + c1 * x^2 + c2 * x^4 + ... + c7 * x^14).
```

For `x < 0`, we can use the same formula as above, since the odd part is factored out.
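
The shape of the evaluation can be sketched as below. The coefficients here are stand-ins (the Taylor series of erf around 0, which is only accurate near the first sub-interval), since the per-interval coefficients used by the patch are not listed in this message; the function names are hypothetical.

```python
import math

# Stand-in coefficients: erf(x) = (2/sqrt(pi)) * sum_n (-1)^n x^(2n+1) / (n! (2n+1)).
# These are NOT the coefficients from the patch.
TAYLOR_COEFFS = [
    2.0 / math.sqrt(math.pi) * (-1) ** n / (math.factorial(n) * (2 * n + 1))
    for n in range(8)  # c0..c7, i.e. up to x^14 inside the parentheses
]


def subinterval_index(x: float) -> int:
    """Index of the length-1/8 sub-interval containing |x|, for |x| < 4."""
    return int(abs(x) * 8)


def odd_poly(x: float, coeffs) -> float:
    """Evaluate x * (c0 + c1*x^2 + ... + c7*x^14) with Horner's rule in x^2."""
    x2 = x * x
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x2 + c
    return x * acc
```

Because only `x2` enters the inner polynomial, negating `x` negates the result, which is why the same formula covers negative inputs.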

Performance tested with `perf.sh` tool from the CORE-MATH project on AMD Ryzen 9 5900X:

Reciprocal throughput (clock cycles / op)
```
$ ./perf.sh erff --path2
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput --  with -march=native      (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 11.790 + 0.182 clc/call; Median-Min = 0.154 clc/call; Max = 12.255 clc/call;
-- CORE-MATH reciprocal throughput --  with -march=x86-64-v2      (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 14.205 + 0.151 clc/call; Median-Min = 0.159 clc/call; Max = 15.893 clc/call;

-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 45.519 + 0.445 clc/call; Median-Min = 0.552 clc/call; Max = 46.345 clc/call;

-- LIBC reciprocal throughput --  with -mavx2 -mfma     (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 9.595 + 0.214 clc/call; Median-Min = 0.220 clc/call; Max = 9.887 clc/call;
-- LIBC reciprocal throughput --  with -msse4.2     (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 10.223 + 0.190 clc/call; Median-Min = 0.222 clc/call; Max = 10.474 clc/call;
```

and latency (clock cycles / op):
```
$ ./perf.sh erff --path2
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency --  with -march=native      (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 38.566 + 0.391 clc/call; Median-Min = 0.503 clc/call; Max = 39.170 clc/call;
-- CORE-MATH latency --  with -march=x86-64-v2      (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 43.223 + 0.667 clc/call; Median-Min = 0.680 clc/call; Max = 43.913 clc/call;

-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 111.613 + 1.267 clc/call; Median-Min = 1.696 clc/call; Max = 113.444 clc/call;

-- LIBC latency --  with -mavx2 -mfma     (with FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 40.138 + 0.410 clc/call; Median-Min = 0.536 clc/call; Max = 40.729 clc/call;
-- LIBC latency --  with -msse4.2     (without FMA instructions)
[####################] 100 %
Ntrial = 20 ; Min = 44.858 + 0.872 clc/call; Median-Min = 0.814 clc/call; Max = 46.019 clc/call;
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153683

Daniel Thornburgh [Fri, 23 Jun 2023 22:24:48 +0000 (15:24 -0700)]

[Symbolizer] Ignore unknown additional symbolizer markup fields

The symbolizer markup syntax is structured such that fields require only
previous fields for their interpretation; this was originally intended
to make adding new fields a natural extension mechanism for existing
elements. This codifies this into the spec and makes the behavior of the
llvm-symbolizer match. Extra fields are now warned about, but ignored,
rather than ignoring the whole element.
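
The new behaviour can be modelled roughly as follows. This is a simplified sketch with a hypothetical function, not llvm-symbolizer's parser; it assumes elements of the form `{{{tag:field:...}}}` and a known number of expected fields per tag.

```python
import warnings


def parse_element(text: str, expected_fields: int):
    """Parse a `{{{tag:f1:f2...}}}` element, keeping only the expected fields."""
    if not (text.startswith("{{{") and text.endswith("}}}")):
        raise ValueError("not a markup element")
    tag, *fields = text[3:-3].split(":")
    if len(fields) < expected_fields:
        return None  # too few fields: the element cannot be interpreted
    if len(fields) > expected_fields:
        extra = len(fields) - expected_fields
        warnings.warn(f"ignoring {extra} extra field(s) in '{tag}'")
    return tag, fields[:expected_fields]
```

Since each field depends only on the fields before it, dropping the trailing ones leaves the interpretation of the known fields unchanged.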

Reviewed By: mcgrathr

Differential Revision: https://reviews.llvm.org/D153821

Jon Roelofs [Mon, 26 Jun 2023 17:31:57 +0000 (10:31 -0700)]

[MachineInst] Bump NumOperands back up to 24bits

In https://reviews.llvm.org/D149445, it was lowered from 32 to 16bits, which
broke an internal project of ours. The relevant code being compiled is a fairly
large nested switch that results in a PHI node with 65k+ operands, which can't
easily be turned into a table for perf reasons.

This change unifies `NumOperands`, `Flags`, and `AsmPrinterFlags` into a packed
7-byte struct, which `CapOperands` can follow as the 8th byte, rounding it up
to a nice alignment before the `Info` field.
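
The capacity difference between the two field widths is simple arithmetic. The sketch below uses hypothetical helper names and a made-up operand count of 70000 to stand in for the "65k+" case.

```python
def max_value(bits: int) -> int:
    """Largest unsigned value representable in a `bits`-wide field."""
    return (1 << bits) - 1


def fits(num_operands: int, bits: int) -> bool:
    """True if num_operands can be stored in an unsigned field of that width."""
    return 0 <= num_operands <= max_value(bits)
```

A 16-bit field tops out at 65535 operands, while a 24-bit field allows up to 16777215.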

rdar://111217742&109362033

Differential revision: https://reviews.llvm.org/D153791

Matt Arsenault [Thu, 8 Jun 2023 16:42:59 +0000 (12:42 -0400)]

AMDGPU: Move AMDGPUAttributor run earlier

Move it up with other module passes. It's a higher level optimization
that should probably be done before hacking up the IR for codegen. It
should really be done earlier than this. We could possibly move this
with other IPO passes, but we'd have to stop inferring the lack of
lds.kernel.id calls and have the LDS module pass mark functions which
don't need the ID.

The one test change is because that pass is relying on the backend run
of SROA (which we ideally wouldn't have).

Philip Reames [Wed, 28 Jun 2023 16:38:40 +0000 (09:38 -0700)]

[docs][RISCV] Remove duplicate entries for zvfbfmin and zvfbfwma

Snehasish Kumar [Tue, 27 Jun 2023 18:26:57 +0000 (18:26 +0000)]

[instrprof] Add an overload to accept raw_string_ostream.

Add an overload for InstrProfWriter::write so that users can emit the
buffer to a string. Also use this new overload for existing unit test
usecases.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D153904

David Green [Wed, 28 Jun 2023 16:16:34 +0000 (17:16 +0100)]

[SLP][AArch64] Extend extracts-from-scalarizable-vector.ll test for cmp cost testing. NFC

See D153507. The existing test is over-simplified: as written, it should have
been simplified prior to SLP vectorization. I have left it as-is to ensure the
crash it was protecting against doesn't arise again. A new test with valid
inputs is also added to show the incorrect costs of alt cmp vectorization.

Fraser Cormack [Thu, 22 Jun 2023 15:49:39 +0000 (16:49 +0100)]

[InstSimplify] Fix a scalable-vector crash

D143505 fixed/simplified folding of operations with SNaN operands. In
doing so it introduced a crash when handling scalable vector types,
wherein the scalable-vector ConstantVector was cast to a ConstantFP.

Since we know by that point in the code that if we've found a NaN, we're
dealing with a scalable-vector splat (as there are no other kinds of
scalable-vector constant for which that holds), we can grab the splatted
value and re-use the existing code, which will automatically splat the
new NaN back to a scalable vector for us.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153566

Valentin Clement [Wed, 28 Jun 2023 16:08:22 +0000 (09:08 -0700)]

[flang][openacc] Resolve symbol in device, host and self clause

Some symbols were not resolved in the device, host and self clause
resulting in an `Internal: no symbol found` error.

This patch adds symbol resolution for these clauses.

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D153919

Valentin Clement [Wed, 28 Jun 2023 16:07:04 +0000 (09:07 -0700)]

[flang][openacc] Relax clause rule on routine directive

Some compilers treat `acc routine` without a parallelism clause as
if seq is present. Relax the parser rule to allow acc routine
without a clause. The default clause will be handled in lowering.

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D153896

Paul Robinson [Wed, 28 Jun 2023 15:27:25 +0000 (08:27 -0700)]

[doc] Fix link typo

LLVM GN Syncbot [Wed, 28 Jun 2023 15:19:41 +0000 (15:19 +0000)]

[gn build] Port 1bfdc534aaae

Yusra Syeda [Wed, 28 Jun 2023 15:18:12 +0000 (11:18 -0400)]

Revert "[SystemZ][z/OS] This patch adds support for the ADA (associated data area), doing the following:"

This reverts commit 9df0f66af5462e23216eae31aedbd4d2f459cc3d.

Jeffrey Byrnes [Tue, 27 Jun 2023 15:40:15 +0000 (08:40 -0700)]

[AMDGPU] NFC: Add schedule-relaxed-occupancy to relax occupancy targets for wave-limited/membound kernels

Default scheduling behavior for these types of kernels is to chase high occupancy goals with scheduling heuristics, but allow occupancy drops if we are unable to reach the target.

This (experimental, off-by-default) feature relaxes the occupancy target from the beginning, which enables the scheduler to produce better ILP schedules.

Differential Revision: https://reviews.llvm.org/D153925

Change-Id: I112833214e2db869704591f4df3c4574d0fcbb1b

Shilei Tian [Wed, 28 Jun 2023 15:06:36 +0000 (11:06 -0400)]

[NFC][Doc] Update feature support doc `clang/docs/OpenMPSupport.rst` to correct
the color of finished task

Craig Topper [Wed, 28 Jun 2023 14:57:47 +0000 (07:57 -0700)]

[LegalizeTypes] Combine PromoteIntRes_VECTOR_DEINTERLEAVE and PromoteIntRes_VECTOR_INTERLEAVE. NFC

The functions are identical except for the opcode of the node.
We can have a single function and use N->getOpcode().

Reviewed By: luke, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D153929

Paul Robinson [Tue, 27 Jun 2023 15:19:40 +0000 (08:19 -0700)]

[doc] Give better info about forks

Differential Revision: https://reviews.llvm.org/D153884

LLVM GN Syncbot [Wed, 28 Jun 2023 14:14:23 +0000 (14:14 +0000)]

[gn build] Port 9df0f66af546

Yusra Syeda [Wed, 28 Jun 2023 14:13:10 +0000 (10:13 -0400)]

[SystemZ][z/OS] This patch adds support for the ADA (associated data area), doing the following:
- Creates the ADA table to handle displacements
- Emits the ADA section in the SystemZAsmPrinter
- Lowers the ADA_ENTRY node into the appropriate load instruction

Differential Revision: https://reviews.llvm.org/D153788

LLVM GN Syncbot [Wed, 28 Jun 2023 14:08:35 +0000 (14:08 +0000)]

[gn build] Port 8e71d14972b4

Felipe de Azevedo Piovezan [Mon, 26 Jun 2023 14:14:25 +0000 (10:14 -0400)]

[lldb] Use LLVM's implementation of AppleTables for apple_objc

This concludes the migration of accelerator tables from LLDB code to LLVM code.

Differential Revision: https://reviews.llvm.org/D153868

commit | commitdiff | tree

David Green [Wed, 28 Jun 2023 14:02:38 +0000 (15:02 +0100)]

[ARM][AArch64] !cast<Instruction>("XYZ") -> XYZ. NFC

commit | commitdiff | tree

Guillaume Chatelet [Wed, 28 Jun 2023 11:31:16 +0000 (11:31 +0000)]

[libc][NFC] Separate avx/no-avx x86 memcpy implementations

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D153958

commit | commitdiff | tree

Joseph Huber [Tue, 27 Jun 2023 19:17:20 +0000 (14:17 -0500)]

[AMDGPU] Always pass `-mcpu` to the `lld` linker

Currently, AMDGPU more or less only supports linking with LTO. If the
user does not pass either `-flto` or `-Wl,-plugin-opt=mcpu=` manually,
linking will fail because the architectures aren't compatible. This
patch simply passes `-mcpu` by default if it was specified. It should be a
no-op if it's not actually used.
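
A minimal sketch of the idea, assuming a hypothetical helper `buildLinkerArgs` and an example CPU name; this is not the clang driver code, and the exact option spelling forwarded to `lld` is an assumption based on the `-Wl,-plugin-opt=mcpu=` form mentioned above:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends the CPU option only when a CPU was given,
// so the result is unchanged when no -mcpu was specified.
std::vector<std::string> buildLinkerArgs(const std::string &Mcpu,
                                         std::vector<std::string> UserArgs) {
  if (!Mcpu.empty())
    UserArgs.push_back("-plugin-opt=mcpu=" + Mcpu); // assumed spelling
  return UserArgs;
}
```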

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D153909

commit | commitdiff | tree

Jie Fu [Wed, 28 Jun 2023 13:46:08 +0000 (21:46 +0800)]

[flang] Build broken due to no member named 'getNumScalableDims' in 'mlir::VectorType' after D153412 (NFC)

/data/llvm-project/flang/lib/Optimizer/Dialect/FIROps.cpp:971:46: error: no member named 'getNumScalableDims' in 'mlir::VectorType'
if (mlir::dyn_cast<mlir::VectorType>(ty).getNumScalableDims() == 0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
1 error generated.

commit | commitdiff | tree

Youngsuk Kim [Wed, 28 Jun 2023 13:21:01 +0000 (09:21 -0400)]

[llvm] Replace uses of Type::getPointerTo (NFC)

Partial progress towards removing in-tree uses of `Type::getPointerTo`,
before we can deprecate the API.

If the API is used solely to support an unnecessary bitcast, get rid of
the bitcast as well.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153933

commit | commitdiff | tree

Felipe de Azevedo Piovezan [Tue, 20 Jun 2023 16:21:38 +0000 (12:21 -0400)]

[lldb] Use LLVM's implementation of AppleTables for apple_debug_types

This commit is replacing really old LLDB code, and we've found some odd
behavior while doing this replacement. While the changes here are largely NFC,
there are some subtle changes that fix such odd behavior.

The most curious example of this is the method `FindCompleteObjCClassName`,
which has a flag `must_be_implementation`. This flag was _only_ being respected
for accelerator tables containing the atom `type_flags`, which seems
counter-intuitive. The implementation for DWARF 5 tables does not do that and
neither does the code introduced in this patch.

There were other weird cases, for example, we found boolean logic that was
always true in a code path: look for a `if !has_qualified_name...` deleted
line; that condition was true by simple if/else analysis.

Differential Revision: https://reviews.llvm.org/D153867

commit | commitdiff | tree

Serge Pavlov [Wed, 28 Jun 2023 13:11:15 +0000 (20:11 +0700)]

[Clang] Reset FP options before function instantiations

Previously, function template instantiations occurred with the FP options
that were in effect at the end of the translation unit. This was a problem
for late template parsing, as these FP options were used as attributes of
AST nodes and could result in a crash. To fix it, FP options are set to
their state at the point of template definition.
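
The save-and-restore pattern this implies can be sketched with a small RAII guard. All names here (`ToyFPOptions`, `ToySema`, `FPOptionsScope`, `instantiateWith`) are hypothetical and only illustrate the idea; the actual Clang change may be structured differently.

```cpp
// Hypothetical model of the FP state tracked by the front end.
struct ToyFPOptions {
  bool AllowContract = false;
};

struct ToySema {
  ToyFPOptions CurFPOptions; // options currently in effect
};

// Installs the options recorded at the template definition for the duration
// of an instantiation, then restores whatever was active before.
class FPOptionsScope {
  ToySema &Owner;
  ToyFPOptions Saved;

public:
  FPOptionsScope(ToySema &Sema, ToyFPOptions AtDefinition)
      : Owner(Sema), Saved(Sema.CurFPOptions) {
    Owner.CurFPOptions = AtDefinition;
  }
  ~FPOptionsScope() { Owner.CurFPOptions = Saved; }
};

// Returns the setting that nodes created during instantiation would observe.
bool instantiateWith(ToySema &Sema, ToyFPOptions AtDefinition) {
  FPOptionsScope Scope(Sema, AtDefinition);
  return Sema.CurFPOptions.AllowContract;
}
```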

Differential Revision: https://reviews.llvm.org/D143241

commit | commitdiff | tree

Alexey Bataev [Tue, 27 Jun 2023 19:48:08 +0000 (12:48 -0700)]

[SLP]Fix PR63141: compareCmp is not strict weak ordering.

Added some extra checks to the compareCmp function, used when
IsCompatibility is false, so that it meets the strict weak ordering
requirements and can be used correctly in sort functions.
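
For context, a comparator passed to `std::sort` must be a strict weak ordering. The sketch below is not the SLP code; it checks two of the required properties (irreflexivity and asymmetry) on a sample set, using hypothetical helper names:

```cpp
#include <vector>

// Returns true if Comp is irreflexive and asymmetric on the given samples.
// These are necessary (not sufficient) conditions for a strict weak ordering.
template <typename T, typename Cmp>
bool looksLikeStrictWeakOrder(const std::vector<T> &Samples, Cmp Comp) {
  for (const T &A : Samples) {
    if (Comp(A, A))
      return false; // an element must never compare less than itself
    for (const T &B : Samples)
      if (Comp(A, B) && Comp(B, A))
        return false; // A < B and B < A cannot both hold
  }
  return true;
}

// "<=" is a classic invalid comparator: it reports A < A as true.
inline bool lessOrEqual(int A, int B) { return A <= B; }
inline bool strictlyLess(int A, int B) { return A < B; }
```

Passing a comparator such as `lessOrEqual` to a sort function is undefined behavior, which is why a violation can show up only on some platforms or standard libraries.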

commit | commitdiff | tree

Kevin P. Neal [Wed, 28 Jun 2023 12:52:23 +0000 (08:52 -0400)]

[TableGen] Stabilize sort in GET_SUBTARGETINFO_MACRO block

Add missed change requested in D153371.

commit | commitdiff | tree

Andrzej Warzynski [Wed, 21 Jun 2023 12:27:13 +0000 (13:27 +0100)]

[mlir][VectorType] Remove `numScalableDims` from the vector type

This is a follow-up of https://reviews.llvm.org/D153372 in which
`numScalableDims` (a single integer) was effectively replaced with the
`isScalableDim` bitmask.
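
A rough sketch of the difference, using a hypothetical `ToyVectorType` rather than `mlir::VectorType`: with one flag per dimension, a count of scalable dimensions is no longer stored, and callers that compared the count with zero can ask whether any dimension is scalable instead.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model, not the MLIR class: one "scalable" flag per dimension.
struct ToyVectorType {
  std::vector<int64_t> Shape;
  std::vector<bool> ScalableDims; // same length as Shape

  // Replaces checks of the form "numScalableDims != 0".
  bool isScalable() const {
    return std::find(ScalableDims.begin(), ScalableDims.end(), true) !=
           ScalableDims.end();
  }
};
```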

This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
* https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/

Differential Revision: https://reviews.llvm.org/D153412

commit | commitdiff | tree

Nikita Popov [Wed, 28 Jun 2023 12:48:42 +0000 (14:48 +0200)]

[AArch64] Make tests more robust (NFC)

commit | commitdiff | tree

David Truby [Mon, 26 Jun 2023 14:03:17 +0000 (15:03 +0100)]

[flang] Add COMDAT to global variables where needed

On platforms which support COMDAT sections we should use them when
linkonce or linkonce_odr linkage is requested. This is required on
Windows (PE/COFF) and provides better behaviour than weak symbols on
ELF-based platforms.

This patch also reverts string literals to use linkonce instead of
internal linkage now that comdats are supported.

Differential Revision: https://reviews.llvm.org/D153768

commit | commitdiff | tree

Jingu Kang [Tue, 27 Jun 2023 08:33:13 +0000 (09:33 +0100)]

[AArch64] Remove vector shift intrinsic with shift amount zero

Differential Revision: https://reviews.llvm.org/D153847

commit | commitdiff | tree

Nikita Popov [Wed, 28 Jun 2023 12:31:14 +0000 (14:31 +0200)]

[SimplifyCFG] Make some tests more robust (NFC)

commit | commitdiff | tree

Felipe de Azevedo Piovezan [Fri, 16 Jun 2023 19:07:59 +0000 (15:07 -0400)]

[lldb] Use LLVM's implementation of AppleTables for apple_{names,namespaces}

All the new code should match the behavior of the old exactly.

Of note, the custom queries used to be implemented inside `HashedNameToDIE.cpp`
(which is the LLDB implementation of the tables). However, when porting to LLVM,
we believe they don't belong inside the LLVM table implementation:

1. They don't require any knowledge about the table itself
2. They are not relevant for other users of these classes.
3. They use LLDB data structures.

As such, we implement these custom queries inside AppleDWARFIndex.cpp.

Types and Objective-C tables are done separately, as they have slightly
different functionality that requires rewriting more code.

Differential Revision: https://reviews.llvm.org/D153866

commit | commitdiff | tree

John Brawn [Thu, 1 Jun 2023 16:04:39 +0000 (17:04 +0100)]

[ARM] Generate out-of-line jump tables for XO without 32-bit branch

When we only have a 16-bit pc-relative branch instruction we generate
a table of addresses for a jump table. Currently this is placed inline,
but this won't work with execute-only memory. In this case, generate
the jump table out-of-line.

Differential Revision: https://reviews.llvm.org/D153774

commit | commitdiff | tree

Kevin P. Neal [Wed, 28 Jun 2023 12:26:12 +0000 (08:26 -0400)]

[TableGen] Stabilize sort in GET_SUBTARGETINFO_MACRO block

The sort of the elements in the GET_SUBTARGETINFO_MACRO block is done on
the "Name" field of each record. This field is not guaranteed to be unique,
is not guaranteed to even have a value at all, and is not used in the
output anyway. Change to sort on the "FieldName" field which should be
unique.
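
The effect of the sort key can be sketched as below. `FeatureRecord` and `emitOrder` are hypothetical names, not TableGen code: when the key is unique, the output order no longer depends on the input order, whereas sorting on an empty or repeated key with `std::sort` leaves the relative order of equal elements unspecified.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record: Name may be empty or repeated, FieldName is assumed
// to be unique.
struct FeatureRecord {
  std::string Name;
  std::string FieldName;
};

// Sorts on the unique key and returns the resulting emission order.
std::vector<std::string> emitOrder(std::vector<FeatureRecord> Records) {
  std::sort(Records.begin(), Records.end(),
            [](const FeatureRecord &L, const FeatureRecord &R) {
              return L.FieldName < R.FieldName;
            });
  std::vector<std::string> Out;
  for (const FeatureRecord &Rec : Records)
    Out.push_back(Rec.FieldName);
  return Out;
}
```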

Problem spotted when lib/Target/PowerPC/PPCGenSubtargetInfo.inc changed
unexpectedly.

Differential Revision: https://reviews.llvm.org/D153371

commit | commitdiff | tree

Nikita Popov [Wed, 28 Jun 2023 12:21:06 +0000 (14:21 +0200)]

[SimplifyCFG] Add additional tests with assume (NFC)

commit | commitdiff | tree

Florian Hahn [Wed, 28 Jun 2023 12:19:39 +0000 (13:19 +0100)]

[ConstraintElim] Try to use first cmp to prove second cmp for ANDs.

This patch extends the existing logic to handle cases where we have
branch conditions of the form (AND icmp, icmp) where the first icmp
implies the second. This can improve results in some cases, e.g. if
SimplifyCFG folded conditions from multiple branches to an AND.

The implementation handles this by adding a new type of check
(AndImpliedCheck), which is queued before conditional facts for the same
block.

When encountering AndImpliedChecks during solving, the first condition
is optimistically added to the constraint system, then we check if the
second icmp can be simplified, and finally the newly added entries are
removed.

The reason for doing things this way is to avoid clashes with signed
<-> unsigned condition transfer, which requires us to re-order facts to
increase effectiveness.
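
The add, check, remove sequence described above can be sketched with a toy constraint system. This is a simplified model with hypothetical names (`ToySystem`, `secondImpliedByFirst`) that only tracks facts of the form `x u< Bound`; the real pass works on a general system of linear constraints.

```cpp
#include <vector>

// Toy constraint system: every fact has the form "x u< Bound".
struct ToySystem {
  std::vector<unsigned> UpperBounds;

  // "x u< Bound" follows if some known fact "x u< B" has B <= Bound.
  bool implies(unsigned Bound) const {
    for (unsigned B : UpperBounds)
      if (B <= Bound)
        return true;
    return false;
  }
};

// Models a check for (x u< FirstBound) && (x u< SecondBound).
bool secondImpliedByFirst(ToySystem &CS, unsigned FirstBound,
                          unsigned SecondBound) {
  CS.UpperBounds.push_back(FirstBound);  // optimistically add the first cmp
  bool Implied = CS.implies(SecondBound); // can the second be simplified?
  CS.UpperBounds.pop_back();             // remove the temporary entry
  return Implied;
}
```

With this model, `x u< 10 && x u< 20` lets the second compare fold to true, while `x u< 20 && x u< 10` does not, and the system is left unchanged in both cases.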

Reviewed By: nikic, antoniofrighetto

Differential Revision: https://reviews.llvm.org/D151799

commit | commitdiff | tree

Tue Ly [Wed, 28 Jun 2023 12:13:05 +0000 (08:13 -0400)]

[libc] Fix missing dependency and linking option for sqrtf exhaustive test.

commit | commitdiff | tree

Haojian Wu [Wed, 28 Jun 2023 12:04:22 +0000 (14:04 +0200)]

[clangd] Fix some typos, NFC

commit | commitdiff | tree

Florian Hahn [Wed, 28 Jun 2023 12:02:11 +0000 (13:02 +0100)]

[ConstraintElim] Move condition check logic to helper function (NFC).

This allows easier re-use of the checking logic. Split off from D151799.

commit | commitdiff | tree

Tue Ly [Sat, 24 Jun 2023 04:03:06 +0000 (00:03 -0400)]

[libc][math] Clean up exhaustive tests implementations.

Clean up exhaustive tests. Let check functions return number of failures instead of passed/failed.
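
A minimal sketch of a check function that reports a failure count rather than a pass/fail flag. `countFailures` and its parameters are hypothetical and not the libc test harness:

```cpp
#include <cstdint>

// Compares Impl against Reference on every input in [Start, Stop) and
// returns how many inputs disagree, instead of a single pass/fail result.
template <typename F, typename G>
uint64_t countFailures(uint32_t Start, uint32_t Stop, F Impl, G Reference) {
  uint64_t Failures = 0;
  for (uint32_t X = Start; X < Stop; ++X)
    if (Impl(X) != Reference(X))
      ++Failures;
  return Failures;
}
```

Returning a count lets the caller sum results across sub-ranges and report how widespread a failure is.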

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D153682

commit | commitdiff | tree

Alexey Bataev [Wed, 28 Jun 2023 11:31:40 +0000 (04:31 -0700)]

Revert "[SLP]Fix PR63141: compareCmp is not strict weak ordering."

This reverts commit f3ebd88064d7f1c36a8272b3e5f7d53501c3f53b to pacify
windows-based buildbots.

commit | commitdiff | tree

Matt Arsenault [Wed, 28 Jun 2023 11:32:20 +0000 (07:32 -0400)]

OpenMP: Revert accidental cmake change to make amdgpu-arch errors fatal

I still think this should be done but should be done separately.

commit | commitdiff | tree

Matt Arsenault [Mon, 26 Jun 2023 16:43:35 +0000 (12:43 -0400)]

ValueTracking: Handle !absolute_symbol in computeKnownBits

Use a unit test, since I don't see any existing users that try to make
use of the high bits of a pointer.

This will also assert if the metadata type doesn't match the pointer
width, but I consider that a defect in the verifier that shouldn't be
handled here.

AMDGPU allocates LDS globals by assigning !absolute_symbol with the
final fixed address. Tracking that the high bits are 0 may help with
addressing mode matching.
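
As an illustration of why an address range yields known-zero high bits, here is a simplified sketch. It assumes a non-wrapping half-open range `[Lo, Hi)` and uses only the upper bound; `knownLeadingZeros` is a hypothetical helper, not the ValueTracking implementation.

```cpp
#include <cassert>
#include <cstdint>

// For any value in the non-wrapping range [Lo, Hi), every bit above the
// highest set bit of Hi - 1 is known to be zero.
unsigned knownLeadingZeros(uint64_t Lo, uint64_t Hi) {
  assert(Lo < Hi && "expected a non-empty, non-wrapping range");
  uint64_t Max = Hi - 1;
  unsigned Zeros = 0;
  for (uint64_t Bit = uint64_t(1) << 63; Bit != 0 && (Max & Bit) == 0;
       Bit >>= 1)
    ++Zeros;
  return Zeros;
}
```

For example, a symbol constrained to `[0, 0x10000)` has its top 48 bits known to be zero in a 64-bit address.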
