review.tizen.org Git - platform/upstream/llvm.git/log


Nathan Ridge [Mon, 12 Jun 2023 08:05:10 +0000 (04:05 -0400)]

[clangd] Use resolveTypeToRecordDecl() to resolve the type of a base specifier during heuristic resolution

The code for resolving the type of a base specifier was inside
CXXRecordDecl::lookupDependentName(), so this patch reimplements
lookupDependentName() in HeuristicResolver.

Fixes https://github.com/clangd/clangd/issues/1657

Differential Revision: https://reviews.llvm.org/D153248

commit | commitdiff | tree

Mark de Wever [Sat, 3 Jun 2023 11:37:53 +0000 (13:37 +0200)]

[chrono][test] Fixes some tests on Windows.

The tests switched from assert to TEST_EQUAL to make it easier to debug
assertion failures. This is used to fix most tests on Windows.

Some CI tests give no output, which needs to be investigated separately.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D152062

commit | commitdiff | tree

Mark de Wever [Mon, 5 Jun 2023 16:39:23 +0000 (18:39 +0200)]

[libc++][format] Removes an AIX work-around.

This work-around was for Clang 13 and older, which we no longer support.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D152175

commit | commitdiff | tree

Mark de Wever [Sun, 4 Jun 2023 17:12:42 +0000 (19:12 +0200)]

[libc++][test] Removes old fallbacks.

All supported compilers support the -std=year except Clang < 17 which
doesn't support C++23. This fallback is marked for removal once we no
longer support Clang 16.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D152106

commit | commitdiff | tree

Mark de Wever [Fri, 21 Apr 2023 06:09:06 +0000 (08:09 +0200)]

[libc++][format] Fixes UTF-8 continuation.

The mask used to check whether a code unit is a valid continuation was
incorrect and accepted non-continuation code units. This fixes the
issue.
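
For illustration only, here is a minimal sketch of a continuation check. The helper names are hypothetical and this is not the libc++ code; the "too loose" variant is an assumed example of how a wrong mask can accept other code units, not necessarily the bug that was fixed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the libc++ implementation.
// A UTF-8 continuation code unit has the bit pattern 10xxxxxx, so both of
// the two most significant bits have to be checked.
constexpr bool is_utf8_continuation(std::uint8_t unit) {
    return (unit & 0xC0u) == 0x80u;
}

// Assumed example of a mask that is too permissive: it tests only the top
// bit and therefore also accepts lead bytes such as 0xC3 (110xxxxx).
constexpr bool is_utf8_continuation_too_loose(std::uint8_t unit) {
    return (unit & 0x80u) == 0x80u;
}
```

With these definitions, 0x80 and 0xBF pass the strict check, while 0xC3 passes only the loose one.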

Reviewed By: ldionne, tahonermann, #libc

Differential Revision: https://reviews.llvm.org/D149672

commit | commitdiff | tree

Mark de Wever [Sun, 18 Jun 2023 13:44:38 +0000 (15:44 +0200)]

[libc++][CI] Install newer CMake version.

This version allows testing the std module in C++26 mode.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D153227

commit | commitdiff | tree

Anna Thomas [Tue, 13 Jun 2023 18:41:23 +0000 (14:41 -0400)]

[LV] Add support for minimum/maximum intrinsics

{mini|maxi}mum intrinsics are different from {min|max}num intrinsics in
the propagation of NaN and signed zero. Also, the minnum/maxnum
intrinsics require the presence of nsz flags to be valid reductions in
vectorizer. In this regard, we introduce a new recurrence kind and also
add support for identifying reduction patterns using these intrinsics.
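
The difference in NaN and signed-zero handling can be sketched with a scalar model. The function names below are hypothetical stand-ins for the two intrinsic families, written under the assumption that "minimum" propagates NaN and orders -0.0 before +0.0, while "minnum" returns the non-NaN operand; this is not vectorizer code.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical model of minimum-style semantics: NaN propagates and
// -0.0 is treated as smaller than +0.0.
inline float minimum_like(float a, float b) {
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<float>::quiet_NaN();
    if (a == b)  // equal values can still differ in sign for +/-0.0
        return std::signbit(a) ? a : b;
    return a < b ? a : b;
}

// Hypothetical model of minnum-style semantics: if one operand is NaN,
// the other operand is returned. The sign of a zero result is not modeled.
inline float minnum_like(float a, float b) {
    if (std::isnan(a)) return b;
    if (std::isnan(b)) return a;
    return a < b ? a : b;
}

// A plain scalar reduction loop of the kind a vectorizer would recognize.
template <typename Op>
float reduce(const std::vector<float>& values, float init, Op op) {
    float acc = init;
    for (float x : values) acc = op(acc, x);
    return acc;
}
```

Reducing `{1.0f, NAN, -2.0f}` with `minimum_like` yields NaN, whereas `minnum_like` yields `-2.0f`, which is why the two cannot share one recurrence kind.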

The reduction intrinsics and their lowering were introduced in 26bfbec5d2.

There are tests added which show how this interacts across chains of
min/max patterns.

Differential Revision: https://reviews.llvm.org/D151482

commit | commitdiff | tree

Nico Weber [Wed, 14 Jun 2023 02:53:21 +0000 (19:53 -0700)]

[lld] Make lit files relocatable

2700da5fe28d8 added lld/test/Unit/lit.site.cfg.py.in in a state
that half-supports relocatable lld lit tests.

Make them fully relocatable.

See description of fb80b6b2d58c47 for background.

Differential Revision: https://reviews.llvm.org/D152885

commit | commitdiff | tree

Matt Arsenault [Mon, 19 Jun 2023 15:55:11 +0000 (11:55 -0400)]

AMDGPU: Delete old AMDGPUPropagateAttributes pass

The optimizing, non-broken features have all been moved to
AMDGPUAttributor. The only remaining piece of functionality was the
broken propagation of the wavesize features. This was fundamentally
broken and a hack for device library linking. It doesn't matter when
the device libraries are correctly linked and internalized.

In case of linked-as-normal-bitcode (as comgr still does), we're
reliant on the global subtarget anyway. If we can get away without
forcing target-cpu, we should just as well be able to get away without
propagating target-features.

commit | commitdiff | tree

Amy Kwan [Tue, 20 Jun 2023 04:22:28 +0000 (23:22 -0500)]

[AIX][TLS] Generate 32-bit local-exec access code sequence

This patch adds support for the TLS local-exec access model on AIX, making it
possible to generate the 32-bit (specifically, non-optimized) code sequence.
This work is a follow-up to D149722.

The code sequence generated is as follows:
```
.tc var[TC],var[TL]@le.   // variable offset, with the le relocation specifier

bla .__get_tpointer()     // get the thread pointer, modifies r3
lwz reg1, var[TC](2)      // load the variable offset
add reg2, r3, reg1        // add the variable offset to the retrieved thread pointer
```

Differential Revision: https://reviews.llvm.org/D152669

commit | commitdiff | tree

Craig Topper [Tue, 20 Jun 2023 16:36:38 +0000 (09:36 -0700)]

[RISCV] Remove mask from vrgatherei16 in lowerVECTOR_INTERLEAVE.

Unless I'm missing something we need to update the whole vector
not just where OddMask is true.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D153087

commit | commitdiff | tree

Simi Pallipurath [Tue, 20 Jun 2023 16:25:35 +0000 (17:25 +0100)]

Revert "[lld][Arm] Big Endian - Byte invariant support."

This reverts commit 8cf8956897ce9bca3176c6339077b1ca17b27abc.

commit | commitdiff | tree

Matt Arsenault [Mon, 20 Mar 2023 22:49:17 +0000 (18:49 -0400)]

InlineSpiller: Consider copy bundles when looking for snippet copies

This was looking for full copies produced by SplitKit, but SplitKit
introduces copy bundles if not all lanes are live. The scan for uses
needs to look at bundles, not individual instructions.

This is a prerequisite to avoiding some redundant spills due to
subregisters which will help avoid an allocation failure in a future
patch.

commit | commitdiff | tree

Matt Arsenault [Tue, 20 Jun 2023 00:47:29 +0000 (20:47 -0400)]

llvm-reduce: Fix introducing invalid uses of intrinsics

commit | commitdiff | tree

Matt Devereau [Tue, 20 Jun 2023 15:05:29 +0000 (15:05 +0000)]

[AArch64][SME] Rename strided load/store enums

This patch renames load/store enums to be equivalent
to the contiguous addressing modes with _STRIDED and _STRIDED_IMM
suffixed.

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 15:38:08 +0000 (17:38 +0200)]

[Attributor] Convert test to opaque pointers (NFC)

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 15:34:55 +0000 (17:34 +0200)]

[Attributor] Name instructions in test (NFC)

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 15:24:55 +0000 (17:24 +0200)]

[Attributor] Convert some tests to opaque pointers (NFC)

commit | commitdiff | tree

Owen Pan [Sat, 17 Jun 2023 22:28:49 +0000 (15:28 -0700)]

[clang-format] Add InsertNewlineAtEOF to .clang-format files

Also, reformat all clang-format related files.

Differential Revision: https://reviews.llvm.org/D153208

commit | commitdiff | tree

Louis Dionne [Mon, 19 Jun 2023 18:40:19 +0000 (14:40 -0400)]

[libc++] Add missing [[maybe_unused]] attribute in format tests

Otherwise, Clang complains about format_ctx being unused in the tests
when exceptions are disabled in Freestanding mode. I don't know why it
doesn't complain when not in freestanding mode.

Differential Revision: https://reviews.llvm.org/D153301

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 15:22:19 +0000 (17:22 +0200)]

[LoopVersioning] Convert tests to opaque pointers (NFC)

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 15:22:07 +0000 (17:22 +0200)]

[LoopVersioning] Regenerate test checks (NFC)

commit | commitdiff | tree

Mark de Wever [Mon, 19 Jun 2023 15:06:43 +0000 (17:06 +0200)]

[NFC][libc++] Addresses LWG3927.

Changes to preconditions have no impact on the library.

Implements
- LWG3927 Unclear preconditions for operator[] for sequence containers

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D153286

commit | commitdiff | tree

Mark de Wever [Mon, 19 Jun 2023 15:06:43 +0000 (17:06 +0200)]

[NFC][libc++] Addresses LWG3935.

Note libc++ implemented this in its initial version.

Implements
- LWG3935 template<class X> constexpr complex& operator=(const complex<X>&) has no specification

Reviewed By: #libc, philnik, ldionne

Differential Revision: https://reviews.llvm.org/D153287

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 15:14:52 +0000 (17:14 +0200)]

[SafepointIRVerifier] Convert test to opaque pointers (NFC)

commit | commitdiff | tree

Simon Pilgrim [Tue, 20 Jun 2023 15:03:07 +0000 (16:03 +0100)]

[DAG] Add getExtOrTrunc helper. NFC.

Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 14:56:09 +0000 (16:56 +0200)]

[LTO] Avoid -opaque-pointers=0 in test (NFC)

Commit the bitcode file instead of generating it dynamically, as
this will no longer be possible in the future.

commit | commitdiff | tree

Louis Dionne [Wed, 29 Mar 2023 20:48:20 +0000 (16:48 -0400)]

[libc++] Add incomplete availability markup for std::pmr

This fixes rdar://110330781, which asked for the feature-test macro
for std::pmr to take into account the deployment target. It doesn't
fix https://llvm.org/PR62212, though, because the availability markup
itself must be disabled until some Clang bugs have been fixed.

This is pretty vexing, however at least everything should work once
those Clang bugs have been fixed. In the meantime, this patch at least
adds the required markup (as disabled) and ensures that the feature-test
macro for std::pmr is aware of the deployment target requirement.

Differential Revision: https://reviews.llvm.org/D135813

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 14:44:16 +0000 (16:44 +0200)]

[DebugInfo] Convert tests to opaque pointers (NFC)

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 14:43:14 +0000 (16:43 +0200)]

[Bitcode] Remove -opaque-pointer=0 check lines (NFC)

These tests were testing both typed and opaque pointers. Only keep
opaque pointers tests.

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 14:34:58 +0000 (16:34 +0200)]

[llvm-nm] Avoid -opaque-pointers option in test (NFC)

Commit the typed pointer bitcode file instead of producing it,
as this will not be possible in the future anymore.

commit | commitdiff | tree

Simon Pilgrim [Tue, 20 Jun 2023 14:10:32 +0000 (15:10 +0100)]

[DAG] Fold (abds x, y) -> (abdu x, y) iff both args are known positive

This is a generic DAG combine version of D151055 which recognizes when a signed ABDS can be safely replaced with an unsigned ABDU instruction if it is legal.
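
As a sanity check of the idea, the fold can be modeled on 8-bit values and verified exhaustively. This is a standalone sketch with hypothetical helper names; it assumes "known positive" means the sign bit of both operands is known to be zero, and it does not use the SelectionDAG API.

```cpp
#include <cassert>
#include <cstdint>

// Absolute difference with the operands interpreted as signed 8-bit values;
// the result is returned as an 8-bit pattern.
inline std::uint8_t abds8(std::uint8_t x, std::uint8_t y) {
    int a = static_cast<std::int8_t>(x);
    int b = static_cast<std::int8_t>(y);
    return static_cast<std::uint8_t>(a > b ? a - b : b - a);
}

// Absolute difference with the operands interpreted as unsigned 8-bit values.
inline std::uint8_t abdu8(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(x > y ? x - y : y - x);
}

// Checks every pair of operands whose sign bits are both clear.
inline bool fold_holds_when_sign_bits_clear() {
    for (int x = 0; x < 128; ++x)
        for (int y = 0; y < 128; ++y)
            if (abds8(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(y)) !=
                abdu8(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(y)))
                return false;
    return true;
}
```

Once a sign bit is set the two differ, for example for the bit patterns 0xFF and 0x01, so the precondition is needed.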

Alive2: https://alive2.llvm.org/ce/z/pb5BjG

Differential Revision: https://reviews.llvm.org/D153328

commit | commitdiff | tree

David Green [Tue, 20 Jun 2023 14:26:01 +0000 (15:26 +0100)]

[AArch64] Use ISD::isExtOpcode. NFC

commit | commitdiff | tree

Jingu Kang [Mon, 19 Jun 2023 16:48:00 +0000 (17:48 +0100)]

[AArch64] Try to fold uaddlv and uaddlp

Add tablegen pattern for uaddlv(uaddlp(x)) ==> uaddlv(x).
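
Why the fold preserves the value can be seen in a scalar model. The functions below are hypothetical models on `std::array`, assuming uaddlp adds adjacent pairs into wider lanes and uaddlv sums all lanes into a wider scalar; element widths of the real instructions are not modeled exactly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Model of a pairwise widening add: 8 x u8 -> 4 x u16.
inline std::array<std::uint16_t, 4> uaddlp(const std::array<std::uint8_t, 8>& v) {
    std::array<std::uint16_t, 4> out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<std::uint16_t>(v[2 * i] + v[2 * i + 1]);
    return out;
}

// Model of a widening add across all lanes.
template <typename T, std::size_t N>
std::uint32_t uaddlv(const std::array<T, N>& v) {
    std::uint32_t sum = 0;
    for (T lane : v) sum += lane;
    return sum;
}
```

Because the pairwise step never overflows its wider lanes, summing the pairs gives the same total as summing the original lanes directly.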

Differential Revision: https://reviews.llvm.org/D153323

commit | commitdiff | tree

Stephen Thomas [Tue, 20 Jun 2023 08:59:08 +0000 (09:59 +0100)]

[AMDGPU] Remove unused method Waitcnt::dominates(). NFC

Differential Revision: https://reviews.llvm.org/D153322

commit | commitdiff | tree

Weining Lu [Tue, 20 Jun 2023 13:50:54 +0000 (21:50 +0800)]

[LoongArch] Optimize conditional selection of integer

This patch optimizes code generation by leveraging the zeroing behavior of the `maskeqz`/`masknez` instructions.

```
int sel(int a, int b)
{
return (a < b) ? a : 0;
}
```

```
slt $a1,$a0,$a1
masknez $a2,$r0,$a1
maskeqz $a0,$a0,$a1
or $a0,$a0,$a2
```

=>

```
slt $a1,$a0,$a1
maskeqz $a0,$a0,$a1
```
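
The equivalence of the two sequences above can be checked with a small model. This sketch assumes the usual semantics of the two instructions (`maskeqz` yields zero when the condition register is zero, `masknez` yields zero when it is non-zero); the function names mirror the mnemonics but are otherwise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed semantics: result is 0 if rk == 0, otherwise rj.
inline std::int64_t maskeqz(std::int64_t rj, std::int64_t rk) {
    return rk == 0 ? 0 : rj;
}

// Assumed semantics: result is 0 if rk != 0, otherwise rj.
inline std::int64_t masknez(std::int64_t rj, std::int64_t rk) {
    return rk != 0 ? 0 : rj;
}

// Models the four-instruction sequence: slt, masknez, maskeqz, or.
inline std::int64_t sel_before(std::int64_t a, std::int64_t b) {
    std::int64_t cond = a < b ? 1 : 0;        // slt
    std::int64_t zero_arm = masknez(0, cond); // always 0: selects from $r0
    std::int64_t a_arm = maskeqz(a, cond);
    return a_arm | zero_arm;                  // or
}

// Models the two-instruction sequence: slt, maskeqz.
inline std::int64_t sel_after(std::int64_t a, std::int64_t b) {
    std::int64_t cond = a < b ? 1 : 0;        // slt
    return maskeqz(a, cond);
}
```

Since the `masknez` arm reads the zero register, it always contributes 0 and the `or` is redundant.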

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D153193

commit | commitdiff | tree

Pravin Jagtap [Tue, 20 Jun 2023 13:52:58 +0000 (09:52 -0400)]

[AMDGPU] Use verify<domtree> instead of intra-pass asserts.

Verifying dominator tree is expensive using intra-pass
asserts. Asserts added during D147408 are
increasing the build time of libc significantly. This change
does the verification after the atomic optimizer pass
and should fix the regression reported in D153232.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D153261

commit | commitdiff | tree

Weining Lu [Tue, 20 Jun 2023 13:49:28 +0000 (21:49 +0800)]

[LoongArch] Indent LoongArchInstrInfo.td a little bit. NFC

commit | commitdiff | tree

Tue Ly [Fri, 16 Jun 2023 13:22:00 +0000 (09:22 -0400)]

[libc][math] Improve exp2f performance.

Re-organize special cases and add a special case when `|x| < 2^-5`.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153134

commit | commitdiff | tree

Tue Ly [Thu, 15 Jun 2023 18:56:08 +0000 (14:56 -0400)]

[libc][math] Slightly improve sinhf and coshf performance.

Re-order exceptional branches and slightly adjust the evaluation.
Depends on https://reviews.llvm.org/D153026 .

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153062

commit | commitdiff | tree

Tue Ly [Thu, 15 Jun 2023 05:21:57 +0000 (01:21 -0400)]

[libc][math] Improve tanhf performance.

Re-order exceptional branches and slightly adjust the evaluation.

Performance tested with the CORE-MATH project on AMD EPYC 7B12 (clocks/op)

Reciprocal throughputs:
```
--- BEFORE ---

$ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 7.794 + 0.102 clc/call; Median-Min = 0.066 clc/call; Max = 8.267 clc/call;
[####################] 100 %. (with -msse4.2)
Ntrial = 20 ; Min = 10.783 + 0.172 clc/call; Median-Min = 0.144 clc/call; Max = 11.446 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 18.926 + 0.381 clc/call; Median-Min = 0.342 clc/call; Max = 19.623 clc/call;

--- AFTER ---

$ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 6.598 + 0.085 clc/call; Median-Min = 0.052 clc/call; Max = 6.868 clc/call;
[####################] 100 %  (with -msse4.2)
Ntrial = 20 ; Min = 9.245 + 0.304 clc/call; Median-Min = 0.248 clc/call; Max = 10.675 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 11.724 + 0.440 clc/call; Median-Min = 0.444 clc/call; Max = 12.262 clc/call;
```

Latency:
```
--- BEFORE ---

$ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 38.821 + 0.157 clc/call; Median-Min = 0.122 clc/call; Max = 39.539 clc/call;
[####################] 100 %. (with -msse4.2)
Ntrial = 20 ; Min = 44.767 + 0.766 clc/call; Median-Min = 0.681 clc/call; Max = 45.951 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 55.055 + 1.512 clc/call; Median-Min = 1.571 clc/call; Max = 57.039 clc/call;

--- AFTER ---

$ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 36.147 + 0.194 clc/call; Median-Min = 0.181 clc/call; Max = 36.536 clc/call;
[####################] 100 %  (with -msse4.2)
Ntrial = 20 ; Min = 40.904 + 0.728 clc/call; Median-Min = 0.557 clc/call; Max = 42.231 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 55.776 + 0.557 clc/call; Median-Min = 0.542 clc/call; Max = 56.551 clc/call;
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153026

commit | commitdiff | tree

Weining Lu [Tue, 20 Jun 2023 09:21:31 +0000 (17:21 +0800)]

[LoongArch] Add missing chains and remove unnecessary `SDNPSideEffect` property for some intrinsic nodes

commit | commitdiff | tree

David Green [Tue, 20 Jun 2023 13:10:25 +0000 (14:10 +0100)]

[AArch64] Add tablegen patterns for fp16 fcvtn2.

Similar to the existing f32 pattern, this adds a tablegen pattern for the fp16
fcvtn2.

commit | commitdiff | tree

Simi Pallipurath [Thu, 18 May 2023 11:29:07 +0000 (12:29 +0100)]

[lld][Arm] Big Endian - Byte invariant support.

Arm has a big-endian configuration called BE8, which is byte-invariant (every byte has the same address on little-endian and big-endian systems).

When in BE8 mode:
  1. Instructions are big-endian in relocatable objects but
     little-endian in executables and shared objects.
  2. Data is big-endian.
  3. The data encoding of the ELF file is ELFDATA2MSB.

To support BE8 without an ABI break for relocatable objects, the linker takes on the responsibility of changing the endianness of instructions. At a high level the only difference between BE32 and BE8 in the linker is that for BE8:
  1. The linker sets the flag EF_ARM_BE8 in the ELF header.
  2. The linker endian reverses the instructions, but not data.

This patch adds BE8 big-endian support for Arm. To endian-reverse the instructions we need access to the mapping symbols. Code sections can contain a mix of Arm, Thumb and literal data. We need to endian-reverse Arm instructions as words and Thumb instructions as half-words, and ignore literal data. The only way to find these transitions precisely is by using mapping symbols. The instruction reversal needs to take place after relocation. For Arm BE8 code sections (sections with the SHF_EXECINSTR flag) we insert a step after relocation to endian-reverse the instructions. The implementation strategy I have used here is to write all sections as BE32, including SyntheticSections, then endian-reverse all code in InputSections via mapping symbols.
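
The reversal step described above can be sketched as follows. This is a simplified standalone model, not lld code: `Kind`, `Range` and `reverse_instructions` are hypothetical names, and the ranges are assumed to have been derived already from the `$a`, `$t` and `$d` mapping symbols.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical classification of a byte range, as implied by mapping symbols.
enum class Kind { Arm, Thumb, Data };

struct Range {
    std::size_t begin;  // inclusive byte offset into the section
    std::size_t end;    // exclusive byte offset into the section
    Kind kind;
};

// Reverses Arm code as 4-byte words and Thumb code as 2-byte half-words,
// leaving literal data untouched.
inline void reverse_instructions(std::vector<std::uint8_t>& bytes,
                                 const std::vector<Range>& ranges) {
    for (const Range& r : ranges) {
        std::size_t width = r.kind == Kind::Arm ? 4 : r.kind == Kind::Thumb ? 2 : 0;
        if (width == 0) continue;  // literal data keeps its byte order
        assert(r.begin <= r.end && r.end <= bytes.size());
        assert((r.end - r.begin) % width == 0);
        for (std::size_t i = r.begin; i < r.end; i += width) {
            auto first = bytes.begin() + static_cast<std::ptrdiff_t>(i);
            std::reverse(first, first + static_cast<std::ptrdiff_t>(width));
        }
    }
}
```

Running the step after relocation, as the message describes, means the relocated instruction encodings are what get reversed.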

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D150870

commit | commitdiff | tree

Joseph Huber [Thu, 8 Jun 2023 13:28:07 +0000 (08:28 -0500)]

[LinkerWrapper] Support linking vendor bitcode late

The GPU vendors currently provide bitcode files for their device
runtime. These files need to be handled specially as they are not built
to be linked in with a standard `llvm-link` call or through LTO linking.
This patch adds an alternative to use the existing clang handling of
these libraries that does the necessary magic to make this work.

We do this by causing the LTO backend to emit bitcode before running the
backend. We then pass this through to clang which uses the existing
support which has been fixed to support this by D152391. The backend
will then be run with the merged module.

This patch adds the `--builtin-bitcode=<triple>=file.bc` to specify a single
file, or just `--clang-backend` to let the toolchain handle its defaults
(currently nothing for NVPTX and the ROCm device libs for AMDGPU). This may have
a performance impact due to running the optimizations again, we could
potentially disable optimizations in LTO and only do the linking if this is an
issue.

This should allow us to resolve issues when relying on the `linker-wrapper` to
do a late linking that may depend on vendor libraries.

Depends on D152391

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D152442

commit | commitdiff | tree

Joseph Huber [Wed, 7 Jun 2023 18:11:19 +0000 (13:11 -0500)]

[Clang] Allow bitcode linking when the input is LLVM-IR

Clang provides the `-mlink-bitcode-file` and `-mlink-builtin-bitcode`
options to insert LLVM-IR into the current TU. These are useful
primarily for including LLVM-IR files that require special handling to
be correct and cannot be linked normally, such as GPU vendor libraries
like `libdevice.10.bc`. Currently these options can only be used if the
source input goes through the AST consumer path. This patch makes the
changes necessary to also support this when the input is LLVM-IR. This
will allow the following operation:

```
clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc
```

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D152391

commit | commitdiff | tree

Matthias Springer [Tue, 20 Jun 2023 12:39:40 +0000 (14:39 +0200)]

[mlir][Pass] Check supported op types before running a pass

Add extra error checking to prevent passes from being run on unsupported ops through the pass manager infrastructure.

Differential Revision: https://reviews.llvm.org/D153144

commit | commitdiff | tree

Haojian Wu [Tue, 20 Jun 2023 11:10:45 +0000 (13:10 +0200)]

[include-cleaner] Ignore the ParmVarDecl itself in WalkAST.cpp

This fixes a false positive where a ParmVarDecl happened to have the
same name as a C standard symbol in the global namespace.

```
using A = int(int time); // we suggest <ctime> for the `int time`.
```

Differential Revision: https://reviews.llvm.org/D153330

commit | commitdiff | tree

Vladislav Dzhidzhoev [Tue, 20 Jun 2023 11:08:47 +0000 (13:08 +0200)]

Revert "Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" (2)"

This reverts commit cb9ac7051589ea0d05507f9370d0716bef86b4ae.
It causes an assert in clang:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction*): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.
https://bugs.chromium.org/p/chromium/issues/detail?id=1456288#c2

commit | commitdiff | tree

Alexey Lapshin [Mon, 19 Jun 2023 15:04:19 +0000 (17:04 +0200)]

[DWARFLinker] add DWARFUnit::getIndexedAddressOffset().

This patch is a followup for D153162. It cures one more place
where indexed address was incorrectly read. It also moves handling
of indexed address into DWARFUnit.

Differential Revision: https://reviews.llvm.org/D153297

commit | commitdiff | tree

Alex Zinenko [Mon, 19 Jun 2023 11:21:33 +0000 (13:21 +0200)]

[mlir] mark libraries in mlir/examples as such

The LLVM build system distinguishes between `add_llvm_example_library` and
`add_llvm_library`, which is presumably used to package examples
separately from the regular library. Introduce a similar approach to
building example libraries in MLIR and use it for the transform dialect
tutorial.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D153265

commit | commitdiff | tree

Harvin Iriawan [Tue, 20 Jun 2023 10:48:18 +0000 (11:48 +0100)]

[AArch64] Add Cortex-A510 specific scheduling

- Update the Cortex-A510 mcpu target to use A510 scheduling info instead of
  A55. Values taken are based on the A510 software optimisation guide
  https://developer.arm.com/documentation/PJDOC-466751330-536816/latest
- Set the latency of most integer ops to 1. The CPU uarch is able to resolve most
  integer ops in 1 cycle

Differential Revision: https://reviews.llvm.org/D152688

commit | commitdiff | tree

ManuelJBrito [Sun, 11 Jun 2023 19:39:45 +0000 (20:39 +0100)]

[clang][NFC] Drop alignment in builtin-nondeterministic-value test

Drop alignment to allow the test to run on different platforms.

Differential Revision: https://reviews.llvm.org/D152547

commit | commitdiff | tree

Chuanqi Xu [Tue, 20 Jun 2023 10:35:58 +0000 (18:35 +0800)]

[Coroutines] Store the index for final suspend point in the exception path

Try to address part of
https://github.com/llvm/llvm-project/issues/61900.

It is not completely addressed since the original reproducer is not
fixed, because the final suspend point is optimized out in its special
case. But that is a relatively independent issue.

commit | commitdiff | tree

Ivan Kosarev [Tue, 20 Jun 2023 10:29:40 +0000 (11:29 +0100)]

[AMDGPU] Drop GFX11 runs for dagcombine-fma-fmad.ll and fma.f16.ll.

They cause failures on the llvm-clang-x86_64-expensive-checks-debian
buildbot.

This partially reverts
D153269 [AMDGPU][GFX11] Add test coverage for FMA instructions.

commit | commitdiff | tree

Jan Svoboda [Tue, 20 Jun 2023 10:21:59 +0000 (12:21 +0200)]

[clang][index] Fix cast warning

This is a follow-up to D151938 that should fix GCC's -Wcast-qual warning.

commit | commitdiff | tree

Francesco Petrogalli [Tue, 20 Jun 2023 09:56:03 +0000 (11:56 +0200)]

[CodeGen][test] Add missing `REQUIRES`.

Differential Revision: https://reviews.llvm.org/D153325

commit | commitdiff | tree

Ivan Kosarev [Tue, 20 Jun 2023 09:32:12 +0000 (10:32 +0100)]

[AMDGPU][GFX11] Add test coverage for FMA instructions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153269

commit | commitdiff | tree

Francesco Petrogalli [Tue, 20 Jun 2023 09:42:01 +0000 (11:42 +0200)]

[llc][MISched] Add `-misched-detail-resource-booking` to llc.

The option `-misched-detail-resource-booking` prints the following
information every time the method
`SchedBoundary::getNextResourceCycle` is invoked:

1. counters of the resources that have already been booked;

2. the values returned by `getNextResourceCycle`, which is the next
available cycle in which a resource can be booked.

The option is useful to debug low-level checks inside the machine
scheduler that make decisions based on the values returned by
`getNextResourceCycle`.

Reviewed By: andreadb

Differential Revision: https://reviews.llvm.org/D153116

commit | commitdiff | tree

Francesco Petrogalli [Tue, 20 Jun 2023 09:28:45 +0000 (11:28 +0200)]

Revert "[llc][MISched] Add `-misched-detail-resource-booking` to llc."

Reverting because of https://lab.llvm.org/buildbot#builders/75/builds/32485:

llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:2374:7: error: use of undeclared identifier 'MischedDetailResourceBooking'
if (MischedDetailResourceBooking)

This reverts commit fc06262c1c365777e71207b6a5de281cba927c96.

commit | commitdiff | tree

Francesco Petrogalli [Tue, 20 Jun 2023 09:13:39 +0000 (11:13 +0200)]

commit | commitdiff | tree

Matthias Springer [Tue, 20 Jun 2023 08:48:40 +0000 (10:48 +0200)]

[mlir][transform] Add TransformRewriter

All `apply` functions now have a `TransformRewriter &` parameter. This rewriter should be used to modify the IR. It has a `TrackingListener` attached and updates the internal handle-payload mappings based on rewrites.

Implementations no longer need to create their own `TrackingListener` and `IRRewriter`. Error checking is integrated into `applyTransform`. Tracking listener errors are reported only for ops with the `ReportTrackingListenerFailuresOpTrait` trait attached, allowing for a gradual migration. Furthermore, errors can be silenced with an op attribute.

Additional API will be added to `TransformRewriter` in subsequent revisions. This revision just adds an "empty" `TransformRewriter` class and updates all `apply` implementations.

Differential Revision: https://reviews.llvm.org/D152427

commit | commitdiff | tree

Nikita Popov [Tue, 20 Jun 2023 07:33:49 +0000 (09:33 +0200)]

[SCEVNormalization] Short circuit case with no loops (NFC)

If there are no post-inc loops, normalization is a no-op. Don't
bother rewriting the SCEV in that case.

commit | commitdiff | tree

Diana Picus [Wed, 7 Jun 2023 12:42:02 +0000 (14:42 +0200)]

[AMDGPU] Document amdgpu_cs_chain[_preserve] CCs. NFC

Co-authored-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Differential Revision: https://reviews.llvm.org/D151997

commit | commitdiff | tree

Diana Picus [Wed, 7 Jun 2023 11:52:36 +0000 (13:52 +0200)]

[AMDGPU] Start documenting calling conventions. NFC

Add a section to AMDGPUUsage.rst about calling conventions and list the
ones from the CallingConv enum. Full descriptions can come later (help
appreciated).

Differential Revision: https://reviews.llvm.org/D151996

commit | commitdiff | tree

serge-sans-paille [Mon, 19 Jun 2023 16:35:03 +0000 (18:35 +0200)]

[llvm-profdata] Fix llvm-profdata help and make sure it remains in sync

This makes the new `order` subcommand part of the help.
As a side effect, also make llvm::map_range compatible with plain
arrays.

Differential Revision: https://reviews.llvm.org/D153303

commit | commitdiff | tree

Michael Buch [Mon, 19 Jun 2023 12:59:32 +0000 (13:59 +0100)]

[clang][DebugInfo] Emit DW_AT_deleted on any deleted member function

Currently we emit `DW_AT_deleted` for `deleted` special-member
functions (i.e., ctors/dtors). However, in C++ one can mark any
member function as deleted. This patch expands the set of member
functions for which we emit `DW_AT_deleted`.

The DWARFv5 spec section 5.7.8 says:
```
<non-normative>
In C++, a member function may be declared as deleted. This prevents the compiler from
generating a default implementation of a special member function such as a constructor
or destructor, and can affect overload resolution when used on other member functions.
</non-normative>

If the member function entry has been declared as deleted, then that entry has a
DW_AT_deleted attribute.
```

Thus this change is conforming.
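
As an illustration of the distinction, the hypothetical class below has both a deleted special member function and a deleted ordinary member function. Per the description above, the first kind already received `DW_AT_deleted` and the second is what this patch adds; the class itself is an invented example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example type, not taken from the patch or its tests.
struct Widget {
    Widget() = default;

    // Deleted special member function (copy constructor).
    Widget(const Widget&) = delete;

    // Ordinary member function with one overload deleted: calling
    // resize(1.5) is a compile error instead of converting to int.
    void resize(int) {}
    void resize(double) = delete;
};
```

Here `Widget` remains default-constructible but is not copy-constructible, and the deleted `resize(double)` overload takes part in overload resolution only to reject the call.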

Differential Revision: https://reviews.llvm.org/D153282

commit | commitdiff | tree

Nuno Lopes [Tue, 20 Jun 2023 08:07:08 +0000 (09:07 +0100)]

[docs] Fix GEP faq references to undefined behavior [NFC]

commit | commitdiff | tree

Ben Shi [Mon, 19 Jun 2023 09:14:41 +0000 (17:14 +0800)]

[CSKY] Optimize multiplication with immediates

Try to break a multiplication by a specific immediate into
an addition or subtraction of left shifts.
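
The arithmetic behind this kind of strength reduction can be sketched as below. The helper names are hypothetical, and the assumption that the immediates of interest have the form 2^k + 1 or 2^k - 1 is an illustration; the exact set of immediates the patch handles is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// x * (2^k + 1) == (x << k) + x, using wrapping 32-bit arithmetic.
inline std::uint32_t mul_by_pow2_plus_one(std::uint32_t x, unsigned k) {
    assert(k < 32);
    return (x << k) + x;
}

// x * (2^k - 1) == (x << k) - x, using wrapping 32-bit arithmetic.
inline std::uint32_t mul_by_pow2_minus_one(std::uint32_t x, unsigned k) {
    assert(k < 32);
    return (x << k) - x;
}
```

For example, multiplying by 9 becomes a shift by 3 plus an add, and multiplying by 7 becomes a shift by 3 minus the original value.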

Reviewed By: zixuan-wu

Differential Revision: https://reviews.llvm.org/D153106

commit | commitdiff | tree

Ben Shi [Mon, 19 Jun 2023 09:12:25 +0000 (17:12 +0800)]

[CSKY][test][NFC] Add more tests of multiplication with immediates

Reviewed By: zixuan-wu

Differential Revision: https://reviews.llvm.org/D153105

commit | commitdiff | tree

Diana Picus [Tue, 20 Jun 2023 07:54:40 +0000 (09:54 +0200)]

Revert "[AMDGPU] Start documenting calling conventions. NFC"

This reverts commit aa7b127cb7314e326457d7f790d36db1cb74f63c.

...because I really ought to install sphinx.

commit | commitdiff | tree

Martin Braenne [Tue, 20 Jun 2023 07:15:31 +0000 (07:15 +0000)]

Revert "Prevent deadlocks in death tests."

This reverts commit dfbcee286b9b96751014ebc5ba5290e42796be37.

This was causing unit tests to fail on Gentoo, see comments on
https://reviews.llvm.org/D152696.

commit | commitdiff | tree

Diana Picus [Tue, 20 Jun 2023 07:37:16 +0000 (09:37 +0200)]

Fixup D151996

commit | commitdiff | tree

Diana Picus [Wed, 7 Jun 2023 11:52:36 +0000 (13:52 +0200)]

commit | commitdiff | tree

Nathan Ridge [Tue, 20 Jun 2023 06:57:36 +0000 (02:57 -0400)]

[clangd] Index the type of a non-type template parameter

Fixes https://github.com/clangd/clangd/issues/1666

Differential Revision: https://reviews.llvm.org/D153251

commit | commitdiff | tree

Haojian Wu [Mon, 19 Jun 2023 12:57:09 +0000 (14:57 +0200)]

[include-cleaner] Bailout on invalid code for the command-line tool

The binary tool only works on working source code. If the source code is
not compilable, don't perform any analysis or edits.

Differential Revision: https://reviews.llvm.org/D153271

commit | commitdiff | tree

Matthias Springer [Tue, 20 Jun 2023 06:54:49 +0000 (08:54 +0200)]

[mlir][transform] Add ApplyRegisteredPassOp transform op

This transform op runs a pass on the target op.

Differential Revision: https://reviews.llvm.org/D153143

commit | commitdiff | tree

Kazu Hirata [Tue, 20 Jun 2023 06:36:14 +0000 (23:36 -0700)]

[tools] Use llvm::is_contained (NFC)

commit | commitdiff | tree

Fangrui Song [Tue, 20 Jun 2023 06:02:45 +0000 (23:02 -0700)]

[xray][AArch64] Rewrite trampoline

Optimize (cmp+beq => cbz), deduplicate code (SAVE_REGISTERS/RESTORE_REGISTERS),
improve portability (use ASM_SYMBOL to be compatible with Mach-O), and fix style
issues.
Also, port D37965 (x86 tail call) to __xray_FunctionTailExit.

commit | commitdiff | tree

Jaroslav Sevcik [Tue, 20 Jun 2023 05:56:16 +0000 (07:56 +0200)]

[lldb] Make the test for D153043 linux-only

commit | commitdiff | tree

Jaroslav Sevcik [Tue, 20 Jun 2023 05:19:37 +0000 (07:19 +0200)]

[lldb] Make test for D153043 independent of external symbols

This removes dependence on the libc abort function.

commit | commitdiff | tree

Zhongyunde [Tue, 20 Jun 2023 05:12:02 +0000 (13:12 +0800)]

[LV] Add cost model for simd vector select instructions of type float

For simd vector selects, use cmeq + bsl for v2f32/v4f32/v2f64, so their cost is cheap.
Fix https://github.com/llvm/llvm-project/issues/63082

Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D152523

commit | commitdiff | tree

Bing1 Yu [Mon, 19 Jun 2023 07:57:01 +0000 (15:57 +0800)]

[X86][AMX] Set Stride to Tile's Col when combining amxcast and store into tilestore

%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
%vec = call <256 x i8> @llvm.x86.cast.tile.to.vector.v256i8(x86_amx...%tile)
store <256 x i8> %vec, <256 x i8>* %dst_ptr, align 256
=>
%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
%stride = sext i16 32 to i64
call void @llvm.x86.tilestored64.internal(i16 8, i16 32, i8* %dst_ptr, i64 32, x86_amx %tile)

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D153002

commit | commitdiff | tree

Fangrui Song [Tue, 20 Jun 2023 03:38:16 +0000 (20:38 -0700)]

[XRay] Make llvm.xray.customevent parameter type match __xray_customevent

The intrinsic has a smaller integer type than the parameter type of the
builtin function/API. Fix this similarly to commit 3fa3cb408d8d0f1365b322262e501b6945f7ead9.

commit | commitdiff | tree

Fangrui Song [Tue, 20 Jun 2023 03:28:39 +0000 (20:28 -0700)]

[XRay] Make llvm.xray.typedevent parameter type match __xray_typedevent

The Clang built-in function is void __xray_typedevent(size_t, const void *, size_t),
but the LLVM intrinsic has smaller integer types. Since we only allow
64-bit ELF/Mach-O targets, we can change llvm.xray.typedevent to
i64/ptr/i64.

This allows encoding more information and avoids i16 legalization for
many non-X86 targets.

fdrLoggingHandleTypedEvent only supports uint16_t event type.

commit | commitdiff | tree

Kazu Hirata [Tue, 20 Jun 2023 03:02:47 +0000 (20:02 -0700)]

[BOLT] Fix a warning in release builds

This patch fixes:

bolt/lib/Core/BinarySection.cpp:120:24: error: unused variable
'Relocation' [-Werror,-Wunused-variable]

commit | commitdiff | tree

Vladislav Dzhidzhoev [Mon, 19 Jun 2023 14:42:05 +0000 (16:42 +0200)]

Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" (2)

Test "local-type-as-template-parameter.ll" is now enabled only for
x86_64.

Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006

Depends on D144005

commit | commitdiff | tree

Vladislav Dzhidzhoev [Mon, 19 Jun 2023 23:54:48 +0000 (01:54 +0200)]

Revert "Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)""

This reverts commit 2da45172c4bcd42f704c57c656926f56f32fc5ce.
Test local-type-as-template-parameter.ll fails on ppc64-aix.

commit | commitdiff | tree

Fangrui Song [Mon, 19 Jun 2023 22:11:26 +0000 (15:11 -0700)]

[XRay][X86] Remove sled version 0 support from patchCustomEvent

This is a remnant left over after D140739.

commit | commitdiff | tree

Fangrui Song [Mon, 19 Jun 2023 21:53:22 +0000 (14:53 -0700)]

[xray][test] Test __xray_typedevent after D43668

commit | commitdiff | tree

Scott Linder [Wed, 17 May 2023 21:33:09 +0000 (21:33 +0000)]

[DebugInfo] Add DW_OP_LLVM_user extension point

The extension codespace for DWARF expressions (DW_OP_LLVM_{lo,hi}_user)
has shrunk over time, as no extension is ever "retired" in practice. To
facilitate future extensions, this patch reserves one open opcode as an extension
point (0xfe), which is followed by a ULEB128-encoded SubOperation, and
then by the subop's operands.

There is some prior art, namely DW_OP_AARCH64_operation
(see https://github.com/ARM-software/abi-aa/blob/edd7460d87493fff124b8b5713acf71ffc06ee91/aadwarf64/aadwarf64.rst#45dwarf-expression-operations).

This version makes some different tradeoffs, opting to use a ULEB128 for
the subop encoding for future-proofing.
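
The encoding described above (opcode 0xfe, then a ULEB128 sub-operation number, then the subop's operands) can be sketched as a small decoder. This is only an illustration under stated assumptions: `UserOp`, `decodeULEB128`, `parseUserOp`, and `kUserOpcode` are hypothetical names for this sketch, not the identifiers used in the patch or in LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type for this sketch; not an LLVM type.
struct UserOp {
  uint64_t SubOp;       // ULEB128-decoded sub-operation number
  size_t OperandOffset; // index of the first byte after the subop
};

// The extension-point opcode named in the commit message.
constexpr uint8_t kUserOpcode = 0xfe;

// Decode an unsigned LEB128 value starting at Pos and advance Pos past it.
// Returns std::nullopt on truncated input or on encodings longer than ten
// bytes. High bits of a tenth byte that do not fit in 64 bits are silently
// dropped in this sketch.
std::optional<uint64_t> decodeULEB128(const std::vector<uint8_t> &Bytes,
                                      size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < Bytes.size()) {
    if (Shift >= 64)
      return std::nullopt;
    uint8_t Byte = Bytes[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0) // high bit clear: last byte of the value
      return Value;
    Shift += 7;
  }
  return std::nullopt; // ran out of bytes mid-value
}

// Parse "0xfe <ULEB128 subop>" at Pos. Operand decoding is left to the
// caller, since the operand layout depends on the subop.
std::optional<UserOp> parseUserOp(const std::vector<uint8_t> &Expr,
                                  size_t Pos) {
  if (Pos >= Expr.size() || Expr[Pos] != kUserOpcode)
    return std::nullopt;
  ++Pos;
  std::optional<uint64_t> SubOp = decodeULEB128(Expr, Pos);
  if (!SubOp)
    return std::nullopt;
  return UserOp{*SubOp, Pos};
}
```

Because the subop is variable-length, small subop numbers cost a single byte while the number space stays effectively unbounded, which is the future-proofing tradeoff mentioned above.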

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D147271

commit | commitdiff | tree

Jay Foad [Mon, 19 Jun 2023 15:54:17 +0000 (16:54 +0100)]

[AMDGPU] Do not release VGPRs if there may be pending scratch stores

Differential Revision: https://reviews.llvm.org/D153295

commit | commitdiff | tree

Jay Foad [Mon, 19 Jun 2023 20:08:35 +0000 (21:08 +0100)]

[AMDGPU] Remove unused macro CNT_MASK

commit | commitdiff | tree

Fangrui Song [Mon, 19 Jun 2023 19:48:33 +0000 (12:48 -0700)]

[Driver] Correct -fnoxray-link-deps to -fno-xray-link-deps

Also remove the unused CC1Option, change -whole-archive to the canonical
spelling, and improve tests.

commit | commitdiff | tree

Scott Linder [Mon, 19 Jun 2023 17:13:08 +0000 (17:13 +0000)]

[DebugInfo] Support more than 2 operands in DWARF operations

Update DWARFExpression::Operation and LVOperation to support more than
2 operands.

Take the opportunity to use a SmallVector, which will handle at least 2
operands without allocation anyway, and removes the static limit
completely.

As there is no longer the concept of an "unused operand", remove
Operation::Encoding::SizeNA. Any use of it is now replaced with explicit
checks for how many operands an operation has.
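
A minimal sketch of the idea, assuming a simplified description type: `OpDescription`, `hasOperand`, and `encodedSize` are hypothetical names for illustration, and `std::vector` stands in for the `SmallVector` the patch uses. It is not the actual `DWARFExpression::Operation` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an operation description; not the LLVM class.
struct OpDescription {
  uint8_t Opcode;
  // One entry per operand, giving its encoded size in bytes. With a
  // variable-length list there is no "unused operand" placeholder.
  std::vector<unsigned> OperandSizes;
};

// Explicit operand-count check, replacing a comparison against a sentinel.
bool hasOperand(const OpDescription &D, size_t Index) {
  return Index < D.OperandSizes.size();
}

// Encoded size: one opcode byte plus every operand, however many exist.
unsigned encodedSize(const OpDescription &D) {
  unsigned Size = 1;
  for (unsigned OperandSize : D.OperandSizes)
    Size += OperandSize;
  return Size;
}
```

With this shape, an operation with three operands needs no special casing: `hasOperand(D, 2)` is simply true for it and false for a two-operand operation.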

There are still places where the limit remains 2, namely in the
DWARFLinker and in DIExpressions, but these can be updated in later
patches as needed.

There are no explicit tests as this is nearly NFC: no new operation is
added which makes use of the additional operand capacity yet. A future
patch adding a new DWARF extension point will include operations which
require the support.

Reviewed By: Orlando, CarlosAlbertoEnciso

Differential Revision: https://reviews.llvm.org/D147270

commit | commitdiff | tree

Siva Chandra Reddy [Fri, 16 Jun 2023 23:23:33 +0000 (23:23 +0000)]

[libc] Remove the requirement of a platform-flush operation in File abstraction.

The libc flush operation is not supposed to trigger a platform level
flush operation. See "Notes" on this Linux man page:
https://man7.org/linux/man-pages/man3/fflush.3.html
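
The distinction the man page draws can be sketched as follows, assuming a POSIX system. `flushToKernel` and `flushToStorage` are hypothetical helper names for illustration and are not part of LLVM's libc.

```cpp
#include <cassert>
#include <cstdio>
#include <stdio.h>  // POSIX: fileno
#include <unistd.h> // POSIX: fsync

// fflush only drains the C library's user-space buffer into the kernel.
// It makes no promise that the data has reached the storage device.
bool flushToKernel(std::FILE *F) { return std::fflush(F) == 0; }

// A caller that needs durability must ask for it separately: first drain
// the stdio buffer, then ask the kernel to write its own buffers out.
bool flushToStorage(std::FILE *F) {
  if (std::fflush(F) != 0)
    return false;
  return ::fsync(::fileno(F)) == 0;
}
```

Keeping the two steps separate matches the commit's point: the platform-level flush is the caller's decision, not something the stream's flush operation performs implicitly.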

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153182

commit | commitdiff | tree

Vladislav Dzhidzhoev [Mon, 19 Jun 2023 14:42:05 +0000 (16:42 +0200)]

Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)"

Test "local-type-as-template-parameter.ll" now requires linux-system.

Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006

Depends on D144005

commit | commitdiff | tree

Louis Dionne [Mon, 19 Jun 2023 17:43:17 +0000 (13:43 -0400)]

[libc++] Add missing 'return 0' from main functions in tests

commit | commitdiff | tree

Ellis Hoag [Mon, 19 Jun 2023 17:39:13 +0000 (10:39 -0700)]

[SpecialCaseList] Remove TrigramIndex

`TrigramIndex` was added back in https://reviews.llvm.org/D27188 as an optimization to make `SpecialCaseList::match()` faster. I've found that `TrigramIndex` actually makes the function slower and it has no functional use, so we can remove it.

I grabbed the list of queries passed to `SpecialCaseList::match()` on a random very large file (`AArch64ISelLowering.cpp`) and measured the runtime to call `match()` on all of them with [this line](https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/lib/Support/SpecialCaseList.cpp#L64) disabled and then enabled.

```
$ hyperfine --warmup 3 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests' 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests'
Benchmark 1: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests
  Time (mean ± σ):     575.9 ms ±  20.3 ms    [User: 573.1 ms, System: 2.7 ms]
  Range (min … max):   555.5 ms … 620.0 ms    10 runs

Benchmark 2: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests
  Time (mean ± σ):     283.4 ms ±   6.7 ms    [User: 280.3 ms, System: 3.0 ms]
  Range (min … max):   277.0 ms … 294.9 ms    10 runs

Summary
  'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests' ran
    2.03 ± 0.09 times faster than 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests'
```

Using `perf` I found that most of the runtime in `TrigramIndex::isDefinitelyOut()` comes from a division operation that seems to come from `std::unordered_map`: https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/include/llvm/Support/TrigramIndex.h#L62

Removing `TrigramIndex` will make it easier to potentially switch to using `GlobPattern` instead of a full regex for `SpecialCaseList`. See discussion in https://reviews.llvm.org/D152762 for details.

Reviewed By: MaskRay, #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D153171

commit | commitdiff | tree

Amy Kwan [Sat, 17 Jun 2023 05:33:38 +0000 (00:33 -0500)]

[AIX][TLS] Generate 64-bit local-exec access code sequence

This patch adds support for the TLS local-exec access model on AIX so that
the 64-bit (specifically, non-optimized) code sequence can be generated.

For this patch in particular, the sequence that is generated involves a load of the
variable offset, followed by an add of the loaded variable offset to r13 (which
contains the thread pointer). This code sequence looks like the following:
```
ld reg1,var[TC](2)
add reg2, reg1, r13 // r13 contains the thread pointer
```
The TOC (.tc pseudo-op) entries generated in the assembly files are also
changed where we add the @le relocation for the variable offset.

Differential Revision: https://reviews.llvm.org/D149722

Domain: System / Toolchain;
