platform/upstream/llvm.git
16 months ago[AMDGPU] Drop GFX11 runs for dagcombine-fma-fmad.ll and fma.f16.ll.
Ivan Kosarev [Tue, 20 Jun 2023 10:29:40 +0000 (11:29 +0100)]
[AMDGPU] Drop GFX11 runs for dagcombine-fma-fmad.ll and fma.f16.ll.

They cause failures on the llvm-clang-x86_64-expensive-checks-debian
buildbot.

This partially reverts
D153269 [AMDGPU][GFX11] Add test coverage for FMA instructions.

16 months ago[clang][index] Fix cast warning
Jan Svoboda [Tue, 20 Jun 2023 10:21:59 +0000 (12:21 +0200)]
[clang][index] Fix cast warning

This is a follow-up to D151938 that should fix GCC's -Wcast-qual warning.

16 months ago[CodeGen][test] Add missing `REQUIRES`.
Francesco Petrogalli [Tue, 20 Jun 2023 09:56:03 +0000 (11:56 +0200)]
[CodeGen][test] Add missing `REQUIRES`.

Differential Revision: https://reviews.llvm.org/D153325

16 months ago[AMDGPU][GFX11] Add test coverage for FMA instructions.
Ivan Kosarev [Tue, 20 Jun 2023 09:32:12 +0000 (10:32 +0100)]
[AMDGPU][GFX11] Add test coverage for FMA instructions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153269

16 months ago[llc][MISched] Add `-misched-detail-resource-booking` to llc.
Francesco Petrogalli [Tue, 20 Jun 2023 09:42:01 +0000 (11:42 +0200)]
[llc][MISched] Add `-misched-detail-resource-booking` to llc.

The option `-misched-detail-resource-booking` prints the following
information every time the method
`SchedBoundary::getNextResourceCycle` is invoked:

1. counters of the resources that have already been booked;

2. the value returned by `getNextResourceCycle`, which is the next
available cycle in which a resource can be booked.

The option is useful for debugging low-level checks inside the machine
scheduler that make decisions based on the values returned by
`getNextResourceCycle`.

Reviewed By: andreadb

Differential Revision: https://reviews.llvm.org/D153116

16 months agoRevert "[llc][MISched] Add `-misched-detail-resource-booking` to llc."
Francesco Petrogalli [Tue, 20 Jun 2023 09:28:45 +0000 (11:28 +0200)]
Revert "[llc][MISched] Add `-misched-detail-resource-booking` to llc."

Reverting because of https://lab.llvm.org/buildbot#builders/75/builds/32485:

llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:2374:7: error: use of undeclared identifier 'MischedDetailResourceBooking'
 if (MischedDetailResourceBooking)

This reverts commit fc06262c1c365777e71207b6a5de281cba927c96.

16 months ago[llc][MISched] Add `-misched-detail-resource-booking` to llc.
Francesco Petrogalli [Tue, 20 Jun 2023 09:13:39 +0000 (11:13 +0200)]
[llc][MISched] Add `-misched-detail-resource-booking` to llc.

The option `-misched-detail-resource-booking` prints the following
information every time the method
`SchedBoundary::getNextResourceCycle` is invoked:

1. counters of the resources that have already been booked;

2. the value returned by `getNextResourceCycle`, which is the next
available cycle in which a resource can be booked.

The option is useful for debugging low-level checks inside the machine
scheduler that make decisions based on the values returned by
`getNextResourceCycle`.

Reviewed By: andreadb

Differential Revision: https://reviews.llvm.org/D153116

16 months ago[mlir][transform] Add TransformRewriter
Matthias Springer [Tue, 20 Jun 2023 08:48:40 +0000 (10:48 +0200)]
[mlir][transform] Add TransformRewriter

All `apply` functions now have a `TransformRewriter &` parameter. This rewriter should be used to modify the IR. It has a `TrackingListener` attached and updates the internal handle-payload mappings based on rewrites.

Implementations no longer need to create their own `TrackingListener` and `IRRewriter`. Error checking is integrated into `applyTransform`. Tracking listener errors are reported only for ops with the `ReportTrackingListenerFailuresOpTrait` trait attached, allowing for a gradual migration. Furthermore, errors can be silenced with an op attribute.

Additional API will be added to `TransformRewriter` in subsequent revisions. This revision just adds an "empty" `TransformRewriter` class and updates all `apply` implementations.

Differential Revision: https://reviews.llvm.org/D152427

16 months ago[SCEVNormalization] Short circuit case with no loops (NFC)
Nikita Popov [Tue, 20 Jun 2023 07:33:49 +0000 (09:33 +0200)]
[SCEVNormalization] Short circuit case with no loops (NFC)

If there are no post-inc loops, normalization is a no-op. Don't
bother rewriting the SCEV in that case.

16 months ago[AMDGPU] Document amdgpu_cs_chain[_preserve] CCs. NFC
Diana Picus [Wed, 7 Jun 2023 12:42:02 +0000 (14:42 +0200)]
[AMDGPU] Document amdgpu_cs_chain[_preserve] CCs. NFC

Co-authored-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Differential Revision: https://reviews.llvm.org/D151997

16 months ago[AMDGPU] Start documenting calling conventions. NFC
Diana Picus [Wed, 7 Jun 2023 11:52:36 +0000 (13:52 +0200)]
[AMDGPU] Start documenting calling conventions. NFC

Add a section to AMDGPUUsage.rst about calling conventions and list the
ones from the CallingConv enum. Full descriptions can come later (help
appreciated).

Differential Revision: https://reviews.llvm.org/D151996

16 months ago[llvm-profdata] Fix llvm-profdata help and make sure it remains in sync
serge-sans-paille [Mon, 19 Jun 2023 16:35:03 +0000 (18:35 +0200)]
[llvm-profdata] Fix llvm-profdata help and make sure it remains in sync

This makes the new `order` subcommand part of the help.
As a side effect, also make llvm::map_range compatible with plain
arrays.

Differential Revision: https://reviews.llvm.org/D153303

16 months ago[clang][DebugInfo] Emit DW_AT_deleted on any deleted member function
Michael Buch [Mon, 19 Jun 2023 12:59:32 +0000 (13:59 +0100)]
[clang][DebugInfo] Emit DW_AT_deleted on any deleted member function

Currently we emit `DW_AT_deleted` for `deleted` special-member
functions (i.e., ctors/dtors). However, in C++ one can mark any
member function as deleted. This patch expands the set of member
functions for which we emit `DW_AT_deleted`.

The DWARFv5 spec section 5.7.8 says:
```
<non-normative>
In C++, a member function may be declared as deleted. This prevents the compiler from
generating a default implementation of a special member function such as a constructor
or destructor, and can affect overload resolution when used on other member functions.
</non-normative>

If the member function entry has been declared as deleted, then that entry has a
DW_AT_deleted attribute.
```

Thus this change is conforming.

Differential Revision: https://reviews.llvm.org/D153282
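
As an illustration of the distinction drawn above, here is a small sketch of a class with both kinds of deleted member function. The `Widget` type and its members are hypothetical and are not taken from the patch or its tests.

```cpp
#include <type_traits>

struct Widget {
  Widget() = default;

  // Deleted special member function: the case described as already covered.
  Widget(const Widget &) = delete;

  // Ordinary member function with one deleted overload: the case this
  // patch extends DW_AT_deleted to.
  void resize(int) {}
  void resize(double) = delete;
};

// The deleted copy constructor is visible through the type traits.
static_assert(!std::is_copy_constructible<Widget>::value,
              "copy constructor is deleted");
```

Calling `resize(3)` selects the `int` overload, while `resize(1.5)` would select the deleted `double` overload and be rejected at compile time.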

16 months ago[docs] Fix GEP faq references to undefined behavior [NFC]
Nuno Lopes [Tue, 20 Jun 2023 08:07:08 +0000 (09:07 +0100)]
[docs] Fix GEP faq references to undefined behavior [NFC]

16 months ago[CSKY] Optimize multiplication with immediates
Ben Shi [Mon, 19 Jun 2023 09:14:41 +0000 (17:14 +0800)]
[CSKY] Optimize multiplication with immediates

Try to break a multiplication by a specific immediate into an
addition or subtraction of left shifts.

Reviewed By: zixuan-wu

Differential Revision: https://reviews.llvm.org/D153106
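
A minimal sketch of the arithmetic identity behind this kind of rewrite. Which immediates the CSKY backend actually handles is not stated here, so the constants 9, 7 and 10 and the function names below are illustrative assumptions only.

```cpp
#include <cstdint>

// x * (2^n + 1) as one shift and one add: x * 9 == (x << 3) + x.
constexpr uint32_t mulBy9(uint32_t X) { return (X << 3) + X; }

// x * (2^n - 1) as one shift and one subtract: x * 7 == (x << 3) - x.
constexpr uint32_t mulBy7(uint32_t X) { return (X << 3) - X; }

// x * (2^m + 2^n) as two shifts and one add: x * 10 == (x << 3) + (x << 1).
constexpr uint32_t mulBy10(uint32_t X) { return (X << 3) + (X << 1); }

// Unsigned arithmetic wraps modulo 2^32, so the identities also hold when
// the product overflows.
static_assert(mulBy9(5) == 45, "9 * 5");
static_assert(mulBy7(5) == 35, "7 * 5");
static_assert(mulBy10(5) == 50, "10 * 5");
static_assert(mulBy7(0xFFFFFFFFu) == 0xFFFFFFFFu * 7u, "wraps like a multiply");
```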

16 months ago[CSKY][test][NFC] Add more tests of multiplication with immediates
Ben Shi [Mon, 19 Jun 2023 09:12:25 +0000 (17:12 +0800)]
[CSKY][test][NFC] Add more tests of multiplication with immediates

Reviewed By: zixuan-wu

Differential Revision: https://reviews.llvm.org/D153105

16 months agoRevert "[AMDGPU] Start documenting calling conventions. NFC"
Diana Picus [Tue, 20 Jun 2023 07:54:40 +0000 (09:54 +0200)]
Revert "[AMDGPU] Start documenting calling conventions. NFC"

This reverts commit aa7b127cb7314e326457d7f790d36db1cb74f63c.

...because I really ought to install sphinx.

16 months agoRevert "Prevent deadlocks in death tests."
Martin Braenne [Tue, 20 Jun 2023 07:15:31 +0000 (07:15 +0000)]
Revert "Prevent deadlocks in death tests."

This reverts commit dfbcee286b9b96751014ebc5ba5290e42796be37.

This was causing unit tests to fail on Gentoo, see comments on
https://reviews.llvm.org/D152696.

16 months agoFixup D151996
Diana Picus [Tue, 20 Jun 2023 07:37:16 +0000 (09:37 +0200)]
Fixup D151996

16 months ago[AMDGPU] Start documenting calling conventions. NFC
Diana Picus [Wed, 7 Jun 2023 11:52:36 +0000 (13:52 +0200)]
[AMDGPU] Start documenting calling conventions. NFC

Add a section to AMDGPUUsage.rst about calling conventions and list the
ones from the CallingConv enum. Full descriptions can come later (help
appreciated).

Differential Revision: https://reviews.llvm.org/D151996

16 months ago[clangd] Index the type of a non-type template parameter
Nathan Ridge [Tue, 20 Jun 2023 06:57:36 +0000 (02:57 -0400)]
[clangd] Index the type of a non-type template parameter

Fixes https://github.com/clangd/clangd/issues/1666

Differential Revision: https://reviews.llvm.org/D153251

16 months ago[include-cleaner] Bailout on invalid code for the command-line tool
Haojian Wu [Mon, 19 Jun 2023 12:57:09 +0000 (14:57 +0200)]
[include-cleaner] Bailout on invalid code for the command-line tool

The binary tool only works on compilable source code; if the source code
does not compile, don't perform any analysis or edits.

Differential Revision: https://reviews.llvm.org/D153271

16 months ago[mlir][transform] Add ApplyRegisteredPassOp transform op
Matthias Springer [Tue, 20 Jun 2023 06:54:49 +0000 (08:54 +0200)]
[mlir][transform] Add ApplyRegisteredPassOp transform op

This transform op runs a pass on the target op.

Differential Revision: https://reviews.llvm.org/D153143

16 months ago[tools] Use llvm::is_contained (NFC)
Kazu Hirata [Tue, 20 Jun 2023 06:36:14 +0000 (23:36 -0700)]
[tools] Use llvm::is_contained (NFC)

16 months ago[xray][AArch64] Rewrite trampoline
Fangrui Song [Tue, 20 Jun 2023 06:02:45 +0000 (23:02 -0700)]
[xray][AArch64] Rewrite trampoline

Optimize (cmp+beq => cbz), deduplicate code (SAVE_REGISTERS/RESTORE_REGISTERS),
improve portability (use ASM_SYMBOL to be compatible with Mach-O), and fix style
issues.
Also, port D37965 (x86 tail call) to __xray_FunctionTailExit.

16 months ago[lldb] Make the test for D153043 linux-only
Jaroslav Sevcik [Tue, 20 Jun 2023 05:56:16 +0000 (07:56 +0200)]
[lldb] Make the test for D153043 linux-only

16 months ago[lldb] Make test for D153043 independent of external symbols
Jaroslav Sevcik [Tue, 20 Jun 2023 05:19:37 +0000 (07:19 +0200)]
[lldb] Make test for D153043 independent of external symbols

This removes dependence on the libc abort function.

16 months ago[LV] Add cost model for simd vector select instructions of type float
Zhongyunde [Tue, 20 Jun 2023 05:12:02 +0000 (13:12 +0800)]
[LV] Add cost model for simd vector select instructions of type float

For SIMD vector selects, use cmeq + bsl for v2f32/v4f32/v2f64, so their cost is cheap.
Fixes https://github.com/llvm/llvm-project/issues/63082

Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D152523

16 months ago[X86][AMX] set Stride to Tile's Col when doing combine amxcast and store into tilestore
Bing1 Yu [Mon, 19 Jun 2023 07:57:01 +0000 (15:57 +0800)]
[X86][AMX] set Stride to Tile's Col when doing combine amxcast and store into tilestore

%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
%vec = call <256 x i8> @llvm.x86.cast.tile.to.vector.v256i8(x86_amx...%tile)
store <256 x i8> %vec, <256 x i8>* %dst_ptr, align 256
=>
%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
%stride = sext i16 32 to i64
call void @llvm.x86.tilestored64.internal(i16 8, i16 32, i8* %dst_ptr, i64 32, x86_amx %tile)

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D153002

16 months ago[XRay] Make llvm.xray.customevent parameter type match __xray_customevent
Fangrui Song [Tue, 20 Jun 2023 03:38:16 +0000 (20:38 -0700)]
[XRay] Make llvm.xray.customevent parameter type match __xray_customevent

The intrinsic has a smaller integer type than the parameter type of the
builtin function/API. Fix this similarly to commit 3fa3cb408d8d0f1365b322262e501b6945f7ead9.

16 months ago[XRay] Make llvm.xray.typedevent parameter type match __xray_typedevent
Fangrui Song [Tue, 20 Jun 2023 03:28:39 +0000 (20:28 -0700)]
[XRay] Make llvm.xray.typedevent parameter type match __xray_typedevent

The Clang built-in function is void __xray_typedevent(size_t, const void *, size_t),
but the LLVM intrinsic has smaller integer types. Since we only allow
64-bit ELF/Mach-O targets, we can change llvm.xray.typedevent to
i64/ptr/i64.

This allows encoding more information and avoids i16 legalization for
many non-X86 targets.

fdrLoggingHandleTypedEvent only supports uint16_t event type.

16 months ago[BOLT] Fix a warning in release builds
Kazu Hirata [Tue, 20 Jun 2023 03:02:47 +0000 (20:02 -0700)]
[BOLT] Fix a warning in release builds

This patch fixes:

  bolt/lib/Core/BinarySection.cpp:120:24: error: unused variable
  'Relocation' [-Werror,-Wunused-variable]

16 months agoReland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block...
Vladislav Dzhidzhoev [Mon, 19 Jun 2023 14:42:05 +0000 (16:42 +0200)]
Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" (2)

Test "local-type-as-template-parameter.ll" is now enabled only for
x86_64.

Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006

Depends on D144005

16 months agoRevert "Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical...
Vladislav Dzhidzhoev [Mon, 19 Jun 2023 23:54:48 +0000 (01:54 +0200)]
Revert "Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)""

This reverts commit 2da45172c4bcd42f704c57c656926f56f32fc5ce.
Test local-type-as-template-parameter.ll fails on ppc64-aix.

16 months ago[XRay][X86] Remove sled version 0 support from patchCustomEvent
Fangrui Song [Mon, 19 Jun 2023 22:11:26 +0000 (15:11 -0700)]
[XRay][X86] Remove sled version 0 support from patchCustomEvent

This is a remnant after D140739.

16 months ago[xray][test] Test __xray_typedevent after D43668
Fangrui Song [Mon, 19 Jun 2023 21:53:22 +0000 (14:53 -0700)]
[xray][test] Test __xray_typedevent after D43668

16 months ago[DebugInfo] Add DW_OP_LLVM_user extension point
Scott Linder [Wed, 17 May 2023 21:33:09 +0000 (21:33 +0000)]
[DebugInfo] Add DW_OP_LLVM_user extension point

The extension codespace for DWARF expressions (DW_OP_LLVM_{lo,hi}_user)
has shrunk over time, as no extension is ever "retired" in practice. To
facilitate future extensions, this patch reserves one open opcode as an extension
point (0xfe), which is followed by a ULEB128-encoded SubOperation, and
then by the subop's operands.

There is some prior art, namely DW_OP_AARCH64_operation
(see https://github.com/ARM-software/abi-aa/blob/edd7460d87493fff124b8b5713acf71ffc06ee91/aadwarf64/aadwarf64.rst#45dwarf-expression-operations).

This version makes some different tradeoffs, opting to use a ULEB128 for
the subop encoding for future-proofing.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D147271
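
The byte layout described above (opcode 0xfe, then a ULEB128-encoded subop, then the subop's operands) can be sketched as follows. The helper names `encodeULEB128` and `encodeUserOp` are hypothetical and this is not the LLVM implementation. The subop numbers used in examples are arbitrary, since the message does not define any.

```cpp
#include <cstdint>
#include <vector>

// Standard unsigned LEB128: 7 payload bits per byte, least significant group
// first, with the high bit set on every byte except the last.
std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = static_cast<uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// The extension-point opcode named in the commit message.
constexpr uint8_t ExtensionOpcode = 0xfe;

// Lay out one extension operation: opcode, ULEB128 subop, then the already
// encoded operand bytes of that subop.
std::vector<uint8_t> encodeUserOp(uint64_t SubOp,
                                  const std::vector<uint8_t> &OperandBytes) {
  std::vector<uint8_t> Out{ExtensionOpcode};
  std::vector<uint8_t> Sub = encodeULEB128(SubOp);
  Out.insert(Out.end(), Sub.begin(), Sub.end());
  Out.insert(Out.end(), OperandBytes.begin(), OperandBytes.end());
  return Out;
}
```

Because the subop is variable-length, small subop numbers cost a single byte while the numbering space stays effectively unbounded, which matches the future-proofing motivation given above.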

16 months ago[AMDGPU] Do not release VGPRs if there may be pending scratch stores
Jay Foad [Mon, 19 Jun 2023 15:54:17 +0000 (16:54 +0100)]
[AMDGPU] Do not release VGPRs if there may be pending scratch stores

Differential Revision: https://reviews.llvm.org/D153295

16 months ago[AMDGPU] Remove unused macro CNT_MASK
Jay Foad [Mon, 19 Jun 2023 20:08:35 +0000 (21:08 +0100)]
[AMDGPU] Remove unused macro CNT_MASK

16 months ago[Driver] Correct -fnoxray-link-deps to -fno-xray-link-deps
Fangrui Song [Mon, 19 Jun 2023 19:48:33 +0000 (12:48 -0700)]
[Driver] Correct -fnoxray-link-deps to -fno-xray-link-deps

Also remove the unused CC1Option, change -whole-archive to the canonical
spelling, and improve tests.

16 months ago[DebugInfo] Support more than 2 operands in DWARF operations
Scott Linder [Mon, 19 Jun 2023 17:13:08 +0000 (17:13 +0000)]
[DebugInfo] Support more than 2 operands in DWARF operations

Update DWARFExpression::Operation and LVOperation to support more than
2 operands.

Take the opportunity to use a SmallVector, which will handle at least 2
operands without allocation anyway, and removes the static limit
completely.

As there is no longer the concept of an "unused operand", remove
Operation::Encoding::SizeNA. Any use of it is now replaced with explicit
checks for how many operands an operation has.

There are still places where the limit remains 2, namely in the
DWARFLinker and in DIExpressions, but these can be updated in later
patches as needed.

There are no explicit tests as this is nearly NFC: no new operation is
added which makes use of the additional operand capacity yet. A future
patch adding a new DWARF extension point will include operations which
require the support.

Reviewed By: Orlando, CarlosAlbertoEnciso

Differential Revision: https://reviews.llvm.org/D147270

16 months ago[libc] Remove the requirement of a platform-flush operation in File abstraction.
Siva Chandra Reddy [Fri, 16 Jun 2023 23:23:33 +0000 (23:23 +0000)]
[libc] Remove the requirement of a platform-flush operation in File abstraction.

The libc flush operation is not supposed to trigger a platform level
flush operation. See "Notes" on this Linux man page:
    https://man7.org/linux/man-pages/man3/fflush.3.html

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153182

16 months agoReland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block...
Vladislav Dzhidzhoev [Mon, 19 Jun 2023 14:42:05 +0000 (16:42 +0200)]
Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)"

Test "local-type-as-template-parameter.ll" now requires linux-system.

Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006

Depends on D144005

16 months ago[libc++] Add missing 'return 0' from main functions in tests
Louis Dionne [Mon, 19 Jun 2023 17:43:17 +0000 (13:43 -0400)]
[libc++] Add missing 'return 0' from main functions in tests

16 months ago[SpecialCaseList] Remove TrigramIndex
Ellis Hoag [Mon, 19 Jun 2023 17:39:13 +0000 (10:39 -0700)]
[SpecialCaseList] Remove TrigramIndex

`TrigramIndex` was added back in https://reviews.llvm.org/D27188 as an optimization to make `SpecialCaseList::match()` faster. I've found that `TrigramIndex` actually makes the function slower and it has no functional use, so we can remove it.

I grabbed the list of queries passed to `SpecialCaseList::match()` on a random very large file (`AArch64ISelLowering.cpp`) and measured the runtime to call `match()` on all of them with [this line](https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/lib/Support/SpecialCaseList.cpp#L64) disabled and then enabled.

```
$ hyperfine --warmup 3 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests' 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests'
Benchmark 1: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests
  Time (mean ± σ):     575.9 ms ±  20.3 ms    [User: 573.1 ms, System: 2.7 ms]
  Range (min … max):   555.5 ms … 620.0 ms    10 runs

Benchmark 2: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests
  Time (mean ± σ):     283.4 ms ±   6.7 ms    [User: 280.3 ms, System: 3.0 ms]
  Range (min … max):   277.0 ms … 294.9 ms    10 runs

Summary
  'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests' ran
    2.03 ± 0.09 times faster than 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests'
```

Using `perf` I found that most of the runtime in `TrigramIndex::isDefinitelyOut()` comes from a division operation that seems to come from `std::unordered_map`: https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/include/llvm/Support/TrigramIndex.h#L62

Removing `TrigramIndex` will make it easier to potentially switch to using `GlobPattern` instead of a full regex for `SpecialCaseList`. See discussion in https://reviews.llvm.org/D152762 for details.

Reviewed By: MaskRay, #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D153171

16 months ago[AIX][TLS] Generate 64-bit local-exec access code sequence
Amy Kwan [Sat, 17 Jun 2023 05:33:38 +0000 (00:33 -0500)]
[AIX][TLS] Generate 64-bit local-exec access code sequence

This patch adds support for the TLS local-exec access model on AIX, enabling
generation of the 64-bit (specifically, non-optimized) code sequence.

For this patch in particular, the generated sequence loads the variable
offset and then adds it to r13 (which holds the thread pointer). The code
sequence looks like the following:
```
ld reg1,var[TC](2)
add reg2, reg1, r13     // r13 contains the thread pointer
```
The TOC (.tc pseudo-op) entries generated in the assembly files are also
changed to add the @le relocation for the variable offset.

Differential Revision: https://reviews.llvm.org/D149722

16 months ago[AIX][TLS] Relax front end diagnostics to accept the local-exec TLS model
Amy Kwan [Sat, 17 Jun 2023 05:31:35 +0000 (00:31 -0500)]
[AIX][TLS] Relax front end diagnostics to accept the local-exec TLS model

This patch relaxes the front end AIX diagnostics added in D102070 to accept the
local-exec TLS model, as we plan to support this model in a series of future patches.

The diagnostics are relaxed when local-exec is used as a compiler option to
`-ftls-model=*` and in the `__attribute__((tls_model("local-exec")))` attribute.

Differential Revision: https://reviews.llvm.org/D149596
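
For reference, a small sketch of the attribute spelling mentioned above, using the GNU attribute syntax accepted by GCC and Clang. The variable and function names are hypothetical. The same model can be requested for a whole translation unit with `-ftls-model=local-exec`.

```cpp
// Thread-local counter that asks for the local-exec TLS model explicitly.
thread_local int Counter __attribute__((tls_model("local-exec"))) = 0;

// Each thread sees and increments its own copy of Counter.
int bump() { return ++Counter; }
```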

16 months agoRevert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block...
Vladislav Dzhidzhoev [Mon, 19 Jun 2023 17:16:13 +0000 (19:16 +0200)]
Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)"

This reverts commit 66511b401042f28c74d2ded3aac76d19a53bd7c4.
llvm/test/DebugInfo/Generic/local-type-as-template-parameter.ll is
broken.

16 months agoRecommit "[LSR] Consider post-inc form when creating extends/truncates."
Florian Hahn [Mon, 19 Jun 2023 16:57:05 +0000 (17:57 +0100)]
Recommit "[LSR] Consider post-inc form when creating extends/truncates."

This reverts the revert commit 1797ab36efc9c90c921cd725831f8c3f6a7125a2.

The recommitted version now checks the PostIncLoopSets for all fixups
and returns nullptr if the result doesn't match for all fixups.

16 months ago[AMDGPU] Add basic support for extended i8 perm matching
Jeffrey Byrnes [Mon, 8 May 2023 18:58:14 +0000 (11:58 -0700)]
[AMDGPU] Add basic support for extended i8 perm matching

Differential Revision: https://reviews.llvm.org/D142782

Change-Id: Ibb95224f7885839e8b77a705f487f10b47a258a6

16 months ago[mlir] Fix a rare use-after free in dialect loading
Benjamin Kramer [Mon, 19 Jun 2023 16:18:07 +0000 (18:18 +0200)]
[mlir] Fix a rare use-after free in dialect loading

applyExtensions can load further dialects, invalidating the reference to
the dialect pointer in the dialects DenseMap. Capture the pointer to
prevent that from happening.

16 months ago[gn build] Port eb7491769a51
LLVM GN Syncbot [Mon, 19 Jun 2023 16:13:28 +0000 (16:13 +0000)]
[gn build] Port eb7491769a51

16 months ago[AMDGPU] Reimplement the GFX11 early release VGPRs optimization
Jay Foad [Mon, 19 Jun 2023 14:39:45 +0000 (15:39 +0100)]
[AMDGPU] Reimplement the GFX11 early release VGPRs optimization

Implement this optimization in SIInsertWaitcnts, where we already have
information about whether there might be outstanding VMEM store
instructions. This has the following advantages:
- Correctly handles atomics-with-return.
- Correctly handles call instructions.
- Should be faster because it does not require running a separate pass.

Differential Revision: https://reviews.llvm.org/D153279

16 months ago[mlir][sparse][gpu] remove tuple as one of the spmm_buffer_size output type
Kun Wu [Mon, 19 Jun 2023 15:57:39 +0000 (15:57 +0000)]
[mlir][sparse][gpu] remove tuple as one of the spmm_buffer_size output type

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D153188

16 months ago[DAGCombiner][NFC] Factor out ByteProvider
Jeffrey Byrnes [Mon, 8 May 2023 18:43:15 +0000 (11:43 -0700)]
[DAGCombiner][NFC] Factor out ByteProvider

Differential Revision: https://reviews.llvm.org/D143018

Change-Id: I3dc03787a3382c0c3fe6b869f869c2946f450874

16 months ago[libc++][NFC] Sort header list in header_information.py
Louis Dionne [Mon, 19 Jun 2023 15:53:53 +0000 (11:53 -0400)]
[libc++][NFC] Sort header list in header_information.py

16 months agoAMDGPU: Remove amdgpu-waves-per-eu support in old attribute pass
Matt Arsenault [Fri, 10 Dec 2021 23:48:16 +0000 (18:48 -0500)]
AMDGPU: Remove amdgpu-waves-per-eu support in old attribute pass

AMDGPUAttributor now handles this attribute with value merging, so
delete the old approach which could only apply this to functions which
did not set it, or cloned the function.

16 months agoclang: Add __builtin_elementwise_round
Matt Arsenault [Thu, 5 Jan 2023 19:50:11 +0000 (14:50 -0500)]
clang: Add __builtin_elementwise_round

16 months agoValueTracking: Handle compare to nan and -inf constants in fcmpToClassTest
Matt Arsenault [Wed, 24 May 2023 11:47:57 +0000 (12:47 +0100)]
ValueTracking: Handle compare to nan and -inf constants in fcmpToClassTest

This will help enable a cleanup of simplifyFCmpInst

16 months ago[Hexagon] Fix range checks for immediate operands
Krzysztof Parzyszek [Mon, 19 Jun 2023 15:18:05 +0000 (08:18 -0700)]
[Hexagon] Fix range checks for immediate operands

The output assembly (textual) contains the instruction
  r29 = add(r29,#4294967136)
The value 4294967136 is -160 when interpreted as a signed 32-bit
integer, so it fits in the range of the immediate operand without
a constant extender. The range check in HexagonInstrInfo was putting
the operand value into an int variable, reporting no need for an
extender. This resulted in a packet with 4 instructions, including
the "add". The corresponding check in HexagonMCInstrInfo was using
an int64_t variable, causing the range check to fail, and an extender
to be emitted when lowering to MCInst, resulting in a packet with
too many instructions.
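
The mismatch can be reproduced in isolation with a short sketch. The `fitsSigned` helper and the 16-bit field width are assumptions made for illustration. They are not the Hexagon range-check code or the real width of this operand.

```cpp
#include <cstdint>

// Hypothetical range check: does Value fit in a signed field of Bits bits?
constexpr bool fitsSigned(int64_t Value, unsigned Bits) {
  return Value >= -(int64_t(1) << (Bits - 1)) &&
         Value < (int64_t(1) << (Bits - 1));
}

// The operand as it appears in the textual assembly.
constexpr int64_t Printed = 4294967136;

// The same low 32 bits stored in a 32-bit signed variable.
constexpr int32_t AsInt32 = static_cast<int32_t>(static_cast<uint32_t>(Printed));

static_assert(AsInt32 == -160, "4294967136 - 2^32 == -160");

// Through the 32-bit variable the value is small and needs no extender.
static_assert(fitsSigned(AsInt32, 16), "fits when viewed as int");

// Through the 64-bit variable the same operand looks far out of range.
static_assert(!fitsSigned(Printed, 16), "does not fit when viewed as int64_t");
```

The two checks disagree only because of the variable width, which is the inconsistency between the two range checks described above.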

16 months ago[NFC] Add libc++ formatting commit to the git-blame ignore file
Louis Dionne [Mon, 19 Jun 2023 15:21:15 +0000 (11:21 -0400)]
[NFC] Add libc++ formatting commit to the git-blame ignore file

16 months ago[libc++][NFC] Apply clang-format on large parts of the code base
Louis Dionne [Fri, 16 Jun 2023 13:49:04 +0000 (09:49 -0400)]
[libc++][NFC] Apply clang-format on large parts of the code base

This commit does a pass of clang-format over files in libc++ that
don't require major changes to conform to our style guide, or for
which we're not overly concerned about conflicting with in-flight
patches or hindering the git blame.

This roughly covers:
- benchmarks
- range algorithms
- concepts
- type traits

I did a manual verification of all the changes, and in particular I
applied clang-format on/off annotations in a few places where the
result was less readable after than before. This was not necessary
in a lot of places, however I did find that clang-format had pretty
bad taste when it comes to formatting concepts.

Differential Revision: https://reviews.llvm.org/D153140

16 months ago[llvm-mca][TimelineView] Skip invalid entries when printing the json output.
Andrea Di Biagio [Mon, 19 Jun 2023 13:37:22 +0000 (14:37 +0100)]
[llvm-mca][TimelineView] Skip invalid entries when printing the json output.

16 months ago[CVP] Use simpler urem expansion when LHS >= RHS (PR63330)
Nikita Popov [Mon, 19 Jun 2023 15:14:37 +0000 (17:14 +0200)]
[CVP] Use simpler urem expansion when LHS >= RHS (PR63330)

In this case we don't need to emit the comparison and select.

This is papering over a weakness in CVP in that newly added
instructions don't get revisited. If they were revisited, the
icmp would be folded at that point.

However, even without that it makes sense to handle this explicitly,
because it avoids the need to insert freeze, which may prevent
further analysis of the operation by LVI.

Proofs: https://alive2.llvm.org/ce/z/quyBxp

Fixes https://github.com/llvm/llvm-project/issues/63330.
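
A rough C++ model of the arithmetic, assuming the expansion is applied only when value ranges prove LHS < 2 * RHS, so that the quotient is 0 or 1. That precondition is an assumption about the existing transform and is not stated in this message. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// General expansion under X < 2 * Y: a comparison feeding a select.
uint8_t uremWithSelect(uint8_t X, uint8_t Y) {
  return X >= Y ? static_cast<uint8_t>(X - Y) : X;
}

// Simpler form when X >= Y is also known: only the subtraction remains.
uint8_t uremKnownGE(uint8_t X, uint8_t Y) {
  assert(X >= Y && "precondition of the simplified form");
  return static_cast<uint8_t>(X - Y);
}

// Compare both forms against the real remainder for every 8-bit pair that
// satisfies Y <= X < 2 * Y.
bool expansionsAgree() {
  for (unsigned Y = 1; Y < 256; ++Y)
    for (unsigned X = Y; X < 2 * Y && X < 256; ++X) {
      uint8_t A = static_cast<uint8_t>(X);
      uint8_t B = static_cast<uint8_t>(Y);
      unsigned Expected = X % Y;
      if (uremKnownGE(A, B) != Expected || uremWithSelect(A, B) != Expected)
        return false;
    }
  return true;
}
```

The exhaustive loop is only a sanity check for the 8-bit case. The Alive2 link above is the actual proof.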

16 months ago[CVP] Add additional tests for PR63330 (NFC)
Nikita Popov [Mon, 19 Jun 2023 15:04:49 +0000 (17:04 +0200)]
[CVP] Add additional tests for PR63330 (NFC)

16 months ago[BOLT] Implement composed relocations
Job Noorman [Mon, 19 Jun 2023 14:51:43 +0000 (16:51 +0200)]
[BOLT] Implement composed relocations

BOLT currently assumes (and asserts) that no two relocations can share
the same offset. Although this is true in most cases, ELF has a feature
called (not sure if this is an official term) composed relocations [1]
where multiple relocations at the same offset are combined to produce a
single value.

For example, to support label subtraction (a - b) on RISC-V, two
relocations are emitted at the same offset:
- R_RISCV_ADD32 a + 0
- R_RISCV_SUB32 b + 0
which, when combined, will produce the value of (a - b).

To support this in BOLT, first, RelocationSetType in BinarySection is
changed to be a multiset in order to allow it to store multiple
relocations at the same offset.

Next, Relocation::emit() is changed to receive an iterator pair of
relocations. In most cases, these will point to a single relocation in
which case its behavior is unaltered by this patch. For composed
relocations, they should point to all relocations at the same offset and
the following happens:
- A new method Relocation::createExpr() is called for every relocation.
  This method is essentially the same as the original emit() except that
  it returns the MCExpr without emitting it.
- The MCExprs of relocations i and i+1 are combined using the opcode
  returned by the new method Relocation::getComposeOpcodeFor().
- After combining all MCExprs, the last one is emitted.

Note that in the current patch, getComposeOpcodeFor() simply calls
llvm_unreachable() since none of the current targets use composed
relocations. This will change once the RISC-V target lands.

Finally, BinarySection::emitAsData() is updated to group relocations by
offset and emit them all at once.

Note that this means composed relocations are only supported in data
sections. Since this is the only place they seem to be used in RISC-V, I
believe it's reasonable to only support them there for now to avoid
further code complexity.

[1]: https://www.sco.com/developers/gabi/latest/ch4.reloc.html

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D146546
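
A simplified model of the grouping and combining steps described above, using plain integers instead of MCExprs. The `Reloc` struct, the `Kind` enum and `composeAt` are hypothetical stand-ins and not BOLT's types. The rule that a `Sub` entry subtracts and an `Add` entry adds is an assumption based on the RISC-V example.

```cpp
#include <cstdint>
#include <set>

enum class Kind { Add, Sub };

struct Reloc {
  uint64_t Offset;
  Kind K;
  int64_t SymbolValue;
  int64_t Addend;
};

// Order by offset only, so relocations at the same offset compare equal and
// a multiset keeps all of them (in insertion order).
struct ByOffset {
  bool operator()(const Reloc &A, const Reloc &B) const {
    return A.Offset < B.Offset;
  }
};

using RelocSet = std::multiset<Reloc, ByOffset>;

int64_t valueOf(const Reloc &R) { return R.SymbolValue + R.Addend; }

// Fold every relocation at Offset into a single value, left to right.
// Returns 0 when there is no relocation at that offset.
int64_t composeAt(const RelocSet &Relocs, uint64_t Offset) {
  auto Range = Relocs.equal_range(Reloc{Offset, Kind::Add, 0, 0});
  auto It = Range.first;
  if (It == Range.second)
    return 0;
  int64_t Result = valueOf(*It);
  for (++It; It != Range.second; ++It)
    Result = It->K == Kind::Sub ? Result - valueOf(*It) : Result + valueOf(*It);
  return Result;
}

// Example: "a - b" at offset 8, plus an ordinary relocation at offset 16.
RelocSet makeExample() {
  RelocSet S;
  S.insert(Reloc{8, Kind::Add, 0x1000, 0});  // models R_RISCV_ADD32 a + 0
  S.insert(Reloc{8, Kind::Sub, 0x0f00, 0});  // models R_RISCV_SUB32 b + 0
  S.insert(Reloc{16, Kind::Add, 0x2000, 4}); // single relocation
  return S;
}
```

With a plain set, the second relocation at offset 8 would be dropped on insertion, which is why the container has to become a multiset.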

16 months ago[libc++] "Implements" new SI prefixis.
Mark de Wever [Sat, 17 Jun 2023 13:50:50 +0000 (15:50 +0200)]
[libc++] "Implements" new SI prefixis.

Like yocto, zepto, zetta, and yotta, the new prefixes quecto, ronto,
ronna, and quetta can't be represented in an intmax_t, so their
implementation does nothing.

Implements
- P2734R0 Adding the new SI prefixes

Depends on D153192

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D153200
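
A quick way to see the representability limit, assuming a platform where intmax_t is 64 bits wide. The helper `pow10FitsInIntmax` is hypothetical and is not part of libc++.

```cpp
#include <cstdint>
#include <limits>

// Returns true if 10^Exp can be represented in std::intmax_t, checking for
// overflow before each multiplication.
constexpr bool pow10FitsInIntmax(unsigned Exp) {
  std::intmax_t Value = 1;
  for (unsigned I = 0; I < Exp; ++I) {
    if (Value > std::numeric_limits<std::intmax_t>::max() / 10)
      return false;
    Value *= 10;
  }
  return true;
}

// exa (10^18) always fits, since intmax_t is at least 64 bits wide.
static_assert(pow10FitsInIntmax(18), "10^18 fits in intmax_t");
```

With a 64-bit intmax_t, `pow10FitsInIntmax(27)` (ronna, ronto) and `pow10FitsInIntmax(30)` (quetta, quecto) both return false, as do the exponents 21 and 24 used by zetta/zepto and yotta/yocto.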

16 months ago[libc++][regex] Removes operator!=.
Mark de Wever [Tue, 16 Aug 2022 06:15:51 +0000 (08:15 +0200)]
[libc++][regex] Removes operator!=.

Implements part of:
- P1614R2 The Mothership has Landed

Reviewed By: #libc, H-G-Hristov, philnik

Differential Revision: https://reviews.llvm.org/D153222

16 months ago[libc++] Marks __cpp_lib_bitops as implemented.
Mark de Wever [Sun, 18 Jun 2023 10:35:44 +0000 (12:35 +0200)]
[libc++] Marks __cpp_lib_bitops as implemented.

This FTM was introduced in
  P0553R4 Bit operations

Which has been implemented since libc++ 9.

This was noticed while working on D153192.

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D153225

16 months ago[gn] port 6f2e92c10cebca5 better (lld/unittests)
Nico Weber [Mon, 19 Jun 2023 14:58:08 +0000 (10:58 -0400)]
[gn] port 6f2e92c10cebca5 better (lld/unittests)

lld/test/Unit/lit.site.cfg.py.in got cleaned up in the reland.

16 months ago[libc++] Update status after Varna meeting.
Mark de Wever [Sat, 17 Jun 2023 10:00:23 +0000 (12:00 +0200)]
[libc++] Update status after Varna meeting.

This updates:
- The status tables
- Feature test macros
- New headers for modules
The latter avoids forgetting about modules when implementing the feature
in a new header.

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D153192

16 months ago[mlir][Vector] Let VectorToLLVM operate on non-ModuleOp
Nicolas Vasilache [Mon, 19 Jun 2023 14:54:14 +0000 (14:54 +0000)]
[mlir][Vector] Let VectorToLLVM operate on non-ModuleOp

16 months agoRevert "[mlir][Vector] Let VectorToLLVM operate on non-ModuleOp"
Nicolas Vasilache [Mon, 19 Jun 2023 14:51:12 +0000 (14:51 +0000)]
Revert "[mlir][Vector] Let VectorToLLVM operate on non-ModuleOp"

This reverts commit aabea3d320c87561fe98b56c9f53cca1c6d18869.

That commit had mistakenly squashed spurious changes in.

16 months ago[gn] port c7d3c84449f4
Nico Weber [Mon, 19 Jun 2023 14:49:44 +0000 (10:49 -0400)]
[gn] port c7d3c84449f4

16 months ago[gn] port 3956a34e4fc6
Nico Weber [Mon, 19 Jun 2023 14:48:35 +0000 (10:48 -0400)]
[gn] port 3956a34e4fc6

16 months agoReland "[gn build] Port 2700da5fe28d (lld/unittests etc)"
Nico Weber [Mon, 19 Jun 2023 14:47:20 +0000 (10:47 -0400)]
Reland "[gn build] Port 2700da5fe28d (lld/unittests etc)"

The lld CL relanded in 6f2e92c10cebca5.

This reverts commit d76b37e6954ad0cf66f1f3c6a9c70328c45859f3.

16 months ago[AArch64] Add and expand the testing of fmin/fmax reduction. NFC
David Green [Mon, 19 Jun 2023 14:47:21 +0000 (15:47 +0100)]
[AArch64] Add and expand the testing of fmin/fmax reduction. NFC

For both CodeGen and CostModelling, this adds extra testing for the new
llvm.vector.reduce.fmaximum and llvm.vector.reduce.fminimum intrinsics, as well
as making sure there is test coverage for all the various cases.

16 months ago[AMDGPU] Fix operand class of v_ldexp_f16 src1
Joe Nash [Mon, 12 Jun 2023 21:21:29 +0000 (17:21 -0400)]
[AMDGPU] Fix operand class of v_ldexp_f16 src1

Patch eece6ba283bd changed the src1 type of v_ldexp_f16 from i32 to
i16. Though semantically src1 is an i16, the hardware reads this operand as an
f16 type, which primarily enables floating point inline constants.
Therefore this patch changes the operand type to f16. It maintains the
current behavior where floating point source modifiers are not allowed
on src1. SDWA sext modifier continues to be allowed.
The test asm and disasm test changes in eece6ba283bd are reverted,
because the floating point inline constants are allowed.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153169

16 months ago[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes...
Vladislav Dzhidzhoev [Mon, 19 Jun 2023 14:42:05 +0000 (16:42 +0200)]
[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and now supports function-local types
scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums' field no longer tracks local
types, and DwarfDebug would assert if any locally-scoped types get placed there.

Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006

Depends on D144005

16 months ago[clang][Serialization][RISCV] Increase the number of reserved predefined type IDs
Roger Ferrer Ibanez [Mon, 19 Jun 2023 14:37:46 +0000 (14:37 +0000)]
[clang][Serialization][RISCV] Increase the number of reserved predefined type IDs

In D152070 we added many new intrinsic types required for the RISC-V
Vector Extension.

This was crashing when loading the AST as those types are intrinsically
added to the AST (they don't come from the disk).

The total number required now by clang exceeds 400 so increasing the
value to 500 solves the problem. This value was already increased in
D92715 but I assume this has some impact on the on-disk format.

Also add a static assert to avoid this happening again in the future.
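
The kind of guard mentioned above can be sketched as follows. All names here
are hypothetical stand-ins rather than clang's real enumerators, and the value
412 is only an illustrative number above 400.

```cpp
// Hypothetical enumeration of predefined type IDs; clang's real list differs.
enum ToyPredefinedTypeID : unsigned {
  TOY_TYPE_VOID = 1,
  TOY_TYPE_INT = 2,
  // ... many target-specific types would follow here ...
  TOY_TYPE_LAST_ID = 412 // illustrative value only
};

// Number of IDs reserved for predefined types (500, as in the commit message).
constexpr unsigned ToyNumReservedTypeIDs = 500;

// Small helper so the same condition can also be checked at run time.
constexpr bool fitsReservedRange(unsigned LastID, unsigned Reserved) {
  return LastID < Reserved;
}

// Fails the build as soon as the predefined types outgrow the reserved range.
static_assert(fitsReservedRange(TOY_TYPE_LAST_ID, ToyNumReservedTypeIDs),
              "Too many predefined type IDs; increase the reserved count");
```

With the old limit of 400, the same check would fail at compile time instead
of crashing when the AST is loaded.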

Differential Revision: https://reviews.llvm.org/D153111

16 months ago[LSR] Add test for issue leading to revert of abfeda5af329b5.
Florian Hahn [Mon, 19 Jun 2023 14:35:48 +0000 (15:35 +0100)]
[LSR] Add test for issue leading to revert of abfeda5af329b5.

Add unit test triggering an assertion with abfeda5af329b5.

16 months ago[mlir][NVGPU] NFC - Add a more convenient C++ builder for nvgpu::MmaSyncOp
Nicolas Vasilache [Mon, 19 Jun 2023 13:33:30 +0000 (13:33 +0000)]
[mlir][NVGPU] NFC - Add a more convenient C++ builder for nvgpu::MmaSyncOp

16 months ago[mlir][Vector] Let VectorToLLVM operate on non-ModuleOp
Nicolas Vasilache [Mon, 19 Jun 2023 13:26:00 +0000 (13:26 +0000)]
[mlir][Vector] Let VectorToLLVM operate on non-ModuleOp

Restriction to ModuleOp is ancient and unnecessarily restrictive.

16 months ago[libc++] Move non operator new definitions outside of new.cpp
Louis Dionne [Wed, 14 Jun 2023 22:36:37 +0000 (15:36 -0700)]
[libc++] Move non operator new definitions outside of new.cpp

This makes it such that new.cpp contains only the definitions of
operator new and operator delete, like its libc++abi counterpart.

Differential Revision: https://reviews.llvm.org/D153136

16 months ago[CVP] Don't freeze value if guaranteed non-undef
Nikita Popov [Mon, 19 Jun 2023 13:41:39 +0000 (15:41 +0200)]
[CVP] Don't freeze value if guaranteed non-undef

Avoid inserting the freeze if not necessary, as this allows LVI
to continue reasoning about the expression.

16 months ago[CVP] Add test for PR63330 (NFC)
Nikita Popov [Mon, 19 Jun 2023 13:38:35 +0000 (15:38 +0200)]
[CVP] Add test for PR63330 (NFC)

16 months ago[FuncSpec] Promote stack values before specialization.
Alexandros Lamprineas [Mon, 19 Jun 2023 10:08:03 +0000 (11:08 +0100)]
[FuncSpec] Promote stack values before specialization.

After each iteration of the function specializer, constant stack values
are promoted to constant globals in order to enable recursive function
specialization. This should also be done once before running the
specializer. Enables specialization of _QMbrute_forcePdigits_2 from
SPEC2017:548.exchange2_r.

Differential Revision: https://reviews.llvm.org/D152799

16 months ago[OpenMP] Implement printing TDGs to dot files
Adrian Munera [Thu, 15 Jun 2023 19:27:01 +0000 (14:27 -0500)]
[OpenMP] Implement printing TDGs to dot files

This patch implements the "__kmp_print_tdg_dot" function, which prints a task dependency graph into a dot file containing the tasks and their dependencies.

It is activated through a new environment variable, "KMP_TDG_DOT".
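
For readers unfamiliar with the dot format, the following sketch renders a
small dependency graph as Graphviz dot text. The function toDot and its output
layout are assumptions made for illustration; they do not reproduce what
__kmp_print_tdg_dot actually writes.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: renders tasks 0..NumTasks-1 as nodes and each
// (predecessor, successor) pair as a directed edge in Graphviz dot syntax.
std::string toDot(unsigned NumTasks,
                  const std::vector<std::pair<unsigned, unsigned>> &Edges) {
  std::ostringstream OS;
  OS << "digraph TDG {\n";
  for (unsigned I = 0; I < NumTasks; ++I)
    OS << "  " << I << " [label=\"task " << I << "\"];\n";
  for (const auto &E : Edges)
    OS << "  " << E.first << " -> " << E.second << ";\n";
  OS << "}\n";
  return OS.str();
}
```

The returned string can be written to a file and rendered with the Graphviz
dot tool.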

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D150962

16 months ago[BBUtils] Don't add 'then' block to a loop if it's terminated with unreachable
Dmitry Makogon [Thu, 8 Jun 2023 11:56:18 +0000 (18:56 +0700)]
[BBUtils] Don't add 'then' block to a loop if it's terminated with unreachable

The SplitBlockAndInsertIfThen utility creates two new blocks, called
ThenBlock and Tail (the true and false destinations of a conditional
branch, respectively). The function has a bool parameter Unreachable,
and if it's set, then ThenBlock is terminated with an unreachable.
At the end of the function the new blocks are added to the loop of the split
block. However, in case ThenBlock is terminated with an unreachable,
it cannot belong to any loop.

Differential Revision: https://reviews.llvm.org/D152434

16 months ago[mlir] Add support for LLVMIR comdat operation
David Truby [Fri, 16 Jun 2023 11:45:54 +0000 (12:45 +0100)]
[mlir] Add support for LLVMIR comdat operation

The LLVM comdat operation specifies how to deduplicate globals with the
same key in two different object files. This is necessary on Windows
where e.g. two object files with linkonce globals will not link unless
a comdat for those globals is specified. It is also supported in the ELF
format.

Differential Revision: https://reviews.llvm.org/D150796

16 months ago[RISCV] Add support for XCVbitmanip extension in CV32E40P
melonedo [Wed, 14 Jun 2023 13:42:57 +0000 (21:42 +0800)]
[RISCV] Add support for XCVbitmanip extension in CV32E40P

Implement XCVbitmanip intrinsics for CV32E40P according to the specification.

This commit is part of a patch-set to upstream the 7 vendor specific extensions of CV32E40P.

Contributors: @CharKeaney, @jeremybennett, @lewis-revill, @liaolucy, @simoncook, @xmj.

Spec: https://github.com/openhwgroup/cv32e40p/blob/62bec66b36182215e18c9cf10f723567e23878e9/docs/source/instruction_set_extensions.rst

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152915

16 months ago[libc++] Split sources for <filesystem>
Louis Dionne [Mon, 12 Jun 2023 17:43:55 +0000 (10:43 -0700)]
[libc++] Split sources for <filesystem>

The operations.cpp file contained the implementation of a ton of
functionality unrelated to just the filesystem operations, and
filesystem_common.h contained a lot of unrelated functionality as well.

Splitting this up into more files will make it possible in the future
to support parts of <filesystem> (e.g. path) on systems where there is
no notion of a filesystem.

Differential Revision: https://reviews.llvm.org/D152377

16 months ago[libc++][NFC] Move several .fail.cpp tests to .verify.cpp
Louis Dionne [Fri, 16 Jun 2023 15:11:56 +0000 (11:11 -0400)]
[libc++][NFC] Move several .fail.cpp tests to .verify.cpp

A few tests were also straightforward to translate to SFINAE tests
instead, so in a few cases I did that and removed the .fail.cpp test
entirely.

Differential Revision: https://reviews.llvm.org/D153149

16 months ago[libc++][NFC] Rename __lower_bound_impl to __lower_bound
Louis Dionne [Fri, 16 Jun 2023 13:52:23 +0000 (09:52 -0400)]
[libc++][NFC] Rename __lower_bound_impl to __lower_bound

For consistency with other algorithms.

Differential Revision: https://reviews.llvm.org/D153141

16 months agoclang/HIP: Remove __llvm_amdgcn_* wrapper hacks
Matt Arsenault [Tue, 22 Nov 2022 16:24:09 +0000 (11:24 -0500)]
clang/HIP: Remove __llvm_amdgcn_* wrapper hacks

These are leftover hacks from using asm declarations to access
intrinsics.

16 months agoHIP: Directly call isfinite builtins
Matt Arsenault [Sun, 20 Nov 2022 16:49:35 +0000 (08:49 -0800)]
HIP: Directly call isfinite builtins

16 months ago[BasicAA] Add test for PR63266 (NFC)
Nikita Popov [Mon, 19 Jun 2023 12:40:33 +0000 (14:40 +0200)]
[BasicAA] Add test for PR63266 (NFC)

16 months ago[MLIR][OpenMP] Refactoring createTargetData in OMPIRBuilder
Akash Banerjee [Mon, 19 Jun 2023 11:46:15 +0000 (12:46 +0100)]
[MLIR][OpenMP] Refactoring createTargetData in OMPIRBuilder

Key changes:
  - Refactor the createTargetData function to make use of the emitOffloadingArrays and emitOffloadingArraysArgument functions to generate code.
  - Added a new emitIfClause helper function to allow handling if clauses in a similar fashion to Clang.
  - Updated the MLIR side of code to account for changes to createTargetData.

Depends on D149872

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D146557

16 months ago[lldb][AArch64] Add thread local storage tpidr register
David Spickett [Mon, 19 Jun 2023 10:52:06 +0000 (11:52 +0100)]
[lldb][AArch64] Add thread local storage tpidr register

This register is used as the pointer to the current thread
local storage block and is read from NT_ARM_TLS on Linux.

Though tpidr will be present on all AArch64 Linux, I am soon
going to add a second register tpidr2 to this set.

tpidr2 is only present when SME is implemented, therefore the
NT_ARM_TLS set will change size. This is why I've added this
as a dynamic register set to save changes later.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D152516

16 months agoRe-land [LLD] Allow usage of LLD as a library
Alexandre Ganea [Mon, 19 Jun 2023 11:32:34 +0000 (07:32 -0400)]
Re-land [LLD] Allow usage of LLD as a library

This reverts commit aa495214b39d475bab24b468de7a7c676ce9e366.

As discussed in https://github.com/llvm/llvm-project/issues/53475 this patch
allows for using LLD-as-a-lib. It also lets clients link only the drivers that
they want (see unit tests).

This also adds the unit test infra as in the other LLVM projects. Among the
test coverage, I've added the original issue from @krzysz00, see:
https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction

Important note: this doesn't allow (yet) linking in parallel. This will come a
bit later hopefully, in subsequent patches, for COFF at least.

Differential revision: https://reviews.llvm.org/D119049