Tom Eccles [Wed, 14 Jun 2023 13:23:00 +0000 (13:23 +0000)]
[flang][hlfir] Fix multiple return declaration type
When the ENTRY statement is used, the same source can return different
types depending on the entry point. These different return values are
storage associated (share the same storage). Previously, this led to the
declarations of the results all having the largest type. This patch adds
a convert between the stack allocation and the declaration so that the
hlfir.decl gets the right type.
I haven't managed to generate code where this convert converted a
reference to an allocation for a smaller type into an allocation for a
larger one, but I have added an assert just in case.
This is a different solution to https://reviews.llvm.org/D152725, see
discussion there.
Differential Revision: https://reviews.llvm.org/D152931
Dmitry Makogon [Mon, 19 Jun 2023 08:49:47 +0000 (15:49 +0700)]
[Test] Add test for PR62430 showing bug in SCEV mul expression creation (NFC)
Hristo Hristov [Mon, 19 Jun 2023 08:45:58 +0000 (11:45 +0300)]
[libc++][ranges] Mark `views::stride` in progress
Alexandros Lamprineas [Sun, 18 Jun 2023 17:51:45 +0000 (18:51 +0100)]
[FuncSpec][NFC] Improve the unittest coverage.
The specialization bonus is zero in some unittests because the basic blocks
containing the users of the constant arguments are executed less frequently
than the entry block. Sinking them into loops solves that.
Differential Revision: https://reviews.llvm.org/D153230
Momchil Velikov [Mon, 19 Jun 2023 08:11:20 +0000 (09:11 +0100)]
[CodeGenPrepare] Fix for using outdated/corrupt LoopInfo
Some transformations in the CodeGenPrepare pass may create and/or delete
basic blocks, but they don't update the LoopInfo, so the LoopInfo may
end up containing dangling pointers and sometimes reused basic blocks,
which leads to "interesting" non-deterministic behaviour.
These transformations do not seem to alter the loop structure of the
function, and updating the loop info is quite straightforward.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D150384
Change-Id: If8ab3905749ea6be94fbbacd54c5cfab5bc1fba1
Martin Braenne [Thu, 15 Jun 2023 19:07:05 +0000 (19:07 +0000)]
[clang][dataflow] Create `Value`s for integer literals.
This patch includes a test that fails without the fix.
I discovered that we weren't creating `Value`s for integer literals when, in a
different patch, I tried to overwrite the value of a struct field with a literal
for the purposes of a test and was surprised to find that the struct compared
the same before and after the assignment.
This functionality therefore seems useful at least for tests, but is probably
also useful for actual analysis of code.
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D152813
Joseph Huber [Mon, 19 Jun 2023 08:25:37 +0000 (03:25 -0500)]
[libc] Disable atomic optimizations for `libc` AMDGPU builds
The AMDGPU backend recently started enabling a pass to optimize atomics
automatically. This results in the LTO build taking about 10x longer in
all cases. For now we disable this by default, as was the case before the
patch in D152649.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D153232
Ingo Müller [Mon, 19 Jun 2023 07:15:01 +0000 (07:15 +0000)]
[mlir][ExecutionEngine] Only load JITDyLibs without init/destroy funcs.
In https://reviews.llvm.org/D153029, I moved the loading/unloading
mechanisms of shared libraries from the JIT runner to the execution
engine in order to make that mechanism available in the latter
(including its Python bindings). However, I realized that I introduced a
small change in semantics: previously, the JIT runner checked for the
presence of init/destroy functions and only loaded the library as
JITDyLib if they were not present. After I moved the code, all libraries
were loaded as JITDyLib, even if they registered their symbols
explicitly in their init function. I am not sure if this is really a
problem but (1) the previous behavior was different and (2) I guess it
could cause a problem if some symbols are exported through the init
function *and* have public visibility. This patch reestablishes the
original behaviour in the new place of the code.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D153249
Christian Sigg [Mon, 19 Jun 2023 07:26:17 +0000 (09:26 +0200)]
Felix [Mon, 19 Jun 2023 05:54:08 +0000 (05:54 +0000)]
[clang-tidy] Reserved-identifier: Improved AllowedIdentifiers option to support regular expressions
Fixes: https://github.com/llvm/llvm-project/issues/59119
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D152764
Bing1 Yu [Mon, 19 Jun 2023 07:07:36 +0000 (15:07 +0800)]
[X86][AMX] Let Store not be removed if combineCastStore failed
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D152819
Matthias Springer [Mon, 19 Jun 2023 07:04:53 +0000 (09:04 +0200)]
[mlir][transform] SequenceOp: Top-level operations can be used as matchers
As a convenience to the user, top-level sequence ops can optionally be used as matchers: the op type is specified by the type of the block argument.
This is similar to how pass pipeline targets can be specified on the command line (`-pass-pipeline='builtin.module(func.func(...))'`).
Differential Revision: https://reviews.llvm.org/D153121
Matthias Springer [Mon, 19 Jun 2023 06:56:49 +0000 (08:56 +0200)]
[mlir][transform] ApplyPatternsOp: Add check to prevent modifying the transform IR
Add an extra check to make sure that transform IR is not getting modified by this op while it is being interpreted. This is generally dangerous and we may want to enforce this for all transform ops that modify the payload in the future.
Users should generally try to apply patterns only to the piece of IR where it is needed (e.g., a matched function) and not the entire module (which may contain the transform IR).
This revision is in response to a crash in a downstream compiler that was caused by a dead `transform.structured.match` op that was removed by the GreedyPatternRewriteDriver's DCE while the enclosing sequence was being interpreted.
Differential Revision: https://reviews.llvm.org/D153113
David Green [Mon, 19 Jun 2023 06:52:46 +0000 (07:52 +0100)]
[AArch64] More tablegen patterns for addp of two extracts
Similar to D152245, this adds integer addp patterns, using the larger
v4i32 addp from addp extractlow, extracthi.
David Green [Mon, 19 Jun 2023 06:48:31 +0000 (07:48 +0100)]
[AArch64] Add tablegen patterns for faddp of two extracts
This adds some simple tablegen patterns for converting
`faddp v2f32 extractlow(Rn), v2f32 extracthigh(Rn)` to
`faddp v4f32 Rn, v4f32 Rn` using the q variants of the
instructions, avoiding the extra ext needed to extract
the high lanes. Only the bottom lanes of the new faddp
are used, the second Rn operand is used as a placeholder.
It uses Rn to prevent any false dependencies, but could
equally be undef.
Differential Revision: https://reviews.llvm.org/D152245
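The lane arithmetic behind the faddp rewrite described above can be checked with a small scalar model. This is only a sketch: `faddp2`, `faddp4`, `viaExtracts` and `viaWideFaddp` are hypothetical helper names that emulate pairwise adds on plain arrays; they are not LLVM or NEON APIs.

```cpp
#include <array>
#include <cassert> // used by the check in the note below

using V2 = std::array<float, 2>;
using V4 = std::array<float, 4>;

// Pairwise add of two 2-lane vectors: [a0+a1, b0+b1].
V2 faddp2(const V2 &a, const V2 &b) {
  return {a[0] + a[1], b[0] + b[1]};
}

// Pairwise add of two 4-lane vectors: [a0+a1, a2+a3, b0+b1, b2+b3].
V4 faddp4(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Original form: extract the low and high halves, then a 2-lane faddp.
V2 viaExtracts(const V4 &rn) {
  V2 lo = {rn[0], rn[1]};
  V2 hi = {rn[2], rn[3]};
  return faddp2(lo, hi);
}

// Rewritten form: one 4-lane faddp of Rn with itself, keeping only the
// bottom two lanes. The second operand only fills the unused top lanes.
V2 viaWideFaddp(const V4 &rn) {
  V4 wide = faddp4(rn, rn);
  return {wide[0], wide[1]};
}
```

For any input, `assert(viaExtracts(rn) == viaWideFaddp(rn))` should hold, since both forms compute `[r0+r1, r2+r3]` in the lanes that are kept.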
luxufan [Mon, 19 Jun 2023 06:15:01 +0000 (14:15 +0800)]
[SCCP][NFC] Regenerate test case
luxufan [Mon, 19 Jun 2023 06:07:42 +0000 (14:07 +0800)]
[SCCP][NFC] Regenerate test case
Yevgeny Rouban [Thu, 8 Jun 2023 09:32:12 +0000 (16:32 +0700)]
[LoopUnrollRuntime] Allow indirect transition to deopt non-latch exit blocks
Relax the condition on runtime trip count unrolling of loops with one
non-latch exit that leads to a deopt block.
There are cases when the deopt blocks are common exits for different loops.
LoopSimplify pass splits such edges to the common deopting blocks to make
sure that all exit nodes of the loop only have predecessors that are inside
of the loop (See simplifyOneLoop()). This breaks the current condition for
unrolling. This patch allows the split transitive blocks that still lead to
the deopting blocks.
Differential Revision: https://reviews.llvm.org/D152639
Chuanqi Xu [Mon, 19 Jun 2023 02:35:16 +0000 (10:35 +0800)]
Recommit [ABI] [C++20] [Modules] Don't generate vtable if the class is defined in other module unit
Close https://github.com/llvm/llvm-project/issues/61940.
The root cause is that clang now generates the vtable as a strong symbol
even if the corresponding class is defined in other module units. After
checking the wording in the Itanium ABI, I found this is inconsistent with it.
Itanium ABI 5.2.3
(https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-vtable) says:
> The virtual table for a class is emitted in the same object containing
> the definition of its key function, i.e. the first non-pure virtual
> function that is not inline at the point of class definition.
So the current behavior is incorrect. This patch tries to address this.
I think we also need a similar change for the MSVC ABI, but I could not
find the formal wording, so this patch does not address it.
Reviewed By: rjmccall, iains, dblaikie
Differential Revision: https://reviews.llvm.org/D150023
Fangrui Song [Mon, 19 Jun 2023 02:30:16 +0000 (19:30 -0700)]
[XRay] Mark Mach-O xray_instr_map and xray_fn_idx as S_ATTR_LIVE_SUPPORT
Add the `S_ATTR_LIVE_SUPPORT` attribute to the sections so that `ld -dead_strip`
will retain subsections that reference live functions, once we add linker
private "l" symbols as atoms.
Jianjian GUAN [Fri, 16 Jun 2023 08:58:24 +0000 (16:58 +0800)]
[RISCV] Match shl (ext v, splat 1) to vector widening add.
Since we already match shl (v, splat 1) to vadd, we can also match the extended form to a widening add.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153112
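The identity this match relies on can be checked exhaustively for one element width. The sketch below covers only the zero-extended 8-bit to 16-bit case; the function names are hypothetical and only emulate the arithmetic, they are not RISC-V intrinsics.

```cpp
#include <cassert> // used by the check in the note below
#include <cstdint>

// Widening unsigned add: extend both operands to 16 bits, then add.
std::uint16_t wideningAddU(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint16_t>(std::uint16_t{a} + std::uint16_t{b});
}

// shl (zext v), 1: extend first, then shift left by one.
std::uint16_t shlOfZext(std::uint8_t v) {
  return static_cast<std::uint16_t>(std::uint16_t{v} << 1);
}

// Compare the two forms for every 8-bit input.
bool equivalentForAllInputs() {
  for (int v = 0; v <= 255; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    if (shlOfZext(x) != wideningAddU(x, x)) return false;
  }
  return true;
}
```

`assert(equivalentForAllInputs())` should pass, because doubling the extended value cannot overflow the wider type.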
Fangrui Song [Mon, 19 Jun 2023 00:49:53 +0000 (17:49 -0700)]
[MC] flushPendingLabels: set Atom for new fragment after D71368
Fixes: c26c5e47ab9ca60835f191c90fa751e9a7dd0f3d (essentially a no-op)
The newly created MCDataFragment should inherit Atom (see
MCMachOStreamer::finishImpl). To the best of my knowledge, this change cannot be
tested at present, but this is important to ensure
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference gives the same result in case we
evaluate the expression again with a MCAsmLayout.
In the following case,
```
.section __DATA,xray_instr_map
lxray_sleds_start1:
.space 16
Lxray_sleds_end1:
.section __DATA,xray_fn_idx
.quad (Lxray_sleds_end1-lxray_sleds_start1)>>4 // can be folded without a MCAsmLayout
```
When we have a MCAsmLayout, without this change, evaluating
(Lxray_sleds_end1-lxray_sleds_start1)>>4 again will fail due to
`FA->getAtom() == nullptr && FB.getAtom() != nullptr` in
MachObjectWriter::isSymbolRefDifferenceFullyResolvedImpl, called by
AttemptToFoldSymbolOffsetDifference.
Fangrui Song [Mon, 19 Jun 2023 00:18:38 +0000 (17:18 -0700)]
[MC] Fold A-B when A is a pending label or A/B are separated by a MCFillFragment
When the MCAssembler is non-null and the MCAsmLayout is null, we can fold A-B
in these additional cases:
* when A is a pending label (will be reassigned to a real fragment in flushPendingLabels())
* A and B are separated by a MCFillFragment with a constant size
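As a rough illustration of the second case, here is a toy model of folding a label difference without a layout. Everything in it is hypothetical: `Fragment`, `Label`, `foldDifference` and `exampleSection` are simplified stand-ins, not LLVM's MC classes, and the pending-label case is not modelled.

```cpp
#include <cassert> // used by the checks in the note below
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified fragment model; not LLVM's MCFragment.
enum class Kind { Data, Fill, Relaxable };

struct Fragment {
  Kind kind;
  // Set when the size is known before layout; empty otherwise.
  std::optional<std::int64_t> size;
};

struct Label {
  std::size_t fragment; // index into the section's fragment list
  std::int64_t offset;  // offset inside that fragment
};

// Try to compute A - B. Succeeds only if every fragment from B's fragment
// up to (but excluding) A's fragment has a known size.
std::optional<std::int64_t> foldDifference(const std::vector<Fragment> &frags,
                                           Label a, Label b) {
  if (a.fragment < b.fragment) {
    std::optional<std::int64_t> r = foldDifference(frags, b, a);
    if (!r) return std::nullopt;
    return -*r;
  }
  std::int64_t distance = a.offset - b.offset;
  for (std::size_t i = b.fragment; i < a.fragment; ++i) {
    if (!frags[i].size) return std::nullopt; // size unknown: cannot fold yet
    distance += *frags[i].size;
  }
  return distance;
}

// A section shaped like "start: .space 16; end:". The gap is either a
// constant-size fill or a fragment whose size is not known yet.
std::vector<Fragment> exampleSection(bool fixedSizeGap) {
  Fragment gap = fixedSizeGap ? Fragment{Kind::Fill, 16}
                              : Fragment{Kind::Relaxable, std::nullopt};
  return {Fragment{Kind::Data, 0}, gap, Fragment{Kind::Data, 0}};
}
```

In this model, `foldDifference(exampleSection(true), Label{2, 0}, Label{0, 0})` yields 16, while the same query on `exampleSection(false)` yields no value.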
Fangrui Song [Sun, 18 Jun 2023 23:00:18 +0000 (16:00 -0700)]
[MC] Remove unneeded MCDataFragment check from AttemptToFoldSymbolOffsetDifference
If FA == FB, we can use SA.getOffset() - SB.getOffset() even if FA is
not a MCDataFragment, as the only case this can be problematic
(different offsets for a variable-size fragment) is invalid/unreachable.
If FA != FB, the `if (FI->getKind() != MCFragment::FT_Data)` check below
can bail out correctly.
This change will help Mach-O fold more expressions. For ELF this is NFC,
unless evaluateFixup has a bug that would evaluate an expression
differently.
Fangrui Song [Sun, 18 Jun 2023 22:14:21 +0000 (15:14 -0700)]
[MC] flushPendingLabels: set Atom for new fragment after D71368
The newly created MCDataFragment should inherit Atom (see
MCMachOStreamer::finishImpl). I cannot think of a case to test the
behavior, but this is one step towards folding the Mach-O label
difference below and making Mach-O more similar to ELF.
```
.section __DATA,xray_instr_map
lxray_sleds_start1:
.space 16
Lxray_sleds_end1:
.section __DATA,xray_fn_idx
.quad (Lxray_sleds_end1-lxray_sleds_start1)>>4 // error: expected relocatable expression // Mach-O
```
Alfred Persson Forsberg [Sun, 18 Jun 2023 22:08:18 +0000 (23:08 +0100)]
[libc] [NFC] malloc.h: fix include guard typo
Differential Revision: https://reviews.llvm.org/D153231
Fangrui Song [Sun, 18 Jun 2023 20:32:40 +0000 (13:32 -0700)]
[XRay][test] Make tests less sensitive to .Ltmp/Ltmp label changes
Kazu Hirata [Sun, 18 Jun 2023 19:44:00 +0000 (12:44 -0700)]
[Target] Use llvm::is_contained (NFC)
Kazu Hirata [Sun, 18 Jun 2023 18:53:01 +0000 (11:53 -0700)]
[BOLT] Use llvm::is_contained (NFC)
Kazu Hirata [Sun, 18 Jun 2023 18:52:59 +0000 (11:52 -0700)]
[AST] Use DenseMapBase::lookup (NFC)
Hristo Hristov [Mon, 12 Jun 2023 16:10:38 +0000 (19:10 +0300)]
[libc++][spaceship][NFC] P1612R2: Mark remove `operator!=` from "Ranges Library" items as "Complete"
Marked already implemented parts of P1612R2 as "Complete":
- `ranges::iota_view::iterator` https://reviews.llvm.org/D110774
- `iota_view::sentinel` https://reviews.llvm.org/D107396
- `filter_view::iterator` https://reviews.llvm.org/D109086
- `filter_view::sentinel` https://reviews.llvm.org/D109086
- `ranges::transform_view::iterator` https://reviews.llvm.org/D110774
- `transform_view::sentinel` https://reviews.llvm.org/D103056
- `take_view::sentinel` https://reviews.llvm.org/D123600
- `join_view::iterator` https://reviews.llvm.org/D107671
- `join_view::sentinel` https://reviews.llvm.org/D107671
- `split_view::outer_iterator` https://reviews.llvm.org/D142063
- `split_view::inner_iterator` https://reviews.llvm.org/D142063
Note these operators were added and removed in C++20.
Reviewed By: Mordante, #libc
Differential Revision: https://reviews.llvm.org/D152721
Pranav Kant [Sun, 18 Jun 2023 17:51:56 +0000 (17:51 +0000)]
Uday Bondhugula [Sat, 17 Jun 2023 17:57:59 +0000 (23:27 +0530)]
[MLIR] Provide bare pointer memref lowering option on gpu-to-nvvm pass
Provide the bare pointer memref lowering option on gpu-to-nvvm pass.
This is needed whenever we lower memrefs on the host function side and
the kernel calls on the host-side (gpu-to-llvm) with the bare ptr
convention. The GPU module side of the lowering should also "align" and
use the bare pointer convention.
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D152480
Matt Arsenault [Tue, 22 Nov 2022 17:51:21 +0000 (12:51 -0500)]
clang/HIP: Use __builtin_fmaf16
Missed these in 43fd46fda3c90b014e8a73c62f67af9543ea4d59
Serge Pavlov [Sun, 18 Jun 2023 16:46:08 +0000 (23:46 +0700)]
[Doc] Fix table layout
Simon Pilgrim [Sun, 18 Jun 2023 16:37:17 +0000 (17:37 +0100)]
[GlobalIsel][X86] selectDivRem - fix typo in 64-bit AH handling code
This function was lifted from fast-isel, and still referred to the Instruction::SRem/URem opcodes, instead of the G_SREM/G_UREM opcodes.
But it turns out these aren't necessary at all as only the G_SREM/G_UREM codepaths will use the AH register for DivRemResultReg anyhow.
Simon Pilgrim [Sun, 18 Jun 2023 16:06:32 +0000 (17:06 +0100)]
[GlobalIsel][X86] Regenerate srem/urem select test coverage
Serge Pavlov [Sun, 18 Jun 2023 15:53:32 +0000 (22:53 +0700)]
[clang] Add __builtin_isfpclass
A new builtin function __builtin_isfpclass is added. It is called as:
__builtin_isfpclass(<floating point value>, <test>)
and returns an integer value, which is non-zero if the floating point
argument falls into one of the classes specified by the second argument,
and zero otherwise. The set of classes is an integer value, where each
value class is represented by a bit. There are ten data classes, as
defined by the IEEE-754 standard; they are represented by bits:
0x0001 (__FPCLASS_SNAN) - Signaling NaN
0x0002 (__FPCLASS_QNAN) - Quiet NaN
0x0004 (__FPCLASS_NEGINF) - Negative infinity
0x0008 (__FPCLASS_NEGNORMAL) - Negative normal
0x0010 (__FPCLASS_NEGSUBNORMAL) - Negative subnormal
0x0020 (__FPCLASS_NEGZERO) - Negative zero
0x0040 (__FPCLASS_POSZERO) - Positive zero
0x0080 (__FPCLASS_POSSUBNORMAL) - Positive subnormal
0x0100 (__FPCLASS_POSNORMAL) - Positive normal
0x0200 (__FPCLASS_POSINF) - Positive infinity
They have corresponding builtin macros to facilitate using the builtin
function:
if (__builtin_isfpclass(x, __FPCLASS_NEGZERO | __FPCLASS_POSZERO)) {
// x is any zero.
}
The data class encoding is identical to that used by the llvm.is.fpclass
intrinsic.
Differential Revision: https://reviews.llvm.org/D152351
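For compilers without the builtin, the mask semantics can be approximated in portable C++. This is a sketch, not the builtin's implementation: `classOf`, `isFpClass` and the `k...` enumerators are hypothetical names; only the bit values come from the table above. It also assumes IEEE-754 binary64 with the common convention that the top mantissa bit marks a quiet NaN.

```cpp
#include <cassert> // used by the checks in the note below
#include <cmath>
#include <cstdint>
#include <cstring>

// Bit values copied from the table in the commit message. The enumerator
// names are local to this sketch; they are not the builtin macros.
enum FpClassBit : unsigned {
  kSNaN = 0x0001, kQNaN = 0x0002, kNegInf = 0x0004, kNegNormal = 0x0008,
  kNegSubnormal = 0x0010, kNegZero = 0x0020, kPosZero = 0x0040,
  kPosSubnormal = 0x0080, kPosNormal = 0x0100, kPosInf = 0x0200
};

// Return the single class bit that describes x.
unsigned classOf(double x) {
  const bool neg = std::signbit(x);
  switch (std::fpclassify(x)) {
  case FP_NAN: {
    // Assumption: bit 51 of the binary64 encoding is the quiet bit.
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return (bits & (std::uint64_t{1} << 51)) ? kQNaN : kSNaN;
  }
  case FP_INFINITE:  return neg ? kNegInf : kPosInf;
  case FP_ZERO:      return neg ? kNegZero : kPosZero;
  case FP_SUBNORMAL: return neg ? kNegSubnormal : kPosSubnormal;
  default:           return neg ? kNegNormal : kPosNormal;
  }
}

// Stand-in for the builtin: non-zero if x is in any class named by mask.
int isFpClass(double x, unsigned mask) { return (classOf(x) & mask) != 0; }
```

With this model, `isFpClass(x, kNegZero | kPosZero)` plays the role of the "any zero" test shown in the commit message, for example `assert(isFpClass(-0.0, kNegZero | kPosZero))`.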
Yingwei Zheng [Sun, 18 Jun 2023 15:40:19 +0000 (23:40 +0800)]
[CodeGenPrepare][RISCV] Remove asserting VH references before erasing the dead GEP
Fixes issue https://github.com/llvm/llvm-project/issues/63365
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153194
luxufan [Sun, 18 Jun 2023 13:52:19 +0000 (21:52 +0800)]
[SCCP][NFC] Regenerate test case
Ivan Butygin [Sat, 10 Jun 2023 20:59:24 +0000 (22:59 +0200)]
[mlir][math] Uplift from arith to math.fma
Add pass to uplift from arith mulf + addf ops to math.fma if fastmath flags allow it.
Differential Revision: https://reviews.llvm.org/D152633
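A fused multiply-add rounds once, while a separate multiply and add round twice, so the two can give different results; that is presumably why the uplift is gated on fastmath flags. The sketch below shows one such input in plain C++, assuming IEEE-754 doubles and the default round-to-nearest mode. `mulThenAdd` and `fused` are hypothetical names.

```cpp
#include <cassert> // used by the checks in the note below
#include <cmath>

// Multiply followed by add: the product is rounded before the addition.
// The volatile store keeps the compiler from fusing the two steps itself.
double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused form: a single rounding of the exact a * b + c.
double fused(double a, double b, double c) { return std::fma(a, b, c); }
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 as a double. `mulThenAdd` therefore returns 0.0, while `fused` returns -2^-54.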
Simon Pilgrim [Sun, 18 Jun 2023 14:24:44 +0000 (15:24 +0100)]
[X86] Regenerate tls.ll and reuse common linux check prefixes
Simon Pilgrim [Sun, 18 Jun 2023 14:08:44 +0000 (15:08 +0100)]
[X86] Regenerate add32ri8.ll
Paul Walker [Sat, 17 Jun 2023 16:51:49 +0000 (17:51 +0100)]
[SVE][AArch64TTI] Fix invalid mla combine that miscomputes the value of inactive lanes.
Consider: add(pg, a, mul_u(pg, b, c))
Although the multiply's inactive lanes are undefined, they don't
contribute to the final result. The overall result of the inactive
lanes comes from "a" and thus the above is another form of mla
rather than mla_u.
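The argument can be modelled lane by lane. The sketch below is a scalar emulation with hypothetical helpers (`mulU`, `addM`, `mlaM`, `combineMatchesMla`), not the SVE intrinsics; the undefined inactive lanes of mul_u are represented by an arbitrary filler value.

```cpp
#include <array>
#include <cassert> // used by the checks in the note below
#include <cstddef>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Pred = std::array<bool, N>;

// mul_u: inactive lanes are undefined, modelled here by a filler value.
Vec mulU(const Pred &pg, const Vec &b, const Vec &c, int filler) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = pg[i] ? b[i] * c[i] : filler;
  return r;
}

// Merging add: inactive lanes take the first operand.
Vec addM(const Pred &pg, const Vec &a, const Vec &x) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = pg[i] ? a[i] + x[i] : a[i];
  return r;
}

// Merging mla: a + b * c on active lanes, a on inactive lanes.
Vec mlaM(const Pred &pg, const Vec &a, const Vec &b, const Vec &c) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = pg[i] ? a[i] + b[i] * c[i] : a[i];
  return r;
}

// add(pg, a, mul_u(pg, b, c)) against mla(pg, a, b, c) on sample inputs.
bool combineMatchesMla(int filler) {
  const Pred pg = {true, false, true, false};
  const Vec a = {1, 2, 3, 4}, b = {5, 6, 7, 8}, c = {9, 10, 11, 12};
  return addM(pg, a, mulU(pg, b, c, filler)) == mlaM(pg, a, b, c);
}
```

`combineMatchesMla` should return true for any filler, since the filler never reaches the result: inactive lanes of the add are taken from `a`.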
Paul Walker [Sat, 17 Jun 2023 15:48:09 +0000 (16:48 +0100)]
[NFC][AArch64TTI] Breakout add/sub combines into discrete functions.
Paul Walker [Sat, 17 Jun 2023 15:19:22 +0000 (16:19 +0100)]
Increase test coverage of Transforms/InstCombine/AArch64/sve-intrinsic-muladdsub.ll
AMS21 [Sun, 18 Jun 2023 11:40:47 +0000 (11:40 +0000)]
[clang-tidy] Improve `performance-move-const-arg` message when no move constructor is available
We now display a simple note if the reason is that the used class does not
support move semantics.
This fixes llvm#62550
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D153220
AMS21 [Sun, 18 Jun 2023 11:40:32 +0000 (11:40 +0000)]
[clang-tidy] Fix `llvmlibc-inline-function-decl` false positives for templated function definitions
For a declaration the `FunctionDecl` begin location does not include the
template parameter lists, but for some reason, if the definition is
separate from the declaration, the begin location does include them.
With this patch we now correctly handle that case.
This fixes llvm#62746
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D153218
LLVM GN Syncbot [Sun, 18 Jun 2023 07:30:50 +0000 (07:30 +0000)]
[gn build] Port 845618cf69e8
AMS21 [Sun, 18 Jun 2023 06:50:05 +0000 (06:50 +0000)]
[clang-tidy] Refactor common code from the Noexcept*Checks into `NoexceptFunctionCheck`
As discussed in the https://reviews.llvm.org/D148697 review.
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D153198
NAKAMURA Takumi [Sun, 18 Jun 2023 06:24:53 +0000 (15:24 +0900)]
pr62660-normalization-failure.ll REQUIRES: asserts (#62660)
Youngsuk Kim [Sun, 18 Jun 2023 01:12:08 +0000 (04:12 +0300)]
[clang] Replace uses of CGBuilderTy::CreateElementBitCast (NFC)
* Add `Address::withElementType()` as a replacement for
`CGBuilderTy::CreateElementBitCast`.
* Partial progress towards replacing `CreateElementBitCast`, as it no
longer does what its name suggests. Either replace its uses with
`Address::withElementType()`, or remove them if no longer needed.
* Remove unused parameter 'Name' of `CreateElementBitCast`
Reviewed By: barannikov88, nikic
Differential Revision: https://reviews.llvm.org/D153196
Krzysztof Parzyszek [Sun, 18 Jun 2023 00:14:39 +0000 (17:14 -0700)]
[Hexagon] Add missing patterns for boolean [v]selects
Fixes https://github.com/llvm/llvm-project/issues/59663
Krzysztof Parzyszek [Sat, 17 Jun 2023 22:41:53 +0000 (15:41 -0700)]
[Hexagon] Handle all compares of i1 and vNi1
Fixes https://github.com/llvm/llvm-project/issues/63363
Krzysztof Parzyszek [Sat, 17 Jun 2023 22:37:24 +0000 (15:37 -0700)]
[Hexagon] Add missing patterns for truncate to vNi1
LLVM GN Syncbot [Sat, 17 Jun 2023 23:02:48 +0000 (23:02 +0000)]
[gn build] Port cea4285949b5
Nico Weber [Sat, 17 Jun 2023 23:02:16 +0000 (19:02 -0400)]
[gn] fix build after 992cb98462ab
Fangrui Song [Sat, 17 Jun 2023 22:40:19 +0000 (15:40 -0700)]
[Pseudo Probe] Do not place functions in nodeduplicate COMDATs
For a function not in an IR COMDAT, currently we place it into a nodeduplicate IR
COMDAT so that its text section and its associated .pseudo_probe section will be
in the same section group, which can be retained or discarded by the linker as a
unit. However, the section group wastes space.
After D153189 uses SHF_LINK_ORDER to ensure a .pseudo_probe section will be
discarded when its associated text section is discarded, we can remove the
nodeduplicate IR change.
In the following example, the .pseudo_probe associated with .text.f is discarded as expected.
```
clang -c -ffunction-sections -fpseudo-probe-for-profiling -xc =(printf 'void _start(){} void f(){}') -o a.o
ld.lld --gc-sections --print-gc-sections a.o
```
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D153191
Owen Pan [Sat, 17 Jun 2023 22:19:01 +0000 (15:19 -0700)]
[clang-format][NFC] Remove redundant getLLVMStyle() in unit tests
Brad Smith [Sat, 17 Jun 2023 21:38:28 +0000 (17:38 -0400)]
[Support/ELF] - Add OpenBSD PT_OPENBSD_NOBTCFI constant.
OpenBSD commit for reference:
https://github.com/openbsd/src/commit/7b407c478fab53a6d9a091887c828c3f7b3f8b46
Florian Hahn [Sat, 17 Jun 2023 20:06:21 +0000 (21:06 +0100)]
[LSR] Enable SCEV verification for test from f3a0ad2d and mark as XFAIL
The test fails SCEV verification, which causes the expensive check bots
to fail. Always run verification and mark as XFAIL until fixed.
Fangrui Song [Sat, 17 Jun 2023 19:34:49 +0000 (12:34 -0700)]
PPCAsmParser: Use parseOptionalToken
to simplify code near __tls_get_addr parsing.
Uday Bondhugula [Sat, 17 Jun 2023 17:13:43 +0000 (22:43 +0530)]
[MLIR] Add support for bare pointer calling convention in gpu-to-llvm
Add support for the bare pointer calling convention in the gpu-to-llvm
pass. This wasn't being exposed and is needed when GPU-compiled MLIR is
to be called with this convention.
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D152477
Florian Hahn [Sat, 17 Jun 2023 16:58:41 +0000 (17:58 +0100)]
Revert "[LSR] Consider post-inc form when creating extends/truncates."
This reverts commits abfeda5af329b5889db709ff74506e20e0b569e9 and
fe19036e1266d2a90b44725c82b898134906e4c3.
The added assertion triggers during clang bootstrap builds. Revert while
I investigate.
Jeff Niu [Sat, 17 Jun 2023 16:54:13 +0000 (09:54 -0700)]
[mlir][arith] Remove unused ODS class
Differential Revision: https://reviews.llvm.org/D153203
Florian Hahn [Sat, 17 Jun 2023 16:37:25 +0000 (17:37 +0100)]
[LSR] Add test for #62660.
Add test for LSR miscompile.
Felipe de Azevedo Piovezan [Sat, 17 Jun 2023 12:26:05 +0000 (08:26 -0400)]
[Coding style] Fix incorrect link syntax
Hui [Wed, 31 May 2023 09:49:56 +0000 (10:49 +0100)]
[libc++][NFC] Granularise <thread> header
- This was to make implementing jthread easier and requested in https://reviews.llvm.org/D151559
Differential Revision: https://reviews.llvm.org/D151792
Prajwal S N [Sat, 17 Jun 2023 10:02:32 +0000 (15:32 +0530)]
[docs] Fix link for static constructors article
It was previously present in the inline code block and did not work as a
hyperlink.
Reviewed By: yassingh
Differential Revision: https://reviews.llvm.org/D153061
Florian Hahn [Sat, 17 Jun 2023 09:15:15 +0000 (10:15 +0100)]
[AMDGPU] Update test after abfeda5af329b58.
Florian Hahn [Sat, 17 Jun 2023 08:58:37 +0000 (09:58 +0100)]
[LSR] Consider post-inc form when creating extends/truncates.
GenerateTruncates at the moment creates extends/truncates for post-inc
uses of normalized expressions. For example, if an add rec of the form
{1,+,-1} is used outside the loop, the normalized form will use {1,+,-1}
instead of {0,+,-1}. When naively sign-extending the normalized
expression, it will get extended incorrectly to {1,+,-1} for the wider
type, if the backedge-taken count of the loop is 1.
To address this, the patch updates GenerateTruncates to check if the
LSRUse contains any fixups with PostIncLoops. If that's the case, first
de-normalize the expression, then perform the extend/truncate, then
normalize again.
There may be other places where similar checks are needed and the helper
can be generalized for those cases. I'd not be surprised if other subtle
mis-compiles are caused by this.
Fixes #38847.
Fixes #58039.
Fixes #62852.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D153004
Florian Hahn [Sat, 17 Jun 2023 08:56:59 +0000 (09:56 +0100)]
[LSR] Add test case for #58039.
Corentin Jabot [Sat, 17 Jun 2023 05:34:42 +0000 (08:34 +0300)]
[Clang] Add the list of core papers approved in Varna to the status page
Fangrui Song [Sat, 17 Jun 2023 06:46:36 +0000 (23:46 -0700)]
[Pseudo Probe] Make .pseudo_probe GC-able
* Add the SHF_LINK_ORDER flag so that the .pseudo_probe section is discarded when the associated text section is discarded.
* Add unique ID so that with `clang -ffunction-sections -fno-unique-section-names`, there is one separate .pseudo_probe for each text section (disambiguated by `.section ....,unique,id` in assembly)
The changes allow .pseudo_probe GC even if we don't place instrumented functions
in an IR comdat (see `getOrCreateFunctionComdat` in SampleProfileProbe.cpp).
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D153189
Jay Foad [Fri, 16 Jun 2023 13:49:19 +0000 (14:49 +0100)]
[AMDGPU] Generate checks for load-constant tests
Differential Revision: https://reviews.llvm.org/D153139
Fangrui Song [Sat, 17 Jun 2023 05:19:32 +0000 (22:19 -0700)]
[bazel] Fix clang after D148094
Sergei Barannikov [Tue, 9 May 2023 15:55:59 +0000 (18:55 +0300)]
[clang][CodeGen] Break up TargetInfo.cpp [8/8]
This commit breaks up CodeGen/TargetInfo.cpp into a set of *.cpp files,
one file per target. There are no functional changes, mostly just code moving.
Non-code-moving changes are:
* A virtual destructor has been added to DefaultABIInfo to pin the vtable to a cpp file.
* A few methods of ABIInfo and DefaultABIInfo were split into declaration + definition
in order to reduce the number of transitive includes.
* Several functions that used to be static have been placed in clang::CodeGen
namespace so that they can be accessed from other cpp files.
RFC: https://discourse.llvm.org/t/rfc-splitting-clangs-targetinfo-cpp/69883
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148094
Sergei Barannikov [Tue, 9 May 2023 15:41:05 +0000 (18:41 +0300)]
[clang][CodeGen] Break up TargetInfo.cpp [7/8]
Wrap calls to XXXTargetCodeGenInfo constructors into factory functions.
This allows moving implementations of TargetCodeGenInfo to dedicated cpp
files without a change.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D150215
Sindhu Chittireddy [Tue, 13 Jun 2023 21:37:58 +0000 (14:37 -0700)]
[NFC] Fix potential dereferencing of null return value.
Replace getAs with castAs and add assert if needed.
Differential Revision: https://reviews.llvm.org/D152977
Pranav Kant [Sat, 17 Jun 2023 02:57:46 +0000 (02:57 +0000)]
Weining Lu [Sat, 17 Jun 2023 01:46:40 +0000 (09:46 +0800)]
[LoongArch] Fix handling of the chain of CSRWR and CSRXCHG nodes
`LoongArchISD::CSRWR` has two results. The first is the result of
`loongarch.csrwr.[wd]` intrinsic and the second is the chain. But
currently the chain is not processed correctly when creating this
node, resulting in the `csrwr` instruction being optimized out when
the result is not used by anyone [1]. `LoongArchISD::CSRXCHG` has
the same issue.
This patch addresses this issue.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/loongarch/include/asm/loongarch.h?h=v6.4-rc6#n219
Reviewed By: hev
Differential Revision: https://reviews.llvm.org/D153120
Weining Lu [Sat, 17 Jun 2023 01:46:29 +0000 (09:46 +0800)]
[LoongArch][NFC] Precommit test for D153120 (the fix of CSRWR and CSRXCHG)
Reviewed By: xry111
Differential Revision: https://reviews.llvm.org/D153119
Ashay Rane [Fri, 16 Jun 2023 22:11:43 +0000 (17:11 -0500)]
[MLIR] Register all extensions in CAPI's RegisterEverything
The patch for promised interfaces (a5ef51d7) doesn't register all
extensions in the CAPI's `mlirRegisterAllDialects()` function. This is
used by the MLIR Python bindings, causing downstream users of the Python
bindings to terminate abruptly. This patch adds the call to register
all extensions.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D153174
LiaoChunyu [Sat, 17 Jun 2023 01:33:48 +0000 (09:33 +0800)]
[RISCV] Fold special case (xor (setcc constant, y, setlt), 1) -> (setcc y, constant + 1, setlt)
Improve D151719.
(xor (setcc constant, y, setlt), 1) -> (setcc y, constant + 1, setlt)
https://alive2.llvm.org/ce/z/BZNEia
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D152128
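The fold rewrites `!(C < y)` as `y < C + 1`. A brute-force check over 8-bit values illustrates when the two forms agree. This is a sketch with hypothetical function names; the observation about C + 1 overflowing is this sketch's own reading and is not taken from the patch.

```cpp
#include <cassert> // used by the checks in the note below
#include <cstdint>

// (xor (setcc C, y, setlt), 1): invert the result of "C < y".
bool xorForm(std::int8_t c, std::int8_t y) {
  int setcc = (c < y) ? 1 : 0;
  return (setcc ^ 1) != 0;
}

// (setcc y, C + 1, setlt), with C + 1 wrapping around at 8 bits.
bool setccForm(std::int8_t c, std::int8_t y) {
  auto cPlus1 = static_cast<std::int8_t>(static_cast<std::uint8_t>(c) + 1u);
  return y < cPlus1;
}

// True when both forms agree for every 8-bit y at the given constant.
bool agreesForAllY(int c) {
  for (int y = -128; y <= 127; ++y) {
    auto c8 = static_cast<std::int8_t>(c);
    auto y8 = static_cast<std::int8_t>(y);
    if (xorForm(c8, y8) != setccForm(c8, y8)) return false;
  }
  return true;
}

// True when the forms agree for every constant whose increment fits.
bool agreesWithoutOverflow() {
  for (int c = -128; c <= 126; ++c)
    if (!agreesForAllY(c)) return false;
  return true;
}
```

In this model `agreesWithoutOverflow()` holds, while `agreesForAllY(127)` does not, because 127 + 1 wraps to -128 and `y < -128` is never true.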
Matt Arsenault [Sat, 5 Nov 2022 20:07:49 +0000 (13:07 -0700)]
clang/AMDGPU: Emit atomicrmw for atomic_inc/dec builtins
This makes the scope and ordering arguments actually do something.
Also add some new OpenCL tests since the existing HIP tests didn't
cover address spaces.
Fangrui Song [Sat, 17 Jun 2023 00:08:58 +0000 (17:08 -0700)]
[MC] Restore a special case to support limited A-B folding when A/B are in the same fragment being laid out
Add subsection-if.s to test what we can fold (in the same fragment) and what we cannot.
Fix https://github.com/ClangBuiltLinux/linux/issues/1876
Fixes: 4bdc7f7a331f82cca1637388cf68bdc5b32ab43b
Owen Pan [Wed, 14 Jun 2023 22:06:17 +0000 (15:06 -0700)]
Reland [clang-format] Fix overlapping whitespace replacements before PPDirective
If the first token of an annotated line already has a computed Newlines,
reuse it to avoid potential overlapping whitespace replacements before
preprocessor branching directives.
Fixes #62892.
Differential Revision: https://reviews.llvm.org/D151954
Daniel Thornburgh [Fri, 16 Jun 2023 23:54:43 +0000 (16:54 -0700)]
[Fuchsia] Forward libedit flags to stage2
Philip Reames [Fri, 16 Jun 2023 23:47:39 +0000 (16:47 -0700)]
[RISCV] Fix a latent miscompile in doPeepholeMaskedRVV
The code was using the tail policy being "agnostic" to select an instruction whose semantics were "undefined". This was almost always fine (as the pass through operand was usually implicit_def), but could in theory lead to a miscompile. I don't actually have a test case as it requires a later transform to exploit the wrong tail policy state, and I couldn't easily figure out how to get vsetvli insertion to miscompile given the wrong state. This was spotted by inspection, and it may be a miscompile in theory only at the moment.
Note that this may cause regressions if there are instructions for which we either don't have a _TU pseudo form, or the _TU pseudo form is missing a policy operand. When I was first looking at this, I saw exactly that, and D153067 exists to add the missing policy operand I noticed.
As a later follow up, I want to always force the use of _TU, but it seemed good to fix the bug first, then drive the _TU transition in a separate patch.
Differential Revision: https://reviews.llvm.org/D153070
Philip Reames [Fri, 16 Jun 2023 23:41:09 +0000 (16:41 -0700)]
[RISCV] Add a policy operand to VPseudoBinaryNoMaskTU [NFC]
This change adds a policy operand to the helper class which is used for binary ops like vadd, but also, possibly surprisingly, some of the vslide variants. This allows us to represent the tail agnostic state with this pseudo family - previously, we could only represent tail undefined and tail undisturbed. (Since these don't have a mask, they're always mask undefined.)
This is NFC because no current producer uses the tail agnostic state. This will change in an upcoming change to doPeepholeMaskedRVV.
Differential Revision: https://reviews.llvm.org/D153067
Diego Caballero [Fri, 16 Jun 2023 23:21:24 +0000 (23:21 +0000)]
[mlir][Vector] Fix 0-D tensor vectorization in Linalg
It looks like scalable vector support broke vectorization for 0-D
tensors and we didn't have any test covering that case. This patch
provides a fix and a test.
Differential Revision: https://reviews.llvm.org/D153181
Owen Pan [Fri, 16 Jun 2023 23:33:36 +0000 (16:33 -0700)]
[clang-format][NFC] Use verifyGoogleFormat in FormatTest.cpp
Replaces verifyFormat(..., getGoogleStyle()) with
verifyGoogleFormat(...) in FormatTest.cpp.
Craig Topper [Fri, 16 Jun 2023 23:40:10 +0000 (16:40 -0700)]
[RISCV] Reduce alignment for __attribute__((riscv_rvv_vector_bits)) for LMUL<1 types.
Don't use an alignment larger than the vector size.
Luke Lau [Fri, 16 Jun 2023 23:31:35 +0000 (16:31 -0700)]
[RISCV] Refactor vecPolicyOp skip logic in doPeepholeMaskedRVV. NFC
We can just explicitly check if the new unmasked pseudo takes a policy
op, rather than implicitly relying on I->UnmaskedTUPseudo ==
I->UnmaskedPseudo. Split out from another patch to make the diff more
readable.
Differential Revision: https://reviews.llvm.org/D152961
Luke Lau [Fri, 16 Jun 2023 23:30:38 +0000 (16:30 -0700)]
[RISCV] Reuse RISCVDAGToDAGISel member TTI in doPeepholeMaskedRVV. NFC
Differential Revision: https://reviews.llvm.org/D152960
Daniel Thornburgh [Fri, 16 Jun 2023 23:34:26 +0000 (16:34 -0700)]
[Fuchsia] Forward terminfo flags to stage 2
Philip Reames [Fri, 16 Jun 2023 23:13:22 +0000 (16:13 -0700)]
[RISCV] Make all vector binops use the _TU pseudo form
This continues towards the goal spelled out in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. This patch switches all the binary operations (no widen, no narrow, but both int and FP) to use the _TU + implicit_def passthrough form. Change is mechanical.
This only changes the unmasked variants. Masked variants will still go through doPeepholeMaskedRVV and end up in the unsuffixed/TA form. Fixing that will be a separate change.
Differential Revision: https://reviews.llvm.org/D152940
Nitin John Raj [Thu, 15 Jun 2023 02:10:44 +0000 (19:10 -0700)]
[RISCV] Introduce RISCVISD::VWMACC(U/SU)_VL opcode
Differential Revision: https://reviews.llvm.org/D153057
Owen Pan [Fri, 16 Jun 2023 07:00:43 +0000 (00:00 -0700)]
[clang-format][NFC] Clean up unit tests
This patch adds a verifyNoChange macro to verify code that won't
change after being formatted. (The code will not be messed up before
being formatted.) It then replaces EXPECT_EQ with verifyFormat
wherever applicable so that the code will be messed up before being
formatted. When the replacement fails the unit test, verifyFormat is
replaced with verifyNoChange.
Differential Revision: https://reviews.llvm.org/D153109
Matt Arsenault [Thu, 1 Jun 2023 22:11:24 +0000 (18:11 -0400)]
TTI: Add function to hasBranchDivergence
It may be possible to contextually ignore divergence in a function if
it's known to run single threaded.