Jonas Hahnfeld [Thu, 7 Apr 2022 14:19:11 +0000 (16:19 +0200)]
[ORC] Fix sorting of constructors by priority
The code was incorrectly sorting by the function address.
Differential Revision: https://reviews.llvm.org/D123311
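A minimal sketch of the distinction, written in Python rather than ORC's C++. The `Ctor` record and its field names are hypothetical and are not ORC's data structures. It only illustrates that the sort key should be the priority, not the function address.

```python
from typing import NamedTuple


class Ctor(NamedTuple):
    priority: int  # lower values run first
    address: int   # hypothetical function address


def sort_by_priority(ctors):
    # sorted() is stable, so constructors with equal priority keep their
    # original relative order.
    return sorted(ctors, key=lambda c: c.priority)


ctors = [Ctor(65535, 0x1000), Ctor(101, 0x3000), Ctor(200, 0x2000)]
ordered = sort_by_priority(ctors)
by_address = sorted(ctors, key=lambda c: c.address)
```

With these sample values, sorting by address would run the priority-65535 constructor first, while sorting by priority runs it last.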
Xiang Li [Mon, 2 May 2022 20:59:37 +0000 (13:59 -0700)]
[DirectX backend] Add pass to lower llvm intrinsic into dxil op function.
A new pass, DXILOpLowering, was added.
It scans all LLVM intrinsics and creates a DXIL op function for each intrinsic that maps to a DXIL op.
It then translates call instructions on the intrinsic into calls on the DXIL op function.
A DXIL op function takes an extra i32 argument for the DXIL opcode at the beginning of its argument list,
so setCalledFunction cannot be used to update the call instruction on the intrinsic.
This commit only supports sin to start the work.
Reviewed By: kuhar, beanz
Differential Revision: https://reviews.llvm.org/D124805
Yeting Kuo [Sun, 8 May 2022 13:10:06 +0000 (21:10 +0800)]
[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.
The patch makes PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.
It's useful to get the vtypes of locations of PseudoReadVL without finding the
corresponding VLEFF/VLSEGFF.
It could simplify optimizations in RISCVInsertVSETVLI like D123581.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125199
jacquesguan [Mon, 18 Apr 2022 06:32:32 +0000 (06:32 +0000)]
[RISCV] Add rvv codegen support for vp.fpext.
This patch adds rvv codegen support for vp.fpext. The lowerings of fp_round, vp.fptrunc, fp_extend and vp.fpext share most of their code, so a common lowering function handles all four.
This patch also changes the intermediate cast from ISD::FP_EXTEND/ISD::FP_ROUND to the RVV VL version ops RISCVISD::FP_EXTEND_VL and RISCVISD::FP_ROUND_VL for scalable vectors.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123975
Peter Steinfeld [Mon, 9 May 2022 21:12:41 +0000 (14:12 -0700)]
[flang] Change "bad kind" messages in the runtime to "not yet implemented"
Similar to change D125046.
If a programmer is able to compile and link a program that contains types that
are not yet supported by the runtime, it must be because they're not yet
implemented.
This change will make it easier to find unimplemented code in tests.
Differential Revision: https://reviews.llvm.org/D125267
Mingming Liu [Wed, 11 May 2022 02:56:14 +0000 (19:56 -0700)]
[X86] Fix 80 column violation in X86InstrInfo.cpp. NFC
Differential Revision: https://reviews.llvm.org/D125345
Mingming Liu [Wed, 11 May 2022 02:46:15 +0000 (19:46 -0700)]
Revert "[NFC] Run clang-format on llvm/lib/Target/X86/X86InstroInfo.cpp"
This reverts commit 8bef5476de3ec7388ad0c72b26dcc82ac7fd970a.
Need to revert, update commit message and reapply.
Alexander Shaposhnikov [Wed, 11 May 2022 01:07:54 +0000 (01:07 +0000)]
[Transform][Utils][NFC] Clean up CtorUtils.cpp
Xiang1 Zhang [Sat, 7 May 2022 07:22:15 +0000 (15:22 +0800)]
[CodeGen] Fix ConvertNodeToLibcall for STRICT_FPOWI
Reviewed By: PengfeiWang
Differential Revision: https://reviews.llvm.org/D125159
Mingming Liu [Tue, 10 May 2022 23:01:02 +0000 (16:01 -0700)]
[NFC] Run clang-format on llvm/lib/Target/X86/X86InstroInfo.cpp
Differential Revision: https://reviews.llvm.org/D125345
Ting Wang [Wed, 11 May 2022 00:47:51 +0000 (20:47 -0400)]
[PowerPC] Fix PPCISD::STBRX selection issue on A2
Enable FeatureISA2_06 on Power A2 target
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D125203
Eduard Zingerman [Wed, 11 May 2022 00:41:41 +0000 (17:41 -0700)]
[BPF] Add a test for making FI_ri as isPseudo
Commit 8a63326150ee ("[BPF] Mark FI_ri as isPseudo to avoid
assertion during disassembly") added isPseudo to the FI_ri insn
in the BPFInstrInfo.td file. This patch adds the missing test file.
Differential Revision: https://reviews.llvm.org/D125185
Eduard Zingerman [Wed, 11 May 2022 00:04:58 +0000 (17:04 -0700)]
[BPF] Mark FI_ri as isPseudo to avoid assertion during disassembly
When a specific sequence of bytes is present in the file during
disassembly the disassembler fails with the following assertion:
...
0: 18 20 00 00 00 00 00 00 lea
... Assertion `idx < size()' failed.
...
llvm::SmallVectorTemplateCommon<...>::operator[](...) ...
llvm::MCInst::getOperand(unsigned int) ...
llvm::BPFInstPrinter::printOperand(...) ...
llvm::BPFInstPrinter::printInstruction() ...
llvm::BPFInstPrinter::printInst(...) ...
...
The byte sequence causing the error is (little endian):
18 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00
The issue could be reproduced using the program below:
test.ir:
  @G = constant
       [16 x i8]
       [i8 u0x18, i8 u0x20, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00,
        i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00, i8 u0x00],
       section "foo", align 8
Compiled and disassembled as follows:
  cat test.ir | llc -march=bpfel -filetype=obj -o - \
    | llvm-objdump --arch=bpfel --section=foo -d -
This byte sequence corresponds to FI_ri instruction declared in the
BPFInstrInfo.td as follows:
def FI_ri
    : TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,
                 (outs GPR:$dst),
                 (ins MEMri:$addr),
                 "lea\t$dst, $addr",
                 [(set i64:$dst, FIri:$addr)]> {
  // This is a tentative instruction, and will be replaced
  // with MOV_rr and ADD_ri in PEI phase
  let Inst{51-48} = 0;
  let Inst{55-52} = 2;
  let Inst{47-32} = 0;
  let Inst{31-0} = 0;
  let BPFClass = BPF_LD;
}
Notes:
- First byte (opcode) is formed as follows:
  - BPF_IMM.Value is 0x00
  - BPF_DW.Value is 0x18
  - BPF_LD is 0x00
- Second byte (registers) is formed as follows:
  - let Inst{55-52} = 2;
  - let Inst{51-48} = 0;
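The arithmetic in the notes above can be checked with a short sketch. It is written in Python for illustration only. The helper names are hypothetical, and the nibble layout (Inst{55-52} as the high nibble, Inst{51-48} as the low nibble of the second byte) is read off the notes, not taken from the BPF backend sources.

```python
# Field values quoted in the notes above.
BPF_IMM = 0x00
BPF_DW = 0x18
BPF_LD = 0x00


def opcode_byte():
    # The opcode is the bitwise OR of the mode, size and class fields.
    return BPF_IMM | BPF_DW | BPF_LD


def register_byte(inst_55_52, inst_51_48):
    # Inst{55-52} forms the high nibble, Inst{51-48} the low nibble.
    return (inst_55_52 << 4) | inst_51_48


first_two = bytes([opcode_byte(), register_byte(2, 0)])
```

The result is the pair 0x18, 0x20, which matches the first two bytes of the failing sequence.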
The FI_ri instruction is always replaced by MOV_rr ADD_ri instructions
pair in the BPFRegisterInfo::eliminateFrameIndex method. Thus, this
instruction should be invisible to disassembler. This patch achieves
this by adding "isPseudo" flag for this instruction.
The bug was found while disassembling one of the BPF tests from the Linux
kernel (llvm-objdump -D tools/testing/selftests/bpf/bpf_iter_sockmap.o).
Differential Revision: https://reviews.llvm.org/D125185
Florian Mayer [Wed, 11 May 2022 00:00:57 +0000 (17:00 -0700)]
[HWASan symbolize] Write error to stderr.
Florian Mayer [Tue, 10 May 2022 23:32:12 +0000 (16:32 -0700)]
[HWASan] deflake hwasan_symbolize test more.
Don't fail on corrupted ELF file on indexing. This happens because files
change in the directory from concurrent tests.
Peter Klausler [Tue, 10 May 2022 20:42:08 +0000 (13:42 -0700)]
[flang] Allow local variables and function result inquiries in specification expressions
Inquiries into the bounds, size, and length of local variables (and function results)
are acceptable specification expressions. A recent change allowed them for dummy
arguments that are not OPTIONAL or INTENT(OUT), but didn't address other object
entities.
Differential Revision: https://reviews.llvm.org/D125343
Nick Desaulniers [Tue, 10 May 2022 23:21:17 +0000 (16:21 -0700)]
[BuildLibCalls] infer inreg param attrs from NumRegisterParameters
We're having a hard time booting the ARCH=i386 Linux kernel with clang
after removing -ffreestanding because instcombine was dropping inreg
from callers during libcall simplification, but not the callees defined
in different translation units. This led the callers and callees to have
wildly different calling conventions, which (predictably) blew up at
runtime.
Infer the inreg param attrs on function declarations from the module
metadata "NumRegisterParameters." This allows us to boot the ARCH=i386
Linux kernel (w/ -ffreestanding removed).
Fixes: https://github.com/llvm/llvm-project/issues/53645
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125285
Wende Tan [Tue, 10 May 2022 22:44:46 +0000 (15:44 -0700)]
[Bitcode] Include indirect users of BlockAddresses in bitcode
The original fix (commit 23ec5782c3cc) of
https://github.com/llvm/llvm-project/issues/52787 only adds `Function`s
that have `Instruction`s that directly use `BlockAddress`es into the
bitcode (`FUNC_CODE_BLOCKADDR_USERS`).
However, in either @rickyz's original reproducing code:
```
void f(long);

__attribute__((noinline)) static void fun(long x) {
  f(x + 1);
}

void repro(void) {
  fun(({
    label:
      (long)&&label;
  }));
}
```
```
...
define dso_local void @repro() #0 {
entry:
  br label %label

label:                                            ; preds = %entry
  tail call fastcc void @fun()
  ret void
}

define internal fastcc void @fun() unnamed_addr #1 {
entry:
  tail call void @f(i64 add (i64 ptrtoint (i8* blockaddress(@repro, %label) to i64), i64 1)) #3
  ret void
}
...
```
or the xfs and overlayfs in the Linux kernel, `BlockAddress`es (e.g.,
`i8* blockaddress(@repro, %label)`) may first compose `ConstantExpr`s
(e.g., `i64 ptrtoint (i8* blockaddress(@repro, %label) to i64)`) and
then be used by `Instruction`s. This case is not handled by the original
fix.
This patch adds *indirect* users of `BlockAddress`es, i.e., the
`Instruction`s using some `Constant`s which further use the
`BlockAddress`es, into the bitcode as well, by doing depth-first
searches.
Fixes: https://github.com/llvm/llvm-project/issues/52787
Fixes: 23ec5782c3cc ("[Bitcode] materialize Functions early when BlockAddress taken")
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D124878
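The depth-first search described above can be sketched as follows. This is an illustrative Python model, not the bitcode writer's code. The `users` and `parent_function` maps are hypothetical stand-ins for LLVM's use lists and `Instruction::getFunction()`.

```python
def functions_using_blockaddress(block_address, users, parent_function):
    """Collect functions whose instructions use block_address, either
    directly or through a chain of constants (iterative DFS)."""
    found = set()
    seen = set()
    stack = [block_address]
    while stack:
        value = stack.pop()
        if value in seen:
            continue
        seen.add(value)
        for user in users.get(value, ()):
            if user in parent_function:
                # The user is an instruction: record its function.
                found.add(parent_function[user])
            else:
                # The user is a constant: keep walking its users.
                stack.append(user)
    return found


# Toy model of the reproducer: blockaddress -> ptrtoint -> add -> call.
users = {
    "blockaddress(@repro, %label)": ["ptrtoint"],
    "ptrtoint": ["add"],
    "add": ["call @f"],
}
parent_function = {"call @f": "@fun"}
result = functions_using_blockaddress(
    "blockaddress(@repro, %label)", users, parent_function
)
```

In this toy graph the only instruction is the call in `@fun`, which is reached through two constant expressions, so `@fun` is the function that gets recorded.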
Mingming Liu [Tue, 10 May 2022 21:23:40 +0000 (14:23 -0700)]
[Peephole-opt][X86] Enhance peephole opt to see through SUBREG_TO_REG
(following AND) and eliminate redundant TEST instruction.
Differential Revision: https://reviews.llvm.org/D124118
Chia-hung Duan [Tue, 10 May 2022 22:48:46 +0000 (22:48 +0000)]
[mlir] Print some message for op-printing verification
When verification fails before a dump, print some debug logs instead of
silently switching to the generic form. This helps identify why an op
may be printed in a different way.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125136
Thomas Raoux [Mon, 9 May 2022 17:18:21 +0000 (17:18 +0000)]
[mlir][gpu] Move async copy ops to NVGPU and add caching hints
Move async copy operations to NVGPU as they only exist on NV targets and are
designed to match PTX semantics. This also allows us to add finer-grained
caching hint attributes to the op.
Add a hint to bypass L1 and hook it up to the NVVM op.
Differential Revision: https://reviews.llvm.org/D125244
Vasileios Porpodas [Thu, 5 May 2022 22:03:31 +0000 (15:03 -0700)]
[SLP] Make reordering aware of external vectorizable scalar stores.
The current reordering scheme only checks the ordering of in-tree operands.
There are some cases, however, where we need to adjust the ordering based on
the ordering of a future SLP-tree whose instructions are not part of the
current tree, but are external users.
This patch is a simple implementation of this. We keep track of scalar stores
that are users of TreeEntries and if they look profitable to vectorize, then
we keep track of their ordering. During the reordering step we take this new
index order into account. This can remove some shuffles in cases like in the
lit test.
Differential Revision: https://reviews.llvm.org/D125111
Philip Reames [Tue, 10 May 2022 22:06:26 +0000 (15:06 -0700)]
[riscv] Consolidate logic for SEW/VL operand offset calculations [nfc]
Philip Reames [Tue, 10 May 2022 21:11:36 +0000 (14:11 -0700)]
[riscv] Minor style cleanup so that code more obviously matches comments [nfc]
Mike Rice [Tue, 10 May 2022 17:54:00 +0000 (10:54 -0700)]
[OpenMP] Fix mangling for linear modifiers with variable stride
This adds support for variable stride with the val, uval, and ref linear
modifiers. Previously only the no-modifier type ls<argno> was supported.
val -> Ls<argno>
uval -> Us<argno>
ref -> Rs<argno>
Differential Revision: https://reviews.llvm.org/D125330
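The mapping listed above can be written as a small lookup. This Python sketch is illustrative only. The function and table names are hypothetical, and it covers just the per-parameter token for a variable stride, not the full vector function name that clang emits.

```python
# Prefix for a linear parameter whose stride is held in argument <argno>.
LINEAR_VARIABLE_STRIDE_PREFIX = {
    None: "ls",    # no modifier
    "val": "Ls",
    "uval": "Us",
    "ref": "Rs",
}


def mangle_linear_variable_stride(modifier, argno):
    # Build the "<prefix><argno>" token for one linear parameter.
    return f"{LINEAR_VARIABLE_STRIDE_PREFIX[modifier]}{argno}"
```

For example, a `uval` parameter whose stride is passed in argument 2 would be encoded as `Us2` under this mapping.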
LLVM GN Syncbot [Tue, 10 May 2022 21:06:25 +0000 (21:06 +0000)]
[gn build] Port f822db7670d4
Mehdi Amini [Tue, 10 May 2022 21:04:24 +0000 (21:04 +0000)]
Remove unused variable (fix -Werror build on MSVC)
Jan Korous [Tue, 10 May 2022 21:03:48 +0000 (14:03 -0700)]
Revert "[utils] Avoid hardcoding metadata ids in update_cc_test_checks"
This reverts commit ce583b14b2ec37b1c168bb92020680cb452502b3.
Mingming Liu [Tue, 10 May 2022 20:46:46 +0000 (13:46 -0700)]
Revert "Enhance peephole optimization."
This reverts commit d84ca05ef7f897fdd51900ea07e3c5344632130a.
Will revert, update commit message and re-commit.
Vasileios Porpodas [Fri, 6 May 2022 18:13:46 +0000 (11:13 -0700)]
[SLP][NFC] Precommit a lit test for a followup patch that improves tree reordering for external users.
Differential Revision: https://reviews.llvm.org/D125110
Erich Keane [Tue, 10 May 2022 20:34:01 +0000 (13:34 -0700)]
[NFC] Replace not-null and not-isa check with a not-isa_and_nonnull
Jim Ingham [Tue, 10 May 2022 20:27:47 +0000 (13:27 -0700)]
Add the "sent break" message to the "gdb-remote packets" channel
It was originally only in "gdb-remote process" but it is convenient to
also have it come as part of gdb-remote packets.
Matthias Braun [Tue, 10 May 2022 20:25:32 +0000 (13:25 -0700)]
Nathan James [Tue, 10 May 2022 20:06:17 +0000 (21:06 +0100)]
[clang-tidy] Fix unintended change left in 12cb540529e
jeff [Tue, 26 Apr 2022 18:23:13 +0000 (11:23 -0700)]
[AMDGPU] Allow for MFMA Inst Clustering
This patch adds cluster edges between independent MFMA instructions. Additionally, it propagates all predecessors of cluster insts to the root of the cluster(s), and all successors to the leaf(ves) of the cluster(s) -- this is done to remove the possibility that those insts will be interspersed within the cluster.
Reviewed By: kerbowa
Differential Revision: https://reviews.llvm.org/D124678
Erich Keane [Tue, 10 May 2022 19:48:01 +0000 (12:48 -0700)]
[NFC] Add missing 'break' in a switch case
Mingming Liu [Wed, 20 Apr 2022 19:28:19 +0000 (19:28 +0000)]
Enhance peephole optimization.
Differential Revision: https://reviews.llvm.org/D124118
Erich Keane [Tue, 10 May 2022 19:27:45 +0000 (12:27 -0700)]
[NFC]Add Missing Break in switch that we didn't notice because it was
last.
jeff [Thu, 28 Apr 2022 23:50:55 +0000 (16:50 -0700)]
[NFC] Fix typo
Reviewed By: kerbowa
Differential Revision: https://reviews.llvm.org/D124647
Arthur Eubanks [Tue, 10 May 2022 02:32:14 +0000 (19:32 -0700)]
[BasicAA] Fix order in which we pass MemoryLocations to alias()
D98718 caused the order of Values/MemoryLocations we pass to alias() to
be significant due to storing the offset in the PartialAlias case. But
some callers weren't audited and were still passing swapped arguments,
causing the returned PartialAlias offset to be negative in some
cases. For example, the newly added unittests would return -1
instead of 1.
Fixes #55343, a miscompile.
Reviewed By: asbirlea, nikic
Differential Revision: https://reviews.llvm.org/D125328
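A toy illustration of why argument order matters once an offset is attached to the result. This is a Python sketch under a stated assumption: the direction convention used here (offset of the second location relative to the first) is chosen for the example and is not claimed to be BasicAA's definition. The function name is hypothetical.

```python
def partial_alias_offset(first_start, second_start):
    # Toy convention: how far the second location starts past the first.
    return second_start - first_start


correct = partial_alias_offset(0, 1)
swapped = partial_alias_offset(1, 0)
```

Swapping the two arguments flips the sign of the result, which is the kind of discrepancy (1 versus -1) the commit message describes.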
Florian Hahn [Tue, 10 May 2022 18:57:43 +0000 (19:57 +0100)]
[AArch64] Remove redundant f{min,max}nm intrinsics.
The patch extends AArch64TTIImpl::instCombineIntrinsic to simplify
llvm.aarch64.neon.f{min,max}nm(a, a) -> a.
This helps with simplifying code written using the ACLE, e.g.
see https://godbolt.org/z/jYxsoc89c
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D125234
Yaxun (Sam) Liu [Tue, 10 May 2022 18:55:59 +0000 (14:55 -0400)]
Fix indentation in ReleaseNotes.rst
Nicolas Vasilache [Tue, 10 May 2022 17:31:22 +0000 (17:31 +0000)]
[mlir][SCF] Retire `cloneWithNewYields` helper function.
This is now subsumed by `replaceLoopWithNewYields`.
Differential Revision: https://reviews.llvm.org/D125309
Mahesh Ravishankar [Fri, 6 May 2022 21:44:26 +0000 (21:44 +0000)]
[mlir][SCF] Add utility method to add new yield values to a loop.
The current implementation of `cloneWithNewYields` has a few issues:
- It clones the loop body of the original loop to create a new
loop. This is very expensive.
- It performs `erase` operations which are incompatible when this
method is called from within a pattern rewrite. All erases need to
go through `PatternRewriter`.
To address these a new utility method `replaceLoopWithNewYields` is added
which
- moves the operations from the original loop into the new loop.
- replaces all uses of the original loop with the corresponding
results of the new loop
- uses a callback to allow the caller to generate the new yield values.
- the original loop is modified to just yield the basic block
arguments corresponding to the iter_args of the loop. This
represents a no-op loop. The loop itself is dead (since all its uses
are replaced), but is not removed. The caller is expected to erase
the op. Consequently, this method can be called from within a
`matchAndRewrite` method of a `PatternRewriter`.
The `cloneWithNewYields` could be replaced with
`replaceLoopWithNewYields`, but that seems to trigger a failure during
walks, potentially due to the operations being moved. That is left as
a TODO.
Differential Revision: https://reviews.llvm.org/D125147
Alan Zhao [Tue, 10 May 2022 18:19:45 +0000 (14:19 -0400)]
[llvm-ml] Implement support for MASM's extern directive
The EXTERN keyword defines external symbols in MASM.
Credit goes to epastor@ for implementing most of the logic; I (ayzhao@)
added some bugfixes and tests.
[0]: https://docs.microsoft.com/en-us/cpp/assembler/masm/extern-masm?view=msvc-170
Reviewed By: epastor
Submitted By: epastor
Differential Revision: https://reviews.llvm.org/D125273
Yaxun (Sam) Liu [Tue, 3 May 2022 18:13:56 +0000 (14:13 -0400)]
[CUDA][HIP] support __noinline__ as keyword
CUDA/HIP programs use __noinline__ like a keyword e.g.
__noinline__ void foo() {} since __noinline__ is defined
as a macro __attribute__((noinline)) in CUDA/HIP runtime
header files.
However, gcc and clang support __attribute__((__noinline__))
the same as __attribute__((noinline)). Some C++ libraries
use __attribute__((__noinline__)) in their header files.
When CUDA/HIP programs include such header files,
clang will emit an error about invalid attributes.
This patch fixes this issue by supporting __noinline__ as
a keyword, so that CUDA/HIP runtime could remove
the macro definition.
Reviewed by: Aaron Ballman, Artem Belevich
Differential Revision: https://reviews.llvm.org/D124866
Sanjay Patel [Tue, 10 May 2022 18:20:43 +0000 (14:20 -0400)]
[InstCombine] fold shuffles with FP<->Int cast operands
shuffle (cast X), (cast Y), Mask --> cast (shuffle X, Y, Mask)
This is similar to a recent transform with fneg (b331a7ebc1e0),
but this is intentionally the most conservative first step to
try to avoid regressions in codegen. There are several
restrictions that could be removed as follow-up enhancements.
Note that a cast with a unary shuffle is currently canonicalized
in the other direction (shuffle after cast - D103038). We might
want to invert that to be consistent with this patch.
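The equivalence behind this fold can be checked on plain lists. The Python sketch below is a model only: `shuffle` and `cast` are hypothetical helpers that stand in for shufflevector and an int-to-FP cast, and it ignores undef mask elements and the restrictions the patch applies.

```python
def shuffle(x, y, mask):
    # Select lanes from the concatenation of x and y.
    lanes = x + y
    return [lanes[i] for i in mask]


def cast(v):
    # Element-wise int -> float conversion.
    return [float(e) for e in v]


x, y, mask = [1, 2, 3, 4], [5, 6, 7, 8], [0, 5, 2, 7]
before = shuffle(cast(x), cast(y), mask)  # shuffle (cast X), (cast Y), Mask
after = cast(shuffle(x, y, mask))         # cast (shuffle X, Y, Mask)
```

Because the cast acts on each lane independently, both forms produce the same vector, and the second form needs one cast instead of two.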
Sanjay Patel [Tue, 10 May 2022 15:52:07 +0000 (11:52 -0400)]
[InstCombine] add tests for shuffles with FP<->int cast operands; NFC
Joseph Huber [Tue, 10 May 2022 17:19:16 +0000 (13:19 -0400)]
[OpenMP] Fix embedding offload code when there is no offloading toolchain
Summary:
We use the `--offload-new-driver` option to enable offload code
embedding. The check for when to do this was flawed and was enabling it
too early in the case of OpenMP, causing a segfault when dereferencing
the offloading toolchain.
Jan Korous [Sat, 23 Apr 2022 02:01:50 +0000 (19:01 -0700)]
[utils] Avoid hardcoding metadata ids in update_cc_test_checks
Specifically for: !tbaa, !tbaa.struct, !annotation, !srcloc, !nosanitize.
The goal is to avoid test brittleness caused by hardcoded values.
Differential Revision: https://reviews.llvm.org/D123273
Matthias Braun [Wed, 27 Apr 2022 01:27:21 +0000 (18:27 -0700)]
CodeGenPrepare: Replace constant PHI arguments with switch condition value
We often see code like the following after running SCCP:
switch (x) { case 42: phi(42, ...); }
This tends to produce bad code as we currently materialize the constant
phi-argument in the switch-block. This increases register pressure and
if the pattern repeats for `n` case statements, we end up generating `n`
constant values.
This changes CodeGenPrepare to catch this pattern and revert it back to:
switch (x) { case 42: phi(x, ...); }
Differential Revision: https://reviews.llvm.org/D124552
Matthias Braun [Tue, 3 May 2022 17:53:34 +0000 (10:53 -0700)]
Avoid 8 and 16bit switch conditions on x86
This adds a `TargetLoweringBase::getSwitchConditionType` callback to
give targets a chance to control the type used in
`CodeGenPrepare::optimizeSwitchInst`.
Implement callback for X86 to avoid i8 and i16 types where possible as
they often incur extra zero-extensions.
This is NFC for non-X86 targets.
Differential Revision: https://reviews.llvm.org/D124894
Matthias Braun [Tue, 3 May 2022 22:01:39 +0000 (15:01 -0700)]
Use update_llc_test_checks for the switch.ll test; add new test
- Change `switch.ll` test to a style suitable for
`tools/update_llc_test_checks.py`.
- Precommit test for upcoming changes:
- Add `switch_i8` to `test/CodeGen/X86/switch.ll`.
- Add `test/CodeGen/X86/switch-phi-const.ll`.
Differential Revision: https://reviews.llvm.org/D124893
Kadir Cetinkaya [Mon, 9 May 2022 09:25:29 +0000 (11:25 +0200)]
[clangd] Support for standard inlayHint protocol
- Make clangd's internal representation more aligned with the standard.
We keep range and extra inlayhint kinds around, but don't serialize
them on standard version.
- Have custom serialization for extension (ugly, but going to go away).
- Support both versions until clangd-17.
- Don't advertise extension if client has support for standard
implementation.
- Log a warning at startup about extension being deprecated, if client
doesn't have support.
Differential Revision: https://reviews.llvm.org/D125228
Mike Rice [Mon, 9 May 2022 18:41:38 +0000 (11:41 -0700)]
[OpenMP] Add mangling support for linear modifiers (ref,uval,val)
Add mangling for linear parameters specified with ref, uval, and val
for 'omp declare simd' vector functions.
Add missing stride for linear this parameters.
Differential Revision: https://reviews.llvm.org/D125269
Tsukasa OI [Tue, 10 May 2022 16:25:43 +0000 (00:25 +0800)]
[RISCV] 'K'-extension ordering
This commit adds 'K' to supported extension list (before 'J').
It makes "Zk*" extensions correctly placed before "Zv*" extensions.
Multi-letter "Z*" extensions are first ordered with the most closely
related alphabetical extension category ("IMAF..."). This is represented
in LLVM as `AllStdExts' variable in `llvm/lib/Support/RISCVISAInfo.cpp'.
However, it did not have 'k', so "Zk*" extensions were not correctly ordered.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D124340
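A rough sketch of the ordering rule described above, in Python. The two order strings are shortened, hypothetical examples and are not the contents of `AllStdExts`. Ranking an unknown category letter last is also an assumption of this sketch.

```python
def z_extension_key(name, order):
    # Rank a "z*" extension by the category letter that follows the 'z'.
    category = name[1]
    rank = order.index(category) if category in order else len(order)
    return (rank, name)


def sort_z_extensions(names, order):
    return sorted(names, key=lambda n: z_extension_key(n, order))


ORDER_WITHOUT_K = "mafdqcjv"  # hypothetical, shortened
ORDER_WITH_K = "mafdqckjv"    # same string with 'k' placed before 'j'

exts = ["zve32x", "zkn", "zfh"]
without_k = sort_z_extensions(exts, ORDER_WITHOUT_K)
with_k = sort_z_extensions(exts, ORDER_WITH_K)
```

Without 'k' in the order string, `zkn` falls behind `zve32x`. With 'k' present, it sorts before the "Zv*" extension as intended.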
Krzysztof Drewniak [Tue, 10 May 2022 15:37:53 +0000 (15:37 +0000)]
[mlir][AMDGPU] Add AMDGPU conversion patterns to ConvertGPUToROCDL
This ensures that attributes such as the index bitwidth propagate
correctly to the AMDGPUToROCDL patterns.
Differential Revision: https://reviews.llvm.org/D125320
Konstantin Varlamov [Tue, 10 May 2022 16:29:39 +0000 (09:29 -0700)]
[libc++][ranges] Implement `views::drop`.
The view itself has been implemented previously -- this patch only adds
the ability to pipe it.
Also finishes the implementation of [P1739](https://wg21.link/p1739) and
[LWG3407](https://wg21.link/lwg3407).
Differential Revision: https://reviews.llvm.org/D125156
David Green [Tue, 10 May 2022 16:17:03 +0000 (17:17 +0100)]
Revert "[AArch64] Generate AND in place of CSEL for predicated CTTZ"
This reverts commit 7dcd0ea683ed3175bc3ec6aed24901a9d504182e due to
issues reported postcommit with the correctness of truncated cttzs.
Craig Topper [Tue, 10 May 2022 15:56:31 +0000 (08:56 -0700)]
[CVP] Preserve exact name when converting sext->zext and ashr->lshr.
Previously we took the old name and always appended a numeric suffix.
Since we're doing a 1:1 replacement, it's clearer to keep the original
name exactly.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D125281
Craig Topper [Tue, 10 May 2022 15:56:23 +0000 (08:56 -0700)]
[SCCP] Preserve Name when converting SExt->ZExt.
This makes the output IR more readable since we're doing a one to
one replacement.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D125280
Peter Klausler [Fri, 29 Apr 2022 20:23:26 +0000 (13:23 -0700)]
[flang] Enforce limit on rank + corank
Fortran 2018 requires that a compiler allow objects whose rank + corank
is 15, and that's our maximum; detect and diagnose violations.
Differential Revision: https://reviews.llvm.org/D125153
Nikita Popov [Tue, 10 May 2022 15:43:27 +0000 (17:43 +0200)]
[InstCombine] Add additional freeze tests (NFC)
Ivan Kosarev [Tue, 10 May 2022 14:54:40 +0000 (15:54 +0100)]
[AMDGPU][GFX10] Support base+soffset+offset SMEM loads.
Also makes a step towards resolving
https://github.com/llvm/llvm-project/issues/38652
Reviewed By: foad, dp
Differential Revision: https://reviews.llvm.org/D125117
Aaron Ballman [Tue, 10 May 2022 15:14:24 +0000 (11:14 -0400)]
Diagnose unreachable generic selection associations
The controlling expression of a _Generic selection expression undergoes
lvalue conversion, array conversion, and function conversion before
picking the association. This means that array types, function types,
and qualified types are all unreachable code if they're used as an
association. I've been caught by this twice in the past few months and
I figure that if a WG14 member can't seem to remember this rule, users
are also likely to struggle with it. So this adds an on-by-default
unreachable code diagnostic for generic selection expression
associations.
Note, we don't have to worry about function types as those are already
a constraint violation which generates an error.
Differential Revision: https://reviews.llvm.org/D125259
Peter Klausler [Thu, 5 May 2022 15:15:20 +0000 (08:15 -0700)]
[flang] Fold real-valued MODULO() and MOD()
Evaluate real-valued references to the intrinsic functions MODULO
and MOD at compilation time without recourse to an external math
library.
Differential Revision: https://reviews.llvm.org/D125151
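For reference, the two intrinsics differ in the sign of their result. The Python sketch below follows the Fortran definitions (MOD uses truncation, MODULO uses floor). It is not flang's folding code, the helper names are hypothetical, and a zero divisor is not handled.

```python
import math


def fortran_mod(a, p):
    # MOD(a, p) = a - AINT(a / p) * p; the result has the sign of a.
    return math.fmod(a, p)


def fortran_modulo(a, p):
    # MODULO(a, p) = a - FLOOR(a / p) * p; the result has the sign of p.
    # Python's float % operator follows the same sign rule.
    return a % p
```

For a = -7.5 and p = 2.0, MOD gives -1.5 while MODULO gives 0.5.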
Louis Dionne [Tue, 10 May 2022 15:14:59 +0000 (11:14 -0400)]
[libc++abi][NFC] Fix typo in comment
Ashay Rane [Mon, 9 May 2022 18:22:43 +0000 (11:22 -0700)]
[mlir] Fail early if AnalysisState::getBuffer() returns failure
This patch updates calls to AnalysisState::getBuffer() so that we return
early with a failure if the call does not succeed.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D125251
Daniel Bertalan [Tue, 10 May 2022 14:52:52 +0000 (15:52 +0100)]
[CodeGen] Use ABI alignment for C++ new expressions
In case of placement new, if we do not know the alignment of the
operand, we can't assume it has the preferred alignment. It might be
e.g. a pointer to a struct member which follows ABI alignment rules.
This makes UBSAN no longer report "constructor call on misaligned
address" when constructing a double into a struct field of type double
on i686. The psABI specifies an alignment of 4 bytes, but the preferred
alignment used by Clang is 8 bytes.
We now use ABI alignment for allocating new as well, as the preferred
alignment should be used for over-aligning e.g. local variables, which
isn't relevant for ABI code dealing with operator new. AFAICT there
wouldn't be problems either way though.
Fixes #54845.
Differential Revision: https://reviews.llvm.org/D124736
Krzysztof Drewniak [Wed, 30 Mar 2022 21:56:19 +0000 (21:56 +0000)]
[MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics
By analogy with the NVGPU dialect, introduce an AMDGPU dialect for
AMD-specific intrinsic wrappers.
The dialect initially includes wrappers around the raw buffer intrinsics.
On AMD GPUs, a memref can be converted to a "buffer descriptor" that
allows more precise control of memory access, such as by allowing for
out of bounds loads/stores to be replaced by 0/ignored without adding
additional conditional logic, which is important for performance.
The repository currently contains a limited conversion from
transfer_read/transfer_write to Mubuf intrinsics, which are older,
deprecated intrinsics for the same functionality.
The new amdgpu.raw_buffer_* ops allow these operations to be used
explicitly and for including metadata such as whether the target
chipset is an RDNA chip or not (which impacts the interpretation of
some bits in the buffer descriptor), while still maintaining an
MLIR-like interface.
(This change also exposes the floating-point atomic add intrinsic.)
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D122765
Sam McCall [Sat, 7 May 2022 00:15:41 +0000 (02:15 +0200)]
[Frontend] Flip default of CreateInvocationOptions::ProbePrecompiled to false
This is generally a better default for tools other than the compiler, which
shouldn't assume a PCH file on disk is something they can consume.
Preserve the old behavior in places associated with libclang/c-index-test
(including ASTUnit) as there are tests relying on it and most important
consumers are out-of-tree. It's unclear whether the tests are specifically
trying to test this functionality, and what the downstream implications of
removing it are. Hopefully someone more familiar can clean this up in future.
Differential Revision: https://reviews.llvm.org/D125149
Peter Klausler [Wed, 4 May 2022 23:35:31 +0000 (16:35 -0700)]
[flang] Fold real-valued DIM()
Fold references to the intrinsic function DIM with constant real
arguments. And clean up folding of comparisons with NaNs to address
a problem noticed in testing -- NaNs should successfully compare
unequal to all values, including themselves, instead of failing all
comparisons.
Differential Revision: https://reviews.llvm.org/D125146
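A small sketch of the two behaviours mentioned above, in Python for illustration. `dim` is a hypothetical helper that follows the Fortran definition of DIM. The NaN checks rely on IEEE 754 comparison rules, which Python floats follow.

```python
import math


def dim(x, y):
    # DIM(x, y) is x - y when x > y, and zero otherwise.
    return x - y if x > y else 0.0


nan = math.nan
nan_not_equal_self = nan != nan   # "unequal" succeeds, even against itself
nan_equal_self = nan == nan       # every other comparison fails
nan_less_than_one = nan < 1.0
```

Only the inequality comparison is true when a NaN is involved. Equality and the ordered comparisons are all false.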
Chris Lattner [Tue, 10 May 2022 09:22:25 +0000 (10:22 +0100)]
[MLIR Parser] Improve QoI for "expected token" errors
A typical problem with missing a token is that the missing
token is at the end of a line. The problem with this is that
the error message gets reported on the start of the following
line (which is where the next / invalid token is) which can
be confusing.
Handle this by noticing this case and backing up to the end of
the previous line.
Differential Revision: https://reviews.llvm.org/D125295
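The idea can be sketched on a plain string. This Python version is a simplified model, not the MLIR lexer's implementation: the function name is hypothetical, offsets are character indices, and comments between the two tokens are ignored.

```python
def expected_token_location(source, next_token_offset):
    """Return the offset at which to report a missing token."""
    # Walk back over the whitespace that precedes the next token.
    pos = next_token_offset
    crossed_newline = False
    while pos > 0 and source[pos - 1] in " \t\r\n":
        if source[pos - 1] == "\n":
            crossed_newline = True
        pos -= 1
    # If the gap spans a line break, point just past the last character
    # of the earlier line; otherwise keep the original location.
    return pos if crossed_newline else next_token_offset


source = "foo(a, b\n  bar"
location = expected_token_location(source, source.index("bar"))
```

Here the missing `)` is reported at the end of `foo(a, b` and not at the start of `bar` on the following line.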
Amy Kwan [Mon, 9 May 2022 15:51:08 +0000 (10:51 -0500)]
[NFC][PowerPC] Add 32-bit AIX RUN lines to test cases.
This patch adds 32-bit AIX RUN lines to several test cases, along with the
addition of one new test case, to prepare for future codegen changes involving
the PPCISD::SCALAR_TO_VECTOR_PERMUTED node in 32-bit mode.
Nicolai Hähnle [Wed, 13 Apr 2022 02:10:04 +0000 (21:10 -0500)]
AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32
Only fold for uniform values on pre-GFX9 chips. GFX9+ allow us
to keep the calculation entirely on the SALU.
For subtargets where integer multiplication isn't full-rate, avoid
folding if the multiply has too many uses.
Finally, we expand 64x32 and 64x64 multiplies here as well, if they
feed into an addition. This results in better code generation than
the generic expansion for such multiplies because we end up using
the accumulator of the MAD instructions.
Differential Revision: https://reviews.llvm.org/D123835
Dawid Jurczak [Sat, 7 May 2022 09:34:45 +0000 (11:34 +0200)]
[GVNSink] Make GVNSink resistant against self referencing instructions (PR36954)
Before this change, the GVNSink pass suffered from a stack overflow while processing a self-referencing instruction in an unreachable basic block.
According to [1] and [2], it's reasonable to make the pass resistant against self-referencing instructions.
To fix the issue, we skip the sinking analysis when we reach an instruction coming from an unreachable block.
[1] https://groups.google.com/g/llvm-dev/c/843Tig9IzwA
[2] https://lists.llvm.org/pipermail/llvm-dev/2015-February/082629.html
Differential Revision: https://reviews.llvm.org/D113897
Louis Dionne [Mon, 9 May 2022 17:31:42 +0000 (13:31 -0400)]
[libc++abi] Reword uncaught exception termination message
When we terminate due to an exception being uncaught, libc++abi prints
a message saying "terminating with uncaught exception [...]". This patch
changes that to say "terminating due to uncaught exception [...]" instead,
which is a bit clearer. Indeed, I've seen some people being confused and
thinking that libc++abi was the component throwing the exception.
Differential Revision: https://reviews.llvm.org/D125245
Nikita Popov [Tue, 10 May 2022 13:10:28 +0000 (15:10 +0200)]
[SCEVExpander] Remove handling for mixed int/pointer min/max (NFCI)
Mixed int/pointer min/max are no longer possible.
Alexey Bataev [Tue, 10 May 2022 12:51:14 +0000 (05:51 -0700)]
[SLP][NFC]Add a test for improved shuffles in buildvector sequences, NFC.
Nikita Popov [Tue, 10 May 2022 12:50:50 +0000 (14:50 +0200)]
[IndVarSimplify] Regenerate test checks (NFC)
Nicolai Hähnle [Thu, 28 Apr 2022 21:27:57 +0000 (16:27 -0500)]
GlobalISel: Trivial documentation and comment fixes
Differential Revision: https://reviews.llvm.org/D124808
Rosie Sumpter [Tue, 3 May 2022 09:58:09 +0000 (10:58 +0100)]
[Sema][SVE2] Move/simplify Sema testing for SVE2 ACLE builtins
Currently for SVE2 ACLE builtins, single tests are used to verify both
clang code generation (when the feature is available) and semantic
error/warning messages (when the feature is unavailable). This
patch moves the semantic testing for the target feature flag into
dedicated Sema tests.
Differential Revision: https://reviews.llvm.org/D124850
Rosie Sumpter [Tue, 3 May 2022 09:39:41 +0000 (10:39 +0100)]
[Sema][SVE] Move/simplify Sema testing for SVE ACLE builtins
Currently for SVE ACLE builtins, single tests are used to verify both
clang code generation (when the feature is available) and semantic
error/warning messages (when the feature is unavailable). This
patch moves the semantic testing into dedicated Sema tests.
Differential Revision: https://reviews.llvm.org/D124924
Haojian Wu [Sun, 8 May 2022 20:14:27 +0000 (22:14 +0200)]
[pseudo] Add benchmarks for pseudoparser.
Running on SemaDecl.cpp with the cxx.bnf grammar:
```
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
runParseBNFGrammar      649389 ns       649365 ns         1013
runBuildLR            34591903 ns     34591380 ns           20
runPreprocessTokens   11418744 ns     11418703 ns           61 bytes_per_second=63.8971M/s
runGLRParse          282996863 ns    282988726 ns            2 bytes_per_second=2.57827M/s
runParseOverall      294969719 ns    294951870 ns            2 bytes_per_second=2.4737M/s
```
Differential Revision: https://reviews.llvm.org/D125226
Adrian Kuegel [Tue, 10 May 2022 10:58:01 +0000 (12:58 +0200)]
[mlir] Remove unused using declaration (NFC)
Fraser Cormack [Tue, 10 May 2022 09:54:12 +0000 (10:54 +0100)]
[RISCV][NFC] Remove else after continue
Nikita Popov [Tue, 10 May 2022 10:17:09 +0000 (12:17 +0200)]
[LoopVectorize] Remove incorrect nuw flag from test (NFC)
nuw does not make sense for reverse iteration.
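A small C++ sketch of the arithmetic behind this, assuming 32-bit induction variables and a step of -1 (the helper names are hypothetical and this is not the test's IR): stepping downward is an unsigned add of 0xFFFFFFFF, which wraps on every iteration with a nonzero counter, so the no-unsigned-wrap promise cannot hold.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if A + B wraps in 32-bit unsigned arithmetic, i.e. the
// situation that an `add nuw` promises never happens.
bool addWrapsUnsigned(std::uint32_t A, std::uint32_t B) {
  return static_cast<std::uint32_t>(A + B) < A;
}

// Counts how many steps of a reverse loop `i = i + (-1)`, starting at
// Start and stopping at 0, wrap in the unsigned sense.
unsigned countWrappingSteps(std::uint32_t Start) {
  const std::uint32_t MinusOne = 0xFFFFFFFFu; // -1 as an unsigned value
  unsigned Wraps = 0;
  for (std::uint32_t I = Start; I != 0; I += MinusOne)
    if (addWrapsUnsigned(I, MinusOne))
      ++Wraps;
  return Wraps;
}
```

In this model every decrement wraps, so `countWrappingSteps(N)` equals `N` for any starting value.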
Nikita Popov [Tue, 10 May 2022 09:46:22 +0000 (11:46 +0200)]
[InstSimplify] Handle unknown function context in pointer icmp fold (PR54615)
This issue reproduces in the context of LoopDeletion, because the
bitcast does not get simplified away there. For a plain -inst-simplify
run the bitcast would get folded away first.
Fixes https://github.com/llvm/llvm-project/issues/54615.
Chuanqi Xu [Tue, 10 May 2022 09:24:24 +0000 (17:24 +0800)]
[NFC] [Coroutines] Remove EnableReuseStorageInFrame option
The EnableReuseStorageInFrame option is designed for testing only.
But it is better to use the *_PASS_WITH_PARAMS macro to stay consistent with
other passes.
Nikita Popov [Tue, 10 May 2022 09:24:02 +0000 (11:24 +0200)]
[InstCombine] Handle GEP scalar/vector base mismatch (PR55363)
30a12f3f6322399185fdceffe176152a58bb84ae switched the type check
to use the GEP result type rather than the GEP operand type.
However, the GEP result types may match even if the operand types
don't, in case GEPs with scalar/vector base and vector index
are compared.
Fixes https://github.com/llvm/llvm-project/issues/55363.
Lian Wang [Tue, 10 May 2022 09:20:56 +0000 (09:20 +0000)]
Revert "[RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation"
This patch makes CodeGen/test/AArch64/vecreduce-add-legalization.ll fail.
This reverts commit 17a8a1bb7126a7c1b0bc629d9299f2e5ae6db3f1.
Gabor Marton [Tue, 10 May 2022 09:17:59 +0000 (11:17 +0200)]
[analyzer] Attempt to fix test infeasible-crash.c
Lian Wang [Mon, 9 May 2022 06:18:34 +0000 (06:18 +0000)]
[RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125206
Lian Wang [Tue, 10 May 2022 08:07:46 +0000 (08:07 +0000)]
[RISCV] Add more tests for vector reduce mask operations
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125216
Nikita Popov [Fri, 22 Apr 2022 08:19:54 +0000 (10:19 +0200)]
[fuzzer] Reduce size of large.test
This halves the size of LargeTest, dropping time to compile this
file locally from 14s to 5.5s. Hopefully this will also fix the
persistent timeouts in pre-merge checks.
Differential Revision: https://reviews.llvm.org/D124237
Nikita Popov [Mon, 9 May 2022 14:47:27 +0000 (16:47 +0200)]
[MLIR] Split off MLIRExecutionEngineUtils to fix libMLIR.so build (PR54242)
Building libMLIR.so currently fails with:
> /usr/bin/ld: /tmp/ccNzulEA.ltrans39.ltrans.o: in function `(anonymous namespace)::SerializeToHsacoPass::optimizeLlvm(llvm::Module&, llvm::TargetMachine&)':
> /builddir/build/BUILD/llvm-project-15.0.0.src/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp:328: undefined reference to `mlir::makeOptimizingTransformer(unsigned int, unsigned int, llvm::TargetMachine*)'
This is because MLIRGPUTransforms depends on MLIRExecutionEngine in
https://github.com/llvm/llvm-project/blob/61bb2e4ea82fc5499a271d70d4537383d1942208/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp#L328,
but MLIRExecutionEngine is marked as excluded from libMLIR.so.
However, this code doesn't require the full execution engine: It
only performs middle-end optimization, and does not need any of
the JIT/codegen infrastructure. As such, split off a separate
library MLIRExecutionEngineUtils, which only contains that part
and is not excluded from libMLIR.so.
Fixes https://github.com/llvm/llvm-project/issues/54242.
Differential Revision: https://reviews.llvm.org/D125214
Gabor Marton [Mon, 2 May 2022 11:32:28 +0000 (13:32 +0200)]
[analyzer] Replace adjacent assumeInBound calls with assumeInBoundDual
This is to minimize superfluous assume calls.
Depends on D124758
Differential Revision: https://reviews.llvm.org/D124761
Gabor Marton [Fri, 6 May 2022 14:20:25 +0000 (16:20 +0200)]
[analyzer] Implement assume in terms of assumeDual
Summary:
By evaluating both children states, we are now capable of discovering
infeasible parent states. In this patch, `assume` is implemented in terms
of `assumeDual`. This might be suboptimal (e.g. where there are adjacent
assume(true) and assume(false) calls; the next patches address that). This patch
fixes a real CRASH.
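The idea can be sketched with a toy model in C++. This is not the analyzer's API: `ToyState` is a hypothetical type that tracks a single range constraint, the condition is fixed to `x >= K`, and the signatures below are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Toy program state: the only fact tracked is Lo <= x <= Hi.
// Hypothetical type, unrelated to the analyzer's ProgramState.
struct ToyState {
  int Lo, Hi;
  bool Feasible;
};

// Evaluates both outcomes of the condition `x >= K` and returns
// {state where it holds, state where it does not hold}.
std::pair<ToyState, ToyState> assumeDual(const ToyState &S, int K) {
  ToyState TrueState{std::max(S.Lo, K), S.Hi, true};
  ToyState FalseState{S.Lo, std::min(S.Hi, K - 1), true};
  TrueState.Feasible = S.Feasible && TrueState.Lo <= TrueState.Hi;
  FalseState.Feasible = S.Feasible && FalseState.Lo <= FalseState.Hi;
  return {TrueState, FalseState};
}

// `assume` expressed through `assumeDual`: pick the requested branch.
// If neither branch is feasible, the parent itself was infeasible.
ToyState assume(const ToyState &S, int K, bool Assumption,
                bool &ParentInfeasible) {
  std::pair<ToyState, ToyState> Children = assumeDual(S, K);
  ParentInfeasible = !Children.first.Feasible && !Children.second.Feasible;
  return Assumption ? Children.first : Children.second;
}
```

A parent such as `{7, 3, true}` carries a contradiction that nobody has noticed yet; only when both children come back infeasible does it surface. The cost is that each `assume` call now evaluates two branches, which is the suboptimality mentioned above.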
Fixes https://github.com/llvm/llvm-project/issues/54272
Differential Revision: https://reviews.llvm.org/D124758
Gabor Marton [Tue, 3 May 2022 10:11:49 +0000 (12:11 +0200)]
[analyzer] Indicate if a parent state is infeasible
In some cases a parent State is already infeasible, but we recognize
this only if an additional constraint is added. This patch is the first
of a series to address this issue. In this patch `assumeDual` is changed
to clone the parent State but with an `Infeasible` flag set, and this
infeasible-parent is returned both for the true and false case. Then
when we add a new transition in the exploded graph and the destination
is marked as infeasible, the node will be a sink node.
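A minimal C++ sketch of that flow, using hypothetical stand-in types (`FlaggedState`, `GraphNode`) and helper names rather than the analyzer's real classes:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a program state and an exploded-graph node.
struct FlaggedState {
  int Id;
  bool Infeasible;
};
struct GraphNode {
  FlaggedState State;
  bool IsSink;
};

// Models the case where neither branch is satisfiable: the parent is
// cloned with the Infeasible flag set, and the same clone is returned
// for both the true and the false case.
std::pair<FlaggedState, FlaggedState>
dualForInfeasibleParent(const FlaggedState &Parent) {
  FlaggedState Clone = Parent;
  Clone.Infeasible = true;
  return {Clone, Clone};
}

// Adds a node for Dest to the graph; a node whose state is flagged as
// infeasible becomes a sink, so exploration stops there.
GraphNode addTransition(std::vector<GraphNode> &Graph,
                        const FlaggedState &Dest) {
  Graph.push_back(GraphNode{Dest, Dest.Infeasible});
  return Graph.back();
}
```

Callers that ignore the flag still receive a state object in this model, and the graph itself halts exploration at the sink.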
Related bug:
https://github.com/llvm/llvm-project/issues/50883
Actually, this patch does not solve that bug in the solver; rather, with
this patch we can handle the general parent-infeasible cases.
The next step would be to change the State API and require all checkers to
use the `assume*Dual` API and deprecate the simple `assume` calls.
Hopefully, the next patch will introduce `assumeInBoundDual` and will
solve the CRASH we have here:
https://github.com/llvm/llvm-project/issues/54272
Differential Revision: https://reviews.llvm.org/D124674
Nikita Popov [Tue, 10 May 2022 08:01:24 +0000 (10:01 +0200)]
[Docs] Clarify CLANG_ENABLE_OPAQUE_POINTERS behavior (NFC)
While it originally did, this option no longer affects the cc1
interface. For the cc1 interface, -no-opaque-pointers has to be
passed; there is no CMake option for it.