Florian Hahn [Wed, 15 Jun 2022 17:48:02 +0000 (18:48 +0100)]
[LV] Remove unneeded CustomBuilder arg from setDebugLocFromInst (NFC).
The only user that passed in a custom builder was passing in
VPTransformState::Builder, which is the same as ILV::Builder.
Fangrui Song [Wed, 15 Jun 2022 17:46:37 +0000 (10:46 -0700)]
[llvm-profdata][test] Change -Wl,-no-pie to -no-pie after D127808
The driver option -no-pie is preferred: Clang selects different crt*.o files,
though the PIC one usually can replace the non-PIC one.
Thomas Raoux [Wed, 15 Jun 2022 17:16:56 +0000 (17:16 +0000)]
[mlir][GPUToNVVM] Fix bug in mma elementwise lowering
The maxf implementation of the wmma elementwise op was incorrect because the
operands of the select that checks for NaN were swapped.
Differential Revision: https://reviews.llvm.org/D127879
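A minimal scalar sketch of the select order this fix is about, assuming the lowering computes an ordered max and then substitutes NaN when either operand is NaN. The function names are hypothetical, and this is plain C++ rather than the MLIR lowering itself.

```cpp
#include <limits>

// Hypothetical helper: true only for NaN, since NaN compares unequal to itself.
constexpr bool isNaN(float x) { return x != x; }

// Scalar model of a NaN-propagating max (an assumption, not the real lowering).
constexpr float maxfPropagateNaN(float lhs, float rhs) {
  // Shape: select(isUnordered, NaN, select(lhs > rhs, lhs, rhs)).
  // Swapping the two value operands of the outer select would return NaN for
  // ordinary inputs and a number for NaN inputs.
  return (isNaN(lhs) || isNaN(rhs)) ? std::numeric_limits<float>::quiet_NaN()
                                    : (lhs > rhs ? lhs : rhs);
}
```

With this operand order, ordinary inputs yield the larger value and any NaN input yields NaN.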
Simon Pilgrim [Wed, 15 Jun 2022 17:20:00 +0000 (18:20 +0100)]
[X86] X86InstrInfo.cpp - fix signed/unsigned promotion warnings in addImm calls
addImm takes an int64_t arg, but we were using uint64_t types.
Mitch Phillips [Wed, 15 Jun 2022 16:54:17 +0000 (09:54 -0700)]
[clang] Add -fsanitize=memtag-globals (no-op).
Adds the -fsanitize plumbing for memtag-globals. Makes -fsanitize=memtag
imply -fsanitize=memtag-globals.
This has no effect on codegen for now.
Reviewed By: eugenis, aaron.ballman
Differential Revision: https://reviews.llvm.org/D127163
Okwan Kwon [Tue, 14 Jun 2022 21:16:26 +0000 (14:16 -0700)]
[mlir] add an option to print op stats in JSON
Differential Revision: https://reviews.llvm.org/D127691
Quinn Pham [Wed, 8 Jun 2022 14:47:48 +0000 (09:47 -0500)]
[PowerPC] emit VSX instructions instead of VMX instructions for vector loads and stores
This patch changes the PowerPC backend to generate VSX load/store instructions
for all vector loads/stores on Power8 and earlier (LE) instead of VMX
load/store instructions. The reason for this change is that VMX instructions
require the vector to be 16-byte aligned. So, a vector load/store will fail with
VMX instructions if the vector is misaligned. Also, `gcc` generates VSX
instructions in this situation which allow for unaligned access but require a
swap instruction after loading/before storing. This is not an issue for BE
because we already emit VSX instructions since no swap is required. And this is
not an issue on Power9 and up since we have access to `lxv[x]`/`stxv[x]` which
allow for unaligned access and do not require swaps.
This patch also delays the VSX load/store for LE combines until after
LegalizeOps to prioritize other load/store combines.
Reviewed By: #powerpc, stefanp
Differential Revision: https://reviews.llvm.org/D127309
Rob Suderman [Wed, 15 Jun 2022 16:54:23 +0000 (09:54 -0700)]
[tosa] Lower tosa.slice to tensor.slice for dynamic case
The existing slice lowering only supported static shapes.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D127704
Benjamin Kramer [Wed, 15 Jun 2022 16:51:13 +0000 (18:51 +0200)]
[SelectionDAG] Constant fold FP_TO_BF16 and BF16_TO_FP.
Snehasish Kumar [Tue, 14 Jun 2022 23:30:34 +0000 (23:30 +0000)]
[memprof] Update the test comments to include -Wl,-no-pie
Until we have symbolization for position independent code, let's update
this documentation since clang now defaults to position independent
code.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D127808
Alex Zinenko [Wed, 15 Jun 2022 16:38:14 +0000 (18:38 +0200)]
[mlir] address post-commit review for D127724
- make transform.alternatives op apply only to isolated-from-above payload IR
scopes;
- fix potential leak;
- fix several typos.
lorenzo chelini [Wed, 15 Jun 2022 15:20:49 +0000 (17:20 +0200)]
[MLIR][Bufferization] Assume alias if no information is available
- Post (minor) fix after: https://reviews.llvm.org/D127301
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D127868
Pengxuan Zheng [Tue, 14 Jun 2022 02:22:14 +0000 (19:22 -0700)]
[LLD][COFF] Convert file name to lowercase when inserting it into visitedLibs
This seems to be a bug in `LinkerDriver::findFile`: the file name is not converted
to lowercase when being inserted into `visitedLibs`. This is the only exception
in the file; all other places convert file names to lowercase when
inserting them into `visitedLibs` (or `visitedFiles`).
Reviewed By: thieta, hans
Differential Revision: https://reviews.llvm.org/D127709
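A small sketch of why the lowercase conversion matters for a visited set, assuming a plain `std::set<std::string>` stands in for `visitedLibs`. The helpers `toLowerCopy` and `markVisited` are hypothetical and are not LLD functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper; LLD uses its own string utilities.
inline std::string toLowerCopy(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Returns true the first time a file name is seen, ignoring case.
inline bool markVisited(std::set<std::string> &visited,
                        const std::string &path) {
  assert(!path.empty() && "expected a non-empty file name");
  return visited.insert(toLowerCopy(path)).second;
}
```

If one insertion site skipped the conversion, `Kernel32.LIB` and `kernel32.lib` would be recorded as two different libraries.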
Paul Robinson [Wed, 15 Jun 2022 16:35:49 +0000 (09:35 -0700)]
[PS5] Support sin+cos->sincos optimization
Juergen Ributzka [Mon, 13 Jun 2022 22:57:51 +0000 (15:57 -0700)]
[llvm] Fix MachO exports trie parsing.
The exports trie parser ordinal validation check doesn't consider the case where
the ordinal can be zero or negative for certain special values that are defined
in BindSpecialDylib. Update the validation to account for that fact and add a
test case.
This fixes rdar://94844233.
Differential Revision: https://reviews.llvm.org/D127806
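A hedged sketch of an ordinal check that accepts the special non-positive values. The numeric values follow the Mach-O `BIND_SPECIAL_DYLIB_*` constants, but the enum and function names are hypothetical, and exactly which special values the real parser accepts is an assumption here.

```cpp
// Values follow the Mach-O BIND_SPECIAL_DYLIB_* constants; names are made up.
enum SpecialDylib : int {
  SelfDylib = 0,
  MainExecutable = -1,
  FlatLookup = -2,
  WeakLookup = -3
};

// An ordinal is acceptable if it names a loaded library (1-based) or is one of
// the special non-positive values.
constexpr bool isValidLibraryOrdinal(int ordinal, int libraryCount) {
  return ordinal > 0 ? ordinal <= libraryCount : ordinal >= WeakLookup;
}
```

A check that only compares the ordinal against the library count, after treating it as unsigned, would reject the special values.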
Joseph Huber [Tue, 14 Jun 2022 18:33:32 +0000 (14:33 -0400)]
[Binary] Add iterator to the OffloadBinary string maps
The offload binary contains internally a string map of all the key and
value pairs identified in the binary itself. Normally users query these
values from the `getString` function, but this makes it difficult to
identify which strings are available. This patch adds a simple const
iterator range to the offload binary allowing users to iterate through
the strings.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127774
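A rough sketch of the idea, assuming a `std::map` stands in for the internal string map. `FakeOffloadBinary`, `StringRange` and `strings()` are hypothetical names and do not reflect the real `OffloadBinary` interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a binary that stores key/value strings.
class FakeOffloadBinary {
public:
  using StringMap = std::map<std::string, std::string>;

  // Read-only iterator pair usable in a range-based for loop.
  struct StringRange {
    StringMap::const_iterator First, Last;
    StringMap::const_iterator begin() const { return First; }
    StringMap::const_iterator end() const { return Last; }
  };

  explicit FakeOffloadBinary(StringMap entries) : Entries(std::move(entries)) {}

  // Point lookup: the caller must already know the key.
  const std::string &getString(const std::string &key) const {
    auto it = Entries.find(key);
    assert(it != Entries.end() && "unknown key");
    return it->second;
  }

  // Range over every key/value pair, so callers can discover the keys.
  StringRange strings() const { return {Entries.begin(), Entries.end()}; }

private:
  StringMap Entries;
};
```

A caller can then write `for (const auto &kv : bin.strings())` instead of guessing keys for `getString`.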
Stanislav Gatev [Wed, 15 Jun 2022 14:58:13 +0000 (14:58 +0000)]
[clang][dataflow] Make `Value` and `StorageLocation` non-copyable
This makes it harder to misuse APIs that return references by
accidentally copying the results, which could happen when assigning
them to variables declared as `auto`.
Differential Revision: https://reviews.llvm.org/D127865
Reviewed-by: ymandel, xazax.hun
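A minimal illustration of the pitfall, using hypothetical `FakeValue` and `FakeEnvironment` classes rather than the real dataflow types. Deleting the copy operations turns the accidental copy into a compile error.

```cpp
#include <type_traits>

// Hypothetical stand-in for a framework type that is handed out by reference.
class FakeValue {
public:
  explicit FakeValue(int kind) : Kind(kind) {}
  FakeValue(const FakeValue &) = delete;
  FakeValue &operator=(const FakeValue &) = delete;
  int getKind() const { return Kind; }

private:
  int Kind;
};

// Hypothetical owner that returns a reference to its stored value.
class FakeEnvironment {
public:
  FakeValue &getValue() { return Val; }

private:
  FakeValue Val{1};
};

// auto Copy = Env.getValue();  // no longer compiles: copy constructor deleted
// auto &Ref = Env.getValue();  // fine: binds a reference to the stored value
static_assert(!std::is_copy_constructible<FakeValue>::value,
              "FakeValue must not be copyable");
```

Plain `auto` deduces a non-reference type, which is why the copy used to happen silently.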
Mark de Wever [Mon, 13 Jun 2022 18:05:36 +0000 (20:05 +0200)]
[libc++] Removes unneeded <iterator> includes.
Reviewed By: #libc, philnik
Differential Revision: https://reviews.llvm.org/D127675
Thomas Raoux [Wed, 15 Jun 2022 15:15:57 +0000 (15:15 +0000)]
[mlir][vector] NFC remove dependency of VectorTransform to GPU dialect
Make the reduction distribution pattern more generic and remove the layering
problem. The new pattern to distribute reductions is now independent of
GPU and takes a lambda to decide how the distributed reduction should be
generated.
Differential Revision: https://reviews.llvm.org/D127867
Paul Robinson [Wed, 15 Jun 2022 16:02:03 +0000 (09:02 -0700)]
[PS5] Trap after noreturn calls, with special case for stack-check-fail
Luo, Yuanke [Wed, 15 Jun 2022 11:03:18 +0000 (19:03 +0800)]
[CodeGen] Fix the bug of machine sink
The use operand may be undefined. In that case we can just continue to
check the next operand since it won't increase register pressure.
Differential Revision: https://reviews.llvm.org/D127848
Balazs Benics [Wed, 15 Jun 2022 15:08:27 +0000 (17:08 +0200)]
[analyzer] Relax constraints on const qualified regions
The arithmetic restriction seems to be artificial.
The comment below seems to be stale.
Thus, we remove both.
Depends on D127306.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D127763
Balazs Benics [Wed, 15 Jun 2022 15:08:27 +0000 (17:08 +0200)]
[analyzer] Treat system globals as mutable if they are not const
Previously, system globals were treated as immutable regions, except for
`errno`, which is known to be frequently modified.
D124244 wants to add a check for stores to immutable regions.
It would basically turn all stores to system globals into an error even
though we have no reason to believe that those mutable sys globals
should be treated as if they were immutable. And this leads to
false-positives if we apply D124244.
In this patch, I'm proposing to treat mutable sys globals as actually
mutable, hence allocate them into the `GlobalSystemSpaceRegion`, UNLESS
they were declared as `const` (and a primitive arithmetic type), in
which case, we should use `GlobalImmutableSpaceRegion`.
In any other cases, I'm using the `GlobalInternalSpaceRegion`, which is
no different than the previous behavior.
---
In the tests I added, only the last `expected-warning` differs from the baseline:
```lang=C++
void test_my_mutable_system_global_constraint() {
  assert(my_mutable_system_global > 2);
  clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{TRUE}}
  invalidate_globals();
  clang_analyzer_eval(my_mutable_system_global > 2); // expected-warning {{UNKNOWN}} It was previously TRUE.
}
void test_my_mutable_system_global_assign(int x) {
  my_mutable_system_global = x;
  clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{TRUE}}
  invalidate_globals();
  clang_analyzer_eval(my_mutable_system_global == x); // expected-warning {{UNKNOWN}} It was previously TRUE.
}
```
---
Unfortunately, the taint checker will also be affected.
The `stdin` global variable is a pointer, which is assumed to be a taint
source, and the rest of the taint propagation rules will propagate from
it.
However, since mutable variables are no longer treated as immutable, they
also get invalidated when an opaque function call happens, such as the
first `fscanf(stdin, ...)`. This would effectively remove taint from the
pointer and consequently disable all the remaining taint propagation
down the line from the `stdin` variable.
All that said, I decided to look through `DerivedSymbol`s as well, to
acquire the memregion in that case as well. This should preserve the
previously existing taint reports.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D127306
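One reading of the classification rule described in this commit, sketched as a pure function. The enum and the three boolean inputs are hypothetical simplifications. The real analyzer works on declarations and `MemSpaceRegion` kinds, and treating every non-system global as internal is an assumption of the sketch.

```cpp
// Hypothetical names standing in for the three global memory spaces.
enum class GlobalSpace { System, Immutable, Internal };

// System globals are mutable unless declared const with an arithmetic type;
// everything outside system headers keeps the previous (internal) behavior.
constexpr GlobalSpace classifyGlobal(bool inSystemHeader, bool isConst,
                                     bool isArithmetic) {
  return !inSystemHeader             ? GlobalSpace::Internal
         : (isConst && isArithmetic) ? GlobalSpace::Immutable
                                     : GlobalSpace::System;
}
```

Under this rule, a mutable system global such as the one in the tests lands in the system space and is invalidated by opaque calls.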
Phoebe Wang [Wed, 15 Jun 2022 14:22:32 +0000 (22:22 +0800)]
Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""
Fixed the missing SQRT promotion. Adding several missing operations too.
Balazs Benics [Wed, 15 Jun 2022 14:58:08 +0000 (16:58 +0200)]
[analyzer][NFC] Prefer using isa<> instead of getAs<> in conditions
Depends on D125709
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D127742
Balazs Benics [Wed, 15 Jun 2022 14:50:12 +0000 (16:50 +0200)]
[analyzer][NFC] Remove dead field of UnixAPICheckers
Initially, I thought there was some fundamental bug here by not using the
bool fields, but it turns out D55425 split this checker into two
separate ones; making these fields dead.
Depends on D127836, which uncovered this issue.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D127838
Balazs Benics [Wed, 15 Jun 2022 14:50:12 +0000 (16:50 +0200)]
[analyzer] Fix StreamErrorState hash bug
The `Profile` function was incorrectly implemented.
The `StreamErrorState` has an implicit `bool` conversion operator, which
will result in a different hash than faithfully hashing the raw value of
the enum.
I don't have a test for it, since it seems difficult to find one.
Even if we had one, any change in the hashing algorithm would
have a chance of breaking it, so I don't think it would justify the
effort.
Depends on D127836, which uncovered this issue by marking the related
`Profile` function dead.
Reviewed By: martong, balazske
Differential Revision: https://reviews.llvm.org/D127839
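A small sketch of how an implicit `bool` conversion can collapse distinct states into the same hash input. `ErrorFlags` and `hashInput` are hypothetical simplifications and not the checker's actual types or the `FoldingSetNodeID` API.

```cpp
// Hypothetical simplification of a three-flag error state.
struct ErrorFlags {
  bool NoError;
  bool FEof;
  bool FError;

  // Implicit conversion: true if any flag is set.
  constexpr operator bool() const { return NoError || FEof || FError; }

  // Faithful encoding: one bit per flag.
  constexpr unsigned rawBits() const {
    return (NoError ? 1u : 0u) | (FEof ? 2u : 0u) | (FError ? 4u : 0u);
  }
};

// Stand-in for a hashing call that accepts an integer.
constexpr unsigned hashInput(unsigned value) { return value; }
```

Passing an `ErrorFlags` object straight to `hashInput` goes through `operator bool`, so two different states both hash as 1, while hashing `rawBits()` keeps them distinct.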
Balazs Benics [Wed, 15 Jun 2022 14:50:12 +0000 (16:50 +0200)]
[analyzer][NFC] Remove dead code and modernize surroundings
Thanks @kazu for helping me clean these parts in D127799.
I'm leaving the dump methods, along with the unused visitor handlers and
the forwarding methods.
The dead parts actually helped to uncover two bugs, to which I'm going
to post separate patches.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D127836
Valentin Clement [Wed, 15 Jun 2022 14:48:46 +0000 (16:48 +0200)]
[flang][NFC] Fix some formatting
Fix some formatting mismatches in the file and reduce the diff with fir-dev
to be able to finish the upstreaming on this file.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127849
Matthias Springer [Wed, 15 Jun 2022 10:33:54 +0000 (12:33 +0200)]
[mlir][tablegen] Generate default attr values in Python bindings
When specifying an op attribute with a default value (via DefaultValuedAttr), the default value is a string of C++ code. In the general case, the default value of such an attribute cannot be translated to Python when generating the bindings. However, we can hard-code default Python values for frequently-used C++ default values.
This change adds a Python default value for empty ArrayAttrs.
Differential Revision: https://reviews.llvm.org/D127750
Sunho Kim [Wed, 15 Jun 2022 14:24:18 +0000 (23:24 +0900)]
[JITLink][ELF] Log enum name of unsupported relocation type.
Logs the enum name of the unsupported relocation type. This also changes elf/x86 to use the common util function (getELFRelocationTypeName) inside the llvm object module.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127715
Shao-Ce SUN [Wed, 15 Jun 2022 14:22:59 +0000 (22:22 +0800)]
[Driver][test] Make RISCV tests robust with PATH=
When `riscv64-unknown-linux-gnu-ld` is in the PATH, `clang -### -fuse-ld=ld --target=riscv64-unknown-linux-gnu` will use unknown-linux-gnu-ld first, which causes the error in the lit test.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D127589
Kadir Cetinkaya [Wed, 15 Jun 2022 14:10:19 +0000 (16:10 +0200)]
[clangd][NFC] Use the existing ASTContext from scope
Kadir Cetinkaya [Wed, 15 Jun 2022 08:05:26 +0000 (10:05 +0200)]
[clangd] Always desugar type aliases in hover
The alias itself is already included in the definition section of the
hover (it's printed as spelled in source code). So it doesn't provide any value
when we print the aliases as-is.
Fixes https://github.com/clangd/clangd/issues/1134.
Differential Revision: https://reviews.llvm.org/D127832
Krasimir Georgiev [Wed, 15 Jun 2022 13:59:32 +0000 (15:59 +0200)]
Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m"
This reverts commit 7625e01d661644a560884057755d48a0da8b77b4 and
dependent cbcce82ef6b512d97e92a319a75a03e997c844e1.
Commit 7625e01d661644a560884057755d48a0da8b77b4 causes some new codegen test
failures under asan, e.g., CodeGen/ARM/execute-only.ll:
https://lab.llvm.org/buildbot/#/builders/5/builds/24659/steps/15/logs/stdio.
Gabor Marton [Mon, 13 Jun 2022 15:19:01 +0000 (17:19 +0200)]
[analyzer][NFC][test] Add new RUN line with support-symbolic-integer-casts=true to expr-inspection.cpp
Added a new run line to bolster gradual transition of handling cast operations,
see https://discourse.llvm.org/t/roadmap-of-modeling-symbolic-cast-operations/63107
Differential Revision: https://reviews.llvm.org/D127649
Timm Bäder [Mon, 13 Jun 2022 13:48:20 +0000 (15:48 +0200)]
[clang][sema] Provide better diagnostic for missing template arguments
Instead of just complaining that "x is not a class, namespace or
enumeration", mention that using x requires template arguments.
Differential Revision: https://reviews.llvm.org/D127638
Fixes https://github.com/llvm/llvm-project/issues/55962
Timm Bäder [Tue, 14 Jun 2022 06:49:07 +0000 (08:49 +0200)]
[clang][NFC] Remove unused parameter from ActOnCXXNestedNameSpecifier
Alvin Wong [Wed, 15 Jun 2022 13:07:36 +0000 (16:07 +0300)]
[lldb] Fix loading DLL from some ramdisk on Windows
The WinAPI `GetFinalPathNameByHandle` is used to retrieve the DLL file
name from the HANDLE provided to `LOAD_DLL_DEBUG_EVENT` in the debug
loop. When this API fails, lldb will simply ignore that module.
Certain ramdisks (e.g. ImDisk) do not work with this API, which means
it is impossible to use lldb to debug a process which loads DLLs located
on this type of ramdisk. In order to make this work, we need to use a
fallback routine which involves creating a file mapping, using
`GetMappedFileName` to get a device path, then substituting the device
path with its drive letter.
References:
* https://developercommunity.visualstudio.com/t/cannot-debug-program-when-compiled-to-ram-drive/43004#T-N109926
* https://github.com/jrfonseca/drmingw/issues/65
* https://docs.microsoft.com/en-us/windows/win32/memory/obtaining-a-file-name-from-a-file-handle
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D126657
Furkan Usta [Wed, 15 Jun 2022 08:51:12 +0000 (10:51 +0200)]
[clang] Use correct visibility parameters when following a Using declaration
Fixes https://github.com/clangd/clangd/issues/1137
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D127629
Martin Storsjö [Fri, 10 Jun 2022 07:57:40 +0000 (10:57 +0300)]
[LLD] [MinGW] Implement --disable-reloc-section, mapped to /fixed
Since binutils 2.36, GNU ld defaults to emitting base relocations,
and that version added the new option --disable-reloc-section to
disable it.
Differential Revision: https://reviews.llvm.org/D127478
Martin Storsjö [Wed, 8 Jun 2022 20:55:45 +0000 (23:55 +0300)]
[COFF] Don't reject executables with data directories pointing outside of provided data
Before bb94611d6545c2c5271f5bb01de1aa4228a37250, we didn't check
that the sections in the COFF executable actually contained enough
raw data, when looking up what section contains tables pointed to
by the data directories.
That commit added checking, to avoid setting a pointer that points
out of bounds - by rejecting such executables.
It turns out that some binaries (e.g. a "helper.exe" provided by
NSIS) contain a base relocation table data directory that points
into the wrong section. It points inside the virtual address space
allocated for that section, but the section contains much less raw
data, and the table points outside of the provided raw data.
No longer reject such binaries (to let tools operate on them and
inspect them), but don't set the table pointers (so that when
printing e.g. base relocations, we don't print anything).
This should fix the regression pointed out in
https://reviews.llvm.org/D126898#3565834.
Differential Revision: https://reviews.llvm.org/D127345
Alexey Bataev [Tue, 14 Jun 2022 17:35:04 +0000 (10:35 -0700)]
[SLP] Improve reordering in presence of constant only nodes.
We can skip the analysis of the constant nodes; their order should not
affect the ordering of the trees/subtrees.
Differential Revision: https://reviews.llvm.org/D127775
PeixinQiao [Wed, 15 Jun 2022 13:10:36 +0000 (21:10 +0800)]
[flang] Fix one regression failure related to BIND(C) statement
For the BIND(C) statement, two common blocks with the same name can have the
same bind name. Fix the regression failure by adding this check. Also add
the regression tests.
Co-authored-by: Jean Perier <jperier@nvidia.com>
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D127841
Alex Zinenko [Wed, 15 Jun 2022 12:49:21 +0000 (14:49 +0200)]
[mlir] check interfaces are attached to the expected object
Add static assertions into the various `attachInterface` methods, which are
used for adding external interface implementations to attributes, operations
and types, that ensure `ExternalModel` interface classes are instantiated for
the same concrete operation, or the concrete base (potentially self) attribute
or type, as they are attached to. `FallbackModel`s remain usable for generic
interface models that should support more than one kind of entity.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D127850
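A heavily simplified sketch of the kind of static assertion described in this commit. All class names are hypothetical stand-ins, and the real MLIR templates have different shapes.

```cpp
#include <type_traits>

// Hypothetical operations.
struct AddOp {};
struct MulOp {};

// Hypothetical base for external models; records the op the model is for.
template <typename ConcreteOp>
struct ExternalModelBase {
  using ConcreteEntity = ConcreteOp;
};

struct AddOpModel : ExternalModelBase<AddOp> {};

// Hypothetical attachment point for one concrete operation.
template <typename ConcreteOp>
struct OpRegistration {
  template <typename Model>
  static constexpr bool attachInterface() {
    static_assert(
        std::is_same<typename Model::ConcreteEntity, ConcreteOp>::value,
        "external model attached to a different operation");
    return true;
  }
};

// OpRegistration<MulOp>::attachInterface<AddOpModel>() would fail to compile.
```

The mismatch is caught at compile time at the attachment site, rather than surfacing later as a misbehaving interface.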
Alex Zinenko [Wed, 15 Jun 2022 13:04:52 +0000 (15:04 +0200)]
[mlir] generate documentation for transform dialect extensions
Elena Lepilkina [Wed, 1 Jun 2022 07:47:40 +0000 (10:47 +0300)]
[test][RISCV] Precommit test for SeparateConstOffsetFromGEP (NFC)
Precommit test for D127727
Guillaume Chatelet [Tue, 14 Jun 2022 11:31:58 +0000 (11:31 +0000)]
[NFC][Alignment] Use Align in MCAlignFragment
Gabor Marton [Mon, 13 Jun 2022 15:04:42 +0000 (17:04 +0200)]
[analyzer][NFC][test] Add new RUN line with support-symbolic-integer-casts=true to svalbuilder-rearrange-comparisons.c
Added a new run line to bolster gradual transition of handling cast operations,
see https://discourse.llvm.org/t/roadmap-of-modeling-symbolic-cast-operations/63107
Differential Revision: https://reviews.llvm.org/D127646
Nico Weber [Wed, 15 Jun 2022 11:42:40 +0000 (07:42 -0400)]
[gn build] (semi-automatically) port fb34d531af95
Nico Weber [Wed, 15 Jun 2022 11:42:19 +0000 (07:42 -0400)]
[gn build] (semi-automatically) port 8bc0bb956421
Thomas Joerg [Wed, 15 Jun 2022 11:10:01 +0000 (13:10 +0200)]
Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""
This reverts commit 6e02e27536b9de25a651cfc9c2966ce471169355.
This introduces a crash in the backend. Reproducer in MLIR's LLVM
dialect follows. Let me know if you have trouble reproducing this.
module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @_mlir_ciface_tf_report_error(!llvm.ptr<i8>, i32, !llvm.ptr<i8>)
llvm.mlir.global internal constant @error_message_2208944672953921889("failed to allocate memory at loc(\22-\22:3:8)\00")
llvm.func @_mlir_ciface_tf_alloc(!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
llvm.func @Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<i8>, %arg1: i64, %arg2: !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> attributes {llvm.emit_c_interface, tf_entry} {
%0 = llvm.mlir.constant(8 : i32) : i32
%1 = llvm.mlir.constant(8 : index) : i64
%2 = llvm.mlir.constant(2 : index) : i64
%3 = llvm.mlir.constant(dense<0.000000e+00> : vector<4xf16>) : vector<4xf16>
%4 = llvm.mlir.constant(dense<[0, 1, 2, 3]> : vector<4xi32>) : vector<4xi32>
%5 = llvm.mlir.constant(dense<1.000000e+00> : vector<4xf16>) : vector<4xf16>
%6 = llvm.mlir.constant(false) : i1
%7 = llvm.mlir.constant(1 : i32) : i32
%8 = llvm.mlir.constant(0 : i32) : i32
%9 = llvm.mlir.constant(4 : index) : i64
%10 = llvm.mlir.constant(0 : index) : i64
%11 = llvm.mlir.constant(1 : index) : i64
%12 = llvm.mlir.constant(-1 : index) : i64
%13 = llvm.mlir.null : !llvm.ptr<f16>
%14 = llvm.getelementptr %13[%9] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%15 = llvm.ptrtoint %14 : !llvm.ptr<f16> to i64
%16 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
%17 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
%18 = llvm.mlir.null : !llvm.ptr<i64>
%19 = llvm.getelementptr %18[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%20 = llvm.ptrtoint %19 : !llvm.ptr<i64> to i64
%21 = llvm.alloca %20 x i64 : (i64) -> !llvm.ptr<i64>
llvm.br ^bb1(%10 : i64)
^bb1(%22: i64): // 2 preds: ^bb0, ^bb2
%23 = llvm.icmp "slt" %22, %arg1 : i64
llvm.cond_br %23, ^bb2, ^bb3
^bb2: // pred: ^bb1
%24 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
%25 = llvm.getelementptr %24[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
%26 = llvm.add %22, %11 : i64
%27 = llvm.getelementptr %25[%26] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%28 = llvm.load %27 : !llvm.ptr<i64>
%29 = llvm.getelementptr %21[%22] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %28, %29 : !llvm.ptr<i64>
llvm.br ^bb1(%26 : i64)
^bb3: // pred: ^bb1
llvm.br ^bb4(%10, %11 : i64, i64)
^bb4(%30: i64, %31: i64): // 2 preds: ^bb3, ^bb5
%32 = llvm.icmp "slt" %30, %arg1 : i64
llvm.cond_br %32, ^bb5, ^bb6
^bb5: // pred: ^bb4
%33 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
%34 = llvm.getelementptr %33[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
%35 = llvm.add %30, %11 : i64
%36 = llvm.getelementptr %34[%35] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%37 = llvm.load %36 : !llvm.ptr<i64>
%38 = llvm.mul %37, %31 : i64
llvm.br ^bb4(%35, %38 : i64, i64)
^bb6: // pred: ^bb4
%39 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
%40 = llvm.getelementptr %39[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%41 = llvm.load %40 : !llvm.ptr<ptr<f16>>
%42 = llvm.getelementptr %13[%11] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%43 = llvm.ptrtoint %42 : !llvm.ptr<f16> to i64
%44 = llvm.alloca %7 x i32 : (i32) -> !llvm.ptr<i32>
llvm.store %8, %44 : !llvm.ptr<i32>
%45 = llvm.call @_mlir_ciface_tf_alloc(%arg0, %31, %43, %8, %7, %44) : (!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
%46 = llvm.bitcast %45 : !llvm.ptr<i8> to !llvm.ptr<f16>
%47 = llvm.icmp "eq" %31, %10 : i64
%48 = llvm.or %6, %47 : i1
%49 = llvm.mlir.null : !llvm.ptr<i8>
%50 = llvm.icmp "ne" %45, %49 : !llvm.ptr<i8>
%51 = llvm.or %50, %48 : i1
llvm.cond_br %51, ^bb7, ^bb13
^bb7: // pred: ^bb6
%52 = llvm.urem %31, %9 : i64
%53 = llvm.sub %31, %52 : i64
llvm.br ^bb8(%10 : i64)
^bb8(%54: i64): // 2 preds: ^bb7, ^bb9
%55 = llvm.icmp "slt" %54, %53 : i64
llvm.cond_br %55, ^bb9, ^bb10
^bb9: // pred: ^bb8
%56 = llvm.mul %54, %11 : i64
%57 = llvm.add %56, %10 : i64
%58 = llvm.add %57, %10 : i64
%59 = llvm.getelementptr %41[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%60 = llvm.bitcast %59 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
%61 = llvm.load %60 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%62 = "llvm.intr.sqrt"(%61) : (vector<4xf16>) -> vector<4xf16>
%63 = llvm.fdiv %5, %62 : vector<4xf16>
%64 = llvm.getelementptr %46[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%65 = llvm.bitcast %64 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %63, %65 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%66 = llvm.add %54, %9 : i64
llvm.br ^bb8(%66 : i64)
^bb10: // pred: ^bb8
%67 = llvm.icmp "ult" %53, %31 : i64
llvm.cond_br %67, ^bb11, ^bb12
^bb11: // pred: ^bb10
%68 = llvm.mul %53, %12 : i64
%69 = llvm.add %31, %68 : i64
%70 = llvm.mul %53, %11 : i64
%71 = llvm.add %70, %10 : i64
%72 = llvm.trunc %69 : i64 to i32
%73 = llvm.mlir.undef : vector<4xi32>
%74 = llvm.insertelement %72, %73[%8 : i32] : vector<4xi32>
%75 = llvm.shufflevector %74, %73 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<4xi32>, vector<4xi32>
%76 = llvm.icmp "slt" %4, %75 : vector<4xi32>
%77 = llvm.add %71, %10 : i64
%78 = llvm.getelementptr %41[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%79 = llvm.bitcast %78 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
%80 = llvm.intr.masked.load %79, %76, %3 {alignment = 2 : i32} : (!llvm.ptr<vector<4xf16>>, vector<4xi1>, vector<4xf16>) -> vector<4xf16>
%81 = llvm.bitcast %16 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %80, %81 : !llvm.ptr<vector<4xf16>>
%82 = llvm.load %81 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%83 = "llvm.intr.sqrt"(%82) : (vector<4xf16>) -> vector<4xf16>
%84 = llvm.fdiv %5, %83 : vector<4xf16>
%85 = llvm.bitcast %17 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %84, %85 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%86 = llvm.load %85 : !llvm.ptr<vector<4xf16>>
%87 = llvm.getelementptr %46[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%88 = llvm.bitcast %87 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.intr.masked.store %86, %88, %76 {alignment = 2 : i32} : vector<4xf16>, vector<4xi1> into !llvm.ptr<vector<4xf16>>
llvm.br ^bb12
^bb12: // 2 preds: ^bb10, ^bb11
%89 = llvm.mul %2, %1 : i64
%90 = llvm.mul %arg1, %2 : i64
%91 = llvm.add %90, %11 : i64
%92 = llvm.mul %91, %1 : i64
%93 = llvm.add %89, %92 : i64
%94 = llvm.alloca %93 x i8 : (i64) -> !llvm.ptr<i8>
%95 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
llvm.store %46, %95 : !llvm.ptr<ptr<f16>>
%96 = llvm.getelementptr %95[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
llvm.store %46, %96 : !llvm.ptr<ptr<f16>>
%97 = llvm.getelementptr %95[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%98 = llvm.bitcast %97 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
llvm.store %10, %98 : !llvm.ptr<i64>
%99 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>
%100 = llvm.getelementptr %99[%10, 3] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>, i64) -> !llvm.ptr<i64>
%101 = llvm.getelementptr %100[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%102 = llvm.sub %arg1, %11 : i64
llvm.br ^bb14(%102, %11 : i64, i64)
^bb13: // pred: ^bb6
%103 = llvm.mlir.addressof @error_message_2208944672953921889 : !llvm.ptr<array<42 x i8>>
%104 = llvm.getelementptr %103[%10, %10] : (!llvm.ptr<array<42 x i8>>, i64, i64) -> !llvm.ptr<i8>
llvm.call @_mlir_ciface_tf_report_error(%arg0, %0, %104) : (!llvm.ptr<i8>, i32, !llvm.ptr<i8>) -> ()
%105 = llvm.mul %2, %1 : i64
%106 = llvm.mul %2, %10 : i64
%107 = llvm.add %106, %11 : i64
%108 = llvm.mul %107, %1 : i64
%109 = llvm.add %105, %108 : i64
%110 = llvm.alloca %109 x i8 : (i64) -> !llvm.ptr<i8>
%111 = llvm.bitcast %110 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
llvm.store %13, %111 : !llvm.ptr<ptr<f16>>
%112 = llvm.getelementptr %111[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
llvm.store %13, %112 : !llvm.ptr<ptr<f16>>
%113 = llvm.getelementptr %111[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%114 = llvm.bitcast %113 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
llvm.store %10, %114 : !llvm.ptr<i64>
%115 = llvm.call @malloc(%109) : (i64) -> !llvm.ptr<i8>
"llvm.intr.memcpy"(%115, %110, %109, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
%116 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
%117 = llvm.insertvalue %10, %116[0] : !llvm.struct<(i64, ptr<i8>)>
%118 = llvm.insertvalue %115, %117[1] : !llvm.struct<(i64, ptr<i8>)>
llvm.return %118 : !llvm.struct<(i64, ptr<i8>)>
^bb14(%119: i64, %120: i64): // 2 preds: ^bb12, ^bb15
%121 = llvm.icmp "sge" %119, %10 : i64
llvm.cond_br %121, ^bb15, ^bb16
^bb15: // pred: ^bb14
%122 = llvm.getelementptr %21[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%123 = llvm.load %122 : !llvm.ptr<i64>
%124 = llvm.getelementptr %100[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %123, %124 : !llvm.ptr<i64>
%125 = llvm.getelementptr %101[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %120, %125 : !llvm.ptr<i64>
%126 = llvm.mul %120, %123 : i64
%127 = llvm.sub %119, %11 : i64
llvm.br ^bb14(%127, %126 : i64, i64)
^bb16: // pred: ^bb14
%128 = llvm.call @malloc(%93) : (i64) -> !llvm.ptr<i8>
"llvm.intr.memcpy"(%128, %94, %93, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
%129 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
%130 = llvm.insertvalue %arg1, %129[0] : !llvm.struct<(i64, ptr<i8>)>
%131 = llvm.insertvalue %128, %130[1] : !llvm.struct<(i64, ptr<i8>)>
llvm.return %131 : !llvm.struct<(i64, ptr<i8>)>
}
llvm.func @_mlir_ciface_Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<struct<(i64, ptr<i8>)>>, %arg1: !llvm.ptr<i8>, %arg2: !llvm.ptr<struct<(i64, ptr<i8>)>>) attributes {llvm.emit_c_interface, tf_entry} {
%0 = llvm.load %arg2 : !llvm.ptr<struct<(i64, ptr<i8>)>>
%1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
%2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>
%3 = llvm.call @Rsqrt_CPU_DT_HALF_DT_HALF(%arg1, %1, %2) : (!llvm.ptr<i8>, i64, !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)>
llvm.store %3, %arg0 : !llvm.ptr<struct<(i64, ptr<i8>)>>
llvm.return
}
}
Nikita Popov [Wed, 15 Jun 2022 11:23:32 +0000 (13:23 +0200)]
[BitcodeReader] Remove unnecessary argument defaults (NFC)
This is an internal method that is always called with all arguments.
Simon Pilgrim [Wed, 15 Jun 2022 11:20:53 +0000 (12:20 +0100)]
[X86] X86TargetTransformInfo.cpp - use InstructionCost type to accumulate instructions costs
Simon Pilgrim [Wed, 15 Jun 2022 11:19:34 +0000 (12:19 +0100)]
[AArch64] Add test case from D127354
Benjamin Kramer [Tue, 7 Jun 2022 11:29:10 +0000 (13:29 +0200)]
Add a conversion from double to bf16
This introduces a new compiler-rt function `__truncdfbf2`.
Benjamin Kramer [Fri, 3 Jun 2022 08:47:22 +0000 (10:47 +0200)]
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are
introduced for casting from and to bf16. Since casting from bf16 is a
simple operation I opted to always directly lower it to integer
arithmetic. The other way round is more complicated if you want to
preserve IEEE semantics, so it's handled by a new __truncsfbf2
compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened
fadd on x86.
Possible future improvements:
- Targets with bf16 conversion instructions can now make fp_to_bf16 legal
- The software conversion to bf16 can be replaced by a trivial
implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
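As a rough illustration of the two conversion directions described above (not the actual lowering and not the compiler-rt source): widening bf16 to f32 is a 16-bit shift of the bit pattern, while narrowing needs rounding. The sketch assumes round-to-nearest-even and quiets NaNs; the names `bf16ToFloat` and `floatToBF16` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Widening: a bf16 value is the upper 16 bits of an IEEE-754 binary32
// value, so the conversion is pure integer arithmetic (a left shift).
float bf16ToFloat(uint16_t bits) {
  uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float result;
  std::memcpy(&result, &wide, sizeof(result));
  return result;
}

// Narrowing with round-to-nearest-even (an assumption of this sketch).
uint16_t floatToBF16(float value) {
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  // NaN: keep the sign and set the quiet bit so that dropping the low
  // mantissa bits cannot turn the NaN into an infinity.
  if ((bits & 0x7fffffffu) > 0x7f800000u)
    return static_cast<uint16_t>((bits >> 16) | 0x0040u);
  // Add 0x7fff plus the lowest kept bit, so exact ties round to even.
  uint32_t roundingBias = 0x7fffu + ((bits >> 16) & 1u);
  return static_cast<uint16_t>((bits + roundingBias) >> 16);
}
```

Values that are exactly representable in bf16, such as 1.0f or -2.5f, round-trip unchanged through `floatToBF16` followed by `bf16ToFloat`.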
Simon Pilgrim [Wed, 15 Jun 2022 10:53:00 +0000 (11:53 +0100)]
Fix signed/unsigned comparison warning
Keith Walker [Tue, 24 May 2022 14:54:58 +0000 (15:54 +0100)]
[DebugInfo][ARM] Not readonly check for RWPI globals
When compiling for the RWPI relocation model [1], the debug information
is wrong for readonly global variables.
Writable global variables are accessed by the static base register (R9
on ARM) in the RWPI relocation model. This is being correctly generated.
Readonly global variables are not accessed by the static base register
in the RWPI relocation model. This case is incorrectly generating the
same debugging information as for writable global variables.
References:
[1] ARM Read-Write Position Independence: https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#read-write-position-independence-rwpi
Differential Revision: https://reviews.llvm.org/D126361
Benjamin Kramer [Wed, 15 Jun 2022 10:20:44 +0000 (12:20 +0200)]
[Sema] Remove unused function after 8c7b64b5ae2a
Nabeel Omer [Wed, 15 Jun 2022 09:52:37 +0000 (10:52 +0100)]
[X86][SLP] Basic test coverage for llvm.powi
This patch introduces basic test coverage for llvm.powi.* intrinsics.
Differential Revision: https://reviews.llvm.org/D127492
David Sherwood [Wed, 15 Jun 2022 10:09:12 +0000 (11:09 +0100)]
[NFC] Move tests CodeGen/AArch64/SME/sme-* -> CodeGen/AArch64/sme-*
Simon Pilgrim [Wed, 15 Jun 2022 10:07:48 +0000 (11:07 +0100)]
[DAG] Fix SDLoc mismatch in (shl (srl x, c1), c2) -> and(shift(x,c3)) fold
Noticed by @craig.topper on D125836 which uses a tweaked copy of the same code.
Differential Revision: https://reviews.llvm.org/D127772
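The arithmetic identity behind the fold named in the title can be checked with a small sketch on 32-bit unsigned values. This only models the bit-level equivalence; any extra conditions the real DAG combine applies are not shown, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (shl (srl x, c1), c2). Shift amounts must be < 32.
uint32_t shlOfSrl(uint32_t x, unsigned c1, unsigned c2) {
  return (x >> c1) << c2;
}

// Folded form: a single shift by the difference of the two amounts,
// followed by a mask that clears the low c2 bits.
uint32_t foldedForm(uint32_t x, unsigned c1, unsigned c2) {
  uint32_t mask = ~uint32_t{0} << c2;
  uint32_t shifted = c1 >= c2 ? x >> (c1 - c2) : x << (c2 - c1);
  return shifted & mask;
}

// Compares both forms for every pair of in-range shift amounts.
bool formsAgreeForAllShifts(uint32_t x) {
  for (unsigned c1 = 0; c1 < 32; ++c1)
    for (unsigned c2 = 0; c2 < 32; ++c2)
      if (shlOfSrl(x, c1, c2) != foldedForm(x, c1, c2))
        return false;
  return true;
}
```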
Stanislav Gatev [Wed, 18 May 2022 21:57:40 +0000 (21:57 +0000)]
[clang][dataflow] Add support for correlated branches to optional model
Add support for correlated branches to the std::optional dataflow model.
Differential Revision: https://reviews.llvm.org/D125931
Reviewed-by: ymandel, xazax.hun
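A hypothetical example of what "correlated branches" means here (not taken from the patch or its tests): two separate branches test the same boolean, so the access in the second one is safe whenever the check recorded in that boolean succeeded.

```cpp
#include <cassert>
#include <optional>

// `present` is computed once and tested twice. An analysis that does not
// correlate the two branches cannot tell that `*opt` is only reached when
// `opt.has_value()` was true.
int scaledValue(const std::optional<int> &opt, int scale) {
  const bool present = opt.has_value();
  int factor = 1;
  if (present)
    factor = scale; // first branch on `present`
  if (present)
    return *opt * factor; // second branch on the same condition
  return 0;
}
```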
Martin Boehme [Wed, 15 Jun 2022 06:07:23 +0000 (08:07 +0200)]
[clang] Reject non-declaration C++11 attributes on declarations
For backwards compatibility, we emit only a warning instead of an error if the
attribute is one of the existing type attributes that we have historically
allowed to "slide" to the `DeclSpec` just as if it had been specified in GNU
syntax. (We will call these "legacy type attributes" below.)
The high-level changes that achieve this are:
- We introduce a new field `Declarator::DeclarationAttrs` (with appropriate
accessors) to store C++11 attributes occurring in the attribute-specifier-seq
at the beginning of a simple-declaration (and other similar declarations).
Previously, these attributes were placed on the `DeclSpec`, which made it
impossible to reconstruct later on whether the attributes had in fact been
placed on the decl-specifier-seq or ahead of the declaration.
- In the parser, we propagate declaration attributes and decl-specifier-seq
attributes separately until we can place them in
`Declarator::DeclarationAttrs` or `DeclSpec::Attrs`, respectively.
- In `ProcessDeclAttributes()`, in addition to processing declarator attributes,
we now also process the attributes from `Declarator::DeclarationAttrs` (except
if they are legacy type attributes).
- In `ConvertDeclSpecToType()`, in addition to processing `DeclSpec` attributes,
we also process any legacy type attributes that occur in
`Declarator::DeclarationAttrs` (and emit a warning).
- We make `ProcessDeclAttribute` emit an error if it sees any non-declaration
attributes in C++11 syntax, except in the following cases:
- If it is being called for attributes on a `DeclSpec` or `DeclaratorChunk`
- If the attribute is a legacy type attribute (in which case we only emit
a warning)
The standard justifies treating attributes at the beginning of a
simple-declaration and attributes after a declarator-id the same. Here are some
relevant parts of the standard:
- The attribute-specifier-seq at the beginning of a simple-declaration
"appertains to each of the entities declared by the declarators of the
init-declarator-list" (https://eel.is/c++draft/dcl.dcl#dcl.pre-3)
- "In the declaration for an entity, attributes appertaining to that entity can
appear at the start of the declaration and after the declarator-id for that
declaration." (https://eel.is/c++draft/dcl.dcl#dcl.pre-note-2)
- "The optional attribute-specifier-seq following a declarator-id appertains to
the entity that is declared."
(https://eel.is/c++draft/dcl.dcl#dcl.meaning.general-1)
The standard contains similar wording to that for a simple-declaration in other
similar types of declarations, for example:
- "The optional attribute-specifier-seq in a parameter-declaration appertains to
the parameter." (https://eel.is/c++draft/dcl.fct#3)
- "The optional attribute-specifier-seq in an exception-declaration appertains
to the parameter of the catch clause" (https://eel.is/c++draft/except.pre#1)
The new behavior is tested both on the newly added type attribute
`annotate_type`, for which we emit errors, and for the legacy type attribute
`address_space` (chosen somewhat randomly from the various legacy type
attributes), for which we emit warnings.
Depends On D111548
Reviewed By: aaron.ballman, rsmith
Differential Revision: https://reviews.llvm.org/D126061
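To make the attribute positions quoted from the standard concrete, here is a minimal sketch using only the standard `[[maybe_unused]]` declaration attribute (C++17). It does not exercise the new diagnostics, which concern type attributes such as `annotate_type`; the variable and function names are hypothetical.

```cpp
#include <cassert>

// attribute-specifier-seq at the beginning of a simple-declaration:
// appertains to each entity declared by the declarators.
[[maybe_unused]] static int atStart = 1;

// attribute-specifier-seq after the declarator-id: appertains to the
// entity that is declared.
static int afterDeclaratorId [[maybe_unused]] = 2;

// attribute-specifier-seq in a parameter-declaration: appertains to the
// parameter.
int addOne([[maybe_unused]] int unusedParam, int value) { return value + 1; }

int sumOfAnnotated() { return atStart + afterDeclaratorId; }
```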
Sven van Haastregt [Wed, 15 Jun 2022 09:54:46 +0000 (10:54 +0100)]
[OpenCL] Reword unknown extension pragma diagnostic
For newer OpenCL extensions that do not require a pragma, such as
`cl_khr_subgroup_shuffle`, a user could still accidentally attempt to
use a pragma. This would result in a warning
"unknown OpenCL extension 'cl_khr_subgroup_shuffle' - ignoring"
which could be mistakenly interpreted as "clang does not support this
extension at all" instead of "clang does not require any pragma for
this extension".
Differential Revision: https://reviews.llvm.org/D126660
Simon Pilgrim [Wed, 15 Jun 2022 09:40:13 +0000 (10:40 +0100)]
[X86] needCarryOrOverflowFlag/onlyZeroFlagUsed - merge identical switch cases. NFCI.
Makes it easier to grok and fixes various bugprone-branch-clone warnings.
David Sherwood [Thu, 9 Jun 2022 08:01:49 +0000 (09:01 +0100)]
[AArch64][SME] Add SME read/write intrinsics that map to the mova instruction
This patch adds implementations for the read/write SME ACLE intrinsics:
@llvm.aarch64.sme.read.horiz
@llvm.aarch64.sme.read.vert
@llvm.aarch64.sme.write.horiz
@llvm.aarch64.sme.write.vert
These all map to the SME mova instruction.
Differential Revision: https://reviews.llvm.org/D127414
Martin Boehme [Wed, 15 Jun 2022 08:59:07 +0000 (10:59 +0200)]
[Clang] Documentation-only: Add missing closing `>` in AttrDocs.td
Ilya Biryukov [Wed, 15 Jun 2022 08:55:55 +0000 (10:55 +0200)]
[libcxx] Fix allocator<void>::pointer in C++20 with removed members
When compiled with `-D_LIBCPP_ENABLE_CXX20_REMOVED_ALLOCATOR_MEMBERS`
uses of `allocator<void>::pointer` resulted in compiler errors after D104323.
If we instantiate the primary template, `allocator<void>::reference` produces
an error 'cannot form references to void'.
To work around this, allow bringing back the `allocator<void>` specialization by defining the new `_LIBCPP_ENABLE_CXX20_REMOVED_ALLOCATOR_VOID_SPECIALIZATION` macro.
To make sure the code that uses `allocator<void>` and the removed members does not break,
both `_LIBCPP_ENABLE_CXX20_REMOVED_ALLOCATOR_MEMBERS` and `_LIBCPP_ENABLE_CXX20_REMOVED_ALLOCATOR_VOID_SPECIALIZATION` have to be defined.
Reviewed By: ldionne, #libc, philnik
Differential Revision: https://reviews.llvm.org/D126210
Martin Boehme [Wed, 15 Jun 2022 08:38:10 +0000 (10:38 +0200)]
[Clang] Fix signed-unsigned comparison warning that breaks the ppc64 build.
David Sherwood [Tue, 14 Jun 2022 15:27:18 +0000 (16:27 +0100)]
[NFC][AArch64] Minor refactor of AArch64InstPrinter::printMatrixTileList
We can remove the MatrixZADRegisterTable table of tile registers and
just calculate the register index directly.
Differential Revision: https://reviews.llvm.org/D127757
Kadir Cetinkaya [Wed, 15 Jun 2022 08:04:48 +0000 (10:04 +0200)]
[clangd] Enable AKA type printing by default
This has been tested on a large set of c++ developers for a long while,
without any crashes or complaints.
Differential Revision: https://reviews.llvm.org/D127833
owenca [Wed, 15 Jun 2022 08:29:51 +0000 (01:29 -0700)]
[libcxx] Remove extraneous '---' lines in .clang-format files
Benjamin Kramer [Wed, 15 Jun 2022 08:27:19 +0000 (10:27 +0200)]
[mlir][Arith] Fix a use-after-free after rewriting ops to unsigned
Just short-circuit when a change was made, the erased value is invalid
after that. Found by asan.
This pass looks like it could use rewrite patterns instead which don't
have this issue, but let's fix the asan build first.
Kito Cheng [Thu, 9 Jun 2022 15:25:18 +0000 (23:25 +0800)]
[RISCV] Fixing undefined physical register issue when subreg liveness tracking enabled.
RISC-V expands register tuple spills into a series of register spills after
the register allocation phase, during pseudo instruction expansion. However, part
of the register tuple might still be undefined at the spill, and the machine
verifier will complain that the spill instruction uses an undefined physical register.
The optimal solution would be to do liveness analysis and not emit spills
and reloads for those undefined parts, but accurate liveness info at that point
is not easy to get.
So the suboptimal solution is to still spill and reload those undefined parts, but
add an implicit use of the super register to the spill instructions; the machine
verifier will then only report the use of an undefined physical register if
the whole super register is undefined. This behavior is also
documented in MachineVerifier::checkLiveness[1].
Example demonstrating what happened:
```
v10m2 = xxx
# v12m2 not defined yet
PseudoVSPILL2_M2 v10m2_v12m2
...
```
After expansion:
```
v10m2 = xxx
# v12m2 not defined yet
# Expand PseudoVSPILL2_M2 v10m2_v12m2 to 2 vs2r
VS2R_V v10m2
VS2R_V v12m2 # Use undef reg!
```
What this patch did:
```
v10m2 = xxx
# v12m2 not defined yet
# Expand PseudoVSPILL2_M2 v10m2_v12m2 to 2 vs2r
VS2R_V v10m2 implicit v10m2_v12m2
# Use undef reg (v12m2), but v10m2_v12m2 isn't totally undef, so
# that's OK.
VS2R_V v12m2 implicit v10m2_v12m2
```
[1] https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/MachineVerifier.cpp#L2016-L2019
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D127642
Matthias Springer [Wed, 15 Jun 2022 08:15:09 +0000 (10:15 +0200)]
[mlir][bufferize] Better implementation of AnalysisState::isTensorYielded
If `create-deallocs=0`, mark all bufferization.alloc_tensor ops as escaping. (Unless they already have an `escape` attribute.) In the absence of analysis information, check SSA use-def chains to see if the value may be yielded.
Differential Revision: https://reviews.llvm.org/D127302
Siva Chandra Reddy [Wed, 15 Jun 2022 08:09:12 +0000 (08:09 +0000)]
[libc][Obvious] Removed few unused vars.
Matthias Springer [Wed, 15 Jun 2022 08:06:55 +0000 (10:06 +0200)]
[mlir][bufferize][NFC] Merge AlwaysCopyAnalysisState into AnalysisState
`AnalysisState` now has default implementations of all virtual functions.
Differential Revision: https://reviews.llvm.org/D127301
Heejin Ahn [Tue, 14 Jun 2022 23:41:17 +0000 (16:41 -0700)]
[InstCombine] Improve check for catchswitch BBs (NFC)
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D127810
Matthias Springer [Wed, 15 Jun 2022 07:58:56 +0000 (09:58 +0200)]
[mlir][bufferize][NFC] Make func BufferizableOpInterface impl compatible with One-Shot Bufferize
Bufferization of the func dialect must go through `OneShotModuleBufferize`. With this change, the analysis interface methods of the BufferizableOpInterface of func dialect ops can be used together with the normal `OneShotBufferize`. (In the absence of analysis information, they will return conservative results.)
Differential Revision: https://reviews.llvm.org/D127299
Peixin-Qiao [Wed, 15 Jun 2022 08:02:27 +0000 (16:02 +0800)]
[flang][OpenMP] Add one semantic check for data-sharing clauses
According to OpenMP 5.0, for firstprivate, lastprivate, copyin, and copyprivate
clauses, if the list item is a polymorphic variable with the allocatable
attribute, the behavior is unspecified.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127601
Matthias Springer [Wed, 15 Jun 2022 07:09:07 +0000 (09:09 +0200)]
[mlir][linalg][bufferize] Remove always-aliasing-with-dest option
This flag was introduced for a use case in IREE, but it is no longer needed.
Differential Revision: https://reviews.llvm.org/D126965
Martin Boehme [Wed, 15 Jun 2022 06:08:10 +0000 (08:08 +0200)]
[Clang] Add the `annotate_type` attribute
This is an analog to the `annotate` attribute but for types. The intent is to allow adding arbitrary annotations to types for use in static analysis tools.
For details, see this RFC:
https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2/61378
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D111548
Peixin-Qiao [Wed, 15 Jun 2022 07:39:13 +0000 (15:39 +0800)]
[flang] Change C889 from error into warning
This constraint is violated in the OMP2012 benchmark, and other compilers do not
enforce it. Change it into a warning. This addresses the issue
https://github.com/llvm/llvm-project/issues/56003.
Reviewed By: klausler, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127740
Nikita Popov [Wed, 15 Jun 2022 07:36:39 +0000 (09:36 +0200)]
[SimplifyLibCalls] Drop duplicate check (NFC)
The same condition already exists inside optimizeMemCmpConstantSize().
Austin Kerbow [Wed, 15 Jun 2022 07:23:30 +0000 (00:23 -0700)]
[AMDGPU] Fix buildbot failures after 48ebc1af29
Some buildbots (lto, windows) were failing due to some function reference
variables being improperly initialized.
Siva Chandra Reddy [Wed, 15 Jun 2022 07:11:57 +0000 (07:11 +0000)]
[libc] Add linux threads targets only if __support/OSUtil targets are available.
Petr Hosek [Wed, 15 Jun 2022 06:53:18 +0000 (06:53 +0000)]
[libFuzzer] Use the compiler to link the relocatable object
Rather than invoking the linker directly, let the compiler driver
handle it. This ensures that we use the correct linker in the case
of cross-compiling.
Differential Revision: https://reviews.llvm.org/D127828
Matthias Springer [Wed, 15 Jun 2022 07:00:44 +0000 (09:00 +0200)]
[mlir][SCF][bufferize] Implement `resolveConflicts` for SCF ops
scf::ForOp and scf::WhileOp must insert buffer copies not only for out-of-place bufferizations, but also to enforce additional invariants wrt. buffer aliasing behavior. This is currently happening in the respective `bufferize` methods. With this change, the tensor copy insertion pass will also enforce these invariants by inserting copies. The `bufferize` methods can then be simplified and made independent of the `AnalysisState` data structure in a subsequent change.
Differential Revision: https://reviews.llvm.org/D126822
owenca [Wed, 15 Jun 2022 06:57:08 +0000 (23:57 -0700)]
[mlir] Add missing newline at end of .clang-format file
chenglin.bi [Wed, 15 Jun 2022 06:51:15 +0000 (14:51 +0800)]
[LSR] Add test for LoopStrengthReduce for Ldp; NFC
#53877
Siva Chandra Reddy [Wed, 15 Jun 2022 06:32:06 +0000 (06:32 +0000)]
[libc][NFC] Add src.__support.OSUtil targets conditionally.
Before this change, they were unconditionally added, irrespective of the
availability of the architecture specific pieces.
Kadir Cetinkaya [Tue, 14 Jun 2022 15:08:37 +0000 (17:08 +0200)]
[clangd] Wire up compilation for style blocks
Differential Revision: https://reviews.llvm.org/D127749
Yeting Kuo [Sat, 11 Jun 2022 16:46:30 +0000 (00:46 +0800)]
[RISCV] Teach vsetvli insertion to not insert redundant vsetvli right after VLEFF/VLSEGFF.
VSETVLIInfos right after VLEFF/VLSEGFF are currently unknown since these
instructions modify VL. An unknown VSETVLIInfo forces a VSET(I)VLI to be inserted
before the next vector operation. However, the vector operation following a
VLEFF/VLSEGFF may not need a VSET(I)VLI if it uses the same VTYPE and the
resulting vl of the VLEFF/VLSEGFF.
Take the below C code as an example,
vint8m4_t vec_src1 = vle8ff_v_i8m4(str1, &new_vl, vl);
vbool2_t mask1 = vmseq_vx_i8m4_b2(vec_src1, 0, new_vl);
vsetvli insertion adds a redundant vsetvli for that,
Assembly result:
vsetvli a2,a2,e8,m4,ta,mu
vle8ff.v v28,(a0)
csrr a3,vl ; redundant
vsetvli zero,a3,e8,m4,ta,mu ; redundant
vmseq.vi v25,v28,0
After D126794, VLEFF/VLSEGFF has a def holding the value of VL. The patch considers
there to be a ghost vsetvli right after VLEFF/VLSEGFF. The ghost VSET(I)VLI uses the
vl output of the VLEFF/VLSEGFF as its AVL and the same VTYPE as the VLEFF/VLSEGFF.
The ghost vsetvli must be redundant, and we can use it to get the VSETVLIInfo
right after VLEFF/VLSEGFF.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127576
Ping Deng [Wed, 15 Jun 2022 05:43:26 +0000 (05:43 +0000)]
[SelectionDAG] fold 'Op0 - (X * MulC)' to 'Op0 + (X << log2(-MulC))'
Reviewed By: craig.topper, spatel
Differential Revision: https://reviews.llvm.org/D127474
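The identity in the title holds when -MulC is a power of two. It can be sketched with wrapping 32-bit unsigned arithmetic; this models only the arithmetic, not the conditions under which the combine actually fires, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: Op0 - (X * MulC), in wrapping 32-bit arithmetic.
uint32_t subOfMul(uint32_t op0, uint32_t x, uint32_t mulC) {
  return op0 - x * mulC;
}

// Folded form: Op0 + (X << log2(-MulC)). The caller passes log2(-MulC),
// which must be < 32.
uint32_t addOfShl(uint32_t op0, uint32_t x, unsigned log2NegMulC) {
  return op0 + (x << log2NegMulC);
}
```

For MulC = -8, log2(-MulC) is 3: `subOfMul(100, 7, static_cast<uint32_t>(-8))` computes 100 - (7 * -8) and `addOfShl(100, 7, 3)` computes 100 + (7 << 3), both giving 156.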
Siva Chandra Reddy [Wed, 15 Jun 2022 05:38:38 +0000 (05:38 +0000)]
[libc][NFC] Use uint32_t to represent futex words.
Futexes are 32 bits in size on all platforms, including 64-bit systems.
owenca [Mon, 13 Jun 2022 19:15:31 +0000 (12:15 -0700)]
[clang-format] Never analyze insert/remove braces in the same pass
Turn off RemoveBracesLLVM while analyzing InsertBraces and vice
versa to avoid potential interference with each other and to improve
performance.
Differential Revision: https://reviews.llvm.org/D127685
LLVM GN Syncbot [Wed, 15 Jun 2022 05:24:12 +0000 (05:24 +0000)]
[gn build] Port 48ebc1af2948
Austin Kerbow [Fri, 3 Jun 2022 18:35:47 +0000 (11:35 -0700)]
[AMDGPU] Add more expressive sched_barrier controls
The sched_barrier builtin allows the scheduler's behavior to be shaped by users
when very specific codegen is needed in order to create highly optimized code.
This patch adds more granular control over the types of instructions that are
allowed to be reordered with respect to one or multiple sched_barriers. A mask
is used to specify groups of instructions that should be allowed to be scheduled
around a sched_barrier. Details on how this mask may be used can be found in
llvm/include/llvm/IR/IntrinsicsAMDGPU.td.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D127123