Matthias Springer [Tue, 30 Aug 2022 14:55:49 +0000 (16:55 +0200)]
[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.while)
This change implements the same functionality as D132860, but for scf.while.
Differential Revision: https://reviews.llvm.org/D132927
Matthias Springer [Tue, 30 Aug 2022 14:46:23 +0000 (16:46 +0200)]
[mlir][SCF][bufferize][NFC] Move scf.if buffer type computation to getBufferType
A part of the functionality of `bufferize` is extracted into `getBufferType`. Also, bufferized scf.yields inside scf.if are now created with the correct bufferized type from the get-go.
Differential Revision: https://reviews.llvm.org/D132862
Matthias Springer [Tue, 30 Aug 2022 14:42:29 +0000 (16:42 +0200)]
[mlir][arith][bufferize][NFC] Move buffer type computation to getBufferType
A part of the functionality of `bufferize` is extracted into `getBufferType`.
Differential Revision: https://reviews.llvm.org/D132861
zhijian [Tue, 30 Aug 2022 14:38:38 +0000 (10:38 -0400)]
[AIX][clang][driver] Check the command string to the linker for exportlist opts
Summary:
Some of the code in this patch was contributed by David Tenty.
1. We currently only check driver Wl options and don't check for the plain -b, -Xlinker or other options which get passed through to the linker when we decide whether to run llvm-nm --export-symbols, so we may run it in situations where we wouldn't if the user had used the equivalent -Wl, prefixed options. If we run the export list utility when the user has specified an export list, we could export more symbols than they intended.
2. Add new functionality to allow redirecting the stdin, stdout, and stderr of individual Jobs. If redirects are set for the Job, use them; otherwise fall back to the global Compilation redirects, if any.
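The fallback rule in item 2 can be sketched as follows. This is a minimal illustration only: the `Job` and `Compilation` structs, the `Redirects` alias and `effectiveRedirects` are hypothetical stand-ins, not the driver's actual classes or API.

```cpp
#include <array>
#include <optional>
#include <string>

// Hypothetical stand-ins for the driver's Job and Compilation objects.
// Index 0 = stdin, 1 = stdout, 2 = stderr; nullopt means "not redirected".
using Redirects = std::array<std::optional<std::string>, 3>;

struct Compilation {
  std::optional<Redirects> redirects; // global redirects, if any
};

struct Job {
  std::optional<Redirects> redirects; // per-job redirects, if set
};

// Prefer the job's own redirects; otherwise fall back to the global ones.
inline Redirects effectiveRedirects(const Job &job, const Compilation &comp) {
  if (job.redirects)
    return *job.redirects;
  if (comp.redirects)
    return *comp.redirects;
  return Redirects{}; // nothing is redirected
}
```

In this sketch a Job that sets any redirects overrides the global set as a whole; whether the real change merges individual streams is not stated in the summary.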
Reviewers: David Tenty, Fangrui Song, Steven Wan
Differential Revision: https://reviews.llvm.org/D119147
Matthias Springer [Tue, 30 Aug 2022 14:32:09 +0000 (16:32 +0200)]
[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.for)
Even though iter_arg and init_arg of an scf.for loop may have the same tensor type, their bufferized memref types are not necessarily equal. It is sometimes necessary to insert a cast in case of differing layout maps.
Differential Revision: https://reviews.llvm.org/D132860
Jon Chesterfield [Tue, 30 Aug 2022 14:29:38 +0000 (15:29 +0100)]
[amdgpu][nfc] Add test case showing false aliasing in LDS lowering
Matthias Springer [Tue, 30 Aug 2022 14:26:12 +0000 (16:26 +0200)]
[mlir][bufferization] Generalize getBufferType
This change generalizes getBufferType. This function can be used to predict the buffer type of any tensor value (not just BlockArguments) without changing any IR. It also subsumes getMemorySpace. This is useful for loop bufferization, where the precise buffer type of an iter_arg cannot be known without examining the loop body.
Differential Revision: https://reviews.llvm.org/D132859
Dmitry Preobrazhensky [Tue, 30 Aug 2022 14:04:09 +0000 (17:04 +0300)]
[AMDGPU][MC][GFX11][NFC] Update asm tests for VOP3P instructions
Differential Revision: https://reviews.llvm.org/D132876
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:59:29 +0000 (16:59 +0300)]
[AMDGPU][MC][GFX11][NFC] Add tests for opcode promotions and forced suffices
Differential Revision: https://reviews.llvm.org/D132869
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:54:58 +0000 (16:54 +0300)]
[AMDGPU][MC][GFX11][NFC] Add missing asm tests for VOPC and VOPC.DPP instructions
Differential Revision: https://reviews.llvm.org/D132690
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:48:57 +0000 (16:48 +0300)]
[AMDGPU][MC][GFX11][NFC] Update asm tests for VOPC instructions promoted to VOP3
Differential Revision: https://reviews.llvm.org/D132857
Alexey Bataev [Mon, 29 Aug 2022 20:08:47 +0000 (13:08 -0700)]
[SLP]Improve operands kind analysis for constants.
Removed the EnableFP parameter of the getOperandInfo function since it is not
needed: the operand kinds are also controlled by the operation code, which
allows removing the extra check for the type of the operands. Also, added
analysis for uniform constant float values.
This change currently does not trigger any changes in the code since TTI
does not do analysis for constant floats, so it can be considered NFC.
Tested with llvm-test-suite + SPEC2017, no changes.
Differential Revision: https://reviews.llvm.org/D132886
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:21:23 +0000 (16:21 +0300)]
[AMDGPU][MC][GFX11][NFC] Update asm tests for VOP3 instructions
Differential Revision: https://reviews.llvm.org/D132854
Timm Bäder [Mon, 29 Aug 2022 04:51:09 +0000 (06:51 +0200)]
[clang][Parse] Fix crash when emitting template diagnostic
This was passing a 6 to the diagnostic engine, which the diagnostic
message didn't handle.
Add the new value to the diagnostic message, remove an unused value and
add a test.
This fixes https://github.com/llvm/llvm-project/issues/57415
Differential Revision: https://reviews.llvm.org/D132821
Matheus Izvekov [Sun, 28 Aug 2022 14:19:02 +0000 (16:19 +0200)]
[libcxx] CI: set symbolizer for bootstrapping build
Setting the symbolizer is required for getting a pretty
stack trace when Clang crashes.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D132807
serge-sans-paille [Mon, 29 Aug 2022 15:02:50 +0000 (17:02 +0200)]
[clang] Fix -Warray-bounds interaction with -fstrict-flex-arrays=1
The test to check if an array was a FAM in the context of array bound checking
and strict-flex-arrays=1 was inverted.
As a by-product, improve test coverage.
Differential Revision: https://reviews.llvm.org/D132853
Markus Böck [Tue, 30 Aug 2022 12:46:22 +0000 (14:46 +0200)]
[cmake] Don't include symlinks to tools in Build-all when `LLVM_BUILD_TOOLS` is off
When building LLVM with LLVM_BUILD_TOOLS as OFF, numerous tools such as llvm-ar or llvm-objcopy end up still being built. The reason for this is that the symlink targets are unconditionally included in a Build-all build, causing the tool they're symlinking to be built after all.
This patch changes that behaviour to be more intuitive by only including the symlink in a Build-all build if the target they're linking to is also included.
Differential Revision: https://reviews.llvm.org/D132883
zhongyunde [Tue, 30 Aug 2022 12:36:30 +0000 (20:36 +0800)]
[InstCombine] Distributive or+mul with const operand
We already support the transform: `(X+C1)*CI -> X*CI+C1*CI`
Here the case is a little special as the form of `(X+C1)*CI` is transformed into `(X|C1)*CI`,
so we should also support the transform: `(X|C1)*CI -> X*CI+C1*CI`
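As a sanity check of the arithmetic, here is a small standalone sketch (not the InstCombine code itself). It assumes the add was turned into an or because X and C1 share no set bits, in which case `X|C1 == X+C1` and the distributive form holds under wrapping arithmetic. All function names are hypothetical.

```cpp
#include <cstdint>

// Left-hand side of the fold: (X | C1) * CI.
constexpr uint32_t orThenMul(uint32_t x, uint32_t c1, uint32_t ci) {
  return (x | c1) * ci;
}

// Right-hand side: X * CI + C1 * CI, where C1 * CI folds to one constant.
constexpr uint32_t mulThenAdd(uint32_t x, uint32_t c1, uint32_t ci) {
  return x * ci + c1 * ci;
}

// Assumed precondition: X and C1 have no set bits in common.
constexpr bool disjoint(uint32_t x, uint32_t c1) { return (x & c1) == 0; }

// Compare both sides for every 8-bit value shifted left by `shift`,
// skipping inputs that violate the precondition.
inline bool foldHoldsForShiftedX(unsigned shift, uint32_t c1, uint32_t ci) {
  for (uint32_t v = 0; v < 256; ++v) {
    uint32_t x = v << shift;
    if (disjoint(x, c1) && orThenMul(x, c1, ci) != mulThenAdd(x, c1, ci))
      return false;
  }
  return true;
}
```

Without the disjointness assumption the two sides differ, e.g. for X = 1, C1 = 1, CI = 3.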
Fixes https://github.com/llvm/llvm-project/issues/57278
Reviewed By: bcl5980, spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D132658
Florian Hahn [Tue, 30 Aug 2022 12:27:50 +0000 (13:27 +0100)]
[DSE] Support looking through memory phis at end of function.
Update isWriteAtEndOfFunction to look through MemoryPhis. The reason
MemoryPhis were skipped so far was the known AliasAnalysis issue with it
missing loop-carried dependences.
This problem is already addressed in other parts of the code by skipping
MemoryDefs that may be in different loops. I think the same logic can
be applied here.
This can have a substantial impact on the number of stores removed in
some cases. For MultiSource/SPEC2006/SPEC2017 with -O3:
```
Metric: dse.NumFastStores
Program dse.NumFastStores
base patch diff
External/S...CINT2017rate/557.xz_r/557.xz_r 14.00 45.00 221.4%
External/S...te/538.imagick_r/538.imagick_r 439.00 1267.00 188.6%
MultiSourc...e/Applications/SIBsim4/SIBsim4 6.00 15.00 150.0%
MultiSourc...Prolangs-C/simulator/simulator 3.00 7.00 133.3%
MultiSource/Applications/siod/siod 3.00 7.00 133.3%
MultiSourc...arks/FreeBench/distray/distray 6.00 9.00 50.0%
MultiSourc...e/Applications/obsequi/Obsequi 22.00 30.00 36.4%
MultiSource/Benchmarks/Ptrdist/bc/bc 23.00 28.00 21.7%
External/S...NT2017rate/502.gcc_r/502.gcc_r 1258.00 1512.00 20.2%
External/S...te/520.omnetpp_r/520.omnetpp_r 954.00 1143.00 19.8%
External/S...rate/510.parest_r/510.parest_r 5961.00 7122.00 19.5%
External/S...C/CINT2006/445.gobmk/445.gobmk 47.00 56.00 19.1%
External/S...00.perlbench_r/500.perlbench_r 241.00 286.00 18.7%
External/S...NT2006/471.omnetpp/471.omnetpp 36.00 42.00 16.7%
External/S...06/400.perlbench/400.perlbench 183.00 210.00 14.8%
MultiSource/Applications/SPASS/SPASS 72.00 81.00 12.5%
External/S...17rate/541.leela_r/541.leela_r 72.00 80.00 11.1%
External/SPEC/CINT2006/403.gcc/403.gcc 585.00 642.00 9.7%
MultiSourc...e/Applications/sqlite3/sqlite3 120.00 131.00 9.2%
MultiSourc...Applications/hexxagon/hexxagon 11.00 12.00 9.1%
External/S.../CFP2006/453.povray/453.povray 566.00 615.00 8.7%
External/S...rate/511.povray_r/511.povray_r 578.00 627.00 8.5%
External/S...FP2006/482.sphinx3/482.sphinx3 12.00 13.00 8.3%
MultiSource/Applications/oggenc/oggenc 130.00 140.00 7.7%
MultiSourc...e/Applications/ClamAV/clamscan 250.00 268.00 7.2%
MultiSourc.../mediabench/jpeg/jpeg-6a/cjpeg 19.00 20.00 5.3%
MultiSourc...ch/consumer-jpeg/consumer-jpeg 19.00 20.00 5.3%
External/S...te/526.blender_r/526.blender_r 3747.00 3928.00 4.8%
MultiSourc...OE-ProxyApps-C++/miniFE/miniFE 104.00 108.00 3.8%
MultiSourc...ch/consumer-lame/consumer-lame 54.00 56.00 3.7%
MultiSource/Benchmarks/Bullet/bullet 1222.00 1264.00 3.4%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4 973.00 1005.00 3.3%
External/S.../CFP2006/447.dealII/447.dealII 2699.00 2780.00 3.0%
External/S...06/483.xalancbmk/483.xalancbmk 788.00 810.00 2.8%
External/S.../CFP2006/450.soplex/450.soplex 180.00 185.00 2.8%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR 338.00 345.00 2.1%
MultiSourc...Benchmarks/7zip/7zip-benchmark 685.00 699.00 2.0%
External/S...FP2017rate/544.nab_r/544.nab_r 158.00 160.00 1.3%
MultiSourc...sumer-typeset/consumer-typeset 772.00 781.00 1.2%
External/S...2017rate/525.x264_r/525.x264_r 410.00 414.00 1.0%
External/S...23.xalancbmk_r/523.xalancbmk_r 998.00 1002.00 0.4%
```
Compile-time is almost neutral:
https://llvm-compile-time-tracker.com/compare.php?from=b3125ad3d60531a97eea20009cc9629a87755862&to=84007eee59004f43464eda7f5ba8263ed5158df8&stat=instructions
NewPM-O3: +0.03%
NewPM-ReleaseThinLTO: -0.01%
NewPM-ReleaseLTO-g: +0.03%
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D132365
Johannes Reifferscheid [Tue, 30 Aug 2022 11:15:27 +0000 (13:15 +0200)]
Move BufferViewFlowAnalysis to the Bufferization dialect.
It's only used from there, and this lets us remove the dependency from Analysis
to the Arith dialect.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D132928
Thomas Symalla [Tue, 30 Aug 2022 11:51:45 +0000 (13:51 +0200)]
[NFC][AMDGPU] Pre-commit tests for D132837.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D132930
Valentin Clement [Tue, 30 Aug 2022 11:48:51 +0000 (13:48 +0200)]
[flang] Create a temporary of the correct size when lowering SetLength in genarr
This patch creates a temporary of the appropriate length while lowering SetLength.
The corresponding character can be truncated or padded if necessary.
This fixes issues with array constructors in arguments and with statement functions.
D132464 fixed the same issue in genval.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D132866
Nikita Popov [Tue, 30 Aug 2022 10:06:26 +0000 (12:06 +0200)]
[GVN] Regenerate test checks (NFC)
Utkarsh Saxena [Tue, 30 Aug 2022 09:07:37 +0000 (11:07 +0200)]
[clangd] Enable folding ranges by default.
Differential Revision: https://reviews.llvm.org/D132919
Tomas Matheson [Tue, 23 Aug 2022 16:04:19 +0000 (17:04 +0100)]
[AArch64][GISel] constrain regclass for 128->64 copy
When selecting G_EXTRACT to COPY for extracting a 64-bit GPR from
a 128-bit register pair (XSeqPair) we know enough to constrain the
destination register class to gpr64. Without this it may have only
a register bank and some copy elimination code would assert while
assuming that a register class existed.
The register class has to be set explicitly because we might hit the
COPY -> COPY case where register class can't be inferred.
This would cause the following to crash in selection, where the store
is commented out (otherwise the store constrains the register class):
define dso_local i128 @load_atomic_i128_unordered(i128* %p) {
%pair = cmpxchg i128* %p, i128 0, i128 0 acquire acquire
%val = extractvalue { i128, i1 } %pair, 0
; store i128 %val, i128* %p
ret i128 %val
}
Differential Revision: https://reviews.llvm.org/D132665
Tomas Matheson [Tue, 23 Aug 2022 16:01:53 +0000 (17:01 +0100)]
[AArch64][GISel] fix G_ADD*/G_SUB* legalization
widenScalarDst updates the insert point to after MI, so
widenScalarSrc must be called before widenScalarDst. Otherwise
the updated Src values will appear after MI and break SSA, e.g.:
%14:_(s64), %15:_(s1) = G_UADDE %9:_, %11:_, %13:_
becomes
%14:_(s64), %16:_(s32) = G_UADDE %9:_, %11:_, %17:_
%15:_(s1) = G_TRUNC %16:_(s32)
%17:_(s32) = G_ZEXT %13:_(s1)
Differential Revision: https://reviews.llvm.org/D132547
Change-Id: Ie3458747a6879433f4d5ab9939d2bd102dd0f2db
OCHyams [Tue, 30 Aug 2022 08:48:58 +0000 (09:48 +0100)]
[DebugInfo] Fix line number attribution in mldst-motion
Taking the example from the test included in this patch:
$ cat test.cpp -n
1 void fun(int *a, int cond) {
2 if (cond)
3 a[1] = 1;
4 else
5 a[1] = 2;
6 }
mldst-motion will merge and sink the stores in if.then and if.else into
if.end. The resultant PHI, gep and store should be attributed line zero
with the innermost common scope rather than picking a debug location from
one of the original stores.
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D132741
Benjamin Kramer [Tue, 30 Aug 2022 09:01:33 +0000 (11:01 +0200)]
[bazel] Stop building PassGenTest.cpp.inc, it was removed in
13ed6958df40b85fcc80250bb3f819863904ecee
Ting Wang [Tue, 30 Aug 2022 08:32:29 +0000 (04:32 -0400)]
[PowerPC] CTRLoop pseudo instructions should not be duplicated
Add isNotDuplicable to CTRLoop pseudo instructions, to prevent other passes
such as early-tailduplication from breaking the loop structure by duplicating
pseudo instructions.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D132738
Pavel Samolysov [Sat, 27 Aug 2022 12:22:03 +0000 (15:22 +0300)]
[LazyCallGraph] Reformat the code in accordance with the code style. NFC
Also, some local variables were renamed in accordance with the code
style, and `std::tie` occurrences and `.first`/`.second` member
uses were replaced with structured bindings.
Differential Revision: https://reviews.llvm.org/D132806
Michele Scuttari [Tue, 30 Aug 2022 07:48:11 +0000 (09:48 +0200)]
[MLIR] Unique autogenerated file for tablegen passes
Because the generated code is macro-guarded, the autogenerated `.cpp.inc` file has been merged into the `.h.inc` to reduce the build steps.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D132884
Shoaib Meenai [Sun, 28 Aug 2022 20:09:56 +0000 (01:09 +0500)]
[MachO] Don't fold compact unwind entries with LSDA
Folding them will cause the unwinder to compute the incorrect function
start address for the folded entries, which in turn will cause the
personality function to interpret the LSDA incorrectly and break
exception handling.
You can verify the end-to-end flow by creating a simple C++ file:
```
void h();
int main() { h(); }
```
and then linking this file against the liblsda.dylib produced by the
test case added here. Before this change, running the resulting program
would result in a program termination with an uncaught exception.
Afterwards, it works correctly.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D132845
Martin Storsjö [Mon, 29 Aug 2022 09:45:00 +0000 (12:45 +0300)]
[lldb] Use the NativeSocket type instead of plain 'int'
This fixes a warning when building for Windows:
../tools/lldb/source/Host/common/TCPSocket.cpp:297:16: warning: comparison of integers of different signs: 'int' and 'const NativeSocket' (aka 'const unsigned long long') [-Wsign-compare]
if (sock != kInvalidSocketValue) {
~~~~ ^ ~~~~~~~~~~~~~~~~~~~
Differential Revision: https://reviews.llvm.org/D132841
Martin Storsjö [Thu, 11 Aug 2022 21:26:46 +0000 (00:26 +0300)]
[libcxx] [test] Remove an unnecessary condition in a feature check
We don't need to check for `_LIBCPP_HAS_NO_LOCALIZATION` here;
this was copied over by mistake from the test above (which does
use locale.h).
Differential Revision: https://reviews.llvm.org/D132834
Ting Wang [Tue, 30 Aug 2022 05:57:22 +0000 (01:57 -0400)]
[NFC][PowerPC] Add test case to show ctrloop mi shall not be duplicated
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D132899
Xiang1 Zhang [Tue, 30 Aug 2022 05:41:29 +0000 (13:41 +0800)]
[NFC] Clang-format for CodeGenPrepare.cpp
Richard Smith [Tue, 30 Aug 2022 04:40:44 +0000 (21:40 -0700)]
Fix assumption that Clang version number is numeric.
This can be set at configure time and might include other characters.
liqinweng [Tue, 30 Aug 2022 03:24:38 +0000 (11:24 +0800)]
[RISCV][COST] Refactor for costs of integer saturating add/sub
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D132822
Vitaly Buka [Tue, 30 Aug 2022 03:32:12 +0000 (20:32 -0700)]
[test][msan] Use -DAG to match Debug Info
jacquesguan [Fri, 26 Aug 2022 07:17:02 +0000 (15:17 +0800)]
[InstCombine] fold fake floating point vector extract to shift+trunc.
This patch supports the FP part of D111082.
Differential Revision: https://reviews.llvm.org/D125750
wanglian [Mon, 22 Aug 2022 09:02:17 +0000 (17:02 +0800)]
[LegalizeTypes] Support widen result for VECTOR_REVERSE.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D132359
jacquesguan [Fri, 26 Aug 2022 06:56:18 +0000 (14:56 +0800)]
[InstCombine] Precommit test for D125750.
Differential Revision: https://reviews.llvm.org/D126054
Vitaly Buka [Tue, 30 Aug 2022 01:54:32 +0000 (18:54 -0700)]
[test][msan] Add missing Debug Info check from dtor test
Noah Shutty [Tue, 30 Aug 2022 01:23:04 +0000 (01:23 +0000)]
[llvm] [Debuginfod] Remove `llvm-debuginfod-find` lit tests that used python http server.
These tests depend on `ThreadingHTTPServer` which was not introduced until python 3.7 so we might as well delete them to avoid issues. Most of their functionality is now covered by the llvm-debuginfod.test for the debuginfod server.
Reviewed By: mysterymath
Differential Revision: https://reviews.llvm.org/D119096
Vitaly Buka [Tue, 30 Aug 2022 01:26:04 +0000 (18:26 -0700)]
[test][msan] Don't ignore prefix of sanitizer_dtor_callback
Vitaly Buka [Tue, 30 Aug 2022 01:07:06 +0000 (18:07 -0700)]
[test][msan] Don't ignore the suffix of use-after-dtor callback
Joseph Huber [Mon, 29 Aug 2022 20:38:20 +0000 (15:38 -0500)]
[libomptarget] Deprecate old method for setting the tripcount
Previously, the tripcount was set by a push call. We moved away from
this with the new interface that added the tripcount to the kernel
arguments struct, but kept around the old interface for legacy purposes
for the LLVM 15 release. This patch removes the support for the legacy
method.
This removes the support for the old method, but does not break
backwards compatibility. This will result in applications using the old
interface being slower when run on the device.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D132885
Vitaly Buka [Tue, 30 Aug 2022 00:49:11 +0000 (17:49 -0700)]
[test][msan] Add more Debug Info use-after-dtor tests
Vitaly Buka [Mon, 29 Aug 2022 15:08:42 +0000 (08:08 -0700)]
[test][msan] Remove unneeded CHECK-NOT
Argyrios Kyrtzidis [Sat, 27 Aug 2022 23:14:24 +0000 (16:14 -0700)]
[driver] Additional ignoring of module-map related flags, if modules are disabled
Differential Revision: https://reviews.llvm.org/D132801
Rob Suderman [Tue, 30 Aug 2022 00:20:38 +0000 (17:20 -0700)]
[mlir][tosa] Add folder for tosa.cast
Tosa.cast should fold on splats as it is trivial to fold the operation
into the splatted value.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132518
Rob Suderman [Tue, 30 Aug 2022 00:05:23 +0000 (17:05 -0700)]
[mlir][tosa] Fold tosa.reshape with splat values
Folding reshapes of splats is trivial and should be canonicalized
away.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132760
Mehdi Amini [Mon, 29 Aug 2022 10:08:54 +0000 (10:08 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in BytecodeWriter.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 10:08:17 +0000 (10:08 +0000)]
Apply clang-tidy fixes for modernize-use-emplace in BytecodeReader.cpp (NFC)
Rob Suderman [Mon, 29 Aug 2022 23:46:48 +0000 (16:46 -0700)]
[mlir][tosa] Added folders for tosa.div
Added folders for tosa.div that handle bypassing divide by one
and a zero numerator.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132693
Rong Xu [Tue, 30 Aug 2022 00:01:27 +0000 (17:01 -0700)]
fix buildbot build error.
Rong Xu [Wed, 24 Aug 2022 18:54:06 +0000 (11:54 -0700)]
[llvm-profdata] Improve profile supplementation
The current implementation promotes a non-cold function in the SampleFDO profile
into a hot function in the FDO profile. This is too aggressive. This patch
promotes a hot function in the SampleFDO profile into a hot function, and a
warm function in SampleFDO into a warm function in FDO.
Differential Revision: https://reviews.llvm.org/D132601
Rob Suderman [Mon, 29 Aug 2022 23:24:27 +0000 (16:24 -0700)]
[mlir][tosa] Added folders for tosa.mul
Added folders for tosa.sub that handles bypassing sub-zero,
fold subtraction of two splat tensors.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132678
Rob Suderman [Mon, 29 Aug 2022 21:32:30 +0000 (14:32 -0700)]
[mlir][tosa] Added folders for tosa.sub
Added folders for tosa.sub that bypass subtraction of zero
and fold the subtraction of two splat tensors.
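The two fold rules can be modelled on a toy operand type as below. This is only a sketch of the idea; `Operand`, `FoldResult` and `foldSub` are hypothetical names and do not reflect the MLIR folder interface or the actual TOSA implementation.

```cpp
#include <optional>

// Toy operand: an SSA-like value that may be a splat constant.
struct Operand {
  int id;                   // hypothetical identifier of the value
  std::optional<int> splat; // set when every element equals this value
};

struct FoldResult {
  enum Kind { None, ForwardLhs, SplatConstant };
  Kind kind;
  int value; // meaningful only when kind == SplatConstant
};

// sub(splat a, splat b) -> splat (a - b); sub(x, splat 0) -> x.
inline FoldResult foldSub(const Operand &lhs, const Operand &rhs) {
  if (lhs.splat && rhs.splat)
    return {FoldResult::SplatConstant, *lhs.splat - *rhs.splat};
  if (rhs.splat && *rhs.splat == 0)
    return {FoldResult::ForwardLhs, 0};
  return {FoldResult::None, 0};
}
```

Note that a zero on the left-hand side is not bypassed in this sketch, since `0 - x` is a negation rather than `x`.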
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132618
Rong Xu [Wed, 24 Aug 2022 19:20:46 +0000 (12:20 -0700)]
[llvm-profdata] Handle internal linkage functions in profile supplementation
This patch has the following changes:
(1) Handling of internal linkage functions (static functions)
Static functions in FDO have a prefix of source file name, while they do not
have one in SampleFDO. Current implementation does not handle this and we are
not updating the profile for static functions. This patch fixes this.
(2) Handling of -funique-internal-linkage-names
Again this is for the internal linkage functions. Option
-funique-internal-linkage-names can now be applied to both FDO and SampleFDO
compilation. When it is used, it demangles internal linkage function names and
adds a hash value as the postfix.
When both the SampleFDO and FDO profiles use this option, or neither
uses it, the changes in (1) handle this.
Here we also handle the case where the SampleFDO profile uses this option while
the FDO profile does not, or vice versa.
There is one case where this patch won't work: if one of the profiles uses
mangled names and the other does not. For example, if the SampleFDO profile
is built with the clang C compiler without -funique-internal-linkage-names, while
the FDO profile uses -funique-internal-linkage-names, the SampleFDO profile
contains unmangled names while the FDO profile contains mangled names. If
both profiles use the C++ compiler, this won't happen. We think this use case
is rare and does not justify the effort to fix.
Differential Revision: https://reviews.llvm.org/D132600
Jeff Niu [Mon, 29 Aug 2022 22:59:04 +0000 (15:59 -0700)]
[mlir] Remove a not very useful `eraseArguments` overload
This overload just wraps a bitvector, and in most cases a bitvector
could be used directly instead of a list.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D132896
Craig Topper [Mon, 29 Aug 2022 22:45:30 +0000 (15:45 -0700)]
[RISCV] Enable (srl (and X, C2), C) to form SRLIW in more cases.
Don't require the AND to have one use and don't depend on
targetShrinkDemandedConstant turning C2 into 0xffffffff. Instead,
check that the constant is 0xffffffff after replacing any bits
that will be shifted out with 1s.
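The mask condition can be checked with a small standalone model. This is a sketch under stated assumptions, not the RISC-V selection code: the function names are hypothetical, the shift amount is assumed to be in the range 1..31, and `srliw` is modelled as a logical right shift of the low 32 bits followed by sign extension of the 32-bit result.

```cpp
#include <cstdint>

// True if C2 equals 0xffffffff once the low `shamt` bits, which the
// shift discards anyway, are replaced with 1s. Assumes shamt < 32.
constexpr bool maskAllowsSrliw(uint64_t c2, unsigned shamt) {
  return (c2 | ((uint64_t{1} << shamt) - 1)) == 0xffffffffULL;
}

// Reference semantics of (srl (and X, C2), C) on a 64-bit value.
constexpr uint64_t srlOfAnd(uint64_t x, uint64_t c2, unsigned shamt) {
  return (x & c2) >> shamt;
}

// Model of SRLIW: shift the low 32 bits right logically, then
// sign-extend the 32-bit result to 64 bits.
constexpr uint64_t srliw(uint64_t x, unsigned shamt) {
  uint32_t r = static_cast<uint32_t>(x) >> shamt;
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(r)));
}
```

For example, C2 = 0xfffffff0 with a shift of 4 passes the check, and both forms then produce the same value for any X.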
Another way to fix this might be to prevent SimplifyDemandedBits
from destroying the ANDI after type legalization using
targetShrinkDemandedConstant. That would prevent the CSE that created
this mess. targetShrinkDemandedConstant is currently only enabled after
legalize ops. A quick experiment shows we can't just change when it
runs; we would need to try a different heuristic for post type
legalization.
Craig Topper [Mon, 29 Aug 2022 22:39:25 +0000 (15:39 -0700)]
[RISCV] Add test for failure to use ANDI and SRLIW due to SimplifyDemandedBits.
Jeff Niu [Mon, 29 Aug 2022 21:32:14 +0000 (14:32 -0700)]
[mlir] Add `Block::eraseArguments` that erases a subrange
This patch adds an `eraseArguments` function that erases a subrange of
a block's arguments. This can be used in place of the terrible pattern
```
block->eraseArguments(llvm::to_vector(llvm::seq(...)));
```
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D132890
Philip Reames [Mon, 29 Aug 2022 22:16:47 +0000 (15:16 -0700)]
[LV] Add debug output for force scalar tracing [nfc]
I keep finding myself needing to rule this out as a possible source of scalarization, so add debug output like we have for other instructions we decide to scalarize.
Julian Lettner [Mon, 29 Aug 2022 19:06:53 +0000 (12:06 -0700)]
[Driver] Fix & re-enable DriverKit test
This reverts commit
ce6989fd8a9fb2608f670de023fdd4611f47b811.
Lang Hames [Mon, 29 Aug 2022 21:05:51 +0000 (14:05 -0700)]
[ORC-RT] unit tests do not need access to LLVM headers.
Also delete trailing whitespace in lib/orc/CMakeLists.txt
Rong Xu [Wed, 24 Aug 2022 18:28:02 +0000 (11:28 -0700)]
[llvm-profdata] Adjust profile supplementation heuristics
(1) We now use the count size in FDO as the main factor to deal with pre-inliner.
Currently we use the number of sample records in the SampleFDO profile. But
that only counts the top-level body sample records (not including the nested
call-sites). We are seeing some big functions not being updated because of
this. I think using the count size in FDO profile is more reasonable to judge if
the function is likely to be inlined to the callers in pre-inliner.
(2) We use getMaxCount in SampleFDO rather than HeadSample to determine
if the function is hot in SampleFDO. This is in sync with the logic
in the compiler (also HeadSample can be 0).
Differential Revision: https://reviews.llvm.org/D132602
Philip Reames [Mon, 29 Aug 2022 21:04:30 +0000 (14:04 -0700)]
[LV] Refresh autogen tests to reflect naming changes [nfc]
Purely so that these can be easily autogened without spurious diffs
Craig Topper [Mon, 29 Aug 2022 20:29:16 +0000 (13:29 -0700)]
[RISCV] Use hasAllWUsers to recover ANDI.
SimplifyDemandedBits can zero the upper bits and targetShrinkDemandedConstant
isn't always able to recover it.
At least part of that may be because targetShrinkDemandedConstant
only runs in the last DAGCombine. Might be worth seeing what happens
if we move it post type legalization.
Craig Topper [Mon, 29 Aug 2022 20:12:38 +0000 (13:12 -0700)]
[RISCV] Add test case for missed opportunity to use ANDI.
Immediate was messed up by SimplifyDemandedBits.
Valery N Dmitriev [Thu, 25 Aug 2022 23:58:56 +0000 (16:58 -0700)]
[SLP] Try to match reductions before trying to vectorize a vector build sequence.
This patch changes the order of searching for reductions vs. other vectorization possibilities.
The idea is that if we fail to match a reduction, it does no harm to further attempts to
find vectorizable operations on a vector build sequence. But in the opposite
order we have a good chance of ruining the opportunity to match a reduction later.
We also don't want to try vectorizing binary operations too early, as 2-way vectorization
may effectively prohibit wider ones, leading to less efficient code.
Differential Revision: https://reviews.llvm.org/D132590
Rob Suderman [Mon, 29 Aug 2022 19:03:25 +0000 (12:03 -0700)]
[mlir][tosa] Added folders for tosa.greater
Added folders for tosa.greater that fold splat values.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132707
Slava Zakharin [Tue, 2 Aug 2022 23:26:55 +0000 (16:26 -0700)]
[mlir][math] Set llvm readnone attribute for libm functions.
Math dialect operations currently do not limit transformations
applied to them, which means that they potentially behave like
clang's -ffast-math mathematics. Clang marks math functions with
readnone attribute enabling more optimizations.
This change does the same for functions used by MathToLibm convertor.
In particular, this enables LLVM LICM for tan() call in
Polyhedron/mp_prop_design_11 compiled with flang.
Differential Revision: https://reviews.llvm.org/D131031
Joseph Huber [Mon, 29 Aug 2022 14:43:45 +0000 (09:43 -0500)]
[libomptarget] Always enable time tracing in libomptarget
Previously time tracing features were hidden behind an optional CMake
option. This was because `libomptarget` was not based on the LLVM
libraries at that time. Now that `libomptarget` is an LLVM library we
should be able to freely use the `LLVMSupport` library whenever we want
and do not need to guard it in this way.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D132852
Philip Reames [Mon, 29 Aug 2022 19:47:56 +0000 (12:47 -0700)]
[LV] Minor code restructure of isUniformAfterVectorization [nfc]
Mostly just to make a future patch easier to review.
Hans Wennborg [Mon, 29 Aug 2022 19:35:49 +0000 (21:35 +0200)]
Mark compiler-rt/test/profile/instrprof-merging.cpp unsupported on Windows
It is not reliable. See #57430.
Craig Topper [Mon, 29 Aug 2022 19:23:03 +0000 (12:23 -0700)]
[RISCV] Add more invertible setccs to tryDemorganOfBooleanCondition.
This builds on D132771 to invert (setlt 0, X) to (setlt X, 1) and
vice versa.
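The inversion relies on `!(0 < X)` being the same as `X < 1` for signed integers. A minimal standalone check, with hypothetical function names that simply mirror the two setcc forms:

```cpp
#include <cstdint>

// (setlt 0, X): true when X is strictly positive.
constexpr bool setltZeroX(int64_t x) { return 0 < x; }

// (setlt X, 1): true when X is zero or negative.
constexpr bool setltXOne(int64_t x) { return x < 1; }

// Check that the two conditions are logical inverses on [lo, hi].
// Assumes hi < INT64_MAX so the loop counter does not overflow.
inline bool areInverses(int64_t lo, int64_t hi) {
  for (int64_t x = lo; x <= hi; ++x)
    if (setltZeroX(x) == setltXOne(x))
      return false;
  return true;
}
```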
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D132798
Craig Topper [Mon, 29 Aug 2022 19:20:36 +0000 (12:20 -0700)]
[RISCV] Pre-commit tests for D132798. NFC
Craig Topper [Mon, 29 Aug 2022 19:03:17 +0000 (12:03 -0700)]
[RISCV] Apply DeMorgan to (beqz (and/or (seteq), (xor Z, 1))) to remove the xor.
We can rewrite to (bnez (or/and (setne), Z)) if Z is 0/1.
Alternatively, we could canonicalize to (xor (or/and (setne), Z), 1)
even if there is no branch. The xor would not always get removed,
but it might enable other DeMorgan combines. I decided to be
conservative for this first patch and require the xor to be removed.
I have a couple other invertible setccs I will add in a follow up
patch.
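The rewrite can be checked on a small model of the branch conditions. This is an illustrative sketch, not the DAG combine itself; the function names are hypothetical, and `a`/`b` stand for the operands of the seteq.

```cpp
#include <cstdint>

// Before: beqz (and (seteq a, b), (xor z, 1)) is taken when the and is 0.
constexpr bool takenBeforeAnd(int64_t a, int64_t b, uint64_t z) {
  return (static_cast<uint64_t>(a == b) & (z ^ 1)) == 0;
}

// After: bnez (or (setne a, b), z) is taken when the or is non-zero.
constexpr bool takenAfterOr(int64_t a, int64_t b, uint64_t z) {
  return (static_cast<uint64_t>(a != b) | z) != 0;
}

// Same pair for the or -> and direction.
constexpr bool takenBeforeOr(int64_t a, int64_t b, uint64_t z) {
  return (static_cast<uint64_t>(a == b) | (z ^ 1)) == 0;
}

constexpr bool takenAfterAnd(int64_t a, int64_t b, uint64_t z) {
  return (static_cast<uint64_t>(a != b) & z) != 0;
}

// Both rewrites agree for every combination with z restricted to 0 or 1.
inline bool equivalentForBooleanZ() {
  for (uint64_t z = 0; z <= 1; ++z)
    for (int64_t a = 0; a <= 1; ++a)
      for (int64_t b = 0; b <= 1; ++b)
        if (takenBeforeAnd(a, b, z) != takenAfterOr(a, b, z) ||
            takenBeforeOr(a, b, z) != takenAfterAnd(a, b, z))
          return false;
  return true;
}
```

The restriction on Z matters: with a = b and z = 2 the original form does not branch while the rewritten form does.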
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D132771
Yuanfang Chen [Mon, 29 Aug 2022 17:52:00 +0000 (10:52 -0700)]
[PS4][driver] make -fjmc work with LTO driver linking stage
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D131820
Yuanfang Chen [Wed, 24 Aug 2022 19:24:54 +0000 (12:24 -0700)]
[Clang] Implement function attribute nouwtable
To have finer control of IR uwtable attribute generation. For target code generation,
IR nounwind and uwtable may have some interaction. However, for the frontend, there are
no semantic interactions, so this new `nouwtable` is marked "SimpleHandler = 1".
Differential Revision: https://reviews.llvm.org/D132592
Julian Lettner [Mon, 29 Aug 2022 18:57:57 +0000 (11:57 -0700)]
Revert "[Driver] Fix & re-enable DriverKit test"
This reverts commit
2d66571729a2ffcd88398a77508b0d6432ed7ba0 due to
test failure:
http://45.33.8.238/win/65224/step_7.txt
Michele Scuttari [Mon, 29 Aug 2022 08:57:58 +0000 (10:57 +0200)]
[MLIR] Fix autogenerated pass constructors linkage
The patch addresses the linkage of the new autogenerated pass constructors, which, being declared as friend functions, were implicitly inline, so their implementations were not exported.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D132572
Philip Reames [Mon, 29 Aug 2022 18:37:42 +0000 (11:37 -0700)]
[RLEV] Pick a correct insert point when incoming instruction is itself a phi node
This fixes https://github.com/llvm/llvm-project/issues/57336. It was exposed by a recent SCEV change, but appears to have been a long-standing issue.
Note that the whole insert into the loop instead of a split exit edge is slightly contrived to begin with; it's there solely because IndVarSimplify preserves the CFG.
Differential Revision: https://reviews.llvm.org/D132571
Arthur Eubanks [Mon, 29 Aug 2022 18:25:29 +0000 (11:25 -0700)]
Revert "[runtimes] Use a response file for runtimes test suites"
This reverts commit
992e10a3fce41255e4b11782f51d0f4b26dca14d.
Breaks builds with LLVM_INCLUDE_TESTS=OFF, see comments in D132438.
ziqingluo-90 [Fri, 26 Aug 2022 20:51:10 +0000 (13:51 -0700)]
[clang-tidy] Fixing a bug in `InfiniteLoopCheck` that raises false alarms on finite loops
A loop can recursively increase/decrease a function local static
variable and make itself finite. For example,
```
void f() {
  static int i = 0;
  i++;
  while (i < 10)
    f();
}
```
Such cases are not considered by `InfiniteLoopCheck`. This commit
fixes this problem by detecting usages of static local variables
and recursions.
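To see that the loop in this example really is finite, here is a small Python model of the same control flow. It is a sketch for illustration only: the `state` dictionary stands in for the function-local static variable, and the call counter is an addition that is not part of the original example.

```python
# Module-level state stands in for the C++ function-local `static int i`.
state = {"i": 0, "calls": 0}


def f():
    state["calls"] += 1
    state["i"] += 1
    # The loop condition reads the shared counter, which the recursive
    # calls below keep incrementing, so the loop eventually exits.
    while state["i"] < 10:
        f()


f()
```

Once the outermost call returns, the shared counter has reached 10 and every nested loop has exited.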
Reviewed by: NoQ, njames93
Differential Revision: https://reviews.llvm.org/D128401
Alexey Bataev [Fri, 26 Aug 2022 17:23:43 +0000 (10:23 -0700)]
[SLP]Fix PR57322: vectorize constant float stores.
Stores of constant floats must be vectorized; improve the analysis in the SLP
vectorizer for stores.
Differential Revision: https://reviews.llvm.org/D132750
Florian Hahn [Mon, 29 Aug 2022 17:58:56 +0000 (18:58 +0100)]
[SLP] Add tests showing over-eager SLP when scalar fma can be used.
Add test cases that show over-eager SLP vectorization on
AArch64, where keeping things scalar allows efficient lowering using
scalar fmas.
Peter Klausler [Sat, 27 Aug 2022 00:34:26 +0000 (17:34 -0700)]
[flang] Don't construct TBP bindings for abstract derived types
The tables constructed by semantics that describe derived types to
the runtime support library must not include "vtable" entries for
the deferred type-bound procedures of abstract derived types;
these can turn out to be unsatisfiable external references to
procedures whose interfaces were used in the definitions of those
bindings.
Differential Revision: https://reviews.llvm.org/D132774
Julian Lettner [Sat, 27 Aug 2022 01:43:13 +0000 (18:43 -0700)]
[Driver] Fix & re-enable DriverKit test
Dave Lee [Thu, 25 Aug 2022 18:28:44 +0000 (11:28 -0700)]
[lldb] Quietly source lit-lldb-init
Improve utility of `FileCheck` output when a shell test fails.
The conflict is from:
1. On failure, `FileCheck` prints 5 lines of context
2. Shell tests first source `lit-lldb-init`, having the effect of printing its contents
If a `FileCheck` failure happens at the beginning of the input, then the
context shown is the `lit-lldb-init`, as it's over 5 lines and is the first
thing printed. As the init contents are fairly static, and presumably
uninteresting to most test failures, it seems reasonable to not print it.
Unfortunately it's not possible to use the `--source-quietly` flag in the lldb
invocation, because it will quiet all other `--source` flags on the command
line, making many tests fail.
This fix is a level of indirection, where a new sibling file named
`lit-lldb-init-quiet` is created, and its static contents are:
```
command source -C --silent-run true lit-lldb-init
```
This achieves the result of loading `lit-lldb-init` quietly. The `-C` flag
loads the path relatively.
Differential Revision: https://reviews.llvm.org/D132694
Dave Lee [Sat, 27 Aug 2022 17:28:23 +0000 (10:28 -0700)]
[lldb] Remove mention of dotest.pl
Dave Lee [Sat, 27 Aug 2022 23:33:43 +0000 (16:33 -0700)]
[lldb][test] Speed up lldb arch determination (NFC)
While investigating slow tests, I looked into why `TestMultithreaded.py` is slow. One
of the reasons is that it determines the architecture of lldb by running:
```
lldb -o 'file path/to/lldb' -o 'quit'
```
On my fairly fast machine, this takes 24 seconds, and `TestMultithreaded.py`
calls this function 4 times.
With this change, this command now takes less than 0.2s on the same machine.
The reason it's slow is symbol table and debug info loading, as indicated by
the new progress events printed to the console. One setting cut the time in
half:
```
settings set target.preload-symbols false
```
Further investigation, by profiling with Instruments on macOS, showed that
loading time was also caused by looking for scripts. The setting that
eliminates this time is:
```
settings set target.load-script-from-symbol-file false
```
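Putting the two settings together, a test helper could assemble the probe command roughly as follows. This is a sketch rather than the code from the patch: `build_arch_probe_command` is a hypothetical name, and it assumes the settings are passed as additional `-o` commands ahead of the `file` command so that they take effect before the binary is loaded.

```python
def build_arch_probe_command(lldb_path):
    """Return an argv list for a fast lldb architecture probe (hypothetical helper)."""
    commands = [
        # Skip eager symbol table loading.
        "settings set target.preload-symbols false",
        # Skip the search for scripts next to symbol files.
        "settings set target.load-script-from-symbol-file false",
        "file " + lldb_path,
        "quit",
    ]
    argv = [lldb_path]
    for command in commands:
        # Each command becomes its own `-o <command>` pair, in order.
        argv += ["-o", command]
    return argv


probe = build_arch_probe_command("path/to/lldb")
```

The resulting list can be handed to `subprocess.run` directly, without going through a shell.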
Differential Revision: https://reviews.llvm.org/D132803
Utkarsh Saxena [Thu, 4 Aug 2022 10:41:15 +0000 (12:41 +0200)]
FoldingRanges: Handle LineFoldingsOnly clients.
Do not fold the endline which contains tokens after the end of range.
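The rule can be illustrated with a small Python sketch. It assumes a simplified model in which a range is a pair of line numbers and the caller already knows whether tokens follow the end of the range on its last line; `clamp_end_line` and its parameters are hypothetical and do not mirror clangd's actual API. The guard that keeps single-line ranges unchanged is also an assumption of this sketch.

```python
def clamp_end_line(start_line, end_line, tokens_after_end):
    """Return the last line to fold for a line-only folding client."""
    if tokens_after_end and end_line > start_line:
        # Folding whole lines would hide the trailing tokens, so stop
        # one line earlier and leave the end line visible.
        return end_line - 1
    # Nothing follows the range on its last line (or the range is a
    # single line, an assumption of this sketch): keep the end line.
    return end_line
```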
Differential Revision: https://reviews.llvm.org/D131154
Quentin Colombet [Sat, 27 Aug 2022 01:14:23 +0000 (01:14 +0000)]
[mlir][MemRef] Canonicalize reinterpret_cast(extract_strided_metadata)
Add a canonicalization step for
reinterpret_cast(extract_strided_metadata).
This step replaces this sequence of operations by either:
- A noop, i.e., the original memref is directly used, or
- A plain cast of the original memref
The choice is ultimately made based on whether the original memref type
is equal to what the reinterpret_cast is producing. For instance, the
reinterpret_cast could be changing some dimensions from static to
dynamic, and in such a case we need to keep a cast.
The transformation is currently only performed when the reinterpret_cast
uses exactly the same arguments as what the extract_strided_metadata
produces. It may be possible to be more aggressive here but I wanted to
start with a relatively simple MLIR patch for my first one!
Differential Revision: https://reviews.llvm.org/D132776
Nathan Ridge [Mon, 29 Aug 2022 08:24:24 +0000 (04:24 -0400)]
[clangd] Fail more gracefully if QueryDriverDatabase cannot determine file type
Currently, QueryDriverDatabase returns an empty compile command
if it could not determine the file type.
This failure mode is unnecessarily destructive; it's better to
just return the incoming compiler command, which is still more
likely to be useful than an empty command.
Differential Revision: https://reviews.llvm.org/D132833
Aart Bik [Fri, 26 Aug 2022 20:49:07 +0000 (13:49 -0700)]
[mlir][sparse] start a sparse codegen conversion pass
This new pass provides an alternative to the current conversion pass
that converts sparse tensor types and sparse primitives to opaque pointers
and calls into a runtime support library. This pass will map sparse tensor
types to actual data structures and primitives to actual code. In the long
run, this new pass will remove our dependence on the support library, avoid
the need to link in fully templated and expanded code, and provide much better
opportunities for optimization on the generated code.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D132766
Craig Topper [Mon, 29 Aug 2022 16:32:02 +0000 (09:32 -0700)]
[VP][RISCV] Add vp.fabs intrinsic and RISC-V support.
Mostly just modeled after vp.fneg except there is a
"functional instruction" for fneg while fabs is always an
intrinsic.
Reviewed By: fakepaper56
Differential Revision: https://reviews.llvm.org/D132793
Jeff Niu [Wed, 10 Aug 2022 22:20:41 +0000 (18:20 -0400)]
[mlir][ods] OpFormat: fix type inference issues
This patch fixes issues with generating assembly format parsers for
operations that use the `operands` directive or which have unnamed
arguments or results.
This patch also fixes a function in `OpAsmParser` that always produced
an error when trying to resolve variadic operands with the same type.
Fixes #51841
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D131627