Vitaly Buka [Tue, 30 Aug 2022 03:33:01 +0000 (20:33 -0700)]
[msan] Add more specific messages for use-after-destroy
Reviewed By: kda, kstoimenov
Differential Revision: https://reviews.llvm.org/D132907
Kai Luo [Wed, 31 Aug 2022 01:23:32 +0000 (09:23 +0800)]
[AtomicExpand] Make floating point conversion happen before fence insertion
IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations.
This also fixes atomic loads of floating point values, which require a fence on PowerPC.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D127609
Richard Smith [Wed, 31 Aug 2022 01:20:51 +0000 (18:20 -0700)]
Revert "[driver] Additional ignoring of module-map related flags, if modules are disabled"
This reverts commit
33162a81d4c93a53ef847d3601b0b03830937d3c.
This change breaks the usage of module maps with modules disabled, such
as for layering checking via `-fmodules-decluse`.
Regression test added.
Shafik Yaghmour [Wed, 31 Aug 2022 01:08:44 +0000 (18:08 -0700)]
[Clang] Fix lambda CheckForDefaultedFunction(...) so that it checks the CXXMethodDecl is a special member function before attempting to call DefineDefaultedFunction(...)
Sema::CheckCompletedCXXClass(...) uses a lambda, CheckForDefaultedFunction.
The CXXMethodDecl passed to CheckForDefaultedFunction may not be a special
member function, so it needs to check before applying functions that only
apply to special member functions. It fails to do this before calling
DefineDefaultedFunction(...). This PR adds that check and a test to verify we no
longer crash.
This fixes https://github.com/llvm/llvm-project/issues/57431
Differential Revision: https://reviews.llvm.org/D132906
Weining Lu [Wed, 31 Aug 2022 00:48:01 +0000 (08:48 +0800)]
[llc] Use CPUStr instead of calling codegen::getMCPU(). NFC
`getCPUStr()` falls back to `getMCPU()`.
The only difference between `getCPUStr()` and `getMCPU()` is that
`getCPUStr()` handles `-mcpu=native`. That doesn't matter for this case.
This is just a simplification of the original code and it does not
change the functionality. So no new tests added.
Differential Revision: https://reviews.llvm.org/D132849
Lang Hames [Sat, 27 Aug 2022 03:29:02 +0000 (20:29 -0700)]
[ORC-RT] Make llvm-jitlink an ORC-RT specific dependence.
The llvm-jitlink tool is not needed by other sanitizer tests.
Joseph Huber [Tue, 30 Aug 2022 21:21:22 +0000 (16:21 -0500)]
[Libomptarget] Remove old workaround for GCC 5,6 from libomptarget
Some code previously needed the `used` attribute to prevent the GCC
compiler versions 5 and 6 from removing it. This is no longer required
as the minimum supported GCC version for LLVM 16 is >=7.1.0.
Reviewed By: JonChesterfield, vzakhari
Differential Revision: https://reviews.llvm.org/D132976
bzcheeseman [Tue, 9 Aug 2022 15:11:13 +0000 (08:11 -0700)]
[Docs][CodeReview] Add a small paragraph on adding tokens, NFC.
Reviewed By: whisperity
Differential Revision: https://reviews.llvm.org/D131500
LLVM GN Syncbot [Tue, 30 Aug 2022 22:53:54 +0000 (22:53 +0000)]
[gn build] Port
ea9ac3519c13
Greg Clayton [Fri, 24 Jun 2022 22:08:59 +0000 (15:08 -0700)]
An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests.
Resubmission of https://reviews.llvm.org/D126254, where decodeBase64Byte is no longer a lambda but a static function. Some compilers have different errors or warnings with respect to what needs to be captured and what doesn't (see comments in https://reviews.llvm.org/D126254 for details).
Differential Revision: https://reviews.llvm.org/D128560
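As a rough illustration of the decoding described in the commit above, here is a minimal Python sketch. It is not the LLVM C++ implementation: the names `decode_base64_byte` and `decode_base64`, the `INVALID` sentinel, and the error handling are all assumptions made for this example. The per-character helper is a plain module-level function, loosely mirroring the commit's move from a lambda to a static function.

```python
# Hypothetical sketch of padded, standard-alphabet base64 decoding.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
INVALID = 64  # sentinel: not a base64 character (assumed convention)


def decode_base64_byte(ch: str) -> int:
    """Map one base64 character to its 6-bit value, or INVALID."""
    idx = ALPHABET.find(ch)
    return idx if idx >= 0 else INVALID


def decode_base64(text: str) -> bytes:
    """Decode padded base64; raise ValueError on malformed input."""
    if len(text) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):
        chunk = text[i:i + 4]
        is_last = i + 4 == len(text)
        # Padding is only allowed at the end of the final 4-character group.
        pad = len(chunk) - len(chunk.rstrip("=")) if is_last else 0
        if pad > 2:
            raise ValueError("too much padding")
        values = [decode_base64_byte(c) for c in chunk[:4 - pad]]
        if INVALID in values:
            raise ValueError("invalid base64 character")
        values += [0] * pad
        # Four 6-bit values form one 24-bit word, i.e. up to three bytes.
        word = (values[0] << 18) | (values[1] << 12) | (values[2] << 6) | values[3]
        out += word.to_bytes(3, "big")[:3 - pad]
    return bytes(out)
```

Each group of four characters yields three bytes, minus one byte per `=` of padding.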
Ben Langmuir [Tue, 30 Aug 2022 22:50:09 +0000 (15:50 -0700)]
Revert "[clang][deps] Split translation units into individual -cc1 or other commands"
Failing on some bots, reverting until I can fix it.
This reverts commit
f80a0ea760728e70f70debf744277bc3aa59bc17.
Markus Böck [Tue, 30 Aug 2022 22:35:07 +0000 (00:35 +0200)]
[GlobalISel] Explicitly fail trying to translate `gc.statepoint` and related intrinsics
The provided testcase would previously fail with an assertion because later code tried to allocate registers for `token` return types and arguments. This is especially problematic as the process would then exit instead of falling back to using FastISel.
This patch fixes that by explicitly failing translation if any of these intrinsics is encountered.
Fixes https://github.com/llvm/llvm-project/issues/57349
Differential Revision: https://reviews.llvm.org/D132974
Ben Langmuir [Thu, 25 Aug 2022 16:22:31 +0000 (09:22 -0700)]
[clang][deps] Split translation units into individual -cc1 or other commands
Instead of trying to "fix" the original driver invocation by appending
arguments to it, split it into multiple commands, and for each -cc1
command use a CompilerInvocation to give precise control over the
invocation.
This change should make it easier to (in the future) canonicalize the
command-line (e.g. to improve hits in something like ccache), apply
optimizations, or start supporting multi-arch builds, which would
require different modules for each arch.
In the long run it may make sense to treat the TU commands as a
dependency graph, each with their own dependencies on modules or earlier
TU commands, but for now they are simply a list that is executed in
order, and the dependencies are simply duplicated. Since we currently
only support single-arch builds, there is no parallelism available in
the execution.
Differential Revision: https://reviews.llvm.org/D132405
Ian Anderson [Tue, 30 Aug 2022 20:09:21 +0000 (13:09 -0700)]
[clang][modules] Don't hard code [no_undeclared_includes] for the Darwin module
The Darwin module has specified [no_undeclared_includes] for at least five years now, there's no need to hard code it in the compiler.
Reviewed By: ributzka, Bigcheese
Differential Revision: https://reviews.llvm.org/D132971
Mingming Liu [Sat, 20 Aug 2022 04:14:43 +0000 (21:14 -0700)]
[NFC] Move a test case across files.
The test case is about a pmull2 instruction being generated rather than a SIMD
ldr being generated, so aarch64-pmull2.ll is a better test file.
Differential Revision: https://reviews.llvm.org/D132277
Jeff Niu [Tue, 30 Aug 2022 16:46:56 +0000 (09:46 -0700)]
[mlir] Fix try_value_begin_impl for DenseElementsAttr
The previous implementation would still crash if the element type was
not iterable. This patch changes SparseElementsAttr to properly
implement `try_value_begin_impl` according to ElementsAttr and changes
DenseElementsAttr to implement `tryGetValues` as the basis for querying
element values.
Depends on D132904
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D132958
Jeff Niu [Tue, 30 Aug 2022 01:14:52 +0000 (18:14 -0700)]
[mlir][ElementsAttr] Change value_begin_impl to try_value_begin_impl
This patch changes `value_begin_impl` to a fallible
`try_value_begin_impl` so that specific cases can fail iteration if the
type doesn't match the internal storage.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D132904
Slava Zakharin [Fri, 26 Aug 2022 23:12:25 +0000 (16:12 -0700)]
[flang] Lower integer exponentiation into math::IPowI.
Differential Revision: https://reviews.llvm.org/D132770
Jonas Devlieghere [Tue, 30 Aug 2022 21:01:49 +0000 (14:01 -0700)]
[lldb] Fix two bugs in ObjectContainerMachOFileset
Fix two small issues in the live-memory variant of ObjectContainerMachOFileset.
Differential revision: https://reviews.llvm.org/D132973
Kirill Okhotnikov [Tue, 30 Aug 2022 21:04:00 +0000 (23:04 +0200)]
[libc][math] Fix broken atan function.
Kirill Okhotnikov [Tue, 30 Aug 2022 20:59:00 +0000 (22:59 +0200)]
[libc][math] Fix broken tests.
Alex Zinenko [Tue, 30 Aug 2022 20:55:31 +0000 (22:55 +0200)]
[mlir] fix -Wsign-compare equivalent on Windows
Some clients treat this as compilation error.
Kirill Okhotnikov [Mon, 29 Aug 2022 10:34:15 +0000 (12:34 +0200)]
[libc][math] Added atanf function.
Performance by core-math (core-math/glibc 2.31/current llvm-14):
28.879/20.843/20.15
Differential Revision: https://reviews.llvm.org/D132842
Kirill Okhotnikov [Sun, 28 Aug 2022 18:03:19 +0000 (20:03 +0200)]
[libc][math] Added atanhf function.
Performance by core-math (core-math/glibc 2.31/current llvm-14):
10.845/43.174/13.467
The review is done on top of D132809.
Differential Revision: https://reviews.llvm.org/D132811
Kirill Okhotnikov [Sun, 28 Aug 2022 17:12:41 +0000 (19:12 +0200)]
[libc][math] Added auxiliary function log2_eval for asinhf/acoshf/atanhf.
1) Added a `double log2_eval(double)` function with better than float precision.
2) Some refactoring done to put all auxiliary functions and corresponding data
in one place to reuse the code.
3) Added tests for new functions.
4) Performance and precision tests of the function show that it is more precise than the existing log2
(no exceptional cases), but timing is ~5% higher than the current one.
Differential Revision: https://reviews.llvm.org/D132809
Jeff Niu [Tue, 30 Aug 2022 19:13:15 +0000 (12:13 -0700)]
[mlir] Allow dense array to be parsed with type elision
This patch makes parsing dense arrays with type elision work properly.
If a ranked tensor type is supplied to `parseAttribute` on a dense
array, the element type is skipped. Moreover, if type elision is set to
`AttrTypeElision::Must`, the element type is elided.
For example, this allows
```
memref.global @z : memref<3xi32> = array<1, 2, 3>
```
Fixes #57433
Depends on D132758
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D132964
Jeff Niu [Thu, 25 Aug 2022 23:21:28 +0000 (16:21 -0700)]
[mlir] Make DenseArrayAttr generic
This patch turns `DenseArrayBaseAttr` into a fully-functional attribute by
adding a generic parser and printer, supporting bool or integer and floating
point element types with bitwidths divisible by 8. It has been renamed
to `DenseArrayAttr`. The patch maintains the specialized subclasses,
e.g. `DenseI32ArrayAttr`, which remain the preferred API for accessing
elements in C++.
This allows `DenseArrayAttr` to hold signed and unsigned integer elements:
```
array<si8: -128, 127>
array<ui8: 255>
```
"Exotic" floating point elements:
```
array<bf16: 1.2, 3.4>
```
And integers of other bitwidths:
```
array<i24: 8388607>
```
Reviewed By: rriddle, lattner
Differential Revision: https://reviews.llvm.org/D132758
Michele Scuttari [Tue, 30 Aug 2022 20:20:36 +0000 (22:20 +0200)]
Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit
2be8af8f0e0780901213b6fd3013a5268ddc3359.
Lang Hames [Tue, 30 Aug 2022 20:08:22 +0000 (13:08 -0700)]
[ORC] Update mapper deinitialize functions to deinitialize in reverse order.
This updates the ExecutorSharedMemoryMapperService::deinitialize and
InProcessMemoryMapper::deinitialize methods to deinitialize in reverse order,
bringing them into alignment with the behavior of
InProcessMemoryManager::deallocate and SimpleExecutorMemoryManager::deallocate.
Reverse deinitialization is required because later allocations can depend on
earlier ones.
This fixes failures in the ORC runtime test suite.
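The ordering argument in the commit above can be sketched in a few lines of Python. This is a toy model only: `Mapper`, `initialize`, `make`, and the `alive` set are hypothetical names for this example and do not correspond to the ORC C++ classes.

```python
class Mapper:
    """Toy mapper that records allocations and tears them down in reverse."""

    def __init__(self):
        self._allocations = []  # kept in initialization order
        self.log = []           # names in the order they were deinitialized

    def initialize(self, name, deinit_action):
        self._allocations.append((name, deinit_action))

    def deinitialize(self):
        # Walk newest-to-oldest so dependents go before what they depend on.
        while self._allocations:
            name, action = self._allocations.pop()
            action()
            self.log.append(name)


alive = set()


def make(name, needs=None):
    """Register `name` as alive and return its teardown callback."""
    alive.add(name)

    def teardown():
        # A dependent's teardown may still use the earlier allocation.
        if needs is not None and needs not in alive:
            raise RuntimeError(f"{name} torn down after {needs}")
        alive.discard(name)

    return teardown


mapper = Mapper()
mapper.initialize("base", make("base"))
mapper.initialize("dependent", make("dependent", needs="base"))
mapper.deinitialize()
```

Tearing down in forward order would run the teardown for "base" first, and the teardown for "dependent" would then raise because its prerequisite is already gone.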
Rob Suderman [Tue, 30 Aug 2022 19:59:50 +0000 (12:59 -0700)]
[mlir][tosa] Fix windows build-bot error due to implicit i64 cast
There is an implicit i64 cast due to the << during MulOp's folder.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D132969
Michele Scuttari [Tue, 30 Aug 2022 19:56:31 +0000 (21:56 +0200)]
[MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D132838
Gulfem Savrun Yeniceri [Wed, 17 Aug 2022 22:17:58 +0000 (22:17 +0000)]
[profile] Create only prof header when no counters
When we use selective instrumentation and instrument a file
that is not in the selected files list provided via -fprofile-list,
we generate an empty raw profile. This leads to empty_raw_profile
error when we try to read that profile. This patch fixes the issue by
generating a raw profile that contains only a profile header when
there are no counters and profile data.
A small reproducer for the above issue:
echo "src:other.cc" > code.list
clang++ -O2 -fprofile-instr-generate -fcoverage-mapping \
  -fprofile-list=code.list code.cc -o code
./code
llvm-profdata show default.profraw
Differential Revision: https://reviews.llvm.org/D132094
Craig Topper [Tue, 30 Aug 2022 19:37:00 +0000 (12:37 -0700)]
[RISCV] Use uint64_t countTrailingZeros/Ones instead of APInt. NFC
We know the type is 32 or 64 bits, we can use getZExtValue and
bypass the slow path check in APInt.
Sanjay Patel [Tue, 30 Aug 2022 19:21:17 +0000 (15:21 -0400)]
[Verifier] remove stale comment about PHI with no operands; NFC
The code was changed with:
9eb2c0113dfe
...but missed the corresponding code comment.
Alexey Bataev [Tue, 30 Aug 2022 15:09:31 +0000 (08:09 -0700)]
[SLP]Fix PR57447: Assertion `!getTreeEntry(V) && "Scalar already in tree!"' failed.
The pointer operands for the ScatterVectorize node may contain
non-instruction values and they are not checked for "already being
vectorized". Need to check whether such pointers are already vectorized and
gather them instead of trying to build a vectorize node, to avoid a compiler
crash.
Differential Revision: https://reviews.llvm.org/D132949
Craig Topper [Tue, 30 Aug 2022 18:59:37 +0000 (11:59 -0700)]
[RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros.
We can use srliw to shift out the trailing bits and slli to shift
back in zeros. The sign extend of srliw will 0 the upper 32 bits
since we will be shifting a 0 into bit 31.
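The identity behind this selection can be checked with a small Python model of the two instructions. The helper names are made up for this sketch, and the model assumes the mask has ones exactly in bits [trailing_zeros, 31], with 1 <= trailing_zeros <= 31.

```python
MASK64 = (1 << 64) - 1


def srliw(x: int, shamt: int) -> int:
    """RV64 srliw: logical right shift of the low 32 bits, sign-extended from bit 31."""
    res = (x & 0xFFFFFFFF) >> shamt
    if res & 0x80000000:
        res |= 0xFFFFFFFF00000000
    return res


def slli(x: int, shamt: int) -> int:
    """RV64 slli: logical left shift, truncated to 64 bits."""
    return (x << shamt) & MASK64


def and_shifted_mask(x: int, trailing_zeros: int) -> int:
    """Compute x & mask, where mask has ones in bits [trailing_zeros, 31] only."""
    # With a non-zero shift, bit 31 of the srliw result is 0, so the
    # sign extension clears the upper 32 bits.
    return slli(srliw(x, trailing_zeros), trailing_zeros)
```

For example, with `trailing_zeros = 4` the mask is `0xFFFFFFF0`, and `and_shifted_mask(x, 4)` equals `x & 0xFFFFFFF0` for any 64-bit `x`.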
Stanislav Mekhanoshin [Mon, 29 Aug 2022 19:16:52 +0000 (12:16 -0700)]
[AMDGPU] Limit TID / wavefrontsize uniformness to 1D kernels
If a kernel has uneven dimensions, the value of workitem-id-x
divided by the wavefrontsize can be non-uniform. For example, dimensions (65, 2)
will have workitems with addresses (64, 0) and (0, 1) packed into the same
wave, which gives 1 and 0 respectively after the division by 64.
Unfortunately, this limits the optimization to OpenCL only and only if
the reqd_work_group_size attribute is set. This patch limits it to 1D kernels,
although it should be possible to perform this optimization if the size
of the X dimension is a power of 2; we just do not currently have
the infrastructure to query it.
Note that presence of amdgpu-no-workitem-id-y attribute does not help
as it only hints the lack of the workitem-id-y query, but not the absence
of the actual 2nd dimension, therefore affecting just the SGPR allocation.
Differential Revision: https://reviews.llvm.org/D132879
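The (65, 2) example from the commit above can be reproduced with a short Python sketch. It assumes workitems are packed into waves in row-major order (x varies fastest) with a wavefront size of 64; the function name is hypothetical.

```python
WAVEFRONT_SIZE = 64


def wave_and_tid_quotient(x: int, y: int, size_x: int):
    """Return (wave index, workitem-id-x // wavefront size) for workitem (x, y)."""
    flat_id = y * size_x + x  # assumed row-major packing of workitems into waves
    return flat_id // WAVEFRONT_SIZE, x // WAVEFRONT_SIZE


# Both workitems land in wave 1, yet their quotients differ (1 vs. 0),
# so the quotient is not uniform across that wave.
print(wave_and_tid_quotient(64, 0, 65))
print(wave_and_tid_quotient(0, 1, 65))
```

In a 1D kernel the flat id equals workitem-id-x, so the quotient is the wave index itself and is uniform within each wave.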
Luke Nihlen [Mon, 29 Aug 2022 16:27:46 +0000 (16:27 +0000)]
[clang] Don't emit debug vtable information for consteval functions
Fixes https://github.com/llvm/llvm-project/issues/55065
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D132874
Justin Bogner [Tue, 30 Aug 2022 18:52:42 +0000 (11:52 -0700)]
[AMDGPU] Precommit two tests showing missed combines to v_med3
Joe Nash [Mon, 29 Aug 2022 18:42:20 +0000 (14:42 -0400)]
[AMDGPU][GFX11] Fix dst register class for V_CVT_U32_U16
This instruction was referring to the wrong VOPProfile, likely due to a
typo, leading to an incorrect destination register type.
The MC layer will care about this change, but is NFC while 16-bit values
actually use 32 bit registers.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D132878
isuckatcs [Thu, 25 Aug 2022 13:11:53 +0000 (15:11 +0200)]
[clang-tidy] Fix false positive on ArrayInitIndexExpr inside ProBoundsConstantArrayIndexCheck
Sometimes in the AST we can have an ArraySubscriptExpr,
where the index is an ArrayInitIndexExpr.
ArrayInitIndexExpr is not a constant, so
ProBoundsConstantArrayIndexCheck reports a warning when
it sees such expression. This expression can only be
implicitly generated, and always appears inside an
ArrayInitLoopExpr, so we shouldn't report a warning.
Differential Revision: https://reviews.llvm.org/D132654
Sanjay Patel [Tue, 30 Aug 2022 17:31:53 +0000 (13:31 -0400)]
[InstCombine] add tests for xor-of-ctlz/cttz; NFC
Sanjay Patel [Mon, 29 Aug 2022 17:39:16 +0000 (13:39 -0400)]
[InstCombine] delete redundant folds; NFC
InstSimplify does this via isKnownNonEqual(), so it's already
using knownbits on these patterns and trying other folds.
Sanjay Patel [Mon, 29 Aug 2022 14:38:23 +0000 (10:38 -0400)]
[InstCombine] add tests for signbit test using lshr; NFC
Muiez Ahmed [Tue, 30 Aug 2022 18:18:44 +0000 (14:18 -0400)]
[SystemZ][z/OS] Account for renamed parameter name (libc++)
The following patch (https://reviews.llvm.org/D129051) broke z/OS builds by renaming the parameter name. This patch accounts for that change.
Differential Revision: https://reviews.llvm.org/D132946
Aart Bik [Tue, 30 Aug 2022 18:06:35 +0000 (11:06 -0700)]
[mlir][sparse] add missing file for singleton revision
Differential Revision: https://reviews.llvm.org/D132961
Stephen Long [Tue, 30 Aug 2022 18:06:43 +0000 (11:06 -0700)]
[SVE] Fix SVEDup0 matching -0.0f
Because of D128669, CPY is being used to zero active lanes even in the case of -0.0f. This patch checks for floating point positive zero. That way SVEDup0 won't match -0.0f.
Fixes https://github.com/llvm/llvm-project/issues/57428
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D132880
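The distinction this fix relies on can be shown in Python: -0.0 compares equal to 0.0, but only +0.0 has an all-zero bit pattern, so zeroing a lane cannot produce -0.0. The helper names below are illustrative and are not LLVM APIs.

```python
import math
import struct


def is_positive_zero(value: float) -> bool:
    """True only for +0.0; -0.0 compares equal to 0.0 but has the sign bit set."""
    return value == 0.0 and math.copysign(1.0, value) > 0


def float32_bits(value: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]
```

A plain equality test against zero would accept both values, which is why the check has to look at the sign as well.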
Eugene Zhulenev [Fri, 26 Aug 2022 22:38:41 +0000 (15:38 -0700)]
[mlir] Async: add unrealized cast materializations to AsyncToLLVM pass
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D132768
Aart Bik [Mon, 29 Aug 2022 22:43:20 +0000 (15:43 -0700)]
[mlir][sparse] add more dimension level types and properties
We recently removed the singleton dimension level type (see the revision
https://reviews.llvm.org/D131002) since it was unimplemented but also
incomplete (properties were missing). This revision adds singleton back as an
extra dimension level type, together with properties ordered/not-ordered
and unique/not-unique. Even though still not lowered to actual code, this
provides a complete way of defining many more sparse storage schemes (in
the long run, we want to support even dimension level types and properties
using the additional extensions proposed in [Chou]).
Note that the current solution of using suffixes for the properties is not
ideal, but keeps the extension relatively simple with respect to parsing and
printing. Furthermore, it is rather consistent with the TACO implementation
which uses things like Compressed-Unique as well. Nevertheless, we probably
want to separate dimension level types from properties when we add more types
and properties.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D132897
Chris Bieneman [Tue, 30 Aug 2022 17:19:05 +0000 (12:19 -0500)]
[Docs] Fixing incorrect document title
Doh! This clearly slipped my review. Thanks DuckDuckGo for showing me
the error of my ways :).
Chris Bieneman [Tue, 30 Aug 2022 17:16:46 +0000 (12:16 -0500)]
[Docs] [HLSL] Documenting HLSL Entry Functions
This document describes the basic usage and implementation details for
HLSL entry functions in Clang.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D132672
Jim Ingham [Fri, 19 Aug 2022 00:20:55 +0000 (17:20 -0700)]
Change the meaning of a UUID with all zeros for data.
Previously, depending on how you constructed a UUID from data or a
StringRef, an input value of all zeros was valid (e.g. setFromData)
or not (e.g. setFromOptionalData). Since there was no way to tell
which interpretation to use, it was done somewhat inconsistently.
This standardizes the meaning of a UUID of all zeros to Not Valid,
and removes all the Optional methods and their uses, as well as the
static factories that supported them.
Differential Revision: https://reviews.llvm.org/D132191
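The standardized rule from the commit above (all-zero data means Not Valid) can be modeled in a few lines of Python. The `UUID` class here is a hypothetical stand-in for illustration, not LLDB's C++ class or its API.

```python
class UUID:
    """Toy UUID where all-zero data is treated the same as no data."""

    def __init__(self, data: bytes = b""):
        # A single rule for every construction path: keep the bytes only
        # if at least one of them is non-zero.
        self._bytes = bytes(data) if any(data) else b""

    def is_valid(self) -> bool:
        return bool(self._bytes)

    def __eq__(self, other):
        return isinstance(other, UUID) and self._bytes == other._bytes

    def __hash__(self):
        return hash(self._bytes)
```

With one interpretation there is no need for separate "optional" constructors: `UUID(bytes(16))` and `UUID()` are both invalid and compare equal.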
Joseph Huber [Mon, 29 Aug 2022 21:49:14 +0000 (16:49 -0500)]
[Libomptarget] Make unified shared memory test unsupported on AMDGPU
This test is an expected failure on AMDGPU. The expected failure is a GPU memory
failure, which will typically result in the device totally failing. This isn't
an issue for some GPU configurations that do not use the offloading device to
also drive the display server. However, if the main GPU is used for testing it
will reliably result in the user's display becoming unresponsive. This makes it
difficult to run the GPU offloading tests on many systems.
This patch simply makes this test unsupported so it no longer runs and freezes
my computer when using `ninja check-openmp`.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D132891
Matheus Izvekov [Fri, 17 Jun 2022 20:29:27 +0000 (22:29 +0200)]
[clang] Improve diagnostics for expansion length mismatch
When checking parameter packs for expansion, instead of basing the diagnostic for
length mismatch for outer parameters only on the known number of expansions,
we should also analyze SubstTemplateTypeParmPackType and SubstNonTypeTemplateParmPackExpr
for unexpanded packs, so we can emit a diagnostic pointing to a concrete
outer parameter.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D128095
Hendrik Greving [Tue, 30 Aug 2022 16:14:11 +0000 (09:14 -0700)]
[mlir] Fix signed ceildiv, loop normalization.
Fixes using the signed ceildiv op instead of incorrectly assuming positive loop bounds.
Adjusts the tests for above.
Differential Revision: https://reviews.llvm.org/D132953
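The commit message above does not show the failing case. The Python sketch below assumes the problem is a negative value reaching a ceiling division that presumes non-negative operands, modeled here as an unsigned reinterpretation of a 64-bit two's-complement value. All function names are hypothetical.

```python
BITS = 64


def to_unsigned(value: int) -> int:
    """Reinterpret a signed 64-bit value as unsigned (two's complement)."""
    return value % (1 << BITS)


def ceildiv_unsigned(a: int, b: int) -> int:
    """Ceiling division after treating both operands as unsigned."""
    ua, ub = to_unsigned(a), to_unsigned(b)
    return (ua + ub - 1) // ub


def ceildiv_signed(a: int, b: int) -> int:
    """Ceiling division valid for any sign of a (b != 0)."""
    return -(-a // b)  # Python's // floors, so negating twice gives the ceiling
```

The two agree for non-negative operands, for example both give 4 for (7, 2). For (-6, 3) the signed form gives -2, while the unsigned form sees a huge positive numerator and returns an unrelated value.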
Hendrik Greving [Thu, 25 Aug 2022 19:52:01 +0000 (12:52 -0700)]
[mlir] Make division unsigned.
Uses arith.divui where it is safe to do so.
Adjusts the tests for above.
Differential Revision: https://reviews.llvm.org/D132701
Mingming Liu [Tue, 30 Aug 2022 05:26:53 +0000 (22:26 -0700)]
[NFC][AArch64] Specify datalayout explicitly for cast.ll and
arith-overflow.ll and update tests accordingly.
- These two tests stand out when data layout is explicitly added in a
sweep study (D132889)
Differential Revision: https://reviews.llvm.org/D132856
Daniel Bertalan [Tue, 30 Aug 2022 15:57:32 +0000 (17:57 +0200)]
[lld-macho] Rename {StubHelper,ObjCStubs}Section::setup() to setUp (NFC)
The phrasal verb is spelled "set up"; "setup" is a noun.
Suggested in https://reviews.llvm.org/D132947#inline-1280089
Hendrik Greving [Tue, 24 May 2022 17:06:56 +0000 (10:06 -0700)]
[BasicBlockUtils] Amend test for loop metadata.
Amends test Transforms/LoopSimplify/update_latch_md2.ll
with auto-generated checks.
Differential Revision: https://reviews.llvm.org/D125574
David Penry [Mon, 29 Aug 2022 22:44:19 +0000 (15:44 -0700)]
[ModuloScheduler] Fix missing LLVM_DEBUG
Guard a debug message with LLVM_DEBUG
Differential Revision: https://reviews.llvm.org/D132895
Mark de Wever [Sat, 20 Aug 2022 12:51:52 +0000 (14:51 +0200)]
[libc++] Improves feature-test macro diagnostics.
This was mentioned in review D131326.
Reviewed By: var-const, #libc, philnik
Differential Revision: https://reviews.llvm.org/D132293
Simon Pilgrim [Tue, 30 Aug 2022 14:59:00 +0000 (15:59 +0100)]
[CostModel][X86] Account for add/sub 512-bit vector splitting costs on non-AVX512BW targets
Jolanta Jensen [Fri, 19 Aug 2022 14:45:45 +0000 (15:45 +0100)]
[NFC][LoopLoadElim] Extending type-mismatch testing
Added IR for int-pointer type mismatch and int-vector
type mismatch. Regenerated CHECK lines using
the update_test_checks.py script.
Differential Revision: https://reviews.llvm.org/D132239
Shivam Gupta [Tue, 30 Aug 2022 13:17:24 +0000 (18:47 +0530)]
[llvm-size] Fix missing file name for darwin output format with non-Mach-O
llvm-size falls back to printing in Berkeley format, if --format=darwin is specified and a non-Mach-O object has been provided. However, it does not print the input filename when it should:
Before -
(base) xgupta@archlinux ~/llvm/llvm-project/build (main*) $ llvm-size ~/hello.o --format=darwin
text data bss dec hex filename
291 0 0 291 123 %
After -
(base) xgupta@archlinux ~/llvm/llvm-project/build (main*) $ bin/llvm-size ~/hello.o --format=darwin
text data bss dec hex filename
291 0 0 291 123 /home/xgupta/hello.o
Fix #42316
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D132364
Alex Zinenko [Tue, 30 Aug 2022 08:23:57 +0000 (10:23 +0200)]
[mlir] materialize strided memref layout as attribute
Introduce a new attribute to represent the strided memref layout. Strided
layouts are omnipresent in code generation flows and are the only kind of
layouts produced and supported by half of the operations in the memref dialect
(view-related, shape-related). However, they are internally represented as
affine maps that require a somewhat fragile extraction of the strides from the
linear form that also comes with an overhead. Furthermore, textual
representation of strided layouts as affine maps is difficult to read: compare
`affine_map<(d0, d1, d2)[s0, s1] -> (d0*32 + d1*s0 + s1 + d2)>` with
`strides: [32, ?, 1], offset: ?`. While a rudimentary support for parsing a
syntactically sugared version of the strided layout has existed in the codebase
for a long time, it does not go as far as this commit to make the strided
layout a first-class attribute in the IR.
This introduces the attribute and updates the tests that use the pre-existing
sugared form to use the new attribute instead. Most memref created
programmatically, e.g., in passes, still use the affine form with further
extraction of strides and will be updated separately.
Update and clean-up the memref type documentation that has gotten stale and has
been referring to the details of affine map composition that are long gone.
See https://discourse.llvm.org/t/rfc-materialize-strided-memref-layout-as-an-attribute/64211.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D132864
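To make the comparison in the commit above concrete, the Python sketch below evaluates the same layout both ways. The function names are invented for this example; `s0` stands for the dynamic stride and `s1` for the dynamic offset, matching the `?` entries in `strides: [32, ?, 1], offset: ?`.

```python
def linear_index(indices, strides, offset):
    """Element position for a strided layout: offset + sum(index * stride)."""
    if len(indices) != len(strides):
        raise ValueError("rank mismatch")
    return offset + sum(i * s for i, s in zip(indices, strides))


def affine_form(d0, d1, d2, s0, s1):
    """The affine map from the commit message, written out directly."""
    return d0 * 32 + d1 * s0 + s1 + d2
```

With indices (2, 3, 4), a dynamic stride of 5, and a dynamic offset of 7, both forms give 2*32 + 3*5 + 4*1 + 7 = 90. The strided form states the strides and offset directly instead of leaving them to be extracted from a linear expression.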
Nico Weber [Mon, 29 Aug 2022 16:39:50 +0000 (12:39 -0400)]
[llvm-otool] Print dyld_info output before chained_fixup output
This matches otool.
Differential Revision: https://reviews.llvm.org/D132865
Matthias Springer [Tue, 30 Aug 2022 14:55:49 +0000 (16:55 +0200)]
[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.while)
This change implements the same functionality as D132860, but for scf.while.
Differential Revision: https://reviews.llvm.org/D132927
Matthias Springer [Tue, 30 Aug 2022 14:46:23 +0000 (16:46 +0200)]
[mlir][SCF][bufferize][NFC] Move scf.if buffer type computation to getBufferType
A part of the functionality of `bufferize` is extracted into `getBufferType`. Also, bufferized scf.yields inside scf.if are now created with the correct bufferized type from the get-go.
Differential Revision: https://reviews.llvm.org/D132862
Matthias Springer [Tue, 30 Aug 2022 14:42:29 +0000 (16:42 +0200)]
[mlir][arith][bufferize][NFC] Move buffer type computation to getBufferType
A part of the functionality of `bufferize` is extracted into `getBufferType`.
Differential Revision: https://reviews.llvm.org/D132861
zhijian [Tue, 30 Aug 2022 14:38:38 +0000 (10:38 -0400)]
[AIX][clang][driver] Check the command string to the linker for exportlist opts
Summary:
Some of the code in the patch was contributed by David Tenty.
1. We currently only check driver Wl options and don't check for the plain -b, -Xlinker or other options which get passed through to the linker when we decide whether to run llvm-nm --export-symbols, so we may run it in situations where we wouldn't if the user had used the equivalent -Wl, prefixed options. If we run the export list utility when the user has specified an export list, we could export more symbols than they intended.
2. Add new functionality to allow redirecting the stdin, stdout, and stderr of individual Jobs. If redirects are set for the Job, use them; otherwise fall back to the global Compilation redirects, if any.
Reviewers: David Tenty, Fangrui Song, Steven Wan
Differential Revision: https://reviews.llvm.org/D119147
Matthias Springer [Tue, 30 Aug 2022 14:32:09 +0000 (16:32 +0200)]
[mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.for)
Even though iter_arg and init_arg of an scf.for loop may have the same tensor type, their bufferized memref types are not necessarily equal. It is sometimes necessary to insert a cast in case of differing layout maps.
Differential Revision: https://reviews.llvm.org/D132860
Jon Chesterfield [Tue, 30 Aug 2022 14:29:38 +0000 (15:29 +0100)]
[amdgpu][nfc] Add test case showing false aliasing in LDS lowering
Matthias Springer [Tue, 30 Aug 2022 14:26:12 +0000 (16:26 +0200)]
[mlir][bufferization] Generalize getBufferType
This change generalizes getBufferType. This function can be used to predict the buffer type of any tensor value (not just BlockArguments) without changing any IR. It also subsumes getMemorySpace. This is useful for loop bufferization, where the precise buffer type of an iter_arg cannot be known without examining the loop body.
Differential Revision: https://reviews.llvm.org/D132859
Dmitry Preobrazhensky [Tue, 30 Aug 2022 14:04:09 +0000 (17:04 +0300)]
[AMDGPU][MC][GFX11][NFC] Update asm tests for VOP3P instructions
Differential Revision: https://reviews.llvm.org/D132876
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:59:29 +0000 (16:59 +0300)]
[AMDGPU][MC][GFX11][NFC] Add tests for opcode promotions and forced suffixes
Differential Revision: https://reviews.llvm.org/D132869
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:54:58 +0000 (16:54 +0300)]
[AMDGPU][MC][GFX11][NFC] Add missing asm tests for VOPC and VOPC.DPP instructions
Differential Revision: https://reviews.llvm.org/D132690
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:48:57 +0000 (16:48 +0300)]
[AMDGPU][MC][GFX11][NFC] Update asm tests for VOPC instructions promoted to VOP3
Differential Revision: https://reviews.llvm.org/D132857
Alexey Bataev [Mon, 29 Aug 2022 20:08:47 +0000 (13:08 -0700)]
[SLP]Improve operands kind analysis for constants.
Removed the EnableFP parameter in the getOperandInfo function since it is not
needed: the operand kinds are also controlled by the operation code, which
allows removing the extra check for the type of the operands. Also, added
analysis for uniform constant float values.
This change currently does not trigger any changes in the code since TTI
does not do analysis for constant floats, so it can be considered NFC.
Tested with llvm-test-suite + SPEC2017, no changes.
Differential Revision: https://reviews.llvm.org/D132886
Dmitry Preobrazhensky [Tue, 30 Aug 2022 13:21:23 +0000 (16:21 +0300)]
[AMDGPU][MC][GFX11][NFC] Update asm tests for VOP3 instructions
Differential Revision: https://reviews.llvm.org/D132854
Timm Bäder [Mon, 29 Aug 2022 04:51:09 +0000 (06:51 +0200)]
[clang][Parse] Fix crash when emitting template diagnostic
This was passing a 6 to the diagnostic engine, which the diagnostic
message didn't handle.
Add the new value to the diagnostic message, remove an unused value and
add a test.
This fixes https://github.com/llvm/llvm-project/issues/57415
Differential Revision: https://reviews.llvm.org/D132821
Matheus Izvekov [Sun, 28 Aug 2022 14:19:02 +0000 (16:19 +0200)]
[libcxx] CI: set symbolizer for bootstrapping build
Setting the symbolizer is required for getting a pretty
stack trace when Clang crashes.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D132807
serge-sans-paille [Mon, 29 Aug 2022 15:02:50 +0000 (17:02 +0200)]
[clang] Fix -Warray-bound interaction with -fstrict-flex-arrays=1
The test to check if an array was a FAM in the context of array bound checking
and strict-flex-arrays=1 was inverted.
As a by-product, improve test coverage.
Differential Revision: https://reviews.llvm.org/D132853
Markus Böck [Tue, 30 Aug 2022 12:46:22 +0000 (14:46 +0200)]
[cmake] Don't include symlinks to tools in Build-all when `LLVM_BUILD_TOOLS` is off
When building LLVM with LLVM_BUILD_TOOLS as OFF, numerous tools such as llvm-ar or llvm-objcopy end up still being built. The reason for this is that the symlink targets are unconditionally included in a Build-all build, causing the tool they're symlinking to be built after all.
This patch changes that behaviour to be more intuitive by only including the symlink in a Build-all build if the target they're linking to is also included.
Differential Revision: https://reviews.llvm.org/D132883
zhongyunde [Tue, 30 Aug 2022 12:36:30 +0000 (20:36 +0800)]
[InstCombine] Distributive or+mul with const operand
We already support the transform: `(X+C1)*CI -> X*CI+C1*CI`
Here the case is a little special as the form of `(X+C1)*CI` is transformed into `(X|C1)*CI`,
so we should also support the transform: `(X|C1)*CI -> X*CI+C1*CI`
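The new transform is only sound because `X|C1` equals `X+C1` when `X` and `C1` share no set bits, which is presumably the situation in which the add was rewritten to an or in the first place. A minimal C++ sketch of that arithmetic identity follows; the helper names are hypothetical and this is not the InstCombine code itself:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers illustrating the arithmetic identity only.

// Original form: (X | C1) * CI
uint32_t orThenMul(uint32_t X, uint32_t C1, uint32_t CI) {
  return (X | C1) * CI;
}

// Distributed form: X * CI + C1 * CI
uint32_t mulThenAdd(uint32_t X, uint32_t C1, uint32_t CI) {
  return X * CI + C1 * CI;
}

// The two forms agree whenever X and C1 have no bits in common, because
// then X | C1 == X + C1 (no carries can occur). Unsigned arithmetic wraps,
// so the identity also holds when the products overflow 32 bits.
bool formsAgree(uint32_t X, uint32_t C1, uint32_t CI) {
  assert((X & C1) == 0 && "identity requires disjoint bits");
  return orThenMul(X, C1, CI) == mulThenAdd(X, C1, CI);
}
```

For example, with `X = 8`, `C1 = 3` and `CI = 5`, both forms evaluate to 55.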
Fixes https://github.com/llvm/llvm-project/issues/57278
Reviewed By: bcl5980, spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D132658
Florian Hahn [Tue, 30 Aug 2022 12:27:50 +0000 (13:27 +0100)]
[DSE] Support looking through memory phis at end of function.
Update isWriteAtEndOfFunction to look through MemoryPhis. The reason
MemoryPhis were skipped so far was the known AliasAnalysis issue with it
missing loop-carried dependences.
This problem is already addressed in other parts of the code by skipping
MemoryDefs that may be in different loops. I think the same logic can
be applied here.
This can have a substantial impact on the number of stores removed in
some cases. For MultiSource/SPEC2006/SPEC2017 with -O3:
```
Metric: dse.NumFastStores
Program dse.NumFastStores
base patch diff
External/S...CINT2017rate/557.xz_r/557.xz_r 14.00 45.00 221.4%
External/S...te/538.imagick_r/538.imagick_r 439.00 1267.00 188.6%
MultiSourc...e/Applications/SIBsim4/SIBsim4 6.00 15.00 150.0%
MultiSourc...Prolangs-C/simulator/simulator 3.00 7.00 133.3%
MultiSource/Applications/siod/siod 3.00 7.00 133.3%
MultiSourc...arks/FreeBench/distray/distray 6.00 9.00 50.0%
MultiSourc...e/Applications/obsequi/Obsequi 22.00 30.00 36.4%
MultiSource/Benchmarks/Ptrdist/bc/bc 23.00 28.00 21.7%
External/S...NT2017rate/502.gcc_r/502.gcc_r 1258.00 1512.00 20.2%
External/S...te/520.omnetpp_r/520.omnetpp_r 954.00 1143.00 19.8%
External/S...rate/510.parest_r/510.parest_r 5961.00 7122.00 19.5%
External/S...C/CINT2006/445.gobmk/445.gobmk 47.00 56.00 19.1%
External/S...00.perlbench_r/500.perlbench_r 241.00 286.00 18.7%
External/S...NT2006/471.omnetpp/471.omnetpp 36.00 42.00 16.7%
External/S...06/400.perlbench/400.perlbench 183.00 210.00 14.8%
MultiSource/Applications/SPASS/SPASS 72.00 81.00 12.5%
External/S...17rate/541.leela_r/541.leela_r 72.00 80.00 11.1%
External/SPEC/CINT2006/403.gcc/403.gcc 585.00 642.00 9.7%
MultiSourc...e/Applications/sqlite3/sqlite3 120.00 131.00 9.2%
MultiSourc...Applications/hexxagon/hexxagon 11.00 12.00 9.1%
External/S.../CFP2006/453.povray/453.povray 566.00 615.00 8.7%
External/S...rate/511.povray_r/511.povray_r 578.00 627.00 8.5%
External/S...FP2006/482.sphinx3/482.sphinx3 12.00 13.00 8.3%
MultiSource/Applications/oggenc/oggenc 130.00 140.00 7.7%
MultiSourc...e/Applications/ClamAV/clamscan 250.00 268.00 7.2%
MultiSourc.../mediabench/jpeg/jpeg-6a/cjpeg 19.00 20.00 5.3%
MultiSourc...ch/consumer-jpeg/consumer-jpeg 19.00 20.00 5.3%
External/S...te/526.blender_r/526.blender_r 3747.00 3928.00 4.8%
MultiSourc...OE-ProxyApps-C++/miniFE/miniFE 104.00 108.00 3.8%
MultiSourc...ch/consumer-lame/consumer-lame 54.00 56.00 3.7%
MultiSource/Benchmarks/Bullet/bullet 1222.00 1264.00 3.4%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4 973.00 1005.00 3.3%
External/S.../CFP2006/447.dealII/447.dealII 2699.00 2780.00 3.0%
External/S...06/483.xalancbmk/483.xalancbmk 788.00 810.00 2.8%
External/S.../CFP2006/450.soplex/450.soplex 180.00 185.00 2.8%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR 338.00 345.00 2.1%
MultiSourc...Benchmarks/7zip/7zip-benchmark 685.00 699.00 2.0%
External/S...FP2017rate/544.nab_r/544.nab_r 158.00 160.00 1.3%
MultiSourc...sumer-typeset/consumer-typeset 772.00 781.00 1.2%
External/S...2017rate/525.x264_r/525.x264_r 410.00 414.00 1.0%
External/S...23.xalancbmk_r/523.xalancbmk_r 998.00 1002.00 0.4%
```
Compile-time is almost neutral:
https://llvm-compile-time-tracker.com/compare.php?from=b3125ad3d60531a97eea20009cc9629a87755862&to=84007eee59004f43464eda7f5ba8263ed5158df8&stat=instructions
NewPM-O3: +0.03%
NewPM-ReleaseThinLTO: -0.01%
NewPM-ReleaseLTO-g: +0.03%
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D132365
Johannes Reifferscheid [Tue, 30 Aug 2022 11:15:27 +0000 (13:15 +0200)]
Move BufferViewFlowAnalysis to the Bufferization dialect.
It's only used from there, and this lets us remove the dependency from Analysis
to the Arith dialect.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D132928
Thomas Symalla [Tue, 30 Aug 2022 11:51:45 +0000 (13:51 +0200)]
[NFC][AMDGPU] Pre-commit tests for D132837.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D132930
Valentin Clement [Tue, 30 Aug 2022 11:48:51 +0000 (13:48 +0200)]
[flang] Create a temporary of the correct size when lowering SetLength in genarr
This patch creates a temporary of the appropriate length while lowering SetLength.
The corresponding character can be truncated or padded if necessary.
This fixes an issue with array constructors in arguments and also with statement functions.
D132464 was fixing the same issue in genval.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D132866
Nikita Popov [Tue, 30 Aug 2022 10:06:26 +0000 (12:06 +0200)]
[GVN] Regenerate test checks (NFC)
Utkarsh Saxena [Tue, 30 Aug 2022 09:07:37 +0000 (11:07 +0200)]
[clangd] Enable folding ranges by default.
Differential Revision: https://reviews.llvm.org/D132919
Tomas Matheson [Tue, 23 Aug 2022 16:04:19 +0000 (17:04 +0100)]
[AArch64][GISel] constrain regclass for 128->64 copy
When selecting G_EXTRACT to COPY for extracting a 64-bit GPR from
a 128-bit register pair (XSeqPair) we know enough to constrain the
destination register class to gpr64. Without this it may have only
a register bank and some copy elimination code would assert while
assuming that a register class existed.
The register class has to be set explicitly because we might hit the
COPY -> COPY case where register class can't be inferred.
This would cause the following to crash in selection, where the store
is commented out (otherwise the store constrains the register class):
define dso_local i128 @load_atomic_i128_unordered(i128* %p) {
%pair = cmpxchg i128* %p, i128 0, i128 0 acquire acquire
%val = extractvalue { i128, i1 } %pair, 0
; store i128 %val, i128* %p
ret i128 %val
}
Differential Revision: https://reviews.llvm.org/D132665
Tomas Matheson [Tue, 23 Aug 2022 16:01:53 +0000 (17:01 +0100)]
[AArch64][GISel] fix G_ADD*/G_SUB* legalization
widenScalarDst updates the insert point to after MI, so
widenScalarSrc must be called before widenScalarDst. Otherwise
the updated Src values will appear after MI and break SSA. e.g.:
%14:_(s64), %15:_(s1) = G_UADDE %9:_, %11:_, %13:_
becomes
%14:_(s64), %16:_(s32) = G_UADDE %9:_, %11:_, %17:_
%15:_(s1) = G_TRUNC %16:_(s32)
%17:_(s32) = G_ZEXT %13:_(s1)
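The ordering problem can be reproduced with a toy model of a block and a movable insert point. The sketch below is only an illustration: `ToyBlock`, `widenSrc` and `widenDst` are hypothetical stand-ins, not the GlobalISel legalizer API, and they only mimic the insert-point behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine basic block plus a builder insert point.
// None of these names are the real GlobalISel API.
struct ToyBlock {
  std::vector<std::string> Insts{"G_UADDE"};
  std::size_t InsertPt = 0; // starts just before G_UADDE

  // Emit an instruction at the insert point and step past it.
  void emit(const std::string &Inst) {
    Insts.insert(Insts.begin() + InsertPt, Inst);
    ++InsertPt;
  }

  // Models widening a source operand: the extension is emitted at the
  // current insert point, wherever that happens to be.
  void widenSrc() { emit("G_ZEXT"); }

  // Models widening the destination: the insert point is moved to just
  // after G_UADDE, and the truncation is emitted there.
  void widenDst() {
    for (std::size_t I = 0; I < Insts.size(); ++I)
      if (Insts[I] == "G_UADDE")
        InsertPt = I + 1;
    emit("G_TRUNC");
  }
};

// Correct order: the G_ZEXT lands before its user.
std::vector<std::string> srcThenDst() {
  ToyBlock B;
  B.widenSrc();
  B.widenDst();
  return B.Insts;
}

// Buggy order: the G_ZEXT lands after its user, as in the example above.
std::vector<std::string> dstThenSrc() {
  ToyBlock B;
  B.widenDst();
  B.widenSrc();
  return B.Insts;
}
```

In this model `srcThenDst()` yields G_ZEXT, G_UADDE, G_TRUNC, while `dstThenSrc()` yields G_UADDE, G_TRUNC, G_ZEXT, which is the broken sequence shown above.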
Differential Revision: https://reviews.llvm.org/D132547
Change-Id: Ie3458747a6879433f4d5ab9939d2bd102dd0f2db
OCHyams [Tue, 30 Aug 2022 08:48:58 +0000 (09:48 +0100)]
[DebugInfo] Fix line number attribution in mldst-motion
Taking the example from the test included in this patch:
$ cat test.cpp -n
1 void fun(int *a, int cond) {
2 if (cond)
3 a[1] = 1;
4 else
5 a[1] = 2;
6 }
mldst-motion will merge and sink the stores in if.then and if.else into
if.end. The resultant PHI, gep and store should be attributed line zero
with the innermost common scope rather than picking a debug location from
one of the original stores.
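A source-level picture of the transformation may help show why neither line 3 nor line 5 is the right location for the merged store. This is a hand-written sketch, assuming the pass behaves as described above; `funSunk` is a hypothetical name and is not produced by the compiler:

```cpp
#include <cassert>

// The function from the example above, unchanged.
void fun(int *a, int cond) {
  if (cond)
    a[1] = 1;
  else
    a[1] = 2;
}

// Hypothetical source-level equivalent of the result after mldst-motion:
// each branch only selects a value, and one store remains at the join.
void funSunk(int *a, int cond) {
  int value; // plays the role of the PHI of the two stored values
  if (cond)
    value = 1;
  else
    value = 2;
  a[1] = value; // the single merged store in if.end
}
```

The merged store executes for both values of `cond`, so attributing it to either original line would mislead a debugger or profiler when the other branch was taken.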
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D132741
Benjamin Kramer [Tue, 30 Aug 2022 09:01:33 +0000 (11:01 +0200)]
[bazel] Stop building PassGenTest.cpp.inc, it was removed in
13ed6958df40b85fcc80250bb3f819863904ecee
Ting Wang [Tue, 30 Aug 2022 08:32:29 +0000 (04:32 -0400)]
[PowerPC] CTRLoop pseudo instructions should not be duplicated
Add isNotDuplicable to CTRLoop pseudo instructions, to prevent other passes
such as early-tailduplication from breaking the loop structure by duplicating
pseudo instructions.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D132738
Pavel Samolysov [Sat, 27 Aug 2022 12:22:03 +0000 (15:22 +0300)]
[LazyCallGraph] Reformat the code in accordance with the code style. NFC
Also, some local variables were renamed in accordance with the code
style as well as `std::tie` occurrences and `.first`/`.second` member
uses were replaced with structured bindings.
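For readers unfamiliar with the C++17 feature, the kind of rewrite meant here looks roughly like the following. This is an illustrative sketch with a hypothetical `divMod` helper, not code taken from LazyCallGraph:

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical helper that returns two results as a pair.
std::pair<int, int> divMod(int A, int B) { return {A / B, A % B}; }

// Before: std::tie into variables declared ahead of time.
int withTie(int A, int B) {
  int Quot, Rem;
  std::tie(Quot, Rem) = divMod(A, B);
  return Quot * B + Rem;
}

// Before: .first/.second member accesses on the pair.
int withFirstSecond(int A, int B) {
  std::pair<int, int> Result = divMod(A, B);
  return Result.first * B + Result.second;
}

// After: a structured binding names both results at the declaration.
int withBinding(int A, int B) {
  auto [Quot, Rem] = divMod(A, B);
  return Quot * B + Rem;
}
```

All three functions compute the same value; the structured binding just removes the separate declarations and the `.first`/`.second` noise.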
Differential Revision: https://reviews.llvm.org/D132806
Michele Scuttari [Tue, 30 Aug 2022 07:48:11 +0000 (09:48 +0200)]
[MLIR] Unique autogenerated file for tablegen passes
Since the generated code is macro-guarded, the autogenerated `.cpp.inc` file has been merged into the `.h.inc` file to reduce the number of build steps.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D132884
Shoaib Meenai [Sun, 28 Aug 2022 20:09:56 +0000 (01:09 +0500)]
[MachO] Don't fold compact unwind entries with LSDA
Folding them will cause the unwinder to compute the incorrect function
start address for the folded entries, which in turn will cause the
personality function to interpret the LSDA incorrectly and break
exception handling.
You can verify the end-to-end flow by creating a simple C++ file:
```
void h();
int main() { h(); }
```
and then linking this file against the liblsda.dylib produced by the
test case added here. Before this change, running the resulting program
would result in a program termination with an uncaught exception.
Afterwards, it works correctly.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D132845
Martin Storsjö [Mon, 29 Aug 2022 09:45:00 +0000 (12:45 +0300)]
[lldb] Use the NativeSocket type instead of plain 'int'
This fixes a warning when building for Windows:
../tools/lldb/source/Host/common/TCPSocket.cpp:297:16: warning: comparison of integers of different signs: 'int' and 'const NativeSocket' (aka 'const unsigned long long') [-Wsign-compare]
if (sock != kInvalidSocketValue) {
~~~~ ^ ~~~~~~~~~~~~~~~~~~~
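The shape of the fix can be sketched as follows. The alias and constant below are assumed stand-ins based on the types named in the warning text; the real lldb declarations may differ, and `openSocket` and `isValid` are hypothetical:

```cpp
#include <cassert>

// Assumed stand-ins for the Windows definitions implied by the warning;
// the real lldb declarations may differ.
using NativeSocket = unsigned long long;
const NativeSocket kInvalidSocketValue = ~0ULL;

// Hypothetical function returning a socket handle or the invalid value.
NativeSocket openSocket(bool Succeed) {
  return Succeed ? NativeSocket{42} : kInvalidSocketValue;
}

// Storing the handle in a NativeSocket keeps both sides of the comparison
// the same type, so -Wsign-compare has nothing to flag.
bool isValid(bool Succeed) {
  NativeSocket Sock = openSocket(Succeed);
  return Sock != kInvalidSocketValue;
}
```

Declaring `Sock` as `int` instead would compare a signed value against an unsigned one, which is what the warning reports.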
Differential Revision: https://reviews.llvm.org/D132841
Martin Storsjö [Thu, 11 Aug 2022 21:26:46 +0000 (00:26 +0300)]
[libcxx] [test] Remove an unnecessary condition in a feature check
We don't need to check for `_LIBCPP_HAS_NO_LOCALIZATION` here;
this was copied over by mistake from the test above (which does
use locale.h).
Differential Revision: https://reviews.llvm.org/D132834