Martin Storsjö [Wed, 27 Apr 2022 10:28:01 +0000 (13:28 +0300)]
[libcxx] Switch __cxx_contention_t to int32_t on 32 bit AIX
I guess this is an ABI break for the 32 bit AIX configuration, but I'm
not sure if that one is meant to be ABI stable yet or not.
Previously, this used int32_t for this type on linux, but int64_t
on all other platforms. This was added in D68480 /
54fa9ecd3088508b05b0c5b5cb52da8a3c188655, but I don't really see
any discussion around this detail there.
Switching this to 32 bit on 32 bit AIX silences these libcxx build
warnings:
```
In file included from /scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/libcxx/src/atomic.cpp:12:
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1005:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_fetch_add(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
^
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:948:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_load(const_cast<__ptr_type>(&__a->__a_value), static_cast<__memory_order_underlying_t>(__order));
^
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1000:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_fetch_add(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
^
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1022:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_fetch_sub(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
^
4 warnings generated.
```
Differential Revision: https://reviews.llvm.org/D124519
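A minimal sketch of the width selection described in this commit, not libc++'s actual header. The names `contention_t` and `contention_size` and the boolean template parameter are hypothetical; they stand in for the platform check (Linux or 32 bit AIX) that libc++ performs with preprocessor conditionals.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the platform test. `true` models the platforms
// that get the 32 bit counter (Linux, and 32 bit AIX after this change).
template <bool UseNarrowCounter>
using contention_t =
    typename std::conditional<UseNarrowCounter, std::int32_t,
                              std::int64_t>::type;

// Size in bytes of the counter that the atomic operations would access.
template <bool UseNarrowCounter>
constexpr std::size_t contention_size() {
  return sizeof(contention_t<UseNarrowCounter>);
}
```

With a maximum lock-free size of 4 bytes, as in the warnings above, only the narrow variant stays within the lock-free limit.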
Benjamin Kramer [Thu, 12 May 2022 15:59:39 +0000 (17:59 +0200)]
[DenseElementAttr] Silence warning in -DNDEBUG builds. NFC.
Quentin Colombet [Tue, 3 May 2022 02:04:42 +0000 (19:04 -0700)]
[DeadArgElim] Re-apply: Set unused arguments for internal functions
The re-apply includes fixes to clang tests that were missed in
the original commit.
Original message:
Prior to this patch we would only set to undef the unused arguments of the
external functions. The rationale was that unused arguments of internal
functions wouldn't need to be turned into undef arguments because they
should have been simply eliminated by the time we reach that code.
This is actually not true because there are plenty of cases where we can't
remove unused arguments. For instance, if the internal function is used in
an indirect call, it may not be possible to change the function signature.
Yet, for statically known call-sites we would still like to mark the unused
arguments as undef.
This patch enables the "set undef arguments" optimization on internal
functions when we encounter cases where internal functions cannot be
optimized. I.e., whenever an internal function is marked "live".
Differential Revision: https://reviews.llvm.org/D124699
Chris Lattner [Thu, 12 May 2022 15:17:52 +0000 (16:17 +0100)]
Various improvements suggested by river NFC.
Differential Revision: https://reviews.llvm.org/D125471
Chris Lattner [Thu, 12 May 2022 04:32:16 +0000 (05:32 +0100)]
[DenseElementAttr] Simplify the public API for creating these.
Instead of requiring the client to compute the "isSplat" bit,
compute it internally. This makes the logic more consistent
and defines away a lot of "elements.size()==1" in the clients.
This addresses Issue #55185
Differential Revision: https://reviews.llvm.org/D125447
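A rough sketch of the API shape this commit describes, assuming the splat bit can be derived from the element count alone. `DenseAttrSketch` and `makeDense` are hypothetical names, not MLIR's API; the point is only that the factory computes the bit so that callers no longer pass it.

```cpp
#include <utility>
#include <vector>

// Hypothetical, simplified attribute: the real DenseElementsAttr stores
// typed element data, not a plain vector of ints.
struct DenseAttrSketch {
  std::vector<int> elements;
  bool isSplat;
};

// The factory derives isSplat itself. Treating "exactly one element" as the
// splat condition is an assumption of this sketch.
DenseAttrSketch makeDense(std::vector<int> elements) {
  const bool isSplat = elements.size() == 1;
  return DenseAttrSketch{std::move(elements), isSplat};
}
```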
Eric Schweitz [Tue, 10 May 2022 14:51:15 +0000 (07:51 -0700)]
Fixes a performance problem with lowering of forall loops and creating
too many temporaries.
Fix clang-format errors.
Differential Revision: https://reviews.llvm.org/D125336
Fraser Cormack [Thu, 12 May 2022 14:45:04 +0000 (15:45 +0100)]
[CodeGen][NFC] Move some comments from the end of lines to above them
This avoids wrapping the line itself awkwardly when it exceeds 80 chars.
It also better matches our style most other places.
Jeremy Morse [Thu, 12 May 2022 14:39:51 +0000 (15:39 +0100)]
[DebugInfo][InstrRef] Describe value sizes when spilt to stack
This is a re-apply of D123599, which was reverted in 4fe2ab5279408, now
with a more appropriate assertion. Original commit message follows:
InstrRefBasedLDV can track and describe variable values that are spilt to
the stack -- however it does not currently describe the size of the value on
the stack. This can cause uninitialized bytes to be read from the stack if
a small register is spilt for a larger variable, or theoretically on
big-endian machines if a large value on the stack is used for a small
variable.
Fix this by using DW_OP_deref_size to specify the amount of data to load
from the stack, if there's any possibility for ambiguity. There are a few
scenarios where this can be omitted (such as when using DW_OP_piece and a
non-DW_OP_stack_value location), see deref-spills-with-size.mir for an
explicit table of input flavours and output expressions.
Differential Revision: https://reviews.llvm.org/D123599
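To illustrate why the load size matters, here is a small sketch that models a sized load from a little-endian stack slot. `loadSized` is a hypothetical helper, not debugger or LLVM code; it only mimics the effect of reading a fixed number of bytes, which is what DW_OP_deref_size specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read `size` bytes (at most 8) from the start of a little-endian slot and
// zero-extend the result. Bytes beyond `size` are never touched.
std::uint64_t loadSized(const std::vector<unsigned char> &slot,
                        std::size_t size) {
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < size && i < 8 && i < slot.size(); ++i)
    value |= static_cast<std::uint64_t>(slot[i]) << (8 * i);
  return value;
}
```

If a 4-byte register holding 0x12345678 is spilt into an 8-byte slot whose upper bytes are stale, a 4-byte load recovers the value while an 8-byte load also picks up the stale bytes.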
Pavel Samolysov [Thu, 12 May 2022 14:39:26 +0000 (16:39 +0200)]
[ArgPromotion] Make a non-byval promotion attempt first
It makes sense to make a non-byval promotion attempt first and then
fall back to the byval one. The non-byval ('usual') promotion is
generally better, for example it does promotion even when a structure
has more elements than 'MaxElements' but not all of them are actually
used in the function.
Differential Revision: https://reviews.llvm.org/D124514
Richard Howell [Wed, 4 May 2022 17:25:34 +0000 (10:25 -0700)]
[clang] serialize ORIGINAL_PCH_DIR relative to BaseDirectory
This diff changes the serialization of the `ORIGINAL_PCH_DIR`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This will allow for the module to be relocatable
across machines.
The path is restored relative to the module's BaseDirectory on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124946
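The round trip described here can be sketched with plain string handling. `makeRelative` and `restore` are hypothetical helpers, and the sketch assumes POSIX-style `/` separators and simple prefix stripping; Clang's actual serialization code is not shown.

```cpp
#include <string>

// On write: strip `baseDir` from the front of `path` if `path` lives under
// it; otherwise keep the path unchanged.
std::string makeRelative(const std::string &path, const std::string &baseDir) {
  if (baseDir.empty() || path.size() <= baseDir.size())
    return path;
  if (path.compare(0, baseDir.size(), baseDir) != 0)
    return path;
  if (path[baseDir.size()] != '/')
    return path;
  return path.substr(baseDir.size() + 1);
}

// On read: re-anchor a relative path at the (possibly different) base
// directory of the machine that loads the file.
std::string restore(const std::string &stored, const std::string &baseDir) {
  if (stored.empty() || stored[0] == '/' || baseDir.empty())
    return stored;
  return baseDir + "/" + stored;
}
```

Because only the relative part is stored, the same file resolves correctly under a different base directory on another machine.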
Richard Howell [Wed, 4 May 2022 16:48:44 +0000 (09:48 -0700)]
[clang] serialize SUBMODULE_TOPHEADER relative to BaseDirectory
This diff changes the serialization of the `SUBMODULE_TOPHEADER`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This matches the behavior of the
`SUBMODULE_HEADER` entry and will allow for the module to be
relocatable across machines.
The path is restored relative to the module's `BaseDirectory` on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124938
Richard Howell [Wed, 4 May 2022 15:44:40 +0000 (08:44 -0700)]
[clang] add -fmodule-file-home-is-cwd
This diff adds a new frontend flag `-fmodule-file-home-is-cwd`.
The behavior of this flag is similar to
`-fmodule-map-file-home-is-cwd` but does not require the module
map files to be modified to have inputs relative to the cwd.
Instead the output modules will have their `BaseDirectory` set
to the cwd and will try and resolve paths relative to that.
The motivation for this change is to support relocatable pcm
files that are built on different machines with different paths
without having to alter module map files, which is sometimes not
possible as they are provided by 3rd parties.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124874
serge-sans-paille [Thu, 12 May 2022 13:33:17 +0000 (15:33 +0200)]
[openmp] Fix strict aliasing issue in cmpxchg routine
Avoid warning under -fstrict-aliasing by using a call to memcpy to perform type
punning.
Differential Revision: https://reviews.llvm.org/D125467
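A minimal sketch of type punning through memcpy, the general technique this commit refers to. `bitsOf` and `floatFromBits` are illustrative names, not the OpenMP runtime's routines, and the sketch assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the object representation of a float as a 32-bit integer.
// Copying the bytes avoids accessing the float through an incompatible
// pointer type, which is what -fstrict-aliasing diagnostics complain about.
std::uint32_t bitsOf(float value) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return bits;
}

// The inverse direction, again by copying bytes.
float floatFromBits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}
```

Compilers typically lower such a fixed-size memcpy to a plain register move, so the safer form usually costs nothing.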
Nikita Popov [Thu, 12 May 2022 12:56:44 +0000 (14:56 +0200)]
[AArch64] Preserve chain when lowering fixed length load to SVE (PR55281)
When a fixed length load is lowered to an SVE masked load, the
result chain is currently set to the input chain of the old load,
rather than the result chain of the new load. This may cause stores
to be incorrectly reordered.
Fixes https://github.com/llvm/llvm-project/issues/55281.
Differential Revision: https://reviews.llvm.org/D125464
Tomasz Kamiński [Thu, 12 May 2022 13:40:11 +0000 (15:40 +0200)]
Reland "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible"
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.
This avoids producing a very large `RHS` when the symbol is cast to
an unsigned number, and as a consequence makes the value more robust in
presence of casts.
Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.
Reviewed By: steakhal
Patch By: tomasz-kaminski-sonarsource!
Differential Revision: https://reviews.llvm.org/D124658
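The canonicalization rule can be sketched on a toy expression type. `Op`, `SymIntSketch` and `canonicalize` are hypothetical and use a 64-bit `RHS` for simplicity; the analyzer's real `SymIntExpr` is more general.

```cpp
#include <cstdint>
#include <limits>

enum class Op { Add, Sub };

// Toy model of `x <op> rhs`; the symbol itself is left out.
struct SymIntSketch {
  Op op;
  std::int64_t rhs;
};

// Rewrite `x + (-N)` as `x - N` and `x - (-N)` as `x + N`, unless N is the
// lowest negative value, whose negation is not representable.
SymIntSketch canonicalize(SymIntSketch e) {
  if (e.rhs < 0 && e.rhs != std::numeric_limits<std::int64_t>::min()) {
    e.rhs = -e.rhs;
    e.op = (e.op == Op::Add) ? Op::Sub : Op::Add;
  }
  return e;
}
```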
Thomas Raoux [Wed, 11 May 2022 17:43:44 +0000 (17:43 +0000)]
[mlir][vector] Add lowering pattern for vector.warp_execute_on_lane_0 op
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.
This also adds an integration test running on GPU warp. The same tests can be
later re-used with different comment lines to test distribution
transformations.
This is mostly from @springerm contribution.
Differential Revision: https://reviews.llvm.org/D125430
Ken Matsui [Thu, 12 May 2022 13:25:05 +0000 (09:25 -0400)]
Warn if using `elifdef` & `elifndef` in not C2x & C++2b mode
This adds an extension warning when using the preprocessor conditionals
in a language mode they're not officially supported in, and an opt-in
warning for compatibility with previous standards.
Fixes #55306
Differential Revision: https://reviews.llvm.org/D125178
Pedro Olsen Ferreira [Thu, 12 May 2022 12:44:47 +0000 (13:44 +0100)]
Rename and fix ValueMap::resize to reserve
The underlying map type (DenseMap) has had its resize() function
renamed to reserve() as part of
c04fc7a60ff4ea4610ea157be006c9771224a7b6 (SVN 264026).
This is only visible when the member function is called, as it is
template type name dependent.
Differential Revision: https://reviews.llvm.org/D125387
Martin Storsjö [Thu, 11 Nov 2021 10:55:10 +0000 (12:55 +0200)]
[AArch64] Stop creating unnecessary label MCSymbols for each Windows unwind opcode. NFC.
These labels aren't needed in the ARM version of WinEH tables, as each
unwind opcode maps to a specific instruction (each opcode is assumed
to represent one instruction), and the written tables don't contain
offsets like on x86_64.
Differential Revision: https://reviews.llvm.org/D125369
Martin Storsjö [Wed, 24 Nov 2021 12:03:54 +0000 (14:03 +0200)]
[MC] [Win64EH] Simplify code using WinEH::Instruction::operator!=. NFC.
operator== and operator!= were added in
1308bb99e06752ab0b5175c92da31083f91af921 / D87369, but this existing
codepath wasn't updated to use them.
Also fix the indentation of the enclosed lines.
Differential Revision: https://reviews.llvm.org/D125368
Benjamin Kramer [Thu, 12 May 2022 11:35:27 +0000 (13:35 +0200)]
[mlir][linalg] Add lowering of named ops on complex numbers
This lets linalg.dot and friends lower to a complex muladd using ops
from the complex dialect.
Differential Revision: https://reviews.llvm.org/D125461
owenca [Fri, 6 May 2022 22:08:12 +0000 (15:08 -0700)]
[clang-format] Don't remove braces if a 1-statement body would wrap
Reimplement the RemoveBracesLLVM feature which handles a
single-statement block that would get wrapped.
Fixes #53543.
Differential Revision: https://reviews.llvm.org/D125137
Nikita Popov [Thu, 12 May 2022 10:22:51 +0000 (12:22 +0200)]
[FastISel] Add some debug output (NFC)
Print a debug message when aborting isel (next to the ORE report)
and when folding a load.
Benjamin Kramer [Thu, 12 May 2022 10:04:14 +0000 (12:04 +0200)]
[bazel] Add support for configuring the bazel build for PPC
TF already carries a patch for this.
Benjamin Kramer [Wed, 11 May 2022 13:12:21 +0000 (15:12 +0200)]
[mlir][LLVM] Make the nested type restriction on complex constants less aggressive
Complex nested in other types is perfectly fine, just nested structs
aren't supported. Instead of checking whether there's nesting just check
whether the struct we're dealing with is a complex number.
Differential Revision: https://reviews.llvm.org/D125381
Max Kazantsev [Thu, 12 May 2022 09:09:11 +0000 (16:09 +0700)]
[Test] Regenerate checks using auto-update (work around PR55365)
Dmitry Vassiliev [Thu, 12 May 2022 08:46:03 +0000 (10:46 +0200)]
[Intrinsics] Fix `nvvm_prmt` intrinsic attributes
`nvvm_prmt` doesn't seem to be `commutative`. nvvm also sets `IntrSpeculatable` for it.
Here is the doc https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-prmt
Reviewed By: tra, jchlanda
Differential Revision: https://reviews.llvm.org/D125423
Daniil Dudkin [Thu, 12 May 2022 08:39:53 +0000 (11:39 +0300)]
[mlir][NFC] Fix `GpuKernelOutliningPass` copy constructor warnings
1. Call copy constructor of the base class
2. Assign value of the option directly
Reviewed By: dcaballe, rriddle
Differential Revision: https://reviews.llvm.org/D125101
Ivan Kosarev [Thu, 12 May 2022 07:51:35 +0000 (08:51 +0100)]
[AMDGPU][NFC] Remove unused function.
Introduced in
https://reviews.llvm.org/rG229d5e669bbbe7ca38ad832627a9809405939f1b
and then became unused in
https://reviews.llvm.org/D19584
Reviewed By: foad, dp
Differential Revision: https://reviews.llvm.org/D125385
Nikita Popov [Wed, 11 May 2022 07:57:15 +0000 (09:57 +0200)]
[MLIR] Fix build without native arch
D125214 split off a MLIRExecutionEngineUtils library that is used
by MLIRGPUTransforms. However, currently the entire ExecutionEngine
directory is skipped if the LLVM_NATIVE_ARCH target is not available.
Move the check for LLVM_NATIVE_ARCH, such that MLIRExecutionEngineUtils
always gets built, and only the JIT-related libraries are omitted
without native arch.
Differential Revision: https://reviews.llvm.org/D125357
Ivan Kosarev [Thu, 12 May 2022 07:25:33 +0000 (08:25 +0100)]
[AMDGPU][GFX10] Support base+soffset+offset SMEM stores.
Also makes another step towards resolving
https://github.com/llvm/llvm-project/issues/38652
Reviewed By: foad, dp
Differential Revision: https://reviews.llvm.org/D125380
Matthias Springer [Thu, 12 May 2022 07:42:53 +0000 (09:42 +0200)]
[mlir][bufferize] Support alloc hoisting across function boundaries
This change integrates the BufferResultsToOutParamsPass into One-Shot Module Bufferization. This improves memory management (deallocation) when buffers are returned from a function.
Note: This currently only works with statically-sized tensors. The generated code is not very efficient yet and there are opportunities for improvement (fewer copies). By default, this new functionality is deactivated.
Differential Revision: https://reviews.llvm.org/D125376
Matthias Springer [Thu, 12 May 2022 07:27:21 +0000 (09:27 +0200)]
[mlir][bufferize] Fix op filter
Bufferization has an optional filter to exclude certain ops from analysis+bufferization. There were a few remaining places in the codebase where the filter was not checked.
Differential Revision: https://reviews.llvm.org/D125356
Tim Northover [Thu, 12 May 2022 07:30:53 +0000 (08:30 +0100)]
Revert "Add an error message to the default SIGPIPE handler"
It broke a PPC bot, for not immediately obvious reasons.
Matthias Springer [Thu, 12 May 2022 07:17:04 +0000 (09:17 +0200)]
[mlir][bufferize] Add helpers for templatized DENY filters
We already have templatized ALLOW filters but the DENY filters were missing.
Differential Revision: https://reviews.llvm.org/D125358
Carl Ritson [Wed, 11 May 2022 09:21:27 +0000 (18:21 +0900)]
[AMDGPU] Remove pre-committed test for D124981. NFC.
Tim Northover [Wed, 11 May 2022 08:52:10 +0000 (09:52 +0100)]
Add an error message to the default SIGPIPE handler
UNIX03 conformance requires utilities to flush stdout before exiting and raise
an error if writing fails. Flushing already happens on a call to exit
and thus automatically on a return from main. Write failure is then
detected by LLVM's default SIGPIPE handler. The handler already exits with
a non-zero code, but conformance additionally requires an error message.
Krasimir Georgiev [Thu, 12 May 2022 06:30:36 +0000 (08:30 +0200)]
silence new -Wunused-result warnings in test
No functional changes intended.
After https://github.com/llvm/llvm-project/commit/f156b51aecc676a9051136f6f5cb74e37dd574d1,
new -Wunused-result warnings popped up in this test:
https://buildkite.com/llvm-project/upstream-bazel/builds/28320#bc3ec049-af39-4114-b7b8-4cbc180bc09b
River Riddle [Wed, 11 May 2022 04:25:00 +0000 (21:25 -0700)]
[mlir:Parser] Emit a better diagnostic when a custom operation is unknown
When a custom operation is unknown and does not have a dialect prefix, we currently
emit an error using the name of the operation with the default dialect prefix. This
leads to a confusing error message, especially when operations get moved between dialects.
For example, `func` was recently moved out of `builtin` and to the `func` dialect. The current
error message we get is:
```
func @foo()
^ custom op 'builtin.func' is unknown
```
This could lead users to believe that there is supposed to be a `builtin.func`,
because there used to be. This commit adds a better error message that does
not assume that the operation is supposed to be in the default dialect:
```
func @foo()
^ custom op 'func' is unknown (tried 'builtin.func' as well)
```
Differential Revision: https://reviews.llvm.org/D125351
Mahesh Ravishankar [Thu, 12 May 2022 03:50:21 +0000 (03:50 +0000)]
[mlir][Linalg] Combine canonicalizers that deal with removing dead/redundant args.
`linalg.generic` ops have canonicalizers that either remove arguments
not used in the payload, or redundant arguments. Combine these and
enhance the canonicalization to also remove results that have no use.
This is effectively dead code elimination for Linalg ops.
Differential Revision: https://reviews.llvm.org/D123632
Mogball [Thu, 12 May 2022 05:14:25 +0000 (05:14 +0000)]
[mlir][ods] (NFC) don't use std::function for map_range
Mogball [Thu, 12 May 2022 04:16:17 +0000 (04:16 +0000)]
[mlir] (NFC) Use assembly format for test.graph_region
bzcheeseman [Wed, 11 May 2022 19:25:04 +0000 (15:25 -0400)]
[MLIR][Operation] Simplify Operation casting, NFC
We can simplify the code needed to implement dyn_cast/cast/isa support for MLIR operations with documented interfaces via the CastInfo structures. This will also provide an example of how to use CastInfo.
Depends on D123901
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124963
bzcheeseman [Sat, 16 Apr 2022 18:34:08 +0000 (11:34 -0700)]
[LLVM][Casting.h] Update dyn_cast machinery to provide more control over how the casting is performed.
This patch expands the expressive capability of the casting utilities in LLVM by introducing several levels of configurability. By creating modular CastInfo classes we can enable projects like MLIR that need more fine-grained control over how a cast is actually performed to retain that control, while making it easy to express the easy cases (like a checked pointer to pointer cast).
The current implementation of Casting.h doesn't make it clear where the entry points for customizing the cast behavior are, so part of the motivation for this patch is adding that documentation. Another part of the motivation is to support using LLVM RTTI with a wider set of use cases, such as nullable value to value casts, or pointer to value casts (as in MLIR).
Reviewed By: lattner, rriddle
Differential Revision: https://reviews.llvm.org/D123901
Fangrui Song [Thu, 12 May 2022 03:27:11 +0000 (20:27 -0700)]
[Bitcode] Simplify code after FUNC_CODE_BLOCKADDR_USERS changes (D124878)
Switch to the more common `Constant && !GlobalValue` test.
Use the more common `Worklist/Visited` variable names.
Jim Lin [Wed, 11 May 2022 06:13:35 +0000 (14:13 +0800)]
[BPF] Implement mod operation
Implement BPF_MOD instruction to fix lack of assembly parser support mentioned in https://github.com/llvm/llvm-project/issues/55192.
Reviewed By: ast
Differential Revision: https://reviews.llvm.org/D125207
Lian Wang [Thu, 12 May 2022 02:11:37 +0000 (02:11 +0000)]
[LegalizeVectorTypes] Enable WidenVecRes_SETCC work for scalable vector.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125359
Ping Deng [Thu, 12 May 2022 02:22:56 +0000 (02:22 +0000)]
[RISCV][NFC] Simplify tests by reorganizing check prefixes
Reviewed By: benshi001, asb
Differential Revision: https://reviews.llvm.org/D125354
grosul1 [Thu, 12 May 2022 01:44:13 +0000 (01:44 +0000)]
[mlir] Fix loop unrolling: properly replace the arguments of the epilogue loop.
Using "replaceUsesOfWith" is incorrect because the same initializer value may appear multiple times.
For example, if the epilogue is needed when this loop is unrolled
```
%x:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
```
then both epilogue's arguments will be incorrectly renamed to use the same result index (note #1 in both cases):
```
%x_unrolled:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
%x_epilogue:2 = scf.for ... iter_args(%arg1 = %x_unrolled#1, %arg2 = %x_unrolled#1) {
...
}
```
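The failure mode can be reproduced on plain strings. This is a sketch under an assumption: the value-based variant re-reads the current operand at each step, which is one way to end up with result #1 in both positions as described above. `Values`, `replaceByValue` and `replaceByPosition` are hypothetical names, not MLIR code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Values = std::vector<std::string>;

// Value-based replacement: every use of the old value is rewritten, so
// duplicate initializers cannot be told apart.
Values replaceByValue(Values operands, const Values &results) {
  for (std::size_t i = 0; i < operands.size() && i < results.size(); ++i) {
    const std::string old = operands[i]; // copy: the range is modified below
    std::replace(operands.begin(), operands.end(), old, results[i]);
  }
  return operands;
}

// Position-based replacement: operand i always receives result i.
Values replaceByPosition(Values operands, const Values &results) {
  for (std::size_t i = 0; i < operands.size() && i < results.size(); ++i)
    operands[i] = results[i];
  return operands;
}
```

For operands `{%c1, %c1}` the value-based version yields `%x_unrolled#1` twice, while the positional version yields `%x_unrolled#0` and `%x_unrolled#1`.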
Weining Lu [Tue, 3 May 2022 03:06:24 +0000 (11:06 +0800)]
[LoongArch] Check msb is not less than lsb for the bstr{ins/pick}.{w/d} instructions
Differential Revision: https://reviews.llvm.org/D124825
David Tenty [Thu, 12 May 2022 00:47:48 +0000 (20:47 -0400)]
Revert "[NFC][tests][AIX] XFAIL test for lack of visibility support"
This reverts commit f5a9b5cc12658f4d6caa3e0cfc3e771698fb3798 since
https://reviews.llvm.org/D125141 has resolved the test issue.
Tapan Thaker [Wed, 11 May 2022 23:29:07 +0000 (16:29 -0700)]
[lld/macho] Fixes the -ObjC flag
When checking the segment name for Swift symbols, we should be checking that they start with `__swift` instead of checking for equality
Fixes the issue https://github.com/llvm/llvm-project/issues/55355
Reviewed By: #lld-macho, keith, thevinster
Differential Revision: https://reviews.llvm.org/D125250
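The difference between the two checks is easy to show in isolation. `hasSwiftPrefix` is a hypothetical helper, not lld's code, and the sample names in the usage note are only illustrative inputs.

```cpp
#include <string>

// Prefix test: accepts "__swift" itself and any longer name that begins
// with it, unlike an equality comparison.
bool hasSwiftPrefix(const std::string &name) {
  static const std::string prefix = "__swift";
  return name.compare(0, prefix.size(), prefix) == 0;
}
```

An equality test would reject a name such as `__swift5_proto`, while the prefix test accepts it and still rejects unrelated names such as `__text`.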
Vasileios Porpodas [Wed, 11 May 2022 22:52:24 +0000 (15:52 -0700)]
Recommit "[SLP] Make reordering aware of external vectorizable scalar stores."
This reverts commit c2a7904aba465fcaf13bbe2a5772cdeeb88060e5.
Original code review: https://reviews.llvm.org/D125111
Amir Ayupov [Wed, 11 May 2022 23:23:27 +0000 (16:23 -0700)]
[BOLT][NFC] Use BitVector::set_bits
Refactor and use `set_bits` BitVector interface.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125374
Greg Clayton [Tue, 10 May 2022 20:41:06 +0000 (13:41 -0700)]
Add "indexedVariables" to variables with lots of children.
Prior to this fix, if we had a really large array or collection class, we would end up always creating all of the child variables for an array or collection class. If the number of children was very high this could cause delays when expanding variables. By adding the "indexedVariables" to variables with lots of children, we can keep good performance in the variables view at all times. This patch will add the "indexedVariables" key/value pair to any "Variable" JSON dictionaries when we have an array or synthetic child provider that will create more than 100 children.
We have to be careful to not call "uint32_t SBValue::GetNumChildren()" on any lldb::SBValue that we use because it can cause a class, struct or union to complete the type in order to be able to properly tell us how many children it has and this can be expensive if you have a lot of variables. By default LLDB won't need to complete a type if we have variables that are classes, structs or unions unless the user expands the variable in the variable view. So we try to only get the GetNumChildren() when we have an array, as this is a cheap operation, or a synthetic child provider, most of which are for showing collections that typically fall into this category. We add a variable reference, which indicates that something can be expanded, when the function "bool SBValue::MightHaveChildren()" is true as this call doesn't need to complete the type in order to return true. This way if no one ever expands class variables, we don't need to complete the type.
Differential Revision: https://reviews.llvm.org/D125347
Simon Dardis [Sun, 8 May 2022 21:23:16 +0000 (22:23 +0100)]
[MIPS] Remove an incorrect microMIPS instruction alias
The microMIPS instruction set is compatible with the MIPS instruction
set at the assembly level but not in terms of encodings. `nop` in
microMIPS is a special case as it has the same encoding as `nop` for
MIPS.
Fix this error by reducing the usage of NOP in the MIPS backend such
that only the ISA-correct variants are produced.
Differential Revision: https://reviews.llvm.org/D124716
Arthur Eubanks [Wed, 11 May 2022 22:27:39 +0000 (15:27 -0700)]
Revert "[SLP] Make reordering aware of external vectorizable scalar stores."
This reverts commit 71bcead98b2e655031208e5ad0ce89f8971a6343.
Causes crashes, see comments in D125111.
Alan Zhao [Wed, 11 May 2022 22:05:55 +0000 (15:05 -0700)]
Explicitly add -target for Windows builds in file_test_windows.c
It turns out that the llvm buildbots run the test with
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4, which would cause this
test to fail as the test assumed that the default target is Windows. To
fix this, we explicitly set -target for the Windows testcases.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D125425
Yuanfang Chen [Wed, 11 May 2022 21:42:03 +0000 (14:42 -0700)]
[Driver][test] run one test in darwin-dsymutil.c for Darwin only
Alan Zhao [Wed, 11 May 2022 20:54:09 +0000 (22:54 +0200)]
[clang] Add the flag -ffile-reproducible
When Clang generates the path prefix (i.e. the path of the directory
where the file is) when generating __FILE__, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.
Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overridden by setting -fno-file-reproducible.
[0]: https://crbug.com/1310767
Differential revision: https://reviews.llvm.org/D122766
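A simplified sketch of the idea of picking the separator from the target rather than from the host. `useTargetSeparator` is a hypothetical helper, and treating every backslash as a separator for non-Windows targets is an assumption of this sketch; Clang's actual path handling is more involved.

```cpp
#include <algorithm>
#include <string>

// Rewrite path separators to the style of the *target* platform, regardless
// of which host the compiler itself runs on.
std::string useTargetSeparator(std::string path, bool targetIsWindows) {
  const char from = targetIsWindows ? '/' : '\\';
  const char to = targetIsWindows ? '\\' : '/';
  std::replace(path.begin(), path.end(), from, to);
  return path;
}
```

With this rule the same source tree produces the same embedded file names whether the build runs on Windows or elsewhere, as long as the target is the same.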
Mike Rice [Wed, 11 May 2022 18:26:07 +0000 (11:26 -0700)]
[OpenMP] Fix mangling for linear parameters with negative stride
The 'n' character is used in place of '-' in the mangled name.
Differential Revision: https://reviews.llvm.org/D125406
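A tiny sketch of the stride encoding mentioned above. `mangleStride` is a hypothetical helper that produces only the stride part of a name; the surrounding mangling scheme is not reproduced here.

```cpp
#include <string>

// Encode a linear stride for use in a mangled name: a leading '-' is not
// usable there, so negative values are spelled with 'n' instead.
std::string mangleStride(int stride) {
  const long long wide = stride; // widen so that negation cannot overflow
  if (wide < 0)
    return "n" + std::to_string(-wide);
  return std::to_string(wide);
}
```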
Xiang Li [Wed, 11 May 2022 20:38:13 +0000 (13:38 -0700)]
Revert "[HLSL] add -D option for dxc mode."
This reverts commit
4dae38ebfba0d8583e52c3ded8f62f5f9fa77fda.
Differential Revision: https://reviews.llvm.org/D125414
Joseph Huber [Wed, 11 May 2022 20:53:36 +0000 (16:53 -0400)]
[LinkerWrapper][Fix] Fix bad alignment from extracted archive members
Summary:
We use embedded binaries to extract offloading device code from the host
fatbinary. This uses a binary format whose necessary alignment is
eight bytes. The alignment is included within the ELF section type so
the data extracted from the ELF should always be aligned at that amount.
However, if this file was extracted from a static archive, it was being
sent as an offset in the archive file which did not have the same
alignment guarantees as the ELF file. This was causing errors in the
UB-sanitizer build as it would occasionally try to access a misaligned
address. To fix this, I simply copy the memory directly to a new buffer
which is guaranteed to have worst-case alignment of 16 in the case that
it's not properly aligned.
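The copy-to-aligned-buffer approach can be sketched as follows. `isAligned` and `copyAligned` are hypothetical helpers, not the linker wrapper's code, and this sketch only guarantees the 8-byte alignment that the binary format needs rather than the 16 bytes mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// True if `ptr` is a multiple of `alignment` (which must be non-zero).
bool isAligned(const void *ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Copy `size` bytes into storage backed by uint64_t elements, whose data()
// pointer is suitably aligned for 8-byte accesses.
std::vector<std::uint64_t> copyAligned(const unsigned char *data,
                                       std::size_t size) {
  std::vector<std::uint64_t> storage((size + 7) / 8);
  if (size != 0)
    std::memcpy(storage.data(), data, size);
  return storage;
}
```

A caller would first test `isAligned(member, 8)` and only pay for the copy when the archive member happens to sit at a misaligned offset.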
Austin Kerbow [Fri, 25 Mar 2022 00:46:15 +0000 (17:46 -0700)]
[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If
there is a need to have highly optimized codegen and kernel developers have
knowledge of inter-wave runtime behavior which is unknown to the compiler this
builtin can be used to tune scheduling.
This intrinsic creates a barrier between scheduling regions. The immediate
parameter is a mask to determine the types of instructions that should be
prevented from crossing the sched_barrier. In this initial patch, there are only
two variations. A mask of 0 means that no instructions may be scheduled across
the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing
instructions may cross the sched_barrier.
Note that this intrinsic is only meant to work with the scheduling passes. Any
other transformations that may move code will not be impacted in the ways
described above.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D124700
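The two mask values described in this commit can be modelled as a predicate. `mayCrossSchedBarrier` is a hypothetical function that restates the prose; it is not the scheduler's implementation, and it treats any other mask value conservatively because this initial patch does not define them.

```cpp
// Decide whether an instruction may be scheduled across a sched_barrier
// with the given mask, following the two cases described above.
bool mayCrossSchedBarrier(unsigned mask, bool accessesMemory,
                          bool hasSideEffects) {
  if (mask == 0)
    return false; // nothing may cross
  if (mask == 1)
    return !accessesMemory && !hasSideEffects;
  return false; // other masks: not defined here, so be conservative
}
```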
Florian Hahn [Wed, 11 May 2022 20:20:42 +0000 (21:20 +0100)]
[ConstraintElimination] Add extra tests for different overflows.
Additional tests for D125264, inspired by @spatel.
Philip Reames [Wed, 11 May 2022 20:16:31 +0000 (13:16 -0700)]
[riscv] Add a bunch of tests exploring switch lowering
Specifically, how we handle zext vs sext around truncates.
Craig Topper [Wed, 11 May 2022 19:49:01 +0000 (12:49 -0700)]
[RISCV] Enable subregister liveness tracking for RVV.
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.
I've added a command line option that can be used to turn it off if it causes compile
time or functional issues. I used the command line option to keep the old behavior
for one interesting test case that was testing register allocation.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125108
Craig Topper [Wed, 11 May 2022 19:16:37 +0000 (12:16 -0700)]
[RISCV] Fold addiw from (add X, (addiw (lui C1, C2))) into load/store address
This is a followup to D124231.
We can fold the ADDIW in this pattern if we can prove that LUI+ADDI
would have produced the same result as LUI+ADDIW.
This pattern occurs because constant materialization prefers LUI+ADDIW
for all simm32 immediates. Only immediates in the range
0x7ffff800-0x7fffffff require an ADDIW. Other simm32 immediates
work with LUI+ADDI.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124693
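The range quoted above can be checked with a small model of the RV64 semantics. All names below are hypothetical. The sketch assumes the usual split of a 32-bit immediate into a 20-bit upper part and a sign-extended 12-bit lower part, and that LUI and ADDIW sign-extend their 32-bit result while ADDI adds in 64 bits.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of a 64-bit value.
std::int64_t sext32(std::uint64_t v) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(v));
}

struct Split {
  std::uint32_t hi20; // operand of LUI
  std::int32_t lo12;  // sign-extended operand of ADDI/ADDIW
};

Split split(std::int32_t imm) {
  const std::uint32_t u = static_cast<std::uint32_t>(imm);
  std::int32_t lo12 = static_cast<std::int32_t>(u & 0xfffu);
  if (lo12 >= 0x800)
    lo12 -= 0x1000; // the low part is a signed 12-bit field
  const std::uint32_t hi20 = ((u + 0x800u) >> 12) & 0xfffffu;
  return Split{hi20, lo12};
}

std::int64_t lui(std::uint32_t hi20) {
  return sext32(static_cast<std::uint64_t>(hi20) << 12);
}

// LUI followed by a full 64-bit add.
std::int64_t luiAddi(std::int32_t imm) {
  const Split s = split(imm);
  return static_cast<std::int64_t>(
      static_cast<std::uint64_t>(lui(s.hi20)) +
      static_cast<std::uint64_t>(static_cast<std::int64_t>(s.lo12)));
}

// LUI followed by a 32-bit add whose result is sign-extended.
std::int64_t luiAddiw(std::int32_t imm) {
  const Split s = split(imm);
  return sext32(static_cast<std::uint64_t>(lui(s.hi20)) +
                static_cast<std::uint64_t>(static_cast<std::int64_t>(s.lo12)));
}

// ADDIW is only required when the two sequences disagree.
bool needsAddiw(std::int32_t imm) { return luiAddi(imm) != luiAddiw(imm); }
```

For 0x7ffff800 the upper part becomes 0x80000, so LUI yields a negative 64-bit value; adding -2048 in 64 bits gives the wrong answer, while the 32-bit add wraps back to 0x7ffff800.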
Florian Hahn [Wed, 11 May 2022 19:46:48 +0000 (20:46 +0100)]
[GVN] Add test case for memdep invalidation bug.
Test case for #30999.
Chris Lattner [Wed, 11 May 2022 07:51:53 +0000 (08:51 +0100)]
[AsmParser] Adopt emitWrongTokenError more, improving QoI
This is a full audit of emitError calls; I took the opportunity
to remove extraneous parens and fix a couple cases where we'd
generate multiple diagnostics for the same error.
Differential Revision: https://reviews.llvm.org/D125355
Nikolas Klauser [Sun, 8 May 2022 14:40:04 +0000 (16:40 +0200)]
[libc++] Remove __invalidate_all_iterators and replace the uses with std::__debug_db_invalidate_all
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D125188
Nikolas Klauser [Sat, 7 May 2022 20:20:23 +0000 (22:20 +0200)]
[libc++] Add a few more debug wrapper functions
Reviewed By: ldionne, #libc, jloser
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D125176
Craig Topper [Wed, 11 May 2022 18:52:07 +0000 (11:52 -0700)]
[CodeGenPrepare] Use const reference to avoid unnecessary APInt copy. NFC
Spotted while looking at Matthias' patches.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D124985
Philip Reames [Wed, 11 May 2022 18:41:59 +0000 (11:41 -0700)]
[test, riscv] Add test illustrating missing handling for fallthrough blocks in 541c9ba
River Riddle [Sat, 7 May 2022 01:24:17 +0000 (18:24 -0700)]
[TableGen] Refactor TableGenParseFile to no longer use a callback
Now that TableGen no longer relies on global Record state, we can allow
for the client to own the RecordKeeper and SourceMgr. Given that TableGen
internally still relies on the global llvm::SrcMgr, this method unfortunately
still isn't thread-safe.
Differential Revision: https://reviews.llvm.org/D125277
River Riddle [Sat, 7 May 2022 01:05:54 +0000 (18:05 -0700)]
[TableGen] Remove the use of global Record state
This commit removes TableGen's reliance on managed static global record state
by moving the RecordContext into the RecordKeeper. The RecordKeeper is now
treated similarly to a (LLVM|MLIR|etc)Context object and is passed to static
construction functions. This is an important step forward in removing TableGen's
reliance on global state, and in a followup will allow users that parse tablegen
to parse multiple tablegen files without worrying about Record lifetime.
Differential Revision: https://reviews.llvm.org/D125276
Qiongsi Wu [Wed, 11 May 2022 17:20:41 +0000 (13:20 -0400)]
[clang][ppc] Creating Separate Install Target for PPC htm Headers
This patch splits out the htm intrinsic headers from the PPC headers list.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D125386
Craig Topper [Wed, 11 May 2022 18:20:15 +0000 (11:20 -0700)]
[RISCV] Add caching to the gather/scatter to strided load/store conversion.
If we have multiple gather/scatter instructions using the same
strided address, we would scalarize it multiple times. I guess
a later pass cleans this up, but I don't know if that's guaranteed.
This patch adds a cache to remember the scalarization we already
created for a previous gather/scatter.
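The caching idea can be sketched in a few lines of Python. This is only an illustration of memoizing per address, not the pass itself: `convert_gathers` and the `scalarize` callback are hypothetical names, and addresses are stood in for by hashable values.

```python
def convert_gathers(addresses, scalarize):
    """Return the scalarized form for each gather/scatter address.

    `scalarize` is a hypothetical, possibly expensive callback. It is
    invoked at most once per distinct address; repeats hit the cache.
    """
    cache = {}
    results = []
    for addr in addresses:
        if addr not in cache:
            cache[addr] = scalarize(addr)
        results.append(cache[addr])
    return results
```

With a cache keyed on the address, a second gather/scatter on the same address reuses the earlier result instead of creating a duplicate that a later pass would have to clean up.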
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125326
Yaxun (Sam) Liu [Wed, 11 May 2022 15:41:46 +0000 (11:41 -0400)]
[clang] Fix KEYALL
Update KEYALL to cover KEYCUDA. Introduce KEYMAX and
a generic way to update KEYALL.
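One generic way to derive such a mask is sketched below, assuming every key is a distinct single bit and KEYMAX names the highest one. The flag values are made up for illustration, and the real definition in Clang may additionally mask out some bits.

```python
# Hypothetical subset of keyword flags; the values are illustrative only.
KEYC99 = 0x1
KEYCXX = 0x2
KEYCUDA = 0x4


def all_keys_mask(key_max):
    """Every flag bit up to and including key_max (a single-bit value)."""
    return key_max | (key_max - 1)


KEYMAX = KEYCUDA
KEYALL = all_keys_mask(KEYMAX)
```

Under this scheme, adding a new highest flag only requires updating KEYMAX; the all-keys mask follows automatically.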
Reviewed by: Dan Liew
Differential Revision: https://reviews.llvm.org/D125396
Xiang Li [Tue, 10 May 2022 21:22:29 +0000 (14:22 -0700)]
[HLSL] add -D option for dxc mode.
Create dxc_D as alias to option D which Define <macro> to <value> (or 1 if <value> omitted).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125338
Craig Topper [Wed, 11 May 2022 18:14:56 +0000 (11:14 -0700)]
[RISCV] Move implementation of getVLOpNum and getSEWOpNum from RISCVInsertVSETVLI to RISCVBaseInfo.h. NFC
We should consolidate the operand counting and ordering into
RISCVBaseInfo.h and stop spreading it around.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D125344
Craig Topper [Wed, 11 May 2022 17:58:10 +0000 (10:58 -0700)]
[RISCV] Override TargetLowering::shouldProduceAndByConstByHoistingConstFromShiftsLHSOfAnd.
This hook determines if SimplifySetcc transforms (X & (C l>>/<< Y))
==/!= 0 into ((X <</l>> Y) & C) ==/!= 0. Where C is a constant and
X might be a constant.
The default implementation favors doing the transform if X is not
a constant. Otherwise the code is left alone. There is a provision
that if the target supports a bit test instruction then the transform
will favor ((1 << Y) & X) ==/!= 0. RISCV does not say it has a variable
bit test operation.
RISCV with Zbs does have a BEXT instruction that performs (X >> Y) & 1.
Without Zbs, (X >> Y) & 1 still looks preferable to ((1 << Y) & X) since
we can use ANDI instead of putting a 1 in a register for SLL.
This patch overrides this hook to favor bit extract patterns and
otherwise falls back to the "do the transform if X is not a constant"
heuristic.
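The equivalences the hook chooses between can be checked exhaustively at a small bit width. The Python sketch below is only a sanity check of the arithmetic, using an assumed 4-bit width and helper names invented here; it says nothing about which form is cheaper on a given target.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def shl(value, amount):
    """Left shift truncated to WIDTH bits."""
    return (value << amount) & MASK


def lshr(value, amount):
    """Logical right shift on a WIDTH-bit value."""
    return (value & MASK) >> amount


def forms_agree(shift, inverse_shift):
    """Check (X & (C shift Y)) != 0 against ((X inverse_shift Y) & C) != 0."""
    for x in range(MASK + 1):
        for c in range(MASK + 1):
            for y in range(WIDTH):
                before = (x & shift(c, y)) != 0
                after = (inverse_shift(x, y) & c) != 0
                if before != after:
                    return False
    return True


def bit_test_forms_agree():
    """Check ((1 << Y) & X) != 0 against ((X >> Y) & 1) != 0."""
    return all(
        (((1 << y) & x) != 0) == (((x >> y) & 1) != 0)
        for x in range(MASK + 1)
        for y in range(WIDTH)
    )
```

Calling `forms_agree(lshr, shl)` covers the lshr form and `forms_agree(shl, lshr)` the shl form; `bit_test_forms_agree()` covers the two bit-test spellings.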
I've added tests where both C and X are constants with both the shl form
and lshr form. I've also added a test for a switch statement that lowers
to a bit test. That was my original motivation for looking at this.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124639
Sanjay Patel [Wed, 11 May 2022 17:57:33 +0000 (13:57 -0400)]
[InstCombine] freeze operand in sdiv expansion
As discussed in issue #37809, this transform is not safe
if the input is an undefined value.
This is similar to a recent change for urem:
d428f09b2c9d
There is no difference in codegen on the basic examples,
but this could lead to regressions. We may need to
improve freeze analysis or lowering if that happens.
Presumably, in real cases that are similar to the tests
where a subsequent transform removes the select, we
will also be able to remove the freeze by seeing that
the parameter has 'noundef'.
Sanjay Patel [Wed, 11 May 2022 17:48:13 +0000 (13:48 -0400)]
[InstCombine] update auto-generated CHECK lines in test file; NFC
These are all cosmetic (value naming) diffs that would distract from
real changes in this file.
Craig Topper [Wed, 11 May 2022 17:48:12 +0000 (10:48 -0700)]
[RISCV] Add a DAG combine to pre-promote (i32 (and (srl X, Y), 1)) with Zbs on RV64.
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.
I don't think there is any precedent for type promotion checking
users to decide how to promote. Instead, I've added this DAG combine to
do it before type legalization.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124109
Philip Reames [Wed, 11 May 2022 17:35:29 +0000 (10:35 -0700)]
[riscv] Canonicalize vsetvli (vsetvli avl, vtype1) vtype2 transitions as reviewed
This patch is an alternative to a piece of D125270. If one vsetvli uses the output of another as its AVL, and the prior AVL can be proven to produce the same VL value as the defining one, we can use the AVL from the prior instruction. This has the effect of removing a state transition on AVL, and will let us use the cheaper 'vsetvli x0, x0, vtype1' form or possibly even skip emitting it entirely.
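To see why the AVL can be forwarded, here is a small Python model. It makes a simplifying assumption that VL is always min(AVL, VLMAX), with VLMAX = VLEN * LMUL / SEW; the vector spec permits other VL choices for some AVL values, and the function names are invented for this sketch.

```python
from fractions import Fraction


def vlmax(vlen, sew, lmul):
    """VLMAX = VLEN * LMUL / SEW (lmul may be fractional)."""
    return int(Fraction(vlen) * Fraction(lmul) / sew)


def vsetvli(avl, vlen, sew, lmul):
    """Simplified model: the resulting VL is min(AVL, VLMAX)."""
    return min(avl, vlmax(vlen, sew, lmul))


def chained(avl, vlen, vtype1, vtype2):
    """vsetvli (vsetvli avl, vtype1), vtype2; vtypes are (sew, lmul)."""
    inner_vl = vsetvli(avl, vlen, *vtype1)
    return vsetvli(inner_vl, vlen, *vtype2)


def direct(avl, vlen, vtype2):
    """vsetvli avl, vtype2, reusing the original AVL."""
    return vsetvli(avl, vlen, *vtype2)
```

In this model the two forms agree whenever both vtypes yield the same VLMAX, for example (SEW=32, LMUL=1) and (SEW=64, LMUL=2), and can disagree when VLMAX differs.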
This builds on the same infrastructure as D125337, and does the analogous extension to working on abstract states instead of only prior explicit vsetvli instructions. This is where the (relatively minor) code improvements come from.
More importantly, this fixes the last case where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries.
Doing this transform inside the dataflow can cause the compatibility of a later store to change with regards to the current state. test15 in the diff illustrates this case well. What we have is a vsetvli which is mutated by one following vector op, but whose GPR result is used by another. The compatibility logic walks back to the def in this case, and checks to see if it matches the immediate prior state. In phase 1 and 2, it doesn't, and in phase 3 (after mutation) it does because we remove a transition which caused it to differ.
Differential Revision: https://reviews.llvm.org/D125392
Arthur Eubanks [Wed, 11 May 2022 16:16:16 +0000 (09:16 -0700)]
[gn build] Use llvm-ar when clang_base_path is specified
Only applies to Linux for now.
This prevents warnings with use_thinlto like
bfd plugin: LLVM gold plugin has failed to create LTO module: Not an int attribute (Producer: 'LLVM15.0.0git' Reader: 'LLVM 13.0.1')
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D125399
Peter Klausler [Mon, 9 May 2022 16:37:35 +0000 (09:37 -0700)]
[flang] Fix check for assumed-size arguments to SHAPE() & al.
The predicate that is used to detect an invalid assumed-size argument
to the intrinsic functions SHAPE, SIZE, & LBOUND gives false results
for arguments whose shapes are not calculatable at compilation time.
Replace with an explicit test for an assumed-size array dummy argument
symbol.
Differential Revision: https://reviews.llvm.org/D125342
Philip Reames [Wed, 11 May 2022 17:12:53 +0000 (10:12 -0700)]
[riscv] Add tests for vsetvli reuse across iterations of a loop
These variations are chosen to exercise both FRE and PRE cases involving loops which don't change state in the iteration and can thus perform vsetvli in the preheader of the loop only. At the moment, these are essentially all TODOs.
David Tenty [Mon, 2 May 2022 21:06:04 +0000 (17:06 -0400)]
[clang][AIX] Don't ignore XCOFF visibility by default
D87451 added -mignore-xcoff-visibility for AIX targets and made it the default (which mimicked the behaviour of the XL 16.1 compiler on AIX).
However, ignoring hidden visibility has unwanted side effects and some libraries depend on visibility to hide non-ABI facing entities from user headers and
reserve the right to change these implementation details based on this (https://libcxx.llvm.org/DesignDocs/VisibilityMacros.html). This forces us to use
internal linkage fallbacks for these cases on AIX and creates an unwanted divergence in implementations on the platform.
For these reasons, it's preferable to not add -mignore-xcoff-visibility by default, which is what this patch does.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D125141
Mircea Trofin [Wed, 11 May 2022 17:06:26 +0000 (10:06 -0700)]
[mlgo] Fix test
Updated reference file for dev-mode-logging.ll and expected output.
Peter Klausler [Sat, 7 May 2022 01:39:23 +0000 (18:39 -0700)]
[flang] Fold complex component references
Complex component references (z%RE, z%IM) of complex named constants
should be evaluated at compilation time.
Differential Revision: https://reviews.llvm.org/D125341
Sanjay Patel [Wed, 11 May 2022 16:09:47 +0000 (12:09 -0400)]
[InstCombine] freeze operand in urem expansion
As discussed in issue #37809, this transform is not safe
if the input is an undefined value.
There is no difference in codegen on the basic examples,
but this could lead to regressions. We may need to
improve freeze analysis or lowering if that happens.
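The commit message does not spell out the expansion, so the Python sketch below assumes it has the shape `select (x u< y), x, (x - y)` for a divisor with the sign bit set. It models an undefined input as a value that each use may read differently and compares the reachable results with and without a freeze, at an assumed 4-bit width.

```python
from itertools import product

BITS = 4
MASK = (1 << BITS) - 1


def urem_expansion(x_cmp, x_true, x_false, y):
    """select (x u< y), x, (x - y), with each use of x supplied separately."""
    return x_true if x_cmp < y else (x_false - y) & MASK


def results_without_freeze(y):
    """Undef input: every use of x may observe a different value."""
    values = range(MASK + 1)
    return {urem_expansion(a, b, c, y) for a, b, c in product(values, repeat=3)}


def results_with_freeze(y):
    """Frozen input: all uses of x observe the same arbitrary value."""
    return {urem_expansion(x, x, x, y) for x in range(MASK + 1)}


def results_of_urem(y):
    """Results the original urem can produce for any input value."""
    return {x % y for x in range(MASK + 1)}
```

With `y = 8`, the unfrozen expansion can produce values of 8 and above, which the original urem never returns; freezing the operand restores the original result set.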
Amir Ayupov [Wed, 11 May 2022 16:34:10 +0000 (09:34 -0700)]
[BOLT][NFC] Add MCPlus::primeOperands iterator_range
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D125397
Joseph Huber [Wed, 11 May 2022 16:25:06 +0000 (12:25 -0400)]
[OpenMP] Add a check for alignment in the offload packager
Summary:
These sections need to be aligned correctly to be extracted later, add
a check to indicate if they aren't.
Vibhuti Sawant [Wed, 11 May 2022 16:16:53 +0000 (09:16 -0700)]
[Bazel] Add support for s390x build target
While executing the test suite for Tensorflow(v2.8.0), we encountered multiple TC failures with the below error
```
'z14' is not a recognized processor for this target
```
This patch adds the s390x target to the build target list. It fixes TC failures in multiple modules of Tensorflow on s390x arch. It is also tested to have no effect on x86 machines.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D125096
Aaron Ballman [Wed, 11 May 2022 16:09:21 +0000 (12:09 -0400)]
Fix the Clang sphinx build
This should address:
https://lab.llvm.org/buildbot/#/builders/92/builds/26609
Matthias Braun [Wed, 11 May 2022 15:41:09 +0000 (08:41 -0700)]
Fix endless loop in optimizePhiConst with integer constant switch condition
Avoid endless loop in degenerate case with an integer constant as switch
condition as reported in https://reviews.llvm.org/D124552
Alban Bridonneau [Wed, 11 May 2022 15:36:24 +0000 (15:36 +0000)]
[NFC] Change comment number in aarch64 isel
python3kgae [Sat, 7 May 2022 07:32:17 +0000 (00:32 -0700)]
[DirectX backend] Add pass to emit dxil metadata.
A new pass, DxilEmitMetadata, is added to translate information saved in LLVM IR into metadata that matches the DXIL spec.
This patch only generates the DXIL validator version.
In LLVM IR, the validator version is saved in a module flag with "dx.valver" as the key.
!llvm.module.flags = !{!0, !1}
!1 = !{i32 6, !"dx.valver", !2}
!2 = !{i32 1, i32 1}
DXIL validator version has major and minor versions that are specified as named metadata:
!dx.valver = !{!2}
!2 = !{i32 1, i32 7}
Reviewed By: kuhar, beanz
Differential Revision: https://reviews.llvm.org/D125158