Ashay Rane [Thu, 5 May 2022 20:18:28 +0000 (13:18 -0700)]
[mlir] translate memref.reshape ops that have static shapes
This patch references code for translating memref.reinterpret_cast ops
to add translation rules for memref.reshape ops that have a static shape
argument. Since reshape ops don't have offsets, sizes, or strides, this
patch simply sets the allocated and aligned pointers of the MemRef
descriptor.
Reviewed By: ftynse, cathyzhyi
Differential Revision: https://reviews.llvm.org/D125039
Louis Dionne [Thu, 12 May 2022 18:43:20 +0000 (14:43 -0400)]
[libc++] Mark <stdatomic.h> as requiring C++23
Otherwise, we might get errors with modules in pre-C++23 when mixing
<atomic> and <stdatomic.h>. This should fix breakage on Green Dragon.
Michael Jones [Thu, 12 May 2022 18:38:31 +0000 (11:38 -0700)]
[libc] fix uint includes and libc bazel
This patch fixes the includes for the new UInt class so that the api
test now passes. It also fixes the bazel files to account for
the new dependencies.
Differential Revision: https://reviews.llvm.org/D125490
Florian Hahn [Thu, 12 May 2022 18:33:48 +0000 (19:33 +0100)]
[LAA] Initial support for runtime checks with pointer selects.
Scaffolding support for generating runtime checks for multiple SCEV expressions
per pointer. The initial version just adds support for looking through
a single pointer select.
The more sophisticated logic for analyzing forks is in D108699
Reviewed By: huntergr
Differential Revision: https://reviews.llvm.org/D114487
Vasileios Porpodas [Thu, 12 May 2022 02:09:38 +0000 (19:09 -0700)]
[SLP][NFC] Added test to exercise the cause of a crash caused by reordering.
This is to support 0950d4060cd916a1d08da657db2513d2ce3e38fa.
External users that can affect reordering, with range == VL.size() but
non consecutive (like stores to A[0],A[0],A[3],A[3]) would escape the check
for consecutive accesses and would cause a crash.
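The escaping case can be sketched as follows. This is a hedged illustration only, not the SLP vectorizer's code; `span_matches_size` and `is_consecutive` are hypothetical helper names, and the index list stands in for the store addresses.

```python
def span_matches_size(indices):
    """Range-only check: the index span equals the number of accesses."""
    return max(indices) - min(indices) + 1 == len(indices)


def is_consecutive(indices):
    """Stricter check: the indices are exactly min, min+1, ... with no repeats."""
    lowest = min(indices)
    return sorted(indices) == list(range(lowest, lowest + len(indices)))


# Stores to A[0], A[0], A[3], A[3], as in the commit message.
stores = [0, 0, 3, 3]
print(span_matches_size(stores))  # True: range == VL.size()
print(is_consecutive(stores))     # False: the accesses are not consecutive
```

A check based only on the span accepts this input, which is the situation the new test is meant to exercise.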
Michael Jones [Tue, 3 May 2022 16:55:00 +0000 (09:55 -0700)]
[libc] add uint128 implementation
Some platforms don't support proper 128 bit integers, but some
algorithms use them, such as any that use long doubles. This patch
modifies the existing UInt class to support the necessary operators.
This does not put the new class into use; that will happen in follow-up
patches.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D124959
Félix Cloutier [Thu, 12 May 2022 18:09:06 +0000 (11:09 -0700)]
[clang] Allow all string types for all attribute(format) styles
This allows using any recognized kind of string for any
__attribute__((format)) archetype. Before this change, for instance,
the printf archetype would only accept char pointer types and the
NSString archetype would only accept NSString pointers. This is
more restrictive than necessary as there exist functions to
convert between string types that can be annotated with
__attribute__((format_arg)) to transfer format information.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D125254
rdar://89060618
Fangrui Song [Thu, 12 May 2022 18:03:12 +0000 (11:03 -0700)]
[ELF] Align the end of PT_GNU_RELRO to max-page-size instead of common-page-size
We picked common-page-size to match GNU ld. Recently, the resolution to GNU ld
https://sourceware.org/bugzilla/show_bug.cgi?id=28824 (milestone: 2.39) switched
to max-page-size so that the last page can be protected by RELRO in case the
system page size is larger than common-page-size.
Thanks to our two RW PT_LOAD scheme (D58892), switching to max-page-size does
not change file size (while GNU ld's scheme may increase file size).
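The effect of the alignment choice can be sketched with a little arithmetic. This is a hedged illustration: the page sizes and the `relro_data_end` address are made-up values, and the sketch assumes the loader can only protect whole system pages and therefore rounds the RELRO end down to the system page size.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def align_down(value, alignment):
    """Round value down to a multiple of alignment (a power of two)."""
    return value & ~(alignment - 1)


COMMON_PAGE_SIZE = 0x1000   # assumed 4 KiB
MAX_PAGE_SIZE = 0x10000     # assumed 64 KiB
SYSTEM_PAGE_SIZE = 0x4000   # assumed 16 KiB page size at run time

relro_data_end = 0x20e10    # hypothetical end of the RELRO data

for name, page in (("common", COMMON_PAGE_SIZE), ("max", MAX_PAGE_SIZE)):
    relro_end = align_up(relro_data_end, page)
    # Assumption: protection is applied to whole system pages only.
    protected_end = align_down(relro_end, SYSTEM_PAGE_SIZE)
    print(name, hex(relro_end), hex(protected_end), protected_end >= relro_data_end)
# common 0x21000 0x20000 False
# max 0x30000 0x30000 True
```

With these numbers, aligning to common-page-size leaves the tail of the RELRO data outside the protected region, while aligning to max-page-size covers it.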
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D125410
Hongtao Yu [Thu, 12 May 2022 04:33:09 +0000 (21:33 -0700)]
[llvm-profgen] Filter out oversized LBR ranges.
As a follow-up to D123271, LBR ranges that are too big should also be considered invalid.
For example, the last two pairs in the following trace form a range [0x0d7b02b0, 0x368ba706] that covers a ton of functions in the binary. Such an oversized range should also be ignored.
0x0c74505f/0x368b99a0 **0x368ba706**/0x0c745040 0x0d7b1c3f/**0x0d7b02b0**
Add a defensive check to filter out those ranges, based on the rule that a valid range should not cross an unconditional branch (call, return, or unconditional jmp).
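The check can be sketched roughly as below. This is a hedged illustration, not llvm-profgen's implementation: `lbr_ranges` and `crosses_unconditional_branch` are hypothetical names, the trace is assumed to list source/target pairs with the most recent branch first, and the address of the unconditional branch is invented for the example.

```python
from bisect import bisect_left


def lbr_ranges(entries):
    """entries: (source, target) pairs, most recent first (assumed ordering).

    Each linear range runs from the target of an older branch to the source
    of the next newer branch.
    """
    return [(older[1], newer[0]) for newer, older in zip(entries, entries[1:])]


def crosses_unconditional_branch(start, end, uncond_branch_addrs):
    """True if a known unconditional branch lies in [start, end).

    uncond_branch_addrs must be sorted in ascending order.
    """
    i = bisect_left(uncond_branch_addrs, start)
    return i < len(uncond_branch_addrs) and uncond_branch_addrs[i] < end


trace = [
    (0x0c74505f, 0x368b99a0),
    (0x368ba706, 0x0c745040),
    (0x0d7b1c3f, 0x0d7b02b0),
]
uncond = [0x0d7b0400]  # hypothetical address of a `ret` in the binary

for start, end in lbr_ranges(trace):
    keep = not crosses_unconditional_branch(start, end, uncond)
    print(hex(start), hex(end), "keep" if keep else "drop")
```

Under these assumptions the short range [0x0c745040, 0x0c74505f] is kept and the oversized range [0x0d7b02b0, 0x368ba706] is dropped.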
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D125448
Blue Gaston [Wed, 11 May 2022 19:52:32 +0000 (12:52 -0700)]
[Sanitizers][Darwin] Add READ/WRITE detection on arm64 for darwin.
On arm64 the read/write flag is set on the esr register.
Adding this flag check for arm64 enables a more accurate
print out for sanitizer signal reports and matches the
behavior on x86.
Fixes bug: https://bugs.llvm.org/show_bug.cgi?id=27543 https://github.com/google/sanitizers/issues/653
These tests are now passing:
SanitizerCommon-asan-arm64-Darwin :: Posix/illegal_read_test.cpp
SanitizerCommon-asan-arm64-Darwin :: Posix/illegal_write_test.cpp
SanitizerCommon-asan-arm64e-Darwin :: Posix/illegal_read_test.cpp
SanitizerCommon-asan-arm64e-Darwin :: Posix/illegal_write_test.cpp
SanitizerCommon-tsan-arm64-Darwin :: Posix/illegal_read_test.cpp
SanitizerCommon-tsan-arm64-Darwin :: Posix/illegal_write_test.cpp
SanitizerCommon-tsan-arm64e-Darwin :: Posix/illegal_read_test.cpp
SanitizerCommon-tsan-arm64e-Darwin :: Posix/illegal_write_test.cpp
SanitizerCommon-ubsan-arm64-Darwin :: Posix/illegal_read_test.cpp
SanitizerCommon-ubsan-arm64-Darwin :: Posix/illegal_write_test.cpp
SanitizerCommon-ubsan-arm64e-Darwin :: Posix/illegal_read_test.cpp
SanitizerCommon-ubsan-arm64e-Darwin :: Posix/illegal_write_test.cpp
rdar://92104440
Differential Revision: https://reviews.llvm.org/D125416
Sanjay Patel [Thu, 12 May 2022 16:37:29 +0000 (12:37 -0400)]
[InstCombine] freeze operand in div+mul fold
As discussed in issue #37809, this transform is not safe
if the input is an undefined value.
This is similar to recent changes for urem and sdiv:
d428f09b2c9d
99ef341ce943
There is no difference in codegen on the basic examples,
but this could lead to regressions. We may need to
improve freeze analysis or lowering if that happens.
Presumably, in real cases that are similar to the tests
where a subsequent transform removes the rem, we
will also be able to remove the freeze by seeing that
the parameter has 'noundef'.
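Why an undefined input matters can be sketched with a small model. This is a hedged illustration that assumes the fold in question rewrites `(X / Y) * Y` into `X - (X % Y)`, and it models an undef value as one that may differ at each use; the tiny value range and the constant `Y` are arbitrary choices for the example.

```python
from itertools import product

VALUES = range(8)  # tiny stand-in for an integer type
Y = 4


def original(x):
    """(X / Y) * Y, where X is used once."""
    return (x // Y) * Y


def folded(x_use1, x_use2):
    """X - (X % Y), where X is used twice."""
    return x_use1 - (x_use2 % Y)


original_results = {original(x) for x in VALUES}
# Without freeze, each use of an undef X may observe a different value.
folded_unfrozen = {folded(a, b) for a, b in product(VALUES, repeat=2)}
# With freeze, one arbitrary value is shared by all uses.
folded_frozen = {folded(x, x) for x in VALUES}

print(sorted(original_results))             # [0, 4]
print(folded_frozen == original_results)    # True
print(folded_unfrozen <= original_results)  # False
```

In this model the unfrozen form can produce results (such as 5) that the original expression never could, while the frozen form produces exactly the original set.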
Philip Reames [Thu, 12 May 2022 17:10:12 +0000 (10:10 -0700)]
[RISCV] Extend dataflow workaround from D119518 to fallthrough blocks
We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks without terminators (e.g. fallthroughs).
I have a deeper rework of the algorithm in flight over in D125232, but this patch is specifically a minimal fix for an active miscompile. That change can be reworked over this once landed.
Differential Revision: https://reviews.llvm.org/D125408
Louis Dionne [Thu, 12 May 2022 17:26:16 +0000 (13:26 -0400)]
[libc++abi][NFC] Add comment on long reaching #if
Aaron Ballman [Thu, 12 May 2022 17:19:26 +0000 (13:19 -0400)]
Check for resource exhaustion when recursively parsing declarators
With sufficiently tortured code, it's possible to cause a stack
overflow when parsing declarators. Thus, we now check for resource
exhaustion when recursively parsing declarators so that we can at least
warn the user we're about to crash before we actually crash.
Fixes #51642
Differential Revision: https://reviews.llvm.org/D124915
Louis Dionne [Mon, 9 May 2022 20:41:38 +0000 (16:41 -0400)]
[libc++abi] Refactor exception type demangling into a separate function
As a fly-by fix, also let `__cxa_demangle` allocate its buffer alone,
since we are not allowed to pass a non-malloc'd buffer to it.
Differential Revision: https://reviews.llvm.org/D125268
Simon Pilgrim [Thu, 12 May 2022 16:46:41 +0000 (17:46 +0100)]
[CostModel][X86] Auto generate partial interleaved load LV costs using UTC_ARGS --filter control
Simon Pilgrim [Thu, 12 May 2022 16:40:40 +0000 (17:40 +0100)]
[CostModel][X86] Auto generate masked load/store LV costs using UTC_ARGS --filter control
Also fix a sse42 -> sse4.2 typo so that we actually test costs for sse4.2
Simon Pilgrim [Thu, 12 May 2022 16:06:34 +0000 (17:06 +0100)]
[CostModel][X86] Auto generate gather/scatter LV costs using UTC_ARGS --filter control
Also fix a sse42 -> sse4.2 typo so that we actually test costs for sse4.2
Stephen Long [Thu, 12 May 2022 16:10:43 +0000 (09:10 -0700)]
[Headers][MSVC] Define wchar_t in stddef.h like MSVC if not using the builtin type
MSVC expects wchar_t to be defined in stddef.h if /Zc:wchar_t- is specified
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D124026
Hongtao Yu [Thu, 12 May 2022 16:28:25 +0000 (09:28 -0700)]
[CSSPGO][llvm-profgen] Do not duplicate context profiles into base profile when converting CS flat profile to nested.
Recent experiments with our two large internal services showed that duplicating context profiles into base profile caused code size inflation and didn't deliver good performance compared to no such duplication. It was a trick we made to catch up with the CS flat profile and I'm now turning it off by default.
The code size inflation mainly comes from the enriched base profiles. A base profile for a function represents the uninlined (or outlined) portion of the whole function running time. Such a portion could be very small if a function is inlined into most of its hot callsites. Duplicating context profiles of the function into its base profile could cause the outlined body to be hot enough and in turn get many of its callees inlined, thus increasing the code size. The size inflation could further cause perf regression.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D124796
Craig Topper [Thu, 12 May 2022 07:32:45 +0000 (00:32 -0700)]
[TypePromotion] Promote undef by converting to 0.
If we're promoting an undef I think that means that we expect the
upper bits are zero. undef doesn't guarantee that.
This patch replaces undef with 0 to ensure this. This matches how
a zext or sext of undef would be folded by InstCombine/InstSimplify.
I haven't found a failure from this; I was just thinking through the code.
Differential Revision: https://reviews.llvm.org/D123174
Craig Topper [Thu, 12 May 2022 01:48:52 +0000 (18:48 -0700)]
[RISCV] Use tail agnostic policy when selecting riscv_fma_vl to instructions
riscv_fma_vl doesn't have a tail, so use the tail_agnostic policy.
We were already doing this for some patterns. I think the patterns
with fneg and mask were added later and I copied the tail policy
from the unmasked patterns.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D125424
Yaxun (Sam) Liu [Thu, 12 May 2022 15:34:07 +0000 (11:34 -0400)]
[clang] Silence warning in MicrosoftCXXABI.cpp
Silence warning with gcc 9.3 about:
[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
57 | unsigned getManglingNumber(const VarDecl *VD,
| ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning: by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
80 | unsigned getManglingNumber(const TagDecl *TD,
| ^~~~~~~~~~~~~~~~~
Change-Id: Ia519e77c6454eb020228478dd6498eaf7864dae8
Martin Storsjö [Wed, 27 Apr 2022 10:28:01 +0000 (13:28 +0300)]
[libcxx] Switch __cxx_contention_t to int32_t on 32 bit AIX
I guess this is an ABI break for the 32 bit AIX configuration, but I'm
not sure if that one is meant to be ABI stable yet or not.
Previously, this used int32_t for this type on linux, but int64_t
on all other platforms. This was added in D68480 /
54fa9ecd3088508b05b0c5b5cb52da8a3c188655, but I don't really see
any discussion around this detail there.
Switching this to 32 bit on 32 bit AIX silences these libcxx build
warnings:
```
In file included from /scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/libcxx/src/atomic.cpp:12:
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1005:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_fetch_add(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
^
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:948:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_load(const_cast<__ptr_type>(&__a->__a_value), static_cast<__memory_order_underlying_t>(__order));
^
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1000:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_fetch_add(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
^
/scratch/powerllvm/cpap8006/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1022:12: warning: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Watomic-alignment]
return __c11_atomic_fetch_sub(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
^
4 warnings generated.
```
Differential Revision: https://reviews.llvm.org/D124519
Benjamin Kramer [Thu, 12 May 2022 15:59:39 +0000 (17:59 +0200)]
[DenseElementAttr] Silence warning in -DNDEBUG builds. NFC.
Quentin Colombet [Tue, 3 May 2022 02:04:42 +0000 (19:04 -0700)]
[DeadArgElim] Re-apply: Set unused arguments for internal functions
The re-apply includes fixes to clang tests that were missed in
the original commit.
Original message:
Prior to this patch we would only set to undef the unused arguments of the
external functions. The rationale was that unused arguments of internal
functions wouldn't need to be turned into undef arguments because they
should have been simply eliminated by the time we reach that code.
This is actually not true because there are plenty of cases where we can't
remove unused arguments. For instance, if the internal function is used in
an indirect call, it may not be possible to change the function signature.
Yet, for statically known call-sites we would still like to mark the unused
arguments as undef.
This patch enables the "set undef arguments" optimization on internal
functions when we encounter cases where internal functions cannot be
optimized. I.e., whenever an internal function is marked "live".
Differential Revision: https://reviews.llvm.org/D124699
Chris Lattner [Thu, 12 May 2022 15:17:52 +0000 (16:17 +0100)]
Various improvements suggested by river NFC.
Differential Revision: https://reviews.llvm.org/D125471
Chris Lattner [Thu, 12 May 2022 04:32:16 +0000 (05:32 +0100)]
[DenseElementAttr] Simplify the public API for creating these.
Instead of requiring the client to compute the "isSplat" bit,
compute it internally. This makes the logic more consistent
and defines away a lot of "elements.size()==1" in the clients.
This addresses Issue #55185
Differential Revision: https://reviews.llvm.org/D125447
Eric Schweitz [Tue, 10 May 2022 14:51:15 +0000 (07:51 -0700)]
Fixes a performance problem with lowering of forall loops and creating
too many temporaries.
Fix clang-format errors.
Differential Revision: https://reviews.llvm.org/D125336
Fraser Cormack [Thu, 12 May 2022 14:45:04 +0000 (15:45 +0100)]
[CodeGen][NFC] Move some comments from the end of lines to above them
This avoids wrapping the line itself awkwardly when it exceeds 80 chars.
It also better matches our style most other places.
Jeremy Morse [Thu, 12 May 2022 14:39:51 +0000 (15:39 +0100)]
[DebugInfo][InstrRef] Describe value sizes when spilt to stack
This is a re-apply of D123599, which was reverted in 4fe2ab5279408, now
with a more appropriate assertion. Original commit message follows:
InstrRefBasedLDV can track and describe variable values that are spilt to
the stack -- however it does not currently describe the size of the value on
the stack. This can cause uninitialized bytes to be read from the stack if
a small register is spilt for a larger variable, or theoretically on
big-endian machines if a large value on the stack is used for a small
variable.
Fix this by using DW_OP_deref_size to specify the amount of data to load
from the stack, if there's any possibility for ambiguity. There are a few
scenarios where this can be omitted (such as when using DW_OP_piece and a
non-DW_OP_stack_value location), see deref-spills-with-size.mir for an
explicit table of input flavours and output expressions.
Differential Revision: https://reviews.llvm.org/D123599
Pavel Samolysov [Thu, 12 May 2022 14:39:26 +0000 (16:39 +0200)]
[ArgPromotion] Make a non-byval promotion attempt first
It makes sense to make a non-byval promotion attempt first and then
fall back to the byval one. The non-byval ('usual') promotion is
generally better, for example it does promotion even when a structure
has more elements than 'MaxElements' but not all of them are actually
used in the function.
Differential Revision: https://reviews.llvm.org/D124514
Richard Howell [Wed, 4 May 2022 17:25:34 +0000 (10:25 -0700)]
[clang] serialize ORIGINAL_PCH_DIR relative to BaseDirectory
This diff changes the serialization of the `ORIGINAL_PCH_DIR`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This will allow for the module to be relocatable
across machines.
The path is restored relative to the module's BaseDirectory on
deserialization.
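The round trip can be sketched as below. This is a hedged illustration, not clang's serialization code: `serialize_path` and `deserialize_path` are hypothetical helpers, the directory names are invented, and leaving paths outside the base directory untouched is an assumption.

```python
import posixpath


def serialize_path(path, base_directory):
    """Store the path relative to base_directory when it lives underneath it."""
    base = base_directory.rstrip("/") + "/"
    if path.startswith(base):
        return path[len(base):]
    return path  # outside the base directory: kept as-is (assumption)


def deserialize_path(stored, base_directory):
    """Resolve a relative entry against the reader's base directory."""
    if posixpath.isabs(stored):
        return stored
    return posixpath.join(base_directory, stored)


# Written on one machine...
stored = serialize_path("/home/alice/proj/build/pch", "/home/alice/proj")
print(stored)  # build/pch

# ...and read back on another machine with a different checkout location.
print(deserialize_path(stored, "/srv/ci/proj"))  # /srv/ci/proj/build/pch
```

Because only the relative part is stored, the same file resolves correctly wherever the base directory ends up.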
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124946
Richard Howell [Wed, 4 May 2022 16:48:44 +0000 (09:48 -0700)]
[clang] serialize SUBMODULE_TOPHEADER relative to BaseDirectory
This diff changes the serialization of the `SUBMODULE_TOPHEADER`
entry in module files to be serialized relative to the module's
`BaseDirectory`. This matches the behavior of the
`SUBMODULE_HEADER` entry and will allow for the module to be
relocatable across machines.
The path is restored relative to the module's `BaseDirectory` on
deserialization.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124938
Richard Howell [Wed, 4 May 2022 15:44:40 +0000 (08:44 -0700)]
[clang] add -fmodule-file-home-is-cwd
This diff adds a new frontend flag `-fmodule-file-home-is-cwd`.
The behavior of this flag is similar to
`-fmodule-map-file-home-is-cwd` but does not require the module
map files to be modified to have inputs relative to the cwd.
Instead the output modules will have their `BaseDirectory` set
to the cwd and will try and resolve paths relative to that.
The motivation for this change is to support relocatable pcm
files that are built on different machines with different paths
without having to alter module map files, which is sometimes not
possible as they are provided by 3rd parties.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D124874
serge-sans-paille [Thu, 12 May 2022 13:33:17 +0000 (15:33 +0200)]
[openmp] Fix strict aliasing issue in cmpxchg routine
Avoid warning under -fstrict-aliasing by using a call to memcpy to perform type
punning.
Differential Revision: https://reviews.llvm.org/D125467
Nikita Popov [Thu, 12 May 2022 12:56:44 +0000 (14:56 +0200)]
[AArch64] Preserve chain when lowering fixed length load to SVE (PR55281)
When a fixed length load is lowered to an SVE masked load, the
result chain is currently set to the input chain of the old load,
rather than the result chain of the new load. This may cause stores
to be incorrectly reordered.
Fixes https://github.com/llvm/llvm-project/issues/55281.
Differential Revision: https://reviews.llvm.org/D125464
Tomasz Kamiński [Thu, 12 May 2022 13:40:11 +0000 (15:40 +0200)]
Reland "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible"
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.
This avoids producing a very large `RHS` when the symbol is cast to
an unsigned number, and as a consequence makes the value more robust in
the presence of casts.
Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.
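The rewrite rule can be sketched as follows. This is a hedged illustration, not the analyzer's code: `canonicalize` is a hypothetical helper that models the operator as a string and the right-hand side as a signed integer of a given bit width.

```python
def canonicalize(op, rhs, bits):
    """Rewrite `x op rhs` so that rhs is non-negative when -rhs is representable."""
    assert op in ("+", "-")
    min_value = -(1 << (bits - 1))
    if rhs < 0 and rhs != min_value:
        return ("-" if op == "+" else "+"), -rhs
    return op, rhs  # already non-negative, or the lowest value: left unchanged


print(canonicalize("+", -5, 32))        # ('-', 5)
print(canonicalize("-", -5, 32))        # ('+', 5)
print(canonicalize("+", -2**31, 32))    # ('+', -2147483648)

# The very large RHS that is avoided: -5 reinterpreted as unsigned 32-bit.
print((-5) % 2**32)                     # 4294967291
```

The last line shows the kind of value that would otherwise appear on the right-hand side after a cast to an unsigned type.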
Reviewed By: steakhal
Patch By: tomasz-kaminski-sonarsource!
Differential Revision: https://reviews.llvm.org/D124658
Thomas Raoux [Wed, 11 May 2022 17:43:44 +0000 (17:43 +0000)]
[mlir][vector] Add lowering pattern for vector.warp_execute_on_lane_0 op
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.
This also adds an integration test running on a GPU warp. The same tests can
later be re-used with different comment lines to test distribution
transformations.
This is mostly from @springerm's contribution.
Differential Revision: https://reviews.llvm.org/D125430
Ken Matsui [Thu, 12 May 2022 13:25:05 +0000 (09:25 -0400)]
Warn if using `elifdef` & `elifndef` in not C2x & C++2b mode
This adds an extension warning when using the preprocessor conditionals
in a language mode they're not officially supported in, and an opt-in
warning for compatibility with previous standards.
Fixes #55306
Differential Revision: https://reviews.llvm.org/D125178
Pedro Olsen Ferreira [Thu, 12 May 2022 12:44:47 +0000 (13:44 +0100)]
Rename and fix ValueMap::resize to reserve
The underlying map type (DenseMap) has had its resize() function
renamed to reserve() as part of
c04fc7a60ff4ea4610ea157be006c9771224a7b6 (SVN 264026).
This is only visible when the member function is called, as it is
template type name dependent.
Differential Revision: https://reviews.llvm.org/D125387
Martin Storsjö [Thu, 11 Nov 2021 10:55:10 +0000 (12:55 +0200)]
[AArch64] Stop creating unnecessary label MCSymbols for each Windows unwind opcode. NFC.
These labels aren't needed in the ARM version of WinEH tables, as each
unwind opcode maps to a specific instruction (each opcode is assumed
to represent one instruction), and the written tables don't contain
offsets like on x86_64.
Differential Revision: https://reviews.llvm.org/D125369
Martin Storsjö [Wed, 24 Nov 2021 12:03:54 +0000 (14:03 +0200)]
[MC] [Win64EH] Simplify code using WinEH::Instruction::operator!=. NFC.
operator== and operator!= were added in
1308bb99e06752ab0b5175c92da31083f91af921 / D87369, but this existing
codepath wasn't updated to use them.
Also fix the indentation of the enclosed lines.
Differential Revision: https://reviews.llvm.org/D125368
Benjamin Kramer [Thu, 12 May 2022 11:35:27 +0000 (13:35 +0200)]
[mlir][linalg] Add lowering of named ops on complex numbers
This lets linalg.dot and friends lower to a complex muladd using ops
from the complex dialect.
Differential Revision: https://reviews.llvm.org/D125461
owenca [Fri, 6 May 2022 22:08:12 +0000 (15:08 -0700)]
[clang-format] Don't remove braces if a 1-statement body would wrap
Reimplement the RemoveBracesLLVM feature which handles a
single-statement block that would get wrapped.
Fixes #53543.
Differential Revision: https://reviews.llvm.org/D125137
Nikita Popov [Thu, 12 May 2022 10:22:51 +0000 (12:22 +0200)]
[FastISel] Add some debug output (NFC)
Print a debug message when aborting isel (next to the ORE report)
and when folding a load.
Benjamin Kramer [Thu, 12 May 2022 10:04:14 +0000 (12:04 +0200)]
[bazel] Add support for configuring the bazel build for PPC
TF already carries a patch for this.
Benjamin Kramer [Wed, 11 May 2022 13:12:21 +0000 (15:12 +0200)]
[mlir][LLVM] Make the nested type restriction on complex constants less aggressive
Complex nested in other types is perfectly fine, just nested structs
aren't supported. Instead of checking whether there's nesting just check
whether the struct we're dealing with is a complex number.
Differential Revision: https://reviews.llvm.org/D125381
Max Kazantsev [Thu, 12 May 2022 09:09:11 +0000 (16:09 +0700)]
[Test] Regenerate checks using auto-update (work around PR55365)
Dmitry Vassiliev [Thu, 12 May 2022 08:46:03 +0000 (10:46 +0200)]
[Intrinsics] Fix `nvvm_prmt` intrinsic attributes
`nvvm_prmt` doesn't seem to be `commutative`. nvvm also sets `IntrSpeculatable` for it.
Here is the doc https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-prmt
Reviewed By: tra, jchlanda
Differential Revision: https://reviews.llvm.org/D125423
Daniil Dudkin [Thu, 12 May 2022 08:39:53 +0000 (11:39 +0300)]
[mlir][NFC] Fix `GpuKernelOutliningPass` copy constructor warnings
1. Call copy constructor of the base class
2. Assign value of the option directly
Reviewed By: dcaballe, rriddle
Differential Revision: https://reviews.llvm.org/D125101
Ivan Kosarev [Thu, 12 May 2022 07:51:35 +0000 (08:51 +0100)]
[AMDGPU][NFC] Remove unused function.
Introduced in
https://reviews.llvm.org/rG229d5e669bbbe7ca38ad832627a9809405939f1b
and then became unused in
https://reviews.llvm.org/D19584
Reviewed By: foad, dp
Differential Revision: https://reviews.llvm.org/D125385
Nikita Popov [Wed, 11 May 2022 07:57:15 +0000 (09:57 +0200)]
[MLIR] Fix build without native arch
D125214 split off a MLIRExecutionEngineUtils library that is used
by MLIRGPUTransforms. However, currently the entire ExecutionEngine
directory is skipped if the LLVM_NATIVE_ARCH target is not available.
Move the check for LLVM_NATIVE_ARCH, such that MLIRExecutionEngineUtils
always gets built, and only the JIT-related libraries are omitted
without native arch.
Differential Revision: https://reviews.llvm.org/D125357
Ivan Kosarev [Thu, 12 May 2022 07:25:33 +0000 (08:25 +0100)]
[AMDGPU][GFX10] Support base+soffset+offset SMEM stores.
Also makes another step towards resolving
https://github.com/llvm/llvm-project/issues/38652
Reviewed By: foad, dp
Differential Revision: https://reviews.llvm.org/D125380
Matthias Springer [Thu, 12 May 2022 07:42:53 +0000 (09:42 +0200)]
[mlir][bufferize] Support alloc hoisting across function boundaries
This change integrates the BufferResultsToOutParamsPass into One-Shot Module Bufferization. This improves memory management (deallocation) when buffers are returned from a function.
Note: This currently only works with statically-sized tensors. The generated code is not very efficient yet and there are opportunities for improvement (fewer copies). By default, this new functionality is deactivated.
Differential Revision: https://reviews.llvm.org/D125376
Matthias Springer [Thu, 12 May 2022 07:27:21 +0000 (09:27 +0200)]
[mlir][bufferize] Fix op filter
Bufferization has an optional filter to exclude certain ops from analysis+bufferization. There were a few remaining places in the codebase where the filter was not checked.
Differential Revision: https://reviews.llvm.org/D125356
Tim Northover [Thu, 12 May 2022 07:30:53 +0000 (08:30 +0100)]
Revert "Add an error message to the default SIGPIPE handler"
It broke a PPC bot, for not immediately obvious reasons.
Matthias Springer [Thu, 12 May 2022 07:17:04 +0000 (09:17 +0200)]
[mlir][bufferize] Add helpers for templatized DENY filters
We already have templatized ALLOW filters but the DENY filters were missing.
Differential Revision: https://reviews.llvm.org/D125358
Carl Ritson [Wed, 11 May 2022 09:21:27 +0000 (18:21 +0900)]
[AMDGPU] Remove pre-committed test for D124981. NFC.
Tim Northover [Wed, 11 May 2022 08:52:10 +0000 (09:52 +0100)]
Add an error message to the default SIGPIPE handler
UNIX03 conformance requires utilities to flush stdout before exiting and raise
an error if writing fails. Flushing already happens on a call to exit
and thus automatically on a return from main. Write failure is then
detected by LLVM's default SIGPIPE handler. The handler already exits with
a non-zero code, but conformance additionally requires an error message.
Krasimir Georgiev [Thu, 12 May 2022 06:30:36 +0000 (08:30 +0200)]
silence new -Wunused-result warnings in test
No functional changes intended.
After https://github.com/llvm/llvm-project/commit/f156b51aecc676a9051136f6f5cb74e37dd574d1,
new -Wunused-result warnings popped up in this test:
https://buildkite.com/llvm-project/upstream-bazel/builds/28320#bc3ec049-af39-4114-b7b8-4cbc180bc09b
River Riddle [Wed, 11 May 2022 04:25:00 +0000 (21:25 -0700)]
[mlir:Parser] Emit a better diagnostic when a custom operation is unknown
When a custom operation is unknown and does not have a dialect prefix, we currently
emit an error using the name of the operation with the default dialect prefix. This
leads to a confusing error message, especially when operations get moved between dialects.
For example, `func` was recently moved out of `builtin` and to the `func` dialect. The current
error message we get is:
```
func @foo()
^ custom op 'builtin.func' is unknown
```
This could lead users to believe that there is supposed to be a `builtin.func`,
because there used to be. This commit adds a better error message that does
not assume that the operation is supposed to be in the default dialect:
```
func @foo()
^ custom op 'func' is unknown (tried 'builtin.func' as well)
```
Differential Revision: https://reviews.llvm.org/D125351
Mahesh Ravishankar [Thu, 12 May 2022 03:50:21 +0000 (03:50 +0000)]
[mlir][Linalg] Combine canonicalizers that deal with removing dead/redundant args.
`linalg.generic` ops have canonicalizers that either remove arguments
not used in the payload, or redundant arguments. Combine these and
enhance the canonicalization to also remove results that have no use.
This is effectively dead code elimination for Linalg ops.
Differential Revision: https://reviews.llvm.org/D123632
Mogball [Thu, 12 May 2022 05:14:25 +0000 (05:14 +0000)]
[mlir][ods] (NFC) don't use std::function for map_range
Mogball [Thu, 12 May 2022 04:16:17 +0000 (04:16 +0000)]
[mlir] (NFC) Use assembly format for test.graph_region
bzcheeseman [Wed, 11 May 2022 19:25:04 +0000 (15:25 -0400)]
[MLIR][Operation] Simplify Operation casting, NFC
We can simplify the code needed to implement dyn_cast/cast/isa support for MLIR operations with documented interfaces via the CastInfo structures. This will also provide an example of how to use CastInfo.
Depends on D123901
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124963
bzcheeseman [Sat, 16 Apr 2022 18:34:08 +0000 (11:34 -0700)]
[LLVM][Casting.h] Update dyn_cast machinery to provide more control over how the casting is performed.
This patch expands the expressive capability of the casting utilities in LLVM by introducing several levels of configurability. By creating modular CastInfo classes we can enable projects like MLIR that need more fine-grained control over how a cast is actually performed to retain that control, while making it easy to express the easy cases (like a checked pointer to pointer cast).
The current implementation of Casting.h doesn't make it clear where the entry points for customizing the cast behavior are, so part of the motivation for this patch is adding that documentation. Another part of the motivation is to support using LLVM RTTI with a wider set of use cases, such as nullable value to value casts, or pointer to value casts (as in MLIR).
Reviewed By: lattner, rriddle
Differential Revision: https://reviews.llvm.org/D123901
Fangrui Song [Thu, 12 May 2022 03:27:11 +0000 (20:27 -0700)]
[Bitcode] Simplify code after FUNC_CODE_BLOCKADDR_USERS changes (D124878)
Switch to the more common `Constant && !GlobalValue` test.
Use the more common `Worklist/Visited` variable names.
Jim Lin [Wed, 11 May 2022 06:13:35 +0000 (14:13 +0800)]
[BPF] Implement mod operation
Implement BPF_MOD instruction to fix lack of assembly parser support mentioned in https://github.com/llvm/llvm-project/issues/55192.
Reviewed By: ast
Differential Revision: https://reviews.llvm.org/D125207
Lian Wang [Thu, 12 May 2022 02:11:37 +0000 (02:11 +0000)]
[LegalizeVectorTypes] Enable WidenVecRes_SETCC work for scalable vector.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125359
Ping Deng [Thu, 12 May 2022 02:22:56 +0000 (02:22 +0000)]
[RISCV][NFC] Simplify tests by reorganizing check prefixes
Reviewed By: benshi001, asb
Differential Revision: https://reviews.llvm.org/D125354
grosul1 [Thu, 12 May 2022 01:44:13 +0000 (01:44 +0000)]
[mlir] Fix loop unrolling: properly replace the arguments of the epilogue loop.
Using "replaceUsesOfWith" is incorrect because the same initializer value may appear multiple times.
For example, if the epilogue is needed when this loop is unrolled
```
%x:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
```
then both of the epilogue's arguments will be incorrectly renamed to use the same result index (note #1 in both cases):
```
%x_unrolled:2 = scf.for ... iter_args(%arg1 = %c1, %arg2 = %c1) {
...
}
%x_epilogue:2 = scf.for ... iter_args(%arg1 = %x_unrolled#1, %arg2 = %x_unrolled#1) {
...
}
```
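The failure mode can be reproduced outside MLIR with a small model. This is a simplified sketch, assuming operands are plain strings; `rewriteByValue` and `rewriteByPosition` are hypothetical helpers, not the functions used in the patch. Which result index both operands collapse to depends on the order of replacements, so the model only demonstrates that they collapse to one value.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Operands = std::vector<std::string>;

// Value-based replacement: rewrites every operand equal to `from`.
void replaceUsesOfWith(Operands &ops, const std::string &from,
                       const std::string &to) {
  std::replace(ops.begin(), ops.end(), from, to);
}

// Buggy approach: when the same initializer appears twice, the first
// replacement already rewrites both operands.
Operands rewriteByValue(Operands epilogue, const Operands &inits,
                        const Operands &results) {
  for (std::size_t i = 0; i < inits.size(); ++i)
    replaceUsesOfWith(epilogue, inits[i], results[i]);
  return epilogue;
}

// Position-based approach: operand i is set to result i, regardless of
// which value it held before.
Operands rewriteByPosition(Operands epilogue, const Operands &results) {
  for (std::size_t i = 0; i < results.size() && i < epilogue.size(); ++i)
    epilogue[i] = results[i];
  return epilogue;
}
```

Calling both helpers with initializers `{"%c1", "%c1"}` and results `{"%x_unrolled#0", "%x_unrolled#1"}` gives two identical operands from the value-based version and two distinct operands from the position-based one.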
Weining Lu [Tue, 3 May 2022 03:06:24 +0000 (11:06 +0800)]
[LoongArch] Check msb is not less than lsb for the bstr{ins/pick}.{w/d} instructions
Differential Revision: https://reviews.llvm.org/D124825
David Tenty [Thu, 12 May 2022 00:47:48 +0000 (20:47 -0400)]
Revert "[NFC][tests][AIX] XFAIL test for lack of visibility support"
This reverts commit
f5a9b5cc12658f4d6caa3e0cfc3e771698fb3798 since
https://reviews.llvm.org/D125141 has resolved the test issue.
Tapan Thaker [Wed, 11 May 2022 23:29:07 +0000 (16:29 -0700)]
[lld/macho] Fixes the -ObjC flag
When checking the segment name for Swift symbols, we should be checking that they start with `__swift` instead of checking for equality.
Fixes the issue https://github.com/llvm/llvm-project/issues/55355
Reviewed By: #lld-macho, keith, thevinster
Differential Revision: https://reviews.llvm.org/D125250
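The difference between the two checks can be sketched as follows. The helper names are hypothetical and the code is not taken from lld; a name such as `__swift5_types` is used only as an illustrative input.

```cpp
#include <string_view>

// Prefix check: accepts any name that begins with "__swift".
bool startsWithSwift(std::string_view name) {
  constexpr std::string_view prefix = "__swift";
  // substr clamps the length, so shorter names compare unequal safely.
  return name.substr(0, prefix.size()) == prefix;
}

// Equality check: accepts only the exact name "__swift".
bool equalsSwift(std::string_view name) { return name == "__swift"; }
```

A name like `__swift5_types` passes the prefix check but fails the equality check, which is the kind of mismatch the message describes.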
Vasileios Porpodas [Wed, 11 May 2022 22:52:24 +0000 (15:52 -0700)]
Recommit "[SLP] Make reordering aware of external vectorizable scalar stores."
This reverts commit
c2a7904aba465fcaf13bbe2a5772cdeeb88060e5.
Original code review: https://reviews.llvm.org/D125111
Amir Ayupov [Wed, 11 May 2022 23:23:27 +0000 (16:23 -0700)]
[BOLT][NFC] Use BitVector::set_bits
Refactor and use `set_bits` BitVector interface.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125374
Greg Clayton [Tue, 10 May 2022 20:41:06 +0000 (13:41 -0700)]
Add "indexedVariables" to variables with lots of children.
Prior to this fix, if we had a really large array or collection class, we would always end up creating all of the child variables. If the number of children was very high, this could cause delays when expanding variables. By adding "indexedVariables" to variables with lots of children, we can keep good performance in the variables view at all times. This patch adds the "indexedVariables" key/value pair to any "Variable" JSON dictionaries when we have an array or synthetic child provider that will create more than 100 children.
We have to be careful not to call "uint32_t SBValue::GetNumChildren()" on every lldb::SBValue that we use, because it can cause a class, struct or union to complete its type in order to tell us how many children it has, and this can be expensive if you have a lot of variables. By default LLDB won't need to complete a type for variables that are classes, structs or unions unless the user expands the variable in the variable view. So we only call GetNumChildren() when we have an array, as this is a cheap operation, or a synthetic child provider, most of which are for showing collections that typically fall into this category. We add a variable reference, which indicates that something can be expanded, when the function "bool SBValue::MightHaveChildren()" is true, as this call doesn't need to complete the type in order to return true. This way, if no one ever expands class variables, we don't need to complete the type.
Differential Revision: https://reviews.llvm.org/D125347
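The decision logic described above can be modelled in a few lines. This is a sketch under stated assumptions: `ValueInfo` is a hypothetical stand-in for the facts the adapter knows about a value, and the functions are not the ones in the patch. Only the threshold of 100 children comes from the message.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical summary of a value; the real code queries lldb::SBValue.
struct ValueInfo {
  bool isArray;
  bool isSynthetic;        // backed by a synthetic child provider
  bool mightHaveChildren;  // cheap check, does not complete the type
  std::uint32_t numChildren;  // only consulted when counting is cheap
};

constexpr std::uint32_t kIndexedThreshold = 100;

// Returns the count to report as "indexedVariables", or nothing to omit
// the key entirely.
std::optional<std::uint32_t> indexedVariables(const ValueInfo &v) {
  // Never count children of plain classes, structs or unions: counting
  // could force the type to be completed.
  if (!v.isArray && !v.isSynthetic)
    return std::nullopt;
  if (v.numChildren > kIndexedThreshold)
    return v.numChildren;
  return std::nullopt;
}

// A variable reference is added whenever the cheap check says the value
// can be expanded.
bool needsVariableReference(const ValueInfo &v) { return v.mightHaveChildren; }
```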
Simon Dardis [Sun, 8 May 2022 21:23:16 +0000 (22:23 +0100)]
[MIPS] Remove an incorrect microMIPS instruction alias
The microMIPS instruction set is compatible with the MIPS instruction
set at the assembly level but not in terms of encodings. `nop` in
microMIPS is a special case as it has the same encoding as `nop` for
MIPS.
Fix this error by reducing the usage of NOP in the MIPS backend such
that only ISA-correct variants are produced.
Differential Revision: https://reviews.llvm.org/D124716
Arthur Eubanks [Wed, 11 May 2022 22:27:39 +0000 (15:27 -0700)]
Revert "[SLP] Make reordering aware of external vectorizable scalar stores."
This reverts commit
71bcead98b2e655031208e5ad0ce89f8971a6343.
Causes crashes, see comments in D125111.
Alan Zhao [Wed, 11 May 2022 22:05:55 +0000 (15:05 -0700)]
Explicitly add -target for Windows builds in file_test_windows.c
It turns out that the llvm buildbots run the test with
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4, which would cause this
test to fail as the test assumed that the default target is Windows. To
fix this, we explicitly set -target for the Windows testcases.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D125425
Yuanfang Chen [Wed, 11 May 2022 21:42:03 +0000 (14:42 -0700)]
[Driver][test] run one test in darwin-dsymutil.c for Darwin only
Alan Zhao [Wed, 11 May 2022 20:54:09 +0000 (22:54 +0200)]
[clang] Add the flag -ffile-reproducible
When Clang generates the path prefix (i.e. the path of the directory
where the file is) for __FILE__, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.
Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overridden by setting -fno-file-reproducible.
[0]: https://crbug.com/1310767
Differential revision: https://reviews.llvm.org/D122766
Mike Rice [Wed, 11 May 2022 18:26:07 +0000 (11:26 -0700)]
[OpenMP] Fix mangling for linear parameters with negative stride
The 'n' character is used in place of '-' in the mangled name.
Differential Revision: https://reviews.llvm.org/D125406
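The substitution can be sketched with a small helper. `mangleStride` is a hypothetical function that only spells the stride itself; it does not reproduce the rest of the mangled name and is not the code from the patch.

```cpp
#include <string>

// Spell a signed stride for use in a mangled name: 'n' stands in for '-'.
std::string mangleStride(long long stride) {
  if (stride < 0) {
    // Negate in unsigned arithmetic so the most negative value is safe.
    unsigned long long magnitude =
        0ULL - static_cast<unsigned long long>(stride);
    return "n" + std::to_string(magnitude);
  }
  return std::to_string(stride);
}
```

Under this sketch a stride of -2 is spelled `n2`, while a stride of 4 stays `4`.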
Xiang Li [Wed, 11 May 2022 20:38:13 +0000 (13:38 -0700)]
Revert "[HLSL] add -D option for dxc mode."
This reverts commit
4dae38ebfba0d8583e52c3ded8f62f5f9fa77fda.
Differential Revision: https://reviews.llvm.org/D125414
Joseph Huber [Wed, 11 May 2022 20:53:36 +0000 (16:53 -0400)]
[LinkerWrapper][Fix] Fix bad alignment from extracted archive members
Summary:
We use embedded binaries to extract offloading device code from the host
fatbinary. This uses a binary format whose necessary alignment is
eight bytes. The alignment is included within the ELF section type so
the data extracted from the ELF should always be aligned at that amount.
However, if this file was extracted from a static archive, it was being
sent as an offset in the archive file which did not have the same
alignment guarantees as the ELF file. This was causing errors in the
UB-sanitizer build as it would occasionally try to access a misaligned
address. To fix this, I simply copy the memory directly to a new buffer
which is guaranteed to have worst-case alignment of 16 in the case that
it's not properly aligned.
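The copy-on-misalignment approach can be sketched as follows. This is a minimal illustration, assuming a byte pointer into an archive; `isAligned`, `copyAligned`, and `Chunk` are hypothetical names, and the linker wrapper itself uses LLVM's buffer classes rather than this code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Alignment the embedded binary format needs, per the message above.
constexpr std::size_t kRequiredAlign = 8;

// True when `p` is a multiple of `align`.
bool isAligned(const void *p, std::size_t align) {
  return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}

// Storage unit whose alignment is 16 bytes.
struct alignas(16) Chunk {
  unsigned char bytes[16];
};

// Copies `size` bytes into freshly allocated, 16-byte-aligned storage.
std::vector<Chunk> copyAligned(const unsigned char *data, std::size_t size) {
  std::vector<Chunk> out((size + sizeof(Chunk) - 1) / sizeof(Chunk));
  if (size != 0)
    std::memcpy(out.data(), data, size);
  return out;
}
```

A caller would test `isAligned(ptr, kRequiredAlign)` first and fall back to `copyAligned` only when the check fails, so members that are already aligned are not copied.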
Austin Kerbow [Fri, 25 Mar 2022 00:46:15 +0000 (17:46 -0700)]
[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic
Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If
there is a need to have highly optimized codegen and kernel developers have
knowledge of inter-wave runtime behavior which is unknown to the compiler this
builtin can be used to tune scheduling.
This intrinsic creates a barrier between scheduling regions. The immediate
parameter is a mask to determine the types of instructions that should be
prevented from crossing the sched_barrier. In this initial patch, there are only
two variations. A mask of 0 means that no instructions may be scheduled across
the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing
instructions may cross the sched_barrier.
Note that this intrinsic is only meant to work with the scheduling passes. Any
other transformations that may move code will not be impacted in the ways
described above.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D124700
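The two mask values described above can be modelled as a predicate. This is a sketch of the stated semantics only: `InstrInfo` and `mayCrossBarrier` are hypothetical, the scheduler does not contain this function, and masks other than 0 and 1 are left unspecified because the message does not define them.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical summary of the properties relevant to the barrier.
struct InstrInfo {
  bool accessesMemory;
  bool hasSideEffects;
};

// Returns whether an instruction may be scheduled across the barrier, or
// nothing for mask values the message does not describe.
std::optional<bool> mayCrossBarrier(std::uint32_t mask, const InstrInfo &i) {
  switch (mask) {
  case 0:
    return false;  // nothing may cross
  case 1:
    // only non-memory, non-side-effecting instructions may cross
    return !i.accessesMemory && !i.hasSideEffects;
  default:
    return std::nullopt;
  }
}
```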
Florian Hahn [Wed, 11 May 2022 20:20:42 +0000 (21:20 +0100)]
[ConstraintElimination] Add extra tests for different overflows.
Additional tests for D125264, inspired by @spatel.
Philip Reames [Wed, 11 May 2022 20:16:31 +0000 (13:16 -0700)]
[riscv] Add a bunch of tests exploring switch lowering
Specifically, how we handle zext vs sext around truncates.
Craig Topper [Wed, 11 May 2022 19:49:01 +0000 (12:49 -0700)]
[RISCV] Enable subregister liveness tracking for RVV.
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.
I've added a command line option that can be used to turn it off if it causes compile
time or functional issues. I used the option to keep the old behavior
for one interesting test case that was testing register allocation.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125108
Craig Topper [Wed, 11 May 2022 19:16:37 +0000 (12:16 -0700)]
[RISCV] Fold addiw from (add X, (addiw (lui C1, C2))) into load/store address
This is a followup to D124231.
We can fold the ADDIW in this pattern if we can prove that LUI+ADDI
would have produced the same result as LUI+ADDIW.
This pattern occurs because constant materialization prefers LUI+ADDIW
for all simm32 immediates. Only immediates in the range
0x7ffff800-0x7fffffff require an ADDIW. Other simm32 immediates
work with LUI+ADDI.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124693
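The range claim can be checked with a small model of the three instructions on a 64-bit register. This is a sketch, not backend code: the helper names are hypothetical, and the hi/lo split assumes the usual rounding of the upper 20 bits by 0x800. The narrowing cast in `sext32` relies on two's complement conversion, which is guaranteed from C++20 and is the behavior of mainstream compilers before that.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of x to 64 bits.
std::int64_t sext32(std::uint64_t x) {
  return static_cast<std::int64_t>(
      static_cast<std::int32_t>(static_cast<std::uint32_t>(x)));
}

struct HiLo {
  std::uint32_t hi20;  // LUI immediate
  std::int32_t lo12;   // signed 12-bit ADDI/ADDIW immediate
};

// Split a simm32 constant so that (hi20 << 12) + lo12 == c modulo 2^32.
HiLo split(std::int32_t c) {
  std::uint32_t u = static_cast<std::uint32_t>(c);
  std::int32_t lo = static_cast<std::int32_t>(u & 0xFFFu);
  if (lo >= 0x800)
    lo -= 0x1000;
  std::uint32_t hi = ((u + 0x800u) >> 12) & 0xFFFFFu;
  return {hi, lo};
}

// LUI on RV64: the 32-bit result is sign-extended to 64 bits.
std::int64_t lui(std::uint32_t hi20) {
  return sext32(static_cast<std::uint64_t>(hi20) << 12);
}

// LUI followed by ADDI: a full 64-bit add.
std::int64_t luiAddi(std::int32_t c) {
  HiLo p = split(c);
  return lui(p.hi20) + p.lo12;
}

// LUI followed by ADDIW: add, keep the low 32 bits, sign-extend.
std::int64_t luiAddiw(std::int32_t c) {
  HiLo p = split(c);
  std::uint64_t sum =
      static_cast<std::uint64_t>(lui(p.hi20)) +
      static_cast<std::uint64_t>(static_cast<std::int64_t>(p.lo12));
  return sext32(sum);
}

// ADDIW is needed only when the two sequences disagree.
bool addiwRequired(std::int32_t c) { return luiAddi(c) != luiAddiw(c); }
```

For 0x7ffff800, LUI produces 0xFFFFFFFF80000000 and the 64-bit ADDI of -2048 lands below the simm32 range, while ADDIW wraps back to 0x7ffff800. One value lower, at 0x7ffff7ff, both sequences agree.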
Florian Hahn [Wed, 11 May 2022 19:46:48 +0000 (20:46 +0100)]
[GVN] Add test case for memdep invalidation bug.
Test case for #30999.
Chris Lattner [Wed, 11 May 2022 07:51:53 +0000 (08:51 +0100)]
[AsmParser] Adopt emitWrongTokenError more, improving QoI
This is a full audit of emitError calls. I took the opportunity
to remove extraneous parens and fix a couple of cases where we'd
generate multiple diagnostics for the same error.
Differential Revision: https://reviews.llvm.org/D125355
Nikolas Klauser [Sun, 8 May 2022 14:40:04 +0000 (16:40 +0200)]
[libc++] Remove __invalidate_all_iterators and replace the uses with std::__debug_db_invalidate_all
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D125188
Nikolas Klauser [Sat, 7 May 2022 20:20:23 +0000 (22:20 +0200)]
[libc++] Add a few more debug wrapper functions
Reviewed By: ldionne, #libc, jloser
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D125176
Craig Topper [Wed, 11 May 2022 18:52:07 +0000 (11:52 -0700)]
[CodeGenPrepare] Use const reference to avoid unnecessary APInt copy. NFC
Spotted while looking at Matthias' patches.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D124985
Philip Reames [Wed, 11 May 2022 18:41:59 +0000 (11:41 -0700)]
[test, riscv] Add test illustrating missing handling for fallthrough blocks in 541c9ba
River Riddle [Sat, 7 May 2022 01:24:17 +0000 (18:24 -0700)]
[TableGen] Refactor TableGenParseFile to no longer use a callback
Now that TableGen no longer relies on global Record state, we can allow
for the client to own the RecordKeeper and SourceMgr. Given that TableGen
internally still relies on the global llvm::SrcMgr, this method unfortunately
still isn't thread-safe.
Differential Revision: https://reviews.llvm.org/D125277
River Riddle [Sat, 7 May 2022 01:05:54 +0000 (18:05 -0700)]
[TableGen] Remove the use of global Record state
This commit removes TableGen's reliance on managed static global record state
by moving the RecordContext into the RecordKeeper. The RecordKeeper is now
treated similarly to a (LLVM|MLIR|etc)Context object and is passed to static
construction functions. This is an important step forward in removing TableGen's
reliance on global state, and in a followup will allow users that parse tablegen
to parse multiple tablegen files without worrying about Record lifetime.
Differential Revision: https://reviews.llvm.org/D125276
Qiongsi Wu [Wed, 11 May 2022 17:20:41 +0000 (13:20 -0400)]
[clang][ppc] Creating Separate Install Target for PPC htm Headers
This patch splits out the htm intrinsic headers from the PPC headers list.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D125386