Rahul Kayaith [Thu, 25 May 2023 02:05:06 +0000 (22:05 -0400)]
[mlir][python] Hook up PyRegionList.__iter__ to PyRegionIterator
This fixes a -Wunused-member-function warning. At the moment,
`PyRegionIterator` is never constructed by anything (the only use was
removed in D111697), and iterating over region lists just falls
back to a generic Python iterator object.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D150244
Enna1 [Thu, 25 May 2023 02:11:02 +0000 (10:11 +0800)]
[gcov] Add nosanitize metadata to memory access instructions inserted by emitProfileNotes()
This patch adds nosanitize metadata to memory access instructions inserted by gcov emitProfileNotes(), making sanitizers skip these instructions when gcov and a sanitizer are used together.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D150460
Rahul Kayaith [Thu, 25 May 2023 01:51:36 +0000 (21:51 -0400)]
[mlir][python] Allow specifying block arg locations
Currently blocks are always created with UnknownLoc's for their arguments. This
adds an `arg_locs` argument to all block creation APIs, which takes an optional
sequence of locations to use, one per block argument. If no locations are
supplied, the current Location context is used.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D150084
Vitaly Buka [Thu, 11 May 2023 23:53:30 +0000 (16:53 -0700)]
[NFC][sanitizer] Rename *ThreadRegistry functions
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D150407
Hanhan Wang [Thu, 25 May 2023 00:41:29 +0000 (17:41 -0700)]
[mlir][linalg] Only apply masking on xfer_write when needed.
If the input vector sizes are the same as the tensor.pad result shape,
masking is not needed. Otherwise, masking is needed and the masking
operands should be the same as those of the tensor.empty op.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D151391
Vitaly Buka [Wed, 24 May 2023 23:51:12 +0000 (16:51 -0700)]
[hwasan] Fix allocator_interface implementation
__sanitizer_get_current_allocated_bytes had a body, but allocator
caches were not registered to collect stats. It's done by
SizeClassAllocator64LocalCache::Init().
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D151389
Aart Bik [Wed, 24 May 2023 20:09:59 +0000 (13:09 -0700)]
[mlir][sparse][gpu] fixed typo in CUDA test
Test was printing same result twice
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D151370
Kai Sasaki [Thu, 25 May 2023 00:19:32 +0000 (09:19 +0900)]
[mlir][linalg] Treat quant dialect type as unsupported in named conversion
Since the tosa-to-linalg conversion does not support the quant dialect type, we can treat it as unsupported instead of crashing. The issue was reported at https://github.com/llvm/llvm-project/issues/62367
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D151296
Craig Topper [Thu, 25 May 2023 00:25:33 +0000 (17:25 -0700)]
[RISCV] Add a special case to performVFMADD_VLCombine to support the multiplicand being the same value.
The one use check will fail if there are two uses in the same
instruction. Add a special case for this.
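The difference between "one use" and "one user" can be sketched with a toy model. This is not LLVM's SelectionDAG API; `Use`, `hasOneUse`, and `hasSingleUser` below are hypothetical stand-ins chosen only to show why a value used twice by the same instruction fails a strict one-use check:

```cpp
#include <vector>

// Hypothetical model: each use of a value records the id of the
// instruction that consumes it.
struct Use {
  int userId;
};

// Strict check: the value has exactly one use in total.
bool hasOneUse(const std::vector<Use> &uses) { return uses.size() == 1; }

// Relaxed check: the value has at least one use, and every use comes
// from the same instruction (e.g. both multiplicands of x * x).
bool hasSingleUser(const std::vector<Use> &uses) {
  if (uses.empty())
    return false;
  for (const Use &u : uses)
    if (u.userId != uses.front().userId)
      return false;
  return true;
}
```

With two uses from instruction 7, `hasOneUse` is false while `hasSingleUser` is true, which is the situation the special case is meant to cover.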
Thurston Dang [Wed, 24 May 2023 20:19:34 +0000 (20:19 +0000)]
hwasan: refactor order of macros in hwasan_platform_interceptors.h [NFC]
Currently, the header file contains all the undefs, followed by all the
define X 0. This will be inconvenient for re-enabling interceptors,
because we would need to comment out (or delete) the corresponding macros
in two different places.
This patch groups together the macros for each function.
Additionally, it adds the suggestion that interceptors should be
re-enabled by commenting out (not deleting) the macros.
Differential Revision: https://reviews.llvm.org/D151371
Vitaly Buka [Sat, 13 May 2023 01:43:53 +0000 (18:43 -0700)]
[AST] Construct Capture objects before use
Msan reports https://reviews.llvm.org/P8308
The reason is if PointerIntPair is not properly
constructed, setPointer uses Info::updatePointer
on uninitialized value.
Reviewed By: #clang, rsmith
Differential Revision: https://reviews.llvm.org/D150504
Md Abdullah Shahneous Bari [Thu, 25 May 2023 00:08:30 +0000 (17:08 -0700)]
[mlir][spirv] Support import/export Functions and GlobalVariables
The "LinkageAttributes" decoration allows a SPIR-V module to import
external functions and global variables, or to export functions or
global variables for other SPIR-V modules to link against and use.
Import/export capability is extremely important when using outside
libraries (e.g., intrinsic libraries).
Added decorations:
- LinkageAttributes
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D148749
eopXD [Thu, 25 May 2023 00:02:48 +0000 (17:02 -0700)]
[Clang][RISCV] Update vaadd.c test case with new script. NFC
Vitaly Buka [Wed, 24 May 2023 23:48:01 +0000 (16:48 -0700)]
[sanitizer] Deflake test
Craig Topper [Wed, 24 May 2023 23:37:16 +0000 (16:37 -0700)]
[RISCV] Remove some unneeded vmacc isel patterns.
The patterns are for a vpmerge with an all 1s mask, but we can
now handle that with a post-isel peephole.
Slava Zakharin [Wed, 24 May 2023 20:53:25 +0000 (13:53 -0700)]
[flang][hlfir][NFC] Make BOZ lowering a TODO.
This change just turns the unhandled BOZ fatal error into a TODO,
as in the non-HLFIR path.
Vitaly Buka [Wed, 24 May 2023 23:19:30 +0000 (16:19 -0700)]
[sanitizer] Use atomic_fetch_add instead of load/store
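The entry gives no rationale, but the usual reason for this kind of change is that a separate load followed by a store can lose updates when two threads interleave, while a fetch-add is a single read-modify-write. A minimal sketch using `std::atomic` as a stand-in for the sanitizer's own atomic helpers (`addLoadStore` and `addFetchAdd` are hypothetical names):

```cpp
#include <atomic>
#include <cstdint>

// Load/store pattern: another thread may update the counter between the
// load and the store, and that update is then overwritten.
inline void addLoadStore(std::atomic<std::uint64_t> &counter,
                         std::uint64_t delta) {
  const std::uint64_t old = counter.load(std::memory_order_relaxed);
  counter.store(old + delta, std::memory_order_relaxed);
}

// Fetch-add pattern: one indivisible read-modify-write. Returns the value
// the counter held before the addition.
inline std::uint64_t addFetchAdd(std::atomic<std::uint64_t> &counter,
                                 std::uint64_t delta) {
  return counter.fetch_add(delta, std::memory_order_relaxed);
}
```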
Vitaly Buka [Wed, 24 May 2023 23:12:00 +0000 (16:12 -0700)]
[NFC][sanitizer] Remove unused method
Rashmi Mudduluru [Wed, 24 May 2023 22:45:56 +0000 (15:45 -0700)]
[-Wunsafe-buffer-usage] Group variables associated by pointer assignments
Differential Revision: https://reviews.llvm.org/D145739
Vitaly Buka [Wed, 24 May 2023 22:59:26 +0000 (15:59 -0700)]
[sanitizer] Lazy initialize AllocatorGlobalStats
This allows having no InitLinkerInitialized and lets AllocatorGlobalStats
accept registration before allocator initialization.
Lei Zhang [Wed, 24 May 2023 22:48:54 +0000 (17:48 -0500)]
[mlir][capi] Drop mlirShapedTypeGetTypeID
ShapedType is a virtual type rather than a concrete one.
We don't have an implementation for this API either.
Reviewed By: makslevental
Differential Revision: https://reviews.llvm.org/D151376
Peng Sun [Wed, 24 May 2023 22:20:56 +0000 (15:20 -0700)]
[mlir][tosa] Update the limit of 62 on shift.
Aligns the shift requirement with the TOSA specification.
Reviewed By: eric-k256
Differential Revision: https://reviews.llvm.org/D151113
Nikolas Klauser [Wed, 24 May 2023 22:33:38 +0000 (15:33 -0700)]
[libc++][PSTL] Add a simple std::thread backend
This is just to test that the PSTL works with parallelization. This is not supposed to be a production-ready backend.
Reviewed By: ldionne, #libc
Spies: EricWF, arichardson, libcxx-commits
Differential Revision: https://reviews.llvm.org/D150284
Nishant Patel [Wed, 24 May 2023 22:23:19 +0000 (15:23 -0700)]
[mlir][spirv] Add a missing pattern to MathToSPIRV Conversion pass
The MathToSPIRV conversion pass missed out a pattern for converting
math::AbsIOP to spirv::CLSAbsOp
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D151378
Sterling Augustine [Wed, 24 May 2023 22:11:07 +0000 (15:11 -0700)]
remove useless visibility spec from bazel build.
Amara Emerson [Wed, 24 May 2023 21:46:20 +0000 (14:46 -0700)]
Pre-commit a min/max idiom breakage test for D149918
Tai Ly [Wed, 24 May 2023 21:43:02 +0000 (14:43 -0700)]
[TOSA] Refactor TosaMakeBroadcastable pass
This refactors and exposes EqualizeRanks utility function
from within TosaMakeBroadcastable pass so it may be used to
reshape operator inputs to equal ranks.
Signed-off-by: Tai Ly <tai.ly@arm.com>
Differential Revision: https://reviews.llvm.org/D150283
Advenam Tacet [Wed, 24 May 2023 21:26:14 +0000 (14:26 -0700)]
[2a/3][ASan][libcxx] std::deque annotations
This revision is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in std::vector, to std::string and `std::deque` collections. These changes allow ASan to detect cases when the instrumented program accesses memory which is internally allocated by the collection but is still not in-use (accesses before or after the stored elements for `std::deque`, or between the size and capacity bounds for `std::string`).
The motivation for the research and those changes was a bug, found by Trail of Bits, in real code where an out-of-bounds read could happen as two strings were compared via a std::equal function that took `iter1_begin`, `iter1_end`, `iter2_begin` iterators (with a custom comparison function). When object `iter1` was longer than `iter2`, an out-of-bounds read on `iter2` could happen. Container sanitization would detect it.
This revision introduces annotations for `std::deque`. Each chunk of the container can now be annotated using the `__sanitizer_annotate_double_ended_contiguous_container` function, which was added in the rG1c5ad6d2c01294a0decde43a88e9c27d7437d157. Any attempt to access poisoned memory will trigger an ASan error. Although false negatives are rare, they are possible due to limitations in the ASan API, where a few (usually up to 7) bytes before the container may remain unpoisoned. There are no false positives in the same way as with `std::vector` annotations.
This patch only supports objects (deques) that use the standard allocator. However, it can be easily extended to support all allocators, as suggested in the D146815 revision.
Furthermore, the patch includes the addition of the `is_double_ended_contiguous_container_asan_correct` function to libcxx/test/support/asan_testing.h. This function can be used to verify whether a `std::deque` object has been correctly annotated.
Finally, the patch extends the unit tests to verify ASan annotations (added LIBCPP_ASSERTs).
If a program is compiled without ASan, all helper functions will be no-ops. In binaries with ASan, there is a negligible performance impact since the code from the change is only executed when the deque container changes in size and it’s proportional to the change. It is important to note that regardless of whether or not these changes are in use, every access to the container's memory is instrumented.
Reviewed By: #libc, philnik
Spies: vitalybuka, hans, mikhail.ramalho, Enna1, #sanitizers, philnik, libcxx-commits
Differential Revision: https://reviews.llvm.org/D132092
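The out-of-bounds read that motivated this entry comes from the three-iterator form of `std::equal`, which assumes the second range is at least as long as the first. A minimal sketch of the bounds-safe alternative, using the four-iterator overload available since C++14 (`safeEqual` is a hypothetical helper and is not part of the patch):

```cpp
#include <algorithm>
#include <string>

// Compares two strings without ever reading past the end of either one.
bool safeEqual(const std::string &a, const std::string &b) {
  // The three-iterator call std::equal(a.begin(), a.end(), b.begin())
  // would read past the end of b whenever a is longer than b.
  // The four-iterator overload also takes b.end(), so ranges of different
  // lengths simply compare unequal.
  return std::equal(a.begin(), a.end(), b.begin(), b.end());
}
```

The container annotations described above are aimed at the cases where the unsafe form is used anyway: ASan can then report the read instead of letting it pass silently.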
Valentin Clement [Wed, 24 May 2023 21:00:24 +0000 (14:00 -0700)]
[flang][openacc][NFC] Remove unused function
Now that operands have moved to the new data operands lowering, this
function is not used anymore.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D151357
Matt Arsenault [Sat, 10 Dec 2022 13:28:11 +0000 (08:28 -0500)]
AMDGPU: Replace certain llvm.amdgcn.class uses with llvm.is.fpclass
Most transforms should now be performed on llvm.is.fpclass. Unlike the
generic intrinsic, this supports variable test masks.
Mehdi Amini [Wed, 24 May 2023 20:30:49 +0000 (13:30 -0700)]
Fix MLIR bytecode reading of i0 IntegerAttr
The move of the bytecode serialization to be tablegen driven in
https://reviews.llvm.org/D144820 added a new condition in the reading
path that forbids 0-sized integers, even though we still produce them.
Fix #62920
Differential Revision: https://reviews.llvm.org/D151372
AdityaK [Wed, 17 May 2023 18:30:13 +0000 (11:30 -0700)]
[libc++, std::vector] call the optimized version of __uninitialized_allocator_copy for trivial types
See: https://github.com/llvm/llvm-project/issues/61987
Fix suggested by: @philnik and @var-const
Reviewers: philnik, ldionne, EricWF, var-const
Differential Revision: https://reviews.llvm.org/D147741
Testing:
ninja check-cxx check-clang check-llvm
Benchmark Testcases (BM_CopyConstruct, and BM_Assignment) added.
performance improvement:
Run on (8 X 4800 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 1280 KiB (x4)
  L3 Unified 12288 KiB (x1)
Load Average: 1.66, 3.02, 2.43
Comparing build-runtimes-base/libcxx/benchmarks/vector_operations.libcxx.out to build-runtimes/libcxx/benchmarks/vector_operations.libcxx.out
Benchmark                                      Time      CPU   Time Old   Time New    CPU Old    CPU New
--------------------------------------------------------------------------------------------------------
BM_ConstructSize/vector_byte/5140480        +0.0362  +0.0362     116906     121132     116902     121131
BM_CopyConstruct/vector_int/5140480         -0.4563  -0.4577    1755224     954241    1755330     951987
BM_Assignment/vector_int/5140480            -0.0222  -0.0220     990045     968095     989917     968125
BM_ConstructSizeValue/vector_byte/5140480   +0.0308  +0.0307     116970     120567     116977     120573
BM_ConstructIterIter/vector_char/1024       -0.0831  -0.0831         19         17         19         17
BM_ConstructIterIter/vector_size_t/1024     +0.0129  +0.0131         88         89         88         89
BM_ConstructIterIter/vector_string/1024     -0.0064  -0.0018      54455      54109      54208      54112
OVERALL_GEOMEAN                             -0.0845  -0.0842          0          0          0          0
FYI, the perf improvements for BM_CopyConstruct due to this patch are mostly subsumed by https://reviews.llvm.org/D149826. However, this patch still adds value by converting the copy to a memmove (the second testcase).
Before the patch:
```
define linkonce_odr dso_local void @_ZNSt3__16vectorIiNS_9allocatorIiEEE18__construct_at_endIPiS5_EEvT_T0_m(ptr noundef nonnull align 8 dereferenceable(24) %0, ptr noundef %1, ptr noundef %2, i64 noundef %3) local_unnamed_addr #4 comdat align 2 {
%5 = getelementptr inbounds %"class.std::__1::vector", ptr %0, i64 0, i32 1
%6 = load ptr, ptr %5, align 8, !tbaa !12
%7 = icmp eq ptr %1, %2
br i1 %7, label %16, label %8
8: ; preds = %4, %8
%9 = phi ptr [ %13, %8 ], [ %1, %4 ]
%10 = phi ptr [ %14, %8 ], [ %6, %4 ]
%11 = icmp ne ptr %10, null
tail call void @llvm.assume(i1 %11)
%12 = load i32, ptr %9, align 4, !tbaa !14
store i32 %12, ptr %10, align 4, !tbaa !14
%13 = getelementptr inbounds i32, ptr %9, i64 1
%14 = getelementptr inbounds i32, ptr %10, i64 1
%15 = icmp eq ptr %13, %2
br i1 %15, label %16, label %8, !llvm.loop !16
16: ; preds = %8, %4
%17 = phi ptr [ %6, %4 ], [ %14, %8 ]
store ptr %17, ptr %5, align 8, !tbaa !12
ret void
}
```
After the patch:
```
define linkonce_odr dso_local void @_ZNSt3__16vectorIiNS_9allocatorIiEEE18__construct_at_endIPiS5_EEvT_T0_m(ptr noundef nonnull align 8 dereferenceable(24) %0, ptr noundef %1, ptr noundef %2, i64 noundef %3) local_unnamed_addr #4 comdat align 2 {
%5 = getelementptr inbounds %"class.std::__1::vector", ptr %0, i64 0, i32 1
%6 = load ptr, ptr %5, align 8, !tbaa !12
%7 = ptrtoint ptr %2 to i64
%8 = ptrtoint ptr %1 to i64
%9 = sub i64 %7, %8
%10 = ashr exact i64 %9, 2
tail call void @llvm.memmove.p0.p0.i64(ptr align 4 %6, ptr align 4 %1, i64 %9, i1 false)
%11 = getelementptr inbounds i32, ptr %6, i64 %10
store ptr %11, ptr %5, align 8, !tbaa !12
ret void
}
```
This is due to the optimized version of uninitialized_allocator_copy function.
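As a rough illustration of the dispatch this entry describes, a copy helper can select `memmove` for trivially copyable element types and fall back to element-wise construction otherwise. This is a sketch under that assumption, not the libc++ implementation; `uninitializedCopySketch` and `copyImpl` are hypothetical names:

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Trivially copyable elements: one bulk memmove over the whole range.
template <class T>
T *copyImpl(const T *first, const T *last, T *dest, std::true_type) {
  const std::size_t count = static_cast<std::size_t>(last - first);
  if (count != 0)
    std::memmove(dest, first, count * sizeof(T));
  return dest + count;
}

// Any other element type: copy-construct each element in place.
template <class T>
T *copyImpl(const T *first, const T *last, T *dest, std::false_type) {
  for (; first != last; ++first, ++dest)
    ::new (static_cast<void *>(dest)) T(*first);
  return dest;
}

// Picks one of the two overloads at compile time based on the element type
// and returns a pointer one past the last element written.
template <class T>
T *uninitializedCopySketch(const T *first, const T *last, T *dest) {
  return copyImpl(first, last, dest,
                  typename std::is_trivially_copyable<T>::type());
}
```

For `int`, the first overload is chosen, which corresponds to the single `llvm.memmove` call in the "after" IR rather than the load/store loop in the "before" IR.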
Vitaly Buka [Wed, 24 May 2023 20:23:10 +0000 (13:23 -0700)]
[NFC][HWASAN] Rename AllocatorSwallowThreadLocalCache
Vitaly Buka [Wed, 24 May 2023 07:31:15 +0000 (00:31 -0700)]
[lsan] Fix allocator_interface implementation
__sanitizer_get_current_allocated_bytes had a body, but allocator
caches were not registered to collect stats. It's done by
SizeClassAllocator64LocalCache::Init().
Reviewed By: thurston
Differential Revision: https://reviews.llvm.org/D151355
Ramkumar Ramachandra [Thu, 1 Dec 2022 10:28:51 +0000 (11:28 +0100)]
mlir/tosa: supply better documentation for tosa.pad
This patch modifies the description in TosaOps.td, taking into account
all the arguments, and supplying examples.
Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Differential Revision: https://reviews.llvm.org/D139089
Eugene Burmako [Wed, 24 May 2023 20:03:36 +0000 (22:03 +0200)]
[MLIR] Update Bazel build to finish moving PDL-related transform ops into an extension
https://reviews.llvm.org/D151104 moved PDL-related transform ops into an extension and updated the Bazel build, but one tiny thing fell through the cracks - TransformOpsPyFiles also needs to include the newly introduced `mlir/python/mlir/dialects/_transform_pdl_extension_ops_ext.py`.
Reviewed By: saugustine, bkramer
Differential Revision: https://reviews.llvm.org/D151368
Craig Topper [Wed, 24 May 2023 19:15:23 +0000 (12:15 -0700)]
LLVM_FALLTHROUGH => [[fallthrough]]. NFC
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D150996
Thurston Dang [Fri, 19 May 2023 21:32:54 +0000 (21:32 +0000)]
sanitizer_common: add test that calls wcslen
This is a very simple test that calls wcslen. There are currently no other HWASan tests that call wcslen, which is why the wcslen interceptor issue (triggered by https://reviews.llvm.org/D150708 and fixed in https://reviews.llvm.org/D150909) was only detected by stage2/hwasan check on the buildbots. With this test, the issue would have been caught by stage1 check-sanitizer, with a more obvious diagnosis.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D151000
Florian Hahn [Wed, 24 May 2023 19:16:41 +0000 (20:16 +0100)]
[IRGen] Handle infinite cycles in findDominatingStoreToReturnValue.
If there is an infinite cycle in the IR, the loop will never exit. Keep
track of visited basic blocks in a set and return nullptr if a block is
visited again.
Fixes #62830.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D151076
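The visited-set guard described in this entry can be sketched on a toy predecessor chain. The `Block` type and `findStoreSketch` below are hypothetical and far simpler than Clang's `findDominatingStoreToReturnValue`; they only show how recording visited blocks turns an endless walk into a `nullptr` result:

```cpp
#include <unordered_set>

// Hypothetical stand-in for a basic block with at most one predecessor.
struct Block {
  const Block *pred = nullptr; // unique predecessor, or nullptr at the entry
  bool hasStore = false;       // whether this block contains the store we want
};

// Walks up the predecessor chain looking for a block that has the store.
// Returns nullptr if the chain ends, or if a block is seen a second time,
// which means the chain is an infinite cycle.
const Block *findStoreSketch(const Block *start) {
  std::unordered_set<const Block *> visited;
  for (const Block *b = start; b != nullptr; b = b->pred) {
    if (!visited.insert(b).second)
      return nullptr; // already visited: stop instead of looping forever
    if (b->hasStore)
      return b;
  }
  return nullptr;
}
```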
Slava Zakharin [Wed, 24 May 2023 18:18:43 +0000 (11:18 -0700)]
[flang][hlfir] Create temporary for passing constant expression for actual arg.
Even though the constant expression actual argument is not definable,
and the associated dummy argument is not definable, the compiler may produce
implicit copies into the memory storage associated with the constant expression.
For example, a constant expression storage passed by reference to a subprogram
may be used for implicit copy-out:
```
subroutine sub(i, n)
interface
subroutine sub2(i)
integer :: i(*)
end subroutine sub2
end interface
integer :: i(n)
call sub2(i(3::2)) ! copy-out after the call will write to 'i'
end subroutine sub
subroutine test
call sub((/1,2,3,4,5/), 5)
end subroutine test
```
If we pass a reference to constant expression storage to 'sub' directly,
the copy-out inside 'sub' will try to write into readonly memory.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D151271
Artem Belevich [Tue, 23 May 2023 19:17:50 +0000 (12:17 -0700)]
[CUDA] Fix wrappers for sm_80 functions
Previous implementation provided wrappers for the internal implementations used
by CUDA headers. However, clang does not include those, so we need to provide
the public functions instead.
Differential Revision: https://reviews.llvm.org/D151243
Erich Keane [Thu, 18 May 2023 15:26:25 +0000 (08:26 -0700)]
Make dereferencing a void* a hard-error instead of warn-as-error
Clang 16 changed to consider dereferencing a void* to be a
warning-as-error, plus made this an error in SFINAE contexts, since this
resulted in incorrect template instantiation. When doing so, the Clang
16 documentation was updated to reflect that this was likely to change
again to a non-disablable error in the next version.
As there has been no response to changing from a warning to an error, I
believe this is a non-controversial change.
This patch changes this to be an Error, consistent with the standard and
other compilers.
This was discussed in this RFC:
https://discourse.llvm.org/t/rfc-can-we-stop-the-extension-to-allow-dereferencing-void-in-c/65708
Differential Revision: https://reviews.llvm.org/D150875
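For code that relied on the old extension, the usual fix is an explicit cast to a complete object type before dereferencing. A minimal sketch in C++, with `readInt` as a hypothetical helper:

```cpp
#include <cassert>

// Reads an int through an untyped pointer. Writing `*p` directly on a
// `const void *` is ill-formed; the pointer must first be cast to a pointer
// to a complete object type.
int readInt(const void *p) {
  assert(p != nullptr && "readInt expects a valid pointer");
  return *static_cast<const int *>(p);
}
```

The cast is only valid when the pointer really does refer to an `int`; the sketch does not check that.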
Michael Liao [Wed, 24 May 2023 18:08:43 +0000 (14:08 -0400)]
Fix shared library build again from 1c9a800. NFC
Sterling Augustine [Wed, 24 May 2023 18:04:42 +0000 (11:04 -0700)]
Disable MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS for bazel builds.
Med Ismail Bennani [Wed, 24 May 2023 17:52:47 +0000 (10:52 -0700)]
[lldb] Disable `watchpoint_callback.test` temporarily on darwin
This test started failing on the green-dragon bot, but after some
investigation, it doesn't have anything to do with Lua.
If we use a variable watchpoint with a condition using a scope variable,
and we then go out of scope, the watchpoint remains active, which can
cause the expression evaluator to fail to parse the watchpoint condition
(because of the missing variable bindings).
For now, we should disable this test until we come up with a fix for it.
rdar://109574319
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Vitaly Buka [Wed, 24 May 2023 17:41:56 +0000 (10:41 -0700)]
[NFC] New line in test
Valentin Clement [Wed, 24 May 2023 17:44:40 +0000 (10:44 -0700)]
[mlir][openacc] Use new reduction design in acc.loop
Use the new reduction design in acc.loop operation.
Depends on D151146
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D151164
Vitaly Buka [Wed, 24 May 2023 17:35:34 +0000 (10:35 -0700)]
[msan] Implement __sanitizer_get_current_allocated_bytes
__sanitizer_get_current_allocated_bytes had a body, but allocator
caches were not registered to collect stats. It's done by
SizeClassAllocator64LocalCache::Init().
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D151352
Philip Reames [Wed, 24 May 2023 17:40:06 +0000 (10:40 -0700)]
[RISCV] Use vfslide1down for build_vectors of non-constant floats
This adds the vfslide1down (and vfslide1up for consistency) nodes. These mostly parallel the existing vslide1down/up nodes. (See note below on instruction semantics.) We then use the vfslide1down in build_vector lowering instead of going through the stack.
The specification is more than a bit vague on the meaning of these instructions. All we're given is "The vfslide1down instruction is defined analogously, but sources its scalar argument from an f register."
We have to combine this with a general note at the beginning of section 10. Vector Arithmetic Instruction Formats which reads: "For floating-point operations, the scalar can be taken from a scalar f register. If FLEN > SEW, the value in the f registers is checked for a valid NaN-boxed value, in which case the least-significant SEW bits of the f register are used, else the canonical NaN value is used. Vector instructions where any floating-point vector operand’s EEW is not a supported floating-point type width (which includes when FLEN < SEW) are reserved.".
Note that floats are NaN-boxed when D is implemented.
Combining that all together, we're fine as long as the element type matches the vector type, which it does by construction. We shouldn't have legal vectors which hit the reserved encoding case. An assert is included, just to be careful.
Differential Revision: https://reviews.llvm.org/D151347
Valentin Clement [Wed, 24 May 2023 17:39:00 +0000 (10:39 -0700)]
[mlir][openacc] Add check for the private list in acc.serial
Add the missing check on private list information. The
check is the same as the one done for acc.parallel.
Depends on D151146
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D151149
Aaron Ballman [Wed, 24 May 2023 17:38:27 +0000 (13:38 -0400)]
Publicly document the C & C++ Language WG meetings
We've been hosting these meetings regularly for a while now, so this
begins advertising the meetings more widely.
Valentin Clement [Wed, 24 May 2023 17:38:01 +0000 (10:38 -0700)]
[mlir][openacc] Use new reduction design in acc.parallel
After D150818 the reduction clause is represented
with an acc.reduction.recipe operation and an operand.
This patch updates the acc.parallel op for the new design.
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D151146
Philip Reames [Wed, 24 May 2023 17:31:54 +0000 (10:31 -0700)]
[RISCV][InsertVSETVLI] Support constant VLs larger than immediate encoding
The immediate field on the vsetivli is fairly limited. For larger vectors, we end up having to materialize a constant in a register. We hadn't plumbed the infrastructure to treat such materialized constants as constants for purpose of vsetvli elimination.
I only bothered to handle LI. We could extend this to LUI sequences, but well, 2048 elements is probably enough for all practical fixed length vector codegen. :)
The test delta does point out a related problem. At LMUL8, we see increased register allocation pressure, and we should probably either a) address register allocation remat, or b) be less aggressive about eliminating vsetvlis at high lmul. Note that high LMUL code is not generated much by default.
Differential Revision: https://reviews.llvm.org/D151212
Kazu Hirata [Wed, 24 May 2023 17:36:38 +0000 (10:36 -0700)]
[mlir] Fix a warning
This patch fixes:
mlir/lib/Dialect/GPU/IR/GPUDialect.cpp:175:2: error: extra ';'
outside of a function is incompatible with C++98
[-Werror,-Wc++98-compat-extra-semi]
Amy Kwan [Wed, 24 May 2023 17:09:19 +0000 (12:09 -0500)]
Fix shared library build from 1c9a800.
Fix the shared library build failure on clang-ppc64le-rhel from 1c9a800 as seen
in: https://lab.llvm.org/buildbot/#/builders/57/builds/27080/steps/6/logs/stdio
Stefan Pintilie [Wed, 24 May 2023 16:33:54 +0000 (12:33 -0400)]
[PowerPC] Remove asserts from the disassembler.
My previous patch had added a couple of asserts to the disassembler.
The problem is that the disassembler is not used only for the
text section; it is also used to disassemble the data section of an
object, where the bytes do not necessarily represent instructions. If
data in the data section happens to look like an illegal instruction,
llvm-objdump will assert on it, even though the bytes are not actually
an instruction at all.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D149711
Vitaly Buka [Wed, 24 May 2023 17:21:03 +0000 (10:21 -0700)]
[tsan] Implement __sanitizer_purge_allocator
Alex Langford [Tue, 23 May 2023 17:20:35 +0000 (10:20 -0700)]
[DebugInfo] Follow-up to D151001
I landed D151001 before it had gotten sign-off from all the reviewers.
This is a follow-up to address the additional feedback.
Differential Revision: https://reviews.llvm.org/D151233
Kun Wu [Wed, 24 May 2023 16:43:38 +0000 (09:43 -0700)]
[mlir] [gpu] [sparse] refined SparseHandle type
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D151014
Kelvin Li [Tue, 23 May 2023 23:02:49 +0000 (19:02 -0400)]
[flang] Support for PowerPC vector type
The following PowerPC vector type syntax is added:
VECTOR ( element-type-spec )
where element-type-spec is integer-type-spec, real-type-spec or unsigned-type-spec.
Two opaque types (__VECTOR_PAIR and __VECTOR_QUAD) are also added.
A finite set of functionalities are implemented in order to support the new types:
1. declare objects
2. declare function result
3. declare type dummy arguments
4. intrinsic assignment between the new type objects (e.g. v1=v2)
5. reference functions that return the new types
Submitted on behalf of @tislam @danielcchen
Authors: @tislam @danielcchen
Differential Revision: https://reviews.llvm.org/D150876
Vitaly Buka [Wed, 24 May 2023 16:58:32 +0000 (09:58 -0700)]
Revert "[PowerPC] Simplify fp-to-int store optimization"
Breaks https://lab.llvm.org/buildbot/#/builders/18/builds/9118
This reverts commit 8064caf83fb166b709bfe0e7641c5181341cb064.
Sterling Augustine [Wed, 24 May 2023 16:35:53 +0000 (09:35 -0700)]
Fix bazel build for https://reviews.llvm.org/D144552
Differential Revision: https://reviews.llvm.org/D151346
Jay Foad [Wed, 24 May 2023 11:00:01 +0000 (12:00 +0100)]
[MachineVerifier] Try harder to verify LiveIntervals
Verify the LiveIntervals analysis after a pass that claims to preserve
it, even if there are no further passes (apart from the verifier itself)
that would use the analysis.
Fixes https://github.com/llvm/llvm-project/issues/46217
Differential Revision: https://reviews.llvm.org/D129208
John Brawn [Wed, 17 May 2023 15:52:26 +0000 (16:52 +0100)]
[clang] Don't define predefined macros multiple times
Fix several instances of macros being defined multiple times
in several targets. Most of these are just simple duplication in a
TargetInfo or OSTargetInfo of things already defined in
InitializePredefinedMacros or InitializeStandardPredefinedMacros,
but there are a few that aren't:
* AArch64 defines a couple of feature macros for armv8.1a that are
handled generically by getTargetDefines.
* CSKY needs to take care when CPUName and ArchName are the same.
* Many os/target combinations result in __ELF__ being defined twice.
Instead define __ELF__ just once in InitPreprocessor based on
the Triple, which already knows what the object format is based
on os and target.
These changes shouldn't change the final result of which macros are
defined, with the exception of the changes to __ELF__ where if you
explicitly specify the object type in the triple then this affects
if __ELF__ is defined, e.g. --target=i686-windows-elf results in it
being defined where it wasn't before, but this is more accurate as an
ELF file is in fact generated.
Differential Revision: https://reviews.llvm.org/D150966
Marco Elver [Wed, 24 May 2023 16:24:55 +0000 (18:24 +0200)]
Fix "[HWASan] Use ASM_WRAPPER_NAME instead of __interceptor_*"
Fix typo introduced in 2f1e2a6b1ca2.
Reported-by: RamNalamothu
Tom Eccles [Wed, 24 May 2023 16:05:25 +0000 (16:05 +0000)]
Revert "[flang] use greedy mlir driver for stack arrays pass"
This reverts commit 74c2ec50f393bad8b31d0dd0bd8b2ff44d361198.
This caused a regression building spec2017 with -Ofast.
Matthias Braun [Fri, 19 May 2023 19:47:37 +0000 (12:47 -0700)]
Bump coalescing limit
This bumps the "large-interval-freq-threshold" limit in the register
coalescer to 256. The limit was introduced in
https://reviews.llvm.org/D59143 without much justification for the particular
value "100", so I hope bumping it is ok.
This change is motivated by bad codegen for the popular crc32c
algorithm; the code is often based/copied from this implementation:
https://github.com/htot/crc32c/blob/master/crc32c/crc32intelc.cc which
uses a duffs-device pattern with 128 switch-cases. There are examples in
RocksDB (https://github.com/facebook/rocksdb/blob/main/util/crc32c.cc)
and Folly
(https://github.com/facebook/folly/blob/main/folly/hash/detail/Crc32cDetail.cpp)
which are important use cases for us.
Differential Revision: https://reviews.llvm.org/D150994
Nemanja Ivanovic [Wed, 24 May 2023 16:12:19 +0000 (11:12 -0500)]
[PowerPC] Do not attempt to combine fptoui without FPCVT
Commit 8064caf83fb166b709bfe0e7641c5181341cb064 added a call
to a function that performs this combine without checking whether
the target supports FPCVT. This caused asserts to trip on BE bots
as the default target does not have this feature.
Mark de Wever [Tue, 23 May 2023 15:22:12 +0000 (17:22 +0200)]
[libc++] Fixes clang-tidy plugin for clang-tidy 17.
When used with clang-tidy 17, Node.getAttrName() sometimes returns a
nullptr. This caused segfaults in the CI.
Reviewed By: philnik, #libc
Differential Revision: https://reviews.llvm.org/D151224
Kazu Hirata [Wed, 24 May 2023 16:05:50 +0000 (09:05 -0700)]
[ARM] Remove unused functions isExpImmValue, isExpImm, and isInvertedExpImm
The last uses were removed by:
commit 772e4931932270a82f38c83d4344c800b2f54eff
Author: Simon Tatham <simon.tatham@arm.com>
Date: Thu Jan 23 11:53:27 2020 +0000
Differential Revision: https://reviews.llvm.org/D151299
Matt Arsenault [Wed, 24 May 2023 14:59:42 +0000 (15:59 +0100)]
AMDGPU: Add some new tests for class undef/poison handling
Nikolas Klauser [Wed, 24 May 2023 15:46:13 +0000 (08:46 -0700)]
Revert "[libc++] Apply _LIBCPP_EXCLUDE_FROM_EXPLICIT_INSTANTIATION only in classes that we have instantiated externally"
This reverts commit b3c9150062dc4264afb4a3d2790f071c1ebe0743.
There were unexpected breakages downstream. @EricWF is investigating.
Harsh Menon [Wed, 24 May 2023 01:03:59 +0000 (18:03 -0700)]
[mlir] Add support for multiple uses in transform.structured.fuse_into_containing_op
In the tile and fuse of the first extract use, we add support
for scenarios where the results of the tiled op have uses
that are dominated by the scf.for_all. Specifically, we replace
the scf.for_all with a new scf.for_all that has an additional
shared_out and add the appropriate parallel insert slice op.
Differential Revision: https://reviews.llvm.org/D151275
Vitaly Buka [Wed, 24 May 2023 07:31:15 +0000 (00:31 -0700)]
[sanitizer] Add allocator_interface test
Hooks are in malloc_hook.cpp.
Mark de Wever [Wed, 17 May 2023 17:17:52 +0000 (19:17 +0200)]
[libc++][format] Removes the experimental status.
The code has been quite ready for a while now and there are no more ABI
breaking papers. So this is a good time to mark the feature as stable.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D150802
Guillaume Chatelet [Wed, 24 May 2023 14:43:00 +0000 (14:43 +0000)]
[libc] simplify test for getrandom
`getrandom` is implemented as a syscall.
We don't want to test the Linux implementation of the syscall. We just want to verify that it reacts as expected to sensible values.
Runtime before
```
[ RUN ] LlvmLibcGetRandomTest.InvalidFlag
[ OK ] LlvmLibcGetRandomTest.InvalidFlag (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.InvalidBuffer
[ OK ] LlvmLibcGetRandomTest.InvalidBuffer (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.ReturnsSize
[ OK ] LlvmLibcGetRandomTest.ReturnsSize (took 83 ms)
[ RUN ] LlvmLibcGetRandomTest.PiEstimation
[ OK ] LlvmLibcGetRandomTest.PiEstimation (took 9882 ms)
```
Runtime after
```
[ RUN ] LlvmLibcGetRandomTest.InvalidFlag
[ OK ] LlvmLibcGetRandomTest.InvalidFlag (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.InvalidBuffer
[ OK ] LlvmLibcGetRandomTest.InvalidBuffer (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.ReturnsSize
[ OK ] LlvmLibcGetRandomTest.ReturnsSize (took 0 ms)
[ RUN ] LlvmLibcGetRandomTest.CheckValue
[ OK ] LlvmLibcGetRandomTest.CheckValue (took 0 ms)
```
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D151336
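A minimal sketch of the kind of cheap checks the simplified test names suggest, written against the Linux `getrandom` wrapper from `<sys/random.h>` rather than the LLVM libc test framework. The helper `tryGetRandom` is hypothetical, and the sketch assumes a Linux host whose libc provides that header.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/random.h>
#include <sys/types.h>

// Hypothetical helper: request up to 64 random bytes with the given flags.
// Returns what getrandom returned and stores errno (or 0) in `err`.
ssize_t tryGetRandom(std::size_t len, unsigned flags, int &err) {
  unsigned char buf[64] = {};
  if (len > sizeof(buf))
    len = sizeof(buf);
  errno = 0;
  const ssize_t n = getrandom(buf, len, flags);
  err = (n < 0) ? errno : 0;
  return n;
}
```

Calling `tryGetRandom(16, ~0u, err)` passes flag bits the kernel does not define, so the call is expected to fail with `EINVAL`, while `tryGetRandom(16, 0, err)` is expected to return 16. Neither check needs a statistical loop like the removed PiEstimation test.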
Peter Klausler [Tue, 23 May 2023 20:55:23 +0000 (13:55 -0700)]
[flang] Fix SPACING() of very small values
SPACING() must return TINY() for zero arguments (which we do)
and also for subnormal values smaller than TINY() in absolute value,
which we get wrong. Fix folding and the runtime.
Differential Revision: https://reviews.llvm.org/D151272
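A minimal sketch of the behaviour described above for single precision, assuming the Fortran standard's definition SPACING(X) = 2**max(e-p, emin-1) with TINY(X) = 2**(emin-1). The name `spacingSketch` is hypothetical, the NaN result for non-finite input is an assumption, and this is not the flang folding or runtime code.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical single-precision SPACING(): distance between adjacent model
// numbers near x, clamped from below at TINY().
float spacingSketch(float x) {
  using Limits = std::numeric_limits<float>;
  if (x == 0.0f)
    return Limits::min(); // TINY() for zero arguments.
  if (!std::isfinite(x))
    return Limits::quiet_NaN(); // Assumption: NaN for infinities and NaNs.
  int e = 0;
  std::frexp(x, &e); // x = f * 2^e with 0.5 <= |f| < 1.
  // Limits::digits is p and Limits::min_exponent is emin in this model.
  // Subnormal inputs produce a very negative e, so the max() picks emin-1
  // and the result is TINY().
  const int exponent = std::max(e - Limits::digits, Limits::min_exponent - 1);
  return std::ldexp(1.0f, exponent);
}
```

The clamp to `min_exponent - 1` is what makes subnormal arguments yield TINY() instead of a smaller power of two.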
Christian Ulmann [Wed, 24 May 2023 14:52:54 +0000 (14:52 +0000)]
[mlir][LLVM] Fix aliasing in intrinsic base class
This commit fixes a bug in the intrinsic base class that caused the
declaration of alias analysis attributes under a wrong condition.
Valentin Clement [Wed, 24 May 2023 14:57:38 +0000 (07:57 -0700)]
[mlir][openacc] destroy region on firstprivate.recipe is optional
The destroy region is optional but the verifier was enforcing it.
Update the verifier and make it clear in the definition.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D151239
Luke Lau [Mon, 22 May 2023 16:51:32 +0000 (17:51 +0100)]
[RISCV] Scalarize constant stores of fixed vectors if small enough
For stores of small fixed-length vector constants, we can store them
with a sequence of lui/addi/sh/sw to avoid the cost of building the
vector and the vsetivli toggle, provided the constant materialization
cost isn't too high.
This subsumes the optimisation for stores of zeroes in
4dc9a2c5b93682c12d7a80bbe790b14ddb301877.
(This is a reapply of 0ca13f9d2701e23af2d000a5d8f48b33fe0878b7.)
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D151221
Luke Lau [Wed, 24 May 2023 14:52:19 +0000 (15:52 +0100)]
Revert "[RISCV] Scalarize constant stores of fixed vectors up to 32 bits"
This reverts commit 0ca13f9d2701e23af2d000a5d8f48b33fe0878b7.
Philip Reames [Wed, 24 May 2023 14:43:52 +0000 (07:43 -0700)]
[RISCV] Add test coverage for buildvector of FP values
Matt Arsenault [Wed, 24 May 2023 07:52:22 +0000 (08:52 +0100)]
Inline: Convert test to generated checks
Matt Arsenault [Wed, 24 May 2023 13:24:49 +0000 (14:24 +0100)]
IR: Avoid include in FMF header
Matthias Springer [Wed, 24 May 2023 14:30:57 +0000 (16:30 +0200)]
[mlir][Transforms] Fix mlir-config flag check
Boolean compiler flags (such as `DMLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`) show up in `mlir-config.h` as preprocessor defines that are either 0 or 1. Use `#if` instead of `#ifdef`.
This should have been part of D144552.
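A minimal sketch of why `#ifdef` is the wrong test for a flag that is always defined as 0 or 1. The macro `EXAMPLE_ENABLE_EXPENSIVE_CHECKS` is a hypothetical stand-in for a define emitted into a generated config header.

```cpp
// Hypothetical flag, defined to 0 the way a disabled boolean option would be.
#define EXAMPLE_ENABLE_EXPENSIVE_CHECKS 0

// #ifdef only asks whether the macro exists, so a value of 0 still counts.
constexpr bool enabledAccordingToIfdef() {
#ifdef EXAMPLE_ENABLE_EXPENSIVE_CHECKS
  return true;
#else
  return false;
#endif
}

// #if evaluates the macro's value, so 0 correctly disables the code.
constexpr bool enabledAccordingToIf() {
#if EXAMPLE_ENABLE_EXPENSIVE_CHECKS
  return true;
#else
  return false;
#endif
}

static_assert(enabledAccordingToIfdef(), "#ifdef sees the macro as defined");
static_assert(!enabledAccordingToIf(), "#if sees the value 0");
```

With the flag defined to 0, the `#ifdef` variant would compile the expensive checks in even though they were requested off.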
Luke Lau [Mon, 22 May 2023 16:51:32 +0000 (17:51 +0100)]
[RISCV] Scalarize constant stores of fixed vectors up to 32 bits
For stores of small fixed-length vector constants, we can store them
with a sequence of lui/addi/sh/sw to avoid the cost of building the
vector and the vsetivli toggle.
Note that this only handles vectors that are 32 bits or smaller, but
could be expanded to 64 bits if we know that the constant
materialization cost isn't too high.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D151221
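A minimal sketch of the idea at the source level: a constant vector of four 8-bit lanes occupies the same bytes as one 32-bit scalar, so a single scalar store can replace building the vector. The helper `packLanes` and the little-endian lane order are assumptions for illustration; the actual transform operates in the RISC-V backend, not in C++.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: pack four 8-bit lanes into one 32-bit value, with
// lane 0 in the least significant byte (little-endian lane order).
std::uint32_t packLanes(const std::array<std::uint8_t, 4> &lanes) {
  return static_cast<std::uint32_t>(lanes[0]) |
         (static_cast<std::uint32_t>(lanes[1]) << 8) |
         (static_cast<std::uint32_t>(lanes[2]) << 16) |
         (static_cast<std::uint32_t>(lanes[3]) << 24);
}
```

For lanes {1, 2, 3, 4} this produces 0x04030201, a value that can be materialized with a short lui/addi sequence and written with one sw.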
Luke Lau [Mon, 22 May 2023 16:49:45 +0000 (17:49 +0100)]
[RISCV] Add test cases for storing small constant vectors
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D151220
Doru Bercea [Wed, 24 May 2023 14:14:43 +0000 (10:14 -0400)]
Enable up to 64 arguments for outlined regions in OpenMP device code.
Co-Author: Fabio Luporini <fabio@devitocodes.com>
Review: https://reviews.llvm.org/D150134
Jie Fu [Wed, 24 May 2023 14:21:56 +0000 (22:21 +0800)]
[MergeICmps] Fix -Wsign-compare and typos (NFC)
/data/llvm-project/llvm/lib/Transforms/Scalar/MergeICmps.cpp:623:21: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
  for (int i = 0; i < Comparisons.size(); i++) {
                  ~ ^ ~~~~~~~~~~~~~~~~~~
1 error generated.
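A minimal sketch of one common way to silence this kind of diagnostic: give the index the same unsigned type that `size()` returns. The function `countNonZero` and its vector argument are hypothetical, and the sketch does not claim to match the exact change made in MergeICmps.cpp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical loop with the same shape as the one in the diagnostic.
std::size_t countNonZero(const std::vector<int> &Comparisons) {
  std::size_t Count = 0;
  // The index is a size_t, matching the type of Comparisons.size(), so both
  // sides of the comparison are unsigned and -Wsign-compare stays quiet.
  for (std::size_t i = 0, e = Comparisons.size(); i < e; ++i)
    if (Comparisons[i] != 0)
      ++Count;
  return Count;
}
```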
Matthias Springer [Wed, 24 May 2023 14:14:47 +0000 (16:14 +0200)]
[mlir][Transforms] GreedyPatternRewriteDriver debugging: Detect faulty patterns
Compute operation fingerprints to detect incorrect API usage in RewritePatterns. Does not work for dialect conversion patterns.
Detect patterns that:
* Returned `failure` but changed the IR.
* Returned `success` but did not change the IR.
* Inserted/removed/modified ops, bypassing the rewriter. Not all cases are detected.
These new checks are quite expensive, so they are only enabled with `-DMLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON`. Failures manifest as fatal errors (`llvm::report_fatal_error`) or crashes (accessing deallocated memory). To get better debugging information, run `mlir-opt -debug` (to see which pattern is broken) with ASAN (to see where memory was deallocated).
Differential Revision: https://reviews.llvm.org/D144552
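A minimal sketch of the fingerprint idea on a toy IR, to show how the first two failure classes can be told apart. `ToyIR`, `fingerprint`, `Verdict` and `checkPattern` are hypothetical names; none of this is the MLIR API, and a real fingerprint would cover far more state than a list of strings.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for IR: an ordered list of operation names.
using ToyIR = std::vector<std::string>;

// Combine the hashes of all operations into one order-sensitive fingerprint.
std::size_t fingerprint(const ToyIR &ir) {
  std::size_t h = 0;
  for (const std::string &op : ir)
    h = h * 31 + std::hash<std::string>{}(op);
  return h;
}

enum class Verdict { Ok, FailureButChanged, SuccessButUnchanged };

// Run a pattern and compare fingerprints taken before and after it.
Verdict checkPattern(ToyIR &ir, const std::function<bool(ToyIR &)> &pattern) {
  const std::size_t before = fingerprint(ir);
  const bool succeeded = pattern(ir);
  const bool changed = fingerprint(ir) != before;
  if (!succeeded && changed)
    return Verdict::FailureButChanged; // Reported failure, yet mutated the IR.
  if (succeeded && !changed)
    return Verdict::SuccessButUnchanged; // Reported success, yet did nothing.
  return Verdict::Ok;
}
```

Hashing everything before and after every pattern application is costly, which is consistent with the checks being opt-in. A hash collision can also hide a change, so a scheme like this detects many faulty patterns but not all of them.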
Jay Foad [Wed, 24 May 2023 11:00:01 +0000 (12:00 +0100)]
[RegisterCoalescer] Fix updating LiveIntervals in joinReservedPhysReg
Live intervals for physical registers are calculated lazily on demand.
In a case like this:
16B %0:gpr32 = IMPLICIT_DEF
32B $wzr = COPY %0
if the live interval for $wzr did not already exist then the update code
in joinReservedPhysReg would create it with a definition at 32B, which
would remain even after the COPY was deleted.
Differential Revision: https://reviews.llvm.org/D151314
Jay Foad [Fri, 5 May 2023 09:51:28 +0000 (10:51 +0100)]
[MachineVerifier] Verify liveins for live-through segments
Differential Revision: https://reviews.llvm.org/D149947
Zhongyunde [Wed, 24 May 2023 13:16:41 +0000 (21:16 +0800)]
Reland [MergeICmps] Adapt to non-eq comparisons, bugfix
1. Fix the last runtime issue: some sequential comparisons need to be split.
For the original equal-comparison chain, the new split Icmp chain still
ends with an equal comparison; for the new not-equal-comparison chain, the
new split Icmp chain also still ends with an equal comparison, so this must
be handled carefully. See the test case partial_sequent_ne for details.
2. Fix the mismatch of the last link comparison.
Thanks to @aeubanks, @glandium and @ayzhao for reporting the runtime issue
and examining it carefully.
Fix https://github.com/llvm/llvm-project/issues/59740.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D141188
Matthias Springer [Wed, 24 May 2023 13:59:29 +0000 (15:59 +0200)]
[mlir][Transforms][NFC] GreedyPatternRewriteDriver: Reformat debug logic
Do not duplicate code that is performing actual work, put debug code around it.
Differential Revision: https://reviews.llvm.org/D151207
Jay Foad [Wed, 24 May 2023 12:28:07 +0000 (13:28 +0100)]
[AMDGPU] Switch to backwards scavenging in non-spill cases
When the scavenger is not allowed to spill, the only difference between
forward and backward should be the heuristics used to pick an available
register. Forwards scavenging tries to pick a register that can be used
again later in the BB; backwards scavenging tries to pick one that can
be used earlier.
Backwards scavenging is preferred because it does not rely on accurate
kill flags.
Differential Revision: https://reviews.llvm.org/D151323
Sheng [Wed, 24 May 2023 14:02:41 +0000 (22:02 +0800)]
[clang][NFC] Add a blank line in ReleaseNotes.rst
A buildbot has failed on the absence of the blank line at the end of the bullet list.
Sheng [Wed, 24 May 2023 13:45:03 +0000 (21:45 +0800)]
[clang][Sema] Fix a crash when instantiating a non-type template argument in a dependent scope.
The type alias template is not diagnosed when instantiating an expected non-type template argument in a dependent scope, causing an ICE.
Besides that, the diagnostic message has been updated to account for the fact that the function template is not the only non-type template.
Fixes #62533
Reviewed By: #clang-language-wg, erichkeane
Differential Revision: https://reviews.llvm.org/D151062
Hansang Bae [Thu, 4 May 2023 16:06:12 +0000 (11:06 -0500)]
[OpenMP][libomp] Implement KMP_DLSYM_NEXT on Windows
The interop API routines try to invoke external entries, but we did
not have support for KMP_DLSYM_NEXT on Windows. Also added proper
guards for STUB build.
Differential Revision: https://reviews.llvm.org/D149892
Clement Courbet [Wed, 24 May 2023 13:21:50 +0000 (15:21 +0200)]
[clang-tidy] Really fix rG9182c679dde7
Correct link is clang-tidy/checks/performance/no-automatic-move