Martin Storsjö [Tue, 13 Jul 2021 12:39:54 +0000 (12:39 +0000)]
[libcxx] [test] Clarify weak_ptr_ret on Windows, remove a LIBCXX-WINDOWS-FIXME
On Windows, structs with a destructor are always returned indirectly;
add this to the list of known exceptions in the test where the class
isn't returned in registers as expected.
Differential Revision: https://reviews.llvm.org/D105906
Dmitry Vyukov [Mon, 12 Jul 2021 19:06:28 +0000 (12:06 -0700)]
sanitizer_common: add simpler ThreadRegistry ctor
Currently ThreadRegistry is overcomplicated because of tsan:
it needs tid quarantine and reuse counters. Other sanitizers
don't need that. It also seems that no other sanitizer now
needs a max number of threads. Asan used to need a 2^24 limit,
but it does not seem to be needed now. Other sanitizers blindly
copy-pasted that without reason. Lsan also uses quarantine,
but I don't see why it would be needed.
Add a ThreadRegistry ctor that does not require any sizes
and use it in all sanitizers except for tsan.
This is in preparation for the new tsan runtime, which won't need
any of these parameters either.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105713
Yuichi Yoshida [Wed, 14 Jul 2021 05:47:31 +0000 (05:47 +0000)]
Reformulate OrcJIT tutorial doc to make it more clear.
Fixed a minor writing error. The text was hard to understand.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D105899
Zakk Chen [Wed, 14 Jul 2021 03:32:55 +0000 (20:32 -0700)]
[RISCV] Support overloading for RVV miscellaneous functions.
Based on this update to the intrinsic doc
https://github.com/riscv/rvv-intrinsic-doc/pull/103
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D105611
Vitaly Buka [Wed, 14 Jul 2021 04:29:14 +0000 (21:29 -0700)]
[sanitizer] Fix type error in python 3
Vitaly Buka [Wed, 14 Jul 2021 03:45:04 +0000 (20:45 -0700)]
[sanitizer] Upgrade android scripts to python 3
David Green [Wed, 14 Jul 2021 03:40:47 +0000 (04:40 +0100)]
Revert "[clang] Refactor AST printing tests to share more infrastructure"
This reverts commit 20176bc7dd3f431db4c3d59b51a9f53d52190c82, as some
versions of GCC do not seem to handle the new code very well. They
complain about:
/tmp/ccqUQZyw.s: Assembler messages:
/tmp/ccqUQZyw.s:1151: Error: symbol `_ZNSt14_Function_base13_Base_managerIN5clangUlPKNS1_4StmtEE2_EE10_M_managerERSt9_Any_dataRKS7_St18_Manager_operation' is already defined
/tmp/ccqUQZyw.s:11963: Error: symbol `_ZNSt17_Function_handlerIFbPKN5clang4StmtEENS0_UlS3_E2_EE9_M_invokeERKSt9_Any_dataOS3_' is already defined
This seems like it is some GCC issue, but multiple buildbots (and my
local machine) are all failing because of it.
Vitaly Buka [Wed, 14 Jul 2021 03:38:45 +0000 (20:38 -0700)]
[sanitizer] Convert script to python 3
Michael Kruse [Wed, 14 Jul 2021 03:33:56 +0000 (22:33 -0500)]
[Polly] Fix typo. NFC.
Thanks to Mugerwa Martin for reporting.
Jinsong Ji [Wed, 14 Jul 2021 03:29:06 +0000 (03:29 +0000)]
[AIX] Update testcase to use aix triple
We have now implemented the basic MCAsmParser, so we can use the triple
directly.
Hongtao Yu [Wed, 14 Jul 2021 02:49:50 +0000 (19:49 -0700)]
[CSSPGO][llvm-profgen] Fix a missing initialization
Fixing a missing initialization that was accidentally caused by https://reviews.llvm.org/D103178 .
Hongtao Yu [Wed, 14 Jul 2021 02:48:58 +0000 (19:48 -0700)]
Revert "[CSSPGO][llvm-profgen] Fix a missing initialization"
This reverts commit fef5f4456abcb1ea052206db6c232468d70b07f2.
Hongtao Yu [Wed, 14 Jul 2021 02:45:48 +0000 (19:45 -0700)]
[CSSPGO][llvm-profgen] Fix a missing initialization
Fixing a missing initialization that was accidentally caused by https://reviews.llvm.org/D103178 .
Shilei Tian [Wed, 14 Jul 2021 02:28:26 +0000 (22:28 -0400)]
[AbstractAttributor] Fold function calls to `__kmpc_is_spmd_exec_mode` if possible
In the device runtime there are many function calls to `__kmpc_is_spmd_exec_mode`
to query the execution mode of current kernels. In many cases, user programs
only contain target region executing in one mode. As a consequence, those runtime
function calls will only return one value. If we can get rid of these function
calls during compilation, it can potentially improve performance.
In this patch, we use `AAKernelInfo` to analyze kernel execution. Basically, for
each kernel (device) function `F`, we collect all kernel entries `K` that can
reach `F`. A new AA, `AAFoldRuntimeCall`, is created for each call site. In each
iteration, it will check all reaching kernel entries, and update the folded value
accordingly.
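The folding rule described above can be sketched as a small toy model. This is not the Attributor code: `ExecMode` and `foldIsSPMD` are hypothetical names introduced only for illustration, and the list of reaching kernels is assumed to be already computed.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for the execution mode of a kernel entry.
enum class ExecMode { Generic, SPMD };

// Returns the constant a call to __kmpc_is_spmd_exec_mode could be folded to,
// given the modes of all kernel entries that reach the call site, or
// std::nullopt when the reaching kernels disagree (or none is known).
std::optional<bool> foldIsSPMD(const std::vector<ExecMode> &ReachingKernels) {
  if (ReachingKernels.empty())
    return std::nullopt;
  const ExecMode First = ReachingKernels.front();
  for (ExecMode M : ReachingKernels)
    if (M != First)
      return std::nullopt; // mixed modes: the runtime call has to stay
  return First == ExecMode::SPMD;
}
```

A call site reached only by SPMD kernels folds to true, one reached only by generic kernels folds to false, and anything mixed is left alone.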
In the future we will support more functions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105787
Philip Reames [Tue, 13 Jul 2021 20:30:44 +0000 (13:30 -0700)]
[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into
the path where we can actually get a zero stride flowing through. Some
fairly simple proofs handle the cases which show up in practice. The
only test changes are the cases where we really do need a non-zero
divider to produce the right result.
Recommitting with isLoopInvariant() check.
Differential Revision: https://reviews.llvm.org/D105921
Richard Smith [Wed, 14 Jul 2021 01:28:45 +0000 (18:28 -0700)]
Fix test trying to write a spurious output file into the source
directory.
This causes test failures if the source directory is read-only.
Hongtao Yu [Wed, 14 Jul 2021 01:30:16 +0000 (18:30 -0700)]
[NFC][CSSPGO] Rename the name of an enum value.
Hongtao Yu [Wed, 30 Jun 2021 23:52:37 +0000 (16:52 -0700)]
[CSSPGO] Do not import pseudo probe desc in thinLTO
Previously we relied on pseudo probe descriptors to look up precomputed GUID during probe emission for inlined probes. Since we are moving to always using unique linkage names, GUID for functions can be computed in place from dwarf names. This eliminates the need of importing pseudo probe descs in thinlto, since those descs should be emitted by the original modules.
This significantly reduces thinlto memory footprint in some extreme case where the number of imported modules for a single module is massive.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D105248
Hongtao Yu [Mon, 12 Jul 2021 16:47:05 +0000 (09:47 -0700)]
[CSSPGO][llvm-profgen] Allow multiple executable load segments.
The linker or post-link optimizer can create an ELF image with multiple executable segments each of which will be loaded separately at run time. This breaks the assumption of llvm-profgen that currently only supports one base load address. What it ends up with is that the subsequent mmap events will be treated as an overwrite of the first mmap event which will in turn screw up address mapping. While it is non-trivial to support multiple separate load addresses and given that on x64 those segments will always be loaded at consecutive addresses (though via separate mmap sys calls), I'm adding an error checking logic to bail out if that's violated and keep using a single load address which is the address of the first executable segment.
Also changing the disassembly output from printing section offset to printing the virtual address instead, which matches the behavior of objdump.
Differential Revision: https://reviews.llvm.org/D103178
Vitaly Buka [Wed, 14 Jul 2021 00:26:07 +0000 (17:26 -0700)]
[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
Jon Roelofs [Wed, 14 Jul 2021 01:05:30 +0000 (18:05 -0700)]
[AArch64] rm unused subreg's
Jon Roelofs [Wed, 14 Jul 2021 00:08:45 +0000 (17:08 -0700)]
[AArch64] Fix AArch64::dsub's size
Arthur Eubanks [Wed, 14 Jul 2021 00:51:44 +0000 (17:51 -0700)]
Revert "[SCEV] Handle zero stride correctly in howManyLessThans"
This reverts commit 4df591b5c960affd1612e330d0c9cd3076c18053.
Causes crashes, see comments on D105921.
Vitaly Buka [Wed, 14 Jul 2021 00:42:59 +0000 (17:42 -0700)]
Revert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer"
Does not compile.
This reverts commit 8725b382b0a5ea375252d966bafbace62a21e93b.
Jessica Paquette [Tue, 13 Jul 2021 22:21:58 +0000 (15:21 -0700)]
[AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal
Allow
```
%x:_<2 x p0> = G_INTTOPTR %y:_<2 x s64>
```
This shows up when building clang for AArch64 with GlobalISel.
Also show that we can select it.
This should match SDAG's behaviour: https://godbolt.org/z/33oqYoaYv
Differential Revision: https://reviews.llvm.org/D105944
Vitaly Buka [Wed, 14 Jul 2021 00:26:07 +0000 (17:26 -0700)]
[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
Vitaly Buka [Tue, 13 Jul 2021 23:51:18 +0000 (16:51 -0700)]
[NFC][sanitizer] Move MemoryMapper template parameter
Matt Arsenault [Tue, 13 Jul 2021 23:33:38 +0000 (19:33 -0400)]
AMDGPU: Try to fix test failure with EXPENSIVE_CHECKS
The machine verifier is enabled by default for EXPENSIVE_CHECKS, so
the pass runs of it would pollute the output here.
Dmitry Vyukov [Tue, 13 Jul 2021 22:34:58 +0000 (15:34 -0700)]
sanitizer_common: optimize memory drain
Currently we allocate MemoryMapper per size class.
MemoryMapper mmap's and munmap's internal buffer.
This results in 50 mmap/munmap calls under the global
allocator mutex. Reuse MemoryMapper and the buffer
for all size classes. This radically reduces number of
mmap/munmap calls. Smaller size classes tend to have
more objects allocated, so it's highly likely that
the buffer allocated for the first size class will
be enough for all subsequent size classes.
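The effect of reusing one buffer across size classes can be illustrated with a toy model. `ReusableBuffer` and `MapCalls` are hypothetical names, and a `std::vector` stands in for the mmap'ed region; this is not the sanitizer allocator's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy buffer that is only "remapped" when a request exceeds its current size.
struct ReusableBuffer {
  std::vector<unsigned char> Storage;
  unsigned MapCalls = 0; // stands in for the number of mmap calls

  unsigned char *get(std::size_t Bytes) {
    if (Bytes > Storage.size()) {
      Storage.assign(Bytes, 0); // "mmap" a larger, zeroed buffer
      ++MapCalls;
    } else {
      // Reuse the existing buffer; only re-zero the part being handed out.
      std::fill(Storage.begin(), Storage.begin() + Bytes, 0);
    }
    return Storage.data();
  }
};
```

If the first request is the largest one, every later request is served from the same buffer and `MapCalls` stays at 1.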
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105778
Arthur Eubanks [Tue, 13 Jul 2021 19:50:34 +0000 (12:50 -0700)]
[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch
To help with debugging non-trivial unswitching issues.
Don't care about the legacy pass, nobody is using it.
If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't
default to the empty constructor for the pass params. We should still
let the parser take care of it in case the parser has its own defaults.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D105933
Vitaly Buka [Tue, 13 Jul 2021 22:58:55 +0000 (15:58 -0700)]
[NFC][sanitizer] Don't store region_base_ in MemoryMapper
Part of D105778
Matt Arsenault [Wed, 26 Sep 2018 23:36:28 +0000 (09:36 +1000)]
RegAlloc: Allow targets to split register allocation
AMDGPU normally spills SGPRs to VGPRs. Previously, since all register
classes are handled at the same time, this was problematic. We don't
know ahead of time how many registers will be needed to be reserved to
handle the spilling. If no VGPRs were left for spilling, we would have
to try to spill to memory. If the spilled SGPRs were required for exec
mask manipulation, it is highly problematic because the lanes active
at the point of spill are not necessarily the same as at the restore
point.
Avoid this problem by fully allocating SGPRs in a separate regalloc
run from VGPRs. This way we know the exact number of VGPRs needed, and
can reserve them for a second run. This fixes the most serious
issues, but it is still possible using inline asm to make all VGPRs
unavailable. Start erroring in the case where we ever would require
memory for an SGPR spill.
This is implemented by giving each regalloc pass a callback which
reports if a register class should be handled or not. A few passes
need some small changes to deal with leftover virtual registers.
In the AMDGPU implementation, a new pass is introduced to take the
place of PrologEpilogInserter for SGPR spills emitted during the first
run.
One disadvantage of this is currently StackSlotColoring is no longer
used for SGPR spills. It would need to be run again, which will
require more work.
Error if the standard -regalloc option is used. Introduce new separate
-sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be
controlled individually. PBQP is not currently supported, so this also
prevents using the unhandled allocator.
Eli Friedman [Tue, 13 Jul 2021 21:48:47 +0000 (14:48 -0700)]
[ScalarEvolution] Make isKnownNonZero handle more cases.
Using an unsigned range instead of signed ranges is a bit more precise.
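Why an unsigned range can be more precise for a non-zero query is easiest to see on 8-bit values. The following is a toy model with hypothetical helper names, not ScalarEvolution's range machinery: it assumes a range is just the minimum and maximum of a set of values, and that the set is non-empty.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy "range": the min and max over a set of 8-bit values.
struct Bounds {
  int Min;
  int Max;
};

// Bounds when the bit patterns are read as unsigned numbers (0..255).
Bounds unsignedBounds(const std::vector<uint8_t> &Vals) {
  int Lo = 255, Hi = 0;
  for (uint8_t V : Vals) {
    Lo = std::min(Lo, static_cast<int>(V));
    Hi = std::max(Hi, static_cast<int>(V));
  }
  return {Lo, Hi};
}

// Bounds when the same bit patterns are read as signed numbers (-128..127).
Bounds signedBounds(const std::vector<uint8_t> &Vals) {
  int Lo = 127, Hi = -128;
  for (uint8_t V : Vals) {
    int S = static_cast<int8_t>(V);
    Lo = std::min(Lo, S);
    Hi = std::max(Hi, S);
  }
  return {Lo, Hi};
}

// Non-zero follows from the bounds only if zero lies outside [Min, Max].
bool provesNonZero(Bounds B) { return B.Min > 0 || B.Max < 0; }
```

For the values 1 and 200, the unsigned bounds are [1, 200] and exclude zero, while the signed bounds are [-56, 1] and do not.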
Differential Revision: https://reviews.llvm.org/D105941
Vitaly Buka [Tue, 13 Jul 2021 21:54:24 +0000 (14:54 -0700)]
[NFC][sanitizer] Extract DrainHalfMax
Part of D105778
Vitaly Buka [Tue, 13 Jul 2021 22:31:54 +0000 (15:31 -0700)]
[NFC][sanitizer] Rename some MemoryMapper members
Part of D105778
Geoffrey Martin-Noble [Tue, 13 Jul 2021 19:57:31 +0000 (12:57 -0700)]
[NFC][MLIR][std] Clean up ArithmeticCastOps
The documentation on these was out of sync with the implementation. Also
the declaration of inputs was repeated when it is already part of the
ArithmeticCastOp definition.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D105934
Victor Huang [Tue, 13 Jul 2021 19:57:08 +0000 (14:57 -0500)]
[PowerPC] Add PowerPC compare and multiply related builtins and intrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and intrinsics for compare
and multiply related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D102875
MaheshRavishankar [Tue, 13 Jul 2021 21:51:20 +0000 (14:51 -0700)]
[mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`.
Differential Revision: https://reviews.llvm.org/D105852
Aart Bik [Tue, 13 Jul 2021 19:13:39 +0000 (12:13 -0700)]
[mlir][sparse] add support for std unary operations
Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove the "zero" node, and pushes
the irregular logic for negi (not supported in std) into one place.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105928
Adam Paszke [Tue, 13 Jul 2021 21:35:50 +0000 (14:35 -0700)]
Add more types to the LLVM dialect C API
This includes:
- void type
- array types
- function types
- literal (unnamed) struct types
Reviewed By: jpienaar, ftynse
Differential Revision: https://reviews.llvm.org/D105908
Derek Schuff [Tue, 13 Jul 2021 21:31:19 +0000 (14:31 -0700)]
[WebAssembly] Run varargs codegen test with non-emscripten triple
This is a followup from D105749 to cover both triples in the case
where they differ.
Alexander Yermolovich [Tue, 13 Jul 2021 19:11:53 +0000 (12:11 -0700)]
[LLD] Adding support for RELA for CG Profile.
This is a follow up to https://reviews.llvm.org/D104080, and https://github.com/llvm/llvm-project/commit/ca3bdb57fa1ac98b711a735de048c12b5fdd8086#diff-e64a48fabe31db213a631fdc5f2acb51bdddf3f16a8fb2928784f4c579229585. The implementation of call graph profile was changed from a black box section to a relocation approach. This was done to be compatible with post-processing tools like strip/objcopy and their llvm equivalents. When they are invoked on an object file before the final linking step, this new approach preserves the correctness of the symbol indices.
The GNU binutils tools change the REL section to a RELA section, unlike the llvm tools, for example when strip -S is run on ELF object files as an intermediate step before linking. To preserve compatibility, this patch extends the implementation in LLD and ELFDumper to support both REL and RELA sections for call graph profile.
Reviewed By: MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D105217
Hedin Garca [Tue, 13 Jul 2021 17:19:58 +0000 (17:19 +0000)]
[libc] Capture floating point encoding and arrange it sequentially in memory
Redefined FPBits.h and LongDoubleBitsX86 so that the implementation works on both
Windows and Linux while keeping the floating point representation packed in memory,
so that its size in memory matches the size of the corresponding floating point type.
This change was necessary because the previous attribute((packed)) specification in the struct did not work
on Windows as it did on Linux, and consequently static_asserts in FPBits.h were failing.
Reviewed By: aeubanks, sivachandra
Differential Revision: https://reviews.llvm.org/D105561
Caitlyn Cano [Thu, 8 Jul 2021 17:44:10 +0000 (17:44 +0000)]
[libc] Don't pass -fpie/-ffreestanding on Windows
The current compile options function hardcodes the -fpie and
-ffreestanding flags, which don't exist on Windows. This patch sets the
compilation flags conditionally based on the OS specifics.
Reviewed By: sivachandra, aeubanks
Differential Revision: https://reviews.llvm.org/D105643
Vitaly Buka [Tue, 13 Jul 2021 20:37:29 +0000 (13:37 -0700)]
[sanitizer] Few more NFC changes from D105778
Philip Reames [Tue, 13 Jul 2021 20:30:44 +0000 (13:30 -0700)]
[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.
Differential Revision: https://reviews.llvm.org/D105921
Martin Storsjö [Tue, 13 Jul 2021 11:24:51 +0000 (14:24 +0300)]
[libcxx] [docs] Acknowledge that the library is known to work in some configs outside of what's tested in CI
Differential Revision: https://reviews.llvm.org/D105888
Vitaly Buka [Tue, 13 Jul 2021 20:16:46 +0000 (13:16 -0700)]
[NFC][sanitizer] Move MemoryMapper out of SizeClassAllocator64
Part of D105778
Hedin Garca [Wed, 30 Jun 2021 20:08:26 +0000 (20:08 +0000)]
[libc] Add on float properties for precision floating point numbers in FloatProperties.h
Defined constants that express the number of exponent bits in single and double precision. Added bit mask values and other properties for quad precision floating point numbers, specifically targeting the architectures defined in PlatfromDefs.h. The exponentWidth values were added for use in LongDoubleBitsX86.h, where the implementation that sets the exponent component uses them together with the bitWidth value. The need arose because of the 80-bit quad precision implementation.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D105153
Vedant Kumar [Mon, 12 Jul 2021 16:39:59 +0000 (09:39 -0700)]
[docs/llvm-cov] Document -compilation-dir
Document the `-compilation-dir` option added in D100232.
Differential Revision: https://reviews.llvm.org/D105826
Vitaly Buka [Tue, 13 Jul 2021 20:02:23 +0000 (13:02 -0700)]
[NFC][sanitizer] clang-format part of D105778
Vitaly Buka [Tue, 13 Jul 2021 19:39:16 +0000 (12:39 -0700)]
Revert "sanitizer_common: optimize memory drain"
Breaks https://lab.llvm.org/buildbot/#/builders/anitizer-windows
This reverts commit d89d3dfae17d7795dc1ef013db66272020de1959.
Arthur O'Dwyer [Tue, 13 Jul 2021 19:57:43 +0000 (15:57 -0400)]
[libc++] [test] Add a missing `()` in TestEachIntegralType.
Hafiz Abid Qadeer [Tue, 13 Jul 2021 18:28:00 +0000 (19:28 +0100)]
[lld][AMDGPU] Handle R_AMDGPU_REL16 relocation.
This patch is a followup patch to https://reviews.llvm.org/D105760 which adds this relocation. This handles the relocation in lld.
The s_branch family of instruction does the following:
PC = PC + signext(simm * 4) + 4
so we do the opposite on the target address before writing it in the instruction stream.
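Inverting the formula above gives the value to store in the instruction. The sketch below only illustrates that arithmetic; `encodeBranchSimm16` and `branchTarget` are hypothetical helper names, not lld functions, and the range and alignment checks are assumptions about what a linker would want to verify.

```cpp
#include <cassert>
#include <cstdint>

// Value for the 16-bit simm field of an s_branch-style instruction located at
// address PC so that execution continues at Target, i.e. the inverse of
//   PC' = PC + signext(simm * 4) + 4.
int16_t encodeBranchSimm16(uint64_t PC, uint64_t Target) {
  int64_t Delta = static_cast<int64_t>(Target - PC) - 4;
  assert(Delta % 4 == 0 && "target must be a multiple of 4 away from PC + 4");
  int64_t Simm = Delta / 4;
  assert(Simm >= INT16_MIN && Simm <= INT16_MAX && "branch out of range");
  return static_cast<int16_t>(Simm);
}

// Forward direction, following the formula for the s_branch family.
uint64_t branchTarget(uint64_t PC, int16_t Simm) {
  return PC + static_cast<int64_t>(Simm) * 4 + 4;
}
```

Feeding the encoded value back into `branchTarget` should reproduce the original target, which is a convenient sanity check for both forward and backward branches.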
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D105761
thomasraoux [Tue, 13 Jul 2021 17:38:46 +0000 (10:38 -0700)]
[mlir][Vector] Remove Vector TupleOp as it is unused
TupleOp is not used anymore after recent refactoring.
Differential Revision: https://reviews.llvm.org/D105924
Eli Friedman [Tue, 13 Jul 2021 19:21:13 +0000 (12:21 -0700)]
[NFC] Use CHECK-LABEL in trip-count-unknown-stride.ll
Eli Friedman [Tue, 13 Jul 2021 18:32:23 +0000 (11:32 -0700)]
[LoopReroll] Add an extra defensive check to avoid SCEV assertion.
Make sure getMinusSCEV() didn't return a pointer. The following check
would never succeed if it was a pointer, anyway, but calling
getMulExpr() on a pointer SCEV now asserts.
Nico Weber [Tue, 13 Jul 2021 19:15:38 +0000 (15:15 -0400)]
[gn build] (manually) port 303ddb60a2d2
Philip Reames [Tue, 13 Jul 2021 19:01:56 +0000 (12:01 -0700)]
[tests] Precommit a test case from D105216
Artem Belevich [Tue, 13 Jul 2021 18:40:11 +0000 (11:40 -0700)]
Fix cuda-bad-arch.cu test.
Tests for correctness of HIP architecture need `-x hip`
Philip Reames [Tue, 13 Jul 2021 18:51:02 +0000 (11:51 -0700)]
[SCEV] Strengthen inference of RHS > Start in howManyLessThans
Split off from D105216 to simplify review. Rewritten with a lambda to be easier to follow. Comments clarified.
Sorry for no test case, this is tricky to exercise with the current structure of the code. It's about to be hit more frequently in a follow up patch, and the change itself is simple.
Jon Roelofs [Tue, 13 Jul 2021 18:35:48 +0000 (11:35 -0700)]
[Tests] Fix test broken by:
43c7ca8e4963 [AArch64][GlobalISel] Legalize store <2 x i16>
Krishna Kariya [Tue, 13 Jul 2021 18:33:30 +0000 (20:33 +0200)]
[InstCombine] Precommit tests for D105088 (NFC)
Add tests for D105088, as well as an option to disable the
(generally) unsound inttoptr of ptrtoint optimization.
Differential Revision: https://reviews.llvm.org/D105771
Thomas Lively [Tue, 13 Jul 2021 18:25:32 +0000 (11:25 -0700)]
[WebAssembly] Generate checks for simd-load-store-alignment.ll
This will make it easier to update these tests as we add support for generating
more SIMD loads and stores with custom alignments.
Differential Revision: https://reviews.llvm.org/D105862
Victor Huang [Tue, 13 Jul 2021 18:12:58 +0000 (13:12 -0500)]
[PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.
Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
Victor Huang [Tue, 13 Jul 2021 17:59:41 +0000 (12:59 -0500)]
Revert "[PowerPC][NFC] Power ISA features for Semachecking"
This reverts commit 10e0cdfc6526578c8892d895c0448e77cb9ba876.
Jon Roelofs [Fri, 9 Jul 2021 18:42:05 +0000 (11:42 -0700)]
[AArch64][GlobalISel] Legalize load <2 x i16>
Differential revision: https://reviews.llvm.org/D105913
Jon Roelofs [Fri, 9 Jul 2021 18:26:20 +0000 (11:26 -0700)]
[AArch64][GlobalISel] Legalize store <2 x i16>
Differential revision: https://reviews.llvm.org/D105912
Artem Belevich [Thu, 1 Jul 2021 16:55:44 +0000 (09:55 -0700)]
[CUDA] Only allow NVIDIA offload-arch during CUDA compilation.
Otherwise, if someone specifies a valid AMD arch, we may end up triggering an
assertion on unexpected arch later on.
Differential Revision: https://reviews.llvm.org/D105295
Philip Reames [Tue, 13 Jul 2021 17:57:44 +0000 (10:57 -0700)]
[test] Add a SCEV backedge computation test with an explicit zero stride
Vitaly Buka [Tue, 13 Jul 2021 18:02:21 +0000 (11:02 -0700)]
[NFC][sanitizer] Remove trailing whitespace
Valeriy Savchenko [Fri, 9 Jul 2021 09:36:13 +0000 (12:36 +0300)]
[analyzer][solver][NFC] Refactor how we detect (dis)equalities
This patch simplifies the way we deal with (dis)equalities.
Due to the symmetry between constraint handler and range inferrer,
we can have very similar implementations of logic handling
questions about (dis)equality and assumptions involving (dis)equality.
It also helps us to remove one more visitor, and removes uncertainty
that we got all the right places to put `trackNE` and `trackEQ`.
Differential Revision: https://reviews.llvm.org/D105693
Valeriy Savchenko [Thu, 8 Jul 2021 17:09:04 +0000 (20:09 +0300)]
[analyzer][solver][NFC] Introduce ConstraintAssignor
The new component is a symmetric response to SymbolicRangeInferrer.
While the latter is the unified component that answers all
questions about what the solver knows about a particular symbolic
expression, the assignor associates new constraints (aka "assumes")
with symbolic expressions and can imply additional knowledge that
the solver can extract and use later on.
- Why do we need it and why is SymbolicRangeInferrer not enough?
As noted above, the inferrer only helps us to get the most
precise range information based on the existing knowledge and on the
mathematical foundations of different operations that symbolic
expressions actually represent. It doesn't introduce new constraints.
The assignor, on the other hand, can impose constraints on other
symbols using the same domain knowledge.
- But for some expressions, SymbolicRangeInferrer looks into constraints
for similar expressions, why can't we do that for all the cases?
That's correct! But in order to do something like this, we should
have a finite number of possible "similar expressions".
Let's say we are asked about `$a - $b` and we know something about
`$b - $a`. The inferrer can invert this expression and check
constraints for `$b - $a`. This is simple!
But let's say we are asked about `$a` and we know that `$a * $b != 0`.
In this situation, we can imply that `$a != 0`, but the inferrer shouldn't
try every possible symbolic expression `X` to check if `$a * X` or
`X * $a` is constrained to non-zero.
With the assignor mechanism, we can catch this implication right at
the moment we associate `$a * $b` with non-zero range, and set similar
constraints for `$a` and `$b` as well.
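The `$a * $b != 0` case can be modelled in a few lines. This is a toy constraint store, not the analyzer's ConstraintAssignor; `ToyAssignor` and its members are hypothetical names, and symbols are plain strings.

```cpp
#include <set>
#include <string>
#include <utility>

// Toy constraint store: remembers which symbols and which products are known
// to be non-zero.
struct ToyAssignor {
  std::set<std::string> NonZeroSyms;
  std::set<std::pair<std::string, std::string>> NonZeroProducts;

  // Called at the moment `A * B` is constrained to a non-zero range: the
  // implication for both operands is recorded right away.
  void assumeProductNonZero(const std::string &A, const std::string &B) {
    NonZeroProducts.insert({A, B});
    NonZeroSyms.insert(A);
    NonZeroSyms.insert(B);
  }

  // A later query is a plain lookup; no search over every `Sym * X`.
  bool isKnownNonZero(const std::string &Sym) const {
    return NonZeroSyms.count(Sym) != 0;
  }
};
```

The point of the design is where the work happens: the implication is derived once, when the constraint is assigned, instead of being rediscovered on every query.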
Differential Revision: https://reviews.llvm.org/D105692
Vitaly Buka [Tue, 13 Jul 2021 17:44:53 +0000 (10:44 -0700)]
[sanitizer] Fix VSNPrintf %V on Windows
Louis Dionne [Mon, 12 Jul 2021 21:50:21 +0000 (17:50 -0400)]
[libc++] Add a CI job for macOS on arm64 hardware 🥳
Differential Revision: https://reviews.llvm.org/D105848
Tom Stellard [Tue, 13 Jul 2021 17:47:29 +0000 (10:47 -0700)]
Fix utils/update_cc_test_checks/check-globals.test on stand-alone builds
We want to use LLVM_EXTERNAL_LIT if defined for the %lit substitution.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D105873
Peyton, Jonathan L [Tue, 13 Jul 2021 17:33:01 +0000 (12:33 -0500)]
[OpenMP] Fix one sign-compare warning from GCC
Louis Dionne [Tue, 13 Jul 2021 17:07:45 +0000 (13:07 -0400)]
[libc++] NFC: Add comment for running macOS CI setup script remotely
Craig Topper [Tue, 13 Jul 2021 17:27:29 +0000 (10:27 -0700)]
[RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant.
We don't really have optimizations for division with a constant
LHS. If we don't use a W instruction we end up needing to sign
or zero extend the RHS to use the 64-bit instruction.
I had to sign_extend i32 constants on the LHS instead of using
any_extend which becomes zero_extend. If we don't do this, constants
that were originally negative become harder to materialize. I think
this problem exists for more of our W instruction cases. For example
(i32 (shl -1, X)), but we don't have lit tests. I'll work on that
as a follow up.
I also left a FIXME for enabling W instruction for RHS constants
under -Oz.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105769
Amy Kwan [Mon, 12 Jul 2021 21:35:50 +0000 (16:35 -0500)]
[PowerPC] Add FI alignment check if the addressing mode is DS/DQ-Form, emit X-Form if necessary.
This patch adds a function that checks whether or not the frame index
is aligned when the computed addressing mode is an aligned D-Form (DS, or DQ-Form).
If the frame index appears to be unaligned within these two modes, reset
the mode to X-Form in order to fall back to selecting X-Form loads.
A test case is added to ensure that the test emits X-Form loads and not DQ-Form
loads since the frame index is not aligned within the test case.
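The fallback described above can be sketched as follows. The enum and both functions are hypothetical, not the PowerPC backend's code. The sketch assumes that DS-Form displacements must be multiples of 4 and DQ-Form displacements multiples of 16, and that alignments are powers of two.

```cpp
// Hypothetical addressing-mode tags.
enum class AddrMode { DForm, DSForm, DQForm, XForm };

// Displacement alignment each mode needs (assumed: DS = 4, DQ = 16).
unsigned requiredAlign(AddrMode M) {
  switch (M) {
  case AddrMode::DSForm:
    return 4;
  case AddrMode::DQForm:
    return 16;
  default:
    return 1;
  }
}

// Keeps the computed mode if the frame index is sufficiently aligned,
// otherwise falls back to X-Form.
AddrMode adjustForFrameIndex(AddrMode M, unsigned FIAlign) {
  bool IsAlignedDForm = M == AddrMode::DSForm || M == AddrMode::DQForm;
  if (IsAlignedDForm && FIAlign < requiredAlign(M))
    return AddrMode::XForm;
  return M;
}
```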
Differential Revision: https://reviews.llvm.org/D105661
Peyton, Jonathan L [Tue, 13 Jul 2021 17:23:49 +0000 (12:23 -0500)]
[OpenMP][NFC] Change comment style to eliminate warnings from GCC
Standalone build for OpenMP runtime using GCC is giving -Wcomment
warnings where a backslash newline is encountered in the // style
comment. This switches the // style for /* style to silence the
warnings.
Matheus Izvekov [Sat, 10 Jul 2021 00:34:17 +0000 (02:34 +0200)]
[clang] C++98 implicit moves are back with a vengeance
After taking C++98 implicit moves out in D104500,
we put it back in, but now in a new form which preserves
compatibility with pure C++98 programs, while at the same time
giving almost all the goodies from P1825.
* We use the exact same rules as C++20 with regards to which
id-expressions are move eligible. The previous
incarnation would only benefit from the proper subset which is
copy elidable. This means we can additionally implicitly move:
* Parameters.
* RValue references.
* Exception variables.
* Variables with higher-than-natural required alignment.
* Objects with different type from the function return type.
* We preserve the two-overload resolution, with one small tweak to the
first one: If we either pick a (possibly converting) constructor which
does not take an rvalue reference, or a user conversion operator which
is not ref-qualified, we abort into the second overload resolution.
This gives C++98 almost all the implicit move patterns which we had created test
cases for, while at the same time preserving the meaning of these
three patterns, which are found in pure C++98 programs:
* Classes with both const and non-const copy constructors, but no move
constructors, continue to have their non-const copy constructor
selected.
* We continue to reject as ambiguous the following pattern:
```
struct A { A(B &); };
struct B { operator A(); };
A foo(B x) { return x; }
```
* We continue to pick the copy constructor in the following pattern:
```
class AutoPtrRef { };
struct AutoPtr {
AutoPtr(AutoPtr &);
AutoPtr();
AutoPtr(AutoPtrRef);
operator AutoPtrRef();
};
AutoPtr test_auto_ptr() {
AutoPtr p;
return p;
}
```
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D105756
Marcos Horro [Tue, 13 Jul 2021 17:07:03 +0000 (19:07 +0200)]
[llvm-mca] [NFC] Formatting code
Applied clang-format to all files. Discarded BottleneckAnalysis.h
80-column width violation since it contains an example of report.
Caught some typos and minor style details.
Reviewed By: andreadb
Differential Revision: https://reviews.llvm.org/D105900
Saleem Abdulrasool [Fri, 19 Mar 2021 15:26:27 +0000 (08:26 -0700)]
AST: correct name decoration for swift async functions on Windows
The name decoration scheme on Windows does not have a vendor namespace,
and the decoration scheme is not shared ownership - it is controlled by
Microsoft. `T` is a reserved identifier for an unknown calling
convention. The `W` identifier has been discussed with Microsoft
offline and is reserved as `Swift_3` as the identifier for the swift
async calling convention. Adjust the name decoration accordingly.
Philip Reames [Tue, 13 Jul 2021 16:58:19 +0000 (09:58 -0700)]
[ScalarEvolution] Fix overflow when computing max trip counts
This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me.
The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.
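The kind of overflow at stake can be shown on plain 32-bit integers. This is only an illustration of a ceiling division that cannot wrap, not the SCEV routine itself; both function names are made up for the example, and `D` is assumed to be non-zero.

```cpp
#include <cstdint>

// Naive ceil(N / D): the intermediate N + D - 1 can wrap around.
uint32_t udivCeilNaive(uint32_t N, uint32_t D) { return (N + D - 1) / D; }

// Overflow-free form: divide first, then round up if there was a remainder.
uint32_t udivCeil(uint32_t N, uint32_t D) { return N / D + (N % D != 0); }
```

With `N = 0xFFFFFFFF` and `D = 2`, the naive form wraps to 0 before dividing, while the second form yields `0x80000000`.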
Arthur Eubanks [Tue, 13 Jul 2021 16:57:37 +0000 (09:57 -0700)]
[NFC] Inline variable to prevent unused variable warning
thomasraoux [Tue, 13 Jul 2021 16:34:48 +0000 (09:34 -0700)]
[mlir] Add support for tensor.extract to comprehensive bufferization
Differential Revision: https://reviews.llvm.org/D105870
Craig Topper [Tue, 13 Jul 2021 06:53:40 +0000 (23:53 -0700)]
[RISCV] Prevent use of t0(aka x5) as rs1 for jalr instructions.
Some microarchitectures treat rs1=x1/x5 on jalr as a hint to pop
the return-address stack. We should avoid using x5 on jalr
instructions since we aren't using x5 as an alternate link register.
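The hint can be written down as a small decision function. This follows the return-address-stack hint table in the RISC-V unprivileged specification as I understand it; `RASAction` and `jalrHint` are hypothetical names, and registers are given by their x-register number.

```cpp
// x1 (ra) and x5 (t0) are the registers the hint treats as link registers.
bool isLinkReg(unsigned Reg) { return Reg == 1 || Reg == 5; }

enum class RASAction { None, Push, Pop, PopThenPush };

// What a return-address-stack predictor may do for `jalr Rd, Rs1`.
RASAction jalrHint(unsigned Rd, unsigned Rs1) {
  const bool RdLink = isLinkReg(Rd);
  const bool Rs1Link = isLinkReg(Rs1);
  if (!RdLink)
    return Rs1Link ? RASAction::Pop : RASAction::None;
  if (!Rs1Link || Rd == Rs1)
    return RASAction::Push;
  return RASAction::PopThenPush;
}
```

An ordinary indirect jump such as `jalr x0, t0` would be classified as a pop, which is why keeping x5 out of rs1 avoids confusing the predictor.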
Differential Revision: https://reviews.llvm.org/D105875
Guillaume Chatelet [Tue, 13 Jul 2021 16:44:42 +0000 (16:44 +0000)]
Revert "[llvm] Add enum iteration to Sequence"
This reverts commit a006af5d6ec6280034ae4249f6d2266d726ccef4.
Arthur Eubanks [Tue, 13 Jul 2021 16:29:53 +0000 (09:29 -0700)]
[OpaquePtr] Use byval type more
Arthur Eubanks [Tue, 13 Jul 2021 16:27:09 +0000 (09:27 -0700)]
[OpaquePtr] Get load/store type without PointerType::getElementType()
Arthur Eubanks [Tue, 13 Jul 2021 16:26:39 +0000 (09:26 -0700)]
[OpaquePtr] Use GlobalValue::getValueType() more
Arthur Eubanks [Tue, 13 Jul 2021 16:25:28 +0000 (09:25 -0700)]
[OpaquePtr] Use AllocaInst::getAllocatedType()
Julian Lettner [Mon, 12 Jul 2021 21:00:45 +0000 (14:00 -0700)]
Avoid triggering assert when program calls OSAtomicCompareAndSwapLong
A previous change brought the new, relaxed implementation of "on failure
memory ordering" for synchronization primitives in LLVM over to TSan
land [1]. It included the following assert:
```
// 31.7.2.18: "The failure argument shall not be memory_order_release
// nor memory_order_acq_rel". LLVM (2021-05) fallbacks to Monotonic
// (mo_relaxed) when those are used.
CHECK(IsLoadOrder(fmo));
static bool IsLoadOrder(morder mo) {
return mo == mo_relaxed || mo == mo_consume
|| mo == mo_acquire || mo == mo_seq_cst;
}
```
A previous workaround for a false positive when using an old Darwin
synchronization API assumed this failure mode to be unused and passed a
dummy value [2]. We update this value to `mo_relaxed` which is also the
value used by the actual implementation to avoid triggering the assert.
[1] https://reviews.llvm.org/D99434
[2] https://reviews.llvm.org/D21733
rdar://78122243
Differential Revision: https://reviews.llvm.org/D105844
Nicolas Vasilache [Tue, 13 Jul 2021 15:32:40 +0000 (15:32 +0000)]
[mlir][Linalg] Properly specify Linalg attribute.
This fixes undefined reference introduced by https://reviews.llvm.org/D105859
Differential Revision: https://reviews.llvm.org/D105897
Fangrui Song [Tue, 13 Jul 2021 16:30:09 +0000 (09:30 -0700)]
[RISCV] Support machine constraint "S"
Similar to D46745, "S" represents an absolute symbolic operand, which
can be used to specify the access models, e.g.
    extern int var;
    void *addr_via_asm() {
      void *ret;
      asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var));
      return ret;
    }
'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105254
Guillaume Chatelet [Tue, 13 Jul 2021 16:22:19 +0000 (16:22 +0000)]
[llvm] Add enum iteration to Sequence
This patch allows iterating typed enum via the ADT/Sequence utility.
Differential Revision: https://reviews.llvm.org/D103900
Aart Bik [Mon, 12 Jul 2021 22:50:47 +0000 (15:50 -0700)]
[mlir][memref] adjust integration tests to new lowering passes
these tests run under the emulator and thus were overlooked
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D105855
Albion Fung [Tue, 13 Jul 2021 16:01:53 +0000 (11:01 -0500)]
[PowerPC] Fix L[D|W]ARX Implementation
LDARX and LWARX sometimes get optimized out by the compiler
when they are critical to the correctness of the code. This inline asm generation
ensures that they are preserved.
Differential Revision: https://reviews.llvm.org/D105754
Simon Pilgrim [Tue, 13 Jul 2021 15:57:40 +0000 (16:57 +0100)]
[InstCombine] Add basic (select C, (gep Ptr, Idx), Ptr) tests from PR50183