Roman Lebedev [Sun, 14 Nov 2021 16:53:01 +0000 (19:53 +0300)]
[X86][Costmodel] `getReplicationShuffleCost()`: promote 16 bit-wide elements to 32 bit when no AVX512BW
The basic idea is simple: if we don't have a native shuffle for this element type,
then we must have a native shuffle for a wider element type,
so promote, replicate, demote.
I believe asking `getCastInstrCost(Instruction::Trunc` is semantically correct.
Case in point: `trunc <32 x i32> to <32 x i8>` aka 2 * ZMM will naively result in
2 * XMM, that then will be packed into 1 * YMM,
and it should count the cost of said packing,
not just the truncations.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D113609
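A minimal scalar sketch of the promote/replicate/demote idea described above. It models only the data movement, not the cost-model code; `replicate` and `replicateViaPromotion` are hypothetical helper names, not LLVM APIs. Replicating i16 elements through i32 yields the same result as replicating them directly:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Repeat every element `Factor` times: {a, b} with Factor 3 -> {a, a, a, b, b, b}.
template <typename T>
std::vector<T> replicate(const std::vector<T> &Src, unsigned Factor) {
  assert(Factor > 0 && "replication factor must be positive");
  std::vector<T> Dst;
  Dst.reserve(Src.size() * Factor);
  for (const T &Elt : Src)
    for (unsigned I = 0; I != Factor; ++I)
      Dst.push_back(Elt);
  return Dst;
}

// Model of the fallback path: widen i16 -> i32, replicate at the wider
// element type, then truncate back to i16.
std::vector<uint16_t> replicateViaPromotion(const std::vector<uint16_t> &Src,
                                            unsigned Factor) {
  std::vector<uint32_t> Wide(Src.begin(), Src.end());      // promote
  std::vector<uint32_t> WideDst = replicate(Wide, Factor); // replicate
  std::vector<uint16_t> Dst;
  Dst.reserve(WideDst.size());
  for (uint32_t Elt : WideDst)
    Dst.push_back(static_cast<uint16_t>(Elt));             // demote
  return Dst;
}
```

In cost terms, this corresponds to summing the cost of the extension, the replication at the wider type, and the truncation.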
Roman Lebedev [Sun, 14 Nov 2021 16:56:10 +0000 (19:56 +0300)]
[NFC][TTI] `getReplicationShuffleCost()`: s/Replicated/Dst/
'Replicated' is a mouthful and somewhat ambiguous,
while 'destination' is pretty self-explanatory.
Mehrnoosh Heidarpour [Sun, 14 Nov 2021 16:06:07 +0000 (11:06 -0500)]
[InstCombine] add tests for or-xor logic fold; NFC
Baseline tests for D113783
Differential Revision: https://reviews.llvm.org/D113846
Ahmed Bougacha [Mon, 27 Sep 2021 15:00:00 +0000 (08:00 -0700)]
[IR] Define ptrauth intrinsics.
This defines the new `@llvm.ptrauth.` pointer authentication intrinsics:
sign, auth, strip, blend, and sign_generic, documented in PointerAuth.md.
Pointer Authentication is a mechanism by which certain pointers are
signed. When a pointer gets signed, a cryptographic hash of its value
and other values (pepper and salt) is stored in unused bits of that
pointer.
Before the pointer is used, it needs to be authenticated, i.e., have its
signature checked. This prevents pointer values of unknown origin from
being used to replace the signed pointer value.
sign and auth provide the core operations. strip removes the ptrauth
bits from a signed pointer without checking them. sign_generic allows
signing non-pointer values. Finally, blend combines salt values
("discriminators") to derive more targeted and less reusable ones.
In later patches, we implement primary backend support for these
intrinsics using the AArch64 PAuth feature, and build on that to
implement the arm64e Darwin ABI and ELF PAuth ABI Extension in clang.
For more details, see the docs page, as well as our llvm-dev RFC:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/136091.html
or our 2019 Developers' Meeting talk.
Differential Revision: https://reviews.llvm.org/D90868
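A toy model of the sign/auth/strip operations described above, as a sketch only. It assumes 48-bit addresses with the signature stored in the top 16 bits, and `toySignature` is a non-cryptographic stand-in for the real keyed hash. None of these names are the actual intrinsics, and real hardware reports an authentication failure differently than by returning `false`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: only the low 48 bits hold the address.
constexpr uint64_t AddressMask = (uint64_t{1} << 48) - 1;

// Stand-in for the keyed hash of (pointer, key, discriminator).
// NOT cryptographic; it only gives the toy model something to store.
uint16_t toySignature(uint64_t Address, uint64_t Key, uint64_t Discriminator) {
  uint64_t H = (Address ^ Key) * 0x9E3779B97F4A7C15ULL;
  H ^= Discriminator + (H >> 29);
  H *= 0xBF58476D1CE4E5B9ULL;
  return static_cast<uint16_t>(H >> 48);
}

// strip: drop the signature bits without checking them.
uint64_t strip(uint64_t Ptr) { return Ptr & AddressMask; }

// sign: store the signature in the otherwise unused upper bits.
uint64_t sign(uint64_t Ptr, uint64_t Key, uint64_t Discriminator) {
  assert(strip(Ptr) == Ptr && "toy model expects an unsigned pointer");
  uint64_t Sig = toySignature(Ptr, Key, Discriminator);
  return Ptr | (Sig << 48);
}

// auth: recompute the signature and compare it before handing out the
// raw pointer. Returns false if the signature does not match.
bool auth(uint64_t Signed, uint64_t Key, uint64_t Discriminator,
          uint64_t &Out) {
  uint64_t Address = strip(Signed);
  if (sign(Address, Key, Discriminator) != Signed)
    return false;
  Out = Address;
  return true;
}
```

A pointer whose signature bits were tampered with fails `auth`, while `strip` returns the address regardless.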
Roman Lebedev [Sun, 14 Nov 2021 15:41:38 +0000 (18:41 +0300)]
[X86][Costmodel] `trunc v8i64 to v16i16/v32i16` can appear after legalization, cost is same as for `trunc v8i64 to v8i16`
Same as D113842, but for i64
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D113843
Roman Lebedev [Sun, 14 Nov 2021 15:41:37 +0000 (18:41 +0300)]
[X86][Costmodel] `trunc v16i32 to v32i16` can appear after legalization, cost is same as for `trunc v16i32 to v16i16`
This was noticed in D113609, hopefully it unblocks that patch.
There are likely other similar problems.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D113842
Mircea Trofin [Sun, 14 Nov 2021 06:29:38 +0000 (22:29 -0800)]
[NFC][InlineFunction] Renamed some vars to conform to coding style
Sanjay Patel [Sun, 14 Nov 2021 14:25:00 +0000 (09:25 -0500)]
[DAGCombiner] match inverted/swapped patterns for vselect of mask of signbit
This was noted as a follow-up to D113212 / D113426:
4fc1fc4005f7
7e30404c3b6c
11522cfcad6b
https://alive2.llvm.org/ce/z/e4o96b
The canonicalization rules for these IR patterns are complicated,
and we were not matching the expected forms in 2 out of the 3
cases. We can make codegen more robust by matching the swapped
forms (and that will also work if these patterns are created late).
mydeveloperday [Sun, 14 Nov 2021 14:12:42 +0000 (14:12 +0000)]
[clang-format][c++2b] support removal of the space between auto and {} in P0849R8
Looks like the work of {D113393} requires manual clang-formatting intervention.
Removal of the space between `auto` and `{}`
Reviewed By: HazardyKnusperkeks, Quuxplusone
Differential Revision: https://reviews.llvm.org/D113826
Simon Pilgrim [Sun, 14 Nov 2021 13:40:53 +0000 (13:40 +0000)]
[X86] Widen 128/256-bit VPTERNLOG patterns to 512-bit on non-VLX targets
Similar to what we've done for other ops, this patch widens VPTERNLOG to a 512-bit op for non-VLX targets.
Fixes regressions in D113192
Differential Revision: https://reviews.llvm.org/D113827
Roman Lebedev [Sun, 14 Nov 2021 13:07:30 +0000 (16:07 +0300)]
[NFC][X86][Costmodel] Improve test coverage for {i32,i64}->i16 vector *ext
See https://reviews.llvm.org/D113609 - some of these costs seem wrong.
Roman Lebedev [Sun, 14 Nov 2021 12:47:52 +0000 (15:47 +0300)]
[NFC][X86][Costmodel] Improve test coverage for i16->{i32,i64} vector *ext
Roman Lebedev [Sun, 14 Nov 2021 12:29:32 +0000 (15:29 +0300)]
[NFC][SROA] Revisit test coverage in non-capturing-call.ll
David Green [Sun, 14 Nov 2021 11:18:31 +0000 (11:18 +0000)]
[TypePromotion] Extend TypePromotion::isSafeWrap
This modifies the preconditions of TypePromotion's isSafeWrap
method, to allow it to work from all constants from the ICmp.
Using the code:
%a = add %x, C1
%c = icmp ult %a, C2
According to Alive, we can prove that is equivalent to
icmp ult (add zext(%x), sext(C1)), zext(C2) given
C1 <=s 0 and C1 >s C2.
https://alive2.llvm.org/ce/z/CECYZB
Which is similar to what is already present. We can also
prove icmp ult (add zext(%x), sext(C1)), sext(C2) given
C1 <=s 0 and C1 <=s C2.
https://alive2.llvm.org/ce/z/KKgyeL
The PrepareWrappingAdds method was removed, and the
constants are now altered to sext or zext directly as
required by the above methods.
Differential Revision: https://reviews.llvm.org/D113678
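The two equivalences above can be checked exhaustively for a small case. The following sketch assumes promotion from i8 to i32 and uses hypothetical helper names; it is a brute-force sanity check, not the Alive2 proof and not the TypePromotion code.

```cpp
#include <cassert>
#include <cstdint>

// Narrow form: icmp ult (add %x, C1), C2 in wrapping 8-bit arithmetic.
bool narrowCompare(uint8_t X, int8_t C1, int8_t C2) {
  uint8_t A = static_cast<uint8_t>(X + static_cast<uint8_t>(C1));
  return A < static_cast<uint8_t>(C2);
}

// Promoted add: zext(%x) + sext(C1) in wrapping 32-bit arithmetic.
uint32_t promotedAdd(uint8_t X, int8_t C1) {
  return static_cast<uint32_t>(X) +
         static_cast<uint32_t>(static_cast<int32_t>(C1));
}

// First rule: compare against zext(C2).
bool wideCompareZextC2(uint8_t X, int8_t C1, int8_t C2) {
  return promotedAdd(X, C1) < static_cast<uint32_t>(static_cast<uint8_t>(C2));
}

// Second rule: compare against sext(C2).
bool wideCompareSextC2(uint8_t X, int8_t C1, int8_t C2) {
  return promotedAdd(X, C1) < static_cast<uint32_t>(static_cast<int32_t>(C2));
}

// Checks one rule for every i8 input that satisfies its precondition:
//   zext rule: C1 <=s 0 and C1 >s C2
//   sext rule: C1 <=s 0 and C1 <=s C2
bool ruleHolds(bool UseSext) {
  for (int C1 = -128; C1 <= 0; ++C1)
    for (int C2 = -128; C2 <= 127; ++C2) {
      bool Precondition = UseSext ? (C1 <= C2) : (C1 > C2);
      if (!Precondition)
        continue;
      for (int X = 0; X <= 255; ++X) {
        uint8_t XV = static_cast<uint8_t>(X);
        int8_t C1V = static_cast<int8_t>(C1);
        int8_t C2V = static_cast<int8_t>(C2);
        bool Narrow = narrowCompare(XV, C1V, C2V);
        bool Wide = UseSext ? wideCompareSextC2(XV, C1V, C2V)
                            : wideCompareZextC2(XV, C1V, C2V);
        if (Narrow != Wide)
          return false;
      }
    }
  return true;
}
```

Dropping the precondition breaks the equivalence: with a positive C1, the narrow add can wrap where the promoted add does not.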
Kristina Bessonova [Tue, 2 Nov 2021 18:37:48 +0000 (20:37 +0200)]
[DwarfCompileUnit] getOrCreateCommonBlock(): check for existing entity first. NFCI
For global variables and common blocks there is no way to create entities
through getOrCreateContextDIE(), so no need to obtain the context first.
Differential Revision: https://reviews.llvm.org/D113651
Kristina Bessonova [Sun, 14 Nov 2021 08:56:44 +0000 (10:56 +0200)]
[DwarfCompileUnit] getOrCreateGlobalVariableDIE(): remove outdated comment. NFC
Vitaly Buka [Sat, 13 Nov 2021 22:47:27 +0000 (14:47 -0800)]
[sanitizer] Another try to fix the test with GLIBC 2.34
LLVM GN Syncbot [Sun, 14 Nov 2021 08:26:05 +0000 (08:26 +0000)]
[gn build] Port f55ba3525eb1
Lang Hames [Sun, 14 Nov 2021 08:14:39 +0000 (00:14 -0800)]
Revert "[ORC] Initial MachO debugging support (via GDB JIT debug..."
This reverts commit e1933a0488a50eb939210808fc895d374570d891 until I can look
into bot failures.
Kazu Hirata [Sun, 14 Nov 2021 05:43:28 +0000 (21:43 -0800)]
[llvm] Use GetElementPtrInst::indices (NFC)
hyeongyu kim [Sun, 14 Nov 2021 00:45:40 +0000 (09:45 +0900)]
[sanitizer][aarch64] fix clone system call's inline assembly
Return value of the system call was not returned normally.
It was discussed at https://reviews.llvm.org/D105169.
Vitaly Buka [Sat, 13 Nov 2021 22:24:50 +0000 (14:24 -0800)]
[sanitizer] Fix test for GLIBC 2.31
Newer GLIBC uses sysconf to get SIGSTKSZ.
LLVM GN Syncbot [Sat, 13 Nov 2021 21:44:00 +0000 (21:44 +0000)]
[gn build] Port e1933a0488a5
Lang Hames [Thu, 11 Nov 2021 23:19:21 +0000 (15:19 -0800)]
[ORC] Initial MachO debugging support (via GDB JIT debug registration interface)
This commit adds a new plugin, GDBJITDebugInfoRegistrationPlugin, that checks
for objects containing debug info and registers any debug info found via the
GDB JIT registration API.
To enable this registration without redundantly representing non-debug sections
this plugin synthesizes a new embedded object within a section of the LinkGraph.
An allocation action is used to make the registration call.
Currently MachO only. ELF users can still use the DebugObjectManagerPlugin. The
two are likely to be merged in the near future.
ksyx [Sat, 13 Nov 2021 20:59:43 +0000 (15:59 -0500)]
[GVN][NFC] Remove redundant check
The if-check above the deleted part guarantees that StoreOffset <= LoadOffset
and that StoreOffset + StoreSize >= LoadOffset + LoadSize. Since
LoadOffset + LoadSize > LoadOffset when LoadSize > 0, it follows that
StoreOffset + StoreSize > LoadOffset whenever LoadSize > 0. A type with
nonpositive size would be meaningless, so the check can be removed.
Part of revision D100179
Reviewed By: nikic
Keith Smiley [Sat, 13 Nov 2021 18:13:38 +0000 (10:13 -0800)]
reland: [VFS] Use original path when falling back to external FS
This reverts commit f0cf544d6f6fe6cbca4c07772998272d6bb433d8.
Just a small change to fix:
```
/home/buildbot/as-builder-4/llvm-clang-x86_64-expensive-checks-ubuntu/llvm-project/llvm/lib/Support/VirtualFileSystem.cpp: In static member function ‘static llvm::ErrorOr<std::unique_ptr<llvm::vfs::File> > llvm::vfs::File::getWithPath(llvm::ErrorOr<std::unique_ptr<llvm::vfs::File> >, const llvm::Twine&)’:
/home/buildbot/as-builder-4/llvm-clang-x86_64-expensive-checks-ubuntu/llvm-project/llvm/lib/Support/VirtualFileSystem.cpp:2084:10: error: could not convert ‘F’ from ‘std::unique_ptr<llvm::vfs::File>’ to ‘llvm::ErrorOr<std::unique_ptr<llvm::vfs::File> >’
return F;
^
```
Differential Revision: https://reviews.llvm.org/D113832
Mehdi Amini [Sat, 13 Nov 2021 20:01:12 +0000 (20:01 +0000)]
Fix some clang-tidy reports in MLIR (NFC)
Mostly replace uses of `container.size()` with `container.empty()` in
conditionals when applicable.
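A minimal sketch of the kind of rewrite this describes. The function names are hypothetical and not taken from MLIR; both versions behave identically, the second just states the intent directly.

```cpp
#include <cassert>
#include <vector>

// Before: relies on the implicit size_t -> bool conversion.
bool hasWorkBefore(const std::vector<int> &Worklist) {
  if (Worklist.size())
    return true;
  return false;
}

// After: asks the question that is actually meant.
bool hasWorkAfter(const std::vector<int> &Worklist) {
  return !Worklist.empty();
}
```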
Mogball [Sat, 13 Nov 2021 20:06:48 +0000 (20:06 +0000)]
[mlir][ods] Fix incorrect name in comment (NFC)
Sam McCall [Fri, 12 Nov 2021 14:03:23 +0000 (15:03 +0100)]
[clangd] Fix function-arg-placeholder suppression with macros.
While here, unhide function-arg-placeholders flag. It's reasonable to want and
maybe we should consider making it default.
Fixes https://github.com/clangd/clangd/issues/922
Differential Revision: https://reviews.llvm.org/D113765
David Green [Sat, 13 Nov 2021 19:09:01 +0000 (19:09 +0000)]
[ARM/AArch64] Move REQUIRES after update_cc_test_checks line. NFC
c17d9b4b125e5561925aa added REQUIRES lines to a lot of Arm and AArch64
tests, but added them to the very beginning, before the existing
update_cc_test_checks lines. This just moves them later so as to not
mess up the existing ordering when the checks are regenerated.
Keith Smiley [Sat, 13 Nov 2021 18:11:25 +0000 (10:11 -0800)]
Revert "[VFS] Use original path when falling back to external FS"
```
/work/omp-vega20-0/openmp-offload-amdgpu-runtime/llvm.src/llvm/lib/Support/VirtualFileSystem.cpp: In static member function 'static llvm::ErrorOr<std::unique_ptr<llvm::vfs::File> > llvm::vfs::File::getWithPath(llvm::ErrorOr<std::unique_ptr<llvm::vfs::File> >, const llvm::Twine&)':
/work/omp-vega20-0/openmp-offload-amdgpu-runtime/llvm.src/llvm/lib/Support/VirtualFileSystem.cpp:2084:10: error: could not convert 'F' from 'std::unique_ptr<llvm::vfs::File>' to 'llvm::ErrorOr<std::unique_ptr<llvm::vfs::File> >'
return F;
^
```
This reverts commit c972175649f4bb50d40d911659a04d5620ce6fe0.
Mark de Wever [Sat, 13 Nov 2021 18:08:50 +0000 (19:08 +0100)]
[libc++][NFC] Fixes code alignment.
D112904 fixed some code alignment issues, but it seems one line was
omitted. (Found while resolving merge conflicts for my own patches.)
Keith Smiley [Sat, 13 Nov 2021 17:34:30 +0000 (09:34 -0800)]
[VFS] Use original path when falling back to external FS
This is a follow-up to 0be9ca7c0f9a733f846bb6bc4e8e36d46b518162 to make
paths in the case of falling back to the external file system use the
original format, preserving relative paths, and allow the external
filesystem to canonicalize them if needed.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D109128
Raphael Isemann [Sat, 13 Nov 2021 17:18:20 +0000 (18:18 +0100)]
Revert "[lldb] Fix that the embedded Python REPL crashes if it receives SIGINT"
This reverts commit cef1e07cc6d00b5b429d77133201e1f404a8023c.
It broke the windows bot.
Matt Arsenault [Sat, 13 Nov 2021 16:32:28 +0000 (11:32 -0500)]
AMDGPU: Regenerate test checks
Regenerate with -NEXT checks to make a future diff clearer.
Kazu Hirata [Sat, 13 Nov 2021 16:34:22 +0000 (08:34 -0800)]
[PowerPC] Use SDNode::uses (NFC)
Kristina Bessonova [Sat, 13 Nov 2021 15:31:54 +0000 (17:31 +0200)]
[DebugInfo][test] Simplify/improve a few tests using --implicit-check-not=DW_TAG. NFC
This patch rewrites checks in a few debug info tests to avoid using
'CHECK-NOT: {{DW_TAG|NULL}}'. It proposes `--implicit-check-not=DW_TAG`
instead, as it makes the checks clearer, and easier to analyze and update.
Differential Revision: https://reviews.llvm.org/D113652
mydeveloperday [Sat, 13 Nov 2021 14:13:51 +0000 (14:13 +0000)]
[clang-format] [PR52228] clang-format csharp inconsistent nested namespace indentation
https://bugs.llvm.org/show_bug.cgi?id=52228
In C#, multilevel namespaces get their content indented when NamespaceIndentation: None is set, whereas single-level namespaces are formatted correctly.
Reviewed By: HazardyKnusperkeks, jbcoe
Differential Revision: https://reviews.llvm.org/D112887
Simon Pilgrim [Sat, 13 Nov 2021 13:59:42 +0000 (13:59 +0000)]
[X86] Add getAVX512Node helper. NFC.
For AVX512 targets without VLX, we have to widen 128/256-bit vectors to 512-bits to use some specific AVX512 instructions (or some other instructions with predicates etc.).
I've pulled out the widening code from LowerFunnelShift into the helper function, so we can convert some other widening patterns in the future.
Florian Hahn [Sat, 13 Nov 2021 09:39:14 +0000 (09:39 +0000)]
[SCEV] Update SCEVLoopGuardRewriter to hold reference to map. (NFC)
SCEVLoopGuardRewriter doesn't need to copy the rewrite map. It can just
hold a const reference instead, to avoid an unnecessary copy.
Dmitry Vyukov [Fri, 12 Nov 2021 18:28:39 +0000 (19:28 +0100)]
tsan: mmap shadow stack
We used to mmap C++ shadow stack as part of the trace region
before ed7f3f5bc9 ("tsan: move shadow stack into ThreadState"),
which moved the shadow stack into TLS. This started causing
timeouts and OOMs on some of our internal tests that repeatedly
create and destroy thousands of threads.
Allocate C++ shadow stack with mmap and small pages again.
This prevents the observed timeouts and OOMs.
But we now need to be more careful with interceptors that
run after thread finalization because FuncEntry/Exit and
TraceAddEvent all need the shadow stack.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113786
Vitaly Buka [Sat, 13 Nov 2021 07:40:30 +0000 (23:40 -0800)]
Revert "[sanitizer] Fix test linking"
This reverts commit afafa883a4757d88d869d1abb6bf7e11022fd521.
-pthread was not the fix: the symbols were removed from GLIBC 2.34.
Fixed with e60b3fcefa62311a93a9f7c8589a1da5f25b1ba9.
Vitaly Buka [Sat, 13 Nov 2021 07:38:59 +0000 (23:38 -0800)]
[sanitizer] Don't test __pthread_mutex_lock with GLIBC 2.34
Craig Topper [Sat, 13 Nov 2021 05:36:56 +0000 (21:36 -0800)]
[X86] Promote f16 STRICT_FROUND to f32 and call libc.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D113817
Lang Hames [Sat, 13 Nov 2021 04:37:07 +0000 (20:37 -0800)]
[JITLink][MachO] Fix "find-symbol-by-address" logic.
Only search within the requested section, and allow one-past-the-end addresses.
This is needed to support section-end-address references to sections with no
symbols in them.
Kazu Hirata [Sat, 13 Nov 2021 05:23:04 +0000 (21:23 -0800)]
[Target] Use SDNode::uses (NFC)
Duncan P. N. Exon Smith [Tue, 14 Sep 2021 19:13:19 +0000 (15:13 -0400)]
Support: Pass wrapped Error's error code through FileError
Change FileError to pass through the error code from the Error it wraps.
This allows APIs that return ECError to transition to FileError without
changing returned std::error_code.
This was extracted from https://reviews.llvm.org/D109345.
Differential Revision: https://reviews.llvm.org/D113225
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 01:46:27 +0000 (17:46 -0800)]
ADT: Fix const-correctness of iterator facade
Fix the const-ness of `iterator_facade_base::operator->` and
`iterator_facade_base::operator[]`. This is a follow-up to
1b651be0465de70cfa22ce4f715d3501a4dcffc1, which fixed const-ness of
various iterator adaptors.
Iterators, like the pointers that they generalize, have two types of
`const`.
- The `const` qualifier on members indicates whether the iterator
itself can be changed. This is analogous to `int *const`.
- The `const` qualifier on return values of `operator*()`,
`operator[]()`, and `operator->()` controls whether the
pointed-to value can be changed. This is analogous to `const int*`.
If an iterator facade returns a handle to its own state, then T (and
PointerT and ReferenceT) should usually be const-qualified. Otherwise,
if clients are expected to modify the state itself, the field can be
declared mutable or a const_cast can be used.
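The two kinds of `const` can be seen side by side in a small sketch. `CountingIterator` is a hypothetical toy class written for illustration, not `iterator_facade_base`. It hands out a handle to its own state, so the returned reference is const-qualified while `operator*()` itself is a const member.

```cpp
#include <cassert>

// Toy iterator over a counter that hands out a reference to its own state.
class CountingIterator {
  int Current;

public:
  explicit CountingIterator(int Start) : Current(Start) {}

  // Const member, const result: dereferencing never changes the iterator,
  // and callers cannot write through the handle to its internal state.
  const int &operator*() const { return Current; }

  // Non-const member: advancing changes the iterator itself.
  CountingIterator &operator++() {
    ++Current;
    return *this;
  }
};

int constnessDemo() {
  int Value = 1, Other = 2;

  int *const FixedPtr = &Value; // like a const iterator: cannot be reseated
  *FixedPtr = 10;               // ...but the pointee can still change

  const int *ReadOnly = &Value; // like an iterator over const values
  ReadOnly = &Other;            // can be reseated; the pointee is read-only

  const CountingIterator It(5); // const iterator can still be dereferenced
  return *FixedPtr + *ReadOnly + *It;
}
```

Attempting `++It` on the const iterator, or `*It = 0` on any `CountingIterator`, would be rejected by the compiler.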
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 03:14:03 +0000 (19:14 -0800)]
Support: Make VarStreamArrayIterator iterate over const values
VarStreamArrayIterator returns a reference to a just-computed internal
value. Change it to iterate over `const ValueType` to avoid allowing
clients to mutate the internal state, and to drop the
non-`const`-qualified operator*().
The removed operator*() was from 175d70ee5c2f03f6 to get
iterator_facade_base::operator->() working, and this fixes the root
cause instead: setting `T` to `const ValueType` causes
iterator_facade_base to infer `PointerT` as `const ValueType*`.
Ironically, this is the last blocker for removing the const-incorrect
overload of `iterator_facade_base::operator->()`, whose presence
triggered adding the workaround in the first place :).
Differential Revision: https://reviews.llvm.org/D113797
Tom Stellard [Fri, 12 Nov 2021 21:31:40 +0000 (13:31 -0800)]
test/ExecutionEngine: Clean up lit.local.cfg
Switch to using config.root.native_target to determine if tests are
supported. This is a canonical form of the arch from the target
triple.
Reviewed By: lhames, DavidSpickett
Differential Revision: https://reviews.llvm.org/D110788
Keith Smiley [Sat, 13 Nov 2021 03:25:37 +0000 (19:25 -0800)]
[lld-macho] Fix warning
```
/Users/ksmiley/dev/llvm-project/lld/MachO/Symbols.cpp:43:27: warning: field 'external' will be initialized after field 'weakDefCanBeHidden' [-Wreorder-ctor]
weakDef(isWeakDef), external(isExternal),
^
1 warning generated.
```
Differential Revision: https://reviews.llvm.org/D113823
Keith Smiley [Sat, 13 Nov 2021 01:24:26 +0000 (17:24 -0800)]
[llvm-objcopy][MachO] Add error for MH_PRELOAD
Previously this would crash. Fixes https://bugs.llvm.org/show_bug.cgi?id=51877
Differential Revision: https://reviews.llvm.org/D113819
Vy Nguyen [Tue, 9 Nov 2021 00:50:34 +0000 (19:50 -0500)]
[lld-macho] Allow exporting weak_def_can_be_hidden(AKA "autohide") symbols
autohide symbols behave similarly to private_extern symbols.
However, LD64 allows exporting autohide symbols. LLD currently does not.
This patch allows LLD to export them.
Differential Revision: https://reviews.llvm.org/D113167
Matheus Izvekov [Fri, 12 Nov 2021 23:40:18 +0000 (00:40 +0100)]
[clang] retain type sugar in auto / template argument deduction
This implements the following changes:
* AutoType retains sugared deduced-as-type.
* Template argument deduction machinery analyses the sugared type all the way
down. It would previously lose the sugar on first recursion.
* Undeduced AutoType will be properly canonicalized, including the constraint
template arguments.
* Remove the decltype node created from the decltype(auto) deduction.
As a result, we start seeing sugared types in a lot more test cases,
including some which showed very unfriendly `type-parameter-*-*` types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D110216
Phoebe Wang [Sat, 13 Nov 2021 01:24:05 +0000 (09:24 +0800)]
[X86][ABI] Change the alignment of f80 in 32-bit calling convention to meet with different data layout
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D113739
Vitaly Buka [Sat, 13 Nov 2021 01:56:19 +0000 (17:56 -0800)]
[asan] More leaks in test
It fails to detect a single leak with GLIBC 2.34.
Vy Nguyen [Sat, 13 Nov 2021 01:26:30 +0000 (20:26 -0500)]
[lld-macho] Parallelize scanning the symbol tables in export/unexport-ing.
(Split from D113167)
Benchmarking on one of our large apps which exports a few thousand symbols,
this showed an improvement of ~17%.
x ./LLD_no_parallel.txt
+ ./LLD_with_parallel.txt
N Min Max Median Avg Stddev
x 10 84.01 89.41 88.64 87.693 1.7424061
+ 10 71.9 74.29 72.63 72.753 0.77734663
Difference at 95.0% confidence
-14.94 +/- 1.26763
-17.0367% +/- 1.44553%
(Student's t, pooled s = 1.34912)
(wallclock)
Differential Revision: https://reviews.llvm.org/D113820
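The summary lines in the benchmark output above can be reproduced from the per-sample statistics. This is a sketch with hypothetical names, assuming a two-sample pooled Student's t interval with equal sample sizes; the critical value for 18 degrees of freedom (about 2.100922) is a tabulated constant supplied by the caller, not computed here.

```cpp
#include <cassert>
#include <cmath>

struct Sample {
  int N;         // number of runs
  double Mean;   // average time
  double Stddev; // sample standard deviation
};

// Pooled standard deviation of two samples.
double pooledStddev(const Sample &A, const Sample &B) {
  assert(A.N > 1 && B.N > 1 && "need at least two runs per sample");
  double Num = (A.N - 1) * A.Stddev * A.Stddev +
               (B.N - 1) * B.Stddev * B.Stddev;
  return std::sqrt(Num / (A.N + B.N - 2));
}

// Half-width of the confidence interval for the difference of the means.
double halfWidth(const Sample &A, const Sample &B, double TCritical) {
  return TCritical * pooledStddev(A, B) * std::sqrt(1.0 / A.N + 1.0 / B.N);
}

// Relative change of New against Base, in percent.
double percentChange(const Sample &Base, const Sample &New) {
  return 100.0 * (New.Mean - Base.Mean) / Base.Mean;
}
```

With the values from the table, `{10, 87.693, 1.7424061}` as the base and `{10, 72.753, 0.77734663}` as the new sample, these functions give approximately 1.34912 for the pooled s, 1.2676 for the interval half-width, and -17.0367% for the relative change.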
Vitaly Buka [Sat, 13 Nov 2021 01:41:50 +0000 (17:41 -0800)]
[asan] Fix "no matching function" on GCC
Nico Weber [Sat, 13 Nov 2021 01:09:01 +0000 (20:09 -0500)]
[gn build] (semi-manually) port cb0e14ce6dcd
Vitaly Buka [Sat, 13 Nov 2021 00:52:25 +0000 (16:52 -0800)]
[sanitizer] Fix test linking
Ben Langmuir [Fri, 12 Nov 2021 22:45:02 +0000 (14:45 -0800)]
[ORC][ORC-RT] Register type metadata from __swift5_types MachO section
Similar to how the other swift sections are registered by the ORC
runtime's macho platform, add the __swift5_types section, which contains
type metadata. Add a simple test that demonstrates that the swift
runtime recognized the registered types.
rdar://85358530
Differential Revision: https://reviews.llvm.org/D113811
Craig Topper [Sat, 13 Nov 2021 00:26:13 +0000 (16:26 -0800)]
[RISCV] Fixed duplicate RUN line on float-intrinsics.ll. NFC
We had two identical RV64I RUN lines. One should be RV32I.
Josh Learn [Sat, 13 Nov 2021 00:17:18 +0000 (16:17 -0800)]
[clang][objc][codegen] Skip emitting ObjC category metadata when the category is empty
Currently, if we create a category in ObjC that is empty, we still emit
runtime metadata for that category. This is a scenario that could
commonly be run into when using __attribute__((objc_direct_members)),
which elides the need for much of the category metadata. This is
slightly wasteful and can be easily skipped by checking the category
metadata contents during CodeGen.
rdar://66177182
Differential Revision: https://reviews.llvm.org/D113455
Vitaly Buka [Thu, 11 Nov 2021 02:17:20 +0000 (18:17 -0800)]
[sanitizer] Switch dlsym hack to internal_allocator
Since glibc 2.34, dlsym does
1. malloc 1
2. malloc 2
3. free pointer from malloc 1
4. free pointer from malloc 2
This sequence was not handled by the trivial dlsym hack.
This fixes https://bugs.llvm.org/show_bug.cgi?id=52278
Reviewed By: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D112588
Philip Reames [Fri, 12 Nov 2021 23:00:39 +0000 (15:00 -0800)]
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice it does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.
Why does this matter? A couple of reasons:
* SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B and drop any flags invalidated by such. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.)
* Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.)
Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.
* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simply use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.
Lawrence D'Anna [Fri, 12 Nov 2021 23:38:35 +0000 (15:38 -0800)]
[lldb] temporarily disable TestPaths.test_interpreter_info on windows
I'm disabling this test until the fix is reviewed
(here https://reviews.llvm.org/D113650/)
Craig Topper [Fri, 12 Nov 2021 21:20:20 +0000 (13:20 -0800)]
[RISCV] Improve codegen for i32 udiv/urem by constant on RV64.
The division by constant optimization often produces constants that
are uimm32, but not simm32. These constants require 3 or 4 instructions
to materialize without Zba.
Since these instructions are often used by a multiply with a LHS
that needs to be zero extended with an AND, we can switch the MUL
to a MULHU by shifting both inputs left by 32. Once we shift the
constant left, the upper 32 bits no longer need to be 0 so constant
materialization is free to use LUI+ADDIW. This reduces the constant
materialization from 4 instructions to 3 in some cases while also
reducing the zero extend of the LHS from 2 shifts to 1.
Differential Revision: https://reviews.llvm.org/D113805
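A sketch of why the MUL-to-MULHU switch preserves the result, using unsigned division by 5 as the example. The magic constant 0xCCCCCCCD with a total shift of 34 is the standard one for 32-bit udiv by 5; `mulhu` here is a portable software model of the instruction, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// High 64 bits of the 128-bit product, modelling RISC-V MULHU.
uint64_t mulhu(uint64_t A, uint64_t B) {
  uint64_t ALo = A & 0xffffffff, AHi = A >> 32;
  uint64_t BLo = B & 0xffffffff, BHi = B >> 32;
  uint64_t LoLo = ALo * BLo;
  uint64_t HiLo = AHi * BLo;
  uint64_t LoHi = ALo * BHi;
  uint64_t HiHi = AHi * BHi;
  // Carry out of the middle 32-bit column.
  uint64_t Carry =
      ((LoLo >> 32) + (HiLo & 0xffffffff) + (LoHi & 0xffffffff)) >> 32;
  return HiHi + (HiLo >> 32) + (LoHi >> 32) + Carry;
}

// ceil(2^34 / 5): fits in uimm32 but not in simm32.
constexpr uint64_t Magic = 0xCCCCCCCD;

// Original shape: zero-extend the i32 value with an AND, then a plain MUL.
// X models a 64-bit register whose upper 32 bits may hold garbage.
uint32_t udiv5ViaMul(uint64_t X) {
  uint64_t Product = (X & 0xffffffff) * Magic;
  return static_cast<uint32_t>(Product >> 34);
}

// New shape: shift both inputs left by 32 and take the high half.
// The left shift discards the garbage bits, so no AND is needed, and
// (X << 32) * (Magic << 32) = X * Magic * 2^64, whose high 64 bits are
// exactly X * Magic.
uint32_t udiv5ViaMulhu(uint64_t X) {
  uint64_t Product = mulhu(X << 32, Magic << 32);
  return static_cast<uint32_t>(Product >> 34);
}
```

Both functions compute the same 64-bit product before the final shift, so the rest of the division sequence is unchanged.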
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 22:22:00 +0000 (14:22 -0800)]
lld: const-qualify iterations through VarStreamArray, NFC
No functionality change here; just unblocking a patch to LLVM.
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 02:07:14 +0000 (18:07 -0800)]
IR: Fix const-correctness of SwitchInst::CaseIterator and CaseHandle
Fix some confusion between the two types of `const` a pointer/iterator
can have. Users of a SwitchInst::CaseIterator should not (and do not!)
manually mutate the SwitchInst::CaseHandle that tracks its internal
state. Change operator*() to return `const CaseHandle&`, remove the
non-const-qualified operator*(), and const-qualify
CaseHandle::setValue() and CaseHandle::setSuccessor().
Differential Revision: https://reviews.llvm.org/D113788
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 21:50:29 +0000 (13:50 -0800)]
ADT: Avoid repeating iterator adaptor/facade template params, NFC
Take advantage of class name injection to avoid redundantly specifying
template parameters of iterator adaptor/facade base classes.
No functionality change, although the private typedefs changed in a
couple of cases.
- Added a private typedef HashTableIterator::BaseT, following the
pattern from r207084 / 3478d4b164e8d3eba01f5bfa3fc5bfb287a78b97, to
pre-emptively appease MSVC (maybe it's not necessary anymore but
looks like we do this pretty consistently). Otherwise, I removed
private
- Removed private typedefs filter_iterator_impl::BaseT and
FilterIteratorTest::InputIterator::BaseT since there was only one
use of each and the definition was no longer interesting.
Alexey Bataev [Fri, 12 Nov 2021 21:45:38 +0000 (13:45 -0800)]
[SLP][NFC] Add a test for vector intrinsic with scalar parameter, NFC.
Félix Cloutier [Thu, 11 Nov 2021 02:03:36 +0000 (18:03 -0800)]
format_arg attribute does not support nullable instancetype return type
* The format_arg attribute tells the compiler that the attributed function
returns a format string that is compatible with a format string that is being
passed as a specific argument.
* Several NSString methods return copies of their input, so they would ideally
have the format_arg attribute. A previous differential (D112670) added
support for instancetype methods having the format_arg attribute when used
in the context of NSString method declarations.
* D112670 failed to account that instancetype can be sugared in certain narrow
(but critical) scenarios, like by using nullability specifiers. This patch
resolves this problem.
Differential Revision: https://reviews.llvm.org/D113636
Reviewed By: ahatanak
Radar-Id: rdar://85278860
David Tenty [Fri, 12 Nov 2021 21:27:57 +0000 (16:27 -0500)]
[libcxx][AIX] XFAIL tests enabled by locale.fr_FR.UTF-8
We missed the tests in the earlier XFAIL-ing because the locale.fr_FR.UTF-8
feature wasn't available, but since an upgrade these are now showing up
on the CI.
Differential Revision: https://reviews.llvm.org/D113791
Mogball [Fri, 12 Nov 2021 01:17:05 +0000 (01:17 +0000)]
[mlir][ods] Cleanup of Class Codegen helper
Depends on D113331
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D113714
Peter Klausler [Thu, 11 Nov 2021 20:36:15 +0000 (12:36 -0800)]
[flang] Handle ENTRY names in IsPureProcedure() predicate
Fortran defines an ENTRY point name as being pure if its enclosing
subprogram scope defines a pure procedure.
Differential Revision: https://reviews.llvm.org/D113711
Mogball [Fri, 12 Nov 2021 21:17:38 +0000 (21:17 +0000)]
[mlir][ods] DialectAsmPrinter -> AsmPrinter in comments
Vitaly Buka [Fri, 12 Nov 2021 21:00:54 +0000 (13:00 -0800)]
[asan] Fix GCC warning "left shift count >= width"
Fixes PR52385
Jez Ng [Fri, 12 Nov 2021 20:59:07 +0000 (15:59 -0500)]
[lld-macho] Fix symbol relocs handling for LSDAs
Similar to D113702, but for the LSDAs. Clang seems to emit all LSDA
relocs as section relocs, but ld -r can turn those relocs into symbol
ones.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D113721
Jez Ng [Fri, 12 Nov 2021 21:01:25 +0000 (16:01 -0500)]
[lld-macho] Teach ICF to dedup functions with identical unwind info
Dedup'ing unwind info is tricky because each CUE contains a different
function address. If ICF operated naively and compared the entire
contents of each CUE, entries with identical unwind info but belonging
to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on `relocateCompactUnwind()` to correctly handle
these truncated input sections.
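A sketch of the comparison problem and the slicing workaround. The struct layout below is an assumed, illustrative model of a 64-bit compact unwind entry, and `naiveEqual` / `slicedEqual` are hypothetical names; this is not the lld-macho implementation, which slices the input sections themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout of a 64-bit compact unwind entry (illustrative only).
struct CompactUnwindEntry {
  uint64_t FunctionAddress;
  uint32_t FunctionLength;
  uint32_t Encoding;
  uint64_t Personality;
  uint64_t Lsda;
};
static_assert(sizeof(CompactUnwindEntry) == 32, "expected no padding");

// Naive comparison: includes the function address, so entries that belong
// to different functions never compare equal.
bool naiveEqual(const CompactUnwindEntry &A, const CompactUnwindEntry &B) {
  return std::memcmp(&A, &B, sizeof(CompactUnwindEntry)) == 0;
}

// Sliced comparison: skip the leading function address and compare only
// the remaining bytes.
bool slicedEqual(const CompactUnwindEntry &A, const CompactUnwindEntry &B) {
  const std::size_t Skip = offsetof(CompactUnwindEntry, FunctionLength);
  const char *PA = reinterpret_cast<const char *>(&A) + Skip;
  const char *PB = reinterpret_cast<const char *>(&B) + Skip;
  return std::memcmp(PA, PB, sizeof(CompactUnwindEntry) - Skip) == 0;
}
```

Two entries with the same length, encoding, personality, and LSDA but different function addresses differ under `naiveEqual` and match under `slicedEqual`.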
Here are the numbers before and after D109944, D109945, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:
Without any optimizations:
base diff difference (95% CI)
sys_time 0.849 ± 0.015 0.896 ± 0.012 [ +4.8% .. +6.2%]
user_time 3.357 ± 0.030 3.512 ± 0.023 [ +4.3% .. +5.0%]
wall_time 3.944 ± 0.039 4.032 ± 0.031 [ +1.8% .. +2.6%]
samples 40 38
With `-dead_strip`:
base diff difference (95% CI)
sys_time 0.847 ± 0.010 0.896 ± 0.012 [ +5.2% .. +6.5%]
user_time 3.377 ± 0.014 3.532 ± 0.015 [ +4.4% .. +4.8%]
wall_time 3.962 ± 0.024 4.060 ± 0.030 [ +2.1% .. +2.8%]
samples 47 30
With `-dead_strip` and `--icf=all`:
base diff difference (95% CI)
sys_time 0.935 ± 0.013 0.957 ± 0.018 [ +1.5% .. +3.2%]
user_time 3.472 ± 0.022 6.531 ± 0.046 [ +87.6% .. +88.7%]
wall_time 4.080 ± 0.040 5.329 ± 0.060 [ +30.0% .. +31.2%]
samples 37 30
Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.
Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D109946
Sanjay Patel [Fri, 12 Nov 2021 20:49:46 +0000 (15:49 -0500)]
[AArch64][x86] add tests for swapped cmp+vselect patterns; NFC
These patterns were noted in the recent D113212 and follow-ups.
I did not bother to duplicate every test because it should be
clear if we recognize the swaps from a smaller sample. We have
complete coverage for the original patterns.
wlei [Tue, 9 Nov 2021 07:05:16 +0000 (23:05 -0800)]
[llvm-profgen] Fix bug of setting function entry
Previously we set the `isFuncEntry` flag to true when the funcName from DWARF was equal to the name in the symbol table, and we used this flag to skip reporting callsite samples that come from an intra-func branch. However, in HHVM, it appears that the symbol table name is inconsistent with the DWARF func name, likely due to `OptimizeGlobalAliases`.
This change is a workaround on the llvm-profgen side that marks the sole range as the function entry and adds warnings for the remaining inconsistencies.
This also fixes a missing `getCanonicalFnName` for the symbol name, which caused mismatches as well.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D113492
Aaron Puchert [Thu, 11 Nov 2021 20:44:20 +0000 (21:44 +0100)]
Comment Sema: Make most of CommentSema private (NFC)
We only need to expose setDecl, copyArray and the actOn* methods.
Aaron Puchert [Fri, 12 Nov 2021 20:09:40 +0000 (21:09 +0100)]
Comment AST: Recognize function-like objects via return type (NFC)
Instead of pretending that function pointer type aliases or variables
are functions, and thereby losing the information that they are type
aliases or variables, respectively, we use the existence of a return
type in the DeclInfo to signify a "function-like" object.
That seems pretty natural, since it's also the return type (or parameter
list) from the DeclInfo that we compare the documentation with.
Addresses a concern voiced in D111264#3115104.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D113691
Aaron Puchert [Fri, 12 Nov 2021 20:09:16 +0000 (21:09 +0100)]
Comment AST: Find out if function is variadic in DeclInfo::fill
Then we don't have to look into the declaration again. Also it's only
natural to collect this information alongside parameters and return
type, as it's also just a parameter in some sense.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D113690
Peter Hawkins [Fri, 12 Nov 2021 20:02:18 +0000 (12:02 -0800)]
Don't define //mlir:MLIRBindingsPythonCore in terms of the NoCAPI and CAPIDeps targets.
We noticed that the library structure causes link ordering problems in Google's internal build. However, we don't think the problem is specific to Google's build; it can probably be reproduced anywhere with the right library structure.
In general splitting the Python bindings from their dependencies (the C API targets) creates the possibility that the two libraries might end up in the wrong order on the linker command line. We can avoid this problem happening by reverting the structure of the MLIRBindingsPythonCore to represent its dependencies in the usual way, rather than composing an incomplete `MLIRBindingsPythonCoreNoCAPI` target and their CAPI dependencies. It was probably a mistake to rewrite this particular `cc_library()` rule in terms of the two, since nothing guarantees that the two will be correctly ordered by the linker when both are being linked into the same binary, and it was only an incidental "cleanup" done in passing.
Otherwise the previous PR (D113565) is fine, since that was about the case where both are being built into two separate shared libraries. It just shouldn't have made this (unrelated) change.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D113773
Jez Ng [Fri, 12 Nov 2021 20:00:51 +0000 (15:00 -0500)]
[reland][lld-macho] Fix symbol relocs handling for compact unwind's functionAddress
Clang seems to emit all functionAddress relocs as section relocs, but
`ld -r` can turn those relocs into symbol ones. It turns out that we
weren't handling that case correctly when the symbol was a weak def
whose definition did not prevail.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D113702
Jacques Pienaar [Fri, 12 Nov 2021 19:46:14 +0000 (11:46 -0800)]
[mlir][shape] Add value_as_shape op
Part of the very first discussion here, but didn't upstream it before as we
didn't use it yet. Fix that for pending updates. Just adding the op here,
follow up will add the lowering to codegen.
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 02:51:31 +0000 (18:51 -0800)]
Sema: const-qualify ParsedAttr::iterator::operator*()
`const`-qualify ParsedAttr::iterator::operator*(), clearing up confusion
about the two meanings of const for pointers/iterators. Helps unblock
removal of (non-const) iterator_facade_base::operator->().
Duncan P. N. Exon Smith [Fri, 12 Nov 2021 02:20:52 +0000 (18:20 -0800)]
IR: Avoid duplication of SwitchInst::findCaseValue(), NFC
Change the non-const version of findCaseValue() to forward to the const
version.
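The forwarding pattern described above can be sketched as follows. This is a minimal illustration only: `CaseTable` and its members are hypothetical stand-ins, not the real `SwitchInst` API, and the actual patch may differ in detail.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for a switch-like container; not the real SwitchInst.
class CaseTable {
  std::vector<int> Values;

public:
  explicit CaseTable(std::vector<int> V) : Values(std::move(V)) {}

  // Single source of truth: the const overload performs the search.
  const int *findCaseValue(int V) const {
    for (const int &Candidate : Values)
      if (Candidate == V)
        return &Candidate;
    return nullptr; // Not found.
  }

  // The non-const overload forwards to the const one and casts the
  // constness away, which is safe because *this is known to be non-const.
  int *findCaseValue(int V) {
    return const_cast<int *>(
        static_cast<const CaseTable &>(*this).findCaseValue(V));
  }
};
```

With this shape, the search logic lives in one place and the two overloads cannot drift apart.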
Philip Reames [Fri, 12 Nov 2021 19:35:28 +0000 (11:35 -0800)]
[unroll] Keep unrolled iterations with initial iteration
The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.
With unrolling, the general assumption is that a) the loop is reasonably hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting. As such, placing Count-1 copies out of line is a fairly poor code placement choice. We'd much rather fall through into the hot (non-exiting) path. For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.
However, the real motivation for this change isn't performance. It's readability and human understanding. Having to jump around long distances in an IR file to trace an unrolled loop structure is error-prone and tedious.
Peter Klausler [Tue, 2 Nov 2021 20:01:38 +0000 (13:01 -0700)]
[flang] Runtime performance improvements to real formatted input
Profiling a basic internal real input read benchmark shows some
hot spots in the code used to prepare input for decimal-to-binary
conversion, which is of course where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but not the code in the Fortran runtime that calls it,
and there are some obvious light changes worth making here.
Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.
Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes -- so that input routines can have the
ability to acquire more than one input character at a time
and amortize overhead.
These changes speed up the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which is on par with Intel Fortran and much faster than
GNU Fortran.
Differential Revision: https://reviews.llvm.org/D113697
Keith Smiley [Wed, 10 Nov 2021 05:28:56 +0000 (21:28 -0800)]
[lld-macho] Fix trailing slash in oso_prefix
Previously if you passed `-oso_prefix path/to/foo/` with a trailing
slash at the end, using `real_path` would remove that slash, but that
slash is necessary to make sure OSO prefix paths end up as valid
relative paths instead of starting with `/`.
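To illustrate why the trailing slash matters, here is a minimal sketch of prefix stripping. The helper `stripOsoPrefix` is hypothetical and is not lld's actual code; it only assumes that the prefix is removed from the front of the path as a plain string.

```cpp
#include <string>

// Hypothetical helper (not lld's actual code): drop Prefix from the front of
// Path when it matches, otherwise return Path unchanged.
std::string stripOsoPrefix(const std::string &Path, const std::string &Prefix) {
  if (Path.compare(0, Prefix.size(), Prefix) == 0)
    return Path.substr(Prefix.size());
  return Path;
}
```

Under this assumption, stripping `/path/to/foo` from `/path/to/foo/bar.o` leaves `/bar.o`, which looks like an absolute path, while stripping `/path/to/foo/` leaves the relative path `bar.o`.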
Differential Revision: https://reviews.llvm.org/D113541
Duncan P. N. Exon Smith [Thu, 4 Nov 2021 00:52:34 +0000 (17:52 -0700)]
ADT: Fix const-correctness of iterator adaptors
This fixes const-correctness of iterator adaptors, dropping non-`const`
overloads for `operator*()`.
Iterators, like the pointers that they generalize, have two types of
`const`.
The `const` qualifier on members indicates whether the iterator itself
can be changed. This is analogous to `int *const`.
The `const` qualifier on return values of `operator*()`, `operator[]()`,
and `operator->()` controls whether the pointed-to value can be
changed. This is analogous to `const int *`.
Since `operator*()` does not (in principle) change the iterator, then
there should only be one definition, which is `const`-qualified. E.g.,
iterators wrapping `int*` should look like:
```
int *operator*() const; // always const-qualified, no overloads
```
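As a slightly fuller sketch of the same idea, consider a hypothetical iterator over `int` elements (`IntPtrIterator` is an illustration, not an LLVM class; in this sketch dereferencing yields `int &`). Only the operations that change the iterator itself are non-`const`.

```cpp
// Hypothetical minimal iterator over int (illustration only, not an LLVM class).
class IntPtrIterator {
  int *Ptr;

public:
  explicit IntPtrIterator(int *P) : Ptr(P) {}

  // Dereferencing does not modify the iterator, so one const-qualified
  // definition suffices; it still yields a mutable reference to the element.
  int &operator*() const { return *Ptr; }

  // Advancing does modify the iterator, so it is non-const.
  IntPtrIterator &operator++() {
    ++Ptr;
    return *this;
  }
};
```

A `const IntPtrIterator` can therefore still write through `*It`, just as an `int *const` can, but it cannot be incremented.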
ba7a6b314fd14bb2c9ff5d3f4fe2b6525514cada changed `iterator_adaptor_base`
away from this to work around bugs in other iterator adaptors. That was
already reverted. This patch adds back its test, which combined
llvm::enumerate() and llvm::make_filter_range(), adds a test for
iterator_adaptor_base itself, and cleans up the `const`-ness of the
other iterator adaptors.
This also updates the documented requirements for
`iterator_facade_base`:
```
/// OLD:
/// - const T &operator*() const;
/// - T &operator*();
/// New:
/// - T &operator*() const;
```
In a future commit we might also clean up `iterator_facade`'s overloads
of `operator->()` and `operator[]()`. These already (correctly) return
non-`const` proxies regardless of the iterator's `const` qualifier.
Differential Revision: https://reviews.llvm.org/D113158
Philip Reames [Fri, 12 Nov 2021 19:15:57 +0000 (11:15 -0800)]
(re-)Autogen one last unroll-and-jam test
This case was complicated because someone had added new non-autogened tests to an autogened file. In particular, those new tests used two variables (%J and %j) which differed only in capitalization. The auto-updater doesn't distinguish case, so this meant auto-gened versions of the new tests failed with non-obvious errors.
There are two key lessons here:
1) Please don't use two values which differ only in case. This is problematic for automatic tooling, but is also hard to understand for a human.
2) Please DO NOT add new tests to an autogened test without running autogen again. If autogen doesn't pass on your new tests, put them in a separate file.
Peter Klausler [Wed, 10 Nov 2021 22:02:30 +0000 (14:02 -0800)]
[flang] Fix rounding edge case in F output editing
When an Fw.d output edit descriptor has a "d" value exactly
equal to the number of zeroes after the decimal point for a value
(e.g., 0.07 with F5.1), the Fw.d output editing code needs to
do the rounding itself to either 0.0 or 0.1 after performing
a conversion without rounding (to avoid 0.04999 rounding up twice).
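The double-rounding hazard can be sketched with scaled integers. This is an illustration under stated assumptions, not the flang runtime's code: the helper names are hypothetical, values are non-negative and expressed in units of 1e-5 (so 0.04999 is 4999), and halves round up.

```cpp
#include <cstdint>

// Hypothetical helper: divide a non-negative scaled integer by Divisor,
// rounding halves up.
std::int64_t roundedDivide(std::int64_t Value, std::int64_t Divisor) {
  return (Value + Divisor / 2) / Divisor;
}

// Value is in units of 1e-5; the result is in tenths (units of 1e-1).
// A single rounding step from the unrounded digits.
std::int64_t roundOnceToTenths(std::int64_t Value) {
  return roundedDivide(Value, 10000);
}

// Round to hundredths first, then to tenths: the intermediate result can
// land on a half and get rounded up a second time.
std::int64_t roundTwiceToTenths(std::int64_t Value) {
  std::int64_t Hundredths = roundedDivide(Value, 1000);
  return roundedDivide(Hundredths, 10);
}
```

Here `roundOnceToTenths(4999)` gives 0 tenths (0.0), whereas `roundTwiceToTenths(4999)` first produces 5 hundredths and then 1 tenth (0.1), which is the erroneous result. For 0.07, `roundOnceToTenths(7000)` gives 1 tenth (0.1) as expected.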
Differential Revision: https://reviews.llvm.org/D113698
Alfsonso Gregory [Fri, 12 Nov 2021 18:53:50 +0000 (13:53 -0500)]
[libc++][NFC] Resolve Python 2 FIXME
We don't use Python 2 anymore, so let us do the recommended fix instead
of using the workaround made for Python 2.
Differential Revision: https://reviews.llvm.org/D107715
Peter Klausler [Wed, 10 Nov 2021 19:55:46 +0000 (11:55 -0800)]
[flang] Respect NO_STOP_MESSAGE=1 in runtime
When an environment variable NO_STOP_MESSAGE=1 is set,
assume that STOP statements with a successful code
have QUIET=.TRUE.
Differential Revision: https://reviews.llvm.org/D113701
Lang Hames [Fri, 12 Nov 2021 18:28:38 +0000 (10:28 -0800)]
[ORC-RT][llvm-jitlink] Fix a buggy check in ORC-RT MachO TLV deregistration.
The check was failing because it was matching against the end of the range, not
the start.
This bug wasn't causing the ORC-RT MachO TLV regression test to fail because
we were only logging deallocation errors (including TLV deregistration errors)
and not actually returning a failure code. This commit updates llvm-jitlink to
report the errors properly.
Lang Hames [Fri, 12 Nov 2021 16:46:03 +0000 (08:46 -0800)]
[JITLink] Fix think-o in handwritten CWrapperFunctionResult -> Error converter.
We need to skip the length field when generating error strings.
No test case: This hand-hacked deserializer should be removed in the near future
once JITLink can use generic ORC APIs (including SPS and WrapperFunction).
Philip Reames [Fri, 12 Nov 2021 18:30:27 +0000 (10:30 -0800)]
Autogen a bunch of unrolling tests for ease of update