Andrew Grieve [Wed, 22 Jul 2020 19:53:57 +0000 (12:53 -0700)]
asan_device_setup's wrapper scripts not handling args with spaces correctly
Summary: Came up in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1103108#c21
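The class of bug named in the subject line is easy to reproduce outside the wrapper scripts. The following is a minimal Python sketch, not the actual asan_device_setup code: it assumes a wrapper that flattens its arguments into one string and lets the result be re-split on whitespace, and contrasts that with quoting each argument first. Both function names are hypothetical.

```python
import shlex


def forward_flattened(args):
    # Buggy pattern (assumed for illustration): join the arguments into one
    # string and split on whitespace again, losing the original boundaries.
    return " ".join(args).split()


def forward_quoted(args):
    # Safer pattern: quote each argument before joining, so that a
    # shell-style split recovers exactly the original arguments.
    return shlex.split(" ".join(shlex.quote(a) for a in args))


args = ["--flag", "value with spaces"]
print(forward_flattened(args))  # the second argument is broken apart
print(forward_quoted(args))     # the original two arguments survive
```

Passing the argument list through unchanged, without ever building a single string, avoids the problem entirely where the interface allows it.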
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D84237
David Blaikie [Sun, 12 Jul 2020 22:36:56 +0000 (15:36 -0700)]
Merge some of the PCH object support with modular codegen
I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed.
The D48426 implementation worked by setting a module level flag; then, when code generating contents from the PCH, a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag.
This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module.
One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test to decide if a decl should be emitted, "is the module where this decl is serialized the current main file", has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj.
Otherwise the whole thing is basically streamlined down to the modular code generation path.
This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior.
[If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?]
Reviewers: llunak, hans
Differential Revision: https://reviews.llvm.org/D83652
David Green [Wed, 22 Jul 2020 19:43:02 +0000 (20:43 +0100)]
[ARM] Fix missing MVE_VMUL_qr predicate
This was missed out of
1030e82598da, but hopefully fixes the issues
reported with NEON accidentally generating MVE instructions.
Thomas Raoux [Wed, 22 Jul 2020 19:16:29 +0000 (12:16 -0700)]
[mlir][linalg] Add vectorization transform for CopyOp
CopyOp gets vectorized to vector.transfer_read followed by vector.transfer_write
Differential Revision: https://reviews.llvm.org/D83739
Louis Dionne [Wed, 22 Jul 2020 19:24:16 +0000 (15:24 -0400)]
[libc++] Workaround broken support for C++17 in GCC 5
Pete Steinfeld [Wed, 22 Jul 2020 18:33:35 +0000 (11:33 -0700)]
[flang] Fix an assert when RESHAPE() is called on empty strings
Summary:
When a constant array of empty strings goes through constant folding, the result
is something that contains no bytes. If this array is passed to the intrinsic
function `RESHAPE()`, we were not handling things correctly. I fixed this by
checking for an empty destination when calling the function `CopyFrom()` on an
array of strings.
I also added a test with a couple of different examples that trigger the
problem.
Reviewers: klausler, tskeith, DavidTruby
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84352
Andrew Litteken [Wed, 22 Jul 2020 17:15:36 +0000 (10:15 -0700)]
[CGP] Add Pass Dependencies
Add pass dependencies:
- TargetTransformInfoWrapperPass
- TargetPassConfig
- LoopInfoWrapperPass
- TargetLibraryInfoWrapperPass
To fix inconsistencies when passes are added to the pipeline.
Reviewers: efriedma, kmclaughlin, paquette
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84346
Louis Dionne [Wed, 22 Apr 2020 14:23:38 +0000 (10:23 -0400)]
[libc++] Add static_assert to make sure rate limiter doesn't use locks
We want to be sure that atomic<size_t> is always lock-free, or the code
will be much slower than expected (and could even conceivably fail if
the lock implementation somehow calls back into libc++abi).
Louis Dionne [Wed, 22 Apr 2020 15:15:05 +0000 (11:15 -0400)]
[libc++] Build the dylib with C++17 to allow aligned new/delete
This allows simplifying the implementation of barriers.
This is a re-commit of
1ac403bd145d, which had to be reverted in
64a9c944fc45 because the minimum CMake version wasn't high enough.
Now that we've upgraded, we can do this.
Differential Revision: https://reviews.llvm.org/D75243
LLVM GN Syncbot [Wed, 22 Jul 2020 18:37:02 +0000 (18:37 +0000)]
[gn build] Port
418121c30a8
Jonas Devlieghere [Wed, 22 Jul 2020 18:32:18 +0000 (11:32 -0700)]
[lldb] Use std::make_unique<DynamicRegisterInfo> (NFC)
Nikita Popov [Wed, 22 Jul 2020 18:18:13 +0000 (20:18 +0200)]
[SCCP] Add multi-edge switch + phi test case (NFC)
Amy Kwan [Wed, 22 Jul 2020 17:16:08 +0000 (12:16 -0500)]
[PowerPC][Power10] Fix the Test LSB by Byte (xvtlsbb) Builtins Implementation
The implementation of the xvtlsbb builtins/intrinsics was not correct, as the
intrinsics previously used i1 as an argument type. This patch changes the i1
argument type used in these intrinsics to be i32 instead, as having the second
argument as an i1 can lead to issues in the backend.
Differential Revision: https://reviews.llvm.org/D84291
Simon Pilgrim [Wed, 22 Jul 2020 18:00:28 +0000 (19:00 +0100)]
DwarfCompileUnit.cpp - remove duplicate includes that already exist in DwarfCompileUnit.h. NFC.
Also remove DIE.h include from DwarfCompileUnit.h and replace with forward declarations.
Simon Pilgrim [Wed, 22 Jul 2020 17:02:43 +0000 (18:02 +0100)]
CodeViewDebug.cpp - remove duplicate includes that already exist in CodeViewDebug.h. NFC.
Louis Dionne [Wed, 22 Apr 2020 15:15:05 +0000 (11:15 -0400)]
[CMake] Bump CMake minimum version to 3.13.4
This upgrade should be friction-less because we've already been ensuring
that CMake >= 3.13.4 is used.
This is part of the effort discussed on llvm-dev here:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html
Differential Revision: https://reviews.llvm.org/D78648
Hans Wennborg [Wed, 22 Jul 2020 18:12:18 +0000 (20:12 +0200)]
Revert "Enable -Wsuggest-override in the LLVM build" and the follow-ups.
After lots of follow-up fixes, there are still problems, such as
-Wno-suggest-override getting passed to the Windows Resource Compiler
because it was added with add_definitions in the CMake file.
Rather than piling on another fix, let's revert so this can be re-landed
when there's a proper fix.
This reverts commit
21c0b4c1e8d6a171899b31d072a47dac27258fc5.
This reverts commit
81d68ad27b29b1e6bc93807c6e42b14e9a77eade.
This reverts commit
a361aa5249856e333a373df90947dabf34cd6aab.
This reverts commit
fa42b7cf2949802ff0b8a63a2e111a2a68711067.
This reverts commit
955f87f947fda3072a69b0b00ca83c1f6a0566f6.
This reverts commit
8b16e45f66e24e4c10e2cea1b70d2b85a7ce64d5.
This reverts commit
308a127a38d1111f3940420b98ff45fc1c17715f.
This reverts commit
274b6b0c7a8b584662595762eaeff57d61c6807f.
This reverts commit
1c7037a2a5576d0bb083db10ad947a8308e61f65.
Mircea Trofin [Wed, 22 Jul 2020 18:16:08 +0000 (11:16 -0700)]
[llvm][NFC] Remove definition from build system of LLVM_HAVE_TF_AOT
We can just use the definition from config.h. This means we need to move
a few lines around in CMakeLists.txt - the TF_AOT detection needs to be
before the spot we process the config.h.cmake files.
Differential Revision: https://reviews.llvm.org/D84349
Matt Arsenault [Fri, 10 Jul 2020 17:57:11 +0000 (13:57 -0400)]
AArch64: Use Register
Matt Arsenault [Thu, 9 Jul 2020 00:36:48 +0000 (20:36 -0400)]
GlobalISel: Don't use virtual for distinguishing arg handlers
There's no reason to involve the hassle of a virtual method targets
have to override for a simple boolean.
Not sure exactly what's going on with Mips, but it seems to define its
own totally separate handler classes.
Nico Weber [Wed, 22 Jul 2020 18:10:17 +0000 (14:10 -0400)]
[gn build] (manually) port
746b5fad5b
Joel E. Denny [Wed, 22 Jul 2020 18:04:58 +0000 (14:04 -0400)]
[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)
This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier. The previous patch in this series implements Clang
front end support. See that patch summary for behaviors that are not
yet supported.
Reviewed By: grokos, jdoerfert
Differential Revision: https://reviews.llvm.org/D83062
Adrian Prantl [Wed, 22 Jul 2020 18:01:16 +0000 (11:01 -0700)]
Fix Windows build
Matt Arsenault [Wed, 22 Jul 2020 16:27:50 +0000 (12:27 -0400)]
AMDGPU: Don't assert on f16 inv2pi immediates pre-gfx8
v_cvt_f32_f16 can still accept this value as a literal constant. This
showed up in GlobalISel since it doesn't have constant folding for
G_FPEXT.
Logan Smith [Wed, 22 Jul 2020 17:49:05 +0000 (10:49 -0700)]
[clangd] Disable -Wsuggest-override for unittests/
Benjamin Kramer [Wed, 22 Jul 2020 16:18:50 +0000 (18:18 +0200)]
[mlir][Vector] Vectorize integer matmuls
The underlying infrastructure supports this already, just add the
pattern matching for linalg.generic.
Differential Revision: https://reviews.llvm.org/D84335
Alex Richardson [Wed, 22 Jul 2020 17:32:34 +0000 (18:32 +0100)]
[libcxx] Fix default argument for merge_archives.py -L flag
If we use the default of None, we get a python exception in
find_and_diagnose_missing() instead of printing a sensible error message.
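The failure mode can be sketched with a few lines of argparse. This is an illustration under assumptions, not the code from merge_archives.py: the option destination `search_paths` and the helper `count_search_paths` are hypothetical stand-ins for whatever iterates over the -L values.

```python
import argparse


def build_parser(default):
    parser = argparse.ArgumentParser()
    # Hypothetical repeatable "-L <dir>" search-path option.
    parser.add_argument("-L", dest="search_paths", action="append",
                        default=default)
    return parser


def count_search_paths(search_paths):
    # Iterating over None raises TypeError; an empty list iterates zero times.
    return sum(1 for _ in search_paths)


with_none = build_parser(None).parse_args([])
with_list = build_parser([]).parse_args([])
print(with_none.search_paths, with_list.search_paths)
```

With an empty-list default, code that loops over the search paths simply finds nothing and can go on to report a normal "not found" error.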
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D84342
Matt Arsenault [Wed, 8 Jul 2020 13:11:53 +0000 (09:11 -0400)]
GlobalISel: Restructure argument lowering loop in handleAssignments
This was structured in a way that implied every split argument is in
memory, or in registers. It is possible to pass an original argument
partially in registers, and partially in memory. Transpose the logic
here to only consider a single piece at a time. Every individual
CCValAssign should be treated independently, and any merge to original
value needs to be handled later.
This is in preparation for merging some preprocessing hacks in the
AMDGPU calling convention lowering into the generic code.
I'm also not sure what the correct behavior is for memlocs where the
promoted size is larger than the original value. I've opted to clamp
the memory access size to not exceed the value register to avoid the
explicit trunc/extend/vector widen/vector extract instruction. This
happens for AMDGPU for i8 arguments that end up stack passed, which
are promoted to i16 (I think this is a preexisting DAG bug though, and
they should not really be promoted when in memory).
Matt Arsenault [Wed, 22 Jul 2020 01:50:13 +0000 (21:50 -0400)]
AMDGPU: Add IntrWillReturn to llvm.amdgcn.atomic.csub
Gui Andrade [Wed, 22 Jul 2020 16:48:51 +0000 (16:48 +0000)]
[Sanitizers] Add interceptor for xdrrec_create
For now, xdrrec_create is only intercepted on Linux as its signature
is different on Solaris.
The method of intercepting xdrrec_create isn't super ideal but I
couldn't think of a way around it: Using an AddrHashMap combined
with wrapping the userdata field.
We can't just allocate a handle on the heap in xdrrec_create and leave
it at that, since there'd be no way to free it later. This is because it
doesn't seem to be possible to access handle from the XDR struct, which
is the only argument to xdr_destroy.
On the other hand, the callbacks don't have a way to get at the
x_private field of XDR, which is what I chose for the HashMap key. So we
need to wrap the handle parameter of the callbacks. But we can't just
pass x_private as handle (as it hasn't been set yet). We can't put the
wrapper struct into the HashMap and pass its pointer as handle, as the
key we need (x_private again) hasn't been set yet.
So I allocate the wrapper struct on the heap, pass its pointer as
handle, and put it into the HashMap so xdr_destroy can find it later and
destroy it.
Differential Revision: https://reviews.llvm.org/D83358
Fangrui Song [Wed, 22 Jul 2020 17:15:51 +0000 (10:15 -0700)]
[profile][test] Add -fuse-ld=bfd to make instrprof-lto-pgogen.c robust
Otherwise if 'ld' is an older system LLD (FreeBSD; or if someone adds 'ld' to
point to an LLD from a different installation) which does not support the
current ModuleSummaryIndex::BitCodeSummaryVersion, the test will fail.
Add lit feature 'binutils_lto'. GNU ld is more common than GNU gold, so
we can just require 'is_binutils_lto_supported' to additionally support GNU ld.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D84133
Matt Arsenault [Wed, 22 Jul 2020 17:05:51 +0000 (13:05 -0400)]
AMDGPU/GlobalISel: Fix translation of indirect calls
Thomas Lively [Wed, 22 Jul 2020 17:12:26 +0000 (10:12 -0700)]
[WebAssembly] Autogenerate checks in simd-offset.ll
Implementing new functionality tested in this file requires adding new
tests for many IR addressing patterns, which can be a large
maintenance burden. This patch makes adding tests easier by switching
to using autogenerated checks. This patch also removes the testing
mode that has simd128 disabled because it would produce very large
checks and is not particularly interesting.
Differential Revision: https://reviews.llvm.org/D84288
Tarindu Jayatilaka [Wed, 22 Jul 2020 16:52:53 +0000 (09:52 -0700)]
Reapply "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"
(This reverts commit
a5e0194709c40212694370e0ea789a1ca14548b5, and
corrects author).
Rename the pass to be able to extend it to function properties other than inliner features.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D82044
Logan Smith [Wed, 22 Jul 2020 17:03:49 +0000 (10:03 -0700)]
Only enable -Wsuggest-override if it doesn't suggest adding override to functions that are already final
A previous patch added -Wsuggest-override using a simple add_flag_if_supported(). This causes lots of warnings in LLVM when building with older GCC versions (< 9.2) which suggest adding override to functions that are only marked final. The current flags in both GCC >=9.2 and Clang accept plain final as equivalent to override final.
This patch adds logic to detect versions of -Wsuggest-override that warn on void foo() final and disables them to avoid warning spam in builds using older GCC's. This has the added minor benefit of getting rid of the useless C_SUPPORTS_SUGGEST_OVERRIDE_FLAG CMake cache variable which was set by add_flag_if_supported().
Differential Revision: https://reviews.llvm.org/D84292
LLVM GN Syncbot [Wed, 22 Jul 2020 16:56:06 +0000 (16:56 +0000)]
[gn build] Port
a5e0194709c
LLVM GN Syncbot [Wed, 22 Jul 2020 16:56:05 +0000 (16:56 +0000)]
[gn build] Port
2a6c871596c
Jonas Devlieghere [Wed, 22 Jul 2020 16:51:24 +0000 (09:51 -0700)]
[lldb] Cleanup CommandObject registration (NFC)
- Remove the spurious argument to `CommandObjectScript`.
- Use make_shared instead of bare `new`.
- Move code duplication behind a macro.
Differential revision: https://reviews.llvm.org/D84336
Fangrui Song [Wed, 22 Jul 2020 16:49:08 +0000 (09:49 -0700)]
[gn build] Handle X86InstCombineIntrinsic.cpp in
2a6c871596ce
Gui Andrade [Wed, 22 Jul 2020 16:34:55 +0000 (16:34 +0000)]
[MSAN] Instrument libatomic load/store calls
These calls are neither intercepted by compiler-rt nor is libatomic.a
naturally instrumented.
This patch uses the existing libcall mechanism to detect a call
to atomic_load or atomic_store, and instruments them much like
the preexisting instrumentation for atomics.
Calls to _load are modified to have at least Acquire ordering, and
calls to _store at least Release ordering. Because this needs to be
converted at runtime, msan injects a LUT (implemented as a vector
with extractelement).
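The lookup-table idea can be sketched as follows. This is an illustrative Python model, not the instrumentation itself: the numeric values follow the usual C ABI encoding of memory orders (relaxed = 0 through seq_cst = 5), and the table contents are assumptions chosen so that every entry is at least Acquire for loads and at least Release for stores.

```python
# C ABI memory-order encoding, as passed to the libatomic calls.
RELAXED, CONSUME, ACQUIRE, RELEASE, ACQ_REL, SEQ_CST = range(6)

# Tables indexed by the requested ordering. The entries are assumptions for
# illustration; each one is the weakest ordering that is both at least as
# strong as the request and at least Acquire (loads) or Release (stores).
ADD_ACQUIRE = [ACQUIRE, ACQUIRE, ACQUIRE, ACQ_REL, ACQ_REL, SEQ_CST]
ADD_RELEASE = [RELEASE, RELEASE, ACQ_REL, RELEASE, ACQ_REL, SEQ_CST]


def strengthen_load(order):
    # Runtime lookup; the analogue of an extractelement on a constant vector.
    return ADD_ACQUIRE[order]


def strengthen_store(order):
    return ADD_RELEASE[order]


print(strengthen_load(RELAXED), strengthen_store(RELAXED))
```

A table is needed because the ordering argument is a runtime value in the call, so it cannot be rewritten to a constant at instrumentation time.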
Differential Revision: https://reviews.llvm.org/D83337
Mircea Trofin [Wed, 22 Jul 2020 16:42:17 +0000 (09:42 -0700)]
Revert "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"
This reverts commit
44a6bda19b40f2dfcbe92fc3d58bb6276c71ef78. I forgot
to correctly attribute it to tarinduj. Fixing and resubmitting.
Fangrui Song [Wed, 22 Jul 2020 16:40:41 +0000 (09:40 -0700)]
David Green [Wed, 22 Jul 2020 16:30:02 +0000 (17:30 +0100)]
[ARM] Add predicated add reduction patterns
Given a vecreduce.add(select(p, x, 0)), we can convert that to a
predicated vaddv, as the else value for the select is the identity
value, a zero. That is what this patch does for the vaddv, vaddva,
vaddlv and vaddlva instructions, copying the existing patterns to also
handle predication through a select.
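The equivalence behind these patterns can be checked with a small scalar model. The sketch below is plain Python over lists, not MVE code; `predicated_add_reduce` is a hypothetical stand-in for a predicated vaddv, in which inactive lanes contribute nothing to the sum.

```python
def select(pred, on_true, on_false):
    # Lane-wise select over equally sized lists.
    return [t if p else f for p, t, f in zip(pred, on_true, on_false)]


def vecreduce_add(values):
    return sum(values)


def predicated_add_reduce(pred, values):
    # Model of a predicated reduction: only active lanes are added.
    return sum(v for p, v in zip(pred, values) if p)


p = [True, False, True, True]
x = [5, 7, -2, 10]
zeros = [0] * len(x)

print(vecreduce_add(select(p, x, zeros)), predicated_add_reduce(p, x))
```

The two results agree because zero is the identity for addition, so replacing an inactive lane with zero is the same as skipping it.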
Differential Revision: https://reviews.llvm.org/D84101
Cullen Rhodes [Tue, 9 Jun 2020 13:19:35 +0000 (13:19 +0000)]
[Sema][AArch64] Add semantics for arm_sve_vector_bits attribute
Summary:
This patch implements semantics for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT).
Implemented in this patch is the behaviour described in section 3.7.3.2
and minimal parts of sections 3.7.3.3 and 3.7.3.4, this includes:
* Defining VLST globals, structs, unions, and local variables
* Implicit casting between VLAT <=> VLST.
* Diagnosis of ill-formed conditional expressions of the form:
C ? E1 : E2
where E1 is a VLAT type and E2 is a VLST, or vice-versa. This
avoids any ambiguity about the nature of the result type (i.e. is
it sized or sizeless).
* For vectors:
* sizeof(VLST) == N/8
* alignof(VLST) == 16
* For predicates:
* sizeof(VLST) == N/64
* alignof(VLST) == 2
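As a worked example of the size and alignment rules listed above, the following Python sketch computes them for a given N. The function name `vlst_layout` is hypothetical, and the sketch does not validate which values of N are permitted.

```python
def vlst_layout(n_bits, is_predicate):
    """Return (sizeof, alignof) in bytes for a fixed-length SVE type."""
    if is_predicate:
        # Predicates hold one bit per vector byte: N/8 bits, i.e. N/64 bytes.
        return n_bits // 64, 2
    # Data vectors hold N bits, i.e. N/8 bytes.
    return n_bits // 8, 16


print(vlst_layout(512, is_predicate=False))
print(vlst_layout(512, is_predicate=True))
```

For N = 512 this gives a 64-byte vector aligned to 16 bytes and an 8-byte predicate aligned to 2 bytes.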
VLSTs have the same representation as VLATs in the AST but are wrapped
with a TypeAttribute. Scalable types are currently emitted in the IR for
uses such as globals and structs which don't support these types, this
is addressed in the next patch with codegen, where VLSTs are lowered to
sized arrays for globals, structs / unions and arrays.
Not implemented in this patch is the behaviour guarded by the feature
macros:
* __ARM_FEATURE_SVE_VECTOR_OPERATORS
* __ARM_FEATURE_SVE_PREDICATE_OPERATORS
As such, the GNU __attribute__((vector_size)) extension is not available
and operators such as binary '+' are not supported for VLSTs. Support
for this is intended to be addressed by later patches.
[1] https://developer.arm.com/documentation/100987/latest
This is patch 2/4 of a patch series.
Reviewers: sdesmalen, rsandifo-arm, efriedma, cameron.mcinally, ctetreau, rengolin, aaron.ballman
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D83551
Fangrui Song [Wed, 22 Jul 2020 16:23:52 +0000 (09:23 -0700)]
[ADT] Delete unused llvm::pointer_union_detail::AssignableFrom
Noticed by Zhiwei Chen
Mircea Trofin [Wed, 22 Jul 2020 16:24:15 +0000 (09:24 -0700)]
Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis
Rename the pass to be able to extend it to function properties other than inliner features.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D82044
Adrian Prantl [Tue, 21 Jul 2020 20:53:43 +0000 (13:53 -0700)]
Thread ExecutionContextScope through GetByteSize where possible (NFC-ish)
This patch has no effect for C and C++. In more dynamic languages,
such as Objective-C and Swift GetByteSize() needs to call into the
language runtime, so it's important to pass one in where possible. My
primary motivation for this is some work I'm doing on the Swift
branch, however, it looks like we are also seeing warnings in
Objective-C that this may resolve. Everything in the SymbolFile
hierarchy still passes in nullptrs, because we don't have an execution
context in SymbolFile, since SymbolFile transcends processes.
Differential Revision: https://reviews.llvm.org/D84267
Arthur Eubanks [Wed, 22 Jul 2020 15:40:55 +0000 (08:40 -0700)]
[NFC][NewPM] Add clarification on analysis manager proxies
Explain why you can only get a cached analysis result, not compute one
on the fly.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D84259
Simon Pilgrim [Wed, 22 Jul 2020 14:18:32 +0000 (15:18 +0100)]
ProfileSummaryInfo.h - remove unnecessary ProfileSummary forward declaration. NFCI.
This is defined in ProfileSummary.h which we have to explicitly include already.
Anton Afanasyev [Thu, 16 Jul 2020 14:57:33 +0000 (17:57 +0300)]
[SLP][Test] Precommit tests for D83779. NFC.
Joel E. Denny [Wed, 22 Jul 2020 15:22:08 +0000 (11:22 -0400)]
Revert "[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)"
This reverts commit
45b8f7ec35ef653bafdf48034857222517c17781.
It attempts to use debug macros `DPxMOD` and `DPxPTR` in release
builds. Will fix and reapply later.
Hans Wennborg [Wed, 22 Jul 2020 15:01:57 +0000 (17:01 +0200)]
Revert
abd45154b "[Coverage] Add comment to skipped regions"
This caused assertions during Chromium builds. See comment on the code review.
> Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757.
> Add comment to skipped regions so we don't track execution count for lines containing only comments.
>
> Differential Revision: https://reviews.llvm.org/D84208
This reverts commit
abd45154bdb6b76c5b480455eacc8c75b08242aa and the
follow-up
87d725473380652bbe845fd2fbd9c0507a55172f.
Sebastian Neubauer [Wed, 22 Jul 2020 15:00:43 +0000 (17:00 +0200)]
Fix target specific InstCombine
A clang arm test was failing if clang is compiled without arm support.
Regression was introduced in
2a6c871596ce8bdd23501a96fd22f0f16d3cfcad
SharmaRithik [Wed, 22 Jul 2020 14:42:57 +0000 (20:12 +0530)]
[CodeMoverUtils] Add more data dependency related test case
Summary: This patch adds more test cases focusing on data dependency.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D83543
Benson Li [Wed, 22 Jul 2020 14:22:59 +0000 (16:22 +0200)]
[lldb] add printing of stdout compile errors to lldbsuite
Summary: Add printing of the output of stdout during compile errors, in
addition to stderr output.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D83425
Georgii Rymar [Tue, 21 Jul 2020 10:13:01 +0000 (13:13 +0300)]
[llvm-readobj] - Don't get the name of the symbol table in ELFDumper<ELFT>::printSymbolsHelper.
It was requested in D84173 thread to not do it, because otherwise we extract and
check the name of the symbol table in LLVM style, but do not use it and
might report a warning which perhaps might be confusing.
Differential revision: https://reviews.llvm.org/D84231
Florian Hahn [Wed, 22 Jul 2020 13:53:22 +0000 (14:53 +0100)]
[SCEVExpander] Fix indentation/formatting (NFC).
The declarations inside the llvm namespace were indented too much. Fix
it by re-running clang-format on the whole file.
Dmitry Preobrazhensky [Wed, 22 Jul 2020 14:16:59 +0000 (17:16 +0300)]
[AMDGPU][MC] Corrected decoding of 16-bit literals
16-bit literals are encoded as 32-bit values. If the high 16 bits of the value are 0xFFFF, the decoded instruction cannot be reassembled.
For example, the following code
0xff,0x04,0x04,0x52,0xcd,0xab,0xff,0xff
was decoded as
v_mul_lo_u16_e32 v2, 0xffffabcd, v2
However this literal is actually a 64-bit constant 0x00000000ffffabcd which violates requirements described in the documentation - the truncation is not safe.
This change corrects decoding to make reassembly possible.
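The arithmetic in the example can be reproduced with a short Python sketch. It assumes the last four bytes of the encoding are the literal in little-endian order, which matches the decoded value shown above, and it takes "safe truncation" to mean that the value is representable as either a signed or an unsigned 16-bit integer. It does not model what the corrected disassembler prints.

```python
encoding = bytes([0xff, 0x04, 0x04, 0x52, 0xcd, 0xab, 0xff, 0xff])

# Assumption: the trailing four bytes hold the 32-bit literal, little-endian.
literal = int.from_bytes(encoding[4:], "little")


def fits_in_16_bits(value):
    # Assumed reading of "safe": representable as int16 or as uint16.
    return -(1 << 15) <= value < (1 << 16)


low_half = literal & 0xffff
print(hex(literal), fits_in_16_bits(literal))
print(hex(low_half), fits_in_16_bits(low_half))
```

Read as a non-negative 64-bit constant, 0xffffabcd is far outside both 16-bit ranges, whereas its low half 0xabcd fits.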
Reviewers: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D84098
David Carlier [Wed, 22 Jul 2020 14:15:45 +0000 (15:15 +0100)]
[compiler-rt] fix build on Illumos
- there are additional fields for glob_t struct, thus size check is failing.
- to access old mman.h api based on caddr_t, _XOPEN_SOURCE needs to be not defined
thus we provide the prototype.
- prxmap_t constified.
Reviewers: ro, eugenis
Reviewed-By: ro
Differential Revision: https://reviews.llvm.org/D84046
Joel E. Denny [Wed, 22 Jul 2020 14:14:30 +0000 (10:14 -0400)]
[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)
This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier. The previous patch in this series implements Clang
front end support. See that patch summary for behaviors that are not
yet supported.
Reviewed By: grokos, jdoerfert
Differential Revision: https://reviews.llvm.org/D83062
Joel E. Denny [Wed, 22 Jul 2020 14:14:00 +0000 (10:14 -0400)]
[OpenMP] Implement TR8 `present` map type modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` map type modifier. The next patch in this series implements
OpenMP runtime support.
This patch does not attempt to implement TR8 sec. 2.22.7.1 "map
Clause", p. 319, L14-16:
> If a map clause with a present map-type-modifier is present in a map
> clause, then the effect of the clause is ordered before all other
> map clauses that do not have the present modifier.
Compare to L10-11, which Clang does not appear to implement yet:
> For a given construct, the effect of a map clause with the to, from,
> or tofrom map-type is ordered before the effect of a map clause with
> the alloc, release, or delete map-type.
This patch also does not implement the `present` implicit-behavior for
`defaultmap` or the `present` motion-modifier for `target update`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D83061
Stefan Pintilie [Tue, 21 Jul 2020 19:29:54 +0000 (14:29 -0500)]
[PowerPC] Add linker opt for PC Relative GOT indirect accesses
A linker optimization is available on PowerPC for GOT indirect PCRelative loads.
The idea is that we can mark a usual GOT indirect load:
pld 3, vec@got@pcrel(0), 1
lwa 3, 4(3)
With a relocation to say that if we don't need to go through the GOT we can let
the linker further optimize this and replace a load with a nop.
pld 3, vec@got@pcrel(0), 1
.Lpcrel1:
.reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
lwa 3, 4(3)
This patch adds the logic that allows the compiler to add the R_PPC64_PCREL_OPT.
Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D79864
jasonliu [Fri, 17 Jul 2020 18:40:02 +0000 (18:40 +0000)]
[XCOFF] Enable symbol alias for AIX
Summary:
AIX assembly's .set directive is not usable for aliasing purposes.
We need to use the extra-label-at-definition strategy to generate symbol
aliasing on AIX.
Reviewed By: DiggerLin, Xiangling_L
Differential Revision: https://reviews.llvm.org/D83252
Sebastian Neubauer [Wed, 3 Jun 2020 13:56:40 +0000 (15:56 +0200)]
[InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.
D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).
This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.
This allows moving about 3000 lines out of InstCombine to the targets.
Differential Revision: https://reviews.llvm.org/D81728
Simon Pilgrim [Wed, 22 Jul 2020 13:12:36 +0000 (14:12 +0100)]
DebugSubsectionVisitor.h - remove unnecessary includes/forward declarations. NFC.
We don't need the StringsAndChecksumsRef forward declaration as we have to include StringsAndChecksums.h.
We don't need DebugSubsectionRecord.h and we forward declare all referenced classes.
We don't need to include cstdint as we don't use any stdint types.
Simon Pilgrim [Wed, 22 Jul 2020 12:21:45 +0000 (13:21 +0100)]
SelectionDAGBuilder.cpp - remove duplicate includes that already exist in SelectionDAGBuilder.h. NFC.
Simon Pilgrim [Mon, 20 Jul 2020 15:14:22 +0000 (16:14 +0100)]
MappedBlockStream.h - remove unnecessary MSFLayout forward declaration. NFCI.
This is defined in MSFCommon.h which we have to explicitly include already.
Alexey Bataev [Wed, 22 Jul 2020 13:03:30 +0000 (09:03 -0400)]
[SLP]Add an extra test for vectorization of non-pow-2 trees, NFC.
Roman Lebedev [Wed, 22 Jul 2020 13:09:51 +0000 (16:09 +0300)]
[NFC][Reduce] Add a test showing that we fail to reduce single/last feature
Roman Lebedev [Wed, 22 Jul 2020 13:07:13 +0000 (16:07 +0300)]
[NFC][Reduce] Rewrite remove-funcs.ll to use FileCheck, make it less fragile
David Green [Wed, 22 Jul 2020 13:08:29 +0000 (14:08 +0100)]
[ARM] Extra MVE select(binop) patterns
This is very similar to
243970d03cace2, but handling a slightly
different form of predicated operations. When starting with a pattern of
the form select(p, BinOp(x, y), x), Instcombine will often transform
this to BinOp(x, select(p, y, 0)), where 0 is the identity value of the
binop (0 for adds/subs, 1 for muls, -1 for ands etc). This adds the
patterns that transforms those back into predicated binary operations.
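The rewrite described above can be checked on scalars. The Python sketch below models one lane with ordinary integers; the `IDENTITY` table restates the identity values given in the text, and the function names are hypothetical.

```python
import operator

# Identity element for each binop, as listed in the commit message.
IDENTITY = {
    operator.add: 0,
    operator.sub: 0,
    operator.mul: 1,
    operator.and_: -1,
}


def select(p, on_true, on_false):
    return on_true if p else on_false


def original_form(op, p, x, y):
    # select(p, BinOp(x, y), x)
    return select(p, op(x, y), x)


def instcombined_form(op, p, x, y):
    # BinOp(x, select(p, y, identity))
    return op(x, select(p, y, IDENTITY[op]))


for op in IDENTITY:
    for p in (True, False):
        print(op.__name__, p,
              original_form(op, p, 6, 3), instcombined_form(op, p, 6, 3))
```

When the predicate is false, the second operand becomes the identity, so the binop returns x unchanged, which is what the original select would have produced.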
There is also a very minor adjustment to tablegen null_frag in here, to
allow it to also be recognized as a PatLeaf node, so that it can be used
in MVE_TwoOpPattern to easily exclude the cases where we do not need the
alternate transform.
Differential Revision: https://reviews.llvm.org/D84091
Aleksandr Platonov [Wed, 22 Jul 2020 12:59:36 +0000 (15:59 +0300)]
[clangd] Fixes in lit tests
Summary:
Changes:
- `background-index.test` Add Windows support, don't create redundant `*-e` files on macOS
- `did-change-configuration-params.test` Replace `cat | FileCheck` with `FileCheck --input-file`
- `test-uri-windows.test` This test did not run on Windows despite `REQUIRES: windows-gnu || windows-msvc` (replacement: `UNSUPPORTED: !(windows-gnu || windows-msvc)`).
Reviewers: sammccall, kadircet
Reviewed By: kadircet
Subscribers: thakis, njames93, ormris, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83759
David Green [Wed, 22 Jul 2020 12:24:01 +0000 (13:24 +0100)]
[ARM] Add patterns for select(p, BinOp(x, y), z) -> BinOpT(x, y, p, z)
Most MVE instructions can be predicated to fold a select into the
instruction, using the predicate and the select's else as a passthrough.
This adds tablegen patterns for most two operand instructions using the
newly added TwoOpPattern from
1030e82598da.
Differential Revision: https://reviews.llvm.org/D83222
OCHyams [Wed, 22 Jul 2020 08:25:14 +0000 (09:25 +0100)]
[DebugInfo] Drop location ranges for variables which exist entirely outside the variable's scope
Summary:
This patch reduces file size in debug builds by dropping variable locations a
debugger user will not see.
After building the debug entity history map we loop through it. For each
variable we look at each entry. If the entry opens a location range which does
not intersect any of the variable's scope's ranges then we mark it for removal.
After visiting the entries for each variable we also mark any clobbering
entries which will no longer be referenced for removal, and then finally erase
the marked entries. This all requires the ability to query the order of
instructions, so before this runs we number them.
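The core check can be sketched in Python under simplifying assumptions: instructions are already numbered, and every range is a half-open (start, end) pair over those numbers. The names are hypothetical, and the follow-up removal of unreferenced clobbering entries is left out.

```python
# Simplified model: ranges are half-open (start, end) pairs over
# instruction numbers. Function names here are hypothetical.

def intersects(a, b):
    """True if half-open ranges a and b share at least one instruction."""
    return a[0] < b[1] and b[0] < a[1]

def trim_locations(location_ranges, scope_ranges):
    """Keep only location ranges that overlap some range of the scope."""
    return [
        loc for loc in location_ranges
        if any(intersects(loc, scope) for scope in scope_ranges)
    ]

scope = [(10, 20), (30, 35)]
locations = [(0, 5), (12, 18), (20, 30), (33, 40)]
print(trim_locations(locations, scope))
```

Here (0, 5) and (20, 30) fall entirely outside the scope's ranges and are dropped.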
Tests:
Added llvm/test/DebugInfo/X86/trim-var-locs.mir
Modified llvm/test/DebugInfo/COFF/register-variables.ll
Branch folding merges the tails of if.then and if.else into if.else. Each
block's debug-locations point to different scopes so when they're merged we
can't use either. Because of this the variable 'c' ends up with a location
range which doesn't cover any instructions in its scope; with the patch
applied the location range is dropped and its flag changes to IsOptimizedOut.
Modified llvm/test/DebugInfo/X86/live-debug-variables.ll
Modified llvm/test/DebugInfo/ARM/PR26163.ll
In both tests an out of scope location is now removed. The remaining location
covers the entire scope of the variable allowing us to emit it as a single
location.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D82129
Georgii Rymar [Tue, 21 Jul 2020 13:48:39 +0000 (16:48 +0300)]
[llvm-readelf] - Introduce describe() helper functions.
These functions can be used to generate strings like
"SHT_?? section with index ?" to describe sections in error/warning messages,
which helps to simplify and generalize them.
This also allows isolating the following common code pattern:
`&Sec - &cantFail(Obj->sections()).front();`
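The idea can be modelled in a few lines of Python. This is only a sketch: the `Section` type and the `describe` signature below are hypothetical stand-ins, not the llvm-readelf code. The index lookup by identity plays the role of the pointer subtraction shown above.

```python
from dataclasses import dataclass

@dataclass
class Section:
    # Hypothetical stand-in for an ELF section header.
    type_name: str

def describe(sections, sec):
    """Build a 'TYPE section with index N' string for diagnostics."""
    # Find the section by identity, mirroring the pointer arithmetic.
    index = next(i for i, s in enumerate(sections) if s is sec)
    return f"{sec.type_name} section with index {index}"

sections = [Section("SHT_NULL"), Section("SHT_PROGBITS"), Section("SHT_GROUP")]
print(describe(sections, sections[2]))
```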
Differential revision: https://reviews.llvm.org/D84240
Sebastian Neubauer [Tue, 21 Jul 2020 08:28:12 +0000 (10:28 +0200)]
[AMDGPU] Don't combine memory intrs to v3i16
v3i16 and v3f16 currently cannot be legalized and lowered so they should
not be emitted by inst combining.
Moved the check down to still allow extracting 1 or 2 elements via the dmask.
Fixes image intrinsics being combined to return v3x16.
Differential Revision: https://reviews.llvm.org/D84223
Florian Hahn [Wed, 22 Jul 2020 10:33:57 +0000 (11:33 +0100)]
[LAA] Return SmallVectorImpl& instead of SmallVector& (NFC).
Georgii Rymar [Mon, 20 Jul 2020 13:28:17 +0000 (16:28 +0300)]
[llvm-readelf/readobj] - Fix the behavior when a section is included in two groups at the same time.
The current behavior was introduced by me in D37567 and it is a bit strange. It prints the
"Error: ...." message to the errs() manually and stops dumping the group section which has this error.
This behavior is consistent with GNU, but it is very inconsistent with what the regular llvm-readelf
code usually does/prints, so I suggest changing the implementation:
1) Instead of printing "Error: ...." to errs() - just report a warning.
2) Try to continue dumping the section.
3) Merge broken-group.test into group.test.
This is what this patch does.
Differential revision: https://reviews.llvm.org/D84170
Chen Zheng [Wed, 22 Jul 2020 10:01:52 +0000 (06:01 -0400)]
[PowerPC] fixupIsDeadOrKill start and end in different block fixing
In fixupIsDeadOrKill, we assume StartMI and EndMI not exist in same
basic block, so we add an assertion in that function. This is wrong
before RA, as before RA the true definition may exist in another
block through copy-like instructions.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D83365
Joachim Protze [Wed, 22 Jul 2020 10:14:28 +0000 (12:14 +0200)]
[OpenMP][NFC] pass on env variables to libomptarget tests
Ilya Golovenko [Wed, 22 Jul 2020 10:13:08 +0000 (12:13 +0200)]
[clangd] Fix conversion from Windows UNC paths to file URI format.
Summary:
The fix improves handling of Windows UNC paths to align with Appendix E. Nonstandard Syntax Variations of RFC 8089.
Before this fix it was difficult to use Windows UNC paths in a compile_commands.json database, as such paths were converted to file URIs using the 'file:////auth/share/file.cpp' notation instead of the recommended 'file://auth/share/file.cpp'.
As an example, VS Code cannot understand file URIs with 4 leading slashes, so features such as go-to-definition, jump-to-file, hover tooltip, etc. stop working. This also applies to files that reside on Windows network-mapped drives, because clangd internally resolves file paths to real paths in some cases and such paths get resolved to UNC paths.
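The intended mapping can be sketched in Python. This is an illustration only, not clangd's implementation: the function name is hypothetical, and percent-encoding of special characters is deliberately left out.

```python
def unc_path_to_file_uri(path):
    """Map a Windows UNC path to a file URI with the host as authority."""
    if not path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    # Drop the two leading backslashes; what follows is host/share/file.
    rest = path[2:].replace("\\", "/")
    # Two slashes only: the host becomes the URI authority.
    return "file://" + rest

print(unc_path_to_file_uri(r"\\auth\share\file.cpp"))
```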
Reviewers: sammccall, kadircet
Reviewed By: sammccall
Subscribers: ormris, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, kbobyrev, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D84172
Georgii Rymar [Mon, 20 Jul 2020 14:30:52 +0000 (17:30 +0300)]
[llvm-readobj/readelf] - Don't fail dumping when unable to read the name of the SHT_DYNSYM section.
We have an issue currently: we are trying to read the name of the SHT_DYNSYM section
very early and using `unwrapOrError` call for that.
The name is needed only for the GNU output. Because of the current logic, the tool
fails to dump the whole object when something is wrong with the name of the .dynsym section.
This patch delays reading the name and also allows it to be broken.
Differential revision: https://reviews.llvm.org/D84173
Max Kazantsev [Wed, 22 Jul 2020 10:10:36 +0000 (17:10 +0700)]
[Test] Add more simple tests for PR46786
Vitaly Buka [Wed, 22 Jul 2020 10:01:34 +0000 (03:01 -0700)]
[sanitizer,NFC] InternalAlloc cleanup
Valeriy Savchenko [Tue, 7 Jul 2020 08:36:20 +0000 (11:36 +0300)]
[analyzer][solver] Track symbol disequalities
Summary:
This commit adds another relation that we can track separately from
range constraints. Symbol disequality can help us understand that
two equivalence classes are not equal to each other. We can generalize
this knowledge to classes because for every a, b, c, and d such that
a == b, c == d, and b != c, it is true that a != d.
As a result, we can reason about other equalities/disequalities of symbols
that we know nothing else about, i.e. no constraint ranges associated
with them. However, we also benefit from the knowledge of disequal
symbols by following the rule:
if a != b and b == C where C is a constant, a != C
This information can refine associated ranges for different classes
and reduce the number of false positives and paths to explore.
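The class-level generalization can be sketched with a small union-find in Python. This is a toy model with hypothetical names, not the analyzer's data structures; it records disequalities between class representatives and rejects contradictory input rather than modelling infeasible states.

```python
class Classes:
    """Toy equivalence classes with class-level disequalities."""

    def __init__(self):
        self.parent = {}
        self.disequal = set()  # frozensets of two class representatives

    def find(self, s):
        self.parent.setdefault(s, s)
        while self.parent[s] != s:
            s = self.parent[s]
        return s

    def are_disequal(self, a, b):
        return frozenset((self.find(a), self.find(b))) in self.disequal

    def add_disequal(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            raise ValueError("symbols are already known to be equal")
        self.disequal.add(frozenset((ra, rb)))

    def add_equal(self, a, b):
        if self.are_disequal(a, b):
            raise ValueError("symbols are already known to be disequal")
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        self.parent[rb] = ra
        # Re-point disequalities that mentioned rb at the merged class.
        self.disequal = {
            frozenset(ra if r == rb else r for r in pair)
            for pair in self.disequal
        }

c = Classes()
c.add_equal("a", "b")
c.add_equal("c", "d")
c.add_disequal("b", "c")
print(c.are_disequal("a", "d"))
```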
Differential Revision: https://reviews.llvm.org/D83286
Valeriy Savchenko [Wed, 24 Jun 2020 09:50:56 +0000 (12:50 +0300)]
[analyzer][solver] Track symbol equivalence
Summary:
In most cases, we try to reason about a symbol either based on the
information we know about that symbol in particular or about its
composite parts. This is faster and eliminates costly brute force
searches through existing constraints.
However, we do want to support some cases that are widespread enough
and involve reasoning about different existing constraints at once.
These include:
* reasoning about 'a - b' based on what we know about 'b - a'
* reasoning about 'a <= b' based on what we know about 'a > b' or 'a < b'
This commit expands on that part by tracking symbols known to be equal
while still avoiding brute force searches. It changes the way we track
constraints for individual symbols. If we know for a fact that 'a == b'
then there is no need to track constraints for both 'a' and 'b', especially
if these constraints are different. This additional relationship makes
dead/live logic for constraints harder as we want to maintain as much
information on the equivalence class as possible, but we still won't
carry the information that we don't need anymore.
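A minimal Python sketch of sharing one constraint per equivalence class follows. It assumes a single inclusive integer interval per class, which is a simplification, and all names are hypothetical. Dead/live handling and empty (infeasible) intersections are not modelled.

```python
class EquivalenceConstraints:
    """Toy model: one inclusive (low, high) range per equivalence class."""

    def __init__(self):
        self.parent = {}
        self.ranges = {}  # class representative -> (low, high)

    def find(self, s):
        self.parent.setdefault(s, s)
        while self.parent[s] != s:
            s = self.parent[s]
        return s

    def constrain(self, s, low, high):
        r = self.find(s)
        old_low, old_high = self.ranges.get(r, (float("-inf"), float("inf")))
        # Narrow the class range to the intersection.
        self.ranges[r] = (max(old_low, low), min(old_high, high))

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        self.parent[rb] = ra
        # The absorbed class no longer needs its own entry.
        if rb in self.ranges:
            low, high = self.ranges.pop(rb)
            self.constrain(ra, low, high)

    def range_of(self, s):
        return self.ranges.get(self.find(s))

ec = EquivalenceConstraints()
ec.constrain("a", 0, 10)
ec.constrain("b", 5, 20)
ec.merge("a", "b")
print(ec.range_of("a"), ec.range_of("b"))
```

After the merge only one entry is stored, and both symbols see the intersected range.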
Differential Revision: https://reviews.llvm.org/D82445
Valeriy Savchenko [Tue, 23 Jun 2020 14:46:03 +0000 (17:46 +0300)]
[analyzer] Introduce small improvements to the solver infra
Summary:
* Add a new function to delete points from range sets.
* Introduce an internal generic interface for range set intersections.
* Remove unnecessary bits from a couple of solver functions.
* Add in-code sections.
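Deleting a point from a range set can be illustrated as follows. This is a sketch assuming a sorted list of inclusive integer ranges; the representation and the function name are assumptions, not the solver's RangeSet API.

```python
def delete_point(ranges, point):
    """Remove one integer from a sorted list of inclusive (low, high) ranges."""
    result = []
    for low, high in ranges:
        if point < low or point > high:
            # The point is outside this range; keep it unchanged.
            result.append((low, high))
            continue
        # Split the containing range around the point, dropping empty halves.
        if low < point:
            result.append((low, point - 1))
        if point < high:
            result.append((point + 1, high))
    return result

print(delete_point([(0, 5), (10, 10)], 3))
print(delete_point([(0, 5), (10, 10)], 10))
```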
Differential Revision: https://reviews.llvm.org/D82381
Pavel Labath [Mon, 20 Jul 2020 14:52:38 +0000 (16:52 +0200)]
[lldb/test] Delete result formatter machinery entirely
After more investigation, I realised this part of the code is totally
unused. It was used for communicating the test results from the
"inferior" dotest process to the main "dosep" process running
everything. Now that everything is being orchestrated through lit, this
is not used for anything.
Sander de Smalen [Wed, 22 Jul 2020 09:04:36 +0000 (10:04 +0100)]
[AArch64][SVE] Correctly allocate scavenging slot in presence of SVE.
This patch addresses two issues:
* Forces the availability of the base-pointer (x19) when the frame has
both scalable vectors and variable-length arrays. Otherwise it will
be expensive to access non-SVE locals.
* In presence of SVE stack objects, it will allocate the emergency
scavenging slot close to the SP, so that it can be accessed from
the SP or BP if available. If accessed from the frame-pointer, it will
otherwise need an extra register to access the scavenging slot because
of mixed scalable/non-scalable addressing modes.
Reviewers: efriedma, ostannard, cameron.mcinally, rengolin, david-arm
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D70174
Med Ismail Bennani [Wed, 22 Jul 2020 09:46:44 +0000 (11:46 +0200)]
[lldb/interpreter] Fix formatting in CommandInterpreter.cpp (NFC)
This patch addresses some formatting issues introduced by commit
5bb742b10dafd595223172ae985687765934ebe9
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Marcel Hlopko [Wed, 22 Jul 2020 08:34:53 +0000 (10:34 +0200)]
Make lit TestRunner.py work in Python 3
Summary: In Python 3, SubstituteCaptures are no longer implicitly converted to String behind the scenes. Converting explicitly makes TestRunner work in Python 3.
Reviewers: gribozavr2, compnerd
Reviewed By: gribozavr2
Subscribers: tbkka, delcypher, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81361
Med Ismail Bennani [Tue, 21 Jul 2020 14:29:16 +0000 (16:29 +0200)]
[lldb/interpreter] Add ability to save lldb session to a file
This patch introduces a new feature that allows users to save their
debugging session's transcript (commands + outputs) to a file.
It differs from the reproducers since it doesn't require capturing a
session preemptively and replaying the reproducer file in lldb.
The user can choose to save their session manually using the session save
command or automatically by setting interpreter.save-session-on-quit
in their init file.
To do so, the patch adds a Stream object to the CommandInterpreter that
will hold the input command from the IOHandler and the CommandReturnObject
output and error. This way, that stream object accumulates passively all
the interactions throughout the session and will save them to disk on demand.
The user can specify a file path where the session's transcript will be
saved. However, it is optional, and when it is not provided, lldb will
create a temporary file name according to the session date and time.
rdar://63347792
Differential Revision: https://reviews.llvm.org/D82155
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
David Green [Wed, 22 Jul 2020 09:40:02 +0000 (10:40 +0100)]
[ARM] Predicated binary operation tests. NFC
Pavel Labath [Fri, 10 Jul 2020 12:58:29 +0000 (14:58 +0200)]
[lldb/test] Do a better job at setting (DY)LD_LIBRARY_PATH
Summary:
registerSharedLibrariesWithTarget was setting the library path
environment variable to the process build directory, but the function is
also accepting libraries in other directories (in which case they won't
be found automatically).
This patch makes the function set the path variable correctly for these
libraries too. This enables us to remove the code for setting the path
variable in TestWeakSymbols.py, which was working only accidentally --
it was relying on the fact that
launch_info.SetEnvironmentEntries(..., append=True)
would not overwrite the path variable it has set, but that is going to
change with D83306.
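The path computation can be sketched as below. This is a minimal illustration, not the lldb test-suite code: the function name and arguments are hypothetical, and it only shows collecting the directory of every library rather than assuming the build directory.

```python
import os

def library_path_value(lib_paths, existing=""):
    """Build a path-list value covering the directory of every library."""
    dirs = []
    for lib in lib_paths:
        d = os.path.dirname(lib)
        # Preserve first-seen order and skip duplicate directories.
        if d and d not in dirs:
            dirs.append(d)
    if existing:
        # Keep whatever the variable already held at the end.
        dirs.append(existing)
    return os.pathsep.join(dirs)

libs = ["/build/a/libfoo.so", "/build/b/libbar.so", "/build/a/libbaz.so"]
print(library_path_value(libs))
```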
Reviewers: davide, jingham
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83552
Stefan Pintilie [Tue, 21 Jul 2020 17:54:48 +0000 (12:54 -0500)]
[PowerPC] Extend .reloc directive on PowerPC
When the compiler generates a GOT indirect load it must generate two loads. One
that loads the address of the element from the GOT and a second to load the
actual element based on the address just loaded from the GOT. However, the
linker can optimize these two loads into one load if it knows that it is safe
to do so. The compiler can tell the linker that the optimization is safe
by using the R_PPC64_PCREL_OPT relocation.
This patch extends the .reloc directive to allow the following setup
pld 3, vec@got@pcrel(0), 1
.Lpcrel1=.-8
... More instructions possible here ...
.reloc .Lpcrel1,R_PPC64_PCREL_OPT,.-.Lpcrel1
lwa 3, 4(3)
Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach, MaskRay
Reviewed By: nemanjai, MaskRay
Differential Revision: https://reviews.llvm.org/D79625
Kadir Cetinkaya [Wed, 22 Jul 2020 08:35:23 +0000 (10:35 +0200)]
[clangd] Fix Origin and MainFileOnly-ness for macros
Summary:
This was resulting in macros coming from preambles vanishing when the user
has opened the source header. For example:
```
// test.h:
#define X
```
and
```
// test.cc
#include "test.h"
^
```
If user only opens test.cc, we'll get `X` as a completion candidate,
since it is indexed as part of the preamble. But if the user opens
test.h afterwards we would index it as part of the main file and lose
the symbol (as new index shard for test.h will override the existing one
in dynamic index).
We were also not setting origins for macros correctly; this patch fixes
that as well.
Fixes https://github.com/clangd/clangd/issues/461
Reviewers: hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D84297
Simon Wallis [Wed, 22 Jul 2020 09:11:57 +0000 (10:11 +0100)]
[Thumb] set code alignment for 16-bit load from constant pool
Summary:
LLVM miscompiles this code when compiling for a target with v8.2-A FP16 and the Thumb ISA at -O0:
  extern void bar(__fp16 P5);
  int main() {
    __fp16 P5 = 1.96875;
    bar(P5);
  }
The code section containing main has 2 byte alignment.
It needs to have 4 byte alignment,
because the load literal instruction has an offset from the
load address with the low 2 bits zeroed.
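The effect of the alignment can be shown with a little address arithmetic in Python. The layout and addresses below are hypothetical; the only architectural facts relied on are that in Thumb state the PC reads as the instruction address plus 4, and that the literal load clears the low 2 bits of that value before adding the offset.

```python
def align_down(value, alignment):
    return value & ~(alignment - 1)

def thumb_literal_address(insn_address, imm):
    # PC reads as the instruction address plus 4; the literal load uses
    # that value with the low 2 bits cleared as its base.
    return align_down(insn_address + 4, 4) + imm

# Hypothetical layout: load at section offset 0, constant at section offset 8.
# imm is chosen assuming the section starts on a 4-byte boundary.
LITERAL_OFFSET = 8
imm = LITERAL_OFFSET - align_down(0 + 4, 4)

for section_start in (0x1000, 0x1002):
    computed = thumb_literal_address(section_start, imm)
    actual = section_start + LITERAL_OFFSET
    print(hex(section_start), hex(computed), hex(actual), computed == actual)
```

With a 4-byte aligned start the load hits the constant; when the section is placed at a 2-byte aligned address, the same encoded offset lands 2 bytes short.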
I do not include a test case in this check-in.
llc and llvm-mc do not exhibit this bug. They do not set code section alignment
in the same manner as clang.
Reviewers: dnsampaio
Reviewed By: dnsampaio
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84169
Sjoerd Meijer [Tue, 21 Jul 2020 15:33:24 +0000 (16:33 +0100)]
[Matrix] Add LowerMatrixIntrinsics to the NPM
Pass LowerMatrixIntrinsics wasn't yet running under the new pass
manager, and this adds LowerMatrixIntrinsics to the pipeline (to the
same place as where it is running in the old PM).
Differential Revision: https://reviews.llvm.org/D84180
Max Kazantsev [Wed, 22 Jul 2020 08:32:13 +0000 (15:32 +0700)]
[SCEV] Remove premature assert. PR46786
This assert was added to verify the assumption that GEP's SCEV will be of pointer type,
based on the fact that it should be a SCEVAddExpr with (at least) the last operand being
a pointer. Two notes:
- GEP's SCEV does not have to be a SCEVAddExpr after all simplifications;
- In the current state, GEP's SCEV does not have to have at least one pointer operand
(all of them can become int during the transforms).
However, we might want to be at a point where it is true. We are currently removing
this assert and will try to enumerate the cases where "is pointer" notion might be
lost during the transforms. When all of them are fixed, we can return it.
Differential Revision: https://reviews.llvm.org/D84294
Reviewed By: lebedev.ri
Petar Avramovic [Wed, 22 Jul 2020 08:31:41 +0000 (10:31 +0200)]
AMDGPU: Simplify f16 to i64 custom lowering
The range that f16 can represent fits into i32.
Lower as f16->i32->i64 instead of f16->f32->i64
since f32->i64 has a long expansion.
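The range claim can be checked quickly in Python using the standard half-precision struct format. This is only a sanity check of the arithmetic for finite values, not the AMDGPU lowering; the constant names are ad hoc.

```python
import struct

# Largest finite IEEE half-precision value: exponent 0b11110, mantissa all ones.
F16_MAX_BITS = 0x7BFF
f16_max = struct.unpack("<e", struct.pack("<H", F16_MAX_BITS))[0]

I32_MIN = -2**31
I32_MAX = 2**31 - 1

print(f16_max)
print(I32_MIN <= -f16_max and f16_max <= I32_MAX)
```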
Differential Revision: https://reviews.llvm.org/D84166