Diana Picus [Thu, 6 May 2021 09:26:57 +0000 (09:26 +0000)]
[flang] Remove redundant reallocation
The MaxMinHelper used to implement MIN and MAX for character types would
reallocate the accumulator whenever the number of characters in it was
different from that in the other input. This is unnecessary if the
accumulator is already larger than the other input. This patch fixes the
issue and adds a unit test to make sure we don't reallocate if we don't
need to.
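The intended behaviour can be sketched as follows. This is a minimal illustration, not the flang runtime code: `Accumulator` and `ensureCapacity` are hypothetical names, and a `std::vector<char>` stands in for the runtime's own storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the character accumulator used by MIN/MAX.
struct Accumulator {
  std::vector<char> buffer;
  int reallocations = 0; // counts how often the storage was replaced

  // Grow only when the other input is longer than what we already hold.
  // A check of the form "lengths differ" would also reallocate when the
  // accumulator is already the larger of the two.
  void ensureCapacity(std::size_t otherChars) {
    if (buffer.size() >= otherChars)
      return; // already large enough: keep the existing storage
    std::vector<char> grown(otherChars, ' '); // blank-padded copy
    for (std::size_t i = 0; i < buffer.size(); ++i)
      grown[i] = buffer[i];
    buffer.swap(grown);
    ++reallocations;
  }
};
```

A unit test in the spirit of the one described above would feed a shorter input after a longer one and check that the reallocation count does not change.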
Differential Revision: https://reviews.llvm.org/D101984
Diana Picus [Tue, 4 May 2021 18:57:54 +0000 (18:57 +0000)]
[flang] Add tests for MIN for character arrays. NFC
We used to test only scalar character types. This commit adds tests for
arrays with a few simple shapes.
Differential Revision: https://reviews.llvm.org/D101983
Caroline Concatto [Thu, 22 Apr 2021 07:24:40 +0000 (08:24 +0100)]
[LoopVectorize][SVE] Remove assert for scalable vector in InnerLoopVectorizer::fixReduction
The function fixReduction used to assert/crash for scalable vector when
a vector reduce could be done with a smaller vector.
This patch removes this assertion as it is safe to use scalable vector for
vector reduce and truncate.
Differential Revision: https://reviews.llvm.org/D101260
James Henderson [Fri, 7 May 2021 08:20:50 +0000 (09:20 +0100)]
[lit][test] Attempt fix when paths include symlink
Example of failure:
https://lab.llvm.org/staging/#/builders/126/builds/345/steps/5/logs/FAIL__lit___use-tool-search-env_py
Martin Storsjö [Thu, 6 May 2021 07:18:41 +0000 (10:18 +0300)]
[libcxx] Fix a case of -Wundef warnings. NFC.
Differential Revision: https://reviews.llvm.org/D101978
Peilin Guo [Fri, 7 May 2021 08:05:50 +0000 (16:05 +0800)]
[LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion
getValueFromCondition() uses a Visited set to record the intermediate value.
However, it uses a postorder way to compute the value first and update the
Visited set later. Thus it will be trapped in an infinite recursion if there
exists IR in which a use is not dominated by its def, as in this example:
%tmp3 = or i1 undef, %tmp4
%tmp4 = or i1 undef, %tmp3
To prevent this, we can insert an Overdefined placeholder into the set
before computing the actual value.
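The placeholder technique can be sketched in isolation. This is not the LazyValueInfo code: `Lattice`, `Node`, `Graph` and `evaluate` are hypothetical names, and the two-state lattice is a deliberate simplification.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical two-state lattice; the real analysis uses a richer one.
enum class Lattice { Constant, Overdefined };

struct Node {
  std::vector<std::string> operands; // names of the values this one uses
};

using Graph = std::map<std::string, Node>;
using Visited = std::map<std::string, Lattice>;

Lattice evaluate(const Graph &graph, const std::string &name,
                 Visited &visited) {
  auto cached = visited.find(name);
  if (cached != visited.end())
    return cached->second; // on a cycle this returns the placeholder

  // Insert the placeholder *before* recursing, so that a cyclic use finds
  // an entry instead of recursing forever.
  visited[name] = Lattice::Overdefined;

  // Values without an entry in the graph are treated as constant leaves.
  Lattice result = Lattice::Constant;
  auto it = graph.find(name);
  if (it != graph.end()) {
    for (const std::string &operand : it->second.operands)
      if (evaluate(graph, operand, visited) == Lattice::Overdefined)
        result = Lattice::Overdefined;
  }

  visited[name] = result; // replace the placeholder with the real value
  return result;
}
```

With the two mutually dependent values from the example, the recursion terminates and both resolve to Overdefined.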
Reviewed by: nikic
Differential Revision: https://reviews.llvm.org/D101273
Chen Zheng [Fri, 7 May 2021 07:00:11 +0000 (07:00 +0000)]
[Debug-Info][NFC] add a wrapper for Die.addValue
Add a new wrapper function addAttribute() for Die.addValue() function,
so we can do some attributes control in one single interface.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D101125
Amara Emerson [Fri, 7 May 2021 07:00:47 +0000 (00:00 -0700)]
[GlobalISel] Micro-optimize the conditional branch optimization.
Convert a check into an assert and pass an MI instead of recomputing in the
apply function.
KareemErgawy-TomTom [Fri, 7 May 2021 06:59:35 +0000 (08:59 +0200)]
[MLIR][SPIRV] Properly (de-)serialize BranchConditionalOp.
Implements proper (de-)serialization logic for BranchConditionalOp when
such ops have true/false target operands.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D101602
Chen Zheng [Fri, 7 May 2021 06:19:29 +0000 (06:19 +0000)]
[XCOFF] handle string constants generation for AIX
This follows https://www.ibm.com/docs/en/aix/7.2?topic=constants-string
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D101280
Tobias Gysi [Fri, 7 May 2021 05:59:05 +0000 (05:59 +0000)]
[mlir][linalg] Add IndexedGenericOp to GenericOp canonicalization.
Replace all `linalg.indexed_generic` ops by `linalg.generic` ops that access the iteration indices using the `linalg.index` op.
Differential Revision: https://reviews.llvm.org/D101612
Qiu Chaofan [Fri, 7 May 2021 03:04:47 +0000 (11:04 +0800)]
[PowerPC] Remove extra swap for extract+vperm on LE
This is a simple fix on LE. On BE, vector shuffles are categorized into
different ops. We may need more work to eliminate these in
tablegen/pre-isel.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D101605
Yonghong Song [Thu, 6 May 2021 23:31:30 +0000 (16:31 -0700)]
BPF: fix FIELD_EXISTS relocation with array subscripts
Lorenz Bauer reported an issue in bpf mailing list ([1]) where
for FIELD_EXISTS relocation, if the object is an array subscript,
the patched immediate is the object offset from the base address,
instead of 1.
Currently in BPF AbstractMemberAccess pass, the final offset
from the base address is the patched offset except FIELD_EXISTS
which is 1 unconditionally. In this particular case, the last
data structure access is not a field (struct/union offset)
so it didn't hit the place to set patched immediate to be 1.
This patch fixed the issue by checking the relocation type.
If the type is FIELD_EXISTS, just set to 1.
Tested by modifying some bpf selftests, libbpf is okay with
such types with FIELD_EXISTS relocation.
[1] https://lore.kernel.org/bpf/CACAyw99n-cMEtVst7aK-3BfHb99GMEChmRLCvhrjsRpHhPrtvA@mail.gmail.com/
Differential Revision: https://reviews.llvm.org/D102036
Coelacanthus [Thu, 6 May 2021 10:36:52 +0000 (18:36 +0800)]
[TableGen] Use range-based for loops (NFC)
Use range-based for loops in TableGen.
Reviewed By: Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D101994
qixingxue [Thu, 6 May 2021 07:33:56 +0000 (15:33 +0800)]
[IR] Fix typo in comment of Intrinsics.td (NFC)
Bruno Cardoso Lopes [Fri, 7 May 2021 04:04:23 +0000 (21:04 -0700)]
[CGAtomic] Lift strong requirement for remaining compare_exchange combinations
Follow up on 431e3138a and complete the other possible combinations.
Besides enforcing the new behavior, it also mitigates TSAN false positives when
combining orders that used to be stronger.
MaheshRavishankar [Fri, 7 May 2021 00:17:29 +0000 (17:17 -0700)]
[mlir][Linalg] Allow folding to rank-zero tensor when using rank-reducing subtensors.
The pattern to convert subtensor ops to their rank-reduced versions
(by dropping unit-dims in the result) can also convert to a zero-rank
tensor. Handle that case.
This also fixes an OOB access bug in the existing pattern for such
cases.
Differential Revision: https://reviews.llvm.org/D101949
Jianzhou Zhao [Fri, 30 Apr 2021 17:18:05 +0000 (17:18 +0000)]
[dfsan] Rename and fix an internal test issue for mmap+calloc
The linker suggests using -Wl,-z,notext.
Replacing the assert with exit also fixes this.
After renaming, interceptor.c would be used to test interceptors in general by D101204.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D101649
Cyndy Ishida [Thu, 6 May 2021 23:18:55 +0000 (16:18 -0700)]
[llvm][TextAPI] add mapping from OS string to Platform
* add utility for matching target triple OS value strings to PlatformKind
This was reviewed offline by ributzka, steven_wu
Stanislav Mekhanoshin [Thu, 6 May 2021 20:29:48 +0000 (13:29 -0700)]
[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32
Differential Revision: https://reviews.llvm.org/D102022
Rob Suderman [Thu, 6 May 2021 22:55:58 +0000 (15:55 -0700)]
[mlir][tosa] Added div op, variadic concat. Removed placeholder. Spec v0.22 alignment.
Nearly complete alignment to spec v0.22
- Adds Div op
- Concat inputs now variadic
- Removes Placeholder op
Note: TF side PR https://github.com/tensorflow/tensorflow/pull/48921 deletes Concat legalizations to avoid breaking TensorFlow CI. This must be merged only after the TF PR has merged.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D101958
Jon Chesterfield [Thu, 6 May 2021 22:52:18 +0000 (23:52 +0100)]
[libomptarget][nfc] Refactor amdgpu partial barrier to simplify adding a second one
D101976 would require a second barrier instance. This NFC to amdgpu makes it
simpler to add one (an extra global, one more line in init). Also renames the
current barrier to L0.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102016
Amy Zhuang [Thu, 6 May 2021 22:08:34 +0000 (15:08 -0700)]
[mlir] Update dstNode after DenseMap insertion in loop fusion pass.
Reviewed By: vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D101794
Malhar Jajoo [Thu, 6 May 2021 00:38:20 +0000 (01:38 +0100)]
[ARM] Transforming memcpy to Tail predicated Loop
This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).
From an implementation point of view, the patch
- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered during the first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
on matching the above node.
- adds a custom inserter function that expands the pseudo instruction into MIR suitable
to be converted (by later passes) into a WLSTP loop.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D99723
Jon Chesterfield [Thu, 6 May 2021 22:16:30 +0000 (23:16 +0100)]
[libomptarget][amdgpu][nfc] Remove dead code from amdgpu plugin
Drops an enum that was identical to a HSA one, localises some functions where
they were only called from one TU. Covers everything internalize + adce can
identify as dead, except for msgpack::dump which is useful when debugging.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102014
Lei Zhang [Thu, 6 May 2021 21:16:55 +0000 (17:16 -0400)]
[mlir][spirv] NFC: Replace OwningSPIRVModuleRef with OwningOpRef
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D102009
Jim Ingham [Wed, 5 May 2021 18:34:07 +0000 (11:34 -0700)]
When SendContinuePacketAndWaitForResponse returns eStateInvalid, don't fetch more packets.
This looks like just an oversight in the AsyncThread function. It gets a result of
eStateInvalid, and then marks the process as exited, but doesn't set "done" to true,
so we go to fetch another event. That is not safe, since you don't know when that
extra packet is going to arrive. If it arrives while you are tearing down the
process, the internal-state-thread might try to handle it when the process is not
in a good state.
Rather than put more effort into checking all the shutdown paths to make sure this
extra packet doesn't cause problems, just don't fetch it. We weren't going to do
anything useful with it anyway.
The main part of the patch is setting "done = true" when we get the eStateInvalid.
I also added a check at the beginning of the while(done) loop to prevent another error
from getting us to fetch packets for an exited process.
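The control flow being described can be sketched roughly as follows. This is not the lldb code: `StateType` and `runAsyncLoop` are hypothetical, the loop condition is an assumption, and a vector of canned responses stands in for the packets that would arrive.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical subset of process states.
enum class StateType { Stopped, Exited, Invalid };

// Returns how many responses were fetched before the loop ended.
int runAsyncLoop(const std::vector<StateType> &responses) {
  bool done = false;
  bool exited = false;
  int fetched = 0;
  std::size_t next = 0;

  // Assumed loop shape: stop when done, and never fetch for an exited
  // process (the extra check mentioned above).
  while (!done && !exited && next < responses.size()) {
    StateType state = responses[next++];
    ++fetched;
    switch (state) {
    case StateType::Stopped:
      break; // keep waiting for further events
    case StateType::Exited:
      exited = true;
      done = true;
      break;
    case StateType::Invalid:
      exited = true; // the process is marked as exited...
      done = true;   // ...and, with the fix, no further packet is fetched
      break;
    }
  }
  return fetched;
}
```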
I added a test case to ensure that if an Interrupt fails, we call the process
exited. I can't test exactly the error I'm fixing, there's no good way to know
that the stop reply for the failed interrupt wasn't fetched. But at least this
asserts that the overall behavior is correct.
Differential Revision: https://reviews.llvm.org/D101933
Aaron Puchert [Thu, 6 May 2021 21:07:40 +0000 (23:07 +0200)]
Thread safety analysis: Eliminate parameter from intersectAndWarn (NFC)
We were modifying precisely when intersecting the lock sets of multiple
predecessors without back edge. That's no coincidence: we can't modify
on back edges, it doesn't make sense to modify at the end of a function,
and otherwise we always want to intersect on forward edges, because we
can build a new lock set for those.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D101755
LLVM GN Syncbot [Thu, 6 May 2021 21:03:05 +0000 (21:03 +0000)]
[gn build] Port 83af66e18e3d
Frank Derry Wanye [Thu, 6 May 2021 21:01:39 +0000 (17:01 -0400)]
new altera ID dependent backward branch check
This lint check is a part of the FLOCL (FPGA Linters for OpenCL) project
out of the Synergy Lab at Virginia Tech.
FLOCL is a set of lint checks aimed at FPGA developers who write code
in OpenCL.
The altera ID dependent backward branch lint check finds ID dependent
variables and fields used within loops, and warns of their usage. Using
these variables in loops can lead to performance degradation.
Alex Hoppen [Thu, 6 May 2021 20:11:26 +0000 (13:11 -0700)]
[Index] Ignore nullptr decls for indexing
We can end up with a call to `indexTopLevelDecl(D)` with `D == nullptr` in non-assert builds e.g. when indexing a module in `indexModule` and
- `ASTReader::GetDecl` returns `nullptr` if `Index >= DeclsLoaded.size()`, thus returning `nullptr`
=> `ModuleDeclIterator::operator*` returns `nullptr`
=> we call `IndexCtx.indexTopLevelDecl` with `nullptr`
Be resilient and just ignore the `nullptr` decls during indexing.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D102001
River Riddle [Thu, 6 May 2021 19:09:16 +0000 (12:09 -0700)]
[mlir] Store the flag for dynamic operand storage in the low bits
It is currently stored in the high bits, which is disallowed on certain
platforms (e.g. android). This revision switches the representation to use
the low bits instead, fixing crashes/breakages on those platforms.
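As a generic illustration of the low-bit technique (not the MLIR operand storage itself), a flag can be packed into bit 0 of a pointer to any type with an alignment of at least 2, because that bit is always zero in a valid pointer to such a type. `FlaggedPointer` is a hypothetical name used only for this sketch.

```cpp
#include <cstdint>

// Hypothetical tagged pointer: bit 0 carries a boolean flag.
template <typename T> class FlaggedPointer {
  static_assert(alignof(T) >= 2, "need at least one free low bit");
  std::uintptr_t bits = 0;

public:
  FlaggedPointer(T *pointer, bool flag)
      : bits(reinterpret_cast<std::uintptr_t>(pointer) |
             static_cast<std::uintptr_t>(flag)) {}

  // Mask the flag off before handing the pointer back.
  T *pointer() const {
    return reinterpret_cast<T *>(bits & ~std::uintptr_t(1));
  }

  bool flag() const { return (bits & 1) != 0; }
};
```

Unlike a flag in the high bits, this relies only on the alignment of the pointee and not on which address bits the platform leaves unused.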
Differential Revision: https://reviews.llvm.org/D101969
Sanjay Patel [Thu, 6 May 2021 19:33:59 +0000 (15:33 -0400)]
[PassManager] add helper function to hold set of vector passes
This is no-functional-change-intended (NFC) and split off from
D102002 (which proposes to eliminate the LTO-based differences).
Mircea Trofin [Mon, 1 Mar 2021 20:19:20 +0000 (12:19 -0800)]
[NPM] Do not run function simplification pipeline unnecessarily
The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].
This patch allows the function simplification pipeline to be skipped if it was already run and the function was not modified since.
The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.
[1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions | compile time tracker ]] run with the option enabled.
Differential Revision: https://reviews.llvm.org/D98103
Craig Topper [Thu, 6 May 2021 19:17:37 +0000 (12:17 -0700)]
[RISCV] Remove unused ComplexPatterns. NFC
Arnamoy Bhattacharyya [Thu, 6 May 2021 18:00:34 +0000 (14:00 -0400)]
[flang][OpenMP] Add semantic check for occurrence of constructs nested inside a SIMD region
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D99757
Petr Hosek [Thu, 6 May 2021 18:56:32 +0000 (11:56 -0700)]
[Fuchsia][CMake] Update OSX deployment target
Use correct spelling of CMAKE_OSX_DEPLOYMENT_TARGET and bump the
minimum version to 10.13 which matches what we use for host tools
in Fuchsia.
Differential Revision: https://reviews.llvm.org/D102013
Xing Xue [Thu, 6 May 2021 18:33:38 +0000 (14:33 -0400)]
[libunwind] NFC: Use macros to accommodate differences in representation of PowerPC assemblers
Summary:
This NFC patch replaces the representation of registers and the left shift operator in the PowerPC assembly code to allow it to be consumed by the GNU flavored assembler and the AIX assembler.
* Registers - change the representation of PowerPC registers from %rn, %fn, %vsn, and %vrn to the register number alone, e.g., n. The GNU flavored assembler and the AIX assembler are able to determine the register kind based on the context of the instruction in which the register is used.
* Left shift operator - use macro PPC_LEFT_SHIFT to represent the left shift operator. The left shift operator in the AIX assembly language is < instead of <<
Reviewed by: sfertile, MaskRay, compnerd
Differential Revision: https://reviews.llvm.org/D101179
Craig Topper [Thu, 6 May 2021 18:21:46 +0000 (11:21 -0700)]
[RISCV] Minor vector instruction tablegen cleanup. NFC
Use result_type for the IMPLICIT_DEF in masked vector patterns.
This doesn't matter today because result_type and op_type are
always the same.
Use multiclass inheritance to reduce repeated code.
peter klausler [Wed, 5 May 2021 18:37:49 +0000 (11:37 -0700)]
[flang] Implement NAMELIST I/O in the runtime
Add InputNamelist and OutputNamelist as I/O data transfer APIs
to be used with internal & external list-directed I/O; delete the
needless original namelist-specific Begin... calls.
Implement NAMELIST output and input; add basic tests.
Differential Revision: https://reviews.llvm.org/D101931
Fangrui Song [Thu, 6 May 2021 18:16:07 +0000 (11:16 -0700)]
[AArch64] Fix namespace issue. NFC
peter klausler [Wed, 5 May 2021 18:26:12 +0000 (11:26 -0700)]
[flang] Fix race condition in runtime
The code that initializes the default units 5 & 6 had
a race condition that would allow threads access to the
unit map before it had been populated.
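One common way to close this kind of race is to populate the map while holding the same lock that guards lookups. The sketch below shows that general pattern only; `UnitMap` and `contains` are hypothetical names and this is not the flang runtime implementation.

```cpp
#include <mutex>
#include <set>

// Hypothetical unit map; units 5 and 6 are the default units.
class UnitMap {
  std::mutex lock;
  std::set<int> units;
  bool populated = false;

public:
  bool contains(int unit) {
    // The lock is taken before the map is inspected or filled, so no
    // thread can observe a partially initialized map.
    std::lock_guard<std::mutex> guard(lock);
    if (!populated) {
      units.insert(5);
      units.insert(6);
      populated = true;
    }
    return units.count(unit) != 0;
  }
};
```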
Also add some missing calls to va_end() that will never
be called (they're in program abort situations) but might
elicit warnings if absent.
Differential Revision: https://reviews.llvm.org/D101928
Matthew Voss [Thu, 22 Apr 2021 21:07:45 +0000 (14:07 -0700)]
Allow llvm-dis to disassemble multiple files
Differential Revision: https://reviews.llvm.org/D101110
peter klausler [Wed, 5 May 2021 18:33:00 +0000 (11:33 -0700)]
[flang] Runtime must defer formatted/unformatted determination
What the Fortran standard calls "preconnected" external I/O units
might not be known to be connected to unformatted or formatted files
until the first I/O data transfer statement is executed.
Support this deferred determination by representing the flag as
a tri-state Boolean and adapting its points of use.
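A tri-state flag of this kind can be modelled with `std::optional<bool>`, where the empty state means "not yet determined". The names `Unit` and `beginTransfer` below are hypothetical and the sketch is not taken from the runtime.

```cpp
#include <optional>

// Hypothetical unit state; std::nullopt means "not determined yet".
struct Unit {
  std::optional<bool> isUnformatted;

  // Called by a data transfer statement. The first call fixes the kind of
  // the unit; later calls return false if they ask for the other kind.
  bool beginTransfer(bool unformatted) {
    if (!isUnformatted.has_value())
      isUnformatted = unformatted; // the first transfer decides
    return *isUnformatted == unformatted;
  }
};
```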
Differential Revision: https://reviews.llvm.org/D101929
Arthur Eubanks [Wed, 5 May 2021 23:35:14 +0000 (16:35 -0700)]
[gn build] Support compiler-rt/profile on Windows
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D101961
thomasraoux [Thu, 6 May 2021 17:12:31 +0000 (10:12 -0700)]
[mlir][vector] Fix typo
thomasraoux [Thu, 6 May 2021 14:28:09 +0000 (07:28 -0700)]
[mlir][linalg][NFC] Make reshape folding control more fine grain
This exposes a lambda instead of just a boolean to control unit
dimension folding.
This gives the user more control to pick a good heuristic.
Folding reshapes helps fusion opportunities but may generate sub-optimal
generic ops.
Differential Revision: https://reviews.llvm.org/D101917
Thomas Lively [Thu, 6 May 2021 17:07:44 +0000 (10:07 -0700)]
[WebAssembly] Fix argument types in SIMD narrowing intrinsics
The builtins were updated to take signed parameters in 627a52695537, but the
intrinsics that use those builtins were not updated as well. The intrinsic test
did not catch this sign mismatch because it is only reported as an error under
-fno-lax-vector-conversions.
This commit fixes the type mismatch and adds -fno-lax-vector-conversions to the
test to catch similar problems in the future.
Differential Revision: https://reviews.llvm.org/D101979
Stefan Pintilie [Tue, 4 May 2021 14:35:43 +0000 (09:35 -0500)]
[PowerPC][LLD] Make sure that the correct Thunks are used.
This fixes an issue where mixed TOC / NOTOC calls can call the incorrect
thunks if a previous thunk already exists. The issue appears when a TOC
function calls a NOTOC callee and then a different NOTOC function calls the same
NOTOC callee. In this case the linker would sometimes incorrectly call the
same thunk for both cases.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101837
Craig Topper [Thu, 6 May 2021 16:40:49 +0000 (09:40 -0700)]
[RISCV] Remove unused RISCV::VLEFF and VLEFF_MASK. NFC
Looks like these got left behind when vleff isel was moved to
RISCVISelDAGToDAG.cpp
Hubert Tong [Thu, 6 May 2021 14:15:30 +0000 (10:15 -0400)]
[AIX][Test][ORC] Skip unsupported ORC C API tests on AIX
As mentioned before in D78813, currently the XCOFF backend does not
support writing 64-bit object files, which the ORC JIT tests will try to
exercise if we are on AIX. This patch disables the tests on AIX for now.
This is consistent with what's been done, for example, regarding
`armv7`.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D101971
Denys Shabalin [Thu, 6 May 2021 16:24:07 +0000 (18:24 +0200)]
Fix array attribute in bindings for linalg.init_tensor
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D101998
Jonas Paulsson [Thu, 6 May 2021 13:22:21 +0000 (15:22 +0200)]
[SystemZ] Don't use libcall for 128 bit shifts.
Expand 128 bit shifts instead of using a libcall.
This patch removes the 128 bit shift libcalls and thereby causes
ExpandShiftWithUnknownAmountBit() to be called.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D101993
Craig Topper [Thu, 6 May 2021 15:58:58 +0000 (08:58 -0700)]
[RISCV] Cleanup instruction formats used for B extension ternary operations.
Rename RVInstR4 as used by F/D/Zfh extensions to RVInstR4Frm.
Introduce new RVInstR4 that takes funct3 as a parameter.
Add new format classes for FSRI and FSRIW instead of trying to
bend RVInstR4 to use a shamt overlaid on rs2 and funct2.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D100427
Fraser Cormack [Thu, 6 May 2021 15:37:04 +0000 (16:37 +0100)]
[LangRef][VP] Fix typos in VP sdiv/udiv examples
David Goldman [Mon, 3 May 2021 20:18:57 +0000 (16:18 -0400)]
[clangd][ObjC] Highlight Objc Ivar refs
Treat them just like we do for properties - as a `property` semantic
token although ideally we could differentiate the two.
Differential Revision: https://reviews.llvm.org/D101785
Stanislav Mekhanoshin [Wed, 5 May 2021 18:26:07 +0000 (11:26 -0700)]
[AMDGPU] Fix 64 bit DPP validation
AMDGPUAsmParser::isSupportedDPPCtrl() was failing to correctly
find a DPP register operand; regardless of the position, it is
always src0. Moved this check into a new validateDPP() method
where we have full instruction already. In particular it was
failing to reject this case:
v_cvt_u32_f64 v5, v[0:1] quad_perm:[0,2,1,1] row_mask:0xf bank_mask:0xf
Essentially it was broken for any case where size of dst and
src0 differ.
It also improves the diagnostics with a proper error message.
The check in the InstPrinter also drops verification of the dst
register as it does not have anything to do with the dpp operand.
Differential Revision: https://reviews.llvm.org/D101930
Simon Pilgrim [Thu, 6 May 2021 15:19:36 +0000 (16:19 +0100)]
[SLP] Constify the TreeEntry* input into getEntryCost() + setInsertPointAfterBundle(). NFCI.
Simon Pilgrim [Thu, 6 May 2021 15:07:16 +0000 (16:07 +0100)]
[SLP] Constify the TreeEntry* input into dumpTreeCosts(). NFCI.
Simon Pilgrim [Thu, 6 May 2021 15:00:44 +0000 (16:00 +0100)]
[SLP] Use empty() instead of size() == 0. NFCI.
Jez Ng [Thu, 6 May 2021 15:18:19 +0000 (11:18 -0400)]
[lld-macho] Support loading of zippered dylibs
ld64 can emit dylibs that support more than one platform (typically macOS and
macCatalyst). This diff allows LLD to read in those dylibs. Note that this is a
super bare-bones implementation -- in particular, I haven't added support for
LLD to emit those multi-platform dylibs, nor have I added a variety of
validation checks that ld64 does. Until we have a use-case for emitting zippered
dylibs, I think this is good enough.
Fixes PR49597.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D101954
Jez Ng [Wed, 5 May 2021 22:30:23 +0000 (18:30 -0400)]
[lld-macho][nfc] Convert the mock libSystem.tbd to TBDv4
It doesn't seem like TBDv3 allows for specifying multiple platforms, so I'm
upgrading us to TBDv4. (We need to support multiple platforms in order to test
that we can handle zippered dylibs; that functionality will be added in an
upcoming diff.)
Differential Revision: https://reviews.llvm.org/D101953
thomasraoux [Thu, 6 May 2021 15:11:42 +0000 (08:11 -0700)]
[mlir][NFC] Fix warning in VectorTransforms.cpp
thomasraoux [Wed, 5 May 2021 23:03:22 +0000 (16:03 -0700)]
[mlir][vector] add pattern to cast away lead unit dimension for broadcast op
Differential Revision: https://reviews.llvm.org/D101955
Nemanja Ivanovic [Thu, 6 May 2021 14:44:07 +0000 (09:44 -0500)]
[PowerPC] Re-commit ed87f512bb9eb5c1d44e9a1182ffeaf23d6c5ae8
This was reverted in 3761b9a2345aff197707d23a68d4a178489f60e4 just
as I was about to commit the fix. This patch includes the
necessary fix.
Austin Kerbow [Thu, 6 May 2021 14:43:11 +0000 (07:43 -0700)]
[AMDGPU][NFC] Fix typos in SIFormMemoryClauses description
NFC.
David Spickett [Thu, 6 May 2021 14:13:19 +0000 (14:13 +0000)]
[OpenMP] Temporarily require X86 target for parallel_for_codegen.cpp test
Since https://reviews.llvm.org/D101849 this test has been failing
on bots that only enable either Arm or AArch64 targets.
See: https://lab.llvm.org/buildbot/#/builders/107/builds/7601
Temporarily requires X86 for this test while the difference is figured out.
Louis Dionne [Tue, 4 May 2021 22:51:58 +0000 (18:51 -0400)]
[libc++] Rewrite std::to_address to avoid relying on element_type
This is a rough reapplication of the change that fixed std::to_address
to avoid relying on element_type (da456167). It is somewhat different
because the fix to avoid breaking Clang (which caused it to be reverted
in 347f69c55) was a bit more involved.
Differential Revision: https://reviews.llvm.org/D101638
Victor Huang [Thu, 6 May 2021 13:37:09 +0000 (08:37 -0500)]
[AIX][TLS] Add support for TLSGD relocations to XCOFF objects
- Add branch absolute relocation R_RBA, R_TLS relocation for the variable offset
for the tlsgd model and R_TLSM for the region handle for the tlsgd model
- Properly set the relocation fixed values for R_TLS and R_TLSM
- Emit the TCEntry with the variant kind in the XCOFFStreamer
Reviewed by: sfertile, nemanjai, DiggerLin
Differential Revision: https://reviews.llvm.org/D100214
Nico Weber [Thu, 6 May 2021 14:00:39 +0000 (10:00 -0400)]
Revert "[PowerPC] Provide some P8-specific altivec overloads for P7"
This reverts commit ed87f512bb9eb5c1d44e9a1182ffeaf23d6c5ae8.
Breaks check-clang, see e.g.
https://lab.llvm.org/buildbot/#/builders/139/builds/3818
Raphael Isemann [Thu, 6 May 2021 14:00:24 +0000 (16:00 +0200)]
[lldb][NFC] Make assert in TestStaticVariables more expressive
Jay Foad [Thu, 6 May 2021 13:47:43 +0000 (14:47 +0100)]
[AMDGPU] SIInsertHardClauses: move more stuff into the class. NFC.
Nemanja Ivanovic [Thu, 6 May 2021 11:54:52 +0000 (06:54 -0500)]
[PowerPC] Provide some P8-specific altivec overloads for P7
This adds additional support for XL compatibility. There are a number
of functions in altivec.h that produce a single instruction (or a
very short sequence) for Power8 but can be done on Power7 without
scalarization. XL provides these implementations.
This patch adds the following overloads for doubleword vectors:
vec_add
vec_cmpeq
vec_cmpgt
vec_cmpge
vec_cmplt
vec_cmple
vec_sl
vec_sr
vec_sra
Paul C. Anagnostopoulos [Wed, 28 Apr 2021 23:55:12 +0000 (19:55 -0400)]
[TableGen] [Clang] Clean up Options.td and add asserts.
Differential Revision: https://reviews.llvm.org/D101766
Anastasia Stulova [Thu, 6 May 2021 11:48:46 +0000 (12:48 +0100)]
[OpenCL] Remove subgroups pragma in enqueue kernel and pipe builtins.
This patch simplifies the parser and makes the language semantics
consistent. There is no extension pragma requirement in the spec
for the subgroup functions in enqueue kernel or pipes and all other
builtin functions are available without the pragma.
Differential Revision: https://reviews.llvm.org/D100984
Jon Chesterfield [Thu, 6 May 2021 12:06:59 +0000 (13:06 +0100)]
[amdgpu-arch] Fix rpath to run from build dir
Prior to this, amdgpu-arch has RUNPATH set to $ORIGIN/../lib which works
for some installs, but not from the build directory where clang executes
the tool from when running tests.
This cmake option adds the location of the rocr runtime to the RUNPATH
(note, it amends RUNPATH here, despite the cmake option referring to RPATH)
to create a binary that runs from build or install location.
Before:
RUNPATH [$ORIGIN/../lib]
After:
RUNPATH [$ORIGIN/../lib:$HOME/llvm-install/lib]
Credit to Greg for knowing this trick and pointing to examples of it in use
for the aomp build scripts.
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D101926
Carl Ritson [Thu, 6 May 2021 11:27:03 +0000 (20:27 +0900)]
[AMDGPU] Fix WQM failure with single block inactive demote
Instruction test for inactive kill/demote needs to be based on
actual opcode not whether instruction would be lowered to demote.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D101966
Malhar Jajoo [Thu, 6 May 2021 11:20:00 +0000 (12:20 +0100)]
Revert "[ARM] Transforming memcpy to Tail predicated Loop"
Reverting commit since it causes failure (10462).
This reverts commit b856f4a232cbd43476e9b9f75c80aacfc6f5c152.
Benjamin Kramer [Thu, 6 May 2021 11:37:26 +0000 (13:37 +0200)]
[ORC] Silence unused variable warnings in Release builds. NFC.
David Green [Thu, 6 May 2021 11:36:46 +0000 (12:36 +0100)]
[LV] Account for tripcount when calculating vectorization profitability
The loop vectorizer will currently assume a large trip count when
calculating which of several vectorization factors are more profitable.
That is often not a terrible assumption to make as small trip count
loops will usually have been fully unrolled. There are cases however
where we will try to vectorize them, and especially when folding the
tail by masking can incorrectly choose to vectorize loops that are not
beneficial, due to the folded tail rounding the iteration count up for
the vectorized loop.
The motivating example here has a trip count of 5, so either performs 5
scalar iterations or 2 vector iterations (with VF=4). At a high enough
trip count the vectorization becomes profitable, but the rounding up to
2 vector iterations vs only 5 scalar makes it unprofitable.
This adds an alternative cost calculation when we know the max trip
count and are folding tail by masking, rounding the iteration count up
to the correct number for the vector width. We still do not account for
anything like setup cost or the mixture of vector and scalar loops, but
this is at least an improvement in a few cases that we have had
reported.
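The arithmetic behind the motivating example can be written out as a small sketch. The function names and the per-iteration costs used below are hypothetical and chosen only to make the numbers concrete; this is not the vectorizer's cost model.

```cpp
#include <cstdint>

// With tail folding, the vector loop runs ceil(tripCount / vf) iterations.
constexpr std::uint64_t vectorIterations(std::uint64_t tripCount,
                                         std::uint64_t vf) {
  return (tripCount + vf - 1) / vf;
}

// Compare the total cost of each loop instead of the cost per lane.
constexpr bool vectorIsProfitable(std::uint64_t tripCount, std::uint64_t vf,
                                  std::uint64_t scalarIterationCost,
                                  std::uint64_t vectorIterationCost) {
  return vectorIterationCost * vectorIterations(tripCount, vf) <
         scalarIterationCost * tripCount;
}
```

Assuming a scalar iteration costs 4 and a VF=4 vector iteration costs 12, the per-lane cost (12 / 4 = 3) looks better than scalar. With a trip count of 5, however, the vector loop costs 2 * 12 = 24 against 5 * 4 = 20 for the scalar loop, so vectorization is not profitable; at a trip count of 8 it costs 24 against 32 and is.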
Differential Revision: https://reviews.llvm.org/D101726
Ben Dunbobbin [Thu, 6 May 2021 11:05:27 +0000 (12:05 +0100)]
[LLD] Improve --strip-all help text
This is a slight improvement to the help text, as I was slightly
surprised when strip-all did more than remove the symbol table.
Currently, we match gold's help text for strip-all and strip-debug.
I think that the GNU documentation for these options is not particularly
clear. However, I have opted to make only a minor change here and keep
the help text similar to gold's as these are mature options that are
well understood.
ld.bfd (https://sourceware.org/binutils/docs/ld/Options.html) has a
similar implication although it defines strip-debug as a subset of
strip-all. However, I felt that noting that strip-all implies strip-debug
is better; because, with the ld.bfd approach you have to read both the
--strip-debug and the --strip-all help text to understand the behaviour
of --strip-all (and the --strip-all help text doesn't indicate that the
--strip-debug help text is related).
Differential Revision: https://reviews.llvm.org/D101890
Christian Sigg [Wed, 5 May 2021 18:06:37 +0000 (20:06 +0200)]
[mlir] Add support for ops with regions in 'gpu-async-region' rewriter.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D101757
Simon Pilgrim [Thu, 6 May 2021 09:26:24 +0000 (10:26 +0100)]
[AMDGPU] Regenerate fp2int tests. NFCI.
Simon Pilgrim [Thu, 6 May 2021 09:25:59 +0000 (10:25 +0100)]
[AMDGPU] Regenerate shift tests. NFCI.
Jonas Paulsson [Sun, 2 May 2021 14:38:05 +0000 (16:38 +0200)]
[SystemZ] Support builtin_frame_address with packed stack without backchain.
In order to use __builtin_frame_address(0) with packed stack and no
backchain, the address of where the backchain would have been written is
returned (like GCC).
This address may either contain a saved register or be unused.
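For reference, a minimal use of the builtin looks like the sketch below.
The wrapper name is hypothetical. The value returned is target and ABI
dependent; the code only shows the call form and says nothing about what
the returned address holds on SystemZ.
```cpp
// __builtin_frame_address is a GCC/Clang builtin. Level 0 asks for the
// frame address of the function that contains the call.
void *currentFrameAddress() {
  return __builtin_frame_address(0);
}
```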
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D101897
Kerry McLaughlin [Thu, 6 May 2021 09:50:51 +0000 (10:50 +0100)]
[SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences
Adds support for scalable vectorization of loops containing first-order recurrences, e.g.:
```
for (int i = 0; i < n; i++)
  b[i] = a[i] + a[i - 1];
```
This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into
account when inserting into and extracting from the last lane of a vector.
CreateVectorSplice has been added to construct a vector for the recurrence, which
returns a splice intrinsic for scalable types. For fixed-width the behaviour
remains unchanged as CreateVectorSplice will return a shufflevector instead.
The tests included here are the same as test/Transforms/LoopVectorize/first-order-recurrence.ll
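A fixed-width emulation of the vector built for the recurrence, as a rough
sketch: lane 0 takes the last lane of the previous vector and the
remaining lanes take the leading lanes of the current vector. The helper
name spliceRecurrence is hypothetical and this is plain C++, not the
IRBuilder API; for scalable vectors the last lane index also depends on
vscale, which this sketch does not model.
```cpp
#include <array>
#include <cstddef>

// Builds the vector of "previous iteration" values for a first-order
// recurrence: {previous[VF-1], current[0], ..., current[VF-2]}.
template <typename T, std::size_t VF>
std::array<T, VF> spliceRecurrence(const std::array<T, VF> &previous,
                                   const std::array<T, VF> &current) {
  static_assert(VF > 0, "need at least one lane");
  std::array<T, VF> result{};
  result[0] = previous[VF - 1];          // last lane of the previous vector
  for (std::size_t lane = 1; lane < VF; ++lane)
    result[lane] = current[lane - 1];    // leading lanes of the current one
  return result;
}
```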
Reviewed By: david-arm, fhahn
Differential Revision: https://reviews.llvm.org/D101076
Eliza Velasquez [Thu, 6 May 2021 10:12:05 +0000 (12:12 +0200)]
[clang-format] Rename common types between C#/JS
Reviewed By: curdeius
Differential Revision: https://reviews.llvm.org/D101862
Eliza Velasquez [Thu, 6 May 2021 10:06:00 +0000 (12:06 +0200)]
[clang-format] Fix C# nullable-related errors
This fixes two errors:
Previously, clang-format was splitting up type identifiers from the
nullable ?. This changes this behavior so that the type name sticks with
the operator.
Additionally, nullable operators attached to return types in interface
functions were not parsed correctly. Digging deeper, it looks like
interface bodies were being parsed differently than classes and structs,
causing MustBeDeclaration to be incorrect for interface members. They
now share the same logic.
One other change is reintroducing the CSharpNullable type independent of
JsTypeOptionalQuestion. Despite having a similar semantic purpose, their
actual syntax differs quite a bit.
Reviewed By: MyDeveloperDay, curdeius
Differential Revision: https://reviews.llvm.org/D101860
Eliza Velasquez [Thu, 6 May 2021 09:22:31 +0000 (11:22 +0200)]
[clang-format] Add more support for C# 8 nullables
This adds support for the null-coalescing assignment and null-forgiving
operators.
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-coalescing-operator
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-forgiving
Reviewed By: krasimir, curdeius
Differential Revision: https://reviews.llvm.org/D101702
Jay Foad [Wed, 7 Apr 2021 12:49:07 +0000 (13:49 +0100)]
[AMDGPU] SIFoldOperands: clean up tryConstantFoldOp
First clean up the strange API of tryConstantFoldOp where it took an
immediate operand value, but no indication of which operand it was the
value for.
Second clean up the loop that calls tryConstantFoldOp so that it does
not have to restart from the beginning every time it folds an
instruction.
This is NFCI but there are some minor changes caused by the order in
which things are folded.
Differential Revision: https://reviews.llvm.org/D100031
Andrzej Warzynski [Fri, 23 Apr 2021 14:46:35 +0000 (14:46 +0000)]
[flang] Remove `%f18` from LIT configuration files
`%f18` was originally introduced to represent the old Flang driver,
`f18`. With the introduction of the new driver, `flang-new`, we have
been switching to `%flang` (compiler driver) and `%flang_fc1` (frontend
driver) as more generic alternatives.
As most tests have been ported to use the new LIT variables instead of
`%f18`, this is a good time to remove it from lit.cfg.py. There's only one
test left that requires the old driver to run. It's updated with:
```
! REQUIRES: old-flang-driver
```
This way we preserve its semantics while reducing the number of
variables in LIT configuration.
Differential Revision: https://reviews.llvm.org/D101281
Malhar Jajoo [Thu, 6 May 2021 00:38:20 +0000 (01:38 +0100)]
[ARM] Transforming memcpy to Tail predicated Loop
This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).
From an implementation point of view, the patch
- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered during the first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
on matching the above node.
- adds a custom inserter function that expands the pseudo instruction into MIR suitable
to be converted (by later passes) into a WLSTP loop.
Note: A cli option is used to control the conversion of memcpy to TP
loop and this option is currently disabled by default. It may be enabled
in the future after further downstream testing.
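To illustrate what a tail-predicated copy loop does, here is a scalar
emulation in plain C++. It is only a sketch: the function name is
hypothetical, it assumes 16 byte lanes per 128-bit MVE vector, and it is
not the MIR or the instruction sequence that the patch generates.
```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Each outer iteration stands in for one vector load/store pair. Only
// the lanes still covered by the remaining byte count are active, so no
// separate scalar tail loop is needed.
void tailPredicatedCopy(std::uint8_t *dst, const std::uint8_t *src,
                        std::size_t count) {
  constexpr std::size_t kLanes = 16; // assumed: byte lanes per vector
  while (count > 0) {
    // Stand-in for the predicate: number of active lanes this iteration.
    const std::size_t active = std::min(count, kLanes);
    for (std::size_t lane = 0; lane < active; ++lane)
      dst[lane] = src[lane];
    dst += active;
    src += active;
    count -= active;
  }
}
```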
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D99723
James Henderson [Wed, 5 May 2021 10:56:46 +0000 (11:56 +0100)]
[lit] Report tool path from use_llvm_tool if found via env variable
Previously, if the search_env argument was specified, and the tool was
found at that location, the path was not reported, unlike other
situations when this function was called. Adding the reporting makes the
function consistent.
Reviewed by: thopre
Differential Revision: https://reviews.llvm.org/D101896
Tim Renouf [Tue, 4 May 2021 09:10:41 +0000 (10:10 +0100)]
[llvm-objdump] Use std::make_unique
Fix up my recent commit rG1128311a19179ceca799ff0fbc4dd206ab56e560 to
use std::make_unique instead of std::unique_ptr(new), as requested by
David Blaikie.
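The two spellings side by side, as a small sketch. The Tool type and the
function names are hypothetical and are not taken from llvm-objdump.
```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical type used only for illustration.
struct Tool {
  std::string name;
  explicit Tool(std::string n) : name(std::move(n)) {}
};

// Older spelling: the type is named twice and a raw `new` is visible.
std::unique_ptr<Tool> makeWithNew(std::string name) {
  return std::unique_ptr<Tool>(new Tool(std::move(name)));
}

// Requested spelling: std::make_unique (C++14) hides the raw `new`.
std::unique_ptr<Tool> makeWithMakeUnique(std::string name) {
  return std::make_unique<Tool>(std::move(name));
}
```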
Differential Revision: https://reviews.llvm.org/D101822
Guillaume Chatelet [Thu, 6 May 2021 07:46:19 +0000 (07:46 +0000)]
[llvm][NFC] Remove CallingConvLower deprecated alignment functions
Differential Revision: https://reviews.llvm.org/D101910
Guillaume Chatelet [Thu, 6 May 2021 07:44:14 +0000 (07:44 +0000)]
[llvm][NFC] Remove SelectionDag alignment deprecated functions
Differential Revision: https://reviews.llvm.org/D101909
Guillaume Chatelet [Thu, 6 May 2021 07:40:18 +0000 (07:40 +0000)]
[llvm][NFC] Remove deprecated InterleaveGroup::getAlignment() function.
Differential Revision: https://reviews.llvm.org/D101907
Guillaume Chatelet [Thu, 6 May 2021 07:28:00 +0000 (07:28 +0000)]
[llvm][NFC] Remove deprecated DataLayout::getPreferredAlignment functions
Differential Revision: https://reviews.llvm.org/D101906
Guillaume Chatelet [Thu, 6 May 2021 07:21:23 +0000 (07:21 +0000)]
[llvm][NFC] Remove deprecated Alignment::None()
Differential Revision: https://reviews.llvm.org/D101905
Johannes Doerfert [Thu, 22 Apr 2021 05:57:28 +0000 (00:57 -0500)]
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, which meant we could not just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean things up while I was at it. For example, we should not
"parse declarations and definitions" as part of OpenMP parsing, as this
will always break at some point. Instead, we keep track of what region we
are in and act on definitions and declarations, which is what we already
do for declare variant and other begin/end directives.
Highlights:
- new diagnostics for restrictions specified in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
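For readers unfamiliar with the directive, a small sketch of the
`begin declare target` form with a `device_type` clause follows. The
function and variable names are hypothetical. Without OpenMP enabled the
pragmas are ignored and both declarations are ordinary host code; the
comments restate the behaviour described in this commit message and have
not been verified separately.
```cpp
// Declarations between the begin/end pair are declare target
// declarations. device_type(host) asks for a host-only version.
#pragma omp begin declare target device_type(host)
int hostOnlyAnswer() { return 42; }
#pragma omp end declare target

// device_type(nohost) marks a global as device only. Per the highlights
// above, such globals are omitted on the host when OpenMP is enabled.
#pragma omp begin declare target device_type(nohost)
int deviceOnlyCounter = 0;
#pragma omp end declare target
```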
---
Notes:
The patch is larger than I hoped, but it turns out that most changes on
their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030