platform/upstream/llvm.git
3 years ago[AMDGPU] Restrict immediate scratch offsets
Sebastian Neubauer [Mon, 3 May 2021 08:14:12 +0000 (10:14 +0200)]
[AMDGPU] Restrict immediate scratch offsets

gfx9 does not work with negative offsets, gfx10 works only with
aligned negative offsets, but not with unaligned negative offsets.

This is slightly more conservative than needed, gfx9 does support
negative offsets when a VGPR address is used and gfx10 supports
negative, unaligned offsets when an SGPR address is used, but we
do not make use of that with this patch.
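
The conservative rule above can be sketched as a small predicate. This is only an illustration, not the AMDGPU backend code: `Gen`, `isLegalScratchImmOffset`, and the default alignment of 4 are assumptions made for the example, and the range limits on offsets are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical generation tag; not an LLVM type.
enum class Gen { GFX9, GFX10 };

// Returns true if an immediate scratch offset passes the conservative rule.
// `align` is an assumed power-of-two alignment requirement; the commit
// message does not state its value. Offset range limits are not modelled.
bool isLegalScratchImmOffset(Gen gen, int64_t offset, int64_t align = 4) {
  assert(align > 0 && (align & (align - 1)) == 0 && "align must be a power of two");
  if (offset >= 0)
    return true;                       // the rule only concerns negative offsets
  if (gen == Gen::GFX9)
    return false;                      // gfx9: reject every negative offset
  return (offset & (align - 1)) == 0;  // gfx10: negative offsets must be aligned
}
```

The SGPR/VGPR special cases mentioned above are deliberately left out, matching the patch's conservative choice.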

Differential Revision: https://reviews.llvm.org/D101292

3 years agoAMDGPU: Correct const_index_stride for wave 32 for PAL ABI
David Stuttard [Mon, 24 Feb 2020 21:19:15 +0000 (21:19 +0000)]
AMDGPU: Correct const_index_stride for wave 32 for PAL ABI

Retrying after revert and fix (removed implicit def flag from operand). Now
passes with expensive_checks enabled.

Since there is a single scratch resource descriptor for all shaders, if there is
a wave32 and a wave64 shader (for instance for VsFs pairs)
then the const_index_stride will be incorrect for wave32 shaders.

Differential Revision: https://reviews.llvm.org/D101830

Change-Id: Ie3b8b2921237968caca91527dd0c97b1b0cc0360

3 years agoFix: [DebugInfo] Fix crash when emitting an invalidated SDDbgValue
Stephen Tozer [Fri, 7 May 2021 12:36:31 +0000 (13:36 +0100)]
Fix: [DebugInfo] Fix crash when emitting an invalidated SDDbgValue

This patch is a fix for revision ce0c1f3c, which caused test failures on
bots without x86 as a registered target. This patch moves the test added
in the prior patch to the x86 folder, so that it only runs on bots with
the correct target available.

3 years ago[ARM] Transforming memset to Tail predicated Loop
Malhar Jajoo [Thu, 6 May 2021 23:29:06 +0000 (00:29 +0100)]
[ARM] Transforming memset to Tail predicated Loop

This patch converts llvm.memset intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

The llvm.memset is converted to a TP loop for both
constant and non-constant input sizes (of llvm.memset).
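
As a rough mental model, a tail-predicated loop stores one full vector per iteration and masks off the lanes that fall past the end of the buffer in the last iteration. The scalar emulation below is a sketch under that assumption; `tailPredicatedMemset` and `kLanes` are names invented for the example and do not appear in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 16; // bytes in one 128-bit MVE vector

// Scalar emulation of a tail-predicated memset loop. Each outer iteration
// stands for one predicated vector store; only the lanes still inside the
// buffer are written. Returns the number of vector iterations executed.
std::size_t tailPredicatedMemset(uint8_t *dst, uint8_t value, std::size_t n) {
  assert(dst != nullptr || n == 0);
  std::size_t iterations = 0;
  for (std::size_t remaining = n; remaining > 0; ++iterations) {
    // The predicate enables min(remaining, kLanes) lanes.
    std::size_t active = remaining < kLanes ? remaining : kLanes;
    for (std::size_t lane = 0; lane < active; ++lane)
      dst[lane] = value;
    dst += active;
    remaining -= active;
  }
  return iterations;
}
```

No scalar epilogue is needed for the leftover bytes, which is the point of tail predication; the same loop shape works whether or not `n` is a compile-time constant.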

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100435

3 years ago[OpenCL] Fix optional image types.
Anastasia Stulova [Fri, 7 May 2021 11:15:51 +0000 (12:15 +0100)]
[OpenCL] Fix optional image types.

This change allows the use of identifiers for image types
from `cl_khr_gl_msaa_sharing` freely in the kernel code if
the extension is not supported since they are not in the
list of the reserved identifiers.

This change also removed the need for pragma for the types
in the extensions since the spec does not require the pragma
uses.

Differential Revision: https://reviews.llvm.org/D100983

3 years ago[NFC] Correctly assert the indents for printEnumValHelpStr.
Joachim Meyer [Thu, 6 May 2021 20:26:19 +0000 (22:26 +0200)]
[NFC] Correctly assert the indents for printEnumValHelpStr.

Only verify that there's no negative indent.
Noted by @chapuni in https://reviews.llvm.org/D93494.

Reviewed By: chapuni

Differential Revision: https://reviews.llvm.org/D102021

3 years ago[DebugInfo] Fix crash when emitting an invalidated SDDbgValue
Stephen Tozer [Thu, 29 Apr 2021 15:36:05 +0000 (16:36 +0100)]
[DebugInfo] Fix crash when emitting an invalidated SDDbgValue

This patch fixes a crash in the compiler that occurs when certain
invalidated SDDbgValues are emitted. The cause of this was that we would
attempt to check the liveness of the debug value's operands, which
triggers an assert if any of those operands are invalid. This patch
changes this check such that it only occurs if the SDDbgValue is valid;
if not, the check is irrelevant anyway, so can be safely ignored.

Differential Revision: https://reviews.llvm.org/D101540

3 years ago[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts
Simon Pilgrim [Fri, 7 May 2021 12:12:16 +0000 (13:12 +0100)]
[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts

Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts.

Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication.

I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to).

NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly.
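
To illustrate the idea (this is a sketch, not the TargetLowering code), a 64-bit left shift split into two 32-bit parts can be written with one funnel shift, one plain shift, and a select on bit 5 of the amount. The names `fshl32`, `Parts`, and `shlParts` are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Funnel shift left: the high 32 bits of ((hi:lo) << (amt % 32)).
uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t amt) {
  amt &= 31;
  return amt == 0 ? hi : (hi << amt) | (lo >> (32 - amt));
}

struct Parts {
  uint32_t lo;
  uint32_t hi;
};

// Shift the 64-bit value (hi:lo) left by amt using only 32-bit operations.
Parts shlParts(uint32_t lo, uint32_t hi, uint32_t amt) {
  assert(amt < 64 && "shift amount out of range");
  uint32_t newHi = fshl32(hi, lo, amt); // correct when amt < 32
  uint32_t newLo = lo << (amt & 31);
  if (amt & 32) {                       // amt >= 32: low part moves up
    newHi = newLo;
    newLo = 0;
  }
  return {newLo, newHi};
}
```

The right-shift variants follow the same pattern with a funnel shift right and either a zero or a sign fill for the high part.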

Differential Revision: https://reviews.llvm.org/D101987

3 years agoRevert "AMDGPU: Correct const_index_stride for wave 32 for PAL ABI"
David Stuttard [Fri, 7 May 2021 11:49:17 +0000 (12:49 +0100)]
Revert "AMDGPU: Correct const_index_stride for wave 32 for PAL ABI"

This reverts commit 442de0c1adf36bfddb5fb66b442bba8999fa733b.

3 years ago[SLP] Regenerate tests to reduce diff in D98714. NFCI.
Simon Pilgrim [Fri, 7 May 2021 11:31:05 +0000 (12:31 +0100)]
[SLP] Regenerate tests to reduce diff in D98714. NFCI.

3 years ago[X86] Ensure we pass DebugLoc by const reference where possible. NFCI.
Simon Pilgrim [Thu, 6 May 2021 17:57:19 +0000 (18:57 +0100)]
[X86] Ensure we pass DebugLoc by const reference where possible. NFCI.

Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef

3 years ago[NFC] (test commit) Changed example invocation of C++ for OpenCL
Ole Strohm [Fri, 7 May 2021 11:30:31 +0000 (12:30 +0100)]
[NFC] (test commit) Changed example invocation of C++ for OpenCL

3 years agoAMDGPU: Correct const_index_stride for wave 32 for PAL ABI
David Stuttard [Mon, 24 Feb 2020 21:19:15 +0000 (21:19 +0000)]
AMDGPU: Correct const_index_stride for wave 32 for PAL ABI

Since there is a single scratch resource descriptor for all shaders, if there is
a wave32 and a wave64 shader (for instance for VsFs pairs)
then the const_index_stride will be incorrect for wave32 shaders.

Differential Revision: https://reviews.llvm.org/D101830

Change-Id: Id8de5566b0d1a07a814e2e7db016df9d20bf6d2c

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests with non-eliminatible MMX moves
Roman Lebedev [Fri, 7 May 2021 10:43:46 +0000 (13:43 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests with non-eliminatible MMX moves

In Zen3, MMX moves are *not* eliminated;
I've verified this with llvm-exegesis.

3 years ago[X86] AMD Zen 3: 32/64 -bit GPR register moves are zero-cycle
Roman Lebedev [Fri, 7 May 2021 10:02:14 +0000 (13:02 +0300)]
[X86] AMD Zen 3: 32/64 -bit GPR register moves are zero-cycle

I've verified this with llvm-exegesis.
This is not limited to zero registers.

Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move
The processor is able to execute certain register to register
mov operations with zero cycle delay.

Agner,
22.13 Instructions with no latency
Register-to-register move instructions are resolved at
the register rename stage without using any execution units.
These instructions have zero latency. It is possible to do six such
register renamings per clock cycle, and it is even possible to
rename the same register multiple times in one clock cycle.

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests with eliminatible GPR moves
Roman Lebedev [Fri, 7 May 2021 10:02:07 +0000 (13:02 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests with eliminatible GPR moves

3 years ago[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST
Stephen Tozer [Thu, 29 Apr 2021 15:04:24 +0000 (16:04 +0100)]
[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST

This patch modifies updateDbgUsersToReg to properly handle
DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices
(i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and
updating the register for all matching operands.

Differential Revision: https://reviews.llvm.org/D101523

3 years ago[llvm-dwarfdump] Help option output should be consistent with the command guide
gbreynoo [Fri, 7 May 2021 10:21:51 +0000 (11:21 +0100)]
[llvm-dwarfdump] Help option output should be consistent with the command guide

The dwarfdump command guide shows the short options used as aliases but
these are not found in the help text unless --show-hidden is used.
Investigating other tools some follow this pattern, others like
llvm-objdump show aliases with --help. This change fixes the help output
to be consistent with the command guide. This includes updating alias
descriptions in the help output to use "--".

As part of this change I updated cmdline.test, including some options
that were missing testing.

Differential Revision: https://reviews.llvm.org/D101646

3 years ago[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen
Guillaume Chatelet [Fri, 7 May 2021 10:22:41 +0000 (10:22 +0000)]
[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen

Differential Revision: https://reviews.llvm.org/D102058

3 years ago[llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions
Guillaume Chatelet [Fri, 7 May 2021 09:12:56 +0000 (09:12 +0000)]
[llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions

Differential Revision: https://reviews.llvm.org/D102056

3 years ago[AsmParser][ARM] Make .thumb_func imply .thumb
LemonBoy [Fri, 7 May 2021 10:09:38 +0000 (12:09 +0200)]
[AsmParser][ARM] Make .thumb_func imply .thumb

The GNU as documentation states that a `.thumb_func` directive implies `.thumb`, so teach the asm parser to switch mode whenever it is encountered. The labeled form, which is exclusive to Apple's toolchain, does not switch mode at all.

Reviewed By: nickdesaulniers, peter.smith

Differential Revision: https://reviews.llvm.org/D101975

3 years ago[gn build] Port 98e5ede60499
LLVM GN Syncbot [Fri, 7 May 2021 09:15:50 +0000 (09:15 +0000)]
[gn build] Port 98e5ede60499

3 years ago[AMDGPU] Serialize MFInfo::ScavengeFI
Sebastian Neubauer [Fri, 30 Apr 2021 19:31:55 +0000 (21:31 +0200)]
[AMDGPU] Serialize MFInfo::ScavengeFI

Serialize ScavengeFI from SIMachineFunctionInfo into yaml.

ScavengeFI is not used outside of the PrologEpilogInserter,
so this shouldn't change anything.

Differential Revision: https://reviews.llvm.org/D101367

3 years ago[flang] Remove redundant reallocation
Diana Picus [Thu, 6 May 2021 09:26:57 +0000 (09:26 +0000)]
[flang] Remove redundant reallocation

The MaxMinHelper used to implement MIN and MAX for character types would
reallocate the accumulator whenever the number of characters in it was
different from that in the other input. This is unnecessary if the
accumulator is already larger than the other input. This patch fixes the
issue and adds a unit test to make sure we don't reallocate if we don't
need to.
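
A minimal sketch of the intended behaviour, assuming a vector-backed accumulator: storage grows only when the other input is longer. `Accumulator` and `ensureRoomFor` are hypothetical names, not the flang runtime's MaxMinHelper, and the blank padding is an assumption based on how Fortran pads character values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulator, not the flang runtime's MaxMinHelper.
struct Accumulator {
  std::vector<char> chars;
  int reallocations = 0;

  // Make room for an input of `otherLen` characters.
  void ensureRoomFor(std::size_t otherLen) {
    if (chars.size() < otherLen) {
      chars.resize(otherLen, ' '); // grow and blank-pad (assumed padding)
      ++reallocations;
    }
    // A larger or equal accumulator keeps its storage untouched.
    assert(chars.size() >= otherLen);
  }
};
```

Counting reallocations, as done here, is one way a unit test could check that no reallocation happens when none is needed.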

Differential Revision: https://reviews.llvm.org/D101984

3 years ago[flang] Add tests for MIN for character arrays. NFC
Diana Picus [Tue, 4 May 2021 18:57:54 +0000 (18:57 +0000)]
[flang] Add tests for MIN for character arrays. NFC

We used to test only scalar character types. This commit adds tests for
arrays with a few simple shapes.

Differential Revision: https://reviews.llvm.org/D101983

3 years ago[LoopVectorize][SVE] Remove assert for scalable vector in InnerLoopVectorizer::fixRed...
Caroline Concatto [Thu, 22 Apr 2021 07:24:40 +0000 (08:24 +0100)]
[LoopVectorize][SVE] Remove assert for scalable vector in InnerLoopVectorizer::fixReduction

The function fixReduction used to assert/crash for scalable vector when
a vector reduce could be done with a smaller vector.
This patch removes this assertion as it is safe to use scalable vector for
vector reduce and truncate.

Differential Revision: https://reviews.llvm.org/D101260

3 years ago[lit][test] Attempt fix when paths include symlink
James Henderson [Fri, 7 May 2021 08:20:50 +0000 (09:20 +0100)]
[lit][test] Attempt fix when paths include symlink

Example of failure:
https://lab.llvm.org/staging/#/builders/126/builds/345/steps/5/logs/FAIL__lit___use-tool-search-env_py

3 years ago[libcxx] Fix a case of -Wundef warnings. NFC.
Martin Storsjö [Thu, 6 May 2021 07:18:41 +0000 (10:18 +0300)]
[libcxx] Fix a case of -Wundef warnings. NFC.

Differential Revision: https://reviews.llvm.org/D101978

3 years ago[LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion
Peilin Guo [Fri, 7 May 2021 08:05:50 +0000 (16:05 +0800)]
[LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion

getValueFromCondition() uses a Visited set to record intermediate values.
However, it computes the value first, in postorder, and updates the Visited
set only afterwards. It is therefore trapped in infinite recursion if the IR
contains a use that is not dominated by its def, as in this example:

  %tmp3 = or i1 undef, %tmp4
  %tmp4 = or i1 undef, %tmp3

To prevent this, we can insert an Overdefined placeholder into the set
before computing the actual value.
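
The placeholder technique can be sketched on a toy graph. The code below is not LazyValueInfo; `Lattice`, `Node`, and `evaluate` are invented for the illustration, and each node is treated as an `or` of its operands.

```cpp
#include <cassert>
#include <map>
#include <vector>

enum class Lattice { Overdefined, True, False };

// Toy "IR": a node is an OR of other nodes (by index); a node without
// operands is a known constant. These are not LLVM's data structures.
struct Node {
  std::vector<int> operands;
  bool constant = false;
};

Lattice evaluate(const std::vector<Node> &nodes, int id,
                 std::map<int, Lattice> &visited) {
  assert(id >= 0 && id < static_cast<int>(nodes.size()));
  auto it = visited.find(id);
  if (it != visited.end())
    return it->second; // also hits the placeholder of an in-progress node

  visited[id] = Lattice::Overdefined; // placeholder inserted BEFORE recursing

  const Node &n = nodes[id];
  Lattice result = Lattice::False;
  if (n.operands.empty()) {
    result = n.constant ? Lattice::True : Lattice::False;
  } else {
    for (int op : n.operands) {
      Lattice v = evaluate(nodes, op, visited);
      if (v == Lattice::True) {
        result = Lattice::True; // one true operand decides an OR
        break;
      }
      if (v == Lattice::Overdefined)
        result = Lattice::Overdefined;
    }
  }
  visited[id] = result; // replace the placeholder with the real value
  return result;
}
```

With two nodes that refer to each other, as in the `%tmp3`/`%tmp4` example, the recursion reaches the placeholder and stops instead of looping forever.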

Reviewed by: nikic

Differential Revision: https://reviews.llvm.org/D101273

3 years ago[Debug-Info][NFC] add a wrapper for Die.addValue
Chen Zheng [Fri, 7 May 2021 07:00:11 +0000 (07:00 +0000)]
[Debug-Info][NFC] add a wrapper for Die.addValue

Add a new wrapper function addAttribute() for the Die.addValue() function,
so that attributes can be controlled through a single interface.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101125

3 years ago[GlobalISel] Micro-optimize the conditional branch optimization.
Amara Emerson [Fri, 7 May 2021 07:00:47 +0000 (00:00 -0700)]
[GlobalISel] Micro-optimize the conditional branch optimization.

Convert a check into an assert and pass an MI instead of recomputing in the
apply function.

3 years ago[MLIR][SPIRV] Properly (de-)serialize BranchConditionalOp.
KareemErgawy-TomTom [Fri, 7 May 2021 06:59:35 +0000 (08:59 +0200)]
[MLIR][SPIRV] Properly (de-)serialize BranchConditionalOp.

Implements proper (de-)serialization logic for BranchConditionalOp when
such ops have true/false target operands.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D101602

3 years ago[XCOFF] handle string constants generation for AIX
Chen Zheng [Fri, 7 May 2021 06:19:29 +0000 (06:19 +0000)]
[XCOFF] handle string constants generation for AIX

This follows https://www.ibm.com/docs/en/aix/7.2?topic=constants-string

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D101280

3 years ago[mlir][linalg] Add IndexedGenericOp to GenericOp canonicalization.
Tobias Gysi [Fri, 7 May 2021 05:59:05 +0000 (05:59 +0000)]
[mlir][linalg] Add IndexedGenericOp to GenericOp canonicalization.

Replace all `linalg.indexed_generic` ops by `linalg.generic` ops that access the iteration indices using the `linalg.index` op.

Differential Revision: https://reviews.llvm.org/D101612

3 years ago[PowerPC] Remove extra swap for extract+vperm on LE
Qiu Chaofan [Fri, 7 May 2021 03:04:47 +0000 (11:04 +0800)]
[PowerPC] Remove extra swap for extract+vperm on LE

This is a simple fix on LE. On BE, vector shuffles are categorized into
different ops. We may need more work to eliminate these in
tablegen/pre-isel.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D101605

3 years agoBPF: fix FIELD_EXISTS relocation with array subscripts
Yonghong Song [Thu, 6 May 2021 23:31:30 +0000 (16:31 -0700)]
BPF: fix FIELD_EXISTS relocation with array subscripts

Lorenz Bauer reported an issue in bpf mailing list ([1]) where
for FIELD_EXISTS relocation, if the object is an array subscript,
the patched immediate is the object offset from the base address,
instead of 1.

Currently in BPF AbstractMemberAccess pass, the final offset
from the base address is the patched offset except FIELD_EXISTS
which is 1 unconditionally. In this particular case, the last
data structure access is not a field (struct/union offset)
so it didn't hit the place to set patched immediate to be 1.

This patch fixed the issue by checking the relocation type.
If the type is FIELD_EXISTS, just set to 1.
Tested by modifying some bpf selftests, libbpf is okay with
such types with FIELD_EXISTS relocation.
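
In outline, the fix makes the patched value depend on the relocation type alone. The sketch below uses hypothetical names (`RelocKind`, `patchedImmediate`); the real enumerators and pass code in the BPF backend differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical relocation kinds; the real enumerators live in the BPF backend.
enum class RelocKind { FieldByteOffset, FieldExists };

// Value patched into the instruction for a given relocation.
// For FieldExists the result is 1 regardless of whether the last access
// was a struct/union member or an array subscript.
uint64_t patchedImmediate(RelocKind kind, uint64_t offsetFromBase) {
  if (kind == RelocKind::FieldExists)
    return 1;
  return offsetFromBase;
}
```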

 [1] https://lore.kernel.org/bpf/CACAyw99n-cMEtVst7aK-3BfHb99GMEChmRLCvhrjsRpHhPrtvA@mail.gmail.com/

Differential Revision: https://reviews.llvm.org/D102036

3 years ago[TableGen] Use range-based for loops (NFC)
Coelacanthus [Thu, 6 May 2021 10:36:52 +0000 (18:36 +0800)]
[TableGen] Use range-based for loops (NFC)

Use range-based for loops in TableGen.

Reviewed By: Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D101994

3 years ago[IR] Fix typo in comment of Intrinsics.td (NFC)
qixingxue [Thu, 6 May 2021 07:33:56 +0000 (15:33 +0800)]
[IR] Fix typo in comment of Intrinsics.td (NFC)

3 years ago[CGAtomic] Lift strong requirement for remaining compare_exchange combinations
Bruno Cardoso Lopes [Fri, 7 May 2021 04:04:23 +0000 (21:04 -0700)]
[CGAtomic] Lift strong requirement for remaining compare_exchange combinations

Follow up on 431e3138a and complete the other possible combinations.

Besides enforcing the new behavior, it also mitigates TSAN false positives when
combining orders that used to be stronger.

3 years ago[mlir][Linalg] Allow folding to rank-zero tensor when using rank-reducing subtensors.
MaheshRavishankar [Fri, 7 May 2021 00:17:29 +0000 (17:17 -0700)]
[mlir][Linalg] Allow folding to rank-zero tensor when using rank-reducing subtensors.

The pattern to convert subtensor ops to their rank-reduced versions
(by dropping unit-dims in the result) can also convert to a zero-rank
tensor. Handle that case.
This also fixes an OOB access bug in the existing pattern for such
cases.

Differential Revision: https://reviews.llvm.org/D101949

3 years ago[dfsan] Rename and fix an internal test issue for mmap+calloc
Jianzhou Zhao [Fri, 30 Apr 2021 17:18:05 +0000 (17:18 +0000)]
[dfsan] Rename and fix an internal test issue for mmap+calloc

The linker suggests using -Wl,-z,notext.

Replacing assert with exit also fixes this.

After renaming, interceptor.c would be used to test interceptors in general by D101204.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101649

3 years ago[llvm][TextAPI] add mapping from OS string to Platform
Cyndy Ishida [Thu, 6 May 2021 23:18:55 +0000 (16:18 -0700)]
[llvm][TextAPI] add mapping from OS string to Platform

* add utility for matching target triple OS value strings to PlatformKind

This was reviewed offline by ributzka, steven_wu

3 years ago[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32
Stanislav Mekhanoshin [Thu, 6 May 2021 20:29:48 +0000 (13:29 -0700)]
[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32

Differential Revision: https://reviews.llvm.org/D102022

3 years ago[mlir][tosa] Added div op, variadic concat. Removed placeholder. Spec v0.22 alignment.
Rob Suderman [Thu, 6 May 2021 22:55:58 +0000 (15:55 -0700)]
[mlir][tosa] Added div op, variadic concat. Removed placeholder. Spec v0.22 alignment.

Nearly complete alignment to spec v0.22
- Adds Div op
- Concat inputs now variadic
- Removes Placeholder op

Note: TF side PR https://github.com/tensorflow/tensorflow/pull/48921 deletes Concat legalizations to avoid breaking TensorFlow CI. This must be merged only after the TF PR has merged.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D101958

3 years ago[libomptarget][nfc] Refactor amdgpu partial barrier to simplify adding a second one
Jon Chesterfield [Thu, 6 May 2021 22:52:18 +0000 (23:52 +0100)]
[libomptarget][nfc] Refactor amdgpu partial barrier to simplify adding a second one

D101976 would require a second barrier instance. This NFC to amdgpu makes it
simpler to add one (an extra global, one more line in init). Also renames the
current barrier to L0.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102016

3 years ago[mlir] Update dstNode after DenseMap insertion in loop fusion pass.
Amy Zhuang [Thu, 6 May 2021 22:08:34 +0000 (15:08 -0700)]
[mlir] Update dstNode after DenseMap insertion in loop fusion pass.

Reviewed By: vinayaka-polymage

Differential Revision: https://reviews.llvm.org/D101794

3 years ago[ARM] Transforming memcpy to Tail predicated Loop
Malhar Jajoo [Thu, 6 May 2021 00:38:20 +0000 (01:38 +0100)]
[ARM] Transforming memcpy to Tail predicated Loop

This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

From an implementation point of view, the patch

- adds an ARM-specific SDAG node (to which the llvm.memcpy intrinsic is lowered during the first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
  on matching the above node.
- adds a custom inserter function that expands the pseudo instruction into MIR suitable
  to be converted (by later passes) into a WLSTP loop.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99723

3 years ago[libomptarget][amdgpu][nfc] Remove dead code from amdgpu plugin
Jon Chesterfield [Thu, 6 May 2021 22:16:30 +0000 (23:16 +0100)]
[libomptarget][amdgpu][nfc] Remove dead code from amdgpu plugin

Drops an enum that was identical to a HSA one, localises some functions where
they were only called from one TU. Covers everything internalize + adce can
identify as dead, except for msgpack::dump which is useful when debugging.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102014

3 years ago[mlir][spirv] NFC: Replace OwningSPIRVModuleRef with OwningOpRef
Lei Zhang [Thu, 6 May 2021 21:16:55 +0000 (17:16 -0400)]
[mlir][spirv] NFC: Replace OwningSPIRVModuleRef with OwningOpRef

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D102009

3 years agoWhen SendContinuePacketAndWaitForResponse returns eStateInvalid, don't fetch more...
Jim Ingham [Wed, 5 May 2021 18:34:07 +0000 (11:34 -0700)]
When SendContinuePacketAndWaitForResponse returns eStateInvalid, don't fetch more packets.

This looks like just an oversight in the AsyncThread function.  It gets a result of
eStateInvalid, and then marks the process as exited, but doesn't set "done" to true,
so we go to fetch another event.  That is not safe, since you don't know when that
extra packet is going to arrive.  If it arrives while you are tearing down the
process, the internal-state-thread might try to handle it when the process is not
in a good state.

Rather than put more effort into checking all the shutdown paths to make sure this
extra packet doesn't cause problems, just don't fetch it.  We weren't going to do
anything useful with it anyway.

The main part of the patch is setting "done = true" when we get the eStateInvalid.
I also added a check at the beginning of the while(!done) loop to prevent another error
from getting us to fetch packets for an exited process.
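
The shape of the fix can be sketched with a toy event loop. This is not the lldb code: `Reply`, `runAsyncLoop`, and the vector of replies standing in for incoming packets are assumptions made for the example, and the loop is assumed to run while `done` is false.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Reply { Stopped, Invalid };

// Hypothetical stand-in for the async loop. `replies` plays the role of
// packets arriving from the remote stub. Returns how many were consumed.
std::size_t runAsyncLoop(const std::vector<Reply> &replies, bool &exited) {
  bool done = false;
  std::size_t fetched = 0;
  exited = false;
  // The extra `!exited` check mirrors the guard added at the loop head.
  while (!done && !exited && fetched < replies.size()) {
    Reply r = replies[fetched++];
    if (r == Reply::Invalid) {
      exited = true; // mark the process as exited
      done = true;   // the fix: stop instead of fetching another packet
    }
  }
  assert(fetched <= replies.size());
  return fetched;
}
```

Any reply that follows an invalid one is left unfetched, which is the behaviour described above.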

I added a test case to ensure that if an Interrupt fails, we call the process
exited.  I can't test exactly the error I'm fixing, there's no good way to know
that the stop reply for the failed interrupt wasn't fetched.  But at least this
asserts that the overall behavior is correct.

Differential Revision: https://reviews.llvm.org/D101933

3 years agoThread safety analysis: Eliminate parameter from intersectAndWarn (NFC)
Aaron Puchert [Thu, 6 May 2021 21:07:40 +0000 (23:07 +0200)]
Thread safety analysis: Eliminate parameter from intersectAndWarn (NFC)

We were modifying precisely when intersecting the lock sets of multiple
predecessors without back edge. That's no coincidence: we can't modify
on back edges, it doesn't make sense to modify at the end of a function,
and otherwise we always want to intersect on forward edges, because we
can build a new lock set for those.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D101755

3 years ago[gn build] Port 83af66e18e3d
LLVM GN Syncbot [Thu, 6 May 2021 21:03:05 +0000 (21:03 +0000)]
[gn build] Port 83af66e18e3d

3 years agonew altera ID dependent backward branch check
Frank Derry Wanye [Thu, 6 May 2021 21:01:39 +0000 (17:01 -0400)]
new altera ID dependent backward branch check

This lint check is a part of the FLOCL (FPGA Linters for OpenCL) project
out of the Synergy Lab at Virginia Tech.

FLOCL is a set of lint checks aimed at FPGA developers who write code
in OpenCL.

The altera ID dependent backward branch lint check finds ID dependent
variables and fields used within loops, and warns of their usage. Using
these variables in loops can lead to performance degradation.

3 years ago[Index] Ignore nullptr decls for indexing
Alex Hoppen [Thu, 6 May 2021 20:11:26 +0000 (13:11 -0700)]
[Index] Ignore nullptr decls for indexing

We can end up with a call to `indexTopLevelDecl(D)` with `D == nullptr` in non-assert builds e.g. when indexing a module in `indexModule` and
- `ASTReader::GetDecl` returns `nullptr` if `Index >= DeclsLoaded.size()`, thus returning `nullptr`
=> `ModuleDeclIterator::operator*` returns `nullptr`
=> we call `IndexCtx.indexTopLevelDecl` with `nullptr`

Be resilient and just ignore the `nullptr` decls during indexing.

Reviewed By: akyrtzi

Differential Revision: https://reviews.llvm.org/D102001

3 years ago[mlir] Store the flag for dynamic operand storage in the low bits
River Riddle [Thu, 6 May 2021 19:09:16 +0000 (12:09 -0700)]
[mlir] Store the flag for dynamic operand storage in the low bits

It is currently stored in the high bits, which is disallowed on certain
platforms (e.g. android). This revision switches the representation to use
the low bits instead, fixing crashes/breakages on those platforms.
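
For readers unfamiliar with the technique, here is a minimal sketch of keeping a flag in the low bit of an aligned pointer. `FlaggedPtr` is a hypothetical class written for this illustration; it is not MLIR's operand storage.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tagged pointer, not MLIR's operand storage class.
template <typename T>
class FlaggedPtr {
  static_assert(alignof(T) >= 2, "need at least one spare low bit");
  std::uintptr_t bits = 0;

public:
  FlaggedPtr(T *ptr, bool flag) {
    auto raw = reinterpret_cast<std::uintptr_t>(ptr);
    assert((raw & 1) == 0 && "pointer must be at least 2-byte aligned");
    bits = raw | static_cast<std::uintptr_t>(flag ? 1 : 0);
  }

  // Mask off the flag bit to recover the original pointer.
  T *pointer() const {
    return reinterpret_cast<T *>(bits & ~static_cast<std::uintptr_t>(1));
  }

  bool flag() const { return (bits & 1) != 0; }
};
```

Alignment guarantees that the low bit of a valid pointer is zero, so it is free to carry the flag; the high bits carry no such guarantee on every platform.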

Differential Revision: https://reviews.llvm.org/D101969

3 years ago[PassManager] add helper function to hold set of vector passes
Sanjay Patel [Thu, 6 May 2021 19:33:59 +0000 (15:33 -0400)]
[PassManager] add helper function to hold set of vector passes

This is no-functional-change-intended (NFC) and split off from
D102002 (which proposes to eliminate the LTO-based differences).

3 years ago[NPM] Do not run function simplification pipeline unnecessarily
Mircea Trofin [Mon, 1 Mar 2021 20:19:20 +0000 (12:19 -0800)]
[NPM] Do not run function simplification pipeline unnecessarily

The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].

This patch allows the function simplification pipeline to be skipped if it was already run and the function was not modified since.

The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.

[1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions | compile time tracker ]] run with the option enabled.

Differential Revision: https://reviews.llvm.org/D98103

3 years ago[RISCV] Remove unused ComplexPatterns. NFC
Craig Topper [Thu, 6 May 2021 19:17:37 +0000 (12:17 -0700)]
[RISCV] Remove unused ComplexPatterns. NFC

3 years ago[flang][OpenMP] Add semantic check for occurrence of constructs nested inside a SIMD...
Arnamoy Bhattacharyya [Thu, 6 May 2021 18:00:34 +0000 (14:00 -0400)]
[flang][OpenMP] Add semantic check for occurrence of constructs nested inside a SIMD region

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D99757

3 years ago[Fuchsia][CMake] Update OSX deployment target
Petr Hosek [Thu, 6 May 2021 18:56:32 +0000 (11:56 -0700)]
[Fuchsia][CMake] Update OSX deployment target

Use correct spelling of CMAKE_OSX_DEPLOYMENT_TARGET and bump the
minimum version to 10.13 which matches what we use for host tools
in Fuchsia.

Differential Revision: https://reviews.llvm.org/D102013

3 years ago[libunwind] NFC: Use macros to accommodate differences in representation of PowerPC...
Xing Xue [Thu, 6 May 2021 18:33:38 +0000 (14:33 -0400)]
[libunwind] NFC: Use macros to accommodate differences in representation of PowerPC assemblers

Summary:
This NFC patch replaces the representation of registers and the left shift operator in the PowerPC assembly code to allow it to be consumed by the GNU flavored assembler and the AIX assembler.

* Registers - change the representation of PowerPC registers from %rn, %fn, %vsn, and %vrn to the register number alone, e.g., n. The GNU flavored assembler and the AIX assembler are able to determine the register kind based on the context of the instruction in which the register is used.

* Left shift operator - use macro PPC_LEFT_SHIFT to represent the left shift operator. The left shift operator in the AIX assembly language is < instead of <<.

Reviewed by: sfertile, MaskRay, compnerd

Differential Revision: https://reviews.llvm.org/D101179

3 years ago[RISCV] Minor vector instruction tablegen cleanup. NFC
Craig Topper [Thu, 6 May 2021 18:21:46 +0000 (11:21 -0700)]
[RISCV] Minor vector instruction tablegen cleanup. NFC

Use result_type for the IMPLICIT_DEF in masked vector patterns.
This doesn't matter today because result_type and op_type are
always the same.

Use multiclass inheritance to reduce repeated code.

3 years ago[flang] Implement NAMELIST I/O in the runtime
peter klausler [Wed, 5 May 2021 18:37:49 +0000 (11:37 -0700)]
[flang] Implement NAMELIST I/O in the runtime

Add InputNamelist and OutputNamelist as I/O data transfer APIs
to be used with internal & external list-directed I/O; delete the
needless original namelist-specific Begin... calls.
Implement NAMELIST output and input; add basic tests.

Differential Revision: https://reviews.llvm.org/D101931

3 years ago[AArch64] Fix namespace issue. NFC
Fangrui Song [Thu, 6 May 2021 18:16:07 +0000 (11:16 -0700)]
[AArch64] Fix namespace issue. NFC

3 years ago[flang] Fix race condition in runtime
peter klausler [Wed, 5 May 2021 18:26:12 +0000 (11:26 -0700)]
[flang] Fix race condition in runtime

The code that initializes the default units 5 & 6 had
a race condition that would allow threads access to the
unit map before it had been populated.

Also add some missing calls to va_end() that will never
be called (they're in program abort situations) but might
elicit warnings if absent.

Differential Revision: https://reviews.llvm.org/D101928

3 years agoAllow llvm-dis to disassemble multiple files
Matthew Voss [Thu, 22 Apr 2021 21:07:45 +0000 (14:07 -0700)]
Allow llvm-dis to disassemble multiple files

Differential Revision: https://reviews.llvm.org/D101110

3 years ago[flang] Runtime must defer formatted/unformatted determination
peter klausler [Wed, 5 May 2021 18:33:00 +0000 (11:33 -0700)]
[flang] Runtime must defer formatted/unformatted determination

What the Fortran standard calls "preconnected" external I/O units
might not be known to be connected to unformatted or formatted files
until the first I/O data transfer statement is executed.
Support this deferred determination by representing the flag as
a tri-state Boolean and adapting its points of use.
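
One way to picture a tri-state Boolean is `std::optional<bool>`, where the empty state means "not yet determined". The sketch below is an assumption-laden illustration: `Unit`, `isUnformatted`, and `beginTransfer` are invented names, and treating a later mismatched transfer as an error is a choice made only for the example.

```cpp
#include <cassert>
#include <optional>

// Hypothetical unit state, not the flang runtime's type.
struct Unit {
  std::optional<bool> isUnformatted; // empty = not yet determined

  // Called by each data transfer. The first call decides the mode;
  // returns false if a later transfer disagrees with it.
  bool beginTransfer(bool unformatted) {
    if (!isUnformatted.has_value())
      isUnformatted = unformatted; // deferred determination happens here
    assert(isUnformatted.has_value());
    return *isUnformatted == unformatted;
  }
};
```

Every point of use must handle the undetermined state explicitly.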

Differential Revision: https://reviews.llvm.org/D101929

3 years ago[gn build] Support compiler-rt/profile on Windows
Arthur Eubanks [Wed, 5 May 2021 23:35:14 +0000 (16:35 -0700)]
[gn build] Support compiler-rt/profile on Windows

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D101961

3 years ago[mlir][vector] Fix typo
thomasraoux [Thu, 6 May 2021 17:12:31 +0000 (10:12 -0700)]
[mlir][vector] Fix typo

3 years ago[mlir][linalg][NFC] Make reshape folding control more fine grain
thomasraoux [Thu, 6 May 2021 14:28:09 +0000 (07:28 -0700)]
[mlir][linalg][NFC] Make reshape folding control more fine grain

This exposes a lambda control instead of just a boolean to control unit
dimension folding.
This gives users more control to pick a good heuristic.
Folding reshapes helps fusion opportunities but may generate sub-optimal
generic ops.

Differential Revision: https://reviews.llvm.org/D101917

3 years ago[WebAssembly] Fix argument types in SIMD narrowing intrinsics
Thomas Lively [Thu, 6 May 2021 17:07:44 +0000 (10:07 -0700)]
[WebAssembly] Fix argument types in SIMD narrowing intrinsics

The builtins were updated to take signed parameters in 627a52695537, but the
intrinsics that use those builtins were not updated as well. The intrinsic test
did not catch this sign mismatch because it is only reported as an error under
-fno-lax-vector-conversions.

This commit fixes the type mismatch and adds -fno-lax-vector-conversions to the
test to catch similar problems in the future.

Differential Revision: https://reviews.llvm.org/D101979

3 years ago[PowerPC][LLD] Make sure that the correct Thunks are used.
Stefan Pintilie [Tue, 4 May 2021 14:35:43 +0000 (09:35 -0500)]
[PowerPC][LLD] Make sure that the correct Thunks are used.

This fixes an issue where mixed TOC / NOTOC calls can call the incorrect
thunks if a previous thunk already exists. The issue appears when a TOC
function calls a NOTOC callee and then a different NOTOC function calls the same
NOTOC callee. In this case the linker would sometimes incorrectly call the
same thunk for both cases.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101837
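
One way to picture the bug above is a thunk cache keyed only by callee, so that the second caller reuses a thunk built for a different calling convention. The Python sketch below keys the cache on both the callee and the caller's TOC usage. `ThunkCache`, `get_thunk`, and the thunk names are hypothetical and do not reflect LLD's actual data structures.

```python
class ThunkCache:
    """Hypothetical thunk table; not LLD's actual data structure."""

    def __init__(self):
        self._thunks = {}

    def get_thunk(self, callee: str, caller_uses_toc: bool) -> str:
        # Keying on the callee alone would hand a NOTOC caller the thunk that
        # was created earlier for a TOC caller; include the caller kind too.
        key = (callee, caller_uses_toc)
        if key not in self._thunks:
            kind = "toc" if caller_uses_toc else "notoc"
            self._thunks[key] = f"__thunk_{kind}_to_{callee}"
        return self._thunks[key]
```

With this key, callers of the same kind still share a thunk, while a TOC caller and a NOTOC caller of the same callee get distinct ones.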

3 years ago[RISCV] Remove unused RISCV::VLEFF and VLEFF_MASK. NFC
Craig Topper [Thu, 6 May 2021 16:40:49 +0000 (09:40 -0700)]
[RISCV] Remove unused RISCV::VLEFF and VLEFF_MASK. NFC

Looks like these got left behind when vleff isel was moved to
RISCVISelDAGToDAG.cpp

3 years ago[AIX][Test][ORC] Skip unsupported ORC C API tests on AIX
Hubert Tong [Thu, 6 May 2021 14:15:30 +0000 (10:15 -0400)]
[AIX][Test][ORC] Skip unsupported ORC C API tests on AIX

As mentioned before in D78813, currently the XCOFF backend does not
support writing 64-bit object files, which the ORC JIT tests will try to
exercise if we are on AIX. This patch disables the tests on AIX for now.
This is consistent with what's been done, for example, regarding
`armv7`.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D101971

3 years agoFix array attribute in bindings for linalg.init_tensor
Denys Shabalin [Thu, 6 May 2021 16:24:07 +0000 (18:24 +0200)]
Fix array attribute in bindings for linalg.init_tensor

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D101998

3 years ago[SystemZ] Don't use libcall for 128 bit shifts.
Jonas Paulsson [Thu, 6 May 2021 13:22:21 +0000 (15:22 +0200)]
[SystemZ] Don't use libcall for 128 bit shifts.

Expand 128 bit shifts instead of using a libcall.

This patch removes the 128 bit shift libcalls and thereby causes
ExpandShiftWithUnknownAmountBit() to be called.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D101993
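
Expanding a 128-bit shift means computing it from operations on the two 64-bit halves instead of calling a runtime library routine. The Python sketch below shows the general idea for a left shift with an amount in the range 0 to 127. It is not the code that ExpandShiftWithUnknownAmountBit() emits, and `shl128` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1


def shl128(lo: int, hi: int, amt: int):
    """Shift the 128-bit value hi:lo left by amt (0..127) using 64-bit halves."""
    if amt >= 64:
        # The whole low half moves into the high half; the low half becomes 0.
        return 0, (lo << (amt - 64)) & MASK64
    if amt == 0:
        # Handled separately so the (64 - amt) shift below stays below 64.
        return lo, hi
    new_lo = (lo << amt) & MASK64
    # The bits shifted out of the low half are carried into the high half.
    new_hi = ((hi << amt) & MASK64) | (lo >> (64 - amt))
    return new_lo, new_hi
```

The function returns the pair `(new_lo, new_hi)`. The branch on `amt >= 64` is what makes the expansion work when the shift amount is not known at compile time.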

3 years ago[RISCV] Cleanup instruction formats used for B extension ternary operations.
Craig Topper [Thu, 6 May 2021 15:58:58 +0000 (08:58 -0700)]
[RISCV] Cleanup instruction formats used for B extension ternary operations.

Rename RVInstR4 as used by F/D/Zfh extensions to RVInstR4Frm.
Introduce new RVInstR4 that takes funct3 as a parameter.

Add new format classes for FSRI and FSRIW instead of trying to
bend RVInstR4 to use a shamt overlaid on rs2 and funct2.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100427

3 years ago[LangRef][VP] Fix typos in VP sdiv/udiv examples
Fraser Cormack [Thu, 6 May 2021 15:37:04 +0000 (16:37 +0100)]
[LangRef][VP] Fix typos in VP sdiv/udiv examples

3 years ago[clangd][ObjC] Highlight Objc Ivar refs
David Goldman [Mon, 3 May 2021 20:18:57 +0000 (16:18 -0400)]
[clangd][ObjC] Highlight Objc Ivar refs

Treat them just like we do for properties - as a `property` semantic
token although ideally we could differentiate the two.

Differential Revision: https://reviews.llvm.org/D101785

3 years ago[AMDGPU] Fix 64 bit DPP validation
Stanislav Mekhanoshin [Wed, 5 May 2021 18:26:07 +0000 (11:26 -0700)]
[AMDGPU] Fix 64 bit DPP validation

AMDGPUAsmParser::isSupportedDPPCtrl() was failing to correctly
find the DPP register operand; regardless of position, it is
always src0. Moved this check into a new validateDPP() method
where we have full instruction already. In particular it was
failing to reject this case:

v_cvt_u32_f64 v5, v[0:1] quad_perm:[0,2,1,1] row_mask:0xf bank_mask:0xf

Essentially it was broken for any case where size of dst and
src0 differ.

It also improves the diagnostics with a proper error message.

The check in the InstPrinter also drops verification of the dst
register as it does not have anything to do with the dpp operand.

Differential Revision: https://reviews.llvm.org/D101930

3 years ago[SLP] Constify the TreeEntry* input into getEntryCost() + setInsertPointAfterBundle...
Simon Pilgrim [Thu, 6 May 2021 15:19:36 +0000 (16:19 +0100)]
[SLP] Constify the TreeEntry* input into getEntryCost() + setInsertPointAfterBundle(). NFCI.

3 years ago[SLP] Constify the TreeEntry* input into dumpTreeCosts(). NFCI.
Simon Pilgrim [Thu, 6 May 2021 15:07:16 +0000 (16:07 +0100)]
[SLP] Constify the TreeEntry* input into dumpTreeCosts(). NFCI.

3 years ago[SLP] Use empty() instead of size() == 0. NFCI.
Simon Pilgrim [Thu, 6 May 2021 15:00:44 +0000 (16:00 +0100)]
[SLP] Use empty() instead of size() == 0. NFCI.

3 years ago[lld-macho] Support loading of zippered dylibs
Jez Ng [Thu, 6 May 2021 15:18:19 +0000 (11:18 -0400)]
[lld-macho] Support loading of zippered dylibs

ld64 can emit dylibs that support more than one platform (typically macOS and
macCatalyst). This diff allows LLD to read in those dylibs. Note that this is a
super bare-bones implementation -- in particular, I haven't added support for
LLD to emit those multi-platform dylibs, nor have I added a variety of
validation checks that ld64 does. Until we have a use-case for emitting zippered
dylibs, I think this is good enough.

Fixes PR49597.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D101954

3 years ago[lld-macho][nfc] Convert the mock libSystem.tbd to TBDv4
Jez Ng [Wed, 5 May 2021 22:30:23 +0000 (18:30 -0400)]
[lld-macho][nfc] Convert the mock libSystem.tbd to TBDv4

It doesn't seem like TBDv3 allows for specifying multiple platforms, so I'm
upgrading us to TBDv4. (We need to support multiple platforms in order to test
that we can handle zippered dylibs; that functionality will be added in an
upcoming diff.)

Differential Revision: https://reviews.llvm.org/D101953

3 years ago[mlir][NFC] Fix warning in VectorTransforms.cpp
thomasraoux [Thu, 6 May 2021 15:11:42 +0000 (08:11 -0700)]
[mlir][NFC] Fix warning in VectorTransforms.cpp

3 years ago[mlir][vector] add pattern to cast away lead unit dimension for broadcast op
thomasraoux [Wed, 5 May 2021 23:03:22 +0000 (16:03 -0700)]
[mlir][vector] add pattern to cast away lead unit dimension for broadcast op

Differential Revision: https://reviews.llvm.org/D101955

3 years ago[PowerPC] Re-commit ed87f512bb9eb5c1d44e9a1182ffeaf23d6c5ae8
Nemanja Ivanovic [Thu, 6 May 2021 14:44:07 +0000 (09:44 -0500)]
[PowerPC] Re-commit ed87f512bb9eb5c1d44e9a1182ffeaf23d6c5ae8

This was reverted in 3761b9a2345aff197707d23a68d4a178489f60e4 just
as I was about to commit the fix. This patch includes the
necessary fix.

3 years ago[AMDGPU][NFC] Fix typos in SIFormMemoryClauses description
Austin Kerbow [Thu, 6 May 2021 14:43:11 +0000 (07:43 -0700)]
[AMDGPU][NFC] Fix typos in SIFormMemoryClauses description

NFC.

3 years ago[OpenMP] Temporarily require X86 target for parallel_for_codegen.cpp test
David Spickett [Thu, 6 May 2021 14:13:19 +0000 (14:13 +0000)]
[OpenMP] Temporarily require X86 target for parallel_for_codegen.cpp test

Since https://reviews.llvm.org/D101849 this test has been failing
on bots that only enable either Arm or AArch64 targets.

See: https://lab.llvm.org/buildbot/#/builders/107/builds/7601

Temporarily requires X86 for this test while the difference is figured out.

3 years ago[libc++] Rewrite std::to_address to avoid relying on element_type
Louis Dionne [Tue, 4 May 2021 22:51:58 +0000 (18:51 -0400)]
[libc++] Rewrite std::to_address to avoid relying on element_type

This is a rough reapplication of the change that fixed std::to_address
to avoid relying on element_type (da456167). It is somewhat different
because the fix to avoid breaking Clang (which caused it to be reverted
in 347f69c55) was a bit more involved.

Differential Revision: https://reviews.llvm.org/D101638

3 years ago[AIX][TLS] Add support for TLSGD relocations to XCOFF objects
Victor Huang [Thu, 6 May 2021 13:37:09 +0000 (08:37 -0500)]
[AIX][TLS] Add support for TLSGD relocations to XCOFF objects

- Add branch absolute relocation R_RBA, R_TLS relocation for the variable offset
  for the tlsgd model and R_TLSM for the region handle for the tlsgd model
- Properly set the relocation fixed values for R_TLS and R_TLSM
- Emit the TCEntry with the variant kind in the XCOFFStreamer

Reviewed by: sfertile, nemanjai, DiggerLin

Differential Revision: https://reviews.llvm.org/D100214

3 years agoRevert "[PowerPC] Provide some P8-specific altivec overloads for P7"
Nico Weber [Thu, 6 May 2021 14:00:39 +0000 (10:00 -0400)]
Revert "[PowerPC] Provide some P8-specific altivec overloads for P7"

This reverts commit ed87f512bb9eb5c1d44e9a1182ffeaf23d6c5ae8.
Breaks check-clang, see e.g.
https://lab.llvm.org/buildbot/#/builders/139/builds/3818

3 years ago[lldb][NFC] Make assert in TestStaticVariables more expressive
Raphael Isemann [Thu, 6 May 2021 14:00:24 +0000 (16:00 +0200)]
[lldb][NFC] Make assert in TestStaticVariables more expressive

3 years ago[AMDGPU] SIInsertHardClauses: move more stuff into the class. NFC.
Jay Foad [Thu, 6 May 2021 13:47:43 +0000 (14:47 +0100)]
[AMDGPU] SIInsertHardClauses: move more stuff into the class. NFC.

3 years ago[PowerPC] Provide some P8-specific altivec overloads for P7
Nemanja Ivanovic [Thu, 6 May 2021 11:54:52 +0000 (06:54 -0500)]
[PowerPC] Provide some P8-specific altivec overloads for P7

This adds additional support for XL compatibility. There are a number
of functions in altivec.h that produce a single instruction (or a
very short sequence) for Power8 but can be done on Power7 without
scalarization. XL provides these implementations.
This patch adds the following overloads for doubleword vectors:
vec_add
vec_cmpeq
vec_cmpgt
vec_cmpge
vec_cmplt
vec_cmple
vec_sl
vec_sr
vec_sra

3 years ago[TableGen] [Clang] Clean up Options.td and add asserts.
Paul C. Anagnostopoulos [Wed, 28 Apr 2021 23:55:12 +0000 (19:55 -0400)]
[TableGen] [Clang] Clean up Options.td and add asserts.

Differential Revision: https://reviews.llvm.org/D101766

3 years ago[OpenCL] Remove subgroups pragma in enqueue kernel and pipe builtins.
Anastasia Stulova [Thu, 6 May 2021 11:48:46 +0000 (12:48 +0100)]
[OpenCL] Remove subgroups pragma in enqueue kernel and pipe builtins.

This patch simplifies the parser and makes the language semantics
consistent. There is no extension pragma requirement in the spec
for the subgroup functions in enqueue kernel or pipes and all other
builtin functions are available without the pragma.

Differential Revision: https://reviews.llvm.org/D100984

3 years ago[amdgpu-arch] Fix rpath to run from build dir
Jon Chesterfield [Thu, 6 May 2021 12:06:59 +0000 (13:06 +0100)]
[amdgpu-arch] Fix rpath to run from build dir

Prior to this, amdgpu-arch has RUNPATH set to $ORIGIN/../lib which works
for some installs, but not from the build directory where clang executes
the tool from when running tests.

This cmake option adds the location of the rocr runtime to the RUNPATH
(note, it amends RUNPATH here, despite the cmake option referring to RPATH)
to create a binary that runs from build or install location.

Before:
RUNPATH [$ORIGIN/../lib]
After:
RUNPATH [$ORIGIN/../lib:$HOME/llvm-install/lib]

Credit to Greg for knowing this trick and pointing to examples of it in use
for the aomp build scripts.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D101926
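
To see why `$ORIGIN/../lib` alone fails from the build directory, it helps to expand the RUNPATH by hand. The Python sketch below substitutes `$ORIGIN` with the directory that holds the binary. It is a simplification of what the dynamic loader does (no symlink resolution, only `$ORIGIN` is handled), and the function name and the example paths in the usage note are hypothetical.

```python
import posixpath


def expand_runpath(runpath: str, exe_path: str) -> list:
    """Split a colon-separated RUNPATH and replace $ORIGIN with the binary's directory."""
    origin = posixpath.dirname(exe_path)
    return [
        # normpath collapses the "bin/.." component for readability only.
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in runpath.split(":")
        if entry
    ]
```

For a binary at the assumed path `/build/bin/amdgpu-arch`, the entry `$ORIGIN/../lib` expands to `/build/lib`, which does not contain the rocr runtime in a build tree. Appending the runtime's own directory as a second entry gives the loader another place to look.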

3 years ago[AMDGPU] Fix WQM failure with single block inactive demote
Carl Ritson [Thu, 6 May 2021 11:27:03 +0000 (20:27 +0900)]
[AMDGPU] Fix WQM failure with single block inactive demote

Instruction test for inactive kill/demote needs to be based on
actual opcode not whether instruction would be lowered to demote.

Reviewed By: piotr

Differential Revision: https://reviews.llvm.org/D101966