review.tizen.org Git - platform/upstream/llvm.git/log

[NFC] Use getZero instead of getConstant(0)

[lldb] Format unix signal table (NFC)

Restore unix signal table to its original glory and mark it as not to be
clang-formatted.

[SROA] rewritePartition()/findCommonType(): if uses have conflicting type, try getTypePartition() before falling back to largest integral use type (PR47592)

And another step towards transformss not introducing inttoptr and/or
ptrtoint casts that weren't there already.

In this case, when load/store uses have conflicting types,
instead of falling back to the iN, we can try to use allocated sub-type.
As disscussed, this isn't the best idea overall (we shouldn't rely on
allocated type), but it works fine as a temporary measure.

I've measured, and @ `-O3` as of vanilla llvm test-suite + RawSpeed,
this results in +0.05% more bitcasts, -5.51% less inttoptr
and -1.05% less ptrtoint (at the end of middle-end opt pipeline)

See https://bugs.llvm.org/show_bug.cgi?id=47592

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D88788

BPF: avoid duplicated globals for CORE relocations

This patch fixed two issues related with relocation globals.
In LLVM, if a global, e.g. with name "g", is created and
conflict with another global with the same name, LLVM will
rename the global, e.g., with a new name "g.2". Since
relocation global name has special meaning, we do not want
llvm to change it, so internally we have logic to check
whether duplication happens or not. If happens, just reuse
the previous global.

The first bug is related to non-btf-id relocation
(BPFAbstractMemberAccess.cpp). Commit 54d9f743c8b0
("BPF: move AbstractMemberAccess and PreserveDIType passes
to EP_EarlyAsPossible") changed ModulePass to FunctionPass,
i.e., handling each function at a time. But still just
one BPFAbstractMemberAccess object is created so module
level de-duplication still possible. Commit 40251fee0084
("[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer
pipeline") made a change to create a BPFAbstractMemberAccess
object per function so module level de-duplication is not
possible any more without going through all module globals.
This patch simply changed the map which holds reloc globals
as class static, so it will be available to all
BPFAbstractMemberAccess objects for different functions.

The second bug is related to btf-id relocation
(BPFPreserveDIType.cpp). Before Commit 54d9f743c8b0, the pass
is a ModulePass, so we have a local variable, incremented for
each instance, and works fine. But after Commit 54d9f743c8b0,
the pass becomes a FunctionPass. Local variable won't work
properly since different functions will start with the same
initial value. Fix the issue by change the local count variable
as static, so it will be truely unique across the whole module
compilation.

Differential Revision: https://reviews.llvm.org/D88942

[Test] Add test showing that we can avoid inserting trunc/zext

Reapply "[OpenMP][FIX] Verify compatible types for declare variant calls" D88384

This reapplies D88384 with the minor modification that an assertion was
changed to a regular conditional and graceful exit from
ASTContext::mergeTypes.

[MachineInstr] exclude call instruction in mayAlias

we now get noAlias result for a call instruction and other
load/store/call instructions if we query mayAlias.
This is not right as call instruction is not with mayloadorstore,
but it may alter the memory.

This patch fixes this wrong alias query.

Differential Revision: https://reviews.llvm.org/D87490

[PowerPC] implement target hook getTgtMemIntrinsic

This patch can make pass recognize Powerpc related memory intrinsics.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D88373

[PowerPC] add more builtins for PPCTargetLowering::getTgtMemIntrinsic

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D88374

[CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR

Tail duplication of a block with an INLINEASM_BR may result in a PHI
node on the indirect branch. This is okay, but it also introduces a copy
for that PHI node *after* the INLINEASM_BR, which is not okay.

See: https://github.com/ClangBuiltLinux/linux/issues/1125

Differential Revision: https://reviews.llvm.org/D88823

[flang][openacc] Fix device_num and device_type clauses for init directive

This patch fix the device_num and device_type clauses used in the init clause. device_num was not
spelled correctly in the parser and was to restrictive with scalarIntConstantExpr instead of scalarIntExpr.
device_type is now taking a list of ScalarIntExpr.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88571

[Attributor] Use smarter way to determine alignment of GEPs

Use same logic existing in other places to deal with base case GEPs.

Add the original Attributor talk example.

[Attributor] Ignore read accesses to constant memory

The old function attribute deduction pass ignores reads of constant
memory and we need to copy this behavior to replace the pass completely.
First step are constant globals. TBAA can also describe constant
accesses and there are other possibilities. We might want to consider
asking the alias analyses that are available but for now this is simpler
and cheaper.

[Attributor] Give up early on AANoReturn::initialize

If the function is not assumed `noreturn` we should not wait for an
update to mark the call site as "may-return".

This has two kinds of consequences:
  - We have less iterations in many tests.
  - We have less deductions based on "known information" (since we ask
    earlier, point 1, and therefore assumed information is not "known"
    yet).
The latter is an artifact that we might want to tackle properly at some
point but which is not easily fixable right now.

[lldb] Change the xcrun (fallback) logic in GetXcodeSDK

This changes the logic in GetXcodeSDK to find an SDK with xcrun. The
code now executes the following steps:

1. If DEVELOPER_DIR is set in the environment, it invokes xcrun with
    the given developer dir. If this fails we stop and don't fall back.
2. If the shlib dir is set and exists,it invokes xcrun with the
    developer dir corresponding to the shlib dir. If this fails we fall
    back to 3.
3. We run xcrun without a developer dir.

The new behavior introduced in this patch is that we fall back to
running xcrun without a developer dir if running it based on the shlib
dir failed.

A situation where this matters is when you're running lldb from an Xcode
that has no SDKs and that is not xcode-selected. Based on lldb's shlib
dir pointing into this Xcode installation, it will do an xcrun with the
developer set to the Xcode without any SDKs which will fail. With this
patch, when that happens, we'll fall back to trying the xcode-selected
Xcode by running xcrun without a developer dir.

Differential revision: https://reviews.llvm.org/D88866

[gn build] manually port 5e4409f308177

Relax FuseTensorReshapeOpAsproducer identity mapping constraint

Differential Revision: https://reviews.llvm.org/D88869

Fix out-of-tree clang build due to sysexits change

The sysexists change broke clang building out of tree against llvm.

https://reviews.llvm.org/D88467

[RuntimeDyld][COFF] Report fatal error on error, rather than emiting diagnostic.

Report a fatal error if an IMAGE_REL_AMD64_ADDR32NB cannot be applied due to an
out-of-range target. Previously we emitted a diagnostic to llvm::errs and
continued.

Patch by Dale Martin. Thanks Dale!

docs: Emphasize ArrayRef over SmallVectorImpl

The section on SmallVector has a note about preferring SmallVectorImpl
for APIs but doesn't mention ArrayRef. Although ArrayRef is discussed
elsewhere, let's re-emphasize here.

Differential Revision: https://reviews.llvm.org/D49881

Replace shadow space zero-out by madvise at mmap

After D88686, munmap uses MADV_DONTNEED to ensure zero-out before the
next access. Because the entire shadow space is created by MAP_PRIVATE
and MAP_ANONYMOUS, the first access is also on zero-filled values.

So it is fine to not zero-out data, but use madvise(MADV_DONTNEED) at
mmap. This reduces runtime
overhead.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D88755

[CMake] Track TSan's dependency on C++ headers

TSan relies on C++ headers, so when libc++ is being built as part of
the runtimes build, include an explicit dependency on cxx-headers which
is the same approach that's already used for other sanitizers.

Differential Revision: https://reviews.llvm.org/D88912

[libc++] Add assert to check bounds in `constexpr string_view::operator[]`

Differential Revision: https://reviews.llvm.org/D88864

[mlir] [sparse] convenience runtime support to read Matrix Market format

Setting up input data for benchmarks and integration tests can be tedious in
pure MLIR. With more sparse tensor work planned, this convenience library
simplifies reading sparse matrices in the popular Matrix Market Exchange
Format (see https://math.nist.gov/MatrixMarket). Note that this library
is *not* part of core MLIR. It is merely intended as a convenience library
for benchmarking and integration testing.

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D88856

Remove unneeded "allow-unregistered-dialect" from shape-type-conversion.mlir test (NFC)

Revert [lit] Support running tests on Windows without GnuWin32

This reverts b3418cb4eb1456c41606f4621dcfa362fe54183c and d12ae042e17b27ebc8d2b5ae3d8dd5f88384d093

This breaks some external bots, see discussion in https://reviews.llvm.org/D84380

In the meanwhile, please use `cmake -DLLVM_LIT_TOOLS_DIR="C:/Program Files/Git/usr/bin"` or add it to %PATH%.

[libc++] Add a script to setup CI on macOS nodes

[c++17] Implement P0145R3 during constant evaluation.

Ensure that we evaluate assignment and compound-assignment
right-to-left, and array subscripting left-to-right.

Fixes PR47724.

This is a re-commit of ded79be, reverted in 37c74df, with a fix and test
for the crasher bug previously introduced.

[NFC][MC] Type uses of MCRegUnitIterator as MCRegister

This is one of many subsequent similar changes. Note that we're ok with
the parameter being typed as MCPhysReg, as MCPhysReg -> MCRegister is a
correct conversion; Register -> MCRegister assumes the former is indeed
physical, so we stop relying on the implicit conversion and use the
explicit, value-asserting asMCReg().

Differential Revision: https://reviews.llvm.org/D88862

[mlir][spirv] Add Vector to SPIR-V conversion pass

Add conversion pass for Vector dialect to SPIR-V dialect and add some simple
conversion pattern for vector.broadcast, vector.insert, vector.extract.

Differential Revision: https://reviews.llvm.org/D88761

[AMDGPU] Fix remaining kernel descriptor test

Follow up on e4a9e4ef554a to fix a test I missed in the original patch.
Committed as obvious.

[NFC][flang] Add the header file Todo.h. This file is being upstreamed to satisfy dependencies and enable continued progress on lowering of OpenMP, OpenACC, etc.

Differential Revision: https://reviews.llvm.org/D88909

[SystemZ][z/OS] Set default alignment rules for z/OS target

Update RUN line to fix lit failure

Differential Revision: https://reviews.llvm.org/D88845

[mlir][Linalg] Implement tiling on tensors

This revision implements tiling on tensors as described in:
https://llvm.discourse.group/t/an-update-on-linalg-on-tensors/1878/4

Differential revision: https://reviews.llvm.org/D88733

[mlir][spirv] Fix extended insts deserialization generation

This change replaces container used for storing temporary
strings for generated code to std::list.
SmallVector may reallocate internal data, which will invalidate
references when more than one extended instruction set is
generated.

Reviewed By: mravishankar, antiagainst

Differential Revision: https://reviews.llvm.org/D88626

[AMDGPU] Emit correct kernel descriptor on big-endian hosts

Previously we wrote multi-byte values out as-is from host memory. Use
the `emitIntN` helpers in `MCStreamer` to produce a valid descriptor
irrespective of the host endianness.

Reviewed By: arsenm, rochauha

Differential Revision: https://reviews.llvm.org/D88858

[mlir][vector] Fold extractOp coming from broadcastOp

Combine ExtractOp with scalar result with BroadcastOp source. This is useful to
be able to incrementally convert degenerated vector of one element into scalar.

Differential Revision: https://reviews.llvm.org/D88751

[AMDGPU] Create isGFX9Plus utility function

Introduce a utility function to make it more
convenient to write code that is the same on
the GFX9 and GFX10 subtargets.

Use isGFX9Plus in the AsmParser for AMDGPU.

Authored By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D88908

[SystemZ][z/OS] Set default alignment rules for z/OS target

Set the default alignment control variables for z/OS target and add test case for alignment rules on z/OS.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D88845

[X86][SSE] combineX86ShuffleChain add 'CanonicalizeShuffleInput' helper. NFCI.

As part of PR45974, we're getting closer to not creating 'padded' vectors on-the-fly in combineX86ShufflesRecursively, and only pad the source inputs if we have a definite match inside combineX86ShuffleChain.

At the moment combineX86ShuffleChain just has to bitcast an input to the correct shuffle type, but eventually we'll need to pad them as well. So, move the bitcast into a 'CanonicalizeShuffleInput helper for now, making the diff for future padding support a lot smaller.

[AMDGPU] Remove SIInstrInfo::calculateLDSSpillAddress

This function does not seem to be used anymore.

Differential Revision: https://reviews.llvm.org/D88904

[MemCpyOpt] Use dereferenceable pointer helper

The call slot optimization has some home-grown code for checking
whether the destination is dereferenceable. Replace this with the
generic isDereferenceableAndAlignedPointer() helper.

I'm not checking alignment here, because that is currently handled
separately and may be an enforced alignment for allocas. The clean
way of integrating that part would probably be to accept a callback
in isDereferenceableAndAlignedPointer() for the actual isAligned check,
which would then have a chance to use an enforced alignment instead.

This allows the destination to be a GEP (among other things), though
the two open TODOs may prevent it from working in practice.

Differential Revision: https://reviews.llvm.org/D88805

[MemCpyOpt] Check for throwing calls during call slot optimization

When performing call slot optimization for a non-local destination,
we need to check whether there may be throwing calls between the
call and the copy. Otherwise, the early write to the destination
may be observable by the caller.

This was already done for call slot optimization of load/store,
but not for memcpys. For the sake of clarity, I'm moving this check
into the common optimization function, even if that does need an
additional instruction scan for the load/store case.

As efriedma pointed out, this check is not sufficient due to
potential accesses from another thread. This case is left as a TODO.

Differential Revision: https://reviews.llvm.org/D88799

[MemCpyOpt] Add separate statistic for call slot optimization (NFC)

[libc++] Check _LIBCPP_USE_CLOCK_GETTIME before using clock_gettime

The clock_gettime function is available when _POSIX_TIMERS is defined.
We check for this and set _LIBCPP_USE_CLOCK_GETTIME accordingly since
59b3102739c. But check for _LIBCPP_USE_CLOCK_GETTIME was removed in
babd3aefc91. As a result, code is now trying to use clock_gettime even
on platforms where it is not available and it is causing build failure
with newlib.

This patch restores the checks to fix this.

Differential Revision: https://reviews.llvm.org/D88825

[flang] Track CHARACTER length better in TypeAndShape

CHARACTER length expressions were not always being
captured or computed as part of procedure "characteristics",
leading to test failures due to an inability to compute
memory size expressions accurately.

Differential revision: https://reviews.llvm.org/D88689

[APIntTest] Extend extractBits to check 'lshr+trunc' pattern for each case as well.

Noticed while triaging PR47731 that we don't have great coverage for such patterns.

[libc++] Allow retries in two flaky tests

[X86] .code16: temporarily set Mode32Bit when matching an instruction with the data32 prefix

PR47632

This allows MC to match `data32 ...` as one instruction instead of two (data32 without insn + insn).

The compatibility with GNU as improves: `data32 ljmp` will be matched as ljmpl.
`data32 lgdt 4(%eax)` will be matched as `lgdtl` (prefixes: 0x67 0x66, instead
of 0x66 0x67).

GNU as supports many other `data32 *w` as `*l`. We currently just hard code
`data32 callw` and `data32 ljmpw`. Generalizing the suffix replacement is
tricky and requires a think about the "bwlq" appending suffix rules in MatchAndEmitATTInstruction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D88772

[HIP] NFC Add comments to cmath functions

Add missing comments to cmath functions.

Differential Revision: https://reviews.llvm.org/D88837

[HIP] NFC properly reference Differential Revision

Committed [HIP] Restructure hip headers to add cmath
with typo in commit message. Should be Differential
Revision instead of Review. Using this to close the
diff.

Differential Revision: https://reviews.llvm.org/D88837

[SimplifyLibCalls] Optimize mempcpy_chk to mempcpy

[gn build] Port aa2b593f149

[HIP] Restructure hip headers to add cmath

Separate __clang_hip_math.h header into __clang_hip_cmath.h
and __clang_hip_math.h. Improve the math function definition,
and add missing definitions or declarations. Add missing
overloads.

Reviewed By: tra, JonChesterfield

Differential Review: https://reviews.llvm.org/D88837

[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer pipeline

This involves porting BPFAbstractMemberAccess and BPFPreserveDIType to
NPM, then adding them BPFTargetMachine::registerPassBuilderCallbacks
(the NPM equivalent of adjustPassManager()).

Reviewed By: yonghong-song, asbirlea

Differential Revision: https://reviews.llvm.org/D88855

[test][InstCombine][NewPM] Fix InstCombine tests under NPM

Some of these depended on analyses being present that aren't provided
automatically in NPM.

early_dce_clobbers_callgraph.ll was previously inlining a noinline function?

cast-call-combine.ll relied on the legacy always-inline pass being a
CGSCC pass and getting rerun.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D88187

[test][NewPM] Make dead-uses.ll work under NPM

This one is weird...

globals-aa needs to be already computed at licm, or else a function pass
can't run a module analysis and won't have access to globals-aa.
But the globals-aa result is impacted by instcombine in a way that
affects what the test is expecting. If globals-aa is computed before
instcombine, it is cached and globals-aa used in licm won't contain the
necessary info provided by instcombine.
Another catch is that if we don't invalidate AAManager, it will use the
cached AAManager that instcombine requested, which may not contain
globals-aa. So we have to invalidate<aa> so that licm can recompute
an AAManager with the globals-aa created by the require<globals-aa>.

This is essentially the problem described in https://reviews.llvm.org/D84259.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D88118

[Attributor][FIX] Move assertion to make it not trivially fail

The idea of this assertion was to check the simplified value before we
assign it, not after, which caused this to trivially fail all the time.

[Attributor][FIX] Dead return values are not `noundef`

When we assume a return value is dead we might still visit return
instructions via `Attributor::checkForAllReturnedValuesAndReturnInsts(..)`.
When we do so the "returned value" is potentially simplified to `undef`
as it is the assumed "returned value". This is a problem if there was a
preexisting `noundef` attribute that will only be removed as we manifest
the `undef` return value. We should not use this combination to derive
`unreachable` though. Two test cases fixed.

[Attributor][NFC] Ignore benign uses in AAMemoryBehaviorFloating

In AAMemoryBehaviorFloating we used to track benign uses in a SetVector.
With this change we look through benign uses eagerly to reduce the
number of elements (=Uses) we look at during an update.

The test does actually not fail prior to this commit but I already wrote
it so I kept it.

Add ability to turn off -fpch-instantiate-templates in clang-cl

A lot of our code building with clang-cl.exe using Clang 11 was failing with
the following 2 type of errors:

1. explicit specialization of 'foo' after instantiation
2. no matching function for call to 'bar'

Note that we also use -fdelayed-template-parsing in our builds.

I tried pretty hard to get a small repro for these failures, but couldn't. So
there is some subtle edge case in the -fpch-instantiate-templates feature
introduced by this change: https://reviews.llvm.org/D69585

When I tried turning this off using -fno-pch-instantiate-templates, builds
would silently fail with the same error without any indication that
-fno-pch-instantiate-templates was being ignored by the compiler. Then I
realized this "no" option wasn't actually working when I ran Clang under a
debugger.

Differential revision: https://reviews.llvm.org/D88680

Silence -Wunused-variable in NDEBUG mode

Revert "[c++17] Implement P0145R3 during constant evaluation."

This reverts commit ded79be63555f4e5bfdb0db27ef22b71fe568474. It causes
a crash (I sent the crash reproducer directly to the author).

[InstCombine] canRewriteGEPAsOffset - don't dereference a dyn_cast<>. NFCI.

We know V is a IntToPtrInst or PtrToIntInst type so we know its a CastInst - so use cast<> directly.

Prevents clang static analyzer warning that we could deference a null pointer.

[InstCombine] FoldShiftByConstant - consistently use ConstantExpr in logicalshift(trunc(shift(x,c1)),c2) fold. NFCI.

This still only gets used for scalar types but now always uses ConstantExpr in preparation for vector support - it was using APInt methods in some places.

[clangd] Add basic keyword-name-validation in rename.

Differential Revision: https://reviews.llvm.org/D88875

[ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV

This folds a select_cc or select(set_cc) of a max or min vector reduction with a scalar value into a VMAXV or VMINV.

Differential Revision: https://reviews.llvm.org/D87836

[AMDGPU][MC] Added detection of unsupported instructions

Implemented identification of unsupported instructions; improved errors reporting.

See bug 42590.

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D88211

[mlir][Linalg] Extend buffer allocation to support Linalg init tensors

This revision adds init_tensors support to buffer allocation for Linalg on tensors.
Currently makes the assumption that the init_tensors fold onto the first output tensors.

This assumption is not currently enforced or cast in stone and requires experimenting with tiling linalg on tensors for ops **without reductions**.

Still this allows progress towards the end-to-end goal.

Convert diagnostics about multi-character literals from extension to warning

This addresses PR46797.

[SystemZAsmParser] Treat VR128 separately in ParseDirectiveInsn().

This patch makes the parser
  - reject higher vector registers (>=16) in operands where they should not
    be accepted.
  - accept higher integers (>=16) in vector register operands.

Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D88888

[lldb] [Platform] Move common ::DebugProcess() to PlatformPOSIX

Move common ::DebugProcess() implementation shared by Linux and NetBSD
(and to be shared by FreeBSD shortly) into PlatformPOSIX, and move
the old base implementation used only by Darwin to PlatformDarwin.

Differential Revision: https://reviews.llvm.org/D88852

[InstCombine] FoldShiftByConstant - use PatternMatch for logicalshift(trunc(shift(x,c1)),c2) fold. NFCI.

[InstCombine] FoldShiftByConstant - remove unnecessary cast<>. NFC.

Op1 is already a Constant*

[llvm-objcopy][NFC] fix style issues reported by clang-format.

[gn build] Port d6c9dc3c17e

[clang-tidy] Remove obsolete checker google-runtime-references

The rules which is the base of this checker is removed from the
//Google C++ Style Guide// in May:
[[ https://github.com/google/styleguide/pull/553 | Update C++ styleguide ]].
Now this checker became obsolete.

Differential Revision: https://reviews.llvm.org/D88831

[llvm-objcopy][MachO] Add support for universal binaries

This diff adds support for universal binaries to llvm-objcopy.
This is a recommit of 32c8435ef70031 with the asan issue fixed.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D88400

[Statepoints] Change statepoint machine instr format to better suit VReg lowering.

Current Statepoint MI format is this:

   STATEPOINT
   <id>, <num patch bytes >, <num call arguments>, <call target>,
   [call arguments...],
   <StackMaps::ConstantOp>, <calling convention>,
   <StackMaps::ConstantOp>, <statepoint flags>,
   <StackMaps::ConstantOp>, <num deopt args>, [deopt args...],
   <gc base/derived pairs...> <gc allocas...>

Note that GC pointers are listed in pairs <base,derived>.
This causes base pointers to appear many times (at least twice) in
instruction, which is bad for us when VReg lowering is ON.
The problem is that machine operand tiedness is 1-1 relation, so
it might look like this:

  %vr2 = STATEPOINT ... %vr1, %vr1(tied-def0)

Since only one instance of %vr1 is tied, that may lead to incorrect
codegen (see PR46917 for more details), so we have to always spill
base pointers. This mostly defeats new VReg lowering scheme.

This patch changes statepoint instruction format so that every
gc pointer appears only once in operand list. That way they all can
be tied. Additional set of operands is added to preserve base-derived
relation required to build stackmap.
New statepoint has following format:

  STATEPOINT
  <id>, <num patch bytes>, <num call arguments>, <call target>,
  [call arguments...],
  <StackMaps::ConstantOp>, <calling convention>,
  <StackMaps::ConstantOp>, <statepoint flags>,
  <StackMaps::ConstantOp>, <num deopt args>, [deopt args...],
  <StackMaps::ConstantOp>, <num gc pointers>, [gc pointers...],
  <StackMaps::ConstantOp>, <num gc allocas>,  [gc allocas...]
  <StackMaps::ConstantOp>, <num entries in gc map>, [base/derived indices...]

Changes are:
  - every gc pointer is listed only once in a flat length-prefixed list;
  - alloca list is prefixed with its length too;
  - following alloca list is length-prefixed list of base-derived
    indices of pointers from gc pointer list. Note that indices are
    logical (number of pointer), not absolute (index of machine operand).

Differential Revision: https://reviews.llvm.org/D87154

[libcxx][lit] Add support for custom ssh/scp flags in ssh.py

In our CHERI Jenkins CI we need to pass `-F <custom_config_file>` to each
ssh/scp command to set various arguments such as the localhost port, usage
of controlmaster, etc. to speed up connections to our emulated QEMU systems.

For our specific use-case I could have also added a single --ssh-config-file
argument that can be used for both the scp and ssh commands, but being able
to pass arbitrary extra flags for both commands seems more flexible.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D84097

[AArch64] Correct parameter type for unsigned Neon scalar shift intrinsics

In the following intrinsics the shift amount
(parameter 2) should be signed.

vqshlb_u8 vqshlh_u16 vqshls_u32 vqshld_u64
vqrshlb_u8 vqrshlh_u16 vqrshls_u32 vqrshld_u64
vshld_u64
vrshld_u64

See https://developer.arm.com/documentation/ihi0073/latest

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D88013

[clangd] Add `score` extension to workspace/symbol response.

The protocol doesn't really incorporate ranking.
As with code completion, most clients respect what the server sends, but
VSCode re-ranks items, with predictable results.
See https://github.com/clangd/vscode-clangd/issues/81

There's no filterText field so we may be unable to construct a good workaround.
But expose the score so we may be able to do this on the client in future.

Differential Revision: https://reviews.llvm.org/D88844

[SVE] Lower fixed length vector fneg and fsqrt operations.

Also updates sve-fp.ll to use fneg directly.

Differential Revision: https://reviews.llvm.org/D88683

[SVE] Lower fixed length vector floating point rounding operations.

Adds lowering for:
  llvm.ceil
  llvm.floor
  llvm.nearbyint
  llvm.rint
  llvm.round
  llvm.trunc

Differential Revision: https://reviews.llvm.org/D88671

[OpenMP][RTL] Remove dead code

RequiresDataSharing was always 0, resulting dead code in device runtime library.

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D88829

[mlir] Add file to implement bufferization for shape ops.

This adds a shape-bufferize pass and implements the pattern for
shape.assuming.

Differential Revision: https://reviews.llvm.org/D88083

Revert "[llvm-objcopy][MachO] Add support for universal binaries"

This reverts commit 32c8435ef70031d7bd3dce48e41bdce65747e123. It fails
ASan, details in https://reviews.llvm.org/D88400.

Revert "[llvm-objcopy][MachO] Add missing std::move."

This reverts commit 6e25586990b93e2c9eaaa4f473b6720ccd646c46. It depends
on 32c8435ef70031d7bd3dce48e41bdce65747e123, which I'm reverting due to
ASan failures. Details in https://reviews.llvm.org/D88400.

[VPlan] Add vplan native path vectorization test case for inner loop reduction

Regarding this bug I posted earlier: https://bugs.llvm.org/show_bug.cgi?id=47035

After reading through LLVM source code and getting familiar with VPlan I was able to vectorize the code using by enabling VPlan native path. After talking with @fhahn he suggested that I contribute this as a test case. So here it is. I tried to follow the available guides how to do this best I could. I modified IR code by hand to have more clear variable names instead of numbers.

One thing what I'd like to get input from someone is that is current CHECK lines sufficient enough to verify that the inner loop has been vectorized properly?

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D87564

[llvm-readobj/elf][test] - Stop using precompiled binaries in mips-got.test

This removed 2 last precompiled binaries from the mips-got.test.
YAML descriptions are used instead.

Differential revision: https://reviews.llvm.org/D88565

[clangd] Verify the diagnostic code in include-fixer diagnostic tests, NFC.

Make it easier to spot which diagnostics in the include-fixer list are tested.

Differential Revision: https://reviews.llvm.org/D88828

[AMDGPU] Fix gcc warnings

uint8_t types are implicitly promoted to int, leading to a
unsigned-signed comparison.

Thanks for the heads-up @uabelho.

Differential Revision: https://reviews.llvm.org/D88876

[MLIR][SPIRVToLLVM] Conversion for composite extract and insert

A pattern to convert `spv.CompositeInsert` and `spv.CompositeExtract`.
In LLVM, there are 2 ops that correspond to each instruction depending
on the container type. If the container type is a vector type, then
the result of conversion is `llvm.insertelement` or `llvm.extractelement`.
If the container type is an aggregate type (i.e. struct, array), the
result of conversion is `llvm.insertvalue` or `llvm.extractvalue`.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D88205

[clangd] Fix an inconsistent ReasonToReject enum usage, NFC.

[mlir][Linalg] Reintroduced missing verification check

A verification check on the number of indexing maps seems to have dropped inadvertently. Also update the relevant roundtrip tests.

[flang][NFC] Remove redundant `;`

Sadly this has been causing gcc-10 builds to fail.

[LLDB] Add QEMU testing environment setup guide for SVE testing

This patch adds a HowTo document to lldb docs which gives instruction for
setting up a virtual environment based on QEMU emulator for LLDB testing.

Instruction in this document are tested on Arm and AArch64 targets but
can easily be duplicated for other targets supported by QEMU.

This helps test LLDB in absence for modern AArch64 features not released
in publicly available hardware till date.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D82064

[lldb] Symlink the Clang resource directory to the LLDB build directory in standalone builds

When doing a standalone build (i.e., building just LLDB against an existing
LLVM/Clang installation), LLDB is currently unable to find any Clang resource
directory that contains all the builtin headers we need to parse real source
code. This causes several tests that actually parse source code on disk within
the expression parser to fail (most notably nearly all the import-std-module
tests).

The reason why LLDB can't find the resource directory is that we search based on
the path of the LLDB shared library path. We assumed that the Clang resource
directory is in the same prefix and has the same relative path to the LLDB
shared library (e.g., `../clang/10.0.0/include`). However for a standalone build
where the existing Clang can be anywhere on the disk, so we can't just rely on
the hardcoded relative paths to the LLDB shared library.

It seems we can either solve this by copying the resource directory to the LLDB
installation, symlinking it there or we pass the path to the Clang installation
to the code that is trying to find the resource directory. When building the
LLDB framework we currently copy the resource directory over to the framework
folder (this is why the import-std-module are not failing on the Green Dragon
standalone bot).

This patch symlinks the resource directory of Clang into the LLDB build
directory. The reason for that is simply that this is only needed when running
LLDB from the build directory. Once LLDB and Clang/LLVM are installed the
already existing logic can find the Clang resource directory by searching
relative to the LLDB shared library.

Reviewed By: kastiglione, JDevlieghere

Differential Revision: https://reviews.llvm.org/D88581

[SVE][CodeGen] Fix DAGCombiner::ForwardStoreValueToDirectLoad for scalable vectors

In DAGCombiner::ForwardStoreValueToDirectLoad I have fixed up some
implicit casts from TypeSize -> uint64_t and replaced calls to
getVectorNumElements() with getVectorElementCount(). There are some
simple cases of forwarding that we can definitely support for
scalable vectors, i.e. when the store and load are both scalable
vectors and have the same size. I have added tests for the new
code paths here:

CodeGen/AArch64/sve-forward-st-to-ld.ll

Differential Revision: https://reviews.llvm.org/D87098

[AST][RecoveryExpr] Support dependent binary operator in C for error recovery.

see the whole context in: https://reviews.llvm.org/D85025

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D84226