review.tizen.org Git - platform/upstream/llvm.git/log

[LLD][COFF] Add LLVM toolchain library paths by default.

We want lld-link to automatically find compiler-rt and
libc++ when they are in the same directory as the rest of the
toolchain. This is because on Windows linking isn't done
via the clang driver; the linker is invoked directly.

This prepends: <llvm>/lib <llvm>/lib/clang/XX/lib and
<llvm>/lib/clang/XX/lib/windows automatically to the library
search paths.

Related to #63827

Differential Revision: https://reviews.llvm.org/D151188

[InstCombine] Fold cttz of lowest set bit

cttz(-a & a) is the same as cttz(a). -a & a is an idiom to extract
the lowest set bit, which naturally does not affect the number of
trailing zeroes.

Proof: https://alive2.llvm.org/ce/z/Yp26x7
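
The identity can also be spot-checked outside of Alive2. The following is a
minimal C++ sketch, not part of the patch: `cttz32`, `lowestSetBit` and
`cttzFoldHolds` are hypothetical helper names, and `cttz32` is assumed to
return the bit width for a zero input.

```cpp
#include <cstdint>

// Hypothetical helper: count trailing zeros of a 32-bit value.
// Assumption: a zero input yields the bit width (32).
inline unsigned cttz32(std::uint32_t v) {
  if (v == 0)
    return 32;
  unsigned n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Hypothetical helper: the "-a & a" idiom. Unsigned negation wraps,
// so this isolates the lowest set bit (or yields 0 for a == 0).
inline std::uint32_t lowestSetBit(std::uint32_t a) { return (0u - a) & a; }

// Compare cttz(-a & a) with cttz(a) for every 16-bit pattern, placed
// once in the low half and once in the high half of the word.
inline bool cttzFoldHolds() {
  for (std::uint32_t i = 0; i <= 0xFFFFu; ++i) {
    const std::uint32_t lo = i;
    const std::uint32_t hi = i << 16;
    if (cttz32(lowestSetBit(lo)) != cttz32(lo))
      return false;
    if (cttz32(lowestSetBit(hi)) != cttz32(hi))
      return false;
  }
  return true;
}
```

For example, `lowestSetBit(40)` isolates bit 3, so both sides of the fold
count three trailing zeros.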

[InstCombine] Add tests for cttz of lowest set bit (NFC)

[RISCV][test] Add RV32I and RV64I RUN lines to condops.ll test

Some of these test cases will be changed by upcoming combines, even in
the non-zicond case.

[mlir][linalg] BufferizeToAllocationOp: Add option to specify custom alloc op

Supported ops are "memref.alloc" and "memref.alloca".

Differential Revision: https://reviews.llvm.org/D155282

[mlir][bufferization] OneShotBufferizeOp: Add options to use linalg.copy

This new option allows users to specify a custom memcpy op.

Differential Revision: https://reviews.llvm.org/D155280

[EarlyCSE] Do not CSE convergent calls with memory effects

D149348 did this for readnone calls, which are handled by SimpleValue.
This patch does the same for all other CSEable calls, which are handled
by CallValue.

Differential Revision: https://reviews.llvm.org/D153151

[EarlyCSE] Precommit test for D153151

Differential Revision: https://reviews.llvm.org/D155210

[RISCV] Introduce RISCVISD::CZERO_{EQZ,NEZ} nodes and produce them when zicond is present in lowerSELECT

This patch is a step towards altering how we handle the emission of
condops. Marking ISD::SELECT as legal is a major change in the codegen
path, and gives few options for maintaining the old codegen path when
it is believed to be better (e.g. a better branchless sequence is
possible using non-zicond instructions, or the branch-based sequence is
preferable).

This removes the existing SelectionDAG patterns and moves the logic into
lowerSELECT. Along with some small codegen changes you'll note a few minor
regressions in the generated code quality - these are due to the fact
that by lowering the SELECT node early we miss out on combines that
would kick in later when setcc condcodes that aren't natively supported
have been expanded (thus exposing opportunities for optimisation by
performing logical negation and swapping truev/falsev). I've opted to
split out work that addresses these into follow-on patches (especially
as zicond is still 'experimental').

matchSetCC is a straight-forward translation from the version in
RISCVISelDAGToDAG. Ideally, in the future it can be converted to a
helper shared between both files.

Differential Revision: https://reviews.llvm.org/D155083

[lld][COFF] Find libraries with relative paths.

This patch is spun out of https://reviews.llvm.org/D151188
and makes it possible for lld-link to find libraries with
relative paths. This will be used later to implement the
changes to autolinking runtimes explained in #63827

Differential Revision: https://reviews.llvm.org/D155268

[InstCombine] Handle const select arm in foldSelectCtlzToCttz()

The select arm that takes the ctlz result can also instead be a
constant with the bit width (as this is what the ctlz evaluates to
for a==0).

This avoids a regression when strengthening the
simplifyWithOpReplaced() fold.

Proof: https://alive2.llvm.org/ce/z/DMRL5A

[InstCombine] Add test for ctlz->cttz fold with constant in select (NFC)

[llvm] Remove uses of getNonOpaquePointerElementType() (NFC)

[X86] Fold PACKSS(NOT(X),NOT(Y)) -> NOT(PACKSS(X,Y))
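
Why the fold is sound can be seen on a single lane. The sketch below is a
scalar C++ model, not the X86 lowering code: `packss16to8` is a hypothetical
stand-in for one lane of a signed-saturating 16-bit to 8-bit pack, and the
check confirms that bitwise NOT commutes with it for every 16-bit input.

```cpp
#include <cstdint>

// Hypothetical one-lane model of a signed-saturating pack (i16 -> i8).
inline std::int8_t packss16to8(std::int16_t v) {
  if (v > 127)
    return 127;
  if (v < -128)
    return -128;
  return static_cast<std::int8_t>(v);
}

// Check PACKSS(NOT(x)) == NOT(PACKSS(x)) for all 16-bit values.
inline bool notCommutesWithPackss() {
  for (int i = -32768; i <= 32767; ++i) {
    const std::int16_t x = static_cast<std::int16_t>(i);
    const std::int16_t notX = static_cast<std::int16_t>(~x);
    const std::int8_t lhs = packss16to8(notX);
    const std::int8_t rhs = static_cast<std::int8_t>(~packss16to8(x));
    if (lhs != rhs)
      return false;
  }
  return true;
}
```

The intuition is that NOT maps x to -x-1, which mirrors the saturation range
[-128, 127] onto itself, so clamping before or after gives the same result.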

[HexagonVectorCombine] Remove use of getNonOpaquePointerElementType() (NFC)

[AArch64] Handle 64bit vector s/umull from extracts

This is similar to D153632, but for mul nodes instead of add/sub. They get
recognised in LowerMUL in order to detect the mul(ext, ext), in a way that will
work for i64 nodes as well as i16/i32. This extends it to look for
mul(subvector_extract(ext(x), 0), subvector_extract(ext(y), 0)), generating a
subvector_extract(mull(x,y)) if it matches.

Differential Revision: https://reviews.llvm.org/D154063

[mlir][memref] NFC - Move utility function declaration from IR/MemRef.h to Utils/MemRefUtils.h

Revert "[RandomIRBuilder] Remove use of getNonOpaquePointerElementType() (NFC)"

This reverts commit afdb83b19c674dd2a622697863a201cd44e2458a.

This was landed with a bad description.

[Flang][OpenMP] Use typed assignment in Atomic Write lowering

Use typed assignment in Atomic Write lowering to better handle
type conversions of allowed types.

Note: We should make similar changes for other constructs in
later patches.

Reviewed By: NimishMishra

Differential Revision: https://reviews.llvm.org/D154163

[mlir][LLVM] Convert alias metadata to using attributes instead of ops

Using MLIR attributes instead of metadata has many advantages:
* No indirection: Attributes can simply refer to each other seamlessly without having to use the indirection of `SymbolRefAttr`. This also gives us correctness by construction in a lot of places
* Multithreading safe: The Attribute infrastructure gives us thread-safety for free. Creating operations and inserting them into a block is not thread-safe. This is a major use case for e.g. the inliner in MLIR which runs in parallel
* Easier to create: There is no need for a builder or a metadata region

This patch therefore does exactly that. It leverages the new distinct attributes to create distinct alias domains and scopes in a deterministic and threadsafe manner.

Differential Revision: https://reviews.llvm.org/D155159

[RandomIRBuilder] Remove use of getNonOpaquePointerElementType() (NFC)

[LLParser] Remove checks related to typed pointers (NFC)

[Verifier] Remove typed pointer verification (NFC)

[llvm] Remove uses of hasSameElementTypeAs() (NFC)

Always returns true with opaque pointers.

[llvm][clang] Remove uses of isOpaquePointerTy() (NFC)

This now always returns true (for pointer types).

[RISCV] Fix required features checking with empty string

In our downstream, we define some intrinsics that don't require any
extra extension enabled. Such as

TARGET_BUILTIN(__builtin_riscv_xxx, "LiLi", "nc", "")

But the `split` function's `KeepEmpty` argument is true, so the empty feature string is split into a single empty feature, and we get the error message

error: builtin requires at least one of the following extensions support to be enabled : ''

when we use our customized intrinsic.
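
The failure mode can be reproduced with a small stand-alone model. The sketch
below is plain C++, not the clang code: `splitOn` is a hypothetical helper
that mimics a split with a keep-empty flag. It shows that an empty feature
string yields one empty entry when empty pieces are kept, and none when they
are dropped.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `s` on `sep`. When `keepEmpty` is true,
// empty pieces are kept, so an empty input produces one empty piece.
inline std::vector<std::string> splitOn(const std::string &s, char sep,
                                        bool keepEmpty) {
  std::vector<std::string> out;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type pos = s.find(sep, start);
    if (pos == std::string::npos)
      break;
    if (keepEmpty || pos != start)
      out.push_back(s.substr(start, pos - start));
    start = pos + 1;
  }
  // Trailing piece (this is the whole string when no separator was found).
  if (keepEmpty || start != s.size())
    out.push_back(s.substr(start));
  return out;
}
```

With `keepEmpty` set, `splitOn("", ',', true)` returns a single empty string,
which is the spurious "required feature" behind the `''` in the diagnostic.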

Reviewed By: craig.topper, wangpc

Differential Revision: https://reviews.llvm.org/D154596

[include-cleaner] Bail out in the standalone tool for invalid ignore-headers flag

[mlir][vector] VectorToSCF: Omit redundant out-of-bounds check

There was a bug in `TransferWriteNonPermutationLowering`, a pattern that extends the permutation map of a TransferWriteOp with leading transfer dimensions of size one. These newly added transfer dimensions are always in-bounds, because the starting point of any dimension is in-bounds. VectorToSCF inserts out-of-bounds checks based on the "in_bounds" attribute, and dims that are marked as out-of-bounds but that are actually always in-bounds lead to unnecessary "scf.if" ops.

Differential Revision: https://reviews.llvm.org/D155196

[RISCV] Narrow types of index operand matched pattern (shl (zext), C).

(shl (zext to iXLenVec), C) is a possible pattern in auto-vectorized code for
indexed loads/stores. But extending to iXLen might be too aggressive, since RVV
indexed load/store instructions zero extend their index operand to XLEN.
The patch tries to narrow the type of the zero extension, which helps
decrease register pressure.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154687

[X86]Recommit D154193 - Remove TEST in AND32ri+TEST16rr in peephole-opt

Previously we removed the TEST in a pattern like:
  %reg = and32ri %in_reg, 5
  ...                         // EFLAGS not changed.
  %src_reg = subreg_to_reg 0, %reg, %subreg.sub_index
  test64rr %src_reg, %src_reg, implicit-def $eflags
We can remove the test64rr since it has the same effect on EFLAGS as the and32ri. The subreg_to_reg blocked the optimization in the earlier code, so that case is handled specially.
The following case can be optimized for the same reason:
  %reg = and32ri %in_reg, 5
  ...                         // EFLAGS not changed.
  %src_reg = copy %reg.sub_16bit:gr32
  test16rr %src_reg, %src_reg, implicit-def $eflags
The COPY from gr32 to gr16 blocked the optimization in the earlier code too, so handle it specially, as we did for test64rr.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D154193

[llvm] Remove calls to supportsTypedPointers() (NFC)

Always returns false now.

[llvm] Remove calls to setOpaquePointers() (NFC)

True is the default (and only possible) value.

[IR] Remove LLVMPointerToElt and LLVMAnyPointerToElt intrinsic types (NFC)

With opaque pointers, LLVMPointerToElt can be replaced by llvm_ptr_ty
and LLVMAnyPointerToElt by llvm_anyptr_ty.

This still leaves LLVMVectorOfAnyPointersToElt, where we can't just
replace with an existing IIT descriptor.

Differential Revision: https://reviews.llvm.org/D155167

[clangd] Fix an assertion failure in NamedDecl::getName during the prepareRename

The getName method must only be called on a NamedDecl whose name is a simple
identifier; otherwise it triggers an assertion.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D153617

[MLIR] Remove explicit -opaque-pointers flag from test (NFC)

[IR] Remove -opaque-pointers option

The test migration to opaque pointers has finished, so we can finally
drop typed pointer support from LLVM \o/

This removes the ability to disable opaque pointers, as well as the
-opaque-pointers option, but otherwise doesn't yet touch any API
surface. I'll leave deprecation/removal of compatibility APIs to
future changes.

This also drops a few tests: These are either testing errors that
only occur with typed pointers, or type linking behavior that, to
the best of my knowledge, only applies to typed pointers.

Note that this will break some tests in the experimental SPIRV
backend, because the maintainers have failed to update their tests
in a reasonable time-frame, despite multiple warnings. In accordance
with our experimental target policy, this is not a blocking concern.
This issue is tracked at https://github.com/llvm/llvm-project/issues/60133.

Differential Revision: https://reviews.llvm.org/D155079

[AMDGPU] Relax restrictions on unbreakable PHI users in BreakLargePHis

The previous heuristic rejected a PHI if one of its users was an unbreakable PHI, no matter what the other users were.

This worked well in most cases, but there's one case in rocRAND where
it doesn't work. In that case, a PHI node has 2 PHI users where one is
breakable but not the other. When that PHI node isn't broken, performance falls by 35%.

Relaxing the restriction to "require that half of the PHI node users are breakable" fixes the issue, and seems like a sensible change.
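
A minimal sketch of the relaxed rule, under the assumption that each PHI user
has already been classified as breakable or not. `enoughBreakableUsers` and
`noUnbreakableUsers` are hypothetical names, and "half" is read here as "at
least half"; the actual pass may count users differently.

```cpp
#include <cstddef>
#include <vector>

// Old rule (modelled): reject as soon as any user is unbreakable.
inline bool noUnbreakableUsers(const std::vector<bool> &userIsBreakable) {
  for (bool breakable : userIsBreakable)
    if (!breakable)
      return false;
  return true;
}

// Relaxed rule (modelled): accept when at least half of the users
// are breakable.
inline bool enoughBreakableUsers(const std::vector<bool> &userIsBreakable) {
  std::size_t breakable = 0;
  for (bool b : userIsBreakable)
    if (b)
      ++breakable;
  return 2 * breakable >= userIsBreakable.size();
}
```

In the rocRAND-like situation of one breakable and one unbreakable user, the
old rule rejects the PHI while the relaxed rule accepts it.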

Solves SWDEV-409648, SWDEV-398393

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D155184

[libcxx] [test] Skip timezone formatting tests on Windows

While these tests do pass in the CI environment, they fail elsewhere.
On GitHub Action runners, they produce '+0000' instead of '-0000' for
the UTC offset, and on local machines, they output the UTC offset of
the local timezone.

Differential Revision: https://reviews.llvm.org/D155182

[libc++] Implement istringstream members of P0408R7 (Efficient Access to basic_stringbuf's Buffer)

Reviewed By: #libc, Mordante

Differential Revision: https://reviews.llvm.org/D154454

[mlir] Improve syntax of `distinct[n]<unit>`

In cases where memory is less of a concern (e.g. small attributes where all instances have to be distinct by definition), using `DistinctAttr` with a unit attribute is a useful and concise way of generating deterministic unique IDs.
The current syntax, however, makes them less useful, as it 1) always prints `<unit>` at the back and 2) always aliases them, leading to not very useful `#distinct = distinct[n]<unit>` lines in the printer output.

This patch fixes that by special-casing `UnitAttr` to simply elide the `unit` attribute in the back and not print it as an alias in that case.

Differential Revision: https://reviews.llvm.org/D155162

[mlir] Don't emit forward declaration for user defined storage classes

Currently DefGen::emitDecl always emits forward declarations of storage classes, even for user-defined ones, which makes it difficult to use template classes directly in ODS. This patch changes `DefGen` not to emit a forward declaration when `genStorageClass` is false.

Original discussion: https://discourse.llvm.org/t/use-template-classes-as-user-defined-storage-classes/72015

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D155225

Fix comparison of constrained deduced return types in explicit instantiations.

Fixes #62272.

Remove unnecessary std::moves [NFC]

These trigger the following error:

error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]

[XRay] Add initial support for loongarch64

Only support patching FunctionEntry/FunctionExit/FunctionTailExit for now.

Reviewed By: MaskRay, xen0n
Co-Authored-By: zhanglimin <zhanglimin@loongson.cn>
Differential Revision: https://reviews.llvm.org/D140727

[mlir][sparse][gpu] remove zero init memset

avoids quite a big memory fill for each setup

Reviewed By: K-Wu

Differential Revision: https://reviews.llvm.org/D155251

[MC/AsmLexer] Add '?' (Question) token

'?' is a valid token in our downstream target. There seems to be no way
to do target-specific lexing, so just make AsmParser recognize it.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D154202

[XCOFF][AIX] Peephole optimization for toc-data.

Followup to D101178 - peephole optimization that converts a
load address instruction and a consuming load/store into just the
load/store when it's safe to do so.

e.g. converts the 2 instruction code sequence
  la 4, i[TD](2)
  stw 3, 0(4)
to
  stw 3, i[TD](2)

Differential Revision: https://reviews.llvm.org/D101470

[InstCombine] Transform `icmp eq/ne ({su}div exact X,Y),C` -> `icmp eq/ne X, Y*C`

We can do this if `Y*C` doesn't overflow. This is trivial if `C` is
0/1. Otherwise we actually generate a `mul` instruction iff the `div`
has one use.
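
The no-overflow condition matters. As a rough check outside of Alive2, the C++
sketch below brute-forces the unsigned case on 8-bit values. The function
names are hypothetical, and the code models the arithmetic only, not the
InstCombine implementation.

```cpp
#include <cstdint>

// For every exact 8-bit udiv (x is a multiple of y) and every c where
// y * c still fits in 8 bits, (x / y == c) must agree with (x == y * c).
inline bool udivExactFoldHolds() {
  for (unsigned y = 1; y < 256; ++y)
    for (unsigned x = 0; x < 256; x += y) // multiples of y only: "exact"
      for (unsigned c = 0; c < 256; ++c) {
        if (y * c > 255)
          continue; // the fold requires that Y*C does not overflow
        if (((x / y) == c) != (x == y * c))
          return false;
      }
  return true;
}

// With overflow the fold would be wrong: 16 * 16 wraps to 0 in 8 bits.
inline bool foldBreaksOnOverflow() {
  const std::uint8_t x = 0, y = 16, c = 16;
  const bool original = (x / y) == c;                         // 0 == 16
  const bool folded = x == static_cast<std::uint8_t>(y * c);  // 0 == 0
  return original != folded;
}
```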

Alive2 Links:
udiv: https://alive2.llvm.org/ce/z/GWPW67
sdiv: https://alive2.llvm.org/ce/z/bUoX9h

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D150091

[InstCombine] Add tests for `icmp eq/ne ({su}div exact X, Y), C`; NFC

Differential Revision: https://reviews.llvm.org/D150090

[DebugInfo] Add missing dependency on intrinsics_gen

[clang] Support '-fgpu-default-stream=per-thread' for NVIDIA CUDA

I'm using clang to compile CUDA code and just found that clang doesn't support the per-thread stream option for NVIDIA CUDA. I don't know if there is another solution.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D154822

[compiler-rt][AArch64] Correct how FMV uses the ifunc resolver ABI.

The patch fixes the second argument of Function Multi Versioning resolvers:
it is a pointer to an extendible struct containing hwcap and hwcap2, not an
unsigned long hwcap2. It also fixes FMV feature caching in the resolver.

Differential Revision: https://reviews.llvm.org/D155026

[amdgpu] Delete elide-module-lds attribute

Requires D155190

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155238

[ARM] Replace OperandMatchResultTy with ParseStatus (NFC)

ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows writing something like:
```
  return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
  Error(L, "msg");
  return MatchOperand_ParseFail;
```
It also has a more appropriate name since parse* methods are not only for
parsing operands.
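
The convenience comes from a converting constructor. The sketch below uses a
simplified stand-in, `MiniParseStatus`, rather than the real `ParseStatus`
class; `reportError` is likewise a hypothetical placeholder for an
`Error(...)` helper that returns `true`.

```cpp
// Simplified, hypothetical stand-in for a tri-state parse result.
class MiniParseStatus {
public:
  enum Kind { Success, Failure, NoMatch };

  MiniParseStatus(Kind k) : kind(k) {}
  // Implicit conversion from bool: true means "an error was reported".
  MiniParseStatus(bool error) : kind(error ? Failure : Success) {}

  bool isSuccess() const { return kind == Success; }
  bool isFailure() const { return kind == Failure; }
  bool isNoMatch() const { return kind == NoMatch; }

private:
  Kind kind;
};

// Hypothetical error helper: records a diagnostic (omitted here) and
// returns true, following the usual "true on error" convention.
inline bool reportError(const char *msg) {
  (void)msg;
  return true;
}

// Hypothetical parse method: the error path is a single statement.
inline MiniParseStatus parseThing(bool inputIsValid, bool inputIsPresent) {
  if (!inputIsPresent)
    return MiniParseStatus::NoMatch;
  if (!inputIsValid)
    return reportError("msg"); // bool converts to a Failure status
  return MiniParseStatus::Success;
}
```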

Reviewed By: olista01

Differential Revision: https://reviews.llvm.org/D154304

[AArch64] Replace OperandMatchResultTy with ParseStatus (NFC)

ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows writing something like:
```
  return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
  Error(L, "msg");
  return MatchOperand_ParseFail;
```
It also has a more appropriate name since parse* methods are not only for
parsing operands.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D154292

[mlir][sparse][gpu] minor improvements in 2:4 example

Reviewed By: K-Wu

Differential Revision: https://reviews.llvm.org/D155244

[amdgpu][lds] Remove recalculation of LDS frame from backend

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.

After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.

Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155190

[libc++][NFC] Suppress -Wdeprecated-literal-operator

Remove spaces between operator"" and identifier to suppress
-Wdeprecated-literal-operator, and between operator and ""
like how they are written in [string.view.literals] and [basic.string.literals].
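
For illustration, a user-defined literal written in the non-deprecated form,
with no whitespace between `""` and the suffix. `_kib` is a hypothetical
suffix chosen for this sketch and does not come from libc++.

```cpp
// Preferred spelling: operator""_kib (no space before the suffix).
// The spelling that triggers the warning would be: operator"" _kib
constexpr unsigned long long operator""_kib(unsigned long long v) {
  return v * 1024ULL;
}

static_assert(4_kib == 4096ULL, "the suffix applies to integer literals");
```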

Differential Revision: https://reviews.llvm.org/D155200

[llvm][utils] Disable lldb formatters for PointerIntPair and PointerUnion

These synthetic providers use expression evaluation and fail in some cases.

Examples:

```
llvm::PointerIntPair<llvm::PointerUnion<const Type *, const ExtQuals *>,
                     Qualifiers::FastWidth> Value;
```

and

```
typedef llvm::PointerUnion<const ValueDecl *, const Expr *, TypeInfoLValue,
                           DynamicAllocLValue>
        PtrTy;
```

Original contribution: D117779

rdar://110791233
rdar://112195543

Differential Revision: https://reviews.llvm.org/D155219

[lldb] Move decorators to test method

Make sure TestCTF only runs on Darwin when ctfconvert and llvm-objdump
are available.

[SLP]Add a test with the stores with long distances between them, NFC.

[lldb] Support compressed CTF

Add support for compressed CTF data. The flags in the header can
indicate whether the CTF body is compressed with zlib deflate. This
patch supports inflating the data before parsing.

Differential revision: https://reviews.llvm.org/D155221

[clang][modules] Deserialize included files lazily

In D114095, `HeaderFileInfo::NumIncludes` was moved into `Preprocessor`. This still makes sense, because we want to track this on the granularity of submodules (D112915, D114173), but the way this information is serialized is not ideal. In `ASTWriter`, the set of included files gets deserialized eagerly, issuing lots of calls to `FileManager::getFile()` for input files the PCM consumer might not be interested in.

This patch makes the information part of the header file info table, taking advantage of its lazy deserialization which typically happens when a file is about to be included.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D155131

[llvm-debuginfod] Include llvm/Support/StringExtras.h after D155178

To fix undefined-identifier errors for names like to_float. This tool is often
not built because LLVM_ENABLE_HTTPLIB defaults to off (and the external dependency
cpp-httplib is difficult to set up due to a dependency on brotli) and
because of the LLVM_TOOL_LLVM_DEBUGINFOD_BUILD disabling logic in D147185.

[mlir][MemRef] Move narrow type emulation common methods to MemRefUtils.

It also unifies the computation of StridedLayoutAttr. If the stride is
a statically known value, we can just use it.

Differential Revision: https://reviews.llvm.org/D155017

Include some llvm/Support/StringExtras.h after D155178

[mlir] Don't make the ROCm conversions depend on the execution engine

During a conversion to MLIR_ENABLE_EXECUTION_ENGINE from checking for
the native target, the ROCm conversion passes (--serialize-to-hsaco)
were mistakenly flagged for being disabled if the execution engine is
not being built.

These passes use LLVM to build binaries for AMD GPUs, and so require
that backend to be enabled. However, they do not produce native code,
nor do they interact with the JIT or any of the execution engine
support libraries.

When building MLIR into a compiler library that's intended to produce
GPU binaries, we want to build only the AMDGPU backend and have the
binary serialization passes available. This change makes that
possible.

It looks like the CUDA path might currently require a native target,
it's hard to tell, so this commit leaves that if statement untouched.

Reviewed By: fmorac

Differential Revision: https://reviews.llvm.org/D155227

[AMDGPU] Support -mcpu=native for OpenCL

When -mcpu=native is specified, try detecting the GPU
on the system by using the amdgpu-arch tool. If it
fails to detect a GPU, emit an error saying that no GPU
was detected. If multiple GPUs are detected,
use the first GPU and emit a warning.
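
The selection logic described above can be sketched as follows. This is a
stand-alone C++ model: `pickNativeGpu`, the `Choice` struct, and the
diagnostic strings are hypothetical, and the list of detected GPUs is passed
in rather than obtained by running amdgpu-arch.

```cpp
#include <string>
#include <vector>

// Hypothetical result type: the chosen GPU plus an optional diagnostic.
struct Choice {
  bool ok;                // false when no GPU could be selected
  std::string gpu;        // selected GPU name, empty on failure
  std::string diagnostic; // error or warning text, empty if none
};

// Hypothetical model of the -mcpu=native decision.
inline Choice pickNativeGpu(const std::vector<std::string> &detected) {
  if (detected.empty())
    return {false, "", "error: no GPU detected"};
  if (detected.size() > 1)
    return {true, detected.front(),
            "warning: multiple GPUs detected, using the first one"};
  return {true, detected.front(), ""};
}
```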

Reviewed by: Matt Arsenault, Fangrui Song

Differential Revision: https://reviews.llvm.org/D154531

[Support] Move StringExtras.h include from Error.h to Error.cpp

Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.

As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.

Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.

This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.

Differential Revision: https://reviews.llvm.org/D155178

[lldb] Add missing StringExtras.h includes

In preparation for moving the #include "llvm/ADT/StringExtras.h"
from the header to the source file of llvm/Support/Error.h, first add in
all the missing includes that were previously included transitively
through this header.

This is fixing all files missed in b0abd4893fa1, 39d8e6e22cd1,
a11efd49266f, 5551657b310b, and 90bfe2df25e7.

Differential Revision: https://reviews.llvm.org/D155178

[AMDGPU] Move SIEncodingFamily into SIDefines.h. NFC.

I need this for a future patch in MC, where TII is not available
in llvm-mc. Besides, this is not the first time I have wanted it there.

Differential Revision: https://reviews.llvm.org/D155228

[RISCV] Add Zce extension.

According to the spec, Zce is an alias for Zca, Zcb, Zcmp, and Zcmt.
If F is enabled on RV32 it also includes Zcf.

This patch adds the Zce and the implication rule which unfortunately
requires custom handling for adding Zcf.

I've also made all the Zc* extensions imply Zca.

I've also added an error for Zcf without RV32.
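
The implication rule reads naturally as a small set expansion. The sketch
below is a simplified C++ model, not the actual extension-handling code:
`expandZce` is a hypothetical function, extension names are plain lowercase
strings, and only the Zce rule itself is modelled.

```cpp
#include <set>
#include <string>

// Hypothetical model: expand Zce into the extensions it stands for.
// Zcf is only added when F is enabled and the target is RV32.
inline std::set<std::string> expandZce(std::set<std::string> exts,
                                       unsigned xlen) {
  if (exts.count("zce")) {
    exts.insert({"zca", "zcb", "zcmp", "zcmt"});
    if (xlen == 32 && exts.count("f"))
      exts.insert("zcf");
  }
  return exts;
}
```

The conditional Zcf step is the part that does not fit a plain
"A implies B" table, which is why it needs custom handling.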

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D153742

[PowerPC] Improve code gen for vector add

Improve codegen for vector modulo additions.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D154447

[mlir][gpu] Add dump-ptx option

When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have a simple way to do it.

This work adds dump-ptx to the gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155166

[AMDGPU][IGLP] Add iglp_opt(1) strategy for single wave gemms

This adds the IGLP strategy for single-wave gemms. The SchedGroup pipeline is laid out in multiple phases, with each phase corresponding to a distinct pattern present in gemm kernels. The resilience of the optimization is dependent upon IR (as seen by pre-RA scheduling) continuing to have these patterns (as defined by instruction class and dependencies) in their current relative ordering.

The kernels of interest have these specific phases:
NT: 1, 2a, 2c
NN: 1, 2a, 2b
TT: 1, 2b, 2c
TN: 1, 2b

The general approach taken was to have a long SchedGroup pipeline. In this way the scheduler will have less capability of doing the wrong thing. In order to resolve the challenge of correctly fitting these long pipelines, we leverage the rules infrastructure to help the solver.

Differential Revision: https://reviews.llvm.org/D149773

Change-Id: I1a35962a95b4bdf740602b8f110d3297c6fb9d96

[flang][runtime] Support in-tree device build of Flang runtime.

I changed the set of files that are built for experimental CUDA/OMP
builds, i.e. the files with enabled device support are built
as such and the rest of the files are built just for the host target.
With this change we can build Flang runtime library that is fully functional
on the host target, so in-tree targets like check-flang become operational.

Reviewed By: klausler, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D155029

[AMDGPU][AsmParser][NFC] Translate parsed MIMG instructions to MCInsts automatically.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D155061

[AMDGPU][MC] Fix handling of A16 operands in intersect_ray instructions.

The patch adds the support for 'noa16' operands in non-A16 variants of
the instructions, fixes validation of A16 operands and eliminates the
custom conversion to MCInst.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155057

[RISCV] Add initial SDNode patterns for unary zvbb instructions

This patch adds pseudos and SDNode patterns for vbrev.v, vrev8.v, vclz.v,
vctz.v and vcpop.v.
I've only added them for integer element types so far since we're lacking tests
for floats.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155216

[RISCV] Correct resource cycles for vzext/vsext in SiFive7 scheduler.

The instructions produce DLEN bits per cycle. The vsetvli LMUL for these
instructions is the output EMUL. The input EMUL is scaled down by
the vector factor suffix on the instruction name.

So for LMUL=1 there are 2*DLEN bits of result produced over 2 cycles.
This makes SiFive7GetCyclesDefault the correct resource cycles.

Reviewed By: monkchiang

Differential Revision: https://reviews.llvm.org/D155010

[AMDGPU][MC] Pre-commit tests for the noa16 intersect_ray instructions fix, D155057.

The added instructions are incorrectly encoded as a16 ones despite the
'noa16' modifiers.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D155059

[DebugInfo] Force users of DWARFDebugAbbrev to call parse before iterating

In an attempt to make it easier to catch errors when parsing the
debug_abbrev section, we should force users to call `parse` before
calling `begin`. In a follow-up change, I will change the return type of
`parse` from `void` to `Error`.

I also explored using the fallible_iterator pattern instead of forcing
users to parse everything up front. I think it would be a useful and
interesting pattern to implement, but it would require more extensive
changes to both DWARFDebugAbbrev and its users. Because my top priority
is improving the safety around parsing debug_abbrev, I'm opting to
preserve existing behavior until I or somebody else has time to refactor
to be able to implement a fallible_iterator.

Differential Revision: https://reviews.llvm.org/D154655

[lldb] Support Compact C Type Format (CTF)

Add support for the Compact C Type Format (CTF) in LLDB. The format
describes the layout and sizes of C types. It is most commonly consumed
by dtrace.

We generate CTF for the XNU kernel and want to be able to use this in
LLDB to debug kernels for which we don't have dSYMs (anymore). CTF is a
much more limited debug format than DWARF which allows it to be an order
of magnitude smaller: a 1GB dSYM can be converted to a handful of
megabytes of CTF. For XNU, the goal is not to replace DWARF, but rather
to have CTF serve as a "better than nothing" debug info format when
DWARF is not available.

It's worth noting that the LLVM toolchain does not support emitting CTF.
XNU uses ctfconvert to generate CTF from DWARF which is used for
testing.

Differential revision: https://reviews.llvm.org/D154862

[lldb] Support Compact C Type Format (CTF) section

Teach LLDB about the ctf (Compact C Type Format) section.

Differential revision: https://reviews.llvm.org/D154668

[RISCV] Common remaining operand logic in performCombineVMergeAndVOps [nfc]

We can share the code for both the unmasked and masked cases, and add a missing consistency assert in the process.

This is a subset of Luke's D155063. I'm splitting pieces and landing them in the process of convincing myself all the individual transforms are in fact correct. This is the last major piece.

[lldb] Move CommandOverrideCallbackWithResult to lldb_private namespace

This has an `lldb_private` type in its parameter, so it should be in
`lldb-private-types.h`.

Differential Revision: https://reviews.llvm.org/D155129

[flang][openacc] Add support for complex mul reduction

Add support to lower reduction with the multiply operator and
complex type.

Depends on D155007

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D155014

Switch to strncpy to silence GCC stringop overflow warnings.

Thanks to Simon Pilgrim for letting me know about these in
https://reviews.llvm.org/rG9d701c8a8d65.

[BOLT] Attach ORC info to instructions in CFG

Propagate Linux Kernel ORC information read from the file to the whole
function CFG once the graph has been built. We have a choice to either
attach ORC state annotation to every instruction, or to the first
instruction in the basic block to conserve processing memory. I chose to
attach to every instruction under --print-orc option which is currently
on by default.

Depends on D155153, D154815

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D155156

[BOLT][NFC] Add post-CFG processing to MetadataRewriter interface

Add MetadataRewriter::postCFGInitializer().

Reviewed By: jobnoorman

Differential Revision: https://reviews.llvm.org/D155153

[RISCV] Reason explicitly about mask and rounding mode in performCombineVMergeAndVOps [nfc]

This is a subset of Luke's D155063. I'm splitting pieces and landing them in the process of convincing myself all the individual transforms are in fact correct.

The code structure here is overly verbose. I'm landing this staging change with the code structure exactly matching the non-masked case to make the following cleanup that commons this all obviously correct.

[BOLT] Add reading support for Linux ORC sections

Read ORC (oops rewind capability) info used for unwinding the stack by
Linux Kernel. The info is stored in .orc_unwind and .orc_unwind_ip
sections. There is also a related .orc_lookup section that is being
populated by the kernel during runtime. Contents of the sections are
sorted for quicker lookup by a post-link objtool.

Unless we modify stack access instructions, we don't have to change ORC
info attributed to instructions in the binary. However, we need to
update instruction addresses and sort both sections based on the new
layout.

For pretty printing, we add "--print-orc" option that prints ORC info
next to instructions in code dumps.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D154815

Remove Clang :: CodeGenCXX/unified-cfi-lto.cpp due to buildbot failures

This test has been failing on sanitizer-x86_64-linux-bootstrap-asan
since it was committed. Removing this test while I work on reproducing
this.

Example: https://lab.llvm.org/buildbot/#/builders/168/builds/14579

[libc++] Fix filesystem tests on platforms that don't have IO

This patch moves a few tests that were still using std::fprintf to
using TEST_REQUIRE instead, which provides a single point to tweak
for platforms that don't implement fprintf. As a fly-by fix, it also
avoids including `time_utils.h` in filesystem_clock.cpp when it is
not required, since that header makes some pretty large assumptions
about the platform it is on.

Differential Revision: https://reviews.llvm.org/D155019

[BOLT][DWARF] Fix adding DW_AT_GNU_ranges_base

There are cases in DWARF4 when the Skeleton CU has ranges, but the dwo CU doesn't.
The bug was introduced in the new DWARFRewriter, where for DWARF4 it would fall through
to the DWARF5 case.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D155033

[lldb] Forward declare SBPlatform and SBTypeMember in SBDefines

Differential Revision: https://reviews.llvm.org/D155137

[BOLT][DWARF][NFC] Fix false positive error

The DWO Unit DIE doesn't have low_pc/high_pc, so we were printing this error
for valid cases.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D155032

Don't assert on a non-pointer value being used for a "p" inline asm constraint.

GCC and existing codebases allow integral values to be used
with this constraint. A recent change D133914 in this area started causing asserts.
Removing the assert is enough as the rest of the code works fine.

rdar://109675485

Differential Revision: https://reviews.llvm.org/D155023

[RISCV] Update test after the addition of rounding mode to vfadd intrinsic. NFC

The greediness of the operand matching regular expressions made
the test pass even though an operand is missing.