review.tizen.org Git - platform/upstream/llvm.git/log

[AArch64][GlobaISel] Mark target generic instructions as HasNoSideEffects.

One test needed updating because the newly side-effect-free instructions were
now being DCE'd.

[NFC][LSAN] Limit the number of concurrent threads is the test

Test still fails with D88184 reverted.

The test was flaky on https://bugs.chromium.org/p/chromium/issues/detail?id=1206745 and
https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D102218

[mlir] Move move capture in SparseElementsAttr::getValues

This was a TODO for the move to C++14. Now that the move has been completed, we can resolve it.

[lld][WebAssembly] Remove relocation target verification

We have this extra step in wasm-ld that doesn't exist in other lld
backend which verifies the existing contents of the relocation targets.
This was originally intended as an extra form of double checking and an
aid to compiler developers.   However it has always been somewhat
controversial and there have been suggestions in the past the we simply
remove it.

My motivation for removing it now is that its causing me a headache
when trying to fix an issue with negative addends.  In the case of
negative addends that final result can be wrapped/negative but this
checking code would require significant modification to be able to deal
with that case.  For example with some test cases I'm looking at I'm
seeing error like this:

```
wasm-ld: warning: /usr/local/google/home/sbc/dev/wasm/llvm-build/tools/lld/test/wasm/Output/merge-string.s.tmp.o:(.rodata_relocs): unexpected existing value for R_WASM_MEMORY_ADDR_I32: existing=FFFFFFFA expected=FFFFFFFFFFFFFFFA
```

Rather than try to refactor `calcExpectedValue` to somehow return two
different types of results (32 and 64-bit) depending on the relocation
type, I think we can just remove this code.

Differential Revision: https://reviews.llvm.org/D102265

Add an "interrupt timeout" to Process, and pipe that through the
ProcessGDBRemote plugin layers.

Also fix a bug where if we tried to interrupt, but the ReadPacket
wakeup timer woke us up just after the timeout, we would break out
the switch, but then since we immediately check if the response is
empty & fail if it is, we could end up actually only giving a
small interval to the interrupt.

Differential Revision: https://reviews.llvm.org/D102085

[libc++] Run `substitutes-in-compile-flags.sh.cpp` test on Windows.

Fix for substitutes-in-compile-flags.sh.cpp to run it properly on Windows platform.

Differential Revision: https://reviews.llvm.org/D102048

[OpenMP] Use compound operators for reduction combiner if available.

The OpenMP spec seems to require the compound operators be used for
+, *, &, |, and ^ reduction. So use these if a class has those operators.
If not try the simple operators as we did previously to limit the impact
to existing code.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=48584

Differential Revision: https://reviews.llvm.org/D101941

[clang] Support -fpic -fno-semantic-interposition for RISCV

-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```

-fpic (var and fun are dso_preemptable)
```
test:
.LBB1_1:
        auipc   a0, %got_pcrel_hi(var)
        ld      a0, %pcrel_lo(.LBB1_1)(a0)
        lw      a0, 0(a0)
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
        tail    fun@plt
```

vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test:
.Ltest$local:
.LBB1_1:
        auipc   a0, %pcrel_hi(.Lvar$local)
        addi    a0, a0, %pcrel_lo(.LBB1_1)
        lw      a0, 0(a0)
// The assembler either resolves .Lfun$local at assembly time (-mno-relax
// -fno-function-sections), or produces a relocation referencing a non-preemptible
// local symbol (which can avoid PLT).
        tail    .Lfun$local
```

Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.

Depends on D101875

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D101876

[lld][WebAssembly] Convert test to assembly. NFC.

Differential Revision: https://reviews.llvm.org/D102264

[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): canonicalize to integer type

This way we don't have to duplicate i32/f32 and i64/f64 entries,
which was already forgotten to be done for a few tuples.

[GlobalOpt] Remove heap SROA

GlobalOpt implements a heap SROA (SROA for an malloc allocatated struct or array
of structs) which is largely undertested (heap-sra-[1234].ll are basically the
same test with very little difference) and does not trigger at all when
bootstrapping clang (it only supports the case of one single store).

The heap SROA implementation causes PR50027 (GEP is not properly handled; crash or miscompile).
Just drop the implementation. I have deleted some obviously duplicated tests
but kept `heap-sra-[12]{,-no-nullopt}.ll`.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D102257

[AArch64][GlobalISel] Support truncstorei8/i16 w/ combine to form truncating G_STOREs.

This needs some tablegen changes so that we can actually import the patterns properly.

Differential Revision: https://reviews.llvm.org/D102204

[RISCV] Prefer to lower MC_GlobalAddress operands to .Lfoo$local

Similar to X86 D73230 and AArch64 D101872

With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode,
for default visibility external linkage non-ifunc-non-COMDAT definitions.

For such dso_local definitions, variable access/taking the address of a
function/calling a function will go through a local alias to avoid GOT/PLT.

Reviewed By: jrtc27, luismarques

Differential Revision: https://reviews.llvm.org/D101875

[ArgumentPromotion] Fix byval alignment handling.

Make sure the alignment of the generated operations matches the
alignment of the byval argument. Previously, we were just ignoring
alignment and getting lucky.

While I'm here, also delete the unnecessary "tail" handling.
Passing a pointer to a byval argument to a "tail" call is UB, so
rewriting to an alloca doesn't require any special handling.

Differential Revision: https://reviews.llvm.org/D89819

[mlir][ODS]: Add per-op cppNamespace.

This is useful for dialects that have logical subparts.

Differential Revision: https://reviews.llvm.org/D102200

[libcxx] [test] Fix filesystem permission tests for windows

On Windows, the permission bits are mapped down to essentially only
two possible states; readonly or readwrite. Normalize the checked
permission bitmask to match what the implementation will return.

Differential Revision: https://reviews.llvm.org/D101728

[git-clang-format] Do not apply clang-format to symlinks

This fixes PR46992.

Git stores symlinks as text files and we should not format them even if
they have one of the requested extensions.

(Move the call to `cd_to_toplevel()` up a few lines so we can also print
the skipped symlinks during verbose output.)

Differential Revision: https://reviews.llvm.org/D101878

[lld/mac] Implement -sectalign

clang sometimes passes this flag along (see D68351), so we should implement it.

Differential Revision: https://reviews.llvm.org/D102247

Re-apply "[ORC-RT] Add unit test infrastructure, extensible_rtti..."

This reapplies 6d263b6f1c9 (which was reverted in 1c7c6f2b106) with a fix for a
CMake issue.

[flang] Allow large and erroneous ac-implied-do's

We sometimes unroll an ac-implied-do of an array constructor into a flat list
of values.  We then re-analyze the array constructor that contains the
resulting list of expressions.  Such a list may or may not contain errors.

But when processing an array constructor with an unrolled ac-implied-do, the
compiler was building an expression to represent the extent of the resulting
array constructor containing the list of values.  The number of operands
in this extent expression was based on the number of elements in the
unrolled list of values.  For very large lists, this created an
expression so large that it could not be evaluated by the compiler
without overflowing the stack.

I fixed this by continuously folding the extent expression as each operand is
added to it.  I added the test .../flang/test/Semantics/array-constr-big.f90
that will cause the compiler to seg fault without this change.

Also, when the unrolled ac-implied-do expression contains errors, we were
repeating the same error message referencing the same source line for every
instance of the erroneous expression in the unrolled list.  This potentially
resulted in a very long list of messages for a single error in the source code.

I fixed this by comparing the message being emitted to the previously emitted
message.  If they are the same, I do not emit the message.  This change is also
tested by the new test array-constr-big.f90.

Several of the existing tests had duplicate error messages for the same source
line, and this change caused differences in their output.  So I adjusted the
tests to match the new message emitting behavior.

Differential Revision: https://reviews.llvm.org/D102210

[TextAPI] Reformat llvm_unreachable message

Change llvm_unreachable message from "Unknown llvm.MachO.PlatformKind
enum" to "Unknown llvm::MachO::PlatformKind enum".

Differential revision: https://reviews.llvm.org/D102250

Revert "[ORC-RT] Add unit test infrastructure, extensible_rtti..."

This reverts commit 6d263b6f1c9 while I investigate the CMake failures that it
causes in some configurations.

Reland "[Coverage] Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation""

Originally landed in: 6400905a615282c83a2fc6e49e57ff716aa8b4de
Reverted in: 668dccc396da4f593ac87c92dc0eb7bc983b5762

Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation
groups.

This change corrects the implementation for the branch coverage summary to do
the same thing for branches that is done for lines and regions. That is,
across function instantiations in an instantiation group, the maximum branch
coverage found in any of those instantiations is returned, with the total
number of branches being the same across instantiations.

Differential Revision: https://reviews.llvm.org/D102193

[X86][SSE] Add tests for permute(phaddw(phaddw(x,y),phaddw(z,w))) -> phaddw(phaddw(),phaddw()) folds.

We currently only fold if NumEltsPerLane == 4

[libcxx][tests] Fix incomplte.verify tests by disabling them on clang-10.

For some reason clang-10 can't match the expected errors produced by
passing icomplete arrays to range access functions. Disabling the tests
is a stop-gap solution to fix the bots.

[RISCV] Use fractional LMULs for fixed length types smaller than riscv-v-vector-bits-min.

My thought process is that if v2i64 is an LMUL=1 type then v2i32
should be an LMUL=1/2 type. We limit the fractional LMUL so that
SEW=64 clips to LMUL=1, SEW=32 clips to LMUL=1/2, etc. This
ensures there's always a fractional LMUL available to truncate a type.
This does reduce the number of vsetvlis in some cases.

Some tests increase vsetvlis because the best container type for a
mask type is dependent on the LMUL+SEW that the mask was produced
from, but you can't tell that from the type. I think this is
something we need to solve this in the machine IR when optimizing
vsetvlis.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D101215

[X86][Codegen] Shift amount mod: sh? i64 x, (32-y) --> sh? i64 x, -(y+32)

I've seen this in the RawSpeed's BitPumpMSB*::push() hotpath,
after fixing the buffer abstraction to a more sane one,
when looking into a +5% runtime regression.
I was hoping that this would fix it, but it does not look it does.

This seems to be at least not worse than the original pattern.
But i'm actually mainly interested in the case where we already
compute `(y+32)` (see last test),

https://alive2.llvm.org/ce/z/ZCzJio

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D101944

[RISCV] Match trunc_vector_vl+sra_vl/srl_vl with splat shift amount to vnsra/vnsrl.

Limited to splats because we would need to truncate the shift
amount vector otherwise.

I tried to do this with new ISD nodes and a DAG combine to
avoid such a large pattern, but we don't form the splat until
LegalizeDAG and need DAG combine to remove a scalable->fixed->scalable
cast before it becomes visible to the shift node. By the time that
happens we've already visited the truncate node and won't revisit it.

I think I have an idea how to improve i64 on RV32 I'll save for a
follow up.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D102019

Revert "Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation"

This reverts commit 6400905a615282c83a2fc6e49e57ff716aa8b4de.

[libc++] Remove more unnecessary _VSTD:: from type names. NFCI.

Differential Revision: https://reviews.llvm.org/D102181

[libc++] s/_VSTD::is_unsigned/is_unsigned/ in <random>. NFCI.

[libc++] s/_VSTD::chrono/chrono/g. NFCI.

[libc++] s/std::size_t/size_t/g. NFCI.

[libc++] s/_VSTD::declval/declval/g. NFCI.

[libomptarget][nfc] Add hook to easily disable building amdgcn bclib

[libomptarget][nfc] Add hook to easily disable building amdgcn bclib

This is useful when building LLVM with a toolchain that can't emit code
for amdgcn, e.g. because it overrides the include search path with headers
from another architecture, or the clang compiler is missing builtins.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D102229

[mlir] Use static shape knowledge when lowering memref.reshape

This is actually necessary for correctness, as memref.reinterpret_cast
doesn't verify if the output shape doesn't match the static sizes.

Differential Revision: https://reviews.llvm.org/D102232

Add null-pointer checks when accessing a TypeSystem's SymbolFile

A type system is not guaranteed to have a symbol file. This patch adds null-pointer checks so we don't crash when trying to access a type system's symbol file.

Reviewed By: aprantl, teemperor

Differential Revision: https://reviews.llvm.org/D101539

Change Target::ReadMemory to ensure the amount of memory read from the file-cache is the amount requested.

This change ensures that if for whatever reason we read less bytes than expected (for example, when trying to read memory that spans multiple sections), we try reading from the live process as well.

Reviewed By: jasonmolenda

Differential Revision: https://reviews.llvm.org/D101390

Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation
groups.

This change corrects the implementation for the branch coverage
summary to do the same thing for branches that is done for lines and regions.
That is, across function instantiations in an instantiation group, the maximum
branch coverage found in any of those instantiations is returned, with the
total number of branches being the same across instantiations.

Differential Revision: https://reviews.llvm.org/D102193

[NFC][X86] Precommit another testcase for D101944

Produce warning for performing pointer arithmetic on a null pointer.

Summary:
Test and produce warning for subtracting a pointer from null or subtracting
null from a pointer.  Reuse existing warning that this is undefined
behaviour.  Also add unit test for both warnings.

Reformat to satisfy clang-format.

Respond to review comments:  add additional test.

Respond to review comments:  Do not issue warning for nullptr - nullptr
in C++.

Fix indenting to satisfy clang-format.

Respond to review comments:  Add C++ tests.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: efriedma (Eli Friedman), nickdesaulniers (Nick Desaulniers)
Differential Revision: https://reviews.llvm.org/D98798

[IR][AutoUpgrade] Drop align attribute from void return types

Since D87304, `align` become an invalid attribute on none pointer types and
verifier will reject bitcode that has invalid `align` attribute.

The problem is before the change, DeadArgumentElimination can easily
turn a pointer return type into a void return type without removing
`align` attribute. Teach Autograde to remove invalid `align` attribute
from return types to maintain bitcode compatibility.

rdar://77022993

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102201

[NFC][AMDGPU] Correct product name for gfx908

The product name for gfx908 is "AMD Instinct MI100 Accelerator".

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D102209

Revert "[AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S"

This reverts commit 7f78e409d0280c62209e1a7dc8c6d1409acc9184.

[LoopInterchange] Fix legality for triangular loops

This is a bug fix in legality check.

When we encounter triangular loops such as the following form:
    for (int i = 0; i < m; i++)
      for (int j = 0; j < i; j++), or

    for (int i = 0; i < m; i++)
      for (int j = 0; j*i < n; j++),

we should not perform interchange since the number of executions of the loop body
will be different before and after interchange, resulting in incorrect results.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D101305

Fix typo "Execpt" in comments

Differential Revision: https://reviews.llvm.org/D101858

Revert "[TableGen] Make the NUL character invalid in .td files"

At least one build uses a 'sed' that does not understand \x00.

This reverts commit cf9647011c4f05e1eb4423c6637d84e2f26b2042.

[OpenMP] Fix hidden helper + affinity

When KMP_AFFINITY is set, each worker thread's gtid value is used as an
index into the place list to determine the thread's placement. With hidden
helpers enabled, this gtid value is shifted down leading to unexpected
shifted thread placement. This patch restores the previous behavior by
adjusting the mask index to take the number of hidden helper threads
into account.

Hidden helper threads are given the full initial mask and do not
participate in any of the other affinity mechanisms (place partitioning,
balanced affinity). Their affinity is only printed for debug builds.

Differential Revision: https://reviews.llvm.org/D101882

[VPlan] Register recipe for instr if the simplified value is recipe.

If the simplified VPValue is a recipe, we need to register it for Instr,
in case it needs to be recorded. The way this is handled in general may
change soon, following some post-commit comments.

This fixes PR50298.

[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()

Now that getMemoryOpCost() correctly handles all the vector variants,
we should no longer hand-roll our own version of it, but use it directly.

The AVX512 variant probably needs a similar change,
but there it is less obvious.

[TableGen] Make the NUL character invalid in .td files

Differential Revision: https://reviews.llvm.org/D101923

[X86] Replace repeated isa/cast<ConstantSDNode> calls with single single dyn_cast<>. NFCI.

Noticed while looking at D101944

[X86][SSE] Replace foldShuffleOfHorizOp with generalized version in canonicalizeShuffleMaskWithHorizOp

foldShuffleOfHorizOp only handled basic shufps(hop(x,y),hop(z,w)) folds - by moving this to canonicalizeShuffleMaskWithHorizOp we can work with more general/combined v4x32 shuffles masks, float/integer domains and support shuffle-of-packs as well.

The next step will be to support 256/512-bit vector cases.

CodeGen: Fix null dereference before null check

[X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again

Instead of handling power-of-two sized vector chunks,
try handling the large vector in a stream mode,
decreasing the operational vector size
once it no longer works for the elements left to process.

Notably, this improves costs for overaligned loads - loading padding is fine.
This more directly tracks when we need to insert/extract the YMM/XMM subvector,
some costs fluctuate because of that.

Reviewed By: RKSimon, ABataev

Differential Revision: https://reviews.llvm.org/D100684

[SLP] restrict matching of load combine candidates

The test example from https://llvm.org/PR50256 (and reduced here)
shows that we can match a load combine candidate even when there
are no "or" instructions. We can avoid that by confirming that we
do see an "or". This doesn't apply when matching an or-reduction
because that match begins from the operands of the reduction.

Differential Revision: https://reviews.llvm.org/D102074

[AMDGPU] Move code sinking before structurizer

Moving code sinking pass before structurizer creates more sinking
opportunities.

The extra flow edges introduced by the structurizer can have adverse
effects on sinking, because the sinking pass prefers moving instructions
to blocks with unique predecessors and the structurizer destroys that
property in some cases.

A notable example is moving high-latency image instructions across kills.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D101115

[OpenCL] Allow use of double type without extension pragma.

Simply use of extensions by allowing the use of supported
double types without the pragma. Since earlier standards
instructed that the pragma is used explicitly a new warning
is introduced in pedantic mode to indicate that use of
type without extension pragma enable can be non-portable.

This patch does not break backward compatibility since the
extension pragma is still supported and it makes the behavior
of the compiler less strict by accepting code without extra
pragma statements.

Differential Revision: https://reviews.llvm.org/D100980

[libomptarget][nfc] Drop stringify in macro

[libomptarget][nfc] Drop stringify in macro
A step towards deleting the macros entirely.

Differential Revision: https://reviews.llvm.org/D102228

[LLDB] Don't use the local python to set a default for LLDB_PYTHON_RELATIVE_PATH when cross compiling.

Differential Revision: https://reviews.llvm.org/D101903

[PowerPC][Bug] Fix Bug in Stack Frame Update Code

The stack frame update code does not take into consideration spilling
to registers for callee saved registers. The option -ppc-enable-pe-vector-spills
turns on spilling to registers for callee saved registers and may expose a bug
in the code that moves a stack frame pointer update instruction.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101366

[RegAllocFast] properly handle STATEPOINT instruction.

STATEPOINT is a fancy and complex pseudo instruction which
has both tied defs and regmask operand.

Basic FastRA algorithm is as follows:

1. Mark registers used by defs as free
2. If instruction has regmask operand displace clobbered registers
according to regmask.
3. Assign registers for use operands.

In case of tied defs step 1 is replaced with allocation of registers
for them. But regmask is still processed, which may displace already
allocated registers. As a result, tied use and def will get assigned
to different registers.

This patch makes FastRA to process instruction's RegMask (if any) when
checking for physical registers interference.
That way tied operands won't get registers clobbered by regmask.

Reviewed By: arsenm, skatkov
Differential Revision: https://reviews.llvm.org/D99284

[AMDGPU] Add some GFX10.3 testing. NFC.

[MLIR] Switch llvm.noalias to a unit attribute

Switch llvm.noalias attribute from a boolean attribute to a unit
attribute.

Differential Revision: https://reviews.llvm.org/D102225

[LLD] [COFF] Add an assert regarding the RVA of exported symbols. NFC.

As this isn't handled as a regular relocation, the normal handling of
maybeReportRelocationToDiscarded in Chunks.cpp doesn't apply here.

This would have caught the issue fixed by
82de4e075339f5ad8d68cfe31eb45b771d4750ae.

Differential Revision: https://reviews.llvm.org/D102115

[CodeGen][WebAssembly] Better lowering for WASM_SYMBOL_TYPE_GLOBAL symbols

As we have been missing support for WebAssembly globals on the IR level,
the lowering of WASM_SYMBOL_TYPE_GLOBAL to IR was incomplete. This
commit fleshes out the lowering support, lowering references to and
definitions of addrspace(1) values to correctly typed
WASM_SYMBOL_TYPE_GLOBAL symbols.

Depends on D101608.

Differential Revision: https://reviews.llvm.org/D101913

[VP] Improve the VP intrinsic unittests

Test that all VP intrinsics are tested.
Test intrinsic id -> opcode -> intrinsic id round tripping.
Test property scopes in the include/llvm/IR/VPIntrinsics.def file.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D93534

[WebAssembly] Support for WebAssembly globals in LLVM IR

This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space. Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.

Based on work by Paulo Matos in https://reviews.llvm.org/D95425.

Reviewed By: pmatos

Differential Revision: https://reviews.llvm.org/D101608

[flang][cmake] Enable the new driver by default

With this patch, `FLANG_BUILD_NEW_DRIVER` is set to `On` by default
(i.e. the new driver is enabled). Note that the new driver depends on
Clang and hence with this change you will need to add `clang` to
`LLVM_ENABLE_PROJECTS`.

If you don't want to build the new driver, set `FLANG_BUILD_NEW_DRIVER`
to `Off`. This way you won't be required to include `clang` in
`LLVM_ENABLE_PROJECTS`.

Differential Revision: https://reviews.llvm.org/D101842

* Add support for JSON output style to llvm-symbolizer

This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96883

Fix -Wdocumentation warnings. NFCI.

[OpenCL] [NFC] Fixed underline being too short in rst

Support VectorTransfer splitting on writes also.

VectorTransfer split previously only split read xfer ops. This adds
the same logic to write ops. The resulting code involves 2
conditionals for write ops while read ops only needed 1, but the created
ops are built upon the same patterns, so pattern matching/expectations
are all consistent other than in regards to the if/else ops.

Differential Revision: https://reviews.llvm.org/D102157

[libcxx][test] Make string.modifiers/clear_and_shrink_db1.pass.cpp a regular mode test

Turn this test into a normal mode as it contains well-formed code and
checks for defined behavior. It still can be run in debug mode as of D100866.

Differential Revision: https://reviews.llvm.org/D102192

[llvm-dwarfdump] Fix abstract origin vars location stats calculation

There are cases where a concrete DIE with DW_TAG_subprogram can have
abstract_origin attribute, so we handle that situation as well.

Differential Revision: https://reviews.llvm.org/D101025

[mlir][linalg] Remove IndexedGenericOp support from LinalgToLoops...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102187

[mlir][linalg] Remove IndexedGenericOp support from Fusion...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102174

[libcxx] deprecates/removes `std::raw_storage_iterator`

C++17 deprecates `std::raw_storage_iterator` and C++20 removes it.

Implements part of:
* P0174R2 'Deprecating Vestigial Library Parts in C++17'
* P0619R4 'Reviewing Deprecated Facilities of C++17 for C++20'

Differential Revision: https://reviews.llvm.org/D101730

[libcxx] makes comparison operators for `std::*_ordering` types hidden friends

The standard leaves it up to the implementation to decide whether or not
these operators are hidden friends. There are several (well-documented)
reasons to prefer hidden friends, as well as an argument for improved
readability.

Depends on D100342.

Differential Revision: https://reviews.llvm.org/D101707

[libcxx] removes operator!= and globally guards against no spaceship operator

* `operator!=` isn't in the spec
* `<compare>` is designed to work with `operator<=>` so it doesn't
really make sense to have `operator<=>`-less friendly sections.

Depends on D100283.

Differential Revision: https://reviews.llvm.org/D100342

[clangd][remote-client] Set HasMore to true for failure

Currently client was setting the HasMore to true iff stream said so.
Hence if we had a broken stream for whatever reason (e.g. hitting deadline for a
huge response), HasMore would be false, which is semantically incorrect (e.g.
will throw rename off).

Differential Revision: https://reviews.llvm.org/D101915

[clangd][index-sever] Limit results in repsonse

This is to prevent server from being DOS'd by possible malicious
parties issuing requests that can yield huge responses.

One possible drawback is on rename workflow. As it really requests all
occurences, but it has an internal limit on 50 files currently.
We are putting the limit on 10000 elements per response So for rename to regress
one should have 10k refs to a symbol in less than 50 files. This seems unlikely
and we fix it if there are complaints by giving up on the response based on the
number of files covered instead.

Differential Revision: https://reviews.llvm.org/D101914

[mlir][linalg] Remove IndexedGenericOp support from Tiling...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102176

[LLD] Improve reporting unresolved symbols in shared libraries

Currently, when reporting unresolved symbols in shared libraries, if an
undefined symbol is firstly seen in a regular object file that shadows
the reference for the same symbol in a shared object. As a result, the
error for the unresolved symbol in the shared library is not reported.
If referencing sections in regular object files are discarded because of
'--gc-sections', no reports about such symbols are generated, and the
linker finishes successfully, generating an output image that fails on
the run.

The patch fixes the issue by keeping symbols, which should be checked,
for each shared library separately.

Differential Revision: https://reviews.llvm.org/D101996

[OpAsmParser] Refactor parseOptionalInteger to support wide integers, NFC.

OpAsmParser (and DialectAsmParser) supports a pair of
parseInteger/parseOptionalInteger methods, which allow parsing a bare
integer into a C type of your choice (e.g. int8_t) using templates. It
was implemented in terms of a virtual method call that is hard coded to
int64_t because "that should be big enough".

Change the virtual method hook to return an APInt instead. This allows
asmparsers for custom ops to parse large integers if they want to, without
changing any of the clients of the fixed size C API.

Differential Revision: https://reviews.llvm.org/D102120

[AMDGPU] Pre-commit tests for D102211

[RISCV] Fix the calculation of the offset of Zvlsseg spilling.

For Zvlsseg spilling, we need to convert the pseudo instructions
into multiple vector load/store instructions with appropriate offsets.
For example, for PseudoVSPILL3_M2, we need to convert it to

VS2R %v2, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v4, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v6, %base

We need to keep the size of the offset in the pseudo spilling instructions.
In this case, it is (vlenb x 2).

In the original implementation, we use the size of frame objects divide the
number of vectors in zvlsseg types. The size of frame objects is not
necessary exactly the same as the spilling data. It may be larger than
it. So, we change it to (VLENB x LMUL) in this patch. The calculation is
more direct and easy to understand.

Differential Revision: https://reviews.llvm.org/D101869

Enable export of FIR includes into the install tree
https://reviews.llvm.org/D102040

[NFC][LSAN] Fix flaky multithreaded test

[gn build] Port e5d483f28a3a

[ORC-RT] Add unit test infrastructure, extensible_rtti implementation, unit test

Add unit test infrastructure for the ORC runtime, plus a cut-down
extensible_rtti system and extensible_rtti unit test.

Removes the placeholder.cpp source file.

Differential Revision: https://reviews.llvm.org/D102080

[libcxx][ranges] Add ranges::empty CPO.

Depends on D101079. Refs D101189.

Differential Revision: https://reviews.llvm.org/D101193

[mlir][linalg] remove the -now- obsolete sparse support in linalg

All glue and clutter in the linalg ops has been replaced by proper
sparse tensor type encoding. This code is no longer needed. Thanks
to ntv@ for giving us a temporary home in linalg.

So long, and thanks for all the fish.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102098

[AMDGPU] Constant fold Intrinsic::amdgcn_perm

Differential Revision: https://reviews.llvm.org/D102203

[gn build] Port 3b8d2be52725

Reland: "[lld][WebAssembly] Initial support merging string data"

This change was originally landed in: 5000a1b4b9edeb9e994f2a5b36da8d48599bea49
It was reverted in: 061e071d8c9b98526f35cad55a918a4f1615afd4

This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker merging is only performed at `-O1` and above.

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)

Differential Revision: https://reviews.llvm.org/D97657

[mlir][Tensor] Add folding for tensor.from_elements

This trivially folds into a constant when all operands are constant.

Differential Revision: https://reviews.llvm.org/D102199

[AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps

This is roughly equivalent to the floating point portion of
`AArch64TargetLowering::LowerVSETCC`. Main part that's missing is the v4s16 bit.

This also adds helpers equivalent to `EmitVectorComparison`, and
`changeVectorFPCCToAArch64CC`. This moves `changeFCMPPredToAArch64CC` out of
the selector into AArch64GlobalISelUtils for the sake of code reuse.

This is done in post-legalizer lowering with pseudos to simplify selection.
The imported patterns end up handling selection for us this way.

Differential Revision: https://reviews.llvm.org/D101782

Revert "[lld][WebAssembly] Initial support merging string data"

This reverts commit 5000a1b4b9edeb9e994f2a5b36da8d48599bea49.
Breaks tests, see https://reviews.llvm.org/D97657#2749151

Easily repros locally with `ninja check-llvm-mc-webassembly`.

[AArch64][GlobalISel] Enable memcpy family combines on minsize functions

The combines in `tryCombineMemCpyFamily` have heuristics (e.g.
`TLI.getMaxStoresPerMemset`) which consider size. So, theoretically, enabling
these combines on minsize functions shouldn't be harmful.

With this enabled we save 0.9% geomean on CTMark at -Oz, and 5.1% on Bullet.
There are no code size regressions.

Differential Revision: https://reviews.llvm.org/D102198