review.tizen.org Git - platform/upstream/llvm.git/log

[NFC][IndVars] Autogenerate check lines in tests being affected by upcoming patch

[NFC][LSR] Autogenerate check lines in tests being affected by upcoming patch

[NFC][SCEV] Autogenerate check lines in tests being affected by upcoming patch

[gn bulid] Remove phantom WebAssembly tablegen() calls

Apparenlty I added these in https://reviews.llvm.org/rL350628 but
I'm not sure why. They never existed in the CMake build, and now
they're causing trouble.

[AArch64] Stack frame reordering.

Implement stack frame reordering in the AArch64 backend.

Unlike the X86 implementation, AArch64 does not seem to benefit from
"access density" based frame reordering, mainly because it has a much
smaller variety of addressing modes, and the fact that all instructions
are 4 bytes so each frame object is either in range of an instruction
(and then the access is "free") or not (and that has a code size cost
of 4 bytes).

This change improves Memory Tagging codegen by
* Placing an object that has been chosen as the base tagged pointer of
the function at SP + 0. This saves one instruction to setup the pointer
(IRG does not have an offset immediate), and more because that object
can now be referenced without materializing its tagged address in a
scratch register.
* Placing objects that go out of scope simultaneously together. This
exposes opportunities for instruction merging in tryMergeAdjacentSTG.

Differential Revision: https://reviews.llvm.org/D72366

[MTE] Pin the tagged base pointer to one of the stack slots.

Summary:
Pin the tagged base pointer to one of the stack slots, and (if
necessary) rewrite tag offsets so that an object that occupies that
slot has both address and tag offsets of 0. This allows ADDG
instructions for that object to be eliminated and their uses replaced
with the tagged base pointer itself.

This optimization must be done in machine instructions and not in the IR
instrumentation pass, because referring to a stack slot through an IRG
pointer would confuse the stack coloring pass.

The optimization makes a (pretty naive) attempt to find the slot that
would benefit the most by counting the uses of stack slots in the
function.

Reviewers: ostannard, pcc

Subscribers: merge_guards_bot, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72365

[AMDGPU] gfx1032 target

Differential Revision: https://reviews.llvm.org/D89487

Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions"

This reverts commit 7c8385a352ba21cb388046290d93b53dc273cd9f with a typing fix
to an instruction selection pattern.

[SemaObjC] Fix composite pointer type calculation for `void*` and pointer to lifetime qualified ObjC pointer type

Fixes a regression introduced in 9a6f4d451ca7. rdar://70101809

Differential revision: https://reviews.llvm.org/D89475

[mlir] Add std.tensor_to_memref op and teach the infra about it

The opposite of tensor_to_memref is tensor_load.

- Add some basic tensor_load/tensor_to_memref folding.
- Add source/target materializations to BufferizeTypeConverter.
- Add an example std bufferization pattern/pass that shows how the
  materialiations work together (more std bufferization patterns to come
  in subsequent commits).
  - In coming commits, I'll document how to write composable
  bufferization passes/patterns and update the other in-tree
  bufferization passes to match this convention. The populate* functions
  will of course continue to be exposed for power users.

The naming on tensor_load/tensor_to_memref and their pretty forms are
not very intuitive. I'm open to any suggestions here. One key
observation is that the memref type must always be the one specified in
the pretty form, since the tensor type can be inferred from the memref
type but not vice-versa.

With this, I've been able to replace all my custom bufferization type
converters in npcomp with BufferizeTypeConverter!

Part of the plan discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89437

[mlir] Fix typo in LangRef

Reapply [LLD] [COFF] Implement a GNU/ELF like -wrap option

Add a simple forwarding option in the MinGW frontend, and implement
the private -wrap option in the COFF linker.

The feature in lld-link isn't gated by the -lldmingw option, but
the option is left as a private, undocumented option primarily
used by the MinGW driver.

The implementation is significantly based on the support for --wrap
in the ELF linker, but many small nuance details are different
between the ELF and COFF linkers, ending up with more than a few
implementation differences.

This fixes https://bugs.llvm.org/show_bug.cgi?id=47384.

Differential Revision: https://reviews.llvm.org/D89004

Reapplied with the bitfield member canInline fixed so it doesn't break
builds targeting windows.

[InstCombine] update tests for logic folds to exercise commuted patterns; NFC

This was the intent for D88551.
I also varied the types a bit for extra coverage
and tried to make better test/value names.

[NFC][CaptureTracking] Move static function isNonEscapingLocalObject to llvm namespace

Function isNonEscapingLocalObject is a static one within BasicAliasAnalysis.cpp.
It wraps around PointerMayBeCaptured of CaptureTracking, checking whether a pointer
is to a function-local object, which never escapes from the function.

Although at the moment, isNonEscapingLocalObject is used only by BasicAliasAnalysis,
its functionality can be used by other pass(es), one of which I will put up for review
very soon. Instead of copying the contents of this static function, I move it to llvm
scope, and place it amongst other functions with similar functionality in CaptureTracking.

The rationale for the location are:
- Pointer escape and pointer being captured are actually two sides of the same coin
- isNonEscapingLocalObject is wrapping around another function in CaptureTracking

Reviewed By: jdoerfert (Johannes Doerfert)

Differential Revision: https://reviews.llvm.org/D89465

Make sure both cc1 and cc1as process -m[no-]code-object-v3

Differential Revision: https://reviews.llvm.org/D89478

[libc++] Reduce dependencies on <iostream> from <random>

We included <istream> and <ostream> from <random>, but really it is
sufficient to include <iosfwd> if we make sure we access ios_base
members through a dependent type. This allows us to break a hard
dependency of <random> on locales.

[flang][msvc] Avoid a reinterpret_cast

The call to the binary->decimal formatter in real.cpp was cheating
by using a reinterpret_cast<> to extract its binary value.
Use a more principled and portable approach by extending the
API of evaluate::Integer<> to include ToUInt<>()/ToSInt<>()
member function templates that do the "right" thing. Retain
ToUInt64()/ToSInt64() for compatibility.

Differential revision: https://reviews.llvm.org/D89435

[mlir][Linalg] NFC - Rename test files s/_/-/g

Revert "[LLD] [COFF] Implement a GNU/ELF like -wrap option"

This reverts commit a012c704b5e5b60f9d2a7304d27cbc84a3619571.

Breaks Windows builds.

C:\src\llvm-mint\lld\COFF\Symbols.cpp(26,1): error: static_assert failed due to requirement 'sizeof(lld::coff::SymbolUnion) <= 48' "symbols should be optimized for memory usage"
static_assert(sizeof(SymbolUnion) <= 48,

[LV] Add a getRecurrenceBinOp and make use of it. NFC

[libc++][filesystem] Only include <fstream> when we actually need it in copy_file_impl

This allows building <filesystem> on systems that don't support <fstream>,
such as systems that don't support localization.

[CostModel] remove cost-kind predicate for ctlz/cttz intrinsics in basic TTI implementation

The cost modeling for intrinsics is a patchwork based on different
expectations from the callers, so it's a mess. I'm hoping to untangle
this to allow canonicalization to the new min/max intrinsics in IR.
The general goal is to remove the cost-kind restriction here in the
basic implementation class. Ie, if some intrinsic has throughput cost
of 104, assume that it has the same size, latency, and blended costs.
Effectively, an intrinsic with cost N is composed of N simple
instructions. If that's not correct, the target should provide a more
accurate override.

The x86-64 SSE2 subtarget cost diffs require explanation:

1. The scalar ctlz/cttz are assuming "BSR+XOR+CMOV" or
   "TEST+BSF+CMOV/BRANCH", so not cheap.
2. The 128-bit SSE vector width versions assume cost of 18 or 26
   (no explanation provided in the tables, but this corresponds to a
   bunch of shift/logic/compare).
3. The 512-bit vectors in the test file are scaled up by a factor of
   4 from the legal vector width costs.
4. The plain latency cost-kind is not affected in this patch because
   that calc is diverted before we get to getIntrinsicInstrCost().

Differential Revision: https://reviews.llvm.org/D89461

[PGO] Remove the old memop value profiling buckets.

Following up D81682 and D83903, remove the code for the old value profiling
buckets, which have been replaced with the new, extended buckets and disabled by
default.

Also syncing InstrProfData.inc between compiler-rt and llvm.

Differential Revision: https://reviews.llvm.org/D88838

[libc++] NFC: Remove unused include

[libc++] Allow building libc++ on platforms without a random device

Some platforms, like several embedded platforms, do not provide a source
of randomness through a random device. This commit makes it possible to
build and test libc++ for such platforms, i.e. without std::random_device.

Surprisingly, the only functionality that doesn't work on such platforms
is std::random_device itself -- everything else in <random> still works,
one just has to find alternative ways to seed the PRNGs.

[x86] add no 'unwind' to reduce test noise; NFC

I suggested this in D89412, but the comment was missed.

Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions"

This reverts commit 7c6bfd90ab2ddaa60de62878c8512db0645e8452.

[lldb] [Process/FreeBSDRemote] Initial multithreading support

Implement initial support for watching thread creation and termination.
Update ptrace() calls to correctly indicate requested thread.
Watchpoints are not supported yet.

This patch fixes at least multithreaded register tests.

Differential Revision: https://reviews.llvm.org/D89413

[LLD] [COFF] Implement a GNU/ELF like -wrap option

Add a simple forwarding option in the MinGW frontend, and implement
the private -wrap option in the COFF linker.

The feature in lld-link isn't gated by the -lldmingw option, but
the option is left as a private, undocumented option primarily
used by the MinGW driver.

The implementation is significantly based on the support for --wrap
in the ELF linker, but many small nuance details are different
between the ELF and COFF linkers, ending up with more than a few
implementation differences.

This fixes https://bugs.llvm.org/show_bug.cgi?id=47384.

Differential Revision: https://reviews.llvm.org/D89004

[LLD] [COFF] Fix a condition that was missed in 7f0e6c31c255. NFC.

This should fix cases when e.g. auto import is enabled without
mingw mode in total being enabled.

Differential Revision: https://reviews.llvm.org/D89006

[WebAssembly] v128.load{8,16,32,64}_lane instructions

Prototype the newly proposed load_lane instructions, as specified in
https://github.com/WebAssembly/simd/pull/350. Since these instructions are not
available to origin trial users on Chrome stable, make them opt-in by only
selecting them from intrinsics rather than normal ISel patterns. Since we only
need rough prototypes to measure performance right now, this commit does not
implement all the load and store patterns that would be necessary to make full
use of the offset immediate. However, the full suite of offset tests is included
to make it easy to track improvements in the future.

Since these are the first instructions to have a memarg immediate as well as an
additional immediate, the disassembler needed some additional hacks to be able
to parse them correctly. Making that code more principled is left as future
work.

Differential Revision: https://reviews.llvm.org/D89366

[RISCV] fix a mistake in RISCVInstrInfoV.td

A commit of VALUVVNoVm was wrong, fixed it.

Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D88142

[InstCombine] Use m_SpecificInt instead of m_APInt + comparison. NFCI.

[InstCombine] SimplifyDemandedUseBits - xor - refactor cast<ConstantInt> usage to PatternMatch. NFCI.

First step towards replacing these to add full vector support.

[InstCombine] InstCombineAndOrXor - refactor cast<ConstantInt> usages to PatternMatch. NFCI.

First step towards replacing these to add full vector support.

[openmp][libomptarget] Include header from LLVM source tree

[openmp][libomptarget] Include header from LLVM source tree

The change is to the amdgpu plugin so is unlikely to break anything.

The point of contention is whether libomptarget can depend on LLVM.
A community discussion was cautiously not opposed yesterday.

This introduces a compile time dependency on the LLVM source tree, in this case
expressed as skipping the building of the plugin if LLVM_MAIN_INCLUDE_DIR is not
set. One the source files will #include llvm/Frontend/OpenMP/OMPGridValues.h,
instead of copy&pasting the numbers across.

For users that download the monorepo, the llvm tree is already on disk. This will
inconvenience users who download only the openmp source as a tar, as they would
now also have to download (at least a file or two) from the llvm source, if they want
to build the parts of the openmp project that (post this patch) depend on llvm.

There was interest expressed in going further - using llvm tools as part of
building libomp, or linking against llvm libraries. That seems less clear cut
an improvement and worthy of further discussion. This patch seeks only to change
policy to support openmp depending on the llvm source tree. Including in the
other direction, or using libraries / tools etc, are purposefully out of scope.

Reviewers are a best guess at interested parties, please feel free to add others

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87841

[mlir][standard] Fix parsing of scalar subview and canonicalize

Parsing of a scalar subview did not create the required static_offsets attribute.
This also adds support for folding scalar subviews away.

Differential Revision: https://reviews.llvm.org/D89467

[NFC] Fix license header from D87841

[TableGen] Add the !not and !xor operators.
Update the TableGen Programmer's Reference.

[RISCV] [TableGen] Modify RISCVCompressInstEmitter.cpp to use getAllDerivedDefinitions().

Add "not" to an llvm-symbolizer test that expects to fail

In a7b209a6d40d77b, llvm-symbolizer was adjusted to return a failure status
code when it produced an error, to flag up DWARF parsing problems. The
test for missing PDB file is analogous, and returns a failure status now
too.

This should fix the llvm-clang-win-x-armv7l buildbot croaking:

http://lab.llvm.org:8011/#/builders/60/builds/77

AMDGPU: Fix verifier error on killed spill of partially undef register

This does unfortunately end up with extra waitcnts getting inserted
that were avoided before. Ideally we would avoid the spills of these
undef components in the first place.

[InstCombine] visitXor - refactor ((X^C1)>>C2)^C3 -> (X>>C2)^((C1>>C2)^C3) fold. NFCI.

This is still ConstantInt-only (scalar) but is refactored to use PatternMatch to make adding vector support in the future relatively trivial.

[AMDGPU] Minimize number of s_mov generated by copyPhysReg

Generate the minimal set of s_mov instructions required when
expanding a SGPR copy operation in copyPhysReg.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D89187

[SVE]Fix implicit TypeSize casts in EmitCheckValue

Using TypeSize::getFixedSize() instead of relying upon the implicit
TypeSize->uint64_cast as the type is always fixed width.

Differential Revision: https://reviews.llvm.org/D89313

[Statepoints] Remove MI limit on number of tied operands.

After D87915 statepoint can have more than 15 tied operands.
Remove this restriction from statepoint lowering code.

[LLD][ELF] Improve ICF for relocations to ineligible sections via "aliases"

ICF was not able to merge equivalent sections because of relocations to
sections ineligible for ICF that use alternative symbols, e.g. symbol
aliases or section relative relocations.

Merging in this scenario has been enabled by giving the sections that
are ineligible for ICF a unique ID, i.e. an equivalence class of their
own. This approach also provides another benefit as it improves the
hashing that is used to perform the initial equivalance grouping for
ICF. This is because the ICF ineligible sections can now contribute a
unique value towards the hashes instead of the same value of zero. This
has been seen to reduce link time with ICF by ~68% for objects compiled
with -fprofile-instr-generate.

In order to facilitate this use of a unique ID, the existing
inconsistent approach to the setting of the InputSection eqClass in ICF
has been changed so that there is a clear distinction between the
eqClass values of ICF eligible sections and those of the ineligible
sections that have a unique ID. This inconsistency could have caused
incorrect equivalence class equality in the past, although it appears
that no issues were encountered in actual use.

Differential Revision: https://reviews.llvm.org/D88830

[flang] Fix build with BUILD_SHARED_LIBS=ON and FLANG_BUILD_NEW_DRIVER=ON

As usual, it's difficult to handle all different configuration in the first row,
but this one has been extensively tested

Differential Revision: https://reviews.llvm.org/D89452

Fix unused variable warning when compiling with asserts disabled.

Differential Revision: https://reviews.llvm.org/D89454

[DebugInstrRef] Support recording of instruction reference substitutions

Add a table recording "substitutions" between pairs of <instruction,
operand> numbers, from old pairs to new pairs. Post-isel optimizations are
able to record the outcome of an optimization in this way. For example, if
there were a divide instruction that generated the quotient and remainder,
and it were replaced by one that only generated the quotient:

  $rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1
  DBG_INSTR_REF 1, 0
  DBG_INSTR_REF 1, 1

Became:

  $rax = DIV $rdx, $rsi, debug-instr-num 2
  DBG_INSTR_REF 1, 0
  DBG_INSTR_REF 1, 1

We could enter a substitution from <1, 0> to <2, 0>, and no substitution
for <1, 1> as it's no longer generated.

This approach means that if an instruction or value is deleted once we've
left SSA form, all variables that used the value implicitly become
"optimized out", something that isn't true of the current DBG_VALUE
approach.

Differential Revision: https://reviews.llvm.org/D85749

[AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector support

Replace m_ConstantInt with m_APInt to support uniform vectors (with no undef elements)

Adding non-undef support would involve some refactoring of the MaskOps struct but this might still be worth it.

[AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector tests

[CodeGen][X86] Emit fshl/fshr ir intrinsics for shiftleft128/shiftright128 ms intrinsics

Now that funnel shift handling is pretty good, we can use the intrinsics directly and avoid a lot of zext/trunc issues.

https://godbolt.org/z/YqhnnM

Differential Revision: https://reviews.llvm.org/D89405

[lldb] Explicitly test the template argument SB API

[AMDGPU] Add objdump invalid metadata testcase

Checks that metadata and invalid message are printed.

Differential Revision: https://reviews.llvm.org/D89375

[Statepoints] Unlimited tied operands.

Current limit on amount of tied operands (15) sometimes is too low
for statepoint. We may get couple dozens of gc pointer operands on
statepoint.
Review D87154 changed format of statepoint to list every gc pointer
only once, which makes it trivial to find tiedness relation between
statepoint operands: defs are mapped 1-1 to gc pointer operands passed
on registers.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D87915

[NFC] Correct name of profile function to Profile in APValue

Capitalize the profile function of APValue such that it can be used by FoldingSetNodeID

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D88643

[yaml2obj] - Allow specifying no tags to create empty sections in few cases.

Currently we have a few sections that
does not support specifying no keys for them. E.g. it is required that one
of "Content", "Size" or "Entries" key is present. There is no reason to
have this restriction. We can allow this and emit an empty section instead.

This opens road for a simplification and generalization of the code in `validate()`
that is discussed in the D89039 thread.

Depends on D89039.

Differential revision: https://reviews.llvm.org/D89391

[libc][NFC] Add probability distributions for memory function sizes

This patch adds memory function size distributions sampled from different applications running in production.
This will be used to benchmark and compare memory functions implementations.

Differential Revision: https://reviews.llvm.org/D89401

[yaml2obj/obj2yaml] - Add support of 'Size' and 'Content' keys for all sections.

Many sections either do not have a support of `Size`/`Content` or support just a
one of them, e.g only `Content`.

`Section` is the base class for sections. This patch adds `Content` and `Size` members
to it and removes similar members from derived classes. This allows to cleanup and
generalize the code and adds a support of these keys for all sections (`SHT_MIPS_ABIFLAGS`
is a only exception, it requires unrelated specific changes to be done).

I had to update/add many tests to test the new functionality properly.

Differential revision: https://reviews.llvm.org/D89039

[TargetLowering] Replace Log2_32_Ceil with Log2_32 in SimplifySetCC ctpop combine.

This combine can look through (trunc (ctpop X)). When doing this
it tries to make sure the trunc doesn't lose any information
from the ctpop. It does this by checking that the truncated type
has more bits that Log2_32_Ceil of the ctpop type. The Ceil is
unnecessary and pessimizes non-power of 2 types.

For example, ctpop of i256 requires 9 bits to represent the max
value of 256. But ctpop of i255 only requires 8 bits to represent
the max result of 255. Log2_32_Ceil of 256 and 255 both return 8
while Log2_32 returns 8 for 256 and 7 for 255

The code with popcnt enabled is a regression for this test case,
but it does match what already happens with i256 truncated to i9.
Since power of 2 is more likely, I don't think it should block
this change.

Differential Revision: https://reviews.llvm.org/D89412

[SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets

In most of lib/Target we know that we are not dealing with scalable
types so it's perfectly fine to replace TypeSize comparison operators
with their fixed width equivalents, making use of getFixedSize()
and so on.

Differential Revision: https://reviews.llvm.org/D89101

Increase timeout to find a dSYM in macos DownloadObjectAndSymbolFile

With a large dSYM over a slow home connection, the two minute timeout
would sometimes be exceeded, and we haven't seen instances of a
long timeout causing people any problems, so we're bumping it up.

640 seconds ought to be enough for anyone.

<rdar://problem/67759526>

[LLD] Set alignment as part of Characteristics in TLS table.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46473

LLD wasn't previously specifying any specific alignment in the TLS table's Characteristics field so the loader would just assume the default value (16 bytes). This works most of the time except if you have thread locals that want specific higher alignments (e.g. 32 as in the bug) *even* if they specify an alignment on the thread local. This change updates LLD to take the max alignment from tls section.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D88637

Revert "[LLD] Set alignment as part of Characteristics in TLS table."

Revert individual wip commits and will instead follow up with a
single commit with all the changes. Makes cherry-picking easier
and will contain all the right tags.

This reverts commit 32a4ad3b6ce6028a371b028cf06fa5feff9534bf.
This reverts commit 7fe13af676678815989a6d0ece684687953245e7.
This reverts commit 51fbc1bef657bb0f5808986555ec3517a84768c4.
This reverts commit f80950a8bb985c082b26534b0e157447bf803935.
This reverts commit 0778cad9f325df4d7b32b22f3dba201a16a0b8fe.
This reverts commit 8b70d527d7ec1c8b9e921177119a0d906ffad4f0.

Fix llvm-symbolizer assembly-based test to require x86 and specify x86 when assembling

llvm-symbolizer: Exit non-zero when DWARF parsing errors have been rendered

Fix typeo in attach failed error message.
<rdar://problem/70296751>

[AMDGPU] Pre-commit test for D89187

llvm-symbolizer: Ensure non-zero exit when an error is printed

(this doesn't cover all cases - libDebugInfoDWARF has a default error
handler that prints errors without any exit code handling - I'll be
following up with a patch for that after this)

[mlir][SPIRV] Adding an attribute to capture configuration for cooperative matrix operations.

Each hardware that supports SPV_C_CooperativeMatrixNV has a list of
configurations that are supported natively. Add an attribute to
specify the configurations supported to the `spv.target_env`.

Reviewed By: antiagainst, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D89364

llvm-dwarfdump: Exit non-zero on an error path

Perform lvalue conversions on the left of a pseudo-destructor call 'p->~T()'.

Previously we failed to convert 'p' from array/function to pointer type,
and to represent the load of 'p' in the AST. The latter causes problems
for constant evaluation.

clang-{tools,unittests}: Stop using SourceManager::getBuffer, NFC

Update clang-tools-extra, clang/tools, clang/unittests to migrate from
`SourceManager::getBuffer`, which returns an always dereferenceable
`MemoryBuffer*`, to `getBufferOrNone` or `getBufferOrFake`, both of
which return a `MemoryBufferRef`, depending on whether the call site was
checking for validity of the buffer. No functionality change intended.

Differential Revision: https://reviews.llvm.org/D89416

clang/StaticAnalyzer: Stop using SourceManager::getBuffer

Update clang/lib/StaticAnalyzer to stop relying on a `MemoryBuffer*`,
using the `MemoryBufferRef` from `getBufferOrNone` or the
`Optional<MemoryBufferRef>` from `getBufferOrFake`, depending on whether
there's logic for checking validity of the buffer. The change to
clang/lib/StaticAnalyzer/Core/IssueHash.cpp is potentially a
functionality change, since the logic was wrong (it checked for
`nullptr`, which was never returned by the old API), but if that was
reachable the new behaviour should be better.

Differential Revision: https://reviews.llvm.org/D89414

[AArch64] Combine UADDVs to generate vector add

ADD(UADDV a, UADDV b) --> UADDV(ADD a, b)

This partially solves the bug: https://bugs.llvm.org/show_bug.cgi?id=46888
Meta ticket: https://bugs.llvm.org/show_bug.cgi?id=46929

Differential Revision: https://reviews.llvm.org/D88731

clang/CodeGen: Stop using SourceManager::getBuffer, NFC

Update `clang/lib/CodeGen` to use a `MemoryBufferRef` from
`getBufferOrNone` instead of `MemoryBuffer*` from `getBuffer`. No
functionality change here.

Differential Revision: https://reviews.llvm.org/D89411

clang/Frontend: Mostly stop using SourceManager::getBuffer, NFC

Update clang/lib/Frontend to use a `MemoryBufferRef` from
`getBufferOrFake` instead of `MemoryBuffer*` from `getBuffer`, with the
exception of `FrontendInputFile`, which I'm leaving for later.

Differential Revision: https://reviews.llvm.org/D89409

[dsymutil] Fix handling of aliases to private external symbols

dsymutil was incorrectly ignoring aliases to private extern symbols in
the MachODebugMapParser. This resulted in spurious warnings about not
being able to find symbols.

rdar://49652389

Differential revision: https://reviews.llvm.org/D89444

clang/Basic: Stop using SourceManager::getBuffer, NFC

Update clang/lib/Basic to stop relying on a `MemoryBuffer*`, using the
`MemoryBufferRef` from `getBufferOrNone` or `getBufferOrFake` instead of
`getBuffer`.

Differential Revision: https://reviews.llvm.org/D89394

[LLD] Set alignment as part of Characteristics in TLS table.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46473

LLD wasn't previously specifying any specific alignment in the TLS table's Characteristics field so the loader would just assume the default value (16 bytes). This works most of the time except if you have thread locals that want specific higher alignments (e.g. 32 as in the bug) *even* if they specify an alignment on the thread local. This change updates LLD to take the max alignment from tls section.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D88637

Nit: Use early return to reduce indentation.

Mask out existing alignment bits.

Update tests.

Fix style warnings.

[LLD] Set alignment as part of Characteristics in TLS table.

Differential Revision: https://reviews.llvm.org/D88637

[AMDGPU] Correct typos in SIMemoryLegalizer.cpp comments

Revert "[CMake] Avoid accidental C++ standard library dependency in sanitizers"

This reverts commit 287c318690f19fcbe337211278798d97ccf7884e which broke
sanitizer tests that use C++ standard library.

[CMake] Avoid accidental C++ standard library dependency in sanitizers

While sanitizers don't use C++ standard library, we could still end
up accidentally including or linking it just by the virtue of using
the C++ compiler. Pass -nostdinc++ and -nostdlib++ to avoid these
accidental dependencies.

Differential Revision: https://reviews.llvm.org/D88922

PR47805: Use a single object for a function parameter in the caller and
callee in constant evaluation.

We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.

This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.

[VE] Add vector load/store instructions

Add vector registers and vector load/store instructions. Add
regression tests for vector load/store instructions too.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89183

Revert "[ASTImporter] Fix crash caused by unset AttributeSpellingListIndex"

This broke the GreenDragon build, due to the following error while running
TestImportBuiltinFileID:

```
Ignored/unknown shouldn't get here
UNREACHABLE executed at tools/clang/include/clang/Sema/AttrSpellingListIndex.inc:13!
```

See http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/24213/

This reverts commit 73c6beb2f7053fe8b5150072c2b5cd930de38a22.
This reverts https://reviews.llvm.org/D89318

[VE] Change to expand SHL_PARTS/SRA_PARTS/SRL_PARTS

VE doesn't have SHL_PARTS/SRA_PARTS/SRL_PARTS instructions, so need
to expand them. Add regression tests too.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89396

[AArch64][GlobalISel] Don't use explicit zero registers for compare results.

These cause problems for later optimizations, just using an unused vreg like
SelectionDAG generates better code in the end, and obviates the need for some
GISel specific flag optimizations.

Differential Revision: https://reviews.llvm.org/D89419

[ADT] Use alignas + sizeof for inline storage, NFC

AlignedCharArrayUnion is really only needed to handle the "union" case
when we need memory of suitable size and alignment for multiple types.
SmallVector only needs storage for one type, so use that directly.

[clang][NFC] Change diagnostic to start with lowercase letter

[Format/ObjC] Add NS_SWIFT_NAME() and CF_SWIFT_NAME() to WhitespaceSensitiveMacros

The argument passed to the preprocessor macros `NS_SWIFT_NAME(x)` and
`CF_SWIFT_NAME(x)` is stringified before passing to
`__attribute__((swift_name("x")))`.

ClangFormat didn't know about this stringification, so its custom parser
tried to parse the argument(s) passed to the macro as if they were
normal function arguments.

That means ClangFormat currently incorrectly inserts whitespace
between `NS_SWIFT_NAME` arguments with colons and dots, so:

```
extern UIWindow *MainWindow(void) NS_SWIFT_NAME(getter:MyHelper.mainWindow());
```

becomes:

```
extern UIWindow *MainWindow(void) NS_SWIFT_NAME(getter : MyHelper.mainWindow());
```

which clang treats as a parser error:

```
error: 'swift_name' attribute has invalid identifier for context name [-Werror,-Wswift-name-attribute]
```

Thankfully, D82620 recently added the ability to treat specific macros
as "whitespace sensitive", meaning their arguments are implicitly
treated as strings (so whitespace is not added anywhere inside).

This diff adds `NS_SWIFT_NAME` and `CF_SWIFT_NAME` to
`WhitespaceSensitiveMacros` so their arguments are implicitly treated
as whitespace-sensitive.

Test Plan:
New tests added. Ran tests with:
% ninja FormatTests && ./tools/clang/unittests/Format/FormatTests

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D89425

Register TargetCXXABI.def as a textual header

[mlir][Linalg] Rethink fusion of linalg ops with reshape ops.

The current fusion on tensors fuses reshape ops with generic ops by
linearizing the indexing maps of the fused tensor in the generic
op. This has some limitations
- It only works for static shapes
- The resulting indexing map has a linearization that would be
potentially prevent fusion later on (for ex. tile + fuse).

Instead, try to fuse the reshape consumer (producer) with generic op
producer (consumer) by expanding the dimensionality of the generic op
when the reshape is expanding (folding). This approach conflicts with
the linearization approach. The expansion method is used instead of
the linearization method.

Further refactoring that changes the fusion on tensors to be a
collection of patterns.

Differential Revision: https://reviews.llvm.org/D89002

Make header self-contained. NFC.