review.tizen.org Git - platform/upstream/llvm.git/log

[CodeGenCXX] Convert some tests to opaque pointers (NFC)

Conversion done using the script at
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34.

These are tests where the conversion worked out of the box and no
manual fixup was performed.

Revert "[CMake] Provide Findzstd module"

This reverts commit 849059861c63f5d89a6956191ecb8da8acf16bdb.

This breaks running llvm tests:

llvm-lit: /home/npopov/repos/llvm-project/llvm/utils/lit/lit/TestingConfig.py:138: fatal: unable to parse config file '/home/npopov/repos/llvm-project/build/test/lit.site.cfg.py', traceback: Traceback (most recent call last):
  File "/home/npopov/repos/llvm-project/llvm/utils/lit/lit/TestingConfig.py", line 127, in load_from_path
    exec(compile(data, path, 'exec'), cfg_globals, None)
  File "/home/npopov/repos/llvm-project/build/test/lit.site.cfg.py", line 48, in <module>
    config.have_zstd = FALSE
NameError: name 'FALSE' is not defined

Revert "Revert "[clang][Lex] Fix a crash on malformed string literals""

This reverts commit feea7ef23cb1bef92d363cc613052f8f3a878fc2.
Drops the test case, see https://reviews.llvm.org/D135161#3839510

[clangd] Avoid scanning up to end of file on each comment!

Assigning char* (pointing at comment start) to StringRef was causing us
to scan the rest of the source file looking for the null terminator.

This seems to be eating about 8% of our *total* CPU!

While fixing this, factor out the common bits from the two places we're
parsing IWYU pragmas.

Differential Revision: https://reviews.llvm.org/D135314

[clang] Add Create method for CXXBoolLiteralExpr

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D135256

[Local] Fix unused variable warnings (NFC)

[AA] Update unit test missed in previous commit (NFC)

Missed this unit test use in 3d0b5f019e85eab5e7153516d3f6e9e4f2b9135b.

[CMake] Provide Findzstd module

This module is used to find the system zstd library. The imported
targets intentionally use the same name as the generate zstd config
CMake file so these can be used interchangeably.

Differential Revision: https://reviews.llvm.org/D134990

[AA] Remove unused template argument from AAResultBase (NFC)

After D94363, there is no more need to use CRTP here.

Detect Visual Studio in Windows packaging script

Instead of hardcoding a specific VS install, try sequentially:

- %VSINSTALLDIR% (already set from a vs prompt)
- 2019/Enterprise
- 2019/Professional
- 2019/Community
- 2019/BuildTools

It stops when one is found and set vsdevcmd env var.

Differential revision: https://reviews.llvm.org/D135173

[SourceManager] Improve getFileIDLoaded.

Similar to getFileIDLocal patch, but for the version for load module.

Test with clangd (building AST with preamble), FileID scans in binary
search is reduced:

SemaExpr.cpp: 142K -> 137K (-3%)
FindTarget.cpp: 368K -> 343K (-6%)

Differential Revision: https://reviews.llvm.org/D135258

[ConstraintElimination] Extend test coverage for AND chains.

[AA] Pass AAResults through AAQueryInfo

Currently, AAResultBase (from which alias analysis providers inherit)
stores a reference back to the AAResults aggregation it is part of,
so it can perform recursive alias analysis queries via
getBestAAResults().

This patch removes the back-reference from AAResultBase to AAResults,
and instead passes the used aggregation through the AAQueryInfo.
This can be used to perform recursive AA queries using the full
aggregation.

Differential Revision: https://reviews.llvm.org/D94363

[clang][Tooling] Move STL recognizer to its own library

As pointed out in https://reviews.llvm.org/D119130#3829816, this
introduces a clang AST dependency to the clangToolingInclusions, which is used
by clang-format.

Since rest of the inclusion tooling doesn't depend on clang ast, moving this
into a separate library.

Differential Revision: https://reviews.llvm.org/D135245

[AA] Thread AAQI through getModRefBehavior() (NFC)

This is in preparation for D94363, as we will need AAQI to
perform the recursive call to the function variant.

[clang] Remove CLANG_ENABLE_OPAQUE_POINTERS cmake option

Remove the ability to disable opaque pointers by default in clang.
It is still possible to explicitly disable them via cc1
-no-opaque-pointers.

Differential Revision: https://reviews.llvm.org/D135259

[ConstraintElimination] Extend test coverage for OR chains.

[DAG] Update `isKnownNeverNaN` for `FMA/FMAD`

We can still get a NaN even if none of the operands are NaN,
e.g. from +inf/-inf. D50804 didn't catch that.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134854

[ConstraintElimination] Use ConstraintTy::IsSigned instead of Predicate.

This should be NFC and ensure the sign of the constraint is used
consistently in the future.

[AMDGPU][GISel] Add missing V2S16 BUILD_VECTOR_TRUNC legalization

Previously we would be unable to legalize V2S16 BUILD_VECTOR_TRUNC on GFX8 & below as the custom legalization was missing.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135149

[InstrProf] Make __llvm_profile_counter_bias_default hidden

This symbol shouldn't have default visibility.

Differential Revision: https://reviews.llvm.org/D135346

[mlir][tensor][bufferize] Bufferize inserts into equivalent tensors in-place

Inserting a tensor into an equivalent tensor is a no-op after bufferization. No alloc is needed.

Differential Revision: https://reviews.llvm.org/D132662

Fix d5090cd94, MSVC mangling issue

Evidently * and [] are mangled differently by MSVC...

[llvm-driver] Add various tools to the llvm-driver

The llvm-driver, enabled with LLVM_TOOL_LLVM_DRIVER_BUILD combines many llvm executables
into one to save overall toolchain size. This patch adds a few more llvm tools to the
llvm-driver.

Differential Revision: https://reviews.llvm.org/D135281

[clang-format][NFC] Clean up class HeaderIncludes and Format.cpp

Differential Revision: https://reviews.llvm.org/D134852

[mlir][bazel] fix VectorToGPU bazel breakage

NOTE: this is probably not the long term organization
      that you want to keep after the refactoring to new
      directories, but this fixes the breakage for now;
      I leave proper refactoring of build to the NVGPU
      bazel team.

Differential Revision: https://reviews.llvm.org/D135344

[mlir][nvgpu] NFC - move NVGPU conversion helpers to NvGpu utils library

The ConvertVectorToGpu pass implementation contained a small private
support library for performing various calculations during conversion
between `vector` and `nvgpu.mma.sync` and `nvgpu.ldmatrix` operations.
The support library is moved under `Dialect/NVGPU/Utils` because the
functions have wider utility. Some documentation comments are added or
improved.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135303

[mlir][sparse] Favors defined dimension when optimize lattice points.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135337

[mlir][sparse] Fixing bug in python test

This is a followup to D135004, to correct one of the tests that didn't get caught by the buildbot.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135336

[SPIRV] read kernel arg attributes from fuction/module metadata

The patch introduces reading the attributes of kernel arguments both from
function-attached and module-level metadata, during kernel arguments lowering.
Two tests are added to show the improvement.

Differential Revision: https://reviews.llvm.org/D135106

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey.tretyakov@mail.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>

[mlir][sparse] Adjusting DimLevelType numeric values for faster predicates

This differential adjusts the numeric values for DimLevelType values: using the low-order two bits for recording the "No" and "Nu" properties, and the high-order bits for the formats per se. (The choice of encoding may seem a bit peculiar, since the bits are mapped to negative properties rather than positive properties. But this was done in order to preserve the collation order of DimLevelType values. If we don't care about collation order, then we may prefer to flip the semantics of the property bits, so that they're less surprising to readers.)

Using distinguished bits for the properties and formats enables faster implementation for the predicates detecting those properties/formats, which matters because this is in the runtime library itself (rather than on the codegen side of things). This differential pushes through the changes to the enum values, and optimizes the basic predicates. However it does not optimize all the places where we check compound predicates (e.g., "is compressed or singleton"), to help reduce rebasing conflict with D134933. Those optimizations will be done after this differential and D134933 are landed.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135004

[Sink] Allow sinking of invariant loads across critical edges

Invariant loads can always be sunk.

Reviewed By: foad, arsenm

Differential Revision: https://reviews.llvm.org/D135133

[mlir] vector.multi_reduction canonicalizes to vector.shape_cast (or
vector.extract, if the result is a scalar) only if all reduction
dimensions are of size 1.

Differential Revision: https://reviews.llvm.org/D135333

[NFC] Fix typo in error message.

Fix typo in error message. No other changes

Differential Revision: https://reviews.llvm.org/D135318

[clang-offload-bundler] extracting compatible bundle entry

In HIP a library is usually compiled with default target ID e.g. gfx906 so that
it can be used in all GPU configurations. The bitcode is saved in bundled
bitcode with gfx906 in entry ID.

In runtime compilation, a HIP program is compiled with a target ID matching
the GPU configuration, e.g. gfx906:xnack-. This program needs to link with
a library bundled bitcode with target ID gfx906.

For example:

clang --offload-arch=gfx906 -o lib.o lib.hip
clang --offload-arch=gfx906:xnack- program.hip lib.o

This common use case requires that clang-offlod-bundler to be able to extract
entry with compatible target ID, e.g. extracting an gfx906 entry when requesting
gfx906:xnack-.

Currently clang-offload-bundler only allow extracting entry with exact match
of target ID. This patch relaxes that so that it can extract entries with compatible
target ID.

Reviewed by: Artem Belevich, Saiyedul Islam

Differential Revision: https://reviews.llvm.org/D134546

Revert "[libc] Resolve NaN/implementation-defined behavior of floating-point tests"

This reverts commit 5470b1fcb5754b45c1233df7759411ff56942d6e.

[mlir][tosa] tosa.resize canonicalizer for trivial noop

If the scaling factor is by 1 with no offset or border, then the
resize is a no-op.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D135329

[libc] Resolve NaN/implementation-defined behavior of floating-point tests

Differential Revision: https://reviews.llvm.org/D134917

[mlir][sparse] further implement singleton dimension level type

Handle more cases of singleton DLT including direct sparse2sparse conversion. (Followup to D134096)

Depends On D134926

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D134933

[clang][deps] Canonicalize module map path

When dep-scanning, canonicalize the module map path as much as we can.
This avoids unnecessarily needing to build multiple versions of a module
due to symlinks or case-insensitive file paths.

Despite the name `tryGetRealPathName`, the previous implementation did
not actually return the realpath most of the time, and indeed it would
be incorrect to do so since the realpath could be outside the module
directory, which would have broken finding headers relative to the
module.

Instead, use a canonicalization that is specific to the needs of
modulemap files (canonicalize the directory separately from the
filename).

Differential Revision: https://reviews.llvm.org/D134923

[mlir][sparse] Case coverage fix no errorhandling

Restores the fix from D134925 for MSVC without breaking cpu runner.

Differential Revision: https://reviews.llvm.org/D135304

[AArch64][KCFI] Define Size for KCFI_CHECK

Specify the correct size for the KCFI_CHECK pseudo
instruction, which is lowered into six 4-byte instructions in
AArch64AsmPrinter::LowerKCFI_CHECK.

Link: https://github.com/ClangBuiltLinux/linux/issues/1730

ReduceOperands: Do not crash on vector of pointer types

Avoid crash in `reduceOperandsOneDeltaPass` function for operands with
vector of pointer type.

While on it add a `reduce-operands-ptr.ll` test in the spirit of the
existing `reduce-operands-int.ll`/`reduce-operands-fp.ll` tests.

Differential Revision: https://reviews.llvm.org/D135307

[mlir][sparse] move sparse tensor rewriting into its own pass

Makes individual testing and debugging easier.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D135319

Revert "Added canonicalization for vector.multi_reduction"

This reverts commit c16f3260a9255c7d9880f72de7d856f9ceeb1866.

There's a bug in the commit creates a scalar result with `ShapeCastOp`.
Reverting till that fix is done.

[RISCV][ISel] Finally fix the UBSan error

Forgot another SDValue check and a boolean initialization.

[flang] Keep current polymorphic implementation under a flag

It is useful for couple of test suite like NAG to keep failing
with a TODO until the polymorphic entities is implemented all the
way done to codegen.

This pass adds a flag to LoweringOptions for experimental development.
This flag is off by default and can be enable in `bbc` with `-polymorphic-type`.
Options can be added in the driver and tco when needed.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D135283

[clangd] Avoid lexicographic compare when sorting SymbolIDs. NFC

These are 8 bytes and we don't care about the actual ordering, so use
integer compare.

The array generated code has some extra byte swaps (clang), calls memcmp (gcc)
or inlines a big chain of comparisons (MSVC): https://godbolt.org/z/e79r6jM6K

[RISCV][ISel] Attempt to fix UBSan error

Explicitly check an SDValue with the invalid SDValue.

UBSan reports:
runtime error: load of value 36, which is not a valid value for type
'bool'

https://lab.llvm.org/buildbot/#/builders/85/builds/11231

[RISCV][ISel] Fold extensions when all the users can consume them

This patch allows the combines that fold extensions in binary operations
to have more than one use.
The approach here is pretty conservative: if all the users of an
extension can fold the extension, then the folding is done, otherwise we
don't fold.
This is the first step towards avoiding the one-use limitation.

As a result, we make a decision to fold/don't fold for a web of
instructions. An instruction is part of the web of instructions as soon
as it consumes an extension that needs to be folded for all its users.

Because of how SDISel works a web of instructions can be visited over
and over. More precisely, if the folding happens, it happens for the
whole web and that's the end of it, but if the folding fails, the whole
web may be revisited when another member of the web is visited.

To avoid a compile time explosion in pathological cases, we bail out
earlier for webs that are bigger than a given threshold (arbitrarily set
at 18 for now.) This size can be changed using
`--riscv-lower-ext-max-web-size=<maxWebSize>`.

At the current time, I didn't see a better scheme for that. Assuming we
want to stick with doing that in SDISel.

Differential Revision: https://reviews.llvm.org/D133739

[AArch64] Add tablegen patterns for bf16 trn/zip/uzp.

This adds some missing tablegen patterns to handle trn1/trn2/zip1/zip2/uzp1/uzp2,
similar to the Arm handling in 5e1a9d319d2ee5d59a151e1d82f, but via tablegen
patterns for the AArch64 backend.

[libc++] Fix wrong implementation of CityHash

As PR56606 stated, the current implementation of CityHash in libc++
would drop some bits unintentionally. Cast the 32bit int to the 64bit
int to avoid this happened.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D134124

Use inheriting ctors for OSTargetInfo

(& remove PSPTargetInfo because it's unused - it had the wrong ctor in
it anyway, so wouldn't've been able to be instantiated - must've
happened due to bitrot over the years)

[libc++][format] Implements formattable concept.

This concept is introduced in P2286, but was implemented in libc++
before. This implementation was used in the library internally. This
implementation lacked the resolution of LWG3636. The original formatter
had a non-const member function that wasn't trivial to make a const
member. The recent parser improvements made this member a const member
in preparation of LWG3636.

Note LWG3636 isn't voted in. Its status is Ready. P2286's concept has
been written as-if LWG3636 is accepted and refers to that LWG issue.

Updates some tests make format a const member function and removes a
tests that's mainly a duplicate of the formattable concept test.

Implements
- LWG3636 formatter<T>::format should be const-qualified

Implements parts of
- P2286R8 Formatting Ranges

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D134110

[mlir][tosa] Update TOSA resize to match specification

Attribute stride and shift are removed, and has new scale and border.

Signed-off-by: TatWai Chong <tatwai.chong@arm.com>
Change-Id: I6cdbeb3978f5ee540bc6cf59eb7c273eb0131430

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D131629

[clang] Update ModuleMap::getModuleMapFile* to use FileEntryRef

Update SourceManager::ContentCache::OrigEntry to keep the original
FileEntryRef, and use that to enable ModuleMap::getModuleMapFile* to
return the original FileEntryRef. This change should be NFC for
most users of SourceManager::ContentCache, but it could affect behaviour
for users of getNameAsRequested such as in compileModuleImpl. I have not
found a way to detect that difference without additional functional
changes, other than incidental cases like changes from / to \ on
Windows so there is no new test.

Differential Revision: https://reviews.llvm.org/D135220

[AArch64] Add tests for i128 comparison; NFC

Baseline tests for D135302.

[mlir:Parser] Always splice parsed operations to the end of the parsed block

The current splicing behavior dates back to when all blocks had terminators,
so we would "helpfully" splice before the terminator. This doesn't make sense
anymore, and leads to somewhat unexpected results when parsing multiple
pieces of IR into the same block.

Differential Revision: https://reviews.llvm.org/D135096

[clang][ExtractAPI] Don't print locations for anonymous tags

ExtractAPI doesn't care about locations of anonymous TagDecls. Set the
printing policy to exclude that from anonymous decl names.

Differential Revision: https://reviews.llvm.org/D135295

[mlir][gpu] Introduce `host_shared` flag to `gpu.alloc`

Motivation: we have lowering pipeline based on upstream gpu and spirv dialects and and we are using host shared gpu memory to transfer data between host and device.
Add `host_shared` flag to `gpu.alloc` to distinguish between shared and device-only gpu memory allocations.

Differential Revision: https://reviews.llvm.org/D133533

[flang] Fixed build issue after 88f07a736bbc3f0062d7d8f4032f0b54aff5c018

nullptr matches both against ::mlir::UnitAttr and ::mlir::TypeRange,
so the following two candidates fit:
static void mlir::omp::OrderedRegionOp::build(::mlir::OpBuilder &odsBuilder,
    ::mlir::OperationState &odsState,
    /*optional*/::mlir::UnitAttr simd)
static void mlir::omp::OrderedRegionOp::build(::mlir::OpBuilder &odsBuilder,
    ::mlir::OperationState &odsState,
    ::mlir::TypeRange resultTypes,
    /*optional*/bool simd = false)

[clang/Sema] Fix non-deterministic order for certain kind of diagnostics

In the context of caching clang invocations it is important to emit diagnostics in deterministic order;
the same clang invocation should result in the same diagnostic output.

rdar://100336989

Differential Revision: https://reviews.llvm.org/D135118

[DeviceRTL] Allow IsSPMDMode to be optimized out in LTO mode

A previous patch merged the static and bitcode versions of the
deviceRTL. We previously used the static library's separate compilation
to set a special flag that prevented `IsSPMDMode` from being put in the
used list and preventing it from being optimized out. When they were
merged we could no longer do this separate compilation that allowed
users of LTO to get more optimal code.

This patch rearranges the code. The `IsSPMDMode` global is now
transitively used by its inclusion in the changed `__keep_alive`
function. This allows us to then manually delete the `__keep_alive`
function from the module when building the static library via
`llvm-extract`. The result is that the bitcode library correctly will
maintain the needed shared state, while the static library will be able
to internalize it and optimize it out.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135280

[OpenMP] Make the exec_mode global have protected visibility

We use protected visibility for almost everything with offloading. This
is because it provides us with the ability to read things from the host
without the expectation that it will be preempted by a shared library
load, bugs related to this have happened when offloading to the host.
This patch just makes the `exec_mode` global generated for each plugin
have protected visibility.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135285

[clangd] Fix non-idempotent cases of canonicalRenameDecl()

The simplest way to ensure full canonicalization is to canonicalize
recursively in most cases.

This fixes an assertion failure and presumably correctness bugs.

It does show up that D132797's index-based virtual method renames doesn't handle
templates well (the AST behavior is different and IMO better).
We could choose to disable in this case or change the index behavior,
but this patch doesn't do either.

Differential Revision: https://reviews.llvm.org/D133415

[mlir][arith] Add shli support to WIE

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135234

[mlir][NFC] Make 'printOp' public in AsmPrinter

This patch moves the 'printOp' functionality to the public API of
AsmPrinter and rename it to 'printCustomOrGenericOp'. No 'parseOp'
is needed at this time as existing APIs are able to parse operations
producing results where results are omitted in the textual form
(the LHS of an operation is redundant when it comes to building the
operation itself as it only contains the result names).

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D135006

[clangd] Don't clone SymbolSlab::Builder arenas when finalizing.

SymbolSlab::Builder has an arena to store strings of owned symbols, and
deduplicates them. build() copies all the strings and deduplicates them again!
This is potentially useful: we may have overwritten a symbol and
rendered some strings unreachable.

However in practice this is not the case. When testing on a variety of
files in LLVM (e.g. SemaExpr.cpp), the strings for the full preamble
index are 3MB and shrink by 0.4% (12KB). For comparison the serializde
preamble is >50MB.
There are also hundreds of smaller slabs (file sharding) that do not shrink at
all.

CPU time spent on this is significant (something like 3-5% of buildPreamble).
We're better off not bothering.

Differential Revision: https://reviews.llvm.org/D135231

Fix build failures from 4f6477a615d18dac6cc3aa0d3fb946191f8168c9.

Added canonicalization for vector.multi_reduction

If there are reductions only along unit dimensions, then they are folded

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D134996

Revert "[flang] Add -fpass-plugin option to Flang frontend"

This reverts commit 43fe6f7cc35ded691bbc2fa844086d321e705d46.

Reverting this as CI breaks.

To reproduce, run check-flang, and it will fail with an error saying
.../lib/Bye.so not found in pass-plugin.f90

[llvm][NFC] Consolidate equivalent function type parsing code into
single function

Differential Revision: https://reviews.llvm.org/D135296

[mlir][arith] Add andi, ori, and xori support to WIE

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135204

[Dwarf] Remove unnecessary module flags from test

These extra module flags are not needed for this test, so remove them. In fact, leaving them in produces the following error message:

> invalid behavior operand in module flag (unexpected constant)

Differential Revision: https://reviews.llvm.org/D135294

[libc][Obvious] Add "__" prefix to sched_getcpucount in the spec and elsewhere.

Without this fix, the declaration in sched.h will not have the "__" prefix and
will cause a compile failure.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D135286

[mlir][mlprogram] Add CAPI project for MLProgram

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D135291

[RISCV][ISel] Refactor the formation of VW operations

This patch centralizes all the combines of add|sub|mul with extended
operands in one "framework".
The rationale for this change is to offer a one-stop-shop for all these
transformations so that, in the future, it is easier to make combine
decisions for a web of instructions (i.e., instructions connected
through s|zext operands).

Technically this patch is not NFC because the new version is more
powerful than the previous version.
In particular, it diverges in two cases:
- VWMULSU can now also be produced from `mul(splat, zext)`, whereas
  previously only `mul(sext, splat)` were supported when `splat`s were
  involved. (As demonstrated in rvv/fixed-vectors-vwmulsu.ll)
- VWSUB(U) can now also be produced from `sub(splat, ext)`, whereas
  previously only `sub(ext, splat)` were supported when `splat`s were
  involved. (As demonstrated in rvv/fixed-vectors-vwsub.ll)

If we wanted, we could block these transformations to make this
patch really NFC. For instance, we could do something similar to
`AllowSplatInVW_W`, which prevents the combines to form vw(add|sub)(u)_w
when the RHS is a splat.

Regarding the "framework" itself, the bulk of the patch is some
boilderplate code that abstracts away the actual extensions that are
present in the DAG. This allows us to handle `vwadd_w(ext a, b)` as if
it was a regular `add(ext a, ext b)`. Since the node `ext b` doesn't
actually exist in the DAG, we have a bunch of methods (all in the
NodeExtensionHelper class) that fake all that for us.

The other half of the change is around `CombineToTry` and
`CombineResult`. These helper structures respectively:
- Represent the kind of combines that can be applied to a node, and
- Store what needs to happen to do that combine.

This can be viewed as a two step approach:
- First, check if a pattern applies, and
- Second apply it.

The checks and the materialization of the combines are decoupled so that
in the future we can perform several checks and do all the related
applies in one go.

Differential Revision: https://reviews.llvm.org/D134703

[mlir] Make UnitAttr's default val in unwrapped builder

UnitAttr is optional but unwrapped builders require it. Make Change onstructing
from bool as required for when not set at moment (for UnitAttr nothing needs to
be constructed, this is true for others here too and can be addressed
together).

Differential Revision: https://reviews.llvm.org/D135058

[Sema][ObjC] Fix assertion failure in getCommonNonSugarTypeNode

Instead of checking that the protocols of both types are all equal,
check that the canonical decls are equal.

[aarch64] add missing run line to a test

The CHECK-IOS lines were added in 1c353419ab51f, but without a
matching FileCheck invocation. Add it.

The dead CHECK-IOS lines were found by Daniel Bertalan.

Differential Revision: https://reviews.llvm.org/D133772

Add `const` to `dump` method of `OpFoldResult`.

While most `dump` methods are marked `const`, some arent marked as
`const`. Adding `const` to `OpFoldResult` here since this was
encountered as an issue while debugging (doing `dump` within a debug
console threw an error indicating the method should be marked
`const`).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D135241

[mlir][Linalg] Expose vectorization precondition check as a utility function.

This patch exposes the method to check if an op can be vectorized or
not for downstream uses. Also adds a check to mark elementwise operations
that have non-vectorizable ops (like `tensor.extract`) as non vectorizable.

Reviewed By: nicolasvasilache, dcaballe, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135201

[lld][ELF] Fix lazy ThinLTO index writing in thin archives

Currently when the --thinlto-emit-index-files is used with LLD and a
thin archive is passed containing references to object files to link
against where the object files are in a different folder than the thin
archive and some of the archives aren't linked against (ie stay lazy),
the empty index file writer ends up trying to write to a path that
doesn't exist. This patch changes the behavior of that function to use
the path of the obj member of the BitcodeFile object rather than just
the path of the BitcodeFile object itself, which matches the behavior of
the default (non-lazy) case.

Fixes #57963

Regression test added.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D135014

Revert "[compiler-rt][test] Heed COMPILER_RT_DEBUG when compiling unittests"

Breaks some bots, details in https://reviews.llvm.org/D91620

This reverts commit 93b1256e38f63a81561288b9a90c5d52af63cb6e.

Revert "[mlir][sparse] Restore case coverage warning fix"

Breaks https://lab.llvm.org/buildbot/#/builders/168/builds/9288

This reverts commit 83839700c32996c58ddebc0c74e3dc4970e005bc.

[mlir][sparse] introduce a higher-order tensor mapping

This extension to the sparse tensor type system in MLIR
opens up a whole new set of sparse storage schemes, such as
block sparse storage (e.g. BCSR) and ELL (aka jagged diagonals).

This revision merely introduces the type extension and
initial documentation. The actual interpretation of the type
(reading in tensors, lowering to code, etc.) will follow.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D135206

[libc++][chrono] Implements formatter month.

Partially implements:
- P1361 Integration of chrono with text formatting

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D134138

Fix SourceManager::isBeforeInTranslationUnit bug with token-pasting

isBeforeInTranslationUnit compares SourceLocations across FileIDs by
mapping them onto a common ancestor file, following include/expansion edges.

It is possible to get a tie in the common ancestor, because multiple
"chunks" of a macro arg will expand to the same macro param token in the body:
  #define ID(X) X
  #define TWO 2
  ID(1 TWO)
Here two FileIDs both expand into `X` in ID's expansion:
- one containing `1` and spelled on line 3
- one containing `2` and spelled by the macro expansion of TWO
isBeforeInTranslationUnit breaks this tie by comparing the two FileIDs:
the one "on the left" is always created first and is numerically smaller.
This seems correct so far.

Prior to this patch it also takes a shortcut (unclear if intentionally).
Instead of comparing the two FileIDs that directly expand to the same location,
it compares the original FileIDs being compared. These may not be the
same if there are multiple macro expansions in between.
This *almost* always yields the right answer, because macro expansion
yields "trees" of FileIDs allocated in a contiguous range: when comparing tree A
to tree B, it doesn't matter what representative you pick.

However, the splitting of >> tokens is modeled as macro expansion (as if
the first '>' was a macro that expands to a '>' spelled a scratch buffer).
This splitting occurs retroactively when parsing, so the FileID allocated is
larger than expected if it were a real macro expansion performed during lexing.
As a result, macro tree A can be on the left of tree B, and yet contain
a token-split FileID whose numeric value is *greator* than those in B.
In this case the tiebreak gives the wrong answer.

Concretely:
  #define ID(X) X
  template <typename> class S{};
  ID(
    ID(S<S<int>> x);
    int y;
  )

  Given Greater = (typeloc of S<int>).getEndLoc();
        Y       = (decl of y).getLocation();
  isBeforeInTranslationUnit(Greater, Y) should return true, but returns false.

Here the common FileID of (Greater, Y) is the body of the outer ID
expansion, and they both expand to X within it.
With the current tiebreak rules, we compare the FileID of Greater (a split)
to the FileID of Y (a macro arg expansion into X of the outer ID).
The former is larger because the token split occurred relatively late.

This patch fixes the issue by removing the shortcut. It tracks the immediate
FileIDs used to reach the common file, and uses these IDs to break ties.
In the example, we now compare the macro arg expansion of the inner ID()
to the macro arg expansion of Y, and find that it is smaller.

This requires some changes to the InBeforeInTUCacheEntry (sic).
We store a little more data so it's probably slightly slower.
It was difficult to resist more invasive changes:
- performance: the sizing is very suspicious, and once the cache "fills up"
   we're thrashing a single entry
- API: the class seems to be needlessly complicated
However I tried to avoid mixing these with subtle behavior changes, and
will send a followup instead.

Differential Revision: https://reviews.llvm.org/D134685

[HLSL] Support register binding attribute on global variable

Allow register binding attribute on variables.

Report warning when register binding attribute applies to local variable or static variable.
It will be ignored in this case.

Type check for register binding is tracked with https://github.com/llvm/llvm-project/issues/57886.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D134617

[Dwarf] Reference the correct CU when inlining

Sometimes when a function is inlined into a different CU, `llvm-dwarfdump --verify` would find an inlined subroutine with an invalid abstract origin. This is because `DwarfUnit::addDIEEntry()` will incorrectly assume the inlined subroutine and the abstract origin are from the same CU if it can't find the CU for the inlined subroutine.

In the added test, the inlined subroutine for `bar()` is created before the CU for `B.swift` is created, so it tries to point to `goo()` in the wrong CU. Interestingly, if we swap the order of the two functions then we don't see a crash since the module for `goo()` is created first.

The fix is to give a parent DIE to `ScopeDIE` before calling `addDIEEntry()` so that its CU can be found. Luckily, `constructInlinedScopeDIE()` is only called once so we can pass it the DIE of the scope's parent and give it a child just after it's created.

`constructInlinedScopeDIE()` should always return a DIE, so assert that it is not null.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D135114

[mlir][unittest] Fix crash when building with MSVC 2022

The test Dialect/Affine/ops.mlir was failing when building with
Visual Studio 2022 version 17.3.5. This was caused by a bad MSVC codegen, when
capturing a `constexpr` in a lambda. The bug was reported to Microsoft, see
differential for more information.

Differential revision: https://reviews.llvm.org/D134227

[mlir] Fix ambiguity when building with Clang 14.0.6

Differential revision: https://reviews.llvm.org/D134219

[Orc] Fix the SharedMemoryMapper dtor

As briefly discussed on https://reviews.llvm.org/rG1134d3a03facccd75efc5385ba46918bef94fcb6, fix the unintended copy while iterating on Reservations and add a mutex guard, to be symmetric with other usages of Reservations.

Differential revision: https://reviews.llvm.org/D134212

[Syntax] Fix macro-arg handling in TokenBuffer::spelledForExpanded

A few cases were not handled correctly. Notably:
  #define ID(X) X
  #define HIDE a ID(b)
  HIDE
spelledForExpanded() would claim HIDE is an equivalent range of the 'b' it
contains, despite the fact that HIDE also covers 'a'.

While trying to fix this bug, I found findCommonRangeForMacroArgs hard
to understand (both the implementation and how it's used in spelledForExpanded).
It relies on details of the SourceLocation graph that are IMO fairly obscure.
So I've added/revised quite a lot of comments and made some naming tweaks.

Fixes https://github.com/clangd/clangd/issues/1289

Differential Revision: https://reviews.llvm.org/D134618

[libc++] Get rid of _LIBCPP_HAS_OPEN_WITH_WCHAR in the test suite

Differential Revision: https://reviews.llvm.org/D135163

[ConstraintElimination] Convert NewIndices to vector and rename (NFCI).

The callers of getConstraint only require a list of new variables.
Update the naming and types to make this clearer.

[AMDGPU] Fix V_CMP_CLASS_F16_t16_e64 src1 type.

For V_CMP_CLASS_F16_t16_e64 and V_CMPX_CLASS_F16_t16_e64,
https://reviews.llvm.org/D133723 changed the value type of src1 from i32 to i16.
These src1 operands are 16 bits, therefore need to be encoded as true16
operands. So the _e32 type was correctly set to VGPR_32_Lo128.
In _e64 form the operand class went from
VSrc_b32 to VSrc_b16. For some reason, we cannot encode inline literals for
VSrc_b16, see 5f5f566b265db00f577ead268400d99f34ba9cdd. In this phase of
the true16 implementation, VSrc_b16 and VSrc_b32 are still similar,
except from that quirk of inlines. So set the operand class to regain
that function.

Reviewed By: dp, arsenm

Differential Revision: https://reviews.llvm.org/D134897

[OpenMP][FIX] Update device API to match recent changes

[PhaseOrdering] Name instructions in test (NFC)

Run through opt -instnamer.

[libc++][chrono] Implements formatter year.

Partially implements:
- P1361 Integration of chrono with text formatting

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D133663