review.tizen.org Git - platform/upstream/llvm.git/log

[SLP] Add FMA test case with missing or partial fast-math flags.

Add extra FMA tests with missing or partial fast-math flags.

[lld-macho] Set the SG_READ_ONLY flag on __DATA_CONST

This flag instructs dyld to make the segment read-only after fixups have
been performed.

I'm not sure why this flag is needed, as on macOS 13 beta at least,
__DATA_CONST is read-only even without this flag; but ld64 sets it as
well.

Differential Revision: https://reviews.llvm.org/D133010

[DAG] extractShiftForRotate - replace assertion for shift opcode with an early-out

We feed the result from the first extractShiftForRotate call into the second, and that result might no longer be a shift op (usually due to constant folding).

NOTE: We REALLY need to stop creating nodes on the fly inside extractShiftForRotate!

Fixes Issue #57474

[amdgpu][nfc] Factor predicate out of findLDSVariablesToLower

[lld-macho][nfc] Simplify MarkLive.cpp using `if constexpr`

No significant perf diff, as expected.

             base           diff           difference (95% CI)
  sys_time   1.722 ± 0.030  1.727 ± 0.027  [  -0.6% ..   +1.2%]
  user_time  5.081 ± 0.032  5.087 ± 0.030  [  -0.2% ..   +0.4%]
  wall_time  6.008 ± 0.056  6.029 ± 0.053  [  -0.1% ..   +0.8%]
  samples    25             37

Reviewed By: #lld-macho, oontvoo, thakis, BertalanD

Differential Revision: https://reviews.llvm.org/D133014

[bazel overlay][libc] Add unistd targets.

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D133004

Further update -Wbitfield-constant-conversion for 1-bit bitfield

https://reviews.llvm.org/D131255 (82afc9b169a67e8b8a1862fb9c41a2cd974d6691)
began warning about conversion causing data loss for a single-bit
bit-field. However, after landing the changes, there were reports about
significant false positives from some code bases.

This alters the approach taken in that patch by introducing a new
warning group (-Wsingle-bit-bitfield-constant-conversion) which is
grouped under -Wbitfield-constant-conversion to allow users to
selectively disable the single-bit warning without losing the other
constant conversion warnings.
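
As an illustration only, here is a minimal sketch of the kind of code the single-bit warning is about. It assumes a two's-complement target, where a signed 1-bit field can hold only 0 and -1; `Flags` and `storeAndReadBack` are hypothetical names and do not come from the patch.

```cpp
#include <cassert>

// Hypothetical struct for illustration; not taken from the patch.
struct Flags {
  signed int enabled : 1; // a signed 1-bit field can represent only 0 and -1
};

// Stores a runtime value so the truncation is observable without relying on
// the compile-time diagnostic. Writing the literal 1 at the assignment below
// is the pattern the single-bit warning reports.
int storeAndReadBack(int value) {
  assert(value == 0 || value == 1);
  Flags f{};
  f.enabled = value; // 1 does not fit; it reads back as -1
  return f.enabled;
}
```

Code bases that assign 1 to such fields on purpose should be able to pass -Wno-single-bit-bitfield-constant-conversion and keep the rest of -Wbitfield-constant-conversion enabled.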

Differential Revision: https://reviews.llvm.org/D132851

[LV] Add test case where SCEV is needed to remove vector backedge.

Test case mentioned in the discussion for D115261.

Clarifying the documentation for diagnostic formats; NFC

While discussing diagnostic format strings with a GSoC mentee, it
became clear there was some confusion regarding how to use them.
Specifically, the documentation for %select caused confusion because
it was using %select{}2 and talking about how the integer value must
be in the range [0..2], which made it seem like the positional argument
was actually specifying the range of acceptable values.

I clarified several of the examples similarly, moved some documentation
to a more appropriate place, and added some additional information to
the %s modifier to point out that %plural exists.

[AArch64 - SVE]: Use SVE to lower reduce.fadd.

Differential Revision: https://reviews.llvm.org/D132573

Skip custom lowering for v1f64 so that it is expanded instead, because it has only one lane.

Differential Revision: https://reviews.llvm.org/D132959

[LV] Fix test cases where vector loop never executed.

It looks like the vector loops in the modified test cases
unintentionally never get executed. Update the exit conditions to ensure
they do, so that they are not optimized away in upcoming changes.

[LLParser] Add test for phi first class type error (NFC)

[LLParser] Allow zero-input phi nodes

Zero-input phi nodes are accepted by the verifier and bitcode reader,
but currently rejected by the IR parser. Allow them there as well.

Because phi nodes must have one entry for each predecessor, such
phis can only occur in blocks without predecessors, aka unreachable
code.

Usually, when removing the last predecessor from a block, we also
remove phi nodes in it. However, this is sometimes not possible for
invalidation reasons, which is why we ended up allowing
zero-entry phis at some point in the past. See 9eb2c0113dfe,
D92247 and PR48296 for context.

I've dropped the verifier unit test, because this is now covered
by the regular IR test.

This fixes at least part of https://github.com/llvm/llvm-project/issues/57446.

Differential Revision: https://reviews.llvm.org/D133000

[COFF] Use the more accurate GuardFlags definition everywhere

This also modifies llvm-readobj to be more future-proof when printing
the guard FIDs table by calculating the entry size correctly according
to MS docs.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D132924

[llvm-readobj][COFF] Print load config GuardFlags as enum flags

Print flags as documented in MS docs.
https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#load-configuration-layout
https://docs.microsoft.com/en-us/windows/win32/secbp/pe-metadata

EH_CONTINUATION_TABLE_PRESENT is not mentioned in the docs but is
instead taken from Windows SDK headers.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D132823

[clang] Silence a false positive GCC -Wunused-but-set-parameter warning with constexpr

This fixes the following warning:

    In file included from ../tools/clang/lib/Tooling/Transformer/Transformer.cpp:9:
    ../tools/clang/include/clang/Tooling/Transformer/Transformer.h: In instantiation of ‘llvm::Error clang::tooling::detail::populateMetadata(const clang::transformer::RewriteRuleWith<MetadataT>&, size_t, const clang::ast_matchers::MatchFinder::MatchResult&, clang::tooling::TransformerResult<T>&) [with T = void; size_t = long unsigned int]’:
    ../tools/clang/include/clang/Tooling/Transformer/Transformer.h:179:34:   required from ‘void clang::tooling::detail::WithMetadataImpl<T>::onMatchImpl(const clang::ast_matchers::MatchFinder::MatchResult&) [with T = void]’
    ../tools/clang/include/clang/Tooling/Transformer/Transformer.h:156:8:   required from here
    ../tools/clang/include/clang/Tooling/Transformer/Transformer.h:120:25: warning: parameter ‘SelectedCase’ set but not used [-Wunused-but-set-parameter]
      120 |                  size_t SelectedCase,
          |                  ~~~~~~~^~~~~~~~~~~~

The issue is fixed in GCC 10 and later, but this silences the noisy
warning in older versions. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85827
for more details about the bug.

Differential Revision: https://reviews.llvm.org/D132920

[AArch64-SVE-fixed]:
change vscale_range<2,0> to vscale_range<1,0> for 64/128-bit vectors of fadda tests

[bazel] Drop ConversionPassDetail, it shouldn't be needed after 67d0d7ac0acb0665d6a09f61278fbcf51f0114c2

[flang] Apply lower bounds correctly before runtime call to ubound

Apply lower bounds before call to the ubound runtime function.
This is done similarly in genLBound.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D133001

[DAG] visitFreeze - account for operand depth when calling isGuaranteedNotToBeUndefOrPoison (PR57402)

We were calling isGuaranteedNotToBeUndefOrPoison on operands (with Depth = 0), but weren't accounting for the fact that a later isGuaranteedNotToBeUndefOrPoison assertion will call from the new node (with Depth = 0 as well), which will then recursively call isGuaranteedNotToBeUndefOrPoison for its operands with Depth = 1.

Fixes #57402

[clang] update pr27699 test to make headers different (NFC)

Some build systems treat those headers as identical, causing a warning.

[ARM] Add a phase ordering test for MVE intrinsic remainder vectorization/unrolling. NFC

[MLIR] Update pass declarations to new autogenerated files

The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838

[clang][dataflow] Extend transfer functions for other `CFGElement`s

Previously, the transfer function `void transfer(const Stmt *, ...)` overridden by users was restricted to `CFGStmt`s and their contained `Stmt`s.

By using a transfer function (`void transfer(const CFGElement *, ...)`) that takes a `CFGElement` as input, this patch extends user-defined analysis to all kinds of `CFGElement`. For example, users can now handle `CFGInitializer`s, which contain `CXXCtorInitializer` AST nodes.

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D131614

[CostModel][X86] Replace CostKindCosts constructor with default values.

This improves static initialization of the cost tables and significantly speeds up MSVC compile time.

[InstCombine] Use getInsertionPointAfterDef() in freeze fold

This simplifies the code and fixes handling of catchswitch, in
which case we have no insertion point for the freeze.

Originally part of D129660.

[clang-tidy] Fix modernize-use-emplace to support alias cases

Fix modernize-use-emplace to support alias cases

Reviewed By: njames93

Differential Revision: https://reviews.llvm.org/D132640

[libclc] Quote addition of CLC/LLAsm flags

Otherwise cmake will insert a semicolon if flags are already set.

Differential Revision: https://reviews.llvm.org/D131490

[Reassociate] Use getInsertionPointAfterDef()

This simplifies the code and fixes handling for the callbr case,
where the instruction needs to be inserted in the normal
destination, rather than after the terminator.

Originally part of D129660.

Remove `REQUIRES: x86-registered-target` from ps4/ps5 driver tests

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D132950

[IR] Add Instruction::getInsertionPointAfterDef()

Transforms occasionally want to insert an instruction directly
after the definition point of a value. This involves quite a few
different edge cases, e.g. for phi nodes the next insertion point
is not the next instruction, and for invokes and callbrs it's not
even in the same block. Additionally, the insertion point may not
exist at all if catchswitch is involved.

This adds a general Instruction::getInsertionPointAfterDef() API to
implement the necessary logic. For now it is used in two places
where this should be mostly NFC. I will follow up with additional
uses where this fixes specific bugs in the existing implementations.

Differential Revision: https://reviews.llvm.org/D129660

[mlir][OpenMP] Apply ClangTidy readability finding.

Use .empty() check instead of size() check.

[lld-macho] Support synthesizing __TEXT,__init_offsets

This section stores 32-bit `__TEXT` segment offsets of initializer
functions, and is used instead of `__mod_init_func` when chained fixups
are enabled.

Storing the offsets lets us avoid emitting fixups for the initializers.
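
A minimal sketch of the idea, not lld's implementation: an offset from the start of the segment stays the same wherever the image is loaded, which is presumably why no fixup is needed. `toInitOffsets` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts absolute initializer addresses into 32-bit
// offsets from the start of the __TEXT segment.
std::vector<uint32_t> toInitOffsets(uint64_t textSegmentAddr,
                                    const std::vector<uint64_t> &initAddrs) {
  std::vector<uint32_t> offsets;
  offsets.reserve(initAddrs.size());
  for (uint64_t addr : initAddrs) {
    // Each initializer must live in the segment and fit in a 32-bit offset.
    assert(addr >= textSegmentAddr && addr - textSegmentAddr <= UINT32_MAX);
    offsets.push_back(static_cast<uint32_t>(addr - textSegmentAddr));
  }
  return offsets;
}
```

For a segment at 0x100000000 with initializers at 0x100003f20 and 0x100004000, this yields the offsets 0x3f20 and 0x4000.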

Differential Revision: https://reviews.llvm.org/D132947

Revert "[clang] Fix a crash in constant evaluation"

This reverts commit a5ab650714d05c2e49ec158dc99156118a893027.

[clang] Fix a crash in constant evaluation

This was showing up in our internal crash collector. I have no idea how
to test it, though; I am open to suggestions if there is an easy path, but
otherwise I'd move forward with the patch.

Differential Revision: https://reviews.llvm.org/D132918

[GVN] Add another test for phi translation miscompile (NFC)

[SPIR-V] Use llvm::Optional for builtin lowering result.

Replace result type std::pair<bool, bool> of lowerBuiltin with
a nice and convenient Optional<bool>.
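
A rough sketch of why an optional boolean can stand in for a pair of booleans, using std::optional in place of llvm::Optional. The meaning of the three states and the name `lowerBuiltinSketch` are assumptions made for illustration; the real lowerBuiltin may differ.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a builtin lowering routine. The three outcomes
// below are an assumption made for this sketch:
//   std::nullopt -> the call is not a builtin at all
//   true         -> a builtin that was lowered successfully
//   false        -> a builtin whose lowering failed
std::optional<bool> lowerBuiltinSketch(const std::string &name, bool canLower) {
  // The "builtin_" prefix is made up; it only marks which names count.
  if (name.rfind("builtin_", 0) != 0)
    return std::nullopt;
  return canLower;
}
```

A caller can test `has_value()` first and then read the boolean, instead of keeping track of which half of a pair means what.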

Reviewed By: iliya-diyachkov, MaskRay

Differential Revision: https://reviews.llvm.org/D132802

[LoongArch] Support floating-point number reciprocal

Differential Revision: https://reviews.llvm.org/D132847

[DirectX backend] change MinVectorRegisterBitWidth to 32.

This is to avoid vector-combine generating vector4 on float.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D132826

[SLPVectorizer] Fix -Wunused-lambda-capture in -DLLVM_ENABLE_ASSERTIONS=off build

[clang-format] Fix a bug in inserting braces at trailing comments

If the style wraps control statement braces, the opening braces
should be inserted after the trailing comments if present.

Fixes #57419.

Differential Revision: https://reviews.llvm.org/D132905

[libc][doc] Update implementation status of atanf and atanhf.

[NFC] Add an invalid test case for clang/test/CXX/module/module.reach/ex1.cpp

[mlir][OpenMP] Translation to LLVM IR for omp.taskgroup

This patch adds translation from OpenMP Dialect to LLVM IR for
omp.taskgroup. This patch also adds missing tests for the clauses in
omp.taskgroup operation.

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D130157

[RISCV] Add cost model for select and integer compare instructions.

This patch adds cost model for vector select and integer compare instructions.

[docs] Add "Standard C++ Modules"

Some standard C++ modules work landed in clang 15.x, but we lack
user documentation for it. The implementation of standard C++ modules
shares a large part of its code with clang modules, but the two have very
different semantics and user interfaces, so I think it is necessary to
add a document for standard C++ modules. Previously, some people also
asked for documentation on standard C++ modules and I couldn't
offer any at the time.

Reviewed By: iains, Mordante, h-vetinari, ruoso, dblaikie, JohelEGP,
aaronmondal

Differential Revision: https://reviews.llvm.org/D131388

[RISCV][test] Add cost model coverage for compare instructions.

Differential Revision: https://reviews.llvm.org/D132827

[InstCombine] add support for multi-use Y of (X op Y) op Z --> (Y op Z) op X

For (X op Y) op Z --> (Y op Z) op X
we can still do the transform when Y is multi-use. D131356 limited it to one-use;
this patch removes that limit.

This is still not a complete solution; I added a TODO test to show it.
In that case, X and Y are both multi-use, and we can't tell which way to convert based on this.
But at least we don't make the code worse, and it handles half of the scenarios.

[msan] Add more specific messages for use-after-destroy

Reviewed By: kda, kstoimenov

Differential Revision: https://reviews.llvm.org/D132907

[AtomicExpand] Make floating point conversion happen before fence insertion

IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations.
This also fixes atomic load of floating point values which requires fence on PowerPC.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D127609

Revert "[driver] Additional ignoring of module-map related flags, if modules are disabled"

This reverts commit 33162a81d4c93a53ef847d3601b0b03830937d3c.

This change breaks the usage of module maps with modules disabled, such
as for layering checking via `-fmodules-decluse`.

Regression test added.

[Clang] Fix lambda CheckForDefaultedFunction(...) so that it checks the CXXMethodDecl is a special member function before attempting to call DefineDefaultedFunction(...)

Sema::CheckCompletedCXXClass(...) uses a lambda, CheckForDefaultedFunction.
The CXXMethodDecl passed to CheckForDefaultedFunction may not be a special
member function, so it needs to check this before applying functions that only
apply to special member functions. It failed to do so before calling
DefineDefaultedFunction(...). This patch adds that check and a test to verify
we no longer crash.

This fixes https://github.com/llvm/llvm-project/issues/57431

Differential Revision: https://reviews.llvm.org/D132906

[llc] Use CPUStr instead of calling codegen::getMCPU(). NFC

`getCPUStr()` falls back to `getMCPU()`.

The only difference between `getCPUStr()` and `getMCPU()` is that
`getCPUStr()` handles `-mcpu=native`. That doesn't matter for this case.

This is just a simplification of the original code and it does not
change the functionality. So no new tests added.

Differential Revision: https://reviews.llvm.org/D132849

[ORC-RT] Make llvm-jitlink an ORC-RT specific dependence.

The llvm-jitlink tool is not needed by other sanitizer tests.

[Libomptarget] Remove old workaround for GCC 5,6 from libomptarget

Some code previously needed the `used` attribute to prevent the GCC
compiler versions 5 and 6 from removing it. This is no longer required
as the minimum supported GCC version for LLVM 16 is >=7.1.0.

Reviewed By: JonChesterfield, vzakhari

Differential Revision: https://reviews.llvm.org/D132976

[Docs][CodeReview] Add a small paragraph on adding tokens, NFC.

Reviewed By: whisperity

Differential Revision: https://reviews.llvm.org/D131500

[gn build] Port ea9ac3519c13

An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests.

Resubmission of https://reviews.llvm.org/D126254 where decodeBase64Byte is no longer a lambda but a static function. Some compilers have different errors or warnings with respect to what needs to be captured and what doesn't (see comments in https://reviews.llvm.org/D126254 for details).
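
For readers unfamiliar with the shape of such a decoder, a minimal sketch follows. It is not the LLVM implementation: only the name decodeBase64Byte comes from the message above, while `decodeBase64Sketch`, its return type, and the requirement for padded input are assumptions. Making the per-character helper a plain static function means there is nothing to capture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Maps one base64 character to its 6-bit value, or -1 if it is not part of
// the standard alphabet. A static function has no capture list to get wrong.
static int decodeBase64Byte(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Hypothetical decoder: expects '='-padded input whose length is a multiple
// of four, and returns std::nullopt on any malformed input.
std::optional<std::vector<uint8_t>> decodeBase64Sketch(const std::string &in) {
  if (in.size() % 4 != 0)
    return std::nullopt;
  std::vector<uint8_t> out;
  for (std::size_t i = 0; i < in.size(); i += 4) {
    // Padding is only accepted at the end of the final group of four.
    int pad = 0;
    if (i + 4 == in.size() && in[i + 3] == '=')
      pad = (in[i + 2] == '=') ? 2 : 1;
    int v[4] = {0, 0, 0, 0};
    for (int j = 0; j < 4 - pad; ++j) {
      v[j] = decodeBase64Byte(in[i + j]);
      if (v[j] < 0)
        return std::nullopt;
    }
    // Four 6-bit values pack into 24 bits, i.e. up to three output bytes.
    uint32_t triple = (static_cast<uint32_t>(v[0]) << 18) |
                      (static_cast<uint32_t>(v[1]) << 12) |
                      (static_cast<uint32_t>(v[2]) << 6) |
                      static_cast<uint32_t>(v[3]);
    out.push_back(static_cast<uint8_t>((triple >> 16) & 0xFF));
    if (pad < 2) out.push_back(static_cast<uint8_t>((triple >> 8) & 0xFF));
    if (pad < 1) out.push_back(static_cast<uint8_t>(triple & 0xFF));
  }
  return out;
}
```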

Differential Revision: https://reviews.llvm.org/D128560

Revert "[clang][deps] Split translation units into individual -cc1 or other commands"

Failing on some bots, reverting until I can fix it.

This reverts commit f80a0ea760728e70f70debf744277bc3aa59bc17.

[GlobalISel] Explicitly fail trying to translate `gc.statepoint` and related intrinsics

The provided testcase would previously fail with an assertion because later code tries to allocate registers for `token` return types and arguments. This is especially problematic as the process would then exit instead of falling back to using FastISel.

This patch fixes that by explicitly failing translation if either of these intrinsics is encountered.

Fixes https://github.com/llvm/llvm-project/issues/57349

Differential Revision: https://reviews.llvm.org/D132974

[clang][deps] Split translation units into individual -cc1 or other commands

Instead of trying to "fix" the original driver invocation by appending
arguments to it, split it into multiple commands, and for each -cc1
command use a CompilerInvocation to give precise control over the
invocation.

This change should make it easier to (in the future) canonicalize the
command-line (e.g. to improve hits in something like ccache), apply
optimizations, or start supporting multi-arch builds, which would
require different modules for each arch.

In the long run it may make sense to treat the TU commands as a
dependency graph, each with their own dependencies on modules or earlier
TU commands, but for now they are simply a list that is executed in
order, and the dependencies are simply duplicated. Since we currently
only support single-arch builds, there is no parallelism available in
the execution.

Differential Revision: https://reviews.llvm.org/D132405

[clang][modules] Don't hard code [no_undeclared_includes] for the Darwin module

The Darwin module has specified [no_undeclared_includes] for at least five years now, so there's no need to hard code it in the compiler.

Reviewed By: ributzka, Bigcheese

Differential Revision: https://reviews.llvm.org/D132971

[NFC] Move a test case across files.

The test case is about a pmull2 instruction being generated rather than
a SIMD ldr, so aarch64-pmull2.ll is a better test file.

Differential Revision: https://reviews.llvm.org/D132277

[mlir] Fix try_value_begin_impl for DenseElementsAttr

The previous implementation would still crash if the element type was
not iterable. This patch changes SparseElementsAttr to properly
implement `try_value_begin_impl` according to ElementsAttr and changes
DenseElementsAttr to implement `tryGetValues` as the basis for querying
element values.

Depends on D132904

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D132958

[mlir][ElementsAttr] Change value_begin_impl to try_value_begin_impl

This patch changes `value_begin_impl` to a fallible
`try_value_begin_impl` so that specific cases can fail iteration if the
type doesn't match the internal storage.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D132904

[flang] Lower integer exponentiation into math::IPowI.

Differential Revision: https://reviews.llvm.org/D132770

[lldb] Fix two bugs in ObjectContainerMachOFileset

Fix two small issues in the live-memory variant of ObjectContainerMachOFileset.

Differential revision: https://reviews.llvm.org/D132973

[libc][math] Fix broken atan function.

[libc][math] Fix broken tests.

[mlir] fix -Wsign-compare equivalent on Windows

Some clients treat this as a compilation error.

[libc][math] Added atanf function.

Performance by core-math (core-math/glibc 2.31/current llvm-14):
28.879/20.843/20.15

Differential Revision: https://reviews.llvm.org/D132842

[libc][math] Added atanhf function.

Performance by core-math (core-math/glibc 2.31/current llvm-14):
10.845/43.174/13.467

The review is done on top of D132809.

Differential Revision: https://reviews.llvm.org/D132811

[libc][math] Added auxiliary function log2_eval for asinhf/acoshf/atanhf.

1) Added a `double log2_eval(double)` function with better than float precision.
2) Some refactoring done to put all auxiliary functions and corresponding data
in one place to reuse the code.
3) Added tests for new functions.
4) Performance and precision tests of the function show that it is more precise than the existing log2
(no exceptional cases), but its timing is ~5% higher than the current one.

Differential Revision: https://reviews.llvm.org/D132809

[mlir] Allow dense array to be parsed with type elision

This patch makes parsing dense arrays with type elision work properly.
If a ranked tensor type is supplied to `parseAttribute` on a dense
array, the element type is skipped. Moreover, if type elision is set to
`AttrTypeElision::Must`, the element type is elided.

For example, this allows

```
memref.global @z : memref<3xi32> = array<1, 2, 3>
```

Fixes #57433

Depends on D132758

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D132964

[mlir] Make DenseArrayAttr generic

This patch turns `DenseArrayBaseAttr` into a fully-functional attribute by
adding a generic parser and printer, supporting bool or integer and floating
point element types with bitwidths divisible by 8. It has been renamed
to `DenseArrayAttr`. The patch maintains the specialized subclasses,
e.g. `DenseI32ArrayAttr`, which remain the preferred API for accessing
elements in C++.

This allows `DenseArrayAttr` to hold signed and unsigned integer elements:

```
array<si8: -128, 127>
array<ui8: 255>
```

"Exotic" floating point elements:

```
array<bf16: 1.2, 3.4>
```

And integers of other bitwidths:

```
array<i24: 8388607>
```

Reviewed By: rriddle, lattner

Differential Revision: https://reviews.llvm.org/D132758

Revert "[MLIR] Update pass declarations to new autogenerated files"

This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.

[ORC] Update mapper deinitialize functions to deinitialize in reverse order.

This updates the ExecutorSharedMemoryMapperService::deinitialize and
InProcessMemoryMapper::deinitialize methods to deinitialize in reverse order,
bringing them into alignment with the behavior of
InProcessMemoryManager::deallocate and SimpleExecutorMemoryManager::deallocate.
Reverse deinitialization is required because later allocations can depend on
earlier ones.
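
A minimal sketch of the ordering argument, assuming each allocation may refer only to an earlier one. `Allocation` and `deinitOrder` are hypothetical and do not reflect the ORC mapper types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: dependsOn is the index of an earlier allocation this
// one relies on, or -1 if it stands alone.
struct Allocation {
  std::string name;
  int dependsOn;
};

// Returns the names in the order they would be deinitialized: last allocated
// first, so nothing is torn down while a later allocation still needs it.
std::vector<std::string> deinitOrder(const std::vector<Allocation> &allocs) {
  std::vector<std::string> order;
  for (std::size_t i = allocs.size(); i-- > 0;) {
    // A later allocation may only depend on an earlier one.
    assert(allocs[i].dependsOn < static_cast<int>(i));
    order.push_back(allocs[i].name);
  }
  return order;
}
```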

This fixes failures in the ORC runtime test suite.

[mlir][tosa] Fix windows build-bot error due to implicit i64 cast

There is an implicit i64 cast due to the << during MulOp's folder.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D132969

[MLIR] Update pass declarations to new autogenerated files

The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838

[profile] Create only prof header when no counters

When we use selective instrumentation and instrument a file
that is not in the selected files list provided via -fprofile-list,
we generate an empty raw profile. This leads to empty_raw_profile
error when we try to read that profile. This patch fixes the issue by
generating a raw profile that contains only a profile header when
there are no counters and profile data.

A small reproducer for the above issue:

    echo "src:other.cc" > code.list
    clang++ -O2 -fprofile-instr-generate -fcoverage-mapping \
      -fprofile-list=code.list code.cc -o code
    ./code
    llvm-profdata show default.profraw

Differential Revision: https://reviews.llvm.org/D132094

[RISCV] Use uint64_t countTrailingZeros/Ones instead of APInt. NFC

We know the type is 32 or 64 bits, we can use getZExtValue and
bypass the slow path check in APInt.

[Verifier] remove stale comment about PHI with no operands; NFC

The code was changed with:
9eb2c0113dfe
...but missed the corresponding code comment.

[SLP]Fix PR57447: Assertion `!getTreeEntry(V) && "Scalar already in tree!"' failed.

The pointer operands for the ScatterVectorize node may contain
non-instruction values, and they are not checked for "already being
vectorized". We need to check whether such pointers are already
vectorized and, if so, gather them instead of trying to build a
vectorize node, to avoid a compiler crash.

Differential Revision: https://reviews.llvm.org/D132949

[RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros.

We can use srliw to shift out the trailing bits and slli to shift
back in zeros. The sign extend of srliw will zero the upper 32 bits
since we will be shifting a 0 into bit 31.
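
The arithmetic can be checked with a small emulation. This is only a sketch: `srliw` and `slli` below model the RV64 instructions in plain C++, and `andWithShiftedMask` is a hypothetical helper, not the isel code.

```cpp
#include <cassert>
#include <cstdint>

// Models RV64 srliw: logically shift the low 32 bits right, then sign-extend
// the 32-bit result to 64 bits.
uint64_t srliw(uint64_t x, unsigned shamt) {
  assert(shamt < 32);
  uint32_t lo = static_cast<uint32_t>(x) >> shamt;
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(lo)));
}

// Models RV64 slli: logical left shift of the full 64-bit register.
uint64_t slli(uint64_t x, unsigned shamt) {
  assert(shamt < 64);
  return x << shamt;
}

// Counts trailing zero bits of a nonzero value with a plain loop.
unsigned countTrailingZeros(uint64_t v) {
  assert(v != 0);
  unsigned n = 0;
  while ((v & 1) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Computes x & mask for a mask with 32 leading zeros, a run of ones ending at
// bit 31, and at least one trailing zero.
uint64_t andWithShiftedMask(uint64_t x, uint64_t mask) {
  unsigned tz = countTrailingZeros(mask);
  assert(tz >= 1 && tz < 32);
  assert(mask == ((0xFFFFFFFFULL >> tz) << tz));
  // After shifting right by tz >= 1, bit 31 is 0, so the sign extension
  // clears the upper 32 bits; slli then restores the trailing zeros.
  return slli(srliw(x, tz), tz);
}
```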

[AMDGPU] Limit TID / wavefrontsize uniformness to 1D kernels

If a kernel has uneven dimensions, the value of workitem-id-x divided by
the wavefrontsize can be non-uniform. For example, dimensions (65, 2)
will have workitems with address (64, 0) and (0, 1) packed into the same
wave, which gives 1 and 0 after the division by 64 respectively.
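
The (65, 2) case can be reproduced with a short sketch. It assumes work-items are linearized as x + y * sizeX and packed into waves in that order; `waveIndex` and `tidXOverWave` are hypothetical names.

```cpp
#include <cassert>

// Assumption: work-items are linearized as x + y * sizeX and then packed
// into waves of waveSize lanes in that order.
unsigned waveIndex(unsigned x, unsigned y, unsigned sizeX, unsigned waveSize) {
  assert(x < sizeX && waveSize != 0);
  return (x + y * sizeX) / waveSize;
}

// The quantity that was treated as uniform across a wave.
unsigned tidXOverWave(unsigned x, unsigned waveSize) {
  assert(waveSize != 0);
  return x / waveSize;
}
```

Under that assumption, work-items (64, 0) and (0, 1) both land in wave 1 of a (65, 2) kernel, yet `tidXOverWave` returns 1 for the first and 0 for the second.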

Unfortunately, this limits the optimization to OpenCL only and only if the
reqd_work_group_size attribute is set. This patch limits it to 1D kernels,
although it should be possible to perform this optimization if the size
of the X dimension is a power of 2; we just do not currently have the
infrastructure to query it.

Note that the presence of the amdgpu-no-workitem-id-y attribute does not help,
as it only hints at the lack of the workitem-id-y query, not the absence
of the actual 2nd dimension, therefore affecting just the SGPR allocation.

Differential Revision: https://reviews.llvm.org/D132879

[clang] Don't emit debug vtable information for consteval functions

Fixes https://github.com/llvm/llvm-project/issues/55065

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D132874

[AMDGPU] Precommit two tests showing missed combines to v_med3

[AMDGPU][GFX11] Fix dst register class for V_CVT_U32_U16

This instruction was referring to the wrong VOPProfile, likely due to a
typo, leading to an incorrect destination register type.

The MC layer will care about this change, but it is NFC while 16-bit values
actually use 32-bit registers.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D132878

[clang-tidy] Fix false positive on ArrayInitIndexExpr inside ProBoundsConstantArrayIndexCheck

Sometimes in the AST we can have an ArraySubscriptExpr,
where the index is an ArrayInitIndexExpr.
ArrayInitIndexExpr is not a constant, so
ProBoundsConstantArrayIndexCheck reports a warning when
it sees such expression. This expression can only be
implicitly generated, and always appears inside an
ArrayInitLoopExpr, so we shouldn't report a warning.

Differential Revision: https://reviews.llvm.org/D132654

[InstCombine] add tests for xor-of-ctlz/cttz; NFC

[InstCombine] delete redundant folds; NFC

InstSimplify does this via isKnownNonEqual(), so it's already
using knownbits on these patterns and trying other folds.

[InstCombine] add tests for signbit test using lshr; NFC

[SystemZ][z/OS] Account for renamed parameter name (libc++)

The following patch (https://reviews.llvm.org/D129051) broke z/OS builds by renaming a parameter. This patch accounts for that change.

Differential Revision: https://reviews.llvm.org/D132946

[mlir][sparse] add missing file for singleton revision

Differential Revision: https://reviews.llvm.org/D132961

[SVE] Fix SVEDup0 matching -0.0f

Because of D128669, CPY is being used to zero active lanes even in the case of -0.0f. This patch checks for floating point positive zero. That way SVEDup0 won't match -0.0f.
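
A minimal sketch of the distinction, assuming IEEE-754 single precision: the two zeros compare equal, so an equality test alone cannot tell them apart, while the bit patterns differ. `bitsOf` and `isPositiveZero` are hypothetical helpers, not the matcher from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a float (assumes a 32-bit IEEE-754 float).
uint32_t bitsOf(float v) {
  static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &v, sizeof bits);
  return bits;
}

// True only for +0.0f; -0.0f has the sign bit set and is rejected.
bool isPositiveZero(float v) {
  bool result = (bitsOf(v) == 0u);
  // Cross-check against the standard library's view of the sign bit.
  assert(result == (v == 0.0f && !std::signbit(v)));
  return result;
}
```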

Fixes https://github.com/llvm/llvm-project/issues/57428

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D132880

[mlir] Async: add unrealized cast materializations to AsyncToLLVM pass

[mlir] Async: add unrealized cast materializations to AsyncToLLVM pass

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D132768

[mlir][sparse] add more dimension level types and properties

We recently removed the singleton dimension level type (see the revision
https://reviews.llvm.org/D131002) since it was unimplemented but also
incomplete (properties were missing). This revision adds singleton back as
an extra dimension level type, together with properties ordered/not-ordered
and unique/not-unique. Even though still not lowered to actual code, this
provides a complete way of defining many more sparse storage schemes (in
the long run, we want to support even dimension level types and properties
using the additional extensions proposed in [Chou]).

Note that the current solution of using suffixes for the properties is not
ideal, but keeps the extension relatively simple with respect to parsing and
printing. Furthermore, it is rather consistent with the TACO implementation
which uses things like Compressed-Unique as well. Nevertheless, we probably
want to separate dimension level types from properties when we add more types
and properties.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D132897

[Docs] Fixing incorrect document title

Doh! This clearly slipped my review. Thanks DuckDuckGo for showing me
the error of my ways :).

[Docs] [HLSL] Documenting HLSL Entry Functions

This document describes the basic usage and implementation details for
HLSL entry functions in Clang.

Reviewed By: python3kgae

Differential Revision: https://reviews.llvm.org/D132672

Change the meaning of a UUID with all zeros for data.

Previously, depending on how you constructed a UUID from data or a
StringRef, an input value of all zeros was valid (e.g. setFromData)
or not (e.g. setFromOptionalData). Since there was no way to tell
which interpretation to use, it was done somewhat inconsistently.
This standardizes the meaning of a UUID of all zeros to Not Valid,
and removes all the Optional methods and their uses, as well as the
static factories that supported them.
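
A minimal sketch of the resulting rule, using a hypothetical `MiniUUID` class rather than the real UUID class. It also treats an empty byte sequence as not valid, which is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical miniature of a UUID wrapper, for illustration only.
class MiniUUID {
public:
  explicit MiniUUID(std::vector<uint8_t> bytes) : bytes_(std::move(bytes)) {}

  // A UUID is valid only if at least one byte is nonzero, so all zeros
  // always means Not Valid regardless of how the object was constructed.
  bool isValid() const {
    return std::any_of(bytes_.begin(), bytes_.end(),
                       [](uint8_t b) { return b != 0; });
  }

private:
  std::vector<uint8_t> bytes_;
};
```

With a single interpretation there is no need for a separate Optional-flavored constructor to say whether zeros count.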

Differential Revision: https://reviews.llvm.org/D132191