review.tizen.org Git - platform/upstream/llvm.git/log

[Clang][AMDGPU] Set LTO CG opt level based on Clang option

For AMDGCN default to mapping --lto-O# to --lto-CGO# in a 1:1 manner
(i.e. clang -O<N> implies --lto-O<N> and --lto-CGO<N>).

Ensure there is a means to override this via -Xoffload-linker and begin
to claim these arguments to avoid incorrect warnings that they are not
used.

Reviewed By: yaxunl, MaskRay

Differential Revision: https://reviews.llvm.org/D142499

[LLD] Add --lto-CGO[0-3] option

Allow controlling the CodeGenOpt::Level independent of the LTO
optimization level in LLD via new options for the COFF, ELF, MachO, and
wasm frontends to lld. Most are spelled as --lto-CGO[0-3], but COFF is
spelled as -opt:lldltocgo=[0-3].

See D57422 for discussion surrounding the issue of how to set the CG opt
level. The ultimate goal is to let each function control its CG opt
level, but until then the current default means it is impossible to
specify a CG opt level lower than 2 while using LTO. This option gives
the user a means to control it for as long as it is not handled on a
per-function basis.

Reviewed By: MaskRay, #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D141970

[mlir][vectorToGPU] Fix type used when folding transpose into read op

Pick the right result type when folding transpose op into a read

Differential Revision: https://reviews.llvm.org/D144113

[DebugInfo][Docs] Fix broken link in instruction referencing doc

[mlir] Silence a few -Wunused-but-set-parameter warnings

Some older versions of gcc hit false positives here:

/workdir/llvm-project/mlir/include/mlir/IR/Attributes.h:391:49: warning: parameter 'ty' set but not used [-Wunused-but-set-parameter]
391 | static inline bool isPossible(mlir::Attribute ty) {

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81676

[mlir][AMDGPU] 8-bit float usage in the AMDGPU dialect

Upcoming AMD hardware will include functions that accept 8-bit floats.
Specifically, there are MFMA instructions that accept 8-bit floats,
either using the same or mixed formats. This patch adds MLIR wrappers
for these intrinsics and explicitly adds support for 8-bit floats in
the gpu-to-rocdl conversion by way of amdgpu-to-rocdl.

Since LLVM does not have f8 types, when targeting LLVM for compilation
on an AMD GPU, both f8 types used on AMD hardware (f8E5M2FNUZ and
f8E4M3FNUZ) are rewritten to i8.

This patch also relaxes the restriction that the types of both source
operands to a amdgpu.mfma instructions match exactly, as this is not
necessarily required for the bf8 (f8E5M2FNUZ) and fp8 (f8E4M3FNUZ)
instructions. In addition, since the buffer_{load,store} operations
maintain a whitelist of permitted types, we add the relevant f8 types
to that list.

This patch does not add any implementations of arithmetic operations
for f8 types.

Reviewed By: jakeh-gc

Differential Revision: https://reviews.llvm.org/D143956

[InstCombine] remove stale test comment; NFC

Recommit "[ConstraintElimination] Change debug output to display variable names."

This reverts commit 02ae7e72b3f00969eeb579a2b4346082827f0b35.

include Value.h in ConstraintSystem.h

[libc++abi][AIX] Skip non-C++ EH aware frames when retrieving exception object

Summary:
The personality routine for the legacy AIX xlclang++ compiler uses the stack slot reserved for compilers to pass the exception object to the landing pad. The landing pad retrieves the exception object with a call to the runtime function __xlc_exception_handle(). The current implementation incorrectly assumes that __xlc_exception_handle() should go up one stack frame to get to the stack frame of the caller of __xlc_exception_handle(), which is supposedly the stack frame of the function containing the landing pad. However, this does not always work, e.g., the xlclang++ compiler sometimes generates a wrapper of __xlc_exception_handle() and calls the wrapper from the landing pad for optimization purposes. This patch changes the implementation to unwind the stack from __xlc_exception_handle() and skip frames not associated with functions that are C++ EH-aware until a frame associated with a C++ EH-aware function is found and then retrieving the exception object with the expectation that the located frame is the one that the personality routine installed as it transferred control to the landing pad.

Reviewed by: cebowleratibm, hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D143010

[AArch64] Always lower fp16 zero to FMOVH0

We can always use FMOVH0 to lower fp16 zero, even without fullfp16. We can
either expand it to movi d0, #0 or fmov s0, wzr, which will both clear all the
bits of the register.

Differential Revision: https://reviews.llvm.org/D143988

[flang][runtime] Return the right mutable modes from INQUIRE in child I/O

During child I/O -- a call to a user-defined subroutine with a generic interface
to handle read and write operations on a specific derived type -- ensure that
any INQUIRE statement probing mutable modes like BLANK= will observe changes
that have been made to those modes via control edit descriptors like BN
while processing the original I/O statement's format.

Differential Revision: https://reviews.llvm.org/D144046

[mlir][linalg] Fix insertion point bug in D144022

This should have been part of D144022.

[libc++][NFC] Replace _LIBCPP_STD_VER > x with _LIBCPP_STD_VER >= x

This change is almost fully mechanical. The only interesting change is in `generate_feature_test_macro_components.py` to generate `_LIBCPP_STD_VER >=` instead. To avoid churn in the git-blame this commit should be added to the `.git-blame-ignore-revs` once committed.

Reviewed By: ldionne, var-const, #libc

Spies: jloser, libcxx-commits, arichardson, arphaman, wenlei

Differential Revision: https://reviews.llvm.org/D143962

[clang][dataflow] Change `transfer` API to take a reference.

The provided `CFGElement` is never null, so a reference is a more precise type.

Differential Revision: https://reviews.llvm.org/D143920

Revert "Recommit "[ConstraintElimination] Change debug output to display variable names.""

This reverts commit 2a2a6bfcfe8e62886542cb673ac8df349cf26499.

This causes build failures:

    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp: In member function ‘llvm::SmallVector<std::__cxx11::basic_string<char> > llvm::ConstraintSystem::getVarNamesList() const’:
    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:118:10: error: invalid use of incomplete type ‘class llvm::Value’
      118 |     if (V->getName().empty())
          |          ^~
    In file included from /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:9:
    /home/npopov/repos/llvm-project/llvm/include/llvm/Analysis/ConstraintSystem.h:21:7: note: forward declaration of ‘class llvm::Value’
       21 | class Value;
          |       ^~~~~
    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:119:22: error: invalid use of incomplete type ‘class llvm::Value’
      119 |       OperandName = V->getNameOrAsOperand();
          |                      ^~
    /home/npopov/repos/llvm-project/llvm/include/llvm/Analysis/ConstraintSystem.h:21:7: note: forward declaration of ‘class llvm::Value’
       21 | class Value;
          |       ^~~~~
    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:121:41: error: invalid use of incomplete type ‘class llvm::Value’
      121 |       OperandName = std::string("%") + V->getName().str();
          |                                         ^~
    /home/npopov/repos/llvm-project/llvm/include/llvm/Analysis/ConstraintSystem.h:21:7: note: forward declaration of ‘class llvm::Value’
       21 | class Value;
          |       ^~~~~

[InstSimplify] Add additional insertvalue into undef tests (NFC)

[llvm][AArch64] Fix BTI after returns_twice when call has no attributes

Previously we were looking for the returns twice attribute by manually
getting the function attributes from the call. This meant that we only
found attributes on the call itself, not what it was calling.

So if you had:
%call1 = call i32 @setjmp(ptr noundef null)

We would not BTI protect that even though setjmp clearly needs it.

Clang happens to produce:
%call = call i32 @setjmp(ptr noundef null) #0 ; returns_twice

So all valid calls were protected. This is not guaranteed,
the frontend may choose not to put attributes on the call.

It is undefined behaviour to call setjmp indirectly
(https://pubs.opengroup.org/onlinepubs/9699919799/functions/setjmp.html)
but as I misused the APIs here I think it's worth fixing up regardless.

Added comments to the test file where the IR differs from what
clang would output.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144082

Recommit "[ConstraintElimination] Change debug output to display variable names."

This reverts commit 62d0e1a8541f93dfbf66d982f66da32676df2df7.

remove `dumpWithNames` function

Revert "[ConstraintElimination] Change debug output to display variable names."

This reverts commit 869c87ad10e87db7c032c3464338ab9d50916510.

`dumpWithNames` function should be removed

[ConstraintElimination] Change debug output to display variable names.

Previously when constraint system outputs the rows in the system the variables used are x1,2...n making it hard to infer which ones they relate to in the IR

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D142618

[reland][libc] Separate memcpy implementations per arch

As x86_64 implementations is likely to grow up to a point where it's no more manageable to have all implementations in the same file.

[mlir][memref] Bufferize memref.tensor_store op

This change adds the BufferizableOpInterface implementation for memref.tensor_store.

Differential Revision: https://reviews.llvm.org/D144080

[mlir][linalg] Add bufferize_to_allocation transform op

This transform materializes a buffer allocation for a given tensor value. All uses of the original value are replaced with the allocation.

Certain non-DPS ops may have an optimized lowering path that bufferizes the entire defining op. Such optimization is added for `tensor.pad` as part of this change.

The resulting IR can be further bufferized with One-Shot Bufferize.

Differential Revision: https://reviews.llvm.org/D144022

[ConstantFold] Check for constant global earlier (NFC)

Check that the underlying object is a constant global with
definitive initializer upfront, so we can skip the more expensive
offset calculation logic if we can't perform the fold anyway.

[lld][RISCV][test] Expand testing in riscv-attributes.s

This patch is related to the discussion in
<https://discourse.llvm.org/t/rfc-resolving-issues-related-to-extension-versioning-in-risc-v/68472/1>.
It slightly expands the coverage of behaviour for unrecognized
extensions (there are slightly different forms of error for zfoo vs e.g.
'y'), and adds coverage for an extension with an
unrecognized/unsupported version number.

Use "unrecognized" terminology because the expectation is the error
checking will be reduced in a follow-up patch posted for review.

[LangRef] improve documentation of SNaN in the default FP environment

Make it explicit that SNaN is not handled differently than
QNaN in the LLVM default floating-point environment.

Note that an IEEE-754-compliant model disallows transforms
like "X * 1.0 -> X". That is because math operations are
expected to convert SNaN to QNaN (set the signaling bit).

But LLVM has had those kinds of transforms from the beginning:
https://alive2.llvm.org/ce/z/igb55y

We should be IEEE-754-compliant under strict-FP (the logic is
implemented with a helper named canIgnoreSNaN()), but I don't
think there is any demand to do that with default optimization.

See issue #43070 for earlier draft/discussion about this change.

Differential Revision: https://reviews.llvm.org/D143074

[bazel] Fix missing dependency in clang-tools-extra/clang-tidy:llvmlibc

[NVPTX] Fix NVPTX output name in the driver with `-save-temps`

Summary:
Currently, OpenMP and direct compilation uses an NVPTX toolchain to
directly invoke the CUDA tools from Clang to do the assembling and
linking of NVPTX codes. This breaks under `-save-temps` because of a
workaround. The `nvlink` linker does not accept `.o` files, so we need
to be selective when we output these. The previous logic keyed off of
presense in the temp files and wasn't a great solution. Change this to
just query the input args for `-c` to see if we stop at the assembler.

Fixes https://github.com/llvm/llvm-project/issues/60767

Revert "[libc] Separate memcpy implementations per arch"

This is patch is still breaking downstream users...
This reverts commit 97e441dc6cfae31bc56b375e43899946ec7f08a8.

[libc] Separate memcpy implementations per arch

As x86_64 implementations is likely to grow up to a point where it's no more manageable to have all implementations in the same file.

[mlir][transform] Add transform.get_result op

This transform op returns a value handle pointing to the specified OpResult of the targeted op.

Differential Revision: https://reviews.llvm.org/D144087

[clang-format] assert(false) -> llvm_unreachable

Avoids warnings in -asserts builds.

FormatTokenSource.h:240:3: error: non-void function does not return a value [-Werror,-Wreturn-type]
}
^

[bazel] create a clang-tidy binary target

Create a binary target for clang-tidy. Tested by running:

```
$ bazel build --config=generic_clang @llvm-project//clang-tools-extra/...
$ bazel test --config=generic_clang @llvm-project//clang-tools-extra/...
$ bazel run --config=generic_clang @llvm-project//clang-tools-extra/clang-tidy -- --help
```

Reviewed By: #bazel_build, aaronmondal

Differential Revision: https://reviews.llvm.org/D143804

[MachineTraceMetrics] Add local strategy

This strategy makes each trace local to the basic block. For in-order cores some
heuristics work better when we do local decisions. For example, MachineCombiner
may expect that instructions outside the current basic block do not lengthen
the critical path when we execute instructions in order or the core has a
small re-order buffer.

This patch only introduce the strategy, real use-case is added in the further
pathes.

Depends on D140539

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140540

[X86] Remove abs(sub_nsw()) -> abds fold from adbu test file

Copy+paste typo - it was correctly removed from 128/512 variants

[lldb] Fix a log format warning on Windows, don't assume uint64_t is a long type

On Windows, long is 32 bit, and uint64_t is long long.

Differential Revision: https://reviews.llvm.org/D144078

[clang-format] Enable FormatTokenSource to insert tokens.

In preparation for configured macro replacements in formatting,
add the ability to insert tokens to FormatTokenSource, and implement
token insertion in IndexedTokenSource.

Differential Revision: https://reviews.llvm.org/D143070

AMDGPU: Fix not adding to depth in getNegatedExpression

clang: Rename misleading test name

This is a test for attribute propagation, not metadata

[lld][ARM][NFCI][1/3]Big Endian support - Removing assumptions

Change:
- Replacing the memcpy that assume little endian with the endian-aware write.

Shouldn't affect the output for now, just a prerequisite for the next patches.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D140201

[AArch64][NFC] Rename AEK_SMEF64 and AEK_SMEI64 feature flags

Update feature flag names from:
AEK_SMEF64 to AEK_SMEF64F64
and
AEK_SMEI64 to AEK_SMEI16I64
These feature flags had their name changed in this previous patch
https://reviews.llvm.org/D135974

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D143989

[SimpleLoopUnswitch] Canonicalize conditions for injection of invariant condition

When loop condition isn't immediately in the form supported by invariant injection
unswitching, try to canonicalize it to this form.

Differential Revision: https://reviews.llvm.org/D143175
Reviewed By: skatkov

[Test][NFC] Rename one of test parameters to avoid confusing associations

Name 'len' implicitly hints that it should be non-negative, but in fact in some of
these tests the baseline profitable scenario for the transform to be profitable is
when this thing is negative. Rename to avoid this confusion.

[Symbolize][MinGW] Support demangling i386 call-conv-decorated C++ names

On i386 Windows, after C++ names have been Itanium-mangled, the C name
mangling specific to its call convention may also be applied on top.
This change teaches symbolizer to be able to demangle this type of
mangled names.

As part of this change, `demanglePE32ExternCFunc` has also been modified
to fix unwanted stripping for vectorcall names when the demangled name
is supposed to contain a leading `_`. Notice that the vectorcall
mangling does not add either an `_` or `@` prefix. The old code always
tries to strip the prefix first, which for Itanium mangled names in
vectorcall, the leading underscore of the Itanium name gets stripped
instead and breaks the Itanium demangler.

Differential Revision: https://reviews.llvm.org/D144049

[CVP] Add additional ctlz tests (NFC)

[XCore] Adapt threads.ll to opaque pointers.

Differential Revision: https://reviews.llvm.org/D144085

[SimpleLoopUnswitch] Fix overflowing frequencies case

When branch is so hot that sum of its frequencies overflows, we have an assertion
failure when trying to construct BranchProbability. Such cases are not interesting
as candidate for unswitching anyways (their branches are too hot), so reject them.

[clang-tidy][NFC] Remove ModernizeTidyModule::getModuleOptions

Most of the options stated there are duplicated already in
the implementation of each check as a default value for
each option.

The only place where this is not the case is the nullptr
check. Move the default option there instead. Only the
HICPP guidelines alias this modernize check, and there is
nothing in the documentation that suggests it should have
a different default value than the main modernize check.

Differential Revision: https://reviews.llvm.org/D143843

[DAGCombine] Fold redundant select

Recommit bbdf24357932b064f2aa18ea1356b474e0220dde.

Original commit message:

If a chain of two selects share a true/false value and are controlled
by two setcc nodes, that are never both true, we can fold away one of
the selects. So, the following:
(select (setcc X, const0, eq), Y,
(select (setcc X, const1, eq), Z, Y))

Can be combined to:
select (setcc X, const1, eq) Z, Y

Differential Revision: https://reviews.llvm.org/D142535

[libc][NFC] Make tuning macros start with LIBC_COPT_

Rename preprocessor definitions that control tuning of llvm libc.

Differential Revision: https://reviews.llvm.org/D143913

Add InstSimplify tests for comparisons between known constants

[JITLink] Drop const qualifier from argument to ELFLinkGraphBuilder's ctors (NFC)

These values are moved into the base class constructors, so the `const` doesn't make any sense. Turns out, I accidentally introduced it myself with 2ed91da0f1f3 and since than it spread by copy/paste.

[JITLink] Fix whitespace in debug dumps (NFC)

[bazel][mlir][examples] Add missing dependency for 72429a42ac33564fa82449d99dc234da32a05498

Revert "InstCombine: Fold is.fpclass(x, fcZero) to fcmp oeq 0"

This reverts commit df78976d023a6b7fcf64bc695261b7b402fcede0.

I pushed the wrong patch

[flang] Update intrinsic types to unlimited polymorphic form

This patch updates the code added in D143888 to avoid
overwriting some part of the types when updating it
for unlimited polymorphic types.

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D143995

AMDGPU: Override getNegatedExpression constant handling

Ignore the multiple use heuristics of the default
implementation, and report cost based on inline immediates. This
is mostly interesting for -0 vs. 0. Gets a few small improvements.
fneg_fadd_0_f16 is a small regression. We could probably avoid this
if we handled folding fneg into div_fixup.

AMDGPU: Refactor isConstantCostlierToNegate

InstCombine: Fold is.fpclass(x, fcZero) to fcmp oeq 0

This requires the denormal mode to definitively be IEEE handling.

[mlir][bufferization] Add restrict and writable attrs to to_tensor

`restrict` is similar to the C++ restrict keyword. Results of `to_tensor` that have the `restrict` attribute are guaranteed to not alias any other `to_tensor` result (after bufferization).

Note: Since `to_memref` ops are not supported by One-Shot Bufferize and all bufferizable ops follow DPS rules (i.e., the buffer of the result is the buffer of an operand or an alias thereof), the buffer of a `to_tensor` op that has the `restrict` attribute is always an entirely "new" buffer that is not aliasing with the future buffer of any tensor value in the entire program. This makes such `to_tensor` ops "safe" from a bufferization perspective; they cannot cause RaW conflicts.

Differential Revision: https://reviews.llvm.org/D144021

[SROA] NFC: Look at TypeStoreSize scalable property, rather than at type directly.

Some places in the code have checks for isa<ScalableVectorType> and use
that to bail out of the code. It's also possible to look directly at the
allocated type-size and check if the size is scalable. This means it's
possible to also support other scalable types that are not vectors (i.e.
TargetExtType).

This is split out from D136861.

[llvm-c] Add C API methods to match 64bit ArrayType C++ API signatures

Fixes https://github.com/llvm/llvm-project/issues/56496.

As mentioned in the issue, new functions LLVMArrayType2 and
LLVMGetArrayLength2 are created so as to not break the old API.
The old methods are then marked as deprecated and callers are
updated.

Differential Revision: https://reviews.llvm.org/D143700

[mlir] Add vectorize_nd_extract attribute to the masked_vectorize Op

This patch simply adds `vectorize_nd_extract` (that's currently only
used for the `vectorize` Op) to `masked_vectorize`. A test is added to
verify that it works as expected - it prevents the masked vectorisation
of `tensor.extract`, which is currently not supported.

Differential Revision: https://reviews.llvm.org/D142634

[Metarenamer] Use 'inst' as default name for instructions

Currently we use 'tmp', which is also a keyword for FileCheck. It leads to this
annoying warning whenever a script for auto-generation of checks is used.
It is especially annoying that it happens to every test affected by metarenamer.

Just use another prefix for metarenamed names to avoid this.

Differential Revision: https://reviews.llvm.org/D144001
Reviewed By: nikic

[clang][analyzer] Make messages of StdCLibraryFunctionsChecker user-friendly

Warnings and notes of checker alpha.unix.StdLibraryFunctionArgs are
improved. Previously one warning and one note was emitted for every
finding, now one warning is emitted only that contains a detailed
description of the found issue.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D143194

[LoopFuse] Remove legacy pass

Following recent changes to remove non-core legacy passes.

[MLIR][Tensor] Introduce a pattern to propagate through `tensor.pad`

Introduce a pattern to 'push down' a `tensor.unpack` through a
`tensor.pad`. The propagation happens if the unpack does not touch the
padded dimensions.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D143907

[InstCombine] Fix InstCombinerImpl::foldICmpMulConstant for nsw and nuw mul with unsigned compare.

If we have both an nsw and nuw flag, we would see the nsw flag
first and only handle signed comparisons.

This patch ignores the nsw flag if the comparison isn't signed.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D143766

[mlir][Bazel] Add missing dependencies for 72429a42ac33564fa82449d99dc234da32a05498

[mlir] Add RewriterBase::replaceAllUsesWith for Blocks.

When changing IR in a RewriterPattern, all changes must go through the
rewriter. There are several convenience functions in RewriterBase that
help with high-level modifications, such as replaceAllUsesWith for
Values, but there is currently none to do the same task for Blocks.

Reviewed By: mehdi_amini, ingomueller-net

Differential Revision: https://reviews.llvm.org/D142525

Add a pass that builds a debug info scope for LLVMFuncOp (adding a DISubprogramAttr)

This may be seen as a hack, but it allows for any piece of MLIR to be able
to end up with DWARF debug info through LLVM.
Assuming the operations in the function have location such as FileLineCol,
this provides backtraces with line tables and allows to step in a debugger.
That makes this pass a perfect companion to -snapshot-op-locations

It was also the default behavior of MLIR to LLVM IR translation until MLIR
got support for proper debug info attributes.

Differential Revision: https://reviews.llvm.org/D144069

[mlir][Bazel] Add dependency needed for e9b82a5c

[mlir][Vector] Enable masking for static shapes

Support for masking static shapes was already implemented in the past
but not enabled so this patch is just removing a pre-condition check and
adding some tests with static shapes.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D143937

[mlir][Vector] Add support for masked vector gather ops

This patch adds support for masked vector.gather ops using the
vector.mask representation. It includes the implementation of the
MaskableOpInterface, Linalg vectorizer support and lowering to LLVM.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D143939

[mlir][Vector] Add support for masked vector.contract

This patch adds support for masking vector.contract ops with the
vector.mask approach. This also includes the lowering of vector.contract
through the vector.outerproduct path to LLVM. For now, this only adds
support for one of the many potential flavors of
vector.contract/vector.outerproduct but unsupported cases will fail
gratefully.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D143965

[mlir][Vector] Add LLVM lowering for masked reductions

This patch adds the conversion patterns to lower masked reduction
operations to the corresponding vp intrinsics in LLVM.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D142177

[LoopVersioningLICM] Remove legacy pass

Following recent changes to remove non-core features of the legacy PM/optimization pipeline.

[mlir] Use std::optional instead of llvm::Optional (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

[llvm] Use std::optional instead of llvm::Optional (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

[mlir] Fix -Wsign-compare in SparseTensorRewriting.cpp and Sparsification.cpp (NFC)

/home/jiefu/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp:279:33: error: comparison of integers of different signs: 'int64_t' (aka 'long') and 'const mlir::sparse_tensor::Level' (aka 'const unsigned long') [-Werror,-Wsign-compare]
    assert(env.op().getRank(&t) == lvlRank);
           ~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~
/usr/include/assert.h:93:27: note: expanded from macro 'assert'
     (static_cast <bool> (expr)                                         \
                          ^~~~
1 error generated.

/home/jiefu/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorRewriting.cpp:788:29: error: comparison of integers of different signs: 'int64_t' (aka 'long') and 'const mlir::sparse_tensor::Dimension' (aka 'const unsigned long') [-Werror,-Wsign-compare]
    assert(srcRTT.getRank() == dimRank);
           ~~~~~~~~~~~~~~~~ ^  ~~~~~~~
/usr/include/assert.h:93:27: note: expanded from macro 'assert'
     (static_cast <bool> (expr)                                         \
                          ^~~~
/home/jiefu/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorRewriting.cpp:810:31: error: comparison of integers of different signs: 'int64_t' (aka 'long') and 'const mlir::sparse_tensor::Dimension' (aka 'const unsigned long') [-Werror,-Wsign-compare]
      assert(srcRTT.getRank() == dimRank);
             ~~~~~~~~~~~~~~~~ ^  ~~~~~~~
/usr/include/assert.h:93:27: note: expanded from macro 'assert'
     (static_cast <bool> (expr)                                         \
                          ^~~~
2 errors generated.

[gn build] Port f1c4241fb6e5

[RISCV] Add new pass to transform undef to pseudo for vector values.

RISC-V vector instruction has register overlapping constraint for certain
instructions, and will cause illegal instruction trap if violated, we use
early clobber to model this constraint, but it can't prevent register allocator
allocated same or overlapped if the input register is undef value, so convert
IMPLICIT_DEF to temporary pseudo could prevent that happen, it's not best way
to resolve this. Ideally we should model the constraint right, but before we
model the constraint right, it's the approach to prevent that happen.

See also: https://github.com/llvm/llvm-project/issues/50157

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D129735

[RISCV] precommit test for D129735

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D137763

[NFC] Refactor ModuleDeclStateTest to make it not dependent on Frontend

Required in https://reviews.llvm.org/D137526. And it is indeed odd that
LexTest depends on ClangFrontend.

[ORC] Add ELFNixPlatform::Create overload -- Pass ORC runtime as def generator.

The existing Create method took a path to the ORC runtime and created a
StaticLibraryDefinitionGenerator for it. The new overload takes a
std::unique_ptr<DefinitionGenerator> directly instead. This provides more
flexibility when constructing MachOPlatforms. E.g. The runtime archive can be
embedded in a special section in the ORC controller executable or library,
rather than being on-disk.

This is the ELFNixPlatform equivalent of the MachOPlatform change in
be2fc577c38.

[mlir][sparse] Factoring out SparseTensorType class

This change adds a new `SparseTensorType` class for making the "dim" vs "lvl" distinction more overt, and for abstracting over the differences between sparse-tensors and dense-tensors. In addition, this change also adds new type aliases `Dimension`, `Level`, and `FieldIndex` to make code more self-documenting.

Although the diff is very large, the majority of the changes are mechanical in nature (e.g., changing types to use the new aliases, updating variable names to match, etc). Along the way I also made many variables `const` when they could be; the majority of which required only adding the keyword. A few places had conditional definitions of these variables, requiring actual code changes; however, that was only done when the overall change was extremely local and easy to extract. All these changes are included in the current patch only because it would be too onerous to split them off into a separate patch.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D143800

[C++20] [Modules] [ClangScanDeps] Ensure that we can mix the use of and clang modules

Add a test to ensure that the clang-scan-deps won't crash due to the
unexpected use of clang modules.

[clang-format][NFC] Reformat clang/tools/clang-format/fuzzer/

[mlir][sparse] fix a bug in UnpackOp converter.

UnpackOp Converter used to create reallocOp unconditionally, but it might cause issue when the requested memory size is smaller than the actually storage.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D144065

[InstCombine] fold icmp of the sum of ext bool based on limited range

For the pattern `(zext i1 X) + (sext i1 Y)`, the constant range is [-1, 1].
We can simplify the pattern by logical operations. Like:

```
    (zext i1 X) + (sext i1 Y) == -1 -->  ~X & Y
    (zext i1 X) + (sext i1 Y) == 0  --> ~(X ^ Y)
    (zext i1 X) + (sext i1 Y) == 1 --> X & ~Y
```
And other predicates can the combination of these results:

```
    (zext i1 X) + (sext i1 Y)) != -1 --> X | ~Y
    (zext i1 X) + (sext i1 Y)) s> -1 --> X | ~Y
    (zext i1 X) + (sext i1 Y)) u< -1 --> X | ~Y
    (zext i1 X) + (sext i1 Y)) s> 0 --> X & ~Y
    (zext i1 X) + (sext i1 Y)) s< 0 --> ~X & Y
    (zext i1 X) + (sext i1 Y)) != 1 --> ~X | Y
    (zext i1 X) + (sext i1 Y)) s< 1 --> ~X | Y
    (zext i1 X) + (sext i1 Y)) u> 1 --> ~X & Y
```

All alive proofs:
https://alive2.llvm.org/ce/z/KmgDpF
https://alive2.llvm.org/ce/z/fLwWa9
https://alive2.llvm.org/ce/z/ZKQn2P

Fix: https://github.com/llvm/llvm-project/issues/59666

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D143373

[SeparateConstOffsetFromGEP] Fix: `b - a` matched `a - b` during reuniteExts

During the SeparateConstOffsetFromGEP pass, a - b and b - a will be
considered equivalent in some instances.

An example- the IR contains:

  BB1:
      %add = add %a, 511
      br label %BB2
  BB2:
      %sub2 = sub %b,  %a
      br label %BB3
  BB3:
      %sub1 = sub %add, %b
      %gep = getelementptr float, ptr %p, %sub1

Step 1 in the SeparateConstOffsetFromGEP pass, after split constant index:

  BB1:
      %add = add %a, 511
      br label %BB2
  BB2:
      %sub2 = sub %b,  %a
      br label %BB3
  BB3:
      %sub.t = sub %a, %b
      %gep.base = getelementptr float, ptr %p, %sub.t
      %gep = getelementptr float, ptr %gep.base, 511

Step 2, after reuniteExts:

  BB1:
      br label %BB2
  BB2:
      %sub2 = sub %b,  %a
      br label %BB3
  BB3:
      %gep.base = getelementptr float, ptr %p, %sub2
      %gep = getelementptr float, ptr %gep.base, 511

Obviously, reuniteExts treated a - b and b - a as equivalent.
This patch fixes that.

Reviewed By: nikic, spatel

Differential Revision: https://reviews.llvm.org/D143542

[Instcombine] Precommit tests for icmp range; NFC

[NFC][SeparateConstOffsetFromGEP] Added flag `lower-gep`

We need such a flag to check whether the transformation is correct if
LowerGEP was enabled.

Reviewed By: nikic, arsenm, spatel

Differential Revision: https://reviews.llvm.org/D143980

[bazel] Port 26662ac010ef50e65e2774eab84f325aa09360fe

[Clang][RISCV] Guard vector int64, float32, float64 with semantic analysis

Depends on D143657

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143665

[mlir] Fix two build warnings in VectorToGPU (NFC)

In file included from /data/llvm-project/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp:13:
/data/llvm-project/mlir/include/mlir/Conversion/VectorToGPU/VectorToGPU.h:15:1: error: class 'LogicalResult' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class LogicalResult;
^
/data/llvm-project/mlir/include/mlir/Support/LogicalResult.h:26:22: note: previous use is here
struct [[nodiscard]] LogicalResult {
                     ^
/data/llvm-project/mlir/include/mlir/Conversion/VectorToGPU/VectorToGPU.h:15:1: note: did you mean struct here?
class LogicalResult;
^~~~~
struct

/data/llvm-project/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp:724:5: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
    rewriter.notifyMatchFailure(
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.

[gn build] Add missing count dependency for check-asan

[scudo] Update ring buffer test to make it accept zero size

allocation ring buffer is allowed to be zero. Update the logic in the
test so that on the platform that disables it won't fail this case.

Reviewed By: fmayer

Differential Revision: https://reviews.llvm.org/D144055

[LoopUnrollAndJam] Remove legacy pass

Following recent changes to remove non-core features of the legacy PM/optimization pipeline.