review.tizen.org Git - platform/upstream/llvm.git/log

[AArch64][SME]: Add missing Ops that need custom-lowering in streaming mode.

Add missing Ops and update related testing files.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D141595

[LVI] Fix and re-enable at-use reasoning (PR60629)

This fixes the handling of phi nodes in getConstantRangeAtUse()
and re-enables it, reverting the workaround from
c77c186a647b385c291ddabecd70a2b4f84ae342.

For phi nodes, while we can make use of the edge condition for the
incoming value, we shouldn't look past the phi node to look for
further conditions, because we might be reasoning about values
from two different cycle iterations (which will have the same
SSA value).

To handle this more specifically we would have to detect cycles,
and there doesn't seem to be any motivating case for that at this
point.

[LV] Synthesize all true masks for masked vector function variants

When vectorizing code with function calls in it, if we encounter
a function which only has vectorized variants requiring a mask
we can synthesize an all-true mask to enable us to proceed.

Since we want the mask to be represented in vplan, the pointer
to the chosen Function is now stored as part of the
VPWidenCallRecipe, and mask arguments are added at the
appropriate index to the recipe operands.
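
As a rough illustration of the idea (not the VPlan implementation), the sketch below calls a masked variant with a synthesized all-true mask inserted at a given operand index. The names `masked_vec_square` and `call_with_all_true_mask` are hypothetical.

```python
from typing import Callable, List, Sequence


def masked_vec_square(xs: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Hypothetical masked vector variant: inactive lanes yield 0.0."""
    return [x * x if m else 0.0 for x, m in zip(xs, mask)]


def call_with_all_true_mask(variant: Callable[..., List[float]],
                            xs: Sequence[float],
                            mask_pos: int) -> List[float]:
    """Call a mask-only variant by synthesizing an all-true mask operand."""
    args = [xs]
    # Insert the mask at the operand index the variant expects it at.
    args.insert(mask_pos, [True] * len(xs))
    return variant(*args)


result = call_with_all_true_mask(masked_vec_square, [1.0, 2.0, 3.0], mask_pos=1)
```

With every lane active, the masked variant behaves like an unmasked one, which is why the all-true mask is a safe default.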

Reviewed By: david-arm, fhahn, reames

Differential Revision: https://reviews.llvm.org/D132458

DAG: Remove hasBitPreservingFPLogic

This doesn't make sense as an option. fneg and fabs are bit
preserving by definition. If a target has some fneg or fabs
instructions that are not bit-preserving, it's incorrect to lower
fneg/fabs to use them.
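
A minimal sketch of what "bit preserving" means here, assuming IEEE-754 binary32 values: fneg flips only the sign bit and fabs clears only the sign bit, so all other bits (including NaN payloads) are untouched. The helper names are illustrative and not LLVM APIs.

```python
import struct

SIGN_MASK = 0x80000000  # sign bit of an IEEE-754 binary32 value


def f32_bits(x: float) -> int:
    """Return the 32-bit pattern of x rounded to binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fneg_bits(bits: int) -> int:
    """fneg on the raw bit pattern: flip the sign bit, keep everything else."""
    return bits ^ SIGN_MASK


def fabs_bits(bits: int) -> int:
    """fabs on the raw bit pattern: clear the sign bit, keep everything else."""
    return bits & ~SIGN_MASK & 0xFFFFFFFF


nan_with_payload = 0x7FC00001  # quiet NaN with a non-zero payload
```

Because the operations are defined on the bit pattern, the payload of `nan_with_payload` survives both of them.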

[mlir][llvm] Make LoopAnnotations non-discardable

This commit adds the loop annotation attribute to LLVM::Br and
LLVM::CondBr to ensure it is non-discardable. Furthermore, the name is
changed from "llvm.loop" to "loop-annotation".

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D143986

[clang][NFC] Adjust tests to not un/define predefined macros

An upcoming patch will make every definition or undefinition of a
predefined macro a warning (currently only some give a warning).
In preparation for this, adjust some tests that would emit a warning:
* In thread-specifier.c the undefine is done to avoid a different
   warning, but we get that warning just because __thread and
   __private_extern__ are the wrong way around so we can just swap
   them.
* There are a couple of objective-c tests that redefine IBAction to
   what it's already defined as, so we can just remove the define.

[flang] support fir.unreachable in stack arrays pass

Some functions (e.g. the main function) end with a call to the STOP
statement instead of a func.return. This is lowered as a call to the
stop runtime function followed by a fir.unreachable. fir.unreachable is
a terminator and so this can cause functions to have no func.return.

The stack arrays pass looks to see which heap allocations have always
been freed by the time a function returns. Without any returns, the pass
does not detect any freed allocations. This patch changes this behaviour
so that fir.unreachable is checked as well as func.return.
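
The effect can be sketched with a simplified model (an assumption for illustration, not the pass's actual data flow analysis): each exit block records its terminator and the allocations freed on every path reaching it, and an allocation qualifies only if it is freed at all exits that are considered.

```python
from typing import Dict, List, Set


def allocations_freed_at_all_exits(exits: List[Dict], exit_kinds: Set[str]) -> Set[str]:
    """Intersect the freed sets of all exit blocks whose terminator is considered."""
    relevant = [e["freed"] for e in exits if e["terminator"] in exit_kinds]
    if not relevant:
        # No exit is considered, so nothing is known to be freed.
        return set()
    result = set(relevant[0])
    for freed in relevant[1:]:
        result &= freed
    return result


# A main-like function that ends in STOP: its only exit is fir.unreachable.
main_like = [{"terminator": "fir.unreachable", "freed": {"tmp1", "tmp2"}}]

before = allocations_freed_at_all_exits(main_like, {"func.return"})
after = allocations_freed_at_all_exits(main_like, {"func.return", "fir.unreachable"})
```

In this model `before` is empty, while `after` contains both temporaries, which mirrors why the pass previously found nothing to move in such functions.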

This allows 15 heap allocations for array temporaries in spec2017
exchange2's main function to be moved to the stack.

Differential Revision: https://reviews.llvm.org/D143918

[flang] automatically load FIR dialect with hlfir

MLIR loads dialects lazily so if a hlfir type (or operation) is found
before any fir type (or operation), the fir dialect will not have been
loaded when the hlfir thing is verified. Verification of some hlfir
operations does depend on fir types (e.g. hlfir.sum needs
fir::SequenceType).

Tablegen change recommended by jeanPerier

Differential Revision: https://reviews.llvm.org/D143930

[libc] Conform memcpy tuning macro to the new naming scheme

[MachineTraceMetrics][NFC] Move Strategy enum out of the class

Make forward declaration possible to reduce the number of dependencies and reduce
the re-compilation burden caused by further patches.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140539

[mlir][LLVM] Properly wrap code examples in markdown code blocks

These are otherwise rendered and formatted as raw text on the website, making them completely unreadable

[flang][hlfir] remove unnecessary header include

Builder/HLFIRTools.h is not needed and is causing build
issues in some shared library builds because it belongs to another
library that depends on libHLFIRDialect (so libHLFIRDialect should
not depend on it).

Differential Revision: https://reviews.llvm.org/D143994

[flang] Fix USE rename

Fix USE rename when use-name and local-name are the same.
Previously, the associated symbol was being removed from scope.

Operator rename implementation needed no change, because, as it
doesn't call AddUseRename(), symbol erasure is skipped.

Fixes #60223

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D143933

[SimpleLoopUnswitch] Re-enable simple-loop-unswitch-inject-invariant-conditions

The underlying bug (taking a branch from an inner loop as a candidate) is now fixed,
so the option can be re-enabled.

[SimpleLoopUnswitch] Ignore inner loops when injecting invariant conditions. PR60736

The transform and all related updates don't expect the candidate to come
from an inner loop. I think we *might* still do something in this case, but
the current implementation doesn't expect it and performs incorrect loop info
updates.

Details: https://github.com/llvm/llvm-project/issues/60736

DAG: Relax foldBitcastedFPLogic conditions

Requiring a bitcast to exist was unhelpful. The most basic cases
are always going to be a CopyFromReg or load, so they would need
a new cast inserted. Don't require a bitcast if it's a free
operation. I don't think this logic makes particularly much sense
(it seems to be imparting special interpretation of bitcast), but
this needs to be in sync with foldSignChangeInBitcast.

We should also get rid of this hasBitPreservingFPLogic hook. fabs/fneg
are bitpreserving or incorrectly implemented, so this should just be a
regular legality check.

[NFC] Move some asserts out of Expensive Checks

These asserts were moved under Expensive Checks at a reviewer's request, but
without them we miss very nasty bugs. Unless they cause a REAL problem, I'd
keep them in the default setup.

[llvm][AArch64] Fix an interaction of SLS and BTI after a returns twice call

This fixes the combination of two things:
* Placing a BTI after calls to a returns twice function like setjmp.
  This allows the setjmp to return with a br instead of a ret.
* Straight line speculation mitigations that replace BLR with a BL
  to a thunk that does the mitigation, and then goes to the original
  target.

Originally I marked AArch64call_bti as requiring that SLS mitigation
be disabled. This caused a crash when you tried to codegen with both,
since CALL_BTI tried to match AArch64call_bti but could not.

This change does 2 things:
* Follow the pattern set by AArch64call and add 2 patterns for
  AArch64call_bti: one with no IP (intra-procedure-call) registers
  and one with, for SLS mitigation on and off respectively.
* Modify the sls hardening pass to iterate through bundled
  instructions, as the AArch64 KCFI pass does.

Since there is a 1:1 replacement of the BLR with a BL,
the bundle remains intact. This is checked with an MIR test.

The ir -> asm testing is updated to add runs with the sls
mitigation enabled.

Reviewed By: kristof.beyls, pzheng

Differential Revision: https://reviews.llvm.org/D143915

[Test] Add test for PR60736

Details at https://github.com/llvm/llvm-project/issues/60736

[SimpleLoopUnswitch] Temporarily switch off simple-loop-unswitch-inject-invariant-conditions. PR60736

It caused an assertion failure; it is unclear whether the patch introduced the
bug or merely exposed it. Disabling to investigate. See details at https://github.com/llvm/llvm-project/issues/60736

[mlir] fix LLVM IR translation of vector<... x index>

When the translation was written, `vector<... x index>` was not allowed
at all. After it was added later, the translation was never adapted. It
kept working in the most common case of index-typed attributes using
64-bit storage and being converted to 64-bit integers in LLVM IR, but
not in the other cases that require truncation or extension, producing
wrong results when using the raw data storage of the dense attribute to
construct the LLVM IR constant. When the storage size doesn't match,
fall back to the per-element constant construction, which is slower but
handles bitwidth differences correctly.
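
The failure mode can be sketched as follows, assuming little-endian storage and unsigned, non-negative elements for simplicity; the function names are hypothetical and do not correspond to the MLIR translation code.

```python
import struct
from typing import List

_FMT = {32: "I", 64: "Q"}  # struct codes for unsigned 32-bit and 64-bit integers


def raw_reinterpret(values: List[int], storage_bits: int, target_bits: int) -> List[int]:
    """Reuse the raw storage bytes as if elements already had target_bits."""
    raw = struct.pack(f"<{len(values)}{_FMT[storage_bits]}", *values)
    count = len(raw) * 8 // target_bits
    return list(struct.unpack(f"<{count}{_FMT[target_bits]}", raw))


def per_element(values: List[int], target_bits: int) -> List[int]:
    """Slower path: convert each element to the target width individually."""
    mask = (1 << target_bits) - 1
    return [v & mask for v in values]


same_width = raw_reinterpret([1, 2, 3], storage_bits=64, target_bits=64)
mismatched = raw_reinterpret([1, 2, 3], storage_bits=64, target_bits=32)
fallback = per_element([1, 2, 3], target_bits=32)
```

When the widths match, reusing the raw bytes is correct; with a 64-bit storage and a 32-bit target, the reinterpretation yields interleaved zeros and the wrong element count, while the per-element path keeps the values.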

Fixes #60614.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D143993

[Flang] Fix for Any/All simplification to properly propagate the initial value

When rank > 1, the initial value would be lost on inner loops, leading to the wrong
value being returned. For example, the code below would return T. This patch fixes this to
use the correct initial value in all cases.
```
Integer :: m(0,10)
Any(m .eq. 0)
```
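
A minimal sketch of the intended behaviour, written as a plain nested loop rather than the generated FIR: the accumulator is initialized once and carried through every loop level, so a zero-size array such as `m(0,10)` yields false. The function name is hypothetical.

```python
from typing import List


def any_eq_zero_2d(m: List[List[int]], rows: int, cols: int) -> bool:
    """ANY(m .eq. 0) for a rank-2 array stored as m[i][j]."""
    result = False  # initial value for ANY; it must survive all loop levels
    for j in range(cols):
        for i in range(rows):
            # The inner loop continues from the outer accumulator value
            # instead of starting over.
            result = result or (m[i][j] == 0)
    return result


empty_case = any_eq_zero_2d([], rows=0, cols=10)
```

Here the inner loop never runs, so `empty_case` keeps the initial value.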

Reviewed By: vdonaldson

Differential Revision: https://reviews.llvm.org/D143899

[docs] Update the ACLE URL

[docs] Fix bullet list formatting

reST requires an empty line before a bullet list.

[mlir][linalg] expose convolution dimension classifier

Make available through functions in the `linalg::detail` namespace the
classification of Linalg op dimensions as different kinds (batch, image,
channel, etc) of convolution dimensions. This is useful for identifying
which dimensions to target with transformations.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D143584

[mlir] reallow null results in TransformEachOpTrait

Previous changes in 98acd7468307b6099e7deae206a749af324ff95f were overly
eager to disallow null payload everywhere. The semantics of
TransformEachOpTrait allows individual applications to return null
payloads as means of filtering out the operations to which they are not
applicable without emitting even a silenceable failure. This is a
questionable choice, but one apparently relied upon. Null payloads are
not supposed to leak outside of the trait.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D143904

Use llvm::bit_cast (NFC)

[mlir] Drop unused arith conversion class (NFC)

This commit removes an old helper class for fastmath attribute
conversion that is no longer used.

The last usage of this class was dropped in https://reviews.llvm.org/D137456.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D143912

[include-cleaner] Better support ambiguous std symbols

Special-case them for now: the tooling stdlib library doesn't
support these symbols (the most important one is std::move).

Differential Revision: https://reviews.llvm.org/D143906

[Modules] Don't re-generate template specialization in the importer

Close https://github.com/llvm/llvm-project/issues/60693.

The issue shows that the importer tries to generate the template
specialization again, which is unnecessary and wastes time. This
patch tries to address the problem.

[Tooling][Stdlib][NFC] Reflow comments and strip clang-format pragmas

[AVR] Fix inaccurate offsets in PC relative branch instructions

In avr-gcc, the destination of "rjmp label + offset" is address
'label + offset', while destination of "rjmp . + offset" is
'address_of_rjmp + offset + 2'.

Clang agrees with avr-gcc for "rjmp label + offset", but emits an
incorrect destination for "rjmp . + offset", namely
'address_of_rjmp + offset', in which the expected offset of 2 is missing.

This patch fixes the above issue.
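
The two conventions described above can be written out as simple address arithmetic. This is only a restatement of the avr-gcc behaviour quoted in this message, with hypothetical function names and byte addresses.

```python
def rjmp_label_target(label_addr: int, offset: int) -> int:
    """Destination of "rjmp label + offset"."""
    return label_addr + offset


def rjmp_dot_target(rjmp_addr: int, offset: int) -> int:
    """Destination of "rjmp . + offset", including the extra 2."""
    return rjmp_addr + offset + 2


def rjmp_dot_target_before_fix(rjmp_addr: int, offset: int) -> int:
    """What was emitted before the fix: the extra 2 is missing."""
    return rjmp_addr + offset


expected = rjmp_dot_target(0x100, 4)
buggy = rjmp_dot_target_before_fix(0x100, 4)
```

For an rjmp at 0x100 with offset 4, the expected destination is 0x106, whereas the old behaviour produced 0x104.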

Fixes https://github.com/llvm/llvm-project/issues/60019

Reviewed By: jacquesguan, aykevl

Differential Revision: https://reviews.llvm.org/D143901

Move global namespace cl::opt inside llvm::

AMDGPU: Teach getNegatedExpression about rcp

AMDGPU: Add test for getNegatedExpression with rcp

AMDGPU: Add additional tests for combiner infinite loop

llvm-reduce: Run instruction reduction last

With the current state of mir support, this is going to generate
a large number of verifier errors. Running the use and def
reductions first helps to mitigate the impact of this.

[DAG] Handle build_vector with all undefs in reduceBuildVecTruncToBitCast

While working on D143731 I hit a case where a build_vector with 2 undef operands could be generated (with one undef hidden behind a bitcast).
That made `reduceBuildVecTruncToBitCast` crash because it seems to assume there is at least one good operand.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D143886

[LangRef] Global variable declarations imply minimum size

Adjust the wording added in D78952 to say that global variable
declarations (and interposable definitions) do imply a minimum
size (and alignment) on the global. They just don't imply a
maximum size.

We rely on these semantics in at least two places:

* Global dereferenceability: https://github.com/llvm/llvm-project/blob/2153544865a9733b06579823814c981f735e4201/llvm/lib/IR/Value.cpp#L907
* Global inbounds GEP: https://github.com/llvm/llvm-project/blob/2153544865a9733b06579823814c981f735e4201/llvm/lib/IR/ConstantFold.cpp#L2283

Differential Revision: https://reviews.llvm.org/D143057

[Coroutines] Don't run optimizations for optnone functions

Currently we run two optimizations (rematerialization and sinking
lifetime markers) unconditionally, even if the coroutine is marked as
optnone (O0). This is undesirable. This patch disables these 2
optimizations for optnone functions. An internal experiment shows the change
improves compilation time by 3% in the debug build.

[mlir][llvm] Reintroduce string based attribute setting.

Reintroduce string based attribute setting in the
translation from LLVM dialect to LLVM IR. The TypeSwitch
based implementation introduced in
https://reviews.llvm.org/D143654 does not work for
intrinsics that set the requiresAccessGroup or
requiresAliasScope flag.

Reviewed By: hgreving

Differential Revision: https://reviews.llvm.org/D143936

[PowerPC] Specify the dynamic loader prefix in ppc-float-abi-warning

Ensure the tests do not fail during cross compilation by specifying
the dynamic loader prefix for a GNU installation that is expected to
support IEEE 128-bit long double.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D143736

[LoopDeletion] Simplify. NFC

[docs] Add document for clang-scan-deps -format=p1689

The patches for `clang-scan-deps` have landed, so we need to
document the behavior.

[OpenMP] Fix extra parenthesis in kmp_os.h

Differential Revision: https://reviews.llvm.org/D143940

[ARM] Use llvm::rotl and llvm::rotr (NFC)
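
For reference, a sketch of what a left and right bit rotation computes, fixed to 32-bit values here. This is an illustrative model, not the llvm::rotl / llvm::rotr implementation; `rotl32` and `rotr32` are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    r %= 32
    value &= MASK32
    return ((value << r) | (value >> (32 - r))) & MASK32


def rotr32(value: int, r: int) -> int:
    """Rotate a 32-bit value right by r bits (a left rotation by 32 - r)."""
    return rotl32(value, 32 - (r % 32))


rotated = rotl32(0x12345678, 8)
```

Bits shifted out at one end re-enter at the other, so rotating left and then right by the same amount returns the original value.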

[RISCV] Rename InstFormatCSZN->InstFormatCU to match latest Zcb spec. NFC

This was updated in version 1.0.2 of the spec. Latest version
https://github.com/riscv/riscv-code-size-reduction/releases/download/v1.0.3/Zc-v1.0.3.pdf

[RISCV] Use llvm::rotl (NFC)

Recommit: [NFC][IR] Make Module::getAliasList() private

This reverts commit 6d4a674acbc56458bb084878d82d16e393d45a6b.

[AArch64] Use llvm::rotl and llvm::rotr (NFC)

[bazel] build fix

Differential Revision: https://reviews.llvm.org/D143973

[mlir][affine] Normalize constant valued bound loop

This change aims to resolve the issue reported in https://github.com/llvm/llvm-project/issues/59994.

Calling AffineForOp#setUpperBound and setLowerBound leaves the original affine map structure inconsistent, so it's necessary to keep the original map to build the new induction vars.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D142082

Revert "[NFC][IR] Make Module::getAliasList() private"

This reverts commit b64f7d028bdcaf679130afeed9518c09663f6dc8.

Revert "[DAGCombiner] handle more store value forwarding"

This reverts commit f35a09daebd0a90daa536432e62a2476f708150d.

Causes miscompiles, see D138899

Revert "[DAGCombiner] fix comments for D138899; NFC"

This reverts commit 63854f91d3ee1056796a5ef27753648396cac6ec.

Dependent commit to be reverted.

[NFC][IR] Make Module::getAliasList() private

This patch adds several missing AliasList modifier functions, like
removeAlias(), eraseAlias() and insertAlias().
There is no longer a need to access the list directly, so it also makes
getAliasList() private.

Differential Revision: https://reviews.llvm.org/D143958

[PowerPC][GISel] add support for fpconstant

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D133340

Revert "[mlir] Make the vast majority of integration and runner tests work on Windows"

This reverts commit 161b9d741a3c25f7bd79620598c5a2acf3f0f377.

REASON:

cmake --build . --target check-mlir-integration

Failed Tests (186):
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-addi-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-cmpi-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-compare-results-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-constants-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-max-min-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-muli-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-shli-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-shrsi-i16.mlir
  MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-shrui-i16.mlir
  MLIR :: Integration/Dialect/Async/CPU/microbench-linalg-async-parallel-for.mlir
  MLIR :: Integration/Dialect/Async/CPU/microbench-scf-async-parallel-for.mlir
  MLIR :: Integration/Dialect/Async/CPU/test-async-parallel-for-1d.mlir
  MLIR :: Integration/Dialect/Async/CPU/test-async-parallel-for-2d.mlir
  MLIR :: Integration/Dialect/Complex/CPU/correctness.mlir
  MLIR :: Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm-vector.mlir
  MLIR :: Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm.mlir
  MLIR :: Integration/Dialect/LLVMIR/CPU/test-vector-reductions-fp.mlir
  MLIR :: Integration/Dialect/LLVMIR/CPU/test-vector-reductions-int.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/matmul-vs-matvec.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/rank-reducing-subview.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-collapse-tensor.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-conv-1d-call.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-conv-1d-nwc-wcf-call.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-conv-2d-call.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-conv-2d-nhwc-hwcf-call.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-conv-3d-call.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-conv-3d-ndhwc-dhwcf-call.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-elementwise.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-expand-tensor.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-one-shot-bufferize.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-padtensor.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-subtensor-insert-multiple-uses.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-subtensor-insert.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-tensor-e2e.mlir
  MLIR :: Integration/Dialect/Linalg/CPU/test-tensor-matmul.mlir
  MLIR :: Integration/Dialect/Memref/cast-runtime-verification.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/concatenate.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/dense_output.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/dense_output_bf16.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/dense_output_f16.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_abs.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_codegen_dim.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_codegen_foreach.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_complex32.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_complex64.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_complex_ops.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_constant_to_sparse_tensor.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_dyn.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_ptr.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_dot.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_expand.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_file_io.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_flatten.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_foreach_slices.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_index.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_insert_1d.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_insert_2d.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_insert_3d.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_matmul.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_mttkrp.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_out_mult_elt.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_out_reduction.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_out_simple.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_re_im.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reduce_custom.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reduce_custom_prod.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reductions.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reductions_prod.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reshape.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_scale.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_scf_nested.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_select.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sign.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sorted_coo.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_storage.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum_bf16.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum_c32.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum_f16.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_tanh.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_tensor_mul.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_tensor_ops.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_transpose.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
  MLIR :: Integration/Dialect/SparseTensor/python/test_SDDMM.py
  MLIR :: Integration/Dialect/SparseTensor/python/test_SpMM.py
  MLIR :: Integration/Dialect/SparseTensor/python/test_elementwise_add_sparse_output.py
  MLIR :: Integration/Dialect/SparseTensor/python/test_output.py
  MLIR :: Integration/Dialect/SparseTensor/python/test_stress.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_MTTKRP.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_SDDMM.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_SpMM.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_SpMV.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_Tensor.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_scalar_tensor_algebra.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_simple_tensor_algebra.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_tensor_complex.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_tensor_types.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_tensor_unary_ops.py
  MLIR :: Integration/Dialect/SparseTensor/taco/test_true_dense_tensor_algebra.py
  MLIR :: Integration/Dialect/SparseTensor/taco/unit_test_tensor_core.py
  MLIR :: Integration/Dialect/SparseTensor/taco/unit_test_tensor_io.py
  MLIR :: Integration/Dialect/SparseTensor/taco/unit_test_tensor_utils.py
  MLIR :: Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir
  MLIR :: Integration/Dialect/Standard/CPU/test_subview.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-mulf-full.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-mulf.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-muli-ext.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-muli-full.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-muli.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-tilezero-block.mlir
  MLIR :: Integration/Dialect/Vector/CPU/AMX/test-tilezero.mlir
  MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-dot.mlir
  MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-inline-asm-vector-avx512.mlir
  MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-mask-compress.mlir
  MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-rsqrt.mlir
  MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-sparse-dot-product.mlir
  MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-vp2intersect-i32.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-0-d-vectors.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-broadcast.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-compress.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-constant-mask.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-contraction.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-create-mask-v4i1.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-create-mask.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-expand.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-extract-strided-slice.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-flat-transpose-col.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-flat-transpose-row.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-fma.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-gather.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-index-vectors.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-insert-strided-slice.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-maskedload.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-maskedstore.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-matrix-multiply-col.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-matrix-multiply-row.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-outerproduct-f32.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-outerproduct-i64.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-print-int.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-realloc.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f32-reassoc.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f32.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f64-reassoc.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f64.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-i32.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-i4.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-i64.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-si4.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-reductions-ui4.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-scan.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-scatter.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-shape-cast.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-shuffle.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-sparse-dot-matvec.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-sparse-saxpy-jagged-matvec.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read-1d.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read-2d.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read-3d.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transfer-to-loops.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transfer-write.mlir
  MLIR :: Integration/Dialect/Vector/CPU/test-transpose.mlir

Testing Time: 0.29s
  Unsupported:  31
  Passed     :   5
  Failed     : 186

Differential Revision: https://reviews.llvm.org/D143970

[flang] Disable libc++ assertions in the runtime library

Similar to D143168. Fixes a link error caused by D143612.

Error info:
```
flang-new -flang-experimental-exec hello.f90
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
llvm-project/build/lib/libFortranRuntime.a(unit.cpp.o): in function `std::__1::optional<unsigned long>::operator*[abi:v170000]() &':
unit.cpp:(.text._ZNRSt3__18optionalImEdeB7v170000Ev[_ZNRSt3__18optionalImEdeB7v170000Ev]+0x4f): undefined reference to `std::__1::__libcpp_verbose_abort(char const*, ...)'
/usr/bin/ld: llvm-project/build/lib/libFortranRuntime.a(unit.cpp.o): in function `void std::__1::__optional_storage_base<long, false>::__construct[abi:v170000]<unsigned long>(unsigned long&&)':
unit.cpp:(.text._ZNSt3__123__optional_storage_baseIlLb0EE11__constructB7v170000IJmEEEvDpOT_[_ZNSt3__123__optional_storage_baseIlLb0EE11__constructB7v170000IJmEEEvDpOT_]+0x55): undefined reference to `std::__1::__libcpp_verbose_abort(char const*, ...)'
```

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D143890

[flang] Catch repeated BIND(C) attribute specifications for a symbol

A BIND(C) attribute statement or type-declaration-stmt attribute, just
like most attributes, can only appear once. Name resolution was excluding
the BIND(C) attribute from its check for duplicated attributes, but I
don't see a reason that remains to do so.

Differential Revision: https://reviews.llvm.org/D143834

[Clang][RISCV] Guard vector float16 type correctly with semantic analysis

Before this commit, vector float16 types (e.g. `vfloat16m1_t`) of RVV
were only defined when the extension `zvfh` was defined. However, this
generates inaccurate diagnostics like:

```
error: unknown type name 'vfloat16m1_t'
```

This commit improves the diagnostics by guarding the type check correctly
during semantic analysis.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143657

[mlir][NFC] Remove unused variable 'indexType' in GPUTransformOps.cpp

/data/jiefu/llvm-project/mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp:430:13: error: unused variable 'indexType' [-Werror,-Wunused-variable]
IndexType indexType = rewriter.getIndexType();
^
1 error generated.

[mlir][gpu] NFC change to pass threadID ops to rewriteOneForeachThreadToGpuThreads

This allows the user to specify both the thread IDs and the dimensions of the threads we want to distribute on.
This means we can use it to distribute on warps as well.

Reviewed By: harsh

Differential Revision: https://reviews.llvm.org/D143950

[flang] Check for invalid BIND(C) names

Require BIND(C) interoperable names to be valid C identifiers.

Differential Revision: https://reviews.llvm.org/D143833
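
As a rough illustration of the rule above (not flang's actual implementation,
which is written in C++), a check for a valid C identifier might look like the
sketch below. The function name `is_c_identifier` is hypothetical, and the
sketch assumes plain ASCII identifiers without extensions such as `$`.

```python
import re

# Assumption: a C identifier is an ASCII letter or underscore followed by
# ASCII letters, digits, or underscores. Compiler extensions are ignored.
_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_c_identifier(name: str) -> bool:
    """Return True if name has the shape of a C identifier."""
    return _C_IDENTIFIER.fullmatch(name) is not None
```

Under this sketch a name such as `foo_1` passes, while `1abc` or `a-b` does not.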

[flang] Check for non-interoperable intrinsic types in BIND(C) derived types

Every component of a BIND(C) interoperable derived type must have an
interoperable type. Semantics was checking components with derived types,
but not components with intrinsic types.

Differential Revision: https://reviews.llvm.org/D143832

[libc++][NFC] Remove duplicated line from `Cxx20Issues.csv`

[flang] Pointers returned from functions are not definable as pointers

A reference to a pointer-valued function is a "variable" in the argot of
the Fortran standard, and can be the left-hand side of an assignment
statement or passed as a definable actual argument -- but it is not a
definable pointer, and cannot be associated with a pointer dummy argument
that is not INTENT(IN).

Differential Revision: https://reviews.llvm.org/D143827

[flang] Respect inaccessibility of type-bound ASSIGNMENT(=)

When a derived type has a PRIVATE type-bound generic binding for
a defined ASSIGNMENT(=), don't use it in scopes outside of the
module that defines the type. We already get this case right
for other type-bound generics, including defined operators,
and for non-type-bound generic interfaces, but the check was
not applied for this case.

Differential Revision: https://reviews.llvm.org/D143826

Revert "[lldb] Use portable format string PRIx64"

This reverts commit be7d7ca1101840fc8e19e0e48f9dc395da569d23.

This commit was made to fix be7d7ca1101840fc8e19e0e48f9dc395da569d23
which got reverted in 620b3d9ba3343d7bc5bab2340174a20952fcd00f. We need
to revert this commit as well because types in log statements are 32 bit again.

[Fuchsia] Add FUCHSIA_ENABLE_LLDB option.

This CMake option builds/installs LLDB as part of the Fuchsia toolchain.
Once this is better supported, the effects of this will be inlined into
the toolchain cache file.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D143794

[flang] Warn about dangerous TRANSFER()

When the source or mold of a reference to the intrinsic function TRANSFER()
has a derived type with a direct component that contains a descriptor,
such as an allocatable or a pointer, emit a warning. User programs
should never access descriptors directly.

Differential Revision: https://reviews.llvm.org/D143823

[TLS] Added a LangRef entry wrt the module flag MaxTLSAlign.

The module flag was introduced with commit 5d07e0448e38d4be0.

[flang] Catch obscure structure constructor error

A scalar value in a structure constructor may correspond to an
array component in the derived type only when that component has
a shape to which the scalar value may be expanded.

Differential Revision: https://reviews.llvm.org/D143822

Find SDK path more lazily in Apple Simulator platforms

In https://reviews.llvm.org/D122373 I delayed the search for
the SDK filepath until the simulator platform is Created.
In the qProcessInfo binary-addresses key, I have to force-Create
every platform to find one that can handle a kernel fileset;
this forced all of the simulator platforms to create, taking the
SDK filepath discovery perf hit.

This patch delays that path search further until the Apple
Simulator platform calls a method that actually needs the full
filepath; it saves the SDK name ("WatchSimulator.sdk" etc) until
it needs to expand it.

Differential Revision: https://reviews.llvm.org/D143932
rdar://103380717

[scudo] Call getStats when the region is exhausted

Because of lock contention, we temporarily disabled the printing of a
region's status when the region is exhausted. Given that this output is
useful when a Region OOM happens, this CL brings it back without lock
contention.

Differential Revision: https://reviews.llvm.org/D141955

[scudo] Calling getStats requires holding lock

We didn't acquire the mutex while accessing the lock-protected data.
This CL fixes that, so we no longer need to disable the allocator while
reading its state.

Differential Revision: https://reviews.llvm.org/D142149

[flang] Conform with standard (mostly) for character length mismatches on arguments

Fortran 2018 defines some flavors of dummy arguments to require exact
matching of character lengths between dummy and actual arguments;
these situations tend to be those in which the interface must be
explicit and a descriptor is involved: assumed shape, assumed rank,
allocatable, and pointer.

Fortran allows an actual argument in other cases to have a longer
length than the dummy argument; as a common extension, we support a
shorter actual argument as well by means of blank padding, but should
emit a warning.

Differential Revision: https://reviews.llvm.org/D143821

[mlir][sparse] Extend readCOOIndices to support overhead types beyond index_type.

This is to prepare for implementing the C API for reading a COO tensor to the
given buffers for indices and values.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D143861

[flang] Diagnose REPEAT with negative NCOPIES=

Emit an error when the NCOPIES= argument to the intrinsic function
REPEAT has a negative value. (The current implementation crashes, which
isn't informative.)

Differential Revision: https://reviews.llvm.org/D143820

[mlir][sparse] fix a memory leak when converting from a tensor slice

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D143929

[libc] Write stub files to a new directory to avoid conflicts

Summary:
This hack with stub files is used to make the final object archive
have human-understandable names. We currently output these into the
current binary directory, which sometimes interferes with the actual
source file. Put these in their own directory to be certain they don't
conflict.

[flang][build] Fix build issue reported on recent commit

A compiler (not specified) was reported to issue an error on a
"default:" clause in a switch statement whose cases cover all of
the values of an "enum class". Since other compilers/versions
are known to complain in the other direction, change the switch
statement to a cascade of ifs.

Revert "[LLDB] Enable 64 bit debug/type offset"

This reverts commit f36fe009c0fc1d655bfc6168730bedfa1b36e622.

[OpenMP] Add check for target allocator regardless of the availability of libmemkind

The current runtime implementation checks for the target allocator only when
libmemkind is not available. This patch adds the check regardless of whether
the libmemkind library is present.

Differential Revision: https://reviews.llvm.org/D142582

[mlir][tosa] Enable `apply_scale` unrolling

Make `tosa.apply_scale` implement `VectorUnrollOpInterface` so that we
can unroll it.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D143944

[Release] Produce bolt tarball

Source tarballs are used by some distributions to build the project.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D143809

[lldb] Use portable format string PRIx64

[runtimes] Set LLVM_ENABLE_PER_TARGET_RUNTIME_DIR_default to ON for OS390

Reviewed By: abhina.sreeskantharajan, zibi

Differential Revision: https://reviews.llvm.org/D143916

libclc: remove sqrt/rsqrt from clspv SOURCES

https://reviews.llvm.org/D134040

Patch by: Aaron Greig <aaron.greig@codeplay.com>

[VPlan] Use properlyDominates predicate for ordering FOR users.

The current implementation may return true for both A < B and B < A, which
may cause issues if the sort implementation relies on or checks the
strict-weak-ordering property of the comparator. This should fix a crash
with MSVC.
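
To illustrate the problem in general terms (this is not the VPlan code), the
sketch below checks a comparator for asymmetry, one of the requirements of a
strict weak ordering: `comp(a, b)` and `comp(b, a)` must never both be true.
The helper name `violates_asymmetry` and the integer example are assumptions
made for illustration only.

```python
from itertools import product
import operator


def violates_asymmetry(comp, items):
    """Return True if comp(a, b) and comp(b, a) both hold for some pair."""
    return any(comp(a, b) and comp(b, a) for a, b in product(items, repeat=2))


values = [1, 2, 3]
# A reflexive relation such as <= holds in both directions for equal items.
print(violates_asymmetry(operator.le, values))
# A strict relation such as < never holds in both directions.
print(violates_asymmetry(operator.lt, values))
```

A reflexive relation fails the check while its strict counterpart passes, which
is the general reason a "properly" ordered predicate is the safer choice for a
sort comparator.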

[mlir] Make the vast majority of integration and runner tests work on Windows

This patch contains the changes required to make the vast majority of integration and runner tests run on Windows.
Historically speaking, the JIT support for Windows has been lagging behind, but recent versions of ORC JIT have now caught up and work for basically all examples in the repo.

Sadly, because these tests previously did not work on Windows, basically all of them make Unix-like assumptions about things like filenames, paths, shell syntax, etc.
This patch fixes all these issues in one fell swoop and enables Windows support for the vast majority of integration tests.

More specifically, the following changes had to be made:
* The various JIT runners used paths to the runtime libraries that assumed a Unix toolchain layout and filenames. I abstracted away the specific path and filename of these runtime libraries by passing the paths from cmake into lit. This now also allows a much more convenient syntax: `--shared-libs=%mlir_c_runner_utils` instead of `--shared-libs=%mlir_lib_dir/lib/libmlir_c_runner_utils%shlibext`
* Some tests using python set environment variables using the `ENV=VALUE cmd` format. This works on Unix, but on Windows it has to be prefixed with `env`: `env ENV=VALUE cmd`
* Some tests used C functions that are simply not available or exported on Windows (`fabsf`, `aligned_alloc`). These tests have either been adjusted or explicitly marked as `UNSUPPORTED`

Some tests remain disabled on Windows as before:
* In SparseTensor some tests have non-trivial logic for finding the runtime libraries which seems to be required for the use of emulators. I do not have the time to port these so I simply kept them disabled
* Some tests requiring special hardware which I simply cannot test remain disabled on Windows. These include usage of AVX512 or AMX

The tests for `mlir-vulkan-runner` and `mlir-spirv-runner` all work now as well and so do the vast majority of `mlir-cpu-runner`.

Differential Revision: https://reviews.llvm.org/D143925

[mlir][SPIRVToLLVM] Add pass option to emit opaque-pointers

Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

This patch adds the pass option and required changes to the patterns to support the emission of LLVM IR opaque pointers. Given how close SPIRV semantics are to LLVM IR semantics this boils down to just a few changes:
* Making sure that GEP and alloca are built with the explicit base pointer type
* creating opaque pointers instead of typed pointers if requested
* omitting pointer to pointer bitcasts

Differential Revision: https://reviews.llvm.org/D143900

[mlir][LLVM] Verify correct pointer casts with `llvm.bitcast`

`llvm.bitcast` has so far not had a verifier and this allowed various bugs to sneak into the codebase (including within tests!) which could only be caught once translated to actual LLVM IR. This patch fixes those problematic cases by now verifying bitcasts on pointers are done correctly.

Specifically, it verifies that if pointers are involved, both the result and source types are pointers (this also applies to vectors of pointers), and that pointer casts stay within the same address space.

The only thing left unverified is the general case of "source type size does not match result type size". I think this case is less trivial and more prone to false positives, so I did not yet implement it.

Differential Revision: https://reviews.llvm.org/D143868
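
The rules described above can be sketched with a toy type model. This is only
an illustration of the stated checks, not the MLIR verifier: the `Ty` class,
the helper names, and the error strings are all hypothetical, and the rejection
of casts between a scalar pointer and a vector of pointers is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ty:
    """Toy type model: kind is "int", "ptr", or "vec" (hypothetical)."""
    kind: str
    addrspace: int = 0            # only meaningful for kind == "ptr"
    elem: Optional["Ty"] = None   # element type, only for kind == "vec"


def pointer_part(ty: Ty) -> Optional[Ty]:
    """Return the pointer type in ty (a pointer or vector of pointers)."""
    if ty.kind == "ptr":
        return ty
    if ty.kind == "vec" and ty.elem is not None and ty.elem.kind == "ptr":
        return ty.elem
    return None


def bitcast_error(src: Ty, dst: Ty) -> Optional[str]:
    """Return a made-up error message, or None if the cast passes."""
    src_ptr, dst_ptr = pointer_part(src), pointer_part(dst)
    if src_ptr is None and dst_ptr is None:
        return None  # no pointers involved; size checks are left out
    if src_ptr is None or dst_ptr is None:
        return "pointers can only be cast from and to pointers"
    if (src.kind == "vec") != (dst.kind == "vec"):
        # Assumption: scalar pointer <-> vector of pointers is rejected.
        return "cannot mix a pointer and a vector of pointers"
    if src_ptr.addrspace != dst_ptr.addrspace:
        return "cannot cast pointers of different address spaces"
    return None
```

For example, `bitcast_error(Ty("ptr", addrspace=1), Ty("ptr"))` reports an
address space mismatch, while a cast between two pointers in address space 0
passes.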

[mlir][GPU] add required address space cast when lowering to LLVM

The runtime functions `memset` and `memcpy` are declared with pointers to the default address space (0), while their ops are compatible with memrefs in any address space.
Such cases do not cause any issues with MLIR's LLVM Dialect because the `bitcast` verifier is too lenient at the moment, but actual LLVM IR does not allow casting between address spaces using `bitcast`: https://godbolt.org/z/3a1z97rc9

This patch fixes the issue by inserting an address space cast before the bitcast, to first cast the pointer into the correct address space before doing the bitcast.

Differential Revision: https://reviews.llvm.org/D143866

[mlir][tosa] Fix segmentation fault in case of folding unranked tensor

Trying to fold the unranked tensor for "tosa.equal" crashes due to null reference.
We need to check the dynamic cast result beforehand. This is reported in
https://github.com/llvm/llvm-project/issues/60192.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D143034

[LLDB] Enable 64 bit debug/type offset

This came out of https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where the .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.

The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
The second part is to enable 64-bit offset support in LLDB with
this patch.

Depends on D139955

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D138618

[libc][bazel] Update math function unit tests' dependency computation.

[flang] Warn on mismatched DATA substring sizes rather than crashing

When a DATA statement initializes a substring with a character constant
of the wrong length, do the right thing with blank padding or truncation,
and emit a warning. Current code is crashing due to an unhandled error
reported from the low-level data image initialization framework.

Differential Revision: https://reviews.llvm.org/D143819
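
The padding and truncation behavior described above can be sketched as
follows. This is a simplified model, not flang's data initialization code; the
function name `fit_to_length` and its return shape are assumptions made for
illustration.

```python
def fit_to_length(value: str, length: int):
    """Pad with blanks or truncate value to exactly length characters.

    Returns (result, mismatched); mismatched is True when the lengths
    differed, which is the case where a warning would be appropriate.
    """
    mismatched = len(value) != length
    return value[:length].ljust(length), mismatched
```

A two-character constant assigned to a four-character substring is padded with
two blanks, a six-character constant is cut to four characters, and both cases
set the mismatch flag.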

[lld][WebAssembly] Limit size of shared 64-bit memories to 2^34

This is the current limit in V8. See
https://github.com/WebAssembly/memory64/issues/33 for how we might change
this in the future.

Differential Revision: https://reviews.llvm.org/D143783
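
For a sense of scale, the arithmetic below converts the limit into WebAssembly
pages and GiB. It assumes the limit is expressed in bytes; the constant names
are made up for this sketch.

```python
WASM_PAGE_SIZE = 64 * 1024      # bytes per WebAssembly page (64 KiB)
MAX_SHARED_MEMORY64 = 2 ** 34   # limit from this commit, assumed to be bytes

max_pages = MAX_SHARED_MEMORY64 // WASM_PAGE_SIZE
max_gib = MAX_SHARED_MEMORY64 // 2 ** 30
print(max_pages, max_gib)  # prints "262144 16"
```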

[libc] Add a loader utility for AMDHSA architectures for testing

This is the first attempt to get some testing support for GPUs in LLVM's
libc. We want to be able to compile for and call generic code while on
the device. This is difficult as most GPU applications also require the
support of large runtimes that may contain their own bugs (e.g. CUDA /
HIP / OpenMP / OpenCL / SYCL). The proposed solution is to provide a
"loader" utility that allows us to execute a "main" function on the GPU.

This patch implements a simple loader utility targeting the AMDHSA
runtime called `amdhsa_loader` that takes a GPU program as its first
argument. It will then attempt to load a predetermined `_start` kernel
inside that image and launch execution. The `_start` symbol is provided
by a `start` utility function that will be linked alongside the
application. Thus, this should allow us to run arbitrary code on the
user's GPU with the following steps for testing.

```
clang++ Start.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -ffreestanding -nogpulib -nostdinc -nostdlib -c
clang++ Main.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -nogpulib -nostdinc -nostdlib -c
clang++ Start.o Main.o --target=amdgcn-amd-amdhsa -o image
amdhsa_loader image <args, ...>
```

We determine the `-mcpu` value using the `amdgpu-arch` utility provided
either by `clang` or `rocm`. If `amdgpu-arch` isn't found or returns an
error we shouldn't run the tests as the machine does not have a valid
HSA compatible GPU. Alternatively we could make this utility in-source
to avoid the external dependency.
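
The detection logic described above could look roughly like the sketch below.
It is not the code from this patch: the function name `detect_gpu_arch` is
hypothetical, and the sketch assumes that `amdgpu-arch` prints one
architecture name per detected GPU and exits with a non-zero status on error.

```python
import shutil
import subprocess
from typing import Optional


def detect_gpu_arch(tool: str = "amdgpu-arch") -> Optional[str]:
    """Return the first architecture reported by tool, or None.

    None means the tool is missing, failed, or reported nothing, in
    which case the GPU tests should be skipped.
    """
    path = shutil.which(tool)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    archs = result.stdout.split()
    return archs[0] if archs else None
```

A test driver would pass the returned value as `-mcpu=<arch>` and skip the
GPU tests whenever the function returns `None`.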

This patch provides a single test for this utility that simply checks
to see if we can compile an application containing a simple `main`
function and execute it.

The proposed solution in the future is to create an alternate
implementation of the LibcTest.cpp source that can be compiled and
launched using this utility. This approach should allow us to use the
same test sources as the other applications.

This is primarily a prototype, suggestions for how to better integrate
this with the existing LibC infrastructure would be greatly appreciated.
The loader code should also be cleaned up somewhat. An implementation
for NVPTX will need to be written as well.

Reviewed By: sivachandra, JonChesterfield

Differential Revision: https://reviews.llvm.org/D139839