review.tizen.org Git - platform/upstream/llvm.git/log

[NFC][sanitizer] Move GET_MALLOC_STACK_TRACE closer to the use

[ComprehensiveBufferize] Fix a warning

This patch fixes:

  mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp:2240:13:
  error: unused function 'checkAliasInfoConsistency'
  [-Werror,-Wunused-function]

[NFC][sanitizer] Make const PointerIsMine and FromPrimary

[lldb/Plugins] Refactor ScriptedThread register context creation

This patch changes the ScriptedThread class to create the register
context when Process::RefreshStateAfterStop is called rather than
doing it in the thread constructor.

This is required to update the thread state for execution control.

Differential Revision: https://reviews.llvm.org/D112167

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

Correct handling of the 'throw()' exception specifier in C++17.

Per C++17 [except.spec], 'throw()' has become equivalent to
'noexcept', and should therefore call std::terminate, not
std::unexpected.

Differential Revision: https://reviews.llvm.org/D113517

[mlir][tosa] Add lowering for tosa.pad with explicit value

New TOSA pad operation can support explicitly specifying the pad value. Added
lowering to linalg that uses the explicit value.

Differential Revision: https://reviews.llvm.org/D113515

Emit hidden hostcall argument for sanitized kernels

this patch - https://reviews.llvm.org/D110337 changes the way how hostcall
hidden argument is emitted for printf, but the sanitized kernels also use
hostcall buffer to report a error for invalid memory access, which is not
handled by the above patch and it leads to vdi runtime error:

Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT:
Agent attempted to access an inaccessible address. code: 0x2b

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D112820

[InstCombine] Support splat vectors in some or of icmp folds

Replace m_ConstantInt() with m_APInt() in order to support splat
constants in addition to scalar integers.

[CUDA][HIP] Allow comdat for kernels

Two identical instantiations of a template function can be emitted by two TU's
with linkonce_odr linkage without causing duplicate symbols in linker. MSVC
also requires these symbols be in comdat sections. Linux does not require
the symbols in comdat sections to be merged by linker but by default
clang puts them in comdat sections.

If a template kernel is instantiated identically in two TU's. MSVC requires
that them to be in comdat sections, otherwise MSVC linker will diagnose them as
duplicate symbols. However, currently clang does not put instantiated template
kernels in comdat sections, which causes link error for MSVC.

This patch allows putting instantiated template kernels into comdat sections.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D112492

[InstCombine] Support splat vectors in some and of icmp folds

Replace m_ConstantInt() with m_APInt() to support splat vectors
in addition to scalar integers.

[InstCombine] Add vector variants to merge-icmps.ll (NFC)

And regenerate test checks.

[Clang] Pass -z rel to linker for Fuchsia

Fuchsia already supports the more compact relocation format.
Make it the default.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D113136

[InstCombine] Strip offset when folding and/or of icmps

When folding and/or of icmps, look through add of a constant and
adjust the icmp range instead. Effectively, this decomposes
X + C1 < C2 style range checks back into a normal range. This allows
us to fold comparisons involving two range checks or one range check
and some other condition. We had a fold for a really specific case
of this (or of range check and eq, and only one one side!) while
this handles it in fully generality.

Differential Revision: https://reviews.llvm.org/D113510

[compiler-rt] Fix typo in DeadlockDetector (chanding->changing)

[LLDB][NFC] Fix test that broke due to libc++ std::vector changes

D112976 moved most of the guts of __vector_base into vector, this broke
some LLDB tests by changing the result types that LLDB sees. This updates
the test to reflect the new structure.

[libc++] Fix segmentation fault in __do_put_integral

6 chars are not sufficient to represent all formats for 64 bit integers.

This was accidentally introduced in commit b889cbf36635a302f5b77560f1769178f196c2c7 (https://reviews.llvm.org/D112830).

This causes failures in downstream projects, for example:

* https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40817
* https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40841

Differential Revision: https://reviews.llvm.org/D113600

[MLIR][Affine][NFC] affine.store op verifier message fix and check

Fix typo in affine.store op verifier message and test case.

Differential Revision: https://reviews.llvm.org/D113360

Revert "[clang] Add early exit when checking for const init of arrays."

This reverts commit 48bb5f4cbe8d5951c1153e469dc6713a122b7fa3.

Several breakages, including ARM (fixed later, but not sufficient) and
MSan (to be diagnosed later).

Differential Revision: https://reviews.llvm.org/D113599

[RISCV] Prevent bad legalizer behavior when bitcasting fixed vectors to i64 on RV32 with Zve32.

Similar to D113219, we need to make sure we don't create a vXi64
vector when it isn't legal. This fixes an error found by an
expensive checks build.

[debugserver] Remove varaible `ldb_set` which is set but not used.

Differential revision: https://reviews.llvm.org/D113598

[X86][Costmodel] `getReplicationShuffleCost()`: implement cost model for 8 bit-wide elements with AVX512VBMI

VBMI introduced VPERMB, so cost-model i8 replication shuffle using it.
Note that we can still model i8 replication shufflle without VBMI,
by promoting to i16/i32. That will be done in follow-ups.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113479

[X86][Costmodel] `getReplicationShuffleCost()`: implement cost model for 16 bit-wide elements with AVX512BW

BWI introduced VPERMW, so cost-model i16 replication shuffle using it.
Note that we can still model i16 replication shufflle without BWI,
by promoting to i32. That will be done in follow-ups.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113478

[X86][Costmodel] `getReplicationShuffleCost()`: implement cost model for 32/64 bit-wide elements with AVX512F

This models lowering to `vpermd`/`vpermq`/`vpermps`/`vpermpd`,
that take a single input vector and a single index vector,
and are cross-lane. So far i haven't seen evidence that
replication ever results in demanding more than a single
input vector per output vector.

This results in *shockingly* lesser costs :)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113350

tosa-make-broadcatable pass now supports numpy style broadcasting only.

- fix bug that in [c,1] + [a, b, c, d] broadcast
- add test [3,3,4,1] + [4,5]

Signed-off-by: Kevin Cheng <kevin.cheng@arm.com>
Change-Id: Iaed2f04df8775f655c82c740271395274163d147

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113596

Replace include by forward declaration in test case

This should fix the remaining buildbot issue for the misleading identifier
detection pass.

Differential Revision: https://reviews.llvm.org/D112914

[fir] Remove `fir.unbox` operation

`fir.unbox` operation is an old operation that is no longer required.
There are couple of other operations that can be used to extract
information from a `fir.box` such as `fir.box_rank`, `fir.box_addr`,
`fir.box_dims`.

This was found during the upstreaming process.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D113581

[LLDB][Breakpad] Make lldb understand INLINE and INLINE_ORIGIN records in breakpad.

Teach LLDB to understand INLINE and INLINE_ORIGIN records in breakpad.
They have the following formats:
```
INLINE inline_nest_level call_site_line call_site_file_num origin_num [address size]+
INLINE_ORIGIN origin_num name
```
`INLNIE_ORIGIN` is simply a string pool for INLINE so that we won't have
duplicated names for inlined functions and can show up anywhere in the symbol
file.
`INLINE` follows immediately after `FUNC` represents the ranges of momery
address that has functions inlined inside the function.

Differential Revision: https://reviews.llvm.org/D113330

[lldb/test] Skip TestScriptedProcess when using system's debugserver (NFC)

Because TestScriptedProcess.py creates a skinny corefile to provides data
to the ScriptedProcess and ScriptedThread, we need to make sure that the
debugserver used is not out of tree, to ensure feature availability
between debugserver and lldb.

This also removes the `SKIP_SCRIPTED_PROCESS_LAUNCH` env variable after
each test finish running.

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[AArch64] Add missing tests for i8 vector to half conversions.

[LLDB][Breakpad] Create a function for each compilation unit.

Since every FUNC record (in breakpad) is a compilation unit, creating the
function for the CU allows `ResolveSymbolContext` to resolve
`eSymbolContextFunction`.

Differential Revision: https://reviews.llvm.org/D113163

Revert "[lldb] Disable minimal import mode for RecordDecls that back FieldDecls"

This reverts commit 3bf96b0329be554c67282b0d7d8da6a864b9e38f.

It causes crashes as reported in PR52257 and a few other places. A reproducer is bundled with this commit to verify any fix forward. The original test is left in place, but marked XFAIL as it now produces the wrong result.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D113449

[lldb] make it easier to find LLDB's python

It is surprisingly difficult to write a simple python script that
can reliably `import lldb` without failing, or crashing.   I'm
currently resorting to convolutions like this:

    def find_lldb(may_reexec=False):
if prefix := os.environ.get('LLDB_PYTHON_PREFIX'):
if os.path.realpath(prefix) != os.path.realpath(sys.prefix):
raise Exception("cannot import lldb.\n"
f"  sys.prefix should be: {prefix}\n"
f"  but it is: {sys.prefix}")
else:
line1, line2 = subprocess.run(
['lldb', '-x', '-b', '-o', 'script print(sys.prefix)'],
encoding='utf8', stdout=subprocess.PIPE,
check=True).stdout.strip().splitlines()
assert line1.strip() == '(lldb) script print(sys.prefix)'
prefix = line2.strip()
os.environ['LLDB_PYTHON_PREFIX'] = prefix

if sys.prefix != prefix:
if not may_reexec:
raise Exception(
"cannot import lldb.\n" +
f"  This python, at {sys.prefix}\n"
f"  does not math LLDB's python at {prefix}")
os.environ['LLDB_PYTHON_PREFIX'] = prefix
python_exe = os.path.join(prefix, 'bin', 'python3')
os.execl(python_exe, python_exe, *sys.argv)

lldb_path = subprocess.run(['lldb', '-P'],
check=True, stdout=subprocess.PIPE,
encoding='utf8').stdout.strip()

sys.path = [lldb_path] + sys.path

This patch aims to replace all that with:

  #!/usr/bin/env lldb-python
  import lldb
  ...

... by adding the following features:

* new command line option: --print-script-interpreter-info.  This
   prints language-specific information about the script interpreter
   in JSON format.

* new tool (unix only): lldb-python which finds python and exec's it.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D112973

[x86] simplify code; NFC

We bail out if the types don't match, so it's clearer to
have a single variable to show that common type.

[x86] fix formatting; NFC

[NFC][llvm][M68k] Inclusive language: reword comment

Rewording the comment to avoid the use of blacklist.

[clang] Fix armv7-quick build by hardcoding -triple=x86_64 in OOM test.

The test was added in 48bb5f4cbe8d5951c1153e469dc6713a122b7fa3 and it
creates a very large array, which is too large for ARM. Making the array
smaller is not a good option, since we would need a very low ulimit and
could then hit it for other reasons.

Should fix the https://lab.llvm.org/buildbot/#/builders/171/builds/5851
failure.

Differential Revision: https://reviews.llvm.org/D113583

[CGSCC][LazyCallGraph][NFC] Fix typos in code comments

[InstCombine] Relax and reorganize one use checks in the ~(a | b) & c

Since there is just a single check for LHS in ~(A | B) & C | ...
transforms and multiple RHS checks inside with more coming I am
removing m_OneUse checks for LHS and adding new checks for RHS.
This is non essential as long as there is total benefit.

In addition (~(A | B) & C) | (~(A | C) & B) --> (B ^ C) & ~A
checks were overly restrictive, it should be good without any
additional checks.

Differential Revision: https://reviews.llvm.org/D113141

[mlir] Make topologicalSort iterative and consider op regions

When doing topological sort we need to make sure an op is scheduled before any
of the ops within its regions.
Also change the algorithm to not be recursive in order to prevent potential
stack overflow.

Differential Revision: https://reviews.llvm.org/D113423

[libc++][AIX] Alignment of bool on AIX is 1

Update test so that we check for a 1 byte alignment on AIX PPC32.

Differential Revision: https://reviews.llvm.org/D112087

[mlir][nvvm] Remove special case ptr arithmetic lowering in gpu to nvvm

Use existing helper instead of handling only a subset of indices lowering
arithmetic. Also relax the restriction on the memref rank for the GPU mma ops
as we can now support any rank.

Differential Revision: https://reviews.llvm.org/D113383

[SelectionDAG] Replace the Chain in LOAD->VP_LOAD widening

The introduction of this legalization, D111248, forgot to replace the
old chain with the new. This could manifest itself in the old
(illegally-typed) value remaining in the DAG, though the simple test
cases didn't catch this.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113561

[RISCV] Prevent crashes when bitcasting between fixed vectors and scalars.

Not all scalar element types are allowed in vectors so we may not
be able to bitcast to a 1 element vector to use insert/extract.

This will become a bigger issue when the Zve extensions are commited.
For now, I'm using the ELEN limit to limit the element types.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113219

[clang] Add early exit when checking for const init of arrays.

Before this commit, on code like:

struct S { ... };
S arr[10000000];

while checking if arr is constexpr, clang would reserve memory for
arr before running constructor for S. If S turned out to not have a
valid constexpr c-tor, clang would still try to initialize each element
(and, in case the c-tor was trivial, even skipping the constexpr step
limit), only to discard that whole APValue later, since the first
element generated a diagnostic.

With this change, we start by allocating just 1 element in the array to
try out the c-tor and take an early exit if any diagnostics are
generated, avoiding possibly large memory allocation and a lot of work
initializing to-be-discarded APValues.

Fixes 51712 and 51843.

In the future we may want to be smarter about large possibly-constexrp
arrays and maybe make the allocation lazy.

Differential Revision: https://reviews.llvm.org/D113120

[mlir] recursively convert builtin types to LLVM when possible

Given that LLVM dialect types may now optionally contain types from other
dialects, which itself is motivated by dialect interoperability and progressive
lowering, the conversion should no longer assume that the outermost LLVM
dialect type can be left as is. Instead, it should inspect the types it
contains and attempt to convert them to the LLVM dialect. Introduce this
capability for LLVM array, pointer and structure types. Only literal structures
are currently supported as handling identified structures requires the
converison infrastructure to have a mechanism for avoiding infite recursion in
case of recursive types.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D112550

[lldb] [test] Fix new signal tests to use remote-linux plugin

Hopefully this will fix OSX and Windows buildbots

[mlir][Linalg] Add interface method to Linalg ops to allow setting the output operand.

Differential Revision: https://reviews.llvm.org/D113522

[InstCombine] Add additional test with signed range check (NFC)

[SCEV] Add tests that require rewriting zexts when applying guards.

Precommit tests inspired by PR40961 and PR52464.

[X86] combineMulToPMADDWD - remove useless TODO

We should always be able to use PMULUDQ/PMULDQ in PMADDWD patterns with greater than 32-bit extended integer sources

[OpenMP] Fix: opposite attributes could be set by -fno-inline

After the changes introduced by D106799 it is possible to tag
outlined function with both AlwaysInline and NoInline attributes using
-fno-inline command line options.
This issue is similiar to D107649.

Differential Revision: https://reviews.llvm.org/D112645

[lldb/test] Update TestScriptedProcess to use skinny corefiles

This patch changes the ScriptedProcess test to use a stack-only skinny
corefile as a backing store.

The corefile is saved as a temporary file at the beginning of the test,
and a second target is created for the ScriptedProcess. To do so, we use
the SBAPI from the ScriptedProcess' python script to interact with the
corefile process.

This patch also makes some small adjustments to the other ScriptedProcess
scripts to resolve some inconsistencies and removes the raw memory dump
that was previously checked in.

Differential Revision: https://reviews.llvm.org/D112047

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[lldb/bindings] Change ScriptedThread initializer parameters

This patch changes the `ScriptedThread` initializer in couple of ways:
- It replaces the `SBTarget` parameter by a `SBProcess` (pointing to the
`ScriptedProcess` that "owns" the `ScriptedThread`).
- It adds a reference to the `ScriptedProcessInfo` Dictionary, to pass
arbitrary user-input to the `ScriptedThread`.

This patch also fixes the SWIG bindings methods that call the
`ScriptedProcess` and `ScriptedThread` initializers by passing all the
arguments to the appropriate `PythonCallable` object.

Differential Revision: https://reviews.llvm.org/D112046

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[lldb] Fix Scripted ProcessLaunchInfo Argument nullptr deref

This patch adds a new `StructuredData::Dictionary` constructor that
takes a `StructuredData::ObjectSP` as an argument. This is used to pass
the opaque_ptr from the `SBStructuredData` used to initialize a
ScriptedProecss, to the `ProcessLaunchInfo` class.

This also updates `SBLaunchInfo::SetScriptedProcessDictionary` to
reflect the formentionned changes which solves the nullptr deref.

Differential Revision: https://reviews.llvm.org/D112107

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[mlir][linalg] Remove getSmallestBoundingIndex (NFC).

Remove the getSmallestBoundingIndex method that has no uses anymore.

Depends On D113548

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113549

[clang] Do not crash in APValue::prettyPrint() on forward-decl structs.

The call to getTypeSizeInChars() is replaced with
getTypeSizeInCharsIfKnown(), which does not crash on forward declared
structs. This only affects printing.

Differential Revision: https://reviews.llvm.org/D113570

[linalg][mlir] Replace getSmallestBoundingIndex in promotion (NFC).

Replace the getSmallestBoundingIndex method used in promotion by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.

Depends On D113546

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113548

[AArch64] Combine vector fptoi.sat(fmul) to fixed point fcvtz

Similar to D113199 but dealing with the vector size, this extends the
fptosi+fmul to fixed point fold to handle fptosi.sat nodes that are
equally viable, so long as the saturation width matches the output
width.

Differential Revision: https://reviews.llvm.org/D113200

[NVPTX] Add imm variants for surface and texture instructions

Texture/sampler/surface operands can be either a register or an
immediate (an index of .texref, .samplerref or .surfref).

TableGen declarations for these instructions used to only have
Int64Regs operands, so this caused issues when machine verifier
is turned on:

    *** Bad machine code: Expected a register operand. ***
    - function:    bar
    - basic block: %bb.0  (0x55b144d99ab8)
    - instruction: %4:int32regs = SULD_1D_I32_TRAP 0, killed %2:int32regs
    - operand 1:   0

The solution is to duplicate these instructions for all possible
operand types (i16imm and Int64Regs). Since this would
essentially double the amount code in TableGen, the patch also
does some refactoring for the original instructions to keep
things manageable.

Differential Revision: https://reviews.llvm.org/D112232

[mlir][linalg] Use getUpperBoundForIndex in hoisting (NFC).

Use the custom upper bound computation in hoisting by the new getUpperBoundForIndex method.

Depends On D113546

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113547

[mli][linalg] Use CodegenStrategy to test interchange (NFC).

Use CodegenStrategy instead of a separate test pass to test iterator interchange.

Depends On D113409

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113550

[libFuzzer] Deflake entropic exec-time test.

Entropic scheduling with exec-time option can be misled, if inputs
on the right path to become crashing inputs accidentally take more
time to execute before it's added to the corpus. This patch, by letting
more of such inputs added to the corpus (four inputs of size 7 to 10,
instead of a single input of size 2), reduces possibilities of being
influenced by timing flakiness.

A longer-term fix could be to reduce timing flakiness in the fuzzer;
one way could be to execute inputs multiple times and take average of
their execution time before they are added to the corpus.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D113544

[aarch64/mac] Correctly disassemble @TLVPPAGE(OFF) relocs

`llvm-otool -tV foo.o` and `llvm-objdump --macho -d foo.o` would
previously fail on object files containing @TLVPPAGE or @TLVPPAGEOFF relocs.

Move llvm-objdump-specific test from
llvm/test/MC/AArch64/arm64-tls-modifiers-darwin.s to new
llvm/test/tools/llvm-objdump/MachO/disassemble-arm64-tlv-modifers.test
and put test for this fix to that new file.

Fixes PR52356.

Differential Revision: https://reviews.llvm.org/D112843

[mlir][linalg] Remove padding test pass (NFC).

Remove padding test pass that was replaced by CodegenStrategy.

Depends On D113411

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113412

[OpenMP] Lower printf to __llvm_omp_vprintf

Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf`
which takes the same const char*, void* arguments as cuda vprintf and also
passes the size of the void* alloca which will be needed by a non-stub
implementation of `__llvm_omp_vprintf` for amdgpu.

This removes the amdgpu link error on any printf in a target region in favour
of silently compiling code that doesn't print anything to stdout.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112680

[NFC] Inclusive language: replace master with main in benchmark docs

[NFC] As part of using inclusive language within the llvm project and to match
the renamed master branch of `google/benchmark`, this patch replaces master with
main in the benchmark releasing docs.

Reviewed By: kbobyrev

Differential Revision: https://reviews.llvm.org/D113513

[DAG] reassociateOpsCommutative - pull out repeated getOperand() calls. NFC.

[linalg][mlir] Replace getSmallestBoundingIndex in padding (NFC).

Replace the getSmallestBoundingIndex method used in padding by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.

Depends On D113398

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113546

[mlir][linalg] Use CodegenStrategy to test hoisting (NFC).

Use CodegenStrategy instead of a separate test pass to test hoisting.

Depends On D113410

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113411

[mli][linalg] Use CodegenStrategy to test padding (NFC).

Use CodegenStrategy instead of a separate test pass to test padding.

Depends On D113409

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113410

[mlir][linalg] Use AffineApplyOp to compute padding width (NFC).

Use AffineApplyOp instead of SubIOp to compute the padding width when creating a pad tensor operation.

Depends On D113382

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113404

[InstCombine] add check for integer source type from cast to prevent crash

A problem was noted in the post-commit review for
c36b7e21bd8f04a44d6 / D113035 :

If the source type is not integer or integer vector,
then we could crash when trying to ComputeNumSignBits().

[x86] shorten function name; NFC

[x86] add tests for signbit splat mask patterns; NFC

[fir] Add fir.box_rank, fir.box_addr, fir.box_dims and fir.box_elesize conversion

This patch adds conversion for basic box operations that extract
information from the box.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D113551

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[fir] Use contralized values for indexing box

Add constant to index the different values in a box so that
they can be reused for the codegen part as well.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D113553

[PowerPC] Respect rounding mode in the back end

Currently, the floating point instructions that depend on
rounding mode are correctly marked in the PPC back end with
an implicit use of the RM register. Similarly, instructions
that explicitly define the register are marked with an
implicit def of the same register. So for the most part,
RM-using code won't be moved across RM-setting instructions.

However, calls are not marked as RM-setting instructions so
code can be moved across calls. This is generally desired,
but so is the ability to turn off this behaviour with an
appropriate option - and -frounding-math really should be
that option.

This patch provides a set of call instructions (for direct
and indirect calls) that are marked with an implicit def of
the RM register. These will be used for calls that are marked
with the strictfp attribute.

Differential revision: https://reviews.llvm.org/D111433

[mli][linalg] Add flag to control CodegenStrategy enable pass.

Add a flag to control if CodegenStrategy runs the EnablePass between the transformations.

Depends On D113382

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113409

[lldb] DeConstStringify the Property class

Most of the interfaces were converted already, this just converts the
internal implementation.

[mlir][linalg] Hoist padding simplifications (NFC).

Remove unused members and store the indexing and packing loops in SmallVector.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113398

[NFC][AArch64] Handle processLogicalImmediate error

If processLogicalImmediate fails, we should return from the function
without changing InsInstrs or DelInstrs. This happens for
CodeGen/AArch64/urem-seteq-nonzero.ll LIT test as described in
https://reviews.llvm.org/D99662#2662296.

Callers of genAlternativeCodeSequence skip patterns where InsInstrs
stays empty, so this does not cause any issues now.

Differential Revision: https://reviews.llvm.org/D100047

[llvm-reduce] Use DenseSet instead of std::set (NFC).

When reducing functions with very large basic blocks (~ almost 1 million
BBs), the majority of time is spent maintaining the order in the std::set
for the basic blocks to keep.

In those cases, DenseSet<> is much more efficient. Use it instead.

[mlir][linalg] Remove padding from tiling options.

Remove the padding options from the tiling options since padding is now implemented by a separate pattern/pass introduced in https://reviews.llvm.org/D112412.

The revsion remove the tile-and-pad-tensors.mlir and replaces it with the pad.mlir that tests padding in isolation (without tiling). Similarly, hoist-padding.mlir is replaced by pad-and-hoist.mlir introduced in https://reviews.llvm.org/D112713.

Depends On D112838

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113382

[DAG] Split BuildVectorSDNode::getConstantRawBits into BuildVectorSDNode::recastRawBits helper. NFC.

NFC refactor of D113351, pulling out the APInt split/merge code from the BuildVectorSDNode bits extraction into a BuildVectorSDNode::recastRawBits helper. This is to allow us to reuse the code when we're packing constant folded APInt data back together.

[LV] Do not rely on InductionDescriptor::getCastInsts. (NFC)

Now that CastDef is passed as VPValue, there is no need to access
ID.getCastInsts, as CastDef can instead be checked.

[clang-repl] Allow Interpreter::getSymbolAddress to take a mangled name.

[fir] Fixup comment. NFC

Fixed comment as requested in https://reviews.llvm.org/D113560.

[fir] Add !fir.char type conversion

This patch is part of the upstreaming effort from fir-dev.

Differential Revision: https://reviews.llvm.org/D113560

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[fir] Add !fir.ptr type conversion

This patch is part of the upstreaming effort for fir-dev.

Differential Revision: https://reviews.llvm.org/D113559

Co-authored-by: Jean Perier <jperier@nvidia.com>

[mlir] Reintroduce nano time to execution_engine

Prior change had a broken test that wasn't run by accident.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D113488

Revert "[openmp] Add OMPT initialization in libomptarget"

Reverting initial OMPT for target implementation in favor of a
different implementation.

This reverts commit 3bc8ce5dd718beef0031bf4b070ac4026e6910d7.

[LV] Move optimized IV recipes to phi section of header after sinking.

Unfortunately sinking recipes for first-order recurrences relies on
the original position of recipes. So if a recipes needs to be sunk after
an optimized induction, it needs to stay in the original position, until
sinking is done. This is causing PR52460.

To fix the crash, keep the recipes in the original position until
sink-after is done.

Post-commit follow-up to c45045bfd04af9 to address PR52460.

Revert "[LoopVectorize] Extract the last lane from a uniform store"

This reverts commit 0d748b4d32cbddf58a1ff83f3ff178ec1ad49edc.
This is causing some failures when building Spec2017 with scalable
vectors. Reverting to investigate.

[NFC][SROA] Revisit test coverage in non-capturing-call.ll

Reapply 5ec2386 "Reapply db28934 "[IndVars] Pass TTI to replaceCongruentIVs""

This reverts commit 7cd273c339cfe8427404f881ae280bd9fae6ff78.

Several patches with tests fixes have been applied:
0cada82f0a30e5ae22dce66b58604ab9b47a3897 "[Test] Remove incorrect test in GVN"
97cb13615d6d9df254e3c0f3deef9eaedfe189b6 "[Test] Separate IndVars test into AArch64 and X86 parts"
985cc490f17d28b20392ee214895d947b85120ef "[Test] Remove separated test in IndVars",
and test failures caused by 5ec2386 should be resolved now.

[mlir][linalg][bufferize] Add mustBufferizeInPlace to op interface

This is useful for ops such as scf::IfOp, which always bufferize in-place.

This commit is in preparation of decoupling BufferizationAliasInfo from the SCF dialect.

Differential Revision: https://reviews.llvm.org/D113339

[lldb] [test] Skip new signal tests on Windows

[SelectionDAG] Widen scalable-vector loads/stores via VP_LOAD/VP_STORE

This patch fixes a compiler crash when widening scalable-vector loads
and stores which end up breaking down to element-wise store operations.
It does so by providing a way for targets with support for
vector-predicated loads and stores to use those instead. By widening the
operation but maintaining the original effective operation length via
the EVL, only the intended vector elements are loaded or stored.

This method should in theory be possible and even preferred for
fixed-length vector types, but all fixed-length types can be broken down
into their elements, and regardless I have observed regressions in the
generated code when doing so. I believe this is simply due to
VP_LOAD/VP_STORE not being up to par with LOAD/STORE in terms of
optimization. It does improve performance on smaller self-contained
examples, however, so the potential is there.

While the only target that benefits from this is RISCV, the legalization
is generic and so was placed centrally.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111248

Revert "[DebugInfo] Only create concrete DIEs of concrete functions"

This reverts commit f19471a24985a0cbc32b6548c8fce1d2514e8243.
This leads to a crash. Still working on a reproducer to share.

[mlir][linalg][bufferize] Bufferize ops via PreOrder traversal

The existing PostOrder traversal with special rules for certain ops was complicated and had a bug. Switch to PreOrder traversal.

Differential Revision: https://reviews.llvm.org/D113338