review.tizen.org Git - platform/upstream/llvm.git/log

[OpenMP][OMPT] Change OMPT kind for OpenMP test lock functions

The OpenMP specification mentions that omp_test_lock and
omp_test_nest_lock dispatch OMPT callbacks with ompt_mutex_test_lock
and ompt_mutex_test_nest_lock for their kind respectively. Previously,
the values ompt_mutex_lock and ompt_mutex_nest_lock were used. This
could cause issues in application relying on the kind to correctly
determine lock states. This commit changes the kind to the expected
ones.

Also update callback.h and OMPT tests to reflect this change.

Patch prepared by Thyre

Differential Review: https://reviews.llvm.org/D153028
Differential Review: https://reviews.llvm.org/D153031
Differential Review: https://reviews.llvm.org/D153032

[lldb][NFC] Factor out code from SymbolFileDWARF::ParseVariableDIE

This function does a _lot_ of different things:

1. Parses a DIE,
2. Builds an ExpressionList
3. Figures out lifetime of variable
4. Remaps addresses for debug maps
5. Handles external variables
6. Figures out scope of variables

A lot of this functionality is coded in a complex nest of conditions, variables
that are declared and then initialized much later, variables that are updated in
multiple code paths. All of this makes the code really hard to follow.

This commit attempts to improve the state of things by factoring out (3), adding
documentation on how the expression list is built, and by reducing the scope of
variables.

Differential Revision: https://reviews.llvm.org/D154513

[LoopVectorize] Regenerate test checks (NFC)

Remove rdar links; NFC

This removes links to rdar, which is an internal bug tracker that the
community doesn't have visibility into.

See further discussion at:
https://discourse.llvm.org/t/code-review-reminder-about-links-in-code-commit-messages/71847

[MLIR][Linalg] Add max named op to linalg

I've been trying to come up with a simple and clean implementation for
ReLU. TOSA uses `clamp` which is probably the goal, but that means
table-gen to make it efficient (attributes, only lower `min` or `max`).

For now, `max` is a reasonable named op despite ReLU, so we can start
using it for tiling and fusion, and upon success, we create a more
complete op `clamp` that doesn't need a whole tensor filled with zeroes
or ones to implement the different activation functions.

As with other named ops, we start "requiring" type casts and broadcasts,
and zero filled constant tensors to a more complex pattern-matcher, and
can slowly simplify with attributes or structured matchers (ex. PDL) in
the future.

Differential Revision: https://reviews.llvm.org/D154703

[dataflow] Remove [[deprecated]] from deprecated functions

This fixes -Werror -Wdeprecated builds
See D153469

InstCombine: Fold ldexp(ldexp(x, a), b) -> ldexp(x, a + b)

The problem here is overflow or underflow which would have occurred in
the inner operation, which the exponent offsetting avoids. We can do
this if we know the two exponents are in the same direction, or
reassoc flags allow unsafe reassociates.

InstCombine: Add baseline tests for ldexp reassociation combine

InstSimplify: Update another cannotBeOrderedLessThanZero use

Pass all the optional arguments to enable assumes.

[MLIR][Presburger] Implement composition for PresburgerRelation

This patch implements range and domain composition for PresburgerRelations

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D154444

[gn build] Port fc9821a877d4

[gn build] Port c0221e006d47

Revert "[DWARF][BOLT] Implement new mechanism for DWARFRewriter"

This reverts commit 460a2244430fae192298a5fd9fa2a269e540e8c1.
It breaks building on macOS, and it was landed with a review URL
pointing to some Facebook-internal service.

Also reverts a bunch of follow-ups:

Revert "[BOLT][DWARF] Don't check string offsets"
This reverts commit f9d6f48c8bf5acaac07502403c41cf0b0d89c8d2.

Revert "[BOLT][DWARF] Change to process and write out TUs first then CUs in batches"
This reverts commit 88e95c1e4bb6e2ad3bfd185b96341ad5c09eff6b.

Revert "[BOLT][DWARF] Output DWO files as they are being processed"
This reverts commit 46ca2e3fcd419b1246357ed3b9cd36630f16e64d.

Revert "[BOLT][DWARF] Don't check string offsets"
This reverts commit cfe4a4b04f219a9dbb4e3fc01883437b6ff0e702.

Revert "[BOLT][DWARF] Numerous fixes for a new DWARFRewriter"
This reverts commit 2701a661daa393ad5901ac88d420d7aa931eda0d.

[OpenMP][OMPT] Rename callback master to masked in ompt-multiplex.h

OpenMP 5.1 replaced callback ompt_callback_master_t by
ompt_callback_masked_t. In order to stick to the standard,
the implementation is updated accordingly.

Patch prepared by Semih Burak

Differential Revision: https://reviews.llvm.org/D112798

[OpenMP][OMPT] Add two missing nullpointer checks in ompt-multiplex.h

In the functions ompt_multiplex_get_own_ompt_data
and ompt_multiplex_get_client_ompt_data in addition to
data being NULL, also the void pointer field "ptr" of
"data" could be NULL, leading to a subsequent
segfault.
This patch add the corresponding checks.

Patch prepared by Semih Burak

Differential Revision: https://reviews.llvm.org/D112806

[AST] Stop evaluate constant expression if the condition expression which in switch statement contains errors

This fix issue: https://github.com/llvm/llvm-project/issues/63453

```
constexpr int foo(unsigned char c) {
    switch (f) {
    case 0:
        return 7;
    default:
        break;
    }
    return 0;
}

static_assert(foo('d'));

```

Reviewed By: aaron.ballman, erichkeane, hokein

Differential Revision: https://reviews.llvm.org/D153296

[OpenMP][Tools] Add omp_all_memory support for Archer

The semantic of depend(out:omp_all_memory) is quite similar to taskwait in
that it separates all tasks (with dependency) created before an
all_memory-task from all tasks (with dependency) created after an
all_memory-task.
Only a single of such tasks can execute at a time. Similar to taskwait, we
have a CV (AllMemory[1]) in the generating task to express the dependency
sink semantic of an all_memory-task. In addition, AllMemory[0] describes the
dependency source semantic of an all_memory-task. All tasks with dependency
create an HB-arc towards the sink and terminate an HB-arc from the source.

Since we expect that not many applications will use such dependency, the
support for handling the synchronization semantic is off by default and
can be turned on using ARCHER_OPTION="all_memory=1". The most costly part
is the precautionary posting of an HB-arc towards the sink, which represents
a potentially contentious write from all concurrently executing sibling tasks.
A warning is printed at runtime, when the option is off while such dependency
is observed. In most cases the lazy activation will still lead to false alerts.

Differential Revision: https://reviews.llvm.org/D111895

[analyzer][NFC] Simplify CStringChecker strong types

[OpenMP] Add OMPT support for omp_all_memory task dependence

omp_all_memory currently has no representation in OMPT.

Adding new dependency flags as suggested by omp-lang issue #3007.

Differential Revision: https://reviews.llvm.org/D111788

ValueTracking: Update another cannotBeOrderedLessThanZero use

ValueTracking: Update a use of cannotBeOrderedLessThanZero

Makes assumes work.

[Clang][AArch64] Implement ACLE feature macro for FEAT_LRCPC3

This implements the new value for the `__ARM_FEATURE_RCPC` feature
macro, which was introduced to the ACLE to indicate the availability of
FEAT_LRCPC3.

More details can be found on:
https://github.com/ARM-software/acle/blob/main/main/acle.md#rcpc

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D153130

[AArch64][RCPC3] Instruction selection for LDAP1/STL1 instructions

This implements the DAG patterns to enable instruction selection for the
LDAP1 and STL1 instructions from FEAT_LRCPC3. The instructions should
match the following combinations:

* Aqcuiring atomic load + vector insert element for LDAP1.
* Vector extract element + releasing atomic store for STL1.

Patterns have also been added to cope with the DAG structure found when
dealing with 1-lane sub-vectors.

Reviewed By: tmatheson, efriedma

Differential Revision: https://reviews.llvm.org/D153129

[AArch64][RCPC3] Add Neon intrinsics for LDAP1 and STL1

This adds new intrisics to support the LDAP1 and STL1 Advanced SIMD
(Neon) instructions introduced as part of FEAT_LRCPC3.
The new intrinsics `vldap1(q)_lane`/`vstl1(q)_lane` generate IR code
similar to the existing `vld1(q)_lane/st1(q)_lane` ones, but capturing
the difference in the atomic release/acquire memory model.

The LLVM code generation changes to ensure that this instruction pair
is lowered to the correct LDAP1/STL1 instructions will be covered in a
separate commit.

Based on a patch by Sam Elliott.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D153128

Implement P2361 Unevaluated string literals

This patch proposes to handle in an uniform fashion
the parsing of strings that are never evaluated,
in asm statement, static assert, attrributes, extern,
etc.

Unevaluated strings are UTF-8 internally and so currently
behave as narrow strings, but these things will diverge with
D93031.

The big question both for this patch and the P2361 paper
is whether we risk breaking code by disallowing
encoding prefixes in this context.
I hope this patch may allow to gather some data on that.

Future work:
Improve the rendering of unicode characters, line break
and so forth in static-assert messages

Reviewed By: aaron.ballman, shafik

Differential Revision: https://reviews.llvm.org/D105759

[analyzer] Remove deprecated analyzer-config options

The `consider-single-element-arrays-as-flexible-array-members` analyzer
option was deprecated in clang-16, and now removed from clang-17 as
promised in
https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#static-analyzer

This shouldn't change observable behavior.

Differential Revision: https://reviews.llvm.org/D154481

[LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine

In the LTO pipeline we run InstCombine after LICM, which is
different to what we normally do without LTO. This has the
effect of undoing all the great work done by LICM to reduce
the cost of the loop when it hoists the fdiv out and replaces
it with fmul. When InstCombine runs after LICM it puts the
fdiv straight back which, on AArch64 at least, is darn
expensive. You can observe this problem in the SPEC2017
benchmark parest if you build with "-Ofast -flto" and the
loop-vectoriser uses an unroll factor of 1, which is what
often happens when tail-folding is enabled.

This is also a problem for scalar loops, or indeed any loop
where there is only one use of the preheader fdiv result in
the loop.

See InstCombinerImpl::visitFMul for the code that sinks the fdiv.

I've attempted to fix this by adding another LICM pass for Full
LTO after InstCombine. The alternative is to stop InstCombine
from sinking the fdiv into loops. See D87479 for a previous
discussion on this issue.

Differential Revision: https://reviews.llvm.org/D143631

[mlir] Mark test-interpreter unsupported on Windows on Arm

This seems to fail every time there is some change in MLIR,
but not always.

For example: https://lab.llvm.org/buildbot/#/builders/65/builds/10415

[libc] Adding a version of memcpy w/ software prefetching

For machines with a lot of cores, hardware prefetchers can saturate the memory bus when utilization is high.
In this case it is desirable to turn off the hardware prefetcher completely.
This has a big impact on the performance of memory functions such as `memcpy` that rely on the fact that the next cache line will be readily available.

This patch adds the 'LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING' compile time option that generates a version of memcpy with software prefetching. While not fully restoring the original performances it mitigates the impact to an acceptable level.

Reviewed By: rtenneti

Differential Revision: https://reviews.llvm.org/D154494

[LV] Do not add load to group if it moves across conflicting store.

This patch prevents invalid load groups from being formed, where a load
needs to be moved across a conflicting store.

Once we hit a store that conflicts with a load with an existing
interleave group, we need to stop adding earlier loads to the group, as
this would force hoisting the previous stores in the group across the
conflicting load.

To detect such cases, add a new CompletedLoadGroups set, which is used
to keep track of load groups to which no earlier loads can be added.

Fixes https://github.com/llvm/llvm-project/issues/63602

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D154309

Reland "[dataflow] Add dedicated representation of boolean formulas"

This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.

Test problems were due to unspecified order of function arg evaluation.

Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)"

This properly frees the Value hierarchy from managing boolean formulas.

We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.

We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.

For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.

The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.

The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.

Differential Revision: https://reviews.llvm.org/D153469

[RISCV] Add riscv_vsoxei_mask/riscv_vsuxei_mask to getTgtMemIntrinsic.

This constructs a proper memory operand for riscv_vsoxei_mask and riscv_vsuxei_mask.
I think they are missed in D147119.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D154694

[compiler-rt][xray] Disable fdr-single-thread test on Arm

For unknown reasons this casues a bus error.

See:
https://lab.llvm.org/buildbot/#/builders/178/builds/5157

[MLIR][Linalg] Add unary named ops to linalg

Following binary arithmetic in previous commits, this patch adds unary
maths ops to linalg.

It also fixes a few of the previous tests, and makes the binary ops call
BinaryFn.<op> directly instead of relying on Python to recognise the
operation.

Differential Revision: https://reviews.llvm.org/D154618

[flang][hlfir] allow assoicate where the expr is also used by shape_of

This fixes the majority of cases where we hit the "hlfir.associate of
hlfir.expr with more than one use" TODO. In particular, this allows cam4
to be built.

hlfir.shape_of is just a way to delay reading shape information until
after intrinsics have been lowered to FIR runtime calls. It gets the
shape information from reading existing SSA values (e.g. fetching the
shape used when hlfir.declare'ing the variable).

Therefore hlfir.shape_of doesn't affect decisions about when to
deallocate the buffer.

Differential Revision: https://reviews.llvm.org/D154521

[mlir] Add InsertionGuards to OneToNPatternRewriter.

This fixes bad behavior of that class that surfaced in
https://reviews.llvm.org/D154299, where calling applySignatureConversion
left the insertion point different from before the call, which broke a
subsequent call to replaceOp. This patch introduces a fix in both
functions, each of which is enough to fix the specific problem in the
aforementioned diff: (1) applySignatureConversion now resets the
insertion point with a guard for the whole function and (2) replace sets
the insertion point to the op that should be replaced (and resets it
with a guard).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D154684

[mlir] Avoid unnecessary copies in SCF's OneToNTypeConversions. (NFC)

In two places, a ResultRange was copied into a SmallVector just to be
passed as a ValueRange argument. With this patch, the ResultRanges are
passed directly, avoiding a copy.

Reviewed By: ingomueller-net

Differential Revision: https://reviews.llvm.org/D154685

[ARM][Driver] Change float-abi warning

Previously the warning stated "flag ignored" which is only partially
true - the invalid flag would prevent -feature +soft-float-abi from
being emitted which resulted in user-visible behaviour like
__ARM_PCS_VFP being defined.

Rather than attempt to coerce invalid flags into valid behaviour, don't
describe the expected behaviour.

Ideally the warning would be an error, as it is in GCC. However there
are tests in llvm-project that trigger the warning. Therefore one has to
assume that making the warning an error would break other code that
already exists in the wild.

Also apply test improvements suggested by @MaskRay on D150902.

Reviewed By: simon_tatham

Differential Revision: https://reviews.llvm.org/D154578

[InstCombine] Preserve inbounds when folding select of GEP

The select base, (gep base, offset) to gep base, select (0, offset)
fold used to drop inbounds, because the gep base, 0 this introduces
might not be inbounds. After the semantics change in D154051, such
a GEP is always considered inbounds, in which allows us to preserve
the flag here.

As the PhaseOrdering test demonstrates, this can result in major
optimization improvements in some cases.

Differential Revision: https://reviews.llvm.org/D154055

[bazel] Port for 88e95c1e4bb6e2ad3bfd185b96341ad5c09eff6b

[compiler-rt][RISCV] Fix __fe_getround and __fe_raise_inexact for Zfinx

Zfinx extension also provide floating point environment like F extension, so
enable that on `__fe_getround` and `__fe_raise_inexact` too.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154570

[RISCV] Add a pass to combine `cm.pop` and `ret` insts

`RISCVPushPopOptimizer.cpp` combine `cm.pop` and `ret` to generates `cm.popretz` or `cm.popret` .

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D150416

[X86] Support some Intel CPUs for cpu_specific/dispatch feature

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D154493

[RISCV] Rename prefix `fixed-vector` to `fixed-vectors` to be the same with other testcases. NFC.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154679

[Attributor] Check all NoFPClass attributes found in the IR

[Attributor][NFC] Add missing comments

[Attributor][NFCI] Use AA::hasAssumedIRAttr for NoSync

Register new assumption in a cache

When new assumption is created it should be registered in assumption cache
or cache should be invalidated.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D154601

[RISCV] Don't sink i1 vectors in shouldSinkOperands.

These can't create .vx instructions so there's no reason to sink them.

[RISCV] Remove unused private field 'RVPushable' in RISCVMachineFunctionInfo.h

/data/llvm-project/llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h:78:8: error: private field 'RVPushable' is not used [-Werror,-Wunused-private-field
]
bool RVPushable = false;
^
1 error generated.

[lldb] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D154542

[lldb] Replace countPopulation with popcount after D153686

[RISCV] Readjusting the framestack for Zcmp

This patch readjusts the frame stack for the push and pop instructions

co-author: @Lukacma

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134599

[LoongArch][MC] Add testcases for LASX instructions

Depends on D154195

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D154326

[LoongArch][MC] Add testcases for LSX instructions

Depends on D154183

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D154197

[LoongArch] Add definition for LASX instructions

This patch adds the definition of the LASX instructions, providing
support only for assembly and disassembly, similar to D154183.

Depends on D154183

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D154195

[LoongArch] Add definition for LSX instructions

This patch adds the definition for the `LSX` registers and instructions.
It also adds handling for new immediate operands in the AsmParser. This
patch ensures that llvm-mc and llvm-objdump correctly handle the `LSX`
instructions.

We expand those pseudo-instructions `vrepli.{b,h,w,d}` in the
MCCodeEmitter. This increases the readability of the output when
generating assembly files.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D154183

[llvm-exegesis] Remove unnecessary includes

Left some includes around from debugging the last patch that I forgot to
take out, so here's the patch taking them out.

[llvm-exegesis] Switch to using PTRACE_ATTACH instead of PTRACE_SEIZE

This patch switches from using PTRACE_SEIZE within the subprocess
benchmark runner for llvm-exegesis as PTRACE_SEIZE was introduced in
Linux kernel 3.4. Some LLVM users were reporting build failures as they
are using Kernel versions older than 3.4 (such as on CentOS/RHEL 6),
hence the patch.

[llvm-exegesis] Disable SubprocessMemory class on Android

Having the SubprocessMemory class available on Android was causing build
failures in downstream builds as Android doesn't implement the SystemV
IPC specification that supplies shm_open and other functions that the
class relies on. This patch simply makes it unavailable on Android using
preprocessor directives.

[mlir][sparse] Correcting RTTI implementation for the Var class

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D154662

Revert "Change the dyld notification function that lldb puts a breakpoint in"

This reverts commit c3192196aea279eea0a87247a02f224ebe067c44.

Reverting my second attempt at https://reviews.llvm.org/D139453
changing which dyld notification method is being used.
The Intel macOS CI bot is still failing with this
rewrite at https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/
I'll need to set up an Intel macOS system running a matching
OS version to debug this directly I think.

DAG: Handle inversion of fcSubnormal | fcZero

There are a number of more test combinations here that
can be done together and reduce the number of instructions.

https://reviews.llvm.org/D143191

Revert "[BPF] Undo transformation for LICM.cpp:hoistMinMax()"

This reverts commit 09feee559a294611257ee157dba039fb05fe4f68.

Revert because of a testbot failure:
https://lab.llvm.org/buildbot/#/builders/5/builds/34931

Change the dyld notification function that lldb puts a breakpoint in

On Darwin systems, the dynamic linker dyld has an empty function
it calls when binaries are added/removed from the process. lldb puts
a breakpoint on this dyld function to catch the notifications. The
function arguments are used by lldb to tell what is happening.

The linker has a natural representation when the addresses of
binaries being added/removed are in the pointer size of the process.
There is then a second function where the addresses of the binaries
are in a uint64_t array, which the debugger has been using before -
dyld allocates memory for the array, copies the values in to it,
and calls it for lldb's benefit.

This changes to using the native notifier function, with pointer-sized
addresses.

This is the second time landing this change; this time correct the
size of the image_count argument, and add a fallback if the
notification function "lldb_image_notifier" can't be found.

Differential Revision: https://reviews.llvm.org/D139453

[ORC][AArch64] Guard against negative offsets in writeIndirectStubsBlock.

In OrcAArch64::writeIndirectStubsBlock, masks the high bits of the immediate
operand to the stub's ldr instruction so that negative offsets to the stub
pointer do not overflow.

No testcase -- this fixes most of the OrcLazy testcases for AArch64 (at least on
Darwin), but we still need to fix the exception-handling test before we can turn
them on.

[BOLT][DWARF] Don't check string offsets

With different linker, the offset of strings can change.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D154670

Make test more flexible for platforms that may emit extra arguments.

[BOLT][DWARF] Don't check string offsets

With different linker, the offset of strings can change.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D154669

Revert "[gn build] Manually port 015dabd7"

I've temporarily reverted D154272 (which this commit depends on)
due to a build breakage, so also reverting this for now.

This reverts commit c1e283851772ba494113311405d48cfb883751d1.

[RISCV] Support i32 brev8 intrinsic on RV64.

Similar to what we do for orc.b. Another patch will expose this
as a builtin in clang.

Revert "Reland '[msan] Intercept dladdr1, and refactor dladdr'"

This reverts my commit 015dabd7672f936cdb5bdcad20fe80b17f05c9ca
due to breaking non-glibc builds.

[clang] Fix a typo "mdoule" in comments. NFC.

[BOLT][DWARF] Fix references in tests

Fixed invalid assembly, where references were not correct.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D154667

Revert "[libc] Add support for creating wrapper headers for offloading in clang"

This reverts commit a4a26374aa11d48ac6bf65c78c2aaf8f16414287.

This was causing some problems with the CPU build and CUDA buildbot.
Revert until I can figure out what those issues are and fix them. I
believe it is just some CMake.

[mlir][complex] Add a `complex.bitcast` operation

Converting between a complex<f32> to i64 could be useful for handling interop
between the `arith` and `complex` dialects.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D154663

Refine the reporting mechanism for interruption.

Also, make it possible for new Targets which haven't been added to
the TargetList yet to check for interruption, and add a few more
places in building modules where we can check for interruption.

Differential Revision: https://reviews.llvm.org/D154542

[libc] Add support for creating wrapper headers for offloading in clang

This is an alternate approach to the patches proposed in D153897 and
D153794. Rather than exporting a single header that can be included on
the GPU in all circumstances, this patch chooses to instead generate a
separate set of headers that only provides the declarations. This can
then be used by external tooling to set up what's on the GPU. This
leaves room for header hacks for offloading languages without needing to
worry about the `libc` implementation.

Currently this generates a set of headers that only contain the
declarations. These will then be installed to a new clang resource
directory called `llvm_libc_wrappers/` which will house the shim code.
We can then automaticlaly include this from `clang` when offloading to
wrap around the headers while specifying what's on the GPU.

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D154036

[llvm-mca][RISCV] vsetivli and vsetvli act as instruments

Since the LMUL data that is needed to create an instrument is
avaliable statically from vsetivli and vsetvli instructions,
LMUL instruments can be automatically generated so that clients
of the tool do no need to manually insert instrument comments.

Instrument comments may be placed after a vset{i}vli instruction,
which will override instrument that was automatically inserted.
As a result, clients of llvm-mca instruments do not need to update
their existing instrument comments. However, if the instrument
has the same LMUL as the vset{i}vli, then it is reccomended to
remove the instrument comment as it becomes redundant.

Differential Revision: https://reviews.llvm.org/D154526

[LAA] Add test that shows MaxSafeDepDistBytes is incorrect. NFC.

This precommit patch shows MaxSafeDepBytesDist is 24 when it should
be 8.

Differential Revision: https://reviews.llvm.org/D154173

[lldb] Fix dead lock issue when loading modules in Scripted Process

This patch attempts to fix a dead lock when loading modules in a Scripted
Process.

This issue was triggered by loading the modules after the process did resume,
but before the process actually stop, causing the language runtime mutex to
be locked by a separate thread, responsible to unwind the stack (using
the runtime unwind plan), while the module loading thread was trying to
notify the runtimes of the newly loaded module.

To address that, this patch moves the module loading logic to be done before
sending the stop event, to prevent the dead lock situation described above.

Differential Revision: https://reviews.llvm.org/D154649

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>

[BOLT][DWARF] Change to process and write out TUs first then CUs in batches

Summary:
To reduce memory footprint changed so that we process and write out TUs first,
reset DIEBuilder and process CUs. CUs are processed in buckets. First bucket
contains all the CUs with cross CU references. Rest processd one at a time.

clang-17 build in debug mode, by clang-17.
before
8:25.81 real, 834.37 user, 86.03 sys, 0 amem, 79525064 mmem
8:02.20 real, 820.46 user, 81.81 sys, 0 amem, 79501616 mmem
7:52.69 real, 802.01 user, 83.99 sys, 0 amem, 79534392 mmem

after
7:49.35 real, 822.04 user, 66.19 sys, 0 amem, 34934260 mmem
7:42.16 real, 825.46 user, 63.52 sys, 0 amem, 34951660 mmem
7:46.71 real, 821.11 user, 63.14 sys, 0 amem, 34981164 mmem

Differential Revision: https://phabricator.intern.facebook.com/D45883198

[BOLT][DWARF] Output DWO files as they are being processed

Summary:
Changed how we handle writing out .dwo and .dwp files. We now write out DWO
sections sooner and destroy DIEBuilder. This should decrease memory footprint.

Ran on clang-17 build in debug mode with split-dwarf.
before
8:07.49 real,   664.62 user,    69.00 sys,      0 amem, 41601612 mmem
8:07.06 real,   669.60 user,    68.75 sys,      0 amem, 41822588 mmem
8:00.36 real,   664.14 user,    66.36 sys,      0 amem, 41561548 mmem

after
8:21.85 real,   682.23 user,    69.64 sys,      0 amem, 39379880 mmem
8:04.58 real,   671.62 user,    66.50 sys,      0 amem, 39735800 mmem
8:10.02 real,   680.67 user,    67.24 sys,       0 amem, 39662888 mmem

Differential Revision: https://phabricator.intern.facebook.com/D45458889

[BOLT][DWARF] Numerous fixes for a new DWARFRewriter

Summary:

* Some cleanup and minor fixes for the new debug information re-writer before moving on
to productatization.

* The new rewriter wasn't handling binary with DWARF5 and DWARF4 with
-fdebug-types-sections.

* Removed dead cross cu reference code.

* Added support for DW_AT_sibling.

* With the new re-writer abbrev number can change which can lead to offset of Type
Units changing. Before we would just copy raw data. Changed to write out Type
Unit List. This is generated by gdb-add-index.

* Fixed how bolt handles gdb-index generated by gdb-11 with types sections.
Simplified logic that handles variations of gdb-index.

* Clang can generate two type units with the same hash, but different content. LLD
does not de-duplicate when ThinLTO is involved. Changed so that TU hash and
offset are used to make TU's unique.

* It is possible to have references within location expression to another DIE.
Fixed it so that relative offset is updated correctly.

* Removed all the code related to patching.

* Removed dead code. Changed how we handling writting out TUs and TU Index. It now
should fully work for DWARF4 and DWARF5.

* Removed unused arguments from some APIs, changed return type to void, and other
small cleanups.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: https://phabricator.intern.facebook.com/D46168257

[DWARF][BOLT] Implement new mechanism for DWARFRewriter

Summary:
This revision implement new mechanism for DWARFRewriter.
In the new mechanism, we adopt the same way with DWARFLinker did.
By parsing Debug information into IR, we are allowed to handle debug information more flexible.
Now the debug information updating process relies on IR and IR will be written out to binary once the updating finished.

A new class was added: DIEBuilder. This class is responsible for parsing debug information and raising it to the IR level.
This class is also used to write out the .debug_info and .debug_abbrev sections.
Since we output brand new Abbrev section we won't need to always convert low_pc/high_pc into ranges.
When conversion does happen we can also remove low_pc entry.

Differential Revision: https://phabricator.intern.facebook.com/D39484421

Tasks: T117448832

[AMDGPU] Improve assembler + disassembler handling of kernel descriptors

* Relax the AsmParser to accept `.amdhsa_wavefront_size32 0` when the
  `.amdhsa_shared_vgpr_count` directive is present.
* Teach the KD disassembler to respect the setting of
  KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32 when calculating the
  value of `.amdhsa_next_free_vgpr`.
* Teach the KD disassembler to disassemble COMPUTE_PGM_RSRC3 for gfx90a
  and gfx10+.
* Include "pseudo directive" comments for gfx10 fields which are not
  controlled by any assembler directive.
* Fix disassembleObject failure diagnostic in llvm-objdump to not
  hard-code a comment string, and to follow the convention of not
  capitalizing the first sentence.

Reviewed By: rochauha

Differential Revision: https://reviews.llvm.org/D128014

[lldb][DebugNamesDWARF] Also use mangled name when matching regex

When LLDB queries the debug names index with a regex, we should use the
`Mangled` class wrapper, which attempts to match regex first against the mangled
name and then against the demangled name. It is important to do so, otherwise
queries like `frame var --regex A::` would never work. This is what is done for
the Apple index as well.

This fixes test/API/lang/cpp/class_static/main.cpp when compiled with DWARF 5.

Differential Revision: https://reviews.llvm.org/D154617

[lldb][NFC] Remove unused parameter from DebugNames::ProcessEntry

Differential Revision: https://reviews.llvm.org/D154610

[DSE] Don't eagerly optimize MemorySSA uses

Compile time improvements:
https://llvm-compile-time-tracker.com/compare.php?from=a4a2b62495a63516a4f782acff1b19361906546b&to=a408521f71702a5c5fb65077adc23413d8631cfc&stat=instructions:u

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152744

[MemorySSA] Always perform MemoryUses liveOnEntry optimization on MSSA construction

Fixes invariant memory regressions in future DSE patches.

Also add a flag to print<memoryssa> to not ensure optimized uses to test this.

Noticeable compile time regression [1], but a future DSE change that depends on this more than makes up for it.

[1] https://llvm-compile-time-tracker.com/compare.php?from=9d5466849a770eeab222d5a5890376d3596e8ad6&to=95682dbe11d76a3342870437377216e96b167504&stat=instructions:u

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D152859

[flang][openacc] Support array slices when creating private recipe

The return type of the recipe must match the array slice provided by
the user. This patch enhance the recipe creation to take into account
the constant slices.

Depends on D154259

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D154648

[flang][openacc] Populate the init region for acc.private.recipe for simple type

Generate code to allocate privates for trivial scalars and arrays.

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D154259

[libc++][NFC] Add 'const' to some operator()

This is NFC because the function object is stateless anyway. This is
done solely for consistency with surrounding code and this was probably
an oversight in https://reviews.llvm.org/D132505.

Differential Revision: https://reviews.llvm.org/D154612

[BOLT] Fix buildbot failure

GCC requires "class" keyword when variable name matches class name.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D154654

[clang-format] Support block indenting array/struct list initializers

C89 and C99 list initializers are treated differently than Cpp11 braced
initializers. This patch identifies the C array/struct initializer lists by
finding the preceding equal sign before a left brace, and applies formatting
rules for BracketAlignmentStyle.BlockIndent to those list initializers.

Fixes #57878.

Differential Revision: https://reviews.llvm.org/D153205

[WebAssembly] Add frexp{f,l} libcall signatures

The llvm.frexp.* family of intrinsics and their corresponding libcalls were
recently added, which means we need to know their signatures.

Differential Revision: https://reviews.llvm.org/D154639
Fixed: https://github.com/llvm/llvm-project/issues/63657

[lldb-vscode] Adding support for column break points.

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D154029

[flang][hlfir] Lower char length inquiry via hlfir.get_length.

ApplyOp provides the type parameters in its argument, so we can
take it from there. For hlfir.expr block arguments (such as with
user-defined assignments) we use hlfir.get_length in lowering.

Depends on D154561

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D154562

[flang][hlfir] Codegen for hlfir.get_length.

Lower hlfir.get_length into the char length inquiry of the bufferized
entity. In some cases the codegen will fail with
`hlfir.associate of hlfir.expr with more than one use` - this will be fixed
separately (after D154521).

Depends on D154560

Reviewed By: tblah, jeanPerier

Differential Revision: https://reviews.llvm.org/D154561

[flang][hlfir] Added hlfir.get_length to inquire char length from hlfir.expr.

We will use hlfir.get_length to lower inquiries of char length
applied to hlfir.expr character values.

Reviewed By: tblah, jeanPerier

Differential Revision: https://reviews.llvm.org/D154560