review.tizen.org Git - platform/upstream/llvm.git/log

[RISCV] Mark fminnum_vl and fmaxnum_vl as commutable.

[RISCV] Add commuted fixed vector vfmax.vf and vfmin.vf tests. NFC

The ISD opcodes aren't marked commutable so we don't match these
properly.

[RISCV] Mark vsadd(u)_vl as commutable

This allows fixed length vectors involving splats on the LHS to commute into the _vx form of the instruction. Oddly, the generic canonicalization rules appear to catch the scalable vector cases. I haven't fully dug in to understand why, but I suspect it's because of a difference in how we represent splats (splat_vector vs build_vector).

Differential Revision: https://reviews.llvm.org/D129302

[DWARF] Add linkagename to hash

Originally encountered with RUST, but also there are cases with distributed LTO
where debug info dwo units contain structurally the same debug information, with
difference in DW_AT_linkage_name. This causes collision on DWO ID.

Differential Revision: https://reviews.llvm.org/D129317

[mlir] Fixed double-free bug in SymbolUserMap

`SymbolUserMap` relied on `try_emplace` and `std::move` to relocate an entry to another key. However, if this triggered the resizing of the `DenseMap`, the value was destroyed before it could be moved to the new storage location, leading to a dangling `users` reference to be inserted into the map. On destruction, since a new entry was created from one that was already freed, a double-free error occurred.

Fixed issue by re-fetching the iterator after the mutation of the container.

Differential Revision: https://reviews.llvm.org/D129345

[RISCV] Mark (s/u)min_vl and (s/u)max_vl as commutable.

[RISCV] Add fixed vector vmin(u).vx and vmax(u).vx tests. NFC

[X86] Regenerate vec_shift6.ll to remove superfluous whitespace. NFC

[gn build] Manually port d2ead9e3

[flang] Changed lowering for allocatable assignment to make array-value-copy correct.

Array-value-copy fails to generate a temporary array for case like this:
subroutine bug(b)
  real, allocatable :: b(:)
  b = b(2:1:-1)
end subroutine

Since LHS may need to be reallocated, lowering produces the following FIR:
%rhs_load = fir.array_load %b %slice

%lhs_mem = fir.if %b_is_allocated_with_right_shape {
   fir.result %b
} else {
   %new_storage = fir.allocmem %rhs_shape
   fir.result %new_storage
}

%lhs = fir.array_load %lhs_mem
%loop = fir.do_loop {
....
}
fir.array_merge_store %lhs, %loop to %lhs_mem
// deallocate old storage if reallocation occured,
// and update b descriptor if needed.

Since %b in array_load and %lhs_mem in array_merge_store are not the same SSA
values, array-value-copy does not detect the conflict and does not produce
a temporary array. This causes incorrect result in runtime.

The suggested change in lowering is to generate this:
%rhs_load = fir.array_load %b %slice
%lhs_mem = fir.if %b_is_allocated_with_right_shape {
   %lhs = fir.array_load %b
   %loop = fir.do_loop {
      ....
   }
   fir.array_merge_store %lhs, %loop to %b
   fir.result %b
} else {
   %new_storage = fir.allocmem %rhs_shape
   %lhs = fir.array_load %new_storage
   %loop = fir.do_loop {
      ....
   }
   fir.array_merge_store %lhs, %loop to %new_storage
   fir.result %new_storage
}
// deallocate old storage if reallocation occured,
// and update b descriptor if needed.

Note that there are actually 3 branches in FIR, so the assignment loops
are currently produced in three copies, which is a code-size issue.
It is possible to generate just two branches with two copies of the loops,
but it is not addressed in this change-set.

Differential Revision: https://reviews.llvm.org/D129314

[VPlan] Move VPWidenSelectRecipe::execute to VPlanRecipes.cpp (NFC).

Depends on D127968.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D127970

[libc++] Make parameter names consistent and enforce the naming style using readability-identifier-naming

Ensure that parameter names have the style `__lower_case`

Reviewed By: ldionne, #libc

Spies: aheejin, sstefan1, libcxx-commits, miyuki

Differential Revision: https://reviews.llvm.org/D129051

AArch64/GlobalISel: Stop using legal s1 values

As far as I can tell treating s1 values as legal makes no sense. There
are no allocatable 1-bit registers. SelectionDAG legalizes the usual
set of boolean operations to 32-bits, and this should do the
same. This avoids some special case handling in the selector of s1
values, and some extra code to look through truncates.

This makes some code worse at -O0, since nothing cleans up the and 1
the artifact combiner inserts. We could probably add some
non-essential combines or teach the artifact combiner to elide
intermediates betweeen boolean uses and defs.

GlobalISel: Add buildBoolExtInReg helper

GlobalISel: Allow forming atomic/volatile G_SEXTLOAD

Mirror the change to G_ZEXTLOAD.

GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD

SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.

The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.

I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.

[Clang] Fix test failing due to renamed arg

[ConstantFolding] Guard against unfolded FP binop

Check that the operation actually folded before trying to flush
denormals. A minor variation of the pr33453 test exposed this
with the FP binops marked as undesirable.

[LinkerWrapper] Fix save-temps and argument name

Summary:
The previous path reworked some handling of temporary files which
exposed some bugs related to capturing local state by reference in the
callback labmda. Squashing this by copying in everything instead. There
was also a problem where the argument name was changed for
`--bitcode-library=` but clang still used `--target-library=`.

[InstCombine] Avoid ConstantExpr::get() in vector binop fold (NFCI)

Use the ConstantFoldBinaryOpOperands() API instead. This case
would bail out on a non-folded result anyway.

[LinkerWrapper][NFC] Move error handling to a common function

Summary:
This patch merges all the error handling functions to a single function
call so we don't define the same lambda many times.

[LinkerWrapper][NFC] Rework command line argument handling in the linker wrapper

Summary:
This patch reworks the command line argument handling in the linker
wrapper from using the LLVM `cl` interface to using the `Option`
interface with TableGen. This has several benefits compared to the old
method.

We use arguments from the linker arguments in the linker
wrapper, such as the libraries and input files, this allows us to
properly parse these. Additionally we can now easily set up aliases to
the linker wrapper arguments and pass them in the linker input directly.
That is, pass an option like `cuda-path=` as `--offload-arg=cuda-path=`
in the linker's inputs. This will allow us to handle offloading
compilation in the linker itself some day. Finally, this is also a much
cleaner interface for passing arguments to the individual device linking
jobs.

[InstCombine] Avoid ConstantExpr::get() call

Avoid calling ConstantExpr::get() for associative/commutative
binops, call ConstantFoldBinaryOpOperands() instead. We only
want to perform the reassociation of the constants actually fold.

[DAG] SimplifyDemandedBits - fold AND(INSERT_SUBVECTOR(C,X,I),M) -> INSERT_SUBVECTOR(AND(C,M),X,I)

If all the demanded bits of the AND mask covering the inserted subvector 'X' are known to be one, then the mask isn't affecting the subvector at all.

In which case, if the base vector 'C' is undef/constant, then move the AND mask up to just (constant) fold it directly.

Addresses some of the regressions from D129150, particularly the cases where we're attempting to zero the upper elements of a widened vector.

Differential Revision: https://reviews.llvm.org/D129290

[libomptarget] compile DeviceRTL bc files with -O3

bc files of DeviceRTL are compiled with -O3, the same as the static library.

Differential Revision: https://reviews.llvm.org/D129344

[ConstantExpr] Don't create float binop expressions

Mark the fadd, fsub, fmul, fdiv, and frem expressions as
undesirable, so they are not created automatically. This is in
preparation for their removal.

[InstCombine] Avoid creating float binop ConstantExprs

Replace ConstantExpr:getFAdd etc with call to
ConstantFoldBinaryOpOperands(). I'm using the constant folding API
rather than IRBuilder here to ensure that this does actually
constant fold. These transforms don't use m_ImmConstant(), so this
would not otherwise be guaranteed (and apparently, they can't use
m_ImmConstant because they want to handle scalable vector splats).

There is an opportunity here to further migrate these to the
ConstantFoldFPInstOperands() API, which would respect the denormal
mode. I've held off on doing so here, because some of this code
explicitly checks for denormal results, and I don't want to touch
it in a mostly NFC change.

[InstCombine] enhance fold for subtract-from-constant -> xor

A low-bit mask is not required:
https://alive2.llvm.org/ce/z/yPShss

This matches the SDAG implementation that was updated at:
8b756713140f

[InstCombine] add tests for masked sub; NFC

[flang][openacc][NFC] Extract device_type parser to its own

Move the device_type parser to a separate parser AccDeviceTypeExprList. Preparatory work for D106968.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D106967

[flang][openacc][NFC] Make self clause value optional in ACC.td and extract the parser

Set the isOptional flag for the self clause. Move the optional and parenthesis part of the parser. Update the rest of the code to deal with the optional value.

Preparatory work for D106968.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D106965

[AArch64] Use Neoverse N2 sched model as default for:

  - Cortex-A710
  - Cortex-X2
  - Neoverse-V1
  - Neoverse-512tvb

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D129203

[LiveIntervals] Fix incorrect range (re)construction from subranges.

After D82916 `updateAllRanges()` started to fix holes in main range with
subranges but it fails on instructions with two subregs def which are parts of
one reg. The main range constructed with //all// subranges of subregs just after
processing the first operand. So the main range gets intervals from subranges
those are not updated yet.

The patch takes into account lane mask to update the main range.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D128553

[gn build] Port 1cdec6c96e85

[libc++] Re-apply the use of ABI tags to provide per-TU insulation

This commit re-applies 9ee97ce3b830, which was reverted by 61d417ce
because it broke the LLDB data formatter tests. It also re-applies
6148c79a (the manual GN change associated to it).

Differential Revision: https://reviews.llvm.org/D127444

[X86][FP16] Add constrained FP support for scalar emulation

This is a follow up patch to support constrained FP in FP16 emulation.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D128114

[libcxx][ranges] implement `std::ranges::set_difference`

implement `std::ranges::set_difference`
reused classic std::set_difference
added unit tests

Differential Revision: https://reviews.llvm.org/D128983

[SDAG] try to replace subtract-from-constant with xor

This is almost the same as the abandoned D48529, but it
allows splat vector constants too.

This replaces the x86-specific code that was added with
the alternate patch D48557 with the original generic
combine.

This transform is a less restricted form of an existing
InstCombine and the proposed SDAG equivalent for that
in D128080:
https://alive2.llvm.org/ce/z/OUm6N_

Differential Revision: https://reviews.llvm.org/D128123

Disable clang-format entirely for test directories

See discussion here:

https://github.com/llvm/llvm-project/issues/55982

And the RFC here:
https://discourse.llvm.org/t/rfc-disable-clang-format-in-the-clang-test-tree/63498/2

We don't generally expect test files to be formatted according to the
style guide. Indeed, some tests may require specific formatting for the
purposes of the test.

When tests intentionally do not conform to the "correct" formatting,
this causes errors in the CI, which can drown out real errors and causes
people to stop trusting the CI over time.

From the history of the clang/test/.clang-format file, it looks as if
there have been attempts to make clang-format do a subset of formatting
that would be useful for tests. However, it looks as if it's hard to
make clang-format do exactly the right thing -- see the back-and-forth
between
13316a7
and
7b5bddf.

These changes disable the .clang-format file for clang/test, llvm/test,
and clang-tools-extra/test.

Fixes #55982
Differential Revision: https://reviews.llvm.org/D128706

Fix the Clang sphinx bot

This should resolve the issues with:
https://lab.llvm.org/buildbot/#/builders/92/builds/29439

[NFC][SelectionDAG] Fix debug prints in salvageUnresolvedDbgValue

The prints are printing pointer values - fix by dereferencing the pointers.

[PhaseOrdering] Add test for IndVars + SROA interaction (NFC)

[AMDGPU] Add GFX11 test coverage sharing checks with GFX10

[AArch64] Remove incorrect use of DemandElts

This call to computeKnownBits was passing in a 0xff mask, looking like
it was expecting it to be used as a DemandBits, not a DemandElts mask.

[lldb/test] Disable TestStringLiteralExpr.test on Windows

This test, introduced by b042d15d2e39eea528c51a30fe637b9ea84250d3, fails on
https://lab.llvm.org/buildbot/#/builders/83/builds/20933/steps/7/logs/stdio

but succeeds on other targets, see for instance
https://lab.llvm.org/buildbot/#/builders/68/builds/35462/steps/6/logs/stdio

This test is not be arch specific, just disabling it on Windows.

[RISCV] Fix wrong register rename for store value during make-compressible optimization

Current implementation will rename both register in store instructions if
we store base address into memory with same base register, it's OK if
the offset is 0, however that is wrong transform if offset isn't 0, give
a smalle example here:

sd      a0, 808(a0)

We should not transform into:

addi    a2, a0, 768
sd      a2, 40(a2)

That should just rename base address like this:

addi    a2, a0, 768
sd      a0, 40(a2)

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D128876

[AMDGPU] More GFX11 coverage for tests with generated checks

[AArch64] Initial sched model for Neoverse N2

The optimization guide can be found here:
https://developer.arm.com/documentation/PJDOC-466751330-18256/latest/

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D128631

[Support] Fix Windows dump file hang with multi-threaded crashes

Prevents deadlock between MiniDumpWriteDump and
CryptAcquireContextW (called via fs::createTemporaryFile) in
WriteWindowsDumpFile.

However, there's no guarantee that deadlock can't still occur between
MiniDumpWriteDump and some other Win32 API call. But that would appear
to be the "accepted" risk of using MiniDumpWriteDump in this manner.

Differential Revision: https://reviews.llvm.org/D129004

[LoongArch] Add codegen support for multiplication operations

Reference:
https://llvm.org/docs/LangRef.html#mul-instruction

Differential Revision: https://reviews.llvm.org/D128194

[IndVars] Eliminate redundant type cast between integer and float

Recompute the range: match for fptosi of sitofp, and then query the range of the input to the sitofp
according the comment on D129140.

Fixes https://github.com/llvm/llvm-project/issues/55505.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129191

[RISCV] Precommit testcase to show wrong result of make-compressible optimization

Use following example to demo what happened now:

  li      a1, 1
  sd      a1, 800(a0)
  sd      a0, 808(a0) # Store base address into base + offset
  li      a1, 2
  sd      a1, 816(a0)

Current will optimizate into:

  li      a1, 1
  addi    a2, a0, 768
  sd      a1, 32(a2)
  sd      a2, 40(a2) # Wrong replacement for the source register.
  li      a1, 2
  sd      a1, 48(a2)

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D128875

[AMDGPU] Add GFX11 coverage to shared sdag/gisel tests

[AArch64][GlobalISel] Fix call lowering for <3 x i32> vector arguments

Differential Revision: https://reviews.llvm.org/D129194

[SelectionDAG] computeKnownBits / ComputeNumSignBits for the remaining overflow-aware nodes

Some overflow-aware nodes were missing from the switches in
computeKnownBits and ComputeNumSignBits.

[AMDGPU] Add GFX11 test coverage

Add GFX11 test coverage to a bunch of tests where it was easy to do so,
mostly because the checks are autogenerated and/or GFX11 can share the
same checks as GFX10.

Differential Revision: https://reviews.llvm.org/D129295

[lldb/test] Add Shell/Expr/TestStringLiteralExpr.test

This test should exercise the usage of expressions containing
string literals and ensure that lldb doesn't crash.

Differential Revision: https://reviews.llvm.org/D129261

[mlir][Transform] Fix isDefiniteFailure helper

This newly added helper was returning definiteFailure even in the case of silenceableFailure.

Differential Revision: https://reviews.llvm.org/D129347

[JumpThreading] Avoid threadThroughTwoBasicBlocks when PredPred BB ends with indirectbranch

Since we can't change the destination of indirectbr, so when
encounter indirectbr as PredPredBB terminator, we should pass it.

Differential Revision: https://reviews.llvm.org/D129193

[CallSiteSplitting] Regenerate test checks (NFC)

This test requires --function-signature to work with unmodified UTC.

[BasicBlockUtils] Allow critical edge splitting with callbr terminators

After D129205, we support SplitBlockPredecessors() for predecessors
with callbr terminators. This means that it is now also safe to
invoke critical edge splitting for an edge coming from a callbr
terminator. Remove checks in various passes that were protecting
against that.

Differential Revision: https://reviews.llvm.org/D129256

[UpdateTestChecks] Remove outdated help text

Manually modifying the result of update_test_checks.py is discouraged,
we prefer unmodified check lines where possible. The output is also
considered authoritative nowadays, at least for tests targeting core
middle-end components, where not using it is an automatic review
rejection.

Differential Revision: https://reviews.llvm.org/D129259

[libcxx] Make LIBCXX_HERMETIC_STATIC_LIBRARY apply to libc++experimental too

This avoids dllexports in that library.

Differential Revision: https://reviews.llvm.org/D129271

[NFC] Move isSameDefaultTemplateArgument into ASTContext

Move isSameDefaultTemplateArgument into ASTContext to keep consistent
with other ASTContext:isSame* methods.

[SLP] Add missing space to optimization remark.

Reviewed By: vporpo

Differential Revision: https://reviews.llvm.org/D129330

[RISCV] Change VECTOR_SPLICE mask operation from expand to promote

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D128717

Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"

This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three
AMDGPU tests haven't been updated. Will need to verify the changes are
not regressions we should avoid.

[Attributor] Replace AAValueSimplify with AAPotentialValues

For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
  locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
  only as an afterthought. `genericValueTraversal` did offer an option
  but `AAValueSimplify` did not. Thus, we might end up with "too much"
  simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
  problems like the infinite recursion bug (#54981) as well as code
  duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.

Fixes: https://github.com/llvm/llvm-project/issues/54981
Note: A previous version was flawed and consequently reverted in
      6555558a80589d1c5a1154b92cc3af9495f8f86c.

[AMDGPU] Use the HasNoUse predicate for no-ret atomic op selection

This change replaces the C++ predicates with the HasNoUse builtin
predicate that would enable the no-ret atomic op selection in
GlobalISel.

Differential Revision: https://reviews.llvm.org/D125213

[GlobalISel][SelectionDAG] Implement the HasNoUse builtin predicate

This change introduces the HasNoUse builtin predicate in PatFrags that
checks for the absence of use of the first result operand.
GlobalISelEmitter will allow source PatFrags with this predicate to be
matched with destination instructions with empty outs. This predicate is
required for selecting the no-return variant of atomic instructions in
AMDGPU.

Differential Revision: https://reviews.llvm.org/D125212

[AMDGPU] Use AddedComplexity for ret and noret atomic ops selection

This patch removes the predicate for return atomic ops and uses
AddedComplexity to distinguish its selection from its no return variant.
This will produce better matchers that doesn't unnecessarily check for
the negated predicate if the initial predicate failed. Also, it
simplifies the enabling of no return atomic ops selection in GlobalISel.

Differential Revision: https://reviews.llvm.org/D128241

[mlir] Delete ForwardDataFlowAnalysis

With SCCP and integer range analysis ported to the new framework, this old framework is redundant. Delete it.

Depends on D128866

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D128867

[docs] Add document "Debugging C++ Coroutines"

Previously in D99179, I tried to construct debug information for
coroutine frames in the middle end to enhance the debugability for
coroutines. But I forget to add ReleaseNotes to hint people and
documents to help people to use. My bad. @probinson revealed this in
https://github.com/llvm/llvm-project/issues/55916.

So I try to add the use document now.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D127626

[mlir] Swap integer range inference to the new framework

Integer range inference has been swapped to the new framework. The integer value range lattices automatically updates the corresponding constant value on update.

Depends on D127173

Reviewed By: krzysz00, rriddle

Differential Revision: https://reviews.llvm.org/D128866

[C++20] [Modules] Don't complain about duplicated default template argument across modules

See https://github.com/cplusplus/draft/pull/5204 for a detailed
background.

Simply, the test redundant-template-default-arg.cpp attached to this
patch should be accepted instead of being complained about the
redefinition.

Reviewed By: urnathan, rsmith, ChuanqiXu

Differential Revision: https://reviews.llvm.org/D118034

[mlir][complex] Lower complex.angle to libm

complex.angle corresponds to arg function in libm. We can lower complex.angle to arg and argf.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D129341

[X86] Fix collectLeaves for adds used by phi that forms loop

When add has additional users, we should indentify whether add's
user is phi that forms loop rather than root's.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D129169

[RISCV] Recommit test for D128717

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D128778

[X86][FP16] Fix crash when lowering copysign for f16

This is to address the assertion fail reported in https://reviews.llvm.org/D107082#3635612
Not sure if it is a problem of promoting FCOPYSIGN + libcall FP_ROUND.
The promoting will set the rounding mode to 1 https://github.com/llvm/llvm-project/blob/a442c628882eb07fffff8c9f7c87a317af14555a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp#L4810-L4814
While libcall cannot handle the rounding mode equals to 1 https://github.com/llvm/llvm-project/blob/a442c628882eb07fffff8c9f7c87a317af14555a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp#L4324-L4328
So changing the action to Expand to workaround the problem.

Reviewed By: clementval, MaskRay

Differential Revision: https://reviews.llvm.org/D129294

Revert "[RISCV] Precommit test for D128717"

This reverts commit b3b37f3ecfd6be665c385455561603bdfc7d6c76.

[RISCV] Precommit test for D128717

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D128778

[ms] [llvm-ml] Add support for the remaining binary named operators

Finish adding support for the remaining binary named operators in expression context: XOR, SHL, and SHR.

Differential Revision: https://reviews.llvm.org/D129299

[clang-repl][NFC] Split weak symbol test to a new test

Windows has some issues when we try to use `__attribute__((weak))` in
JIT, so we disabled that. But it's not worth to disable the whole test
just for this single feature. This patch split that part from the
original test so we can keep testing stuff that normally working in
Windows.

Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D129250

[mlir][complex] Convert complex.abs to libm

Convert complex.abs to libm library

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D127476

[llvm-objdump][Docs] Document new flag

[mlir][bzl] Update for 1a92dbcfa88a857bf735c897125d9a2f9be53e9e and cab44c515c63db876a7f626fdc36be4a9157ebad

Revert "[Sanitizer][Darwin] Cleanup MaybeReexec() function and usage"

Many tests for the `UBSan-Standalone-iossim-x86_64` fail with this.
Reverting so I can investigate.

This reverts commit 0a9667b0f56b1b450abd02f74c6175bea54f832e.

Add a little extra test coverage for simple template names

This would fail with an overly naive approach to simple template
name (clang's -gsimple-template-names) since the names wouldn't be
unique per specialization, creating ambiguity/chance that a query for
one specialization would find another.

[mlir] add complex type to getZeroAttr

Fixes issue encountered with <sparse> complex constant
https://github.com/llvm/llvm-project/issues/56428

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D129325

[AArch64] Try to re-use extended operand for SETCC with vector ops.

Try to re-use an already extended operand for SetCC with vector operands
feeding an extended select. Doing so avoids requiring another full
extension of the SET_CC result when lowering the select.

This improves lowering for certain extend/cmp/select patterns operating.
For example with  v16i8, this replaces 6 instructions for the extra extension
with 4 separate selects.

This improves the generated code for loops like the one below in
combination with D96522.

    int foo(uint8_t *p, int N) {
      unsigned long long sum = 0;
      for (int i = 0; i < N ; i++, p++) {
        unsigned int v = *p;
        sum += (v < 127) ? v : 256 - v;
      }
      return sum;
    }

https://clang.godbolt.org/z/Wco866MjY

On the AArch64 cores I have access to, the patch improves performance of
the vector loop by ~10%.

This could be generalized per follow-ups, but the initial version
targets one of the more important cases in combination with D96522.

Alive2 modeling:
* sext EQ https://alive2.llvm.org/ce/z/5upBvb
* sext NE https://alive2.llvm.org/ce/z/zbEcJp
* zext EQ https://alive2.llvm.org/ce/z/_xMwof
* zext NE https://alive2.llvm.org/ce/z/5FwKfc
* zext unsigned predicate: https://alive2.llvm.org/ce/z/iEwLU3
* sext signed predicate: https://alive2.llvm.org/ce/z/aMBega

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120481

[Sanitizer][Darwin] Cleanup MaybeReexec() function and usage

While investigating another issue, I noticed that `MaybeReexec()` never
actually "re-executes via `execv()`" anymore.  `DyldNeedsEnvVariable()`
only returned true on macOS 10.10 and below.

Usually, I try to avoid "unnecessary" cleanups (it's hard to be certain
that there truly is no fallout), but I decided to do this one because:

* I initially tricked myself into thinking that `MaybeReexec()` was
  relevant to my original investigation (instead of being dead code).
* The deleted code itself is quite complicated.
* Over time a few other things were mushed into `MaybeReexec()`:
  initializing `MonotonicNanoTime()`, verifying interceptors are
  working, and stripping the `DYLD_INSERT_LIBRARIES` env var to avoid
  problems when forking.
* This platform-specific thing leaked into `sanitizer_common.h`.
* The `ReexecDisabled()` config nob relies on the "strong overrides weak
  pattern", which is now problematic and can be completely removed.
* `ReexecDisabled()` actually hid another issue with interceptors not
  working in unit tests.  I added an explicit `verify_interceptors`
  (defaults to `true`) option instead.

Differential Revision: https://reviews.llvm.org/D129157

Revert "[LLDB][NFC] Decouple dwarf location table from DWARFExpression."

This reverts commit 227dffd0b6d78154516ace45f6ed28259c7baa48 and its
follow up 562c3467a6738aa89203f72fc1d1343e5baadf3c because it breaks a
bunch of tests on GreenDragon:

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45155/

[libcxx][ranges] Create a test tool `ProxyIterator` that customises `iter_move` and `iter_swap`

It is meant to be used in ranges algorithm tests.
It is much simplified version of C++23's tuple + zip_view.
Using std::swap would cause compilation failure and using `std::move` would not create the correct rvalue proxy which would result in copies.

Differential Revision: https://reviews.llvm.org/D129099

Revert "[RISCV] Optimize 2x SELECT for floating-point types"

This reverts commit 1178992c72b002c3b2c87203252c566eeb273cc1.

[clang-format][NFC] Clean up IndentForLevel in LevelIndentTracker

Differential Revision: https://reviews.llvm.org/D129105

[mlir] An implementation of dense data-flow analysis

This patch introduces an implementation of dense data-flow analysis. Dense
data-flow analysis attaches a lattice before and after the execution of every
operation. The lattice state is propagated across operations by a user-defined
transfer function. The state is joined across control-flow and callgraph edges.

Thge patch provides an example pass that uses both a dense and a sparse analysis
together.

Depends on D127139

Reviewed By: rriddle, phisiart

Differential Revision: https://reviews.llvm.org/D127173

[LLDB] Fix aggregate-indirect-arg.cpp failure introduced by 227dffd0b6d78154516ace45f6ed28259c7baa48

[gn build/mac] Use -mmacos-version-min instead of -mmacosx-version-min

The two flags do the same thing, but the OS is called macOS these days.

(The new flag is 5 years old: https://reviews.llvm.org/D32796)

No behavior change.

[vscode-mlir] Bump to version 0.0.10

Since version 0.9 we've:

* Bumped the language-client to `8.0.2-next.5` to fix various bugs/stability issues
* Fixed an issue with starting a language server for non-workspace files

[Attributor] Make heap2stack record alloca placement

We recently learned to place the alloca during the heap2stack
transformation in the entry block but we did not account for other
concurrent modifications. We need to record our decision rather than
checking (then outdated) passes during the manifest stage. This will
also allow us to use a custom (=optimistic) "loop info" in the future.