review.tizen.org Git - platform/upstream/llvm.git/log

[libc++] Fix modules issues on OS X

First, fix a collision with the Point type from MacTypes.h, which was
reported on Slack, 2022-07-31: https://cpplang.slack.com/archives/C2X659D1B/p1659284691275889

Second, rename the meta:: namespace to types::. OSX's "/usr/include/ncurses.h"
defines a `meta` function, and is (for some reason) included in
"<SDK>/usr/include/module.modulemap", so that identifier is off-limits
for us to use in anything that compiles with -fmodules:

    libcxx/test/support/type_algorithms.h:16:11: error: redefinition of 'meta' as different kind of symbol
    namespace meta {
               ^
    <SDK>/usr/include/ncurses.h:603:28: note: previous definition is here
    extern NCURSES_EXPORT(int) meta (WINDOW *,bool);                        /* implemented */
                                ^

Finally, add a CI configuration for modules on OS X to make sure it
does not regress.

Differential Revision: https://reviews.llvm.org/D144915

[mlir][sparse] Improve the implementation of sparse_tensor.new for the codegen path.

Rewrite a NewOp into a NewOp of a sorted COO tensor and a ConvertOp for
converting the sorted COO tensor to the destination tensor type.

Codegen a NewOp of a sorted COO tensor to use the new bulk reader API and sort
the elements only when the input is not sorted.
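
As a rough illustration of the "sort only when the input is not sorted" step, here is a minimal C++ sketch. It is not the MLIR codegen itself: `Element`, `coordLess` and `sortIfNeeded` are hypothetical names, and a rank-2 COO tensor with lexicographic coordinate order is assumed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical COO element of a rank-2 tensor: (row, column) plus a value.
struct Element {
  std::array<std::uint64_t, 2> coords;
  double value;
};

// Lexicographic order on coordinates, assumed to be the sorted-COO order.
inline bool coordLess(const Element &a, const Element &b) {
  return a.coords < b.coords;
}

// Sorts the elements only if they are not already sorted.
// Returns true when a sort was actually performed.
inline bool sortIfNeeded(std::vector<Element> &elems) {
  if (std::is_sorted(elems.begin(), elems.end(), coordLess))
    return false;
  std::sort(elems.begin(), elems.end(), coordLess);
  return true;
}
```

Calling `sortIfNeeded` a second time on the same vector returns false, so already-sorted input only pays for the linear sortedness check.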

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D144504

[mlir][MemRef] Rewrite multi-buffering with proper composable abstractions

Rewrite and document multi-buffering properly:
1. Use IndexingUtils / StaticValueUtils instead of duplicating functionality
2. Properly plumb RewriterBase through.
3. Add support
4. Better debug messages.

This revision is otherwise almost NFC, if it weren't for the extra DeallocOp
support that would previously make multi-buffering fail.

Depends on: D145036

Differential Revision: https://reviews.llvm.org/D145055

[clang][RISCV][test] Add coverage for __fp16 support in arguments/returns

By choice, we don't set HalfArgsAndReturns=true (which would allow
__fp16 in args and returns). Add test coverage for this to ensure it
isn't changed by accident.

[SimpleLoopUnswitch] Forget loops before invalidating IR.

Invalidate SCEV before adjusting switch instruction, so the IR remains
in a valid state for SCEV invalidation.

[ARM] Remove a redundant function fixupBTI

Since the redundant BTI instructions emitted by jump tables are now
removed in the ARMBranchTargets pass, the fixupBTI function is not needed
in the ARMConstantIslandPass. Some related tests are removed as well.

The relevant patch that removes the redundant BTI instructions:
https://reviews.llvm.org/D144470

Differential Revision: https://reviews.llvm.org/D145048

[flang] MERGE result is polymorphic only if TSOURCE and FSOURCE are polymorphic

16.9.129 point 4: the result is polymorphic if and only if both TSOURCE and
FSOURCE are polymorphic.

If neither TSOURCE nor FSOURCE is polymorphic, the current behavior is
preserved.

Depends on D145058

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D145060

LibclangTest: remove libclang-test-* tmp dir reliably

Temporary directories created by two LibclangReparseTest tests -
ReparseWithModule and clang_parseTranslationUnit2FullArgv - remained in
the system temporary directory after running libclangTests, because not
all files and subdirectories created in TestDir were added to set
LibclangParseTest::Files.

Differential Revision: https://reviews.llvm.org/D143415

[flang] Allow scalar boxed record type in intrinsic elemental lowering

Slightly relax the condition added in D144417 to allow scalar polymorphic
entities and boxed scalar record types.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D145058

[clang][Interp] This pointers are writable in de-/constructors

This is possible in C++20, so we need to check this when doing stores.

Differential Revision: https://reviews.llvm.org/D136751

[SLP][NFC] Update the test to simplify and avoid dead instruction
removal, NFC.

[CodeGen] Always expand division larger than i128

Default MaxDivRemBitWidthSupported to 128, so that divisions larger
than 128 bits are always expanded, without requiring additional
configuration from the target.

Note that this may still emit calls to __udivti3 on 32-bit targets,
which likely don't have an implementation of that builtin. However,
I believe this is sufficient to fix
https://github.com/llvm/llvm-project/issues/60531, because Zig must
already be defining those builtins.

Differential Revision: https://reviews.llvm.org/D144871

[RISCV] Pre-commit test case for ordered reduction, NFC

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D144458

[LoopVectorize] Use overflow-check analysis to improve tail-folding.

This work follows on from D142109 and addresses a possible regression
when we know the loop iteration counter cannot overflow.

When we know the overflow-check always evaluates to false, it's better to
use the other style of tail folding where it assumes a runtime check was
added, because that avoids having to calculate a modified trip-count.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D142894

[gn build] Port f8d10d5ac9ab

[InstCombine] prevent miscompiles from select-of-div/rem transform

This avoids the danger shown in issue #60906.
There were no regression tests for these patterns, so these potential
failures have been around for a long time.

We freeze the condition and preserve the optimization because
getting rid of a div/rem is always a win.

Here are a couple of examples that can be corrected by freezing the
condition:
https://alive2.llvm.org/ce/z/sXHTTC

Differential Revision: https://reviews.llvm.org/D144671

[AArch64] Load into zero vector patterns

An LDR will implicitly zero the rest of the vector, so vector_insert(zeros,
load, 0) can use a single load. This adds tablegen patterns for both scaled and
unscaled loads, detecting where we are inserting a load into the lower element
of a zero vector.

Differential Revision: https://reviews.llvm.org/D144086

[gn] port e281d102fb73 more

Move close() to the proper else block

`LogWriter::Close(LW)` is outside the null-check if-else block, which causes a null dereference when `LW == nullptr`.
I think the close() is meant to be in the else block, which is taken when `LW != nullptr`.
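
A minimal C++ sketch of the corrected control flow follows. The `LogWriter` type here is a hypothetical stand-in with only the shape needed for the example, and `writeLog` is an invented name; the real code differs.

```cpp
#include <cassert>

// Hypothetical stand-in for the real LogWriter.
struct LogWriter {
  bool Closed = false;
  // Returns nullptr when opening fails.
  static LogWriter *Open(bool Succeed) {
    return Succeed ? new LogWriter() : nullptr;
  }
  // Dereferences LW, so it must never be called with nullptr.
  static void Close(LogWriter *LW) {
    LW->Closed = true;
    delete LW;
  }
};

// Returns true if a writer was opened and closed, false on the error path.
inline bool writeLog(bool OpenSucceeds) {
  LogWriter *LW = LogWriter::Open(OpenSucceeds);
  if (LW == nullptr) {
    return false; // error path: there is nothing to close
  } else {
    // ... write records here ...
    LogWriter::Close(LW); // only reached when LW != nullptr
    return true;
  }
}
```

With the close call inside the else branch, the failing-open path returns without ever touching the null pointer.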

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D145039

[clang][RISCV][test] Add further test coverage for _Float16 on RISC-V

Check for size and alignment as we do for other types.

[flang] Implement isnan and ieee_is_nan intrinsics

To implement these we call the LLVM intrinsic is.fpclass indicating that
we are checking for either a quiet or signalling NaN.
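
To make the "quiet or signalling NaN" test concrete, here is a small C++ sketch that classifies an IEEE-754 binary64 value by its bit pattern. The two mask bits follow the class-test encoding documented for llvm.is.fpclass (bit 0 signalling NaN, bit 1 quiet NaN); the helper names are hypothetical, and treating the top mantissa bit as the quiet bit is an assumption that holds on most current targets.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Class-test bits for the two NaN classes.
constexpr unsigned kSignalingNaN = 1u << 0;
constexpr unsigned kQuietNaN = 1u << 1;

// Returns the NaN class bit of x, or 0 if x is not a NaN.
inline unsigned nanClass(double x) {
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  const std::uint64_t exponent = (bits >> 52) & 0x7FF;
  const std::uint64_t mantissa = bits & ((std::uint64_t{1} << 52) - 1);
  if (exponent != 0x7FF || mantissa == 0)
    return 0; // finite value or infinity
  // Assumption: the top mantissa bit marks a quiet NaN.
  return (mantissa >> 51) ? kQuietNaN : kSignalingNaN;
}

// isnan / ieee_is_nan style check: true for either NaN class.
inline bool isNaN(double x) {
  return (nanClass(x) & (kSignalingNaN | kQuietNaN)) != 0;
}
```

Testing against the combined mask is what makes the check cover both kinds of NaN with a single classification.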

Differential Revision: https://reviews.llvm.org/D144649

[mlir][Linalg] Improve HoistPadding to propagate through iter_args

This revision properly plumbs the substitution of a padded op through
iter_args in the case of an scf::ForOp consumer.

Differential Revision: https://reviews.llvm.org/D145036

[NFC] Fix incorrect comment in VLIW packetizer

Reviewed By: bcain

Differential Revision: https://reviews.llvm.org/D145050

[mlir][standalone] Enable to build as LLVM external project

In addition to the component build, this enables the standalone example
to be built as part of a monolithic LLVM build by using the LLVM
external projects mechanism (`LLVM_EXTERNAL_PROJECTS`).

Reviewed By: stephenneuendorffer, stellaraccident

Differential Revision: https://reviews.llvm.org/D143718

[lldb/test] Update error message in debug-types-signature-loop.s

The error message changed in D144664.

[clang-format][NFC] Refactor formatting unit tests.

Pull out common base class for formatting unit tests, removing duplicate
code that accumulated over the years.

Pull out macro expansion test into its own test file.

[AMDGPU][AsmParser][NFC] Simplify parsing cache policies.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D144954

[mlir-reduce] Create proper tmp test files (NFC)

This commit makes the sh script create temporary files with mktemp so
that they do not collide with existing files. The previous
behaviour caused sporadic permission issues on a multi-user system.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D145054

[Flang][WWW] Update Bug Reports link to point to Github issues

[Dexter] Use non-blocking resume when debugging Visual Studio

The Visual Studio debugger currently uses blocking calls to Go and
StepInto, which interferes with Dexter's ability to do any processing
(e.g. checking for time outs) in between breakpoints. This patch updates
these functions to use non-blocking calls.

Reviewed By: Orlando

Differential Revision: https://reviews.llvm.org/D144986

[clang][test][RISCV] Add RISC-V to clang/test/Sema/Float16.c

Since D105001, HasFloat16 was unconditionally set to true for RISC-V.
This patch adds test coverage for this.

[mlir][llvm] Make DISubprogram name optional

This commit makes the name parameter of the DISubprogramAttr optional.
LLVM will for example omit these subprogram names in initialization
functions for globals.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D145046

[NFC][clang] Refine tests by adding `:` to checks

The tests can fail if the working directory where the tests were launched
has an `error` substring in its path.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D144495

[AArch64] More patterns to generate LD1R vector splats

We are missing patterns to generate vector splats using LD1R. A shuffle vector
with all 0s is a vector splat if the operands are a load and undef for which
we can generate a LD1R.

Differential Revision: https://reviews.llvm.org/D145004

[AArch64] Precommit tests to check more ld1r vector splat patterns in D145004.

[mlir] Fix GreedyPatternRewriteDriver::notifyOperationModified.

The previous implementation did not notify the attached listener.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D145049

[flang][hlfir] Implement hlfir.declare optional codegen

The hlfir fir.box with the local lower bounds and type parameters
must be generated conditionally when the entity is optional.

Differential Revision: https://reviews.llvm.org/D144962

[AArch64] Remove 64bit->128bit vector insert lowering

The AArch64 backend, during lowering, will convert a 64bit vector insert to a
128bit vector:

vector_insert %dreg, %v, %idx
=>
%qreg = insert_subvector undef, %dreg, 0
%ins = vector_insert %qreg, %v, %idx
EXTRACT_SUBREG %ins, dsub

This creates a bit of a mess in the DAG, and the EXTRACT_SUBREG being a machine
node makes it difficult to simplify. This patch removes that, treating the
64bit vector insert as legal and handling them with extra tablegen patterns.

The end result is a simpler DAG that is easier to write tablegen patterns for.

Differential Revision: https://reviews.llvm.org/D144550

[InstCombine] Improve the analysis through the dominating condition

Take the dominating condition into account; the urem fold benefits from the improved analysis.
Fix https://github.com/llvm/llvm-project/issues/60546

NOTE: the calls in simplifyBinaryIntrinsic and foldICmpWithDominatingICmp
are deleted to reduce compile time.

Reviewed By: nikic, arsenm, erikdesjardins
Differential Revision: https://reviews.llvm.org/D144248

[LoopVectorize] Remove runtime check and scalar tail loop when tail-folding.

When using tail-folding and using the predicate for both data and control-flow
(the next vector iteration's predicate is generated with the llvm.active.lane.mask
intrinsic and then tested for the backedge), the LoopVectorizer still inserts a
runtime check to see if the 'i + VF' may at any point overflow for the given
trip-count. When it does, it falls back to a scalar epilogue loop.

We can get rid of that runtime check in the pre-header and therefore also
remove the scalar epilogue loop. This reduces code-size and avoids a runtime
check.

Consider the following loop:

  void foo(char * __restrict__ dst, char *src, unsigned long N) {
      for (unsigned long  i=0; i<N; ++i)
          dst[i] = src[i] + 42;
  }

If 'N' is e.g. ULONG_MAX, and the VF > 1, then the loop iteration counter
will overflow when calculating the predicate for the next vector iteration
at some point, because LLVM does:

  vector.ph:
    %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N)

  vector.body:
    %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
    %active.lane.mask = phi <vscale x 16 x i1> [ %active.lane.mask.entry, %vector.ph ], [ %active.lane.mask.next, %vector.body ]
    ...

    %index.next = add i64 %index, 16
      ; The add above may overflow, which would affect the lane mask and control flow. Hence a runtime check is needed.
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %index.next, i64 %N)
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7

The solution:

What we can do instead is calculate the predicate before incrementing
the loop iteration counter, such that the llvm.active.lane.mask is
calculated from 'i' to 'tripcount > VF ? tripcount - VF : 0', i.e.

  vector.ph:
    %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N)
    %N_minus_VF = select %N > 16 ? %N - 16 : 0

  vector.body:
    %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
    %active.lane.mask = phi <vscale x 16 x i1> [ %active.lane.mask.entry, %vector.ph ], [ %active.lane.mask.next, %vector.body ]
    ...

    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %index, i64 %N_minus_VF)
    %index.next = add i64 %index, %4
      ; The add above may still overflow, but this time the active.lane.mask is not affected
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7

For N = 20, we'd then get:

  vector.ph:
    %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N)
      ; %active.lane.mask.entry = <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1>
    %N_minus_VF = select 20 > 16 ? 20 - 16 : 0
      ; %N_minus_VF = 4

  vector.body: (1st iteration)
    ... ; using <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1> as predicate in the loop
    ...
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 4)
      ; %active.lane.mask.next = <1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>
    %index.next = add i64 0, 16
      ; %index.next = 16
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0
      ; %8 = 1
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7
      ; branch to %vector.body

  vector.body: (2nd iteration)
    ... ; using <1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0> as predicate in the loop
    ...
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 16, i64 4)
      ; %active.lane.mask.next = <0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>
    %index.next = add i64 16, 16
      ; %index.next = 32
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0
      ; %8 = 0
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7
      ; branch to %for.cond.cleanup
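
The walk-through above can be replayed with a small C++ simulation. This is only a sketch under stated assumptions: the vector width is fixed at 16 lanes rather than scalable, an 8-bit counter stands in for i64 so that wrap-around is easy to observe, and `activeLaneMask` and `processedElements` are hypothetical names that model, but are not, the LLVM intrinsic and the vectorized loop.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned VF = 16; // assumed fixed vector width
using Mask = std::array<bool, VF>;

// Models llvm.get.active.lane.mask: lane i is active iff base + i < n,
// with the addition done in a wider type so that it cannot wrap.
inline Mask activeLaneMask(std::uint8_t base, std::uint8_t n) {
  Mask m{};
  for (unsigned i = 0; i < VF; ++i)
    m[i] = static_cast<unsigned>(base) + i < n;
  return m;
}

// Runs the tail-folded loop with the predicate computed before the
// counter increment and returns how many elements were processed.
inline unsigned processedElements(std::uint8_t n) {
  Mask mask = activeLaneMask(0, n);
  const auto nMinusVF = static_cast<std::uint8_t>(n > VF ? n - VF : 0u);
  std::uint8_t index = 0;
  unsigned processed = 0;
  while (true) {
    for (bool lane : mask)
      processed += lane; // the loop body runs under the current predicate
    const Mask next = activeLaneMask(index, nMinusVF);
    index = static_cast<std::uint8_t>(index + VF); // may wrap; mask unaffected
    if (!next[0])
      break; // lane 0 inactive: leave the loop
    mask = next;
  }
  return processed;
}
```

With `n = 20` the simulation runs two iterations (16 lanes, then 4), matching the trace above. With `n = 250` the 8-bit counter wraps on the last increment, yet exactly 250 elements are processed because the mask was computed before the add.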

Reviewed By: fhahn, david-arm

Differential Revision: https://reviews.llvm.org/D142109

NFC: Use generate_test_checks script for LV tests which seem to have been auto-generated.

[Orc] Remove LLVMInitializeCore() calls from examples

Per discussion on D144970, these are no longer necessary.

[flang][NFC] Remove redundant and incomplete comment

[flang] Handle dynamic type in move_alloc

Update move_alloc to carry over the dynamic type of `from` to `to`
and reset the dynamic type of `from` to its declared type when it
is polymorphic.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D144997

[MLIR][Linalg] Fix propagation for rank-zero tensor

`isScalar` only returns true if the operand is non-shaped.
But we need to handle also rank zero tensors.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D144989

[IR][Legalization] Split illegal deinterleave and interleave vectors

To make legalization easier, the operands and outputs have the same size for
these ISD Nodes. When legalizing the results in SplitVectorResult the operands
are legalized to the same size as the outputs.
The ISD Node has two output/results, therefore the legalizing functions update
both results/outputs.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144744

[mlir][llvm] Add AliasAnalysis and AccessGroup interfaces.

The revision introduces two interfaces that provide access to
the alias analysis and access group metadata attributes. The
AliasAnalysis interface combines all alias analysis related
attributes (alias, noalias, and tbaa) similar to LLVM's getAAMetadata
method, while the AccessGroup interface is dedicated to the
access group metadata.

Previously, only the load and store operations supported alias analysis
and access group metadata. This revision extends this support to the
atomic operations. A follow-up revision will also add support for the
memcpy, memset, and memmove intrinsics. The interfaces then provide
convenient access to the metadata attributes and eliminate the need
of TypeSwitch or string based attribute access.

The revision still relies on string based attribute access for
the translation to LLVM IR (except for tbaa metadata). Only once
the memory access intrinsics also implement the new interfaces,
the translation to LLVM IR can be fully switched to use interface
based attribute accesses.

Depends on D144875

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D144851

[clang][ASTImporter] Improve import of InjectedClassNameType.

During AST import multiple different InjectedClassNameType objects
could be created for a single class template. This can cause problems
and failed assertions when these types are compared and found to be
not the same (because the instance is different and there is no
canonical type).
The import of this type does not use the factory method in ASTContext,
probably because the preconditions are not fulfilled at that state.
The fix tries to make the code in ASTImporter work more like the code
in ASTContext::getInjectedClassNameType. If a type is stored at the
Decl or previous Decl object, it is reused instead of creating a new
one. This avoids the crash in at least some of the cases.

Reviewed By: gamesh411, donat.nagy, vabridgers

Differential Revision: https://reviews.llvm.org/D140562

[mlir][NFC] Address filecheck_lint findings in Vector/CPU/test-broadcast.mlir.

Differential Revision: https://reviews.llvm.org/D144972

[IR] Add LLVM IR support for target("aarch64.svcount") type.

The C and C++ Language Extensions for AArch64 SME2 [1] adds a new type called
`svcount_t` which describes a predicate. This is not a predicate vector
mask, but rather a description of a predicate vector mask that can be
expanded into a mask using explicit instructions. The type is a scalable
opaque type.

To implement `svcount_t` type this patch uses the existing Target Extension Type
mechanism, but adds further support so that this type can be a scalable type.

AArch64 CodeGen support will follow in a separate patch.

[1] https://github.com/ARM-software/acle/pull/217

Reviewed By: jcranmer-intel, nikic

Differential Revision: https://reviews.llvm.org/D136861

[mlir][llvm] Prioritize DILocalScope over file loc

This commit ensures that the LLVMIR export prioritizes existing
DILocalScope attribute information as location scopes over files
constructed from filenames. All DILocalScope attributes contain file
information, so no information is lost. The previous implementation
caused the introduction of superfluous DILexicalBlockFile nodes in
certain cases. The old implementation remains as a fallback when no
DILocalScope is present.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D144968

[clang][driver] Do not emit default '-Tdata' for AVR devices

Different AVR devices have different data regions. Current clang
driver emits a default '-Tdata' option to the linker. This way
works fine if there is no user specified linker script, but it
will cause conflicts if there is one.

A better solution for setting the default data region to GNU ld
is defining symbol __DATA_REGION_ORIGIN__, which is expected by
GNU ld's default AVR linker script.

Fixes https://github.com/llvm/llvm-project/issues/60362

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D144533

[libclang] Remove redundant return statements in CXType.cpp

Let the branch fall through the error path like other functions here do.

Differential Revision: https://reviews.llvm.org/D140074

[TableGen] Minor tweak to AssemblerCondDag evaluation to be more consistent with other dags. NFC

Instead of using getAsString on the dag operator, check if the operator
is a DefInit and then get the name of the Def.

[X86] Add `TuningPreferShiftShuffle` for when Shifts are preferable to shuffles.

SKX has an objectively faster shift than shuffle, on all other targets
the two have equal performance (with maybe a slight preference for
shifts because p5 is a more common bottleneck).

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D143786

[X86] Make `(shift X (xor/sub N-1, Y))` -> `(shift X, (not Y))` check for one use.

`(xor/sub N-1, Y)` -> `(not Y)` is minorly preferable (especially for
`(sub N-1, Y)` where it saves an instruction), but isn't worth
potentially creating an extra instruction for.

So, only do the transformation if `(xor/sub N-1, Y)` has one use.
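
The equivalence behind this rewrite can be checked with a short C++ sketch. It assumes the shift count is reduced modulo N, as a 32-bit x86 shift does by keeping only the low 5 bits of the count; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kBits = 32;        // N
constexpr std::uint32_t kMask = kBits - 1; // N-1, the count bits that are kept

// Left shift with the count reduced modulo N (assumed hardware behaviour).
inline std::uint32_t shlMasked(std::uint32_t x, std::uint32_t count) {
  return x << (count & kMask);
}

// (shift X, (sub N-1, Y))
inline std::uint32_t shlBySub(std::uint32_t x, std::uint32_t y) {
  return shlMasked(x, kMask - y);
}

// (shift X, (xor N-1, Y))
inline std::uint32_t shlByXor(std::uint32_t x, std::uint32_t y) {
  return shlMasked(x, kMask ^ y);
}

// (shift X, (not Y))
inline std::uint32_t shlByNot(std::uint32_t x, std::uint32_t y) {
  return shlMasked(x, ~y);
}
```

All three forms agree because `N-1 - Y`, `(N-1) ^ Y` and `~Y` have the same low log2(N) bits, which are the only bits the shift looks at.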

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D144985

[X86] Fix `(shift X, (xor Y, N-1))` -> `(shift X, (not Y))` by properly inserting `not Y` into DAG. [#61038]

Previously, the `-1` in `not Y` (`xor Y, -1`) was not inserted into the
DAG. Not inserting `-1` as a DAG node comes up as a bug when doing
`(xor (shl 1, A), B)` -> `(btc A, B)`. `btc` requires `B` (dst) to be
a register.

Differential Revision: https://reviews.llvm.org/D144984

[PowerPC] update PPCTTIImpl::supportsTailCallFor() check conditions

This patch reuses `PPCTargetLowering::isEligibleForTCO()` to check
`PPCTTIImpl::supportsTailCallFor()`.

Fixes #59315

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D140369

Revert "[Modules] Don't check [temp.friend]p9 in ASTContext::isSameEntity"

This reverts commit 74565c3add6d683559618973863e78a5e6836e48.

It looks like this one causes the modular libcxx build to fail.

Revert "[C++20] [Modules] Trying to compare the trailing require clause from the primary template function"

This reverts commit 9e50578ba43c49ee13ac3bb7d4868565824f9b29, since it
looks like this one prevents us from fixing the modular build for libcxx.

[Doc][NFC] Add template type when use MachinePassRegistry.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D144945

[COFF][X86_64] Put jump table in .rdata for Windows

Put jump table in .rdata for Windows to align with that for Linux.
It can avoid loading the same code page into I$ and D$ simultaneously
and thus favor performance.

Differential Revision: https://reviews.llvm.org/D144701

Revert "[ControlHeightReduction] Don't combine a "poison" branch"

This reverts commit 38a64aab4a3fbaaeb383638ff654247902796556.

llvm-clang-x86_64-expensive-checks-debian is failing:

https://lab.llvm.org/buildbot/#/builders/16/builds/44249

Fix failed libcxx test build on the Windows to Linux cross builders. NFC.

Disable `modules_include.sh.cpp` test on Windows build hosts,
it cannot be executed there anymore.

Differential Revision: https://reviews.llvm.org/D144640

Fix the run locker setting for async launches that don't stop at the
initial stop. The code was using PrivateResume when it should have
used Resume.

This was allowing expression evaluation while the target was running,
and though that was caught a little later on, we should never have gotten
that far. To make sure that this is caught immediately I made an error
SBValue when this happens, and test that we get this error.

Differential Revision: https://reviews.llvm.org/D144665

[mlir][math] Math expansion for math.tan

We can implement a polynomial approximation of math.tan by
decomposing to `math.sin` and `math.cos`. While it is not
technically a polynomial approximation, it should be the most
straightforward approximation.
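
In scalar C++ terms the decomposition amounts to the following sketch. `tanExpanded` is a hypothetical name, and the standard library functions stand in for `math.sin` and `math.cos`.

```cpp
#include <cassert>
#include <cmath>

// Expands tan(x) into sin(x) / cos(x).
inline double tanExpanded(double x) {
  return std::sin(x) / std::cos(x);
}
```

Near odd multiples of pi/2 the divisor approaches zero, so the accuracy of the quotient there depends on the accuracy of the cosine.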

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D144980

[mlir][NFC] Address filecheck_lint findings in tosa-to-linalg-named.mlir.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D144971

Fix typos in the test command for D144929

Put the arch-dep debugserver files in main CMakeLists.txt

The architecture dependent files for debugserver were
built out of their own separate CMakeLists.txt for historical
reasons; it's not necessary any longer. Remove that file
and put them in the main debugserver CMakeLists.txt.

Differential Revision: https://reviews.llvm.org/D145020
rdar://105993317

[Clang][CodeGen] Fix this argument type for certain destructors

With the Microsoft ABI, some destructors need to offset a parameter to
get the derived this pointer, in which case the type of that parameter
should not be a pointer to the derived type.

Fixes #60465

An SBValue whose underlying ValueObject has no valid value, but does
hold an error should:

(a) return false for IsValid, since that's the current behavior and is
a convenient way to check "should I get the value for this".
(b) preserve the error when an SBValue is made from it, and print the
error in the ValueObjectPrinter.

Make that happen.

Differential Revision: https://reviews.llvm.org/D144664

[builtins] Add option to always build int128 routines

32-bit targets don't build these by default, but e.g. armv7 and x86 can
build them just fine, and it's useful to have the int128 routines
available for certain applications. Add a CMake option to let us include
the int128 routines for architectures which would otherwise lack them.

Reviewed By: compnerd, MaskRay, phosek

Differential Revision: https://reviews.llvm.org/D145003

[ControlHeightReduction] Don't combine a "poison" branch

Without this patch, the control height reduction pass would combine a
"poison" branch with an earlier well-defined branch, turning the
earlier branch into a "poison" branch also.

This patch fixes the problem by rejecting "poison" conditional
branches.

Differential Revision: https://reviews.llvm.org/D145008

Add SBCommandInterpreter::UserCommandExists parallel to CommandExists.

The latter only checks built-in commands. I also added some docs to
make the distinction clear and a test.

Differential Revision: https://reviews.llvm.org/D144929

[OCaml] Migrate from naked pointers to prepare for OCaml 5

The OCaml bindings currently return pointers to LLVM objects as-is to
OCaml. These "naked pointers" end up appearing as values of local
variables in OCaml code, stored as part of other OCaml values,
etc. The safety of this design relies on the OCaml runtime system's
ability to distinguish these pointers from pointers to memory on the
OCaml garbage collected heap. In particular, when the OCaml GC
encounters a pointer to memory known to not be part of the OCaml heap,
it does not follow it.

In OCaml 4.02 an optimized "no naked pointers" mode was introduced
where the runtime system does not perform such checks and requires
that no such naked pointers be passed to OCaml code, instead one of
several encodings needs to be used. In OCaml 5, the no naked pointers
mode is now the only mode. This diff uses one of the potential
encodings to eliminate naked pointers, making the LLVM OCaml bindings
compatible with the "no naked pointers" mode of OCaml >= 4.02 and 5.

The encoding implemented in this diff relies on LLVM objects to be at
least 2-byte aligned, meaning that the lsb of pointers will
necessarily be clear. The encoding sets the lsb when passing LLVM
pointers to OCaml, and clears it on the return path. Setting the lsb
causes the OCaml runtime system to interpret the resulting value as a
tagged integer, which does not participate in garbage collection.

In some cases, particularly functions that receive an OCaml array of
LLVM pointers, this encoding requires allocation of a temporary array,
but otherwise this diff aims to preserve the existing performance
characteristics of the OCaml bindings.
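
The lsb encoding described above can be sketched in a few lines of C++. The names `Value`, `toVal`, `fromVal` and `looksLikeTaggedInt` are hypothetical stand-ins chosen for this example, and `Value` only models OCaml's word-sized `value` type; the actual binding code may look different.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for OCaml's word-sized `value` type.
using Value = std::intptr_t;

// Encodes a pointer for OCaml by setting the least significant bit.
// Requires the pointee to be at least 2-byte aligned.
inline Value toVal(void *p) {
  const auto bits = reinterpret_cast<std::uintptr_t>(p);
  assert((bits & 1) == 0 && "pointer must be at least 2-byte aligned");
  return static_cast<Value>(bits | 1);
}

// Decodes on the return path by clearing the least significant bit.
inline void *fromVal(Value v) {
  return reinterpret_cast<void *>(static_cast<std::uintptr_t>(v) &
                                  ~std::uintptr_t{1});
}

// A word with the lsb set is what the OCaml runtime treats as a tagged
// integer, which the garbage collector does not follow.
inline bool looksLikeTaggedInt(Value v) { return (v & 1) != 0; }
```

The round trip `fromVal(toVal(p))` returns the original pointer, while the encoded word is never mistaken for a heap pointer.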

Reviewed By: jberdine

Differential Revision: https://reviews.llvm.org/D136400

[X86] Revise Alderlake P-Core schedule model

The previous Alderlake P-Core model preferred data from uops.info over the Intel doc.
Some latencies measured by uops.info are larger than the real latency, e.g. addpd
latency is 3 in uops.info but 2 in the Intel doc. This patch adjusts the priority
of those two data sources so that the Intel doc is preferred.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D144388

[Coroutines] Avoid creating conditional cleanup markers in suspend block

We shouldn't access coro frame after returning from `await_suspend()` and before `llvm.coro.suspend()`.
Make sure we always hoist conditional cleanup markers when inside the `await.suspend` block.

Fix https://github.com/llvm/llvm-project/issues/59181

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D144680

[Fuchsia] Enable LLVM_USE_RELATIVE_PATHS_IN_FILES in bootstrap

This patch enables LLVM_USE_RELATIVE_PATHS_IN_FILES when building the
bootstrap toolchain for 2-stage builds.

Differential Revision: https://reviews.llvm.org/D145010

[X86] Improve select of constants

Without this patch:

  %cmp = icmp eq i32 %a, %b
  %cond = select i1 %cmp, i32 1, i32 2

is compiled as:

  31 c9                      xor    %ecx,%ecx
  39 f7                      cmp    %esi,%edi
  0f 94 c1                   sete   %cl
  b8 02 00 00 00             mov    $0x2,%eax
  29 c8                      sub    %ecx,%eax

With this patch, the compiler generates:

  31 c0                      xor    %eax,%eax
  39 f7                      cmp    %esi,%edi
  0f 95 c0                   setne  %al
  ff c0                      inc    %eax

saving 5 bytes while reducing register usage.

This patch transforms C - setcc into inverted_setcc + (C-1) if C is a
nonzero constant.
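
The arithmetic identity behind the transform can be checked with a tiny C++ sketch. The function names are hypothetical, and a C++ `bool` converted to `int` stands in for the 0/1 result of setcc.

```cpp
#include <cassert>

// Original form: C - setcc(a == b).
inline int subSetcc(int c, int a, int b) {
  return c - static_cast<int>(a == b);
}

// Rewritten form: inverted setcc plus (C - 1).
inline int invertedSetccPlus(int c, int a, int b) {
  return static_cast<int>(a != b) + (c - 1);
}

// The select from the example above: a == b ? 1 : 2.
inline int selectOneTwo(int a, int b) { return a == b ? 1 : 2; }
```

For `C = 2` both forms reproduce the select from the example, and adding `C - 1 = 1` is what lets the backend use a single `inc`.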

This patch fixes:

https://github.com/llvm/llvm-project/issues/60854

Differential Revision: https://reviews.llvm.org/D144449

[GWP-ASan] Handle wild touches of the guarded pool.

AllocMeta could be null when returned from __gwp_asan_get_metadata() for
a bad access into the GuardedPagePool that was never allocated.
Currently, we then dereference the null pointer, oops.

Hoist the check up and print a message (only once in recoverable mode)
about the bad memory access.

Reviewed By: fmayer

Differential Revision: https://reviews.llvm.org/D144973

Revert "[X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS"

This reverts commit 9e58182d6446bb61dbd13c0e6314f291e50d4d7c.

[X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS

Ignoring freeze(undef) if it has multiple uses in LowerAVXCONCAT_VECTORS
causes the custom INSERT_SUBVECTOR for vector widening to be ignored.

Differential Revision: https://reviews.llvm.org/D14490

Update debugserver xcode proj to build with c++17

Also a few small fixes for building debugserver on iOS
in c++17.

[ADT] Fix const-correctness issues in `zippy`

This defines the iterator tuple based on the storage type of `zippy`,
instead of its type arguments. This way, we can support temporaries that
get passed in and allow for them to be modified during iteration.

Because the iterator types to the tuple storage can have different types
when the storage is and isn't const, this defines a const iterator type
and non-const `begin`/`end` functions. This way we avoid unintentional
casts, e.g., trying to cast `vector<bool>::reference` to
`vector<bool>::const_reference`, which may be unrelated types that are
not convertible.
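
The following is a minimal sketch of the underlying type facts, using only
the standard library and not the actual `zippy` code. The alias `deref_t` is
a hypothetical helper introduced here for illustration: it computes what
dereferencing `begin()` yields for a given storage type, which is why the
iterator tuple has to follow the constness of the storage.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: the type produced by dereferencing begin() on an
// lvalue of type Storage (Storage may itself be const-qualified).
template <typename Storage>
using deref_t = decltype(*std::declval<Storage &>().begin());

// Non-const storage yields a mutable reference, const storage a const one.
static_assert(std::is_same<deref_t<std::vector<int>>, int &>::value,
              "non-const storage gives int&");
static_assert(std::is_same<deref_t<const std::vector<int>>, const int &>::value,
              "const storage gives const int&");

// For vector<bool>, the mutable reference is a proxy class, not bool&, and
// it is a different type from const_reference.
static_assert(!std::is_same<std::vector<bool>::reference, bool &>::value,
              "vector<bool>::reference is a proxy type");
static_assert(!std::is_same<std::vector<bool>::reference,
                            std::vector<bool>::const_reference>::value,
              "reference and const_reference are distinct types");
```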

This patch is a general and free-standing improvement, but my primary use
is in the implementation of a version of `enumerate` that accepts multiple
ranges: D144583.

Reviewed By: dblaikie, zero9178

Differential Revision: https://reviews.llvm.org/D144834

[clang-format] Only add pragma continuation indentation for 'omp' clauses

The patch in D136100 added custom handling for pragmas to assist in
formatting OpenMP clauses correctly. One of these changes added extra
indentation. This is desirable for OpenMP pragmas as they are several
complete tokens that would otherwise be on the exact same line. However,
this is not desired for the other pragmas.

This solution is extremely hacky; I'm not overly familiar with the
`clang-format` codebase. A better solution would probably require
actually parsing these as tokens, but I just wanted to propose a
solution.

Fixes https://github.com/llvm/llvm-project/issues/59473

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D144884

[mlir][sparse] fix performance bug in matmul with a sparse rhs due to suboptimal iteration graphs.

While dense tensors support random accesses, it is critical to visit them in row-major order for better cache locality. However, we previously considered dense inputs and outputs together when computing constraints for building the iteration graph, which could lead to less efficient iteration graphs.

This patch adds a new `SortMask::kIncludeDenseInput` to treat dense inputs and outputs separately when building the iteration graph, thus increasing the chance for us to construct a better iteration graph.

A more fine-grained approach is to treat each input separately.

Note, related to:
https://github.com/llvm/llvm-project/issues/51651

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D144932

[lldb] Remove const qualifier on bool argument passed by value

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[libc][NFC] Refactor internal errno.

This is in preparation for the transition to a solution to make libc tests
hermetic with respect to their use of errno. The implementation of strdup
has been switched over to libc_errno as an example of what the code looks
like in the new way.

See #61037 for more information.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D144928

[OpenMP]Emit captured decls for target data if no devices were specified.

If use_device_ptr/use_device_addr clauses are used on a target data
directive and no device was specified during compilation, only the host
part should be emitted. But it is still required to emit captured decls
for partially mapped data fields.

Differential Revision: https://reviews.llvm.org/D144993

[test] Add missing -### to Driver/config-file3.c

Otherwise clang may invoke ld. If ld is a shell script using `~`, the
command will fail since `HOME` is changed.

[Driver] Revert -mcpu=?/-mtune=? and make -mcpu=help/-mtune=help unnamed

Follow-up to D144914.
-mcpu=help seems fine as a Clang extension not in GCC, because llc supports -mcpu=help.
-mcpu=? is a bad choice as ? may be expanded by the shell.

[AMDGPU] Replace LegacyDA with Uniformity Analysis in AnnotateUniformValues

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D144162

Fix SimplifyAllocConst pattern when we have alloc of negative sizes

This is UB, but we shouldn't crash the compiler either.

Fixes #61056

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D144978

[lldb] Fix {break,watch}point command function stopping behaviour

In order to run a {break,watch}point command, lldb can resort to the
script interpreter to run an arbitrary piece of code or call into a
user-provided function. To do so, we generate a wrapping function,
where we first copy lldb's internal dictionary keys into the
interpreter's global dictionary, then inline the user code, before
resetting the global dictionary to its previous state.

However, {break,watch}point commands can optionally return a value that
tells lldb whether we should stop or not. This feature was only
implemented for breakpoint commands, and since we inlined the user code
directly into the wrapping function, an early return in that code left
the interpreter's global dictionary tainted with the internal dictionary
keys.

This patch fixes that issue while also adding the stopping behaviour to
watchpoint commands.

To do so, this patch refactors the {break,watch}point command creation
method, to let the lldb wrapper function generator know if the user code is
a function call or an arbitrary expression.

If the user input was a function call, the generated wrapper function will
call the user function and save the return value into a variable. If the
user input was an arbitrary expression, the wrapper will inline it into a
nested function, call the nested function and save the return value into
the same variable. After resetting the interpreter's global dictionary to
its previous state, the generated wrapper function will return the
variable containing the return value.

rdar://105461140

Differential Revision: https://reviews.llvm.org/D144688

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

[SLP] Add banner argument to SLP costs debug printer method - NFC.

Removed unnecessary warning workaround.

Differential Revision: https://reviews.llvm.org/D144992

[InstCombine] reassociate subtract-from-constant to add-constant

(C - X) + Y --> (Y - X) + C

Moving the constant operand to an 'add' gives more
flexibility to subsequent reassociation patterns,
and it may be better for codegen on targets that
don't have subtract-from-immediate instructions.
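
A quick way to convince oneself that the two forms agree is to model them
with wrapping 32-bit unsigned arithmetic, as in the sketch below. The names
`sub_from_const_then_add` and `reassociated` are hypothetical and only
illustrate the identity; they are not taken from InstCombine.

```cpp
#include <cstdint>

// Hypothetical model of the original pattern: (C - X) + Y.
constexpr std::uint32_t sub_from_const_then_add(std::uint32_t c,
                                                std::uint32_t x,
                                                std::uint32_t y) {
  return (c - x) + y;
}

// Hypothetical model of the reassociated pattern: (Y - X) + C.
constexpr std::uint32_t reassociated(std::uint32_t c, std::uint32_t x,
                                     std::uint32_t y) {
  return (y - x) + c;
}

// A case without wraparound: (10 - 3) + 4 and (4 - 3) + 10 are both 11.
static_assert(sub_from_const_then_add(10u, 3u, 4u) == 11u, "original form");
static_assert(reassociated(10u, 3u, 4u) == 11u, "reassociated form");

// A case where the intermediate subtraction wraps; the results still agree
// because unsigned arithmetic is modular.
static_assert(sub_from_const_then_add(1u, 5u, 2u) == reassociated(1u, 5u, 2u),
              "forms agree under wraparound");
```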

[IR] fix spelling/formatting; NFC

Even within this file, the usual spelling is 'Opcode',
so make it consistent.

[InstCombine] simplify test for div/rem; NFC

This is too conservative as noted in the TODO comment.

[Sema] Add missing entries to the arrays in GetImplicitConversionName and GetConversionRank.

It appears that ICK_Zero_Queue_Conversion was inserted into the ICK
enum without updating this table. Easy to do since the table size
was set to ICK_Num_Conversion_Kinds.

I've used ICR_Exact_Match to match what was previously done for
ICK_Zero_Event_Conversion the last time someone noticed this had happened.

To prevent this from happening again, I've removed the explicit size
and used a static_assert to check the size against ICK_Num_Conversion_Kinds.
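
The general pattern can be sketched as follows. The enum `ConversionKind`,
the table `kConversionNames` and the function `conversion_name` are
hypothetical stand-ins, not the actual Sema declarations; the point is only
that an array declared without an explicit size, plus a `static_assert`
against the enum's count sentinel, fails to compile when an enumerator is
added without a matching table entry.

```cpp
#include <cstddef>

// Hypothetical enum with a trailing count sentinel.
enum ConversionKind {
  CK_Identity,
  CK_Promotion,
  CK_Conversion,
  CK_NumKinds // must stay last
};

// No explicit size: the array length is deduced from the initializer list.
constexpr const char *const kConversionNames[] = {
    "Identity",
    "Promotion",
    "Conversion",
};

// Compile-time check that the table has one entry per enumerator.
static_assert(sizeof(kConversionNames) / sizeof(kConversionNames[0]) ==
                  static_cast<std::size_t>(CK_NumKinds),
              "kConversionNames must have one entry per ConversionKind");

// Hypothetical lookup helper over the table.
constexpr const char *conversion_name(ConversionKind kind) {
  return kConversionNames[kind];
}
```

With an explicit size equal to the sentinel, a missing initializer would
instead be silently value-initialized to a null pointer, which is the
failure mode the commit message describes.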

Differential Revision: https://reviews.llvm.org/D144990

Revert "Revert "[Modules] Don't check [temp.friend]p9 in ASTContext::isSameEntity""

This fixes the Clang modular CI, but breaks other CIs.

This reverts commit 2ae39902506f38d6368a7dbe3d64109f57ad6f99.