review.tizen.org Git - platform/upstream/llvm.git/log

[gn build] (semi-automatically) port 243da90ea535

[NVPTX] Fix a segfault for bitcasted calls with byval params

`getFunctionParamOptimizedAlign` was being passed a null function
argument when getting the callee of a bitcasted function symbol. This is
because `CallBase::getCalledFunction` does not look through bitcasts.

There is already code to handle this case in
`NVPTXTargetLowering::getArgumentAlignment`, which is now hoisted into
an NVPTX util.

The alignment computation now gracefully handles computing alignment of
virtual functions with a check for null.

[LLDB] Change RegisterValue::GetAsMemoryData to const RegisterInfo&

Most of the paths to this never passed nullptr intentionally. Those
that possibly could have were assuming it was not null elsehwere,
so would have crashed.

I've added asserts in those cases.

At least one case was relying on GetAsMemoryData to return an error
when it was given nullptr. So I've hoisted that error setting code
out into the caller.

Depends on D134963

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D134965

[InstCombine] try harder to cancel out mul/div

((Op1 * X) / Y) / Op1 --> X / Y
https://alive2.llvm.org/ce/z/JYxWjA

InstSimplify handles the more basic mul+div pattern with
shared operand, but we don't seem to have any reassociation
folds to handle cases where the common op is further away.

This is a generalization of 9cff4711ac72 and another
transform derived from issue #58137.

[InstCombine] add tests for div with common mul operand; NFC

[mlir][Bazel] Remove unused dependency.

Make windows resource generation more robust

This is another attempt at https://reviews.llvm.org/D110489.

When build IREE we run into cases where we don't have / need
LLVM_VERSION_* etc set. Compilation fails if it isn't an integer.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D135650

[LLDB] Switch to RegisterInfo& for EmulateInstruction::WriteRegister

WriteRegister and WriteRegisterUnsigned were never pased nullptr,
and only one of them appeared to handle it. Switch to ref to make
the intent clear.

Depends on D134962

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D134963

[mlir][llvm] Introduce a mapValue function in LLVMIR import (nfc).

The revision adds a mapValue function to the Importer, which can be used
in the MLIR builders to provide controlled accesses to the result
mapping of the imported instructions. Additionally, the change allows us
to avoid accessing a private member variable of the Importer class,
which simplifies future refactorings that aim at factoring out a
conversion interface (similar to the MLIR to LLVM translation). The
revision also renames the variables used when emitting the MLIR builders
to prepare the generalization to non-intrinsic instructions. In
particular, it renames callInst to inst and it passes in the instruction
arguments using an llvmOperands array rather than accessing the call
arguments directly.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D135645

[TableGen] Change representation of ArgumentAttributes (NFC)

Instead of a flat list that includes the argument index, use a
nested vector, where each inner vector is the attribute set for
a single argument. This is more obvious and makes followup changes
simpler.

[libclang] CIndex: Visit UsingTypeLoc as TypeRef

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D135555

[libc++] Add the C++17 <memory_resource> header (mono-patch)

This patch is the rebase and squash of three earlier patches.
It supersedes all three of them.

- D47111: experimental monotonic_buffer_resource.
- D47358: experimental pool resources.
- D47360: Copy std::experimental::pmr to std::pmr.

The significant difference between this patch and the-sum-of-those-three
is that this patch does not add `std::experimental::pmr::monotonic_buffer_resource`
and so on. This patch simply adds the C++17 standard facilities, and
leaves the `std::experimental` namespace entirely alone.

Differential Revision: https://reviews.llvm.org/D89057

Reland "[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch"

Differential Revision: https://reviews.llvm.org/D135526

[NFC] Factor out collection of unswitch candidate to a separate function

Just to make the code more structured and easier to understand.

[NFC] Refine API in SimpleLoopUnswitch: add missing const notions

[NFC] Refine API: add missing const notion in hasPartialIVCondition

[flang] Add cpowi function to runtime and use instead of pgmath

This patch adds a cpowi function to the flang runtime, and switches
to using that function instead of pgmath for complex number to
integer power operations.

Differential Revision: https://reviews.llvm.org/D134889

[LLDB] Change pointer to ref in EmulateInstruction::ReadRegister methods

ReadRegister and ReadRegisterAsUnsigned are always passed valid pointers,
so the parameter should be a ref to make the intent clear.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D134962

[NFC] Pre-commit tests for D135434.

pipeliner-preserve-ties.mir demonstrates a current bug in which the
output of the Modulo Software Pipelining pass has left off a tie
between operands in the conditional `t2ADDri` instruction. It should
look like this:

%19:rgpr = t2ADDri %1, 1, 1 /* CC::ne */, $cpsr, $noreg, implicit %1(tied-def 0)

in which the final input operand is tied to the output, because that's
the input that will become the output value if the conditionalized add
instruction does not execute, and hence, must necessarily be whatever
was in the output register beforehand.

In the input to the pipeliner, those `tied-def` specifications are
present and correct. But when the pipeliner clones MachineInstrs, it
loses them.

pipeliner-inlineasm.mir does not demonstrate any bug: the output is
already correct, because of compensation code in the machine pipeliner
that applies only to INLINEASM instructions. But no test previously
exercised that code, so I add one now before making changes in that
area.

[mlir] drop unnecssary transform.with_pdl_patterns from tests, NFC

Many tests wrap the piece of the IR related to the transform dialect
into `transform.with_pdl_patterns` without actually using PDL patterns
inside. Some of these are leftovers from migration to `structured.match`
and some others are cargo cult, both are useless and pollute the tests.

Reviewed By: guraypp

Differential Revision: https://reviews.llvm.org/D135661

Reland "[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC"

Reference: https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html

k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.

m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.

ZB: An address that is held in a general-purpose register. The offset
is zero.

ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.

Note:
The INLINEASM SDNode flags in below tests are updated because the new
introduced enum `Constraint_k` is added before `Constraint_m`.
  llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
  llvm/test/CodeGen/X86/callbr-asm-kill.mir

This patch passes `ninja check-all` on a X86 machine with all official
targets and the LoongArch target enabled.

Differential Revision: https://reviews.llvm.org/D134638

[AMDGPU][MC] Correct image_gather4h

Correct encoding of image_gather4h for GFX9; disable this instruction for SI, CI and VI.

Differential Revision: https://reviews.llvm.org/D135605

[AArch64] Add SEH_Nop opcodes for BTI hints

These are harmless for the unwinder - the unwinder doesn't need to
handle them for being able to unwind correctly.

Only add the opcodes when the branch target is in a SEH prologue;
for jumptables e.g. within a function, we shouldn't add any SEH
opcodes.

Differential Revision: https://reviews.llvm.org/D135277

[llvm-readobj] Decode the new ARM64 SEH info for return address signing

This got documented upstream in
https://github.com/MicrosoftDocs/cpp-docs/pull/4202.

Differential Revision: https://reviews.llvm.org/D135275

Revert "[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch"

This reverts commit 6547565e7bdcd9c3f683ad196b62d08c7061fdf1.

This breaks test: Preprocessor/init-loongarch.c

[CostModel][X86] Add insertelement costs into a known base vector value

We were only testing inserting into undef/poison base vectors

Test coverage for Issue #58261

[LoongArch] Define getSetCCResultType for setting vector setCC type

To avoid trigger "No default SetCC type for vectors!" Assertion.

Differential Revision: https://reviews.llvm.org/D135527

[mlir] use raw function pointer instead of std::function

Accessing the target of std::function apparently requires RTTI, use a
raw pointer instead.

[clang][LoongArch] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for LoongArch

Differential Revision: https://reviews.llvm.org/D135526

[LoongArch] Add codegen support of GlobalTLSAddress lowering

There are static and dynamic TLS address lowering in DAG stage according
to different TLS models.

TLS address will be lowered to pseudo instruction and then expanded by
the `LoongArch Pre-RA pseudo instruction expansion` pass.

Differential Revision: https://reviews.llvm.org/D134713

[mlir] switch the transform loop extension to use types

Add types to the Loop (SCF) extension of the transform dialect.

See https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135587

[mlir] add OperationType to the Transform dialect

Add a new OperationType handle type to the Transform dialect. This
transform type is parameterized by the name of the payload operation it
can point to. It is intended as a constraint on transformations that are
only applicable to a specific kind of payload operations. If a
transformation is applicable to a small set of operation classes, it can
be wrapped into a transform op by using a disjunctive constraint, such
as `Type<Or<[Transform_ConcreteOperation<"foo">.predicate,
Transform_ConcreteOperation<"bar">.predicate]>>` for its operand without
modifying this type. Broader sets of accepted operations should be
modeled as specific types.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135586

[mlir] cleanup transform payload setting

Before the multi-handle relaxation, the transform interpreter did not
actually set payload for a handle that had no further uses as a hacky
mechanism to work around the multi-handle problem. This is no longer
necessary and can be removed to avoid debugging surprises when a handle
has no payload even when its producing op assigned it.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135585

[mlir] switch transform dialect ops to use TransformTypeInterface

Use the recently introduced TransformTypeInterface instead of hardcoding
the PDLOperationType. This will allow the operations to use more
specific transform types to express pre/post-conditions in the future.
It requires the syntax and Python op construction API to be updated.
Dialect extensions will be switched separately.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135584

[mlir] clean up transform dialect definitions, NFC

Refactor the definition of the Transform dialect to move non-trivial
method implementations out of the .td file, and detemplatize functions
when possible while moving their implementations to a .cpp.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135165

[mlir] add types to the transform dialect

Introduce a type system for the transform dialect. A transform IR type
captures the expectations of the transform IR on the payload IR
operations that are being transformed, such as being of a certain kind
or implementing an interface that enables the transformation. This
provides stricter checking and better readability of the transform IR
than using the catch-all "handle" type.

This change implements the basic support for a type system amendable to
dialect extensions and adds a drop-in replacement for the unrestricted
"handle" type. The actual switch of transform dialect ops to that type
will happen in a separate commit.

See https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D135164

[mlir][gpu] Add `subgroup_reduce` operation

Introduce `subgroup_reduce` operation, similar to `all_reduce`, but operating on subgroup scope instead of workgroup.
It is intended as low-level building block for more high level abstractions (e.g for workgroup-wide `all_reduce` ops).
Only introduce version taking reduce operation enum for simplicity sake.

Differential Revision: https://reviews.llvm.org/D135323

[mlir][Linalg] Retire LinalgStrategyTilePass and filter-based pattern.

Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785

Uses of `LinalgTilingPattern::returningMatchAndRewrite` are replaced by a top-level `tileWithLinalgTilingOptions` function that is marked obsolete and serves
as a temporary means to transition away from `LinalgTilingOptions`-based tiling.
LinalgTilingOptions supports too many options that have been orthogonalized with the use of the transform dialect.

Additionally, the revision introduces a `transform.structured.tile_to_scf_for` structured transform operation that is needed to properly tile `tensor.pad`
via the TilingInterface. Uses of `transform.structured.tile` will be deprecated and replaced by this new op.
This will achieve the deprecation of `linalg::tileLinalgOp`.
Context: https://discourse.llvm.org/t/psa-retire-tileandfuselinalgops-method/63850

In the process of transitioning, tests that were performing tile and distribute on tensors are retired: transformations should be orthogonalized better in the future.
In particular, tiling to specific loop types and tileAndDistribute behavior are not available via the transform ops.
The behavior is still available as part of the `tileWithLinalgTilingOptions` method to allow downstream clients to transition without breakages but is meant to be retired soon.

As more tests are ported to the transform dialect, it became necessary to introduce a test-transform-dialect-erase-schedule-pass to discard the transform specification
once applied so that e2e lowering and execution is possible.

Lastly, a number of redundant tests that were testing composition of patterns are retired as they are available with a better mechanism via the transform dialect.

Differential Revision: https://reviews.llvm.org/D135573

[SimplifyLibCalls] Use helper methods to query attributes (NFC)

[Attributes] Return Optional from getAllocSizeArgs() (NFC)

As suggested on D135572, return Optional<> from getAllocSizeArgs()
rather than the peculiar pair(0, 0) sentinel.

The method on Attribute itself does not return Optional, because
the attribute must exist in that case.

[Attributes] Support int attributes with zero value

This regularly comes up as a stumbling stone when adding int
attributes: They currently need to be encoded in a way to avoids
the zero value.

This adds support for zero-value int attributes by a) making the
ctor determine int/enum attribute based on attribute kind, not
whether the value is non-zero and b) switching getRawIntAttr()
to return an Optional, so that it's possible to distinguish a zero
value from non-existence.

Differential Revision: https://reviews.llvm.org/D135572

[AttrBuilder] Remove unused vscale accessors (NFC)

These accessors are not used. Generally, nowadays it is preferable
to perform queries on AttributeSets/Lists, rather than the
AttrBuilder, which is optimized towards attribute construction now.

[AArch64][ARM] Alter most of arm_neon.h to be target-based, not preprocessor based.

Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.

The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependant on target features. From there
the TargetGuards are used in two ways:
- For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
   is added to the definition of the function. Along with the existing
   always_inline intrinsic, this will give a compile time error if the
   function is used in a context where the target feature is not available.
- For intrinsics emitted as macros, the __builtins are emitted into
   arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
   the target feature and gives an error if the builtin is found in a
   function without the required features, similar to arm_sve.h.

The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.

Differential Revision: https://reviews.llvm.org/D132034

[Attributes] Remove AttrBuilder::hasAlignmentAttr() method (NFC)

This was the odd one out, with similar methods not existing for
any other attributes. In the places where it is used, it is best
replaced by AttrBuilder::getAttribute(), which allows us to both
test for presence of the attribute and retrieve its value at the
same time. (To just check for presence, contains() could be used.)

[AsmParser] Remove some redundant checks for align attributes

The verifier already has a more general check that the attribute is
legal for the position it is used in.

[Attributes] Remove special SRet/ByVal attribute handling in C API

Proper construction functions for these have long since been
exposed, and these attributes require a type nowadays, so drop the
old compatibility code.

[MustExec][LICM] Handle latch being part of an inner cycle (PR57780)

The algorithm in allLoopPathsLeadToBlock() does not handle the case
where the loop latch is part of the predecessor set correctly: In
this case, we may take the backedge (escaping to a different loop
iteration) and not execute other latch successors. This can happen
if the latch is part of an inner cycle.

Fixes https://github.com/llvm/llvm-project/issues/57780.

Differential Revision: https://reviews.llvm.org/D134279

[test] Require aarch64-registered-target

[libc] Add POSIX functions posix_spawn_file_actions_*.

Namely, posix_spawn_file_actions_addclose,
posix_spawn_file_actions_adddup2, posix_spawn_file_actions_addopen,
posix_spawn_file_actions_destroy, posix_spawn_file_actions_init have
been added.

Reviewed By: michaelrj, lntue

Differential Revision: https://reviews.llvm.org/D135603

[MC][test] Add Mach-O .addrsig test

[clang-format] Fix mis-attributing preprocessor directives to macros

This solves the issue where a case statement inside a macro greedily
adds preprocessor lines such as #include to the macro even if they
are not a part of the macro to begin with.

Fixes #58214.

Differential Revision: https://reviews.llvm.org/D135422

[flang] Support GNU extensions IARGC and GETARG in runtime

The GNU extension intrinsic IARGC is equivalent to
COMMAND_ARGUMENT_COUNT, and the GETARG is similar to
GET_COMMAND_ARGUMENT, but with less arguments. Reuse the runtime of
COMMAND_ARGUMENT_COUNT and GET_COMMAND_ARGUMENT for IARGC and GETARG.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D133558

[DeviceRTL] CMake fix using target-level dependency

File-level dependency should not be used on files generated during the build. The next command may execute before the generating command finishes writing the file. Use add_custom_target and use target-level dependency.

Differential Revision: https://reviews.llvm.org/D135630

[flang][NFC] Fix typos in FIROps.td

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D135570

[flang][NFC] Fix fir::ConvertOp description

According to the support of fir::ConvertOp conversion in CodeGen.cpp,
it also supports integer to pointer conversion and pointer to integer
conversion. The entity can be array, and the conversion type can be
pointer to array.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D135571

[clang] makes `__is_destructible` KEYALL instead of KEYMS

This makes it possible to be used in all modes, instead of just when
`-fms-extensions` is enabled. Also moves the `-fms-extensions`-exclusive
traits into their own file so we can check the others aren't dependent
on this flag.

This is information that the compiler already has, and should be exposed
so that the library doesn't need to reimplement the exact same
functionality.

This was originally a part of D116280.

Depends on D135177.

Differential Revision: https://reviews.llvm.org/D135339

[clang] adds `__is_scoped_enum`, `__is_nullptr`, and `__is_referenceable`

... as builtins.

This is information that the compiler already has, and should be exposed
so that the library doesn't need to reimplement the exact same
functionality.

This was originally a part of D116280.

Depends on D135175.

Differential Revision: https://reviews.llvm.org/D135177

[clang] adds `__is_bounded_array` and `__is_unbounded_array` as builtins

This is information that the compiler already has, and should be exposed
so that the library doesn't need to reimplement the exact same
functionality.

This was originally a part of D116280.

Differential Revision: https://reviews.llvm.org/D135175

[mlir][arith] Fix another file header (NFC)

[flang] Add type-specific runtime entries for Minloc/Maxloc.

We used to have a big switch statement over the type categories and kinds
inside Minloc/Maxloc. After D133051 the switch grew bigger, and this
changed inlining decisions made by GCC (the build compiler). Some of the
simple methods stopped being inlined, and this caused slight performance
regression in Polyhedron/gas_dyn2. This change adds separate entries
for real/integer data types to let them be optimized separately.

Differential Revision: https://reviews.llvm.org/D135610

[mlir][arith] Fix header spacing (NFC)

[libc] fix header list for x86_64

Some headers hadn't been added, this fixes that and improves the
ordering.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D135629

[libc] reset errno in isatty tests

Errno wasn't getting reset between tests, causing cascading failures.

Differential Revision: https://reviews.llvm.org/D135627

[mlir][tosa] Fix tosa::Select to linalg::generic indexingMaps bug

The tosa to linalg generic ops indexingMaps rank use is wrong.
Find this bug in gpt2 pytorch model lowering to tosa.
issue link is here https://github.com/llvm/llvm-project/issues/58154

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D135343

[instcombine] (extelt (inselt Vec, Value, Index), Index) -> Value

When Index is variable but still trivially known to be equal we can use Value
from before the insertion, possibly eliminating the vector.

Reverts a functional change from:
Author: Philip Reames <listmail@philipreames.com>
Date: Wed Dec 8 12:21:10 2021 -0800

[instcombine] A couple style tweaks to visitExtractElementInst [nfc]

Thanks to Michele Scandale for identifying the bug

Differential Revision: https://reviews.llvm.org/D135625

[libc] handle case where /dev/tty doesn't exist

In the isatty test it was assumed that /dev/tty existed, but that's not
always the case. Now we check for that.

Differential Revision: https://reviews.llvm.org/D135626

Revert "[gn build] Don't set LLVM_UNREACHABLE_OPTIMIZE when llvm_enable_assertions"

This reverts commit 0f19c603423e28ab663c1fdff2048c555abe5f6d.

This didn't actually do anything. llvm_unreachable() under `#ifndef NDEBUG` is always supposed to report an error regardless of LLVM_UNREACHABLE_OPTIMIZE. I can't reproduce the issue I was originally seeing with this reverted, not sure what was happening back then, manually verified by messing around with various binaries/configurations.

[libc] add isatty

The isatty function uses the side effects of an ioctl call to determine
if a specific file descriptor is a terminal. I chose TIOCGETD (get line
discipline of terminal) because it didn't require any new structs.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D135618

[mlir][sparse] fixed memory leak on sparse tensors

This was introduced by https://reviews.llvm.org/D134933
This change also cleans up a dump&release method which
had become a misnomer after prior bufferization improvement.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D135623

[NFCI] More TypeCategoryImpl refactoring.

The main aim of this patch is to delete the remaining instances of code
reaching into the internals of `TypeCategoryImpl`. I made the following
changes:

- Add some more methods to `TieredFormatterContainer` and
  `TypeCategoryImpl` to expose functionality that is implemented in
  `FormattersContainer`.

- Add new overloads of `TypeCategoryImpl::AddTypeXXX` to make it easier
  to add formatters to categories without reaching into the internal
  `FormattersContainer` objects.

- Remove the `GetTypeXXXContainer` and `GetRegexTypeXXXContainer`
  accessors from `TypeCategoryImpl` and update all call sites to use the
  new methods instead.

Differential Revision: https://reviews.llvm.org/D135399

[SPIRV] Fix call lowering of "anonymous" functions

The patch fixes lowering of anonymous functions, removes file/linkage
info for builtin call demangling, and adds relevant test demonstrating
a fixed problem.

Differential Revision: https://reviews.llvm.org/D135390

Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant."

This reverts commit d4facda414b6b9b8b1a34bc7e6b7c15172775318.

This has been reported to cause failures. Reverting while I investigate.

[libc] add sysconf with pagesize

The sysconf function has many options, this patch adds the basic funtion
and the pagesize option. More options will be added in future patches.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D135409

[mlir][Vector] Introduce 'vector.mask' operation and MaskableOpInterface

This patch introduces the `vector.mask` operation and the MaskableOpInterface
as described in https://discourse.llvm.org/t/rfc-vector-masking-representation-in-mlir/64964.
The `vector.mask` operation is used to predicate the execution of operations
implementing the MaskableOpInterface. This interface will be implemented by maskable
operations and provides information about its masking constraints and semantics.

For now, only vector transfer and reduction ops implement the MaskableOpInterface
for illustration and testing purposes.

Reviewed By: nicolasvasilache, rriddle

Differential Revision: https://reviews.llvm.org/D134939

[ARM] Add errors for MVE exclusive registers.

These instructions already had errors for operands that could not share
the same register:
VCMUL, VMULL, VQDMULL.
This extends that to a few others:
VREV64, VQDMULLqr, VCADD and VHCADD.
Only the i32 types require the error.

Differential Revision: https://reviews.llvm.org/D135560

[LinkerWrapper] Fix failing linker wrapper save temps test

Summary:
This test started failing locally due to a misspelling of `-save-temps`
for the linker wrapper and the `ls` command not having the glob
arguments put in a string. This patch should fix it.

[NFC][3/n] Remove enable-new-pm from Inline tests

This change is updating remaining Inline tests
by removing -enable-new-pm=0 flag and adjusting CHECKs
where it is required.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D135497

[AMDGPU] Fix True16 patterns for cmp on GFX11

These patterns should have a True16 version and a non-true16 version.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135609

[mlir][linalg] Remove redundant check on linalgOps to fix windows builder

One of the assertion is causing signed/unsigned mismatch. However this
assertion seems redundant and is no longer used.

Reviewed By: mravishankar, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135612

[lld-macho] Flip ZERO_AR_DATE default

Previously unless ZERO_AR_DATE was set to any value, ld64.lld would
embed non-hermetic timestamps into the final binary. This flips that
default to zero those values unless ZERO_AR_DATE is set explicitly to 0.
As far as I know there isn't a downside to this, except that it differs
from ld64.

Differential Revision: https://reviews.llvm.org/D135529

[mlir] Fix a warning

This patch fixes:

  mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp:1051:40: error: comparison
  of integers of different signs: 'int64_t' (aka 'long') and
  'std::size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]

Reword diagnostics for style; NFC

This removes the capital letter at the start of a few diagnostics and
reformats the nearby diagnostics to match the local style.

[Clang] reject bit-fields as instruction operands in Microsoft style inline asm blocks.

MSVC allows bit-fields to be specified as instruction operands in inline asm
blocks. Such references are resolved to the address of the allocation unit that
the named bitfield is a part of. The result is that reads and writes of such
operands will read or mutate nearby bit-fields stored in the same allocation
unit. This is a surprising behavior for which MSVC issues warning C4401,
"'<identifier>': member is bit field". Intel's icc compiler also allows such
bit-field references, but does not issue a diagnostic.

Prior to this change, Clang fails the following assertion when a bit-field is
referenced in such instructions:
clang/lib/CodeGen/CGValue.h:338: llvm::Value* clang::CodeGen::LValue::getPointer(clang::CodeGen::CodeGenFunction&) const: Assertion `isSimple()' failed.
In non-assert enabled builds, Clang's behavior appears to match the behavior
of the MSVC and icc compilers, though it is unknown if that is by design or
happenstance.

Following this change, attempts to use a bit-field as an instruction operand
in Microsoft style asm blocks is diagnosed as an error due to the risk of
unintentionally reading or writing nearby bit-fields.

Fixes https://github.com/llvm/llvm-project/issues/57791

Reviewed By: erichkeane, aaron.ballman

Differential Revision: https://reviews.llvm.org/D135500

[clang] Mention -Werror changes revived for Clang 16

-Wimplicit-function-declaration and -Wimplicit-int become errors
by default in Clang 16 (originally in 15, but we reverted it in 15.0.1).

Mention it in the release notes like we did originally for Clang 15.

See https://discourse.llvm.org/t/configure-script-breakage-with-the-new-werror-implicit-function-declaration/65213 for more context.

Signed-off-by: Sam James <sam@gentoo.org>
Differential Revision: https://reviews.llvm.org/D135545

[SCEV] Verify block disposition cache.

This extends the existing SCEV verification to catch cache invalidation
issues as in #57837.

The validation logic is similar to the recently added loop disposition
cache validation in bb68b2402daa9.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D134531

[ConstraintElimination] Regenerate check lines for test.

Thanks @nikic for spotting that!

[VectorCombine] convert scalar fneg with insert/extract to vector fneg

insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index --> shuffle DestVec, (fneg SrcVec), Mask

This is a specialized form of what could be a more general fold for a binop.
It's also possible that fneg is overlooked by SLP in this kind of
insert/extract pattern since it's a unary op.

This shows up in the motivating example from #issue 58139, but it won't solve
it (that probably requires some x86-specific backend changes). There are also
some small enhancements (see TODO comments) that can be done as follow-up
patches.

Differential Revision: https://reviews.llvm.org/D135278

[VectorCombine] add test with out-of-bounds insert/extract index; NFC

D135278

Update implementation status of P2468R2

Differential Revision: https://reviews.llvm.org/D135608

[AMDGPU][NFC] Update DW_OP_LLVM_overlay documentation

Update DWARF Extensions For Heterogeneous Debugging proposal for the
DW_OP_LLVM_overlay operation:

1. Add an example.
2. Correct typo in definition of rbss.
3. Correct definition to specify both operands of the
DW_OP_bit_piece operations.

Reviewed By: zoran.zaric

Differential Revision: https://reviews.llvm.org/D135394

[mlir][linalg] Remove unused payload related OutOpOperand

Some higher level operations such as torch.max generates linalg generic
that returns both the index and the value of the max operation. However
sometimes not all information is being used. This however blocks
vectorization for certain cases which causes performance degradation.
This patch aims to fix this issue.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D135388

Fix a typo in the Release Notes; NFC

Fix unused variable warning from D134529

Revert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally"

This reverts commit 4fbe33593c8132fdc48647c06f4d1455bfff1c88. It causes linking errors, with details provided internally. (Hopefully the author/reviewers will be able to upstream the internal repro).

[libc] Add implementation of pthread_atfork.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D135432

Fix a typo in the docs; NFC

sence -> sense

Also re-flowed to the usual 80-col limit and added a comma.

[mlir][sparse] Removing DLL attributes from ExecutionEngine/SparseTensor/Enums.h

This differential attempts to resolve certain issues on Windows (e.g., https://reviews.llvm.org/D134933#3843372 and https://reviews.llvm.org/D133462#3844195).

Reviewed By: stella.stamenova, aganea

Differential Revision: https://reviews.llvm.org/D135502

[clang-format][docs][NFC] Fix invalid syntax in ShortLambdaStyle examples.

Repair a confusing standards reference; NFC

There is no 6.9 in C++11, the quote actually lives in
[intro.multithread] for that revision. However, the words moved in
C++17 to [intro.progress] so I added that information as well.

[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant.

If the divisor is even, we can first shift the dividend and divisor
right by the number of trailing zeros. Now the divisor is odd and we
can do the original algorithm to calculate a remainder. Then we shift
that remainder left by the number of trailing zeros and add the bits
that were shifted out of the dividend.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135541