review.tizen.org Git - platform/upstream/llvm.git/log

[clang][driver] Set the input type to Fortran when reading from stdin

This patch makes sure that for the following invocation of the new Flang
driver, clangDriver sets the input type to Fortran:
```
flang-new -E -
```
This change does not affect `clang`, i.e. for the following invocation
the input type is set to C:
```
clang -E -
```

This change leverages the fact that for `flang-new` the driver is in
Flang mode.

Differential Revision: https://reviews.llvm.org/D96777

[clang] Remove a superfluous semicolon, silencing GCC warnings. NFC.

[clang][cli] NFC: Remove ArgList infrastructure for recording queries

This patch removes the infrastructure for recording queries in `ArgList`, partially reverting D94472.

The infrastructure was used during command line round-trip to determine which arguments a certain subset of `CompilerInvocation` should generate.

Since D96280, the command line arguments are being generated all at once, making this code no longer necessary.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96325

[clang][cli] NFC: Remove intermediate command line parsing functions

Patch D96280 moved command line round-tripping from each parsing function into a single `CreateFromArgs` function.

This patch cleans up the individual parsing functions, essentially merging `ParseXxxImpl` with `ParseXxx`, as the distinction is not necessary anymore.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96323

[lldb][NFC] Document ClangASTImporter

[RISCV] Support fixed-length vector FP_ROUND & FP_EXTEND

This patch extends the support for vector FP_ROUND and FP_EXTEND by
including support for fixed-length vector types. Since fixed-length
vectors use "VL" nodes and scalable vectors can use the standard nodes,
there is slightly more to do in the fixed-length case. A helper function
was introduced to try and reduce the divergent paths. It is expected
that this function will similarly come in useful for lowering the
int-to-fp and fp-to-int operations for fixed-length vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97301

Pass GPU events instead of streams across async regions.

Lower !gpu.async.tokens returned from async.execute regions to events instead of streams.

Make !gpu.async.token returned from !async.execute single-use.
This allows creating one event per use and destroying them without leaking or ref-counting.
Technically we only need this for stream/event-based lowering. I kept the code separate
from the rest of the gpu-async-region pass so that we can make this optional or move
to a separate pass as needed.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D96965

[RISCV] Support fixed-length vector truncates

This patch extends support for our custom-lowering of scalable-vector
truncates to include those of fixed-length vectors. It does this by
co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and
VL operands. This avoids unnecessary duplication of patterns and
inflation of the ISel table.

Some truncates go through CONCAT_VECTORS which currently isn't
efficiently handled, as it goes through the stack. This can be improved
upon in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97202

[RISCV] Support fixed-length vector sign/zero extension

This patch adds support for the custom lowering of sign- and zero-extension
of fixed-length vector types. It does so through custom nodes. Since the
source and destination types are (necessarily) of different sizes, it is
possible that the source type is legal whilst the larger destination
type isn't. In this case the legalization makes heavy use of
EXTRACT_SUBVECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97194

[RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering

This patch unifies the two disparate paths for lowering
EXTRACT_SUBVECTOR operations under one roof. Consequently, with this
patch it is possible to support any fixed-length subvector extraction,
not just "cast-like" ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97192

[NFC] Fix build failure after 83d134c3c4222e8b8d3d90c099f749a3b3abc8e0

[X86] Regenerate sdiv_fix.ll tests. NFCI.

[NARY-REASSOCIATE] Support reassociation of min/max

Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available.
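
As a rough illustration (not the pass's actual implementation), the rewrite relies on min being associative and commutative, so the operands can be regrouped to reuse a value that is already available. The sketch below checks the identity over a small range of integers; all function names are hypothetical.

```python
from itertools import product

def original(a, b, c):
    # min(min(a, b), c): the expression as it appears before the transform.
    return min(min(a, b), c)

def reassociated(a, b, c, min_ac=None):
    # min(min(a, c), b): reuses min(a, c) when the caller already has it.
    if min_ac is None:
        min_ac = min(a, c)
    return min(min_ac, b)

def identity_holds(values):
    # Compare both forms over every triple drawn from `values`.
    return all(original(a, b, c) == reassociated(a, b, c)
               for a, b, c in product(values, repeat=3))
```

The same regrouping argument applies to max, since it is also associative and commutative.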

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D88287

[X86][SSE] Move unaryshuffle(xor(x,-1)) -> xor(unaryshuffle(x),-1) fold into helper. NFCI.

We should be able to extend this "canonicalizeShuffleWithBinOps" to handle more generic binop cases where either/both operands can be cheaply shuffled.

Support standalone build of clang-tidy unittest

Apply the same pattern as the one used in clangd/unittests/CMakeLists.txt

Differential Revision: https://reviews.llvm.org/D96788

[lldb][NFC] Remove some obsolete comments in ClangASTImporter.cpp

The first two comments are incomplete and reference obsolete code. The
last one is just commented out code (that also doesn't look correct).

[lldb] Let ClangASTImporter assert that the target AST has an external source

This prevents people from accidentally using this code outside the
intended setup.

Prefer /usr/bin/env xxx over /usr/bin/xxx where xxx = perl, python, awk

Allow users to use a non-system version of perl, python and awk, which is useful
in certain package managers.

Reviewed By: JDevlieghere, MaskRay

Differential Revision: https://reviews.llvm.org/D95119

[CodeGen] Canonicalise adds/subs of i1 vectors using XOR

When calling SelectionDAG::getNode() to create an ADD or SUB
of two vectors with i1 element types we can canonicalise this
to use XOR instead, where 1+1 is treated as wrapping around
to 0 and 0-1 wraps to 1.
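
A small sanity check of the arithmetic behind this canonicalisation, written as a sketch with hypothetical helper names: on one-bit values, wrapping addition and wrapping subtraction both give the same result as XOR.

```python
def add_i1(a, b):
    # One-bit addition wraps modulo 2, so 1 + 1 becomes 0.
    return (a + b) & 1

def sub_i1(a, b):
    # One-bit subtraction wraps modulo 2, so 0 - 1 becomes 1.
    return (a - b) & 1

def xor_i1(a, b):
    return a ^ b

def all_cases_match():
    # Enumerate the four possible operand pairs.
    bits = (0, 1)
    return all(add_i1(a, b) == xor_i1(a, b) == sub_i1(a, b)
               for a in bits for b in bits)
```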

I've added the following tests for SVE targets:

CodeGen/AArch64/sve-pred-arith.ll

and modified some X86 tests to reflect the much simpler codegen
required.

Differential Revision: https://reviews.llvm.org/D97276

AArch64: relax address-space assertion in FastISel.

Some people are using alternative address spaces to track GC data, but
otherwise they behave exactly the same. This is the only place in the backend
we even try to care about it so it's really not achieving anything.

[clang][cli] Round-trip the whole CompilerInvocation

Finally, this patch moves from round-tripping one `CompilerInvocation` at a time to round-tripping the invocation as a whole.

This patch includes only the code required to make round-tripping the whole invocation work. More cleanups will be done in a follow-up patch.

Depends on D96847, D97041 & D97042.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96280

[clang][cli] Store additional optimization remarks info

After a revision of D96274 changed `DiagnosticOptions` to not store all remark arguments **as-written**, it is no longer possible to reconstruct the arguments accurately from the class.

This is caused by the fact that for `-Rpass=regexp` and friends, `DiagnosticOptions` stores only the group name `pass` and not `regexp`. This is the same representation used for the plain `-Rpass` argument.

Note that each argument must be generated exactly once in `CompilerInvocation::generateCC1CommandLine`, otherwise each subsequent call would produce more arguments than the previous one. Currently this works out because of the way `RoundTrip` splits the responsibilities for certain arguments based on what arguments were queried during parsing. However, this invariant breaks when we move to single round-trip for the whole `CompilerInvocation`.

This patch ensures that for one `-Rpass=regexp` argument, we don't generate two arguments (`-Rpass` from `DiagnosticOptions` and `-Rpass=regexp` from `CodeGenOptions`) by shifting the responsibility for handling both cases to `CodeGenOptions`. To distinguish between the cases correctly, additional information is stored in `CodeGenOptions`.

The `CodeGenOptions` parser of `-Rpass[=regexp]` arguments also looks at `-Rno-pass` and `-R[no-]everything`, which is necessary for generating the correct argument regardless of the ordering of `CodeGenOptions`/`DiagnosticOptions` parsing/generation.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96847

[AArch64] Add abs intrinsic costs

This patch adds cost-modelling for the abs vector intrinsic.

Change-Id: I89007971bfb15f5b4a02a2eadfd43018e9a73976

[clangd] NFC, remove an extra "class" keyword.

[clang][cli] Remove marshalling from Opt{In,Out}FFlag

We can now express all marshalling semantics in `Opt{In,Out}FFlag` via `BoolFOption`.

This patch moves remaining `Opt{In,Out}FFlag` instances using marshalling to `BoolFOption` and removes marshalling capabilities from `Opt{In,Out}FFlag` entirely.

This simplifies the decisions developers have to make when creating new boolean options:
* For simple cc1 flag pairs, use `Bool{,F,G}Option`.
* For cc1 flag pairs that require complex marshalling logic, use `Opt{In,Out}FFlag` and implement marshalling manually.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D97370

[clang][cli] Add MarshallingInfoEnum multiclass

This patch introduces a tablegen multiclass called `MarshallingInfoEnum`. It has the same semantics as `MarshallingInfoString` had in combination with `AutoNormalizeEnum`, but it's easier to use and follows the convention used for other `MarshallingInfoXxx` multiclasses.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D97375

[mlir][nfc] Fix typo in documentation comment

[mlir] Fix emitting attribute documentation

This fixes the documentation emitted for type parameters. Also adds a
missing empty line, rendered as a line break in Markdown.

Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97267

[clang][RecoveryAST] Add design doc to clang internal manual.

Hopefully it would be useful for new developers.

Differential Revision: https://reviews.llvm.org/D96944

[debugserver] Fix logic to extract app bundle from file path

Fix the logic to find the app bundle in a path by correctly accounting
for paths containing multiple occurrences of `.app`. The new logic will
correctly extract `com.app.Foo.app` from `com.app.Foo.app/com.app.Foo`.
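
One way to get this behaviour, sketched under assumptions: the function name is hypothetical, and the choice to return the first matching path component is an assumption, not necessarily what debugserver does. The point is to match `.app` only at the end of a whole path component rather than at its first occurrence in the string.

```python
def extract_app_bundle(path):
    # Walk the path one component at a time and return the prefix ending at
    # the first component whose name ends in ".app". Matching whole
    # components avoids stopping at a ".app" in the middle of a name.
    parts = path.split("/")
    for i, part in enumerate(parts):
        if part.endswith(".app"):
            return "/".join(parts[: i + 1])
    # No bundle directory found in this path.
    return None
```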

rdar://74666208

Differential revision: https://reviews.llvm.org/D97441

OpenMP: Fix object clobbering issue when using save-temps

There are two preconditions to reproduce the issue,
1. Use -save-temps option
2. Provide the -o option with name equal to the input file name
without the file extension. For example: clang a.c -o a

With the -o specified, the AssembleJobAction after OffloadWrapperJobAction
will produce the object file with same name as host code object file.
Due to this clash, the OffloadWrapperAction overwrites the initial host
object file, which results in an lld error. This also fixes the `multiple definition of __dummy.omp_offloading.entry'` issue in D96769.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97273

[RISCV] Reuse existing SDLoc and XLenVT in the switch in RISCVISelDAGToDAG::Select. NFC

A SDLoc and XLenVT were already created above the switch.

[docs][JITLink] Reintroduce JITLink design/API doc with fixes and improvements.

This document was originally introduced in ab4648504b2, and was reverted in
912bc4980e9 while I investigated a number of sphinx bot errors. This commit
reintroduces the document with fixes for those errors, as well as some
improvements to the wording and formatting.

[NARY][NFC] New tests for upcoming changes.

[NFC][AIX] Rename aix-csr-vector.ll to aix-csr-vector-extabi.ll

[Coroutine] Check indirect uses of alloca when checking lifetime info

In the existing logic, we look at the lifetime.start marker of each alloca and check all uses of the alloca to see whether any pair of a lifetime marker and a use of the alloca crosses a suspension point.
This approach is unfortunately incorrect. A use of an alloca does not need to be a direct use; it can be an indirect use through an alias.
Checking only direct uses can miss cases where indirect uses cross a suspension point.
This can be demonstrated in the newly added test case 007.
In the test case, both x and y are only directly used prior to suspend, but they are captured into an alias, merged through a PHINode (so they couldn't be materialized), and used after CoroSuspend.
If we only check whether the lifetime starts cross suspension points with direct uses, we will put the allocas on the stack, and then capture their addresses in the frame.

Instead of fixing it in D96441 and D96566, this patch takes a different approach which I think is better.
We still check the lifetime info in the same way as before, but with two differences:
1. The collection of lifetime.start is moved into AllocaUseVisitor to make the logic more concentrated.
2. When looking at lifetime.start and use pairs, we check not only the direct uses as before but all uses collected by AllocaUseVisitor, which include all indirect uses through aliases. This makes the analysis more accurate without throwing away the lifetime optimization.

Differential Revision: https://reviews.llvm.org/D96922

[docs] Add a release note for the removing of -Wreturn-std-move-in-c++11

`-Wreturn-std-move-in-c++11` has been removed in fbee4a0c79cc4ee87c34e51342742a5bc6fcf872.

Reviewed By: aaronpuchert, amccarth

Differential Revision: https://reviews.llvm.org/D97364

[flang][fir][NFC] Remove dead code.

This patch removes OpaqueAttr as it is no longer used.

Differential Revision: https://reviews.llvm.org/D97424

[flang][fir][NFC] Move remaining types to TableGen type definition

Move the remaining FIR types to TableGen type definitions. This follows the suggestion in D96422.

Reviewed By: schweitz, jeanPerier, rriddle

Differential Revision: https://reviews.llvm.org/D96987

[ThinLTO][NewPM] Clean up dead code under -O0

We're running into undefined references using ThinLTO with -O0 on
Windows/Chrome. This fixes that.

This matches the legacy PM.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D97414

[X86] Support amx-bf16 intrinsic.

Adding support for intrinsics of AMX-BF16.
This patch also fixes a bug where AMX-INT8 instructions were selected with the
wrong predicate.

Differential Revision: https://reviews.llvm.org/D97358

[lld-macho] add code signature for native arm64 macOS

Differential Revision: https://reviews.llvm.org/D96164

[test] Improve SanitizerCoverage tests on !associated and comdat

update AMDGPU _Float16 support in clang doc

Reviewed by: Matt Arsenault

Differential Revision: https://reviews.llvm.org/D97386

[llvm] Check availability for os_signpost

Add availability checks to the os_signpost code so this can be used with
an older deployment target.

Differential revision: https://reviews.llvm.org/D97410

Improve attribute documentation for nodebug on typedefs

(followup to 8472fa6c54c9d044adcd147f6826bccebd730f30 )

[RISCV] Teach VSETVLI inserter to use VSETIVLI when possible.

We always create the VL operand using a register, but if we can
determine that it came from an ADDI X0, imm with a sufficiently
small immediate, we can use VSETIVLI.
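
The decision can be sketched as follows. This is an illustration with hypothetical names, not the inserter's code; it assumes only that vsetivli encodes the AVL as a 5-bit unsigned immediate.

```python
# vsetivli carries the AVL as a 5-bit unsigned immediate (0..31).
MAX_UIMM5 = 31

def select_vsetvl_form(avl_const):
    # `avl_const` is the constant the VL register was materialised from
    # (ADDI X0, imm), or None when the VL is not a known constant.
    if avl_const is not None and 0 <= avl_const <= MAX_UIMM5:
        return "vsetivli"
    # Otherwise keep the register form.
    return "vsetvli"
```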

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97332

[RISCV] Use a ComplexPattern for zexti32 to match sexti32.

We just started using a ComplexPattern for sexti32. This updates
zexti32 to match.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D97231

[CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc

For -fgpu-rdc mode, static device vars in different TUs may have the same name.
To support accessing file-scope static device variables in host code, we need to give them
a distinct name and external linkage. This can be done by postfixing each static device variable with
a distinct CUID (Compilation Unit ID) hash.

Since the static device variables have different names across compilation units, now we let
them have external linkage so that they can be looked up by the runtime.

Reviewed by: Artem Belevich, and Jon Chesterfield

Differential Revision: https://reviews.llvm.org/D85223

Allow !shape.size type operands in "shape.from_extents" op.

This expands the op to support error propagation and also makes it symmetric with "shape.get_extent" op.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D97261

[profile] Fix buffer overrun when parsing %c in filename string

Fix a buffer overrun that can occur when parsing '%c' at the end of a
filename pattern string.

rdar://74571261

Reviewed By: kastiglione

Differential Revision: https://reviews.llvm.org/D97239

Revert "[builtins] Define fmax and scalbn inline"

This reverts commit 341889ee9e03e73b313263c516b3d1fd33d4c4ba.

The new unit tests fail on sanitizer-windows.

Reland "[Driver][Windows] Support per-target runtimes dir layout for profile instr generate"

This relands commit rG7f9d5d6e444c which was reverted in rGab5b00ada9e7

Differential Revision: https://reviews.llvm.org/D96638

[builtins] Define fmax and scalbn inline

Define inline versions of __compiler_rt_fmax* and __compiler_rt_scalbn*
rather than depend on the versions in libm. As with
__compiler_rt_logbn*, these functions are only defined for single,
double, and quad precision (binary128).
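
For readers unfamiliar with what an inline fmax has to do, here is a minimal sketch of the C `fmax` semantics without calling into a math library. The function name is hypothetical and this is not the compiler-rt source: when exactly one argument is NaN, the other argument is returned.

```python
import math

def fmax_inline(x, y):
    # If x is NaN, or y is strictly larger, the answer is y; otherwise x.
    # A NaN in y makes `x < y` false, so x is returned in that case.
    return y if (math.isnan(x) or x < y) else x
```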

Fixes PR32279 for targets using only these FP formats (e.g. Android
on arm/arm64/x86/x86_64).

For single and double precision, on AArch64, use __builtin_fmax[f]
instead of the new inline function, because the builtin expands to the
AArch64 fmaxnm instruction.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D91841

[MC][ARM] make Thumb function also if type attribute is set

Make sure to set the bottom bit of the symbol even when the type
attribute of a label is set after the label.

GNU as sets the thumb state according to the thumb state of the label.
If a .type directive is placed after the label, set the symbol's thumb
state according to the thumb state of the .type directive. This matches
GNU as in most cases.
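
The "bottom bit" can be pictured with the sketch below. The helper and its parameters are hypothetical and greatly simplify what the assembler tracks: a symbol that is a function in Thumb state gets bit 0 of its value set, wherever the `.type` directive appeared relative to the label.

```python
def symbol_value(address, is_function, is_thumb):
    # Thumb function symbols carry bit 0 set so that interworking branches
    # enter Thumb state; all other symbols keep their plain address.
    if is_function and is_thumb:
        return address | 1
    return address
```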

From: Stefan Agner <stefan@agner.ch>

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=44860
https://github.com/ClangBuiltLinux/linux/issues/866

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D74927

Revert "[Profile] Include a few asserts in coverage mapping test"

This reverts commit 80f329bcd0281c11062879025761d0657167fe8b.

[InstCombine] fold fdiv with powi divisor (PR49147)

This extends b40fde062c for the especially non-standard
powi pattern. We want to avoid being completely wrong
on the negation-of-int-min corner case, so I'm adding
an extra FMF check for 'ninf' assuming that gives us
the flexibility to handle that possibility.
https://llvm.org/PR49147
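
The corner case can be seen in a short sketch. It assumes the fold rewrites x / powi(y, n) as x * powi(y, -n) with a 32-bit exponent; the helper names are hypothetical. Negating the minimum 32-bit integer wraps back to itself, so the negated exponent is not the true negation.

```python
INT32_MIN = -2**31

def neg_i32(n):
    # Two's-complement negation in 32 bits, as an i32 `sub 0, n` behaves.
    r = (-n) & 0xFFFFFFFF
    return r - 2**32 if r >= 2**31 else r

def negation_is_exact(n):
    # The rewrite needs -n to be the mathematically exact negation of n.
    return neg_i32(n) == -n
```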

[InstCombine] add helper for x/pow(); NFC

We at least want to add powi to this list, so
split it off into a switch to reduce code duplication.

[Profile] Include a few asserts in coverage mapping test

These should catch any accidental use of the compilation directory.

Differential Revision: https://reviews.llvm.org/D97402

Transforms: Clone distinct nodes in metadata mapper unless RF_ReuseAndMutateDistinctMDs

This is a follow up to 22a52dfddcefad4f275eb8ad1cc0e200074c2d8a and a
revert of df763188c9a1ecb1e7e5c4d4ea53a99fbb755903.

With this change, we only skip cloning distinct nodes in
MDNodeMapper::mapDistinct if RF_ReuseAndMutateDistinctMDs, dropping the
no-longer-needed local helper `cloneOrBuildODR()`. Skipping cloning in
other cases is unsound and breaks CloneModule, which is why the textual
IR for PR48841 didn't pass previously. This commit adds the test as:
Transforms/ThinLTOBitcodeWriter/cfi-debug-info-cloned-type-references-global-value.ll

Cloning less often exposed a hole in subprogram cloning in
CloneFunctionInto thanks to df763188c9a1ecb1e7e5c4d4ea53a99fbb755903's
test ThinLTO/X86/Inputs/dicompositetype-unique-alias.ll. If a function
has a subprogram attachment whose scope is a DICompositeType that
shouldn't be cloned, but it has no internal debug info pointing at that
type, that composite type was being cloned. This commit plugs that hole,
calling DebugInfoFinder::processSubprogram from CloneFunctionInto.

As hinted at in 22a52dfddcefad4f275eb8ad1cc0e200074c2d8a's commit
message, I think we need to formalize ownership of metadata a bit more
so that ValueMapper/CloneFunctionInto (and similar functions) can deal
with cloning (or not) metadata in a more generic, less fragile way.

This fixes PR48841.

Differential Revision: https://reviews.llvm.org/D96734

IR: Rename Metadata::ImplicitCode to SubclassData1, NFC

Metadata::ImplicitCode is a bit shaved off of Metadata::Storage,
currently only in use by the subclass DILocation. However, the bit isn't
reserved for that purpose. Rename it `SubclassData1` to make it clear
that it has nothing to do with Metadata itself (and other subclasses are
free to use it).

As a drive-by, remove an old TODO about exposing bits to subclasses
(looks like that has mostly been done).

No functionality change here.

Differential Revision: https://reviews.llvm.org/D96740

[tests] precommit tests for D97219

[amdgpu] Atomic should be source of divergence.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D97392

[libcxx] [test] Quote the path to the python interpreter

This should allow running tests with the interpreter in some of the
default paths where Python for Windows might be installed.

Differential Revision: https://reviews.llvm.org/D97369

[InstCombine] add tests for fdiv+powi; NFC

AMDGPU: Remove special case in shouldCoalesce

Unaligned registers are now constrained with classes, rather than
specially reserving a subset of the whole class.

AMDGPU: Add even aligned VGPR/AGPR register classes

gfx90a operations require even aligned registers, but this was
previously achieved by reserving registers inside the full class.

Ideally this would be captured in the static instruction definitions
for the operands, and we would have different instructions per
subtarget. The hackiest part of this is we need to manually reassign
AGPR register classes after instruction selection (we get away without
this for VGPRs since those types are actually registered for legal
types).

[mlir][docs] Small fix to local Pass Manager reproduction documentation

[mlir][linalg] Reuse the symbol if attribute uses are identical.

Depends On D97312

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97383

[mlir][linalg] Support for using output values in TC definitions.

This will allow us to define select(pred, in, out) for TC ops, which is useful
for pooling ops.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97312

[lldb] Support debugging utility functions

LLDB uses utility functions to run code in the inferior for its own
internal purposes, such as reading classes from the Objective-C runtime
for example. Because these expressions should be transparent to the
user, we ignore breakpoints and unwind the stack on errors, which
makes them hard to debug.

This patch adds a new setting target.debug-utility-expression that, when
enabled, changes these options to facilitate debugging. It enables
breakpoints, disables unwinding and writes out the utility function
source code to disk so it shows up in the source view.

Differential revision: https://reviews.llvm.org/D97249

[llvm-objcopy] If input=output, preserve umask bits, otherwise drop S_ISUID/S_ISGID bits

This makes the behavior similar to cp

```
chmod u+s,g+s,o+x a
sudo llvm-strip a -o b
// With this patch, b drops set-user-ID and set-group-ID bits.
// sudo cp a b => b does not have set-user-ID or set-group-ID bits.
```

This also changes the behavior for the following case:

```
chmod u+s,g+s,o+x a
llvm-strip a
// a preserves set-user-ID and set-group-ID bits.
// This matches binutils<2.36 and probably >=2.37. 2.36 and 2.36.1 have some compatibility issues.
```
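
The two cases above can be modelled roughly as follows. This is a sketch with a hypothetical function name; it covers only the set-user-ID and set-group-ID bits and ignores the umask handling.

```python
import stat

def output_mode(input_mode, in_place):
    # Stripping in place (output is the input file) keeps the mode as is.
    if in_place:
        return input_mode
    # Writing a different file drops set-user-ID and set-group-ID, like cp.
    return input_mode & ~(stat.S_ISUID | stat.S_ISGID)
```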

Differential Revision: https://reviews.llvm.org/D97253

Remove a workaround for MSVC 2013, now that MSVC 2017 is the minimum.

In MSVC 2013, 'alignas(integer-template-arg)' didn't compile; verified
on godbolt that this now works properly.

[AArch64][GlobalISel] Fix manual selection for v4s16 and v8s8 G_DUP

The manual G_DUP selection code would produce DUPv16i8 for v8s8 and DUPv8i16
for v4s16.

This adds the missing cases to the manual selection code, and makes it return
false when there is an unexpected size.

Update select-dup.mir to reflect the change.

Differential Revision: https://reviews.llvm.org/D97240

[RISCV] Support fixed vector extract element. Use VL=1 for scalable vector extract element.

I've changed to use VL=1 for slidedown and shifts to avoid extra
element processing that we don't need.

The i64 fixed vector handling on i32 isn't great if the vector type
isn't legal due to an ordering issue in type legalization. If the
vector type isn't legal, we fall back to default legalization
which will bitcast the vector to vXi32 and use two independent extracts.
Doing better will require handling several different cases by
manually inserting insert_subvector/extract_subvector to adjust the type
to a legal vector before emitting custom nodes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97319

[lit] Add --ignore-fail

For some build configurations, `check-all` calls lit multiple times to
run multiple lit test suites.  Most recently, I've found this to be
true when configuring openmp as part of `LLVM_ENABLE_RUNTIMES`, but
this is not the first time.

If one test suite fails, none of the remaining test suites run, so you
cannot determine if your patch has broken them.  It can then be
frustrating to try to determine which `check-` targets will run the
remaining tests without getting stuck on the failing tests.

When such cases arise, it is probably best to adjust the cmake
configuration for `check-all` to run all test suites as part of one
lit invocation.  Because that fix will likely not be implemented and
land immediately, this patch introduces `--ignore-fail` to serve as a
workaround for developers trying to see test results until it does
land:

```
$ LIT_OPTS=--ignore-fail ninja check-all
```

One problem with `--ignore-fail` is that it makes it challenging to
detect test failures in a script, perhaps in CI.  This problem should
serve as motivation to actually fix the cmake configuration instead of
continuing to use `--ignore-fail` indefinitely.
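
The effect on the exit status, and so on scripts and CI, can be modelled as below. This is a sketch of the behaviour described here, not lit's source, and the function name is hypothetical.

```python
def lit_exit_code(num_failures, ignore_fail=False):
    # Normally any failing test makes the run exit non-zero, which stops a
    # driver such as `ninja check-all` from starting the next test suite.
    if num_failures > 0 and not ignore_fail:
        return 1
    # With --ignore-fail the failures are still reported, but the exit
    # status no longer reflects them.
    return 0
```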

Reviewed By: jhenderson, thopre

Differential Revision: https://reviews.llvm.org/D96371

[mlir][spirv] Define spv.GLSL.Ldexp

co-authored-by: Alan Liu <alanliu.yf@gmail.com>

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97228

[LegalizeIntegerTypes] Further improve ExpandIntRes_SADDSUBO for targets where SADDO/SSUBO aren't supported.

Rather than converting 3 signbits to bools and comparing them,
we can do bitwise logic on the whole vector and convert the
resulting sign bit to a bool at the end.

This is still a different algorithm than what we do in LegalizeDAG
through expandSADDOSSUBO. That algorithm needs to know that the
RHS of SSUBO is > 0, but that's costly when the type is split.
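
A common bitwise formulation of signed overflow detection is sketched below for 32-bit values. It illustrates the idea of combining whole values and inspecting only the final sign bit. It is not necessarily the exact sequence the legalizer emits, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000

def saddo32(lhs, rhs):
    # Operands are 32-bit patterns. Overflow occurred when both operands
    # have the same sign and the sum's sign differs from it.
    s = (lhs + rhs) & MASK32
    overflow = (~(lhs ^ rhs) & (lhs ^ s) & SIGN32) != 0
    return s, overflow

def ssubo32(lhs, rhs):
    # Overflow occurred when the operands have different signs and the
    # difference's sign differs from the left operand's sign.
    d = (lhs - rhs) & MASK32
    overflow = ((lhs ^ rhs) & (lhs ^ d) & SIGN32) != 0
    return d, overflow
```

Only one sign-bit test is needed per operation, at the very end.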

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97325

[mlir] Add constBuilderCall to TypeAttr to simplify builders

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97344

Revert rGd65ddca83ff85c7345fe9a0f5a15750f01e38420 - "[ValueTracking] ComputeKnownBits - minimum leading/trailing zero bits in LSHR/SHL (PR44526)"

This is causing sanitizer test failures that I haven't been able to fix yet.

[libomptarget] Fixed MSVC build fail caused by __attribute__((used)).

Differential Revision: https://reviews.llvm.org/D97348

[MC][ARM] add .w suffixes for BL (T1) and DBG

F1.2 Standard assembler syntax fields
describes .w and .n suffixes for wide and narrow encodings.

arch/arm/probes/kprobes/test-thumb.c tests installing kprobes for
certain instructions using inline asm. There are a few instructions we
fail to assemble due to missing .w t2InstAliases.

Adds .w suffixes for:
* bl (F5.1.25 BL, BLX (immediate) T1)
* dbg (F5.1.42 DBG T1)

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D97236

[AArch64] Do not fold SP adjustments into pre-increment addr modes if it overflows the redzone.

Instead of outright disabling this completely with the noredzone attribute,
we only avoid doing the optimization if there are memory operations between
the adjustment and the load/store that the adjustment would be folded into.
This avoids the case of something like a stack cookie being corrupted if an
exception happens before the pre-increment to the SP occurs.

This also prevents the folding happening if we have a redzone, but the offset
being folded is above the redzone amount (128 bytes in this case).

rdar://73269336

Differential Revision: https://reviews.llvm.org/D95179

[flang] add attribute to trim runtime implementation establish call

CFI allocatable attribute is needed so that the descriptor for the
result can be allocated/deallocated.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D97395

[tests] precommit tests for an upcoming AA improvement

[OpenMP][Tests][NFC] rename macro to avoid naming clash

Rename a macro use missed in e0f3acc5d34aa

[OpenMP] Fixed a crash when offloading to x86_64 with target nowait

PR#49334 reports a crash when offloading to x86_64 with `target nowait`,
which is caused by referencing a nullptr. The root cause of the issue is, when
pushing a hidden helper task in `__kmp_push_task`, it also maps the gtid to its
shadow gtid, which is wrong.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97329

[OpenMP][Tests][NFC] lit might also be known as llvm-lit.py

Revert "[tests] Mark an autogened test as such"

This reverts commit 43a569faeb332ae8b355fffc33eec1ef6e33052e.

Unhelpfully, the tool just added the header and didn't actually update any of the tests. I didn't notice until after pushing.

Make sure some types are indeed trivially_copyable per llvm::is_trivially_copyable

Test a few types used as llvm::SmallVector parameter. It is important to ensure
we have a consistent behavior for these types to prevent ABI issues such as the one
we met in https://bugs.llvm.org/show_bug.cgi?id=39427.

Differential Revision: https://reviews.llvm.org/D96536

[libomptarget] Load images in order of registration

This makes sure that images are loaded in the order in which they are registered with libomptarget.

If a target can load multiple images and these images depend on each other (for example if one image contains the program's target regions and one image contains library code), then the order in which images are loaded can be important for symbol resolution (for example, in the VE plugin).
In this case, because the same code exists in the host binaries, the order in which the host linker loads them (which is also the order in which images are registered with libomptarget) is the order in which the images have to be loaded onto the device.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95530

[tests] Mark an autogened test as such

[OpenMP][Tests][NFC] rename macro to avoid naming clash

Rename a macro and macro use missed in 35ab6d6390ecd

[AMDGPU] Add a bit more gfx90a test coverage

Update the GlobalISel version of llvm.amdgcn.workitem.id.ll to mostly
match the SelectionDAG version.

Differential Revision: https://reviews.llvm.org/D97377

[OpenMP][Tests][NFC] rename macro to avoid naming clash

When including <ostream>, the register_callback macro of the OMPT callback.h
clashes with a function defined in ostream. This patch renames the macro
and includes ompt into the macro name.

[libc][NFC] Exclude few targets from the `all` target.

[libc++] NFC: Fix a few tests in tuple that would succeed trivially

[libc++] NFC: Fix a few tests in pair that would succeed trivially

[flang][fir] Add zero_bits operation.

This patch adds the new zero_bits operation and upstreams other changes
including the following:

  - update tablegen syntax to newer forms
  - update memory effects annotations
  - update documentation [NFC]
  - other NFC, such as whitespace and formatting

Differential revision: https://reviews.llvm.org/D97331

[clang-tidy] Fix readability-avoid-const-params-in-decls removing const in template parameters

Fixes https://bugs.llvm.org/show_bug.cgi?id=38035

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D96209