review.tizen.org Git - platform/upstream/llvm.git/log

[AMDGPU] Set threshold for regbanks reassign pass

This is to limit compile time. I did experiments with some
inputs and found that compile time keeps reasonable for this
pass if we have less than 100000 virtual registers and then
starts to explode somewhere between 100000 and 150000.

Differential Revision: https://reviews.llvm.org/D97218

[flang][test] Share all driver test dirs between `f18` and `flang-new`

Originally, when we added the new driver, we created dedicated test
directories for `flang-new`. This way we separated the tests for the
`throwaway` and the new driver.

As we are increasing test coverage and starting to share tests between
the two drivers, it makes sense to share all directories and instead
rely on:
```
! REQUIRES: new-flang-driver
```
to mark tests as exclusively for the new driver.

Differential Revision: https://reviews.llvm.org/D97207

[OpenMP][NVPTX] Fixed a compilation error in deviceRTLs caused by unsupported feature in release verion of LLVM

`ptx71` is not supported in release version of LLVM yet. As a result,
the support of CUDA 11.2 and CUDA 11.1 caused a compilation error as mentioned
in D97004. Since the support in D97004 is just a WA for releease, and we'll not
use it in the near future, using `ptx70` for CUDA 11 is feasible.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97195

make Affine parallel and yield ops MemRefsNormalizable

Affine parallel ops may contain and yield results from MemRefsNormalizable ops in the loop body. Thus, both affine.parallel and affine.yield should have the MemRefsNormalizable trait.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D96821

[InstructionSimplify] SimplifyShift - rename shift amount KnownBits. NFCI.

As suggested on D97305.

Revert "Module: Use FileEntryRef and DirectoryEntryRef in Umbrella, Header, and DirectoryName, NFC"

This (mostly) reverts 32c501dd88b62787d3a5ffda7aabcf4650dbe3cd. Hit a
case where this causes a behaviour change, perhaps the same root cause
that triggered the revert of a40db5502b2515a6f2f1676b5d7a655ae0f41179 in
7799ef7121aa7d59f4bd95cdf70035de724ead6f.

(The API changes in DirectoryEntry.h have NOT been reverted as a number
of subsequent commits depend on those.)

https://reviews.llvm.org/D90497#2582166

[LegalizeIntegerTypes] Improve ExpandIntRes_SADDSUBO codegen on targets without SADDO/SSUBO.

This code creates 3 setccs that need to be expanded. It was
creating a sign bit test as setge X, 0 which is non-canonical.
Canonical would be setgt X, -1. This misses the special case in
IntegerExpandSetCCOperands for sign bit tests that assumes
canonical form. If we don't hit this special case we end up
with a multipart setcc instead of just checking the sign of
the high part.

To fix this I've reversed the polarity of all of the setccs to
setlt X, 0 which is canonical. The rest of the logic should
still work. This seems to produce better code on RISCV which
lacks a setgt instruction.

This probably still isn't the best code sequence we could use here.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97181

[THUMB2] add .w suffixes for ldr/str (immediate) T4

The Linux kernel when built with CONFIG_THUMB2_KERNEL makes use of these
instructions with immediate operands and wide encodings.

These are the T4 variants of the follow sections from the Arm ARM.
F5.1.72 LDR (immediate)
F5.1.229 STR (immediate)

I wasn't able to represent these simple aliases using t2InstAlias due to
the Constraints on the non-suffixed existing instructions, which results
in some manual parsing logic needing to be added.

F1.2 Standard assembler syntax fields
describes the use of the .w (wide) vs .n (narrow) encoding suffix.

Link: https://bugs.llvm.org/show_bug.cgi?id=49118
Link: https://github.com/ClangBuiltLinux/linux/issues/1296
Reported-by: Stefan Agner <stefan@agner.ch>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D96632

[darwin] use new crash reporter api

Add support for the new crash reporter api if the headers are available. Falls back to the old API if they are not available. This change was based on [[ https://github.com/llvm/llvm-project/blob/0164d546d2691c439fc06c8fff126224276c2d02/llvm/lib/Support/PrettyStackTrace.cpp#L111 | /llvm/lib/Support/PrettyStackTrace.cpp ]]

There is a lit for this behavior here: https://reviews.llvm.org/D96737 but is not included in this diff because it is potentially flaky.

rdar://69767688

Reviewed By: delcypher, yln

Commited by Dan Liew on behalf of Emily Shi.

Differential Revision: https://reviews.llvm.org/D96830

[darwin][asan] add test for application specific information in crash logs

Added a lit test that finds its corresponding crash log and checks to make sure it has asn output under `Application Specific Information`.

This required adding two python commands:
- `get_pid_from_output`: takes the output from the asan instrumentation and parses out the process ID
- `print_crashreport_for_pid`: takes in the pid of the process and the file name of the binary that was run and prints the contents of the corresponding crash log.

This test was added in preparation for changing the integration with crash reporter from the old api to the new api, which is implemented in a subsequent commit.

rdar://69767688

Reviewed By: delcypher

Commited by Dan Liew on behalf of Emily Shi.

Differential Revision: https://reviews.llvm.org/D96737

[GlobalISel] Make more use of replaceSingleDefInstWithReg. NFC.

[lldb] Add deref support and tests to shared_ptr synthetic

Add `frame variable` dereference suppport to libc++ `std::shared_ptr`.

This change allows for commands like `v *thing_sp` and `v thing_sp->m_id`. These
commands now work the same way they do with raw pointers. This is done by adding an
unaccounted for child member named `$$dereference$$`.

Also, add API tests for `std::shared_ptr`, previously there were none.

Differential Revision: https://reviews.llvm.org/D97165

Revert "[LV] Allow tryToCreateWidenRecipe to return a VPValue, use for blends."

This reverts commit 4efa097eb4c87d7ffe09a95a5b4ff372bdddda85, because
some the compilers used for some bots do not support automatic
conversions to PointerUnion.

[LV] Allow tryToCreateWidenRecipe to return a VPValue, use for blends.

Generalize the return value of tryToCreateWidenRecipe to return either a
newly create recipe or an existing VPValue. Use this to avoid creating
unnecessary VPBlendRecipes.

Fixes PR44800.

[AMDGPU][SelectionDAG] Don't combine uniform multiplies to MUL_[UI]24

Prefer to keep uniform (non-divergent) multiplies on the scalar ALU when
possible. This significantly improves some game cases by eliminating
v_readfirstlane instructions when the result feeds into a scalar
operation, like the address calculation for a scalar load or store.

Since isDivergent is only an approximation of whether a value is in
SGPRs, it can potentially regress some situations where a uniform value
ends up in a VGPR. These should be rare in real code, although the test
changes do contain a number of examples.

Most of the test changes are just using s_mul instead of v_mul/mad which
is generally better for both register pressure and latency (at least on
GFX10 where sgpr pressure doesn't affect occupancy and vector ALU
instructions have significantly longer latency than scalar ALU). Some
R600 tests now use MULLO_INT instead of MUL_UINT24.

GlobalISel appears to handle more scenarios in the desirable way,
although it can also be thrown off and fails to select the 24-bit
multiplies in some cases.

Alternative solution considered and rejected was to allow selecting
MUL_[UI]24 to S_MUL_I32. I've rejected this because the definition of
those SD operations works is don't-care on the most significant 8 bits,
and this fact is used in some combines via SimplifyDemandedBits.

Based on a patch by Nicolai Hähnle.

Differential Revision: https://reviews.llvm.org/D97063

[JumpThreading] Update computeValueKnownInPredecessors to recognize logical and/or patterns

This allows JumpThreading's computeValueKnownInPredecessors to
recognize select form of and/or patterns as well.

[AMDGPU] Rename a prefix for sanity. NFC.

Add @llvm.coro.async.size.replace intrinsic.

The new intrinsic replaces the size in one specified AsyncFunctionPointer with
the size in another. This ability is necessary for functions which merely
forward to async functions such as those defined for partial applications.

Reviewed By: aschwaighofer

Differential Revision: https://reviews.llvm.org/D97229

[Driver][NFC] Add explicit break to final case

[libcxx] [test] Define _CRT_STDIO_ISO_WIDE_SPECIFIERS while building tests

This matches how libc++ itself is built. This avoids errors due to
mismatch if linking libc++ statically.

Differential Revision: https://reviews.llvm.org/D97169

[clang-tidy] Remove IncludeInserter from MoveConstructorInit check.

This check registers an IncludeInserter, however the check itself doesn't actually emit any fixes or includes, so the inserter is redundant.

From what I can tell the fixes were removed in D26453(rL290051) but the inserter was left in, probably an oversight.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D97243

[clang][SVE] Don't warn on vector to sizeless builtin implicit conversion

This commit prevents warnings from -Wconversion when a clang vector type
is implicitly converted to a sizeless builtin type -- for example, when
implicitly converting a fixed-predicate to a scalable predicate.

The code below:

     1    #include <arm_sve.h>
     2
     3    #define N __ARM_FEATURE_SVE_BITS
     4    #define FIXED_ATTR __attribute__((arm_sve_vector_bits (N)))
     5    typedef svbool_t fixed_svbool_t FIXED_ATTR;
     6
     7    inline fixed_svbool_t foo(fixed_svbool_t p) {
     8      return svnot_z(svptrue_b64(), p);
     9    }

would previously raise this warning:

    warning: implicit conversion turns vector to scalar: \
    'fixed_svbool_t' (vector of 8 'unsigned char' values) to 'svbool_t' \
    (aka '__SVBool_t') [-Wconversion]

Note that many cases of these implicit conversions were already
permitted because many functions inside arm_sve.h are spawned via
preprocessor macros, and the call to isInSystemMacro would cover us in
this case. This commit fixes the remaining cases.

Differential Revision: https://reviews.llvm.org/D97053

[clang-tidy] Extending bugprone-signal-handler with POSIX functions.

An option is added to the check to select wich set of functions is
defined as asynchronous-safe functions.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D90851

[lldb] [test] Un-XFAIL TestBuiltinTrap on FreeBSD/aarch64

[lldb] [test] Un-XFAIL a test that no longer fail on FreeBSD

[X86] Cleanup overflow test check prefixes. NFCI.

Tidy up the check prefixes to improve reuse.

[AMDGPU] Use divergent addresses for vector loads

Change some test cases to use divergent addresses for vector loads,
which should be the common case in real world code. Using uniform
addresses causes poor instruction selection for the surrounding
code which has to be fixed up post-register-allocation, and this causes
a lot of testsuite churn for a forthcoming patch to stop selecting
24-bit vector multiply instructions for uniform multiplies.

This shows up some problems in the idot tests where we fail to select
v_dot instructions because the patterns only match MUL_[UI]24 ISD nodes,
but the DAG contains i16 mul nodes instead.

Differential Revision: https://reviews.llvm.org/D97062

[ARM] do not consider sp as deprecated for ldm/stm

Early versions of the ARMv7 reference manuals considered the sp register
as a deprecated register for ldm/stm familiy of instructions. However,
later versions such as ARM DDI 0406C.d added a note to the Appendix:

D9.3 Use of the SP as a general-purpose register
Most ARM instructions, unlike Thumb instructions, provide exactly the
same access to the SP as to R0-R12. This means that it is possible to
use the SP as a general-purpose register. Earlier issues of this manual
deprecated the use of SP in an ARM instruction, in any way that is
deprecated, not permitted, or not possible in the corresponding
Thumb instruction. However, user feedback indicates a number of cases
where these instructions are useful. Therefore, ARM no longer deprecates
these instruction uses.
Also Armv8 manuals no longer consider SP as deprecated register for ldm/
stm A32 instructions.

Furthermore, GNU as also does not print a deprecated warning when using
SP with those instructions.

Drop deprecation warning for pop/ldm/push/stm instructions.

Patch by: Stefan Agner.

Differential Revision: https://reviews.llvm.org/D82692

[TTI] Change getOperandsScalarizationOverhead to take Type args

As a followup to D95291, getOperandsScalarizationOverhead was still
using a VF as a vector factor if the arguments were scalar, and would
assert on certain matrix intrinsics with differently sized vector
arguments. This patch removes the VF arg, instead passing the Types
through directly. This should allow it to more accurately compute the
cost without having to guess at which operands will be vectorized,
something difficult with more complex intrinsics.

This adjusts one SVE test as it is now calling the wrong intrinsic vs
veccall. Without invalid InstructCosts the cost of the scalarized
intrinsic is too low. This should get fixed when the cost of
scalarization is accounted for with scalable types.

Differential Revision: https://reviews.llvm.org/D96287

[CostModel] Remove VF from IntrinsicCostAttributes

getIntrinsicInstrCost takes a IntrinsicCostAttributes holding various
parameters of the intrinsic being costed. It can either be called with a
scalar intrinsic (RetTy==Scalar, VF==1), with a vector instruction
(RetTy==Vector, VF==1) or from the vectorizer with a scalar type and
vector width (RetTy==Scalar, VF>1). A RetTy==Vector, VF>1 is considered
an error. Both of the vector modes are expected to be treated the same,
but because this is confusing many backends end up getting it wrong.

Instead of trying work with those two values separately this removes the
VF parameter, widening the RetTy/ArgTys by VF used called from the
vectorizer. This keeps things simpler, but does require some other
modifications to keep things consistent.

Most backends look like this will be an improvement (or were not using
getIntrinsicInstrCost). AMDGPU needed the most changes to keep the code
from c230965ccf36af5c88c working. ARM removed the fix in
dfac521da1b90db683, webassembly happens to get a fixup for an SLP cost
issue and both X86 and AArch64 seem to now be using better costs from
the vectorizer.

Differential Revision: https://reviews.llvm.org/D95291

[clang-tidy] Update checks list.

[clang][parse][NFC] Remove dead ProhibitAttributes() call

GNU-style attribute in enum bodies are allowed (and used by several
tests), and this call to ProhibitAttributes() was dead code.

Differential Revision: https://reviews.llvm.org/D97271

[clang-tidy] Install run-clang-tidy.py in bin/ as run-clang-tidy

The run-clang-tidy.py helper script is supposed to be used by the
user, hence it should be placed in the user's PATH. Some
distributions, like Gentoo [1], won't have it in PATH unless it is
installed in bin/.

Furthermore, installed scripts in PATH usually do not carry a filename
extension, since there is no need to know that this is a Python
script. For example Debian and Ubuntu already install this script as
'run-clang-tidy' [2] and hence build systems like Meson also look for
this name first [3]. Hence we install run-clang-tidy.py as
run-clang-tidy, as suggested by Sylvestre Ledru [4].

1: https://bugs.gentoo.org/753380
2: https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/60aefb14171ab5c3867a0081844b507fc9f6e015/debian/clang-tidy-X.Y.links.in#L2
3: https://github.com/mesonbuild/meson/blob/b6dc4d5e5c6e838de0b52e62d982ba2547eb366d/mesonbuild/scripts/clangtidy.py#L44
4: https://reviews.llvm.org/D90972#2380640

Reviewed By: sylvestre.ledru, JonasToth

Differential Revision: https://reviews.llvm.org/D90972

[DSE] Allow ptrs defined in the entry block in IsGuaranteedLoopInvariant.

The **IsGuaranteedLoopInvariant** function is making sure to check if the
incoming pointer is guaranteed to be loop invariant, therefore I think
the case where the pointer is defined in the entry block of a function
automatically guarantees the pointer to be loop invariant, as the entry
block of a function cannot have predecessors or be part of a loop.

I implemented this small patch and tested it using
**ninja check-llvm-unit** and **ninja check-llvm**. I added a contained test
file that shows the problem and used **opt -O3 -debug** on it to make sure
the case is not currently handled (in fact the debug log is showing that
the DSE pass is bailing out when testing if the killer store is able to
clobber the dead store).

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D96979

[RISCV] vle1.v/vse1.v should be unmasked instructions.

vle1.v/vse1.v should be unmasked instructions. The vm encoding is 1 for
unmasked instructions.

Differential Revision: https://reviews.llvm.org/D97237

[OpenCL][Docs] Change description for the OpenCL standard headers.

After updating the user interface in D96515, update the docs
reflecting the new approach.

Tags: #clang

Differential Revision: https://reviews.llvm.org/D96616

[mlir][Linalg] Retire hoistViewAllocOps.

This transformation was only used for quick experimentation and is not general enough.
Retire it.

Differential Revision: https://reviews.llvm.org/D97266

Fix Wdocumentation parameter warning. NFCI.

[mlir] NFC - Use declarative assembly for scf::YieldOp

[lldb][NFC] Remove unused ValueObject::LogValueObject functions

Those functions aren't called anywhere. For debugging purposes we usually
have Dump() methods (which already exist in some semi-functional form in
ValueObject).

[Support] Add reserve() method to the raw_ostream.

If resulting size of the output stream is already known,
then the space for stream data could be preliminary
allocated in some cases. f.e. raw_string_ostream could
preallocate the space for the target string(it allows
to avoid reallocations during writing into the stream).

Differential Revision: https://reviews.llvm.org/D91693

[lldb][NFC] Clean up ValueObject comments

* Remove commented out code.
* Doxygenify comments that serve as documentation.
* Use the LLVM comment style where possible.

[ARM] Add pre/post inc tests of various sizes. NFC

Revert "[WebAssembly] call_indirect issues table number relocs"

This reverts commit 861dbe1a021e6439af837b72b219fb9c449a57ae. It broke
emscripten -- see https://reviews.llvm.org/D90948#2578843.

[RISCV] Support insertion of misaligned subvectors

This patch extends the support for RVV INSERT_SUBVECTOR to cover those
which don't align to a vector register boundary. Like the support for
EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the
nearest register-sized subvector (a subregister operation), then sliding
the vector down with VSLIDEDOWN, inserting the subvector to the first
position, and sliding the vector back up again afterwards.

Unlike subvector extraction, for vectors that occupy less than a full
vector register we must preserve the untouched elements. We do this by
lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and
lowering that to a VSLIDEUP with a zero offset. This uses a
tail-undisturbed policy and so has the effect of "sliding in" the
subvector elements while preserving the surrounding ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96972

Fix unused variable

[OpenCL] Move remaining defines to opencl-c-base.h

Move any remaining preprocessor defines from `opencl-c.h` to
`opencl-c-base.h`, such that they are shared with
`-fdeclare-opencl-builtins` too.

In particular, move:
- the `as_type` and `as_typen` definitions, and
- the `kernel_exec` and `__kernel_exec` definitions.

Also clang-format the changes.

Differential Revision: https://reviews.llvm.org/D96948

[lldb][NFC] Give CompilerType's IsArrayType/IsVectorType/IsBlockPointerType out-parameters default values

We already do this for most functions that have out-parameters, so let's do
the same here and avoid all the `nullptr, nullptr, nullptr` in every call.

Fix UBSAN in __ubsan::Value::getSIntValue

/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128'
    #0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77
    #1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190
    #2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338
    #3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370
    #4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f)
    #5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24)
    #6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd)

Differential Revision: https://reviews.llvm.org/D97263

[Sanitizer][NFC] Fix typo

[lldb][NFC] Don't inherit from UserID in ValueObject

ValueObject inherits from UserID which is just a bad idea:

* The inheritance gives ValueObject some member functions that are at best
  misleading (such as `Clear()` which doesn't clear any value beside `id`).

* It allows passing ValueObject to the overloaded operators for UserID (such as
  `==` or `<<` which won't actually compare or print anything in the ValueObject).

* It exposes the `SetID` and `Clear` which both allow users to change the
  internal id value.

Similar to D91699 which did the same for Process

Reviewed By: #lldb, JDevlieghere

Differential Revision: https://reviews.llvm.org/D97205

[X86] Support amx-int8 intrinsic.

Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.

Differential Revision: https://reviews.llvm.org/D97259

[mlir] Add support for DebugCounters using the new DebugAction infrastructure

DebugCounters allow for selectively enabling the execution of a debug action based upon a "counter". This counter is comprised of two components that are used in the control of execution of an action, a "skip" value and a "count" value. The "skip" value is used to skip a certain number of initial executions of a debug action. The "count" value is used to prevent a debug action from executing after it has executed for a set number of times (not including any executions that have been skipped). For example, a counter for a debug action with `skip=47` and `count=2`, would skip the first 47 executions, then execute twice, and finally prevent any further executions.

This is effectively the same as the DebugCounter infrastructure in LLVM, but using the DebugAction infrastructure in MLIR. We can't simply reuse the DebugCounter support already present in LLVM due to its heavy reliance on global constructors (which are not allowed in MLIR). The DebugAction infrastructure already nicely supports the debug counter use case, and promotes the separation of policy and mechanism design philosophy.

Differential Revision: https://reviews.llvm.org/D96395

[mlir] Add a new debug action framework.

This revision adds the infrastructure for `Debug Actions`. This is a DEBUG only
API that allows for external entities to control various aspects of compiler
execution. This is conceptually similar to something like DebugCounters in LLVM, but at a lower level. This framework doesn't make any assumptions about how the higher level driver is controlling the execution, it merely provides a framework for connecting the two together. This means that on top of DebugCounter functionality, we could also provide more interesting drivers such as interactive execution. A high level overview of the workflow surrounding debug actions is
shown below:

*   Compiler developer defines an `action` that is taken by the a pass,
    transformation, utility that they are developing.
*   Depending on the needs, the developer dispatches various queries, pertaining
    to this action, to an `action manager` that will provide an answer as to
    what behavior the action should do.
*   An external entity registers an `action handler` with the action manager,
    and provides the logic to resolve queries on actions.

The exact definition of an `external entity` is left opaque, to allow for more
interesting handlers.

This framework was proposed here: https://llvm.discourse.group/t/rfc-debug-actions-in-mlir-debug-counters-for-the-modern-world

Differential Revision: https://reviews.llvm.org/D84986

[clang][DeclPrinter] Pass Context into StmtPrinter whenever possible

ASTContext were only passed to the StmtPrinter in some places, while it
is always available in DeclPrinter. The context is used by StmtPrinter to better
print statements in some cases, like printing constants as written.

Differential Revision: https://reviews.llvm.org/D97043

[lldb][NFC] Cleanup ValueObject construction code

Just code cleanup for ValueObject constructors:

* Use default member initializers where possible.
* Doxygenify the comments for membersa nd constructors where needed.
* Delete the default constructor which isn't defined.
* Initialize the bitfields via a utility struct instead of doing this in the
different constructors.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D97199

[RISCV] Add test case for missed opportunity use bgez for the canonical form X > -1. NFC

[SimplifyCFG] Minor tweaks to the added tests (NFC)

[SimplifyCFG] Add tests for D97244 (NFC)

[flang][NFC] Add source line to lowering TODO messages

- Add a fatal error handler that can print a message with source location
  before aborting.
- Update TODO macro to take an mlir location argument and to use the
  newly introduced fatal error handler.
- Introduce TODO_NOLOC for the few places where no source location is
  easily accessible.

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D97190

[CMake][profile] Don't use `TARGET lld` to avoid ordering issues

Depending on the order in which lld and compiler-rt projects are
processed by CMake, `TARGET lld` might evaluate to `TRUE` or `FALSE`
even though `lld-available` lit stanza is always set because lld is
being built. We check whether lld project is enabled instead which
is used by other compiler-rt tests.

The ideal solution here would be to use CMake generator expressions,
but those cannot be used for dependencies yet, see:
https://gitlab.kitware.com/cmake/cmake/-/issues/19467

Differential Revision: https://reviews.llvm.org/D97256

[MLIR][LinAlg] Start detensoring implementation.

This commit is the first baby step towards detensoring in
linalg-on-tensors.

Detensoring is the process through which a tensor value is convereted to one
or potentially more primitive value(s). During this process, operations with
such detensored operands are also converted to an equivalen form that works
on primitives.

The detensoring process is driven by linalg-on-tensor ops. In particular, a
linalg-on-tensor op is checked to see whether *all* its operands can be
detensored. If so, those operands are converted to thier primitive
counterparts and the linalg op is replaced by an equivalent op that takes
those new primitive values as operands.

This works towards handling github/google/iree#1159.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96271

[NFC][libc++] Fix _LIBCPP_HAS_BITSCAN64 usage.

Seems line was accidentally left in
llvm-svn: 290924 86eebc5b658b5c2ccf2f4fbc16e8aee9880919a5

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D97211

[builtins] Replace __SOFT_FP__ with __SOFTFP__

Fix PR46294

Differential Revision: https://reviews.llvm.org/D82014

[docs][ORC] Fix section title and reference.

[SLP][Test] Add test for PR49081.ll

Move the MLIR integration tests as a subdirectory of test (NFC)

This does not change the behavior directly: the tests only run when
`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON` is configured. However running
`ninja check-mlir` will not run all the tests within a single
lit invocation. The previous behavior would wait for all the integration
tests to complete before starting to run the first regular test. The
test results were also reported separately. This change is unifying all
of this and allow concurrent execution of the integration tests with
regular non-regression and unit-tests.

Differential Revision: https://reviews.llvm.org/D97241

[libc][NFC] Eliminate couple of dependencies on llvm/ADT/StringExtras.h.

[BuildLibCalls] Add noundef to allocator fns' size

This is a patch to explicitly mark the size parameter of allocator functions like malloc/realloc/... as noundef.

For C/C++: undef can be created from reading an uninitialized variable or padding.
Calling a function with uninitialized variable is already UB.
Calling malloc with padding value is.. something that's not expected. Padding bits may appear in a coerced aggregate, which doesn't apply to malloc's size.
Therefore, malloc's size can be marked as noundef.

For transformations that introduce malloc/realloc/..: I ran LLVM unit tests with an updated Alive2 semantics, and found no regression, so it seems okay.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97045

Only verify LazyCallGraph under expensive checks

These verify calls are causing a lot of slowdown on some files, up to 8x.
The LazyCallGraph infra has been tested a lot over the years, so I'm fairly confident that we don't always need to run the verifys.

These verifies took >90% of total time in one of the compilations I looked at.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D97225

[Analysis] Use range-based for loops (NFC)

[llvm] Use llvm::drop_begin (NFC)

[Analysis] Use ListSeparator (NFC)

[libc] [Obvious] Fix.

[mlir][pdl][NFC] Extract the execution of each bytecode operation into its own function

This makes the implementation of each bytecode operation much easier to reason about, and lets the compiler decide which implementations are beneficial to inline into the main switch.

Differential Revision: https://reviews.llvm.org/D95716

[mlir][pdl] Fix bug when ordering predicates

We should be ordering predicates with higher primary/secondary sums first, but we are currently ordering them last. This allows for predicates more frequently encountered to be checked first.

Differential Revision: https://reviews.llvm.org/D95715

[GVN] Fix a typo in comment

NFC.

Differential Revision: https://reviews.llvm.org/D97200

Reviewed By: fhahn

Changes to mktime to handle invalid dates, overflow and underflow andcalculating the correct date and thenumber of seconds even if invalid datesare passed as arguments.

Added tests for invalid dates like the following
Date 1970-01-01 00:00:-1 is treated as 1969-12-31 23:59:59 and seconds
are returned for the modified date.

Tested the code by doing ninja check-libc (and cmake).

Reviewed By: sivachandra, rtenneti

Differential Revision: https://reviews.llvm.org/D96684

[dfsan] Propagate origins at non-memory/phi/call instructions

This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D97200

[obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field.

As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96831

[lldb] add check for libcxx runtime

When enabling LLDB tests with `LLVM_ENABLE_RUNTIMES=libcxx` CMake will
fail with:

```
LLDB test suite requires libc++, but it is currently disabled.
```

The issue is that the targets in LLVM_ENABLE_RUNTIMES are configured
after the targets in LLVM_ENABLE_PROJECTS, so at this point the check
for the `cxx` target will fail. CMake will add a dependency for a target
that does not exist yet however, so by first checking for `libcxx` in
LLVM_ENABLE_RUNTIMES we ensure that the `cxx` target will be present at
build time.

Tested with:
```
% cmake -G Ninja \
    -C ~/local/llvm-project/lldb/cmake/caches/Apple-lldb-macOS.cmake \
    -DLLVM_ENABLE_PROJECTS="clang;lldb" -DLLVM_ENABLE_RUNTIMES="libcxx" \
    -DLIBCXX_INCLUDE_TESTS=NO ~/local/llvm-project/llvm
% ninja check-lldb
```

Reviewed By: smeenai, JDevlieghere

Differential Revision: https://reviews.llvm.org/D97227

[mlir][IR] Refactor the `getChecked` and `verifyConstructionInvariants` methods on Attributes/Types

`verifyConstructionInvariants` is intended to allow for verifying the invariants of an attribute/type on construction, and `getChecked` is intended to enable more graceful error handling aside from an assert. There are a few problems with the current implementation of these methods:
* `verifyConstructionInvariants` requires an mlir::Location for emitting errors, which is prohibitively costly in the situations that would most likely use them, e.g. the parser.
This creates an unfortunate code duplication between the verifier code and the parser code, given that the parser operates on llvm::SMLoc and it is an undesirable overhead to pre-emptively convert from that to an mlir::Location.
* `getChecked` effectively requires duplicating the definition of the `get` method, creating a quite clunky workflow due to the subtle different in its signature.

This revision aims to talk the above problems by refactoring the implementation to use a callback for error emission. Using a callback allows for deferring the costly part of error emission until it is actually necessary.

Due to the necessary signature change in each instance of these methods, this revision also takes this opportunity to cleanup the definition of these methods by:
* restructuring the signature of `getChecked` such that it can be generated from the same code block as the `get` method.
* renaming `verifyConstructionInvariants` to `verify` to match the naming scheme of the rest of the compiler.

Differential Revision: https://reviews.llvm.org/D97100

Revert "[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt"

This reverts commit 867e379c0e14527eb7aa68485a10324693e35f5d.

For some reason this is upsetting Linux/Windows bots. Reverting while I try to
reproduce.

[Test][AArch64] Test SADDE/SSUBE/UADDE/USUBE narrowing legalization

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D96676

[AArch64][GlobalISel] Make overflow legalization use clampScalar

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96674

[GlobalISel] Implement narrowScalar for SADDE/SSUBE/UADDE/USUBE

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96673

[GlobalISel] Implement narrowScalar for SADDO/SSUBO

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96672

[GlobalISel] Implement narrowScalar for UADDO/USUBO

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96671

[lldb] Reinstate support for LLDB_VERSION_STRING

Reinstate support for specifying -DLLDB_VERSION_STRING="best-lldb"
which seems to have gotten accidentally removed in the past.

rdar://38983903

Differential revision: https://reviews.llvm.org/D97235

[MacroExpansionContext] Fix a warning.

This patch fixes:

error: private field 'PP' is not used [-Werror,-Wunused-private-field]

[libunwind] unw_* alias fixes for ELF and Mach-O

Rename the CMake option, LIBUNWIND_HERMETIC_STATIC_LIBRARY, to
LIBUNWIND_HIDE_SYMBOLS. Rename the C macro define,
_LIBUNWIND_DISABLE_VISIBILITY_ANNOTATIONS, to _LIBUNWIND_HIDE_SYMBOLS,
because now the macro adds a .hidden directive rather than merely
suppress visibility annotations.

For ELF, when LIBUNWIND_HIDE_SYMBOLS is enabled, mark unw_getcontext as
hidden. This symbol is the only one defined using src/assembly.h's
WEAK_ALIAS macro. Other unw_* weak aliases are defined in C++ and are
already hidden.

Mach-O doesn't support weak aliases, so remove .weak_reference and
weak_import. When LIBUNWIND_HIDE_SYMBOLS is enabled, output
.private_extern for the unw_* aliases.

In assembly.h, add missing SYMBOL_NAME macro invocations, which are
used to prefix symbol names with '_' on some targets.

Fixes PR46709.

Reviewed By: #libunwind, phosek, compnerd, steven_wu

Differential Revision: https://reviews.llvm.org/D93003

[sparse][mlir] simplify lattice optimization logic

Simplifies the way lattices are optimized with less, but more
powerful rules. This also fixes an inaccuracy where too many
lattices resulted (expecting a non-existing universal index).
Also puts no-side-effects on all proper getters and unifies
bufferization flags order in integration tests (for future,
more complex use cases).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97134

[JITLink] Add a getFixupAddress convenience method to Block.

[JITLink] Don't allow creation of sections with duplicate names.

[gn build] Port 8f48ddd19358

[X86][AMX] Lower tile copy instruction.

Since there is no tile copy instruction, we need to store tile
register to stack and load from stack to another tile register.
We need extra GR to hold the stride, and we need stack slot to
hold the tile data register. We would run this pass after copy
propagation, so that we don't miss copy optimization. And we
would run this pass before prolog/epilog insertion, so that we
can allocate stack slot.

Differential Revision: https://reviews.llvm.org/D97112

DebugInfo: Emit "LocalToUnit" flag on local member function decls.

Follow-up to fe2dcd89acfd9301a230e38e9030734553baa8dc.

Update test per review comments, restoring the "D" type to its
original state, and adding new "L" type. (Sorry, this was intended to
be included in the prior commit)

Differential Revision: https://reviews.llvm.org/D96044

Add auto-upgrade support for annotation intrinsics

The llvm.ptr.annotation and llvm.var.annotation intrinsics were changed
since the 11.0 release to add an additional parameter. This patch
auto-upgrades IR containing the four-parameter versions of these
intrinsics, adding a null pointer as the fifth argument.

Differential Revision: https://reviews.llvm.org/D95993

[AMDGPU] Move RPT::getLiveRegs() check under EXPENSIVE_CHECKS

This is too expensive even for debug builds. It doubles
scheduling time if enabled.

Differential Revision: https://reviews.llvm.org/D97232

[CMake] Don't optimize tests so much under ThinLTO

This drops check-llvm under -DLLVM_ENABLE_LTO=Thin from 13m to 10m20s on Windows and 20m to 15m35s on Linux.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D96618