review.tizen.org Git - platform/upstream/llvm.git/log

[fir] Add fir.extract_value and fir.insert_value conversion

This patch adds the conversion pattern for fir.extract_value
and fir.insert_value. fir.extract_value is lowered to llvm.extractvalue
and fir.insert_value is lowered to llvm.insertvalue.
This patch also adds the type conversion for the BoxType and RecordType
needed to have some comprehensive tests.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D112961

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[gn build] Reformat all files

Ran `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format`.
No behavior change.

[clang] 'unused-but-set-variable' warning should not apply to __block objective-c pointers

The __block Objective-C pointers can be set but not used due to a commonly used lifetime extension pattern in Objective-C.

Differential Revision: https://reviews.llvm.org/D112850

[gn build] Use build-machine-independent paths in coverage information

This is possible after D106314 / 8773822c578a.

Makes the required prepare-code-coverage-artifact.py invocation a bit longer,
but that seems like a good tradeoff.

Differential Revision: https://reviews.llvm.org/D113282

Extend timeout of llvm/unittests:ir_tests

This test became much slower after 01d8759ac9

[ValueTracking][InstCombine] Introduce and use ComputeMinSignedBits

This introduces a new ComputeMinSignedBits method for ValueTracking that
returns the BitWidth - SignBits + 1 from ComputeSignBits, and represents
the minimum bit size for the value as a signed integer. Similar to the
existing APInt::getMinSignedBits method, this can make some of the
reasoning around ComputeSignBits more natural.
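
As a rough illustration of the `BitWidth - SignBits + 1` relation, here is a
minimal Python sketch for a known constant. The helper names
`num_sign_bits` and `min_signed_bits` are made up for this example; the real
ValueTracking code reasons about arbitrary IR values, not just constants.

```python
def num_sign_bits(value: int, bit_width: int) -> int:
    """Count the leading bits equal to the sign bit (the sign bit included)."""
    bits = value & ((1 << bit_width) - 1)  # two's-complement bit pattern
    sign = (bits >> (bit_width - 1)) & 1
    count = 0
    for i in range(bit_width - 1, -1, -1):
        if (bits >> i) & 1 != sign:
            break
        count += 1
    return count


def min_signed_bits(value: int, bit_width: int) -> int:
    """BitWidth - SignBits + 1, the formula quoted in the commit message."""
    return bit_width - num_sign_bits(value, bit_width) + 1
```

For an i8 holding 3 (`00000011`) there are 6 sign bits, so 8 - 6 + 1 = 3 bits
are enough to represent the value as a signed integer.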

See https://reviews.llvm.org/D112298

[DAG] FoldConstantVectorArithmetic - remove SDNodeFlags argument

Another minor step towards merging FoldConstantVectorArithmetic into FoldConstantArithmetic.

We don't use SDNodeFlags in any constant folding inside DAG, so passing the Flags argument is a waste of time. An alternative would be to wire up FoldConstantArithmetic to take SDNodeFlags just in case we someday start using it, but we don't have any way to test it and I'd prefer to avoid dead code.

Differential Revision: https://reviews.llvm.org/D113276

[X86] `X86TTIImpl::getInterleavedMemoryOpCostAVX512()`: mask is i8 not i1

Even though AVX512's masked mem ops (unlike AVX1/2) have a mask
that is a `VF x i1`, replication of said masks happens after
promotion of it to `VF x i8`, so we should use `i8`, not `i1`,
when calculating the cost of mask replication.

[DAGCombiner] add fold for vselect based on mask of signbit

(X s< 0) ? Y : 0 --> (X s>> BW-1) & Y

We canonicalize to the icmp+select form in IR, and we already have this fold
for scalar select in SDAG, so I think it's an oversight that we don't have
the fold for vectors. It seems neutral for AArch64 and saves some instructions
on x86.

Whether we should also have the sibling folds for the inverse condition or
all-ones true value may depend on target-specific factors such as whether
there's an "and-not" instruction.
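
The equivalence behind the fold can be checked exhaustively for a single lane.
The sketch below models one vector element as a plain Python integer; the
function names are illustrative only and are not part of SelectionDAG.

```python
def as_signed(bits: int, bw: int) -> int:
    """Interpret a bw-bit pattern as a two's-complement signed integer."""
    return bits - (1 << bw) if bits >> (bw - 1) else bits


def select_form(x: int, y: int, bw: int) -> int:
    """(X s< 0) ? Y : 0"""
    return y if as_signed(x, bw) < 0 else 0


def shift_and_form(x: int, y: int, bw: int) -> int:
    """(X s>> BW-1) & Y, using an arithmetic shift to splat the sign bit."""
    mask = (1 << bw) - 1
    sign_splat = (as_signed(x, bw) >> (bw - 1)) & mask
    return sign_splat & y


def fold_holds(bw: int) -> bool:
    """Compare both forms for every pair of bw-bit inputs."""
    return all(
        select_form(x, y, bw) == shift_and_form(x, y, bw)
        for x in range(1 << bw)
        for y in range(1 << bw)
    )
```

The arithmetic shift yields all-ones when X is negative and zero otherwise, so
the AND either passes Y through or clears it.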

Differential Revision: https://reviews.llvm.org/D113212

[AArch64] add tests for vector select; NFC

[x86] add tests for vector select; NFC

[InstCombine] add signbit tests for icmp with trunc; NFC

[gn build] Port 7a98761d74db

[IR][ShuffleVector] Introduce `isReplicationMask()` matcher

Avid readers of this saga may recall from previous installments,
that replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, replication mask is used by LoopVectorizer
when using masked interleaved memory operations.

As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
its support in the cost model leaves a lot to be desired:
until basically yesterday even for AVX512 we had no cost model for it.

As has been witnessed in the recent
AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, while it is hard enough to query the cost
of a particular assembly sequence [from llvm-mca],
afterwards the check lines of the LV costmodel tests must be updated manually.
This is, at the very least, boring.

Okay, now we have decent costmodel coverage for interleaving shuffles,
but now basically the same mind-killing sequence has to be performed
for replication mask. I think we can improve at least the second half
of the problem, by teaching
the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize
`Instruction::ShuffleVector` that are replication masks, and
adding exhaustive test coverage
using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`.

This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.

This patch adds precise undef-aware `isReplicationMask()`,
with exhaustive test coverage.
* `InstructionsTest.ShuffleMaskIsReplicationMask` shows that
   it correctly detects all the known masks.
* `InstructionsTest.ShuffleMaskIsReplicationMask_undef`
  shows that replacing some mask elements in a known replication mask
  with undef still allows us to recognize it as a replication mask.
  Note, with enough undef elts, we may detect a different tuple.
* `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness`
  shows that if we detected the replication mask with given params,
  then if we actually generate a true replication mask with said params,
  it matches element-wise ignoring undef mask elements.
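
To make the mask shape concrete, here is a simplified Python model. It is
a sketch, not the algorithm used by the patch: `UNDEF = -1`, the function
names, and the "try the smallest factor first" search order are all
assumptions made for this example.

```python
UNDEF = -1  # stand-in for an undef mask element (assumption for this sketch)


def make_replication_mask(replication_factor: int, vf: int) -> list:
    """Repeat each of the vf element indices replication_factor times."""
    return [elt for elt in range(vf) for _ in range(replication_factor)]


def is_replication_mask(mask: list):
    """Return (replication_factor, vf) if mask fits some replication mask,
    treating UNDEF elements as wildcards; otherwise return None."""
    n = len(mask)
    for rf in range(1, n + 1):
        if n % rf != 0:
            continue
        if all(m == UNDEF or m == i // rf for i, m in enumerate(mask)):
            return rf, n // rf
    return None
```

With many undef elements more than one (factor, VF) pair can fit the same
mask, which is why a detector may report a different tuple than the one the
mask was generated from.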

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113214

[NFC] Move CombinationGenerator from Exegesis to ADT

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D113213

[AArch64] Add target DAG combine for UUNPKHI/LO

When a UUNPKLO/HI node is created with an undef input, the
output should also be undef. I've added a target DAG combine
function to ensure we avoid creating an unnecessary uunpklo/hi
instruction.

Differential Revision: https://reviews.llvm.org/D113266

[NFC] Inclusive language: Remove instances of master in URLs

[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.

Reviewed By: #libc, ldionne, mehdi_amini

Differential Revision: https://reviews.llvm.org/D113186

[DAG] FoldConstantArithmetic - rename NumOps -> NumElts. NFC.

NumOps represents the number of elements for vector constant folding. Rename it to NumElts so that in the future we can consistently use NumOps to represent the number of operands of the opcode.

Minor cleanup before trying to begin generalizing FoldConstantArithmetic to support opcodes other than binops.

[gn build] (manually) port df0ba47c36f

[AArch64] Fix a bug from a pattern for uaddv(uaddlp(x)) ==> uaddlv

A pattern selected the wrong uaddlv MI. It should be as below.

uaddv(uaddlp(v8i8)) ==> uaddlv(v8i8)
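
A small Python model of why the rewrite is value-preserving for v8i8 input,
with vectors as lists of unsigned integers. The helpers only mimic the
arithmetic of the three operations; they are not the instruction selection
patterns themselves.

```python
def uaddlp(v: list, elt_bits: int) -> list:
    """Pairwise add adjacent lanes, widening each sum to 2 * elt_bits."""
    wide_mask = (1 << (2 * elt_bits)) - 1
    return [(v[i] + v[i + 1]) & wide_mask for i in range(0, len(v), 2)]


def uaddv(v: list, elt_bits: int) -> int:
    """Add across the vector, keeping the result at elt_bits."""
    return sum(v) & ((1 << elt_bits) - 1)


def uaddlv(v: list, elt_bits: int) -> int:
    """Add across the vector, widening the result to 2 * elt_bits."""
    return sum(v) & ((1 << (2 * elt_bits)) - 1)
```

For eight 8-bit lanes the largest possible sum is 8 * 255 = 2040, which fits in
16 bits, so `uaddv(uaddlp(x, 8), 16)` and `uaddlv(x, 8)` agree.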

Differential Revision: https://reviews.llvm.org/D113263

[FreeBSD] Do not mark __stack_chk_guard as dso_local

This symbol is defined in libc.so so it is definitely not DSO-Local.
Marking it as such causes problems on some platforms (such as PowerPC).

Differential revision: https://reviews.llvm.org/D109090

Enable -Wformat-pedantic and fix fallout.

Differential Revision: https://reviews.llvm.org/D113172

[DAG] FoldConstantArithmetic - fold bitlogic(bitcast(x),bitcast(y)) -> bitcast(bitlogic(x,y))

To constant fold bitwise logic ops where we've legalized constant build vectors to a different type (e.g. v2i64 -> v4i32), this patch adds a basic ability to peek through the bitcasts and perform the constant fold on the inner operands.
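
The fold relies on bitwise logic commuting with a bitcast. A minimal Python
sketch of that property follows, assuming a little-endian lane layout and
using made-up helper names:

```python
import struct


def bitcast_v2i64_to_v4i32(v: list) -> list:
    """Reinterpret two 64-bit lanes as four 32-bit lanes (little-endian)."""
    return list(struct.unpack("<4I", struct.pack("<2Q", *v)))


def and_lanes(a: list, b: list) -> list:
    """Lane-wise bitwise AND of two equally sized vectors."""
    return [x & y for x, y in zip(a, b)]


x = [0x0123456789ABCDEF, 0xFFFF0000FFFF0000]
y = [0x00FF00FF00FF00FF, 0x0F0F0F0F0F0F0F0F]

# and(bitcast(x), bitcast(y)) == bitcast(and(x, y))
outer = and_lanes(bitcast_v2i64_to_v4i32(x), bitcast_v2i64_to_v4i32(y))
inner = bitcast_v2i64_to_v4i32(and_lanes(x, y))
```

Because AND, OR and XOR act on each bit independently, regrouping the bits into
different lane widths does not change the result.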

The MVE predicate v2i64 regressions will be addressed by future support for basic v2i64 type support.

One of the yak shaving fixes for D113192....

Differential Revision: https://reviews.llvm.org/D113202

[InstCombine] Add additional tests for converting to sadd.sat with sign bits. NFC

[fir] Add fir.select and fir.select_rank FIR to LLVM IR conversion patterns

The `fir.select` and `fir.select_rank` are lowered to llvm.switch.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D113089

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[LangRef][VP] Document vp.gather and vp.scatter intrinsics

This patch fleshes out the missing documentation for the final two VP
intrinsics introduced in D99355: `llvm.vp.gather` and `llvm.vp.scatter`.
It does so mostly by deferring to the `llvm.masked.gather` and
`llvm.masked.scatter` intrinsics, respectively.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D112997

[mlir][python] fix constructor generation for optional operands in presence of segment attribute

The ODS-based Python op bindings generator has been generating an incorrect
specification of the operand segment in the presence of both optional and variadic
operand groups: optional groups were treated as variadic whereas they require
separate treatment. Make sure they are treated separately. Also harden the tests around
generated op constructors as they could hitherto accept the code for both
optional and variadic arguments.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113259

[X86] Enable v32i16 rotate lowering on non-BWI targets

Fixes one of the regressions in D113192

[ARM] Extra MVE constant select test. NFC

[LangRef][VP] Document vp.load and vp.store intrinsics

This patch fleshes out the missing documentation for two of the VP
intrinsics introduced in D99355: `llvm.vp.load` and `llvm.vp.store`. It
does so mostly by deferring to the `llvm.masked.load` and
`llvm.masked.store` intrinsics, respectively.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D112930

[Sema][NFC] Add tests for builtin spaceship operator.

In preparation for D112453.

[Polly][Isl] Use the function unsignedFromIslSize to manage a isl::size object. NFCI

This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in lib/External/isl/include/isl/isl-noexceptions.h and the official isl C++ interface.
In the official interface the type `isl::size` cannot be cast to an unsigned without first checking that it contains a valid value with the function `isl::size::is_error()`.
For this reason two helper functions have been added:
- `IslAssert`: assert that no errors are present in debug builds and just disables the mandatory error check in non-debug builds
- `unsignedFromIslSize`: cast the `isl::size` object to `unsigned`

Changes made:
- Add the functions `IslAssert` and `unsignedFromIslSize`
- Add the utility function `rangeIslSize()`
- Retype `MaxDisjunctsInDomain` from `int` to `unsigned`
- Retype `RunTimeChecksMaxAccessDisjuncts` from `int` to `unsigned`
- Retype `MaxDimensionsInAccessRange` from `int` to `unsigned`
- Replace some usages of `isl_size` with `unsigned` since we aim not to use `isl_size` anymore
- `isl-noexceptions.h` has been generated by https://github.com/patacca/isl/commit/e704f73c88f0b4d88e62e447bdb732cf5914094b

No functional change intended.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113101

[PowerPC] use correct selection for v16i8/v8i16 splat load

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D113236

Revert "[TwoAddressInstructionPass] Update existing physreg live intervals"

This reverts commit ec0e1e88d24fadb2cb22f431d66b22ee1b01cd43.

It was pushed by mistake.

[AMDGPU] NFC formatting fixes in SIMemoryLegalizer

[TwoAddressInstructionPass] Update existing physreg live intervals

In TwoAddressInstructionPass::processTiedPairs with
-early-live-intervals, update any preexisting physreg live intervals,
as well as virtreg live intervals. By default (without
-precompute-phys-liveness) physreg live intervals only exist for
registers that are live-in to some basic block.

Differential Revision: https://reviews.llvm.org/D113191

[mlir][linalg][bufferize] Move bufferizesToAliasOnly to extraClassDecls

By doing so, the method can no longer be reimplemented.

Differential Revision: https://reviews.llvm.org/D113248

Fix `insertFunctionArguments()` block argument order.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113171

Add Bazel support for LLVM_WINDOWS_PREFER_FORWARD_SLASH

This was added in df0ba47c36f6bd0865e3286853b76d37e037c2d7

[PowerPC] Add intrinsic to convert between ppc_fp128 and fp128

ppc_fp128 and fp128 are both 128-bit floating point types. However, we
can't do conversion between them now, since trunc/ext are not allowed
for same-size fp types.

This patch adds two new intrinsics: llvm.ppc.convert.f128.to.ppcf128 and
llvm.ppc.convert.ppcf128.to.f128, to support such conversion.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D109421

[Support] Allow configuring the preferred type of slashes on Windows

Default to preferring forward slashes when built for MinGW, as
many use cases, e.g. when Clang is used as a drop-in replacement
for GCC, require the compiler to output paths with forward slashes.

Not all tests pass yet when configured to prefer forward slashes, though.

Differential Revision: https://reviews.llvm.org/D112787

[Support] [Windows] Convert paths to the preferred form

This normalizes most paths (except ones input from the user as command
line arguments) into the preferred form, if `real_style()` evaluates to
`windows_forward`.

Differential Revision: https://reviews.llvm.org/D111880

[Support] Add a new path style for Windows with forward slashes

This behaves just like the regular Windows style, with both separator
forms accepted, but with get_separator() returning forward slashes.

Add a more descriptive name for the existing style, keeping the old
name around as an alias initially.

Add a new function `make_preferred()` (like the C++17
`std::filesystem::path` function with the same name), which converts
windows paths to the preferred separator form (while this one works on
any platform and takes a `path::Style` argument).

Contrary to `native()` (just like `make_preferred()` in `std::filesystem`),
this doesn't do anything at all on Posix, it doesn't try to reinterpret
backslashes into forward slashes there.
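
The behaviour described above can be modelled in a few lines of Python. This
is a sketch only: the style tags are plain strings chosen for the example
(`"windows_backslash"` in particular is an assumed name for the existing
style), and the real functions live in the C++ Support library.

```python
def get_separator(style: str) -> str:
    """Preferred separator for a style tag (hypothetical string tags)."""
    return "\\" if style == "windows_backslash" else "/"


def make_preferred(path: str, style: str) -> str:
    """Rewrite separators to the preferred form for Windows styles only."""
    if style == "windows_backslash":
        return path.replace("/", "\\")
    if style == "windows_forward":
        return path.replace("\\", "/")
    # posix: a backslash is an ordinary character, so leave the path alone
    return path
```

Both Windows styles accept either separator on input and differ only in which
one they emit.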

Differential Revision: https://reviews.llvm.org/D111879

Revert "[Attr] support btf_type_tag attribute"

This reverts commits 737e4216c537c33aab8ec51880f06b8a54325b94 and
ce7ac9e66aba2b937b3d3b5505ce6cc75dcc56ac.

After those commits, the compiler can crash with a reduced
testcase like this:

$ cat reduced.c
void a(*);
void a() {}
$ clang -c reduced.c -O2 -g

[libunwind] Try to add --unwindlib=none while configuring and building libunwind

If Clang is set up to link directly against libunwind (via the
--unwindlib option, or the corresponding builtin default option),
configuring libunwind will fail while bootstrapping (before the
initial libunwind is built), because every cmake test will
fail due to -lunwind not being found, and linking the shared library
will fail similarly.

Check if --unwindlib=none is supported, and add it in that case.
Using check_c_compiler_flag on its own doesn't work, because that only
adds the tested flag to the compilation command, and if -lunwind is
missing, the linking step would still fail - instead try adding it
to CMAKE_REQUIRED_FLAGS and restore the variable if it doesn't work.

This avoids having to pass --unwindlib=none while building libunwind.

Differential Revision: https://reviews.llvm.org/D112126

[NPM] Fix bug in llvm/utils/reduce_pipeline.py

Last minute changes in https://reviews.llvm.org/D110908 unfortunately
introduced a bug wrt automatic pipeline expansion. This patch fixes that
as well as gets rid of a few redundant variables.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D113177

[asan] compiler-rt version of D113143

Fix some issues with the gdb pretty printers for llvm::Twine

Still some pending bugs, but at least ironed some things out.

[Preprocessor] Fix newline before/after _Pragma.

The PragmaAssumeNonNullHandler (and maybe others) passes an invalid
SourceLocation to its callback, hence PrintPreprocessedOutput does not
know how many lines to insert between the previous token and the
pragma and does nothing.

With this patch we instead assume that the unknown token is on the same
line as the previous such that we can call the procedure that also emits
semantically significant whitespace.

Fixes bug reported here: https://reviews.llvm.org/D104601#3105044

[Preprocessor] Fix warning: left and right subexpressions are identical. NFCI.

This is reported by msvc as
warning C6287: redundant code: the left and right subexpressions are identical

EmittedDirectiveOnThisLine implies EmittedTokensOnThisLine
making this an NFC change. To be on the safe side and because both of
them are checked at other places as well, we continue to check both.

Compiler warning reported here:
https://reviews.llvm.org/D104601#2957333

[PowerPC] address post-commit comments for D106555; NFC

Address nemanjai's post-commit comments.

[lld-macho] Replace LC_LINKER_OPTION parsing

This removes the tablegen based parsing of LC_LINKER_OPTION since it can
only actually contain a very small number of potential arguments. In our
project with tablegen this took 5 seconds before.

This replaces https://reviews.llvm.org/D113075

Differential Revision: https://reviews.llvm.org/D113235

[mlir][linalg][bufferize] Separate pass from ComprehensiveBufferize

This commit separates the bufferization from the bufferization pass in Linalg. This allows other dialects to use ComprehensiveBufferize more easily.

This commit mainly moves files to a new directory and adds a new build target.

Differential Revision: https://reviews.llvm.org/D112989

[lld-macho] Fix an assertion failure when -u specifies an undefined section$start symbol

This matches ld64. Also improve the test for `-dead_strip`.

Reviewed By: #lld-macho, Jez Ng

Differential Revision: https://reviews.llvm.org/D113147

[X86][MS-InlineAsm][test] Add triple in ms-inline-asm-array.ll

Fix the LIT test failure on Mac, which is reported in D113096.

[mlir][linalg][bufferize][NFC] Simplify AllocationCallbacks

AllocationCallbacks functions allocate/deallocate only. They no longer set the insertion point.

This is in preparation of decoupling ComprehensiveBufferize from the Linalg dialect.

Differential Revision: https://reviews.llvm.org/D112991

[mlir][linalg][bufferize] Decouple BufferizationAliasInfo

Move dialect-specific and analysis-specific function out of BufferizationAliasInfo. BufferizationAliasInfo's only job now is to keep track of aliases.

This is in preparation of further decoupling ComprehensiveBufferize from various dialects.

Differential Revision: https://reviews.llvm.org/D112992

[mlir][linalg][bufferize] Add isWritable to op interface

By default, OpResult buffers are writable. But there are ops (e.g., ConstantOp) for which this is not the case.

The purpose of this commit is to further decouple Comprehensive Bufferize from the Standard dialect.

Differential Revision: https://reviews.llvm.org/D112908

[OpaquePtr] Fix initialization-order-fiasco

Asan detects it after D112732.

[mlir][linalg][bufferize] Add MemCpyFn to AllocationCallbacks struct

This is in preparation of decoupling BufferizableOpInterface, Comprehensive Bufferize and dialects.

The goal of this CL is to make `getResultBuffer` (and other `bufferize` functions) independent of `LinalgOps`.

Differential Revision: https://reviews.llvm.org/D112907

[NFC] Don't set rlimit in test with MSAN

[NFC] Disabled few tests with MemoryWithOrigins

They pass regular MemorySanitizer, but hang with origin
tracking.

[X86][MS-InlineAsm] Add constraint *m for memory access w/ global var

Constraint `*m` should be used when the address of a variable is passed
as a value. The constraint is missing for MS inline assembly when something
is written to the address of the variable.

The missing constraint causes the front end to delete the definition of the
static variable, which then results in an "undefined reference to xxx" error.

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D113096

[lld-macho] Clear resolvedReads cache

https://reviews.llvm.org/D113153#3108083

smeenai, int3

Differential Revision: https://reviews.llvm.org/D113198

[mlir][linalg][bufferize] Remove redundant methods from op interface

These two methods are redundant and removed:
* `bufferizesToAliasOnly`: If not `bufferizesToMemoryRead` and not `bufferizesToMemoryWrite` but `getAliasingOpResult` returns a non-null value, we know that this OpOperand is alias-only. This method now has a default implementation and does not have to be implemented.
* `getInplaceableOpResult`: The analysis does not differentiate between "inplaceable" and "aliasing". The only thing that matters is whether or not OpOperand and OpResult are aliasing. That is the key property that makes buffer copies necessary.

Differential Revision: https://reviews.llvm.org/D112902

[mlir][sparse] implement full reduction "scalarization" across loop nests

The earlier reduction "scalarization" was only applied to a chain of
*innermost* and *for* loops. This revision generalizes this to any
nesting of for- and while-loops. This implies that reductions can be
implemented with a lot less load and store operations. The chaining
is implemented with a forest of yield statements (but not as bad as
when we would also include the while-induction).

Fixes https://bugs.llvm.org/show_bug.cgi?id=52311

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113078

[ASan] Added stack safety support in address sanitizer.

Added and implemented the -asan-use-stack-safety flag, which controls whether ASan uses the Stack Safety results to emit less code for operations which are marked as 'safe' by the static analysis.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D112098

[NewPM] Make eager analysis invalidation per-adaptor

Follow-up change to D111575.
We don't need eager invalidation on every adaptor. Most notably,
adaptors running passes that use very few analyses, or passes that
purely invalidate specific analyses.

Also allow testing of this via a pipeline string
"function<eager-inv>()".

The compile time/memory impact of this is very comparable to D111575.
https://llvm-compile-time-tracker.com/compare.php?from=9a2eec512a29df45c90c2fcb741e9d5c693b1383&to=b9f20bcdea138060967d95a98eab87ce725b22bb&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D113196

BPF: Support btf_type_tag attribute

A new kind BTF_KIND_TYPE_TAG is defined. The tags associated
with a pointer type are emitted in their IR order as modifiers.
For example, for the following declaration:
int __tag1 * __tag1 __tag2 *g;
The BTF type chain will look like
VAR(g) -> __tag1 -> __tag2 -> pointer -> __tag1 -> pointer -> int
In the above, "->" means BTF CommonType.Type, which indicates
the pointed-to type.
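
The ordering in the chain can be reproduced with a tiny Python helper. It only
models the order of the entries as described above (tags first, then the
pointer they belong to); `btf_type_chain` and its arguments are hypothetical
and have nothing to do with the actual BTF encoding.

```python
def btf_type_chain(var_name: str, pointer_levels: list, base_type: str) -> str:
    """Build the chain text, outermost pointer first.

    pointer_levels holds one list of tags per pointer level, in IR order.
    """
    chain = [f"VAR({var_name})"]
    for tags in pointer_levels:
        chain.extend(tags)       # tags are emitted as modifiers ...
        chain.append("pointer")  # ... ahead of the pointer they decorate
    chain.append(base_type)
    return " -> ".join(chain)


# int __tag1 * __tag1 __tag2 *g;
example = btf_type_chain("g", [["__tag1", "__tag2"], ["__tag1"]], "int")
```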

Differential Revision: https://reviews.llvm.org/D113222

Canonicalization for add to no-op if one of the inputs is zero

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113207

[libcxxabi] Fix NO_THREADS version of test_exception_storage.pass.cpp

`thread_code` returns param, which for NO_THREADS is going to be
`&thread_globals`. Thus, the return value will never be null. The test
was probably meant to check if `*thread_code(&thread_globals) == 0`.
However, to avoid the extra cast, and to bring the NO_THREADS version
more in line with the regular version of the test, this changes it to
check if thread_globals == 0 directly.

Reviewed By: ldionne, #libc_abi

Differential Revision: https://reviews.llvm.org/D113048

BPF: fix a buildbot test failure

Commit 737e4216c537 ("[Attr] support btf_type_tag attribute")
added btf_type_tag support in llvm. Buildbot reported a
failure with attr-btf_type_tag.ll.

  ; CHECK-NEXT: DW_AT_type (0x[[T1:[0-9]+]] "int ***")

  <stdin>:15:2: note: possible intended match here
   DW_AT_type (0x0000002f "int ***")

The pattern [0-9]+ is not enough to match 0000002f, we
need [0-9a-f]+. This patch fixed the issue.
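
The mismatch is easy to reproduce with an ordinary regular expression. The
sketch below uses Python's `re` module as a stand-in for FileCheck (whose
pattern syntax differs); `check_matches` is a helper invented for this example.

```python
import re

LINE = 'DW_AT_type (0x0000002f "int ***")'


def check_matches(digit_class: str, line: str = LINE):
    """Return the captured address if the CHECK-like pattern matches."""
    pattern = r'DW_AT_type \(0x(' + digit_class + r'+) "int \*\*\*"\)'
    m = re.search(pattern, line)
    return m.group(1) if m else None


decimal_only = check_matches("[0-9]")   # stops at the trailing 'f'
hex_digits = check_matches("[0-9a-f]")  # consumes the whole address
```

With `[0-9]+` the match stops before the `f`, so the following space in the
pattern cannot match and the whole check fails.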

[OpenMP] Build device runtimes for sm_86

Reviewed By: carlo.bertolli

Differential Revision: https://reviews.llvm.org/D113111

[OpenMP] Introduce the keepAlive function into the old device RT

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D113110

[OpenMP][NFCI] Cleanup new device RT mapping interface

Minimize the `impl` interface and clean up some uses of mapping
functions.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D112154

[indvars] Use loop guards when canonicalizing exit conditions

This extends the logic in canonicalizeExitConditions to use loop guards to specialize the SCEV of the loop invariant term before querying its range.

[NewPM] Use the default AA pipeline by default

We almost always want to use the default AA pipeline. It's very easy for
users of PassBuilder to forget to customize the AAManager to use the
default AA pipeline (for example, the NewPM C API forgets to do this).

If somebody wants a custom AA pipeline, similar to what is being done
now with the default AA pipeline registration, they can

FAM.registerPass([&] { return std::move(MyAA); });

before calling

PB.registerFunctionAnalyses(FAM);

For example, LTOBackend.cpp and NewPMDriver.cpp do this.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D113210

[ORC] Add a utility for adding missing "self" relocations to a Symbol

If a tool wants to introduce new indirections via stubs at link-time in
ORC, it can cause fidelity issues around the address of the function if
some references to the function do not have relocations. This is known
to happen inside the body of the function itself on x86_64 for example,
where a PC-relative address is formed, but without a relocation.

```
_foo:
leaq -7(%rip), %rax ## form pointer to '_foo' without relocation

_bar:
leaq (%rip), %rax ## uses X86_64_RELOC_SIGNED to '_foo'
```

The consequence of introducing a stub for such a function at link time
is that if it forms a pointer to itself without relocation, it will not
have the same value as a pointer from outside the function. If the
function pointer is used as a key, this can cause problems.

This utility provides best-effort support for adding such missing
relocations using MCDisassembler and MCInstrAnalysis to identify the
problematic instructions. Currently it is only implemented for x86_64.

Note: the related issue with call/jump instructions is not handled
here, only forming function pointers.

rdar://83514317

Differential revision: https://reviews.llvm.org/D113038

[libcxx][NFC] tidy up money_get::__do_get's sign parsing

Same logic, but much easier to read this way

Reviewed By: ldionne, #libc, Mordante

Differential Revision: https://reviews.llvm.org/D112958

DebugInfo: Fix incorrect line table lookup when resolving decl_file from a split unit

Specifically in DWARFv5 the unit for the line table entry was correct
but the context was incorrect - leading to looking up .debug_line_str in
the dwp instead of the executable.

(perhaps we could/should remove the context pointer entirely, and rely
on the one in the unit... I might try that as a separate follow-up
commit)

[indvars] Allow rotation (narrowing) of exit test when discovering trip count

This relaxes the one-use requirement on the rotation transform specifically for the case where we know we're zexting an IV of the loop. This allows us to discover trip count information in SCEV, which seems worth a single extra loop invariant truncate. Honestly, I'd prefer if SCEV could just compute the trip count directly (e.g. D109457), but this unblocks practical benefit.

[mlir][core] Slightly improved attribute lookup

- String binary search does 1 less string comparison
- Identifier linear scan on large attribute list is switched to string binary search
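
For readers unfamiliar with the approach, here is a generic binary search over
a name-sorted attribute list in Python. It is only a sketch of the general
technique; `lookup_attr` and the data layout are assumptions, and it does not
reproduce MLIR's actual implementation or its comparison-count optimization.

```python
def lookup_attr(sorted_attrs: list, name: str):
    """Find name in a list of (name, value) pairs sorted by name."""
    lo, hi = 0, len(sorted_attrs)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_attrs[mid][0] < name:
            lo = mid + 1
        else:
            hi = mid
    # lo is now the first position whose name is not less than the target
    if lo < len(sorted_attrs) and sorted_attrs[lo][0] == name:
        return sorted_attrs[lo][1]
    return None


attrs = sorted([("sym_name", "foo"), ("type", "i32"), ("align", 8)])
```

A lookup over n attributes then needs about log2(n) string comparisons instead
of up to n for a linear scan.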

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D112970

[OpenMP] Add parsing/sema/serialization for 'bind' clause.

Differential Revision: https://reviews.llvm.org/D113154

[InstCombine] Precommit updated and-xor-or.ll tests. NFC.

Tests for:

(~(a | b) & c) | ~(a | (b | c)) -> ~(a | b)
(~(a | b) & c) | ~(b | (a | c)) -> ~(a | b)
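
Both identities can be confirmed by brute force. The following Python sketch
checks them for every combination of 4-bit inputs; the function names are
illustrative only.

```python
from itertools import product


def lhs1(a: int, b: int, c: int, mask: int) -> int:
    return ((~(a | b) & c) | ~(a | (b | c))) & mask


def lhs2(a: int, b: int, c: int, mask: int) -> int:
    return ((~(a | b) & c) | ~(b | (a | c))) & mask


def rhs(a: int, b: int, mask: int) -> int:
    return ~(a | b) & mask


def identities_hold(bits: int = 4) -> bool:
    """Exhaustively compare both left-hand sides against ~(a | b)."""
    mask = (1 << bits) - 1
    return all(
        lhs1(a, b, c, mask) == rhs(a, b, mask) == lhs2(a, b, c, mask)
        for a, b, c in product(range(1 << bits), repeat=3)
    )
```

The reason is that `~(a | b | c)` equals `~(a | b) & ~c`, so the two terms
cover both the `c` and `~c` cases of `~(a | b)`.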

[Attr] support btf_type_tag attribute

This patch adds clang codegen and llvm support
for the btf_type_tag attribute. Currently, btf_type_tag
attribute info is preserved in DebugInfo IR only for
pointer types associated with typedef, global variable
and function declarations. Eventually, such information
is emitted to dwarf.

The following is an example:
  $ cat test.c
  #define __tag __attribute__((btf_type_tag("tag")))
  int __tag *g;
  $ clang -O2 -g -c test.c
  $ llvm-dwarfdump --debug-info test.o
  ...
  0x0000001e:   DW_TAG_variable
                  DW_AT_name      ("g")
                  DW_AT_type      (0x00000033 "int *")
                  DW_AT_external  (true)
                  DW_AT_decl_file ("/home/yhs/test.c")
                  DW_AT_decl_line (2)
                  DW_AT_location  (DW_OP_addr 0x0)

  0x00000033:   DW_TAG_pointer_type
                  DW_AT_type      (0x00000042 "int")

  0x00000038:     DW_TAG_LLVM_annotation
                    DW_AT_name    ("btf_type_tag")
                    DW_AT_const_value     ("tag")

  0x00000041:     NULL

  0x00000042:   DW_TAG_base_type
                  DW_AT_name      ("int")
                  DW_AT_encoding  (DW_ATE_signed)
                  DW_AT_byte_size (0x04)

  0x00000049:   NULL

Basically, a DW_TAG_LLVM_annotation tag will be inserted
under DW_TAG_pointer_type tag if that pointer has a btf_type_tag
associated with it.

Differential Revision: https://reviews.llvm.org/D111199

[indvars] Extend canonicalizeExitConditions to inverted operands

As discussed in the original reviews, but done in a follow on.

[Clang][Attr] Support btf_type_tag attribute

This patch introduced btf_type_tag attribute. The attribute
is a type attribute and intends to address the below
linux use cases.
    typedef int __user *__intp;
    int foo(int __user *arg, ...)
    static int do_execve(struct filename *filename,
        const char __user *const __user *__argv,
        const char __user *const __user *__envp)

Here __user in the kernel is defined as
    __attribute__((noderef, address_space(__user)))
for sparse ([1]) type checking mode.

For normal clang compilation, we intend to replace it with
    __attribute__((btf_type_tag("user")))
and record such information in dwarf and BTF so that it
can later be used in the kernel for bpf verification
or for other tracing functionalities.

  [1] https://www.kernel.org/doc/html/v4.11/dev-tools/sparse.html

Differential Revision: https://reviews.llvm.org/D111199

[AMDGPU] Changes the AMDGPU_Gfx calling convention by making the SGPRs 4..29 callee-save. This is to avoid superfluous s_movs when executing amdgpu_gfx function calls as the callee is likely not going to change the argument values.

This patch changes the AMDGPU_Gfx calling convention. It defines the SGPR registers s[4:29] as callee-save and leaves some SGPRs usable for callers. The intention is to avoid unnecessary s_mov instructions for arguments the caller would otherwise save and restore in these registers.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D111637

[Support] Improve Caching conformance with Support library behavior

This diff makes several amendments to the local file caching mechanism
which was migrated from ThinLTO to Support in
rGe678c51177102845c93529d457b020f969125373 in response to follow-up
discussion on that commit.

Patch By: noajshu

Differential Revision: https://reviews.llvm.org/D113080

[Flang][OpenMP] Use the ultimate symbol in a call to the IsPointer function

The IsPointer check currently fails for host-associated symbols in OpenMP
regions. This causes some failures in semantic checks for pointer association
in an OpenMP region. Fix is to use the ultimate symbol in the call to the
IsPointer function in CheckPointerAssignment function in
lib/Semantics/pointer-assignment.cpp.

Reviewed By: klausler, peixin

Differential Revision: https://reviews.llvm.org/D112876

[X86][SSE] Regenerate vector funnel shift tests

[mlir][ods] Op::verify should not call OpAdaptor::verify

OpAdaptor::verify performs string lookups on an attribute dictionary. By
calling OpAdaptor::verify, Op::verify is not able to use cached attribute
identifiers for faster lookups.

Reviewed By: jpienaar, rriddle

Differential Revision: https://reviews.llvm.org/D113039

[llvm][adt] make_first_range returning reference to temporary

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D112957

[ARM] Move VPTBlock pass after post-ra scheduling

Currently when tail predicating loops, vpt blocks need to be created
with the vctp predicate in case we need to revert to non-tail predicated
form. This has the unfortunate side effect of severely hampering post-ra
scheduling at times as the instructions are already stuck in vpt blocks,
not allowed to be independently ordered.

This patch addresses that by just moving the creation of VPT blocks
later in the pipeline, after post-ra scheduling has been performed. This
allows more optimal scheduling post-ra before the vpt blocks are
created, leading to more optimal tail predicated loops.

Differential Revision: https://reviews.llvm.org/D113094

[libc] add stpcpy and stpncpy

Adds implementations of stpcpy and stpncpy, which are POSIX extension
functions.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D111913

[WebAssembly] Fix debug locations for ExplicitLocals pass

This is a reworked version of the reverted patch: https://reviews.llvm.org/D112487
Note that
a) it doesn't need the test changes anymore, and
b) I checked at least locally it passes other.test_pthread_lsan_leak

Differential Revision: https://reviews.llvm.org/D113208

[libc++] Improve no wide characters configuration.

When wide characters are supported, libc++ manually translates a
`narrow non-breaking space` and a `non-breaking space` to a space.
This behaviour wasn't available when wide characters were disabled.
This enables an emulation for that configuration.

Updating the libc++ Docker image to Ubuntu Focal caused some breakage.
The affected tests were temporarily disabled in D112737. This re-enables
four of these tests.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D113133

[libc++] Remove non-atomic "platform" semaphore implementations.

These can't be made constexpr-constructible (constinit'able),
so they aren't C++20-conforming. Also, the platform versions are
going to be bigger than the atomic/futex version, so we'd have
the awkward situation that `semaphore<42>` could be bigger than
`semaphore<43>`, and that's just silly.

Differential Revision: https://reviews.llvm.org/D110110

[mlir] Handle StringAttr in SparseElementsAttr::getZeroAttr.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D111203

[flang][flang-omp-report] Add flang-omp-report summarising script

The flang plugin ``flang-omp-report`` takes one Fortran file as input and
returns a YAML report file for it. This becomes an issue when you want to
analyse an entire project and produce one final report.
The purpose of this Python script is to generate a final YAML
report from all of the files generated by ``flang-omp-report``. The report can
(currently) have 2 formats: summary and log. Summary focuses on "summarizing"
all constructs and their clauses from all YAML files with a corresponding "count"
for each. Log instead combines the generated YAML files into one report in a
"cleaner" format. (Pseudo) examples can be found for both formats at the top of
the script.

Differential Revision: https://reviews.llvm.org/D111042

Co-Authored by: Ivan Zhechev <ivan.zhechev@arm.com>