review.tizen.org Git - platform/upstream/llvm.git/log

[lld][WebAssembly] stub objects: Fix handling of LTO libcall dependencies

This actually simplifies the code by performs a pre-pass of the stub
objects prior to LTO.

This should be the final change needed before we can make the switch
on the emscripten side: https://github.com/emscripten-core/emscripten/pull/18905

Differential Revision: https://reviews.llvm.org/D148287

[TableGen] Allow references to class template arguments in defvar

We can't refer to template arguments for defvar statements in class
definitions, or it will report some errors like:

```
error: Variable not defined: 'xxx'.
```

The key point here is we used to pass nullptr to `ParseValue` in
`ParseDefvar`. As a result, we can't refer to template arguments
since `CurRec` is nullptr in `ParseIDValue`.

So we add an argument `CurRec` to `ParseDefvar` and provide it
when parsing defvar statements in class definitions.

Reviewed By: tra, simon_tatham

Differential Revision: https://reviews.llvm.org/D148197

[libc++] Fix `generate_ignore_format.sh` and regenerate the file.

The script incorrectly produced double slashes in paths, e.g.
`libcxx/src//thread.cpp`.

[lldb][test] Fix -Wsign-compare in RegisterFlagsTest.cpp (NFC)

/data/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:1526:11: error: comparison of integers of different signs: 'const unsigned long long' and 'const int' [-Werror,-Wsign-compare]
  if (lhs == rhs) {
      ~~~ ^  ~~~
/data/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:1553:12: note: in instantiation of function template specialization 'testing::internal::CmpHelperEQ<unsigned long long, int>' requested here
    return CmpHelperEQ(lhs_expression, rhs_expression, lhs, rhs);
           ^
/data/llvm-project/lldb/unittests/Target/RegisterFlagsTest.cpp:128:3: note: in instantiation of function template specialization 'testing::internal::EqHelper::Compare<unsigned long long, int, nullptr>' requested here
  ASSERT_EQ(0x12345678ULL, rf.ReverseFieldOrder(0x12345678));
  ^
/data/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:2056:32: note: expanded from macro 'ASSERT_EQ'
                               ^
/data/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:2040:54: note: expanded from macro 'GTEST_ASSERT_EQ'
  ASSERT_PRED_FORMAT2(::testing::internal::EqHelper::Compare, val1, val2)
                                                     ^
1 error generated.

Revert "[flang] REAL(KIND=3) and COMPLEX(KIND=3) descriptors"

This reverts commit 17a4fcecf40ee5191ab05b27a58ac37e5f57261d.

[RISCV] Support vector strict_fsetcc/fsetccs.

The patch supports vector strict_fsetcc/fsetccs. Instead of revserving fflags,
the method to implement scalar quiet compares, the patch implement quiet
compares by masking the signaling compares when either input is NaN [0].

[0]: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-floating-point-compare-instructions

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147998

[flang] REAL(KIND=3) and COMPLEX(KIND=3) descriptors

Update descriptor generation to correctly set the `type` field for
REAL(3) and COMPLEX(3) objects.

[lldb] Use ObjectFileJSON to create modules for interactive crashlogs

Create an artificial module using a JSON object file when we can't
locate the module and dSYM through dsymForUUID (or however
locate_module_and_debug_symbols is implemented). By parsing the symbols
from the crashlog and making them part of the JSON object file, LLDB can
symbolicate frames it otherwise wouldn't be able to, as there is no
module for it.

For non-interactive crashlogs, that never was a problem because we could
simply show the "pre-symbolicated" frame from the input. For interactive
crashlogs, we need a way to pass the symbol information to LLDB so that
it can symbolicate the frames, which is what motivated the JSON object
file format.

Differential revision: https://reviews.llvm.org/D148172

Generate staging `MachineValueType.h` (partially) from `ValueTypes.td`

- Implement `VTEmitter` as `llvm-tblgen -gen-vt`.
- Create a copy of `llvm/Support/MachineValueType.h` into `unittests/Support`.
It includes `GenVT.inc` generated by `VTEmitter`.
- Implement `MVTTest` in `SupportTests`. It checks equivalence between
`llvm/Support/MachineValueType.h` and the generated header.

Differential Revision: https://reviews.llvm.org/D146906

Copy `llvm/Support/MachineValueType.h` into `unittests/Support` for D146906

Differential Revision: https://reviews.llvm.org/D146907

[flang] Fix -Wmismatched-tags warning of NonTbpDefinedIoTable (NFC)

/Users/jiefu/llvm-project/flang/runtime/non-tbp-dio.h:41:1: error: 'NonTbpDefinedIoTable' defined as a struct here but previously declared as a class; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
struct NonTbpDefinedIoTable {
^
/Users/jiefu/llvm-project/flang/include/flang/Runtime/io-api.h:26:1: note: did you mean struct here?
class NonTbpDefinedIoTable;
^~~~~
struct
1 error generated.

[flang][openacc] Lower serial and serial loop construct

Lower the parse tree to acc dialects operations. Make use
of the parallel construct lowering and make it suitable
for all compute constructs lowering.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D148273

[flang] Rework handling of non-type-bound user-defined I/O

A fairly recent introduction of runtime I/O APIs called OutputDerivedType()
and InputDerivedType() didn't cover NAMELIST I/O's need to access
non-type-bound generic interfaces for user-defined derived type I/O
when those generic interfaces are defined in some scope other than the
one that defines the derived type.

The patch adds a new data structure shared between lowering
and the runtime that can represent all of the cases that can
arise with non-type-bound defined I/O. It can represent
scopes in which non-type-bound defined I/O generic interfaces
are inaccessible, too, due to IMPORT statements.

The data structure is now an operand to OutputDerivedType() and
InputDerivedType() as well as a data member in the NamelistGroup
structure.

Differential Revision: https://reviews.llvm.org/D148257

[Demangle] Remove uses of llvm::itanium_demangle::StringView::{dropBack,dropFront}. NFC

Make it easier to migrate StringView to std::string_view.

[Flang][OpenMP] Add support for logical neqv reduction in worksharing-loop

Adds support for .neqv. reductions with logical types.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D147677

[mlir][openacc] Accept acc.serial has parent of acc.yield op

acc.serial op is modeled on the acc.parallel op.
acc.yield operation must then accept acc.serial has a parent
operation.

Reviewed By: PeteSteinfeld, razvanlupusoru

Differential Revision: https://reviews.llvm.org/D148258

[llvm-profdata] Fixed various issue with Sample Profile Reader

Fixed various undefind behaviors with current Sample Profile Reader when reading unusual input. Furthermore, add the following rule on allowing multiple name table sections (current Reader has conflicted code handling such case):

When a new name table section is read (in the order sections are read), the names in the previous name table are cleared. Any subsequent sections referring to function names will index into the most recent read name table.

Also changed name table index to uint64_t to be consistent since there's a mix of using uint32_t and uint64_t.

Reviewed By: snehasish, huangjd

Differential Revision: https://reviews.llvm.org/D146182

[lldb] Use the host's target triple in TestObjectFileJSON

Use the target's triple when adding JSON modules in TestObjectFileJSON.
This should fix the test failure on non-Darwin bots.

[lldb] Make ObjectFileJSON loadable as a module

This patch adds support for creating modules from JSON object files.
This is necessary for the crashlog use case where we don't have either a
module or a symbol file. In that case the ObjectFileJSON serves as both.

The patch adds support for an object file type (i.e. executable, shared
library, etc). It also adds the ability to specify sections, which is
necessary in order specify symbols by address. Finally, this patch
improves error handling and fixes a bug where we wouldn't read more than
the initial 512 bytes in GetModuleSpecifications.

Differential revision: https://reviews.llvm.org/D148062

[VPlan] Switch to checking sinking legality for recurrences in VPlan.

Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).

Depends on D142885.
Depends on D142589.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D142886

Reland "[compiler-rt] Fix scudo build on ARM"

This reverts commit 5abef0bdbe9237b5728215bb7cd915ffb960e5c3 and
fixes Fuchsia AArch64 cross-compile.

[mlir][Vector] Add a masked vectorization of tensor.pad

This revision takes advantage of masking support to introduce a vectorized
version of pad that does not require lowering to lower-level form.

Lowering to lower-level form (if/else + generate + fill + copy + insert_slice)
creates unnecessary complexity that can be completely sidestepped by using
masked vectorization properly.

Differential Revision: https://reviews.llvm.org/D148261

[AArch64] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds

[LV] Add exit users for recurrences in test, fix names.

Add users of exit values for recurrences to make sure exit value
generation will be checked in a follow-up change.

Also adjusts/fixes naming in the test.

[mlir][Vector] Add a vector.materialize_masks transform operation

[LoopIdiomRecognize] Remove NUW flag from SCEV in getTripCount.

Based on the conversation in D147355.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148170

Attributor: Add baseline tests for nofpclass changes

ValueTracking: Add some more uitofp/sitofp tests

Before computeKnownFPClass had a use, these were
only covered by unit tests and didn't cover vectors.

[runtimes][asan] Fix swapcontext interception

Resetting oucp's stack to zero in swapcontext interception is incorrect,
since it breaks ucp cleanup after swapcontext returns in some cases:

Say we have two contexts, A and B, and we swapcontext from A to B, do
some work on Bs stack and then swapcontext back from B to A. At this
point shadow memory of Bs stack is in arbitrary state, but since we
can't know whether B will ever swapcontext-ed to again we clean up it's
shadow memory, because otherwise it remains poisoned and blows in
completely unrelated places when heap-allocated memory of Bs context
gets reused later (see https://github.com/llvm/llvm-project/issues/58633
for example). swapcontext prototype is swapcontext(ucontext* oucp,
ucontext* ucp), so in this example A is oucp and B is ucp, and i refer
to the process of cleaning up Bs shadow memory as ucp cleanup.

About how it breaks:
Take the same example with A and B: when we swapcontext back from B to A
the oucp parameter of swapcontext is actually B, and current trunk
resets its stack in a way that it becomes "uncleanupable" later. It
works fine if we do A->B->A, but if we do A->B->A->B->A no cleanup is
performed for Bs stack after B "returns" to A second time. That's
exactly what happens in the test i provided, and it's actually a pretty
common real world scenario.

Instead of resetting oucp's we make use of uc_stack.ss_flags to mark
context as "cleanup-able" by storing stack specific hash. It should be
safe since this field is not used in [get|make|swap]context functions
and is hopefully never meaningfully used in real-world scenarios (and i
haven't seen any).

Fixes https://github.com/llvm/llvm-project/issues/58633

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D137654

ValueTracking: Add baseline test for computeKnownFPClass for sin/cos

ValueTracking: Add cannotBeOrderedLessThanZero to KnownFPClass

Eventually we should be able to replace the existing
CannotBeOrderedLessThanZero.

ValueTracking: Address todo for nan fmul handling in computeKnownFPClass

If both operands can't be zero or nan, the result can't be nan.

ValueTracking: Handle no-nan check for computeKnownFPClass for fmul

Copy the logic from isKnownNeverNaN for fadd/fsub. Leave the
extension to handle the zero case for a future change.

ValueTracking: Add baseline test for fmul computeKnownFPClass handling

[X86] Only fold PTEST->TESTP on AVX targets

While PTEST is a SSE41 instruction, TESTPS/TESTPD was only added for AVX

[X86] Split 128 and 256-bit PTEST combines

We are missing SSE41 test coverage (and in fact have a bug with a PTEST->TESTP fold that is only valid on AVX).

[SLP][NFC]Remove extra semicolons after function definitions, NFC

[flang][openacc] Add proper TODO for the block construct

Trigger the not yet implemented message for
block constructs that are not lowered.

Reviewed By: PeteSteinfeld, razvanlupusoru

Differential Revision: https://reviews.llvm.org/D148253

[-Wunsafe-buffer-usage] A follow-up fix to 762af11d4c5d0bd1e76f23a07087773db09ef17d

Add -triple=arm-apple to the test `SemaCXX/warn-unsafe-buffer-usage-fixits-pre-increment.cpp`

Revert "[Modules] Remove unnecessary check when generating name lookup table in ASTWriter"

This reverts commit bc95f27337c7ed77c28e713c855272848f01802a, originally db987b9589be1eb604fcb74c85b410469e31485f.
clang/test/Modules/pr61065.cppm is retained to make relands show less diff.

There are other module-related issues that were not caught, related to
false positive errors like
"error: no matching constructor for initialization of 'union (anonymous union at ..."

Reopen #61065

Fix warnings in InstrProfTest.cpp

The warnings were introduced in https://reviews.llvm.org/D148150

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D148259

[Test][Sanitizer][atos] Disable atos-symbolized-recovery test

This test tests uses undefined behavior and is proving to be
very flakey. I am disabling for now.

Radar to add a new test: rdar://108003900

rdar://107846128

[mlir][math] Expand math.round to truncate, compare and increment.

Round functions are pushed directly to libm. This is problematic for
situations where libm is not available. This patch will decompose the
roundf function by adding 0.5 to positive number to input
(subtracting for negative) following by a truncate.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D148026

[CostModel][X86] Add latency/code-size/size-latency test coverage for integer add/sub saturation intrinsics

[Matrix] Fix IsSupported check in lowerDotProduct.

The check incorrectly checks the RHS while LHS is transformed later.
Update to check LHS, which fixes a crash in the newly added test cases.

[nfc][asan] Reformat the file

[lldb] Fix library layering after D145574

[lld][RISCV] Implement GP relaxation for R_RISCV_HI20/R_RISCV_LO12_I/R_RISCV_LO12_S.

This implements support for relaxing these relocations to use the GP
register to compute addresses of globals in the .sdata and .sbss
sections.

This feature is off by default and must be enabled by passing
--relax-gp to the linker.

The GP register might not always be the "global pointer". It can
be used for other purposes. See discussion here
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/371

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D143673

[CostModel][X86] Add SSE2 test coverage to integer add/sub saturation tests

[InstrProf][Temporal] Add weight field to traces

As discussed in [0], add a `weight` field to temporal profiling traces found in profiles. This allows users to use the `--weighted-input=` flag in the `llvm-profdata merge` command to weight traces from different scenarios differently.

Note that this is a breaking change, but since [1] landed very recently and there is no way to "use" this trace data, there should be no users of this feature. We believe it is acceptable to land this change without bumping the profile format version.

[0] https://reviews.llvm.org/D147812#4259507
[1] https://reviews.llvm.org/D147287

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D148150

[lld][WebAssembly] Trace export of symbols when specified with --trace-symbol. NFC

Differential Revision: https://reviews.llvm.org/D148190

[mlir][openacc] Add acc.serial operation

The acc.serial operation models the OpenACC serial construct.
The serial construct defines a region of a program that is to be
executed sequentially on the current device.
The operation is modelled on the acc.parallel operation and will
receive similar updates when the data operands operations will
be implemented.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D148250

Internalize AllocationBegin functions after D147005

Reviewed By: thurston

Differential Revision: https://reviews.llvm.org/D148195

[SLP]Introduce gather cost estimation function.

Introduced BoUpSLP::ShuffleCostEstimator::gather function as an initial
implementation of the gather/buildvector cost estimation for buildvector
nodes. It will allow to use general codegen infrastructure for better
cost estimation + it improves the cost estimation for the
gathers/buildvectors.

Improved part of D110978.

Differential Revision: https://reviews.llvm.org/D148174

[mlir][openacc][NFC] Use assembly format for acc.parallel

Remove the custoom parser and printer for the acc.parallel
operation and use the assembly format directly.

Reviewed By: PeteSteinfeld, razvanlupusoru

Differential Revision: https://reviews.llvm.org/D148183

[CostModel][X86] Add latency/code-size/size-latency target costs for minnum/maxnum intrinsics

Using the latest version of the script from D103695 to compare costmodel vs llvm-mca statistics.

Avoids using the default costs, which was assuming libm calls.

[clang][ExtractAPI] Complete declaration fragments for TagDecl types defined in a typedef

enums and structs declared inside typedefs have incorrect declaration fragments, where the typedef keyword and other syntax is missing.

For the following struct:

typedef struct Test {
    int hello;
} Test;
The produced declaration is:

"declarationFragments": [
  {
    "kind": "keyword",
    "spelling": "struct"
  },
  {
    "kind": "text",
    "spelling": " "
  },
  {
    "kind": "identifier",
    "spelling": "Test"
  }
],
instead the declaration fragments should represent the following

typedef struct Test {
    …
} Test;

This patch removes the condition in SymbolGraphSerializer.cpp file and completes declaration fragments

Reviewed By: dang

Differential Revision: https://reviews.llvm.org/D146385

[mlir][scf] Make whileOp builder funcs optional

Create empty block without the terminator in this case.

Differential Revision: https://reviews.llvm.org/D148137

[flang][runtime] Reset the left tab limit when flushing output

When flushing output to a non-positionable tty or socket file, reset the
left tab limit. Otherwise, non-advancing output to that file will contain
an increasing amount of leading spaces in each flush. Also, detect
newline characters in stream output, and treat them as record
advancement.

Differential Revision: https://reviews.llvm.org/D148157

[StackProtector] don't check stack protector before calling nounwind functions

https://reviews.llvm.org/rGd656ae28095726830f9beb8dbd4d69f5144ef821
introduced a additional checks before calling noreturn functions in
response to this security paper related to Catch Handler Oriented
Programming (CHOP):
https://download.vusec.net/papers/chop_ndss23.pdf
See also:
https://bugs.chromium.org/p/llvm/issues/detail?id=30

This causes stack canaries to be inserted in C code which was
unexpected; we noticed certain Linux kernel trees stopped booting after
this (in functions trying to initialize the stack canary itself).
https://github.com/ClangBuiltLinux/linux/issues/1815

There is no point checking the stack canary like this when exceptions
are disabled (-fno-exceptions or function is marked noexcept) or for C
code. The GCC patch for this issue does something similar:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a25982ada523689c8745d7fb4b1b93c8f5dab2e7

Android measured a 2% regression in RSS as a result of d656ae280957 and
undid it globally:
https://android-review.googlesource.com/c/platform/build/soong/+/2524336

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D147975

[CostModel][X86] Add latency/code-size test coverage for minnum/maxnum intrinsics

And improve test coverage for the existing size-latency tests to match

[llvm-readobj] fix unit test failure on 32bit machines

Several bots are failing on 32-bit since https://reviews.llvm.org/D145761 was merged
https://lab.llvm.org/buildbot/#/builders/178/builds/4384

It seems due to the use of uintptr_t (32bit here) for storing 64 bit values.

Issue is fixed by replacing to uint64_t (as suggested by DavidSpickett).

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D148208

Revert "[COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols"

This reverts commit fffdb7eac58b4efde5e23c1281e7a7f93a42d280.

Causes crashes, see https://reviews.llvm.org/D145208

[mlir][math] Expand math.exp2 to use math.exp.

Exp2 functions are pushed directly to libm. This is problematic for
situations where libm is not available. This patch will expand the exp2
function to use exp2 with the input multiplied by ln2 (natural log).

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D148064

InstSimplify: Regenerate checks in a test

For some reason this was missing the "Assertions have been autogenerated"
comment so were getting spurious diffs in a future change.

[SLP] Compute min/max scalar reduction costs using min/max intrinsics instead of expanded cmp+sel

By default these will expand back to cmp/sel, but some targets (X86) has optimized costs for scalar integer min/max patterns which are lower than the default expansion (pre-SSE41 is particularly weak for vector min/max support).

Differential Revision: [SLP] Compute min/max scalar reduction costs using min/max intrinsics instead of expanded cmp+sel

[AMDGPU] Allow use of TTMP registers in AMDGPUResourceUsageAnalysis

With architected SGPRs, workgroup IDs are passed into a compute shader
in TTMP registers. Allow for this in AMDGPUResourceUsageAnalysis instead
of failing an assertion.

Differential Revision: https://reviews.llvm.org/D148239

[AMDGPU] Avoid else-after-return in isLegalAddressingMode. NFC.

[AArch64] Add more efficient vector bitcast for AArch64

Adds a DAG combine checks for vector comparisons followed by a bitcast to a
scalar value. Previously, this resulted in an expand. Now, this is done with a
constant number of instructions that take one bit per vector value (via an AND
mask) and perfom a horizontal add to get a single value. This is especially
useful for Clang's __builtin_convertvector() to a bool vector.

Issue: https://github.com/llvm/llvm-project/issues/59829

Differential Revision: https://reviews.llvm.org/D145301

[IR] llvm::createMinMaxOp - create integer min/max intrinsics instead of icmp/sel

Based off D148215, when expanding a min/max reduction we should be creating min/max intrinsics directly instead of relying on instcombine to fold them back together.

This patch handles integer min/max cases. Hopefully we can add floating point support soon (at least for fastmath/nnan cases) - but we're missing some of the plumbing to pass the correct FMF to the intrinsic at the moment.

Differential Revision: https://reviews.llvm.org/D148221

[lldb] Add 'CHECK' to class-type-nullptr-deref.s test.

This test previously relied on just segfaulting or not. This commit adds
a CHECK statement to the test.

Differential Revision: https://reviews.llvm.org/D148151

Use `bytes`, not `str`, to return C++ strings to Python.

`str` must be valid UTF-8, which is not guaranteed for C++ strings.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D147818

[libcxx][AIX] Reverting XFAILs for test cases after OS update

These test cases were fixed with AIX 73TL1, and are currently passing on AIX machines with that fix. This fix has also been backported to the 7.2 service line. These were tested on a machine with AIX 7.2 TL 5 SP4 installed.

Differential Revision: https://reviews.llvm.org/D148040

[Coverage] Handle invalid end location of an expression/statement.

Fix a crash when an expression/statement can have valid start location but invalid end location in some situations. For example: https://github.com/llvm/llvm-project/blob/llvmorg-16.0.1/clang/lib/Sema/SemaExprCXX.cpp#L1536

This confuses `CounterCoverageMappingBuilder` when popping a region from region
stack as if the end location is a macro or include location.

Reviewed By: hans, aaron.ballman

Differential Revision: https://reviews.llvm.org/D147073

[clangd] Use FileEntryRef for canonicalizing filepaths.

Using FileEntry for retrieving filenames give the name used for the last access. FileEntryRef gives the first access name.

Last access name is suspected to change with unrelated changes in clang or the underlying filesystem. First access name gives more stability to the name and makes it easier to track.

Differential Revision: https://reviews.llvm.org/D148213

[clang][NFC] Use a different service for CWG issue links

We've been using https://wg21.link for C++ DR status page, but it forwards non-resolved issues to EDG wiki, which is not useful for general public. This patch replace it with https://cplusplus.github.io/CWG/issues/ .

[BOLT][NFC] Fix UB due to unaligned load in DebugStrOffsetsWriter

The following tests fail when enabling UBSan due to an unaligned memory
load:

> runtime error: load of misaligned address 0x620000000643 for type
> 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte
> alignment

  BOLT :: AArch64/asm-func-debug.test
  BOLT :: AArch64/update-debug-reloc.test
  BOLT :: X86/asm-func-debug.test
  BOLT :: X86/dwarf5-df-dualcu.test
  BOLT :: X86/dwarf5-df-mono-dualcu.test
  BOLT :: X86/dwarf5-ftypes-dwp-input-dwo-output.test
  BOLT :: X86/dwarf5-locaddrx.test
  BOLT :: X86/dwarf5-split-dwarf4-monolithic.test
  BOLT :: X86/inlined-function-mixed.test
  BOLT :: non-empty-debug-line.test

This patch fixes this by using read32le for the load.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D148217

[Attributor] Convert some tests to opaque pointers (NFC)

[ARM] Fix Crashes in fp16/bf16 Inline Asm

We were still seeing occasional crashes with inline assembly blocks
using fp16/bf16 after my previous patches:
- https://reviews.llvm.org/rGff4027d152d0
- https://reviews.llvm.org/rG7d15212b8c0c
- https://reviews.llvm.org/rG20b2d11896d9

It turns out:
- The original two commits were wrong, and we should have always been
  choosing the SPR register class, not the HPR register class, so that
  LLVM's SelectionDAGBuilder correctly did the right splits/joins.
- The `splitValueIntoRegisterParts`/`joinRegisterPartsIntoValue` changes
  from rG20b2d11896d9 are still correct, even though they sometimes
  result in inefficient codegen of casts between fp16/bf16 and i32/f32
  (which is visible in these tests).

This patch fixes crashes in `getCopyToParts` and when trying to select
`(bf16 (bitconvert (fp16 ...)))` dags when Neon is enabled.

This patch also adds support for passing fp16/bf16 values using the 'x'
constraint that is LLVM-specific. This should broadly match how we pass
with 't' and 'w', but with a different set of valid S registers.

Differential Revision: https://reviews.llvm.org/D147715

[clang][Interp] Add a failing test case

[clang][Interp] Don't create global variables more than once

just because we're being told to evaluate it twice. This sometimes
happens when a variable is evaluated again during codegen.

Differential Revision: https://reviews.llvm.org/D147535

[Docs] Point to the correct bug tracker

Signed-off-by: Jun Zhang <jun@junz.org>

[clang][Interp] Make sure we have a variable scope for initializers

Otherwise, we run into an assertion when trying to use the current
variable scope while creating temporaries for constructor initializers.

Differential Revision: https://reviews.llvm.org/D147534

[lldb] Show register fields using bitfield struct types

This change uses the information from target.xml sent by
the GDB stub to produce C types that we can use to print
register fields.

lldb-server *does not* produce this information yet. This will
only work with GDB stubs that do. gdbserver or qemu
are 2 I know of. Testing is added that uses a mocked lldb-server.
```
(lldb) register read cpsr x0 fpcr fpsr x1
    cpsr = 0x60001000
         = (N = 0, Z = 1, C = 1, V = 0, TCO = 0, DIT = 0, UAO = 0, PAN = 0, SS = 0, IL = 0, SSBS = 1, BTYPE = 0, D = 0, A = 0, I = 0, F = 0, nRW = 0, EL = 0, SP = 0)
```

Only "register read" will display fields, and only when
we are not printing a register block.

For example, cpsr is a 32 bit register. Using the target's scratch type
system we construct a type:
```
struct __attribute__((__packed__)) cpsr {
  uint32_t N : 1;
  uint32_t Z : 1;
  ...
  uint32_t EL : 2;
  uint32_t SP : 1;
};
```

If this register had unallocated bits in it, those would
have been filled in by RegisterFlags as anonymous fields.
A new option "SetChildPrintingDecider" is added so we
can disable printing those.

Important things about this type:
* It is packed so that sizeof(struct cpsr) == sizeof(the real register).
  (this will hold for all flags types we create)
* Each field has the same storage type, which is the same as the type
  of the raw register value. This prevents fields being spilt over
  into more storage units, as is allowed by most ABIs.
* Each bitfield size matches that of its register field.
* The most significant field is first.

The last point is required because the most significant bit (MSB)
being on the left/top of a print out matches what you'd expect to
see in an architecture manual. In addition, having lldb print a
different field order on big/little endian hosts is not acceptable.

As a consequence, if the target is little endian we have to
reverse the order of the fields in the value. The value of each field
remains the same. For example 0b01 doesn't become 0b10, it just shifts
up or down.

This is needed because clang's type system assumes that for a struct
like the one above, the least significant bit (LSB) will be first
for a little endian target. We need the MSB to be first.

Finally, if lldb's host is a different endian to the target we have
to byte swap the host endian value to match the endian of the target's
typesystem.

| Host Endian | Target Endian | Field Order Swap | Byte Order Swap |
|-------------|---------------|------------------|-----------------|
| Little      | Little        | Yes              | No              |
| Big         | Little        | Yes              | Yes             |
| Little      | Big           | No               | Yes             |
| Big         | Big           | No               | No              |

Testing was done as follows:
* Little -> Little
  * LE AArch64 native debug.
* Big -> Little
  * s390x lldb running under QEMU, connected to LE AArch64 target.
* Little -> Big
  * LE AArch64 lldb connected to QEMU's GDB stub, which is running
    an s390x program.
* Big -> Big
* s390x lldb running under QEMU, connected to another QEMU's GDB
   stub, which is running an s390x program.

As we are not allowed to link core code to plugins directly,
I have added a new plugin RegisterTypeBuilder. There is one implementation
of this, RegisterTypeBuilderClang, which uses TypeSystemClang to build
the CompilerType from the register fields.

Reviewed By: jasonmolenda

Differential Revision: https://reviews.llvm.org/D145580

[mlir] fix mismerge in bazel

[clang][github] Try to fix multi-line py comment

Introduced in 09effa706a026c7ebcc01bf14f9f710cb1a8fa87

[NFC][ARM] Fix Type in Test

I landed this test with a typo, the callsites all show `fp16_inner`
returning `half`, so the declaration should too.

[llvm][github] Add good-first-issue comment to issues

Differential Revision: https://reviews.llvm.org/D146819

[AArch64][GISel] Legalize non-power-of-two G_CTTZ

The main change here is to add a `widenScalarToNextPow2` before the
`clampScalar` so that non-power-of-two sizes between 32 and 64 are
turned into s64 count trailing zeroes.

However, if you make the legalisation rules depend on TypeIdx 0 (the
output), then you still get crashes for the s65 testcase, which I solved
by instead flipping the rules around to be about TypeIdx 1 (the input),
with a `scalarSameSizeAs` at the end to tie index 0 to index 1. This,
incidentally, is how things are written for `G_CTLZ`.

Differential Revision: https://reviews.llvm.org/D147602

[mlir] add structured (Linalg) transform op matchers

Add a set of transform operations into the "structured" extension of the
Transform dialect that allow one to select transformation targets more
specifically than the currently available matching. In particular, add
the mechanism for identifying the producers of operands (input and init
in destination-passing style) and users of results, as well as
mechanisms for reasoning about the shape of the iteration space.

Additionally, add several transform operations to manipulate parameters
that could be useful to implement more advanced selectors. Specifically,
new operations let one produce and compare parameter values to implement
shape-driven transformations.

New operations are placed in separate files to decrease compilation
time. Some relayering of the extension is necessary to avoid repeated
generation of enums.

Depends on D148013
Depends on D148014
Depends on D148015

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D148017

[lldb] Read register fields from target XML

This teaches ProcessGDBRemote to look for "flags" nodes
in the target XML that tell you what fields a register has.

https://sourceware.org/gdb/onlinedocs/gdb/Target-Description-Format.html

It will check for various invalid inputs like:
* Flags nodes with 0 fields in them.
* Start or end being > the register size.
* Fields that overlap.
* Required properties not being present (e.g. no name).
* Flag sets being redefined.

If anything untoward is found, we'll just drop the field or the
flag set altogether. Register fields are a "nice to have" so LLDB
shouldn't be crashing because of them, instead just log anything
we throw away. So the user can fix their XML/file a bug with their
vendor.

Once that is done it will sort the fields and pass them to
the RegisterFields class I added previously.

There is no way to see these fields yet, so tests for this code
will come later when the formatting code is added.

The fields are stored in a map of unique pointers on the
ProcessGDBRemote class. It will give out raw pointers on the
assumption that the GDB process lives longer than the users
of those pointers do. Which means RegisterInfo is still a trivial struct
but we are properly destroying the fields when the GDB process ends.

We can't store the fields directly in the map because adding new
items may cause its storage to be reallocated, which would invalidate
pointers we've already given out.

Reviewed By: jasonmolenda, JDevlieghere

Differential Revision: https://reviews.llvm.org/D145574

[lldb][ObjectFileELF] Improve error output for unsupported arch/relocations

ObjectFileELF::ApplyRelocations() considered all 32-bit input objects to be i386 and didn't provide good error messages for AArch32 objects. Please find an example in https://github.com/llvm/llvm-project/issues/61948
While we are here, let' improve the situation for unsupported architectures as well. I think we should report the error here too and not silently fail (or crash with assertions enabled).

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D147627

[BOLT][NFC] Fix UB due to left shift of negative value

The following test fails when enabling UBSan due to a left shift of a
negative value:

> runtime error: left shift of negative value -2

BOLT :: AArch64/ext-island-ref.s

This patch fixes this by using a multiplication instead of a shift.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D148218

[mlir][Linalg] Fix hoist padding through scf.for iter_arg

Previously, hoisting through an iter_arg would mistakenly yield the unpadded value and
cast it to the padded value.

This was incorrect and resulted in out-of-bounds accesses.
The correct formulation is to yield the padded value and extract a smaller dynamic slice
out of it.

Differential Revision: https://reviews.llvm.org/D148173

[llvm][readobj] Add AArch64 SME and SME2 note types

These are used to store new state added by the Scalable Matrix
Extension which is documented in
https://developer.arm.com/documentation/ddi0616/aa/.

The values match those defined by Linux, see:
https://github.com/torvalds/linux/blob/e62252bc55b6d4eddc6c2bdbf95a448180d6a08d/include/uapi/linux/elf.h#L435

The ZT register(s) are added by SME2 which is not yet publicly
documented but has support in LLVM and Linux already.

Also added descriptions for SVE and PAC_MASK notes since those
were missing.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D148126

Make 'static assertion failed' diagnostics point to the static assertion expression

"static assertion failed" pointed to the static_assert token and
then underlined the static assertion expression:

<source>:3:1: error: static assertion failed
    static_assert(false);
    ^             ~~~~~
    1 error generated.
See Godbolt: https://godbolt.org/z/r38booz59

Now it points to and highlights the assertion expression.

Fixes https://github.com/llvm/llvm-project/issues/61951
Differential Revision: https://reviews.llvm.org/D147745

[clang][NFC] More range for loops in TextDiagnostic.cpp

[clang][NFC] Fix comment typo

[include-cleaner] Handle incomplete template specializations

Instantiation pattern is null for incomplete template types and using
specializaiton decl results in not seeing re-declarations.

Differential Revision: https://reviews.llvm.org/D148158

[lldb] Add dummy field to RegisterInfo for register flags use later

This structure is supposed to be trivial, so we cannot simply do
"= nullptr;" on the new member. Doing that means you are non trivial,
regardless of whether you emulate the previously implied constructor somehow.

The next option is to update every use of brace initialisation.

Given that this is some hundreds of lines, this change just adds a dummy
pointer that is set to nullptr. Subsequent changes will actually use that
to point to register flags information.

Note: This change is not clang-format-ted because it changes a bunch of
areas that are not themselves formatted. It would just add noise.

Reviewed By: jasonmolenda, JDevlieghere

Differential Revision: https://reviews.llvm.org/D145568