review.tizen.org Git - platform/upstream/llvm.git/log

[lld-macho] Fix -bitcode_process_mode arg type

This is still undocumented and unsupported, but if someone passed it
before you would end up with a missing file error since this takes an
argument that wouldn't be handled.

Differential Revision: https://reviews.llvm.org/D130606

[libc++][ranges] Fix the CI.

[clang][AIX] Add option to control quadword lock free atomics ABI on AIX

We are supporting quadword lock free atomics on AIX. For the situation that users on AIX are using a libatomic that is lock-based for quadword types, we can't enable quadword lock free atomics by default on AIX in case user's new code and existing code accessing the same shared atomic quadword variable, we can't guarentee atomicity. So we need an option to enable quadword lock free atomics on AIX, thus we can build a quadword lock-free libatomic(also for advanced users considering atomic performance critical) for users to make the transition smooth.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D127189

[ASan] Use stack safety analysis to optimize allocas instrumentation.

Added alloca optimization which was missed during the implemenation of D112098.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130503

[asan][test] Check for __asan_stack_malloc

[amdgpu][nfc] Separate processUsedLDS into independent pieces, rename it

[Polly] Insert !dbg metadata for emitted CallInsts.

The IR Verifier requires that every call instruction to an inlineable
function (among other things, its implementation must be visible in the
translation unit) must also have !dbg metadata attached to it. When
parallelizing, Polly emits calls to OpenMP runtime function out of thin
air, or at least not directly derived from a bounded list of previous
instruction. While we could search for instructions in the SCoP that has
some debug info attached to it, there is no guarantee that we find any.
Our solution is to generate a new DILocation that points to line 0 to
represent optimized code.

The OpenMP function implementation is usually not available in the
user's translation unit, but can become visible in an LTO build. For
the bug to appear, libomp must also be built with debug symbols.

IMHO, the IR verifier rule is too strict. Runtime functions can
also be inserted by other optimization passes, such as
LoopIdiomRecognize. When inserting a call to e.g. memset, it uses the
DebugLoc from a StoreInst from the unoptimized code. It is not
required to have !dbg metadata attached either.

Fixes #56692

[amdgpu][nfc] Extract kernel annotation from processUsedLDS

workflows: Use sccache to speed up CI builds

Reviewed By: asl

Differential Revision: https://reviews.llvm.org/D129880

[asan][test] Cleanup asan-stack-safety.ll test

Import CI tests from the release branch

The tests still only run on pushes or pull requests for the release
branch, but having it in the main branch means we don't have to copy
the tests every time we create a new release branch.

Reviewed By: asl

Differential Revision: https://reviews.llvm.org/D129526

[libc++][NFC] Add checks for lifetime issues in classic algorithms.

Differential Revision: https://reviews.llvm.org/D130330

[libc++][ranges] Implement `ranges::is_heap{,_until}`.

Differential Revision: https://reviews.llvm.org/D130547

Add string conversion for InstructionControlFlowKind enum

Refactor the string conversion of the `lldb::InstructionControlFlowKind` enum out
of `Instruction::Dump` to enable reuse of this logic by the
JSON TraceDumper (to be implemented in separate diff).

Will coordinate the landing of this change with D130320 since there will be a minor merge conflict between
these changes.

Test Plan:
Run unittests
```
> ninja check-lldb
[4/5] Running lldb unit test suite

Testing Time: 10.13s
  Passed: 1084
```

Verify '-k' flag's output
```
(lldb) thread trace dump instructions -k
thread #1: tid = 1375377
  libstdc++.so.6`std::ostream::flush() + 43
    7048: 0x00007ffff7b54dab    return      retq
    7047: 0x00007ffff7b54daa    other       popq   %rbx
    7046: 0x00007ffff7b54da7    other       movq   %rbx, %rax
    7045: 0x00007ffff7b54da5    cond jump   je     0x11adb0                  ; <+48>
    7044: 0x00007ffff7b54da2    other       cmpl   $-0x1, %eax
  libc.so.6`_IO_fflush + 249
    7043: 0x00007ffff7161729    return      retq
    7042: 0x00007ffff7161728    other       popq   %rbp
    7041: 0x00007ffff7161727    other       popq   %rbx
    7040: 0x00007ffff7161725    other       movl   %edx, %eax
    7039: 0x00007ffff7161721    other       addq   $0x8, %rsp
    7038: 0x00007ffff7161709    cond jump   je     0x87721                   ; <+241>
    7037: 0x00007ffff7161707    other       decl   (%rsi)
    7036: 0x00007ffff71616fe    cond jump   je     0x87707                   ; <+215>
    7035: 0x00007ffff71616f7    other       cmpl   $0x0, 0x33de92(%rip)      ; __libc_multiple_threads
    7034: 0x00007ffff71616ef    other       movq   $0x0, 0x8(%rsi)
    7033: 0x00007ffff71616ed    cond jump   jne    0x87721                   ; <+241>
    7032: 0x00007ffff71616e9    other       subl   $0x1, 0x4(%rsi)
    7031: 0x00007ffff71616e2    other       movq   0x88(%rbx), %rsi
    7030: 0x00007ffff71616e0    cond jump   jne    0x87721                   ; <+241>
    7029: 0x00007ffff71616da    other       testl  $0x8000, (%rbx)           ; imm = 0x8000
```

Differential Revision: https://reviews.llvm.org/D130580

[libc++][ranges] Make sure all range algorithms support differing projection types:

- for all algorithms taking more than one range, add a `robust` test to
  check the case where the ranges have different value types and the
  given projections are different, with each projection applying to
  a different value type;
- fix `ranges::include` to apply the correct projection to each range.

Differential Revision: https://reviews.llvm.org/D130515

[libc++][ranges] Implement `ranges::generate{,_n}`.

Differential Revision: https://reviews.llvm.org/D130552

Revert "[Support] Workaround compiler bug in MSVC"

This reverts commit ec8f4fd68cd401a0ba41bb160d6acce670486fab.

This caused a failure in the mlir-windows bot.

workflows: Add GitHub action for automating some release tasks

For each release tag, this action will create a new release on GitHub,
and for each -final tag, this action will build the documentation and
upload it to GitHub.

Reviewed By: hans, kwk

Differential Revision: https://reviews.llvm.org/D99780

github: Automatically assign reviewers for backport requests

When there is a backport request, the GitHub Action that handles the
backport will now automatically assign the issue to the user(s) who
approved the commit in Phabricator and create an issue comment asking
them to review the request.

Reviewed By: thieta, kwk

Differential Revision: https://reviews.llvm.org/D126423

[CodeGen] Fixed ambiguous symbol ExtAddrMode in case of NDEBUG and LLVM_ENABLE_DUMP

This patch fixes the following error with MSVC 16.9.2 in case of NDEBUG and LLVM_ENABLE_DUMP:
llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2872: 'ExtAddrMode': ambiguous symbol
llvm/include/llvm/CodeGen/TargetInstrInfo.h(86): note: could be 'llvm::ExtAddrMode'
llvm/lib/CodeGen/CodeGenPrepare.cpp(2447): note: or '`anonymous-namespace'::ExtAddrMode'
llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2039: 'print': is not a member of 'llvm::ExtAddrMode'

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D130426

github: Fix release automation /branch command with new repo

We started using the llvm/llvm-project-release-prs repo for
backport pull requests, but since this repo is not a fork of
llvm/llvm-project it will reject pull requests from other repos. In
order to fix this, when ever someone uses the /branch command to request
a branch be merged into the release branch, we first copy the branch to
the llvm-project-release-prs repo and then create the pull request.

Reviewed By: thieta

Differential Revision: https://reviews.llvm.org/D126940

[ELF] addDependentLibrary: fix a use-after-free bug in archiveName

[mlir] Refactor SubElementInterface replace support

The current support was essentially the amount necessary
to support replacing SymbolRefAttrs, but suffers from various
deficiencies (both ergonomic and functional):

* Replace crashes if unsupported
This makes it really hard to use safely, given that you don't know
if you are going to crash or not when using it.

* Types aren't supported
This seems like a simple missed addition when the attribute replacement
support was originally added.

* The ergonomics are weird
It currently uses an index based replacement, which makes the implementations
quite clunky.

This commit refactors support to be a bit more ergonomic, and also
adds support for types in the process. This was also a great oppurtunity
to greatly simplify how replacement is done in the symbol table.

Fixes #56355

Differential Revision: https://reviews.llvm.org/D130589

[ELF] addLibrary: fix a use-after-free bug in archiveName

It manifests as an incorrect name in --print-archive-stats=.

[ELF][test] Clean up print-archive-stats.s

[RISCV] Pre-commit tests for D130146. NFC

[lldb/ClangExpressionParser] Fix compiler error due to `clang::CreateLLVMCodeGen()` API change

[CGDebugInfo] Access the current working directory from the `VFS`

...instead of calling `llvm::sys::fs::current_path()` directly.

Differential Revision: https://reviews.llvm.org/D130443

[clang-tidy] Avoid extra parentheses around MemberExpr

Fixes https://github.com/llvm/llvm-project/issues/55025.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D129596

[InstCombine] Fold strtoul and strtoull and avoid PR #56293

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D129224

[lldb] Disable TestStackFromStdModule.py

TestStackFromStdModule.py started failing due to f4fb72e6d4ce
(https://reviews.llvm.org/D128146), with a clang assertion failure:
assert(isa<InjectedClassNameType>(Decl->TypeForDecl))

[amdgpu][nfc] Separate LDS struct creation from RAUW

[Support] Workaround compiler bug in MSVC

https://developercommunity.visualstudio.com/t/Prev-Issue---with-__assume-isnan-/1597317

This was causing unittest failures on Windows for the GitHub actions
based CI we use in the release branches.

Failed Tests (2):
LLVM-Unit :: Support/./SupportTests.exe/FormatVariadicTest.BigTest
LLVM-Unit :: Support/./SupportTests.exe/NativeFormatTest.BoundaryTests

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D129822

[AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics

This is an alternate to D129155 that uses TTI.haveFastSqrt() to avoid a
potential miscompile for programs with reads of errno. Moving the transform
to AggressiveInstCombine provides access to TTI.

If a sqrt call has "nnan", that implies that the input argument is never
negative because sqrt of {negative number} --> NAN.
If the argument is never negative and the call can be lowered without a
libcall, then we can assume that errno accesses are unchanged after lowering,
so the call can be translated to the LLVM intrinsic (which is expected to
become inline code).

This affects codegen for targets like x86 that have sqrt instructions, but
still have to conservatively assume that a libcall may be needed to set
errno as shown in issue #52620 and issue #56383.

This patch won't solve those examples - we will need to extend this to use
CannotBeOrderedLessThanZero or similar, enhance that analysis for new
operators, and/or deal with llvm.assume too.

Differential Revision: https://reviews.llvm.org/D129167

[Clang][Doc] Update the release note for clang

Add the support for `atomic compare` and `atomic compare capture` in the
release note of clang.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D129211

[clang] Pass FoundDecl to DeclRefExpr creator for operator overloads

Without the "found declaration" it is later not possible to know where the operator declaration
was brought into the scope calling it.

The initial motivation for this fix came from #55095. However, this also has an influence on
`clang -ast-dump` which now prints a `UsingShadow` attribute for operators only visible through
`using` statements. Also, clangd now correctly references the `using` statement instead of the
operator directly.

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D129973

Move GetControlFlowKind's logic to DisassemblerLLVMC.cpp

This diff move the logic of `GetControlFlowKind()` from Disassembler.cpp to DisassemblerLLVMC.cpp.
Here's details:
- Actual logic of GetControlFlowKind() move to `DisassemblerLLVMC.cpp`, and we can check underlying architecture using `DisassemblerScope` there.
- With this change, passing 'triple' to `GetControlFlowKind()` is no more required.

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D130320

[trace][intel pt] Introduce wall clock time for each trace item

- Decouple TSCs from trace items
- Turn TSCs into events just like CPUs. The new name is HW clock tick, wich could be reused by other vendors.
- Add a GetWallTime that returns the wall time that the trace plug-in can infer for each trace item.
- For intel pt, we are doing the following interpolation: if an instruction takes less than 1 TSC, we use that duration, otherwise, we assume the instruction took 1 TSC. This helps us avoid having to handle context switches, changes to kernel, idle times, decoding errors, etc. We are just trying to show some approximation and not the real data. For the real data, TSCs are the way to go. Besides that, we are making sure that no two trace items will give the same interpolation value. Finally, we are using as time 0 the time at which tracing started.

Sample output:

```
(lldb) r
Process 750047 launched: '/home/wallace/a.out' (x86_64)
Process 750047 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000402479 a.out`main at main.cpp:29:20
   26   };
   27
   28   int main() {
-> 29     std::vector<int> vvv;
   30     for (int i = 0; i < 100; i++)
   31       vvv.push_back(i);
   32
(lldb) process trace start -s 64kb -t --per-cpu
(lldb) b 60
Breakpoint 2: where = a.out`main + 1689 at main.cpp:60:23, address = 0x0000000000402afe
(lldb) c
Process 750047 resuming
Process 750047 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 2.1
    frame #0: 0x0000000000402afe a.out`main at main.cpp:60:23
   57     map<int, int> m;
   58     m[3] = 4;
   59
-> 60     map<string, string> m2;
   61     m2["5"] = "6";
   62
   63     std::vector<std::string> vs = {"2", "3"};
(lldb) thread trace dump instructions -t -f -e thread #1: tid = 750047
    0: [379567.000 ns] (event) HW clock tick [48599428476224707]
    1: [379569.000 ns] (event) CPU core changed [new CPU=2]
    2: [390487.000 ns] (event) HW clock tick [48599428476246495]
    3: [1602508.000 ns] (event) HW clock tick [48599428478664855]
    4: [1662745.000 ns] (event) HW clock tick [48599428478785046]
  libc.so.6`malloc
    5: [1662746.995 ns] 0x00007ffff7176660    endbr64
    6: [1662748.991 ns] 0x00007ffff7176664    movq   0x32387d(%rip), %rax      ;  + 408
    7: [1662750.986 ns] 0x00007ffff717666b    pushq  %r12
    8: [1662752.981 ns] 0x00007ffff717666d    pushq  %rbp
    9: [1662754.977 ns] 0x00007ffff717666e    pushq  %rbx
    10: [1662756.972 ns] 0x00007ffff717666f    movq   (%rax), %rax
    11: [1662758.967 ns] 0x00007ffff7176672    testq  %rax, %rax
    12: [1662760.963 ns] 0x00007ffff7176675    jne    0x9c7e0                   ; <+384>
    13: [1662762.958 ns] 0x00007ffff717667b    leaq   0x17(%rdi), %rax
    14: [1662764.953 ns] 0x00007ffff717667f    cmpq   $0x1f, %rax
    15: [1662766.949 ns] 0x00007ffff7176683    ja     0x9c730                   ; <+208>
    16: [1662768.944 ns] 0x00007ffff7176730    andq   $-0x10, %rax
    17: [1662770.939 ns] 0x00007ffff7176734    cmpq   $-0x41, %rax
    18: [1662772.935 ns] 0x00007ffff7176738    seta   %dl
    19: [1662774.930 ns] 0x00007ffff717673b    jmp    0x9c690                   ; <+48>
    20: [1662776.925 ns] 0x00007ffff7176690    cmpq   %rdi, %rax
    21: [1662778.921 ns] 0x00007ffff7176693    jb     0x9c7b0                   ; <+336>
    22: [1662780.916 ns] 0x00007ffff7176699    testb  %dl, %dl
    23: [1662782.911 ns] 0x00007ffff717669b    jne    0x9c7b0                   ; <+336>
    24: [1662784.906 ns] 0x00007ffff71766a1    movq   0x3236c0(%rip), %r12      ;  + 24
(lldb) thread trace dump instructions -t -f -e -J -c 4
[
  {
    "id": 0,
    "timestamp_ns": "379567.000000",
    "event": "HW clock tick",
    "hwClock": 48599428476224707
  },
  {
    "id": 1,
    "timestamp_ns": "379569.000000",
    "event": "CPU core changed",
    "cpuId": 2
  },
  {
    "id": 2,
    "timestamp_ns": "390487.000000",
    "event": "HW clock tick",
    "hwClock": 48599428476246495
  },
  {
    "id": 3,
    "timestamp_ns": "1602508.000000",
    "event": "HW clock tick",
    "hwClock": 48599428478664855
  },
  {
    "id": 4,
    "timestamp_ns": "1662745.000000",
    "event": "HW clock tick",
    "hwClock": 48599428478785046
  },
  {
    "id": 5,
    "timestamp_ns": "1662746.995324",
    "loadAddress": "0x7ffff7176660",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "endbr64"
  },
  {
    "id": 6,
    "timestamp_ns": "1662748.990648",
    "loadAddress": "0x7ffff7176664",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "movq"
  },
  {
    "id": 7,
    "timestamp_ns": "1662750.985972",
    "loadAddress": "0x7ffff717666b",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "pushq"
  },
  {
    "id": 8,
    "timestamp_ns": "1662752.981296",
    "loadAddress": "0x7ffff717666d",
    "module": "libc.so.6",
    "symbol": "malloc",
    "mnemonic": "pushq"
  }
]
```

Differential Revision: https://reviews.llvm.org/D130054

[InstSimplify] remove redundant calls to 'isImplied'; NFCI

We already call the more general isImpliedCondition() (which calls
isImpliedTrueByMatchingCmp() internally) from simplifyAndInst()
and simplifyOrInst().

There was a difference visible with this change on a vector test
before a925bef70c6c, but I can't find any gaps now.

[gn build] Port 4638d7a28f62

[Sanitizers][Darwin] Allows '-mtargetos' to used to set minimum deployment target.

Currently, m{platform}-version-min is default flag used to set min deployment target within compilter-rt and sanitizers.
However, clang uses flags -target and -mtargetos for setting target triple and minimum deployment targets.
-mtargetos will be the preferred flag to set min version in the future and the
${platform}-version-min flag will not be used for future platforms.

This change allows darwin platforms to use either ${platform}-min-version or -mtargetos
without breaking lit test flags that allows for overriding the default min value in lit tests
Tests using flags: 'darwin_min_target_with_tls_support', 'min_macos_deployment_target'
will no longer fail if they use mtargetos instead of version-min.

rdar://81028225

Differential Revision: https://reviews.llvm.org/D130542

Revert "[clang-offload-bundler] Library-ize ClangOffloadBundler"

This reverts commit 8348c4095600ec2c0beee293267832799d2ebee3.

[Matrix] Add assert to catch extracted vectors with poison elements

Assert when the extracted vector is wider than the row/column.

Differential Revision: https://reviews.llvm.org/D130173

[RISCV] Add Predicate to c.lw/c.sw/c.lwsp/c.swsp InstAliases with no offset.

These are aliases that allow the immediate offset to be ommitted.
We had predicates for the RV64, RV32+F, and D versions, but
not the base versions.

I've also re-ordered them to share Predicate lines to improve
readability.

[Matrix] Refactor tiled loops in a struct. NFC

The three loops have the same structure: index, header, latch.

[GlobalISel] Import patterns for G_FMAXIMUM + G_FMINIMUM

Allows us to select scalar instructions on AArch64.

Differential Revision: https://reviews.llvm.org/D115381

[clang][dataflow] Analyze calls to in-TU functions

This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `FIXME` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306

[RISCV] Minor fixes to rv64c-valid.s test.

-Missing CHECK-NO-EXT and CHECK-NO-RV64 on subw.
-Stray CHECK-NO-RV64 on c.slli.
-c.slli used immediate 1 instead of RV64 only immediate like 63.
-Missing CHECK-NO-EXT on c.srli and c.srai

[gn build] Port 8348c4095600

[amdgpu][nfc] Skip operations on padding fields in LDS struct

Revert "[clang][dataflow] Analyze calls to in-TU functions"

This reverts commit fa2b83d07ecab3b24b4c5ee2e7dc4b6bbc895317.

[clang][dataflow] Analyze calls to in-TU functions

Depends On D130305

This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) first calls `initGlobalVars`, then maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `TODO` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306

[MachineFunctionPass] Support -print-changed and -print-changed=quiet

-print-changed for new pass manager is handy beside -print-after-all.
Port it to MachineFunctionPass.

Note: lib/Passes/StandardInstrumentations.cpp implements a number of
misc features. If we want to use them for codegen, we may need to lift
some functionality to LLVMIR.

Reviewed By: aeubanks, jamieschmeiser

Differential Revision: https://reviews.llvm.org/D130434

StackFrame::GetValueObjectForFrameVariable holds the StackFrame lock too long.

This can cause a deadlock if other threads use the common pattern of
"lock the StackFrameList, get a frame, lock the StackFrame."

Differential Revision: https://reviews.llvm.org/D130524

[clang-offload-bundler] Library-ize ClangOffloadBundler

Lifting the core functionalities of the clang-offload-bundler into a
user-facing library/API. This will allow online and JIT compilers to
bundle and unbundle files without spawning a new process.

This patch lifts the classes and functions used to implement
the clang-offload-bundler into a separate OffloadBundler.cpp,
and defines three top-level API functions in OfflaodBundler.h.
        BundleFiles()
        UnbundleFiles()
        UnbundleArchives()

This patch also introduces a Config class that locally stores the
previously global cl::opt options and arrays to allow users to call
the APIs in a multi-threaded context, and introduces an
OffloadBundler class to encapsulate the top-level API functions.

We also  lift the BundlerExecutable variable, which is specific
to the clang-offload-bundler tool, from the API, and replace
its use with an ObjcopyPath variable. This variable must be set
in order to internally call llvm-objcopy.

Finally, we move the API files from
clang/tools/clang-offload-bundler into clang/lib/Driver and
clang/include/clang/Driver.

Differential Revision: https://reviews.llvm.org/D129873

[DAG] matchRotateSub - set demanded bits to the shift amount type size, not the shift result size.

This should fix a report on D130251 of an assert due to a bitwidth mismatch in APInt::isSubSetOf

[AArch64] Simplify BTI/PAC-RET module flags

These module flags use the Min merge behavior with a default value of
zero, so we don't need to emit them if zero.

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D130145

[AMDGPU][GFX10][DOC][NFC] Update assembler syntax description

Summary of changes:
- Update FLAT LDS syntax (see https://reviews.llvm.org/D125126)

[clangd] Improve XRefs support for ObjCMethodDecl

- Correct nameLocation to point to the first selector fragment instead
of the - or +

- getDefinition now searches through the proper impl decls to find
the definition of the ObjCMethodDecl if one exists

Differential Revision: https://reviews.llvm.org/D130095

[mlir][transform] Add ForeachOp to transform dialect

This op "unbatches" an op handle and executes the loop body for each payload op.

Differential Revision: https://reviews.llvm.org/D130257

[C++20] [Modules] Disable preferred_name when writing a C++20 Module interface

Currently, the use of preferred_name would block implementing std
modules in libcxx. See https://github.com/llvm/llvm-project/issues/56490
for example.
The problem is pretty hard and it looks like we couldn't solve it in a
short time. So we sent this patch as a workaround to avoid blocking us
to modularize STL. This is intended to be fixed properly in the future.

Reviewed By: erichkeane, aaron.ballman, tahonermann

Differential Revision: https://reviews.llvm.org/D130331

[AMDGPU] Start refactoring GCNSchedStrategy

Tries to make the different scheduling stages a bit more self contained and
modifiable. Intended to be NFC. Preface to other changes.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130147

[WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms

WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This
affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222

This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs.
* Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand.
* PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers.
* LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites.

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D128190

[RISCV] Add codegen coverage for ceil/floor/trunc/round/roundeven within FPR

Currently, all of these go to libcalls. A change to improve lowering is upcoming.

[gn build] Port f4fb72e6d4ce

[libc++] Use uninitialized algorithms for vector

Reviewed By: ldionne, #libc

Spies: huixie90, eaeltsin, joanahalili, bgraur, alexfh, hans, avogelsgesang, augusto2112, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D128146

[bolt,AArch64] Fix one more test failure from D130358.

This one actually makes the test simpler, because lit doesn't have to
reconstitute a 32-bit little-endian value from individual bytes any
more: llvm-objdump is printing the desired 32-bit value in the first
place, so we can move straight on to doing the arithmetic on it.

[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.

This patch starts small, only detecting sequences of the form
<a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes.

Differential Revision: https://reviews.llvm.org/D125194

[gn build] (manually) port a5640968f2f7

[DWP][DWARF] Detect and error on debug info offset overflow

Right now we silently overflow uint32_t for debug_indfo sections. Added a check
and error out.

Differential Revision: https://reviews.llvm.org/D130395

[WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes

Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`.

Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`.

To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D128955

[AMDGPU][MC][GFX11] Correct src0 for VOP3_DPP variants of v_cmp*class* opcodes

Disable SGPRs for src0 of these opcodes.

Differential Revision: https://reviews.llvm.org/D130486

[llvm][cmake] Follow up to D117973

1. Slightly document the "mark advanced" variable used to control the
   installed CMake package dir.

   I would document it more, but I am considering in the future adding
   pkg-config support in this manner, after which `_PACKGE_DIR` is
   probably better called `_CMAKE_PACKGE_DIR` or similar.

2. Convey the custom path to the legacy `llvm-config` binary.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D130539

[cmake] Slight fix ups to make robust to the full range of GNUInstallDirs

See https://cmake.org/cmake/help/v3.14/module/GNUInstallDirs.html#result-variables for `CMAKE_INSTALL_FULL_*`

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D130545

[LLDB][ClangExpression] Prevent nullptr namespace map access during logging

Some codepaths lead to `namespace_map == nullptr` when we get to
`ClangASTSource::FindCompleteType`. This occurred while debugging
an lldb session that had `settings set target.import-std-module true`.

In that case, with `LLDBLog::Expressions` logging enabled, we would
dereference a `nullptr` and crash.

This commit moves the logging until after we check for `nullptr`.

**Testing**

* Fixed the specific crash I was seeing while debugging an `lldb`
session with `import-std-module` enabled.

Differential Revision: https://reviews.llvm.org/D130561

[AMDGPU][MC][GFX11] Correct encoding of VOP3/VOP3_DPP v_cmpx* opcodes

Encode dst=EXEC but allow disassembler accept any dst value.

Differential Revision: https://reviews.llvm.org/D130345

[mlir] Sort the libraties in BUILD.bazel.

[mlir] Update bazel build.

LangRef: note that `allockind("free")` requires void return

Otherwise we have to work pretty hard to ensure a discarded alloc/free
pair doesn't remove a return value that's still useful.

Differential Revision: https://reviews.llvm.org/D130568

[AArch64][SVE] Sink ptrue into loop if it is used by PTEST.

This helps fold away the ptest instructions, which needs the knowledge on whether
the general predicate is known to zero the inactive lanes.

This fixes some PTEST regressions introduced by D129282.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D129852

[AArch64][SVE] Consider more intrinsics in 'isZeroingInactiveLanes'.

This fixes some PTEST regressions introduced by D129282.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D129851

[AArch64][SVE] NFC: Add test-case to sve-ptest-removal-cmp* tests

This also adds new sve-ptest tests for FP compares that will retain
the ptest.

This also includes a few other NFC changes:
* Added type mangling to ptest.any intrinsic.
* Regenerated asm using update_llc_tests script.

tsan: capture shadow map start/end on init and reuse in reset

Capture the computed shadow begin/end values at the point where the
shadow is first created and reuse those values on reset. Introduce new
windows-specific function "ZeroMmapFixedRegion" for zeroing out an
address space region previously returned by one of the MmapFixed*
routines; call this function (on windows) from DoResetImpl
tsan_rtl.cpp instead of MmapFixedSuperNoReserve.

See https://github.com/golang/go/issues/53539#issuecomment-1168778740
for context; intended to help with updating the syso for Go's
windows/amd64 race detector.

Differential Revision: https://reviews.llvm.org/D128909

Revert "[flang][OpenMP] Lowering support for default clause"

This reverts commit 05e6fce84fd39d150195b8928561f2c90c71e538.

[bazel] Port 628fbbef81c5ac806e6dbf2bce18dd44980051b1

[DAGCombine] Mask doesn't have to be (EltSize - 1) exactly when combining rotation

I think what we need is the least Log2(EltSize) significant bits are known to be ones.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130251

[libc] Use nearest_integer instructions to improve expm1f performance.

Use nearest_integer instructions to improve expf performance.

Performance tests with CORE-MATH's perf tool:

Before the patch:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput   : 10.096
System LIBC reciprocal throughput : 44.036
LIBC reciprocal throughput        : 11.575

$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency   : 42.239
System LIBC latency : 122.815
LIBC latency        : 50.122
```
After the patch:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput   : 10.046
System LIBC reciprocal throughput : 43.899
LIBC reciprocal throughput        : 9.179

$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency   : 42.078
System LIBC latency : 120.488
LIBC latency        : 41.528
```

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D130502

[libc] Use nearest_integer instructions to improve expf performance.

Use nearest_integer instructions to improve expf performance.

Performance tests with CORE-MATH's perf tool:

Before the patch:
```
$ ./perf.sh expf
LIBC-location: /home/lnt/experiment/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput   : 9.860
System LIBC reciprocal throughput : 7.728
LIBC reciprocal throughput        : 12.363

$ ./perf.sh expf --latency
LIBC-location: /home/lnt/experiment/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency   : 42.802
System LIBC latency : 35.941
LIBC latency        : 49.808
```

After the patch:
```
$ ./perf.sh expf
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput   : 9.441
System LIBC reciprocal throughput : 7.382
LIBC reciprocal throughput        : 8.843

$ ./perf.sh expf --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency   : 44.192
System LIBC latency : 37.693
LIBC latency        : 44.145
```

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D130498

[RISCV] Precommit test for D130251

Added tests won't modify the least Log2(EltSize) significant bits.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130252

[C++20] [Modules] Don't handle no linkage entities when overloading

The original implementation uses `ND->getFormalLinkage() <=
Linkage::InternalLinkage`. It is not right since the spec only says
internal linkage and it doesn't mention 'no linkage'. This matters when
we consider constructors. According to [class.ctor.general]p1,
constructors have no name so constructors have no linkage too.

[Debuginfo][llvm-dwarfutil] Add check for unsupported debug sections.

Current DWARFLinker implementation does not support some debug sections
(mainly DWARF v5 sections). This patch adds diagnostic for such sections.
The warning would be displayed for critical(such that could not be removed)
sections and the source file would be skipped. Other unsupported sections
would be removed and warning message should be displayed. The zero exit
status would be returned for both cases.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D123623

[flang] Remove fp128 support for llvm.round and llvm.trunc

The fp128 in llvm.round and llvm.trunc is not supported in X86_64 for
now. Revert the support. To support quad precision for llvm.round and
llvm.trunc, it may should be supported using runtime.

Reviewed By: Jean Perier

Differential Revision: https://reviews.llvm.org/D130556

[clang][dataflow] Add explicit "AST" nodes for implications and iff

Previously we used to desugar implications and biconditionals into
equivalent CNF/DNF as soon as possible. However, this desugaring makes
debug output (Environment::dump()) less readable than it could be.
Therefore, it makes sense to keep the sugared representation of a
boolean formula, and desugar it in the solver.

Reviewed By: sgatev, xazax.hun, wyt

Differential Revision: https://reviews.llvm.org/D130519

[NFC] Fix some C++20 warnings

Without this patch when using CMAKE_CXX_STANDARD=20 Microsoft compiler produces following warnings

clang\include\clang/Basic/DiagnosticIDs.h(48): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(49): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(50): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(51): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(52): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(53): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(54): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(55): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(56): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(57): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(58): warning C5054: operator '+': deprecated between enumerations of different types
clang\include\clang/Basic/DiagnosticIDs.h(59): warning C5054: operator '+': deprecated between enumerations of different types

Patch By: Godin

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D130476

[pseudo] Allow opaque nodes to represent terminals

This allows incomplete code such as `namespace foo {` to be modeled as a
normal sequence with the missing } represented by an empty opaque node.

Differential Revision: https://reviews.llvm.org/D130551

[libc++][NFC] Add missing SHA in ABI changelog

[libc++] Generalize the customizeable assertion handler

Instead of taking a fixed set of arguments, use variadics so that
we can pass arbitrary arguments to the handler. This is the first
step towards using the handler to handle other non-assertion-related
failures, like std::unreachable and an exception being thrown in
-fno-exceptions mode, which would improve user experience by including
additional information in crashes (right now, we call abort() without
additional information).

Differential Revision: https://reviews.llvm.org/D130507

[libc++] Remove XFAIL for libcpp_deallocate on AIX, which seems to be passing now

[gn build] Port 7a5cb15ea6fa

[bazel] Run autoformatter on BUILD.bazel