review.tizen.org Git - platform/upstream/llvm.git/log

[Libomptarget] Change device free routines to accept the allocation kind

Previous support for device memory allocators used a single free
routine and did not provide the original kind of the allocation. This is
problematic as some of these memory types required different handling.
Previously this was worked around using a map in runtime to record the
original kind of each pointer. Instead, this patch introduces new free
routines similar to the existing allocation routines. This allows us to
avoid a map traversal every time we free a device pointer.

The only interfaces defined by the standard are `omp_target_alloc` and
`omp_target_free`; unlike `omp_alloc`, these do not take a kind. The
standard dictates the following:

"The omp_target_alloc routine returns a device pointer that references
the device address of a storage location of size bytes. The storage
location is dynamically allocated in the device data environment of the
device specified by device_num."

This suggests that these routines only allocate the default kind of
device memory, so the implementation has been changed to reflect this.
The change is somewhat breaking for users who were using
`omp_target_free` as previously shown in the tests.
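
A minimal sketch of the idea, not the actual plugin interface: the
names `AllocKind`, `dataAlloc` and `dataFree` are hypothetical, and
plain `malloc`/`free` stand in for the device allocators. The point is
only that the free routine receives the kind from its caller instead of
looking it up in a pointer-to-kind map.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocation kinds; these names are illustrative only.
enum class AllocKind { Default = 0, Device, Host, Shared };

// Stand-in for per-kind memory pools: one live-allocation counter per kind.
static int LiveAllocations[4] = {0, 0, 0, 0};

// Allocate Size bytes of the requested kind (plain malloc in this sketch).
void *dataAlloc(std::size_t Size, AllocKind Kind) {
  ++LiveAllocations[static_cast<int>(Kind)];
  return std::malloc(Size);
}

// Free Ptr. The caller supplies the kind, so no pointer-to-kind map is
// consulted to decide how the memory has to be released.
void dataFree(void *Ptr, AllocKind Kind) {
  --LiveAllocations[static_cast<int>(Kind)];
  std::free(Ptr);
}
```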

Reviewed By: JonChesterfield, tianshilei1992

Differential Revision: https://reviews.llvm.org/D133053

Revert "[clang] fix generation of .debug_aranges with LTO"

This reverts commit 6bf6730ac55e064edf46915ebba02e9c716f48e8.
Breaks tests if LLD isn't being built, see comments on
https://reviews.llvm.org/D133092

[BOLT] Preserve original LSDA type encoding

In non-pie binaries BOLT unconditionally converted type encoding
from indirect to absptr, which broke std exceptions since pointers
to their typeinfo were only assigned at runtime in .data section.
In this patch we preserve original encoding so that indirect
remains indirect and can be resolved at runtime, and absolute remains absolute.

Reviewed By: rafauler, maksfb

Differential Revision: https://reviews.llvm.org/D132484

[clang] fix linker executable path in test

A previous patch (https://reviews.llvm.org/D132810) introduced a test
that fails on systems where the linker executable (`ld`) has a `.exe`
extension. This patch updates the regex in the test so that lit can
look for both `ld` as well as `ld.exe`.
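
The real fix lives in a lit/FileCheck pattern; the sketch below only
illustrates the "optional `.exe` suffix" idea with `std::regex`. The
helper name `isLinkerPath` and the exact pattern are assumptions made
for illustration.

```cpp
#include <regex>
#include <string>

// Returns true if the last path component is exactly `ld` or `ld.exe`.
// Both '/' and '\' are accepted as directory separators.
bool isLinkerPath(const std::string &Path) {
  static const std::regex LinkerRE(R"((.*[/\\])?ld(\.exe)?)");
  return std::regex_match(Path, LinkerRE);
}
```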

Reviewed By: stella.stamenova

Differential Revision: https://reviews.llvm.org/D133773

Revert "[lldb][DWARF5] Enable macro evaluation"

This reverts commit a0fb69d17b4d7501a85554010727837340e7b52f.

This broke the windows lldb bot: https://lab.llvm.org/buildbot/#/builders/83/builds/23666

Revert "[test][clang] run test for lld emitting dwarf-aranages only if lld is presented"

This reverts commit 44075cc34a9b373714b594964001ce283598eac1.
Broke check-clang, see comments on https://reviews.llvm.org/D133841

[mlir] Add accessor methods for I[2|4|16] types to Builder.

Adds the accessor methods for I[2|4|16] types to the Builder.

Differential Revision: https://reviews.llvm.org/D133793

[mlir][sparse] Make sparse compiler more admissible.

Previously, the iteration graph was computed without priority. This patch adds a heuristic for computing the iteration graph: the topological sort starts with Reduction iterators, which makes them (likely) appear as late in the sorted array as possible.

The current sparse compiler also failed to compile the newly added case.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133738

Revert "[mlir][scf][Transform] Refactor transform.fuse_into_containing_op so it is iterative and supports output fusion."

This reverts commit 54a5f606281d05203dca1d81d135e691b10bc513 which is a WIP that was pushed by mistake.

[mlir][scf][Transform] Refactor transform.fuse_into_containing_op so it is iterative and supports output fusion.

This revision revisits the implementation of `transform.fuse_into_containing_op` so that it iterates on
producers one use at a time.

Support is added to fuse a producer through a foreach_thread shared tensor argument, in which case we
tile and fuse the op inside the containing op and update the shared tensor argument to the unique destination operand.
If one cannot find such a unique destination operand the transform fails.

[mlir][Linalg] Add return type filter to the transform dialect

This allows matching ops by additionally providing an idiomatic spec for a unique return type.

Differential Revision: https://reviews.llvm.org/D133862

[SLP][NFC]Extract getLastInstructionInBundle function for better
dependence checking, NFC.

Part of D110978

[MLIR][math] Use approximate matches for folded ops

LibM implementations differ, so the folders can have different results
on different platforms. For instance, the `cos` folder was failing on M1
mac. I chose to match the constant floats to 2(.5) significant digits.
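
The patch itself loosens the expected constants in the test patterns.
As a rough illustration of the same idea in code, a comparison with a
relative tolerance could look like the following; the helper name
`approxEqual` and the default tolerance are assumptions, not values
taken from the patch.

```cpp
#include <cmath>

// Compare two floating-point values with a relative tolerance. The default
// of 5e-3 accepts differences in roughly the third significant digit.
bool approxEqual(double A, double B, double RelTol = 5e-3) {
  double Scale = std::fmax(std::fabs(A), std::fabs(B));
  return std::fabs(A - B) <= RelTol * Scale;
}
```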

Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D133797

[MLIR][Presburger] Add hermite normal form computation to Matrix

This patch adds hermite normal form computation to Matrix. Part of this algorithm
lived in LinearTransform, being used for computing column echelon form. This
patch moves the implementation to Matrix::hermiteNormalForm and generalises it
to compute the hermite normal form.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D133510

[flang][driver]Fix broken PowerPC tests

Tests don't work on PPC since the `return` instruction isn't called `ret` (apparently)

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D133859

[InstCombine] Optimize multiplication where both operands are negated

Handle the case where both operands are negated in matrix multiplication
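
The underlying identity is (-A) * (-B) == A * B. A small check on 2x2
integer matrices, written as a standalone sketch (the `Mat2` type and
helpers are hypothetical and unrelated to the InstCombine code):

```cpp
#include <array>

using Mat2 = std::array<std::array<int, 2>, 2>;

// Plain 2x2 matrix product.
Mat2 multiply(const Mat2 &A, const Mat2 &B) {
  Mat2 C{}; // value-initialized to all zeros
  for (int I = 0; I < 2; ++I)
    for (int J = 0; J < 2; ++J)
      for (int K = 0; K < 2; ++K)
        C[I][J] += A[I][K] * B[K][J];
  return C;
}

// Element-wise negation.
Mat2 negate(Mat2 A) {
  for (auto &Row : A)
    for (int &X : Row)
      X = -X;
  return A;
}
```

Since both negations cancel, `multiply(negate(A), negate(B))` equals
`multiply(A, B)`, so the two negations can be dropped.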

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D133695

Remove some unused static functions in CGOpenMPRuntimeGPU.cpp, NFC

[LLVM][AArch64] Don't warn about clobbering X16 when Speculative Load Hardening is used

SLH will fall back to a different technique if X16 is being used,
so there is no need to warn for inline asm use. Only prevent other codegen
from using it.

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D133766

[OpenMP] Remove unused function after removing simplified interface

Summary:
A previous patch removed the user of this function but did not remove
the function causing unused function warnings. Remove it.

[CMake] Avoid `LLVM_BINARY_DIR` when other more specific variables are better-suited, part 1

A simple sed doing these substitutions:

- `${LLVM_BINARY_DIR}/\$\{CMAKE_CFG_INTDIR}/lib(${LLVM_LIBDIR_SUFFIX})?\>` -> `${LLVM_LIBRARY_DIR}`
- `${LLVM_BINARY_DIR}/\$\{CMAKE_CFG_INTDIR}/bin\>` -> `${LLVM_TOOLS_BINARY_DIR}`

where `\>` means "word boundary".

The only manual modifications were reverting changes in

- `compiler-rt/cmake/Modules/CompilerRTUtils.cmake`

because these were "entry points" where we wanted to tread carefully so as not to introduce a "loop" which would end with an undefined variable being expanded to nothing.

There are many more occurrences without `CMAKE_CFG_INTDIR`, but those are left for D132316 as they have proved somewhat tricky to fix.

This hopefully increases readability overall, and also decreases the usages of `LLVM_LIBDIR_SUFFIX`, preparing us for D130586.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D133828

[MLIR][Presburger] use arbitrary-precision arithmetic with MPInt instead of int64_t

Only the main Presburger library under the Presburger directory has been switched to use arbitrary precision. Users have been changed to just cast returned values back to int64_t or to use newly added convenience functions that perform the same cast internally.

The performance impact of this has been tested by checking test runtimes after copy-pasting 100 copies of each function. Affine/simplify-structures.mlir goes from 0.76s to 0.80s after this patch. Its performance sees no regression compared to its original performance at commit 18a06d4f3a7474d062d1fe7d405813ed2e40b4fc before a series of patches that I landed to offset the performance overhead of switching to arbitrary precision.

Affine/canonicalize.mlir and SCF/canonicalize.mlir show no noticeable difference, staying at 2.02s and about 2.35s respectively.

Also, for Affine and SCF tests as a whole (no copy-pasting), the runtime remains about 0.09s on average before and after.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D129510

[analyzer] Initialize ShouldEmitErrorsOnInvalidConfigValue analyzer option

Downstream users who don't make use of the clang cc1 frontend for
command-line argument parsing won't benefit from the default
initialization of the AnalyzerOptions entries provided by the
marshalling. More about this later.
Those analyzer option fields, as they are bitfields, cannot be default
initialized at the declaration (prior to C++20), hence they are
initialized in the constructor.
The only problem is that `ShouldEmitErrorsOnInvalidConfigValue` was
forgotten.

In this patch I'm proposing to initialize that field with the rest.
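
A minimal sketch of the pattern, assuming a hypothetical `Options`
struct (the field names and default values are illustrative, not the
real `AnalyzerOptions`). Before C++20 a bit-field cannot carry a
default member initializer, so each one has to appear in the
constructor's initializer list; leaving one out leaves it
indeterminate.

```cpp
// Hypothetical options struct; not the real AnalyzerOptions.
struct Options {
  unsigned ShowStats : 1;
  unsigned EmitErrorsOnInvalidValue : 1;

  // `unsigned ShowStats : 1 = 0;` is only valid from C++20 on, so every
  // bit-field is initialized here instead. Forgetting one of them is the
  // kind of omission the patch corrects.
  Options() : ShowStats(0), EmitErrorsOnInvalidValue(0) {}
};
```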

Note that this value is read by
`CheckerRegistry.cpp:insertAndValidate()`.
The analyzer options are initialized by the marshalling at
`CompilerInvocation.cpp:GenerateAnalyzerArgs()` by the expansion of the
`ANALYZER_OPTION_WITH_MARSHALLING` xmacro to the appropriate default
value regardless of the constructor initializer list which I'm touching.
Because this only affects users using CSA as a library, I believe we
cannot test this without serious effort.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D133851

[OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

Previously, we linked in the ROCm device libraries which provide math
and other utility functions late. This is not strictly correct as this
library contains several flags that are only set per-TU, such as fast
math or denormalization. This patch changes this to pass the bitcode
libraries per-TU using the same method we use for the CUDA libraries.
This has the advantage that we correctly propagate attributes making
this implementation more correct. Additionally, many annoying unused
functions were not being fully removed during LTO. This led to
erroneous warning messages and remarks on unused functions.

I am not sure if not finding these libraries should be a hard error. Let
me know if it should be demoted to a warning saying that some device
utilities will not work without them.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D133726

[OpenMP] Remove simplified device runtime handling

The old device runtime had a "simplified" version that prevented many of
the runtime features from being initialized. The old device runtime was
deleted in LLVM 14 and is no longer in use. Selectively deactivating
features is now done using specific flags rather than the old technique.
This patch simply removes the extra logic required for handling the old
simple runtime scheme.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D133802

[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI)

Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.

To give two examples of why this is useful:

* D117095 highlights a weakness of ModRef modelling in the presence
  of operand bundles. For a memcpy call with deopt operand bundle,
  we want to say that it can read any memory, but only write argument
  memory. This would allow them to be treated like any other calls.
  However, we currently can't express this and have to say that it
  can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
  where threadid Refs outside pre-split coroutines can be ignored
  (like other accesses to constant memory). The current representation
  does not allow modelling this precisely.

The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.
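
To make the per-location idea concrete, here is a small sketch that
packs two ModRef bits per location into one integer. The class name
`PerLocationModRef` and its layout are assumptions for illustration and
are not the encoding used by the patch.

```cpp
#include <cstdint>

enum ModRefInfo : std::uint8_t { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };
enum Location : unsigned { ArgMem = 0, InaccessibleMem = 1, Other = 2 };

// Hypothetical container: two bits of ModRef information per location.
class PerLocationModRef {
  std::uint32_t Data = 0;

public:
  void set(Location Loc, ModRefInfo MR) {
    Data &= ~(3u << (2 * Loc));                      // clear the old bits
    Data |= static_cast<std::uint32_t>(MR) << (2 * Loc);
  }
  ModRefInfo get(Location Loc) const {
    return static_cast<ModRefInfo>((Data >> (2 * Loc)) & 3u);
  }
};

// The memcpy-with-deopt-bundle case from the first example: the call may
// read any memory, but it only writes argument memory.
PerLocationModRef memcpyWithDeoptBundle() {
  PerLocationModRef B;
  B.set(ArgMem, ModRef);
  B.set(InaccessibleMem, Ref);
  B.set(Other, Ref);
  return B;
}
```

With a single global ModRefInfo plus a location set, the same call
could only be described as "may read or write any memory".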

Differential Revision: https://reviews.llvm.org/D130896

[ConstraintElimination] Clear new indices directly in getConstraint(NFC)

Instead of checking if any of the new indices has a non-zero coefficient
before using the constraint, do this directly when constructing the
constraint.

[MLIR] Fix toy lit substitutions

The tools are called e.g. `toyc-ch1`, not `toy-ch1`.

Add missing toyc-ch6/7.

It turns out that the other substitutions are not needed, though more because of specific circumstances than by design:
The lit test exec root is set to build/mlir/test, which is where all the test tools are placed by CMake and we wouldn't need to substitute them at all.
We shouldn't rely on this assumption though, because it will make things harder for standalone tests and other build systems.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D133842

Fix heap-use-after-free when clearing DIEs in fission compile units.

D131437 caused heap-use-after-free failures when testing TestCreateAfterAttach.py in asan mode, and "regular" crashes outside of asan.

This appears to be due to a mismatch in a couple places where we choose to clear the DIEs. When we clear the DIE of a skeleton unit, we unconditionally clear the DIE of the DWO unit if it exists. However, `~ScopedExtractDIEs()` only looks at the skeleton unit when deciding to clear. If we decide to clear the skeleton unit because it is now unused, we end up clearing the DWO unit that _is_ used. This change adds a guard by checking `m_cancel_scopes` to prevent clearing the DWO unit.

This is 100% reproducible by running TestCreateAfterAttach.py in asan mode, although it only seems to reproduce in our internal build, so no test case is added here. If someone has suggestions on how to write one, I can add it.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D133790

[AArch64] Disable nontemporal load for Big Endian

The current code for generating nontemporal load outputs the wrong assembly for big endian architecture.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D133789

[InstCombine] try multi-use demanded bits folds for 'add'

This patch enables a multi-use demanded bits fold (motivated by issue #57576):
https://alive2.llvm.org/ce/z/DsZakh

This mimics transforms that we already do on the single-use path.

Originally, this patch did not include the last part to form a constant, but
that can be removed independently to reduce risk. It's not clear what the
effect of either change will be when viewed end-to-end.

This is expected to be neutral or a slight win for compile-time.
See the "add-demand2" series for experimental timing results:
https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions&remote=rotateright

Differential Revision: https://reviews.llvm.org/D133788

[SLP] Move getInsertIndex function, NFC.

Part of D110978.

[flang][driver]Fix broken flang-new mlir test

The test was added as a .mlir file, and this extension is not
in the lit.cfg.py, so it was never run. When running it, the
file would produce an error, as semicolon is not an MLIR comment.

This adds the extension and fixes the comment start by using C++
style comments.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D133792

[AArch64] Add nontemporal load tests for big endian.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D133765

[AA] Remove unnecessary intersections from getModRefBehavior() (NFC)

Intersection with other providers is performed by AAResults. Doing
this here is both pointless and confusing.

[ConstraintElimination] Further de-compose operands of add operations.

This simply extends the existing logic to look through adds and combine
the components as done in other places already.

[CostModel][X86] getArithmeticInstrCost - move GLM/SLM custom costs AFTER constant shift -> multiply canonicalization

Corrects the shift by constant costs to better account for them being converted to multiplies for lowering - which demonstrates that we should probably be trying harder NOT to convert these to multiplies for some CPUs (v4i32 in particular).

[CostModel][X86] Fix throughput costs for AVX512BW v32i16 shifts

Fixes regression from a931dbfbd30754cf39897037a223eee60ae9e855

[lldb] Enable (un-xfail) some dwarf tests for arm

These are passing now that the relocation assertion has been removed in
D132954.

Relocations still remain unimplemented though, so it's possible this may
start to fail due to unrelated changes. If that happens very often, we
may just need to disable (skip) the test instead.

[ConstraintElimination] Add tests where info from zext can be used.

[lldb][DWARF5] Enable macro evaluation

Patch enables handling of DWARFv5 DW_MACRO_define_strx and DW_MACRO_undef_strx

~~~

OS Laboratory. Huawei RRI. Saint-Petersburg

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D130062

[MIR] Support printing and parsing pcsections

Adds support for printing and parsing PC sections metadata in MIR.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D133785

[ConstraintElimination] Add tests for chained adds.

Add test coverage for reasoning about chains of adds.

[test][clang] run test for lld emitting dwarf-aranages only if lld is presented

Fixes: https://reviews.llvm.org/D133092
CI: https://lab.llvm.org/buildbot/#/builders/109/builds/46592

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D133841

[libc][Obvious] Fix typo in the alternate path of the POSIX "access" function.

[libc] Add implementation of POSIX function "access".

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D133814

[clang][Interp] Remove struct from a testcase

This should fix the leak sanitizer breakage introduced by
https://reviews.llvm.org/D132997, e.g.
https://lab.llvm.org/buildbot/#/builders/5/builds/27410

[C++20] [Coroutines] Prefer sized deallocation in promise_type

Currently the compiler can't correctly find the sized deallocation
function in promise_type if there are multiple deallocation function
overloads there.

According to [dcl.fct.def.coroutine]p12:
> If both a usual deallocation function with only a pointer parameter
> and a usual deallocation function with both a pointer parameter and a
> size parameter are found, then the selected deallocation function
> shall be the one with two parameters.

So when there are multiple deallocation functions, the compiler should
choose the sized one instead of the unsized one. The patch fixes this.

[flang] Make a descriptor copy for fir.load fir.ref<fir.box>

`fir.box` and `fir.ref<fir.box>` are both lowered to LLVM as a
descriptor in memory. This is because fir.box of polymorphic and assumed
rank entities cannot be known at compile time, so fir.box cannot be
lowered to a struct value.

fir.load of fir.ref<fir.box> was previously lowered to a no-op,
propagating the operand descriptor storage as a result.
This is wrong because the operand descriptor storage may later be
modified, and these changes should not be visible in the loaded fir.box
that is an immutable SSA value.

Modify fir.load codegen for fir.box to make a copy into a new storage to
ensure the fir.box is immutable.

Differential Revision: https://reviews.llvm.org/D133779

[amdgpu] Expand all ConstantExpr users of LDS variables in instructions

Bug noted in D112717 can be sidestepped with this change.

Expanding all ConstantExpr involved with LDS up front makes the variable specialisation simpler. Excludes ConstantExpr that don't access LDS to avoid disturbing codegen elsewhere.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D133422

[Support] Access threadIndex via a wrapper function

On Unix platforms, this wrapper function is inline, so it should
expand to the same direct access to the thread local variable. On
Windows, it's a non-inline function within Parallel.cpp, allowing
making the thread_local variable static.

Windows Native TLS doesn't support direct access to thread local
variables in a different DLL, and GCC/binutils on Windows occasionally
has problems with non-static thread local variables too.
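
A sketch of the wrapper pattern described above, with hypothetical
names (`ThreadIndex`, `getThreadIndex`, `setThreadIndex`); the real
code lives in Parallel.cpp and differs in detail. The variable has
internal linkage, and other code reaches it only through functions.

```cpp
namespace {
// Internal linkage: no other translation unit (or DLL) refers to the
// thread-local variable directly.
thread_local unsigned ThreadIndex = 0;
} // namespace

// Out-of-line accessors. Callers in other translation units go through
// these functions instead of touching the variable.
unsigned getThreadIndex() { return ThreadIndex; }
void setThreadIndex(unsigned Index) { ThreadIndex = Index; }
```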

This fixes mingw dylib builds with native TLS after
e6aebff67426fa0f9779a0c19d6188a043bf15e7.

At the same time, move the whole thread local variable within
#if LLVM_ENABLE_THREADS
to fix builds without threading support.

Differential Revision: https://reviews.llvm.org/D133759

AMDGPU: Factor out hasDivergentBranch(). NFC

This is helpful for detecting whether a block ends with divergent branch
in passes before lowering the pseudo control flow instructions.

Differential Revision: https://reviews.llvm.org/D133184

[HLSL]Add -O and -Od option for dxc mode.

Two new options, -O and -Od, are added for dxc mode.
-O is just an alias of the existing cc1 -O option.
-Od will be lowered into -O0 and -dxc-opt-disable.

-dxc-opt-disable is a cc1 option added for building ShaderFlags.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D128845

[AArch64InstPrinter] Introduce register markup tags emission

AArch64 assembly syntax emission now leverages markup tags for registers, if enabled.

Reviewed By: MaskRay, david-arm

Differential Revision: https://reviews.llvm.org/D129870

[llvm-dwp] Report the filename if it cannot be found

For now, we report nothing if the executable/dwo file is missing, which is confusing.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D133549

[RISCV] Add cost model for vector insert/extract element.

This patch adds a cost model for vector insert/extract element instructions. In RVV, we can use a vector scalar move instruction to insert or extract the first element, and use vslide to move it. But for a mask vector, or an i64 vector on an i32 target, we need special instructions to do this.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133007

[NVPTX] Use MBB.begin() instead of MBB.front() in NVPTXFrameLowering::emitPrologue

The second argument of `NVPTXFrameLowering::emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB)` is the first MBB of the MF. The function assumes the first MBB always contains instructions, so it gets the first instruction with `MachineInstr *MI = &MBB.front();`. However, with the attached reproducer/test case, all instructions in the first MBB are cleared by a previous pass for stack coloring. As a consequence, `MBB.front()` triggers an assertion because the first node is actually a sentinel node. Hence we now use `MachineBasicBlock::iterator` to iterate over the MBB.
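
The same hazard exists for standard containers, which makes for a
compact analogy. In the sketch below a `std::list<int>` stands in for
the basic block and the function name `insertAtStart` is hypothetical:
`front()` on an empty list is undefined behaviour, while `begin()` is
always a valid insertion point (it simply equals `end()`).

```cpp
#include <list>

// Insert Inst at the start of Block. Using begin() as the insertion point
// is valid even when Block is empty, whereas taking &Block.front() first
// would not be.
void insertAtStart(std::list<int> &Block, int Inst) {
  Block.insert(Block.begin(), Inst);
}
```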

Fix #52623.

Differential Revision: https://reviews.llvm.org/D132663

[RISCV] Transform VMERGE_VVM_<LMUL>_TU with all ones mask to VADD_VI_<LMUL>_TU.

The transformation is beneficial because vmerge.vvm always needs mask operand but
vadd.vi may not.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D133255

[RISCV] Lower BUILD_VECTOR to RISCVISD::VID_VL if it is floating-point type.

Differential Revision: https://reviews.llvm.org/D133688

[LoongArch] Categorize code by function. NFC.

Differential Revision: https://reviews.llvm.org/D133754

[RISCV] Assemble `call foo` to R_RISCV_CALL_PLT

R_RISCV_CALL/R_RISCV_CALL_PLT distinction isn't necessary. R_RISCV_CALL has been
deprecated as a resolution to
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/98 .

ld.lld and mold treat the two relocation types the same. GNU ld has a custom
handling for undefined weak functions which is unnecessary: calling an
unresolved undefined weak function is UB and GNU ld can handle the case without
a relocation error (such a function call is usually guarded by a zero value
check and should be allowed).

This patch assembles `call foo` to use R_RISCV_CALL_PLT instead of the
deprecated R_RISCV_CALL.

Note: the code generator still differentiates `call foo` and (maybe preemptible)
`call foo@plt`, but the difference is purely aesthetic.

Note: D105429 does not support R_RISCV_CALL_PLT correctly. Changed the test to
force R_RISCV_CALL for now.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D132530

[lld-macho][nfci] Don't include null terminator in StringRefs

So @keith observed
[here](https://reviews.llvm.org/D128108#inline-1263900) that the
StringRefs we were returning from `CStringInputSection::getStringRef()`
included the null terminator in their total length, but regular
StringRefs do not. Let's fix that so these StringRefs are less confusing
to use.
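
For illustration, a sketch using `std::string_view` in place of
`StringRef`; the helper `getStringAt` is hypothetical and is not the
`CStringInputSection` code. It returns the string that starts at a
given offset in a buffer of NUL-terminated strings, with the terminator
left out of the length.

```cpp
#include <cstddef>
#include <string_view>

// Return the string starting at Offset in Data, which holds consecutive
// NUL-terminated strings. The terminator is not part of the result.
std::string_view getStringAt(std::string_view Data, std::size_t Offset) {
  std::size_t End = Data.find('\0', Offset);
  if (End == std::string_view::npos)
    return Data.substr(Offset); // unterminated tail: take the rest
  return Data.substr(Offset, End - Offset);
}
```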

Reviewed By: #lld-macho, keith, Roger

Differential Revision: https://reviews.llvm.org/D133728

[mlir][sparse] minor merger API simplification

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D133821

[Object][COFF] Allow section symbol to be common symbol

I ran into an lld-link error due to a symbol named ".idata$4" coming from some
static library:
  .idata$4 should not refer to special section 0.

Here is the symbol table entry for .idata$4:

  Symbol {
      Name: .idata$4
      Value: 3221225536
      Section: IMAGE_SYM_UNDEFINED (0)
      BaseType: Null (0x0)
      ComplexType: Null (0x0)
      StorageClass: Section (0x68)
      AuxSymbolCount: 0
  }

The symbol .idata$4 is a section symbol (IMAGE_SYM_CLASS_SECTION) and LLD
currently handles it as a regular defined symbol since isCommon() returns false
for this symbol. This results in the error ".idata$4 should not refer to special
section 0" because lld-link asserts that regular defined symbols should not
refer to section 0.

Should this symbol be handled as a common symbol instead? LLVM currently only
allows external symbols (IMAGE_SYM_CLASS_EXTERNAL) to be common
symbols. However, the PE/COFF spec (see section "Section Number Values") does
not seem to mention this restriction. Any thoughts?

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D133627

[mlir][vector] Fold scalar vector.extract of non-splat n-D constants

Add a new pattern to fold `vector.extract` over n-D constants that extract scalars.
The previous code handled ND splat constants only. The new pattern is conservative and does not handle sub-vector constants.

This is to aid the `arith::EmulateWideInt` pass which emits a lot of 2-element vector constants.

Reviewed By: Mogball, dcaballe

Differential Revision: https://reviews.llvm.org/D133742

[flang] Write semantics test for atomic_fetch_xor

Write a semantics test for the atomic intrinsic subroutine,
atomic_fetch_xor.

Reviewed By: rouson

Differential Revision: https://reviews.llvm.org/D133704

[test] [fuzzer] Enable tests for iossim, disable for ios (update2)

The fuzzer tests cross_over.test and merge-control-file.test are not handled
correctly on ios device testing. On-device testing requires the macros %t, %s,
etc. to be expanded for a different default directory than when testing on host.

rdar://99889376

Differential Revision: https://reviews.llvm.org/D133811

Address feedback in https://reviews.llvm.org/D133637

https://reviews.llvm.org/D133637 fixes the problem where we should hash raw content of
register mask instead of the pointer to it.

Fix the same issue in `llvm::hash_value()`.

Remove the added API `MachineOperand::getRegMaskSize()` to avoid potential confusion.

Add an assert to emphasize that we probably should hash a machine operand iff it has
associated machine function, but keep the fallback logic in the original change.
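
A sketch of the difference, assuming the mask is a vector of 32-bit
words; the function name `hashMaskContents` and the FNV-1a-style mixing
are illustrative choices, not what `llvm::hash_value()` uses. Hashing
the words makes two masks with equal contents hash equally even when
they live at different addresses, which hashing the pointer cannot
guarantee.

```cpp
#include <cstdint>
#include <vector>

// Hash the contents of a register mask (FNV-1a-style mixing over words).
std::uint64_t hashMaskContents(const std::vector<std::uint32_t> &Mask) {
  std::uint64_t Hash = 1469598103934665603ULL; // FNV offset basis
  for (std::uint32_t Word : Mask) {
    Hash ^= Word;
    Hash *= 1099511628211ULL; // FNV prime
  }
  return Hash;
}
```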

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D133747

[IR] Add alignment for llvm.threadlocal.address

This diff sets the alignment attribute for the return value
and the argument of llvm.threadlocal.address.

(https://github.com/llvm/llvm-project/issues/57438)

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D133741

[WebAssembly] Improve codegen for shuffles with undefined lane indices

For undefined lane indices, fill the mask with {0..N} instead of zeros to allow
further reduction to word/dword shuffle on the VM.
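
One reading of "{0..N}" is that each undefined lane is filled with its
own lane index. Under that assumption, and with `-1` marking an
undefined lane, the fill step could be sketched as follows (the
function name `fillUndefLanes` is hypothetical):

```cpp
#include <cstddef>
#include <vector>

// Replace every undefined lane index (negative value) with the lane's own
// position, leaving defined lanes untouched.
std::vector<int> fillUndefLanes(std::vector<int> Mask) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] < 0)
      Mask[I] = static_cast<int>(I);
  return Mask;
}
```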

Reviewed By: tlively, penzn

Differential Revision: https://reviews.llvm.org/D133473

[Lex/DependencyDirectivesScanner] Handle the case where the source line starts with a `tok::hashhash`

Differential Revision: https://reviews.llvm.org/D133674

Add mach-o corefile support for platform binaries

Add support for recognizing a platform binary in the ObjectFileMachO
method that parses the "load binary" LC_NOTEs in a corefile.

A bit of reorganization to ProcessMachCore::DoLoadCore to separate
all of the unrelated things being done in that method into their own
separate methods, as well as small fixes to improve the handling of
a corefile with multiple kernel images in the corefile.

Differential Revision: https://reviews.llvm.org/D133680
rdar://98754861

RegisterCoalescer: Fix verifier error when merging copy of undef

There's no real read of the register, so the copy introduced a new
live value. Make sure we introduce a replacement implicit_def instead
of just erasing the copy.

Found from llvm-reduce since it tries to set undef on everything.

[lldb][fuzz] Allow expression fuzzer to be passed as a flag.

The expression fuzzer checks an environment variable, `LLDB_FUZZER_TARGET`, to get the fuzzer target binary. This is fine, but internally our tooling for running fuzz tests only has proper handling for flag values. It's surprisingly complicated to add support for that, and allowing it to be passed via flag seems reasonable anyway.

Reviewed By: cassanova

Differential Revision: https://reviews.llvm.org/D133546

[RISCV] Add MIR comments for VecPolicy operands

Analogous to what we already do for SEW operands, aimed at making the resulting MIR readable by a human.

[clang] fix generation of .debug_aranges with LTO

Right now in case of LTO the section is not emitted:

    $ cat test.c
    void __attribute__((optnone)) bar()
    {
    }
    void __attribute__((optnone)) foo()
    {
            bar();
    }
    int main()
    {
            foo();
    }

    $ clang -flto=thin -gdwarf-aranges -g -O3 test.c
    $ eu-readelf -waranges a.out  | fgrep -c -e foo -e bar
    0

    $ clang -gdwarf-aranges -g -O3 test.c
    $ eu-readelf -waranges a.out  | fgrep -c -e foo -e bar
    2

Fix this by passing explicitly -mllvm -generate-arange-section.

P.S. This looks like a hack, though, since no -mllvm options were passed
to lld before.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Suggested-by: OCHyams <orlando.hyams@sony.com>
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D133092

Don't populate the symbol table with symbols that don't belong to a section with the flag SHF_ALLOC

When populating the symbol table for an ELF object file, don't insert any symbols that come from ELF sections which don't have runtime allocated memory (typically debugging symbols).
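
A sketch of the filter, using simplified hypothetical `Section` and
`Symbol` structs rather than the real object-file classes. The only
fact taken from the ELF specification is the flag value: `SHF_ALLOC` is
`0x2`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint64_t kShfAlloc = 0x2; // SHF_ALLOC in the ELF specification

// Simplified stand-ins for the real section and symbol records.
struct Section {
  std::uint64_t Flags;
};
struct Symbol {
  std::string Name;
  std::size_t SectionIndex;
};

// Keep only symbols whose section occupies memory at run time.
std::vector<Symbol> keepAllocatedSymbols(const std::vector<Symbol> &Symbols,
                                         const std::vector<Section> &Sections) {
  std::vector<Symbol> Kept;
  for (const Symbol &Sym : Symbols)
    if (Sym.SectionIndex < Sections.size() &&
        (Sections[Sym.SectionIndex].Flags & kShfAlloc))
      Kept.push_back(Sym);
  return Kept;
}
```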

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D133795

llvm-reduce: Add undef to new subregister IMPLICIT_DEFs

This avoids a verifier error from the other unused lanes when
LiveIntervals is used.

llvm-reduce: Fix missing undef flags in some tests

These caused failures when LiveIntervals is used by the verifier. Also
fix some other errors that appear with subranges enabled.

llvm-reduce: Use FileCheck instead of python for interestingness test

Also avoid using cat for no reason.

[RISCV] Simplify operand index calculation in createMIROperandComment [nfc]

Revert "Be more careful to maintain quoting information when parsing commands."

This reverts commit 6c089b2af5d8d98f66b27b67f70958f520820a76.

This was causing the test test_help_run_hides_options from TestHelp.py to
fail on Linux and Windows (but the test succeeds on macOS). The decision
to print option information is determined by CommandObjectAlias::IsDashDashCommand
which was changed, but only by replacing an inline string constant with a const char *
CommandInterpreter::g_argument which has the same string value. I can't see why this
would fail; I'll have to spin up a VM to see if I can repro it there.

Revert "constexpr isn't right here."

This didn't help either.

This reverts commit 8433b210839ed655852428ba8b34bb67b191957a.

Revert "Trying to understand the TestHelp.py failure from 6c089b2."

It didn't help.

This reverts commit 81f8788528886ee611041e1f4ee54eea8bbfa277.

Revert "Make sure libLLVM users link with libatomic if needed"

Adds too many dependencies: many libraries in LLVM_SYSTEM_LIBS are
arguably not required for users of libLLVM.

This reverts commit 44ffc13f2eb6188a86ae88ea1e942e9ac354db9b.

Add virtual-base-class example to AMDGPUDwarfExtensionAllowLocationDescriptionOnTheDwarfExpressionStack.md

Differential Revision: https://reviews.llvm.org/D133791

[ELF][Distributed ThinLTO] Do not generate empty index when bitcode object is linked

When the same bitcode object file is given multiple times on the command line
as a lazy object file, an empty index is generated which overwrites the one from thinlink.
This could cause undefined symbols during the final link.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D133740

[BOLT] Stop using std::iterator (NFC)

Without this patch, I get warnings like:

  bolt/include/bolt/Core/BinaryContext.h:108:19: error:
  'iterator<std::bidirectional_iterator_tag,
  llvm::bolt::BinarySection>' is deprecated
  [-Werror,-Wdeprecated-declarations]

This patch fixes those warnings by defining iterator_category,
value_type, etc.
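
As a rough illustration of the technique (not the BOLT code itself), the sketch below declares the five member types that `std::iterator` used to supply, directly inside the iterator class. The `DerefIterator` class and its element type are hypothetical and chosen only to keep the example small.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical iterator over a vector of pointers that yields references to
// the pointees. It does not inherit from the deprecated std::iterator.
class DerefIterator {
  using Inner = std::vector<int *>::const_iterator;
  Inner It;

public:
  // The member types that std::iterator used to provide.
  using iterator_category = std::bidirectional_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = int *;
  using reference = int &;

  DerefIterator() = default;
  explicit DerefIterator(Inner I) : It(I) {}

  reference operator*() const { return **It; }
  pointer operator->() const { return *It; }

  DerefIterator &operator++() { ++It; return *this; }
  DerefIterator operator++(int) { DerefIterator Tmp = *this; ++It; return Tmp; }
  DerefIterator &operator--() { --It; return *this; }
  DerefIterator operator--(int) { DerefIterator Tmp = *this; --It; return Tmp; }

  friend bool operator==(const DerefIterator &A, const DerefIterator &B) {
    return A.It == B.It;
  }
  friend bool operator!=(const DerefIterator &A, const DerefIterator &B) {
    return !(A == B);
  }
};
```

Because the member types are present, `std::iterator_traits<DerefIterator>` and algorithms such as `std::distance` keep working without the deprecated base class.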

This patch intentionally leaves duplicate types like FilterIterator::T
and FilterIterator::PointerT intact to avoid mixing the fix and the
cleanup.

Differential Revision: https://reviews.llvm.org/D133650

Trying to understand the TestHelp.py failure from 6c089b2.

Sadly, the test passes on macOS, but fails on Ubuntu & Win.  The
extra option printing is supposed to be suppressed by the return
from CommandObjectAlias::IsDashDashCommand.  That was changed, but just
by replacing an inline string compare with a const string from
CommandInterpreter.  Putting the old version back temporarily to
see if that is really the problem.

[HLSL] Adding a test change I forgot to add

This test just verifies that even at -O0 the buffer subscript operators
are inlined. The original change was
fb5baffc28c8beaf790a2fb1c8a863d29020bfbe.

[DirectX backend] Remove Attribute not for DXIL on CallInst

Remove attributes on CallInst that are not valid for DXIL when preparing for DXIL.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D133279

[HLSL] Mark buffer subscript operators as AlwaysInline

HLSL requires aggressive inlining for resource accesses. This just
enforces that we get resource handle accesses inlined early.

[gn build] port fc04749957f1

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D133794

[mlir][sparse] Add sparse_tensor.select operation

The new select operation allows filtering of sparse tensors
by conditionally keeping or removing each element. This
can be used to remove negative values or select the upper
triangle of a matrix.

The select op has a single region which operates on a single
value and must return a boolean True to keep or False to drop.
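
The filtering semantics can be sketched outside MLIR as a predicate applied to each stored element. This is only an analogy in plain C++, not the op's syntax or implementation: the `Entry` struct and `SelectEntries` helper are hypothetical, and letting the predicate read the coordinates directly is a simplification made here so that the upper-triangle case fits in one function.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical coordinate-format entry of a sparse matrix.
struct Entry {
  std::size_t row;
  std::size_t col;
  double value;
};

// Keep the stored entries for which the predicate returns true and drop the
// rest. Elements that are not stored are never visited.
std::vector<Entry> SelectEntries(const std::vector<Entry> &stored,
                                 const std::function<bool(const Entry &)> &keep) {
  std::vector<Entry> result;
  for (const Entry &e : stored) {
    if (keep(e))
      result.push_back(e);
  }
  return result;
}
```

Dropping negative values then corresponds to a predicate such as `e.value >= 0.0`, and selecting the upper triangle to `e.col >= e.row`.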

Reviewed by: aartbik

Differential Revision: https://reviews.llvm.org/D133569

[Clang] [Sema] Ignore invalid multiversion function redeclarations

If a redeclaration of a multiversion function is invalid,
it may be in a broken condition (for example, missing an important
attribute). We shouldn't analyze invalid redeclarations.

Fixes https://github.com/llvm/llvm-project/issues/57343

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D133641

[HLSL] Call global destructors from entries

HLSL doesn't have a C++ runtime that supports `atexit` registration. To
enable global destructors we instead rely on the `llvm.global_dtors`
mechanism.

This change disables `atexit` generation for HLSL and updates the HLSL
code generation to call global destructors on the exit from entry
functions.

Depends on D132977.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D133518

constexpr isn't right here.

[Formatters][NFCI] Replace 'is_regex' arguments with an enum.

Modify `SBTypeNameSpecifier` and `lldb_private::TypeMatcher` so they
have an enum value for the type of matching to perform instead of an
`m_is_regex` boolean value.
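
The shape of such a change can be sketched as below. The names `MatchKind` and `SketchMatcher` are hypothetical and are not the LLDB API; the sketch only shows how an enum lets a third matching strategy be added later as one more case, which a boolean flag could not express.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical replacement for a boolean "is_regex" flag.
enum class MatchKind { Exact, Regex };

// Hypothetical matcher, loosely modelled on the idea of a type-name matcher.
class SketchMatcher {
  std::string Pattern;
  MatchKind Kind;

public:
  SketchMatcher(std::string P, MatchKind K) : Pattern(std::move(P)), Kind(K) {}

  bool Matches(const std::string &TypeName) const {
    switch (Kind) {
    case MatchKind::Exact:
      return TypeName == Pattern;
    case MatchKind::Regex:
      // Search semantics: the pattern may match anywhere unless anchored.
      return std::regex_search(TypeName, std::regex(Pattern));
    }
    return false; // unreachable for valid enum values
  }
};
```

A callback-based kind would become a new enumerator plus one more `case`, leaving existing callers that pass `Exact` or `Regex` unchanged.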

This change paves the way for introducing formatter matching based on
the result of a python callback in addition to the existing name-based
matching. See the RFC thread at
https://discourse.llvm.org/t/rfc-python-callback-for-data-formatters-type-matching/64204
for more details.

Differential Revision: https://reviews.llvm.org/D133240

[AMDGPU] Don't shrink VOP3 instructions pre-RA on GFX10+

In GFX10, there is no advantage to shrinking these instructions pre-RA,
so this just saves a bit of work.

In GFX11 there is an advantage to *not* shrinking them pre-RA, because
the register classes for 16-bit operands are less restrictive in the
VOP3 form than in the shrunk form. This patch is a prerequisite for
actually setting up those register classes correctly for 16-bit vs
non-16-bit operands.

Differential Revision: https://reviews.llvm.org/D133769

[SelectOpti] Fix lifetime intrinsic bug

When a select is converted to a branch and load instructions are sunk to the true/false blocks,
lifetime intrinsics (if present) could be made unsound if not moved.

This conservatively moves all lifetime intrinsics in a transformed BB to the end block to ensure
that lifetime semantics are preserved.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D133777

[DX] DXContainer does not support COMDAT

The DXContainer format is pretty primitive and doesn't support COMDAT. We need
to set that in the Triple so that Clang won't try to emit COMDATs.