review.tizen.org Git - platform/upstream/llvm.git/log

[AMDGPU] More GFX11 test coverage

[InstCombine] Push freeze through recurrence phi

We really want to push freezes through recurrence phis, so that we
freeze only the start value, rather than the IV value on every
iteration. foldOpIntoPhi() already handles this for the case where
the transfer function doesn't produce poison, e.g.
%iv.next = add %iv, 1. However, this does not work if nowrap flags
are present, e.g. the very common %iv.next = add nuw %iv, 1 case.

This patch adds a fold that pushes freeze instructions to the start
value by checking whether all backedge values will be non-poison
after poison generating flags have been dropped. This allows pushing
freezes out of loops in most cases. I suspect that this also
obsoletes the CanonicalizeFreezeInLoops pass, and we can probably
drop it.

Fixes https://github.com/llvm/llvm-project/issues/56048.

Differential Revision: https://reviews.llvm.org/D127960

Revert "[clang] Dont print implicit forrange initializer"

This reverts commit 32805e60c9de1f82887cd2af30d247dcabd2e1d3.
Broke check-clang, see comments on https://reviews.llvm.org/D127863

[InstCombine] add tests for FP casts; NFC

[InstCombine] add tests for (pow2 >> X) >u C; NFC

[SelectionDAG] Extend WidenVecOp_INSERT_SUBVECTOR to cover more cases.

WidenVecOp_INSERT_SUBVECTOR only supported cases where widening
effectively converts the insert into a copy. However, when the
widened subvector is no bigger than the vector being inserted into
and we can be sure there's no loss of data, we can simply emit
another INSERT_SUBVECTOR.

Fixes: #54982

Differential Revision: https://reviews.llvm.org/D127508

[gn build] (semi-manually) port 232bd331cbaa

[lldb] [test] Update baseline test status for FreeBSD

Fixes #19721
Fixes #18440
Partially fixes bug #47660
Fixes #47761
Fixes #47763

Sponsored by: The FreeBSD Foundation

[mlir][bufferize][NFC] Remove BufferizationState

With the recent refactorings, this class is no longer needed. We can use BufferizationOptions in all places were BufferizationState was used.

Differential Revision: https://reviews.llvm.org/D127653

[Clang] Allow 'Complex float __attribute__((mode(HC)))'

Adding half float to types that can be represented by __attribute__((mode(xx))).
Original implementation authored by George Steed.

Differential Revision: https://reviews.llvm.org/D126479

[mlir][bufferize] Bufferize after TensorCopyInsertion

This change changes the bufferization so that it utilizes the new TensorCopyInsertion pass. One-Shot Bufferize no longer calls the One-Shot Analysis. Instead, it relies on the TensorCopyInsertion pass to make the entire IR fully inplacable. The `bufferize` implementations of all ops are simplified; they no longer have to account for out-of-place bufferization decisions. These were already materialized in the IR in the form of `bufferization.alloc_tensor` ops during the TensorCopyInsertion pass.

Differential Revision: https://reviews.llvm.org/D127652

[AMDGPU] Use explicit -global-isel=0/1 in tests. NFC.

[LLVM][IR] Fix typo in DerivedTypes.h (NFC)

[AArch64][LV] AArch64 does not prefer vectorized addressing

TTI::prefersVectorizedAddressing() try to vectorize the addresses that lead to loads.
For aarch64, only gather/scatter (supported by SVE) can deal with vectors of addresses.
This patch specializes the hook for AArch64, to return true only when we enable SVE.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D124612

[clang] Dont print implicit forrange initializer

Fixes https://github.com/clangd/clangd/issues/1158

Differential Revision: https://reviews.llvm.org/D127863

[IR] Check for SignedMin/-1 division in canTrap() (PR56038)

In addition to division by zero, signed division also traps for
SignedMin / -1. This was handled in isSafeToSpeculativelyExecute(),
but not in Constant::canTrap().

[mlir] replace 'emit_c_wrappers' func->llvm conversion option with a pass

The 'emit_c_wrappers' option in the FuncToLLVM conversion requests C interface
wrappers to be emitted for every builtin function in the module. While this has
been useful to bootstrap the interface, it is problematic in the longer term as
it may unintentionally affect the functions that should retain their existing
interface, e.g., libm functions obtained by lowering math operations (see
D126964 for an example). Since D77314, we have a finer-grain control over
interface generation via an attribute that avoids the problem entirely. Remove
the 'emit_c_wrappers' option. Introduce the '-llvm-request-c-wrappers' pass
that can be run in any pipeline that needs blanket emission of functions to
annotate all builtin functions with the attribute before performing the usual
lowering that accounts for the attribute.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D127952

[libc][bazel] Remove memcpy dependency in memmove

[AArch64] NFC: Fix BFMLAL[BT] inst def names

[msan] Allow KMSAN to use -fsanitize-memory-param-retval

Let -fsanitize-memory-param-retval be used together with
-fsanitize=kernel-memory, so that it can be applied when building the
Linux kernel.

Also add clang/test/CodeGen/kmsan-param-retval.c to ensure that
-fsanitize-memory-param-retval eliminates shadow accesses for parameters
marked as undef.

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D127860

[OpenCL] Fix atomic_fetch_add/sub half overloads

Some of the atomic_fetch_add and atomic_fetch_sub overloads intended
for atomic_half types accidentally had an atomic_float parameter.

[InstCombine] Add tests for freeze of recurrence with invoke start (NFC)

[LLDB] XFAIL TestLoadUnload fails on Arm/Ubuntu Jammy

This patch marks following tests as XFAIL for Arm/Ubuntu Jammy 22.04:
test_lldb_process_load_and_unload_commands
test_load_unload

Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This reverts commit 04a3d5f3a1193fb87576425a385aa0a6115b1e7c.

I see two more issues:

- uitofp/sitofp from i32/i64 to half now generates
  __floatsihf/__floatdihf, which exists in neither compiler-rt nor
  libgcc

- This crashes when legalizing the bitcast:
```
; RUN: llc < %s -mcpu=skx
define void @main.45(ptr nocapture readnone %retval, ptr noalias nocapture readnone %run_options, ptr noalias nocapture readnone %params, ptr noalias nocapture readonly %buffer_table, ptr noalias nocapture readnone %status, ptr noalias nocapture readnone %prof_counters) local_unnamed_addr {
entry:
  %fusion = load ptr, ptr %buffer_table, align 8
  %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
  %Arg_1.2 = load ptr, ptr %0, align 8
  %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 2
  %Arg_0.1 = load ptr, ptr %1, align 8
  %2 = load half, ptr %Arg_0.1, align 8
  %3 = bitcast half %2 to i16
  %4 = and i16 %3, 32767
  %5 = icmp eq i16 %4, 0
  %6 = and i16 %3, -32768
  %broadcast.splatinsert = insertelement <4 x half> poison, half %2, i64 0
  %broadcast.splat = shufflevector <4 x half> %broadcast.splatinsert, <4 x half> poison, <4 x i32> zeroinitializer
  %broadcast.splatinsert9 = insertelement <4 x i16> poison, i16 %4, i64 0
  %broadcast.splat10 = shufflevector <4 x i16> %broadcast.splatinsert9, <4 x i16> poison, <4 x i32> zeroinitializer
  %broadcast.splatinsert11 = insertelement <4 x i16> poison, i16 %6, i64 0
  %broadcast.splat12 = shufflevector <4 x i16> %broadcast.splatinsert11, <4 x i16> poison, <4 x i32> zeroinitializer
  %broadcast.splatinsert13 = insertelement <4 x i16> poison, i16 %3, i64 0
  %broadcast.splat14 = shufflevector <4 x i16> %broadcast.splatinsert13, <4 x i16> poison, <4 x i32> zeroinitializer
  %wide.load = load <4 x half>, ptr %Arg_1.2, align 8
  %7 = fcmp uno <4 x half> %broadcast.splat, %wide.load
  %8 = fcmp oeq <4 x half> %broadcast.splat, %wide.load
  %9 = bitcast <4 x half> %wide.load to <4 x i16>
  %10 = and <4 x i16> %9, <i16 32767, i16 32767, i16 32767, i16 32767>
  %11 = icmp eq <4 x i16> %10, zeroinitializer
  %12 = and <4 x i16> %9, <i16 -32768, i16 -32768, i16 -32768, i16 -32768>
  %13 = or <4 x i16> %12, <i16 1, i16 1, i16 1, i16 1>
  %14 = select <4 x i1> %11, <4 x i16> %9, <4 x i16> %13
  %15 = icmp ugt <4 x i16> %broadcast.splat10, %10
  %16 = icmp ne <4 x i16> %broadcast.splat12, %12
  %17 = or <4 x i1> %15, %16
  %18 = select <4 x i1> %17, <4 x i16> <i16 -1, i16 -1, i16 -1, i16 -1>, <4 x i16> <i16 1, i16 1, i16 1, i16 1>
  %19 = add <4 x i16> %18, %broadcast.splat14
  %20 = select i1 %5, <4 x i16> %14, <4 x i16> %19
  %21 = select <4 x i1> %8, <4 x i16> %9, <4 x i16> %20
  %22 = bitcast <4 x i16> %21 to <4 x half>
  %23 = select <4 x i1> %7, <4 x half> <half 0xH7E00, half 0xH7E00, half 0xH7E00, half 0xH7E00>, <4 x half> %22
  store <4 x half> %23, ptr %fusion, align 16
  ret void
}
```

llc: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:977: void (anonymous namespace)::SelectionDAGLegalize::LegalizeOp(llvm::SDNode *): Assertion `(TLI.getTypeAction(*DAG.getContext(), Op.getValueType()) == TargetLowering::TypeLegal || Op.getOpcode() == ISD::TargetConstant || Op.getOpcode() == ISD::Register) && "Unexpected illegal type!"' failed.

[clang] Fix trivially copyable for copy constructor and copy assignment operator

From [class.copy.ctor]:

```
A non-template constructor for class X is a copy constructor if its first
parameter is of type X&, const X&, volatile X& or const volatile X&, and
either there are no other parameters or else all other parameters have
default arguments (9.3.4.7).

A copy/move constructor for class X is trivial if it is not user-provided and if:
- class X has no virtual functions (11.7.3) and no virtual base classes (11.7.2), and
- the constructor selected to copy/move each direct base class subobject is trivial, and
- or each non-static data member of X that is of class type (or array thereof),
the constructor selected to copy/move that member is trivial;

otherwise the copy/move constructor is non-trivial.
```

So `T(T&) = default`; should be trivial assuming that the previous
provisions are met.

This works in GCC, but not in Clang at the moment:
https://godbolt.org/z/fTGe71b6P

Reviewed By: royjacobson

Differential Revision: https://reviews.llvm.org/D127593

[RISCV][NFC] Add load/store instructions in rv64*-invalid.s

Reviewed By: benshi001, sunshaoce

Differential Revision: https://reviews.llvm.org/D127721

[Driver] Pass -X to ld for riscv*-{elf,freebsd,linux}

GNU ld has a hack that defaults to -X (--discard-locals) in the emulation file
`riscvelf.em`. The recommended way, as gcc/config/arm does, is to let the
compiler driver pass -X to ld.
(The motivation is likely to discard a plethora of `.L` symbols due to linker
relaxation.)

lld default to --discard-none. To make clang+lld match GNU ld's behavior, pass
-X to ld.

Note: GNU ld has a special rule to treat ld -r -s as ld -r -S -x. With -X, driver `-r -Wl,-s`
will behave as ld `-r -S -X`. This removes fewer symbols than `-r -S -x` but is safe.

Differential Revision: https://reviews.llvm.org/D127826

[TableGen][DirectX] generate DXIL operation table with TableGen.

Add more feature to tableGen backend gen-dxil-operation.

It will generate getOpCodeProperty, getOpCodeClassName and getOpCodeName when build DirectX target.
Each of these functions has a table which generate based on DXIL operations.

These generated functions will replace the manually written functions which used for query DXIL operation information.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D125520

[AArch64][test] Add --mattr=-{sve,sve2,sme} to SVE/SVE2/SME MC tests

llvm-objdump will be changed to default to disassemble all instructions.
Add --mattr=-{sve,sve2,sme} to keep the desired testing.

[MachineBlockPlacementStats] Added check for "-filter-print-funcs"
option to the machine-block-placement-stats.

Differential Revision: https://reviews.llvm.org/D128019

Revert "[MachineBlockPlacementStats] Add check for `-filter-print-funcs` option to machine-block-placement stats."

This reverts commit 46d45df4516e9a5bc43460429cd02cd04a85db1a. Going to
add differential revision link to commit message and re-commit.

[MachineBlockPlacementStats] Add check for `-filter-print-funcs` option to machine-block-placement stats.

Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""

Fix the crash on lowering X86ISD::FCMP.

[lldb] Remove LogHandler::Create functions (NFC)

Remove the LogHandler::Create functions. Except for the StreamHandler
they were just forwarding their arguments to std::make_shared.

[lld-macho][nfc] Tests for -force_load + regular archive load combinations

I realized we'd forgotten to cover this case (though our existing
behavior is indeed correct / matches ld64's).

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D128025

[mlir] Disable warning in test of deprecated feature (NFC)

Disable warning for deprecation in test of deprecated feature. Also
remove additional test of deprecated feature from TestOps.td.

[mlir] Split MLProgram global load and store to Graph variants

* Split ops into X_graph variants as discussed;
* Remove tokens from non-Graph region variants and rely on side-effect
  modelling there while removing side-effect modelling from Graph
  variants and relying on explicit ordering there;
* Make tokens required to be produced by Graph variants - but kept
  explicit token type specification given previous discussion on this
  potentially being configurable in future;

This results in duplicating some code. I considered adding helper
functions but decided against adding an abstraction there early given
size of duplication and creating accidental coupling.

Differential Revision: https://reviews.llvm.org/D127813

[LegalizeTypes][NFC] Merge promote SPLAT_VECTOR and promote SCALAR_TO_VECTOR to one function

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D127825

[mlir][doc] Avoid duplication with constraints and defs

Where a constraint also has a def, emit the def only to avoid duplicate
output (and def has more complete info). Also move attributes and types
to the end rather than some on top and some at end.

Differential Revision: https://reviews.llvm.org/D127823

[LegalizeTypes][RISCV][NFC] Modify assert in PromoteIntRes_STEP_VECTOR and add some tests for RISCV

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D127939

[RISCV][NFC][test] Correct a wrong test in vreductions-fp-vp.ll

Reviewed By: victor-eds, frasercrmck

Differential Revision: https://reviews.llvm.org/D127946

[ORC-RT] Make the ORC runtime C API public.

This is a first step towards allowing programs to pre-link against the ORC
runtime, which would allow us to move some code that is currently in the LLVM
OrcTarget library into the ORC runtime instead.

The C API header has limited utility as-is, but serves as a minimal first step
and provides clients with tools for interacting with wrapper functions.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D127324

llvm-reduce: Try to fix dynamic libraries build

PowerPC] Emit warning for incompatible vector types that are currently diagnosed with -fno-lax-vector-conversions

This patch is the last prerequisite to switch the default behaviour to -fno-lax-vector-conversions in the future.
The first path ;D124093; fixed the altivec implicit castings.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D126540

[PowerPC][NFC] Undefine __XL_COMPAT_ALTIVEC__ in builtin lit test

Add defines and undefines of the __XL_COMPAT_ALTIVEC__ to ensure
consistent results regardless of the default for this macro.

[OpenMP] Initial parsing and sema for 'parallel masked' construct

Differential Revision: https://reviews.llvm.org/D127454

[AMDGPU][NFC] Remove isConstantAddr

fix isConstantAddr defined but not used

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D127959

[gn build] Port eea11e7369ca

llvm-reduce: Add reduction pass to simplify instructions

llvm-reduce: Support replacing FP values with 1.0

[Object][COFF] Improve section name parsing

Inspired by discussions on D127369, we probably can further improve LLVM's COFF
section name parsing. Hopefully, this makes the logic simpler and handles some
edge cases more elegantly.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D127902

[sanitizer] Delete empty sanitizer_openbsd.cpp after D89759

[SVE][CodeGen] Restructure SVE fixed length tests to use update_llc_test_checks.

Most tests have been updated to make use of vscale_range to reduce
the number of RUN lines. For the remaining RUN lines the check
prefixes have been updated to ensure the original expectation of
the manual CHECK lines is maintained after update_llc_test_checks
is run.

[lldb] Cleanup Python API reference files after building the docs

The sphinx-automodapi extension requires that the generated RST files
live next to the index file. This means that we generate them in the
source directory rather than the build directory. This patch ensures
these files are removed again when sphinx finishes its build.

The proper solution to this problem would be to move everything in the
doc folder from the source directory to the build directory before
generating the docs.

I believe that old RST files being kept around is the reason that the
Python API references on the website isn't getting updated. This patch
is meant as a speculative fix and a way to confirm that.

[mlir][sparse] improved testing and codegen for semi-ring operations

The semi-ring blocks were simply "inlined" by the sparse compiler but
without any filtering or patching. This revision improves the analysis
(rejecting blocks that use non-invariant computations from outside
their blocks, except for linalg.index) and also improves the codegen
by properly patching up index computations (previous version crashed).

With a regression test. Also updated the documentation now that the
example code is properly working.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128000

[gn build] Port ff3989e6ae74

[gn build] Port 61fac2c370b9

Incomplete attempt to pull DWARFTypePrinter into its own file for reuse
from lldb

[RISCV] Rename VTy param of RISCVTTIImpl::getArithmeticReductionCost [NFC]

Having it be consistent with getMinMaxReductionCost for ease of copy paste outweights the minor clarity of calling it VTy instead of Ty.

[libc++][ranges] Implement `ranges::sort`.

Differential Revision: https://reviews.llvm.org/D127557

fix x86 sanitizer failure due to use of or

[lldb] Remove references to epydoc from the documentation

We no longer rely on epydoc but instead use a sphinx plugin to generate
the Python API reference.

[lldb] Add RotatingLogHandler

Add a log handler that maintains a circular buffer with a fixed size.

Differential revision: https://reviews.llvm.org/D127937

[RISCV] Implement RISCVTargetLowering::getTargetConstantFromLoad.

This allows computeKnownBits to see the constant being loaded.

This recovers the rv64zbp test case changes from D127520.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127679

[RISCV] Add RISCVISD opcode for PseudoLLA.

Rather than emitting a MachineSDNode from lowering. Let isel match it.

This is consistent with the RISCVISD::HI and ADD_LO nodes that were
also added. Having them both the same will make D127679 consistent.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127714

[NFCI] Whitespace in SemaDeclAttr.cpp

[PowerPC] Fixing implicit castings in altivec for -fno-lax-vector-conversions

XL considers different vector types to be incompatible with each other.
For example assignment between variables of types vector float and vector
long long or even vector signed int and vector unsigned int are diagnosed.
clang, however does not diagnose such cases and does a simple bitcast between
the two types. This could easily result in program errors. This patch is to
fix the implicit casts in altivec.h so that there is no incompatible vector
type errors whit -fno-lax-vector-conversions, this is the prerequisite patch
to switch the default to -fno-lax-vector-conversions later.

Reviewed By: nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D124093

[clang-tidy] Organize check doc files into subdirectories (NFC)

- Rename doc files to subdirs by module
- Update release notes and check list to use subdirs
- Update add_new_check.py to handle doc subdirs

Differential Revision: https://reviews.llvm.org/D126495

[RISCV] Don't emit LUI/ADDI MachineSDNodes from getAddr

Instead add RISCVISD opcodes that will be selected to LUI/ADDI
during isel.

I'm looking into maybe moving doPeepholeLoadStoreADDI into isel.
Having the ADDI as a RISCVISD node will make it visible to isel.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127713

[SelectionDAG] Don't apply MinRCSize constraint in InstrEmitter::AddRegisterOperand for IMPLICIT_DEF sources.

MinRCSize is 4 and prevents constrainRegClass from changing the
register class if the new class has size less than 4.

IMPLICIT_DEF gets a unique vreg for each use and will be removed
by the ProcessImplicitDef pass before register allocation. I don't
think there is any reason to prevent constraining the virtual register
to whatever register class the use needs.

The attached test case was previously creating a copy of IMPLICIT_DEF
because vrm8nov0 has 3 registers in it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D128005

Add DWARF string debug to clang release notes.

D12353 added inline strings to the DWARF info produced by clang. This
turns out to break some debugging software that assumes that a
DW_TAG_variable *must* come with a DW_AT_name. Add a release note to
broadcast this change.

Reviewed By: paulkirth

Differential Revision: https://reviews.llvm.org/D126224

[mlir][sparse] fix asan issue

The LinalgElementwiseOpFusion pass has become smarter, and converts
the simple conversion linalg operation into a sparse dialect convert
operation. However, since our current bufferization does not take the
new semantics into consideration, we leak memory of the allocation.
For now, this has been fixed by making the operation less trivial.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128002

Make setSanitizerMetadata byval.

This fixes a UaF bug in llvm::GlobalObject::copyAttributesFrom, where a
sanitizer metadata object is captured by reference, and passed by
reference to llvm::GlobalValue::setSanitizerMetadata. The reference
comes from the same map that the new value is going to be inserted to,
and the map insertion triggers iterator invalidation - leading to a
use-after-free on the dangling reference.

This patch fixes that bug by making setSanitizerMetadata's argument
byval. This should also systematically prevent the problem from
happening in future, as it's a very easy pattern to have. This shouldn't
be any performance problem, the SanitizerMetadata struct is a bitfield
POD.

[libc] Add a status page for math functions.

Add a status page for math functions.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D127920

[RISCV] Start merging demanded reasoning - starting with load/stores [nfc]

This change merges the logic for reasoning about demanded portions of the VTYPE register between the main dataflow algorithm and the backwards mutation post pass. In the process, we get to delete a bunch of now redundant code.

This should be entirely NFC. I included a slight hack (see TODO) to avoid changing behavior in the post pass while being able to use the generalized logic in the prepass. I will fix the TODO in a separate change once this lands.

Differential Revision: https://reviews.llvm.org/D127983

[RISCV] Add cost model for scalable scatter and gather

The costing we use for fixed length vector gather and scatter is to simply count up the memory ops, and multiply by a fixed memory op cost. For scalable vectors, we don't actually know how many lanes are active. Instead, we have to end up making a worst case assumption on how many lanes could be active. In the generic +V case, this results in very high costs, but we can do better when we know an upper bound on the VLEN.

There's some obvious ways to improve this - e.g. using information about VL and mask bits from the instruction to reduce the upper bound - but this seems like a reasonable starting point.

The resulting costs do bias us pretty strongly away from generating scatter/gather for generic +V. Without this, we'd be returning an invalid cost and thus definitely not vectorizing, so no major change in practical behavior expected.

Differential Revision: https://reviews.llvm.org/D127541

[mlir][complex] Add Python bindings for complex ops.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127916

Revert "[TableGen][DirectX] generate DXIL operation table with TableGen."

This reverts commit 46fcdf23640ebb76271f91720583b0df6bed4481.

Reason: Broke the buildbots:
https://lab.llvm.org/buildbot/#/builders/77/builds/18671

Move debug-only code inside LLVM_DEUG to prevent unused variable warnings.

Reland "[ASan] Use debuginfo for symbolization."

This reverts commit 99796d06dbe11c8f81376ad1d42e7f17d2eff6ae.

Hint: Looking here because your manual invocation of something in
'check-asan' broke? You need a new symbolizer (after D123538).

An upcoming patch will remove the internal metadata for global
variables. With D123534 and D123538, clang now emits DWARF debug info
for constant strings (the only global variable type it was missing), and
llvm-symbolizer is now able to symbolize all global variable addresses
(where previously it wouldn't give you the file:line information).

Move ASan's runtime over from the internal metadata to DWARF.

Differential Revision: https://reviews.llvm.org/D127552

[TableGen][DirectX] generate DXIL operation table with TableGen.

Add more feature to tableGen backend gen-dxil-operation.

It will generate getOpCodeProperty, getOpCodeClassName and getOpCodeName when build DirectX target.
Each of these functions has a table which generate based on DXIL operations.

These generated functions will replace the manually written functions which used for query DXIL operation information.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D125520

[MergeFunctions] Preserve symbols used llvm.used/llvm.compiler.used

llvm.used and llvm.compiler.used are often used with inline assembly
that refers to a specific symbol so that the symbol is kept through to
the linker even though there are no references to it from LLVM IR.

This fixes the MergeFunctions pass to preserve references to these
symbols in llvm.used/llvm.compiler.used so they are not deleted from the
IR. This doesn't prevent these functions from being merged, but
guarantees that an alias or thunk with the expected symbol name is kept
in the IR.

Differential Revision: https://reviews.llvm.org/D127751

[gn build] Port 6ff49af33d09

[lldb] Introduce the concept of a log handler (NFC)

This patch introduces the concept of a log handlers. Log handlers allow
customizing the way log output is emitted. The StreamCallback class
tried to do something conceptually similar. The benefit of the log
handler interface is that you don't need to conform to llvm's
raw_ostream interface.

Differential revision: https://reviews.llvm.org/D127922

[Delinearization] Refactoring of fixed-size array delinearization

This is a follow-up patch to D122857 where we added delinearization of
fixed-size arrays to loop cache analysis, which resulted in some duplicate
code, i.e., "tryDelinearizeFixedSize()", in LoopCacheCost.cpp and
DependenceAnalysis.cpp. Refactoring is done in this patch.

This patch refactors out the main logic of "tryDelinearizeFixedSize()" as
"tryDelinearizeFixedSizeImpl()" and moves it to Delinearization.cpp, such that
clients can reuse "llvm::tryDelinearizeFixedSizeImpl()" wherever they would
like to delinearize fixed-size arrays. Currently it has two users, i.e.,
DependenceAnalysis.cpp and LoopCacheCost.cpp.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124745

[TargetTransformInfo] Added an opt/llc option for cache line size

In some passes we need a valid number of cache line size to do analysis or
transformation, e.g., loop cache analysis and loop date prefetch. However,
for some backend targets, `TTIImpl->getCacheLineSize()` is not implemented
and hence 'TTI.getCacheLineSize()' would just return 0 which eventually might
produce invalid result.

In this patch we add a user-specified opt/llc option for cache line size.
If the option is specified by users we use the value supplied, otherwise we
fall-back to the default value obtained from `TTIImpl->->getCacheLineSize()`.
The powerpc target already has such an option, this patch generalizes
this option to TargetTransformInfo.cpp.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D127342

[libc++] Remove now-unused experimental/filesystem config file

Fix StopInfoBreakpoint::ShouldNotify when a callback deletes the site we hit.

When we hit a breakpoint site all of whose owners are internal, we don't
broadcast that event to the public event queue. However, we were checking
whether that was true in the ShouldNotify method, which gets run after the
breakpoint callbacks get run. If the breakpoint callback deletes the site
we just hit, we no longer have the information to make that determination.

This patch just gathers the "was all internal" fact when the StopInfoBreakpoint
gets made, which happens before anyone has a chance to delete the site, and then
uses that cached value.

This bug was causing a couple of tests (including TestStopAtEntry.py) to fail
when using new the macOS Ventura dyld support.

Differential Revision: https://reviews.llvm.org/D127997

Reland "[PS4/PS5][profiling] Go back to the old way of doing a runtime hook"

Profiling stopped working for us after D98061, which was largely a
Fuschia-specific patch but in one place used `isOSBinFormatELF` to
make a decision. I'm adding a PS4/PS5 exception to that, so we can
get profiling to work again.

Differential Revision: https://reviews.llvm.org/D127506

Fix TraceGDBRemotePacketsTest

This test broke, but the fix is simple.

[BOLT][NFCI] Refactor interface for adding basic blocks

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127935

[AMDGPU] gfx11 new dot instruction codegen support

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D127904

[AMDGPU] Add GFX11 codegen for llvm.amdgcn.mov.dpp8

Differential Revision: https://reviews.llvm.org/D127980

[trace][intelpt] Support system-wide tracing [22] - Some final touches

Having a member variable TraceIntelPT * makes it look as if it was
optional. I'm using instead a weak_ptr to indicate that it's not
optional and the object is under the ownership of TraceIntelPT.

Besides that, I've simplified the Perf aux and data buffers copying by
using vector.insert.

I'm also renaming Lookup2 to Lookup. The 2 in the name is confusing.

Differential Revision: https://reviews.llvm.org/D127881

[trace][intelpt] Support system-wide tracing [21] - Support long numbers in JSON

llvm's JSON parser supports 64 bit integers, but other tools like the
ones written in JS don't support numbers that big, so we need to
represent these possibly big numbers as a string. This diff uses that to
represent addresses and tsc zero. The former is printed in hex for and
the latter in decimal string form. The schema was updated mentioning
that.

Besides that, I fixed some remaining issues and now all test pass. Before I wasn't running all tests because for some reason my computer reverted perf_paranoid to 1.

Differential Revision: https://reviews.llvm.org/D127819

[trace][intelpt] Support system-wide tracing [20] - Rename some fields in the schema

As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema

- traceBuffer -> iptTrace
- core -> cpu

Differential Revision: https://reviews.llvm.org/D127817

[trace][intelpt] Support system-wide tracing [19] - Some other minor improvements

This addresses the issues in diffs [13], [14] and [16]

- Add better documentation
- Fix some castings by making them safer
- Simplify CorrelateContextSwitchesAndIntelPtTraces
- Rename some functions

Differential Revision: https://reviews.llvm.org/D127804

[trace][intelpt] Support system-wide tracing [18] - some more improvements

This applies the changes requested for diff 12.

- use DenseMap<ConstString, _> instead of std::unordered_map<ConstString, _>, which is more idiomatic and possibly performant.
- deduplicate some code in Trace.cpp by using helper functions for fetching in maps
- stop using size and offset when fetching binary data, because we in fact read the entire buffers all the time. If we ever need streaming, we can implement it then. Now, the size is used only to check that we are getting the correct amount of data. This is useful because in some cases determining the size doesn't involve fetching the actual data.
- added back the x86_64 macro to the perf tests
- added more documentation
- simplified some file handling
- fixed some comments

Differential Revision: https://reviews.llvm.org/D127752

Revert "[PS4/PS5][profiling] Go back to the old way of doing a runtime hook"

This reverts commit 39fb84343ec5cf9081e236745490c65eb8a9fc31.

Pushed without verifying the test still works.

[flang][runtime] Make ASSOCIATED() conform with standard

ASSOCIATED() must be false for zero-sized arrays, and the
strides on a dimension are irrelevant when the extent is unitary.

Differential Revision: https://reviews.llvm.org/D127793