review.tizen.org Git - platform/upstream/llvm.git/log

Add generic type attribute mapping infrastructure, use it in GpuToX

Remapping memory spaces is a function often needed in type
conversions, most often when going to LLVM or to/from SPIR-V (a future
commit), and it is possible that such remappings may become more
common in the future as dialects take advantage of the more generic
memory space infrastructure.

Currently, memory space remappings are handled by running a
special-purpose conversion pass before the main conversion that
changes the address space attributes. In this commit, this approach is
replaced by adding a notion of type attribute conversions
TypeConverter, which is then used to convert memory space attributes.

Then, we use this infrastructure throughout the *ToLLVM conversions.
This has the advantage of loosing the requirements on the inputs to
those passes from "all address spaces must be integers" to "all
memory spaces must be convertible to integer spaces", a looser
requirement that reduces the coupling between portions of MLIR.

ON top of that, this change leads to the removal of most of the calls
to getMemorySpaceAsInt(), bringing us closer to removing it.

(A rework of the SPIR-V conversions to use this new system will be in
a folowup commit.)

As a note, one long-term motivation for this change is that I would
eventually like to add an allocaMemorySpace key to MLIR data layouts
and then call getMemRefAddressSpace(allocaMemorySpace) in the
relevant *ToLLVM in order to ensure all alloca()s, whether incoming or
produces during the LLVM lowering, have the correct address space for
a given target.

I expect that the type attribute conversion system may be useful in
other contexts.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D142159

[lldb] Print an error for unsupported combinations of log options

Print an error for unsupported combinations of log handlers and log
options. Only the stream log handler takes a file and only the circular
and stream handler take a buffer size. This cannot be dealt with through
option groups because the option combinations depend on the requested
handler.

Differential revision: https://reviews.llvm.org/D143623

[MLGO] Update test for MBB Profile Dump

This patch updates the test for the MBB profile dump to include a
function that has multiple basic blocks so that we can test the
numbering of multiple basic blocks within an individual function.

[flang] Support polymorphic inputs for CSHIFT intrinsic

Result must carry the polymorphic type information from the array.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D143649

[TTI][AArch64] Cost model insertelement and indexed LD1 instructions

An indexed LD1 instruction, or "ASIMD load, 1 element, one lane, B/H/S"
instruction that loads a value and inserts an element into a vector is
an expensive instruction. It has a latency of 8 on modern cores. We
generate an indexed LD1 when an insertelement instruction has a load as an
operand and this patch is recognising and makes indexed LD1 more expensive.

Differential Revision: https://reviews.llvm.org/D141602

[z/OS][pg] Throw error when using -pg on z/OS

Throw an error when trying to compile with `-pg` on z/OS,
as the platform does not support `gprof`.

Reviewed By: cebowleratibm, MaskRay

Differential Revision: https://reviews.llvm.org/D137756

[Clang][AIX][p]Enable -p Functionality

This patch enables `-p` functionality into Clang on AIX and Linux
To create parity with GCC.

The purpose of the `-p` flag is similar to that of `-pg`, but the
results are analyzed with the `prof` tool as opposed to the `gprof` tool.
More details can be found in this RFC post:
https://discourse.llvm.org/t/rfc-add-p-driver-support-to-clang/66013?u=francii

On AIX, compiling with `-p` links against `mcrt0.o`
and produces a mon.out file analyzed with the `prof` tool,
while `-pg` links against `gcrt0.o` and produces a `gmon.out`file
analyzed with the `gprof` tool. The differences are therefore
only a concern when linking, so calling `-p` will push `-pg` to cc1.

An AIX test for `-p` already exists, and I recently
another test was added here:
https://github.com/llvm/llvm-project/commit/dc9846ce988b9ddfcbc42cd462d5d94b634b3161
As such, there is no AIX test case attached to this patch.

Reviewed By: daltenty

Differential Revision: https://reviews.llvm.org/D137753

[SelectionDAG] Do not second-guess alignment for alloca

Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignment as the minimum alignment.

The patch changes this behavior to always use the specified
alignment. If alignment is not set explicitly in LLVM IR, it is set to
DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign.

Tests are changed as well: explicit alignment is increased to match
the preferred alignment if it changes output, or omitted when it is
hard to determine the right value (e.g. for pointers, some structs, or
weird types).

Differential Revision: https://reviews.llvm.org/D135462

[X86] combineConcatVectorOps - pull out repeated uses of VT.getScalarSizeInBits(). NFC.

We already have a EltSizeInBits variable

[AMDGPU][GlobalISel] Fix selection of image sample g16 instructions

Pre-GFX10 A16 modifier would imply G16. From GFX10 and onwards there are
separate instructions for 16bit gradients. This fixes the condition for
selecting G16 opcodes. Also stop adding G16 flag to instructions that do not
use gradients for GFX10 onwards.

[mlir] Add function for checking if a block is inside a loop

This function returns whether a block is nested inside of a loop. There
can be three kinds of loop:
  1) The block is nested inside of a LoopLikeOpInterface
  2) The block is nested inside another block which is in a loop
  3) There is a cycle in the control flow graph

This will be useful for Flang's stack arrays pass, which moves array
allocations from the heap to the stack. Special handling is needed when
allocations occur inside of loops to ensure additional stack space is
not allocated on each loop iteration.

Differential Revision: https://reviews.llvm.org/D141401

[libclang] Tweaks for clang_CXXMethod_isExplicit

This adds a release note that was accidentally dropped, and moves the
symbol from LLVM 16 to LLVM 17 in the module map.

Amends 0a51bc731bcc2c27e4fe97957a83642d93d989be

[include-mapping] Add C-compatibility symbol entries.

Extending the python generator:
- to generate C-compatibility symbols
- to generate macros

Differential Revision: https://reviews.llvm.org/D143214

Disable MSVC C5105 warnings

Suppresses "macro expansion producing 'defined' has undefined behavior"
due to the diagnostic triggering in WinBase.h (a system header file).

[AMDGPU] Remove unused ClangBuiltin definition for fmed3

__builtin_amdgcn_fmed3 is unused since the actual builtins are
defined by Clang and have a floating point type suffix, h or f.

Differential Revision: https://reviews.llvm.org/D143643

[mlir][Linalg] Fix expected buffer semantics crash

Fixes #58747

[libc] Introduce a config macro file

[mlir][llvm] Fuse MD_access_group & MD_loop import

This commit moves the importing logic of access group metadata into the
loop annotation importer. These two metadata imports can be grouped
because access groups are only used in combination with
`llvm.loop.parallel_accesses`.

As a nice side effect, this commit decouples the LoopAnnotationImporter
from the ModuleImport class.

Differential Revision: https://reviews.llvm.org/D143577

[libc] Add an optimization macro header

[MemorySSA] Add test for gep with loop invariant pointer operand and
indexes are all constant

[mlir][NFC] Use fully qualified C++ namespaces in .td files.

Some extra cases that were not covered in 6da0184b.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D143477

[clang-tidy][doc] Improve clang-tidy documentation

- Specify that the .clang-tidy file is in YAML format.
- Document the options that may be used in the .clang-tidy file,
- Add missing documentation for existing options (User).
- Fix spurious newline after the dash that comes after every
  command-line option. This was inconsistent with single-line
  descriptions, which lacked a newline. The description is now
  aligned with the dash and the corresponding command-line option,
  more visually pleasing.

This enables documenting upcoming global clang-tidy
configuration options.

Differential Revision: https://reviews.llvm.org/D141144

[mlir] add support for transform dialect value handles

Introduce support for the third kind of values in the transform dialect:
value handles. Similarly to operation handles, value handles are
pointing to a set of values in the payload IR. This enables
transformation to be targeted at specific values, such as individual
results of a multi-result payload operation without indirecting through
the producing op or block arguments that previously could not be easily
addressed. This is expected to support a broad class of memory-oriented
transformations such as selective bufferization, buffer assignment, and
memory transfer management.

Value handles are functionally similar to operation handles and require
similar implementation logic. The most important change concerns the
handle invalidation mechanism where operation and value handles can
affect each other.

This patch includes two cleanups that make it easier to introduce value
handles:

  - `RaggedArray` structure that encapsulates the SmallVector of
    ArrayRef backed by flat SmallVector logic, frequently used in the
    transform interfaces implementation;

  - rewrite the tests that associated payload handles with an integer
    value `reinterpret_cast`ed as a pointer, which were a frequent
    source of confusion and crashes when adding more debugging
    facilities that can inspect the payload.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D143385

Revert "[YAML IO] Check that mapping doesn't contain duplicating keys"

LLDB failures: https://lab.llvm.org/buildbot/#/builders/17/builds/33865

This reverts commit 4c228ee6d40a7cff256f1a680561b6c0155ad704.

[llvm][cmake][Trivial] don't use `/Zc:preprocessor` with clang-cl

The flag is not recognized by clang-cl and emits unused command line warning for every translation unit

[YAML IO] Check that mapping doesn't contain duplicating keys

According to YAML specification keys must be unique for a mapping node:
"The content of a mapping node is an unordered set of key/value node pairs, with
the restriction that each of the keys is unique".

Differential Revision: https://reviews.llvm.org/D140474

AMDGPU/MC: Fix decoders for VSrc_v2b32 and VSrc_v2f32 RegisterOperands

Decoder should make 32 bit value when decoding immediates, not 64 bit.

Differential Revision: https://reviews.llvm.org/D143574

AMDGPU/MC: Add assembler tests for v2f32 and v2b32 with imm operand

Add test coverage for https://github.com/llvm/llvm-project/issues/60563.
D142636 introduced a bug: incorrect disassembly of floating point inline
constants for v_pk_mov_b32, v_pk_add_f32, v_pk_mul_f32 and v_pk_fma_f32.
Precommit for D143574.

Differential Revision: https://reviews.llvm.org/D143573

[mlir][FuncToLLVM] Add option for emitting opaque pointers

Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

FuncToLLVM contains some logic working with Memrefs and their lowerings and in the process creating pointer types, loads and allocas. This patch ports the code of these to be compatible with opaque pointers and adds a pass option to enable the use of opaque pointers within the pass.

For the migration effort, the tests have been rewritten to use opaque pointers with dedicated test files for typed pointer support

Differential Revision: https://reviews.llvm.org/D143608

[AArch64] Protect against overflowing shift amounts in performSETCCCombine

In the attached test case we can get into positions where the shift gets a
constant shift amount that is negative or larger than the bitwidth, leading to
trying to create an invalid constant. Add a check to make sure we can handle it
without assertions.

Fixes #60530

[libc][NFC] separate macros in several targets

[LLDB] Add missing newline to "image lookup" output

When using --name, due to a missing newline, multiple symbol results
were not correctly printed:
```
(lldb) image lookup -r -n "As<.*"
2 matches found in <...>/tbi_lisp:
        Address: tbi_lisp<...>
        Summary: tbi_lisp<...> at Symbol.cpp:75        Address: tbi_lisp<...>
        Summary: tbi_lisp<...> at Symbol.cpp:82
```
It should be:
```
(lldb) image lookup -r -n "As<.*"
2 matches found in /home/david.spickett/tbi_lisp/tbi_lisp:
        Address: tbi_lisp<...>
        Summary: tbi_lisp<...> at Symbol.cpp:75
        Address: tbi_lisp<...>
        Summary: tbi_lisp<...> at Symbol.cpp:82
```
With Address/Summary on separate lines.

Reviewed By: clayborg, labath

Differential Revision: https://reviews.llvm.org/D143564

[AMDGPU] Ignore unused bits in VINTERP encoding

In the GFX11 VINTERP encoding bits 23, 59 and 60 are unused. Change the
disassembler to ignore these bits.

Differential Revision: https://reviews.llvm.org/D143633

[Support] Emulate SIGPIPE handling in raw_fd_ostream write for Windows

Prevent errors and crash dumps for broken pipes on Windows.

Fixes: https://github.com/llvm/llvm-project/issues/48672

Differential Revision: https://reviews.llvm.org/D142224

[mlir][bufferization] Improve aliasing OpOperand/OpResult property

`getAliasingOpOperands`/`getAliasingOpResults` now encodes OpOperand/OpResult, buffer relation and a degree of certainty. E.g.:
```
// aliasingOpOperands(%r) = {(%t, EQUIV, DEFINITE)}
// aliasingOpResults(%t) = {(%r, EQUIV, DEFINITE)}
%r = tensor.insert %f into %t[%idx] : tensor<?xf32>

// aliasingOpOperands(%r) = {(%t0, EQUIV, MAYBE), (%t1, EQUIV, MAYBE)}
// aliasingOpResults(%t0) = {(%r, EQUIV, MAYBE)}
// aliasingOpResults(%t1) = {(%r, EQUIV, MAYBE)}
%r = arith.select %c, %t0, %t1 : tensor<?xf32>
```

`BufferizableOpInterface::bufferRelation` is removed, as it is now part of `getAliasingOpOperands`/`getAliasingOpResults`.

This change allows for better analysis, in particular wrt. equivalence. This allows additional optimizations and better error checking (which is sometimes overly conservative). Examples:

* EmptyTensorElimination can eliminate `tensor.empty` inside `scf.if` blocks. This requires a modeling of equivalence: It is not a per-OpResult property anymore. Instead, it can be specified for each OpOperand and OpResult. This is important because `tensor.empty` may be eliminated only if all values on the SSA use-def chain to the final consumer (`tensor.insert_slice`) are equivalent.
* The detection of "returning allocs from a block" can be improved. (Addresses a TODO in `assertNoAllocsReturned`.) This allows us to bufferize IR such as "yielding a `tensor.extract_slice` result from an `scf.if` branch", which currently fails to bufferize because the alloc detection is too conservative.
* Better bufferization of loops. Aliases of the iter_arg can be yielded (even if they are not equivalent) without having to realloc and copy the entire buffer on each iteration.

The above-mentioned examples are not yet implemented with this change. This change just improves the BufferizableOpInterface, its implementations and related helper functions, so that better aliasing information is available for each op.

Differential Revision: https://reviews.llvm.org/D142129

Fix call to deprecated API in bd87a2449da0c82e63cebdf9c131c54a5472e3a7

[mlir][sparse] Port the remaining integration tests to use SVE

This patch updates the remaining SparseCompiler integration tests to
target SVE when available.

Two tests will require some investigation in the future:
* sparse_matmul.mlir
* sparse_tanh.mlir
The former passes regardless - that's due to how `CHECK` lines are
defined. The latter fails when SVE is enabled, but passes when it's
disabled. I marked it as UNSUPPORTED as there is no mechanism to XFAIL a
test conditionally. Also, see [1] for more details.

[1] https://github.com/llvm/llvm-project/issues/60626

Differential Revision: https://reviews.llvm.org/D143514

[CGP] Add generic TargetLowering::shouldAlignPointerArgs() implementation

This function was added for ARM targets, but aligning global/stack pointer
arguments passed to memcpy/memmove/memset can improve code size and
performance for all targets that don't have fast unaligned accesses.
This adds a generic implementation that adjusts the alignment to pointer
size if unaligned accesses are slow.
Review D134168 suggests that this significantly improves performance on
synthetic benchmarks such as Dhrystone on RV32 as it avoids memcpy() calls.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D134282

Simplify test from change D73904

This part of the test can break if multilib is enabled, and isn't
important to testing the change with which is was added.

The relevant part of the test is
ARM-EABI: "-lclang_rt.builtins-arm"
which remains.

Differential Revision: https://reviews.llvm.org/D143590

[Test] Fix YAML mapping keys duplication. NFC.

YAML specification does not allow keys duplication an a mapping. However, YAML
parser in LLVM does not have any check on that and uses only the last key entry.
In this change duplicated keys are merged to satisfy the spec.

Differential Revision: https://reviews.llvm.org/D141848

Reland "[X86][ABI] Don't preserve return regs for preserve_all/preserve_most CCs""

The original change mistakenly excluded parameter registers from the
list of callee-saved-registers. This reland fixes it - it only excludes
the return registers for preserve_all/preserve_most CCs.

Original description:
> Currently both calling conventions preserve registers that are used to
> store a return value. This causes the returned value to be lost:
>
>   define i32 @bar() {
>     %1 = call preserve_mostcc i32 @foo()
>     ret i32 %1
>   }
>
>   define preserve_mostcc i32 @foo() {
>     ret i32 2
>     ; preserve_mostcc will restore %rax,
>     ; whatever it was before the call.
>   }
>
> This contradicts the current documentation (preserve_allcc "behaves
> identical to the `C` calling conventions on how arguments and return
> values are passed") and also breaks [[clang::preserve_most]].
>
> This change makes CSRs be preserved iff they are not used to store a
> return value (e.g.  %rax for scalars, {%rax:%rdx} for __int128, %xmm0
> for double).  For void functions no additional registers are
> preserved, i.e.  the behaviour is backward compatible with existing
> code.

Differential Revision: https://reviews.llvm.org/D143425

[clang][codegen] Fix emission of consteval constructor of derived type

For simple derived type ConstantEmitter returns a struct of the same
size but different type which is then stored field-by-field into memory
via pointer to derived type. In case base type has more fields than derived,
the incorrect GEP is emitted. So, just cast pointer to derived type to
appropriate type with enough fields.

Fixes https://github.com/llvm/llvm-project/issues/60166

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D142534

Reland D143267: [LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices.

Fixed issue where 'ConstantInt::get(IndextTy, -Part)' was executed with the wrong type for Part,
e.g. IndexTy was i64, but Part was 'unsigned', which led to things like 'mul i64 .., 4294967292',
which was obviously wrong.

Also changed sve-vector-reverse.ll to be vectorized with UF>1 to test this.

This reverts commit 1f01cdda68614dba12af3cc3aff38541d0abcc6b.

[Docs] Clarify behavior of llvm-lit -vv

Differential Revision: https://reviews.llvm.org/D143586

[libc] Add documentation for the macros folder

[libc][NFC] Format bazel file

[libc][NFC] Move cpu_features.h to properties subfolder

[libc][NFC] Move compiler_features.h to properties subfolder

[mlir][llvm] Add extra attributes to the atomic ops.

The revision adds a number of extra arguments to the
atomic read modify write and compare and exchange
operations. The extra arguments include the volatile,
weak, syncscope, and alignment attributes.

The implementation also adapts the fence operation to use
a assembly format and generalizes the helper used
to obtain the syncscope name.

Reviewed By: Dinistro

Differential Revision: https://reviews.llvm.org/D143554

[libc][NFC] Move architectures.h to properties subfolder

[mlir][transform] Fix apply WithPDLPatternsOp with non-pattern op

Fix https://github.com/llvm/llvm-project/issues/60209

Fix crash with segmentation fault when transform::WithPDLPatternsOp is
applied with non-pattern op. Added check for existing transform ops with
pattern op.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D143474

[mlir][llvm] Purge struct_attr

This commit removes the `llvm.struct_attr` which was used to bundle
result attributes that were previously attached to multiple results.
This extension isn't part of LLVM as result attribute semantics cannot
be supported on a struct field granularity.
Furthermore, many usages promoted result attributes to argument
attributes but this does not necessary preserve the semantics.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D143473

[mlir][bufferization][NFC] Cache definitions of read tensors

This is to avoid unnecessary traversals of the IR.

Differential Revision: https://reviews.llvm.org/D143408

[mlir][vector] Prevent duplicating operations during vector distribute

We should distribute ops that have other uses than the yield op as this
would duplicate those ops.

Differential Revision: https://reviews.llvm.org/D143629

[mlir][Tiling] Properly reject "buffer semantic" operations

Our tiling implementation assumes a "tensor semantic" for the operation to
be tiled.
Prior to this patch, if we provide a tilable op with "buffer semantic", we
will assert instead of gracefully reject the input.

This patch turns the assert in a proper error.

Differential Revision: https://reviews.llvm.org/D143558

[NFC][TableGen] Refine the check in Decoder
The Opcode occupy 2 bytes in following test, we should use {{[0-9]+}}
to match the total value if it, not a part of it.
OPC_Decode(uleb128 Opcode, uleb128 DIdx)
and so do for OPC_TryDecode.

[mlir][bufferize][NFC] Optimize read-only tensor detection

Check alias sets instead of traversing the IR.

Differential Revision: https://reviews.llvm.org/D143500

[flang][hlfir] Lower procedure designators to HLFIR

- Add a convertProcedureDesignatorToHLFIR that converts the
  fir::ExtendedValue from the current lowering to a
  fir.boxproc/tuple<fir.boxproc, len> mlir::Value.

- Allow fir.boxproc/tuple<fir.boxproc, len> as hlfir::Entity values
  (a function is an address, but from a Fortran entity point of view,
  procedure that are not procedure pointers cannot be assigned to, so
  it makes a lot more sense to consider those as values).

- Modify symbol association to not generate an hlfir.declare for dummy
  procedures. They are not needed and allowing hlfir.declare to declare
  function values would make its verifier and handling overly complex
  for little benefits (maybe an hlfir.declare_proc could be added if it
  turnout out useful later for debug info and attributes storing
  purposes).

- Allow translation from hlfir::Entity to fir::ExtendedValue.
  convertToBox return type had to be relaxed because some intrinsics
  handles both object and procedure arguments and need to lower their
  object arguments "asBox". fir::BoxValue is not intended to carry
  dummy procedures (all its member functions would make little sense
  and its verifier does not accept such type).
  Note that AsAddr, AsValue and AsBox will always return the same MLIR
  value for procedure designators because they are always handled the
  same way in FIR.

Differential Revision: https://reviews.llvm.org/D143585

[AVR] Optimize 16-bit comparison with a constant

Fixes https://github.com/llvm/llvm-project/issues/30923

Reviewed By: jacquesguan, aykevl

Differential Revision: https://reviews.llvm.org/D142281

[libc][obvious] Fix build.

[libc][math] Implement scalbn, scalbnf, scalbnl.

Implement scalbn via `fptuil::ldexp` for `FLT_RADIX==2` case.
"unimplemented" otherwise.

Reviewed By: lntue, sivachandra

Differential Revision: https://reviews.llvm.org/D143116

[mlir][func] Add support for nested tuples to TestDecomposeCallGraphTypes.

Nested tuples were only supported in some narrow edge cases (and
potentially only because the test ops like `test.make_tuple` aren't
properly verified). This patch adds a couple of test cases with tested
tuple types and makes them work in the test pass by extending the
argument materialization and decomposition functions.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D143579

[Instcombine] Precommit tests update for icmp(trunc cttz/ctlz(x), C); NFC

[llvm-c-test] Fix unused variable warnings

This patch fixes:

  llvm/tools/llvm-c-test/debuginfo.c:211:12: error: unused variable
  'tag0' [-Werror,-Wunused-variable]

  llvm/tools/llvm-c-test/debuginfo.c:222:12: error: unused variable
  'tag1' [-Werror,-Wunused-variable]

[lldb] Add --gdb-format flag to dwim-print

Add support for the `--gdb-format`/`-G` flag to `dwim-print`.

The gdb-format flag allows users to alias `p` to `dwim-print`.

Differential Revision: https://reviews.llvm.org/D141425

[BOLT] Process fragment siblings in lite mode, keep lite mode on

In lite mode, include split function fragments to the list of functions to
process even if a fragment has no samples. This is required to properly
detect and update split jump tables (jump tables that contain pointers to code
in the main and cold fragments).

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D140457

[mlir][linalg] Enhance padding LinalgOps to handle tensor.empty cases.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D143043

InstCombine: Add some additional is.fpclass tests

Test some more cases related to compare with 0 and inf.

[LoongArch] Add baseline tests for translating the selection of constants into mathematical operations

InstCombine: Fold and (fcmp), (is.fpclass) into is.fpclass

Fold class test performed by an fcmp into another class. For now this
avoids introducing new class calls then there isn't one that already
exists.

[-Wunsafe-buffer-usage] To disable a test on Windows systems.

One of the tests in the commit
'bdf4f2bea50e87f5b9273e3bbc9a7753bca3a6bb' sometimes fails on one of
the buildbots runing on a Windows machine. For example,
"https://lab.llvm.org/buildbot/#/builders/60/builds/10615" is a failed
build.

Now we disable it until we figure out why this could happen.

[gn] port 79971d0d771a27 (LLVMProfdataTests)

[clang][deps] NFC: Refactor inferred modules test

This patch squashes two tests with identical inputs into a single test, and adopts the `split-file` utility. This allows us to remove `sed` invocation with multiple commands, where "s|-E|-x objective-c -E|g" could've caused issues if previous replacements injected path containing "-E".

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D143615

[mlir][tosa] make Select operator broadcastable in the pass

Making Select broadcastable can let this op easier to use.

Change-Id: I4a4bec4f7cbe532e954a5b4fe53136676ab4300c

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D139156

add LLVMGetDINodeTag to C bindings

Differential Revision: https://reviews.llvm.org/D138415

[docs] Update "production quality" targets for lld/ELF

Add RISC-V and update Arm from (>= v6) to (>= v4).

Reviewed By: pirama

Differential Revision: https://reviews.llvm.org/D143543

[CMake] Fix -DBUILD_SHARED_LIBS=on builds after D141446

[SanitizerBinaryMetadata] Make constructors/destructors hidden

By switching them to external with default visibility, DSOs may not call
their own constructor/destructor. This is incorrect, because they pass
different parameters.

Fix it by marking the ctors/dtors as external linkage but with hidden
visibility.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D143611

Fix initialization of th_task_state on each thread on expanding hot teams.

The th_task_state was initialized from the master thread's value, or
from its memo stack, but this causes problems because neither of those
may have the right value at the right time. However, other threads in
the team are guaranteed to have the right values, so we change the
initialize the new threads' th_task_state from the th_task_state of
the last of the older threads in the hot team.

Differential Revision: https://reviews.llvm.org/D142247
Fix #56307.

Remove core file that shouldn't have been committed

[MLGO] Add BB Profile Dump in AsmPrinter

This patch adds a basic block profile dump option within the AsmPrinter
and dumps basic block profile information so that cost models can use
the data for downstream analysis.

Differential Revision: https://reviews.llvm.org/D143311

[Lex] Fix -Wunused-variable for LLVM_ENABLE_ASSERTIONS=off builds after D140179

[Support] Clarify CrashRecoveryContext exception codes on Windows. NFC

Differential revision: https://reviews.llvm.org/D143609

[LSAN] Fix pthread_create interceptor to ignore leaks in real pthread_create.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D143209

[clang][cli] Simplify repetitive macro invocations

Since we now only support Visual Studio 2019 16.7 and newer, we're able to use the `/Zc:preprocessor` flag that turns on the standards-conforming preprocessor. It (among other things) correctly expands `__VA_ARGS__` (see https://learn.microsoft.com/en-us/cpp/preprocessor/preprocessor-experimental-overview?view=msvc-170#macro-arguments-are-unpacked). This enables us to get rid of some repetitive boilerplate in Clang's command-line parser/generator.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D135128

[InstCombine] canonicalize cmp+select as umin/umax

(V == 0) ? 1 : V --> umax(V, 1)
(V == UMAX) ? UMAX-1 : V --> umin(V, UMAX-1)

https://alive2.llvm.org/ce/z/pfDBAf

This is one pair of the variants discussed in issue #60374.

Enhancements for the other end of the constant range and
signed variants are potential follow-ups, but that may
require more work because we canonicalize at least one
min/max like that to icmp+zext.

[InstCombine] add tests for cmp+select; NFC

Add CFI integer types normalization

This commit adds a new option (i.e.,
`-fsanitize-cfi-icall-normalize-integers`) for normalizing integer types
as vendor extended types for cross-language LLVM CFI/KCFI support with
other languages that can't represent and encode C/C++ integer types.

Specifically, integer types are encoded as their defined representations
(e.g., 8-bit signed integer, 16-bit signed integer, 32-bit signed
integer, ...) for compatibility with languages that define
explicitly-sized integer types (e.g., i8, i16, i32, ..., in Rust).

``-fsanitize-cfi-icall-normalize-integers`` is compatible with
``-fsanitize-cfi-icall-generalize-pointers``.

This helps with providing cross-language CFI support with the Rust
compiler and is an alternative solution for the issue described and
alternatives proposed in the RFC
https://github.com/rust-lang/rfcs/pull/3296.

For more information about LLVM CFI/KCFI and cross-language LLVM
CFI/KCFI support for the Rust compiler, see the design document in the
tracking issue https://github.com/rust-lang/rust/issues/89653.

Relands b1e9ab7438a098a18fecda88fc87ef4ccadfcf1e with fixes.

Reviewed By: pcc, samitolvanen

Differential Revision: https://reviews.llvm.org/D139395

[clang][deps] Ensure module invocation can be serialized

When reseting modular options, propagate the values from certain options
that have ImpliedBy relations instead of setting to the default. Also,
verify in clang-scan-deps that the command line produced round trips
exactly.

Ideally we would automatically derive the set of options that need this
kind of propagation, but for now there aren't very many impacted.

rdar://105148590

Differential Revision: https://reviews.llvm.org/D143446

[llvm-profdata] Add option to cap profile output size

D139603 (add option to llvm-profdata to reduce output profile size) contains test cases that are not cross-platform. Moving those tests to unit test and making sure the feature is callable from llvm library

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D141446

[-Wunsafe-buffer-usage] Add unsafe buffer checking opt-out pragmas

Add a pair of clang pragmas:
- `#pragma clang unsafe_buffer_usage begin` and
- `#pragma clang unsafe_buffer_usage end`,
which specify the start and end of an (unsafe buffer checking) opt-out
region, respectively.

Behaviors of opt-out regions conform to the following rules:

- No nested nor overlapped opt-out regions are allowed. One cannot
  start an opt-out region with `... unsafe_buffer_usage begin` but never
  close it with `... unsafe_buffer_usage end`. Mis-use of the pragmas
  will be warned.
- Warnings raised from unsafe buffer operations inside such an opt-out
  region will always be suppressed. This behavior CANNOT be changed by
  `clang diagnostic` pragmas or command-line flags.
- Warnings raised from unsafe operations outside of such opt-out
  regions may be reported on declarations inside opt-out
  regions. These warnings are NOT suppressed.
- An un-suppressed unsafe operation warning may be attached with
  notes. These notes are NOT suppressed as well regardless of whether
  they are in opt-out regions.

The implementation maintains a separate sequence of location pairs
representing opt-out regions in `Preprocessor`.  The `UnsafeBufferUsage`
analyzer reads the region sequence to check if an unsafe operation is
in an opt-out region. If it is, discard the warning raised from the
operation immediately.

This is a re-land after I reverting it at 9aa00c8a306561c4e3ddb09058e66bae322a0769.
The compilation error should be resolved.

Reviewed by: NoQ

Differential revision: https://reviews.llvm.org/D140179

[mlir][sparse] Implement hybrid quick sort for sparse_tensor.sort.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D143227

Revert "[LSAN] Fix pthread_create interceptor to ignore leaks in real pthread_create."

This reverts commit a7db3cb257ff6396481f44427bccd0ca5abf4d63.

Reapply "[cmake][msvc] Enable standards-conforming preprocessor"

This reverts commit 16e1a49441c51817697138437d8db2c15bc19cb4, essentially reapplying 12d8e7c6ade55bba241259312e3e4bdcf6aeab81. The build bot where this caused issues is supposed to be updated now: https://reviews.llvm.org/D135128#4108588

[AMDGPU] Do not exapnd fp atomics on gfx940

FP atomics are safe on gfx940. This fixes regression after D131560.

Fixes: SWDEV-380468

Differential Revision: https://reviews.llvm.org/D143603

[X86] midpoint-int-vec - cleanup common check prefixes

[AMDGPU] Update atomic tests. NFC.

This is to precommit tests before future patch.

[LSAN] Fix pthread_create interceptor to ignore leaks in real pthread_create.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D143209

[DAG] Fold freeze(concat_vectors(x,y,...)) -> concat_vectors(freeze(x),freeze(y),...)

Another of the cleanups necessary for D136529

[mlir][cf] Add support for opaque pointers to ControlFlowToLLVM lowering

Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

This is a very simple patch since there is only one use of pointers types in `cf.assert` that has to be changed. Pointer types are conditionally created with element types and the GEP had to be adjusted to use the array type as base type.

Differential Revision: https://reviews.llvm.org/D143583