review.tizen.org Git - platform/upstream/llvm.git/log

[clang-tidy][NFC] Remove custom isInAnonymousNamespace matchers

Since now the same matcher exists in ASTMatchers.

[ArgPromotion] Convert tests to opaque pointers (NFC)

update_test_checks was rerun for some of those, because we use
a different GEP representation with opaque pointers.

[AggressiveInstCombine] Convert tests to opaque pointers (NFC)

[MLIR][Arith] Remove unused assertions
We shouldn't be checking things that are guaranteed by the op's verifier.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D140610

[Support] Use inplace APInt operators in DivisionByConstantInfo. NFC

Reduces the number of temporary APInts that get created and
copy/moved from.

[ASTMatchers] Add isInAnonymousNamespace narrowing matcher

Used in a couple clang-tidy checks so it could be extracted
out as its own matcher.

Differential Revision: https://reviews.llvm.org/D140328

[X86] Add reduce_*_ep[i|u]8/16 series intrinsics.

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D140531

[libc][obvious] Remove a spurious statement leftover from a previous change.

[OpenMP] [OMPD] Enable OMPD Tests

It was disabled due to different failures it different llvm bots.

Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D138411

[Support] Move some APInt declarations in DivisionByConstantInfo to their first assignment.

This uses copy initialization instead of default constructing the
APInts and assigning over them.

[libc][NFC] Use operator new and operator delete in POSIX file actions API.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D140597

[MLIR][Arith] Canonicalize xor with ext

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D139307

[InstCombine] complete (X << Z) / (Y << Z) --> X / Y

Add one more situations for this fold.
For unsigned div, 'nsw' on both shifts + 'nuw' on the dividend.

Alive2: https://alive2.llvm.org/ce/z/sELF76

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D139997

[BOLT][TEST] Limit iterations in X86/exceptions-pic.test

The test has 3 invocations with 1M iterations each, which adds delay to fast
check-bolt testing. Reduce the number to 1K.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D139651

Add Soft/Hard RSS Limits to Scudo Standalone

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126752

[scudo] Fix return type of GetRSS()

[AVR] Select 16-bit LDS/STS for load/store on AVRTiny.

The 32-bit LDS/STS are not available on AVRTiny, so we have
to use their compact 16-bit form for memory access.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D139687

[AVR] Support 16-bit LDS/STS on AVRTiny.

LDS/STS are 32-bit instructions on AVR, which can access up to
64KB data space. While they are 16-bit instructions on AVRTiny,
which can only access 128B data space.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D139621

[libc++] Granularize <type_traits> includes in <compare>

Reviewed By: Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D140480

[AVR][MC] Fix illegal operand forms.

These operands are illegal and rejected by avr-gcc.
    subi r24, -lo8(symobl+offset)
    sbci r25, -hi8(symobl+offset)

And their correct form should be
    subi r24, lo8(-(symobl+offset))
    sbci r25, hi8(-(symobl+offset))

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D140473

[mlir][sparse] add missing dependent dialect.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140595

[AVR] Fix a bug in AsmPrinter when printing memory operands.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D140383

[NFC][Codegen][X86] Tests w/ final optimized IR of SROA-with-variably-indexed-loads (D140493)

32-byte ones are for consistency only, we really only care about
up to 16-byte on 64-bit and maybe up to 8-byte on 32-bit.

In 16byte ones, we are still having some redundant vec<->scalar traffic.

https://reviews.llvm.org/D140493

[ORC][ORC-RT] Add SimplePackedSerialization support for optionals.

This allows optionals to be serialized and deserialized, and used as arguments
and return values in SPS wrapper functions.

Serialization of optional values is indicated by use of the SPSOptional tag.
SPSOptionals are serialized serialized as a bool (false for no value, true for
value) plus the serialization of the contained value if any. Serialization
to/from std::optional is included in this commit.

This commit includes updates to SimplePackedSerialization in both ORC and the
ORC runtime.

, std::optional serialization.

[mlir][sparse] use sparse_tensor::StorageSpecifier to store dim/memSizes

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140130

[clang-format] Set requires expression params as not an expression

Previously, the parens of a requires expression's "parameters" were not
explicitly set, meaning they ended up as whatever the outer scope was.
This is a problem in some cases though, since the process of determining
star/amp checks if the token is inside of an expression context

This patch always makes sure the context between those parens are always
set to not be an expression

Fixes https://github.com/llvm/llvm-project/issues/59600

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D140330

[clang-format][docs] Fix invalid CSS syntax in versionbadge

CSS uses colons, not the equals sign. The final semicolon is optional,
but preferred to be included. Really, the font property doesn't really
need to be there, but I suppose it was put there for a reason.

It's surprising how lenient browsers are when parsing

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D138441

[lld-macho] Use ld64's LC_LINKER_OPTIONS behavior by default

By default ld64 ignores invalid LC_LINKER_OPTIONS unless the link fails,
in which case it prints a warning. Originally lld chose to be strict
about these, but it has uncovered that many of these exist in open
source projects today, since before developers never would have noticed
this issue. In order to make adoption of lld easier, this mirrors ld64's
behavior, while also adding a `--strict-auto-link-options` flag if
projects want to audit their libraries for these invalid options.

More discussion on https://reviews.llvm.org/D140225
Fixes https://github.com/llvm/llvm-project/issues/59627

Differential Revision: https://reviews.llvm.org/D140491

[lld-macho] Flip string deduplication default

Previously by default, when not using `--ifc=`, lld would not
deduplicate string literals. This reveals reliance on undefined behavior
where string literal addresses are compared instead of using string
equality checks. While ideally you would be able to easily identify and
eliminate the reliance on this UB, this can be difficult, especially for
third party code, and increases the friction and risk of users migrating
to lld. This flips the default to deduplicate strings unless
`--no-deduplicate-strings` is passed, matching ld64's behavior.

Differential Revision: https://reviews.llvm.org/D140517

[NFC][SROA] Rewrite widen-load-of-small-alloca tests to just store result, not call some function

[DAGCombiner] `visitFREEZE()`: be less greedy with replacing other uses of undef

[DAGCombiner] `visitFREEZE()`: allow multiple maybe-poison operands for `BUILD_VECTOR`

[DAGCombine] `BUILD_VECTOR` can not create undef or poison

[NFC][Codegen] Tests for `freeze` of `BUILD_VECTOR`

[NFC][DAGCombiner] `visitFREEZE()`: use early return

[bazel] fix bazel file.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140589

[gn build] Port d64d3c5a8f81

[mlir][vector] Fix bug in extractOp folding

We were missing to check for transpose when folding.
Also add a new file to test folding independently of
canonicalization as canonicalization was hiding the bug.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140533

std::sort: add BlockQuickSort partitioning algorithm for arithmetic types

This diff modifies std::sort in two ways:

* for arithmetic types we update the core partitioning algorithm to use
BlockQuickSort for partitioning. The partition function was carefully
written to let the compiler generates SIMD instructions without actually
writing SIMD intrinsics in the loop. We see up to 50% better performance
for sorting arithmetic types. The use of the BlockQuickSort partitioning
has been limited to arithmetic types since the algorithm works well when
branch instructions can be avoided during partitioning. This usually not
true for types other than the arithmetic ones.

* for other types (tuples, strings) updates have been made to improve
performance by about 10%.  Performance numbers comparing std::sort (old)
and Bitset sort (new) on libcxx benchmark.

name                                                             old cpu/op  new cpu/op  delta
BM_Sort_uint32_Random_1                                          3.72ns ± 5%  3.78ns ±16%      ~     (p=0.819 n=36+34)
BM_Sort_uint32_Random_4                                          5.42ns ± 5%  5.29ns ± 7%    -2.42%  (p=0.000 n=35+31)
BM_Sort_uint32_Random_16                                         10.5ns ± 3%  11.9ns ±15%   +13.08%  (p=0.000 n=36+40)
BM_Sort_uint32_Random_64                                         18.6ns ± 7%  18.5ns ±15%    -0.95%  (p=0.002 n=33+40)
BM_Sort_uint32_Random_256                                        26.2ns ± 4%  21.3ns ± 8%   -18.89%  (p=0.000 n=37+34)
BM_Sort_uint32_Random_1024                                       33.4ns ± 5%  23.3ns ± 4%   -30.37%  (p=0.000 n=39+35)
BM_Sort_uint32_Random_16384                                      47.7ns ± 5%  26.7ns ± 5%   -44.06%  (p=0.000 n=39+35)
BM_Sort_uint32_Random_262144                                     62.6ns ± 3%  30.1ns ± 6%   -51.81%  (p=0.000 n=37+36)
BM_Sort_uint32_Ascending_1                                       3.71ns ± 3%  4.28ns ± 3%   +15.53%  (p=0.000 n=37+35)
BM_Sort_uint32_Ascending_4                                       1.47ns ± 3%  1.46ns ± 3%      ~     (p=0.083 n=36+37)
BM_Sort_uint32_Ascending_16                                      0.93ns ± 4%  1.02ns ± 3%    +9.32%  (p=0.000 n=36+36)
BM_Sort_uint32_Ascending_64                                      1.23ns ± 5%  1.51ns ± 3%   +22.56%  (p=0.000 n=34+36)
BM_Sort_uint32_Ascending_256                                     1.21ns ± 3%  1.57ns ± 4%   +29.77%  (p=0.000 n=33+35)
BM_Sort_uint32_Ascending_1024                                    1.03ns ± 4%  1.43ns ± 3%   +38.44%  (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_16384                                   0.94ns ± 8%  1.36ns ± 5%   +44.09%  (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_262144                                  0.93ns ± 3%  1.35ns ± 7%   +45.06%  (p=0.000 n=32+36)
BM_Sort_uint32_Descending_1                                      3.69ns ± 2%  4.27ns ± 3%   +15.73%  (p=0.000 n=31+36)
BM_Sort_uint32_Descending_4                                      1.74ns ± 2%  1.78ns ± 3%    +2.29%  (p=0.000 n=31+38)
BM_Sort_uint32_Descending_16                                     3.92ns ± 4%  4.20ns ± 4%    +7.13%  (p=0.000 n=32+38)
BM_Sort_uint32_Descending_64                                     2.09ns ± 4%  3.25ns ± 4%   +55.10%  (p=0.000 n=33+37)
BM_Sort_uint32_Descending_256                                    1.98ns ± 7%  2.93ns ± 4%   +47.95%  (p=0.000 n=34+36)
BM_Sort_uint32_Descending_1024                                   2.23ns ± 6%  2.64ns ± 3%   +18.22%  (p=0.000 n=34+38)
BM_Sort_uint32_Descending_16384                                  1.93ns ± 6%  2.43ns ± 4%   +25.99%  (p=0.000 n=34+35)
BM_Sort_uint32_Descending_262144                                 1.89ns ± 3%  2.38ns ± 4%   +25.41%  (p=0.000 n=33+35)
BM_Sort_uint32_SingleElement_1                                   3.67ns ± 2%  4.28ns ± 4%   +16.60%  (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_4                                   1.48ns ± 4%  1.48ns ± 5%      ~     (p=0.951 n=35+33)
BM_Sort_uint32_SingleElement_16                                  0.93ns ± 3%  1.02ns ± 4%    +9.51%  (p=0.000 n=36+33)
BM_Sort_uint32_SingleElement_64                                  0.76ns ± 3%  1.59ns ± 8%  +109.78%  (p=0.000 n=36+32)
BM_Sort_uint32_SingleElement_256                                 0.82ns ± 4%  1.45ns ± 5%   +76.62%  (p=0.000 n=37+34)
BM_Sort_uint32_SingleElement_1024                                0.77ns ± 4%  1.31ns ± 4%   +71.40%  (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_16384                               0.64ns ± 4%  1.24ns ± 6%   +93.29%  (p=0.000 n=35+36)
BM_Sort_uint32_SingleElement_262144                              0.63ns ± 3%  1.23ns ± 4%   +95.17%  (p=0.000 n=35+35)
BM_Sort_uint32_PipeOrgan_1                                       3.68ns ± 2%  4.42ns ± 3%   +20.31%  (p=0.000 n=34+36)
BM_Sort_uint32_PipeOrgan_4                                       1.54ns ± 3%  1.53ns ± 3%      ~     (p=0.128 n=34+36)
BM_Sort_uint32_PipeOrgan_16                                      2.22ns ± 3%  1.99ns ± 3%   -10.28%  (p=0.000 n=33+36)
BM_Sort_uint32_PipeOrgan_64                                      4.41ns ± 3%  3.39ns ± 4%   -23.17%  (p=0.000 n=35+37)
BM_Sort_uint32_PipeOrgan_256                                     2.75ns ± 5%  3.07ns ± 3%   +11.74%  (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_1024                                    3.58ns ± 2%  5.48ns ± 3%   +52.97%  (p=0.000 n=37+36)
BM_Sort_uint32_PipeOrgan_16384                                   4.10ns ± 3%  6.53ns ± 3%   +59.27%  (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_262144                                  4.90ns ± 3%  7.39ns ± 3%   +50.71%  (p=0.000 n=34+37)
BM_Sort_uint32_QuickSortAdversary_1                              3.68ns ± 2%  4.28ns ± 3%   +16.19%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_4                              1.46ns ± 4%  1.46ns ± 3%      ~     (p=0.736 n=35+38)
BM_Sort_uint32_QuickSortAdversary_16                             0.93ns ± 3%  1.02ns ± 4%    +9.69%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_64                             13.6ns ± 4%  17.9ns ± 8%   +31.37%  (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_256                            20.0ns ± 4%  25.7ns ± 4%   +28.69%  (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_1024                           28.3ns ± 6%  31.7ns ± 3%   +12.12%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_16384                          45.8ns ± 3%  50.6ns ± 4%   +10.32%  (p=0.000 n=38+36)
BM_Sort_uint32_QuickSortAdversary_262144                         61.6ns ± 4%  68.2ns ± 4%   +10.68%  (p=0.000 n=37+37)
BM_Sort_uint64_Random_1                                          3.71ns ± 4%  4.00ns ± 4%    +7.93%  (p=0.000 n=34+35)
BM_Sort_uint64_Random_4                                          5.52ns ± 8%  5.22ns ± 6%    -5.41%  (p=0.000 n=32+32)
BM_Sort_uint64_Random_16                                         10.7ns ±15%  10.2ns ± 7%      ~     (p=0.077 n=40+31)
BM_Sort_uint64_Random_64                                         19.0ns ±14%  18.2ns ±14%    -4.31%  (p=0.001 n=40+40)
BM_Sort_uint64_Random_256                                        25.7ns ± 9%  22.1ns ±15%   -13.82%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_1024                                       32.4ns ± 6%  23.8ns ±16%   -26.64%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_16384                                      46.8ns ± 3%  27.1ns ±16%   -42.15%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_262144                                     61.3ns ± 4%  30.4ns ±16%   -50.34%  (p=0.000 n=34+40)
BM_Sort_uint64_Ascending_1                                       3.67ns ± 3%  3.87ns ±16%    +5.36%  (p=0.049 n=35+40)
BM_Sort_uint64_Ascending_4                                       1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.130 n=37+31)
BM_Sort_uint64_Ascending_16                                      1.09ns ± 3%  0.91ns ± 6%   -16.79%  (p=0.000 n=38+32)
BM_Sort_uint64_Ascending_64                                      1.25ns ± 3%  1.29ns ± 5%    +3.11%  (p=0.000 n=38+34)
BM_Sort_uint64_Ascending_256                                     1.37ns ± 3%  1.42ns ± 3%    +3.07%  (p=0.000 n=39+35)
BM_Sort_uint64_Ascending_1024                                    1.12ns ± 3%  1.17ns ± 3%    +5.28%  (p=0.000 n=37+36)
BM_Sort_uint64_Ascending_16384                                   0.98ns ± 3%  1.09ns ± 3%   +10.95%  (p=0.000 n=36+37)
BM_Sort_uint64_Ascending_262144                                  0.98ns ± 3%  1.08ns ± 3%   +10.97%  (p=0.000 n=36+37)
BM_Sort_uint64_Descending_1                                      3.68ns ± 3%  3.67ns ± 3%      ~     (p=0.652 n=36+36)
BM_Sort_uint64_Descending_4                                      1.71ns ± 3%  1.73ns ± 3%    +1.50%  (p=0.000 n=33+34)
BM_Sort_uint64_Descending_16                                     4.96ns ± 2%  5.49ns ± 3%   +10.73%  (p=0.000 n=31+36)
BM_Sort_uint64_Descending_64                                     2.14ns ± 6%  3.03ns ± 3%   +41.72%  (p=0.000 n=32+35)
BM_Sort_uint64_Descending_256                                    2.03ns ± 4%  2.86ns ± 4%   +40.55%  (p=0.000 n=32+34)
BM_Sort_uint64_Descending_1024                                   2.20ns ± 2%  2.29ns ± 3%    +4.20%  (p=0.000 n=31+36)
BM_Sort_uint64_Descending_16384                                  1.89ns ± 3%  2.08ns ± 3%   +10.00%  (p=0.000 n=31+37)
BM_Sort_uint64_Descending_262144                                 1.92ns ± 3%  2.07ns ± 4%    +7.95%  (p=0.000 n=31+36)
BM_Sort_uint64_SingleElement_1                                   3.68ns ± 5%  3.67ns ± 3%      ~     (p=0.716 n=31+37)
BM_Sort_uint64_SingleElement_4                                   1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.557 n=34+37)
BM_Sort_uint64_SingleElement_16                                  1.09ns ± 2%  0.91ns ± 3%   -16.93%  (p=0.000 n=33+36)
BM_Sort_uint64_SingleElement_64                                  0.83ns ± 4%  1.47ns ± 4%   +78.03%  (p=0.000 n=34+34)
BM_Sort_uint64_SingleElement_256                                 0.95ns ± 4%  1.28ns ± 4%   +35.17%  (p=0.000 n=35+35)
BM_Sort_uint64_SingleElement_1024                                0.76ns ± 3%  1.05ns ± 3%   +37.78%  (p=0.000 n=35+33)
BM_Sort_uint64_SingleElement_16384                               0.71ns ± 2%  0.98ns ± 5%   +38.43%  (p=0.000 n=34+33)
BM_Sort_uint64_SingleElement_262144                              0.72ns ± 3%  0.98ns ± 4%   +35.93%  (p=0.000 n=35+33)
BM_Sort_uint64_PipeOrgan_1                                       3.68ns ± 3%  3.68ns ± 3%      ~     (p=0.650 n=35+33)
BM_Sort_uint64_PipeOrgan_4                                       1.53ns ± 2%  1.54ns ± 4%      ~     (p=0.424 n=33+36)
BM_Sort_uint64_PipeOrgan_16                                      2.23ns ± 3%  2.06ns ± 4%    -7.68%  (p=0.000 n=34+35)
BM_Sort_uint64_PipeOrgan_64                                      5.46ns ± 2%  3.41ns ± 4%   -37.67%  (p=0.000 n=33+36)
BM_Sort_uint64_PipeOrgan_256                                     2.92ns ± 4%  2.91ns ± 3%      ~     (p=0.257 n=35+35)
BM_Sort_uint64_PipeOrgan_1024                                    3.72ns ± 3%  5.35ns ± 4%   +43.95%  (p=0.000 n=35+35)
BM_Sort_uint64_PipeOrgan_16384                                   4.12ns ± 3%  6.37ns ± 3%   +54.74%  (p=0.000 n=34+36)
BM_Sort_uint64_PipeOrgan_262144                                  4.99ns ± 3%  7.25ns ± 5%   +45.45%  (p=0.000 n=35+35)
BM_Sort_uint64_QuickSortAdversary_1                              3.67ns ± 2%  3.65ns ± 3%      ~     (p=0.071 n=35+37)
BM_Sort_uint64_QuickSortAdversary_4                              1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.214 n=36+37)
BM_Sort_uint64_QuickSortAdversary_16                             1.09ns ± 3%  0.91ns ± 3%   -16.73%  (p=0.000 n=36+38)
BM_Sort_uint64_QuickSortAdversary_64                             13.7ns ± 3%  17.8ns ± 5%   +29.86%  (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_256                            20.0ns ± 3%  25.9ns ± 3%   +29.25%  (p=0.000 n=35+38)
BM_Sort_uint64_QuickSortAdversary_1024                           28.1ns ± 3%  31.0ns ± 4%   +10.35%  (p=0.000 n=33+37)
BM_Sort_uint64_QuickSortAdversary_16384                          45.8ns ± 2%  50.5ns ± 4%   +10.29%  (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_262144                         64.9ns ± 3%  69.5ns ± 3%    +7.15%  (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_1                            4.03ns ± 5%  4.33ns ± 4%    +7.31%  (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_4                            6.78ns ± 5%  6.71ns ± 4%    -1.09%  (p=0.040 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_16                           25.2ns ± 6%  16.8ns ± 7%   -33.35%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_64                           35.6ns ± 7%  27.2ns ± 8%   -23.73%  (p=0.000 n=34+36)
BM_Sort_pair<uint32, uint32>_Random_256                          43.5ns ±13%  34.0ns ± 8%   -21.78%  (p=0.000 n=32+34)
BM_Sort_pair<uint32, uint32>_Random_1024                         50.6ns ± 8%  40.8ns ± 5%   -19.35%  (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_16384                        66.0ns ± 3%  55.9ns ± 6%   -15.24%  (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_262144                       82.4ns ± 4%  72.0ns ± 5%   -12.64%  (p=0.000 n=32+31)
BM_Sort_pair<uint32, uint32>_Ascending_1                         4.00ns ± 2%  4.50ns ±16%   +12.59%  (p=0.000 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_4                         2.22ns ± 3%  2.34ns ±16%    +5.46%  (p=0.041 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_16                        2.33ns ± 4%  1.30ns ±15%   -44.33%  (p=0.000 n=34+40)
BM_Sort_pair<uint32, uint32>_Ascending_64                        1.39ns ± 4%  1.50ns ± 8%    +8.48%  (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_Ascending_256                       1.47ns ± 4%  1.56ns ± 3%    +5.96%  (p=0.000 n=37+31)
BM_Sort_pair<uint32, uint32>_Ascending_1024                      1.34ns ± 3%  1.35ns ± 4%    +1.22%  (p=0.000 n=38+31)
BM_Sort_pair<uint32, uint32>_Ascending_16384                     1.18ns ± 2%  1.18ns ± 3%      ~     (p=0.687 n=37+32)
BM_Sort_pair<uint32, uint32>_Ascending_262144                    1.18ns ± 3%  1.17ns ± 2%      ~     (p=0.153 n=38+34)
BM_Sort_pair<uint32, uint32>_Descending_1                        4.00ns ± 2%  4.29ns ± 3%    +7.22%  (p=0.000 n=37+36)
BM_Sort_pair<uint32, uint32>_Descending_4                        2.91ns ± 3%  2.92ns ± 3%      ~     (p=0.065 n=37+35)
BM_Sort_pair<uint32, uint32>_Descending_16                       4.96ns ± 4%  6.51ns ± 2%   +31.36%  (p=0.000 n=37+30)
BM_Sort_pair<uint32, uint32>_Descending_64                       3.13ns ± 2%  2.92ns ± 3%    -6.71%  (p=0.000 n=36+37)
BM_Sort_pair<uint32, uint32>_Descending_256                      2.56ns ± 3%  2.73ns ± 5%    +6.55%  (p=0.000 n=35+37)
BM_Sort_pair<uint32, uint32>_Descending_1024                     3.11ns ± 3%  2.34ns ± 4%   -24.85%  (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_Descending_16384                    2.84ns ± 3%  2.14ns ± 5%   -24.48%  (p=0.000 n=37+37)
BM_Sort_pair<uint32, uint32>_Descending_262144                   2.86ns ± 3%  2.15ns ± 3%   -25.08%  (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_SingleElement_1                     3.99ns ± 3%  4.28ns ± 3%    +7.08%  (p=0.000 n=33+35)
BM_Sort_pair<uint32, uint32>_SingleElement_4                     2.32ns ± 6%  2.30ns ± 3%    -0.77%  (p=0.032 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_16                    1.67ns ± 4%  1.27ns ± 4%   -24.13%  (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_64                    1.64ns ± 7%  1.83ns ± 4%   +11.54%  (p=0.000 n=31+35)
BM_Sort_pair<uint32, uint32>_SingleElement_256                   1.57ns ± 3%  1.90ns ± 3%   +21.46%  (p=0.000 n=31+36)
BM_Sort_pair<uint32, uint32>_SingleElement_1024                  1.49ns ±15%  1.63ns ± 3%    +9.42%  (p=0.000 n=40+37)
BM_Sort_pair<uint32, uint32>_SingleElement_16384                 1.29ns ±17%  1.57ns ± 3%   +21.51%  (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_SingleElement_262144                1.26ns ± 4%  1.56ns ± 4%   +24.11%  (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1                         4.01ns ± 2%  4.28ns ± 3%    +6.68%  (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_4                         2.38ns ± 5%  2.42ns ± 4%    +1.61%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16                        4.83ns ± 2%  2.71ns ± 7%   -43.96%  (p=0.000 n=34+34)
BM_Sort_pair<uint32, uint32>_PipeOrgan_64                        4.53ns ± 3%  3.89ns ± 7%   -14.11%  (p=0.000 n=35+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_256                       5.53ns ± 4%  2.81ns ± 4%   -49.13%  (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1024                      6.49ns ± 4%  5.29ns ± 3%   -18.50%  (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16384                     7.21ns ± 4%  5.97ns ± 3%   -17.24%  (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_262144                    7.98ns ± 5%  6.59ns ± 3%   -17.46%  (p=0.000 n=33+33)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1                3.99ns ± 3%  4.27ns ± 3%    +6.95%  (p=0.000 n=36+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_4                2.40ns ± 3%  2.37ns ± 3%    -1.00%  (p=0.007 n=34+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16               4.96ns ± 5%  2.72ns ± 7%   -45.07%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_64               7.24ns ± 4%  7.51ns ± 4%    +3.63%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_256              9.85ns ± 5%  7.12ns ± 4%   -27.70%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1024             11.6ns ± 6%   8.8ns ± 5%   -23.86%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16384            32.7ns ± 3%  20.8ns ± 4%   -36.26%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_262144           36.4ns ± 3%  24.0ns ± 4%   -34.12%  (p=0.000 n=34+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1                   4.04ns ± 6%  4.34ns ± 4%    +7.55%  (p=0.000 n=37+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_4                   7.19ns ± 6%  7.26ns ± 5%    +0.99%  (p=0.042 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16                  30.4ns ± 6%  21.8ns ± 7%   -28.28%  (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_64                  42.8ns ±11%  33.5ns ± 9%   -21.70%  (p=0.000 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_256                 49.9ns ± 6%  40.3ns ± 9%   -19.20%  (p=0.000 n=35+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1024                56.3ns ± 3%  46.1ns ± 4%   -18.08%  (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16384               72.2ns ± 5%  62.1ns ± 3%   -14.05%  (p=0.000 n=37+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_262144              88.7ns ± 6%  79.0ns ± 6%   -10.93%  (p=0.000 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1                3.96ns ± 3%  4.36ns ± 3%    +9.96%  (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_4                2.39ns ± 2%  2.39ns ± 3%      ~     (p=0.604 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16               3.04ns ± 4%  1.48ns ± 3%   -51.20%  (p=0.000 n=34+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_64               2.44ns ± 3%  2.30ns ± 5%    -5.61%  (p=0.000 n=36+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_256              2.35ns ± 3%  2.39ns ± 5%    +1.78%  (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1024             2.12ns ± 5%  2.08ns ± 4%    -1.80%  (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16384            2.02ns ± 3%  2.00ns ± 5%    -1.25%  (p=0.000 n=32+32)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_262144           2.06ns ± 5%  2.11ns ± 9%      ~     (p=0.618 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1               3.97ns ± 2%  4.57ns ±16%   +15.19%  (p=0.000 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_4               3.64ns ± 3%  4.05ns ±15%   +11.05%  (p=0.000 n=33+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16              5.68ns ± 5%  9.36ns ±16%   +64.92%  (p=0.000 n=35+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_64              4.27ns ± 4%  3.88ns ± 8%    -9.13%  (p=0.000 n=35+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_256             3.58ns ± 3%  3.76ns ±14%    +5.12%  (p=0.002 n=38+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1024            4.16ns ± 3%  3.21ns ± 5%   -22.77%  (p=0.000 n=38+31)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16384           3.90ns ± 4%  3.00ns ± 3%   -23.12%  (p=0.000 n=38+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_262144          4.52ns ± 3%  3.42ns ± 3%   -24.29%  (p=0.000 n=38+33)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1            3.97ns ± 3%  4.31ns ± 3%    +8.78%  (p=0.000 n=39+34)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_4            2.54ns ± 2%  2.54ns ± 4%      ~     (p=0.341 n=38+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16           2.39ns ± 3%  1.70ns ± 6%   -28.90%  (p=0.000 n=38+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_64           2.61ns ± 2%  3.23ns ± 3%   +24.07%  (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_256          2.83ns ± 2%  2.97ns ± 4%    +4.83%  (p=0.000 n=35+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1024         2.44ns ± 4%  2.44ns ± 3%      ~     (p=0.481 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16384        2.19ns ± 3%  2.37ns ± 6%    +8.01%  (p=0.000 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_262144       2.34ns ± 2%  2.36ns ± 5%    +1.11%  (p=0.001 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1                3.96ns ± 2%  4.31ns ± 3%    +8.76%  (p=0.000 n=33+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_4                2.65ns ± 6%  2.67ns ± 4%      ~     (p=0.139 n=32+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16               5.64ns ± 3%  3.56ns ± 3%   -36.80%  (p=0.000 n=31+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_64               6.12ns ±16%  5.04ns ± 4%   -17.64%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_256              6.78ns ± 6%  3.73ns ± 3%   -44.94%  (p=0.000 n=31+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1024             8.36ns ±15%  6.51ns ± 4%   -22.13%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16384            9.24ns ±15%  7.91ns ± 3%   -14.34%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_262144           10.7ns ± 3%   9.3ns ± 6%   -12.36%  (p=0.000 n=32+36)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1       3.97ns ± 3%  4.31ns ± 3%    +8.63%  (p=0.000 n=32+35)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_4       2.79ns ± 3%  2.76ns ± 4%    -0.95%  (p=0.002 n=33+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16      5.07ns ± 3%  3.69ns ± 4%   -27.35%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_64      9.26ns ± 3%  8.34ns ± 7%    -9.88%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_256     11.8ns ± 5%   9.7ns ± 3%   -17.83%  (p=0.000 n=37+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1024    19.2ns ± 4%  14.5ns ±10%   -24.59%  (p=0.000 n=36+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16384   45.5ns ± 4%  37.4ns ± 9%   -17.71%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_262144  50.0ns ± 4%  43.2ns ± 3%   -13.69%  (p=0.000 n=35+34)
BM_Sort_string_Random_1                                          4.66ns ± 6%  4.40ns ± 4%    -5.55%  (p=0.000 n=35+37)
BM_Sort_string_Random_4                                          14.9ns ± 3%  15.0ns ± 6%      ~     (p=0.863 n=36+38)
BM_Sort_string_Random_16                                         45.5ns ± 6%  35.8ns ± 8%   -21.37%  (p=0.000 n=36+36)
BM_Sort_string_Random_64                                         66.6ns ± 4%  58.2ns ± 3%   -12.69%  (p=0.000 n=36+37)
BM_Sort_string_Random_256                                        86.0ns ± 5%  77.4ns ± 3%   -10.01%  (p=0.000 n=37+37)
BM_Sort_string_Random_1024                                        106ns ± 3%    96ns ± 6%    -9.39%  (p=0.000 n=37+37)
BM_Sort_string_Random_16384                                       154ns ± 3%   141ns ± 5%    -8.03%  (p=0.000 n=35+36)
BM_Sort_string_Random_262144                                      213ns ± 4%   197ns ± 4%    -7.59%  (p=0.000 n=34+34)
BM_Sort_string_Ascending_1                                       4.59ns ± 2%  4.56ns ±17%    -0.60%  (p=0.002 n=32+40)
BM_Sort_string_Ascending_4                                       7.52ns ± 9%  7.54ns ±12%      ~     (p=0.554 n=37+40)
BM_Sort_string_Ascending_16                                      13.1ns ± 6%   8.8ns ±12%   -33.26%  (p=0.000 n=39+38)
BM_Sort_string_Ascending_64                                      14.8ns ±10%  14.5ns ±11%    -2.15%  (p=0.013 n=40+37)
BM_Sort_string_Ascending_256                                     14.0ns ± 6%  14.1ns ±10%      ~     (p=0.760 n=37+40)
BM_Sort_string_Ascending_1024                                    12.9ns ±10%  12.8ns ±20%      ~     (p=0.055 n=35+40)
BM_Sort_string_Ascending_16384                                   17.2ns ±13%  17.4ns ±21%      ~     (p=1.000 n=37+40)
BM_Sort_string_Ascending_262144                                  17.5ns ±12%  17.5ns ±25%      ~     (p=0.392 n=35+39)
BM_Sort_string_Descending_1                                      4.59ns ± 3%  4.34ns ± 3%    -5.51%  (p=0.000 n=32+33)
BM_Sort_string_Descending_4                                      10.1ns ± 5%   9.8ns ± 4%    -2.84%  (p=0.000 n=36+34)
BM_Sort_string_Descending_16                                     22.0ns ± 4%  39.6ns ± 4%   +79.84%  (p=0.000 n=36+33)
BM_Sort_string_Descending_64                                     21.4ns ±12%  21.3ns ±14%      ~     (p=0.542 n=37+39)
BM_Sort_string_Descending_256                                    19.4ns ±13%  18.9ns ±13%    -2.74%  (p=0.039 n=37+39)
BM_Sort_string_Descending_1024                                   22.7ns ± 5%  17.6ns ±15%   -22.52%  (p=0.000 n=35+40)
BM_Sort_string_Descending_16384                                  27.9ns ±14%  22.6ns ±10%   -19.11%  (p=0.000 n=40+37)
BM_Sort_string_Descending_262144                                 33.8ns ±14%  26.1ns ±21%   -22.74%  (p=0.000 n=39+38)
BM_Sort_string_SingleElement_1                                   4.58ns ± 2%  4.35ns ± 3%    -5.14%  (p=0.000 n=35+37)
BM_Sort_string_SingleElement_4                                   7.92ns ± 3%  7.92ns ± 7%      ~     (p=0.625 n=38+39)
BM_Sort_string_SingleElement_16                                  18.0ns ± 3%   7.9ns ± 6%   -56.23%  (p=0.000 n=36+35)
BM_Sort_string_SingleElement_64                                  20.3ns ± 5%  19.3ns ±15%    -4.83%  (p=0.000 n=34+38)
BM_Sort_string_SingleElement_256                                 19.4ns ± 7%  18.1ns ±14%    -6.67%  (p=0.000 n=36+39)
BM_Sort_string_SingleElement_1024                                19.3ns ± 9%  17.4ns ±17%    -9.40%  (p=0.000 n=35+40)
BM_Sort_string_SingleElement_16384                               17.5ns ±12%  16.2ns ±20%    -7.91%  (p=0.000 n=37+40)
BM_Sort_string_SingleElement_262144                              16.7ns ±18%  15.3ns ±27%    -8.56%  (p=0.000 n=40+40)
BM_Sort_string_PipeOrgan_1                                       4.60ns ± 2%  4.33ns ± 3%    -5.80%  (p=0.000 n=33+31)
BM_Sort_string_PipeOrgan_4                                       8.29ns ± 4%  8.17ns ± 8%    -1.50%  (p=0.004 n=39+36)
BM_Sort_string_PipeOrgan_16                                      22.9ns ± 3%  16.4ns ± 6%   -28.45%  (p=0.000 n=39+38)
BM_Sort_string_PipeOrgan_64                                      30.7ns ± 4%  28.9ns ± 7%    -6.05%  (p=0.000 n=38+37)
BM_Sort_string_PipeOrgan_256                                     38.1ns ± 3%  22.5ns ± 9%   -40.78%  (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_1024                                    45.4ns ± 4%  36.2ns ± 6%   -20.33%  (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_16384                                   56.2ns ± 4%  49.0ns ± 8%   -12.73%  (p=0.000 n=36+38)
BM_Sort_string_PipeOrgan_262144                                  77.8ns ±13%  62.8ns ±10%   -19.27%  (p=0.000 n=39+39)
BM_Sort_string_QuickSortAdversary_1                              4.80ns ±16%  4.34ns ± 4%    -9.56%  (p=0.000 n=39+34)
BM_Sort_string_QuickSortAdversary_4                              14.8ns ± 5%  14.7ns ± 4%    -0.80%  (p=0.037 n=33+33)
BM_Sort_string_QuickSortAdversary_16                             44.6ns ± 4%  34.8ns ± 5%   -21.98%  (p=0.000 n=35+34)
BM_Sort_string_QuickSortAdversary_64                             66.2ns ± 3%  58.1ns ± 4%   -12.32%  (p=0.000 n=36+35)
BM_Sort_string_QuickSortAdversary_256                            85.4ns ± 5%  76.9ns ± 6%    -9.99%  (p=0.000 n=36+36)
BM_Sort_string_QuickSortAdversary_1024                            106ns ± 4%    96ns ± 3%    -9.62%  (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_16384                           153ns ± 3%   141ns ± 4%    -8.22%  (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_262144                          211ns ± 5%   195ns ± 6%    -7.77%  (p=0.000 n=35+38)

Differential Revision: https://reviews.llvm.org/D122780

[mlir][sparse] introduce sparse_tensor::StorageSpecifierToLLVM pass

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140122

[LowerExpectIntrinsic] Propagate branch weights through phi values when ExpectedValue is unlikely in LowerExpectIntrinsic

Update handlePhiDef to consider the probability argument in an expect.with.probability intrinsic when annotating BranchInsts.
In addition, we also disallow non-constant probability arguments in this intrinsic.

Differential Revsion: https://reviews.llvm.org/D140337

[libc][NFC] Use operator delete to cleanup a File object.

The File API has been refactored to allow cleanup using operator delete.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D140574

Revert "Emit unwind information in the .debug_frame section when the .cfi_sections .debug_frame directive is used."

This reverts commit d2cbdb6bef31bdc3254daf57148225ea4b34520c.

This is because we are seeing linker crashes in the internal apple bots.

[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.

We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.

Differential Revision: https://reviews.llvm.org/D139948

[libc++] Granularize <type_traits> includes in <utility>

Reviewed By: Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D140426

SCCP: Add failing testcase with llvm.ssa.copy

SCCP: Don't assert on constantexpr casts of function uses

This includes 2 different, related fixes:

1. Fix asserting on direct assume-like intrinsic uses of a function
address

2. Fix asserting on constant expression casts used by assume-like
intrinsics.

By default hasAddressTaken permits assume-like intrinsic uses, which
ignores assume-like calls and pointer casts of the address used by
assume-like calls.

Fixes #59602, but there are additional issues I encountered when
debugging this. For instance, the original failing bitcast expression
was really unused. Clang tentatively created it for the function type,
but was unnecessary after applyGlobalValReplacements. That did not
clean up the now dead ConstantExpr which hung around oun the user
list, so this assert only reproduced when running clang from the
original testcase, and didn't just running opt -passes=ipsccp. I don't
know who is responsible for cleaning up unused ConstantExprs, but I've
run into similar issues several times recently.

Additionally, I found a few assertions with llvm.ssa.copy with
functions and casts of functions as the argument.

Another issue theoretically exists if hasAddressTaken chooses to
respect nocapture when passed function addresses. The search here
would need to do additional work to look at the users of the constant
cast to see if any call sites need returned to be stripped.

[CSKY] Fix MachineFunctionInfo initialization after 69e75ae695d9ef1360a2a1fbefd6e0e0456c3f7b

[clang] Remove redundant initialization of std::optional (NFC)

[mlir][GPU] Add known_block_size and known_grid_size to gpu.func

In many cases, the the number of workgroups (the grid size) and the
number of workitems within each group (the block size) that a GPU
kernel will be launched with are known. For example, if gpu.launch is
called with constant block and grid sizes, we know that those are the
only possible sizes that will be used to launch that kernel. In other
cases, a custom code-generation pipeline that eventually produces GPU
kernels may know the launch dimensions of those kernels, or at least
may be able to provide an upper bound on them.

Other GPU programming systems, such as OpenCL, allow capturing such
information to enable compiler optimizations - see
reqd_work_group_size, but MLIR currently has no mechanism for doing so.

This set of attributes is the first step in enabling optimizations
based on the known launch dimensions of kernels. It extends the kernel
outline pass to set these bounds on kernels with constant launch
dimensions and extends integer range inference for GPU index
operations to account for the bounds when they are known.

Subsequent revisions will use this data when lowering GPU operations
to the ROCDL dialect.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D139865

[gn build] Port 17ed8f29287b

[BOLT][AArch64] Handle adrp+ld64 linker relaxations

Linker might relax adrp + ldr got address loading to adrp + add for
local non-preemptible symbols (e.g. hidden/protected symbols in
executable). As usually linker doesn't change relocations properly after
relaxation, so we have to handle such cases by ourselves. To do that
during relocations reading we change LD64 reloc to ADD if instruction
mismatch found and introduce FixRelaxationPass that searches for ADRP+ADD
pairs and after performing some checks we're replacing ADRP target symbol
to already fixed ADDs one.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D138097

[lld-macho] Fix assert when splitting section

Fixes https://github.com/llvm/llvm-project/issues/59649

Differential Revision: https://reviews.llvm.org/D140518

[mlir][sparse] make loop emitter API more concise.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140583

[LangRef] Add description for nocallback attribute

This patch adds the description for nocallback attribute
that is implemented in https://reviews.llvm.org/D90275.

Differential Revision: https://reviews.llvm.org/D131628

[Linker] Remove nocallback attribute while linking

GCC's leaf attribute is lowered to LLVM IR nocallback attribute.
Clang conservatively treats this function attribute as a hint on the
module level, and removes it while linking modules. More context can
be found in: https://reviews.llvm.org/D131628.

Differential Revision: https://reviews.llvm.org/D137360

[DAGCombine][X86] Pull one-use `freeze` out of `extract_vector_elt` vector operand

This may allow us to further simplify the vector,
and freezing the extracted result is still fine:
```
----------------------------------------
define i8 @src(<2 x i8> %src, i64 %idx) {
%0:
  %i1 = freeze <2 x i8> %src
  %i2 = extractelement <2 x i8> %i1, i64 %idx
  ret i8 %i2
}
=>
define i8 @tgt(<2 x i8> %src, i64 %idx) {
%0:
  %i1 = extractelement <2 x i8> %src, i64 %idx
  %i2 = freeze i8 %i1
  ret i8 %i2
}
Transformation seems to be correct!
```

BUT, there must not be other uses of that freeze,
see `@freeze_extractelement_extra_use`.

Also, looks like we are missing some ISEL-level handling for freeze.

[NFC][Codegen][X86] Add tests where we could improve `freeze` handling

[Driver] Revert D139717 and add -Xparser/-Xcompiler instead

Some macOS projects use -Xparser even if it leads to a
-Wunused-command-line-argument warning. It doesn't justify adding a broad Joined
`-X` (IgnoredGCCCompat) as GCC doesn't really support these arbitrary `-X`
options.

Note: `-Xcompiler foo` is a GNU libtool option, not a driver option.
It is misused by some ChromeOS packages (but not by Gentoo).
Keep it for a while.

It seems that GCC < 4.6 reports g++: unrecognized option '-Xfoo' but exit with 0.
GCC >= 4.6 reports g++: error: unrecognized option '-Xfoo' and exits with 1.
It never supports -Xcompiler or -Xparser, so `IgnoredGCCCompat` is not justified.

Differential Revision: https://reviews.llvm.org/D140224

[mlir][sparse] move loop boundary method to codegenenv

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D140578

Remove incorrectly implemented -mibt-seal

The option from D116070 does not work as intended and will not be needed when
hidden visibility is used. A function needs ENDBR if it may be reached
indirectly. If we make ThinLTO combine the address-taken property (close to
`!GV.use_empty() && !GV.hasAtLeastLocalUnnamedAddr()`), then the condition can
be expressed with:

`AddressTaken || (!F.hasLocalLinkage() && (VisibleToRegularObj || !F.hasHiddenVisibility()))`

The current `F.hasAddressTaken()` condition does not take into acount of
address-significance in another bitcode file or ELF relocatable file.

For the Linux kernel, it uses relocatable linking. lld/ELF uses a
conservative approach by setting all `VisibleToRegularObj` to true.
Using the non-relocatable semantics may under-estimate
`VisibleToRegularObj`. As @pcc mentioned on
https://github.com/ClangBuiltLinux/linux/issues/1737#issuecomment-1343414686
, we probably need a symbol list to supply additional
`VisibleToRegularObj` symbols (not part of the relocatable LTO link).

Reviewed By: samitolvanen

Differential Revision: https://reviews.llvm.org/D140363

[DAGCombiner] `visitFREEZE()`: allow, and update, other uses of maybe-poison operand

[NFC][Codegen][X86] Add test for freeze with other uses of maybe-poison operand

[libc][obvious] fix errno for 32 bit long test

one of the tests in StrtolTest.h is intended to detect that 32 bit longs
are handled correctly, but it wasn't using the correct value for errno
causing failures.

Differential Revision: https://reviews.llvm.org/D140577

[lldb] Fix a warning

This patch fixes:

lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp:1378:16:
warning: control reaches end of non-void function [-Wreturn-type]

Revert D138179 "MIPS: fix build from IR files, nan2008 and FpAbi"

This reverts commit 9739bb81aed490bfcbcbbac6970da8fb7232fd34.
It causes `.module is not permitted after generating code`
for Linux kernel's `ARCH=mips 32r1_defconfig` clang+GNU as build.
It's confirmed as a defect, but the proper fix needs time to sort out.

Fix indentation in LangRef.rst

Sphinx build was broken.

Support: Fix broken C++ marker

[mlir] Fix a warning

This patch fixes:

mlir/lib/Analysis/DataFlow/SparseAnalysis.cpp:321:19: warning:
unused variable ‘block’ [-Wunused-variable]

[libc] Handle allocation failures gracefully in FILE related API.

Few uses of free have not yet been replaced by the custom operator
delete yet. They will be done in a follow up patch.

Reviewed By: lntue, michaelrj

Differential Revision: https://reviews.llvm.org/D140526

[mlir] Fix warnings

This patch fixes:

  third-party/unittest/googletest/include/gtest/gtest.h:1526:11:
  error: comparison of integers of different signs: 'const unsigned
  long' and 'const int' [-Werror,-Wsign-compare]

Properly support LLVM_ENABLE_LLD on Windows

Currently, setting -DLLVM_ENABLE_LLD=ON on windows also requires setting
-DCMAKE_LINKER=lld-link.exe. This is both misleading and redundant.

Fix this by trying to find llvm-link.exe when -DLLVM_ENABLE_LLD=ON is
set and CMAKE_LINKER is not, and aborting otherwise.

Differential Revision: https://reviews.llvm.org/D140534

[VPlan] Add support for tracking UFs applicable to VPlan (NFC).

Explicitly track the UFs supported in a VPlan. This is needed to
allow transformations to restrict the UFs which are supported.

Discussed as separate improvement in D135017.

[SPARC] Fix SELECT_REG emission for f128s

In LowerSELECT_CC, SELECT_REG between two f128s should only be emitted if we
have hardware quadfloat enabled.

This should fix issue #59646

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D140515

[libc++][CI] Improves buildbot runner.

Adds documentation and shows a help message when running the script
without arguments.

Lands parts of D139545.

[gn build] Port eb6e13cb3280

[libc] change str to int tests to be templated

Previously the tests were copy/pasted into several files, this changes
them to be instead templated and sharing one file.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D140441

[libc++][format] Removes test redundancy.

The format function test serve two purposes:
- Test whether all format functions work in general.
- Test whether all formatting rules are implemented correctly.

At the moment the *pass.cpp tests do both. These tests are quite slow,
while testing all rules for all functions doesn't add much coverage.

There are two execution modi of the format functions:
- run-time validation in the vformat functions.
- compile-time validation in the other function.

So instead of running all tests for all functions, they are only used for
format.pass.cpp and vformat.pass.cpp still do all tests.

The other tests do a smaller set of test, just to make sure they work in the
basics.

Running the format tests using one thread:
- before 00:04:16
- after 00:02:14

The slow tests were also reported in
https::llvm.org/PR58141

Also split a generic part of the test to a generic support header. This
allows these parts to be reused in the range-based formatter tests.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D140115

[libc++][format] Adds formatter for tuple and pair

Implements parts of
- P2286R8 Formatting Ranges

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D136775

[mlir][sparse] completed codegen environment privatization

All members are now private and access is through delegate
or convenience methods only (except the loop emitter, which
is still under refactoring).

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D140519

[VPlan] Add unittest for printing plans with VFs and UFs (NFC).

[mlir][spirv] Add StreamingInterfaceINTEL to SPIRVBase.td

StreamingInterfaceINTEL has been recently added to the SPIR-V headers:
https://github.com/KhronosGroup/SPIRV-Headers/commit/70ff9d939cd7fd0c758756ac57ab0c7c6d6c64d6

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D140476

[Sanitizer] Fix page alignment for mmap calls

We are in the process of enabling sanitizer_common unit tests on arm64 for apple devices. rdar://101436019

The test `CompactRingBuffer.int64` is failing on arm64 with the error:

```==17265==ERROR: SanitizerTool failed to deallocate 0xfffffffffffff000 (-4096) bytes at address 0x000105c30000
SanitizerTool: CHECK failed: sanitizer_posix.cpp:63 "(("unable to unmap" && 0)) != (0)" (0x0, 0x0) (tid=157296)```

If page size is sufficiently larger than alignment then this code:
UnmapOrDie((void*)end, map_end - end);
end is will be greater than map_end causing the value passed to UnmapOrDie to be negative.

This is caused when GetPageSizeCached returns 16k and alignment is 8k.
map_size and what is mapped by mmap uses size and alignment which is smaller than what is calculated by end using the actual page size.
Therefore, map_end ends up being less than end.
The call to mmap is allocating sufficent page-aligned memory, because it calls RoundUp within MmapOrDieOnFatalError.
But this size is not being captured by map_size.

We can address this by rounding up map_size here to be page-aligned. This ensures that map_end will be greater than or equal to end and that it will match mmaps use of page-aligned value, and the
subsequent call to munmap will also be page-aligned.

Differential Revision: https://reviews.llvm.org/D140353

[IR/MachineOutliner] Add a "nooutline" function attr and respect it

Add `nooutline` + update LangRef to say it exists.

This makes it possible to say "don't outline from this function ever."

We want to be able to toggle whether or not a function should be in the search
set regardless of default behaviour.

Add testcases for the IR Outliner + Machine Outliner.

Also remove an unnecessary check for an empty function in the Machine Outliner.

Differential Revision: https://reviews.llvm.org/D140438

[libc] add exponent format to printf

Add support for the %e/E conversion in printf, as well as unit tests. It
does not yet support long doubles.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D140042

[lldb] Add LTO dependency to lldb test suite

Make the lldb test target depend on LTO, since TestFullLtoStepping
needs it (prior to this patch, running "ninja check-lldb" would not
build libLTO).

Differential Revision: https://reviews.llvm.org/D140051

AMDGPU: Modernize sqrt f64 test

Use the readfirstlane hack for the scalar cases as a hack to
combine globalisel and sdag tests. gfx6 stores are a bit broken
in globalisel, and scalar returns are totally broken in sdag.

Add aligned_alloc to symbolizer symbols list.

New symbol used by libcxx as of https://reviews.llvm.org/D138196, needs
to be added to the symbol deps list.

Support: Add polling option to sys::Wait

Currently the process is terminated after the timeout. Add an option
to let the process resume after the timeout instead.

https://reviews.llvm.org/D138952

AMDGPU: Update constant address spaces used in printf test

This was never updated for the address space number shuffle.

AMDGPU: Use early continue to reduce indentation

[mlir] Allow specifying benefit for C func ptr style patterns.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D139234

[AArch64] Add RSHRN and RSHRN2 patterns

This adds some tablegen patterns for RSHRN, which performs a rounding
shift with narrow. This is similar to the existing SHRN patterns with an
extra addition to perform the rounding, that adds 1<<(shift-1) before
the right shift. Because the round immediate and the shift amount are
tied, it goes via a ComplexPattern that uses a SelectRoundingVLShr
method to perform the selection checks.

aarch64_neon_rshrn are expanded into the sequence of equivalent
instructions (trunc(shr(add(x, 1<<(sht-1)), sht))) so that they can be
converted back into RSHRN. Which also allows us to match raddhn through
the adjusted patterns that previously used aarch64_neon_rshrn.

DIfferential Revision: https://reviews.llvm.org/D140297

Small fixes to creduce-clang-crash.py script.

Specify python3, and replace / with // to do integer division.

mlir/tblgen test: add a test for EnumAttr customAssemblyFormat

attr-or-type-format.td contains tests for various attributes and types,
but nowhere in the testsuite is the customAssemblyFormat for an EnumAttr
(enum class in C++) exercised. Fix this by adding a test.

Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Differential Revision: https://reviews.llvm.org/D140034

[GlobalISel][Legalizer] add minScalarIf action

Ensure scalar is at least as wide as type, but only if the specified condition
is met.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D140305

[AVR] Do not emit instructions invalid for attiny10

The attiny4/attiny5/attiny9/attiny10 have a slightly modified
instruction set that drops a number of useful instructions. This patch
makes sure to not emit them on these "reduced tiny" cores.

The affected instructions are:

  * lds and sts (load/store directly from data)
  * ldd and std (load/store with displacement)
  * adiw and sbiw (add/sub register pairs)
  * various other instructions that were emitted without checking
    whether the chip actually supports them (movw, adiw, etc)

There is a variant on lds and sts on these chips, but it can only
address a limited portion of the address space and is mainly useful to
load/store I/O registers (as an extension to the in and out
instructions). I have not implemented it here, implementing it can be
done in a separate patch.

This patch is not optimal. I'm sure it can be improved a lot. For
example, we could teach the instruction selector to not select lddw/stdw
instructions so that the weird pointer adjustments are not necessary.
But for now I've focused just on correctness, not on code quality.

Updates: https://github.com/llvm/llvm-project/issues/53459

Differential Revision: https://reviews.llvm.org/D131867

[AMDGPU] Remove permlane discard vdst_in optimization from isel

D72845 implemented the equivalent IR optimization in InstCombine so it
seems that there's no advantage to doing it during isel too.

This partially reverts D72844.

Differential Revision: https://reviews.llvm.org/D140546

Apply clang-tidy fixes for llvm-else-after-return in TensorOps.cpp (NFC)

Apply clang-tidy fixes for llvm-else-after-return in SparseVectorization.cpp (NFC)