platform/upstream/llvm.git
19 months ago[TTI][NFC]Remove trailing spaces, NFC.
Alexey Bataev [Fri, 23 Dec 2022 16:00:50 +0000 (08:00 -0800)]
[TTI][NFC]Remove trailing spaces, NFC.

19 months ago[mlir] Enable types to use custom assembly formats involving optional attributes.
Nick Kreeger [Fri, 23 Dec 2022 15:55:15 +0000 (09:55 -0600)]
[mlir] Enable types to use custom assembly formats involving optional attributes.

Author: Laszlo Kindrat <laszlokindrat@gmail.com>
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D140322

19 months ago[NFC][OpenMP] Fix compile warning caused by using `std::move` on a local object on...
Shilei Tian [Fri, 23 Dec 2022 15:42:29 +0000 (10:42 -0500)]
[NFC][OpenMP] Fix compile warning caused by using `std::move` on a local object on a `return` statement

19 months ago[LoopUnroll] Convert some tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 15:33:28 +0000 (16:33 +0100)]
[LoopUnroll] Convert some tests to opaque pointers (NFC)

19 months ago[DAGCombiner] `visitFREEZE()`: fix cycle breaking
Roman Lebedev [Fri, 23 Dec 2022 15:10:39 +0000 (18:10 +0300)]
[DAGCombiner] `visitFREEZE()`: fix cycle breaking

Depending on the particular DAG, we might either create a `freeze`,
or not. And only in the former case, the cycle would be formed.
It would be nicer to have `ReplaceAllUsesOfValueWithIf()`,
like we have in IR, but we don't have that.

Fixes https://github.com/llvm/llvm-project/issues/59677

19 months agoValueTracking: Teach canCreateUndefOrPoison about saturating intrinsics
Matt Arsenault [Fri, 23 Dec 2022 03:38:10 +0000 (22:38 -0500)]
ValueTracking: Teach canCreateUndefOrPoison about saturating intrinsics

19 months agoInstCombine: Add baseline tests for saturating poison handling
Matt Arsenault [Fri, 23 Dec 2022 03:37:12 +0000 (22:37 -0500)]
InstCombine: Add baseline tests for saturating poison handling

19 months ago[libc++] Add custom clang-tidy checks
Nikolas Klauser [Sat, 13 Aug 2022 20:33:12 +0000 (22:33 +0200)]
[libc++] Add custom clang-tidy checks

Reviewed By: #libc, ldionne

Spies: jwakely, beanz, smeenai, cfe-commits, tschuett, avogelsgesang, Mordante, sstefan1, libcxx-commits, ldionne, mgorny, arichardson, miyuki

Differential Revision: https://reviews.llvm.org/D131963

19 months agofix warn-xparser test
Mikhail Goncharov [Fri, 23 Dec 2022 14:32:59 +0000 (15:32 +0100)]
fix warn-xparser test

for https://reviews.llvm.org/D140224

19 months ago[gn] port f29cfab55d1f
Nico Weber [Fri, 23 Dec 2022 14:29:38 +0000 (09:29 -0500)]
[gn] port f29cfab55d1f

19 months ago[InlineAdvisor] Restructure advisor plugin unittest cmake
ibricchi [Fri, 23 Dec 2022 14:12:36 +0000 (09:12 -0500)]
[InlineAdvisor] Restructure advisor plugin unittest cmake

Move the plugin used in the unittest to test Inline Advisor Plugins
into a separate folder to clean up the cmake file for the analysis
tests.

Differential Revision: https://reviews.llvm.org/D140559

19 months ago[DAGCombiner] `visitFREEZE()`: fix handling of no maybe-poison ops
Roman Lebedev [Fri, 23 Dec 2022 14:14:22 +0000 (17:14 +0300)]
[DAGCombiner] `visitFREEZE()`: fix handling of no maybe-poison ops

The original code was confusing. It was stripping poison-generating flags,
but the comments were saying that doing so was a TODO.

If the poison-generating flags are present, then even if all operands
are guaranteed not to be undef or poison, the whole operation may still
produce undef or poison. We can still deal with that case,
and we already do deal with it in fact, by also dropping those flags.

Refs. https://github.com/llvm/llvm-project/issues/59676

19 months ago[DAGCombiner] `visitFREEZE()`: restore previous behaviour on no maybe-poison operands
Roman Lebedev [Fri, 23 Dec 2022 14:02:14 +0000 (17:02 +0300)]
[DAGCombiner] `visitFREEZE()`: restore previous behaviour on no maybe-poison operands

Lack of such operands implies that the op might be poison-producing due to
its flags. We seem to drop them already, but the comments are confusing.

Fixes https://github.com/llvm/llvm-project/issues/59676

19 months ago[RISCV] Combine comparison and logic ops
Ilya Andreev [Tue, 13 Sep 2022 13:01:56 +0000 (09:01 -0400)]
[RISCV] Combine comparison and logic ops

Two comparison operations and a logical operation are combined into a selection using MIN or MAX followed by a single comparison operation.
For the optimization to be applied, the following conditions have to be satisfied:
  1. The two comparison operations must share one common operand.
  2. Only signed and unsigned integers are supported.
  3. Both comparisons must be the same with respect to the common operand.
  4. The comparisons have no users other than the logic operation.
  5. Every combination of comparison and AND, OR is supported.

It will convert
  %l0 = %a < %c
  %l1 = %b < %c
  %res = %l0 or %l1
into
  %sel = min(%a, %b)
  %res = %sel < %c

It supports several comparison operations (<, <=, >, >=), signed and unsigned values, and different operand orders, as long as the conditions above are not violated.
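
The equivalence behind this combine can be checked exhaustively on a small range of integers. The sketch below is only an illustration in plain C++. The helper names are hypothetical and are not part of the RISC-V backend, and pairing AND with MAX for `<` is this sketch's reading of "MIN or MAX", not something the message spells out.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper names, used only for this illustration.

// Original form with OR: two comparisons against the common operand c.
bool orOfLess(int a, int b, int c) { return (a < c) || (b < c); }
// Combined form for OR: one MIN, then a single comparison.
bool minThenLess(int a, int b, int c) { return std::min(a, b) < c; }

// Original form with AND: two comparisons against the common operand c.
bool andOfLess(int a, int b, int c) { return (a < c) && (b < c); }
// Combined form for AND: one MAX, then a single comparison.
bool maxThenLess(int a, int b, int c) { return std::max(a, b) < c; }

// Exhaustively compares both forms for all a, b, c in [lo, hi].
bool combineHoldsOnRange(int lo, int hi) {
  assert(lo <= hi && "empty range");
  for (int a = lo; a <= hi; ++a)
    for (int b = lo; b <= hi; ++b)
      for (int c = lo; c <= hi; ++c)
        if (orOfLess(a, b, c) != minThenLess(a, b, c) ||
            andOfLess(a, b, c) != maxThenLess(a, b, c))
          return false;
  return true;
}
```

Calling `combineHoldsOnRange(-20, 20)` compares both pairs of forms on every triple in that range.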

Differential Revision: https://reviews.llvm.org/D134277

19 months ago[RISCV][test] Combine comparison and logic ops
Ilya Andreev [Tue, 13 Sep 2022 13:01:56 +0000 (09:01 -0400)]
[RISCV][test] Combine comparison and logic ops

Two comparison operations and a logical operation are combined into a selection using MIN or MAX followed by a single comparison operation.
For the optimization to be applied, the following conditions have to be satisfied:
  1. The two comparison operations must share one common operand.
  2. Only signed or unsigned integers are supported.
  3. Both comparisons must be the same with respect to the common operand.
  4. The comparisons have no users other than the logic operation.
  5. Every combination of comparison and AND, OR is supported.

It will convert
  %l0 = %a < %c
  %l1 = %b < %c
  %res = %l0 or %l1
into
  %sel = min(%a, %b)
  %res = %sel < %c

It supports several comparison operations (<, <=, >, >=), signed and unsigned values, and different operand orders, as long as the conditions above are not violated.

19 months ago[gn] port ba0ec6f15f55
Nico Weber [Fri, 23 Dec 2022 14:08:56 +0000 (09:08 -0500)]
[gn] port ba0ec6f15f55

19 months ago[clang] Remove deprecated ControlFlowContext::build()
Dmitri Gribenko [Fri, 23 Dec 2022 14:03:48 +0000 (15:03 +0100)]
[clang] Remove deprecated ControlFlowContext::build()

Reviewed By: merrymeerkat

Differential Revision: https://reviews.llvm.org/D140625

19 months ago[NFC][NVPTX] Remove dead override
Luke Drummond [Tue, 20 Dec 2022 00:05:46 +0000 (00:05 +0000)]
[NFC][NVPTX] Remove dead override

After 68f2218e1e, NVPTXTargetObjectFile::Initialize is an empty wrapper
of the parent method. Get rid of it.

Differential Revision: https://reviews.llvm.org/D140397

19 months agoApply shortened printing/parsing form to linalg.reduce.
Aliia Khasanova [Fri, 23 Dec 2022 13:38:29 +0000 (14:38 +0100)]
Apply shortened printing/parsing form to linalg.reduce.

Differential Revision: https://reviews.llvm.org/D140622

19 months ago[clang] Migrate away from a deprecated Clang CFG factory function
Dmitri Gribenko [Fri, 23 Dec 2022 13:26:34 +0000 (14:26 +0100)]
[clang] Migrate away from a deprecated Clang CFG factory function

Reviewed By: merrymeerkat

Differential Revision: https://reviews.llvm.org/D140620

19 months ago[LV] Move exit cond simplification to separate transform.
Florian Hahn [Fri, 23 Dec 2022 12:51:20 +0000 (12:51 +0000)]
[LV] Move exit cond simplification to separate transform.

This sets the stage for D133017 by moving out the code that performs
VPlan based simplifications to a separate transform that takes the
chosen VF & UF as arguments.

The main advantage is that this transform runs before any changes to
the CFG are being made. This allows using SCEV without worrying about
making queries while the IR is in an incomplete state.

Note that this patch switches the reasoning to use SCEV, but still only
simplifies loops with constant trip counts. Using SCEV here is needed to
access the backedge taken count, because the trip count IR value has not
been created yet.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D135017

19 months agoRevert "[clang] Use a StringRef instead of a raw char pointer to store builtin and...
serge-sans-paille [Fri, 23 Dec 2022 12:25:58 +0000 (13:25 +0100)]
Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"

There are still remaining issues with GCC 12, see for instance

https://lab.llvm.org/buildbot/#/builders/93/builds/12669

This reverts commit 5ce4e92264102de21760c94db9166afe8f71fcf6.

19 months ago[NFC][NVPTX] Remove dead comment and commented code
Luke Drummond [Mon, 19 Dec 2022 14:59:12 +0000 (14:59 +0000)]
[NFC][NVPTX] Remove dead comment and commented code

A confusing comment after the last return statement in
`NVPTXAsmPrinter::doFinalization` referred to a preprocessor macro
(NVISA) that has never existed since the NVPTX backend has been a part
of upstream llvm - as far as the pickaxe will tell me anyway. Thus I've
removed it.

Differential Revision: https://reviews.llvm.org/D140399

19 months ago[mlir] Add option to limit number of pattern rewrites in CanonicalizerPass
Matthias Springer [Fri, 23 Dec 2022 12:01:00 +0000 (13:01 +0100)]
[mlir] Add option to limit number of pattern rewrites in CanonicalizerPass

The greedy pattern rewriter consists of two nested loops. `config.maxIterations` (which is configurable on the CanonicalizerPass) controls the maximum number of iterations of the outer loop.

```
/// This specifies the maximum number of times the rewriter will iterate
/// between applying patterns and simplifying regions. Use `kNoLimit` to
/// disable this iteration limit.
int64_t maxIterations = 10;
```

This change adds `config.maxNumRewrites`, which controls the maximum number of pattern rewrites within an iteration. (It effectively controls the maximum number of iterations of the inner loop.)

This flag is meant for debugging and is useful in cases where one or multiple faulty patterns can be applied indefinitely, resulting in an infinite loop.
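
How the two limits interact can be sketched with a toy driver. This is not the MLIR implementation: `ToyConfig`, `ToyResult`, `runToyDriver` and the `-1` value used for `kNoLimit` are assumptions made for this illustration, and a "pattern" is reduced to a callback that reports whether it changed anything.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Assumed sentinel meaning "no limit" in this sketch.
constexpr int64_t kNoLimit = -1;

// Hypothetical stand-in for the rewriter configuration.
struct ToyConfig {
  int64_t maxIterations = 10;        // bound on the outer loop
  int64_t maxNumRewrites = kNoLimit; // bound on the inner loop, per iteration
};

struct ToyResult {
  int64_t iterations = 0;
  int64_t rewrites = 0;
  bool converged = false;
};

// tryRewrite applies one pattern and returns true if it changed something.
ToyResult runToyDriver(const ToyConfig &config,
                       const std::function<bool()> &tryRewrite) {
  assert(tryRewrite && "a rewrite callback is required");
  ToyResult result;
  // Outer loop: bounded by maxIterations.
  while (config.maxIterations == kNoLimit ||
         result.iterations < config.maxIterations) {
    ++result.iterations;
    bool changed = false;
    int64_t rewritesThisIteration = 0;
    // Inner loop: bounded by maxNumRewrites within this iteration.
    while (config.maxNumRewrites == kNoLimit ||
           rewritesThisIteration < config.maxNumRewrites) {
      if (!tryRewrite())
        break;
      changed = true;
      ++rewritesThisIteration;
      ++result.rewrites;
    }
    if (!changed) {
      result.converged = true;
      break;
    }
  }
  return result;
}
```

With a callback that always returns true (a faulty pattern), both limits must be finite for the driver to terminate. With `maxNumRewrites` left at `kNoLimit`, the inner loop never exits, which is the situation the new option is meant to help debug.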

Differential Revision: https://reviews.llvm.org/D140525

19 months ago[VE] Convert test to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 11:46:41 +0000 (12:46 +0100)]
[VE] Convert test to opaque pointers (NFC)

There is a minor codegen regression here (an extra `and` instruction).
The reason is that CGP only eliminates fallthrough branches if it
has made some other kind of change, and with opaque pointers that
other change does not occur.

Ideally, we should probably always try to eliminate fallthroughs,
but this runs into the problem that performing a dummy fallthrough
is a common pattern in tests for forcing SDAG to select them
separately, so it's not quite that simple.

19 months ago[clang] Use a StringRef instead of a raw char pointer to store builtin and call infor...
serge-sans-paille [Mon, 12 Dec 2022 16:02:15 +0000 (17:02 +0100)]
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information

This avoids recomputing string length that is already known at compile
time.
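
The difference can be sketched outside of Clang with `std::string_view` (C++17) standing in for `llvm::StringRef`. The entry types and the builtin name below are hypothetical and do not reflect Clang's actual tables.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical table entries: one stores a raw pointer, one a view.
struct RawEntry { const char *name; };
struct ViewEntry { std::string_view name; };

constexpr RawEntry kRawEntry{"__builtin_memcpy"};
constexpr ViewEntry kViewEntry{"__builtin_memcpy"};

// The view carries its length, so it can be checked at compile time.
static_assert(kViewEntry.name.size() == 16, "length is known at compile time");

// With a raw pointer, every query rescans the string at run time.
std::size_t lengthFromPointer(const RawEntry &entry) {
  assert(entry.name != nullptr);
  return std::strlen(entry.name);
}

// With a view, the stored length is simply returned.
constexpr std::size_t lengthFromView(const ViewEntry &entry) {
  return entry.name.size();
}
```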

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa841c522d8d452f0685e503538415a53 with a
change to llvm/utils/TableGen/OptParserEmitter.cpp to cope with GCC bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

Differential Revision: https://reviews.llvm.org/D139881

19 months ago[LV] Assert that the executed plan contains selected VF & UF (NFC).
Florian Hahn [Fri, 23 Dec 2022 11:44:42 +0000 (11:44 +0000)]
[LV] Assert that the executed plan contains selected VF & UF (NFC).

Add assertion to ensure the executed plan is valid for the selected VF
and UF.

19 months ago[clang] Fix a clang crash on invalid code in C++20 mode.
Haojian Wu [Thu, 22 Dec 2022 22:23:54 +0000 (23:23 +0100)]
[clang] Fix a clang crash on invalid code in C++20 mode.

This crash is a combination of recovery-expr + new SemaInit.cpp code
introduced by https://reviews.llvm.org/D129531.

Differential Revision: https://reviews.llvm.org/D140587

19 months agoRemove empty header file.
Dani Ferreira Franco Moura [Fri, 23 Dec 2022 10:04:56 +0000 (10:04 +0000)]
Remove empty header file.

Reviewed By: gribozavr2, merrymeerkat

Differential Revision: https://reviews.llvm.org/D140483

19 months ago[VE] Name instructions in test (NFC)
Nikita Popov [Fri, 23 Dec 2022 10:41:57 +0000 (11:41 +0100)]
[VE] Name instructions in test (NFC)

19 months agoRevert "[clang] Use a StringRef instead of a raw char pointer to store builtin and...
serge-sans-paille [Fri, 23 Dec 2022 10:36:56 +0000 (11:36 +0100)]
Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"

Failing builds: https://lab.llvm.org/buildbot#builders/9/builds/19030
This is GCC specific and has been reported upstream: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

This reverts commit 719d98dfa841c522d8d452f0685e503538415a53.

19 months ago[clang] Use a StringRef instead of a raw char pointer to store builtin and call infor...
serge-sans-paille [Mon, 12 Dec 2022 16:02:15 +0000 (17:02 +0100)]
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information

This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D139881

19 months ago[examples] Direct HowToUseJIT readers to HowToUseLLJIT instead.
Lang Hames [Fri, 23 Dec 2022 09:08:50 +0000 (01:08 -0800)]
[examples] Direct HowToUseJIT readers to HowToUseLLJIT instead.

HowToUseJIT describes the older APIs. We want to discourage their use in new
projects.

19 months ago[Docs] Clarify typed pointers support timeline
Nikita Popov [Fri, 23 Dec 2022 09:07:59 +0000 (10:07 +0100)]
[Docs] Clarify typed pointers support timeline

As there have been a couple of questions about this recently, this
gives a hard timeline on typed pointers support.

Given that we are about a month away from LLVM 16 branching, I think
we should retain best-effort typed pointer support in LLVM 16 even
if we get all tests migrated before that point.

Conversely, regardless of what the actual test migration state will
be at that point, I believe we should un-support typed pointers as
a matter of policy immediately after branching. Once release/16.x
has been branched, typed pointers on main will no longer be
supported (and can be actively broken). We only need to keep
not-yet-migrated tests working, if there are any left at that point.

Differential Revision: https://reviews.llvm.org/D140487

19 months ago[LoopDeletion] Convert tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 09:06:50 +0000 (10:06 +0100)]
[LoopDeletion] Convert tests to opaque pointers (NFC)

19 months ago[VectorCombine] Convert tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 09:03:38 +0000 (10:03 +0100)]
[VectorCombine] Convert tests to opaque pointers (NFC)

19 months ago[SLP] Convert some tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 09:01:56 +0000 (10:01 +0100)]
[SLP] Convert some tests to opaque pointers (NFC)

19 months ago[GVN] Convert some tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 08:58:33 +0000 (09:58 +0100)]
[GVN] Convert some tests to opaque pointers (NFC)

19 months ago[BDCE] Convert tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 08:57:22 +0000 (09:57 +0100)]
[BDCE] Convert tests to opaque pointers (NFC)

19 months ago[Attributor] Convert some tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 08:55:29 +0000 (09:55 +0100)]
[Attributor] Convert some tests to opaque pointers (NFC)

These were converted without adjustments.

19 months ago[clang-tidy][NFC] Remove custom isInAnonymousNamespace matchers
Carlos Galvez [Fri, 23 Dec 2022 08:47:28 +0000 (08:47 +0000)]
[clang-tidy][NFC] Remove custom isInAnonymousNamespace matchers

Since now the same matcher exists in ASTMatchers.

19 months ago[ArgPromotion] Convert tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 08:48:36 +0000 (09:48 +0100)]
[ArgPromotion] Convert tests to opaque pointers (NFC)

update_test_checks was rerun for some of those, because we use
a different GEP representation with opaque pointers.

19 months ago[AggressiveInstCombine] Convert tests to opaque pointers (NFC)
Nikita Popov [Fri, 23 Dec 2022 08:47:48 +0000 (09:47 +0100)]
[AggressiveInstCombine] Convert tests to opaque pointers (NFC)

19 months ago[MLIR][Arith] Remove unused assertions
liqinweng [Fri, 23 Dec 2022 08:01:29 +0000 (16:01 +0800)]
[MLIR][Arith] Remove unused assertions

We shouldn't be checking things that are guaranteed by the op's verifier.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D140610

19 months ago[Support] Use inplace APInt operators in DivisionByConstantInfo. NFC
Craig Topper [Fri, 23 Dec 2022 06:48:12 +0000 (22:48 -0800)]
[Support] Use inplace APInt operators in DivisionByConstantInfo. NFC

Reduces the number of temporary APInts that get created and
copy/moved from.

19 months ago[ASTMatchers] Add isInAnonymousNamespace narrowing matcher
Carlos Galvez [Mon, 19 Dec 2022 18:34:35 +0000 (18:34 +0000)]
[ASTMatchers] Add isInAnonymousNamespace narrowing matcher

Used in a couple clang-tidy checks so it could be extracted
out as its own matcher.

Differential Revision: https://reviews.llvm.org/D140328

19 months ago[X86] Add reduce_*_ep[i|u]8/16 series intrinsics.
Freddy Ye [Fri, 23 Dec 2022 06:53:33 +0000 (14:53 +0800)]
[X86] Add reduce_*_ep[i|u]8/16 series intrinsics.

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D140531

19 months ago[libc][obvious] Remove a spurious statement leftover from a previous change.
Siva Chandra Reddy [Fri, 23 Dec 2022 06:49:45 +0000 (06:49 +0000)]
[libc][obvious] Remove a spurious statement leftover from a previous change.

19 months ago[OpenMP] [OMPD] Enable OMPD Tests
Vignesh Balasubramanian [Wed, 14 Dec 2022 04:36:30 +0000 (10:06 +0530)]
[OpenMP] [OMPD] Enable OMPD Tests

They were disabled due to different failures on different LLVM bots.

Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D138411

19 months ago[Support] Move some APInt declarations in DivisionByConstantInfo to their first assig...
Craig Topper [Fri, 23 Dec 2022 05:50:01 +0000 (21:50 -0800)]
[Support] Move some APInt declarations in DivisionByConstantInfo to their first assignment.

This uses copy initialization instead of default constructing the
APInts and assigning over them.

19 months ago[libc][NFC] Use operator new and operator delete in POSIX file actions API.
Siva Chandra Reddy [Fri, 23 Dec 2022 01:42:20 +0000 (01:42 +0000)]
[libc][NFC] Use operator new and operator delete in POSIX file actions API.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D140597

19 months ago[MLIR][Arith] Canonicalize xor with ext
liqinweng [Fri, 23 Dec 2022 04:40:30 +0000 (12:40 +0800)]
[MLIR][Arith] Canonicalize xor with ext

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D139307

19 months ago[InstCombine] complete (X << Z) / (Y << Z) --> X / Y
Chenbing Zheng [Fri, 23 Dec 2022 03:51:20 +0000 (11:51 +0800)]
[InstCombine] complete (X << Z) / (Y << Z) --> X / Y

Add one more situation for this fold.
For unsigned div, 'nsw' on both shifts + 'nuw' on the dividend.

Alive2: https://alive2.llvm.org/ce/z/sELF76
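
As a rough cross-check of the condition stated above, the fold can be brute-forced on 8-bit values. The sketch below is an illustration only: the helper names are hypothetical, the flag checks are a plain C++ model of `nuw`/`nsw` as read from this message, and the Alive2 link remains the authoritative proof.

```cpp
#include <cassert>
#include <cstdint>

// shl nuw on i8: no set bits are shifted out of the 8-bit value.
bool shlHasNUW(uint8_t x, unsigned z) {
  assert(z < 8 && "shift amount must be smaller than the bit width");
  return ((static_cast<unsigned>(x) << z) >> 8) == 0;
}

// shl nsw on i8: the exact signed product still fits in [-128, 127].
bool shlHasNSW(uint8_t x, unsigned z) {
  assert(z < 8 && "shift amount must be smaller than the bit width");
  int exact = static_cast<int8_t>(x) * (1 << z);
  return exact >= -128 && exact <= 127;
}

// Checks (X << Z) u/ (Y << Z) == X u/ Y for every i8 input that has
// nsw on both shifts and nuw on the dividend's shift.
bool foldHoldsForAllUInt8() {
  for (unsigned z = 0; z < 8; ++z)
    for (unsigned xi = 0; xi < 256; ++xi)
      for (unsigned yi = 1; yi < 256; ++yi) { // skip division by zero
        uint8_t x = static_cast<uint8_t>(xi);
        uint8_t y = static_cast<uint8_t>(yi);
        if (!shlHasNSW(x, z) || !shlHasNSW(y, z) || !shlHasNUW(x, z))
          continue;
        uint8_t num = static_cast<uint8_t>(x << z);
        uint8_t den = static_cast<uint8_t>(y << z);
        if (den == 0)
          continue; // the original division would already be undefined
        if (num / den != x / y)
          return false;
      }
  return true;
}
```

Without the flags the fold is not valid in general: for X = 129, Y = 1, Z = 1 the shifted dividend wraps to 2, so the shifted division yields 1 while X / Y is 129.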

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D139997

19 months ago[BOLT][TEST] Limit iterations in X86/exceptions-pic.test
Amir Ayupov [Fri, 23 Dec 2022 03:43:38 +0000 (19:43 -0800)]
[BOLT][TEST] Limit iterations in X86/exceptions-pic.test

The test has 3 invocations with 1M iterations each, which adds delay to fast
check-bolt testing. Reduce the number to 1K.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D139651

19 months agoAdd Soft/Hard RSS Limits to Scudo Standalone
Bastian Kersting [Thu, 22 Dec 2022 19:38:01 +0000 (11:38 -0800)]
Add Soft/Hard RSS Limits to Scudo Standalone

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126752

19 months ago[scudo] Fix return type of GetRSS()
Vitaly Buka [Fri, 23 Dec 2022 03:40:49 +0000 (19:40 -0800)]
[scudo] Fix return type of GetRSS()

19 months ago[AVR] Select 16-bit LDS/STS for load/store on AVRTiny.
Ben Shi [Fri, 23 Dec 2022 02:20:45 +0000 (10:20 +0800)]
[AVR] Select 16-bit LDS/STS for load/store on AVRTiny.

The 32-bit LDS/STS are not available on AVRTiny, so we have
to use their compact 16-bit form for memory access.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D139687

19 months ago[AVR] Support 16-bit LDS/STS on AVRTiny.
Ben Shi [Fri, 23 Dec 2022 02:01:46 +0000 (10:01 +0800)]
[AVR] Support 16-bit LDS/STS on AVRTiny.

LDS/STS are 32-bit instructions on AVR, which can access up to
64KB of data space, whereas on AVRTiny they are 16-bit instructions,
which can only access 128B of data space.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D139621

19 months ago[libc++] Granularize <type_traits> includes in <compare>
Nikolas Klauser [Tue, 20 Dec 2022 23:07:17 +0000 (00:07 +0100)]
[libc++] Granularize <type_traits> includes in <compare>

Reviewed By: Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D140480

19 months ago[AVR][MC] Fix illegal operand forms.
Ben Shi [Wed, 21 Dec 2022 12:17:54 +0000 (20:17 +0800)]
[AVR][MC] Fix illegal operand forms.

These operands are illegal and rejected by avr-gcc.
    subi r24, -lo8(symbol+offset)
    sbci r25, -hi8(symbol+offset)

And their correct form should be
    subi r24, lo8(-(symbol+offset))
    sbci r25, hi8(-(symbol+offset))
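
The arithmetic of the correct form can be sketched with a simplified model of `subi`/`sbci` that tracks only the borrow between the two bytes. The function names are hypothetical, and this is an illustration of the 16-bit arithmetic, not of the assembler's operand parsing.

```cpp
#include <cstdint>

// Low and high byte of a 16-bit value, as lo8()/hi8() select them.
uint8_t lo8(uint16_t v) { return static_cast<uint8_t>(v & 0xFF); }
uint8_t hi8(uint16_t v) { return static_cast<uint8_t>(v >> 8); }

// Simplified model of `subi lo, kLo` followed by `sbci hi, kHi` on a
// register pair (hi:lo). Only the borrow between the bytes is modelled.
uint16_t subiSbci(uint16_t pair, uint8_t kLo, uint8_t kHi) {
  int lo = (pair & 0xFF) - kLo;
  int borrow = lo < 0 ? 1 : 0;
  int hi = (pair >> 8) - kHi - borrow;
  return static_cast<uint16_t>(((hi & 0xFF) << 8) | (lo & 0xFF));
}

// Subtracting the two bytes of -(value) adds value to the pair (mod 2^16).
uint16_t addViaNegatedImmediate(uint16_t pair, uint16_t value) {
  uint16_t negated = static_cast<uint16_t>(0u - value);
  return subiSbci(pair, lo8(negated), hi8(negated));
}
```

In this model, `addViaNegatedImmediate(0x12FF, 0x0001)` yields `0x1300`: the negation is applied to the whole 16-bit value before it is split into bytes.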

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D140473

19 months ago[mlir][sparse] add missing dependent dialect.
Peiming Liu [Fri, 23 Dec 2022 01:37:01 +0000 (01:37 +0000)]
[mlir][sparse] add missing dependent dialect.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140595

19 months ago[AVR] Fix a bug in AsmPrinter when printing memory operands.
Ben Shi [Tue, 20 Dec 2022 10:12:29 +0000 (18:12 +0800)]
[AVR] Fix a bug in AsmPrinter when printing memory operands.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D140383

19 months ago[NFC][Codegen][X86] Tests w/ final optimized IR of SROA-with-variably-indexed-loads...
Roman Lebedev [Fri, 23 Dec 2022 01:38:07 +0000 (04:38 +0300)]
[NFC][Codegen][X86] Tests w/ final optimized IR of SROA-with-variably-indexed-loads (D140493)

32-byte ones are for consistency only, we really only care about
up to 16-byte on 64-bit and maybe up to 8-byte on 32-bit.

In the 16-byte ones, we still have some redundant vec<->scalar traffic.

https://reviews.llvm.org/D140493

19 months ago[ORC][ORC-RT] Add SimplePackedSerialization support for optionals.
Lang Hames [Fri, 23 Dec 2022 01:22:58 +0000 (17:22 -0800)]
[ORC][ORC-RT] Add SimplePackedSerialization support for optionals.

This allows optionals to be serialized and deserialized, and used as arguments
and return values in SPS wrapper functions.

Serialization of optional values is indicated by use of the SPSOptional tag.
SPSOptionals are serialized as a bool (false for no value, true for
value) plus the serialization of the contained value, if any. Serialization
to/from std::optional is included in this commit.
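
The "bool plus contained value" layout can be sketched for a single type. This is not the ORC code: the function names are hypothetical, and the byte layout chosen here (one byte for the bool, then a little-endian 32-bit value) is an assumption of the sketch.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: writes a presence byte, then the value if present.
void serializeOptionalU32(const std::optional<uint32_t> &value,
                          std::vector<uint8_t> &out) {
  out.push_back(static_cast<uint8_t>(value.has_value() ? 1 : 0));
  if (value) {
    // Assumed little-endian encoding of the contained value.
    for (int shift = 0; shift < 32; shift += 8)
      out.push_back(static_cast<uint8_t>(*value >> shift));
  }
}

// Hypothetical helper: returns false if the buffer is malformed.
bool deserializeOptionalU32(const std::vector<uint8_t> &in,
                            std::optional<uint32_t> &value) {
  if (in.empty())
    return false;
  if (in[0] == 0) {
    value = std::nullopt;
    return in.size() == 1;
  }
  if (in.size() != 5)
    return false;
  uint32_t decoded = 0;
  for (int i = 0; i < 4; ++i)
    decoded |= static_cast<uint32_t>(in[1 + i]) << (8 * i);
  value = decoded;
  return true;
}
```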

This commit includes updates to SimplePackedSerialization in both ORC and the
ORC runtime.


19 months ago[mlir][sparse] use sparse_tensor::StorageSpecifier to store dim/memSizes
Peiming Liu [Thu, 15 Dec 2022 18:45:07 +0000 (18:45 +0000)]
[mlir][sparse] use sparse_tensor::StorageSpecifier to store dim/memSizes

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140130

19 months ago[clang-format] Set requires expression params as not an expression
Emilia Dreamer [Fri, 23 Dec 2022 00:22:10 +0000 (02:22 +0200)]
[clang-format] Set requires expression params as not an expression

Previously, the parens of a requires expression's "parameters" were not
explicitly set, meaning they ended up as whatever the outer scope was.
This is a problem in some cases though, since the process of determining
star/amp checks if the token is inside of an expression context.

This patch makes sure the context between those parens is always
set to not be an expression.

Fixes https://github.com/llvm/llvm-project/issues/59600

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D140330

19 months ago[clang-format][docs] Fix invalid CSS syntax in versionbadge
Emilia Dreamer [Fri, 23 Dec 2022 00:15:23 +0000 (02:15 +0200)]
[clang-format][docs] Fix invalid CSS syntax in versionbadge

CSS uses colons, not the equals sign. The final semicolon is optional,
but preferred to be included. Really, the font property doesn't really
need to be there, but I suppose it was put there for a reason.

It's surprising how lenient browsers are when parsing

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D138441

19 months ago[lld-macho] Use ld64's LC_LINKER_OPTIONS behavior by default
Keith Smiley [Thu, 22 Dec 2022 21:39:53 +0000 (13:39 -0800)]
[lld-macho] Use ld64's LC_LINKER_OPTIONS behavior by default

By default ld64 ignores invalid LC_LINKER_OPTIONS unless the link fails,
in which case it prints a warning. Originally lld chose to be strict
about these, but this revealed that many of them exist in open
source projects today, since developers would never have noticed
this issue before. In order to make adoption of lld easier, this mirrors ld64's
behavior, while also adding a `--strict-auto-link-options` flag if
projects want to audit their libraries for these invalid options.

More discussion on https://reviews.llvm.org/D140225
Fixes https://github.com/llvm/llvm-project/issues/59627

Differential Revision: https://reviews.llvm.org/D140491

19 months ago[lld-macho] Flip string deduplication default
Keith Smiley [Wed, 21 Dec 2022 23:48:28 +0000 (15:48 -0800)]
[lld-macho] Flip string deduplication default

Previously by default, when not using `--icf=`, lld would not
deduplicate string literals. This reveals reliance on undefined behavior
where string literal addresses are compared instead of using string
equality checks. While ideally you would be able to easily identify and
eliminate the reliance on this UB, this can be difficult, especially for
third party code, and increases the friction and risk of users migrating
to lld. This flips the default to deduplicate strings unless
`--no-deduplicate-strings` is passed, matching ld64's behavior.
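
The kind of code affected can be sketched as follows. The function names are hypothetical. The first comparison only succeeds for equal strings when both pointers happen to refer to the same storage, which is what deduplication provides, while the second does not depend on the linker.

```cpp
#include <cassert>
#include <cstring>

// Fragile: true only if both pointers refer to the same storage.
bool sameByAddress(const char *a, const char *b) { return a == b; }

// Robust: compares the characters, independent of where they are stored.
bool sameByContent(const char *a, const char *b) {
  assert(a != nullptr && b != nullptr);
  return std::strcmp(a, b) == 0;
}
```

Two distinct arrays holding the same text, for example `const char x[] = "key";` and `const char y[] = "key";`, compare equal by content but not by address.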

Differential Revision: https://reviews.llvm.org/D140517

19 months ago[NFC][SROA] Rewrite widen-load-of-small-alloca tests to just store result, not call...
Roman Lebedev [Thu, 22 Dec 2022 23:40:45 +0000 (02:40 +0300)]
[NFC][SROA] Rewrite widen-load-of-small-alloca tests to just store result, not call some function

19 months ago[DAGCombiner] `visitFREEZE()`: be less greedy with replacing other uses of undef
Roman Lebedev [Thu, 22 Dec 2022 23:11:26 +0000 (02:11 +0300)]
[DAGCombiner] `visitFREEZE()`: be less greedy with replacing other uses of undef

19 months ago[DAGCombiner] `visitFREEZE()`: allow multiple maybe-poison operands for `BUILD_VECTOR`
Roman Lebedev [Thu, 22 Dec 2022 22:10:41 +0000 (01:10 +0300)]
[DAGCombiner] `visitFREEZE()`: allow multiple maybe-poison operands for `BUILD_VECTOR`

19 months ago[DAGCombine] `BUILD_VECTOR` can not create undef or poison
Roman Lebedev [Thu, 22 Dec 2022 22:15:17 +0000 (01:15 +0300)]
[DAGCombine] `BUILD_VECTOR` can not create undef or poison

19 months ago[NFC][Codegen] Tests for `freeze` of `BUILD_VECTOR`
Roman Lebedev [Thu, 22 Dec 2022 21:26:47 +0000 (00:26 +0300)]
[NFC][Codegen] Tests for `freeze` of `BUILD_VECTOR`

19 months ago[NFC][DAGCombiner] `visitFREEZE()`: use early return
Roman Lebedev [Thu, 22 Dec 2022 21:40:43 +0000 (00:40 +0300)]
[NFC][DAGCombiner] `visitFREEZE()`: use early return

19 months ago[bazel] fix bazel file.
Peiming Liu [Thu, 22 Dec 2022 23:03:23 +0000 (23:03 +0000)]
[bazel] fix bazel file.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140589

19 months ago[gn build] Port d64d3c5a8f81
LLVM GN Syncbot [Thu, 22 Dec 2022 22:53:42 +0000 (22:53 +0000)]
[gn build] Port d64d3c5a8f81

19 months ago[mlir][vector] Fix bug in extractOp folding
Thomas Raoux [Thu, 22 Dec 2022 22:50:07 +0000 (14:50 -0800)]
[mlir][vector] Fix bug in extractOp folding

We were missing a check for transpose when folding.
Also add a new file to test folding independently of
canonicalization as canonicalization was hiding the bug.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140533

19 months agostd::sort: add BlockQuickSort partitioning algorithm for arithmetic types
Nilay Vaish [Wed, 21 Dec 2022 23:49:32 +0000 (15:49 -0800)]
std::sort: add BlockQuickSort partitioning algorithm for arithmetic types

This diff modifies std::sort in two ways:

* for arithmetic types we update the core partitioning algorithm to use
BlockQuickSort for partitioning. The partition function was carefully
written to let the compiler generate SIMD instructions without actually
writing SIMD intrinsics in the loop. We see up to 50% better performance
for sorting arithmetic types. The use of the BlockQuickSort partitioning
has been limited to arithmetic types since the algorithm works well when
branch instructions can be avoided during partitioning. This is usually not
true for types other than the arithmetic ones.

* for other types (tuples, strings) updates have been made to improve
performance by about 10%.  Performance numbers comparing std::sort (old)
and Bitset sort (new) on libcxx benchmark.

name                                                             old cpu/op  new cpu/op  delta
BM_Sort_uint32_Random_1                                          3.72ns ± 5%  3.78ns ±16%      ~     (p=0.819 n=36+34)
BM_Sort_uint32_Random_4                                          5.42ns ± 5%  5.29ns ± 7%    -2.42%  (p=0.000 n=35+31)
BM_Sort_uint32_Random_16                                         10.5ns ± 3%  11.9ns ±15%   +13.08%  (p=0.000 n=36+40)
BM_Sort_uint32_Random_64                                         18.6ns ± 7%  18.5ns ±15%    -0.95%  (p=0.002 n=33+40)
BM_Sort_uint32_Random_256                                        26.2ns ± 4%  21.3ns ± 8%   -18.89%  (p=0.000 n=37+34)
BM_Sort_uint32_Random_1024                                       33.4ns ± 5%  23.3ns ± 4%   -30.37%  (p=0.000 n=39+35)
BM_Sort_uint32_Random_16384                                      47.7ns ± 5%  26.7ns ± 5%   -44.06%  (p=0.000 n=39+35)
BM_Sort_uint32_Random_262144                                     62.6ns ± 3%  30.1ns ± 6%   -51.81%  (p=0.000 n=37+36)
BM_Sort_uint32_Ascending_1                                       3.71ns ± 3%  4.28ns ± 3%   +15.53%  (p=0.000 n=37+35)
BM_Sort_uint32_Ascending_4                                       1.47ns ± 3%  1.46ns ± 3%      ~     (p=0.083 n=36+37)
BM_Sort_uint32_Ascending_16                                      0.93ns ± 4%  1.02ns ± 3%    +9.32%  (p=0.000 n=36+36)
BM_Sort_uint32_Ascending_64                                      1.23ns ± 5%  1.51ns ± 3%   +22.56%  (p=0.000 n=34+36)
BM_Sort_uint32_Ascending_256                                     1.21ns ± 3%  1.57ns ± 4%   +29.77%  (p=0.000 n=33+35)
BM_Sort_uint32_Ascending_1024                                    1.03ns ± 4%  1.43ns ± 3%   +38.44%  (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_16384                                   0.94ns ± 8%  1.36ns ± 5%   +44.09%  (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_262144                                  0.93ns ± 3%  1.35ns ± 7%   +45.06%  (p=0.000 n=32+36)
BM_Sort_uint32_Descending_1                                      3.69ns ± 2%  4.27ns ± 3%   +15.73%  (p=0.000 n=31+36)
BM_Sort_uint32_Descending_4                                      1.74ns ± 2%  1.78ns ± 3%    +2.29%  (p=0.000 n=31+38)
BM_Sort_uint32_Descending_16                                     3.92ns ± 4%  4.20ns ± 4%    +7.13%  (p=0.000 n=32+38)
BM_Sort_uint32_Descending_64                                     2.09ns ± 4%  3.25ns ± 4%   +55.10%  (p=0.000 n=33+37)
BM_Sort_uint32_Descending_256                                    1.98ns ± 7%  2.93ns ± 4%   +47.95%  (p=0.000 n=34+36)
BM_Sort_uint32_Descending_1024                                   2.23ns ± 6%  2.64ns ± 3%   +18.22%  (p=0.000 n=34+38)
BM_Sort_uint32_Descending_16384                                  1.93ns ± 6%  2.43ns ± 4%   +25.99%  (p=0.000 n=34+35)
BM_Sort_uint32_Descending_262144                                 1.89ns ± 3%  2.38ns ± 4%   +25.41%  (p=0.000 n=33+35)
BM_Sort_uint32_SingleElement_1                                   3.67ns ± 2%  4.28ns ± 4%   +16.60%  (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_4                                   1.48ns ± 4%  1.48ns ± 5%      ~     (p=0.951 n=35+33)
BM_Sort_uint32_SingleElement_16                                  0.93ns ± 3%  1.02ns ± 4%    +9.51%  (p=0.000 n=36+33)
BM_Sort_uint32_SingleElement_64                                  0.76ns ± 3%  1.59ns ± 8%  +109.78%  (p=0.000 n=36+32)
BM_Sort_uint32_SingleElement_256                                 0.82ns ± 4%  1.45ns ± 5%   +76.62%  (p=0.000 n=37+34)
BM_Sort_uint32_SingleElement_1024                                0.77ns ± 4%  1.31ns ± 4%   +71.40%  (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_16384                               0.64ns ± 4%  1.24ns ± 6%   +93.29%  (p=0.000 n=35+36)
BM_Sort_uint32_SingleElement_262144                              0.63ns ± 3%  1.23ns ± 4%   +95.17%  (p=0.000 n=35+35)
BM_Sort_uint32_PipeOrgan_1                                       3.68ns ± 2%  4.42ns ± 3%   +20.31%  (p=0.000 n=34+36)
BM_Sort_uint32_PipeOrgan_4                                       1.54ns ± 3%  1.53ns ± 3%      ~     (p=0.128 n=34+36)
BM_Sort_uint32_PipeOrgan_16                                      2.22ns ± 3%  1.99ns ± 3%   -10.28%  (p=0.000 n=33+36)
BM_Sort_uint32_PipeOrgan_64                                      4.41ns ± 3%  3.39ns ± 4%   -23.17%  (p=0.000 n=35+37)
BM_Sort_uint32_PipeOrgan_256                                     2.75ns ± 5%  3.07ns ± 3%   +11.74%  (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_1024                                    3.58ns ± 2%  5.48ns ± 3%   +52.97%  (p=0.000 n=37+36)
BM_Sort_uint32_PipeOrgan_16384                                   4.10ns ± 3%  6.53ns ± 3%   +59.27%  (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_262144                                  4.90ns ± 3%  7.39ns ± 3%   +50.71%  (p=0.000 n=34+37)
BM_Sort_uint32_QuickSortAdversary_1                              3.68ns ± 2%  4.28ns ± 3%   +16.19%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_4                              1.46ns ± 4%  1.46ns ± 3%      ~     (p=0.736 n=35+38)
BM_Sort_uint32_QuickSortAdversary_16                             0.93ns ± 3%  1.02ns ± 4%    +9.69%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_64                             13.6ns ± 4%  17.9ns ± 8%   +31.37%  (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_256                            20.0ns ± 4%  25.7ns ± 4%   +28.69%  (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_1024                           28.3ns ± 6%  31.7ns ± 3%   +12.12%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_16384                          45.8ns ± 3%  50.6ns ± 4%   +10.32%  (p=0.000 n=38+36)
BM_Sort_uint32_QuickSortAdversary_262144                         61.6ns ± 4%  68.2ns ± 4%   +10.68%  (p=0.000 n=37+37)
BM_Sort_uint64_Random_1                                          3.71ns ± 4%  4.00ns ± 4%    +7.93%  (p=0.000 n=34+35)
BM_Sort_uint64_Random_4                                          5.52ns ± 8%  5.22ns ± 6%    -5.41%  (p=0.000 n=32+32)
BM_Sort_uint64_Random_16                                         10.7ns ±15%  10.2ns ± 7%      ~     (p=0.077 n=40+31)
BM_Sort_uint64_Random_64                                         19.0ns ±14%  18.2ns ±14%    -4.31%  (p=0.001 n=40+40)
BM_Sort_uint64_Random_256                                        25.7ns ± 9%  22.1ns ±15%   -13.82%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_1024                                       32.4ns ± 6%  23.8ns ±16%   -26.64%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_16384                                      46.8ns ± 3%  27.1ns ±16%   -42.15%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_262144                                     61.3ns ± 4%  30.4ns ±16%   -50.34%  (p=0.000 n=34+40)
BM_Sort_uint64_Ascending_1                                       3.67ns ± 3%  3.87ns ±16%    +5.36%  (p=0.049 n=35+40)
BM_Sort_uint64_Ascending_4                                       1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.130 n=37+31)
BM_Sort_uint64_Ascending_16                                      1.09ns ± 3%  0.91ns ± 6%   -16.79%  (p=0.000 n=38+32)
BM_Sort_uint64_Ascending_64                                      1.25ns ± 3%  1.29ns ± 5%    +3.11%  (p=0.000 n=38+34)
BM_Sort_uint64_Ascending_256                                     1.37ns ± 3%  1.42ns ± 3%    +3.07%  (p=0.000 n=39+35)
BM_Sort_uint64_Ascending_1024                                    1.12ns ± 3%  1.17ns ± 3%    +5.28%  (p=0.000 n=37+36)
BM_Sort_uint64_Ascending_16384                                   0.98ns ± 3%  1.09ns ± 3%   +10.95%  (p=0.000 n=36+37)
BM_Sort_uint64_Ascending_262144                                  0.98ns ± 3%  1.08ns ± 3%   +10.97%  (p=0.000 n=36+37)
BM_Sort_uint64_Descending_1                                      3.68ns ± 3%  3.67ns ± 3%      ~     (p=0.652 n=36+36)
BM_Sort_uint64_Descending_4                                      1.71ns ± 3%  1.73ns ± 3%    +1.50%  (p=0.000 n=33+34)
BM_Sort_uint64_Descending_16                                     4.96ns ± 2%  5.49ns ± 3%   +10.73%  (p=0.000 n=31+36)
BM_Sort_uint64_Descending_64                                     2.14ns ± 6%  3.03ns ± 3%   +41.72%  (p=0.000 n=32+35)
BM_Sort_uint64_Descending_256                                    2.03ns ± 4%  2.86ns ± 4%   +40.55%  (p=0.000 n=32+34)
BM_Sort_uint64_Descending_1024                                   2.20ns ± 2%  2.29ns ± 3%    +4.20%  (p=0.000 n=31+36)
BM_Sort_uint64_Descending_16384                                  1.89ns ± 3%  2.08ns ± 3%   +10.00%  (p=0.000 n=31+37)
BM_Sort_uint64_Descending_262144                                 1.92ns ± 3%  2.07ns ± 4%    +7.95%  (p=0.000 n=31+36)
BM_Sort_uint64_SingleElement_1                                   3.68ns ± 5%  3.67ns ± 3%      ~     (p=0.716 n=31+37)
BM_Sort_uint64_SingleElement_4                                   1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.557 n=34+37)
BM_Sort_uint64_SingleElement_16                                  1.09ns ± 2%  0.91ns ± 3%   -16.93%  (p=0.000 n=33+36)
BM_Sort_uint64_SingleElement_64                                  0.83ns ± 4%  1.47ns ± 4%   +78.03%  (p=0.000 n=34+34)
BM_Sort_uint64_SingleElement_256                                 0.95ns ± 4%  1.28ns ± 4%   +35.17%  (p=0.000 n=35+35)
BM_Sort_uint64_SingleElement_1024                                0.76ns ± 3%  1.05ns ± 3%   +37.78%  (p=0.000 n=35+33)
BM_Sort_uint64_SingleElement_16384                               0.71ns ± 2%  0.98ns ± 5%   +38.43%  (p=0.000 n=34+33)
BM_Sort_uint64_SingleElement_262144                              0.72ns ± 3%  0.98ns ± 4%   +35.93%  (p=0.000 n=35+33)
BM_Sort_uint64_PipeOrgan_1                                       3.68ns ± 3%  3.68ns ± 3%      ~     (p=0.650 n=35+33)
BM_Sort_uint64_PipeOrgan_4                                       1.53ns ± 2%  1.54ns ± 4%      ~     (p=0.424 n=33+36)
BM_Sort_uint64_PipeOrgan_16                                      2.23ns ± 3%  2.06ns ± 4%    -7.68%  (p=0.000 n=34+35)
BM_Sort_uint64_PipeOrgan_64                                      5.46ns ± 2%  3.41ns ± 4%   -37.67%  (p=0.000 n=33+36)
BM_Sort_uint64_PipeOrgan_256                                     2.92ns ± 4%  2.91ns ± 3%      ~     (p=0.257 n=35+35)
BM_Sort_uint64_PipeOrgan_1024                                    3.72ns ± 3%  5.35ns ± 4%   +43.95%  (p=0.000 n=35+35)
BM_Sort_uint64_PipeOrgan_16384                                   4.12ns ± 3%  6.37ns ± 3%   +54.74%  (p=0.000 n=34+36)
BM_Sort_uint64_PipeOrgan_262144                                  4.99ns ± 3%  7.25ns ± 5%   +45.45%  (p=0.000 n=35+35)
BM_Sort_uint64_QuickSortAdversary_1                              3.67ns ± 2%  3.65ns ± 3%      ~     (p=0.071 n=35+37)
BM_Sort_uint64_QuickSortAdversary_4                              1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.214 n=36+37)
BM_Sort_uint64_QuickSortAdversary_16                             1.09ns ± 3%  0.91ns ± 3%   -16.73%  (p=0.000 n=36+38)
BM_Sort_uint64_QuickSortAdversary_64                             13.7ns ± 3%  17.8ns ± 5%   +29.86%  (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_256                            20.0ns ± 3%  25.9ns ± 3%   +29.25%  (p=0.000 n=35+38)
BM_Sort_uint64_QuickSortAdversary_1024                           28.1ns ± 3%  31.0ns ± 4%   +10.35%  (p=0.000 n=33+37)
BM_Sort_uint64_QuickSortAdversary_16384                          45.8ns ± 2%  50.5ns ± 4%   +10.29%  (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_262144                         64.9ns ± 3%  69.5ns ± 3%    +7.15%  (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_1                            4.03ns ± 5%  4.33ns ± 4%    +7.31%  (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_4                            6.78ns ± 5%  6.71ns ± 4%    -1.09%  (p=0.040 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_16                           25.2ns ± 6%  16.8ns ± 7%   -33.35%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_64                           35.6ns ± 7%  27.2ns ± 8%   -23.73%  (p=0.000 n=34+36)
BM_Sort_pair<uint32, uint32>_Random_256                          43.5ns ±13%  34.0ns ± 8%   -21.78%  (p=0.000 n=32+34)
BM_Sort_pair<uint32, uint32>_Random_1024                         50.6ns ± 8%  40.8ns ± 5%   -19.35%  (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_16384                        66.0ns ± 3%  55.9ns ± 6%   -15.24%  (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_262144                       82.4ns ± 4%  72.0ns ± 5%   -12.64%  (p=0.000 n=32+31)
BM_Sort_pair<uint32, uint32>_Ascending_1                         4.00ns ± 2%  4.50ns ±16%   +12.59%  (p=0.000 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_4                         2.22ns ± 3%  2.34ns ±16%    +5.46%  (p=0.041 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_16                        2.33ns ± 4%  1.30ns ±15%   -44.33%  (p=0.000 n=34+40)
BM_Sort_pair<uint32, uint32>_Ascending_64                        1.39ns ± 4%  1.50ns ± 8%    +8.48%  (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_Ascending_256                       1.47ns ± 4%  1.56ns ± 3%    +5.96%  (p=0.000 n=37+31)
BM_Sort_pair<uint32, uint32>_Ascending_1024                      1.34ns ± 3%  1.35ns ± 4%    +1.22%  (p=0.000 n=38+31)
BM_Sort_pair<uint32, uint32>_Ascending_16384                     1.18ns ± 2%  1.18ns ± 3%      ~     (p=0.687 n=37+32)
BM_Sort_pair<uint32, uint32>_Ascending_262144                    1.18ns ± 3%  1.17ns ± 2%      ~     (p=0.153 n=38+34)
BM_Sort_pair<uint32, uint32>_Descending_1                        4.00ns ± 2%  4.29ns ± 3%    +7.22%  (p=0.000 n=37+36)
BM_Sort_pair<uint32, uint32>_Descending_4                        2.91ns ± 3%  2.92ns ± 3%      ~     (p=0.065 n=37+35)
BM_Sort_pair<uint32, uint32>_Descending_16                       4.96ns ± 4%  6.51ns ± 2%   +31.36%  (p=0.000 n=37+30)
BM_Sort_pair<uint32, uint32>_Descending_64                       3.13ns ± 2%  2.92ns ± 3%    -6.71%  (p=0.000 n=36+37)
BM_Sort_pair<uint32, uint32>_Descending_256                      2.56ns ± 3%  2.73ns ± 5%    +6.55%  (p=0.000 n=35+37)
BM_Sort_pair<uint32, uint32>_Descending_1024                     3.11ns ± 3%  2.34ns ± 4%   -24.85%  (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_Descending_16384                    2.84ns ± 3%  2.14ns ± 5%   -24.48%  (p=0.000 n=37+37)
BM_Sort_pair<uint32, uint32>_Descending_262144                   2.86ns ± 3%  2.15ns ± 3%   -25.08%  (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_SingleElement_1                     3.99ns ± 3%  4.28ns ± 3%    +7.08%  (p=0.000 n=33+35)
BM_Sort_pair<uint32, uint32>_SingleElement_4                     2.32ns ± 6%  2.30ns ± 3%    -0.77%  (p=0.032 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_16                    1.67ns ± 4%  1.27ns ± 4%   -24.13%  (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_64                    1.64ns ± 7%  1.83ns ± 4%   +11.54%  (p=0.000 n=31+35)
BM_Sort_pair<uint32, uint32>_SingleElement_256                   1.57ns ± 3%  1.90ns ± 3%   +21.46%  (p=0.000 n=31+36)
BM_Sort_pair<uint32, uint32>_SingleElement_1024                  1.49ns ±15%  1.63ns ± 3%    +9.42%  (p=0.000 n=40+37)
BM_Sort_pair<uint32, uint32>_SingleElement_16384                 1.29ns ±17%  1.57ns ± 3%   +21.51%  (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_SingleElement_262144                1.26ns ± 4%  1.56ns ± 4%   +24.11%  (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1                         4.01ns ± 2%  4.28ns ± 3%    +6.68%  (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_4                         2.38ns ± 5%  2.42ns ± 4%    +1.61%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16                        4.83ns ± 2%  2.71ns ± 7%   -43.96%  (p=0.000 n=34+34)
BM_Sort_pair<uint32, uint32>_PipeOrgan_64                        4.53ns ± 3%  3.89ns ± 7%   -14.11%  (p=0.000 n=35+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_256                       5.53ns ± 4%  2.81ns ± 4%   -49.13%  (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1024                      6.49ns ± 4%  5.29ns ± 3%   -18.50%  (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16384                     7.21ns ± 4%  5.97ns ± 3%   -17.24%  (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_262144                    7.98ns ± 5%  6.59ns ± 3%   -17.46%  (p=0.000 n=33+33)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1                3.99ns ± 3%  4.27ns ± 3%    +6.95%  (p=0.000 n=36+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_4                2.40ns ± 3%  2.37ns ± 3%    -1.00%  (p=0.007 n=34+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16               4.96ns ± 5%  2.72ns ± 7%   -45.07%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_64               7.24ns ± 4%  7.51ns ± 4%    +3.63%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_256              9.85ns ± 5%  7.12ns ± 4%   -27.70%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1024             11.6ns ± 6%   8.8ns ± 5%   -23.86%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16384            32.7ns ± 3%  20.8ns ± 4%   -36.26%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_262144           36.4ns ± 3%  24.0ns ± 4%   -34.12%  (p=0.000 n=34+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1                   4.04ns ± 6%  4.34ns ± 4%    +7.55%  (p=0.000 n=37+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_4                   7.19ns ± 6%  7.26ns ± 5%    +0.99%  (p=0.042 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16                  30.4ns ± 6%  21.8ns ± 7%   -28.28%  (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_64                  42.8ns ±11%  33.5ns ± 9%   -21.70%  (p=0.000 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_256                 49.9ns ± 6%  40.3ns ± 9%   -19.20%  (p=0.000 n=35+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1024                56.3ns ± 3%  46.1ns ± 4%   -18.08%  (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16384               72.2ns ± 5%  62.1ns ± 3%   -14.05%  (p=0.000 n=37+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_262144              88.7ns ± 6%  79.0ns ± 6%   -10.93%  (p=0.000 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1                3.96ns ± 3%  4.36ns ± 3%    +9.96%  (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_4                2.39ns ± 2%  2.39ns ± 3%      ~     (p=0.604 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16               3.04ns ± 4%  1.48ns ± 3%   -51.20%  (p=0.000 n=34+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_64               2.44ns ± 3%  2.30ns ± 5%    -5.61%  (p=0.000 n=36+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_256              2.35ns ± 3%  2.39ns ± 5%    +1.78%  (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1024             2.12ns ± 5%  2.08ns ± 4%    -1.80%  (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16384            2.02ns ± 3%  2.00ns ± 5%    -1.25%  (p=0.000 n=32+32)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_262144           2.06ns ± 5%  2.11ns ± 9%      ~     (p=0.618 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1               3.97ns ± 2%  4.57ns ±16%   +15.19%  (p=0.000 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_4               3.64ns ± 3%  4.05ns ±15%   +11.05%  (p=0.000 n=33+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16              5.68ns ± 5%  9.36ns ±16%   +64.92%  (p=0.000 n=35+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_64              4.27ns ± 4%  3.88ns ± 8%    -9.13%  (p=0.000 n=35+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_256             3.58ns ± 3%  3.76ns ±14%    +5.12%  (p=0.002 n=38+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1024            4.16ns ± 3%  3.21ns ± 5%   -22.77%  (p=0.000 n=38+31)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16384           3.90ns ± 4%  3.00ns ± 3%   -23.12%  (p=0.000 n=38+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_262144          4.52ns ± 3%  3.42ns ± 3%   -24.29%  (p=0.000 n=38+33)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1            3.97ns ± 3%  4.31ns ± 3%    +8.78%  (p=0.000 n=39+34)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_4            2.54ns ± 2%  2.54ns ± 4%      ~     (p=0.341 n=38+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16           2.39ns ± 3%  1.70ns ± 6%   -28.90%  (p=0.000 n=38+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_64           2.61ns ± 2%  3.23ns ± 3%   +24.07%  (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_256          2.83ns ± 2%  2.97ns ± 4%    +4.83%  (p=0.000 n=35+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1024         2.44ns ± 4%  2.44ns ± 3%      ~     (p=0.481 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16384        2.19ns ± 3%  2.37ns ± 6%    +8.01%  (p=0.000 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_262144       2.34ns ± 2%  2.36ns ± 5%    +1.11%  (p=0.001 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1                3.96ns ± 2%  4.31ns ± 3%    +8.76%  (p=0.000 n=33+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_4                2.65ns ± 6%  2.67ns ± 4%      ~     (p=0.139 n=32+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16               5.64ns ± 3%  3.56ns ± 3%   -36.80%  (p=0.000 n=31+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_64               6.12ns ±16%  5.04ns ± 4%   -17.64%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_256              6.78ns ± 6%  3.73ns ± 3%   -44.94%  (p=0.000 n=31+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1024             8.36ns ±15%  6.51ns ± 4%   -22.13%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16384            9.24ns ±15%  7.91ns ± 3%   -14.34%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_262144           10.7ns ± 3%   9.3ns ± 6%   -12.36%  (p=0.000 n=32+36)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1       3.97ns ± 3%  4.31ns ± 3%    +8.63%  (p=0.000 n=32+35)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_4       2.79ns ± 3%  2.76ns ± 4%    -0.95%  (p=0.002 n=33+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16      5.07ns ± 3%  3.69ns ± 4%   -27.35%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_64      9.26ns ± 3%  8.34ns ± 7%    -9.88%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_256     11.8ns ± 5%   9.7ns ± 3%   -17.83%  (p=0.000 n=37+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1024    19.2ns ± 4%  14.5ns ±10%   -24.59%  (p=0.000 n=36+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16384   45.5ns ± 4%  37.4ns ± 9%   -17.71%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_262144  50.0ns ± 4%  43.2ns ± 3%   -13.69%  (p=0.000 n=35+34)
BM_Sort_string_Random_1                                          4.66ns ± 6%  4.40ns ± 4%    -5.55%  (p=0.000 n=35+37)
BM_Sort_string_Random_4                                          14.9ns ± 3%  15.0ns ± 6%      ~     (p=0.863 n=36+38)
BM_Sort_string_Random_16                                         45.5ns ± 6%  35.8ns ± 8%   -21.37%  (p=0.000 n=36+36)
BM_Sort_string_Random_64                                         66.6ns ± 4%  58.2ns ± 3%   -12.69%  (p=0.000 n=36+37)
BM_Sort_string_Random_256                                        86.0ns ± 5%  77.4ns ± 3%   -10.01%  (p=0.000 n=37+37)
BM_Sort_string_Random_1024                                        106ns ± 3%    96ns ± 6%    -9.39%  (p=0.000 n=37+37)
BM_Sort_string_Random_16384                                       154ns ± 3%   141ns ± 5%    -8.03%  (p=0.000 n=35+36)
BM_Sort_string_Random_262144                                      213ns ± 4%   197ns ± 4%    -7.59%  (p=0.000 n=34+34)
BM_Sort_string_Ascending_1                                       4.59ns ± 2%  4.56ns ±17%    -0.60%  (p=0.002 n=32+40)
BM_Sort_string_Ascending_4                                       7.52ns ± 9%  7.54ns ±12%      ~     (p=0.554 n=37+40)
BM_Sort_string_Ascending_16                                      13.1ns ± 6%   8.8ns ±12%   -33.26%  (p=0.000 n=39+38)
BM_Sort_string_Ascending_64                                      14.8ns ±10%  14.5ns ±11%    -2.15%  (p=0.013 n=40+37)
BM_Sort_string_Ascending_256                                     14.0ns ± 6%  14.1ns ±10%      ~     (p=0.760 n=37+40)
BM_Sort_string_Ascending_1024                                    12.9ns ±10%  12.8ns ±20%      ~     (p=0.055 n=35+40)
BM_Sort_string_Ascending_16384                                   17.2ns ±13%  17.4ns ±21%      ~     (p=1.000 n=37+40)
BM_Sort_string_Ascending_262144                                  17.5ns ±12%  17.5ns ±25%      ~     (p=0.392 n=35+39)
BM_Sort_string_Descending_1                                      4.59ns ± 3%  4.34ns ± 3%    -5.51%  (p=0.000 n=32+33)
BM_Sort_string_Descending_4                                      10.1ns ± 5%   9.8ns ± 4%    -2.84%  (p=0.000 n=36+34)
BM_Sort_string_Descending_16                                     22.0ns ± 4%  39.6ns ± 4%   +79.84%  (p=0.000 n=36+33)
BM_Sort_string_Descending_64                                     21.4ns ±12%  21.3ns ±14%      ~     (p=0.542 n=37+39)
BM_Sort_string_Descending_256                                    19.4ns ±13%  18.9ns ±13%    -2.74%  (p=0.039 n=37+39)
BM_Sort_string_Descending_1024                                   22.7ns ± 5%  17.6ns ±15%   -22.52%  (p=0.000 n=35+40)
BM_Sort_string_Descending_16384                                  27.9ns ±14%  22.6ns ±10%   -19.11%  (p=0.000 n=40+37)
BM_Sort_string_Descending_262144                                 33.8ns ±14%  26.1ns ±21%   -22.74%  (p=0.000 n=39+38)
BM_Sort_string_SingleElement_1                                   4.58ns ± 2%  4.35ns ± 3%    -5.14%  (p=0.000 n=35+37)
BM_Sort_string_SingleElement_4                                   7.92ns ± 3%  7.92ns ± 7%      ~     (p=0.625 n=38+39)
BM_Sort_string_SingleElement_16                                  18.0ns ± 3%   7.9ns ± 6%   -56.23%  (p=0.000 n=36+35)
BM_Sort_string_SingleElement_64                                  20.3ns ± 5%  19.3ns ±15%    -4.83%  (p=0.000 n=34+38)
BM_Sort_string_SingleElement_256                                 19.4ns ± 7%  18.1ns ±14%    -6.67%  (p=0.000 n=36+39)
BM_Sort_string_SingleElement_1024                                19.3ns ± 9%  17.4ns ±17%    -9.40%  (p=0.000 n=35+40)
BM_Sort_string_SingleElement_16384                               17.5ns ±12%  16.2ns ±20%    -7.91%  (p=0.000 n=37+40)
BM_Sort_string_SingleElement_262144                              16.7ns ±18%  15.3ns ±27%    -8.56%  (p=0.000 n=40+40)
BM_Sort_string_PipeOrgan_1                                       4.60ns ± 2%  4.33ns ± 3%    -5.80%  (p=0.000 n=33+31)
BM_Sort_string_PipeOrgan_4                                       8.29ns ± 4%  8.17ns ± 8%    -1.50%  (p=0.004 n=39+36)
BM_Sort_string_PipeOrgan_16                                      22.9ns ± 3%  16.4ns ± 6%   -28.45%  (p=0.000 n=39+38)
BM_Sort_string_PipeOrgan_64                                      30.7ns ± 4%  28.9ns ± 7%    -6.05%  (p=0.000 n=38+37)
BM_Sort_string_PipeOrgan_256                                     38.1ns ± 3%  22.5ns ± 9%   -40.78%  (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_1024                                    45.4ns ± 4%  36.2ns ± 6%   -20.33%  (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_16384                                   56.2ns ± 4%  49.0ns ± 8%   -12.73%  (p=0.000 n=36+38)
BM_Sort_string_PipeOrgan_262144                                  77.8ns ±13%  62.8ns ±10%   -19.27%  (p=0.000 n=39+39)
BM_Sort_string_QuickSortAdversary_1                              4.80ns ±16%  4.34ns ± 4%    -9.56%  (p=0.000 n=39+34)
BM_Sort_string_QuickSortAdversary_4                              14.8ns ± 5%  14.7ns ± 4%    -0.80%  (p=0.037 n=33+33)
BM_Sort_string_QuickSortAdversary_16                             44.6ns ± 4%  34.8ns ± 5%   -21.98%  (p=0.000 n=35+34)
BM_Sort_string_QuickSortAdversary_64                             66.2ns ± 3%  58.1ns ± 4%   -12.32%  (p=0.000 n=36+35)
BM_Sort_string_QuickSortAdversary_256                            85.4ns ± 5%  76.9ns ± 6%    -9.99%  (p=0.000 n=36+36)
BM_Sort_string_QuickSortAdversary_1024                            106ns ± 4%    96ns ± 3%    -9.62%  (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_16384                           153ns ± 3%   141ns ± 4%    -8.22%  (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_262144                          211ns ± 5%   195ns ± 6%    -7.77%  (p=0.000 n=35+38)
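
The layout of the table above appears to match the output of Go's benchstat tool. Assuming that is the case, the delta column is the relative change of the mean from old to new, and "~" marks rows whose difference is not statistically significant. A minimal sketch of how a row could be read follows; the function names and the 0.05 threshold are assumptions, not taken from the commit. Because the printed timings are rounded, recomputing a delta from them only approximates the listed value (for BM_Sort_uint32_Random_262144 it gives about -51.92% against the listed -51.81%).

```cpp
#include <cassert>
#include <cmath>

// Relative change from the old to the new timing, in percent.
// A negative result means the new implementation is faster.
inline double deltaPercent(double oldNs, double newNs) {
  assert(oldNs > 0.0 && "old timing must be positive");
  return (newNs - oldNs) / oldNs * 100.0;
}

// Assumed rule: a row shows a percentage only when p is below the
// significance threshold (0.05 is an assumption); otherwise it shows "~".
inline bool isSignificant(double pValue, double alpha = 0.05) {
  return pValue < alpha;
}
```

Under these assumptions, a row with p=0.049 reports a delta while a row with p=0.077 is shown as "~", which is consistent with the rows above.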

Differential Revision: https://reviews.llvm.org/D122780

19 months ago[mlir][sparse] introduce sparse_tensor::StorageSpecifierToLLVM pass
Peiming Liu [Thu, 15 Dec 2022 18:28:30 +0000 (18:28 +0000)]
[mlir][sparse] introduce sparse_tensor::StorageSpecifierToLLVM pass

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140122

19 months ago[LowerExpectIntrinsic] Propagate branch weights through phi values when ExpectedValue...
Zhi Zhuang [Thu, 22 Dec 2022 00:14:11 +0000 (19:14 -0500)]
[LowerExpectIntrinsic] Propagate branch weights through phi values when ExpectedValue is unlikely in LowerExpectIntrinsic

Update handlePhiDef to consider the probability argument in an expect.with.probability intrinsic when annotating BranchInsts.
In addition, we also disallow non-constant probability arguments in this intrinsic.
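
As a rough source-level illustration of the case described above, the sketch below uses the `__builtin_expect_with_probability` builtin (available in recent Clang and GCC), which Clang lowers to the expect.with.probability intrinsic. The function `classify` is hypothetical. The expected value is assigned on two paths, so it can reach the intrinsic through a phi, and the probability is a compile-time constant, as the commit requires.

```cpp
#include <cassert>

// Hypothetical example: `isSmall` is assigned on two paths, so the value
// passed to the builtin may be a phi node in the IR.
inline int classify(int x) {
  bool isSmall;
  if (x < 10)
    isSmall = true;
  else
    isSmall = (x == 42);
  // Hint: `isSmall` equals 1 with probability 0.1, i.e. it is unlikely.
  // The probability argument must be a constant in [0.0, 1.0].
  if (__builtin_expect_with_probability(isSmall, 1, 0.1))
    return 1;
  return 0;
}
```

The hint only affects branch weights; the function returns the same values with or without it.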

Differential Revision: https://reviews.llvm.org/D140337

19 months ago[libc][NFC] Use operator delete to cleanup a File object.
Siva Chandra Reddy [Thu, 22 Dec 2022 08:13:19 +0000 (08:13 +0000)]
[libc][NFC] Use operator delete to cleanup a File object.

The File API has been refactored to allow cleanup using operator delete.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D140574

19 months agoRevert "Emit unwind information in the .debug_frame section when the .cfi_sections...
Shubham Sandeep Rastogi [Thu, 22 Dec 2022 22:23:34 +0000 (14:23 -0800)]
Revert "Emit unwind information in the .debug_frame section when the .cfi_sections .debug_frame directive is used."

This reverts commit d2cbdb6bef31bdc3254daf57148225ea4b34520c.

This is because we are seeing linker crashes on the internal Apple bots.

19 months ago[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility
Nitin John Raj [Thu, 22 Dec 2022 19:28:53 +0000 (11:28 -0800)]
[RISCV] Add pass to remove W suffix from ADDIW and SLLIW to improve compressibility

SLLI and ADD are more compressible than SLLIW and ADDW. SLLI/ADD both have a 5-bit register encoding. SLLIW/ADDW have a 3-bit register encoding. They both require the dest to also be one of the sources.

We aggressively form ADDW/SLLIW as it helps hasAllWBitUsers in RISCVISelDAGToDAG to not require recursion. So we need a pass to remove excessive -w suffixes.
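
The register-encoding constraint above can be modelled with a small sketch. It assumes the standard RVC rule that 3-bit register fields can only name x8 through x15, and it takes C.ADD and C.ADDW as the compressed forms of ADD and ADDW. The function names are hypothetical and are not part of LLVM; other encoding constraints are ignored.

```cpp
#include <cassert>

// 3-bit register fields in the compressed encoding can only name x8..x15.
constexpr bool inCompressedRegSet(unsigned reg) {
  return reg >= 8 && reg <= 15;
}

// C.ADD uses 5-bit register fields: the destination must also be the first
// source, and neither register may be x0.
constexpr bool fitsCAdd(unsigned rd, unsigned rs1, unsigned rs2) {
  return rd == rs1 && rd != 0 && rd < 32 && rs2 != 0 && rs2 < 32;
}

// C.ADDW uses 3-bit register fields: the destination must also be the first
// source, and both registers must be in x8..x15.
constexpr bool fitsCAddw(unsigned rd, unsigned rs1, unsigned rs2) {
  return rd == rs1 && inCompressedRegSet(rd) && inCompressedRegSet(rs2);
}
```

For example, an add on x5 and x6 fits the compressed form only without the W suffix, which is the situation the pass is meant to exploit.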

Differential Revision: https://reviews.llvm.org/D139948

19 months ago[libc++] Granularize <type_traits> includes in <utility>
Nikolas Klauser [Tue, 20 Dec 2022 18:47:35 +0000 (19:47 +0100)]
[libc++] Granularize <type_traits> includes in <utility>

Reviewed By: Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D140426

19 months agoSCCP: Add failing testcase with llvm.ssa.copy
Matt Arsenault [Tue, 20 Dec 2022 16:34:06 +0000 (11:34 -0500)]
SCCP: Add failing testcase with llvm.ssa.copy

19 months agoSCCP: Don't assert on constantexpr casts of function uses
Matt Arsenault [Tue, 20 Dec 2022 13:18:50 +0000 (08:18 -0500)]
SCCP: Don't assert on constantexpr casts of function uses

This includes 2 different, related fixes:

1. Fix asserting on direct assume-like intrinsic uses of a function
   address

2. Fix asserting on constant expression casts used by assume-like
   intrinsics.

By default hasAddressTaken permits assume-like intrinsic uses, which
ignores assume-like calls and pointer casts of the address used by
assume-like calls.

Fixes #59602, but there are additional issues I encountered when
debugging this. For instance, the original failing bitcast expression
was really unused. Clang tentatively created it for the function type,
but it was unnecessary after applyGlobalValReplacements. That did not
clean up the now dead ConstantExpr, which hung around on the user
list, so this assert only reproduced when running clang from the
original testcase, and not when just running opt -passes=ipsccp. I don't
know who is responsible for cleaning up unused ConstantExprs, but I've
run into similar issues several times recently.

Additionally, I found a few assertions with llvm.ssa.copy with
functions and casts of functions as the argument.

Another issue theoretically exists if hasAddressTaken chooses to
respect nocapture when passed function addresses. The search here
would need to do additional work to look at the users of the constant
cast to see if any call sites need the returned attribute to be stripped.

19 months ago[CSKY] Fix MachineFunctionInfo initialization after 69e75ae695d9ef1360a2a1fbefd6e0e04...
Fangrui Song [Thu, 22 Dec 2022 22:02:12 +0000 (14:02 -0800)]
[CSKY] Fix MachineFunctionInfo initialization after 69e75ae695d9ef1360a2a1fbefd6e0e0456c3f7b

19 months ago[clang] Remove redundant initialization of std::optional (NFC)
Kazu Hirata [Thu, 22 Dec 2022 21:46:26 +0000 (13:46 -0800)]
[clang] Remove redundant initialization of std::optional (NFC)

19 months ago[mlir][GPU] Add known_block_size and known_grid_size to gpu.func
Krzysztof Drewniak [Fri, 2 Dec 2022 20:38:39 +0000 (20:38 +0000)]
[mlir][GPU] Add known_block_size and known_grid_size to gpu.func

In many cases, the number of workgroups (the grid size) and the
number of workitems within each group (the block size) that a GPU
kernel will be launched with are known. For example, if gpu.launch is
called with constant block and grid sizes, we know that those are the
only possible sizes that will be used to launch that kernel. In other
cases, a custom code-generation pipeline that eventually produces GPU
kernels may know the launch dimensions of those kernels, or at least
may be able to provide an upper bound on them.

Other GPU programming systems, such as OpenCL, allow capturing such
information to enable compiler optimizations - see
reqd_work_group_size, but MLIR currently has no mechanism for doing so.

This set of attributes is the first step in enabling optimizations
based on the known launch dimensions of kernels. It extends the kernel
outline pass to set these bounds on kernels with constant launch
dimensions and extends integer range inference for GPU index
operations to account for the bounds when they are known.
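
As a rough illustration of the range-inference idea (not the actual MLIR
implementation), the sketch below shows how a known launch dimension narrows
the range of an index value. The names IndexRange, kMaxDim, idRange and
dimRange are hypothetical, and the fallback bound of UINT32_MAX is an
assumption made only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical closed interval [min, max] for an index value.
struct IndexRange {
  uint64_t min;
  uint64_t max;
};

// Assumed fallback bound when nothing is known about the launch.
constexpr uint64_t kMaxDim = UINT32_MAX;

// Range of a thread or block id along one dimension: ids run from 0 up to
// (size - 1), where size is the known dimension or the fallback bound.
IndexRange idRange(std::optional<uint64_t> knownSize) {
  uint64_t size = knownSize.value_or(kMaxDim);
  assert(size > 0 && "a launch dimension must be at least 1");
  return {0, size - 1};
}

// Range of a dimension query along one dimension: a single value when the
// size is known, otherwise anything from 1 up to the fallback bound.
IndexRange dimRange(std::optional<uint64_t> knownSize) {
  if (knownSize)
    return {*knownSize, *knownSize};
  return {1, kMaxDim};
}
```

With a known block size of 128, for example, idRange yields [0, 127] instead
of the much wider default, which is the kind of fact later optimizations can
rely on.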

Subsequent revisions will use this data when lowering GPU operations
to the ROCDL dialect.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D139865

19 months ago[gn build] Port 17ed8f29287b
LLVM GN Syncbot [Thu, 22 Dec 2022 21:20:59 +0000 (21:20 +0000)]
[gn build] Port 17ed8f29287b

19 months ago[BOLT][AArch64] Handle adrp+ld64 linker relaxations
Vladislav Khmelevsky [Wed, 16 Nov 2022 07:57:35 +0000 (11:57 +0400)]
[BOLT][AArch64] Handle adrp+ld64 linker relaxations

The linker might relax an adrp + ldr GOT address load to adrp + add for
local non-preemptible symbols (e.g. hidden/protected symbols in an
executable). The linker usually doesn't update relocations properly after
relaxation, so we have to handle such cases ourselves. To do that, while
reading relocations we change the LD64 reloc to ADD if an instruction
mismatch is found, and introduce FixRelaxationPass, which searches for
ADRP+ADD pairs and, after performing some checks, replaces the ADRP target
symbol with that of the already fixed ADD.
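
The instruction-mismatch check can be sketched as follows. This is a minimal
standalone illustration, not BOLT's code: RelocKind, the helper names and
adjustForRelaxation are hypothetical, and only the AArch64 encoding masks for
ADRP, 64-bit ADD (immediate) and 64-bit LDR (unsigned offset) are taken from
the architecture.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical relocation kinds, standing in for the LD64 GOT load
// relocation and the ADD low-12-bits relocation.
enum class RelocKind { GotLoad64, AddLo12 };

// ADRP: bit 31 set, bits 28..24 equal to 0b10000.
bool isADRP(uint32_t insn) { return (insn & 0x9F000000u) == 0x90000000u; }

// ADD (immediate), 64-bit variant, either shift setting.
bool isAddImm64(uint32_t insn) { return (insn & 0xFF800000u) == 0x91000000u; }

// LDR (immediate, unsigned offset), 64-bit variant.
bool isLdrImm64(uint32_t insn) { return (insn & 0xFFC00000u) == 0xF9400000u; }

// If the relocation says "GOT load" but the instruction is an ADD, assume
// the linker relaxed it and treat the relocation as an ADD relocation.
RelocKind adjustForRelaxation(RelocKind recorded, uint32_t insn) {
  if (recorded == RelocKind::GotLoad64) {
    assert((isLdrImm64(insn) || isAddImm64(insn)) &&
           "expected ldr or relaxed add at a GOT load relocation");
    if (isAddImm64(insn))
      return RelocKind::AddLo12;
  }
  return recorded;
}
```

A later pass can then look for an ADRP followed by an ADD on the same register
and retarget the ADRP, which is the role the message assigns to
FixRelaxationPass.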

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D138097

19 months ago[lld-macho] Fix assert when splitting section
Keith Smiley [Thu, 22 Dec 2022 00:02:38 +0000 (16:02 -0800)]
[lld-macho] Fix assert when splitting section

Fixes https://github.com/llvm/llvm-project/issues/59649

Differential Revision: https://reviews.llvm.org/D140518

19 months ago[mlir][sparse] make loop emitter API more concise.
Peiming Liu [Thu, 22 Dec 2022 18:56:44 +0000 (18:56 +0000)]
[mlir][sparse] make loop emitter API more concise.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140583

19 months ago[LangRef] Add description for nocallback attribute
Gulfem Savrun Yeniceri [Wed, 10 Aug 2022 22:40:23 +0000 (22:40 +0000)]
[LangRef] Add description for nocallback attribute

This patch adds a description of the nocallback attribute
that was implemented in https://reviews.llvm.org/D90275.

Differential Revision: https://reviews.llvm.org/D131628

19 months ago[Linker] Remove nocallback attribute while linking
Gulfem Savrun Yeniceri [Thu, 3 Nov 2022 21:16:06 +0000 (21:16 +0000)]
[Linker] Remove nocallback attribute while linking

GCC's leaf attribute is lowered to the LLVM IR nocallback attribute.
Clang conservatively treats this function attribute as a hint on the
module level, and removes it while linking modules. More context can
be found in: https://reviews.llvm.org/D131628.

Differential Revision: https://reviews.llvm.org/D137360

19 months ago[DAGCombine][X86] Pull one-use `freeze` out of `extract_vector_elt` vector operand
Roman Lebedev [Thu, 22 Dec 2022 19:18:11 +0000 (22:18 +0300)]
[DAGCombine][X86] Pull one-use `freeze` out of `extract_vector_elt` vector operand

This may allow us to further simplify the vector,
and freezing the extracted result is still fine:
```
----------------------------------------
define i8 @src(<2 x i8> %src, i64 %idx) {
%0:
  %i1 = freeze <2 x i8> %src
  %i2 = extractelement <2 x i8> %i1, i64 %idx
  ret i8 %i2
}
=>
define i8 @tgt(<2 x i8> %src, i64 %idx) {
%0:
  %i1 = extractelement <2 x i8> %src, i64 %idx
  %i2 = freeze i8 %i1
  ret i8 %i2
}
Transformation seems to be correct!
```

BUT, there must not be other uses of that freeze,
see `@freeze_extractelement_extra_use`.

Also, looks like we are missing some ISEL-level handling for freeze.

19 months ago[NFC][Codegen][X86] Add tests where we could improve `freeze` handling
Roman Lebedev [Thu, 22 Dec 2022 18:26:40 +0000 (21:26 +0300)]
[NFC][Codegen][X86] Add tests where we could improve `freeze` handling

19 months ago[Driver] Revert D139717 and add -Xparser/-Xcompiler instead
Fangrui Song [Thu, 22 Dec 2022 20:51:20 +0000 (12:51 -0800)]
[Driver] Revert D139717 and add -Xparser/-Xcompiler instead

Some macOS projects use -Xparser even if it leads to a
-Wunused-command-line-argument warning. It doesn't justify adding a broad Joined
`-X` (IgnoredGCCCompat) as GCC doesn't really support these arbitrary `-X`
options.

Note: `-Xcompiler foo` is a GNU libtool option, not a driver option.
It is misused by some ChromeOS packages (but not by Gentoo).
Keep it for a while.

It seems that GCC < 4.6 reports g++: unrecognized option '-Xfoo' but exits with 0.
GCC >= 4.6 reports g++: error: unrecognized option '-Xfoo' and exits with 1.
It never supports -Xcompiler or -Xparser, so `IgnoredGCCCompat` is not justified.

Differential Revision: https://reviews.llvm.org/D140224

19 months ago[mlir][sparse] move loop boundary method to codegenenv
Aart Bik [Thu, 22 Dec 2022 20:10:03 +0000 (12:10 -0800)]
[mlir][sparse] move loop boundary method to codegenenv

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D140578