platform/upstream/llvm.git
18 months agollvm-reduce: Simplify attribute reduction implementation
Matt Arsenault [Mon, 2 Jan 2023 18:33:17 +0000 (13:33 -0500)]
llvm-reduce: Simplify attribute reduction implementation

There's no need to construct a map of attributes to modify throughout
the whole function before applying them all at once. The attribute
classes already have the necessary set behavior.

18 months agoFix the Clang sphinx bot again
Aaron Ballman [Thu, 12 Jan 2023 12:59:07 +0000 (07:59 -0500)]
Fix the Clang sphinx bot again

The changes to fix the bot yesterday got reverted in a subsequent
commit, so this adds those changes back again.

Fixes the issue found in:
https://lab.llvm.org/buildbot/#/builders/92/builds/38593

18 months agoRevert "[mlir][linalg] Swap extract_slice(fill(x)) ops"
Alexander Belyaev [Thu, 12 Jan 2023 11:51:23 +0000 (12:51 +0100)]
Revert "[mlir][linalg] Swap extract_slice(fill(x)) ops"

This reverts commit bcfd32adc4b658dc45aa8c338d5dd03837e2a0e4.

There is already a similar pattern in mlir/lib/Dialect/Linalg/Transforms/SwapExtractSliceWithFillPatterns.cpp

Differential Revision: https://reviews.llvm.org/D141597

18 months ago[lldb] Add lldb-server targets to build docs
David Spickett [Tue, 20 Dec 2022 11:31:58 +0000 (11:31 +0000)]
[lldb] Add lldb-server targets to build docs

The current doc has people just do "ninja lldb" which is
not incorrect, it does build lldb. However it does not build lldb-server.
So you can't just "lldb some-binary" and expect it to work.

I've updated the instructions to reflect that most of the time
you'll want both lldb and lldb-server.

There is still a use case for building just lldb: I'm assuming
macOS (where you have debugserver), or when you only want to do
remote debugging.

Fixes #59575

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D140385

18 months ago[clangd] Tag clang-tidy diagnostics: modernize-*=deprecated, misc-unused-*=unnecessary
Sam McCall [Wed, 11 Jan 2023 20:06:25 +0000 (21:06 +0100)]
[clangd] Tag clang-tidy diagnostics: modernize-*=deprecated, misc-unused-*=unnecessary

Differential Revision: https://reviews.llvm.org/D141537

18 months ago[include-cleaner] Treat a constructor call as a use of the class type.
Haojian Wu [Thu, 12 Jan 2023 11:40:12 +0000 (12:40 +0100)]
[include-cleaner] Treat a constructor call as a use of the class type.

Per the discussion in https://github.com/llvm/llvm-project/issues/59916.

Differential Revision: https://reviews.llvm.org/D141592

18 months ago[lldb] Fix typo in integral format specifier
Michael Buch [Thu, 12 Jan 2023 12:06:03 +0000 (12:06 +0000)]
[lldb] Fix typo in integral format specifier

This regressed with `e262b8f48af9fdca8380f2f079e50291956aad71`.

Two issues here:
1. `:16x` is not a valid format specifier and
   we would crash when we encountered this log
   (which was the case in `TestCPPAccelerator.py`)
2. The third argument was missing curly braces so
   the log message itself was malformed.

18 months ago[clang] Fix unused variable warning in SemaConcept.cpp
Victor Komarov [Thu, 12 Jan 2023 11:55:21 +0000 (12:55 +0100)]
[clang] Fix unused variable warning in SemaConcept.cpp

Issue is described here: https://github.com/llvm/llvm-project/issues/59696

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D140711

18 months ago[Mips] Convert some tests to opaque pointers (NFC)
Nikita Popov [Thu, 12 Jan 2023 11:39:54 +0000 (12:39 +0100)]
[Mips] Convert some tests to opaque pointers (NFC)

Dropped bitcasts result in dropped COPYs in MIR.

18 months ago[Mips] Regenerate test checks (NFC)
Nikita Popov [Thu, 12 Jan 2023 11:24:06 +0000 (12:24 +0100)]
[Mips] Regenerate test checks (NFC)

18 months ago[OptTable] Precompute OptTable prefixes union table through tablegen
serge-sans-paille [Fri, 30 Dec 2022 07:32:59 +0000 (08:32 +0100)]
[OptTable] Precompute OptTable prefixes union table through tablegen

This avoids rediscovering this table when reading each option, providing
a noticeable 2% speedup when processing an empty file, and a measurable
speedup on typical workloads, see:

https://llvm-compile-time-tracker.com/compare.php?from=4da6cb3202817ee2897d6b690e4af950459caea4&to=19a492b704e8f5c1dea120b9c0d3859bd78796be&stat=instructions:u

This is optional: the legacy, on-the-fly approach can still be used
through the GenericOptTable class, while the new one is used through
PrecomputedOptTable.
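
The difference between the two approaches can be sketched roughly as follows. This is a toy model, not the LLVM implementation: the class names `OnTheFlyTable` and `PrecomputedTable`, the `(name, prefixes)` option shape, and the `PRECOMPUTED_PREFIXES` constant (standing in for a table emitted by TableGen) are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical option shape: (name, prefixes accepted for that option).
Option = Tuple[str, Sequence[str]]


def prefixes_union(options: Sequence[Option]) -> List[str]:
    """Collect every distinct prefix used by any option, in sorted order."""
    return sorted({p for _, prefixes in options for p in prefixes})


class OnTheFlyTable:
    """Legacy-style table: scans all options at run time to build the union."""

    def __init__(self, options: Sequence[Option]) -> None:
        self.options = list(options)
        self.prefixes = prefixes_union(self.options)


class PrecomputedTable:
    """New-style table: receives a union that was generated ahead of time."""

    def __init__(self, options: Sequence[Option], prefixes: Sequence[str]) -> None:
        self.options = list(options)
        self.prefixes = list(prefixes)


OPTIONS = [("help", ["-", "--"]), ("o", ["-"]), ("version", ["--"])]
# Stand-in for data that a generator would emit next to the option table.
PRECOMPUTED_PREFIXES = ["-", "--"]
```

Both tables expose the same prefix list; the precomputed variant just skips the scan over every option.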

Differential Revision: https://reviews.llvm.org/D140800

18 months ago[OptTable] Make ValuesCode initialisation of Options constexpr
serge-sans-paille [Mon, 26 Dec 2022 07:51:47 +0000 (08:51 +0100)]
[OptTable] Make ValuesCode initialisation of Options constexpr

Current implementation requires a copy of the initialization array to a
vector to be able to modify their Values field.

This is inefficient: it requires a large copy to update a value, while
TableGen has all information to avoid this overwrite.

Modify TableGen to emit the Values code and use it to perform the
initialisation.

The impact on performance is not amazing compared to the actual
compilation, but still noticeable:

https://llvm-compile-time-tracker.com/compare.php?from=d9ab3e82f30d646deff054230b0c742704a1cf26&to=f2b37fb65d5149f70b43d1801beb5239285a2a20&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D140699

18 months ago[Flang] Add/Restore basic debug support (1/n)
Kiran Chandramohan [Thu, 12 Jan 2023 10:34:34 +0000 (10:34 +0000)]
[Flang] Add/Restore basic debug support (1/n)

Recent changes to MLIR meant that Flang does not generate any debug line
table information.

This patch adds a pass that provides some foundation work with which
basic line table debug info can be generated. A walk is performed on
all the `func` ops in the module and they are decorated with a fusedLoc
op that contains the debug metadata for the subroutine along with
location information.

Alternatives include populating this info during lowering or during FIR
to LLVM Dialect conversion.

Note: Patches in future will add
    -> more realistic debug info for types and other fields.
    -> driver flags to control generation of debug.

Fixes #58634.

Reviewed By: awarzynski, vzakhari

Differential Revision: https://reviews.llvm.org/D137956

18 months ago[AMDGPU] Simplify getNumFlatOffsetBits. NFC.
Jay Foad [Thu, 22 Dec 2022 16:50:55 +0000 (16:50 +0000)]
[AMDGPU] Simplify getNumFlatOffsetBits. NFC.

Previously we considered this field to be either N-bit unsigned or
N+1-bit signed, depending on the instruction. I think it's conceptually
simpler to say that the field is always N+1-bit signed, but some
instructions do not allow negative values.
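
A minimal sketch of the simplified model, not the actual AMDGPU code: the helper name `is_legal_flat_offset` and the `allow_negative` flag are hypothetical, and the caller supplies the signed field width (the "N+1" above).

```python
def is_legal_flat_offset(offset: int, signed_bits: int, allow_negative: bool) -> bool:
    """Check an offset against a field that is always `signed_bits` wide and signed.

    Hypothetical helper: instructions that cannot take negative offsets
    reject them explicitly instead of treating the field as unsigned.
    """
    lo = -(1 << (signed_bits - 1))
    hi = (1 << (signed_bits - 1)) - 1
    if not allow_negative and offset < 0:
        return False
    return lo <= offset <= hi
```

With `signed_bits = N + 1` and `allow_negative=False`, the accepted range is 0 to 2^N - 1, which is the same range the old "N-bit unsigned" view described.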

Differential Revision: https://reviews.llvm.org/D140883

18 months ago[llvm/cmake] Replace CMAKE_SOURCE_DIR with PROJECT_SOURCE_DIR
Sebastian Neubauer [Wed, 11 Jan 2023 17:42:00 +0000 (18:42 +0100)]
[llvm/cmake] Replace CMAKE_SOURCE_DIR with PROJECT_SOURCE_DIR

When adding llvm to a build with add_subdirectory, CMAKE_SOURCE_DIR
refers to the source directory of the parent project. We want to use
PROJECT_SOURCE_DIR instead.

Differential Revision: https://reviews.llvm.org/D141521

18 months agoRevert "Don't attempt to create vectors with complex element types."
Johannes Reifferscheid [Thu, 12 Jan 2023 10:34:19 +0000 (11:34 +0100)]
Revert "Don't attempt to create vectors with complex element types."

This reverts commit 91181db6d6fd896f01e1e89786d6d7d3d09a911e.

18 months ago[Attributor] Properly repair broken unittest
Johannes Doerfert [Thu, 12 Jan 2023 10:20:50 +0000 (02:20 -0800)]
[Attributor] Properly repair broken unittest

Reverts 2dc7c7095153822ecd1a8f43aa4c185f9e80cc00 and instead repairs the
unittest properly. The test was broken in that it used references to
dead functions, assumed dead functions could reach code, assumed code
would not be deleted, and did not pre-query all assertion queries.
Arguably, the query AAs don't make it easy to use them outside the
attributor pipeline; maybe we just should not (or should fix them
pessimistically). For now, the unittest is fixed.

18 months ago[Assignment Tracking][Docs] Remove TODO for replacing undef with poison
OCHyams [Thu, 12 Jan 2023 10:17:53 +0000 (10:17 +0000)]
[Assignment Tracking][Docs] Remove TODO for replacing undef with poison

This has been resolved for assignment tracking by the patch D140906 and others
in the same stack.

18 months ago[Attributor][FIX] Avoid creating accidental poison callees
Johannes Doerfert [Thu, 12 Jan 2023 10:08:07 +0000 (02:08 -0800)]
[Attributor][FIX] Avoid creating accidental poison callees

Back with f3ad8cf00e213 we introduced a bug that caused us to skip
callees when we replace uses. This is not sound since subsequent IR
cleanup will assume replacement has happened. As such we created poison
callees for a long while. The original intent of the check was to
prevent call graph invalidation; however, we now properly check if the
instructions (here the call) are inside the SCC or not.

18 months ago[flang] Restore declared type when deallocating polymorphic entities
Valentin Clement [Thu, 12 Jan 2023 10:12:00 +0000 (11:12 +0100)]
[flang] Restore declared type when deallocating polymorphic entities

As mentioned in section 7.3.2.3 note 7, the dynamic type of an unallocated
allocatable object or a disassociated pointer is the same as its declared type.

This patch adds two functions to the runtime:
- `PointerDeallocatePolymorphic`
- `AllocatableDeallocatePolymorphic`

These two functions take a DerivedTypeDesc pointer of the declared type.
The lowering is updated accordingly to call these functions for polymorphic
and unlimited polymorphic entities. For unlimited polymorphic entities, the
dynamic type is set to nullptr when the entity is in an unallocated or
disassociated state.

Reviewed By: PeteSteinfeld, klausler

Differential Revision: https://reviews.llvm.org/D141519

18 months ago[DebugInfo] Replace UndefValue with PoisonValue in AssignmentTrackingAnalysis
OCHyams [Thu, 12 Jan 2023 09:51:08 +0000 (09:51 +0000)]
[DebugInfo] Replace UndefValue with PoisonValue in AssignmentTrackingAnalysis

This helps towards the effort to remove UndefValue from LLVM.

Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D140906

18 months ago[include-mapping] Fix parsing of html_book_20190607.zip (https://en.cppreference...
Viktoriia Bakalova [Wed, 11 Jan 2023 16:01:33 +0000 (16:01 +0000)]
[include-mapping] Fix parsing of html_book_20190607.zip (https://en.cppreference.com/w/File:html_book_20190607.zip). Skip entries that have been added to the index (C++20 symbols), but the corresponding pages for which have not been created yet.

Differential Revision: https://reviews.llvm.org/D141509

18 months ago[NFC][Assignment Tracking] Add is/setKillAddress
OCHyams [Wed, 11 Jan 2023 16:40:34 +0000 (16:40 +0000)]
[NFC][Assignment Tracking] Add is/setKillAddress

Unlike D140903 this patch folds in treating an empty metadata address component
of a dbg.assign the same as undef because it was already being treated that way
in the AssignmentTrackingAnalysis pass.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D141125

18 months ago[analyzer] Don't escape local static memregions on bind
Balazs Benics [Thu, 12 Jan 2023 09:42:57 +0000 (10:42 +0100)]
[analyzer] Don't escape local static memregions on bind

When the engine processes a store to a variable, it will eventually call
`ExprEngine::processPointerEscapedOnBind()`. This function is supposed to
invalidate (put the given locations to an escape list) the locations
which we cannot reason about.

Unfortunately, local static variables are also put into this list.

This patch relaxes the guard condition, so that beyond stack variables,
static local variables are also ignored.

Differential Revision: https://reviews.llvm.org/D139534

18 months ago[clang][Driver][CUDA] Get rid of unused LibPath
Kadir Cetinkaya [Wed, 11 Jan 2023 08:57:37 +0000 (09:57 +0100)]
[clang][Driver][CUDA] Get rid of unused LibPath

LibPath discovered during InstallationDetection wasn't used anywhere.
Moreover it actually resulted in discarding installations that don't have any
`/lib` directory.

This is causing trouble for our downstream pipelines, which want to perform
syntax-only analysis on the sources.

Differential Revision: https://reviews.llvm.org/D141467

18 months ago[ODRHash] Handle `Integral` and `NullPtr` template parameters in `ODRHash`
isuckatcs [Sun, 8 Jan 2023 13:32:06 +0000 (14:32 +0100)]
[ODRHash] Handle `Integral` and `NullPtr` template parameters in `ODRHash`

Before this patch the parameters mentioned in the title weren't
handled by ODRHash, when a hash was generated for a template
specialization. This patch adds these parameters to the hash,
so that different template specializations will get different
hashes in every case.
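
The idea can be illustrated with a toy hash, which is only a stand-in: ODRHash does not work like this, and the function name `specialization_hash` and the `(kind, value)` argument encoding are hypothetical. The point is simply that the argument's kind and value both feed into the digest.

```python
import hashlib
from typing import Iterable, Tuple


def specialization_hash(template_name: str, args: Iterable[Tuple[str, object]]) -> str:
    """Hash a template specialization, mixing in each argument's kind and value.

    Toy stand-in: if an argument kind were skipped here, two specializations
    differing only in that argument would collide.
    """
    h = hashlib.sha256(template_name.encode())
    for kind, value in args:
        h.update(f"|{kind}:{value!r}".encode())
    return h.hexdigest()
```

For example, `specialization_hash("S", [("Integral", 1)])` and `specialization_hash("S", [("Integral", 2)])` differ because the integral value is part of the hashed input.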

Differential Revision: https://reviews.llvm.org/D141224

18 months agoMarking TypeSize getFixedSize() and getKnownMinSize() deprecated
Guillaume Chatelet [Wed, 11 Jan 2023 17:02:32 +0000 (17:02 +0000)]
Marking TypeSize getFixedSize() and getKnownMinSize() deprecated

This change is the last of a series to implement the discussion from
https://reviews.llvm.org/D141134 to only keep one version of the functions.

18 months agoAvoid u8"" literals in tests, their type changes in C++20
Jens Massberg [Tue, 10 Jan 2023 16:12:49 +0000 (17:12 +0100)]
Avoid u8"" literals in tests, their type changes in C++20

Just specify the encoded bytes instead.
Additionally delete insertion operator of raw_ostream for char8_t as it
doesn't work as users might expect (Numbers and pointers are added to
the stream instead of UTF-8 characters). Added a comment and instructions
on how to use UTF-8 strings with raw_ostream.

Differential Revision: https://reviews.llvm.org/D141392

18 months ago[Attributor][FIX] Avoid deleting (internal) library functions
Johannes Doerfert [Thu, 12 Jan 2023 09:06:41 +0000 (01:06 -0800)]
[Attributor][FIX] Avoid deleting (internal) library functions

In CGSCC mode we cannot delete internal library functions, esp.
__kmpc_alloc_shared, or we trigger an assertion. While the assertion is
probably too narrow, we avoid deleting those unused functions for now to
unblock the AMDGPU buildbot.

18 months ago[flang][NFC] Remove CallBuilder class
Jean Perier [Thu, 12 Jan 2023 09:14:07 +0000 (10:14 +0100)]
[flang][NFC] Remove CallBuilder class

The methods of CallBuilder do not need to belong to a class.
The class was introduced to avoid having to propagate generic lowering context
(converter, symbol mappings, location and StatementContext).

Packaging them together will actually make it harder to share the code
for user and intrinsic elemental lowering (I plan to use C++ CRTP),
and it is also misleading: one could think there is something going
on with the class state while lowering the function while there is not
(and there should not be).

Removes the class and turns the methods into static functions.
Add a new CallContext class to solve the argument threading
inconvenience.

This contains no functional changes at all.

Differential Revision: https://reviews.llvm.org/D141510

18 months ago[flang] Lower component-ref to hlfir.designate
Jean Perier [Thu, 12 Jan 2023 09:08:16 +0000 (10:08 +0100)]
[flang] Lower component-ref to hlfir.designate

Implement the visit of component refs in DesignatorBuilder.
The ArrayRef code has to be updated a bit to cope with the
case where the base is an array and the component is also an
array.

Improve the result type of array sections designators (only return
a fir.box if the array section is not contiguous/has dynamic extent).
This required exposing IsContiguous entry point for different
front-end designator nodes (the implementation already existed,
but was internal to check-expression.cpp).

Differential Revision: https://reviews.llvm.org/D141470

18 months ago[IR] Support importing modules with invalid data layouts.
Jannik Silvanus [Wed, 4 Jan 2023 10:52:00 +0000 (11:52 +0100)]
[IR] Support importing modules with invalid data layouts.

Use the existing mechanism to change the data layout using callbacks.

Before this patch, we had a callback type DataLayoutCallbackTy that receives
a single StringRef specifying the target triple, and optionally returns
the data layout string to be used. Module loaders (both IR and BC) then
apply the callback to potentially override the module's data layout,
after first having imported and parsed the data layout from the file.

We can't do the same to fix invalid data layouts, because the import will already
fail, before the callback has a chance to fix it.
Instead, module loaders now tentatively parse the data layout into a string,
wait until the target triple has been parsed, apply the override callback
to the imported string and only then parse the tentative string as a data layout.

Moreover, add the old data layout string S as second argument to the callback,
in addition to the already existing target triple argument.
S is either the default data layout string in case none is specified, or the data
layout string specified in the module, possibly after auto-upgrades (for the BitcodeReader).
This allows callbacks to inspect the old data layout string,
and fix it instead of setting a fixed data layout.
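
The new ordering can be sketched as follows. This is a simplified model under stated assumptions, not the loader code: `load_module`, `parse_data_layout`, and the toy validation rule (no empty components) are hypothetical; only the two-argument callback shape (target triple, old data layout string) comes from the description above.

```python
from typing import Callable, List, Optional

# (target triple, old data layout string) -> replacement string, or None to keep it.
DataLayoutCallback = Callable[[str, str], Optional[str]]


def parse_data_layout(layout: str) -> List[str]:
    """Toy parser: rejects empty components, standing in for real validation."""
    parts = layout.split("-")
    if any(not p for p in parts):
        raise ValueError(f"invalid data layout: {layout!r}")
    return parts


def load_module(triple: str, layout_str: str,
                callback: Optional[DataLayoutCallback] = None) -> List[str]:
    # Keep the layout as an unparsed string until the triple is known.
    tentative = layout_str
    if callback is not None:
        override = callback(triple, tentative)
        if override is not None:
            tentative = override
    # Parse only now, so the callback had a chance to repair an invalid string.
    return parse_data_layout(tentative)


def fix_double_dash(triple: str, old_layout: str) -> Optional[str]:
    """Example callback that inspects and repairs the old string."""
    return old_layout.replace("--", "-")
```

Calling `load_module("x86_64-unknown-linux-gnu", "e--i64:64", fix_double_dash)` succeeds in this model, whereas parsing the string before invoking the callback would fail.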

Also allow passing data layout override callbacks to lazy bitcode module
loader functions.

Differential Revision: https://reviews.llvm.org/D140985

18 months ago[Clang] Emit noundef metadata next to range metadata
Nikita Popov [Wed, 11 Jan 2023 14:19:57 +0000 (15:19 +0100)]
[Clang] Emit noundef metadata next to range metadata

To preserve the previous semantics after D141386, adjust places
that currently emit !range metadata to also emit !noundef metadata.
This retains range violation as immediate undefined behavior,
rather than just poison.

Differential Revision: https://reviews.llvm.org/D141494

18 months ago[LangRef] Make !range, !nonnull and !align return poison instead of IUB
Nikita Popov [Tue, 10 Jan 2023 15:06:30 +0000 (16:06 +0100)]
[LangRef] Make !range, !nonnull and !align return poison instead of IUB

Make violation of !range, !nonnull and !align metadata return poison
instead of causing immediate undefined behavior. This makes the
behavior match that of the nonnull and align parameter and return
value attributes. The previous behavior can be restored by additionally
specifying the !noundef metadata, same as with parameters.

Some benefits of this change are:

 * This is needed to fix https://github.com/llvm/llvm-project/issues/59888.
   Under current semantics, it is illegal to add !range annotations
   based on known bits. Unless we want to drop that optimization
   entirely, we need to change the !range semantics.
 * This allows preserving range/nonnull/align metadata on
   speculated loads. !noundef metadata needs to be dropped, but
   the poison-generating metadata can be retained.

I don't think there are really disadvantages to the change (apart
from the need to review and adjust optimizations for the new
semantics), as the old behavior is still available via !noundef,
so it should be strictly more flexible.

Differential Revision: https://reviews.llvm.org/D141386

18 months agoDon't attempt to create vectors with complex element types.
Johannes Reifferscheid [Thu, 12 Jan 2023 08:54:47 +0000 (09:54 +0100)]
Don't attempt to create vectors with complex element types.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D141578

18 months ago[OpenMP] Fix a test that fails when 'libgomp' is the default library
Dmitri Gribenko [Thu, 12 Jan 2023 08:24:55 +0000 (09:24 +0100)]
[OpenMP] Fix a test that fails when 'libgomp' is the default library

We can't do offloading with libgomp, thus the test fails. This change
explicitly chooses an OpenMP runtime library that is capable of
offloading.

This change is similar to
https://github.com/llvm/llvm-project/commit/a5098e5f27badc3ba16533418accd2e17641e4e4.

18 months ago[Attributor] Temporarily disable unit test to unbreak buildbots
Johannes Doerfert [Thu, 12 Jan 2023 08:51:33 +0000 (00:51 -0800)]
[Attributor] Temporarily disable unit test to unbreak buildbots

The root cause seems to have manifested in two separate errors and isn't
caught by any IR tests. Will be investigated.

18 months ago[mlir][Tosa][NFC] Migrate Tosa dialect to the new fold API
Markus Böck [Tue, 10 Jan 2023 19:27:12 +0000 (20:27 +0100)]
[mlir][Tosa][NFC] Migrate Tosa dialect to the new fold API

See https://discourse.llvm.org/t/psa-new-improved-fold-method-signature-has-landed-please-update-your-downstream-projects/67618 for context

Differential Revision: https://reviews.llvm.org/D141527

18 months ago[mlir][Vector][NFC] Migrate Vector dialect to the new fold API
Markus Böck [Tue, 10 Jan 2023 19:18:31 +0000 (20:18 +0100)]
[mlir][Vector][NFC] Migrate Vector dialect to the new fold API

See https://discourse.llvm.org/t/psa-new-improved-fold-method-signature-has-landed-please-update-your-downstream-projects/67618 for context

Differential Revision: https://reviews.llvm.org/D141526

18 months ago[mlir][Shape][NFC] Migrate shape dialect to the new fold API
Markus Böck [Tue, 10 Jan 2023 19:03:08 +0000 (20:03 +0100)]
[mlir][Shape][NFC] Migrate shape dialect to the new fold API

See https://discourse.llvm.org/t/psa-new-improved-fold-method-signature-has-landed-please-update-your-downstream-projects/67618 for context

Changes are mostly mechanical in nature. The code nevertheless became more expressive in a lot of places thanks to the use of the new getters!

Differential Revision: https://reviews.llvm.org/D141501

18 months ago[cmake] Optionally install clang-tblgen to aid cross-compiling
James Le Cuirot [Thu, 12 Jan 2023 08:42:26 +0000 (09:42 +0100)]
[cmake] Optionally install clang-tblgen to aid cross-compiling

clang-tblgen is required to cross-compile clang itself. Unlike before,
most of the infrastructure is in place to do this now, and the only
thing preventing it is LLVM_BUILD_UTILS, which doesn't apply to Clang.

I thought about changing this to ${project}_BUILD_UTILS and adding
a CLANG_BUILD_UTILS option, but there seems little point for just one tool.

Instead, it checks whether clang-tblgen was explicitly requested in
LLVM_DISTRIBUTION_COMPONENTS, which is good enough for Gentoo and
other distributions.

Closes https://github.com/llvm/llvm-project/issues/20282.

Differential Revision: https://reviews.llvm.org/D141092

18 months ago[OpenMP] Replace ExternalizationRAII with virtual uses
Johannes Doerfert [Wed, 11 Jan 2023 09:22:45 +0000 (01:22 -0800)]
[OpenMP] Replace ExternalizationRAII with virtual uses

The externalization was always a stopgap solution. One of the drawbacks
is that it is very conservative no matter if we actually require the
functions at the end of the pass. The new concept is more generic and
properly integrates into the dependence graph. Whenever we might need a
function, it has a "virtual use" that cannot be analyzed. If we do not
because of some AA state, there will be a dependence to ensure state
changes trigger revisits of uses, including a potentially new virtual
use.

18 months ago[Attributor] Make AAIsDeadFunction lazy
Johannes Doerfert [Thu, 12 Jan 2023 01:49:09 +0000 (17:49 -0800)]
[Attributor] Make AAIsDeadFunction lazy

18 months ago[Attributor] Ensure no recursive reasoning is used for isAssumedDead
Johannes Doerfert [Thu, 12 Jan 2023 01:48:45 +0000 (17:48 -0800)]
[Attributor] Ensure no recursive reasoning is used for isAssumedDead

This is a precaution for the future.

18 months ago[ELF] Emit Verbose Asm when using --lto-emit-asm
Pierre van Houtryve [Mon, 9 Jan 2023 10:33:38 +0000 (05:33 -0500)]
[ELF] Emit Verbose Asm when using --lto-emit-asm

D138560 was abandoned as the use case can already be covered by `-Xoffload-linker --lto-emit-asm`.
However the output from `--lto-emit-asm` doesn't have
comments like the Clang `-S` output.

This patch adds verbose assembly output to LLD ELF LTO
so that the resulting assembly file more closely matches Clang's.

Having comments is especially important on targets such as AMDGPU because
they contain additional information about the kernel(s) being compiled.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D141268

18 months ago[Attributor] Allow AAs to iterate on their own state
Johannes Doerfert [Thu, 12 Jan 2023 01:47:14 +0000 (17:47 -0800)]
[Attributor] Allow AAs to iterate on their own state

Future AAs might need to iterate their own state until they reach a
fixpoint. We do not want to forbid that but we want to avoid negative
effects or bugs once this happens. As a precaution, we now rerun an AA
that did not require outside information. If it does not change anymore
we are done, otherwise the AA needs to iterate some more.
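
The "rerun until nothing changes" precaution amounts to a plain fixpoint loop. The sketch below is generic and hypothetical (`run_to_fixpoint` and `close_under_halving` are illustration names, not Attributor APIs); it only shows the control flow described above.

```python
def run_to_fixpoint(update, state, max_iterations=32):
    """Rerun `update` on its own result until the state stops changing."""
    for _ in range(max_iterations):
        new_state = update(state)
        if new_state == state:
            return state  # no change on the rerun: fixpoint reached
        state = new_state
    raise RuntimeError("no fixpoint reached within the iteration limit")


def close_under_halving(values: frozenset) -> frozenset:
    """Example update that depends only on its own state."""
    return values | {v // 2 for v in values}
```

Here `run_to_fixpoint(close_under_halving, frozenset({8}))` keeps iterating while new elements appear and stops once a rerun produces the same set.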

18 months ago[Attributor][FIX] Improve care when dealing with liveness
Johannes Doerfert [Thu, 12 Jan 2023 01:19:34 +0000 (17:19 -0800)]
[Attributor][FIX] Improve care when dealing with liveness

This patch adds two checks for situations that caused issues in experiments. One
was an oversight that allowed new AAs during cleanup to be optimistic.
The other treated functions as functions even if they were used as
values, e.g., in a cast instruction. In such cases we might have assumed
the value is dead if the function is not entered, which isn't true.

The new test functions don't expose a bug but I kept them around.

18 months ago[Attributor] Always ensure the correct AAIsDead object is used
Johannes Doerfert [Wed, 11 Jan 2023 23:50:59 +0000 (15:50 -0800)]
[Attributor] Always ensure the correct AAIsDead object is used

Since the Attributor::isAssumedDead lookups can jump between functions
we need to potentially replace a given FnLivenessAA for it to be useful.

18 months ago[MLIR][Tensor] Add canonicalization patterns for `tensor.pack`
Lorenzo Chelini [Wed, 21 Dec 2022 09:18:14 +0000 (10:18 +0100)]
[MLIR][Tensor] Add canonicalization patterns for `tensor.pack`

- Fold an unpack(pack(x)) to x.

- Rewrite a `tensor.pack` to a `tensor.expand_shape` if only one
  dimension is packed.
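
The first pattern can be pictured on a toy expression tree. This sketch is not MLIR code: the `Op` class and `fold_unpack_of_pack` are hypothetical, and it ignores any compatibility checks between the pack and unpack parameters that a real pattern would presumably need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical one-operand operation node."""
    name: str
    operand: object


def fold_unpack_of_pack(op):
    """Return x for unpack(pack(x)); otherwise return the input unchanged."""
    if isinstance(op, Op) and op.name == "unpack":
        inner = op.operand
        if isinstance(inner, Op) and inner.name == "pack":
            return inner.operand
    return op
```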

Reviewed By: tyb0807, hanchung, mravishankar

Differential Revision: https://reviews.llvm.org/D141123

18 months ago[NFC] Refactor the outdated warning message about removing std::experimental::coroutine
Chuanqi Xu [Thu, 12 Jan 2023 07:29:50 +0000 (15:29 +0800)]
[NFC] Refactor the outdated warning message about removing std::experimental::coroutine

The warning message is out of date. According to
https://github.com/llvm/llvm-project/issues/59110 and
https://reviews.llvm.org/D108697, this would be removed in LLVM17.

18 months agoRevert "[gn] port c268f850a299"
Vitaly Buka [Thu, 12 Jan 2023 07:31:03 +0000 (23:31 -0800)]
Revert "[gn] port c268f850a299"

With D141446.

This reverts commit 6be251352e6b4d9708a1b7b7b146ea199342de22.

18 months agoRevert "Fix to D139603(reverted) - moved size check to unit test so that it is cross...
Vitaly Buka [Thu, 12 Jan 2023 07:24:22 +0000 (23:24 -0800)]
Revert "Fix to D139603(reverted) - moved size check to unit test so that it is cross-platform"

Several bots are broken, details in https://reviews.llvm.org/D141446

This reverts commit c268f850a2998eb5370c07c74d7d0756dcc851c9.

18 months ago[RISCV] Teach lowerCTLZ_CTTZ_ZERO_UNDEF to handle conversion i32/i64 vectors to f32...
Yeting Kuo [Mon, 9 Jan 2023 13:54:22 +0000 (21:54 +0800)]
[RISCV] Teach lowerCTLZ_CTTZ_ZERO_UNDEF to handle conversion i32/i64 vectors to f32 vectors.

Previously lowerCTLZ_CTTZ_ZERO_UNDEF converted the source to float value by
ISD::UINT_TO_FP. ISD::UINT_TO_FP uses dynamic rounding mode, so the rounding
may make the exponent of the result not as expected when converting i32/i64 to f32.
This is the reason why we constrained lowerCTLZ_CTTZ_ZERO_UNDEF to only handle
an i32 source when the f64 type having the same element count as source is legal.

The patch teaches lowerCTLZ_CTTZ_ZERO_UNDEF to convert i32/i64 vectors to f32
vectors by vfcvt.f.xu.v with RTZ rounding mode. Using RTZ makes sure the
exponent of the results is correct, although f32 cannot exactly represent every
value in i32/i64.
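
The effect of the rounding mode on the exponent can be reproduced with a small scalar model. This is a sketch only, not the RISC-V lowering: all function names are hypothetical, round-toward-zero is emulated by clearing the bits below the 24-bit f32 significand, and the non-RTZ path uses round-to-nearest as one example of a mode that can change the exponent.

```python
import struct


def f32_exponent(value: float) -> int:
    """Unbiased exponent of `value` after conversion to IEEE-754 binary32."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return ((bits >> 23) & 0xFF) - 127


def truncate_to_f32_precision(x: int) -> int:
    """Emulate round-toward-zero by clearing bits below the 24-bit significand."""
    shift = max(x.bit_length() - 24, 0)
    return (x >> shift) << shift


def ctlz32_via_f32(x: int, rtz: bool) -> int:
    """Leading-zero count of a nonzero 32-bit value, read off the f32 exponent."""
    assert 0 < x < (1 << 32)
    if rtz:
        x = truncate_to_f32_precision(x)
    # Without RTZ, struct.pack rounds to nearest, which can bump the exponent.
    return 31 - f32_exponent(float(x))
```

For `0xFFFFFFFF`, round-to-nearest produces 2^32, so the exponent is 32 and the computed count is -1; with the RTZ emulation the value becomes `0xFFFFFF00`, the exponent stays 31, and the count is the expected 0.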

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D140782

18 months ago[OpenMP][FIX] Allow multiple `depend` clauses on a `taskwait nowait`
Johannes Doerfert [Wed, 11 Jan 2023 19:22:11 +0000 (11:22 -0800)]
[OpenMP][FIX] Allow multiple `depend` clauses on a `taskwait nowait`

Fixes https://github.com/llvm/llvm-project/issues/59941

Differential Revision: https://reviews.llvm.org/D141531

18 months ago[OpenMP][5.1] Support `thread_limit` on `omp target`
Johannes Doerfert [Wed, 11 Jan 2023 17:07:13 +0000 (09:07 -0800)]
[OpenMP][5.1] Support `thread_limit` on `omp target`

It is unclear to me what happens if we have two thread_limit clauses to
choose from. I will recommend to the standards committee to disallow
that. For now, we pick the teams one.

Fixes https://github.com/llvm/llvm-project/issues/59940

Differential Revision: https://reviews.llvm.org/D141540

18 months ago[OpenMP][NFC] Include global alias test
Johannes Doerfert [Wed, 11 Jan 2023 05:39:51 +0000 (21:39 -0800)]
[OpenMP][NFC] Include global alias test

18 months ago[MLIR][Affine] Fix affine scalrep - add missing check
Uday Bondhugula [Mon, 9 Jan 2023 05:36:33 +0000 (11:06 +0530)]
[MLIR][Affine] Fix affine scalrep - add missing check

This fixes https://github.com/llvm/llvm-project/issues/59461

Add missing check in affine-scalrep pass that led to scalrep assert or
wrong scalrep when dead affine region ops existed in the same block.

Differential Revision: https://reviews.llvm.org/D141255

18 months ago[MLIR] Fix affine analysis methods for affine.parallel
Uday Bondhugula [Mon, 9 Jan 2023 05:36:49 +0000 (11:06 +0530)]
[MLIR] Fix affine analysis methods for affine.parallel

Drop unnecessary bailout in checkMemRefAccessDependence in the presence of
surrounding affine.parallel ops. When the affine.parallel op was added, affine
analysis methods weren't extended to account for it. Fix this and allow memref
dependence check to work in the presence of affine.parallel ops in the mix.

Rename isForInductionVar -> isAffineForInductionVar, getLoopIVs ->
getAffineForIVs to avoid confusion since that's what they were.

Differential Revision: https://reviews.llvm.org/D141254

18 months agoCanonicalize affine set + operands while adding affine.if op domain
Uday Bondhugula [Thu, 5 Jan 2023 23:03:42 +0000 (04:33 +0530)]
Canonicalize affine set + operands while adding affine.if op domain

Canonicalize affine set + operands in addAffineIfOpDomain. This is to
ensure a unique set of operands for FlatAffineValueConstraints and in
general to provide a simplified set of constraints. For the latter
scenario, this just leads to efficiency improvements as opposed to
functionality. While on this, remove outdated/stale stuff from
AffineStructures.h.

Fixes: https://github.com/llvm/llvm-project/issues/59461

Differential Revision: https://reviews.llvm.org/D141250

18 months ago[Flang] [OpenMP] Add parser support for THREAD_LIMIT clause on OMP TARGET directive.
Raghu Maddhipatla [Wed, 11 Jan 2023 14:23:53 +0000 (08:23 -0600)]
[Flang] [OpenMP] Add parser support for THREAD_LIMIT clause on OMP TARGET directive.

OpenMP 5.1 adds support for the THREAD_LIMIT clause for OMP TARGET directive.

This patch adds parser support for it in flang.

Reviewed By: kiranchandramohan, TIFitis

Differential Revision: https://reviews.llvm.org/D141493

18 months ago[OpenMP][DeviceRTL] Fix the support for tasking on the device
Shilei Tian [Thu, 12 Jan 2023 04:50:28 +0000 (23:50 -0500)]
[OpenMP][DeviceRTL] Fix the support for tasking on the device

This patch fixes the support for tasking on the device.

Note: AMDGPU doesn't support it yet because of no support for `malloc` and `free`.

Fix #59946.

```
➜  ./test_parallel_master_device
[OMPVV_RESULT: test_parallel_master_device.c] Test passed on the device.
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D141562

18 months agoFix test hip-windows-filename.hip
Yaxun (Sam) Liu [Thu, 12 Jan 2023 04:31:09 +0000 (23:31 -0500)]
Fix test hip-windows-filename.hip

due to https://reviews.llvm.org/D141437

18 months ago[XCOFF] handle the toc-data for object file generation.
esmeyi [Thu, 12 Jan 2023 04:27:47 +0000 (23:27 -0500)]
[XCOFF] handle the toc-data for object file generation.

Summary: The toc-data feature has been supported for assembly file generation.
         This patch handles the toc-data for object file generation.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D139516

18 months ago[LoopFusion] Sorting of undominated FusionCandidates crashes
Ramkrishnan Narayanan Komala [Thu, 12 Jan 2023 04:13:33 +0000 (23:13 -0500)]
[LoopFusion] Sorting of undominated FusionCandidates crashes

This patch tries to fix [[ https://github.com/llvm/llvm-project/issues/56263 | issue ]].

If two **FusionCandidates** are at the same level of the dominator tree, neither dominates the other, but they are control-flow equivalent. Sorting such FusionCandidates requires a **nonStrictlyPostDominate** check.

Reviewed By: Narutoworld

Differential Revision: https://reviews.llvm.org/D139993

18 months ago[CodeGen] Remove #include "llvm/ADT/None.h"
Fangrui Song [Thu, 12 Jan 2023 03:47:02 +0000 (19:47 -0800)]
[CodeGen] Remove #include "llvm/ADT/None.h"

18 months ago[gn] port c268f850a299
Nico Weber [Thu, 12 Jan 2023 03:28:18 +0000 (22:28 -0500)]
[gn] port c268f850a299

18 months ago[LoongArch] Implement mayBeEmittedAsTailCall for tail call optimization
wanglei [Thu, 12 Jan 2023 02:33:11 +0000 (10:33 +0800)]
[LoongArch] Implement mayBeEmittedAsTailCall for tail call optimization

Implements TargetLowering callback `mayBeEmittedAsTailCall` that enables
CodeGenPrepare to duplicate returns when they might enable a tail-call.

Reviewed By: xen0n, MaskRay

Differential Revision: https://reviews.llvm.org/D141257

18 months ago[OpenMP] Implement `omp_get_mapped_ptr`
Shilei Tian [Thu, 12 Jan 2023 03:05:33 +0000 (22:05 -0500)]
[OpenMP] Implement `omp_get_mapped_ptr`

This patch implements the function `omp_get_mapped_ptr`.

Fix #59945.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D141545

18 months ago[RISCV] Move FP store of extractelt pattern to DAGCombine.
Craig Topper [Thu, 12 Jan 2023 03:01:46 +0000 (19:01 -0800)]
[RISCV] Move FP store of extractelt pattern to DAGCombine.

This makes it the same as integer.

18 months ago[RISCV] Use ISD::EXTRACT_VECTOR_ELT for Intrinsic::riscv_vfmv_f_s lowering.
Craig Topper [Thu, 12 Jan 2023 02:46:14 +0000 (18:46 -0800)]
[RISCV] Use ISD::EXTRACT_VECTOR_ELT for Intrinsic::riscv_vfmv_f_s lowering.

This matches what we do for extractelt from IR for both fixed and
scalable vectors.

This lets us remove a few isel patterns.

18 months ago[libc++] Fix ranges::uninitialized_move{, _n} for move-only types
Nikolas Klauser [Tue, 3 Jan 2023 20:51:31 +0000 (21:51 +0100)]
[libc++] Fix ranges::uninitialized_move{, _n} for move-only types

Fixes #59806

Reviewed By: ldionne, var-const, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D140920

18 months ago[tsan] Remove rtl-old
Vitaly Buka [Wed, 11 Jan 2023 03:20:13 +0000 (19:20 -0800)]
[tsan] Remove rtl-old

Reviewed By: dvyukov, MaskRay

Differential Revision: https://reviews.llvm.org/D141455

18 months ago[libc++][NFC] Fix endif comments in cmath
Nikolas Klauser [Thu, 12 Jan 2023 01:47:41 +0000 (02:47 +0100)]
[libc++][NFC] Fix endif comments in cmath

18 months ago[MergeICmps] Adapt to non-eq comparisons
zhongyunde [Thu, 12 Jan 2023 01:41:09 +0000 (09:41 +0800)]
[MergeICmps] Adapt to non-eq comparisons

Fix https://github.com/llvm/llvm-project/issues/59740.

Reviewed By: courbet, nikic
Differential Revision: https://reviews.llvm.org/D141188

18 months ago[libc++][ranges] Fix incorrect integer type in `view_interface` tests.
Konstantin Varlamov [Thu, 12 Jan 2023 01:42:20 +0000 (17:42 -0800)]
[libc++][ranges] Fix incorrect integer type in `view_interface` tests.

`ForwardIter() - ForwardIter()` returns `ptrdiff_t`, and converting it
to an unsigned type isn't guaranteed to produce the same type as
`size_t`.
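
As a hedged illustration of the type issue (not the test code from the patch): the unsigned counterpart of `ptrdiff_t`, obtained here with `std::make_unsigned_t`, need not be the same type as `size_t`, so a check on the exact result type should name the type it actually means. The names `UnsignedDiff` and `distance_as_unsigned` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Unsigned counterpart of ptrdiff_t. On some platforms this may be a
// different type from size_t, even though both are unsigned integers.
using UnsignedDiff = std::make_unsigned_t<std::ptrdiff_t>;

// Hypothetical helper: convert an iterator difference to its unsigned form.
inline UnsignedDiff distance_as_unsigned(const int* first, const int* last) {
  assert(first <= last);
  return static_cast<UnsignedDiff>(last - first);
}

// Portable: the result type is exactly make_unsigned_t<ptrdiff_t>.
static_assert(
    std::is_same<decltype(distance_as_unsigned(nullptr, nullptr)),
                 UnsignedDiff>::value,
    "result type is the unsigned counterpart of ptrdiff_t");

// Not portable: asserting std::is_same<UnsignedDiff, std::size_t>::value
// may fail on platforms where the two types differ.
```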

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D141456

18 months agoFixed Clang::Driver 'netbsd.c' test on Windows/Cross builders. NFC.
Vladimir Vereschaka [Thu, 12 Jan 2023 01:22:35 +0000 (17:22 -0800)]
Fixed Clang::Driver 'netbsd.c' test on Windows/Cross builders. NFC.

Differential Revision: https://reviews.llvm.org/D140817

18 months ago[bazel] Enable layering_check for llvm and clang
Fangrui Song [Thu, 12 Jan 2023 01:27:56 +0000 (17:27 -0800)]
[bazel] Enable layering_check for llvm and clang

Similar to D113952 for mlir.

I have added many missing dependencies so that
`bazel-5.0.0 build --config=generic_clang --features=layering_check @llvm-project//llvm:all @llvm-project//clang:all`
works now.
Enable the feature to ensure layering and catch circular dependencies
(https://llvm.org/docs/CodingStandards.html#library-layering).

Reviewed By: GMNGeoffrey, rupprecht

Differential Revision: https://reviews.llvm.org/D141553

18 months ago[bazel] Fix all remaining --features=layering_check issues for @llvm-project//clang:all
Fangrui Song [Thu, 12 Jan 2023 01:24:47 +0000 (17:24 -0800)]
[bazel] Fix all remaining --features=layering_check issues for @llvm-project//clang:all

18 months ago[NFC] fix more type conversion issues
Florian Mayer [Thu, 12 Jan 2023 01:24:02 +0000 (17:24 -0800)]
[NFC] fix more type conversion issues

18 months ago[bazel] Fix all remaining --features=layering_check issues for @llvm-project//llvm:all
Fangrui Song [Thu, 12 Jan 2023 01:09:09 +0000 (17:09 -0800)]
[bazel] Fix all remaining --features=layering_check issues for @llvm-project//llvm:all

18 months ago[NFC] fix type conversion issue
Florian Mayer [Thu, 12 Jan 2023 00:57:12 +0000 (16:57 -0800)]
[NFC] fix type conversion issue

18 months agoDynamically allocate scudo allocation buffer.
Florian Mayer [Tue, 20 Dec 2022 23:20:59 +0000 (15:20 -0800)]
Dynamically allocate scudo allocation buffer.

This is so we can increase the buffer size for finding elusive bugs.

Tested by hand with this program

```
#include <cstdlib>  // atoi, malloc, free

int main(int argc, char** argv) {
  if (argc < 2)
    return 1;
  int n = atoi(argv[1]);
  char* x = reinterpret_cast<char*>(malloc(1));
  *((volatile char*)x) = 1;
  free(x);
  for (; n > 0; --n) {
    char* y = reinterpret_cast<char*>(malloc(1024));
    *((volatile char*)y) = 1;
    free(y);
  }
  *x = 2;
  return 0;
}
```

SCUDO_OPTIONS=allocation_ring_buffer_size=30000 ./uaf 1000000
-> no allocation trace
SCUDO_OPTIONS=allocation_ring_buffer_size=30000000 ./uaf 1000000
-> allocation trace
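
Why the larger buffer recovers the trace can be illustrated with a toy ring buffer. This is only a sketch under assumptions: `ToyRingBuffer`, `Entry`, and `trace_survives` are hypothetical names, and scudo's real ring buffer differs in layout and in how `allocation_ring_buffer_size` is interpreted. The point is that each new record overwrites the oldest slot, so the record for `x` survives only if the buffer holds more entries than the allocations made afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One recorded allocation (hypothetical layout, for illustration only).
struct Entry {
  std::uintptr_t addr;
  bool valid;
};

class ToyRingBuffer {
 public:
  explicit ToyRingBuffer(std::size_t capacity) : slots_(capacity) {
    assert(capacity > 0);
  }

  // Store a record, overwriting the oldest slot once the buffer is full.
  void record(std::uintptr_t addr) {
    slots_[next_ % slots_.size()] = Entry{addr, true};
    ++next_;
  }

  // Report whether a record for addr is still present.
  bool find(std::uintptr_t addr) const {
    for (const Entry& e : slots_)
      if (e.valid && e.addr == addr) return true;
    return false;
  }

 private:
  std::vector<Entry> slots_;  // value-initialized: all entries invalid
  std::size_t next_ = 0;
};

// Record one early allocation, then `later_allocs` further ones, and check
// whether the early record is still in the buffer.
inline bool trace_survives(std::size_t capacity, std::size_t later_allocs) {
  ToyRingBuffer rb(capacity);
  rb.record(1);  // stands in for the allocation of `x`
  for (std::size_t i = 0; i < later_allocs; ++i) rb.record(2 + i);
  return rb.find(1);
}
```

With a small capacity the early record is overwritten, which mirrors the "no allocation trace" run above; with a capacity larger than the number of later allocations it is still found.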

Reviewed By: hctim, eugenis

Differential Revision: https://reviews.llvm.org/D140932

18 months ago[BOLT] Add test case triggering JT assertion
Rafael Auler [Thu, 12 Jan 2023 00:05:45 +0000 (16:05 -0800)]
[BOLT] Add test case triggering JT assertion

Current case that triggers BOLT assertion. Marked XFAIL.
In this test case, we reproduce the behavior seen in gcc where the
base address of a jump table is decremented by some number and ends up
at the exact address of a jump table from another function. After
linking, the instruction references another jump table and that
confuses BOLT.

Reviewed By: #bolt, Amir

Differential Revision: https://reviews.llvm.org/D138245

18 months agoFix to D139603(reverted) - moved size check to unit test so that it is cross-platform
William Huang [Wed, 11 Jan 2023 23:54:03 +0000 (23:54 +0000)]
Fix to D139603(reverted) - moved size check to unit test so that it is cross-platform

D139603 (add option to llvm-profdata to reduce output profile size) contains test cases that are not cross-platform. Move those tests to a unit test and make sure the feature is callable from the LLVM library.

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D141446

18 months ago[mlir][sparse] Avoid values buffer reallocation for annotated all dense tensors.
bixia1 [Wed, 11 Jan 2023 17:06:42 +0000 (09:06 -0800)]
[mlir][sparse] Avoid values buffer reallocation for annotated all dense tensors.

Previously, we relied on the InsertOp to gradually increase the size of the
storage for all sparse tensors. We now allocate the full size values buffer
for annotated all dense tensors when we first allocate the tensor. This avoids
the cost of gradually increasing the buffer and allows accessing the values
buffer as if it were a dense tensor.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D141516

18 months ago[RISCV] Make sure we use LMUL=1 for scalar reduction input in combineBinOpToReduce
Craig Topper [Thu, 12 Jan 2023 00:24:23 +0000 (16:24 -0800)]
[RISCV] Make sure we use LMUL=1 for scalar reduction input in combineBinOpToReduce

We might have looked through an INSERT_SUBVECTOR to find the
vmv.s.x or vfmv.s.f. If we did the ScalarV type is no longer LMUL=1.
We need to add a new INSERT_SUBVECTOR to restore it before
creating the new reduction.

While there, use the same debug location for all of the newly created
nodes. I believe we were using multiple debug locations from the
original nodes, but changing their relative order. I don't think
we're supposed to do that.

18 months ago[mlir][sparse] Refactor the code that reshapes the values buffer for annotated all...
bixia1 [Wed, 11 Jan 2023 16:59:00 +0000 (08:59 -0800)]
[mlir][sparse] Refactor the code that reshapes the values buffer for annotated all dense tensors.

Move the functionality to codegen utils for sharing with the codegen path.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D141514

18 months ago[ModuleUtils][KCFI] Set patchable-function-prefix for synthesized functions
Sami Tolvanen [Wed, 11 Jan 2023 23:06:20 +0000 (23:06 +0000)]
[ModuleUtils][KCFI] Set patchable-function-prefix for synthesized functions

When -fpatchable-function-entry is used to emit prefix nops
before functions, KCFI assumes all indirectly called functions
have the same number of prefix nops, because the nops are emitted
between the KCFI type hash and the function entry. However, as
patchable-function-prefix is a function attribute set by Clang,
functions later synthesized by LLVM don't inherit this attribute
and end up not having prefix nops. One of these functions
is asan.module_ctor, which the Linux kernel ends up calling
indirectly when KASAN is enabled.

In order to avoid tripping KCFI, save the expected prefix offset
to a module flag, and use it when we're setting KCFI type for the
relevant synthesized functions.

Link: https://github.com/ClangBuiltLinux/linux/issues/1742
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D141172

18 months ago[GWP-ASan] Fix 6413872
Alex Brachet [Wed, 11 Jan 2023 23:42:19 +0000 (23:42 +0000)]
[GWP-ASan] Fix 6413872

Use testing not zxtest in non-Fuchsia case

18 months ago[HIP] Use .hipi as preprocessor output extension
Yaxun (Sam) Liu [Tue, 10 Jan 2023 20:58:57 +0000 (15:58 -0500)]
[HIP] Use .hipi as preprocessor output extension

so that clang can recognize it and handle it automatically
without -x hip-cpp-output.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D141437

18 months ago[llvm-lto2] Remove unused include after D123126
Fangrui Song [Wed, 11 Jan 2023 23:33:50 +0000 (15:33 -0800)]
[llvm-lto2] Remove unused include after D123126

18 months ago[bazel] Fix some --features=layering_check issues
Fangrui Song [Wed, 11 Jan 2023 23:30:54 +0000 (15:30 -0800)]
[bazel] Fix some --features=layering_check issues

18 months ago[Matrix] Optimize matrix transposes around additions
Francis Visoiu Mistrih [Mon, 28 Nov 2022 20:26:54 +0000 (15:26 -0500)]
[Matrix] Optimize matrix transposes around additions

First, sink the transposes to the operands to simplify redundant
ones. Then, lift them to reduce the number of realized transposes.

```
(A + B)^T -> A^T + B^T -> (A + B)^T
```

See tests for more examples.
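
The rewrite relies on the identity (A + B)^T = A^T + B^T. A minimal sketch that checks it on small integer matrices follows; it uses plain `std::array` rather than LLVM's matrix intrinsics, and `Mat`, `transpose`, `add`, and `identity_holds` are hypothetical names chosen for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// R x C integer matrix stored row-major.
template <std::size_t R, std::size_t C>
using Mat = std::array<std::array<int, C>, R>;

template <std::size_t R, std::size_t C>
Mat<C, R> transpose(const Mat<R, C>& m) {
  Mat<C, R> t{};
  for (std::size_t i = 0; i < R; ++i)
    for (std::size_t j = 0; j < C; ++j) t[j][i] = m[i][j];
  return t;
}

template <std::size_t R, std::size_t C>
Mat<R, C> add(const Mat<R, C>& a, const Mat<R, C>& b) {
  Mat<R, C> s{};
  for (std::size_t i = 0; i < R; ++i)
    for (std::size_t j = 0; j < C; ++j) s[i][j] = a[i][j] + b[i][j];
  return s;
}

// Check (A + B)^T == A^T + B^T on one concrete pair of 2x3 matrices.
inline bool identity_holds() {
  const Mat<2, 3> a = {{{{1, 2, 3}}, {{4, 5, 6}}}};
  const Mat<2, 3> b = {{{{7, 8, 9}}, {{10, 11, 12}}}};
  // Left side realizes one transpose; right side realizes two.
  return transpose(add(a, b)) == add(transpose(a), transpose(b));
}
```

The lifted form on the left needs a single transpose where the sunk form needs two, which is the saving the second step of the patch is after.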

Differential Revision: https://reviews.llvm.org/D133657

18 months ago[llvm-profdata] Remove an unused include after D115915
Fangrui Song [Wed, 11 Jan 2023 23:18:10 +0000 (15:18 -0800)]
[llvm-profdata] Remove an unused include after D115915

18 months ago[GWP-ASan] Fix test to work with Fuchsia's zxtest
Alex Brachet [Wed, 11 Jan 2023 23:16:19 +0000 (23:16 +0000)]
[GWP-ASan] Fix test to work with Fuchsia's zxtest

18 months ago[llvm][dwwarf] Change CU/TU index to 64-bit
Alexander Yermolovich [Wed, 11 Jan 2023 23:06:42 +0000 (15:06 -0800)]
[llvm][dwwarf] Change CU/TU index to 64-bit

Changed contribution data structure to 64 bit. I added the 32bit and 64bit
accessors to make it explicit where we use 32bit and where we use 64bit. Also to
make sure we catch all the cases where this data structure is used.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D139379

18 months ago[GCOV] Set !kcfi_type metadata for indirectly called functions
Sami Tolvanen [Fri, 6 Jan 2023 20:54:09 +0000 (20:54 +0000)]
[GCOV] Set !kcfi_type metadata for indirectly called functions

With CONFIG_GCOV_KERNEL, the Linux kernel indirectly calls the
__llvm_gcov_* functions generated by LLVM. With -fsanitize=kcfi,
these calls are made from instrumented code and fail indirect
call checks as they don't have !kcfi_type metadata. Similarly
to D138945, set type metadata for these functions to allow GCOV
and KCFI to be both enabled.

Link: https://github.com/ClangBuiltLinux/linux/issues/1778
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D141444

18 months agoRevert "[llvm][dwwarf] Change CU/TU index to 64-bit"
Alexander Yermolovich [Wed, 11 Jan 2023 22:40:54 +0000 (14:40 -0800)]
Revert "[llvm][dwwarf] Change CU/TU index to 64-bit"

This reverts commit fa3fa4d0d42326005dfd5887bf047b86904d3be6.

18 months ago[BOLT] using jump weights in profi
spupyrev [Mon, 12 Dec 2022 19:29:02 +0000 (11:29 -0800)]
[BOLT] using jump weights in profi

We want to use profile inference (profi) in BOLT for stale profile matching.
This is the second change for existing usages of profi (e.g., CSSPGO):

(i) Added the ability to provide (estimated) jump weights for the algorithm. The
goal of the algorithm is to create a valid control flow for a given function
(that is, one in which incoming counts equal outgoing counts for every basic
block while minimally modifying the original input block and jump weights). The
input jump weights will be provided based on collected LBR profiles in BOLT.

(ii) Added the corresponding options to ProfiParams.

(iii) Slightly modified / simplified the construction of the flow network in profi
so that it utilizes fewer auxiliary nodes. This is done by introducing parallel
edges to the network (which is supported by MMF) and reduces the size of the
network from 3*|V| to 2*|V|, where |V| is the number of basic blocks in the
function.
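
The validity condition from (i), that incoming counts equal outgoing counts for every basic block, can be sketched as a small check. This is an illustrative sketch, not profi's code: `Jump`, `isValidFlow`, and the treatment of entry and exit blocks (blocks without incoming or outgoing jumps are exempt from the corresponding side of the check) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A weighted jump between two basic blocks, identified by index.
struct Jump {
  std::size_t from;
  std::size_t to;
  std::uint64_t weight;
};

// Return true if, for every block, the block count matches the sum of its
// incoming jump weights and the sum of its outgoing jump weights.
inline bool isValidFlow(const std::vector<std::uint64_t>& blockCounts,
                        const std::vector<Jump>& jumps) {
  const std::size_t n = blockCounts.size();
  std::vector<std::uint64_t> in(n, 0), out(n, 0);
  std::vector<bool> hasIn(n, false), hasOut(n, false);

  for (const Jump& j : jumps) {
    if (j.from >= n || j.to >= n) return false;  // malformed input
    out[j.from] += j.weight;
    hasOut[j.from] = true;
    in[j.to] += j.weight;
    hasIn[j.to] = true;
  }

  for (std::size_t i = 0; i < n; ++i) {
    if (hasIn[i] && in[i] != blockCounts[i]) return false;
    if (hasOut[i] && out[i] != blockCounts[i]) return false;
  }
  return true;
}
```

For a diamond-shaped function with block counts {100, 60, 40, 100} and jumps 0→1 (60), 0→2 (40), 1→3 (60), 2→3 (40), the check passes; lowering the weight of 2→3 to 30 makes it fail at blocks 2 and 3.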

**Inference (profile quality) impact:**
The diff is supposed to be a no-op for the inferred counts. However, our
implementation of MCF is not fully deterministic and might return different
results depending on the input network model. Since we changed the model
construction, there are a few differences in comparison to the original
implementation. I checked manually on an internal benchmark and see a minor
difference (+/- 1 count for certain basic blocks) in just a dozen instances
(out of 10000+ input functions). Hence, the diff is highly unlikely to have an
impact for existing prod workloads.

**Runtime impact:**
I measure up to 10% speedup for block-only (ie CSSPGO/AutoFDO) inference and up
to 50% speedup for block+jump inference (ie BOLT) in comparison to the original
unoptimized version.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D139870

18 months ago[AMDGPU] Mark wmma intrinsics as source of divergence
Stanislav Mekhanoshin [Wed, 11 Jan 2023 21:27:11 +0000 (13:27 -0800)]
[AMDGPU] Mark wmma intrinsics as source of divergence

I do not believe any code can hit this, but these do not give
a uniform answer with all uniform sources.

Differential Revision: https://reviews.llvm.org/D141544