platform/upstream/llvm.git
16 months agoReland '[msan] Intercept dladdr1, and refactor dladdr'
Thurston Dang [Tue, 11 Jul 2023 16:45:25 +0000 (16:45 +0000)]
Reland '[msan] Intercept dladdr1, and refactor dladdr'

Relanding with #if SANITIZER_GLIBC to avoid breaking FreeBSD.
Also incorporates Arthur's BUILD.gn fix (thanks!) from https://reviews.llvm.org/rGc1e283851772ba494113311405d48cfb883751d1

Original commit message:
This patch adds an msan interceptor for dladdr1 (with support for RTLD_DL_LINKMAP and RTLD_DL_SYMENT) and an accompanying test. It also adds a helper file, msan_dl.cpp, that contains UnpoisonDllAddrInfo (refactored out of the dladdr interceptor) and UnpoisonDllAddr1ExtraInfo.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D154272

16 months ago[flang] Use fir.type_desc in nullify
Valentin Clement [Tue, 11 Jul 2023 19:41:30 +0000 (12:41 -0700)]
[flang] Use fir.type_desc in nullify

Do not look for the global early in nullify codegen. The type descriptor
can be emitted later and it would raise an error as it could not be found.
Use `fir.type_desc` instead so it delays the type descriptor lookup until
everything is emitted.

https://github.com/llvm/llvm-project/issues/63775

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D154982

16 months ago[Clang] Diagnose jumps into statement expressions
Corentin Jabot [Fri, 7 Jul 2023 08:58:13 +0000 (10:58 +0200)]
[Clang] Diagnose jumps into statement expressions

Such jumps are not allowed by GCC and allowing them
can lead to situations where we jump into unevaluated
statements.

Fixes #63682

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D154696

16 months ago[libc][Obvious] Check if the state hasn't already been destroyed on shutdown
Joseph Huber [Tue, 11 Jul 2023 19:30:42 +0000 (14:30 -0500)]
[libc][Obvious] Check if the state hasn't already been destroyed on shutdown

This ensures that if someone calls the `rpc_shutdown` method multiple
times it will not segfault and gracefully continue. This was causing
problems in the OpenMP usage. This could point to other issues, but for
now this is a safe fix.
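
A minimal sketch of the guard described above, assuming a single global state object. The names `ServerState`, `g_state`, `demo_init` and `demo_shutdown` are hypothetical and are not the libc RPC API; the point is only that a second shutdown call finds no state and returns quietly.

```cpp
#include <memory>

// Hypothetical stand-in for the state owned by the RPC server.
struct ServerState {
  int num_devices;
};

// Hypothetical global; null means "not initialized or already destroyed".
static std::unique_ptr<ServerState> g_state;

// Creates the state. Returns false if it already exists.
bool demo_init(int num_devices) {
  if (g_state)
    return false;
  g_state.reset(new ServerState{num_devices});
  return true;
}

// Destroys the state. Returns true if this call destroyed it, and false if
// there was nothing left to destroy (for example on a repeated shutdown).
bool demo_shutdown() {
  if (!g_state)
    return false; // Already shut down: continue gracefully, do not crash.
  g_state.reset();
  return true;
}
```

Calling `demo_shutdown()` twice in a row is safe in this sketch: the first call returns true and the second returns false.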

Differential Revision: https://reviews.llvm.org/D155005

16 months agoAdd a generic Process method to dump plugin history.
Jim Ingham [Tue, 11 Jul 2023 19:29:22 +0000 (12:29 -0700)]
Add a generic Process method to dump plugin history.

I need to call this to figure out why the assert in
StopInfoMachException::CreateStopReasonWithMachException is triggering, but
it isn't appropriate to directly access the GDBRemoteCommunication there.  And
dumping whatever history the process plugin has collected during the run isn't
gdb-remote specific...

Differential Revision: https://reviews.llvm.org/D154992

16 months ago[BPF] Undo transformation for LICM.cpp:hoistMinMax()
Eduard Zingerman [Fri, 7 Jul 2023 01:26:28 +0000 (04:26 +0300)]
[BPF] Undo transformation for LICM.cpp:hoistMinMax()

Extended BPFCheckAndAdjustIR pass with sinkMinMax() transformation
that undoes LICM hoistMinMax pass.

The undo transformation converts the following patterns:

    x < min(a, b) -> x < a && x < b
    x > min(a, b) -> x > a || x > b
    x < max(a, b) -> x < a || x < b
    x > max(a, b) -> x > a && x > b

Where 'a' or 'b' is a constant.
Also supports `sext min(...) ...` and `zext min(...) ...`.
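
The four rewrites above are plain integer identities, which can be checked by brute force. The following is a sketch for illustration only; `rewritesAgree` and `rewritesAgreeInRange` are hypothetical helper names and are not part of the BPF backend.

```cpp
#include <algorithm>

// Checks the four rewrite rules for one triple (x, a, b). Returns true when
// every rewritten form agrees with the original min/max comparison.
bool rewritesAgree(int x, int a, int b) {
  bool ltMin = (x < std::min(a, b)) == (x < a && x < b);
  bool gtMin = (x > std::min(a, b)) == (x > a || x > b);
  bool ltMax = (x < std::max(a, b)) == (x < a || x < b);
  bool gtMax = (x > std::max(a, b)) == (x > a && x > b);
  return ltMin && gtMin && ltMax && gtMax;
}

// Exhaustively checks every triple with all values in [-limit, limit].
bool rewritesAgreeInRange(int limit) {
  for (int x = -limit; x <= limit; ++x)
    for (int a = -limit; a <= limit; ++a)
      for (int b = -limit; b <= limit; ++b)
        if (!rewritesAgree(x, a, b))
          return false;
  return true;
}
```

The check uses signed `int` comparisons only; it says nothing about how the pass handles the `sext`/`zext` variants.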

~~~

This was previously committed as 09feee559a29 and reverted in
0bf9bfeacc8c because of the testbot memory leak report:
  https://lab.llvm.org/buildbot/#/builders/5/builds/34931

The memory leak issue was caused by incorrect instruction removal
sequence in sinkMinMaxBB():

    I->dropAllReferences();  -------->  I->eraseFromParent();
    I->removeFromParent();   fixed to

Differential Revision: https://reviews.llvm.org/D147990

16 months ago[mlir][Vector] Add support for 0-D vectors to vector.insert/extract
Diego Caballero [Sun, 11 Jun 2023 06:33:55 +0000 (06:33 +0000)]
[mlir][Vector] Add support for 0-D vectors to vector.insert/extract

This is part of the process to remove vector.insertelement/extractelement
from the Vector dialect.

RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops

Differential Revision: https://reviews.llvm.org/D152644

16 months ago[clang-tidy] Make MatchesAnyListedNameMatcher cope with unnamed Decl
Mike Crowe [Tue, 11 Jul 2023 18:58:09 +0000 (18:58 +0000)]
[clang-tidy] Make MatchesAnyListedNameMatcher cope with unnamed Decl

If MatchesAnyListedNameMatcher::NameMatcher::match() is called in
MatchMode::MatchUnqualified mode with a NamedDecl that has no name then
calling NamedDecl::getName() will assert with:
 `Name.isIdentifier() && "Name is not a simple identifier"'

It seems unfair to force all matchers using
matchers::matchesAnyListedName to defend against this, particularly
since test cases are unlikely to provoke the problem. Let's just check
whether the identifier has a name before attempting to use it instead.

Add test case that reproduces the problem to the
use-std-print-custom.cpp lit check.

Reviewed By: PiotrZSL

Differential Revision: https://reviews.llvm.org/D154884

16 months ago[clang-tidy] Don't split \r\n in modernize-use-std-print check
Mike Crowe [Tue, 11 Jul 2023 18:57:36 +0000 (18:57 +0000)]
[clang-tidy] Don't split \r\n in modernize-use-std-print check

When given:
 printf("Hello\r\n");

it's clearer to leave the CRLF intact and convert this to:
 std::print("Hello\r\n");

than to remove the trailing newline and convert it to:
 std::println("Hello\r");

Update the documentation to match, and clarify the situations for using
println vs print which weren't previously explained.
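
The CRLF rule above can be sketched as a small predicate. This is an illustration of the rule as stated in this message, not the check's actual implementation; `useStdPrintln` is a hypothetical name, and the real check may apply further conditions that are not modelled here.

```cpp
#include <string_view>

// Returns true when a format string ending in '\n' could have that newline
// dropped in favour of std::println. A trailing "\r\n" is left intact, so the
// call would stay as std::print.
bool useStdPrintln(std::string_view format) {
  if (format.empty() || format.back() != '\n')
    return false; // No trailing newline to strip.
  if (format.size() >= 2 && format[format.size() - 2] == '\r')
    return false; // Keep CRLF together rather than splitting it.
  return true;
}
```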

Reviewed By: PiotrZSL

Differential Revision: https://reviews.llvm.org/D154788

16 months ago[LV] Check if ops can safely be truncated in computeMinimumValueSizes.
Florian Hahn [Tue, 11 Jul 2023 19:18:55 +0000 (20:18 +0100)]
[LV] Check if ops can safely be truncated in computeMinimumValueSizes.

Update computeMinimumValueSizes to check if an instruction's operands
can safely be truncated.

If more than MinBW bits are demanded for the operand or if the
operand is a constant and cannot be safely truncated, it is not safe to
evaluate the instruction in the narrower MinBW. Skip those cases.

Fixes https://github.com/llvm/llvm-project/issues/47927

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154717

16 months ago[RISCV] Remove SiFive7GetCyclesWidening from RISCVSchedSiFive7.td.
Craig Topper [Tue, 11 Jul 2023 19:17:10 +0000 (12:17 -0700)]
[RISCV] Remove SiFive7GetCyclesWidening from RISCVSchedSiFive7.td.

It's identical to SiFive7GetCyclesDefault.

Differential Revision: https://reviews.llvm.org/D155002

16 months agoAMDGPU: Partially fix not respecting dynamic denormal mode
Matt Arsenault [Wed, 14 Jun 2023 22:53:48 +0000 (18:53 -0400)]
AMDGPU: Partially fix not respecting dynamic denormal mode

The most notable issue was producing v_mad_f32 in functions with the
dynamic mode, since it just ignores the mode. fdiv lowering is still
somewhat broken because it involves a mode switch and we need to query
the original mode.

16 months ago[NFC]add initialization for EmitCompactUnwindNonCanonical in ctor
Chen, Cheng2 [Tue, 11 Jul 2023 19:09:06 +0000 (15:09 -0400)]
[NFC]add initialization for EmitCompactUnwindNonCanonical in ctor

I found that the newly added member variable "EmitCompactUnwindNonCanonical" was not initialized in the constructor, although the other member variables were. Initialize it as well so the behavior is consistent.

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D154472

16 months ago[RISCV] Fix name mangling for LMUL!=1 vector types with attribute(rvv_vector_bits)
Craig Topper [Tue, 11 Jul 2023 18:49:40 +0000 (11:49 -0700)]
[RISCV] Fix name mangling for LMUL!=1 vector types with attribute(rvv_vector_bits)

We were always printing "m1", we need to calculate the correct LMUL instead.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D153659

16 months agoValueTracking: ldexp cannot return denormals based on range of exponent
Matt Arsenault [Tue, 4 Jul 2023 11:48:02 +0000 (07:48 -0400)]
ValueTracking: ldexp cannot return denormals based on range of exponent

The implementations of a number of math functions on amdgpu involve
pre and post-scaling the inputs out of the denormal range. If these
are chained together we can possibly fold them out.

computeConstantRange seems weaker than computeKnownBits, so this
regresses some of the older vector tests.

16 months ago[TableGen] Refactor the implementation of arguments to introduce ArgumentInit [nfc]
wangpc [Tue, 11 Jul 2023 17:59:47 +0000 (10:59 -0700)]
[TableGen] Refactor the implementation of arguments to introduce ArgumentInit [nfc]

A new Init type ArgumentInit is added to represent arguments.  We currently only support positional arguments; an upcoming change will add named argument support.

The argument index is removed from error messages.

Differential Revision: https://reviews.llvm.org/D154066

16 months ago[libc++][chrono] Fixes formatting duration subseconds.
Mark de Wever [Sun, 9 Jul 2023 18:59:30 +0000 (20:59 +0200)]
[libc++][chrono] Fixes formatting duration subseconds.

Fixes https://llvm.org/PR62082

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D154851

16 months ago[libc++][format] Adds a UTF transcoder.
Mark de Wever [Fri, 21 Apr 2023 06:09:06 +0000 (08:09 +0200)]
[libc++][format] Adds a UTF transcoder.

This is a preparation for

  P2093R14 Formatted output

When the output of print is to the terminal it needs to use the native
API. This means transcoding UTF-8 to UTF-16 on Windows. The encoder's
interface is modeled after

 P2728 Unicode in the Library, Part 1: UTF Transcoding

But only the required part for P2093R14 is implemented.

On Windows wchar_t is 16 bits. In order to test on platforms where
wchar_t is 32 bits the transcoder has support for char16_t. It also adds
a UTF-8 to UTF-32 encoder which is useful for other tests.

Note it is possible to use <codecvt> for transcoding, but that header is
deprecated. So rather write new code that is not deprecated; the hard
part, decoding, has already been done. The <codecvt> header also
requires locale support while the new code works without including
<locale>.

Note the current transcoder implementation can be optimized since it
basically does UTF-8 -> UTF-32 -> UTF-16. The first goal is to have a
working implementation. Since it's not part of the ABI it's possible to
do the optimization later.
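
For illustration, the second half of the UTF-8 -> UTF-32 -> UTF-16 route (encoding one decoded code point as UTF-16) can be sketched as follows. This is not the libc++ transcoder; `encodeUtf16` is a hypothetical name, and the sketch assumes the input is already a valid Unicode scalar value.

```cpp
#include <string>

// Encodes one code point as UTF-16.
// Assumption: cp is a valid Unicode scalar value (not a surrogate and not
// above U+10FFFF); no validation is performed here.
std::u16string encodeUtf16(char32_t cp) {
  std::u16string out;
  if (cp < 0x10000) {
    // Basic Multilingual Plane: a single 16-bit code unit.
    out.push_back(static_cast<char16_t>(cp));
  } else {
    // Supplementary planes: split the 20-bit offset into a surrogate pair.
    char32_t offset = cp - 0x10000;
    out.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));   // high
    out.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF))); // low
  }
  return out;
}
```

For example, U+1F600 becomes the surrogate pair 0xD83D 0xDE00, while U+0041 stays a single code unit.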

Depends on D149672

Reviewed By: ldionne, tahonermann, #libc

Differential Revision: https://reviews.llvm.org/D150031

16 months ago[libc] adjust strtofloat precision for subnormals
Michael Jones [Mon, 10 Jul 2023 23:41:41 +0000 (16:41 -0700)]
[libc] adjust strtofloat precision for subnormals

Subnormal floating point numbers have a lower effective precision than
normal floating point numbers. This can cause issues for the fuzz test
since the MPFR floats have a constant precision regardless of the
exponent, and the precision must match exactly or else create rounding
errors. To solve this problem, the precision of the MPFR floats is
dynamically calculated.
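
To make the loss of precision concrete, here is a sketch that computes how many significand bits a `double` value actually carries. It is not the code from the fuzz test; `effectivePrecisionBits` is a hypothetical helper, and it assumes IEEE-754 binary64 and a finite, nonzero input.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Number of significand bits available to x.
// Assumptions: double is IEEE-754 binary64 and x is finite and nonzero.
int effectivePrecisionBits(double x) {
  // 53 bits for a normal binary64 value.
  constexpr int kFullPrecision = std::numeric_limits<double>::digits;
  // Exponent of the smallest subnormal, 2^-1074 for binary64.
  constexpr int kMinSubnormalExp =
      std::numeric_limits<double>::min_exponent - kFullPrecision;
  // ilogb reports the exponent as if x were normalized, also for subnormals.
  int exponent = std::ilogb(x);
  // Each step below the normal range costs one bit of precision.
  return std::min(kFullPrecision, exponent - kMinSubnormalExp + 1);
}
```

Under these assumptions a normal value such as 1.0 has 53 bits, 2^-1023 has 52, and the smallest subnormal has a single bit, which is the kind of varying precision a reference float has to match.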

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D154909

16 months ago[clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`
Jan Svoboda [Fri, 26 May 2023 20:40:40 +0000 (13:40 -0700)]
[clang] Implement `PointerLikeTraits` for `{File,Directory}EntryRef`

This patch implements `llvm::PointerLikeTraits<FileEntryRef>` and `llvm::PointerLikeTraits<DirectoryEntryRef>`, allowing some simplifications around umbrella header/directory code.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D154905

16 months ago[StructuralHash] Add unittests
Arthur Eubanks [Sun, 2 Jul 2023 21:33:21 +0000 (14:33 -0700)]
[StructuralHash] Add unittests

Reviewed By: paulkirth

Differential Revision: https://reviews.llvm.org/D154308

16 months ago[flang] Add fastmath flags to localBuilder in IntrinsicCall
David Truby [Thu, 6 Jul 2023 14:32:30 +0000 (15:32 +0100)]
[flang] Add fastmath flags to localBuilder in IntrinsicCall

Currently the local builder used in IntrinsicCall doesn't have the
fastmath flags passed to it. This results in the fastmath attribute
not being added to certain runtime calls. This patch simply forwards
the fastmath flags from the parent builder.

Differential Revision: https://reviews.llvm.org/D154611

16 months ago[AArch64][test] Use -filetype=null after 0b69cc8bcba79366aeee1531c61b1e8255a2d6c4
Fangrui Song [Tue, 11 Jul 2023 17:48:25 +0000 (10:48 -0700)]
[AArch64][test] Use -filetype=null after 0b69cc8bcba79366aeee1531c61b1e8255a2d6c4

Otherwise if the current directory is not writable, we will fail with a
different message.

16 months ago[ARM][AArch64] Make ACLE __clzl/__clzll return unsigned int instead of unsigned long...
Craig Topper [Tue, 11 Jul 2023 17:42:25 +0000 (10:42 -0700)]
[ARM][AArch64] Make ACLE __clzl/__clzll return unsigned int instead of unsigned long/uint64_t.

Use unsigned long in place of uint32_t for both clz and cls.

As far as I can tell this matches what ACLE defines and what gcc implements.

Noticed while investigating fixing https://github.com/llvm/llvm-project/issues/63113

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D154910

16 months ago[lldb][NFCI] Methods to load scripting resources should take a Stream by reference
Alex Langford [Mon, 10 Jul 2023 20:19:13 +0000 (13:19 -0700)]
[lldb][NFCI] Methods to load scripting resources should take a Stream by reference

These methods all take a `Stream *` to get feedback about what's going
on. By default, it's a nullptr, but we always feed it with a valid
pointer. It would therefore make more sense to have this take a
reference.

Differential Revision: https://reviews.llvm.org/D154883

16 months ago[RISCV] Cleanup dead complexity in RISCVMaskedPseudo after TA/TU merge refactoring...
Philip Reames [Tue, 11 Jul 2023 17:24:52 +0000 (10:24 -0700)]
[RISCV] Cleanup dead complexity in RISCVMaskedPseudo after TA/TU merge refactoring [nfc]

After D154245 lands, we have greatly simplified the possible configurations for an entry in the RISCVMaskedPseudo table. This change goes through and reworks everything which uses that table to exploit the available simplifications.

To justify the correctness here, let me note that we no longer had any use of HasTU=true. We were left with only the HasTU=false, and IsCombined=true|false cases. The only usage of IsCombined=false was for the comparison operations. At the moment, these operations are the only ones in the table without vector policy operands. Instead of switching on the pseudo value, we can just check the VecPolicy flag instead.

It may be worth adding a passthru operand to the comparisons (which is actually needed to represent tail undefined vs tail agnostic), and a vector policy operand (which is strictly unneeded) just for consistency, but we can do that in a follow up patch for some further simplification if desired.

Note that we do have a few _TU pseudos left at this point. It's simply that none of them are in the RISCVMaskedPseudo table, and thus don't participate in our post-ISEL transforms.

Differential Revision: https://reviews.llvm.org/D154620

16 months ago[clang][AIX] Fix Overly Strict LTO Option Checking against `data-sections` when ...
Qiongsi Wu [Tue, 11 Jul 2023 17:10:08 +0000 (13:10 -0400)]
[clang][AIX] Fix Overly Strict LTO Option Checking against `data-sections` when `mxcoff-roptr` is in Effect

The LTO `-mxcoff-roptr` [[ https://github.com/llvm/llvm-project/blob/c6b2d25927817bdeca99653ee3e66720f33ce3ae/clang/lib/Driver/ToolChains/CommonArgs.cpp#L750 | check ]] against data sections is overly strict and it ignores the fact that [[ https://github.com/llvm/llvm-project/blob/c6b2d25927817bdeca99653ee3e66720f33ce3ae/llvm/lib/LTO/LTOCodeGenerator.cpp#L427 | data sections is on by default on AIX ]], causing valid LTO compilation to fail when `-fdata-sections` is not explicitly specified.

This patch revises the check so that an error is reported only if data sections is explicitly turned off for LTO.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D152021

16 months ago[lldb][NFCI] Avoid construction of temporary std::strings in Variable
Alex Langford [Mon, 10 Jul 2023 20:40:02 +0000 (13:40 -0700)]
[lldb][NFCI] Avoid construction of temporary std::strings in Variable

A common thing to do is to call `str().c_str()` to get a null-terminated
string out of an existing StringRef. Most of the time this is to be able
to use a printf-style format string. However, llvm::formatv can handle
StringRefs without the need for the additional allocation. Using that
makes more sense.

Differential Revision: https://reviews.llvm.org/D154890

16 months ago[RISCV] Remove legacy TA/TU pseudo distinction for binary instructions
Philip Reames [Tue, 11 Jul 2023 17:10:05 +0000 (10:10 -0700)]
[RISCV] Remove legacy TA/TU pseudo distinction for binary instructions

This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.

This change handles most of the binary pseudos. I excluded pseudos which have _TIED variants, and those that produce mask results. Both are a bit different in functionality, and deserve their own change and review. As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand.

As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.

Differential Revision: https://reviews.llvm.org/D154245

16 months ago[AArch64] Extra tests for smull/umull, especially of smaller vector size. NFC
David Green [Tue, 11 Jul 2023 17:16:23 +0000 (18:16 +0100)]
[AArch64] Extra tests for smull/umull, especially of smaller vector size. NFC

See D153632 and D154063

16 months agoclang: add a missing dependency on ClangDriverOptions
Jon Roelofs [Tue, 11 Jul 2023 17:05:46 +0000 (10:05 -0700)]
clang: add a missing dependency on ClangDriverOptions

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake-sanitized/4449/console

16 months ago[RISCV] Constrain register class before replaceRegWith in RISCVMergeBaseOffset.
Craig Topper [Tue, 11 Jul 2023 16:53:56 +0000 (09:53 -0700)]
[RISCV] Constrain register class before replaceRegWith in RISCVMergeBaseOffset.

The register being replaced might have a more restrictive register
class due to requirements of the using instruction. We should
constrain the register class to preserve any restrictions.

This was found in our downstream on a custom instruction. I don't
have a test case for upstream currently.

Differential Revision: https://reviews.llvm.org/D154920

16 months agoReland "[IRCE] Parse range checks in the form of 'LHS - RHS vs Limit'"
Aleksandr Popov [Tue, 11 Jul 2023 16:42:55 +0000 (18:42 +0200)]
Reland "[IRCE] Parse range checks in the form of 'LHS - RHS vs Limit'"

This reverts commit 4c6f95be29c6ce0f89663a5103c58ee63d76cda3

and relands e16c5c092205f68825466c25a1dd30783c4820f3

https://reviews.llvm.org/D154069

16 months ago[mlir] add backward dense dataflow analysis
Alex Zinenko [Fri, 7 Jul 2023 13:17:10 +0000 (13:17 +0000)]
[mlir] add backward dense dataflow analysis

This is the counterpart to the forward dense dataflow analysis and
integrates into the dataflow framework. The implementation follows the
structure of existing dataflow analyses.

Reviewed By: Mogball, phisiart

Differential Revision: https://reviews.llvm.org/D154713

16 months ago[flang][hlfir] Fixed NULL() handling in structure constructor.
Slava Zakharin [Tue, 11 Jul 2023 16:02:23 +0000 (09:02 -0700)]
[flang][hlfir] Fixed NULL() handling in structure constructor.

When an initializer value is missing for an allocatable component
in a structure constructor, the RHS is a NULL() expression.
We should just skip this part of the initializer, since the component
must become unallocated (as it is from the initialization).
The runtime detected a rank mismatch when we tried to pass a NULL() box
RHS for assigning it to the unallocated component of rank 1, 2, etc.

Reviewed By: tblah

Differential Revision: https://reviews.llvm.org/D154906

16 months ago[flang][hlfir] Fixed byval passing for dynamically optional intrinsic args.
Slava Zakharin [Tue, 11 Jul 2023 16:02:09 +0000 (09:02 -0700)]
[flang][hlfir] Fixed byval passing for dynamically optional intrinsic args.

In the context of an elemental operation, a dynamically optional
intrinsic argument must be lowered such that the elemental
designator is generated under isPresent check.

Reviewed By: tblah

Differential Revision: https://reviews.llvm.org/D154897

16 months ago[bolt] Fix MSVC builds
Shoaib Meenai [Tue, 11 Jul 2023 07:30:01 +0000 (00:30 -0700)]
[bolt] Fix MSVC builds

We need to explicitly mark DWARFUnitInfo as non-copyable since MSVC's
STL has a `noexcept(false)` move constructor for `unordered_map`; see
the added comment for more details.
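
The underlying mechanism can be sketched in standard C++: when `std::vector` relocates elements, it copies them if the move constructor is not `noexcept` and a copy constructor is available. The types below are hypothetical stand-ins and not BOLT's `DWARFUnitInfo`; they only model a type whose move constructor is `noexcept(false)`.

```cpp
#include <vector>

struct Counts {
  int copies = 0;
  int moves = 0;
};

// Hypothetical type with a copy constructor and a potentially-throwing move
// constructor (as happens when a member's move constructor is noexcept(false)).
struct Copyable {
  Counts *counts;
  explicit Copyable(Counts *c) : counts(c) {}
  Copyable(const Copyable &o) : counts(o.counts) { ++counts->copies; }
  Copyable(Copyable &&o) noexcept(false) : counts(o.counts) { ++counts->moves; }
};

// Same move constructor, but copying is explicitly disabled.
struct MoveOnly {
  Counts *counts;
  explicit MoveOnly(Counts *c) : counts(c) {}
  MoveOnly(const MoveOnly &) = delete;
  MoveOnly(MoveOnly &&o) noexcept(false) : counts(o.counts) { ++counts->moves; }
};

// Puts one element in a vector, forces a reallocation, and reports how the
// existing element was transferred to the new storage.
template <typename T> Counts relocateOneElement() {
  Counts counts;
  std::vector<T> v;
  v.emplace_back(&counts);
  v.reserve(v.capacity() + 1); // Larger than capacity, so this reallocates.
  return counts;
}
```

In this sketch `relocateOneElement<Copyable>()` records one copy and no moves, while `relocateOneElement<MoveOnly>()` records one move, which is why deleting the copy operations changes what the vector does.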

An alternative might be using SmallVector instead of std::vector, since
that never tries to copy elements [1]. That would result in a bunch of
API changes though, so I figured a smaller targeted fix was better.

[1] https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h

Reviewed By: ayermolo, maksfb

Differential Revision: https://reviews.llvm.org/D154924

16 months ago[ConstantHoisting] remove a LLVM_DEBUG statement
Nick Desaulniers [Tue, 11 Jul 2023 16:27:33 +0000 (09:27 -0700)]
[ConstantHoisting] remove a LLVM_DEBUG statement

There is no need to print the entire function after a transform via
LLVM_DEBUG statements.  These can be emulated via:
$ llc -print-after=consthoist -filter-print-funcs=<function name>

Otherwise, this makes the output of
$ llc -debug-only=consthoist
too verbose.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D154904

16 months ago[Doc][clang] Some PGO documentation improvements.
Wael Yehia [Fri, 30 Jun 2023 16:44:35 +0000 (12:44 -0400)]
[Doc][clang] Some PGO documentation improvements.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D154239

16 months ago[MicrosoftDemangle] fix warn-trailing false positive
Nick Desaulniers [Tue, 11 Jul 2023 16:23:09 +0000 (09:23 -0700)]
[MicrosoftDemangle] fix warn-trailing false positive

A follow up to commit 6bad76c7ae93 ("[Demangle] fix windows tests")
based on @thakis' report.

Fixes: #63740

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D154875

16 months ago[WebAssembly] Fix implicit fallthrough in encodeInstruction
Bryan Chan [Tue, 11 Jul 2023 16:29:51 +0000 (12:29 -0400)]
[WebAssembly] Fix implicit fallthrough in encodeInstruction

16 months ago[X86] ReplaceNodeResults - widen vector truncate nodes on pre-SSSE3 targets
Simon Pilgrim [Tue, 11 Jul 2023 15:56:08 +0000 (16:56 +0100)]
[X86] ReplaceNodeResults - widen vector truncate nodes on pre-SSSE3 targets

Building on the support for wider input vector types from D154592, try to more aggressively widen inputs instead of scalarizing them.

16 months ago[flang][openacc] Support ieor reduction operator
Valentin Clement [Tue, 11 Jul 2023 15:58:05 +0000 (08:58 -0700)]
[flang][openacc] Support ieor reduction operator

Add support for `ieor` reduction operator in
OpenACC lowering.

Depends on D154887

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D154888

16 months ago[flang][openacc] Support ior reduction operator
Valentin Clement [Tue, 11 Jul 2023 15:51:13 +0000 (08:51 -0700)]
[flang][openacc] Support ior reduction operator

Add support for `ior` reduction operator in
OpenACC lowering.

Depends on D154886

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D154887

16 months ago[Libomptarget] Remove RPCHandleTy indirection
Joseph Huber [Tue, 11 Jul 2023 14:27:22 +0000 (09:27 -0500)]
[Libomptarget] Remove RPCHandleTy indirection

The 'RPCHandleTy' was intended to capture the intention that a specific
device owns its slot in the RPC server. However, this required creating
a temporary store to hold these pointers. This was causing really weird
spurious failures due to undefined behaviour in the order of library
teardown. For example, the x64 plugin would be torn down, set this to
some invalid memory, and then the CUDA plugin would crash. Rather than
spend the time to fully diagnose this problem I found it pertinent to
simply remove the failure mode.

This patch removes this indirection so now the usage of the RPC server
must always be done with the intended device. This just requires some
extra handling for the AMDGPU indirection where we need to store a
reference to the device.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D154971

16 months ago[PowerPC] Truncate exponent parameter for vec_cts,vec_ctf
Zarko Todorovski [Tue, 11 Jul 2023 14:53:41 +0000 (10:53 -0400)]
[PowerPC] Truncate exponent parameter for vec_cts,vec_ctf

On PowerPC, the vec_ct* builtin functions take the form of e.g. d=vec_cts(a,b).
LLVM (llc) will crash when a user specifies a number out of the allowed range
(0-31) for b. This patch truncates b so that we avoid the backend crash in some cases.

Further documentation for the builtins can be found here:
https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.0?topic=functions-vec-ctf
https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.0?topic=functions-vec-cts

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D106409

16 months ago[AArch64] Create SVE Min/Max Cost Model Tests
Tuan Chuong Goh [Tue, 11 Jul 2023 15:51:11 +0000 (16:51 +0100)]
[AArch64] Create SVE Min/Max Cost Model Tests

Differential Revision: https://reviews.llvm.org/D154835

16 months ago[flang][openacc] Support iand reduction operator
Valentin Clement [Tue, 11 Jul 2023 15:50:17 +0000 (08:50 -0700)]
[flang][openacc] Support iand reduction operator

Add support for `iand` reduction operator in
OpenACC lowering.

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D154886

16 months agoFix profiling of overloaded postincrement / postdecrement.
Richard Smith [Tue, 11 Jul 2023 00:42:19 +0000 (17:42 -0700)]
Fix profiling of overloaded postincrement / postdecrement.

We were accidentally profiling the fabricated second argument (`0`),
resulting in overloaded dependent `a++` and non-overloaded dependent
`a++` having different hashes.
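
The "fabricated second argument" is the dummy `int` that distinguishes postfix from prefix `operator++`. The sketch below shows it at the language level only; `Counter` and `dummyPassedByPostfix` are hypothetical names, and the example does not model Clang's profiling code.

```cpp
// A type with an overloaded postfix increment. The int parameter exists only
// to select the postfix form; for the expression `c++` its value is zero.
struct Counter {
  int value = 0;
  int lastDummy = -1; // Records the dummy argument of the last postfix call.

  Counter operator++(int dummy) {
    Counter old = *this;
    lastDummy = dummy;
    ++value;
    return old;
  }
};

// Returns the dummy argument observed when writing `c++` in source.
int dummyPassedByPostfix() {
  Counter c;
  c++; // Equivalent to c.operator++(0).
  return c.lastDummy;
}
```

The source expression `c++` has a single operand, so the zero exists only in the call to the overloaded operator.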

16 months ago[llvm-objdump] Change errors to warnings for symbol section name dumping
Fangrui Song [Tue, 11 Jul 2023 15:38:02 +0000 (08:38 -0700)]
[llvm-objdump] Change errors to warnings for symbol section name dumping

Port D69671 (llvm-readobj) to llvm-objdump. Add a class llvm::objdump::Dumper
and move some free functions into Dumper so that they can call
reportUniqueWarning.

Warnings seem preferable in these cases as the issue is localized and we can
continue dumping other information.

Differential Revision: https://reviews.llvm.org/D154754

16 months ago[clangd] Use canonical path as resolved path for includes.
Viktoriia Bakalova [Tue, 11 Jul 2023 13:33:53 +0000 (13:33 +0000)]
[clangd] Use canonical path as resolved path for includes.

Differential Revision: https://reviews.llvm.org/D154962

16 months ago[SPIRV] Fix CoverageInfo after D153758
Fangrui Song [Tue, 11 Jul 2023 15:35:59 +0000 (08:35 -0700)]
[SPIRV] Fix CoverageInfo after D153758

16 months ago[mlir][nvgpu] Add initial support for `mbarrier`
Guray Ozen [Tue, 11 Jul 2023 15:34:44 +0000 (17:34 +0200)]
[mlir][nvgpu] Add initial support for `mbarrier`

`mbarrier` is a barrier created in shared memory that supports different flavors of synchronizing threads other than `__syncthreads`, for more information see below.
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier

This work adds initial Ops wrt `mbarrier` to nvgpu dialect.

First, it introduces two types:
`mbarrier.barrier`, which is a barrier object in shared memory
`mbarrier.barrier.token`, which is a token

It introduces following Ops:
`mbarrier.create` creates `mbarrier.barrier`
`mbarrier.init` initializes `mbarrier.barrier`
`mbarrier.arrive` performs arrive-on `mbarrier.barrier` returns `mbarrier.barrier.token`
`mbarrier.arrive.nocomplete` performs arrive-on (non-blocking) `mbarrier.barrier` returns `mbarrier.barrier.token`
`mbarrier.test_wait` waits on `mbarrier.barrier` and `mbarrier.barrier.token`

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D154090

16 months agoFix bazel build file for D154060.
Aliia Khasanova [Tue, 11 Jul 2023 15:03:48 +0000 (17:03 +0200)]
Fix bazel build file for D154060.

Differential Revision: https://reviews.llvm.org/D154976

16 months agoRevert "[compiler-rt] Move crt into builtins"
Petr Hosek [Tue, 11 Jul 2023 15:32:49 +0000 (15:32 +0000)]
Revert "[compiler-rt] Move crt into builtins"

This reverts commit dae9d1b52469daca88a968e7b99a26420aef657c since
it caused https://github.com/llvm/llvm-project/issues/63799.

16 months ago[FPEnv] Update comment about nofpexcept default. NFC
Luke Lau [Tue, 11 Jul 2023 14:51:08 +0000 (15:51 +0100)]
[FPEnv] Update comment about nofpexcept default. NFC

It no longer defaults to false as of 63336795f0d5

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154973

16 months agoRevert "[NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code"
Juan Manuel MARTINEZ CAAMAÑO [Tue, 11 Jul 2023 15:08:59 +0000 (17:08 +0200)]
Revert "[NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code"

This reverts commit 125b90749a98d6dc6b492883c9617f9e91ab60e0.

16 months ago[NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code
Juan Manuel MARTINEZ CAAMAÑO [Tue, 11 Jul 2023 15:03:20 +0000 (17:03 +0200)]
[NFC][AMDGPULowerModuleLDSPass] Factorize repetead sort code

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D154970

16 months ago[mlir][linalg] BufferizeToAllocation: Add custom memcpy op
Matthias Springer [Tue, 11 Jul 2023 14:42:33 +0000 (16:42 +0200)]
[mlir][linalg] BufferizeToAllocation: Add custom memcpy op

Add a new option that allows users to specify a memcpy op: "memref.tensor_store", "memref.copy" or "linalg.copy".

Differential Revision: https://reviews.llvm.org/D154968

16 months ago[mlir][bufferization] Add read_only attribute to ToMemrefOp
Matthias Springer [Tue, 11 Jul 2023 14:35:39 +0000 (16:35 +0200)]
[mlir][bufferization] Add read_only attribute to ToMemrefOp

This unit attribute indicates to the bufferization that the resulting buffer will not be written to by another op.

Differential Revision: https://reviews.llvm.org/D154967

16 months ago[clang] Use llvm.is_fpclass to implement FP classification functions
Serge Pavlov [Tue, 11 Jul 2023 14:34:02 +0000 (21:34 +0700)]
[clang] Use llvm.is_fpclass to implement FP classification functions

Builtin floating-point number classification functions:

    - __builtin_isnan,
    - __builtin_isinf,
    - __builtin_finite, and
    - __builtin_isnormal

now are implemented using `llvm.is_fpclass`.

This change makes the target callback `TargetCodeGenInfo::testFPKind`
unneeded. It is preserved in this change and should be removed later.

Differential Revision: https://reviews.llvm.org/D112932

16 months ago[mlir][linalg] Return newly created ops from bufferize_to_allocation
Matthias Springer [Tue, 11 Jul 2023 14:26:00 +0000 (16:26 +0200)]
[mlir][linalg] Return newly created ops from bufferize_to_allocation

Return all ops that were generated as part of the bufferization, so that users do not have to match them in the enclosing op.

Differential Revision: https://reviews.llvm.org/D154966

16 months ago[WebAssembly] Report error for inline assembly with unsupported opcodes
David Mo [Fri, 7 Jul 2023 18:27:00 +0000 (14:27 -0400)]
[WebAssembly] Report error for inline assembly with unsupported opcodes

For inline WebAssembly, passing a numeric operand to global.get is
unsupported. This causes encodeInstruction to reach an llvm_unreachable
call, leading to undefined behavior. This patch fixes the issue for
this invalid instruction encoding by adding an MCContext field to class
WebAssemblyMCCodeEmitter, which lets the emitter report an error instead.

Reviewed By: sbc100, bryanpkc

Differential Revision: https://reviews.llvm.org/D154734

16 months ago[X86][FP16] Fix mis-combination from FMULC to FCMULC
Phoebe Wang [Tue, 11 Jul 2023 14:08:10 +0000 (22:08 +0800)]
[X86][FP16] Fix mis-combination from FMULC to FCMULC

The combination was designed to combine a negative imaginary value
rather than a full negative complex value.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D154213
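
The distinction in the message can be sketched with ordinary complex arithmetic. This assumes the "negative imaginary value" corresponds to a complex conjugate, which is the reading suggested by the FCMULC name; it is not taken from the patch itself. The helper names are hypothetical.

```cpp
#include <complex>

using Complex = std::complex<float>;

// Multiply by the conjugate of b: only the imaginary part of b is negated.
inline Complex mulByConjugate(Complex a, Complex b) { return a * std::conj(b); }

// Multiply by the fully negated b: both real and imaginary parts are negated.
inline Complex mulByNegated(Complex a, Complex b) { return a * (-b); }
```

For a = 1 + 2i and b = 3 + 4i the first helper yields 11 + 2i and the second yields 5 - 10i, so treating a full negation as a conjugate would change the result.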

16 months ago[mlir][transform] Add transform.select op
Matthias Springer [Tue, 11 Jul 2023 14:09:41 +0000 (16:09 +0200)]
[mlir][transform] Add transform.select op

This transform op can be used to select all payload ops with a given name from a handle.

Differential Revision: https://reviews.llvm.org/D154956

16 months ago[MLIR][Presburger] Optimize for intersect
gilsaia [Tue, 11 Jul 2023 13:53:35 +0000 (19:23 +0530)]
[MLIR][Presburger] Optimize for intersect

Added a series of optimizations to the Intersect function of PresburgerRelation, referring to the ISL implementation.
Tested it on a simple benchmark that I implemented myself, which shows that it can speed up the Intersect operation.

The benchmark can be found here: https://github.com/gilsaia/llvm-project-test-fpl/blob/develop_benchmark/mlir/benchmark/presburger/Benchmark.cpp

The overall results for Intersect are as follows
{F28191553}

The results for each case are as follows
{F28191556}

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D154771

16 months ago[Bazel] Fixup for D153758, D153850, and D153861 (global-isel-combiner-matchtable)
NAKAMURA Takumi [Tue, 11 Jul 2023 13:42:43 +0000 (22:42 +0900)]
[Bazel] Fixup for D153758, D153850, and D153861 (global-isel-combiner-matchtable)

16 months ago[NFC][AMDGPULowerModuleLDSPass] Add const to some variables/parameters
Juan Manuel MARTINEZ CAAMAÑO [Tue, 11 Jul 2023 13:51:22 +0000 (15:51 +0200)]
[NFC][AMDGPULowerModuleLDSPass] Add const to some variables/parameters

Moving out some changes not related to the bugfix in https://reviews.llvm.org/D154946

Reviewed By: JonChesterfield, arsenm

Differential Revision: https://reviews.llvm.org/D154959

16 months agoFix shr/and pair replace with bfe
Georgi Mirazchiyski [Tue, 11 Jul 2023 10:22:35 +0000 (11:22 +0100)]
Fix shr/and pair replace with bfe

Co-Authored-By: Aidan Belton <aidan.belton@codeplay.com>
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D117118
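
For readers unfamiliar with the pattern named in the title, a shift-right followed by an AND with a low-bit mask extracts a bit field. The sketch below is plain C++ and not the backend code; the function names are hypothetical, and `shrAnd` assumes `pos < 32`.

```cpp
#include <cstdint>

// Shift-and-mask form: move the field down to bit 0, then keep `len` bits.
// Precondition (assumed for this sketch): pos < 32.
inline uint32_t shrAnd(uint32_t x, unsigned pos, unsigned len) {
  uint32_t mask = (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
  return (x >> pos) & mask;
}

// Reference bit-field extract, written bit by bit for comparison.
inline uint32_t extractBits(uint32_t x, unsigned pos, unsigned len) {
  uint32_t result = 0;
  for (unsigned i = 0; i < len && pos + i < 32; ++i)
    result |= ((x >> (pos + i)) & 1u) << i;
  return result;
}
```

Both functions agree whenever the precondition holds, which is the equivalence a shr/and to bit-field-extract rewrite relies on.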

16 months ago[ARM] mark tMOVi32imm as killing flags
Simon Wallis [Tue, 11 Jul 2023 12:11:30 +0000 (13:11 +0100)]
[ARM] mark tMOVi32imm as killing flags

Mark the tMOVi32imm pseudo instr as killing the flags register.

The pseudo instruction expands to a sequence of 7 movs/lsls/adds
instructions, which are all Thumb-1 flag setting instructions.

For a test case, take an existing arm test which checks for
"Don't CSE a cmp across a call that clobbers CPSR."
and retarget it at thumbv6m execute-only.

Reviewed By: stuij

Differential Revision: https://reviews.llvm.org/D154845

Change-Id: I8f8209fbc40a833f8875629937b9606c1e2c021d

16 months ago[mlir][llvm] Define annotation intrinsics
Victor Perez [Mon, 10 Jul 2023 09:01:38 +0000 (10:01 +0100)]
[mlir][llvm] Define annotation intrinsics

Define `llvm.intr.var.annotation`, `llvm.intr.ptr.annotation` and
`llvm.intr.annotation` in the llvm dialect as counterparts of the
`llvm.var.annotation`, `llvm.ptr.annotation` and `llvm.annotation` intrinsics.

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Differential Revision: https://reviews.llvm.org/D154842

16 months ago[mlir][Linalg] NFC - Improve some transform op builders
Nicolas Vasilache [Tue, 11 Jul 2023 13:14:39 +0000 (13:14 +0000)]
[mlir][Linalg] NFC - Improve some transform op builders

16 months ago[X86] shuffle-vs-trunc-256.ll - move comment outside test. NFC.
Simon Pilgrim [Tue, 11 Jul 2023 13:12:32 +0000 (14:12 +0100)]
[X86] shuffle-vs-trunc-256.ll - move comment outside test. NFC.

16 months agoValueTracking: Implement computeKnownFPClass for ldexp
Matt Arsenault [Fri, 28 Apr 2023 17:58:33 +0000 (13:58 -0400)]
ValueTracking: Implement computeKnownFPClass for ldexp

https://reviews.llvm.org/D149590

16 months agoValueTracking: Add more baseline ldexp tests for computeKnownFPClass
Matt Arsenault [Tue, 4 Jul 2023 11:43:30 +0000 (07:43 -0400)]
ValueTracking: Add more baseline ldexp tests for computeKnownFPClass

16 months ago[clang] Add test for CWG1710 and related issues
Vlad Serebrennikov [Tue, 11 Jul 2023 13:23:35 +0000 (16:23 +0300)]
[clang] Add test for CWG1710 and related issues

Those issues focus on the `template` keyword being optional in certain type-only contexts (base specifiers, member initializers, typename specifiers), as opposed to being disallowed by the grammar, or required by some implementations. GCC accepts all the tests this patch touches since 10, others fail on various tests: https://godbolt.org/z/1M6KE3W1a

It should be noted that the wording in [[ https://cplusplus.github.io/CWG/issues/1710.html | 1710 ]] that resolves those issues has been substantially changed by [[ https://wg21.link/p1787 | P1787 ]]. I can't find the post-P1787 wording that covers those issues, but I can't find the intent of changing relevant behavior in P1787 either, so I assume that intent of the 1710 resolution is preserved somewhere.

This patch covers the following issues:
[[ https://cplusplus.github.io/CWG/issues/314.html  | CWG314 ]]
[[ https://cplusplus.github.io/CWG/issues/343.html  | CWG343 ]]
[[ https://cplusplus.github.io/CWG/issues/1710.html | CWG1710 ]]
[[ https://cplusplus.github.io/CWG/issues/1794.html | CWG1794 ]]
[[ https://cplusplus.github.io/CWG/issues/1812.html | CWG1812 ]]

Reviewed By: #clang-language-wg, cor3ntin

Differential Revision: https://reviews.llvm.org/D151697

16 months ago[InstCombine] Don't handle constants in de morgan folds (PR63791)
Nikita Popov [Tue, 11 Jul 2023 13:15:43 +0000 (15:15 +0200)]
[InstCombine] Don't handle constants in de morgan folds (PR63791)

If the and/or operand is an immediate constant, it will get folded
away anyway. Don't try to freely invert those operands.

A particularly degenerate case of this arises when both operands
are constant and the result is a constant, in which case we try
to invert users of a constant, resulting in an assertion failure.

Fixes https://github.com/llvm/llvm-project/issues/63791.
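
The identities behind a De Morgan fold can be checked on plain integers. This is a minimal sketch of the algebra only, with hypothetical function names, and says nothing about how InstCombine implements the fold.

```cpp
#include <cstdint>

// De Morgan: ~(a & b) is equivalent to ~a | ~b.
inline uint32_t notOfAnd(uint32_t a, uint32_t b) { return ~(a & b); }
inline uint32_t orOfNots(uint32_t a, uint32_t b) { return ~a | ~b; }

// De Morgan: ~(a | b) is equivalent to ~a & ~b.
inline uint32_t notOfOr(uint32_t a, uint32_t b) { return ~(a | b); }
inline uint32_t andOfNots(uint32_t a, uint32_t b) { return ~a & ~b; }
```

When both operands are constants, every one of these expressions is itself a constant, which matches the degenerate case the message describes.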

16 months ago[gn build] Port 053d9e5832c7
LLVM GN Syncbot [Tue, 11 Jul 2023 13:04:05 +0000 (13:04 +0000)]
[gn build] Port 053d9e5832c7

16 months ago[include-cleaner] Fix the `fixIncludes` API not respecting main-file header.
Haojian Wu [Tue, 11 Jul 2023 11:41:49 +0000 (13:41 +0200)]
[include-cleaner] Fix the `fixIncludes` API not respecting main-file header.

The fixIncludes was using the `input` as the main file path, which
results in inserting headers in the wrong places.

We need the main file path so that we can get the real main-file
header.

Differential Revision: https://reviews.llvm.org/D154950

16 months ago[OpenMP][NFC] lit: Allow setting default environment variables for test
Joachim Jenke [Tue, 11 Jul 2023 12:53:17 +0000 (14:53 +0200)]
[OpenMP][NFC] lit: Allow setting default environment variables for test

Add a CHECK_OPENMP_ENV environment variable whose contents are passed as
environment variables to the tests (make check-* targets). This provides a
handy way to exercise various OpenMP code paths with different settings
during development.

For example, to change the default barrier pattern:
```
$ env CHECK_OPENMP_ENV="KMP_FORKJOIN_BARRIER_PATTERN=hier,hier \
KMP_PLAIN_BARRIER_PATTERN=hier,hier \
KMP_REDUCTION_BARRIER_PATTERN=hier,hier" \
ninja check-openmp
```

Even with this, each test can set appropriate environment variables if needed
as before.

Also, this commit adds missing documentation about how to run tests to the README.

Patch provided by t-msn

Differential Revision: https://reviews.llvm.org/D122645

16 months ago[InstCombine] Extract "freely invert" related helpers (NFC)
Nikita Popov [Tue, 11 Jul 2023 12:59:50 +0000 (14:59 +0200)]
[InstCombine] Extract "freely invert" related helpers (NFC)

16 months ago[libc++] Move __thread_id out of <__threading_support>
Louis Dionne [Tue, 4 Jul 2023 20:10:45 +0000 (16:10 -0400)]
[libc++] Move __thread_id out of <__threading_support>

This makes <__threading_support> closer to handling only the bridge
between the system's implementation of threading and the rest of libc++.

Differential Revision: https://reviews.llvm.org/D154464

16 months ago[libc++] Make `stop_token` experimental
Hui [Mon, 10 Jul 2023 14:55:55 +0000 (10:55 -0400)]
[libc++] Make `stop_token` experimental

There are discussions about different ways of implementing `stop_token` to make it more performant.
Mark `stop_token` as experimental to allow us to change the design before it is shipped.

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Differential Revision: https://reviews.llvm.org/D154700

16 months ago[FuncSpec] Prefer DataLayout-aware constant folding of GEPs.
Alexandros Lamprineas [Tue, 4 Jul 2023 08:38:58 +0000 (09:38 +0100)]
[FuncSpec] Prefer DataLayout-aware constant folding of GEPs.

As shown in D154820, the DataLayout-independent constant folding
interface is not good enough for handling GEPs. Instead we should
be using the DataLayout-aware constant folding interface. Since
there isn't a method to specifically handle GEPs we can use the
one which folds generic instruction operands.

Differential Revision: https://reviews.llvm.org/D154821

16 months ago[FuncSpec][NFC] Improve the unittest coverage for constant folding of GEPs.
Alexandros Lamprineas [Mon, 10 Jul 2023 08:33:01 +0000 (09:33 +0100)]
[FuncSpec][NFC] Improve the unittest coverage for constant folding of GEPs.

The InstCostVisitor is currently using the DataLayout-independent constant
folding interface. This is a workaround since we can't directly call
ConstantExpr::getGetElementPtr due to deprecation. This patch shows that
the constant folding interface we are using is not good enough.

Differential Revision: https://reviews.llvm.org/D154820

16 months ago[gn] port 8444038d160d4 (-gen-global-isel-combiner-matchtable for AMDGPU)
Nico Weber [Tue, 11 Jul 2023 12:22:53 +0000 (08:22 -0400)]
[gn] port 8444038d160d4 (-gen-global-isel-combiner-matchtable for AMDGPU)

16 months ago[gn] port 2f608131b44c (-gen-global-isel-combiner-matchtable for Mips)
Nico Weber [Tue, 11 Jul 2023 12:21:49 +0000 (08:21 -0400)]
[gn] port 2f608131b44c (-gen-global-isel-combiner-matchtable for Mips)

16 months ago[gn] port 87fb0ea27eeb (bolt DIEBuilder)
Nico Weber [Fri, 7 Jul 2023 12:01:12 +0000 (08:01 -0400)]
[gn] port 87fb0ea27eeb (bolt DIEBuilder)

16 months ago[gn] port 655714a300ff303 (-gen-global-isel-combiner-matchtable)
Nico Weber [Tue, 11 Jul 2023 12:15:24 +0000 (08:15 -0400)]
[gn] port 655714a300ff303 (-gen-global-isel-combiner-matchtable)

16 months ago[gn] port 1fe7d9c79967
Nico Weber [Tue, 11 Jul 2023 12:14:54 +0000 (08:14 -0400)]
[gn] port 1fe7d9c79967

16 months ago[gn] port c0719f3bacd5
Nico Weber [Tue, 11 Jul 2023 12:05:41 +0000 (08:05 -0400)]
[gn] port c0719f3bacd5

16 months ago[gn] port 908d0d54b82d
Nico Weber [Tue, 11 Jul 2023 12:05:30 +0000 (08:05 -0400)]
[gn] port 908d0d54b82d

16 months ago[OpenMP] [OMPT] [amdgpu] [5/8] Implemented device init/fini/load callbacks
Michael Halkenhaeuser [Tue, 20 Jun 2023 16:24:05 +0000 (18:24 +0200)]
[OpenMP] [OMPT] [amdgpu] [5/8] Implemented device init/fini/load callbacks

Added support in the generic plugin to invoke registered callbacks.

Depends on D124070

Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)

Differential Revision: https://reviews.llvm.org/D124652

16 months ago[ELF] Make subsequent opens to auxiliary files append
Alex Brachet [Tue, 11 Jul 2023 11:07:53 +0000 (11:07 +0000)]
[ELF] Make subsequent opens to auxiliary files append

Previously, the same file could be used across diagnostic options but
the file would be silently overwritten by whichever option got
handled last.

Differential Revision: https://reviews.llvm.org/D153873
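
The behavior described above can be sketched with standard file streams: truncate a path the first time it is opened, and append on every later open. The `AuxFileOpener` class is hypothetical and written only for this illustration; it is not the linker's actual implementation.

```cpp
#include <fstream>
#include <set>
#include <string>

// Hypothetical helper: remembers which paths were already opened so that
// later opens of the same path append instead of overwriting.
class AuxFileOpener {
  std::set<std::string> opened_;

public:
  std::ofstream open(const std::string &path) {
    // insert() reports whether the path was newly added to the set.
    const bool firstOpen = opened_.insert(path).second;
    const std::ios::openmode mode =
        std::ios::out | (firstOpen ? std::ios::trunc : std::ios::app);
    return std::ofstream(path, mode);
  }
};
```

Two options that name the same output file then each contribute their own text, rather than the last one silently replacing the first.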

16 months ago[RISCV] Merge rv32/rv64 vector reduction intrinsic tests that have the same content...
Jim Lin [Tue, 11 Jul 2023 08:59:14 +0000 (16:59 +0800)]
[RISCV] Merge rv32/rv64 vector reduction intrinsic tests that have the same content. NFC.

16 months ago[NFC][AMDGPULowerModuleLDSPass] Remove dead variable
Juan Manuel MARTINEZ CAAMAÑO [Fri, 7 Jul 2023 13:18:10 +0000 (15:18 +0200)]
[NFC][AMDGPULowerModuleLDSPass] Remove dead variable

16 months ago[flang][hlfir]: fix associate of expr with more than one use
Tom Eccles [Fri, 7 Jul 2023 12:54:18 +0000 (12:54 +0000)]
[flang][hlfir]: fix associate of expr with more than one use

Make a copy of the expression and associate that so that this is the
only use.

So far as I know, we don't currently generate code for an associate with
more than one use. This is here just in case.

Depends on D154715

Differential Revision: https://reviews.llvm.org/D154721

16 months ago[flang][hlfir] use adaptor in associate bufferization
Tom Eccles [Fri, 7 Jul 2023 13:00:11 +0000 (13:00 +0000)]
[flang][hlfir] use adaptor in associate bufferization

The associate operation checks if it is the only use of the hlfir.expr,
and if so it can take ownership of the hlfir.expr instead of copying it
(move semantics).

If this check is done by accessing the associate operation's arguments
directly (not through the AssociateOpAdaptor), the expression uses will
contain some operations which have been deleted. These can include prior
copies of the same associate operation, if that operation was cloned
(e.g. to lower a hlfir.elemental into a fir.do_loop). Accessing the
bufferized expression instead of the old hlfir.expr through the adaptor
avoids this false positive.

Differential Revision: https://reviews.llvm.org/D154715

16 months ago[X86] combineAndMaskToShift - match constant splat with X86::isConstantSplat
Simon Pilgrim [Tue, 11 Jul 2023 10:25:12 +0000 (11:25 +0100)]
[X86] combineAndMaskToShift - match constant splat with X86::isConstantSplat

Using X86::isConstantSplat instead of ISD::isConstantSplatVector allows us to detect constant masks after they've been lowered to constant pool loads.

Addresses regression from D154592

16 months ago[mlir][nvgpu] Implement `nvgpu.device_async_copy` by NVVMToLLVM Pass
Guray Ozen [Tue, 11 Jul 2023 09:50:13 +0000 (11:50 +0200)]
[mlir][nvgpu] Implement `nvgpu.device_async_copy` by NVVMToLLVM Pass

`nvgpu.device_async_copy` is lowered into the `cp.async` PTX instruction. However, the NVPTX backend does not support all of its modes, especially when zero padding is needed. Therefore, the current MLIR implementation generates inline assembly for that.

This work simplifies PTX generation for `nvgpu.device_async_copy`, and implements it by `NVVMToLLVM` Pass.

Depends on D154060

Reviewed By: nicolasvasilache, manishucsd

Differential Revision: https://reviews.llvm.org/D154345