platform/upstream/llvm.git
2 years ago[llvm] add zstd to `llvm::compression` namespace
Cole Kissane [Tue, 19 Jul 2022 17:54:35 +0000 (10:54 -0700)]
[llvm] add zstd to `llvm::compression` namespace

- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`
- Debian users should install libzstd from source when using `LLVM_ENABLE_ZSTD=FORCE_ON`, due to this bug: https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956

Reviewed By: leonardchan, MaskRay

Differential Revision: https://reviews.llvm.org/D128465

2 years ago[lld-macho] Support folding of functions with identical LSDAs
Jez Ng [Tue, 19 Jul 2022 17:18:54 +0000 (13:18 -0400)]
[lld-macho] Support folding of functions with identical LSDAs

To do this, we need to slice away the LSDA pointer, just like we are
slicing away the functionAddress pointer.

No observable difference in perf on chromium_framework:

             base           diff           difference (95% CI)
  sys_time   1.769 ± 0.068  1.761 ± 0.065  [  -2.7% ..   +1.8%]
  user_time  9.517 ± 0.110  9.528 ± 0.116  [  -0.6% ..   +0.8%]
  wall_time  8.291 ± 0.174  8.307 ± 0.183  [  -1.1% ..   +1.5%]
  samples    21             25

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D129830

2 years agoRevert "[mlir][ods] (NFC) Remove warning in `AttrOrTypeDef`"
Jeff Niu [Tue, 19 Jul 2022 17:25:24 +0000 (10:25 -0700)]
Revert "[mlir][ods] (NFC) Remove warning in `AttrOrTypeDef`"

This reverts commit e45ef5ebf4402e553c9a0b10e8765811cc33bbdd.

2 years agoRevert "[Libomptarget] Make libomptarget an LLVM library"
Jon Chesterfield [Tue, 19 Jul 2022 16:59:45 +0000 (17:59 +0100)]
Revert "[Libomptarget] Make libomptarget an LLVM library"

This reverts commit 70039be62774ae8fc53bb3b8f1bdbd2b0efb3355.

2 years ago[amdgpu] Implement lds kernel id intrinsic
Jon Chesterfield [Tue, 19 Jul 2022 16:46:17 +0000 (17:46 +0100)]
[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use in lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsics already,
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

In the general case, variables must be placed at different addresses
so that they can be compactly allocated, so an indirect function call
needs some means of determining where a given variable was allocated.
It is sufficient to claim an arbitrary SGPR into which the kernel
writes an integer (in this implementation, based on metadata associated
with that kernel) and to pass it on to indirect call sites, where it
determines the variable address.

The intent is to emit a __const array of LDS addresses and index into it.
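
As a rough illustration of the table lookup described above, here is a toy model in Python. It is not the AMDGPU lowering: the variable names, sizes, packing policy, and the `lds_address` helper are all hypothetical, and alignment is ignored. It only shows why a per-kernel id is enough for an indirectly called function to find a variable once each kernel packs just the variables it can reach.

```python
# Toy model only: names, sizes, and layout policy are hypothetical.
VARIABLE_SIZES = {"a": 16, "b": 64, "c": 8}   # bytes per LDS variable
REACHABLE = [["a", "b"], ["b", "c"]]          # list index = kernel id


def build_address_table(reachable, sizes):
    """Return table[kernel_id][variable] -> LDS offset, packed per kernel."""
    table = []
    for variables in reachable:
        offset = 0
        row = {}
        for name in variables:
            row[name] = offset        # alignment is ignored in this sketch
            offset += sizes[name]
        table.append(row)
    return table


ADDRESS_TABLE = build_address_table(REACHABLE, VARIABLE_SIZES)


def lds_address(kernel_id, variable):
    """Lookup an indirectly called function could do with the id it receives."""
    return ADDRESS_TABLE[kernel_id][variable]
```

In this model the same variable `b` sits at a different offset in each kernel, and kernel 0 reserves no space for `c`.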

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060

2 years agoBitwise comparison intrinsics
Tarun Prabhu [Tue, 19 Jul 2022 16:24:29 +0000 (16:24 +0000)]
Bitwise comparison intrinsics

This patch implements lowering for the F08 bitwise comparison intrinsics
(BGE, BGT, BLE and BLT).

This does not create any runtime functions since the functionality is
simple enough to carry out in IR.

The existing semantic check has been changed because it unconditionally
converted the arguments to the largest possible integer type. This
resulted in the argument with the smaller bit-size being sign-extended.
However, the standard requires the argument with the smaller bit-size to
be zero-extended.
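
To make the extension issue concrete, here is a small Python sketch of the comparison semantics. It is not the flang lowering. The helper names are made up for illustration, and each argument is modelled as a value plus an explicit bit size.

```python
def bit_pattern(value: int, bits: int) -> int:
    """Two's-complement bit pattern of `value`, as a non-negative integer."""
    return value & ((1 << bits) - 1)


def sign_extend(pattern: int, from_bits: int, to_bits: int) -> int:
    """Widen a bit pattern by copying its sign bit into the new high bits."""
    if pattern & (1 << (from_bits - 1)):
        pattern |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return pattern


def bgt(i: int, i_bits: int, j: int, j_bits: int) -> bool:
    """Bitwise greater-than: the narrower argument is zero-extended."""
    # Zero-extension leaves the pattern unchanged; the new high bits are 0.
    return bit_pattern(i, i_bits) > bit_pattern(j, j_bits)


def bge(i: int, i_bits: int, j: int, j_bits: int) -> bool:
    """Bitwise greater-than-or-equal, with the same zero-extension rule."""
    return bit_pattern(i, i_bits) >= bit_pattern(j, j_bits)


def bgt_sign_extended(i: int, i_bits: int, j: int, j_bits: int) -> bool:
    """The behaviour described as wrong above: widen with sign-extension."""
    width = max(i_bits, j_bits)
    a = sign_extend(bit_pattern(i, i_bits), i_bits, width)
    b = sign_extend(bit_pattern(j, j_bits), j_bits, width)
    return a > b
```

With an 8-bit -1 (pattern 0xFF) and a 16-bit 256 (pattern 0x0100), zero-extension compares 0x00FF against 0x0100, while sign-extension compares 0xFFFF against 0x0100, so the two approaches disagree.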

Reviewed By: klausler, jeanPerier

Differential Revision: https://reviews.llvm.org/D127805

2 years ago[Libomptarget] Make libomptarget an LLVM library
Joseph Huber [Fri, 15 Jul 2022 16:10:18 +0000 (12:10 -0400)]
[Libomptarget] Make libomptarget an LLVM library

This patch makes libomptarget depend on LLVM libraries to be built. The
reason is that we already have an implicit dependency on LLVM headers
for ELF identification and extraction, as well as an optional dependency
on the LLVMSupport library for time tracing information. Furthermore,
upcoming changes require using more LLVM libraries; this will heavily
simplify some future code and open up the large number of useful LLVM
libraries to libomptarget.

This will make "standalone" builds of `libomptarget` more difficult for
vendors wishing to ship their own. Such builds will require a sufficiently
new version of LLVM to be installed on the system, which should be picked
up by the existing handling for the implicit headers.

The things this patch changes are as follows:
  - `libomptarget.so` links against LLVMSupport and LLVMObject
  - `libomptarget.so` is a symbolic link to `libomptarget.so.15`
  - If using a shared library build, user applications will depend on LLVM
    libraries as well
  - We can now use LLVM resources in Libomptarget.

Note that this patch only changes this to apply to libomptarget itself,
not the plugins. Additional patches will be necessary for that.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D129875

2 years ago[DSE] Revisit pointers that may no longer escape after removing another store
Arthur Eubanks [Tue, 12 Apr 2022 23:22:49 +0000 (16:22 -0700)]
[DSE] Revisit pointers that may no longer escape after removing another store

In dependent-capture, previously we'd see that %tmp4 is captured due to
the first store. We'd cache this info in CapturedBeforeReturn and
InvisibleToCallerAfterRet. The first store is then removed, causing
the cached values to be wrong.

We also need to revisit everything because normally we work backwards
when removing stores at the end of the function, but in this case
removing an earlier store causes a later store to be removable.

No compile time impact:
https://llvm-compile-time-tracker.com/compare.php?from=56796ae1a8db4c85dada28676f8303a5a3609c63&to=21b7e5248ffc423cd36c9d4a020085e363451465&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D123686

2 years ago[SimplifyLibCalls] avoid converting pow() to powi() with no FMF
Sanjay Patel [Tue, 19 Jul 2022 15:55:30 +0000 (11:55 -0400)]
[SimplifyLibCalls] avoid converting pow() to powi() with no FMF

powi() is not a standard math library function; it is specified
with non-strict semantics in the LangRef. We currently require
'afn' to do this transform when it needs a sqrt(), so I just
extended that requirement to the whole-number exponent too.

This bug was introduced with:
b17754bcaa14
...where we deferred expansion of pow() to later passes.

2 years ago[nfc][amdgpu] LDS. Move selection logic up the stack.
Jon Chesterfield [Tue, 19 Jul 2022 16:17:22 +0000 (17:17 +0100)]
[nfc][amdgpu] LDS. Move selection logic up the stack.

2 years ago[libclang][ObjC] Inherit availability attribute from containing decls or
Akira Hatanaka [Mon, 11 Jul 2022 17:01:23 +0000 (10:01 -0700)]
[libclang][ObjC] Inherit availability attribute from containing decls or
interface decls

This patch teaches getCursorPlatformAvailabilityForDecl to look for
availability attributes on the containing decls or interface decls if
the current decl doesn't have any availability attributes.

Differential Revision: https://reviews.llvm.org/D129504

2 years ago[mlir][ods] (NFC) Remove warning in `AttrOrTypeDef`
Jeff Niu [Tue, 19 Jul 2022 16:14:52 +0000 (09:14 -0700)]
[mlir][ods] (NFC) Remove warning in `AttrOrTypeDef`

This warning was added because using attribute or type assembly formats
with `skipDefaultBuilders` set could cause compilation errors, since the
required builder prototype may not necessarily be generated and would
need to be checked by hand. This patch removes the warning because a
warning that the generated C++ "might" not compile is not particularly
useful. Attempting to address the TODO (i.e. detect whether a builder of
the correct prototype is provided) would be fragile since it would not
be possible to account for implicit conversions, etc.

In general, ODS should not be emitting warnings in cases like these.

2 years ago[mlir][tblgen] Add support for extraClassDefinition in AttrDef
bhatuzdaname [Tue, 19 Jul 2022 15:54:24 +0000 (08:54 -0700)]
[mlir][tblgen] Add support for extraClassDefinition in AttrDef

For AttrDef declarations, place specified code in extraClassDefinition into the generated *.cpp.inc file.

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D129574

2 years ago[llvm][SVE] Remove redundant and when comparing against extending load
David Truby [Tue, 19 Jul 2022 13:13:10 +0000 (14:13 +0100)]
[llvm][SVE] Remove redundant and when comparing against extending load

When determining if an `and` should be merged into an extending load
the constant argument to the `and` is currently not checked if the
argument requires truncation. This prevents the combine happening when
the vector width is half the normal available vector width for SVE VLA
vectors.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D129281

2 years ago[NewPM] Print function/SCC size with -debug-pass-manager
Arthur Eubanks [Thu, 16 Jun 2022 20:30:12 +0000 (13:30 -0700)]
[NewPM] Print function/SCC size with -debug-pass-manager

This is helpful for debugging issues with very large functions or SCCs.
It is also helpful when function names are very long and it's hard to tell the number of nodes in an SCC.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D128003

2 years ago[mlir][NFC] Use proper c++ namespaces in .td files
lipracer [Tue, 19 Jul 2022 15:48:34 +0000 (08:48 -0700)]
[mlir][NFC] Use proper c++ namespaces in .td files

td files:
mlir::ArrayRef => llvm::ArrayRef
mlir::Optional=>llvm::Optional
mlir::SmallVector => llvm::SmallVector

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D128537

2 years ago[gn build] (manually) port 4539b44148918 (llvm-dwarfutil)
Nico Weber [Tue, 19 Jul 2022 15:50:54 +0000 (11:50 -0400)]
[gn build] (manually) port 4539b44148918 (llvm-dwarfutil)

2 years ago[gn build] (manually) port 8711fcae276a593
Nico Weber [Tue, 19 Jul 2022 15:38:00 +0000 (11:38 -0400)]
[gn build] (manually) port 8711fcae276a593

2 years ago[bazel] Fix the build after 18b92c66fe59a44f50bc211a418eaf48fe1cf7c1
Benjamin Kramer [Tue, 19 Jul 2022 15:34:39 +0000 (17:34 +0200)]
[bazel] Fix the build after 18b92c66fe59a44f50bc211a418eaf48fe1cf7c1

2 years ago[bazel] Remove libraries that don't build anymore after 5e83a5b4752da6631d79c446f21e5...
Benjamin Kramer [Tue, 19 Jul 2022 15:12:09 +0000 (17:12 +0200)]
[bazel] Remove libraries that don't build anymore after 5e83a5b4752da6631d79c446f21e5d128b5c5495

I don't know who uses these python extensions, probably nobody.

2 years ago[libc++] Treat incomplete features just like other experimental features
Louis Dionne [Thu, 30 Jun 2022 15:57:52 +0000 (11:57 -0400)]
[libc++] Treat incomplete features just like other experimental features

In particular remove the ability to expel incomplete features from the
library at configure-time, since this can now be done through the
_LIBCPP_ENABLE_EXPERIMENTAL macro.

Also, never provide symbols related to incomplete features inside the
dylib, instead provide them in c++experimental.a (this changes the
symbols list, but not for any configuration that should have shipped).

Differential Revision: https://reviews.llvm.org/D128928

2 years ago[libc++] Re-apply "Always build c++experimental.a"
Louis Dionne [Tue, 19 Jul 2022 14:44:06 +0000 (10:44 -0400)]
[libc++] Re-apply "Always build c++experimental.a"

This re-applies bb939931a1ad, which had been reverted by 09cebfb978de
because it broke Chromium. The issues seen by Chromium should be
addressed by 1d0f79558ca4.

Differential Revision: https://reviews.llvm.org/D128927

2 years ago[libc++] Make sure cxx_experimental links against libc++ headers
Louis Dionne [Tue, 19 Jul 2022 14:40:26 +0000 (10:40 -0400)]
[libc++] Make sure cxx_experimental links against libc++ headers

This should fix builds where we build neither the static nor the shared
library.

2 years agoRevert "Update some more tests with update_cc_test_checks.py"
Nicolai Hähnle [Tue, 19 Jul 2022 14:39:05 +0000 (16:39 +0200)]
Revert "Update some more tests with update_cc_test_checks.py"

This reverts commit 9fb33d52b045b6cc97f2f56fe5cd23b41de86ffe.

Buildbots are showing a number of regressions that don't reproduce
locally. Needs more investigating.

2 years ago[coro async] Add missing llvm.coro.id.async intrinsic to declaresCoroCleanupIntrinsics
Arnold Schwaighofer [Mon, 18 Jul 2022 17:46:57 +0000 (10:46 -0700)]
[coro async] Add missing llvm.coro.id.async intrinsic to declaresCoroCleanupIntrinsics

rdar://97214593

Differential Revision: https://reviews.llvm.org/D130038

2 years ago[flang][NFC] Drop `AbstractResultOptions` structure
Daniil Dudkin [Tue, 19 Jul 2022 14:22:39 +0000 (17:22 +0300)]
[flang][NFC] Drop `AbstractResultOptions` structure

`AbstractResultOptions` is an obsolete structure because `newArg` is used
only in `ReturnOpConversion`.
This change removes the struct, making the dependencies of the conversions
more straightforward.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D129485

2 years agoUpdate some more tests with update_cc_test_checks.py
Nicolai Hähnle [Tue, 19 Jul 2022 06:58:31 +0000 (08:58 +0200)]
Update some more tests with update_cc_test_checks.py

2 years ago[AMDGPU] Remove old operand from VOPC DPP
Joe Nash [Fri, 15 Jul 2022 17:49:02 +0000 (13:49 -0400)]
[AMDGPU] Remove old operand from VOPC DPP

For most DPP instructions, the old operand stores the value that was in
the current lane before the DPP operation, and is tied to the
destination. For VOPC DPP, this is unnecessary and incorrect.

There appears to have been a latent bug related to D122737 in
SIInstrInfo::isOperandLegal: if you checked whether a register operand was
legal when the InstructionDesc expected an immediate, it reported that it
was valid. The fix is necessary for, and tested in, this patch.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D130040

2 years agoAdd the FreeBSD AArch64 memory layout
Andrew Turner [Wed, 18 May 2022 11:02:26 +0000 (12:02 +0100)]
Add the FreeBSD AArch64 memory layout

Use the FreeBSD AArch64 memory layout values when building for it.
These are based on the x86_64 values, scaled to take into account the
larger address space on AArch64.

Reviewed by: vitalybuka

Differential Revision: https://reviews.llvm.org/D125883

2 years agoAdd the FreeBSD AArch64 shadow offset to llvm
Andrew Turner [Mon, 16 May 2022 16:20:52 +0000 (17:20 +0100)]
Add the FreeBSD AArch64 shadow offset to llvm

AArch64 has a larger address space than 64-bit x86. Use the larger
shadow offset on FreeBSD AArch64.

Reviewed by: vitalybuka

Differential Revision: https://reviews.llvm.org/D125873

2 years agoAdd the FreeBSD AArch64 memory layout
Andrew Turner [Mon, 16 May 2022 16:36:40 +0000 (17:36 +0100)]
Add the FreeBSD AArch64 memory layout

Use the FreeBSD AArch64 memory layout values when building for it.
These are based on the x86_64 values, scaled to take into account the
larger address space on AArch64.

Reviewed by: vitalybuka

Differential Revision: https://reviews.llvm.org/D125758

2 years agotsan: optimize DenseSlabAlloc
Dmitry Vyukov [Sat, 16 Jul 2022 09:48:18 +0000 (11:48 +0200)]
tsan: optimize DenseSlabAlloc

If lots of threads do lots of malloc/free and they overflow
per-pthread DenseSlabAlloc cache, it causes lots of contention:

  31.97%  race.old  race.old            [.] __sanitizer::StaticSpinMutex::LockSlow
  17.61%  race.old  race.old            [.] __tsan_read4
  10.77%  race.old  race.old            [.] __tsan::SlotLock

Optimize DenseSlabAlloc to use a lock-free stack of batches of nodes.
This way we don't take any locks in steady state at all and do only
1 push/pop per Refill/Drain.

Effect on the added benchmark:

$ TIME="%e %U %S %M" time ./test.old 36 5 2000000
34.51 978.22 175.67 5833592
32.53 891.73 167.03 5790036
36.17 1005.54 201.24 5802828
36.94 1004.76 226.58 5803188

$ TIME="%e %U %S %M" time ./test.new 36 5 2000000
26.44 720.99 13.45 5750704
25.92 721.98 13.58 5767764
26.33 725.15 13.41 5777936
25.93 713.49 13.41 5791796
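
For a quick read of the numbers above, the following Python sketch averages the four runs of each binary. It assumes GNU time's format specifiers, so the columns are elapsed seconds, user seconds, system seconds, and maximum resident set size in KB. The data is copied from the runs above; only the helper names are new.

```python
from statistics import mean

# Columns follow TIME="%e %U %S %M": elapsed s, user s, system s, max RSS (KB).
OLD = [
    (34.51, 978.22, 175.67, 5833592),
    (32.53, 891.73, 167.03, 5790036),
    (36.17, 1005.54, 201.24, 5802828),
    (36.94, 1004.76, 226.58, 5803188),
]
NEW = [
    (26.44, 720.99, 13.45, 5750704),
    (25.92, 721.98, 13.58, 5767764),
    (26.33, 725.15, 13.41, 5777936),
    (25.93, 713.49, 13.41, 5791796),
]


def column_means(runs):
    """Average each column across the runs."""
    return tuple(mean(column) for column in zip(*runs))


old_means = column_means(OLD)
new_means = column_means(NEW)

for name, old, new in zip(("elapsed", "user", "system", "max_rss_kb"),
                          old_means, new_means):
    # Relative change of the new binary against the old one.
    print(f"{name:>10}: {old:12.2f} -> {new:12.2f} ({(new - old) / old:+.1%})")
```

On these runs, mean elapsed time drops from about 35.0 s to about 26.2 s, and mean system time from about 192.6 s to about 13.5 s, which is consistent with the lock contention going away.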

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D130002

2 years ago[DAG] Call SimplifyDemandedBits from ISD::MUL nodes
Simon Pilgrim [Tue, 19 Jul 2022 13:10:53 +0000 (14:10 +0100)]
[DAG] Call SimplifyDemandedBits from ISD::MUL nodes

Noticed while triaging D129765.

2 years agoDon't vectorize PHIs in catchswitch blocks
William Schmidt [Mon, 18 Jul 2022 20:55:27 +0000 (13:55 -0700)]
Don't vectorize PHIs in catchswitch blocks

We currently assert in vectorizeTree(TreeEntry*) when processing a PHI
bundle in a block containing a catchswitch.  We attempt to set the
IRBuilder insertion point following the catchswitch, which is invalid.
This is done so that ShuffleBuilder.finalize() knows where to insert
a shuffle if one is needed.

To avoid this occurring, watch out for catchswitch blocks during
buildTree_rec() processing, and avoid adding PHIs in such blocks to
the vectorizable tree.  It is unlikely that constraining vectorization
over an exception path will cause a noticeable performance loss, so
this seems preferable to trying to anticipate when a shuffle will and
will not be required.

2 years ago[Local] Allow creating callbr with duplicate successors
Nikita Popov [Mon, 18 Jul 2022 10:08:00 +0000 (12:08 +0200)]
[Local] Allow creating callbr with duplicate successors

Since D129288, callbr is allowed to have duplicate successors. This
patch removes a limitation which prevents optimizations from actually
producing such callbrs.

Differential Revision: https://reviews.llvm.org/D129997

2 years ago[Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
Alexey Lapshin [Sun, 10 Jul 2022 17:11:55 +0000 (20:11 +0300)]
[Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.

This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil is a tool for processing the debug info (DWARF) located in built binary files, to improve debug info quality and reduce debug info size. The patch currently implements a smaller set of command-line options than the proposal:

```
./llvm-dwarfutil [options] <input file> <output file>

  --garbage-collection    Do garbage collection for debug info(default)
  -j <value>              Alias for --num-threads
  --no-garbage-collection Don`t do garbage collection for debug info
  --no-odr-deduplication  Don`t do ODR deduplication for debug types
  --no-odr                Alias for --no-odr-deduplication
  --no-separate-debug-file
                          Create single output file, containing debug tables(default)
  --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
  --odr-deduplication     Do ODR deduplication for debug types(default)
  --odr                   Alias for --odr-deduplication
  --separate-debug-file   Create two output files: file w/o debug tables and file with debug tables
  --tombstone [bfd,maxpc,exec,universal]
                          Tombstone value used as a marker of invalid address(default: universal)
    =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
    =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
    =exec - Match with address ranges of executable sections
    =universal - Both: bfd and maxpc
```

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D86539

2 years ago[flang] Fix flang-to-external-fc --version
serge-sans-paille [Tue, 19 Jul 2022 10:53:01 +0000 (06:53 -0400)]
[flang] Fix flang-to-external-fc --version

Substitution of @FLANG_VERSION@ wasn't correctly performed.

Differential Revision: https://reviews.llvm.org/D130074

2 years agoAdditional regression test for a crash during reorder masked gather nodes
Evgeniy Brevnov [Tue, 19 Jul 2022 11:46:43 +0000 (18:46 +0700)]
Additional regression test for a crash during reorder masked gather nodes

2 years ago[mlir][Linalg] Add a TileToForeachThread transform.
Nicolas Vasilache [Tue, 19 Jul 2022 08:33:21 +0000 (01:33 -0700)]
[mlir][Linalg] Add a TileToForeachThread transform.

This revision adds a new transformation to tile a TilingInterface `op` to a tiled `scf.foreach_thread`, applying
tiling by `num_threads`.
If non-empty, the `threadDimMapping` is added as an attribute to the resulting `scf.foreach_thread`.
0-tile sizes (i.e. tile by the full size of the data) are used to encode
that a dimension is not tiled.
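
As a rough sketch of what tiling one dimension by `num_threads` could look like, the Python below splits an index range into per-thread tiles. The ceiling-division split and the `thread_tiles` helper are assumptions made for illustration, not taken from the transform itself. A 0 entry is treated as "not tiled", mirroring the encoding described above.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive b."""
    return -(-a // b)


def thread_tiles(dim_size: int, num_threads: int):
    """Return the [lo, hi) range each thread handles along one dimension.

    Assumption: 0 means the dimension is not tiled, so a single tile
    covers the full size of the data.
    """
    if num_threads == 0:
        return [(0, dim_size)]
    tile = ceil_div(dim_size, num_threads)
    tiles = []
    for thread in range(num_threads):
        lo = min(thread * tile, dim_size)
        hi = min(lo + tile, dim_size)   # the last tile may be short or empty
        tiles.append((lo, hi))
    return tiles
```

For example, a dimension of size 10 split across 4 threads gives tiles of size 3, 3, 3 and 1 under this scheme.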

Differential Revision: https://reviews.llvm.org/D129577

2 years ago[LegalizeDAG] Propagate alignment in ExpandExtractFromVectorThroughStack
Benjamin Kramer [Tue, 19 Jul 2022 10:35:47 +0000 (12:35 +0200)]
[LegalizeDAG] Propagate alignment in ExpandExtractFromVectorThroughStack

Unlike the name suggests, this can reuse any store as a base for a
memory-based vector extract. If that store is underaligned, the loads
created to extract will have an invalid alignment. Since most CPUs are
forgiving wrt alignment, this is almost never an issue; on x86 it is
only reproducible by extracting a 128-bit vector out of a wider vector.

I tried making a test case in the context of
https://reviews.llvm.org/D127982 but it's really really fragile, as the
output pretty much looks like a missed optimization.

2 years ago[ARM] Remove VBICimm if no cleared bits are demanded
David Green [Tue, 19 Jul 2022 10:53:47 +0000 (11:53 +0100)]
[ARM] Remove VBICimm if no cleared bits are demanded

If none of the bits cleared by a VBICimm are demanded, we can remove the
node entirely and use the input operand instead.
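
The reasoning can be checked on a single lane with a few lines of Python. This is a sketch of the bit arithmetic only, not of the ARM DAG combine; the 32-bit lane width and the helper names are assumptions.

```python
LANE_MASK = 0xFFFFFFFF  # model a single 32-bit lane


def bic(x: int, imm: int) -> int:
    """Bit clear: keep the bits of x that are not set in imm."""
    return x & ~imm & LANE_MASK


def bic_is_removable(imm: int, demanded: int) -> bool:
    """The node is a no-op for its users if no cleared bit is demanded."""
    return (imm & demanded) == 0


def same_on_demanded(x: int, imm: int, demanded: int) -> bool:
    """Do bic(x, imm) and x agree on every demanded bit?"""
    return (bic(x, imm) & demanded) == (x & demanded)
```

When the cleared bits 0xFF00 and the demanded bits 0x00FF do not overlap, `bic(x, imm)` and `x` agree on every demanded bit, so the input can stand in for the node.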

Differential Revision: https://reviews.llvm.org/D129966

2 years ago[LV] Remove unnecessary cast in widenCallInstruction. (NFC)
Florian Hahn [Tue, 19 Jul 2022 10:23:24 +0000 (11:23 +0100)]
[LV] Remove unnecessary cast in widenCallInstruction. (NFC)

2 years agoFix signed/unsigned comparison mismatch warning
Simon Pilgrim [Tue, 19 Jul 2022 10:13:31 +0000 (11:13 +0100)]
Fix signed/unsigned comparison mismatch warning

2 years ago[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC...
Simon Pilgrim [Tue, 19 Jul 2022 09:58:27 +0000 (10:58 +0100)]
[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits

The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits.

This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115

Alive2: https://alive2.llvm.org/ce/z/fl7T7K
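
Independently of the Alive2 proof, the relaxed condition can be checked by brute force on a narrow type. The Python sketch below models a 4-bit value with a logical right shift only; the width, the restriction to logical shifts, and the helper names are assumptions made to keep the search small.

```python
BITS = 4
ALL_ONES = (1 << BITS) - 1


def fold_is_valid(shift: int, xor_c: int, demanded: int) -> bool:
    """XorC only has to match the shifted all-ones mask on demanded bits."""
    return (xor_c & demanded) == ((ALL_ONES >> shift) & demanded)


def original(x: int, shift: int, xor_c: int) -> int:
    """xor (X >> ShiftC), XorC"""
    return (x >> shift) ^ xor_c


def folded(x: int, shift: int) -> int:
    """(not X) >> ShiftC, with a logical shift"""
    return (~x & ALL_ONES) >> shift


def counterexamples():
    """Inputs where the fold is allowed but a demanded bit differs."""
    bad = []
    for shift in range(BITS):
        for xor_c in range(ALL_ONES + 1):
            for demanded in range(ALL_ONES + 1):
                if not fold_is_valid(shift, xor_c, demanded):
                    continue
                for x in range(ALL_ONES + 1):
                    if (original(x, shift, xor_c) ^ folded(x, shift)) & demanded:
                        bad.append((x, shift, xor_c, demanded))
    return bad
```

For instance, with a shift of 2 the shifted all-bits mask is 0b0011. An XOR constant of 0b0111 does not equal it, but still qualifies when only bit 0 is demanded.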

Differential Revision: https://reviews.llvm.org/D129933

2 years ago[AMDGPU] Set amdgpu-memory-bound if a basic block has dense global memory access
Abinav Puthan Purayil [Wed, 13 Jul 2022 06:40:02 +0000 (12:10 +0530)]
[AMDGPU] Set amdgpu-memory-bound if a basic block has dense global memory access

AMDGPUPerfHintAnalysis doesn't set the memory bound attribute if
FuncInfo::InstCost outweighs MemInstCost even if we have a basic block
with relatively high global memory access. GCNSchedStrategy could revert
optimal scheduling in favour of occupancy which seems to degrade
performance for some kernels. This change introduces the
HasDenseGlobalMemAcc metric in the heuristic that makes the analysis
more conservative in these cases.

This fixes SWDEV-334259/SWDEV-343932

Differential Revision: https://reviews.llvm.org/D129759

2 years ago[AMDGPU] Pre-commit tests for D129759
Abinav Puthan Purayil [Wed, 13 Jul 2022 18:04:57 +0000 (23:34 +0530)]
[AMDGPU] Pre-commit tests for D129759

Differential Revision: https://reviews.llvm.org/D129760

2 years ago[llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping
David Spickett [Thu, 14 Jul 2022 09:36:03 +0000 (09:36 +0000)]
[llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping

Fixes https://github.com/llvm/llvm-project/issues/56484

H registers are 16 bit views of AArch64's Neon registers and
B are the 8 bit views.

msvc does not support 16 bit float (some mention in DirectX but I
couldn't find a way to get to it) so for lack of a better reference
I'm using:
https://github.com/MicrosoftEdge/JsDbg/blob/85c9b41b33bb8f3496dbe400d912c32bb7cc496b/server/references/dia/include/cvconst.h
(the other microsoft-pdb repo is no longer up to date)

Luckily clang does support fp16 so a test is added for that.

There is no 8 bit float type so I had to get creative with the
test case. We're not testing for correct debug info here just
that we can select the B register and not crash in the process.

For FPCR it's never going to be passed as an argument so I've
not added a test for it. It is included to keep our list looking
the same as the reference.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D129774

2 years agoRevert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF."
Alexey Lapshin [Tue, 19 Jul 2022 09:16:24 +0000 (12:16 +0300)]
Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF."

This reverts commit e2147c26bd1522ad67a98836fbe94933eab869bb.

2 years ago[mlir] Ignore effects on allocated results when checking whether the op is trivially...
Markus Böck [Tue, 19 Jul 2022 08:58:25 +0000 (10:58 +0200)]
[mlir] Ignore effects on allocated results when checking whether the op is trivially dead.

In the current state, this is special-cased only for Allocation effects, but any effect on a result allocated by the operation may be ignored when checking whether the op may be removed, since none of them can be observed if the result is unused.

A use case for this is for IRs for languages which always initialize on allocation. To correctly model such operations, a Write as well as an Allocation effect should be placed on the result. This would prevent the Op from being deleted if unused however. This patch fixes that issue.

Differential Revision: https://reviews.llvm.org/D129854

2 years ago[LoopSimplifyCFG] Prevent use-def dominance breach by handling dead exits. PR56243
Max Kazantsev [Tue, 19 Jul 2022 08:22:44 +0000 (15:22 +0700)]
[LoopSimplifyCFG] Prevent use-def dominance breach by handling dead exits. PR56243

One of the transforms in LoopSimplifyCFG demands that the LCSSA form is
truly maintained for all values, tokens included; otherwise it may end up creating
a use that is not dominated by its def (and Phi creation for tokens is impossible).
Detect this situation early and skip the transform.

Differential Revision: https://reviews.llvm.org/D129984
Reviewed By: efriedma

2 years agoUpdate docs to note lzfse open source implementation
Jason Molenda [Tue, 19 Jul 2022 08:40:12 +0000 (01:40 -0700)]
Update docs to note lzfse open source implementation

2 years ago[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
Alexey Lapshin [Sun, 10 Jul 2022 17:11:55 +0000 (20:11 +0300)]
[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.

This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil is a tool for processing the debug info (DWARF) located in built binary files, to improve debug info quality and reduce debug info size. The patch currently implements a smaller set of command-line options than the proposal:

```
./llvm-dwarfutil [options] <input file> <output file>

  --garbage-collection    Do garbage collection for debug info(default)
  -j <value>              Alias for --num-threads
  --no-garbage-collection Don`t do garbage collection for debug info
  --no-odr-deduplication  Don`t do ODR deduplication for debug types
  --no-odr                Alias for --no-odr-deduplication
  --no-separate-debug-file
                          Create single output file, containing debug tables(default)
  --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
  --odr-deduplication     Do ODR deduplication for debug types(default)
  --odr                   Alias for --odr-deduplication
  --separate-debug-file   Create two output files: file w/o debug tables and file with debug tables
  --tombstone [bfd,maxpc,exec,universal]
                          Tombstone value used as a marker of invalid address(default: universal)
    =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
    =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
    =exec - Match with address ranges of executable sections
    =universal - Both: bfd and maxpc
```

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D86539

2 years ago[AArch64] Add patterns to fold zext(cmpeq(x, splat(0)))
Cullen Rhodes [Tue, 19 Jul 2022 07:51:28 +0000 (07:51 +0000)]
[AArch64] Add patterns to fold zext(cmpeq(x, splat(0)))

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D129626

2 years ago[X86] Add 64-bit implementation for __SSC_MARK
Xiang1 Zhang [Fri, 15 Jul 2022 03:18:33 +0000 (11:18 +0800)]
[X86] Add 64-bit implementation for __SSC_MARK

Reviewed By: craig.topper, pengfei.wang, jinsong
Differential Revision: https://reviews.llvm.org/D129826

2 years ago[LoopInfo] Allow cloning of callbr
Nikita Popov [Fri, 15 Jul 2022 10:35:08 +0000 (12:35 +0200)]
[LoopInfo] Allow cloning of callbr

After D129288, callbr is safe to clone without special handling.
This permits optimizations like loop unroll and loop unswitch on
loops containing callbrs.

Fixes https://github.com/llvm/llvm-project/issues/41834.

Differential Revision: https://reviews.llvm.org/D129993

2 years ago[pseudo] Implement a guard to determine function declarator.
Haojian Wu [Fri, 15 Jul 2022 14:15:31 +0000 (16:15 +0200)]
[pseudo] Implement a guard to determine function declarator.

This eliminates some simple-declaration/function-definition false
parses.

- implement a function to determine whether a declarator ForestNode is a
  function declarator;
- extend the standard declarator to two guarded function-declarator and
  non-function-declarator nonterminals;

Differential Revision: https://reviews.llvm.org/D129222

2 years ago[AArch64][SVE] Fold fadda(ptrue, x, select(mask, y, -0.0)) into fadda(mask, x, y)
Rosie Sumpter [Tue, 12 Jul 2022 07:40:59 +0000 (08:40 +0100)]
[AArch64][SVE] Fold fadda(ptrue, x, select(mask, y, -0.0)) into fadda(mask, x, y)

This patch adds an SVE pattern to recognize the use of a select with an
fadda in the form fadda(ptrue, x, select(mask, y, -0.0)). In this case
the select can be folded away, with the select mask used as the
predicate for fadda. This improves the codegen when vectorizing loops
with ordered fp reductions.
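
The fold relies on -0.0 being the identity for floating-point addition, so adding it in an inactive lane is the same as skipping the lane. The Python sketch below models this with scalar loops. It is not the SVE pattern, and `fadda`, `select`, and `same_bits` are illustrative stand-ins.

```python
import struct


def fadda(pred, init, values):
    """Strictly ordered add reduction over the lanes where pred is true."""
    acc = init
    for active, value in zip(pred, values):
        if active:
            acc = acc + value
    return acc


def select(mask, values, splat):
    """Per-lane select between a vector and a splatted scalar."""
    return [value if m else splat for m, value in zip(mask, values)]


def same_bits(a: float, b: float) -> bool:
    """Bitwise equality, so +0.0 and -0.0 are told apart."""
    return struct.pack("<d", a) == struct.pack("<d", b)


def before_fold(mask, x, y):
    """fadda(ptrue, x, select(mask, y, -0.0))"""
    return fadda([True] * len(y), x, select(mask, y, -0.0))


def after_fold(mask, x, y):
    """fadda(mask, x, y)"""
    return fadda(mask, x, y)
```

Using +0.0 as the inactive value would not be safe: with an initial value of -0.0 and no active lanes, the unfolded form would produce +0.0, while the predicated reduction keeps -0.0.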

Differential Revision: https://reviews.llvm.org/D129623

2 years ago[mlir][sparse][NFC] Update remaining test cases
Matthias Springer [Tue, 19 Jul 2022 07:20:32 +0000 (09:20 +0200)]
[mlir][sparse][NFC] Update remaining test cases

No more to_memref, memref.alloc or memref.dealloc when possible.

Differential Revision: https://reviews.llvm.org/D130023

2 years ago[mlir][bufferization][NFC] Move sparse_tensor.release to bufferization dialect
Matthias Springer [Tue, 19 Jul 2022 07:13:53 +0000 (09:13 +0200)]
[mlir][bufferization][NFC] Move sparse_tensor.release to bufferization dialect

This op used to belong to the sparse dialect, but there are use cases for dense bufferization as well. (E.g., when a tensor alloc is returned from a function and should be deallocated at the call site.) This change moves the op to the bufferization dialect, which now has an `alloc_tensor` and a `dealloc_tensor` op.

Differential Revision: https://reviews.llvm.org/D129985

2 years agoRevert change to clang/test/CodeGen/arm_acle.c
Nicolai Hähnle [Tue, 19 Jul 2022 07:10:27 +0000 (09:10 +0200)]
Revert change to clang/test/CodeGen/arm_acle.c

For some reason, update_cc_test_checks.py produced a failing test.

Partial revert of 301011fa6078b4f16bd3fc6158d9c6fddad7e118

2 years ago[sanitizer] Don't call dlerror() after swift_demangle lookup through dlsym
serge-sans-paille [Tue, 12 Jul 2022 20:05:57 +0000 (22:05 +0200)]
[sanitizer] Don't call dlerror() after swift_demangle lookup through dlsym

Calling `dlerror()` may actually try to print something, which turns into a deadlock,
as showcased in #49223.

Instead, rely on later calls to dlsym to clear the `dlerror` internal state if callers
need to check the return status.

Differential Revision: https://reviews.llvm.org/D128992

2 years ago[llvm] Fix forward declaration in Support/JSON.h
serge-sans-paille [Tue, 12 Jul 2022 20:57:16 +0000 (22:57 +0200)]
[llvm] Fix forward declaration in Support/JSON.h

Some methods of json::Array require json::Value to be completely defined, so
they can't be defined in-class. Fix that by defining them out of class.
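
The general pattern can be sketched with toy types. This is not the real `llvm::json` code: the `sketch` namespace and its `Value` and `Array` classes are stand-ins invented for illustration. Members that need the complete element type are only declared inside the class and are defined after the element type is complete.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

class Value; // only forward-declared at this point

class Array {
  std::vector<Value> V; // element type is still incomplete here

public:
  // Declarations only: the bodies need the full definition of Value.
  void push_back(Value E);
  std::size_t size() const;
  const Value &operator[](std::size_t I) const;
};

class Value {
  std::string S;

public:
  explicit Value(std::string Str) : S(std::move(Str)) {}
  const std::string &str() const { return S; }
};

// Out-of-class definitions, placed where Value is a complete type.
inline void Array::push_back(Value E) { V.push_back(std::move(E)); }
inline std::size_t Array::size() const { return V.size(); }
inline const Value &Array::operator[](std::size_t I) const { return V[I]; }

} // namespace sketch
```

The sketch assumes a standard library whose `std::vector` accepts an incomplete element type as a member, which C++17 guarantees.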

Fix #55780

2 years ago[X86][NFC] avx512-f16c-v16f16-fadd.ll avx512-skx-v32f16-fadd.ll - add nounwind to...
Bing1 Yu [Tue, 19 Jul 2022 06:51:58 +0000 (14:51 +0800)]
[X86][NFC] avx512-f16c-v16f16-fadd.ll avx512-skx-v32f16-fadd.ll - add nounwind to prevent cfi noise on tests

2 years agoRerun ./utils/update_cc_test.py on a bunch of tests
Nicolai Hähnle [Tue, 19 Jul 2022 06:45:31 +0000 (08:45 +0200)]
Rerun ./utils/update_cc_test.py on a bunch of tests

Due to update script changes; this reduces the size of a later
"real" diff.

2 years ago[NFC] Introduce API to detect tokens penetrating LCSSA form
Max Kazantsev [Tue, 19 Jul 2022 05:50:43 +0000 (12:50 +0700)]
[NFC] Introduce API to detect tokens penetrating LCSSA form

Following discussion in PR56243, we need to somehow detect the situation
when token values penetrate LCSSA form for transforms that require that
it is maintained by all values (for example, to sustain use-def dominance
invariants). This patch introduces a parameter to LCSSA checkers to control
their ignorance about tokens.

Differential Revision: https://reviews.llvm.org/D129983
Reviewed By: efriedma

2 years ago[gn build] Port 8ed702b83f20
LLVM GN Syncbot [Tue, 19 Jul 2022 06:42:58 +0000 (06:42 +0000)]
[gn build] Port 8ed702b83f20

2 years agoRevert "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."
Max Kazantsev [Tue, 19 Jul 2022 06:26:31 +0000 (13:26 +0700)]
Revert "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."

This reverts commit 58dfaaaace4ea75ab3588a6e738f2cf58ebf77c2.

Massive AARCH test failures in buildbot.

2 years ago[X86] Promote v32f16's fadd into v32f32's fadd when it is avx512 without avx512fp16
Bing1 Yu [Tue, 19 Jul 2022 06:16:30 +0000 (14:16 +0800)]
[X86] Promote v32f16's fadd into v32f32's fadd when it is avx512 without avx512fp16

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D130059

2 years ago[OpenMP][IRBuilder] Add support for taskgroup
Shraiysh Vaishay [Mon, 18 Jul 2022 09:55:54 +0000 (15:25 +0530)]
[OpenMP][IRBuilder] Add support for taskgroup

This patch adds support for generating taskgroup construct.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D128203

2 years ago[mlir] Add refineReturnTypes to InferTypeOpInterface
Jacques Pienaar [Tue, 19 Jul 2022 05:18:52 +0000 (22:18 -0700)]
[mlir] Add refineReturnTypes to InferTypeOpInterface

The refineReturnTypes method shares the same parameters as inferReturnTypes,
but is also passed the return types of the op, if known. These can be used
during refinement passes or for more op-specific error reporting.
Currently the error reporting on failure is generic and doesn't allow
for specializing the returned result based on the failure. With this change,
what would previously have been a separate trait with specialized
verification can be handled as part of inference rather than
duplicated.

refineReturnTypes behaves like inferReturnTypes if no result types are fed in,
while the current verification is recast as the default implementation for
refineReturnTypes with it calling inferReturnTypes (and so the default type
verification now goes through refine and allows for more op specific inference
mismatch errors).
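
The relationship described above can be sketched with a toy model. Types are plain strings here, and `TypeList`, `InferFn`, and `refineReturnTypesDefault` are hypothetical names rather than the MLIR interface. An empty list stands in for "no result types fed in", which is a simplification.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins: a type is a string, inference is a callback.
using TypeList = std::vector<std::string>;
using InferFn = std::function<bool(TypeList &)>; // returns true on success

// Sketch of a default refine step built on top of inference.
// - With no known return types it behaves like plain inference.
// - With known return types it acts as verification against the inferred ones.
bool refineReturnTypesDefault(const InferFn &Infer, TypeList &ReturnTypes,
                              std::string &Error) {
  TypeList Inferred;
  if (!Infer(Inferred)) {
    Error = "type inference failed";
    return false;
  }
  if (ReturnTypes.empty()) {
    ReturnTypes = Inferred;
    return true;
  }
  if (Inferred != ReturnTypes) {
    Error = "inferred types do not match the provided return types";
    return false;
  }
  return true;
}
```

An op that wants a more specific mismatch message would supply its own refine step instead of this default.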

Differential Revision: https://reviews.llvm.org/D129955

2 years agoUpdate the Windows packaging script.
Carlos Alberto Enciso [Tue, 19 Jul 2022 04:55:14 +0000 (05:55 +0100)]
Update the Windows packaging script.

As discussed on:
  https://discourse.llvm.org/t/build-llvm-release-bat-script-options/63146/6

- Refactor the build/test steps into functions.
- Exit the script if the build directory already exists.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D129559

2 years ago[clang-tidy] Remove unnecessary code from ReadabilityModuleTest
Nathan James [Tue, 19 Jul 2022 04:21:09 +0000 (05:21 +0100)]
[clang-tidy] Remove unnecessary code from ReadabilityModuleTest

D56303 added testing code that was then made redundant by the changes in D125026. However this code wasn't completely removed in the latter patch.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D130026

2 years ago[libc++][ranges] Implement `ranges::{,stable_}partition`.
Konstantin Varlamov [Tue, 19 Jul 2022 04:05:51 +0000 (21:05 -0700)]
[libc++][ranges] Implement `ranges::{,stable_}partition`.

Differential Revision: https://reviews.llvm.org/D129624

2 years ago[ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>.
Lang Hames [Tue, 19 Jul 2022 03:36:17 +0000 (20:36 -0700)]
[ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>.

Avoids a zero-length memcpy from a null src, which caused errors on some of the
sanitizer bots. Also uses null when deserializing an empty ArrayRef (rather
than pointing to a zero length range in the middle of the input buffer).
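
A minimal sketch of the two guards, assuming a simple length-prefixed wire format. `CharSpan`, `serialize`, and `deserialize` are illustrative names and are not the ORC serialization API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a (pointer, size) view of chars.
struct CharSpan {
  const char *Data;
  std::size_t Size;
};

// Writes a 64-bit length followed by the bytes.
void serialize(std::vector<char> &Out, CharSpan S) {
  std::uint64_t N = S.Size;
  const char *NP = reinterpret_cast<const char *>(&N);
  Out.insert(Out.end(), NP, NP + sizeof(N));
  if (S.Size != 0) { // never call memcpy with a null source
    std::size_t Old = Out.size();
    Out.resize(Old + S.Size);
    std::memcpy(Out.data() + Old, S.Data, S.Size);
  }
}

// Reads one span starting at Pos (assumed <= In.size()).
// An empty span comes back as {nullptr, 0} rather than as a zero-length
// range pointing into the middle of the input buffer.
bool deserialize(const std::vector<char> &In, std::size_t &Pos, CharSpan &S) {
  std::uint64_t N;
  if (In.size() - Pos < sizeof(N))
    return false;
  std::memcpy(&N, In.data() + Pos, sizeof(N));
  Pos += sizeof(N);
  if (In.size() - Pos < N)
    return false;
  S = N ? CharSpan{In.data() + Pos, static_cast<std::size_t>(N)}
        : CharSpan{nullptr, 0};
  Pos += static_cast<std::size_t>(N);
  return true;
}
```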

2 years ago[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat.
jacquesguan [Thu, 7 Jul 2022 08:48:55 +0000 (16:48 +0800)]
[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat.

This revision adds support for scalarizing a binary operation of two scalable splat vectors.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122791

2 years ago[RISCV][test] Precommit test for D122791.
jacquesguan [Thu, 7 Jul 2022 08:57:59 +0000 (16:57 +0800)]
[RISCV][test] Precommit test for D122791.

Differential Revision: https://reviews.llvm.org/D123362

2 years ago[VE] Support load/store/spill of vector mask registers
Kazushi (Jam) Marukawa [Sat, 2 Jul 2022 04:51:20 +0000 (13:51 +0900)]
[VE] Support load/store/spill of vector mask registers

Support load/store/spill of vector mask registers and add regression
tests.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D129415

2 years ago[AArch64][NFC] Set true for default of subfeature is more readable
zhongyunde [Tue, 19 Jul 2022 00:59:26 +0000 (08:59 +0800)]
[AArch64][NFC] Set true for default of subfeature is more readable

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D129960

2 years agoRevert "Make hit point counts reliable for architectures that stop before evaluation."
Jim Ingham [Tue, 19 Jul 2022 00:38:43 +0000 (17:38 -0700)]
Revert "Make hit point counts reliable for architectures that stop before evaluation."

This reverts commit 5778ada8e54edb2bc2869505b88a959d1915c02f.

The watchpoint tests all stall on aarch64-ubuntu bots.  Reverting till I can
get my hands on a system to test this out.

2 years agoRevert "This is a followup to https://reviews.llvm.org/D129814"
Jim Ingham [Tue, 19 Jul 2022 00:37:13 +0000 (17:37 -0700)]
Revert "This is a followup to https://reviews.llvm.org/D129814"

This reverts commit 555ae5b8f5aa93ab090af853a8b7a83f815b3f20.

Apparently, there's something different about how Linux ARM handles watchpoints,
as all the watchpoint tests seem to stall on the Ubuntu aarch64 bots.

Reverting till I can get my hands on a linux system and see what is
wrong.

2 years ago[RISCV][Clang] Add support for Zmmul extension
ksyx [Sun, 26 Jun 2022 01:36:02 +0000 (21:36 -0400)]
[RISCV][Clang] Add support for Zmmul extension

This patch implements the recently ratified Zmmul extension, a subextension
of M (Integer Multiplication and Division) consisting of only the
multiplication part.

Differential Revision: https://reviews.llvm.org/D103313
Reviewed By: craig.topper, jrtc27, asb

2 years ago[unittests/Tooling/DependencyScannerTest] Add a target triple for `ScanDepsWithFS...
Argyrios Kyrtzidis [Mon, 18 Jul 2022 23:53:16 +0000 (16:53 -0700)]
[unittests/Tooling/DependencyScannerTest] Add a target triple for `ScanDepsWithFS` test

This should fix the `clang-ppc64-aix` builder.

2 years ago[llvm-objdump] Support --symbolize-operands when there is a single SHT_LLVM_BB_ADDR_M...
Rahman Lavaee [Sat, 16 Jul 2022 07:48:50 +0000 (00:48 -0700)]
[llvm-objdump] Support --symbolize-operands when there is a single SHT_LLVM_BB_ADDR_MAP section for all text sections

When linking, using `-Wl,-z,keep-text-section-prefix` results in multiple text sections while all `SHT_LLVM_BB_ADDR_MAP` sections are linked into a single one.
In such case, we should not read the corresponding section for each text section, and instead read all `SHT_LLVM_BB_ADDR_MAP` sections before disassembly.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D129924

2 years agoThis is a followup to https://reviews.llvm.org/D129814
Jim Ingham [Mon, 18 Jul 2022 23:21:51 +0000 (16:21 -0700)]
This is a followup to https://reviews.llvm.org/D129814

That was causing hit counts to be double-counted on x86_64 Linux.
It looks like StopInfoWatchpoint::ShouldStopSynchronous gets called
twice for a given stop on Linux (not on Darwin).  I had taken out the
"have I been called already" check when I reworked this part of the
code because it didn't seem necessary.  Putting that back in because
it looks like it is on some systems.
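
A "have I been called already" check of this kind can be sketched as follows. The class and member names are hypothetical and do not reflect the actual LLDB code; the point is only that a second call for the same stop returns the cached answer without counting the hit again.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-stop watchpoint record.
class WatchpointStopSketch {
  bool Checked = false;          // has the synchronous check already run?
  bool CachedShouldStop = false; // answer computed by the first call
  unsigned HitCount = 0;

public:
  bool shouldStopSynchronous() {
    if (Checked)
      return CachedShouldStop; // repeated call for the same stop
    Checked = true;
    ++HitCount; // counted exactly once per stop
    CachedShouldStop = true;
    return CachedShouldStop;
  }

  unsigned hitCount() const { return HitCount; }
};
```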

2 years ago[InstrProf] Allow CSIRPGO function entry coverage
Ellis Hoag [Mon, 18 Jul 2022 22:09:11 +0000 (15:09 -0700)]
[InstrProf] Allow CSIRPGO function entry coverage

The flag `-fcs-profile-generate` for enabling CSIRPGO moves the pass
`pgo-instrumentation` after inlining. Function entry coverage works fine
with this change, so remove the assert. I had originally left this
assert in because I had not tested this at the time.

Reviewed By: davidxl, MaskRay

Differential Revision: https://reviews.llvm.org/D129407

2 years agoWhen the module path for `command script import` is invalid, echo the path.
Jim Ingham [Mon, 18 Jul 2022 21:47:35 +0000 (14:47 -0700)]
When the module path for `command script import` is invalid, echo the path.

We were just emitting "invalid module" w/o saying which module.  That's
not particularly helpful.

Differential Revision: https://reviews.llvm.org/D129338

2 years agoMake hit point counts reliable for architectures that stop before evaluation.
Jim Ingham [Wed, 13 Jul 2022 01:34:24 +0000 (18:34 -0700)]
Make hit point counts reliable for architectures that stop before evaluation.

Since we want to present the "new & old" values for watchpoint hits, on architectures,
including the ARM family, that stop before the triggering instruction is run, we need
to single step over the instruction before stopping for realz.  This was incorrectly
done directly in the StopInfoWatchpoint::ShouldStop.  That causes problems if more than
one thread stops "for a reason" at the same time as the watchpoint, since the other actions
didn't expect the process to make progress in this part of the execution control machinery.

The correct way to do this is to schedule the step over using ThreadPlans, and then to restore
the stop info after that plan stops, so that the rest of the stop info actions can happen when
all the other threads have handled their immediate actions as well.

Differential Revision: https://reviews.llvm.org/D129814

2 years agoCodeGen: Remove AliasAnalysis from regalloc
Matt Arsenault [Fri, 24 Jun 2022 16:09:34 +0000 (12:09 -0400)]
CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.

2 years ago[libc] fix strtofloatingpoint on rare edge case
Michael Jones [Mon, 18 Jul 2022 18:32:04 +0000 (11:32 -0700)]
[libc] fix strtofloatingpoint on rare edge case

Currently, there are two string parsers that can be used in a call to
strtofloatingpoint. There is the main parser used by Clinger's fast path
and Eisel-Lemire, and the backup parser used by Simple Decimal
Conversion. There was a bug in the backup parser where if the number had
more than 800 digits (the size of the SDC buffer) before the decimal
point, it would just ignore the digits after the 800th and not count
them into the exponent. This patch fixes that issue and adds regression
tests.
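
The idea behind the fix can be sketched with a small fixed-capacity digit parser. This is not the LLVM libc implementation: `ParsedDecimal` and `parseDecimal` are illustrative names, and the capacity is a parameter so that a small value can stand in for the 800-digit buffer. Digits before the decimal point that no longer fit are dropped from the buffer but still counted into the exponent.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The parsed value is interpreted as 0.Digits * 10^DecimalPoint.
struct ParsedDecimal {
  std::string Digits;     // at most Capacity significant digits
  long DecimalPoint = 0;  // power-of-ten scale
  bool Truncated = false; // a nonzero digit was dropped
};

ParsedDecimal parseDecimal(const std::string &Text, std::size_t Capacity) {
  ParsedDecimal R;
  bool SeenPoint = false;
  for (char C : Text) {
    if (C == '.') {
      if (SeenPoint)
        break;
      SeenPoint = true;
      continue;
    }
    if (C < '0' || C > '9')
      break;
    if (C == '0' && R.Digits.empty()) {
      // Leading zeros carry no digits; after the point they shift the scale.
      if (SeenPoint)
        --R.DecimalPoint;
      continue;
    }
    // Every digit before the decimal point scales the value by ten,
    // whether or not there is still room to store it.
    if (!SeenPoint)
      ++R.DecimalPoint;
    if (R.Digits.size() < Capacity)
      R.Digits.push_back(C);
    else if (C != '0')
      R.Truncated = true;
  }
  return R;
}
```

With a capacity of 4, the input `123456.7` keeps the digits `1234` but still records six digits before the point, so the magnitude is preserved even though precision is lost.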

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D130032

2 years ago[BOLT][DWARF] Fix incorrect DW_AT_type offset for unittest
zr33 [Mon, 18 Jul 2022 21:20:22 +0000 (14:20 -0700)]
[BOLT][DWARF] Fix incorrect DW_AT_type offset for unittest

Some unit tests have incorrect DW_AT_type offsets because they were manually crafted; fix them to use the correct offsets.

Reviewed By: Amir, ayermolo

Differential Revision: https://reviews.llvm.org/D129828

2 years ago[BOLT][DWARF] Add Unit test for DW_AT_high_pc [DW_FORM_addr]
zr33 [Mon, 18 Jul 2022 21:03:40 +0000 (14:03 -0700)]
[BOLT][DWARF] Add Unit test for DW_AT_high_pc [DW_FORM_addr]

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D127613

2 years ago[pseudo] Add guards for module contextual keywords
Sam McCall [Mon, 18 Jul 2022 20:38:24 +0000 (22:38 +0200)]
[pseudo] Add guards for module contextual keywords

2 years ago[clang-tidy] Reduce the dependencies for the "make-confusable-table" tool
Martin Storsjö [Thu, 14 Jul 2022 19:35:50 +0000 (22:35 +0300)]
[clang-tidy] Reduce the dependencies for the "make-confusable-table" tool

When cross compiling llvm, a separate recursive native cmake build
is generated, for building the tools that generate code (unless they're
provided externally by the caller).

This reduces the number of build steps for that native build from
1000+ steps to 162.

This matches how the clang-pseudo-gen tool is set up in
clang-tools-extra/pseudo/gen/CMakeLists.txt.

Differential Revision: https://reviews.llvm.org/D129797

2 years ago[clang-format] Mark constexpr lambdas as lambda
Björn Schäpers [Sat, 16 Jul 2022 21:46:18 +0000 (23:46 +0200)]
[clang-format] Mark constexpr lambdas as lambda

Otherwise the brace was detected as a function brace, not wrong per se,
but when directly calling the lambda the calling parens were put on the
next line.

Differential Revision: https://reviews.llvm.org/D129946

2 years ago[clang-format] Indent TT_CtorInitializerColon after requires clauses
Björn Schäpers [Wed, 13 Jul 2022 10:38:38 +0000 (12:38 +0200)]
[clang-format] Indent TT_CtorInitializerColon after requires clauses

Fixes https://github.com/llvm/llvm-project/issues/56215

Differential Revision: https://reviews.llvm.org/D129942

2 years ago[clang-format] Fix misannotation of colon in presence of requires clause
Björn Schäpers [Mon, 4 Jul 2022 08:53:34 +0000 (10:53 +0200)]
[clang-format] Fix misannotation of colon in presence of requires clause

For clauses without parentheses it was annotated as TT_InheritanceColon.
Relates to https://github.com/llvm/llvm-project/issues/56215

Differential Revision: https://reviews.llvm.org/D129940

2 years ago[AMDGPU] Support for gfx940 fp8 smfmac
Stanislav Mekhanoshin [Fri, 15 Jul 2022 22:16:04 +0000 (15:16 -0700)]
[AMDGPU] Support for gfx940 fp8 smfmac

Differential Revision: https://reviews.llvm.org/D129908

2 years ago[AMDGPU] Support for gfx940 fp8 mfma
Stanislav Mekhanoshin [Fri, 15 Jul 2022 21:45:19 +0000 (14:45 -0700)]
[AMDGPU] Support for gfx940 fp8 mfma

Differential Revision: https://reviews.llvm.org/D129906

2 years ago[AMDGPU] Support for gfx940 fp8 conversions
Stanislav Mekhanoshin [Fri, 15 Jul 2022 20:20:08 +0000 (13:20 -0700)]
[AMDGPU] Support for gfx940 fp8 conversions

Differential Revision: https://reviews.llvm.org/D129902

2 years ago[LV] Sink module variable and use State to set it in widenCall. (NFC)
Florian Hahn [Mon, 18 Jul 2022 18:41:48 +0000 (19:41 +0100)]
[LV] Sink module variable and use State to set it in widenCall. (NFC)

Limits the lifetime of the variable and makes it independent of
CallInst.