platform/upstream/llvm.git
2 years ago[libc][math] Separated builtin function in special FPUtils header.
Kirill Okhotnikov [Fri, 10 Jun 2022 01:14:10 +0000 (03:14 +0200)]
[libc][math] Separated builtin function in special FPUtils header.

A small refactoring of builtin functions in preparation for adding the fmod/fmodf functions.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D127088

2 years ago[lld][WebAssembly] Revert moving of data relocations to start function
Sam Clegg [Wed, 8 Jun 2022 18:48:38 +0000 (11:48 -0700)]
[lld][WebAssembly] Revert moving of data relocations to start function

Back in https://reviews.llvm.org/D117412 we moved the application of
data relocations to the wasm start function.

However, because the dynamic linker doesn't know the final addresses
at module instantiation time, this proved to be too early and the
relocations could be applied with the wrong values.

Fixes: https://github.com/emscripten-core/emscripten/issues/17150

Differential Revision: https://reviews.llvm.org/D127333

2 years ago[flang] semantics test for ucobound
Damian Rouson [Thu, 26 May 2022 23:41:58 +0000 (16:41 -0700)]
[flang] semantics test for ucobound

Add a test with a range of ucobound() intrinsic function
invocations, including a comprehensive set of standard-conforming
keyword and non-keyword arguments with and without optional
arguments present and with argument positions covering all
possible orderings.  Also test that several non-conforming
ucobound() invocations generate the correct error messages.

Differential Revision: https://reviews.llvm.org/D126508

2 years ago[RISCV] Simplify InstrInfo access in doPeepholeMaskedRVV [nfc]
Philip Reames [Thu, 9 Jun 2022 23:58:45 +0000 (16:58 -0700)]
[RISCV] Simplify InstrInfo access in doPeepholeMaskedRVV [nfc]

2 years ago[mlir][NFC] Rename Bazel target aliases and consolidate targets
Mogball [Thu, 9 Jun 2022 23:34:57 +0000 (23:34 +0000)]
[mlir][NFC] Rename Bazel target aliases and consolidate targets

This patch completes the outstanding TODOs to remove aliased Bazel target names.
This patch also renames and consolidates some Bazel targets to be more in line
with their CMake counterparts, e.g. combining `:LinalgOps` and `:LinalgInterfaces`
into `:LinalgDialect`.

Differential Revision: https://reviews.llvm.org/D127459

2 years ago[mlir] Support passing ostream as argument for the create function.
Okwan Kwon [Thu, 9 Jun 2022 22:11:36 +0000 (15:11 -0700)]
[mlir] Support passing ostream as argument for the create function.

The constructor already supports passing an ostream as argument,
so let's make the create function support it too.
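
A minimal sketch of the idea, assuming a hypothetical `Printer` class (the
real class is not named in this message): the factory simply forwards the
stream to the constructor that already accepts it.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class, named for illustration only.
class Printer {
public:
  // The constructor already accepts the output stream.
  explicit Printer(std::ostream &os) : os(os) {}

  // Factory mirroring the constructor's stream parameter.
  static std::unique_ptr<Printer> create(std::ostream &os) {
    return std::make_unique<Printer>(os);
  }

  void print(int value) { os << value; }

private:
  std::ostream &os;
};

// Usage: route the output into a string buffer instead of a fixed stream.
std::string printToString(int value) {
  std::ostringstream buffer;
  std::unique_ptr<Printer> printer = Printer::create(buffer);
  assert(printer && "create() should not return null");
  printer->print(value);
  return buffer.str();
}
```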

Differential Revision: https://reviews.llvm.org/D127449

2 years ago[NFC] test commit
Sunho Kim [Thu, 9 Jun 2022 23:32:58 +0000 (08:32 +0900)]
[NFC] test commit

This is an empty commit to check commit access

2 years ago[lldb] Use assertState in more tests (NFC)
Dave Lee [Thu, 9 Jun 2022 05:22:27 +0000 (22:22 -0700)]
[lldb] Use assertState in more tests (NFC)

Follow-up to D127355, converting more `assertEquals` to `assertState`.

Differential Revision: https://reviews.llvm.org/D127378

2 years ago[mlir][spirv] Replace StructAttrs with AttrDefs
Mogball [Thu, 9 Jun 2022 21:35:32 +0000 (21:35 +0000)]
[mlir][spirv] Replace StructAttrs with AttrDefs

Depends on D127370

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D127373

2 years ago[BasicTTI] Return Invalid cost for more scalable vector scalarization cases
Philip Reames [Thu, 9 Jun 2022 23:02:18 +0000 (16:02 -0700)]
[BasicTTI] Return Invalid cost for more scalable vector scalarization cases

Instead of crashing on a cast<FixedVectorType>, we should return Invalid for these cases.  This avoids crashes in assert builds, and potential miscompiles in release builds.

2 years ago[RISCV] Teach RISCVMergeBaseOffset about cases where we use SHXADD to add some immedi...
Craig Topper [Thu, 9 Jun 2022 22:48:21 +0000 (15:48 -0700)]
[RISCV] Teach RISCVMergeBaseOffset about cases where we use SHXADD to add some immediates.

For an addition with simm14 and simm15 immediates with 2 or 3 trailing bits,
we can use a shXadd instruction and an addi to do the addition.

This patch teaches RISCVMergeBaseOffset to see through this pattern.
I don't think the sh1add case occurs because we use two addis for that,
but I implemented it for completeness.
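
The arithmetic behind the pattern can be sketched as follows. This is an
illustration only, not the pass itself: `splitForShXAdd` is a hypothetical
helper, and "trailing bits" is read here as trailing zero bits. It looks for
a shift X such that the immediate equals `small << X` with `small` fitting an
addi.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// True if v fits in a signed 12-bit immediate, the range addi accepts.
bool isSimm12(std::int64_t v) { return v >= -2048 && v <= 2047; }

// Hypothetical result type: imm == small * (1 << shift).
struct ShXAddSplit {
  unsigned shift;
  std::int64_t small;
};

// Find X in {1, 2, 3} so that base + imm can be computed as
//   addi   tmp, zero, small
//   shXadd dst, tmp, base      // dst = (tmp << X) + base
std::optional<ShXAddSplit> splitForShXAdd(std::int64_t imm) {
  assert(!isSimm12(imm) && "a single addi already suffices");
  for (unsigned shift = 1; shift <= 3; ++shift) {
    std::int64_t scale = std::int64_t{1} << shift;
    // Exact division is safe because the low `shift` bits are zero.
    if (imm % scale == 0 && isSimm12(imm / scale))
      return ShXAddSplit{shift, imm / scale};
  }
  return std::nullopt;
}
```

For example, 4096 splits into `1024 << 2` (sh2add) and 8200 into `1025 << 3`
(sh3add), while an odd immediate such as 4097 has no such split.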

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127376

2 years ago[mlir][tosa] Replace StructAttrs with AttrDefs
Mogball [Thu, 9 Jun 2022 21:34:45 +0000 (21:34 +0000)]
[mlir][tosa] Replace StructAttrs with AttrDefs

Depends on D127352

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127370

2 years ago[mlir][gpu] Move GPU headers into IR/ and Transforms/
Mogball [Thu, 9 Jun 2022 21:33:41 +0000 (21:33 +0000)]
[mlir][gpu] Move GPU headers into IR/ and Transforms/

Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352

2 years agoPlumb InstructionCost through unroll costing
Philip Reames [Thu, 9 Jun 2022 22:32:30 +0000 (15:32 -0700)]
Plumb InstructionCost through unroll costing

Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundamental limitation or an unimplemented cost model case.

Differential Revision: https://reviews.llvm.org/D127305

2 years ago[BOLT][AArch64] Handle data at the beginning of a function when disassembling and...
Denis Revunov [Wed, 8 Jun 2022 22:08:31 +0000 (15:08 -0700)]
[BOLT][AArch64] Handle data at the beginning of a function when disassembling and building CFG.

This patch adds getFirstInstructionOffset method for BinaryFunction
which is used to properly handle cases where data is at zero offset in
a function. The main change is that we add basic block at first
instruction offset when disassembling, which prevents assertion
failures in buildCFG.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D127111

2 years ago[mlir][gpu] Change ParalellLoopMappingAttr to AttrDef
Mogball [Thu, 9 Jun 2022 21:33:41 +0000 (21:33 +0000)]
[mlir][gpu] Change ParalellLoopMappingAttr to AttrDef

It was a StructAttr. Also adds a FieldParser for AffineMap.

Depends on D127348

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127350

2 years agoPipe potentially invalid InstructionCost through CodeMetrics
Philip Reames [Thu, 9 Jun 2022 22:11:01 +0000 (15:11 -0700)]
Pipe potentially invalid InstructionCost through CodeMetrics

Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred.

On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost.

I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change.
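
A rough sketch of the bail-out pattern, using `std::optional` as a stand-in
for a possibly-invalid cost. The `Cost` alias and both functions are
illustrative assumptions, not `llvm::InstructionCost` or the CodeMetrics API.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Stand-in for a possibly-invalid cost; std::nullopt models "Invalid".
using Cost = std::optional<unsigned>;

// Sum per-instruction costs; any invalid entry makes the total invalid.
Cost totalCost(const std::vector<Cost> &costs) {
  unsigned sum = 0;
  for (const Cost &c : costs) {
    if (!c)
      return std::nullopt;
    sum += *c;
  }
  return sum;
}

// A client declines the transformation on an invalid cost instead of
// asserting that invalid costs never occur.
bool shouldTransform(const std::vector<Cost> &costs, unsigned threshold) {
  assert(threshold > 0 && "threshold must be positive");
  Cost total = totalCost(costs);
  return total && *total <= threshold;
}
```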

Differential Revision: https://reviews.llvm.org/D127131

2 years ago[mlir][nvvm] Change MMAShapeAttr to AttrDef
Mogball [Thu, 9 Jun 2022 21:33:26 +0000 (21:33 +0000)]
[mlir][nvvm] Change MMAShapeAttr to AttrDef

MMAShapeAttr was a StructAttr

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127348

2 years ago[gn build] (manually) port 25c8a061c573
Nico Weber [Thu, 9 Jun 2022 22:07:14 +0000 (18:07 -0400)]
[gn build] (manually) port 25c8a061c573

2 years ago[libc] move printf_main in to object library
Michael Jones [Wed, 1 Jun 2022 21:45:50 +0000 (14:45 -0700)]
[libc] move printf_main in to object library

Previously printf_main was a header library, but header library
dependencies don't work properly so it's been moved to an object
library. Additionally, the writers have been marked inline.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D126830

2 years ago[pseudo] Fix the broken build of ClangPseudoBenchmark, after c70aeaa.
Haojian Wu [Thu, 9 Jun 2022 21:02:56 +0000 (23:02 +0200)]
[pseudo] Fix the broken build of ClangPseudoBenchmark, after c70aeaa.

2 years ago[InstCombine] add narrowing transform for low-masked binop with zext operand
Sanjay Patel [Thu, 9 Jun 2022 20:59:26 +0000 (16:59 -0400)]
[InstCombine] add narrowing transform for low-masked binop with zext operand

https://alive2.llvm.org/ce/z/hRy3rE

As shown in D123408, we can produce this pattern when moving
casts around, and we already have a related fold for a binop
with a constant operand.

2 years ago[InstCombine] add tests for masked binop narrowing; NFC
Sanjay Patel [Thu, 9 Jun 2022 19:41:24 +0000 (15:41 -0400)]
[InstCombine] add tests for masked binop narrowing; NFC

2 years ago[AggressiveInstcombine] Add target tests for fptosi.sat fold. NFC
David Green [Thu, 9 Jun 2022 20:47:05 +0000 (21:47 +0100)]
[AggressiveInstcombine] Add target tests for fptosi.sat fold. NFC

2 years ago[bazel] Add missing dependency after 9f1221521f4b.
Benjamin Kramer [Thu, 9 Jun 2022 20:40:45 +0000 (22:40 +0200)]
[bazel] Add missing dependency after 9f1221521f4b.

2 years ago[X86] Remove !VT.is128BitVector() check. NFCI.
Simon Pilgrim [Thu, 9 Jun 2022 20:39:39 +0000 (21:39 +0100)]
[X86] Remove !VT.is128BitVector() check. NFCI.

The code is inside an if(VT.is256BitVector() || VT.is512BitVector()) condition.

2 years ago[BOLT] Add support for GOTPCRELX relocations
Maksim Panchenko [Thu, 26 May 2022 19:05:52 +0000 (12:05 -0700)]
[BOLT] Add support for GOTPCRELX relocations

The linker can convert instructions with GOTPCRELX relocations into a
form that uses an absolute addressing with an immediate. BOLT needs to
recognize such conversions and symbolize the immediates.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126747

2 years ago[AMDGPU] Update SIInsertHardClauses for GFX11
Jay Foad [Thu, 9 Jun 2022 10:38:48 +0000 (11:38 +0100)]
[AMDGPU] Update SIInsertHardClauses for GFX11

Changes for GFX11:
- Clauses may not mix instructions of different types, and there are
  more types. For example image instructions with and without a sampler
  are now different types.
- The max size of a clause is explicitly documented as 63 instructions.
  Previously it was implicitly assumed to be 64. This is such a tiny
  difference that it does not seem worth making it conditional on the
  subtarget.
- It can be beneficial to clause stores as well as loads.

Differential Revision: https://reviews.llvm.org/D127391

2 years ago[mlir][bufferize] Improve resolveConflicts for ExtractSliceOp
Matthias Springer [Thu, 9 Jun 2022 20:14:53 +0000 (22:14 +0200)]
[mlir][bufferize] Improve resolveConflicts for ExtractSliceOp

It is sometimes better to make a copy of the OpResult instead of making a copy of the OpOperand. E.g., when bufferizing tensor.extract_slice.

This implementation will eventually make parts of extract_slice's `bufferize` implementation obsolete (and simplify it). It will only need to handle in-place OpOperands.

Differential Revision: https://reviews.llvm.org/D126819

2 years ago[X86][AVX2] LowerINSERT_VECTOR_ELT - support v4i64 insertion as BLENDI(X, SCALAR_TO_V...
Simon Pilgrim [Thu, 9 Jun 2022 20:18:10 +0000 (21:18 +0100)]
[X86][AVX2] LowerINSERT_VECTOR_ELT - support v4i64 insertion as BLENDI(X, SCALAR_TO_VECTOR(Y))

2 years ago[mlir][bufferization][NFC] Put inplacability conflict resolution in op interface
Matthias Springer [Thu, 9 Jun 2022 20:02:16 +0000 (22:02 +0200)]
[mlir][bufferization][NFC] Put inplacability conflict resolution in op interface

The TensorCopyInsertion pass resolves out-of-place bufferization decisions by inserting explicit `bufferization.alloc_tensor` ops. This change moves that functionality into a new BufferizableOpInterface method, so that it can be overridden by op implementations. Some op bufferizations must insert additional `alloc_tensor` ops to make sure that certain aliasing invariants are not violated (e.g., scf::ForOp). This will be addressed in a subsequent change.

Differential Revision: https://reviews.llvm.org/D126817

2 years agoRecommit "[mlir][vector] Allow unroll of contraction in arbitrary order"
Christopher Bate [Wed, 8 Jun 2022 18:56:34 +0000 (12:56 -0600)]
Recommit "[mlir][vector] Allow unroll of contraction in arbitrary order"

Fixed issue with vector.contract default unroll permutation.

Adds support for vector unroll transformations to unroll in different
orders. For example, the vector.contract can be unrolled into a
smaller set of contractions. There is a choice of how to unroll the
decomposition based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which
is given by the caller of the transform. For now, only the
vector.contract, vector.transfer_read/transfer_write operations
support the callback.
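
To illustrate what a traversal order means here, the sketch below enumerates
tile indices of a 3-D iteration space in a caller-chosen order. The names
`Index3` and `enumerateTiles` are hypothetical, and the order is passed
directly rather than through the callback the patch describes.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Index3 = std::array<int, 3>;

// counts[d] is the number of tiles along dimension d. order[0] names the
// outermost (slowest-varying) dimension and order[2] the innermost.
std::vector<Index3> enumerateTiles(Index3 counts, Index3 order) {
  assert(counts[0] > 0 && counts[1] > 0 && counts[2] > 0);
  std::vector<Index3> result;
  int total = counts[0] * counts[1] * counts[2];
  for (int linear = 0; linear < total; ++linear) {
    Index3 idx{};
    int rest = linear;
    // Peel off the innermost dimension first.
    for (int k = 2; k >= 0; --k) {
      int dim = order[k];
      idx[dim] = rest % counts[dim];
      rest /= counts[dim];
    }
    result.push_back(idx);
  }
  return result;
}
```

With counts (2, 1, 2), the order (0, 1, 2) visits (0,0,0), (0,0,1), (1,0,0),
(1,0,1), whereas the order (2, 1, 0) visits (0,0,0), (1,0,0), (0,0,1),
(1,0,1).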

Differential Revision: https://reviews.llvm.org/D127004

2 years ago[Object][COFF] Fix section name parsing error when the name field is not null-padded
Pengxuan Zheng [Thu, 9 Jun 2022 01:02:49 +0000 (18:02 -0700)]
[Object][COFF] Fix section name parsing error when the name field is not null-padded

Some object files produced by Microsoft tools contain sections whose name field
is not fully null-padded at the end. Microsoft's dumpbin is able to print the
section name correctly, but this causes parsing errors with LLVM tools.

So far, this issue only seems to happen when the section name is longer than 8
bytes. In this case, the section name field contains a slash (/) followed by the
offset into the string table, but the name field is not fully null-padded at the
end.
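
As a hedged illustration (not the actual LLVM fix), a tolerant parser for the
"/<offset>" form could stop at the first byte that is not a decimal digit
instead of requiring the rest of the 8-byte field to be null. The helper name
`parseLongNameOffset` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Decode the string-table offset from a COFF section name field of the form
// "/<decimal digits>". Parsing stops at the first NUL or other non-digit
// byte, so a field that is not fully null-padded still decodes.
std::optional<std::uint32_t> parseLongNameOffset(std::string_view field) {
  assert(field.size() <= 8 && "COFF section name field is 8 bytes");
  if (field.empty() || field[0] != '/')
    return std::nullopt; // short name stored inline, no string-table lookup
  std::uint32_t offset = 0;
  std::size_t i = 1;
  for (; i < field.size() && field[i] >= '0' && field[i] <= '9'; ++i)
    offset = offset * 10 + static_cast<std::uint32_t>(field[i] - '0');
  if (i == 1)
    return std::nullopt; // no digits after '/'
  return offset;
}
```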

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D127369

2 years ago[libc++] Fixes CI after Jammy update.
Mark de Wever [Thu, 9 Jun 2022 18:51:25 +0000 (20:51 +0200)]
[libc++] Fixes CI after Jammy update.

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D127419

2 years ago[mlir][bufferization] Add TensorCopyInsertion pass
Matthias Springer [Thu, 9 Jun 2022 19:49:37 +0000 (21:49 +0200)]
[mlir][bufferization] Add TensorCopyInsertion pass

This pass runs the One-Shot Analysis to find out which tensor OpOperands must bufferize out-of-place. It then rewrites those tensor OpOperands to explicit allocations with a copy in the form of `bufferization.alloc_tensor`. The resulting IR can then be bufferized without having to care about read-after-write conflicts.

This change makes it possible to connect One-Shot Analysis to other bufferizations such as the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126573

2 years ago[Sanitizers] Cleanup handling of stat64/statfs64
Mariusz Borsa [Tue, 7 Jun 2022 23:15:20 +0000 (16:15 -0700)]
[Sanitizers] Cleanup handling of stat64/statfs64

There are differences in handling of stat64/statfs64 calls by sanitizers between Linux and macOS. Versions of macOS starting with 10.6 drop the stat64/statfs64 APIs completely, relying on the linker to redirect stat/statfs to the appropriate 64 bit versions. Emitting variables needed by sanitizers is thus controlled by convoluted sets of conditions, involving Linux, IOS, macOS and Android, sprinkled around files.

This change addresses it, allowing to specify presence/absence of stat64/statfs64 for each platform, in a single location. Also, it addresses the Android case which handles stat64, but not statfs64.

Adding Vitaly as a reviewer since he seems to be actively working on sanitizers, perhaps can comment on the Android bit

Differential Revision: https://reviews.llvm.org/D127343

2 years ago[libc++] Mark GDB pretty printers as unsupported on GCC 11.2 to make CI green
Louis Dionne [Thu, 9 Jun 2022 19:50:56 +0000 (15:50 -0400)]
[libc++] Mark GDB pretty printers as unsupported on GCC 11.2 to make CI green

2 years ago[AMDGPU] gfx11 VOPC instructions
Joe Nash [Wed, 25 May 2022 14:30:47 +0000 (10:30 -0400)]
[AMDGPU] gfx11 VOPC instructions

Supports encoding existing instructions on gfx11 and MC support for the new VOPC
dpp instructions.

Patch 19/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126978

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126989

2 years ago[lldb] Set COFF module ABI from default triple and make it an option
Alvin Wong [Thu, 9 Jun 2022 19:34:02 +0000 (22:34 +0300)]
[lldb] Set COFF module ABI from default triple and make it an option

PE/COFF can use either MSVC or GNU (MinGW) ABI for C++ code, however
LLDB had defaulted to MSVC implicitly with no way to override it. This
causes issues when debugging modules built with the GNU ABI, sometimes
even crashes.

This changes the PE/COFF plugin to set the module triple according to
the default target triple used to build LLDB. If the default target
triple is Windows and a valid environment is specified, then this
environment will be used for the module spec. This not only works for
MSVC and GNU, but also other environments.

A new setting, `plugin.object-file.pe-coff.abi`,  has been added to
allow overriding this default ABI.

* Fixes https://github.com/llvm/llvm-project/issues/50775
* Fixes https://github.com/mstorsjo/llvm-mingw/issues/226
* Fixes https://github.com/mstorsjo/llvm-mingw/issues/282

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D127048

2 years ago[mlir][bufferization] Add optional `copy` operand to AllocTensorOp
Matthias Springer [Thu, 9 Jun 2022 19:36:39 +0000 (21:36 +0200)]
[mlir][bufferization] Add optional `copy` operand to AllocTensorOp

If `copy` is specified, the newly allocated buffer is initialized with the given contents. Also add an optional `escape` attribute to indicate whether the buffer of the tensor may be returned from the parent block (aka. "escape") after bufferization.

This change is in preparation of connecting One-Shot Bufferize to the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126570

2 years ago[LLD] [ELF] Add parentheses to silence a GCC warning. NFC.
Martin Storsjö [Thu, 9 Jun 2022 11:12:05 +0000 (14:12 +0300)]
[LLD] [ELF] Add parentheses to silence a GCC warning. NFC.

This silences the following warning:

../tools/lld/ELF/SyntheticSections.cpp:1596:48: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
 1596 |   assert((index != 0 || type != target->gotRel && type != target->pltRel ||

Differential Revision: https://reviews.llvm.org/D127395

2 years ago[libcxx] [doc] Add a release note for distributors about MinGW builds and dllimport
Martin Storsjö [Mon, 6 Jun 2022 20:52:44 +0000 (23:52 +0300)]
[libcxx] [doc] Add a release note for distributors about MinGW builds and dllimport

This allows distributors to simplify how libc++ is built in MinGW
configurations.

Differential Revision: https://reviews.llvm.org/D127151

2 years agoReapply: Add an error message to the default SIGPIPE handler
Tim Northover [Thu, 9 Jun 2022 18:52:29 +0000 (19:52 +0100)]
Reapply: Add an error message to the default SIGPIPE handler

UNIX03 conformance requires utilities to flush stdout before exiting and raise
an error if writing fails. Flushing already happens on a call to exit
and thus automatically on a return from main. Write failure is then
detected by LLVM's default SIGPIPE handler. The handler already exits with
a non-zero code, but conformance additionally requires an error message.

On the first reapply attempt I hadn't noticed the test had changed; hopefully
this goes better.

2 years ago[gn build] Port 976f37050dbd
LLVM GN Syncbot [Thu, 9 Jun 2022 19:04:54 +0000 (19:04 +0000)]
[gn build] Port 976f37050dbd

2 years ago[libc++] Granularize __string
Nikolas Klauser [Mon, 6 Jun 2022 21:35:24 +0000 (23:35 +0200)]
[libc++] Granularize __string

Reviewed By: ldionne, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D127156

2 years ago[AMDGPU] Use v_mad_u64_u32 for IMAD32
Stanislav Mekhanoshin [Mon, 6 Jun 2022 23:31:25 +0000 (16:31 -0700)]
[AMDGPU] Use v_mad_u64_u32 for IMAD32

Nic Curtis has done the experiments to prove it is faster than a
separate mul and add.

Fixes: SWDEV-332806

Differential Revision: https://reviews.llvm.org/D127253

2 years ago[mlir][sparse] refactor handling of merger leafs and ops
Aart Bik [Tue, 7 Jun 2022 22:51:17 +0000 (15:51 -0700)]
[mlir][sparse] refactor handling of merger leafs and ops

Using "default:" in the switch statements that handle all our
merger ops has become a bit cumbersome since it is easy to overlook
parts of the code that need to handle ops specifically. By enforcing
full switch statements without "default:", we get a compiler warning
when cases are overlooked.
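
A small sketch of the technique, with a hypothetical `Kind` enum rather than
the merger's real one: because the switch has no "default:" label, adding a
new enumerator without handling it lets compilers such as Clang and GCC warn
via -Wswitch.

```cpp
#include <cassert>

// Hypothetical op kinds, for illustration only.
enum class Kind { kTensor, kInvariant, kAddF, kMulF };

// Every enumerator is listed explicitly; there is deliberately no "default:".
bool isLeaf(Kind k) {
  switch (k) {
  case Kind::kTensor:
  case Kind::kInvariant:
    return true;
  case Kind::kAddF:
  case Kind::kMulF:
    return false;
  }
  // Only reachable if k holds a value outside the enumerators.
  assert(false && "unhandled Kind");
  return false;
}
```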

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D127263

2 years agoRevert "Add an error message to the default SIGPIPE handler"
Tim Northover [Thu, 9 Jun 2022 18:01:13 +0000 (19:01 +0100)]
Revert "Add an error message to the default SIGPIPE handler"

It broke PPC bots.

2 years ago[BOLT][DWARF] Fix dwarf5-loclist-offset-form test
Alexander Yermolovich [Thu, 9 Jun 2022 17:25:18 +0000 (10:25 -0700)]
[BOLT][DWARF] Fix dwarf5-loclist-offset-form test

I put it into the wrong directory. As a result it is failing for aarch64. Moving
the test under X86. Original diff D126999.

Differential Revision: https://reviews.llvm.org/D127417

2 years ago[SystemZ/z/OS] Fix failing dynamic library unit test.
Kai Nacke [Thu, 9 Jun 2022 16:04:31 +0000 (12:04 -0400)]
[SystemZ/z/OS] Fix failing dynamic library unit test.

Root cause for the failure is that the visibility of symbols
is different on z/OS. To fix the failure, the symbols need to
be exported.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D127416

2 years ago[AMDGPU] Fix hazard handling of v_cmpx to permlane
Stanislav Mekhanoshin [Wed, 8 Jun 2022 20:42:28 +0000 (13:42 -0700)]
[AMDGPU] Fix hazard handling of v_cmpx to permlane

- VOP3 and SDWA forms of V_CMPX were not handled
- Hazard only exists if the compare defines EXEC (i.e. V_CMPX)
  forwarded to the permlane.

Differential Revision: https://reviews.llvm.org/D127344

2 years ago[AArch64][SVE] Don't crash on pre-legalizer types in extload combine.
Ahmed Bougacha [Thu, 26 May 2022 21:46:43 +0000 (14:46 -0700)]
[AArch64][SVE] Don't crash on pre-legalizer types in extload combine.

This was assuming the vector types were MVTs, but they don't have to be.

Note that the concrete output of the test isn't very useful, since it's
dominated by nonsensical calling convention lowering for the weird types.

Differential Revision: https://reviews.llvm.org/D126505

2 years ago[libc] add printf base 10 integer conversion
Michael Jones [Tue, 17 May 2022 18:28:16 +0000 (11:28 -0700)]
[libc] add printf base 10 integer conversion

This patch adds support for d, i, and u conversions in printf, as well
as comprehensive unit tests.
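
For readers unfamiliar with the conversion, here is a minimal sketch of
base-10 integer formatting. The helper names are hypothetical and this is not
the libc implementation; it ignores flags, width, and precision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Roughly what a bare %u conversion produces: decimal digits, filled in
// from the least significant digit backwards.
std::string toDecimal(std::uintmax_t value) {
  char buf[32];
  std::size_t pos = sizeof(buf);
  do {
    assert(pos > 0 && "buffer too small for the digits");
    buf[--pos] = static_cast<char>('0' + value % 10);
    value /= 10;
  } while (value != 0);
  return std::string(buf + pos, sizeof(buf) - pos);
}

// Roughly what a bare %d / %i conversion produces. The magnitude is computed
// in unsigned arithmetic so the most negative value does not overflow.
std::string toDecimalSigned(std::intmax_t value) {
  if (value >= 0)
    return toDecimal(static_cast<std::uintmax_t>(value));
  std::uintmax_t magnitude = 0 - static_cast<std::uintmax_t>(value);
  return "-" + toDecimal(magnitude);
}
```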

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D125929

2 years ago[libc] Add compile options to pthread_create target.
Siva Chandra Reddy [Thu, 9 Jun 2022 06:40:55 +0000 (06:40 +0000)]
[libc] Add compile options to pthread_create target.

The compile options now match that of thrd_create. Two compile options
are of importance:
1. -O3 - This is required so that stack is not used between the clone
   syscall and the start function in the child thread.
2. -fno-omit-frame-pointer - This is required so that we can sniff out
   the thread start args from the child thread's stack memory.

Without these two options, pthread_create will exhibit flaky behavior.

Reviewed By: lntue, michaelrj

Differential Revision: https://reviews.llvm.org/D127381

2 years ago[libc] simplify printf converter tests
Michael Jones [Wed, 8 Jun 2022 20:11:02 +0000 (13:11 -0700)]
[libc] simplify printf converter tests

Previously the printf converter tests reused the same string_writer,
which meant that each test depended on the tests before it to succeed.
This makes a new string_writer for each test to simplify and clarify the
tests.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D127341

2 years ago[clang] Allow CLANG_MODULE_CACHE_PATH env var to override module caching behavior
Alex Brachet [Thu, 9 Jun 2022 16:55:37 +0000 (16:55 +0000)]
[clang] Allow CLANG_MODULE_CACHE_PATH env var to override module caching behavior

CLANG_MODULE_CACHE_PATH can be used to change where clang should
put the module cache, or can be set to "" to disable caching entirely.

Differential revision: https://reviews.llvm.org/D126678

2 years ago[mlir][bufferize][NFC] Decouple dropping of equivalent return values from bufferization
Matthias Springer [Thu, 9 Jun 2022 16:37:21 +0000 (18:37 +0200)]
[mlir][bufferize][NFC] Decouple dropping of equivalent return values from bufferization

This simplifies the bufferization itself and is in preparation of connecting with the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126814

2 years ago[mlir][bufferize] Fix bug in module equivalence analysis
Matthias Springer [Thu, 9 Jun 2022 16:30:42 +0000 (18:30 +0200)]
[mlir][bufferize] Fix bug in module equivalence analysis

CallOp results are not equivalent to an OpOperand if the OpOperand bufferizes out-of-place.

Differential Revision: https://reviews.llvm.org/D126813

2 years ago[gn build] (manually) port 4ff5e8184c665
Nico Weber [Thu, 9 Jun 2022 16:29:17 +0000 (12:29 -0400)]
[gn build] (manually) port 4ff5e8184c665

Fixes link of many binaries if RISCV is enabled but most other targets aren't.

2 years ago[mlir][bufferize] Decouple promoteBufferResultsToOutParams from One-Shot Bufferize
Matthias Springer [Thu, 9 Jun 2022 16:24:58 +0000 (18:24 +0200)]
[mlir][bufferize] Decouple promoteBufferResultsToOutParams from One-Shot Bufferize

Users should explicitly run `-buffer-results-to-out-params` instead.

The purpose of this change is to remove `finalizeBuffers`, which made it difficult to extend the bufferization to custom buffer types.

Differential Revision: https://reviews.llvm.org/D126253

2 years ago[mlir][bufferization] Decouple buffer-deallocation from One-Shot Bufferize
Matthias Springer [Thu, 9 Jun 2022 16:19:54 +0000 (18:19 +0200)]
[mlir][bufferization] Decouple buffer-deallocation from One-Shot Bufferize

The buffer deallocation pass must now be run explicitly when `allow-return-alloc` is set.

This results in a few extra buffer copies in unoptimized test cases. The proper way to avoid such copies is to relax the OpOperand/OpResult aliasing contract on ops such as scf.for. Some of these copies can also be avoided by improving the buffer deallocation pass.

Differential Revision: https://reviews.llvm.org/D126252

2 years ago[RISCV][NFC] Update testcase for D126861
Kito Cheng [Thu, 9 Jun 2022 16:17:10 +0000 (00:17 +0800)]
[RISCV][NFC] Update testcase for D126861

2 years ago[lldb] Add a reference to the "On Demand Symbols" docs.
Jonas Devlieghere [Thu, 9 Jun 2022 16:13:00 +0000 (09:13 -0700)]
[lldb] Add a reference to the "On Demand Symbols" docs.

Include a reference to the documentation for "on demand symbols" in the
documentation index. This will ensure the page shows up in the side bar
on the website.

2 years ago[lldb] Add table with custom LLDB asserts to the docs
Jonas Devlieghere [Thu, 9 Jun 2022 16:10:14 +0000 (09:10 -0700)]
[lldb] Add table with custom LLDB asserts to the docs

Add table with custom LLDB asserts to the documentation.

Differential revision: https://reviews.llvm.org/D127410

2 years ago[lldb] Fix code blocks in docs/use/intel_pt.rst
Jonas Devlieghere [Thu, 9 Jun 2022 15:41:03 +0000 (08:41 -0700)]
[lldb] Fix code blocks in docs/use/intel_pt.rst

2 years ago[NFC] change error message wording.
Florian Mayer [Thu, 9 Jun 2022 15:49:03 +0000 (08:49 -0700)]
[NFC] change error message wording.

2 years ago[libcxx] improve LIBCXX_ABI_NAMESPACE error message
Florian Mayer [Tue, 7 Jun 2022 22:26:05 +0000 (15:26 -0700)]
[libcxx] improve LIBCXX_ABI_NAMESPACE error message

Include the invalid LIBCXX_ABI_NAMESPACE to ease debugging.

Reviewed By: #libc, jloser, ldionne

Differential Revision: https://reviews.llvm.org/D127257

2 years ago[RISCV] Fix missing stack pointer recover
Kito Cheng [Thu, 9 Jun 2022 15:35:57 +0000 (23:35 +0800)]
[RISCV] Fix missing stack pointer recover

To make sure the stack pointer is correct throughout the EH region,
we also need to restore the stack pointer from the frame pointer if we
don't reserve stack space in the prologue/epilogue for outgoing arguments.
Normally, checking whether a variable-sized object is present is enough,
but we also don't reserve that space in the prologue/epilogue when there
are vector objects on the stack.

Example to show what happened:
```
try {
  sp adjust for outgoing args. // 1. Sp changed.
  func_call  // 2. Exception raised
  sp restore // Oh, not restored
} catch {
  // 3. And now we are here.
}

// 4. Prepare to return!, restore return address from stack, but...sp is wrong.
// 5. Screw up!
```

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D126861

2 years ago[RISCV] Pre-commit testcase for PR55442
Kito Cheng [Thu, 9 Jun 2022 15:34:13 +0000 (23:34 +0800)]
[RISCV] Pre-commit testcase for PR55442

The testcase shows that the stack pointer isn't recovered when we get an
exception from `_Z3fooiiiiiiiiiiPi`, and then things go wrong because the
return address is restored from the wrong stack pointer.

NOTE:
Trigger conditions:
1. Frame pointer is required.
2. Stack has outgoing arguments.
3. Vector extension is enabled.

Another run-able testcase:

$ clang++ -target riscv64-unknown-linux-gnu -march=rv64gcv test.cpp
```
void __attribute__((noinline)) foo(int, int, int, int, int, int, int, int, int, int, int *){
 throw int(0);
}

int main(int argc, char **argv) {
  int exception_value = 1;
  try {
      foo(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
  } catch (int i) {
    exception_value = i;
  }
  return exception_value;
}
```

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D126860

2 years ago[MLIR][Shape] Generalize `shape.concat` to extent tensors
Yuanqiang Liu [Thu, 9 Jun 2022 15:23:25 +0000 (08:23 -0700)]
[MLIR][Shape] Generalize `shape.concat` to extent tensors

The operation `shape.concat` was used for type shape only.
We now enable it for extent tensors.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D127321

2 years ago[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder
Jun Zhang [Thu, 9 Jun 2022 15:12:21 +0000 (23:12 +0800)]
[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder

The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.

This used to be several downstream patches in Cling; it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:

clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();

JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }

Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D126781

2 years agoRevert "[Attributor] Replace AAValueSimplify with AAPotentialValues"
Johannes Doerfert [Thu, 9 Jun 2022 15:02:57 +0000 (17:02 +0200)]
Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"

This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705.

Patch broke AMD GPU OpenMP offload buildbots.
https://lab.llvm.org/buildbot/#/builders/193/builds/13246

2 years ago[NFC] Clang-format PatternMatch.h
Simon Moll [Thu, 9 Jun 2022 14:51:32 +0000 (16:51 +0200)]
[NFC] Clang-format PatternMatch.h

2 years ago[Attributor] Replace AAValueSimplify with AAPotentialValues
Johannes Doerfert [Tue, 10 May 2022 22:08:43 +0000 (18:08 -0400)]
[Attributor] Replace AAValueSimplify with AAPotentialValues

For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
  locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
  only as an afterthought. `genericValueTraversal` did offer an option
  but `AAValueSimplify` did not. Thus, we might end up with "too much"
  simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
  problems like the infinite recursion bug (#54981) as well as code
  duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good.

Fixes: https://github.com/llvm/llvm-project/issues/54981

2 years ago[Attributor] Try to delete stores and simplify stored values
Johannes Doerfert [Thu, 19 May 2022 18:51:07 +0000 (13:51 -0500)]
[Attributor] Try to delete stores and simplify stored values

By default we should try to eliminate unused stores and simplify values
stored while we are at it.

2 years ago[Attributor] Ensure to use the proper liveness AA
Johannes Doerfert [Thu, 19 May 2022 18:35:58 +0000 (13:35 -0500)]
[Attributor] Ensure to use the proper liveness AA

When determining liveness via Attributor::isAssumedDead(...) we might
end up without a liveness AA or with one pointing into another function.
Neither is helpful and we will avoid both from now on.

2 years ago[AMDGPU] Add GFX11 test coverage for the memory legalizer
Jay Foad [Thu, 9 Jun 2022 14:00:49 +0000 (15:00 +0100)]
[AMDGPU] Add GFX11 test coverage for the memory legalizer

2 years agoPass plugin_name in SBProcess::SaveCore
Levon [Thu, 9 Jun 2022 14:27:51 +0000 (16:27 +0200)]
Pass plugin_name in SBProcess::SaveCore

This CL makes the minidump save-core functionality (https://reviews.llvm.org/D108233) available through the SBProcess interface.
Since support for saving cores from the gdb-remote client was added (https://reviews.llvm.org/D101329), an empty plugin name makes the plugin manager try to save the core directly from the process plugin.
See https://github.com/llvm/llvm-project/blob/main/lldb/source/Core/PluginManager.cpp#L696

To make it possible to save the core with the minidump plugin, I added the plugin name as a parameter to SBProcess::SaveCore.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D125325

2 years ago[RISCV] Add cost model for reverse shuffle
Philip Reames [Thu, 9 Jun 2022 14:20:33 +0000 (07:20 -0700)]
[RISCV] Add cost model for reverse shuffle

The majority of the cost appears to be forming the indices vector.

Differential Revision: https://reviews.llvm.org/D127141

2 years agoRecommit "[SCEV] Look through single value PHIs." (take 3)
Florian Hahn [Thu, 9 Jun 2022 14:20:10 +0000 (15:20 +0100)]
Recommit "[SCEV] Look through single value PHIs." (take 3)

This reverts commit 1fbdbb559569641f6d509b569966901c8fb02b63.

All known issues surfaced by this patch should have been fixed now.

The fixes included fixing issues with SCEV expansion in LV and DA's
reliance on LCSSA phis.

2 years ago[clang][dataflow] Track `optional` contents in `optional` model.
Yitzhak Mandelbaum [Tue, 3 May 2022 15:53:35 +0000 (15:53 +0000)]
[clang][dataflow] Track `optional` contents in `optional` model.

This patch adds partial support for tracking (i.e. modeling) the contents of an
optional value. Specifically, it supports tracking the value after it is
accessed. We leave tracking constructed/assigned contents to a future patch.

Differential Revision: https://reviews.llvm.org/D124932

2 years ago[analyzer] Fix assertion failure after getKnownValue call
Gabor Marton [Wed, 8 Jun 2022 10:11:21 +0000 (12:11 +0200)]
[analyzer] Fix assertion failure after getKnownValue call

Depends on D126560. The parent patch changed `getKnownValue` so that it no
longer performs simplification. This is not correct when the function is
called by the checkers. Thus, a new internal function, `getConstValue`, is
introduced, which simply queries the constraint manager. `SimpleSValBuilder`
uses `getConstValue` internally when a binop is evaluated; this way we avoid
the recursion into the `Simplifier`.

Differential Revision: https://reviews.llvm.org/D127285

2 years ago[NFC] format InstructionSimplify & lowerCaseFunctionNames
Simon Moll [Thu, 9 Jun 2022 14:09:14 +0000 (16:09 +0200)]
[NFC] format InstructionSimplify & lowerCaseFunctionNames

Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889

2 years ago[DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing...
Simon Pilgrim [Thu, 9 Jun 2022 13:46:48 +0000 (14:46 +0100)]
[DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing shuffle op, try to replace an undef op if there is one.

This should fix a number of shuffle regressions in D127115 where the re-ordered combines mean we fail to fold an EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT sequence into a BUILD_VECTOR if we extract from more than one vector source.

2 years ago[Attributor][FIX] Give registered simplification callbacks precedence
Johannes Doerfert [Thu, 9 Jun 2022 12:51:49 +0000 (14:51 +0200)]
[Attributor][FIX] Give registered simplification callbacks precedence

We accidentally checked for constants before we looked for registered
simplification callbacks. The latter needs to take precedence though.

2 years agoFix TableLookupTest on FreeBSD
Andrew Turner [Wed, 18 May 2022 16:21:36 +0000 (17:21 +0100)]
Fix TableLookupTest on FreeBSD

As on Linux, place the Counters array in the __libfuzzer_extra_counters
section. This fixes the test on FreeBSD.

Reviewed by: vitalybuka

Differential Revision: https://reviews.llvm.org/D125902
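
A minimal sketch of the technique, assuming an ELF target (Linux or FreeBSD)
built with GCC or Clang. The array size and the helper names
`extraCountersSize` and `bumpCounter` are made up for illustration and are not
taken from the test. The section name is a valid C identifier, so the linker
provides `__start_`/`__stop_` symbols that delimit it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumCounters = 64; // hypothetical size for this sketch

// Place the array in the named section; `used` keeps the compiler from
// dropping it even if nothing in this file reads it.
__attribute__((section("__libfuzzer_extra_counters"), used))
static std::uint8_t Counters[kNumCounters];

// Linker-provided bounds of the section (available because the section name
// is a valid C identifier).
extern "C" std::uint8_t __start___libfuzzer_extra_counters;
extern "C" std::uint8_t __stop___libfuzzer_extra_counters;

// Size in bytes of everything placed in the section.
std::size_t extraCountersSize() {
  auto Begin =
      reinterpret_cast<std::uintptr_t>(&__start___libfuzzer_extra_counters);
  auto End =
      reinterpret_cast<std::uintptr_t>(&__stop___libfuzzer_extra_counters);
  return static_cast<std::size_t>(End - Begin);
}

// Record one hit in counter I, as instrumented code might do.
void bumpCounter(std::size_t I) {
  assert(I < kNumCounters);
  ++Counters[I];
}
```

In a standalone binary the section should contain at least the
`kNumCounters` bytes of `Counters`.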

2 years ago[mlir][bufferization] Add OneShotBufferize transform op
Matthias Springer [Thu, 9 Jun 2022 11:00:08 +0000 (13:00 +0200)]
[mlir][bufferization] Add OneShotBufferize transform op

This commit allows for One-Shot Bufferize to be used through the transform dialect. No op handle is currently returned for the bufferized IR.

Differential Revision: https://reviews.llvm.org/D125098

2 years ago[docs] Update supported language standards list for C++
Yuki Okushi [Sun, 5 Jun 2022 03:18:47 +0000 (12:18 +0900)]
[docs] Update supported language standards list for C++

Differential Revision: https://reviews.llvm.org/D127065

2 years ago[OpenMP] Fix the build on Windows
Yuki Okushi [Thu, 2 Jun 2022 11:34:04 +0000 (20:34 +0900)]
[OpenMP] Fix the build on Windows

The code expanded from kmp_barrier.h uses some `KMP_INTERNAL_*`s,
so the definitions have to be placed before it.

Fixes #55815

Differential Revision: https://reviews.llvm.org/D126873

2 years ago[clang][tests] Add missing compiler name
Timm Bäder [Thu, 9 Jun 2022 13:10:29 +0000 (15:10 +0200)]
[clang][tests] Add missing compiler name

The driver strips the first argument. Without the compiler name, the
test depends on whether GCC_INSTALL_PREFIX is set or not.

See https://reviews.llvm.org/D125862

2 years ago[pseudo] Move grammar-related headers to a separate dir, NFC.
Haojian Wu [Thu, 9 Jun 2022 10:16:14 +0000 (12:16 +0200)]
[pseudo] Move grammar-related headers to a separate dir, NFC.

We did that for .cpp, but forgot the headers.

Differential Revision: https://reviews.llvm.org/D127388

2 years ago[libc++][CI] Updates Docker image.
Mark de Wever [Mon, 30 May 2022 16:34:15 +0000 (18:34 +0200)]
[libc++][CI] Updates Docker image.

- Updates the image to use Ubuntu Jammy.
- Installs GCC-12 as preparation to migrate to that GCC version.

NOTE: This is a re-application of f2f0dba818a50, which was reverted
in 2b5e3ef83c3 due to an issue with the CI nodes. The CI nodes have
since then been updated and this appears to be fine.

Differential Revision: https://reviews.llvm.org/D126666

2 years agoUse HTTPS links instead of HTTP ones in the C DR status page
Aaron Ballman [Thu, 9 Jun 2022 12:52:07 +0000 (08:52 -0400)]
Use HTTPS links instead of HTTP ones in the C DR status page

2 years ago[pseudo] Fix unit test build
Christian Kandeler [Thu, 9 Jun 2022 12:42:14 +0000 (14:42 +0200)]
[pseudo] Fix unit test build

Analogous to 632545e8ce846ccaeca8df15a3dc5e36d01a1275.

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D127397

2 years ago[mlir] add producer fusion to structured transform ops
Alex Zinenko [Thu, 9 Jun 2022 09:10:52 +0000 (11:10 +0200)]
[mlir] add producer fusion to structured transform ops

This relies on the existing TileAndFuse pattern for tensor-based structured
ops. It complements pure tiling, from which some utilities are generalized.

Depends On D127300

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127319

2 years agoRevert "[lld-macho] Initial support for EH Frames"
Douglas Yung [Thu, 9 Jun 2022 12:25:43 +0000 (05:25 -0700)]
Revert "[lld-macho] Initial support for EH Frames"

This reverts commit 826be330af9c0a8553a5b32718ecd2d97e10438e.

This was causing a test failure on build bots:
  - https://lab.llvm.org/buildbot/#/builders/36/builds/21770
  - https://lab.llvm.org/buildbot/#/builders/58/builds/23913

2 years agoRevert "[lld-macho] Support EH frames under arm64"
Douglas Yung [Thu, 9 Jun 2022 12:24:28 +0000 (05:24 -0700)]
Revert "[lld-macho] Support EH frames under arm64"

This reverts commit 977d62c33e3343a394777c1754682761eebb66cd.

This change was causing crashes in 2 tests on the buildbots:
  - https://lab.llvm.org/buildbot/#/builders/58/builds/23914
  - https://lab.llvm.org/buildbot/#/builders/36/builds/21771

2 years ago[pseudo] Don't clang-format test inputs. NFC
Sam McCall [Thu, 9 Jun 2022 12:18:04 +0000 (14:18 +0200)]
[pseudo] Don't clang-format test inputs. NFC

2 years ago[pseudo] Fix the missing-field-initializers warning from f1ac00c9b0d1, NFC
Haojian Wu [Thu, 9 Jun 2022 12:10:36 +0000 (14:10 +0200)]
[pseudo] Fix the missing-field-initializers warning from f1ac00c9b0d1, NFC

2 years ago[LIBOMPTARGET] Adding AMD to llvm-omp-device-info
Jose Manuel Monsalve Diaz [Wed, 1 Jun 2022 21:49:23 +0000 (21:49 +0000)]
[LIBOMPTARGET] Adding AMD to llvm-omp-device-info

Add device information printing for AMD devices to the
`llvm-omp-device-info` command line tool. The output is inspired by
the rocminfo command line tool.

This commit adds missing HSA functions, enums and structs
needed to query additional information from the HSA agents.
A generic message for the `generic-elf-64bit` plugin is also added.

Example output:
```
llvm-omp-device-info
Device (0):
    This is a generic-elf-64bit device

Device (1):
    This is a generic-elf-64bit device

Device (2):
    This is a generic-elf-64bit device

Device (3):
    This is a generic-elf-64bit device

Device (4):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           0
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (5):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           1
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (6):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           2
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (7):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           3
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE
```

Differential Revision: https://reviews.llvm.org/D126836