platform/upstream/llvm.git
2 years ago[DeadArgElim] Remove dead code after r128810
Fangrui Song [Thu, 9 Jun 2022 04:11:54 +0000 (21:11 -0700)]
[DeadArgElim] Remove dead code after r128810

2 years ago[lld-macho] Support EH frames under arm64
Jez Ng [Thu, 9 Jun 2022 03:41:29 +0000 (23:41 -0400)]
[lld-macho] Support EH frames under arm64

For arm64, llvm-mc emits relocations for the target function
address like so:

  ltmp:
    <CIE start>
    ...
    <CIE end>
    ... multiple FDEs ...
    <FDE start>
    <target function address - (ltmp + pcrel offset)>
    ...

If any of the FDEs in `multiple FDEs` get dead-stripped, then `FDE start`
will move to an earlier address, and `ltmp + pcrel offset` will no longer
reflect an accurate pcrel value. To avoid this problem, we "canonicalize"
our relocation by adding an `EH_Frame` symbol at `FDE start`, and updating
the reloc to be `target function address - (EH_Frame + new pcrel offset)`.
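
The arithmetic behind this canonicalization can be sketched as follows. This is
a minimal model, not LLD's implementation: `PcrelReloc`, `encodedValue`, and
`canonicalize` are hypothetical names, and addresses are plain integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a pcrel relocation: the relocated field lives at
// baseAddr + pcrelOffset. None of these names come from LLD.
struct PcrelReloc {
  uint64_t baseAddr;   // address of the symbol the subtraction is anchored to
  int64_t pcrelOffset; // distance from that symbol to the relocated field
};

// Value stored in the field: target - (base + pcrel offset).
int64_t encodedValue(uint64_t target, const PcrelReloc &r) {
  return static_cast<int64_t>(target) -
         static_cast<int64_t>(r.baseAddr + r.pcrelOffset);
}

// Re-anchor the relocation to a symbol placed at the start of the FDE.
// The field address is unchanged; only the (base, offset) split changes.
PcrelReloc canonicalize(const PcrelReloc &r, uint64_t fdeStart) {
  uint64_t fieldAddr = r.baseAddr + r.pcrelOffset;
  assert(fieldAddr >= fdeStart && "field must lie inside the FDE");
  return {fdeStart, static_cast<int64_t>(fieldAddr - fdeStart)};
}
```

Because the new offset is relative to the FDE itself, it stays valid when
earlier FDEs are removed and the FDE moves to a lower address.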

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D124561

2 years ago[lld-macho] Initial support for EH Frames
Jez Ng [Thu, 9 Jun 2022 03:40:52 +0000 (23:40 -0400)]
[lld-macho] Initial support for EH Frames

== Background ==

`llvm-mc` generates unwind info in both compact unwind and DWARF
formats. LLD already handles the compact unwind format; this diff gets
us close to handling the DWARF format properly.

== Caveats ==

It's not quite done yet, but I figure it's worth getting this reviewed
and landed first as it's shaping up to be a fairly large code change.

**Known limitations of the current code:**

* Only works for x86_64, for which `llvm-mc` emits "abs-ified"
  relocations as described in https://github.com/llvm/llvm-project/commit/618def651b59bd42c05bbd91d825af2fb2145683.
  `llvm-mc` emits regular relocations for ARM EH frames, which we do not
  yet handle correctly.

Since the feature is not ready for real use yet, I've gated it behind a
flag that only gets toggled on during test suite runs. With most of the
new code disabled, we see just a hint of perf regression, so I don't
think it'd be remiss to land this as-is:

             base           diff           difference (95% CI)
  sys_time   1.926 ± 0.168  1.979 ± 0.117  [  -1.2% ..   +6.6%]
  user_time  3.590 ± 0.033  3.606 ± 0.028  [  +0.0% ..   +0.9%]
  wall_time  7.104 ± 0.184  7.179 ± 0.151  [  -0.2% ..   +2.3%]
  samples    30             31

== Design ==

Like compact unwind entries, EH frames are also represented as regular
ConcatInputSections that get pointed to via `Defined::unwindEntry`. This
allows them to be handled generically by e.g. the MarkLive and ICF
code. (But note that unlike compact unwind subsections, EH frame
subsections do end up in the final binary.)

In order to make EH frames "look like" a regular ConcatInputSection,
some processing is required. First, we need to split the `__eh_frame`
section along EH frame boundaries rather than along symbol boundaries.
We do this by decoding the length field of each EH frame. Second, the
abs-ified relocations need to be turned into regular Relocs.
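
As a rough illustration of the first step, splitting along EH frame boundaries
by decoding length fields might look like the sketch below. It is not the code
from this diff: `FrameRange`, `readLE`, and `splitEhFrames` are hypothetical
names, the data is assumed to be little-endian, and treating a zero length as a
terminator is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record describing one EH frame inside the section.
struct FrameRange {
  std::size_t offset; // start of the frame, including its length field
  std::size_t size;   // total size, including its length field
};

// Read an n-byte little-endian integer.
static uint64_t readLE(const uint8_t *p, unsigned n) {
  uint64_t v = 0;
  for (unsigned i = 0; i < n; ++i)
    v |= static_cast<uint64_t>(p[i]) << (8 * i);
  return v;
}

// Walk the section, using each frame's length field to find the next frame.
// A 32-bit length of 0xffffffff means a 64-bit length follows.
std::vector<FrameRange> splitEhFrames(const uint8_t *data, std::size_t size) {
  assert(data != nullptr || size == 0);
  std::vector<FrameRange> frames;
  std::size_t off = 0;
  while (off + 4 <= size) {
    uint64_t len = readLE(data + off, 4);
    std::size_t header = 4;
    if (len == 0)
      break; // assumption: zero length terminates the list
    if (len == 0xffffffffu) {
      if (off + 12 > size)
        break; // truncated extended length
      len = readLE(data + off + 4, 8);
      header = 12;
    }
    if (len > size - off - header)
      break; // frame runs past the end of the section
    frames.push_back({off, header + static_cast<std::size_t>(len)});
    off += header + static_cast<std::size_t>(len);
  }
  return frames;
}
```

Each resulting range could then back its own subsection, independent of where
symbols happen to fall.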

== Next Steps ==

In order to support EH frames on ARM targets, we will either have to
teach LLD how to handle EH frames with explicit relocs, or we can try to
make `llvm-mc` emit abs-ified relocs for ARM as well. I'm hoping to do
the latter as I think it will make the LLD implementation both simpler
and faster to execute.

== Misc ==

The `obj-file-with-stabs.s` test had to be updated as the previous
version would trip assertion errors in the code. It appears that in our
attempt to produce a minimal YAML test input, we created a file with
invalid EH frame data. I've fixed this by re-generating the YAML and not
doing any hand-pruning of it.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D123435

2 years ago[mlir][ods] Mark StructAttr as deprecated
Mogball [Thu, 9 Jun 2022 03:02:21 +0000 (03:02 +0000)]
[mlir][ods] Mark StructAttr as deprecated

2 years ago[InstCombine] Add vector tests for shl+lshr+and transforms; NFC
chenglin.bi [Thu, 9 Jun 2022 03:13:56 +0000 (11:13 +0800)]
[InstCombine] Add vector tests for shl+lshr+and transforms; NFC
D126617

2 years ago[msan][test] Fix cpusetsize for another pthread_getaffinity_np.cpp test
Fangrui Song [Thu, 9 Jun 2022 03:08:05 +0000 (20:08 -0700)]
[msan][test] Fix cpusetsize for another pthread_getaffinity_np.cpp test

Similar to D127368

2 years ago[libc++] Towards a simpler extern template story in libc++
Louis Dionne [Tue, 8 Jun 2021 21:25:08 +0000 (17:25 -0400)]
[libc++] Towards a simpler extern template story in libc++

The flexibility around extern template instantiation declarations in
libc++ results in a very complicated model, especially when support for
slightly different configurations (like the debug mode or assertions
in the dylib) is taken into account. That results in unexpected bugs
like http://llvm.org/PR50534 (and there have been multiple similar
bugs in the past, notably around the debug mode).

This patch gets rid of the _LIBCPP_DISABLE_EXTERN_TEMPLATE knob, which
I don't think is fundamental. Indeed, the motivation for that knob was to
avoid taking a dependency on the library; however, that can be done better
by linking against the static library instead. And in fact, some parts of
the headers will always depend on things defined in the library, which
defeats the original goal of _LIBCPP_DISABLE_EXTERN_TEMPLATE.

Differential Revision: https://reviews.llvm.org/D103960

2 years ago[RISCV] Fix 80 column violations in RISCVInsertVSETVLI.cpp. NFC
Craig Topper [Thu, 9 Jun 2022 01:34:26 +0000 (18:34 -0700)]
[RISCV] Fix 80 column violations in RISCVInsertVSETVLI.cpp. NFC

I think these were likely introduced in the recent work done to
this pass.

2 years ago[msan][test] Use a large cpusetsize for pthread_getaffinity_np
Fangrui Song [Thu, 9 Jun 2022 01:50:23 +0000 (18:50 -0700)]
[msan][test] Use a large cpusetsize for pthread_getaffinity_np

pthread_getaffinity_np (Linux `kernel/sched/core.c:sched_getaffinity`) fails
with EINVAL if 8*cpusetsize (constant in glibc: 1024) is smaller than
`nr_cpu_ids` (CONFIG_NR_CPUS, which is 2048 for several arch/powerpc/configs
configurations).

The build bot clang-ppc64le-linux-lnt seems to have a larger `nr_cpu_ids`.

Differential Revision: https://reviews.llvm.org/D127368

2 years ago[MicrosoftDemangle] Set error to true when returning nullptr.
Zequan Wu [Thu, 9 Jun 2022 00:08:22 +0000 (17:08 -0700)]
[MicrosoftDemangle] Set error to true when returning nullptr.

2 years ago[lld/mac] Write output sections in parallel
Michael Eisel [Thu, 9 Jun 2022 00:09:48 +0000 (20:09 -0400)]
[lld/mac] Write output sections in parallel

This reduces linking time by ~8% for my project (1.19s -> 0.53s for
writeSections()). writeTo is const, which bodes well for it being
parallelizable, and I've looked through the different overridden versions and
can't see any race conditions. It produces the same byte-for-byte output for my
project.

Differential Revision: https://reviews.llvm.org/D126800

2 years agoAdd help text for "breakpoint name", describing the feature more fully.
Jim Ingham [Wed, 8 Jun 2022 18:08:32 +0000 (11:08 -0700)]
Add help text for "breakpoint name", describing the feature more fully.

https://reviews.llvm.org/D127038

2 years ago[BOLT][DWARF] Eagerly write out loclists
Alexander Yermolovich [Wed, 8 Jun 2022 23:52:23 +0000 (16:52 -0700)]
[BOLT][DWARF] Eagerly write out loclists

Take advantage of being able to rewrite .debug_info to reduce the memory
footprint of loclists: write out each loclist as it is added, similar to how
we handle ranges.

Collected on clang-14
trunk
4:41.20 real,   389.50 user,    59.50 sys,      0 amem, 38412532 mmem
4:30.08 real,   376.10 user,    63.75 sys,      0 amem, 38477844 mmem
4:25.58 real,   373.76 user,    54.71 sys,      0 amem, 38439660 mmem
diff
4:34.66 real,   392.83 user,    57.73 sys,      0 amem, 38382560 mmem
4:35.96 real,   377.70 user,    58.62 sys,      0 amem, 38255840 mmem
4:27.61 real,   390.18 user,    57.02 sys,      0 amem, 38223224 mmem

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126999

2 years ago[gn build] (manually) port 0e9a01dcac99 (libcxx module.modulemap gen)
Nico Weber [Wed, 8 Jun 2022 23:50:52 +0000 (19:50 -0400)]
[gn build] (manually) port 0e9a01dcac99 (libcxx module.modulemap gen)

2 years ago[ORC] Add an output stream operator for SymbolStringPool.
Lang Hames [Wed, 8 Jun 2022 23:48:15 +0000 (16:48 -0700)]
[ORC] Add an output stream operator for SymbolStringPool.

Handy for checking string pool state, e.g. when debugging dangling-pool-entry
errors.

2 years ago[RISCV] Add debug message that should have been in D126843.
Craig Topper [Wed, 8 Jun 2022 23:46:01 +0000 (16:46 -0700)]
[RISCV] Add debug message that should have been in D126843.

For consistency with the other messages in this file.

2 years ago[LLDB][NativePDB] Fix several crashes when parsing debug info.
Zequan Wu [Wed, 8 Jun 2022 22:49:24 +0000 (15:49 -0700)]
[LLDB][NativePDB] Fix several crashes when parsing debug info.
1. If array element type is a tag decl, complete it.
2. Fix a few places where `asTag` should be used instead of `asClass()`.
3. Handle the case where `PdbAstBuilder::CreateFunctionDecl` returns nullptr, mainly due to an existing workaround (`m_cxx_record_map`).
4. `FindMembersSize` should never return an error, as this would cause an early exit in `CVTypeVisitor::visitFieldListMemberStream` and then an assertion failure.
5. Some PDBs from C++ runtime libraries have S_LPROC32 followed directly by S_LOCAL, where the local variable location is an S_DEFRANGE_FRAMEPOINTER_REL. There is no information about the base frame register in this case, so ignore it by returning RegisterId::NONE.
6. Add a TODO for when S_DEFRANGE_SUBFIELD_REGISTER describes the variable location of a pointer type. For now, just ignore it if the variable is a pointer.

2 years ago[docs][clang] Fixing minor typo
Jose Manuel Monsalve Diaz [Wed, 8 Jun 2022 23:35:11 +0000 (23:35 +0000)]
[docs][clang] Fixing minor typo

Changing "tot the" to "to the"

2 years ago[clang][dataflow] Remove IndirectionValue class, moving PointeeLoc field into Pointer...
Wei Yi Tee [Wed, 8 Jun 2022 23:15:51 +0000 (01:15 +0200)]
[clang][dataflow] Remove IndirectionValue class, moving PointeeLoc field into PointerValue and ReferenceValue

This patch precedes a future patch to make PointeeLoc for PointerValue possibly empty (for nullptr), by using a pointer instead of a reference type.
ReferenceValue should maintain a non-empty PointeeLoc reference.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D127312

2 years ago[MSAN] print out the only possible invalid parameter (EINVAL is returned)
Kevin Athey [Wed, 8 Jun 2022 23:21:53 +0000 (16:21 -0700)]
[MSAN] print out the only possible invalid parameter (EINVAL is returned)

One more round attempting to figure out what is wrong.

Depends on: https://reviews.llvm.org/D127346

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D127359

2 years ago[NFC] clang: add test for PR55886
Matheus Izvekov [Wed, 8 Jun 2022 22:49:52 +0000 (00:49 +0200)]
[NFC] clang: add test for PR55886

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D127361

2 years ago[lldb] Add assertState function to the API test suite
Jonas Devlieghere [Wed, 8 Jun 2022 22:57:41 +0000 (15:57 -0700)]
[lldb] Add assertState function to the API test suite

Add a function to make it easier to debug a test failure caused by an
unexpected state.

Currently, tests are using assertEqual which results in a cryptic error
message: "AssertionError: 5 != 10". Even when a test provides a message
to make it clear why a particular state is expected, you still have to
figure out which of the two was the expected state, and what the other
value corresponds to.

We have a function in lldbutil that helps you convert the state number
into a user readable string. This patch adds a wrapper around
assertEqual specifically for comparing states and reporting better error
messages.

The aforementioned error message now looks like this: "AssertionError:
stopped (5) != exited (10)". If the user provided a message, that
continues to get printed as well.

Differential revision: https://reviews.llvm.org/D127355

2 years ago[clang-format] Remove braces of else blocks that embody an if block
owenca [Mon, 6 Jun 2022 10:03:27 +0000 (03:03 -0700)]
[clang-format] Remove braces of else blocks that embody an if block

Fixes #55663.

Differential Revision: https://reviews.llvm.org/D127260

2 years ago[libc++] Fix modules builds when features are removed
Louis Dionne [Mon, 23 May 2022 20:48:47 +0000 (16:48 -0400)]
[libc++] Fix modules builds when features are removed

When some headers are not available because we removed features like
localization or threads, the compiler should not try to include these
headers when building modules. To prevent that from happening, add a
requires-declaration that is never satisfied when the configuration
in use doesn't support a header.

rdar://93777687

Differential Revision: https://reviews.llvm.org/D127127

2 years ago[libc][NFC] Mark some methods constexpr
Alex Brachet [Wed, 8 Jun 2022 22:39:52 +0000 (22:39 +0000)]
[libc][NFC] Mark some methods constexpr

gcc is complaining that these methods are being called
from a function that is marked constexpr but these
aren't.

2 years ago[BOLT][NFC] Replace stdio with raw_ostream in CallGraph
Huan Nguyen [Wed, 8 Jun 2022 22:37:25 +0000 (15:37 -0700)]
[BOLT][NFC] Replace stdio with raw_ostream in CallGraph

Replacing stdio functions, e.g., fopen, fprintf, with raw_ostream.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126826

2 years ago[lldb] Update TestModuleLoadedNotifys.py for macOS Ventura
Jonas Devlieghere [Wed, 8 Jun 2022 21:37:00 +0000 (14:37 -0700)]
[lldb] Update TestModuleLoadedNotifys.py for macOS Ventura

On macOS Ventura and later, dyld and the main binary will be loaded
again when dyld moves itself into the shared cache. Update the test
accordingly.

Differential revision: https://reviews.llvm.org/D127331

2 years ago[lldb/Commands] Prevent crash due to reading memory from page zero.
Chelsea Cassanova [Sat, 4 Jun 2022 00:04:13 +0000 (20:04 -0400)]
[lldb/Commands] Prevent crash due to reading memory from page zero.

Adds a check to ensure that a process exists before attempting to get
its ABI to prevent lldb from crashing due to trying to read from page zero.

Differential revision: https://reviews.llvm.org/D127016

2 years ago[mlir][sparse] Fix a problem introduced by the PR for reading complex number.
bixia1 [Wed, 8 Jun 2022 21:57:54 +0000 (14:57 -0700)]
[mlir][sparse] Fix a problem introduced by the PR for reading complex number.

The problem is in function isValid.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D127349

2 years ago[SelectionDAG] Teach computeKnownBits that a nsw self multiply produce a positive...
Craig Topper [Wed, 8 Jun 2022 20:10:36 +0000 (13:10 -0700)]
[SelectionDAG] Teach computeKnownBits that a nsw self multiply produce a positive value.

This matches what we do in IR. For the RISC-V test case, this allows
us to use -8 for the AND mask instead of materializing a constant in a register.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D127335

2 years ago[pseudo] GC GSS nodes, reuse them with a freelist
Sam McCall [Tue, 31 May 2022 21:58:26 +0000 (23:58 +0200)]
[pseudo] GC GSS nodes, reuse them with a freelist

Most GSS nodes have short effective lifetimes, so keeping them around until the
end of the parse is wasteful. Mark and sweep them every 20 tokens.

When parsing clangd/AST.cpp, this reduces the GSS memory from 1MB to 20kB.
We pay ~5% performance for this according to the glrParse benchmark.
(Parsing more tokens between GCs doesn't seem to improve this further).
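
The general mark-and-sweep-with-freelist idea can be sketched as below. This is
a simplified illustration rather than the patch's code: `Node`, `NodePool`,
`allocate`, `collect`, and `liveCount` are hypothetical names, and the real GSS
nodes carry more state than a list of parents.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stack node: each node keeps its parents alive.
struct Node {
  std::vector<Node *> parents;
  bool marked = false;
};

class NodePool {
public:
  // Reuse a swept node if one is available, otherwise grow the storage.
  Node *allocate(std::vector<Node *> parents) {
    Node *n;
    if (!freeList.empty()) {
      n = freeList.back();
      freeList.pop_back();
    } else {
      storage.emplace_back(); // deque growth keeps existing addresses stable
      n = &storage.back();
    }
    n->parents = std::move(parents);
    live.push_back(n);
    return n;
  }

  // Mark everything reachable from the roots, then move the rest to the
  // free list. Returns the number of nodes reclaimed.
  std::size_t collect(const std::vector<Node *> &roots) {
    std::vector<Node *> work(roots.begin(), roots.end());
    while (!work.empty()) {
      Node *n = work.back();
      work.pop_back();
      assert(n != nullptr);
      if (n->marked)
        continue;
      n->marked = true;
      work.insert(work.end(), n->parents.begin(), n->parents.end());
    }
    std::size_t freed = 0;
    std::vector<Node *> survivors;
    for (Node *n : live) {
      if (n->marked) {
        n->marked = false; // reset for the next collection
        survivors.push_back(n);
      } else {
        n->parents.clear();
        freeList.push_back(n);
        ++freed;
      }
    }
    live.swap(survivors);
    return freed;
  }

  std::size_t liveCount() const { return live.size(); }

private:
  std::deque<Node> storage;     // owns all nodes
  std::vector<Node *> freeList; // swept nodes awaiting reuse
  std::vector<Node *> live;     // nodes handed out since the last sweep
};
```

In this sketch a caller would pass the current stack heads as roots and call
`collect` periodically, for example once every fixed number of tokens.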

Compared to the refcounting approach in https://reviews.llvm.org/D126337, this
is simpler (at least the complexity is better isolated) and has >2x less
overhead. It doesn't provide death handlers (for error-handling) but we have
an alternative solution in mind.

Differential Revision: https://reviews.llvm.org/D126723

2 years ago[pseudo] Restore accidentally removed debug print
Sam McCall [Wed, 8 Jun 2022 21:39:34 +0000 (23:39 +0200)]
[pseudo] Restore accidentally removed debug print

2 years ago[pseudo] Invert rows/columns of LRTable storage for speedup. NFC
Sam McCall [Fri, 3 Jun 2022 21:48:41 +0000 (23:48 +0200)]
[pseudo] Invert rows/columns of LRTable storage for speedup. NFC

There are more states than symbols.
This means first partitioning the action list by state leaves us with a smaller
range to binary search over. This improves find() a lot and glrParse() by 7%.
The tradeoff is storing more smaller ranges increases the size of the offsets
array, overall grammar memory is +1% (337->340KB).
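
A minimal sketch of a state-major layout follows, assuming at most one action
per (state, symbol) pair, which the real table does not require. `Entry` and
`StateMajorTable` are hypothetical names, not the LRTable API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flattened table entry.
struct Entry {
  uint16_t state;
  uint16_t symbol;
  uint16_t action;
};

class StateMajorTable {
public:
  StateMajorTable(std::vector<Entry> entries, unsigned numStates) {
    // Sort by state first, then by symbol within each state.
    std::sort(entries.begin(), entries.end(),
              [](const Entry &a, const Entry &b) {
                return a.state != b.state ? a.state < b.state
                                          : a.symbol < b.symbol;
              });
    // offsets[s] .. offsets[s + 1] is the slice of entries for state s.
    offsets.assign(numStates + 1, 0);
    for (const Entry &e : entries) {
      assert(e.state < numStates);
      ++offsets[e.state + 1];
    }
    for (unsigned s = 0; s < numStates; ++s)
      offsets[s + 1] += offsets[s];
    for (const Entry &e : entries) {
      symbols.push_back(e.symbol);
      actions.push_back(e.action);
    }
  }

  // Jump straight to the state's slice, then binary search by symbol.
  bool find(uint16_t state, uint16_t symbol, uint16_t &action) const {
    if (state + 1u >= offsets.size())
      return false;
    auto first = symbols.begin() + offsets[state];
    auto last = symbols.begin() + offsets[state + 1];
    auto it = std::lower_bound(first, last, symbol);
    if (it == last || *it != symbol)
      return false;
    action = actions[it - symbols.begin()];
    return true;
  }

private:
  std::vector<uint32_t> offsets; // one slot per state, plus an end sentinel
  std::vector<uint16_t> symbols; // sorted within each state's slice
  std::vector<uint16_t> actions; // parallel to symbols
};
```

The offsets array has one slot per state, which is where the extra memory in
the tradeoff above comes from when states outnumber symbols.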

Before:
glrParse    188795975 ns    188778003 ns           77 bytes_per_second=1.98068M/s
After:
glrParse    175936203 ns    175916873 ns           81 bytes_per_second=2.12548M/s

Differential Revision: https://reviews.llvm.org/D127006

2 years agoFix FunctionPropertiesAnalysis updating callsite in 1-BB loop
Mircea Trofin [Wed, 8 Jun 2022 21:30:43 +0000 (14:30 -0700)]
Fix FunctionPropertiesAnalysis updating callsite in 1-BB loop

If the callsite is in a single BB loop, we need to exclude the BB from
the successor set (in which it'd be a member), because that set forms a
boundary at which we stop traversing the CFG, when re-ingesting BBs
after inlining; but after inlining, the callsite BB's new successors
should be visited.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D127178

2 years ago[ELF] Support 'G' in .eh_frame
Florian Mayer [Wed, 8 Jun 2022 21:22:04 +0000 (14:22 -0700)]
[ELF] Support 'G' in .eh_frame

Reviewed By: MaskRay, eugenis

Differential Revision: https://reviews.llvm.org/D127148

2 years ago[DWARF] Support 'G' in dwarf parser
Florian Mayer [Wed, 8 Jun 2022 20:56:50 +0000 (13:56 -0700)]
[DWARF] Support 'G' in dwarf parser

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D127171

2 years ago[mlgo] Disable accounting upon ForceStop
Jin Xin Ng [Wed, 8 Jun 2022 21:23:34 +0000 (14:23 -0700)]
[mlgo] Disable accounting upon ForceStop

Once ForceStop is set to true, we only return positive inlining advice when it is mandatory; there is no need for further node/edge accounting.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D127245

2 years ago[compiler-rt] Fix Mmap on FreeBSD AArch64
Andrew Turner [Wed, 18 May 2022 16:18:49 +0000 (17:18 +0100)]
[compiler-rt] Fix Mmap on FreeBSD AArch64

On FreeBSD AArch64, safestack needs to use __syscall to handle 64-bit arguments.

Reviewed by: MaskRay, vitalybuka

Differential Revision: https://reviews.llvm.org/D125901

2 years ago[compiler-rt] Fix the longjmp sp slot on FreeBSD AArch64
Andrew Turner [Tue, 17 May 2022 10:36:48 +0000 (11:36 +0100)]
[compiler-rt] Fix the longjmp sp slot on FreeBSD AArch64

The stack pointer is stored in the second slot in the jump buffer on
AArch64. Use the correct slot value to read this rather than the
following register.

Reviewed by: melver

Differential Revision: https://reviews.llvm.org/D125762

2 years ago[compiler-rt] Add the FreeBSD AArch64 shadow offset
Andrew Turner [Mon, 16 May 2022 16:07:52 +0000 (17:07 +0100)]
[compiler-rt] Add the FreeBSD AArch64 shadow offset

As with 64-bit x86, use an offset in the middle of the address space, scaled up
to work with the full 48 bit space.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D125757

2 years ago[compiler-rt] Add the common FreeBSD AArch64 support
Andrew Turner [Mon, 16 May 2022 15:58:45 +0000 (16:58 +0100)]
[compiler-rt] Add the common FreeBSD AArch64 support

Reviewed by: vitalybuka

Differential Revision: https://reviews.llvm.org/D125756

2 years ago[MSAN] send output to stderr in test: pthread_getaffinity_np.
Kevin Athey [Wed, 8 Jun 2022 21:14:08 +0000 (14:14 -0700)]
[MSAN] send output to stderr in test: pthread_getaffinity_np.

Must send output to stderr to view it.
This will be rolled back when diagnosis is complete.

Depends on: https://reviews.llvm.org/D127320

Differential Revision: https://reviews.llvm.org/D127346

2 years agoAdd llvm's Support lib to the pseudoCXX library
Nathan Lanza [Wed, 8 Jun 2022 21:11:40 +0000 (17:11 -0400)]
Add llvm's Support lib to the pseudoCXX library

This is failing to find `EnableABIBreakingCheck` at link time. Add
Support to provide it here.

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D127269

2 years ago[MSAN] Add comment regarding why pthread_getaffinity_np is not supported on Android.
Kevin Athey [Wed, 8 Jun 2022 18:26:17 +0000 (11:26 -0700)]
[MSAN] Add comment regarding why pthread_getaffinity_np is not supported on Android.

Depends on: https://reviews.llvm.org/D127264

Reviewed By: vitalybuka, fmayer

Differential Revision: https://reviews.llvm.org/D127327

2 years agoRevert "[DWARF] Support 'G' in dwarf parser"
Florian Mayer [Wed, 8 Jun 2022 20:53:00 +0000 (13:53 -0700)]
Revert "[DWARF] Support 'G' in dwarf parser"

This reverts commit 4c71c3386c5c79560517a22e75938c9951f8de68.

2 years agoRevert "[ELF] Support 'G' in .eh_frame"
Florian Mayer [Wed, 8 Jun 2022 20:52:38 +0000 (13:52 -0700)]
Revert "[ELF] Support 'G' in .eh_frame"

This reverts commit 40f34fe4a87d5171854b9b65678ef3d9baea5785.

2 years ago[ELF] Support 'G' in .eh_frame
Florian Mayer [Tue, 7 Jun 2022 00:56:22 +0000 (17:56 -0700)]
[ELF] Support 'G' in .eh_frame

Reviewed By: MaskRay, eugenis

Differential Revision: https://reviews.llvm.org/D127148

2 years ago[mlir][sparse] Add complex number reading from files.
bixia1 [Tue, 7 Jun 2022 23:45:34 +0000 (16:45 -0700)]
[mlir][sparse] Add complex number reading from files.

Support complex numbers for Matrix Market Exchange Formats. Add a test case.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127138

2 years ago[llvm-cov] Assume repeat architectures for universal binaries
Keith Smiley [Tue, 15 Mar 2022 04:57:38 +0000 (21:57 -0700)]
[llvm-cov] Assume repeat architectures for universal binaries

In the case you pass multiple universal binaries to llvm-cov, assume
that if only 1 architecture is passed, it should be used for all the
passed binaries.

This makes it easier to use multiple multi-arch binaries, since it's
likely very rare that your architectures mismatch significantly if you
also have multiple binaries in a single llvm-cov invocation. If the
architecture is invalid for any of the passed binaries, it will still
fail later.
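
The rule described above can be sketched as a small helper. This is only an
illustration: `expandArches` is a hypothetical name, not llvm-cov's actual
code, and validation of the architectures is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: if exactly one architecture was given for several
// binaries, repeat it for each binary. Otherwise return the list unchanged.
std::vector<std::string> expandArches(std::vector<std::string> arches,
                                      std::size_t numBinaries) {
  assert(numBinaries > 0 && "expected at least one binary");
  if (arches.size() == 1 && numBinaries > 1) {
    std::string only = arches.front(); // copy before overwriting the vector
    arches.assign(numBinaries, only);
  }
  return arches;
}
```

With this helper, a single `arm64` given for three binaries would expand to
three `arm64` entries, while an explicit per-binary list passes through as-is.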

Differential Revision: https://reviews.llvm.org/D121667

2 years ago[AMDGPU] Regenerate combine-cond-add-sub.ll
Simon Pilgrim [Wed, 8 Jun 2022 20:10:06 +0000 (21:10 +0100)]
[AMDGPU] Regenerate combine-cond-add-sub.ll

2 years agoSwitch links to use https consistently
Aaron Ballman [Wed, 8 Jun 2022 19:53:08 +0000 (15:53 -0400)]
Switch links to use https consistently

The WG14 website was recently updated to support SSL, and we might as
well make use of that.

2 years agoAdd missing entries for Annex F and Annex H to the C status page
Aaron Ballman [Wed, 8 Jun 2022 19:52:06 +0000 (15:52 -0400)]
Add missing entries for Annex F and Annex H to the C status page

2 years ago[DWARF] Support 'G' in dwarf parser
Florian Mayer [Tue, 7 Jun 2022 00:54:56 +0000 (17:54 -0700)]
[DWARF] Support 'G' in dwarf parser

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D127171

2 years ago[MC] Add 'G' to augmentation string for MTE instrumented functions
Florian Mayer [Fri, 3 Jun 2022 01:05:02 +0000 (18:05 -0700)]
[MC] Add 'G' to augmentation string for MTE instrumented functions

This was agreed on in
https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html

The thread proposed two options
* add a character to augmentation string and handle in libunwind
* use a separate personality function.

It was determined that this is the simpler and better option.

This is part of Arm's AArch64 ABI:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id22

The next step after this is teaching libunwind to untag when this
augmentation character is set.
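
For illustration, a consumer could detect the new character by scanning the
CIE augmentation string, roughly as sketched below. `AugmentationFlags` and
`parseAugmentation` are hypothetical names, not libunwind's API, and the
sketch only records which characters are present without decoding the
augmentation data that accompanies some of them.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the characters seen in an augmentation string.
struct AugmentationFlags {
  bool hasAugmentationData = false; // 'z'
  bool hasLSDA = false;             // 'L'
  bool hasPersonality = false;      // 'P'
  bool hasFDEEncoding = false;      // 'R'
  bool isSignalFrame = false;       // 'S'
  bool mteTaggedFrame = false;      // 'G', added by this patch
  bool sawUnknown = false;          // any character not handled here
};

AugmentationFlags parseAugmentation(const std::string &aug) {
  // The on-disk string is NUL-terminated, so it cannot contain a NUL.
  assert(aug.find('\0') == std::string::npos);
  AugmentationFlags f;
  for (char c : aug) {
    switch (c) {
    case 'z': f.hasAugmentationData = true; break;
    case 'L': f.hasLSDA = true; break;
    case 'P': f.hasPersonality = true; break;
    case 'R': f.hasFDEEncoding = true; break;
    case 'S': f.isSignalFrame = true; break;
    case 'G': f.mteTaggedFrame = true; break;
    default: f.sawUnknown = true; break;
    }
  }
  return f;
}
```

An unwinder following this approach would check `mteTaggedFrame` to decide
whether the frame's stack memory needs untagging.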

Reviewed By: MaskRay, eugenis

Differential Revision: https://reviews.llvm.org/D127007

2 years ago[compiler-rt][test] Restore original symbolize_stack test
Paul Kirth [Wed, 8 Jun 2022 18:59:27 +0000 (18:59 +0000)]
[compiler-rt][test] Restore original symbolize_stack test

In D126580 we updated the test to reflect that there should always
be a full trace. However, some executions do not have symbolizer
information, so we will restore the original test until we can formulate
a more robust test.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D127334

2 years ago[JITLink][ELF][AArch64] Implement R_AARCH64_PREL32 and R_AARCH64_PREL64.
Sunho Kim [Wed, 8 Jun 2022 18:13:03 +0000 (11:13 -0700)]
[JITLink][ELF][AArch64] Implement R_AARCH64_PREL32 and R_AARCH64_PREL64.

This patch implements the R_AARCH64_PREL64 and R_AARCH64_PREL32 relocations that are
used in eh frame pointers. The test case utilizes the obj2yaml tool to create an
artificial eh frame that generates the related relocation types.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D127058

2 years ago[CSSPGO][Preinliner] Set default value of sample-profile-inline-limit-max to 3000
Hongtao Yu [Wed, 8 Jun 2022 18:50:19 +0000 (11:50 -0700)]
[CSSPGO][Preinliner] Set default value of sample-profile-inline-limit-max to 3000

The default value of sample-profile-inline-limit-max is defined as 10000 in sampleprofile.cpp. This is too big for cspreinliner which works with assembly size instead of IR size. The value 3000 turns out to be a good tradeoff. Compared to the value 10000, 3000 gives as good performance and code size, but lower build time.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D127330

2 years agoRevert "Reland "[NFC][compiler-rt][asan] Unify asan and lsan allocator settings""
Leonard Chan [Wed, 8 Jun 2022 18:54:18 +0000 (11:54 -0700)]
Revert "Reland "[NFC][compiler-rt][asan] Unify asan and lsan allocator settings""

This reverts commit b37d84aa8d59dde2fae7388da5101bf471ec3434.

This broke aarch64 asan builders for fuchsia. I accidentally changed the allocator
settings for fuchsia on aarch64 because the new asan allocator settings use:

```
// AArch64/SANITIZER_CAN_USE_ALLOCATOR64 is only for 42-bit VMA
// so no need to different values for different VMA.
const uptr kAllocatorSpace =  0x10000000000ULL;
const uptr kAllocatorSize  =  0x10000000000ULL;  // 3T.
typedef DefaultSizeClassMap SizeClassMap;
```

rather than reaching the final `#else` which would use fuchsia's lsan config.

2 years ago[APFloat] Fix truncation of certain subnormal numbers
Danila Malyutin [Mon, 6 Jun 2022 17:12:43 +0000 (20:12 +0300)]
[APFloat] Fix truncation of certain subnormal numbers

Certain subnormals would be incorrectly rounded away from zero.

Fixes #55838

Differential Revision: https://reviews.llvm.org/D127140

2 years ago[SystemZ] Fix check for zero size when lowering memcmp.
Kai Nacke [Thu, 2 Jun 2022 17:42:49 +0000 (13:42 -0400)]
[SystemZ] Fix check for zero size when lowering memcmp.

During lowering of memcmp/bcmp, the check for a size of 0 is done
in 2 different ways. In rare cases this can lead to a crash in
SystemZSelectionDAGInfo::EmitTargetCodeForMemcmp(). The root cause
is that SelectionDAGBuilder::visitMemCmpBCmpCall() checks for a
constant int value which is not yet evaluated. When the value is
turned into a SDValue, then the evaluation is done and results in
a ConstantSDNode. But EmitTargetCodeForMemcmp() expects the special
case of 0 length to be handled, which results in an assertion.

The fix is to turn the value into a SDValue, so that both functions
use the same check.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D126900

2 years ago[lldb] Improve error reporting from TestAppleSimulatorOSType.py
Jonas Devlieghere [Wed, 8 Jun 2022 18:47:03 +0000 (11:47 -0700)]
[lldb] Improve error reporting from TestAppleSimulatorOSType.py

When we can't find a simulator, report the platform and architecture in
the error message.

2 years ago[MLIR][Presburger] subtract: improve redundant constraint detection
Arjun P [Wed, 8 Jun 2022 18:43:43 +0000 (14:43 -0400)]
[MLIR][Presburger] subtract: improve redundant constraint detection

When constraints in the two operands make each other redundant, prefer constraints of the second because this affects the number of sets in the output at each level; reducing these can help prevent exponential blowup.

This is accomplished by adding extra overloads to Simplex::detectRedundant that only scan a subrange of the constraints for redundancy.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D127237

2 years ago[cmake] Don't try creating an executable when detecting the linker
Louis Dionne [Tue, 17 May 2022 19:05:05 +0000 (15:05 -0400)]
[cmake] Don't try creating an executable when detecting the linker

On most platforms, the linker detection command that we run ends up being
something like `clang++ -Wl,-v` or `clang++ -Wl,--version`. This usually
fails with a missing reference to `_main` because we don't have any input
file. However, when compiling for a target that is implicitly freestanding,
the invocation actually succeeds and a dummy `a.out` file is created in
the current working directory. This is extremely annoying because it
creates an `a.out` file at the root of the monorepo when running CMake
configuration from the root.

Differential Revision: https://reviews.llvm.org/D125827

2 years ago[compiler-rt][hwasan] Check address tagging mode in InitializeOsSupport on Fuchsia
Leonard Chan [Wed, 8 Jun 2022 00:16:28 +0000 (17:16 -0700)]
[compiler-rt][hwasan] Check address tagging mode in InitializeOsSupport on Fuchsia

Differential Revision: https://reviews.llvm.org/D127262

2 years ago[lldb] Use objc_getRealizedClassList_trylock on macOS Ventura and later
Jonas Devlieghere [Wed, 8 Jun 2022 18:32:36 +0000 (11:32 -0700)]
[lldb] Use objc_getRealizedClassList_trylock on macOS Ventura and later

In order to avoid stranding the Objective-C runtime lock, we switched
from objc_copyRealizedClassList to its non locking variant
objc_copyRealizedClassList_nolock. Not taking the lock was relatively
safe because we run this expression on one thread only, but it was still
possible that someone was in the middle of modifying this list while we
were trying to read it. Worst case that would result in a crash in the
inferior without side-effects and we'd unwind and try again later.

With the introduction of macOS Ventura, we can use
objc_getRealizedClassList_trylock instead. It has semantics similar to
objc_copyRealizedClassList_nolock, but instead of not locking at all,
the function returns if the lock is already taken, which avoids the
aforementioned crash without stranding the Objective-C runtime lock.
Because LLDB gets to allocate the underlying memory we also avoid
stranding the malloc lock.

rdar://89373233

Differential revision: https://reviews.llvm.org/D127252

2 years ago[mlir] Refactoring the tablegen Tensor types
wren romano [Sat, 4 Jun 2022 00:17:56 +0000 (17:17 -0700)]
[mlir] Refactoring the tablegen Tensor types

Reduces repetition in tablegen files for defining various tensor types.  In particular the goal is to reduce the repetition when defining new tensor types (e.g., D126994).

Reviewed By: aartbik, rriddle

Differential Revision: https://reviews.llvm.org/D127039

2 years ago[clang][dataflow] Enable use of synthetic properties on all Value instances.
Wei Yi Tee [Wed, 8 Jun 2022 17:55:54 +0000 (19:55 +0200)]
[clang][dataflow] Enable use of synthetic properties on all Value instances.

This patch moves the implementation of synthetic properties from the StructValue class into the Value base class so that it can be used across all Value instances.

Reviewed By: gribozavr2, ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D127196

2 years ago[MSAN] Add result printing for failed call in pthread_getaffinity_np.
Kevin Athey [Wed, 8 Jun 2022 16:45:05 +0000 (09:45 -0700)]
[MSAN] Add result printing for failed call in pthread_getaffinity_np.

Will be reverted when test failure is diagnosed.

Depends on: https://reviews.llvm.org/D127185

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D127320

2 years ago[clang][deps] Make order of module dependencies deterministic
Ben Langmuir [Tue, 7 Jun 2022 19:13:08 +0000 (12:13 -0700)]
[clang][deps] Make order of module dependencies deterministic

This fixes the underlying module dependencies, which had a
non-deterministic order, which was also visible in the order of calls to
DependencyConsumer methods. This was not directly observable in
the clang-scan-deps utility, because it was previously seeing a sorted
order from std::map in DependencyScanningTool. However, the underlying
API previously created a likely issue for any other clients. Note: if
you only apply the change from DependencyScanningTool, you can see the
issue in clang-scan-deps, and existing tests will fail
non-deterministically.
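
As a generic illustration of the problem (not the code from this patch),
iterating a hash-based container yields an unspecified order, so one way to
give consumers a stable order is to sort by a stable key before reporting.
`ModuleDep` and `orderedDeps` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <unordered_map>
#include <vector>

// Hypothetical record for one module dependency.
struct ModuleDep {
  std::string name;
  std::string contextHash;
};

// Returns the dependencies in an order that does not depend on hash-table
// layout: sorted by (name, contextHash).
std::vector<ModuleDep>
orderedDeps(const std::unordered_map<std::string, ModuleDep> &deps) {
  std::vector<ModuleDep> out;
  out.reserve(deps.size());
  for (const auto &entry : deps)
    out.push_back(entry.second);
  std::sort(out.begin(), out.end(),
            [](const ModuleDep &a, const ModuleDep &b) {
              return std::tie(a.name, a.contextHash) <
                     std::tie(b.name, b.contextHash);
            });
  assert(out.size() == deps.size()); // Each dependency is reported once.
  return out;
}
```

Sorting at the source means every client of the API sees the same order,
rather than relying on one tool to sort on the way out.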

Differential Revision: https://reviews.llvm.org/D127243

2 years ago[clang][deps] Set -disable-free for module compilations
Ben Langmuir [Tue, 7 Jun 2022 16:53:38 +0000 (09:53 -0700)]
[clang][deps] Set -disable-free for module compilations

The command-line arguments for module builds are cc1 commands, so they
do not implicitly set -disable-free like a driver invocation, and
Tooling will disable it for the scanning instance itself. Set
-disable-free explicitly so that separate invocations for building
modules will not pay for freeing memory unnecessarily.

Differential Revision: https://reviews.llvm.org/D127229

2 years ago[AMDGPU] gfx11 VOP3P instruction MC support
Joe Nash [Tue, 24 May 2022 17:31:09 +0000 (13:31 -0400)]
[AMDGPU] gfx11 VOP3P instruction MC support

Includes dpp versions of VOP3P instructions.

Patch 18/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126917

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126978

2 years ago[clang][driver] adds `-print-diagnostics`
Christopher Di Bella [Wed, 1 Jun 2022 17:43:52 +0000 (17:43 +0000)]
[clang][driver] adds `-print-diagnostics`

Prints a list of all the warnings that Clang offers.

Differential Revision: https://reviews.llvm.org/D126796

2 years ago[PseudoProbe] Use callee name as callsite identfier for MCDecodedPseudoProbeInlineTree.
Hongtao Yu [Wed, 25 May 2022 23:30:07 +0000 (16:30 -0700)]
[PseudoProbe] Use callee name as callsite identfier for MCDecodedPseudoProbeInlineTree.

The callsite identifier used in pseudo probe encoding and decoding consists of a function name and the callsite probe id. For encoding, i.e., `MCPseudoProbeInlineTree`, the function name is the callee function name. However, for decoding, i.e., `MCDecodedPseudoProbeInlineTree`, the caller function name is actually used. This results in multiple callees that are inlined at the same callsite, likely via indirect call promotion, sharing the same decoded inline frame. While it is not a problem for profile generation, it confuses probe re-encoding in Bolt.

In Bolt, we decode pseudo probes first and build `MCDecodedPseudoProbeInlineTree`. The decoded tree is used for final re-encoding. Here comes the problem. Two inlinees from the same callsite share the same decoded inline frame. During re-encoding, the frame name (whichever inlinee comes first) will be used and encoded in the bolted binary. This will cause wrong inline contexts in the profile generated on the bolted binary.
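
The collision can be sketched with a map keyed by (function name, callsite probe id). This is a simplified model, not the MC data structures: `InlineTree` and `addInlinee` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// (function name, callsite probe id)
using CallsiteKey = std::pair<std::string, uint32_t>;

// Hypothetical, flattened model of one level of an inline tree.
struct InlineTree {
  std::map<CallsiteKey, std::string> frames; // key -> inlinee frame name

  // Returns true if a new frame was created, false if the key already
  // existed and the inlinee ended up sharing an existing frame.
  bool addInlinee(const std::string &keyName, uint32_t probeId,
                  const std::string &callee) {
    assert(!keyName.empty());
    return frames.emplace(CallsiteKey(keyName, probeId), callee).second;
  }
};
```

Keyed by the caller name, two callees promoted at callsite probe 2 of `main` map to the same key and share one frame; keyed by the callee name, each gets its own frame.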

The fix is a no-op to pre-bolt profile generation. Some of the bolt tests are not yet upstreamed, thus I'm not adding a bolt test here.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D126434

2 years ago[mlir] Lower complex.power and complex.rsqrt to standard dialect.
bixia1 [Wed, 8 Jun 2022 16:11:28 +0000 (09:11 -0700)]
[mlir] Lower complex.power and complex.rsqrt to standard dialect.

Add conversion tests and correctness tests.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127255

2 years agoAdd Python bindings for the OpaqueType
dime10 [Wed, 8 Jun 2022 17:50:12 +0000 (19:50 +0200)]
Add Python bindings for the OpaqueType

Implement the C-API and Python bindings for the builtin opaque type, which was previously missing.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127303

2 years ago[docs][clang] Minor typo fix
Jose Manuel Monsalve Diaz [Wed, 8 Jun 2022 17:41:04 +0000 (17:41 +0000)]
[docs][clang] Minor typo fix

Changing "iamge" to "image"

2 years ago[WebAssembly] Implement remaining relaxed SIMD instructions
Thomas Lively [Wed, 8 Jun 2022 17:32:10 +0000 (10:32 -0700)]
[WebAssembly] Implement remaining relaxed SIMD instructions

Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s,
i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are
the last instructions from the relaxed SIMD proposal[1] that had not been
implemented.

[1]:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.

Differential Revision: https://reviews.llvm.org/D127170

2 years ago[CodeView] Fix incorrect CodeView encoding of signed integer constants
Steve Merritt [Mon, 23 May 2022 19:41:58 +0000 (15:41 -0400)]
[CodeView] Fix incorrect CodeView encoding of signed integer constants

Add proper CodeView encoding for positive constant integer values greater than
127.  In addition, use the two byte encoding form for positive values less
than LF_NUMERIC.
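
A simplified sketch of this kind of numeric-leaf encoding follows. It is not
the LLVM implementation: the leaf values are assumed from the public CodeView
headers, only signed leaf kinds are modelled, and `encodeSignedConstant` is a
hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Numeric leaf kinds (assumed values; check against the CodeView headers).
constexpr uint16_t LF_NUMERIC = 0x8000;
constexpr uint16_t LF_CHAR = 0x8000;
constexpr uint16_t LF_SHORT = 0x8001;
constexpr uint16_t LF_LONG = 0x8003;
constexpr uint16_t LF_QUADWORD = 0x8009;

// Appends the low `byteCount` bytes of `bits` in little-endian order.
void appendLittleEndian(std::vector<uint8_t> &out, uint64_t bits,
                        unsigned byteCount) {
  assert(byteCount <= 8);
  for (unsigned i = 0; i < byteCount; ++i)
    out.push_back(static_cast<uint8_t>(bits >> (8 * i)));
}

std::vector<uint8_t> encodeSignedConstant(int64_t value) {
  std::vector<uint8_t> out;
  const uint64_t bits = static_cast<uint64_t>(value);
  if (value >= 0 && value < LF_NUMERIC) {
    // Small non-negative values use the bare two-byte form.
    appendLittleEndian(out, bits, 2);
  } else if (value >= INT8_MIN && value <= INT8_MAX) {
    appendLittleEndian(out, LF_CHAR, 2);
    appendLittleEndian(out, bits, 1);
  } else if (value >= INT16_MIN && value <= INT16_MAX) {
    appendLittleEndian(out, LF_SHORT, 2);
    appendLittleEndian(out, bits, 2);
  } else if (value >= INT32_MIN && value <= INT32_MAX) {
    appendLittleEndian(out, LF_LONG, 2);
    appendLittleEndian(out, bits, 4);
  } else {
    appendLittleEndian(out, LF_QUADWORD, 2);
    appendLittleEndian(out, bits, 8);
  }
  return out;
}
```

In this sketch a value such as 200 never reaches the one-byte signed leaf,
where it would not fit; it takes the two-byte form because it is below
LF_NUMERIC.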

Differential Revision: https://reviews.llvm.org/D126968

2 years ago[X86] Regenerate slow-pmulld.ll with common SSE check prefixes
Simon Pilgrim [Wed, 8 Jun 2022 17:17:50 +0000 (18:17 +0100)]
[X86] Regenerate slow-pmulld.ll with common SSE check prefixes

Add back some unused check prefixes to simplify the D127115 regeneration

2 years agoRevert "[libc++][CI] Updates Docker image."
Mark de Wever [Wed, 8 Jun 2022 17:16:02 +0000 (19:16 +0200)]
Revert "[libc++][CI] Updates Docker image."

This reverts commit f2f0dba818a50fc17ed309823b2fdb72cb725eec.

This Docker file doesn't work on the CI. It fails to clone the checkout.
This seems like an issue with a newer glibc on an older Docker where the
clone3() call fails.

This needs further investigation before relanding.

2 years ago[mlir] Fix handling of some region branch terminator successors
Mogball [Wed, 8 Jun 2022 00:01:44 +0000 (00:01 +0000)]
[mlir] Fix handling of some region branch terminator successors

When `RegionBranchOpInterface::getSuccessorRegions` is called for anything other than the parent op, it expects the operands of the terminator of the source region to be passed, not the operands of the parent op. This was not always respected.

This fixes a bug in integer range inference and ForwardDataFlowSolver and changes `scf.while` to allow narrowing of successors using constant inputs.

Fixes #55873

Reviewed By: mehdi_amini, krzysz00

Differential Revision: https://reviews.llvm.org/D127261

2 years ago[Clang] Fix memory leak due to TemplateArgumentListInfo used in AST node.
Andrew Browne [Fri, 3 Jun 2022 00:42:54 +0000 (17:42 -0700)]
[Clang] Fix memory leak due to TemplateArgumentListInfo used in AST node.

It looks like the leak is rooted at the allocation here:
https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp#L3857

The VarTemplateSpecializationDecl is allocated using placement new which uses the AST structure for ownership: https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/lib/AST/DeclBase.cpp#L99

The problem is the TemplateArgumentListInfo inside https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/include/clang/AST/DeclTemplate.h#L2721
This object contains a vector which does not use placement new: https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/include/clang/AST/TemplateBase.h#L564

Apparently ASTTemplateArgumentListInfo should be used instead https://github.com/llvm/llvm-project/blob/1a155ee7de3b62a2fabee86fb470a1554fadc54d/clang/include/clang/AST/TemplateBase.h#L575

https://reviews.llvm.org/D125802#3551305

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D126944

2 years ago[libc++][NFC] Add missing 'return 0'
Louis Dionne [Wed, 8 Jun 2022 16:56:13 +0000 (12:56 -0400)]
[libc++][NFC] Add missing 'return 0'

2 years ago[mlir][sparse] Add F16 and BF16.
bixia1 [Tue, 7 Jun 2022 23:07:13 +0000 (16:07 -0700)]
[mlir][sparse] Add F16 and BF16.

This is the first PR to add `F16` and `BF16` support to the sparse codegen. There are still problems in supporting these two data types; for example, `BF16` is not quite working yet.

Add test cases.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127010

2 years ago[X86] combineMOVMSK - constant fold with getTargetConstantBitsFromNode not just BUILD...
Simon Pilgrim [Wed, 8 Jun 2022 16:48:47 +0000 (17:48 +0100)]
[X86] combineMOVMSK - constant fold with getTargetConstantBitsFromNode not just BUILD_VECTOR

Help avoid a regression in D127115

2 years ago[libc++][NFC] Simplify enable_if for std::copy optimization
Louis Dionne [Tue, 7 Jun 2022 17:16:52 +0000 (13:16 -0400)]
[libc++][NFC] Simplify enable_if for std::copy optimization

Get rid of the __is_trivially_copy_assignable_unwrapped helper, which
is only used in one place, and use __iter_value_type instead of
iterator_traits<T>::value_type.
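
For readers unfamiliar with the pattern, the kind of enable_if dispatch
involved can be sketched as below. This is not libc++'s code: `copy_sketch`
is a hypothetical name and standard type traits stand in for the internal
helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Generic fallback: element-by-element assignment.
template <class In, class Out>
Out copy_sketch(In first, In last, Out result) {
  for (; first != last; ++first, ++result)
    *result = *first;
  return result;
}

// Overload for raw pointers to the same trivially copy-assignable type
// (ignoring const on the source): one memmove replaces the loop.
template <class T, class U,
          typename std::enable_if<
              std::is_same<typename std::remove_const<T>::type, U>::value &&
                  std::is_trivially_copy_assignable<U>::value,
              int>::type = 0>
U *copy_sketch(T *first, T *last, U *result) {
  assert(first <= last);
  const std::size_t count = static_cast<std::size_t>(last - first);
  if (count > 0)
    std::memmove(result, first, count * sizeof(U));
  return result + count;
}
```

The pointer overload is more specialized, so it wins overload resolution
whenever its enable_if condition holds; otherwise the generic loop is used.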

Differential Revision: https://reviews.llvm.org/D127230

2 years ago[flang] Add one missed semantic check for named constant in common block
PeixinQiao [Wed, 8 Jun 2022 16:43:30 +0000 (00:43 +0800)]
[flang] Add one missed semantic check for named constant in common block

Per Fortran 2018 R874, a common block object must be a variable name, which
cannot be a named constant. Add this check.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D126762

2 years ago[flang] Add one semantic check for procedure bind(C) interface-name
PeixinQiao [Wed, 8 Jun 2022 16:38:14 +0000 (00:38 +0800)]
[flang] Add one semantic check for procedure bind(C) interface-name

Per Fortran 2018 C1521, if proc-language-binding-spec (bind(c)) is
specified in a procedure declaration statement, the proc-interface
shall appear, it shall be an interface-name, and interface-name shall
be declared with a proc-language-binding-spec.

Reviewed By: klausler, Jean Perier

Differential Revision: https://reviews.llvm.org/D127121

2 years ago[LIBOMPTARGET] Adding AMD to llvm-omp-device-info
Jose Manuel Monsalve Diaz [Wed, 1 Jun 2022 21:49:23 +0000 (21:49 +0000)]
[LIBOMPTARGET] Adding AMD to llvm-omp-device-info

Adding device information print for AMD devices on the
`llvm-omp-device-info` command line tool. The output is inspired by
the rocminfo command line tool.

This commit adds missing HSA functions, enums and structs
needed to query additional information from the HSA agents.
A generic message for the `generic-elf-64bit` plugin is also added.

Example of an output:
```
llvm-omp-device-info
Device (0):
    This is a generic-elf-64bit device

Device (1):
    This is a generic-elf-64bit device

Device (2):
    This is a generic-elf-64bit device

Device (3):
    This is a generic-elf-64bit device

Device (4):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           0
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (5):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           1
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (6):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           2
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE

Device (7):
    HSA Runtime Version:                1.1
    HSA OpenMP Device Number:           3
    Device Name:                        gfx906
    Vendor Name:                        AMD
    Device Type:                        GPU
    Max Queues:                         128
    Queue Min Size:                     64
    Queue Max Size:                     131072
    Cache:
      L0:                               16384 bytes
      L1:                               8388608 bytes
    Cacheline Size:                     64
    Max Clock Freq(MHz):                1725
    Compute Units:                      60
    SIMD per CU:                        4
    Fast F16 Operation:                 TRUE
    Wavefront Size:                     64
    Workgroup Max Size:                 1024
    Workgroup Max Size per Dimension:
      x:                                1024
      y:                                1024
      z:                                1024
    Max Waves Per CU:                   40
    Max Work-item Per CU:               2560
    Grid Max Size:                      4294967295
    Grid Max Size per Dimension:
      x:                                4294967295
      y:                                4294967295
      z:                                4294967295
    Max fbarriers/Workgrp:              32
    Memory Pools:
      Pool GLOBAL; FLAGS: COARSE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GLOBAL; FLAGS: FINE GRAINED, :
        Size:                            34342961152 bytes
        Allocatable:                     TRUE
        Runtime Alloc Granule:           4096 bytes
        Runtime Alloc alignment:         4096 bytes
        Accessable by all:               FALSE
      Pool GROUP:
        Size:                            65536 bytes
        Allocatable:                     FALSE
        Runtime Alloc Granule:           0 bytes
        Runtime Alloc alignment:         0 bytes
        Accessable by all:               FALSE
```

Differential Revision: https://reviews.llvm.org/D126836

2 years ago[NFC][Flang][OpenMP] Refactor getting ompobject symbol
PeixinQiao [Wed, 8 Jun 2022 16:29:07 +0000 (00:29 +0800)]
[NFC][Flang][OpenMP] Refactor getting ompobject symbol

Getting ompobject symbol is needed in multiple places and will be
needed later for the lowering of other constructs/clauses such as
copyin clause. Extract them into one function.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D127280

2 years ago[libc++] Make sure we add /llvm to the list of safe directories
Louis Dionne [Wed, 8 Jun 2022 16:25:05 +0000 (12:25 -0400)]
[libc++] Make sure we add /llvm to the list of safe directories

With the new version of Git in Ubuntu Jammy (which is now what we use in
our Docker image), we need to add `/llvm` to the list of safe directories
to avoid failures.

2 years ago[AArch64] Remove ToBeRemoved from AArch64MIPeepholeOpt
David Green [Wed, 8 Jun 2022 16:26:07 +0000 (17:26 +0100)]
[AArch64] Remove ToBeRemoved from AArch64MIPeepholeOpt

The ToBeRemoved is used to remove any MachineInstructions that are no
longer needed, making sure we don't invalidate the iterator that is
currently in use by erasing the instruction straight away. This causes
issues for keeping the code in SSA form though, where subsequent
transforms that require SSA form may have been broken by previous
peepholes.

If, instead, we use make_early_inc_range the iteration issue shouldn't
be present, so long as we do not remove the subsequent instruction in
the peephole optimizations. That way the code between transforms is kept
in SSA form, meaning hopefully fewer things that can go wrong.
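
The early-increment idea can be sketched with a std::list. This illustrates
the technique only, not the LLVM utility itself; `eraseWhileIterating` is a
hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Visits every element and erases those matching `shouldErase`. The
// iterator is advanced before the current element is visited, so erasing
// the current element does not invalidate the iterator used by the loop.
template <class T, class Pred>
std::size_t eraseWhileIterating(std::list<T> &items, Pred shouldErase) {
  const std::size_t originalSize = items.size();
  std::size_t erased = 0;
  for (auto it = items.begin(); it != items.end();) {
    auto current = it++; // Step past `current` first.
    if (shouldErase(*current)) {
      items.erase(current); // Only `current` is invalidated.
      ++erased;
    }
  }
  assert(items.size() + erased == originalSize);
  return erased;
}
```

The same caveat as in the message applies: the loop already holds an
iterator to the next element, so the body must not erase that one.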

Differential Revision: https://reviews.llvm.org/D127296

2 years ago[libc] Fix build when __FE_DENORM is defined
Alex Brachet [Wed, 8 Jun 2022 16:21:53 +0000 (16:21 +0000)]
[libc] Fix build when __FE_DENORM is defined

Differential revision: https://reviews.llvm.org/D127222

2 years ago[flang][NFC] Move genMaxWithZero into fir:::factory
jeanPerier [Wed, 8 Jun 2022 16:01:50 +0000 (18:01 +0200)]
[flang][NFC] Move genMaxWithZero into fir:::factory

Move the function to allow its usage in the Optimizer/Builder functions.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D127295

2 years ago[lldb] Update TestMultithreaded to report FAIL for a non-zero exit code
Jonas Devlieghere [Wed, 8 Jun 2022 15:45:53 +0000 (08:45 -0700)]
[lldb] Update TestMultithreaded to report FAIL for a non-zero exit code

A non-zero exit code from the test binary results in a
CalledProcessError. Without catching the exception, that would result in
an error (unresolved test) instead of a failure. This patch fixes that.

2 years ago[lldb] Parse the dotest output to determine the most appropriate result code
Jonas Devlieghere [Wed, 8 Jun 2022 15:35:38 +0000 (08:35 -0700)]
[lldb] Parse the dotest output to determine the most appropriate result code

Currently we look for keywords in the dotest.py output to determine the
lit result code. This binary approach of a keyword being present works
for PASS and FAIL, where having at least one test pass or fail
respectively results in that exit code. Things are more complicated
for tests that neither passed nor failed, but report a combination of
(un)expected failures, skips or unresolved tests.

This patch changes the logic to parse the number of tests with a
particular result from the dotest.py output. For tests that did not PASS
or FAIL, we now report the lit result code for the one that occurred the
most. For example, if we had a test with 3 skips and 4 expected
failures, we report the test as XFAIL.

We're still mapping multiple tests to one result code, so some loss of
information is inevitable.
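
The selection rule can be sketched as follows. This is not the lit format
code from the patch: `DotestCounts` and `litResultCode` are hypothetical
names, and the mapping from dotest categories to lit codes, the FAIL-before-
PASS order, and the fallback for an empty summary are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical summary of the per-category counts parsed from the output.
struct DotestCounts {
  unsigned passes = 0;
  unsigned failures = 0;
  unsigned errors = 0;
  unsigned skipped = 0;
  unsigned expectedFailures = 0;
  unsigned unexpectedSuccesses = 0;
};

std::string litResultCode(const DotestCounts &counts) {
  // A single failure or pass decides the result, as before.
  if (counts.failures > 0)
    return "FAIL";
  if (counts.passes > 0)
    return "PASS";
  // Otherwise report the category that occurred most often.
  typedef std::pair<unsigned, std::string> Candidate;
  const std::vector<Candidate> others = {
      Candidate(counts.errors, "UNRESOLVED"),
      Candidate(counts.expectedFailures, "XFAIL"),
      Candidate(counts.skipped, "UNSUPPORTED"),
      Candidate(counts.unexpectedSuccesses, "XPASS")};
  assert(!others.empty());
  const auto best = std::max_element(
      others.begin(), others.end(),
      [](const Candidate &a, const Candidate &b) { return a.first < b.first; });
  // Assumption: with no counts at all, treat the run as unresolved.
  return best->first > 0 ? best->second : "UNRESOLVED";
}
```

With 3 skips and 4 expected failures, the sketch picks XFAIL, matching the
example above.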

Differential revision: https://reviews.llvm.org/D127258

2 years ago[WebAssembly] Regenerate simd-build-vector.ll to show full codegen
Simon Pilgrim [Wed, 8 Jun 2022 15:54:26 +0000 (16:54 +0100)]
[WebAssembly] Regenerate simd-build-vector.ll to show full codegen

2 years ago[flang] Add proper todo in BoxValue
Valentin Clement [Wed, 8 Jun 2022 15:50:49 +0000 (17:50 +0200)]
[flang] Add proper todo in BoxValue

Switch debug messages to proper TODOs.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D127282

2 years agoReland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
Joe Nash [Mon, 23 May 2022 14:26:02 +0000 (10:26 -0400)]
Reland [AMDGPU] gfx11 VOP1+VOP2 Instruction MC support

The reverted dependent commit is now relanded, so reland this.
Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917

2 years agoAdd a parameter to LoadFromASTFile that accepts a file system and defaults to the...
Andy Soffer [Wed, 8 Jun 2022 15:17:22 +0000 (15:17 +0000)]
Add a parameter to LoadFromASTFile that accepts a file system and defaults to the real file-system.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D126888