platform/upstream/llvm.git
13 months agoAMDGPU: Add baseline tests for llvm.amdgcn.exp2 folds
Matt Arsenault [Wed, 14 Jun 2023 12:23:26 +0000 (08:23 -0400)]
AMDGPU: Add baseline tests for llvm.amdgcn.exp2 folds

13 months agoAMDGPU: Add llvm.amdgcn.exp2 intrinsic
Matt Arsenault [Wed, 14 Jun 2023 12:07:25 +0000 (08:07 -0400)]
AMDGPU: Add llvm.amdgcn.exp2 intrinsic

Provide direct access to v_exp_f32 and v_exp_f16, so we can start
correctly lowering the generic exp intrinsics.

Unfortunately have to break from the usual naming convention of
matching the instruction name and stripping the v_ prefix. exp is
already taken by the export intrinsic. On the clang builtin side, we
have a choice of maintaining the convention to the instruction name,
or following the intrinsic name.

13 months agoAMDGPU: Add llvm.amdgcn.log to intrinsic documentation
Matt Arsenault [Wed, 14 Jun 2023 12:01:30 +0000 (08:01 -0400)]
AMDGPU: Add llvm.amdgcn.log to intrinsic documentation

13 months agoAMDGPU: Correct semantic bearing typo in intrinsic description
Matt Arsenault [Wed, 14 Jun 2023 11:58:08 +0000 (07:58 -0400)]
AMDGPU: Correct semantic bearing typo in intrinsic description

13 months ago[clang][index] NFCI: Make `CXFile` a `FileEntryRef`
Jan Svoboda [Wed, 31 May 2023 21:53:11 +0000 (14:53 -0700)]
[clang][index] NFCI: Make `CXFile` a `FileEntryRef`

This patch swaps out the `void *` behind `CXFile` from `FileEntry *` to `FileEntryRef::MapEntry *`. This allows us to remove some deprecated uses of `FileEntry::getName()`.

Depends on D151854.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D151938

13 months ago[flang][OpenMP][OpenACC] Support stop statement in OpenMP/OpenACC region
Peixin Qiao [Thu, 15 Jun 2023 09:52:04 +0000 (09:52 +0000)]
[flang][OpenMP][OpenACC] Support stop statement in OpenMP/OpenACC region

This supports lowering of stop statement in OpenMP/OpenACC region.
* OpenMP/OpenACC: Emit `fir.unreachable` only if the block is not
  terminated by any terminator. This avoids knocking off an existing
  OpenMP/OpenACC terminator.
* OpenMP: Emit the OpenMP terminator instead of `fir.unreachable` since
  OpenMP regions can only be terminated by OpenMP terminators. This is
  currently skipped for OpenACC since unstructured code is not yet
  handled specially in OpenACC lowering.

Fixes #60737
Fixes #61877

Co-authored-by: Kiran Chandramohan <kiranchandramohan@gmail.com>
Co-authored-by: Val Donaldson <vdonaldson@nvidia.com>
Reviewed By: vdonaldson, peixin

Differential Revision: https://reviews.llvm.org/D129969

13 months ago[mlir][test] Drop op type from test passes in TestPatterns.cpp
Matthias Springer [Thu, 15 Jun 2023 09:30:03 +0000 (11:30 +0200)]
[mlir][test] Drop op type from test passes in TestPatterns.cpp

When possible, use `OperationPass<>` instead of `OperationPass<ModuleOp>` or `OperationPass<FuncOp>`.

Differential Revision: https://reviews.llvm.org/D153005

13 months ago[clang] Deprecate `DirectoryEntry::getName()`
Jan Svoboda [Thu, 15 Jun 2023 09:22:48 +0000 (11:22 +0200)]
[clang] Deprecate `DirectoryEntry::getName()`

This finally officially deprecates `DirectoryEntry::getName()`. I checked no usages remain in targets built by any of `check-clang`, `check-clang-tools`, `check-clang-extra`. There are probably some remaining usages in places like LLDB and other clients. This will give them a chance to transition to `DirectoryEntryRef::getName()` before we remove the function altogether.

Depends on D151922.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D151927

13 months ago[LSR] Move new test to X86 subdir.
Florian Hahn [Thu, 15 Jun 2023 10:11:05 +0000 (11:11 +0100)]
[LSR] Move new test to X86 subdir.

The test added in 1665cb06307 requires the X86 backend, so move it to
the X86 subdirectory.

13 months ago[AMDGPU][AsmParser][NFC] Simplify v_interp-related operand definitions.
Ivan Kosarev [Thu, 15 Jun 2023 10:04:52 +0000 (11:04 +0100)]
[AMDGPU][AsmParser][NFC] Simplify v_interp-related operand definitions.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152897

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 9.
Ivan Kosarev [Thu, 15 Jun 2023 09:56:56 +0000 (10:56 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 9.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152902

13 months ago[lldb] Fix build error after 7bca6f45
Jan Svoboda [Thu, 15 Jun 2023 09:59:31 +0000 (11:59 +0200)]
[lldb] Fix build error after 7bca6f45

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 8.
Ivan Kosarev [Thu, 15 Jun 2023 09:47:00 +0000 (10:47 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 8.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152809

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 7.
Ivan Kosarev [Thu, 15 Jun 2023 09:40:53 +0000 (10:40 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 7.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152808

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 6.
Ivan Kosarev [Thu, 15 Jun 2023 09:34:19 +0000 (10:34 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 6.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152807

13 months ago[X86] Add test for icmp/sub operand order across blocks (NFC)
Nikita Popov [Thu, 15 Jun 2023 09:34:01 +0000 (11:34 +0200)]
[X86] Add test for icmp/sub operand order across blocks (NFC)

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 5.
Ivan Kosarev [Thu, 15 Jun 2023 09:28:10 +0000 (10:28 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 5.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152805

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 4.
Ivan Kosarev [Thu, 15 Jun 2023 09:09:07 +0000 (10:09 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 4.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152717

13 months ago[clang] NFC: Use `DirectoryEntryRef` in `FileManager::getCanonicalName()`
Jan Svoboda [Thu, 15 Jun 2023 09:09:02 +0000 (11:09 +0200)]
[clang] NFC: Use `DirectoryEntryRef` in `FileManager::getCanonicalName()`

This patch removes the last use of deprecated `DirectoryEntry::getName()`.

Depends on D151855.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D151922

13 months ago[clang] Use `{File,Directory}EntryRef` in modular header search (part 2/2)
Jan Svoboda [Thu, 15 Jun 2023 08:47:27 +0000 (10:47 +0200)]
[clang] Use `{File,Directory}EntryRef` in modular header search (part 2/2)

This patch removes some deprecated uses of `{File,Directory}Entry::getName()`. No functional change intended.

Depends on D151854.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D151855

13 months ago[mlir][Interfaces] Symbols are not trivially dead
Matthias Springer [Thu, 15 Jun 2023 09:15:07 +0000 (11:15 +0200)]
[mlir][Interfaces] Symbols are not trivially dead

The greedy pattern rewrite driver removes ops that are "trivially dead". This could include symbols that are still referenced by other ops. Dead symbols should be removed with the `-symbol-dce` pass instead.

This bug was not triggered for `func::FuncOp`, because ops are not considered "trivially dead" if they do not implement the `MemoryEffectOpInterface`, indicating that the op may or may not have side effects. It is, however, triggered for `transform::NamedSequenceOp`, which implements that interface because it is required for all transform dialect ops.
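
The rule described above can be sketched with a toy model. This is not MLIR's API: the `Op` class, its fields, and `would_be_trivially_dead` are hypothetical names chosen for illustration, and the real check considers more than is shown here.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Toy stand-in for an operation; not MLIR's real data model."""
    name: str
    num_uses: int = 0                       # uses of the op's SSA results
    has_memory_effect_interface: bool = False
    has_side_effects: bool = False          # only meaningful with the interface
    is_symbol: bool = False


def would_be_trivially_dead(op: Op, symbols_are_live: bool) -> bool:
    """Return True if a greedy driver in this toy model may erase `op`."""
    if op.num_uses > 0:
        return False
    # Without the interface the op may or may not have side effects: keep it.
    if not op.has_memory_effect_interface:
        return False
    if op.has_side_effects:
        return False
    # The fix: a symbol may still be referenced by name from other ops,
    # so it is left to a dedicated symbol DCE pass.
    if symbols_are_live and op.is_symbol:
        return False
    return True


func_op = Op("func.func", is_symbol=True)
named_sequence = Op("transform.named_sequence", is_symbol=True,
                    has_memory_effect_interface=True)
```

With `symbols_are_live=False` (the old behavior in this model) only `named_sequence` is erased, which matches the observation that `func::FuncOp` never triggered the bug.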

Differential Revision: https://reviews.llvm.org/D152994

13 months ago[GlobalIsel][X86] Add handling for G_CONCAT_VECTORS
Simon Pilgrim [Thu, 15 Jun 2023 09:17:04 +0000 (10:17 +0100)]
[GlobalIsel][X86] Add handling for G_CONCAT_VECTORS

Replace the legacy legalizer versions - interestingly the only concat_vectors isel patterns we currently have are for AVX512 predicate masks, which gisel doesn't handle at all yet.

13 months ago[LSR] Add test cases showing bad handling of extends of post-inc uses.
Florian Hahn [Thu, 15 Jun 2023 09:15:12 +0000 (10:15 +0100)]
[LSR] Add test cases showing bad handling of extends of post-inc uses.

Tests from #38847, #62852.

13 months ago[BOLT] Move from RuntimeDyld to JITLink
Job Noorman [Thu, 15 Jun 2023 08:52:11 +0000 (10:52 +0200)]
[BOLT] Move from RuntimeDyld to JITLink

RuntimeDyld has been deprecated in favor of JITLink. [1] This patch
replaces all uses of RuntimeDyld in BOLT with JITLink.

Care has been taken to minimize the impact on the code structure in
order to ease the inspection of this (rather large) changeset. Since
BOLT relied on the RuntimeDyld API in multiple places, this wasn't
always possible though and I'll explain the changes in code structure
first.

Design note: BOLT uses a JIT linker to perform what essentially is
static linking. No linked code is ever executed; the result of linking
is simply written back to an executable file. For this reason, I
restricted myself to the use of the core JITLink library and avoided ORC
as much as possible.

RuntimeDyld contains methods for loading objects (loadObject) and symbol
lookup (getSymbol). Since JITLink doesn't provide a class with a similar
interface, the BOLTLinker abstract class was added to implement it. It
was added to Core since both the Rewrite and RuntimeLibs libraries make
use of it. Wherever a RuntimeDyld object was used before, it was
replaced with a BOLTLinker object.

There is one major difference between the RuntimeDyld and BOLTLinker
interfaces: in JITLink, section allocation and the application of fixups
(relocation) happens in a single call (jitlink::link). That is, there is
no separate method like finalizeWithMemoryManagerLocking in RuntimeDyld.
BOLT used to remap sections between allocating (loadObject) and linking
them (finalizeWithMemoryManagerLocking). This doesn't work anymore with
JITLink. Instead, BOLTLinker::loadObject accepts a callback that is
called before fixups are applied which is used to remap sections.
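
A minimal sketch of the callback ordering described above, written as a toy model rather than the BOLTLinker or JITLink API. `ToyLinker`, `load_object`, `remap`, and the addresses are hypothetical and only illustrate that the remapping hook runs between allocation and fixups.

```python
from typing import Callable, Dict, List


class ToyLinker:
    """Hypothetical linker where allocation and fixups happen in one call."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self.addresses: Dict[str, int] = {}

    def load_object(self, sections: List[str],
                    map_sections: Callable[[Dict[str, int]], None]) -> None:
        # Step 1: allocate every section at a provisional address.
        for i, name in enumerate(sections):
            self.addresses[name] = 0x1000 * (i + 1)
        self.events.append("allocate")
        # Step 2: let the caller remap sections before any fixup is applied.
        map_sections(self.addresses)
        self.events.append("remap")
        # Step 3: apply fixups against the final (remapped) addresses.
        self.events.append("fixups")


def remap(addresses: Dict[str, int]) -> None:
    """Example callback: move .text to the address chosen for the output."""
    addresses[".text"] = 0x400000


linker = ToyLinker()
linker.load_object([".text", ".data"], remap)
```

Because the caller cannot get control back between the two phases, passing the remapping step in as a callback is the only point at which it can still influence the addresses that fixups see.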

The actual implementation of the BOLTLinker interface lives in the
JITLinkLinker class in the Rewrite library. It's the only part of the
BOLT code that should directly interact with the JITLink API.

To load an object, JITLinkLinker first creates a LinkGraph
(jitlink::createLinkGraphFromObject) and then links it (jitlink::link).
For the latter, it uses a custom JITLinkContext with the following
properties:
- Use BOLT's ExecutableFileMemoryManager. This one was updated to
  implement the JITLinkMemoryManager interface. Since BOLT never
  executes code, its finalization step is a no-op.
- Pass config: don't use the default target passes since they modify
  DWARF sections in a way that seems incompatible with BOLT. Also run a
  custom pre-prune pass that makes sure sections without symbols are not
  pruned by JITLink.
- Implement symbol lookup. This used to be implemented by
  BOLTSymbolResolver.
- Call the section mapper callback before the final linking step.
- Copy symbol values when the LinkGraph is resolved. Symbols are stored
  inside JITLinkLinker to ensure that later objects (i.e.,
  instrumentation libraries) can find them. This functionality used to
  be provided by RuntimeDyld but I did not find a way to use JITLink
  directly for this.

Some more minor points of interest:
- BinarySection::SectionID: JITLink doesn't have something equivalent to
  RuntimeDyld's Section IDs. Instead, sections can only be referred to
  by name. Hence, SectionID was updated to a string.
- There seem to be no tests for Mach-O. I've tested a small hello-world
  style binary but not more than that.
- On Mach-O, JITLink "normalizes" section names to include the segment
  name. I had to parse the section name back from this manually which
  feels slightly hacky.

[1] https://reviews.llvm.org/D145686#4222642

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D147544

13 months ago[mlir][Vector] Add pattern to reorder elementwise and broadcast ops
Andrzej Warzynski [Fri, 2 Jun 2023 14:32:12 +0000 (15:32 +0100)]
[mlir][Vector] Add pattern to reorder elementwise and broadcast ops

The new pattern will replace elementwise(broadcast) with
broadcast(elementwise) when safe.
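
The rewrite can be illustrated on plain Python lists. This is only a sketch of the algebraic idea, assuming every operand is a broadcast of a scalar to the same length; the helper names are hypothetical and the actual pattern operates on MLIR vector ops.

```python
import operator
from typing import Callable, List


def broadcast(scalar: int, length: int) -> List[int]:
    """Replicate a scalar into a vector of `length` lanes."""
    return [scalar] * length


def elementwise(op: Callable[[int, int], int],
                lhs: List[int], rhs: List[int]) -> List[int]:
    """Apply a binary op lane by lane."""
    return [op(a, b) for a, b in zip(lhs, rhs)]


def before(a: int, b: int, length: int) -> List[int]:
    # elementwise(broadcast): `length` scalar operations
    return elementwise(operator.add, broadcast(a, length), broadcast(b, length))


def after(a: int, b: int, length: int) -> List[int]:
    # broadcast(elementwise): one scalar operation, then a single broadcast
    return broadcast(operator.add(a, b), length)
```

Both forms produce the same vector, but the second performs the arithmetic once on scalars instead of once per lane.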

This change affects tests for vectorising nD-extract. In one case
("vectorize_nd_tensor_extract_with_tensor_extract") I just trimmed the
test and only preserved the key parts (scalar and contiguous load from
the original Op). We could do the same with some other tests if that
helps maintainability.

Differential Revision: https://reviews.llvm.org/D152812

13 months ago[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 3.
Ivan Kosarev [Thu, 15 Jun 2023 08:41:32 +0000 (09:41 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 3.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152716

13 months ago[DAGCombiner] Fix crash when trying to replace an indexed store with a narrow store.
Amara Emerson [Wed, 14 Jun 2023 22:42:43 +0000 (15:42 -0700)]
[DAGCombiner] Fix crash when trying to replace an indexed store with a narrow store.

rdar://108818859

Differential Revision: https://reviews.llvm.org/D152978

13 months ago[InstCombine] Add additional tests for displaced shifts (NFC)
Nikita Popov [Thu, 15 Jun 2023 08:47:54 +0000 (10:47 +0200)]
[InstCombine] Add additional tests for displaced shifts (NFC)

13 months ago[lldb] Fix dead link in JIT debugging doc
David Spickett [Thu, 15 Jun 2023 08:38:08 +0000 (08:38 +0000)]
[lldb] Fix dead link in JIT debugging doc

Thanks to Zhang on Discord for spotting this.

13 months ago[1/3][RISCV] Define machine instruction to write an immediate into vxrm
eopXD [Thu, 18 May 2023 12:27:40 +0000 (05:27 -0700)]
[1/3][RISCV] Define machine instruction to write an immediate into vxrm

This patch-set wants to model rounding mode for the fixed-point
intrinsics of the RVV C intrinsics.

The specification PR: [riscv-non-isa/rvv-intrinsic-doc#222](https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222)

The 3 patches are a proof-of-concept with a bottom-up approach,
going from the machine instruction to the LLVM intrinsics, then to the C
intrinsics. The 3 patches apply the rounding mode control to the
`vaadd` instruction. Subsequent patches will extend the change to all
other fixed-point computations.
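
For context, the effect of `vxrm` on `vaadd` can be sketched for a single element pair. This follows the rounding rules of the RISC-V vector specification as I understand them and is not taken from the patch; it ignores element width and operates on unbounded Python integers.

```python
# vxrm encodings from the RISC-V vector specification.
RNU, RNE, RDN, ROD = 0, 1, 2, 3


def roundoff(v: int, d: int, vxrm: int) -> int:
    """Shift `v` right by `d` bits (d >= 1), rounding according to `vxrm`."""
    shifted = v >> d                      # arithmetic shift on Python ints
    bit_d = (v >> d) & 1                  # v[d]
    bit_below = (v >> (d - 1)) & 1        # v[d-1]
    rest = v & ((1 << (d - 1)) - 1)       # v[d-2:0]
    low = v & ((1 << d) - 1)              # v[d-1:0]
    if vxrm == RNU:                       # round to nearest, ties up
        r = bit_below
    elif vxrm == RNE:                     # round to nearest, ties to even
        r = bit_below & (1 if (rest != 0 or bit_d) else 0)
    elif vxrm == RDN:                     # round down (truncate)
        r = 0
    elif vxrm == ROD:                     # round to odd
        r = 1 if (bit_d == 0 and low != 0) else 0
    else:
        raise ValueError(f"invalid vxrm value: {vxrm}")
    return shifted + r


def vaadd(a: int, b: int, vxrm: int) -> int:
    """Averaging add of one pair of elements: roundoff(a + b, 1)."""
    return roundoff(a + b, 1, vxrm)
```

For example, averaging 1 and 2 gives 2 under `RNU` and `RNE` but 1 under `RDN` and `ROD`, which is why the rounding mode has to be modeled explicitly.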

---

This is the 1st commit of the patch-set.  This patch gives a name to
the machine instruction that writes an immediate into the CSR `vxrm`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D151395

13 months ago[ARM,AArch64] Add a full set of -mtp= options.
Simon Tatham [Thu, 15 Jun 2023 08:26:20 +0000 (09:26 +0100)]
[ARM,AArch64] Add a full set of -mtp= options.

AArch64 has five system registers intended to be useful as thread
pointers: one for each exception level which is RW at that level and
inaccessible to lower ones, and the special TPIDRRO_EL0 which is
readable but not writable at EL0. AArch32 has three, corresponding to
the AArch64 ones that aren't specific to EL2 or EL3.

Currently clang supports only a subset of these registers, and not
even a consistent subset between AArch64 and AArch32:

 - For AArch64, clang permits you to choose between the four TPIDR_ELn
   thread registers, but not the fifth one, TPIDRRO_EL0.

 - In AArch32, on the other hand, the only thread register you can
   choose (apart from 'none, use a function call') is TPIDRURO, which
   corresponds to (the bottom 32 bits of) AArch64's TPIDRRO_EL0.

So there is no thread register that you can currently use in both
targets!

For custom and bare-metal purposes, users might very reasonably want
to use any of these thread registers. There's no reason they shouldn't
all be supported as options, even if the default choices follow
existing practice on typical operating systems.

This commit extends the range of values acceptable to the `-mtp=`
clang option, so that you can specify any of these registers by (the
lower-case version of) their official names in the ArmARM:

 - For AArch64: tpidr_el0, tpidrro_el0, tpidr_el1, tpidr_el2, tpidr_el3
 - For AArch32: tpidrurw, tpidruro, tpidrprw

All existing values of the option are still supported and behave the
same as before. Defaults are also unchanged. No command line that
worked already should change behaviour as a result of this.
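
As a quick reference, the register names listed above can be written as a lookup table. This is an illustrative sketch, not clang's option parsing: the table and function names are hypothetical, and option values that were accepted before this change are deliberately left out because they are not listed here.

```python
# Register-name values for -mtp= named in this commit message, by architecture.
# Values accepted before this change are not listed.
REGISTER_NAME_MTP_VALUES = {
    "aarch64": {"tpidr_el0", "tpidrro_el0", "tpidr_el1", "tpidr_el2", "tpidr_el3"},
    "aarch32": {"tpidrurw", "tpidruro", "tpidrprw"},
}


def accepts_register_name(arch: str, value: str) -> bool:
    """Return True if `value` is one of the register names listed for `arch`."""
    return value in REGISTER_NAME_MTP_VALUES.get(arch, set())
```

Note that the names differ between the two architectures even where the registers correspond, so `tpidrro_el0` is valid only for AArch64 and `tpidruro` only for AArch32.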

The new values for the `-mtp=` option have been agreed with Arm's gcc
developers (although I don't know whether they plan to implement them
in the near future).

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D152433

13 months ago[AArch64] Fix check lines for arm64-neon-across.ll. NFC
David Green [Thu, 15 Jun 2023 08:25:28 +0000 (09:25 +0100)]
[AArch64] Fix check lines for arm64-neon-across.ll. NFC

Commit de0707a2b98162ab52fa2dd9277a9bbb4f7256c7 updated the check lines, but
due to conflicting assembly not all functions kept their checks. This now
distinguishes between selection-dag and global isel.

13 months ago[AArch64][SVE] Enable shouldFoldSelectWithIdentityConstant for SVE.
David Green [Thu, 15 Jun 2023 08:17:50 +0000 (09:17 +0100)]
[AArch64][SVE] Enable shouldFoldSelectWithIdentityConstant for SVE.

Instcombine will canonicalize `select(c, binop(a, b), a)` to
`binop(select(c, b, identityvalue), a)`. The original select form
makes a more natural form for vector predicated operations for
vector architectures like SVE where predication is well supported.
This patch enables shouldFoldSelectWithIdentityConstant for SVE so
that more predicated instructions can be generated, helping simplify
the handling with identity constants.
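
The equivalence between the two forms can be checked on scalars. This is a sketch of the algebra only, using Python integers and a hypothetical `IDENTITY` table; it says nothing about how the backend matches or selects the instructions.

```python
import operator
from itertools import product

# Identity value for each binary operator: binop(x, identity) == x.
IDENTITY = {
    operator.add: 0,
    operator.mul: 1,
    operator.or_: 0,
    operator.and_: -1,
    operator.xor: 0,
}


def select(cond: bool, if_true: int, if_false: int) -> int:
    return if_true if cond else if_false


def original_form(binop, c: bool, a: int, b: int) -> int:
    """select(c, binop(a, b), a)"""
    return select(c, binop(a, b), a)


def canonical_form(binop, c: bool, a: int, b: int) -> int:
    """binop(select(c, b, identity), a)"""
    return binop(select(c, b, IDENTITY[binop]), a)


def forms_agree() -> bool:
    """Compare both forms over a small set of operators and inputs."""
    values = [-3, 0, 1, 7]
    return all(
        original_form(op, c, a, b) == canonical_form(op, c, a, b)
        for op in IDENTITY
        for c, a, b in product([False, True], values, values)
    )
```

The original form maps directly onto a predicated operation (inactive lanes keep `a`), whereas the canonical form needs the identity constant to be materialized first.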

Predicated FMA patterns have also been adjusted here as they need to
look at FMF's. Other operations like add/sub, mul, and/or/xor and
mla/mls have been recently updated.

There is one test (scalable_int_min_max) that increases in size. There
are multiple selects that could be combined into a single select but
do not currently fold.

Differential Revision: https://reviews.llvm.org/D149967

13 months ago[CGCall] Directly create opaque pointers (NFCI)
Nikita Popov [Thu, 15 Jun 2023 08:03:29 +0000 (10:03 +0200)]
[CGCall] Directly create opaque pointers (NFCI)

13 months ago[AArch64] Don't look at type size for scalable types in isExtFreeImpl
David Green [Thu, 15 Jun 2023 07:47:10 +0000 (08:47 +0100)]
[AArch64] Don't look at type size for scalable types in isExtFreeImpl

This fixes one of those 'Request for a fixed element count on a scalable
object' errors in the AArch64 isExtFreeImpl method, where the uses of a sext
are checked to see if the instruction can be considered free.
https://godbolt.org/z/debYP9c4G

Differential Revision: https://reviews.llvm.org/D152930

13 months ago[OpenMP] Update the default version of OpenMP to 5.1
Animesh Kumar [Tue, 6 Jun 2023 10:46:13 +0000 (16:16 +0530)]
[OpenMP] Update the default version of OpenMP to 5.1

The default version of OpenMP is updated from 5.0 to 5.1 which means if -fopenmp is specified but -fopenmp-version is not specified with clang, the default version of OpenMP is taken to be 5.1.  After modifying the Frontend for that, various LIT tests were updated. This patch contains all such changes. At a high level, these are the patterns of changes observed in LIT tests -

  # RUN lines which mentioned `-fopenmp-version=50` need to be kept only if the IR for version 5.0 and 5.1 are different. Otherwise only one RUN line with no version info (i.e. default version) needs to be there.

  # Test cases of this sort already had the RUN lines with respect to the older default version 5.0 and the version 5.1. Only swapping the version specification flag `-fopenmp-version` from newer version RUN line to older version RUN line is required.

  # Diagnostics: Remove the 5.0 version specific RUN lines if there was no difference in the Diagnostics messages with respect to the default 5.1.

  # Diagnostics: In case there was any difference in diagnostics messages between 5.0 and 5.1, mention version specific messages in tests.

  # If the test contained version specific ifdef's e.g. "#ifdef OMP5" but there were no RUN lines for any other version than 5.X, then bring the code guarded by ifdef's outside and remove the ifdef's.

  # Some tests had RUN lines for both 5.0 and 5.1 versions, but it is found that the IR for 5.0 is not different from the 5.1, therefore such RUN lines are redundant. So, such duplicated lines are removed.

  # To generate CHECK lines automatically, use the script llvm/utils/update_cc_test_checks.py
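
The defaulting behavior behind these test changes can be sketched as follows. This is a simplified model, not clang's driver code: the function name is hypothetical, returning 0 when `-fopenmp` is absent is an assumption of the sketch, and so is letting the last `-fopenmp-version=` win.

```python
from typing import List

DEFAULT_OPENMP_VERSION = 51  # 5.1, the new default described above


def effective_openmp_version(args: List[str]) -> int:
    """Return the OpenMP version used for `args`, or 0 if OpenMP is off.

    Only `-fopenmp` and `-fopenmp-version=N` are recognized here.
    """
    if "-fopenmp" not in args:
        return 0
    version = DEFAULT_OPENMP_VERSION
    prefix = "-fopenmp-version="
    for arg in args:
        if arg.startswith(prefix):
            # Assumption: a later occurrence overrides an earlier one.
            version = int(arg[len(prefix):])
    return version
```

A RUN line with only `-fopenmp` therefore now exercises 5.1, which is why RUN lines that spelled out `-fopenmp-version=51` became redundant and 5.0 coverage has to be requested explicitly.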

Reviewed By: saiislam, ABataev

Differential Revision: https://reviews.llvm.org/D129635

(cherry picked from commit 9dd2999907dc791136a75238a6000f69bf67cf4e)

13 months ago[Clang] Directly create opaque pointers
Nikita Popov [Fri, 9 Jun 2023 07:57:21 +0000 (09:57 +0200)]
[Clang] Directly create opaque pointers

In CGTypes, directly create opaque pointers, without computing the
LLVM element type. This is not as straightforward as I thought it
would be, because apparently computing the LLVM type also causes a
number of side effects.

In particular, we no longer produce diagnostics like -Wpacked for
types (only) behind pointers, because we no longer depend on their
layout.

Differential Revision: https://reviews.llvm.org/D152505

13 months ago[MC] Properly report errors for .subsection
Fangrui Song [Thu, 15 Jun 2023 06:57:03 +0000 (23:57 -0700)]
[MC] Properly report errors for .subsection

For the out-of-range error, MCConstantExpr doesn't have a location, so we
can only show "<unknown>:0:".

Also, allow subsection numbers up to 2147483647, which is the maximum value GNU
assembler supports. (GNU assembler also supports negative numbers.)
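
A minimal sketch of the range check described above. The function name and the wording of the message are hypothetical, and rejecting negative numbers is an assumption drawn from the remark that GNU assembler, by contrast, supports them.

```python
MAX_SUBSECTION = 2147483647  # 2**31 - 1, the largest value GNU assembler supports


def check_subsection(value: int) -> str:
    """Return an empty string if `value` is accepted, else an error message.

    Assumption for this sketch: negative numbers are rejected.
    """
    if 0 <= value <= MAX_SUBSECTION:
        return ""
    # A constant expression carries no source location, hence "<unknown>:0:".
    return (f"<unknown>:0: error: subsection number {value} "
            f"is not within [0,{MAX_SUBSECTION}]")
```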

13 months agoRemove small data limit for riscv64.*android triples
AdityaK [Thu, 15 Jun 2023 06:13:59 +0000 (23:13 -0700)]
Remove small data limit for riscv64.*android triples

On Android, the GP register has been repurposed for SCS, so there is no need to have a .sdata section.

Reviewers: enh, craig.topper, pirama, kito-cheng, jrtc27

Differential Revision: https://reviews.llvm.org/D151512

13 months ago[lldb][RISCV] Replace enumeration of RVV builtin types with inclusion to RISCVVTypes.def
eopXD [Wed, 14 Jun 2023 14:39:31 +0000 (07:39 -0700)]
[lldb][RISCV] Replace enumeration of RVV builtin types with inclusion to RISCVVTypes.def

This approach saves us from having to add new lines to the switch
cases when new types are introduced.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D152922

13 months ago[XCOFF] FixupOffsetInCsect should be 0 for R_REF relocation.
esmeyi [Thu, 15 Jun 2023 05:28:45 +0000 (01:28 -0400)]
[XCOFF] FixupOffsetInCsect should be 0 for R_REF relocation.

Summary: The FixupOffsetInCsect should be 0 for an R_REF relocation since it specifies a nonrelocating reference. Otherwise the linker would try to relocate the symbol through its address, and an error like the following occurs.
```
ld: 0711-547 SEVERE ERROR: Object /tmp/1-2a7ea1.o cannot be processed.
RLD address 0x65 for section 2 (.data) is
not contained in the section.
```

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D152777

13 months ago[AMDGPU] Enable Atomic Optimizer and Default to Iterative Scan Strategy.
Pravin Jagtap [Thu, 15 Jun 2023 05:18:38 +0000 (01:18 -0400)]
[AMDGPU] Enable Atomic Optimizer and Default to Iterative Scan Strategy.

D147408 implemented a new iterative approach for scan computations
and added a new flag, `amdgpu-atomic-optimizer-strategy`, which
defaults to DPP.

The changeset https://github.com/GPUOpen-Drivers/llpc/pull/2506
adapts to the new changes in LLPC.

This patch enables the atomic optimizer pass and selects the iterative
approach for scan computations by default for the compute pipeline.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D152649

13 months ago[AMDGPU] Place returns on stack if they would violate VGPR limit
Carl Ritson [Thu, 15 Jun 2023 04:45:15 +0000 (13:45 +0900)]
[AMDGPU] Place returns on stack if they would violate VGPR limit

Check that no VGPRs above the configured maximum would be used by a
return when deciding whether it can be lowered.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D152912

13 months ago[AMDGPU] Remove return VGPRs from callee save list
Carl Ritson [Thu, 15 Jun 2023 04:44:54 +0000 (13:44 +0900)]
[AMDGPU] Remove return VGPRs from callee save list

There is no need to generate spill/restore for registers used in
return value.  This matters for amdgpu_gfx calling convention
where CSR and Ret definitions overlap.

Reviewed By: sebastian-ne

Differential Revision: https://reviews.llvm.org/D152892

13 months ago[mlir] Remove unused forward declaration OpAsmParserResult
Kazu Hirata [Thu, 15 Jun 2023 05:04:47 +0000 (22:04 -0700)]
[mlir] Remove unused forward declaration OpAsmParserResult

The corresponding class definition was removed by:

  commit 17ef97bf7e124d7cba023011a7764e64b2889212
  Author: Chris Lattner <clattner@google.com>
  Date:   Tue Aug 7 09:12:35 2018 -0700

13 months ago[mlir] Remove unused forward declaration FilteredValueUseIterator
Kazu Hirata [Thu, 15 Jun 2023 05:04:46 +0000 (22:04 -0700)]
[mlir] Remove unused forward declaration FilteredValueUseIterator

The corresponding class definition was removed by:

  commit 48e9ef4320a315ba1f2358db7295d53917a14f4b
  Author: River Riddle <riddleriver@gmail.com>
  Date:   Thu Apr 23 16:23:34 2020 -0700

13 months ago[mlir] Remove unused forward declaration AffineSymbolExprStorage
Kazu Hirata [Thu, 15 Jun 2023 05:04:44 +0000 (22:04 -0700)]
[mlir] Remove unused forward declaration AffineSymbolExprStorage

The corresponding struct definition was removed by:

  commit c74996d199e8931d4fc3d72acd50754b43c4ec2d
  Author: Alex Zinenko <zinenko@google.com>
  Date:   Tue May 21 01:34:13 2019 -0700

13 months ago[lldb] Remove unused forward declaration RecordingMemoryManager
Kazu Hirata [Thu, 15 Jun 2023 05:04:43 +0000 (22:04 -0700)]
[lldb] Remove unused forward declaration RecordingMemoryManager

The corresponding class definition was removed by:

  commit 8dfb68e0398ef48d41dc8ea058e9aa750b5fc85f
  Author: Sean Callanan <scallanan@apple.com>
  Date:   Tue Mar 19 00:10:07 2013 +0000

13 months ago[lld] Remove unused forward declarations for DefinedRelative
Kazu Hirata [Thu, 15 Jun 2023 05:04:41 +0000 (22:04 -0700)]
[lld] Remove unused forward declarations for DefinedRelative

The corresponding class definition was removed by:

  commit 502d4ce2e4921cc38ef1226ae093e594d900fe46
  Author: Reid Kleckner <rnk@google.com>
  Date:   Mon Jun 26 15:39:52 2017 +0000

13 months ago[CodeGen] Remove unused function GetOrCreateRTTIProxyGlobalVariable
Kazu Hirata [Thu, 15 Jun 2023 05:04:40 +0000 (22:04 -0700)]
[CodeGen] Remove unused function GetOrCreateRTTIProxyGlobalVariable

The last use was removed by:

  commit 46f366494f3ca8cc98daa6fb4f29c7c446c176b6
  Author: Fangrui Song <i@maskray.me>
  Date:   Sat May 20 08:24:20 2023 -0700

This patch also removes RTTIProxyMap, which becomes unused once I
remove GetOrCreateRTTIProxyGlobalVariable.

Differential Revision: https://reviews.llvm.org/D152782

13 months ago[mlir] Remove unused declaration createComposedAffineApplyOp
Kazu Hirata [Thu, 15 Jun 2023 05:04:38 +0000 (22:04 -0700)]
[mlir] Remove unused declaration createComposedAffineApplyOp

The corresponding function definition was removed by:

  commit 362557e11c8185e172a49f7542dc04c519857230
  Author: Nicolas Vasilache <ntv@google.com>
  Date:   Fri Jan 11 16:08:16 2019 -0800

13 months ago[mlir] Use DenseMapBase::lookup (NFC)
Kazu Hirata [Thu, 15 Jun 2023 05:04:37 +0000 (22:04 -0700)]
[mlir] Use DenseMapBase::lookup (NFC)

13 months agoPrevent deadlocks in death tests.
Martin Braenne [Tue, 13 Jun 2023 07:05:15 +0000 (07:05 +0000)]
Prevent deadlocks in death tests.

We have recently started seeing deadlocks in death tests while running in an internal test environment.

Per the documentation here, there are issues with death tests in the presence of threads:

https://github.com/google/googletest/blob/main/docs/advanced.md#death-tests-and-threads

To avoid the deadlocks, I first tried appending `DeathTest` to the relevant test suite names, which has the effect of running these test suites before all other tests. However, this did not prevent the deadlocks.

This patch therefore uses the option of setting the `death_test_style` flag to `"threadsafe"` (see description in the page linked above under "Death Test Styles"), and this prevents the deadlocks.

The documentation notes that the "threadsafe" death test style "trades increased test execution time (potentially dramatically so) for improved thread safety". This is because, to execute a death test, "threadsafe" does a "fork + exec", then re-executes the current test in the child process, whereas the default "fast" death test style does only a fork (on those platforms that support it). However, as we have relatively few death tests, the increased execution time does not make a big difference in total test execution time in my testing.

Note that other projects, such as Chromium, also choose to set the "threadsafe" death test style globally:

https://source.chromium.org/chromium/chromium/src/+/main:base/test/test_suite.cc;l=367

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D152696

13 months ago[clang][NFC] Add a notice to desugarForDiagnostic
Younan Zhang [Wed, 14 Jun 2023 01:59:30 +0000 (09:59 +0800)]
[clang][NFC] Add a notice to desugarForDiagnostic

`desugarForDiagnostic` only sets ShouldAKA to true if desugaring
happens, otherwise ShouldAKA is left intact and might be uninitialized.

Victims (including me):

https://github.com/llvm/llvm-project/commit/25bf8cb3c0e3c41231289a6ff0a37b6d49b24011

https://github.com/llvm/llvm-project/commit/0e8384a0fe4f03d60cd92aba1cae074512481ca2

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D152880

13 months agoFix test target_cpu_features.f90
Yaxun (Sam) Liu [Thu, 15 Jun 2023 03:39:35 +0000 (23:39 -0400)]
Fix test target_cpu_features.f90

Change-Id: Iae75abc3e4d1508f08080f687aba0ee1af74da2b

13 months ago[HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`
Yaxun (Sam) Liu [Wed, 24 May 2023 16:59:01 +0000 (12:59 -0400)]
[HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`

HIP texture/image support is optional as some devices
do not have image instructions. A macro __HIP_NO_IMAGE_SUPPORT
is defined for devices that do not support images (https://github.com/ROCm-Developer-Tools/HIP/blob/d0448aa4c4dd0f4b29ccf6a663b7f5ad9f5183e0/docs/reference/kernel_language.md?plain=1#L426 )

Currently the macro is defined by the HIP header based on predefined macros
for the GPU, e.g. __gfx*__, which is error-prone. This patch lets clang
emit the predefined macro.

Reviewed by: Matt Arsenault, Artem Belevich

Differential Revision: https://reviews.llvm.org/D151349

13 months ago[RISCV] Remove redundant line `NOTE: Assertions have been autogenerated by utils...
Jim Lin [Thu, 15 Jun 2023 02:18:27 +0000 (10:18 +0800)]
[RISCV] Remove redundant line `NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py` from riscv64-zknd-zkne.c

13 months ago[RISCV] Remove fcvt.d.l(u) and fcvt.l(u).d instructions with _IN32X suffix.
Craig Topper [Thu, 15 Jun 2023 01:03:57 +0000 (18:03 -0700)]
[RISCV] Remove fcvt.d.l(u) and fcvt.l(u).d instructions with _IN32X suffix.

This is the same as D152950 without depending on D152948.

_IN32X instructions are for Zdinx on RV32 where doubles are split
across 2 registers.

fcvt.d.l(u) and fcvt.l(u).d are RV64 only instructions so we don't
need _IN32X versions of them.

Reviewed By: sunshaoce

Differential Revision: https://reviews.llvm.org/D152952

13 months ago[lldb] Have crashlog warn when remapped paths are inaccessible.
Jonas Devlieghere [Thu, 15 Jun 2023 00:15:28 +0000 (17:15 -0700)]
[lldb] Have crashlog warn when remapped paths are inaccessible.

It can be tricky to troubleshoot why the crashlog script can't show
inline sources. The two most common causes are that we couldn't find the
dSYM or, if we find the dSYM, that the path remapping included in the
dsymForUUID output isn't accessible. The former is already easy to
diagnose, but the latter is harder because you'd have to manually invoke
dsymForUUID on the UUID and check the remapped path. This patch
automates that process and prints a warning if the remapped path doesn't
exist or is not accessible.
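
The check described above can be sketched as follows. The crashlog script itself is written in Python; this C++17 version only illustrates the idea, and `warnIfInaccessible` is a hypothetical name rather than anything in lldb.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper: returns a warning message when the remapped source
// path does not exist or cannot be queried, and std::nullopt otherwise.
std::optional<std::string> warnIfInaccessible(const std::string &remappedPath) {
  std::error_code ec;
  // The error_code overload does not throw; a missing path yields false.
  bool exists = std::filesystem::exists(remappedPath, ec);
  if (ec || !exists)
    return "warning: remapped source path '" + remappedPath +
           "' does not exist or is not accessible";
  return std::nullopt;
}
```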

Differential revision: https://reviews.llvm.org/D152886

13 months ago[lldb] Remove lldbassert from DebugNamesDWARFIndex::GetGlobalVariables
Jonas Devlieghere [Thu, 15 Jun 2023 00:12:42 +0000 (17:12 -0700)]
[lldb] Remove lldbassert from DebugNamesDWARFIndex::GetGlobalVariables

34a8e6eee666 changed SymbolFileDWARF::GetDwoNum to
SymbolFileDWARF::GetFileIndex but changed the meaning from just DWO to
DWO and OSO which changed the meaning of the assert. The assert was
therefore removed from ManualDWARFIndex::GetGlobalVariables and
ManualDWARFIndex::GetGlobalVariables but was still present in
DebugNamesDWARFIndex::GetGlobalVariables. If we want to reintroduce the
assert, we need something with the old semantics for all 3.

13 months ago[TextAPI] add osx to possible string to platform input
Cyndy Ishida [Wed, 14 Jun 2023 23:24:42 +0000 (16:24 -0700)]
[TextAPI] add osx to possible string to platform input

13 months ago[Bazel] Another fix for 7a2fdc685f609730af29e5e969843e9eb71a184c
Pranav Kant [Wed, 14 Jun 2023 23:28:50 +0000 (23:28 +0000)]
[Bazel] Another fix for 7a2fdc685f609730af29e5e969843e9eb71a184c

13 months agoRevert "[Clang][MS] Remove assertion on BaseOffset can't be smaller than Size."
Nico Weber [Wed, 14 Jun 2023 23:19:48 +0000 (16:19 -0700)]
Revert "[Clang][MS] Remove assertion on BaseOffset can't be smaller than Size."

This reverts commit 5d54213ee557a86fae82af0f75498adf02f24e82.

Breaks check-clang on Windows, see https://reviews.llvm.org/D152472#4422913

13 months agoRevert "[ABI] [C++20] [Modules] Don't generate vtable if the class is defined in...
Nico Weber [Wed, 14 Jun 2023 23:17:31 +0000 (16:17 -0700)]
Revert "[ABI] [C++20] [Modules] Don't generate vtable if the class is defined in other module unit"

Breaks check-clang on win and mac, see comments on https://reviews.llvm.org/D150023

This reverts commit d8a36b00d198fdc2ea866ea5da449628db07070f.

Also revert follow-up "[NFC] skip the test modules-vtable.cppm on windows"
This reverts commit baf0b12ca6c624b2a59aa6f2fd0310c72d35ac56.

13 months ago[Bazel] Fix for 7a2fdc685f609730af29e5e969843e9eb71a184c
Pranav Kant [Wed, 14 Jun 2023 23:11:53 +0000 (23:11 +0000)]
[Bazel] Fix for 7a2fdc685f609730af29e5e969843e9eb71a184c

13 months ago[scudo] Disable new/delete mismatch tests on Android.
Christopher Ferris [Wed, 14 Jun 2023 20:49:37 +0000 (13:49 -0700)]
[scudo] Disable new/delete mismatch tests on Android.

Android does not do any checking of new/delete mismatches, so disable
this test when compiling for Android.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D152958

13 months ago[lldb] Fix SBPlatform after f4be9ff6458f
Alex Langford [Wed, 14 Jun 2023 21:14:58 +0000 (14:14 -0700)]
[lldb] Fix SBPlatform after f4be9ff6458f

If you pass `nullptr` (or `None` from python) to SBPlatform::SetSDKRoot,
LLDB crashes. Let's be more resilient to `nullptr` here.
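
As an illustration only, one common way to tolerate a null C string is to map it to an empty value before storing it. `FakePlatform` and `setSDKRoot` below are hypothetical stand-ins, not the real SBPlatform code, and the commit message does not say how the actual fix is written.

```cpp
#include <string>

// Hypothetical stand-in for a platform object; not the real SBPlatform.
struct FakePlatform {
  std::string sdkRoot;

  // Treat a null pointer as "no SDK root". Constructing a std::string
  // directly from a null pointer would be undefined behavior.
  void setSDKRoot(const char *sysroot) { sdkRoot = sysroot ? sysroot : ""; }
};
```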

Differential Revision: https://reviews.llvm.org/D152962

13 months ago[RDF] Create build config
Krzysztof Parzyszek [Sat, 3 Jun 2023 13:59:19 +0000 (06:59 -0700)]
[RDF] Create build config

- Add option to ignore reserved registers
- Add possibility to track selected registers or register classes only

Tracking is done based on register units, so the set of registers to track
is translated into a set of register units.

13 months ago[hwasan] Fixup mmap tagging regions
Vitaly Buka [Wed, 14 Jun 2023 22:22:32 +0000 (15:22 -0700)]
[hwasan] Fixup mmap tagging regions

Reviewed By: thurston

Differential Revision: https://reviews.llvm.org/D152893

13 months ago[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms
Jonas Devlieghere [Wed, 14 Jun 2023 20:00:38 +0000 (13:00 -0700)]
[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms

On Apple platforms, we generate .apple_names, .apple_types,
.apple_namespaces and .apple_objc Apple accelerator tables for DWARF 4
and earlier. For DWARF 5 we should generate .debug_names, but instead we
get no accelerator tables at all.

In the backend we are correctly determining that we should be emitting
.debug_names instead of .apple_names. However, when we get to the point
of emitting the section, if the CU debug name table kind is not
"default", the accelerator table emission is skipped.

This patch sets the DebugNameTableKind to Apple in the frontend when
targeting an Apple platform. That way we know that the CU was compiled with
the intent of emitting accelerator tables. For DWARF 4 and earlier, that
means Apple accelerator tables. For DWARF 5 and later, that means
.debug_names.
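
The intended outcome for Apple targets can be summarized in a small sketch. The enum and function names here are hypothetical and do not correspond to LLVM's actual types; only the version split comes from the description above.

```cpp
// Hypothetical names; the real logic lives in the frontend and backend.
enum class AccelTableFormat { AppleTables, DebugNames };

// For a CU compiled for an Apple target: DWARF 4 and earlier get the
// .apple_* tables, DWARF 5 and later get .debug_names.
constexpr AccelTableFormat appleAccelTableFor(unsigned dwarfVersion) {
  return dwarfVersion >= 5 ? AccelTableFormat::DebugNames
                           : AccelTableFormat::AppleTables;
}
```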

Differential revision: https://reviews.llvm.org/D118754

13 months ago[NFC] Autogenerate several Mips test.
Amaury Séchet [Wed, 14 Jun 2023 22:06:04 +0000 (22:06 +0000)]
[NFC] Autogenerate several Mips test.

13 months ago[ELF] Fix early overflow check in finalizeAddressDependentContent
Andreu Carminati [Wed, 14 Jun 2023 22:26:31 +0000 (15:26 -0700)]
[ELF] Fix early overflow check in finalizeAddressDependentContent

LLD terminates with errors when it detects overflows in the
finalizeAddressDependentContent calculation. Sometimes, however, those overflows
are not real errors but intermediate results of an ongoing address
calculation. If we continue the fixed-point algorithm, it can converge to the
correct result.

This patch

* Removes the verification inside the fixed point algorithm.
* Calls checkMemoryRegions at the end.
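
The two bullets above can be sketched with a toy fixed-point loop. This is not LLD's code: `layoutToFixedPoint`, `LayoutResult`, and the `step` callback are hypothetical, and a single size stands in for the whole layout.

```cpp
#include <cstdint>
#include <functional>

struct LayoutResult {
  uint64_t size;     // size after the iteration stopped
  bool fitsInRegion; // checked once, after the loop
};

// Hypothetical sketch: `step` maps the current size estimate to the next one.
// Intermediate estimates may exceed the region; only the final one is checked.
LayoutResult layoutToFixedPoint(uint64_t initial, uint64_t regionLength,
                                const std::function<uint64_t(uint64_t)> &step,
                                unsigned maxPasses = 32) {
  uint64_t size = initial;
  for (unsigned pass = 0; pass < maxPasses; ++pass) {
    uint64_t next = step(size);
    if (next == size)
      break;     // converged
    size = next; // deliberately no overflow check inside the loop
  }
  return {size, size <= regionLength};
}
```

With a step function that shrinks an estimate of 220 down to 120 over two passes, a 128-byte region is reported as fitting even though the first two estimates exceeded it.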

Reviewed By: peter.smith, MaskRay

Differential Revision: https://reviews.llvm.org/D152170

13 months ago[ELF] Refine warning condition for memory region assignment for non-allocatable section
Andreu Carminati [Wed, 14 Jun 2023 22:23:14 +0000 (15:23 -0700)]
[ELF] Refine warning condition for memory region assignment for non-allocatable section

The warning "ignoring memory region assignment for non-allocatable section" should be generated only when both of the following hold:

* the section lacks the SHF_ALLOC attribute, and
* input sections or data commands (ByteCommand) are present

The goal of the change is to reduce spurious warnings that are generated for some output sections that have no input section.
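
A minimal sketch of the refined predicate, assuming a simplified description of an output section. `OutputSectionInfo` and `shouldWarnIgnoredRegion` are hypothetical names, not LLD's; only the value of SHF_ALLOC (0x2) is taken from the ELF specification.

```cpp
#include <cstdint>

constexpr uint64_t kShfAlloc = 0x2; // SHF_ALLOC as defined by the ELF spec

// Hypothetical, simplified view of an output section.
struct OutputSectionInfo {
  uint64_t flags;
  bool hasMemoryRegion;  // a memory region was assigned in the linker script
  bool hasInputSections;
  bool hasDataCommands;  // e.g. BYTE(...) style data commands
};

// Warn only for non-allocatable sections that actually carry content.
bool shouldWarnIgnoredRegion(const OutputSectionInfo &s) {
  bool allocatable = (s.flags & kShfAlloc) != 0;
  bool hasContent = s.hasInputSections || s.hasDataCommands;
  return s.hasMemoryRegion && !allocatable && hasContent;
}
```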

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D151802

13 months ago[test][hwasan] Use perror and abort in test
Vitaly Buka [Wed, 14 Jun 2023 22:21:41 +0000 (15:21 -0700)]
[test][hwasan] Use perror and abort in test

13 months ago[flang][openacc] Add lowering for max operator
Valentin Clement [Wed, 14 Jun 2023 22:20:16 +0000 (15:20 -0700)]
[flang][openacc] Add lowering for max operator

Add support for the max operator in the reduction
clause.

Depends on D151671

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D151672

13 months agoRevert "[gn build] Port 2700da5fe28d (lld/unittests etc)"
Nico Weber [Wed, 14 Jun 2023 22:07:27 +0000 (15:07 -0700)]
Revert "[gn build] Port 2700da5fe28d (lld/unittests etc)"

2700da5fe28d got reverted in aa495214b39d.

This reverts commit 9239cde390e2c8e7cc4ffd13bff7030a5172c805.

Also revert follow-up "[gn] Fix case of directory I added in 9239cde390e"
This reverts commit 4de67143babd20f44c1f806404df356bff6825a2.

13 months ago[libc++][ranges][NFC] Status page: Adds `views::enumerate`
Hristo Hristov [Wed, 14 Jun 2023 21:49:11 +0000 (00:49 +0300)]
[libc++][ranges][NFC] Status page: Adds `views::enumerate`

13 months agoRevert "[scudo] Temporariy dispatch region from `RegionBeg`"
Chia-hung Duan [Wed, 14 Jun 2023 21:25:59 +0000 (21:25 +0000)]
Revert "[scudo] Temporariy dispatch region from `RegionBeg`"

This reverts commit 9d9a7732e14d7d4c0db7b46d6ebe588e8f43b951.

This was a workaround for some platforms, and the underlying issue has been
fixed in bfa02523b2e7ed66368ea61866a474e55ef354a3.

Differential Revision: https://reviews.llvm.org/D152964

13 months ago[NFC] Autogenerate numerous SystemZ tests
Amaury Séchet [Wed, 14 Jun 2023 21:45:06 +0000 (21:45 +0000)]
[NFC] Autogenerate numerous SystemZ tests

13 months ago[lldb][NFCI] Remove unused method ProcessStructReader::GetOffsetOf
Alex Langford [Wed, 14 Jun 2023 21:45:18 +0000 (14:45 -0700)]
[lldb][NFCI] Remove unused method ProcessStructReader::GetOffsetOf

Completely unused here, AFAICT. It's also not used in the swift
downstream forks.

13 months ago[lldb][NFCI] Remove ProcessStructReader header where unused
Alex Langford [Wed, 14 Jun 2023 21:37:44 +0000 (14:37 -0700)]
[lldb][NFCI] Remove ProcessStructReader header where unused

13 months ago[libc++][spaceship][NFC] P1614R2: Status page - mark header synopsis sections as...
Hristo Hristov [Tue, 13 Jun 2023 05:05:11 +0000 (08:05 +0300)]
[libc++][spaceship][NFC] P1614R2: Status page - mark header synopsis sections as "Complete"

Header synopsis sections of P1614R2 are usually implemented by other items. For completeness, let's mark some of them as "Complete".

Reviewed By: #libc, Mordante

Differential Revision: https://reviews.llvm.org/D152775

13 months ago[mlir] Fix warnings in release builds
Kazu Hirata [Wed, 14 Jun 2023 21:22:17 +0000 (14:22 -0700)]
[mlir] Fix warnings in release builds

This patch fixes:

  mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp:846:16:
  error: unused variable 'lvlTp' [-Werror,-Wunused-variable]

  mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp:1059:13:
  error: unused variable '[t, l]' [-Werror,-Wunused-variable]
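
The commit message does not say how the warnings were fixed. As a hedged illustration of this class of problem: a variable that is only read inside `assert` becomes unused when `NDEBUG` is defined, and one standard C++17 remedy is `[[maybe_unused]]`. The function below is hypothetical and unrelated to LoopEmitter.cpp.

```cpp
#include <cassert>
#include <utility>

// Hypothetical example: `t` and `l` are only read inside assert(), so a
// release build (NDEBUG) would otherwise flag the binding as unused.
int checkedNextLevel(std::pair<int, int> tensorLevel, int lvl) {
  [[maybe_unused]] auto [t, l] = tensorLevel;
  assert(t >= 0 && l == lvl);
  return lvl + 1;
}
```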

13 months ago[clang-tidy][NFC] Update ReleaseNotes to mention some performance changes in checks
Piotr Zegar [Wed, 14 Jun 2023 21:15:36 +0000 (21:15 +0000)]
[clang-tidy][NFC] Update ReleaseNotes to mention some performance changes in checks

Included a note in the release documentation about the improved
performance of certain checks, allowing users who had previously
disabled them due to slowness to reconsider their decision.

13 months ago[SampleFDO] Remove 'using namespace' (NFC)
Muhammad Asif Manzoor [Wed, 14 Jun 2023 19:24:23 +0000 (15:24 -0400)]
[SampleFDO] Remove 'using namespace' (NFC)

Remove 'using namespace' statement from header file to avoid propagating it to
other locations unnecessarily and avoid potential name collisions.
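
A small sketch of the pattern, using a hypothetical namespace rather than the real SampleFDO headers: the header-style declaration spells out the qualified name instead of relying on a `using namespace` directive that every includer would inherit.

```cpp
#include <string>

namespace demo_sampleprof { // hypothetical namespace for illustration
struct FunctionSamples {
  std::string name;
  unsigned total = 0;
};
} // namespace demo_sampleprof

// Header-style code: no `using namespace demo_sampleprof;` at file scope.
// The type is fully qualified, so nothing leaks into files that include this.
inline unsigned totalSamples(const demo_sampleprof::FunctionSamples &fs) {
  return fs.total;
}
```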

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D152727

13 months ago[NFC] Autogenerate various Thumb2 tests.
Amaury Séchet [Wed, 14 Jun 2023 21:17:40 +0000 (21:17 +0000)]
[NFC] Autogenerate various Thumb2 tests.

13 months agoRevert "[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms"
Jonas Devlieghere [Wed, 14 Jun 2023 21:16:16 +0000 (14:16 -0700)]
Revert "[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms"

This reverts commit e0d57295bf6a3c04f2901d9c70f529d570f48b65 because the
accel-tables-apple.ll test is failing on a few buildbots.

13 months ago[mlir][ArmSME] Dialect and Intrinsic Op Definition
Frank (Fang) Gao [Wed, 14 Jun 2023 21:03:36 +0000 (17:03 -0400)]
[mlir][ArmSME] Dialect and Intrinsic Op Definition

This patch creates the ArmSME dialect, and provides the intrinsic op
definition necessary for lowering to LLVM IR.

This will cover most instructions interacting with the ZA tile register,
not covering SME2 instructions.

Source: https://developer.arm.com/documentation/ddi0616/latest

Reviewed By: awarzynski, c-rhodes

Differential Revision: https://reviews.llvm.org/D152878

13 months agoRevert "[InstSimplify] Fold all global variables with initializers"
Alan Zhao [Wed, 14 Jun 2023 21:10:31 +0000 (14:10 -0700)]
Revert "[InstSimplify] Fold all global variables with initializers"

This reverts commit 17b7df3daee85c1a4d1d955e558d42b34ce17549.

Reason: causes chrome builds to crash: https://crbug.com/1454861

13 months agoClear non-addressable bits from pc/fp/sp in unwinds
Jason Molenda [Wed, 14 Jun 2023 20:43:53 +0000 (13:43 -0700)]
Clear non-addressable bits from pc/fp/sp in unwinds

Some Darwin corefiles can have the pc/fp/sp/lr in the
live register context signed with pointer authentication;
this patch changes RegisterContextUnwind to strip those
bits off of those values as we try to walk the stack.
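
Stripping the signature bits amounts to masking off everything above the addressable range. The sketch below is a simplification under stated assumptions: `stripNonAddressableBits` is a hypothetical name, the number of addressable bits is passed in by the caller, and addresses in the upper half of the address space are not handled.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low `addressableBits` bits of a
// pc/fp/sp/lr value, clearing any pointer-authentication bits above them.
constexpr uint64_t stripNonAddressableBits(uint64_t value,
                                           unsigned addressableBits) {
  if (addressableBits >= 64)
    return value; // nothing to strip; also avoids an out-of-range shift
  uint64_t mask = (uint64_t{1} << addressableBits) - 1;
  return value & mask;
}
```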

Differential Revision: https://reviews.llvm.org/D152861
rdar://109185291

13 months agoAdd Fix*Address methods to Process, call into ABI
Jason Molenda [Wed, 14 Jun 2023 20:40:54 +0000 (13:40 -0700)]
Add Fix*Address methods to Process, call into ABI

We need to clear non-addressable bits from addresses across
the lldb sources.  Currently these need to use an ABI method
to clear those bits from addresses, which you do by taking a
Process, getting the current ABI, then calling the method.

Simplify this by providing methods in Process which call into
the ABI methods themselves.
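
The delegation can be sketched as below. `DemoABI` and `DemoProcess` are hypothetical stand-ins for the lldb classes, and falling back to the unmodified address when no ABI is available is an assumption of this sketch.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-in for an ABI plugin that knows the address mask.
struct DemoABI {
  uint64_t mask;
  uint64_t FixCodeAddress(uint64_t pc) const { return pc & mask; }
};

// Hypothetical stand-in for Process: callers no longer fetch the ABI.
class DemoProcess {
public:
  explicit DemoProcess(std::shared_ptr<DemoABI> abi) : abi_(std::move(abi)) {}

  // Forward to the ABI if there is one; otherwise return the value as-is.
  uint64_t FixCodeAddress(uint64_t pc) const {
    return abi_ ? abi_->FixCodeAddress(pc) : pc;
  }

private:
  std::shared_ptr<DemoABI> abi_;
};
```

Call sites then shrink from "get the process, get its ABI, check it, call the method" to a single call on the process.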

Differential Revision: https://reviews.llvm.org/D152863

13 months ago[Clang][MS] Remove assertion on BaseOffset can't be smaller than Size.
Zequan Wu [Thu, 8 Jun 2023 21:38:57 +0000 (17:38 -0400)]
[Clang][MS] Remove assertion on BaseOffset can't be smaller than Size.

This assertion triggered when two base classes share the same offset,
with the first base empty and the second base non-empty.
Remove it for correctness.

I can't add a test case for this because -foverride-record-layout doesn't read
base class info at all. I can add that support later for testing if needed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D152472

13 months agoRevert "[LLD] Allow usage of LLD as a library"
Leonard Chan [Wed, 14 Jun 2023 20:36:27 +0000 (20:36 +0000)]
Revert "[LLD] Allow usage of LLD as a library"

This reverts commit 2700da5fe28d8b17c66e5c960d2188276a6ced39.

Reverting since this causes some test failures on our builders: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8778372807208184913/overview

13 months agoRevert "[HIP] Allow std::malloc in device function"
Yaxun (Sam) Liu [Wed, 14 Jun 2023 20:03:29 +0000 (16:03 -0400)]
Revert "[HIP] Allow std::malloc in device function"

This reverts commit f5033c37025db46df95a7859d7189d09b5e3433e.

Revert this patch since it causes regressions for Tensile. A
reduced test case is:

#include <cstdlib>
#include <memory>

int main()
{
    std::shared_ptr<float> a;
    a = std::shared_ptr<float>(
        (float*)std::malloc(sizeof(float) * 100),
        std::free
    );
    return 0;
}

Will fix the issue then re-commit.

Fixes: SWDEV-405317

13 months ago[mlir][sparse] fix crashes when the tensor that defines the loop bound can not be...
Peiming Liu [Wed, 14 Jun 2023 01:29:20 +0000 (01:29 +0000)]
[mlir][sparse] fix crashes when the tensor that defines the loop bound can not be found

Reviewed By: aartbik, K-Wu

Differential Revision: https://reviews.llvm.org/D152877

13 months ago[asan] Fix shadow load alignment for unaligned 128-bit load/store
Fangrui Song [Wed, 14 Jun 2023 20:16:49 +0000 (13:16 -0700)]
[asan] Fix shadow load alignment for unaligned 128-bit load/store

When a 128-bit load/store is aligned by 8, we incorrectly emit `load i16, ptr ..., align 2`
while the shadow memory address may not be aligned by 2.

This manifests as possibly-misaligned shadow memory load with `-mstrict-align`,
e.g. `clang --target=aarch64-linux -O2 -mstrict-align -fsanitize=address`
```
__attribute__((noinline)) void foo(unsigned long *ptr) {
  ptr[0] = 3;
  ptr[1] = 3;
}
// ldrh    w8, [x9, x8]  // the shadow memory load may not be aligned by 2
```

Infer the shadow memory alignment from the load/store alignment to set the
correct alignment. The generated code now uses two ldrb and one orr.
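
A rough sketch of the inference, assuming the usual ASan mapping of 8 application bytes to 1 shadow byte (a scale of 3). `shadowAlignment` is a hypothetical helper, not the function in the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: 8 application bytes map to 1 shadow byte.
constexpr unsigned kShadowScale = 3;

// Hypothetical helper: the alignment that can be claimed for the shadow
// access is the access alignment scaled down, but never less than 1.
constexpr uint64_t shadowAlignment(uint64_t accessAlign) {
  return std::max<uint64_t>(accessAlign >> kShadowScale, 1);
}
```

Under this assumption, a 16-byte store aligned by 8 touches two shadow bytes that are only known to be 1-byte aligned, which is consistent with the two byte loads mentioned above.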

Fix https://github.com/llvm/llvm-project/issues/63258

Differential Revision: https://reviews.llvm.org/D152663

13 months ago[LoopIdiom] Preserve alias information for memset_pattern
William S. Moses [Wed, 14 Jun 2023 16:23:44 +0000 (12:23 -0400)]
[LoopIdiom] Preserve alias information for memset_pattern

TBAA/NoAlias/AliasScope and other information is currently preserved
when upgrading to a memcpy/memset. However, this is missing when upgrading to
the macOS memset_pattern function. This patch adds the same alias information
preservation to memset_pattern.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D152934

13 months ago[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms
Jonas Devlieghere [Wed, 14 Jun 2023 20:00:38 +0000 (13:00 -0700)]
[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms

On Apple platforms, we generate .apple_names, .apple_types,
.apple_namespaces and .apple_objc Apple accelerator tables for DWARF 4
and earlier. For DWARF 5 we should generate .debug_names, but instead we
get no accelerator tables at all.

In the backend we are correctly determining that we should be emitting
.debug_names instead of .apple_names. However, when we get to the point
of emitting the section, if the CU debug name table kind is not
"default", the accelerator table emission is skipped.

This patch sets the DebugNameTableKind to Apple in the frontend when
targeting an Apple platform. That way we know that the CU was compiled with
the intent of emitting accelerator tables. For DWARF 4 and earlier, that
means Apple accelerator tables. For DWARF 5 and later, that means
.debug_names.

Differential revision: https://reviews.llvm.org/D118754

13 months ago[mlir][sparse] unifying enterLoopOverTensorAtLvl and enterCoIterationOverTensorsAtLvls
Peiming Liu [Tue, 6 Jun 2023 22:51:32 +0000 (22:51 +0000)]
[mlir][sparse] unifying enterLoopOverTensorAtLvl and enterCoIterationOverTensorsAtLvls

The tensor levels are now explicitly categorized into different `LoopCondKind`s to instruct LoopEmitter to generate different code for different kinds of conditions (e.g., `SparseCond`, `SparseSliceCond`, `SparseAffineIdxCond`, etc.).

The process of generating a while loop is now broken down into three steps, each dispatched to a different LoopCondKind handler:
1. Generate LoopCondition (e.g., `pos <= posHi` for `SparseCond`, `slice.isNonEmpty` for `SparseAffineIdxCond`)
2. Generate LoopBody (e.g., compute the coordinates)
3. Generate ExtraChecks (e.g., `if (onSlice(crd))` for `SparseSliceCond`)
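
The three steps can be sketched as a dispatch on the condition kind. This is a simplified, hypothetical model: the enum lists only the three kinds named above, the steps are recorded as strings instead of emitted IR, and the loop condition used for `SparseSliceCond` is an assumption.

```cpp
#include <string>
#include <vector>

// Simplified, hypothetical subset of the kinds named in the commit message.
enum class LoopCondKind { SparseCond, SparseSliceCond, SparseAffineIdxCond };

// Records which pieces would be generated for a while loop of a given kind.
std::vector<std::string> planWhileLoop(LoopCondKind kind) {
  std::vector<std::string> steps;
  // Step 1: loop condition.
  if (kind == LoopCondKind::SparseAffineIdxCond)
    steps.push_back("cond: slice.isNonEmpty");
  else
    steps.push_back("cond: pos <= posHi"); // assumed for SparseSliceCond too
  // Step 2: loop body.
  steps.push_back("body: compute coordinates");
  // Step 3: extra checks, only needed for slice conditions in this sketch.
  if (kind == LoopCondKind::SparseSliceCond)
    steps.push_back("check: if (onSlice(crd))");
  return steps;
}
```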

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D152464

13 months ago[RISCV] Remove dead code from doPeepholeMaskedRVV [nfc]
Philip Reames [Wed, 14 Jun 2023 19:57:49 +0000 (12:57 -0700)]
[RISCV] Remove dead code from doPeepholeMaskedRVV [nfc]

This is after lowering of undef to IMPLICIT_DEF, so the condition is always false.  Rather than fixing the intent (which was to match implicit_def per the comment), just delete it.  We're in the process of migrating away from the TA pseudos, so using _TA more often is fine.