platform/upstream/llvm.git
2 years ago[SelectionDAG] Don't apply MinRCSize constraint in InstrEmitter::AddRegisterOperand...
Craig Topper [Thu, 16 Jun 2022 21:45:44 +0000 (14:45 -0700)]
[SelectionDAG] Don't apply MinRCSize constraint in InstrEmitter::AddRegisterOperand for IMPLICIT_DEF sources.

MinRCSize is 4 and prevents constrainRegClass from changing the
register class if the new class has size less than 4.

IMPLICIT_DEF gets a unique vreg for each use and will be removed
by the ProcessImplicitDef pass before register allocation. I don't
think there is any reason to prevent constraining the virtual register
to whatever register class the use needs.

The attached test case was previously creating a copy of IMPLICIT_DEF
because vrm8nov0 has 3 registers in it.
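
The rule described in this message can be sketched roughly as follows. This is a simplified model and not the InstrEmitter code: `canConstrainTo` and its parameters are hypothetical names, and a register class is reduced to just the number of registers it contains.

```cpp
#include <cassert>

// Minimum class size normally required before a vreg may be constrained.
const unsigned MinRCSize = 4;

// Hypothetical helper: returns true if a vreg may be constrained to a
// register class holding NewClassNumRegs registers. A vreg defined by
// IMPLICIT_DEF skips the size limit entirely.
bool canConstrainTo(unsigned NewClassNumRegs, bool SourceIsImplicitDef) {
  unsigned MinNumRegs = SourceIsImplicitDef ? 0 : MinRCSize;
  return NewClassNumRegs >= MinNumRegs;
}
```

Under this model a 3-register class such as vrm8nov0 is rejected for an ordinary source but accepted for an IMPLICIT_DEF source, so no copy is needed.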

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D128005

2 years agoAdd DWARF string debug to clang release notes.
Mitch Phillips [Thu, 16 Jun 2022 21:51:05 +0000 (14:51 -0700)]
Add DWARF string debug to clang release notes.

D12353 added inline strings to the DWARF info produced by clang. This
turns out to break some debugging software that assumes that a
DW_TAG_variable *must* come with a DW_AT_name. Add a release note to
broadcast this change.

Reviewed By: paulkirth

Differential Revision: https://reviews.llvm.org/D126224

2 years ago[mlir][sparse] fix asan issue
Aart Bik [Thu, 16 Jun 2022 20:25:23 +0000 (13:25 -0700)]
[mlir][sparse] fix asan issue

The LinalgElementwiseOpFusion pass has become smarter, and converts
the simple conversion linalg operation into a sparse dialect convert
operation. However, since our current bufferization does not take the
new semantics into consideration, we leak memory of the allocation.
For now, this has been fixed by making the operation less trivial.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128002

2 years agoMake setSanitizerMetadata byval.
Mitch Phillips [Thu, 16 Jun 2022 21:27:38 +0000 (14:27 -0700)]
Make setSanitizerMetadata byval.

This fixes a UaF bug in llvm::GlobalObject::copyAttributesFrom, where a
sanitizer metadata object is captured by reference, and passed by
reference to llvm::GlobalValue::setSanitizerMetadata. The reference
comes from the same map that the new value is going to be inserted to,
and the map insertion triggers iterator invalidation - leading to a
use-after-free on the dangling reference.

This patch fixes that bug by making setSanitizerMetadata's argument
byval. This should also systematically prevent the problem from
happening in the future, as it's a very easy pattern to fall into. This
shouldn't cause any performance problem, since the SanitizerMetadata
struct is a bitfield POD.
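
A minimal sketch of the by-value pattern, assuming a container whose insertions may relocate its storage. `Meta`, `MetaMap` and `copyMeta` are hypothetical stand-ins and not the LLVM types; the point is only that the setter copies its argument before the container is modified.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a small bitfield POD.
struct Meta {
  unsigned NoAddress : 1;
  unsigned IsDynInit : 1;
};

// Hypothetical flat map whose insertions may relocate its storage, so
// pointers and references to stored values can dangle after set().
class MetaMap {
  std::vector<std::pair<int, Meta>> Entries;

public:
  const Meta *find(int Key) const {
    for (const auto &E : Entries)
      if (E.first == Key)
        return &E.second;
    return nullptr;
  }

  // Taking M by value copies it before any storage is touched, so the
  // caller may pass a value that currently lives inside this same map.
  void set(int Key, Meta M) {
    for (auto &E : Entries)
      if (E.first == Key) {
        E.second = M;
        return;
      }
    Entries.push_back(std::make_pair(Key, M));
  }
};

// Copies the metadata stored under From onto To, if From has any.
void copyMeta(MetaMap &Map, int From, int To) {
  if (const Meta *Src = Map.find(From))
    Map.set(To, *Src);
}
```

Because the struct is only a couple of bits wide, the extra copy is essentially free.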

2 years ago[libc] Add a status page for math functions.
Tue Ly [Wed, 15 Jun 2022 23:57:46 +0000 (19:57 -0400)]
[libc] Add a status page for math functions.

Add a status page for math functions.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D127920

2 years ago[RISCV] Start merging demanded reasoning - starting with load/stores [nfc]
Philip Reames [Thu, 16 Jun 2022 21:34:53 +0000 (14:34 -0700)]
[RISCV] Start merging demanded reasoning - starting with load/stores [nfc]

This change merges the logic for reasoning about demanded portions of the VTYPE register between the main dataflow algorithm and the backwards mutation post pass. In the process, we get to delete a bunch of now redundant code.

This should be entirely NFC. I included a slight hack (see TODO) to avoid changing behavior in the post pass while being able to use the generalized logic in the prepass. I will fix the TODO in a separate change once this lands.

Differential Revision: https://reviews.llvm.org/D127983

2 years ago[RISCV] Add cost model for scalable scatter and gather
Philip Reames [Thu, 16 Jun 2022 21:10:21 +0000 (14:10 -0700)]
[RISCV] Add cost model for scalable scatter and gather

The costing we use for fixed length vector gather and scatter is to simply count up the memory ops, and multiply by a fixed memory op cost. For scalable vectors, we don't actually know how many lanes are active. Instead, we end up having to make a worst-case assumption about how many lanes could be active. In the generic +V case, this results in very high costs, but we can do better when we know an upper bound on the VLEN.

There are some obvious ways to improve this - e.g. using information about VL and mask bits from the instruction to reduce the upper bound - but this seems like a reasonable starting point.

The resulting costs do bias us pretty strongly away from generating scatter/gather for generic +V.  Without this, we'd be returning an invalid cost and thus definitely not vectorizing, so no major change in practical behavior expected.
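
As a rough illustration of the worst-case costing described above (not the code from the patch): the function name and parameters are hypothetical, and the sketch assumes that one unit of vscale corresponds to 64 bits of VLEN.

```cpp
#include <cassert>

// Assumption: one vscale unit corresponds to 64 bits of VLEN.
const unsigned BitsPerBlock = 64;

// MinElts is the N in <vscale x N x ty>; MaxVLenBits is the known upper
// bound on VLEN; MemOpCost is the fixed cost of one memory operation.
unsigned worstCaseGatherScatterCost(unsigned MinElts, unsigned MaxVLenBits,
                                    unsigned MemOpCost) {
  unsigned MaxVScale = MaxVLenBits / BitsPerBlock;
  unsigned MaxLanes = MinElts * MaxVScale; // every lane assumed active
  return MaxLanes * MemOpCost;
}
```

With a tight bound such as VLEN <= 128, a `<vscale x 2 x ty>` access costs 4 memory ops; with a very loose bound the same access becomes far more expensive, which is the bias against generic +V mentioned above.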

Differential Revision: https://reviews.llvm.org/D127541

2 years ago[mlir][complex] Add Python bindings for complex ops.
bixia1 [Wed, 15 Jun 2022 23:00:51 +0000 (16:00 -0700)]
[mlir][complex] Add Python bindings for complex ops.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127916

2 years agoRevert "[TableGen][DirectX] generate DXIL operation table with TableGen."
Mitch Phillips [Thu, 16 Jun 2022 21:07:54 +0000 (14:07 -0700)]
Revert "[TableGen][DirectX] generate DXIL operation table with TableGen."

This reverts commit 46fcdf23640ebb76271f91720583b0df6bed4481.

Reason: Broke the buildbots:
https://lab.llvm.org/buildbot/#/builders/77/builds/18671

2 years agoMove debug-only code inside LLVM_DEBUG to prevent unused variable warnings.
Sterling Augustine [Thu, 16 Jun 2022 21:00:44 +0000 (14:00 -0700)]
Move debug-only code inside LLVM_DEBUG to prevent unused variable warnings.

2 years agoReland "[ASan] Use debuginfo for symbolization."
Mitch Phillips [Thu, 16 Jun 2022 17:23:26 +0000 (10:23 -0700)]
Reland "[ASan] Use debuginfo for symbolization."

This reverts commit 99796d06dbe11c8f81376ad1d42e7f17d2eff6ae.

Hint: Looking here because your manual invocation of something in
'check-asan' broke? You need a new symbolizer (after D123538).

An upcoming patch will remove the internal metadata for global
variables. With D123534 and D123538, clang now emits DWARF debug info
for constant strings (the only global variable type it was missing), and
llvm-symbolizer is now able to symbolize all global variable addresses
(where previously it wouldn't give you the file:line information).

Move ASan's runtime over from the internal metadata to DWARF.

Differential Revision: https://reviews.llvm.org/D127552

2 years ago[TableGen][DirectX] generate DXIL operation table with TableGen.
Xiang Li [Thu, 16 Jun 2022 14:56:26 +0000 (07:56 -0700)]
[TableGen][DirectX] generate DXIL operation table with TableGen.

Add more features to the TableGen backend gen-dxil-operation.

It will generate getOpCodeProperty, getOpCodeClassName and getOpCodeName when building the DirectX target.
Each of these functions has a table which is generated based on DXIL operations.

These generated functions will replace the manually written functions which are used to query DXIL operation information.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D125520

2 years ago[MergeFunctions] Preserve symbols used llvm.used/llvm.compiler.used
Amanieu d'Antras [Thu, 16 Jun 2022 20:31:33 +0000 (21:31 +0100)]
[MergeFunctions] Preserve symbols used llvm.used/llvm.compiler.used

llvm.used and llvm.compiler.used are often used with inline assembly
that refers to a specific symbol so that the symbol is kept through to
the linker even though there are no references to it from LLVM IR.

This fixes the MergeFunctions pass to preserve references to these
symbols in llvm.used/llvm.compiler.used so they are not deleted from the
IR. This doesn't prevent these functions from being merged, but
guarantees that an alias or thunk with the expected symbol name is kept
in the IR.

Differential Revision: https://reviews.llvm.org/D127751

2 years ago[gn build] Port 6ff49af33d09
LLVM GN Syncbot [Thu, 16 Jun 2022 20:34:45 +0000 (20:34 +0000)]
[gn build] Port 6ff49af33d09

2 years ago[lldb] Introduce the concept of a log handler (NFC)
Jonas Devlieghere [Thu, 16 Jun 2022 00:32:14 +0000 (17:32 -0700)]
[lldb] Introduce the concept of a log handler (NFC)

This patch introduces the concept of a log handler. Log handlers allow
customizing the way log output is emitted. The StreamCallback class
tried to do something conceptually similar. The benefit of the log
handler interface is that you don't need to conform to llvm's
raw_ostream interface.

Differential revision: https://reviews.llvm.org/D127922

2 years ago[Delinearization] Refactoring of fixed-size array delinearization
Congzhe Cao [Thu, 16 Jun 2022 20:03:30 +0000 (16:03 -0400)]
[Delinearization] Refactoring of fixed-size array delinearization

This is a follow-up patch to D122857 where we added delinearization of
fixed-size arrays to loop cache analysis, which resulted in some duplicate
code, i.e., "tryDelinearizeFixedSize()", in LoopCacheCost.cpp and
DependenceAnalysis.cpp. Refactoring is done in this patch.

This patch refactors out the main logic of "tryDelinearizeFixedSize()" as
"tryDelinearizeFixedSizeImpl()" and moves it to Delinearization.cpp, such that
clients can reuse "llvm::tryDelinearizeFixedSizeImpl()" wherever they would
like to delinearize fixed-size arrays. Currently it has two users, i.e.,
DependenceAnalysis.cpp and LoopCacheCost.cpp.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124745

2 years ago[TargetTransformInfo] Added an opt/llc option for cache line size
Congzhe Cao [Thu, 16 Jun 2022 19:50:07 +0000 (15:50 -0400)]
[TargetTransformInfo] Added an opt/llc option for cache line size

In some passes we need a valid cache line size to do analysis or
transformation, e.g., loop cache analysis and loop data prefetch. However,
for some backend targets, `TTIImpl->getCacheLineSize()` is not implemented
and hence `TTI.getCacheLineSize()` would just return 0, which eventually might
produce invalid results.

In this patch we add a user-specified opt/llc option for cache line size.
If the option is specified by users we use the value supplied, otherwise we
fall back to the default value obtained from `TTIImpl->getCacheLineSize()`.
The powerpc target already has such an option; this patch generalizes
this option to TargetTransformInfo.cpp.
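
The fallback described above can be sketched as follows. `UIntOption` is a hypothetical model of a command-line option (a value plus a record of whether the user passed it) and is not the LLVM option class.

```cpp
#include <cassert>

// Hypothetical model of a command-line option: a value plus a flag that
// records whether the user actually specified it.
struct UIntOption {
  unsigned Value;
  bool Specified;
};

// Prefer the user's value; otherwise fall back to what the target reports,
// which may be 0 when the target does not implement the hook.
unsigned effectiveCacheLineSize(const UIntOption &Opt, unsigned TargetValue) {
  return Opt.Specified ? Opt.Value : TargetValue;
}
```

Checking whether the option was specified, rather than testing the value against 0, keeps an explicit user choice distinct from "not set".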

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D127342

2 years ago[libc++] Remove now-unused experimental/filesystem config file
Louis Dionne [Thu, 16 Jun 2022 19:34:08 +0000 (15:34 -0400)]
[libc++] Remove now-unused experimental/filesystem config file

2 years agoFix StopInfoBreakpoint::ShouldNotify when a callback deletes the site we hit.
Jim Ingham [Thu, 16 Jun 2022 18:46:25 +0000 (11:46 -0700)]
Fix StopInfoBreakpoint::ShouldNotify when a callback deletes the site we hit.

When we hit a breakpoint site all of whose owners are internal, we don't
broadcast that event to the public event queue.  However, we were checking
whether that was true in the ShouldNotify method, which gets run after the
breakpoint callbacks get run.  If the breakpoint callback deletes the site
we just hit, we no longer have the information to make that determination.

This patch just gathers the "was all internal" fact when the StopInfoBreakpoint
gets made, which happens before anyone has a chance to delete the site, and then
uses that cached value.
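
A minimal sketch of the caching idea, using hypothetical `Site` and `StopInfo` types rather than the LLDB classes: the "all owners are internal" fact is computed once in the constructor, so it remains available after the site is gone.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical breakpoint site: one flag per owner, true if internal.
struct Site {
  std::vector<bool> OwnerIsInternal;
};

class StopInfo {
  bool WasAllInternal; // captured once, at construction

public:
  explicit StopInfo(const Site &S) : WasAllInternal(true) {
    for (bool Internal : S.OwnerIsInternal)
      if (!Internal)
        WasAllInternal = false;
  }

  // Safe to call even after a callback has deleted the site.
  bool shouldNotify() const { return !WasAllInternal; }
};
```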

This bug was causing a couple of tests (including TestStopAtEntry.py) to fail
when using the new macOS Ventura dyld support.

Differential Revision: https://reviews.llvm.org/D127997

2 years agoReland "[PS4/PS5][profiling] Go back to the old way of doing a runtime hook"
Paul Robinson [Thu, 16 Jun 2022 18:41:51 +0000 (11:41 -0700)]
Reland "[PS4/PS5][profiling] Go back to the old way of doing a runtime hook"

Profiling stopped working for us after D98061, which was largely a
Fuchsia-specific patch but in one place used `isOSBinFormatELF` to
make a decision.  I'm adding a PS4/PS5 exception to that, so we can
get profiling to work again.

Differential Revision: https://reviews.llvm.org/D127506

2 years agoFix TraceGDBRemotePacketsTest
Walter Erquinigo [Thu, 16 Jun 2022 18:52:47 +0000 (11:52 -0700)]
Fix TraceGDBRemotePacketsTest

This test broke, but the fix is simple.

2 years ago[BOLT][NFCI] Refactor interface for adding basic blocks
Maksim Panchenko [Thu, 16 Jun 2022 01:49:39 +0000 (18:49 -0700)]
[BOLT][NFCI] Refactor interface for adding basic blocks

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127935

2 years ago[AMDGPU] gfx11 new dot instruction codegen support
Joe Nash [Wed, 15 Jun 2022 18:03:51 +0000 (14:03 -0400)]
[AMDGPU] gfx11 new dot instruction codegen support

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D127904

2 years ago[AMDGPU] Add GFX11 codegen for llvm.amdgcn.mov.dpp8
Jay Foad [Thu, 16 Jun 2022 16:10:11 +0000 (17:10 +0100)]
[AMDGPU] Add GFX11 codegen for llvm.amdgcn.mov.dpp8

Differential Revision: https://reviews.llvm.org/D127980

2 years ago[trace][intelpt] Support system-wide tracing [22] - Some final touches
Walter Erquinigo [Wed, 15 Jun 2022 17:35:00 +0000 (10:35 -0700)]
[trace][intelpt] Support system-wide tracing [22] - Some final touches

Having a member variable TraceIntelPT * makes it look as if it was
optional. I'm using instead a weak_ptr to indicate that it's not
optional and the object is under the ownership of TraceIntelPT.

Besides that, I've simplified the Perf aux and data buffers copying by
using vector.insert.

I'm also renaming Lookup2 to Lookup. The 2 in the name is confusing.

Differential Revision: https://reviews.llvm.org/D127881

2 years ago[trace][intelpt] Support system-wide tracing [21] - Support long numbers in JSON
Walter Erquinigo [Wed, 15 Jun 2022 02:05:30 +0000 (19:05 -0700)]
[trace][intelpt] Support system-wide tracing [21] - Support long numbers in JSON

llvm's JSON parser supports 64 bit integers, but other tools like the
ones written in JS don't support numbers that big, so we need to
represent these possibly big numbers as a string. This diff uses that to
represent addresses and tsc zero. The former is printed in hex form and
the latter in decimal string form. The schema was updated to mention
that.
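
A small sketch of the string encoding, assuming a `0x` prefix for the hex form. The function names are hypothetical and this is not the LLDB serialization code.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Addresses are written as hex strings (the "0x" prefix is an assumption).
std::string addressToJSONString(std::uint64_t Address) {
  char Buffer[19]; // "0x" + 16 hex digits + terminating NUL
  std::snprintf(Buffer, sizeof(Buffer), "0x%" PRIx64, Address);
  return std::string(Buffer);
}

// TSC values are written as decimal strings.
std::string tscToJSONString(std::uint64_t Tsc) { return std::to_string(Tsc); }

// Base 0 lets std::stoull accept both the "0x..." and the decimal form.
std::uint64_t parseJSONNumberString(const std::string &Text) {
  return std::stoull(Text, nullptr, 0);
}
```

A value such as 9007199254740993 (2^53 + 1) cannot be represented exactly as a JavaScript number, but it round-trips unchanged through the string form.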

Besides that, I fixed some remaining issues and now all tests pass. Before, I wasn't running all tests because for some reason my computer reverted perf_paranoid to 1.

Differential Revision: https://reviews.llvm.org/D127819

2 years ago[trace][intelpt] Support system-wide tracing [20] - Rename some fields in the schema
Walter Erquinigo [Tue, 14 Jun 2022 22:53:59 +0000 (15:53 -0700)]
[trace][intelpt] Support system-wide tracing [20] - Rename some fields in the schema

As discussed offline with @jj10305, we are updating some naming used throughout the code, especially in the JSON schema:

- traceBuffer -> iptTrace
- core -> cpu

Differential Revision: https://reviews.llvm.org/D127817

2 years ago[trace][intelpt] Support system-wide tracing [19] - Some other minor improvements
Walter Erquinigo [Tue, 14 Jun 2022 21:36:26 +0000 (14:36 -0700)]
[trace][intelpt] Support system-wide tracing [19] - Some other minor improvements

This addresses the issues in diffs [13], [14] and [16]

- Add better documentation
- Fix some castings by making them safer
- Simplify CorrelateContextSwitchesAndIntelPtTraces
- Rename some functions

Differential Revision: https://reviews.llvm.org/D127804

2 years ago[trace][intelpt] Support system-wide tracing [18] - some more improvements
Walter Erquinigo [Mon, 13 Jun 2022 06:36:52 +0000 (23:36 -0700)]
[trace][intelpt] Support system-wide tracing [18] - some more improvements

This applies the changes requested for diff 12.

- use DenseMap<ConstString, _> instead of std::unordered_map<ConstString, _>, which is more idiomatic and possibly more performant.
- deduplicate some code in Trace.cpp by using helper functions for fetching in maps
- stop using size and offset when fetching binary data, because we in fact read the entire buffers all the time. If we ever need streaming, we can implement it then. Now, the size is used only to check that we are getting the correct amount of data. This is useful because in some cases determining the size doesn't involve fetching the actual data.
- added back the x86_64 macro to the perf tests
- added more documentation
- simplified some file handling
- fixed some comments

Differential Revision: https://reviews.llvm.org/D127752

2 years agoRevert "[PS4/PS5][profiling] Go back to the old way of doing a runtime hook"
Paul Robinson [Thu, 16 Jun 2022 18:40:33 +0000 (11:40 -0700)]
Revert "[PS4/PS5][profiling] Go back to the old way of doing a runtime hook"

This reverts commit 39fb84343ec5cf9081e236745490c65eb8a9fc31.

Pushed without verifying the test still works.

2 years ago[flang][runtime] Make ASSOCIATED() conform with standard
Peter Klausler [Mon, 13 Jun 2022 15:56:09 +0000 (08:56 -0700)]
[flang][runtime] Make ASSOCIATED() conform with standard

ASSOCIATED() must be false for zero-sized arrays, and the
strides on a dimension are irrelevant when the extent is unitary.
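
The two rules above can be illustrated with a simplified comparison of array descriptors. `Dim`, `Descriptor` and `isAssociatedWith` are hypothetical and far simpler than the flang runtime's descriptors; this is a sketch of the rules, not the runtime implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified array descriptor.
struct Dim {
  long Extent;
  long Stride;
};
struct Descriptor {
  const void *Base;
  std::vector<Dim> Dims;
};

bool isAssociatedWith(const Descriptor &Pointer, const Descriptor &Target) {
  if (!Pointer.Base || Pointer.Base != Target.Base ||
      Pointer.Dims.size() != Target.Dims.size())
    return false;
  for (std::size_t I = 0; I < Pointer.Dims.size(); ++I) {
    const Dim &P = Pointer.Dims[I];
    const Dim &T = Target.Dims[I];
    if (P.Extent != T.Extent || P.Extent == 0)
      return false; // shape mismatch, or a zero-sized array
    if (P.Extent != 1 && P.Stride != T.Stride)
      return false; // strides matter only with more than one element
  }
  return true;
}
```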

Differential Revision: https://reviews.llvm.org/D127793

2 years agoReland "[NFC] Precommited tests from D73000"
Dávid Bolvanský [Thu, 16 Jun 2022 18:38:14 +0000 (20:38 +0200)]
Reland "[NFC] Precommited tests from D73000"

2 years agoRevert "[NFC] Precommited tests from D73000"
Dávid Bolvanský [Thu, 16 Jun 2022 18:31:06 +0000 (20:31 +0200)]
Revert "[NFC] Precommited tests from D73000"

This reverts commit 814c9f4e0c4dd72c9df600ec45e40efd72ef55f0.

2 years ago[PS4/PS5][profiling] Go back to the old way of doing a runtime hook
Paul Robinson [Fri, 10 Jun 2022 16:12:49 +0000 (09:12 -0700)]
[PS4/PS5][profiling] Go back to the old way of doing a runtime hook

Profiling stopped working for us after D98061, which was largely a
Fuchsia-specific patch but in one place used `isOSBinFormatELF` to
make a decision.  I'm adding a PS4/PS5 exception to that, so we can
get profiling to work again.

Differential Revision: https://reviews.llvm.org/D127506

2 years ago[PS5] Set address sanitizer shadow offset
Paul Robinson [Thu, 16 Jun 2022 18:26:04 +0000 (11:26 -0700)]
[PS5] Set address sanitizer shadow offset

2 years ago[trace][intelpt] Support system-wide tracing [17] - Some improvements
Walter Erquinigo [Wed, 8 Jun 2022 18:39:31 +0000 (11:39 -0700)]
[trace][intelpt] Support system-wide tracing [17] - Some improvements

This improves several things and addresses comments up to the diff [11] in this stack.

- Simplify many functions so that they no longer receive parameters that they can easily identify themselves.
- Create Storage classes for Trace and TraceIntelPT that can make it easier to reason about what can change with live process refreshes and what cannot.
- Don't cache the perf zero conversion numbers in lldb-server to make sure we get the most up-to-date numbers.
- Move the thread identification from context switches to the bundle parser, to leave TraceIntelPT simpler. This also makes sure that the constructor of TraceIntelPT is invoked when the entire data has been checked to be correct.
- Normalize all bundle paths before the Processes, Threads and Modules are created, so that they can assume that all paths are correct and absolute.
- Fix some issues in the tests. Now they all pass.
- Return the specific instance when constructing PerThread and MultiCore processor tracers.
- Properly implement IntelPTMultiCoreTrace::TraceStart.
- Improve some comments.
- Use the typedef ContextSwitchTrace more often for clarity.
- Move CreateContextSwitchTracePerfEvent to Perf.h as a utility function.
- Better synchronize the state of the context switch and the intel pt perf events.
- Use a boolean instead of an enum for the PerfEvent state.

Differential Revision: https://reviews.llvm.org/D127456

2 years ago[trace][intelpt] Support system-wide tracing [16] - Create threads automatically...
Walter Erquinigo [Fri, 3 Jun 2022 20:23:26 +0000 (13:23 -0700)]
[trace][intelpt] Support system-wide tracing [16] - Create threads automatically from context switch data in the post-mortem case

For some context, the context switch data contains information about which threads were
executed by each traced process, therefore it's not necessary to specify
them in the trace file.

So this diff adds support for that automatic feature. Eventually we
could extend it to live processes as well.

Differential Revision: https://reviews.llvm.org/D127001

2 years ago[trace][intelpt] Support system-wide tracing [15] - Make triple optional
Walter Erquinigo [Fri, 3 Jun 2022 19:01:45 +0000 (12:01 -0700)]
[trace][intelpt] Support system-wide tracing [15] - Make triple optional

The process triple should only be needed when LLDB can't identify the correct
triple on its own. Examples could be universal mach-o binaries. But in any case,
at least for most ELF files, LLDB should be able to do the job without having
the user specify the triple manually.

Differential Revision: https://reviews.llvm.org/D126990

2 years ago[trace][intelpt] Support system-wide tracing [14] - Decode per cpu
Walter Erquinigo [Tue, 24 May 2022 19:16:25 +0000 (12:16 -0700)]
[trace][intelpt] Support system-wide tracing [14] - Decode per cpu

This is the final functional patch to support intel pt decoding per cpu.
It works by doing the following:

- First, all context switches are split by tid and sorted in order. This produces a list of continuous executions per thread per core.
- Then, all intel pt subtraces are split by PSB boundaries and assigned to individual thread continuous executions on the same core by doing simple TSC-based comparisons.
- With this, we have, per thread, a sorted list of continuous executions, each one with a list of intel pt subtraces. Up to this point, this is really fast because no instructions were actually decoded.
- Then, each thread can be decoded by traversing its continuous executions and intel pt subtraces. An advantage of having these continuous executions is that we can identify if a continuous execution doesn't have intel pt data, and thus has a gap in it. We can later do more sophisticated comparisons to identify if within a continuous execution there are gaps.
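
The TSC-based assignment in the steps above can be sketched as follows. The types and function names are hypothetical and much simpler than the actual decoder; each subtrace is represented only by the TSC at its PSB boundary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record: one stretch during which a thread ran on a cpu.
struct ContinuousExecution {
  std::uint64_t Tid;
  std::uint64_t StartTsc;
  std::uint64_t EndTsc;
  std::vector<std::uint64_t> SubtraceTscs; // subtraces assigned to it
};

// Executions all belong to one cpu and are assumed not to overlap.
// Each subtrace lands in the execution whose TSC range contains it.
void assignSubtraces(std::vector<ContinuousExecution> &Executions,
                     const std::vector<std::uint64_t> &SubtraceTscs) {
  for (std::uint64_t Tsc : SubtraceTscs)
    for (ContinuousExecution &E : Executions)
      if (E.StartTsc <= Tsc && Tsc <= E.EndTsc) {
        E.SubtraceTscs.push_back(Tsc);
        break;
      }
}

// An execution that received no subtrace has no intel pt data: a gap.
bool hasGap(const ContinuousExecution &E) { return E.SubtraceTscs.empty(); }
```

No instruction is decoded at this stage; only timestamps are compared, which is why this part of the flow is cheap.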

I'm adding a test as well.

Differential Revision: https://reviews.llvm.org/D126394

2 years ago[trace][intelpt] Support system-wide tracing [13] - Add context switch decoding
Walter Erquinigo [Thu, 19 May 2022 23:39:20 +0000 (16:39 -0700)]
[trace][intelpt] Support system-wide tracing [13] - Add context switch decoding

- Add the logic that parses all cpu context switch traces and produces blocks of continuous executions, which will be later used to assign intel pt subtraces to threads and to identify gaps. This logic can also identify if the context switch trace is malformed.
- The continuous executions blocks are able to indicate when there were some contention issues when producing the context switch trace. See the inline comments for more information.
- Update the 'dump info' command to show information and stats related to the multicore decoding flow, including timing about context switch decoding.
- Add the logic to convert nanoseconds to TSCs.
- Fix a bug when returning the context switches. Now the data returned makes sense and even empty traces can be returned from lldb-server.
- Finish the necessary bits for loading and saving a multi-core trace bundle from disk.
- Change some size_t to uint64_t for compatibility with 32 bit systems.

Tested by saving a trace session of a program that sleeps 100 times; it was able to produce the following 'dump info' text:

```
(lldb) trace load /tmp/trace3/trace.json
(lldb) thread trace dump info
Trace technology: intel-pt

thread #1: tid = 4192415
  Total number of instructions: 1

  Memory usage:
    Total approximate memory usage (excluding raw trace): 2.51 KiB
    Average memory usage per instruction (excluding raw trace): 2573.00 bytes

  Timing for this thread:

  Timing for global tasks:
    Context switch trace decoding: 0.00s

  Events:
    Number of instructions with events: 0
    Number of individual events: 0

  Multi-core decoding:
    Total number of continuous executions found: 2499
    Number of continuous executions for this thread: 102

  Errors:
    Number of TSC decoding errors: 0
```

Differential Revision: https://reviews.llvm.org/D126267

2 years ago[PS5] Emit ud2 for ubsan trap
Paul Robinson [Thu, 16 Jun 2022 18:19:51 +0000 (11:19 -0700)]
[PS5] Emit ud2 for ubsan trap

2 years ago[NFC] Precommited tests from D73000
Dávid Bolvanský [Thu, 16 Jun 2022 18:15:56 +0000 (20:15 +0200)]
[NFC] Precommited tests from D73000

2 years ago[RISCV] Avoid reducing etype just to initialize lane 0 of an undef vector
Philip Reames [Thu, 16 Jun 2022 18:06:18 +0000 (11:06 -0700)]
[RISCV] Avoid reducing etype just to initialize lane 0 of an undef vector

If we're writing to an undef vector (i.e. implicit_def), we can change the value of bits outside the requested write without consequence. This allows us to avoid a VSETVLI just for narrowing the value written.

Differential Revision: https://reviews.llvm.org/D127880

2 years ago[PS5] Use same debug trap instruction as PS4
Paul Robinson [Thu, 16 Jun 2022 18:02:45 +0000 (11:02 -0700)]
[PS5] Use same debug trap instruction as PS4

2 years ago[libc][obvious] fix address test on windows
Michael Jones [Thu, 16 Jun 2022 17:49:11 +0000 (10:49 -0700)]
[libc][obvious] fix address test on windows

On windows size_t != unsigned long.

Differential Revision: https://reviews.llvm.org/D127989

2 years agoFix a bug introduced by the move of AddressRanges.h into ADT.
Greg Clayton [Tue, 14 Jun 2022 23:44:21 +0000 (16:44 -0700)]
Fix a bug introduced by the move of AddressRanges.h into ADT.

The bug was introduced when the AddressRange class was no longer able to modify the End address directly and the entire range of the .text address range that contained the trailing empty symbol was replaced. There was no unit test for this, so it wasn't caught. I fixed the bug and added a unit test for it.

The effects of this bug are serious as the AddressOffsetSize in the header would be incorrectly calculated and an invalid GSYM would be created.

Differential Revision: https://reviews.llvm.org/D127811

2 years ago[flang] NINT(-.4999) is 0, not overflow
Peter Klausler [Fri, 10 Jun 2022 00:06:35 +0000 (17:06 -0700)]
[flang] NINT(-.4999) is 0, not overflow

Overflow detection in the folding of int/nint/ceiling is
incorrectly signalling overflow when a negative argument yields
a zero result.
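
For illustration, a range-based overflow check for a 32-bit NINT avoids this kind of false positive. This is a sketch under that assumption, with hypothetical names, and not flang's folding code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct NintResult {
  std::int32_t Value;
  bool Overflow;
};

NintResult nint32(double X) {
  double Rounded = std::round(X); // halfway cases round away from zero
  // Written so that NaN also fails the test and is reported as overflow.
  if (!(Rounded >= -2147483648.0 && Rounded <= 2147483647.0))
    return NintResult{0, true};
  return NintResult{static_cast<std::int32_t>(Rounded), false};
}
```

Here `nint32(-0.4999)` rounds to negative zero, which converts to 0 without any overflow being reported.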

Differential Revision: https://reviews.llvm.org/D127785

2 years ago[SLP]Use original vector if need to shuffle truncated root.
Alexey Bataev [Thu, 16 Jun 2022 14:52:14 +0000 (07:52 -0700)]
[SLP]Use original vector if need to shuffle truncated root.

If the root scalar is mapped to the smallest bit width, the vector is
truncated and the types of the original buildvector and the extracted value
mismatch. For extracts, we emit sext/zext instructions; for shuffles, we
can reuse the original vector instead of the truncated one.

Differential Revision: https://reviews.llvm.org/D127974

2 years ago[libc++][doc] Update formatting status.
Mark de Wever [Thu, 16 Jun 2022 17:37:49 +0000 (19:37 +0200)]
[libc++][doc] Update formatting status.

2 years ago[RISCV] Fix a typo in an intrinsic name
Philip Reames [Thu, 16 Jun 2022 17:32:04 +0000 (10:32 -0700)]
[RISCV] Fix a typo in an intrinsic name

Apparently the parser/verifier is more lax than it should be.  The typo'd names should have been rejected.

2 years ago[AMDGPU] Add GFX11 llvm.amdgcn.ds.add.gs.reg.rtn / llvm.amdgcn.ds.sub.gs.reg.rtn...
Jay Foad [Thu, 16 Jun 2022 12:27:59 +0000 (13:27 +0100)]
[AMDGPU] Add GFX11 llvm.amdgcn.ds.add.gs.reg.rtn / llvm.amdgcn.ds.sub.gs.reg.rtn intrinsics

Differential Revision: https://reviews.llvm.org/D127955

2 years ago[AMDGPU] GFX11 CodeGen support for MIMG instructions
Jay Foad [Tue, 14 Jun 2022 14:12:42 +0000 (15:12 +0100)]
[AMDGPU] GFX11 CodeGen support for MIMG instructions

This includes:
- New llvm.amdgcn.image.msaa.load.* intrinsics
- NSA changes, because MIMG-NSA is now limited to 3 dwords
- Split CD forms of IMAGE_SAMPLE instructions out into separate
  test files since they are no longer supported in GFX11

Differential Revision: https://reviews.llvm.org/D127837

2 years ago[AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row
Jay Foad [Mon, 13 Jun 2022 17:20:52 +0000 (18:20 +0100)]
[AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row

Differential Revision: https://reviews.llvm.org/D127671

2 years ago[AArch64] Regenerate 3 codegen test files. NFC
David Green [Thu, 16 Jun 2022 17:23:05 +0000 (18:23 +0100)]
[AArch64] Regenerate 3 codegen test files. NFC

2 years ago[docs][OpaquePtr] Add detail to motivations behind opaque pointers
Arthur Eubanks [Tue, 24 May 2022 17:57:11 +0000 (10:57 -0700)]
[docs][OpaquePtr] Add detail to motivations behind opaque pointers

Reviewed By: #opaque-pointers, rnk, nikic

Differential Revision: https://reviews.llvm.org/D126309

2 years ago[flang] Handle module subprogram with interface in same (sub)module when writing...
Peter Klausler [Thu, 9 Jun 2022 23:06:23 +0000 (16:06 -0700)]
[flang] Handle module subprogram with interface in same (sub)module when writing module file

There are a few (3) cases where Fortran allows two distinct symbols to have
the same name in the same scope.  Module file output copes with only two of
them.  The third involves a separate module procedure that isn't separate:
both the procedure and its declared interface appear in the same (sub)module.
Fix to ensure that the interface is included in the module file output, so
that the module file reader doesn't suffer a bogus error about a "separate
module procedure without an interface".

Differential Revision: https://reviews.llvm.org/D127784

2 years ago[RISCV] Merge TIED_TU and TIED instructions for VWADD_W/VWSUB_W by using policy operand.
Craig Topper [Thu, 16 Jun 2022 16:59:27 +0000 (09:59 -0700)]
[RISCV] Merge TIED_TU and TIED instructions for VWADD_W/VWSUB_W by using policy operand.

This removes one of the uses of ForceTailUndisturbed.

2 years ago[flang] Correct implementation of WAIT with no ID
Peter Klausler [Wed, 15 Jun 2022 23:53:56 +0000 (16:53 -0700)]
[flang] Correct implementation of WAIT with no ID

Previous one was returning a bogus error status about a bad WAIT
statement ID number.

Differential Revision: https://reviews.llvm.org/D127979

2 years ago[libc] fix line buffered empty file writes
Michael Jones [Wed, 15 Jun 2022 23:12:58 +0000 (16:12 -0700)]
[libc] fix line buffered empty file writes

Previously, any line buffered write of size 0 would cause an error.
The variable used to track the index of the last newline started at
the size of the write - 1, which underflowed. Now it's handled properly,
and a test has been added to prevent regressions.
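
The underflow can be avoided by never computing `size - 1` on an unsigned type. A minimal sketch, with a hypothetical function name and sentinel (this is not the libc implementation):

```cpp
#include <cassert>
#include <cstddef>

// Sentinel meaning "no newline in the buffer".
const std::size_t NoNewline = static_cast<std::size_t>(-1);

std::size_t lastNewlineIndex(const char *Data, std::size_t Len) {
  // I runs from Len down to 1 and inspects Data[I - 1], so Len == 0 never
  // evaluates Len - 1 on an unsigned type.
  for (std::size_t I = Len; I > 0; --I)
    if (Data[I - 1] == '\n')
      return I - 1;
  return NoNewline;
}
```

With a zero-length write the loop body never runs and the function reports that there is no newline, instead of indexing from a wrapped-around position.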

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D127914

2 years ago[libc] add printf hex conversion
Michael Jones [Fri, 20 May 2022 18:50:16 +0000 (11:50 -0700)]
[libc] add printf hex conversion

The hex converter handles the %x and %X conversions.
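
As a rough illustration of what the two conversions produce, ignoring flags, width and precision: `toHex` is a hypothetical helper and not the converter added by this patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Uppercase selects the digits used by %X; otherwise those used by %x.
std::string toHex(std::uintmax_t Value, bool Uppercase) {
  const char *Digits = Uppercase ? "0123456789ABCDEF" : "0123456789abcdef";
  if (Value == 0)
    return "0";
  std::string Out;
  while (Value != 0) {
    Out.insert(Out.begin(), Digits[Value & 0xF]); // prepend the low nibble
    Value >>= 4;
  }
  return Out;
}
```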

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D126082

2 years ago[mlir][vector] Fix contraction op lowering with mixed types
Thomas Raoux [Thu, 16 Jun 2022 00:40:58 +0000 (00:40 +0000)]
[mlir][vector] Fix contraction op lowering with mixed types

A contraction op can have mixed types; add support for this case to the pattern
lowering contraction ops to outerproduct.

Differential Revision: https://reviews.llvm.org/D127926

2 years ago[clang] Don't emit type test/assume for virtual classes that should never participate...
Arthur Eubanks [Wed, 15 Jun 2022 16:44:43 +0000 (09:44 -0700)]
[clang] Don't emit type test/assume for virtual classes that should never participate in WPD

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D127876

2 years ago[mlir][linalg] Relax convolution vectorization to support mixed types
Thomas Raoux [Thu, 16 Jun 2022 00:52:25 +0000 (00:52 +0000)]
[mlir][linalg] Relax convolution vectorization to support mixed types

Support the case where convolution does float extension of the inputs.

Differential Revision: https://reviews.llvm.org/D127925

2 years ago[RISCV] Reorder function definitions to reduce upcoming diff [nfc]
Philip Reames [Thu, 16 Jun 2022 16:25:20 +0000 (09:25 -0700)]
[RISCV] Reorder function definitions to reduce upcoming diff [nfc]

2 years ago[libc][NFC] Make explicit uint16_t casts in fenv
Alex Brachet [Thu, 16 Jun 2022 16:18:44 +0000 (16:18 +0000)]
[libc][NFC] Make explicit uint16_t casts in fenv

2 years ago[MLInliner] Don't inline call sites in unreachable basic blocks
Mircea Trofin [Wed, 15 Jun 2022 14:32:55 +0000 (07:32 -0700)]
[MLInliner] Don't inline call sites in unreachable basic blocks

This requires DominatorTree be updated, which we do in the ml inliner
case, but not in the default case, and the cost of doing so is
noticeable to compile time for the latter[1]. So the patch only affects
the ML inliner.

[1] https://llvm-compile-time-tracker.com/compare.php?from=9fc0aa45e3312944431ba7e1ca0cec99c613992b&to=7af461b1ce0d9138211ef5f883f35d5b9ddf47be&stat=wall-time

Differential Revision: https://reviews.llvm.org/D127899

2 years ago[RISCV] Use TAIL_AGNOSTIC in riscv_fma_vl patterns.
Craig Topper [Thu, 16 Jun 2022 16:03:02 +0000 (09:03 -0700)]
[RISCV] Use TAIL_AGNOSTIC in riscv_fma_vl patterns.

We may eventually need tail undisturbed patterns, but we will need
a policy operand on the ISD node to communicate it.

2 years agoAllow bitwidth difference when checking for isOneOrOneSplat.
Adrian Tong [Wed, 8 Jun 2022 18:20:42 +0000 (18:20 +0000)]
Allow bitwidth difference when checking for isOneOrOneSplat.

This helps handle a case where the BUILD_VECTOR has an i16 element type
and i32 constant operands:

t2: v8i16 = setcc t8, t17, setult:ch
t3: v8i16 = BUILD_VECTOR Constant:i32<1>, ...
   t4: v8i16 = and t2, t3
      t5: v8i16 = add t8, t4

This can be turned into t5: v8i16 = sub t8, t2, and allows us to remove
t3 and t4 from the DAG.
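
The rewrite above can be checked per lane with a small sketch. It assumes that a true setcc lane is all ones (0xFFFF for i16), which is how vector booleans are commonly represented; the function names are hypothetical and the code models a single i16 lane rather than the DAG combine itself:

```cpp
#include <cassert>
#include <cstdint>

// Original form: add x, (and mask, 1). `mask` stands in for one setcc lane.
std::uint16_t add_and_form(std::uint16_t x, std::uint16_t mask) {
  return static_cast<std::uint16_t>(x + (mask & 1u));
}

// Rewritten form: sub x, mask. Subtracting 0xFFFF (i.e. -1) adds 1 mod 2^16.
std::uint16_t sub_form(std::uint16_t x, std::uint16_t mask) {
  return static_cast<std::uint16_t>(x - mask);
}

// Exhaustively compares both forms for every x and both possible lane masks.
bool forms_agree() {
  const std::uint16_t masks[] = {0x0000, 0xFFFF};
  for (unsigned x = 0; x <= 0xFFFF; ++x)
    for (std::uint16_t mask : masks)
      if (add_and_form(static_cast<std::uint16_t>(x), mask) !=
          sub_form(static_cast<std::uint16_t>(x), mask))
        return false;
  return true;
}
```

The equivalence only holds when every mask lane is 0 or all ones, which is why the splat-of-one check on the BUILD_VECTOR matters.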

Differential Revision: https://reviews.llvm.org/D127354

2 years agoRevert "[libc++] Test the size of basic_string"
Nikolas Klauser [Thu, 16 Jun 2022 16:01:22 +0000 (18:01 +0200)]
Revert "[libc++] Test the size of basic_string"

This reverts commit 147f74b6ee901c0338671d628da79c2108452097.

2 years ago[RISCV] Split DemandedField logic in advance of reuse in dataflow [nfc]
Philip Reames [Thu, 16 Jun 2022 15:48:28 +0000 (08:48 -0700)]
[RISCV] Split DemandedField logic in advance of reuse in dataflow [nfc]

This change just moves some code around, and extracts out a helper function expected to be useful when reusing the demanded field logic in the forward dataflow.

2 years ago[PowerPC] Fix LQ-STQ instructions to use correct offset and base
Ahsan Saghir [Wed, 1 Jun 2022 17:12:34 +0000 (12:12 -0500)]
[PowerPC] Fix LQ-STQ instructions to use correct offset and base

This patch fixes the load and store quadword instructions on
PowerPC to use correct offset and base address.

Reviewed By: #powerpc, nemanjai, lkail

Differential Revision: https://reviews.llvm.org/D126807

2 years ago[RISCV] Move getSEWLMULRatio out of VSETVLIInfo [nfc]
Philip Reames [Thu, 16 Jun 2022 15:39:53 +0000 (08:39 -0700)]
[RISCV] Move getSEWLMULRatio out of VSETVLIInfo [nfc]

2 years ago[clang] Don't emit IFUNC when targeting Fuchsia
Alex Brachet [Thu, 16 Jun 2022 15:38:12 +0000 (15:38 +0000)]
[clang] Don't emit IFUNC when targeting Fuchsia

Fuchsia's dynamic linker does not and will never support IFUNCs.

Differential revision: https://reviews.llvm.org/D127933

2 years ago[RISCV] Use TAIL_UNDISTURBED_MASK_UNDISTURBED for riscv_slidedown_vl unless the merge...
Craig Topper [Thu, 16 Jun 2022 06:54:03 +0000 (23:54 -0700)]
[RISCV] Use TAIL_UNDISTURBED_MASK_UNDISTURBED for riscv_slidedown_vl unless the merge op is undef.

If the merge operand isn't undef we need to be using tail undisturbed.

Turns out all of our uses of riscv_slidedown_vl use undef so this
doesn't affect any tests.

2 years agoReplace to_hexString by touhexstr [NFC]
Corentin Jabot [Thu, 16 Jun 2022 09:52:40 +0000 (11:52 +0200)]
Replace to_hexString by touhexstr [NFC]

LLVM had two methods to convert a number to a hex string;
this removes one of them.

Differential Revision: https://reviews.llvm.org/D127958

2 years ago[strictfp][IPSCCP] Precommit tests for D115737.
Kevin P. Neal [Thu, 16 Jun 2022 15:20:04 +0000 (11:20 -0400)]
[strictfp][IPSCCP] Precommit tests for D115737.

2 years agoAdd braces to silence a gcc 9.4 -Wdangling-else warning [nfc]
Philip Reames [Thu, 16 Jun 2022 15:10:34 +0000 (08:10 -0700)]
Add braces to silence a gcc 9.4 -Wdangling-else warning [nfc]

2 years ago[RISCV] Extend demanded field transform in InsertVSETVLI to VTYPE subfields
Philip Reames [Thu, 16 Jun 2022 15:00:05 +0000 (08:00 -0700)]
[RISCV] Extend demanded field transform in InsertVSETVLI to VTYPE subfields

The motivating case, and the only one actually enabled by this patch, is a load or store followed by another op with the same SEW/LMUL ratio.

As an example, consider:

define void @test1(ptr %in, ptr %out) {
entry:
  %0 = load <8 x i16>, ptr %in, align 2
  %1 = sext <8 x i16> %0 to <8 x i32>
  store <8 x i32> %1, ptr %out, align 4
  ret void
}

Without this patch, we get:

vsetivli zero, 8, e16, mf4, ta, mu
vle16.v v8, (a0)
vsetvli zero, zero, e32, mf2, ta, mu
vsext.vf2 v9, v8
vse32.v v9, (a1)
ret

Whereas with the patch we get:

vsetivli zero, 8, e32, mf2, ta, mu
vle16.v v8, (a0)
vsext.vf2 v9, v8
vse32.v v9, (a1)
ret

We have rewritten the first vsetvli and thus removed the second one.
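
Why the two configurations are interchangeable for the load can be seen from a small calculation. The sketch below is not code from the patch: VType, sew_lmul_ratio, and vlmax are hypothetical names, and VLEN = 512 is an assumed value chosen so that VLMAX covers the 8 elements in the example. Since VLMAX = VLEN * LMUL / SEW, two VTYPEs with the same SEW/LMUL ratio give the same VLMAX, and vle16.v carries its own element width in the opcode:

```cpp
#include <cassert>

// LMUL is stored as a fraction so that mf2/mf4/mf8 can be represented.
struct VType {
  unsigned sew;       // selected element width in bits
  unsigned lmul_num;  // LMUL numerator
  unsigned lmul_den;  // LMUL denominator
};

// SEW / LMUL, computed as SEW * den / num to stay in integers.
constexpr unsigned sew_lmul_ratio(VType t) {
  return t.sew * t.lmul_den / t.lmul_num;
}

// VLMAX = VLEN * LMUL / SEW = VLEN / (SEW / LMUL).
constexpr unsigned vlmax(unsigned vlen_bits, VType t) {
  return vlen_bits / sew_lmul_ratio(t);
}

constexpr VType kE16MF4{16, 1, 4};  // e16, mf4 from the first vsetivli
constexpr VType kE32MF2{32, 1, 2};  // e32, mf2 from the second vsetvli

static_assert(sew_lmul_ratio(kE16MF4) == sew_lmul_ratio(kE32MF2),
              "both configurations share SEW/LMUL = 64");
```

Both configurations yield a ratio of 64, so under the assumed VLEN of 512 each gives VLMAX = 8.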

As is strongly hinted by the code structure and TODOs, I am planning on commoning this with all (or almost all?) of the cases from isCompatible used in the forward data flow. This will be done in a series of follow-up changes: some NFC reworks, and some reviewed optimization extensions.

Differential Revision: https://reviews.llvm.org/D127780

2 years ago[mlir][spirv] Workaround driver bug in math.ctlz conversion again
Lei Zhang [Thu, 16 Jun 2022 14:48:33 +0000 (10:48 -0400)]
[mlir][spirv] Workaround driver bug in math.ctlz conversion again

The previous approach does not work because the Adreno driver is
clever enough to optimize away the selection. So now check two
inputs together.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127930

2 years ago[CostModel][AArch64][NFC] Add cost model tests for fshl/fshr intrinsics
David Sherwood [Thu, 16 Jun 2022 14:47:12 +0000 (15:47 +0100)]
[CostModel][AArch64][NFC] Add cost model tests for fshl/fshr intrinsics

2 years agoPrevent crash when TurnSwitchRangeIntoICmp receives default unreachable destination
Samuel Eubanks [Thu, 16 Jun 2022 14:10:13 +0000 (16:10 +0200)]
Prevent crash when TurnSwitchRangeIntoICmp receives default unreachable destination

TurnSwitchRangeIntoICmp crashes when given a switch with a default
destination of unreachable.
Addresses issue #53208:
https://github.com/llvm/llvm-project/issues/53208
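
For readers unfamiliar with the transform, the general idea suggested by its name can be sketched as follows. This is an illustration of the range-check idiom, not the SimplifyCFG code, and the case values 3..6 and both function names are made up for the example: a switch whose cases form one contiguous range is equivalent to a single unsigned comparison.

```cpp
#include <cassert>

// Switch form: cases 3, 4, 5, 6 share one destination, everything else
// goes to the default destination.
bool in_range_switch(int x) {
  switch (x) {
  case 3:
  case 4:
  case 5:
  case 6:
    return true;
  default:
    return false;
  }
}

// Compare form: (x - lo) u< count. Values below lo wrap around to large
// unsigned numbers, so a single unsigned compare covers both bounds.
bool in_range_icmp(int x) {
  return static_cast<unsigned>(x) - 3u < 4u;
}

// Checks that both forms agree on a window of values around the range.
bool forms_agree() {
  for (int x = -100; x <= 100; ++x)
    if (in_range_switch(x) != in_range_icmp(x))
      return false;
  return true;
}
```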

Differential revision: https://reviews.llvm.org/D127712

2 years ago[LV] Remove widenPHIInstruction dependence on underlying instr (NFC).
Florian Hahn [Thu, 16 Jun 2022 14:02:44 +0000 (16:02 +0200)]
[LV] Remove widenPHIInstruction dependence on underlying instr (NFC).

Instead of using the underlying instruction and VF to get the type, use
the type of the incoming value. This removes an unnecessary dependence
on the underlying instruction and enables using the recipe without an
underlying instruction.

2 years ago[SLP]Extend vectorization for scatter vectorize nodes.
Alexey Bataev [Tue, 7 Jun 2022 14:43:29 +0000 (07:43 -0700)]
[SLP]Extend vectorization for scatter vectorize nodes.

Currently scatter vectorize nodes can be emitted only for GEPs with
constant indices. But we can also emit such nodes for GEPs with the same
ptr and non-constant vectorizable/gathered indices, if profitable. Patch
adds support for such nodes and tries to improve handling of GEPs with
non-constant indices for such nodes.

Metric: SLP.NumVectorInstructions

Program                                                                                       SLP.NumVectorInstructions
                                                                                              results                   results0 diff
                    test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test  5243.00                   5240.00  -0.1%
                     test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test  5243.00                   5240.00  -0.1%
                     test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 27550.00                  27507.00  -0.2%
                               test-suite :: External/SPEC/CFP2006/453.povray/453.povray.test  5395.00                   5380.00  -0.3%
                       test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test  5389.00                   5374.00  -0.3%
                    test-suite :: External/SPEC/CINT2017rate/520.omnetpp_r/520.omnetpp_r.test   961.00                    958.00  -0.3%
                   test-suite :: External/SPEC/CINT2017speed/620.omnetpp_s/620.omnetpp_s.test   961.00                    958.00  -0.3%
                               test-suite :: External/SPEC/CFP2006/447.dealII/447.dealII.test  5664.00                   5643.00  -0.4%
                       test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 13202.00                  13127.00  -0.6%
                                test-suite :: External/SPEC/CINT2006/445.gobmk/445.gobmk.test   212.00                    207.00  -2.4%
                                test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test   890.00                    850.00  -4.5%
                            test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test  1695.00                   1581.00  -6.7%
                                 test-suite :: MultiSource/Applications/JM/lencod/lencod.test  2338.00                   2140.00  -8.5%
                                  test-suite :: SingleSource/UnitTests/matrix-types-spec.test    63.00                     55.00 -12.7%
                             test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test   468.00                    356.00 -23.9%
                                                                           Geomean difference                                     -0.3%

All numbers show increased number of generated vector instructions.

Diff:
SingleSource/Benchmarks/Adobe-C++/loop_unroll - better without LTO, but
needs extra analysis with LTO (with LTO the compiler generates
masked_gather, while before regular loads were emitted because of extra
data available at LTO time).
SingleSource/UnitTests/matrix-types-spec - more vector code.
MultiSource/Applications/JM/lencod/lencod - same.
External/SPEC/CINT2006/464.h264ref/464.h264ref - same.
MultiSource/Benchmarks/7zip/7zip-benchmark - same.
External/SPEC/CINT2006/445.gobmk/445.gobmk - no changes.
External/SPEC/CFP2017rate/510.parest_r/510.parest_r - more vector code.
External/SPEC/CFP2006/447.dealII/447.dealII - same
External/SPEC/CINT2017speed/620.omnetpp_s/620.omnetpp_s - same
External/SPEC/CINT2017rate/520.omnetpp_r/520.omnetpp - same
External/SPEC/CFP2017rate/511.povray_r/511.povray - same
External/SPEC/CFP2006/453.povray/453.povray - same
External/SPEC/CFP2017rate/526.blender_r/526.blender_r - same
External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r - same
External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s - same

Differential Revision: https://reviews.llvm.org/D127219

2 years ago[libc++] Robust against C++20-hostile iterators
Nikolas Klauser [Thu, 16 Jun 2022 08:50:15 +0000 (10:50 +0200)]
[libc++] Robust against C++20-hostile iterators

Reviewed By: ldionne, #libc, EricWF

Spies: EricWF, libcxx-commits, mgrang

Differential Revision: https://reviews.llvm.org/D127669

2 years agocmake: configure clang lit to use hmaptool from source directly
Matheus Izvekov [Thu, 16 Jun 2022 06:24:19 +0000 (08:24 +0200)]
cmake: configure clang lit to use hmaptool from source directly

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: dyung

Differential Revision: https://reviews.llvm.org/D127943

2 years ago[AMDGPU] Remove duplicate RUN lines from a test
Jay Foad [Thu, 16 Jun 2022 10:47:20 +0000 (11:47 +0100)]
[AMDGPU] Remove duplicate RUN lines from a test

2 years ago[AMDGPU][MC][GFX11] Correct src0 for dpp variants of v_cvt_*_e64
Dmitry Preobrazhensky [Thu, 16 Jun 2022 10:44:56 +0000 (13:44 +0300)]
[AMDGPU][MC][GFX11] Correct src0 for dpp variants of v_cvt_*_e64

Differential Revision: https://reviews.llvm.org/D127847

2 years ago[InstCombine] Add more tests for freeze of loop phi (NFC)
Nikita Popov [Thu, 16 Jun 2022 10:44:14 +0000 (12:44 +0200)]
[InstCombine] Add more tests for freeze of loop phi (NFC)

2 years ago[clangd] Don't add inlay hints on std::move/forward
Tobias Ribizel [Thu, 16 Jun 2022 10:23:51 +0000 (12:23 +0200)]
[clangd] Don't add inlay hints on std::move/forward

This removes parameter inlay hints from a few builtin functions like std::move/std::forward.

Reviewed By: nridge

Differential Revision: https://reviews.llvm.org/D127859

2 years ago[sanitizer_common] Fix SanitizerCommon.ChainedOriginDepotStats test
Kristina Bessonova [Thu, 16 Jun 2022 09:56:06 +0000 (11:56 +0200)]
[sanitizer_common] Fix SanitizerCommon.ChainedOriginDepotStats test

This test was failing with the following error message when the test binary
was run directly, without using lit:

  $ Sanitizer-x86_64-Test --gtest_filter=SanitizerCommon.ChainedOriginDepot*
  ...
  [ RUN      ] SanitizerCommon.ChainedOriginDepotStats
  compiler-rt/lib/sanitizer_common/tests/sanitizer_chained_origin_depot_test.cpp:77: Failure
  Expected: (stats1.allocated) > (stats0.allocated), actual: 196608 vs 196608
  [  FAILED  ] SanitizerCommon.ChainedOriginDepotStats (867 ms)

Since the ChainedOriginDepot* tests are not doing any cleanup, by the time
SanitizerCommon.ChainedOriginDepotStats test starts executing the depot
may not be empty, so there will be no allocation for the test.

This patch introduces ChainedOriginDepot::TestOnlyUnmap() API that deallocates
memory when requested. This makes sure underlying TwoLevelMap initiates
the expected allocation during the test.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D127621

2 years ago[AArch64][SME] Add SME cntsb/h/w/d intrinsics
David Sherwood [Wed, 15 Jun 2022 10:28:08 +0000 (11:28 +0100)]
[AArch64][SME] Add SME cntsb/h/w/d intrinsics

These intrinsics return the number of elements in a streaming
vector, for example aarch64.sme.cntsw returns the number of
32-bit elements. When in streaming mode these are equivalent
to aarch64.sve.cntb/h/w/d with an input value of 1.

I have implemented these intrinsics using the rdsvl instruction
and added tests here:

  CodeGen/AArch64/SME/sme-intrinsics-rdsvl.ll
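
The relationship between the four counts can be sketched as plain arithmetic. This is not the lowering from the patch: cnts is a hypothetical helper, and the 64-byte streaming vector length used in the checks is an assumed value. Given the streaming vector length in bytes, which is what rdsvl reports in multiples, the element count is the vector size in bits divided by the element width:

```cpp
#include <cassert>
#include <cstdint>

// Number of elements of `element_bits` width that fit in a streaming vector
// of `svl_bytes` bytes. element_bits is 8, 16, 32, or 64 for b/h/w/d.
constexpr std::uint64_t cnts(std::uint64_t svl_bytes, unsigned element_bits) {
  return svl_bytes * 8 / element_bits;
}

// With an assumed 512-bit (64-byte) streaming vector:
static_assert(cnts(64, 8) == 64, "cntsb");
static_assert(cnts(64, 16) == 32, "cntsh");
static_assert(cnts(64, 32) == 16, "cntsw");
static_assert(cnts(64, 64) == 8, "cntsd");
```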

Differential Revision: https://reviews.llvm.org/D127853

2 years ago[AMDGPU] Change use null for dead sdst to be gfx1030+
David Stuttard [Wed, 15 Jun 2022 15:48:28 +0000 (16:48 +0100)]
[AMDGPU] Change use null for dead sdst to be gfx1030+

Before gfx1030, null for sdst behaves differently.
c97436f8b6e2 "[AMDGPU] Use null for dead sdst operand" requires a change so that
it does not apply to targets before gfx1030.

Differential Revision: https://reviews.llvm.org/D127869

2 years agoRevert "[libc] Apply no-builtin everywhere, remove unnecessary flags"
Guillaume Chatelet [Thu, 16 Jun 2022 09:28:17 +0000 (09:28 +0000)]
Revert "[libc] Apply no-builtin everywhere, remove unnecessary flags"

This reverts commit b2a9ea4420127d10b18ae648b16757665f8bbd7c.

2 years agoReland "[SplitKit] Handle early clobber + tied to def correctly"
Kito Cheng [Thu, 16 Jun 2022 08:45:36 +0000 (16:45 +0800)]
Reland "[SplitKit] Handle early clobber + tied to def correctly"

This reverts commit 7207373e1eb0dd419b4e13a5e2d0ca146ef9544e.

We found another RISC-V bug when landing D126048, and it has been fixed
by D127642 now.

Differential Revision: https://reviews.llvm.org/D126048

2 years agoReland "[RISCV] Testcase to show wrong register allocation result of subreg liveness"
Kito Cheng [Wed, 15 Jun 2022 08:24:09 +0000 (16:24 +0800)]
Reland "[RISCV] Testcase to show wrong register allocation result of subreg liveness"

This reverts commit 6a6f632b93cd3ed9e35f37c04efb12fa87038b60.

This commit fails when EXPENSIVE_CHECKS is enabled; that is fixed by D126048.

Differential Revision: https://reviews.llvm.org/D126047

2 years ago[libc++] Test the size of basic_string
Nikolas Klauser [Wed, 15 Jun 2022 20:28:15 +0000 (22:28 +0200)]
[libc++] Test the size of basic_string

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D127672

2 years agoUpdate FileCheck docs after D95849. NFCI
Diana Picus [Wed, 15 Jun 2022 11:24:21 +0000 (11:24 +0000)]
Update FileCheck docs after D95849. NFCI

The default has been false for quite a while now.

Differential Revision: https://reviews.llvm.org/D127846

2 years agoRevert "[ARM] Add a pipeline test showing missing postinc generation. NFC"
David Green [Thu, 16 Jun 2022 07:23:08 +0000 (08:23 +0100)]
Revert "[ARM] Add a pipeline test showing missing postinc generation. NFC"

This reverts commit d9ef307e9bb3b636a18c4051a236f1aafd7600e6 as it is
causing expensive check verification errors. Remove the test again
until we can fix them.

2 years ago[AMDGPU] Add support for GFX11 hazards
Jay Foad [Wed, 15 Jun 2022 16:09:10 +0000 (17:09 +0100)]
[AMDGPU] Add support for GFX11 hazards

Add support for partial stall over EXEC hazard and trans use hazard.

Differential Revision: https://reviews.llvm.org/D127872

2 years ago[ARM] Add a pipeline test showing missing postinc generation. NFC
David Green [Thu, 16 Jun 2022 07:04:50 +0000 (08:04 +0100)]
[ARM] Add a pipeline test showing missing postinc generation. NFC