platform/upstream/llvm.git
4 years ago[InstCombine] Rename InstCombinerImpl::matchBSwap to matchBSwapOrBitReverse. NFCI.
Simon Pilgrim [Fri, 23 Oct 2020 11:19:18 +0000 (12:19 +0100)]
[InstCombine] Rename InstCombinerImpl::matchBSwap to matchBSwapOrBitReverse. NFCI.

This matches bswap and bitreverse intrinsics, so we should make that clear in the function name.

4 years ago[X86] lowerShuffleWithPERMV - use MVT::changeTypeToInteger helper. NFCI.
Simon Pilgrim [Fri, 23 Oct 2020 11:14:19 +0000 (12:14 +0100)]
[X86] lowerShuffleWithPERMV - use MVT::changeTypeToInteger helper. NFCI.

4 years ago[ARM][SchedModels] Convert IsR1P0AndLaterPred to MCSchedPredicate. NFC
Evgeny Leviant [Fri, 23 Oct 2020 11:27:49 +0000 (14:27 +0300)]
[ARM][SchedModels] Convert IsR1P0AndLaterPred to MCSchedPredicate. NFC

Differential revision: https://reviews.llvm.org/D90017

4 years ago[AArch64] Implement getIntrinsicInstrCost, handle min/max intrinsics.
Florian Hahn [Fri, 23 Oct 2020 08:00:20 +0000 (09:00 +0100)]
[AArch64] Implement getIntrinsicInstrCost, handle min/max intrinsics.

This patch adds a specialized implementation of getIntrinsicInstrCost
and adds initial cost-modeling for min/max vector intrinsics.

AArch64 NEON supports umin/smin/umax/smax for vectors
<8 x i8>, <16 x i8>, <4 x i16>, <8 x i16>, <2 x i32> and <4 x i32>.
Notably, it does not support vectors with i64 elements.

This change by itself should have very little impact on codegen, but in
follow-up patches I plan to teach the vectorizers to consider using
those intrinsics on platforms where it is profitable, e.g. because there
is no general 'select'-like instruction.

The current cost returned should be better for throughput, latency and size.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D89953

4 years ago[lldb] Split out NetBSD/x86 watchpoint impl for unification
Michał Górny [Thu, 22 Oct 2020 12:18:36 +0000 (14:18 +0200)]
[lldb] Split out NetBSD/x86 watchpoint impl for unification

Split the current NetBSD watchpoint implementation for x86 into Utility,
and revamp it to improve readability.  This code is meant to be used
as a common class for all x86 watchpoint implementations, particularly
those on FreeBSD and Linux.

The code uses global watchpoint enable bits, as required by the NetBSD
kernel.  If it ever becomes necessary for any platform to use local
enable bits instead, this can be trivially abstracted out.
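
As a rough illustration of the global/local distinction above: in the x86 DR7
register, debug slot N has a local enable bit at position 2*N and a global
enable bit at position 2*N+1. The sketch below only models those enable bits
(the R/W and LEN fields are ignored), and the helper names are hypothetical,
not the names used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Local enable bit (L0..L3) for a debug register slot: bit 2*Slot of DR7.
constexpr uint64_t dr7LocalEnableBit(unsigned Slot) {
  return uint64_t{1} << (2 * Slot);
}

// Global enable bit (G0..G3) for a debug register slot: bit 2*Slot+1 of DR7.
constexpr uint64_t dr7GlobalEnableBit(unsigned Slot) {
  return uint64_t{1} << (2 * Slot + 1);
}

// Hypothetical helper: returns DR7 with the chosen enable bit set for Slot.
// Global bits are the default, matching what the NetBSD kernel requires.
inline uint64_t enableWatchpoint(uint64_t Dr7, unsigned Slot,
                                 bool UseGlobal = true) {
  assert(Slot < 4 && "x86 has four debug address registers");
  return Dr7 | (UseGlobal ? dr7GlobalEnableBit(Slot) : dr7LocalEnableBit(Slot));
}
```

Switching a platform to local enable bits would then amount to passing
UseGlobal = false, which is the kind of abstraction the message alludes to.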

The code also postpones clearing DR6 until a new different watchpoint
is being set in place of the old one.  This is necessary since LLDB
repeatedly reenables watchpoints on all threads, by clearing
and restoring them.  When DR6 is cleared as a part of that, then pending
events on other threads can no longer be associated with watchpoints
correctly.

Differential Revision: https://reviews.llvm.org/D89874

4 years ago[DebugInstrRef] NFC: Separate collection of machine/variable values
Jeremy Morse [Fri, 23 Oct 2020 10:06:51 +0000 (11:06 +0100)]
[DebugInstrRef] NFC: Separate collection of machine/variable values

This patch adjusts _when_ something happens in LiveDebugValues /
InstrRefBasedLDV, to make it more amenable to dealing with DBG_INSTR_REF
instructions. There's no functional change.

In the current InstrRefBasedLDV implementation, we collect the machine
value-number transfer function for blocks at the same time as the
variable-value transfer function. After solving machine value numbers, the
variable-value transfer function is updated so that DBG_VALUEs of live-in
registers have the correct value. The same would need to be done for
DBG_INSTR_REFs, to connect instruction-references with machine value
numbers.

Rather than writing more code for that, this patch separates the two: we
collect the (machine-value-number) transfer function and solve for
machine value numbers, then step through the MachineInstrs again collecting
the variable value transfer function. This simplifies things for the next
few patches.

Differential Revision: https://reviews.llvm.org/D85760

4 years ago[MLIR] Added PromoteBuffersToStackPass to convert heap- to stack-based allocations.
Julian Gross [Mon, 19 Oct 2020 11:49:06 +0000 (13:49 +0200)]
[MLIR] Added PromoteBuffersToStackPass to convert heap- to stack-based allocations.

Added optimization pass to convert heap-based allocs to stack-based allocas in
buffer placement. Added the corresponding test file.

Differential Revision: https://reviews.llvm.org/D89688

4 years ago[mlir] Fix exiting OpPatternRewriteDriver::simplifyLocally after first iteration...
Christian Sigg [Fri, 23 Oct 2020 06:29:22 +0000 (08:29 +0200)]
[mlir] Fix exiting OpPatternRewriteDriver::simplifyLocally after first iteration that didn't change the op.

Before this change, we would run `maxIterations` iterations if the first iteration changed the op.
After this change, we exit the loop as soon as an iteration hasn't changed the op.
Assuming that we have reached a fixed point when an iteration doesn't change the op, this doesn't affect correctness.
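
The early-exit behaviour can be sketched as below. This is a generic
illustration, not the actual OpPatternRewriteDriver code: `simplifyToFixedPoint`
and the `step` callback (which returns true when it changed the op) are
hypothetical names.

```cpp
#include <cassert>
#include <functional>

// Runs `step` until it reports no change or `maxIterations` is reached.
// Returns the number of iterations that were actually executed.
inline int simplifyToFixedPoint(const std::function<bool()> &step,
                                int maxIterations) {
  assert(maxIterations >= 0 && "iteration budget must be non-negative");
  int iterations = 0;
  while (iterations < maxIterations) {
    ++iterations;
    if (!step())
      break; // Nothing changed: treat this as a fixed point and stop early.
  }
  return iterations;
}
```

With a step that changes the op twice and then stabilises, the loop runs three
times regardless of how large the iteration budget is.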

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D89981

4 years ago[mem2reg] Remove dbg.values describing contents of dead allocas
OCHyams [Fri, 23 Oct 2020 04:44:13 +0000 (04:44 +0000)]
[mem2reg] Remove dbg.values describing contents of dead allocas

This patch copies @vsk's fix to instcombine from D85555 over to mem2reg. The
motivation and rationale are exactly the same: When mem2reg removes an alloca,
it erases the dbg.{addr,declare} instructions which refer to the alloca. It
would be better to instead remove all debug intrinsics which describe the
contents of the dead alloca, namely all dbg.value(<dead alloca>, ...,
DW_OP_deref)'s.

As far as I can tell, prior to D80264 these `dbg.value+deref`s would have been
silently dropped instead of being made `undef`, so we're just returning to
previous behaviour with these patches.

Testing:
`llvm-lit llvm/test` and `ninja check-clang` gave no unexpected failures. Added
3 tests, each of which covers a dbg.value deletion path in mem2reg:
  mem2reg-promote-alloca-1.ll
  mem2reg-promote-alloca-2.ll
  mem2reg-promote-alloca-3.ll
The first is based on the dexter test inlining.c from D89543. This patch also
improves the debugging experience for loop.c from D89543, which suffers
similarly after arg promotion instead of inlining.

4 years ago[lld][ELF][test] Add additional test coverage for LTO
James Henderson [Tue, 20 Oct 2020 09:07:27 +0000 (10:07 +0100)]
[lld][ELF][test] Add additional test coverage for LTO

These are all inspired by existing test coverage we have in an internal
testsuite.

Reviewed by: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D89775

4 years ago[AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy
Jay Foad [Tue, 6 Oct 2020 14:49:04 +0000 (15:49 +0100)]
[AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy

Differential Revision: https://reviews.llvm.org/D88955

4 years ago[SVE]Clarify TypeSize comparisons in llvm/lib/Transforms
Caroline Concatto [Fri, 16 Oct 2020 08:21:28 +0000 (09:21 +0100)]
[SVE]Clarify TypeSize comparisons in llvm/lib/Transforms

Use isKnownXY comparators when one of the operands can be a scalable
vector, or getFixedSize() for all the other cases.
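
To illustrate why plain comparisons are ambiguous here, the toy model below
treats a scalable size as MinValue * vscale with an unknown vscale >= 1. It is
a simplified stand-in, not LLVM's TypeSize class; `SizeInBits` and its members
are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a size that may be a multiple of an unknown vscale >= 1.
struct SizeInBits {
  uint64_t MinValue; // Size when vscale == 1.
  bool Scalable;     // True if the real size is MinValue * vscale.
};

// True only if LHS < RHS holds for every possible vscale.
inline bool isKnownLT(const SizeInBits &LHS, const SizeInBits &RHS) {
  // Fixed vs. fixed, scalable vs. scalable, and fixed vs. scalable can all be
  // decided from the minimum values. Scalable vs. fixed cannot, because the
  // left side may grow past any fixed bound.
  if (!LHS.Scalable || RHS.Scalable)
    return LHS.MinValue < RHS.MinValue;
  return false;
}

// Only meaningful for sizes that are known not to be scalable.
inline uint64_t getFixedSize(const SizeInBits &S) {
  assert(!S.Scalable && "size depends on vscale");
  return S.MinValue;
}
```

Note that a false result from isKnownLT means "not provably less than", which
is different from "greater than or equal".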

This patch also does bug fixes for getPrimitiveSizeInBits by using
getFixedSize() near the places with the TypeSize comparison.

Differential Revision: https://reviews.llvm.org/D89703

4 years ago[llvm-mca] Add test for cortex-a57 NEON instructions
Evgeny Leviant [Fri, 23 Oct 2020 07:55:54 +0000 (10:55 +0300)]
[llvm-mca] Add test for cortex-a57 NEON instructions

4 years ago[ARM][SchedModels] Let ldm* instruction scheduling use MCSchedPredicate
Evgeny Leviant [Fri, 23 Oct 2020 07:33:20 +0000 (10:33 +0300)]
[ARM][SchedModels] Let ldm* instruction scheduling use MCSchedPredicate

Differential revision: https://reviews.llvm.org/D89957

4 years ago[DebugInfo] Clear subreg in setDebugValueUndef()
David Stenberg [Thu, 22 Oct 2020 15:24:30 +0000 (17:24 +0200)]
[DebugInfo] Clear subreg in setDebugValueUndef()

When switching the register debug operands to $noreg in
setDebugValueUndef() also clear the sub-register indices for virtual
registers. This is done when marking DBG_VALUEs undef in other cases,
e.g. in LiveDebugVariables. I have not found any cases where leaving the
sub-register index causes any issues, and the indices would eventually
get dropped when LiveDebugVariables reinserted the undef DBG_VALUEs
after register scheduling, but if nothing else it looked a bit weird in
printouts to have sub-register indices on $noreg, and I don't think the
sub-register index holds any meaningful information at that point.

I have not been able to find any source-level reproducer for this with
an upstream target, so I have just added an instrumented machine-sink
test.

Reviewed By: djtodoro, jmorse

Differential Revision: https://reviews.llvm.org/D89941

4 years ago[llvm-objcopy][NFC] Extract arg parsing logic into a helper function
Keith Smiley [Fri, 23 Oct 2020 06:13:49 +0000 (23:13 -0700)]
[llvm-objcopy][NFC] Extract arg parsing logic into a helper function

This diff refactors the code which determines the tool type based on
how llvm-objcopy is invoked (objcopy vs strip vs bitcode-strip vs install-name-tool).
NFC.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D89713

4 years agoRevert "[JITLink][ELF] Add support for ELF::R_X86_64_REX_GOTPCRELX relocation."
Lang Hames [Fri, 23 Oct 2020 06:21:29 +0000 (23:21 -0700)]
Revert "[JITLink][ELF] Add support for ELF::R_X86_64_REX_GOTPCRELX relocation."

This reverts commit e2fceec2fd15b7b74617816ddd87f456c42bbc45.

This commit broke one of the bots. Reverting while I investigate.

4 years ago[JITLink][ELF] Add support for ELF::R_X86_64_REX_GOTPCRELX relocation.
Lang Hames [Fri, 23 Oct 2020 05:42:03 +0000 (22:42 -0700)]
[JITLink][ELF] Add support for ELF::R_X86_64_REX_GOTPCRELX relocation.

No support for relaxation yet -- this will always use the GOT entry.

4 years ago[SCEV][NFC] Cache symbolic max exit count
Max Kazantsev [Fri, 23 Oct 2020 05:05:37 +0000 (12:05 +0700)]
[SCEV][NFC] Cache symbolic max exit count

We want to have a caching version of symbolic BE exit count
rather than recompute it every time we need it.

Differential Revision: https://reviews.llvm.org/D89954
Reviewed By: nikic, efriedma

4 years ago[lldb] Fix bug introduced by a00acbab45b0
Jonas Devlieghere [Fri, 23 Oct 2020 05:15:58 +0000 (22:15 -0700)]
[lldb] Fix bug introduced by a00acbab45b0

g_expression_prefix, as the name implies, must be prefixed, not
suffixed.

4 years ago[runtimes] Do not set XXX_STANDALONE_BUILD for libc++/abi/unwind
Louis Dionne [Fri, 23 Oct 2020 02:09:43 +0000 (22:09 -0400)]
[runtimes] Do not set XXX_STANDALONE_BUILD for libc++/abi/unwind

The runtimes build was lying to the various runtimes builds by setting
XXX_STANDALONE_BUILD=ON when they are really not being built standalone.
Only COMPILER_RT_STANDALONE_BUILD appears to be necessary, but setting it
for the other runtimes actually breaks everything.

Differential Revision: https://reviews.llvm.org/D90005

4 years ago[lldb] Fix missing initialization in UtilityFunction ctor (NFC)
Jonas Devlieghere [Fri, 23 Oct 2020 04:10:33 +0000 (21:10 -0700)]
[lldb] Fix missing initialization in UtilityFunction ctor (NFC)

The UtilityFunction ctor was dropping the text argument. Probably for
that reason ClangUtilityFunction was setting the parent's member
directly instead of deferring to the parent ctor. Also change the
signatures to take strings which are std::moved in place.

4 years ago[IR] Merge metadata manipulation code into Value
Serge Pavlov [Fri, 13 Sep 2019 17:21:24 +0000 (00:21 +0700)]
[IR] Merge metadata manipulation code into Value

Now there are two main classes in Value hierarchy, which support metadata,
these are Instruction and GlobalObject. They implement different APIs for
metadata manipulation, which however overlap. This change moves metadata
manipulation code into Value, so descendant classes can use this code for
their operations on metadata.

No functional changes intended.

Differential Revision: https://reviews.llvm.org/D67626

4 years agoDebugInfo: Hash DIE references (DW_OP_convert) when computing Split DWARF signatures
David Blaikie [Fri, 23 Oct 2020 03:08:54 +0000 (20:08 -0700)]
DebugInfo: Hash DIE references (DW_OP_convert) when computing Split DWARF signatures

4 years ago[CGSCC] Detect devirtualization in more cases
Arthur Eubanks [Thu, 15 Oct 2020 00:56:38 +0000 (17:56 -0700)]
[CGSCC] Detect devirtualization in more cases

When the devirtualization wrapper wraps a pass manager, it misses cases
where an individual pass devirtualizes an indirect call created by
a previous pass. For example, inlining may create a new indirect call
which is devirtualized by instcombine. Currently the devirtualization
wrapper will not see that because it only checks cgscc edges at the very
beginning and end of the pass (manager) it wraps.

This fixes some tests testing this exact behavior in the legacy PM.

This piggybacks off of updateCGAndAnalysisManagerForPass()'s detection
of promoted ref to call edges.

This supersedes one of the previous mechanisms to detect
devirtualization by keeping track of potentially promoted call
instructions via WeakTrackingVHs.

There is one more existing way of detecting devirtualization, by
checking if the number of indirect calls has decreased and the number of
direct calls has increased in a function. It handles cases where calls
to functions without definitions are promoted, and some tests rely on
that. LazyCallGraph doesn't track edges to functions without
definitions so this part can't be removed in this change.
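
The count-based check described above can be sketched as follows. The struct
and function names are hypothetical; the real wrapper gathers these counts per
function around the wrapped pass.

```cpp
#include <cassert>

// Per-function call counts sampled before and after running the wrapped pass.
struct CallCounts {
  int Direct;
  int Indirect;
};

// Heuristic from the description: devirtualization is assumed to have
// happened if indirect calls went down while direct calls went up.
inline bool looksDevirtualized(const CallCounts &Before,
                               const CallCounts &After) {
  assert(Before.Direct >= 0 && Before.Indirect >= 0);
  assert(After.Direct >= 0 && After.Indirect >= 0);
  return After.Indirect < Before.Indirect && After.Direct > Before.Direct;
}
```

Because it only looks at counts, this check also fires when the promoted
callee has no definition, which is the case LazyCallGraph edges cannot cover.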

check-llvm and check-clang with -abort-on-max-devirt-iterations-reached
on by default don't show any failures outside of tests specifically
testing it, so this change doesn't rerun passes more than necessary.
(The NPM -O2/3 pipeline runs the inliner/function simplification pipeline
under a devirtualization repeater pass up to 4 times by default.)

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89587

4 years agoSourceManager: Remove a redundant nullptr check in getNonBuiltinFilenameForID, NFC
Duncan P. N. Exon Smith [Fri, 23 Oct 2020 01:41:26 +0000 (21:41 -0400)]
SourceManager: Remove a redundant nullptr check in getNonBuiltinFilenameForID, NFC

4 years agoSourceManager: getFileEntryRefForID => getNonBuiltinFilenameForID, NFC
Duncan P. N. Exon Smith [Thu, 15 Oct 2020 22:46:25 +0000 (18:46 -0400)]
SourceManager: getFileEntryRefForID => getNonBuiltinFilenameForID, NFC

`SourceManager::getFileEntryRefForID`'s remaining callers just want the
filename component, which is coming from the `FileInfo`. Replace the API
with `getNonBuiltinFilenameForID`, which also removes another use of
`FileEntryRef::FileEntryRef` outside of `FileManager`.

Both callers are collecting file dependencies, and one of them relied on
this API to filter out built-ins (as exposed by
clang/test/ClangScanDeps/modules-full.cpp). It seems nice to continue
providing that service.

Differential Revision: https://reviews.llvm.org/D89508

4 years ago[MC] Adjust StringTableBuilder for linked Mach-O binaries
Alexander Shaposhnikov [Fri, 23 Oct 2020 01:03:40 +0000 (18:03 -0700)]
[MC] Adjust StringTableBuilder for linked Mach-O binaries

LD64 emits string tables which start with a space and a zero byte.
This diff adjusts StringTableBuilder for linked Mach-O binaries to match LD64's behavior.
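
A minimal sketch of the layout described above, assuming a simple append-only
table: the data starts with a space and a zero byte, and each added string is
stored NUL-terminated after that. `LinkedMachOStringTable` is a hypothetical
class for illustration and does none of the deduplication or tail merging that
the real StringTableBuilder performs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class LinkedMachOStringTable {
  std::string Data;

public:
  LinkedMachOStringTable() {
    // LD64-style prefix: a space followed by a zero byte.
    Data.push_back(' ');
    Data.push_back('\0');
  }

  // Appends Str plus a terminating NUL and returns the offset of Str.
  std::size_t add(const std::string &Str) {
    std::size_t Offset = Data.size();
    Data.append(Str);
    Data.push_back('\0');
    return Offset;
  }

  const std::string &data() const { return Data; }
};
```

With this layout the first real string always lands at offset 2.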

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D89561

4 years agoDebugInfo: Tidy up test for more portability to MachO and Windows
David Blaikie [Fri, 23 Oct 2020 02:07:43 +0000 (19:07 -0700)]
DebugInfo: Tidy up test for more portability to MachO and Windows

*fingers crossed*

4 years ago[Inliner] Run always-inliner in inliner-wrapper
Arthur Eubanks [Tue, 1 Sep 2020 22:55:05 +0000 (15:55 -0700)]
[Inliner] Run always-inliner in inliner-wrapper

An alwaysinline function may not get inlined in inliner-wrapper due to
the inlining order.

Previously for the following, the inliner would first inline @a() into @b(),

```
define void @a() {
entry:
  call void @b()
  ret void
}

define void @b() alwaysinline {
entry:
  br label %for.cond

for.cond:
  call void @a()
  br label %for.cond
}
```

making @b() recursive and unable to be inlined into @a(), ending at

```
define void @a() {
entry:
  call void @b()
  ret void
}

define void @b() alwaysinline {
entry:
  br label %for.cond

for.cond:
  call void @b()
  br label %for.cond
}
```

Running always-inliner first makes sure that we respect alwaysinline in more cases.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46945.

Reviewed By: davidxl, rnk

Differential Revision: https://reviews.llvm.org/D86988

4 years agoSourceManager: Change SourceManager::isMainFile to take a FileEntry, NFC
Duncan P. N. Exon Smith [Thu, 15 Oct 2020 22:32:34 +0000 (18:32 -0400)]
SourceManager: Change SourceManager::isMainFile to take a FileEntry, NFC

`SourceManager::isMainFile` does not use the filename, so it doesn't
need the full `FileEntryRef`; in fact, it's misleading to take the name
because that makes it look relevant. Simplify the API, and in the
process remove some calls to `FileEntryRef::FileEntryRef` in the unit
tests (which were blocking making that private to `SourceManager`).

Differential Revision: https://reviews.llvm.org/D89507

4 years agoSourceManager: Factor out helpers for common SLocEntry lookup pattern, NFC
Duncan P. N. Exon Smith [Thu, 15 Oct 2020 22:17:17 +0000 (18:17 -0400)]
SourceManager: Factor out helpers for common SLocEntry lookup pattern, NFC

Add helpers `getSLocEntryOrNull`, which handles the `Invalid` logic
around `getSLocEntry`, and `getSLocEntryForFile`, which also checks for
`SLocEntry::isFile`, and use them to reduce repeated code.

Differential Revision: https://reviews.llvm.org/D89503

4 years ago[OpenMP] Fixed a potential integer overflow
Shilei Tian [Fri, 23 Oct 2020 01:21:41 +0000 (21:21 -0400)]
[OpenMP] Fixed a potential integer overflow

`size_t` has a different width on 32- and 64-bit architectures, but the
computation to floor to a power of two assumed it is 64-bit, which can cause an
integer overflow. In this patch, architecture detection is added so that the
operation for 64-bit `size_t` is only performed when `size_t` is actually
64 bits wide. Thanks to Luke for reporting the issue.
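
For illustration only, here is one width-independent way to floor a `size_t`
to a power of two. It is not the approach taken in the patch, which uses
architecture detection; instead it derives the shift bound from
`sizeof(std::size_t)`, so no shift ever reaches the width of the type.
`floorToPowerOfTwo` is a hypothetical name.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Returns the largest power of two that is <= n, or 0 when n == 0.
constexpr std::size_t floorToPowerOfTwo(std::size_t n) {
  // Smear the highest set bit into every lower position. The loop bound
  // follows the real width of size_t, so it is valid on 32- and 64-bit.
  for (std::size_t shift = 1; shift < sizeof(std::size_t) * CHAR_BIT;
       shift <<= 1)
    n |= n >> shift;
  // n is now of the form 0b0..01..1; keep only its top bit.
  return n - (n >> 1);
}
```

Being constexpr, the helper can also be checked at compile time with
static_assert.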

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89878

4 years agoRevert "[MBP] Add whole chain to BlockFilterSet instead of individual BB"
Han Shen [Fri, 23 Oct 2020 00:26:01 +0000 (17:26 -0700)]
Revert "[MBP] Add whole chain to BlockFilterSet instead of individual BB"

This reverts commit adfb5415010fbbc009a4a6298cfda7a6ed4fa6d4.

This is reverted because it caused a Chrome error: https://crbug.com/1140168

4 years agoFix constant evaluation of zero-initialization of a union whose first
Richard Smith [Fri, 23 Oct 2020 00:02:28 +0000 (17:02 -0700)]
Fix constant evaluation of zero-initialization of a union whose first
FieldDecl is an unnamed bitfield.

Unnamed bitfields aren't non-static data members, so such a bitfield
isn't actually the first non-static data member.

4 years agoBitCodeFormat: update doc on new byref and mustprogress attrs; NFC
Nick Desaulniers [Thu, 22 Oct 2020 23:29:17 +0000 (16:29 -0700)]
BitCodeFormat: update doc on new byref and mustprogress attrs; NFC

Forked from review of:
https://reviews.llvm.org/D87956

4 years ago[libc++abi] Fix the standalone build after the __config_site change
Louis Dionne [Thu, 22 Oct 2020 23:11:33 +0000 (19:11 -0400)]
[libc++abi] Fix the standalone build after the __config_site change

In 5d796645, we stopped looking at the LIBCXXABI_LIBCXX_INCLUDES variable,
which broke users of the Standalone build. This patch reinstates that
variable, however it must point to the *installed* path of the libc++
headers, not the libc++ headers in the source tree (which has always
been the case, but wasn't enforced before).

If LIBCXXABI_LIBCXX_INCLUDES points to the libc++ headers in the source
tree, the `__config_site` header will fail to be found.

4 years ago[NFC][SampleFDO] Move some common stuff from SampleProfileReaderExtBinary/WriterExtBinary
Wei Mi [Thu, 15 Oct 2020 22:17:28 +0000 (15:17 -0700)]
[NFC][SampleFDO] Move some common stuff from SampleProfileReaderExtBinary/WriterExtBinary
to their parent classes.

SampleProfileReaderExtBinary/SampleProfileWriterExtBinary specify the typical
section layout currently used by SampleFDO. Currently, many of the section
readers/writers live in these two classes. However, as we expect to have more
types of SampleFDO profiles, we hope those new types of profiles can share
the common sections while configuring their own sections easily with minimal
change. That is why I move some common stuff from
SampleProfileReaderExtBinary/SampleProfileWriterExtBinary to
SampleProfileReaderExtBinaryBase/SampleProfileWriterExtBinaryBase so new
profile classes inheriting from the base classes can reuse them.

Differential Revision: https://reviews.llvm.org/D89524

4 years agoDebugInfo: Use llc rather than %llc_dwarf when also hardcoding a target triple
David Blaikie [Thu, 22 Oct 2020 22:43:39 +0000 (15:43 -0700)]
DebugInfo: Use llc rather than %llc_dwarf when also hardcoding a target triple

4 years ago[AArch64][GlobalISel] Move imm adjustment for G_ICMP to post-legalizer lowering
Jessica Paquette [Tue, 20 Oct 2020 20:17:39 +0000 (13:17 -0700)]
[AArch64][GlobalISel] Move imm adjustment for G_ICMP to post-legalizer lowering

Move the code which adjusts the immediate/predicate on a G_ICMP to
AArch64PostLegalizerLowering.

This

- Reduces the number of places we need to test for optimized compares in the
selector. We know that the compare should have been simplified by the time it
hits the selector, so we can avoid testing this in selects, brconds, etc.

- Allows us to potentially fold more compares (previously, this optimization
was only done after calling `tryFoldCompare`, this may allow us to hit some more
TST cases)

- Simplifies the selection code in `emitIntegerCompare` significantly; we can
just use an emitSUBS function.

- Allows us to avoid checking that the predicate has been updated after
`emitIntegerCompare`.

Also add a utility header file for things that may be useful in the selector
and various combiners. No need for an implementation file at this point, since
it's just one constexpr function for now. I've run into a couple cases where
having one of these would be handy, so might as well add it here. There are
a couple functions in the selector that can probably be factored out into
here.

Differential Revision: https://reviews.llvm.org/D89823

4 years ago[ELF] --warn-backrefs: save the referenced InputFile *
Fangrui Song [Thu, 22 Oct 2020 22:26:52 +0000 (15:26 -0700)]
[ELF] --warn-backrefs: save the referenced InputFile *

For a diagnostic `A refers to B` where B refers to a bitcode file, if the
symbol gets optimized out, the user may see `A refers to <internal>`; if the
symbol is retained, the user may see `A refers to lto.tmp`.

Save the referenced InputFile * in the DenseMap so that the original filename is
available in reportBackrefs().

4 years ago[gn build] (semi-manually) port 147b9497e79
Nico Weber [Thu, 22 Oct 2020 22:16:09 +0000 (18:16 -0400)]
[gn build] (semi-manually) port 147b9497e79

4 years ago[AArch64][GlobalISel] Split post-legalizer combiner to allow for lowering at -O0
Jessica Paquette [Mon, 19 Oct 2020 17:17:15 +0000 (10:17 -0700)]
[AArch64][GlobalISel] Split post-legalizer combiner to allow for lowering at -O0

There are a lot of combines in AArch64PostLegalizerCombiner which exist to
facilitate instruction matching in the selector. (E.g. matching for G_ZIP and
other shuffle vector pseudos)

It still makes sense to select these instructions at -O0.

Matching earlier in a combiner can reduce complexity in the selector
significantly. For example, a good portion of our selection code for compares
would be a lot easier to represent in a combine.

This patch moves matching combines into an "AArch64PostLegalizerLowering"
combiner which runs at all optimization levels.

Also, while we're here, improve the documentation for the
AArch64PostLegalizerCombiner, and fix up the filepath in its file comment.

And also add a 'r' which somehow got dropped from a bunch of function names.

https://reviews.llvm.org/D89820

4 years ago[libTooling] Add function to Transformer to create a no-op edit.
Yitzhak Mandelbaum [Thu, 22 Oct 2020 14:03:59 +0000 (14:03 +0000)]
[libTooling] Add function to Transformer to create a no-op edit.

This functionality is commonly needed in clang tidy checks (based on
transformer) that only print warnings, without suggesting any edits. The no-op
edit allows the user to associate a diagnostic message with a source location.

Differential Revision: https://reviews.llvm.org/D89961

4 years ago[SourceManager] Avoid copying SLocEntry in computeMacroArgsCache
Jan Korous [Thu, 22 Oct 2020 21:18:13 +0000 (14:18 -0700)]
[SourceManager] Avoid copying SLocEntry in computeMacroArgsCache

Follow-up to e7870223d8b5

Differential Revision: https://reviews.llvm.org/D86230

4 years ago[clang][Frontend] Add missing error handling
LemonBoy [Thu, 22 Oct 2020 21:13:07 +0000 (14:13 -0700)]
[clang][Frontend] Add missing error handling

Some early errors during the ASTUnit creation were not transferred to the `FailedParseDiagnostic` so when the code in `LoadFromCommandLine` swaps its content with the content of `StoredDiagnostics` they cannot be retrieved by the user in any way.

Reviewed By: andrewrk, dblaikie

Differential Revision: https://reviews.llvm.org/D78658

4 years ago[libc++] Allow running the tests in the experimental runtimes-only build
Louis Dionne [Thu, 22 Oct 2020 21:03:33 +0000 (17:03 -0400)]
[libc++] Allow running the tests in the experimental runtimes-only build

4 years ago[llvm-objcopy][MachO] Fix the calculation of the output size
Alexander Shaposhnikov [Thu, 22 Oct 2020 20:25:13 +0000 (13:25 -0700)]
[llvm-objcopy][MachO] Fix the calculation of the output size

Virtual sections do not contribute to the final output size.
This diff fixes the corresponding calculations in the method MachOWriter::totalSize.
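
The idea can be sketched as below, assuming the output size is bounded by the
furthest file offset any section reaches. This is a simplification: the real
MachOWriter::totalSize also accounts for headers, load commands and link-edit
data, and `Section` / `totalFileSize` here are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Section {
  uint64_t Offset; // File offset of the section contents.
  uint64_t Size;   // Size of the section in memory.
  bool IsVirtual;  // True for sections with no file contents (e.g. zerofill).
};

// Returns the end offset of the last section that occupies file space.
inline uint64_t totalFileSize(const std::vector<Section> &Sections) {
  uint64_t End = 0;
  for (const Section &S : Sections) {
    if (S.IsVirtual)
      continue; // Virtual sections take no bytes in the output file.
    assert(S.Offset + S.Size >= S.Offset && "offset + size overflowed");
    End = std::max(End, S.Offset + S.Size);
  }
  return End;
}
```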

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D89661

4 years ago[GWP-ASan] Move random-related code in the allocator (redo)
Kostya Kortchinsky [Thu, 22 Oct 2020 20:40:12 +0000 (13:40 -0700)]
[GWP-ASan] Move random-related code in the allocator (redo)

This is a redo of D89908, which triggered some `-Werror=conversion`
errors with GCC due to assignments to the 31-bit variable.

This CL adds to the original one a 31-bit mask variable that is used
at every assignment to silence the warning.
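
A sketch of the masking technique, modelled on the expression shown in the GCC
error for the original change. The struct, the mask name and the helper are
hypothetical stand-ins rather than the actual GWP-ASan declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for state that packs a 31-bit counter with a flag.
struct ThreadLocalState {
  uint32_t NextSampleCounter : 31;
  bool RecursiveGuard : 1;
};

// All 31 low bits set; applied on every assignment to the bit-field.
constexpr uint32_t SampleCounterMask = (1U << 31) - 1;

inline void setNextSampleCounter(ThreadLocalState &State, uint32_t Random,
                                 uint32_t AdjustedSampleRatePlusOne) {
  assert(AdjustedSampleRatePlusOne > 1 && "modulus would be zero");
  // The explicit mask makes the narrowing to 31 bits visible to the compiler,
  // which is what silences -Werror=conversion on the assignment.
  State.NextSampleCounter =
      ((Random % (AdjustedSampleRatePlusOne - 1)) + 1) & SampleCounterMask;
}
```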

Differential Revision: https://reviews.llvm.org/D89984

4 years ago[DomTree] Make assert more precise
Nikita Popov [Thu, 22 Oct 2020 20:40:06 +0000 (22:40 +0200)]
[DomTree] Make assert more precise

Per asbirlea's comment, assert that only instructions, constants
and arguments are passed to this API. Simply returning true
would not be correct for special Value subclasses like MemoryAccess.

4 years ago[BasicAA] Only add visited phi blocks temporarily
Nikita Popov [Thu, 22 Oct 2020 19:50:18 +0000 (21:50 +0200)]
[BasicAA] Only add visited phi blocks temporarily

Visited phi blocks only need to be added for the duration of the
recursive alias queries, they should not leak into following code.

Once again, while this also improves analysis precision, this is
mainly intended to clarify the applicability scope of VisitedPhiBBs.

4 years ago[AIX] Emit error for -G option on AIX
Xiangling Liao [Wed, 21 Oct 2020 20:50:36 +0000 (16:50 -0400)]
[AIX] Emit error for -G option on AIX

1. Emit error for -G driver option on AIX
2. Adjust cmake file to use -Wl,-G instead of -G

On AIX, the legacy XL compiler uses -G to produce a shared object enabled
for use with the run-time linker, which differs from what the option
means in Clang. For other targets, -G does not map to a different
functionality in their legacy compilers, so this error matters most
on AIX.

Differential Revision: https://reviews.llvm.org/D89897

4 years ago[BasicAA] Don't track visited blocks for phi-phi alias query
Nikita Popov [Thu, 22 Oct 2020 19:44:09 +0000 (21:44 +0200)]
[BasicAA] Don't track visited blocks for phi-phi alias query

We only need the VisitedPhiBBs to disambiguate comparisons of
values from two different loop iterations. If we're comparing
two phis from the same basic block in lock-step, the compared
values will always be on the same iteration.

While this also increases precision, this is mainly intended
to clarify the scope of VisitedPhiBBs.

4 years agoInitial support for vectorization using Libmvec (GLIBC vector math library)
Venkataramanan Kumar [Thu, 22 Oct 2020 20:00:34 +0000 (16:00 -0400)]
Initial support for vectorization using Libmvec (GLIBC vector math library)

Differential Revision: https://reviews.llvm.org/D88154

4 years agoRevert "[GWP-ASan] Move random-related code in the allocator"
Nikita Popov [Thu, 22 Oct 2020 19:56:37 +0000 (21:56 +0200)]
Revert "[GWP-ASan] Move random-related code in the allocator"

This reverts commit 9903b0586cfb76ef2401c342501e61e1bd3daa0f.

Causes build failures (on GCC 10.2) with the following error:

In file included from /home/nikic/llvm-project/compiler-rt/lib/scudo/standalone/combined.h:29,
                 from /home/nikic/llvm-project/compiler-rt/lib/scudo/standalone/allocator_config.h:12,
                 from /home/nikic/llvm-project/compiler-rt/lib/scudo/standalone/wrappers_cpp.cpp:14:
/home/nikic/llvm-project/compiler-rt/lib/scudo/standalone/../../gwp_asan/guarded_pool_allocator.h: In member function ‘bool gwp_asan::GuardedPoolAllocator::shouldSample()’:
/home/nikic/llvm-project/compiler-rt/lib/scudo/standalone/../../gwp_asan/guarded_pool_allocator.h:82:69: error: conversion from ‘uint32_t’ {aka ‘unsigned int’} to ‘unsigned int:31’ may change value [-Werror=conversion]
   82 |           (getRandomUnsigned32() % (AdjustedSampleRatePlusOne - 1)) + 1;
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~

4 years ago[BasicAA] Add additional phi tests (NFC)
Nikita Popov [Wed, 21 Oct 2020 07:32:17 +0000 (09:32 +0200)]
[BasicAA] Add additional phi tests (NFC)

4 years ago[clangd] Get rid of llvm::Optional in Remote- and LocalIndexRoot; NFC
Kirill Bobyrev [Thu, 22 Oct 2020 19:47:48 +0000 (21:47 +0200)]
[clangd] Get rid of llvm::Optional in Remote- and LocalIndexRoot; NFC

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D89852

4 years ago[SourceManager] Skip module maps when searching files for macro arguments
Jan Korous [Wed, 19 Aug 2020 05:34:37 +0000 (22:34 -0700)]
[SourceManager] Skip module maps when searching files for macro arguments

Differential Revision: https://reviews.llvm.org/D86230

4 years ago[clangd] Give the server information about client's remote index protocol version
Kirill Bobyrev [Thu, 22 Oct 2020 19:15:20 +0000 (21:15 +0200)]
[clangd] Give the server information about client's remote index protocol version

Also introduce Protobuf package versioning, which will help to deal
with breaking changes. Introducing the package version is itself a breaking
change, so clients and servers need to be updated.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D89862

4 years ago[test] HotColdSplit: cover use of opaque pointer type
Vedant Kumar [Thu, 22 Oct 2020 19:24:15 +0000 (12:24 -0700)]
[test] HotColdSplit: cover use of opaque pointer type

Add a test to cover the case where an extracted block contains a
lifetime marker for a pointer with an opaque type.

4 years agoRevert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)"
Vedant Kumar [Thu, 22 Oct 2020 18:11:12 +0000 (11:11 -0700)]
Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)"

This reverts commit 26ee8aff2b85ee28a2b2d0b1860d878b512fbdef.

It's necessary to bitcast the pointer operand of a lifetime marker if
it has an opaque pointer type.

rdar://70560161

4 years agoRevert "Revert "[mlir] Convert from Async dialect to LLVM coroutines""
Lei Zhang [Thu, 22 Oct 2020 19:20:42 +0000 (15:20 -0400)]
Revert "Revert "[mlir] Convert from Async dialect to LLVM coroutines""

This reverts commit 4986d5eaff359081a867def1c6a2e1147dbb2ad6 with
proper patches to CMakeLists.txt:

- Add MLIRAsync as a dependency to MLIRAsyncToLLVM
- Add Coroutines as a dependency to MLIRExecutionEngine

4 years agoRevert "[mlir] Convert from Async dialect to LLVM coroutines"
Mehdi Amini [Thu, 22 Oct 2020 19:11:56 +0000 (19:11 +0000)]
Revert "[mlir] Convert from Async dialect to LLVM coroutines"

This reverts commit a8b0ae3bddee311cbc97801089a95702f32773f8
and commit f8fcff5a9d7ee948add3f28382d4ced5710edaaf.

The build with SHARED_LIBRARY=ON is broken.

4 years agoPort -instnamer to NPM
Arthur Eubanks [Thu, 22 Oct 2020 05:55:34 +0000 (22:55 -0700)]
Port -instnamer to NPM

Some clang tests use this.

Reviewed By: akhuang

Differential Revision: https://reviews.llvm.org/D89931

4 years agoDWARFv5: Disable DW_OP_convert for configurations that don't yet support it
David Blaikie [Thu, 22 Oct 2020 18:47:35 +0000 (11:47 -0700)]
DWARFv5: Disable DW_OP_convert for configurations that don't yet support it

Testing reveals that lldb and gdb have some problems with supporting
DW_OP_convert - gdb with Split DWARF tries to resolve the CU-relative
DIE offset relative to the skeleton DIE. lldb tries to treat the offset
as absolute, which judging by the llvm-dsymutil support for
DW_OP_convert, I guess works OK in MachO? (though probably llvm-dsymutil
is producing invalid DWARF by resolving the relative reference to an
absolute one?).

Specifically this disables DW_OP_convert usage in DWARFv5 if:
* Tuning for GDB and using Split DWARF
* Tuning for LLDB and not targeting MachO
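
The two conditions above can be read as a single predicate. The following is a minimal sketch only: `DebuggerTuning`, `DebugConfig` and `mayUseOpConvert` are hypothetical names chosen for illustration and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical types for illustration; not LLVM's actual API.
enum class DebuggerTuning { GDB, LLDB, Other };

struct DebugConfig {
  unsigned DwarfVersion;
  DebuggerTuning Tuning;
  bool SplitDwarf;
  bool TargetIsMachO;
};

// Returns true if DW_OP_convert may be used under the rules listed above.
bool mayUseOpConvert(const DebugConfig &C) {
  if (C.DwarfVersion < 5)
    return false; // The rules above only concern DWARFv5.
  if (C.Tuning == DebuggerTuning::GDB && C.SplitDwarf)
    return false; // GDB + Split DWARF resolves the offset incorrectly.
  if (C.Tuning == DebuggerTuning::LLDB && !C.TargetIsMachO)
    return false; // LLDB treats the offset as absolute outside MachO.
  return true;
}
```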

4 years ago[GWP-ASan] Move random-related code in the allocator
Kostya Kortchinsky [Wed, 21 Oct 2020 20:04:09 +0000 (13:04 -0700)]
[GWP-ASan] Move random-related code in the allocator

We need to have all thread specific data packed into a single `uintptr_t`
for the upcoming Fuchsia support. We can move the `RandomState` into the
`ThreadLocalPackedVariables`, reducing the size of `NextSampleCounter`
to 31 bits (or we could reduce `RandomState` to 31 bits).

We move `getRandomUnsigned32` into the platform agnostic part of the
class, and `initPRNG` in the platform specific part.

`ScopedBoolean` is replaced by actual assignments since non-const
references to bitfields are prohibited.

`random.{h,cpp}` are removed.
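
As a rough illustration of the packing described above, the sketch below puts a 32-bit PRNG state, a 31-bit sample counter and a 1-bit guard into 64 bits. The field names follow the commit message; the seed value, the xorshift generator, the helper `resetSampleCounter` and the assumption of a 64-bit `uintptr_t` are illustrative choices, not necessarily what the patch does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: all per-thread state in 64 bits.
struct ThreadLocalPackedVariables {
  ThreadLocalPackedVariables()
      : RandomState(0xff82eb50u), NextSampleCounter(0), RecursiveGuard(0) {}
  uint32_t RandomState;            // PRNG state (seed value is arbitrary)
  uint32_t NextSampleCounter : 31; // reduced from 32 to 31 bits
  uint32_t RecursiveGuard : 1;     // set by plain assignment, no reference
};
static_assert(sizeof(ThreadLocalPackedVariables) <= sizeof(uint64_t),
              "per-thread state is assumed to fit in 64 bits");

// Platform-agnostic xorshift32 step (an assumed generator for this sketch).
uint32_t getRandomUnsigned32(ThreadLocalPackedVariables &TL) {
  uint32_t X = TL.RandomState;
  X ^= X << 13;
  X ^= X >> 17;
  X ^= X << 5;
  TL.RandomState = X;
  return X;
}

// Picks the next sample point in [1, AdjustedSampleRatePlusOne - 1].
// Requires AdjustedSampleRatePlusOne >= 2.
void resetSampleCounter(ThreadLocalPackedVariables &TL,
                        uint32_t AdjustedSampleRatePlusOne) {
  uint32_t Next =
      (getRandomUnsigned32(TL) % (AdjustedSampleRatePlusOne - 1)) + 1;
  // Mask explicitly so the narrowing into the 31-bit field is visible.
  TL.NextSampleCounter = Next & 0x7fffffffu;
}
```

Because a bitfield cannot be bound to a non-const reference, a guard helper that stores a `bool &` does not work here; code using this layout would write `TL.RecursiveGuard = 1;` before the guarded region and `TL.RecursiveGuard = 0;` after it.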

Differential Revision: https://reviews.llvm.org/D89908

4 years ago[libc++] Drop old workaround for iostreams instantiations missing from the dylib
Louis Dionne [Thu, 22 Oct 2020 18:43:41 +0000 (14:43 -0400)]
[libc++] Drop old workaround for iostreams instantiations missing from the dylib

On old Apple platforms (pre 10.9), we couldn't rely on the iostreams
explicit instantiations being part of the dylib. However, we don't
support back-deploying to such old deployment targets anymore, so the
workaround can be dropped.

4 years ago[InstCombine][NFC] Use ConstantExpr::getBinOpIdentity
Layton Kifer [Thu, 22 Oct 2020 18:42:09 +0000 (20:42 +0200)]
[InstCombine][NFC] Use ConstantExpr::getBinOpIdentity

Delete duplicate implementation getSelectFoldableConstant and
replace with ConstantExpr::getBinOpIdentity.

Differential Revision: https://reviews.llvm.org/D89839

4 years ago[PatternMatch] Add new FP matchers. NFC.
Jay Foad [Fri, 16 Oct 2020 12:54:19 +0000 (13:54 +0100)]
[PatternMatch] Add new FP matchers. NFC.

This adds matchers m_NonNaN, m_NonInf, m_Finite and m_NonZeroFP as well
as generic support for binding the matched value to an APFloat.

I tried to follow the existing convention of using an FP suffix for
predicates like zero and non-zero, which could be confused with the
integer versions, but not for predicates which are clearly already
FP-specific.
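
For readers unfamiliar with the naming, the sketch below shows the kind of predicate each matcher name suggests. It uses plain `double` and `<cmath>` rather than APFloat, and the function names are hypothetical; the exact semantics of the new matchers should be checked against the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-alone predicates inferred from the matcher names (assumptions).
bool isNonNaN(double X) { return !std::isnan(X); }
bool isNonInf(double X) { return !std::isinf(X); }
bool isFiniteFP(double X) { return std::isfinite(X); }
// -0.0 compares equal to 0.0, so both zeros are rejected here.
bool isNonZeroFP(double X) { return X != 0.0; }
```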

Differential Revision: https://reviews.llvm.org/D89038

4 years ago[MemCpyOpt] Move GEP during call slot optimization
Nikita Popov [Sat, 17 Oct 2020 13:54:52 +0000 (15:54 +0200)]
[MemCpyOpt] Move GEP during call slot optimization

When performing a call slot optimization to a GEP destination, it
will currently usually fail, because the GEP is directly before the
memcpy and as such does not dominate the call. We should move it
above the call if that satisfies the domination requirement.

I think that a constant-index GEP is the only useful thing to move
here, as otherwise isDereferenceablePointer couldn't look through
it anyway. As such I'm not trying to generalize this further.

Differential Revision: https://reviews.llvm.org/D89623

4 years ago[NFC][PartialInliner]: Clean up code
Ettore Tiotto [Thu, 22 Oct 2020 17:59:32 +0000 (13:59 -0400)]
[NFC][PartialInliner]: Clean up code

Make member functions const where possible, use LLVM_DEBUG to print debug traces
rather than a custom option, pass by reference to avoid null checking, ...

Reviewed By: fhann

Differential Revision: https://reviews.llvm.org/D89895

4 years agoHowToReleaseLLVM: Clean up document and remove references to SVN
Tom Stellard [Thu, 22 Oct 2020 18:33:58 +0000 (11:33 -0700)]
HowToReleaseLLVM: Clean up document and remove references to SVN

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D80395

4 years ago[InstSimplify] add tests for ctpop constant range; NFC
Sanjay Patel [Thu, 22 Oct 2020 17:18:38 +0000 (13:18 -0400)]
[InstSimplify] add tests for ctpop constant range; NFC

4 years ago[SystemZ][z/OS] Set short-enums as the default for z/OS
Jonathan Crowther [Thu, 22 Oct 2020 18:13:26 +0000 (14:13 -0400)]
[SystemZ][z/OS] Set short-enums as the default for z/OS

This patch sets short-enums to be the default for z/OS.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D89801

4 years agoclang/Basic: Remove ContentCache::getRawBuffer, NFC
Duncan P. N. Exon Smith [Thu, 15 Oct 2020 02:34:19 +0000 (22:34 -0400)]
clang/Basic: Remove ContentCache::getRawBuffer, NFC

Replace `ContentCache::getRawBuffer` with `getBufferDataIfLoaded` and
`getBufferIfLoaded`, excising another accessor for the underlying
`MemoryBuffer*` in favour of `StringRef` and `MemoryBufferRef`.

Differential Revision: https://reviews.llvm.org/D89445

4 years ago[TableGen] Update documents to make them more complete
Paul C. Anagnostopoulos [Thu, 22 Oct 2020 14:20:10 +0000 (10:20 -0400)]
[TableGen] Update documents to make them more complete

Differential Revision: https://reviews.llvm.org/D89962

4 years ago[InstCombine] Remove dbg.values describing contents of dead allocas
Vedant Kumar [Tue, 20 Oct 2020 18:59:26 +0000 (11:59 -0700)]
[InstCombine] Remove dbg.values describing contents of dead allocas

When InstCombine removes an alloca, it erases the dbg.{addr,declare}
instructions which refer to the alloca. It would be better to instead
remove all debug intrinsics which describe the contents of the dead
alloca, namely all dbg.value(<dead alloca>, ..., DW_OP_deref)'s.

This effectively undoes work performed in an InstCombine run earlier in
the pipeline by LowerDbgDeclare, which inserts DW_OP_deref dbg.values
before CallInst users of an alloca. The motivating example looks like:

```
  define void @foo(i32 %0) {
    %a = alloca i32              ; This alloca is erased.
    store i32 %0, i32* %a
    dbg.value(i32 %0, "arg0")    ; This dbg.value survives.
    dbg.value(i32* %a, "arg0", DW_OP_deref)
    call void @trivially_inlinable_no_op(i32* %a)
    ret void
  }
```

If the DW_OP_deref dbg.value is not erased, it becomes dbg.value(undef)
after inlining, making "arg0" unavailable. But we already have dbg.value
descriptions of the alloca's value (from LowerDbgDeclare), so the
DW_OP_deref dbg.value cannot serve its purpose of describing an
initialization of the alloca by some callee. It invalidates other useful
dbg.values, causing large gaps in location coverage, so we should delete
it (even though doing so may cause stale dbg.values to appear, if
there's a dead store to `%a` in @trivially_inlinable_no_op).

OTOH, it wouldn't be correct to delete all dbg.value descriptions of an
alloca. Note that it's possible to describe a variable that takes on
different pointer values, e.g.:

```
  void use(int *);
  void t(int a, int b) {
    int *local = &a;     // dbg.value(i32* %a.addr, "local")
    local = &b;          // dbg.value(i32* undef, "local")
    use(&a);             //           (note: %b.addr is optimized out)
    local = &a;          // dbg.value(i32* %a.addr, "local")
  }
```

In this example, the alloca for "b" is erased, but we need to describe
the value of "local" as <unavailable> before the call to "use". This
prevents "local" from appearing to be equal to "&a" at the callsite.

rdar://66592859

Differential Revision: https://reviews.llvm.org/D85555

4 years agoAMDGPU: Cleanup MIR test
Matt Arsenault [Thu, 22 Oct 2020 16:10:36 +0000 (12:10 -0400)]
AMDGPU: Cleanup MIR test

Remove registers section and compact block/register numbers

4 years agoRevert "[Docs] Clarify that FunctionPasses can't add/remove declarations"
Arthur Eubanks [Thu, 22 Oct 2020 16:49:42 +0000 (09:49 -0700)]
Revert "[Docs] Clarify that FunctionPasses can't add/remove declarations"

This reverts commit 710676cf3a3c6f6ddf2f18e24cac017d20dac1ff.

4 years ago[ELF] Set SHF_INFO_LINK for .rel[a].plt and .rel[a].dyn
Fangrui Song [Thu, 22 Oct 2020 16:48:04 +0000 (09:48 -0700)]
[ELF] Set SHF_INFO_LINK for .rel[a].plt and .rel[a].dyn

The ELF spec says

> If the sh_flags field for this section header includes the attribute SHF_INFO_LINK, then this member represents a section header table index.

Set SHF_INFO_LINK so that binary manipulation tools know that sh_info is
a section header table index rather than some other value (e.g. the number of
local symbols in the case of SHT_SYMTAB/SHT_DYNSYM).
We have already added SHF_INFO_LINK for --emit-relocs retained SHT_REL[A].

For example, we can teach llvm-objcopy to preserve the section index of the sh_info referenced section if
SHF_INFO_LINK is set. (GNU objcopy recognizes .rel[a].plt and updates
sh_info even if SHF_INFO_LINK is not set).
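
A minimal sketch of how a tool might use the flag when it renumbers sections. The constant values come from the ELF specification; `SectionHeader`, `infoIsSectionIndex` and `remapInfo` are hypothetical names for this illustration and do not describe llvm-objcopy's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values from the ELF specification, defined locally for the sketch.
constexpr uint32_t ShtSymtab = 2;
constexpr uint32_t ShtRela = 4;
constexpr uint64_t ShfAlloc = 0x2;
constexpr uint64_t ShfInfoLink = 0x40;

// Trimmed-down section header with only the fields used here.
struct SectionHeader {
  uint32_t sh_type;
  uint64_t sh_flags;
  uint32_t sh_info;
};

// True if sh_info must be interpreted as a section header table index.
bool infoIsSectionIndex(const SectionHeader &S) {
  return (S.sh_flags & ShfInfoLink) != 0;
}

// After sections are renumbered, rewrite sh_info only when it is an index.
// OldToNew maps old section indices to new ones.
uint32_t remapInfo(const SectionHeader &S,
                   const std::vector<uint32_t> &OldToNew) {
  if (infoIsSectionIndex(S) && S.sh_info < OldToNew.size())
    return OldToNew[S.sh_info];
  return S.sh_info; // e.g. a symbol count: leave untouched
}
```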

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D89828

4 years agoRevert "[lldb] Explicitly use the configuration architecture when building test execu...
Raphael Isemann [Thu, 22 Oct 2020 16:42:19 +0000 (18:42 +0200)]
Revert "[lldb] Explicitly use the configuration architecture when building test executables"

This reverts commit 41185226f6d80663b4a1064c6f47581ee567d78d.

Causes TestQuoting to fail on Windows.

4 years ago[DomTree] Accept Value as Def (NFC)
Nikita Popov [Sat, 17 Oct 2020 18:54:53 +0000 (20:54 +0200)]
[DomTree] Accept Value as Def (NFC)

Non-instruction defs like arguments, constants or global values
always dominate all instructions/uses inside the function. This
case currently needs to be treated separately by the caller, see
https://reviews.llvm.org/D89623#inline-832818 for an example.

This patch makes the dominator tree APIs accept a Value instead of
an Instruction and always returns true for the non-Instruction case.

A complication here is that BasicBlocks are also Values. For that
reason we can't support the dominates(Value *, BasicBlock *)
variant, as it would conflict with dominates(BasicBlock *, BasicBlock *),
which has different semantics. For the other two APIs we assert
that the passed value is not a BasicBlock.
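
The dispatch described above can be sketched with toy types. Nothing below is LLVM code: `Value`, `instructionDominates` and `dominates` are stand-ins, and the instruction case is reduced to position order inside a single straight-line block.

```cpp
#include <cassert>

// Toy stand-in for llvm::Value; Pos is only meaningful for instructions.
struct Value {
  enum Kind { Argument, Constant, Instruction } K;
  int Pos;
};

// Placeholder for the real instruction query: both instructions are assumed
// to be in one straight-line block, so the earlier one dominates the later.
bool instructionDominates(const Value &Def, const Value &User) {
  return Def.Pos < User.Pos;
}

// Accepts any Value as Def and answers true for non-instruction defs.
bool dominates(const Value &Def, const Value &User) {
  if (Def.K != Value::Instruction)
    return true; // arguments, constants, globals dominate every use
  return instructionDominates(Def, User);
}
```

With this shape, a caller no longer needs its own special case for arguments and constants before asking the dominance question.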

Differential Revision: https://reviews.llvm.org/D89632

4 years ago[SLP] Add tests with selects that can be turned into min/max.
Florian Hahn [Thu, 22 Oct 2020 08:39:05 +0000 (09:39 +0100)]
[SLP] Add tests with selects that can be turned into min/max.

AArch64 does not have a flexible vector select instruction. In some
cases, however, the selects can be turned into min/max, for which there
are dedicated vector instructions on AArch64.

This patch adds some tests for such cases.
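
At the source level, the pattern in question looks roughly like the sketch below: a compare feeding a select that picks one of the compared operands is equivalent to a min (or max). The function names are hypothetical, and both inputs are assumed to have the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Element-wise select: compare, then pick one of the compared values.
std::vector<int32_t> selectMin(const std::vector<int32_t> &A,
                               const std::vector<int32_t> &B) {
  std::vector<int32_t> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = A[I] < B[I] ? A[I] : B[I];
  return Out;
}

// The same computation expressed directly as a signed minimum.
std::vector<int32_t> directMin(const std::vector<int32_t> &A,
                               const std::vector<int32_t> &B) {
  std::vector<int32_t> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = std::min(A[I], B[I]);
  return Out;
}
```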

4 years ago[AMDGPU] Add amdgpu specific loop threshold metadata
Tim Corringham [Tue, 28 Jul 2020 18:01:03 +0000 (19:01 +0100)]
[AMDGPU] Add amdgpu specific loop threshold metadata

Add new loop metadata amdgpu.loop.unroll.threshold to allow the initial AMDGPU
specific unroll threshold value to be specified on a loop by loop basis.

The intention is to be able to allow more nuanced hints, e.g. specifying a
low threshold value to indicate that a loop may be unrolled if cheap enough
rather than using the all or nothing llvm.loop.unroll.disable metadata.

Differential Revision: https://reviews.llvm.org/D84779

4 years ago[gn build] Add missing clangd dependencies
Arthur Eubanks [Sun, 18 Oct 2020 20:35:58 +0000 (13:35 -0700)]
[gn build] Add missing clangd dependencies

Fixes
$ ninja obj/build/rel/gen/clang-tools-extra/clangd/CompletionModel.CompletionModel.obj

Some tablegen include files from clang/include/clang/AST and
clang/include/clang/Sema need to be generated before CompletionModel is
compiled.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89657

4 years ago[Docs] Clarify that FunctionPasses can't add/remove declarations
Arthur Eubanks [Wed, 21 Oct 2020 15:55:50 +0000 (08:55 -0700)]
[Docs] Clarify that FunctionPasses can't add/remove declarations

In preparation for potential future concurrency, a FunctionPass
shouldn't modify anything at the module level that other FunctionPasses
can also modify.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89890

4 years ago[lldb/DWARF] Add support for DW_OP_implicit_value
Med Ismail Bennani [Wed, 21 Oct 2020 01:54:48 +0000 (03:54 +0200)]
[lldb/DWARF] Add support for DW_OP_implicit_value

This patch completes https://reviews.llvm.org/D83560. Now that the
compiler can emit `DW_OP_implicit_value` into DWARF expressions, lldb
needs to learn to read these opcodes for variable inspection and
expression evaluation.

This implicit location descriptor specifies an immediate value with two
operands: the length (ULEB128) followed by a block representing the value
in the target memory representation.
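
The encoding described above can be decoded as in the following sketch. The opcode value 0x9e is the one assigned to `DW_OP_implicit_value` by the DWARF specification; `decodeULEB128` and `parseImplicitValue` are hypothetical helpers written for this illustration and are not lldb's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t DwOpImplicitValue = 0x9e; // per the DWARF specification

// Decodes an unsigned LEB128 number at Offset and advances Offset.
// Returns false if the data ends before the number terminates.
bool decodeULEB128(const std::vector<uint8_t> &Data, std::size_t &Offset,
                   uint64_t &Result) {
  Result = 0;
  unsigned Shift = 0;
  while (Offset < Data.size()) {
    uint8_t Byte = Data[Offset++];
    if (Shift < 64)
      Result |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      return true;
  }
  return false;
}

// Parses `DW_OP_implicit_value <ULEB128 length> <block>` at the start of
// Expr and copies the block (the value's bytes in target representation).
bool parseImplicitValue(const std::vector<uint8_t> &Expr,
                        std::vector<uint8_t> &Block) {
  if (Expr.empty() || Expr[0] != DwOpImplicitValue)
    return false;
  std::size_t Offset = 1;
  uint64_t Length = 0;
  if (!decodeULEB128(Expr, Offset, Length))
    return false;
  if (Length > Expr.size() - Offset)
    return false; // truncated block
  auto First = Expr.begin() + static_cast<std::ptrdiff_t>(Offset);
  Block.assign(First, First + static_cast<std::ptrdiff_t>(Length));
  return true;
}
```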

rdar://67406091

Differential revision: https://reviews.llvm.org/D89842

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

4 years ago[OpenCL] Remove unused extensions
Marco Antognini [Mon, 12 Oct 2020 14:17:03 +0000 (15:17 +0100)]
[OpenCL] Remove unused extensions

Many non-language extensions are defined but also unused. This patch
removes them with their tests as they do not require compiler support.

The cl_khr_select_fprounding_mode extension is also removed because it
has been deprecated since OpenCL 1.1 and Clang doesn't have any specific
support for it.

The cl_khr_context_abort extension is only referred to in "The OpenCL
Specification", version 1.2 and 2.0, in Table 4.3, but no specification
is provided in "The OpenCL Extension Specification" for these versions.
Because it is both unused in Clang and lacks specification, this
extension is removed.

The following extensions are platform extensions that bring new OpenCL
APIs but do not impact the kernel language nor require compiler support.
They are therefore removed.

- cl_khr_gl_sharing, introduced in OpenCL 1.0

- cl_khr_icd, introduced in OpenCL 1.2

- cl_khr_gl_event, introduced in OpenCL 1.1
Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquireGLObjects.
Hence, they cannot be used on the device side and the extension does
not impact the kernel language.

- cl_khr_d3d10_sharing, introduced in OpenCL 1.1

- cl_khr_d3d11_sharing, introduced in OpenCL 1.2

- cl_khr_dx9_media_sharing, introduced in OpenCL 1.2

- cl_khr_image2d_from_buffer, introduced in OpenCL 1.2

- cl_khr_initialize_memory, introduced in OpenCL 1.2

- cl_khr_gl_depth_images, introduced in OpenCL 1.2
Note: this extension is related to cl_khr_depth_images but only the
latter adds new features to the kernel language.

- cl_khr_spir, introduced in OpenCL 1.2

- cl_khr_egl_event, introduced in OpenCL 1.2
Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquire* API
functions. Hence, they cannot be used on the device side and the
extension does not impact the kernel language.

- cl_khr_egl_image, introduced in OpenCL 1.2

- cl_khr_terminate_context, introduced in OpenCL 1.2

The minimum required OpenCL version used in OpenCLExtensions.def for
these extensions is not always correct. Removing them addresses that
issue.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D89372

4 years ago[HIP] Fix HIP rounding math intrinsics
Aaron En Ye Shi [Thu, 22 Oct 2020 15:07:47 +0000 (15:07 +0000)]
[HIP] Fix HIP rounding math intrinsics

The __ocml_*_rte_f32 and __ocml_*_rte_f64 functions are not
available if OCML_BASIC_ROUNDED_OPERATIONS is not defined.

Reviewed By: b-sumner, yaxunl

Fixes: SWDEV-257235

Differential Revision: https://reviews.llvm.org/D89966

4 years ago[NFC][MC] Use MCRegister for ReachingDefAnalysis APIs
Mircea Trofin [Wed, 21 Oct 2020 20:59:45 +0000 (13:59 -0700)]
[NFC][MC] Use MCRegister for ReachingDefAnalysis APIs

Also update the users of the APIs, and make a small drive-by change to
RDFRegister.cpp

Differential Revision: https://reviews.llvm.org/D89912

4 years ago[LoopRotate][NPM] Disable header duplication under -Oz
Arthur Eubanks [Thu, 22 Oct 2020 05:08:58 +0000 (22:08 -0700)]
[LoopRotate][NPM] Disable header duplication under -Oz

It was already disabled under -Oz in
buildFunctionSimplificationPipeline(), but not in
buildModuleOptimizationPipeline()/addPGOInstrPasses().

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89927

4 years ago[lldb] Fix a regression introduced by D75730
Jonas Devlieghere [Thu, 22 Oct 2020 15:32:05 +0000 (08:32 -0700)]
[lldb] Fix a regression introduced by D75730

D75730 introduced a new Range class to simplify the Disassembler API
and reduce duplication. It unintentionally broke the
SBFrame::Disassemble functionality because it unconditionally converts
the number of instructions to a Range{Limit::Instructions,
num_instructions}. This is subtly different from the previous behavior:
now we pass a Range and assume it's valid in the callee, whereas the
original code would propagate num_instructions and the callee would
check the value and decide between disassembling instructions or
bytes.

Unfortunately the existing test was not particularly strict:

  disassembly = frame.Disassemble()
  self.assertNotEqual(len(disassembly), 0, "Disassembly was empty.")

This would pass because without this patch we'd disassemble zero
instructions, resulting in an error:

  (lldb) script print(lldb.frame.Disassemble())
  error: error reading data from section __text

Differential revision: https://reviews.llvm.org/D89925

4 years ago[mlir] Do not start threads in AsyncRuntime
Eugene Zhulenev [Thu, 22 Oct 2020 15:17:53 +0000 (08:17 -0700)]
[mlir] Do not start threads in AsyncRuntime

pthreads is not enabled for all builds by default

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D89967

4 years ago[MemProf] Allow the binary to specify the profile output filename
Teresa Johnson [Tue, 29 Sep 2020 22:31:11 +0000 (15:31 -0700)]
[MemProf] Allow the binary to specify the profile output filename

This will allow the output directory to be specified by a build time
option, similar to the directory specified for regular PGO profiles via
-fprofile-generate=. The memory profiling instrumentation pass will
set up the variable. This is the same mechanism used by the PGO
instrumentation and runtime.

Depends on D87120 and D89629.

Differential Revision: https://reviews.llvm.org/D89086

4 years ago[mlir][gpu] NFC: switch occurrences of gpu.launch_func to custom format.
Christian Sigg [Thu, 22 Oct 2020 05:43:34 +0000 (07:43 +0200)]
[mlir][gpu] NFC: switch occurrences of gpu.launch_func to custom format.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89929

4 years ago[AMDGPU] Fix expansion of i16 MULH
Piotr Sobczak [Thu, 22 Oct 2020 14:28:33 +0000 (16:28 +0200)]
[AMDGPU] Fix expansion of i16 MULH

This commit marks i16 MULH as expand in AMDGPU backend,
which is necessary after the refactoring in D80485.
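
For reference, MULH yields the high half of the full-width product. The sketch below shows the value an expanded i16 MULH has to produce, computed through a 32-bit multiply; the function names are hypothetical and this is not the code the backend emits.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned: high 16 bits of the 32-bit product.
uint16_t mulhu16(uint16_t A, uint16_t B) {
  uint32_t Product = static_cast<uint32_t>(A) * static_cast<uint32_t>(B);
  return static_cast<uint16_t>(Product >> 16);
}

// Signed: high 16 bits of the sign-extended 32-bit product.
int16_t mulhs16(int16_t A, int16_t B) {
  int32_t Product = static_cast<int32_t>(A) * static_cast<int32_t>(B);
  // Right-shifting a negative value is implementation-defined before C++20;
  // this sketch assumes the usual arithmetic shift.
  return static_cast<int16_t>(Product >> 16);
}
```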

Differential Revision: https://reviews.llvm.org/D89965

4 years ago[AArch64] Add min/max cost-model tests for v2i32.
Florian Hahn [Thu, 22 Oct 2020 15:02:55 +0000 (16:02 +0100)]
[AArch64] Add min/max cost-model tests for v2i32.

4 years ago[ARM][SchedModels] Convert IsLdstsoScaledPred to MCSchedPredicate
Evgeny Leviant [Thu, 22 Oct 2020 15:03:01 +0000 (18:03 +0300)]
[ARM][SchedModels] Convert IsLdstsoScaledPred to MCSchedPredicate

Differential revision: https://reviews.llvm.org/D89939

4 years ago[X86] X86AsmParser - make methods const where possible. NFCI.
Simon Pilgrim [Thu, 22 Oct 2020 14:46:09 +0000 (15:46 +0100)]
[X86] X86AsmParser - make methods const where possible. NFCI.

Reported by cppcheck

4 years ago[X86] Return const& in IntelExprStateMachine::getIdentifierInfo(). NFCI.
Simon Pilgrim [Thu, 22 Oct 2020 13:53:30 +0000 (14:53 +0100)]
[X86] Return const& in IntelExprStateMachine::getIdentifierInfo(). NFCI.

Avoid unnecessary copy in X86AsmParser::ParseIntelOperand