review.tizen.org Git - platform/upstream/llvm.git/log

[mlir] Remove unneeded OpBuilder params. NFC.

These are now automatically prepended.

[mlir][ods] Custom builder with no params

Incorrect generation of custom build method without any params.

NFC: Remove unused variable.

Add mlir python APIs for creating operations, regions and blocks.

* The API is a bit more verbose than I feel like it needs to be. In a follow-up I'd like to abbreviate some things and look in to creating aliases for common accessors.
* There is a lingering lifetime hazard between the module and newly added operations. We have the facilities now to solve for this but I will do that in a follow-up.
* We may need to craft a more limited API for safely referencing successors when creating operations. We need more facilities to really prove that out and should defer for now.

Differential Revision: https://reviews.llvm.org/D87996

Implement python iteration over the operation/region/block hierarchy.

* Removes the half-completed prior attempt at region/block mutation in favor of new approach to ownership.
* Will re-add mutation more correctly in a follow-on.
* Eliminates the detached state on blocks and regions, simplifying the ownership hierarchy.
* Adds both iterator and index based access at each level.

Differential Revision: https://reviews.llvm.org/D87982

Add Operation to python bindings.

* Fixes a rather egregious bug with respect to the inability to return arbitrary objects from py::init (was causing aliasing of multiple py::object -> native instance).
* Makes Modules and Operations referencable types so that they can be reliably depended on.
* Uniques python operation instances within a context. Opens the door for further accounting.
* Next I will retrofit region and block to be dependent on the operation, and I will attempt to model the API to avoid detached regions/blocks, which will simplify things a lot (in that world, only operations can be detached).
* Added quite a bit of test coverage to check for leaks and reference issues.
* Supercedes: https://reviews.llvm.org/D87213

Differential Revision: https://reviews.llvm.org/D87958

[mlir][openacc] Use OptionalParseResult in loop op parser instead of bool variables

This patch switch from using bool variables to OptionalParseResult for the parsing
inside loop operation. This is already done for parallel operation and this patch unify this
in the dialect.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D88111

[docs][llvm] Fix typos

I don't have commit access.
Please help me commit it.
Thanks : )

Reviewed By: Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D88139

[clangd] Refactor code completion signal's utility properties.

Current implementation of heuristic-based scoring function also contains
computation of derived signals (e.g. whether name contains a word from
context, computing file distances, scope distances.)
This is an attempt to separate out the logic for computation of derived
signals from the scoring function.
This will allow us to have a clean API for scoring functions that will
take only concrete code completion signals as input.

Differential Revision: https://reviews.llvm.org/D88146

[SVE] Lower fixed length ISD::VECREDUCE_ADD to Scalable

Differential Revision: https://reviews.llvm.org/D87796

[VPlan] Disconnect VPValue and VPUser.

This refactors VPuser to not inherit from VPValue to facilitate
introducing operations that introduce multiple VPValues (e.g.
VPInterleaveRecipe).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D84679

[mlir] Fix typos in Dialect.h. NFC.

[SystemZ] Make sure not to call getZExtValue on a >64 bit constant.

Better use isZero() and isIntN() in SystemZTargetTransformInfo rather than
calling getZExtValue() since the immediate operand may be wider than 64 bits,
which is not allowed with getZExtValue().

Fixes https://bugs.llvm.org/show_bug.cgi?id=47600

Review: Simon Pilgrim

[NFC][ARM] Pre-commit tail predication test

Fix typos in ASTMatchers.h; NFC

GlobalISel: Fix truncating shift amount in trunc (shl) combine

The shift amount type does not necessarily match the result type. This
was inserting a trunc from s32 to s32, which asserted. Just preserve
the original shift amount type which can be legalized later.

AMDGPU: Check global FP atomics match default FP mode

We would always select global FP atomics from atomicrmw fadd, although
they have a hardcoded FP mode.

[lldb] Fix GetRemoteSharedModule fallback logic

When the various methods of locating the module in GetRemoteSharedModule
fail, make sure we pass the original module spec to the bail-out call to
the provided resolver function.

Also make sure we consistently use the resolved module spec from the
various success paths.

Thanks to what appears to have been an accidentally inverted condition
(commit 85967fa applied the new condition to a path where GetModuleSpec
returns false, but should have applied it when GetModuleSpec returns
true), without this fix we only pass the original module spec in the
fallback if the original spec has no uuid (or has a uuid that somehow
matches the resolved module's uuid despite the call to GetModuleSpec
failing). This manifested as a bug when processing a minidump file with
a user-provided sysroot, since in that case the resolver call was being
applied to resolved_module_spec (despite resolution failing), which did
not have the path of its file_spec set.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D88099

[libc++] Re-apply fdc41e11f (LWG1203) without breaking the C++11 build

fdc41e11f was reverted in e46c1def5 because it broke the C++11 build.
We shouldn't be using enable_if_t in C++11, instead we must use
enable_if<...>::type.

[NFCI][flang] Renamed a variable name to a more descriptive name

[flang] Removed OpenMP lowering unittests

These tests aren't adding much value and consensus has been reached for
there removal.
For more context, please refer to discussion in this revision:
https://reviews.llvm.org/D87846

[mlir] Added support for f64 memref printing in runner utils

Added print_memref_f64 function to runner utils.

Differential Revision: https://reviews.llvm.org/D88143

[OpenMP][flang]Lower NUM_THREADS clause for parallel construct

This patch reflects the work that can be upstreamed from PR(merged)
PR: https://github.com/flang-compiler/f18-llvm-project/pull/411

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D87846

[CUDA][HIP] Fix static device var used by host code only

A static device variable may be accessed in host code through
cudaMemCpyFromSymbol etc. Currently clang does not
emit the static device variable if it is only referenced by
host code, which causes host code to fail at run time.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D88115

[flang] CHARACTER(*) return does not require explicit interface

Fortran 2018 15.4.2.2(4)(c) says nonassumed or explicit non-constant
length parameter require explicit interface. The "nonassumed" part was
missing in f18 characteristic analysis causing CanBeCalledViaImplicitInterface
to return false for `CHARACTER(*) function foo()` like interfaces.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D88075

[SVE][CodeGen] Lower legal integer -> floating point conversions

This patch adds new ISD nodes, SCVTZ_MERGE_PASSTHRU &
UCVTZ_MERGE_PASSTHRU, which are used to lower both legal
scalable vector [S|U]INT_TO_FP operations and the following intrinsics:
- llvm.aarch64.sve.scvtf
- llvm.aarch64.sve.ucvtf

Reviewed By: sdesmalen, efriedma

Differential Revision: https://reviews.llvm.org/D87913

[llvm-readelf/obj] - Fix extended section symbol indices printed in warnings for MIPS GOT/PLT entries.

Recent refactoring introduced a symbol index argument for `getFullSymbolName` method,
which is only used for reporting error messages about invalid extended symbol indexes.

There are few issues in the implementation and we don't report correct symbol indices
when dumping MIPS GOT/PLT entries currently.

This patch adds test cases and fixes the issue.

Differential revision: https://reviews.llvm.org/D88089

[llvm-readelf/obj] - Print section symbol names properly when dumping relocations.

Currently `--relocations` ignores section symbol names and always prints
section names for them. This is inconsistent with GNU readelf and with `--symbols`.

We have a code in `getFullSymbolName` (which is used for `--symbols`) which can be
reused for `getRelocationTarget` (used for `--relocations`).
With that the issue described is fixed and code becomes a bit shorter.
Also with this change we start to print more relocations (in situations when we just
showed warnings instead before) and also start to report more diagnostic warnings
(see reloc-zero-name-or-value.test).

Differential revision: https://reviews.llvm.org/D87613

[AMDGPU] Insert waitcnt after returning from call

When memory operations are outstanding on function calls, either the
caller or the callee can insert a waitcnt to ensure that all reads are
finished.
Calls need some time to be executed, so if the callee inserts the
waitcnt, filling the instruction buffer and waiting for memory will be
interleaved, hiding some latency. This comes at the cost of having a
waitcnt inside functions that may not be needed as no memory operations
are outstanding.

For function calls, this is already implemented. The same principal
applies to returns: If the caller inserts a waitcnt after the call, the
callee does not have to wait and the return and memory operation can be
run in parallel.

This commit implements waiting in the caller after returning from a
function call.

Differential Revision: https://reviews.llvm.org/D87674

[llvm-readelf/obj] - Cleanup the code. NFCI.

This:
1) Replaces pointers with references in many places.
2) Adds few TODOs about fixing possible unhandled errors (in ARMEHABIPrinter.h).
3) Replaces `auto`s with actual types.
4) Removes excessive arguments.
5) Adds `const ELFFile<ELFT> &Obj;` member to `ELFDumper` to simplify the code.

Differential revision: https://reviews.llvm.org/D88097

[analyzer][StdLibraryFunctionsChecker] Separate the signature from the summaries

The signature should not be part of the summaries as many FIXME comments
suggests. By separating the signature, we open up the way to a generic
matching implementation which could be used later under the hoods of
CallDescriptionMap.

Differential Revision: https://reviews.llvm.org/D88100

[analyzer][StdLibraryFunctionsChecker] Fix getline/getdelim signatures

It is no longer needed to add summaries of 'getline' for different
possible underlying types of ssize_t. We can just simply lookup the
type.

Differential Revision: https://reviews.llvm.org/D88092

[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits

An existing function Type::getScalarSizeInBits returns a uint64_t
instead of a TypeSize class because the caller is requesting a
scalar size, which cannot be scalable. This patch makes other
similar functions requesting a scalar size consistent with that,
thereby eliminating more than 1000 implicit TypeSize -> uint64_t
casts.

Differential revision: https://reviews.llvm.org/D87889

Revert "[libc++] Implement LWG1203"

This reverts commit fdc41e11f9687a50c97e2a59663bf2d541ff5489. It causes the
libcxx/modules/stds_include.sh.cpp test to fail with:
libcxx/include/ostream:1039:45: error: no template named 'enable_if_t'; did you mean 'enable_if'?
template <class _Stream, class _Tp, class = enable_if_t<

Still investigating what's causing this and reverting in the meantime to get
the bots green again.

[SVE] Fix InstCombinerImpl::PromoteCastOfAllocation for scalable vectors

In this patch I've fixed some warnings that arose from the implicit
cast of TypeSize -> uint64_t. I tried writing a variety of different
cases to show how this optimisation might work for scalable vectors
and found:

1. The optimisation does not work for cases where the cast type
is scalable and the allocated type is not. This because we need to
know how many times the cast type fits into the allocated type.
2. If we pass all the various checks for the case when the allocated
type is scalable and the cast type is not, then when creating the
new alloca we have to take vscale into account. This leads to
sub-optimal IR that is worse than the original IR.
3. For the remaining case when both the alloca and cast types are
scalable it is hard to find examples where the optimisation would
kick in, except for simple bitcasts, because we typically fail the
ABI alignment checks.

For now I've changed the code to bail out if only one of the alloca
and cast types is scalable. This means we continue to support the
existing cases where both types are fixed, and also the specific case
when both types are scalable with the same size and alignment, for
example a simple bitcast of an alloca to another type.

I've added tests that show we don't attempt to promote the alloca,
except for simple bitcasts:

Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll

Differential revision: https://reviews.llvm.org/D87378

[AMDGPU] Fix merging m0 inits

Fix incorrect merges of m0 inits in loops.

It was assumed that if a clobbering instruction appears in
the same block as an init and the clobbering instruction
does not dominate the init then it does not interfere with
init.

This does not work in the presence of loops, where in this
scenario, the clobbering instruction does interfere with
the init in another iteration.

To fix this, do not check for block equality and defer the
decision to the predecessor check.

Differential Revision: https://reviews.llvm.org/D87882

[mlir][Linalg] Add pattern to fold linalg.tensor_reshape that add unit extent dims.

A sequence of two reshapes such that one of them is just adding unit
extent dims can be folded to a single reshape.

Differential Revision: https://reviews.llvm.org/D88057

[RISCV][ASAN] implementation of ThreadSelf for riscv64

[6/11] patch series to port ASAN for riscv64

Depends On D87574

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D87575

[NFC] Reformat preprocessor directives

Revert "[RISCV][ASAN] implementation of ThreadSelf for riscv64"

Merged two unrelated commits

This reverts commit 00f6ebef6e347e0d24a8f940fe43656719e88cb8.

[PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins

This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10.

Differential: https://reviews.llvm.org/D87394#inline-815858

[CVP] Remove a redundant trailing semicolon, fixing GCC warnings. NFC.

[InstCombine] Add parentheses in assert to silence GCC warning. NFC.

[MC] [Win64EH] Try to generate packed unwind info where possible

In practice, this only gives modest savings (for a 6.5 MB DLL with
230 KB xdata, the xdata sections shrinks by around 2.5 KB); to
gain more, the frame lowering would need to be tweaked to more often
generate frame layouts that match the canonical layouts that can
be written in packed form.

Differential Revision: https://reviews.llvm.org/D87371

Add a dump() method on the pass manager for debugging purpose (NFC)

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D88008

[RISCV][ASAN] implementation of ThreadSelf for riscv64

[6/11] patch series to port ASAN for riscv64

Depends On D87574

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D87575

[RISCV][ASAN] implementation for vfork interceptor for riscv64

[5/11] patch series to port ASAN for riscv64

Depends On D87573

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D87574

[RISCV][ASAN] implementation of clone interceptor for riscv64

[4/11] patch series to port ASAN for riscv64

Depends On D87572

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D87573

[RISCV][ASAN] implementation of internal syscalls wrappers for riscv64

implements glibc-like wrappers over Linux syscalls.

[3/11] patch series to port ASAN for riscv64

Depends On D87998

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D87572

[Analyzer][WebKit] Use tri-state types for relevant predicates

Some of the predicates can't always be decided - for example when a type
definition isn't available. At the same time it's necessary to let
client code decide what to do about such cases - specifically we can't
just use true or false values as there are callees with
conflicting strategies how to handle this.

This is a speculative fix for PR47276.

Differential Revision: https://reviews.llvm.org/D88133

[RISCV][ASAN] updated platform macros to simplify detection of RISCV64 platform

[2/11] patch series to port ASAN for riscv64

Depends On D87997

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87998

[sanitizers] Remove the message queue with IPC_RMID after D82897

[Sanitizers] Fix test case that doesn't clean up after itself

Commit https://reviews.llvm.org/rG144e57fc9535 added this test
case that creates message queues but does not remove them. The
message queues subsequently build up on the machine until the
system wide limit is reached. This has caused failures for a
number of bots running on a couple of big PPC machines.

This patch just adds the missing cleanup.

[lld-maco] fix build breakage

[ThinLTO] Avoid temporaries when loading global decl attachment metadata

When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.

Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.

This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.

For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:

No -fwhole-program-vtables:
  Lazy loading on (head):  1s
  Lazy loading off (head): 3s
  Lazy loading on (patch): 1s

With -fwhole-program-vtables:
  Lazy loading on (head):  440s
  Lazy loading off (head): 4s
  Lazy loading on (patch): 2s

Differential Revision: https://reviews.llvm.org/D87970

[lld-macho] In the context of relocs, s/target/referent/ for sections & symbols

The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P

Along the way, make a few neighboring variable names more descriptive.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D87584

[test][NewPM] Clean up ScalarEvolution tests to work under NPM

[CostModel][X86] add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)

add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D87884

[test][NewPM] Fix update-scev.ll under NPM

Revert "[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData."

This reverts commit 4944bb190fed8861d4d043eaf45e3c1e12aa2dc5.

[NewPM] Pin tests with -debug-pass to legacy PM

-debug-pass is a legacy PM only option.

Some tests checks that the pass returned that it made a change,
which is not relevant to the NPM, since passes return PreservedAnalyses.

Some tests check that passes are freed at the proper time, which is also
not relevant to the NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87945

[DWARFYAML][test] Simplify __debug_pubnames/types tests. NFC.

This patch stripped unneeded sections from the test case.

Reviewed By: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D88073

Revert "Canonicalize declaration pointers when forming APValues."

This reverts commit 905b9ca26c94fa86339451a528cedde5004fc1bb.

Reverting because this strips `weak` attributes off function
declarations, leading to the linker error we see at
https://ci.chromium.org/p/fuchsia/builders/ci/clang_toolchain.fuchsia-arm64-debug-subbuild/b8868932035091473008.

See https://reviews.llvm.org/rG905b9ca26c94 for reproducer details.

[EHStreamer] Ensure CallSiteEntry::{BeginLabel,EndLabel} are non-null. NFC

... to simplify the code a bit.

Reviewed By: rahmanl

Differential Revision: https://reviews.llvm.org/D87999

[lld-macho] handle option -headerpad_max_install_names

Differential Revision: https://reviews.llvm.org/D88064

[Clang] Fix a typo in implicit-int-float-conversion.c

[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData.

The IRSimilarityCandidate is a container to hold a region of
IRInstructions and offer interfaces for the starting instruction, ending
instruction, parent function, length. It also assigns a global value
number for each unique instance of a value in the region.

It also contains an interface to compare two IRSimilarity as to whether
they have the same sequence of similar instructions.

Tests for whether the instructions are similar are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.

Differential Revision: https://reviews.llvm.org/D86970

[NFC][docs] Fix link.

The rendered html was (no hyperlink was generated):

(see Getting Started <GettingStarted.html#git-pre-push-hook>)

Now, it is (with proper hyperlink):

(see Git pre-push hook)

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D88116

[ORC][examples] Add missing library dependencies.

[trace] avoid using <regex>

Easy fix based on the feedback by maskray on
https://reviews.llvm.org/D85705.

[InstCombine][NFC][tests] Add ninf base value case to pow-sqrt.ll

[InstCombine] Fix errno bug in pow expansion to sqrt

A conversion from `pow` to `sqrt` shall not call an `errno`-setting
`sqrt` with -//infinity//: the `sqrt` will set `EDOM` where the `pow`
call need not.

This patch avoids the erroneous (pun not intended) transformation by
applying the restrictions discussed in the thread for
https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.

The existing tests are updated (depending on emphasis in the checks for
library calls, avoidance of overlap, and overall coverage):
  - to add `ninf`, retaining the intended library call,
  - to use the intrinsic, retaining the use of `select`, or
  - to expect the replacement to not occur.

The following is tested:
  - The pow intrinsic folds to a `select` instruction to
    handle -//infinity//.
  - The pow library call folds, with `ninf`, to `sqrt` without the
    `select` instruction associated with handling -//infinity//.
  - The pow library call does not fold to `sqrt` without `ninf`.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D87877

[SLP]Fix coding style, NFC.

[libc++] NFC: Reindent the feature test macro generation script

Each feature-test macro is now a clear block indentation-wise.

[libc++] NFC: Collocate C++20 removed members of std::allocator

[AArch64] Teach analyzeBranch to remove branch equivelent to fallthrough

The motivation here is that MachineBlockPlacement relies on analyzeBranch to remove branches to fallthrough blocks when the branch is not fully analyzeable. With the introduction of the FAULTING_OP psuedo for implicit null checking (see D87861), this case becomes important. Note that it's hard to otherwise exercise this path as BranchFolding handle's any fully analyzeable branch sequence without using this interface.

p.s. For anyone who saw my comment in the original review, what I thought was an issue in BranchFolding originally turned out to simply be a bug in my patch. (Now fixed.)

Differential Revision: https://reviews.llvm.org/D88035

Fix build due to renaming in LoopInfo.

[libc++] Implement LWG1203

Libc++ had an issue where nonsensical code like

decltype(std::stringstream{} << std::vector<int>{});

would compile, as long as you kept the expression inside decltype in
an unevaluated operand. This turned out to be that we didn't implement
LWG1203, which clarifies what we should do in that case.

rdar://58769296

Change LoopInfo::empty to isInnermost after D82895

Small fixes for "[LoopInfo] empty() -> isInnermost(), add isOutermost()"

Revert "[CodeGen] emit CG profile for COFF object file"

This reverts commit 91aed9bf975f1e4346cc8f4bdefc98436386ced2, it is
causing link errors.

[LoopInfo] empty() -> isInnermost(), add isOutermost()

Differential Revision: https://reviews.llvm.org/D82895

[AArch64] Avoid pairing loads with same result reg

When pairing ldr instructions to an ldp instruction, we cannot pair two ldr
destination registers where one is a sub or super register of the other.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D86906

[ThinLTO] Option to bypass function importing.

This completes the circle, complementing -lto-embed-bitcode
(specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips
function importing. The index file is still needed for the other data it
contains.

Differential Revision: https://reviews.llvm.org/D87949

[compiler-rt][AIX] Add CMake support for 32-bit Power builds

This patch enables support for building compiler-rt builtins for 32-bit
Power arch on AIX. For now, we leave out the specialized ppc builtin
implementations for 128-bit long double and friends since those will
need some special handling for AIX.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D87383

[lldb][test] Remove accidental import pdb in 783dc7dc7ed7487d0782c2feb8854df949b98e69

Two patches to fix the broken build.
One to fix a C++ compiler warning.
One to allow Sphinx to find a new document.

Revert "The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled."

This reverts commit 6950db36d33d85d18e3241ab6c87494c05ebe0fb.

[flang][msvc] Explicitly reference "this" inside closure. NFC.

The Microsoft compiler seems to have difficulties to decide between a const/non-const method of a captured object context in a closure. The error message is:
```
symbol.cpp(261): error C2668: 'Fortran::semantics::Symbol::detailsIf': ambiguous call to overloaded function
symbol.h(535): note: could be 'const D *Fortran::semantics::Symbol::detailsIf<Fortran::semantics::DerivedTypeDetails>(void) const'
symbol.h(534): note: or 'D *Fortran::semantics::Symbol::detailsIf<Fortran::semantics::DerivedTypeDetails>(void)'
symbol.cpp(261): note: while trying to match the argument list '()'
```
Explicitly using the this-pointer resolves this problem.

This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D88052

[flang][msvc] Add explicit function template argument to applyLamda. NFC.

Like in D87961, msvc has difficulties deducing the template argument. The error message is:
```
expr-parsers.cpp(383): error C2672: 'applyLambda': no matching overloaded function found
```
Explicitly pass the first template argument to help it.

This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D88001

[flang][msvc] Add explicit function template argument to applyFunction. NFC.

Msvc has difficulties deducing the template argument here. The error message is:
```
basic-parsers.h(790,12): error C2672: 'applyFunction': no matching overloaded function found
```
Explicitly pass the first template argument to help it.

This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D87961

Revert "[lldb] XFAIL TestMemoryHistory on Linux"

This reverts commit 7518006d75accd21325747430d6bced66b2c5ada.

This test apparently works on the Swift CI ubuntu bot, so it shouldn't be
XFAIL'd on Linux.

Implement a new kind of Pass: dynamic pass pipeline

Instead of performing a transformation, such pass yields a new pass pipeline
to run on the currently visited operation.
This feature can be used for example to implement a sub-pipeline that
would run only on an operation with specific attributes. Another example
would be to compute a cost model and dynamic schedule a pipeline based
on the result of this analysis.

Discussion: https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637

Recommit after fixing an ASAN issue: the callback lambda needs to be
allocated to a temporary to have its lifetime extended to the end of the
current block instead of just the current call expression.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D86392

[CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain its operands

This is practically identical to what we already do for UDiv/URem:
  https://rise4fun.com/Alive/04K

Name: narrow udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow exact udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv exact i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow urem
Pre: C0 u<= 255 && C1 u<= 255
%r = urem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = urem i8 %t0, %t1
%r = zext i8 %t2 to i16

... only here we need to look for 'min signed bits', not 'active bits',
and there's an UB to be aware of:
  https://rise4fun.com/Alive/KG86
  https://rise4fun.com/Alive/LwR

Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv exact i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = srem i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv exact i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = srem i8 %t0, %t1
%r = sext i8 %t2 to i16

The ConstantRangeTest.losslessSignedTruncationSignext test sanity-checks
the logic, that we can losslessly truncate ConstantRange to
`getMinSignedBits()` and signext it back, and it will be identical
to the original CR.

On vanilla llvm test-suite + RawSpeed, this fires 1262 times,
while the same fold for UDiv/URem only fires 384 times. Sic!

Additionally, this causes +606.18% (+1079) extra cases of
aggressive-instcombine.NumDAGsReduced, and +473.14% (+1145)
of aggressive-instcombine.NumInstrsReduced folds.

[NFC][CVP] Add tests for SDiv/SRem narrowing

[NFC][CVP] Give a better name STATISTIC() counting udiv i16 -> udiv i8 xforms

[ConstantRange] Introduce getMinSignedBits() method

Similar to the ConstantRange::getActiveBits(), and to similarly-named
methods in APInt, returns the bitwidth needed to represent
the given signed constant range

[NFC][APInt] Refactor getMinSignedBits() in terms of getNumSignBits()

This is fully identical to the old implementation, just easier to read.

[NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()

As an exhaustive test shows, this logic is fully identical to the old
implementation, with exception of the case where both of the operands
had empty ranges:

```
TEST_F(ConstantRangeTest, CVP_UDiv) {
  unsigned Bits = 4;
  EnumerateConstantRanges(Bits, [&](const ConstantRange &CR0) {
    if(CR0.isEmptySet())
      return;
    EnumerateConstantRanges(Bits, [&](const ConstantRange &CR1) {
      if(CR0.isEmptySet())
        return;

      unsigned MaxActiveBits = 0;
      for (const ConstantRange &CR : {CR0, CR1})
        MaxActiveBits = std::max(MaxActiveBits, CR.getActiveBits());

      ConstantRange OperandRange(Bits, /*isFullSet=*/false);
      for (const ConstantRange &CR : {CR0, CR1})
        OperandRange = OperandRange.unionWith(CR);
      unsigned NewWidth = OperandRange.getUnsignedMax().getActiveBits();

      EXPECT_EQ(MaxActiveBits, NewWidth) << CR0 << " " << CR1;
    });
  });
}
```

[ConstantRange] Introduce getActiveBits() method

Much like APInt::getActiveBits(), computes how many bits are needed
to be able to represent every value in this constant range,
treating the values as unsigned.