review.tizen.org Git - platform/upstream/llvm.git/log

[mlir][Linalg] Add a DownscaleDepthwiseConv2DNhwcHwcOp decomposition pattern.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113907

[asm] Make EmitMSInlineAsmStr and EmitGCCInlineAsmStr more alike

https://reviews.llvm.org/D71677 copied a bunch of code from
EmitGCCInlineAsmStr() to EmitMSInlineAsmStr() but made a few small
(likely unintentional) changes. This makes these pieces look the same.

No behavior change.

(Why are these functions two copies? No great reason as far as I can tell.
https://reviews.llvm.org/rG1778831a3d1d24ab6545635f63da4d9c5f8f0ac7 did the
split; we might want to undo them at some point. But PR23933 suggests
that a bigger change is planned for this file in the future, so keeping
this incremental for now.)

Differential Revision: https://reviews.llvm.org/D113924

[asm] Convert AsmPrinter::PrintSpecial() to StringRef

No behavior change.

Differential Revision: https://reviews.llvm.org/D113911

[asm] Correctly handle special names in variants

There's really no reason why anyone should use these special names in a variant.
I noticed this while reading the code: all other writes to OS are guarded by
this conditional, and the behavior with the check seems more correct, so
let's add the check.

Differential Revision: https://reviews.llvm.org/D113909

[PowerPC] Fix 32bit vector insert instructions for ISA3.1

The platform independent ISD::INSERT_VECTOR_ELT take a element index,
but vins* instructions take a byte index. Update 32bit td patterns for
vector insert to handle the element index accordingly.

Since vector insert for non constant index are supported in
ISA3.1, there is no need to use platform specific ISD node,
PPCISD::VECINSERT. Update td pattern to directly use
ISD::INSERT_VECTOR_ELT instead.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D113802

[NFC][lld] Inclusive language: change master file to merged file

[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with merged in these comments.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D113903

[LLDB][NativePDB] Fix image lookup by address

`image lookup -a ` doesn't work because the compilands list is always empty.
Create CU at given index if doesn't exit.

Differential Revision: https://reviews.llvm.org/D113821

Making the code compliant to the documentation about Floating Point
support default values for C/C++. FPP-MODEL=PRECISE enables FFP-CONTRACT
(FMA is enabled).

Fix a misleading FIXME in an unroll test

[SLP][NFC]Add a test for extra shuffle emission, NFC.

[llvm][fix] Inclusive language: replace master with main in find_interesting_reviews.py

As part of using inclusive language within the llvm project and to fix
the command because the master branch was renamed, this patch replaces
master with main in `find_interesting_reviews.py`.

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D113918

[runtime-unroll] Inline canSafelyUnrollMultiExitLoop [NFC]

All of the interesting logic from this routine has been removed, inline the single check into the sole non-assert caller. The assert use has little value with the restructured code and is simply dropped.

[mlir][linalg] Make loop ops in TileLoopNest accessible

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113927

[PatternMatch] Add m_BinOp/m_c_BinOp with specific opcode

Differential Revision: https://reviews.llvm.org/D113508

[msan] Disabled test failing on new GLIBC

[runtime-unroll] Restructure if-clause to improve readability [NFC]

[SLP][DOT][NFCI]Output all scalars for the splats, not only the first one.

Revert "[llvm][ubsan] Inclusive language: replace use of blacklist HandleLLVMOptions.cmake"

This reverts commit 44a64afd43943ed6f47c37f61a6cd2e99c7287f3.

[AIX][llvm-go] AIX linker does not recognize `-rpath`

The existing logic adds `-rpath` to CGO_LDFLAGS, which is not a valid linker option on AIX. This patch substitutes it with `-blibpath` on AIX.

Reviewed By: daltenty

Differential Revision: https://reviews.llvm.org/D113704

[analyzer][NFC] Separate CallDescription from CallEvent

`CallDescriptions` deserve its own translation unit.
This patch simply moves the corresponding parts.
Also includes the `CallDescription.h` where it's necessary.

Reviewed By: martong, xazax.hun, Szelethus

Differential Revision: https://reviews.llvm.org/D113587

[X86] Add generic splitVectorOp helper. NFC

Update splitVectorIntUnary/splitVectorIntBinary to use this internally, after some operand type sanity checks.

Avoid code duplication and makes it easier to split other vector instruction forms in the future.

[RISCV] Teach needVSETVLIPHI to handle mask register instructions.

This handles the case where the mask register instruction input
comes from a Phi of vsetvlis. If the VLMAX is the same as the VLMAX
required by the mask register instruction, we can avoid a vsetvli.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113204

[flang] Allow implicit procedure pointers to associate with explicit procedures

Section 10.2.2.4, paragraph 3 states that, for procedure pointer assignment:
  If the pointer object has an explicit interface, its characteristics shall be
  the same as the pointer target ...

Thus, it's illegal for a procedure pointer with an explicit interface to be
associated with a procedure whose interface is implicit.  However, there's no
prohibition that disallows a procedure pointer with an implicit interface from
being associated with a procedure whose interface is explicit.

We were incorrectly emitting an error message for this latter case.

We were also not covering the case of procedures with explicit
interfaces where calling them requires the use of a descriptor.  Such
procedures cannot be associated with procedure pointers with implicit
interfaces.

Differential Revision: https://reviews.llvm.org/D113706

[NFC][X86][Costmodel] Improve test coverage for {i8,i16,i32,i64}->i1 vector trunc

[NFC][X86][Costmodel] Improve test coverage for i1->{i8,i16,i32,i64} vector *ext

[InstCombine] Fold (A^B)|~A-->~(A&B)

https://alive2.llvm.org/ce/z/2v6rhF

Fixes:
https://llvm.org/PR52478

Differential Revision: https://reviews.llvm.org/D113783

[analyzer] Fix region cast between the same types with different qualifiers.

Summary: Specifically, this fixes the case when we get an access to array element through the pointer to element. This covers several FIXME's. in https://reviews.llvm.org/D111654.
Example:
const int arr[4][2];
const int *ptr = arr[1]; // Fixes this.
The issue is that `arr[1]` is `int*` (&Element{Element{glob_arr5,1 S64b,int[2]},0 S64b,int}), and `ptr` is `const int*`. We don't take qualifiers into account. Consequently, we doesn't match the types as the same ones.

Differential Revision: https://reviews.llvm.org/D113480

[X86] LowerFunnelShift - pull out repeated EltSizeInBits variable. NFC.

[NFC][X86][Costmodel] Add i1 replication shuffle costmodel test coverage

[PatternMatch] Add a new m_Any that binds a value.

This is analogous to what LLVM's PatternMatch.h supports,
but LLVM calls it m_Value for both the binding and
nonbinding versions.

This is an upstream from CIRCT and is used there.

Differential Revision: https://reviews.llvm.org/D113905

[llvm][ubsan] Inclusive language: replace use of blacklist HandleLLVMOptions.cmake

This patch changes it to ignorelist and contains a filename change for the
.txt file that's called.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113689

[x86] fold vector (X > -1) & Y to shift+andn (2nd try)

The first try at this patch ( bf5748a1af0d ) was reverted ( 5be64d416481 )
because it could crash. The cause of that problem was failing to account
for the optional peek-through-bitcast in the enclosing function.

This version of the patch adds a clause to avoid the fold in case of
bitcasts because it is unlikely to be profitable in that scenario.

A test case based on https://llvm.org/PR52504 was added to make sure
we don't have that problem again.

Original commit message:

and (pcmpgt X, -1), Y --> pandn (vsrai X, BitWidth-1), Y

This avoids the -1 constant vector in favor of an arithmetic shift
instruction if it exists (the ISA is still not complete after all
these years...).

We catch this pattern late in combining by matching PCMPGT, so it
should not interfere with more general folds.

Differential Revision: https://reviews.llvm.org/D113603

[x86] add test for vector signbit mask fold (PR52504); NFC

This goes with D113603 -
which was reverted because it could crash on this and similar examples.

[X86][Costmodel] `getReplicationShuffleCost()`: promote 8 bit-wide elements to 32 bit when no AVX512VBMI

Currently `X86TTIImpl::getInterleavedMemoryOpCostAVX512()` asks about i8 elt type,
so this change does affect vectorization. In the end, it will ask about i1.

We should also try to promote to i16 if we have AVX512BW, i'll do that in a follow-up.
All costs here look good, i've added the missing truncation costs in preparatory patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113853

[X86][Costmodel] `trunc v32i16 to v64i8` can appear after legalization, cost is same as for `trunc v32i16 to v32i8`

Some of the costs get larger here,
but i suppose that makes sense since we'd previously query
scalarization costs that may not be really representative of the reality.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113852

[X86][Costmodel] `trunc v8i64 to v16i8/v32i8/v64i8` can appear after legalization, cost is same as for `trunc v8i64 to v8i8`

While this one is trivial and identical to the previous patch,
there is a weird cost change in a follow-up patch that i'm not sure about.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113851

[X86][Costmodel] `trunc v16i32 to v32i8/v64i8` can appear after legalization, cost is same as for `trunc v16i32 to v16i8`

While this one is trivial and identical to the previous patch,
there is a weird cost change in a follow-up patch that i'm not sure about.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113850

[Flang] Add the FIR LLVMPointer Type

Add a fir.llvm_ptr type to allow any level of indirections

Currently, fir pointer types (fir.ref, fir.ptr, and fir.heap) carry
a special Fortran semantics, and cannot be freely combined/nested.

When implementing some features, lowering sometimes needs more liberty
regarding the number of indirection levels. Add a fir.llvm_ptr that has
no constraints.

Allow its usage in fir.coordinate_op, fir.load, and fir.store.

Convert the FIR LLVMPointer to an LLVMPointer in the LLVM dialect.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D113755

Co-authored-by: Jean Perier <jperier@nvidia.com>

[openmp][amdgpu] Add comment warning that libm may be broken

Using llvm-link to add rocm device-libs probably doesn't work

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112639

[X86] getAVX512Node() - find constant broadcasts to encourage load-folding

If an operand is a bitcasted or widended constant, try to more aggressively create broadcastable constants for folding, which in particular helps non-VLX modes.

I've refactored getAVX512Node so that VLX targets can make better use of this as well.

NOTE: In the future, I think we should consider removing the broadcast of constant data from DAG entirely and move this to either X86InstrInfo::foldMemoryOperand or a new pass - AVX1/2 targets has similar problems with missed (whole vector) folds that need to be improved as well.

Differential Revision: https://reviews.llvm.org/D113845

[SLP]Improve splat detection.

A bunch of scalars can be treated as a splat not only if all elements
are the same but also if some of them are undefvalues.

Differential Revision: https://reviews.llvm.org/D113774

[clang] Allow clang-check to customize analyzer output file or dir name

Required by https://stackoverflow.com/questions/58073606

As the output argument is stripped out in the clang-check tool, it seems impossible for clang-check users to customize the output file name, even with -extra-args and -extra-arg-before.

This patch adds the -analyzer-output-path argument to allow users to adjust the output name. And if the argument is not set or the analyzer is not enabled, the original strip output adjuster will remove the output arguments.

Differential Revision: https://reviews.llvm.org/D97265

[fir] Add fir.global_len conversion placeholder

As for D113662, this patch just add a place holder for
the fir.global_len operation conversion. This operation
is part of F20xx and is not implemented yet.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D113887

[flang][CodeGen] Transform `fir.unboxchar` to a sequence of LLVM MLIR

This patch extends the `FIRToLLVMLowering` pass in Flang by adding a
hook to transform `fir.unboxchar` to a sequence of LLVM MLIR
instructions.

This is part of the upstreaming effort from the `fir-dev` branch in [1].

[1] https://github.com/flang-compiler/f18-llvm-project

Differential Revision: https://reviews.llvm.org/D113747

Originally written by:
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[fir] Remove extra return in SelectTypeOpConversion

Extra commit after D113878

[libc++] Add missing _LIBCPP_HIDE_FROM_ABI to __rewrap_iter

Regenerate acle_st1*.c tests

Regenerate acle_st1*.c tests using update_cc_test_checks.py

[NFC][InstSimplify] add test cases with base results for or-xor fold

This patch adds tests with baseline results as a pre-commit for D113861

Differential Revision: https://reviews.llvm.org/D113860

[SLP][NFC]Use `isa_and_nonnull` and fix comment, NFC.

Fix an unused variable warning

Revert "[GVN][NFC] Remove redundant check"

This reverts commit c35e8185d8c170c20e28956e0c9f3c1be895fefb.

mstorsjo reported in the revision thread that one VNCoercion assertion
is violated and seemly in relate to this commit. As per "If a test case
that demonstrates a problem is reported in the commit thread, please
revert and investigate offline", this commit is reverted.

[SLP]Do not create unused gather nodes for scalar arguments of vector intrinsics.

If the vector intrinsic has scalar argument, we currently still create
a tree entry for this argument. This entry is not used, just consumes
resources and increases the cost of the tree.

Differential Revision: https://reviews.llvm.org/D113806

[CMake] Allow passing extra options to extract_symbols.py.

When cross-compiling LLVM in an environment where there //is// an
objdump binary available but it does not understand the target
platform's object file format, extract_symbols.py fails, because its
initial check for tool availability decides that the existence of
objdump at all is good enough to settle on it as the tool of choice.

In such an environment it's useful to work around this by telling
extract_symbols.py to use llvm-readobj instead. The script itself has
an option for that, but its invocation in AddLLVM.cmake wasn't
providing a mechanism to add extra options passed through for the
cmake command line.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D113557

[lldb] Unwrap the type when dereferencing the value

The value type can be a typedef of a reference (e.g. `typedef int& myint`).
In this case `GetQualType(type)` will return `clang::Typedef`, which cannot
be casted to `clang::ReferenceType`.

Fix a regression introduced in https://reviews.llvm.org/D103532.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D113673

[fir] Add fir.select_type conversion placeholder

As for D113662, this patch just add a place holder for
the `fir.select_type` operation conversion. This operation
is part of F20xx and is not implemented yet.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D113878

[IVDescriptor] Make sure the sign is included for negative extension.

At the moment, computeRecurrenceType does not include any sign bits in
the maximum bit width. If the value can be negative, this means the sign
bit will be missing and the sext won't properly extend the value.

If the value can be negative, increment the bitwidth by one to make sure
there is at least one sign bit in the result value.

Note that the increment is also needed *if* the value is *known* to be
negative, as a sign bit needs to be preserved for the sext to work.

Note that this at the moment prevents vectorization, because the
analysis computes i1 as type for the recurrence when looking through the
AND in lookThroughAnd.

Fixes PR51794, PR52485.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113056

[libcxx] Fix enable_if condition of std::reverse_iterator::operator=

The template std::is_assignable<T, U> checks that T is assignable from
U. Hence, the order of operands in the instantiation of
std::is_assignable in the std::reverse_iterator::operator= condition
should be reversed.

This issue remained unnoticed because std::reverse_iterator has an
implicit conversion constructor. This patch adds a test to check that
the assignment operator is used directly, without any implicit
conversions. The patch also adds a similar test for
std::move_iterator.

Reviewed By: Quuxplusone, ldionne, #libc

Differential Revision: https://reviews.llvm.org/D113417

[llvm-nm][test] Move X86 lit.local.cfg into the X86 subfolder

The file seems to have been put in the wrong place in its original
commit. This had the effect of marking all llvm-nm tests as unsupported,
unless X86 was enabled, even for tests that weren't X86 specific.

Fixes https://bugs.llvm.org/show_bug.cgi?id=52506.

Reviewed by: mstorsjo

Differential Revision: https://reviews.llvm.org/D113882

[mlir][Linalg] Fix and improve vectorization of depthwise convolutions.

When trying to connect the vectorization of depthwise convolutions to e2e execution
a number of problems surfaced.
Fix an off-by-one error on the size of the input vector (similary to what was previously done for regular conv).
Rewrite the lowering to vector.fma instead of vector.contract: the KW reduction dimension has already been unrolled and vector.contract requires a reduction dimension to be valid.

Differential Revision: https://reviews.llvm.org/D113884

[mlir][Linalg] Add bounded recursion declaration to FMAOp -> LLVM conversion.

FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times.

Without this, FMAOps with 3+D do not lower to LLVM.

Differential Revision: https://reviews.llvm.org/D113886

[mlir] Move min/max ops from Std to Arith.

Differential Revision: https://reviews.llvm.org/D113881

[clang-tidy] Fix a crash in modernize-loop-convert around conversion operators

modernize-loop-convert checks and fixes when a loop that iterates over the
elements of a container can be rewritten from a for(...; ...; ...) style into
the "new" C++11 for-range format. For that, it needs to parse the elements of
that loop, like its init-statement, such as ItType it = cont.begin().
modernize-loop-convert checks whether the loop variable is initialized by a
begin() member function.

When an iterator is initialized with a conversion operator (e.g. for
(const_iterator it = non_const_container.begin(); ...), attempts to retrieve the
name of the initializer expression resulted in an assert, as conversion
operators don't have a valid IdentifierInfo.

I fixed this by making digThroughConstructors dig through conversion operators
as well.

Differential Revision: https://reviews.llvm.org/D113201

[mlir] DialectConversion: fix OperationLegalizer::isIllegal result when legality callback returns None

OperationLegalizer::isIllegal returns false if operation legality wasn't
registered by user and we expect same behaviour when dynamic legality
callback return None, but instead true was returned.

Differential Revision: https://reviews.llvm.org/D113267

[mlir][Linalg] Fix off-by-one error in conv vector size computation.

Differential Revision: https://reviews.llvm.org/D113877

Revert "[x86] fold vector (X > -1) & Y to shift+andn"

This casued assertion failures:

  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:9446:
  void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode *, llvm::SDNode *):
  Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i))
  && "Cannot use this version of ReplaceAllUsesWith!"' failed.

See comment on the code review.

(Had to update some expectations in test/CodeGen/X86/vselect-zero.ll
manually due to other changes having landed after the reverted one.)

> and (pcmpgt X, -1), Y --> pandn (vsrai X, BitWidth-1), Y
>
> This avoids the -1 constant vector in favor of an arithmetic shift
> instruction if it exists (the ISA is still not complete after all
> these years...).
>
> We catch this pattern late in combining by matching PCMPGT, so it
> should not interfere with more general folds.
>
> Differential Revision: https://reviews.llvm.org/D113603

This reverts commit bf5748a1af0d2f6f9396d9dc6ac89d15de41eee7.

[cmake] use project relative paths when generating ASTNodeAPI.json

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: stephenneuendorffer

Differential Revision: https://reviews.llvm.org/D113664

[flang][CodeGen] Transform `fir.emboxchar` to a sequence of LLVM MLIR

This patch extends the `FIRToLLVMLowering` pass in Flang by adding a
hook to transform `fir.emboxchar` to a sequence of LLVM MLIR
instructions.

This is part of the upstreaming effort from the `fir-dev` branch in [1].

[1] https://github.com/flang-compiler/f18-llvm-project

Differential Revision: https://reviews.llvm.org/D113666

Patch originally written by:
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[DAG] SimplifyVBinOp - add SDLoc() argument

Pass in SDLoc instead of (repeated) local creations in SimplifyVBinOp and scalarizeBinOpOfSplats

[DAG] SimplifyVBinOp - pull out repeated getValueType() call. NFC.

[mlir][linalg][bufferize] Allow non-tensor mappings in BufferizationState

This change makes it possible to set up custom mappings in a PostAnalysisStep. Some users of Comprehensive Bufferize have custom tensor types and it is most convenient to just reuse the same bvm.

Also add some more assertions.

Differential Revision: https://reviews.llvm.org/D113726

[mlir] NFC - Add VectorType::Builder to more easily build vector types from existing ones

Differential Revision: https://reviews.llvm.org/D113875

[mlir][linalg][bufferize] Fix insertion point of result buffers

Differential Revision: https://reviews.llvm.org/D113723

[MachineVerifier] Live interval for a subreg must have subranges

MachineVerifier verified the subranges of a live interval if
they existed, but did not complain if they did not exist.

This patch changes the verifier to complain if there are no
subranges in the live interval for a subreg operand (so long
as MachineRegisterInfo says we should be tracking subreg
liveness for that register). This matches the conditions for
LiveIntervalCalc to create subranges in the first place.

Differential Revision: https://reviews.llvm.org/D112556

[lldb/test] Fix std-module vector tests to work with both kinds of vector layouts

D112976 changed the layout and 0d62e31c45 andjusted the test
expectations to match.

This patch changes the tests to expect both versions, so that one can
run the test suite against older libc++ versions as well.

[AMDGPU][MC][GFX10] Corrected global_atomic_fcmpswap*

Corrected src data size of global_atomic_fcmpswap and global_atomic_fcmpswap_x2 opcodes.

Differential Revision: https://reviews.llvm.org/D113746

[ARM] Fix GatherScatter AddLikeOr condition

Fix a deadlock in __cxa_guard_abort in tsan

hat tip: @The_Whole_Daisy for helping to isolate

Reviewed By: dvyukov, fowles

Differential Revision: https://reviews.llvm.org/D113713

[fir] Add !fir.len type conversion

This patch adds the !fir.len type conversion. The type is converted
to the a 32 bits integer.

This patch is part of the upstreaming effort from fir-dev branch.

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D113658

[AArch64][SVE] Break false dependencies for inactive lanes of FP unary operations

Follow up to D105889, covering instructions using sve_fp_2op_p_zd_HSD:
frintn, frintp, frintm, frintz, frinta, frintx, frinti, frecpx and
fsqrt.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D113485

[Flang] Fixup some comments. NFC

Clarify some comments as discussed here:
https://github.com/flang-compiler/f18-llvm-project/pull/1210

[ELF] Do not try to assign a memory region to a non-allocatable section

Non-allocatable sections are not part of the memory image of the
program, so there is no need to find memory regions for them either
matching properties or handling explicit assignments. The early test
and return help to simplify LinkerScript::findMemoryRegion() a bit.

Differential Revision: https://reviews.llvm.org/D113768

[VE] Fix SDNode user loop after efa896e5f7

Rewriting SDNode user loops broke VEISelLowering (commit efa896e5f7).
This fixes it.

[LV] Rename blockNeedsPredication to blockNeedsPredicationForAnyReason.

The interface is a convenience function to ask if a block requires
predication when widening, but it's important that there are two
separate concepts to consider:
(A) The block was predicated in the original loop.
(B) The block was unpredicated in the original loop, but requires
predication because of tail folding.

In the case of (B) we know that at least one lane of the vector will
be executed, which means we can implementing a load from a uniform address
with a scalar load + splat (D112552). In the case of predication because
of (A), we cannot do this, because the scalar load itself requires
predication.

The name 'blockNeedsPredication' does not make the distinction between
(A) and (B), hence the reason to rename it.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D113392

[mlir][Linalg] Make depthwise convolution naming scheme consistent.

Names should be consistent across all operations otherwise painful bugs will surface.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113762

[clang-tidy] Fix `bugprone-use-after-move` check to also consider moves in constructor initializers

Fixes PR#38187. Constructors are actually already checked,
but only as functions, i.e. the check only looks at the
constructor body and not at the initializers, which misses
the (common) case where constructor parameters are moved
as part of an initializer expression.

One remaining false negative is when both the move //and//
the use-after-move occur in constructor initializers.
This is a lot more difficult to handle, though, because
the `bugprone-use-after-move` check is currently based on
a CFG that only takes the body into account, not the
initializers, so e.g. initialization order would have to
manually be considered. I will file a follow-up issue for
this once PR#38187 is closed.

Reviewed By: carlosgalvezp

Differential Revision: https://reviews.llvm.org/D113708

Revert "[mlir] FlatAffineConstraint parsing for unit tests"

This reverts commit bec488b8183c06c409ddbdd49bbfe2709e407b32.

This commit introduced a layering violation between MLIR libraries.
Reverting for now while discussing on the original review thread.

[DebugInfo] Fix Test Targets in D108261

It appears REQUIRES are needed for tests added in D108261.
This was not caught in the pre-merge tests but in the post-commit tests.
he fix is to move the tests into the target sub-directories.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D113870

Add more test coverage for D77598

Add coverage to demonstrate why including the type of template
parameters is necessary to disambiguate function template
specializations.

Test courtesy of Richard Smith

ast-print: Avoid extra whitespace before function opening brace

ast-dump: Add missing identation of class template specializations

Add test for a case in D77598

This covers the DeclPrinter::VisitCXXRecordDecl caller - though also
demonstrates some possible inconsistency in template specialization
printing.

Re-apply "[mlir] Allow out-of-tree python building from installed MLIR."

Re-applies D111513:
* Adds a full-fledged Python example dialect and tests to the Standalone example (need to do a bit of tweaking in the top level CMake and lit tests to adapt better to if not building with Python enabled).
* Rips out remnants of custom extension building in favor of pybind11_add_module which does the right thing.
* Makes python and extension sources installable (outputs to src/python/${name} in the install tree): Both Python and C++ extension sources get installed as downstreams need all of this in order to build a derived version of the API.
* Exports sources targets (with our properties that make everything work) by converting them to INTERFACE libraries (which have export support), as recommended for the forseeable future by CMake devs. Renames custom properties to start with lower-case letter, as also recommended/required (groan).
* Adds a ROOT_DIR argument to declare_mlir_python_extension since now all C++ sources for an extension must be under the same directory (to line up at install time).
* Downstreams will need to adapt by:

* Remove absolute paths from any SOURCES for declare_mlir_python_extension (I believe all downstreams are just using ${CMAKE_CURRENT_SOURCE_DIR} here, which can just be ommitted). May need to set ROOT_DIR if not relative to the current source directory.
* To allow further downstreams to install/build, will need to make sure that all C++ extension headers are also listed under SOURCES for declare_mlir_python_extension.

This reverts commit 1a6c26d1f52999edbfbf6a978ae3f0e6759ea755.

Reviewed By: stephenneuendorffer

Differential Revision: https://reviews.llvm.org/D113732

[DebugInfo] Fix end_sequence of debug_line in LTO Object

In a LTO build, the `end_sequence` in debug_line table for each compile unit (CU) points the end of text section which merged all CUs. The `end_sequence` needs to point to the end of each CU's range. This bug often causes invalid `debug_line` table in the final `.dSYM` binary for MachO after running `dsymutil` which tries to compensate an out-of-range address of `end_sequence`.
The fix is to sync the line table termination with the range operations that are already maintained in DwarfDebug. When CU or section changes, or nodebug functions appear or module is finished, the prior pending line table is terminated using the last range label. In the MC path where no range is tracked, the old logic is conservatively used to end the line table using the section end symbol.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D108261

[msan] Fix test with GLIBC 2.34

PTHREAD_STACK_MIN is not a constexpr

[llvm] Use range-based for loops with instructions (NFC)

[llvm] Use isa instead of dyn_cast (NFC)

[AMDGPU] Remove selectStoreIntrinsic (NFC)

The last use was removed on Jan 13, 2020 in commit
533d650e947a2f7216a315aeb8c79ac1d4740e5f.

[NFC] Use Optional<ProfileCount> to model invalid counts

ProfileCount could model invalid values, but a user had no indication
that the getCount method could return bogus data. Optional<ProfileCount>
addresses that, because the user must dereference the optional. In
addition, the patch removes concept duplication.

Differential Revision: https://reviews.llvm.org/D113839

[PowerPC] guard update form prepare with non-const increment with option

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D113471

[NFC] Trim trailing whitespace in *.rst