review.tizen.org Git - platform/upstream/llvm.git/log

[flang] Fix: use right symbol for parent component

When constructing the representation for a component reference
to an inherited component, expression semantics make the parent
component references explicit in the DataRef; e.g., base%component
becomes base%parent%grandparent%component if component was
inheritance-associated through two levels.  But expression semantics
was inserting references to the symbol table entries for the
intermediate types, not the symbols for the parent components in
the extended types.  (We didn't notice the distinction until
recently because both symbols have the same name; this only
affects lowering.)  Find and use the right symbols.

Differential Revision: https://reviews.llvm.org/D118746

[libc++] [NFC] s/__referenceable/__can_reference/

The Standard name for this exposition-only concept is _can-reference_.

Differential Revision: https://reviews.llvm.org/D118726

[SLP]Alternate vectorization for cmp instructions.

Added support for alternate ops vectorization of the cmp instructions.
It allows to vectorize either cmp instructions with same/swapped
predicate but different (swapped) operands kinds or cmp instructions
with different predicates and compatible operands kinds.

Differential Revision: https://reviews.llvm.org/D115955

[lldb] [Commands] Implement "thread siginfo"

Differential Revision: https://reviews.llvm.org/D118473

[PowerPC] Fixing buildbod failure ppc64le-lld-multistage-test

[mlir] Support verification order (1/3)

This CL supports adding dependency between traits verifiers and the
dependency will be checked statically.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D115135

[clang][Sparc] Fix __builtin_extract_return_addr etc.

While investigating the failures of `symbolize_pc.cpp` and
`symbolize_pc_inline.cpp` on SPARC (both Solaris and Linux), I noticed that
`__builtin_extract_return_addr` is a no-op in `clang` on all targets, while
`gcc` has non-default implementations for arm, mips, s390, and sparc.

This patch provides the SPARC implementation. For background see
`SparcISelLowering.cpp` (`SparcTargetLowering::LowerReturn_32`), the SPARC
psABI p.3-12, `%i7` and p.3-16/17, and SCD 2.4.1, p.3P-10, `%i7` and
p.3P-15.

Tested (after enabling the `sanitizer_common` tests on SPARC) on
`sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D91607

[RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub.

This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv
and a similar set for sub.

I've included support for narrowing scalar splats that have known
sign/zero bits similar to what was done for MUL_VL.

The conversion to vwadd.vv proceeds in two phases. First we'll form
a vwadd.wv by narrowing one of the operands. Then we'll visit the
vwadd.wv to try to narrow the other operand. This turned out to be
simpler than catching all the cases in one step. The forming of of
vwadd.wv can happen for either operand for add, but only the right
hand side for sub since sub isn't commutable.

An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed
during vector op legalization, but VMV_V_X_VL isn't usually formed
until op legalization when BUILD_VECTORS are handled. This leads to
VWADD_W_VL forming in one DAG combine round, and then a later DAG combine
round sees the VMV_V_X_VL and needs to commute the operands to get the
splat in position. This alone necessitated a VWADD_W_VL combine function
which made forming vwadd.vv in two stages an easy choice.

I've left out trying hard to form vwadd.wx instructions for now. It would
only save an extend in the scalar domain which isn't as interesting.

Might need to review the test coverage a bit. Most of the vwadd.wv
instructions are coming from vXi64 tests on rv64. The tests were
copy pasted from the existing multiply tests.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117954

[NFC] TypePromotion tests

[flang] Switch return to ExtendedValue in AbstractConverter and Bridge

Change the signature of `genExprAddr`, `genExprValue` to return a `fir::ExtendedValue` instead of a simple `mlir::Value`

This patch is a preparation for more lowering to be upstream. It supports D118786 and D118787.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D118785

[libc++] UNSUPPORT GDB pretty printers with Clang 15 (which is what main is)

[flang] Set right "inNamelist" flag

NAMELIST I/O was inconsistent in its choice of which set of I/O modes
to set the "inNamelist" flag. The wrong choice was in the set of modes
that are part of the persistent state of an I/O connection; the right
place is the set of modes that are reinitialized at the beginning of
each I/O statement so that they can be modified by READ/WRITE control
list specifiers and FORMAT control edit descriptors. Fix.

Differential Revision: https://reviews.llvm.org/D118745

[AMDGPU] Use new target MMO flag MONoClobber

This allows us to set the noclobber flag on (the MMO of) a load
instruction instead of on the pointer. This fixes a bug where noclobber
was being applied to all loads from the same pointer, even if some of
them were clobbered.

Differential Revision: https://reviews.llvm.org/D118775

[libc++] [test] Fix a couple of copy-paste comments. NFC.

Add missing include diagnosed by the modules build.

Remove redundant LLVM_HAS_RVALUE_REFERENCE_THIS and LLVM_LVALUE_FUNCTION defines

Now that VS2017 support has been dropped (D114639), the LLVM_HAS_RVALUE_REFERENCE_THIS define is always true and the LLVM_LVALUE_FUNCTION define is always enabled for ref-qualifiers.

This patch proposes we remove the defines and use the qualifiers directly.

Differential Revision: https://reviews.llvm.org/D118609

[Function Specialisation] Fix use after free

This is a fix for a use-after-free found by the address sanitizer when
compiling GCC: https://github.com/llvm/llvm-project/issues/52821

The Function Specialization pass may remove instructions, cached
inside the PredicateBase class, which are later being dereferenced
from the SCCPInstVisitor class. To prevent the dangling references
I am lazily deleting the dead instructions after the Solver has run.

Differential Revision: https://reviews.llvm.org/D118591

[clang][macho] add clang frontend support for emitting macho files with two build version load commands

This patch extends clang frontend to add metadata that can be used to emit macho files with two build version load commands.
It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that.

MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target,
and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native
macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build
compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable
by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support.

Differential Revision: https://reviews.llvm.org/D115415

TrigramIndex.h - move unnecessary StringRef include down to TrigramIndex.cpp

[libc++] Guard bits of 598983d7 against _LIBCPP_HAS_NO_CONCEPTS.

[gn build] Port 256d2533322c

[IRBuilder] Reformat two functions (NFC)

These were using 1-space indentation.

[libc++] [NFC] Normalize some `#ifndef _LIBCPP_HAS_NO_CONCEPTS`.

[libc++] [NFC] s/_LIBCPP_STD_VER > 17 && !defined(_LIBCPP_HAS_NO_CONCEPTS)/!defined(_LIBCPP_HAS_NO_CONCEPTS)/

Per Discord discussion, we're normalizing on a simple `!defined(_LIBCPP_HAS_NO_CONCEPTS)`
so that we can do a big search-and-replace for `!defined(_LIBCPP_HAS_NO_CONCEPTS)`
back into `_LIBCPP_STD_VER > 17` when we're ready to abandon support for concept-syntax-less
compilers.

Differential Revision: https://reviews.llvm.org/D118748

Follow up to 9fd9d56dc6b, avoid a memory leak

Gaps in the basic block number range (from blocks being deleted or folded)
get block-value-tables allocated but never ejected, leading to a memory
leak, currently tripping up the asan buildbots. Fix this up by manually
freeing that memory.

As suggested elsewhere, if these things were owned by a unique_ptr then
cleanup would happen automagically. D118774 should eliminate the need for
this dance.

[ConstraintElimination] Add tests with signed predicates and GEPs.

[PowerPC] Scalar IBM MASS library conversion pass

This patch introduces the conversions from math function calls
to MASS library calls. To resolves calls generated with these conversions, one
need to link libxlopt.a library. This patch is tested on PowerPC Linux and AIX.

Differential: https://reviews.llvm.org/D101759

Reviewer: bmahjour

[libc++] Add CI without experimental features and don't exclude span from the tests

There is no reason for the parts of std::span that don't depend on ranges
to be disabled when ranges aren't provided. Also, to make sure the
"no-experimental-stuff" configuration is tested, add a CI job for it.

Differential Revision: https://reviews.llvm.org/D118740

[llvm-rc] Use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointer is dereferenced immediately, so assert the cast is correct instead of returning nullptr

Signposts.h - move unnecessary StringRef include down to Signposts.cpp

Fix buildbreak introduced in ed2deab5956fea9e8f64ef6020fe0b4e19734ecc

[nfc][regalloc] Make the max inference cutoff configurable

Added a flag to make configurable the number of interferences after
which we 'bail out' and treat a set of intervals as un-evictable. Also
using it on the ML side, as it turns out to be a good control for
compile-time.

With this configurable, we can do a bit of trial and error and see if
bumping it has any effect on heuristic/policy quality.

Differential Revision: https://reviews.llvm.org/D118707

Also document -arch as -arch is mac specific

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D60380

[DebugInfo][InstrRef] Fix a tombstone-in-DenseMap crash from D117877

This is a follow-up to D117877: variable assignments of DBG_VALUE $noreg,
or DBG_INSTR_REFs where no value can be found, are represented by a
DbgValue object with Kind "Undef", explicitly meaning "there is no value".
In D117877 I added a special-case to some assignment accounting faster,
without considering this scenario. It causes variables to be given the
value ValueIDNum::EmptyValue, which then ends up being a DenseMap key. The
DenseMap asserts, because EmptyValue is the tombstone key.

Fix this by handling the assign-undef scenario in the special case, to
match what happens in the general case: the variable has no value if it's
only ever assigned $noreg / undef.

Differential Revision: https://reviews.llvm.org/D118715

[NFC][SimplifyCFG] Merge `FoldTwoEntryPHINode()` into it's only callee

[NFC][SimplifyCFG] `FoldTwoEntryPHINode()`: s/BB/MergeBB/

[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`.

The current `FoldTwoEntryPHINode()` is not quite designed correctly.
It starts from the merge point, and then tries to detect
the 'divergence' point.

Because of that, it is limited to the simple two-predecessor case,
where the PHI completely goes away. but that is rather pessimistic,
and it doesn't make much sense from the costmodel side of things.

For example if there is some other unrelated predecessor of
the merge point, we could split the merge point so that
the then/else blocks first branch to an empty block
and then to the merge point, and then we'd be able to speculate
the then/else code.

But if we'd instead simply start at the divergence point,
and look for the merge point, then we'll just natively support this case.

There's also the fact that `SpeculativelyExecuteBB()` already does
just that, but only if there is a single block to speculate,
and with a much more restrictive cost model.
But that also means we have code duplication.

Now, sadly, while this is as much NFCI as possible,
there is just no way to cleanly migrate to
the proper implementation. The results *are* going to be different
somewhat because of various phase ordering effects and SimplifyCFG
block iteration strategy.

[NFC][SimplifyCFG] Autogenerate checklines in a few tests being affected by upcoming change

[MLIR][Presburger] Simplify checkExplicitRepresentation

This also gets rid of a clang-tidy warning.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D118780

[clang-format] Elide unnecessary braces. NFC.

[MLIR][Presburger] Use `SmallVector` instead of `std::vector` in `getLocalRepr`

Use `SmallVector` instead of `std::vector` in `getLocalRepr` function.
Also, fix the casing of a variable.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D118722

[libc] Populate rtti/eh flags for all targets

[llvm-profgen] Use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointer is dereferenced immediately, so assert the cast is correct instead of returning nullptr

[llvm-profgen] Use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointers are dereferenced immediately, so assert the cast is correct instead of returning nullptr

[DebugInfo][InstrRef][NFC] Use depth-first scope search for variable locs

This patch aims to reduce max-rss from instruction referencing, by avoiding
keeping variable value information in memory for too long. Instead of
computing all the variable values then emitting them to DBG_VALUE
instructions, this patch tries to stream the information out through a
depth first search:
* Make use of the fact LexicalScopes gives a depth-number to each lexical
   scope,
* Produce a map that identifies the last lexical scope to make use of a
   block,
* Enumerate each scope in LexicalScopes' DFS order, solving the variable
   value problem,
* After each scope is processed, look for any blocks that won't be used by
   any other scope, and emit all the variable information to DBG_VALUE
   instructions.

Differential Revision: https://reviews.llvm.org/D118460

[AArch64] Genereate CCMP from And CSel

LLVM has a couple of ways of producing ccmp - either from chains in isel
or from a later ifcvt style pass. This adds a simple DAG combine to
capture more cases, converting and(csel(0, 1, cc0), csel(0, 1, cc1))
into a csel(ccmp(.., cc0)), depending on cc1 (a SUBS in this case).

Differential Revision: https://reviews.llvm.org/D118327

[ArgPromotion] Add test with bitcasts (NFC)

Argument promotion currently doesn't handle these.

[libc] use llvm_update_compile_flags to populate rtti/exception compilation flags

[clang-format] Elide unnecessary braces. NFC.

[clang-format] Factor out loop variable. NFC.

* Break on the size of the used variable Content instead of Lines (even though both should have the same size).

[clang-format] Simplify use of StringRef::substr(). NFC.

[MLIR][Presburger] maybeLocalRepr: rename inEqualityPair -> inequalityPair

[x86] invert a vector select IR canonicalization with a binop identity constant

This is an intentionally limited/different form of D90113.
That patch bravely tries to generalize folds where we pull
a binop into the arms of a select:
N0 + (Cond ? 0 : FVal) --> Cond ? N0 : (N0 + FVal)
...but it is not universally profitable.

This is the inverse of IR canonicalization as discussed in
D113442.

We know that this transform is not entirely profitable even
within x86, so we only handle x86 vector fadd/fsub as a 1st
step. The intent is to prevent AVX512 regressions as mentioned
in D113442.

The plan is to port this to DAGCombiner (so it will eventually
look more like D90113) and add more types/cases in pieces with
many more tests to verify that we are seeing improvements.

Differential Revision: https://reviews.llvm.org/D118644

[clang][NFC] Remove unreachable code

NamespaceDecls are NamedDecls, so NSD can never be non-null in the
else branch. Add a comment about this whole ModuleInternal linkage
concept going away when p1815 is implemented.

Reviewed By: bruno

Differential Revision: https://reviews.llvm.org/D118704

[MLIR][Presburger] Support isSubsetOf in PresburgerSet and IntegerPolyhedron

Also support isEqual in IntegerPolyhedron.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D118778

[lldb] Convert "LLDB" log channel to the new API

[clang-format] Use prefix operator--. NFC.

[clang-format] Use llvm::seq instead of std::iota. NFC.

[DebugInfo][InstrRef][NFC] Free resources at an earlier stage

This patch releases some memory from InstrRefBasedLDV earlier that it would
otherwise. The underlying problem is:
* We store a big table of "live in values for each block",
* We translate that into DBG_VALUE instructions in each block,

And both exist in memory at the same time, which needlessly doubles that
information. The most of what this patch does is: as we progressively
translate live-in information into DBG_VALUEs, we free the variable-value /
machine-value tracking information as we go, which significantly reduces
peak memory.

While I'm here, also add a clear method to wipe variable assignments that
have been accumulated into VLocTracker objects, and turn a DenseMap into
a SmallDenseMap to avoid an initial allocation.

Differential Revision: https://reviews.llvm.org/D118453

[Docs][NFC] Contributing.rst: fix wording

Fix a sentence containing two consecutive 'and'.

[MLIR] PresburgerSet::isIntegerEmpty: address clang-tidy warning

[NFC][libc] Remove unneeded gtest and benchmark configuration

Differential Revision: https://reviews.llvm.org/D118770

[DebugInfo][InstrRef][NFC] Cache some PHI resolutions

Install a cache of DBG_INSTR_REF -> ValueIDNum resolutions, for scenarios
where the value has to be reconstructed from several DBG_PHIs. Whenever
this happens, it's because branch folding + tail duplication has messed
with the SSA form of the program, and we have to solve a mini SSA problem
to find the variable value. This is always called twice, so it makes sense
to cache the value.

This gives a ~0.5% geomean compile-time-performance improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D118455

[MLIR][AffineAnalysis] Fix typo in comment (NFC)

[MLIR] Matrix: support matrix-vector multiplication

This just moves in the implementation from LinearTransform.

Reviewed By: Groverkss, bondhugula

Differential Revision: https://reviews.llvm.org/D118479

Revert "[SLP]Alternate vectorization for cmp instructions."

This reverts commit 83620bd2ad867f706c699d0f2b8be10e43d9f3d7.

It's causing miscompilations, see review comments at
https://reviews.llvm.org/D115955

[LAA] Add Memory dependence remarks.

Adds new optimization remarks when vectorization fails.

More specifically, new remarks are added for following 4 cases:

- Backward dependency
- Backward dependency that prevents Store-to-load forwarding
- Forward dependency that prevents Store-to-load forwarding
- Unknown dependency

It is important to note that only one of the sources
of failures (to vectorize) is reported by the remarks.
This source of failure may not be first in program order.

A regression test has been added to test the following cases:

a) Loop can be vectorized: No optimization remark is emitted
b) Loop can not be vectorized: In this case an optimization
remark will be emitted for one source of failure.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D108371

[DAG] SimplifyDemandedVectorElts - remove KnownZero/KnownUndef from DCI helper wrapper

None of the external users actual touch these (they're purely used internally down the recursive call) - its trivial to add another wrapper if anything ever does want to track known elements.

[scan-build] Fix deadlock at failures in libears/ear.c

We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f2bb3d152e4 in ?? ()
#3  0x00007ffcc5f0cc80 in ?? ()
#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00007f2bb3d144ee in ?? ()
#8  0x746e692f706d742f in ?? ()
#9  0x692d747065637265 in ?? ()
#10 0x2f653631326b3034 in ?? ()
#11 0x646d632e35353532 in ?? ()
#12 0x0000000000000000 in ?? ()
```

I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.

All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.

Reviewed-by: NoQ
Differential Revision: https://reviews.llvm.org/D118439

[libc] Fix automemcpy test by adding memmove configuration

Re-apply 3fab2d138e30, now with a triple added

Was reverted in 1c1b670a73a9 as it broke all non-x86 bots. Original commit
message:

[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out

In certain circumstances with things like autogenerated code and asan, you
can end up with thousands of Values live at the same time, causing a large
working set and a lot of information spilled to the stack. Unfortunately
InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory
when there are many many stack slots. See the reproducer in D116821.

It seems very unlikely that a developer would be able to reason about
hundreds of live named local variables at the same time, so a huge working
set and many stack slots is an indicator that we're likely analysing
autogenerated or instrumented code. In those cases: gracefully degrade by
setting an upper bound on the amount of stack slots to track. This limits
peak memory consumption, at the cost of dropping some variable locations,
but in a rare scenario where it's unlikely someone is actually going to
use them.

In terms of the patch, this adds a cl::opt for max number of stack slots to
track, and has the stack-slot-numbering code optionally return None. That
then filters through a number of code paths, which can then chose to not
track a spill / restore if it touches an untracked spill slot. The added
test checks that we drop variable locations that are on the stack, if we
set the limit to zero.

Differential Revision: https://reviews.llvm.org/D118601

[mlir][vector] Avoid hoisting alloca'ed temporary buffers across AutomaticAllocationScope

This revision avoids incorrect hoisting of alloca'd buffers across an AutomaticAllocationScope boundary.
In the more general case, we will probably need a ParallelScope-like interface.

Differential Revision: https://reviews.llvm.org/D118768

[MSVC] Workaround missing search path for sanitizer headers.

This is to fix build errors "Cannot open include file:
'sanitizer/asan_interface.h'" when building LLVM with MSVC and
LLVM_USE_SANITIZER=Address.

asan_interface.h is not available in MSVC's search path, instead it is
located under %VCToolsInstallDir%/crt/src/sanitizer.
This is an alternate solution to https://reviews.llvm.org/D118159, to
avoid adding all internal crt sources to the header search paths.

Tested with visual studio 2019 v16.9.6 and visual studio 2022 v17.0.5

Reviewed By: aaron.ballman, rnk

Differential Revision: https://reviews.llvm.org/D118624

[mlir] Fully qualify generated C++ code in RewriterGen.cpp

By fully qualifying the use of any types and functions from the mlir namespace, users are not required to add using namespace mlir; into the C++ file including the Tablegen output.

Differential Revision: https://reviews.llvm.org/D118767

Revert "[analyzer] Prevent misuses of -analyze-function"

This reverts commit 9d6a6159730171bc0faf78d7f109d6543f4c93c2.

Exit Code: 1

Command Output (stderr):
--
/scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp:53:21: error: CHECK-EMPTY-NOT: excluded string found in input // CHECK-EMPTY-NOT: Every top-level function was skipped.
                    ^
<stdin>:1:1: note: found here
Every top-level function was skipped.
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input file: <stdin>
Check file: /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        1: Every top-level function was skipped.
not:53     !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  error: no match expected
        2: Pass the -analyzer-display-progress for tracking which functions are analyzed.
>>>>>>

[AVR] Avoid reusing the same variable name (NFC)

Apparently GCC 5.4 (a supported compiler) has a bug where it will
use the "MachineInstr &MI" defined by the range-based for loop
to evaluate the for loop expression. Pick a different variable
name to avoid this.

[analyzer] Prevent misuses of -analyze-function

Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would be also useful newcomers to learn about this.

This patch introduces some logic checking common misuses involving
`-analyze-function`.

Reviewed-By: martong
Differential Revision: https://reviews.llvm.org/D118690

[OpenCL] Test -fdeclare-opencl-builtins with CL3 and CLC++2021

But only test in combination with -finclude-default-header, as the
headerless tests may be dropped soon.

[mlir][async] Add AutomaticAllocationScope to async::ExecuteOp

Differential Revision: https://reviews.llvm.org/D118761

[TypePromotion] Avoid some unnecessary truncs

Check for legal zext 'sinks' before inserting a trunc.

Differential Revision: https://reviews.llvm.org/D115451

[libc++][P2321R2] Add specializations of basic_common_reference and common_type for pair

Add specializations of basic_common_reference and common_type for pair

Reviewed By: Quuxplusone, Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D117506

[AArch64][CodeGen] Always use SVE (when enabled) to lower integer divides

This patch adds custom lowering support for ISD::SDIV and ISD::UDIV
when SVE is enabled, regardless of the minimum SVE vector length. We do
this because NEON simply does not have vector integer divide support, so
we want to take advantage of these instructions in SVE.

As part of this patch I've also simplified LowerToPredicatedOp to avoid
re-asking the same question about whether we should be using SVE for
fixed length vectors. Once we've made the decision to call
LowerToPredicatedOp, then we should simply assert we should be using SVE.

I've updated the 128-bit min SVE vector bits tests here:

CodeGen/AArch64/sve-fixed-length-int-div.ll
CodeGen/AArch64/sve-fixed-length-int-rem.ll

Differential Revision: https://reviews.llvm.org/D117871

[GVN] Replace PointerIntPair with separate pointer & kind fields (NFC).

After adding another value kind in 8a12cae862af, Value * pointers do not
have enough available empty bits to store the kind (e.g. on ARM)

To address this, the patch replaces the PointerIntPair with separate
value and kind fields.

[compiler-rt][Darwin] Add arm64 to simulator platforms

I was looking around and noticed that builtins for iossim, tvossim
and watchossim was missing arm64 builds, while apple's clang
toolchain ship with these. After a bit of searching around it just
seems like these are not listed correctly in CMake to be enabled.

I enabled just arm64 since I saw that Apple clang didn't include
arm64e.

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D118759

[clang-format] Correctly parse C99 digraphs: "<:", ":>", "<%", "%>", "%:", "%:%:".

Fixes https://github.com/llvm/llvm-project/issues/31592.

This commits enables lexing of digraphs in C++11 and onwards.
Enabling them in C++03 is error-prone, as it would unconditionally treat sequences like "<:" as digraphs, even if they are followed by a single colon, e.g. "<::" would be treated as "[:" instead of "<" followed by "::". Lexing in C++11 doesn't have this problem as it looks ahead the following token.
The relevant excerpt from Lexer::LexTokenInternal:
```
        // C++0x [lex.pptoken]p3:
        //  Otherwise, if the next three characters are <:: and the subsequent
        //  character is neither : nor >, the < is treated as a preprocessor
        //  token by itself and not as the first character of the alternative
        //  token <:.
```

Also, note that both clang and gcc turn on digraphs by default (-fdigraphs), so clang-format should match this behaviour.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118706

[GVN] Support load of pointer-select to value-select conversion.

This patch extends the available-value logic to detect loads
of pointer-selects that can be replaced by a value select.

For example, consider the code below:

  loop:
    %sel.phi = phi i32* [ %start, %ph ], [ %sel, %ph ]
    %l = load %ptr
    %l.sel = load %sel.phi
    %sel = select cond, %ptr, %sel.phi
    ...

  exit:
    %res = load %sel
    use(%res)

The load of the pointer phi can be replaced by a load of the start value
outside the loop and a new phi/select chain based on the loaded values,
as illustrated below

    %l.start = load %start
  loop:
    sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %ph ]
    %l = load %ptr
    %sel.prom = select cond, %l, %sel.phi.prom
    ...
  exit:
    use(%sel.prom)

This is a first step towards alllowing vectorizing loops using common libc++
library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs)

    #include <vector>
    #include <algorithm>

    int foo(const std::vector<int> &V) {
        return *std::min_element(V.begin(), V.end());
    }

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D118143

[mlir][vector] Make write permutation lowering work with tensors.

Use type inference when building the TransferWriteOp in the TransferWritePermutationLowering. Previously, the result type has been set to Type() which triggers an assertion if the pattern is used with tensors instead of memrefs.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D118758

[VE] Packed v512f32 binop isel and tests

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118335

[AArch64][SVE] NFC: tidy up isel lowering

Whilst adding legal types <-> register classes for Streaming SVE in
D118561 I noticed the hasSVE predication block set operation actions for
opcodes that may not be legal in Streaming SVE. Move these operations to
the later hasSVE block which has loops over the same types.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D118560

[llvm-reduce] Display all relevant options in -help

Previously the options category given to cl::HideUnrelatedOptions was
local to llvm-reduce.cpp and as a result only options declared in that
file were visible in the -help options listing. This was a bit
unfortunate since there were several useful options declared in other
files. This patch addresses that.

Differential Revision: https://reviews.llvm.org/D118682

[ArgPromotion] Add test for volatile and atomic loads (NFC)

Argument promotion does handle these correctly (by not promoting
them), but there were no tests to ensure this.

[flang][optimizer] support aggregate types inside tuple and record type

This patch allows:
- fir.box type to be a member of tuple<> or fir.type<> types,
- tuple<> type to be a member of tuple<> type.

When a fir.box types are nested in tuple<> or fir.type<>, it is translated
to the struct type of a Fortran runtime descriptor, and not a
pointer to a descriptor. This is because the fir.box is owned by the tuple
or fir.type.

FIR type translation was also flattening nested tuple while lowering to LLVM
dialect types. There does not seem to be a deep reason for doing that
and doing it causes issues in fir.coordinate_of generated on such tuple
(a fir.coordinate_of getting tuple<B, C> in tuple<A, tuple<B, C>>
ended-up lowered to an LLVM GEP getting B).

Differential Revision: https://reviews.llvm.org/D118701

[VE] LEGALAVL and staged VVP legalization

The new LEGALAVL node annotates that the AVL refers to packs of 64bit.
We use a two-stage lowering approach with LEGALAVL:

First, standard SDNodes are translated into illegal VVP layer nodes.
Regardless of source (VP or standard), all VVP nodes have a mask and AVL
parameter. The AVL parameter refers to the element position (just as in
VP intrinsics).

Second, we legalize the AVL usage in VVP layer nodes. If the element
size is < 64bit, the EVL parameter has to be adjusted to refer to packs
of 64bits. We wrap the legalized AVL in a LEGALAVL node to track this.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118321

[AVR][NFC] Make atomics tests easier to read

Use the same mnemonics in the tests that are used in the AtomicLoadOp
pattern ($rd, $rr) but use RR1 instead of $operand. This matches similar
tests in load8.ll.

Differential Revision: https://reviews.llvm.org/D117991

[AVR] Fix atomicrmw result value

This patch fixes the atomicrmw result value to be the value before the
operation instead of the value after the operation. This was a bug, left
as a FIXME in the code (see https://reviews.llvm.org/D97127).

From the LangRef:

> The contents of memory at the location specified by the <pointer>
> operand are atomically read, modified, and written back. The original
> value at the location is returned.

Doing this expansion early allows the register allocator to arrange
registers in such a way that commutable operations are simply swapped
around as needed, which results in shorter code while still being
correct.

Differential Revision: https://reviews.llvm.org/D117725

Bump the trunk major version to 15

[flang] Lower PAUSE statement

Lower the PAUSE statement to a runtime call.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan, schweitz

Differential Revision: https://reviews.llvm.org/D118699

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[docs] Remove hard-coded version numbers from sphinx configs

This updates all the non-runtime project release notes to use the
version number from CMake instead of the hard-coded version numbers
in conf.py.

It also hides warnings about pre-releases when the git suffix
is dropped from the LLVM version in CMake.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D112181

[cmake][NFC] Configuration for libLLVM.so symbol versioning

Symbol versioning can prevent unintented install-time conflicts
between different llvm versions.  Users may need to override this
for particular products (e.g. Julia), but this requires carrying
a source code patch.  This patch moves this ability to a
configuration option.  NFC for existing usage.

Differential Revision: https://reviews.llvm.org/D118672

Add missing includes after LLVMCore header cleanup

- conditionally include header only used for expensive check
- have Core.h always include llvm-c/ErrorHandling.h