review.tizen.org Git - platform/upstream/llvm.git/log

[AArch64][RISCV] Fix expected smulo/umulo test output

These tests were introduced in D109809 which I pushed on behalf of
@tangxingxin1008. I must have not understood the correct arcanist
workflow for this and as such may have locally tested a stale build.

This patch fixes the issue by re-running update_llc_test_checks.py on
all four tests.

[clang][lex] Refactor check for the first file include

This patch refactors the code that checks whether a file has just been included for the first time.

The `HeaderSearch::FirstTimeLexingFile` function is removed and the information is threaded to the original call site from `HeaderSearch::ShouldEnterIncludeFile`. This will make it possible to avoid tracking the number of includes in a follow up patch.

Depends on D114092.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D114093

[clang][lex] NFC: Remove unused HeaderFileInfo member

This patch removes `HeaderFileInfo::isNonDefault`, which is not being used anywhere.

Reviewed By: dexonsmith, vsapsai

Differential Revision: https://reviews.llvm.org/D114092

[clang][deps] Don't emit `-fmodule-map-file=`

During explicit modules build, when all modules are provided via `-fmodule-file=<path>` and implicit modules and implicit module maps are disabled (`-fno-implicit-modules`, `-fno-implicit-module-maps`), we don't need to load the original module map files at all. This patch stops emitting the `-fmodule-map-file=` arguments we don't need, saving some compilation time due to avoiding parsing such module maps and making the command line shorter.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D113473

[AArch64][ARM] Enablement of Cortex-A710 Support

Phabricator review: https://reviews.llvm.org/D113256

[clang][modules] NFC: Fix typo in test name

[docs] Update outdated mentions of lab.llvm.org:8011.

Some places were still referring to the outdated buildbot URL
http://lab.llvm.org:8011. Update those to use the new URL
http://lab.llvm.org/buildbot/#.

[docs] Remove mention of retired smooshlab IRC bot.

the smooshlab bot has been offline for years. Remove it from the list of
IRC bots.

[TargetLowering][RISCV] Fixed a scalable vector issue when lowering [s|u]mul.overflow intrinsics

    Fixed the vector type issue that where we used getVectorNumElements()
    should be replaced by getVectorElementCount() when lowering these
    intrinsics.

    This is similar to D94149

Signed-off-by: Eric Tang <tangxingxin1008@gmail.com>
Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D109809

Remove non-affecting module maps from PCM files.

Problem:
PCM file includes references to all module maps used in compilation which created PCM. This problem leads to PCM-rebuilds in distributed compilations as some module maps could be missing in isolated compilation. (For example in our distributed build system we create a temp folder for every compilation with only modules and headers that are needed for that particular command).

Solution:
Add only affecting module map files to a PCM-file.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D106876

[fir] Add fir.embox conversion

Convert a `fir.embox` operation to LLVM IR dialect.
A `fir.embox` is converted to a sequence of operation that
create, allocate if needed, and populate a descriptor.

Current limitiation: alignment is set by default but should be retrieved in the specific target.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan, awarzynski

Differential Revision: https://reviews.llvm.org/D113756

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[CodeGen][SVE] Add missing isel patterns for vector_reverse

We were missing patterns for vector_reverse of unpacked FP vector
types, as well as all the supported bfloat vectors.

Tests added here:

CodeGen/AArch64/named-vector-shuffle-reverse-sve.ll

Differential Revision: https://reviews.llvm.org/D114089

[SCEV] Reorder operands checks in collectConditions.

The initial two cases require a SCEVConstant as RHS. Pull up the condition
to check and swap SCEVConstants from below. Also remove a redundant
check & swap if RHS is SCEVUnknown.

[SCEV] Add additional guard tests with swapped condition ops.

[NFC][clangd] fix clang-tidy finding on isa_and_nonnull

This is a cleanup of the only llvm-prefer-isa-or-dyn-cast-in-conditionals finding in the clangd code base. This patch was created by automatically applying the fixes from clang-tidy.

Differential Revision: https://reviews.llvm.org/D113899

[PowerPC] fix typos in comments, NFC

[fir] Add fir.constc conversion

Add the codegen for fir.constc.

This patch is part of the upstreaming effort from fir-dev.

Differential Revision: https://reviews.llvm.org/D114063

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

[mlir][Python] Fix generation of accessors for Optional

Previously, in case there was only one `Optional` operand/result within
the list, we would always return `None` from the accessor, e.g., for a
single optional result we would generate:

```
return self.operation.results[0] if len(self.operation.results) > 1 else None
```

But what we really want is to return `None` only if the length of
`results` is smaller than the total number of element groups (i.e.,
the optional operand/result is in fact missing).

This commit also renames a few local variables in the generator to make
the distinction between `isVariadic()` and `isVariableLength()` a bit
more clear.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D113855

Fix Windows build after commit 49682f1.

[mlir][linalg][bufferize] Fix bufferize bug where non-tensor ops are not skipped

`BufferizableOpInterface::bufferize` will only be called on ops that
have tensor operands and/or results.

Differential Revision: https://reviews.llvm.org/D113962

[mlir][linalg][bufferize][NFC] Decouple ComprehensiveBufferize from tensor dialect

Add a new BufferizableOpInterface method `isNotConflicting` that can be used to implement custom analysis rules.

Differential Revision: https://reviews.llvm.org/D113961

[mlir] Convert NamedAttribute to be a class

NamedAttribute is currently represented as an std::pair, but this
creates an extremely clunky .first/.second API. This commit
converts it to a class, with better accessors (getName/getValue)
and also opens the door for more convenient API in the future.

Differential Revision: https://reviews.llvm.org/D113956

[libc++] [test] Add "robust_re_difference_type.compile.pass.cpp" for all the algorithms.

Also, mark these tests as compile-only. They actually are safe to run — notice that
the code "runs" at constexpr-time in C++20, without error — because both of the
input ranges are entirely filled with nullptr, so no matter how you shuffle the
elements, they remain sorted and partitioned and heapified and everything.
But there's no real reason to run them at runtime, so let's just avoid the distraction.

Test cases that fail in trunk right now are commented out with `TODO FIXME`.

Differential Revision: https://reviews.llvm.org/D113906

[X86][Driver] Add X86 target option to avoid fail to other targets. NFC

[X86][ABI] Do not return float/double from x87 registers when x87 is disabled

This is aligned with GCC's behavior.
Also, alias `-mno-fp-ret-in-387` to `-mno-x87`, by which we can fix pr51498.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112143

[gn build] Port 92832e4889ae

[libc++] Enable <atomic> when threads are disabled

std::atomic is, for the most part, just a thin veneer on top of compiler
builtins. Hence, it should be available even when threads are not available
on the system, and in fact there has been requests for such support.

This patch:
- Moves __libcpp_thread_poll_with_backoff to its own header so it can
  be used in <atomic> when threads are disabled.
- Adds a dummy backoff policy for atomic polling that doesn't know about
  threads.
- Adjusts the <atomic> feature-test macros so they are provided even when
  threads are disabled.
- Runs the <atomic> tests when threads are disabled.

rdar://77873569

Differential Revision: https://reviews.llvm.org/D114109

[gn build] Port 49682f14bf3f

tsan: remove quadratic behavior in pthread_join

pthread_join needs to map pthread_t of the joined thread to our Tid.
Currently we do this with linear search over all threads.
This has quadratic complexity and becomes much worse with the new
tsan runtime, which memorizes all threads that ever existed.

To resolve this add a hash map of live threads only (that are still
associated with pthread_t) and use it for the mapping.

With the new tsan runtime some programs spent 1/3 of time in this mapping.
After this change the mapping disappears from profiles.

Depends on D113996.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D113997

[SPIR-V] Add translator tool

Add a tool for constructing commands for translating LLVM IR to
SPIR-V.

Used by HIPSPV tool chain (D110618).

Reviewed By: bader

Differential Revision: https://reviews.llvm.org/D112404

[clang] Use range-based for loops with llvm::reverse (NFC)

[gn build] Port 24d1673c8b9b

[X86] Add -mskip-rax-setup support to align with GCC

AMD64 ABI mandates caller to specify the number of used SSE registers
when passing variable arguments.
GCC also provides option -mskip-rax-setup to skip the setup of rax when
SSE is disabled. This helps to reduce the code size, see pr23258.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112413

[NFC][llvm] Inclusive language: remove uses of sanity in llvm/lib/ExecutionEngine/

Reworded and removed code comments to avoid using `sanity check` and `sanity
test`.

[llvm-tblgen][RISCV] Make llvm-tblgen RISCVCompressInstEmitter to be common infra across different targets

Not only RISCV but also other target such as CSKY, there are compressed instructions mixed with normal instructions.
To reuse the basic infra to compress/uncompress and predict instruction, we need reconstruct the RISCVCompressInstEmitter
and make it more general and suitable for other target.

Differential Revision: https://reviews.llvm.org/D113475

[sanitizer] Fix DenseMap for compiler-rt

Depends on D114047.

Differential Revision: https://reviews.llvm.org/D114048

[NFC][sanitizer] Fix headers of DenseMap

Depends on D114046.

Differential Revision: https://reviews.llvm.org/D114047

[NFC][sanitizer] Clang format copied code

Depends on D114045.

Differential Revision: https://reviews.llvm.org/D114046

[NFC][sanitizer] Add unchanged DenseMap

It's just a copy even without reformatting.

Reviewed By: dvyukov, melver

Differential Revision: https://reviews.llvm.org/D114045

[NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target

Reworded removed code comments that contain `sanity check` and `sanity
test`.

PR52537: When performing a no-op TreeTransform of a rewritten binary
operator, mark any functions it calls as referenced.

[Driver][Android] Remove unneeded isNoExecStackDefault

ld.lld used by Android ignores .note.GNU-stack and defaults to noexecstack,
so the `-z noexecstack` linker option is unneeded.

The `--noexecstack` assembler option is unneeded because AsmPrinter.cpp
prints `.section .note.GNU-stack,"",@progbits` (when `llvm.init.trampoline` is unused),
so the assembler won't synthesize an executable .note.GNU-stack.

Reviewed By: danalbert

Differential Revision: https://reviews.llvm.org/D113840

[NFC][sanitizer] Fix veradic-macro warning in RAW_CHECK

[mlir][sparse] refine lexicographic insertion to any tensor

First version was vectors only. With some clever "path" insertion,
we now support any d-dimensional tensor. Up next: reductions too

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D114024

Revert "[NFC] Refactor symbol table parsing."

This reverts commit 951b107eedab1829f18049443f03339dbb0db165.

Buildbots were failing, there is a deadlock in /Users/gclayton/Documents/src/llvm/clean/llvm-project/lldb/test/Shell/SymbolFile/DWARF/DW_AT_range-DW_FORM_sec_offset.s when ELF files try to relocate things.

Revert "Revert "Make it possible for lldb to launch a remote binary with no local file.""

This reverts commit dd5505a8f2c75a903ec944b6e46aed2042610673.

I picked the wrong class for the test, should have been GDBRemoteTestBase.

[sanitizer] Add a few of type_traits tools

For D114047

[Coroutine] Warn deprecated 'std::experimental::coro' uses

Since we've decided the to not support std::experimental::coroutine*, we
should tell the user they need to update.

Reviewed By: Quuxplusone, ldionne, Mordante

Differential Revision: https://reviews.llvm.org/D113977

[mlir][tosa] Revert add-0 canonicalization for floating-point

Floating point optimization can produce incorrect numerical resutls for
-0.0 + 0.0 optimization as result needs to be -0.0.

Reviewed By: eric-k256

Differential Revision: https://reviews.llvm.org/D114127

[X86][AMX] Don't emit tilerelease for old AMX instrisic.

We should avoid mixing old AMX instrinsic with new AMX intrinsic. For
old AMX intrinsic, user is responsible for invoking tile release. This
patch is to check if there is any tile config generated by compiler. If
so it emit tilerelease instruction, otherwise it don't emit the
instruction.

Differential Revision: https://reviews.llvm.org/D114066

Autogen a test for ease of update

[X86] add 3 missing intrinsics: _mm_(mask/maskz)_cvtpbh_ps

Reviewed By: craig.topper, pengfei

Differential Revision: https://reviews.llvm.org/D114059

[flang] Add a semantics test for co_sum

Test a range of acceptable forms of co_sum calls, including
combinations of keyword and non-keyword actual arguments of
numeric types. Also test that several invalid forms of
co_sum call generate the correct error messages.

Reviewed By: kiranchandramohan, ktras

Differential Revision: https://reviews.llvm.org/D113076

[AMDGPU] Update GFX10 memory model to account for MALL

Document memory attached last level (MALL) cache added in GFX10.3.

Reviewed By: t-tye

Differential Revision: https://reviews.llvm.org/D114076

[flang] Fix INQUIRE(PAD=) and (POSITION=) for predefined units

The predefined units were not being initialized with FORM='FORMATTED',
so INQUIRE(PAD=) was failing if no I/O had already been done.

INQUIRE(POSITION=) was returning 'REWIND' on stdin/stdout (which
is somewhat defensible from the definition, and is what Intel Fortran
does), but most implementations return 'ASIS'. Change the runtime
to return 'REWIND' only for positionable external files, but 'ASIS'
for terminals, sockets, &c.

Differential Revision: https://reviews.llvm.org/D114028

[lld-macho] Add warn flags to enable/disable warnings on -install_name

ld64 doesn't warn on builds using `-install_name` if it's a bundle. But, the
current warning is nice to have because `install_name` only works with dylib.
To prevent an overflow of warnings in build logs and have parity with ld64,
create a `--warn-dylib-install-name` and `--warn-no-dylib-install-name` flag
that enables this LLD specific warning.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D113534

LiteralSupport: Don't assert() on invalid input

When using clangd, it's possible to trigger assertions in
NumericLiteralParser and CharLiteralParser when switching git branches.
This commit removes the initial asserts on invalid input and replaces
those asserts with the error handling mechanism from those respective
classes instead. This allows clangd to gracefully recover without
crashing.

See https://github.com/clangd/clangd/issues/888 for more information
on the clangd crashes.

[compiler-rt][asan] Re-add `self`

We ran into errors where this wasn't defined in Fuchsia's asan implementation.

Revert "[sanitizer] Add a few of type_traits tools"

Does not work with GCC

This reverts commit a82ee2be9c6378cd34deb3ab002d78acd2b04ff3.

[MLIR][Docs] Fix link syntax in Rationale.md

[LegalizeTypes] Further limit expansion of CTTZ during type promotion.

Don't expand CTTZ if CTPOP or CTLZ is supported on the promoted type.
We have special handling for CTTZ expansion to use those ops with a
small conversion. The setup for that doesn't generate extra code or
large constants so we don't gain anything from expanding early and we
make CTTZ_ZERO_UNDEF codegen worse.

Follow up from post commit feedback on D112268. We don't seem to have
any in tree tests that care about this.

[NFC] Refactor symbol table parsing.

Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.

Differential Revision: https://reviews.llvm.org/D113965

[sanitizer] Add a few of type_traits tools

For D114047

[mlir][tosa] Fixed shape inference for tosa.transpose_conv2d

Transpose conv2d shape inference was incorrect, tests did not properly validate
that the shape inference was executing. Corrected shape inference, and extended
tests to actually execute.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D114026

Add Android test case for -Wpartial-availability. Also update Android availability tests to match on the whole string, so we can distinguish between "Android 16" and "Android 16.0.0" at the end of warning messages.

Reviewed By: danalbert, srhines

Differential Revision: https://reviews.llvm.org/D114036

[OpenMP][libomp] Enable HWLOC topology detection of multiple CPU kinds

Teach the HWLOC topology method how to detect Atom and Core
types so hybrid CPUs are properly detected and represented when using
the HWLOC topology method.

Differential Revision: https://reviews.llvm.org/D112270

[mlir] Refactor AbstractOperation and OperationName

The current implementation is quite clunky; OperationName stores either an Identifier
or an AbstractOperation that corresponds to an operation. This has several problems:

* OperationNames created before and after an operation are registered are different
* Accessing the identifier name/dialect/etc. from an OperationName are overly branchy
- they need to dyn_cast a PointerUnion to check the state

This commit refactors this such that we create a single information struct for every
operation name, even operations that aren't registered yet. When an OperationName is
created for an unregistered operation, we only populate the name field. When the
operation is registered, we populate the remaining fields. With this we now have two
new classes: OperationName and RegisteredOperationName. These both point to the
same underlying operation information struct, but only RegisteredOperationName can
assume that the operation is actually registered. This leads to a much cleaner API, and
we can also move some AbstractOperation functionality directly to OperationName.

Differential Revision: https://reviews.llvm.org/D114049

[OpenMP][libomp] Improve Windows Processor Group handling within topology

The current implementation of Windows Processor Groups has
a separate topology method to handle them. This patch deprecates
that specific method and uses the regular CPUID topology
method by default and inserts the Windows Processor Group objects
in the topology manually.

Notes:
* The preference for processor groups is lowered to a value less than
  socket so that the user will see sockets in the KMP_AFFINITY=verbose
  output instead of processor groups when sockets=processor groups.
* The topology's capacity is modified to handle additional topology layers
  without the need for reallocation.
* If a user asks for a granularity setting that is "above" the processor
  group layer, then the granularity is adjusted "down" to the processor
  group since this is the coarsest layer available for threads.

Differential Revision: https://reviews.llvm.org/D112273

[OpenMP][libomp] Add support for offline CPUs in Linux

If some CPUs are offline, then make sure they are not included in the
fullMask even if norespect is given to KMP_AFFINITY.

Differential Revision: https://reviews.llvm.org/D112274

[lld-macho][nfc] Factor-out NFC changes from main __eh_frame diff

In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here.

Differential Revision: https://reviews.llvm.org/D114017

[mlir] Improve documentation of shape dialect

Add small example of usage (brief which will be further refined).

[clang] Allocate 2 bits to store the constexpr specifier kind when serializing

Now that consteval and constinit are possible values, 1 bit
is no longer enough.

Fixes https://github.com/clangd/clangd/issues/887

Differential Revision: https://reviews.llvm.org/D111971

[mlir] Fix wrong variable name in Linalg OpDSL

The name seems to have been left over from a renaming effort on an unexercised
codepaths that are difficult to catch in Python. Fix it and add a test that
exercises the codepath.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D114004

[clang-format][NFC] Add a default value to parseBlock()

Differential Revision: https://reviews.llvm.org/D114073

[runtimes][NFC] Remove filenames at the top of the license notice

We've stopped doing it in libc++ for a while now because these names
would end up rotting as we move things around and copy/paste stuff.
This cleans up all the existing files so as to stop the spreading
as people copy-paste headers around.

[OpenMP][libomp] Allow users to specify KMP_HW_SUBSET in any order

Remove restriction forcing users to specify the KMP_HW_SUBSET value in
topology order. This patch sorts the user KMP_HW_SUBSET value before
trying to apply it. For example: 1s,4c,2t is equivalent to 2t,1s,4c

Differential Revision: https://reviews.llvm.org/D112027

[lldb] remove usage of distutils, fix python path on debian/ubuntu

distutils is deprecated and will be removed, so we shouldn't be
using it.

We were using it to compute LLDB_PYTHON_RELATIVE_PATH.

Discussing a similar issue
[at python.org](https://bugs.python.org/issue41282), Filipe Laíns said:

    If you are relying on the value of distutils.sysconfig.get_python_lib()
    as you shown in your system, you probably don't want to. That
    directory (dist-packages) should be for Debian provided packages
    only, so moving to sysconfig.get_path() would be a good thing,
    as it has the correct value for user installed packages on your
    system.

So I propose using a relative path from `sys.prefix` to
`sysconfig.get_path("platlib")` instead.

On Mac and windows, this results in the same paths as we had before,
which are `lib/python3.9/site-packages` and `Lib\site-packages`,
respectively.

On ubuntu however, this will change the path from
`lib/python3/dist-packages` to `lib/python3.9/site-packages`.

This change seems to be correct, as Filipe said above, `dist-packages`
belongs to the distribution, not us.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D114106

[libc++][NFC] Re-indent and re-order includes in uses_alloc_types.h

[NFC] Update comments to refer to unique_ptr instead of raw pointers.

[clang] Fix typo in 36873fb768dbe

[clang] Try to fix test more after ae98182cf7341181e

We need to use the td-based marshalling instead of doing this manually,
else the setting gets lost on the way to codegen in most build configs.

[OpenMP][libomp][NFC] Remove non-ASCII apostrophe in comment

[SCEVAA] Avoid forming malformed pointer diff expressions

This solves the same crash as in D104503, but with a different approach.

The test case test_non_dom demonstrates a case where scev-aa crashes today. (If exercised either by -eval-aa or -licm.) The basic problem is that SCEV-AA expects to be able to compute a pointer difference between two SCEVs for any two pair of pointers we do an alias query on. For (valid, but out of scope) reasons, we can end up asking whether expressions in different sub-loops can alias each other. This results in a subtraction expression being formed where neither operand dominates the other.

The approach this patch takes is to leverage the "defining scope" notion we introduced for flag semantics to detect and disallow the formation of the problematic SCEV. This ends up being relatively straight forward on that new infrastructure. This change does hint that we should probably be verifying a similar property for all SCEVs somewhere, but I'll leave that to a follow on change.

Differential Revision: D114112

Fix -Wparentheses warnings. NFC.

[SystemZ] [Sanitizer] Bugfixes in internal_clone().

The __flags variable needs to be of type 'long' in order to get sign extended
properly.

internal_clone() uses an svc (Supervisor Call) directly (as opposed to
internal_syscall), and therefore needs to take care to set errno and return
-1 as needed.

Review: Ulrich Weigand

[X86] splitVector - only extract lower half subvector from splats

If we're splitting a source vector that is a splat (with no undefs), just extract (for free) the lower half subvector and use it for both halfs.

[clang] Try to fix test after ae98182cf7341181e

The test assumes an integrated assembler, so use a triple where
that's the default.

[lldb] Port PlatformWindows, PlatformOpenBSD and PlatformRemoteGDBServer to GetSupportedArchitectures

[clang] Address review comments on https://reviews.llvm.org/D113707

- Drop a needless `l` size suffix on a mov instruction in AT&T mode
- Move varying bits of test flags to front
- Add a comment about MS mode test

[libc] fix strtof/d/ld NaN parsing

Fix the fact that previously strtof/d/ld would only accept a NaN as
having parentheses if the thing in the parentheses was a valid number,
now it will accept any combination of letters and numbers, but will only
put valid numbers in the mantissa.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D113790

Fix MSVC signed/unsigned mismatch warning. NFC.

[X86] LowerRotate - improve vXi8 rotate-by-scalar lowering with direct use of (extended) shift-by-scalar helpers.

If we're rotating vXi8 by a splatted amount, then unpack to vXi16, perform a SHL by the (extended) scalar, and then pack the results.

This is a vector equivalent to the "rotl(x,y) -> (((aext(x) << bw) | zext(x)) << (y & (bw-1))) >> bw" style expansion we do for scalars in LowerFunnelShift.

I think we can usefully use this for other vector types and vector funnel-shifts in the future, depending how we expand beyond D113192 for matching rotations/funnel-shifts for more type/ops.

[OpenMP] Add version macro support for 5.1 and 5.2

Differential Revision: https://reviews.llvm.org/D114102

[InstCombine] Generalize complex OR patterns to AND

For every pattern with only NOT, OR, and AND operations there is
always a symmetrical attern with AND and OR swapped.

This adds 2 transformations: https://reviews.llvm.org/D113526

```
(~(a & b) | c) & (~(a & c) | b) --> ~((b ^ c) & a)
(~(a & b) | c) & ~(a & c) --> ~((b | c) & a)
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %not1 = xor i4 %and1, 15
  %and2 = and i4 %a, %c
  %not2 = xor i4 %and2, 15
  %or = or i4 %not2, %b
  %r = and i4 %or, %not1
  ret i4 %r
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %or = or i4 %b, %c
  %and = and i4 %or, %a
  %r = xor i4 %and, 15
  ret i4 %r
}
Transformation seems to be correct!

----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %a, %b
  %not1 = xor i4 %and1, 15
  %or1 = or i4 %not1, %c
  %and2 = and i4 %a, %c
  %not2 = xor i4 %and2, 15
  %or2 = or i4 %not2, %b
  %and3 = and i4 %or1, %or2
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %and = and i4 %xor, %a
  %not = xor i4 %and, 15
  ret i4 %not
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D113526

[llvm-objcopy] Fix some comment typos

[clang] Make -masm=intel affect inline asm style

With this,

  void f() {  __asm__("mov eax, ebx"); }

now compiles with clang with -masm=intel.

This matches gcc.

The flag is not accepted in clang-cl mode. It has no effect on
MSVC-style `__asm {}` blocks, which are unconditionally in intel
mode both before and after this change.

One difference to gcc is that in clang, inline asm strings are
"local" while they're "global" in gcc. Building the following with
-masm=intel works with clang, but not with gcc where the ".att_syntax"
from the 2nd __asm__() is in effect until file end (or until a
".intel_syntax" somewhere later in the file):

  __asm__("mov eax, ebx");
  __asm__(".att_syntax\nmovl %ebx, %eax");
  __asm__("mov eax, ebx");

This also updates clang's intrinsic headers to work both in
-masm=att (the default) and -masm=intel modes.
The official solution for this according to "Multiple assembler dialects in asm
templates" in gcc docs->Extensions->Inline Assembly->Extended Asm
is to write every inline asm snippet twice:

    bt{l %[Offset],%[Base] | %[Base],%[Offset]}

This works in LLVM after D113932 and D113894, so use that.

(Just putting `.att_syntax` at the start of the snippet works in some but not
all cases: When LLVM interpolates in parameters like `%0`, it uses at&t or
intel syntax according to the inline asm snippet's flavor, so the `.att_syntax`
within the snippet happens to late: The interpolated-in parameter is already
in intel style, and then won't parse in the switched `.att_syntax`.)

It might be nice to invent a `#pragma clang asm_dialect push "att"` /
`#pragma clang asm_dialect pop` to be able to force asm style per snippet,
so that the inline asm string doesn't contain the same code in two variants,
but let's leave that for a follow-up.

Fixes PR21401 and PR20241.

Differential Revision: https://reviews.llvm.org/D113707

[llvm-objcopy][MachO] Add llvm-strip support for newer load commands

Previously llvm-strip would fail because of unknown commands.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50044

Differential Revision: https://reviews.llvm.org/D113734

[libc++] Refactor tests for trivially copyable atomics

- Replace irrelevant synopsis by a comment
- Use a .verify.cpp test instead of .compile.fail.cpp
- Remove unnecessary includes in one of the tests (was a copy-paste error)

Differential Revision: https://reviews.llvm.org/D114094

[x86/asm] Let EmitMSInlineAsmStr() handle variants too

This is preparation for D113707, where I want to make `-masm=intel`
emit `asm inteldialect` instructions.

`{movq %rbx, %rax|mov rax, rbx}` is supposed to evaluate to the bit
between { and | for att and to the bit between | and } for intel.
Since intel will become `asm inteldialect`, which alls EmitMSInlineAsmStr(),
EmitMSInlineAsmStr() has to support variants as well.

(clang translates `{...|...}` to `$(...$|...$)`. I'm not sure why
it doesn't just send along only the first `...` or the second `...`
to LLVM, but given the notes in PR23933 let's not do a big
reorganization in this codepath.)

Differential Revision: https://reviews.llvm.org/D113932

[RISCV] Lower vector CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF by converting to FP and extracting the exponent.

If we have a large enough floating point type that can exactly
represent the integer value, we can convert the value to FP and
use the exponent to calculate the leading/trailing zeros.

The exponent will contain log2 of the value plus the exponent bias.
We can then remove the bias and convert from log2 to leading/trailing
zeros.

This doesn't work for zero since the exponent of zero is zero so we
can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need
a value for zero we can use a vmseq and a vmerge to handle it.

We need to be careful to make sure the floating point type is legal.
If it isn't we'll continue using the integer expansion. We could split the vector
and concatenate the results but that needs some additional work and evaluation.

Differential Revision: https://reviews.llvm.org/D111904