review.tizen.org Git - platform/upstream/llvm.git/log

[ARM] Move double vector insert patterns using vins to DAG combine

This removes the existing patterns for inserting two lanes into an
f16/i16 vector register using VINS, instead using a DAG combine to
pattern match the same code sequences. The tablegen patterns were
already on the large side (foreach LANE = [0, 2, 4, 6]) and were not
handling all the cases they could. Moving that to a DAG combine, whilst
not less code, allows us to better control and expand the selection of
VINSs. Additionally this allows us to remove the AddedComplexity on
VCVTT.

The extra trick that this has learned in the process is to move two
adjacent lanes using a single f32 vmov, allowing some extra
inefficiencies to be removed.

Differenial Revision: https://reviews.llvm.org/D96876

[WebAssembly] call_indirect issues table number relocs

If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via `TABLE_NUMBER`
relocations against a table symbol.

Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.

We abuse the used-in-reloc flag on symbols to indicate which tables
should end up in the symbol table. We do this because unfortunately
older wasm-ld will carp if it see a table symbol.

Differential Revision: https://reviews.llvm.org/D90948

[clang][CodeComplete] Ensure there are no crashes when completing with ParenListExprs as LHS

Differential Revision: https://reviews.llvm.org/D96950

[clang][cli] Pass '-Wspir-compat' to cc1 from driver

This patch moves the creation of the '-Wspir-compat' argument from cc1 to the driver.

Without this change, generating command line arguments from `CompilerInvocation` cannot be done reliably: there's no way to distinguish whether '-Wspir-compat' was passed to cc1 on the command line (should be generated), or if it was created within `CompilerInvocation::CreateFromArgs` (should not be generated).

This is also in line with how other '-W' flags are handled.

(This was introduced in D21567.)

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D97041

[clang][cli] Stop creating '-Wno-stdlibcxx-not-found' in cc1

This patch stops creating the '-Wno-stdlibcxx-not-found' argument in `CompilerInvocation::CreateFromArgs`.

The code was added in 2e7ab55e657f (a follow-up to D48297). However, D61963 removes relevant tests and starts explicitly passing '-Wno-stdlibcxx-not-found' to the driver. I think it's fair to assume this is a dead code.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D97042

[mlir] Mark std.subview as NoSideEffect

Differential Revision: https://reviews.llvm.org/D96951

[NFC][llvm-dwarfdump] Don't calculate unnecessary stats

Small optimization of the code -- No need to calculate any stats
for NULL nodes, and also no need to call the collectStatsForDie()
if it is the CU itself.

Differential Revision: https://reviews.llvm.org/D96871

[InstrProfiling] Fix instrprof-gc-sections.c test

After D97110 __llvm_prof_cnts has the nobits type so it's empty.

[mlir] Export CUDA and Vulkan runtime wrappers on Windows

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97140

[AArch64][GlobalISel] Fix <16 x s8> G_DUP regbankselect to assign source to gpr.

We can only select this type if the source is on GPR, not FPR.

[CodeGen] Use range-based for loops (NFC)

[llvm] Fix header guards (NFC)

Identified with llvm-header-guard.

[Analysis] Use ListSeparator (NFC)

Revert "[sanitizers] Pass CMAKE_C_FLAGS into TSan buildgo script"

This reverts commit ac6c13bfc49f2d67a77144c839ecf49e48cb994c.
Breaks building with PGO, see https://reviews.llvm.org/D96762#2574009

[mlir] Add simple jupyter kernel

Simple jupyter kernel using mlir-opt and reproducer to run passes.
Useful for local experimentation & generating examples. The export to
markdown from here is not immediately useful nor did I define a
CodeMirror synax to make the HTML output prettier. It only supports one
level of history (e.g., `_`) as I was mostly using with expanding a
pipeline one pass at a time and so was all I needed.

I placed this in utils directory next to editor & debugger utils.

Differential Revision: https://reviews.llvm.org/D95742

[InstrProfiling] Use ELF section groups for counters, data and values

__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.

The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this allows
allows linker to discard counters, data and values associated with that
function symbol as well.

Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.

Differential Revision: https://reviews.llvm.org/D96757

[KnownBits][RISCV] Improve known bits for srem.

The result must be less than or equal to the LHS side, so any
leading zeros in the left hand side must also exist in the result.
This is stronger than the previous behavior where we only considered
the sign bit being 0.

The affected test case used the sign bit being known 0 to change
a sign extend to a zero extend pre type legalization. After type
legalization the types were promoted to i64, but we no longer
knew bit 31 was zero. This shifts are are the equivalent of an
AND with 0xffffffff or zext_inreg X, i32. This patch allows us to
see that bit 31 is zero and remove the shifts.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97124

Implement simple type polymorphism for linalg named ops.

* It was decided that this was the end of the line for the existing custom tc parser/generator, and this is the first step to replacing it with a declarative format that maps well to mathy source languages.
* One such source language is implemented here: https://github.com/stellaraccident/mlir-linalgpy/blob/main/samples/mm.py
* In fact, this is the exact source of the declarative `polymorphic_matmul` in this change.
* I am working separately to clean this python implementation up and add it to MLIR (probably as `mlir.tools.linalg_opgen` or equiv). The scope of the python side is greater than just generating named ops: the ops are callable and directly emit `linalg.generic` ops fully dynamically, and this is intended to be a feature for frontends like npcomp to define custom linear algebra ops at runtime.
* There is more work required to handle full type polymorphism, especially with respect to integer formulations, since they require more specificity wrt types.
* Followups to this change will bring the new generator to feature parity with the current one and delete the current. Roughly, this involves adding support for interface declarations and attribute symbol bindings.

Differential Revision: https://reviews.llvm.org/D97135

[X86] Add vector support to sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1+1) fold.

[X86] Replace explicit constant handling in sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1+1) fold. NFCI.

NFC cleanup before adding vector support - rely on the SelectionDAG to handle everything for us.

[X86] Regenerate sub.ll test

[X86] Add 'sub C1, (xor X, C1) -> add (xor X, ~C2), C1+1' tests

This is also in sub.ll but that's for a specific i686 pattern - this adds x86_64 and vector tests

[X86] Add common CHECK check-prefix to sub combine tests

Revert "[lldb-vscode] Emit the breakpoint changed event on location resolved"

This reverts commit 1f21d488bd79a06c9cf405cc5db985fcd71c4f70.

[lldb] [docs] Update platform support status

Update supported features on FreeBSD, and supported platform list
on FreeBSD, Linux and NetBSD.

Differential Revision: https://reviews.llvm.org/D97114

[LLDB] [docs] Update the list of supported architectures on Windows

Differential Revision: https://reviews.llvm.org/D96840

Reapply "[lldb/test] Automatically find debug servers to test"

This reapplies 7df4eaaa93/D96202, which was reverted due to issues on
windows. These were caused by problems in the computation of the liblldb
directory, which was fixed by D96779.

The original commit message was:
Our test configuration logic assumes that the tests can be run either
with debugserver or with lldb-server. This is not entirely correct,
since lldb server has two "personalities" (platform server and debug
server) and debugserver is only a replacement for the latter.

A consequence of this is that it's not possible to test the platform
behavior of lldb-server on macos, as it is not possible to get a hold of
the lldb-server binary.

One solution to that would be to duplicate the server configuration
logic to be able to specify both executables. However, that seems
excessively redundant.

A well-behaved lldb should be able to find the debug server on its own,
and testing lldb with a different (lldb-|debug)server does not seem very
useful (even in the out-of-tree debugserver setup, we copy the server
into the build tree to make it appear "real").

Therefore, this patch deletes the configuration altogether and changes
the low-level server retrieval functions to be able to both lldb-server
and debugserver paths. They do this by consulting the "support
executable" directory of the lldb under test.

Differential Revision: https://reviews.llvm.org/D96202

[SelectionDAG][RISCV] Teach ComputeNumSignBits to handle SREM.

This also removes a pattern from RISCV that is no longer needed
since the sexti32 on the LHS of the srem in the pattern implies
the result is sign extended so the sign_extend_inreg should be
removed in DAG combine now.

Reviewed By: luismarques, RKSimon

Differential Revision: https://reviews.llvm.org/D97133

[X86][AVX] canonicalizeLaneShuffleWithRepeatedOps - remove unnecessary BITCASTs.

In conjunction with the 'vperm2x128(bitcast(x),bitcast(y),c) -> bitcast(vperm2x128(x,y,c))' fold in combineTargetShuffle, this should remove any unnecessary bitcasts around vperm2x128 lane shuffles.

Revert "Make sure the interpreter module was loaded before making checks against it"

This reverts commit a83a825e9902b54b315870e9ed85723525208f09.

[NFC] Remove redundant word in comment

Differential Revision: https://reviews.llvm.org/D97157

[lldb-vscode] Emit the breakpoint changed event on location resolved

VSCode was not being informed whenever a location had been resolved (after being initated as non-resolved), so even though it was actually resolved, the IDE would show a hollow dot (instead of a red dot) because it didn't know about the change.

Differential Revision: https://reviews.llvm.org/D96680

[Loads] Add optimized FindAvailableLoadedValue() overload (NFCI)

FindAvailableLoadedValue() accepts an iterator by reference. If no
available value is found, then the iterator will either be left
at a clobbering instruction or the beginning of the basic block.
This allows using FindAvailableLoadedValue() across multiple blocks.

If this functionality is not needed, as is the case in InstCombine,
then we can use a much more efficient implementation: First try
to find an available value, and only perform clobber checks if
we actually found one. As this function only looks at a very small
number of instructions (6 by default) and usually doesn't find an
available value, this saves many expensive alias analysis queries.

[IR] restrict vector reduction intrinsic types

The arguments in all cases should be vectors of exactly one of integer or FP.

All of the tests currently pass the verifier because we check for any vector
type regardless of the type of reduction.
This obviously can't work if we mix up integer and FP, and based on current
LangRef text it was not intended to work for pointers either.

The pointer case from https://llvm.org/PR49215 is what led me here. That
example was avoided with 5b250a27ec.

Differential Revision: https://reviews.llvm.org/D96904

Make sure the interpreter module was loaded before making checks against it

This issue was introduced in https://reviews.llvm.org/D92187.
The guard I'm changing were is supposed to act when linux is loading the linker for the second time (due to differences in paths like symlinks).
This is done by checking `module_sp != m_interpreter_module.lock()` however this will be true when `m_interpreter_module` wasn't initialized, making linux unload the linker module (the most visible result here is that lldb will stop getting notified about new modules loaded by the process, because it can't set the rendezvous breakpoint again after the stepping over it once).
The `m_interpreter_module` is not getting initialize when it goes through this path: https://github.com/llvm/llvm-project/blob/dbfdb139f75470a9abc78e7c9faf743fdd963c2d/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp#L332, which happens when lldb was able to read the address from the dynamic section of the executable.

What I'm not sure about though, is if when we go through this path if we still load the linker twice on linux. If that's the case then it means we need to somehow set the m_interpreter_module instead of the fix I provide here. I've only tested this on Android.

Differential Revision: https://reviews.llvm.org/D96637

[Loads] Extract helper frunction for available load/store (NFC)

This contains the logic for extracting an available load/store
from a given instruction, to be reused in a following patch.

[ThinLTO] Fix import of multiply defined global variables

Currently, if there is a module that contains a strong definition of
a global variable and a module that has both a weak definition for
the same global and a reference to it, it may result in an undefined symbol error
while linking with ThinLTO.

It happens because:
* the strong definition become internal because it is read-only and can be imported;
* the weak definition gets replaced by a declaration because it's non-prevailing;
* the strong definition failed to be imported because the destination module
already contains another definition of the global yet this def is non-prevailing.

The patch adds a check to computeImportForReferencedGlobals() that allows
considering a global variable for being imported even if the module contains
a definition of it in the case this def has an interposable linkage type.

Note that currently the check is based only on the linkage type
(and this seems to be enough at the moment), but it might be worth to account
the information whether the def is prevailing or not.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95943

[DAG] Match USUBSAT patterns through zext/trunc

This patch handles usubsat patterns hidden through zext/trunc and uses the getTruncatedUSUBSAT helper to determine if the USUBSAT can be correctly performed in the truncated form:

zext(x) >= y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))
zext(x) >  y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))

Based on original examples:

void foo(unsigned short *p, int max, int n) {
    int i;
    unsigned m;
    for (i = 0; i < n; i++) {
        m = *--p;
        *p = (unsigned short)(m >= max ? m-max : 0);
    }
}

Differential Revision: https://reviews.llvm.org/D25987

[X86][AVX] Fold concat(extract_subvector(v0,c0), extract_subvector(v1,c1)) -> vperm2x128

Fixes regression exposed by removing bitcasts across logic-ops in D96206.

Differential Revision: https://reviews.llvm.org/D96206

[X86] Fold bitcast(logic(bitcast(X), Y)) --> logic'(X, bitcast(Y)) for int-int bitcasts

Extend the existing combine that handles bitcasting for fp-logic ops to also help remove logic ops across bitcasts to/from the same integer types.

This helps improve AVX512 predicate handling for D/Q logic ops and also allows DAGCombine's scalarizeExtractedBinop to remove some annoying gpr->simd->gpr transfers.

The concat_vectors regression in pr40891.ll will be addressed in a followup commit on this patch.

Differential Revision: https://reviews.llvm.org/D96206

[RISCV] Add test cases for add/sub/mul overflow intrinsics. NFC

Largely copied from AArch64/arm64-xaluo.ll

[lld][ELF] __start_/__stop_ refs don't retain C-ident named group sections

The special root semantics for identifier-named sections is meant
specifically for the metadata sections. In the context of group
semantics, where group members are always retained or discarded as a
unit, it's natural not to have this semantics apply to a section in a
group, otherwise we would never discard the group defeating the purpose
of using the group in the first place.

This change modifies the GC behavior so that __start_/__stop_ references
don't retain C identifier named sections in section groups which allows
for these groups to be collected. This matches the behavior of BFD ld.

The only kind of existing case that might break is interdependent
metadata sections that are all in a group together, but that group
doesn't contain any other sections referenced by anything except
implicit inclusion in a `__start_` and/or `__stop_`-referenced
identifier-named section, but such cases should be unlikely.

Differential Revision: https://reviews.llvm.org/D96753

[CodeGen] Use range-based for loops (NFC)

[TableGen] Use ListSeparator (NFC)

[dfsan] Comment out unused methods by D97087 temporarily

[clang][Driver][OpenBSD] libcxx also requires pthread

[lldb] Refine ThreadPlan::ShouldAutoContinue

Adjust `ShouldAutoContinue` to be available to any thread plan previous to the plan that
explains a stop, not limited to the parent to the plan that explains the stop.

Before this change, `Thread::ShouldStop` did the following:

1. find the plan that explains the stop
2. if it's not a master plan, continue processing previous (aka parent) plans
3. first, call `ShouldAutoContinue` on the immediate parent of the explaining plan
4. then loop over previous plans, calling `ShouldStop` and `MischiefManaged`

Of note, the iteration in step 4 does not call `ShouldAutoContinue`, so again only the
plan just prior to the explaining plan is given the opportunity to override whether to
continue or stop.

This commit changes the loop call `ShouldAutoContinue`, giving each plan the opportunity
to override `ShouldStop` of previous plans.

Why? This allows a plan to do the following:

1. mark itself done and be popped off the stack
2. allow parent plans to finish their work, and to also be popped off the stack
3. and finally, have the thread continue, not stop

This is useful for stepping into async functions. A plan will would step far enough
enough to set a breakpoint on the async target, and then use `ShouldAutoContinue` to
unwind the necessary stepping, and then have the calling thread continue.

Differential Revision: https://reviews.llvm.org/D97076

Update test error string post pass registration change

[mlir] Register the print-op-graph pass using ODS

Move over to ODS & use pass options.

[NFC] Refactor PreferMemberInitializerCheck

[libcxx] [test] Call create_directory_symlink when linking directories

This makes the symlinks work properly on windows.

A similar round of cleanup was done in
c41bda7f5fc26e4602a029646991d6a5c59cb365, but these tests were
added after that.

Differential Revision: https://reviews.llvm.org/D97089

[libcxx] Make path::format a non-class enum

The spec doesn't declare it as an enum class, and being declared
as an enum class breaks referring to the values as e.g.
path::auto_format.

Differential Revision: https://reviews.llvm.org/D97084

[InstrProfiling] Use nobits as __llvm_prf_cnts section type in ELF

This can reduce the binary size because counters will no longer occupy
space in the binary, instead they will be allocated by dynamic linker.

Differential Revision: https://reviews.llvm.org/D97110

[clang-tidy] Simplify throw keyword missing check

Extend test to verify that it does not match in template instantiations.

Differential Revision: https://reviews.llvm.org/D96132

[clang-tidy] Simplify function complexity check

Update test to note use of lambda instead of the invisible operator().

Differential Revision: https://reviews.llvm.org/D96131

[RISCV] Add another test case showing failure to use remw when the RHS has been zero extended from less than i32. NFC

[clang-itdy] Simplify virtual near-miss check

Diagnose the problem in templates in the context of the template
declaration instead of in the context of all of the (possibly very many)
template instantiations.

Differential Revision: https://reviews.llvm.org/D96224

[ConstantRange] Handle wrapping ranges in min/max (PR48643)

When one of the inputs is a wrapping range, intersect with the
union of the two inputs. The union of the two inputs corresponds
to the result we would get if we treated the min/max as a simple
select.

This fixes PR48643.

[InstCombine] fold fdiv with exp/exp2 divisor (PR49147)

Follow-up to:
D96648 / b40fde062
...for the special-case base calls.

From the earlier commit:
This is unusual in the general (non-reciprocal) case because we need
an extra instruction, but that should be better for general FP
reassociation and codegen. We conservatively check for "arcp" FMF
here as we do with existing fdiv folds, but it is not strictly
necessary to have that.

[InstCombine] add tests for fdiv of exp/exp2; NFC

[ConstantRange] Handle wrapping range in binaryNot()

We don't need any special handling for wrapping ranges (or empty
ranges for that matter). The sub() call will already compute a
correct and precise range.

We only need to adjust the test expectation: We're now computing
an optimal result, rather than an unsigned envelope.

[OpenMP] libomp: cleanup some resource leaks

Close mutexattr and condattr local objects to eliminate resource leaks.

Differential Revision: https://reviews.llvm.org/D96892

[RISCV] Add an additional remw test to rv64m-exhaustive-w-insts.ll. NFC

This adds the IR for this C code

int32_t foo(uint16_t x, int16_t y) {
x %= y;
return x;
}

Note the dividend is unsigned and the divisor is signed. C type
promotion rules will extend them and use a 32-bit srem and the
function returns a 32-bit result.

We fail to use remw for this case. The zero extended input has
enough sign bits, but we won't consider (i64 AssertZext X, i16) in
the sexti32 isel pattern.

We also end up with a extra shifts to zero upper bits on the result.
computeKnownBits knew the result was positive before type legalization
and allowed the SIGN_EXTEND to become ZERO_EXTEND. But after promoting
to i64 we no longer know that bit 31 (and all bits above it) should
be 0.

[Clang][OpenMP] Update driver test case for OpenMP offload to use sm_35

`sm_35` is the minimum requirement for OpenMP offloading on NVPTX device.
Current driver test case is using `sm_20`. D97003 is going to switch the minimum
CUDA version to 9.2, which only supports `sm_30+`. This patch makes step for the
change.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D97120

[clang-tidy] Simplify braced init check

The normalization of matchers means that this now works in all language
modes.

Differential Revision: https://reviews.llvm.org/D96135

clang: Exclude efi_main from -Wmissing-prototypes

When compiling UEFI applications, the main function is named
efi_main() instead of main(). Let's exclude efi_main() from
-Wmissing-prototypes as well to avoid warnings when working
on UEFI applications.

Differential Revision: https://reviews.llvm.org/D95746

[ConstantRangeTest] Print detailed information on failure (NFC)

When the optimality check fails, print the inputs, the computed
range and the better range that was found. This makes it much
simpler to identify the cause of the failure.

Make sure that full ranges (which, unlikely all the other cases,
have multiple ways to construct them that all result in the same
range) only print one message by handling them separately.

[lld/mac] reject -undefined warning and -undefined suppress with -twolevel_namespace

See discussion on https://reviews.llvm.org/D93263

-flat_namespace isn't implemented yet, and neither is -undefined dynamic,
so this makes -undefined pretty pointless in lld/MachO for now. But once
we implement -flat_namespace (which we need to do anyways to get check-llvm
to pass with lld as host linker), the code's already there.

Follow-up to https://reviews.llvm.org/D93263#2491865

Differential Revision: https://reviews.llvm.org/D96963

[ASTMatchers] Fix hasUnaryOperand matcher for postfix operators

Differential Revision: https://reviews.llvm.org/D97095

[LTO] Fix cloning of llvm*.used when splitting module

Refines the fix in 3c4c205060c9398da705eb71b63ddd8a04999de9 to only
put globals whose defs were cloned into the split regular LTO module
on the cloned llvm*.used globals. This avoids an issue where one of the
attached values was a local that was promoted in the original module
after the module was cloned. We only need to have the values defined in
the new module on those globals.

Fixes PR49251.

Differential Revision: https://reviews.llvm.org/D97013

[OpenMP][NFC] clang-format the whole openmp project

Same script as D95318. Test files are excluded.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D97088

Revert "Implement nullPointerConstant() using a better API."

This reverts commit 9148302a (2019-08-22) which broke the pre-existing
unit test for the matcher. Also revert commit 518b2266 (Fix the
nullPointerConstant() test to get bots back to green., 2019-08-22) which
incorrectly changed the test to expect the broken behavior.

Differential Revision: https://reviews.llvm.org/D96665

[RISCV] Support extraction of misaligned subvectors

This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those
which don't align to a vector register boundary. It accomplishes this by
extracting the nearest register-sized subvector (a subregister
operation), then sliding the vector down with VSLIDEDOWN and extracting
the subvector from the first position (a COPY operation).

Since this procedure involves the use of VSCALE and multiplication, the
handling of such operations is done during lowering to simplify the
implementation and make use of DAG combining. This necessitated moving
some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96959

[RISCV] Improve register allocation around vector masks

With vector mask registers only allocatable to V0 (VMV0Regs) it is
relatively simple to generate code which uses multiple masks and naively
requires spilling.

This patch aims to improve codegen in such cases by telling LLVM it can
use VRRegs to hold masks. This will prevent spilling in many cases by
having LLVM copy to an available VR register.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97055

[lit testing] "END." not "END:"

[InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI.

recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time.

This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created).

Differential Revision: https://reviews.llvm.org/D97056

[libc++] Fix the build for AppleClang.

Forgot to add some parts of D93593, this should disable the tests on
Apple. Seems Louis was right ;-)

[RISCV] Pre-commit test case for D97055. NFC.

This adds a test which unnecessarily spills mask registers.

[X86][SSE] Use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.

These are auto-upgraded to the equivalent llvm variants now.

[X86][SSE] vector-compare-combines.ll - use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.

These are auto-upgraded to the equivalent llvm variants now.

[X86][AVX] Remove AVX2 min/max intrinsics tests

These are now autoupgraded to the llvm equivalents and the tests already moved avx2-intrinsics-x86-upgrade.ll

[X86][SSE] Remove SSE41 min/max intrinsics tests

These are now autoupgraded to the llvm equivalents and the tests already moved sse41-intrinsics-x86-upgrade.ll

[X86][SSE2] Remove SSE2 min/max intrinsics tests

These are now autoupgraded to the llvm equivalents and the tests already moved sse2-intrinsics-x86-upgrade.ll

[X86] KnownBits - use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.

These are auto-upgraded to the equivalent llvm variants now.

[DAG] foldSubToUSubSat - fold sub(a,trunc(umin(zext(a),b))) -> usubsat(a,trunc(umin(b,SatLimit)))

This moves the last custom x86 USUBSAT fold to generic DAGCombine.

Completes PR40111

Differential Revision: https://reviews.llvm.org/D96703

[ConstantRangeTest] Make exhaustive testing more principled (NFC)

The current infrastructure for exhaustive ConstantRange testing is
somewhat confusing in what exactly it tests and currently cannot even
be used for operations that produce smallest-size results, rather than
signed/unsigned envelopes.

This patch makes the testing more principled by collecting the exact
set of results of an operation into a bit set and then comparing it
against the range approximation by:

* Checking conservative correctness: All elements in the set must be
   in the range.
* Checking optimality under a given preference function: None of the
   (slack-free) ranges that can be constructed from the set are
   preferred over the computed range.

Implemented preference functions are:

* PreferSmallest: Smallest range regardless of signed/unsigned wrapping
   behavior. Probably what we would call "optimal" without further
   qualification.
* PreferSmallestUnsigned/Signed: Smallest range that has no
   unsigned/signed wrapping. We use this if our calculation is precise
   only up to signed/unsigned envelope.
* PreferSmallestNonFullUnsigned/Signed: Smallest range that has no
   unsigned/signed wrapping -- but preferring a smaller wrapping range
   over a (non-wrapping) full range. We use this if we have a fully
   precise calculation but apply a sign preference to the result
   (union/intersection). Even with a sign preference, returning a
   wrapping range is still "strictly better" than returning a full one.

This also addresses PR49273 by replacing the fragile manual range
construction logic in testBinarySetOperationExhaustive() with generic
code that isn't specialized to the particular form of ranges that set
operations can produces.

Differential Revision: https://reviews.llvm.org/D88356

[Sanitizers][NFC] Fix typo

[lit] Add --xfail and --filter-out (inverse of --filter)

In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D96662

Update BPFAdjustOpt.cpp to accept select form of or as well

This is a minor pattern-match update to BPFAdjustOpt.cpp to accept
not only 'or i1 a, b' but also 'select i1 a, i1 true, i1 b'.
This resolves regression after SimplifyCFG's creating select form
of and/or instead (https://reviews.llvm.org/D95026).
This is a small change, and currently such select form isn't created
or doesn't reach to the late pipeline (because InstCombine eagerly
folds it into and/or i1), so I chose to commit without a review process.

[AArch64][GlobalISel] Add selection support for G_VECREDUCE of <2 x i32>

This selects to a pairwise add and a subreg copy.

[libcxx] [test] Remove two unnecesary files/variables in a test

These don't seem to have any function in the test.

The non_regular_file one seems to have been added in
0f8c8f59df057a85d6d49913ec9877c6d597785b, without any apparent
purpose there.

Differential Revision: https://reviews.llvm.org/D97083

[libcxx] Rename a method in PathParser for clarity. NFC.

Differential Revision: https://reviews.llvm.org/D97081

[libc++] Fixes _LIBCPP_HAS_NO_CONCEPTS

Before the define was in a GCC specific part. Now it's available for all
compilers. The patch had its CI run in D93593.

[InstCombine] Add more tests to nonnull-select.ll (NFC)

[CodeGen] Use range-based for loops (NFC)

[TableGen] Use ListSeparator (NFC)

Fixed failing test

Reduce the number of attributes attached to each function

This takes advantage of the implicit default behavior to reduce the number of
attributes.

Reland "[Libcalls, Attrs] Annotate libcalls with noundef"

Fixed Clang tests.

[ValueTracking] Improve impliesPoison

This patch improves ValueTracking's impliesPoison(V1, V2) to do this reasoning:

```
  %res = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul      = extractvalue { i64, i1 } %res, 0

; If %mul is poison, %overflow is also poison, and vice versa.
```

This improvement leads to supporting this optimization under `-instcombine-unsafe-select-transform=0`:

```
define i1 @test2_logical(i64 %a, i64 %b, i64* %ptr) {
; CHECK-LABEL: @test2_logical(
; CHECK-NEXT:    [[MUL:%.*]] = mul i64 [[A:%.*]], [[B:%.*]]
; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i64 [[A]], 0
; CHECK-NEXT:    [[TMP2:%.*]] = icmp ne i64 [[B]], 0
; CHECK-NEXT:    [[OVERFLOW_1:%.*]] = and i1 [[TMP1]], [[TMP2]]
; CHECK-NEXT:    [[NEG:%.*]] = sub i64 0, [[MUL]]
; CHECK-NEXT:    store i64 [[NEG]], i64* [[PTR:%.*]], align 8
; CHECK-NEXT:    ret i1 [[OVERFLOW_1]]
;

  %res = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul = extractvalue { i64, i1 } %res, 0
  %cmp = icmp ne i64 %mul, 0
  %overflow.1 = select i1 %overflow, i1 true, i1 %cmp
  %neg = sub i64 0, %mul
  store i64 %neg, i64* %ptr, align 8
  ret i1 %overflow.1
}
```

Previously, this didn't happen because the flag prevented `select i1 %overflow, i1 true, i1 %cmp` from being `or i1 %overflow, %cmp`.
Note that the select -> or conversion happens only when `impliesPoison(%cmp, %overflow)` returns true.
This improvement allows `impliesPoison` to do the reasoning.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96929