platform/upstream/llvm.git
2 years ago[ConstantFold] Use getFltSemantics instead of manually checking the type
Benjamin Kramer [Thu, 5 May 2022 13:50:33 +0000 (15:50 +0200)]
[ConstantFold] Use getFltSemantics instead of manually checking the type

Simplifies the code and makes fpext/fptrunc constant folding not crash
when the result is bf16.

2 years ago[ThreadSanitizer] Add fallback DebugLocation for instrumentation calls
Marco Elver [Thu, 5 May 2022 13:21:35 +0000 (15:21 +0200)]
[ThreadSanitizer] Add fallback DebugLocation for instrumentation calls

When building with debug info enabled, some load/store instructions do
not have a DebugLocation attached. When using the default IRBuilder, it
attempts to copy the DebugLocation from the insertion-point instruction.
When there's no DebugLocation, no attempt is made to add one.

This is problematic for inserted calls, where the enclosing function has
debug info but the call ends up without a DebugLocation in e.g. LTO
builds that verify that both the enclosing function and calls to
inlinable functions have debug info attached.

This issue was noticed in Linux kernel KCSAN builds with LTO and debug
info enabled:

  | ...
  | inlinable function call in a function with debug info must have a !dbg location
  |   call void @__tsan_read8(i8* %432)
  | ...

To fix, ensure that all calls to the runtime have a DebugLocation
attached, where the possibility exists that the insertion-point might
not have any DebugLocation attached to it.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D124937

2 years ago[Frontend] give createInvocationFromCommandLine an options struct
Sam McCall [Thu, 5 May 2022 00:15:24 +0000 (02:15 +0200)]
[Frontend] give createInvocationFromCommandLine an options struct

It's accumulating way too many optional params (see D124970)

While here, improve the name and the documentation.

Differential Revision: https://reviews.llvm.org/D124971

2 years ago[SLP]Further improvement of the cost model for scalars used in buildvectors.
Alexey Bataev [Tue, 14 Dec 2021 18:02:06 +0000 (10:02 -0800)]
[SLP]Further improvement of the cost model for scalars used in buildvectors.

Further improvement of the cost model for the scalars used in
buildvectors sequences. The main functionality is outlined into
a separate function.
The cost is calculated in the following way:
1. If the Base vector is not an undef vector, resize the very first mask to
have a common VF and perform the action for 2 input vectors (including the
non-undef Base). Other shuffle masks are combined with the result of the
first stage and processed as a shuffle of 2 elements.
2. If the Base is an undef vector and there is only 1 shuffle mask, perform the
action only for 1 vector with the given mask, if it is not the identity
mask.
3. If > 2 masks are used, perform a series of shuffle actions for 2 vectors,
combining the masks properly between the steps.

The original implementation misses the very first analysis for the Base
vector, so the cost might be too optimistic in some cases. But it improves
the cost for the insertelements which are part of the current SLP graph.

Part of D107966.

Differential Revision: https://reviews.llvm.org/D115750

2 years ago[XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction...
Xing Xue [Thu, 5 May 2022 13:01:36 +0000 (09:01 -0400)]
[XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections

Summary:
When -ffunction-sections is on, this patch makes the compiler generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This allows the AIX linker to garbage-collect unused functions.

Reviewed by: MaskRay, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D124855

2 years ago[AArch64] Add -aarch64-insert-extract-base-cost
Peter Waller [Tue, 3 May 2022 08:36:07 +0000 (08:36 +0000)]
[AArch64] Add -aarch64-insert-extract-base-cost

The new flag -aarch64-insert-extract-base-cost can be used to
set the value of AArch64Subtarget::getVectorInsertExtractBaseCost(),
for the purposes of experimentation.

Differential Revision: https://reviews.llvm.org/D124835

2 years ago[AMDGPU] Combine DPP mov even if old reg def is in different BB
Jay Foad [Thu, 21 Apr 2022 11:01:59 +0000 (12:01 +0100)]
[AMDGPU] Combine DPP mov even if old reg def is in different BB

Given a DPP mov like this:

  %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
  ...
  %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec

this patch just removes a check that %2 (the "old reg") was defined in
the same BB as the DPP mov instruction. GCNDPPCombine requires that the
MIR is in SSA form so I don't understand why the BB matters.

This lets the optimization work in more real world cases when the
definition of %2 gets hoisted out of a loop.

Differential Revision: https://reviews.llvm.org/D124182

2 years agosanitizer_common: Define FP_XSTATE_MAGIC1 for old glibc
Tobias Burnus [Thu, 5 May 2022 09:30:10 +0000 (10:30 +0100)]
sanitizer_common: Define FP_XSTATE_MAGIC1 for old glibc

D116208 (commit 1298273e8206a8fc2) added FP_XSTATE_MAGIC1.
However, when building with glibc < 2.16 for backward-dependency
compatibility, it is not defined - and the build breaks.

Note: The define comes from Linux's asm/sigcontext.h but the
file uses signal.h which includes glibc's bits/sigcontext.h - which
is synced from the kernel's file but lags behind.

Solution: For backward compatibility with ancient systems, define
FP_XSTATE_MAGIC1 if undefined.

For the old systems, we were building with Linux kernel 3.19, but to support really old glibc systems, we build with a sysroot of glibc 2.12. While our kernel (and the users' kernels) have FP_XSTATE_MAGIC1, glibc 2.12 is too old. With this patch, building the sanitizer libs works again. This showed up for us today as GCC mainline/13 has now synced the sanitizer libs.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124927

2 years ago[analyzer] Get direct binding for specific punned case
einvbri [Sun, 24 Apr 2022 19:05:11 +0000 (14:05 -0500)]
[analyzer] Get direct binding for specific punned case

Region store was not able to see through this case to the actual
initialized value of STRUCT ff. This change addresses this case by
getting the direct binding. This was found and debugged in a downstream
compiler, with debug guidance from @steakhal. Positive and negative
test cases are added.

The specific case where this issue was exposed:

  typedef struct {
    int a:1;
    int b[2];
  } STRUCT;

  int main() {
    STRUCT ff = {0};
    STRUCT* pff = &ff;
    int a = ((int)pff + 1);
    return a;
  }

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D124349

2 years ago[LICM] Add test to exercise assertion from D123473.
Florian Hahn [Thu, 5 May 2022 09:49:51 +0000 (10:49 +0100)]
[LICM] Add test to exercise assertion from D123473.

Add a test case that triggers an assertion with earlier versions of
D123473.

2 years agoRegAllocGreedy: Common up part of the priority calculation. NFC.
Jay Foad [Thu, 5 May 2022 09:35:20 +0000 (10:35 +0100)]
RegAllocGreedy: Common up part of the priority calculation. NFC.

2 years ago[DAGCombine] Fold (X & ~Y) | Y with truncated not
Nikita Popov [Wed, 4 May 2022 15:35:18 +0000 (17:35 +0200)]
[DAGCombine] Fold (X & ~Y) | Y with truncated not

This extends the (X & ~Y) | Y to X | Y fold to also work if ~Y is
a truncated not (when taking into account the mask X). This is
done by exporting the infrastructure added in D124856 and reusing
it here.
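
The identity behind this fold can be checked by brute force on small bit widths. The sketch below is only a Python model, not the DAGCombine code: `WIDE`, `NARROW`, `trunc`, and `fold_holds` are names made up for illustration, and it assumes "truncated not" means a NOT computed at a wider width whose result is then truncated.

```python
# Brute-force check of (X & ~Y) | Y == X | Y, including the case where
# ~Y is computed at a wider bit width and then truncated.
# WIDE, NARROW, trunc and fold_holds are illustrative names, not LLVM APIs.
WIDE, NARROW = 6, 3


def trunc(value, bits):
    """Keep only the low `bits` bits of value."""
    return value & ((1 << bits) - 1)


def fold_holds():
    """Return True if the fold is valid for every X and Y at these widths."""
    for x in range(1 << NARROW):
        for y_wide in range(1 << WIDE):
            not_y = trunc(~y_wide, WIDE)        # NOT at the wide width
            trunc_not_y = trunc(not_y, NARROW)  # then truncated
            y = trunc(y_wide, NARROW)
            if ((x & trunc_not_y) | y) != (x | y):
                return False
    return True
```

The check passes because truncation only drops high bits, so the low bits of the wide NOT are the NOT of the low bits of Y.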

I've retained the old value of AllowUndefs=false, though probably
this can be switched to true with extra test coverage.

Differential Revision: https://reviews.llvm.org/D124930

2 years ago[SimpleLoopUnswitch] Add freeze if branch execs for partial unswitching.
Florian Hahn [Thu, 5 May 2022 08:44:07 +0000 (09:44 +0100)]
[SimpleLoopUnswitch] Add freeze if branch execs for partial unswitching.

We cannot skip freezing the condition when the unswitched branch
executes if the condition is a chain of ANDs/ORs. For example, if we
have an AND %c1, %c2 with %c1 == undef and %c2 == 0, there would be no
branch on undef in the original code, but a branch on undef if we
unswitch %c1.
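
The example above can be modelled with a toy interpreter. This is a rough sketch in Python, not the pass itself: `UNDEF`, `BranchOnUndef`, `and_`, `branch`, `freeze`, `original`, `unswitched`, and `triggers_ub` are all hypothetical names, and freeze is simplified to always pick one fixed value.

```python
UNDEF = "undef"  # stand-in for an LLVM undef i1; illustrative only


class BranchOnUndef(Exception):
    """Models the undefined behaviour of branching on undef."""


def and_(a, b):
    # 'and x, 0' is 0 even when x is undef: every choice of x gives 0.
    if a == 0 or b == 0:
        return 0
    if UNDEF in (a, b):
        return UNDEF
    return 1


def branch(cond):
    if cond == UNDEF:
        raise BranchOnUndef
    return bool(cond)


def freeze(cond, pick=0):
    # freeze turns undef into some arbitrary but fixed value (here: pick).
    return pick if cond == UNDEF else cond


def original(c1, c2):
    # The loop branches on the whole AND chain.
    return branch(and_(c1, c2))


def unswitched(c1, c2, use_freeze):
    # Partial unswitching hoists a branch on c1 alone out of the loop.
    cond = freeze(c1) if use_freeze else c1
    if not branch(cond):
        return False
    return branch(c2)  # in the "c1 is true" version the AND reduces to c2


def triggers_ub(fn, *args):
    """Return True if running fn(*args) branches on undef."""
    try:
        fn(*args)
    except BranchOnUndef:
        return True
    return False
```

With `%c1 == undef` and `%c2 == 0`, `original` never branches on undef, `unswitched` without freeze does, and `unswitched` with freeze takes the same path as the original.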

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D124603

2 years ago[flang] use 1-based dim in transformational runtime error msg
Jean Perier [Thu, 5 May 2022 08:32:40 +0000 (10:32 +0200)]
[flang] use 1-based dim in transformational runtime error msg

The Flang transformational runtime previously reported conformity
issues using a zero-based index to describe which dimension is
nonconformant. This may confuse Fortran users, especially when the message
is about a dimension other than the first one.

Differential Revision: https://reviews.llvm.org/D124941

2 years ago[clang] Add static_cast to fix Bazel build.
Adrian Kuegel [Thu, 5 May 2022 07:57:45 +0000 (09:57 +0200)]
[clang] Add static_cast to fix Bazel build.

Differential Revision: https://reviews.llvm.org/D124995

2 years ago[mlir][scf][bufferize] Update verifyAnalysis error message
Matthias Springer [Thu, 5 May 2022 07:49:56 +0000 (16:49 +0900)]
[mlir][scf][bufferize] Update verifyAnalysis error message

The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands.

Differential Revision: https://reviews.llvm.org/D124934

2 years ago[mlir][scf][bufferize][NFC] Split ForOp bufferization into smaller functions
Matthias Springer [Thu, 5 May 2022 07:49:37 +0000 (16:49 +0900)]
[mlir][scf][bufferize][NFC] Split ForOp bufferization into smaller functions

This is in preparation of WhileOp bufferization, which reuses these functions.

Differential Revision: https://reviews.llvm.org/D124933

2 years ago[mlir][scf][bufferize][NFC] Simplify verifyAnalysis implementation
Matthias Springer [Thu, 5 May 2022 07:49:18 +0000 (16:49 +0900)]
[mlir][scf][bufferize][NFC] Simplify verifyAnalysis implementation

Differential Revision: https://reviews.llvm.org/D124928

2 years ago[SCEV] Fold umin_seq to umin using implied poison reasoning
Nikita Popov [Wed, 4 May 2022 10:43:31 +0000 (12:43 +0200)]
[SCEV] Fold umin_seq to umin using implied poison reasoning

Similar to how we convert logical and/or to bitwise and/or, we should
also convert umin_seq to umin based on implied poison reasoning. In
%x umin_seq %y, if %y being poison implies %x being poison, then we
don't need the sequential evaluation: Having %y contribute towards
the result will never make the result more poisonous. An important
corollary of this is that if %y is never poison, we also don't need
the sequential evaluation.
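
The reasoning can be sketched with a small model of poison propagation. This is a simplification in Python, not SCEV code: `POISON`, `umin`, `umin_seq`, `mismatches`, and `VALUES` are illustrative names, and poison is treated as a single sentinel value.

```python
POISON = "poison"  # stand-in for an LLVM poison value; illustrative only


def umin(x, y):
    # Plain umin: poison in either operand poisons the result.
    if POISON in (x, y):
        return POISON
    return min(x, y)


def umin_seq(x, y):
    # Sequential umin: y only contributes when x is non-zero.
    if x == POISON:
        return POISON
    if x == 0:
        return 0
    return umin(x, y)


def mismatches(values, allowed):
    """Operand pairs, among the allowed ones, where the two forms differ."""
    return [(x, y) for x in values for y in values
            if allowed(x, y) and umin(x, y) != umin_seq(x, y)]


VALUES = [POISON, 0, 1, 2, 3]
```

Over all pairs the only disagreement is `x == 0` with a poison `y`. Restricting to pairs where a poison `y` implies a poison `x`, or where `y` is never poison, leaves no disagreement, which is the condition under which the fold applies.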

This avoids some of the regressions in D124910.

Differential Revision: https://reviews.llvm.org/D124921

2 years ago[lldb] Fix ppc64 detection in lldb
serge-sans-paille [Mon, 2 May 2022 10:19:48 +0000 (12:19 +0200)]
[lldb] Fix ppc64 detection in lldb

Currently, ppc64le and ppc64 (defaulting to big endian) have the same
descriptor, thus the linear scan always returns ppc64le. Handle that through
subtype.

This is a recommit of f114f009486816ed4b3bf984f0fbbb8fc80914f6 with a new test
setup that doesn't involve (unsupported) corefiles.

Differential Revision: https://reviews.llvm.org/D124760

2 years ago[NFC] [Pipelines] Hoist CoroCleanup as Module Pass
Chuanqi Xu [Thu, 5 May 2022 07:15:09 +0000 (15:15 +0800)]
[NFC] [Pipelines] Hoist CoroCleanup as Module Pass

This is similar to the previous patch https://reviews.llvm.org/D123925. It
could also reduce the number of times we call declaresCoroCleanupIntrinsics,
and it is helpful for further changes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D124362

2 years ago[Pipelines] Hoist CoroCleanup to avoid blocking optimizations
Chuanqi Xu [Thu, 21 Apr 2022 09:35:56 +0000 (17:35 +0800)]
[Pipelines] Hoist CoroCleanup to avoid blocking optimizations

CoroCleanup is designed to lower all the remaining coroutine
intrinsics. It is only required to run after CoroSplit. However, the
position of CoroCleanup is currently far too late. The downside is that
the unlowered coroutine intrinsics might block other optimizations
too. So it should be a pure win to hoist the position of CoroCleanup.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D124360

2 years ago[RISCV][Clang] add more tests for clang driver. (NFC)
Zakk Chen [Thu, 5 May 2022 01:26:03 +0000 (18:26 -0700)]
[RISCV][Clang] add more tests for clang driver. (NFC)

Test experimental arch, Zfh, Zfmin and Zve arch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124611

2 years ago[InstCombine] Remove side effect of replaced constrained intrinsics
Serge Pavlov [Thu, 5 May 2022 05:02:42 +0000 (12:02 +0700)]
[InstCombine] Remove side effect of replaced constrained intrinsics

If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions being executed at run time. This happened because constrained
intrinsics usually have a side effect, which is used to model the interaction
with the floating-point environment. In some cases this is the correct behavior,
but often the side effect is actually absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it is actually absent.

Differential Revision: https://reviews.llvm.org/D118426

2 years ago[AMDGPU] Use d16 flag for image.sample instructions
Mariusz Sikora [Mon, 25 Apr 2022 11:57:27 +0000 (12:57 +0100)]
[AMDGPU] Use d16 flag for image.sample instructions

The image.sample instruction can be forced to return a half type instead of
float when the d16 flag is enabled.

This patch adds a new pattern in InstCombine to detect whether the output of
image.sample is used later only by an fptrunc that converts the type
from float to half. If the pattern is detected, the fptrunc and image.sample
are combined into a single image.sample that returns a half type.
Later, during lowering, the d16 flag is added to the image sample intrinsic.

Differential Revision: https://reviews.llvm.org/D124232

2 years ago[AIX][PGO] Enable linux style PGO on AIX
Wael Yehia [Tue, 3 May 2022 14:27:15 +0000 (10:27 -0400)]
[AIX][PGO] Enable linux style PGO on AIX

This patch switches the PGO implementation on AIX from runtime
registration-based section tracking to __start_SECNAME/__stop_SECNAME-based
section tracking. To enable recognition of the __start_SECNAME/__stop_SECNAME
symbols in the AIX linker, the -bdbg:namedsects:ss option needs to be used.

Reviewed By: jsji, MaskRay, davidxl

Differential Revision: https://reviews.llvm.org/D124857

2 years ago[clang][dataflow] Add flowConditionIsTautology function
Eric Li [Wed, 4 May 2022 17:15:00 +0000 (17:15 +0000)]
[clang][dataflow] Add flowConditionIsTautology function

Provide a way for users to check if a flow condition is
unconditionally true.

Differential Revision: https://reviews.llvm.org/D124943

2 years ago[AVR] Always expand STDSPQRr & STDWSPQRr
Patryk Wychowaniec [Thu, 5 May 2022 03:07:41 +0000 (03:07 +0000)]
[AVR] Always expand STDSPQRr & STDWSPQRr

Currently, STDSPQRr and STDWSPQRr are expanded only during
AVRFrameLowering - this means that if any of those instructions happens
to appear _outside_ of the typical FrameSetup / FrameDestroy
context, it wouldn't get substituted, eventually leading to a crash:

```
LLVM ERROR: Not supported instr: <MCInst XXX <MCOperand Reg:1>
<MCOperand Imm:15> <MCOperand Reg:53>>
```

This commit fixes this issue by moving expansion of those two opcodes
into AVRExpandPseudo.

This bug was originally discovered due to the Rust compiler_builtins
library. Its 0.1.37 release contained a 128-bit software
division/remainder routine that exercised this buggy branch in the code.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D123528

2 years ago[RISCV][NFC] Use true_mask replace riscv_vmset_vl in defined patterns.
Lian Wang [Fri, 29 Apr 2022 08:37:45 +0000 (08:37 +0000)]
[RISCV][NFC] Use true_mask replace riscv_vmset_vl in defined patterns.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124660

2 years ago[X86] Add `void` to void function. NFC
Phoebe Wang [Thu, 5 May 2022 02:58:23 +0000 (10:58 +0800)]
[X86] Add `void` to void function. NFC

2 years ago[X86][AMX] Replace PXOR instruction with SET0 in AMX pre config.
Luo, Yuanke [Wed, 4 May 2022 04:39:45 +0000 (12:39 +0800)]
[X86][AMX] Replace PXOR instruction with SET0 in AMX pre config.

To generate a zero value, the PXOR instruction needs 3 operands that are
tied to the same vreg. This is not good in SSA form, and with undef values
the two-address instruction pass may convert
`%0:vr128 = PXORrr undef %0, undef %0`
to `%1:vr128 = PXORrr undef %1:vr128(tied-def 0), undef %0:vr128`,
which is not expected.
It can be simplified to the SET0 instruction, which takes only 1 destination
operand. It should be more friendly to the two-address instruction pass and
the register allocation pass.
`%0:vr128 = V_SET0`
Also add an AVX1 code path so that it is consistent with other code.

Differential Revision: https://reviews.llvm.org/D124903

2 years ago[Disassembler][AVR] Remove unused static functions
Ben Shi [Thu, 5 May 2022 02:18:09 +0000 (02:18 +0000)]
[Disassembler][AVR] Remove unused static functions

The unused static functions cause failures on some build machines.

2 years ago[X86] Call initializeX86PreTileConfigPass from LLVMInitializeX86Target.
Craig Topper [Thu, 5 May 2022 01:51:25 +0000 (18:51 -0700)]
[X86] Call initializeX86PreTileConfigPass from LLVMInitializeX86Target.

Without this, the pass doesn't show up in print-before/after-all.

Differential Revision: https://reviews.llvm.org/D124973

2 years ago[SelectionDAG] Use llvm::any_of to simplify a loop. NFC
Craig Topper [Thu, 5 May 2022 01:29:15 +0000 (18:29 -0700)]
[SelectionDAG] Use llvm::any_of to simplify a loop. NFC

2 years ago[MC][AVR] Implement decoding ST/LD
Ben Shi [Mon, 11 Apr 2022 01:44:49 +0000 (01:44 +0000)]
[MC][AVR] Implement decoding ST/LD

Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D123476

2 years ago[MC][AVR] Implement decoding STD/LDD
Ben Shi [Sat, 9 Apr 2022 01:45:22 +0000 (01:45 +0000)]
[MC][AVR] Implement decoding STD/LDD

Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D123442

2 years ago[InstCombine] Fold ((A&B)^C)|B
Alexander Shaposhnikov [Thu, 5 May 2022 00:50:33 +0000 (00:50 +0000)]
[InstCombine] Fold ((A&B)^C)|B

Fold ((A&B)^C)|B into C|B.
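
Alongside the Alive2 proof linked below, the fold can be sanity-checked by enumerating small bit widths. This is a minimal Python sketch; `fold_is_valid` is a made-up helper name, not part of InstCombine.

```python
from itertools import product


def fold_is_valid(bits=4):
    """Exhaustively compare ((A & B) ^ C) | B with C | B at a given width."""
    values = range(1 << bits)
    return all((((a & b) ^ c) | b) == (c | b)
               for a, b, c in product(values, repeat=3))
```

The result follows bit by bit: where B is 1 both sides are 1, and where B is 0 the `A & B` term vanishes so both sides equal C.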

https://alive2.llvm.org/ce/z/zSGSor

This addresses the issue https://github.com/llvm/llvm-project/issues/55169

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D124710

2 years ago[compiler-rt][AVR] Fix avr_SOURCES CMake variable
Ayke van Laethem [Wed, 4 May 2022 22:46:27 +0000 (00:46 +0200)]
[compiler-rt][AVR] Fix avr_SOURCES CMake variable

D123200 did not include the generic sources, which means that only the
AVR-specific sources were compiled. With this change, generic sources
are included as expected.

Tested with the following commands:

    cmake -G Ninja -DCOMPILER_RT_DEFAULT_TARGET_TRIPLE=avr -DCOMPILER_RT_BAREMETAL_BUILD=1 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_C_FLAGS="--target=avr -mmcu=avr5 -nostdlibinc -mdouble=64" ../path/to/builtins

    ninja

Differential Revision: https://reviews.llvm.org/D124969

2 years ago[RISCV] Use movImm went multiplying by simm12 in getVLENFactoredAmount.
Craig Topper [Thu, 5 May 2022 00:19:43 +0000 (17:19 -0700)]
[RISCV] Use movImm went multiplying by simm12 in getVLENFactoredAmount.

No reason to special case simm12, movImm handles all immediates.

This also fixes a bug that we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.

2 years ago[InstCombine][NFC] Update comment in and-xor-or.ll
Alexander Shaposhnikov [Thu, 5 May 2022 00:07:49 +0000 (00:07 +0000)]
[InstCombine][NFC] Update comment in and-xor-or.ll

2 years ago[InstCombine][NFC] Add baseline tests for folds of ((A&B)^C)|B
Alexander Shaposhnikov [Thu, 5 May 2022 00:04:33 +0000 (00:04 +0000)]
[InstCombine][NFC] Add baseline tests for folds of ((A&B)^C)|B

Differential revision: https://reviews.llvm.org/D124709

Test plan: make check-all

2 years ago[lld/mac] Support writing zippered dylibs and bundles
Nico Weber [Fri, 22 Apr 2022 15:55:50 +0000 (11:55 -0400)]
[lld/mac] Support writing zippered dylibs and bundles

With -platform_version flags for two distinct platforms,
this writes a LC_BUILD_VERSION header for each.

The motivation is that this is needed for self-hosting with lld as linker
after D124059.

To create a zippered output at the clang driver level, pass

    -target arm64-apple-macos -darwin-target-variant arm64-apple-ios-macabi

to create a zippered dylib.

(In Xcode's clang, `-darwin-target-variant` is spelled just `-target-variant`.)

(If you pass `-target arm64-apple-ios-macabi -target-variant arm64-apple-macos`
instead, ld64 crashes!)

This results in two -platform_version flags being passed to the linker.

ld64 also verifies that the iOS SDK version is at least 13.1. We don't do that
yet. But ld64 also does that for other platforms and we don't. So we need to
do that at some point, but not in this patch.

Only dylib and bundle outputs can be zippered.

I verified that a Catalyst app linked against a dylib created with

    clang -shared foo.cc -o libfoo.dylib \
          -target arm64-apple-macos \
          -target-variant arm64-apple-ios-macabi \
          -Wl,-install_name,@rpath/libfoo.dylib \
          -fuse-ld=$PWD/out/gn/bin/ld64.lld

runs successfully. (The app calls a function `f()` in libfoo.dylib
that returns a const char* "foo", and NSLog(@"%s")s it.)

ld64 is a bit more permissive when writing zippered outputs,
see references to "unzippered twins". That's not implemented yet.
(If anybody wants to implement that, D124275 is a good start.)

Differential Revision: https://reviews.llvm.org/D124887

2 years ago[llvm-otool] Make `llvm-otool -l` output compatible with otool for LC_BUILD_VERSION
Nico Weber [Wed, 4 May 2022 13:08:58 +0000 (09:08 -0400)]
[llvm-otool] Make `llvm-otool -l` output compatible with otool for LC_BUILD_VERSION

Namely, only "symbolize" platform and tool names if `-v` is passed.

(`llvm-otool -lv` output still isn't quite the same as `otool -lv` output, but
`-v` output is arguably for consumption by humans, so I'm not changing that
at this point. Someone else could change it if it was important to them.)

Differential Revision: https://reviews.llvm.org/D124920

2 years ago[PowerPC] Re-run update_mir_test_checks.py on nofpexcept.ll. NFC
Craig Topper [Wed, 4 May 2022 23:13:06 +0000 (16:13 -0700)]
[PowerPC] Re-run update_mir_test_checks.py on nofpexcept.ll. NFC

This test was previously generated by the script, but the script
now uses CHECK-NEXT instead of CHECK.

This is preparation for a strictfp related patch I'm working on.

2 years ago[sanitizer] Use newfstatat for x32
H.J. Lu [Wed, 4 May 2022 21:53:05 +0000 (14:53 -0700)]
[sanitizer] Use newfstatat for x32

Since newfstatat is supported on x32, use it for x32.

Differential Revision: https://reviews.llvm.org/D124968

2 years agoRemove expected fail for TestStepNoDebug on AArch64
Jason Molenda [Wed, 4 May 2022 22:27:09 +0000 (15:27 -0700)]
Remove expected fail for TestStepNoDebug on AArch64

My fix in https://reviews.llvm.org/D124492 should fix
this - I got an "unexpected pass" failure from an
Aarch64 Ubuntu bot when I landed my fix.

2 years ago[DebugInfo] Give warning instead of error for premature terminator in .debug_aranges...
Junfeng Dong [Wed, 4 May 2022 22:21:30 +0000 (15:21 -0700)]
[DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section.

llvm-profgen gives an error message when the input binary contains a premature terminator in the .debug_aranges section. These zero-length items point to some rodata with a zero-size type in an embedded Rust library. Since Zero-Sized Types are a valid feature in Rust, they are not a real error. This change turns the "error:" message into a warning to avoid misleading users.

Why do we still want a warning in such a case? Because it doesn't follow the DWARF standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains earlier discussion.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D124121

2 years ago[lld-macho][nfc] Set test min version to 11.0
Jez Ng [Wed, 4 May 2022 22:01:34 +0000 (18:01 -0400)]
[lld-macho][nfc] Set test min version to 11.0

The arm64-apple-macos triple is only valid for versions >= 11.0. (If
one passes arm64-apple-macos10.15 to llvm-mc, the output's min version is still
11.0). In order to write tests easily for both target archs, let's up the
default min version in our tests.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D124562

2 years agoUpdate the CFA to use $sp when $fp is restored on arm64
Jason Molenda [Wed, 4 May 2022 21:43:42 +0000 (14:43 -0700)]
Update the CFA to use $sp when $fp is restored on arm64

In UnwindAssemblyInstEmulation we correctly recognize when a LDP
restores the fp & lr in an epilogue, and mark them as having the
caller's contents now, but we don't update the CFA register rule
at that point to indicate that the CFA is now calculated in terms
of $sp.  This doesn't impact the backtrace because the register
contents are all <same> now, but it can confuse the stepper when
the StackID changes mid-epilogue.

Differential Revision: https://reviews.llvm.org/D124492
rdar://92064415

2 years agoRevert "Revert "[clang][extract-api] Use relative includes""
Zixu Wang [Wed, 4 May 2022 19:29:45 +0000 (12:29 -0700)]
Revert "Revert "[clang][extract-api] Use relative includes""

Reapply the change after fixing sanitizer errors.
The original problem was that `StringRef`s in `Matches` are pointing to
temporary local `std::string`s created by `path::convert_to_slash` in
the regex match call. This patch does the conversion up front in
container `FilePath`.

This reverts commit 2966f0fa505266735dbc8324b8821b7f0aa901ff.

Differential Revision: https://reviews.llvm.org/D124964

2 years ago[RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]
Philip Reames [Tue, 3 May 2022 21:00:51 +0000 (14:00 -0700)]
[RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]

This is to simplify the final version of D124869.

2 years ago[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards
Stanislav Mekhanoshin [Wed, 27 Apr 2022 19:10:16 +0000 (12:10 -0700)]
[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards

There shall be 1 wait state between M0 write and LDS DMA/LDS_DIRECT use.

Differential Revision: https://reviews.llvm.org/D124550

2 years ago[amdgpu] Elide module lds allocation in kernels with no callees
Jon Chesterfield [Wed, 4 May 2022 21:42:05 +0000 (22:42 +0100)]
[amdgpu] Elide module lds allocation in kernels with no callees

Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.

Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.

Tests updated to match; coverage was already good. The interesting case is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122091

2 years ago[RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl...
Craig Topper [Wed, 4 May 2022 21:26:44 +0000 (14:26 -0700)]
[RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl*b value.

riscv-v-vector-bits-min is primarily used to opt-in to the
autovectorizer. The vector width can be determined from Zvl*b.

This patch adds support for treating -1 as meaning use Zvl*b so we can
still opt-in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124960

2 years ago[LoopCacheAnalysis][NFC] Add a test case for improved loop cache analysis cost calcul...
Congzhe Cao [Wed, 4 May 2022 21:09:13 +0000 (17:09 -0400)]
[LoopCacheAnalysis][NFC] Add a test case for improved loop cache analysis cost calculation

Added a motivating test case for D123400 where the loopnest has a
suboptimal loop order j-i-k. After D123400 we ensure that the order
of loop cache analysis output is loop i-j-k, despite the suboptimal
order in the original loopnest.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D122776

2 years ago[ARM] Delay creation of MVE Imm shifts to legalization
David Green [Wed, 4 May 2022 21:12:09 +0000 (22:12 +0100)]
[ARM] Delay creation of MVE Imm shifts to legalization

The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine
(because matching i64 constants is difficult) does not apply to MVE,
as there are no v2i64 shifts. Delaying the creation of the nodes can
allow extra transforms on target-independent shl/shr.

2 years ago[BOLT][NFC] Move getInliningInfo out of Inliner class
Amir Ayupov [Wed, 4 May 2022 21:07:42 +0000 (14:07 -0700)]
[BOLT][NFC] Move getInliningInfo out of Inliner class

`getInliningInfo` is useful in other passes that need to check inlining
eligibility for some function. Move the declaration and InliningInfo definition
out of Inliner class. Prepare for subsequent use in ICP.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124899

2 years ago[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite
Amir Ayupov [Wed, 4 May 2022 21:03:24 +0000 (14:03 -0700)]
[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite

Minor refactoring. NFC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124898

2 years ago[clang-tidy] Escape diagnostic messages before passing to `diag` in Transformer.
Yitzhak Mandelbaum [Wed, 4 May 2022 18:47:08 +0000 (18:47 +0000)]
[clang-tidy] Escape diagnostic messages before passing to `diag` in Transformer.

Messages generated by Transformer rules may have `%` in them, which
needs to be escaped before being passed to `diag`, which interprets them
specially (and crashes if they are misused).
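
The escaping step can be sketched as follows. This is a Python illustration only, assuming `%%` is how the diagnostic format spells a literal percent sign; `escape_for_diag` is a hypothetical name, not the function added by the patch.

```python
def escape_for_diag(message: str) -> str:
    """Double every '%' so a printf-like formatter treats it literally."""
    return message.replace("%", "%%")
```

For instance, a rule message such as `use %d here` would be passed on as `use %%d here`, and messages without `%` are unchanged.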

Differential Revision: https://reviews.llvm.org/D124952

2 years ago[compiler-rt][AVR] Use correct return value for __ledf2 etc
Ayke van Laethem [Wed, 4 May 2022 16:37:28 +0000 (18:37 +0200)]
[compiler-rt][AVR] Use correct return value for __ledf2 etc

Previously the default was long, which is 32-bit on AVR. But avr-gcc
expects a smaller value: it reads the return value from r24.

This is actually a regression from https://reviews.llvm.org/D98205.
Before D98205, the return value was an enum (which was 2 bytes in size)
which was compatible with the 1-byte return value that avr-gcc was
expecting. But long is 4 bytes and thus places the significant return
value in a different register.
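
The register mismatch can be illustrated with a small helper. This Python sketch assumes the avr-gcc convention that a return value ends at r25 and its size is rounded up to an even number of bytes; `avr_return_low_register` is a hypothetical name, and the layout is an assumption rather than something stated in this commit.

```python
def avr_return_low_register(size_bytes):
    """Register number holding the least significant byte of a return value.

    Assumes the avr-gcc layout: the value ends at r25 and its size is
    rounded up to an even number of bytes.
    """
    rounded = size_bytes + (size_bytes % 2)
    return 26 - rounded
```

Under that assumption 1-byte and 2-byte values both start at r24, while a 4-byte long starts at r22, so a caller reading r24 sees the wrong byte.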

Differential Revision: https://reviews.llvm.org/D124939

2 years agoFix a crash on targets where __bf16 isn't supported
Aaron Ballman [Wed, 4 May 2022 20:45:42 +0000 (16:45 -0400)]
Fix a crash on targets where __bf16 isn't supported

We'd nondeterministically assert (and later crash) when calculating the size or
alignment of a __bf16 type when the type isn't supported on a target because of
reading uninitialized values. Now we check whether the type is supported first.

Fixes #50171

2 years ago[mlir][LLVMIR] Do not update instMap via assignments to entry references
Min-Yih Hsu [Wed, 20 Apr 2022 17:13:59 +0000 (10:13 -0700)]
[mlir][LLVMIR] Do not update instMap via assignments to entry references

Inside processInstruction, we assign the translated mlir::Value to a
reference previously taken from the corresponding entry in instMap.
However, instMap (a DenseMap) might resize after the entry reference was
taken, rendering the assignment useless since it's assigning to a
dangling reference. Here is a (pseudo) snippet that shows the concept:
```
// inst has type llvm::Instruction *
Value &v = instMap[inst];
...
// op is one of the operands of inst, has type llvm::Value *
processValue(op);
// instMap resizes inside processValue
...
translatedValue = b.createOp<Foo>(...);
// v is already a dangling reference at this point!
// The following assignment is bogus.
v = translatedValue;
```

Nevertheless, now that we no longer cache llvm::Constant in instMap, there
is only one case that can cause processValue to resize instMap: the
operand is an llvm::ConstantExpr, in which case we insert the
derived llvm::Instruction into instMap.
To trigger instMap to resize, which is a DenseMap, the threshold depends
on the ratio between # of map entries and # of (hash) buckets. More specifically,
it resizes if (# of map entries / # of buckets) >= 0.75.
In this case # of map entries is equal to # of LLVM instructions, and # of
buckets is the power-of-two upper bound of # of map entries. Thus,
in the attached test case (test/Target/LLVMIR/Import/incorrect-instmap-assignment.ll),
we picked 96 and 128 for the # of map entries and # of buckets, respectively.
(We can't pick numbers that are too small since DenseMap uses inline
storage for a small number of entries.) Therefore, the ConstantExpr in the
said test case (i.e. a GEP) is the 96-th llvm::Value cached into the
instMap, triggering the issue we're discussing here on its enclosing
instruction (i.e. a load).
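
The arithmetic behind the 96/128 choice can be checked with a small sketch.
It models only the growth rule as described in this message (entries /
buckets >= 0.75). The function names are hypothetical and this is not
DenseMap's actual implementation:

```cpp
#include <cstdint>

// Smallest power of two that is >= N (for N >= 1).
constexpr std::uint64_t powerOfTwoUpperBound(std::uint64_t N) {
  std::uint64_t P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// Growth rule as described above: entries / buckets >= 0.75,
// written with integers to avoid floating point.
constexpr bool reachesGrowThreshold(std::uint64_t Entries,
                                    std::uint64_t Buckets) {
  return Entries * 4 >= Buckets * 3;
}

// 96 entries sit in 128 buckets, and the 96th entry hits the threshold.
static_assert(powerOfTwoUpperBound(96) == 128, "bucket count for 96 entries");
static_assert(reachesGrowThreshold(96, 128), "96/128 == 0.75");
static_assert(!reachesGrowThreshold(95, 128), "95/128 < 0.75");
```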

This patch fixes this issue by calling `operator[]` every time we need to
update an entry.

Differential Revision: https://reviews.llvm.org/D124627

2 years ago[memprof] Use unknown_function error type for missing functions
Teresa Johnson [Wed, 4 May 2022 18:52:47 +0000 (11:52 -0700)]
[memprof] Use unknown_function error type for missing functions

Switch the error type when a function is not found in the memprof
profile to unknown_function. This gives compatibility with normal PGO
function matching, and also prevents issuing large numbers of additional
matching errors since pgo-warn-missing-function is off by default.

Differential Revision: https://reviews.llvm.org/D124953

2 years ago[libunwind] Silence warnings about unused variables. NFC.
Martin Storsjö [Wed, 4 May 2022 09:53:58 +0000 (12:53 +0300)]
[libunwind] Silence warnings about unused variables. NFC.

This variable was considered unused when NDEBUG was defined.

Differential Revision: https://reviews.llvm.org/D124911

2 years ago[libunwind] [CMake] Handle the RelWithDebInfo configuration similarly to Release
Martin Storsjö [Wed, 4 May 2022 09:52:20 +0000 (12:52 +0300)]
[libunwind] [CMake] Handle the RelWithDebInfo configuration similarly to Release

This makes sure to include libunwind log messages in the build if
LIBUNWIND_ENABLE_ASSERTIONS is set (which it is by default), when
building in RelWithDebInfo configurations.

Differential Revision: https://reviews.llvm.org/D124912

2 years ago[BOLT][NFC] Fix MCPlusBuilder::getAliases caching behavior
Amir Ayupov [Wed, 4 May 2022 18:42:14 +0000 (11:42 -0700)]
[BOLT][NFC] Fix MCPlusBuilder::getAliases caching behavior

Caching behavior of `getAliases` causes a failure in unit tests where two
MCPlusBuilder objects are created corresponding to AArch64 and X86:
the alias cache is created for AArch64 but then used for X86.

https://lab.llvm.org/staging/#/builders/211/builds/126

The issue only affects unit tests as we only construct one MCPlusBuilder
for an ELF binary.

Resolve the issue by moving alias bitvectors to MCPlusBuilder object.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D124942

2 years agoRevert "[clang][extract-api] Use relative includes"
Zixu Wang [Wed, 4 May 2022 19:26:18 +0000 (12:26 -0700)]
Revert "[clang][extract-api] Use relative includes"

This reverts commit 4c262fee08b5383c96857d77eefe80d61c41d2b0.
Revert to fix Msan and Asan errors.

2 years ago[clang-format] Fix a bug in AlignConsecutiveAssignments
owenca [Tue, 3 May 2022 19:04:50 +0000 (12:04 -0700)]
[clang-format] Fix a bug in AlignConsecutiveAssignments

Fixes #55113.

Differential Revision: https://reviews.llvm.org/D124868

2 years ago[gn build] Port 80045e9afa2f
LLVM GN Syncbot [Wed, 4 May 2022 18:28:43 +0000 (18:28 +0000)]
[gn build] Port 80045e9afa2f

2 years ago[libc++] Implement ranges::for_each{, _n}
Nikolas Klauser [Wed, 4 May 2022 18:27:07 +0000 (20:27 +0200)]
[libc++] Implement ranges::for_each{, _n}

Reviewed By: var-const, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D124332

2 years ago[HWASan] cleanup imports in hwasan_symbolize.
Florian Mayer [Wed, 4 May 2022 18:21:23 +0000 (11:21 -0700)]
[HWASan] cleanup imports in hwasan_symbolize.

2 years ago[libc++] Refactor max_size.pass.cpp
Louis Dionne [Mon, 25 Apr 2022 16:49:47 +0000 (10:49 -0600)]
[libc++] Refactor max_size.pass.cpp

Reorganize the test and simplify the #ifdefs. Fix a typo in __powerpc64__
as a fly-by, and also add a test for the unstable ABI.

Differential Revision: https://reviews.llvm.org/D124403

2 years ago[lldb] parallelize calling of Module::PreloadSymbols()
Luboš Luňák [Tue, 5 Apr 2022 13:40:21 +0000 (15:40 +0200)]
[lldb] parallelize calling of Module::PreloadSymbols()

If LLDB index cache is enabled and everything is cached, then loading of debug
info is essentially single-threaded, because it's done from PreloadSymbols()
called from GetOrCreateModule(), which is called from a loop calling
LoadModuleAtAddress() in DynamicLoaderPOSIXDYLD. Parallelizing the entire
loop could be unsafe because of GetOrCreateModule() operating on a module
list, so instead move only the PreloadSymbols() call to Target::ModulesDidLoad()
and parallelize there, which should be safe.

This may greatly reduce the load time if the debugged program uses a large
number of binaries (as opposed to monolithic programs where this presumably
doesn't make a difference). In my specific case of LibreOffice Calc this reduces
startup time from 6s to 2s.

Differential Revision: https://reviews.llvm.org/D122975

2 years ago[NFC] Remove unfinished test case
Zixu Wang [Wed, 4 May 2022 17:40:25 +0000 (10:40 -0700)]
[NFC] Remove unfinished test case

4c262fee08b5383c96857d77eefe80d61c41d2b0 accidentally added a local
unfinished test case, clang/test/Index/annotate-comments-enum-constant.c.
This patch removes it.

2 years ago[clang][extract-api] Use relative includes
Zixu Wang [Fri, 15 Apr 2022 02:04:30 +0000 (19:04 -0700)]
[clang][extract-api] Use relative includes

This patch transforms the given input headers to relative include names
using header search entries and some heuristics.
For example: `/Path/To/Header.h` will be included as `<Header.h>` with a
search path of `-I /Path/To/`; and
`/Path/To/Framework.framework/Headers/Header.h` will be included as
`<Framework/Header.h>`, given a search path of `-F /Path/To`.
Headermaps will also be queried in reverse to find a spelled name to
include headers.

Differential Revision: https://reviews.llvm.org/D123831

2 years agoFix a failing assertion with vector type initialization
Aaron Ballman [Wed, 4 May 2022 17:22:30 +0000 (13:22 -0400)]
Fix a failing assertion with vector type initialization

When constant evaluating the initializer for an object of vector type,
we would call APInt::trunc() but truncate to the same bit-width the
object already had, which would cause an assertion. Instead, use
APInt::truncOrSelf() so that we no longer assert in this situation.

Fix #50216

2 years ago[InstCombine] add type constraint to intrinsic+shuffle fold
Sanjay Patel [Wed, 4 May 2022 16:57:34 +0000 (12:57 -0400)]
[InstCombine] add type constraint to intrinsic+shuffle fold

This check is in the related fold for binops,
but it was missed when the code was adapted
for intrinsics in 432c199e8473. The new test
would crash when trying to create a new
intrinsic with mismatched types.

2 years ago[InstCombine] move shuffle after funnel shift with same-shuffled operands
Sanjay Patel [Wed, 4 May 2022 16:44:47 +0000 (12:44 -0400)]
[InstCombine] move shuffle after funnel shift with same-shuffled operands

This extends 432c199e8473 and 9c4770eaab9d9 with an intrinsic
cited directly in issue #46238

Eventually, we will want to use llvm::isTriviallyVectorizable()
or create some new API for this list, but for now, I am intentionally
making a minimum change to reduce risk and only affect an intrinsic
with regression tests in place.

2 years ago[InstCombine] add tests for funnel-shift with shuffled operands; NFC
Sanjay Patel [Wed, 4 May 2022 16:39:34 +0000 (12:39 -0400)]
[InstCombine] add tests for funnel-shift with shuffled operands; NFC

2 years ago[NFC][CUDA][HIP] rework mangling number for aux target
Yaxun (Sam) Liu [Tue, 3 May 2022 11:48:37 +0000 (07:48 -0400)]
[NFC][CUDA][HIP] rework mangling number for aux target

CUDA/HIP needs to mangle for aux target. When mangling for aux target,
the mangler should use mangling number for aux target. Previously
in https://reviews.llvm.org/D122734 a state was introduced in
ASTContext to let the mangler get mangling number for aux target
from ASTContext. This patch removes that state from ASTContext
and adds an IsAux member to MangleContext to indicate that
the mangle context is for aux target. This reflects the reality that
the mangle context is created for mangling aux target and makes
ASTContext cleaner.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D124842

2 years ago[mlir][sparse][taco] Support more data types.
Bixia Zheng [Wed, 4 May 2022 14:36:03 +0000 (07:36 -0700)]
[mlir][sparse][taco] Support more data types.

Support int8, int16, int32 and int32. Also fix source code format in mlir_pytaco_utils.py.

Add tests.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D124925

2 years ago[clang] Track how headers get included generally during lookup time
Cyndy Ishida [Wed, 4 May 2022 14:38:20 +0000 (07:38 -0700)]
[clang] Track how headers get included generally during lookup time

tapi & clang-extractapi both attempt to construct then check against
how a header was included to determine api information when working
against multiple search paths, headermap, and vfsoverlay mechanisms.
Validating this against what the preprocessor sees during lookup time
makes this check more reliable.

Reviewed By: zixuw, jansvoboda11

Differential Revision: https://reviews.llvm.org/D124638

2 years agoFix a crash on invalid with _Generic expressions
Aaron Ballman [Wed, 4 May 2022 16:39:18 +0000 (12:39 -0400)]
Fix a crash on invalid with _Generic expressions

We were failing to check if the controlling expression is dependent or
not when testing whether it has side effects. This would trigger an
assertion. Instead, if the controlling expression is dependent, we
suppress the check and diagnostic.

This fixes Issue 50227.

2 years ago[VPlan] Add test for printing plan with an exit value.
Florian Hahn [Wed, 4 May 2022 16:19:02 +0000 (17:19 +0100)]
[VPlan] Add test for printing plan with an exit value.

Test for printing plan with additions from D123537.

2 years ago[InstCombine] propagate FMF when reordering intrinsics and shuffles
Sanjay Patel [Wed, 4 May 2022 16:01:53 +0000 (12:01 -0400)]
[InstCombine] propagate FMF when reordering intrinsics and shuffles

This was missed when extending the fold to allow fma with
9c4770eaab9d95c

2 years ago[InstCombine] add FMF to tests for better coverage; NFC
Sanjay Patel [Wed, 4 May 2022 15:58:01 +0000 (11:58 -0400)]
[InstCombine] add FMF to tests for better coverage; NFC

The fold added with 9c4770eaab9d95c neglected to propagate FMF.

2 years ago[Sema] Simplify CheckConstraintSatisfaction. NFC
Ilya Biryukov [Wed, 4 May 2022 15:31:59 +0000 (15:31 +0000)]
[Sema] Simplify CheckConstraintSatisfaction. NFC

- Exit early when constraint caching is disabled.
- Use unique_ptr to manage temporary lifetime.
- Fix a typo in a comment (InsertPos instead of InsertNode).

The new code duplicates the forwarding call to CheckConstraintSatisfaction,
but reduces the number of interconnected if statements and simplifies lifetime
management.

This increases the overall readability.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D124923

2 years ago[InstCombine] move shuffle after fma with same-shuffled operands
Sanjay Patel [Wed, 4 May 2022 15:17:25 +0000 (11:17 -0400)]
[InstCombine] move shuffle after fma with same-shuffled operands

https://alive2.llvm.org/ce/z/sD-JVv

This extends 432c199e8473 with a 3 arg intrinsic to demonstrate
that the code works with the extra operand.

Eventually, we will want to use llvm::isTriviallyVectorizable()
or create some new API for this list, but for now, I am intentionally
making a minimum change to reduce risk and only affect an intrinsic
with regression tests in place.

2 years ago[InstCombine] add tests for fma with shuffled operands; NFC
Sanjay Patel [Wed, 4 May 2022 15:11:47 +0000 (11:11 -0400)]
[InstCombine] add tests for fma with shuffled operands; NFC

2 years ago[mlir] Add a flag to allow equivalent results.
Alexander Belyaev [Wed, 4 May 2022 15:46:17 +0000 (17:46 +0200)]
[mlir] Add a flag to allow equivalent results.

Differential Revision: https://reviews.llvm.org/D124931

2 years ago[clang][dataflow] Only skip ExprWithCleanups when visiting terminators
Eric Li [Mon, 2 May 2022 21:36:04 +0000 (21:36 +0000)]
[clang][dataflow] Only skip ExprWithCleanups when visiting terminators

`IgnoreParenImpCasts` will remove implicit casts to bool
(e.g. `PointerToBoolean`), such that the resulting expression may not
be of the `bool` type. The `cast_or_null<BoolValue>` in
`extendFlowCondition` will then trigger an assert, as the pointer
expression will not have a `BoolValue`.

Instead, we only skip `ExprWithCleanups` and `ParenExpr` nodes, as the
CFG does not emit them.

Differential Revision: https://reviews.llvm.org/D124807

2 years ago[VectorCombine] Add tests for shuffle binops patterns. NFC
David Green [Wed, 4 May 2022 14:07:47 +0000 (15:07 +0100)]
[VectorCombine] Add tests for shuffle binops patterns. NFC

2 years ago[RISCV] Add a test showing incorrect VSETVLI insertion
Fraser Cormack [Wed, 20 Apr 2022 13:12:23 +0000 (14:12 +0100)]
[RISCV] Add a test showing incorrect VSETVLI insertion

This test shows incorrect cross-bb insertion. We'd expect to see
a SEW=8 vsetvli, something like:

        vsetvli zero, zero, e8, mf8, ta, mu
        vluxei64.v      v1, (a2), v8, v0.t

But the vsetvli is omitted and an inherited SEW=64
vsetvli is used instead:
        vmv1r.v v9, v1
        vsetvli a3, zero, e64, m1, ta, mu
        vmseq.vi        v9, v1, 0
        vmv1r.v v8, v0
        vmandn.mm       v0, v9, v2
        beqz    a0, .LBB0_2
    # %bb.1:
        vluxei64.v      v1, (a2), v8, v0.t
        vmv1r.v v3, v1

The "mask reg op" vmandn.mm in bb.1 appears to be confusing the insertion
process, as it is able to elide its own vsetvli as its VLMAX (SEW=8,
LMUL=MF8) is identical to the previous one (SEW=64, LMUL=1).
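
The VLMAX equality can be verified with a short sketch, assuming the
usual definition VLMAX = LMUL * VLEN / SEW. The `Lmul` struct and `vlmax`
function are hypothetical names used only for illustration:

```cpp
// LMUL as a fraction so that MF8 (1/8) can be represented exactly.
struct Lmul {
  unsigned Num;
  unsigned Den;
};

// VLMAX = LMUL * VLEN / SEW, with VLEN and SEW given in bits.
constexpr unsigned vlmax(unsigned VlenBits, unsigned SewBits, Lmul L) {
  return VlenBits * L.Num / (SewBits * L.Den);
}

// For VLEN=128: SEW=8 with LMUL=MF8 and SEW=64 with LMUL=1 both give 2.
static_assert(vlmax(128, 8, Lmul{1, 8}) == vlmax(128, 64, Lmul{1, 1}),
              "identical VLMAX despite different SEW");
```

Both configurations give VLMAX = VLEN / 64, which is why comparing VLMAX
alone does not distinguish them.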

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124089

2 years ago[SDAG] Handle truncated not in haveNoCommonBitsSet()
Nikita Popov [Tue, 3 May 2022 15:06:46 +0000 (17:06 +0200)]
[SDAG] Handle truncated not in haveNoCommonBitsSet()

Demanded bits analysis may replace a full-width not with an
any_extend (not (truncate X)) pattern. This patch looks through
this kind of pattern in haveNoCommonBitsSet(). Of course, we can
only do this if we only need negated bits in the non-extended part,
as the other bits may now be arbitrary. For example, if we have
haveNoCommonBitsSet(~X & Y, X) then ~X only needs to actually
negate bits set in Y.
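
The underlying bit-level fact can be checked exhaustively on 8-bit values.
This is a sketch on plain integers with hypothetical function names, not the
SelectionDAG code: `~X & Y` and `X` never share a set bit, so adding them
gives the same result as or-ing them.

```cpp
#include <cstdint>

// True when no bit is set in both A and B.
constexpr bool noCommonBits(std::uint8_t A, std::uint8_t B) {
  return (A & B) == 0;
}

// Exhaustively checks, for all 8-bit X and Y, that (~X & Y) and X have no
// common bits and that their sum therefore equals their bitwise or.
bool addEqualsOrForMaskedNot() {
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned Y = 0; Y < 256; ++Y) {
      const std::uint8_t A = static_cast<std::uint8_t>(~X & Y);
      const std::uint8_t B = static_cast<std::uint8_t>(X);
      if (!noCommonBits(A, B))
        return false;
      if (static_cast<std::uint8_t>(A + B) != static_cast<std::uint8_t>(A | B))
        return false;
    }
  }
  return true;
}
```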

This is only a partial solution to the problem in that it allows
add -> or conversion, but the resulting or doesn't get folded yet.
(I guess that will involve exposing getBitwiseNotOperand() as a
more general helper and using that in the relevant transform.)

Differential Revision: https://reviews.llvm.org/D124856

2 years ago[SCEV] Add additional poison implication tests (NFC)
Nikita Popov [Wed, 4 May 2022 13:23:19 +0000 (15:23 +0200)]
[SCEV] Add additional poison implication tests (NFC)

2 years ago[X86] Fix uninitialized variable warnings in cetintrin.h reported by #55224
Phoebe Wang [Wed, 4 May 2022 11:21:13 +0000 (19:21 +0800)]
[X86] Fix uninitialized variable warnings in cetintrin.h reported by #55224

Fix uninitialized variables introduced by D116325.

Differential Revision: https://reviews.llvm.org/D124916

2 years agoDo not rely on implicit int for this test
Aaron Ballman [Wed, 4 May 2022 13:06:16 +0000 (09:06 -0400)]
Do not rely on implicit int for this test

This should address failing test bots:
https://lab.llvm.org/buildbot/#/builders/68/builds/31828

2 years agoBump the serialization major version number
Aaron Ballman [Wed, 4 May 2022 13:05:07 +0000 (09:05 -0400)]
Bump the serialization major version number

This is a speculative fix for a build bot which does not put the LLVM
revision information into the PCH hash.

http://45.33.8.238/linux/75290/step_7.txt

2 years ago[AArch64][SVE] Restore SP from FP when SVE CSRs and variable sized objects are present
Bradley Smith [Thu, 28 Apr 2022 11:11:11 +0000 (11:11 +0000)]
[AArch64][SVE] Restore SP from FP when SVE CSRs and variable sized objects are present

Without SVE, after a dynamic stack allocation has modified the SP, it is
presumed that a frame pointer restoration will revert the SP back to
its correct value prior to any caller stack being restored. However the
SVE frame is restored using the stack pointer directly, as it is located
after the frame pointer. This means that in the presence of a dynamic
stack allocation, any SVE callee state gets corrupted as SP has the
incorrect value when the SVE state is restored.

To address this issue, when variable sized objects and SVE CSRs are
present, treat the stack as having been realigned, hence restoring the
stack pointer from the frame pointer prior to restoring the SVE state.

Differential Revision: https://reviews.llvm.org/D124615

2 years ago[InstCombine] Fix commuted tests (NFC)
Nikita Popov [Wed, 4 May 2022 12:52:31 +0000 (14:52 +0200)]
[InstCombine] Fix commuted tests (NFC)

As pointed out on D124710, these need more thwarting.