review.tizen.org Git - platform/upstream/llvm.git/log

AMDGPU/GlobalISel: Add equiv xform for bitcast_fpimm_to_i32

Only partially fixes one pattern import.

AMDGPU/GlobalISel: Fix add of neg inline constant pattern

TableGen/GlobalISel: Fix slightly wrong generated comment

AMDGPU: Add register class to DS_SWIZZLE_B32 pattern

Reduces diff for a future patch.

[ARM,MVE] Add missing IntrNoMem flag on IR intrinsics.

A lot of the IR-level intrinsics we've been defining for MVE recently
accidentally had `props = []` instead of `props = [IntrNoMem]`, so
that optimization would have been overcautious about reordering them.

All the affected cases were due to instantiating the multiclasses
`MVEPredicated` and `MVEMXPredicated` without filling in the `props`
parameter, because I //thought// I remembered having set the defaults
in those multiclasses to `[IntrNoMem]`. In fact I hadn't done that.
Now I have.

(The IR intrinsics that //do// read and write memory are all
explicitly marked as `[IntrReadMem]` or `[IntrWriteMem]` already, so
they will override these defaults.)

[ARM,MVE] Fix valid immediate range for vsliq_n.

In common with most MVE immediate shift instructions, the left shift
takes an immediate in the range [0,n-1], while the right shift takes
one in the range [1,n]. I had absent-mindedly made them both the
latter.

While I'm here, I've added a set of regression tests checking both
ends of the immediate range for a representative sample of the
immediate shifts.

IR: remove "else" after "return". NFCI.

[OPENMP]Remove unused code, NFC.

[DAGCombiner] reduce extract subvector of concat

If we are extracting a chunk of a vector that's a fraction of an
operand of the concatenated vector operand, we can extract directly
from one of those original operands.

This is another suggestion from PR42024:
https://bugs.llvm.org/show_bug.cgi?id=42024#c2

But I'm not sure yet if it will make any difference on those patterns.
It seems to help a few existing AVX512 tests though.

Differential Revision: https://reviews.llvm.org/D72361

[Concepts] Fix failing test on Windows

Fix test failed by D43357 on Windows.

[InstSimplify] select Cond, true, false --> Cond

This is step 1 of damage control assuming that we need to remove several
over-reaching folds for select-of-booleans because they can cause
miscompiles as shown in D72396.

The scalar case seems obviously safe:
https://rise4fun.com/Alive/jSj

And I don't think there's any danger for vectors either - if the
condition is poisoned, then the select must be poisoned too, so undef
elements don't make any difference.

Differential Revision: https://reviews.llvm.org/D72412

[ARM][MVE] MVE-I should not be disabled by -mfpu=none

Architecturally, it's allowed to have MVE-I without an FPU, thus
-mfpu=none should not disable MVE-I, or moves to/from FP-registers.

This patch removes `+/-fpregs` from features unconditionally added to
target feature list, depending on FPU and moves the logic to Clang
driver, where the negative form (`-fpregs`) is conditionally added to
the target features list for the cases of `-mfloat-abi=soft`, or
`-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative
form is added by the driver, the positive one is derived from other
features in the backend.

Differential Revision: https://reviews.llvm.org/D71843

[InstCombine] Use minimal FMF in testcase for Z / (1.0 / Y) => (Y * Z); NFC

Patch by: @raghesh (Raghesh Aloor)

Differential Revision: https://reviews.llvm.org/D72431

[lldb] Modernize OptionValue::SetValueChangedCallback

instead of a function pointer + void*, take a std::function. This
removes a bunch of repetitive, unsafe void* casts.

[mlir] fix test failure in EDSC/builder-api-test

This patch fixes a test failure on a non-intel (PowerPC64) box.
The two affine.load are independent and hence llvm may reorder them.
The CHECK lines are modified for supporting reordered case.

Differential Revision: https://reviews.llvm.org/D72435

[Concepts] Function trailing requires clauses

Function trailing requires clauses now parsed, supported in overload resolution and when calling, referencing and taking the address of functions or function templates.

Differential Revision: https://reviews.llvm.org/D43357

[NFC][ARM] LowOverheadLoop comments

Add a comment describing the dependencies of the pass.

Fix "pointer is null" static analyzer warning. NFCI.

Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately).

Fix "pointer is null" static analyzer warning. NFCI.

Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately below).

[lldb/DWARF] Fix mixed v4+v5 location lists

Summary:
Our code was expecting that a single (symbol) file contains only one
kind of location lists. This is not correct (on non-apple platforms, at
least) as a file can compile units with different dwarf versions.

This patch moves the deteremination of location list flavour down to the
compile unit level, fixing this problem. I have also tried to rougly
align the code with the llvm DWARFUnit. Fully matching the API is not
possible because of how lldb's DWARFExpression lives separately from the
rest of the DWARF code, but this is at least a step in the right
direction.

Reviewers: JDevlieghere, aprantl, clayborg

Subscribers: dblaikie, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D71751

[lldb/DWARF] Add is_dwo member to DWARFUnit

Summary:
A skeleton unit can easily be detected by checking the m_dwo_symbol_file
member, but we cannot tell a split unit from a normal unit from the
"inside", which is sometimes useful.

This patch adds a m_is_dwo member to enable this, and align the code
with llvm::DWARFUnit. Right now it's only used to avoid creating a split
unit inside another split unit (which removes one override from
SymbolFileDWARFDwo and brings us a step closer to deleting it), but my
main motivation is fixing the handling of location lists in mixed v4&v5
files. This comes in a separate patch.

Reviewers: JDevlieghere, aprantl, clayborg

Subscribers: dblaikie, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D71750

Fix "pointer is null" static analyzer warnings. NFCI.

Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.

Fix "pointer is null" static analyzer warnings. NFCI.

Assert that the pointers are non-null before dereferencing them.

[ARM][MVE] Don't unroll intrinsic loops.

We don't unroll vector loops for MVE targets, but we miss the case
when loops only contain intrinsic calls. So just move the logic a
bit to catch this case.

Differential Revision: https://reviews.llvm.org/D72440

[clang-tidy] For checker `readability-misleading-indentation` update tests.

Summary: In D72333 we've introduced support for `if constexpr` but the test for uninstantiated template was not ready to land on windows platform since this target uses `-fdelayed-template-parsing` by default. This patch addresses this by passing `-fno-delayed-template-parsing` to the test.

Reviewers: JonasToth

Subscribers: xazax.hun, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72438

Fix MSVC unhandled enum warning. NFCI.

[lldb] Fix that TestNoSuchArch.py was passing for the wrong reason

The command here failed due to the type in 'create' but the expect
did not actually check for the error message. This fixes the typo
and adds a check for the actuall error message we should see.

[Matrix] Update shape propagation to iterate until done.

This patch updates the shape propagation to iterate until no new shape
information is discovered.

As initial seed for the forward propagation, we use the matrix intrinsic
instructions. Both propagateShapeForward and propagateShapeBackward
return new work lists, with the instructions to be used for the next
iteration. When propagating forward, we record all instructions we added
new shape information for. When propagating backward, we record all
users of instructions we added new shape information for.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70901

[clangd] Refurbish HoverInfo::present

Summary: Improves basic hover presentation logic to include more info.

Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71555

[Matrix] Propagate and use shape information for loads.

This patch extends to shape propagation to also include load
instructions and implements shape aware lowering for vector loads.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70900

[VE] Target stub for NEC SX-Aurora

Summary:
This patch registers the 've' target: the NEC SX-Aurora TSUBASA Vector Engine.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D69103

[LoopUtils][NFC] Minor refactoring in getLoopEstimatedTripCount.

[Matrix] Implement back-propagation of shape information.

This patch extends the shape propagation for matrix operations to also
propagate the shape of instructions to their operands.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70899

Revert "[ARM][LowOverheadLoops] Update liveness info"

This reverts commit e93e0d413f3afa1df5c5f88df546bebcd1183155.

There's some ordering problems on some on the buildbots which needs
investigating.

[DWARFDebugLoc] Tweak error message when resolving offset pairs with no base address

The previous message mentioned DW_LLE_offset_pair, but this is
incorrect/confusing because we can get this message even with DWARF4
(which does not use DW_LLE encodings). This happens because DWARF<=4
location entries are "upgraded" to DWARF v5 during parsing.

The new error message refrains from referencing specific constants.
Fixes pr44482.

[LV] Still vectorise when tail-folding can't find a primary inducation variable

This addresses a vectorisation regression for tail-folded loops that are
counting down, e.g. loops as simple as this:

  void foo(char *A, char *B, char *C, uint32_t N) {
    while (N > 0) {
      *C++ = *A++ + *B++;
       N--;
    }
  }

These are loops that can be vectorised, but when tail-folding is requested, it
can't find a primary induction variable which we do need for predicating the
loop. As a result, the loop isn't vectorised at all, which it is able to do
when tail-folding is not attempted. So, this adds a check for the primary
induction variable where we decide how to lower the scalar epilogue. I.e., when
there isn't a primary induction variable, a scalar epilogue loop is allowed
(i.e. don't request tail-folding) so that vectorisation could still be
triggered.

Having this check for the primary induction variable make sense anyway, and in
addition, in a follow-up of this I will look into discovering earlier the
primary induction variable for counting down loops, so that this can also be
tail-folded.

Differential revision: https://reviews.llvm.org/D72324

[mlir][GPU] introduce utilities for promotion to workgroup memory

Introduce a set of function that promote a memref argument of a `gpu.func` to
workgroup memory using memory attribution. The promotion boils down to
additional loops performing the copy from the original argument to the
attributed memory in the beginning of the function, and back at the end of the
function using all available threads. The loop bounds are specified so as to
adapt to any size of the workgroup. These utilities are intended to compose
with other existing utilities (loop coalescing and tiling) in cases where the
distribution of work across threads is uneven, e.g. copying a 2D memref with
only the threads along the "x" dimension. Similarly, specialization of the
kernel to specific launch sizes should be implemented as a separate pass
combining constant propagation and canonicalization.

Introduce a simple attribute-driven pass to test the promotion transformation
since we don't have a heuristic at the moment.

Differential revision: https://reviews.llvm.org/D71904

[ARM][LowOverheadLoops] Update liveness info

After expanding the pseudo instructions, update the liveness info.
We do this in a post-order traversal of the loop, including its
exit blocks and preheader(s).

Differential Revision: https://reviews.llvm.org/D72131

[mlir][VectorOps] Implement insert_strided_slice conversion

Summary:
This diff implements the progressive lowering of insert_strided_slice.
Two cases appear:
1. when the source and dest vectors have different ranks, extract the dest
subvector at the proper offset and reduce to case 2.
2. when they have the same rank N:
  a. if the source and dest type are the same, the insertion is trivial:
     just forward the source
  b. otherwise, iterate over all N-1 D subvectors and create an
     extract/insert_strided_slice/insert replacement, reducing the problem
     to vecotrs of the same N-1 rank.

This combines properly with the other conversion patterns to lower all the way to LLVM.

Reviewers: ftynse, rriddle, AlexEichenberger, andydavis1, tetuante, nicolasvasilache

Reviewed By: andydavis1

Subscribers: merge_guards_bot, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72317

[mlir][VectorOps] Implement strided_slice conversion

Summary:
This diff implements the progressive lowering of strided_slice to either:
1. extractelement + insertelement for the 1-D case
2. extract + optional strided_slice + insert for the n-D case.

This combines properly with the other conversion patterns to lower all the way to LLVM.

Appropriate tests are added.

Reviewers: ftynse, rriddle, AlexEichenberger, andydavis1, tetuante

Reviewed By: andydavis1

Subscribers: merge_guards_bot, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72310

[APFloat] Fix checked error assert failures

`APFLoat::convertFromString` returns `Expected` result, which must be
"checked" if the LLVM_ENABLE_ABI_BREAKING_CHECKS preprocessor flag is
set.
To mark an `Expected` result as "checked" we must consume the `Error`
within.
In many cases, we are only interested in knowing if an error occured,
without the need to examine the error info. This is achieved, easily,
with the `errorToBool()` API.

[SCEV] [NFC] add testcase for constant range for addrecexpr with nsw flag

[lldb/SWIG] Refactor extensions to be non Python-specific (3/3)

The current SWIG extensions for the string conversion operator is Python
specific because it uses the PythonObjects. This means that the code
cannot be reused for other SWIG supported languages such as Lua.

This reimplements the extensions in a more generic way that can be
reused. It uses a SWIG macro to reduce code duplication.

Differential revision: https://reviews.llvm.org/D72377

[DAGCombine] Fold the (fma -x, y, -z) to -(fma x, y, z)

This is a positive combination as long as the NEG is NOT free,
as we are reducing the number of NEG from two to one.

Differential Revision: https://reviews.llvm.org/D72312

Revert "Revert "[MIR] Target specific MIR formating and parsing""

There was an unguarded dereference of MF in a function that permitted
nullptr. Fixed

This reverts commit 71d64f72f934631aa2f12b9542c23f74f256f494.

Revert "[MIR] Target specific MIR formating and parsing"

This reverts commit 3ef05d85be8c3666ebfa3ad986eb334da5195a47.
It broke check-llvm on many bots, see comments on D69836.

[MIR] Target specific MIR formating and parsing

Summary:
Added MIRFormatter for target specific MIR formating and parsing with
immediate and custom pseudo source values. Target machine can subclass
MIRFormatter and implement custom logic for printing and parsing
immediate and custom pseudo source values for better readability.

* Target specific immediate mnemonic need to start with "." follows by
  identifier string. When MIR parser sees immediate it will call target
  specific parsing function.

* Custom pseudo source value need to start with custom follows by
  double-quoted string. MIR parser will pass the quoted string to target
  specific PSV parsing function.

* MIRFormatter have 2 helper functions to facilitate LLVM value printing
  and parsing for custom PSV if they refers LLVM values.

Patch by Peng Guo

Reviewers: dsanders, arsenm

Reviewed By: dsanders

Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69836

Revert "[MIR] Target specific MIR formating and parsing"

Forgot to credit Peng in the commit message.

This reverts commit be841f89d0014b1e0246a4feae941b2f74abd908.

Save more descriptive error msg from FBS/BKS, relay it up to lldb.

When lldb requests an app launch through FrontBoard/BackBoard,
we get back an NSError object if there was a problem with an
integer error code and a descriptive text string. debugserver
would log the descriptive text string to the console, but it
would only save the error code value, ask for the
much-less-specific name of that error code, and send that very
generic error word back to lldb.

This patch saves the longer description of the failure when
available, and sends that to lldb. If unavailable, it falls
back to sending up the generic description of the error code
as it was doing before.

This only impacts the iOS on-device debugserver.

<rdar://problem/49953304>

[MIR] Target specific MIR formating and parsing

Summary:
Added MIRFormatter for target specific MIR formating and parsing with
immediate and custom pseudo source values. Target machine can subclass
MIRFormatter and implement custom logic for printing and parsing
immediate and custom pseudo source values for better readability.

* Target specific immediate mnemonic need to start with "." follows by
  identifier string. When MIR parser sees immediate it will call target
  specific parsing function.

* Custom pseudo source value need to start with custom follows by
  double-quoted string. MIR parser will pass the quoted string to target
  specific PSV parsing function.

* MIRFormatter have 2 helper functions to facilitate LLVM value printing
  and parsing for custom PSV if they refers LLVM values.

Reviewers: dsanders, arsenm

Reviewed By: dsanders

Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69836

[lldb] Remove various dead Compare functions

[PowerPC] when folding rlwinm+rlwinm. to andi., we should use first rlwinm
input reg.

%2:gprc = RLWINM %1:gprc, 27, 5, 10
%3:gprc = RLWINM_rec %2:gprc, 8, 5, 10, implicit-def $cr0

==>

%3:gprc = ANDI_rec %1, 0, implicit-def $cr0

we should use %1 instead of %2 as ANDI_rec input.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D71885

Revert "[NFC][InlineCost] Factor cost modeling out of CallAnalyzer traversal."

This reverts commit 76aab66d34446ccf764cf8127b73e1517df75fb4.

Failure:
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/20562,
will investigate and resubmit.

[Attributor][FIX] Carefully change invokes to calls (after manifest)

Before we manually inserted unreachable early but that could lead to
broken PHI nodes. Now we use the existing late modification
functionality.

[Attributor][FIX] Avoid dangling value pointers during code modification

When we replace instructions with unreachable we delete instructions. We
now avoid dangling pointers to those deleted instructions in the
`ToBeChangedToUnreachableInsts` set. Other modification collections
might need to be updated in the future as well.

[NFC][InlineCost] Factor cost modeling out of CallAnalyzer traversal.

Summary:
The goal is to simplify experimentation on the cost model. Today,
CallAnalyzer decides 2 things: legality, and benefit. The refactoring
keeps legality assessment in CallAnalyzer, and factors benefit
evaluation out, as an extension.

Reviewers: davidxl, eraman

Subscribers: kamleshbhalui, fedor.sergeev, hiraditya, baloghadamsoftware, haicheng, a.sidorin, Szelethus, donat.nagy, dkrupp, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71733

[PowerPC]: Add powerpcspe target triple subarch component

Summary:
This allows the use of '-target powerpcspe-unknown-linux-gnu' or
'powerpcspe-unknown-freebsd' to be used, instead of
'-target powerpc-unknown-linux-gnu -mspe'.

Reviewed By: dim
Differential Revision: https://reviews.llvm.org/D72014

Recommit "[MachineVerifier] Improve verification of live-in lists."

MachineVerifier::visitMachineFunctionAfter() is extended to check the
live-through case for live-in lists. This is only done for registers without
aliases and that are neither allocatable or reserved, such as the SystemZ::CC
register.

The MachineVerifier earlier only catched the case of a live-in use without an
entry in the live-in list (as "using an undefined physical register").

A comment in LivePhysRegs.h has been added stating a guarantee that
addLiveOuts() can be trusted for a full register both before and after
register allocation.

Review: Quentin Colombet

Differential Revision: https://reviews.llvm.org/D68267

[libcxx] [test] Disable refwrap/weak_result.pass.cpp in C++20 mode (broken by P0357R3)

[NFC] Whitespace fixes

[X86] Remove EFLAGS from live-in lists in X86FlagsCopyLowering.

When EFLAGS is no longer live into a basic block, remove it from the live-in
list.

Fixes https://bugs.llvm.org/show_bug.cgi?id=44462.

Review: Craig Topper

Differential Revision: https://reviews.llvm.org/D71375

[lldb/SWIG] Refactor extensions to be non Python-specific (2/2)

The current SWIG extensions for the string conversion operator is Python
specific because it uses the PythonObjects. This means that the code
cannot be reused for other SWIG supported languages such as Lua.

This reimplements the extensions in a more generic way that can be
reused. It uses a SWIG macro to reduce code duplication.

Differential revision: https://reviews.llvm.org/D72377

[cfi][test] cross-dso/stats.cpp: don't assume the order of static constructors

__sanitizer_stat_init is called for the executable first, then the
shared object. In WriterModuleReport(), the information for the shared
object will be recorded first. It'd be nice to get rid of the order
requirement of static constructors. (This should make .ctors platforms
work.)

[MLIR] Don't use SSA names directly for std.view canonicalization test

Reviewers: rriddle, nicolasvasilache

Subscribers: mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72408

Revert "Merge memtag instructions with adjacent stack slots."

*** Bad machine code: Tied use must be a register ***
- function: stg_alloca17
- basic block: %bb.0 entry (0x20076710580)
- instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16)
- operand 3: %stack.0.a

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio

This reverts commit b675a7628ce6a21b1e4a71c079a67badfb8b073d.

Delete dead code.

https://reviews.llvm.org/D58856

[lldb/CMake] Only auto-enable Lua when SWIG is found

Just like Python, Lua should only be auto-enabled if SWIG is found as
well. This moves the logic of finding SWIG and Lua as a whole into a new
CMake package.

Revert "[JumpThreading] Thread jumps through two basic blocks"

It looks like my patch breaks the sanitizer-windows build:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324

This reverts commit ead815924e6ebeaf02c31c37ebf7a560b5fdf67b.

[lldb/SWIG] Refactor extensions to be non Python-specific

The current SWIG extensions for the string conversion operator is Python
specific because it uses the PythonObjects. This means that the code
cannot be reused for other SWIG supported languages such as Lua.

This reimplements the extensions in a more generic way that can be
reused.

Differential revision: https://reviews.llvm.org/D72377

[InstSimplify] add tests for select of true/false; NFC

[LLD] [COFF] Fix post-commit suggestions for absolute symbol equality

Differential Revision: https://reviews.llvm.org/D72252

Canonicalize static alloc followed by memref_cast and std.view

Summary: Rewrite alloc, memref_cast, std.view into allo, std.view by droping memref_cast.

Reviewers: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72379

[x86] add test for concat-extract corner case; NFC

See D72361 for discussion.

LTOVisibility.rst: fix up syntax in example

Summary: Pretty self-evident. This example was missing an lparen. Added it, and fixed up the ASCII art.

Patch by Nick Black <dankamongmen@gmail.com>

Reviewers: pcc

Reviewed By: pcc

Subscribers: tejohnson, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D70765

Add a new AST matcher 'optionally'.

This matcher matches any node and at the same time executes all its
inner matchers to produce any possbile result bindings.

This is useful when a user wants certain supplementary information
that's not always present along with the main match result.

Merge memtag instructions with adjacent stack slots.

Summary:
Detect a run of memory tagging instructions for adjacent stack frame slots,
and replace them with a shorter instruction sequence
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop

This code needs to run when stack slot offsets are already known, but before
FrameIndex operands in STG instructions are eliminated; that's the
reason for the new hook in PrologueEpilogue.

This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and base address as a FI operand when
possible. This is needed to simplify recognizing an STGloop instruction
as operating on a stack slot post-regalloc.

This improves memtag code size by ~0.25%, and it looks like an additional ~0.1%
is possible by rearranging the stack frame such that consecutive STG
instructions reference adjacent slots (patch pending).

Reviewers: pcc, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70286

[X86] Keep cl::opts at top of file [NFC]

[MLIR] Fix ML IR build on Windows with Visual Studio

Summary: Right now the path for each lib in whole_archive_link when MSVC is used as the compiler is not a full path - and it's not even the correct path when VS is used to build. This patch sets the lib path to a full path using CMAKE_CFG_INTDIR which means the path will be correct regardless of whether ninja, make or VS is used and it will always be a full path.

Reviewers: denis13, jpienaar

Reviewed By: jpienaar

Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits, asmith

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72403

[clang-tidy] Remove broken test on Windows for `readability-misleading-indentation`.
Because Windows build uses by default `fdelayed-template-parsing` we cannot have a test
where we don't instantiate the template. Please see D72333.

[mlir] NFC: Move the state for managing aliases out of ModuleState and into a new class AliasState.

Summary: This reduces the complexity of ModuleState and simplifies the code. A future revision will mold ModuleState into something that can be used by users for caching of printer state, as well as for implementing printAsOperand style methods.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D72292

[mlir][Linalg] Lower linalg.reshape to LLVM for the static case

Summary:
This diff adds lowering of the linalg.reshape op to LLVM.

A new descriptor is created with fields initialized as follows:
1. allocatedPTr, alignedPtr and offset are copied from the source descriptor
2. sizes are copied from the static destination shape
3. strides are copied from the static strides collected with `getStridesAndOffset`

Only the static case in which the target view conforms to strided memref
semantics is supported. Other cases are left for future work and will be added on
a per-need basis.

Reviewers: ftynse, mravishankar

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72316

[X86] Custom type legalize v4i64->v4f32 uint_to_fp on sse4.1 targets in 64-bit mode

For v4i64->v4f32 uint_to_fp on pre-avx targets where v4i64 isn't legal we create to v2i64->v2f32 uint_to_fp that need to be shuffled together. Our codegen for v2i64->v2f32 involves detecting if the number is larger than (2^31 - 1), if so we do a special divison by 2 so we can do a signed conversion which we need to scalarize, then do a multiply by 2 at the end if we divided earlier.

When v4i64 isn't legal we need to split the checking for a larger number and dividing by 2 into two v2i64 vectors. The scalar part can extract the 4 i64 values from those 4 splits. But we can reassemble the 4 scalar f32 results directly into a single v432 vector. Then we just need to combine the fixup indications from the 2 halves and we can do the final multiply by 2 fixup on all 4 values if needed at once using a single v4f32 blend and v4f32 fadd.

Differential Revision: https://reviews.llvm.org/D72368

[X86] Add isel patterns for bitcasting between v32i1/v64i1 and float/double.

We have to do an intermediate jump to a GPR to make the cast.

Fixes PR43750.

[BranchAlign] Compiler support for suppressing branch align

As discussed heavily in the original review (D70157), there's a need for the compiler to be able to selective suppress padding (either nop or prefix) to respect assumptions about the meaning of labels and instructions in generated code.

Rather than wait for syntax to be finalized - which appears to be a very slow process - this patch focuses on the compiler use case and *only* worries about the integrated assembler. To my knowledge, this covers all cases mentioned to date for clang/JIT support.

For testing purposes, I wired it up so that if the integrated assembler was using autopadding for branch alignment (e.g. enabled at command line) then the textual assembly output would contain a comment for each location where padding was enabled or disabled. This seemed like the least painful choice overall.

Note that the result of this patch effective disables the jcc errata mitigation for many constructs (statepoints, implicit null checks, xray, etc...) which is non ideal. It is at least *correct* and should allow us to enable the mitigation for the compiler. Once that's done, and a few other items are worked through, we probably want to come back to this an explore a bundling based approach instead so that we can pad instructions while keeping labels in the right place.

Differential Revision: https://reviews.llvm.org/D72303

[ELF] Delete an unused special rule from isStaticLinkTimeConstant. NFC

Weak undefined symbols are preemptible after D71794.

  if (sym.isPreemptible)
    return false;
  if (!config->isPic)
    return true;
  // isPic means includeInDynsym is true after D71794.

  ...

  // We can delete this if because it can never be true.
  if (sym.isUndefWeak)
    return true;

Differential Revision: https://reviews.llvm.org/D71795

[ELF] Don't special case weak symbols for pie with no shared objects

D59275 added the following clause to Symbol::includeInDynsym()

if (isUndefWeak() && Config->Pie && SharedFiles.empty())
return false;

D59549 explored the possibility to generalize it for -no-pie.

GNU ld's rules are architecture dependent and partly controlled by -z
{,no-}dynamic-undefined-weak. Our attempts to mimic its rules are
actually half-baked and don't provide perceivable benefits (it can save
a few more weak undefined symbols in .dynsym in a -static-pie
executable). Let's just delete the rule for simplicity. We will expect
cosmetic inconsistencies with ld.bfd in certain -static-pie scenarios.

This permits a simplification in D71795.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D71794

[MC] writeFragment - assert MCFragment::FT_Fill length is legal.

Silence (clang/MSVC) static analyzer warnings that the fragment data may either write out of bounds of the local array or reference uninitialized data.

Fix "pointer is null" static analyzer warning. NFCI.

Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately below in the getSignature call).

Fix "pointer is null" static analyzer warning. NFCI.

Use castAs<> instead of getAs<> since we know that the pointer will be valid (and is dereferenced immediately below).

[lldb/CMake] Use LLDB's autodetection logic for libxml2

Libxml2 is already an optional dependency. It should use the same
infrastructure as the other dependencies.

Differential revision: https://reviews.llvm.org/D72290

[amdgpu] Remove unused header. NFC.

[SelectionDAG] Use llvm::Optional<APInt> for FoldValue.

Use llvm::Optional<APInt> instead of std::pair<APInt, bool> with the bool second being used to report success/failure of fold.

[InstCombine] Adding testcase for Z / (1.0 / Y) => (Y * Z); NFC

The added testcase shows the current transformation for the operation
Z / (1.0 / Y), which remains unchanged. This will be updated to align
with the transformed code (Y * Z) with D72319.

The existing transformation Z / (X / Y) => (Y * Z) / X is not handling
this case as there are multiple uses for (1.0 / Y) in this testcase.

Patch by: @raghesh (Raghesh Aloor)

Differential Revision: https://reviews.llvm.org/D72388

[DAGCombiner] clean up extract-of-concat fold; NFC

This hopes to improve readability and adds an assert.
The functional change noted by the TODO comment is
proposed in:
D72361

[OPENMP]Allow comma in combiner expression.

Use ParseExpression() instead of ParseAssignmentExpression() to allow
commas in combiner expressions.

[JumpThreading] Thread jumps through two basic blocks

Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:

  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above.  Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...

  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70247

[ARM,MVE] Intrinsics for variable shift instructions.

This batch of intrinsics fills in all the shift instructions that take
a variable shift distance in a register, instead of an immediate. Some
of these instructions take a single shift distance in a scalar
register and apply it to all lanes; others take a vector of per-lane
distances.

These instructions are all basically one family, varying in whether
they saturate out-of-range values, and whether they round when bits
are shifted off the bottom. I've implemented them at the IR level by a
much smaller family of IR intrinsics, which take flag parameters to
indicate saturating and/or rounding (along with the usual one to
specify signed/unsigned integers).

An oddity is that all of them are //left// shift instructions – but if
you pass a negative shift count, they'll shift right. So the vector
shift distances are always vectors of //signed// integers, regardless
of whether you're considering the other input vector to be of signed
or unsigned. Also, even the simplest `vshlq` instruction in this
family (neither saturating nor rounding) has to be implemented as an
IR intrinsic, because the ordinary LLVM IR `shl` operation would
consider an out-of-range shift count to be undefined behavior.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72329

[ARM,MVE] Intrinsics for partial-overwrite imm shifts.

This batch of intrinsics covers two sets of immediate shift
instructions, which have in common that they only overwrite part of
their output register and so they need an extra input giving its
previous value.

The VSLI and VSRI instructions shift each lane of the input vector
left or right just as if they were normal immediate VSHL/VSHR, but
then they only overwrite the output bits that correspond to actual
shifted bits of the input. So VSLI will leave the low n bits of each
output lane unchanged, and VSRI the same with the top n bits.

The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input
vector of 2n-bit integers, shift each lane right by a constant, and
then narrowing the shifted result to only n bits. So they only
overwrite half of the n-bit lanes in the output register, and the B/T
suffix indicates whether it's the bottom or top half of each 2n-bit
lane.

I've implemented the whole of the latter family using a single IR
intrinsic `vshrn`, which takes a lot of i32 parameters indicating
which instruction it expands to (by specifying signedness of the input
and output types, whether it saturates and/or rounds, etc).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72328

[clang-tidy] Disable match on `if constexpr` statements in template instantiation for `readability-misleading-indentation` check.

Summary: Fixes fixes `readability-misleading-identation` for `if constexpr`. This is very similar to D71980.

Reviewers: alexfh

Subscribers: xazax.hun, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72333

[Intrinsic] Add fixed point division intrinsics.

Summary:
This patch adds intrinsics and ISelDAG nodes for
signed and unsigned fixed-point division:

llvm.sdiv.fix.*
llvm.udiv.fix.*

These intrinsics perform scaled division on two
integers or vectors of integers. They are required
for the implementation of the Embedded-C fixed-point
arithmetic in Clang.

Patch by: ebevhan

Reviewers: bjope, leonardchan, efriedma, craig.topper

Reviewed By: craig.topper

Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70007