platform/upstream/llvm.git
2 years ago[ELF] Support (TYPE=<value>) to customize the output section type
Fangrui Song [Thu, 17 Feb 2022 20:10:58 +0000 (12:10 -0800)]
[ELF] Support (TYPE=<value>) to customize the output section type

The current output section type syntax allows setting the ELF section type to
SHT_PROGBITS or SHT_NOLOAD. This patch allows an arbitrary section type value
to be specified. Some common SHT_* literal names are supported as well.

```
SECTIONS {
  note (TYPE=SHT_NOTE) : { BYTE(8) *(note) }
  init_array ( TYPE=14 ) : { QUAD(14) }
  fini_array (TYPE = SHT_FINI_ARRAY) : { QUAD(15) }
}
```

When `sh_type` is specified, it is an error if an input section has a different type.
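
The rule above can be sketched in a few lines of Python. This is only an illustration, not lld's implementation: the helper names `parse_section_type` and `check_input_type` are hypothetical, and the table lists a small subset of the standard ELF `SHT_*` values.

```python
# Standard ELF section type constants (a small subset).
SHT_NAMES = {
    "SHT_PROGBITS": 1,
    "SHT_NOTE": 7,
    "SHT_NOBITS": 8,
    "SHT_INIT_ARRAY": 14,
    "SHT_FINI_ARRAY": 15,
    "SHT_PREINIT_ARRAY": 16,
}


def parse_section_type(value):
    """Resolve the <value> in TYPE=<value> to a numeric sh_type (hypothetical helper)."""
    value = value.strip()
    if value in SHT_NAMES:
        return SHT_NAMES[value]
    # Otherwise treat it as an integer literal (decimal or 0x-prefixed).
    return int(value, 0)


def check_input_type(output_type, input_type):
    """Raise if an input section's type differs from the explicit output type."""
    if input_type != output_type:
        raise ValueError(
            f"section type mismatch: expected {output_type}, got {input_type}"
        )
```

With this sketch, `TYPE=SHT_NOTE` and `TYPE=7` resolve to the same value, and an input section of type `SHT_PROGBITS` placed into such an output section would be rejected.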

Our syntax is compatible with GNU ld 2.39 (https://sourceware.org/bugzilla/show_bug.cgi?id=28841).

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D118840

2 years ago[instsimplify] Precommit some tests for provable inequal pointers derived from allocas
Philip Reames [Thu, 17 Feb 2022 20:00:44 +0000 (12:00 -0800)]
[instsimplify] Precommit some tests for provable inequal pointers derived from allocas

2 years ago[EarlyCSE][OpaquePtr] Check access type when performing DSE
Arthur Eubanks [Thu, 17 Feb 2022 19:18:26 +0000 (11:18 -0800)]
[EarlyCSE][OpaquePtr] Check access type when performing DSE

This will bail out on target specific intrinsics. If those are deemed
important enough for EarlyCSE to handle, we can augment MemIntrinsicInfo
with an access type for TargetTransformInfo::getTgtMemIntrinsic() to
handle.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120077

2 years ago[lld] Make error handling functions opaque
Fangrui Song [Thu, 17 Feb 2022 19:54:57 +0000 (11:54 -0800)]
[lld] Make error handling functions opaque

The inline `lld::error` expands to two function calls `errorHandler` and `error`
where the latter is opaque. Move the functions to .cpp files to decrease code
size.

My x86-64 lld executable is 9KiB smaller.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D120002

2 years agoReland "[InstrProf] Make the IndexedInstrProf header backwards compatible."
Snehasish Kumar [Mon, 14 Feb 2022 19:52:40 +0000 (11:52 -0800)]
Reland "[InstrProf] Make the IndexedInstrProf header backwards compatible."

This reverts commit 9fd2cb21fb3f763fc784eab198bf1297a24596fa.

Fixes an issue on big endian systems where the format version
was not converted to little endian prior to passing to GET_VERSION.
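
As a generic illustration of this class of bug (not the InstrProf code itself), the Python sketch below decodes a 64-bit field that is stored little-endian. The function name `read_le_u64` and the version value 8 are assumptions chosen for the example.

```python
import struct


def read_le_u64(buf, offset=0):
    """Decode a 64-bit little-endian field regardless of host byte order."""
    return struct.unpack_from("<Q", buf, offset)[0]


# A version field as it would be stored on disk (illustrative value).
stored = struct.pack("<Q", 8)

# Correct: always interpret the bytes as little-endian.
version = read_le_u64(stored)

# What a big-endian host sees if it reads the bytes without converting.
misread = struct.unpack(">Q", stored)[0]
```

Here `version` is 8, while `misread` is `8 << 56`, so any version check applied to the unconverted value would fail.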

Differential Revision: https://reviews.llvm.org/D118390

2 years agoAST: Move __va_list tag back to std conditionally on AArch64.
Peter Collingbourne [Thu, 6 Jan 2022 21:37:21 +0000 (13:37 -0800)]
AST: Move __va_list tag back to std conditionally on AArch64.

In post-commit feedback on D104830 Jessica Clarke pointed out that
unconditionally adding __va_list to the std namespace caused namespace
debug info to be emitted in C, which is not only inappropriate but
turned out to confuse the dtrace tool. Therefore, move __va_list back
to std only in C++ so that the correct debug info is generated. We
also considered moving __va_list to the top level unconditionally
but this would contradict the specification and be visible to AST
matchers and such, so make it conditional on the language mode.

To avoid breaking name mangling for __va_list, teach the Itanium
name mangler to always mangle it as if it were in the std namespace
when targeting ARM architectures. This logic is not needed for the
Microsoft name mangler because Microsoft platforms define va_list as
a typedef of char *.

Depends on D116773

Differential Revision: https://reviews.llvm.org/D116774

2 years agoAST: Make getEffectiveDeclContext() a member function of ItaniumMangleContextImpl...
Peter Collingbourne [Thu, 17 Feb 2022 19:23:33 +0000 (11:23 -0800)]
AST: Make getEffectiveDeclContext() a member function of ItaniumMangleContextImpl. NFCI.

In an upcoming change we are going to need to access mangler state
from the getEffectiveDeclContext() function. Therefore, make it a
member function of ItaniumMangleContextImpl. Any callers that are
not currently members of ItaniumMangleContextImpl or CXXNameMangler
are made members of one or the other depending on where they are
called from.

Differential Revision: https://reviews.llvm.org/D116773

2 years ago[OpenMP] Add RTL function to externalization RAII
Joseph Huber [Thu, 17 Feb 2022 18:20:51 +0000 (13:20 -0500)]
[OpenMP] Add RTL function to externalization RAII

This patch adds the '_kmpc_get_hardware_num_threads_in_block'
OpenMP RTL function to the externalization RAII struct. This was getting
optimized out and then being replaced with an undefined value once added
back in, causing bugs for complex reductions.

Fixes #53909.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120076

2 years ago[LLDB] Fix TestStructuredBinding.py for libstdc++
Shafik Yaghmour [Thu, 17 Feb 2022 19:29:05 +0000 (11:29 -0800)]
[LLDB] Fix TestStructuredBinding.py for libstdc++

For the tuple case in TestStructuredBinding.py, the result type differs
between libc++ and libstdc++.

2 years ago[ifs] Add --strip-needed flag
Alex Brachet [Thu, 17 Feb 2022 19:24:53 +0000 (19:24 +0000)]
[ifs] Add --strip-needed flag

Reviewed By: haowei, mcgrathr

Differential Revision: https://reviews.llvm.org/D119907

2 years ago[lld-macho] Allow order files and call graph sorting to be used together
Leonard Grey [Fri, 14 Jan 2022 19:37:00 +0000 (14:37 -0500)]
[lld-macho] Allow order files and call graph sorting to be used together

If both an order file and a call graph profile are present, the edges of the
call graph which use symbols present in the order file are not used. All of
the symbols in the order file will appear at the beginning of the section just
as they do currently. In other words, the highest priority derived from the
call graph will be below the lowest priority derived from the order file.
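
The resulting priority scheme can be sketched as follows. This is a simplified Python illustration, not the lld-macho code: `assign_priorities` is a hypothetical name, and it assumes the call graph sort has already produced an ordering of symbols.

```python
def assign_priorities(order_file, call_graph_order):
    """Return {symbol: priority}; a larger number means earlier placement.

    order_file: symbols in order-file order.
    call_graph_order: symbols in the order produced by call graph sorting.
    """
    ordered = list(dict.fromkeys(order_file))  # drop duplicates, keep order
    in_order_file = set(ordered)
    # Symbols already named in the order file keep their order-file position.
    cg_only = [s for s in dict.fromkeys(call_graph_order) if s not in in_order_file]
    sequence = ordered + cg_only
    n = len(sequence)
    return {sym: n - i for i, sym in enumerate(sequence)}
```

For example, with order file `["main", "init"]` and call graph order `["hot", "main", "cold"]`, the result is `main` 4, `init` 3, `hot` 2, `cold` 1, so every order-file symbol outranks every symbol that only appears in the call graph.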

Practically, this change renames CallGraphSort.{h,cpp} to SectionPriorities.{h,cpp},
and most order file and call graph profile related code is moved into the new
file to reduce duplication.

Differential Revision: https://reviews.llvm.org/D117354

2 years ago[DEBUGINFO] [LLDB] Add support for generating debug-info for structured bindings...
Shafik Yaghmour [Thu, 17 Feb 2022 19:13:46 +0000 (11:13 -0800)]
[DEBUGINFO] [LLDB] Add support for generating debug-info for structured bindings of structs and arrays

Currently we are not emitting debug-info for all cases of structured bindings, a
C++17 feature which allows us to bind names to subobjects in an initializer.

A structured binding is represented by a DecompositionDecl AST node and the
bindings are represented by BindingDecl nodes. It looks like the original implementation
only covered the tuple-like case, which is represented by a DeclRefExpr that
contains a VarDecl.

If the binding is to a subobject of the struct the binding will contain a
MemberExpr and in the case of arrays it will contain an ArraySubscriptExpr.
This PR adds support for emitting debug-info for the MemberExpr and ArraySubscriptExpr
cases, as well as llvm and lldb tests for these cases and the tuple case.

Differential Revision: https://reviews.llvm.org/D119178

2 years ago[AArch64] Add extra widening mul tests. NFC
David Green [Thu, 17 Feb 2022 19:11:45 +0000 (19:11 +0000)]
[AArch64] Add extra widening mul tests. NFC

Also regenerate arm64-neon-2velem-high.ll.

2 years ago[AMDGPU] Promote recursive loads from kernel argument to constant
Stanislav Mekhanoshin [Fri, 11 Feb 2022 20:00:05 +0000 (12:00 -0800)]
[AMDGPU] Promote recursive loads from kernel argument to constant

Pointer load chains that are not clobbered are now promoted to global. It
is possible to promote these loads themselves into the constant address
space. Loaded pointers still need to point to global because we
need to be able to store into that pointer and because an actual
load from it may occur after a clobber.

Differential Revision: https://reviews.llvm.org/D119886

2 years ago[mlir] Switch {collapse,expand}_shape ops to the declarative assembly format
Benjamin Kramer [Thu, 17 Feb 2022 18:56:38 +0000 (19:56 +0100)]
[mlir] Switch {collapse,expand}_shape ops to the declarative assembly format

Same functionality, a lot less code.

2 years ago[NFC] Fix debug-info-hotpatch.cpp failure due to downstream regex issue.
Zahira Ammarguellat [Thu, 17 Feb 2022 16:19:40 +0000 (08:19 -0800)]
[NFC] Fix debug-info-hotpatch.cpp failure due to downstream regex issue.

In our downstream, we discovered that the .* wildcard
in debug-info-hotpatch.cpp (added in https://reviews.llvm.org/D116511)
ended up matching the entire line on our Windows configurations, causing
the -function-padmin check to already be consumed. After digging into it
we weren't able to find any reason why the platform would matter
here; however, we suspect there must be some difference in the regex
matcher between systems.
This NFC patch replaces the regex with a more conservative one that
prevents this from happening by replacing the . match with an 'everything
but double-quote' match, [^"].
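
The difference between the two patterns can be seen with Python's `re` module. This is only an analogy: FileCheck uses its own regex matcher, the sample line is made up, and the sketch does not explain the platform difference mentioned above.

```python
import re

# A made-up line containing two quoted strings.
line = 'name: "first", flags: "second"'

# Greedy .* runs to the last double quote on the line.
greedy = re.search(r'"(.*)"', line).group(1)

# [^"]* cannot cross a double quote, so it stops at the first closing quote.
bounded = re.search(r'"([^"]*)"', line).group(1)
```

`greedy` captures `first", flags: "second`, swallowing text that a later check might need, while `bounded` captures only `first`.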

https://reviews.llvm.org/D120066

2 years ago[Clang] Add attributes alloc_size and alloc_align to mm_malloc
Dávid Bolvanský [Thu, 17 Feb 2022 18:58:12 +0000 (19:58 +0100)]
[Clang] Add attributes alloc_size and alloc_align to mm_malloc

LLVM optimizes source code that uses mm_malloc better, especially thanks to the alignment info.

alloc align https://clang.llvm.org/docs/AttributeReference.html#alloc-align
alloc size https://clang.llvm.org/docs/AttributeReference.html#alloc-size

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D117091

2 years ago[X86ISelLowering] permit BlockAddressSDNode "i" constraints for PIC
Nick Desaulniers [Thu, 17 Feb 2022 18:43:12 +0000 (10:43 -0800)]
[X86ISelLowering] permit BlockAddressSDNode "i" constraints for PIC

When building 32b x86 code as PIC, the existing handling of "i"
constraints is conservative since generally we have to go through the
GOT to find references to functions.

But generally, BlockAddresses from C code refer to the Function in the
current TU.  Permit BlockAddresses to be used with the "i" constraint
for those cases.

I regressed this in
commit 4edb9983cb8c ("[SelectionDAG] treat X constrained labels as i for asm")

Fixes: https://github.com/llvm/llvm-project/issues/53868

Reviewed By: efriedma, MaskRay

Differential Revision: https://reviews.llvm.org/D119905

2 years agoFix the declaration printer to properly handle prototypes in C
Aaron Ballman [Thu, 17 Feb 2022 18:52:07 +0000 (13:52 -0500)]
Fix the declaration printer to properly handle prototypes in C

Previously, we would take a declaration like void f(void) and print it
as void f(). That's correct in C++ as far as it goes, but is incorrect
in C because that converts the function from having a prototype to one
which does not.
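
A minimal Python sketch of the printing rule described above, assuming a hypothetical helper `format_params` (this is not the Clang DeclPrinter code):

```python
def format_params(params, has_prototype, is_cplusplus):
    """Render the parameter list of a function declaration.

    params: list of parameter strings, e.g. ["int x", "char *p"].
    has_prototype: False for a K&R-style declaration without a prototype.
    is_cplusplus: True when printing in C++ mode.
    """
    if params:
        return "(" + ", ".join(params) + ")"
    # In C++, an empty list already means "no parameters".
    # In C, an empty list is kept only for declarations without a prototype.
    if is_cplusplus or not has_prototype:
        return "()"
    return "(void)"
```

So a C declaration with a prototype and no parameters prints as `void f(void)`, while the same declaration in C++ prints as `void f()`.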

This turns out to matter for some of our tests that use the pretty
printer where we'd like to get rid of the K&R prototypes from the test
but can't because the test is checking the pretty printed function
signature, as done with the ARCMT tests.

2 years ago[Attributor][FIX] Ensure stable iteration order
Johannes Doerfert [Thu, 17 Feb 2022 18:48:32 +0000 (12:48 -0600)]
[Attributor][FIX] Ensure stable iteration order

With
https://github.com/llvm/llvm-project/commit/668c5c688be7ab0af37739bbbe2d653be82d5c6f
we introduced an ordering issue revealed by the reverse iteration
buildbot. Depending on the order of the map that tracks the AAIsDead AAs
we ended up with slightly different attributes. This is not totally
unexpected and can happen. We should however be deterministic in our
orderings to avoid such issues.

2 years ago[GlobalDCE] [VFE] Add a test for incorrect VFE behavior in presence of null/invalid...
Kuba Mracek [Thu, 17 Feb 2022 05:20:04 +0000 (21:20 -0800)]
[GlobalDCE] [VFE] Add a test for incorrect VFE behavior in presence of null/invalid vtable entries

Add a test for VFE where there's several vtables, and one of them contains an
invalid entry (from VFE's perspective), and which causes VFE to incorrectly skip
scanning subsequent vtables and drop their dependencies.

2 years ago[RewriteStatepointsForGC] Fix an incorrect assertion
Daniil Suchkov [Wed, 16 Feb 2022 23:21:15 +0000 (23:21 +0000)]
[RewriteStatepointsForGC] Fix an incorrect assertion

The assertion verifying that a newly computed value matches what is
already cached used stripPointerCasts() to strip bitcasts, however the
values can be not only pointers, but also vectors of pointers. That is
problematic because stripPointerCasts() doesn't handle vectors of
pointers. This patch introduces an ad-hoc utility function to strip all
bitcasts regardless of the value type.

Reviewed By: skatkov, reames

Differential Revision: https://reviews.llvm.org/D119994

2 years agoRevert "[Driver][Fuchsia][NFC] Use GetLinkerPath to see if linker is lld"
Alex Brachet [Thu, 17 Feb 2022 18:41:49 +0000 (18:41 +0000)]
Revert "[Driver][Fuchsia][NFC] Use GetLinkerPath to see if linker is lld"

This reverts commit b9f4dff8ab40250aac2343e86c1289de46af5585.

2 years ago[OpenMP] Diagnose bad 'omp declare variant' that references itself.
Mike Rice [Wed, 16 Feb 2022 21:58:45 +0000 (13:58 -0800)]
[OpenMP] Diagnose bad 'omp declare variant' that references itself.

When a variant is specified that is the same as the base function,
the compiler will end up crashing in CodeGen. Give an error instead.

Differential Revision: https://reviews.llvm.org/D119979

2 years ago[SystemZ] Improve emission of alignment hints.
Jonas Paulsson [Thu, 17 Feb 2022 01:12:06 +0000 (02:12 +0100)]
[SystemZ] Improve emission of alignment hints.

Handle multiple memoperands in lowerAlignmentHint().

Review: Ulrich Weigand

2 years ago[Driver][Fuchsia][NFC] Use GetLinkerPath to see if linker is lld
Alex Brachet [Thu, 17 Feb 2022 18:20:23 +0000 (18:20 +0000)]
[Driver][Fuchsia][NFC] Use GetLinkerPath to see if linker is lld

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D120074

2 years ago[analyzer] Fix a crash in NoStateChangeVisitor with body-farmed stack frames.
Artem Dergachev [Thu, 17 Feb 2022 05:09:09 +0000 (21:09 -0800)]
[analyzer] Fix a crash in NoStateChangeVisitor with body-farmed stack frames.

LocationContext::getDecl() isn't useful for obtaining the "farmed" body because
the (synthetic) body statement isn't actually attached to the (natural-grown)
declaration in the AST.

Differential Revision: https://reviews.llvm.org/D119509

2 years ago[InstCombine][OpaquePtr] Check store type in DSE implementation
Arthur Eubanks [Thu, 17 Feb 2022 18:00:26 +0000 (10:00 -0800)]
[InstCombine][OpaquePtr] Check store type in DSE implementation

2 years ago[SLP][NFC]Add another test for swapped main/alternate cmp, NFC.
Alexey Bataev [Thu, 17 Feb 2022 17:34:58 +0000 (09:34 -0800)]
[SLP][NFC]Add another test for swapped main/alternate cmp, NFC.

2 years ago[instsimplify] When compare allocas, consider their minimal size
Philip Reames [Thu, 17 Feb 2022 17:50:32 +0000 (09:50 -0800)]
[instsimplify] When compare allocas, consider their minimal size

The code was using exact sizing only, but since what we really need is just to make sure the offsets are in bounds, a minimum bound on the object size is sufficient.

To demonstrate the difference, support computing minimum sizes from objects of scalable vector type.

2 years ago[CUDA][SPIRV] Assign global address space to CUDA kernel arguments
Shangwu Yao [Thu, 17 Feb 2022 17:38:06 +0000 (09:38 -0800)]
[CUDA][SPIRV] Assign global address space to CUDA kernel arguments

This patch converts CUDA pointer kernel arguments with default address space to
CrossWorkGroup address space (__global in OpenCL). This is because Generic or
Function (OpenCL's private) is not supported as storage class for kernel pointer types.

Differential Revision: https://reviews.llvm.org/D119207

2 years ago[instsimplify] Fix a miscompile with zero sized allocas
Philip Reames [Thu, 17 Feb 2022 17:21:42 +0000 (09:21 -0800)]
[instsimplify] Fix a miscompile with zero sized allocas

Remove some code which tried to handle the case of comparing two allocas where an object size could not be precisely computed.  This code had zero coverage in tree, and at least one nasty bug.

The bug comes from the fact that the code uses the size of the result pointer as a proxy for whether the alloca can be of size zero.  Since the result of an alloca is *always* a pointer type, and a pointer type can *never* be empty, this check was a nop.  As a result, we blindly consider a zero offset from two allocas to never be equal.  They can in fact be equal when one or more of the allocas is zero sized.

This is particularly ugly because instcombine contains the exact opposite rule.  If instcombine reaches the allocas first, it combines them into one (making them equal).  If instsimplify reaches the compare first, it would consider them not equal.  This creates all kinds of fun scenarios for order of optimization reaching different and contradictory conclusions.

2 years ago[flang] Lower simple scalar assignment
Valentin Clement [Thu, 17 Feb 2022 17:23:22 +0000 (18:23 +0100)]
[flang] Lower simple scalar assignment

This patch handles lowering of simple scalar assignments.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D120058

Co-authored-by: Jean Perier <jperier@nvidia.com>
2 years ago[libc] Add exit and atexit
Alex Brachet [Thu, 17 Feb 2022 17:21:55 +0000 (17:21 +0000)]
[libc] Add exit and atexit

Often atexit is implemented using __cxa_atexit. I have not implemented __cxa_atexit here because it potentially requires more discussion. It is unique for llvm-libc (I think) that it is an exported symbol that wouldn’t be defined in any spec file because it doesn’t have a header. Implementing it will be trivial given what is here already, but I figured it would be more contentious so it can be implemented later.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D119512

2 years ago[clangd] Fix building SerializationTests unit test on OpenBSD
Brad Smith [Thu, 17 Feb 2022 17:15:14 +0000 (12:15 -0500)]
[clangd] Fix building SerializationTests unit test on OpenBSD

This fixes building the unit tests on OpenBSD. OpenBSD does not support RLIMIT_AS.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D119989

2 years ago[RuntimeDyld] Fix building on OpenBSD
Brad Smith [Thu, 17 Feb 2022 17:09:21 +0000 (12:09 -0500)]
[RuntimeDyld] Fix building on OpenBSD

With https://reviews.llvm.org/D105466 the tree does not build on OpenBSD/amd64.
Moritz suggested only building this code on Linux.

Reviewed By: MoritzS

Differential Revision: https://reviews.llvm.org/D119991

2 years ago[instsimplify] Precommit a test showing an alloca equality miscompile
Philip Reames [Thu, 17 Feb 2022 17:12:17 +0000 (09:12 -0800)]
[instsimplify] Precommit a test showing an alloca equality miscompile

2 years ago[RISCV] Add the policy operand for nomask vector Multiply-Add IR intrinsics.
Zakk Chen [Wed, 16 Feb 2022 07:20:51 +0000 (23:20 -0800)]
[RISCV] Add the policy operand for nomask vector Multiply-Add IR intrinsics.

The goal is to support tail and mask policy in RVV builtins.
We focus on the IR part first.

The nomask vector Multiply-Add intrinsics need a policy operand
because the merge value cannot be undef.

Reviewed By: monkchiang

Differential Revision: https://reviews.llvm.org/D119727

2 years ago[AArch64][SVE] Add structured load/store opcodes to getMemOpInfo
Kerry McLaughlin [Thu, 17 Feb 2022 16:00:36 +0000 (16:00 +0000)]
[AArch64][SVE] Add structured load/store opcodes to getMemOpInfo

Currently, loading from or storing to a stack location with a structured load
or store crashes in isAArch64FrameOffsetLegal as the opcodes are not handled by
getMemOpInfo. This patch adds the opcodes for structured load/store instructions
with an immediate index to getMemOpInfo & getLoadStoreImmIdx, setting appropriate
values for the scale, width & min/max offsets.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D119338

2 years ago[NFC][llvm-nm] refactor function dumpSymbolNamesFromFile
zhijian [Thu, 17 Feb 2022 17:04:04 +0000 (12:04 -0500)]
[NFC][llvm-nm] refactor function dumpSymbolNamesFromFile
Summary:
split the function into several small functions.

Reviewers: James Henderson,Fangrui Song
Differential Revision: https://reviews.llvm.org/D119974

2 years agoadd missing include
Adrian Prantl [Thu, 17 Feb 2022 17:02:19 +0000 (09:02 -0800)]
add missing include

2 years ago[clang] Sema::CheckEquivalentExceptionSpec - remove useless nullptr test
Simon Pilgrim [Thu, 17 Feb 2022 16:59:41 +0000 (16:59 +0000)]
[clang] Sema::CheckEquivalentExceptionSpec - remove useless nullptr test

We use castAs<> for NewProto/OldProto, which would assert if the cast failed.

2 years agoAdd support for floating-point option `ffp-eval-method` and for
Zahira Ammarguellat [Tue, 19 Oct 2021 16:12:57 +0000 (09:12 -0700)]
Add support for floating-point option `ffp-eval-method` and for
`pragma clang fp eval_method`.

https://reviews.llvm.org/D109239

2 years ago[clang] [NFC] More exhaustive tests for deducing void return types
Arthur O'Dwyer [Mon, 14 Feb 2022 20:11:47 +0000 (15:11 -0500)]
[clang] [NFC] More exhaustive tests for deducing void return types

Differential Revision: https://reviews.llvm.org/D119772

2 years ago[mlir][linalg][sparse] add linalg optimization passes "upstream"
Aart Bik [Wed, 16 Feb 2022 20:56:43 +0000 (12:56 -0800)]
[mlir][linalg][sparse] add linalg optimization passes "upstream"

It is time to compose Linalg related optimizations with SparseTensor
related optimizations. This is a careful first start by adding some
general Linalg optimizations "upstream" of the sparse compiler in the
full sparse compiler pipeline. Some minor changes were needed to make
those optimizations aware of sparsity.

Note that after this, we will add a sparse specific fusion rule,
just to demonstrate the power of the new composition.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D119971

2 years ago[SCEVExpander][OpaquePtr] Check GEP source type when finding identical GEP
Arthur Eubanks [Thu, 17 Feb 2022 04:34:55 +0000 (20:34 -0800)]
[SCEVExpander][OpaquePtr] Check GEP source type when finding identical GEP

Fixes an opaque pointers miscompile.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120004

2 years ago[test][IndVarSimplify][OpaquePtr] Precommit test
Arthur Eubanks [Thu, 17 Feb 2022 04:34:08 +0000 (20:34 -0800)]
[test][IndVarSimplify][OpaquePtr] Precommit test

2 years ago[AArch64] Remove an unused variable in my previous patch
John Brawn [Thu, 17 Feb 2022 16:37:38 +0000 (16:37 +0000)]
[AArch64] Remove an unused variable in my previous patch

2 years agoTitle: Export unique symbol list with llvm-nm new option "--export-symbols"
zhijian [Thu, 17 Feb 2022 16:37:33 +0000 (11:37 -0500)]
Title: Export unique symbol list with llvm-nm new option "--export-symbols"

Summary:

The patch implements the following functionality:
1. export the symbols from archive or object files.
2. sort the exported symbols (by symbol name and visibility).
3. delete duplicate exported symbols (those with the same symbol name and visibility).
4. print the unique, sorted exported symbols (symbol name and visibility).

Two new options are added in the patch:
1. --export-symbols (enables exporting the unique symbol list)
2. --no-rsrc (excludes symbols whose names begin with "__rsrc" from being exported from XCOFF object files)
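
The sort, deduplicate and filter steps listed above can be sketched in Python. This is an illustration only: `export_symbols` is a hypothetical function, and the visibility strings are placeholders rather than llvm-nm's actual output.

```python
def export_symbols(symbols, no_rsrc=False):
    """Return the unique, sorted (name, visibility) pairs to export.

    symbols: iterable of (name, visibility) tuples.
    no_rsrc: when True, drop symbols whose names begin with "__rsrc".
    """
    if no_rsrc:
        symbols = (s for s in symbols if not s[0].startswith("__rsrc"))
    # A set removes exact (name, visibility) duplicates; tuples sort by
    # name first and visibility second.
    return sorted(set(symbols))
```

Two entries with the same name but different visibility are both kept, since duplicates are defined on the (name, visibility) pair.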

For XCOFF object files, the export symbol list has the same functionality as
https://www.ibm.com/docs/en/xl-c-aix/13.1.0?topic=library-exporting-symbols-createexportlist-utility

Reviewers: James Henderson,Fangrui Song
Differential Revision: https://reviews.llvm.org/D112735

2 years ago[objcopy] followup patch after f75da0c8e65cf1b09012a8b62cd7f3e9a646bbc9
Alexey Lapshin [Thu, 17 Feb 2022 16:04:51 +0000 (19:04 +0300)]
[objcopy] followup patch after f75da0c8e65cf1b09012a8b62cd7f3e9a646bbc9

2 years ago[RISCV] Match shufflevector corresponding to slideup.
Craig Topper [Thu, 17 Feb 2022 16:17:41 +0000 (08:17 -0800)]
[RISCV] Match shufflevector corresponding to slideup.

This generalizes isElementRotate to work when there's only a single
slide needed. I've removed matchShuffleAsSlideDown which is now
redundant.

Reviewed By: frasercrmck, khchen

Differential Revision: https://reviews.llvm.org/D119759

2 years ago[RISCV] Fix incorrect MemOperand copy converting splat+load to vlse.
Craig Topper [Thu, 17 Feb 2022 16:10:20 +0000 (08:10 -0800)]
[RISCV] Fix incorrect MemOperand copy converting splat+load to vlse.

Due to an incorrect copy/paste from load intrinsic handling we
checked if the splat node was a MemSDNode which of course it isn't.

Instead get the MemOperand from the LoadSDNode for the source of
the splat.

This enables LICM to see the load is loop invariant and hoist it
out of the loop.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120014

2 years ago[AArch64] Add some missing strict FP vector lowering
John Brawn [Tue, 25 Jan 2022 17:53:58 +0000 (17:53 +0000)]
[AArch64] Add some missing strict FP vector lowering

Also add a test for the codegen of strict FP vector operations so
these changes get tested.

Differential Revision: https://reviews.llvm.org/D117795

2 years ago[GlobalDCE] Simplify and return Changed = true less often
Jay Foad [Thu, 17 Feb 2022 14:17:36 +0000 (14:17 +0000)]
[GlobalDCE] Simplify and return Changed = true less often

Removing dead constants should not count as making a change to the
module. This means that RemoveUnusedGlobalValue simplifies to just
calling removeDeadConstantUsers, so inline it.

Differential Revision: https://reviews.llvm.org/D120052

2 years ago[AArch64][SVE] Invert VSelect operand order and condition for predicated arithmetic...
Matt Devereau [Tue, 8 Feb 2022 14:24:03 +0000 (14:24 +0000)]
[AArch64][SVE] Invert VSelect operand order and condition for predicated arithmetic operations

   (vselect (setcc ( condcode) (_) (_)) (a)          (op (a) (b)))
=> (vselect (setcc (!condcode) (_) (_)) (op (a) (b)) (a))

As a follow up to D117689, invert the operand order and condition
in order to fold vselects into predicated instructions.
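
The equivalence behind this rewrite can be checked with a small element-wise model in Python. The `vselect` function here is a plain list-based stand-in for the DAG node, and the condition and operation are arbitrary examples.

```python
def vselect(mask, if_true, if_false):
    """Element-wise select: take if_true where mask is set, else if_false."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
cond = [x > 2 for x in a]            # stands in for (setcc condcode _ _)
op = [x + y for x, y in zip(a, b)]   # stands in for (op a b)

original = vselect(cond, a, op)
inverted = vselect([not c for c in cond], op, a)
```

Both forms produce `[11, 22, 3, 4]`. In the inverted form the operation result sits in the "true" position with `a` as the fallback, which is the shape a predicated instruction provides.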

Differential Revision: https://reviews.llvm.org/D119424

2 years ago[polly] Fix regression test after D110620.
Michael Kruse [Thu, 17 Feb 2022 15:42:15 +0000 (09:42 -0600)]
[polly] Fix regression test after D110620.

2 years agoRevert "[JITLink][RISCV] fix the extractBits behavior and add R_RISCV_JAL relocation."
fourdim [Thu, 17 Feb 2022 15:40:32 +0000 (23:40 +0800)]
Revert "[JITLink][RISCV] fix the extractBits behavior and add R_RISCV_JAL relocation."

This reverts commit 3af7bbca4a0ef64de64b8bb38d3b167673ec60f0.

2 years ago[InstCombine] push constant operand down/outside in sequence of min/max intrinsics
Sanjay Patel [Thu, 17 Feb 2022 15:34:48 +0000 (10:34 -0500)]
[InstCombine] push constant operand down/outside in sequence of min/max intrinsics

A generalization like this was suggested in D119754.
This is the inverse direction of D119851,
and we get all of the folds there plus the one that was missed.
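
As a sanity check of the underlying algebra, the Python sketch below assumes the fold has the shape max(max(x, C), y) -> max(max(x, y), C). That shape is one reading of the commit title, not a statement of the exact patterns InstCombine matches, and the constant 42 is arbitrary.

```python
from itertools import product

C = 42  # arbitrary constant operand


def before(x, y):
    """Constant applied in the inner max."""
    return max(max(x, C), y)


def after(x, y):
    """Constant pushed to the outer max."""
    return max(max(x, y), C)


def same_on_range(lo=-50, hi=100):
    """Exhaustively compare both forms over a small integer range."""
    return all(before(x, y) == after(x, y)
               for x, y in product(range(lo, hi), repeat=2))
```

With the constant on the outside, a further max against another constant can be folded into a single constant operand.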

There is precedent for this kind of transform in instcombine
with "or" instructions (but strangely only with that one opcode AFAICT).

Similar justification as in the other patch:
The line between instcombine and reassociate for these kinds of folds
is blurry. This doesn't appear to have much cost and gives us the
expected wins from repeated folds as seen in the last set of test diffs.

Differential Revision: https://reviews.llvm.org/D119955

2 years ago[InstCombine] add test for min/max intrinsic with constant expression; NFC
Sanjay Patel [Wed, 16 Feb 2022 18:57:38 +0000 (13:57 -0500)]
[InstCombine] add test for min/max intrinsic with constant expression; NFC

2 years ago[OpenMP][Offloading] Fix test case issues in bug49334.cpp
Shilei Tian [Thu, 17 Feb 2022 15:22:16 +0000 (10:22 -0500)]
[OpenMP][Offloading] Fix test case issues in bug49334.cpp

`bug49334.cpp` has an issue that causes the flaky results reported in #53730.
The root cause is that `BlockedC` is never initialized, but in `BlockMatMul_TargetNowait`
it is directly read and written (via `+=`). Fixes #53730.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D119988

2 years ago[JITLink][RISCV] fix the extractBits behavior and add R_RISCV_JAL relocation.
fourdim [Thu, 17 Feb 2022 15:00:55 +0000 (23:00 +0800)]
[JITLink][RISCV] fix the extractBits behavior and add R_RISCV_JAL relocation.

This patch supports the R_RISCV_JAL relocation.
Moreover, it fixes the behavior of the extractBits function, which previously extracted Size + 1 bits.
In the test ELF_jal.s:
Before:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 480
```
After:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 224
```
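
The numbers above can be reproduced with a short Python model. The two functions are a sketch of the described behavior (extracting `size` bits versus `size + 1` bits starting at bit `low`), not the JITLink source.

```python
def extract_bits(num, low, size):
    """Return `size` bits of `num`, starting at bit `low`."""
    return (num >> low) & ((1 << size) - 1)


def extract_bits_off_by_one(num, low, size):
    """Model of the old behavior: one bit too many is extracted."""
    return (num >> low) & ((1 << (size + 1)) - 1)


hi = 4294836480  # 0xFFFE0100, the value from ELF_jal.s
fixed = extract_bits(hi, 12, 8)
buggy = extract_bits_off_by_one(hi, 12, 8)
```

`hi >> 12` is `0xFFFE0`; masking with 8 bits gives `0xE0` (224), while masking with 9 bits gives `0x1E0` (480), matching the "After" and "Before" outputs.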

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D117975

2 years ago[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics.
Zakk Chen [Mon, 14 Feb 2022 02:09:27 +0000 (18:09 -0800)]
[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics.

Add the passthru operand for
VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also.

The goal is to support tail and mask policy in RVV builtins.
We focus on the IR part first.
If the passthru operand is undef, we use tail agnostic; otherwise we
use tail undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D119688

2 years agotsan: Add a missing disable_sanitizer_instrumentation attribute
Alexander Potapenko [Thu, 17 Feb 2022 14:09:31 +0000 (15:09 +0100)]
tsan: Add a missing disable_sanitizer_instrumentation attribute

Turns out the test was working by accident: we need to ensure
TSan instrumentation is not called from the fork() hook, otherwise the
tool will deadlock. Previously it worked because alloc_free_blocks() got
inlined into __tsan_test_only_on_fork(), but it cannot always be the
case.

Adding __attribute__((disable_sanitizer_instrumentation)) will prevent
TSan from instrumenting alloc_free_blocks().

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D120050

2 years ago[mlir][spirv] Add a pass to unify aliased resource variables
Lei Zhang [Thu, 17 Feb 2022 14:08:15 +0000 (09:08 -0500)]
[mlir][spirv] Add a pass to unify aliased resource variables

In SPIR-V, resources are represented as global variables that
are bound to a certain descriptor. SPIR-V requires those global
variables to be declared as aliased if multiple ones are bound
to the same slot. Such aliased decorations can cause issues
for transcompilers like SPIRV-Cross when converting to source
shading languages like MSL.

So this commit adds a pass to perform analysis of aliased
resources and see if we can unify them into one.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D119872

2 years ago[SLP][NFC]Fix misprint in function name, NFC.
Alexey Bataev [Thu, 17 Feb 2022 13:40:01 +0000 (05:40 -0800)]
[SLP][NFC]Fix misprint in function name, NFC.

2 years ago[gn build] (manually) port f75da0c8e65c (ObjCopy lib)
Nico Weber [Thu, 17 Feb 2022 13:56:06 +0000 (08:56 -0500)]
[gn build] (manually) port f75da0c8e65c (ObjCopy lib)

2 years ago[RISCV] add the MC layer support of Zfinx extension
Shao-Ce SUN [Thu, 17 Feb 2022 13:02:58 +0000 (21:02 +0800)]
[RISCV] add the MC layer support of Zfinx extension

This patch adds MC layer support for the Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298

2 years ago[X86] selectLEAAddr - add X86ISD::SMUL/UMULO handling
Simon Pilgrim [Thu, 17 Feb 2022 13:50:52 +0000 (13:50 +0000)]
[X86] selectLEAAddr - add X86ISD::SMUL/UMULO handling

After D118128 relaxed the heuristic to require only one EFLAGS generating operand, it now makes sense to avoid X86ISD::SMUL/UMULO duplication as well.

Differential Revision: https://reviews.llvm.org/D119578

2 years ago[libc][automemcpy] Introduce geomean of scores as a tie breaker
Guillaume Chatelet [Thu, 17 Feb 2022 12:31:00 +0000 (12:31 +0000)]
[libc][automemcpy] Introduce geomean of scores as a tie breaker

Differential Revision: https://reviews.llvm.org/D120040

2 years ago[DAGCombiner] Extend ISD::ABDS/U combine to handle more cases.
Paul Walker [Fri, 10 Dec 2021 18:05:38 +0000 (18:05 +0000)]
[DAGCombiner] Extend ISD::ABDS/U combine to handle more cases.

The current ABD combine doesn't quite work for SVE because only a
single scalable vector per scalar integer type is legal (e.g. for
i32, <vscale x 4 x i32> is the only legal scalable vector type).

This patch extends the combine to also trigger for the cases when
operand extension must be retained.

Differential Revision: https://reviews.llvm.org/D115739

2 years ago[DAG] Fix in ReplaceAllUsesOfValuesWith
Bjorn Pettersson [Fri, 4 Feb 2022 17:32:13 +0000 (18:32 +0100)]
[DAG] Fix in ReplaceAllUsesOfValuesWith

When doing SelectionDAG::ReplaceAllUsesOfValuesWith a worklist is
prepared containing all users that should be updated. Then we use
the RemoveNodeFromCSEMaps/AddModifiedNodeToCSEMaps helpers to handle
recursive CSE updates while doing the replacements.

This patch aims at solving a problem that could arise if the recursive
CSE updates result in an SDNode present in the worklist being
removed as a side effect of morphing a prior user in the worklist.

To exemplify such a scenario, imagine that we have these nodes in
the DAG
   t12: i64 = add t8, t11
   t13: i64 = add t12, t8
   t14: i64 = add t11, t11
   t15: i64 = add t14, t8
   t16: i64 = sub t13, t15
and that the t8 uses should be replaced by t11. An initial worklist
(listing the users that should be morphed) could be [t12, t13, t15].
When updating t12 we get
   t12: i64 = add t11, t11
which results in a CSE update that replaces t14 by t12, so we get
   t15: i64 = add t12, t8
which results in a CSE update that replaces t13 by t12, so we get
   t16: i64 = sub t12, t15
and then t13 is removed, given that t16 was the last user of t13.

So once the updates triggered by rewriting the use of t8 in t12 are
done, the t13 node no longer exists. We used to end up hitting an
assertion when continuing with the worklist and trying to replace
the t8 uses in t13.

The solution is based on using a DAGUpdateListener, making sure that
we prune a user from the worklist if it is removed during the
recursive CSE updates.
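
The pruning idea can be sketched outside of LLVM roughly as follows. This is
a toy model only: `ToyGraph`, `onNodeDeleted` and `processWorklist` are
hypothetical names for illustration and do not reflect the real
SelectionDAG or DAGUpdateListener interfaces.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Toy stand-in for a graph that notifies a listener when a node is deleted.
struct ToyGraph {
  std::function<void(int)> onNodeDeleted; // hypothetical deletion callback
  void deleteNode(int id) {
    if (onNodeDeleted)
      onNodeDeleted(id);
  }
};

// Visits every node id in the worklist. `update` may delete other nodes as a
// side effect; deleted nodes are pruned from the pending worklist so they are
// never visited. Returns the ids that were actually processed, in order.
std::vector<int>
processWorklist(ToyGraph &g, std::vector<int> worklist,
                const std::function<void(ToyGraph &, int)> &update) {
  std::vector<int> processed;
  // Listener: drop a deleted node from the remaining worklist.
  g.onNodeDeleted = [&worklist](int id) {
    worklist.erase(std::remove(worklist.begin(), worklist.end(), id),
                   worklist.end());
  };
  while (!worklist.empty()) {
    int node = worklist.front();
    worklist.erase(worklist.begin());
    processed.push_back(node);
    update(g, node); // may call g.deleteNode() on pending nodes
  }
  g.onNodeDeleted = nullptr;
  return processed;
}
```

With the worklist [t12, t13, t15] from the example, an update of t12 that
deletes t13 leaves only t15 to be visited afterwards.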

The bug was found using an OOT target. I think the problem is quite
old, even if the particular in-tree target reproducer added in this
patch seems to pass when using LLVM 13.0.0.

Differential Revision: https://reviews.llvm.org/D119088

2 years ago[clang-doc] SerializeIndex - pass Index param by constant reference
Simon Pilgrim [Thu, 17 Feb 2022 13:28:02 +0000 (13:28 +0000)]
[clang-doc] SerializeIndex - pass Index param by constant reference

Silence coverity warnings about unnecessary copies

2 years ago[NFC] Fix comment
Shao-Ce SUN [Thu, 17 Feb 2022 13:19:14 +0000 (21:19 +0800)]
[NFC] Fix comment

2 years ago[clang] CGDebugInfo::getOrCreateMethodType - use castAs<> instead of getAs<> to avoid...
Simon Pilgrim [Thu, 17 Feb 2022 13:18:02 +0000 (13:18 +0000)]
[clang] CGDebugInfo::getOrCreateMethodType - use castAs<> instead of getAs<> to avoid dereference of nullptr

The pointer is always dereferenced, so assert the cast is correct instead of returning nullptr
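
A rough illustration of the difference between a checked query and an
asserting cast. The types and helpers below are hypothetical toy stand-ins,
not Clang's actual getAs<>/castAs<> implementation:

```cpp
#include <cassert>

// Toy type hierarchy with a kind tag (hypothetical, for illustration only).
struct Type {
  enum Kind { FunctionKind, PointerKind };
  explicit Type(Kind k) : kind(k) {}
  Kind kind;
};

struct FunctionType : Type {
  explicit FunctionType(int n) : Type(FunctionKind), numParams(n) {}
  int numParams;
};

// getAs-style query: returns nullptr on a mismatch, so every caller has to
// check the result before dereferencing it.
const FunctionType *getAsFunction(const Type *t) {
  return t->kind == Type::FunctionKind ? static_cast<const FunctionType *>(t)
                                       : nullptr;
}

// castAs-style cast: a mismatch is a programming error and is caught by the
// assertion instead of surfacing later as a null dereference.
const FunctionType *castAsFunction(const Type *t) {
  assert(t->kind == Type::FunctionKind && "expected a function type");
  return static_cast<const FunctionType *>(t);
}

// The result is always dereferenced here, so the asserting form fits.
int paramCount(const Type *t) { return castAsFunction(t)->numParams; }
```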

2 years ago[clang] CGCXXABI::EmitLoadOfMemberFunctionPointer - use castAs<> instead of getAs...
Simon Pilgrim [Thu, 17 Feb 2022 13:15:19 +0000 (13:15 +0000)]
[clang] CGCXXABI::EmitLoadOfMemberFunctionPointer - use castAs<> instead of getAs<> to avoid dereference of nullptr

The pointer is always dereferenced by arrangeCXXMethodType, so assert the cast is correct instead of returning nullptr

2 years ago[NFC] Correct typo `interger` to `integer`
Shao-Ce SUN [Thu, 17 Feb 2022 13:13:00 +0000 (21:13 +0800)]
[NFC] Correct typo `interger` to `integer`

2 years ago[BufferDeallocation] Don't assume successor operands are unique
Benjamin Kramer [Thu, 17 Feb 2022 12:51:27 +0000 (13:51 +0100)]
[BufferDeallocation] Don't assume successor operands are unique

This would create a double free when a memref is passed twice to the
same op. This wasn't a problem at the time the pass was written but is
common since the introduction of scf.while.

There's a latent non-determinism that's triggered by the test, but this
change is messy enough as-is so I'll leave that for later.

Differential Revision: https://reviews.llvm.org/D120044

2 years ago[AArch64] Allow strict opcodes in faddp patterns
John Brawn [Fri, 28 Jan 2022 15:05:39 +0000 (15:05 +0000)]
[AArch64] Allow strict opcodes in faddp patterns

This also requires adjustment to code in AArch64ISelLowering so that
vector_extract is distributed over strict_fadd.

Differential Revision: https://reviews.llvm.org/D118489

2 years ago[AArch64] Allow strict opcodes in indexed fmul and fma patterns
John Brawn [Fri, 28 Jan 2022 14:31:17 +0000 (14:31 +0000)]
[AArch64] Allow strict opcodes in indexed fmul and fma patterns

Using an indexed version instead of a non-indexed version doesn't
change anything with regard to exceptions or rounding.

Differential Revision: https://reviews.llvm.org/D118487

2 years ago[AArch64] Allow strict opcodes in fp->int->fp patterns
John Brawn [Fri, 28 Jan 2022 14:10:51 +0000 (14:10 +0000)]
[AArch64] Allow strict opcodes in fp->int->fp patterns

These patterns don't change the fundamental instructions that are
used, just the variants that are used in order to remove some extra
MOVs.

Differential Revision: https://reviews.llvm.org/D118485

2 years ago[AArch64] Add instruction selection for strict FP
John Brawn [Fri, 7 Jan 2022 14:47:26 +0000 (14:47 +0000)]
[AArch64] Add instruction selection for strict FP

This consists of marking the various strict opcodes as legal, and
adjusting instruction selection patterns so that 'op' is 'any_op'.

FP16 and vector instructions additionally require some extra work in
lowering and legalization, so we can't set IsStrictFPEnabled just yet.
Also more work needs to be done for full strict fp support (marking
instructions that can raise exceptions as such, and modelling FPCR use
for controlling rounding).

Differential Revision: https://reviews.llvm.org/D114946

2 years ago[llvm][automemcpy] Allow distribution filtering in analysis
Guillaume Chatelet [Thu, 17 Feb 2022 11:51:59 +0000 (11:51 +0000)]
[llvm][automemcpy] Allow distribution filtering in analysis

Differential Revision: https://reviews.llvm.org/D120037

2 years ago[AArch64][NFC] Fix unused-lambda-capture warning.
Adrian Kuegel [Thu, 17 Feb 2022 12:31:50 +0000 (13:31 +0100)]
[AArch64][NFC] Fix unused-lambda-capture warning.

Differential Revision: https://reviews.llvm.org/D120041

2 years ago[Docs] Use correct rst syntax
Nikita Popov [Thu, 17 Feb 2022 13:08:29 +0000 (14:08 +0100)]
[Docs] Use correct rst syntax

2 years agoRemove duplicated code for printing the `uwtable` attribute (NFC)
Momchil Velikov [Thu, 17 Feb 2022 11:18:11 +0000 (11:18 +0000)]
Remove duplicated code for printing the `uwtable` attribute (NFC)

Committed as obvious.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D120030

2 years ago[Bazel] Fix build after ObjCopy move.
Adrian Kuegel [Thu, 17 Feb 2022 12:02:14 +0000 (13:02 +0100)]
[Bazel] Fix build after ObjCopy move.

Differential Revision: https://reviews.llvm.org/D120039

2 years ago[flang][driver] Add support for `-emit-llvm`
Andrzej Warzynski [Fri, 4 Feb 2022 17:15:12 +0000 (17:15 +0000)]
[flang][driver] Add support for `-emit-llvm`

This patch adds support for the `-emit-llvm` option in the frontend
driver (i.e. `flang-new -fc1`). Similarly to Clang, `flang-new -fc1
-emit-llvm file.f` will generate a textual LLVM IR file.

Depends on D118985

Differential Revision: https://reviews.llvm.org/D119012

2 years ago[libc][automemcpy] Add mean/variance and simplify implementation
Guillaume Chatelet [Thu, 17 Feb 2022 10:56:25 +0000 (10:56 +0000)]
[libc][automemcpy] Add mean/variance and simplify implementation

Differential Revision: https://reviews.llvm.org/D120031

2 years ago[Docs] Update opaque pointers docs
Nikita Popov [Thu, 17 Feb 2022 12:00:46 +0000 (13:00 +0100)]
[Docs] Update opaque pointers docs

Expand migration instructions.

2 years ago[NFC][PhaseOrdering] Improve test coverage for D119975
Roman Lebedev [Thu, 17 Feb 2022 11:30:02 +0000 (14:30 +0300)]
[NFC][PhaseOrdering] Improve test coverage for D119975

2 years ago[SystemZ] lowerDYNAMIC_STACKALLOC_XPLINK - use cast<> instead of dyn_cast<> to avoid...
Simon Pilgrim [Thu, 17 Feb 2022 11:56:29 +0000 (11:56 +0000)]
[SystemZ] lowerDYNAMIC_STACKALLOC_XPLINK - use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointer is always dereferenced, so assert the cast is correct instead of returning nullptr

2 years ago[X86] X86tcret_1reg - use cast<> instead of dyn_cast<> to avoid dereference of nullptr
Simon Pilgrim [Thu, 17 Feb 2022 11:54:12 +0000 (11:54 +0000)]
[X86] X86tcret_1reg - use cast<> instead of dyn_cast<> to avoid dereference of nullptr

The pointer is always dereferenced, so assert the cast is correct instead of returning nullptr

2 years agoAArch64_MC::isQForm - Fix MSVC 'no default capture mode' lambda warning
Simon Pilgrim [Thu, 17 Feb 2022 11:41:47 +0000 (11:41 +0000)]
AArch64_MC::isQForm - Fix MSVC 'no default capture mode' lambda warning

2 years ago[RelLookupTableConverter] Ensure that GV, GEP and load types match
Nikita Popov [Thu, 17 Feb 2022 10:59:04 +0000 (11:59 +0100)]
[RelLookupTableConverter] Ensure that GV, GEP and load types match

This code could be generalized to be type-independent, but for now
just ensure that the same type constraints are enforced with opaque
pointers as with typed pointers.

2 years ago[SCEV] Infer ranges for SCC consisting of cycled Phis
Max Kazantsev [Thu, 17 Feb 2022 10:38:42 +0000 (17:38 +0700)]
[SCEV] Infer ranges for SCC consisting of cycled Phis

Our current strategy of computing ranges of SCEVUnknown Phis was to simply
compute the union of ranges of all its inputs. In order to avoid infinite recursion,
we mark Phis as pending and conservatively return full set for them. As a result,
even the simplest patterns of cycled Phis always have a range of full set.

This patch makes this logic a bit smarter. We basically do the same, but instead
of taking inputs of single Phi we find its strongly connected component (SCC)
and compute the union of all inputs that come into this SCC from outside.

Processing the entire SCC together has one more advantage: we can set the range
for all of them at once, because the only thing that happens to them is that the
same value is being passed between those Phis. So, although we spend more time
analyzing a single Phi, overall we may save time by not processing other SCC
members, so amortized compile time spent should be approximately the same.
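
The idea can be sketched on a toy graph as follows. This is only an
illustration under simplifying assumptions: `Range`, `Node` and `rangeOf` are
hypothetical names, ranges are plain closed intervals whose union is taken as
the convex hull, and nothing here reflects the actual ScalarEvolution code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Closed interval [lo, hi]; a toy stand-in for a constant range.
struct Range {
  int64_t lo, hi;
};

// Union approximated by the convex hull of both intervals.
Range unionOf(Range a, Range b) {
  return {std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// A node is either a leaf with a known range (no inputs) or a phi whose
// inputs are indices of other nodes.
struct Node {
  std::vector<int> inputs;
  Range known;
};

// Marks every node reachable from `start` by following `edges`.
static void reach(const std::vector<std::vector<int>> &edges, int start,
                  std::vector<bool> &seen) {
  std::vector<int> stack{start};
  seen[start] = true;
  while (!stack.empty()) {
    int n = stack.back();
    stack.pop_back();
    for (int m : edges[n])
      if (!seen[m]) {
        seen[m] = true;
        stack.push_back(m);
      }
  }
}

// Range of `root`: the union of all inputs entering root's SCC from outside.
Range rangeOf(const std::vector<Node> &nodes, int root) {
  if (nodes[root].inputs.empty())
    return nodes[root].known;
  const std::size_t n = nodes.size();
  std::vector<std::vector<int>> fwd(n), bwd(n);
  for (std::size_t i = 0; i < n; ++i)
    for (int in : nodes[i].inputs) {
      fwd[i].push_back(in);
      bwd[in].push_back(static_cast<int>(i));
    }
  // SCC of root = nodes reachable from root that can also reach root.
  std::vector<bool> fromRoot(n, false), toRoot(n, false);
  reach(fwd, root, fromRoot);
  reach(bwd, root, toRoot);

  bool haveAny = false;
  Range result{0, 0}; // placeholder for a degenerate cycle without inputs
  for (std::size_t i = 0; i < n; ++i) {
    if (!(fromRoot[i] && toRoot[i]))
      continue; // node is not a member of the SCC
    for (int in : nodes[i].inputs) {
      if (fromRoot[in] && toRoot[in])
        continue; // edge stays inside the SCC
      Range r = rangeOf(nodes, in); // input lies in a later SCC; terminates
      result = haveAny ? unionOf(result, r) : r;
      haveAny = true;
    }
  }
  return result;
}
```

For two phis that feed each other and take leaves [0, 10] and [20, 30] from
outside the cycle, both phis get the same range in this sketch.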

Differential Revision: https://reviews.llvm.org/D110620
Reviewed By: reames

2 years ago[OpenCL] Guard 64-bit atomic types
Sven van Haastregt [Thu, 17 Feb 2022 10:58:52 +0000 (10:58 +0000)]
[OpenCL] Guard 64-bit atomic types

Until now, overloads with a 64-bit atomic type argument were always
made available with `-fdeclare-opencl-builtins`.  Ensure these
overloads are only available when both the `cl_khr_int64_base_atomics`
and `cl_khr_int64_extended_atomics` extensions have been enabled, as
required by the OpenCL specification.

Differential Revision: https://reviews.llvm.org/D119858

2 years ago[objcopy][NFC] Add doc comments to the executeObjcopy* functions.
Alexey Lapshin [Thu, 17 Feb 2022 10:25:48 +0000 (13:25 +0300)]
[objcopy][NFC] Add doc comments to the executeObjcopy* functions.

Add doc comments to the executeObjcopy* functions.

Depends on D88827

2 years ago[SchedModels][CortexA55] Add ASIMD integer instructions
Pavel Kosov [Thu, 17 Feb 2022 10:41:57 +0000 (13:41 +0300)]
[SchedModels][CortexA55] Add ASIMD integer instructions

Depends on D114642

Original review https://reviews.llvm.org/D112201

OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D117003

2 years ago[AArch64][SchedModels] Handle virtual registers in FP/NEON predicates
Pavel Kosov [Thu, 17 Feb 2022 10:41:05 +0000 (13:41 +0300)]
[AArch64][SchedModels] Handle virtual registers in FP/NEON predicates

The current implementation of the Check[HSDQ]Form predicates doesn’t handle virtual registers and therefore isn’t useful for pre-RA scheduling. This patch fixes that by implementing two function predicates: CheckQForm, which checks that an instruction writes a 128-bit NEON register, and CheckFpOrNEON, which checks that an instruction writes an FP register (any width). The latter supersedes the Check[HSD]Form predicates, which are not used individually.

OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D114642

2 years ago[CodeGen] Rename deprecated Address constructor
Nikita Popov [Wed, 16 Feb 2022 15:38:11 +0000 (16:38 +0100)]
[CodeGen] Rename deprecated Address constructor

To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().

While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.