platform/upstream/llvm.git
2 years ago[InstCombine] limit icmp fold with sub if other sub user is a phi
Sanjay Patel [Sat, 2 Apr 2022 23:23:42 +0000 (19:23 -0400)]
[InstCombine] limit icmp fold with sub if other sub user is a phi

This is a hacky fix for:
https://github.com/llvm/llvm-project/issues/54558

As discussed there, codegen regressed when we opened up this transform
to allow extra uses ( 61580d0949fd3465 ), and it's not clear how to
undo the transforms at the later stage of compilation.

As noted in the code comments, there's a set of remaining folds that
are still limited to one-use, so we can try harder to refine and
expand the limitations on these folds, but it's likely to be an
up-and-down battle as we find and overcome similar regressions.

Differential Revision: https://reviews.llvm.org/D122909

2 years ago[InstCombine] fold fcmp with lossy casted constant (2nd try)
Sanjay Patel [Sat, 2 Apr 2022 22:52:18 +0000 (18:52 -0400)]
[InstCombine] fold fcmp with lossy casted constant (2nd try)

This is a retry of 9397bdc67eb2 - that was reverted until
we had a clang warning in place to alert users about a
possible mistake in source. The warning was added with
ab982eace6e4.

This is noted as a missing clang warning in #54222,
but it is also a missing optimization opportunity.

Alive2 proofs:
https://alive2.llvm.org/ce/z/Q8drDq
https://alive2.llvm.org/ce/z/pE6LRt

I don't see a single conversion for all predicates
using "getFCmpCode" logic, so other predicates are
left as a TODO item.
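
As a rough source-level illustration (not taken from the patch), the sketch below shows why an equality compare of an extended float against a double constant that has no exact float representation can never be true. The helper names are hypothetical, and only finite in-range constants are assumed.

```cpp
#include <cassert>

// Hypothetical helper: true if some float value extends to exactly C.
// Assumes C is finite and within float range.
bool isExactlyRepresentableAsFloat(double C) {
  return static_cast<double>(static_cast<float>(C)) == C;
}

// The comparison as typically written in source: the float operand is
// extended to double before the compare.
bool equalsAfterExtension(float X, double C) {
  return static_cast<double>(X) == C;
}

// 0.1 is not exactly representable as float, so no float extends to it;
// 0.5 is a power of two and round-trips exactly.
void checkLossyConstantExamples() {
  assert(!isExactlyRepresentableAsFloat(0.1));
  assert(isExactlyRepresentableAsFloat(0.5));
  assert(!equalsAfterExtension(0.1f, 0.1));
  assert(equalsAfterExtension(0.5f, 0.5));
}
```

This is the kind of source pattern the clang warning mentioned above is meant to flag.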

2 years ago[MLIR][Presburger] Use PresburgerSpace in SetCoalescer
Groverkss [Sat, 2 Apr 2022 22:36:11 +0000 (04:06 +0530)]
[MLIR][Presburger] Use PresburgerSpace in SetCoalescer

This patch changes the implementation of SetCoalescer to use PresburgerSpace
instead of reimplementing parts of PresburgerSpace.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D122984

2 years ago[InstCombine] Fold `(X | C2) ^ C1 --> (X & ~C2) ^ (C1^C2)`
Roman Lebedev [Sat, 2 Apr 2022 20:54:33 +0000 (23:54 +0300)]
[InstCombine] Fold `(X | C2) ^ C1 --> (X & ~C2) ^ (C1^C2)`

These two are equivalent,
and i *think* the `and` form is more-ish canonical.

General proof: https://alive2.llvm.org/ce/z/RrF5s6

If constant on the (outer) `xor` is an `undef`,
the whole lane is dead: https://alive2.llvm.org/ce/z/mu4Sh2

However, if the constant on the (inner) `or` is an `undef`,
we must sanitize it first: https://alive2.llvm.org/ce/z/MHYJL7
I guess, producing a zero `and`-mask is optimal in that case.

alive-tv is happy about the entirety of `xor-of-or.ll`.
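
For readers without Alive2 at hand, the fold can also be sanity-checked by brute force on 8-bit values. This is only a sketch with hypothetical helper names; it covers plain integers and says nothing about the `undef` cases discussed above.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (X | C2) ^ C1
uint8_t orThenXor(uint8_t X, uint8_t C2, uint8_t C1) {
  return static_cast<uint8_t>((X | C2) ^ C1);
}

// Folded form: (X & ~C2) ^ (C1 ^ C2)
uint8_t andThenXor(uint8_t X, uint8_t C2, uint8_t C1) {
  return static_cast<uint8_t>((X & static_cast<uint8_t>(~C2)) ^ (C1 ^ C2));
}

// Exhaustively compares both forms for every 8-bit X, C2 and C1
// (2^24 combinations).
bool foldHoldsForAllI8() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C2 = 0; C2 < 256; ++C2)
      for (unsigned C1 = 0; C1 < 256; ++C1)
        if (orThenXor(X, C2, C1) != andThenXor(X, C2, C1))
          return false;
  return true;
}
```

The intuition is that `X | C2` equals `(X & ~C2) ^ C2`, because the two operands of that `xor` share no set bits.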

2 years ago[NFC][InstCombine] Autogenerate check lines in a test affected by the future change
Roman Lebedev [Sat, 2 Apr 2022 21:03:15 +0000 (00:03 +0300)]
[NFC][InstCombine] Autogenerate check lines in a test affected by the future change

2 years ago[NFC][InstCombine] Add some tests for `(X | C2) ^ C1` pattern
Roman Lebedev [Sat, 2 Apr 2022 20:04:44 +0000 (23:04 +0300)]
[NFC][InstCombine] Add some tests for `(X | C2) ^ C1` pattern

2 years ago[Support] [BLAKE3] Fix compilation with CMAKE_OSX_ARCHITECTURES
Martin Storsjö [Fri, 1 Apr 2022 08:50:25 +0000 (11:50 +0300)]
[Support] [BLAKE3] Fix compilation with CMAKE_OSX_ARCHITECTURES

With CMake, one can build for multiple macOS architectures
at the same time by setting CMAKE_OSX_ARCHITECTURES to multiple
architectures (avoiding needing to do two separate builds and
gluing the binaries together after the build).

In this case, while targeting x86_64 and arm64, neither IS_X64
nor IS_ARM64 is set, while compilation of the individual source
files will hit those cases (in either architecture mode).

Therefore, if we on the CMake level decide not to include the
architecture specific SIMD implementation files, also tell the
source this explicitly by passing the defines indicating that we
don't expect to use them.

Such a build clearly is less ideal than explicitly targeting one
architecture at a time if it won't include all the SIMD optimizations,
but that's a tradeoff that is up to the one deciding to do such a
universal build.

This also fixes builds for i386. The blake3 source code automatically
enables the SIMD implementations when building for i386, but we don't
provide the sources for that build configuration.

Differential Revision: https://reviews.llvm.org/D122884

2 years ago[Support] [BLAKE3] Remove .hidden directives from windows-gnu assembly sources
Martin Storsjö [Fri, 1 Apr 2022 11:29:54 +0000 (14:29 +0300)]
[Support] [BLAKE3] Remove .hidden directives from windows-gnu assembly sources

COFF symbols don't have anything corresponding to a `.hidden` flag;
both GNU binutils as and LLVM's built-in assembler error out on
these directives.

This reverts one part of
7f05aa2d4c36d6d53f97ac3e0db30ec600abbc62, fixing builds for
mingw x86_64.

Differential Revision: https://reviews.llvm.org/D122893

2 years ago[VPlan] Set VPlan header block name to vector.body.
Florian Hahn [Sat, 2 Apr 2022 18:33:58 +0000 (19:33 +0100)]
[VPlan] Set VPlan header block name to vector.body.

This brings the VPlan block naming in line with the naming of the
generated basic blocks.

2 years ago[MLIR][Presburger][NFC] Rename getCompatibleSpace to getSpaceWithoutLocals
Groverkss [Sat, 2 Apr 2022 18:09:57 +0000 (23:39 +0530)]
[MLIR][Presburger][NFC] Rename getCompatibleSpace to getSpaceWithoutLocals

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D122973

2 years ago[trace][intel pt] Handle better tsc in the decoder
Walter Erquinigo [Fri, 1 Apr 2022 04:13:03 +0000 (21:13 -0700)]
[trace][intel pt] Handle better tsc in the decoder

A problem that I introduced in the decoder is that I was considering TSC decoding
errors as actual instruction errors, which means that the trace has a gap. This is
wrong because a TSC decoding error doesn't mean that there's a gap in the trace.
Instead, now I'm just counting how many of these errors happened and I'm using
the `dump info` command to check for this number.

Besides that, I refactored the decoder a little bit to make it simpler, more
readable, and to handle TSCs in a cleaner way.

Differential Revision: https://reviews.llvm.org/D122867

2 years agoRevert "[InstSimplify][NFC] Add baseline tests for folds of icmp with ctpop"
Hirochika Matsumoto [Sat, 2 Apr 2022 17:27:59 +0000 (02:27 +0900)]
Revert "[InstSimplify][NFC] Add baseline tests for folds of icmp with ctpop"

This reverts commit b48abeea44ac3c7860b13b863210116e8db1d978.

Accidentally added already optimized tests, not baseline tests.

2 years ago[InstSimplify][NFC] Add baseline tests for folds of icmp with ctpop
Hirochika Matsumoto [Sat, 2 Apr 2022 17:14:51 +0000 (02:14 +0900)]
[InstSimplify][NFC] Add baseline tests for folds of icmp with ctpop

Extracted from: https://reviews.llvm.org/D122757

2 years ago[ConstraintElimination] Move logic to build worklist to helper (NFC).
Florian Hahn [Sat, 2 Apr 2022 15:55:04 +0000 (16:55 +0100)]
[ConstraintElimination] Move logic to build worklist to helper (NFC).

This refactor makes it easier to extend the logic to collect information
from blocks in the future, without even further increasing the size of
eliminateConstraints.

2 years ago[Driver][AArch64] Split up aarch64-cpus.c tests further
tyb0807 [Mon, 7 Mar 2022 10:17:00 +0000 (10:17 +0000)]
[Driver][AArch64] Split up aarch64-cpus.c tests further

This is the continuation of https://reviews.llvm.org/D120875. Now
aarch64-cpus-[12].c are further split and renamed to better reflect
the tests.

Differential Revision: https://reviews.llvm.org/D121093

2 years ago[AArch64] Avoid scanning feature list for target parsing
tyb0807 [Thu, 3 Mar 2022 02:12:00 +0000 (02:12 +0000)]
[AArch64] Avoid scanning feature list for target parsing

As discussed in https://reviews.llvm.org/D120111, this patch proposes an
alternative implementation to avoid scanning feature list for
architecture version over and over again. The insertion position for
default extensions is also captured during this single scan of the
feature list.

Differential Revision: https://reviews.llvm.org/D120864

2 years ago[AArch64] Default HBC/MOPS features in clang
tyb0807 [Tue, 1 Feb 2022 13:37:43 +0000 (13:37 +0000)]
[AArch64] Default HBC/MOPS features in clang

This implements minimum support in clang for default HBC/MOPS features
on v8.8-a/v9.3-a or later architectures.

Differential Revision: https://reviews.llvm.org/D120111

2 years agoRevert "[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain"
Ron Lieberman [Sat, 2 Apr 2022 13:25:50 +0000 (13:25 +0000)]
Revert "[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain"

This reverts commit cc2139524f77248c7e147d4cc3befb31fe3e6daa.

failed a few buildbots

2 years ago[MLIR][Presburger] Make constructors from PresburgerSpace explicit
Groverkss [Sat, 2 Apr 2022 12:44:38 +0000 (18:14 +0530)]
[MLIR][Presburger] Make constructors from PresburgerSpace explicit

This patch makes constructors of IntegerRelation, IntegerPolyhedron,
PresburgerRelation, PresburgerSet from PresburgerSpace explicit. This
prevents bugs like:

```
void fun(IntegerRelation a, IntegerRelation b) {
  IntegerPolyhedron c = a.intersect(b);
}
```

Here, `a.intersect(b)` will return `IntegerRelation`, which will be implicitly
converted to `PresburgerSpace` and will use the `PresburgerSpace` constructor
for IntegerPolyhedron, leading to loss of any constraints in the intersection
of `a` and `b`. After this patch, this will give a compile error.
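
The effect of `explicit` here can be reproduced with a small stand-alone sketch. All types below are hypothetical stand-ins, not the MLIR classes, and the derived-to-base relationship between the relation and the space is an assumption made for the illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the space and relation classes.
struct Space {
  unsigned NumDims = 0;
};

struct Relation : Space {
  unsigned NumConstraints = 0;
};

// Non-explicit constructor: any Relation converts silently via its Space base.
struct LoosePolyhedron {
  LoosePolyhedron(const Space &S) : NumDims(S.NumDims) {}
  unsigned NumDims;
  unsigned NumConstraints = 0;
};

// Explicit constructor: the same conversion must be spelled out.
struct StrictPolyhedron {
  explicit StrictPolyhedron(const Space &S) : NumDims(S.NumDims) {}
  unsigned NumDims;
  unsigned NumConstraints = 0;
};

static_assert(std::is_convertible<Relation, LoosePolyhedron>::value,
              "implicit conversion compiles with the non-explicit constructor");
static_assert(!std::is_convertible<Relation, StrictPolyhedron>::value,
              "implicit conversion is rejected with the explicit constructor");

// Shows the silent loss: the constraints of R do not reach the result.
unsigned constraintsSurvivingImplicitConversion(const Relation &R) {
  LoosePolyhedron P = R; // compiles, but only the space is copied
  return P.NumConstraints;
}
```

With `StrictPolyhedron`, the line `StrictPolyhedron P = R;` would fail to compile, which is the behavior the patch describes.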

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D122972

2 years ago[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain
Ron Lieberman [Sat, 2 Apr 2022 11:01:04 +0000 (11:01 +0000)]
[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain

authored by amit.pandey@amd.com  ampandey-AMD

Differential Revision: https://reviews.llvm.org/D122781

2 years ago[MLIR][Presburger] LexSimplex: support is{Redundant,Separate}Inequality
Arjun P [Fri, 1 Apr 2022 17:46:52 +0000 (18:46 +0100)]
[MLIR][Presburger] LexSimplex: support is{Redundant,Separate}Inequality

Add integer-exact checks for inequalities being separate and redundant in LexSimplex.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D122921

2 years ago[MLIR][Presburger] Make the SimplexBase constructor protected
Arjun P [Thu, 31 Mar 2022 12:19:37 +0000 (13:19 +0100)]
[MLIR][Presburger] Make the SimplexBase constructor protected

This is not supposed to be instantiated directly anyway.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D122923

2 years ago[LoongArch] Fix instruction definition
wanglei [Sat, 2 Apr 2022 10:07:22 +0000 (18:07 +0800)]
[LoongArch] Fix instruction definition

This patch fixes issue with the LU32I_D instruction, which did not have
an input register operand.

Differential Revision: https://reviews.llvm.org/D122970

2 years agoRemove duplicate code from wouldInstructionBeTriviallyDead
Serge Pavlov [Sat, 2 Apr 2022 06:30:20 +0000 (13:30 +0700)]
Remove duplicate code from wouldInstructionBeTriviallyDead

There is a similar check a few lines above in this function.

2 years ago[lld][COFF] Fix TypeServerSource lookup on GUID collisions
Tobias Hieta [Thu, 24 Mar 2022 15:07:39 +0000 (16:07 +0100)]
[lld][COFF] Fix TypeServerSource lookup on GUID collisions

Microsoft shipped a bunch of PDB files with broken/invalid GUIDs,
which led lld to use 0xFF as the key for these files in an internal
cache. When multiple files have this key, it leads to collisions
and confused symbol lookup.

Several approaches to fixing this were considered, including making the key
the path to the PDB file, but this requires some filesystem operations
in order to normalize the file path.

Since this only happens with malformed PDB files, and we hadn't
seen this before the malformed files were shipped with Visual
Studio, we probably shouldn't optimize for this use case.

Instead we now just don't insert files with Guid == 0xFF into the
cache map and warn if we get collisions so similar problems can be
found in the future instead of being silent.

Discussion about the root issue and the approach to this fix can be found on Github: https://github.com/llvm/llvm-project/issues/54487

Reviewed By: aganea

Differential Revision: https://reviews.llvm.org/D122372

2 years ago[mlir] Allow for using OpPassManager in pass options
River Riddle [Fri, 1 Apr 2022 05:34:00 +0000 (22:34 -0700)]
[mlir] Allow for using OpPassManager in pass options

This significantly simplifies the boilerplate necessary for passes
to define nested pass pipelines.

Differential Revision: https://reviews.llvm.org/D122880

2 years ago[mlir:PassOption] Rework ListOption parsing and add support for std::vector/SmallVect...
River Riddle [Fri, 1 Apr 2022 07:55:35 +0000 (00:55 -0700)]
[mlir:PassOption] Rework ListOption parsing and add support for std::vector/SmallVector options

ListOption currently uses llvm::cl::list under the hood, but the usages
of ListOption are generally a tad different from llvm::cl::list. This
commit codifies this by making ListOption implicitly comma separated,
and removes the explicit flag set for all of the current list options.
The new parsing for comma separation of ListOption also adds in support
for skipping over delimited sub-ranges (i.e. {}, [], (), "", ''). This
more easily supports nested options that use those as part of the
format, and this constraint (balanced delimiters) is already codified
in the syntax of pass pipelines.
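
The comma splitting described above can be sketched roughly as follows. This is not the MLIR implementation; `splitTopLevelCommas` is a hypothetical helper that assumes balanced delimiters and no escape sequences inside quotes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits Input on commas, but ignores commas nested inside {}, [], (),
// or inside single/double quotes.
std::vector<std::string> splitTopLevelCommas(const std::string &Input) {
  std::vector<std::string> Parts;
  std::string Current;
  int Depth = 0;     // nesting depth of {}, [], ()
  char Quote = '\0'; // active quote character, or '\0' when outside quotes
  for (char C : Input) {
    if (Quote != '\0') {
      // Inside quotes everything is literal until the matching quote.
      if (C == Quote)
        Quote = '\0';
      Current += C;
      continue;
    }
    switch (C) {
    case '"':
    case '\'':
      Quote = C;
      break;
    case '{':
    case '[':
    case '(':
      ++Depth;
      break;
    case '}':
    case ']':
    case ')':
      if (Depth > 0)
        --Depth;
      break;
    case ',':
      if (Depth == 0) {
        // Top-level comma: finish the current element.
        Parts.push_back(Current);
        Current.clear();
        continue;
      }
      break;
    default:
      break;
    }
    Current += C;
  }
  if (!Current.empty() || !Parts.empty())
    Parts.push_back(Current);
  return Parts;
}

void checkSplitExamples() {
  std::vector<std::string> A = splitTopLevelCommas("a,{b,c},d");
  assert(A.size() == 3 && A[1] == "{b,c}");
}
```

For example, `a,{b,c},d` yields three elements, with the braced group kept intact as the second one.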

See https://discourse.llvm.org/t/list-of-lists-pass-option/5950 for
related discussion

Differential Revision: https://reviews.llvm.org/D122879

2 years ago[libc++] Canonicalize the ranges results and their tests
Nikolas Klauser [Thu, 31 Mar 2022 05:06:49 +0000 (07:06 +0200)]
[libc++] Canonicalize the ranges results and their tests

Reviewed By: var-const, Mordante, #libc, ldionne

Spies: ldionne, libcxx-commits

Differential Revision: https://reviews.llvm.org/D121435

2 years ago[clang][Sparc] Enable IAS on the remaining OS's
Brad Smith [Sat, 2 Apr 2022 06:18:30 +0000 (02:18 -0400)]
[clang][Sparc] Enable IAS on the remaining OS's

2 years ago[X86][AMX] enable amx cast intrinsics in FE.
Luo, Yuanke [Thu, 17 Mar 2022 14:48:33 +0000 (22:48 +0800)]
[X86][AMX] enable amx cast intrinsics in FE.

We had some discussion in D99152 and on llvm-dev and finally came up with
a solution: add AMX-specific cast intrinsics. The intrinsics are already
supported in LLVM IR. This patch replaces bitcast with the AMX cast
intrinsics when emitting code in the FE.

Differential Revision: https://reviews.llvm.org/D122567

2 years ago[libc][NFC] Do not call mmap and munmap from thread functions.
Siva Chandra Reddy [Fri, 1 Apr 2022 07:33:34 +0000 (07:33 +0000)]
[libc][NFC] Do not call mmap and munmap from thread functions.

Instead, memory is allocated and deallocated using mmap and munmap
syscalls directly.

Reviewed By: lntue, michaelrj

Differential Revision: https://reviews.llvm.org/D122876

2 years ago[cmake] Remove LLVM_USE_NEWPM option
Arthur Eubanks [Fri, 1 Apr 2022 23:40:46 +0000 (16:40 -0700)]
[cmake] Remove LLVM_USE_NEWPM option

This option tells the host clang to use the new pass manager.
Given that it's been the default for a while, this seems unnecessary.

This was added in D57068.

(this does not affect any LLVM/Clang functionality)

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D122947

2 years ago[mlir][Vector] Add constant folder for extractelement.
jacquesguan [Fri, 1 Apr 2022 09:21:12 +0000 (17:21 +0800)]
[mlir][Vector] Add constant folder for extractelement.

This revision adds constant folder for vector.extractelement.

Differential Revision: https://reviews.llvm.org/D122886

2 years ago[AIX] XFAIL tests because of no big archive writer operation support
Jake Egan [Sat, 2 Apr 2022 02:38:43 +0000 (22:38 -0400)]
[AIX] XFAIL tests because of no big archive writer operation support

Big archive writer operation is not currently supported so mark these tests XFAIL for now.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D122949

2 years ago[mlir][Vector] Add constant folder for insertelement.
jacquesguan [Wed, 30 Mar 2022 11:14:00 +0000 (19:14 +0800)]
[mlir][Vector] Add constant folder for insertelement.

This revision adds constant folder for vector.insertelement.

Differential Revision: https://reviews.llvm.org/D122721

2 years ago[RISCV] Add lowering for vp.fptoui and vp.uitofp.
Craig Topper [Sat, 2 Apr 2022 01:28:08 +0000 (18:28 -0700)]
[RISCV] Add lowering for vp.fptoui and vp.uitofp.

This is a straightforward extension of D122512 to unsigned integers.

2 years ago[lldb] Remove remaining calls to DataBufferLLVM::GetChars
Jonas Devlieghere [Sat, 2 Apr 2022 00:41:36 +0000 (17:41 -0700)]
[lldb] Remove remaining calls to DataBufferLLVM::GetChars

Update the Linux and NetBSD Host libraries for 2165c36be445 which
removed DataBufferLLVM::GetChars. These files are compiled conditionally
based on the host platform.

2 years ago[lldb] Return a DataBuffer from FileSystem::CreateDataBuffer (NFC)
Jonas Devlieghere [Thu, 31 Mar 2022 22:16:29 +0000 (15:16 -0700)]
[lldb] Return a DataBuffer from FileSystem::CreateDataBuffer (NFC)

The concrete class (DataBufferLLVM) is an implementation detail.

2 years ago[clang-format] Fix a crash in qualifier alignment
Owen Pan [Wed, 30 Mar 2022 18:50:52 +0000 (11:50 -0700)]
[clang-format] Fix a crash in qualifier alignment

Related to #54513.

2 years ago[BOLT][test] Fix AArch64 cross-platform tests
Maksim Panchenko [Fri, 1 Apr 2022 23:43:21 +0000 (16:43 -0700)]
[BOLT][test] Fix AArch64 cross-platform tests

Use target-specific flags for building AArch64 non-runnable tests.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D122520

2 years ago[debug-info] As an NFC commit, refactor EmitFuncArgumentDbgValue so that it can be...
Michael Gottesman [Fri, 1 Apr 2022 22:57:52 +0000 (15:57 -0700)]
[debug-info] As an NFC commit, refactor EmitFuncArgumentDbgValue so that it can be extended to support llvm.dbg.addr.

The reason why I am making this change is that before this commit,
EmitFuncArgumentDbgValue relied on a boolean flag IsDbgDeclare both to signal
that a DBG_VALUE should be made to be indirect /and/ that the original intrinsic
was a dbg.declare. This is no longer always true if we add support for handling
dbg.addr since we will have an indirect DBG_VALUE that is a different intrinsic
from dbg.declare.

With that in mind, in this NFC patch, we prepare for future fixes by introducing
a 3 case-enum argument to EmitFuncArgumentDbgValue that allows the caller to
explicitly specify how the argument's DBG_VALUE should be emitted. This then
allows us to turn the indirect checks into a != FuncArgumentDbgValueKind::Value
and prepare us for a future where we add support here for llvm.dbg.addr
directly.

rdar://83957028

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D122945

2 years ago[NFCI] clang-format SanitizerArgs.cpp
Mitch Phillips [Fri, 1 Apr 2022 23:26:55 +0000 (16:26 -0700)]
[NFCI] clang-format SanitizerArgs.cpp

2 years ago[mlir] Switch debugString helper to << operator
Jacques Pienaar [Fri, 1 Apr 2022 23:33:35 +0000 (16:33 -0700)]
[mlir] Switch debugString helper to << operator

Supports more cases.

2 years ago[lld/mac] Tweak a few comments
Nico Weber [Fri, 1 Apr 2022 14:13:12 +0000 (10:13 -0400)]
[lld/mac] Tweak a few comments

Addresses review feedback I had missed on https://reviews.llvm.org/D122624

No behavior change.

Differential Revision: https://reviews.llvm.org/D122904

2 years agoclang-format HostInfoBase.cpp
Adrian Prantl [Fri, 1 Apr 2022 22:55:07 +0000 (15:55 -0700)]
clang-format HostInfoBase.cpp

2 years ago[test] Mark uuid.s as unsupported on Windows
Arthur Eubanks [Fri, 1 Apr 2022 22:32:22 +0000 (15:32 -0700)]
[test] Mark uuid.s as unsupported on Windows

For systems using gnuwin32, awk does not exist.

2 years ago[libc][NFC] add outline of printf
Michael Jones [Wed, 30 Mar 2022 22:54:30 +0000 (15:54 -0700)]
[libc][NFC] add outline of printf

This patch adds the headers for printf. It contains minimal actual code,
and is more intended to be used for design review. The code is not built
yet, and may have minor errors.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D122773

2 years ago[gn build] Port f547fc89c073
LLVM GN Syncbot [Fri, 1 Apr 2022 21:24:52 +0000 (21:24 +0000)]
[gn build] Port f547fc89c073

2 years ago[clang-tidy] Add modernize-macro-to-enum check
Richard [Fri, 1 Apr 2022 19:53:00 +0000 (13:53 -0600)]
[clang-tidy] Add modernize-macro-to-enum check

[buildbot issues fixed]

This check performs basic analysis of macros and replaces them
with an anonymous unscoped enum.  Using an unscoped anonymous enum
ensures that everywhere the macro token was used previously, the
enumerator name may be safely used.

Potential macros for replacement must meet the following constraints:
- Macros must expand only to integral literal tokens.  The unary
  operators plus, minus and tilde are recognized to allow for positive,
  negative and bitwise negated integers.
- Macros must be defined on sequential source file lines, or with
  only comment lines in between macro definitions.
- Macros must all be defined in the same source file.
- Macros must not be defined within a conditional compilation block.
- Macros must not be defined adjacent to other preprocessor directives.
- Macros must not be used in preprocessor conditions

Each cluster of macros meeting the above constraints is presumed to
be a set of values suitable for replacement by an anonymous enum.
From there, a developer can give the anonymous enum a name and
continue refactoring to a scoped enum if desired.  Comments on the
same line as a macro definition or between subsequent macro definitions
are preserved in the output.  No formatting is assumed in the provided
replacements.
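
A small before/after sketch of the kind of rewrite described above. The macro names are made up for illustration, and the exact text the check emits may differ from what is shown here.

```cpp
#include <cassert>

// Before (hypothetical input): a cluster of integral-literal macros on
// consecutive lines.
//   #define RED   0xFF0000
//   #define GREEN 0x00FF00
//   #define BLUE  0x0000FF

// After: an anonymous unscoped enum with the same names and values.
enum {
  RED = 0xFF0000,
  GREEN = 0x00FF00,
  BLUE = 0x0000FF
};

// Code that used the macro names keeps compiling unchanged, because
// unscoped enumerators are visible in the enclosing scope and convert
// implicitly to int.
int channelMask(bool WantRed) { return WantRed ? RED : BLUE; }

void checkEnumUses() {
  assert(channelMask(true) == 0xFF0000);
  assert((RED | GREEN | BLUE) == 0xFFFFFF);
}
```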

The check cppcoreguidelines-macro-to-enum is an alias for this check.

Fixes #27408

Differential Revision: https://reviews.llvm.org/D117522

2 years agoSimplify ArchSpec::IsFullySpecifiedTriple() (NFC)
Adrian Prantl [Fri, 1 Apr 2022 21:15:58 +0000 (14:15 -0700)]
Simplify ArchSpec::IsFullySpecifiedTriple() (NFC)

I found this function somewhat hard to read and removed a few entirely
redundant checks and converted it to early exits.

Differential Revision: https://reviews.llvm.org/D122912

2 years ago[flang] add evaluate::IsAllocatableDesignator helper
Jean Perier [Fri, 1 Apr 2022 20:31:23 +0000 (22:31 +0200)]
[flang] add evaluate::IsAllocatableDesignator helper

Previously, some semantic checks that are checking if an entity is an
allocatable were relying on the expression being a designator whose
last symbol has the allocatable attribute.

This is wrong, since it treated substrings and array sections of
allocatables as being allocatable (see NOTE 2 in
Fortran 2018 section 9.5.3.1).

Add evaluate::IsAllocatableDesignator to correctly test this.
Also add some semantic tests for ALLOCATED to test the newly added helper.
Note that ifort and nag are rejecting coindexed-named-object in
ALLOCATED (`allocated(coarray_scalar_alloc[2])`).
I think it is wrong given allocated argument is intent(in) as per
16.2.1 point 3.
So 15.5.2.6 point 4 regarding allocatable dummy is not violated (If the actual
argument is a coindexed object, the dummy argument shall have the INTENT (IN)
attribute.) and I think this is valid. gfortran accepts it.

The need for this helper was exposed in https://reviews.llvm.org/D122779.

Differential Revision: https://reviews.llvm.org/D122899

Co-authored-by: Peixin-Qiao <qiaopeixin@huawei.com>

2 years ago[RISCV][AMDGPU][TargetLowering] Special case overflow expansion for (uaddo X, 1).
Craig Topper [Fri, 1 Apr 2022 20:14:10 +0000 (13:14 -0700)]
[RISCV][AMDGPU][TargetLowering] Special case overflow expansion for (uaddo X, 1).

If we expand (uaddo X, 1) we previously expanded the overflow calculation
as (X + 1) <u X. This potentially increases the live range of X and
can prevent X+1 from reusing the register that previously held X.

Since we're adding 1, overflow only occurs if X was UINT_MAX in which
case (X+1) would be 0. So this patch adds a special case to expand
the overflow calculation to (X+1) == 0.

This seems to help with uaddo intrinsics that get introduced by
CodeGenPrepare after LSR. Alternatively, we could block the uaddo
transform in CodeGenPrepare for this case.
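
The equivalence of the two overflow expansions can be checked exhaustively on a narrow type. This is a stand-alone sketch with hypothetical helper names, using 8-bit unsigned arithmetic in place of the DAG nodes.

```cpp
#include <cassert>
#include <cstdint>

// Generic expansion: overflow iff (X + 1) <u X.
bool overflowViaCompare(uint8_t X) {
  uint8_t Sum = static_cast<uint8_t>(X + 1);
  return Sum < X;
}

// Special-case expansion: overflow iff (X + 1) == 0.
bool overflowViaZero(uint8_t X) {
  uint8_t Sum = static_cast<uint8_t>(X + 1);
  return Sum == 0;
}

// Compares both expansions for every 8-bit value.
bool expansionsAgreeForAllI8() {
  for (unsigned X = 0; X < 256; ++X)
    if (overflowViaCompare(static_cast<uint8_t>(X)) !=
        overflowViaZero(static_cast<uint8_t>(X)))
      return false;
  return true;
}
```

The second form only reads the sum, which is why X does not need to stay live after the add.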

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122933

2 years agoFix behavior of ifuncs with 'used' extern "C" static functions
Erich Keane [Mon, 28 Mar 2022 15:57:50 +0000 (08:57 -0700)]
Fix behavior of ifuncs with 'used' extern "C" static functions

We expect `extern "C"` static functions to be usable in things like
inline assembly, as well as ifuncs.
See the bug report here: https://github.com/llvm/llvm-project/issues/54549

However, we were diagnosing this as 'not defined', because the
ifunc's attempt to look up its resolver would generate a declared IR
function.

Additionally, as background, the way we allow these static extern "C"
functions to work in inline assembly is by making an alias with the C
mangling in MOST situations to the version we emit with
internal-linkage/mangling.

The problem here was multi-fold: First, we generated the alias after the
ifunc was checked, so the function by that name didn't exist yet.
Second, the ifunc's generation caused a symbol to exist under the name
of the alias already (the declared function above), which suppressed the
alias generation.

This patch fixes all of this by moving the checking of ifuncs/CFE aliases
until AFTER we have generated the extern-C alias.  Then, it does a
'fixup' around the GlobalIFunc to make sure we correct the reference.

Differential Revision: https://reviews.llvm.org/D122608

2 years ago[RISCV] Add tests for uaddo with a constant 1. NFC
Craig Topper [Fri, 1 Apr 2022 19:23:42 +0000 (12:23 -0700)]
[RISCV] Add tests for uaddo with a constant 1. NFC

The overflow calculation can be optimized to check if the add
result is 0.

2 years agoFIX the wildcards to pass an FP diff in mangle-nttp-anon-union.cpp
Erich Keane [Fri, 1 Apr 2022 19:22:44 +0000 (12:22 -0700)]
FIX the wildcards to pass an FP diff in mangle-nttp-anon-union.cpp

2 years agoAdd some wildcards to pass FP difference on one of the buildbots
Erich Keane [Fri, 1 Apr 2022 19:02:45 +0000 (12:02 -0700)]
Add some wildcards to pass FP difference on one of the buildbots

2 years ago[mlir][vector] Fold transpose(broadcast(<scalar>))
Lei Zhang [Fri, 1 Apr 2022 18:43:09 +0000 (14:43 -0400)]
[mlir][vector] Fold transpose(broadcast(<scalar>))

For such cases, the transpose op can be elided.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D122903

2 years ago[GH54588]Fix ItaniumMangler for NTTP unnamed unions w/ unnamed structs
Erich Keane [Thu, 31 Mar 2022 14:27:11 +0000 (07:27 -0700)]
[GH54588]Fix ItaniumMangler for NTTP unnamed unions w/ unnamed structs

As reported in https://github.com/llvm/llvm-project/issues/54588
and discussed in https://github.com/itanium-cxx-abi/cxx-abi/issues/139

We are supposed to do a DFS, pre-order, decl-order search for a name for
the union in this case. Previously we crashed because the IdentifierInfo
pointer was nullptr, so this makes sure we have a name in the cases
described by the ABI.

I added an llvm_unreachable to cover an unexpected case at the end of
the new function with information/reference to the ABI in case we come
up with some way to get back to here.

Differential Revision: https://reviews.llvm.org/D122820

2 years agoAddressed post-commit comment https://reviews.llvm.org/D122746#inline-1175831
zhijian [Fri, 1 Apr 2022 18:10:22 +0000 (14:10 -0400)]
Addressed post-commit comment https://reviews.llvm.org/D122746#inline-1175831

2 years ago[mlir][sparse] Moving `delete coo` into codegen instead of runtime library
wren romano [Wed, 30 Mar 2022 19:32:33 +0000 (12:32 -0700)]
[mlir][sparse] Moving `delete coo` into codegen instead of runtime library

Prior to this change there were a number of places where the allocation and deallocation of SparseTensorCOO objects were not cleanly paired, leading to inconsistencies regarding whether each function released its tensor/coo arguments or not, as well as making it easy to run afoul of memory leaks, use-after-free, or double-free errors.  This change cleans up the codegen vs runtime boundary to resolve those issues.  Now, the only time the runtime library frees an object is either (a) because it's a function explicitly designed to do so, or (b) because the allocated object is entirely local to the function and would be a memory leak if not released.  Thus, now the codegen takes complete responsibility for releasing any objects it caused to be allocated.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D122435

2 years ago[AArch64] add tests for funnel+or == 0; NFC
Sanjay Patel [Fri, 1 Apr 2022 16:22:10 +0000 (12:22 -0400)]
[AArch64] add tests for funnel+or == 0; NFC

These are copied from x86 ( 1074bdfb52b2e1753e51472 ) to
provide more coverage for a potential generic combine.

2 years ago[InstCombine] add tests for icmp with sub with multiple uses; NFC
Sanjay Patel [Fri, 1 Apr 2022 14:32:35 +0000 (10:32 -0400)]
[InstCombine] add tests for icmp with sub with multiple uses; NFC

Issue #54558

2 years ago[LLDB] Add require x86 for NativePdb Test.
Zequan Wu [Fri, 1 Apr 2022 17:38:57 +0000 (10:38 -0700)]
[LLDB] Add require x86 for NativePdb Test.

2 years ago[clang][dataflow] Add support for correlation of boolean (tracked) values
Yitzhak Mandelbaum [Mon, 28 Mar 2022 15:38:17 +0000 (15:38 +0000)]
[clang][dataflow] Add support for correlation of boolean (tracked) values

This patch extends the join logic for environments to explicitly handle
boolean values. It creates the disjunction of both source values, guarded by the
respective flow conditions from each input environment. This change allows the
framework to reason about boolean correlations across multiple branches (and
subsequent joins).

Differential Revision: https://reviews.llvm.org/D122838

2 years ago[clang][dataflow] Add support for (built-in) (in)equality operators
Yitzhak Mandelbaum [Fri, 25 Mar 2022 20:01:18 +0000 (20:01 +0000)]
[clang][dataflow] Add support for (built-in) (in)equality operators

Adds logical interpretation of built-in equality operators, `==` and `!=`.

Differential Revision: https://reviews.llvm.org/D122830

2 years ago[LLDB][NativePDB] Create inline function decls
Zequan Wu [Fri, 18 Mar 2022 22:31:19 +0000 (15:31 -0700)]
[LLDB][NativePDB] Create inline function decls

This creates inline function decls in the TUs where the functions are inlined and local variable decls inside those functions.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D121967

2 years ago[DAG] Add llvm::isMinSignedConstant helper. NFC
Simon Pilgrim [Fri, 1 Apr 2022 16:47:24 +0000 (17:47 +0100)]
[DAG] Add llvm::isMinSignedConstant helper. NFC

Pulled out of D122754

2 years agoRevert "[runtimes] Create Tests.cmake if it does not exist"
Petr Hosek [Fri, 1 Apr 2022 16:29:54 +0000 (09:29 -0700)]
Revert "[runtimes] Create Tests.cmake if it does not exist"

This reverts commit d6623d72461b5a1ed3bd3ac966d14329e5b0f851 since
it broke the build on Mac.

2 years ago[X86] matchAddressRecursively - add XOR(X, MIN_SIGNED_VALUE) handling
Simon Pilgrim [Fri, 1 Apr 2022 16:22:33 +0000 (17:22 +0100)]
[X86] matchAddressRecursively - add XOR(X, MIN_SIGNED_VALUE) handling

Allows us to fold XOR(X, MIN_SIGNED_VALUE) == ADD(X, MIN_SIGNED_VALUE) into LEA patterns

As mentioned on PR52267.
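
The identity behind this fold can be checked exhaustively on a 16-bit type. This is a stand-alone sketch with hypothetical helper names; adding the minimum signed value can only flip the top bit, and any carry out of that bit is discarded, so the result matches the xor.

```cpp
#include <cassert>
#include <cstdint>

// XOR(X, MIN_SIGNED_VALUE) for a 16-bit value.
uint16_t viaXor(uint16_t X) { return static_cast<uint16_t>(X ^ 0x8000u); }

// ADD(X, MIN_SIGNED_VALUE) with wraparound for a 16-bit value.
uint16_t viaAdd(uint16_t X) { return static_cast<uint16_t>(X + 0x8000u); }

// Compares both forms for every 16-bit value.
bool xorMatchesAddForAllI16() {
  for (unsigned X = 0; X < 65536; ++X)
    if (viaXor(static_cast<uint16_t>(X)) != viaAdd(static_cast<uint16_t>(X)))
      return false;
  return true;
}
```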

Differential Revision: https://reviews.llvm.org/D122815

2 years ago[intelpt] Refactor timestamps out of `IntelPTInstruction`
Alisamar Husain [Mon, 28 Mar 2022 18:20:28 +0000 (23:50 +0530)]
[intelpt] Refactor timestamps out of `IntelPTInstruction`
Store timestamps (TSCs) in a more efficient map at the decoded thread level to speed up TSC lookup and to reduce the amount of memory used by each decoded instruction. Also introduce a TSC range, which keeps the current timestamp valid for all subsequent instructions until the next timestamp is emitted.

Differential Revision: https://reviews.llvm.org/D122603
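
The idea of keeping one timestamp valid for a run of instructions can be sketched with an ordered map keyed by the instruction index at which each timestamp starts. This is a minimal illustration under that assumption, not the LLDB implementation; the class `TscMap` and its members are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Hypothetical sketch: one entry per emitted timestamp instead of one
// timestamp per decoded instruction.
class TscMap {
public:
  // Record that `tsc` becomes the current timestamp at instruction `index`.
  void record(std::size_t index, std::uint64_t tsc) { starts_[index] = tsc; }

  // Return the timestamp in effect at instruction `index`, or nullopt if no
  // timestamp had been emitted yet at that point.
  std::optional<std::uint64_t> lookup(std::size_t index) const {
    auto it = starts_.upper_bound(index); // first range starting after index
    if (it == starts_.begin())
      return std::nullopt;
    return std::prev(it)->second; // last range starting at or before index
  }

private:
  std::map<std::size_t, std::uint64_t> starts_;
};
```

With timestamps recorded at instructions 10 and 20, a lookup for instruction 19 yields the first timestamp and a lookup for instruction 25 yields the second.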

2 years ago[mlir][vector] Handle scalars in extract_strided_slice(broadcast)
Lei Zhang [Fri, 1 Apr 2022 16:07:25 +0000 (12:07 -0400)]
[mlir][vector] Handle scalars in extract_strided_slice(broadcast)

For such cases we cannot generate extract_strided_slice ops.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122902

2 years ago[mlir][spirv] Add pattern to lower math.copysign
Lei Zhang [Fri, 1 Apr 2022 16:02:43 +0000 (12:02 -0400)]
[mlir][spirv] Add pattern to lower math.copysign

This follows the logic:
https://git.musl-libc.org/cgit/musl/tree/src/math/copysignf.c

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122910
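
The logic referenced above amounts to clearing the sign bit of the magnitude operand and copying in the sign bit of the other operand. The following is a rough C++ rendering of that bit manipulation, not the SPIR-V lowering itself; it assumes `float` is a 32-bit IEEE-754 value, and `copysignBits` is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Returns a value with the magnitude of `magnitude` and the sign of `sign`.
float copysignBits(float magnitude, float sign) {
  std::uint32_t m, s;
  std::memcpy(&m, &magnitude, sizeof m); // reinterpret the bits safely
  std::memcpy(&s, &sign, sizeof s);
  m &= 0x7fffffffu;     // keep exponent and mantissa, clear the sign bit
  m |= s & 0x80000000u; // take only the sign bit from the second operand
  float result;
  std::memcpy(&result, &m, sizeof result);
  return result;
}
```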

2 years ago[X86] Fold AND(SRL(X,Y),1) -> SETCC(BT(X,Y)) (RECOMMITTED)
Simon Pilgrim [Fri, 1 Apr 2022 15:58:45 +0000 (16:58 +0100)]
[X86] Fold AND(SRL(X,Y),1) -> SETCC(BT(X,Y)) (RECOMMITTED)

As noticed on PR39174, if we're extracting a single non-constant bit index, then try to use BT+SETCC instead to avoid messing around moving the shift amount to the ECX register, using slow x86 shift ops etc.

Recommitted with a fix to ensure we zext/trunc the SETCC result to the original type.

Differential Revision: https://reviews.llvm.org/D122891
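
For reference, the source-level shape that produces an AND(SRL(X,Y),1) node looks like the function below. This is only an illustration of the input pattern; `extractBit` is a hypothetical name, and which instructions a given compiler version emits for it is not asserted here.

```cpp
#include <cstdint>

// Extracts a single bit at a non-constant index: shift right, then mask
// with 1. Assumes index < 32; larger shift amounts are undefined in C++.
constexpr std::uint32_t extractBit(std::uint32_t value, unsigned index) {
  return (value >> index) & 1u;
}
```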

2 years agosanitizer_common: add Mutex::TryLock
Dmitry Vyukov [Fri, 1 Apr 2022 14:30:55 +0000 (16:30 +0200)]
sanitizer_common: add Mutex::TryLock

Will be used in future changes.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D122905

2 years agosanitizer_common: expose max_address from LoadedModule
Dmitry Vyukov [Fri, 1 Apr 2022 14:35:33 +0000 (16:35 +0200)]
sanitizer_common: expose max_address from LoadedModule

Currently LoadedModule provides max_executable_address.
Replace it with just max_address.
It's only used for printing for human inspection and since
modules are non-overlapping, max_address is as good as max_executable_address
for matching addresses/PCs against modules (I assume it's used for that).
On the other hand, max_address is more general and can be used to match e.g. data addresses.
I want to use it for that purpose in future changes.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D122906
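
The matching described above can be pictured as a range check over non-overlapping modules. The sketch below is not the sanitizer_common code: `ModuleRange`, its fields and `findModule` are hypothetical, and treating the upper bound as exclusive is an assumption made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a loaded module's address range.
struct ModuleRange {
  std::uintptr_t base;       // lowest address covered by the module
  std::uintptr_t maxAddress; // upper bound, assumed exclusive in this sketch
};

// Returns the module containing `addr`, or nullptr if none does. Because the
// ranges do not overlap, at most one module can match, whether `addr` is a
// code address or a data address.
const ModuleRange *findModule(const std::vector<ModuleRange> &modules,
                              std::uintptr_t addr) {
  for (const ModuleRange &m : modules)
    if (addr >= m.base && addr < m.maxAddress)
      return &m;
  return nullptr;
}
```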

2 years ago[AIX][XCOFF] print unsupported message for llvm-ar big archive write operation
zhijian [Fri, 1 Apr 2022 15:55:11 +0000 (11:55 -0400)]
[AIX][XCOFF] print unsupported message for llvm-ar big archive write operation
Summary:

When "llvm-ar cr" is run on AIX, it creates a GNU archive, which is not desirable on AIX.
Instead of creating a GNU archive, this patch prints an "unsupported" message for the llvm-ar big archive write operation on AIX.

After the big archive write operation is implemented, I will revert the "XFAIL: AIX" and "--format=gnu" test case changes in this patch.

Reviewers: James Henderson, Jinsong Ji
Differential Revision: https://reviews.llvm.org/D122746

2 years agoRecommit "[LV] Remove unneeded createHeaderBranch.(NFCI)"
Florian Hahn [Fri, 1 Apr 2022 15:53:39 +0000 (16:53 +0100)]
Recommit "[LV] Remove unneeded createHeaderBranch.(NFCI)"

This reverts commit 14e3650f01d158f7e4117c353927a07ceebdd504.

The issues causing the revert were fixed independently in
a08c90a4023f and 14e5f9785c9c.

2 years agoRevert rGa5f637bcbb7d1e08ce637f113fc117c3f4b2b110 "[X86] Fold AND(SRL(X,Y),1) ->...
Simon Pilgrim [Fri, 1 Apr 2022 15:48:24 +0000 (16:48 +0100)]
Revert rGa5f637bcbb7d1e08ce637f113fc117c3f4b2b110 "[X86] Fold AND(SRL(X,Y),1) -> SETCC(BT(X,Y))"

Investigating a sanitizer-windows buildbot breakage

2 years ago[X86] lowerShuffleAsRepeatedMaskAndLanePermute - allow 64-bit sublane shuffling on...
Simon Pilgrim [Fri, 1 Apr 2022 15:40:10 +0000 (16:40 +0100)]
[X86] lowerShuffleAsRepeatedMaskAndLanePermute - allow 64-bit sublane shuffling on AVX512BW v64i8 shuffles

We were only performing this on 256-bit vectors on AVX2 targets

Noticed while triaging Issue #54658

2 years ago[X86] Add PR54658 test case
Simon Pilgrim [Fri, 1 Apr 2022 15:21:54 +0000 (16:21 +0100)]
[X86] Add PR54658 test case

2 years ago[X86] Fold AND(SRL(X,Y),1) -> SETCC(BT(X,Y))
Simon Pilgrim [Fri, 1 Apr 2022 14:57:29 +0000 (15:57 +0100)]
[X86] Fold AND(SRL(X,Y),1) -> SETCC(BT(X,Y))

As noticed on PR39174, if we're extracting a single non-constant bit index, then try to use BT+SETCC instead to avoid messing around moving the shift amount to the ECX register, using slow x86 shift ops etc.

Differential Revision: https://reviews.llvm.org/D122891

2 years ago[clang][dataflow] Fix handling of base-class fields.
Yitzhak Mandelbaum [Wed, 23 Mar 2022 00:01:55 +0000 (00:01 +0000)]
[clang][dataflow] Fix handling of base-class fields.

Currently, the framework does not track derived class access to base
fields. This patch adds that support and a corresponding test.

Differential Revision: https://reviews.llvm.org/D122273
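
As an illustration of the kind of code this concerns, the snippet below shows a derived class reading a field declared in its base. The type and member names are hypothetical and are not taken from the patch's test.

```cpp
// A field declared in the base class...
struct Base {
  int baseField = 0;
};

// ...and accessed through the derived class. An analysis that models only
// the fields declared directly in Derived would have no storage location
// for `baseField` here.
struct Derived : Base {
  int readBaseField() const { return baseField; }
};
```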

2 years ago[InstCombine] Add additional tests for strlen/strnlen (NFC)
Martin Sebor [Fri, 1 Apr 2022 14:58:38 +0000 (16:58 +0200)]
[InstCombine] Add additional tests for strlen/strnlen (NFC)

Taken from D122686.

2 years agoAdd prototypes to functions which need them; NFC
Aaron Ballman [Fri, 1 Apr 2022 14:31:59 +0000 (10:31 -0400)]
Add prototypes to functions which need them; NFC

2 years ago[MemCpyOpt] Add test for PR54682 (NFC)
Nikita Popov [Fri, 1 Apr 2022 14:31:08 +0000 (16:31 +0200)]
[MemCpyOpt] Add test for PR54682 (NFC)

2 years ago[LV] Add SCEV workaround from 80e8025 to epilogue vector code path.
Florian Hahn [Fri, 1 Apr 2022 14:14:47 +0000 (15:14 +0100)]
[LV] Add SCEV workaround from 80e8025 to epilogue vector code path.

This was exposed by 14e3650f. The recommit of 14e3650f will hit the
problematic code path requiring the workaround.
test case that crashes without the workaround.

2 years ago[OpenMP] Make linker wrapper thin-lto default thread count use all
Joseph Huber [Fri, 1 Apr 2022 13:42:21 +0000 (09:42 -0400)]
[OpenMP] Make linker wrapper thin-lto default thread count use all

Summary:
Currently there is no option to configure the number of thin-backend
threads to use when performing thin-lto on the device, but we should
default to use all the threads rather than just one. In the future we
should use the same arguments that gold / lld use and parse it here.
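
A default of "all threads" can be sketched in standard C++ as follows. This is an assumption-laden illustration rather than the linker wrapper's code: `defaultThinBackendThreads` is a hypothetical name, and the real tool may query the thread count through a different API.

```cpp
#include <thread>

// Picks a default backend thread count: every hardware thread the standard
// library reports, falling back to one when the count is unknown.
unsigned defaultThinBackendThreads() {
  unsigned n = std::thread::hardware_concurrency(); // may return 0 if unknown
  return n == 0 ? 1u : n;
}
```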

2 years ago[mlir][tensor][bufferize] Support 0-d collapse_shape with offset
Matthias Springer [Fri, 1 Apr 2022 13:07:00 +0000 (22:07 +0900)]
[mlir][tensor][bufferize] Support 0-d collapse_shape with offset

Differential Revision: https://reviews.llvm.org/D122901

2 years ago[x86] add tests for funnel+or == 0; NFC
Sanjay Patel [Thu, 31 Mar 2022 19:10:47 +0000 (15:10 -0400)]
[x86] add tests for funnel+or == 0; NFC

This is another family of patterns based on issue #49541

2 years agofix bazel build after 369337e3c2
Mikhail Goncharov [Fri, 1 Apr 2022 13:27:12 +0000 (15:27 +0200)]
fix bazel build after 369337e3c2

2 years ago[clangd] Record IO precentage for first preamble build of the instance
Kadir Cetinkaya [Fri, 1 Apr 2022 11:53:16 +0000 (13:53 +0200)]
[clangd] Record IO precentage for first preamble build of the instance

Differential Revision: https://reviews.llvm.org/D122894

2 years ago[demangler] Fix node matcher test
Nathan Sidwell [Fri, 1 Apr 2022 12:47:35 +0000 (05:47 -0700)]
[demangler] Fix node matcher test

Move node matcher compilation test to non-anonymous namespace and
avoid using attribute.

2 years ago[AMDGPU] Only count global-to-global as indirect accesses
Jay Foad [Thu, 31 Mar 2022 12:39:02 +0000 (13:39 +0100)]
[AMDGPU] Only count global-to-global as indirect accesses

Previously any load (global, local or constant) feeding into a
global load or store would be counted as an indirect access. This
patch only counts global loads feeding into a global load or store.
The rationale is that the latency for global loads is generally
much larger than the other kinds.

As a side effect this makes it easier to write small kernel test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.

Differential Revision: https://reviews.llvm.org/D122804

2 years ago[LV] Re-use TripCount from EPI.TripCount.
Florian Hahn [Fri, 1 Apr 2022 12:47:34 +0000 (13:47 +0100)]
[LV] Re-use TripCount from EPI.TripCount.

During skeleton construction for the epilogue vector loop, generic
helpers use getOrCreateTripCount, which will re-expand the trip count
computation. Instead, re-use the TripCount created during main loop
vectorization.

2 years ago[demangler] Fix node matchers
Nathan Sidwell [Tue, 29 Mar 2022 13:19:18 +0000 (06:19 -0700)]
[demangler] Fix node matchers

* Add instantiation tests to ItaniumDemangleTest, to make sure all
  match functions provide constructor arguments to the provided functor.

* Fix the Node constructors that lost const qualification on arguments.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122665

2 years ago[MLIR][Presburger][NFC] Use "disjunct" to refer to disjuncts in PresburgerRelation
Groverkss [Fri, 1 Apr 2022 12:08:27 +0000 (17:38 +0530)]
[MLIR][Presburger][NFC] Use "disjunct" to refer to disjuncts in PresburgerRelation

This patch renames "integerRelations" and "relation", where they refer to the
disjuncts in a PresburgerRelation, to "disjunct(s)".  This is done to be
consistent with the rest of the interface.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D122892

2 years ago[demangler][NFC] Use def file for node names
Nathan Sidwell [Wed, 30 Mar 2022 12:59:16 +0000 (05:59 -0700)]
[demangler][NFC] Use def file for node names

In order to add a unit test, we need to expose the node names beyond
ItaniumDemangle.h.  This breaks them out into a def file.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122739

2 years ago[analyzer][ctu] Only import const and trivial VarDecls
Gabor Marton [Fri, 1 Apr 2022 09:58:17 +0000 (11:58 +0200)]
[analyzer][ctu] Only import const and trivial VarDecls

Import the definition of an object from a foreign translation unit only if its type is const and trivial.

Differential Revision: https://reviews.llvm.org/D122805

2 years ago[AMDGPU][DOC][NFC] Added GFX1013 assembler syntax description
Dmitry Preobrazhensky [Fri, 1 Apr 2022 11:44:24 +0000 (14:44 +0300)]
[AMDGPU][DOC][NFC] Added GFX1013 assembler syntax description

2 years ago[MLIR][Presburger] subtract: fix bug when an input set has duplicate divisions
Arjun P [Fri, 1 Apr 2022 11:21:46 +0000 (12:21 +0100)]
[MLIR][Presburger] subtract: fix bug when an input set has duplicate divisions

Previously, when an input set had a duplicate division, the duplicates might
be removed by a call to mergeLocalIds, which would detect them as duplicates
for the first time. The subtraction implementation cannot handle existing
locals being removed, so this would lead to unexpected behaviour. Resolve this
by removing all the duplicates up front.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D122826