platform/upstream/llvm.git
2 years agofix warning caused by ef4ecc3ceffcf3ef129640c813f823c974f9ba22
Bardia Mahjour [Mon, 2 May 2022 21:06:00 +0000 (17:06 -0400)]
fix warning caused by ef4ecc3ceffcf3ef129640c813f823c974f9ba22

2 years ago[sanitizer] Use canonical syscalls everywhere
Evgenii Stepanov [Thu, 21 Apr 2022 22:17:29 +0000 (15:17 -0700)]
[sanitizer] Use canonical syscalls everywhere

These "new" syscalls have been added in 2.6.16, more than 16 years ago.
Surely that's enough time to migrate. Glibc 2.33 is using them on both
i386 and x86_64. Android has an selinux filter to block the legacy
syscalls in the apps.

Differential Revision: https://reviews.llvm.org/D124212

2 years ago[LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculat...
Bardia Mahjour [Mon, 2 May 2022 20:49:10 +0000 (16:49 -0400)]
[LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculating cost

Reviewed By: congzhe, etiotto

Differential Revision: https://reviews.llvm.org/D123400

2 years ago[memprof] Only insert dynamic shadow load when needed
Teresa Johnson [Mon, 2 May 2022 19:38:27 +0000 (12:38 -0700)]
[memprof] Only insert dynamic shadow load when needed

We don't need to insert a load of the dynamic shadow address unless there
are interesting memory accesses to profile.

Split out of D124703.

Differential Revision: https://reviews.llvm.org/D124797

2 years ago[NFC] Fixing error on some versions of GCC
Chris Bieneman [Mon, 2 May 2022 20:16:21 +0000 (15:16 -0500)]
[NFC] Fixing error on some versions of GCC

Some versions of GCC don't implicitly move Error to Expected.

2 years ago[flang] Fix semantics check for RETURN statement
Emil Kieri [Mon, 2 May 2022 16:17:39 +0000 (18:17 +0200)]
[flang] Fix semantics check for RETURN statement

The RETURN statement is allowed in functions and subroutines, but not
in main programs. It is however a common extension, which we also
implement, to allow RETURN from main programs -- we only issue a
portability warning when -pedantic or -std=f2018 are set.

This patch fixes false positives for this portability warning, where it
was triggered also when RETURN was present in functions or subroutines.

Fixes #55080

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D124732

2 years ago[NFC] Fix warning reported on bots
Chris Bieneman [Mon, 2 May 2022 20:02:10 +0000 (15:02 -0500)]
[NFC] Fix warning reported on bots

2 years ago[SLP][NFC]Minor code changes for better readability, NFC.
Alexey Bataev [Mon, 2 May 2022 19:57:34 +0000 (12:57 -0700)]
[SLP][NFC]Minor code changes for better readability, NFC.

2 years ago[clangd] Add inlay hints for mutable reference parameters
Tobias Ribizel [Mon, 2 May 2022 19:56:40 +0000 (15:56 -0400)]
[clangd] Add inlay hints for mutable reference parameters

Add a & prefix to all parameter inlay hints that refer to a non-const l-value reference. That makes it easier to identify them even if semantic highlighting is not used (where this is already available)

Reviewed By: nridge

Differential Revision: https://reviews.llvm.org/D124359

2 years ago[NFC] Rename `FixedLenDecoderEmitter` as `DecoderEmitter`
Sheng [Mon, 2 May 2022 19:36:07 +0000 (03:36 +0800)]
[NFC] Rename `FixedLenDecoderEmitter` as `DecoderEmitter`

Since now we are able to handle both fixed length & variable
length instructions.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D123451

2 years ago[TableGen] Add support for variable length instruction in decoder generator
Sheng [Mon, 2 May 2022 19:26:55 +0000 (03:26 +0800)]
[TableGen] Add support for variable length instruction in decoder generator

To support variable length instructions, I think of them as fixed length instructions with the "maximum length". For example, if there're three instructions with 2, 6 and 9 bytes, we can fit them into the algorithm by treating them all as 9 bytes.

Also, since we can't know the length of the instruction in advance, there is a function object with type `void(APInt &, uint64_t)` added in the parameter list of `decodeInstruction` and `fieldFromInstruction`. We can use this to supply the additional bits the decoder needs after we know the opcode of the instruction.

Finally, `InstrLenTable` is added to let the decoder know the length of the instructions.

See D120960 for its usage.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D120958
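
The "treat everything as the maximum length" idea from the TableGen commit above can be sketched outside of LLVM. This is only an illustration under stated assumptions: `instrLenFor`, `readInstruction`, `decodeAt` and the opcode values are hypothetical, `std::vector` stands in for `APInt`, and the callback only mirrors the role of the `void(APInt &, uint64_t)` function object, not its exact contract.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical encoding: the first byte alone determines the length.
// The three lengths mirror the 2/6/9-byte example in the commit message.
inline std::size_t instrLenFor(std::uint8_t Opcode) {
  switch (Opcode) {
  case 0x01: return 2;
  case 0x02: return 6;
  default:   return 9;
  }
}

constexpr std::size_t MaxInstrLen = 9;

// Callback that appends bytes to the instruction until it holds Len bytes.
using FetchMore =
    std::function<void(std::vector<std::uint8_t> &, std::size_t)>;

// Start from the opcode, ask the callback for the remaining bytes once the
// length is known, then zero-pad so the decoder always sees MaxInstrLen bytes.
inline std::vector<std::uint8_t> readInstruction(std::uint8_t Opcode,
                                                 const FetchMore &Fetch) {
  std::vector<std::uint8_t> Insn{Opcode};
  const std::size_t Len = instrLenFor(Opcode);
  assert(Len <= MaxInstrLen && "length table disagrees with maximum");
  Fetch(Insn, Len);
  Insn.resize(MaxInstrLen, 0);
  return Insn;
}

// Convenience wrapper: fetch the extra bytes from an in-memory byte stream.
inline std::vector<std::uint8_t>
decodeAt(const std::vector<std::uint8_t> &Stream, std::size_t Offset) {
  assert(Offset < Stream.size());
  return readInstruction(
      Stream[Offset], [&](std::vector<std::uint8_t> &Insn, std::size_t Len) {
        for (std::size_t I = Insn.size();
             I < Len && Offset + I < Stream.size(); ++I)
          Insn.push_back(Stream[Offset + I]);
      });
}
```

With this shape, a 6-byte instruction comes back as 9 bytes whose last three are zero, so a fixed-width matcher can run on it unchanged.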

2 years agoRevert "Fix a misuse of `cast`"
Sheng [Mon, 2 May 2022 19:32:02 +0000 (03:32 +0800)]
Revert "Fix a misuse of `cast`"

This reverts commit ba59ec2843f99f19d55d7cd9f9ac536fb038fdab.

2 years ago[memprof] Don't instrument PGO and other compiler inserted variables
Teresa Johnson [Fri, 29 Apr 2022 21:53:31 +0000 (14:53 -0700)]
[memprof] Don't instrument PGO and other compiler inserted variables

Suppress instrumentation of PGO counter accesses, which is unnecessary
and costly. Also suppress accesses to other compiler inserted variables
starting with "__llvm". This is a slightly expanded variant of what is
done for tsan in shouldInstrumentReadWriteFromAddress.

Differential Revision: https://reviews.llvm.org/D124703

2 years ago[OpenMP] Fix -Wswitch (due to new OMPC_cancellation_construct_type) after D123828
Fangrui Song [Mon, 2 May 2022 19:10:09 +0000 (12:10 -0700)]
[OpenMP] Fix -Wswitch (due to new OMPC_cancellation_construct_type) after D123828

2 years ago[SLP]Improve reductions analysis and emission, part 1.
Alexey Bataev [Thu, 18 Nov 2021 16:08:01 +0000 (08:08 -0800)]
[SLP]Improve reductions analysis and emission, part 1.

Currently SLP vectorizer walks through the instructions and selects
3 main classes of values: 1) reduction operations - instructions with same
reduction opcode (add, mul, min/max, etc.), which build the reduction,
2) reduced values - instructions with the same opcodes, but different
from the reduction opcode, 3) extra arguments - all other values,
instructions from the different basic block rather than the root node,
instructions with too many/few uses.

This scheme is not very efficient. It excludes some instructions and all
non-instruction values from the reductions (constants, proficient
gathers), and too many possibly reduced values are marked as extra arguments.
The patch improves this process by introducing a slightly extended analysis
stage. During this stage, we still try to select 3 classes of the
values: 1) reduction operations - same as before, 2) possibly reduced
values - all instructions from the current block/non-instructions, which
may build a vectorization tree, 3) extra arguments - instructions from
the different basic blocks. Additionally, an extra sorting of the
possibly reduced values occurs to build the scalar sequences which
highly likely will be vectorized, e.g. loads are grouped by the
distance between them, constants are grouped together, cmp instructions
are sorted by their compare types and predicates, extractelement
instructions are sorted by the vector operand, etc. Also, these groups
are reordered by their length so the longest group is the first in the
list of the possibly reduced values.

The vectorization process tries to emit the reductions for all these
groups. These reductions, the remaining non-vectorized possibly reduced
values and extra arguments are then combined into the final expression
just like it was before.

Differential Revision: https://reviews.llvm.org/D114171

2 years ago[SystemZ] Accept (. - 0x100000000) PCRel32 constants
Ilya Leoshkevich [Mon, 2 May 2022 18:56:51 +0000 (20:56 +0200)]
[SystemZ] Accept (. - 0x100000000) PCRel32 constants

Clang does not accept instructions like brasl %r0,.-0x100000000,
because the second operand's right-hand-side (0x100000000) barely
misses the acceptable range. However, since it's being subtracted, it
makes sense to perform the range check on the negated value.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D124780
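
A minimal sketch of the range check described in the SystemZ commit above, assuming the usual PCRel32 encoding (a signed 32-bit displacement counted in halfwords, i.e. byte offsets in [-2^32, 2^32 - 2]). The helper names `inPCRel32Range` and `acceptDotExpr` are hypothetical and are not the ones used in the SystemZ backend.

```cpp
#include <cstdint>

// Byte offsets reachable by a signed 32-bit halfword displacement.
constexpr std::int64_t MinOffset = -(std::int64_t(1) << 32);
constexpr std::int64_t MaxOffset = (std::int64_t(1) << 32) - 2;

constexpr bool inPCRel32Range(std::int64_t Offset) {
  return Offset >= MinOffset && Offset <= MaxOffset && Offset % 2 == 0;
}

// Check ". + C" or ". - C". When the constant is subtracted, the check is
// performed on the negated value rather than on C itself.
// (Sketch only: C == INT64_MIN is not handled.)
constexpr bool acceptDotExpr(bool IsSubtracted, std::int64_t C) {
  return inPCRel32Range(IsSubtracted ? -C : C);
}

static_assert(!inPCRel32Range(0x100000000LL), "2^32 itself is out of range");
static_assert(acceptDotExpr(true, 0x100000000LL), ". - 2^32 is reachable");
```

The asymmetry of the two's-complement range is what makes `. - 0x100000000` legal while `. + 0x100000000` is not.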

2 years ago[SDAG] fix miscompile when casting int->FP->int
Sanjay Patel [Mon, 2 May 2022 16:48:02 +0000 (12:48 -0400)]
[SDAG] fix miscompile when casting int->FP->int

This is the codegen equivalent of D124692.

As shown in https://github.com/llvm/llvm-project/issues/55150 -
the existing fold may be wrong when converting to a signed value.
This is a quick fix to avoid the miscompile.
https://alive2.llvm.org/ce/z/KtaDmd

Differential Revision: https://reviews.llvm.org/D124771

2 years ago[Object][DX] Initial DXContainer parsing support
Chris Bieneman [Thu, 28 Apr 2022 23:03:25 +0000 (18:03 -0500)]
[Object][DX] Initial DXContainer parsing support

This patch begins adding DXContainer parsing support to libObject.
Following the pattern used by ELFFile my goal here is to write a
standalone DXContainer parser and later write an adapter interface to
support a subset of the ObjectFile interfaces so that we can add
limited objdump support. I will also be adding ObjectYAML support to
help drive testing of the object tools and MC-level object writers as
those come together.

DXContainer is a slightly odd format. It is arranged in "parts" that
are semantically similar to sections, but it doesn't support symbol
listing.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D124643

2 years agoUpdate movmsk-cmp.ll to match improvements made to InstCombine
Amaury Séchet [Mon, 2 May 2022 10:12:27 +0000 (10:12 +0000)]
Update movmsk-cmp.ll to match improvements made to InstCombine

This reflects the changes in the IR generated by InstCombine as pointed out by @RKSimon in https://reviews.llvm.org/D124743#3485199

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124756

2 years ago[VPlan] Do not create VPWidenCall recipes for scalar vector factors.
Florian Hahn [Mon, 2 May 2022 18:40:33 +0000 (19:40 +0100)]
[VPlan] Do not create VPWidenCall recipes for scalar vector factors.

'Widen' recipes are only used when actual vector values are generated.
Fix tryToWidenCall to not create VPWidenCallRecipes for scalar vector
factors.

This was exposed by D123720, because the widened recipes are considered
vector users.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D124718

2 years ago[DeadArgElim] Set unused arguments for internal functions
Quentin Colombet [Fri, 29 Apr 2022 20:53:36 +0000 (13:53 -0700)]
[DeadArgElim] Set unused arguments for internal functions

Prior to this patch we would only set to undef the unused arguments of the
external functions. The rationale was that unused arguments of internal
functions wouldn't need to be turned into undef arguments because they
should have been simply eliminated by the time we reach that code.

This is actually not true because there are plenty of cases where we can't
remove unused arguments. For instance, if the internal function is used in
an indirect call, it may not be possible to change the function signature.
Yet, for statically known call-sites we would still like to mark the unused
arguments as undef.

This patch enables the "set undef arguments" optimization on internal
functions when we encounter cases where internal functions cannot be
optimized. I.e., whenever an internal function is marked "live".

Differential Revision: https://reviews.llvm.org/D124699

2 years agoRevert "Re-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation"""
Erich Keane [Mon, 2 May 2022 18:10:35 +0000 (11:10 -0700)]
Revert "Re-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation"""

This reverts commit a97899108e495147985e5e9492e742d51d5cc97a.

The patch caused some problems with the libc++ `__range_adaptor_closure`
that I haven't been able to figure out the cause of, so I am reverting
while I figure out whether this is a solvable problem/issue with the
CFE, or libc++ depending on an older 'incorrect' behavior.

2 years ago[PS5] Check for HasNativeLLVMSupport
Paul Robinson [Mon, 2 May 2022 18:03:08 +0000 (11:03 -0700)]
[PS5] Check for HasNativeLLVMSupport

2 years ago[NFC] Add test for HasNativeLLVMSupport
Paul Robinson [Mon, 2 May 2022 17:58:22 +0000 (10:58 -0700)]
[NFC] Add test for HasNativeLLVMSupport

It looks like there used to be a test for this, but the test evolved
in a way that caused the check for the diagnostic to be eliminated.
Add a test that is obviously and specifically for that diagnostic.

2 years ago[Driver][test] Remove clang{{.*}} when testing -cc1 command lines
Fangrui Song [Mon, 2 May 2022 18:02:19 +0000 (11:02 -0700)]
[Driver][test] Remove clang{{.*}} when testing -cc1 command lines

The majority of tests omit testing "clang" for -cc1 command lines. In addition,
some distributions symlink %clang to an executable with a content hash based
filename so clang{{.*}} check won't work.

With this change, we can remove many -no-canonical-prefixes whose purpose was to
make the tests pass on such distributions.

2 years ago[ifs] Fix bug where exclude only excluded when outputting ifs files
Alex Brachet [Mon, 2 May 2022 17:49:06 +0000 (17:49 +0000)]
[ifs] Fix bug where exclude only excluded when outputting ifs files

Now output elf files will also have excluded symbols removed.

Reviewed By: haowei

Differential Revision: https://reviews.llvm.org/D124781

2 years agoReapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building
Jonas Paulsson [Wed, 20 Apr 2022 16:19:37 +0000 (18:19 +0200)]
Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building
libcalls." (was 0f8c626). This reverts commit 14d9390.

The patch previously failed to recognize cases where the user had defined a
function alias with the same name as that of the library
function. Module::getFunction() would then return nullptr, which is what the
sanitizer discovered.

In this updated version, a new function isLibFuncEmittable() has also been
introduced, which is now used instead of TLI->has() any time a library function
is to be emitted. It additionally makes sure there is e.g. no function
alias with the same name in the module.

Reviewed By: Eli Friedman

Differential Revision: https://reviews.llvm.org/D123198

2 years ago[mlir][OpenMP] Add omp.cancel and omp.cancellationpoint.
Raghu Maddhipatla [Mon, 2 May 2022 17:23:11 +0000 (12:23 -0500)]
[mlir][OpenMP] Add omp.cancel and omp.cancellationpoint.

Reviewed By: kiranchandramohan, peixin, shraiysh

Differential Revision: https://reviews.llvm.org/D123828

2 years ago[mlir] CRunnerUtils: qualify UnrankedMemRefType to avoid collisions with mlir::Unrank...
Eugene Zhulenev [Sun, 1 May 2022 20:28:51 +0000 (13:28 -0700)]
[mlir] CRunnerUtils: qualify UnrankedMemRefType to avoid collisions with mlir::UnrankedMemRefType

When CRunnerUtils is included together with MLIR IR headers, it can lead to compilation errors.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D124744

2 years agoMark identifier prefixes as substitutable
Harald van Dijk [Mon, 2 May 2022 17:07:47 +0000 (18:07 +0100)]
Mark identifier prefixes as substitutable

The Itanium C++ ABI says prefixes are substitutable. For most prefixes
we already handle this: the manglePrefix(const DeclContext *, bool) and
manglePrefix(QualType) overloads explicitly handle substitutions or
defer to functions that handle substitutions on their behalf. The
manglePrefix(NestedNameSpecifier *) overload, however, is different and
handles some cases implicitly, but not all. The Identifier case was not
handled; this change adds handling for it, as well as a test case.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122663

2 years ago[PowerPC] Enable CR bits support for Power8 and above.
Amy Kwan [Mon, 2 May 2022 06:30:10 +0000 (01:30 -0500)]
[PowerPC] Enable CR bits support for Power8 and above.

This patch turns on support for CR bit accesses for Power8 and above. CR bits
are turned on by default for Power8 and above because later architectures
make use of builtins and instructions that require CR bit accesses (such as
the use of setbc in the vector string isolate predicate and bcd builtins on
Power10).

This patch also adds the clang portion to allow for turning on CR bits in the
front end if the user so desires.

Differential Revision: https://reviews.llvm.org/D124060

2 years ago[Driver][test] Avoiding producing object file in the current directory
Fangrui Song [Mon, 2 May 2022 17:00:57 +0000 (10:00 -0700)]
[Driver][test] Avoiding producing object file in the current directory

2 years ago[GlobalOpt] Iterate over replaced values deterministically to constprop
Arthur Eubanks [Mon, 2 May 2022 16:21:39 +0000 (09:21 -0700)]
[GlobalOpt] Iterate over replaced values deterministically to constprop

If there are pre-existing dead instructions, the order we visit replaced
values can sometimes cause us to not delete dead instructions.

The added test non-deterministically failed without the change.
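
One common way to get a deterministic visitation order is a set that remembers insertion order. The sketch below illustrates that general technique only; the commit message does not say which container the GlobalOpt change uses, and `InsertionOrderedSet` and `visitOrder` are hypothetical names.

```cpp
#include <unordered_set>
#include <vector>

// Set with unique elements whose iteration order is the insertion order,
// independent of hash values or pointer addresses.
template <typename T> class InsertionOrderedSet {
  std::vector<T> Order;
  std::unordered_set<T> Seen;

public:
  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }
  const std::vector<T> &items() const { return Order; }
};

// Deduplicate while preserving first-seen order.
inline std::vector<int> visitOrder(const std::vector<int> &Replaced) {
  InsertionOrderedSet<int> Set;
  for (int V : Replaced)
    Set.insert(V);
  return Set.items();
}
```

Iterating over `items()` yields the same order on every run, which a set keyed on pointer values does not guarantee.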

2 years ago[Driver][test] Add back some -no-canonical-prefixes
Fangrui Song [Mon, 2 May 2022 16:35:58 +0000 (09:35 -0700)]
[Driver][test] Add back some -no-canonical-prefixes

To make them meaningful, it's useful to check "clang". Use
-no-canonical-prefixes to support distributions that symlink %clang to an
executable with a filename not ending in "clang".

2 years ago[InstCombine] Handle non-canonical GEP index in indexed compare fold (PR55228)
Nikita Popov [Mon, 2 May 2022 15:52:02 +0000 (17:52 +0200)]
[InstCombine] Handle non-canonical GEP index in indexed compare fold (PR55228)

Normally the index type will already be canonicalized here, but
this is not guaranteed depending on visitation order. The code
was already accounting for a potentially needed sext, but a trunc
may also be needed.

Add a ConstantExpr::getSExtOrTrunc() helper method to make this
simpler. This matches the corresponding IRBuilder method in behavior.

Fixes https://github.com/llvm/llvm-project/issues/55228.
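
The "sext or trunc" behavior mentioned in the InstCombine commit above can be illustrated on plain integers. This is a standalone sketch and not the ConstantExpr implementation; `sextOrTrunc` and `maskTo` are hypothetical helpers that keep a value of a given bit width in the low bits of a `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low Bits bits of V.
inline std::uint64_t maskTo(std::uint64_t V, unsigned Bits) {
  return Bits >= 64 ? V : V & ((std::uint64_t(1) << Bits) - 1);
}

// Widening sign-extends, narrowing truncates, equal widths are a no-op.
inline std::uint64_t sextOrTrunc(std::uint64_t V, unsigned FromBits,
                                 unsigned ToBits) {
  assert(FromBits >= 1 && FromBits <= 64 && ToBits >= 1 && ToBits <= 64);
  V = maskTo(V, FromBits);
  if (ToBits > FromBits) {
    // FromBits < ToBits <= 64 here, so both shifts are by less than 64.
    const std::uint64_t SignBit = std::uint64_t(1) << (FromBits - 1);
    if (V & SignBit)
      V |= ~std::uint64_t(0) << FromBits;
  }
  return maskTo(V, ToBits);
}
```

For instance, the 8-bit value 0xFF widened to 32 bits becomes 0xFFFFFFFF, while the 16-bit value 0x1234 narrowed to 8 bits becomes 0x34.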

2 years ago[gn build] Port 5de0a3e9da72
LLVM GN Syncbot [Mon, 2 May 2022 15:51:27 +0000 (15:51 +0000)]
[gn build] Port 5de0a3e9da72

2 years ago[Analyzer] Minor cleanups in StreamChecker
Marco Antognini [Tue, 26 Apr 2022 09:16:36 +0000 (11:16 +0200)]
[Analyzer] Minor cleanups in StreamChecker

Remove unnecessary conversion to Optional<> and incorrect assumption
that BindExpr can return a null state.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D124681

2 years ago[trace][intelpt] Support system-wide tracing [1] - Add a method for accessing the...
Walter Erquinigo [Wed, 27 Apr 2022 19:13:40 +0000 (12:13 -0700)]
[trace][intelpt] Support system-wide tracing [1] - Add a method for accessing the list of logical core ids

In order to open perf events per core, we need to first get the list of
core ids available in the system. So I'm adding a function that does
that by parsing /proc/cpuinfo. That seems to be the simplest and most
portable way to do that.

Besides that, I made a few refactors and renames to reflect better that
the cpu info that we use in lldb-server comes from procfs.

Differential Revision: https://reviews.llvm.org/D124573
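
A rough sketch of collecting logical core ids from `/proc/cpuinfo`-style text, as the intelpt commit above describes. It assumes the x86 Linux layout in which each logical core has a line of the form `processor : N`; `parseLogicalCoreIds` is a hypothetical name and not the function added to lldb-server.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Collect the numeric id from every line that starts with "processor".
// Lines without a colon or without a number after it are skipped.
inline std::vector<int> parseLogicalCoreIds(const std::string &CpuInfo) {
  std::vector<int> Ids;
  std::istringstream In(CpuInfo);
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.compare(0, 9, "processor") != 0)
      continue;
    const std::size_t Colon = Line.find(':');
    if (Colon == std::string::npos)
      continue;
    try {
      Ids.push_back(std::stoi(Line.substr(Colon + 1)));
    } catch (...) {
      // Not a number after the colon: ignore this line.
    }
  }
  return Ids;
}
```

Taking the file contents as a string keeps the parser testable without touching procfs.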

2 years ago[X86] Reduce some superfluous diffs between znver1/znver2 models. NFC
Simon Pilgrim [Mon, 2 May 2022 15:45:39 +0000 (16:45 +0100)]
[X86] Reduce some superfluous diffs between znver1/znver2 models. NFC

znver2 is mainly a search+replace of the znver1 model, but for no reason the HADD and DPPS have been moved around - try to keep these in sync (no actual changes in the models).

2 years ago[X86][AMX] combineLdSt - don't dereference dyn_cast. NFC
Simon Pilgrim [Mon, 2 May 2022 15:20:06 +0000 (16:20 +0100)]
[X86][AMX] combineLdSt - don't dereference dyn_cast. NFC

This leads to null pointer dereference warnings - use cast<> which will assert that the cast is correct.

2 years ago[Analyzer] Fix clang::ento::taint::dumpTaint definition
Marco Antognini [Tue, 19 Apr 2022 11:18:02 +0000 (13:18 +0200)]
[Analyzer] Fix clang::ento::taint::dumpTaint definition

Ensure the definition is in the "taint" namespace, like its declaration.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D124462

2 years agoBuildLibCalls: add alloc-family attribute to many allocator functions
Augie Fackler [Thu, 17 Mar 2022 13:54:46 +0000 (09:54 -0400)]
BuildLibCalls: add alloc-family attribute to many allocator functions

Differential Revision: https://reviews.llvm.org/D123086

2 years agoRe-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation""
Erich Keane [Mon, 2 May 2022 13:29:25 +0000 (06:29 -0700)]
Re-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation""

This reverts commit 0c31da48389754822dc3eecc4723160c295b9ab2.

I've solved the issue with the PointerUnion by making the
`FunctionTemplateDecl` pointer be a NamedDecl, which can be either a
`FunctionDecl` or a `FunctionTemplateDecl`.  This is enforced
with an assert.

2 years ago[LV][SLP] Add tests for vectorizing fptoi_sat intrinsics. NFC
David Green [Mon, 2 May 2022 14:11:44 +0000 (15:11 +0100)]
[LV][SLP] Add tests for vectorizing fptoi_sat intrinsics. NFC

2 years agoBuildLibCalls: infer allocptr attribute for free and realloc() family functions
Augie Fackler [Wed, 16 Mar 2022 17:53:18 +0000 (13:53 -0400)]
BuildLibCalls: infer allocptr attribute for free and realloc() family functions

Differential Revision: https://reviews.llvm.org/D123084

2 years ago[X86] Replace avx512f integer add reduction builtins with generic builtin
Simon Pilgrim [Mon, 2 May 2022 13:39:10 +0000 (14:39 +0100)]
[X86] Replace avx512f integer add reduction builtins with generic builtin

D124741 added the generic "__builtin_reduce_add" which we can use to replace the x86 specific integer add reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required.

Differential Revision: https://reviews.llvm.org/D124757

2 years agoRevert "Deferred Concept Instantiation Implementation"
Erich Keane [Mon, 2 May 2022 13:25:38 +0000 (06:25 -0700)]
Revert "Deferred Concept Instantiation Implementation"

This reverts commit 4b6c2cd647e9e5a147954886338f97ffb6a1bcfb.

The patch caused numerous ARM 32 bit build failures, since we added a
5th item to the PointerUnion, and went over the 2-bits available in the
32 bit pointers.
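
The arithmetic behind that failure can be written down directly. The sketch assumes a tagged-pointer scheme that stores the active member index in the always-zero low bits of an aligned pointer; `tagBitsFor`, `freeLowBits` and `fitsInPointer` are hypothetical helpers, and the real PointerUnion machinery is more involved.

```cpp
#include <cstddef>

// Number of tag bits needed to distinguish N alternatives: ceil(log2(N)).
constexpr unsigned tagBitsFor(std::size_t N) {
  return N <= 1 ? 0 : 1 + tagBitsFor((N + 1) / 2);
}

// Low bits that are always zero in a pointer aligned to Align bytes
// (Align is assumed to be a power of two).
constexpr unsigned freeLowBits(std::size_t Align) {
  return Align <= 1 ? 0 : 1 + freeLowBits(Align / 2);
}

constexpr bool fitsInPointer(std::size_t N, std::size_t Align) {
  return tagBitsFor(N) <= freeLowBits(Align);
}

static_assert(fitsInPointer(4, 4), "4 members fit in 2 low bits");
static_assert(!fitsInPointer(5, 4), "a 5th member needs a 3rd bit");
static_assert(fitsInPointer(5, 8), "8-byte alignment frees 3 bits");
```

With 4-byte-aligned pointees only two low bits are free, so four members is the limit and a fifth overflows.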

2 years ago[CodeGen] Add tests for X+(Y&~X) pattern (NFC)
Nikita Popov [Mon, 2 May 2022 13:23:13 +0000 (15:23 +0200)]
[CodeGen] Add tests for X+(Y&~X) pattern (NFC)

2 years ago[AArch64] add tests for int->FP->int casts; NFC
Sanjay Patel [Mon, 2 May 2022 13:18:12 +0000 (09:18 -0400)]
[AArch64] add tests for int->FP->int casts; NFC

Copied from x86 tests for multi-target coverage.
Also, provides coverage for target-specific asm
testing for Alive2 or its follow-ons.

See #55150 and D124692

2 years ago[x86] add tests for int->FP->int casts; NFC
Sanjay Patel [Mon, 2 May 2022 12:23:52 +0000 (08:23 -0400)]
[x86] add tests for int->FP->int casts; NFC

Adapted from tests for IR in D124692.
Also see #55150

2 years ago[x86] update test file with complete auto-generated check lines; NFC
Sanjay Patel [Mon, 2 May 2022 12:12:43 +0000 (08:12 -0400)]
[x86] update test file with complete auto-generated check lines; NFC

Also, improve test names.

2 years agoDeferred Concept Instantiation Implementation
Erich Keane [Thu, 3 Mar 2022 16:27:49 +0000 (08:27 -0800)]
Deferred Concept Instantiation Implementation

As reported here: https://github.com/llvm/llvm-project/issues/44178

Concepts are not supposed to be instantiated until they are checked, so
this patch implements that and goes through a significant amount of work
to make sure we re-instantiate the concepts correctly.

Differential Revision: https://reviews.llvm.org/D119544

2 years ago[libunwind] Add SystemZ support
Ulrich Weigand [Mon, 2 May 2022 12:35:29 +0000 (14:35 +0200)]
[libunwind] Add SystemZ support

Add support for the SystemZ (s390x) architecture to libunwind.

Support should be feature-complete with the exception of
unwinding from signal handlers (to be added later).

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D124248

2 years ago[X86] MOVDDUP has the same sched behaviour as MOVSHDUP/MOVSLDUP on Skylake
Simon Pilgrim [Mon, 2 May 2022 11:50:32 +0000 (12:50 +0100)]
[X86] MOVDDUP has the same sched behaviour as MOVSHDUP/MOVSLDUP on Skylake

Fixes an old TODO - confirmed on Agner + uops.info

2 years ago[InstCombine] Add tests for A+(B&~A) and A+((A&B)^B) (NFC)
Nikita Popov [Mon, 2 May 2022 11:22:39 +0000 (13:22 +0200)]
[InstCombine] Add tests for A+(B&~A) and A+((A&B)^B) (NFC)

2 years ago[gn build] (manually) port fb7a435492a5
Nico Weber [Mon, 2 May 2022 11:23:10 +0000 (07:23 -0400)]
[gn build] (manually) port fb7a435492a5

2 years ago[SLP][X86] Add test coverage for PR41892
Simon Pilgrim [Mon, 2 May 2022 11:17:11 +0000 (12:17 +0100)]
[SLP][X86] Add test coverage for PR41892

2 years agotsan: model atomic read for failing CAS
Dmitry Vyukov [Wed, 27 Apr 2022 07:52:33 +0000 (09:52 +0200)]
tsan: model atomic read for failing CAS

See the added test and https://github.com/google/sanitizers/issues/1520
for the description of the problem.
The standard says that failing CAS is a memory load only,
model it as such to avoid false positives.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D124507
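
The language-level behavior that the tsan commit above models can be observed with `std::atomic`: a compare-exchange whose comparison fails stores nothing and only loads the current value into `expected`. The function name `failedCasOnlyLoads` is made up for this sketch.

```cpp
#include <atomic>

// Returns true if a failing compare-exchange left the atomic untouched and
// wrote the observed value back into Expected.
inline bool failedCasOnlyLoads() {
  std::atomic<int> X{1};
  int Expected = 2; // deliberately wrong, so the exchange must fail
  const bool Exchanged = X.compare_exchange_strong(Expected, 3);
  return !Exchanged && X.load() == 1 && Expected == 1;
}
```

Since the failing operation is a pure load, reporting it as a write would produce the false positives the commit describes.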

2 years ago[AArch64] Cost modelling for fptoi_sat
David Green [Mon, 2 May 2022 10:36:05 +0000 (11:36 +0100)]
[AArch64] Cost modelling for fptoi_sat

This builds on top of the target-independent cost model added in D124269
to add aarch64 specific costs for fptoui_sat and fptosi_sat intrinsics.
For many common types they will be legal instructions as the AArch64
instructions will saturate naturally. For unsupported pairs of integer
and floating point types, an additional min/max clamp is needed.

Differential Revision: https://reviews.llvm.org/D124357
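
For reference, the saturating semantics being costed by the AArch64 commit above can be sketched in scalar C++: NaN maps to zero, out-of-range inputs clamp to the integer limits, and everything else rounds toward zero. `fpToSI32Sat` is a hypothetical helper written for this note and shows the semantics only, not the lowering.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of a saturating double -> i32 conversion.
inline std::int32_t fpToSI32Sat(double X) {
  using Lim = std::numeric_limits<std::int32_t>;
  if (std::isnan(X))
    return 0;
  if (X <= static_cast<double>(Lim::min()))
    return Lim::min();
  if (X >= static_cast<double>(Lim::max()))
    return Lim::max();
  return static_cast<std::int32_t>(X); // in range: truncates toward zero
}
```

The explicit comparisons are the "min/max clamp" that is needed when no single instruction saturates for a given type pair.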

2 years ago[Clang] Add integer add reduction builtin
Simon Pilgrim [Mon, 2 May 2022 10:03:19 +0000 (11:03 +0100)]
[Clang] Add integer add reduction builtin

Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.add intrinsic call.

For other reductions, we've tried to share builtins for float/integer vectors, but the fadd reduction intrinsics also take a starting value argument and can either do unordered or serialized, but not reduction-trees as specified for the builtins. However we address fadd support, this shouldn't affect the integer case.

(Split off from D117829)

Differential Revision: https://reviews.llvm.org/D124741
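
A small usage sketch for `__builtin_reduce_add` on an integer vector, assuming a GCC- or Clang-compatible compiler that supports the `vector_size` attribute. The wrappers `reduceAdd` and `sumOfFour` are hypothetical, and the scalar fallback loop is only there for compilers that lack the builtin.

```cpp
// Four 32-bit lanes, using the GCC/Clang vector extension.
typedef int v4si __attribute__((vector_size(16)));

#ifndef __has_builtin
#define __has_builtin(x) 0 // compilers without __has_builtin use the fallback
#endif

inline int reduceAdd(v4si V) {
#if __has_builtin(__builtin_reduce_add)
  return __builtin_reduce_add(V); // horizontal integer add across all lanes
#else
  int Sum = 0;
  for (int I = 0; I < 4; ++I)
    Sum += V[I];
  return Sum;
#endif
}

inline int sumOfFour(int A, int B, int C, int D) {
  v4si V = {A, B, C, D};
  return reduceAdd(V);
}
```

Both paths return the same value, so callers do not need to know which one was compiled in.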

2 years ago[analyzer] Allow CFG dumps in release builds
Balazs Benics [Mon, 2 May 2022 09:48:52 +0000 (11:48 +0200)]
[analyzer] Allow CFG dumps in release builds

This is a similar commit to D124442, but for CFG dumps.
The binary size diff remained the same as demonstrated in that patch.

This time I'm adding tests for demonstrating that all the dump debug
checkers work - even in regular builds without asserts.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124443

2 years ago[analyzer] Allow exploded graph dumps in release builds
Balazs Benics [Mon, 2 May 2022 09:42:08 +0000 (11:42 +0200)]
[analyzer] Allow exploded graph dumps in release builds

Historically, exploded graph dumps were disabled in non-debug builds.
It was done so probably because a regular user should not dump the
internal representation of the analyzer anyway and the dump methods
might introduce unnecessary binary size overhead.

It turns out some of the users actually want to dump this.

Note that e.g. `LiveExpressionsDumper`, `LiveVariablesDumper`,
`ControlDependencyTreeDumper` etc. worked previously, and they are
unaffected by this change.
However, `CFGViewer` and `CFGDumper` still won't work for a similar
reason. AFAIK only these two won't work after this change.

Addresses #53873

---

**baseline**

| binary | size | size after strip |
| clang | 103M | 83M |
| clang-tidy | 67M | 54M |

**after this change**

| binary | size | size after strip |
| clang | 103M | 84M |
| clang-tidy | 67M | 54M |

CMake configuration:
```
cmake -S llvm -GNinja -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release
-DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang
-DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_USE_LINKER=lld
-DLLVM_ENABLE_DUMP=OFF -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra"
-DLLVM_ENABLE_Z3_SOLVER=ON -DLLVM_TARGETS_TO_BUILD="X86"
```
Built by `clang-14.0.0`.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124442

2 years ago[CostModel][X86] getScalarizationOverhead - handle vXi1 extracts with MOVMSK (pre...
Simon Pilgrim [Mon, 2 May 2022 08:58:35 +0000 (09:58 +0100)]
[CostModel][X86] getScalarizationOverhead - handle vXi1 extracts with MOVMSK (pre-AVX512)

We can quickly extract multiple elements of a bool vector using MOVMSK ops - since we don't know what generated the vXi1, I've been optimistic and assumed we can use PMOVMSKB to extract the maximum number of bools with a single op.

The MOVMSK pattern isn't great for extract+insert round trips as vXi1 type legalization can interfere with this a lot - so this relies on us remaining good at using getScalarizationOverhead properly (and tagging both Insert and Extract modes) for those round trip cases.

The AVX512 KMOV codegen for bool extraction is a bit of a mess so for now I've not included that - the per-element cost is a lot more accurate for current codegen.

2 years ago[analyzer] Fix cast evaluation on scoped enums in ExprEngine
Balazs Benics [Mon, 2 May 2022 08:54:26 +0000 (10:54 +0200)]
[analyzer] Fix cast evaluation on scoped enums in ExprEngine

We ignored the cast if the enum was scoped.
This is bad since there is no implicit conversion from the scoped enum to the corresponding underlying type.

The fix is basically: isIntegralOrEnumerationType() -> isIntegralOr**Unscoped**EnumerationType()

This manifested as crashes when analyzing LLVM itself using the Z3 refutation.
Refutation synthesized the given Z3 Binary expression (`BO_And` of `unsigned char` aka. 8 bits
and an `int` 32 bits) with the wrong bitwidth in the end, which triggered an assert.

Now, we evaluate the cast according to the standard.

This bug could have been triggered using the Z3 CM according to
https://bugs.llvm.org/show_bug.cgi?id=44030

Fixes #47570 #43375

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D85528
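
The language rule behind the scoped enum fix can be checked at compile time: an unscoped enum converts implicitly to `int`, a scoped one does not, so a cast on a scoped enum cannot be skipped. The enum and function names below are made up for this illustration.

```cpp
#include <type_traits>

enum Plain : unsigned char { PlainA = 1 };
enum class Scoped : unsigned char { A = 1 };

static_assert(std::is_convertible<Plain, int>::value,
              "unscoped enums convert implicitly");
static_assert(!std::is_convertible<Scoped, int>::value,
              "scoped enums need an explicit cast");

// The explicit cast turns the 8-bit enum value into a 32-bit int before the
// bitwise AND, so both operands have the same width.
inline int maskScoped(Scoped S, int Mask) {
  return static_cast<int>(S) & Mask;
}
```

Dropping the cast would leave an 8-bit operand next to a 32-bit one, the kind of width mismatch the commit reports.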

2 years ago[Local] Consider atomic loads from constant global as dead
Nikita Popov [Fri, 22 Apr 2022 08:53:43 +0000 (10:53 +0200)]
[Local] Consider atomic loads from constant global as dead

Per the guidance in
https://llvm.org/docs/Atomics.html#atomics-and-ir-optimization,
an atomic load from a constant global can be dropped, as there can
be no stores to synchronize with. Any write to the constant global
would be UB.

IPSCCP will already drop such loads, but the main helper in Local
doesn't recognize this currently. This is motivated by D118387.

Differential Revision: https://reviews.llvm.org/D124241

2 years ago[mlir][OpenMP] Restrict types for omp.parallel args
Shraiysh Vaishay [Mon, 2 May 2022 05:24:28 +0000 (10:54 +0530)]
[mlir][OpenMP] Restrict types for omp.parallel args

This patch restricts the value of `if` clause expression to an I1 value.
It also restricts the value of `num_threads` clause expression to an I32
value.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D124142

2 years ago[clang-format] Fix a bug that misformats Access Specifier after *[]
owenca [Thu, 28 Apr 2022 01:01:49 +0000 (18:01 -0700)]
[clang-format] Fix a bug that misformats Access Specifier after *[]

Fixes #55132.

Differential Revision: https://reviews.llvm.org/D124589

2 years ago[analyzer][docs] Document alpha.security.cert.pos.34c limitations
Balazs Benics [Mon, 2 May 2022 08:37:23 +0000 (10:37 +0200)]
[analyzer][docs] Document alpha.security.cert.pos.34c limitations

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124659

2 years ago[analyzer] Fix Static Analyzer g_memdup false-positive
Balazs Benics [Mon, 2 May 2022 08:35:51 +0000 (10:35 +0200)]
[analyzer] Fix Static Analyzer g_memdup false-positive

`g_memdup()` allocates and copies memory, thus we should not assume that
the returned memory region is uninitialized because it might not be the
case.

PS: It would be even better to copy the bindings to mimic the actual
content of the buffer, but this works too.

Fixes #53617

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D124436

2 years ago[ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr
Nikita Popov [Fri, 29 Apr 2022 15:22:54 +0000 (17:22 +0200)]
[ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr

ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)"
to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by
itself, correct, but does come with two issues:

1. It unnecessarily broadens provenance by introducing an inttoptr.
   We generally prefer not to introduce inttoptr during optimization.
2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0,
   which further folds to null. In that case provenance becomes
   incorrect. This has been observed as a real-world miscompile with
   rustc.

We should probably address that incorrect inttoptr 0 fold at some
point, but in either case we should also drop this inttoptr-introducing
fold. Instead, replace it with a fold rooted at
ptrtoint(getelementptr), which seems to cover the original
motivation for this fold (test2 in the changed file).
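
The second issue above can be illustrated with plain integer arithmetic.
This is only a sketch: it models the numeric address and nothing else
(provenance cannot be expressed in ordinary C++), and the helper name
`foldedAddress` is hypothetical, not something in LLVM.

```cpp
#include <cstdint>

// Hypothetical model of the address produced by the old fold:
//   getelementptr i8, Ptr, (sub 0, V)  ->  inttoptr (sub (ptrtoint Ptr), V)
// Only the numeric address is modelled here, not provenance.
constexpr std::uintptr_t foldedAddress(std::uintptr_t PtrAddr,
                                       std::uintptr_t V) {
  return PtrAddr - V; // unsigned wraparound matches two's-complement sub
}

// When V == ptrtoint Ptr the address is 0, so the old fold ended up at
// inttoptr 0 and then null, although the GEP was still based on Ptr.
static_assert(foldedAddress(0x1000, 0x1000) == 0, "folds to inttoptr 0");
static_assert(foldedAddress(0x1000, 0x10) == 0xFF0, "ordinary negative offset");
```

Keeping the getelementptr avoids this, since the result stays derived from
Ptr regardless of the value of V.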

Differential Revision: https://reviews.llvm.org/D124677

2 years ago[AArch64] Add more comprehensive reverse shuffle costmodel tests. NFC
David Green [Mon, 2 May 2022 08:16:57 +0000 (09:16 +0100)]
[AArch64] Add more comprehensive reverse shuffle costmodel tests. NFC

2 years ago[mlir] support isa/cast/dyn_cast<Operation *>(operation)
Alex Zinenko [Fri, 29 Apr 2022 15:13:24 +0000 (17:13 +0200)]
[mlir] support isa/cast/dyn_cast<Operation *>(operation)

This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:

```cpp
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
  for (Operation *op : ...) {
    // `typename` is required because arg_t<0> is a dependent type name.
    auto specific =
        dyn_cast<typename function_traits<FnTy>::template arg_t<0>>(op);
    if (specific)
      lambda(specific);
  }
}
```

that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.
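
A self-contained toy model of the idea, for readers without an MLIR
checkout. Everything here (`Operation`, `AddOp`, `toyDynCast`, and this
`applyIf`) is a hypothetical stand-in and not the MLIR API. Unlike the
snippet above, the target type is passed explicitly instead of being
deduced from the lambda.

```cpp
#include <type_traits>

// Toy stand-in for the common base class.
struct Operation {
  int kind;
};

// Toy stand-in for a specific op wrapper; a null wrapper converts to false.
struct AddOp {
  static constexpr int Kind = 1;
  Operation *op = nullptr;
  explicit operator bool() const { return op != nullptr; }
};

// Generic cast: identity for `Operation *`, a checked wrap for op classes.
template <typename To>
To toyDynCast(Operation *op) {
  if constexpr (std::is_same_v<To, Operation *>) {
    return op; // casting to the common base always succeeds
  } else {
    return (op && op->kind == To::Kind) ? To{op} : To{};
  }
}

// One generic function serves both cases, with no specialization needed.
// Returns how many operations the callback was applied to.
template <typename OpTy, typename FnTy>
int applyIf(Operation **begin, Operation **end, FnTy &&fn) {
  int applied = 0;
  for (Operation **it = begin; it != end; ++it) {
    if (auto specific = toyDynCast<OpTy>(*it)) {
      fn(specific);
      ++applied;
    }
  }
  return applied;
}
```

With two operations of kinds 1 and 2, `applyIf<AddOp>` visits only the
first, while `applyIf<Operation *>` visits both.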

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124675

2 years ago[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion
Phoebe Wang [Mon, 2 May 2022 05:29:34 +0000 (13:29 +0800)]
[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion

X86 codegen uses the function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect the user's requirements when passing or returning vector arguments, so the Clang front end iterates over the vector arguments and sets `min-legal-vector-width` to the maximum width for both caller and callee.

It is assumed that middle-end optimizations do not care about the attribute, except for inlining and argument promotion.
- For inlining, we propagate the attribute of the inlined function, because the inlining function becomes the new caller.
- For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match.

The problem comes from the combination of these optimizations, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees, `bar` and `baz`. When argument promotion runs, `foo` and `bar` have the same `min-legal-vector-width`, so the argument is promoted to a vector. Inlining then inlines `baz` into `foo` and updates `min-legal-vector-width`, which results in an ABI mismatch between `foo` and `bar`.

This patch fixes the problem by extending the meaning of `min-legal-vector-width` so that it also describes a function's arguments. That is, any pass that touches function arguments has to set `min-legal-vector-width` to a value that reflects the width of the vector arguments. It makes sense to me because any argument modification is ABI-related and should be responsible for ABI compatibility.
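
As a rough sketch of what "set the attribute to reflect the vector arguments" could look like, assuming the update rule is "raise to the widest vector argument, never lower". That rule, the `ArgInfo` struct, and both function names are assumptions made for illustration and are not taken from the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of one function argument. Not an LLVM type.
struct ArgInfo {
  std::uint64_t bits; // width of the argument in bits
  bool isVector;      // whether the argument is a vector
};

// Width of the widest vector argument, or 0 if there is none.
std::uint64_t widestVectorArgBits(const std::vector<ArgInfo> &args) {
  std::uint64_t widest = 0;
  for (const ArgInfo &arg : args)
    if (arg.isVector)
      widest = std::max(widest, arg.bits);
  return widest;
}

// Assumed update rule: only ever raise the attribute so that it covers the
// arguments a pass has just introduced.
std::uint64_t updatedMinLegalVectorWidth(std::uint64_t current,
                                         const std::vector<ArgInfo> &newArgs) {
  return std::max(current, widestVectorArgBits(newArgs));
}
```

Under this rule, promoting an argument to a 512-bit vector in a function whose attribute was 0 yields 512, and a later pass that adds a 128-bit vector leaves the value unchanged.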

Differential Revision: https://reviews.llvm.org/D123284

2 years ago[flang] Added tests for taskwait and taskyield translation
Shraiysh Vaishay [Mon, 2 May 2022 05:11:46 +0000 (10:41 +0530)]
[flang] Added tests for taskwait and taskyield translation

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D124229

Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>
2 years ago[LoopCacheAnalysis] Use stable_sort() to avoid non-deterministic print output
Congzhe Cao [Mon, 2 May 2022 04:49:11 +0000 (00:49 -0400)]
[LoopCacheAnalysis] Use stable_sort() to avoid non-deterministic print output

The print output of loop cache analysis sometimes has a non-deterministic order
and therefore we have been using `CHECK-DAG` in its lit tests. This patch changes
the sorting of LoopCosts to llvm::stable_sort() where we compare loop cost numbers
and sort the loops. When loop cost numbers are equal, llvm::stable_sort() now
produces a deterministic loop order.
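
A small illustration of the effect, using `std::stable_sort` from the
standard library in place of `llvm::stable_sort`. The loop names, the cost
values, and the descending order are assumptions made for the example.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// (loop name, cost) pairs; the contents are made up for illustration.
using LoopCost = std::pair<std::string, long>;

// Sort by descending cost only. std::stable_sort keeps loops with equal
// cost in their original relative order, so the result is deterministic.
std::vector<LoopCost> sortLoopCosts(std::vector<LoopCost> costs) {
  std::stable_sort(costs.begin(), costs.end(),
                   [](const LoopCost &a, const LoopCost &b) {
                     return a.second > b.second;
                   });
  return costs;
}
```

Given the input order i (10), j (20), k (10), the result is always j, i, k.
An unstable sort would be free to return j, k, i instead.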

Reviewed By: Meinersbur, fhahn, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124725

2 years ago[clang][preprocessor] Add more macros to target AVR
Ben Shi [Thu, 21 Apr 2022 09:41:42 +0000 (09:41 +0000)]
[clang][preprocessor] Add more macros to target AVR

Reviewed By: MaskRay, aykevl

Differential Revision: https://reviews.llvm.org/D124157

2 years ago[Driver][Ananas] -r: imply -nostdlib like GCC
Brad Smith [Mon, 2 May 2022 04:26:41 +0000 (00:26 -0400)]
[Driver][Ananas] -r: imply -nostdlib like GCC

Similar to D116843 for Gnu.cpp

Reviewed By: zhmu, MaskRay

Differential Revision: https://reviews.llvm.org/D124729

2 years ago[Driver][test] Remove unneeded -no-canonical-prefixes and use preferred --target=
Fangrui Song [Mon, 2 May 2022 03:44:13 +0000 (20:44 -0700)]
[Driver][test] Remove unneeded -no-canonical-prefixes and use preferred --target=

Similar to D119309

2 years ago[compiler-rt][builtins] Add several helper functions for AVR
Ben Shi [Wed, 6 Apr 2022 10:45:50 +0000 (10:45 +0000)]
[compiler-rt][builtins] Add several helper functions for AVR

__mulqi3 : int8 multiplication
__mulhi3 : int16 multiplication
   _exit : global terminator
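
For reference, the semantics of an 8-bit multiply helper can be sketched
in portable C++ as shift-and-add. This only illustrates what such a helper
computes; it is not the code added by this patch, and the name
`mulqi3Sketch` is hypothetical.

```cpp
#include <cstdint>

// Shift-and-add multiplication with the result truncated to 8 bits, which
// is what an int8 multiply helper has to produce without a hardware
// multiplier.
std::uint8_t mulqi3Sketch(std::uint8_t a, std::uint8_t b) {
  std::uint8_t result = 0;
  while (b != 0) {
    if (b & 1) // add the shifted multiplicand for every set bit of b
      result = static_cast<std::uint8_t>(result + a);
    a = static_cast<std::uint8_t>(a << 1);
    b = static_cast<std::uint8_t>(b >> 1);
  }
  return result;
}
```

For example, `mulqi3Sketch(7, 6)` gives 42, and `mulqi3Sketch(200, 3)`
gives 88 because 600 wraps modulo 256.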

Reviewed By: MaskRay, aykevl

Differential Revision: https://reviews.llvm.org/D123200

2 years ago[gn build] Port 3939e99aae68
LLVM GN Syncbot [Sun, 1 May 2022 22:32:29 +0000 (22:32 +0000)]
[gn build] Port 3939e99aae68

2 years agollvm-reduce: Fix not removing first instruction in MachineBasicBlock
Matt Arsenault [Tue, 19 Apr 2022 13:12:45 +0000 (09:12 -0400)]
llvm-reduce: Fix not removing first instruction in MachineBasicBlock

This had the surprising behavior of using whatever instruction
happened to be first in the block as an anchor point to stick random
implicit defs on. Use a real implicit_def instead.

2 years agollvm-reduce: Introduce new scoring mechanism for MIR reductions
Matt Arsenault [Tue, 19 Apr 2022 21:19:36 +0000 (17:19 -0400)]
llvm-reduce: Introduce new scoring mechanism for MIR reductions

Many MIR reductions benefit from or require increasing the instruction
count. For example, unlike in the IR, you may need to insert a new
instruction to represent an undef. The current instruction reduction
pass works around this by sticking implicit defs on whatever
instruction happens to be first in the entry block.

Other strategies I've applied manually include breaking instructions
with multiple defs into separate instructions, or breaking large
register defs into multiple subregister defs.

Make up a simple scoring system based on what I generally try to get
rid of first when manually reducing. Counts implicit defs as free
since reduction passes will be introducing them, although they
probably should count for something. It also might make more sense to
have a comparison of the two functions, rather than having to compute a
contextless number. This isn't particularly well tested since overall
the MIR support isn't in a place where it is useful on the kinds of
testcases I want to throw at it.
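
A toy version of such a score, to make the idea concrete. The only detail
taken from the description above is that implicit defs are free; the
`ToyInstr` struct and the weights (one point per instruction plus one per
operand) are invented for this sketch and do not reflect the actual
scoring in llvm-reduce.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, heavily simplified view of a machine instruction.
struct ToyInstr {
  bool isImplicitDef;
  unsigned numOperands;
};

// Lower is better. Implicit defs count as free because reduction passes
// introduce them; the remaining weights are invented for illustration.
std::uint64_t toyScore(const std::vector<ToyInstr> &instrs) {
  std::uint64_t score = 0;
  for (const ToyInstr &instr : instrs) {
    if (instr.isImplicitDef)
      continue;
    score += 1 + instr.numOperands;
  }
  return score;
}
```

With a score like this, replacing a three-operand instruction by an
implicit def lowers the score even though the instruction count stays the
same.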

2 years agollvm-reduce: Do not try to delete frame instructions
Matt Arsenault [Mon, 25 Apr 2022 12:58:39 +0000 (08:58 -0400)]
llvm-reduce: Do not try to delete frame instructions

The verifier enforces these appearing as balanced pairs, so just
deleting one has no real chance of producing something valid.

2 years agollvm-reduce: Add pass to reduce IR references from MIR
Matt Arsenault [Tue, 19 Apr 2022 16:10:38 +0000 (12:10 -0400)]
llvm-reduce: Add pass to reduce IR references from MIR

This is typically the first thing I do when reducing a new testcase
until the IR section can be deleted.

2 years ago[RISCV] Lower case the first letter of LowerRISCVMachineOperandToMCOperand. NFC
Fangrui Song [Sun, 1 May 2022 21:13:54 +0000 (14:13 -0700)]
[RISCV] Lower case the first letter of LowerRISCVMachineOperandToMCOperand. NFC

2 years agodoc: update of the adv build doc now that clang is in tree too
Sylvestre Ledru [Sun, 1 May 2022 20:59:36 +0000 (22:59 +0200)]
doc: update of the adv build doc now that clang is in tree too
And be more consistent in the declarations

2 years ago[mlir:PDLInterp] Refactor the implementation of result type inferrence
River Riddle [Tue, 26 Apr 2022 20:38:21 +0000 (13:38 -0700)]
[mlir:PDLInterp] Refactor the implementation of result type inferrence

The current implementation uses a discrete "pdl_interp.inferred_types"
operation, which acts as a "fake" handle to a type range. This op is
used as a signal to pdl_interp.create_operation that types should be
inferred. This is terribly awkward and clunky though:

* This op doesn't have a byte code representation, and its conversion
  to bytecode kind of assumes that it is only used in a certain way. The
  current lowering is also broken and seemingly untested.

* Given that this is a different operation, it gives off the assumption
  that it can be used multiple times, or that after the first use
  the value contains the inferred types. This isn't the case though,
  the resultant type range can never actually be used as a type range.

This commit refactors the representation by removing the discrete
InferredTypesOp, and instead adds a UnitAttr to
pdl_interp.CreateOperation that signals when the created operations
should infer their types. This leads to a much much cleaner abstraction,
a more optimal bytecode lowering, and also allows for better error
handling and diagnostics when a created operation doesn't actually
support type inference.

Differential Revision: https://reviews.llvm.org/D124587

2 years ago[SimpleLoopUnswitch] Freeze individual OR/AND operands.
Florian Hahn [Sun, 1 May 2022 19:11:05 +0000 (20:11 +0100)]
[SimpleLoopUnswitch] Freeze individual OR/AND operands.

In some cases, it is not enough to freeze the final AND/OR operation
when chaining a number of invariant conditions together.

After creating a chain of ANDs/ORs, we assume all unswitched operands to
be either true or false. But if any of the operands is poison, the rest
of the operands could have any value after branching on the frozen
condition.

To avoid that, freeze individual operands, if needed. In some cases this
may lead to unnecessary freezes, but it seems required at least for some
cases (see trivial-unswitch-freeze-individual-conditions.ll)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D124554

2 years ago[VectorCombine] Merge isa<>/cast<> into dyn_cast<>. NFC.
Simon Pilgrim [Sun, 1 May 2022 19:09:05 +0000 (20:09 +0100)]
[VectorCombine] Merge isa<>/cast<> into dyn_cast<>. NFC.

We want to handle the assert in VectorCombine so avoid the repeated isa/cast code.

2 years ago[Polly] Fix test after D119669.
Michael Kruse [Sun, 1 May 2022 18:32:42 +0000 (13:32 -0500)]
[Polly] Fix test after D119669.

2 years ago[DAG] (style) Break apart if-else chain as they all return
Simon Pilgrim [Sun, 1 May 2022 16:56:54 +0000 (17:56 +0100)]
[DAG] (style) Break apart if-else chain as they all return

2 years ago[clang][dataflow] Optimize flow condition representation
Stanislav Gatev [Mon, 25 Apr 2022 15:23:42 +0000 (15:23 +0000)]
[clang][dataflow] Optimize flow condition representation

Enable efficient implementation of context-aware joining of distinct
boolean values. It can be used to join distinct boolean values while
preserving flow condition information.

Flow conditions are represented as Token <=> Clause iff formulas. To
perform context-aware joining, one can simply add the tokens of flow
conditions to the formula when joining distinct boolean values, e.g.:
`makeOr(makeAnd(FC1, Val1), makeAnd(FC2, Val2))`. This significantly
simplifies the implementation of `Environment::join`.
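
The formula can be checked on concrete truth values. The function below is
a plain boolean model of it, not the dataflow framework's API, and the
name `joinValues` is hypothetical.

```cpp
// Boolean model of makeOr(makeAnd(FC1, Val1), makeAnd(FC2, Val2)).
// fc1 and fc2 stand for the flow condition tokens of the two joined paths.
constexpr bool joinValues(bool fc1, bool val1, bool fc2, bool val2) {
  return (fc1 && val1) || (fc2 && val2);
}

// On the first path (fc1 holds, fc2 does not) the join equals val1.
static_assert(joinValues(true, true, false, false), "takes val1 == true");
static_assert(!joinValues(true, false, false, true), "takes val1 == false");
// On the second path (fc2 holds, fc1 does not) the join equals val2.
static_assert(joinValues(false, false, true, true), "takes val2 == true");
```

This is what makes the join context-aware: the joined value still tracks
whichever value belongs to the path actually taken.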

This patch removes the `DataflowAnalysisContext::getSolver` method.
The `DataflowAnalysisContext::flowConditionImplies` method should be
used instead.

Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D124395

2 years ago[X86] (style) Use auto for dyn_cast<> results
Simon Pilgrim [Sun, 1 May 2022 16:15:18 +0000 (17:15 +0100)]
[X86] (style) Use auto for dyn_cast<> results

2 years ago[X86] (style) Don't use auto for non obvious types
Simon Pilgrim [Sun, 1 May 2022 16:10:21 +0000 (17:10 +0100)]
[X86] (style) Don't use auto for non obvious types

2 years ago[SLPVectorizer] Remove weird unicode character from comment. NFCI.
Simon Pilgrim [Sun, 1 May 2022 15:37:21 +0000 (16:37 +0100)]
[SLPVectorizer] Remove weird unicode character from comment. NFCI.

Whatever it was, Visual Assist really didn't like it....

2 years ago[InstCombine] Add test coverage from D124503
Simon Pilgrim [Sun, 1 May 2022 15:09:23 +0000 (16:09 +0100)]
[InstCombine] Add test coverage from D124503

2 years ago[Coroutines] Regenerate coro-retcon-resume-values.ll
Simon Pilgrim [Sun, 1 May 2022 12:21:55 +0000 (13:21 +0100)]
[Coroutines] Regenerate coro-retcon-resume-values.ll

2 years ago[LoopVectorize][X86] Regenerate invariant-store-vectorization.ll
Simon Pilgrim [Sun, 1 May 2022 12:04:20 +0000 (13:04 +0100)]
[LoopVectorize][X86] Regenerate invariant-store-vectorization.ll

2 years ago[analyzer] Fix return of llvm::StringRef to destroyed std::string
Andrew Ng [Fri, 29 Apr 2022 17:00:33 +0000 (18:00 +0100)]
[analyzer] Fix return of llvm::StringRef to destroyed std::string

This issue was discovered whilst testing with ASAN.
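
The general shape of this class of bug, sketched with `std::string_view`
standing in for `llvm::StringRef`. The function names are hypothetical and
the fix shown (returning an owning `std::string`) is just one common
remedy, not necessarily the one used in this patch.

```cpp
#include <string>
#include <string_view>

// Buggy pattern, kept as a comment because using its result is undefined
// behavior:
//
//   std::string_view describeBuggy(int id) {
//     std::string text = "region-" + std::to_string(id);
//     return text; // the view points into `text`, destroyed on return
//   }

// One possible fix: return the owning string so the caller controls the
// lifetime of the character data.
std::string describe(int id) {
  return "region-" + std::to_string(id);
}
```

A caller can still take a view of the result, provided the returned string
is first stored in a variable that outlives the view.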

Differential Revision: https://reviews.llvm.org/D124683

2 years ago[CostModel][X86] Check for 'null op' truncations
Simon Pilgrim [Sun, 1 May 2022 11:03:40 +0000 (12:03 +0100)]
[CostModel][X86] Check for 'null op' truncations

If the legalized src/dst types are the same, assume the "truncation" is free.

This fixes some edge cases such as mul lo/hi ops and bool vectors which will get legalized back to legal vector widths