Eugene Zhulenev [Sun, 1 May 2022 20:28:51 +0000 (13:28 -0700)]
[mlir] CRunnerUtils: qualify UnrankedMemRefType to avoid collisions with mlir::UnrankedMemRefType
When CRunnerUtils is included together with MLIR IR headers, it can lead to compilation errors.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D124744
Harald van Dijk [Mon, 2 May 2022 17:07:47 +0000 (18:07 +0100)]
Mark identifier prefixes as substitutable
The Itanium C++ ABI says prefixes are substitutable. For most prefixes
we already handle this: the manglePrefix(const DeclContext *, bool) and
manglePrefix(QualType) overloads explicitly handle substitutions or
defer to functions that handle substitutions on their behalf. The
manglePrefix(NestedNameSpecifier *) overload, however, is different and
handles some cases implicitly, but not all. The Identifier case was not
handled; this change adds handling for it, as well as a test case.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D122663
Amy Kwan [Mon, 2 May 2022 06:30:10 +0000 (01:30 -0500)]
[PowerPC] Enable CR bits support for Power8 and above.
This patch turns on support for CR bit accesses for Power8 and above. CR bits
are turned on by default for Power8 and above because later architectures make
use of builtins and instructions that require CR bit accesses (such as the use
of setbc in the vector string isolate predicate and bcd builtins on Power10).
This patch also adds the clang portion to allow for turning on CR bits in the
front end if the user so desires.
Differential Revision: https://reviews.llvm.org/D124060
Fangrui Song [Mon, 2 May 2022 17:00:57 +0000 (10:00 -0700)]
[Driver][test] Avoid producing object file in the current directory
Arthur Eubanks [Mon, 2 May 2022 16:21:39 +0000 (09:21 -0700)]
[GlobalOpt] Iterate over replaced values deterministically to constprop
If there are pre-existing dead instructions, the order in which we visit
replaced values can sometimes cause us not to delete dead instructions.
The added test non-deterministically failed without the change.
Fangrui Song [Mon, 2 May 2022 16:35:58 +0000 (09:35 -0700)]
[Driver][test] Add back some -no-canonical-prefixes
To make them meaningful, it's useful to check "clang". Use
-no-canonical-prefixes to support distributions that symlink %clang to an
executable with a filename not ending in "clang".
Nikita Popov [Mon, 2 May 2022 15:52:02 +0000 (17:52 +0200)]
[InstCombine] Handle non-canonical GEP index in indexed compare fold (PR55228)
Normally the index type will already be canonicalized here, but
this is not guaranteed depending on visitation order. The code
was already accounting for a potentially needed sext, but a trunc
may also be needed.
Add a ConstantExpr::getSExtOrTrunc() helper method to make this
simpler. This matches the corresponding IRBuilder method in behavior.
Fixes https://github.com/llvm/llvm-project/issues/55228.
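A minimal sketch of the sext-or-trunc behavior described above, modeled on plain integers rather than LLVM constants. The function name `sextOrTrunc` and its bit-width parameters are illustrative assumptions and not the LLVM API.

```cpp
#include <cstdint>

// Illustrative model only: interpret the low `srcBits` of `value` as a signed
// integer and convert it to `dstBits`, sign-extending when widening and
// truncating when narrowing. Widths are assumed to be in [1, 64].
uint64_t sextOrTrunc(uint64_t value, unsigned srcBits, unsigned dstBits) {
  auto mask = [](unsigned bits) {
    return bits >= 64 ? ~uint64_t{0} : (uint64_t{1} << bits) - 1;
  };
  value &= mask(srcBits);
  if (dstBits > srcBits) {
    // Widening: replicate the source sign bit into the new high bits.
    uint64_t signBit = uint64_t{1} << (srcBits - 1);
    if (value & signBit)
      value |= ~mask(srcBits);
  }
  // Narrowing (or equal width): keep only the low dstBits.
  return value & mask(dstBits);
}
```

With this model, an i8 value of 0xFF widens to 0xFFFFFFFF at 32 bits, while a 16-bit 0x1234 narrows to 0x34 at 8 bits.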
LLVM GN Syncbot [Mon, 2 May 2022 15:51:27 +0000 (15:51 +0000)]
[gn build] Port 5de0a3e9da72
Marco Antognini [Tue, 26 Apr 2022 09:16:36 +0000 (11:16 +0200)]
[Analyzer] Minor cleanups in StreamChecker
Remove unnecessary conversion to Optional<> and incorrect assumption
that BindExpr can return a null state.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D124681
Walter Erquinigo [Wed, 27 Apr 2022 19:13:40 +0000 (12:13 -0700)]
[trace][intelpt] Support system-wide tracing [1] - Add a method for accessing the list of logical core ids
In order to open perf events per core, we need to first get the list of
core ids available in the system. So I'm adding a function that does
that by parsing /proc/cpuinfo. That seems to be the simplest and most
portable way to do that.
Besides that, I made a few refactors and renames to reflect better that
the cpu info that we use in lldb-server comes from procfs.
Differential Revision: https://reviews.llvm.org/D124573
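The /proc/cpuinfo parsing mentioned in the commit above could look roughly like the sketch below. The helper name `parseLogicalCoreIds` is hypothetical, and the sketch assumes the x86-style layout in which each logical core has a `processor : N` line; it is not the lldb-server implementation.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the ids from "processor : N" lines of
// /proc/cpuinfo-style text passed in as a string.
std::vector<int> parseLogicalCoreIds(const std::string &cpuinfo) {
  std::vector<int> ids;
  std::istringstream in(cpuinfo);
  std::string line;
  while (std::getline(in, line)) {
    if (line.rfind("processor", 0) != 0) // line must start with "processor"
      continue;
    std::size_t colon = line.find(':');
    if (colon == std::string::npos)
      continue;
    try {
      ids.push_back(std::stoi(line.substr(colon + 1)));
    } catch (const std::exception &) {
      // Skip lines whose value is not a number.
    }
  }
  return ids;
}
```

Taking the file contents as a string keeps the parsing testable without touching procfs.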
Simon Pilgrim [Mon, 2 May 2022 15:45:39 +0000 (16:45 +0100)]
[X86] Reduce some superfluous diffs between znver1/znver2 models. NFC
znver2 is mainly a search+replace of the znver1 model, but for no reason the HADD and DPPS have been moved around - try to keep these in sync (no actual changes in the models).
Simon Pilgrim [Mon, 2 May 2022 15:20:06 +0000 (16:20 +0100)]
[X86][AMX] combineLdSt - don't dereference dyn_cast. NFC
This leads to null pointer dereference warnings - use cast<> which will assert that the cast is correct.
Marco Antognini [Tue, 19 Apr 2022 11:18:02 +0000 (13:18 +0200)]
[Analyzer] Fix clang::ento::taint::dumpTaint definition
Ensure the definition is in the "taint" namespace, like its declaration.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D124462
Augie Fackler [Thu, 17 Mar 2022 13:54:46 +0000 (09:54 -0400)]
BuildLibCalls: add alloc-family attribute to many allocator functions
Differential Revision: https://reviews.llvm.org/D123086
Erich Keane [Mon, 2 May 2022 13:29:25 +0000 (06:29 -0700)]
Re-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation"
This reverts commit 0c31da48389754822dc3eecc4723160c295b9ab2.
I've solved the issue with the PointerUnion by making the
`FunctionTemplateDecl` pointer a `NamedDecl` that can be either a
`FunctionDecl` or a `FunctionTemplateDecl`. This is enforced
with an assert.
David Green [Mon, 2 May 2022 14:11:44 +0000 (15:11 +0100)]
[LV][SLP] Add tests for vectorizing fptoi_sat intrinsics. NFC
Augie Fackler [Wed, 16 Mar 2022 17:53:18 +0000 (13:53 -0400)]
BuildLibCalls: infer allocptr attribute for free and realloc() family functions
Differential Revision: https://reviews.llvm.org/D123084
Simon Pilgrim [Mon, 2 May 2022 13:39:10 +0000 (14:39 +0100)]
[X86] Replace avx512f integer add reduction builtins with generic builtin
D124741 added the generic "__builtin_reduce_add" which we can use to replace the x86 specific integer add reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required.
Differential Revision: https://reviews.llvm.org/D124757
Erich Keane [Mon, 2 May 2022 13:25:38 +0000 (06:25 -0700)]
Revert "Deferred Concept Instantiation Implementation"
This reverts commit 4b6c2cd647e9e5a147954886338f97ffb6a1bcfb.
The patch caused numerous ARM 32 bit build failures, since we added a
5th item to the PointerUnion, and went over the 2-bits available in the
32 bit pointers.
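The arithmetic behind this failure can be sketched as follows. The helper names are illustrative, and the sketch assumes the tag is stored in the low bits that pointer alignment leaves at zero (4-byte alignment on the 32-bit targets in question).

```cpp
#include <cstddef>

// Number of tag bits needed to distinguish `numTypes` alternatives.
constexpr unsigned tagBitsNeeded(std::size_t numTypes) {
  unsigned bits = 0;
  while ((std::size_t{1} << bits) < numTypes)
    ++bits;
  return bits;
}

// Low bits that are always zero in a pointer with the given
// power-of-two alignment, and are therefore free to hold a tag.
constexpr unsigned freeLowBits(std::size_t alignment) {
  unsigned bits = 0;
  while ((std::size_t{1} << (bits + 1)) <= alignment)
    ++bits;
  return bits;
}

static_assert(tagBitsNeeded(4) == 2, "four alternatives fit in 2 bits");
static_assert(tagBitsNeeded(5) == 3, "a fifth alternative needs a third bit");
static_assert(freeLowBits(4) == 2, "4-byte alignment leaves 2 free bits");
static_assert(freeLowBits(8) == 3, "8-byte alignment leaves 3 free bits");
```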
Nikita Popov [Mon, 2 May 2022 13:23:13 +0000 (15:23 +0200)]
[CodeGen] Add tests for X+(Y&~X) pattern (NFC)
Sanjay Patel [Mon, 2 May 2022 13:18:12 +0000 (09:18 -0400)]
[AArch64] add tests for int->FP->int casts; NFC
Copied from x86 tests for multi-target coverage.
Also, provides coverage for target-specific asm
testing for Alive2 or its follow-ons.
See #55150 and D124692
Sanjay Patel [Mon, 2 May 2022 12:23:52 +0000 (08:23 -0400)]
[x86] add tests for int->FP->int casts; NFC
Adapted from tests for IR in D124692.
Also see #55150
Sanjay Patel [Mon, 2 May 2022 12:12:43 +0000 (08:12 -0400)]
[x86] update test file with complete auto-generated check lines; NFC
Also, improve test names.
Erich Keane [Thu, 3 Mar 2022 16:27:49 +0000 (08:27 -0800)]
Deferred Concept Instantiation Implementation
As reported here: https://github.com/llvm/llvm-project/issues/44178
Concepts are not supposed to be instantiated until they are checked, so
this patch implements that and goes through significant amounts of work
to make sure we re-instantiate the concepts correctly.
Differential Revision: https://reviews.llvm.org/D119544
Ulrich Weigand [Mon, 2 May 2022 12:35:29 +0000 (14:35 +0200)]
[libunwind] Add SystemZ support
Add support for the SystemZ (s390x) architecture to libunwind.
Support should be feature-complete with the exception of
unwinding from signal handlers (to be added later).
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D124248
Simon Pilgrim [Mon, 2 May 2022 11:50:32 +0000 (12:50 +0100)]
[X86] MOVDDUP has the same sched behaviour as MOVSHDUP/MOVSLDUP on Skylake
Fixes an old TODO - confirmed on Agner + uops.info
Nikita Popov [Mon, 2 May 2022 11:22:39 +0000 (13:22 +0200)]
[InstCombine] Add tests for A+(B&~A) and A+((A&B)^B) (NFC)
Nico Weber [Mon, 2 May 2022 11:23:10 +0000 (07:23 -0400)]
[gn build] (manually) port fb7a435492a5
Simon Pilgrim [Mon, 2 May 2022 11:17:11 +0000 (12:17 +0100)]
[SLP][X86] Add test coverage for PR41892
Dmitry Vyukov [Wed, 27 Apr 2022 07:52:33 +0000 (09:52 +0200)]
tsan: model atomic read for failing CAS
See the added test and https://github.com/google/sanitizers/issues/1520
for the description of the problem.
The standard says that failing CAS is a memory load only,
model it as such to avoid false positives.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D124507
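For reference, the load-only behavior of a failing CAS in the tsan commit above can be observed with `std::atomic`. The wrapper name `tryUpdate` and the chosen memory orders are illustrative assumptions.

```cpp
#include <atomic>

// Returns true if the CAS succeeded. On failure the atomic is left
// unmodified and `expected` receives the value that was loaded.
bool tryUpdate(std::atomic<int> &value, int &expected, int desired) {
  return value.compare_exchange_strong(expected, desired,
                                       std::memory_order_acq_rel,
                                       std::memory_order_acquire);
}
```

The second memory order applies only to the failure path, which is why it has to be a load ordering.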
David Green [Mon, 2 May 2022 10:36:05 +0000 (11:36 +0100)]
[AArch64] Cost modelling for fptoi_sat
This builds on top of the target-independent cost model added in D124269
to add aarch64 specific costs for fptoui_sat and fptosi_sat intrinsics.
For many common types they will be legal instructions as the AArch64
instructions will saturate naturally. For unsupported pairs of integer
and floating point types, an additional min/max clamp is needed.
Differential Revision: https://reviews.llvm.org/D124357
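As a scalar illustration of what the saturating conversions in the commit above compute, the sketch below models a float to signed 32-bit conversion with an explicit clamp. The function name `fptosiSatI32` is hypothetical. The explicit comparisons stand in for the min/max clamp that is needed when the hardware instruction does not saturate on its own.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of a saturating float -> int32 conversion:
// NaN maps to 0, out-of-range values clamp to the nearest bound.
int32_t fptosiSatI32(float x) {
  if (std::isnan(x))
    return 0;
  if (x >= 2147483648.0f) // 2^31, the first value above INT32_MAX
    return std::numeric_limits<int32_t>::max();
  if (x <= -2147483648.0f) // -2^31 == INT32_MIN
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(x); // in range: truncate toward zero
}
```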
Simon Pilgrim [Mon, 2 May 2022 10:03:19 +0000 (11:03 +0100)]
[Clang] Add integer add reduction builtin
Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.add intrinsic call.
For other reductions, we've tried to share builtins for float/integer vectors, but the fadd reduction intrinsics also take a starting value argument and can do either unordered or serialized reductions, but not reduction trees as specified for the builtins. However we end up addressing fadd support, this shouldn't affect the integer case.
(Split off from D117829)
Differential Revision: https://reviews.llvm.org/D124741
Balazs Benics [Mon, 2 May 2022 09:48:52 +0000 (11:48 +0200)]
[analyzer] Allow CFG dumps in release builds
This is a similar commit to D124442, but for CFG dumps.
The binary size diff remained the same as demonstrated in that patch.
This time I'm adding tests for demonstrating that all the dump debug
checkers work - even in regular builds without asserts.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D124443
Balazs Benics [Mon, 2 May 2022 09:42:08 +0000 (11:42 +0200)]
[analyzer] Allow exploded graph dumps in release builds
Historically, exploded graph dumps were disabled in non-debug builds.
It was done so probably because a regular user should not dump the
internal representation of the analyzer anyway and the dump methods
might introduce unnecessary binary size overhead.
It turns out some of the users actually want to dump this.
Note that e.g. `LiveExpressionsDumper`, `LiveVariablesDumper`,
`ControlDependencyTreeDumper` etc. worked previously, and they are
unaffected by this change.
However, `CFGViewer` and `CFGDumper` still won't work for a similar
reason. AFAIK only these two won't work after this change.
Addresses #53873
---
**baseline**
| binary | size | size after strip |
| clang | 103M | 83M |
| clang-tidy | 67M | 54M |
**after this change**
| binary | size | size after strip |
| clang | 103M | 84M |
| clang-tidy | 67M | 54M |
CMake configuration:
```
cmake -S llvm -GNinja -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang \
  -DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_USE_LINKER=lld \
  -DLLVM_ENABLE_DUMP=OFF -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" \
  -DLLVM_ENABLE_Z3_SOLVER=ON -DLLVM_TARGETS_TO_BUILD="X86"
```
Built by `clang-14.0.0`.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D124442
Simon Pilgrim [Mon, 2 May 2022 08:58:35 +0000 (09:58 +0100)]
[CostModel][X86] getScalarizationOverhead - handle vXi1 extracts with MOVMSK (pre-AVX512)
We can quickly extract multiple elements of a bool vector using MOVMSK ops - since we don't know what generated the vXi1, I've been optimistic and assumed we can use PMOVMSKB to extract the maximum number of bools with a single op.
The MOVMSK pattern isn't great for extract+insert round trips as vXi1 type legalization can interfere with this a lot - so this relies on us remaining good at using getScalarizationOverhead properly (and tagging both Insert and Extract modes) for those round trip cases.
The AVX512 KMOV codegen for bool extraction is a bit of a mess so for now I've not included that - the per-element cost is a lot more accurate for current codegen.
Balazs Benics [Mon, 2 May 2022 08:54:26 +0000 (10:54 +0200)]
[analyzer] Fix cast evaluation on scoped enums in ExprEngine
We ignored the cast if the enum was scoped.
This is bad since there is no implicit conversion from the scoped enum to the corresponding underlying type.
The fix is basically: isIntegralOrEnumerationType() -> isIntegralOr**Unscoped**EnumerationType()
This materialized in crashes when analyzing LLVM itself using the Z3 refutation.
Refutation synthesized the given Z3 Binary expression (`BO_And` of `unsigned char` aka. 8 bits
and an `int` 32 bits) with the wrong bitwidth in the end, which triggered an assert.
Now, we evaluate the cast according to the standard.
This bug could have been triggered using the Z3 CM according to
https://bugs.llvm.org/show_bug.cgi?id=44030
Fixes #47570 #43375
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D85528
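The language rule that the scoped enum fix above relies on can be checked at compile time. The enum and function names below are illustrative; the sketch only shows that a scoped enum has no implicit conversion to an integer, so the explicit cast must actually be evaluated.

```cpp
#include <type_traits>

enum Unscoped : unsigned char { U0, U1 };
enum class Scoped : unsigned char { S0, S1 };

// Unscoped enums convert implicitly to integers; scoped enums do not.
static_assert(std::is_convertible<Unscoped, int>::value,
              "unscoped enum converts implicitly");
static_assert(!std::is_convertible<Scoped, int>::value,
              "scoped enum needs an explicit cast");

// The explicit cast widens the 8-bit underlying value to int before the
// bitwise AND, so both operands have the same width.
int maskLowBit(Scoped s) {
  return static_cast<int>(s) & 1;
}
```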
Nikita Popov [Fri, 22 Apr 2022 08:53:43 +0000 (10:53 +0200)]
[Local] Consider atomic loads from constant global as dead
Per the guidance in
https://llvm.org/docs/Atomics.html#atomics-and-ir-optimization,
an atomic load from a constant global can be dropped, as there can
be no stores to synchronize with. Any write to the constant global
would be UB.
IPSCCP will already drop such loads, but the main helper in Local
doesn't recognize this currently. This is motivated by D118387.
Differential Revision: https://reviews.llvm.org/D124241
Shraiysh Vaishay [Mon, 2 May 2022 05:24:28 +0000 (10:54 +0530)]
[mlir][OpenMP] Restrict types for omp.parallel args
This patch restricts the value of `if` clause expression to an I1 value.
It also restricts the value of `num_threads` clause expression to an I32
value.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D124142
owenca [Thu, 28 Apr 2022 01:01:49 +0000 (18:01 -0700)]
[clang-format] Fix a bug that misformats Access Specifier after *[]
Fixes #55132.
Differential Revision: https://reviews.llvm.org/D124589
Balazs Benics [Mon, 2 May 2022 08:37:23 +0000 (10:37 +0200)]
[analyzer][docs] Document alpha.security.cert.pos.34c limitations
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D124659
Balazs Benics [Mon, 2 May 2022 08:35:51 +0000 (10:35 +0200)]
[analyzer] Fix Static Analyzer g_memdup false-positive
`g_memdup()` allocates and copies memory, thus we should not assume that
the returned memory region is uninitialized because it might not be the
case.
PS: It would be even better to copy the bindings to mimic the actual
content of the buffer, but this works too.
Fixes #53617
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D124436
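To illustrate why the buffer returned by `g_memdup()` should not be treated as uninitialized, here is a hypothetical stand-in written against the C standard library. `memdupSketch` is not the GLib implementation; it only mirrors the allocate-then-copy behavior the commit above describes.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for g_memdup(): allocate `size` bytes and copy
// `src` into them. The result is initialized whenever the source is.
void *memdupSketch(const void *src, std::size_t size) {
  if (src == nullptr || size == 0)
    return nullptr;
  void *copy = std::malloc(size);
  if (copy != nullptr)
    std::memcpy(copy, src, size);
  return copy;
}
```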
Nikita Popov [Fri, 29 Apr 2022 15:22:54 +0000 (17:22 +0200)]
[ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr
ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)"
to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by
itself, correct, but does come with two issues:
1. It unnecessarily broadens provenance by introducing an inttoptr.
We generally prefer not to introduce inttoptr during optimization.
2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0,
which further folds to null. In that case provenance becomes
incorrect. This has been observed as a real-world miscompile with
rustc.
We should probably address that incorrect inttoptr 0 fold at some
point, but in either case we should also drop this inttoptr-introducing
fold. Instead, replace it with a fold rooted at
ptrtoint(getelementptr), which seems to cover the original
motivation for this fold (test2 in the changed file).
Differential Revision: https://reviews.llvm.org/D124677
David Green [Mon, 2 May 2022 08:16:57 +0000 (09:16 +0100)]
[AArch64] Add more comprehensive reverse shuffle costmodel tests. NFC
Alex Zinenko [Fri, 29 Apr 2022 15:13:24 +0000 (17:13 +0200)]
[mlir] support isa/cast/dyn_cast<Operation *>(operation)
This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:
```c++
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
  for (Operation *op : ...) {
    // `typename` is required because arg_t<0> is a dependent type.
    auto specific =
        dyn_cast<typename function_traits<FnTy>::template arg_t<0>>(op);
    if (specific)
      lambda(specific);
  }
}
```
that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124675
Phoebe Wang [Mon, 2 May 2022 05:29:34 +0000 (13:29 +0800)]
[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion
X86 codegen uses the function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect the user's requirement when passing or returning vector arguments. So the Clang front-end iterates over the vector arguments and sets `min-legal-vector-width` to the maximum width for both caller and callee.
It is assumed that middle-end optimizations won't care about the attribute, except for inlining and argument promotion.
- For inlining, we propagate the attribute of the inlined functions because the function they are inlined into becomes the new caller.
- For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match.
The problem comes from the combination of optimizations, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees, `bar` and `baz`. When doing argument promotion, both `foo` and `bar` have the same `min-legal-vector-width`, so the argument is promoted to a vector. Then inlining inlines `baz` into `foo` and updates `min-legal-vector-width`, which results in an ABI mismatch between `foo` and `bar`.
This patch fixes the problem by expanding the concept of `min-legal-vector-width` to an indicator of function arguments. That is, any pass that touches function arguments has to set `min-legal-vector-width` to a value that reflects the width of the vector arguments. This makes sense to me because any modification of arguments is ABI related and should be responsible for ABI compatibility.
Differential Revision: https://reviews.llvm.org/D123284
Shraiysh Vaishay [Mon, 2 May 2022 05:11:46 +0000 (10:41 +0530)]
[flang] Added tests for taskwait and taskyield translation
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D124229
Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>
Congzhe Cao [Mon, 2 May 2022 04:49:11 +0000 (00:49 -0400)]
[LoopCacheAnalysis] Use stable_sort() to avoid non-deterministic print output
The print output of loop cache analysis sometimes has a non-deterministic order
and therefore we have been using `CHECK-DAG` in its lit tests. This patch changes
the sorting of LoopCosts to llvm::stable_sort() where we compare loop cost numbers
and sort the loops. In case of the same loop cost numbers, llvm::stable_sort() now
would output a deterministic loop order.
Reviewed By: Meinersbur, fhahn, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124725
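The effect of switching to a stable sort in the commit above can be sketched with `std::stable_sort` standing in for `llvm::stable_sort`. The `LoopCost` alias, the function name, and the descending order are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using LoopCost = std::pair<std::string, int>; // (loop name, cost), illustrative

// Sort by descending cost only. A stable sort keeps loops with equal cost
// in their original relative order, so the output order is deterministic.
std::vector<LoopCost> sortByCost(std::vector<LoopCost> loops) {
  std::stable_sort(loops.begin(), loops.end(),
                   [](const LoopCost &a, const LoopCost &b) {
                     return a.second > b.second;
                   });
  return loops;
}
```

Because ties keep their input order, tests can check the printed order directly instead of relying on `CHECK-DAG`.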
Ben Shi [Thu, 21 Apr 2022 09:41:42 +0000 (09:41 +0000)]
[clang][preprocessor] Add more macros to target AVR
Reviewed By: MaskRay, aykevl
Differential Revision: https://reviews.llvm.org/D124157
Brad Smith [Mon, 2 May 2022 04:26:41 +0000 (00:26 -0400)]
[Driver][Ananas] -r: imply -nostdlib like GCC
Similar to D116843 for Gnu.cpp
Reviewed By: zhmu, MaskRay
Differential Revision: https://reviews.llvm.org/D124729
Fangrui Song [Mon, 2 May 2022 03:44:13 +0000 (20:44 -0700)]
[Driver][test] Remove unneeded -no-canonical-prefixes and use preferred --target=
Similar to D119309
Ben Shi [Wed, 6 Apr 2022 10:45:50 +0000 (10:45 +0000)]
[compiler-rt][builtins] Add several helper functions for AVR
__mulqi3 : int8 multiplication
__mulhi3 : int16 multiplication
_exit : global terminator
Reviewed By: MaskRay, aykevl
Differential Revision: https://reviews.llvm.org/D123200
LLVM GN Syncbot [Sun, 1 May 2022 22:32:29 +0000 (22:32 +0000)]
[gn build] Port 3939e99aae68
Matt Arsenault [Tue, 19 Apr 2022 13:12:45 +0000 (09:12 -0400)]
llvm-reduce: Fix not removing first instruction in MachineBasicBlock
This had the surprising behavior of using whatever instruction
happened to be first in the block as an anchor point to stick random
implicit defs on. Use a real implicit_def instead.
Matt Arsenault [Tue, 19 Apr 2022 21:19:36 +0000 (17:19 -0400)]
llvm-reduce: Introduce new scoring mechanism for MIR reductions
Many MIR reductions benefit from or require increasing the instruction
count. For example, unlike in the IR, you may need to insert a new
instruction to represent an undef. The current instruction reduction
pass works around this by sticking implicit defs on whatever
instruction happens to be first in the entry block.
Other strategies I've applied manually include breaking instructions
with multiple defs into separate instructions, or breaking large
register defs into multiple subregister defs.
Make up a simple scoring system based on what I generally try to get
rid of first when manually reducing. Counts implicit defs as free
since reduction passes will be introducing them, although they
probably should count for something. It also might make more sense to
have a comparison of the two functions, rather than having to compute a
contextless number. This isn't particularly well tested since overall
the MIR support isn't in a place where it is useful on the kinds of
testcases I want to throw at it.
Matt Arsenault [Mon, 25 Apr 2022 12:58:39 +0000 (08:58 -0400)]
llvm-reduce: Do not try to delete frame instructions
The verifier enforces these appearing as balanced pairs, so just
deleting one has no real chance of producing something valid.
Matt Arsenault [Tue, 19 Apr 2022 16:10:38 +0000 (12:10 -0400)]
llvm-reduce: Add pass to reduce IR references from MIR
This is typically the first thing I do when reducing a new testcase
until the IR section can be deleted.
Fangrui Song [Sun, 1 May 2022 21:13:54 +0000 (14:13 -0700)]
[RISCV] Lower case the first letter of LowerRISCVMachineOperandToMCOperand. NFC
Sylvestre Ledru [Sun, 1 May 2022 20:59:36 +0000 (22:59 +0200)]
doc: update of the adv build doc now that clang is in tree too
And be more consistent in the declarations
River Riddle [Tue, 26 Apr 2022 20:38:21 +0000 (13:38 -0700)]
[mlir:PDLInterp] Refactor the implementation of result type inference
The current implementation uses a discrete "pdl_interp.inferred_types"
operation, which acts as a "fake" handle to a type range. This op is
used as a signal to pdl_interp.create_operation that types should be
inferred. This is terribly awkward and clunky though:
* This op doesn't have a byte code representation, and its conversion
to bytecode kind of assumes that it is only used in a certain way. The
current lowering is also broken and seemingly untested.
* Given that this is a different operation, it gives off the assumption
that it can be used multiple times, or that after the first use
the value contains the inferred types. This isn't the case though,
the resultant type range can never actually be used as a type range.
This commit refactors the representation by removing the discrete
InferredTypesOp, and instead adds a UnitAttr to
pdl_interp.CreateOperation that signals when the created operations
should infer their types. This leads to a much much cleaner abstraction,
a more optimal bytecode lowering, and also allows for better error
handling and diagnostics when a created operation doesn't actually
support type inference.
Differential Revision: https://reviews.llvm.org/D124587
Florian Hahn [Sun, 1 May 2022 19:11:05 +0000 (20:11 +0100)]
[SimpleLoopUnswitch] Freeze individual OR/AND operands.
In some cases, it is not enough to freeze the final AND/OR operation
when chaining a number of invariant conditions together.
After creating a chain of ANDs/ORs, we assume all unswitched operands to
be either true or false. But if any of the operands is poison, the rest
of the operands could have any value after branching on the frozen
condition.
To avoid that, freeze individual operands, if needed. In some cases this
may lead to unnecessary freezes, but it seems required at least for some
cases (see trivial-unswitch-freeze-individual-conditions.ll)
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D124554
Simon Pilgrim [Sun, 1 May 2022 19:09:05 +0000 (20:09 +0100)]
[VectorCombine] Merge isa<>/cast<> into dyn_cast<>. NFC.
We want to handle the assert in VectorCombine so avoid the repeated isa/cast code.
Michael Kruse [Sun, 1 May 2022 18:32:42 +0000 (13:32 -0500)]
[Polly] Fix test after D119669.
Simon Pilgrim [Sun, 1 May 2022 16:56:54 +0000 (17:56 +0100)]
[DAG] (style) Break apart if-else chain as they all return
Stanislav Gatev [Mon, 25 Apr 2022 15:23:42 +0000 (15:23 +0000)]
[clang][dataflow] Optimize flow condition representation
Enable efficient implementation of context-aware joining of distinct
boolean values. It can be used to join distinct boolean values while
preserving flow condition information.
Flow conditions are represented as Token <=> Clause iff formulas. To
perform context-aware joining, one can simply add the tokens of flow
conditions to the formula when joining distinct boolean values, e.g:
`makeOr(makeAnd(FC1, Val1), makeAnd(FC2, Val2))`. This significantly
simplifies the implementation of `Environment::join`.
This patch removes the `DataflowAnalysisContext::getSolver` method.
The `DataflowAnalysisContext::flowConditionImplies` method should be
used instead.
Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D124395
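The join formula in the dataflow commit above can be evaluated on concrete booleans to see how the flow condition tokens select the value of the path that was actually taken. This is a toy model with a hypothetical function name; the analysis builds symbolic formulas rather than evaluating `bool` values.

```cpp
// Boolean model of makeOr(makeAnd(FC1, Val1), makeAnd(FC2, Val2)):
// fc1/fc2 are the flow condition tokens of the two incoming paths,
// val1/val2 are the values on those paths.
bool joinValues(bool fc1, bool val1, bool fc2, bool val2) {
  return (fc1 && val1) || (fc2 && val2);
}
```

When exactly one flow condition holds, the joined value equals the value from that path.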
Simon Pilgrim [Sun, 1 May 2022 16:15:18 +0000 (17:15 +0100)]
[X86] (style) Use auto for dyn_cast<> results
Simon Pilgrim [Sun, 1 May 2022 16:10:21 +0000 (17:10 +0100)]
[X86] (style) Don't use auto for non obvious types
Simon Pilgrim [Sun, 1 May 2022 15:37:21 +0000 (16:37 +0100)]
[SLPVectorizer] Remove weird unicode character from comment. NFCI.
Whatever it was, Visual Assist really didn't like it....
Simon Pilgrim [Sun, 1 May 2022 15:09:23 +0000 (16:09 +0100)]
[InstCombine] Add test coverage from D124503
Simon Pilgrim [Sun, 1 May 2022 12:21:55 +0000 (13:21 +0100)]
[Coroutines] Regenerate coro-retcon-resume-values.ll
Simon Pilgrim [Sun, 1 May 2022 12:04:20 +0000 (13:04 +0100)]
[LoopVectorize][X86] Regenerate invariant-store-vectorization.ll
Andrew Ng [Fri, 29 Apr 2022 17:00:33 +0000 (18:00 +0100)]
[analyzer] Fix return of llvm::StringRef to destroyed std::string
This issue was discovered whilst testing with ASAN.
Differential Revision: https://reviews.llvm.org/D124683
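The class of bug fixed by the commit above can be illustrated with `std::string_view` standing in for `llvm::StringRef`. All names below are hypothetical, and the buggy variant is shown only as a comment so that the sketch stays well-defined.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

std::string makeName(int id) { return "region_" + std::to_string(id); }

// Buggy pattern (do not use): the temporary std::string is destroyed at the
// end of the return statement, so the returned view would dangle.
//   std::string_view nameView(int id) { return makeName(id); }

// Safe alternative: return the owning string by value.
std::string nameOwned(int id) { return makeName(id); }

// Views are fine as long as the owner outlives them.
std::size_t nameLength(const std::string &owner) {
  std::string_view view = owner; // owner is alive for the whole function
  return view.size();
}
```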
Simon Pilgrim [Sun, 1 May 2022 11:03:40 +0000 (12:03 +0100)]
[CostModel][X86] Check for 'null op' truncations
If the legalized src/dst types are the same, assume the "truncation" is free.
This fixes some edge cases such as mul lo/hi ops and bool vectors which will get legalized back to legal vector widths
Nikolas Klauser [Fri, 29 Apr 2022 09:17:58 +0000 (11:17 +0200)]
[libc++][NFC] Replace _LIBCPP_INLINE_VISIBILITY and _VSTD in <string>
Replace all the instances of `_LIBCPP_INLINE_VISIBILITY` with `_LIBCPP_HIDE_FROM_ABI` and `_VSTD` with `std`.
Reviewed By: Mordante, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D124662
PeixinQiao [Sun, 1 May 2022 10:40:17 +0000 (18:40 +0800)]
[flang] Add one semantic check for implicit interface
Per Fortran 2018 C1533, a nonintrinsic elemental procedure shall not be
used as an actual argument. The semantic check for the implicit interface
case was missing.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D124379
sstwcw [Sun, 1 May 2022 08:39:11 +0000 (08:39 +0000)]
[clang-format] Take out common code for parsing blocks NFC
Differential Revision: https://reviews.llvm.org/D121757
Simon Pilgrim [Sun, 1 May 2022 08:32:14 +0000 (09:32 +0100)]
[CostModel][X86] Reduce cost of vector selects on SSE2/AVX1 targets
Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput
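For reference, the OR(AND(X,M),AND(Y,~M)) expansion is a per-bit select. The scalar sketch below uses a hypothetical function name and a 32-bit lane. It shows only the logic being costed and makes no claim about actual instruction throughput.

```cpp
#include <cstdint>

// Per-bit select: take bits of x where the mask is 1, bits of y where it is 0.
uint32_t bitSelect(uint32_t x, uint32_t y, uint32_t m) {
  return (x & m) | (y & ~m);
}
```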
Nathan James [Sun, 1 May 2022 06:41:04 +0000 (07:41 +0100)]
[clang-tidy][NFC] Re-alphabetize the clang tidy release notes
Jack Andersen [Sat, 30 Apr 2022 22:40:04 +0000 (18:40 -0400)]
[CAPI] Expose CastInst::getCastOpcode in C API
Reviewed By: deadalnix
Differential Revision: https://reviews.llvm.org/D91514
Dmitry Vassiliev [Sat, 30 Apr 2022 19:55:20 +0000 (21:55 +0200)]
[NVPTX] Prefix "$L__" for branch label names
A global variable may have the same name as a label, and ptxas does not accept it.
Prefix labels with $L__ to fix this.
Reviewed By: MaskRay, tra
Differential Revision: https://reviews.llvm.org/D119669
Florian Hahn [Sat, 30 Apr 2022 19:43:22 +0000 (20:43 +0100)]
[LV] Add test for interleaving multiple iterations with call.
Simon Pilgrim [Sat, 30 Apr 2022 18:56:41 +0000 (19:56 +0100)]
[PhaseOrdering][X86] Use passes="" instead of passes='' so DOS can evaluate the cmd lines
Fix regenerating the tests on windows builds
Florian Hahn [Sat, 30 Apr 2022 18:53:36 +0000 (19:53 +0100)]
[SimpleLoopUnswitch] Freeze trivial conditions if needed.
Trivial unswitching can also introduce new branches on undef/poison.
Freeze the conditions if needed.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D124549
Simon Pilgrim [Sat, 30 Apr 2022 18:53:07 +0000 (19:53 +0100)]
[PhaseOrdering][X86] Use passes="default<O3>" instead of passes='default<O3>' so DOS can evaluate the cmd lines
Fix regenerating the tests on windows builds
Simon Pilgrim [Sat, 30 Apr 2022 18:47:03 +0000 (19:47 +0100)]
[SLP][X86] extractelement tests - use -mattr=avx2 instead of a -march flag
Paul Walker [Sat, 30 Apr 2022 17:58:31 +0000 (18:58 +0100)]
[LegalizeDAG] Fix TypeSize conversion error when expanding SIGN_EXTEND_INREG
SIGN_EXTEND_INREG expansion can trigger a TypeSize error because
"VT.getSizeInBits() == 1" is used to detect for a boolean without
first verifying VT is a scalar.
Craig Topper [Sat, 30 Apr 2022 18:01:55 +0000 (11:01 -0700)]
[DAGCombiner] When matching a disguised rotate by constant don't forget to apply LHSMask/RHSMask.
We try to match as a disguised rotate by constant of these forms
(shl (X | Y), C1) | (srl X, C2) --> (rotl X, C1) | (shl Y, C1)
(shl X, C1) | (srl (X | Y), C2) --> (rotl X, C1) | (srl Y, C2)
We may have also looked through an AND to find the shift. If we
did, we need to apply a mask to the result.
I'll add an AArch64 test and pre-commit it and the RISC-V test
tomorrow.
Fixes PR55201.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D124711
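The first rewrite in the DAGCombiner commit above can be checked on 32-bit scalars. The sketch assumes C1 + C2 equals the bit width, which is what makes the shift pair a rotate. The function names are illustrative, and the extra masking needed when an AND was looked through is not modeled here.

```cpp
#include <cstdint>

// Rotate left, valid for 0 < c < 32.
uint32_t rotl32(uint32_t x, unsigned c) { return (x << c) | (x >> (32 - c)); }

// Left-hand side of the first pattern, with C2 == 32 - C1:
//   (shl (X | Y), C1) | (srl X, C2)
uint32_t disguised(uint32_t x, uint32_t y, unsigned c1) {
  return ((x | y) << c1) | (x >> (32 - c1));
}

// Right-hand side: (rotl X, C1) | (shl Y, C1).
uint32_t rewritten(uint32_t x, uint32_t y, unsigned c1) {
  return rotl32(x, c1) | (y << c1);
}
```

The two forms agree because the shift distributes over the OR, leaving the shl/srl pair on X to form the rotate.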
Craig Topper [Sat, 30 Apr 2022 17:59:14 +0000 (10:59 -0700)]
[RISCV][AArch64] Pre-commit tests for D124711. NFC
Aaron Ballman [Sat, 30 Apr 2022 17:36:59 +0000 (13:36 -0400)]
Accept -fno-knr-functions as a driver flag as well
Due to a think-o, it was only being accepted as a -cc1 flag. This adds
the proper forwarding from the driver to the frontend and adds test
coverage for the option.
luxufan [Sat, 30 Apr 2022 06:57:10 +0000 (14:57 +0800)]
[RISCV] Don't getDebugLoc for the end node of MBB iterator
Because of shrink wrapping, the block in which the epilogue is inserted may
have no instructions other than debug instructions, and the insertion
position may point to MBB.end(), which doesn't have a DebugLoc. This patch
fixes this problem.
The test program was copied from the issue: https://github.com/llvm/llvm-project/issues/53662
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D123679
Saleem Abdulrasool [Wed, 27 Apr 2022 03:12:48 +0000 (20:12 -0700)]
AArch64: modify Swift async frame record storage on Windows
The frame layout on Windows differs from that on other platforms. It
will spill the registers in descending numeric value (i.e. x30, x29,
...). Furthermore, the x29, x30 pair is particularly important as it
is used for the fast stack walking. As a result, we cannot simply
insert the Swift async frame record in between the stores. To keep the
search mechanism simple, always spill the async frame record
prior to the spilled registers.
This was caught by the assertion failure in the frame lowering code when
building the runtime for Windows AArch64.
Fixes: #55058
Differential Revision: https://reviews.llvm.org/D124498
Reviewed By: mstorsjo
Aaron Ballman [Sat, 30 Apr 2022 13:53:49 +0000 (09:53 -0400)]
Generalize calls to ImplicitlyDefineFunction
In C++ and C2x, we would avoid calling ImplicitlyDefineFunction at all,
but in OpenCL mode we would still call the function and have it produce
an error diagnostic. Instead, we now have a helper function to
determine when implicit function definitions are allowed and we use
that to determine whether to call ImplicitlyDefineFunction so that the
behavior is more consistent across language modes.
This changes the diagnostic behavior from telling the users that an
implicit function declaration is not allowed in OpenCL to reporting use
of an unknown identifier and going through typo correction, as done in
C++ and C2x.
Arjun P [Fri, 29 Apr 2022 11:37:13 +0000 (12:37 +0100)]
[MLIR][Presburger] subtraction: add support for divs defined by equalities
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D124668
Mark de Wever [Sat, 30 Apr 2022 11:17:17 +0000 (13:17 +0200)]
Revert "[msan][libcxx] Enable -fsanitize-memory-param-retval"
This reverts commit beff64ee44acec4e7bfbc2ab165acba7579a6bb7.
The original commit was reviewed as D123979.
This commit caused the libc++ pre-commit CI to fail
https://buildkite.com/llvm-project/libcxx-ci/builds/10483
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D124713
Juneyoung Lee [Tue, 26 Apr 2022 00:57:25 +0000 (09:57 +0900)]
[InstCombine] Remove the undef-related workaround code in visitSelectInst
This patch removes an old hack in visitSelectInst that was written to avoid miscompilation bugs in loop unswitch.
(Added via https://reviews.llvm.org/D35811)
The legacy loop unswitch pass will be removed after D124376, and the new simple loop unswitch pass correctly uses freeze to avoid introducing UB after D124252.
Since the hack is not necessary anymore, this patch removes it.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D124426
Juneyoung Lee [Wed, 27 Apr 2022 10:07:32 +0000 (19:07 +0900)]
Add a pre-commit test for D124426
Simon Pilgrim [Sat, 30 Apr 2022 10:00:28 +0000 (11:00 +0100)]
[X86] lowerShuffleAsRepeatedMaskAndLanePermute - permit 32-bit sublane permute for unary v32i8 cases
Increase the likelihood that we can lower to a permd(pshufb()) pattern, but only after we've attempted 64-bit sublane permutes first.
Fixes #55066
Sam McCall [Sat, 30 Apr 2022 09:02:31 +0000 (11:02 +0200)]
Reland [clangd] More precisely enable clang warnings through ClangTidy options
This reverts commit 26c82f3d1de11cdada57e499b63a05d24e18b656.
When tests enable 'Checks: *', we may get extra diagnostics.
NAKAMURA Takumi [Sat, 30 Apr 2022 08:10:40 +0000 (17:10 +0900)]
ClangDriverTests:ToolChainTest.cpp: Fix warnings. [-Wsign-compare]
EXPECT_EQ(num,num) is aware of signedness, even if rhs is a constant.
Yeting Kuo [Sun, 27 Mar 2022 11:35:10 +0000 (19:35 +0800)]
[RISCV] Add DAGCombine to fold base operation and reduction.
Transform (<bop> x, (reduce.<bop> vec, splat(neutral_element))) to
(reduce.<bop> vec, splat (x)).
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122563
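The fold above relies on the base operation being associative and the reduction's start value being its neutral element. A rough sketch in Python, assuming the reduction's start operand can be modeled as a plain scalar rather than a splat vector; `before_fold`, `after_fold` and `OPS` are hypothetical names:

```python
from functools import reduce
import operator

# Base operations paired with their neutral elements (integer semantics).
OPS = [
    (operator.add, 0),
    (operator.and_, -1),  # all ones
    (operator.or_, 0),
    (operator.xor, 0),
]


def before_fold(bop, neutral, x, vec):
    # (<bop> x, (reduce.<bop> vec, splat(neutral_element)))
    return bop(x, reduce(bop, vec, neutral))


def after_fold(bop, x, vec):
    # (reduce.<bop> vec, splat(x)): x becomes the start value of the reduction
    return reduce(bop, vec, x)
```

For integer add with x = 5 and vec = [1, 2, 3], both forms give 11; the folded form simply saves the separate scalar operation.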
Amir Ayupov [Sat, 30 Apr 2022 03:37:32 +0000 (20:37 -0700)]
[BOLT][NFC] Reduce Target/{AArch64,X86} dependencies
We don't actually depend on the entire X86/AArch64 components that pull in CodeGen,
SelectionDAG etc., just the Desc part with opcode and other definitions.
Note that it doesn't decouple BOLT from these components - we still pull in X86
and AArch64 from top-level llvm-bolt dependencies as we use assembler and
disassembler. It's difficult to reduce these as this requires non-trivial
changes to X86/AArch64 components themselves (e.g. moving out AsmPrinter).
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124206