David Green [Wed, 17 Mar 2021 10:57:50 +0000 (10:57 +0000)]
[LV] Account for the cost of predication of scalarized load/store
This adds the cost of an i1 extract and a branch to the cost in
getMemInstScalarizationCost when the instruction is predicated. These
predicated loads/store would generate blocks of something like:
%c1 = extractelement <4 x i1> %C, i32 1
br i1 %c1, label %if, label %else
if:
%sa = extractelement <4 x i32> %a, i32 1
%sb = getelementptr inbounds float, float* %pg, i32 %sa
%sv = extractelement <4 x float> %x, i32 1
store float %sa, float* %sb, align 4
else:
So this increases the cost by the extract and branch. This is probably
still too low in many cases due to the cost of all that branching, but
there is already an existing hack increasing the cost using
useEmulatedMaskMemRefHack. It will increase the cost of a memop if it is
a load or there are more than one store. This patch improves the cost
for when there is only a single store, and hopefully at some point in
the future the hack can be removed.
Differential Revision: https://reviews.llvm.org/D98243
Bu Le [Wed, 17 Mar 2021 10:15:56 +0000 (13:15 +0300)]
[SLP] Fix the trunc instruction insertion problem
Current SLP pass has this piece of code that inserts a trunc instruction
after the vectorized instruction. In the case that the vectorized instruction
is a phi node and not the last phi node in the BB, the trunc instruction
will be inserted between two phi nodes, which will trigger verify problem
in debug version or unpredictable error in another pass.
This patch changes the algorithm to 'if the last vectorized instruction
is a phi, insert it after the last phi node in current BB' to fix this problem.
David Spickett [Wed, 17 Mar 2021 09:43:06 +0000 (09:43 +0000)]
[lldb] Correct typo in memory read error
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D98770
Fraser Cormack [Mon, 15 Mar 2021 15:52:16 +0000 (15:52 +0000)]
[RISCV] Optimize "dominant element" BUILD_VECTORs
This patch adds an optimization path for BUILD_VECTOR nodes where the
majority of the elements are identical. These can be splatted, with the
remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can
be tweaked as required - it is currently conservative. Undef elements
are disregarded when judging the dominance of a particular element. This
allows them to be covered by the splat value.
In addition, vectors of 2 elements are always optimized to a splat (for
the upper element) and an insert at element zero.
This optimization is disabled when optimizing for size.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98700
Jay Foad [Tue, 16 Mar 2021 16:01:03 +0000 (16:01 +0000)]
[AMDGPU] Split dot2-insts feature
Split out some of the instructions predicated on the dot2-insts target
feature into a new dot7-insts, in preparation for subtargets that have
some but not all of these instructions. NFCI.
Differential Revision: https://reviews.llvm.org/D98717
Martin Storsjö [Wed, 17 Mar 2021 09:40:17 +0000 (11:40 +0200)]
[libcxx] [docs] Fix formatting of inline verbatim snippets in the Windows section
Use double backticks instead of single, as single backticks produces
italic formatting.
Jay Foad [Fri, 5 Mar 2021 11:32:06 +0000 (11:32 +0000)]
[TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter
This patch reduces the time taken for clang to compile the generated
disassembler for an out-of-tree target with InsnType bigger than 64 bits
from 4m30s to 48s.
D67686 did a similar thing for CodeEmitterGen.
The idea is to tweak the API of the APInt-like InsnType class so that
we don't need so many temporary InsnTypes. This takes advantage of the
rule stated in D52100 that currently "no string of bits extracted
from the encoding may exceeed 64-bits", so we can use uint64_t for some
temporaries.
D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.
Differential Revision: https://reviews.llvm.org/D98046
Bu Le [Wed, 17 Mar 2021 08:09:20 +0000 (11:09 +0300)]
[SLP][Test] Precommit test for D98423
Vassil Vassilev [Wed, 17 Mar 2021 08:56:05 +0000 (08:56 +0000)]
Make iteration over the DeclContext::lookup_result safe.
The idiom:
```
DeclContext::lookup_result R = DeclContext::lookup(Name);
for (auto *D : R) {...}
```
is not safe when in the loop body we trigger deserialization from an AST file.
The deserialization can insert new declarations in the StoredDeclsList whose
underlying type is a vector. When the vector decides to reallocate its storage
the pointer we hold becomes invalid.
This patch replaces a SmallVector with an singly-linked list. The current
approach stores a SmallVector<NamedDecl*, 4> which is around 8 pointers.
The linked list is 3, 5, or 7. We do better in terms of memory usage for small
cases (and worse in terms of locality -- the linked list entries won't be near
each other, but will be near their corresponding declarations, and we were going
to fetch those memory pages anyway). For larger cases: the vector uses a
doubling strategy for reallocation, so will generally be between half-full and
full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth
of space allocated currently and will be 2N pointers with the linked list. So we
break even when there are N=6 entries and slightly lose in terms of memory usage
after that. We suspect that's still a win on average.
Thanks to @rsmith!
Differential revision: https://reviews.llvm.org/D91524
Rainer Orth [Wed, 17 Mar 2021 08:56:19 +0000 (09:56 +0100)]
[sanitizer_common][test] Handle missing REG_STARTEND in Posix/regex_startend.cpp
As reported in D96348 <https://reviews.llvm.org/D96348>, the
`Posix/regex_startend.cpp` test `FAIL`s on Solaris because
`REG_STARTEND` isn't defined. It's a BSD extension not present everywhere.
E.g. AIX doesn't have it, too.
Fixed by wrapping the test in `#ifdef REG_STARTEND`.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D98425
Valeriy Savchenko [Mon, 15 Mar 2021 08:00:07 +0000 (11:00 +0300)]
[-Wcalled-once-parameter] Let escapes overwrite MaybeCalled states
This commit makes escapes symmetrical, meaning that having escape
before and after the branching, where parameter is not called on
one of the paths, will have the same effect.
Differential Revision: https://reviews.llvm.org/D98622
Martin Storsjö [Fri, 26 Feb 2021 14:13:38 +0000 (16:13 +0200)]
[libcxx] Simplify rounding of durations in win32 __libcpp_thread_sleep_for
Also fix a comment typo, and remove a superfluous "std::" qualififcation
in __libcpp_semaphore_wait_timed for consistency.
This mirrors what was suggested in review of
1773eec6928f4e37b377e23b84d7a2a07d0d1d0d.
Differential Revision: https://reviews.llvm.org/D98015
edwin-wang [Wed, 17 Mar 2021 08:02:50 +0000 (16:02 +0800)]
[NFC] [XCOFF] Update PowerPC readobj test case with expression
This patch is to replace the fixed value with expression.
Keep .file section as fixed values as it might be changed. The
remaining sections will hardly be modified. So the Index values
are sequential. By using expression, we can avoid the fixed value
changes in coming patches.
This is a follow-up of patch D97117.
Reviewed By: hubert.reinterpretcast, shchenz
Differential Revision: https://reviews.llvm.org/D98620
Fangrui Song [Wed, 17 Mar 2021 07:30:38 +0000 (00:30 -0700)]
[MC] Delete unused MCOperand::{create,is,get}FPImm
Praveen [Sun, 14 Mar 2021 12:42:30 +0000 (18:12 +0530)]
[Flang][OpenMP][OpenACC] Add function for mapping parser clause classes with the corresponding clause kind.
1. Generate the mapping for clauses between the parser class and the
corresponding clause kind for OpenMP and OpenACC using tablegen.
2. Add a common function to get the OmpObjectList from the OpenMP
clauses to avoid repetition of code.
Reviewed by: Kiranchandramohan @kiranchandramohan , Valentin Clement @clementval
Differential Revision: https://reviews.llvm.org/D98603
Vaivaswatha Nagaraj [Wed, 17 Mar 2021 05:55:43 +0000 (11:25 +0530)]
[OCaml] Fix buildbot failure in OCaml tests
The commit
506df1bbfd16233134a6ddea96ed2d49077840fd introduced
a call to `caml_alloc_initialized_string` which seems to be
unavailable on older OCaml versions. So I'm now switching to
using `caml_alloc_string` and using a `memcpy` after that, as
is done in the rest of the file.
Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/16/builds/7919
Greg McGary [Wed, 17 Mar 2021 04:34:28 +0000 (21:34 -0700)]
[lld-macho][NFC] Drop unnecessary braces around simple if/for bodies
Minor cleanup
Differential Revision: https://reviews.llvm.org/D98758
Arthur Eubanks [Wed, 17 Mar 2021 05:29:40 +0000 (22:29 -0700)]
[Unswitch] Guard dbgs logging with LLVM_DEBUG
Vaivaswatha Nagaraj [Wed, 17 Mar 2021 03:43:21 +0000 (09:13 +0530)]
[OCaml] DebugInfo support for OCaml bindings
Many (but not all) DebugInfo functions are now added to the
OCaml bindings, and rest can be safely added incrementally.
Differential Revision: https://reviews.llvm.org/D90831
Max Kazantsev [Wed, 17 Mar 2021 04:16:07 +0000 (11:16 +0700)]
[BasicAA] Drop dependency on Loop Info. PR43276
BasicAA stores a reference to LoopInfo inside. This imposes an implicit
requirement of keeping it up to date whenever we modify the IR (in particular,
whenever we modify terminators of blocks that belong to loops). Failing
to do so leads to incorrect state of the LoopInfo.
Because general AA does not require loop info updates and provides to API to
update it properly, the users of AA reasonably assume that there is no need to
update the loop info. It may be a reason of bugs, as example in PR43276 shows.
This patch drops dependence of BasicAA on LoopInfo to avoid this problem.
This may potentially pessimize the result of queries to BasicAA.
Differential Revision: https://reviews.llvm.org/D98627
Reviewed By: nikic
Greg McGary [Thu, 11 Mar 2021 00:45:18 +0000 (16:45 -0800)]
[lld-macho] Handle error cases properly for -exported_symbol(s_list)
This fixes defects in D98223 [lld-macho] implement options -(un)exported_symbol(s_list):
* disallow export of hidden symbols
* verify that whitelisted literal names are defined in the symbol table
* reflect export-status overrides in `nlist` attribute of `N_EXT` or `N_PEXT`
Thanks to @thakis for raising these issues
Differential Revision: https://reviews.llvm.org/D98381
Sourabh Singh Tomar [Tue, 16 Mar 2021 16:06:31 +0000 (21:36 +0530)]
[flang] Replace Arithmetic Ops with their builtin conunterpart
Replaces `fir.add, fir.sub, fir.mul, fir.div` with their builtin
conuterparts.
This part of upstreaming effort, upstreams some parts of:
PR:https://github.com/flang-compiler/f18-llvm-project/pull/681
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D98719
Bing1 Yu [Wed, 17 Mar 2021 03:05:24 +0000 (11:05 +0800)]
[X86][AMX] Rename amx-bf16 intrinsic according to correct naming convention
__tile_tdpbf16ps should be renamed with __tile_dpbf16ps
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D98685
Jianzhou Zhao [Mon, 15 Mar 2021 16:19:49 +0000 (16:19 +0000)]
[dfsan] Add origin ABI wrappers
supported: bcmp, fstat, memcmp, stat, strcasecmp, strchr, strcmp,
strncasecmp, strncp, strpbrk
This is a part of https://reviews.llvm.org/D95835.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D98636
Stephen Kelly [Wed, 17 Mar 2021 01:29:39 +0000 (01:29 +0000)]
[AST] Suppress diagnostic output when generating code
peter klausler [Tue, 16 Mar 2021 23:46:52 +0000 (16:46 -0700)]
[flang] Fix build error (unused data member warning)
Differential Revision: https://reviews.llvm.org/D98752
River Riddle [Tue, 16 Mar 2021 23:52:03 +0000 (16:52 -0700)]
[mlir][Python] Fix test broken after D98474
Stephen Kelly [Tue, 16 Mar 2021 23:44:45 +0000 (23:44 +0000)]
[AST] Hide errors from the attempt to introspect nodes
Raman Tenneti [Fri, 12 Mar 2021 00:17:50 +0000 (16:17 -0800)]
This introduces gmtime to LLVM libc, based on C99/C2X/Single Unix Spec.
This change doesn't handle TIMEZONE, tm_isdst and leap seconds.
Moved shared code between mktime and gmtime into time_utils.cpp.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D98467
River Riddle [Tue, 16 Mar 2021 23:30:46 +0000 (16:30 -0700)]
[mlir][IR] Move the remaining builtin attributes to ODS.
With this revision, all builtin attributes and types will have been moved to the ODS generator.
Differential Revision: https://reviews.llvm.org/D98474
River Riddle [Tue, 16 Mar 2021 23:30:34 +0000 (16:30 -0700)]
[mlir][AttrTypeDefGen] Add support for custom parameter comparators
Some parameters to attributes and types rely on special comparison routines other than operator== to ensure equality. This revision adds support for those parameters by allowing them to specify a `comparator` code block that determines if `$_lhs` and `$_rhs` are equal. An example of one of these paramters is APFloat, which requires `bitwiseIsEqual` for bitwise comparison (which we want for attribute equality).
Differential Revision: https://reviews.llvm.org/D98473
River Riddle [Tue, 16 Mar 2021 23:11:01 +0000 (16:11 -0700)]
[mlir][pdl] Cast the OperationPosition to Position to fix MSVC miscompile
If we don't cast, MSVC picks an overload that hasn't been defined yet(not sure why) and miscompiles.
Eugene Zhulenev [Mon, 15 Mar 2021 21:39:19 +0000 (14:39 -0700)]
[mlir] Add lowering from math::Log1p to LLVM
[mlir] Add lowering from math::Log1p to LLVM
Reviewed By: cota
Differential Revision: https://reviews.llvm.org/D98662
Anirudh Prasad [Tue, 16 Mar 2021 22:38:03 +0000 (18:38 -0400)]
Revert "[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax""
This reverts commit
b605cfb336989705f391d255b7628062d3dfe9c3.
Differential Revision: https://reviews.llvm.org/D98744
peter klausler [Tue, 16 Mar 2021 21:32:45 +0000 (14:32 -0700)]
[flang] Order Symbols by source provenance
In parser::AllCookedSources, implement a map from CharBlocks to
the CookedSource instances that they cover. This permits a fast
Find() operation based on std::map::equal_range to map a CharBlock
to its enclosing CookedSource instance.
Add a creation order number to each CookedSource. This allows
AllCookedSources to provide a Precedes(x,y) predicate that is a
true source stream ordering between two CharBlocks -- x is less
than y if it is in an earlier CookedSource, or in the same
CookedSource at an earlier position.
Add a reference to the singleton SemanticsContext to each Scope.
All of this allows operator< to be implemented on Symbols by
means of a true source ordering. From a Symbol, we get to
its Scope, then to the SemanticsContext, and then use its
AllCookedSources reference to call Precedes().
Differential Revision: https://reviews.llvm.org/D98743
Emily Shi [Tue, 16 Mar 2021 19:19:37 +0000 (12:19 -0700)]
[asan] disable MallocNanoZone for no fd test on darwin
On Darwin, MallocNanoZone may log after execv, which messes up this test.
Disable MallocNanoZone for this test since we don't use it anyway with asan.
This environment variable should only affect Darwin and not change behavior on other platforms.
rdar://
74992832
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D98735
Vitaly Buka [Tue, 16 Mar 2021 22:01:35 +0000 (15:01 -0700)]
[sanitizer][NFC] Fix compilation error on Windows
And remove unnecessary const_cast in ubsan.
Zequan Wu [Tue, 16 Mar 2021 21:36:21 +0000 (14:36 -0700)]
Revert "[ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()"
That commit caused chromium build to crash: https://bugs.chromium.org/p/chromium/issues/detail?id=1188885
This reverts commit
edf7004851519464f86b0f641da4d6c9506decb1.
Michael Kruse [Tue, 16 Mar 2021 20:59:59 +0000 (15:59 -0500)]
[Polly][CodeGen] Allow nesting of BandAttr mark without loop.
BandAttr markers are added as parents of schedule tree bands. These also
appear as markers its equivalent AST, but a band does not necessarily
corresponds to a loop in this. Iterations may be peeled or the loop
being unrolled (e.g. if it has just one iteration). In such cases it may
happend that there is not loop between a BandAttr marker and the marker
for a loop nested in the former parent band/loop.
Handle the situation by giving priority to the inner marker over the
outer.
Fixes the polly-x86_64-linux-test-suite buildbot.
Sanjay Patel [Tue, 16 Mar 2021 20:42:25 +0000 (16:42 -0400)]
[SLP] separate min/max matching from its instruction-level implementation; NFC
The motivation is to handle integer min/max reductions independently
of whether they are in the current cmp+sel form or the planned intrinsic
form.
We assumed that min/max included a select instruction, but we can
decouple that implementation detail by checking the instructions
themselves rather than relying on the recurrence (reduction) type.
Fangrui Song [Tue, 16 Mar 2021 21:12:18 +0000 (14:12 -0700)]
[RISCV] Make empty name symbols SF_FormatSpecific so that llvm-symbolizer ignores them for symbolization
On RISC-V, clang emits empty name symbols used for label differences. (In GCC the symbols are typically `.L0`)
After D95916, the empty name symbols can show up in llvm-symbolizer's symbolization output.
They have no names and thus not useful. Set `SF_FormatSpecific` so that llvm-symbolizer will ignore them.
`SF_FormatSpecific` is also used in LTO but that case should not matter.
Corresponding addr2line problem: https://sourceware.org/bugzilla/show_bug.cgi?id=27585
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D98669
Vitaly Buka [Tue, 16 Mar 2021 11:03:45 +0000 (04:03 -0700)]
[sanitizer][NFC] Remove InternalScopedString::size()
size() is inconsistent with length().
In most size() use cases we can replace InternalScopedString with
InternalMmapVector.
Remove non-constant data() to avoid direct manipulations of internal
buffer. append() should be enought to modify InternalScopedString.
Anirudh Prasad [Tue, 16 Mar 2021 21:11:14 +0000 (17:11 -0400)]
[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax"
- Previously, https://reviews.llvm.org/D97703 was [[ https://reviews.llvm.org/D98543 | reverted ]] as it broke when building the unit tests when shared libs on.
- This patch reverts the "revert" and makes two minor changes
- The first is it also links in the MCParser lib when building the unittest. This should resolve the issue when building with with shared libs on and off
- The second renames the name of the unit test from `SystemZAsmLexer` to `SystemZAsmLexerTests` since the convention for unittest binaries is to suffix the name of the unit test with "Tests"
Reviewed By: Kai
Differential Revision: https://reviews.llvm.org/D98666
Roland McGrath [Sat, 13 Mar 2021 00:08:42 +0000 (16:08 -0800)]
[AArch64] Parse "rng" feature flag in .arch directive
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D98566
Mohammad Hadi Jooybar [Tue, 16 Mar 2021 20:28:25 +0000 (16:28 -0400)]
[InstCombine] Avoid Bitcast-GEP fusion for pointers directly from allocation functions
Elimination of bitcasts with void pointer arguments results in GEPs with pure byte indexes. These GEPs do not preserve struct/array information and interrupt phi address translation in later pipeline stages.
Here is the original motivation for this patch:
```
#include<stdio.h>
#include<malloc.h>
typedef struct __Node{
double f;
struct __Node *next;
} Node;
void foo () {
Node *a = (Node*) malloc (sizeof(Node));
a->next = NULL;
a->f = 11.5f;
Node *ptr = a;
double sum = 0.0f;
while (ptr) {
sum += ptr->f;
ptr = ptr->next;
}
printf("%f\n", sum);
}
```
By explicit assignment `a->next = NULL`, we can infer the length of the link list is `1`. In this case we can eliminate while loop traversal entirely. This elimination is supposed to be performed by GVN/MemoryDependencyAnalysis/PhiTranslation .
The final IR before this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
%next = getelementptr inbounds i8, i8* %call, i64 8
%0 = bitcast i8* %next to %struct.__Node**
store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2
%f = bitcast i8* %call to double*
store double 1.150000e+01, double* %f, align 8, !tbaa !8
%tobool12 = icmp eq i8* %call, null
br i1 %tobool12, label %while.end, label %while.body.lr.ph
while.body.lr.ph: ; preds = %entry
%1 = bitcast i8* %call to %struct.__Node*
br label %while.body
while.body: ; preds = %while.body.lr.ph, %while.body
%sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ]
%ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ]
%f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
%2 = load double, double* %f1, align 8, !tbaa !8
%add = fadd contract double %sum.014, %2
%next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
%3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
%tobool = icmp eq %struct.__Node* %3, null
br i1 %tobool, label %while.end, label %while.body
while.end: ; preds = %while.body, %entry
%sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add, %while.body ]
%call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
ret void
}
```
Final IR after this patch:
```
; Function Attrs: nofree nounwind
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
while.end:
%call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double 1.150000e+01)
ret void
}
```
IR before GVN before this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
%next = getelementptr inbounds i8, i8* %call, i64 8
%0 = bitcast i8* %next to %struct.__Node**
store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2
%f = bitcast i8* %call to double*
store double 1.150000e+01, double* %f, align 8, !tbaa !8
%tobool12 = icmp eq i8* %call, null
br i1 %tobool12, label %while.end, label %while.body.lr.ph
while.body.lr.ph: ; preds = %entry
%1 = bitcast i8* %call to %struct.__Node*
br label %while.body
while.body: ; preds = %while.body.lr.ph, %while.body
%sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ]
%ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ]
%f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
%2 = load double, double* %f1, align 8, !tbaa !8
%add = fadd contract double %sum.014, %2
%next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
%3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
%tobool = icmp eq %struct.__Node* %3, null
br i1 %tobool, label %while.end.loopexit, label %while.body
while.end.loopexit: ; preds = %while.body
%add.lcssa = phi double [ %add, %while.body ]
br label %while.end
while.end: ; preds = %while.end.loopexit, %entry
%sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ]
%call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
ret void
}
```
IR before GVN after this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
%0 = bitcast i8* %call to %struct.__Node*
%next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1
store %struct.__Node* null, %struct.__Node** %next, align 8, !tbaa !2
%f = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 0
store double 1.150000e+01, double* %f, align 8, !tbaa !8
%tobool12 = icmp eq i8* %call, null
br i1 %tobool12, label %while.end, label %while.body.preheader
while.body.preheader: ; preds = %entry
br label %while.body
while.body: ; preds = %while.body.preheader, %while.body
%sum.014 = phi double [ %add, %while.body ], [ 0.000000e+00, %while.body.preheader ]
%ptr.013 = phi %struct.__Node* [ %2, %while.body ], [ %0, %while.body.preheader ]
%f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
%1 = load double, double* %f1, align 8, !tbaa !8
%add = fadd contract double %sum.014, %1
%next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
%2 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
%tobool = icmp eq %struct.__Node* %2, null
br i1 %tobool, label %while.end.loopexit, label %while.body
while.end.loopexit: ; preds = %while.body
%add.lcssa = phi double [ %add, %while.body ]
br label %while.end
while.end: ; preds = %while.end.loopexit, %entry
%sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ]
%call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
ret void
}
```
The phi translation fails before this patch and it prevents GVN to remove the loop. The reason for this failure is in InstCombine. When the Instruction combining pass decides to convert:
```
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16)
%0 = bitcast i8* %call to %struct.__Node*
%next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1
store %struct.__Node* null, %struct.__Node** %next
```
to
```
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16)
%next = getelementptr inbounds i8, i8* %call, i64 8
%0 = bitcast i8* %next to %struct.__Node**
store %struct.__Node* null, %struct.__Node** %0
```
GEP instructions with pure byte indexes (e.g. `getelementptr inbounds i8, i8* %call, i64 8`) are obstacles for address translation. address translation is looking for structural similarity between GEPs and these GEPs usually do not match since they have different structure.
This change will cause couple of failures in LLVM-tests. However, in all cases we need to change expected result by the test. I will update those tests as soon as I get green light on this patch.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D96881
Ricky Taylor [Tue, 16 Mar 2021 20:35:06 +0000 (13:35 -0700)]
[M68k] Add more specific operand classes
This change adds an operand class for each addressing mode, which can then
be used as part of the assembler to match instructions.
Differential Revision: https://reviews.llvm.org/D98535
Min-Yih Hsu [Mon, 15 Mar 2021 20:47:10 +0000 (13:47 -0700)]
[M68k] Fixed incorrect `extract-section` command substitution
Fix Bug 49485 (https://bugs.llvm.org/show_bug.cgi?id=49485). Which was
caused by incorrect invocation of `extract-section.py` on Windows.
Replacing it with more general python script invocation.
Differential Revision: https://reviews.llvm.org/D98661
Martin Storsjö [Thu, 11 Mar 2021 20:14:45 +0000 (22:14 +0200)]
[compiler-rt] Use try_compile_only to check for __ARM_FP
This fixes detection when linking isn't supported (i.e. while building
builtins the first time).
Since
8368e4d54c459fe173d76277f17c632478e91add, after setting
CMAKE_TRY_COMPILE_TARGET_TYPE to STATIC_LIBRARY, this isn't strictly
needed, but is good for correctness anyway (and in case that commit
ends up reverted).
Differential Revision: https://reviews.llvm.org/D98737
River Riddle [Tue, 16 Mar 2021 20:12:01 +0000 (13:12 -0700)]
[mlir][PDL] Add support for variadic operands and results in the PDL byte code
Supporting ranges in the byte code requires additional complexity, given that a range can't be easily representable as an opaque void *, as is possible with the existing bytecode value types (Attribute, Type, Value, etc.). To enable representing a range with void *, an auxillary storage is used for the actual range itself, with the pointer being passed around in the normal byte code memory. For type ranges, a TypeRange is stored. For value ranges, a ValueRange is stored. The above problem represents a majority of the complexity involved in this revision, the rest is adapting/adding byte code operations to support the changes made to the PDL interpreter in the parent revision.
After this revision, PDL will have initial end-to-end support for variadic operands/results.
Differential Revision: https://reviews.llvm.org/D95723
River Riddle [Tue, 16 Mar 2021 20:11:50 +0000 (13:11 -0700)]
[mlir][PDL] Add support for variadic operands and results in the PDL Interpreter
This revision extends the PDL Interpreter dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, three new operations have been added that closely match the single variant:
* pdl_interp.check_types : Compare a range of types with a known range.
* pdl_interp.create_types : Create a constant range of types.
* pdl_interp.get_operands : Get a range of operands from an operation.
* pdl_interp.get_results : Get a range of results from an operation.
* pdl_interp.switch_types : Switch on a range of types.
This revision handles adding support in the interpreter dialect and the conversion from PDL to PDLInterp. Support for variadic operands and results in the bytecode will be added in a followup revision.
Differential Revision: https://reviews.llvm.org/D95722
River Riddle [Tue, 16 Mar 2021 20:11:34 +0000 (13:11 -0700)]
[mlir][PDL] Add support for variadic operands and results in PDL
This revision extends the PDL dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, three new operations have been added that closely match the single variant:
* pdl.operands : Define a range of input operands.
* pdl.results : Extract a result group from an operation.
* pdl.types : Define a handle to a range of types.
Support for these in the pdl interpreter dialect and byte code will be added in followup revisions.
Differential Revision: https://reviews.llvm.org/D95721
River Riddle [Tue, 16 Mar 2021 20:11:22 +0000 (13:11 -0700)]
[mlir][pdl] Remove CreateNativeOp in favor of a more general ApplyNativeRewriteOp.
This has a numerous amount of benefits, given the overly clunky nature of CreateNativeOp:
* Users can now call into arbitrary rewrite functions from inside of PDL, allowing for more natural interleaving of PDL/C++ and enabling for more of the pattern to be in PDL.
* Removes the need for an additional set of C++ functions/registry/etc. The new ApplyNativeRewriteOp will use the same PDLRewriteFunction as the existing RewriteOp. This reduces the API surface area exposed to users.
This revision also introduces a new PDLResultList class. This class is used to provide results of native rewrite functions back to PDL. We introduce a new class instead of using a SmallVector to simplify the work necessary for variadics, given that ranges will require some changes to the structure of PDLValue.
Differential Revision: https://reviews.llvm.org/D95720
River Riddle [Tue, 16 Mar 2021 20:11:07 +0000 (13:11 -0700)]
[mlir][pdl] Restructure how results are represented.
Up until now, results have been represented as additional results to a pdl.operation. This is fairly clunky, as it mismatches the representation of the rest of the IR constructs(e.g. pdl.operand) and also isn't a viable representation for operations returned by pdl.create_native. This representation also creates much more difficult problems when factoring in support for variadic result groups, optional results, etc. To resolve some of these problems, and simplify adding support for variable length results, this revision extracts the representation for results out of pdl.operation in the form of a new `pdl.result` operation. This operation returns the result of an operation at a given index, e.g.:
```
%root = pdl.operation ...
%result = pdl.result 0 of %root
```
Differential Revision: https://reviews.llvm.org/D95719
Martin Storsjö [Fri, 26 Feb 2021 22:31:18 +0000 (00:31 +0200)]
[sanitizers] [windows] Use InternalMmapVector instead of silencing -Wframe-larger-than
Also use this in ReadBinaryName which currently is producing
warnings.
Keep pragmas for silencing warnings in sanitizer_unwind_win.cpp,
as that can be called more frequently.
Differential Revision: https://reviews.llvm.org/D97726
Philip Reames [Tue, 16 Mar 2021 20:00:23 +0000 (13:00 -0700)]
[rs4gc] Simplify code by cloning existing instructions when inserting base chain [NFC]
Previously we created a new node, then filled in the pieces. Now, we clone the existing node, then change the respective fields. The only change in handling is with phis since we have to handle multiple incoming edges from the same block a bit differently.
Differential Revision: https://reviews.llvm.org/D98316
Philip Reames [Tue, 16 Mar 2021 19:57:54 +0000 (12:57 -0700)]
[rs4gc] don't force a conflict for a canonical broadcast
A broadcast is a shufflevector where only one input is used. Because of the way we handle constants (undef is a constant), the canonical shuffle sees a meet of (some value) and (nullptr). Given this, every broadcast gets treated as a conflict and a new base pointer computation is added.
The other way to tackle this would be to change constant handling specifically for undefs, but this seems easier.
Differential Revision: https://reviews.llvm.org/D98315
Peter Collingbourne [Tue, 16 Mar 2021 18:46:31 +0000 (11:46 -0700)]
scudo: Allow TBI to be disabled on Linux with a macro.
Android's native bridge (i.e. AArch64 emulator) doesn't support TBI so
we need a way to disable TBI on Linux when targeting the native bridge.
This can also be used to test the no-TBI code path on Linux (currently
only used on Fuchsia), or make Scudo compatible with very old
(pre-commit
d50240a5f6ceaf690a77b0fccb17be51cfa151c2 from June 2013)
Linux kernels that do not enable TBI.
Differential Revision: https://reviews.llvm.org/D98732
Philip Reames [Tue, 16 Mar 2021 19:47:57 +0000 (12:47 -0700)]
[rs4gc] don't duplicate existing values which are provably base pointers
RS4GC needs to rewrite the IR to ensure that every relocated pointer has an associated base pointer. The existing code isn't particularly smart about avoiding duplication of existing IR when it turns out the original pointer we were asked to materialize a base pointer for is itself a base pointer.
This patch adds a stage to the algorithm which prunes nodes proven (with a simple forward dataflow fixed point) to be base pointers from the list of nodes considered for duplication. This does require changing some of the later invariants slightly, that's probably the riskiest part of the change.
Differential Revision: D98122
Nikita Popov [Tue, 16 Mar 2021 19:41:26 +0000 (20:41 +0100)]
Revert "[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration"
This reverts commit
d40b4911bd9aca0573752e065f29ddd9aff280e1.
This causes a large compile-time regression:
https://llvm-compile-time-tracker.com/compare.php?from=
0aa637b2037d882ddf7861284169abf63f524677&to=
d40b4911bd9aca0573752e065f29ddd9aff280e1&stat=instructions
Liam Keegan [Tue, 16 Mar 2021 19:30:00 +0000 (20:30 +0100)]
[MemCpyOpt] Add missing MemorySSAWrapperPass dependency macro
Add MemorySSAWrapperPass as a dependency to MemCpyOptLegacyPass,
since MemCpyOpt now uses MemorySSA by default.
Differential Revision: https://reviews.llvm.org/D98484
Mircea Trofin [Tue, 9 Mar 2021 04:55:53 +0000 (20:55 -0800)]
[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration
The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs()
needs to be called before iterating the interfering live ranges.
The rest of the patch offers support that is the case: instead of clearing the query's
InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix
is invalidated (existing triggering mechanism).
Without the change in RegAllocGreedy.cpp, the compiler ices.
This patch should make it more easily discoverable by developers that
collectInterferringVregs needs to be called before iterating.
I will follow up with a subsequent patch to improve the usability and maintainability of Query.
Differential Revision: https://reviews.llvm.org/D98232
Arthur O'Dwyer [Sat, 6 Mar 2021 01:13:35 +0000 (20:13 -0500)]
[libc++] Improve src/filesystem's formatting of paths.
This is my attempt to merge D98077 (bugfix the format strings for
Windows paths, which use wchar_t not char)
and D96986 (replace C++ variadic templates with C-style varargs so that
`__attribute__((format(printf)))` can be applied, for better safety)
and D98065 (remove an unused function overload).
The one intentional functional change here is in `__create_what`.
It now prints path1 and path2 in square-brackets _and_ double-quotes,
rather than just square-brackets. Prior to this patch, it would
print either path double-quoted if-and-only-if it was the empty
string. Now the double-quotes are always present. I doubt anybody's
code is relying on the current format, right?
Differential Revision: https://reviews.llvm.org/D98097
Nick Lewycky [Tue, 9 Mar 2021 23:37:04 +0000 (15:37 -0800)]
Add ConstantDataVector::getRaw() to create a constant data vector from raw data.
This parallels ConstantDataArray::getRaw() and can be used with ConstantDataSequential::getRawDataValues() in the base class for both types.
Update BuildConstantData{Array,Vector} tests to test the getRaw API. Also removes its unused Module.
In passing, update some comments to include the support for half and bfloat. Update tests to include testing for bfloat.
Differential Revision: https://reviews.llvm.org/D98302
Maksym Wezdecki [Tue, 16 Mar 2021 17:58:30 +0000 (10:58 -0700)]
Fix for memory leak reported by Valgrind
If llvm so lib is dlopened and dlclosed several times, then memory leak can be observed, reported by Valgrind.
This patch fixes the issue.
Reviewed By: lattner, dblaikie
Differential Revision: https://reviews.llvm.org/D83372
Philip Reames [Tue, 16 Mar 2021 17:52:01 +0000 (10:52 -0700)]
[gvn] CSE gc.relocates based on meaning, not spelling (try 2)
This was (partially) reverted in
cfe8f8e0 because the conversion from readonly to readnone in Intrinsics.td exposed a couple of problems. This change has been reworked to not need that change (via some explicit checks in client code). This is being done to address the original optimization issue and simplify the testing of the readonly changes. I'm working on that piece under 49607.
Original commit message follows:
The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)
We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.
Differential Revision: https://reviews.llvm.org/D97974
Florian Hahn [Tue, 16 Mar 2021 11:42:08 +0000 (11:42 +0000)]
[VPlan] Remove PredInst2Recipe, use VP operands instead. (NFC)
Instead of maintaining a separate map from predicated instructions to
recipes, we can instead directly look at the VP operands. If the operand
comes from a predicated instruction, the operand will be a
VPPredInstPHIRecipe with a VPReplicateRecipe as its operand.
Giorgis Georgakoudis [Tue, 16 Mar 2021 14:41:39 +0000 (07:41 -0700)]
[Utils] Support lit-like substitutions in update_cc_test_checks
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D98712
Vaivaswatha Nagaraj [Tue, 16 Mar 2021 11:12:30 +0000 (16:42 +0530)]
[Docs] Mention linking to reviews page when committing
Differential Revision: https://reviews.llvm.org/D98695
Aart Bik [Tue, 16 Mar 2021 16:43:42 +0000 (09:43 -0700)]
[mlir][amx] reformatted examples
Examples were missing the underscore of the actual ops format.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D98723
Fangrui Song [Tue, 16 Mar 2021 17:07:01 +0000 (10:07 -0700)]
[llvm-nm] Add --format=just-symbols and make --just-symbol-name its alias
https://sourceware.org/bugzilla/show_bug.cgi?id=27487 binutils will have
--format=just-symbols/-j as well.
Arbitrarily prefer `-j` to `--format=sysv`. Previously `--format=sysv -j` prints
in the sysv format while `-j` takes precedence over other formats.
Differential Revision: https://reviews.llvm.org/D98569
Adrian Prantl [Mon, 15 Mar 2021 23:23:31 +0000 (16:23 -0700)]
Support !heapallocsite attachments in StripDebugInfo().
They point into the DIType type system, so they need to be stripped as well.
rdar://
75341300
Differential Revision: https://reviews.llvm.org/D98668
Adrian Prantl [Mon, 15 Mar 2021 23:16:36 +0000 (16:16 -0700)]
Support !heapallocsite attachments in stripNonLineTableDebugInfo().
They point into the DIType type system, so they need to be stripped as well.
rdar://
75341300
Differential Revision: https://reviews.llvm.org/D98667
Fangrui Song [Tue, 16 Mar 2021 17:02:35 +0000 (10:02 -0700)]
[RISCV] Support clang -fpatchable-function-entry && GNU function attribute 'patchable_function_entry'
Similar to D72215 (AArch64) and D72220 (x86).
```
% clang -target riscv32 -march=rv64g -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o
...
0000000000000000 <main>:
0: 13 00 00 00 nop
4: 13 00 00 00 nop
% clang -target riscv32 -march=rv64gc -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o
...
00000002 <main>:
2: 01 00 nop
4: 01 00 nop
```
Recently the mainline kernel started to use -fpatchable-function-entry=8 for riscv (https://git.kernel.org/linus/
afc76b8b80112189b6f11e67e19cf58301944814).
Differential Revision: https://reviews.llvm.org/D98610
Simonas Kazlauskas [Tue, 16 Mar 2021 00:04:42 +0000 (02:04 +0200)]
[InstSimplify] Restrict a GEP transform to avoid provenance changes
This is a follow-up to D98588, and fixes the inline `FIXME` about a GEP-related simplification not
preserving the provenance.
https://alive2.llvm.org/ce/z/qbQoAY
Additional tests were added in {rGf125f28afdb59eba29d2491dac0dfc0a7bf1b60b}
Depends on D98672
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D98611
Jeremy Morse [Tue, 16 Mar 2021 16:50:49 +0000 (16:50 +0000)]
Tweak spelling of system-windows UNSUPPORTED line
Tomas Matheson [Tue, 23 Feb 2021 14:05:55 +0000 (14:05 +0000)]
[libcxx][type_traits] add tests for is_signed and is_unsigned
In previous versions of clang, __is_signed and __is_unsigned builtins did not
correspond to is_signed and is_unsigned behaviour for enums. The builtins were
fixed in D67897 and D98104.
* Disable the fast path of is_unsigned for clang versions < 13
* Add more tests for is_signed, is_unsigned and is_arithmetic
Differential Revision: https://reviews.llvm.org/D97283
Luke Drummond [Tue, 16 Mar 2021 13:28:06 +0000 (13:28 +0000)]
[OpenCL][NFCI] Prefer CodeGenFunction::EmitRuntimeCall
`CodeGenFunction::EmitRuntimeCall` automatically sets the right calling
convention for the callee so we can avoid setting it ourselves.
As requested in https://reviews.llvm.org/D98411
Reviewed by: anastasia
Differential Revision: https://reviews.llvm.org/D98705
Sanjay Patel [Tue, 16 Mar 2021 13:42:25 +0000 (09:42 -0400)]
[LoopVectorize] add FP induction test with minimal FMF; NFC
Thomas Preud'homme [Tue, 16 Mar 2021 15:49:16 +0000 (15:49 +0000)]
[MemDepAnalysis] Remove redundant comment.
Exact same comment is found 2 lines above.
Aart Bik [Tue, 16 Mar 2021 04:18:52 +0000 (21:18 -0700)]
[mlir][amx] blocked tilezero integration test
This adds a new integration test. However, it also
adapts to a recent memref.XXX change for existing tests
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D98680
Josh Berdine [Fri, 12 Mar 2021 22:50:21 +0000 (22:50 +0000)]
[OCaml] Add missing TypeKinds, Opcode, and AtomicRMWBinOps
There are several enum values that have been added to LLVM-C that are
missing from the OCaml bindings. The types defined in
bindings/ocaml/llvm/llvm.ml should be in sync with the corresponding
enum definitions in include/llvm-c/Core.h. The enum values are passed
from C to OCaml unmodified, and clients of the OCaml bindings
interpret them as tags of the corresponding OCaml types. So the only
changes needed are to add the missing constructors to the type
definitions, and to change the name of the maximum opcode in an
assertion.
Differential Revision: https://reviews.llvm.org/D98578
Joe Ellis [Thu, 4 Mar 2021 09:06:49 +0000 (09:06 +0000)]
[AArch64][SVE] Fold vector ZExt/SExt into gather loads where possible
This commit folds sxtw'd or uxtw'd offsets into gather loads where
possible with a DAGCombine optimization.
As an example, the following code:
1 #include <arm_sve.h>
2
3 svuint64_t func(svbool_t pred, const int32_t *base, svint64_t offsets) {
4 return svld1sw_gather_s64offset_u64(
5 pred, base, svextw_s64_x(pred, offsets)
6 );
7 }
would previously lower to the following assembly:
sxtw z0.d, p0/m, z0.d
ld1sw { z0.d }, p0/z, [x0, z0.d]
ret
but now lowers to:
ld1sw { z0.d }, p0/z, [x0, z0.d, sxtw]
ret
Differential Revision: https://reviews.llvm.org/D97858
Max Kazantsev [Tue, 16 Mar 2021 14:27:25 +0000 (21:27 +0700)]
[SCEV][NFC] Move check up the stack
One of (and primary) callers of isBasicBlockEntryGuardedByCond is
isKnownPredicateAt, which makes isKnownPredicate check before it.
It already makes non-recursive check inside. So, on this execution
path this check is made twice. The only other caller is
isLoopEntryGuardedByCond. Moving the check there should save some
compile time.
Craig Topper [Tue, 16 Mar 2021 14:49:24 +0000 (07:49 -0700)]
[RISCV] Look through copies when trying to find an implicit def in addVSetVL.
The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF
before connecting it to the vector instruction. This occurs when
constrainRegClass reduces to a class with less than 4 registers.
I believe LMUL8 on masked instructions triggers this since the
result can only use the v8, v16, or v24 register group as the mask
is using v0.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D98567
David Zarzycki [Tue, 16 Mar 2021 14:52:11 +0000 (10:52 -0400)]
[lit testing] Mark reorder.py as unavailable on Windows
The test file has embedded slashes. This is fine for normal users that
are just recording and reordering paths, but not great when the trace
data is committed back to a repository that should work on both Unix and
Windows.
Joe Ellis [Mon, 8 Mar 2021 12:52:40 +0000 (12:52 +0000)]
[AArch64][SVEIntrinsicOpts] Factor out redundant SVE mul/fmul intrinsics
This commit implements an IR-level optimization to eliminate idempotent
SVE mul/fmul intrinsic calls. Currently, the following patterns are
captured:
fmul pg (dup_x 1.0) V => V
mul pg (dup_x 1) V => V
fmul pg V (dup_x 1.0) => V
mul pg V (dup_x 1) => V
fmul pg V (dup v pg 1.0) => V
mul pg V (dup v pg 1) => V
The result of this commit is that code such as:
1 #include <arm_sve.h>
2
3 svfloat64_t foo(svfloat64_t a) {
4 svbool_t t = svptrue_b64();
5 svfloat64_t b = svdup_f64(1.0);
6 return svmul_m(t, a, b);
7 }
will lower to a nop.
This commit does not capture all possibilities; only the simple cases
described above. There is still room for further optimisation.
Differential Revision: https://reviews.llvm.org/D98033
Craig Topper [Tue, 16 Mar 2021 07:29:42 +0000 (00:29 -0700)]
[RISCV] Improve i32 UADDSAT/USUBSAT on RV64.
The default promotion uses zero extends that become shifts. We
cam use sign extend instead which is better for RISCV.
I've used two different implementations based on whether we
have minu/maxu instructions.
Differential Revision: https://reviews.llvm.org/D98683
Aaron Puchert [Tue, 16 Mar 2021 14:17:43 +0000 (15:17 +0100)]
Correct Doxygen syntax for inline code
There is no syntax like {@code ...} in Doxygen, @code is a block command
that ends with @endcode, and generally these are not enclosed in braces.
The correct syntax for inline code snippets is @c <code>.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98665
LLVM GN Syncbot [Tue, 16 Mar 2021 14:03:53 +0000 (14:03 +0000)]
[gn build] Port
9a5af541ee05
Nathan James [Tue, 16 Mar 2021 14:03:29 +0000 (14:03 +0000)]
[clang-tidy] Remove readability-deleted-default
The deprecation notice was cherrypicked to the release branch in https://github.com/llvm/llvm-project/commit/
f8b32989241cca87a8690c8cc404f06ce1f90e4c so its safe to remove this for the 13.X release cycle.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98612
Michael Kruse [Tue, 16 Mar 2021 13:50:37 +0000 (08:50 -0500)]
[Polly][Unroll] Fix unroll_double test.
We enumerated the cross product Domain x Scatter, but sorted only be the
scatter key. In case there are are multiple statement instances per
scatter value, the order between statement instances of the same loop
iteration was undefined.
Propertly enumerate and sort only by the scatter value, and group the
domains using the scatter dimension again.
Thanks to Leonard Chan for the report.
Hansang Bae [Thu, 11 Mar 2021 17:24:29 +0000 (11:24 -0600)]
[OpenMP] Add runtime interface for OpenMP 5.1 error directive
The proposed new interface is for supporting `at(execution)` clause in the
error directive.
Differential Revision: https://reviews.llvm.org/D98448
Simon Pilgrim [Tue, 16 Mar 2021 12:31:39 +0000 (12:31 +0000)]
[X86][SSE] canonicalizeShuffleWithBinOps - add PERMILPS/PERMILPD + PERMPD/PERMQ + INSERTPS handling.
Bail if the INSERTPS would introduce zeros across the binop.
RamNalamothu [Tue, 16 Mar 2021 11:06:24 +0000 (16:36 +0530)]
[AMDGPU, NFC] Refactor FP/BP spill index code in emitPrologue/emitEpilogue
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D98617
Simonas Kazlauskas [Sun, 14 Mar 2021 20:54:18 +0000 (22:54 +0200)]
[InstSimplify] Match PtrToInt more directly in a GEP transform (NFC)
In preparation for D98611, the upcoming change will need to apply additional checks to `P` and `V`,
and so this refactor paves the way for adding additional checks in a less awkward way.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D98672
David Zarzycki [Tue, 16 Mar 2021 13:11:01 +0000 (09:11 -0400)]
[lit testing] Fix Windows reliability?
Max Kazantsev [Tue, 16 Mar 2021 12:46:36 +0000 (19:46 +0700)]
[Test] Add test with loops guarded by trivial conditions
Max Kazantsev [Tue, 16 Mar 2021 12:38:57 +0000 (19:38 +0700)]
[Test] Update auto-generated checks
Kirill Bobyrev [Tue, 16 Mar 2021 12:37:48 +0000 (13:37 +0100)]
[clangd] Add basic monitoring info request for remote index server
This allows requesting information about the server uptime and start time. This is the first patch in a series of monitoring changes, hence it's not immediately useful. Next step is propagating the index freshness information and then probably loading metadata into the index server.
The way to test new behaviour through command line:
```
$ grpc_cli call localhost:50051 Monitor/MonitoringInfo ''
connecting to localhost:50051
uptime_seconds: 42
index_age_seconds: 609568
Rpc succeeded with OK status
```
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D98246
serge-sans-paille [Mon, 8 Mar 2021 16:46:35 +0000 (17:46 +0100)]
[NFC] Use SmallString instead of std::string for the AttrBuilder
This avoids a few unnecessary conversion from StringRef to std::string, and a
bunch of extra allocation thanks to the SmallString.
Differential Revision: https://reviews.llvm.org/D98190