Chuanqi Xu [Tue, 25 May 2021 12:33:46 +0000 (20:33 +0800)]
[NFC] [Coroutines] Remove unused variable: UnreachableCache
OCHyams [Tue, 25 May 2021 12:09:14 +0000 (13:09 +0100)]
[dexter] Change --source-root-dir and add --debugger-use-relative-paths
We want to use `DexDeclareFile` to specify paths relative to a project root
directory. The option `--source-root-dir`, prior to this patch, caused dexter
to strip the path prefix from commands before passing them to a debugger, and
to prepend the prefix to file paths returned from a debugger. This patch changes
the behaviour of `--source-root-dir`. Relative paths in commands, made possible
with `DexDeclareFile(relative/path)`, are appended to the `--source-root-dir`
directory.
A new option, `--debugger-use-relative-paths`, can be used alongside
`--source-root-dir` to reproduce the old behaviour: all paths passed to the
debugger will be made relative to `--source-root-dir`.
I've added a regression test source_root_dir.dex for this new behaviour, and
modified the existing `--source-root-dir` regression and unit tests to use
`--debugger-use-relative-paths`.
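The two path modes described above can be sketched as follows. This is only an illustration in C++ (dexter itself is written in Python); the helper names `resolveCommandPath` and `toDebuggerPath` are hypothetical, and POSIX-style `/` separators are assumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a relative path, e.g. from DexDeclareFile(relative/path),
// is joined onto the --source-root-dir directory. Absolute paths pass through.
std::string resolveCommandPath(const std::string &root, const std::string &path) {
  if (!path.empty() && path[0] == '/')
    return path;
  if (root.empty())
    return path;
  return root.back() == '/' ? root + path : root + "/" + path;
}

// Hypothetical helper modelling --debugger-use-relative-paths: a path under
// the root is made relative to it before being handed to the debugger.
std::string toDebuggerPath(const std::string &root, const std::string &path) {
  if (root.empty())
    return path;
  const std::string prefix = root.back() == '/' ? root : root + "/";
  if (path.compare(0, prefix.size(), prefix) == 0)
    return path.substr(prefix.size());
  return path; // Not under the root: left unchanged.
}
```

With a root of `/proj`, `resolveCommandPath` maps `src/test.cpp` to `/proj/src/test.cpp`, and `toDebuggerPath` maps it back to `src/test.cpp`.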
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D100307
Roman Lebedev [Tue, 25 May 2021 12:00:20 +0000 (15:00 +0300)]
[LoopIdiom] Support 'left-shift until zero' idiom
This adds support for the "count active bits" pattern, i.e.:
```
int countBits(unsigned val) {
  int cnt = 0;
  for( ; (val << cnt) != 0; ++cnt)
    ;
  return cnt;
}
```
but a somewhat more general one:
```
int countBits(unsigned val, int start, int off) {
  int cnt;
  for (cnt = start; val << (cnt + off); cnt++)
    ;
  return cnt;
}
```
alive2 is happy with all the tests there.
Note that, again, much like with the right-shift cases,
we don't require the `val != 0` guard.
This is the last pattern that was supported by
`detectShiftUntilZeroIdiom()`, which now becomes obsolete.
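As a rough illustration of why such a loop can be replaced by straight-line code, here is a sketch of the simple (no `start`/`off`) form next to an equivalent closed form. It assumes a 32-bit value; the function names are hypothetical, and the closed form is plain arithmetic, not necessarily the exact sequence the pass emits.

```cpp
#include <cassert>
#include <cstdint>

// Reference loop. The shift is done in 64 bits and truncated back to 32 bits
// so that shifting by 32 stays well-defined in this sketch.
int countBitsLoop(uint32_t val) {
  int cnt = 0;
  while (static_cast<uint32_t>(static_cast<uint64_t>(val) << cnt) != 0)
    ++cnt;
  return cnt;
}

// Portable count-trailing-zeros; requires v != 0.
int countTrailingZeros32(uint32_t v) {
  int n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Closed form: the value becomes zero once its lowest set bit is shifted out.
int countBitsClosedForm(uint32_t val) {
  return val == 0 ? 0 : 32 - countTrailingZeros32(val);
}
```

Both functions agree for every 32-bit input, including `val == 0`, which matches the note that no `val != 0` guard is needed.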
Roman Lebedev [Tue, 25 May 2021 11:58:44 +0000 (14:58 +0300)]
[NFC][LoopIdiom] Add tests for 'left-shift until zero' idiom
Pushpinder Singh [Tue, 25 May 2021 11:13:46 +0000 (11:13 +0000)]
[AMDGPU][Libomptarget] Mark lambda_by_value test as XFAIL
Reason: Missing printf definition
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D103078
Bradley Smith [Tue, 18 May 2021 12:49:27 +0000 (13:49 +0100)]
[AArch64][SVE] Add fixed length codegen for FP_TO_{S,U}INT/{S,U}INT_TO_FP
Depends on D102607
Differential Revision: https://reviews.llvm.org/D102777
Tom Weaver [Tue, 25 May 2021 11:47:16 +0000 (12:47 +0100)]
[Dexter] Add DexDeclareFile command to Dexter
DexDeclareFile allows test producers to write test files with .dex extensions
that contain pure dexter commands.
.dex file commands do not need to be commented out like they do when written
inline within test source files.
DexDeclareFile commands are declarative in behaviour: they state that any
Dexter command seen from this point on will have its path attribute set to the
path declared in the DexDeclareFile command.
Differential Revision: https://reviews.llvm.org/D99651
Raphael Isemann [Tue, 25 May 2021 11:27:38 +0000 (13:27 +0200)]
[lldb] Fix that LLDB doesn't print NaN's sign on Darwin
It seems std::ostringstream ignores NaN signs on Darwin while it prints them on
Linux. This causes LLDB to behave differently on those platforms, which is
both confusing for users and means we have to deal with that in our
tests.
This patch manually implements the NaN/Inf printing (which are apparently
implementation defined) to make LLDB print the same thing on all platforms. The
only output difference in practice seems to be that we now print negative NaNs
as `-nan`, but this potentially also changes the output on other systems I
haven't tested this on.
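A minimal sketch of the idea, not LLDB's actual code: handle NaN and infinity by hand and only fall back to the stream for finite values. The function name `formatDouble` is hypothetical, and the exact spellings for infinity (`inf`/`-inf`) are an assumption; only `-nan` is stated above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Formats a double so that the sign of NaN/Inf does not depend on the
// platform's std::ostringstream behaviour.
std::string formatDouble(double value) {
  if (std::isnan(value))
    return std::signbit(value) ? "-nan" : "nan";
  if (std::isinf(value))
    return std::signbit(value) ? "-inf" : "inf";
  std::ostringstream os;
  os << value; // Finite values still go through the stream.
  return os.str();
}
```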
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D102845
Roman Lebedev [Tue, 25 May 2021 11:06:00 +0000 (14:06 +0300)]
[LoopIdiom] Support 'arithmetic right-shift until zero' idiom
This adds support for the "count active bits" pattern, i.e.:
```
int countActiveBits(signed val) {
  int cnt = 0;
  for( ; (val >> cnt) != 0; ++cnt)
    ;
  return cnt;
}
```
but a somewhat more general one:
```
int countActiveBits(signed val, int start, int off) {
  int cnt;
  for (cnt = start; val >> (cnt + off); cnt++)
    ;
  return cnt;
}
```
This directly matches the existing 'logical right-shift until zero' idiom.
alive2 is happy with all the tests there.
Note that, again, much like with the original unsigned case,
we don't require the `val != 0` guard.
The old `detectShiftUntilZeroIdiom()` already supports this pattern,
the idea here is that the `val` must be positive (have at least one
leading zero), because otherwise the loop is non-terminating,
but since it is not `while(1)`, that would have been UB.
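For a non-negative input, the simple form of this loop just counts the active bits, which is the bit width minus the number of leading zeros. The sketch below assumes a 32-bit `val >= 0`; the function names are hypothetical, and the closed form is an arithmetic illustration rather than the exact code the pass produces.

```cpp
#include <cassert>
#include <cstdint>

// Reference loop; requires val >= 0, otherwise it would not terminate.
int countActiveBitsLoop(int32_t val) {
  int cnt = 0;
  while ((val >> cnt) != 0)
    ++cnt;
  return cnt;
}

// Portable count-leading-zeros; requires v != 0.
int countLeadingZeros32(uint32_t v) {
  int n = 0;
  while ((v & 0x80000000u) == 0) {
    v <<= 1;
    ++n;
  }
  return n;
}

// Closed form: the position of the highest set bit, plus one.
int countActiveBitsClosedForm(int32_t val) {
  return val == 0 ? 0 : 32 - countLeadingZeros32(static_cast<uint32_t>(val));
}
```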
Roman Lebedev [Tue, 25 May 2021 11:04:04 +0000 (14:04 +0300)]
[NFC][LoopIdiom] Add tests for 'arithmetic right-shift until zero' idiom
Raphael Isemann [Tue, 25 May 2021 11:11:45 +0000 (13:11 +0200)]
[lldb][NFC] Remove misleading ModulePass base class for IRForTarget
IRForTarget is never used by a pass manager or any other interface that requires
this class to inherit from `Pass`.
Also IRForTarget doesn't implement the current interface correctly because it
uses the `runOnModule` return value to indicate success/failure instead of
changed/not-changed, so if this ever ends up being used as a pass it would most
likely not work as intended.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D102677
Raphael Isemann [Tue, 25 May 2021 11:09:45 +0000 (13:09 +0200)]
[lldb] X-FAIL TestCPPStaticMembers on Windows
This originally failed because of llvm.org/pr21765, which describes that
LLDB can't call a debuggee's functions, but I removed the (unnecessary)
function call in the rewrite. It seems that the actual bug here is that we
can't look up static members at all, so let's X-FAIL the test for the right
reason.
Marco Elver [Tue, 25 May 2021 10:29:00 +0000 (12:29 +0200)]
[SanitizeCoverage] Add support for NoSanitizeCoverage function attribute
We really ought to support no_sanitize("coverage") in line with other
sanitizers. This came up again in discussions on the Linux-kernel
mailing lists, because we currently do workarounds using objtool to
remove coverage instrumentation. Since that support is only on x86, to
continue supporting coverage instrumentation on other architectures, we
must support selectively disabling coverage instrumentation via function
attributes.
Unfortunately, for SanitizeCoverage, it has not been implemented as a
sanitizer via fsanitize= and associated options in Sanitizers.def, but
rolls its own option fsanitize-coverage. This meant that we never got
"automatic" no_sanitize attribute support.
Implement no_sanitize attribute support by special-casing the string
"coverage" in the NoSanitizeAttr implementation. To keep the feature as
unintrusive to existing IR generation as possible, define a new negative
function attribute NoSanitizeCoverage to propagate the information
through to the instrumentation pass.
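A usage sketch of the attribute from the source side. The macro name `NO_SANITIZE_COVERAGE` and both functions are hypothetical; the attribute is only spelled out for Clang here, on the assumption that other compilers may not accept the `"coverage"` argument.

```cpp
#include <cassert>

#if defined(__clang__)
#define NO_SANITIZE_COVERAGE __attribute__((no_sanitize("coverage")))
#else
#define NO_SANITIZE_COVERAGE // Assumed unsupported elsewhere: expands to nothing.
#endif

// Opted out of coverage instrumentation when built with -fsanitize-coverage=...
NO_SANITIZE_COVERAGE int readCounter(const int *counter) { return *counter; }

// Instrumented as usual.
int bumpCounter(int *counter) { return ++*counter; }
```

The attribute changes only whether instrumentation is inserted; the behaviour of the function itself is unchanged.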
Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035
Reviewed By: vitalybuka, morehouse
Differential Revision: https://reviews.llvm.org/D102772
Marco Elver [Tue, 25 May 2021 10:28:50 +0000 (12:28 +0200)]
[NFC][SanitizeCoverage] Test always_inline functions work
Test that always_inline functions are instrumented as expected.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102929
Marco Elver [Tue, 25 May 2021 10:28:36 +0000 (12:28 +0200)]
[NFC][CodeGenOptions] Refactor checking SanitizeCoverage options
Refactor checking SanitizeCoverage options into
CodeGenOptions::hasSanitizeCoverage().
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102927
Simon Pilgrim [Tue, 25 May 2021 10:32:19 +0000 (11:32 +0100)]
Fix MSVC "truncation of constant value" warning. NFCI.
Simon Pilgrim [Mon, 24 May 2021 17:26:16 +0000 (18:26 +0100)]
[CostModel][X86] Improve accuracy of vXi8/vXi16 vector non-uniform shift costs on AVX2/AVX512 targets
Determined from llvm-mca analysis, AVX2+ capable targets have a higher throughput for VPBLENDVB and VPMOVZX ops, making it cheaper to perform shift+select patterns for vXi8 shifts or extend/shift/truncate for vXi16 shifts. Similarly AVX512BW can perform vXi8 as extend/shift/truncate patterns.
Christudasan Devadasan [Tue, 25 May 2021 10:17:42 +0000 (15:47 +0530)]
[AMDGPU] Remove dead declaration (NFC).
Vinayaka Bandishti [Tue, 25 May 2021 09:49:15 +0000 (15:19 +0530)]
[MLIR][Affine][LICM] Mark users of `iter_args` variant
Prevent users of `iter_args` of an affine for loop from being hoisted
out of it. Otherwise, LICM leads to a violation of the SSA dominance
(as demonstrated in the added test case).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=50103
Reviewed By: bondhugula, ayzhuang
Differential Revision: https://reviews.llvm.org/D102984
Tres Popp [Tue, 25 May 2021 09:35:14 +0000 (11:35 +0200)]
[mlir] Fold memref.dim of OffsetSizeAndStrideOpInterface outputs
This previously handled memref::SubviewOp, but this can be extended to
all ops implementing the interface.
Differential Revision: https://reviews.llvm.org/D103076
Florian Hahn [Tue, 25 May 2021 09:50:08 +0000 (10:50 +0100)]
[AArch64] Add tests for lowering of vector load + single extract.
Currently the vector load + extract gets lowered to a single scalar
store, not accounting for the fact that the index could be
out-of-bounds, which is poison, not UB.
See PR50382.
Raphael Isemann [Tue, 25 May 2021 09:53:30 +0000 (11:53 +0200)]
[lldb] Disable minimal import mode for RecordDecls that back FieldDecls
Clang adds a Decl in two phases to a DeclContext. First it adds it invisible and
then it makes it visible (which will add it to the lookup data structures). It's
important that we can't do lookups into the DeclContext we are currently adding
the Decl to during this process as once the Decl has been added, any lookup will
automatically build a new lookup map and add the added Decl to it. The second
step would then add the Decl a second time to the lookup which will lead to
weird errors later on. I made adding a Decl twice to a lookup an assertion
error in D84827.
In the first step Clang also does some computations on the added Decl if it's
for example a FieldDecl that is added to a RecordDecl.
One of these computations is checking if the FieldDecl is of a record type
and the record type has a deleted constexpr destructor which will delete
the constexpr destructor of the record that got the FieldDecl.
This can lead to a bug with the way we implement MinimalImport in LLDB
and the following code:
```
struct Outer {
  typedef int HookToOuter;
  struct NestedClass {
    HookToOuter RefToOuter;
  } NestedClassMember; // We are adding this.
};
```
1. We just imported `Outer` minimally so far.
2. We are now asked to add `NestedClassMember` as a FieldDecl.
3. We import `NestedClass` minimally.
4. We add `NestedClassMember` and clang does a lookup for a constexpr dtor in
`NestedClass`. `NestedClassMember` hasn't been added to the lookup.
5. The lookup into `NestedClass` will now load the members of `NestedClass`.
6. We try to import the type of `RefToOuter` which will try to import the `HookToOuter` typedef.
7. We import the typedef and while importing we check for conflicts in `Outer` via a lookup.
8. The lookup into `Outer` will cause the invisible `NestedClassMember` to be added to the lookup.
9. We continue normally until we get back to the `addDecl` call in step 2.
10. We now add `NestedClassMember` to the lookup even though we already did that in step 8.
The fix here is disabling the minimal import for RecordTypes from FieldDecls. We
actually already did this, but so far we only force the definition of the type
to be imported *after* we imported the FieldDecl. This just moves that code
*before* we import the FieldDecl to prevent the issue above.
Reviewed By: shafik, aprantl
Differential Revision: https://reviews.llvm.org/D102993
Raphael Isemann [Tue, 25 May 2021 09:43:24 +0000 (11:43 +0200)]
[lldb] Re-enable and rewrite TestCPPStaticMembers
It's not clear why the whole test got disabled, but the linked bug report
has since been fixed and the only part of it that still fails is the test
for the too permissive lookup. This re-enables the test, rewrites it to use
the modern test functions we have and splits the failing part into its
own test that we can skip without disabling the rest.
Stanislav Mekhanoshin [Mon, 24 May 2021 21:55:49 +0000 (14:55 -0700)]
[IR] Allow Value::replaceUsesWithIf() to process constants
The change is currently NFC, but exploited by the depending D102954.
Code to handle constants is borrowed from the general implementation
of Value::doRAUW().
Differential Revision: https://reviews.llvm.org/D103051
Roman Lebedev [Tue, 25 May 2021 08:48:43 +0000 (11:48 +0300)]
[llvm-exegesis] Loop unrolling for loop snippet repetitor mode
I really needed this, like, factually, yesterday,
when verifying dependency breaking idioms for AMD Zen 3 scheduler model.
Consider the following example:
```
$ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=duplicate
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-4a7e50.o
---
mode: inverse_throughput
key:
instructions:
- 'VPXORYrr YMM0 YMM0 YMM0'
config: ''
register_initial_values: []
cpu_name: znver3
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
- { key: inverse_throughput, value: 0.31025, per_snippet_value: 0.31025 }
error: ''
info: ''
assembled_snippet: C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C3
...
```
What does it tell us?
So wait, it can only execute ~3 x86 AVX YMM PXOR zero-idioms per cycle?
That doesn't seem right. That's even less than there are pipes supporting this type of op.
Now, second example:
```
$ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2418b5.o
---
mode: inverse_throughput
key:
instructions:
- 'VPXORYrr YMM0 YMM0 YMM0'
config: ''
register_initial_values: []
cpu_name: znver3
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
- { key: inverse_throughput, value: 1.00011, per_snippet_value: 1.00011 }
error: ''
info: ''
assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3
...
```
Now that's just worse. Due to the looping, the throughput completely plummeted,
and now we can only do a single instruction/cycle!?
That's not great.
And final example:
```
$ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop --loop-body-size=1000
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c402e2.o
---
mode: inverse_throughput
key:
instructions:
- 'VPXORYrr YMM0 YMM0 YMM0'
config: ''
register_initial_values: []
cpu_name: znver3
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
- { key: inverse_throughput, value: 0.167087, per_snippet_value: 0.167087 }
error: ''
info: ''
assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3
...
```
So if we merge the previous two approaches, duplicate this single-instruction snippet 1000x
(loop-body-size/instruction count in snippet), and run a loop with 1000 iterations
over that duplicated/unrolled snippet, the measured throughput goes through the roof,
up to 5.9 instructions/cycle, which finally tells us that this idiom is zero-cycle!
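The instructions/cycle figures quoted above are simply the reciprocals of the measured `inverse_throughput` values. A tiny sketch of that arithmetic (the function name is hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Converts a measured inverse throughput (cycles per instruction) into
// instructions per cycle.
double instructionsPerCycle(double inverseThroughput) {
  return 1.0 / inverseThroughput;
}
```

Applied to the three measurements: 0.31025 gives roughly 3.2 (duplicate mode), 1.00011 gives roughly 1.0 (loop mode), and 0.167087 gives just under 6 (loop mode with `--loop-body-size=1000`).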
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D102522
Kristina Bessonova [Tue, 25 May 2021 08:59:38 +0000 (10:59 +0200)]
[ARM][NEON] Combine base address updates for vld1x intrinsics
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D102855
David Spickett [Mon, 24 May 2021 14:00:08 +0000 (14:00 +0000)]
[clang][ARM] Remove non-existent arm9312 CPU
I cannot find documentation on this CPU, and it
is not supported by the Arm Compiler 5 product either.
It was likely a mistake or a different name for the
"ep9312", which is an Arm based Cirrus Logic chip.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D103024
David Spickett [Mon, 24 May 2021 13:52:00 +0000 (13:52 +0000)]
[llvm][ARM] Remove non-existent arm1176j-s CPU
This was removed in https://reviews.llvm.org/D52594 for clang.
The one test using it has been updated to use the mpcore
CPU as the linked clang change does.
This is part of fixing https://bugs.llvm.org/show_bug.cgi?id=50454.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D103022
Benjamin Kramer [Tue, 25 May 2021 08:55:00 +0000 (10:55 +0200)]
[GlobalISel] Silence unused variable warning in Release builds. NFC.
David Spickett [Mon, 24 May 2021 13:33:08 +0000 (13:33 +0000)]
[clang][ARM] Remove non-existent arm1136jz-s CPU
There is an ARM1136JF-S and an ARM1136J-S but I could find
no references to an ARM1136JZ-S in CPU manuals or the manual
for Arm Compiler 5.
See:
https://developer.arm.com/documentation/ddi0211/latest/
https://developer.arm.com/documentation/dui0472/latest/
Using this CPU you get:
$ ./bin/clang --target=arm-linux-gnueabihf -march=armv3m -mcpu=arm1136jz-s -c /tmp/test.c -o /tmp/test.o
'arm1136jz-s' is not a recognized processor for this target (ignoring processor)
This is because the llvm target does not know what it is.
This is part of fixing https://bugs.llvm.org/show_bug.cgi?id=50454.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D103019
Matthias Springer [Tue, 25 May 2021 08:42:49 +0000 (17:42 +0900)]
[mlir] Check only last dim stride in transfer op lowering
Lower a 1D vector transfer op to LLVM if the last dim stride is 1. Also fixes a bug in the original unit stride computation.
Differential Revision: https://reviews.llvm.org/D102897
Alexey Lapshin [Fri, 26 Mar 2021 16:16:26 +0000 (19:16 +0300)]
[TRE] Reland: allow TRE for non-capturing calls.
D82085, "allow TRE for non-capturing calls", caused a failure during bootstrap.
This patch does the same as D82085 and also fixes the bootstrap error.
The problem with D82085 is that it does not create copies for byval
operands when replacing a function call with a branch.
Consider the following example:
```
int zoo ( S p1 );
int foo ( int count, S p1 ) {
  if ( count > 10 )
    return zoo(p1);
  // temporary variable created for passing byvalue parameter
  // p1 could be used when zoo(p1) is called (after TRE is done).
  // lifetime.start p1.byvalue.temp
  return foo(count+1, p1);
  // lifetime.end p1.byvalue.temp
}
```
After the recursive call to foo is replaced with a jump to the
start of the function, its parameters could be passed to the
zoo function, i.e. the temporary variable created for the byvalue
parameter "p1" could be passed to zoo. Finally, zoo receives a
broken operand:
```
int foo ( int count, S p1 ) {
:tailrecurse
  p1_tr = phi p1, p1.byvalue.temp
  if ( count > 10 )
    return zoo(p1_tr);
  // temporary variable created for passing byvalue parameter
  // p1 could be used when zoo(p1) is called (after TRE is done).
  lifetime.start p1.byvalue.temp
  memcpy (p1.byvalue.temp, p1_tr)
  count = count + 1
  lifetime.end p1.byvalue.temp
  br tailrecurse
}
```
To prevent p1.byvalue.temp from being used after its scope is ended by the
lifetime.end marker, this patch copies the value from p1.byvalue.temp
into another temporary variable and then copies this variable
into the input parameter for the next iteration.
This patch passes bootstrap build and bootstrap build with AddressSanitizer.
Differential Revision: https://reviews.llvm.org/D85614
Jon Chesterfield [Tue, 25 May 2021 08:29:10 +0000 (09:29 +0100)]
[libomptarget][nfc] Accept callable for hsa iterate_symbols
Candidate refactor to simplify D102692
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D103030
Ella Ma [Tue, 25 May 2021 07:42:16 +0000 (09:42 +0200)]
[analyzer][ctu] Reland "Avoid parsing invocation list again and again..
..during on-demand parsing of CTU"
During CTU, the *on-demand parsing* will read and parse the invocation
list to know how to compile the file being imported. However, it seems
that the invocation list will be parsed again if a previous parsing
has failed.
Then, parse again and fail again. This patch tries to overcome the
problem by storing the error code during the first parsing, and
re-create the stored error during the later parsings.
Reland without test.
Reviewed By: steakhal
Patch By: OikawaKirie!
Differential Revision: https://reviews.llvm.org/D101763
Amara Emerson [Mon, 24 May 2021 22:07:00 +0000 (15:07 -0700)]
[GlobalISel] Fix MachineIRBuilder not using the DstOp argument for G_SHUFFLE_VECTOR.
Balazs Benics [Tue, 25 May 2021 07:28:58 +0000 (09:28 +0200)]
Revert "[analyzer][ctu] Avoid parsing invocation list again and again during on-demand parsing of CTU"
This reverts commit
db8af0f21dc9aad4d336754c857c24470afe53e3.
clang-x86_64-debian-fast fails on this.
+ : 'RUN: at line 4'
+ /usr/bin/ccache
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/Analysis/ctu-on-demand-parsing-multiple-invocation-list-parsing.cpp
-fPIC -shared -o
/b/1/clang-x86_64-debian-fast/llvm.obj/tools/clang/test/Analysis/Output/ctu-on-demand-parsing-multiple-invocation-list-parsing.cpp.tmp/mock_open.so
ccache: error: execv of
/b/1/clang-x86_64-debian-fast/llvm.src/clang/test/Analysis/ctu-on-demand-parsing-multiple-invocation-list-parsing.cpp
failed: Permission denied
Ella Ma [Tue, 25 May 2021 07:19:14 +0000 (09:19 +0200)]
[analyzer][ctu] Avoid parsing invocation list again and again during on-demand parsing of CTU
During CTU, the *on-demand parsing* will read and parse the invocation
list to know how to compile the file being imported. However, it seems
that the invocation list will be parsed again if a previous parsing
has failed.
Then, parse again and fail again. This patch tries to overcome the
problem by storing the error code during the first parsing, and
re-create the stored error during the later parsings.
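The caching idea can be sketched like this. The class and member names are hypothetical stand-ins, not the analyzer's real types, and the placeholder `parse()` always fails so that the failure path is the one being exercised.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical stand-in for the invocation-list loader.
class InvocationListCache {
public:
  // Runs the expensive parse at most once, even if it fails, and replays
  // the stored result on every later call.
  std::error_code load() {
    if (!Attempted) {
      Attempted = true;
      ++ParseCount;
      Result = parse();
    }
    return Result;
  }

  int parseCount() const { return ParseCount; }

private:
  // Placeholder for the real parsing step; assumed to fail here.
  std::error_code parse() {
    return std::make_error_code(std::errc::invalid_argument);
  }

  bool Attempted = false;
  int ParseCount = 0;
  std::error_code Result;
};
```

Calling `load()` repeatedly returns the same error each time while `parseCount()` stays at 1.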
Reviewed By: steakhal
Patch By: OikawaKirie!
Differential Revision: https://reviews.llvm.org/D101763
Ben Shi [Tue, 25 May 2021 06:14:09 +0000 (14:14 +0800)]
[RISCV] Optimize xor/or with immediate in the zbs extension
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102893
Lang Hames [Tue, 25 May 2021 05:56:17 +0000 (22:56 -0700)]
[JITLink] Suppress expect-death test in release mode.
Max Kazantsev [Tue, 25 May 2021 05:22:41 +0000 (12:22 +0700)]
[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration
This patch handles one particular case of one-iteration loops for which SCEV
cannot straightforwardly prove BECount = 1. The idea of the optimization is to
symbolically execute conditional branches on the 1st iteration, moving in topological
order, and only visiting blocks that may be reached on the first iteration. If we find out
that we never reach header via the latch, then the backedge can be broken.
Differential Revision: https://reviews.llvm.org/D102615
Reviewed By: reames
Max Kazantsev [Tue, 25 May 2021 05:10:31 +0000 (12:10 +0700)]
[Test] Add test for unreachable backedge with duplicating predecessors
Christudasan Devadasan [Mon, 12 Apr 2021 10:19:47 +0000 (15:49 +0530)]
AMDGPU/GlobalISel: Legalize G_[SU]DIVREM instructions
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D100726
Lang Hames [Tue, 25 May 2021 03:19:32 +0000 (20:19 -0700)]
[JITLink] Enable creation and management of mutable block content.
This patch introduces new operations on jitlink::Blocks: setMutableContent,
getMutableContent and getAlreadyMutableContent. The setMutableContent method
will set the block content data and size members and flag the content as
mutable. The getMutableContent method will return a mutable copy of the existing
content value, auto-allocating and populating a new mutable copy if the existing
content is marked immutable. The getAlreadyMutableContent method asserts that the
existing content is already mutable and returns it.
setMutableContent should be used when updating the block with totally new
content backed by mutable memory. It can be used to change the size of the
block. The argument value should *not* be shared with any other block.
getMutableContent should be used when clients want to modify the existing
content and are unsure whether it is mutable yet.
getAlreadyMutableContent should be used when clients want to modify the existing
content and know from context that it must already be mutable.
These operations reduce copy-modify-update boilerplate and unnecessary copies
introduced when clients couldn't be sure whether the existing content was
mutable or not.
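To make the copy-on-demand behaviour concrete, here is a toy model of the three operations. `ToyBlock` is a hypothetical simplification, not `jitlink::Block`: in particular, it owns the mutable copy itself, which is an assumption made to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ToyBlock {
public:
  // Starts out viewing immutable content owned by someone else.
  ToyBlock(const char *data, std::size_t size) : Data(data), Size(size) {}
  ToyBlock(const ToyBlock &) = delete;
  ToyBlock &operator=(const ToyBlock &) = delete;

  // Installs totally new content backed by caller-provided mutable memory.
  void setMutableContent(char *data, std::size_t size) {
    Data = data;
    Size = size;
    Mutable = true;
  }

  // Returns mutable content, copying the existing bytes first if needed.
  char *getMutableContent() {
    if (!Mutable) {
      Owned.assign(Data, Data + Size);
      Data = Owned.data();
      Mutable = true;
    }
    return const_cast<char *>(Data);
  }

  // For callers that know from context the content is already mutable.
  char *getAlreadyMutableContent() {
    assert(Mutable && "content is not mutable yet");
    return const_cast<char *>(Data);
  }

  const char *getContent() const { return Data; }
  std::size_t getSize() const { return Size; }

private:
  const char *Data;
  std::size_t Size;
  bool Mutable = false;
  std::vector<char> Owned; // Backing store for the on-demand copy.
};
```

The first `getMutableContent()` call copies; later calls, and `getAlreadyMutableContent()`, return the same buffer without copying again.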
Min-Yih Hsu [Fri, 21 May 2021 22:15:11 +0000 (15:15 -0700)]
[cfe] Support target-specific escaped character in inline asm
GCC allows each target to define a set of non-letter and non-digit
escaped characters for inline assembly that will be replaced by another
string (they call these "punctuation" characters. The existing "%%" and
"%{" -- replaced by '%' and '{' at the end -- can be seen as special
cases shared by all targets).
This patch implements this feature by adding a new hook in `TargetInfo`.
Differential Revision: https://reviews.llvm.org/D103036
Logan Smith [Tue, 25 May 2021 04:13:30 +0000 (21:13 -0700)]
[Sema] Always search the full function scope context if a potential availability violation is encountered
This fixes both https://bugs.llvm.org/show_bug.cgi?id=50309 and https://bugs.llvm.org/show_bug.cgi?id=50310.
Previously, lambdas inside functions would mark their own bodies for later analysis when encountering a potentially unavailable decl, without taking into consideration that the entire lambda itself might be correctly guarded inside an @available check. The same applied to inner class member functions. Blocks happened to work as expected already, since Sema::getEnclosingFunction() skips through block scopes.
This patch instead simply and conservatively marks the entire outermost function scope for search, and removes some special-case logic that prevented DiagnoseUnguardedAvailabilityViolations from traversing down into lambdas and nested functions. This correctly accounts for arbitrarily nested lambdas, inner classes, and blocks that may be inside appropriate @available checks at any ancestor level. It also treats all potential availability violations inside functions consistently, without being overly sensitive to the current DeclContext, which previously caused issues where e.g. nested struct members were warned about twice.
DiagnoseUnguardedAvailabilityViolations now has more work to do in some cases, particularly in functions with many (possibly deeply) nested lambdas and classes, but the big-O is the same, and the simplicity of the approach and the fact that it fixes at least two bugs feels like a strong win.
Differential Revision: https://reviews.llvm.org/D102338
Nathan Lanza [Tue, 16 Mar 2021 08:33:50 +0000 (04:33 -0400)]
[lld:elf] Weaken the requirement for a computed binding to be STB_LOCAL
Given the following scenario:
```
// Cat.cpp
struct Animal { virtual void makeNoise() const = 0; };
struct Cat : Animal { void makeNoise() const override; };
extern "C" int puts(char const *);
void Cat::makeNoise() const { puts("Meow"); }
void doThingWithCat(Animal *a) { static_cast<Cat *>(a)->makeNoise(); }
// CatUser.cpp
struct Animal { virtual void makeNoise() const = 0; };
struct Cat : Animal { void makeNoise() const override; };
void doThingWithCat(Animal *a);
void useDoThingWithCat() {
Cat *d = new Cat;
doThingWithCat(d);
}
// cat.ver
{
global: _Z17useDoThingWithCatv;
local: *;
};
$ clang++ Cat.cpp CatUser.cpp -fpic -flto=thin -fwhole-program-vtables
-shared -O3 -fuse-ld=lld -Wl,--lto-whole-program-visibility
-Wl,--version-script,cat.ver
```
We cannot devirtualize `Cat::makeNoise`. The issue is complex:
Due to `-fsplit-lto-unit` and usage of type metadata, we place the Cat
vtable declaration into module 0 and the Cat vtable definition with type
metadata into module 1, causing duplicate entries (Undefined followed by
Defined) in the `lto::InputFile::symbols()` output.
In `BitcodeFile::parse`, after processing the `Undefined` then the
`Defined`, the final state is `Defined`.
In `BitcodeCompiler::add`, for the first symbol, `computeBinding`
returns `STB_LOCAL`, then we reset it to `Undefined` because it is
prevailing (`versionId` is `preserved`). For the second symbol, because
the state is now `Undefined`, `computeBinding` returns `STB_GLOBAL`,
causing `ExportDynamic` to be true and suppressing devirtualization.
In D77280, the `computeBinding` change used a stricter `isDefined()`
condition to make weak `Lazy` symbols work.
This patch relaxes the condition to weaker `!isLazy()` to keep it
working while making the devirtualization work as well.
Differential Revision: https://reviews.llvm.org/D98686
Arthur Eubanks [Tue, 25 May 2021 01:44:14 +0000 (18:44 -0700)]
Making Instrumentation aware of LoopNest Pass
Instrumentation callbacks are not made aware of LoopNest passes. From the loop pass manager, we can pass the outermost loop of the LoopNest to instrumentation in case of LoopNest passes.
The current patch made the change in two places in StandardInstrumentation.cpp. I will submit a proper patch where the OuterMostLoop is passed from the LoopPassManager to the callbacks. That way we will avoid making changes at multiple places in StandardInstrumentation.cpp.
A test case will also be submitted.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102463
maekawatoshiki [Tue, 25 May 2021 02:39:49 +0000 (11:39 +0900)]
Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass"
This reverts commit
d65c32fb41b03a35a2a16330ba1ea15cf6818f04.
Dhruva Chakrabarti [Mon, 24 May 2021 23:35:29 +0000 (16:35 -0700)]
[libomptarget] [amdgpu] Added LDS usage to the kernel trace
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D103059
Nico Weber [Tue, 25 May 2021 01:22:07 +0000 (21:22 -0400)]
Revert "Do not create LLVM IR `constant`s for objects with dynamic initialisation"
This reverts commit
13dd65b3a1a3ac049b5f3a9712059f7c61649bea.
Breaks check-clang on macOS, see https://reviews.llvm.org/D102693
Vitaly Buka [Mon, 24 May 2021 20:12:47 +0000 (13:12 -0700)]
[NFC][scudo] Add parameters DCHECKs
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D103042
David Blaikie [Mon, 24 May 2021 23:51:31 +0000 (16:51 -0700)]
lld-coff: Simplify a few lambda uses after
7975dd033cb9
David Blaikie [Mon, 24 May 2021 23:48:41 +0000 (16:48 -0700)]
Add a range-based wrapper for std::unique(begin, end, binary_predicate)
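The wrapper itself is not shown in this log. A minimal sketch of what such a helper could look like, under the hypothetical name `uniqueRange` (the name and signature actually added may differ):

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Forwards a whole range to std::unique so callers do not have to spell
// out begin/end themselves.
template <typename Range, typename Predicate>
auto uniqueRange(Range &range, Predicate pred) -> decltype(std::begin(range)) {
  return std::unique(std::begin(range), std::end(range), pred);
}
```

As with `std::unique`, the returned iterator marks the new logical end, so callers typically follow up with `erase`.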
Vitaly Buka [Tue, 25 May 2021 00:14:17 +0000 (17:14 -0700)]
[NFC][OMP] Fix 'unused' warning
Vitaly Buka [Tue, 25 May 2021 00:13:29 +0000 (17:13 -0700)]
[NFC][scudo] Avoid cast in test
Jonas Devlieghere [Mon, 24 May 2021 23:24:16 +0000 (16:24 -0700)]
[dsymutil] Emit an error when the Mach-O exceeds the 4GB limit.
The Mach-O object file format is limited to 4GB because of its use of
32-bit offsets in the header. It is possible for dsymutil to (silently)
emit an invalid binary. Rather than having consumers deal with this, emit
an error.
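The limit follows directly from the offset width. A sketch of the kind of check involved, with a hypothetical function name; the exact condition dsymutil tests is an assumption here:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if every byte of an output of this size can still be addressed
// with a 32-bit offset.
bool fitsIn32BitOffsets(uint64_t sizeInBytes) {
  return sizeInBytes <= std::numeric_limits<uint32_t>::max();
}
```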
Jonas Devlieghere [Mon, 24 May 2021 21:55:52 +0000 (14:55 -0700)]
[dsymutil] Use EXIT_SUCCESS and EXIT_FAILURE (NFC)
Jonas Devlieghere [Mon, 24 May 2021 21:49:14 +0000 (14:49 -0700)]
[dsymutil] Compute the output location once per input file (NFC)
Compute the location of the output file just once outside the loop over
the different architectures.
Richard Smith [Mon, 24 May 2021 23:06:28 +0000 (16:06 -0700)]
PR50456: Properly handle multiple escaped newlines in a '*/'.
Mitch Phillips [Mon, 24 May 2021 23:08:57 +0000 (16:08 -0700)]
[scudo] Add unmapTestOnly() to secondary.
When trying to track down a vaddr-poisoning bug, I found that the
secondary cache isn't emptied on test teardown. We should probably do
that to make the tests hermetic. Otherwise, repeating the tests lots of
times using --gtest_repeat fails after the mmap vaddr space is
exhausted.
To repro:
$ ninja check-scudo_standalone # build
$ ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoUnitTest-x86_64-Test \
--gtest_filter=ScudoSecondaryTest.*:-ScudoSecondaryTest.SecondaryCombinations \
--gtest_repeat=10000
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D102874
River Riddle [Mon, 24 May 2021 22:56:22 +0000 (15:56 -0700)]
[mlir-opt] Don't enable `printOpOnDiagnostic` if it was explicitly disabled.
We are currently explicitly setting the flag solely based on the value of `-verify`, which ends up ignoring the situation where the user explicitly disabled this option from the command line.
Differential Revision: https://reviews.llvm.org/D102952
Anton Afanasyev [Tue, 18 May 2021 08:30:03 +0000 (11:30 +0300)]
[SLP] Fix "gathering" of insertelement instructions
In rare exceptional cases a vector tree node (for now only insertelements)
is marked as `NeedToGather`; this patch handles that case. Follow-up
to D98714 to fix the bug reported at https://reviews.llvm.org/D98714#2764135.
Differential Revision: https://reviews.llvm.org/D102675
Hansang Bae [Fri, 21 May 2021 23:13:36 +0000 (18:13 -0500)]
[OpenMP] Fix crashing critical section with hint clause
Runtime was using the default lock type without using the hint.
Differential Revision: https://reviews.llvm.org/D102955
Dhruva Chakrabarti [Sat, 22 May 2021 02:35:03 +0000 (19:35 -0700)]
[libomptarget] [amdgpu] Fix copy-paste error setting NumThreads for a corner case.
Fix the case where NumTeams was set incorrectly instead of NumThreads
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D103037
Alex Langford [Mon, 24 May 2021 22:13:06 +0000 (15:13 -0700)]
[lldb][NFC] Remove unused header from Target
Should have been removed with 4c0b0de904a5622c33e3ed97e86c6792fbc13feb
but I forgot to do so.
thomasraoux [Mon, 24 May 2021 21:32:56 +0000 (14:32 -0700)]
[mlir] Lower sm version for TensorCore integration tests
Those tests only require sm70; lowering the version allows running those
integration tests on more hardware.
Differential Revision: https://reviews.llvm.org/D103049
Jinsong Ji [Mon, 24 May 2021 21:32:50 +0000 (21:32 +0000)]
[compiler-rt][scudo] Fix sign-compare warnings
Fix buildbot failure
https://lab.llvm.org/buildbot/#/builders/57/builds/6542/steps/6/logs/stdio
/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:1629:28:
error: comparison of integers of different signs: 'const unsigned long'
and 'const int' [-Werror,-Wsign-compare]
GTEST_IMPL_CMP_HELPER_(GT, >);
~~~~~~~~~~~~~~~~~~~~~~~~~~^~
/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:1609:12:
note: expanded from macro 'GTEST_IMPL_CMP_HELPER_'
if (val1 op val2) {\
~~~~ ^ ~~~~
/llvm-project/compiler-rt/lib/scudo/standalone/tests/common_test.cpp:30:3:
note: in instantiation of function template specialization
'testing::internal::CmpHelperGT<unsigned long, int>' requested here
EXPECT_GT(OnStart, 0);
^
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D103029
Arthur O'Dwyer [Wed, 19 May 2021 15:54:31 +0000 (11:54 -0400)]
[libc++] Assume that __wrap_iter always wraps a fancy pointer.
Not only do we conscientiously avoid using `__wrap_iter` for non-contiguous
iterators (in vector, string, span...) but also we make the assumption
(in regex) that `__wrap_iter<_Iter>` is contiguous for all `_Iter`.
So `__wrap_iter<reverse_iterator<int*>>` should be considered IFNDR,
and every `__wrap_iter` should correctly advertise contiguity in C++20.
Drive-by simplify some type traits.
Reviewed as part of https://reviews.llvm.org/D102781
Momchil Velikov [Mon, 24 May 2021 20:40:47 +0000 (21:40 +0100)]
Do not create LLVM IR `constant`s for objects with dynamic initialisation
When a const-qualified object has a section attribute, that
section is set to read-only and clang outputs a LLVM IR constant
for that object. This is incorrect for dynamically initialised
objects.
For example:
int init() { return 15; }
__attribute__((section("SA")))
const int a = init();
a is allocated to a read-only section and is left
uninitialised (zero-initialised).
This patch adds checks for whether an initialiser is a constant expression
and allocates objects to sections as follows:
* const-qualified objects
- no initialiser or constant initialiser: .rodata
- dynamic initialiser: .bss
* non const-qualified objects
- no initialiser or dynamic initialiser: .bss
- constant initialiser: .data
(".rodata", ".data", and ".bss" names used just for explanatory
purpose)
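The allocation rules listed above can be sketched as a small decision function. `InitKind` and `pickSection` are hypothetical names used only for illustration, and the section names are the explanatory placeholders from this message, not what clang actually emits.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of an object's initialiser.
enum class InitKind { None, Constant, Dynamic };

// Mirrors the rules in the list above:
//   const-qualified:     none/constant -> .rodata, dynamic  -> .bss
//   non const-qualified: none/dynamic  -> .bss,    constant -> .data
inline std::string pickSection(bool IsConstQualified, InitKind Init) {
  if (IsConstQualified)
    return Init == InitKind::Dynamic ? ".bss" : ".rodata";
  return Init == InitKind::Constant ? ".data" : ".bss";
}
```

For the `const int a = init();` example, the object is const-qualified with a dynamic initialiser, so under these rules it lands in the writable ".bss"-like section.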
Differential Revision: https://reviews.llvm.org/D102693
Alex Langford [Wed, 19 May 2021 20:54:14 +0000 (13:54 -0700)]
[lldb] Move ClangModulesDeclVendor ownership to ClangPersistentVariables from Target
More decoupling of plugins and non-plugins. Target doesn't need to
manage ClangModulesDeclVendor and ClangPersistentVariables is always available
in situations where you need ClangModulesDeclVendor.
Differential Revision: https://reviews.llvm.org/D102811
Andrzej Warzynski [Mon, 24 May 2021 20:10:11 +0000 (20:10 +0000)]
[flang][cmake] Set the default for FLANG_BUILD_NEW_DRIVER for oot builds
For out-of-tree builds of Flang, FLANG_BUILD_NEW_DRIVER is not inherited
from llvm-project/llvm/CMakeLists.txt. Instead, a separate definition is
required (but only for out-of-tree builds).
Differential Revision: https://reviews.llvm.org/D102323
Hongtao Yu [Mon, 24 May 2021 19:58:25 +0000 (12:58 -0700)]
[NFC][CSSPGO][llvm-profgen] Fix build warning due to an attribute usage.
Chris Lattner [Sun, 23 May 2021 18:24:59 +0000 (11:24 -0700)]
[GreedyPatternRewriter] Introduce a config object that allows controlling internal parameters. NFC.
This exposes the iterations and top-down processing as flags, and also
allows controlling whether region simplification is desirable for a client.
This allows deleting some duplicated entrypoints to
applyPatternsAndFoldGreedily.
This also deletes the Constant Preprocessing pass, which isn't worth it
on balance.
All defaults are kept the same, so no one should see a behavior change.
Differential Revision: https://reviews.llvm.org/D102988
Hongtao Yu [Sat, 22 May 2021 00:44:56 +0000 (17:44 -0700)]
[CSSPGO][llvm-profgen] Report samples for untrackable frames.
Fixing an issue where samples collected for an untrackable frame are not reported. An untrackable frame refers to a frame whose caller is untrackable due to missing debug info or pseudo probe. Though the frame is connected to its parent frame through the frame pointer chain at runtime, the compiler cannot build the connection without debug info or pseudo probe. In such a case we just need to report the untrackable frame as the base frame and all of its child frames.
With more samples reported I'm seeing this improves the performance of an internal benchmark by 2.5%.
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D102961
Nick Desaulniers [Mon, 24 May 2021 19:06:49 +0000 (12:06 -0700)]
fix up test from D102742
In D102742, I mistakenly put the split file designator above a bunch of
CHECK lines, which unintentionally removed the CHECKs from actually
being verified.
This can be verified by observing:
<build dir>/test/CodeGen/X86/Output/stack-protector-3.ll.tmp/main.ll
George [Mon, 24 May 2021 18:52:41 +0000 (11:52 -0700)]
Surface clone APIs in CAPI
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102987
LLVM GN Syncbot [Mon, 24 May 2021 18:48:17 +0000 (18:48 +0000)]
[gn build] Port b510e4cf1b96
Craig Topper [Mon, 24 May 2021 17:25:27 +0000 (10:25 -0700)]
[RISCV] Add a vsetvli insert pass that can be extended to be aware of incoming VL/VTYPE from other basic blocks.
This is a replacement for D101938 for inserting vsetvli
instructions where needed. This new version changes how
we track the information in such a way that we can extend
it to be aware of VL/VTYPE changes in other blocks. Given
how much it changes the previous patch, I've decided to
abandon the previous patch and post this from scratch.
For now the pass consists of a single phase that assumes
the incoming state from other basic blocks is unknown. A
follow up patch will extend this with a phase to collect
information about how VL/VTYPE change in each block and
a second phase to propagate this information to the entire
function. This will be used by a third phase to do the
vsetvli insertion.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D102737
LLVM GN Syncbot [Mon, 24 May 2021 18:36:50 +0000 (18:36 +0000)]
[gn build] Port a64ebb863727
Heejin Ahn [Sun, 23 May 2021 09:09:17 +0000 (02:09 -0700)]
[WebAssembly] Add NullifyDebugValueLists pass
`WebAssemblyDebugValueManager` does not currently handle
`DBG_VALUE_LIST`, which is a recent addition to LLVM. We tried to
nullify them within the constructor of `WebAssemblyDebugValueManager` in
D102589, but it made the class error-prone to use because it deletes
instructions within the constructor and thus invalidates existing
iterators within the BB, so the user of the class should take special
care not to use invalidated iterators. This actually caused a bug in
ExplicitLocals pass.
Instead of trying to fix ExplicitLocals pass to make the iterator usage
correct, which is possible but error-prone, this adds
NullifyDebugValueLists pass that nullifies all `DBG_VALUE_LIST`
instructions before we run WebAssembly specific passes in the backend.
We can remove this pass after we implement handlers for
`DBG_VALUE_LIST`s in `WebAssemblyDebugValueManager` and elsewhere.
Fixes https://github.com/emscripten-core/emscripten/issues/14255.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D102999
George Balatsouras [Fri, 21 May 2021 17:56:45 +0000 (10:56 -0700)]
[dfsan] Add function that prints origin stack trace to buffer
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D102451
Artem Belevich [Fri, 21 May 2021 17:53:28 +0000 (10:53 -0700)]
[CUDA] Work around compatibility issue with libstdc++ 11.1.0
libstdc++ redeclares __failed_assertion multiple times and that results in the
function declared with conflicting set of attributes when we include <complex>
with __host__ __device__ attributes force-applied to all functions.
In order to work around the issue, we rename __failed_assertion within the
region with forced attributes.
See https://bugs.llvm.org/show_bug.cgi?id=50383 for the details.
Differential Revision: https://reviews.llvm.org/D102936
Stella Laurenzo [Mon, 24 May 2021 16:41:38 +0000 (16:41 +0000)]
Enable MLIR Python bindings for TOSA.
Differential Revision: https://reviews.llvm.org/D103035
Raphael Isemann [Mon, 24 May 2021 17:16:40 +0000 (19:16 +0200)]
[lldb] Add missing mutex guards to TargetList::CreateTarget
TestMultipleTargets is randomly failing on the bots. The reason for that is that
the test is calling `SBDebugger::CreateTarget` from multiple threads.
`TargetList::CreateTarget` is curiously missing the guard that all of its other
member functions have, so all the threads in the test end up changing the
internal TargetList state at the same time and end up corrupting it.
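A minimal sketch of the locking pattern described here, using a toy class. `ToyTargetList` and its members are hypothetical stand-ins and not LLDB's `TargetList`; the mutex type LLDB uses may also differ.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a list of targets shared between threads.
class ToyTargetList {
public:
  // The creating function takes the same guard as every other member
  // function, so concurrent callers cannot modify Targets at the same time.
  void createTarget(const std::string &Name) {
    std::lock_guard<std::mutex> Guard(Mutex);
    Targets.push_back(Name);
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> Guard(Mutex);
    return Targets.size();
  }

private:
  mutable std::mutex Mutex; // mutable so const readers can lock it too
  std::vector<std::string> Targets;
};
```

Without the guard in `createTarget`, two threads calling it at once could both reallocate or write the vector, which is the kind of state corruption the failing test exposed.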
Reviewed By: vsk, JDevlieghere
Differential Revision: https://reviews.llvm.org/D103020
serge-sans-paille [Mon, 24 May 2021 17:43:40 +0000 (19:43 +0200)]
Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247.
See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
serge-sans-paille [Sun, 23 May 2021 11:19:23 +0000 (13:19 +0200)]
[NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attribute is equivalent to
setting the attribute to false.
This is a preliminary commit for https://reviews.llvm.org/D99080
Craig Topper [Mon, 24 May 2021 17:19:09 +0000 (10:19 -0700)]
[X86] Call insertDAGNode on trunc/zext created in tryShiftAmountMod.
This puts the new nodes in the proper place in the topologically
sorted list of nodes.
Fixes PR50431, which was introduced recently in D101944.
LLVM GN Syncbot [Mon, 24 May 2021 17:18:43 +0000 (17:18 +0000)]
[gn build] Port 095e91c9737b
Vitaly Buka [Sun, 23 May 2021 22:49:43 +0000 (15:49 -0700)]
[NFC][scudo] Small test cleanup
Fixing issues raised on D102979 review.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D102994
Jon Roelofs [Mon, 24 May 2021 16:49:32 +0000 (09:49 -0700)]
[Remarks] Add analysis remarks for memset/memcpy/memmove lengths
Re-landing now that the crasher this patch previously uncovered has been fixed
in: https://reviews.llvm.org/D102935
Differential revision: https://reviews.llvm.org/D102452
Roman Lebedev [Mon, 24 May 2021 17:09:04 +0000 (20:09 +0300)]
[X86][Costmodel] getMaskedMemoryOpCost(): don't scalarize non-power-of-two vectors with legal element type
This follows in steps of similar `getMemoryOpCost()` changes, D100099/D100684.
Intel SDM, `VPMASKMOV — Conditional SIMD Integer Packed Loads and Stores`:
```
Faults occur only due to mask-bit required memory accesses that caused the faults. Faults will not occur due to
referencing any memory location if the corresponding mask bit for that memory location is 0. For example, no
faults will be detected if the mask bits are all zero.
```
I.e., if mask is all-zeros, any address is fine.
Masked load/store's prime use-case is e.g. tail masking the loop remainder,
where for the last iteration, only the first few elements of a vector exist.
So, similarly, I don't see why we must scalarize non-power-of-two vectors,
iff the element type is something we can masked store/load.
We simply need to legalize it, widen the mask, and be done with it.
And we even already count the cost of widening the mask.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D102990
luxufan [Mon, 24 May 2021 16:51:04 +0000 (09:51 -0700)]
[RISCV] Optimize getVLENFactoredAmount function.
If `isPowerOf2_32(NumOfVReg - 1)` or `isPowerOf2_32(NumOfVReg + 1)` holds for the local variable `NumOfVReg`, the ADDI and MUL instructions can be replaced with SLLI and ADD (or SUB) instructions.
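The arithmetic behind this replacement can be sketched in plain C++. The helpers `isPow2`, `log2Exact`, and `scaleByNumOfVReg` are hypothetical names for illustration, not the backend code, and the plain power-of-two branch is included only for completeness.

```cpp
#include <cassert>
#include <cstdint>

// True when X is a nonzero power of two.
inline bool isPow2(std::uint32_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Shift amount for an exact power of two (precondition: isPow2(X)).
inline unsigned log2Exact(std::uint32_t X) {
  assert(isPow2(X));
  unsigned Shift = 0;
  while ((X >> Shift) != 1)
    ++Shift;
  return Shift;
}

// Computes Amount * NumOfVReg, preferring shift plus add/sub when possible.
inline std::uint64_t scaleByNumOfVReg(std::uint64_t Amount,
                                      std::uint32_t NumOfVReg) {
  assert(NumOfVReg != 0);
  if (isPow2(NumOfVReg)) // plain shift
    return Amount << log2Exact(NumOfVReg);
  if (isPow2(NumOfVReg - 1)) // N = 2^k + 1: shift, then add
    return (Amount << log2Exact(NumOfVReg - 1)) + Amount;
  if (isPow2(NumOfVReg + 1)) // N = 2^k - 1: shift, then subtract
    return (Amount << log2Exact(NumOfVReg + 1)) - Amount;
  return Amount * NumOfVReg; // general case still needs a multiply
}
```

For example, a factor of 3 becomes `(Amount << 1) + Amount` and a factor of 7 becomes `(Amount << 3) - Amount`, each avoiding a multiply.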
Based on original patch by StephenFan.
Reviewed By: frasercrmck, StephenFan
Differential Revision: https://reviews.llvm.org/D100577
Markus Böck [Mon, 24 May 2021 16:40:39 +0000 (18:40 +0200)]
[mlir][doc] Fix links and references in top level docs directory
This is the fourth and final patch in a series of patches fixing markdown links and references inside the mlir documentation. This patch combined with the other three should fix almost every broken link on mlir.llvm.org as far as I can tell.
This patch in particular addresses all Markdown files in the top level docs directory.
Differential Revision: https://reviews.llvm.org/D103032
Jon Roelofs [Mon, 24 May 2021 16:19:31 +0000 (09:19 -0700)]
[Remarks] Look through inttoptr/ptrtoint for -ftrivial-auto-var-init remarks.
The crasher is a related problem that @aemerson found broke spec2k6/403.gcc
when I landed https://reviews.llvm.org/D102452. It has been reduced & modified
to reproduce without that patch.
Differential revision: https://reviews.llvm.org/D102935
Adrian Prantl [Mon, 24 May 2021 16:06:00 +0000 (09:06 -0700)]
CoroSplit: Replace ad-hoc implementation of reachability with API from CFG.h
The current ad-hoc implementation used to determine whether a basic
block is unreachable doesn't work correctly in the general case (for
example it won't detect successors of unreachable blocks as
unreachable). This patch replaces it with the correct API that uses a
DominatorTree to answer the question correctly and quickly.
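A toy illustration of why successors matter for reachability. This sketch uses a plain graph walk from the entry block rather than a DominatorTree, and the "no predecessors" check is only an assumed example of an ad-hoc test, not the code that was removed. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B; block 0 is the entry.
using ToyCFG = std::vector<std::vector<std::size_t>>;

// Marks every block reachable from the entry with an iterative walk.
inline std::vector<bool> reachableFromEntry(const ToyCFG &Succs) {
  std::vector<bool> Seen(Succs.size(), false);
  if (Succs.empty())
    return Seen;
  std::vector<std::size_t> Worklist;
  Worklist.push_back(0);
  Seen[0] = true;
  while (!Worklist.empty()) {
    std::size_t Block = Worklist.back();
    Worklist.pop_back();
    for (std::size_t S : Succs[Block])
      if (!Seen[S]) {
        Seen[S] = true;
        Worklist.push_back(S);
      }
  }
  return Seen;
}

// Assumed ad-hoc check: a block counts as dead only if nothing branches to it.
inline bool hasNoPredecessors(const ToyCFG &Succs, std::size_t Block) {
  for (const std::vector<std::size_t> &Out : Succs)
    for (std::size_t S : Out)
      if (S == Block)
        return false;
  return true;
}
```

With edges 0 -> 1 and 2 -> 3, block 3 has a predecessor, so the ad-hoc check keeps it, yet it is unreachable because its only predecessor (block 2) is itself unreachable.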
rdar://77181156
Differential Revision: https://reviews.llvm.org/D102963
Steven Wu [Mon, 24 May 2021 16:13:34 +0000 (09:13 -0700)]
[llvm] Revert align attr test in test/Bitcode/attribute-3.3.ll
Revert testcase changed in D87304 now that the upgrader can correctly
handle the align attribute.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102880
Suraj Sudhir [Mon, 24 May 2021 15:47:24 +0000 (15:47 +0000)]
[mlir][tosa] Align tensor rank specifications with current spec
Deconstrains several TOSA operators to align with the current TOSA spec, including all the elementwise ops.
Note: some more ops are under consideration for further cleanup; they will follow once the spec has been updated.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D102958
Kostya Kortchinsky [Wed, 19 May 2021 16:10:30 +0000 (09:10 -0700)]
[scudo] Separate Fuchsia & Default SizeClassMap
The Fuchsia allocator config was using the default size class map.
This CL gives Fuchsia its own size class map and changes a couple of
things in the default one:
- make `SizeDelta` configurable in `Config` for a fixed size class map
as it currently is for a table size class map;
- switch `SizeDelta` to 0 for the default config, which allows for size
classes that are powers of 2 and is better overall with respect to page
filling;
- increase the max number of cached pointers to 14 in the default config,
this makes the transfer batch 64/128 bytes on 32/64-bit platforms,
which is cache-line friendly (previous size was 48/96 bytes).
The Fuchsia size class map remains untouched for now, this doesn't
impact Android which uses the table size class map.
Differential Revision: https://reviews.llvm.org/D102783
Nikita Popov [Mon, 24 May 2021 15:28:38 +0000 (17:28 +0200)]
[CVP] Add additional test for phi common val transform (NFC)
Nikita Popov [Mon, 24 May 2021 13:16:06 +0000 (15:16 +0200)]
[LoopUnroll] Add additional trip multiple test (NFC)
This uses a trip multiple on a (unique) non-latch exit.