Raphael Isemann [Mon, 12 Jul 2021 13:16:44 +0000 (15:16 +0200)]
[lldb][NFC] Use ArrayRef in TypeSystemClang::SetFunctionParameters
The implementation converts the pointer/size pair anyway back to ArrayRef.
Frederik Gossen [Mon, 12 Jul 2021 13:14:25 +0000 (15:14 +0200)]
[MLIR][StandardToLLVM] Move `copyUnrankedDescriptors` to pattern
Make the function `copyUnrankedDescriptors` accessible in
`ConvertToLLVMPattern`.
Differential Revision: https://reviews.llvm.org/D105810
Cullen Rhodes [Mon, 12 Jul 2021 10:58:36 +0000 (10:58 +0000)]
[AArch64] Add target features for Armv9-A Scalable Matrix Extension (SME)
First patch in a series adding MC layer support for the Arm Scalable
Matrix Extension.
This patch adds the following features:
sme, sme-i64, sme-f64
The sme-i64 and sme-f64 flags are for the optional I16I64 and F64F64
features.
If a target supports I16I64 then the following instructions are
implemented:
* 64-bit integer ADDHA and ADDVA variants (D105570).
* SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA, UMOPS, USMOPA, and USMOPS
instructions that accumulate 16-bit integer outer products into 64-bit
integer tiles.
If a target supports F64F64 then the FMOPA and FMOPS instructions that
accumulate double-precision floating-point outer products into
double-precision tiles are implemented.
Outer products are implemented in D105571.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06
Reviewed By: CarolineConcatto
Differential Revision: https://reviews.llvm.org/D105569
Michael Liao [Mon, 12 Jul 2021 13:23:59 +0000 (09:23 -0400)]
Fix warning '-Wparentheses'. NFC.
Jonas Paulsson [Sat, 10 Jul 2021 09:15:12 +0000 (11:15 +0200)]
[SystemZ] Bugfix for the 'N' code for inline asm operand.
Don't use a local MachineOperand copy in SystemZAsmPrinter::PrintAsmOperand()
and change the register as it may break the MRI tracking of register
uses. Use an MCOperand instead.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D105757
Sanjay Patel [Mon, 12 Jul 2021 12:58:05 +0000 (08:58 -0400)]
[InstCombine] reduce signbit test of logic ops to cmp with zero
This is the pattern from the description of:
https://llvm.org/PR50816
There might be a way to generalize this to a smaller or more
generic pattern, but I have not found it yet.
https://alive2.llvm.org/ce/z/ShzJoF
define i1 @src(i8 %x) {
%add = add i8 %x, -1
%xor = xor i8 %x, -1
%and = and i8 %add, %xor
%r = icmp slt i8 %and, 0
ret i1 %r
}
define i1 @tgt(i8 %x) {
%r = icmp eq i8 %x, 0
ret i1 %r
}
Sanjay Patel [Mon, 12 Jul 2021 12:43:29 +0000 (08:43 -0400)]
[InstCombine][tests] add tests for signbit + logic; NFC
PR50816
Tobias Gysi [Mon, 12 Jul 2021 12:31:30 +0000 (12:31 +0000)]
[mlir][linalg][python] Add auto-generated file warning (NFC).
Annotate LinalgNamedStructuredOps.yaml with a comment stating the file is auto-generated and should not be edited manually.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D105809
Simon Pilgrim [Mon, 12 Jul 2021 11:29:35 +0000 (12:29 +0100)]
[CostModel][X86] Adjust truncate SSE/AVX legalized costs based on llvm-mca reports.
Update truncation costs based on the worst case costs from the script in D103695.
Move to using legalized types wherever possible, which allows us to prune the cost tables.
Nico Weber [Mon, 12 Jul 2021 12:50:11 +0000 (08:50 -0400)]
[gn build] port
0da172b1766e more
David Green [Mon, 12 Jul 2021 12:39:21 +0000 (13:39 +0100)]
[AArch64] Set the latency of Cortex-A55 stores to 1
This sets the latency of stores to 1 in the Cortex-A55 scheduling model,
to better match the values given in the software optimization guide.
The latency of a store in normal llvm scheduling does not appear to have
a lot of uses. If the store has no outputs then the latency is somewhat
meaningless (and pre/post increment update operands use the WriteAdr
write for those operands instead). The one place it does alter things is
the latency between a store and the end of the scheduling region, which
can in turn have an effect on the critical path length. As a result a
latency of 1 is more correct and offers ever-so-slightly better
scheduling of instructions near the end of the block.
They are marked as RetireOOO to keep the llvm-mca from introducing
stalls where non would exist.
Differential Revision: https://reviews.llvm.org/D105541
Nico Weber [Mon, 12 Jul 2021 12:15:59 +0000 (08:15 -0400)]
[gn build] (semi-manually) port
0da172b1766e
Corentin Jabot [Mon, 12 Jul 2021 12:03:51 +0000 (08:03 -0400)]
Partially implement P1401R5 (Narrowing contextual conversions to bool)
Support Narrowing conversions to bool in if constexpr condition
under C++23 language mode.
Only if constexpr is implemented as the behavior of static_assert
is already conforming. Still need to work on explicit(bool) to
complete support.
Dmitry Vyukov [Mon, 12 Jul 2021 11:57:14 +0000 (13:57 +0200)]
sanitizer_common: fix 32-bit build
https://reviews.llvm.org/D105716 enabled thread safety annotations,
and that broke 32-bit build:
https://green.lab.llvm.org/green/job/lldb-cmake/33604/consoleFull#-
77815080549ba4694-19c4-4d7e-bec5-
911270d8a58c
1. Enable thread-safety analysis in unit tests
(this catches the breakage even in 64-bit mode).
2. Add NO_THREAD_SAFETY_ANALYSIS to sanitizer_allocator_primary32.h
to unbreak the build.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105808
Aaron Ballman [Mon, 12 Jul 2021 11:58:06 +0000 (07:58 -0400)]
Fix the Clang documentation builder; NFC.
It was broken three days ago by the changes in D95561.
Med Ismail Bennani [Mon, 12 Jul 2021 11:32:07 +0000 (12:32 +0100)]
[lldb/Target] Fix event handling during process launch
This patch fixes process event handling when the events are broadcasted
at launch. To do so, the patch introduces a new listener to fetch events
by hand off the event queue and then resending them ensure the event ordering.
Differental Revision: https://reviews.llvm.org/D105698
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Yevgeny Rouban [Mon, 12 Jul 2021 11:06:06 +0000 (18:06 +0700)]
[RS4GC] Use one DVCache for both inlineGetBaseAndOffset() and insertParsePoints()
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103240
Yevgeny Rouban [Mon, 12 Jul 2021 10:35:10 +0000 (17:35 +0700)]
[RS4GC] Add a test to demonstrate duplication of base generation. NFC
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103238
Nemanja Ivanovic [Mon, 12 Jul 2021 10:49:12 +0000 (05:49 -0500)]
[PowerPC] Fix rounding mode for vec_round in altivec.h
The function is supposed to be the equivalent of rint() (as in
round to nearest, ties to even) rather than round() (round to
nearest, ties away from zero). In fact, the instruction we emit
without VSX is vrfin which is correct. However, with VSX we emit
xvrspi which is the equivalent of round() and therefore incorrect.
Since there is no equivalent VSX instruction, simply use vrfin
regardless of availability of VSX.
Dmitry Vyukov [Sun, 11 Jul 2021 12:43:31 +0000 (14:43 +0200)]
sanitizer_common: make sem_trywait as non-blocking
sem_trywait never blocks.
Use REAL instead of COMMON_INTERCEPTOR_BLOCK_REAL.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105775
Dmitry Vyukov [Sun, 11 Jul 2021 12:51:58 +0000 (14:51 +0200)]
sanitizer_common: remove debugging logic from the internal allocator
The internal allocator adds 8-byte header for debugging purposes.
The problem with it is that it's not possible to allocate nicely-sized
objects without a significant overhead. For example, if we allocate
512-byte objects, that will be rounded up to 768 or something.
This logic migrated from tsan where it was added during initial development,
I don't remember that it ever caught anything (we don't do bugs!).
Remove it so that it's possible to allocate nicely-sized objects
without overheads.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105777
Aaron Ballman [Mon, 12 Jul 2021 10:51:19 +0000 (06:51 -0400)]
[OpenMP] Support OpenMP 5.1 attributes
OpenMP 5.1 added support for writing OpenMP directives using [[]]
syntax in addition to using #pragma and this introduces support for the
new syntax.
In OpenMP, the attributes take one of two forms:
[[omp::directive(...)]] or [[omp::sequence(...)]]. A directive
attribute contains an OpenMP directive clause that is identical to the
analogous #pragma syntax. A sequence attribute can contain either
sequence or directive arguments and is used to ensure that the
attributes are processed sequentially for situations where the order of
the attributes matter (remember:
https://eel.is/c++draft/dcl.attr.grammar#4.sentence-4).
The approach taken here is somewhat novel and deserves mention. We
could refactor much of the OpenMP parsing logic to work for either
pragma annotation tokens or for attribute clauses. It would be a fair
amount of effort to share the logic for both, but it's certainly
doable. However, the semantic attribute system is not designed to
handle the arbitrarily complex arguments that OpenMP directives
contain. Adding support to thread the novel parsed information until we
can produce a semantic attribute would be considerably more effort.
What's more, existing OpenMP constructs are not (often) represented as
semantic attributes. So doing this through Attr.td would be a massive
undertaking that would likely only benefit OpenMP and comes with
additional risks. Rather than walk down that path, I am taking
advantage of the fact that the syntax of the directives within the
directive clause is identical to that of the #pragma form. Once the
parser recognizes that we're processing an OpenMP attribute, it caches
all of the directive argument tokens and then replays them as though
the user wrote a pragma. This reuses the same OpenMP parsing and
semantic logic directly, but does come with a risk if the OpenMP
committee decides to purposefully diverge their pragma and attribute
syntaxes. So, despite this being a novel approach that does token
replay, I think it's actually a better approach than trying to do this
through the declarative syntax in Attr.td.
Nicolas Vasilache [Mon, 12 Jul 2021 10:10:26 +0000 (10:10 +0000)]
[mlir][Linalg] Improve comprehensive bufferization for scf.yield.
Previously, comprehensive bufferization of scf.yield did not have enough information
to detect whether an enclosing scf::for bbargs would bufferize to a buffer equivalent
to that of the matching scf::yield operand.
As a consequence a separate sanity check step would be required to determine whether
bufferization occured properly.
This late check would miss the case of calling a function in an loop.
Instead, we now pass and update aliasInfo during bufferization and it is possible to
imrpove bufferization of scf::yield and drop that post-pass check.
Add an example use case that was failing previously.
This slightly modifies the error conditions, which are also updated as part of this
revision.
Differential Revision: https://reviews.llvm.org/D105803
Dmitry Vyukov [Sun, 11 Jul 2021 12:46:09 +0000 (14:46 +0200)]
sanitizer_common: use 0 for empty stack id
We use 0 for empty stack id from stack depot.
Deadlock detector 1 is the only place that uses -1
as a special case. Use 0 because there is a number
of checks of the form "if (stack id) ...".
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105776
David Truby [Mon, 12 Jul 2021 09:55:11 +0000 (10:55 +0100)]
[llvm][sve] Lowering for VLS truncating stores
This adds custom lowering for truncating stores when operating on
fixed length vectors in SVE. It also includes a DAG combine to
fold extends followed by truncating stores into non-truncating
stores in order to prevent this pattern appearing once truncating
stores are supported.
Currently truncating stores are not used in certain cases where
the size of the vector is larger than the target vector width.
Differential Revision: https://reviews.llvm.org/D104471
Joachim Protze [Mon, 12 Jul 2021 10:12:09 +0000 (12:12 +0200)]
[OpenMP][OMPT] Fix compile-time assertion in ompt-multiplex.h
The compile-time assertion is supposed to prevent double-free caused by
unexpected combination of preprocessor defines passed by an OMPT tool.
The current defines are not used, so this patch replaces the check with
macros actually used in ompt-multiplex.h
Reported by: Semih Burak
Differential Revision: https://reviews.llvm.org/D104633
Nemanja Ivanovic [Mon, 12 Jul 2021 03:08:31 +0000 (22:08 -0500)]
[PowerPC] Remove unnecessary 64-bit guards from altivec.h
A number of functions in the header have guards for 64-bit only
that were presumably added as some of the functions in the blocks
use vector __int128 which is only available in 64-bit mode.
A more appropriate guard (__SIZEOF_INT128__) has been added for
those functions since, making the 64-bit guards redundant.
This patch removes those guards as they inadvertently guard code
that uses vector long long which does not actually require 64-bit
mode.
Dmitry Vyukov [Fri, 9 Jul 2021 17:29:41 +0000 (19:29 +0200)]
sanitizer_common: add thread safety annotations
Enable clang Thread Safety Analysis for sanitizers:
https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
Thread Safety Analysis can detect inconsistent locking,
deadlocks and data races. Without GUARDED_BY annotations
it has limited value. But this does all the heavy lifting
to enable analysis and allows to add GUARDED_BY incrementally.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105716
Dmitry Vyukov [Sun, 11 Jul 2021 12:17:52 +0000 (14:17 +0200)]
sanitizer_common: rename Mutex to MutexState
We have 3 different mutexes (RWMutex, BlockingMutex __tsan::Mutex),
each with own set of downsides. I want to unify them under a name Mutex.
But it will conflict with Mutex in the deadlock detector,
which is a way too generic name. Rename it to MutexState.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105773
Muhammad Omair Javaid [Mon, 12 Jul 2021 09:18:18 +0000 (14:18 +0500)]
[LLDB] Testsuite: Add helper to check for AArch64 target
This patch adds a helper function to test target architecture is
AArch64 or not. This also tightens isAArch64* helpers by adding an
extra architecture check.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D105483
Alex Zinenko [Fri, 9 Jul 2021 14:41:36 +0000 (16:41 +0200)]
[mlir] factor math-to-llvm out of standard-to-llvm
After the Math has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the Math
dialect operations to LLVM into a separate library and a separate
conversion pass.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D105702
Sander de Smalen [Mon, 12 Jul 2021 08:32:15 +0000 (09:32 +0100)]
[LV] Ignore candidate VFs with invalid costs.
This follows on from discussion on the mailing-list:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151047.html
to interpret an Invalid cost as 'infinitely expensive', as this
simplifies some of the legalization issues with scalable vectors.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D105473
Simon Pilgrim [Mon, 12 Jul 2021 08:56:59 +0000 (09:56 +0100)]
[X86][SSE] X86ISD::FSETCC nodes (cmpss/cmpsd) return a 0/-1 allbits signbits result
Annoyingly, i686 cmpsd handling still fails to remove the unnecessary neg(and(x,1))
Simon Pilgrim [Mon, 12 Jul 2021 08:40:59 +0000 (09:40 +0100)]
[X86][SSE] Add signbit tests to show cmpss/cmpsd ops not recognised as 'allbits' results.
Balazs Benics [Mon, 12 Jul 2021 07:06:46 +0000 (09:06 +0200)]
[analyzer][NFC] Display the correct function name even in crash dumps
The `-analyzer-display-progress` displayed the function name of the
currently analyzed function. It differs in C and C++. In C++, it
prints the argument types as well in a comma-separated list.
While in C, only the function name is displayed, without the brackets.
E.g.:
C++: foo(), foo(int, float)
C: foo
In crash traces, the analyzer dumps the location contexts, but the
string is not enough for `-analyze-function` in C++ mode.
This patch addresses the issue by dumping the proper function names
even in stack traces.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105708
Clement Courbet [Thu, 8 Jul 2021 08:52:35 +0000 (10:52 +0200)]
[llvm-exegesis] Fix compilation with old libpfm versions.
Do not try include `perfmon/perf_event.h` when we are not sure that it
exists.
Fixes PR51017.
Differential Revision: https://reviews.llvm.org/D105615
Fangrui Song [Mon, 12 Jul 2021 05:05:39 +0000 (22:05 -0700)]
[AArch64] De-capitalize some Emit* functions
AsmParser/AsmPrinter/Streamer are mostly consistent on emit* functions now.
Muhammad Omair Javaid [Mon, 12 Jul 2021 03:38:08 +0000 (08:38 +0500)]
[LLDB] Only build TestWatchTaggedAddress.py on aarch64 PAC targets
This patch fixes buildbot failures caused by TestWatchTaggedAddress.py
Differential Revision: https://reviews.llvm.org/D101361
Jacques Pienaar [Mon, 12 Jul 2021 03:41:33 +0000 (20:41 -0700)]
[mlir] Fix broadcasting check with 1 values
The trait was inconsistent with the other broadcasting logic here. And
also fix printing here to use ? rather than -1 in the error.
Differential Revision: https://reviews.llvm.org/D105748
Muhammad Omair Javaid [Mon, 12 Jul 2021 02:06:34 +0000 (07:06 +0500)]
Support AArch64/Linux watchpoint on tagged addresses
AArch64 architecture support virtual addresses with some of the top bits ignored.
These ignored bits can host memory tags or bit masks that can serve to check for
authentication of address integrity. We need to clear away the top ignored bits
from watchpoint address to reliably hit and set watchpoints on addresses
containing tags or masks in their top bits.
This patch adds support to watch tagged addresses on AArch64/Linux.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D101361
Liqiang Tao [Mon, 12 Jul 2021 02:28:17 +0000 (10:28 +0800)]
[llvm][Inliner] Templatize PriorityInlineOrder
The patch templatize PriorityInlinerOrder so that it can accept any type priority metric.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D104972
Jez Ng [Mon, 12 Jul 2021 01:34:31 +0000 (21:34 -0400)]
[lld-macho][nfc] Fix YAML input in compact-unwind-sym-relocs.s
* Adjust strsize so llvm-objdump doesn't complain about it extending
past the end of file
* Remove symbol that was referencing a deleted section
* Adjust n_sect of the remaining `_main` symbol to point at the right
section
Johannes Doerfert [Thu, 1 Jul 2021 05:21:26 +0000 (00:21 -0500)]
[OpenMP] Create and use `__kmpc_is_generic_main_thread`
In order to fold calls based on high-level knowledge and control flow
tracking it helps to expose the information as a runtime call. The
logic: `!SPMD && getTID() == getMasterTID()` was used in various places
and is now encapsulated in `__kmpc_is_generic_main_thread`. As part of
this rewrite we replaced eager computation of arguments with on-demand
computation, especially helpful if the calls can be folded and arguments
don't need to be computed consequently.
Differential Revision: https://reviews.llvm.org/D105768
Johannes Doerfert [Sun, 11 Jul 2021 00:01:56 +0000 (19:01 -0500)]
[OpenMP] Simplify variable sharing and increase shared memory size
In order to avoid malloc/free, up to NUM_SHARED_VARIABLES_IN_SHARED_MEM
(=64) variables are communicated in dedicated shared memory instead. The
simplification does avoid the need for an "init" and requires "deinit"
only if we ever communicate more than NUM_SHARED_VARIABLES_IN_SHARED_MEM
variables.
Differential Revision: https://reviews.llvm.org/D105767
Johannes Doerfert [Sat, 10 Jul 2021 00:09:40 +0000 (19:09 -0500)]
[Attributor][NFCI] Add UsedAssumedInformation to more interfaces
As with other Attributor interfaces we often want to know if assumed
information was used to answer a query. This is important if only
known information is allowed or if known information can lead to an
early fixpoint. The users have been adjusted but none of them utilizes
the new information yet.
Eli Friedman [Mon, 12 Jul 2021 00:00:14 +0000 (17:00 -0700)]
[IndVars] Don't widen pointers in WidenIV::getWideRecurrence
It's not a reasonable transform, and calling getSignExtendExpr() on a
pointer hits an assertion.
Jez Ng [Sun, 11 Jul 2021 22:36:59 +0000 (18:36 -0400)]
[lld-macho][nfc] clang-format
Jez Ng [Sun, 11 Jul 2021 22:35:45 +0000 (18:35 -0400)]
[lld-macho][nfc] Remove unnecessary llvm:: namespace prefixes
Jez Ng [Sun, 11 Jul 2021 22:24:53 +0000 (18:24 -0400)]
[lld-macho][nfc] Avoid using std::map for PlatformKinds
The mappings we were using had a small number of keys, so a vector is
probably better. This allows us to remove the last usage of std::map in
our codebase.
I also used `removeSimulator` to simplify the code a bit further.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D105786
Florian Hahn [Sun, 11 Jul 2021 20:00:44 +0000 (22:00 +0200)]
[VPlan] Remove default arg from getVPValue (NFC).
The const version of VPValue::getVPValue still had a default value for
the value index. Remove the default value and use getVPSingleValue
instead, which is the proper function.
Craig Topper [Sun, 11 Jul 2021 19:31:02 +0000 (12:31 -0700)]
[RISCV] Add tests for suboptimal handling of negative constants for i32 uaddo/usubo on RV64. NFC
We end up zero extending constants when we promote to i64. We
should sign extend instead to allow use of addiw or improve
constant materialization.
Craig Topper [Sun, 11 Jul 2021 17:34:49 +0000 (10:34 -0700)]
[RISCV] Add tests for suboptimal handling of negative constants on the LHS of i32 shifts/rotates/subtracts on RV64. NFC
The constants end up getting zero extended to i64, but sign extend
would be better for constant materialization. We're using W
instructions so either behavior is correct since the upper bits
aren't read.
Nico Weber [Sun, 11 Jul 2021 17:57:07 +0000 (13:57 -0400)]
[lld/mac] Unbreak objc.s after
6e05c1cd5f98
Nico Weber [Sun, 11 Jul 2021 17:13:34 +0000 (13:13 -0400)]
[lld/mac] Always reference dyld_stub_binder when linked with libSystem
lld currently only references dyld_stub_binder when it's needed.
ld64 always references it when libSystem is linked.
Match ld64.
The (somewhat lame) motivation is that `nm` on a binary without any
export writes a "no symbols" warning to stderr, and this change makes
it so that every binary in practice has at least a reference to
dyld_stub_binder, which suppresses that.
Every "real" output file will reference dyld_stub_binder, so most
of the time this shouldn't make much of a difference. And if you
really don't want to have this reference for whatever reason, you
can stop passing -lSystem, like you have to for ld64 anyways.
(After linking any dylib, we dump the exported list of symbols to
a txt file with `nm` and only relink downstream deps if that txt
file changes. A nicer fix is to make lld optionally write .tbd files
with the public interface of a linked dylib and use that instead,
but for now the txt files are what we do.)
Differential Revision: https://reviews.llvm.org/D105782
Craig Topper [Sun, 11 Jul 2021 17:25:05 +0000 (10:25 -0700)]
[RISCV] Remove stale FIXME from a test. NFC
sext has been used for sltu/sltiu since
e0e62e97.
Craig Topper [Sun, 11 Jul 2021 17:03:05 +0000 (10:03 -0700)]
[DivRemPairs] Add an initial case for hoisting to a common predecessor.
This patch adds support for hoisting the division and maybe the
remainder for control flow graphs like this.
```
PredBB
| \
| Rem
| /
Div
```
If we have DivRem we'll hoist both to PredBB. If not we'll just
hoist Div and expand Rem using the Div.
This improves our codegen for something like this
```
__uint128_t udivmodti4(__uint128_t dividend, __uint128_t divisor, __uint128_t *remainder) {
if (remainder != 0)
*remainder = dividend % divisor;
return dividend / divisor;
}
```
Reviewed By: spatel, lebedev.ri
Differential Revision: https://reviews.llvm.org/D87555
Nico Weber [Sun, 11 Jul 2021 16:35:38 +0000 (12:35 -0400)]
[lld/mac] Use normal Undefined machinery for dyld_stub_binder lookup
This is for aesthetic reasons, I'm not aware of anything that needs
this in practice. It does have a few effects:
- `-undefined dynamic_lookup` now has an effect for dyld_stub_binder.
This matches ld64.
- `-U dyld_stub_binder` now works like you'd expect (it doesn't work in ld64).
- The error message for a missing dyld_stub_binder symbol now looks like
other undefined reference symbols, it changes from
symbol dyld_stub_binder not found (normally in libSystem.dylib). Needed to perform lazy binding.
to
error: undefined symbol: dyld_stub_binder
>>> referenced by lazy binding (normally in libSystem.dylib)
Also add test coverage for that error message.
But in practice, this should have no interesting effects since everything links
in dyld_stub_binder via libSystem anyways.
Differential Revision: https://reviews.llvm.org/D105781
Daniel Egger [Sun, 11 Jul 2021 14:58:11 +0000 (15:58 +0100)]
[ARM] Add lowering of uadd_sat to uq{add|sub}8 and uq{add|sub}16
This follow the lead of https://reviews.llvm.org/D68974 to add lowering
of unsigned saturated addition/subtraction.
Differential Revision: https://reviews.llvm.org/D105413
Vassil Vassilev [Sun, 11 Jul 2021 14:39:26 +0000 (14:39 +0000)]
Revert "[clang-repl] Implement partial translation units and error recovery."
This reverts commit
6775fc6ffa3ca1c36b20c25fa4e7f48f81213cf2.
It also reverts "[lldb] Fix compilation by adjusting to the new ASTContext signature."
This reverts commit
03a3f86071c10a1f6cbbf7375aa6fe9d94168972.
We see some failures on the lldb infrastructure, these changes might play a role
in it. Let's revert it now and see if the bots will become green.
Ref: https://reviews.llvm.org/D104918
Kazu Hirata [Sun, 11 Jul 2021 14:10:11 +0000 (07:10 -0700)]
[Analysis] Remove unused declaration isPotentiallyReachableFromMany (NFC)
David Green [Sun, 11 Jul 2021 13:45:54 +0000 (14:45 +0100)]
[IfCvt] Don't use pristine register for counting liveins for predicated instructions.
The test case here hits machine verifier problems. There are volatile
long loads that the results of do not get used, loading into two dead
registers. IfCvt will predicate them and as it does will add implicit
uses of the predicating registers due to thinking they are live in. As
nothing has used the register, the machine verifier disagrees that they
are really live and we end up with a failure.
The registers come from Pristine regs that LivePhysRegs counts as live.
This patch adds a addLiveInsNoPristines method to be used instead in
IfCvt, so that only really live in regs need to be added as implicit
operands.
Differential Revision: https://reviews.llvm.org/D90965
Dmitry Vyukov [Sun, 11 Jul 2021 10:55:04 +0000 (12:55 +0200)]
sanitizer_common: unbreak ThreadRegistry tests
https://reviews.llvm.org/D105713
forgot to update tests for the new ctor.
Differential Revision: https://reviews.llvm.org/D105772
Vassil Vassilev [Sun, 11 Jul 2021 10:52:52 +0000 (10:52 +0000)]
[lldb] Fix compilation by adjusting to the new ASTContext signature.
This change was introduced in https://reviews.llvm.org/D104918
Dmitry Vyukov [Fri, 9 Jul 2021 16:34:30 +0000 (18:34 +0200)]
sanitizer_common: add simpler ThreadRegistry ctor
Currently ThreadRegistry is overcomplicated because of tsan,
it needs tid quarantine and reuse counters. Other sanitizers
don't need that. It also seems that no other sanitizer now
needs max number of threads. Asan used to need 2^24 limit,
but it does not seem to be needed now. Other sanitizers blindly
copy-pasted that without reasons. Lsan also uses quarantine,
but I don't see why that may be potentially needed.
Add a ThreadRegistry ctor that does not require any sizes
and use it in all sanitizers except for tsan.
In preparation for new tsan runtime, which won't need
any of these parameters as well.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105713
Vassil Vassilev [Wed, 12 May 2021 16:19:40 +0000 (16:19 +0000)]
[clang-repl] Implement partial translation units and error recovery.
https://reviews.llvm.org/D96033 contained a discussion regarding efficient
modeling of error recovery. @rjmccall has outlined the key ideas:
Conceptually, we can split the translation unit into a sequence of partial
translation units (PTUs). Every declaration will be associated with a unique PTU
that owns it.
The first key insight here is that the owning PTU isn't always the "active"
(most recent) PTU, and it isn't always the PTU that the declaration
"comes from". A new declaration (that isn't a redeclaration or specialization of
anything) does belong to the active PTU. A template specialization, however,
belongs to the most recent PTU of all the declarations in its signature - mostly
that means that it can be pulled into a more recent PTU by its template
arguments.
The second key insight is that processing a PTU might extend an earlier PTU.
Rolling back the later PTU shouldn't throw that extension away. For example, if
the second PTU defines a template, and the third PTU requires that template to
be instantiated at float, that template specialization is still part of the
second PTU. Similarly, if the fifth PTU uses an inline function belonging to the
fourth, that definition still belongs to the fourth. When we go to emit code in
a new PTU, we map each declaration we have to emit back to its owning PTU and
emit it in a new module for just the extensions to that PTU. We keep track of
all the modules we've emitted for a PTU so that we can unload them all if we
decide to roll it back.
Most declarations/definitions will only refer to entities from the same or
earlier PTUs. However, it is possible (primarily by defining a
previously-declared entity, but also through templates or ADL) for an entity
that belongs to one PTU to refer to something from a later PTU. We will have to
keep track of this and prevent unwinding to later PTU when we recognize it.
Fortunately, this should be very rare; and crucially, we don't have to do the
bookkeeping for this if we've only got one PTU, e.g. in normal compilation.
Otherwise, PTUs after the first just need to record enough metadata to be able
to revert any changes they've made to declarations belonging to earlier PTUs,
e.g. to redeclaration chains or template specialization lists.
It should even eventually be possible for PTUs to provide their own slab
allocators which can be thrown away as part of rolling back the PTU. We can
maintain a notion of the active allocator and allocate things like Stmt/Expr
nodes in it, temporarily changing it to the appropriate PTU whenever we go to do
something like instantiate a function template. More care will be required when
allocating declarations and types, though.
We would want the PTU to be efficiently recoverable from a Decl; I'm not sure
how best to do that. An easy option that would cover most declarations would be
to make multiple TranslationUnitDecls and parent the declarations appropriately,
but I don't think that's good enough for things like member function templates,
since an instantiation of that would still be parented by its original class.
Maybe we can work this into the DC chain somehow, like how lexical DCs are.
We add a different kind of translation unit `TU_Incremental` which is a
complete translation unit that we might nonetheless incrementally extend later.
Because it is complete (and we might want to generate code for it), we do
perform template instantiation, but because it might be extended later, we don't
warn if it declares or uses undefined internal-linkage symbols.
This patch teaches clang-repl how to recover from errors by disconnecting the
most recent PTU and update the primary PTU lookup tables. For instance:
```./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```
Differential revision: https://reviews.llvm.org/D104918
Dmitry Vyukov [Fri, 9 Jul 2021 18:21:52 +0000 (20:21 +0200)]
sanitizer_common: sanitize time functions
We have SleepForSeconds, SleepForMillis and internal_sleep.
Some are implemented in terms of libc functions, some -- in terms
of syscalls. Some are implemented in per OS files,
some -- in libc/nolibc files. That's unnecessary complex
and libc functions cause crashes in some contexts because
we intercept them. There is no single reason to have calls to libc
when we have syscalls (and we have them anyway).
Add internal_usleep that is implemented in terms of syscalls per OS.
Make SleepForSeconds/SleepForMillis/internal_sleep a wrapper
around internal_usleep that is implemented in sanitizer_common.cpp once.
Also remove return values for internal_sleep, it's not used anywhere.
Eventually it would be nice to remove SleepForSeconds/SleepForMillis/internal_sleep.
There is no point in having that many different names for the same thing.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105718
Dmitry Vyukov [Fri, 9 Jul 2021 18:33:24 +0000 (20:33 +0200)]
sanitizer_common: split LibIgnore into fast/slow paths
LibIgnore is checked in every interceptor.
Currently it has all logic in the single function
in the header, which makes it uninlinable.
Split it into fast path (no libraries ignored)
and slow path (have ignored libraries).
It makes the fast path inlinable (single load).
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105719
Jez Ng [Sun, 11 Jul 2021 04:19:03 +0000 (00:19 -0400)]
[lld-macho][nfc] Expand the compact unwind symbol reloc test
Add a bit more detail to the comments, and check that the final binary
does indeed have a `__unwind_info` section (D105557 previosly regressed
this).
Also rename the test to emphasize that we are testing relocations
compact unwind, not relocations in general.
hyeongyu kim [Sun, 11 Jul 2021 02:45:32 +0000 (11:45 +0900)]
[InstCombine] Add optimization to prevent poison from being propagated.
In D104569, Freeze was inserted just before br to solve the `branching on undef` miscompilation problem.
But value analysis was being disturbed by added freeze.
```
v = load ptr
cond = freeze(icmp (and v, const), const')
br cond, ...
```
The case in which value analysis disturbed is as above.
By changing freeze to add immediately after load, value analysis will be successful again.
```
v = load ptr
freeze(icmp (and v, const), const')
=>
v = load ptr
v' = freeze v
icmp (and v', const), const'
```
In this patch, I propose the above optimization.
With this patch, the poison will not spread as the freeze is performed early.
Reviewed By: nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D105392
David Blaikie [Sun, 11 Jul 2021 02:07:41 +0000 (19:07 -0700)]
Fix windows directory separator some more for test from
b447b9dce0d105e7f0b22db719fe8624108e99dc
David Blaikie [Sun, 11 Jul 2021 01:46:43 +0000 (18:46 -0700)]
Reapply "llvm-symbolizer: Fix "start file" to work with Split DWARF"
Originally committed as
04c203e310bd3fb58e16c936c0200d680100526e
Reverted in
768510632c5ddbf9438693d9c7db1903e39295ad due to the test
failing when encountering windows directory separators.
Fix the path separator platform issue with a FileCheck pattern {{[/\\]}}
Original commit message:
A followup to the feature added in
69da27c7496ea373567ce5121e6fe8613846e7a5
that added the optional "start file name" to match "start line" - but this
didn't work with Split DWARF because of the need for the decl file number
resolution code to refer back to the skeleton unit to find its .debug_line
contribution. So this patch adds the necessary infrastructure to track the
skeleton unit corresponding to a split full unit for the purpose of this
lookup.
Craig Topper [Sun, 11 Jul 2021 01:34:59 +0000 (18:34 -0700)]
[DivRemPairs] Add test cases for D87555. NFC
Craig Topper [Sun, 11 Jul 2021 01:02:13 +0000 (18:02 -0700)]
[RISCV] Restore non-constant srem test I accidentally deleted. NFC
Kazu Hirata [Sun, 11 Jul 2021 00:21:32 +0000 (17:21 -0700)]
[Analysis] Remove changeCondBranchToUnconditionalTo (NFC)
The last use was removed on Jan 21, 2021 in commit
0895b836d74ed333468ddece2102140494eb33b6.
Craig Topper [Sun, 11 Jul 2021 00:07:39 +0000 (17:07 -0700)]
[RISCV] Add test cases for div/rem with constant left hand side. NFC
Some of these would produce better code if we used W instructions,
but constant LHS currently prevents that.
Johannes Doerfert [Sat, 10 Jul 2021 23:53:37 +0000 (18:53 -0500)]
[Attributor][FIX] Destroy bump allocator objects to avoid leaks
AllocationInfo and DeallocationInfo objects themselves are allocated
with the Attributor bump allocator and do not need to be deallocated.
That said, the sets in AllocationInfo and DeallocationInfo need to be
destroyed to avoid memory leaks.
Johannes Doerfert [Wed, 23 Jun 2021 21:33:49 +0000 (16:33 -0500)]
[OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.
The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
Johannes Doerfert [Sat, 10 Jul 2021 23:40:32 +0000 (18:40 -0500)]
[OpenMP][FIX] Add missing `)` to remark
Johannes Doerfert [Wed, 7 Jul 2021 20:06:41 +0000 (15:06 -0500)]
[OpenMP] Remove checkXXXX device runtime functions
We had multiple functions to determine the execution mode (SPMD/Generic)
and runtime status (initialized/uninitialized) but that just increased
complexity without a real benefit. Especially with D102307 in mind it
is helpful to reduce the dependence on the `ident_t` flags.
Differential Revision: https://reviews.llvm.org/D105586
Johannes Doerfert [Sat, 10 Jul 2021 23:18:34 +0000 (18:18 -0500)]
[OpenMP][NFCI] Re-enable two remarks tests after D101977 landed
Johannes Doerfert [Thu, 20 May 2021 05:37:29 +0000 (00:37 -0500)]
[OpenMP] Create custom state machines for generic target regions
In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.
The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D101977
Johannes Doerfert [Thu, 17 Jun 2021 16:23:20 +0000 (11:23 -0500)]
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
Johannes Doerfert [Thu, 24 Jun 2021 21:01:35 +0000 (16:01 -0500)]
[Attributor][FIX] Sanitize queries to LVI and ScalarEvolution
When we talk to outside analyse, e.g., LVI and ScalarEvolution, we need
to be careful with the query. The particular error occurred because we
folded a PHI node before the LVI query but the context location was now
not dominated by the value anymore. This is not supported by LVI so we
have to filter these situations before we query the outside analyses.
Johannes Doerfert [Thu, 6 May 2021 21:03:51 +0000 (16:03 -0500)]
[Attributor] Look through selects in genericValueTraversal
If we can simplify the select condition we can avoid one value in the
traversal.
Differential Revision: https://reviews.llvm.org/D103861
Johannes Doerfert [Sat, 10 Jul 2021 21:36:02 +0000 (16:36 -0500)]
[OpenMP][FIX] Update remark in test file after rewording
Johannes Doerfert [Fri, 25 Jun 2021 23:24:01 +0000 (18:24 -0500)]
[Attributor] Reorganize AAHeapToStack
In order to simplify future extensions, e.g., the merge of
AAHeapToShared in to AAHeapToStack, we reorganize AAHeapToStack and the
state we keep for each malloc-like call. The result is also less
confusing as we only track malloc-like calls, not all calls. Further, we
only perform the updates necessary for a malloc-like to argue it can go
to the stack, e.g., we won't check all uses if we moved on to the
"must-be-freed" argument.
This patch also uses Attributor helps to simplify the allocated size,
alignment, and the potentially freed objects.
Overall, this is mostly a reorganization and only the use of the
optimistic helpers should change (=improve) the capabilities a bit.
Differential Revision: https://reviews.llvm.org/D104993
Johannes Doerfert [Thu, 24 Jun 2021 18:12:06 +0000 (13:12 -0500)]
[Attributor][FIX] Do not replace a value with a non-dominating instruction
We have to be careful when we replace values to not use a non-dominating
instruction. It makes sense that simplification offers those as
"simplified values" but we can't manifest them in the IR without PHI
nodes. In the future we should consider potentially adding those PHI
nodes.
David Green [Sat, 10 Jul 2021 21:08:30 +0000 (22:08 +0100)]
[ARM] Extra widening and narrowing combinations tests. NFC
Johannes Doerfert [Mon, 10 May 2021 01:16:50 +0000 (20:16 -0500)]
[Attributor] Use AAValueSimplify to simplify returned values
We should use AAValueSimplify for all value simplification, however
there was some leftover logic that predates AAValueSimplify in
AAReturnedValues. This remove the AAReturnedValues part and provides a
replacement by making AAValueSimplifyReturned strong enough to handle
all previously covered cases. Further, this improve
AAValueSimplifyCallSiteReturned to handle returned arguments.
AAReturnedValues is now much easier and the collected returned
values/instructions are now from the associated function only, making it
much more sane. We also do not have the brittle logic anymore that looks
for unresolved calls. Instead, we use AAValueSimplify to handle
recursion.
Useful code has been split into helper functions, e.g., an Attributor
interface to get a simplified value.
Differential Revision: https://reviews.llvm.org/D103860
Johannes Doerfert [Sat, 8 May 2021 06:21:17 +0000 (01:21 -0500)]
[Attributor] Introduce an optimistic getUnderlyingObjects helper
As the `llvm::getUnderlyingObjects` helper, the optimistic version
collects objects that might be the base of a given pointer. In contrast
to the llvm variant, the optimistic one will use assumed information,
e.g., about select conditions or dead blocks, to provide a more precise
result.
Differential Revision: https://reviews.llvm.org/D103859
Johannes Doerfert [Sat, 8 May 2021 04:06:44 +0000 (23:06 -0500)]
[Attributor][FIX] Traverse uses even if a value is assumed constant
Not all attributes are able to handle the interprocedural step and
follow the uses into a call site. Let them be able to combine call site
uses instead. This might result in some unused values/arguments being
leftover but it removes problems where we misused "is dead" even though
it was actually "is simplified/replaced".
We explicitly check for dead values due to constant propagation in
`AAIsDeadValueImpl::areAllUsesAssumedDead` instead.
Differential Revision: https://reviews.llvm.org/D103858
Nico Weber [Sat, 10 Jul 2021 20:15:55 +0000 (16:15 -0400)]
Revert Attributor patch series
Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n
ebbe149a6f08535ede848a531a601ae6591cfbc5..
269416d41908bb670f67af689155d5ab8eea689a`
Vassil Vassilev [Sat, 10 Jul 2021 17:50:41 +0000 (17:50 +0000)]
Reland "[clang-repl] Allow passing in code as positional arguments."
This reverts commit
3ec88ca60b24 which reverted
e386871e1d21 due to a asan build
failure.
This patch removes the new lines in the test case which seem to introduce the
failure.
Differential revision: https://reviews.llvm.org/D104898
Nico Weber [Sat, 10 Jul 2021 17:35:05 +0000 (13:35 -0400)]
Revert "llvm-symbolizer: Fix "start file" to work with Split DWARF"
This reverts commit
04c203e310bd3fb58e16c936c0200d680100526e.
Test fails on Windows.
Johannes Doerfert [Sat, 10 Jul 2021 00:09:40 +0000 (19:09 -0500)]
[Attributor][NFCI] Add UsedAssumedInformation to more interfaces
As with other Attributor interfaces we often want to know if assumed
information was used to answer a query. This is important if only
known information is allowed or if known information can lead to an
early fixpoint. The users have been adjusted but none of them utilizes
the new information yet.
Johannes Doerfert [Wed, 23 Jun 2021 21:33:49 +0000 (16:33 -0500)]
[OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.
The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
Johannes Doerfert [Wed, 7 Jul 2021 20:06:41 +0000 (15:06 -0500)]
[OpenMP] Remove checkXXXX device runtime functions
We had multiple functions to determine the execution mode (SPMD/Generic)
and runtime status (initialized/uninitialized) but that just increased
complexity without a real benefit. Especially with D102307 in mind it
is helpful to reduce the dependence on the `ident_t` flags.
Differential Revision: https://reviews.llvm.org/D105586
Johannes Doerfert [Thu, 24 Jun 2021 21:01:35 +0000 (16:01 -0500)]
[Attributor][FIX] Sanitize queries to LVI and ScalarEvolution
When we talk to outside analyse, e.g., LVI and ScalarEvolution, we need
to be careful with the query. The particular error occurred because we
folded a PHI node before the LVI query but the context location was now
not dominated by the value anymore. This is not supported by LVI so we
have to filter these situations before we query the outside analyses.
Johannes Doerfert [Thu, 24 Jun 2021 18:12:06 +0000 (13:12 -0500)]
[Attributor][FIX] Do not replace a value with a non-dominating instruction
We have to be careful when we replace values to not use a non-dominating
instruction. It makes sense that simplification offers those as
"simplified values" but we can't manifest them in the IR without PHI
nodes. In the future we should consider potentially adding those PHI
nodes.
Johannes Doerfert [Thu, 20 May 2021 05:37:29 +0000 (00:37 -0500)]
[OpenMP] Create custom state machines for generic target regions
In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.
The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D101977