Craig Topper [Tue, 5 May 2020 21:40:01 +0000 (14:40 -0700)]
[X86] Fix usage of Align constructing MachineMemOperands.
Similar to D77687, but for the X86 specific code.
Differential Revision: https://reviews.llvm.org/D79381
Sergej Jaskiewicz [Tue, 5 May 2020 22:22:55 +0000 (01:22 +0300)]
[libc++] [test] Generate static_test_env on the fly
Summary:
Instead of storing `static_test_env` (with all the symlinks) in the repo, we create it on the fly to be cross-toolchain-friendly. The primary use case for this are Windows-hosted cross-toolchains. Windows doesn't really have a concept of symlinks. So, when the monorepo is cloned, those symlinks turn to ordinary text files. Previously, if we cross-compiled libc++ for some symlink-friendly system (e. g. Linux) and ran tests on the target system, some tests would fail. This patch makes them pass.
Reviewers: ldionne, #libc
Reviewed By: ldionne, #libc
Subscribers: EricWF, dexonsmith, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D78200
Sergej Jaskiewicz [Tue, 5 May 2020 22:19:48 +0000 (01:19 +0300)]
Revert "[libc++] Generate symlinks in static_test_env on the fly"
This reverts commit
645ad5badbabdeca31de5c98ea8135c5a6e7d710.
This commit did not incorporate all the changes intended.
Reid Kleckner [Tue, 5 May 2020 03:03:19 +0000 (20:03 -0700)]
[Support] Move LLD's parallel algorithm wrappers to support
Essentially takes the lld/Common/Threads.h wrappers and moves them to
the llvm/Support/Paralle.h algorithm header.
The changes are:
- Remove policy parameter, since all clients use `par`.
- Rename the methods to `parallelSort` etc to match LLVM style, since
they are no longer C++17 pstl compatible.
- Move algorithms from llvm::parallel:: to llvm::, since they have
"parallel" in the name and are no longer overloads of the regular
algorithms.
- Add range overloads
- Use the sequential algorithm directly when 1 thread is requested
(skips task grouping)
- Fix the index type of parallelForEachN to size_t. Nobody in LLVM was
using any other parameter, and it made overload resolution hard for
for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.
Remove Threads.h and update LLD for that.
This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.
Reviewed By: MaskRay, aganea
Differential Revision: https://reviews.llvm.org/D79390
Christopher Tetreault [Tue, 5 May 2020 21:21:59 +0000 (14:21 -0700)]
[SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem
Summary:
getLogBase2 tries to iterate over the number of vector elements. Since
the number of elements of a scalable vector is unknown at compile time,
we must return null if the input type is scalable.
Identified by test LLVM.Transforms/InstCombine::nsw.ll
Reviewers: efriedma, fpetrogalli, kmclaughlin, spatel
Reviewed By: efriedma, fpetrogalli
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79197
Sergej Jaskiewicz [Wed, 15 Apr 2020 13:21:27 +0000 (16:21 +0300)]
[libc++] Generate symlinks in static_test_env on the fly
Instead of storing static_test_env (with all the symlinks) in the repo,
we create it on the fly to be cross-toolchain-friendly. The primary
use case for this are Windows-hosted cross-toolchains. Windows doesn't
really have a concept of symlinks. So, when the monorepo is cloned,
those symlinks turn to ordinary text files. Previously, if we
cross-compiled libc++ for some symlink-friendly system (e. g. Linux) and
ran tests on the target system, some tests would fail. This patch makes
them pass.
Differential Revision: https://reviews.llvm.org/D78200
Konrad Kleine [Tue, 5 May 2020 14:29:57 +0000 (10:29 -0400)]
[clang/clang-tools-extra] Fix BZ44437 - add_new_check.py does not work with Python 3
Summary:
This fixes https://bugs.llvm.org/show_bug.cgi?id=44437.
Thanks to Arnaud Desitter for providing the patch in the bug report!
A simple example program to reproduce this error is this:
```lang=python
import sys
with open(sys.argv[0], 'r') as f:
lines = f.readlines()
lines = iter(lines)
line = lines.next()
print(line)
```
which will error with this in python python 3:
```
Traceback (most recent call last):
File "./mytest.py", line 8, in <module>
line = lines.next()
AttributeError: 'list_iterator' object has no attribute 'next'
```
Here's the same strategy applied to my test program as applied to the `add_new_check.py` file:
```lang=python
import sys
with open(sys.argv[0], 'r') as f:
lines = f.readlines()
lines = iter(lines)
line = next(lines)
print(line)
```
The built-in function `next()` is new since Python 2.6: https://docs.python.org/2/library/functions.html#next
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79419
Artem Belevich [Tue, 5 May 2020 20:54:26 +0000 (13:54 -0700)]
Revert D77954 -- it breaks Eigen & Tensorflow.
This reverts commit
55bcb96f3154808bcb5afc3fb46d8e00bf1db847.
Jonas Devlieghere [Tue, 5 May 2020 21:06:24 +0000 (14:06 -0700)]
[lldb/Test] Update expressions.test for non-zero exit code
Updates Windows test for
61d5b0e66394.
Jan Korous [Tue, 5 May 2020 20:54:37 +0000 (13:54 -0700)]
[VFS][NFC] Fix typo in comment
Stanislav Mekhanoshin [Tue, 5 May 2020 20:52:04 +0000 (13:52 -0700)]
[AMDGPU] Added 'a' constraint documentation. NFC.
AGPR inline asm constraint was missing from the LangRef.rst.
Sean Silva [Tue, 5 May 2020 20:19:30 +0000 (13:19 -0700)]
[mlir][shape] Extract ShapeBase.td
Alina Sbirlea [Tue, 5 May 2020 00:25:14 +0000 (17:25 -0700)]
[MemorySSA] Make MemoryLocation unknown when phi translation cannot be performed.
Summary: When phi translation cannot be performed, be conservative and make the MemoryLocation unknown.
Reviewers: george.burgess.iv
Subscribers: Prazek, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79386
Siva Chandra Reddy [Tue, 5 May 2020 20:06:49 +0000 (13:06 -0700)]
[libc] Add no_sanitize("address") attribute to the getMPFRMatcher function.
This dramtically reduces the run time of tests. For example,
sincosf_test takes over 25 minutes without this attribute but only 8
seconds with this attribute.
Davide Italiano [Tue, 5 May 2020 20:13:46 +0000 (13:13 -0700)]
[TestIndirectSymbol] This tests an Apple-specific feature.
Remove a redundant check.
Davide Italiano [Tue, 5 May 2020 20:13:16 +0000 (13:13 -0700)]
[TestIndirectSymbols] This now runs and works on iOS (arm64).
Sanjay Patel [Tue, 5 May 2020 20:02:45 +0000 (16:02 -0400)]
[ValueTracking] fix CannotBeNegativeZero() to disregard 'nsz' FMF
The 'nsz' flag is different than 'nnan' or 'ninf' in that it does not create poison.
Make that explicit in the LangRef and fix ValueTracking analysis that misinterpreted
the definition.
This manifests as bugs in InstSimplify shown in the test diffs and as discussed in
PR45778:
https://bugs.llvm.org/show_bug.cgi?id=45778
Differential Revision: https://reviews.llvm.org/D79422
River Riddle [Tue, 5 May 2020 19:39:29 +0000 (12:39 -0700)]
[mlir][DenseStringElementsAttr] Fix AttributeElementIterator in the case of a splat.
River Riddle [Tue, 5 May 2020 19:39:22 +0000 (12:39 -0700)]
[mlir][DenseElementsAttr] Add support for opaque APFloat/APInt complex values.
This revision allows for creating DenseElementsAttrs and accessing elements using std::complex<APInt>/std::complex<APFloat>. This allows for opaquely accessing and transforming complex values. This is used by the printer/parser to provide pretty printing for complex values. The form for complex values matches that of std::complex, i.e.:
```
// `(` element `,` element `)`
dense<(10,10)> : tensor<complex<i64>>
```
Differential Revision: https://reviews.llvm.org/D79296
River Riddle [Tue, 5 May 2020 19:39:12 +0000 (12:39 -0700)]
[mlir][DenseElementsAttr] Add support for ComplexType elements
This revision adds support for storing ComplexType elements inside of a DenseElementsAttr. We store complex objects as an array of two elements, matching the definition of std::complex. There is no current attribute storage for ComplexType, but DenseElementsAttr provides API for access/creation using std::complex<>. Given that the internal implementation of DenseElementsAttr is already fairly opaque, the only real complexity here is in the printing/parsing. This revision keeps it simple for now and always uses hex when printing complex elements. A followup will add prettier syntax for this.
Differential Revision: https://reviews.llvm.org/D79281
Stephen Neuendorffer [Tue, 5 May 2020 19:26:48 +0000 (12:26 -0700)]
[MLIR] mlir-opt needs PUBLIC dependence
We see intermittent build errors on the windows buildbot because
mlir-opt is including Linalg headers which haven't been built yet.
This dependence should be resolved by declaring a PUBLIC dependence
on the Linalg library when building MLIROptMain.
Michael Liao [Tue, 5 May 2020 04:55:13 +0000 (00:55 -0400)]
[clang][codegen] Hoist parameter attribute setting in function prolog.
Summary:
- If the coerced type is still a pointer, it should be set with proper
parameter attributes, such as `noalias`, `nonnull`, and etc. Hoist
that (pointer) parameter attribute setting so that the coerced pointer
parameter could be marked properly.
Depends on D79394
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79395
Michael Liao [Tue, 5 May 2020 03:53:24 +0000 (23:53 -0400)]
[clang][codegen] Refactor argument loading in function prolog. NFC.
Summary:
- Skip copying function arguments and unnecessary casting by using them
directly.
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79394
Erik Pilkington [Tue, 5 May 2020 18:01:04 +0000 (14:01 -0400)]
[AST] Print fixed enum type regardless of language mode
These are permitted in all language modes, not just C++11.
Erik Pilkington [Tue, 5 May 2020 17:32:08 +0000 (13:32 -0400)]
[SemaObjC] Add a warning for dictionary literals with duplicate keys
Duplicate keys in a literal break NSDictionary's invariants. rdar://
50454461A
Differential revision: https://reviews.llvm.org/D78660
Lei Zhang [Tue, 5 May 2020 18:54:23 +0000 (14:54 -0400)]
[mlir] Specify CMAKE_CXX_STANDARD to standalone dialect
This addresses a compilation failure on GCC 5:
error: #error This file requires compiler and library support for the
ISO C++ 2011 standard. This support must be enabled with the -std=c++11
or -std=gnu++11 compiler options.
#error This file requires compiler and library support
Differential Revision: https://reviews.llvm.org/D79439
Christudasan Devadasan [Tue, 5 May 2020 18:40:14 +0000 (00:10 +0530)]
[AMDGPU] Fixed the test by adding the triple.
Alex Zinenko [Tue, 5 May 2020 12:09:35 +0000 (14:09 +0200)]
[mlir] Harden verifiers for DMA ops
DMA operation classes in the Standard dialect (`DmaStartOp` and `DmaWaitOp`)
provide helper functions that make numerous assumptions about the number and
order of operands, and about their types. However, these assumptions were not
checked in the verifier, leading to assertion failures or crashes when helper
functions were used on ill-formed ops. Some of the assuptions were checked in
the custom parser (and thus could not check assumption violations in ops
constructed programmatically, e.g., during rewrites) and others were not
checked at all. Introduce the verifiers for all these assumptions and drop
unnecessary checks in the parser that are now covered by the verifier.
Addresses PR45560.
Differential Revision: https://reviews.llvm.org/D79408
Tim Keith [Mon, 4 May 2020 14:25:02 +0000 (07:25 -0700)]
[flang] Fix bug in tests for standalone build
When doing a standalone build of flang against an LLVM that contains a
built flang, the tests were run on the flang from LLVM rather than on
the one that was just built.
The problem was in the lit configuration for finding %flang etc.
Fix it to look only in the directory where it was built.
Differential Revision: https://reviews.llvm.org/D79327
Momchil Velikov [Tue, 5 May 2020 17:53:52 +0000 (18:53 +0100)]
Revert "[ARM] CMSE code generation"
This reverts commit
7cbbf89d230d46c3de9a7affc29b23f08c4377a1.
The regression tests fail with the expensive checks.
David Blaikie [Tue, 5 May 2020 18:04:43 +0000 (11:04 -0700)]
Collapse variable into assert to remove non-assert unused variable
Kazu Hirata [Mon, 4 May 2020 20:49:27 +0000 (13:49 -0700)]
[Inlining] Teach shouldBeDeferred to take the total cost into account
Summary:
This patch teaches shouldBeDeferred to take into account the total
cost of inlining.
Suppose we have a call hierarchy {A1,A2,A3,...}->B->C. (Each of A1,
A2, A3, ... calls B, which in turn calls C.)
Without this patch, shouldBeDeferred essentially returns true if
TotalSecondaryCost < IC.getCost()
where TotalSecondaryCost is the total cost of inlining B into As.
This means that if B is a small wraper function, for example, it would
get inlined into all of As. In turn, C gets inlined into all of As.
In other words, shouldBeDeferred ignores the cost of inlining C into
each of As.
This patch adds an option, inline-deferral-scale, to replace the
expression above with:
TotalCost < Allowance
where
- TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and
- Allowance is IC.getCost() * Scale
For now, the new option defaults to -1, disabling the new scheme.
Reviewers: davidxl
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79138
Jonas Devlieghere [Tue, 5 May 2020 17:58:03 +0000 (10:58 -0700)]
[lldb/Driver] Exit with a non-zero exit code in case of error in batch mode.
We have the option to stop running commands in batch mode when an error
occurs. When that happens we should exit the driver with a non-zero exit
code.
Differential revision: https://reviews.llvm.org/D78825
Nico Weber [Tue, 5 May 2020 01:15:29 +0000 (21:15 -0400)]
Let normalize() for posix style convert backslash to slash unconditionally.
Currently, normalize() for posix replaces backslashes to slashes, except
that two backslashes in sequence are kept as-is.
clang calls normalize() to convert \ to / is microsoft compat mode. This
generally works well, but a path like "c:\\foo\\bar.h" with two
backslashes doesn't work due to the exception in normalize().
These paths happen naturally on Windows hosts with e.g.
`#include __FILE__`, and them not working on other hosts makes it
more difficult to write tests for this case.
The special case has been around without justification since this code
was added in r203611 (since then moved around in r215241 r215243). No
integration tests fail if I remove it.
Try removing the special case.
Differential Revision: https://reviews.llvm.org/D79265
Andy Davis [Tue, 5 May 2020 17:29:09 +0000 (10:29 -0700)]
[MLIR][LoopOps] Adds the loop unroll transformation for loop::ForOp.
Summary:
Adds the loop unroll transformation for loop::ForOp.
Adds support for promoting the body of single-iteration loop::ForOps into its containing block.
Adds check tests for loop::ForOps with dynamic and static lower/upper bounds and step.
Care was taken to share code (where possible) with the AffineForOp unroll transformation to ease maintenance and potential future transition to a LoopLike construct on which loop transformations for different loop types can implemented.
Reviewers: ftynse, nicolasvasilache
Reviewed By: ftynse
Subscribers: bondhugula, mgorny, zzheng, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79184
Nico Weber [Tue, 5 May 2020 17:41:24 +0000 (13:41 -0400)]
Add a test to Support.NormalizePath.
Christudasan Devadasan [Fri, 27 Mar 2020 07:46:51 +0000 (03:46 -0400)]
[AMDGPU] Introduce more scratch registers in the ABI.
The AMDGPU target has a convention that defined all VGPRs
(execept the initial 32 argument registers) as callee-saved.
This convention is not efficient always, esp. when the callee
requiring more registers, ended up emitting a large number of
spills, even though its caller requires only a few.
This patch revises the ABI by introducing more scratch registers
that a callee can freely use.
The 256 vgpr registers now become:
32 argument registers
112 scratch registers and
112 callee saved registers.
The scratch registers and the CSRs are intermixed at regular
intervals (a split boundary of 8) to obtain a better occupancy.
Reviewers: arsenm, t-tye, rampitec, b-sumner, mjbedy, tpr
Reviewed By: arsenm, t-tye
Differential Revision: https://reviews.llvm.org/D76356
Lei Zhang [Tue, 5 May 2020 16:41:28 +0000 (12:41 -0400)]
[mlir] Add missing dependency to MLIRMlirOptMain
Differential Revision: https://reviews.llvm.org/D79429
Louis Dionne [Tue, 5 May 2020 17:13:56 +0000 (13:13 -0400)]
[libc++] Rewrite the tests for cin, cout, clog, cerr and friends
The tests were disabled with `#if 0`, most likely because there was no
way of writing shell tests when they were first written.
Siva Chandra Reddy [Fri, 1 May 2020 07:54:33 +0000 (00:54 -0700)]
[libc] Improve information printed on failure of a math test which uses MPFR.
A new test matcher class MPFRMatcher is added along with helper macros
EXPECT|ASSERT_MPFR_MATCH.
New type traits classes RemoveCV and IsFloatingPointType have been
added and used to implement the above class and its helpers.
Reviewers: abrachet, phosek
Differential Revision: https://reviews.llvm.org/D79256
Momchil Velikov [Tue, 5 May 2020 16:04:29 +0000 (17:04 +0100)]
[ARM] CMSE code generation
This patch implements the final bits of CMSE code generation:
* emit special linker symbols
* restrict parameter passing to not use memory
* emit BXNS and BLXNS instructions for returns from non-secure entry
functions, and non-secure function calls, respectively
* emit code to save/restore secure floating-point state around calls
to non-secure functions
* emit code to save/restore non-secure floating-pointy state upon
entry to non-secure entry function, and return to non-secure state
* emit code to clobber registers not used for arguments and returns
when switching to no-secure state
Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green,
possibly others.
Differential Revision: https://reviews.llvm.org/D76518
Louis Dionne [Tue, 5 May 2020 17:20:38 +0000 (13:20 -0400)]
[libc++abi] NFC: Remove pragma mark in favor of normal comment
Stanislav Mekhanoshin [Mon, 4 May 2020 19:47:23 +0000 (12:47 -0700)]
[AMDGPU] Fix FoldImmediate for 16 bit operand
Differential Revision: https://reviews.llvm.org/D79362
Hans Wennborg [Tue, 5 May 2020 13:43:54 +0000 (15:43 +0200)]
Don't assert about missing profile info in createProfileWeightsForLoop
The compiler shouldn't crash if the profile info is slightly off. We hit
this in Chromium.
Differential revision: https://reviews.llvm.org/D79417
Sid Manning [Tue, 28 Apr 2020 20:43:54 +0000 (15:43 -0500)]
[Hexagon] Add R_HEX_GD_PLT_B22/32_PCREL relocations
Extended versions of GD_PLT_B22_PCREL. These surface when -mlong-calls
is used.
Differential Revision: https://reviews.llvm.org/D79191
Sanjay Patel [Tue, 5 May 2020 16:32:15 +0000 (12:32 -0400)]
[SLP] add another bailout for load-combine patterns
This builds on the or-reduction bailout that was added with D67841.
We still do not have IR-level load combining, although that could
be a target-specific enhancement for -vector-combiner.
The heuristic is narrowly defined to catch the motivating case from
PR39538:
https://bugs.llvm.org/show_bug.cgi?id=39538
...while preserving existing functionality.
That is, there's an unmodified test of pure load/zext/store that is
not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll.
That's the reason for the logic difference to require the 'or'
instructions. The chances that vectorization would actually help a
memory-bound sequence like that seem small, but it looks nicer with:
vpmovzxwd (%rsi), %xmm0
vmovdqu %xmm0, (%rdi)
rather than:
movzwl (%rsi), %eax
movl %eax, (%rdi)
...
In the motivating test, we avoid creating a vector mess that is
unrecoverable in the backend, and SDAG forms the expected bswap
instructions after load combining:
movzbl (%rdi), %eax
vmovd %eax, %xmm0
movzbl 1(%rdi), %eax
vmovd %eax, %xmm1
movzbl 2(%rdi), %eax
vpinsrb $4, 4(%rdi), %xmm0, %xmm0
vpinsrb $8, 8(%rdi), %xmm0, %xmm0
vpinsrb $12, 12(%rdi), %xmm0, %xmm0
vmovd %eax, %xmm2
movzbl 3(%rdi), %eax
vpinsrb $1, 5(%rdi), %xmm1, %xmm1
vpinsrb $2, 9(%rdi), %xmm1, %xmm1
vpinsrb $3, 13(%rdi), %xmm1, %xmm1
vpslld $24, %xmm0, %xmm0
vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
vpslld $16, %xmm1, %xmm1
vpor %xmm0, %xmm1, %xmm0
vpinsrb $1, 6(%rdi), %xmm2, %xmm1
vmovd %eax, %xmm2
vpinsrb $2, 10(%rdi), %xmm1, %xmm1
vpinsrb $3, 14(%rdi), %xmm1, %xmm1
vpinsrb $1, 7(%rdi), %xmm2, %xmm2
vpinsrb $2, 11(%rdi), %xmm2, %xmm2
vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
vpinsrb $3, 15(%rdi), %xmm2, %xmm2
vpslld $8, %xmm1, %xmm1
vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero
vpor %xmm2, %xmm1, %xmm1
vpor %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rsi)
movl (%rdi), %eax
movl 4(%rdi), %ecx
movl 8(%rdi), %edx
movbel %eax, (%rsi)
movbel %ecx, 4(%rsi)
movl 12(%rdi), %ecx
movbel %edx, 8(%rsi)
movbel %ecx, 12(%rsi)
Differential Revision: https://reviews.llvm.org/D78997
Pete Steinfeld [Fri, 1 May 2020 20:00:28 +0000 (13:00 -0700)]
[flang] New implementation for checks for constraints C741 through C750
Summary:
Most of these checks were already implemented, and I just added references to
them to the code and tests. Also, much of this code was already
reviewed in the old flang/f18 GitHub repository, but I didn't get to
merge it before we switched repositories.
I implemented the check for C747 to not allow coarray components in derived
types that are of type C_PTR, C_FUNPTR, or type TEAM_TYPE.
I implemented the check for C748 that requires a data component whose type has
a coarray ultimate component to be a nonpointer, nonallocatable scalar and not
be a coarray.
I implemented the check for C750 that adds additional restrictions to the
bounds expressions of a derived type component that's an array.
These bounds expressions are sepcification expressions as defined in
10.1.11. There was already code in lib/Evaluate/check-expression.cpp to
check semantics for specification expressions, but it did not check for
the extra requirements of C750.
C750 prohibits specification functions, the intrinsic functions
ALLOCATED, ASSOCIATED, EXTENDS_TYPE_OF, PRESENT, and SAME_TYPE_AS. It
also requires every specification inquiry reference to be a constant
expression, and requires that the value of the bound not depend on the
value of a variable.
To implement these additional checks, I added code to the intrinsic proc
table to get the intrinsic class of a procedure. I also added an
enumeration to distinguish between specification expressions for
derived type component bounds versus for type parameters. I then
changed the code to pass an enumeration value to
"CheckSpecificationExpr()" to indicate that the expression was a bounds
expression and used this value to determine whether to emit an error
message when violations of C750 are found.
I changed the implementation of IsPureProcedure() to handle statement
functions and changed some references in the code that tested for the
PURE attribute to call IsPureProcedure().
I also fixed some unrelated tests that got new errors when I implemented these
new checks.
Reviewers: tskeith, DavidTruby, sscalpone
Subscribers: jfb, llvm-commits
Tags: #llvm, #flang
Differential Revision: https://reviews.llvm.org/D79263
Francesco Petrogalli [Mon, 27 Apr 2020 22:11:43 +0000 (22:11 +0000)]
[clang][OpenMP] Fix getNDSWDS for aarch64.
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.
The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.
Reviewers: ABataev, jdoerfert, andwar
Reviewed By: jdoerfert
Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78969
Stephen Neuendorffer [Wed, 22 Apr 2020 17:52:07 +0000 (10:52 -0700)]
[MLIR] Add a tests for out of tree dialect example.
This attempts to ensure that out of tree usage remains stable.
Differential Revision: https://reviews.llvm.org/D78656
Vedant Kumar [Tue, 5 May 2020 16:17:55 +0000 (09:17 -0700)]
[lldb/unittest] Avoid relying on compiler character encoding in unicode test
This is a speculative fix for a unit test failure on a Win/MSVC2017 bot
(http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/16106/steps/test/logs/stdio).
Jinsong Ji [Tue, 5 May 2020 14:27:59 +0000 (14:27 +0000)]
[MachinePipeliner] Add ORE for MachinePipeliner
This patch adds ORE for MachinePipeliner, so that people can anaylyze
their code using opt-viewer or other tools, then optimize the code to
catch more piplining opportunities.
Reviewed By: bcahoon
Differential Revision: https://reviews.llvm.org/D79368
Simon Pilgrim [Tue, 5 May 2020 15:57:55 +0000 (16:57 +0100)]
[TTI] getScalarizationOverhead - use explicit VectorType operand
getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions).
Followup to D78357
Reviewed By: @samparker
Differential Revision: https://reviews.llvm.org/D79341
Stephen Neuendorffer [Tue, 5 May 2020 15:54:03 +0000 (08:54 -0700)]
[MLIR] GPUToCUDA conversion: MC is only needed if NVPTX is enabled.
This patch conditionally links with MC
Stephen Neuendorffer [Mon, 4 May 2020 21:49:53 +0000 (14:49 -0700)]
[flang] update tools/f18 to use LLVM_LINK_COMPONENTS.
This will prevent conflicts with libLLVM.so when using LLVM_LINK_LLVM_DYLIB=on.
Differential Revision: https://reviews.llvm.org/D79370
Pengxuan Zheng [Thu, 23 Apr 2020 22:07:03 +0000 (15:07 -0700)]
[RISCV] Update debug scratch register names
Summary:
The RISC-V debug register was named dscratch in a previous draft of the RISC-V
debug mode spec. The number of registers has been increased to 2 in the latest
ratified version of the debug mode spec and the registers were named dscratch0
and dscratch1. We still support using the old register name "dscratch", but it
would be disassembled as "dscratch0" with this change.
Reviewers: apazos, asb, lenary, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, sameer.abuasal, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78764
Fangrui Song [Mon, 4 May 2020 16:40:47 +0000 (09:40 -0700)]
[llvm-objcopy][ELF] Allow --dump-section to dump an empty non-SHT_NOBITS section
This is the ELF part of D75949.
GNU objcopy from binutils 2.35 onwards will support an empty non-SHT_NOBITS section as
well
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=
e052e2ba295a65b6ea80cbc3f90495beca299c42
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D79339
Fangrui Song [Tue, 5 May 2020 15:16:10 +0000 (08:16 -0700)]
[llvm-objcopy][test] ELF/dump-section.test: change #CHECK to # CHECK
The latter is more common and thus the preferred way to use CHECK lines.
Arthur Eubanks [Mon, 4 May 2020 17:37:38 +0000 (10:37 -0700)]
Remove unnecessary check for inalloca in IPConstantPropagation
Summary:
This was added in https://reviews.llvm.org/D2449, but I'm not sure it's
necessary since an inalloca value is never a Constant (should be an
AllocaInst).
Reviewers: hans, rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79350
Jay Foad [Tue, 5 May 2020 12:32:59 +0000 (13:32 +0100)]
[InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y)
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.
Differential Revision: https://reviews.llvm.org/D79413
Fangrui Song [Mon, 4 May 2020 16:45:41 +0000 (09:45 -0700)]
[Support] Allow FileOutputBuffer::create to create an empty file
Size==0 triggers `assert(Size != 0)` in mapped_file_region::init.
I plan to use an empty file in D79339 (llvm-objcopy --dump-section).
According to POSIX, "If len is zero, mmap() shall fail and no mapping
shall be established." Just specialize case Size=0 to use
createInMemoryBuffer.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D79338
Jay Foad [Tue, 5 May 2020 12:28:14 +0000 (13:28 +0100)]
Precommit new test cases for D79413 [InstCombine] Allow denormal C in pow(C,y) -> exp2(log2(C)*y)
David Green [Tue, 5 May 2020 06:15:21 +0000 (07:15 +0100)]
[LSR] Don't require register reuse under postinc
LSR has some logic that tries to aggressively reuse registers in
formula. This can lead to sub-optimal decision in complex loops where
the backend it trying to use shouldFavorPostInc. This disables the
re-use in those situations.
Differential Revision: https://reviews.llvm.org/D79301
Jay Foad [Mon, 4 May 2020 16:41:08 +0000 (17:41 +0100)]
[AMDGPU] Better support for VMEM soft clauses in GCNHazardRecognizer
VMEM soft clauses only contain VMEM and FLAT instructions. Teaching
GCNHazardRecognizer::checkSoftClauseHazards that other kinds of
instructions will naturally break the clause means there are far fewer
cases where it has to insert an s_nop instruction to forcibly break the
clause.
Differential Revision: https://reviews.llvm.org/D79353
Yitzhak Mandelbaum [Tue, 5 May 2020 00:00:41 +0000 (20:00 -0400)]
[clang-tidy] In TransformerClangTidyCheck, support option IncludeStyle.
Summary:
The new option allows the user to specify which file naming convention is used
by the source code ('llvm' or 'google').
Reviewers: gribozavr2
Subscribers: xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79380
Sam Parker [Tue, 5 May 2020 14:28:40 +0000 (15:28 +0100)]
[NFC] Update tests
Run the update script on a couple of tests.
Louis Dionne [Tue, 5 May 2020 13:53:16 +0000 (09:53 -0400)]
[libc++] Remove unused functions and minor features of the test suite
This commit removes minor features of the test suite that I've never
seen used and that are basically just a maintenance burden:
- color_diagnostics: Diagnostics are colored by default when running
from a terminal, and not colored otherwise. This is the right behavior.
Being able to tweak this has minor value, and could be achieved by
modifying the %{compile_flags} instead if absolutely needed.
- ccache: This can be achieved by using a wrapper for the %{cxx}
substitution.
- _dump_macros_verbose is just a dead function now.
Ehsan Toosi [Tue, 28 Apr 2020 12:44:21 +0000 (14:44 +0200)]
[MLIR][LINALG] Convert Linalg on Tensors to Buffers
This is a basic pass to convert Linalg.GenericOp which works on tensors to use
buffers instead.
Differential Revision: https://reviews.llvm.org/D78996
Jay Foad [Tue, 5 May 2020 10:35:23 +0000 (11:35 +0100)]
[InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x)
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.
Differential Revision: https://reviews.llvm.org/D79407
Louis Dionne [Tue, 5 May 2020 13:43:54 +0000 (09:43 -0400)]
[libc++] Allow <__config_site> not being included
Otherwise, we can't test other standard libraries.
Matt Arsenault [Tue, 28 Apr 2020 01:44:46 +0000 (21:44 -0400)]
Elaborate more on --rocm-path flag.
I'm not sure what the conventions are for this documentation. The
format seems limiting. I don't see how to refer to other flags, or
mark flags as deprecated. The rst I believe these generate seems to be
in source, and out of date.
Louis Dionne [Mon, 4 May 2020 21:05:21 +0000 (17:05 -0400)]
[libc++] Move parsing of <__config_site> macros to the DSL
Raphael Isemann [Tue, 5 May 2020 13:23:34 +0000 (15:23 +0200)]
Revert "[lldb][cmake] Also use local submodule visibility on Darwin"
This reverts commit
8baa0b9439b5788bc5a0a2ee45dfda01b7a5a43f. This broke the
LLDB Green Dragon bot where htonl is getting miscompiled on macOS 10.14 and 10.15
SDKs, causing networking tests to fail as IP addressed were being inverted
(e.g., 127.0.0.1 became 1.0.0.127 with an enabled modules build).
Reverting until this is fixed.
Jonathan Coe [Tue, 5 May 2020 13:05:00 +0000 (14:05 +0100)]
[clang-format] C# always regards && as a binary operator
Reviewers: krasimir, MyDeveloperDay
Reviewed By: MyDeveloperDay
Subscribers: cfe-commits
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D79414
Sebastian Neubauer [Tue, 17 Mar 2020 12:14:24 +0000 (13:14 +0100)]
[AMDGPU] Don't mark the .note section as ALLOC
Marking a section as ALLOC tells the ELF loader to load the section into memory.
As we do not want to load the notes into VRAM, the flag should not be there.
On AMDHSA, .note is still marked as ALLOC, apparently this is currently
needed for OpenCL (see https://reviews.llvm.org/D74995).
Differential Revision: https://reviews.llvm.org/D76278
Sander de Smalen [Tue, 5 May 2020 12:15:19 +0000 (13:15 +0100)]
[AArch64][SVE] Guard bitcast patterns under IsLE predicate
Reviewed By: efriedma
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79352
David Green [Tue, 5 May 2020 10:27:12 +0000 (11:27 +0100)]
[ARM] Correct the type on a predicate cast
A PREDICATE_CAST(PREDICATE_CAST(X)) can be converted to a
PREDICATE_CAST(X) as the operation can convert between any forms of
predicates (v4i1/v8i1/v16i1/i32). Unfortunately I got the type wrong on
one of the rarer converts, which would lead to invalid nodes during
isel. This fixes it up to use the correct type.
Differential Revision: https://reviews.llvm.org/D79402
Sander de Smalen [Tue, 5 May 2020 12:04:14 +0000 (13:04 +0100)]
[SveEmitter] Add builtins for svreinterpret
The reinterpret builtins are generated separately because they
need the cross product of all types, 121 functions in total,
which is inconvenient to specify in the arm_sve.td file.
Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78756
Jean-Michel Gorius [Tue, 5 May 2020 11:37:37 +0000 (13:37 +0200)]
[mlir][standalone] NFC: Update CMakeLists.txt to reflect best practices
Update to follow the changes introduced in 5469f43 and documented in 93f7e52.
Simon Pilgrim [Tue, 5 May 2020 11:29:57 +0000 (12:29 +0100)]
[InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2))
This adds a general combine that can be used to fold:
or(zext(OP(x)), shl(zext(OP(y)),bw/2))
-->
OP(or(zext(x), shl(zext(y),bw/2)))
Allowing us to widen 'concat-able' style or+zext patterns - I've just set this up for BSWAP but we could use this for other similar ops (BITREVERSE for instance).
We already do something similar for bitop(bswap(x),bswap(y)) --> bswap(bitop(x,y))
Fixes PR45715
Reviewed By: @lebedev.ri
Differential Revision: https://reviews.llvm.org/D79041
Alexander Belyaev [Tue, 5 May 2020 10:24:10 +0000 (12:24 +0200)]
[MLIR] Link MLIRStandardOpsTransforms with MLIRTransforms.
Summary: This fixes shared lib build.
Differential Revision: https://reviews.llvm.org/D79403
Simon Pilgrim [Tue, 5 May 2020 10:57:25 +0000 (11:57 +0100)]
[X86][AVX] combineVectorSignBitsTruncation - avoid complex vXi64->vXi32 PACKSS truncations (PR45794)
Unless we're truncating an 'all-bits' result, using PACKSS for vXi64->vXi32 truncation causes problems with later combines as ComputeNumSignBits struggles to see through BITCASTs to smaller types. If we don't use PACKSS in these cases then we fallback to shuffles which are usually just as good.
Simon Pilgrim [Tue, 5 May 2020 10:32:33 +0000 (11:32 +0100)]
[X86][AVX] Add PR45794 sitofp v4i64-v4f32 test case
Kadir Cetinkaya [Mon, 4 May 2020 08:48:19 +0000 (10:48 +0200)]
[clangd] Get rid of Inclusion::R
Summary:
This is only used by documentlink and go-to-definition. We are pushing
range detection logic from Inclusion creation to users. This would make using
stale preambles easier.
For document links we make use of the spelledtokens stored in tokenbuffers to
figure out file name range.
For go-to-def, we keep storing the line number we've seen the include directive.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79315
James Henderson [Mon, 4 May 2020 09:46:41 +0000 (10:46 +0100)]
[docs][llvm-objcopy] Update --output-target text with right defaults
The --output-target documentation has slightly rotted, as the default is
no longer purely based on the input file format, but also the value of
--input-target. This patch updates the documentation to make this
explicit.
Reviewed by: MaskRay, alexshap
Differential Revision: https://reviews.llvm.org/D79318
Nico Weber [Tue, 5 May 2020 10:15:20 +0000 (06:15 -0400)]
[gn build] (manually) merge
07f8ca6ab19
Andrea Di Biagio [Tue, 5 May 2020 09:46:55 +0000 (10:46 +0100)]
Forgot to add a -mtriple to a test. NFC
This should unbreak the clang-ppc64be-linux buildbot.
Kirill Bobyrev [Tue, 5 May 2020 09:22:59 +0000 (11:22 +0200)]
[clangd] NFC: Cleanup unused headers and libraries
Summary: Extended version of D78843.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79313
Sander de Smalen [Tue, 5 May 2020 08:16:57 +0000 (09:16 +0100)]
Reland D78750: [SveEmitter] Add builtins for svdupq and svdupq_lane
Edit: Changed a few CHECK lines into CHECK-DAG lines.
This reverts commit
90f3f62cb087782fe2608e95d686c29067281b6e.
Sam Parker [Tue, 28 Apr 2020 13:11:27 +0000 (14:11 +0100)]
[NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.
RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html
Differential Revision: https://reviews.llvm.org/D79002
Andrea Di Biagio [Mon, 4 May 2020 17:23:04 +0000 (18:23 +0100)]
[MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as depedent. Fixes PR45793.
This fixes a regression introduced by a very old commit
280ac1fd1dc35 (was
llvm-svn 361950).
Commit
280ac1fd1dc35 redesigned the logic in the LSUnit with the goal of
speeding up isReady() queries, and stabilising the LSUnit API (while also making
the load store unit more customisable).
The concept of MemoryGroup (effectively an alias set) was added by that commit
to better describe and track dependencies between memory operations. However,
that concept was not just used for alias dependencies, but it was also used for
describing memory "order" dependencies (enforced by the memory consistency
model).
Instructions of a same memory group were considered "equivalent" as in:
independent operations that can potentially execute in parallel. The problem
was that the cost of a dependency (in terms of number of cycles) should have
been different for "order" dependency. Instructions in an order dependency
simply have to have to wait until their predecessors are "issued" to an
underlying pipeline (rather than having to wait until predecessors have beeng
fully executed). For simple "order" dependencies, this was effectively
introducing an artificial delay on the "issue" of independent loads and stores.
This patch fixes the issue and adds a new test named 'independent-load-stores.s'
to a bunch of x86 targets. That test contains the reproducible posted by Fabian
Ritter on PR45793.
I had to rerun the update-mca-tests script on several files. To avoid expected
regressions on some Exynos tests, I have added a -noalias=false flag (to match
the old strict behavior on latencies).
Some tests for processor Barcelona are improved/fixed by this change and they
now show better results. In a few tests we were incorrectly counting the time
spent by instructions in a scheduler queue. In one case in particular we now
correctly see a store executed out of order. That test was affected by the same
underlying issue reported as PR45793.
Reviewers: mattd
Differential Revision: https://reviews.llvm.org/D79351
Pratyai Mazumder [Tue, 5 May 2020 08:19:13 +0000 (01:19 -0700)]
[SanitizerCoverage] Replace the unconditional store with a load, then a conditional store.
Reviewers: vitalybuka, kcc
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79392
Alex Zinenko [Tue, 5 May 2020 09:22:14 +0000 (11:22 +0200)]
[mlir] NFC: update ::build signature in the tutorial document
This was missing from the original commit that changed the interface of
`::build` methods to take `OpBuilder &` instead of `Builder *.
Heejin Ahn [Thu, 30 Apr 2020 07:09:59 +0000 (00:09 -0700)]
[WebAssembly] Fix block marker placing after fixUnwindMismatches
Summary:
This fixes a few things that are connected. It is very hard to provide
an independent test case for each of those fixes, because they are
interconnected and sometimes one masks another. The provided test case
triggers some of those bugs below but not all.
---
1. Background:
`placeBlockMarker` takes a BB, and if the BB is a destination of some
branch, it places `end_block` marker there, and computes the nearest
common dominator of all predecessors (what we call 'header') and places
a `block` marker there.
When we first place markers, we traverse BBs from top to bottom. For
example, when there are 5 BBs A, B, C, D, and E and B, D, and E are
branch destinations, if mark the BB given to `placeBlockMarker` with `*`
and draw a rectangle representing the border of `block` and `end_block`
markers, the process is going to look like
```
-------
----- |-----|
--- |---| ||---||
|A| ||A|| |||A|||
--- --> |---| --> ||---||
*B | B | || B ||
C | C | || C ||
D ----- |-----|
E *D | D |
E -------
*E
```
which means when we first place markers, we go from inner to outer
scopes. So when we place a `block` marker, if the header already
contains other `block` or `try` marker, it has to belong to an inner
scope, so the existing `block`/`try` markers should go _after_ the new
marker. This was the assumption we had.
But after placing all markers we run `fixUnwindMismatches` function.
There we do some control flow transformation and create some branches,
and we call `placeBlockMarker` again to place `block`/`end_block`
markers for those newly created branches. We can't assume that we are
traversing branch destination BBs from top to bottom now because we are
basically inserting some new markers in the middle of existing markers.
Fix:
In `placeBlockMarker`, we don't have the assumption that the BB given is
in the order of top to bottom, and when placing `block` markers,
calculates whether existing `block` or `try` markers are inner or
outer scopes with respect to the current scope.
---
2. Background:
In `fixUnwindMismatches`, when there is a call whose correct unwind
destination mismatches the current destination after initially placing
`try` markers, we wrap that with a new nested `try`/`catch`/`end` and
jump to the correct handler within the new `catch`. The correct handler
code is split as a separate BB from its original EH pad so it can be
branched to. Here's an example:
- Before
```
mbb:
call @foo <- Unwind destination mismatch!
wrong-ehpad:
catch
...
cont:
end_try
...
correct-ehpad:
catch
[handler code]
```
- After
```
mbb:
try (new)
call @foo
nested-ehpad: (new)
catch (new)
local.set n / drop (new)
br %handleri (new)
nested-end: (new)
end_try (new)
wrong-ehpad:
catch
...
cont:
end_try
...
correct-ehpad:
catch
local.set n / drop (new)
handler: (new)
end_try
[handler code]
```
Note that after this transformation, it is possible there are no calls
to actually unwind to `correct-ehpad` here. `call @foo` now
branches to `handler`, and there can be no other calls to unwind to
`correct-ehpad`. In this case `correct-ehpad` does not have any
predecessors anymore.
This can cause a bug in `placeBlockMarker`, because we may need to place
`end_block` marker in `handler`, and `placeBlockMarker` computes the
nearest common dominator of all predecessors. If one of `handler`'s
predecessor (here `correct-ehpad`) does not have any predecessors, i.e.,
no way of reaching it, we cannot correctly compute the common dominator
of predecessors of `handler`, and end up placing no `block`/`end`
markers. This bug actually sometimes masks the bug 1.
Fix:
When we have an EH pad that does not have any predecessors after this
transformation, deletes all its successors, so that its successors don't
have any dangling predecessors.
---
3. Background:
Actually the `handler` BB in the example shown in bug 2 doesn't need
`end_block` marker, despite it being a new branch destination, because
it already has `end_try` marker which can serve the same purpose. I just
put that example there for an illustration purpose. There is a case we
actually need to place `end_block` marker: when the branch dest is the
appendix BB. The appendix BB is created when there is a call that is
supposed to unwind to the caller ends up unwinding to a wrong EH pad. In
this case we also wrap the call with a nested `try`/`catch`/`end`,
create an 'appendix' BB at the very end of the function, and branch to
that BB, where we rethrow the exception to the caller.
Fix:
When we don't actually need to place block markers, we don't.
---
4. In case we fall through to the continuation BB after the catch block,
after extracting handler code in `fixUnwindMismatches` (refer to bug 2
for an example), we now have to add a branch to it to bypass the
handler.
- Before
```
try
...
(falls through to 'cont')
catch
handler body
end
<-- cont
```
- After
```
try
...
br %cont (new)
catch
end
handler body
<-- cont
```
The problem is, we haven't been placing a new `end_block` marker in the
`cont` BB in this case. We should, and this fixes it. But it is hard to
provide a test case that triggers this bug, because the current
compilation pipeline from .ll to .s does not generate this kind of code;
we always have a `br` after `invoke`. But code without `br` is still
valid, and we can have that kind of code if we have some pipeline
changes or optimizations later. Even mir test cases cannot trigger this
part for now, because we don't encode auxiliary EH-related data
structures (such as `WasmEHFuncInfo`) in mir now. Those functionalities
can be added later, but I don't think we should block this fix on that.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79324
Pierre-vh [Fri, 1 May 2020 13:57:36 +0000 (14:57 +0100)]
[Target][ARM] Fold or(A, B) more aggressively for I1 vectors
This patch makes the folding of or(A, B) into not(and(not(A), not(B)))
more agressive for I1 vector. This only affects Thumb2 MVE and improves
codegen, because it removes a lot of msr/mrs instructions on VPR.P0.
This patch also adds a xor(vcmp) -> !vcmp fold for MVE.
Differential Revision: https://reviews.llvm.org/D77202
Pierre-vh [Tue, 7 Apr 2020 14:09:56 +0000 (15:09 +0100)]
[Target][ARM] Add PerformVSELECTCombine for MVE Integer Ops
This patch adds an implementation of PerformVSELECTCombine in the
ARM DAG Combiner that transforms vselect(not(cond), lhs, rhs) into
vselect(cond, rhs, lhs).
Normally, this should be done by the target-independent DAG Combiner,
but it doesn't handle the kind of constants that we generate, so we
have to reimplement it here.
Differential Revision: https://reviews.llvm.org/D77712
Peter Smith [Sat, 2 May 2020 10:16:45 +0000 (11:16 +0100)]
[ELF][ARM] Do not create .ARM.exidx sections for out of range inputs
A linker will create .ARM.exidx sections for InputSections that don't
have them. This can cause a relocation out of range error If the
InputSection happens to be extremely far away from the other sections.
This is often the case for the vector table on older ARM CPUs as the only
two places that the table can be placed is 0 or 0xffff0000. We fix this
by removing InputSections that need a linker generated .ARM.exidx
section if that would cause an error.
Differential Revision: https://reviews.llvm.org/D79289
David Green [Tue, 5 May 2020 08:26:28 +0000 (09:26 +0100)]
[ARM] MVE predcast with const test. NFC
Martin Storsjö [Tue, 5 May 2020 08:45:36 +0000 (11:45 +0300)]
[LLD] [COFF] Fix a typo in an assert message. NFC.
Haojian Wu [Tue, 5 May 2020 07:26:12 +0000 (09:26 +0200)]
[clang] Fix an uint32_t overflow in large preamble.
Summary:
I was surprised to see the LocalOffset can exceed uint32_t, but it
does happen and lead to crashes in one of our internal huge TU with a large
preamble.
with this patch, the crash is gone.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79397
Alexander Belyaev [Tue, 5 May 2020 06:30:30 +0000 (08:30 +0200)]
[MLIR] Add conversion from AtomicRMWOp -> GenericAtomicRMWOp.
Adding this pattern reduces code duplication. There is no need to have a
custom implementation for lowering to llvm.cmpxchg.
Differential Revision: https://reviews.llvm.org/D78753