Simon Pilgrim [Thu, 9 Sep 2021 11:16:08 +0000 (12:16 +0100)]
[X86][AVX] Add missing X86ISD::VBROADCAST(v2f64 -> v4f64) isel pattern for AVX1 targets
As discussed on the ticket, I'm intending to add additional 128->256 patterns when we have test coverage, but this addresses a known crash.
Differential Revision: https://reviews.llvm.org/D109434
Muhammad Omair Javaid [Thu, 9 Sep 2021 11:04:43 +0000 (16:04 +0500)]
AArch64 SVE restore SVE registers after expression
This patch fixes register save/restore on expression call to also include SVE registers.
This will fix expression calls like:
re re p1
<Register Value P1 before expression>
p <var-name or function call>
re re p1
<Register Value P1 after expression>
In the above example, register P1 should remain the same before and after the expression evaluation.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D108739
Alex Zinenko [Wed, 25 Aug 2021 09:07:17 +0000 (11:07 +0200)]
[mlir] support reductions in SCF to OpenMP conversion
OpenMP reductions need a neutral element, so we match some known reduction
kinds (integer add/mul/or/and/xor, float add/mul, integer and float min/max) to
define the neutral element and the atomic version when possible to express
using atomicrmw (everything except float mul). The SCF-to-OpenMP pass becomes a
module pass because it now needs to introduce new symbols for reduction
declarations in the module.
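A minimal sketch of the neutral-element idea for the integer reduction kinds listed above. The enum and function names are hypothetical and are not the ones used by the pass; the float kinds and the atomic variants are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical names for illustration only (not the MLIR/OpenMP dialect API).
enum class ReductionKind { Add, Mul, Or, And, Xor, Min, Max };

// Returns the value e such that combine(kind, e, x) == x for every x.
int32_t neutralElement(ReductionKind kind) {
  switch (kind) {
  case ReductionKind::Add:
  case ReductionKind::Or:
  case ReductionKind::Xor:
    return 0;
  case ReductionKind::Mul:
    return 1;
  case ReductionKind::And:
    return -1; // all bits set
  case ReductionKind::Min:
    return std::numeric_limits<int32_t>::max();
  case ReductionKind::Max:
    return std::numeric_limits<int32_t>::min();
  }
  assert(false && "unknown reduction kind");
  return 0;
}

// Applies one reduction step.
int32_t combine(ReductionKind kind, int32_t a, int32_t b) {
  switch (kind) {
  case ReductionKind::Add: return a + b;
  case ReductionKind::Mul: return a * b;
  case ReductionKind::Or:  return a | b;
  case ReductionKind::And: return a & b;
  case ReductionKind::Xor: return a ^ b;
  case ReductionKind::Min: return a < b ? a : b;
  case ReductionKind::Max: return a > b ? a : b;
  }
  assert(false && "unknown reduction kind");
  return 0;
}

// Starts from the neutral element, so an empty input yields that element.
int32_t reduce(ReductionKind kind, const std::vector<int32_t> &values) {
  int32_t acc = neutralElement(kind);
  for (int32_t v : values)
    acc = combine(kind, acc, v);
  return acc;
}
```

Starting every partial result from the neutral element is what lets each thread reduce its own chunk independently before the partial results are combined.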
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D107549
Bradley Smith [Tue, 7 Sep 2021 15:52:39 +0000 (15:52 +0000)]
[AArch64][SVE] Add missing patterns for unpredicated subr intrinsics
Differential Revision: https://reviews.llvm.org/D109369
Simon Pilgrim [Thu, 9 Sep 2021 10:23:36 +0000 (11:23 +0100)]
[X86] Move _mm256_set_m128* intrinsics before _mm256_loadu2_m128* intrinsics. NFC.
This is necessary for PR51796 where we'll update _mm256_loadu2_m128* to use _mm256_set_m128*
Alfonso Sánchez-Beato [Thu, 9 Sep 2021 10:14:52 +0000 (11:14 +0100)]
[yaml2obj][COFF] Allow variable number of directories
Allow variable number of directories, as allowed by the
specification. NumberOfRvaAndSize will default to 16 if not specified,
as in the past.
Reviewed by: jhenderson
Differential Revision: https://reviews.llvm.org/D108825
Sjoerd Meijer [Thu, 9 Sep 2021 09:11:28 +0000 (10:11 +0100)]
[FuncSpec] Fixed minor formatting issues. NFC.
Roman Lebedev [Sun, 15 Aug 2021 16:01:44 +0000 (19:01 +0300)]
[SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125)
I can't seem to wrap my head around the proper fix here,
we should be fine without this requirement, iff we can form this form,
but the naive attempt (https://reviews.llvm.org/D106317) has failed.
So just to unblock the release, put up a restriction.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51125
Jun Ma [Fri, 20 Aug 2021 09:27:00 +0000 (17:27 +0800)]
Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""
Differential Revision: https://reviews.llvm.org/D106056
Michał Górny [Thu, 22 Apr 2021 19:39:53 +0000 (21:39 +0200)]
[lldb] [test] Add tests for coredumps with multiple threads
Differential Revision: https://reviews.llvm.org/D101157
Cullen Rhodes [Thu, 9 Sep 2021 07:14:54 +0000 (07:14 +0000)]
[SelectionDAG] NFC: Remove unused template args
Identified in D109359.
Jean Perier [Thu, 9 Sep 2021 07:11:49 +0000 (09:11 +0200)]
[flang] Fix common block size extension mistake in D109156
https://reviews.llvm.org/D109156 did not properly update the case where
the equivalence symbol appearing in the common statement is the
"base symbol of an equivalence group" (this was the only case that previously
worked ok, and the patch broke it).
Fix this and add a test that actually uses this code path.
Differential Revision: https://reviews.llvm.org/D109439
Cullen Rhodes [Thu, 9 Sep 2021 06:44:09 +0000 (06:44 +0000)]
[AArch64][SVE] NFC: Remove unused template args
For sve_fp_3op_p_zds_zx we have zero patterns downstream but the
intrinsic args can be added again if/when the patterns are implemented.
Identified in D109359.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D109429
Cullen Rhodes [Thu, 9 Sep 2021 06:43:24 +0000 (06:43 +0000)]
[AArch64][SVE] NFC: Use stepvector directly in index multiclasses
Also fixes a couple of warnings identified in D109359:
SVEInstrFormats.td:5099:59: warning: unused template argument: sve_int_index_ri::step_vector
SVEInstrFormats.td:5133:59: warning: unused template argument: sve_int_index_rr::step_vector
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D109422
Alexander Pivovarov [Fri, 3 Sep 2021 21:16:29 +0000 (14:16 -0700)]
[RISCV] Add SiFive cores E and S series
Add SiFive cores E20, E21, E24, E34, S21, S54 and S76
Differential Revision: https://reviews.llvm.org/D109260
Yvan Roux [Thu, 9 Sep 2021 05:31:28 +0000 (07:31 +0200)]
[RISCV] Fix Machine Outliner jump table handling.
Don't outline machine instructions which are using jump table indexes
since they are materialized as local labels (like the already handled
case of constant pools).
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D109436
Pushpinder Singh [Tue, 7 Sep 2021 07:25:47 +0000 (12:55 +0530)]
[AMDGPU][OpenMP] Use complex definitions from complex_cmath.h
Following the nvptx approach, this patch uses complex function
definitions from complex_cmath.h. With this patch, ovo passes
23/34 complex mathematical test cases.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D109344
Matthias Springer [Thu, 9 Sep 2021 04:46:02 +0000 (13:46 +0900)]
[mlir][scf] Fold dim(scf.for) to dim(iter_arg)
Fold dim ops of scf.for results to dim ops of the respective iter args if the loop is shape preserving.
Differential Revision: https://reviews.llvm.org/D109430
Matthias Springer [Thu, 9 Sep 2021 04:35:58 +0000 (13:35 +0900)]
[mlir][linalg] Fold dim(linalg.tiled_loop) to dim(output_arg)
Fold dim ops of linalg.tiled_loop results to dim ops of the respective iter args if the loop is shape preserving.
Differential Revision: https://reviews.llvm.org/D109431
Tom Stellard [Thu, 9 Sep 2021 04:10:38 +0000 (21:10 -0700)]
scudo: Only add no-omit-frame-pointer flags when the compiler supports them
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D109196
Matthias Springer [Thu, 9 Sep 2021 03:12:28 +0000 (12:12 +0900)]
[mlir][linalg] Fix dim(iter_arg) canonicalization
Run a small analysis to see if the runtime type of the iter_arg is changing. Fold only if the runtime type stays the same. (Same as `DimOfIterArgFolder` in SCF.)
Differential Revision: https://reviews.llvm.org/D109299
Leonard Chan [Thu, 9 Sep 2021 03:04:56 +0000 (20:04 -0700)]
[polly] Fix "no member named 'getIndexExpressionsFromGEP'"
As of 741fabc222f226d34d806056b804244b012853b, polly builders are
failing with this error. The signature is slightly different and
accepts a ScalarEvolution reference instead. This should fix the polly
builders.
Peter Collingbourne [Wed, 8 Sep 2021 19:53:30 +0000 (12:53 -0700)]
gn build: Add support for building lldb-server on Android.
The cross-compiled lldb-server targets are added to the lldb deps if
Android cross compilation is enabled.
Differential Revision: https://reviews.llvm.org/D109464
Peter Collingbourne [Thu, 2 Sep 2021 20:39:32 +0000 (13:39 -0700)]
gn build: Add support for building LLDB on Linux.
On Linux, LLDB depends on lldb-server at runtime (on Mac, the dependency on
a debug server presumably comes via the system debugserver), so I added it
to deps.
Differential Revision: https://reviews.llvm.org/D109463
Matthias Springer [Thu, 9 Sep 2021 02:04:41 +0000 (11:04 +0900)]
[mlir][linalg] Tiling: Use loop ub in extract_slice size computation if possible
When tiling a LinalgOp, extract_slice/insert_slice pairs are inserted. To avoid going out-of-bounds when the tile size does not divide the shape size evenly (at the boundary), AffineMin ops are inserted. Some ops have assumptions regarding the dimensions of inputs/outputs. E.g., in a `A * B` matmul, `dim(A, 1) == dim(B, 0)`. However, loop bounds use either `dim(A, 1)` or `dim(B, 0)`.
With this change, AffineMin ops are expressed in terms of loop bounds instead of tensor sizes. (Both have the same runtime value.) This simplifies canonicalizations.
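A rough sketch of the boundary computation, assuming the usual min(tileSize, ub - iv) form for the slice size. The function names are hypothetical, and the real change builds AffineMin ops rather than computing integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Size of the slice that starts at induction variable `iv`, clamped so that
// the last tile does not run past the loop upper bound `ub`.
int64_t tileExtent(int64_t iv, int64_t tileSize, int64_t ub) {
  assert(tileSize > 0 && iv >= 0 && iv < ub);
  return std::min(tileSize, ub - iv);
}

// Slice sizes for a loop from 0 to `ub` with step `tileSize`.
std::vector<int64_t> tileExtents(int64_t ub, int64_t tileSize) {
  std::vector<int64_t> extents;
  for (int64_t iv = 0; iv < ub; iv += tileSize)
    extents.push_back(tileExtent(iv, tileSize, ub));
  return extents;
}
```

Because the clamp uses the same `ub` as the loop, the size expression and the loop bound share one value, which is the property the message says simplifies canonicalization.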
Differential Revision: https://reviews.llvm.org/D109267
Leonard Chan [Thu, 9 Sep 2021 01:31:10 +0000 (18:31 -0700)]
Revert "[runtimes] Set more paths when building runtimes standalone"
This reverts commit 407e07aa67ab56c92cdec1fdbf6b121afbceddaf.
Reverting since this seems to break OpenMP builds and our clang
builders. See thread on https://reviews.llvm.org/D107895.
Chris Lattner [Thu, 9 Sep 2021 00:36:43 +0000 (17:36 -0700)]
[APInt.h] Reduce the APInt header file interface a bit. NFC
This moves one mid-size function out of line, inlines the
trivial tcAnd/tcOr/tcXor/tcComplement methods into their only
caller, and moves the magic/umagic functions into SelectionDAG
since they are implementation details of its algorithm. This
also removes the unit tests for magic, but these are already
tested in the divide lowering logic for various targets.
This also upgrades some C style comments to C++.
Differential Revision: https://reviews.llvm.org/D109476
Jessica Paquette [Thu, 9 Sep 2021 00:32:54 +0000 (17:32 -0700)]
[MachineOutliner][AArch64] Ensure LR is live-in when inserting reg-save calls
Similar to other code which handles creating the function frame.
If LR isn't live-in to the block that we're inserting the call into, we'll get
a MachineVerifier error.
Amara Emerson [Wed, 8 Sep 2021 06:51:48 +0000 (23:51 -0700)]
[GlobalISel] Implement merging of stores of truncates.
This is a port of a combine which matches a pattern where a wide type scalar
value is stored by several narrow stores. It folds it into a single store or
a BSWAP and a store if the target supports it.
Assuming little endian target:
i8 *p = ...
i32 val = ...
p[0] = (val >> 0) & 0xFF;
p[1] = (val >> 8) & 0xFF;
p[2] = (val >> 16) & 0xFF;
p[3] = (val >> 24) & 0xFF;
=>
*((i32)p) = val;
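The equivalence above can be checked with a small host-side sketch. It assumes a little-endian host, as the example does; the function names are hypothetical. The byte-reversed store order corresponds to the BSWAP-plus-store form.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using Bytes = std::array<uint8_t, 4>;

// Detects host byte order at run time.
bool hostIsLittleEndian() {
  const uint16_t probe = 1;
  uint8_t first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Four narrow stores of truncated shifts, lowest byte first.
Bytes storeNarrow(uint32_t val) {
  Bytes p{};
  p[0] = static_cast<uint8_t>((val >> 0) & 0xFF);
  p[1] = static_cast<uint8_t>((val >> 8) & 0xFF);
  p[2] = static_cast<uint8_t>((val >> 16) & 0xFF);
  p[3] = static_cast<uint8_t>((val >> 24) & 0xFF);
  return p;
}

// One wide store of the whole value.
Bytes storeWide(uint32_t val) {
  Bytes p{};
  std::memcpy(p.data(), &val, sizeof(val));
  return p;
}

// Portable 32-bit byte swap.
uint32_t byteSwap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) |
         (v << 24);
}

// Four narrow stores, highest byte first.
Bytes storeNarrowReversed(uint32_t val) {
  Bytes p{};
  p[0] = static_cast<uint8_t>((val >> 24) & 0xFF);
  p[1] = static_cast<uint8_t>((val >> 16) & 0xFF);
  p[2] = static_cast<uint8_t>((val >> 8) & 0xFF);
  p[3] = static_cast<uint8_t>((val >> 0) & 0xFF);
  return p;
}

// Byte swap followed by one wide store.
Bytes storeSwappedWide(uint32_t val) { return storeWide(byteSwap32(val)); }
```

On a little-endian host, storeNarrow matches storeWide and storeNarrowReversed matches storeSwappedWide for any input value.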
On CTMark AArch64 -Os this results in a good amount of savings:
Program             before   after    diff
SPASS               412792   412788   -0.0%
kc                  432528   432512   -0.0%
lencod              430112   430096   -0.0%
consumer-typeset    419156   419128   -0.0%
bullet              475840   475752   -0.0%
tramp3d-v4          367760   367628   -0.0%
clamscan            383388   383204   -0.0%
pairlocalalign      249764   249476   -0.1%
7zip-benchmark      570100   568860   -0.2%
sqlite3             287628   286920   -0.2%
Geomean difference                    -0.1%
Differential Revision: https://reviews.llvm.org/D109419
Philip Reames [Wed, 8 Sep 2021 23:29:29 +0000 (16:29 -0700)]
[SCEV] Move getIndexExpressionsFromGEP to delinearize [NFC]
Mehdi Amini [Wed, 8 Sep 2021 18:06:47 +0000 (18:06 +0000)]
Add sanity check in MLIR ODS to catch case where two results have the same name
This turns a tablegen crash into a more friendly error.
Differential Revision: https://reviews.llvm.org/D109456
Chris Lattner [Wed, 8 Sep 2021 23:33:06 +0000 (16:33 -0700)]
[APInt.h] don't privatize "needsCleanup"; it is used by Clang APValue
David Blaikie [Wed, 8 Sep 2021 23:21:11 +0000 (16:21 -0700)]
Error: Improve unit test by using gtest equality rather than explicit string compare calls
This ensures error messages from gtest include the raw text of both
sides of the comparison - otherwise all gtest can report is the text of
the expression source, without any information about the values or how
they differ.
David Blaikie [Wed, 8 Sep 2021 23:16:11 +0000 (16:16 -0700)]
FileError: Provide a way to retrieve the underlying error string without the file name
For use with APIs that want to report the file name in a different
syntactic form, have other knowledge of the filename, etc.
David Blaikie [Wed, 8 Sep 2021 23:03:28 +0000 (16:03 -0700)]
FileError: Support zero-length file names
It's a common error in an API - to try to open an empty file, so it
seems like a reasonable FileError to produce "hey, you tried to open an
empty file" and to handle it the same way as any other file error.
Chris Lattner [Wed, 8 Sep 2021 23:07:43 +0000 (16:07 -0700)]
[APInt.h] Clean up the APInt interface. NFC.
This moves all the private implementation details to the bottom of
the header, and pushes all the "make an APInt" stuff up to the top.
This is in prep for making other changes to spiff up APInt a bit.
Usman Nadeem [Wed, 8 Sep 2021 22:52:19 +0000 (15:52 -0700)]
[clang][Driver] Update/cleanup LTO logic to ensure that the last lto argument is honored
- Make flto an alias of flto=full.
- Make foffload-lto an alias of foffload-lto=full.
- Make flto_EQ_jobserver, flto_EQ_auto aliases of flto=full,
since they are being treated as full lto right now.
- Clean up the code for parseLTOMode and setLTOMode.
- Replace uses of OPT_flto with OPT_flto_EQ since they alias now.
Differential Revision: https://reviews.llvm.org/D108881
Change-Id: I5d867db83a680434fba5c8d85c9a83135d3b81ee
Leonard Chan [Wed, 8 Sep 2021 22:52:02 +0000 (15:52 -0700)]
[clang][Fuchsia] Remove COMPILER_RT_CAN_EXECUTE_TESTS
I forgot that we run `check-runtimes-x86_64-unknown-linux-gnu`, which
will run all compiler-rt tests also even though we are currently not in
a state where we can run them all yet. Remove this for now to fix our CI
builders.
Usman Nadeem [Wed, 8 Sep 2021 22:49:35 +0000 (15:49 -0700)]
Revert "[clang][Driver] Update/cleanup LTO logic to ensure that the last lto argument is honored"
This reverts commit d2d2e5ea480feb09dc0edeac2eb14310de74b372.
Usman Nadeem [Wed, 8 Sep 2021 19:10:17 +0000 (12:10 -0700)]
[clang][Driver] Update/cleanup LTO logic to ensure that the last lto argument is honored
- Make flto an alias of flto=full.
- Make foffload-lto an alias of foffload-lto=full.
- Make flto_EQ_jobserver, flto_EQ_auto aliases of flto=full,
since they are being treated as full lto right now.
- Clean up the code for parseLTOMode and setLTOMode.
- Replace uses of OPT_flto with OPT_flto_EQ since they alias now.
Change-Id: Iea5338c20cb800b43529b20745e92600e2cfd2b1
Philip Reames [Wed, 8 Sep 2021 22:25:08 +0000 (15:25 -0700)]
[SCEV] Simplify findExistingSCEVInCache interface [NFC]
We were returning a tuple when all but one caller only cared about one piece of the return value. That one caller can inline the complexity, and we can simplify all other uses.
Andrew Litteken [Mon, 23 Aug 2021 19:02:30 +0000 (12:02 -0700)]
[CodeExtractor] Creating exit stubs based off original order branch instructions.
Previously the CodeExtractor created exit stubs, and the subsequent return value of the outlined function based on the order of out-of-region blocks after splitting any phi nodes, and collecting the blocks to be outlined. This could cause differences in order if there was a difference of exit block phi nodes between the two regions. This patch moves the collection of the output target blocks to be before this occurs, so that the assignment of target block to output value will be the same, regardless of the contents of the output block.
Reviewers: paquette, roelofs
Differential Revision: https://reviews.llvm.org/D108657
David Green [Wed, 8 Sep 2021 22:00:34 +0000 (23:00 +0100)]
[AArch64] Rewrite floatdp_1source.ll test. NFC
Rewrite this test to not rely on volatile stores in a large function,
just use separate functions like any other test would.
Arthur Eubanks [Wed, 8 Sep 2021 20:27:55 +0000 (13:27 -0700)]
Port the cost model printer to New PM
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D109284
Craig Topper [Wed, 8 Sep 2021 21:26:05 +0000 (14:26 -0700)]
[RISCV] Disable use of i128 shift libcalls on RV32.
Since i128 isn't a legal C type on RV32, I don't believe
libgcc implements these functions for RV32. compiler-rt
does implement them because i128 support is enabled
in order to handle long double.
This is consistent with 32-bit X86 and ARM.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D109383
Eli Friedman [Wed, 8 Sep 2021 21:18:47 +0000 (14:18 -0700)]
[NFC] Add extra test for D106331
Jon Chesterfield [Wed, 8 Sep 2021 21:07:05 +0000 (22:07 +0100)]
[openmp]
41c73671d0, this time with staged patch applied
Michael Kruse [Wed, 8 Sep 2021 20:36:39 +0000 (15:36 -0500)]
[Delinearization] Require byte offset to be zero.
Users of delinearization assume that the offset into the array element is zero. In most cases it will indeed be zero, but if it is not, the delinearization has to fail, since it violates that assumption and the API offers no way to signal to the caller that the byte offset is non-zero.
This bug caused Polly to miscompile blender (526.blender_r from SPEC CPU 2017) in -polly-process-unprofitable mode. The SCEV expression incorrectly delinearized has been reduced in the test case byte_offset.ll. The dropped offset into the array element of size 4 (a float) is ((sext i32 %mul7.i4534 to i64) + {(sext i32 %i1 to i64),+,((sext i32 (1 + ((1 + %shl.i.i) * (1 + %shl.i.i)) + %shl.i.i) to i64) * (sext i32 %i1 to i64))}<%for.body703>). This significant component was just dropped, and the wrong pointer was computed when regenerating code from the remaining delinearized subscripts. This occurred during blender's subsurface scattering implementation. As a result, blender's rendering diverged from the reference image.
Patch D108885 would also fix the API.
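A simplified sketch of the requirement, using constant integers and a hypothetical 2-D array instead of SCEV expressions. The struct and function names are made up for illustration and do not reflect the actual delinearization API.

```cpp
#include <cassert>
#include <cstdint>

// Result of splitting a byte offset into subscripts. `valid` is false when
// the offset does not land on an element boundary.
struct Subscripts {
  bool valid;
  int64_t row;
  int64_t col;
};

// Splits `byteOffset` into (row, col) for an array with `cols` elements per
// row and `elemSize` bytes per element. Fails instead of silently dropping a
// non-zero remainder inside the element.
Subscripts delinearize2D(int64_t byteOffset, int64_t cols, int64_t elemSize) {
  assert(byteOffset >= 0 && cols > 0 && elemSize > 0);
  if (byteOffset % elemSize != 0)
    return {false, 0, 0};
  const int64_t elemIndex = byteOffset / elemSize;
  return {true, elemIndex / cols, elemIndex % cols};
}
```

Dropping the remainder and returning the subscripts anyway would recompute a different address than the original access, which is the kind of wrong pointer described above.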
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D109133
Martin Storsjö [Wed, 11 Aug 2021 10:44:06 +0000 (13:44 +0300)]
[runtimes] Allow overriding where CMake installs RUNTIME type libraries (DLLs)
Differential Revision: https://reviews.llvm.org/D107892
Martin Storsjö [Wed, 11 Aug 2021 11:14:31 +0000 (14:14 +0300)]
[runtimes] Set more paths when building runtimes standalone
These paths are needed when building with per-target runtime directories.
(It's possible to fix this by manually setting these when invoking
cmake, but one isn't supposed to need to do that.)
Also set LLVM_TOOLS_BINARY_DIR while touching this area (as it's
also unset in this case) even if it isn't specifically needed by the
per-target runtime configuration.
Differential Revision: https://reviews.llvm.org/D107895
Greg Clayton [Tue, 7 Sep 2021 23:08:22 +0000 (16:08 -0700)]
Log to the right stream in DwarfTransformer::handleDie().
Since we might end up using multiple threads when logging information in the DWARFTransformer, the handleDie() method must use the supplied stream named "OS" when logging warnings and errors. When we use multiple threads, we log to a thread specific stream buffer and then use a mutex to ensure our output doesn't overlap when we emit warnings and errors after a thread is done.
Differential Revision: https://reviews.llvm.org/D109401
Jon Chesterfield [Wed, 8 Sep 2021 19:54:40 +0000 (20:54 +0100)]
[openmp] Re-enable test from D109057, now with windows path aware regex
Jonas Devlieghere [Wed, 8 Sep 2021 20:44:24 +0000 (13:44 -0700)]
[lldb] Make sure there's a value for the key before dereferencing.
Make sure a value for the shared_cache_base_address key exists
in the dictionary before trying to dereference it.
rdar://76894476
Florian Hahn [Wed, 8 Sep 2021 18:49:32 +0000 (20:49 +0200)]
[LAA] Remove unused OrigPtr from replaceSymbolicStrideSCEV (NFC).
The OrigPtr argument is not used in tree.
Chris Lattner [Wed, 8 Sep 2021 17:28:52 +0000 (10:28 -0700)]
[Canonicalize] Don't call isBeforeInBlock in OperationFolder::tryToFold.
This patch (e4635e6328c8) fixed a bug where a newly generated/reused
constant wouldn't dominate a folded operation. It did so by calling
isBeforeInBlock to move the constant around on demand. This introduced
a significant compile time regression, because "isBeforeInBlock" is
O(n) in the size of a block the first time it is called, and the cache
is invalidated any time canonicalize changes something big in the block.
This fixes LLVM PR51738 and this CIRCT issue:
https://github.com/llvm/circt/issues/1700
This does affect the order of constants left in the top of a block,
I staged in the testsuite changes in rG42431b8207a5.
Differential Revision: https://reviews.llvm.org/D109454
Michael Kruse [Wed, 8 Sep 2021 20:28:43 +0000 (15:28 -0500)]
[Polly] Compile fix after Delinearization move.
by commit 585c594d749a2a88150b63804587af85abdabeaa
Nikita Popov [Fri, 3 Sep 2021 20:26:47 +0000 (22:26 +0200)]
[SROA] Support opaque pointers
Make the following changes in order to support opaque pointers in SROA:
* Generate i8 GEPs for opaque pointers.
* Explicitly enforce that promotable allocas only have stores of
the alloca type -- previously this was implicitly enforced.
* Replace a check for pointer element type with load/store type.
Differential Revision: https://reviews.llvm.org/D109259
Steven Wan [Wed, 8 Sep 2021 20:21:39 +0000 (16:21 -0400)]
[AIX] Check for typedef properly when getting preferred type align
The current check for typedef is naive and doesn't deal with any convoluted cases. This patch makes use of the new 'AlignRequirement' enum field from 'TypeInfo' to determine whether or not this is an 'aligned' attribute on a typedef.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109387
Arthur Eubanks [Tue, 31 Aug 2021 19:24:50 +0000 (12:24 -0700)]
[MemorySSA] Support invariant.group metadata
The implementation is mostly copied from MemDepAnalysis. We want to look
at all loads and stores to the same pointer operand. Bitcasts and zero
GEPs of a pointer are considered the same pointer value. We choose the
most dominating instruction.
Since updating MemorySSA with invariant.group is non-trivial, for now
handling of invariant.group is not cached in any way, so it's part of
the walker. The number of loads/stores with invariant.group is small for
now anyway. We can revisit if this actually noticeably affects compile
times.
To avoid invariant.group affecting optimized uses, we need to have
optimizeUsesInBlock() not use invariant.group in any way.
Co-authored-by: Piotr Padlewski <prazek@google.com>
Reviewed By: asbirlea, nikic, Prazek
Differential Revision: https://reviews.llvm.org/D109134
Louis Dionne [Tue, 7 Sep 2021 16:55:48 +0000 (12:55 -0400)]
[libc++] Revert OpenBSD-related changes to the documentation
This commit partially reverts 0954e2b2d038 and 3fa4cff97480, which
make changes to the libc++ documentation implying that OpenBSD is
supported. Neither of these changes has been reviewed AFAICT, so
I'm reverting as a matter of enforcing:
1. That changes get reviewed before being committed
2. That we have a discussion and a support plan for supporting
OpenBSD officially in libc++
Please note that I would be thrilled to support OpenBSD officially in
libc++, however doing so requires more than adding a note in the docs.
In particular, please make sure you read the note in [1] about setting
up CI testing for OpenBSD.
[1]: https://libcxx.llvm.org/#platform-and-compiler-support
Differential Revision: https://reviews.llvm.org/D109373
Philip Reames [Wed, 8 Sep 2021 19:01:19 +0000 (12:01 -0700)]
Move delinearization logic out of SCEV [NFC]
None of this logic has anything to do with SCEV's internals, it just uses the existing public APIs. As a result, we can move the code from ScalarEvolution.cpp/hpp to Delinearization.cpp/hpp with only minor changes.
This was discussed in advance on today's loop opt call. It turned out to be as easy as hoped.
Nikita Popov [Wed, 8 Sep 2021 19:22:28 +0000 (21:22 +0200)]
[ConstantHoisting] Support opaque pointers
Directly use i8 for GEP, rather than fetching element type of i8*.
Louis Dionne [Wed, 8 Sep 2021 13:14:43 +0000 (09:14 -0400)]
[libc++][NFC] Rename _EnableIf to __enable_if_t for consistency
In other places in the code, we use lowercase spelling for things that
are not available in prior standards.
Differential Revision: https://reviews.llvm.org/D109435
Akira Hatanaka [Wed, 8 Sep 2021 18:58:03 +0000 (11:58 -0700)]
[ObjC][ARC] Use the addresses of the ARC runtime functions instead of
integer 0/1 for the operand of bundle "clang.arc.attachedcall"
https://reviews.llvm.org/D102996 changes the operand of bundle
"clang.arc.attachedcall". This patch makes changes to llvm that are
needed to handle the new IR.
This should make it easier to understand what the IR is doing and also
simplify some of the passes as they no longer have to translate the
integer values to the runtime functions.
Differential Revision: https://reviews.llvm.org/D103000
Akira Hatanaka [Wed, 8 Sep 2021 18:56:22 +0000 (11:56 -0700)]
[ObjC][ARC] Use the addresses of the ARC runtime functions instead of
integer 0/1 for the operand of bundle "clang.arc.attachedcall"
This should make it easier to understand what the IR is doing and also
simplify some of the passes as they no longer have to translate the
integer values to the runtime functions.
Differential Revision: https://reviews.llvm.org/D102996
Matt Morehouse [Wed, 8 Sep 2021 18:46:37 +0000 (11:46 -0700)]
[libFuzzer] Add missing argument to CrashResistantMerge.
Fixes a build error caused by a bad merge conflict resolution for
https://reviews.llvm.org/D105084.
Leonard Chan [Wed, 8 Sep 2021 18:45:52 +0000 (11:45 -0700)]
[compiler-rt][fuzzer] Do not link in libc++ in tests and disable exceptions
Differential Revision: https://reviews.llvm.org/D109208
Leonard Chan [Wed, 8 Sep 2021 18:44:00 +0000 (11:44 -0700)]
[compiler-rt] Use COMPILER_RT_TEST_CXX_COMPILER for linking compiler-rt tests
Before, COMPILER_RT_TEST_COMPILER was used which pointed to a C compiler. While
it is incorrect to assume either of these is the default compiler, using the
C++ one allows for linking cpp tests.
Differential Revision: https://reviews.llvm.org/D109207
Andrew Litteken [Mon, 23 Aug 2021 18:56:10 +0000 (11:56 -0700)]
[IROutliner] Using canonical values to find corresponding values. (NFC)
D104143 introduced canonical value numbering between regions, which allows for the easy identification of items across a region, eliminating the need in the outliner to create parallel lists of instructions for each region, and replace output values in a less convoluted way.
Additionally, in a future commit, the output values will not necessarily be recorded values from the region itself, it could be a combination value where the actual value being output is a PHINode instead. This new method allows us to handle the replacement of the output value to the stored value with the corresponding item in the same place for both normal output values, and PHINode outputs instead of handling the different types of outputs in different locations.
Reviewers: paquette, roelofs
Differential Revision: https://reviews.llvm.org/D108656
Joseph Huber [Wed, 8 Sep 2021 13:52:28 +0000 (09:52 -0400)]
[OpenMP] Do not SPMDize generic regions with no parallel
This patch changes SPMDization to not trigger for regions with no
parallelism. Otherwise, this will introduce unnecessary barriers that
will slow the single-threaded region down.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109438
Leonard Chan [Wed, 8 Sep 2021 18:32:11 +0000 (11:32 -0700)]
[compiler-rt][Fuchsia] Support building + running compiler-rt tests on fuchsia's host toolchain
Differential Revision: https://reviews.llvm.org/D109199
Alex Langford [Tue, 7 Sep 2021 18:42:34 +0000 (11:42 -0700)]
[lldb] Delete IRExecutionUnit::SearchSpec
IRExecutionUnit::SearchSpec is a struct that encapsulates information
needed to look for a symbol. Specifically, it is comprised of a name
represented with a ConstString and a FunctionNameType mask.
Because the mask is unused (effectively always set to
eFunctionNameTypeFull), we can remove the mask and replace all uses with
eFunctionNameTypeFull. After doing that, SearchSpec is effectively a
wrapper around a ConstString.
As an aside, SearchSpec is similar in purpose to Module::LookupInfo. I
briefly considered replacing uses of SearchSpec with LookupInfo, but
the current code only cares about symbol names (treating them as
eFunctionNameTypeFull). This code does care about language type, so
LookupInfo may be appropriate for IRExecutionUnit in the future.
Differential Revision: https://reviews.llvm.org/D109384
Amara Emerson [Wed, 8 Sep 2021 18:24:44 +0000 (11:24 -0700)]
[GlobalISel] Use a typedef for builder function matchinfos for brevity. NFC.
Nick Desaulniers [Wed, 8 Sep 2021 17:44:16 +0000 (10:44 -0700)]
[ISEL][BitTestBlock] omit additional bit test when default destination is unreachable
Otherwise we end up with an extra conditional jump, followed by an
unconditional jump off the end of a function, i.e.
bb.0:
BT32rr ..
JCC_1 %bb.4 ...
bb.1:
BT32rr ..
JCC_1 %bb.2 ...
JMP_1 %bb.3
bb.2:
...
bb.3.unreachable:
bb.4:
...
Should be equivalent to:
bb.0:
BT32rr ..
JCC_1 %bb.4 ...
JMP_1 %bb.2
bb.1:
bb.2:
...
bb.3.unreachable:
bb.4:
...
This can occur since at the higher level IR (Instruction) SwitchInsts
are required to have BBs for default destinations, even when it can be
deduced that such BBs are unreachable.
For most programs, this isn't an issue, just wasted instructions since the
unreachable has been statically proven.
The x86_64 Linux kernel when built with CONFIG_LTO_CLANG_THIN=y fails to
boot though once D106056 is re-applied. D106056 makes it more likely
that correlation-propagation (CVP) can deduce that the default case of
SwitchInsts are unreachable. The x86_64 kernel uses a binary post
processor called objtool, which emits this warning:
vmlinux.o: warning: objtool: cfg80211_edmg_chandef_valid()+0x169: can't
find jump dest instruction at .text.cfg80211_edmg_chandef_valid+0x17b
I haven't debugged precisely why this causes a failure at boot time, but
fixing this very obvious jump off the end of the function fixes the
warning and boot problem.
Link: https://bugs.llvm.org/show_bug.cgi?id=50080
Fixes: https://github.com/ClangBuiltLinux/linux/issues/679
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1440
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D109103
Kirill Stoimenov [Wed, 8 Sep 2021 17:04:43 +0000 (17:04 +0000)]
[asan] Fixed the jump to use the 4 byte offset version.
This should have been the 4 byte version in the first place. Unfortunately there is no easy way to add a test, as both the 1 byte and 4 byte versions are printed as 'jmp' in the assembly code.
Reviewed By: kda
Differential Revision: https://reviews.llvm.org/D109453
Wouter van Oortmerssen [Thu, 2 Sep 2021 22:18:58 +0000 (15:18 -0700)]
[WebAssembly] Change WebAssemblyMCLowerPrePass to ModulePass
It was a FunctionPass before, which subverted its purpose to collect ALL symbols before MCLowering, depending on how LLVM schedules function passes.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51555
Differential Revision: https://reviews.llvm.org/D109202
Yaxun (Sam) Liu [Fri, 20 Aug 2021 22:03:56 +0000 (18:03 -0400)]
[HIP] Warn capture this pointer in device lambda
HIP currently diagnoses capture of the this pointer in device lambdas in
host member functions. If the this pointer points to managed memory,
it can be used in both device and host functions. In this
situation, capturing the this pointer in device lambda functions
in host member functions is valid usage. Change the diagnostic
about capturing the this pointer to a warning.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D108493
Arthur O'Dwyer [Wed, 8 Sep 2021 01:35:37 +0000 (21:35 -0400)]
[libc++] Comma-operator-proof a lot of algorithm/container code.
Detected by evil-izing the widely used `MoveOnly` testing type.
I had to patch some tests that were themselves using its comma operator,
but I think that's a worthwhile cost in order to catch more places
in our headers that needed comma-proofing.
The trick here is that even `++ptr, SomeClass()` can find a comma operator
by ADL, if `ptr` is of type `Evil*`. (A comma between two operands
of non-class-or-enum type is always treated as the built-in
comma, without ADL. But if either operand is class-or-enum, then
ADL happens for _both_ operands' types.)
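A small self-contained sketch of the effect described above. The `evil` namespace, the counter and the three functions are hypothetical; the `(void)` cast shown is one common guard, not necessarily the only one used in the patch.

```cpp
#include <cassert>

namespace evil {
int commaCalls = 0; // counts calls to the overloaded comma operator

struct Evil {};

// Greedy comma operator. ADL finds it whenever namespace `evil` is
// associated with either operand type, and that includes `Evil*`.
template <class T, class U>
void operator,(const T &, const U &) {
  ++commaCalls;
}
} // namespace evil

struct SomeClass {};

// Right operand has class type, so ADL runs for both operands and the
// overload is picked up through `Evil*`.
int unguardedCalls() {
  evil::commaCalls = 0;
  evil::Evil storage[2];
  evil::Evil *ptr = storage;
  ++ptr, SomeClass();
  return evil::commaCalls;
}

// Casting the left operand to void leaves no viable overload, so the
// built-in comma is used.
int guardedCalls() {
  evil::commaCalls = 0;
  evil::Evil storage[2];
  evil::Evil *ptr = storage;
  (void)++ptr, SomeClass();
  return evil::commaCalls;
}

// Neither operand has class or enum type, so the built-in comma is used
// without any ADL.
int builtinOnlyCalls() {
  evil::commaCalls = 0;
  evil::Evil storage[2];
  evil::Evil *ptr = storage;
  int n = 0;
  ++ptr, ++n;
  return evil::commaCalls;
}
```

Only the unguarded form reaches the overloaded operator; the other two never increment the counter.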
Differential Revision: https://reviews.llvm.org/D109414
Craig Topper [Wed, 8 Sep 2021 17:25:32 +0000 (10:25 -0700)]
[RISCV] Pre-commit tests for D109394. NFC
Chris Lattner [Wed, 8 Sep 2021 17:09:42 +0000 (10:09 -0700)]
[tests] Make testsuite more resilient to "order of constant" changes. NFC.
Saleem Abdulrasool [Wed, 8 Sep 2021 16:20:43 +0000 (09:20 -0700)]
Support: hoist `extern template` declarations
Place the `extern template` declaration prior to use. This is helpful
as it prevents the compiler from having to worry about instantiating the
template as it will be provided for. This is particularly important for
Windows where `__declspec(dllexport)` will traverse inheritance clauses
resulting in an incorrect application of dll interface to declarations.
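A minimal single-file sketch of the ordering described above. `Box` and `read_box` are hypothetical names, not the types touched in Support, and the explicit instantiation definition would normally live in one .cpp file rather than next to the declaration.

```cpp
#include <cassert>

// A small class template standing in for a library type (hypothetical).
template <typename T>
struct Box {
    T value;
    T get() const;
};

template <typename T>
T Box<T>::get() const { return value; }

// Explicit instantiation declaration, placed before the first use. It tells
// the compiler that the members of Box<int> are instantiated elsewhere, so
// the use below does not need to instantiate them implicitly.
extern template struct Box<int>;

// First use of Box<int>, after the declaration above.
inline int read_box(int v) {
    Box<int> b{v};
    return b.get();
}

// Explicit instantiation definition. It appears in this file only to keep
// the sketch self-contained.
template struct Box<int>;
```

With the declaration hoisted above `read_box`, every use in the translation unit sees it first.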
Jessica Paquette [Sat, 4 Sep 2021 00:55:13 +0000 (17:55 -0700)]
[GlobalISel] Add G_ROTL and G_ROTR to right_identity_zero
Similar to `DAGCombiner::visitRotate`.
This makes `rotl_bitwidth_cst` in postlegalizercombiner-rotate.mir reduce down
to a COPY. Modify the checkline to make sure that only rotate_out_of_range
runs there.
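The identity behind this combine can be checked on plain integers. The sketch below assumes the rotate amount is taken modulo the bit width; `rotl32` and `rotr32` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Rotate helpers on 32-bit values. The amount is reduced modulo the bit
// width first, which is the assumption this sketch makes about rotates.
inline uint32_t rotl32(uint32_t x, unsigned amt) {
    amt %= 32;
    if (amt == 0) return x;  // avoids an undefined shift by 32 below
    return (x << amt) | (x >> (32 - amt));
}

inline uint32_t rotr32(uint32_t x, unsigned amt) {
    amt %= 32;
    if (amt == 0) return x;
    return (x >> amt) | (x << (32 - amt));
}
```

Zero is a right identity for both rotates, and under the modulo assumption a rotate by the full bit width behaves the same way, which is why such a rotate can collapse to a plain copy.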
Differential Revision: https://reviews.llvm.org/D109264
Craig Topper [Wed, 8 Sep 2021 17:01:29 +0000 (10:01 -0700)]
[RISCV] Remove unused tablegen template parameters. NFC
Identified in D109359
Mehdi Amini [Wed, 8 Sep 2021 16:50:27 +0000 (16:50 +0000)]
Add sanity check in MLIR ODS to catch case where two operands have the same name
This turns a tablegen crash into a friendlier error.
Differential Revision: https://reviews.llvm.org/D109449
Dan Liew [Wed, 8 Sep 2021 01:09:38 +0000 (18:09 -0700)]
Fix `asan/TestCases/Darwin/scrible.cpp` to work on platforms where `long` is not 64-bits.
Previously the test was failing on platforms where `long` was less than
64-bits wide (e.g. older WatchOS simulators and arm64_32) because the
`padding` field was too small.
The test currently relies on the `my_object->isa` being scribbled or
left unmodified after `my_object` is freed. However, this was not the
case because the `isa` pointer intersected with
`ChunkHeader::free_context_id`. `free_context_id` starts at the
beginning of user memory but it is only initialized once the memory is
freed. This caused the `isa` pointer to change after it was freed
leading to the test crashing.
To fix this the `padding` field has been made explicitly 64-bits wide
(same size as `ChunkHeader::free_context_id`).
rdar://75806757
Differential Revision: https://reviews.llvm.org/D109409
Nick Desaulniers [Wed, 8 Sep 2021 16:44:17 +0000 (09:44 -0700)]
[ISEL][BitTestBlock] pre-commit test for D109103
Upload a test that shows ISEL taking a SwitchInst that has an
unreachable BB for a default target being lowered to an unconditional
jump off the end of a function.
Link: https://bugs.llvm.org/show_bug.cgi?id=50080
Link: https://github.com/ClangBuiltLinux/linux/issues/679
Link: https://github.com/ClangBuiltLinux/linux/issues/1440
Reviewed By: craig.topper, hans
Differential Revision: https://reviews.llvm.org/D109106
Craig Topper [Wed, 8 Sep 2021 16:35:02 +0000 (09:35 -0700)]
[RISCV] Use V0 instead of VMV0: for mask vectors in isel patterns.
This is consistent with the RVV intrinsic patterns. This has been
shown to prevent some "ran out of registers" errors in our internal
testing.
Unfortunately, there are some regressions on LMUL=8 tests in here.
I think the lack of registers with LMUL=8 just makes it very hard
to schedule correctly.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D109245
Benjamin Kramer [Wed, 8 Sep 2021 16:33:21 +0000 (18:33 +0200)]
[IROutliner] Remove unused variable. NFC.
Roman Lebedev [Wed, 8 Sep 2021 16:16:55 +0000 (19:16 +0300)]
[X86] X86DAGToDAGISel::matchBitExtract(): support 'num high bits to clear' pattern
Currently, we only deal with the case where we can match
the number of low bits to be kept, i.e.:
```
x & ((1 << y) - 1)
```
will extract low `y` bits of `x`.
But what will
```
x & (-1 >> y)
```
do?
Logically, it will extract `bitwidth(x) - y` low bits, i.e.:
```
x & ~(-1 << (bitwidth(x)-y))
```
... except we can't do such a transformation in IR in general,
because if we wanted to extract all the bits `(-1 >> 0)` is fine,
but `-1 << bitwidth(x)` would be `poison`: https://alive2.llvm.org/ce/z/BKJZfw
Yet, here with BMI's BEXTR and BMI2's BZHI we don't have any such problems with edge-cases.
So what we can do is: https://alive2.llvm.org/ce/z/gm5M2B
As briefly discussed with @craig.topper, this appears to be not worse than what we'd end up with currently (a pair of shifts):
* https://godbolt.org/z/nsPb8bejs (direct data dependency, sequential execution)
* https://godbolt.org/z/7bj3zeh1d (no direct data dependency, parallel execution)
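The equivalence between the two mask forms can be checked on 32-bit unsigned integers. This is only a sketch of the arithmetic, not of the isel code: `keep_low_bits` and `clear_high_bits` are hypothetical helpers, and the full-width case is special-cased because shifting by the bit width is undefined in C++, mirroring the `poison` edge case noted above.

```cpp
#include <cassert>
#include <cstdint>

// "Number of low bits to keep" form: x & ((1 << n) - 1), for n in [0, 32].
// n == 32 is handled separately because a 32-bit shift by 32 is undefined.
inline uint32_t keep_low_bits(uint32_t x, unsigned n) {
    return n >= 32 ? x : x & ((uint32_t{1} << n) - 1);
}

// "Number of high bits to clear" form: x & (-1 >> y), for y in [0, 31].
// The shift is a logical (unsigned) right shift of an all-ones value.
inline uint32_t clear_high_bits(uint32_t x, unsigned y) {
    return x & (~uint32_t{0} >> y);
}
```

For every `y` in range, `clear_high_bits(x, y)` matches `keep_low_bits(x, 32 - y)`, including `y == 0`, where all bits are kept.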
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D107923
Craig Topper [Wed, 8 Sep 2021 16:14:56 +0000 (09:14 -0700)]
[RISCV] Add a GPR def to the Zvlseg SPILL/RELOAD pseudos
The expansion of these pseudos creates ADD instructions. Those
ADDs modify a GPR so that it no longer contains the same value
as the input base pointer. Therefore, I believe we should have a
GPR as a Def on these instructions and expansion should get the
destination register for the ADDs from that operand.
At least in our tests here this works out so that register
scavenging picks the same register as the base pointer.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D109405
gtt1995 [Wed, 8 Sep 2021 13:18:19 +0000 (06:18 -0700)]
Redistribute energy for Corpus
I found that the initial corpus allocation of fork mode has certain defects.
I designed a new initial corpus allocation strategy based on size grouping.
This method can give more energy to the small seeds in the corpus and
increase the throughput of the test.
Fuzzbench data (glibfuzzer is -fork_corpus_groups=1):
https://www.fuzzbench.com/reports/experimental/2021-08-05-parallel/index.html
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D105084
Saleem Abdulrasool [Tue, 7 Sep 2021 21:37:53 +0000 (14:37 -0700)]
Analysis: move declaration of variables to a more suitable location
This moves 2 variable declarations from `llvm/Support/Debug.h` to a more
appropriate home in the headers for `LLVMAnalysis`. These variables are
defined in `LLVMAnalysis` rather than in `LLVMSupport` and although they
control debugging behavior, colocating the declarations in the same
library's headers makes them easier to locate and helps to correctly
describe the library's interfaces.
Reviewed By: rnk, mehdi_amini, aeubanks
Differential Revision: https://reviews.llvm.org/D109396
Alexey Lapshin [Tue, 7 Sep 2021 18:13:28 +0000 (21:13 +0300)]
[llvm-objcopy][NFC] Refactor CopyConfig structure - categorize options.
This patch continues refactoring done by D99055. It puts format-specific
options into the corresponding CopyConfig structures.
Differential Revision: https://reviews.llvm.org/D102277
Andrew Litteken [Wed, 28 Jul 2021 16:22:35 +0000 (09:22 -0700)]
[IROutliner] Adding support for multiple exits
When we start outlining across branches, there is the possibility that we will have two different blocks with different output locations, or a single branch that goes to two blocks outside of the region that is being outlined. While the CodeExtractor provides most of the mechanisms by using the return value of the extracted function as the input to a switch statement to branch to the correct location, we need special handling for different output schemas for each location.
This is done by repeating the existing storing scheme for each different exit block. We have a map from the return values used, to the basic block that is used to store the outputs for that particular exit block within the outlined function. Then if needed, we create a switch statement for each return block to branch to the correct set of stored outputs.
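As a source-level analogy of the scheme described above (the outliner itself works on LLVM IR), the sketch below shows an extracted function whose return value identifies the exit taken, with a separate set of stored outputs per exit. `outlined_region`, `call_site`, and the output parameters are all hypothetical names.

```cpp
#include <cassert>

// Hypothetical outlined region with two exits. Each exit stores only its own
// outputs and returns an index identifying which exit was taken.
inline int outlined_region(int x, int* out_neg, int* out_pos) {
    if (x < 0) {
        *out_neg = -x;     // output stores for exit 0
        return 0;
    }
    *out_pos = x * 2;      // output stores for exit 1
    return 1;
}

// The call site switches on the returned index to continue at the block
// that corresponds to the exit taken inside the outlined region.
inline int call_site(int x) {
    int neg = 0;
    int pos = 0;
    switch (outlined_region(x, &neg, &pos)) {
    case 0:
        return neg + 100;  // continuation after exit 0
    default:
        return pos + 200;  // continuation after exit 1
    }
}
```

Here `call_site(-5)` follows exit 0 and `call_site(3)` follows exit 1, each reading only the outputs stored for its own exit.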
Reviewers: paquette
Differential Revision: https://reviews.llvm.org/D106993
Kazu Hirata [Wed, 8 Sep 2021 15:54:15 +0000 (08:54 -0700)]
[IR] Construct SmallVector with iterator ranges (NFC)
Note that arg_operands has been deprecated in favor of args.
Saleem Abdulrasool [Tue, 7 Sep 2021 21:28:05 +0000 (14:28 -0700)]
IR: move the declaration of `VerifyDomInfo` (NFC)
This moves the declaration of `VerifyDomInfo` into
`llvm/IR/Dominators.h` from `llvm/Support/Debug.h`. Although this is a
debugging utility, the definition of the symbol is in LLVMIR, not in
LLVMSupport. This moves the declaration to the containing module's
header.
Reviewed By: rnk, mehdhi_amini
Differential Revision: https://reviews.llvm.org/D109395
David Spickett [Wed, 8 Sep 2021 08:55:56 +0000 (08:55 +0000)]
[lldb] Remove unused GDBRemoteCommunicationClient::SendAttach function
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D109427
AndreyChurbanov [Wed, 8 Sep 2021 15:12:31 +0000 (18:12 +0300)]
[OpenMP][NFC] Added comment on OpenMP 5.0 task affinity pilot implementation
Kunwar Shaanjeet Singh Grover [Wed, 8 Sep 2021 14:52:01 +0000 (20:22 +0530)]
[MLIR] FlatAffineConstraints: Refactored computation of explicit representation for identifiers
This patch refactors the existing implementation of computing an explicit
representation of an identifier as a floordiv in terms of other identifiers and
exposes this computation as a public function.
The computation of this representation is required to support local identifiers
in PresburgerSet subtract, complement and isEqual.
Reviewed By: bondhugula, arjunp
Differential Revision: https://reviews.llvm.org/D106662
Guillaume Chatelet [Wed, 8 Sep 2021 14:43:48 +0000 (14:43 +0000)]
[libc] Fix running benchmarks under msan/asan
asan/msan intercept `aligned_malloc` and misbehave when the requested
alignment is greater than 512.
https://github.com/llvm/llvm-project/blob/b041b613e6fff713fc9ad6dbc73024286fb2fc93/compiler-rt/lib/asan/asan_allocator.cpp#L430-L431