Elvina Yakubova [Wed, 11 Nov 2020 13:45:58 +0000 (16:45 +0300)]
[OpenCL] Make Clang recognize -cl-std=1.0 as a value argument
This patch makes Clang recognize -cl-std=1.0 as a value argument;
previously -std=cl1.0 had to be used instead.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47981
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D91237
Adrian Kuegel [Wed, 11 Nov 2020 13:58:55 +0000 (14:58 +0100)]
MLIR: Remove TanhOp from ops list. It caused a build failure.
Nico Weber [Wed, 11 Nov 2020 13:39:35 +0000 (08:39 -0500)]
[gn build] (manually) port 98aa067109e
Nico Weber [Wed, 11 Nov 2020 13:34:49 +0000 (08:34 -0500)]
[gn build] (semi-manually) Port 454579e46a87
Nico Weber [Wed, 11 Nov 2020 13:34:30 +0000 (08:34 -0500)]
Revert "[gn build] (semi-manually) Port 98aa067109"
This reverts commit 04ce13e497be60f51d340e649c72138d49cb13e9.
The commit message was wrong. Will reland with fixed message.
Adrian Kuegel [Wed, 11 Nov 2020 13:14:43 +0000 (14:14 +0100)]
MLIR: add SinOp Lowering to __ocml_sin_f32 and __ocml_sin_f64
This mimics the recent similar patch for GPUToNVVM.
Differential Revision: https://reviews.llvm.org/D91252
Andrzej Warzynski [Wed, 11 Nov 2020 08:45:54 +0000 (08:45 +0000)]
[flang][driver] Make sure that `-###` is marked as supported (NFC)
`-###` has always been supported in the new flang driver. This patch
merely makes sure that it's included when printing the help screen (i.e.
`flang-new -help`).
Caroline Concatto [Fri, 6 Nov 2020 15:53:59 +0000 (15:53 +0000)]
[AArch64]Add memory op cost model for SVE
This patch adds/fixes memory op cost model for SVE with fixed-width
vector.
Differential Revision: https://reviews.llvm.org/D90950
Nico Weber [Mon, 9 Nov 2020 23:04:06 +0000 (18:04 -0500)]
[gn build] (semi-manually) Port 98aa067109
Simon Pilgrim [Wed, 11 Nov 2020 12:07:38 +0000 (12:07 +0000)]
[KnownBits] Add KnownBits::commonBits helper. NFCI.
We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........
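A minimal sketch of the merge this helper performs, assuming a simplified record with `Zero` and `One` masks. `SimpleKnownBits` and the free function below are hypothetical illustrations, not the LLVM class or its API; the point is that a bit stays known only if both inputs agree on it.
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a known-bits record (not LLVM's class).
struct SimpleKnownBits {
    std::uint8_t Zero; // bits known to be 0
    std::uint8_t One;  // bits known to be 1
};

// Keep only the facts on which both inputs agree: a bit is known zero
// (or known one) in the result only if it is known so in A and in B.
inline SimpleKnownBits commonBits(SimpleKnownBits A, SimpleKnownBits B) {
    return SimpleKnownBits{static_cast<std::uint8_t>(A.Zero & B.Zero),
                           static_cast<std::uint8_t>(A.One & B.One)};
}
```
This differs from modelling a bitwise AND of the two underlying values, which is presumably what an overloaded `&` on such a type would compute.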
Max Kazantsev [Wed, 11 Nov 2020 12:06:57 +0000 (19:06 +0700)]
[Test] Add failing test for PR48150
Jan Svoboda [Wed, 11 Nov 2020 10:05:24 +0000 (11:05 +0100)]
[clang][cli] Port ObjCMTAction to new option parsing system
Merge existing marshalling info kinds and add some primitives to
express flag options that contribute to a bitfield.
Depends on D82574
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D82860
Kerry McLaughlin [Wed, 11 Nov 2020 11:15:32 +0000 (11:15 +0000)]
[SVE][CodeGen] Lower scalable masked scatters
Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only)
Changes included in this patch:
- Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use.
Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts.
- Added the getCanonicalIndexType function to convert redundant addressing
modes (e.g. scaling is redundant when accessing bytes)
- Tests with 32 & 64-bit scaled & unscaled offsets
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90941
Sam McCall [Sat, 31 Oct 2020 20:09:11 +0000 (21:09 +0100)]
[Syntax] Start to move trivial Node class definitions to TableGen. NFC
This defines two node archetypes with trivial class definitions:
- Alternatives: the generated abstract classes are trivial as all
functionality is via LLVM RTTI
- Unconstrained: this is a placeholder, I think all of these are going to be
Lists but today they have no special accessors etc, so we just say
"could contain anything", and migrate them one-by-one to Sequence later.
Compared to Dmitri's prototype, Nodes.td looks more like a class hierarchy and
less like a grammar. (E.g. variants list the Alternatives parent rather than
vice versa).
The main reasons for this:
- the hierarchy is an important part of the API we want direct control over.
- e.g. we may introduce abstract bases like "loop" that the grammar doesn't
care about in order to model is-a concepts that might make refactorings
more expressive. This is less natural in a grammar-like idiom.
- e.g. we're likely to have to model some alternatives as variants and others
as class hierarchies, the choice will probably be based on natural is-a
relationships.
- it reduces the cognitive load of switching from editing *.td to working with
code that uses the generated classes
Differential Revision: https://reviews.llvm.org/D90543
Aleksandr Platonov [Wed, 11 Nov 2020 11:29:03 +0000 (14:29 +0300)]
[clangd] Improve clangd-indexer performance
This is an attempt to improve clangd-indexer tool performance:
- avoid processing already processed files.
- use different mutexes for different entities (e.g. do not block insertion of references while symbols are inserted)
Results for LLVM project indexing:
- before: ~30 minutes
- after: ~10 minutes
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D91051
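The two ideas in the clangd-indexer commit above can be sketched as follows. This is an illustration under assumptions, not clangd's code: `IndexSink` and all of its members are hypothetical names. It shows a set of already-seen files plus one mutex per container, so inserting references does not block inserting symbols.
```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch; names are illustrative and not taken from clangd.
class IndexSink {
public:
    // Returns true only the first time a given file path is seen.
    bool shouldIndex(const std::string &Path) {
        std::lock_guard<std::mutex> Lock(FilesMu);
        return Files.insert(Path).second;
    }
    // Symbols and references are guarded by different mutexes, so a thread
    // adding references never waits on a thread adding symbols.
    void addSymbol(const std::string &Name) {
        std::lock_guard<std::mutex> Lock(SymbolsMu);
        Symbols.push_back(Name);
    }
    void addRef(const std::string &Name) {
        std::lock_guard<std::mutex> Lock(RefsMu);
        Refs.push_back(Name);
    }
    std::size_t symbolCount() {
        std::lock_guard<std::mutex> Lock(SymbolsMu);
        return Symbols.size();
    }
    std::size_t refCount() {
        std::lock_guard<std::mutex> Lock(RefsMu);
        return Refs.size();
    }

private:
    std::mutex FilesMu, SymbolsMu, RefsMu;
    std::set<std::string> Files;
    std::vector<std::string> Symbols;
    std::vector<std::string> Refs;
};
```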
Andrzej Warzynski [Wed, 11 Nov 2020 11:11:48 +0000 (11:11 +0000)]
Revert "[flang] Fix CheckSpecificationExpr handling of associated names"
This reverts commit b670189975f5ba4e8ef22c74724c610287b69c28.
This patch causes shared library builds (BUILD_SHARED_LIBS=ON) to fail:
* http://lab.llvm.org:8011/#/builders/33/builds/626
I wasn't able to identify any easy fix, hence reverting.
Krasimir Georgiev [Wed, 11 Nov 2020 11:07:30 +0000 (12:07 +0100)]
[clang-format] do not break before @tags in JS comments
In JavaScript breaking before a `@tag` in a comment puts it on a new line, and
machinery that parses these comments will fail to understand such comments.
This adapts clang-format to not break before `@`. Similar functionality exists
for not breaking before `{`.
Reviewed By: mprobst
Differential Revision: https://reviews.llvm.org/D91078
Florian Hahn [Tue, 10 Nov 2020 19:44:18 +0000 (19:44 +0000)]
[llvm-reduce] Add reduction for special globals like llvm.used.
This patch adds a reduction of 'special' globals that lead to further
reductions (e.g. alias or regular globals reduction) being less efficient
because there are special constraints on values referenced in those
special globals. For example, values in @llvm.used and
@llvm.compiler.used need to be named, so replacing all uses of an
alias/global with undef or a different unnamed constant results in
invalid IR.
More details:
https://llvm.org/docs/LangRef.html#intrinsic-global-variables
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D90302
Kerry McLaughlin [Wed, 11 Nov 2020 10:34:49 +0000 (10:34 +0000)]
[SVE][CodeGen] Add the isTruncatingStore flag to MSCATTER
This patch adds the IsTruncatingStore flag to MaskedScatterSDNode, set by getMaskedScatter().
Updated SelectionDAGDumper::print_details for MaskedScatterSDNode to print
the details of masked scatters (is truncating, signed or scaled).
This is the first in a series of patches which add support for scalable masked scatters.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D90939
Sander de Smalen [Wed, 11 Nov 2020 10:45:28 +0000 (10:45 +0000)]
[LoopVectorizer] Silence warning in GetRegUsage.
This patch silences the warning:
error: lambda capture 'DL' is not used [-Werror,-Wunused-lambda-capture]
auto GetRegUsage = [&DL, &TTI=TTI](Type *Ty, ElementCount VF) {
~^~~
1 error generated.
Introduced in:
https://reviews.llvm.org/rGb873aba3943c067a5efd5303cbdf5aeb0732cf88
Yashaswini [Wed, 11 Nov 2020 09:55:19 +0000 (15:25 +0530)]
Add Semantic check for Flang OpenMP 4.5 - 2.7.1 schedule clause
Semantic check for the positive chunk size.
Test Cases:
omp-do-schedule01.f90
omp-do-schedule02.f90
omp-do-schedule03.f90
omp-do-schedule04.f90
Reviewed by: Kiran Chandramohan @kiranchandramohan
Differential Revision: https://reviews.llvm.org/D89546
Sam McCall [Mon, 9 Nov 2020 23:00:51 +0000 (00:00 +0100)]
Reland [Syntax] Add minimal TableGen for syntax nodes. NFC
This reverts commit 09c6259d6d0eb51b282f6c3a28052a8146bc095b.
(Fixed side-effecting code being buried in an assert)
Daniel Kiss [Wed, 11 Nov 2020 09:58:41 +0000 (10:58 +0100)]
[libunwind] LIBUNWIND_REMEMBER_HEAP_ALLOC to cmake.
Missed it originally in https://reviews.llvm.org/D85005.
Reviewed By: gargaroff
Differential Revision: https://reviews.llvm.org/D91182
Sander de Smalen [Wed, 11 Nov 2020 09:55:43 +0000 (09:55 +0000)]
[LoopVectorizer] NFCI: Calculate register usage based on TLI.getTypeLegalizationCost.
This is more accurate than dividing the bitwidth based on the element count by the
maximum register size, as it can just reuse whatever has been calculated for
legalization of these types.
This change is also necessary when calculating register usage for scalable vectors, where
the legalization of these types cannot be done based on the widest register size, because
that does not take the 'vscale' component into account.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D91059
Kirill Bobyrev [Wed, 11 Nov 2020 10:13:43 +0000 (11:13 +0100)]
[clangd] Abort rename when given the same name
When the user wants to rename a symbol to the same name, we shouldn't do any work.
Bail out early and return an error to save compute.
Resolves: https://github.com/clangd/clangd/issues/580
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D91134
Sam Parker [Wed, 11 Nov 2020 10:02:20 +0000 (10:02 +0000)]
[NFC][ARM] Replace lambda with any_of
Sander de Smalen [Wed, 11 Nov 2020 09:22:18 +0000 (09:22 +0000)]
[LoopVectorizer] NFC: Return ElementCount from compute[Feasible]MaxVF
Interfaces changed to return `ElementCount`:
* LoopVectorizationCostModel::computeMaxVF
* LoopVectorizationCostModel::computeFeasibleMaxVF
This is NFC for fixed-width vectors.
Reviewed By: dmgreen, ctetreau
Differential Revision: https://reviews.llvm.org/D90880
Eugene Zhulenev [Wed, 11 Nov 2020 09:38:51 +0000 (01:38 -0800)]
[mlir] Add NumberOfExecutions analysis + update RegionBranchOpInterface interface to query number of region invocations
Implements RFC discussed in: https://llvm.discourse.group/t/rfc-operationinstancesinterface-or-any-better-name/2158/10
Reviewed By: silvas, ftynse, rriddle
Differential Revision: https://reviews.llvm.org/D90922
Tres Popp [Tue, 10 Nov 2020 18:03:11 +0000 (19:03 +0100)]
[mlir] Rework DialectConversion inlineRegionBefore
The previous logic for inlining a region A with N blocks into region B
would produce incorrect results on rollback for N greater than 1. This
rollback logic would leave blocks 1..N in region B and only move block 0
to region A.
The new inlining action recording stores the block move actions from N-1
to 0. Now on roll back, block 0 is moved to region A and then 1..N is
appended to the list of blocks in region A.
Differential Revision: https://reviews.llvm.org/D91185
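The recording and rollback order described in the DialectConversion commit above can be modelled with plain vectors. This is a sketch under assumptions: regions are vectors of integer block ids, and `inlineRegionBefore`, `rollback`, and `MoveLog` here are hypothetical stand-ins rather than the MLIR implementation. Moves are recorded from block N-1 down to 0, and rollback replays the log in reverse, so block 0 returns first and the rest are appended after it.
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Region = std::vector<int>;  // block ids, in order
using MoveLog = std::vector<int>; // ids of moved blocks, in the order they moved

// Moves every block of Src into Dst in front of position Pos, recording
// each move. Blocks are taken from last to first, so the log reads N-1..0.
inline void inlineRegionBefore(Region &Src, Region &Dst, std::size_t Pos,
                               MoveLog &Log) {
    while (!Src.empty()) {
        int Block = Src.back();
        Src.pop_back();
        Dst.insert(Dst.begin() + static_cast<std::ptrdiff_t>(Pos), Block);
        Log.push_back(Block);
    }
}

// Undoes the recorded moves in reverse order: block 0 goes back first,
// then blocks 1..N-1 are appended behind it, restoring the original order.
inline void rollback(Region &Src, Region &Dst, MoveLog &Log) {
    for (auto It = Log.rbegin(); It != Log.rend(); ++It) {
        Dst.erase(std::find(Dst.begin(), Dst.end(), *It));
        Src.push_back(*It);
    }
    Log.clear();
}
```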
Christian Sigg [Tue, 10 Nov 2020 11:56:50 +0000 (12:56 +0100)]
[mlir][gpu] Add missing initialization of gpu runtime wrappers.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D91148
Christian Sigg [Wed, 11 Nov 2020 08:42:23 +0000 (09:42 +0100)]
[mlir] Use assemblyFormat in AllocLikeOp.
Split operands into dynamicSizes and symbolOperands.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D90589
Jan Svoboda [Wed, 11 Nov 2020 09:20:11 +0000 (10:20 +0100)]
[NFC] First test commit
Chen Zheng [Tue, 10 Nov 2020 15:01:00 +0000 (10:01 -0500)]
[SelectionDAG] fminnum should be a binary operator
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D91163
Stephan Herhut [Tue, 10 Nov 2020 15:59:54 +0000 (16:59 +0100)]
[mlir][llvm] Expose getters for alias and align attribute names
This adds getters for `llvm.align` and `llvm.noalias` strings that are used
as attribute names in the llvm dialect.
Differential Revision: https://reviews.llvm.org/D91166
Christian Sigg [Wed, 11 Nov 2020 07:24:16 +0000 (08:24 +0100)]
[mlir] Allow omitting spaces in assemblyFormat with a `` literal.
I would like to use this for D90589 to switch std.alloc to assemblyFormat.
Hopefully it will be useful in other places as well.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D91068
Raphael Isemann [Wed, 11 Nov 2020 08:13:56 +0000 (09:13 +0100)]
[lldb][test] Remove not_remote_testsuite_ready in favor of skipIfRemote decorator
Those two decorators have identical behaviour. This removes
`not_remote_testsuite_ready` as `skipIfRemote` seems more consistent with the
other decorator names we have
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D89376
Wang, Pengfei [Wed, 11 Nov 2020 07:32:08 +0000 (15:32 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vec tests. NFCI.
Kirill Bobyrev [Tue, 10 Nov 2020 12:59:26 +0000 (13:59 +0100)]
[clangd] NFC: Add more logging to remote index test
Wang, Pengfei [Wed, 11 Nov 2020 07:13:43 +0000 (15:13 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector tests. NFCI.
Wang, Pengfei [Wed, 11 Nov 2020 06:49:00 +0000 (14:49 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector reduce tests. NFCI.
Amara Emerson [Tue, 10 Nov 2020 05:55:22 +0000 (21:55 -0800)]
[AArch64][GlobalISel] Port some AArch64 target specific MUL combines from SDAG.
These do things like turn a multiply of a pow-2+1 into a shift and an add,
which is a common pattern that pops up, and is universally better than expensive
madd instructions with a constant.
I've added check lines to an existing codegen test since the code being ported
is almost identical, however the mul by negative pow2 constant tests don't generate
the same code because we're missing some generic G_MUL combines still.
Differential Revision: https://reviews.llvm.org/D91125
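The arithmetic identity behind the pow-2+1 combine in the commit above is x * (2^k + 1) == (x << k) + x. A minimal sketch, assuming 64-bit unsigned values and k < 64; `mulByPow2Plus1` is a hypothetical helper for illustration and not code from the combine itself.
```cpp
#include <cassert>
#include <cstdint>

// Computes X * (2^K + 1) as a shift plus an add. Unsigned arithmetic wraps
// modulo 2^64, so this matches the plain multiply even when it overflows.
// Precondition (assumed): K < 64.
inline std::uint64_t mulByPow2Plus1(std::uint64_t X, unsigned K) {
    return (X << K) + X;
}
```
For example, multiplying by 9 uses K = 3 and multiplying by 17 uses K = 4.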
Wang, Pengfei [Wed, 11 Nov 2020 06:17:24 +0000 (14:17 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector shift tests. NFCI.
Wang, Pengfei [Wed, 11 Nov 2020 05:45:35 +0000 (13:45 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector shuffle tests. NFCI.
Wang, Pengfei [Wed, 11 Nov 2020 05:31:01 +0000 (13:31 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector tzcnt tests. NFCI.
Faisal Vali [Wed, 11 Nov 2020 05:40:12 +0000 (23:40 -0600)]
[NFC, Refactor] Rename the (scoped) enum DeclaratorContext's enumerators to remove duplication
Since these are scoped enumerators, they have to be prefixed by DeclaratorContext, so let's remove Context from the name, and return some characters to the multiverse.
Patch was reviewed here: https://reviews.llvm.org/D91011
Thank you to aaron, bruno, wyatt and barry for indulging me.
Xun Li [Wed, 11 Nov 2020 04:46:05 +0000 (20:46 -0800)]
[SafeStack] Make sure SafeStack does not break musttail call contract
SafeStack instrumentation should not insert anything in between a musttail call and the return instruction.
For every ReturnInst that needs to be instrumented, we adjust the insertion point to the musttail call if one exists.
Differential Revision: https://reviews.llvm.org/D90702
Max Kazantsev [Wed, 11 Nov 2020 04:17:13 +0000 (11:17 +0700)]
[SCEV] Generalize no-self-wrap check in isLoopInvariantExitCondDuringFirstIterations
Lift limitation on step being `+/- 1`. In fact, the only thing it is needed for
is proving no-self-wrap. We can instead check this flag directly.
Theoretically it can increase the scope of the transform, but I could not
construct such a test easily.
Differential Revision: https://reviews.llvm.org/D91126
Reviewed By: apilipenko
Wang, Pengfei [Wed, 11 Nov 2020 03:55:09 +0000 (11:55 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector shuffle tests. NFCI.
Wang, Pengfei [Wed, 11 Nov 2020 03:43:22 +0000 (11:43 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector shift tests. NFCI.
Wang, Pengfei [Wed, 11 Nov 2020 03:31:04 +0000 (11:31 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector reduce tests. NFCI.
Wang, Pengfei [Wed, 11 Nov 2020 03:22:03 +0000 (11:22 +0800)]
[CodeGen][X86] Remove unused check-prefixes from vector popcnt tests. NFCI.
Roland McGrath [Wed, 11 Nov 2020 02:56:59 +0000 (18:56 -0800)]
[clang] Add missing header guard in <cpuid.h>
This header has long lacked a standard multiple inclusion guard
like other headers have, for no apparent reason. The GCC header
of the same name likewise lacks one up through release 10.1, but
trunk GCC (release 11, and perhaps future 10.x) has fixed it
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96238).
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D91226
Kazu Hirata [Wed, 11 Nov 2020 03:17:13 +0000 (19:17 -0800)]
Revert "[BranchProbabilityInfo] Use SmallVector (NFC)"
This reverts commit 2f1038c7b699e959e0521638e2e2818a849fe19c.
Wang, Pengfei [Wed, 11 Nov 2020 03:07:05 +0000 (11:07 +0800)]
[CodeGen][X86] Remove unused check-prefixes from mask tests. NFCI.
Gaurav Jain [Wed, 4 Nov 2020 14:51:00 +0000 (06:51 -0800)]
[NFC] Use [MC]Register in TwoAddressInstructionPass
Differential Revision: https://reviews.llvm.org/D90902
Akira Hatanaka [Wed, 11 Nov 2020 02:36:10 +0000 (18:36 -0800)]
Fix the data layout mangling specification for 'arm64-pc-win32-macho'
rdar://problem/70410504
Chen Zheng [Tue, 10 Nov 2020 14:46:43 +0000 (09:46 -0500)]
NFC - use script to update testcases and add new testcases.
zoecarver [Wed, 11 Nov 2020 02:23:22 +0000 (18:23 -0800)]
[libc++] Change requirements on linear_congruential_engine.
This patch changes how linear_congruential_engine picks its randomization
algorithm. It adds two restrictions, `_OverflowOK` and `_SchrageOK`.
`_OverflowOK` means that m is a power of two so using the classic
`(a * x + c) % m` will create a meaningless overflow. The second checks
that Schrage's algorithm will produce results that are in bounds of min
and max. This patch fixes https://llvm.org/PR27839.
Differential Revision: D65041
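For reference, the multiplicative step of Schrage's algorithm mentioned in the commit above can be sketched like this. It is an illustration under assumptions, not libc++'s implementation: `schrage_mul_mod` is a hypothetical name, the additive constant c is left out, and the preconditions a > 0, x < m, and (m % a) <= (m / a) are assumed rather than checked.
```cpp
#include <cassert>
#include <cstdint>

// Computes (a * x) % m using only 32-bit arithmetic.
// Assumed preconditions: a > 0, x < m, and (m % a) <= (m / a).
inline std::uint32_t schrage_mul_mod(std::uint32_t a, std::uint32_t x,
                                     std::uint32_t m) {
    const std::uint32_t q = m / a;
    const std::uint32_t r = m % a;
    // a * (x % q) < a * q <= m, so this product cannot overflow.
    const std::uint32_t t0 = a * (x % q);
    // r <= q gives r * (x / q) <= q * (x / q) <= x < m, so neither can this.
    const std::uint32_t t1 = r * (x / q);
    // The true result is t0 - t1 modulo m; add m back if it went negative.
    return t0 >= t1 ? t0 - t1 : m - (t1 - t0);
}
```
With the minstd_rand parameters a = 48271 and m = 2147483647, the direct product a * x would overflow 32 bits for most x, while both intermediate products here stay below m.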
Chen Zheng [Tue, 10 Nov 2020 02:24:36 +0000 (21:24 -0500)]
[EarlyCSE] delete abs/nabs handling
delete abs/nabs handling in earlycse pass to avoid bugs related to
hashing values. After abs/nabs is canonicalized to intrinsics in D87188,
we should get CSE ability for abs/nabs back.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D90734
Sam Clegg [Tue, 10 Nov 2020 01:52:39 +0000 (17:52 -0800)]
[lld][WebAssembly] Allow references to __tls_base without shared memory
Previously we limited the use of atomics and TLS to programs
linked with `--shared-memory`.
However, as of https://reviews.llvm.org/D79530 we now allow
programs that use atomics to be linked without `--shared-memory`.
For this to be useful we also want to allow TLS usage in such
programs. In this case, since we know we are single threaded
we simply include the TLS data as a regular active segment
and create an immutable `__tls_base` global that points to the
start of this segment.
Fixes: https://github.com/emscripten-core/emscripten/issues/12489
Differential Revision: https://reviews.llvm.org/D91115
Evgenii Stepanov [Tue, 10 Nov 2020 22:15:47 +0000 (14:15 -0800)]
[hwasan] Fix Thread reuse.
HwasanThreadList::DontNeedThread clobbers Thread::next_, breaking the
freelist. As a result, only the top of the freelist ever gets reused,
and the rest of it is lost.
Since the Thread object with its associated ring buffer is only 8Kb, this is
typically only noticeable in long running processes, such as fuzzers.
Fix the problem by switching from an intrusive linked list to a vector.
Differential Revision: https://reviews.llvm.org/D91208
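A minimal sketch of why a vector-based free list avoids the problem described in the hwasan commit above. Everything here is hypothetical (`Thread`, `ThreadList`, and their members are illustrative and not the hwasan sources): because the free list lives outside the objects, wiping an object on release cannot destroy the list links.
```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload; no intrusive next pointer is stored in the object.
struct Thread {
    int Id = 0;
    void reset() { Id = 0; } // stands in for wiping the object's memory
};

class ThreadList {
public:
    // Reuses a released Thread if one is available, otherwise allocates.
    Thread *create() {
        if (!Free.empty()) {
            Thread *T = Free.back();
            Free.pop_back();
            return T;
        }
        Owned.push_back(std::unique_ptr<Thread>(new Thread()));
        return Owned.back().get();
    }
    // Wiping the object is safe: the free list is kept in a separate vector.
    void release(Thread *T) {
        T->reset();
        Free.push_back(T);
    }
    std::size_t allocated() const { return Owned.size(); }

private:
    std::vector<std::unique_ptr<Thread>> Owned;
    std::vector<Thread *> Free;
};
```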
Wang, Pengfei [Wed, 11 Nov 2020 01:11:28 +0000 (09:11 +0800)]
[CodeGen][X86] Remove unused check-prefixes from bitcast tests. NFCI.
Vedant Kumar [Wed, 11 Nov 2020 00:37:22 +0000 (16:37 -0800)]
[test] Delete redundant lldbutil import, NFC
Vedant Kumar [Wed, 11 Nov 2020 00:01:16 +0000 (16:01 -0800)]
[ThreadPlan] Add a test for `thread step-in -r`, NFC
Adds test coverage for ThreadPlanStepInRange::SetAvoidRegexp.
See:
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/lldb/source/Target/ThreadPlanStepInRange.cpp.html#L309
Differential Revision: https://reviews.llvm.org/D91220
peter klausler [Tue, 10 Nov 2020 22:53:01 +0000 (14:53 -0800)]
[flang] Fix CheckSpecificationExpr handling of associated names
Avoid a spurious error message about a dummy procedure reference
in a specification expression by restructuring the handling of
use-associated and host-associated symbols.
Differential revision: https://reviews.llvm.org/D91209
Vedant Kumar [Tue, 10 Nov 2020 23:22:29 +0000 (15:22 -0800)]
[ThreadPlan] Delete unused ThreadPlanStepInRange code, NFC
Vedant Kumar [Tue, 10 Nov 2020 23:24:07 +0000 (15:24 -0800)]
[ThreadPlan] Reflow docs to fit the 80 column limit, NFC
Vedant Kumar [Tue, 10 Nov 2020 23:17:17 +0000 (15:17 -0800)]
[Command] Fix accidental word concatenation in Options.td
Split up words that appear to have been accidentally concatenated.
This looks to be exhaustive: to find these in vim, use:
/\v[^ ]"\n +"[^ ]
Peter Collingbourne [Tue, 10 Nov 2020 23:50:04 +0000 (15:50 -0800)]
hwasan: Bring back operator {new,delete} interceptors on Android.
It turns out that we can't remove the operator new and delete
interceptors on Android without breaking ABI, so bring them back
as forwards to the malloc and free functions.
Differential Revision: https://reviews.llvm.org/D91219
Lang Hames [Tue, 10 Nov 2020 01:38:41 +0000 (12:38 +1100)]
[ORC] Add debugging output for ResourceTracker to be used in JITDylib::define.
Richard Smith [Tue, 10 Nov 2020 23:52:36 +0000 (15:52 -0800)]
Properly collect template arguments from a class-scope function template
specialization.
Fixes a crash-on-valid if further template parameters are introduced
within the specialization (by a generic lambda).
Peter Collingbourne [Tue, 10 Nov 2020 23:46:21 +0000 (15:46 -0800)]
gn build: (manually) Port
ae032e27 and 21f83113.
__register_frame and __deregister_frame are associated with the
.eh_frame section, which I think is used on all of our platforms
except Windows and 32-bit ARM (which uses the ARM EHABI).
Also add a file that was added to lld/MachO.
Gaurav Jain [Thu, 5 Nov 2020 03:14:59 +0000 (19:14 -0800)]
[NFC] Use [MC]Register for x86 target
Differential Revision: https://reviews.llvm.org/D91161
Tue Ly [Thu, 5 Nov 2020 19:55:46 +0000 (14:55 -0500)]
[libc] Add implementations of fdim[f|l].
Implementing fdim, fdimf, and fdiml for llvm-libc.
Differential Revision: https://reviews.llvm.org/D90906
Stephen Kelly [Mon, 9 Nov 2020 19:56:48 +0000 (19:56 +0000)]
Add utility for testing if we're matching nodes AsIs
Differential Revision: https://reviews.llvm.org/D91144
Kazushi (Jam) Marukawa [Tue, 3 Nov 2020 13:08:57 +0000 (22:08 +0900)]
[VE] Implement FoldImmediate
Implement FoldImmediate for integer arithmetic operations only.
Add regression tests also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91150
Florian Hahn [Tue, 10 Nov 2020 22:50:46 +0000 (22:50 +0000)]
Revert "[VPlan] Use VPValue def for VPWidenSelectRecipe."
This reverts commit a8e50f1c6e7b404aab8fedb972f003a4d6a6434e.
This reportedly breaks building the Linux kernel.
https://bugs.llvm.org/show_bug.cgi?id=48142
Richard Smith [Tue, 10 Nov 2020 21:30:33 +0000 (13:30 -0800)]
Add PrintingPolicy overload to APValue::printPretty. NFC.
Stephen Kelly [Tue, 10 Nov 2020 22:29:23 +0000 (22:29 +0000)]
Revert "Add utility for testing if we're matching nodes AsIs"
This reverts commit e73296d3b92fc231f3f913815e477d55b66595bd.
This may have caused build bot failure.
Zequan Wu [Tue, 10 Nov 2020 22:25:25 +0000 (14:25 -0800)]
[llvm-cov] Add a test for c75a0a1e
Akira Hatanaka [Tue, 10 Nov 2020 21:46:53 +0000 (13:46 -0800)]
[CodeGen] Mark calls to objc_autorelease as tail
This enables a method sending an autorelease message to an object and
returning the object in MRR to avoid adding the object to an autorelease
pool if a call to objc_retainAutoreleasedReturnValue in the caller
function accepts the hand off of the retain count.
rdar://problem/50678052
Differential Revision: https://reviews.llvm.org/D91111
Sean Silva [Wed, 28 Oct 2020 20:25:48 +0000 (13:25 -0700)]
[mlir] Add pass to convert elementwise ops to linalg.
This patch converts elementwise ops on tensors to linalg.generic ops
with the same elementwise op in the payload (except rewritten to
operate on scalars, obviously). This is a great form for later fusion to
clean up.
E.g.
```
// Compute: %arg0 + %arg1 - %arg2
func @f(%arg0: tensor<?xf32>, %arg1: tensor<?xf32>, %arg2: tensor<?xf32>) -> tensor<?xf32> {
%0 = addf %arg0, %arg1 : tensor<?xf32>
%1 = subf %0, %arg2 : tensor<?xf32>
return %1 : tensor<?xf32>
}
```
Running this through
`mlir-opt -convert-std-to-linalg -linalg-fusion-for-tensor-ops` we get:
```
func @f(%arg0: tensor<?xf32>, %arg1: tensor<?xf32>, %arg2: tensor<?xf32>) -> tensor<?xf32> {
%0 = linalg.generic {indexing_maps = [#map0, #map0, #map0, #map0], iterator_types = ["parallel"]} ins(%arg0, %arg1, %arg2 : tensor<?xf32>, tensor<?xf32>, tensor<?xf32>) {
^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors
%1 = addf %arg3, %arg4 : f32
%2 = subf %1, %arg5 : f32
linalg.yield %2 : f32
} -> tensor<?xf32>
return %0 : tensor<?xf32>
}
```
So the elementwise ops on tensors have nicely collapsed into a single
linalg.generic, which is the form we want for further transformations.
Differential Revision: https://reviews.llvm.org/D90354
Sean Silva [Wed, 4 Nov 2020 02:17:55 +0000 (18:17 -0800)]
[mlir] Add ElementwiseMappable trait and apply it to std elementwise ops.
This patch adds an `ElementwiseMappable` trait as discussed in the RFC
here:
https://llvm.discourse.group/t/rfc-std-elementwise-ops-on-tensors/2113/23
This trait can power a number of transformations and analyses.
A subsequent patch adds a convert-elementwise-to-linalg pass that exhibits
how this trait allows writing generic transformations.
See https://reviews.llvm.org/D90354 for that patch.
This trait slightly changes some verifier messages, but the diagnostics
are usually about as good. I fiddled with the ordering of the trait in
the .td file trait lists to minimize the changes here.
Differential Revision: https://reviews.llvm.org/D90731
Pirama Arumuga Nainar [Tue, 10 Nov 2020 19:15:16 +0000 (11:15 -0800)]
[ARM] Fix PR 47980: Use constrainRegClass during foldImmediate opt.
Previously we used setRegClass to rgpr, which may expand the register
domain if the result was already in a constrained class (tcgpr in the
above PR).
Differential Revision: https://reviews.llvm.org/D91192
Michael Kruse [Tue, 10 Nov 2020 08:37:35 +0000 (02:37 -0600)]
[Polly][ScopBuilder] Use only modeled instructions to compute statement granularity.
ScopBuilder distributes independent instructions between statements.
Only modeled (e.g. not synthesizable) instructions are represented.
Non-modeled instructions were used in some parts of determining
instruction independence, which could lead to the re-introduction of
non-modeled instructions.
In particular, required invariant loads could be added to instruction
list, which then led to redundant MemoryAccesses for such a load.
This fixes llvm.org/PR48059.
Xun Li [Tue, 10 Nov 2020 21:02:18 +0000 (13:02 -0800)]
[Coroutine][Sema] Cleanup temporaries as early as possible
The original bug was discovered in T75057860. Clang front-end emits an AST that looks like this for a co_await expression:
|- ExprWithCleanups
|- -CoawaitExpr
|- -MaterializeTemporaryExpr ... Awaiter
...
|- -CXXMemberCallExpr ... .await_ready
...
|- -CallExpr ... __builtin_coro_resume
...
|- -CXXMemberCallExpr ... .await_resume
...
ExprWithCleanups is responsible for cleaning up (including calling dtors) the temporaries generated in the wrapping expression.
In the above structure, for the __builtin_coro_resume part (which corresponds to the code for the suspend case in the co_await with symmetric transfer), the pseudocode looks like this:
__builtin_coro_resume(
awaiter.await_suspend(
from_address(
__builtin_coro_frame())).address());
One of the temporaries that's generated as part of this code is the coroutine handle returned from the awaiter.await_suspend() call. The call returns a handle which is a prvalue (since it's a value returned on the fly). In order to call the address() method on it, it needs to be converted into an xvalue. Hence a materialized temp is created to hold it. This temp will need to be cleaned up eventually. Now, since all cleanups happen at the end of the entire co_await expression, which is after the <coro.suspend> suspension point, the compiler will think that such a temp needs to live across suspensions, and needs to be put on the coroutine frame, even though it's only used temporarily just to call the address() method.
This not only unnecessarily increases the frame size, but can also lead to ASAN failures if the coroutine was already destroyed as part of the await_suspend() call. If the coroutine was already destroyed, the frame no longer exists, and nothing can be stored into it. But if the temporary object is considered to need to live on the frame, it will be stored into the frame after await_suspend() returns.
A fix attempt was done in https://reviews.llvm.org/D87470. Unfortunately it is incorrect. The reason is that cleanups in Clang work linearly rather than in a nested fashion. There is one current state indicating whether it needs cleanup, and an ExprWithCleanups resets that state. This means that an ExprWithCleanups must be capable of cleaning up all temporaries created in the wrapping expression, otherwise there will be dangling temporaries cleaned up at the wrong place.
I eventually found a workaround (https://reviews.llvm.org/D89066) that doesn't break any existing tests while fixing the issue. But it targets the final co_await only. If we ever have a co_await that's not on the final awaiter and the frame gets destroyed after suspend, we are in trouble. Hence we need a proper fix.
This patch is the proper fix. It does the following things to fully resolve the issue:
1. The AST has to be generated in the order according to their nesting relationship. We should not generate AST out of order because then the code generator would incorrectly track the state of temporaries and when a cleanup is needed. So the code in buildCoawaitCalls is reorganized so that we will be generating the AST for each coawait member call in order along with their child AST.
2. The await_ready() call is wrapped with an ExprWithCleanups so that temporaries in it get cleaned up as early as possible to avoid living across suspension.
3. The await_suspend() call is wrapped with an ExprWithCleanups if it's not a symmetric transfer. In the case of a symmetric transfer, in order to maintain the musttail call contract, the ExprWithCleanups is wrapped before the resume call.
4. In the end, we mark again that it needs a cleanup, so that the entire CoawaitExpr will be wrapped with an ExprWithCleanups which will clean up the Awaiter object associated with the await expression.
Differential Revision: https://reviews.llvm.org/D90990
AndreyChurbanov [Tue, 10 Nov 2020 21:16:23 +0000 (00:16 +0300)]
[OpenMP] Fixes for shared memory cleanup when aborts occur
Patch by Erdner, Todd <todd.erdner@intel.com>
Differential Revision: https://reviews.llvm.org/D90974
Stanislav Mekhanoshin [Tue, 10 Nov 2020 20:12:33 +0000 (12:12 -0800)]
[AMDGPU] Set default op_sel_hi on accvgpr read/write
These are opsel opcodes with op_sel actually being ignored.
As such, op_sel_hi needs to be set to default 1 even though
these bits are ignored. This is a compatibility change.
Differential Revision: https://reviews.llvm.org/D91202
Richard Smith [Tue, 10 Nov 2020 20:51:00 +0000 (12:51 -0800)]
Move code to determine the type of an LValueBase out of ExprConstant and
into a member function on LValueBase. NFC.
Bruno Cardoso Lopes [Tue, 10 Nov 2020 19:20:56 +0000 (11:20 -0800)]
[Coroutines] Add missing llvm.dbg.declare's to cover for more allocas
Tracking local variables across suspend points is still somewhat incomplete.
Consider this coroutine snippet:
```
resumable foo() {
int x[10] = {};
int a = 3;
co_await std::experimental::suspend_always();
a++;
x[0] = 1;
a += 2;
x[1] = 2;
a += 3;
x[2] = 3;
}
```
The debugger cannot print `a` or `x` if they turn out to be allocas during
CoroSplit (which happens if you build this code with `-O0` prior to this
commit):
```
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
frame #0: 0x0000000100003729 main-noprint`foo() at main-noprint.cpp:43:5
40 co_await std::experimental::suspend_always();
41 a++;
42 x[0] = 1;
-> 43 a += 2;
44 x[1] = 2;
45 a += 3;
46 x[2] = 3;
(lldb) p x
error: <user expression 21>:1:1: use of undeclared identifier 'x'
x
^
```
The generated IR contains an `llvm.dbg.declare` for `x` in its initialization
basic block. After CoroSplit, the `llvm.dbg.declare` might not dominate all of
`x`'s uses and we lose debugging quality.
Add `llvm.dbg.value`s to all relevant basic blocks such that if later
transformations break the dominance the reliable debug info is already in
place. For instance, this BB:
```
await.ready:
...
%arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 0, !dbg !760
...
%arrayidx19 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 1, !dbg !763
...
%arrayidx21 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 2, !dbg !766
```
becomes:
```
await.ready:
...
call void @llvm.dbg.value(metadata [10 x i32]* %x.reload.addr, metadata !751, metadata !DIExpression()), !dbg !753
...
%arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 0, !dbg !760
...
%arrayidx19 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 1, !dbg !763
...
%arrayidx21 = getelementptr inbounds [10 x i32], [10 x i32]* %x.reload.addr, i64 0, i64 2, !dbg !766
```
Differential Revision: https://reviews.llvm.org/D90772
Sjoerd Meijer [Mon, 9 Nov 2020 16:16:54 +0000 (16:16 +0000)]
[LoopFlatten] Run it earlier, just before IndVarSimplify
This is a prep step for widening induction variables in LoopFlatten where
possible (D90640), to avoid having to perform certain overflow checks. Since
IndVarSimplify may already widen induction variables, we want to run
LoopFlatten just before IndVarSimplify. This is a minor reshuffle as both
passes were already close to each other in the pipeline.
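As a rough illustration of what flattening does and why a wider induction
variable helps, here is a minimal C++ sketch. It is not the pass itself, and
`sumNested` and `sumFlattened` are hypothetical names. It assumes a row-major
array and 32-bit loop bounds; with a 64-bit induction variable the trip count
`n * m` cannot wrap, so no overflow check is needed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Nested form: two 32-bit induction variables, one per loop.
std::int64_t sumNested(const std::vector<int>& a, std::uint32_t n,
                       std::uint32_t m) {
    assert(a.size() >= static_cast<std::size_t>(n) * m);
    std::int64_t total = 0;
    for (std::uint32_t i = 0; i < n; ++i)
        for (std::uint32_t j = 0; j < m; ++j)
            total += a[static_cast<std::size_t>(i) * m + j];
    return total;
}

// Flattened form: a single loop over n * m iterations. The induction
// variable is widened to 64 bits, so the product of two 32-bit bounds
// always fits and the flattened trip count cannot overflow.
std::int64_t sumFlattened(const std::vector<int>& a, std::uint32_t n,
                          std::uint32_t m) {
    const std::uint64_t trip = static_cast<std::uint64_t>(n) * m;
    assert(a.size() >= trip);
    std::int64_t total = 0;
    for (std::uint64_t k = 0; k < trip; ++k)
        total += a[k];
    return total;
}
```

Both functions visit the same elements in the same order, so they return the
same sum for any `n` and `m` that fit the array.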
Differential Revision: https://reviews.llvm.org/D90402
Marius Brehler [Tue, 10 Nov 2020 17:14:37 +0000 (18:14 +0100)]
[mlir] Refactor finding python
This drops the use of deprecated CMake modules to find python.
Differential Revision: https://reviews.llvm.org/D91197
Jez Ng [Tue, 27 Oct 2020 02:18:29 +0000 (19:18 -0700)]
[lld-macho] Add very basic support for LTO
Just enough to consume some bitcode files and link them. There's more
to be done around the symbol resolution API and the LTO config, but I don't yet
understand what all the various LTO settings do...
Reviewed By: #lld-macho, compnerd, smeenai, MaskRay
Differential Revision: https://reviews.llvm.org/D90663
Jez Ng [Wed, 14 Oct 2020 19:46:49 +0000 (12:46 -0700)]
[lld-macho][easy] Fix segment max protection
We should have maxprot == initprot for all non-i386 architectures, which
is what ld64 does.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D89420
Jez Ng [Wed, 14 Oct 2020 18:03:34 +0000 (11:03 -0700)]
[lld-macho] Implement LC_UUID
Apple devtools use this to locate the dSYM files for a given
binary.
The UUID is computed based on an MD5 hash of the binary's contents. In order to
hash the contents, we must first write them, but LC_UUID itself must be part of
the written contents in order for all the offsets to be calculated correctly.
We resolve this circular paradox by first writing an LC_UUID with an all-zero
UUID, then updating the UUID with its real value later.
I'm not sure there's a good way to test that the value of the UUID is
"as expected", so I've just checked that it's present.
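The zero-then-patch scheme described above can be sketched in C++ roughly as
follows. This is only an illustration under stated assumptions: the toy
"image" layout, the `Image` struct, `writeImage`, and `patchUuid` are all
hypothetical, and a 64-bit FNV-1a hash run with two seeds stands in for MD5,
which the C++ standard library does not provide:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kUuidSize = 16;

// Stand-in for MD5 (assumption): 64-bit FNV-1a with a caller-chosen seed.
std::uint64_t fnv1a(const std::vector<std::uint8_t>& data, std::uint64_t seed) {
    std::uint64_t h = seed;
    for (std::uint8_t b : data) {
        h ^= b;
        h *= 0x100000001b3ULL;
    }
    return h;
}

// Toy output image: 4 magic bytes, a 16-byte UUID slot, then the payload.
struct Image {
    std::vector<std::uint8_t> bytes;
    std::size_t uuidOffset = 0;
};

// Step 1: write everything, leaving the UUID slot all-zero so that every
// offset after it is already final.
Image writeImage(const std::vector<std::uint8_t>& payload) {
    Image img;
    img.bytes = {0x54, 0x4f, 0x59, 0x21};
    img.uuidOffset = img.bytes.size();
    img.bytes.insert(img.bytes.end(), kUuidSize, 0);
    img.bytes.insert(img.bytes.end(), payload.begin(), payload.end());
    return img;
}

// Step 2: hash the written bytes (zeroed slot included), then overwrite the
// slot in place. The image size and all offsets stay unchanged.
void patchUuid(Image& img) {
    assert(img.uuidOffset + kUuidSize <= img.bytes.size());
    const std::uint64_t lo = fnv1a(img.bytes, 0xcbf29ce484222325ULL);
    const std::uint64_t hi = fnv1a(img.bytes, 0x84222325cbf29ce4ULL);
    std::memcpy(&img.bytes[img.uuidOffset], &lo, sizeof lo);
    std::memcpy(&img.bytes[img.uuidOffset + sizeof lo], &hi, sizeof hi);
}
```

Because both hash values are computed before the slot is overwritten, the
result depends only on the written contents, so identical inputs yield
identical UUIDs.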
Reviewed By: #lld-macho, compnerd, smeenai
Differential Revision: https://reviews.llvm.org/D89418
Jez Ng [Wed, 7 Oct 2020 21:50:42 +0000 (14:50 -0700)]
[lld-macho] Support linking against stub dylibs
Stub dylibs differ from "real" dylibs in that they lack any content in
their sections. What they do have are export tries and symbol tables,
which means we can still link against them. I am unclear how to
properly create these stub dylibs; Xcode 11.3's `lipo` is able to create
stub dylibs, but those lack LC_ID_DYLIB load commands and are considered
invalid by most tooling. Newer versions of `lipo` aren't able to create
stub dylibs at all. However, recent SDKs in Xcode still come with valid
stub dylibs, so it still seems worthwhile to support them. The YAML in
this diff's test was generated by taking a non-stub dylib and editing
the appropriate fields.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D89012
Yang Fan [Tue, 10 Nov 2020 20:09:06 +0000 (15:09 -0500)]
[Sema] Fix volatile check when testing if a return object can be implicitly moved
In the C++11 standard, to be implicitly movable, the expression in a return
statement must name a non-volatile automatic object. CWG1579 changed the rule
to require only that the expression name an automatic object. The C++14 and
C++17 standards kept this rule unchanged. The C++20 standard changed the rule
back to require that the expression name a non-volatile automatic object.
This looks like a typo in the standards, so VD should be required to be
non-volatile.
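A small C++ sketch of the distinction, assuming a hypothetical `Tracker` type
that counts which constructor runs (none of these names come from the patch).
A by-value parameter is used for the movable case because copy elision does
not apply to parameters, so the implicit move is observable:

```cpp
#include <cassert>

// Hypothetical counters recording which constructor was selected.
struct Counts {
    int moves = 0;
    int plainCopies = 0;
    int volatileCopies = 0;
};

inline Counts& counts() {
    static Counts c;
    return c;
}

struct Tracker {
    Tracker() {}
    Tracker(const Tracker&) { ++counts().plainCopies; }
    Tracker(Tracker&&) noexcept { ++counts().moves; }
    // Needed so that a volatile Tracker can be copied at all.
    Tracker(const volatile Tracker&) { ++counts().volatileCopies; }
};

// `t` is a non-volatile automatic object, so `return t;` first tries
// overload resolution as if `t` were an rvalue and picks the move
// constructor.
Tracker returnParameter(Tracker t) { return t; }

// `t` is volatile: an rvalue reference to Tracker cannot bind to it, so the
// return value is built with the const volatile copy constructor.
Tracker returnVolatileLocal() {
    volatile Tracker t;
    return t;
}
```

Calling `returnParameter(Tracker{})` should bump `moves` without any copy,
while `returnVolatileLocal()` should bump `volatileCopies` exactly once.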
Differential Revision: https://reviews.llvm.org/D88295
Mehdi Amini [Tue, 10 Nov 2020 19:56:48 +0000 (19:56 +0000)]
Add Python binding to run a PassManager on a MLIR Module
Reviewed By: ftynse, stellaraccident
Differential Revision: https://reviews.llvm.org/D90823
Sjoerd Meijer [Mon, 9 Nov 2020 15:59:50 +0000 (15:59 +0000)]
[LoopFlatten] Make it a FunctionPass
This converts LoopFlatten from a LoopPass to a FunctionPass so that we don't
run into problems with a loop pass deleting an (inner) loop.
Differential Revision: https://reviews.llvm.org/D90940
Mehdi Amini [Tue, 10 Nov 2020 18:39:12 +0000 (18:39 +0000)]
Add basic Python bindings for the PassManager and bind libTransforms
This only exposes the ability to round-trip a textual pipeline at the
moment.
To exercise it, we also bind libTransforms in a new Python extension. This
does not include any interesting bindings, but it includes all the
machinery needed to add separate native extensions and load them dynamically.
As such, passes in libTransforms are only registered after `import
mlir.transforms`.
To support this global registration, the TableGen backend is also
extended to bind to the C API the group registration for passes.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D90819