Jonas Devlieghere [Wed, 13 Jan 2021 04:13:43 +0000 (20:13 -0800)]
[dsymutil] Fix spurious space in REQUIRES: line
This test is incorrectly running on non-darwin hosts.
Jonas Devlieghere [Wed, 13 Jan 2021 03:58:35 +0000 (19:58 -0800)]
[dsymutil] s/dwarfdump/llvm-dwarfdump/ in test
Jon Chesterfield [Wed, 13 Jan 2021 03:51:10 +0000 (03:51 +0000)]
[libomptarget][nvptx] Include omp_data.cu in bitcode deviceRTL
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D94565
Jonas Devlieghere [Tue, 12 Jan 2021 00:17:51 +0000 (16:17 -0800)]
[dsymutil] Copy eh_frame content into the dSYM companion file.
Copy over the __eh_frame from the binary into the dSYM. This helps
kernel developers that are working with only dSYMs (i.e. no binaries)
when debugging a core file. This only kicks in when the __eh_frame
exists in the linked binary. Most of the time ld64 will remove the
section in favor of compact unwind info. When it is emitted, it's
generally small enough and should not bloat the dSYM.
rdar://69774935
Differential revision: https://reviews.llvm.org/D94460
Serguei Katkov [Mon, 11 Jan 2021 07:55:39 +0000 (14:55 +0700)]
[InlineSpiller] Re-tie operands if folding failed
InlineSpiller::foldMemoryOperand unties registers before an attempt to fold and
does not restore tied-ness in case of failure.
I do not have a specific test demonstrating the invalid behavior.
This is something of a clean-up.
It is better to keep the behavior correct in case it is triggered some time in the future.
Reviewers: reames, dantrushin
Reviewed By: dantrushin, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D94389
Lang Hames [Wed, 13 Jan 2021 03:20:13 +0000 (14:20 +1100)]
[Orc] Add a unit test for asynchronous definition generation.
Jonas Devlieghere [Wed, 13 Jan 2021 02:50:57 +0000 (18:50 -0800)]
[dsymutil] Warn on timestamp mismatch between object file and debug map
Add a warning when the timestamp doesn't match between the object file
and the debug map entry. We were already emitting such warnings for
archive members and swift interface files. This patch also unifies the
warning across all three.
rdar://65614640
Differential revision: https://reviews.llvm.org/D94536
Hsiangkai Wang [Tue, 12 Jan 2021 08:19:37 +0000 (16:19 +0800)]
[NFC] Use generic name for scalable vector stack ID.
Differential Revision: https://reviews.llvm.org/D94471
Hansang Bae [Thu, 7 Jan 2021 15:16:52 +0000 (09:16 -0600)]
[OpenMP] Use persistent memory for omp_large_cap_mem
This change enables volatile use of persistent memory for omp_large_cap_mem*
on supported systems. It depends on libmemkind's support for persistent memory,
and requirements/details can be found at the following url.
https://pmem.io/2020/01/20/memkind-dax-kmem.html
Differential Revision: https://reviews.llvm.org/D94353
Shoaib Meenai [Tue, 12 Jan 2021 18:29:03 +0000 (10:29 -0800)]
[libc++] Give extern templates default visibility on gcc
Contrary to the current visibility macro documentation, it appears that
gcc does handle visibility attribute on extern templates correctly, e.g.
https://godbolt.org/g/EejuV7. We need this so that extern template
instantiations of classes not marked _LIBCPP_TEMPLATE_VIS (e.g.
__vector_base_common) are correctly exported with gcc when building with
hidden visibility.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D35388
Nico Weber [Wed, 13 Jan 2021 02:10:55 +0000 (21:10 -0500)]
[gn build] Reorganize libcxx/include/BUILD.gn a bit
- Merge 6706342f48bea80 -- no more libcxx_needs_site_config, we now
always need it
- Since it was always off in practice, write_config bitrot. Unbitrot
it so that it works
- Remove copy step and let concat step write to final location
immediately -- and fix copy destination directory
As a side effect, libcxx/include/BUILD.gn now has only a single
sources list, which means the cmake sync script should be able to
automatically sync additions and removals of .h files. On the flipside,
this means this file now must be updated after most changes to
libcxx/include/__config_site.in, and looking through the last few months
of changes this looks like it's going to be a wash.
Hansang Bae [Thu, 7 Jan 2021 16:14:21 +0000 (10:14 -0600)]
[OpenMP] Update allocator trait key/value definitions
Use new definitions introduced in 5.1 specification.
Differential Revision: https://reviews.llvm.org/D94277
Reid Kleckner [Thu, 4 Jun 2020 04:22:11 +0000 (21:22 -0700)]
[PDB] Defer relocating .debug$S until commit time and parallelize it
This is a pretty classic optimization. Instead of processing symbol
records and copying them to temporary storage, do a first pass to
measure how large the module symbol stream will be, and then copy the
data into place in the PDB file. This requires deferring relocation until
much later, which accounts for most of the complexity in this patch.
This patch avoids copying the contents of all live .debug$S sections
into heap memory, which is worth about 20% of private memory usage when
making PDBs. However, this is not an unmitigated performance win,
because it can be faster to read dense, temporary, heap data than it is
to iterate symbol records in object file backed memory a second time.
Results on release chrome.dll:
peak mem: 5164.89MB -> 4072.19MB (-1,092.7MB, -21.2%)
wall-j1: 0m30.844s -> 0m32.094s (slightly slower)
wall-j3: 0m20.968s -> 0m20.312s (slightly faster)
wall-j8: 0m19.062s -> 0m17.672s (meaningfully faster)
I gathered similar numbers for a debug, component build of content.dll
in Chrome, and the performance impact of this change was in the noise.
The memory usage reduction was visible and similar.
Because of the new parallelism in the PDB commit phase, more cores makes
the new approach faster. I'm assuming that most C++ developer machines
these days are at least quad core, so I think this is a win.
Differential Revision: https://reviews.llvm.org/D94267
Yuanfang Chen [Wed, 13 Jan 2021 01:42:10 +0000 (17:42 -0800)]
[Coroutine] Update promise object's final layout index
promise is a header field but it is not guaranteed that it would be the third
field of the frame due to `performOptimizedStructLayout`.
Reviewed By: lxfind
Differential Revision: https://reviews.llvm.org/D94137
Luo, Yuanke [Sun, 10 Jan 2021 06:06:18 +0000 (14:06 +0800)]
[X86][AMX] Prohibit pointer cast on load.
The load/store instruction will be transformed to amx intrinsics in the
pass of AMX type lowering. Prohibiting the pointer cast makes that pass
happy.
Differential Revision: https://reviews.llvm.org/D94372
zhanghb97 [Tue, 12 Jan 2021 13:40:27 +0000 (21:40 +0800)]
[mlir][Python] Add a check before creating an AffineMap from a permutation.
An invalid permutation will trigger a C++ assertion when attempting to create an AffineMap from the permutation.
This patch adds an `isPermutation` function to check the given permutation before creating the AffineMap.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94492
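The kind of check described in this commit can be sketched in plain Python. This is only an illustration of the idea (reject anything that is not a rearrangement of 0..n-1 before building the map); the name `is_permutation` and its signature are assumptions and are not taken from the patch.

```python
def is_permutation(values):
    """Return True if `values` is a rearrangement of 0, 1, ..., len(values) - 1."""
    seen = [False] * len(values)
    for v in values:
        # Reject anything that is not a plain integer index.
        if not isinstance(v, int) or isinstance(v, bool):
            return False
        # Reject out-of-range and repeated entries.
        if v < 0 or v >= len(values) or seen[v]:
            return False
        seen[v] = True
    return True
```

A caller would run such a check first and raise a regular Python error for invalid input, so the C++ assertion is never reached.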
Nico Weber [Wed, 13 Jan 2021 01:30:56 +0000 (20:30 -0500)]
[gn build] (manually) port 79f99ba65d96
Jianzhou Zhao [Tue, 12 Jan 2021 21:49:59 +0000 (21:49 +0000)]
[MSan] Tweak CopyOrigin
There could be some misalignments when copying origins that are not aligned.
I believe unaligned memcpy is rare so the cases do not matter too much
in practice.
1) About the change at line 50
Let dst be (void*)5,
then d=5, beg=4
so we need to write 3 (4+4-5) bytes from 5 to 7.
2) About the change around line 77.
Let dst be (void*)5,
because of lines 50-55, the bytes from 5-7 were already written.
So the aligned copy is from 8.
Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D94552
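The address arithmetic in the two numbered cases above can be reproduced with a small Python sketch. It assumes a 4-byte origin granule, as the dst=5, beg=4 example implies; the helper name `split_unaligned_head` is hypothetical and is not a function from MSan.

```python
ORIGIN_GRANULE = 4  # assumption: one origin per 4-byte aligned word


def split_unaligned_head(dst, size):
    """Return (head_len, aligned_start) for a copy of `size` bytes to address `dst`.

    head_len is the number of bytes before the next granule boundary;
    aligned_start is the address where the aligned part of the copy begins.
    """
    beg = dst & ~(ORIGIN_GRANULE - 1)  # round dst down to a granule boundary
    if beg == dst:
        return 0, dst  # already aligned: no head bytes
    head_len = min(beg + ORIGIN_GRANULE - dst, size)
    return head_len, beg + ORIGIN_GRANULE
```

For dst = 5 this gives a 3-byte head (bytes 5 to 7) and an aligned copy starting at 8, matching the description.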
Juneyoung Lee [Wed, 13 Jan 2021 00:33:21 +0000 (09:33 +0900)]
[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND)
This patch resolves the suboptimal codegen described in http://llvm.org/pr47873 .
When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted.
It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag.
The `FREEZE` between `SETCC` and `BRCOND` was causing suboptimal code generation, however.
This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`.
To make this optimization sound, `BRCOND(UNDEF)` simply should nondeterministically jump to the branch or not, rather than raising UB.
It wasn't clear what happens when the condition was undef according to the comments in ISDOpcodes.h, however.
I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction).
Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB.
Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D92015
Juneyoung Lee [Mon, 11 Jan 2021 05:42:08 +0000 (14:42 +0900)]
[LangRef] State that a nocapture pointer cannot be returned
This is a small patch stating that a nocapture pointer cannot be returned.
Discussed in D93189.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D94386
Siva Chandra Reddy [Wed, 13 Jan 2021 00:11:28 +0000 (16:11 -0800)]
[libc][NFC] Use more specific comparison macros in LdExpTest.h.
Michael Jones [Tue, 12 Jan 2021 22:37:56 +0000 (22:37 +0000)]
[libc] add isascii and toascii implementations
adding both at once since these are trivial functions.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D94558
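For readers unfamiliar with these two functions, their conventional behavior is small enough to sketch in Python. This mirrors the usual 7-bit definitions and is an illustration only, not the LLVM libc source.

```python
def isascii(c):
    """True when c fits in 7 bits, i.e. 0 <= c <= 127."""
    return (c & ~0x7F) == 0


def toascii(c):
    """Keep only the low 7 bits of c."""
    return c & 0x7F
```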
Joe Nash [Thu, 7 Jan 2021 18:56:02 +0000 (13:56 -0500)]
[AMDGPU] Add _e64 suffix to VOP3 Insts
Previously, instructions which could be expressed as VOP3 in addition to
another encoding had a _e64 suffix on the tablegen record name, while those
only available as VOP3 did not. With this patch, all VOP3s will have the
_e64 suffix. The assembly does not change, only the mir.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D94341
Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423
David Blaikie [Tue, 12 Jan 2021 23:29:44 +0000 (15:29 -0800)]
Delete unused function (was breaking the -Werror build)
Mircea Trofin [Tue, 12 Jan 2021 22:31:58 +0000 (14:31 -0800)]
[NFC] Disallow unused prefixes under MC/AMDGPU
This patches remaining tests, and patches lit.local.cfg to block future
such cases (until we flip FileCheck's flag)
Differential Revision: https://reviews.llvm.org/D94556
Julian Lettner [Tue, 12 Jan 2021 23:01:18 +0000 (15:01 -0800)]
[Sanitizer][Darwin] Fix test for macOS 11+ point releases
This test wrongly asserted that the minor version is always 0 when
running on macOS 11 and above.
Jessica Paquette [Fri, 8 Jan 2021 23:06:13 +0000 (15:06 -0800)]
[MIPatternMatch] Add matcher for G_PTR_ADD
Add a matcher which recognizes G_PTR_ADD and add a test.
Differential Revision: https://reviews.llvm.org/D94348
Hongtao Yu [Wed, 23 Dec 2020 06:43:22 +0000 (22:43 -0800)]
Add sample-profile-suffix-elision-policy attribute with -funique-internal-linkage-names.
Adding sample-profile-suffix-elision-policy attribute to functions whose linkage names are uniquified so that their unique name suffix won't be trimmed when applying AutoFDO profiles.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94455
Bob Haarman [Tue, 12 Jan 2021 02:08:01 +0000 (02:08 +0000)]
[ELF][NFCI] small cleanup to OutputSections.h
OutputSections.h used to close the lld::elf namespace only to
immediately open it again. This change merges both parts into
one.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D94538
Craig Topper [Tue, 12 Jan 2021 22:37:28 +0000 (14:37 -0800)]
[RISCV] Remove '.mask' from vcompress intrinsic name. NFC
It has a mask argument, but isn't a masked instruction. It doesn't
use the mask policy or the v0.t syntax.
Nathan James [Tue, 12 Jan 2021 22:43:48 +0000 (22:43 +0000)]
[ADT][NFC] Use empty base optimisation in BumpPtrAllocatorImpl
Most uses of this class just use the default MallocAllocator.
As this contains no fields, we can use the empty base optimisation for BumpPtrAllocatorImpl and save 8 bytes of padding for most use cases.
This prevents using a class that is marked as `final` as the `AllocatorT` template argument.
If one must use an allocator that has been marked as `final`, the simplest way around this is a proxy class.
The class should have all the methods that `AllocatorBase` expects and should forward the calls to your own allocator instance.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94439
Mircea Trofin [Tue, 12 Jan 2021 22:06:30 +0000 (14:06 -0800)]
[NFC] Disallow unused prefixes in MC/AMDGPU
1 out of 2 patches.
Differential Revision: https://reviews.llvm.org/D94553
Fangrui Song [Tue, 12 Jan 2021 22:19:55 +0000 (14:19 -0800)]
[Driver] Fix assertion failure when -fprofile-generate -fcs-profile-generate are used together
If conflicting `-fprofile-generate -fcs-profile-generate` are used together,
there is currently an assertion failure. Fix the failure.
Also add some driver tests.
Reviewed By: xur
Differential Revision: https://reviews.llvm.org/D94463
Matt Arsenault [Wed, 6 Jan 2021 19:04:19 +0000 (14:04 -0500)]
AMDGPU: Remove wrapper only call limitation
This seems to only have overridden cold handling, which we probably
shouldn't do. As far as I can tell the wrapper library functions are
still inlined as appropriate.
Shilei Tian [Tue, 12 Jan 2021 22:00:49 +0000 (17:00 -0500)]
[OpenMP] Fixed a typo in openmp/CMakeLists.txt
Martin Storsjö [Thu, 17 Dec 2020 13:40:06 +0000 (15:40 +0200)]
[libcxx] Avoid overflows in the windows __libcpp_steady_clock_now()
As freq.QuadValue can be in the range of 10000000 to 19200000,
the multiplication before division makes the calculation overflow
and wrap to negative values every 16-30 minutes.
Instead count the whole seconds separately before adding the
scaled fractional seconds.
Add a testcase for steady_clock to check that the values returned for
now() compare as bigger than the zero time origin; this
corresponds to a testcase in Qt [1] [2] (that failed spuriously
due to this).
[1] https://bugreports.qt.io/browse/QTBUG-89539
[2] https://code.qt.io/cgit/qt/qtbase.git/tree/tests/auto/corelib/kernel/qdeadlinetimer/tst_qdeadlinetimer.cpp?id=f8de5e54022b8b7471131b7ad55c83b69b2684c0#n569
Differential Revision: https://reviews.llvm.org/D93456
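The overflow described in this commit, and the "whole seconds first" remedy, can be illustrated with a Python sketch. Python integers do not overflow, so `wrap_int64` below simulates two's-complement wrap-around of a 64-bit signed value (in C++ signed overflow is formally undefined; wrapping is assumed here because it matches the symptom described). All function names are hypothetical, and the nanosecond scale factor is an assumption.

```python
NANOS_PER_SEC = 1_000_000_000  # assumption: the clock reports nanoseconds


def wrap_int64(x):
    """Reinterpret an arbitrary Python int as a two's-complement signed 64-bit value."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >= (1 << 63) else x


def product_before_division(counter):
    """The intermediate counter * 1e9 as it would appear in a 64-bit signed integer."""
    return wrap_int64(counter * NANOS_PER_SEC)


def split_nanos(counter, freq):
    """Count whole seconds separately, then add the scaled fractional part."""
    whole = counter // freq
    frac = counter % freq  # always below freq, so frac * 1e9 stays small
    return whole * NANOS_PER_SEC + frac * NANOS_PER_SEC // freq
```

With a 10 MHz counter, 1000 seconds of ticks already push the naive product past the signed 64-bit range, while the split form stays exact.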
Martin Storsjö [Fri, 11 Dec 2020 10:42:07 +0000 (12:42 +0200)]
[AArch64] [Windows] Properly add :lo12: reloc specifiers when generating assembly
This makes sure that assembly output actually can be assembled.
Set the correct MCExpr relocations specifier VK_PAGEOFF - and also
set VK_PAGE consistently even though it's not visible in the assembly
output.
Differential Revision: https://reviews.llvm.org/D94365
Shilei Tian [Tue, 12 Jan 2021 21:48:19 +0000 (16:48 -0500)]
[OpenMP] Fixed the link error that cannot find static data member
Constant static data member can be defined in the class without another
define after the class in C++17. Although it is C++17, Clang can still handle it
even w/o the flag for C++17. Unluckily, GCC cannot handle that.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D94541
modimo [Tue, 12 Jan 2021 21:19:30 +0000 (13:19 -0800)]
[Inliner] Change inline remark format and update ReplayInlineAdvisor to use it
This change modifies the source location formatting from:
LineNumber.Discriminator
to:
LineNumber:ColumnNumber.Discriminator
The motivation here is to enhance location information for inline replay that currently exists for the SampleProfile inliner. This will be leveraged further in inline replay for the CGSCC inliner in the related diff.
The ReplayInlineAdvisor is also modified to read the new format and now takes into account the callee for greater accuracy.
Testing:
ninja check-llvm
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D94333
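As a rough illustration of the new location shape, a reader of `LineNumber:ColumnNumber.Discriminator` strings might look like the following Python sketch. It is hypothetical and not taken from ReplayInlineAdvisor: the name `parse_location`, the assumption that all fields are decimal integers, and the assumption that a missing discriminator means 0 are all mine.

```python
import re

# Assumption: decimal fields, with an optional ".Discriminator" suffix.
_LOCATION = re.compile(r"^(\d+):(\d+)(?:\.(\d+))?$")


def parse_location(text):
    """Split 'Line:Column.Discriminator' into a tuple of three integers."""
    match = _LOCATION.match(text)
    if match is None:
        raise ValueError(f"not a Line:Column.Discriminator location: {text!r}")
    line, column, discriminator = match.groups()
    # Treat an absent discriminator as 0 (an assumption of this sketch).
    return int(line), int(column), int(discriminator) if discriminator else 0
```

A string in the old `LineNumber.Discriminator` shape has no column and is rejected by this sketch.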
Alex Zinenko [Tue, 12 Jan 2021 11:07:12 +0000 (12:07 +0100)]
[mlir] Update LLVM dialect type documentation
Recent commits reconfigured LLVM dialect types to use built-in types whenever
possible. Update the documentation accordingly.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94485
Nikita Popov [Tue, 12 Jan 2021 21:35:19 +0000 (22:35 +0100)]
[InstCombine] Handle logical and/or in assume optimization
assume(a && b) can be converted to assume(a); assume(b) even if
the condition is logical. Same for assume(!(a || b)).
Michael Munday [Tue, 12 Jan 2021 21:22:34 +0000 (21:22 +0000)]
[RISCV] Legalize select when Zbt extension available
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.
Reviewed By: lenary, craig.topper
Differential Revision: https://reviews.llvm.org/D93767
Nikita Popov [Tue, 12 Jan 2021 21:15:54 +0000 (22:15 +0100)]
[InstCombine] Add tests for logical and/or poison implication (NFC)
These tests cover some cases where we can fold select to and/or
based on poison implication logic.
Craig Topper [Tue, 12 Jan 2021 21:08:58 +0000 (13:08 -0800)]
[RISCV] Add double test cases to vfmerge-rv32.ll. NFC
Sanjay Patel [Tue, 12 Jan 2021 20:07:01 +0000 (15:07 -0500)]
[SLP] reduce code duplication while processing reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 19:55:09 +0000 (14:55 -0500)]
[SLP] rename variable to improve readability; NFC
The OperationData in the 2nd block (visiting the operands)
is completely independent of the 1st block.
Sanjay Patel [Tue, 12 Jan 2021 18:53:18 +0000 (13:53 -0500)]
[SLP] reduce code duplication in processing reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 18:45:32 +0000 (13:45 -0500)]
[SLP] reduce code duplication while matching reductions; NFC
Philip Reames [Tue, 12 Jan 2021 20:54:07 +0000 (12:54 -0800)]
[LV] Weaken spuriously strong assert in LoopVersioning
LoopVectorize uses some utilities on LoopVersioning, but doesn't actually use it for, you know, versioning. As a result, the precondition LoopVersioning expects is too strong for this user. At the moment, LoopVectorize supports any loop with a unique exit block, so check the same precondition here.
Really, the whole class structure here is a mess. We should separate the actual versioning from the metadata updates, but that's a bigger problem.
Nikita Popov [Tue, 12 Jan 2021 19:54:23 +0000 (20:54 +0100)]
[InstCombine] Duplicate tests for logical and/or (NFC)
This replicates existing and/or tests to also test variants using
select. This should help us get a more accurate view on which
optimizations we're missing if we disable the select -> and/or
fold.
Sunil Srivastava [Tue, 12 Jan 2021 20:37:18 +0000 (12:37 -0800)]
Fix for crash in __builtin_return_address in template context.
The check for argument value needs to be guarded by !isValueDependent().
Differential Revision: https://reviews.llvm.org/D94438
Philip Reames [Tue, 12 Jan 2021 20:32:24 +0000 (12:32 -0800)]
[LV] Relax assumption that LCSSA implies single entry
This relates to the ongoing effort to support vectorization of multiple exit loops (see D93317).
The previous code assumed that LCSSA phis were always single entry before the vectorizer ran. This was correct, but only because the vectorizer allowed only a single exiting edge. There's nothing in the definition of LCSSA which requires single entry phis.
A common case where this comes up is with a loop with multiple exiting blocks which all reach a common exit block. (e.g. see the test updates)
Differential Revision: https://reviews.llvm.org/D93725
Nikita Popov [Tue, 12 Jan 2021 20:21:22 +0000 (21:21 +0100)]
[InstCombine] Regenerate test checks (NFC)
Yitzhak Mandelbaum [Mon, 11 Jan 2021 22:28:17 +0000 (22:28 +0000)]
[clang-tidy] Add test for Transformer-based checks with diagnostics.
Adds a test that checks the diagnostic output of the tidy.
Differential Revision: https://reviews.llvm.org/D94453
Zequan Wu [Tue, 12 Jan 2021 19:22:31 +0000 (11:22 -0800)]
[IR] move nomerge attribute from function declaration/definition to callsites
Move nomerge attribute from function declaration/definition to callsites to
allow virtual function calls to attach the attribute.
Differential Revision: https://reviews.llvm.org/D94537
Florian Hahn [Tue, 12 Jan 2021 19:55:17 +0000 (19:55 +0000)]
[FunctionAttrs] Derive willreturn for fns with `readonly` & `mustprogress`.
Similar to D94125, derive `willreturn` for functions that are `readonly` and
`mustprogress` in FunctionAttrs.
To quote the reasoning from D94125:
Since D86233 we have `mustprogress` which, in combination with
`readonly`, implies `willreturn`. The idea is that every side-effect
has to be modeled as a "write". Consequently, `readonly` means there
is no side-effect, and `mustprogress` guarantees that we cannot "loop"
forever without side-effect.
Reviewed By: jdoerfert, nikic
Differential Revision: https://reviews.llvm.org/D94502
David Truby [Thu, 3 Dec 2020 11:25:57 +0000 (11:25 +0000)]
[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate
MSVC on WoA64 includes isCXX14Aggregate in its definition. This is the de facto
specification on that platform, so match MSVC's behaviour.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611
Co-authored-by: Peter Waller <peter.waller@arm.com>
Differential Revision: https://reviews.llvm.org/D92751
Jon Chesterfield [Tue, 12 Jan 2021 19:40:02 +0000 (19:40 +0000)]
[libomptarget][amdgpu][nfc] Fix build on centos
rtl.cpp replaced 224 with a #define from elf.h, but that
doesn't work on a centos 7 build machine with an old elf.h
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D94528
Shilei Tian [Tue, 12 Jan 2021 19:32:27 +0000 (14:32 -0500)]
[OpenMP] Fixed include directories for OpenMP when building OpenMP with LLVM_ENABLE_RUNTIMES
Some LLVM headers are generated by CMake. Before the installation,
LLVM's headers are distributed everywhere, some of which are in
`${LLVM_SRC_ROOT}/llvm/include/llvm`, and some are in
`${LLVM_BINARY_ROOT}/include/llvm`. After installation, they're all in
`${LLVM_INSTALLATION_ROOT}/include/llvm`.
OpenMP now depends on LLVM headers. Some headers depend on headers generated
by CMake. When building OpenMP along with LLVM, a.k.a via `LLVM_ENABLE_RUNTIMES`,
we need to tell OpenMP where it can find those headers, especially those that
have not yet been copied/installed.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D94534
Nikita Popov [Thu, 24 Dec 2020 16:04:40 +0000 (17:04 +0100)]
[InstSimplify] Don't fold gep p, -p to null
This is a partial fix for https://bugs.llvm.org/show_bug.cgi?id=44403.
Folding gep p, q-p to q is only legal if p and q have the same
provenance. This fold should probably be guarded by something like
getUnderlyingObject(p) == getUnderlyingObject(q).
This patch is a partial fix that removes the special handling for
gep p, 0-p, which will fold to a null pointer, which would certainly
not pass an underlying object check (unless p is also null, in which
case this would fold trivially anyway). Folding to a null pointer
is particularly problematic due to the special handling it receives
in many places, making end-to-end miscompiles more likely.
Differential Revision: https://reviews.llvm.org/D93820
Brad Smith [Tue, 12 Jan 2021 19:16:15 +0000 (14:16 -0500)]
[libcxx] Port to OpenBSD
Add initial OpenBSD support.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D94205
Arthur O'Dwyer [Fri, 18 Dec 2020 20:11:51 +0000 (15:11 -0500)]
[libc++] Add a missing `<_Compare>` template argument.
Sometimes `_Compare` is an lvalue reference type, so letting it be
deduced is pretty much always wrong. (Well, less efficient than
it could be, anyway.)
Differential Revision: https://reviews.llvm.org/D93562
Florian Hahn [Mon, 11 Jan 2021 16:33:22 +0000 (16:33 +0000)]
[FunctionAttrs] Precommit tests for willreturn inference.
Tests for D94502.
Craig Topper [Tue, 12 Jan 2021 18:52:53 +0000 (10:52 -0800)]
[RISCV] Use vmerge.vim for llvm.riscv.vfmerge with a 0.0 scalar operand.
We can use a 0 immediate to avoid needing to materialize 0 into
an FPR first.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94459
Arthur Eubanks [Mon, 11 Jan 2021 21:50:52 +0000 (13:50 -0800)]
[NewPM] Run non-trivial loop unswitching under -O2/3/s/z
Fixes https://bugs.llvm.org/show_bug.cgi?id=48715.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D94448
Nathan Ridge [Mon, 11 Jan 2021 01:41:50 +0000 (20:41 -0500)]
[clangd] Avoid recursion in TargetFinder::add()
Fixes https://github.com/clangd/clangd/issues/633
Differential Revision: https://reviews.llvm.org/D94382
Craig Topper [Tue, 12 Jan 2021 17:52:00 +0000 (09:52 -0800)]
[LegalizeDAG][RISCV][PowerPC][AMDGPU][WebAssembly] Improve expansion of SETONE/SETUEQ on targets without SETO/SETUO.
If SETO/SETUO aren't legal, they'll be expanded and we'll end up
with 3 comparisons.
SETONE is equivalent to (SETOGT || SETOLT)
so if one of those operations is supported use that expansion. We
don't need both since we can commute the operands to make the other.
SETUEQ can be implemented with !(SETOGT || SETOLT) or (SETULE && SETUGE).
I've only implemented the first because it didn't look like most of the
affected targets had legal SETULE/SETUGE.
Reviewed By: frasercrmck, tlively, nemanjai
Differential Revision: https://reviews.llvm.org/D94450
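The equivalences used in this commit can be checked on ordinary floating-point values. The Python sketch below relies on the fact that Python's `<` and `>` on floats are false whenever an operand is NaN, so they behave like the ordered comparisons SETOLT and SETOGT. The function names are illustrative only and do not come from LegalizeDAG.

```python
import math


def unordered(a, b):
    """True when at least one operand is NaN."""
    return math.isnan(a) or math.isnan(b)


def setone(a, b):
    """Ordered and not equal, expanded as SETOGT || SETOLT."""
    return (a > b) or (a < b)


def setueq(a, b):
    """Unordered or equal, expanded as !(SETOGT || SETOLT)."""
    return not ((a > b) or (a < b))


def setueq_alt(a, b):
    """The alternative expansion: SETULE && SETUGE."""
    ule = not (a > b)  # unordered or less-than-or-equal
    uge = not (a < b)  # unordered or greater-than-or-equal
    return ule and uge
```

Comparing these against the direct definitions (`not unordered(a, b) and a != b` for SETONE, `unordered(a, b) or a == b` for SETUEQ) over a few values including NaN and the infinities is a quick sanity check, not a proof.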
sameeran joshi [Thu, 17 Dec 2020 08:58:03 +0000 (14:28 +0530)]
[Flang][openmp][openacc] Extend CheckNoBranching to handle branching provided by LabelEnforce.
`CheckNoBranching` is currently handling only illegal branching out for constructs
with `Parser::Name` in them.
Extend the same for handling illegal branching out caused by `Parser::Label` based statements.
This patch could possibly solve one of the issues(typically branching out) mentioned in D92735.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D93447
Dávid Bolvanský [Tue, 12 Jan 2021 18:28:01 +0000 (19:28 +0100)]
[InstCombine] Add (A ^ B) | ~(A | B) -> ~(A & B)
define i32 @src(i32 %x, i32 %y) {
%0:
%xor = xor i32 %y, %x
%or = or i32 %y, %x
%neg = xor i32 %or, 4294967295
%or1 = or i32 %xor, %neg
ret i32 %or1
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
%and = and i32 %x, %y
%neg = xor i32 %and, 4294967295
ret i32 %neg
}
Transformation seems to be correct!
https://alive2.llvm.org/ce/z/Cvca4a
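The identity behind this fold is purely bitwise, so it can also be sanity-checked by brute force on a narrow width. The Python sketch below does that exhaustively for small bit widths; `check_fold` is a hypothetical helper, and the Alive2 link above remains the actual proof for the IR.

```python
def check_fold(bits=8):
    """Compare (A ^ B) | ~(A | B) with ~(A & B) for every pair of `bits`-wide values."""
    mask = (1 << bits) - 1
    for a in range(mask + 1):
        for b in range(mask + 1):
            src = ((a ^ b) | (~(a | b) & mask)) & mask
            tgt = ~(a & b) & mask
            if src != tgt:
                return False
    return True
```

Intuitively, `A ^ B` covers the bit positions where the operands differ and `~(A | B)` covers the positions where both are 0; together that is every position except those where both are 1, which is `~(A & B)`.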
Dávid Bolvanský [Tue, 12 Jan 2021 17:56:49 +0000 (18:56 +0100)]
[Tests] Add tests for new InstCombine OR transformation, NFC
Michał Górny [Tue, 12 Jan 2021 17:16:57 +0000 (18:16 +0100)]
[llvm] [cmake] Remove obsolete /usr/local hack for *BSD
Remove the hack adding /usr/local paths on FreeBSD and DragonFlyBSD.
It does not seem to be necessary today, and it breaks cross builds.
Differential Revision: https://reviews.llvm.org/D94491
Timm Bäder [Tue, 12 Jan 2021 18:18:13 +0000 (13:18 -0500)]
Return false from __has_declspec_attribute() if not explicitly enabled
Currently, projects can check for __has_declspec_attribute() and use
it accordingly, but the check for __has_declspec_attribute will return
true even if declspec attributes are not enabled for the target.
This changes Clang to instead return false when declspec attributes are
not supported for the target.
Emil Engler [Thu, 7 Jan 2021 02:28:54 +0000 (18:28 -0800)]
[doc] Place sha256 in lld/README.md into backticks
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D93984
Timm Bäder [Tue, 12 Jan 2021 18:15:21 +0000 (13:15 -0500)]
Add -ansi option to CompileOnly group
-ansi is documented as being the "same as -std=c89", but there are
differences when passing it to a link.
Adding -ansi to said group makes sense since it's supposed to be an
alias for -std=c89 and resolves this inconsistency.
Cullen Rhodes [Tue, 12 Jan 2021 17:48:52 +0000 (17:48 +0000)]
[SVE][NFC] Regenerate a few CodeGen tests
Regenerated using llvm/utils/update_llc_test_checks.py as part of
D94504, committing separately to reduce the diff for D94504.
Simon Pilgrim [Tue, 12 Jan 2021 18:01:41 +0000 (18:01 +0000)]
[AMDGPU] Regenerate umax crash test
Akira Hatanaka [Tue, 12 Jan 2021 17:56:06 +0000 (09:56 -0800)]
Fix typo in diagnostic message
rdar://
66684531
Simon Pilgrim [Tue, 12 Jan 2021 17:24:34 +0000 (17:24 +0000)]
[X86] Regenerate sdiv_fix_sat.ll + udiv_fix_sat.ll tests
Adding missing libcall PLT qualifiers
Rahul Joshi [Thu, 7 Jan 2021 00:32:59 +0000 (16:32 -0800)]
[MLIR] Disallow `sym_visibility`, `sym_name` and `type` attributes in the parsed attribute dictionary.
Differential Revision: https://reviews.llvm.org/D94200
Jinsong Ji [Tue, 12 Jan 2021 15:56:58 +0000 (15:56 +0000)]
[PowerPC][NFCI] PassSubtarget to ASMWriter
Subtarget feature bits are needed to change instprinter's behavior based
on feature bits.
Most of the other popular targets were updated back in 2015,
in https://reviews.llvm.org/rGb46d0234a6969
we should update it too.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D94449
Lei Zhang [Tue, 12 Jan 2021 16:11:45 +0000 (11:11 -0500)]
[mlir][spirv] NFC: split deserialization into multiple source files
This avoids large source files and gives a better structure. It also
allows leveraging compilation parallelism.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D94360
Marek Kurdej [Tue, 12 Jan 2021 16:06:58 +0000 (17:06 +0100)]
[libc++] [C++2b] [P1048] Add is_scoped_enum and is_scoped_enum_v.
* https://wg21.link/p1048
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D94409
Vladislav Vinogradov [Tue, 12 Jan 2021 16:06:06 +0000 (17:06 +0100)]
[mlir] Fix for LIT tests
Add `MLIR_SPIRV_CPU_RUNNER_ENABLED` to `llvm_canonicalize_cmake_booleans`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94407
Vladislav Vinogradov [Tue, 12 Jan 2021 16:02:56 +0000 (17:02 +0100)]
[mlir][CAPI] Fix inline function declaration
Add the `static` keyword; otherwise the build fails with a linker error in some cases.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D94496
Lei Zhang [Mon, 11 Jan 2021 13:50:00 +0000 (08:50 -0500)]
Reland "[mlir][linalg] Support parsing attributes in named op spec"
With this, now we can specify a list of attributes on named ops
generated from the spec. The format is defined as
```
attr-id ::= bare-id (`?`)?
attr-typedef ::= type (`[` `]`)?
attr-def ::= attr-id `:` attr-typedef
tc-attr-def ::= `attr` `(` attr-def-list `)`
tc-def ::= `def` bare-id
`(`tensor-def-list`)` `->` `(` tensor-def-list`)`
(tc-attr-def)?
```
For example,
```
ods_def<SomeCppOp>
def some_op(...) -> (...)
attr(
f32_attr: f32,
i32_attr: i32,
array_attr : f32[],
optional_attr? : f32
)
```
where `?` means optional attribute and `[]` means array type.
Reviewed By: hanchung, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94240
Nemanja Ivanovic [Tue, 12 Jan 2021 15:46:11 +0000 (09:46 -0600)]
[PowerPC] Add support for embedded devices with EFPU2
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.
This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.
[1] Core reference:
https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf
Differential revision: https://reviews.llvm.org/D92935
Bjorn Pettersson [Tue, 12 Jan 2021 15:28:16 +0000 (16:28 +0100)]
[SLP] Add test case showing a bug when dealing with padded types
We shouldn't vectorize stores of non-packed types (i.e. types that
have padding between consecutive variables in a scalar layout,
but are packed in a vector layout).
The problem was detected as a miscompile in a downstream test case.
This is a pre-commit of a test case for the fix in D94446.
Lei Zhang [Mon, 11 Jan 2021 14:58:31 +0000 (09:58 -0500)]
[mlir][spirv] NFC: place ops in the proper file for their categories
This commit moves dangling ops in the main ops.td file to the proper
file matching their categories. As a result, ops.td purely includes
all category files.
Differential Revision: https://reviews.llvm.org/D94413
Kazushi (Jam) Marukawa [Tue, 12 Jan 2021 12:36:55 +0000 (21:36 +0900)]
[VE] Update VELIntrinsic tests
Update comment and style of regression tests for VELIntrinsic
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D94490
Bevin Hansson [Tue, 12 Jan 2021 14:40:36 +0000 (15:40 +0100)]
[X86] Improved lowering for saturating float to int.
Adapted from D54696 by @nikic.
This patch improves lowering of saturating float to
int conversions, FP_TO_[SU]INT_SAT, for X86.
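As a scalar reference for what a saturating conversion computes, here is a minimal sketch for 32-bit results. It models only the semantics (NaN maps to 0, out-of-range inputs clamp to the integer limits), not the X86 lowering in this patch. The function names are hypothetical.
```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of a saturating float -> signed 32-bit conversion.
int32_t fpToSIntSat32(float x) {
  if (std::isnan(x))
    return 0;
  // 2^31 is exactly representable as a float, while INT32_MAX is not,
  // so compare against the power of two.
  if (x >= 2147483648.0f)
    return std::numeric_limits<int32_t>::max();
  if (x <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(x); // in range here; truncates toward zero
}

// Scalar model of a saturating float -> unsigned 32-bit conversion.
uint32_t fpToUIntSat32(float x) {
  if (std::isnan(x) || x <= 0.0f)
    return 0;
  if (x >= 4294967296.0f) // 2^32
    return std::numeric_limits<uint32_t>::max();
  return static_cast<uint32_t>(x);
}
```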
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D86079
Valentin Clement [Tue, 12 Jan 2021 14:42:25 +0000 (09:42 -0500)]
[mlir][openacc] Use TableGen information for default enum
Use TableGen and information in ACC.td for the Default enum in the OpenACC dialect.
This patch generalizes what was done for OpenMP for directives.
Follow up patch after D93576
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D93710
Paul C. Anagnostopoulos [Mon, 11 Jan 2021 14:46:27 +0000 (09:46 -0500)]
[TableGen] Improve error message for semicolon after braced body.
Add a test for this message.
Differential Revision: https://reviews.llvm.org/D94412
Nicolas Vasilache [Tue, 12 Jan 2021 14:01:59 +0000 (14:01 +0000)]
[mlir][Linalg] NFC - Refactor fusion APIs
This revision unifies the fusion APIs to allow passing OpOperand and OpResult, and adds a finer level of control over fusion.
Differential Revision: https://reviews.llvm.org/D94493
Simon Pilgrim [Tue, 12 Jan 2021 14:07:53 +0000 (14:07 +0000)]
[X86][SSE] getFauxShuffleMask - handle PACKSS(SRAI(),SRAI()) shuffle patterns.
We can't easily treat ASHR as a faux shuffle, but if it was just feeding a PACKSS then it was likely being used as sign-extension for a truncation, so just peek through and adjust the mask accordingly.
Simon Pilgrim [Tue, 12 Jan 2021 13:43:56 +0000 (13:43 +0000)]
[X86][SSE] combineSubToSubus - add v16i32 handling on pre-AVX512BW targets.
v16i32 -> v16i16/v8i16 truncation is now good enough using PACKSS/PACKUS + shuffle combining that it's no longer necessary to early-out on pre-AVX512BW targets.
This was noticed while looking at completing PR40111 and moving combineSubToSubus to DAGCombine entirely.
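For context, a scalar sketch of the identity behind forming an unsigned saturating subtract. It assumes the combine matches `umax(a, b) - b` and `a - umin(a, b)`; the function names are hypothetical and nothing here is taken from the patch itself.
```cpp
#include <algorithm>
#include <cstdint>

// Unsigned saturating subtraction: clamps at 0 instead of wrapping.
uint16_t usubSat(uint16_t a, uint16_t b) {
  if (a > b)
    return static_cast<uint16_t>(a - b);
  return 0;
}

// umax(a, b) - b: equals a - b when a > b, and b - b == 0 otherwise.
uint16_t subOfUMax(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(std::max(a, b) - b);
}

// a - umin(a, b): equals a - b when b < a, and a - a == 0 otherwise.
uint16_t subOfUMin(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(a - std::min(a, b));
}
```
All three functions return the same value for every pair of inputs, which is why either pattern can be rewritten as a single saturating subtract.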
Bevin Hansson [Mon, 11 Jan 2021 21:46:42 +0000 (22:46 +0100)]
[Fixed Point] Add codegen for conversion between fixed-point and floating point.
The patch adds the required methods to FixedPointBuilder
for converting between fixed-point and floating point,
and uses them from Clang.
This depends on D54749.
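A minimal scalar sketch of the arithmetic involved, assuming a signed 32-bit fixed-point value with `scale` fractional bits. The helper names are hypothetical and this is not the FixedPointBuilder API. Rounding toward zero and the absence of saturation are assumptions of the sketch, so callers must keep values in range.
```cpp
#include <cmath>
#include <cstdint>

// Float -> fixed: multiply by 2^scale, then truncate to an integer.
// Assumes the scaled value fits in int32_t (no saturation is modelled).
int32_t floatToFixed(float x, unsigned scale) {
  return static_cast<int32_t>(std::ldexp(x, static_cast<int>(scale)));
}

// Fixed -> float: convert the raw integer, then divide by 2^scale.
float fixedToFloat(int32_t raw, unsigned scale) {
  return std::ldexp(static_cast<float>(raw), -static_cast<int>(scale));
}
```
With 15 fractional bits, 1.5 is stored as the raw integer 1.5 * 2^15 = 49152, and converting that raw value back gives 1.5 exactly.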
Reviewed By: leonardchan
Differential Revision: https://reviews.llvm.org/D86632
Simon Pilgrim [Tue, 12 Jan 2021 11:50:09 +0000 (11:50 +0000)]
[X86][SSE] combineSubToSubus - remove SSE2 early-out.
SSE2 truncation codegen has improved over the past few years (mainly due to better shuffle lowering/combining and computeKnownBits) - it's no longer necessary to early-out from v8i32/v8i64 truncations.
This was noticed while looking at completing PR40111 and moving combineSubToSubus to DAGCombine entirely.
Fraser Cormack [Fri, 8 Jan 2021 17:14:08 +0000 (17:14 +0000)]
[RISCV] Improve scalable-vector shift tests (NFC)
All i8/i16 and several i32 tests were testing immediate shift amounts
which exceeded the bits in the vector elements, creating poison values.
Amend the tests to test well-behaved shift amounts.
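The rule being respected can be stated as a one-line check: a shift is well-behaved only when the amount is strictly less than the element width in bits. The helper name below is hypothetical.
```cpp
// True when shifting an element of `elementBits` bits by `amount` is well
// defined; an amount equal to or larger than the width yields poison.
constexpr bool isWellDefinedShiftAmount(unsigned elementBits, unsigned amount) {
  return amount < elementBits;
}

static_assert(isWellDefinedShiftAmount(8, 7), "i8 shifted by 7 is fine");
static_assert(!isWellDefinedShiftAmount(8, 8), "i8 shifted by 8 is poison");
static_assert(!isWellDefinedShiftAmount(16, 17), "i16 shifted by 17 is poison");
```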
Christian Sigg [Thu, 7 Jan 2021 08:41:36 +0000 (09:41 +0100)]
Change the LLVM_ATTRIBUTE_DEPRECATED macro to use C++14 attribute.
C++14 attributes are superior because they can be applied to functions with inline definitions and the syntax is cleaner.
I intend to convert all uses and then remove the macro.
One issue that might hold back switching uses to C++14 attributes is that
clang-format does not put long attributes on separate lines and formatted code will look like:
```
template <typename T>
[[deprecated("blah blah")]] void
foooooooooooooooooooooooooooo() {
...
}
```
Putting long attributes on a separate line would be prettier.
See https://stackoverflow.com/questions/45740466/clang-format-setting-to-control-c-attributes
AttributeMacros probably won't help because it can't match the custom message.
https://clang.llvm.org/docs/ClangFormatStyleOptions.html
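To illustrate the point about inline definitions, a minimal sketch of the standard attribute placed directly on a function that is defined inline. `newApi` and `oldApi` are hypothetical names, not functions from LLVM.
```cpp
// Replacement function that callers should migrate to.
inline int newApi(int x) { return x * 2; }

// The C++14 attribute sits on the inline definition itself; callers of
// oldApi() get a compiler warning that carries the custom message.
[[deprecated("use newApi() instead")]]
inline int oldApi(int x) { return newApi(x); }
```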
Reviewed By: rriddle, MaskRay
Differential Revision: https://reviews.llvm.org/D94219
Nico Weber [Tue, 12 Jan 2021 11:30:32 +0000 (06:30 -0500)]
Revert "[Test] Add failing test for PR48725"
This reverts commit e8287cb2b2923af9da72fd953e2ec5495c33861a.
Test unexpectedly passes on mac, see comment 2 on PR48725.