Lang Hames [Sun, 10 Oct 2021 19:56:37 +0000 (12:56 -0700)]
[ORC] Reorder callWrapperAsync and callSPSWrapperAsync parameters.
The callee address is now the first parameter and the 'SendResult' function
the second. This change improves consistency with the non-async functions
where the callee is the first address and the return value the second.
Lang Hames [Fri, 8 Oct 2021 20:59:48 +0000 (13:59 -0700)]
Revert "Add missing include after dfd74db9"
This reverts commit dd384d2814094bf5d3ab44f917f759fa24a41158.
dfd74db9 was reverted in 8fe3d9df0ed, so this is no longer needed.
Dawid Jurczak [Sun, 10 Oct 2021 17:52:33 +0000 (19:52 +0200)]
[DSE] Re-enable calloc transformation with extra care (PR25892)
Transformation from malloc+memset to calloc is always correct and in many situations
it brings significant observable benefits in terms of execution speed and memory consumption [1][2].
Unfortunately there are cases where producing calloc causes performance drops [3].
As discussed in https://reviews.llvm.org/D103009, it's possible to differentiate between those two scenarios.
If the optimizer is able to prove that after the malloc call it's _very_ likely to reach the memset branch, then
we shouldn't observe any performance hit after emitting calloc. Therefore finding a "null pointer check" pattern
before the memset basic block sounds like good justification for performing the transformation.
That method was also already suggested by GCC folks [4]. The main reason for the change is that, for now,
to be safe we check for a post-dominance relation, which is far too conservative an approach and leaves the transformation
"almost" disabled in practice. This patch aims to enable the transformation again, but with extra care.
[1] https://stackoverflow.com/questions/2688466/why-mallocmemset-is-slower-than-calloc
[2] https://vorpus.org/blog/why-does-calloc-exist/
[3] http://smalldatum.blogspot.com/2017/11/a-new-optimization-in-gcc-5x-and-mysql.html
[4] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83022
Differential Revision: https://reviews.llvm.org/D110021
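As a rough illustration of the source-level shape this commit describes, here is a minimal C++ sketch: an allocation that is null-checked before a memset, next to the calloc form it is equivalent to. The function names are hypothetical and do not come from the patch. The sketch shows only the pattern, not how DSE matches it in IR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical example of the pattern: the malloc result is null-checked,
// and the memset sits on the non-null path.
unsigned char *alloc_zeroed_malloc_memset(std::size_t n) {
  unsigned char *p = static_cast<unsigned char *>(std::malloc(n));
  if (p == nullptr) // the "null pointer check" before the memset block
    return nullptr;
  std::memset(p, 0, n);
  return p;
}

// The form the code above is equivalent to after a malloc+memset -> calloc
// rewrite.
unsigned char *alloc_zeroed_calloc(std::size_t n) {
  return static_cast<unsigned char *>(std::calloc(1, n));
}

// Helper for illustration: true if the first n bytes of p are all zero.
bool all_zero(const unsigned char *p, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    if (p[i] != 0)
      return false;
  return true;
}
```

Both functions return either a null pointer or a buffer of n zero bytes, which is why the rewrite preserves behavior. Whether it is profitable is the question the commit message discusses.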
Sylvestre Ledru [Sun, 10 Oct 2021 19:28:24 +0000 (21:28 +0200)]
clang release notes: document the -Wbool-operation improvement
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D111215
Nico Weber [Sun, 10 Oct 2021 19:14:46 +0000 (15:14 -0400)]
clang: Add range-based CFG::try_blocks()
..and use it. No behavior change.
Nico Weber [Sun, 10 Oct 2021 18:32:51 +0000 (14:32 -0400)]
clang: Convert two loops to for-each
And rewrap a line at 80 columns while here. No behavior change.
Joe Loser [Sun, 10 Oct 2021 18:46:35 +0000 (14:46 -0400)]
[libc++][test] Replace a TEST_NOEXCEPT_FALSE with noexcept(false). NFC.
Replace `TEST_NOEXCEPT_FALSE` directly with `noexcept(false)` in
optional hash test which is only run in C++17 or later.
`TEST_NOEXCEPT_FALSE` is only useful in C++03 context where `noexcept`
isn't supported by clang. `TEST_NOEXCEPT_FALSE` now only has one remaining use
in `hash_unique_ptr.pass.cpp`.
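For readers unfamiliar with the spelling, a small sketch of what `noexcept(false)` expresses when written directly. The functor names are hypothetical stand-ins, not the types used in the libc++ test.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical hash functor whose call operator is declared potentially
// throwing, spelled directly rather than through a macro.
struct MayThrowHash {
  std::size_t operator()(int) const noexcept(false) { return 0; }
};

// Hypothetical counterpart declared non-throwing, for contrast.
struct NoThrowHash {
  std::size_t operator()(int) const noexcept { return 0; }
};

// The noexcept operator reports whether a call expression is declared
// non-throwing; it does not evaluate the call.
constexpr bool may_throw_call_is_noexcept =
    noexcept(std::declval<const MayThrowHash &>()(0));
constexpr bool no_throw_call_is_noexcept =
    noexcept(std::declval<const NoThrowHash &>()(0));
```

The first constant is false and the second is true, which is the distinction a test of this kind would rely on.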
Joe Loser [Sun, 10 Oct 2021 18:35:00 +0000 (14:35 -0400)]
[libc++] Remove empty namespace std in type_traits. NFCI.
There is an empty `namespace std` in `type_traits` which was originally
used when `std::byte` was added in c97d8aa86650ed795bf75a7dd735ecfaef3b8f55.
At some point, the bitwise operators on `std::byte` got relocated but this
empty namespace was left around.
Remove it.
Reviewed By: Quuxplusone, Mordante, #libc
Differential Revision: https://reviews.llvm.org/D111512
Jean Perier [Sun, 10 Oct 2021 18:18:45 +0000 (20:18 +0200)]
[fir] Add character conversion pass
Upstream the character conversion pass.
Translates entities of one CHARACTER KIND to another.
By default the translation is to naively zero-extend or truncate a code
point to fit the destination size.
This patch is part of the upstreaming effort from fir-dev branch.
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D111405
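The default translation described in this commit (zero-extend or truncate each code point to the destination size) can be sketched in plain C++ as follows. This is an illustration under assumptions, not the FIR pass: `convert_kind` is a hypothetical helper, and unsigned integer types stand in for the CHARACTER kinds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts each code unit from one unsigned width to
// another. For unsigned types, static_cast zero-extends when widening and
// keeps only the low bits when narrowing.
template <typename To, typename From>
std::vector<To> convert_kind(const std::vector<From> &src) {
  std::vector<To> dst;
  dst.reserve(src.size());
  for (From c : src)
    dst.push_back(static_cast<To>(c));
  return dst;
}
```

Widening is lossless. Narrowing is lossy whenever a code point does not fit: 0x20AC becomes 0xAC in an 8-bit destination, which is why the commit message calls the default naive.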
Joe Loser [Sun, 10 Oct 2021 16:53:35 +0000 (12:53 -0400)]
[libc++][NFC] Replace tab with whitespace in comment
There is a stray tab character in a comment block. Replace the tab
character with a space for consistency with other comments.
Kazu Hirata [Sun, 10 Oct 2021 15:52:14 +0000 (08:52 -0700)]
[Basic] Use llvm::is_contained (NFC)
Sanjay Patel [Sun, 10 Oct 2021 15:26:03 +0000 (11:26 -0400)]
[InstCombine] move fold for "(X-Y) == 0"; NFC
This consolidates related folds that all have a
similar use restriction that may not be necessary.
Sanjay Patel [Sun, 10 Oct 2021 15:13:46 +0000 (11:13 -0400)]
[InstCombine] add tests for (X - Y) == 0; NFC
Sanjay Patel [Sun, 10 Oct 2021 14:58:58 +0000 (10:58 -0400)]
[InstCombine] canonicalize "(C2 - Y) > C" as (Y + ~C2) < ~C
The test diffs show that we have better analysis/folds for 'add'
(although we should at least have the simplifications
independently, so we don't have the one-use restriction).
This is related to solving regressions that would appear in
transforms related to D111410, and that is part of a series
of enhancements that may eventually help solve PR34047.
https://alive2.llvm.org/ce/z/3tB9KG
define i1 @src(i8 %x, i8 %C, i8 %C2) {
  %sub = sub nuw i8 %C2, %x
  %r = icmp slt i8 %sub, %C
  ret i1 %r
}
define i1 @tgt(i8 %x, i8 %C, i8 %C2) {
  %Cnot = xor i8 %C, -1
  %C2not = xor i8 %C2, -1
  %add = add nuw i8 %x, %C2not
  %r = icmp sgt i8 %add, %Cnot
  ret i1 %r
}
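The Alive2 link is the authoritative proof. As an informal cross-check, the i8 src/tgt pair from this commit can be brute-forced in C++. This sketch models i8 with `std::uint8_t`/`std::int8_t` and assumes two's-complement conversion to `std::int8_t`. It only visits inputs where the source `sub nuw` does not wrap (x <= C2). The function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Models @src: icmp slt (sub nuw C2, x), C
bool fold_src(std::uint8_t x, std::uint8_t c, std::uint8_t c2) {
  std::uint8_t sub = static_cast<std::uint8_t>(c2 - x);
  return static_cast<std::int8_t>(sub) < static_cast<std::int8_t>(c);
}

// Models @tgt: icmp sgt (add nuw x, ~C2), ~C
bool fold_tgt(std::uint8_t x, std::uint8_t c, std::uint8_t c2) {
  std::uint8_t cnot = static_cast<std::uint8_t>(~c);
  std::uint8_t c2not = static_cast<std::uint8_t>(~c2);
  std::uint8_t add = static_cast<std::uint8_t>(x + c2not);
  return static_cast<std::int8_t>(add) > static_cast<std::int8_t>(cnot);
}

// Counts inputs on which the two forms disagree, restricted to x <= c2,
// i.e. the cases where the source subtraction has no unsigned wrap.
long count_fold_mismatches() {
  long bad = 0;
  for (int x = 0; x < 256; ++x)
    for (int c = 0; c < 256; ++c)
      for (int c2 = x; c2 < 256; ++c2)
        if (fold_src(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(c),
                     static_cast<std::uint8_t>(c2)) !=
            fold_tgt(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(c),
                     static_cast<std::uint8_t>(c2)))
          ++bad;
  return bad;
}
```

The two forms agree because x + ~C2 equals ~(C2 - x) modulo 256, and bitwise NOT reverses signed order.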
Sanjay Patel [Sun, 10 Oct 2021 14:41:28 +0000 (10:41 -0400)]
[InstCombine] add test for or-of-icmps; NFC
Chen Zheng [Sun, 10 Oct 2021 14:39:20 +0000 (14:39 +0000)]
[PowerPC] update test case using the scripts; nfc
Mark de Wever [Sun, 10 Oct 2021 12:21:01 +0000 (14:21 +0200)]
[libc++][nfc] Remove a duplicated include.
Dávid Bolvanský [Sun, 10 Oct 2021 09:34:03 +0000 (11:34 +0200)]
[NFC] Added tests for PR52056
william woodruff [Sun, 10 Oct 2021 04:10:22 +0000 (09:40 +0530)]
[BitcodeAnalyzer] allow a motivated user to dump BLOCKINFO
This adds the `--dump-blockinfo` flag to `llvm-bcanalyzer`, allowing a sufficiently motivated user to dump (parts of) the `BLOCKINFO_BLOCK` block. The default behavior is unchanged, and `--dump-blockinfo` only takes effect in the same context as other flags that control dump behavior (i.e., requires that `--dump` is also passed).
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D107536
Amara Emerson [Sun, 10 Oct 2021 03:43:21 +0000 (20:43 -0700)]
[GlobalISel] Fix the stores of truncates -> wide store combine for non-evenly dividing type sizes.
If the wide store we'd generate is not a multiple of the memory type of the
narrow stores (e.g. s48 and s32), we'd assert. Fix that.
william woodruff [Sun, 10 Oct 2021 00:44:08 +0000 (06:14 +0530)]
[clang] Fix JSON AST output when a filter is used
Without this, the combination of `-ast-dump=json` and `-ast-dump-filter FILTER` produces invalid JSON: the first line is a string that says `Dumping $SOME_DECL_NAME: `.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108441
Med Ismail Bennani [Sun, 10 Oct 2021 01:28:36 +0000 (03:28 +0200)]
[lldb/test] Disable 'TestScriptedProcess.py' on macOS
This is disabling 'TestScriptedProcess.py' on macOS since it fails on
Green Dragon: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/35974
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Joe Loser [Sat, 9 Oct 2021 21:20:19 +0000 (17:20 -0400)]
[libc++][test] Remove empty {ind.move.subsumption.compile.pass.cpp}
`{ind.move.subsumption.compile.pass.cpp}` was accidentally committed in
https://reviews.llvm.org/D102639. Per the conversation on Discord in
Amy Zhuang [Sat, 9 Oct 2021 19:40:13 +0000 (12:40 -0700)]
[mlir] Vectorize induction variables
1. Add support to vectorize induction variables of loops that are
not mapped to any vector dimension in SuperVectorize pass.
2. Fix a bug in getForInductionVarOwner.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D111370
mydeveloperday [Sat, 9 Oct 2021 18:34:30 +0000 (19:34 +0100)]
[clang-format][NFC] improve the visual of the "clang-formatted %"
NOTE: some files are being removed from those files that are clang-formatted
which means some lack of formatting is slipping through the net on reviews
Mehdi Amini [Sat, 9 Oct 2021 17:56:23 +0000 (17:56 +0000)]
Fix a comment at call-site to match the declared parameter (NFC)
(clang-tidy warning)
Ron Lieberman [Sat, 9 Oct 2021 16:51:53 +0000 (12:51 -0400)]
[libomptarget][amdgpu][NFC] tweak a comment
Kazu Hirata [Sat, 9 Oct 2021 16:38:15 +0000 (09:38 -0700)]
[IR] Remove arg_operands and getNumArgOperands (NFC)
The last uses were removed on Oct 8, 2021 in commit 46ef2e0bf995d8db4cbdf69f3d1bbc2487030ba0.
This is a relanding of b2ee408dde374d6a27a34746fd7c7b5bab97ea89.
Sanjay Patel [Sat, 9 Oct 2021 15:34:48 +0000 (11:34 -0400)]
[InstCombine] enhance icmp with sub folds
There were 2 related but over-specified folds for:
C1 - X == C
One allowed multi-use but was limited to equal constants.
The other allowed different constants but disallowed multi-use.
This combines the 2 folds into a more general match.
The test diffs show the multi-use cases that were falling
through the cracks.
https://alive2.llvm.org/ce/z/4_hEt2
define i1 @src(i8 %x, i8 %subC, i8 %C) {
  %s = sub i8 %subC, %x
  %r = icmp eq i8 %s, %C
  ret i1 %r
}
define i1 @tgt(i8 %x, i8 %subC, i8 %C) {
  %newC = sub i8 %subC, %C
  %isneg = icmp eq i8 %x, %newC
  ret i1 %isneg
}
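The general fold here rests on modular arithmetic: `subC - x == C` holds exactly when `x == subC - C`. A brute-force C++ sketch over all i8 inputs, with i8 modeled as `std::uint8_t` and illustrative function names, makes that concrete. It is an informal check alongside the Alive2 proof, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models @src: icmp eq (sub subC, x), C
bool eq_src(std::uint8_t x, std::uint8_t sub_c, std::uint8_t c) {
  return static_cast<std::uint8_t>(sub_c - x) == c;
}

// Models @tgt: icmp eq x, (sub subC, C)
bool eq_tgt(std::uint8_t x, std::uint8_t sub_c, std::uint8_t c) {
  return x == static_cast<std::uint8_t>(sub_c - c);
}

// Counts disagreements over every (x, subC, C) triple of 8-bit values.
long count_eq_mismatches() {
  long bad = 0;
  for (int x = 0; x < 256; ++x)
    for (int s = 0; s < 256; ++s)
      for (int c = 0; c < 256; ++c)
        if (eq_src(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(s),
                   static_cast<std::uint8_t>(c)) !=
            eq_tgt(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(s),
                   static_cast<std::uint8_t>(c)))
          ++bad;
  return bad;
}
```

No flags such as nuw are needed in this case, since the equivalence holds for wrapping arithmetic.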
Sanjay Patel [Fri, 8 Oct 2021 20:30:55 +0000 (16:30 -0400)]
[InstCombine] add tests for icmp of negated op; NFC
Sanjay Patel [Fri, 8 Oct 2021 17:03:19 +0000 (13:03 -0400)]
[InstCombine] add tests for (iN X s>> N-1) | Y; NFC
These are for a sibling fold suggested in D111410.
The tests correspond to the 'and' tests added with:
a35673f4cfc4
Dávid Bolvanský [Sat, 9 Oct 2021 15:27:41 +0000 (17:27 +0200)]
Fixed some errors detected by PVS Studio
Dávid Bolvanský [Sat, 9 Oct 2021 15:19:53 +0000 (17:19 +0200)]
Fixed some errors detected by PVS Studio
Nikita Popov [Sat, 9 Oct 2021 14:53:07 +0000 (16:53 +0200)]
[CanonicalizeFreeze] Drop IVUsers.h include (NFC)
Looking for users of IVUsers, this was a false positive. Only LSR
uses IVUsers.
David Green [Sat, 9 Oct 2021 14:58:31 +0000 (15:58 +0100)]
[AArch64] Make -mcpu=generic schedule for an in-order core
We would like to start pushing -mcpu=generic towards enabling the set of
features that improves performance for some CPUs, without hurting any
others. A blend of the performance options hopefully beneficial to all
CPUs. The largest part of that is enabling in-order scheduling using the
Cortex-A55 schedule model. This is similar to the Arm backend change
from eecb353d0e25ba which made -mcpu=generic perform in-order scheduling
using the cortex-a8 schedule model.
The idea is that in-order CPUs require the most help in instruction
scheduling, whereas out-of-order CPUs can for the most part out-of-order
schedule around different codegen. Our benchmarking suggests that
hypothesis holds. When running on an in-order core this improved
performance by 3.8% geomean on a set of DSP workloads, 2% geomean on
some other embedded benchmark and between 1% and 1.8% on a set of
singlecore and multicore workloads, all running on a Cortex-A55 cluster.
On an out-of-order cpu the results are a lot more noisy but show flat
performance or an improvement. On the set of DSP and embedded
benchmarks, run on a Cortex-A78 there was a very noisy 1% speed
improvement. Using the most detailed results I could find, SPEC2006 runs
on a Neoverse N1 show a small increase in instruction count (+0.127%),
but a decrease in cycle counts (-0.155%, on average). The instruction
count is very low noise, the cycle count is more noisy with a 0.15%
decrease not being significant. SPEC2k17 shows a small decrease (-0.2%)
in instruction count leading to a -0.296% decrease in cycle count. These
results are within noise margins but tend to show a small improvement in
general.
When specifying an Apple target, clang will set "-target-cpu apple-a7"
on the command line, so should not be affected by this change when
running from clang. This also doesn't enable more runtime unrolling like
-mcpu=cortex-a55 does, only changing the schedule used.
A lot of existing tests have been updated. This is a summary of the important
differences:
- Most changes are the same instructions in a different order.
- Sometimes this leads to very minor inefficiencies, such as requiring
an extra mov to move variables into r0/v0 for the return value of a test
function.
- misched-fusion.ll was no longer fusing the pairs of instructions it
should, as per D110561. I've changed the schedule used in the test
for now.
- neon-mla-mls.ll now uses "mul; sub" as opposed to "neg; mla" due to
the different latencies. This seems fine to me.
- Some SVE tests do not always remove movprfx where they did before due
to different register allocation giving different destructive forms.
- The tests argument-blocks-array-of-struct.ll and arm64-windows-calls.ll
produce two LDR where they previously produced an LDP due to
store-pair-suppress kicking in.
- arm64-ldp.ll and arm64-neon-copy.ll are missing pre/postinc on LDP.
- Some tests such as arm64-neon-mul-div.ll and
ragreedy-local-interval-cost.ll have more, less or just different
spilling.
- In aarch64_generated_funcs.ll.generated.expected one part of the
function is no longer outlined. Interestingly, if I switch this to use
any other schedule, even less is outlined.
Some of these are expected to happen, such as differences in outlining
or register spilling. There will be places where these result in worse
codegen, places where they are better, with the SPEC instruction counts
suggesting it is not a decrease overall, on average.
Differential Revision: https://reviews.llvm.org/D110830
Nico Weber [Sat, 9 Oct 2021 14:18:52 +0000 (10:18 -0400)]
Revert "Reland "[gn build] (manually) port 6fe2beba7d2a (ExceptionTests)""
This reverts commit 842035d8bdf470af05848114ce1808802c5d4aef.
1dba6b3 was reverted yet again in 04aff395047a.
Michał Górny [Sat, 9 Oct 2021 13:42:34 +0000 (15:42 +0200)]
[lldb] [DynamicRegisterInfo] Remove obsolete dwarf typedefs (NFC)
Raphael Isemann [Sat, 9 Oct 2021 12:15:56 +0000 (14:15 +0200)]
[lldb][NFC] Early-exit in DWARFASTParserClang::ParseSingleMember
ParseSingleMember has two large ifs around the back of its body:
`if (!is_artificial)` and `if (member_type)`. This patch just converts those
to early-exits. The patch is NFC. It even retains the curious fact that
Objective-C properties that fail to parse are silently ignored, but now there
is at least a FIXME that points this out.
Aaron Ballman [Sat, 9 Oct 2021 12:20:20 +0000 (08:20 -0400)]
Fix a diagnoses-valid in C++20 with variadic macros
C++20 and later allow you to pass no argument for the ... parameter in
a variadic macro, whereas earlier language modes and C disallow it.
We no longer diagnose in C++20 and later modes. This fixes PR51609.
Mark de Wever [Sat, 9 Oct 2021 11:31:20 +0000 (13:31 +0200)]
[NFC][libc++] Update back_insert_iterator style.
As suggested in D110573 land the rename part separately.
Mark de Wever [Sat, 9 Oct 2021 11:28:38 +0000 (13:28 +0200)]
[libc++][doc] Update format status.
Updated based on recent commits.
mydeveloperday [Sat, 9 Oct 2021 11:26:07 +0000 (12:26 +0100)]
[clang-format][NFC] Fix spelling mistakes
Frederic Cambus [Sat, 9 Oct 2021 11:21:39 +0000 (13:21 +0200)]
[Driver][OpenBSD] Use ToolChain reference instead of getToolChain().
Differential Revision: https://reviews.llvm.org/D111462
mydeveloperday [Sat, 9 Oct 2021 11:18:25 +0000 (12:18 +0100)]
[clang-format][NFC] Fix spelling mistake
mydeveloperday [Sat, 9 Oct 2021 10:02:49 +0000 (11:02 +0100)]
[clang-format][docs][NFC] correct the "first supported versions" of some of the clang-format options
Some of the "first supported version" fields were incorrectly attributed to a later branch.
It wasn't possible to correctly determine the "introduced version" with my naive implementation
using git blame alone (especially if the type had been changed from a bool -> enum).
I saw more things attributed to clang-format 13 than I remembered and reviewed
those options to determine their introduced version.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D110803
Nikita Popov [Sat, 9 Oct 2021 09:28:11 +0000 (11:28 +0200)]
[Type] Avoid APFloat.h include (NFC)
This is only used by a handful of methods working on fltSemantics,
and having these defined inline in the header does not look
particularly important.
Nikita Popov [Sat, 9 Oct 2021 08:30:39 +0000 (10:30 +0200)]
[MCPseudoProbe] Clean up includes (NFC)
This was including various things that don't appear to be used in
the header at all.
luxufan [Sat, 9 Oct 2021 07:40:03 +0000 (15:40 +0800)]
[Orc] Fix global variable destructor function support when --jit-kind=orc-lazy
The bug was reported here https://bugs.llvm.org/show_bug.cgi?id=52030
This patch follows the idea that @lhames commented in the above webpage.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D110990
Max Kazantsev [Sat, 9 Oct 2021 07:47:44 +0000 (14:47 +0700)]
[LoopDeletion] Support selects when symbolically evaluating 1st iteration
Adds support for selects for which we know value on the 1st iteration.
Differential Revision: https://reviews.llvm.org/D104111
Reviewed By: nikic
Max Kazantsev [Sat, 9 Oct 2021 07:32:46 +0000 (14:32 +0700)]
[Test] Add commit justifying revert of D110922
Test by Arthur Eubanks!
luxufan [Sat, 9 Oct 2021 00:36:41 +0000 (08:36 +0800)]
[Orc] Support atexit in Orc(JITLink)
There is a bug reported at https://bugs.llvm.org/show_bug.cgi?id=48938
After looking through glibc, I found that `atexit(f)` is the same as `__cxa_atexit(f, NULL, NULL)`. In the ORC runtime, we identify different JITDylibs by their dso_handle value, so a NULL dso_handle is invalid. So in this patch, I added a `PlatformJDDSOHandle` to ELFNixRuntimeState, and functions registered by atexit will be registered at PlatformJD.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D111413
william woodruff [Sat, 9 Oct 2021 03:29:45 +0000 (08:59 +0530)]
[BitcodeReader] fix a logic error in vector type element validation
The current code checks whether the vector's element type is a valid structure element type, rather than a valid vector element type. The two have separate implementations but accept only very slightly different sets of types, which is probably why this wasn't caught before.
Differential Revision: https://reviews.llvm.org/D109655
Brad Smith [Sat, 9 Oct 2021 03:56:52 +0000 (23:56 -0400)]
[OpenBSD] Use cortex-a8 as default CPU for ARMv7
hsmahesha [Sat, 9 Oct 2021 03:52:59 +0000 (09:22 +0530)]
[CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()
CodeGenFunction::InitTempAlloca() inits the static alloca within the
entry block, which is *not* necessarily always correct.
For example, the current instruction insertion point (pointed by the
instruction builder) could be a program point which is hit multiple
times during the program execution, and it is expected that the static
alloca is initialized every time the program point is hit.
Hence remove CodeGenFunction::InitTempAlloca(), and initialize the
static alloca where the instruction insertion point is at the moment.
This patch, as a starting attempt, removes the calls to
CodeGenFunction::InitTempAlloca() which do not have any side effect on
the lit tests.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D111293
Michael Kruse [Sat, 9 Oct 2021 03:42:27 +0000 (22:42 -0500)]
[Polly] Fix test case fixing the colon.
Commit 573531fb1f529b1413b789fa9eee11c7b41ac83d fixed the colon at the
end of a CHECK line (was a semicolon by mistake). With the check
enabled, it turned out that it was failing. Check for the correct
content.
Also add the missing colon to the next CHECK line.
Qiu Chaofan [Sat, 9 Oct 2021 03:26:01 +0000 (11:26 +0800)]
Revert a LIT typo fix in a RUN line
Commit 573531f changes the behavior of the test, revert it back.
Mehdi Amini [Sat, 9 Oct 2021 03:01:42 +0000 (03:01 +0000)]
Disable mlir/test/mlir-cpu-runner/async-group.mlir with ASAN
This test is crashing 9 out of 10 runs in CI, but I can't reproduce
locally right now. Disabling to get the CI back to green and avoid
backsliding with more ASAN issues that would go unnoticed.
Richard Smith [Sat, 9 Oct 2021 02:06:22 +0000 (19:06 -0700)]
Don't update the vptr at the start of the destructor of a final class.
In this case, we know statically that we're destroying the most-derived
class, so the vptr must already point to the current class and never
needs to be updated.
Qiu Chaofan [Sat, 9 Oct 2021 02:48:44 +0000 (10:48 +0800)]
[Clang] Enable _Complex __ibm128 type
fae0dfa implemented the new __ibm128 type, this patch enables its
complex form.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D109948
Qiu Chaofan [Sat, 9 Oct 2021 02:39:10 +0000 (10:39 +0800)]
[NFC] [Clang] Use global enum for explicit float mode
Currently, there're multiple float types that can be represented by
__attribute__((mode(xx))). It's parsed, and then a corresponding type is
created if available.
This refactor moves the enum for mode into a global enum class visible
to ASTContext.
Reviewed By: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D111391
Joseph Huber [Sat, 9 Oct 2021 00:08:28 +0000 (20:08 -0400)]
[OpenMP] Add RTL function for getting number of threads in block.
This patch adds support for the
`__kmpc_get_hardware_num_threads_in_block` function that returns the
number of threads. This was missing in the new runtime and was used by
the AMDGPU plugin which prevented it from using the new runtime. This
patch also unifies the interface for getting the thread numbers in the
frontend.
Originally authored by jdoerfert.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D111475
Qiu Chaofan [Sat, 9 Oct 2021 02:12:10 +0000 (10:12 +0800)]
[APFloat] Set size of PPCDoubleDouble to 128
566690b0 uses size information in float semantics, but PPCDoubleDouble
left them empty.
As a follow-up, we can consider removing PPCDoubleDoubleLegacy and filling
other fields in the future.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D111398
Qiu Chaofan [Sat, 9 Oct 2021 02:01:27 +0000 (10:01 +0800)]
Fix typo of colon to semicolon in lit tests
Joseph Huber [Sat, 9 Oct 2021 01:52:54 +0000 (21:52 -0400)]
[OpenMP] Avoid calling `isSPMDMode` during RT initialization
Until we hit the first barrier we should not call `mapping::isSPMDMode`
with all threads. Instead, we now have (and use during initialization) a
`mapping::isMainThreadInGenericMode` overload that takes the known
SPMD-mode state and one that queries it.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D111381
Richard Smith [Sat, 9 Oct 2021 01:38:30 +0000 (18:38 -0700)]
PR51079: Treat thread_local variables with an incomplete class type as
being not trivially destructible when determining if we can skip calling
their thread wrapper function.
Michael Kruse [Sat, 9 Oct 2021 00:49:40 +0000 (19:49 -0500)]
[Polly] Add greedy fusion algorithm.
When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer
tries to aggressively fuse any bands it can without violating any
dependences.
As part of the implementation, the functionality for copying a band
into a new schedule was extracted out of the ScheduleTreeRewriter.
Arthur Eubanks [Sat, 9 Oct 2021 01:26:15 +0000 (18:26 -0700)]
[LICM] Use Align instead of int
John Ericson [Thu, 7 Oct 2021 15:39:01 +0000 (11:39 -0400)]
Remove unnecessary StringRef conversion in llvm-config
We have a string literal (via CPP) used to construct `StringRef`, which
is used to construct a `SmallString`. Just construct the latter
directly.
Differential Revision: https://reviews.llvm.org/D111322
Aditya Kumar [Fri, 8 Oct 2021 23:28:15 +0000 (16:28 -0700)]
Add no_instrument_function attribute to Objective C methods as well
There are functions where we do not want function instrumentation, which is why we have `__attribute__((no_instrument_function))`. This extends that functionality to disable instrumentation for Objective-C methods as well. Objective-C methods like `+load` run before main, and instrumenting them causes runtime errors depending on the implementation of functions such as `__cyg_profile_func_enter`.
Reviewed By: rjmccall, aaron.ballman
Differential Revision: https://reviews.llvm.org/D111286
Leonard Chan [Sat, 9 Oct 2021 00:43:23 +0000 (17:43 -0700)]
Revert "Reland "[clang-repl] Re-implement clang-interpreter as a test case.""
This reverts commit 1dba6b37bdc70210f75a480eff3715ebe1f1d8be.
Reverting because the ClangReplInterpreterExceptionTests test fails on
our builders with this patch.
Yuanfang Chen [Fri, 8 Oct 2021 23:46:03 +0000 (16:46 -0700)]
[LangRef] Fix a typo in DISubrange section
Arthur Eubanks [Fri, 8 Oct 2021 23:34:22 +0000 (16:34 -0700)]
Make more places that use alignment use uint64_t
Followup to D110451.
Kent Ross [Fri, 8 Oct 2021 21:54:28 +0000 (14:54 -0700)]
[libc++][spaceship] Implement std::tuple::operator<=>
Implement parts of P1614, including three-way comparison for tuples, and expand testing.
Reviewed By: ldionne, Mordante, #libc
Differential Revision: https://reviews.llvm.org/D108250
Reid Kleckner [Fri, 8 Oct 2021 22:43:43 +0000 (15:43 -0700)]
Fix TargetRegistry shlib build, clang edition
Nick Desaulniers [Fri, 8 Oct 2021 22:17:54 +0000 (15:17 -0700)]
[InlineCost] model calls to llvm.is.constant* more carefully
llvm.is.constant* intrinsics are evaluated to 0 or 1 integral values.
A common use case for llvm.is.constant comes from the higher level
__builtin_constant_p. A common usage pattern of __builtin_constant_p in
the Linux kernel is:
void foo (int bar) {
  if (__builtin_constant_p(bar)) {
    // lots of code that will fold away to a constant.
  } else {
    // a little bit of code, usually a libcall.
  }
}
A minor issue in InlineCost calculations is when `bar` is _not_ Constant
and still will not be after inlining, we don't discount the true branch
and the inline cost of `foo` ends up being the cost of both branches
together, rather than just the false branch.
This leads to code like the above where inlining will not help prove bar
Constant, but it still would be beneficial to inline foo, because the
"true" branch is irrelevant from a cost perspective.
For example, IPSCCP can sink a passed constant argument to foo:
const int x = 42;
void bar (void) { foo(x); }
This improves our inlining decisions, and fixes a few head scratching
cases where the disassembly shows a relatively small `foo` not inlined
into a lone caller.
We could further improve this modeling by tracking whether the argument
to llvm.is.constant* is a parameter of the function, and if inlining
would allow that parameter to become Constant. This idea is noted in a
FIXME comment.
Link: https://github.com/ClangBuiltLinux/linux/issues/1302
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D111272
Vedant Kumar [Fri, 8 Oct 2021 22:00:02 +0000 (15:00 -0700)]
[ADT] Mark IntervalMap::overlaps const
This allows the overlaps() predicate to be used on a const IntervalMap.
Tested by building ADTTests, llc, and lldb-test.
Reid Kleckner [Fri, 8 Oct 2021 22:18:58 +0000 (15:18 -0700)]
Fix shlib builds for all lib/Target/*/TargetInfo libs
They all must depend on MC now that the target registry is in MC.
Also fix llvm-cxxdump
Reid Kleckner [Fri, 8 Oct 2021 22:06:03 +0000 (15:06 -0700)]
Fix shared library build after TargetRegistry move
Reid Kleckner [Fri, 8 Oct 2021 17:48:15 +0000 (10:48 -0700)]
Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
Fangrui Song [Fri, 8 Oct 2021 21:40:22 +0000 (14:40 -0700)]
[Driver][test] Fix undefined-libs.cpp when CLANG_DEFAULT_UNWINDLIB is libunwind
Vitaly Buka [Fri, 8 Oct 2021 21:26:25 +0000 (14:26 -0700)]
[NFC][sanitizer] Add const to ChainedOriginDepotNode methods
Vitaly Buka [Fri, 8 Oct 2021 21:13:15 +0000 (14:13 -0700)]
[NFC][sanitizer] Remove includes from header
Richard Smith [Fri, 8 Oct 2021 21:24:03 +0000 (14:24 -0700)]
Fix unintended fall-through.
Unfortunately I've not found a way to exercise this code that doesn't
crash elsewhere yet, due to unrelated bugs in how Sema incorrectly
instantiates lambdas in function template signatures.
Leonard Chan [Fri, 8 Oct 2021 21:20:26 +0000 (14:20 -0700)]
[AArch64] Emit CFI instruction for updating x18 when using ShadowCallStack with exception unwinding
PR45875 notes an instance where exception handling crashes on aarch64-fuchsia
where SCS is enabled by default. The underlying issue seems to be that within libunwind,
in the various _Unwind_* functions, the x18 register is not updated if a function is marked
with nounwind. This removes the check for nounwind and emits the CFI instruction that updates x18.
Differential Revision: https://reviews.llvm.org/D79822
Nikita Popov [Thu, 7 Oct 2021 19:41:30 +0000 (21:41 +0200)]
[LoopFlatten] Mark inner loop as deleted
If a loop is flattened, the inner loop is removed and the LPM
should be informed of this fact, so it can invalidate associated
analyses. To support this, we relax an assertion in LPMUpdater to
allow invalidating non-top-level loops when running in LoopNestMode,
as the pass does not know how exactly it will get scheduled.
Differential Revision: https://reviews.llvm.org/D111350
Vitaly Buka [Fri, 8 Oct 2021 21:05:29 +0000 (14:05 -0700)]
[NFC][sanitizer] Parametrize PersistentAllocator with type
Joe Loser [Fri, 8 Oct 2021 20:57:44 +0000 (16:57 -0400)]
[libc++] Implement P1394r4 for span: range constructor
Implement https://wg21.link/p1394 which allows span to be constructible
from any contiguous forwarding-range that has a compatible element type.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51443
Reviewed By: ldionne, Quuxplusone, #libc
Differential Revision: https://reviews.llvm.org/D110503
Emilio Cota [Fri, 8 Oct 2021 20:28:04 +0000 (13:28 -0700)]
X86Vector: relax checks in rsqrt's integration test
Instead of hard-coding results for both Intel and AMD, let's relax
the checks to simplify the test while supporting both implementations.
Note that:
- If a new hardware implementation comes up in the future, it is likely
to pass the relaxed tests, i.e. no future maintenance burden for us.
- If something terribly wrong happens (e.g. instead of rsqrt we
execute 1/sqrt), the tests will probably catch it, since the relaxed
tests expect low precision (e.g. rsqrt(1) != 1.0).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D111461
Richard Smith [Fri, 8 Oct 2021 20:39:49 +0000 (13:39 -0700)]
PR52073: Fix equivalence computation for lambda-expressions.
Distinct lambda expressions are always considered non-equivalent, so two
token-for-token identical function declarations whose signatures involve
lambda-expressions declare distinct functions.
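The language rule underneath this fix is that every lambda-expression has its own unique closure type, even when two lambdas are spelled identically. The sketch below demonstrates only that rule. It does not reproduce Clang's equivalence computation or the function-signature case from PR52073, and the function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Returns whether two token-for-token identical lambda-expressions end up
// with the same closure type. Each lambda-expression introduces a distinct
// type, so the comparison is expected to be false.
bool identical_lambdas_share_a_type() {
  auto first = [] { return 0; };
  auto second = [] { return 0; };
  return std::is_same<decltype(first), decltype(second)>::value;
}
```

Because the closure types differ, declarations whose signatures mention such lambdas cannot name the same function, which matches the behavior the commit describes.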
Peter Steinfeld [Fri, 8 Oct 2021 18:26:32 +0000 (11:26 -0700)]
[flang] Fix capitalization of "ishft"
We weren't recognizing the ISHFT intrinsic because the code had
incorrectly capitalized it.
Differential Revision: https://reviews.llvm.org/D111449
Lang Hames [Fri, 8 Oct 2021 20:01:31 +0000 (13:01 -0700)]
Revert "[ORC] Move SimpleRemoteEPCServer::Dispatcher into OrcShared."
This reverts commit dfd74db9813b0c7c64038c303726ba43f335e07a.
SimpleRemoteEPC should share dispatch with the ExecutionSession, rather than
having two different dispatch systems on the controller side.
SimpleRemoteEPCServer::Dispatch doesn't need to be shared.
Lang Hames [Fri, 8 Oct 2021 19:44:01 +0000 (12:44 -0700)]
[ORC] Remove a stale comment.
SimpleRemoteEPCServer Service shutdown (c965fde7c234a) takes care of this.
Vitaly Buka [Fri, 8 Oct 2021 20:41:57 +0000 (13:41 -0700)]
[NFC][sanitizer] Move ChainedOriginDepotNode into cpp file
Vitaly Buka [Fri, 8 Oct 2021 17:19:24 +0000 (10:19 -0700)]
[NFC][sanitizer] Remove sanitizer_persistent_allocator.cpp
We need to make it a template
Amy Kwan [Fri, 8 Oct 2021 17:52:01 +0000 (12:52 -0500)]
[NFC] Update vec_extract builtin signatures to take signed int.
This patch updates the vec_extract builtins to take a signed int as the second
parameter, as defined by the Power Vector Intrinsics Programming Reference.
This patch is NFC and all existing tests pass.
Differential Revision: https://reviews.llvm.org/D110935
Arthur Eubanks [Fri, 8 Oct 2021 19:36:40 +0000 (12:36 -0700)]
[test] Fixup builtin-assume-aligned.c
__builtin_assume_aligned's second parameter is size_t, which may be 32 bits.
We can't pass 2^32 when that happens. Update tests accordingly.
Example broken bot due to D111250:
https://lab.llvm.org/buildbot/#/builders/171/builds/4531
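The arithmetic behind this test fixup can be shown in a few lines. This sketch models a 32-bit `size_t` with `std::uint32_t`, which is an assumption made for illustration, and the helper names are hypothetical. It does not call `__builtin_assume_aligned` itself.

```cpp
#include <cstdint>
#include <limits>

// 2^32, the value that cannot be passed when size_t is only 32 bits wide.
constexpr std::uint64_t kTwoTo32 = std::uint64_t{1} << 32;

// Models a 32-bit size_t with std::uint32_t and reports whether a value is
// representable in it.
constexpr bool fits_in_32_bit_size_t(std::uint64_t value) {
  return value <= std::numeric_limits<std::uint32_t>::max();
}

// Narrowing to 32 bits keeps only the low 32 bits of the value.
constexpr std::uint32_t narrow_to_32_bits(std::uint64_t value) {
  return static_cast<std::uint32_t>(value);
}
```

The largest 32-bit value is 2^32 - 1, so 2^32 does not fit and would narrow to 0, which is why the tests need a different constant on such targets.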
Nikita Popov [Thu, 7 Oct 2021 21:26:45 +0000 (23:26 +0200)]
[DenseMapInfo] Move hash_code implementation to Hashing.h (NFC)
This moves the DenseMapInfo implementation for hash_code into
Hashing.h, removing the need to include Hashing.h (and thus <string>)
in DenseMapInfo.h. This follows the general convention of declaring
DenseMapInfo for types that we own in the respective header. The
remaining implementations in DenseMapInfo.h are all for types we
do not own.
Differential Revision: https://reviews.llvm.org/D111451
Joseph Huber [Fri, 1 Oct 2021 18:37:02 +0000 (14:37 -0400)]
[Libomptarget] Add an external interface to dynamic shared memory
This patch adds an external interface to access the dynamic shared
memory buffer in the device runtime. The function introduced is
``llvm_omp_get_dynamic_shared``. This includes a host-side
definition that only returns a null pointer so that it can be used when
host-fallback is enabled without crashing. Support for dynamic shared
memory was also ported to the old device runtime.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D110957
Andrew Browne [Fri, 8 Oct 2021 18:49:20 +0000 (11:49 -0700)]
[DFSan] Fix warning: getArgsFunctionType defined but not used
Warning introduced in 61ec2148c5a68d870356d6348309e94a2267a1a4
Arthur Eubanks [Wed, 6 Oct 2021 22:23:32 +0000 (15:23 -0700)]
More followup type changes after 05392466