David Green [Sun, 13 Jun 2021 12:55:34 +0000 (13:55 +0100)]
[ARM] Introduce t2WhileLoopStartTP
This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in
D90591. It keeps a reference to both the tripcount register and the
element count register, so that the ARMLowOverheadLoops pass in the
backend can pick the correct one without having to search for it from
the operand of a VCTP.
Differential Revision: https://reviews.llvm.org/D103236
Markus Böck [Sun, 13 Jun 2021 12:48:27 +0000 (14:48 +0200)]
[clang][NFC] Add IsAnyDestructorNoReturn field to CXXRecord instead of calculating it on demand
This patch addresses a performance issue I noticed when using clang-12 to compile projects of mine. Even though the files weren't too large (around 1k cpp), the compiler was taking more than a minute to compile the source file, much longer than either GCC or MSVC.
Profiling showed that the issue was the isAnyDestructorNoReturn function in CXXRecordDecl: it is recursive and recalculates the property on every invocation, for every field and base class. This showed up in tracebacks in the profiler.
This patch instead adds IsAnyDestructorNoReturn as a field to the data inside of CXXRecord and updates it when a new base class, destructor, or record field member is added.
After this patch the problematic file of mine went from a compile time of 81s, down to 12s.
The patch itself should not change any functionality, just improve performance.
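The idea of replacing an on-demand recursive query with a flag maintained at insertion time can be sketched as follows. This is only an illustration under assumptions: `Record`, `setDestructorNoReturn`, `addBaseOrField` and `anyDtorNoReturnRecursive` are hypothetical names, not Clang's actual classes or methods.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a record declaration; not Clang's real class.
struct Record {
  bool DtorIsNoReturn = false;       // this record's own destructor
  bool AnyDtorNoReturn = false;      // cached: own dtor, bases, or fields
  std::vector<const Record *> Parts; // base classes and field types

  void setDestructorNoReturn() {
    DtorIsNoReturn = true;
    AnyDtorNoReturn = true;
  }

  // Assumes R is already complete when it is added, so its cached flag is
  // final and a single OR keeps this record's flag up to date.
  void addBaseOrField(const Record &R) {
    assert(&R != this && "a record cannot contain itself");
    Parts.push_back(&R);
    AnyDtorNoReturn |= R.AnyDtorNoReturn;
  }
};

// The on-demand variant: walks the whole hierarchy again on every call.
inline bool anyDtorNoReturnRecursive(const Record &R) {
  if (R.DtorIsNoReturn)
    return true;
  for (const Record *P : R.Parts)
    if (anyDtorNoReturnRecursive(*P))
      return true;
  return false;
}
```

Reading the cached flag is constant time, while the recursive query revisits shared bases and fields once per path that reaches them.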
Differential Revision: https://reviews.llvm.org/D104182
Sanjay Patel [Sun, 13 Jun 2021 12:21:23 +0000 (08:21 -0400)]
[InstCombine] fold ctlz/cttz of bool types
https://alive2.llvm.org/ce/z/tX4pUT
Simon Pilgrim [Sun, 13 Jun 2021 12:05:17 +0000 (13:05 +0100)]
SValExplainer.h - get APSInt values by const reference instead of value. NFCI.
Avoid unnecessary copies.
Simon Pilgrim [Sun, 13 Jun 2021 12:03:47 +0000 (13:03 +0100)]
ArgumentPromotion.cpp - remove unused <string> include. NFCI.
Simon Pilgrim [Sun, 13 Jun 2021 11:36:51 +0000 (12:36 +0100)]
VPlanSLP.cpp - tidy implicit header dependencies. NFCI.
We don't use std::string and std::vector, but we do use std::pair and std::max.
Lang Hames [Sun, 13 Jun 2021 10:45:20 +0000 (20:45 +1000)]
[ORC-RT] Remove unused header in unit test.
Lang Hames [Sun, 13 Jun 2021 10:43:49 +0000 (20:43 +1000)]
[JITLink][MachO] Add missing testcase.
This test was accidentally left out of f9649d123db.
Lang Hames [Sun, 13 Jun 2021 09:31:36 +0000 (19:31 +1000)]
[ORC-RT] Fix a comment.
Matheus Izvekov [Fri, 19 Mar 2021 02:32:06 +0000 (03:32 +0100)]
[clang] Implement P2266 Simpler implicit move
This implements [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r1.html|P2266 Simpler implicit move]].
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99005
Kristina Bessonova [Wed, 2 Jun 2021 17:51:11 +0000 (19:51 +0200)]
[ARM][NEON] Combine base address updates for vld1Ndup intrinsics
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D103836
Luo, Yuanke [Thu, 10 Jun 2021 14:50:20 +0000 (22:50 +0800)]
[X86] Check immediate before get it.
For a CMP imm instruction, when operand 1 is a symbol address we should
check whether it is an immediate first. Here is the example code.
`CMP64mi32 $noreg, 8, killed renamable $rcx, @d, $noreg, @a, implicit-def $eflags`
Many thanks to Craig Topper for the test case to reproduce this issue.
Differential Revision: https://reviews.llvm.org/D104037
Luo, Yuanke [Sun, 13 Jun 2021 05:55:19 +0000 (13:55 +0800)]
Revert "[X86] Check immediate before get it."
This reverts commit 9eb2f723c24523194b833779d20b027bf89a4f55.
Shoaib Meenai [Sun, 13 Jun 2021 02:47:09 +0000 (19:47 -0700)]
[runtimes] Fix umbrella component targets
When we're building the runtimes for multiple platform targets, we
create umbrella build targets for each distribution component, but those
targets didn't have any dependencies and were just no-ops. Make the
umbrella target depend on the sub-targets for each platform to fix this,
which is consistent with the behavior of the umbrella targets for each
runtime, and also consistent with the behavior when we've only specified
the default target.
David Blaikie [Sun, 13 Jun 2021 01:54:08 +0000 (18:54 -0700)]
llvm-objcopy: fix section size truncation/extension when dumping sections
Since this only comes up with inputs containing sections at least 4GB
large (I guess I could use a bzero section or something, so the input
file doesn't have to be 4GB, but even then the output file would have to
be 4GB, right?) I've skipped testing this, unless there's a nice way to
test it without needing 4GB inputs or output files.
The subtlety here is demonstrated by this code:
struct t { operator uint64_t(); };
static_assert(std::is_same_v<int, decltype(std::declval<bool>() ? 0 : std::declval<t>())>);
static_assert(std::is_same_v<uint64_t, decltype(std::declval<bool>() ? 0 : std::declval<uint64_t>())>);
Because of this difference, the original source code was getting an int
type (truncating the actual size) and then extending it again, resulting
in bogus values. (I haven't thought through this hard enough to explain
why the resulting value was 0xffff...: sign extension, possible UB, but
in any case it's the wrong answer.) In the particular case I was
looking at, that resulted in a size so large that we couldn't open a file
large enough to write to, and we ended up with a rather vague:
error: 'file_name.o': Invalid argument
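A minimal sketch of how the int-typed conditional can turn a section size just over 4GB into a huge bogus value. `Wrapper`, `sizeTruncated` and `sizeFixed` are hypothetical names for illustration, not the llvm-objcopy code, and the out-of-range narrowing to int is assumed to wrap modulo 2^32 (guaranteed from C++20, implementation-defined before that).

```cpp
#include <cassert> // for the checks in the usage note
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical wrapper playing the role of `t` above.
struct Wrapper {
  uint64_t V;
  operator uint64_t() const { return V; }
};

// 0 is an int, and Wrapper can be converted to int but not the other way
// around, so the whole conditional expression has type int.
static_assert(std::is_same_v<int, decltype(std::declval<bool>()
                                               ? 0
                                               : std::declval<Wrapper>())>);

// Buggy shape: the 64-bit value is narrowed to int, then widened again.
inline uint64_t sizeTruncated(bool Empty, Wrapper W) {
  return Empty ? 0 : W;
}

// One possible fix: make both arms uint64_t so no narrowing happens.
inline uint64_t sizeFixed(bool Empty, Wrapper W) {
  return Empty ? uint64_t(0) : uint64_t(W);
}
```

Under that assumption, `assert(sizeTruncated(false, Wrapper{0x180000000ULL}) == 0xFFFFFFFF80000000ULL)` holds: the low 32 bits, 0x80000000, become a negative int that is sign-extended on the way back to uint64_t, whereas `sizeFixed` returns 0x180000000 unchanged.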
Luo, Yuanke [Thu, 10 Jun 2021 14:50:20 +0000 (22:50 +0800)]
[X86] Check immediate before get it.
For a CMP imm instruction, when operand 1 is a symbol address we should
check whether it is an immediate first. Here is the example code.
`CMP64mi32 $noreg, 8, killed renamable $rcx, @d, $noreg, @a, implicit-def $eflags`
Many thanks to Craig Topper for the test case to reproduce this issue.
Differential Revision: https://reviews.llvm.org/D104037
Lang Hames [Sat, 12 Jun 2021 22:55:47 +0000 (08:55 +1000)]
[ORC-RT] Split Simple-Packed-Serialization code into its own header.
This will simplify integration of this code into LLVM -- The
Simple-Packed-Serialization code can be copied near-verbatim, but
WrapperFunctionResult will require more adaptation.
Mehdi Amini [Sat, 12 Jun 2021 21:55:37 +0000 (21:55 +0000)]
Simplify getArgAttrDict/getResultAttrDict by removing unnecessary checks
There is a slight change in behavior: if the arg dictionary is empty
then we return this empty dictionary instead of a null attribute.
This is more consistent with accessing it through:
ArrayAttr args_attr = func_op.getAllArgAttrs();
args_attr[num].cast<DictionaryAttr>() ...
Differential Revision: https://reviews.llvm.org/D104189
Roman Lebedev [Sat, 12 Jun 2021 21:00:28 +0000 (00:00 +0300)]
[NFC][X86][Codegen] Add shuffle test that would benefit from sorting in reduceBuildVecToShuffle()
Mehdi Amini [Sat, 12 Jun 2021 20:08:37 +0000 (20:08 +0000)]
Use dyn_cast_or_null instead of dyn_cast in FunctionLike::verifyTrait (NFC)
This is making the verifier more tolerant to cases where a "null"
Attribute would be inserted in the array of func arguments/results
attributes.
Ian McIntyre [Sat, 12 Jun 2021 19:23:07 +0000 (12:23 -0700)]
[llvm-objcopy] Exclude empty sections in IHexWriter output
IHexWriter was evaluating a section's physical address when deciding if
that section should be written to an output. This approach does not
account for a zero-sized section that has the same physical address as a
sized section. The behavior varies from GNU objcopy, and may result in a
HEX file that does not include all program sections.
The IHexWriter now excludes zero-sized sections when deciding what
should be written to the output. This affects the contents of the
writer's `Sections` collection; we will not try to insert multiple
sections that could have the same physical address. The behavior seems
consistent with GNU objcopy, which always excludes empty sections,
no matter the address.
The new test case evaluates the IHexWriter behavior when provided a
variety of empty sections that overlap or append a filled section. See
the input file's comments for more information. Given that test input,
and the change to the IHexWriter, GNU objcopy and llvm-objcopy produce
the same output.
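Why an empty section can hide a filled one is easy to see in a toy model. The sketch below assumes a collection keyed by physical address in which the first section at an address wins; `Section` and `collect` are hypothetical names and this is not the IHexWriter implementation.

```cpp
#include <cassert> // for the checks in the usage note
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified section description.
struct Section {
  std::string Name;
  uint64_t PhysAddr;
  uint64_t Size;
};

// Collects section names keyed by physical address. emplace() leaves an
// existing entry untouched, so the first section seen at an address wins.
inline std::map<uint64_t, std::string>
collect(const std::vector<Section> &In, bool SkipEmpty) {
  std::map<uint64_t, std::string> ByAddr;
  for (const Section &S : In) {
    if (SkipEmpty && S.Size == 0)
      continue; // empty sections contribute no data to the output
    ByAddr.emplace(S.PhysAddr, S.Name);
  }
  return ByAddr;
}
```

With the input `{{".empty", 0x1000, 0}, {".text", 0x1000, 16}}`, `collect(In, false).at(0x1000)` is ".empty" and the filled section is lost, while `collect(In, true).at(0x1000)` is ".text".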
Reviewed By: jhenderson, MaskRay, evgeny777
Differential Revision: https://reviews.llvm.org/D101332
Xun Li [Sat, 12 Jun 2021 17:29:53 +0000 (10:29 -0700)]
[CHR] Don't run ControlHeightReduction if any BB has address taken
This patch is to address https://bugs.llvm.org/show_bug.cgi?id=50610.
In the computed goto pattern, there is usually a list of basic blocks that are all targets of an indirectbr instruction, and each basic block also has its address taken and stored in a variable.
The CHR pass could potentially clone these basic blocks, which would generate a cloned version of the indirectbr and a cloned version of all basic blocks in the list.
However, these cloned basic blocks will not have their addresses taken and stored anywhere. So a later SimplifyCFG pass will simply remove all these cloned basic blocks, resulting in incorrect code.
To fix this, when searching for scopes, we skip scopes that contain BBs with addresses taken.
Added a few test cases.
Reviewed By: aeubanks, wenlei, hoy
Differential Revision: https://reviews.llvm.org/D103867
Craig Topper [Sat, 12 Jun 2021 16:49:32 +0000 (09:49 -0700)]
[X86] Add ISD::FREEZE and ISD::AssertAlign to the list of opcodes that don't guarantee upper 32 bits are zero.
The freeze issue was reported here
https://llvm.discourse.group/t/bug-or-feature-freeze-instruction/3639
I don't have a test for AssertAlign. I just noticed it was missing
and assume it should be similar to the other two Asserts.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D104178
Saleem Abdulrasool [Sat, 12 Jun 2021 02:05:42 +0000 (19:05 -0700)]
Revert "Revert "DirectoryWatcher: add an implementation for Windows""
This reverts commit 0ec1cf13f2a4e31aa2c5ccc665c5fbdcd3a94577.
Restore the implementation with some minor tweaks:
- Use std::unique_ptr for the path instead of std::vector
* Stylistic improvement as the buffer is already heap allocated, this
just makes it clearer.
- Correct the notification buffer allocation size
* Memory usage fix: we were allocating 4x the computed size
- Correct the passing of the buffer size to RDC
* Memory usage fix: we were reporting 1/4th of the size
- Convert the operation event to auto-reset
* Bug Fix: we never reset the event
- Remove `FILE_NOTIFY_CHANGE_LAST_ACCESS` from RDC events
* Memory usage fix: we never needed this notification
- Fold events for the notification action
* Stylistic improvement to be clear how the events map
- Update comment
* Stylistic improvement to be clear what the RAII controls
- Fix the race condition that was uncovered previously
* We would return from the construction before the watcher thread
began execution. The test would then proceed to begin execution,
and we would miss the initial notifications. We now ensure that the
watcher thread is initialized before we return. This ensures that
we do not miss the initial notifications.
Running the test on an SSD was able to uncover the access pattern. This
now seems to pass reliably where it was previously flaky locally.
Matheus Izvekov [Fri, 19 Mar 2021 02:32:06 +0000 (03:32 +0100)]
[clang] NRVO: Improvements and handling of more cases.
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Florian Hahn [Sat, 12 Jun 2021 11:11:51 +0000 (12:11 +0100)]
[VPlan] Add more sinking/merging tests with predicated loads/stores.
Shashij gupta [Sat, 12 Jun 2021 13:58:40 +0000 (19:28 +0530)]
[MLIR] Simplify affine.if ops with trivial conditions
This commit simplifies affine.if ops:
the affine.if operation is removed if its condition is universally true or false, and the then/else block is merged into the parent block.
Signed-off-by: Shashij Gupta shashij.gupta@polymagelabs.com
Reviewed By: bondhugula, pr4tgpt
Differential Revision: https://reviews.llvm.org/D104015
Florian Hahn [Sat, 12 Jun 2021 11:03:59 +0000 (12:03 +0100)]
Revert "Allow signposts to take advantage of deferred string substitution"
This reverts commit 4fc93a3a1f95ef5a0a57750fc621f2411ea445a8 because it
breaks LLDB builds on certain macOS platform & SDK combinations, e.g.
http://green.lab.llvm.org/green/job/lldb-cmake-standalone/3288/consoleFull#-195476041949ba4694-19c4-4d7e-bec5-911270d8a58c
Kristina Bessonova [Wed, 19 May 2021 12:12:27 +0000 (14:12 +0200)]
[lit] Attempt to fix tests failing because of 'warning: non-portable path to file'
This is an attempt to fix clang test failures due to 'nonportable-include-path'
warnings on Windows when a path to llvm-project's base directory contains some
uppercase letters (excluding a drive letter).
The issue originates from 2 problems:
* discovery.py loads site config in lower case causing all the paths
based on __file__ and requested within the config file to be in lowercase as well,
* neither os.path.abspath() nor os.path.realpath() (both used to obtain paths of
config files, sources, object directories, etc) returns paths in the correct
case for Windows (at least not consistently for all python versions).
As the os.path library doesn't seem to provide any reliable way to restore
the case for paths on Windows, this patch proposes to use pathlib.resolve().
pathlib has been part of Python since 3.4, while llvm lit requires Python 3.6.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103014
Florian Hahn [Sat, 12 Jun 2021 10:28:08 +0000 (11:28 +0100)]
Revert "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB"
This reverts commit 1b748faf2bae246e2fc77d88420df13c2e60f4df because it
breaks building the llvm-test-suite with -verify-machineinstrs on X86:
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/9585/
Running llc -verify-machineinstrs on X86 crashes on the IR below:
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
%struct.widget = type { i32, i32, i32, i32, i32*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [16 x [16 x i16]], [6 x [32 x i32]], [16 x [16 x i32]], [4 x [12 x [4 x [4 x i32]]]], [16 x i32], i8**, i32*, i32***, i32**, i32, i32, i32, i32, %struct.baz*, %struct.wobble.1*, i32, i32, i32, i32, i32, i32, %struct.quux.2*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x i32], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32***, i32***, i32****, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x [2 x i32]], [3 x [2 x i32]], i32, i32, i64, i64, %struct.zot.3, %struct.zot.3, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
%struct.baz = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, %struct.snork*, %struct.wombat.0*, %struct.wobble*, i32, i32*, i32*, i32*, i32, i32*, i32*, i32*, i32 (%struct.widget*, %struct.eggs*)*, i32, i32, i32, i32 }
%struct.snork = type { %struct.spam*, %struct.zot, i32 (%struct.wombat*, %struct.widget*, %struct.snork*)* }
%struct.spam = type { i32, i32, i32, i32, i8*, i32 }
%struct.zot = type { i32, i32, i32, i32, i32, i8*, i32* }
%struct.wombat = type { i32, i32, i32, i32, i32, i32, i32, i32, void (i32, i32, i32*, i32*)*, void (%struct.wombat*, %struct.widget*, %struct.zot*)* }
%struct.wombat.0 = type { [4 x [11 x %struct.quux]], [2 x [9 x %struct.quux]], [2 x [10 x %struct.quux]], [2 x [6 x %struct.quux]], [4 x %struct.quux], [4 x %struct.quux], [3 x %struct.quux] }
%struct.quux = type { i16, i8 }
%struct.wobble = type { [2 x %struct.quux], [4 x %struct.quux], [3 x [4 x %struct.quux]], [10 x [4 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]] }
%struct.eggs = type { [1000 x i8], [1000 x i8], [1000 x i8], i32, i32, i32, i32, i32, i32, i32, i32 }
%struct.wobble.1 = type { i32, [2 x i32], i32, i32, %struct.wobble.1*, %struct.wobble.1*, i32, [2 x [4 x [4 x [2 x i32]]]], i32, i64, i64, i32, i32, [4 x i8], [4 x i8], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
%struct.quux.2 = type { i32, i32, i32, i32, i32, %struct.quux.2* }
%struct.zot.3 = type { i64, i16, i16, i16 }
define void @blam(%struct.widget* %arg, i32 %arg1) local_unnamed_addr {
bb:
%tmp = load i32, i32* undef, align 4
%tmp2 = sdiv i32 %tmp, 6
%tmp3 = sdiv i32 undef, 6
%tmp4 = load i32, i32* undef, align 4
%tmp5 = icmp eq i32 %tmp4, 4
%tmp6 = select i1 %tmp5, i32 %tmp3, i32 %tmp2
%tmp7 = getelementptr inbounds [4 x [4 x i32]], [4 x [4 x i32]]* undef, i64 0, i64 0, i64 0
%tmp8 = zext i16 undef to i32
%tmp9 = zext i16 undef to i32
%tmp10 = load i16, i16* undef, align 2
%tmp11 = zext i16 %tmp10 to i32
%tmp12 = zext i16 undef to i32
%tmp13 = zext i16 undef to i32
%tmp14 = zext i16 undef to i32
%tmp15 = load i16, i16* undef, align 2
%tmp16 = zext i16 %tmp15 to i32
%tmp17 = zext i16 undef to i32
%tmp18 = sub nsw i32 %tmp8, %tmp9
%tmp19 = shl nsw i32 undef, 1
%tmp20 = add nsw i32 %tmp19, %tmp18
%tmp21 = sub nsw i32 %tmp11, %tmp12
%tmp22 = shl nsw i32 undef, 1
%tmp23 = add nsw i32 %tmp22, %tmp21
%tmp24 = sub nsw i32 %tmp13, %tmp14
%tmp25 = shl nsw i32 undef, 1
%tmp26 = add nsw i32 %tmp25, %tmp24
%tmp27 = sub nsw i32 %tmp16, %tmp17
%tmp28 = shl nsw i32 undef, 1
%tmp29 = add nsw i32 %tmp28, %tmp27
%tmp30 = sub nsw i32 %tmp20, %tmp29
%tmp31 = sub nsw i32 %tmp23, %tmp26
%tmp32 = shl nsw i32 %tmp30, 1
%tmp33 = add nsw i32 %tmp32, %tmp31
store i32 %tmp33, i32* undef, align 4
%tmp34 = mul nsw i32 %tmp31, -2
%tmp35 = add nsw i32 %tmp34, %tmp30
store i32 %tmp35, i32* undef, align 4
%tmp36 = select i1 %tmp5, i32 undef, i32 undef
br label %bb37
bb37: ; preds = %bb
%tmp38 = load i32, i32* undef, align 4
%tmp39 = ashr i32 %tmp38, %tmp6
%tmp40 = load i32, i32* undef, align 4
%tmp41 = sdiv i32 %tmp39, %tmp40
store i32 %tmp41, i32* undef, align 4
ret void
}
Florian Hahn [Sat, 12 Jun 2021 10:27:33 +0000 (11:27 +0100)]
Revert "[X86FixupLEAs] Sub register usage of LEA dest should block LEA/SUB optimization"
This reverts commit f35bcea1d4748889b8240defdf00cb7a71cbe070 because it
depends on 1b748faf2bae246e2fc77d88420df13c2e60f4df, which breaks
building the llvm-test-suite with -verify-machineinstrs on X86.
See 154adc0f135cff3f8a8861c335d2b88c8049d098 for more details.
madhur13490 [Thu, 3 Jun 2021 17:04:10 +0000 (22:34 +0530)]
[AMDGPU][IndirectCalls] Fix register usage propagation for indirect/external calls
This patch computes the max SGPRs and VGPRs used by the module
in the presence of indirect calls and makes that
the register requirement for functions/kernels
which make indirect calls.
This patch also refactors code in AMDGPUSubTarget.cpp,
adding a "base" variant of getMaxNumSGPRs which
is used by the MachineFunction version and the new Function version.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D103636
spupyrev [Sat, 12 Jun 2021 04:45:47 +0000 (21:45 -0700)]
A post-processing for BFI inference
The current implementation for computing relative block frequencies does
not correctly handle control-flow graphs containing irreducible loops. This
results in suboptimally generated binaries, whose perf can be up to 5%
worse than optimal.
To resolve the problem, we apply a post-processing step, which iteratively
updates block frequencies based on the frequencies of their predecessors.
This corresponds to finding the stationary point of the Markov chain by
an iterative method aka "PageRank computation". The algorithm takes at
most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster.
It is turned on by passing option `use-iterative-bfi-inference`
and applied only for functions containing profile data and irreducible loops.
Tested on SPEC06/17, where it is helping to get correct profile counts for one of
the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5%
for binaries containing functions with hot irreducible loops.
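The iterative update described above can be sketched as a plain fixed-point iteration. This is a simplified illustration under assumptions, not the code from the patch: `Edge`, `inferFrequencies`, the tolerance, and the convention that block 0 is the entry with an external inflow of 1 are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CFG edge: From -> To, taken with probability Prob.
struct Edge {
  std::size_t From, To;
  double Prob;
};

// Repeatedly recomputes every block's frequency from its predecessors
// until the largest change drops below Tolerance or MaxIterations is hit.
// Each sweep costs O(|E| + |V|).
inline std::vector<double>
inferFrequencies(std::size_t NumBlocks, const std::vector<Edge> &Edges,
                 std::size_t MaxIterations = 1000, double Tolerance = 1e-12) {
  assert(NumBlocks > 0 && "need at least the entry block");
  std::vector<double> Freq(NumBlocks, 0.0);
  Freq[0] = 1.0; // block 0 is assumed to be the entry
  for (std::size_t Iter = 0; Iter < MaxIterations; ++Iter) {
    std::vector<double> Next(NumBlocks, 0.0);
    Next[0] = 1.0; // external inflow into the entry block
    for (const Edge &E : Edges)
      Next[E.To] += Freq[E.From] * E.Prob;
    double Change = 0.0;
    for (std::size_t I = 0; I < NumBlocks; ++I)
      Change = std::max(Change, std::fabs(Next[I] - Freq[I]));
    Freq.swap(Next);
    if (Change < Tolerance)
      break;
  }
  return Freq;
}
```

For a four-block graph with edges 0->1 (1.0), 1->2 (0.5), 1->3 (0.5) and 2->1 (1.0), the fixed point satisfies f(1) = 1 + f(2) and f(2) = 0.5 * f(1), so the frequencies converge to 1, 2, 1 and 1.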
Reviewed By: hoy, wenlei, davidxl
Differential Revision: https://reviews.llvm.org/D103289
Michael Kruse [Sat, 12 Jun 2021 04:25:33 +0000 (23:25 -0500)]
[Flang][test] Fix Windows buildbot.
Commit 1b241b9b400bdfc5b8e0d157f0f46436677927b8 /
patch https://reviews.llvm.org/D104130 introduced a new test which
calls a UNIX shell script. Add
REQUIRES: shell
to not run it on Windows.
Stephen Neuendorffer [Fri, 11 Jun 2021 23:58:07 +0000 (16:58 -0700)]
[mlir] make normalizeAffineFor public
Previously this was just a static method.
Adrian Prantl [Sat, 12 Jun 2021 00:58:05 +0000 (17:58 -0700)]
Improve materializer error messages to include type names.
rdar://79201552
Alexander Shaposhnikov [Sat, 12 Jun 2021 00:47:28 +0000 (17:47 -0700)]
[lld][MachO] Fix function starts section
Sort the addresses stored in FunctionStarts section.
Previously we were encoding potentially large numbers (due to unsigned overflow).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D103662
Jez Ng [Sat, 12 Jun 2021 00:18:00 +0000 (20:18 -0400)]
[lld-macho] Fix debug build
D103977 broke a bunch of stuff as I had only tested the release build
which eliminated asserts.
I've retained the asserts where possible, but I also removed a bunch
instead of adding a whole lot of verbose ConcatInputSection casts.
Uday Bondhugula [Sat, 5 Jun 2021 14:24:34 +0000 (19:54 +0530)]
[MLIR] Execution engine python binding support for shared libraries
Add support to Python bindings for the MLIR execution engine to load a
specified list of shared libraries, e.g., to use MLIR runtime
utility libraries.
Differential Revision: https://reviews.llvm.org/D104009
Kai Luo [Fri, 11 Jun 2021 23:21:40 +0000 (23:21 +0000)]
[AIX][compiler-rt] Fix cmake build of libatomic for cmake-3.16+
cmake-3.16+ for AIX changes the default behavior of building a `SHARED` library, which breaks AIX's build of libatomic, i.e., cmake-3.16+ builds `SHARED` as an archive of dynamic libraries. To fix it, we have to build `libatomic.so.1` as `MODULE`, which keeps `libatomic.so.1` as a normal dynamic library.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D103786
Adrian Prantl [Fri, 11 Jun 2021 22:18:25 +0000 (15:18 -0700)]
Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
This reapplies the previously reverted patch with support for
platforms where signposts are unavailable.
Differential Revision: https://reviews.llvm.org/D103575
Jez Ng [Fri, 11 Jun 2021 23:49:54 +0000 (19:49 -0400)]
[lld-macho] Have dead-stripping work with literal sections
Literal sections are not atomically live or dead. Rather,
liveness is tracked for each individual literal they contain. CStrings
have their liveness tracked via a `live` bit in StringPiece, and
fixed-width literals have theirs tracked via a BitVector.
The live-marking code now needs to track the offset within each section
that is to be marked live, in order to identify the literal at that
particular offset.
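How an offset identifies one fixed-width literal can be sketched with one bit per literal. This assumes a section made only of equal-width literals; `WordLiteralLiveness` and its members are hypothetical names, and std::vector<bool> stands in for LLVM's BitVector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-literal liveness for a section of fixed-width literals.
class WordLiteralLiveness {
  std::size_t Width = 0;
  std::vector<bool> Live; // one bit per literal

public:
  WordLiteralLiveness(std::size_t SectionSize, std::size_t LiteralWidth) {
    assert(LiteralWidth != 0 && SectionSize % LiteralWidth == 0 &&
           "section must hold a whole number of literals");
    Width = LiteralWidth;
    Live.assign(SectionSize / LiteralWidth, false);
  }

  // Marks the literal containing the byte at Offset as live.
  void markLive(uint64_t Offset) {
    assert(Offset < Live.size() * Width && "offset outside the section");
    Live[Offset / Width] = true;
  }

  bool isLive(std::size_t Index) const { return Live[Index]; }

  std::size_t liveCount() const {
    return static_cast<std::size_t>(std::count(Live.begin(), Live.end(), true));
  }
};
```

For a 32-byte section of 8-byte literals, marking offset 17 makes only the literal at index 2 live; the other three stay dead and can be stripped.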
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W
with both `-dead_strip` and `--deduplicate-literals`, with and without this diff
applied:
```
    N           Min           Max        Median           Avg        Stddev
x  20          4.32          4.44         4.375         4.372    0.03105174
+  20           4.3          4.39          4.36        4.3595   0.023277502
No difference proven at 95.0% confidence
```
This gives us size savings of about 0.4%.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D103979
Jez Ng [Fri, 11 Jun 2021 23:49:53 +0000 (19:49 -0400)]
[lld-macho][nfc] Have InputSection ctors take some parameters
This is motivated by an upcoming diff in which the
WordLiteralInputSection ctor sets itself up based on the value of its
section flags. As such, it needs to be passed the `flags` value as part
of its ctor parameters, instead of having them assigned after the fact
in `parseSection()`. While refactoring code to make that possible, I
figured it would make sense for the other InputSections to also take
their initial values as ctor parameters.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D103978
Jez Ng [Fri, 11 Jun 2021 23:49:52 +0000 (19:49 -0400)]
[lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection
These fields currently live in the parent InputSection class,
but they should be specific to ConcatInputSection, since the other
InputSection classes (that contain literals) aren't atomically live or
dead -- rather their component string/int literals should have
individual liveness states. (An upcoming diff will add liveness bits for
StringPieces and fixed-sized literals.)
I also factored out some asserts for isCoalescedWeak() in MarkLive.cpp.
We now avoid putting coalesced sections in the `inputSections` vector,
so we don't have to check/assert against it everywhere.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D103977
Jez Ng [Fri, 11 Jun 2021 23:49:50 +0000 (19:49 -0400)]
[lld-macho] Deduplicate fixed-width literals
Conceptually, the implementation is pretty straightforward: we put each
literal value into a hashtable, and then write out the keys of that
hashtable at the end.
In contrast with ELF, the Mach-O format does not support variable-length
literals that aren't strings. Its literals are either 4, 8, or 16 bytes
in length. LLD-ELF dedups its literals via sorting + uniq'ing, but since
we don't need to worry about overly-long values, we should be able to do
a faster job by just hashing.
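The hashing approach can be sketched for 8-byte literals as follows. This is a serial illustration only; `LiteralTable`, `intern` and `offsetOf` are hypothetical names rather than the LLD data structures, and unique values are assumed to be written out in first-seen order.

```cpp
#include <cassert> // for the checks in the usage note
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical dedup table for 8-byte literals.
struct LiteralTable {
  std::unordered_map<uint64_t, uint32_t> IndexOf; // value -> output slot
  std::vector<uint64_t> Output;                   // unique values, in order

  // Returns the output slot for Value, adding the value if it is new.
  uint32_t intern(uint64_t Value) {
    auto [It, Inserted] =
        IndexOf.try_emplace(Value, static_cast<uint32_t>(Output.size()));
    if (Inserted)
      Output.push_back(Value);
    return It->second;
  }

  // Byte offset of Value within the deduplicated output section.
  uint64_t offsetOf(uint64_t Value) {
    return uint64_t(intern(Value)) * sizeof(uint64_t);
  }
};
```

Interning 5, 7 and then 5 again yields slots 0, 1 and 0, so `Output` holds two values and both references to 5 resolve to offset 0.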
That said, the implementation right now is far from optimal, because we
add to those hashtables serially. To parallelize this, we'll need a
basic concurrent hashtable (only needs to support concurrent writes w/o
interleaved reads), which shouldn't be too hard to implement, but I'd like
to punt on it for now.
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:
    N           Min           Max        Median           Avg        Stddev
x  20          4.27          4.39         4.315        4.3225   0.033225703
+  20          4.36          4.82          4.44        4.4845    0.13152846
Difference at 95.0% confidence
0.162 +/- 0.0613971
3.74783% +/- 1.42041%
(Student's t, pooled s = 0.0959262)
This corresponds to binary size savings of 2MB out of 335MB, or 0.6%.
It's not a great tradeoff as-is, but as mentioned our implementation can
be significantly optimized, and literal dedup will unlock more
opportunities for ICF to identify identical structures that reference
the same literals.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D103113
Adrian Prantl [Fri, 11 Jun 2021 23:46:10 +0000 (16:46 -0700)]
Revert "Allow signposts to take advantage of deferred string substitution"
I forgot to make the LLDB macro conditional on Linux.
This reverts commit 541ccd1c1bb23e1e20a382844b35312c0caffd79.
Andrew Litteken [Fri, 11 Jun 2021 23:15:29 +0000 (18:15 -0500)]
[IRSim] Strip out the findSimilarity call from the constructor
Both doInitialize and runOnModule were running the entire analysis
due to the actual work being done in the constructor. Strip it out here
and only get the similarity during runOnModule.
Author: lanza
Reviewers: AndrewLitteken, paquette, plofti
Differential Revision: https://reviews.llvm.org/D92524
Adrian Prantl [Fri, 11 Jun 2021 22:51:35 +0000 (15:51 -0700)]
Disambiguate usage of struct mach_header and other MachO definitions.
Unfortunately the Darwin signpost header also pulls in the system
MachO header and so we need to make sure to use the LLVM versions of
those definitions.
Adrian Prantl [Fri, 11 Jun 2021 22:18:25 +0000 (15:18 -0700)]
Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
Differential Revision: https://reviews.llvm.org/D103575
Alexander Shaposhnikov [Fri, 11 Jun 2021 23:34:59 +0000 (16:34 -0700)]
[llvm-objcopy][MachO] Do not strip symbols with the flag REFERENCED_DYNAMICALLY set
Do not strip symbols having the flag REFERENCED_DYNAMICALLY set.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D104092
Reid Kleckner [Fri, 11 Jun 2021 23:12:07 +0000 (16:12 -0700)]
[ASan/Win] Hide index from compiler to avoid new clang warning
Arthur Eubanks [Fri, 11 Jun 2021 22:59:20 +0000 (15:59 -0700)]
[NFC][OpaquePtr] Make getMemoryParamAllocType() compatible with opaque pointers
These ABI attributes now always require the type parameter.
sret was missing from the first set of checks but was covered by the
second set.
Sanjay Patel [Fri, 11 Jun 2021 17:57:56 +0000 (13:57 -0400)]
[InstCombine] add tests for bit manipulation intrinsics with bool values; NFC
Sanjay Patel [Fri, 11 Jun 2021 17:43:34 +0000 (13:43 -0400)]
[InstCombine] update test checks; NFC
Kevin Athey [Fri, 11 Jun 2021 20:39:35 +0000 (13:39 -0700)]
[sanitizer] Remove numeric values from -asan-use-after-return flag. (NFC)
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104152
Kevin Athey [Fri, 11 Jun 2021 19:46:15 +0000 (12:46 -0700)]
[sanitizer] Replace -mllvm -asan-use-after-return in compile-rt tests with -fsanitize-address-use-after-return (NFC)
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104146
Andrew Litteken [Fri, 11 Jun 2021 20:41:36 +0000 (15:41 -0500)]
[IRSim] Don't copy the Mapper for createCandidatesFromSuffixTree
Every invocation this was copying the Mapper for no reason. Take a const
ref instead.
Author: lanza
Reviewers: AndrewLitteken, plofti, paquette,
Differential Review: https://reviews.llvm.org/D92532
Raphael Isemann [Fri, 11 Jun 2021 20:43:38 +0000 (22:43 +0200)]
[lldb] Remove GCC XFAIL for TestCPPAuto and TestClassTemplateParameterPack
Both tests are passing for GCC>8 on Linux so let's mark them as passing.
TestCPPAuto was originally disabled due to "an problem with debug info generation"
in ea35dbeff29f3095df3ad1d77cce3d9e5b197e7c.
TestClassTemplateParameterPack was disabled without explanation in
0f01fb39e3fe3d8e99df1dd185e75ad584b777b3.
Roman Lebedev [Fri, 11 Jun 2021 20:26:17 +0000 (23:26 +0300)]
[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated
The motivation is that the update script has at least two deviations
(`<...>@GOT`/`<...>@PLT` and not hiding pointer arithmetic) from
what pretty much all the checklines were generated with,
and most of the tests are still not updated, so each time one of the
non-up-to-date tests is updated to see the effect of the code change,
there is a lot of noise. Instead of having to deal with that each
time, let's just deal with everything at once.
This has been done via:
```
cd llvm-project/llvm/test/CodeGen/X86
grep -rl "; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py" | xargs -L1 <...>/llvm-project/llvm/utils/update_llc_test_checks.py --llc-binary <...>/llvm-project/build/bin/llc
```
Not all tests were regenerated, however.
Daniil Fukalov [Tue, 8 Jun 2021 16:53:28 +0000 (19:53 +0300)]
[NFC][CostModel] Fixed comment that comparisons work regardless of the state.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D104068
Andrew Litteken [Fri, 11 Jun 2021 20:44:08 +0000 (15:44 -0500)]
Revert "[IRSim] Adding basic implementation of llvm-sim."
This reverts commit f47d00c54b52bd8adf9b8725912ea1cd0f1873d5.
Philip Reames [Fri, 11 Jun 2021 20:30:10 +0000 (13:30 -0700)]
Allow ptrtoint/inttoptr of non-integral pointer types in IR
I don't like landing this change, but it's an acknowledgement of a practical reality. Despite not having well specified semantics for inttoptr and ptrtoint involving non-integral pointer types, they are used in practice. Here's a quick summary of the current pragmatic reality:
* I happen to know that the main external user of non-integral pointers has effectively disabled the verifier rules.
* RS4GC (the lowering pass for the abstract GC machine model, which is the key motivation for non-integral pointers) even supports them. We just have all the tests using an integral pointer space to let the verifier run.
* Certain idioms (such as alignment checks for alignment N, where any relocation is guaranteed to be N byte aligned) are fine in practice.
* As implemented, inttoptr/ptrtoint are CSEd and are not control dependent. This means that any code which is intending to check a particular bit pattern at site of use must be wrapped in an intrinsic or external function call.
This change allows them in the Verifier, and updates the LangRef to specify them as implementation dependent. This allows us to acknowledge current reality while still leaving ourselves room to punt on figuring out "good" semantics until the future.
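The alignment-check idiom mentioned in the list above can be sketched in C++ as follows. `isAlignedTo` is a hypothetical helper, and the sketch assumes `N` is a power of two; it only illustrates the kind of pointer-to-integer use described as fine in practice and is not taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if P is N-byte aligned.
// Assumes N is a nonzero power of two.
inline bool isAlignedTo(const void *P, std::uintptr_t N) {
  assert(N != 0 && (N & (N - 1)) == 0);
  // The pointer-to-integer cast plays the role that ptrtoint plays in IR.
  return (reinterpret_cast<std::uintptr_t>(P) & (N - 1)) == 0;
}
```

The result depends only on the low bits, which stay the same if the object is relocated to another N-byte-aligned address.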
Alex Lorenz [Fri, 11 Jun 2021 20:23:44 +0000 (13:23 -0700)]
[clang][ObjC] allow the use of NSAttributedString * argument type with format attribute
This is useful for APIs that want to accept an attributed NSString as their format string
rdar://79163229
Andrew Litteken [Mon, 7 Jun 2021 15:57:39 +0000 (10:57 -0500)]
[IRSim] Adding basic implementation of llvm-sim.
This is a similarity visualization tool that accepts a Module and
passes it to the IRSimilarityIdentifier. The resulting SimilarityGroups
are output in a JSON file.
Tests are found in test/tools/llvm-sim and check for the file not found,
a bad module, and that the JSON is created correctly.
Reviewers: paquette, jroelofs, MaskRay
Recommit of 15645d044bcfe2a0f63156048b302f997a717688 to fix linking
errors.
Differential Revision: https://reviews.llvm.org/D86974
Arthur Eubanks [Fri, 11 Jun 2021 19:51:25 +0000 (12:51 -0700)]
[docs][OpaquePtr] Add some specific examples of what needs to be done
Marius Brehler [Fri, 11 Jun 2021 17:28:30 +0000 (19:28 +0200)]
[mlir][docs] Reorder PassWrapper arguments
Fixes the order of template arguments passed to the `PassWrapper`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D104132
Peter Steinfeld [Fri, 11 Jun 2021 16:28:26 +0000 (09:28 -0700)]
[flang] Handle multiple USE statements for the same module
It's possible to have several USE statements for the same module that
have different mixes of rename clauses and ONLY clauses. The presence
of a rename clause has the effect of hiding a previously associated name,
and the presence of an ONLY clause forces the name to be visible even in
the presence of a rename.
I fixed this by keeping track of the names that appear on rename and ONLY
clauses. Then, when processing the USE association of a name, I check to see
if it previously appeared in a rename clause and not in an ONLY clause. If so, I
remove its USE associated symbol. Also, when USE associating all of the names
in a module, I do not USE associate names that have appeared in rename clauses.
I also added a test.
Differential Revision: https://reviews.llvm.org/D104130
Kevin Athey [Fri, 11 Jun 2021 17:13:10 +0000 (10:13 -0700)]
[clang-cl][sanitizer] Add -fsanitize-address-use-after-return to clang.
Also:
- add driver test (fsanitize-use-after-return.c)
- add basic IR test (asan-use-after-return.cpp)
- (NFC) cleaned up logic for generating table of __asan_stack_malloc
depending on flag.
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104076
Arthur Eubanks [Tue, 18 May 2021 22:09:06 +0000 (15:09 -0700)]
[NFC][OpaquePtr] Explicitly pass GEP source type in optimizeGatherScatterInst()
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103480
John Paul Adrian Glaubitz [Fri, 11 Jun 2021 18:44:04 +0000 (19:44 +0100)]
[compiler-rt] Add platform detection support for x32
Currently, the compiler-rt build system checks only whether __X86_64
is defined to determine whether the default compiler-rt target arch
is x86_64. Since x32 defines __X86_64 as well, we must also check that
the default pointer size is eight bytes and not four bytes to properly
detect a 64-bit x86_64 compiler-rt default target arch.
Reviewed By: hvdijk, vitalybuka
Differential Revision: https://reviews.llvm.org/D99988
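A minimal sketch of the distinction described above, assuming the check ultimately rests on the `__x86_64__` macro that GCC and Clang predefine for both x86_64 and x32. `X86Flavor`, `classifyX86`, and `hostFlavor` are hypothetical names and are not part of the compiler-rt build system:

```cpp
#include <cassert>
#include <cstddef>

enum class X86Flavor { NotX86_64, X32, X86_64 };

// Hypothetical classifier: the macro alone cannot tell x32 from x86_64,
// so the pointer size is consulted as well.
inline X86Flavor classifyX86(bool DefinesX86_64, std::size_t PointerBytes) {
  assert(PointerBytes == 4 || PointerBytes == 8);
  if (!DefinesX86_64)
    return X86Flavor::NotX86_64;
  return PointerBytes == 8 ? X86Flavor::X86_64 : X86Flavor::X32;
}

// Applies the classifier to the compiler that builds this file.
inline X86Flavor hostFlavor() {
#if defined(__x86_64__)
  return classifyX86(true, sizeof(void *));
#else
  return classifyX86(false, sizeof(void *));
#endif
}
```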
LLVM GN Syncbot [Fri, 11 Jun 2021 18:04:01 +0000 (18:04 +0000)]
[gn build] Port 7eba4856c702
zoecarver [Thu, 6 May 2021 20:19:13 +0000 (13:19 -0700)]
[libcxx][ranges] Add class ref_view.
Differential Revision: https://reviews.llvm.org/D102020
Matt Arsenault [Thu, 10 Jun 2021 21:39:51 +0000 (17:39 -0400)]
AMDGPU/GlobalISel: Remove leftover hack for argument memory sizes
Since the call lowering code now tries to respect the tablegen
reported argument types, this is no longer necessary.
Matt Arsenault [Thu, 10 Jun 2021 16:00:36 +0000 (12:00 -0400)]
AMDGPU/GlobalISel: Fix indentation
Matt Arsenault [Thu, 10 Jun 2021 01:22:00 +0000 (21:22 -0400)]
GlobalISel: Reduce indentation and remove dead path
Matt Arsenault [Tue, 8 Jun 2021 21:10:51 +0000 (17:10 -0400)]
CodeGen: Fix missing const
eahcmrh [Tue, 8 Jun 2021 17:00:42 +0000 (19:00 +0200)]
[Sema] Address-space sensitive check for unbounded arrays (v2)
Check applied to unbounded (incomplete) arrays and pointers to spot
cases where the computed address is beyond the largest possible
addressable extent of the array, based on the address space in which the
array is declared, or which the pointer refers to.
Check helps to avoid cases of nonsense pointer math and array indexing
which could lead to linker failures or runtime exceptions. Of
particular interest when building for embedded systems with small
address spaces.
This is version 2 of this patch -- version 1 had some testing issues
due to a sign error in existing code. That error is corrected and
lit test for this change is extended to verify the fix.
Originally reviewed/accepted by: aaron.ballman
Original revision: https://reviews.llvm.org/D86796
Reviewed By: aaron.ballman, ebevhan
Differential Revision: https://reviews.llvm.org/D88174
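The arithmetic behind such a check can be sketched as below. This is an assumption about the general idea (the index is compared against the number of elements that fit in the address space), not a copy of the Sema code; `indexBeyondAddressSpace` and its parameters are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if element Index of an unbounded array cannot
// exist in an address space that is AddrBits wide.
inline bool indexBeyondAddressSpace(std::uint64_t Index,
                                    std::uint64_t ElemBytes,
                                    unsigned AddrBits) {
  assert(ElemBytes > 0 && AddrBits > 0 && AddrBits < 64);
  std::uint64_t SpaceBytes = std::uint64_t(1) << AddrBits;
  // Largest element count that fits in the address space.
  std::uint64_t MaxElems = SpaceBytes / ElemBytes;
  return Index >= MaxElems;
}
```

For a 16-bit address space and 4-byte elements, at most 16384 elements fit, so index 16384 is already out of range.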
Denys Shabalin [Fri, 11 Jun 2021 15:53:45 +0000 (17:53 +0200)]
Introduce alloca_scope op
## Introduction
This proposal describes the new op to be added to the `std` (and later moved to the `memref`)
dialect called `alloca_scope`.
## Motivation
Alloca operations are easy to misuse, especially if one relies on them while doing
rewriting/conversion passes. For example, consider two
independent dialects: one defines an op that wants to allocate on-stack and
another defines a construct that corresponds to some form of looping:
```
dialect1.looping_op {
%x = dialect2.stack_allocating_op
}
```
Since the dialects might not know about each other they are going to define a
lowering to std/scf/etc independently:
```
scf.for … {
%x_temp = std.alloca …
… // do some domain-specific work using %x_temp buffer
… // and store the result into %result
%x = %result
}
```
Later on, the `scf` and `std.alloca` ops are going to be lowered to llvm using a
combination of `llvm.alloca` and unstructured control flow.
At this point the use of `%x_temp` is bound to either be optimized by
llvm (for example using mem2reg) or, in the worst case, perform an independent
stack allocation on each iteration of the loop. While the llvm optimizations are
likely to succeed they are not guaranteed to do so, and they provide
opportunities for surprising issues with unexpected use of stack size.
## Proposal
We propose a new operation that defines a finer-grain allocation scope for the
alloca-allocated memory called `alloca_scope`:
```
alloca_scope {
%x_temp = alloca …
...
}
```
Here the lifetime of `%x_temp` is going to be bound to the narrow annotated
region within `alloca_scope`. Moreover, one can also return values out of the
alloca_scope with an accompanying `alloca_scope.return` op (that behaves
similarly to `scf.yield`):
```
%result = alloca_scope {
%x_temp = alloca …
…
alloca_scope.return %myvalue
}
```
Under the hood the `alloca_scope` is going to be lowered to a combination of
`llvm.intr.stacksave` and `llvm.intr.stackrestore` that are going to be invoked
automatically as control-flow enters and leaves the body of the `alloca_scope`.
The key value of the new op is to allow deterministic guaranteed stack use
through an explicit annotation in the code which is finer-grain than the
function-level scope of `AutomaticAllocationScope` interface. `alloca_scope`
can be inserted at arbitrary locations and doesn’t require non-trivial
transformations such as outlining.
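The save/restore behaviour described above can be modelled with a small C++ simulation. This is only a sketch under stated assumptions: `BumpStack`, `AllocaScope`, `peakWithoutScope`, and `peakWithScope` are hypothetical names, and the stack is a byte counter rather than real memory:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a stack pointer; tracks the current and peak depth.
class BumpStack {
  std::size_t Top = 0;
  std::size_t Peak = 0;

public:
  std::size_t save() const { return Top; } // models a stack save
  void restore(std::size_t Saved) {        // models a stack restore
    assert(Saved <= Top);
    Top = Saved;
  }
  void allocate(std::size_t Bytes) {       // models an alloca
    Top += Bytes;
    if (Top > Peak)
      Peak = Top;
  }
  std::size_t peak() const { return Peak; }
};

// RAII guard: saves on entry to the scope, restores on exit.
struct AllocaScope {
  BumpStack &Stack;
  std::size_t Saved;
  explicit AllocaScope(BumpStack &S) : Stack(S), Saved(S.save()) {}
  ~AllocaScope() { Stack.restore(Saved); }
};

// Allocation in a loop with no scope: usage grows with the trip count.
std::size_t peakWithoutScope(int Iterations, std::size_t Bytes) {
  BumpStack S;
  for (int I = 0; I < Iterations; ++I)
    S.allocate(Bytes);
  return S.peak();
}

// Same loop with a scope per iteration: usage stays at one allocation.
std::size_t peakWithScope(int Iterations, std::size_t Bytes) {
  BumpStack S;
  for (int I = 0; I < Iterations; ++I) {
    AllocaScope Scope(S);
    S.allocate(Bytes);
  }
  return S.peak();
}
```

With ten iterations of 64 bytes, the unscoped loop peaks at 640 bytes while the scoped loop peaks at 64, which is the bounded stack use the proposal is after.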
## Which dialect
Before memref dialect is split, `alloca_scope` can temporarily reside in `std`
dialect, and later on be moved to `memref` together with the rest of
memory-related operations.
## Implementation
An implementation of the op is available [here](https://reviews.llvm.org/D97768).
Original commits:
* Add initial scaffolding for alloca_scope op
* Add alloca_scope.return op
* Add no region arguments and variadic results
* Add op descriptions
* Add failing test case
* Add another failing test
* Initial implementation of lowering for std.alloca_scope
* Fix backticks
* Fix getSuccessorRegions implementation
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D97768
Jonas Devlieghere [Wed, 10 Feb 2021 03:52:11 +0000 (19:52 -0800)]
[lldb] Support new objective-c hash table layout
Update LLDB for the new Objective-C hash table layout in the dyld
shared cache found in macOS Monterey.
rdar://72863911
Jonas Devlieghere [Fri, 11 Jun 2021 17:24:01 +0000 (10:24 -0700)]
[lldb] Enable TestRuntimeTypes on Apple Silicon
Valery N Dmitriev [Fri, 11 Jun 2021 16:35:08 +0000 (09:35 -0700)]
[SLP][NFC] Fix condition that was supposed to save a bit of compile time.
It was found by chance, revealing a discrepancy between the comment (a few lines above),
the condition, and how re-ordering of instructions is done inside the if statement
it guards. The condition always evaluated to true.
Differential Revision: https://reviews.llvm.org/D104064
LLVM GN Syncbot [Fri, 11 Jun 2021 16:57:34 +0000 (16:57 +0000)]
[gn build] Port c54d3050f7b9
Louis Dionne [Thu, 10 Jun 2021 17:38:55 +0000 (13:38 -0400)]
[libc++] NFC: Move indirect_concepts.h to __iterator/concepts.h
There's no fundamental reason to separate those from the other iterator
concepts.
Differential Revision: https://reviews.llvm.org/D104048
Guozhi Wei [Fri, 11 Jun 2021 16:43:52 +0000 (09:43 -0700)]
[X86FixupLEAs] Sub register usage of LEA dest should block LEA/SUB optimization
In function searchALUInst, sub register usage of LEA dest should also block LEA/SUB optimization, otherwise the sub register usage gets an undefined value.
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=50615.
Differential Revision: https://reviews.llvm.org/D103922
Louis Dionne [Tue, 16 Feb 2021 16:24:27 +0000 (11:24 -0500)]
[libc++] Enable the synchronization library on Apple platforms
The synchronization library was marked as disabled on Apple platforms
up to now because we were not 100% sure that it was going to be ABI
stable. However, it's been some time since we shipped it in upstream
libc++ now and there have been no changes so far. This patch enables the
synchronization library on Apple platforms, and hence commits to ABI
stability as far as that vendor is concerned.
Differential Revision: https://reviews.llvm.org/D96790
LLVM GN Syncbot [Fri, 11 Jun 2021 16:34:49 +0000 (16:34 +0000)]
[gn build] Port 9106047ee3dd
zoecarver [Mon, 3 May 2021 21:37:42 +0000 (14:37 -0700)]
[libcxx][ranges] Add range.subrange.
Basically the title.
Differential Revision: https://reviews.llvm.org/D102006
Adam Nemet [Fri, 11 Jun 2021 15:50:07 +0000 (08:50 -0700)]
[Matrix] In transpose opts, handle a^t * a^t
Without the fix the testcase crashes because we remove the same instruction
twice.
Differential Revision: https://reviews.llvm.org/D104127
Aaron En Ye Shi [Thu, 10 Jun 2021 21:24:38 +0000 (21:24 +0000)]
[HIP] Fix --hip-version flag with 0 as component
Allow the usage of minor version 0, for hip versions
such as 4.0. Change the default values when performing
version checks.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D104062
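One way to keep an explicit 0 distinct from "not specified" is to use a sentinel other than 0 as the default, sketched below. This is an assumption about the approach, not the driver's code; `HipVersion` and `parseHipVersion` are hypothetical, and -1 is an arbitrary sentinel:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical version record; -1 means "component not specified".
struct HipVersion {
  int Major = -1;
  int Minor = -1;
  int Patch = -1;
};

// Parses "major[.minor[.patch]]". A component of 0 is valid and is kept
// distinct from a missing component.
inline bool parseHipVersion(const std::string &Text, HipVersion &Out) {
  int Parts[3] = {-1, -1, -1};
  std::size_t Pos = 0;
  for (int I = 0; I < 3; ++I) {
    std::size_t End = Text.find('.', Pos);
    std::string Piece = Text.substr(
        Pos, End == std::string::npos ? std::string::npos : End - Pos);
    if (Piece.empty() || Piece.size() > 6)
      return false; // empty or implausibly long component
    int Value = 0;
    for (char C : Piece) {
      if (C < '0' || C > '9')
        return false;
      Value = Value * 10 + (C - '0');
    }
    Parts[I] = Value;
    if (End == std::string::npos) {
      assert(Parts[0] >= 0); // the major component is always set here
      Out.Major = Parts[0];
      Out.Minor = Parts[1];
      Out.Patch = Parts[2];
      return true;
    }
    Pos = End + 1;
  }
  return false; // more than three components
}
```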
Ayush Sahay [Fri, 11 Jun 2021 15:52:15 +0000 (21:22 +0530)]
[lldb-vscode] Synchronize calls to SendTerminatedEvent
If an inferior exits prior to the processing of a disconnect request,
then the threads executing EventThreadFunction and request_discontinue
respectively may call SendTerminatedEvent simultaneously, in turn,
testing and/or setting g_vsc.sent_terminated_event without any
synchronization. In case the thread executing EventThreadFunction sets
it before the thread executing request_discontinue has had a chance to
test it, the latter would move ahead to issue a response to the
disconnect request. Said response may be dispatched ahead of the
terminated event compelling the client to terminate the debug session
without consuming any console output that might've been generated by
the execution of terminateCommands.
Reviewed By: clayborg, wallace
Differential Revision: https://reviews.llvm.org/D103609
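The race described above can be closed by doing the test, the set, and the send under one lock, as in the sketch below. This is not the lldb-vscode code; `Session`, `Outbox`, `sendTerminatedEvent`, and `respondToDisconnect` are hypothetical, and a mutex is only one of several ways to synchronize:

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical debug-adapter session with an ordered outgoing queue.
struct Session {
  std::mutex Mutex;
  bool SentTerminated = false;
  std::vector<std::string> Outbox;

  // Sends the terminated event at most once. The flag is tested and set,
  // and the event queued, while holding the lock, so a second caller
  // cannot return before the event is in the queue.
  void sendTerminatedEvent() {
    std::lock_guard<std::mutex> Lock(Mutex);
    if (SentTerminated)
      return;
    SentTerminated = true;
    Outbox.push_back("event:terminated");
  }

  // Queues the disconnect response only after the terminated event.
  void respondToDisconnect() {
    sendTerminatedEvent();
    std::lock_guard<std::mutex> Lock(Mutex);
    assert(SentTerminated); // the event is already queued at this point
    Outbox.push_back("response:disconnect");
  }
};
```

Whichever thread loses the race blocks on the lock until the winner has queued the event, so the response can no longer overtake it.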
Aaron Ballman [Fri, 11 Jun 2021 16:06:21 +0000 (12:06 -0400)]
Update the C status page somewhat.
This adds implementation information for N2607,
clarifies that C17 only resolved defect reports,
and adds -std= information for the different versions.
Tomas Matheson [Tue, 20 Apr 2021 18:03:09 +0000 (19:03 +0100)]
[CodeGen][regalloc] Don't align stack slots if the stack can't be realigned
Register allocation may spill virtual registers to the stack, which can
increase alignment requirements of the stack frame. If the function
did not require stack realignment before register allocation, the
registers required to do so may not be reserved/available. This results
in a stack frame that requires realignment but can not be realigned.
Instead, only increase the alignment of the stack if we are still able
to realign.
The register SpillAlignment will be ignored if we can't realign, and the
backend will be responsible for emitting the correct unaligned loads and
stores. This seems to be the assumed behaviour already, e.g.
ARMBaseInstrInfo::storeRegToStackSlot and X86InstrInfo::storeRegToStackSlot
are both `canRealignStack` aware.
Differential Revision: https://reviews.llvm.org/D103602
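The rule described above can be sketched as a small helper. `spillSlotAlign` and its parameters are hypothetical and only model the decision; they are not the register allocator's API:

```cpp
#include <cassert>

// Hypothetical helper: alignment to give a spill slot.
// Requested  - alignment the spilled register class would like
// StackAlign - alignment the stack frame already guarantees
inline unsigned spillSlotAlign(unsigned Requested, unsigned StackAlign,
                               bool CanRealignStack) {
  assert(Requested != 0 && StackAlign != 0);
  // Without realignment the frame cannot honor a larger request, so fall
  // back to the existing stack alignment and leave it to the backend to
  // emit unaligned loads and stores.
  if (Requested > StackAlign && !CanRealignStack)
    return StackAlign;
  return Requested;
}
```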
Alexey Bataev [Wed, 9 Jun 2021 19:37:34 +0000 (12:37 -0700)]
[SLP]Allow reordering of insertelements.
After we added support for non-ordered insertelements, we can allow
their reordering.
Differential Revision: https://reviews.llvm.org/D104057
eahcmrh [Fri, 11 Jun 2021 15:44:06 +0000 (17:44 +0200)]
Revert "[Sema] Address-space sensitive check for unbounded arrays (v2)"
This reverts commit e42a347b74400b7212ceaaea6d39562a0435df42.
eahcmrh [Tue, 8 Jun 2021 17:00:42 +0000 (19:00 +0200)]
[Sema] Address-space sensitive check for unbounded arrays (v2)
Check applied to unbounded (incomplete) arrays and pointers to spot
cases where the computed address is beyond the largest possible
addressable extent of the array, based on the address space in which the
array is declared, or which the pointer refers to.
Check helps to avoid cases of nonsense pointer math and array indexing
which could lead to linker failures or runtime exceptions. Of
particular interest when building for embedded systems with small
address spaces.
This is version 2 of this patch -- version 1 had some testing issues
due to a sign error in existing code. That error is corrected and
lit test for this change is extended to verify the fix.
Originally reviewed/accepted by: aaron.ballman
Original revision: https://reviews.llvm.org/D86796
Reviewed By: aaron.ballman, ebevhan
Differential Revision: https://reviews.llvm.org/D88174
Matt Morehouse [Fri, 11 Jun 2021 15:19:36 +0000 (08:19 -0700)]
[HWASan] Add basic stack tagging support for LAM.
Adds the basic instrumentation needed for stack tagging.
Currently does not support stack short granules or TLS stack histories,
since a different code path is followed for the callback instrumentation
we use.
We may simply wait to support these two features until we switch to
a custom calling convention.
Patch By: xiangzhangllvm, morehouse
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102901
Nico Weber [Fri, 11 Jun 2021 15:15:11 +0000 (11:15 -0400)]
[lld/mac] Use sectionType() more
Not sure sectionType() carries its weight, but while we have it
we should use it consistently.
No behavior change.
Differential Revision: https://reviews.llvm.org/D104027
Alexey Bataev [Fri, 11 Jun 2021 12:13:31 +0000 (05:13 -0700)]
[SLP]Remove unnecessary UndefValue in CreateShuffle.
No need to use UndefValue in CreateShuffle call.
Differential Revision: https://reviews.llvm.org/D104113
Alexey Bataev [Fri, 11 Jun 2021 15:00:15 +0000 (08:00 -0700)]
[SLP][NFC]Add a test for unordered stores, NFC.
thomasraoux [Fri, 11 Jun 2021 14:39:01 +0000 (07:39 -0700)]
[mlir][VectorToGPU] First step to convert vector ops to GPU MMA ops
This is the first step to convert vector ops to MMA operations in order to
target GPU tensor core ops. This currently only supports simple cases;
transpose and element-wise operations will be added later.
Differential Revision: https://reviews.llvm.org/D102962