review.tizen.org Git - platform/upstream/llvm.git/log

[Flang][NFC] Remove license comments from files in docs/ folder.

Solves issue https://reviews.llvm.org/D86131#2247275

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D86875

[analyzer] Add modeling for unique_ptr move constructor

Summary:
Add support for handling move contructor of std::unique_ptr.

Reviewers: NoQ, Szelethus, vsavchenko, xazax.hun

Reviewed By: NoQ

Subscribers: martong, cfe-commits
Tags: #clang

Differential Revision: https://reviews.llvm.org/D86373

[NFCI] Silent a build warning due to an extra semi-colon

[lldb] tab completion for class `CommandObjectTypeFormatterDelete`

1. Added a dedicated completion to class `CommandObjectTypeFormatterDelete`
which can be used by these commands: `type filter/format/summary/synthetic delete`;
2. Added a related test case.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D84142

Fix sphinx documentation after a6a37a2fcd2a8048a75bd0d8280497ed89d73224

[lldb][NFC] Remove trailing whitespace in TestCompletion

[lldb] Fix TestCompletion's pid completion failing randomly

TestCompletion is randomly failing on some bots. The error message however states
that the computed completions actually do contain the expected pid we're
looking for, so there shouldn't be any test failure.

The reason for that turns out to be that complete_from_to is actually used
for testing two different features. It can be used for testing what the
common prefix for the list of completions is and *also* for checking all the
possible completions that are returned for a command. Which one of the two
things should be checked can't be defined by a parameter to the function, but
is instead guessed by the test method instead based on the results that were
returned. If there is a common prefix in all completions, then that prefix
is searched and otherwise all completions are searched.

For TestCompletion's pid test this behaviour leads to the strange test failures.
If all the pid's that our test LLDB can see have a common prefix (e.g., it
can only see pids [123, 122, 10004, 10000] -> common prefix '1'), then
complete_from_to check that the common prefix contains our pid, which is
always fails ('1' doesn't contain '123' or any other valid pid). If there
isn't a common prefix (e.g., pids are [123, 122, 10004, 777]) then
complete_from_to will check the list of completions instead which works correctly.

This patch is fixing this by adding a simple check method that doesn't
have this behaviour and is simply searching the returned list of completions.
This should get the bots green while I'm working on a proper fix that fixes
complete_from_to.

[llvm-readobj/elf] - Don't fail when dumping an archive with a member that can't be recognized.

Imagine we have an archive that has 3 objects in the following order:
<valid known object>,<unknown object> and <valid known object>.

Currently llvm-readelf/obj report an error and stops dumping in the middle.
This patch changes the error reported to warning.

Differential revision: https://reviews.llvm.org/D86771

Revert "[FileCheck] Move FileCheck implementation out of LLVMSupport into its own library"

This reverts commit e9a3d1a401b07cbf7b11695637f1b549782a26cd. Seems the new
FileCheck library doesn't link on some bots. Reverting for now.

[FileCheck] Move FileCheck implementation out of LLVMSupport into its own library

The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a
good reason for having FileCheck implemented there as it has a very specific use
while LLVMSupport is a dependency of pretty much every LLVM tool there is. In
fact, the only use of FileCheck I could find (outside the FileCheck tool and the
FileCheck unit test) is a single call in GISelMITest.h.

This moves the FileCheck logic to its own LLVMFileCheck library. This way only
FileCheck and the GlobalISelTests now have a dependency on this code.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86344

[lldb] Don't crash when LLDB can't extract the tsan report

Right now all tsan tests are crashing on Linux. The tests were already marked as
expected failures, but since commit 20ce8affce85d added an assert that every
StopInfo needs a non-empty stop description the tests actually started crash
(which is even with an expectedFailure a failed test).

The reason for that is that we never had any stop description when hitting tsan
errors on Linux. Before the assert that just made the test fail, but now the
empty description is hitting the assert. This patch just adds a generic stop
description mentioning tsan to prevent that we hit that assert on platforms
where we don't support extracting the tsan report.

Reviewed By: friss

Differential Revision: https://reviews.llvm.org/D86593

[Test] Simplify DWARF test cases. NFC.

The Length, AbbrOffset and Values fields of the debug_info section are
optional. This patch helps remove them and simplify test cases.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D86857

[Sink] Optimize/simplify sink candidate finding with nearest common dominator

For an instruction in the basic block BB, SinkingPass enumerates basic blocks
dominated by BB and BB's successors. For each enumerated basic block,
SinkingPass uses `AllUsesDominatedByBlock` to check whether the basic
block dominates all of the instruction's users. This is inefficient.

Use the nearest common dominator of all users to avoid enumerating the
candidate. The nearest common dominator may be in a parent loop which is
not beneficial. In that case, find the ancestors in the dominator tree.

In the case that the instruction has no user, with this change we will
not perform unnecessary move. This causes some amdgpu test changes.

A stage-2 x86-64 clang is a byte identical with this change.

[Sink][test] Add nounwind test and properly test convergent

[InstCombine] add extra-use tests for fmul+sqrt; NFC

[GVN] add another commutable intrinsic test; NFC

This is a reduced version of a test-suite crasher with rG25597f7

Revert "[IR][GVN] allow intrinsics in Instruction's isCommutative query"

This reverts commit 25597f7783e7038b8a2ee88bb49ac605b211b564.
It is causing crashing on bots such as:
http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10523/steps/ninja-build/logs/stdio

[DSE,MemorySSA] Skip defs without analyzable write locations.

Similar to other checks above, if there is no write location for a def,
it cannot be considered for elimination and can be skipped.

[IR][GVN] allow intrinsics in Instruction's isCommutative query

As discussed in D86798 / rG09652721 , we were potentially
returning a different result for whether an Instruction
is commutable depending on if we call the base class or
derived class method.

This requires relaxing an assert in GVN, but that pass
seems to be working otherwise.

NewGVN requires more work because it uses different
code paths for numbering binops and calls.

[NewGVN] add test for commutative intrinsic; NFC

[GVN] add test for commutative intrinsic; NFC

[DSE,MemorySSA] Simplify code, EarlierAccess is be a MemoryDef (NFC).

After recent changes, we return early if Current is a MemoryPhi, so
EarlierAccess can only be a MemoryDef.

[X86] Pre-commit the test-shrink.ll changes from D86578.

The conditions in these tests are guaranteed to always
go one direction. InstCombine would have folded them away.

[llvm-reduce] Add test for BB reduction with non-void ret type.

Precommit test for D86849.

[FileCheck] Add precision to format specifier

Add printf-style precision specifier to pad numbers to a given number of
digits when matching them if the value is smaller than the given
precision. This works on both empty numeric expression (e.g. variable
definition from input) and when matching a numeric expression. The
syntax is as follows:

[[#%.<precision><format specifier>, ...]

where <format specifier> is optional and ... can be a variable
definition or not with an empty expression or not. In the absence of a
precision specifier, a variable definition will accept leading zeros.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81667

Fix gcc warning by explicitly initializing the base class copy ctor (NFC)

Full diagnostic was:

warning: base class ‘class mlir::OptReductionBase<mlir::OptReductionPass>’ should be explicitly initialized in the copy constructor [-Wextra]

[LV] Update CFG before adding runtime checks.

addRuntimeChecks uses SCEVExpander, which relies on the DT/LoopInfo to
be up-to-date. Changing the CFG afterwards may invalidate some inserted
instructions, especially LCSSA phis.

Reorder the code to first update the CFG and then create the runtime
checks. This should not have any impact on the generated code, as we
adjust the CFG and generate runtime checks together.

Fixes PR47343.

[libcxx/variant] Implement workaround for GCC bug.

A parameter pack is deemed to be uncaptured, which is bogus... but it seems to
be because it's within an expression that involves `decltype` of an uncaptured
pack or something: https://godbolt.org/z/b8z3sh

Drive-by fix for uglified name.

Differential Revision: https://reviews.llvm.org/D86827

[FastISel] update to use intrinsic's isCommutative(); NFC

This requires adding a missing 'const' to the definition because
the callers are using const args, but there should be no change
in behavior.

The intrinsic method was added with D86798 / rG096527214033

[DAGCombiner] skip reciprocal divisor optimization for x/sqrt(x)

In general, we probably want to try the multi-use reciprocal
transform before sqrt transforms, but x/sqrt(x) is a special-case
because that will always reduce to plain sqrt(x) or an estimate.

The AArch64 tests show that the transform is limited by TLI
hook to patterns where there are 3 or more uses of the divisor.
So this change can result in an extra division compared to
what we had, but that's the intended behvior based on the
current setting of that hook.

[AArch64] add tests for multi-use fast sqrt/recip; NFC

[x86] add tests for multi-use fast sqrt/recip; NFC

[SLP] make commutative check apply only to binops; NFC

As discussed in D86798, it's not clear if the caller code
works with a more liberal definition of "commutative" that
includes intrinsics like min/max. This makes the binop
restriction (current functionality is unchanged) explicit
until the code is audited/tested.

[CVP] Regenerate test checks (NFC)

[NFC][compiler-rt] Factor out __div[sdt]i3 and __mod[dt]i3 implementations

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D86400

[Hexagon] Fix perfect shuffle generation for single vectors

Perfect shuffle instruction (vdealvdd/vshuffvdd) work on vector
pairs. When given a single input vector, half of it first needs
to be transposed into the other vector before the generated
shuffles can take effect. Also the first transpose needs to be
undone at the end (this last step was missing).

[LV] Add some const to RecurrenceDescriptor. NFC

[llvm-reduce] Function body reduction: don't forget to unset comdat

althought the interstingness test should usually fail when the module is invalid
this changes reduces the frequency at which llvm-reduce generate invalid IR.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D86404

Reland [OpenMPOpt] ICV tracking for calls

The problem with module slice has been addressed in D86319

Introduce two new AAs. AAICVTrackerFunctionReturned which checks if a
function can have a unique ICV value after it is finished, and
AAICVCallSiteReturned which checks AAICVTrackerFunctionReturned for a
call site. This enables us to check the value of a call and if it
changes the ICV. This also changes the approach in
`getReplacementValues()` to a worklist-based approach so we can explore
all relevant BBs.

Differential Revision: https://reviews.llvm.org/D85544

[Attributor] Introduce module slice.

Summary:
The module slice describes which functions we can analyze and transform
while working on an SCC as part of the Attributor-CGSCC pass. So far we
simply restricted it to the SCC.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D86319

Improve doc comments for several methods returning bools

Differential Revision: https://reviews.llvm.org/D86848

[OpenMPOpt][NFC] add reproducer for problem found in D85544

[LangRef] Apply a missing comment from D86189

[LangRef] State that storing an aggregate fills padding with undef

This patch makes LangRef be explicit about the value of padding when storing an aggregate.
It states that when an aggregate is stored into memory, padding is filled with undef.

Here is a clue that supports this change (edited to reflect the discussion from llvm-dev):

- IPSCCP ignores padding and directly stores a constant aggregate if possible. It loses the data stored in the padding. https://godbolt.org/z/xzenYs Memcpyopt ignores (the preexisting value of) padding when copying an aggregate or storing a constant: https://godbolt.org/z/hY6ndd / https://godbolt.org/z/3WMP5a

The two items below are not relevant with this patch because Clang lowers load/store of individual field of struct into load/stores of the corresponding pointer with a primitive type. Also, when copy is needed, it uses memcpy instead of load/store of an aggregate, as discussed in the llvm-dev. However, this patch is still valid (as discussed) because it is needed to explain the two optimizations above.

- According to C17, the value of padding bytes when storing values in structures or unions is unspecified.

- I updated Alive2 and it did not find any problematic transformation from LLVM unit tests and while running translation validation of a few C programs.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D86189

[Attributor] Fix callsite check in AAUndefinedBehavior

This is the next patch of D86842
When we check `noundef` attribute violation at callsites, we do not have to require `nonnull` in the following two cases.
1. An argument is known to be simplified to undef
2. An argument is known to be dead

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86845

[Attributor][NFC] Fix dependency type in AAUndefinedBehaviorImpl::updateImpl

This patch fixes wrong dependency type in AAUB.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86842

Set alignment of .llvmbc and .llvmcmd to 1

Otherwise their alignment is dependent on the size of the section. If the size
is large than 16, the alignment will be 16.

16 is a bad choice for both .llvmbc and .llvmcmd because the padding between two
contributions from input sections is of a variable size.

A bitstream is actually guaranteed to be 4-byte aligned, but consumers don't
need this property.

Remove OpenBSD/sparc support

[ORC] Add getDFSLinkOrder / getReverseDFSLinkOrder methods to JITDylib.

DFS and Reverse-DFS linkage orders are used to order execution of
deinitializers and initializers respectively.

This patch replaces uses of special purpose DFS order functions in
MachOPlatform and LLJIT with uses of the new methods.

[libc++] Temporarily force-set the LIBCXX_TEST_CONFIG cache value

This ensures that existing CMake build trees will start using the new
default without having to nuke their build directories.

[libc++] Move the default site config template alongside other config files

[libc++] Add from-scratch configuration files for the test suite

This commit adds the first from-scratch configuration files for running
the libc++ test suite without using the old configuration:

- libcxx-trunk-shared.cfg.py:
Runs the test suite against a trunk libc++ shared library.
- libcxx-trunk-static.cfg.py:
Runs the test suite against a trunk libc++ static library.

Differential Revision: https://reviews.llvm.org/D81866

[Attributor] Fix AANoUndef identification

Even though `noundef` IR attribute might be attached to non-void type values, AANoUndef is mistakenly identified for pointer type values only.
This patch fixes that.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86737

[InstSimplify] Reduce code duplication in simplifySelectWithICmpCond (NFC)

Canonicalize icmp ne to icmp eq and implement all the folds only once.

[InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)

Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.

This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.

This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.

Differential Revision: https://reviews.llvm.org/D86834

[LV] Check opt-for-size before expanding runtime checks.

Move bail out when optimizing for size before runtime check generation.
In that case, we do not use the result of the expansion, the expanded
instruction will be dead and cleaned up later.

By doing the check before expanding the runtime-checks, we can save a
bit of unnecessary work.

[LVI] Remove unnecessary lambda capture (NFC)

Reapply [LVI] Normalize pointer behavior

This got reverted because a dependency was reverted. It has since
been reapplied, so reapply this as well.

-----

Related to D69686. As noted there, LVI currently behaves differently
for integer and pointer values: For integers, the block value is always
valid inside the basic block, while for pointers it is only valid at
the end of the basic block. I believe the integer behavior is the
correct one, and CVP relies on it via its getConstantRange() uses.

The reason for the special pointer behavior is that LVI checks whether
a pointer is dereferenced in a given basic block and marks it as
non-null in that case. Of course, this information is valid only after
the dereferencing instruction, or in conservative approximation,
at the end of the block.

This patch changes the treatment of dereferencability: Instead of
including it inside the block value, we instead treat it as something
similar to an assume (it essentially is a non-nullness assume) and
incorporate this information in intersectAssumeOrGuardBlockValueConstantRange()
if the context instruction is the terminator of the basic block.
This happens either when determining an edge-value internally in LVI,
or when a terminator was explicitly passed to getValueAt(). The latter
case makes this more powerful than the previous implementation as
a side-effect, and this does actually seem benefitial in practice.

Of course, we do not want to recompute dereferencability on each
intersectAssume call, so we need a new cache for this. The
dereferencability analysis requires walking the entire basic block
and computing underlying objects of all memory operands. This was
previously done separately for each queried pointer value. In the
new implementation (both because this makes the caching simpler,
and because it is faster), I instead only walk the full BB once and
cache all the dereferenced pointers. So the traversal is now performed
only once per BB, instead of once per queried pointer value.

I think the overall model now makes more sense than before, and there
will be no more pitfalls due to differing integer/pointer behavior.

Differential Revision: https://reviews.llvm.org/D69914

[NFC][Local] EliminateDuplicatePHINodes(): add STATISTIC()

[NFCI][Local] Rewrite EliminateDuplicatePHINodes to optionally check hashing invariants

EarlyCSE has a mode to verify the invariant that hash equality equals
key equality, but EliminateDuplicatePHINodes() doesn't.

I've verified that this would have caught the stage2-stage3 mismatches
5ec2b757cc7d37ff0d03b36ee863b0962fe78108 revert has fixed,
that were introduced last time in 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9.

[Attributor][NFC] Do not manifest noundef for positions to be changed to undef

This patch fixes AANoUndef manifestation.
We should not manifest noundef for positions that will be changed to undef.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86835

[Attributor][NFC] rerun update_test_checks without --scrub-attributes

[DSE,MemorySSA] Return early when hitting a MemoryPhi.

A MemoryPhi can never be eliminated. If we hit one, return the Phi, so
the caller can continue traversing the incoming accesses.

This saves some unnecessary read clobber checks and improves
compile-time
http://llvm-compile-time-tracker.com/compare.php?from=1ffc58b6d098ce8fa71f3a80fe75b990f633f921&to=d0fa8d1982380b57d7b6067528104bc373dbe07a&stat=instructions

[IR] Inline AttrBuilder::addAttribute. It just sets 1 bit. NFC.

[Sema] Simplify ShouldDiagnoseUnusedDecl, NFC

Instead of writing to a flag and then returning based on that flag we
can also return directly. The flag name also doesn't provide additional
information, it just reflects the name of the function (isReferenced).

[Sema] ICK_Function_Conversion is a third kind conversion

Not sure if this has any effect, but it was inconsistent before.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D67113

[Instruction] Speculatively undo isIdenticalToWhenDefined() PHI handling changes

The stage2-stage3 differences persist even without instcombine-based
PHI CSE, so this is the only possible reason.

[EarlyCSE] fold commutable intrinsics

Handling the new min/max intrinsics is the motivation, but it
turns out that we have a bunch of other intrinsics with this
missing bit of analysis too.

The FP min/max tests show that we are intersecting FMF,
so that part should be safe too.

As noted in https://llvm.org/PR46897 , there is a commutative
property specifier for intrinsics, but no corresponding function
attribute, and so apparently no uses of that bit. We may want to
remove that next.

Follow-up patches should wire up the Instruction::isCommutative()
to this IntrinsicInst specialization. That requires updating
callers to be aware of the more general commutative property
(not just binops).

Differential Revision: https://reviews.llvm.org/D86798

[EarlyCSE] add tests for commutative intrinsics; NFC

[TargetLowering] Strip tailing whitespace (NFC)

[InstCombine] Take 3: Perform trivial PHI CSE

The original take 1 was 6102310d814ad73eab60a88b21dd70874f7a056f,
which taught InstSimplify to do that, which seemed better at time,
since we got EarlyCSE support for free.

However, it was proven that we can not do that there,
the simplified-to PHI would not be reachable from the original PHI,
and that is not something InstSimplify is allowed to do,
as noted in the commit ed90f15efb40d26b5d3ead3bb8e9e284218e0186
that reverted it:
> It appears to cause compilation non-determinism and caused stage3 mismatches.

Then there was take 2 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9,
which was InstCombine-specific, but it again showed stage2-stage3 differences,
and reverted in bdaa3f86a040b138c58de41d73d35b76fdec1380.
This is quite alarming.

Here, let's try to change how we find existing PHI candidate:
due to the worklist order, and the way PHI nodes are inserted
(it may be inserted as the first one, or maybe not), let's look at *all*
PHI nodes in the block.

Effects on vanilla llvm test-suite + RawSpeed:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| asm-printer.EmittedInsts                           | 7942329   | 7942457   |    128 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 254295632 | 254312480 |  16848 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18347     |    -65 |   -0.35% |    0.35% |
| early-cse.NumCSE                                   | 2183283   | 2183267   |    -16 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 541842    |  -8263 |   -1.50% |    1.50% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640311   | 3644419   |   4108 |    0.11% |    0.11% |
| instcombine.NumDeadInst                            | 1778204   | 1783205   |   5001 |    0.28% |    0.28% |
| instcombine.NumPHICSEs                             | 0         | 22490     |  22490 |    0.00% |    0.00% |
| instcombine.NumWorklistIterations                  | 2023272   | 2024400   |   1128 |    0.06% |    0.06% |
| instcount.NumCallInst                              | 1758395   | 1758802   |    407 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330545    |    -12 |    0.00% |    0.00% |
| instcount.TotalBlocks                              | 1077138   | 1077220   |     82 |    0.01% |    0.01% |
| instcount.TotalFuncs                               | 101442    | 101441    |     -1 |    0.00% |    0.00% |
| instcount.TotalInsts                               | 8831946   | 8832606   |    660 |    0.01% |    0.01% |
| simplifycfg.NumHoistCommonCode                     | 24186     | 24187     |      1 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019813   | 999767    | -20046 |   -1.97% |    1.97% |
```
So it fires 22490 times, which is less than ~24k the take 1 did,
but more than what take 2 did (22228 times)
.
It allows foldAggregateConstructionIntoAggregateReuse() to actually work
after PHI-of-extractvalue folds did their thing. Previously SimplifyCFG
would have done this PHI CSE, of all places. Additionally, allows some
more `invoke`->`call` folds to happen (+110, +2.56%).

All in all, expectedly, this catches less things overall,
but all the motivational cases are still caught, so all good.

[UpdateTestChecks] Don't skip attributes when comparing functions

Revert "[InstCombine] Take 2: Perform trivial PHI CSE"

While the original variant with doing this in InstSimplify (rightfully)
caused questions and ultimately was detected to be a culprit
of stage2-stage3 mismatch, it was expected that
InstCombine-based implementation would be fine.

But apparently it's not, as
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24095/steps/compare-compilers/logs/stdio
suggests.

Which suggests that somewhere in InstCombine there is a loop
over nondeterministically sorted container, which causes
different worklist ordering.

This reverts commit 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9.

[InstCombine] Return replaceInstUsesWith() result (NFC)

Follow the usual usage pattern for this function and return the
result.

[AArch64] Generate and parse SEH assembly directives

This ensures that you get the same output regardless if generating
code directly to an object file or if generating assembly and
assembling that.

Add implementations of the EmitARM64WinCFI*() methods in
AArch64TargetAsmStreamer, and fill in one blank in MCAsmStreamer.

Add corresponding directive handlers in AArch64AsmParser and
COFFAsmParser.

Some SEH directive names have been picked to match the prior art
for SEH assembly directives for x86_64, e.g. the spelling of
".seh_startepilogue" matching the preexisting ".seh_endprologue".

For the directives for saving registers, the exact spelling
from the arm64 documentation is picked, e.g. ".seh_save_reg" (to follow
that naming for all the other ones, e.g. ".seh_save_fregp_x"), while
the corresponding one for x86_64 is plain ".seh_savereg" without the
second underscore.

Directives in the epilogues have the same names as in prologues,
e.g. .seh_savereg, even though the registers are restored, not
saved, at that point.

Differential Revision: https://reviews.llvm.org/D86529

[MC] [Win64EH] Fill in FuncletOrFuncEnd if missing

This can happen e.g. for code that declare .seh_proc/.seh_endproc
in assembly, or for code that use .seh_handlerdata (which triggers
the unwind info to be emitted before the end of the function).

The TextSection field must be made non-const to be able to use it
with Streamer.SwitchSection().

Differential Revision: https://reviews.llvm.org/D86528

[InstCombine] foldAggregateConstructionIntoAggregateReuse(): use InstCombiner::replaceInstUsesWith() instead of RAUW

We really shouldn't use RAUW in InstCombine
because we should consistently update Worklist to avoid extra iterations.

[InstCombine] canonicalizeICmpPredicate(): use InstCombiner::replaceInstUsesWith() instead of RAUW

We really shouldn't use RAUW in InstCombine
because we should consistently update Worklist to avoid extra iterations.

[NFC][InstCombine] Fix some comments: the code already uses IC::replaceInstUsesWith()

[NFC] Instruction::isIdenticalToWhenDefined(): s/nessesairly/necessarily/

[NFC][InstCombine] Add STATISTIC() for how many iterations we did

As we've established, if it takes more than two iterations
(one to perform folding and one to ensure that no folding opportunities
remain) per function, then there are worklist management issues.
So it may be interesting to keep track of it.

[NFC][InstCombine] select.ll: remove outdated TODO comment

Fixed by 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9

[InstCombine] visitPHINode(): use InstCombiner::replaceInstUsesWith() instead of RAUW

As noted in post-commit review, we really shouldn't use RAUW in InstCombine
because we should consistently update Worklist to avoid extra iterations.

[InstCombine] Take 2: Perform trivial PHI CSE

The original take was 6102310d814ad73eab60a88b21dd70874f7a056f,
which taught InstSimplify to do that, which seemed better at time,
since we got EarlyCSE support for free.

However, it was proven that we can not do that there,
the simplified-to PHI would not be reachable from the original PHI,
and that is not something InstSimplify is allowed to do,
as noted in the commit ed90f15efb40d26b5d3ead3bb8e9e284218e0186
that reverted it :
> It appears to cause compilation non-determinism and caused stage3 mismatches.

However InstCombine already does many different optimizations,
so it should be a safe place to do it here.

Note that we still can't just compare incoming values ranges,
because there is no guarantee that these PHI's we'd simplify to
were already re-visited and sorted.
However coming up with a test is problematic.

Effects on vanilla llvm test-suite + RawSpeed:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |      |%| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instcombine.NumPHICSEs                             | 0         | 22228     |  22228 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942329   | 7942456   |    127 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 254295632 | 254313792 |  18160 |    0.01% |    0.01% |
| early-cse.NumCSE                                   | 2183283   | 2183272   |    -11 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 541842    |  -8263 |   -1.50% |    1.50% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640311   | 3666911   |  26600 |    0.73% |    0.73% |
| instcombine.NumDeadInst                            | 1778204   | 1783318   |   5114 |    0.29% |    0.29% |
| instcount.NumCallInst                              | 1758395   | 1758804   |    409 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330549    |     -8 |    0.00% |    0.00% |
| instcount.TotalBlocks                              | 1077138   | 1077221   |     83 |    0.01% |    0.01% |
| instcount.TotalFuncs                               | 101442    | 101441    |     -1 |    0.00% |    0.00% |
| instcount.TotalInsts                               | 8831946   | 8832611   |    665 |    0.01% |    0.01% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019813   | 999740    | -20073 |   -1.97% |    1.97% |
```
So it fires ~22k times, which is less than ~24k the take 1 did.
It allows foldAggregateConstructionIntoAggregateReuse() to actually work
after PHI-of-extractvalue folds did their thing. Previously SimplifyCFG
would have done this PHI CSE, of all places. Additionally, allows some
more `invoke`->`call` folds to happen (+110, +2.56%).

All in all, expectedly, this catches less things overall,
but all the motivational cases are still caught, so all good.

[NFC][InstSimplify] Add a note to PHI CSE tests that they are all negative tests

As discussed in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
even though it seems worthwhile doing so in InstSimplify,
we really can't do that there, because the other PHI wouldn't be
def-reachable from the original PHI.

[NFC][InstCombine] Add tests for PHI CSE

As discussed in https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
even though it seems worthwhile doing so in InstSimplify,
we really can't do that there, because the other PHI wouldn't be
def-reachable from the original PHI.

So ignoring whether or not EarlyCSE should also know to do this,
InstCombine is the place.

[PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE

As a prerequisite to doing experimental buids of pieces of FreeBSD PowerPC64 as little-endian, allow actually targeting it.

This is needed so basic platform definitions are pulled in. Without it, the compiler will only run freestanding.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73425

[InstCombine] Fix typo in comment (NFC)

As pointed out in post-commit review of D63060.

[Target][AArch64] Allow for char as int8_t in AArch64AsmParser.cpp

A couple of AArch64 tests were failing on Solaris, both sparc and x86:

  LLVM :: MC/AArch64/SVE/add-diagnostics.s
  LLVM :: MC/AArch64/SVE/cpy-diagnostics.s
  LLVM :: MC/AArch64/SVE/cpy.s
  LLVM :: MC/AArch64/SVE/dup-diagnostics.s
  LLVM :: MC/AArch64/SVE/dup.s
  LLVM :: MC/AArch64/SVE/mov-diagnostics.s
  LLVM :: MC/AArch64/SVE/mov.s
  LLVM :: MC/AArch64/SVE/sqadd-diagnostics.s
  LLVM :: MC/AArch64/SVE/sqsub-diagnostics.s
  LLVM :: MC/AArch64/SVE/sub-diagnostics.s
  LLVM :: MC/AArch64/SVE/subr-diagnostics.s
  LLVM :: MC/AArch64/SVE/uqadd-diagnostics.s
  LLVM :: MC/AArch64/SVE/uqsub-diagnostics.s

For example, reduced from `MC/AArch64/SVE/add-diagnostics.s`:

  add     z0.b, z0.b, #0, lsl #8

missed the expected diagnostics

  $ ./bin/llvm-mc -triple=aarch64 -show-encoding -mattr=+sve add.s
  add.s:1:21: error: immediate must be an integer in range [0, 255] with a shift amount of 0
  add     z0.b, z0.b, #0, lsl #8
                      ^

The message is `Match_InvalidSVEAddSubImm8`, emitted in the generated
`lib/Target/AArch64/AArch64GenAsmMatcher.inc` for `MCK_SVEAddSubImm8`.
When comparing the call to `::AArch64Operand::isSVEAddSubImm<char>` on both
Linux/x86_64 and Solaris, I find

  875     bool IsByte = std::is_same<int8_t, std::make_signed_t<T>>::value;

is `false` on Solaris, unlike Linux.

The problem boils down to the fact that `int8_t` is plain `char` on
Solaris: both the sparc and i386 psABIs have `char` as signed.  However,
with

  9887     DiagnosticPredicate DP(Operand.isSVEAddSubImm<int8_t>());

in `lib/Target/AArch64/AArch64GenAsmMatcher.inc`, `std::make_signed_t<int8_t>`
above yieds `signed char`, so `std::is_same<int8_t, signed char>` is `false`.

This can easily be fixed by also allowing for `int8_t` here and in a few
similar places.

Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D85225

[Attributes] Merge calls to getFnAttribute/hasFnAttribute using Attribute::isValid. NFC

Rather than calling hasFnAttribute and then calling getFnAttribute
if the attribute exists, its better to just call getFnAttribute and
then check if we got a valid attribute back.

[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable

See
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
and
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html

InstSimply is not allowed to perform simplifications to instructions
that are not def-reachable from the original instruction.

[NFC][STLExtras] Add make_first_range(), similar to existing make_second_range()

Having just one of the two seens weird.
I wanted to use it a few times, but it wasn't there.

[DWARFYAML] Make the debug_abbrev_offset field optional.

This patch helps make the debug_abbrev_offset field optional. We don't
need to calculate the value of this field in the future.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86614

[LLD][PowerPC][test] Disable ELF/ppc64-pcrel-long-branch-error.s

Following 0becc27ebfec, `ppc64-pcrel-long-branch-error.s` fails in some
environments with out-of-memory errors associated with buffering the
output in-memory. Since the alternative of writing to an allocated file
is also known to cause problems, we will disable the test
unconditionally (pending a mechanism to disable the test selectively).

[DAGCombiner] Enhance (zext(setcc))

Current `v:t = zext(setcc x,y,cc)` will be transformed to `select x, y, 1:t, 0:t, cc`. It misses some opportunities if x's type size is less than `t`'s size. This patch enhances the above transformation.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86687

[compiler-rt][tsan] Remove unnecesary typedefs

These typedefs are not used anywhere else in this compilation unit.

Differential Revision: https://reviews.llvm.org/D86826

[lldb] Hoist --framework argument out of LLDB_TEST_COMMON_ARGS (NFC)

Give the framework argument its own variable (LLDB_FRAMEWORK_DIR) so
that we can configure it in lit.site.cfg.py if we so desire.

[lldb] Make the lit configuration values optional for the API tests

LIT uses a model where the test suite is configurable trough a
lit.site.cfg file. Most of the time we use the lit.site.cfg with values
that match the current build configuration, generated by CMake.

Nothing prevents you from running the test suite with a different
configuration, either by overriding some of these values from the
command line, or by passing a different lit.site.cfg.

The latter is currently tedious. Many configuration values are optional
but they still need to be set because lit.cfg.py is accessing them
directly. This patch changes the code to use getattr to return the
attribute if it exists. This makes it possible to specify a minimal
lit.site.cfg with only the mandatory/desired configuration values.

Differential revision: https://reviews.llvm.org/D86821

[ObjC][ARC] In HandlePotentialAlterRefCount, check whether an
instruction can decrement the reference count, not whether it can alter
it

This prevents the state transition from S_Use to S_CanRelease when doing
a bottom-up traversal and the transition from S_Retain to S_CanRelease
when doing a top-down traversal when the visited instruction can
increment the ref count but cannot decrement it. This allows the ARC
optimizer to remove retain/release pairs which were previously not
removed.

rdar://problem/21793154

Use report_fatal_error instead of llvm::errs() + abort() (NFC)

This is making the error reporting in line with other fatal errors.