platform/upstream/llvm.git
3 years ago[LV] Add a logical and/or select cost test. NFC
David Green [Thu, 8 Apr 2021 09:27:06 +0000 (10:27 +0100)]
[LV] Add a logical and/or select cost test. NFC

3 years ago[RISCV] Support OR/XOR/AND reductions on vector masks
Fraser Cormack [Wed, 7 Apr 2021 09:03:22 +0000 (10:03 +0100)]
[RISCV] Support OR/XOR/AND reductions on vector masks

This patch adds RVV codegen support for OR/XOR/AND reductions for both
scalable- and fixed-length vector types. There are a few possible
codegen strategies for each -- vmfirst.m, vmsbf.m, and vmsif.m could be
used to some extent -- but the vpopc.m instruction was chosen since it
produces the scalar result in one instruction, after which scalar
instructions can finish off the computation.

The reductions are lowered identically for both scalable- and
fixed-length vectors, although some alternate strategies may be more
optimal on fixed-length vectors since it's cheaper to get the length of
those types.

Other reduction types were not deemed to be relevant for mask vectors.
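
As a rough illustration of why a single population count is enough, the sketch below models the three reductions in Python. The helper names are hypothetical, and the exact scalar sequence the patch emits is not spelled out in the message; the AND case here counts the inverted mask so that the vector length is not needed, which is only one plausible way to finish the computation.

```python
def vpopc(mask):
    """Model the scalar result of a mask population count: number of set bits."""
    return sum(1 for bit in mask if bit)


def reduce_or(mask):
    # Some element is set exactly when the population count is non-zero.
    return int(vpopc(mask) != 0)


def reduce_and(mask):
    # Every element is set exactly when the inverted mask has no set bits.
    # (Comparing vpopc(mask) against the vector length would also work.)
    inverted = [not bit for bit in mask]
    return int(vpopc(inverted) == 0)


def reduce_xor(mask):
    # XOR of all bits is the parity of the population count.
    return vpopc(mask) & 1
```

In each case the vector work is the one count, and the remaining comparison or bit test is purely scalar.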

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100030

3 years ago[OpenCL] Fix mipmap read_image return types
Sven van Haastregt [Thu, 8 Apr 2021 08:51:44 +0000 (09:51 +0100)]
[OpenCL] Fix mipmap read_image return types

The return type did not match the function name.

3 years agoReorg firmware corefile tests; add test for OS plugin loading
Jason Molenda [Thu, 8 Apr 2021 08:44:13 +0000 (01:44 -0700)]
Reorg firmware corefile tests; add test for OS plugin loading

A little cleanup to how these firmware corefile tests are done; add
a test that loads a dSYM that loads an OS plugin, and confirm that
the OS plugin's threads are created.

3 years ago[AMDGPU, test] Fix use of undef FileCheck var
Thomas Preud'homme [Sun, 4 Apr 2021 10:12:58 +0000 (11:12 +0100)]
[AMDGPU, test] Fix use of undef FileCheck var

Test CodeGen/AMDGPU/amdgpu.private-memory.ll and
CodeGen/AMDGPU/private-memory-r600.ll have a block of CHECK directives
whose prefix is inconsistent: R600-CHECK Vs R600. This leads to a
R600-NOT directive using an undefined CHAN variable due to R600-CHECK
directives never being considered by FileCheck. Fixing the prefix leads
to the testcase failing. As per https://reviews.llvm.org/D99865#2675235
this commit removes the directives instead since it is not possible to
write a reliable check.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D99865

3 years ago[mlir] add support for index type in vectors.
Tobias Gysi [Thu, 8 Apr 2021 08:15:14 +0000 (08:15 +0000)]
[mlir] add support for index type in vectors.

The patch enables the use of index type in vectors. It is a prerequisite to support vectorization for indexed Linalg operations. This refactoring became possible due to the newly introduced data layout infrastructure. The data layout of a module defines the bitwidth of the index type needed to verify bitcasts and similar vector operations.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D99948

3 years ago[clang] Speedup line offset mapping computation
serge-sans-paille [Thu, 1 Apr 2021 20:18:55 +0000 (22:18 +0200)]
[clang] Speedup line offset mapping computation

Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2), but the
optimization was dropped because the sequential version was on par
performance-wise.

This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.

When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.

This is a recommit of 6951b72334bbe4c189c71751edc1e361d7b5632c with endianness
and unsigned long vs uint64_t issues fixed (hopefully).
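
The word-at-a-time idea can be sketched as follows. This is not the code from the patch: it is a minimal Python model, with hypothetical function names, that uses the well-known "has zero byte" bit trick to skip 8-byte words containing no `\n` or `\r` and falls back to a byte loop otherwise.

```python
ONES = 0x0101010101010101
HIGHS = 0x8080808080808080


def has_zero_byte(word):
    # Classic bit trick: non-zero exactly when some byte of the 64-bit word is 0.
    return ((word - ONES) & ~word & HIGHS) != 0


def has_byte(word, value):
    # XOR turns every byte equal to `value` into a zero byte.
    return has_zero_byte(word ^ (ONES * value))


def line_offsets(buf):
    """Return the offset of the first byte of every line in `buf` (bytes)."""
    offsets = [0]
    i, n = 0, len(buf)
    while i < n:
        if i + 8 <= n:
            word = int.from_bytes(buf[i:i + 8], "little")
            if not (has_byte(word, 0x0A) or has_byte(word, 0x0D)):
                i += 8  # no line ending in these 8 bytes, skip them at once
                continue
        c = buf[i]
        if c == 0x0A:
            offsets.append(i + 1)
        elif c == 0x0D:
            # Treat "\r\n" as a single line ending.
            if i + 1 < n and buf[i + 1] == 0x0A:
                i += 1
            offsets.append(i + 1)
        i += 1
    return offsets
```

The byte order used to load the word does not matter for the detection itself, only for locating which byte matched.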

Differential Revision: https://reviews.llvm.org/D99409

3 years ago[AsmParser] Recognize more escaped characters between single quotes
LemonBoy [Thu, 8 Apr 2021 07:57:50 +0000 (09:57 +0200)]
[AsmParser] Recognize more escaped characters between single quotes

The GNU AS manual states the following about single-character constants enclosed within single quotes:

>  Some backslash escapes apply to characters, \b, \f, \n, \r, \t, and \" with the same meaning as for strings, plus \' for a single quote.

Add two more characters to the switch handling this case to match GAS behaviour, plus a test to make sure nothing regresses.
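
For illustration only, the escape handling described by the quoted manual text could be modelled like this in Python. The function name and table are hypothetical and this is not the AsmParser code; the table contains exactly the escapes listed in the quote, and the message does not say which two of them the patch adds.

```python
# Escapes listed in the GNU as manual excerpt quoted above.
ESCAPES = {
    "b": 0x08,  # backspace
    "f": 0x0C,  # form feed
    "n": 0x0A,  # newline
    "r": 0x0D,  # carriage return
    "t": 0x09,  # horizontal tab
    '"': 0x22,  # double quote
    "'": 0x27,  # single quote
}


def parse_char_constant(token):
    """Return the integer value of a single-quoted constant such as 'a' or '\\n'."""
    if len(token) < 3 or token[0] != "'" or token[-1] != "'":
        raise ValueError("not a single-quoted character constant")
    body = token[1:-1]
    if len(body) == 1 and body != "\\":
        return ord(body)
    if len(body) == 2 and body[0] == "\\" and body[1] in ESCAPES:
        return ESCAPES[body[1]]
    raise ValueError("unsupported character constant: " + token)
```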

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99609

3 years ago[GreedyRA ORE] Extract computeNumberOfSplillsReloads to use in different places....
Serguei Katkov [Thu, 8 Apr 2021 07:38:38 +0000 (14:38 +0700)]
[GreedyRA ORE] Extract computeNumberOfSplillsReloads to use in different places. NFC.

Extract the handling of a single basic block so that stats can later be computed at function scope.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames, thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100013

3 years ago[GreedyRA ORE] Extract stats in RAGreedyStats struct. NFC.
Serguei Katkov [Thu, 8 Apr 2021 07:27:37 +0000 (14:27 +0700)]
[GreedyRA ORE] Extract stats in RAGreedyStats struct. NFC.

Combine all collected stats into separate struct RAGreedyStats
with add and report methods.

The motivation is to extend the number of statistics captured: instead of
adding new parameters, combine all of them into one structure.
Additionally, I plan to call report from different places in the future to
report data for a function as well.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100012

3 years ago[GreedyRA ORE] Compute ORE stats if extra analysis is enabled
Serguei Katkov [Tue, 6 Apr 2021 14:32:02 +0000 (21:32 +0700)]
[GreedyRA ORE] Compute ORE stats if extra analysis is enabled

To save compile time, avoid computing the stats if ORE will not emit them.
The motivation is to add more stats and compute them only if they will be dumped.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100010

3 years ago[Debug-Info] Use inlined strings in .dwinfo section by default for DBX.
Esme-Yi [Thu, 8 Apr 2021 07:20:22 +0000 (07:20 +0000)]
[Debug-Info] Use inlined strings in .dwinfo section by default for DBX.

Summary: Use inlined strings as the default DwarfInlinedStrings setting for DBX, because DBX does not support the .dwstr section for now.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D99933

3 years ago[RISCV] Add scalable offset under very large stack size.
Hsiangkai Wang [Thu, 8 Apr 2021 02:49:10 +0000 (10:49 +0800)]
[RISCV] Add scalable offset under very large stack size.

If the stack size is larger than 12 bits, we have to use a scratch
register to store the stack size. Before we introduce the scalable stack
offset, we could simplify

%0 = ADDI %stack.0, 0

=>

%scratch = ... # sequence of instructions to move the offset into %scratch
%0 = ADD %fp, %scratch

However, if the offset contains a scalable part, we need to consider it.

%0 = ADDI %stack.0, 0

=>

%scratch = ... # sequence of instructions to move the offset into %scratch
%scratch = ADD %fp, %scratch
%scalable_offset = ... # sequence of instructions for vscaled-offset.
%0 = ADD/SUB %scratch, %scalable_offset
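
The arithmetic behind the two expansions can be sketched in Python. Everything here is a hypothetical model rather than the backend code: `fits_simm12` stands for the 12-bit signed immediate check, `vscale` for the runtime vector-length multiplier, and the sign of `scalable_offset` plays the role of the ADD/SUB choice.

```python
def fits_simm12(value):
    # A 12-bit signed immediate covers -2048 .. 2047.
    return -2048 <= value <= 2047


def frame_address(fp, fixed_offset, scalable_offset, vscale):
    """Compute fp + fixed_offset + scalable_offset * vscale step by step."""
    if fits_simm12(fixed_offset):
        address = fp + fixed_offset          # a single ADDI is enough
    else:
        scratch = fixed_offset               # materialized by an instruction sequence
        address = fp + scratch               # %scratch = ADD %fp, %scratch
    if scalable_offset != 0:
        scaled = scalable_offset * vscale    # %scalable_offset = ...
        address = address + scaled           # ADD/SUB depending on the sign
    return address
```

The point of the fix is the last step: the scalable part must still be applied after the scratch register has been used for the fixed part.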

Differential Revision: https://reviews.llvm.org/D100035

3 years ago[NFC][RISCV] Add test for scalable offset under large stack size.
Hsiangkai Wang [Thu, 8 Apr 2021 02:44:09 +0000 (10:44 +0800)]
[NFC][RISCV] Add test for scalable offset under large stack size.

This test case shows that we access the wrong stack slots when the frame
object has a scalable offset under a large stack size.

Differential Revision: https://reviews.llvm.org/D100084

3 years ago[Constant] Remove unused variable
Juneyoung Lee [Thu, 8 Apr 2021 06:44:42 +0000 (15:44 +0900)]
[Constant] Remove unused variable

3 years ago[Constant] ConstantStruct/Array should not lower poison to undef
Juneyoung Lee [Thu, 8 Apr 2021 06:20:08 +0000 (15:20 +0900)]
[Constant] ConstantStruct/Array should not lower poison to undef

This is a (late) follow-up patch of 8871a4b4cab8a56fd6ff12fd024002c3c79128b3 and
c95f39891a282ebf36199c73b705d4a2c78a46ce to make ConstantStruct::get/ConstantArray::getImpl
correctly return PoisonValue if all elements are poison.
This was found while discussing the elements of a vector-typed UndefValue (D99853).
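
A minimal Python model of the intended folding rule is shown below. The names are hypothetical and the treatment of a mix of poison and undef elements is an assumption, not something the message states; only the "all elements are poison" case is taken from the text.

```python
POISON = "poison"
UNDEF = "undef"


def fold_aggregate(elements):
    """Fold a constant struct/array whose elements are all poison or undef."""
    if elements and all(e == POISON for e in elements):
        return POISON  # the case this patch fixes: previously lowered to undef
    if elements and all(e in (POISON, UNDEF) for e in elements):
        return UNDEF   # assumption: a mix of poison and undef folds to undef
    return tuple(elements)  # otherwise keep a real aggregate
```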

3 years ago[CSSPGO] Move pseudo probes to the beginning of a block to unblock SelectionDAG combine.
Hongtao Yu [Wed, 7 Apr 2021 01:32:23 +0000 (18:32 -0700)]
[CSSPGO] Move pseudo probes to the beginning of a block to unblock SelectionDAG combine.

Pseudo probes, when scattered in a block, can be chained dependencies of other regular DAG nodes and block DAG combine optimizations. To fix this, scattered probes in a block are grouped and placed at the beginning of the block. This shouldn't affect the profile quality.

Test Plan:

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D100002

3 years agoChange the default value for `preloadDialectsInContext` for MlirOptMain
Mehdi Amini [Thu, 8 Apr 2021 03:46:56 +0000 (03:46 +0000)]
Change the default value for `preloadDialectsInContext` for MlirOptMain

This option has been deprecated for 6 months. Change the default setting for now,
before its future removal.

While clients can set the option to true for now, they should start
updating their passes to define the right `dependentDialects` in
preparation of the removal of this option. See the FAQ for more info:
https://mlir.llvm.org/getting_started/Faq/

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D99025

3 years agoInclude `llvm-config` and `not` in AppleClang toolchains.
Dan Liew [Thu, 8 Apr 2021 04:21:07 +0000 (21:21 -0700)]
Include `llvm-config` and `not` in AppleClang toolchains.

The motivation here is so that we can configure and run compiler-rt
tests from a standalone build against AppleClang.

rdar://75975846

Differential Revision: https://reviews.llvm.org/D100086

3 years ago[docs] Document our norms around reverts
Philip Reames [Thu, 8 Apr 2021 03:59:40 +0000 (20:59 -0700)]
[docs] Document our norms around reverts

This has come up a few times recently, and I was surprised to notice that we don't have anything in the docs.

This patch deliberately sticks to stuff that is uncontroversial in the community. Everything herein is thought to be widely agreed to by a large majority of the community. A few things that failed this standard were noted and removed in review; if you spot anything else, please point it out.

Differential Revision: https://reviews.llvm.org/D99305

3 years ago[Driver] Drop $DEFAULT_TRIPLE-$name as a fallback program name
Fangrui Song [Thu, 8 Apr 2021 04:01:10 +0000 (21:01 -0700)]
[Driver] Drop $DEFAULT_TRIPLE-$name as a fallback program name

D13340 introduced this behavior which is not needed even for mips.
This was raised on https://lists.llvm.org/pipermail/cfe-dev/2020-May/065437.html
but no action was taken.

This was raised again in https://lists.llvm.org/pipermail/cfe-dev/2021-April/067974.html
"The LLVM host/target TRIPLE padding drama on Debian"
as it caused confusion. This patch drops the behavior.

Differential Revision: https://reviews.llvm.org/D99996

3 years ago[RISCV] DAG nodes and pseudo instructions for CSR access
Serge Pavlov [Thu, 18 Mar 2021 15:07:27 +0000 (22:07 +0700)]
[RISCV] DAG nodes and pseudo instructions for CSR access

New custom DAG nodes were added to represent operations on CSRs. These
nodes are lowered to the corresponding pseudo instructions. Using the pseudo
instructions makes it possible to specify different scheduling information for
operations on different system registers. It also makes it possible to
specify dependencies of instructions on specific system registers.

Differential Revision: https://reviews.llvm.org/D98936

3 years ago[AMDGPU] Only use ds_read/write_b128 for alignment >= 16
hsmahesha [Thu, 8 Apr 2021 02:41:42 +0000 (08:11 +0530)]
[AMDGPU] Only use ds_read/write_b128 for alignment >= 16

PS: Submitting on behalf of Jay.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D100008

3 years ago[AMDGPU] Add some exhaustive ds read/write alignment tests
hsmahesha [Thu, 8 Apr 2021 02:37:32 +0000 (08:07 +0530)]
[AMDGPU] Add some exhaustive ds read/write alignment tests

PS: Submitting on behalf of Jay.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D100007

3 years ago[PowerPC] fixup killed flags for ri + addi to ri transformation
Chen Zheng [Thu, 8 Apr 2021 01:46:25 +0000 (21:46 -0400)]
[PowerPC] fixup killed flags for ri + addi to ri transformation

Fix up killed flags if DefMI and MI are not in the same basic block.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D100023

3 years agoRevert "[LoopInterchange] Fix transformation bugs in loop interchange"
Congzhe Cao [Thu, 8 Apr 2021 01:15:15 +0000 (21:15 -0400)]
Revert "[LoopInterchange] Fix transformation bugs in loop interchange"

This reverts commit 6ec68bd815d00c1eec2a6b9766452554f0e6cb61.

3 years ago[NFC][AMDGPU] Correct indentation in AMDGPUUsage.rst
Tony Tye [Thu, 8 Apr 2021 00:58:02 +0000 (00:58 +0000)]
[NFC][AMDGPU] Correct indentation in AMDGPUUsage.rst

Correct indentation that results in an rST syntax error.

3 years ago[LoopInterchange] Fix transformation bugs in loop interchange
CongzheUalberta [Thu, 8 Apr 2021 00:44:32 +0000 (20:44 -0400)]
[LoopInterchange] Fix transformation bugs in loop interchange

After loop interchange, the (old) outer loop header should not jump to
`LoopExit`. Note that the old outer loop becomes the new inner loop
after interchange. If we branched to `LoopExit` then after interchange
we would jump directly from the (new) inner loop header to `LoopExit`
without executing the rest of (new) outer loop.

This patch modifies adjustLoopBranches() such that the old outer
loop header (which becomes the new inner loop header) jumps to the
old inner loop latch which becomes the new outer loop latch after
interchange.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D98475

3 years ago[RISCV] Use multiclass inheritance to simplify some of riscv_vector.td. NFCI
Craig Topper [Thu, 8 Apr 2021 00:33:20 +0000 (17:33 -0700)]
[RISCV] Use multiclass inheritance to simplify some of riscv_vector.td. NFCI

We don't need to instantiate single multiclasses inside of
other multiclasses. We can use inheritance and save writing 'defm ""'.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D100074

3 years ago[lld-macho] Parallelize __LINKEDIT generation
Jez Ng [Wed, 7 Apr 2021 23:55:45 +0000 (19:55 -0400)]
[lld-macho] Parallelize __LINKEDIT generation

Benchmarking chromium_framework on a 3.2 GHz 16-Core Intel Xeon W Mac Pro:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.33          4.42          4.37          4.37   0.021026299
  +  20          4.12          4.23          4.18         4.175   0.035318103
  Difference at 95.0% confidence
    -0.195 +/- 0.0186025
    -4.46224% +/- 0.425686%
    (Student's t, pooled s = 0.0290644)
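
The summary lines above can be reproduced from the per-run statistics. The sketch below is a hedged reconstruction in Python, not the benchmarking tool's code: the function names are hypothetical, and the critical value 2.024 (two-sided 95% Student's t with 38 degrees of freedom) is supplied by hand as an assumption.

```python
import math


def pooled_stddev(s1, n1, s2, n2):
    # Pooled standard deviation of two samples.
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def difference_summary(avg_base, s_base, n_base, avg_new, s_new, n_new, t_crit):
    """Return (diff, half_width, diff_pct, half_width_pct, pooled_s)."""
    diff = avg_new - avg_base
    pooled = pooled_stddev(s_base, n_base, s_new, n_new)
    half_width = t_crit * pooled * math.sqrt(1.0 / n_base + 1.0 / n_new)
    return (diff, half_width,
            100.0 * diff / avg_base, 100.0 * half_width / avg_base, pooled)


# Values taken from the table above; 2.024 is the assumed t critical value.
summary = difference_summary(4.37, 0.021026299, 20, 4.175, 0.035318103, 20, 2.024)
```

With these inputs the difference is the new average minus the baseline average, and the percentages are relative to the baseline average.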

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99998

3 years agoDisable use of SCC bit from asm
Stanislav Mekhanoshin [Wed, 7 Apr 2021 17:34:53 +0000 (10:34 -0700)]
Disable use of SCC bit from asm

Differential Revision: https://reviews.llvm.org/D100069

3 years ago[AMDGPU] Update gfx90a memory model support
Tony Tye [Tue, 30 Mar 2021 22:38:19 +0000 (22:38 +0000)]
[AMDGPU] Update gfx90a memory model support

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D100070

3 years ago[AMDGPU] Split GCNRegBankReassign
Stanislav Mekhanoshin [Wed, 7 Apr 2021 19:45:13 +0000 (12:45 -0700)]
[AMDGPU] Split GCNRegBankReassign

Allow pass to work separately with SGPR, VGPR registers or both.
This is NFC now but will be needed to split RA for separate
SGPR and VGPR passes.

Differential Revision: https://reviews.llvm.org/D100063

3 years ago[BasicAA] Add another GEP modulo test with shl with odd op.
Florian Hahn [Wed, 7 Apr 2021 21:26:01 +0000 (22:26 +0100)]
[BasicAA] Add another GEP modulo test with shl with odd op.

3 years ago[InstCombine] fold not ops around min/max intrinsics
Sanjay Patel [Wed, 7 Apr 2021 21:19:15 +0000 (17:19 -0400)]
[InstCombine] fold not ops around min/max intrinsics

This is another step towards parity with the existing
cmp+select folds (see D98152).

3 years ago[InstCombine] add test for min/max intrinsic with not ops; NFC
Sanjay Patel [Wed, 7 Apr 2021 21:14:08 +0000 (17:14 -0400)]
[InstCombine] add test for min/max intrinsic with not ops; NFC

3 years ago[LLDB] Clarifying the documentation for variable formatting wrt to qualifiers and...
Shafik Yaghmour [Wed, 7 Apr 2021 21:29:12 +0000 (14:29 -0700)]
[LLDB] Clarifying the documentation for variable formatting wrt to qualifiers and adding a test that demonstrates this

When looking up user-specified formatters, qualifiers are removed from types before matching.
I have added a clarifying example to the documentation and an example to a relevant test to demonstrate this behavior.

Differential Revision: https://reviews.llvm.org/D99827

3 years ago[RISCV] Add a special case to lowerSELECT for select of 2 constants with a SETLT...
Craig Topper [Wed, 7 Apr 2021 20:46:16 +0000 (13:46 -0700)]
[RISCV] Add a special case to lowerSELECT for select of 2 constants with a SETLT condition.

If the constants have a difference of 1 we can convert one to
the other by adding or subtracting the condition.

We have a DAG combine for this, but it only runs before type
legalization. If the select is introduced later during type
legalization or op legalization we will miss it.

We don't need a specific condition, but some conditions are
harder to materialize than others on RISCV. I know that SETLT
will be a single instruction and it is what is used by the
motivating pattern from signed saturating add/sub.
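
The identity behind this special case can be checked with a small Python model. The function name is hypothetical, constants are treated as unsigned bit patterns of a fixed width so that the wrapping case from saturating arithmetic (INT_MAX versus INT_MIN) is covered, and `cond` is the 0/1 result of the comparison.

```python
def lower_select_of_constants(cond, true_const, false_const, bits=32):
    """Model select(cond, true_const, false_const) for a 0/1 condition."""
    mask = (1 << bits) - 1
    if (true_const - false_const) & mask == 1:
        # true = false + 1, so adding the condition picks the right value.
        return (false_const + cond) & mask
    if (false_const - true_const) & mask == 1:
        # true = false - 1, so subtracting the condition picks the right value.
        return (false_const - cond) & mask
    # Generic fallback: an ordinary select.
    return true_const if cond else false_const
```

For example, with `true_const = 0x80000000` and `false_const = 0x7FFFFFFF` the difference wraps to 1, so the select becomes a single add of the condition.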

Differential Revision: https://reviews.llvm.org/D99021

3 years ago[libc++abi] Adjust XFAIL for misaligned exception header on ARM
Louis Dionne [Wed, 7 Apr 2021 20:14:00 +0000 (16:14 -0400)]
[libc++abi] Adjust XFAIL for misaligned exception header on ARM

On ARM, the alignment has always been the right one, so this test never
fails.

3 years ago[Driver][test] Test intended target only
Jinsong Ji [Wed, 7 Apr 2021 20:08:24 +0000 (20:08 +0000)]
[Driver][test] Test intended target only

6fe7de90b9e4e466a5c2baadafd5f72d3203651d changed the GNU toolchain
and added a new RUN line to test the expected behavior.

The change is for the GNU toolchain only, so the test fails on other
toolchains, e.g. AIX.

Update the test with `-target` to test the GNU toolchain only.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99901

3 years ago[lld-macho] Make time-trace* options more permissive.
Vy Nguyen [Wed, 7 Apr 2021 03:40:41 +0000 (23:40 -0400)]
[lld-macho] Make time-trace* options more permissive.

If either `time-trace-granularity` or `time-trace-file` is specified, then don't make users specify `-time-trace`.
It seems silly that I have to type all three options, eg, `-time-trace -time-trace-file=- -time-trace-granularity=...`.
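
The more permissive rule amounts to a simple "any of the three" check. The Python sketch below uses a hypothetical dictionary of parsed options and is not lld's option-handling code.

```python
TIME_TRACE_OPTIONS = ("time-trace", "time-trace-file", "time-trace-granularity")


def time_trace_enabled(args):
    """Tracing is on if any time-trace related option was given."""
    return any(name in args for name in TIME_TRACE_OPTIONS)
```

Under this rule, passing only `-time-trace-file=-` is enough to turn tracing on.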

Differential Revision: https://reviews.llvm.org/D100011

3 years agoFix missing generate capture expression for novariants condition.
Jennifer Yu [Wed, 7 Apr 2021 16:26:14 +0000 (09:26 -0700)]
Fix missing generate capture expression for novariants condition.

3 years ago[clang] Move int <-> float scalar conversion to a separate function
Saurabh Jha [Wed, 7 Apr 2021 19:09:50 +0000 (12:09 -0700)]
[clang] Move int <-> float scalar conversion to a separate function

As a prelude to https://reviews.llvm.org/D99037, we want to move the
int-float conversion into a separate function that can be reused by
matrix casts.
Differential Revision: https://reviews.llvm.org/D100051

3 years ago[mlir] Fixed alignment attribute of alloc constant folding.
Haruki Imai [Wed, 7 Apr 2021 19:17:19 +0000 (19:17 +0000)]
[mlir] Fixed alignment attribute of alloc constant folding.

When allocLikeOp is updated in alloc constant folding,
the alignment attribute was ignored. This patch fixes it.

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D99882

3 years agoRemove .gitignore entries not relevant in the monorepo.
Paul Robinson [Wed, 7 Apr 2021 17:21:39 +0000 (10:21 -0700)]
Remove .gitignore entries not relevant in the monorepo.

Differential Revision: https://reviews.llvm.org/D100049

3 years ago[RISCV] Replace 'return ReplaceNode' with 'ReplaceNode; return;' NFC
Craig Topper [Wed, 7 Apr 2021 19:17:42 +0000 (12:17 -0700)]
[RISCV] Replace 'return ReplaceNode' with 'ReplaceNode; return;' NFC

ReplaceNode is a void function as is the function that we were
doing this in. While this is valid code, it was a bit confusing.

3 years ago[BasicAA] Extend test coverage for GEP modulo logic.
Florian Hahn [Wed, 7 Apr 2021 18:59:17 +0000 (19:59 +0100)]
[BasicAA] Extend test coverage for GEP modulo logic.

Add a few additional test cases which combine multiplies with
powers-of-2, different wrapping flags.

3 years ago[AArch64] Materialize FP constant in code for large code model
Jonas Hahnfeld [Tue, 30 Mar 2021 16:28:54 +0000 (18:28 +0200)]
[AArch64] Materialize FP constant in code for large code model

When using the large code model with FastISel (for example via
clang -O0 which adds the optnone attribute), FP constants could
still be materialized using adrp + ldr. Unconditionally enable
the existing path for MachO to materialize the constant in code.

For testing, restore literal_pools_float.ll to exercise the constant
pool and add two optnone-functions that return a float and a double,
respectively. Consolidate fpimm.ll and add a new fast-isel-fpimm.ll
to check the code paths taken with FastISel.

Differential Revision: https://reviews.llvm.org/D99607

3 years agoRevert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()"
Arthur Eubanks [Wed, 7 Apr 2021 18:26:18 +0000 (11:26 -0700)]
Revert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()"

This reverts commit 9583a3f2625818b78c0cf6d473cdedb9f23ad82c.

This wasn't NFC as initially thought. Needed for D99707.

3 years ago[Windows] Remove global OF_None flag for Windows in ToolOutputFiles
Abhina Sreeskantharajan [Wed, 7 Apr 2021 18:09:21 +0000 (14:09 -0400)]
[Windows] Remove global OF_None flag for Windows in ToolOutputFiles

Since we have created a new OF_TextWithCRLF flag, we no longer need to worry about OF_Text flag turning on CRLF translation. I can remove this workaround I added to globally open all ToolOutputFiles as binary on Windows.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D100034

3 years agoCorrect the tablegen logic for MutualExclusions attribute checking.
Aaron Ballman [Wed, 7 Apr 2021 17:59:54 +0000 (13:59 -0400)]
Correct the tablegen logic for MutualExclusions attribute checking.

Just because an attribute is a statement attribute doesn't mean it's
not also a declaration attribute. In Clang, there are not currently any
DeclOrStmtAttr attributes that require mutual exclusion checking, but
downstream clients discovered this issue.

3 years ago[lld-macho][nfc] Minor refactoring + clang-tidy fixes
Vy Nguyen [Wed, 7 Apr 2021 05:32:59 +0000 (01:32 -0400)]
[lld-macho][nfc] Minor refactoring + clang-tidy fixes

- use "empty()" instead of "size()"
- refactor the re-export code so it doesn't create a new vector every time.

Differential Revision: https://reviews.llvm.org/D100019

3 years ago[lldb][Editline] Fix crash when navigating through empty command history.
Jordan Rupprecht [Wed, 7 Apr 2021 16:55:20 +0000 (09:55 -0700)]
[lldb][Editline] Fix crash when navigating through empty command history.

An empty history entry can happen by entering the expression evaluator and immediately hitting enter:

```
$ lldb
(lldb) e
Enter expressions, then terminate with an empty line to evaluate:
  1:  <hit enter>
```

The next time the user enters the expression evaluator, if they hit the up arrow to load the previous expression, lldb crashes. This patch treats empty history sessions as a single expression of zero length, instead of an empty list of expressions.
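
The difference between "an empty list of expressions" and "a single expression of zero length" is easy to reproduce in Python. The helper below is a hypothetical model of the fixed behavior, not lldb's Editline code.

```python
def history_lines(entry):
    """Split a history entry into lines, never returning an empty list."""
    lines = entry.splitlines()  # note: "".splitlines() is an empty list
    return lines if lines else [""]
```

Code that later indexes the first line of a recalled entry is then safe even when the entry is empty.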

Fixes http://llvm.org/PR49845.

Differential Revision: https://reviews.llvm.org/D100048

3 years ago[RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.
Craig Topper [Wed, 7 Apr 2021 17:14:59 +0000 (10:14 -0700)]
[RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.

This can't use our normal strategy of splatting the scalar and using
a .vv operation instead of .vx.

Instead this patch bitcasts the vector to the equivalent SEW=32
vector and inserts the scalar parts using two vslide1up/down. We
do that unmasked and apply the mask separately at the end with
a vmerge.
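
The double-slide strategy can be modelled with plain lists. This Python sketch uses hypothetical helper names, assumes little-endian element order when viewing an i64 vector as i32 elements, and leaves out masking (which the patch applies separately at the end).

```python
MASK32 = 0xFFFFFFFF


def to_sew32(vec64):
    # View each 64-bit element as (low half, high half).
    out = []
    for v in vec64:
        out += [v & MASK32, (v >> 32) & MASK32]
    return out


def from_sew32(vec32):
    return [vec32[i] | (vec32[i + 1] << 32) for i in range(0, len(vec32), 2)]


def slide1up(vec, scalar):
    return [scalar] + vec[:-1] if vec else []


def slide1down(vec, scalar):
    return vec[1:] + [scalar] if vec else []


def slide1up_i64_on_rv32(vec64, scalar):
    lo, hi = scalar & MASK32, (scalar >> 32) & MASK32
    v = slide1up(to_sew32(vec64), hi)   # insert the high half first
    v = slide1up(v, lo)                 # then the low half ends up in element 0
    return from_sew32(v)


def slide1down_i64_on_rv32(vec64, scalar):
    lo, hi = scalar & MASK32, (scalar >> 32) & MASK32
    v = slide1down(to_sew32(vec64), lo)  # low half first
    v = slide1down(v, hi)                # high half ends up in the last element
    return from_sew32(v)
```

The order of the two scalar halves is what differs between the up and down variants.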

For vslide1up there may be some other options here like getting
i64 into element 0 and using vslideup.vi with this vector as
vd and the original source as vs1. Masking would still need to
be done afterwards.

That idea doesn't work for vslide1down. We need to slidedown and
then insert a single scalar at vl-1 which we could do with a
vslideup, but that assumes vl > 0 which I don't think we can assume.

The i32 double slide1down implemented here is the best I could come
up with and I just made vslide1up consistent.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D99910

3 years ago[HIP] Fix rocm-detect.hip test path
Aaron En Ye Shi [Wed, 7 Apr 2021 17:19:28 +0000 (17:19 +0000)]
[HIP] Fix rocm-detect.hip test path

The ROCm installation directory may be another
directory, llvm/ inside the build directory.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D100045

3 years ago[SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR
Craig Topper [Wed, 7 Apr 2021 17:03:31 +0000 (10:03 -0700)]
[SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR

This allows FoldConstantArithmetic to handle SPLAT_VECTOR in
addition to BUILD_VECTOR. This allows it to support scalable
vectors. I'm also allowing fixed length SPLAT_VECTOR which is
used by some targets, but I'm not familiar enough to write tests
for those targets.

I had to block this function from running on CONCAT_VECTORS to
avoid calling getNode for a CONCAT_VECTORS of 2 scalars.
This can happen because the 2 operand getNode calls this
function for any opcode. Previously we were protected because
CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR
before that call. But it's not always possible to fold a CONCAT_VECTORS
of SPLAT_VECTORs, and we don't even try.

This fixes PR49781 where DAG combine thought constant folding
should be possible, but FoldConstantArithmetic couldn't do it.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D99682

3 years ago[LoopIdiomRecognize] Minor cleanups to the FFS idiom matching. NFC
Craig Topper [Wed, 7 Apr 2021 16:44:52 +0000 (09:44 -0700)]
[LoopIdiomRecognize] Minor cleanups to the FFS idiom matching. NFC

-Make use of the CreateShl/LShr/AShr methods that take a uint64_t
instead of creating a ConstantInt for 1 ourselves.
-Use Builder.getInt1 or ConstantInt::getBool instead of a conditional.
-Pull out repeated calls to getType.

3 years ago[mlir][sparse] support integral types i32,i16,i8 for *numerical* values
Aart Bik [Tue, 6 Apr 2021 23:46:27 +0000 (16:46 -0700)]
[mlir][sparse] support integral types i32,i16,i8 for *numerical* values

Some sparse matrices operate on integral values (in contrast with the common
f32 and f64 values). This CL expands the compiler and runtime support to deal
with several common type combinations.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D99999

3 years agoAvoid testing for libc++ internal macros after D99834
Dimitry Andric [Wed, 7 Apr 2021 16:51:53 +0000 (18:51 +0200)]
Avoid testing for libc++ internal macros after D99834

As D99834 was meant specifically for FreeBSD, which still uses the older
non-trivial std::pair copy constructors, test for `__FreeBSD__` instead
of relying on a macro which is an internal detail of libc++.

Noted by Louis Dionne.

3 years ago[InstCombine] foldAddWithConstant(): don't deal with non-immediate constants
Roman Lebedev [Wed, 7 Apr 2021 16:46:30 +0000 (19:46 +0300)]
[InstCombine] foldAddWithConstant(): don't deal with non-immediate constants

All of the code that handles a general constant here (other than the more
restrictive APInt-dealing code) expects that it is an immediate,
because otherwise we won't actually fold the constants, and will increase
the instruction count. And it isn't obvious why we'd be okay with
increasing the number of constant expressions;
those will still have to be run.

But after 2829094a8e252d04f13aabdf6f416c42a06af695
this could also cause endless combine loops.
So actually properly restrict this code to immediates.

3 years ago[libc++] Update contributor documentation.
Mark de Wever [Wed, 24 Mar 2021 18:54:40 +0000 (19:54 +0100)]
[libc++] Update contributor documentation.

The document has the following updates:
- Rename 'feature test' to 'feature-test', the latter is the spelling
  used in the Standard.
- Add information on how an ABI list can be downloaded from Buildkite.

Differential Revision: https://reviews.llvm.org/D99290

3 years ago[InstCombine] avoid infinite loop from partial undef vectors
Sanjay Patel [Wed, 7 Apr 2021 16:11:23 +0000 (12:11 -0400)]
[InstCombine] avoid infinite loop from partial undef vectors

This fixes the examples from
D99674 and
https://llvm.org/PR49878

The matchers succeed on partial undef/poison vector constants,
but the transform creates a full 'not' (-1) constant, so it
would undo a demanded vector elements change triggered by the
extractelement.

Differential Revision: https://reviews.llvm.org/D100044

3 years ago[libcxx] adds __cpp_lib_concepts feature-test macro
Christopher Di Bella [Fri, 2 Apr 2021 18:07:31 +0000 (18:07 +0000)]
[libcxx] adds __cpp_lib_concepts feature-test macro

Also adjusts C++20 status paper to indicate full concepts support.

Depends on D96477, D99817.

Differential Revision: https://reviews.llvm.org/D99805

3 years ago[libcxx] adds remaining callable concepts
Christopher Di Bella [Wed, 31 Mar 2021 05:28:25 +0000 (05:28 +0000)]
[libcxx] adds remaining callable concepts

* `std::predicate`
* `std::relation`
* `std::equivalence_relation`
* `std::strict_weak_order`

Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Differential Revision: https://reviews.llvm.org/D96477

3 years ago[lld-macho] Sibling N_SO symbols must have the empty string
Jez Ng [Wed, 7 Apr 2021 16:08:14 +0000 (12:08 -0400)]
[lld-macho] Sibling N_SO symbols must have the empty string

We had been giving them a string index of zero, which actually corresponds to a
string with a single space due to D89639.

This was far from obvious in the old test because llvm-nm doesn't quote the
symbol names, making the empty string look identical to a string of a single
space. `dsymutil -s` quotes its strings, so I've changed the test accordingly.

Fixes llvm.org/PR48714. Thanks @clayborg for the tips!

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D100003

3 years ago[lld-macho][nfc] Add test for ARM64 stubs
Jez Ng [Wed, 7 Apr 2021 16:08:12 +0000 (12:08 -0400)]
[lld-macho][nfc] Add test for ARM64 stubs

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99813

3 years ago[CSSPGO] Fix incorrect probe distribution factor computation in top-down inliner
wlei [Wed, 7 Apr 2021 15:38:13 +0000 (08:38 -0700)]
[CSSPGO] Fix incorrect probe distribution factor computation in top-down inliner

We see a regression related to a low probe factor (0.01), which prevents some callsites from being promoted in ICPPass and later causes missed inlining in the CGSCC inliner. The root cause is a redundant (second) multiplication by the probe factor, and this change tries to fix it.

`Sum` is multiplied by the factor right after findCallSamples, but later, when it is used as the parameter to setProbeDistributionFactor, it is multiplied by the factor once again.

This change could recover ~2% performance on the mcf benchmark. In mcf, the corresponding factor was previously 1; a recent feature that introduces factors below 1 then triggered this bug.
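
The arithmetic behind the bug can be sketched as follows. This is a minimal illustration only: the helper names are hypothetical and are not taken from the LLVM sources, and the sample count of 1000 is made up. It shows why the double scaling stays invisible while the factor is 1 and becomes severe once the factor drops to 0.01.

```cpp
#include <cassert>

// Hypothetical helper: the intended scaling, applied exactly once.
constexpr double applyFactorOnce(double Samples, double Factor) {
  return Samples * Factor;
}

// Hypothetical helper: the buggy path, where an already scaled value is
// scaled a second time, so the effective factor becomes Factor * Factor.
constexpr double applyFactorTwice(double Samples, double Factor) {
  return applyFactorOnce(Samples, Factor) * Factor;
}

// With a factor of exactly 1 both paths agree, so the bug stays hidden.
inline void checkFactorOfOneHidesTheBug(double Samples) {
  assert(applyFactorOnce(Samples, 1.0) == applyFactorTwice(Samples, 1.0));
}
```

For example, with 1000 samples and a factor of 0.01, scaling once gives roughly 10 while scaling twice gives roughly 0.1, which is two orders of magnitude too small.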

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D99787

3 years ago[X86][AVX] Add HADD lane crossing test
Simon Pilgrim [Wed, 7 Apr 2021 15:43:36 +0000 (16:43 +0100)]
[X86][AVX] Add HADD lane crossing test

This used to work before rG77d625f8d8aa, but we now merge the shuffles across the fadd, resulting in a hadd that requires a lane-crossing post shuffle, which we don't permit on AVX1 targets.

3 years ago[mlir] Export python-related .cmake files
Nicolas Vasilache [Wed, 7 Apr 2021 15:21:40 +0000 (15:21 +0000)]
[mlir] Export python-related .cmake files

This allows downstream projects to build python extensions using the same macros as MLIR.

Differential Revision: https://reviews.llvm.org/D100040

3 years ago[SystemZ][z/OS][TableGen] TableGen files should be text
Abhina Sreeskantharajan [Wed, 7 Apr 2021 15:21:33 +0000 (11:21 -0400)]
[SystemZ][z/OS][TableGen] TableGen files should be text

This patch sets tablegen files as text. It should have no effect on Windows now that the following patch has landed: https://reviews.llvm.org/rG82b3e28e836d2f5c8cfd6e1047b93c088522365a.

Reviewed By: anirudhp

Differential Revision: https://reviews.llvm.org/D100036

3 years ago[mlir,shape] Update min/max op description
Jacques Pienaar [Wed, 7 Apr 2021 15:21:15 +0000 (08:21 -0700)]
[mlir,shape] Update min/max op description

3 years ago[SVE] Remove checks for warnings in scalable-vector tests.
Sander de Smalen [Wed, 17 Mar 2021 21:46:32 +0000 (21:46 +0000)]
[SVE] Remove checks for warnings in scalable-vector tests.

After D98856 these tests will by default break (fatal_error) if any of
the wrong interfaces are used, so there's no longer a need to have a
RUN line that checks for a warning message emitted by the compiler.

3 years ago[WebAssembly] Improve error messages regarding missing indirect function table. NFC
Sam Clegg [Tue, 6 Apr 2021 15:06:18 +0000 (08:06 -0700)]
[WebAssembly] Improve error messages regarding missing indirect function table. NFC

Use report_fatal_error here since this is an internal error, and not
something the user can/should be trying to fix.

Also distinguish between the symbol being missing and the symbol having
the wrong type.

We have a failure internally where the symbol is missing.  Currently
trying to reduce the test case to something we can attach to an llvm
bug.

Differential Revision: https://reviews.llvm.org/D99960

3 years ago[AMDGPU] Update SGPRSpillVGPRCSR name. NFC
Sebastian Neubauer [Thu, 1 Apr 2021 12:50:59 +0000 (14:50 +0200)]
[AMDGPU] Update SGPRSpillVGPRCSR name. NFC

The struct is now used for both callee- and caller-save registers.
The frame index is not set for entrypoints, as we do not need to save
the registers then.
Update the struct name to reflect that.

Differential Revision: https://reviews.llvm.org/D99722

3 years ago[NPM] Fix typo in isLTOPreLink for loop rotate
Jingu Kang [Wed, 7 Apr 2021 13:22:32 +0000 (14:22 +0100)]
[NPM] Fix typo in isLTOPreLink for loop rotate

Differential Revision: https://reviews.llvm.org/D100033

3 years agoRevert "[clang] Speedup line offset mapping computation"
Nico Weber [Wed, 7 Apr 2021 13:42:11 +0000 (09:42 -0400)]
Revert "[clang] Speedup line offset mapping computation"

This reverts commit 6951b72334bbe4c189c71751edc1e361d7b5632c.
Breaks several bots, see comments on https://reviews.llvm.org/D99409

3 years ago[X86] Improve optimizeCompareInstr for signed comparisons after AND/OR/XOR instructions
Simon Pilgrim [Wed, 7 Apr 2021 13:07:35 +0000 (14:07 +0100)]
[X86] Improve optimizeCompareInstr for signed comparisons after AND/OR/XOR instructions

Extend D94856 to handle 'and', 'or' and 'xor' instructions as well

We still fail on many i8/i16 cases as the test and the logic-op are performed on different widths

3 years ago[SLP]Avoid multiple attempts to vectorize CmpInsts.
Alexey Bataev [Tue, 6 Apr 2021 12:59:03 +0000 (05:59 -0700)]
[SLP]Avoid multiple attempts to vectorize CmpInsts.

There is no need to look through and/or try to vectorize the operands of
CmpInst instructions during attempts to find/vectorize min/max
reductions. The compiler implements post-analysis of the CmpInsts, so we
can skip the extra attempts in tryToVectorizeHorReductionOrInstOperands
and save compile time.

Differential Revision: https://reviews.llvm.org/D99950

3 years ago[flang][driver] Fix `-fdebug-dump-provenance`
Andrzej Warzynski [Wed, 7 Apr 2021 13:10:35 +0000 (13:10 +0000)]
[flang][driver] Fix `-fdebug-dump-provenance`

The -fdebug-dump-provenance flag is meant to be used with
needProvenanceRangeToCharBlockMappings set to true. This way, extra
mapping is generated that allows e.g. IDEs to retrieve a symbol's scope
(offset into cooked character stream) based on the symbol's source code
location. This patch makes sure that this option is set when using
-fdebug-dump-provenance.

With this patch, the implementation of -fdebug-dump-provenance in
`flang-new -fc1` becomes consistent with `f18`. The corresponding LIT
test is updated so that it can be shared with `f18`. I refined it a bit
so that:
  * it becomes a frontend-only test
  * it's stricter about the expected output

Differential Revision: https://reviews.llvm.org/D98847

3 years ago[AMDGPU] SIFoldOperands: don't dump extra '\n' after MachineInstr. NFC.
Jay Foad [Wed, 7 Apr 2021 13:03:17 +0000 (14:03 +0100)]
[AMDGPU] SIFoldOperands: don't dump extra '\n' after MachineInstr. NFC.

3 years ago[flang][driver] Add support for `-cpp/-nocpp`
Andrzej Warzynski [Wed, 7 Apr 2021 11:42:37 +0000 (11:42 +0000)]
[flang][driver] Add support for `-cpp/-nocpp`

This patch adds support for the `-cpp` and `-nocpp` flags. The
implemented semantics match f18 (i.e. the "throwaway" driver), but are
different to gfortran. In Flang the preprocessor is always run. Instead,
`-cpp/-nocpp` are used to control whether predefined and command-line
preprocessor macro definitions are enabled or not. In practice this is
sufficient to model gfortran's `-cpp/-nocpp`.

In the absence of `-cpp/-nocpp`, the driver will use the extension of
the input file to decide whether to include the standard macro
predefinitions. gfortran's documentation [1] was used to decide which
file extension to use for this.
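
The decision described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: `CppFlag`, `hasPreprocessedExtension` and `includePredefinedMacros` are hypothetical names, and the extension list is the set that gfortran's documentation describes as "must be preprocessed", reproduced here from memory.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string_view>

// Hypothetical model of the command-line state: neither flag, -cpp, or -nocpp.
enum class CppFlag { Unset, Cpp, NoCpp };

// Assumption: extensions that gfortran documents as requiring preprocessing.
// The comparison is case-sensitive, so "a.F90" matches but "a.f90" does not.
inline bool hasPreprocessedExtension(std::string_view Path) {
  static constexpr std::array<std::string_view, 9> Exts = {
      "F", "FOR", "FTN", "fpp", "FPP", "F90", "F95", "F03", "F08"};
  const auto Dot = Path.rfind('.');
  if (Dot == std::string_view::npos)
    return false; // No extension at all.
  const std::string_view Ext = Path.substr(Dot + 1);
  return std::find(Exts.begin(), Exts.end(), Ext) != Exts.end();
}

// An explicit flag always wins; otherwise fall back to the file extension.
inline bool includePredefinedMacros(CppFlag Flag, std::string_view Path) {
  assert(!Path.empty() && "expected an input file name");
  if (Flag == CppFlag::Cpp)
    return true;
  if (Flag == CppFlag::NoCpp)
    return false;
  return hasPreprocessedExtension(Path);
}
```

Under these assumptions, `includePredefinedMacros(CppFlag::Unset, "a.F90")` is true and `includePredefinedMacros(CppFlag::Unset, "a.f90")` is false, while an explicit `-cpp` or `-nocpp` overrides the extension in both cases.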

The logic mentioned above was added in FrontendAction::BeginSourceFile.
That's relatively late in the driver set-up, but this is roughly where the
name of the input file becomes available. The logic for deciding between
fixed and free form works in a similar way and was also moved to
FrontendAction::BeginSourceFile for consistency (and to reduce
code-duplication).

The `-cpp/-nocpp` flags are respected also when the input is read from
stdin. This is different to:
   * gfortran (behaves as if `-cpp` was used)
   * f18 (behaves as if `-nocpp` was used)

Starting with this patch, file extensions are significant and some test
files had to be renamed to reflect that. Where possible, preprocessor
tests were updated so that they can be shared between `f18` and
`flang-new`. This was implemented in addition to adding a new test for
`-cpp/-nocpp`.

[1] https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D99292

3 years ago[clang] Check AuxTarget exists when creating target in CompilerInstance
oToToT [Wed, 7 Apr 2021 12:58:48 +0000 (20:58 +0800)]
[clang] Check AuxTarget exists when creating target in CompilerInstance

D97493 separated target creation out into a single function,
`CompilerInstance::createTarget`. However, it would overwrite AuxTarget
even if it had already been set.
As @kadircet recommended in D98128, this patch checks for the existence of
AuxTarget and does not overwrite it when it has been set.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D100024

3 years ago[InstCombine] move abs transform to helper function; NFC
Sanjay Patel [Wed, 7 Apr 2021 12:12:38 +0000 (08:12 -0400)]
[InstCombine] move abs transform to helper function; NFC

The swap of the operands can affect later transforms that
are expecting a constant as operand 1. I don't think we
can trigger a bug with the current code, but I hit that
problem while drafting a new transform for min/max intrinsics.

3 years ago[InstCombine] add tests for not-of-min/max; NFC
Sanjay Patel [Tue, 6 Apr 2021 20:31:25 +0000 (16:31 -0400)]
[InstCombine] add tests for not-of-min/max; NFC

3 years ago[mlir] Add "mask" operand to vector.transfer_read/write.
Matthias Springer [Wed, 7 Apr 2021 12:11:55 +0000 (21:11 +0900)]
[mlir] Add "mask" operand to vector.transfer_read/write.

Also factors out out-of-bounds mask generation from vector.transfer_read/write into a new MaterializeTransferMask pattern.

Differential Revision: https://reviews.llvm.org/D100001

3 years ago[X86] Add AND/OR/XOR signed-comparison overflow test cases for PR48768
Simon Pilgrim [Wed, 7 Apr 2021 12:27:41 +0000 (13:27 +0100)]
[X86] Add AND/OR/XOR signed-comparison overflow test cases for PR48768

D94856 covered the BMI cases where we had existing tests; this adds the missing AND/OR/XOR test cases.

3 years ago[Clang] Extend test coverage for -f[no-]finite-loops options.
Florian Hahn [Wed, 7 Apr 2021 12:01:17 +0000 (13:01 +0100)]
[Clang] Extend test coverage for -f[no-]finite-loops options.

Extend test coverage by checking various standard versions with
-f[no-]finite-loops. Suggested as part of D96418.

3 years ago[clang] Speedup line offset mapping computation
serge-sans-paille [Thu, 1 Apr 2021 20:18:55 +0000 (22:18 +0200)]
[clang] Speedup line offset mapping computation

Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2), but the
optimization was dropped because the sequential version was on par
performance-wise.

This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.
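
The word-at-a-time idea can be sketched as follows. This is not the code from the patch: it is a simplified illustration that only looks for '\n' (it ignores '\r'), uses the classic "does this word contain a zero byte" bit trick, and falls back to a byte loop for any 8-byte word that contains a newline. All function names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <vector>

// True if any of the 8 bytes in Word equals Byte. XOR turns matching bytes
// into 0x00, then the "has zero byte" trick detects whether one exists.
constexpr bool hasByte(std::uint64_t Word, unsigned char Byte) {
  constexpr std::uint64_t Ones = 0x0101010101010101ULL;
  constexpr std::uint64_t Highs = 0x8080808080808080ULL;
  const std::uint64_t X = Word ^ (Ones * Byte);
  return ((X - Ones) & ~X & Highs) != 0;
}

// Reference version: inspect one byte at a time.
inline std::vector<std::size_t> lineOffsetsNaive(std::string_view Buf) {
  std::vector<std::size_t> Offsets{0};
  for (std::size_t I = 0; I < Buf.size(); ++I)
    if (Buf[I] == '\n')
      Offsets.push_back(I + 1);
  return Offsets;
}

// Word-at-a-time version: skip 8 bytes per step while no newline is present.
inline std::vector<std::size_t> lineOffsets(std::string_view Buf) {
  std::vector<std::size_t> Offsets{0};
  std::size_t I = 0;
  while (I + 8 <= Buf.size()) {
    std::uint64_t Word;
    std::memcpy(&Word, Buf.data() + I, 8); // Safe unaligned load.
    if (!hasByte(Word, '\n')) {
      I += 8;
      continue;
    }
    // A newline is somewhere in this word: locate it byte by byte.
    for (const std::size_t End = I + 8; I < End; ++I)
      if (Buf[I] == '\n')
        Offsets.push_back(I + 1);
  }
  // Handle the tail that is shorter than one word.
  for (; I < Buf.size(); ++I)
    if (Buf[I] == '\n')
      Offsets.push_back(I + 1);
  return Offsets;
}

// Both versions must agree on every input.
inline void checkAgainstNaive(std::string_view Buf) {
  assert(lineOffsets(Buf) == lineOffsetsNaive(Buf));
}
```

The existence test in `hasByte` does not depend on byte order, which is what makes this kind of trick portable. The benefit comes from source files where most 8-byte words contain no newline at all.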

When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.

Differential Revision: https://reviews.llvm.org/D99409

3 years ago[analyzer][NFC] Add tests for extents
Balazs Benics [Wed, 7 Apr 2021 11:42:29 +0000 (13:42 +0200)]
[analyzer][NFC] Add tests for extents

If we allocate memory, the extent of the MemRegion will be the symbolic
value of the size parameter. This way, if that symbol gets constrained,
the extent will also be constrained.

This test demonstrates that the extent is indeed the same symbol.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D99959

3 years ago[X86] Improve optimizeCompareInstr for signed comparisons after BZHI instructions
Simon Pilgrim [Wed, 7 Apr 2021 11:07:10 +0000 (12:07 +0100)]
[X86] Improve optimizeCompareInstr for signed comparisons after BZHI instructions

Extend D94856 to handle 'bzhi' instructions as well

3 years ago[-Wcompletion-handler] Don't recognize init methods as conventional
Valeriy Savchenko [Tue, 30 Mar 2021 16:06:37 +0000 (19:06 +0300)]
[-Wcompletion-handler] Don't recognize init methods as conventional

rdar://75704162

Differential Revision: https://reviews.llvm.org/D99601

3 years ago[Statepoint Lowering] Allow other than N byte sized types in deopt bundle
Yevgeny Rouban [Wed, 7 Apr 2021 10:45:05 +0000 (17:45 +0700)]
[Statepoint Lowering] Allow other than N byte sized types in deopt bundle

I do not see any bit-width restriction in the LLVM Lang Ref
(Operand Bundles section) on the types of the deopt bundle
operands. Statepoint Lowering seems to be able to work with any
types.
This patch relaxes the two related assertions and adds a new test
for this change.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100006

3 years ago[analyzer] Fix body farm for Obj-C++ properties
Valeriy Savchenko [Tue, 23 Mar 2021 15:48:58 +0000 (18:48 +0300)]
[analyzer] Fix body farm for Obj-C++ properties

When a property is declared in a superclass (or in a protocol),
it can still be of CXXRecord type, and Sema could've already
generated a body for us.  This patch joins two branches and
two ways of acquiring IVar in order to reuse the existing code.
It also prevents us from generating l-value to r-value casts for
C++ types.

rdar://67416721

Differential Revision: https://reviews.llvm.org/D99194

3 years ago[X86] Add BZHI test case for PR48768
Simon Pilgrim [Wed, 7 Apr 2021 10:20:55 +0000 (11:20 +0100)]
[X86] Add BZHI test case for PR48768

D94856 covered the BMI cases where we had existing tests; this adds a missing BZHI test case.

3 years agoFix crash when an invalid URI is parsed and error handling is attempted
crr0004 [Wed, 7 Apr 2021 10:31:41 +0000 (12:31 +0200)]
Fix crash when an invalid URI is parsed and error handling is attempted

When you pass in a payload with an invalid URI in a build with assertions enabled, it will crash.
Consuming the error from the failed URI parse prevents the crash.

The crash is caused by [llvm::Expected](https://llvm.org/doxygen/classllvm_1_1Expected.html) guarding against being destroyed without its error having been consumed first.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D99872

3 years ago[CMake] try creating symlink first on windows
Kirill Bobyrev [Wed, 7 Apr 2021 09:23:10 +0000 (11:23 +0200)]
[CMake] try creating symlink first on windows

`-E create_symlink` has been available on Windows since CMake 3.13 (LLVM now uses 3.13.4).
It may need administrator privileges or Developer Mode enabled (Windows 10).
See https://cmake.org/cmake/help/latest/release/3.13.html

Reviewed By: kbobyrev

Differential Revision: https://reviews.llvm.org/D99170

3 years ago[clang][Syntax] Handle invalid source range in expandedTokens.
Utkarsh Saxena [Tue, 6 Apr 2021 09:55:55 +0000 (11:55 +0200)]
[clang][Syntax] Handle invalid source range in expandedTokens.

Differential Revision: https://reviews.llvm.org/D99934

3 years ago[OpenCL] Add as_size/ptrdiff/intptr/uintptr_t operators
Sven van Haastregt [Wed, 7 Apr 2021 09:16:41 +0000 (10:16 +0100)]
[OpenCL] Add as_size/ptrdiff/intptr/uintptr_t operators

size_t and friends are built-in scalar data types and s6.4.4.2 of the
OpenCL C Specification says the as_type() operator must be available
for these data types.

Differential Revision: https://reviews.llvm.org/D98959

3 years ago[Orc][examples] Add missing FileCheck for lit test and polish output
Stefan Gränitz [Wed, 7 Apr 2021 09:11:27 +0000 (11:11 +0200)]
[Orc][examples] Add missing FileCheck for lit test and polish output

3 years agoReland [InstCombine] Fold `((X - Y) - Z)` to `X - (Y + Z)` (PR49858)
Roman Lebedev [Wed, 7 Apr 2021 08:04:57 +0000 (11:04 +0300)]
Reland [InstCombine] Fold `((X - Y) - Z)` to `X - (Y + Z)` (PR49858)

This reverts commit a547b4e26b311e417cd51100e379693f51a3f448,
relanding commit 31d219d2997fed1b7dc97e0adf170d5aaf65883e,
which was reverted because there was a conflicting inverse transform,
which was causing an endless combine loop, which has now been adjusted.

Original commit message:

https://alive2.llvm.org/ce/z/67w-wQ

We prefer `add`s over `sub`, and this particular xform
allows further folds to happen:

Fixes https://bugs.llvm.org/show_bug.cgi?id=49858
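
The identity behind the fold named in the title can be checked with plain wrapping integer arithmetic. This is a minimal sketch, assuming 32-bit unsigned operands to model LLVM's two's-complement `sub`/`add` without `nsw`/`nuw` flags. The function names are hypothetical, and the sketch says nothing about how InstCombine handles flags or use counts.

```cpp
#include <cassert>
#include <cstdint>

// The original form: two subtractions.
constexpr std::uint32_t subSub(std::uint32_t X, std::uint32_t Y,
                               std::uint32_t Z) {
  return (X - Y) - Z;
}

// The folded form: one add feeding one sub.
constexpr std::uint32_t subAdd(std::uint32_t X, std::uint32_t Y,
                               std::uint32_t Z) {
  return X - (Y + Z);
}

// Unsigned arithmetic wraps modulo 2^32, so both forms agree even when the
// intermediate results wrap around.
static_assert(subSub(5, 7, 9) == subAdd(5, 7, 9), "forms must agree");

inline void checkFold(std::uint32_t X, std::uint32_t Y, std::uint32_t Z) {
  assert(subSub(X, Y, Z) == subAdd(X, Y, Z));
}
```

Because both expressions are equal modulo 2^32 for all inputs, the rewrite is value-preserving for flag-free operations.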