platform/upstream/llvm.git
2 years ago[OpenMP] Add Cuda path to linker wrapper tool
Joseph Huber [Thu, 3 Feb 2022 21:41:47 +0000 (16:41 -0500)]
[OpenMP] Add Cuda path to linker wrapper tool

The linker wrapper tool uses the 'nvlink' and 'ptxas' binaries to link
and assemble device files. Previously we searched for these binaries in
the user's path. This didn't work in cases where the user
passed in a specific Cuda path to Clang. This patch changes the linker
wrapper to accept an argument for the Cuda path we can get from Clang.
This should fix #53573.
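
As a rough illustration of the lookup order described above, here is a
minimal sketch. The helper name `cudaToolPath` and the `bin/` layout are
assumptions made for this example; this is not the linker wrapper's actual
code.

```cpp
#include <string>

// Hypothetical helper: build the path used to invoke a CUDA tool such as
// "nvlink" or "ptxas". If a Cuda path was passed in, look in its "bin"
// directory; otherwise return the bare tool name so that the caller falls
// back to searching the user's PATH.
std::string cudaToolPath(const std::string &cudaPath, const std::string &tool) {
  if (cudaPath.empty())
    return tool;
  std::string result = cudaPath;
  if (result.back() != '/')
    result += '/';
  return result + "bin/" + tool;
}
```

With a Cuda path of `/opt/cuda` this yields `/opt/cuda/bin/nvlink`; with an
empty Cuda path it yields just `nvlink`.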

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D118944

2 years ago[nfc][mlgo][regalloc] Cache live interval feature components
Mircea Trofin [Tue, 1 Feb 2022 02:53:54 +0000 (18:53 -0800)]
[nfc][mlgo][regalloc] Cache live interval feature components

Lazily cache the feature components of a LiveInterval.
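
The general pattern, sketched under the assumption of a single expensive
value per object (the class `LazyFeature` is hypothetical and unrelated to
the real LiveInterval code):

```cpp
#include <functional>
#include <optional>
#include <utility>

// Hypothetical wrapper around an expensive computation. The value is
// computed on first use and the cached result is reused afterwards.
class LazyFeature {
public:
  explicit LazyFeature(std::function<double()> compute)
      : Compute(std::move(compute)) {}

  double get() {
    if (!Cached)
      Cached = Compute(); // first call: compute and remember
    return *Cached;       // later calls: reuse the cached value
  }

private:
  std::function<double()> Compute;
  std::optional<double> Cached;
};
```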

Differential Revision: https://reviews.llvm.org/D118674

2 years ago[Support] unsafe pointer arithmetic in llvm_regcomp()
Miod Vallat [Fri, 4 Feb 2022 00:50:58 +0000 (19:50 -0500)]
[Support] unsafe pointer arithmetic in llvm_regcomp()

regcomp.c uses the "start + count < end" idiom to check that there are
"count" bytes available in an array of char "start" and "end" both point
to.

This is fine, unless "start + count" goes beyond the last element of the
array. In this case, pedantic interpretation of the C standard makes
the comparison of such a pointer against "end" undefined, and optimizers
from hell will happily remove as much code as possible because of this.

An example of this occurs in regcomp.c's bothcases(), which defines
bracket[3], sets "next" to "bracket" and "end" to "bracket + 2". Then it
invokes p_bracket(), which starts with "if (p->next + 5 < p->end)"...

Because bothcases() and p_bracket() are static functions in regcomp.c,
there is a real risk of miscompilation if aggressive inlining happens.

The following diff rewrites the "start + count < end" constructs into
"end - start > count". Assuming "end" and "start" are always pointing in
the array (such as "bracket[3]" above), "end - start" is well-defined
and can be compared without trouble.
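
A minimal sketch of the rewritten check, using a hypothetical helper name
(`moreThan`) rather than the actual macros in regcomp.c:

```cpp
#include <cstddef>

// Equivalent of the old "next + count < end" test, written so that the
// out-of-range pointer "next + count" is never formed. Both pointers must
// point into (or one past the end of) the same array, which makes
// "end - next" well-defined.
bool moreThan(const char *next, const char *end, std::ptrdiff_t count) {
  return end - next > count;
}
```

For the bothcases() situation above ("next" at "bracket", "end" at
"bracket + 2"), `moreThan(next, end, 5)` is simply false, with no undefined
pointer arithmetic involved.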

As a bonus, MORE2() implies MORE() therefore SEETWO() can be simplified
a bit.

Bug report: https://github.com/llvm/llvm-project/issues/47993

Reviewed By: MaskRay, vitalybuka

Differential Revision: https://reviews.llvm.org/D97129

2 years ago[lld-macho][nfc] Eliminate InputSection::Shared
Jez Ng [Fri, 4 Feb 2022 00:53:29 +0000 (19:53 -0500)]
[lld-macho][nfc] Eliminate InputSection::Shared

Earlier in LLD's evolution, I tried to create the illusion that
subsections were indistinguishable from "top-level" sections. Thus, even
though the subsections shared many common field values, I hid those
common values away in a private Shared struct (see D105305). More
recently, however, @gkm added a public `Section` struct in D113241 that
served as an explicit way to store values that are common to an entire
set of subsections (aka InputSections). Now that we have another "common
value" struct, `Shared` has been rendered redundant. All its fields can
be moved into `Section` instead, and the pointer to `Shared` can be replaced
with a pointer to `Section`.

This `Section` pointer also has the advantage of letting us inspect other
subsections easily, simplifying the implementation of {D118798}.

P.S. I do think that having both `Section` and `InputSection` makes for
a slightly confusing naming scheme. I considered renaming `InputSection`
to `Subsection`, but that would break the symmetry with `OutputSection`.
It would also make us deviate from LLD-ELF's naming scheme.

This change is perf-neutral on my 3.2 GHz 16-Core Intel Xeon W machine:

             base           diff           difference (95% CI)
  sys_time   1.258 ± 0.031  1.248 ± 0.023  [  -1.6% ..   +0.1%]
  user_time  3.659 ± 0.047  3.658 ± 0.041  [  -0.5% ..   +0.4%]
  wall_time  4.640 ± 0.085  4.625 ± 0.063  [  -1.0% ..   +0.3%]
  samples    49             61

There's also no stat sig change in RSS (as measured by `time -l`):

           base                         diff                           difference (95% CI)
  time     998038627.097 ± 13567305.958 1003327715.556 ± 15210451.236  [  -0.2% ..   +1.2%]
  samples  31                           36

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D118797

2 years agoRevert "[ProfileData] Read and symbolize raw memprof profiles."
Snehasish Kumar [Fri, 4 Feb 2022 00:09:23 +0000 (16:09 -0800)]
Revert "[ProfileData] Read and symbolize raw memprof profiles."

This reverts commit 26f978d4c5ad0d2217940ef7625b0c3c0d576988.

This patch added a transitive dependency on libcurl via symbolize.
See discussion
https://reviews.llvm.org/D116784#inline-1137928
https://reviews.llvm.org/D113717#3295350

2 years agogithub: Fix issue-subscriber workflow
Tom Stellard [Fri, 4 Feb 2022 00:11:04 +0000 (16:11 -0800)]
github: Fix issue-subscriber workflow

This stopped working due to additional dependencies added to github-automation.py
by daf82a51a0c2ba9990cde172a4a1b8c1004d584d

2 years ago[gn build] Set -fmsc-version=1920 on Windows
Arthur Eubanks [Thu, 3 Feb 2022 23:55:53 +0000 (15:55 -0800)]
[gn build] Set -fmsc-version=1920 on Windows

Now that the minimum version of MSVC required to build LLVM has
been bumped, we see

  ../../llvm/include\llvm/Support/Compiler.h(94,2): error: LLVM requires
  at least VS 2019.
  #error LLVM requires at least VS 2019.

e.g. http://45.33.8.238/win/53703/step_4.txt

1920 corresponds to the earliest version of VS 2019.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D118713

2 years agoRevert "[instrprof][NFC] Sort link components and dedupe."
Snehasish Kumar [Thu, 3 Feb 2022 23:41:24 +0000 (15:41 -0800)]
Revert "[instrprof][NFC] Sort link components and dedupe."

This reverts commit 28ba0b9f6dd6dd08c7c2380a0c00c7170d3ddf48.

clang ppc build failed
https://lab.llvm.org/buildbot#builders/121/builds/16080

2 years ago[LLDB][NativePDB] terminal entry has lower precedence than new entry
Zequan Wu [Thu, 3 Feb 2022 23:41:18 +0000 (15:41 -0800)]
[LLDB][NativePDB] terminal entry has lower precedence than new entry

2 years ago[instrprof][NFC] Sort link components and dedupe.
Snehasish Kumar [Thu, 3 Feb 2022 23:32:32 +0000 (15:32 -0800)]
[instrprof][NFC] Sort link components and dedupe.

Accidentally added a duplicate link component in D116784.

2 years ago[clang][utils] Remove StringRef lldb summary provider
Dave Lee [Tue, 11 Jan 2022 02:46:29 +0000 (18:46 -0800)]
[clang][utils] Remove StringRef lldb summary provider

Remove the `StringRef` summary provider in favor of the implementation in
`llvm/utils/lldbDataFormatters.py`.

This implementation was resulting in errors in some cases.

Differential Revision: https://reviews.llvm.org/D116987

2 years ago[libc++] Fix chrono::duration constructor constraint
Tiago Macarios [Thu, 3 Feb 2022 15:23:15 +0000 (10:23 -0500)]
[libc++] Fix chrono::duration constructor constraint

As per [time.duration.cons]/1, the constructor constraint should be on
const Rep2&. As it is now the code will fail to compile in certain
cases, for example (https://godbolt.org/z/c7fPrcTYM):

     struct S{
          operator int() const&& noexcept = delete;
          operator int() const& noexcept;
     };

     const S &fun();

     auto k = std::chrono::microseconds{fun()};
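
The difference between the two constraints can be checked directly with
`std::is_convertible_v`. This is only a sketch of the trait behaviour for
the struct above, not libc++'s implementation:

```cpp
#include <type_traits>

struct S {
  operator int() const && noexcept = delete;
  operator int() const & noexcept;
};

// An rvalue S selects the deleted const&& overload, so a constraint on
// Rep2 itself rejects the conversion.
constexpr bool FromRvalue = std::is_convertible_v<S, int>;

// A const lvalue S selects the const& overload, so a constraint on
// const Rep2& accepts it.
constexpr bool FromConstLvalue = std::is_convertible_v<const S &, int>;

static_assert(!FromRvalue, "rvalue conversion uses the deleted overload");
static_assert(FromConstLvalue, "const lvalue conversion is allowed");
```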

Differential Revision: https://reviews.llvm.org/D118902

2 years agogithub: Add actions to automate part of the release workflow
Tom Stellard [Thu, 3 Feb 2022 22:44:07 +0000 (14:44 -0800)]
github: Add actions to automate part of the release workflow

This adds support for automatically cherry-picking and testing fixes for the
release branch using 'commands' in issue comments.  The two supported commands are:

/cherry-pick <commit1> <commit2> ...

Which will backport and test commits from main.  And also

/branch owner/repo/branch

Which will test commits from the given branch.

Reviewed By: alexbatashev, kwk

Differential Revision: https://reviews.llvm.org/D117386

2 years ago[SLP] Strengthen internal invariant assertions slightly
Philip Reames [Thu, 3 Feb 2022 21:46:30 +0000 (13:46 -0800)]
[SLP] Strengthen internal invariant assertions slightly

This builds on the invariant checks introduced in 1519629, and adds a couple more that seem to hold without additional work.

2 years agoRevert "[OpenMP] Don't use bound architecture when checking cache on the host"
Joseph Huber [Thu, 3 Feb 2022 22:43:02 +0000 (17:43 -0500)]
Revert "[OpenMP] Don't use bound architecture when checking cache on the host"

This reverts commit 9138d96f8b01605b213e8c4d587853a46cca3f44.

2 years ago[ProfileData] Read and symbolize raw memprof profiles.
Snehasish Kumar [Wed, 29 Dec 2021 23:52:11 +0000 (15:52 -0800)]
[ProfileData] Read and symbolize raw memprof profiles.

This change extends the RawMemProfReader to read all the sections of the
raw profile and symbolize the virtual addresses recorded as part of the
callstack for each allocation. For now the symbolization is used to
display the contents of the profile with llvm-profdata.

Differential Revision: https://reviews.llvm.org/D116784

2 years ago[memprof] Print out the summary in YAML format.
Snehasish Kumar [Fri, 7 Jan 2022 00:14:41 +0000 (16:14 -0800)]
[memprof] Print out the summary in YAML format.

Print out the profile summary in YAML format to make it easier for
tools and tests to read in the contents of the raw profile.

Differential Revision: https://reviews.llvm.org/D116783

2 years ago[instrprof][NFC] Templatize the instrprof iterator.
Snehasish Kumar [Wed, 29 Dec 2021 23:58:44 +0000 (15:58 -0800)]
[instrprof][NFC] Templatize the instrprof iterator.

This change templatizes the InstrProfIterator where the default
specialization is based on the current usage, i.e. the reader_type is
InstrProfReader and the record_type (value_type) is
NamedInstrProfRecord. A subsequent patch will use the same iterator
template to implement an iterator for the RawMemProfReader.

Differential Revision: https://reviews.llvm.org/D116782

2 years ago[DebugInfo] Move the SymbolizableObjectFile header to include/llvm.
Snehasish Kumar [Wed, 29 Dec 2021 23:46:22 +0000 (15:46 -0800)]
[DebugInfo] Move the SymbolizableObjectFile header to include/llvm.

This change moves the SymbolizableObjectFile header to
include/llvm/DebugInfo/Symbolize. Making this header available to other
llvm libraries simplifies use cases where implicit caching, multiple
platform support and other features of the Symbolizer class are not
required. This also makes the dependent libraries easier to unit test
by having mocks which derive from SymbolizableModule.

Differential Revision: https://reviews.llvm.org/D116781

2 years ago[GlobalISel] Combine (G_*ADDO x, 0) -> x + no carry out
Jessica Paquette [Mon, 31 Jan 2022 22:13:18 +0000 (14:13 -0800)]
[GlobalISel] Combine (G_*ADDO x, 0) -> x + no carry out

Similar to the G_*MULO change.

The code for checking if a constant is legal/pre-legalize is shared between
these, and is kind of hairy. So, factor it out into a new function:
`isConstantLegalOrBeforeLegalizer`.

To make the refactoring clean, further refactor `isLegalOrBeforeLegalizer` into
a wrapper for two functions:

- `isPreLegalize`
- `isLegal`

This is a bit easier to read in general.

https://godbolt.org/z/KW7oszP1o
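
Why the combine is sound, sketched for the unsigned case. The function
`uaddo` is a plain C++ model written for this example, not GlobalISel code:

```cpp
#include <cstdint>
#include <utility>

// Model of an unsigned add-with-overflow: returns {wrapped sum, carry out}.
std::pair<std::uint32_t, bool> uaddo(std::uint32_t x, std::uint32_t y) {
  std::uint32_t sum = x + y; // wraps modulo 2^32
  return {sum, sum < x};     // carry out exactly when the sum wrapped
}
```

For any `x`, `uaddo(x, 0)` returns `{x, false}`, which is what the combine
produces without emitting the add.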

Differential Revision: https://reviews.llvm.org/D118655

2 years ago[GlobalISel] Combine: (G_*MULO x, 0) -> 0 + no carry out
Jessica Paquette [Mon, 31 Jan 2022 18:42:19 +0000 (10:42 -0800)]
[GlobalISel] Combine: (G_*MULO x, 0) -> 0 + no carry out

Similar to the following combine in `DAGCombiner::visitMULO`:

```
  // fold (mulo x, 0) -> 0 + no carry out
  if (isNullOrNullSplat(N1))
    return CombineTo(N, DAG.getConstant(0, DL, VT),
                     DAG.getConstant(0, DL, CarryVT));
```

This fixes some generally poor codegen for `*mulo`:

https://godbolt.org/z/eTxYsvz8f
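
The same reasoning for the unsigned multiply, again as a plain C++ model
(`umulo` is a name chosen for this sketch):

```cpp
#include <cstdint>
#include <utility>

// Model of an unsigned multiply-with-overflow: the product is computed in
// 64 bits, and overflow is reported when the high 32 bits are non-zero.
std::pair<std::uint32_t, bool> umulo(std::uint32_t x, std::uint32_t y) {
  std::uint64_t wide = std::uint64_t{x} * y;
  return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}
```

For any `x`, `umulo(x, 0)` returns `{0, false}`, so both results can be
replaced by constants.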

Differential Revision: https://reviews.llvm.org/D118635

2 years ago[mlir] Keep sorted vector of registered operation names for efficient lookup
Eugene Zhulenev [Thu, 3 Feb 2022 21:16:14 +0000 (13:16 -0800)]
[mlir] Keep sorted vector of registered operation names for efficient lookup

I see a lot of array sorting in stack traces of our compiler: the canonicalizer traverses this list every time it builds a pattern set, and it gets expensive very quickly.
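
The idea, sketched with a hypothetical `SortedNames` class (this is not the
MLIR data structure, only the technique of keeping the vector sorted on
insertion so that readers never need to sort):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical registry of names. Insertion keeps the vector sorted and
// free of duplicates, so lookups can binary-search it directly.
class SortedNames {
public:
  void insert(const std::string &name) {
    auto it = std::lower_bound(names.begin(), names.end(), name);
    if (it == names.end() || *it != name)
      names.insert(it, name);
  }

  bool contains(const std::string &name) const {
    return std::binary_search(names.begin(), names.end(), name);
  }

  // Already sorted: callers can iterate without sorting a copy.
  const std::vector<std::string> &all() const { return names; }

private:
  std::vector<std::string> names;
};
```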

Reviewed By: rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D118937

2 years ago[OpenMP] Don't use bound architecture when checking cache on the host
Joseph Huber [Thu, 3 Feb 2022 00:07:39 +0000 (19:07 -0500)]
[OpenMP] Don't use bound architecture when checking cache on the host

When we are creating jobs for the new driver we first check the cache to
see if the job was already created as a part of the offloading
toolchain. This would sometimes fail if the bound architecture was set
for the host during offloading. We want to ignore this because it is not
relevant for looking up host actions. Previously it was set on some
machines and would cause the cache lookup to fail.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D118858

2 years ago[lld-macho] Minor clean up: use .find() to check for key existence rather than [...
Vy Nguyen [Thu, 3 Feb 2022 22:04:18 +0000 (17:04 -0500)]
[lld-macho] Minor clean up: use .find() to check for key existence rather than [], which would create a new entry.
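
The pitfall in isolation, using `std::map` as a stand-in container (the
helper names are made up for this sketch):

```cpp
#include <map>
#include <string>

// Checks for a key without modifying the map.
bool hasKey(const std::map<std::string, int> &m, const std::string &key) {
  return m.find(key) != m.end();
}

// Pitfall: operator[] default-constructs and inserts a missing key, so this
// "check" grows the map as a side effect.
bool nonZeroViaSubscript(std::map<std::string, int> &m,
                         const std::string &key) {
  return m[key] != 0;
}
```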

Differential Revision: https://reviews.llvm.org/D118945

2 years ago[LLDB] remove an extra register enum on windows x64
Zequan Wu [Thu, 3 Feb 2022 22:12:18 +0000 (14:12 -0800)]
[LLDB] remove an extra register enum on windows x64

2 years ago[libc++] Remove the std::string base class
Nikolas Klauser [Wed, 2 Feb 2022 19:15:40 +0000 (20:15 +0100)]
[libc++] Remove the std::string base class

Removing the base class of std::basic_string is not an ABI break, so we can remove any references to it from the header.

Reviewed By: ldionne, Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D118733

2 years ago[clang-format][NFC] Code Tidies in UnwrappedLineFormatter
Björn Schäpers [Fri, 3 Dec 2021 07:13:57 +0000 (08:13 +0100)]
[clang-format][NFC] Code Tidies in UnwrappedLineFormatter

* Give I[1] and I[-1] a name:
  - Easier to understand
  - Easier to debug (since you don't go through operator[] every time)
* TheLine->First != TheLine->Last follows since last is a l brace and
  first isn't.
* Factor the check for is(tok::l_brace) out.
* Drop else after return.

Differential Revision: https://reviews.llvm.org/D115060

2 years ago[gn build] (manually) attempt to port 95d609b549bb
Nico Weber [Thu, 3 Feb 2022 21:53:34 +0000 (16:53 -0500)]
[gn build] (manually) attempt to port 95d609b549bb

2 years ago[clang-format] Use wider comment prefix space rule
ksyx [Thu, 3 Feb 2022 03:18:09 +0000 (22:18 -0500)]
[clang-format] Use wider comment prefix space rule

Previously, the comment prefix spacing rule applied only to comments
starting with alphanumeric characters. This commit changes the condition
so that no change is made only for a certain set of leading characters,
currently horizontal whitespace and punctuation characters. This supports
a wider set of leading characters unrelated to documentation generation
directives.

Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D118869

2 years ago[VFS] Add back setFallthrough for downstream users
Ben Barham [Thu, 3 Feb 2022 21:35:16 +0000 (13:35 -0800)]
[VFS] Add back setFallthrough for downstream users

This fixes lldb's build. We can remove this in the future if we want but
for now this will be nicer to existing consumers.

2 years agoUse functions with prototypes when appropriate; NFC
Aaron Ballman [Thu, 3 Feb 2022 21:39:21 +0000 (16:39 -0500)]
Use functions with prototypes when appropriate; NFC

A significant number of our tests in C accidentally use functions
without prototypes. This patch converts the function signatures to have
a prototype for the situations where the test is not specific to K&R C
declarations. e.g.,

  void func();

becomes

  void func(void);

This is the first batch of tests being updated (there are a significant
number of other tests left to be updated).

2 years ago[cmake] Increase -fms-compatibility-version in Windows toolchain file
Shoaib Meenai [Thu, 3 Feb 2022 21:39:54 +0000 (13:39 -0800)]
[cmake] Increase -fms-compatibility-version in Windows toolchain file

Make it match LLVM's new minimum requirement (after https://reviews.llvm.org/D114639).

2 years ago[GSYM] Add Split Dwarf Support to DwarfTransformer
Alexander Yermolovich [Thu, 3 Feb 2022 21:39:03 +0000 (13:39 -0800)]
[GSYM] Add Split Dwarf Support to DwarfTransformer

The converter only worked on CUs in the main binary.
If it's a skeleton CU it will now use the DWO CU
when invoking handleDie.

Test Plan:
llvm-lit

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D118521

2 years agoRevert "[clang] Mark `trivial_abi` types as "trivially relocatable"."
Dmitri Gribenko [Thu, 3 Feb 2022 21:19:35 +0000 (22:19 +0100)]
Revert "[clang] Mark `trivial_abi` types as "trivially relocatable"."

This reverts commit 19aa2db023c0128913da223d4fb02c474541ee22. It breaks
a PS4 buildbot.

2 years ago[llvm-objcopy][COFF] Implement --update-section
Alex Brachet [Thu, 3 Feb 2022 21:30:42 +0000 (21:30 +0000)]
[llvm-objcopy][COFF] Implement --update-section

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D118189

2 years ago[SLP] Add basic self consistency asserts into scheduling
Philip Reames [Thu, 3 Feb 2022 21:23:02 +0000 (13:23 -0800)]
[SLP] Add basic self consistency asserts into scheduling

The idea here is to have a verify routine we can call during scheduling to ensure broken invariants are reported.  The intent is to help in debugging scheduling bugs.

At the moment, only the most basic properties are checked, as adding several others that I thought held produced reported failures.

2 years ago[mlir][scf] Fix bug in pipelining prologue emission
Thomas Raoux [Thu, 3 Feb 2022 19:42:49 +0000 (11:42 -0800)]
[mlir][scf] Fix bug in pipelining prologue emission

Induction variable calculation was ignoring scf.for step value. Fix it to get
the correct induction variable value in the prologue.
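
The arithmetic involved, as a sketch only (the function name is
hypothetical and this is not the MLIR pipelining code):

```cpp
#include <cstdint>

// Value of the induction variable for prologue iteration `i` of a loop
// running from `lb` with stride `step`. Using `lb + i` instead is only
// correct when step == 1.
std::int64_t prologueInductionValue(std::int64_t lb, std::int64_t step,
                                    std::int64_t i) {
  return lb + i * step;
}
```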

Differential Revision: https://reviews.llvm.org/D118932

2 years ago[VFS] Add a "redirecting-with" field to overlays
Ben Barham [Thu, 3 Feb 2022 20:53:22 +0000 (12:53 -0800)]
[VFS] Add a "redirecting-with" field to overlays

Extend "fallthrough" to allow a third option: "fallback". Fallthrough
allows the original path to be used if the redirected (or mapped) path
fails. Fallback is the reverse of this, i.e. use the original path and
fall back to the mapped path otherwise.

While this result *can* be achieved today using multiple overlays, this
adds a much more intuitive option. As an example, take two directories
"A" and "B". We would like files from "A" to be used, unless they don't
exist, in which case the VFS should fall back to those in "B".

With the current fallthrough option this is possible by adding two
overlays: one mapping from A -> B and another mapping from B -> A. Since
the frontend *nests* the two RedirectingFileSystems, the result will
be that "A" is mapped to "B" and back to "A", unless it isn't in "A" in
which case it falls through to "B" (or fails if it exists in neither).

Using "fallback" semantics allows a single overlay instead: one mapping
from "A" to "B" but only using that mapping if the operation in "A"
fails first.

"redirect-only" is used to represent the current "fallthrough: false"
case.
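
The three modes can be modelled as follows. This is a simplified sketch:
the enum, the `resolve` function and the set-of-paths "file system" are all
made up for illustration and are not the RedirectingFileSystem API.

```cpp
#include <optional>
#include <set>
#include <string>

enum class RedirectKind { Fallthrough, Fallback, RedirectOnly };

// Hypothetical model: `existing` is the set of paths that exist, `original`
// is the requested path and `mapped` is where the overlay redirects it.
// Returns the path that would be used, or nothing if the lookup fails.
std::optional<std::string> resolve(RedirectKind kind,
                                   const std::set<std::string> &existing,
                                   const std::string &original,
                                   const std::string &mapped) {
  auto exists = [&](const std::string &p) { return existing.count(p) != 0; };
  switch (kind) {
  case RedirectKind::Fallthrough: // mapped path first, then the original
    if (exists(mapped))
      return mapped;
    if (exists(original))
      return original;
    break;
  case RedirectKind::Fallback: // original path first, then the mapped one
    if (exists(original))
      return original;
    if (exists(mapped))
      return mapped;
    break;
  case RedirectKind::RedirectOnly: // mapped path or nothing
    if (exists(mapped))
      return mapped;
    break;
  }
  return std::nullopt;
}
```

With "A" mapped to "B", `Fallback` gives the "use A unless the file is
missing, then use B" behaviour from the example above with a single overlay.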

Differential Revision: https://reviews.llvm.org/D117937

2 years ago[HWASan] Add __hwasan_init to .preinit_array.
Matt Morehouse [Thu, 3 Feb 2022 21:07:11 +0000 (13:07 -0800)]
[HWASan] Add __hwasan_init to .preinit_array.

Fixes segfaults on x86_64 caused by instrumented code running before
shadow is set up.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D118171

2 years ago[AMDGPU] Fix scheduler live-ins with debug inst at start of block
Vang Thao [Wed, 2 Feb 2022 23:33:40 +0000 (15:33 -0800)]
[AMDGPU] Fix scheduler live-ins with debug inst at start of block

GCNDownwardRPTracker RPTracker.reset() skips debug instructions for NextMI so RPTracker.getNext() will never give the beginning of a sched region if it is a debug value. In this case we will never set the live-ins for that block.

Add check to see if getNext also equals the MI after skipping debug instructions.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D118853

2 years ago[nfc][regalloc] const LiveIntervals within the allocator
Mircea Trofin [Thu, 3 Feb 2022 17:07:42 +0000 (09:07 -0800)]
[nfc][regalloc] const LiveIntervals within the allocator

Once built, LiveIntervals are immutable. This patch captures that.

Differential Revision: https://reviews.llvm.org/D118918

2 years ago[tests] Add coverage for SLP reschedule event
Philip Reames [Thu, 3 Feb 2022 20:22:25 +0000 (12:22 -0800)]
[tests] Add coverage for SLP reschedule event

This is slightly reduced from the crash reported against D117951.

2 years ago[SampleProfile] Reduce indentation with an early return (NFC)
Kazu Hirata [Thu, 3 Feb 2022 20:22:23 +0000 (12:22 -0800)]
[SampleProfile] Reduce indentation with an early return (NFC)

2 years ago[CodeGenPrepare] Avoid out-of-bounds shift
Bjorn Pettersson [Mon, 31 Jan 2022 12:49:35 +0000 (13:49 +0100)]
[CodeGenPrepare] Avoid out-of-bounds shift

AddressingModeMatcher::matchOperationAddr may attempt to shift a
variable by the same amount of steps as found in the IR in a SHL
instruction. This was done without considering that there could be
undefined behavior in the IR, so the shift performed when compiling
could end up having undefined behavior as well.

This patch avoids UB in CodeGenPrepare by making sure that we
limit the shift amount used, in a similar way as already being done
in CodeGenPrepare::optimizeLoadExt.
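
One way to limit a shift amount, sketched for a 64-bit scale. The helper
name and the clamp to 63 are assumptions of this example and may differ
from what the patch does:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: turn a shift amount read from the IR into a scale
// factor (1 << amount) without ever shifting by 64 or more, which would be
// undefined behavior in the compiler itself.
std::uint64_t scaleFromShiftAmount(std::uint64_t amount) {
  const std::uint64_t maxShift = 63; // assumed limit for a 64-bit scale
  return std::uint64_t{1} << std::min(amount, maxShift);
}
```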

Differential Revision: https://reviews.llvm.org/D118602

2 years ago[AMDGPU] Fix windows build warning with IMMBitSelConst. NFC.
Stanislav Mekhanoshin [Thu, 3 Feb 2022 19:44:45 +0000 (11:44 -0800)]
[AMDGPU] Fix windows build warning with IMMBitSelConst. NFC.

VS gives this warning for an integer constant:
AMDGPUGenDAGISel.inc(214687): warning C4334: '<<': result of 32-bit
shift implicitly converted to 64 bits (was 64-bit shift intended?)
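
The usual way to address this kind of warning is to widen the left operand
before shifting. A minimal sketch (the function name is hypothetical, and
the generated code in question may be fixed differently):

```cpp
#include <cstdint>

// `1 << bit` would be a 32-bit (int) shift whose result is only widened
// afterwards, which is what C4334 warns about. Widening the operand first
// makes it a 64-bit shift. Precondition: bit < 64.
std::uint64_t bitMask64(unsigned bit) {
  return std::uint64_t{1} << bit;
}
```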

2 years ago[lld][clang][cmake] Clean up a few things
John Ericson [Wed, 2 Feb 2022 15:37:13 +0000 (15:37 +0000)]
[lld][clang][cmake] Clean up a few things

- If not using `llvm-config`, `LLVM_MAIN_SRC_DIR` now has a sane default

- `LLVM_CONFIG_PATH` will continue to work for LLD for back compat.

- More quoting of paths in an abundance of caution.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D118792

2 years ago[mlir:Vector][NFC] Remove unnecessary dependency on Affine
River Riddle [Thu, 3 Feb 2022 19:45:39 +0000 (11:45 -0800)]
[mlir:Vector][NFC] Remove unnecessary dependency on Affine

2 years ago[LLDB] Fix window bot failure
Zequan Wu [Thu, 3 Feb 2022 01:12:40 +0000 (17:12 -0800)]
[LLDB] Fix window bot failure

This attempts to fix the bot failure at https://lab.llvm.org/buildbot/#/builders/83/builds/14736 (caused by D118750) by un-XFAILing the tests that were expected to fail.

Differential Revision: https://reviews.llvm.org/D118866

2 years ago[AArch64][SVE] Add more folds to make use of gather/scatter with 32-bit indices
Caroline Concatto [Thu, 13 Jan 2022 16:52:41 +0000 (16:52 +0000)]
[AArch64][SVE] Add more folds to make use of gather/scatter with 32-bit indices

In AArch64ISelLowering.cpp this patch implements this fold:

1) GEP (%ptr, SHL ((stepvector(A) + splat(%offset))) << splat(B)))
into GEP (%ptr + (%offset << B), step_vector (A << B))

The above transform simplifies the index operand so that it can be expressed
as i32 elements.
This allows using only one gather/scatter assembly instruction instead of two.
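
The fold relies on the shift distributing over the addition. A scalar model
of lane `i`, written for this note with made-up function names and using
wrapping unsigned arithmetic:

```cpp
#include <cstdint>

// Index of lane i before the fold: (stepvector(A)[i] + offset) << B.
std::uint64_t indexBefore(std::uint64_t i, std::uint64_t A,
                          std::uint64_t offset, unsigned B) {
  return (i * A + offset) << B;
}

// After the fold: (offset << B) moves into the base pointer, and the
// remaining index is lane i of step_vector(A << B).
std::uint64_t indexAfter(std::uint64_t i, std::uint64_t A,
                         std::uint64_t offset, unsigned B) {
  return (offset << B) + i * (A << B);
}
```

Both functions agree for every lane, so the address computed by the GEP is
unchanged.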

Patch by Paul Walker (@paulwalker-arm).

Depends on D117900

Differential Revision: https://reviews.llvm.org/D118345

2 years ago[NFC] [hwasan] use InstIterator
Florian Mayer [Thu, 3 Feb 2022 01:07:19 +0000 (17:07 -0800)]
[NFC] [hwasan] use InstIterator

Differential Revision: https://reviews.llvm.org/D118865

2 years ago[llvm-libtool-darwin] Remove var to fix use
Keith Smiley [Thu, 3 Feb 2022 19:05:18 +0000 (19:05 +0000)]
[llvm-libtool-darwin] Remove var to fix use

This seems to have been moved so the second use is invalid on Linux but
not macOS

2 years ago[nfc] [mte] use InstrIter.
Florian Mayer [Thu, 3 Feb 2022 18:19:26 +0000 (10:19 -0800)]
[nfc] [mte] use InstrIter.

this improves code clarity.

2 years ago[AArch64][SVE] Fold gather/scatter with 32bits when possible
Caroline Concatto [Thu, 13 Jan 2022 16:52:41 +0000 (16:52 +0000)]
[AArch64][SVE] Fold gather/scatter with 32bits when possible

In AArch64ISelLowering.cpp this patch implements this fold:

GEP (%ptr, (splat(%offset) + stepvector(A)))
into GEP ((%ptr + %offset), stepvector(A))

The above transform simplifies the index operand so that it can be expressed
as i32 elements.
This allows using only one gather/scatter assembly instruction instead of two.

Patch by Paul Walker (@paulwalker-arm).

Depends on D118459

Differential Revision: https://reviews.llvm.org/D117900

2 years ago[mlir][NFC] Split MlirQuant into proper IR/Utils/Transforms libraries
River Riddle [Thu, 3 Feb 2022 18:41:14 +0000 (10:41 -0800)]
[mlir][NFC] Split MlirQuant into proper IR/Utils/Transforms libraries

This matches the structure of other dialects, and also removes
unnecessary dependencies from the core dialect lib.

2 years ago[mli][Linalg] NFC: Refactor methods in `ElementwiseOpFusion`.
Mahesh Ravishankar [Thu, 3 Feb 2022 18:40:26 +0000 (18:40 +0000)]
[mli][Linalg] NFC: Refactor methods in `ElementwiseOpFusion`.

Reorder the methods and patterns to move related patterns/methods
closer (textually).

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D118870

2 years agosanitizer_common: fix __sanitizer_get_module_and_offset_for_pc signature mismatch
Dmitry Vyukov [Thu, 3 Feb 2022 15:04:34 +0000 (16:04 +0100)]
sanitizer_common: fix __sanitizer_get_module_and_offset_for_pc signature mismatch

This fixes the following error:

sanitizer_interface_internal.h:77:7: error: conflicting types for
     '__sanitizer_get_module_and_offset_for_pc'
  int __sanitizer_get_module_and_offset_for_pc(
common_interface_defs.h:349:5: note: previous declaration is here
int __sanitizer_get_module_and_offset_for_pc(void *pc, char *module_path,

I am getting it in code that uses sanitizer_common (includes internal headers),
but also transitively gets includes of the public headers in tests
via an internal version of gtest.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D118910

2 years ago[clang-format] regression from clang-format v13
mydeveloperday [Thu, 3 Feb 2022 18:36:56 +0000 (18:36 +0000)]
[clang-format] regression from clang-format v13

https://github.com/llvm/llvm-project/issues/53567

The following source

```
namespace A {

template <int N> struct Foo<char[N]> {
  void foo() { std::cout << "Bar"; }
}; // namespace A
```

is incorrectly formatted as:

```
namespace A {

template <int N> struct Foo<char[N]>{void foo(){std::cout << "Bar";
}
}
; // namespace A
```

This looks to be caused by https://github.com/llvm/llvm-project/commit/5c2e7c9ca043d92bed75b08e653fb47c384edd13

Reviewed By: curdeius

Differential Revision: https://reviews.llvm.org/D118911

2 years ago[llvm-libtool-darwin] Improve warning message for no symbols
Keith Smiley [Thu, 3 Feb 2022 01:19:58 +0000 (17:19 -0800)]
[llvm-libtool-darwin] Improve warning message for no symbols

This more closely mirrors apple's libtool, and also potentially makes it
clearer for multi-arch archives where the issue lies.

Differential Revision: https://reviews.llvm.org/D118867

2 years ago[Clang][Docs] Add documention for new OpenMP offloading driver
Joseph Huber [Wed, 2 Feb 2022 18:06:55 +0000 (13:06 -0500)]
[Clang][Docs] Add documention for new OpenMP offloading driver

This patch adds more documentation for the OpenMP offloading driver.
This includes a new file that describes the overall pipeline because
that was not previously explained in full elsewhere.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D118815

2 years ago[Support][NFC] Don’t duplicate class or function name in comment
Amir Ayupov [Wed, 2 Feb 2022 23:37:32 +0000 (15:37 -0800)]
[Support][NFC] Don’t duplicate class or function name in comment

Refactor comments in CommandLine.h to follow the Coding Style rule

Reviewed By: MaskRay, serge-sans-paille

Differential Revision: https://reviews.llvm.org/D118859

2 years ago[clang-format] Avoid merging macro definitions.
Marek Kurdej [Thu, 3 Feb 2022 17:46:35 +0000 (18:46 +0100)]
[clang-format] Avoid merging macro definitions.

Fixes https://github.com/llvm/llvm-project/issues/42087.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118879

2 years ago[clang-format] Avoid adding space after the name of a function-like macro when the...
Marek Kurdej [Thu, 3 Feb 2022 17:29:53 +0000 (18:29 +0100)]
[clang-format] Avoid adding space after the name of a function-like macro when the name is a keyword.

Fixes https://github.com/llvm/llvm-project/issues/31086.

Before the code:
```
#define if(x)
```

was erroneously formatted to:
```
#define if (x)
```

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118844

2 years ago[RISCV] Add FMV_X_W and FMV_X_H to RISCVSExtWRemoval.
Craig Topper [Thu, 3 Feb 2022 17:26:34 +0000 (09:26 -0800)]
[RISCV] Add FMV_X_W and FMV_X_H to RISCVSExtWRemoval.

Add -target-abi to sextw-removal.ll RUN lines to show benefit on
new test case.

2 years ago[AMDGPU] HWRegs TMA and TBA also supported on gfx9
Stanislav Mekhanoshin [Wed, 2 Feb 2022 23:51:11 +0000 (15:51 -0800)]
[AMDGPU] HWRegs TMA and TBA also supported on gfx9

Differential Revision: https://reviews.llvm.org/D118860

2 years ago[x86] add minimal test for sbb idiom and CPU capabilities; NFC
Sanjay Patel [Thu, 3 Feb 2022 17:05:40 +0000 (12:05 -0500)]
[x86] add minimal test for sbb idiom and CPU capabilities; NFC

D116804 proposes to alter codegen on this example based on
CPU tuning, so check a variety of models to confirm it works
as expected. We already have this test mixed in with several
others in another test file, but it seems wasteful to add so
many RUN lines to check this difference over and over again.

2 years ago[x86] remove CPU requirement for RUN line in test file; NFC
Sanjay Patel [Thu, 3 Feb 2022 16:56:04 +0000 (11:56 -0500)]
[x86] remove CPU requirement for RUN line in test file; NFC

A proposed change ( D118843 ) that would affect this test
will not require a specific CPU model to show a difference.

2 years ago[hwasan] add musttail IR test.
Florian Mayer [Wed, 2 Feb 2022 23:44:41 +0000 (15:44 -0800)]
[hwasan] add musttail IR test.

we currently only have a test at the clang level

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D118856

2 years agoRevert "[nfc][mlgo] De-const a parameter"
Mircea Trofin [Thu, 3 Feb 2022 17:10:06 +0000 (09:10 -0800)]
Revert "[nfc][mlgo] De-const a parameter"

This reverts commit bc3b372161716a4c4845d47a877e4892df0d08da.

The planned change that would have needed non-const MachineFunction refs
isn't needed after all.

2 years ago[SLP] Fix a typo in comment
Philip Reames [Thu, 3 Feb 2022 17:10:42 +0000 (09:10 -0800)]
[SLP] Fix a typo in comment

2 years ago[lldb] Fix windows&mac builds for c34698a811b13
Pavel Labath [Thu, 3 Feb 2022 17:06:19 +0000 (18:06 +0100)]
[lldb] Fix windows&mac builds for c34698a811b13

2 years ago[AMDGPU] Introduce new ISel combine for trunc-slr patterns
Thomas Symalla [Fri, 28 Jan 2022 11:27:54 +0000 (12:27 +0100)]
[AMDGPU] Introduce new ISel combine for trunc-slr patterns

In some cases, when selecting a (trunc (slr)) pattern, the slr gets translated
to a v_lshrrev_b32_e64 instruction whereas the truncation gets selected to
a sequence of v_and_b32_e64 and v_cmp_eq_u32_e64. In the final ISA, this appears
as selecting the nth bit:

v_lshrrev_b32_e32 v0, 2, v1
v_and_b32_e32 v0, 1, v0
v_cmp_eq_u32_e32 vcc_lo, 1, v0

However, when the value used in the right shift is known at compilation time, the
whole sequence can be reduced to two VALUs when the constant operand in the v_and is adjusted to (1 << lshrrev_operand):

v_and_b32_e32 v0, (1 << 2), v1
v_cmp_ne_u32_e32 vcc_lo, 0, v0

In the example above, the following pseudo-code:

v0 = (v1 >> 2)
v0 = v0 & 1
vcc_lo = (v0 == 1)

would be translated to:

v0 = v1 & 0b100
vcc_lo = (v0 == 0b100)

which should yield an equivalent result.
This is a little bit hard to test as one needs to force the SelectionDAG to
contain the nodes before instruction selection, but the test sequence was
roughly derived from a production shader.
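
The equivalence of the two sequences can be checked directly. The following
is a minimal Python sketch of the pseudo-code above, not the actual ISel
pattern; the function names and the MASK32 constant are hypothetical and only
model 32-bit unsigned registers.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit unsigned register


def bit_test_shift(v1: int, n: int) -> bool:
    """Original sequence: shift right by n, mask with 1, compare with 1."""
    v0 = (v1 & MASK32) >> n
    v0 &= 1
    return v0 == 1


def bit_test_mask(v1: int, n: int) -> bool:
    """Reduced sequence: mask with (1 << n), compare against 0."""
    v0 = (v1 & MASK32) & (1 << n)
    return v0 != 0


def sequences_agree(values, shifts=range(32)) -> bool:
    """True if both sequences give the same result for every value/shift pair."""
    return all(
        bit_test_shift(v, n) == bit_test_mask(v, n)
        for v in values
        for n in shifts
    )
```

Comparing against 0 with "ne" and comparing against (1 << n) with "eq" give
the same answer, because the masked value can only be 0 or (1 << n).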

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D118461

2 years ago[mlir-translate] Teach these tools about --allow-unregistered-dialect
Chris Lattner [Thu, 3 Feb 2022 05:40:41 +0000 (21:40 -0800)]
[mlir-translate] Teach these tools about --allow-unregistered-dialect

Some translations do work with unregistered dialects; this allows one
to write testcases against them.  It works the same way as it does for
mlir-opt.

Differential Revision: https://reviews.llvm.org/D118872

2 years ago[AARCH64][NEON] Allow to sink operands for aarch64_neon_pmull
Sunho Kim [Thu, 3 Feb 2022 16:46:49 +0000 (16:46 +0000)]
[AARCH64][NEON] Allow to sink operands for aarch64_neon_pmull

This teaches AArch64TargetLowering::shouldSinkOperands to sink the
operands of aarch64_neon_pmull intrinsic.

Differential Revision: https://reviews.llvm.org/D117944

2 years ago[test] check strictest attributes possible for InferFunctionAttrs test
Augie Fackler [Thu, 3 Feb 2022 16:39:15 +0000 (08:39 -0800)]
[test] check strictest attributes possible for InferFunctionAttrs test

This appears to have all the same attributes as many other functions
in this file, and I think the use of INACCESSIBLEMEMONLY_NOFREE_NOUNWIND
instead of INACCESSIBLEMEMONLY_NOFREE_NOUNWIND_WILLRETURN was an
oversight that meant aligned_alloc's attributes were just going
unchecked. This patch corrects the test defect and now the attributes
inferred on aligned_alloc are actually validated, and the test still
passes.

Differential Revision: https://reviews.llvm.org/D117922

2 years agoadd IR compatibility test for (upcoming) allocsize attribute
Augie Fackler [Thu, 3 Feb 2022 16:31:44 +0000 (08:31 -0800)]
add IR compatibility test for (upcoming) allocsize attribute

2 years ago[NFC] MemoryBuiltins: tease out a getFreeFunctionDataForFunction helper
Augie Fackler [Thu, 3 Feb 2022 16:30:37 +0000 (08:30 -0800)]
[NFC] MemoryBuiltins: tease out a getFreeFunctionDataForFunction helper

2 years ago[RISCV] Remove createVirtualRegister from RISCVInstrInfo::movImm.
Craig Topper [Thu, 3 Feb 2022 16:30:39 +0000 (08:30 -0800)]
[RISCV] Remove createVirtualRegister from RISCVInstrInfo::movImm.

Based on the discussion in D61884, this was done to enable compressed
instructions by giving freedom to pick a compressible register.

Integer materializing can generate LUI, ADDI, ADDIW, SLLI and some
Zb* instructions. C.LI, C.LUI, C.ADDI, C.ADDIW, and C.SLLI all have a 5-bit
register encoding. The Zb* instructions aren't compressible. Based on
that I don't think compressibility of the register is a concern.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118741

2 years ago[clang-tidy] Fix LLVM include order check policy
Kadir Cetinkaya [Thu, 3 Feb 2022 16:26:41 +0000 (17:26 +0100)]
[clang-tidy] Fix LLVM include order check policy

Clang-format LLVM style has a custom include category for gtest/ and
gmock/ headers between regular includes and angled includes. Do the same here.

Fixes https://github.com/llvm/llvm-project/issues/53525.

Differential Revision: https://reviews.llvm.org/D118913

2 years ago[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL.
Craig Topper [Thu, 3 Feb 2022 16:23:34 +0000 (08:23 -0800)]
[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL.

SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it
just assumed VLMax instead of carrying a VL operand.

Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td
has been swapped to avoid moving riscv_vmv_v_x_vl into
RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to
RISCVInstrInfoVVLPatterns.td

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118841

2 years agoMemoryBuiltins: simplify isLibFreeFunction [NFC]
Augie Fackler [Thu, 3 Feb 2022 16:21:09 +0000 (08:21 -0800)]
MemoryBuiltins: simplify isLibFreeFunction [NFC]

This is in anticipation of my next patch, where I need to store more information about free functions than just their argument count. It felt invasive enough on this function that it seemed worthwhile to just extract this as its own commit that makes no functional changes.

Differential Revision: https://reviews.llvm.org/D117350

2 years ago[AMDGPU] Simplify AMDGPUAnnotateUniformValues::visitLoadInst
Jay Foad [Thu, 3 Feb 2022 15:27:12 +0000 (15:27 +0000)]
[AMDGPU] Simplify AMDGPUAnnotateUniformValues::visitLoadInst

Always set uniform metadata on the pointer if it is an instruction, but
otherwise do not bother to create a trivial getelementptr instruction,
because AMDGPUInstrInfo::isUniformMMO can already detect that various
non-instruction pointers are uniform.

Most of the test case churn is from tests that used undef as a pointer,
which AMDGPUInstrInfo::isUniformMMO treats as uniform.

Differential Revision: https://reviews.llvm.org/D118909

2 years ago[mlir][taco] Uses sparse_tensor.new to read tensor input data from files.
Bixia Zheng [Wed, 2 Feb 2022 16:40:33 +0000 (08:40 -0800)]
[mlir][taco] Uses sparse_tensor.new to read tensor input data from files.

Replace the Python implementation for reading tensor input data from files with
create_sparse_tensor that uses sparse_tensor.new.

The MLIR TNS format has two extra metadata lines. Add the extra metadata to a
test data file.

Implement TACO tensor methods evaluate and unpack.

Add unit tests.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D118803

2 years ago[MLIR][SCF] Remove loop invariant arguments of scf.while
Abhishek Varma [Thu, 3 Feb 2022 16:09:51 +0000 (17:09 +0100)]
[MLIR][SCF] Remove loop invariant arguments of scf.while

-- This commit adds a canonicalization pattern on scf.while to remove
   the loop invariant arguments.
-- An argument is considered loop invariant if the iteration argument value is
   the same as the corresponding one being yielded (at the same position) in both
   the before/after block of scf.while.
-- For the arguments removed, their use within scf.while and their corresponding
   scf.while's result are replaced with their corresponding initial value.
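
The invariance check described above can be sketched on a toy model. This is
a simplified illustration in Python, not the MLIR canonicalization pattern
itself: SSA values are plain strings, and the function and parameter names
(loop_invariant_positions, cond_args, and so on) are hypothetical.

```python
def loop_invariant_positions(init_vals, before_args, cond_args,
                             after_args, yield_vals):
    """Return the positions whose iteration argument never changes.

    Toy model (all names are assumptions, values are SSA names as strings):
      init_vals[i]   - initial value passed to the loop at position i
      before_args[i] - i-th block argument of the "before" block
      cond_args[i]   - i-th value the "before" block forwards to "after"
      after_args[i]  - i-th block argument of the "after" block
      yield_vals[i]  - i-th value the "after" block yields back
    """
    invariant = []
    for i, (init, yielded) in enumerate(zip(init_vals, yield_vals)):
        # The argument is forwarded unchanged, at the same position,
        # through both the before block and the after block.
        passes_through = (
            i < len(cond_args)
            and i < len(after_args)
            and yielded == after_args[i]
            and cond_args[i] == before_args[i]
        )
        # Alternatively, the loop simply yields the initial value again.
        if yielded == init or passes_through:
            invariant.append(i)
    return invariant
```

For example, with init_vals = ["%a", "%b"], before_args = ["%arg0", "%arg1"],
cond_args = ["%arg0", "%next"], after_args = ["%x0", "%x1"] and
yield_vals = ["%x0", "%x1"], only position 0 is reported: its value is
forwarded untouched through both blocks, so its uses could be replaced by "%a".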

Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D116923

2 years ago[AMDGPU] Tweak tests in noclobber-barrier.ll
Jay Foad [Thu, 3 Feb 2022 16:02:39 +0000 (16:02 +0000)]
[AMDGPU] Tweak tests in noclobber-barrier.ll

Tweak some of the tests to demonstrate
AMDGPUAnnotateUniformValues::visitLoadInst inserting a trivial
getelementptr instruction, just to have somewhere to put amdgpu.uniform
metadata. NFC.

2 years ago[gn build] Port c34698a811b1
LLVM GN Syncbot [Thu, 3 Feb 2022 16:04:30 +0000 (16:04 +0000)]
[gn build] Port c34698a811b1

2 years agoMipsABIFlagsSection.h - replace unnecessary StringRef include with forward declaration
Simon Pilgrim [Thu, 3 Feb 2022 15:57:12 +0000 (15:57 +0000)]
MipsABIFlagsSection.h - replace unnecessary StringRef include with forward declaration

2 years ago[X86] simplifyX86varShift - use KnownBits.getMaxValue().ult() to check for out of...
Simon Pilgrim [Thu, 3 Feb 2022 15:55:55 +0000 (15:55 +0000)]
[X86] simplifyX86varShift - use KnownBits.getMaxValue().ult() to check for out of bounds shift amounts

This is easier to grok than MaskedValueIsZero for high bits.
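
As an illustration of the idea, here is a small Python model of a known-bits
maximum-value check. It is a sketch only: the KnownBits class below is a
hypothetical stand-in for the LLVM class of the same name, and
shift_amount_in_bounds is an invented helper.

```python
from dataclasses import dataclass


@dataclass
class KnownBits:
    """Toy model: 'zero' has a 1 for every bit known to be 0,
    'one' has a 1 for every bit known to be 1."""
    zero: int
    one: int
    width: int

    def max_value(self) -> int:
        # Every bit not known to be zero might be one.
        return ~self.zero & ((1 << self.width) - 1)


def shift_amount_in_bounds(known: KnownBits, bit_width: int) -> bool:
    """True if the largest possible shift amount is below the element width."""
    return known.max_value() < bit_width
```

If the upper 27 bits of a 32-bit shift amount are known to be zero, the
maximum value is 31 and the shift is in bounds for 32-bit elements; if nothing
is known, the maximum is 0xFFFFFFFF and the check fails.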

2 years ago[gn build] (manually) port 20e05b9f0ebe (ClangPseudoTests)
Nico Weber [Thu, 3 Feb 2022 15:55:40 +0000 (10:55 -0500)]
[gn build] (manually) port 20e05b9f0ebe (ClangPseudoTests)

2 years ago[clang][driver][wasm] Remove unneeded default labels
Timm Bäder [Thu, 3 Feb 2022 15:52:07 +0000 (16:52 +0100)]
[clang][driver][wasm] Remove unneeded default labels

Fix build fallout from b5787a0c6cc4da47b7d7b218e23f780076ad2f5f

2 years ago[LV] Use VScaleForTuning to allow wider epilogue VFs.
Sander de Smalen [Thu, 3 Feb 2022 09:36:03 +0000 (09:36 +0000)]
[LV] Use VScaleForTuning to allow wider epilogue VFs.

When the main loop is e.g. VF=vscale x 1 and the epilogue VF cannot
be any smaller, the vectorizer should try to estimate how many lanes are
executed at runtime and allow a suitable fixed-width VF to be chosen. It
can use VScaleForTuning to figure out what a suitable fixed-width VF could
be. For the case where the main loop VF is VF=vscale x 1, and VScaleForTuning=8,
it could still choose an epilogue VF up to VF=4.

This was a bit tricky to test, so this patch also introduces a wrapper
function to get 'VScaleForTuning' by also considering vscale_range.
If min and max are equal, then that will be the vscale we compile for.
It makes little sense to tune for a different width if the code
will not be portable for other widths.
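
The arithmetic in the example can be sketched as follows. This is a
simplified Python model, not the vectorizer's code: both function names are
hypothetical, and the rule that the epilogue VF must be a power of two
strictly below the estimated main-loop width is an assumption chosen to match
the VF=4 example above.

```python
def vscale_for_tuning(target_default, vscale_range=None):
    """Hypothetical wrapper: if vscale_range pins min == max, use that value;
    otherwise fall back to the target's tuning value."""
    if vscale_range is not None:
        lo, hi = vscale_range
        if lo == hi:
            return lo
    return target_default


def max_fixed_epilogue_vf(main_vf_min_lanes, vscale):
    """Largest power-of-two fixed VF strictly below the estimated number of
    lanes the scalable main loop runs with (never less than 1)."""
    estimated_lanes = main_vf_min_lanes * vscale
    vf = 1
    while vf * 2 < estimated_lanes:
        vf *= 2
    return vf
```

With a main loop of VF=vscale x 1 and a tuning value of 8, the estimate is 8
lanes, so max_fixed_epilogue_vf(1, 8) gives 4.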

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D118709

2 years ago[clang][driver][wasm] Support -stdlib=libstdc++ for WebAssembly
Timm Bäder [Fri, 21 Jan 2022 10:15:23 +0000 (11:15 +0100)]
[clang][driver][wasm] Support -stdlib=libstdc++ for WebAssembly

The WebAssembly toolchain currently supports only -stdlib=libc++
and implicitly assumes the c++ stdlib to be libc++. Change this to also
support libstdc++.

Differential Revision: https://reviews.llvm.org/D117888#3290628

2 years agoRevert "[flang] Debugging of ACCESS='STREAM' I/O"
Andrzej Warzynski [Thu, 3 Feb 2022 15:15:53 +0000 (15:15 +0000)]
Revert "[flang] Debugging of ACCESS='STREAM' I/O"

This reverts commit be9946b877add0db906090d22840b213c3f41dd2.

This change has caused Flang's Windows buildbot to start failing:
* https://lab.llvm.org/buildbot/#/builders/172/builds/7664

2 years ago[Lanai] Remove orphan LanaiInstPrinter::printAluOperand declaration. NFCI.
Simon Pilgrim [Thu, 3 Feb 2022 14:48:38 +0000 (14:48 +0000)]
[Lanai] Remove orphan LanaiInstPrinter::printAluOperand declaration. NFCI.

2 years agoLanaiInstPrinter.h - replace unnecessary StringRef include with forward declaration
Simon Pilgrim [Thu, 3 Feb 2022 14:33:30 +0000 (14:33 +0000)]
LanaiInstPrinter.h - replace unnecessary StringRef include with forward declaration

2 years ago[SLP]Excluded external uses from the reordering estimation.
Alexey Bataev [Fri, 31 Dec 2021 17:31:24 +0000 (09:31 -0800)]
[SLP]Excluded external uses from the reordering estimation.

The compiler adds an estimate for external uses during operand
reordering analysis, which makes it prefer duplicates in the lanes
over a diamond/shuffled match in the graph. This changes the sizes of
the vector operands and may prevent some vectorization. We don't need
this kind of estimate in the analysis phase, because we just need to
choose the most compatible instruction, and it does not matter whether it
has an external user or is used in a non-matching lane. Instead, we count
the number of unique instructions in the lane and check whether the
reassociation makes the number of unique scalars a power of 2. A
power-of-2 number of unique scalars in the lane is considered more
profitable than a non-power-of-2 number.
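
The unique-scalar criterion can be illustrated with a short Python sketch. It
models a lane as a list of scalar names and is not the SLP vectorizer's
implementation; is_power_of_two and lane_is_preferred are hypothetical helpers.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def lane_is_preferred(lane) -> bool:
    """A lane is preferred when its number of unique scalars is a power of 2."""
    return is_power_of_two(len(set(lane)))
```

A lane such as ["a", "b", "a", "b"] has two unique scalars and is preferred,
while ["a", "b", "c", "a"] has three and is not.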

Metric: SLP.NumVectorInstructions

                          test-suite :: MultiSource/Benchmarks/FreeBench/distray/distray.test   70.00   86.00   22.9%
                             test-suite :: External/SPEC/CFP2017rate/544.nab_r/544.nab_r.test  346.00  353.00    2.0%
                            test-suite :: External/SPEC/CFP2017speed/644.nab_s/644.nab_s.test  346.00  353.00    2.0%
                         test-suite :: MultiSource/Benchmarks/mediabench/gsm/toast/toast.test  235.00  239.00    1.7%
                  test-suite :: MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm.test  235.00  239.00    1.7%
                     test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 8723.00 8834.00    1.3%
                                 test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test 1051.00 1064.00    1.2%
                         test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test 1628.00 1646.00    1.1%
                          test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test 1628.00 1646.00    1.1%
                       test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 9100.00 9184.00    0.9%
                     test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 3565.00 3577.00    0.3%
                    test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 3565.00 3577.00    0.3%
                       test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test 4235.00 4245.00    0.2%
                              test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test 1996.00 1998.00    0.1%
                                 test-suite :: MultiSource/Applications/JM/lencod/lencod.test 1671.00 1672.00    0.1%

test-suite :: MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc.test  783.00  782.00   -0.1%
                      test-suite :: SingleSource/Benchmarks/Misc/oourafft.test   69.00   68.00   -1.4%
        test-suite :: External/SPEC/CINT2017speed/641.leela_s/641.leela_s.test  207.00  192.00   -7.2%
         test-suite :: External/SPEC/CINT2017rate/541.leela_r/541.leela_r.test  207.00  192.00   -7.2%
 test-suite :: External/SPEC/CINT2017rate/531.deepsjeng_r/531.deepsjeng_r.test   89.00   80.00  -10.1%
test-suite :: External/SPEC/CINT2017speed/631.deepsjeng_s/631.deepsjeng_s.test   89.00   80.00  -10.1%
       test-suite :: MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg.test  260.00  215.00  -17.3%
 test-suite :: MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg.test  256.00  211.00  -17.6%

MultiSource/Benchmarks/Prolangs-C/TimberWolfMC - pretty much the same.
SingleSource/Benchmarks/Misc/oourafft.test - 2 <2 x > loads replaced by
one <4 x> load.
External/SPEC/CINT2017speed/641.leela_s - function gets vectorized and
not inlined anymore.
External/SPEC/CINT2017rate/541.leela_r - same.
External/SPEC/CINT2017rate/531.deepsjeng_r - changed the order in
multi-block tree, the result is pretty much the same.
External/SPEC/CINT2017speed/631.deepsjeng_s - same.
MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a - the result is the same
as before.
MultiSource/Benchmarks/MiBench/consumer-jpeg - same.

Differential Revision: https://reviews.llvm.org/D116688

2 years ago[NFC] Move FoldingSetNodeID::AddInteger and FoldingSetNodeID::AddPointer definitions...
Dawid Jurczak [Mon, 31 Jan 2022 14:51:13 +0000 (15:51 +0100)]
[NFC] Move FoldingSetNodeID::AddInteger and FoldingSetNodeID::AddPointer definitions to header

Lack of AddInteger/AddPointer inlining slows down NodeEquals/Profile/:operator== calls.
Inlining makes FunctionProtoTypes/PointerTypes/ElaboratedTypes/ParenTypes Profile functions faster
but since NodeEquals is still called indirectly through function pointer from FindNodeOrInsertPos
there is room for further inlining improvements.

Extracted from: https://reviews.llvm.org/D118385

Differential Revision: https://reviews.llvm.org/D118610

2 years agoXCoreTargetMachine.h - replace unnecessary StringRef include with forward declaration
Simon Pilgrim [Thu, 3 Feb 2022 14:21:22 +0000 (14:21 +0000)]
XCoreTargetMachine.h - replace unnecessary StringRef include with forward declaration

2 years agoXCoreInstPrinter.h - replace unnecessary StringRef include with forward declaration
Simon Pilgrim [Thu, 3 Feb 2022 14:05:53 +0000 (14:05 +0000)]
XCoreInstPrinter.h - replace unnecessary StringRef include with forward declaration

2 years ago[XCore] Remove orphan XCoreInstPrinter::printMemOperand declaration. NFCI.
Simon Pilgrim [Thu, 3 Feb 2022 13:55:15 +0000 (13:55 +0000)]
[XCore] Remove orphan XCoreInstPrinter::printMemOperand declaration. NFCI.

2 years ago[SLP]Alternate vectorization for cmp instructions.
Alexey Bataev [Thu, 16 Dec 2021 16:55:52 +0000 (08:55 -0800)]
[SLP]Alternate vectorization for cmp instructions.

Added support for alternate ops vectorization of the cmp instructions.
It allows vectorizing either cmp instructions with the same/swapped
predicate but different (swapped) operand kinds, or cmp instructions
with different predicates and compatible operand kinds.

Differential Revision: https://reviews.llvm.org/D115955