platform/upstream/llvm.git
2 years ago[MLIR] Matrix: support matrix-vector multiplication
Arjun P [Wed, 2 Feb 2022 12:09:32 +0000 (17:39 +0530)]
[MLIR] Matrix: support matrix-vector multiplication

This just moves in the implementation from LinearTransform.
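For readers unfamiliar with the operation, here is a minimal sketch of multiplying a row-major integer matrix by a vector from either side. The alias `Mat` and the functions `rowTimesMatrix` and `matrixTimesColumn` are hypothetical names for illustration and are not taken from the MLIR sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major integer matrix; a stand-in, not MLIR's Matrix class.
using Mat = std::vector<std::vector<int64_t>>;

// Computes row * M, treating `row` as a 1 x rows(M) vector.
std::vector<int64_t> rowTimesMatrix(const std::vector<int64_t> &row,
                                    const Mat &m) {
  assert(row.size() == m.size() && "row length must equal the row count");
  std::vector<int64_t> result(m.empty() ? 0 : m[0].size(), 0);
  for (std::size_t r = 0; r < m.size(); ++r)
    for (std::size_t c = 0; c < result.size(); ++c)
      result[c] += row[r] * m[r][c];
  return result;
}

// Computes M * col, treating `col` as a cols(M) x 1 vector.
std::vector<int64_t> matrixTimesColumn(const Mat &m,
                                       const std::vector<int64_t> &col) {
  std::vector<int64_t> result(m.size(), 0);
  for (std::size_t r = 0; r < m.size(); ++r) {
    assert(m[r].size() == col.size() && "column length must equal col count");
    for (std::size_t c = 0; c < col.size(); ++c)
      result[r] += m[r][c] * col[c];
  }
  return result;
}
```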

Reviewed By: Groverkss, bondhugula

Differential Revision: https://reviews.llvm.org/D118479

2 years agoRevert "[SLP]Alternate vectorization for cmp instructions."
Benjamin Kramer [Wed, 2 Feb 2022 12:02:35 +0000 (13:02 +0100)]
Revert "[SLP]Alternate vectorization for cmp instructions."

This reverts commit 83620bd2ad867f706c699d0f2b8be10e43d9f3d7.

It's causing miscompilations, see review comments at
https://reviews.llvm.org/D115955

2 years ago[LAA] Add Memory dependence remarks.
Malhar Jajoo [Wed, 2 Feb 2022 02:06:38 +0000 (02:06 +0000)]
[LAA] Add Memory dependence remarks.

Adds new optimization remarks when vectorization fails.

More specifically, new remarks are added for the following 4 cases:

- Backward dependency
- Backward dependency that prevents Store-to-load forwarding
- Forward dependency that prevents Store-to-load forwarding
- Unknown dependency

It is important to note that only one of the sources
of failures (to vectorize) is reported by the remarks.
This source of failure may not be first in program order.

A regression test has been added to test the following cases:

a) Loop can be vectorized: No optimization remark is emitted
b) Loop can not be vectorized: In this case an optimization
remark will be emitted for one source of failure.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D108371

2 years ago[DAG] SimplifyDemandedVectorElts - remove KnownZero/KnownUndef from DCI helper wrapper
Simon Pilgrim [Wed, 2 Feb 2022 11:40:27 +0000 (11:40 +0000)]
[DAG] SimplifyDemandedVectorElts - remove KnownZero/KnownUndef from DCI helper wrapper

None of the external users actually touch these (they're purely used internally down the recursive call) - it's trivial to add another wrapper if anything ever does want to track known elements.

2 years ago[scan-build] Fix deadlock at failures in libears/ear.c
Balazs Benics [Wed, 2 Feb 2022 11:55:44 +0000 (12:55 +0100)]
[scan-build] Fix deadlock at failures in libears/ear.c

We experienced some deadlocks when using `scan-build`'s intercept-build tool
to log a multi-threaded build, e.g. `make -j16`.

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f2bb3d152e4 in ?? ()
#3  0x00007ffcc5f0cc80 in ?? ()
#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00007f2bb3d144ee in ?? ()
#8  0x746e692f706d742f in ?? ()
#9  0x692d747065637265 in ?? ()
#10 0x2f653631326b3034 in ?? ()
#11 0x646d632e35353532 in ?? ()
#12 0x0000000000000000 in ?? ()
```

I think gcc's exit call caused the injected `libear.so` to be unloaded
by `ld`, which in turn called `void on_unload() __attribute__((destructor))`.
That tried to acquire a mutex that was still locked: the
`bear_report_call()` call probably encountered some error and
returned early without unlocking the mutex.

All of this is speculation, since from the backtrace I could not verify
whether frames 2 and 3 in fact correspond to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.
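A minimal sketch of the "release on all paths" shape, assuming a single cleanup label in the C style. It is written with `std::mutex` so it compiles as C++; `report_mutex` and `report_call` are hypothetical stand-ins, not the actual code in ear.c.

```cpp
#include <mutex>

// Hypothetical stand-in for the mutex guarding the report file.
static std::mutex report_mutex;

// Returns 0 on success and -1 on failure. Every path, including the
// failing one, goes through the `unlock` label before returning.
int report_call(bool simulate_failure) {
  int result = 0;
  report_mutex.lock();
  if (simulate_failure) {
    result = -1;
    goto unlock; // early exit still releases the mutex
  }
  // ... the report would be written here ...
unlock:
  report_mutex.unlock();
  return result;
}
```

With this shape a failed call cannot leave the mutex held, so a later call (or a destructor running at unload) does not block on it.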

Reviewed-by: NoQ
Differential Revision: https://reviews.llvm.org/D118439

2 years ago[libc] Fix automemcpy test by adding memmove configuration
Guillaume Chatelet [Wed, 2 Feb 2022 11:28:06 +0000 (11:28 +0000)]
[libc] Fix automemcpy test by adding memmove configuration

2 years agoRe-apply 3fab2d138e30, now with a triple added
Jeremy Morse [Tue, 1 Feb 2022 19:19:20 +0000 (19:19 +0000)]
Re-apply 3fab2d138e30, now with a triple added

Was reverted in 1c1b670a73a9 as it broke all non-x86 bots. Original commit
message:

[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out

In certain circumstances with things like autogenerated code and asan, you
can end up with thousands of Values live at the same time, causing a large
working set and a lot of information spilled to the stack. Unfortunately
InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory
when there are many many stack slots. See the reproducer in D116821.

It seems very unlikely that a developer would be able to reason about
hundreds of live named local variables at the same time, so a huge working
set and many stack slots is an indicator that we're likely analysing
autogenerated or instrumented code. In those cases: gracefully degrade by
setting an upper bound on the number of stack slots to track. This limits
peak memory consumption, at the cost of dropping some variable locations,
but in a rare scenario where it's unlikely someone is actually going to
use them.

In terms of the patch, this adds a cl::opt for max number of stack slots to
track, and has the stack-slot-numbering code optionally return None. That
then filters through a number of code paths, which can then choose to not
track a spill / restore if it touches an untracked spill slot. The added
test checks that we drop variable locations that are on the stack, if we
set the limit to zero.
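A minimal sketch of a numbering scheme that "optionally returns None" once a cap is hit. It uses `std::optional` in place of LLVM's optional type, and the class `SlotNumbering` and its members are hypothetical names, not the ones in InstrRefBasedLDV.

```cpp
#include <map>
#include <optional>

// Hypothetical tracker: hands out dense numbers to stack slots up to a cap.
class SlotNumbering {
  unsigned maxSlots;
  std::map<int, unsigned> numbers; // frame index -> slot number

public:
  explicit SlotNumbering(unsigned maxSlots) : maxSlots(maxSlots) {}

  // Returns the slot's number. Returns std::nullopt for a slot that has not
  // been seen before once the cap is reached; callers then skip tracking
  // the spill or restore that touches it.
  std::optional<unsigned> getOrAssign(int frameIndex) {
    auto it = numbers.find(frameIndex);
    if (it != numbers.end())
      return it->second;
    if (numbers.size() >= maxSlots)
      return std::nullopt;
    unsigned id = static_cast<unsigned>(numbers.size());
    numbers.emplace(frameIndex, id);
    return id;
  }
};
```

With a cap of zero every slot is untracked, which corresponds to the scenario the added test exercises.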

Differential Revision: https://reviews.llvm.org/D118601

2 years ago[mlir][vector] Avoid hoisting alloca'ed temporary buffers across AutomaticAllocationScope
Nicolas Vasilache [Wed, 2 Feb 2022 10:21:02 +0000 (05:21 -0500)]
[mlir][vector] Avoid hoisting alloca'ed temporary buffers across AutomaticAllocationScope

This revision avoids incorrect hoisting of alloca'd buffers across an AutomaticAllocationScope boundary.
In the more general case, we will probably need a ParallelScope-like interface.

Differential Revision: https://reviews.llvm.org/D118768

2 years ago[MSVC] Workaround missing search path for sanitizer headers.
Pierre Gousseau [Wed, 2 Feb 2022 10:54:22 +0000 (10:54 +0000)]
[MSVC] Workaround missing search path for sanitizer headers.

This is to fix build errors "Cannot open include file:
'sanitizer/asan_interface.h'" when building LLVM with MSVC and
LLVM_USE_SANITIZER=Address.

asan_interface.h is not available in MSVC's search path, instead it is
located under %VCToolsInstallDir%/crt/src/sanitizer.
This is an alternate solution to https://reviews.llvm.org/D118159, to
avoid adding all internal crt sources to the header search paths.

Tested with visual studio 2019 v16.9.6 and visual studio 2022 v17.0.5

Reviewed By: aaron.ballman, rnk

Differential Revision: https://reviews.llvm.org/D118624

2 years ago[mlir] Fully qualify generated C++ code in RewriterGen.cpp
Markus Böck [Wed, 2 Feb 2022 10:57:16 +0000 (11:57 +0100)]
[mlir] Fully qualify generated C++ code in RewriterGen.cpp

By fully qualifying the use of any types and functions from the mlir namespace, users are not required to add using namespace mlir; into the C++ file including the Tablegen output.
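A small sketch of why full qualification helps, using a hypothetical namespace `mlir_like` instead of the real `mlir` namespace. The "generated" helper compiles inside a user namespace that declares its own `Value`, because every name it uses is spelled from the global scope.

```cpp
// Hypothetical stand-in for the library namespace.
namespace mlir_like {
struct Value {
  int id;
};
inline int getId(const Value &v) { return v.id; }
} // namespace mlir_like

namespace user {
// A user type that would shadow an unqualified `Value`.
struct Value {};

// Generated-style code: fully qualified names resolve the same way in any
// enclosing namespace and need no using-directive.
inline int generatedHelper(const ::mlir_like::Value &v) {
  return ::mlir_like::getId(v);
}
} // namespace user
```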

Differential Revision: https://reviews.llvm.org/D118767

2 years agoRevert "[analyzer] Prevent misuses of -analyze-function"
Balazs Benics [Wed, 2 Feb 2022 10:44:27 +0000 (11:44 +0100)]
Revert "[analyzer] Prevent misuses of -analyze-function"

This reverts commit 9d6a6159730171bc0faf78d7f109d6543f4c93c2.

Exit Code: 1

Command Output (stderr):
--
/scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp:53:21: error: CHECK-EMPTY-NOT: excluded string found in input
// CHECK-EMPTY-NOT: Every top-level function was skipped.
                    ^
<stdin>:1:1: note: found here
Every top-level function was skipped.
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input file: <stdin>
Check file: /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        1: Every top-level function was skipped.
not:53     !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  error: no match expected
        2: Pass the -analyzer-display-progress for tracking which functions are analyzed.
>>>>>>

2 years ago[AVR] Avoid reusing the same variable name (NFC)
Nikita Popov [Wed, 2 Feb 2022 10:20:55 +0000 (11:20 +0100)]
[AVR] Avoid reusing the same variable name (NFC)

Apparently GCC 5.4 (a supported compiler) has a bug where it will
use the "MachineInstr &MI" defined by the range-based for loop
to evaluate the for loop expression. Pick a different variable
name to avoid this.
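A sketch of the pattern, with a hypothetical `Instr` type instead of `MachineInstr`. In standard C++ the range expression is evaluated before the loop variable is declared, so reusing the name is legal; the workaround simply sidesteps the compiler bug described above by picking a distinct name.

```cpp
#include <vector>

// Hypothetical stand-in for an instruction with operands.
struct Instr {
  std::vector<int> operands;
};

int sumOperands(const Instr &MI) {
  int total = 0;
  // Shape to avoid: `for (int MI : MI.operands)`, where the loop variable
  // shares its name with the object used in the range expression.
  for (int Op : MI.operands) // distinct name, no ambiguity for any compiler
    total += Op;
  return total;
}
```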

2 years ago[analyzer] Prevent misuses of -analyze-function
Balazs Benics [Wed, 2 Feb 2022 10:31:22 +0000 (11:31 +0100)]
[analyzer] Prevent misuses of -analyze-function

Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would also be useful for newcomers to learn about this.

This patch introduces some logic checking common misuses involving
`-analyze-function`.

Reviewed-By: martong
Differential Revision: https://reviews.llvm.org/D118690

2 years ago[OpenCL] Test -fdeclare-opencl-builtins with CL3 and CLC++2021
Sven van Haastregt [Wed, 2 Feb 2022 10:23:02 +0000 (10:23 +0000)]
[OpenCL] Test -fdeclare-opencl-builtins with CL3 and CLC++2021

But only test in combination with -finclude-default-header, as the
headerless tests may be dropped soon.

2 years ago[mlir][async] Add AutomaticAllocationScope to async::ExecuteOp
Nicolas Vasilache [Wed, 2 Feb 2022 10:01:42 +0000 (05:01 -0500)]
[mlir][async] Add AutomaticAllocationScope to async::ExecuteOp

Differential Revision: https://reviews.llvm.org/D118761

2 years ago[TypePromotion] Avoid some unnecessary truncs
Sam Parker [Wed, 2 Feb 2022 10:05:15 +0000 (10:05 +0000)]
[TypePromotion] Avoid some unnecessary truncs

Check for legal zext 'sinks' before inserting a trunc.

Differential Revision: https://reviews.llvm.org/D115451

2 years ago[libc++][P2321R2] Add specializations of basic_common_reference and common_type for...
Nikolas Klauser [Tue, 1 Feb 2022 20:36:50 +0000 (21:36 +0100)]
[libc++][P2321R2] Add specializations of basic_common_reference and common_type for pair

Add specializations of basic_common_reference and common_type for pair

Reviewed By: Quuxplusone, Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D117506

2 years ago[AArch64][CodeGen] Always use SVE (when enabled) to lower integer divides
David Sherwood [Wed, 19 Jan 2022 11:52:31 +0000 (11:52 +0000)]
[AArch64][CodeGen] Always use SVE (when enabled) to lower integer divides

This patch adds custom lowering support for ISD::SDIV and ISD::UDIV
when SVE is enabled, regardless of the minimum SVE vector length. We do
this because NEON simply does not have vector integer divide support, so
we want to take advantage of these instructions in SVE.

As part of this patch I've also simplified LowerToPredicatedOp to avoid
re-asking the same question about whether we should be using SVE for
fixed length vectors. Once we've made the decision to call
LowerToPredicatedOp, then we should simply assert we should be using SVE.

I've updated the 128-bit min SVE vector bits tests here:

  CodeGen/AArch64/sve-fixed-length-int-div.ll
  CodeGen/AArch64/sve-fixed-length-int-rem.ll

Differential Revision: https://reviews.llvm.org/D117871

2 years ago[GVN] Replace PointerIntPair with separate pointer & kind fields (NFC).
Florian Hahn [Wed, 2 Feb 2022 09:44:15 +0000 (09:44 +0000)]
[GVN] Replace PointerIntPair with separate pointer & kind fields (NFC).

After adding another value kind in 8a12cae862af, Value * pointers do not
have enough available empty bits to store the kind (e.g. on ARM)

To address this, the patch replaces the PointerIntPair with separate
value and kind fields.
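A back-of-the-envelope sketch of the bit budget, assuming the kind is packed into the low bits that pointer alignment leaves zero. The helper names and the `AvailableValueSketch` struct are hypothetical, and the alignments and kind counts used are illustrative rather than taken from the patch.

```cpp
#include <cstddef>

// Number of low pointer bits guaranteed to be zero for a given
// power-of-two alignment.
constexpr unsigned freeLowBits(std::size_t alignment) {
  unsigned bits = 0;
  while (alignment > 1) {
    alignment >>= 1;
    ++bits;
  }
  return bits;
}

// True if `numKinds` distinct tags fit into the free low bits.
constexpr bool kindsFitInPointer(std::size_t alignment, unsigned numKinds) {
  return numKinds <= (1u << freeLowBits(alignment));
}

// The alternative described above: keep the pointer and the kind apart.
struct AvailableValueSketch {
  const void *val;
  unsigned char kind;
};
```

Under these assumptions a 4-byte-aligned pointer offers two spare bits, so four kinds fit and a fifth does not, while separate fields have no such limit.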

2 years ago[compiler-rt][Darwin] Add arm64 to simulator platforms
Tobias Hieta [Wed, 2 Feb 2022 09:06:37 +0000 (10:06 +0100)]
[compiler-rt][Darwin] Add arm64 to simulator platforms

I was looking around and noticed that the builtins for iossim, tvossim
and watchossim were missing arm64 builds, while Apple's clang
toolchain ships with these. After a bit of searching around, it just
seems like these are not listed correctly in CMake to be enabled.

I enabled just arm64 since I saw that Apple clang didn't include
arm64e.

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D118759

2 years ago[clang-format] Correctly parse C99 digraphs: "<:", ":>", "<%", "%>", "%:", "%:%:".
Marek Kurdej [Wed, 2 Feb 2022 09:13:12 +0000 (10:13 +0100)]
[clang-format] Correctly parse C99 digraphs: "<:", ":>", "<%", "%>", "%:", "%:%:".

Fixes https://github.com/llvm/llvm-project/issues/31592.

This commit enables lexing of digraphs in C++11 and onwards.
Enabling them in C++03 is error-prone, as it would unconditionally treat sequences like "<:" as digraphs, even if they are followed by a single colon, e.g. "<::" would be treated as "[:" instead of "<" followed by "::". Lexing in C++11 doesn't have this problem as it looks ahead at the following token.
The relevant excerpt from Lexer::LexTokenInternal:
```
        // C++0x [lex.pptoken]p3:
        //  Otherwise, if the next three characters are <:: and the subsequent
        //  character is neither : nor >, the < is treated as a preprocessor
        //  token by itself and not as the first character of the alternative
        //  token <:.
```

Also, note that both clang and gcc turn on digraphs by default (-fdigraphs), so clang-format should match this behaviour.
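The lookahead rule quoted above can be sketched as a small predicate. This is an illustration of the [lex.pptoken]p3 wording only; `isLessColonDigraph` is a hypothetical helper and not the code in Lexer::LexTokenInternal or clang-format.

```cpp
#include <cstddef>
#include <string>

// Returns true if the "<:" starting at `pos` should be lexed as the
// digraph for '['. Assumes pos <= s.size().
bool isLessColonDigraph(const std::string &s, std::size_t pos) {
  if (s.compare(pos, 2, "<:") != 0)
    return false;
  // "<::" followed by a character that is neither ':' nor '>' keeps the
  // '<' as a token by itself, so "<:" is not a digraph there.
  if (pos + 2 < s.size() && s[pos + 2] == ':') {
    char next = pos + 3 < s.size() ? s[pos + 3] : '\0';
    if (next != ':' && next != '>')
      return false;
  }
  return true;
}
```

For example, in `a<::b` the `<` stands alone and `::` follows, whereas in `v<:0:>` the `<:` is the digraph.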

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118706

2 years ago[GVN] Support load of pointer-select to value-select conversion.
Florian Hahn [Wed, 2 Feb 2022 09:23:09 +0000 (09:23 +0000)]
[GVN] Support load of pointer-select to value-select conversion.

This patch extends the available-value logic to detect loads
of pointer-selects that can be replaced by a value select.

For example, consider the code below:

  loop:
    %sel.phi = phi i32* [ %start, %ph ], [ %sel, %ph ]
    %l = load %ptr
    %l.sel = load %sel.phi
    %sel = select cond, %ptr, %sel.phi
    ...

  exit:
    %res = load %sel
    use(%res)

The load of the pointer phi can be replaced by a load of the start value
outside the loop and a new phi/select chain based on the loaded values,
as illustrated below

    %l.start = load %start
  loop:
    %sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %ph ]
    %l = load %ptr
    %sel.prom = select cond, %l, %sel.phi.prom
    ...
  exit:
    use(%sel.prom)

This is a first step towards allowing vectorizing loops using common libc++
library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs)

    #include <vector>
    #include <algorithm>

    int foo(const std::vector<int> &V) {
        return *std::min_element(V.begin(), V.end());
    }
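A source-level analogy of the two IR shapes above, as a sketch only: the first function carries a pointer through the loop and loads through it at the end, the second carries the loaded value itself. Both function names are hypothetical, and the real transformation happens on IR inside GVN, not on C++ source.

```cpp
#include <cassert>
#include <vector>

// Pointer-select form: select between pointers, load the winner at the end.
int minViaPointer(const std::vector<int> &v) {
  assert(!v.empty() && "needs at least one element");
  const int *best = &v[0];
  for (const int &x : v)
    best = (x < *best) ? &x : best;
  return *best;
}

// Value-select form: select between already-loaded values.
int minViaValue(const std::vector<int> &v) {
  assert(!v.empty() && "needs at least one element");
  int best = v[0];
  for (int x : v)
    best = (x < best) ? x : best;
  return best;
}
```

The second form has no load that depends on a loop-carried pointer, which is the property that makes the loop easier to vectorize.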

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D118143

2 years ago[mlir][vector] Make write permutation lowering work with tensors.
gysit [Wed, 2 Feb 2022 09:06:31 +0000 (09:06 +0000)]
[mlir][vector] Make write permutation lowering work with tensors.

Use type inference when building the TransferWriteOp in the TransferWritePermutationLowering. Previously, the result type was set to Type(), which triggers an assertion if the pattern is used with tensors instead of memrefs.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D118758

2 years ago[VE] Packed v512f32 binop isel and tests
Simon Moll [Wed, 2 Feb 2022 08:40:52 +0000 (09:40 +0100)]
[VE] Packed v512f32 binop isel and tests

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118335

2 years ago[AArch64][SVE] NFC: tidy up isel lowering
Cullen Rhodes [Tue, 1 Feb 2022 20:46:46 +0000 (20:46 +0000)]
[AArch64][SVE] NFC: tidy up isel lowering

Whilst adding legal types <-> register classes for Streaming SVE in
D118561 I noticed the hasSVE predication block set operation actions for
opcodes that may not be legal in Streaming SVE. Move these operations to
the later hasSVE block which has loops over the same types.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D118560

2 years ago[llvm-reduce] Display all relevant options in -help
Markus Lavin [Wed, 2 Feb 2022 08:31:29 +0000 (09:31 +0100)]
[llvm-reduce] Display all relevant options in -help

Previously the options category given to cl::HideUnrelatedOptions was
local to llvm-reduce.cpp and as a result only options declared in that
file were visible in the -help options listing. This was a bit
unfortunate since there were several useful options declared in other
files. This patch addresses that.

Differential Revision: https://reviews.llvm.org/D118682

2 years ago[ArgPromotion] Add test for volatile and atomic loads (NFC)
Nikita Popov [Wed, 2 Feb 2022 08:43:38 +0000 (09:43 +0100)]
[ArgPromotion] Add test for volatile and atomic loads (NFC)

Argument promotion does handle these correctly (by not promoting
them), but there were no tests to ensure this.

2 years ago[flang][optimizer] support aggregate types inside tuple and record type
Jean Perier [Wed, 2 Feb 2022 08:21:44 +0000 (09:21 +0100)]
[flang][optimizer] support aggregate types inside tuple and record type

This patch allows:
 - fir.box type to be a member of tuple<> or fir.type<> types,
 - tuple<> type to be a member of tuple<> type.

When fir.box types are nested in tuple<> or fir.type<>, they are translated
to the struct type of a Fortran runtime descriptor, and not to a
pointer to a descriptor. This is because the fir.box is owned by the tuple
or fir.type.

FIR type translation was also flattening nested tuples while lowering to LLVM
dialect types. There does not seem to be a deep reason for doing that
and doing it causes issues in fir.coordinate_of generated on such tuple
(a fir.coordinate_of getting tuple<B, C> in tuple<A, tuple<B, C>>
ended-up lowered to an LLVM GEP getting B).
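The indexing problem can be illustrated with `std::tuple` as an analogy (this is not FIR code, and `A`, `B`, `C` are placeholder types): element 1 of the nested tuple is the inner tuple, but element 1 of the flattened tuple is `B`.

```cpp
#include <tuple>
#include <type_traits>

struct A {};
struct B {};
struct C {};

using Nested = std::tuple<A, std::tuple<B, C>>;
using Flattened = std::tuple<A, B, C>;

// Index 1 names the whole inner tuple in the nested type ...
static_assert(
    std::is_same<std::tuple_element_t<1, Nested>, std::tuple<B, C>>::value,
    "nested element 1 is tuple<B, C>");
// ... but names B once the tuple has been flattened.
static_assert(std::is_same<std::tuple_element_t<1, Flattened>, B>::value,
              "flattened element 1 is B");

// True when the same index denotes different members in the two layouts.
constexpr bool flatteningChangesElement1 =
    !std::is_same<std::tuple_element_t<1, Nested>,
                  std::tuple_element_t<1, Flattened>>::value;
```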

Differential Revision: https://reviews.llvm.org/D118701

2 years ago[VE] LEGALAVL and staged VVP legalization
Simon Moll [Wed, 2 Feb 2022 08:11:33 +0000 (09:11 +0100)]
[VE] LEGALAVL and staged VVP legalization

The new LEGALAVL node annotates that the AVL refers to packs of 64bit.
We use a two-stage lowering approach with LEGALAVL:

First, standard SDNodes are translated into illegal VVP layer nodes.
Regardless of source (VP or standard), all VVP nodes have a mask and AVL
parameter. The AVL parameter refers to the element position (just as in
VP intrinsics).

Second, we legalize the AVL usage in VVP layer nodes. If the element
size is < 64bit, the EVL parameter has to be adjusted to refer to packs
of 64bits.  We wrap the legalized AVL in a LEGALAVL node to track this.
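A sketch of the second stage's arithmetic, under the assumption that the adjustment is a round-up division of the element count by the number of elements per 64-bit pack. `legalizeAvl` is a hypothetical helper; the actual lowering operates on SDNodes rather than plain integers.

```cpp
// Converts an element-count AVL into a count of 64-bit packs.
// Assumes elemBits is non-zero and divides 64 (e.g. 32 or 64).
constexpr unsigned legalizeAvl(unsigned avl, unsigned elemBits) {
  unsigned perPack = 64 / elemBits;       // elements per 64-bit pack
  return (avl + perPack - 1) / perPack;   // round up to whole packs
}
```

Under this assumption 512 elements of 32 bits need 256 packs, and an odd count such as 511 still needs 256, since the last pack is only half used.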

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118321

2 years ago[AVR][NFC] Make atomics tests easier to read
Ayke van Laethem [Sun, 23 Jan 2022 16:55:30 +0000 (17:55 +0100)]
[AVR][NFC] Make atomics tests easier to read

Use the same mnemonics in the tests that are used in the AtomicLoadOp
pattern ($rd, $rr) but use RR1 instead of $operand. This matches similar
tests in load8.ll.

Differential Revision: https://reviews.llvm.org/D117991

2 years ago[AVR] Fix atomicrmw result value
Ayke van Laethem [Wed, 19 Jan 2022 22:30:54 +0000 (23:30 +0100)]
[AVR] Fix atomicrmw result value

This patch fixes the atomicrmw result value to be the value before the
operation instead of the value after the operation. This was a bug, left
as a FIXME in the code (see https://reviews.llvm.org/D97127).

From the LangRef:

> The contents of memory at the location specified by the <pointer>
> operand are atomically read, modified, and written back. The original
> value at the location is returned.

Doing this expansion early allows the register allocator to arrange
registers in such a way that commutable operations are simply swapped
around as needed, which results in shorter code while still being
correct.
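The "original value is returned" semantics quoted from the LangRef match what `std::atomic::fetch_add` does in C++, which compilers typically lower to an `atomicrmw add`. A minimal sketch, with the hypothetical wrapper `addAndReturnOld`:

```cpp
#include <atomic>

// Atomically adds `delta` to `cell` and returns the value that was stored
// before the addition, not the sum.
int addAndReturnOld(std::atomic<int> &cell, int delta) {
  return cell.fetch_add(delta);
}
```

Calling it on a cell holding 5 with a delta of 3 returns 5 and leaves 8 in memory; returning 8 instead is the bug this patch fixes.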

Differential Revision: https://reviews.llvm.org/D117725

2 years agoBump the trunk major version to 15
Tom Stellard [Wed, 2 Feb 2022 07:29:29 +0000 (23:29 -0800)]
Bump the trunk major version to 15

2 years ago[flang] Lower PAUSE statement
Valentin Clement [Wed, 2 Feb 2022 07:15:26 +0000 (08:15 +0100)]
[flang] Lower PAUSE statement

Lower the PAUSE statement to a runtime call.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan, schweitz

Differential Revision: https://reviews.llvm.org/D118699

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

2 years ago[docs] Remove hard-coded version numbers from sphinx configs
Tom Stellard [Wed, 2 Feb 2022 07:13:01 +0000 (23:13 -0800)]
[docs] Remove hard-coded version numbers from sphinx configs

This updates all the non-runtime project release notes to use the
version number from CMake instead of the hard-coded version numbers
in conf.py.

It also hides warnings about pre-releases when the git suffix
is dropped from the LLVM version in CMake.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D112181

2 years ago[cmake][NFC] Configuration for libLLVM.so symbol versioning
Stephen Neuendorffer [Tue, 1 Feb 2022 01:56:42 +0000 (17:56 -0800)]
[cmake][NFC] Configuration for libLLVM.so symbol versioning

Symbol versioning can prevent unintended install-time conflicts
between different llvm versions.  Users may need to override this
for particular products (e.g. Julia), but this requires carrying
a source code patch.  This patch moves this ability to a
configuration option.  NFC for existing usage.

Differential Revision: https://reviews.llvm.org/D118672

2 years agoAdd missing includes after LLVMCore header cleanup
serge-sans-paille [Wed, 2 Feb 2022 06:49:40 +0000 (07:49 +0100)]
Add missing includes after LLVMCore header cleanup

- conditionally include header only used for expensive check
- have Core.h always include llvm-c/ErrorHandling.h

2 years agoUpdate status on migration again. Add note about issues with reply by email from...
Tanya Lattner [Wed, 2 Feb 2022 06:25:31 +0000 (22:25 -0800)]
Update status on migration again. Add note about issues with reply by email from emails pre-migration.

2 years ago[lld][ELF] Add support for ADRP+ADD optimization for AArch64
Alexander Shaposhnikov [Wed, 2 Feb 2022 06:08:05 +0000 (06:08 +0000)]
[lld][ELF] Add support for ADRP+ADD optimization for AArch64

This diff adds support for ADRP+ADD optimization for AArch64 described in
https://github.com/ARM-software/abi-aa/commit/d2ca58c54b8e955cfef25c71822f837ae0439d73
i.e. under appropriate constraints

ADRP  x0, symbol
ADD   x0, x0, :lo12: symbol

can be turned into

NOP
ADR   x0, symbol
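One of the constraints is presumably reach: ADR encodes a signed 21-bit byte offset, so the symbol has to lie within about 1 MiB of the instruction. The sketch below models only that check; `fitsAdr` is a hypothetical helper, and the remaining conditions from the linked ABI document (matching registers, adjacent instructions, and so on) are not modelled.

```cpp
#include <cstdint>

// True if `symbol` is reachable by an ADR placed at address `adrPc`,
// i.e. the byte offset lies in [-2^20, 2^20).
constexpr bool fitsAdr(uint64_t symbol, uint64_t adrPc) {
  // Unsigned subtraction wraps; reinterpreting it as signed gives the
  // two's-complement displacement.
  int64_t offset = static_cast<int64_t>(symbol - adrPc);
  return offset >= -(int64_t{1} << 20) && offset < (int64_t{1} << 20);
}
```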

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D117614

2 years agoCleanup header dependencies in LLVMCore
serge-sans-paille [Mon, 31 Jan 2022 21:35:07 +0000 (22:35 +0100)]
Cleanup header dependencies in LLVMCore

Based on the output of include-what-you-use.

This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden header dependencies, something
the LLVM codebase doesn't do that well :-/

I've tried to summarize the biggest changes below:

- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h

And the usual count of preprocessed lines:
$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after:  6189948

200k fewer lines to process is not that bad ;-)

Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup

Differential Revision: https://reviews.llvm.org/D118652

2 years ago[RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC
Craig Topper [Wed, 2 Feb 2022 05:30:58 +0000 (21:30 -0800)]
[RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC

2 years ago[TableGen][RISCV] Relax a restriction in generating patterns for commutable SDNodes.
Craig Topper [Wed, 2 Feb 2022 05:07:02 +0000 (21:07 -0800)]
[TableGen][RISCV] Relax a restriction in generating patterns for commutable SDNodes.

Previously, all children would be checked to see if any were an
explicit Register. If any were, no commutable patterns would be
generated. This patch loosens the restriction to only check the
children that are being commuted.
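The difference between the two checks can be sketched as follows. `Child`, `canCommuteOld` and `canCommuteNew` are hypothetical names for illustration; the real logic lives in TableGen's pattern generation and works on pattern tree nodes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pattern child.
struct Child {
  bool isExplicitRegister;
};

// Old behaviour: an explicit register anywhere blocks commuting.
bool canCommuteOld(const std::vector<Child> &children) {
  for (const Child &c : children)
    if (c.isExplicitRegister)
      return false;
  return true;
}

// Relaxed behaviour: only the two children being swapped are checked.
bool canCommuteNew(const std::vector<Child> &children, std::size_t i,
                   std::size_t j) {
  return !children[i].isExplicitRegister && !children[j].isExplicitRegister;
}
```

A node whose third child is a fixed register (such as a mask operand placed after the commutable operands) is rejected by the old check but accepted by the new one when children 0 and 1 are swapped.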

Digging back through history, this code predates the existence of
commutable intrinsics and commutable SDNodes with more than 2
operands. At that time the loop would count the number of children that
weren't registers and if that was equal to 2 it would allow commuting.
I don't think this loop was re-considered when commutable
intrinsics were added or when we allowed SDNodes with more than 2
operands.

This is important for RISCV, where our isel patterns have a V0 mask
operand after the commutable operands on some RISCVISD opcodes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D117955

2 years ago[mlir][ods] NFC Fix ASAN error in FormatParser
Mogball [Wed, 2 Feb 2022 04:29:57 +0000 (04:29 +0000)]
[mlir][ods] NFC Fix ASAN error in FormatParser

Some FormatElement subclasses contain `std::vector`. Since these use
BumpPtrAllocator, they need to be converted to trailing objects.
However, this is not a trivial fix so I will leave it as a FIXME and use
a workaround.

2 years agoReland "[gn build] (manually) port 36892727e4f1"
Nico Weber [Wed, 2 Feb 2022 03:30:13 +0000 (22:30 -0500)]
Reland "[gn build] (manually) port 36892727e4f1"

This reverts commit da01fb7471a027c29db1b78e1721cd1ae6df6572.
Matches 84f137a590e7.

Also adds LLVM_INSTALL_TOOLCHAIN_ONLY, which 84f137a590e7 added too.

2 years agoRevert "[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out"
Kevin Athey [Wed, 2 Feb 2022 01:19:39 +0000 (17:19 -0800)]
Revert "[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out"

This reverts commit 3fab2d138e30c65249e1eaea6cc68b2b7f50955a.

Breaking PPC sanitizer build:
https://lab.llvm.org/buildbot/#/builders/105/builds/20857

2 years agoAdd new status of the move to Discourse.
Tanya Lattner [Wed, 2 Feb 2022 02:30:46 +0000 (18:30 -0800)]
Add new status of the move to Discourse.

2 years ago[ARM] Fix build break after 762f0b546328
Nemanja Ivanovic [Wed, 2 Feb 2022 02:13:09 +0000 (20:13 -0600)]
[ARM] Fix build break after 762f0b546328

The commit adds a unit test that uses the facilities of libLLVMCore
without adding it to link components. This causes failures with
the shared libraries builds.

This patch just adds the missing library to the link step.

2 years agoUpdate discourse migration status.
Tanya Lattner [Wed, 2 Feb 2022 02:09:31 +0000 (18:09 -0800)]
Update discourse migration status.

2 years ago[AMDGPU][NFC] Fixing formatting
Jacob Lambert [Thu, 20 Jan 2022 18:25:10 +0000 (10:25 -0800)]
[AMDGPU][NFC] Fixing formatting

Differential Revision: https://reviews.llvm.org/D117801

2 years ago[flang] Fix argument keyword names in some specific intrinsics
Peter Klausler [Thu, 27 Jan 2022 17:55:37 +0000 (09:55 -0800)]
[flang] Fix argument keyword names in some specific intrinsics

Some entries in the specific intrinsic function table have the
wrong argument keyword names -- they should agree with the names
of the arguments on their corresponding generic intrinsic function.
Clean them up.

Differential Revision: https://reviews.llvm.org/D118721

2 years ago[libc++][ranges][NFC] Fix an inconsistent patch link on the Ranges status page.
Konstantin Varlamov [Wed, 2 Feb 2022 00:50:33 +0000 (16:50 -0800)]
[libc++][ranges][NFC] Fix an inconsistent patch link on the Ranges status page.

2 years ago[llvm-profgen] Clean up unnecessary memory reservations between phases.
Hongtao Yu [Tue, 1 Feb 2022 04:24:45 +0000 (20:24 -0800)]
[llvm-profgen] Clean up unnecessary memory reservations between phases.

Cleaning up data structures that are not used after a certain point. This further brings down peak memory usage by 15% for a large benchmark.

Before:
   note: Before parsePerfTraces
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: Before parseAndAggregateTrace
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: After parseAndAggregateTrace
   note: VM: 88.93 GB   RSS: 87.97 GB
   note: Before generateUnsymbolizedProfile
   note: VM: 88.95 GB   RSS: 87.99 GB
   note: After generateUnsymbolizedProfile
   note: VM: 93.50 GB   RSS: 92.53 GB
   note: After computeSizeForProfiledFunctions
   note: VM: 101.13 GB   RSS: 99.36 GB
   note: After generateProbeBasedProfile
   note: VM: 215.61 GB   RSS: 210.88 GB
   note: After postProcessProfiles
   note: VM: 237.48 GB   RSS: 212.50 GB

After:
   note: Before parsePerfTraces
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: Before parseAndAggregateTrace
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: After parseAndAggregateTrace
   note: VM: 88.93 GB   RSS: 87.96 GB
   note: Before generateUnsymbolizedProfile
   note: VM: 88.95 GB   RSS: 87.97 GB
   note: After generateUnsymbolizedProfile
   note: VM: 93.50 GB   RSS: 92.51 GB
   note: After computeSizeForProfiledFunctions
   note: VM: 93.50 GB   RSS: 92.53 GB
   note: After generateProbeBasedProfile
   note: VM: 164.87 GB   RSS: 163.55 GB
   note: After postProcessProfiles
   note: VM: 182.28 GB   RSS: 179.43 GB

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D118677

2 years ago[flang] Fix edge-case I/O regressions
Peter Klausler [Thu, 27 Jan 2022 17:53:14 +0000 (09:53 -0800)]
[flang] Fix edge-case I/O regressions

A blank field in an input record that exists must be interpreted
as a zero value for numeric input editing, but advancing to a
next record that doesn't exist should leave an input variable
unmodified (and signal END=).  On internal output, blank fill
the "current record" array element even if nothing has been
written to it if it is the only record.

Differential Revision: https://reviews.llvm.org/D118720

2 years agoTest fixes for prior patch
David Blaikie [Wed, 2 Feb 2022 00:15:25 +0000 (16:15 -0800)]
Test fixes for prior patch

2 years agoRevert "DebugInfo: Don't put types in type units if they reference internal linkage...
David Blaikie [Tue, 1 Feb 2022 02:27:39 +0000 (18:27 -0800)]
Revert "DebugInfo: Don't put types in type units if they reference internal linkage types"

This reverts commit ab4756338c5b2216d52d9152b2f7e65f233c4dac.

Breaks some cases, including this:

namespace {
template <typename> struct a {};
} // namespace
class c {
  c();
};
class b {
  b();
  a<c> ax;
};
b::b() {}
c::c() {}

By producing a reference to a type unit for "c" but not producing the type unit.

2 years agoRevert "[ASan] Not linking asan_static library for DSO."
Kirill Stoimenov [Tue, 1 Feb 2022 20:39:29 +0000 (20:39 +0000)]
Revert "[ASan] Not linking asan_static library for DSO."

This reverts commit cf730d8ce1341ba593144df2e2bc8411238e04c3. It turned out that D118184 is causing segfaults in some situations.

Reviewed By: vitalybuka, kda

Differential Revision: https://reviews.llvm.org/D118739

2 years ago[mlir][taco] Add a utility to create an MLIR sparse tensor from a file.
Bixia Zheng [Fri, 28 Jan 2022 18:56:50 +0000 (10:56 -0800)]
[mlir][taco] Add a utility to create an MLIR sparse tensor from a file.

Move the functions that retrieve the supporting C library, compile an MLIR
module and build a JIT execution engine to mlir_pytaco_utils.

Add a function to create an MLIR sparse tensor from a file and return a pointer
to the MLIR sparse tensor as well as the shape of the sparse tensor.

Add unit tests.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D118496

2 years ago[Driver][test] Fix fatal-warnings.c CHECK lines and fold the test into as-warnings.c
Fangrui Song [Tue, 1 Feb 2022 23:11:16 +0000 (15:11 -0800)]
[Driver][test] Fix fatal-warnings.c CHECK lines and fold the test into as-warnings.c

2 years agoRevert "[llvm-profgen] Clean up unnecessary memory reservations between phases."
Hongtao Yu [Tue, 1 Feb 2022 22:44:37 +0000 (14:44 -0800)]
Revert "[llvm-profgen] Clean up unnecessary memory reservations between phases."

This reverts commit 057e784b0962a7c5a17e858932bb6f03c7676c47.

2 years ago[libc++][ranges][NFC] In the Ranges status, list the changes to stream.iterators
Konstantin Varlamov [Tue, 1 Feb 2022 22:39:53 +0000 (14:39 -0800)]
[libc++][ranges][NFC] In the Ranges status, list the changes to stream.iterators

2 years ago[LV] Allow a scalable VF for the epilogue.
Sander de Smalen [Tue, 1 Feb 2022 17:27:01 +0000 (17:27 +0000)]
[LV] Allow a scalable VF for the epilogue.

For some reason we limited the epilogue VF to be fixed-width, but there
is not necessarily a reason for doing so. If the main VF=vscale x 16, the
epilogue VF could be either fixed-width, or a scalable VF up to vscale x 8.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D118688

2 years agoReland "enable plugins for clang-tidy"
Jameson Nash [Tue, 1 Feb 2022 17:05:20 +0000 (12:05 -0500)]
Reland "enable plugins for clang-tidy"

This reverts commit ab3b89855c5318f0009e1f016ffe5b1483507fd0 but
disables the new test if the user has disabled support for building it.

2 years ago[libc++][ranges][NFC] In the Ranges status, list the changes to predef.iterators
Konstantin Varlamov [Tue, 1 Feb 2022 22:34:40 +0000 (14:34 -0800)]
[libc++][ranges][NFC] In the Ranges status, list the changes to predef.iterators

2 years ago[LoopFuse] Add assertion for non-null DT in fusion candidate
Anna Thomas [Tue, 1 Feb 2022 21:46:02 +0000 (16:46 -0500)]
[LoopFuse] Add assertion for non-null DT in fusion candidate

The code paths analyzed (all constructor invocations of fusion
candidate) pass in a non-null DT.
Adding this assert as requested in D118472 before converting this to a
reference argument.

2 years ago[LoopPeel] Use reference instead of pointer for DT argument
Anna Thomas [Tue, 1 Feb 2022 21:29:22 +0000 (16:29 -0500)]
[LoopPeel] Use reference instead of pointer for DT argument

Cleanup code in peelLoop API. We already have usage of DT without guarding
against a null DT, so this change constant folds the remaining null DT
checks.
Also make the argument a reference so that it is clear the argument is
a nonnull DT.
Extracted from D118472.

2 years ago[libc++] Make _VSTD an alias for std
Nikolas Klauser [Tue, 1 Feb 2022 21:38:27 +0000 (22:38 +0100)]
[libc++] Make _VSTD an alias for std

There is no practical difference between `_VSTD` and `std` so we should just remove `_VSTD`. This is the first step.

Reviewed By: ldionne, #libc

Spies: jeroen.dobbelaere, wmaxey, EricWF, lebedev.ri, __simt__, dim, mgrang, sstefan1, wenlei, smeenai, libcxx-commits, #libc_vendors

Differential Revision: https://reviews.llvm.org/D117811

2 years agoAdd ClangLinkerWrapper to the TOC to appease the Sphinx build bot
Aaron Ballman [Tue, 1 Feb 2022 21:37:07 +0000 (16:37 -0500)]
Add ClangLinkerWrapper to the TOC to appease the Sphinx build bot

2 years ago[sanitizer_common][test] Enable tests on SPARC
Rainer Orth [Tue, 1 Feb 2022 21:33:56 +0000 (22:33 +0100)]
[sanitizer_common][test] Enable tests on SPARC

Unfortunately, the `sanitizer_common` tests are disabled on many targets
that are supported by `sanitizer_common`, making it easy to miss issues
with that support.  This patch enables SPARC testing.

Beside the enabling proper, the patch fixes (together with D91607
<https://reviews.llvm.org/D91607>) the failures of the `symbolize_pc.cpp`,
`symbolize_pc_demangle.cpp`, and `symbolize_pc_inline.cpp` tests.  They
lack calls to `__builtin_extract_return_addr`.  When those are added, they
`PASS` when compiled with `gcc`.  `clang` incorrectly doesn't implement a
non-default `__builtin_extract_return_addr` on several targets, SPARC
included.

Because `__builtin_extract_return_addr(__builtin_return_addr(0))` is quite
a mouthful and I'm uncertain if the code needs to compile with msvc which
apparently has its own `_ReturnAddress`, I've introduced
`__sanitizer_return_addr` to hide the difference and complexity.  Because
on 32-bit SPARC `__builtin_extract_return_addr` differs when the calling
function returns a struct, I've added a testcase for that.

There are a couple more tests failing on SPARC that I will deal with
separately.

Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D91608

2 years ago[libc++] Remove unneeded qualifier.
Mark de Wever [Tue, 1 Feb 2022 21:32:49 +0000 (16:32 -0500)]
[libc++] Remove unneeded qualifier.

In D117811 @Quuxplusone pointed out the friend declarations don't need
to be qualified. Removing the qualification should avoid needing to add
a GCC work-around when changing _VSTD to std.

Reviewed By: Quuxplusone, philnik, #libc, ldionne

Differential Revision: https://reviews.llvm.org/D118719

2 years ago[hwasan][test] Remove obsoleted/removed -fno-experimental-new-pass-manager
Fangrui Song [Tue, 1 Feb 2022 21:24:39 +0000 (13:24 -0800)]
[hwasan][test] Remove obsoleted/removed -fno-experimental-new-pass-manager

2 years ago[GVN] Add additional tests after 216d1a729.
Florian Hahn [Tue, 1 Feb 2022 21:02:41 +0000 (21:02 +0000)]
[GVN] Add additional tests after 216d1a729.

Further extend test coverage added in 216d1a729

2 years ago[llvm-profgen] Clean up unnecessary memory reservations between phases.
Hongtao Yu [Tue, 1 Feb 2022 04:24:45 +0000 (20:24 -0800)]
[llvm-profgen] Clean up unnecessary memory reservations between phases.

Cleaning up data structures that are not used after a certain point. This further brings down peak memory usage by 15% for a large benchmark.

Before:
   note: Before parsePerfTraces
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: Before parseAndAggregateTrace
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: After parseAndAggregateTrace
   note: VM: 88.93 GB   RSS: 87.97 GB
   note: Before generateUnsymbolizedProfile
   note: VM: 88.95 GB   RSS: 87.99 GB
   note: After generateUnsymbolizedProfile
   note: VM: 93.50 GB   RSS: 92.53 GB
   note: After computeSizeForProfiledFunctions
   note: VM: 101.13 GB   RSS: 99.36 GB
   note: After generateProbeBasedProfile
   note: VM: 215.61 GB   RSS: 210.88 GB
   note: After postProcessProfiles
   note: VM: 237.48 GB   RSS: 212.50 GB

After:
   note: Before parsePerfTraces
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: Before parseAndAggregateTrace
   note: VM: 40.73 GB   RSS: 39.18 GB
   note: After parseAndAggregateTrace
   note: VM: 88.93 GB   RSS: 87.96 GB
   note: Before generateUnsymbolizedProfile
   note: VM: 88.95 GB   RSS: 87.97 GB
   note: After generateUnsymbolizedProfile
   note: VM: 93.50 GB   RSS: 92.51 GB
   note: After computeSizeForProfiledFunctions
   note: VM: 93.50 GB   RSS: 92.53 GB
   note: After generateProbeBasedProfile
   note: VM: 164.87 GB   RSS: 163.55 GB
   note: After postProcessProfiles
   note: VM: 182.28 GB   RSS: 179.43 GB
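
For reference, the 15% figure appears to correspond to the final RSS
numbers in the two listings. A minimal sketch of the arithmetic, with a
hypothetical helper name:

```cpp
// Relative reduction, in percent, from `before` to `after`.
// Hypothetical helper; assumes before > 0.
double examplePercentReduction(double before, double after) {
  return (before - after) / before * 100.0;
}
```

With the values above, `examplePercentReduction(212.50, 179.43)` is about
15.6 (RSS) and `examplePercentReduction(237.48, 182.28)` is about 23.2 (VM).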

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D118677

2 years ago[x86] add tests for fmul/fdiv with identity constant in select arm; NFC
Sanjay Patel [Tue, 1 Feb 2022 20:28:21 +0000 (15:28 -0500)]
[x86] add tests for fmul/fdiv with identity constant in select arm; NFC

2 years ago[x86] add more tests for select with identity constant; NFC
Sanjay Patel [Tue, 1 Feb 2022 16:20:27 +0000 (11:20 -0500)]
[x86] add more tests for select with identity constant; NFC

D118644

2 years ago[mlir][capi] Add DialectRegistry to MLIR C-API
Daniel Resnick [Thu, 27 Jan 2022 00:13:24 +0000 (17:13 -0700)]
[mlir][capi] Add DialectRegistry to MLIR C-API

Exposes mlir::DialectRegistry to the C API as MlirDialectRegistry along with
helper functions. A hook has been added to MlirDialectHandle that inserts
the dialect into a registry.

A future possible change is removing mlirDialectHandleRegisterDialect in
favor of using mlirDialectHandleInsertDialect, which it is now implemented with.

Differential Revision: https://reviews.llvm.org/D118293

2 years ago[AMDGPU] Check atomics aliasing in the clobbering annotation
Stanislav Mekhanoshin [Mon, 31 Jan 2022 23:10:08 +0000 (15:10 -0800)]
[AMDGPU] Check atomics aliasing in the clobbering annotation

MemorySSA considers any atomic a def to any operation it dominates
just like a barrier or fence. That is correct from memory state
perspective, but not required for the no-clobber metadata since
we are not using it for reordering. Skip such atomics during the
scan just like a barrier if it does not alias with the load.

Differential Revision: https://reviews.llvm.org/D118661

2 years ago[libc++] Fix TOCTOU issue with std::filesystem::remove_all
Louis Dionne [Wed, 26 Jan 2022 16:07:49 +0000 (11:07 -0500)]
[libc++] Fix TOCTOU issue with std::filesystem::remove_all

https://bugs.chromium.org/p/llvm/issues/detail?id=19
rdar://87912416

Differential Revision: https://reviews.llvm.org/D118134

2 years ago[libc++][ci] Re-enable the bootstrapping build
Louis Dionne [Mon, 24 Jan 2022 20:49:56 +0000 (15:49 -0500)]
[libc++][ci] Re-enable the bootstrapping build

Differential Revision: https://reviews.llvm.org/D118067

2 years ago[GVN] Add tests for D118143 not requiring loops.
Florian Hahn [Tue, 1 Feb 2022 20:24:19 +0000 (20:24 +0000)]
[GVN] Add tests for D118143 not requiring loops.

2 years agoRevert "[DAG] Extend SearchForAndLoads with any_extend handling"
David Green [Tue, 1 Feb 2022 20:18:40 +0000 (20:18 +0000)]
Revert "[DAG] Extend SearchForAndLoads with any_extend handling"

This reverts commit 100763a88fe97b22cd5e3f69d203669aac3ed48f as it was
making incorrect assumptions about implicit zero_extends.

2 years ago[clang] Don't typo-fix an expression in a SFINAE context.
Arthur O'Dwyer [Tue, 18 Jan 2022 12:25:17 +0000 (07:25 -0500)]
[clang] Don't typo-fix an expression in a SFINAE context.

If this is a SFINAE context, then continuing to look up names
(in particular, to treat a non-function as a function, and then
do ADL) might too-eagerly complete a type that it's not safe to
complete right now. We should just say "okay, that's a substitution
failure" and not do any more work than absolutely required.

Fixes #52970.

Differential Revision: https://reviews.llvm.org/D117603

2 years ago[clang] Correctly(?) handle placeholder types in ExprRequirements.
Arthur O'Dwyer [Fri, 28 Jan 2022 20:51:19 +0000 (15:51 -0500)]
[clang] Correctly(?) handle placeholder types in ExprRequirements.

Bug #52905 was originally papered over in a different way, but
I believe this is the actually proper fix, or at least closer to
it. We need to detect placeholder types as close to the front-end
as possible, and cause them to fail constraints, rather than letting
them persist into later stages.

Fixes #52905.
Fixes #52909.
Fixes #53075.

Differential Revision: https://reviews.llvm.org/D118552

2 years ago[libc++] Fix LWG3589 "The const lvalue reference overload of get for subrange..."
Arthur O'Dwyer [Sat, 22 Jan 2022 19:33:12 +0000 (14:33 -0500)]
[libc++] Fix LWG3589 "The const lvalue reference overload of get for subrange..."

https://cplusplus.github.io/LWG/issue3589

Differential Revision: https://reviews.llvm.org/D117961

2 years ago[hwasan] work around lifetime issue with setjmp.
Florian Mayer [Mon, 31 Jan 2022 21:10:41 +0000 (13:10 -0800)]
[hwasan] work around lifetime issue with setjmp.

setjmp can return twice, but PostDominatorTree is unaware of this. as
such, it overestimates postdominance, leaving some cases (see attached
compiler-rt) where memory does not get untagged on return. this causes
false positives later in the program execution.

this is a crude workaround to unblock use-after-scope for now, in the
longer term PostDominatorTree should be made aware of returns_twice
functions, as this may cause problems elsewhere.
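
For readers unfamiliar with the "returns twice" behaviour, here is a small
standalone sketch (unrelated to the attached compiler-rt test; the function
name is made up) in which a single setjmp call site is returned from twice:

```cpp
#include <csetjmp>

static std::jmp_buf exampleEnv;

// Counts how many times control comes back out of the one setjmp call.
int exampleCountSetjmpReturns() {
  // volatile so the value survives the longjmp.
  volatile int returns = 0;
  if (setjmp(exampleEnv) == 0) {
    returns = returns + 1;          // first, direct return
    std::longjmp(exampleEnv, 1);    // jump back into setjmp
  } else {
    returns = returns + 1;          // second return, via longjmp
  }
  return returns;
}
```

A postdominance analysis that treats setjmp as an ordinary call does not see
the second path out of it.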

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118647

2 years ago[flang] Lower basic STOP statement
Valentin Clement [Tue, 1 Feb 2022 19:53:00 +0000 (20:53 +0100)]
[flang] Lower basic STOP statement

This patch lowers the STOP statement without arguments
and ERROR STOP. Lowering of STOP statements with arguments will
come in later patches since it requires some expression lowering
to be added.
The STOP statement is lowered to a runtime call.

Also makes sure we are creating a constant in the MLIR arith constant.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan, schweitz

Differential Revision: https://reviews.llvm.org/D118697

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

2 years ago[flang] Fix/work around warnings from GCC 11
Peter Klausler [Tue, 1 Feb 2022 19:51:19 +0000 (11:51 -0800)]
[flang] Fix/work around warnings from GCC 11

Apply part of a pending patch for GCC 11 warnings, and
rework a piece of code, to dodge warnings on flag from
GCC 11 build bots exposed by a recent patch.

Applying without review to get bots working again; changes
also tested against GCC 9.3.0.

2 years ago[AMDGPU] Allow scalar loads after barrier
Stanislav Mekhanoshin [Fri, 28 Jan 2022 00:27:43 +0000 (16:27 -0800)]
[AMDGPU] Allow scalar loads after barrier

Currently we cannot convert a vector load into scalar if there
is a dominating barrier or fence. It is considered a clobbering
memory access to prevent memory operations reordering. While
reordering is not possible, the actual memory is not being clobbered
by a barrier or fence and we can still use a scalar load for a
uniform pointer.

The solution is not to bail on a first clobbering access but
traverse MemorySSA to the root excluding barriers and fences.
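
A toy model of that traversal, assuming the defining accesses above the load
form a simple chain (real MemorySSA also has phis, and a real check would
ask alias analysis about each store). All names here are hypothetical:

```cpp
#include <vector>

enum class ExampleDefKind { Barrier, Fence, Store };

// `defsToRoot` lists the defining accesses from the one nearest the load
// up to the root. Barriers and fences are skipped instead of ending the
// walk; in this toy model any store counts as a clobber.
bool exampleIsClobbered(const std::vector<ExampleDefKind> &defsToRoot) {
  for (ExampleDefKind kind : defsToRoot) {
    if (kind == ExampleDefKind::Barrier || kind == ExampleDefKind::Fence)
      continue; // orders memory operations but does not write memory
    return true;
  }
  return false; // reached the root without finding a real write
}
```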

Differential Revision: https://reviews.llvm.org/D118419

2 years ago[DebugInfo][InstrRef][NFC] Bypass a frequently-noop loop
Jeremy Morse [Tue, 1 Feb 2022 19:39:09 +0000 (19:39 +0000)]
[DebugInfo][InstrRef][NFC] Bypass a frequently-noop loop

Bypass this loop if it would do nothing -- if there are no register masks
to be examined, there's no point looking at each location to see if the
location has been def'd. Awkwardly, this was responsible for almost an
entire half a percent of performance improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D118613

2 years ago[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out
Jeremy Morse [Tue, 1 Feb 2022 19:19:20 +0000 (19:19 +0000)]
[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out

In certain circumstances with things like autogenerated code and asan, you
can end up with thousands of Values live at the same time, causing a large
working set and a lot of information spilled to the stack. Unfortunately
InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory
when there are many many stack slots. See the reproducer in D116821.

It seems very unlikely that a developer would be able to reason about
hundreds of live named local variables at the same time, so a huge working
set and many stack slots is an indicator that we're likely analysing
autogenerated or instrumented code. In those cases: gracefully degrade by
setting an upper bound on the amount of stack slots to track. This limits
peak memory consumption, at the cost of dropping some variable locations,
but in a rare scenario where it's unlikely someone is actually going to
use them.

In terms of the patch, this adds a cl::opt for max number of stack slots to
track, and has the stack-slot-numbering code optionally return None. That
then filters through a number of code paths, which can then choose to not
track a spill / restore if it touches an untracked spill slot. The added
test checks that we drop variable locations that are on the stack, if we
set the limit to zero.
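
The shape of that change can be sketched as follows. This is an
illustration only, not the code in the patch: `std::optional` (C++17) stands
in for the optional return described above, and the class and function names
are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <unordered_map>

// Hands out dense ids for stack slots, up to a fixed limit.
class ExampleSlotNumbering {
public:
  explicit ExampleSlotNumbering(std::size_t maxSlots) : maxSlots_(maxSlots) {}

  // Returns the id for `frameIndex`, or std::nullopt if the slot is new
  // and the limit has already been reached.
  std::optional<unsigned> getOrTrack(int frameIndex) {
    auto it = ids_.find(frameIndex);
    if (it != ids_.end())
      return it->second;
    if (ids_.size() >= maxSlots_)
      return std::nullopt;
    unsigned id = static_cast<unsigned>(ids_.size());
    ids_.emplace(frameIndex, id);
    return id;
  }

private:
  std::size_t maxSlots_;
  std::unordered_map<int, unsigned> ids_;
};

// A caller that declines to track a spill into an untracked slot.
bool exampleTrackSpill(ExampleSlotNumbering &numbering, int frameIndex) {
  std::optional<unsigned> id = numbering.getOrTrack(frameIndex);
  if (!id)
    return false; // over the limit: drop this location
  return true;    // a real implementation would record the spill under *id
}
```

With a limit of zero every spill is rejected, which is the situation the
added test exercises.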

Differential Revision: https://reviews.llvm.org/D118601

2 years ago[HWASan] Properly handle musttail calls.
Matt Morehouse [Tue, 1 Feb 2022 19:23:36 +0000 (11:23 -0800)]
[HWASan] Properly handle musttail calls.

Fixes a compile error when the `clang::musttail` attribute is used.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118712

2 years ago[NFC] These tests require a default target
Chris Bieneman [Mon, 31 Jan 2022 22:15:47 +0000 (16:15 -0600)]
[NFC] These tests require a default target

These test cases all rely on a default target being specified. Adding
the requirement gets the tests properly skipped when
LLVM_DEFAULT_TARGET_TRIPLE is unset.

2 years agoChange namespace llvm::swift to namespace llvm::binaryformat because of clashes with...
Shubham Sandeep Rastogi [Tue, 1 Feb 2022 18:30:28 +0000 (10:30 -0800)]
Change namespace llvm::swift to namespace llvm::binaryformat because of clashes with the apple/llvm-project repository

The namespace llvm::swift is causing errors to pop up in the apple/llvm-project build when cherry-picking 4ce1f3d47c33 into apple/llvm-project

Differential Review: https://reviews.llvm.org/D118716

2 years ago[NFC] Use llvm-as instead of llc
Chris Bieneman [Mon, 31 Jan 2022 22:12:41 +0000 (16:12 -0600)]
[NFC] Use llvm-as instead of llc

llvm-as does everything this test requires, but doesn't depend on a
target being registered. This gets the test passing when
LLVM_DEFAULT_TARGET_TRIPLE is unset.

2 years ago[InstCombine] Remove weaker fence adjacent to a stronger fence
Anna Thomas [Fri, 28 Jan 2022 21:45:04 +0000 (13:45 -0800)]
[InstCombine] Remove weaker fence adjacent to a stronger fence

We have an InstCombine rule to remove identical consecutive fences.
We can extend this to remove a weaker fence when it is adjacent to a
stronger fence.

As stated in the LangRef, a fence with a stronger ordering also implies
ordering weaker than itself: "A fence which has seq_cst ordering, in addition to
having both acquire and release semantics specified above, participates in the
global program order of other seq_cst operations and/or fences."
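
A simplified model of the ordering comparison, assuming the two fences are
adjacent and otherwise identical (for example, same synchronization scope).
The enum and function names are hypothetical and this is not the code in
the patch:

```cpp
enum class ExampleOrdering { Acquire, Release, AcqRel, SeqCst };

// True if a fence with ordering `a` gives every guarantee of one with `b`.
// Acquire and Release are not comparable with each other.
bool exampleAtLeastAsStrong(ExampleOrdering a, ExampleOrdering b) {
  if (a == b)
    return true;
  if (a == ExampleOrdering::SeqCst)
    return true;
  if (a == ExampleOrdering::AcqRel)
    return b == ExampleOrdering::Acquire || b == ExampleOrdering::Release;
  return false;
}

// For two adjacent fences, says which one is redundant:
// 0 = the first, 1 = the second, -1 = neither.
int exampleRedundantFence(ExampleOrdering first, ExampleOrdering second) {
  if (exampleAtLeastAsStrong(first, second))
    return 1;
  if (exampleAtLeastAsStrong(second, first))
    return 0;
  return -1;
}
```

Identical fences fall out as the special case where the second one is
redundant.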

Reviewed-By: reames
Differential Revision: https://reviews.llvm.org/D118607

2 years ago[DebugInfo][InstrRef][NFC] Don't build a map of un-needed values
Jeremy Morse [Tue, 1 Feb 2022 18:55:08 +0000 (18:55 +0000)]
[DebugInfo][InstrRef][NFC] Don't build a map of un-needed values

When finding locations for variable values at the start of a block, we
build a large map of every value to every location, and then pick out the
locations for values that are desired. This takes up quite a lot of time,
because, unsurprisingly, there are usually more values in registers and
stack slots than there are variables.

This patch instead creates a map of desired values to their locations,
which are initially illegal locations. Then, as we examine every available
value, we can select locations for values we care about, and ignore those
that we don't. This substantially reduces the amount of work done (i.e.,
building a map up of values to locations that nothing wants or needs).
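
In outline, and using plain integers for values and locations purely for
illustration (the names and the "first location wins" policy are
assumptions, not what the patch does):

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

constexpr int kExampleNoLocation = -1; // stands in for an illegal location

// `available` holds (location, value) pairs for everything currently live.
// Only values listed in `desired` ever get a map entry.
std::map<int, int>
exampleFindLocations(const std::vector<int> &desired,
                     const std::vector<std::pair<int, int>> &available) {
  std::map<int, int> result;
  for (int value : desired)
    result.emplace(value, kExampleNoLocation);

  for (const std::pair<int, int> &entry : available) {
    auto it = result.find(entry.second);
    if (it == result.end())
      continue; // nobody wants this value, so do no further work
    if (it->second == kExampleNoLocation)
      it->second = entry.first; // keep the first location seen
  }
  return result;
}
```

The map size is bounded by the number of desired values rather than by the
number of live values.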

Geomean performance improvement of 1% on CTMark, woo.

Differential Revision: https://reviews.llvm.org/D118597

2 years ago[OpenMP] Add kernel string attribute to kernel function
Joseph Huber [Tue, 1 Feb 2022 16:46:20 +0000 (11:46 -0500)]
[OpenMP] Add kernel string attribute to kernel function

This patch adds a function attribute to the kernel function generated in
OpenMP offloading. We already create a `nvvm.annotations` metadata node
indicating the kernels present in the program. However, this created
some indirection when trying to identify if a specific function was an
entry. We add a single function attribute for each function now to
simplify this.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D118708

2 years ago[lld-macho][nfc] Comments and style fixes
Jez Ng [Tue, 1 Feb 2022 18:45:38 +0000 (13:45 -0500)]
[lld-macho][nfc] Comments and style fixes

Added some comments (particularly around finalize() and
finalizeContents()) as well as doing some rephrasing / grammar fixes for
existing comments.

Also did some minor style fixups, such as by putting methods together in
a class definition and having fields of similar types next to each
other.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D118714

2 years agoUpdate status of move.
Tanya Lattner [Tue, 1 Feb 2022 18:45:40 +0000 (10:45 -0800)]
Update status of move.

2 years ago[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Fangrui Song [Tue, 1 Feb 2022 18:41:16 +0000 (10:41 -0800)]
[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible

Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the above issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```

While here, refine the condition for the alias as well.

For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have weak linkage or are preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107249

2 years agoAvoid doing tile + fuse if tile sizes are zero.
Mahesh Ravishankar [Tue, 1 Feb 2022 16:54:05 +0000 (16:54 +0000)]
Avoid doing tile + fuse if tile sizes are zero.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D118576

2 years ago[NFC] Add CFGuard to opt build
Chris Bieneman [Mon, 31 Jan 2022 22:11:11 +0000 (16:11 -0600)]
[NFC] Add CFGuard to opt build

If you don't include a target that directly references CFGuard it
doesn't get built into opt or the llvm library build, which causes some
test cases to fail.

Including this in opt explicitly resolves those issues.