platform/upstream/llvm.git
2 years agoOpenMP: Un-xfail tests that pass now
Matt Arsenault [Sat, 4 Dec 2021 16:24:28 +0000 (11:24 -0500)]
OpenMP: Un-xfail tests that pass now

729bf9b26b657df8ddad2e5a63377e6afb349a18 should have fixed these

2 years agoRevert "[DwarfDebug] Support emitting function-local declaration for a lexical block"
Kristina Bessonova [Sat, 4 Dec 2021 16:03:46 +0000 (18:03 +0200)]
Revert "[DwarfDebug] Support emitting function-local declaration for a lexical block"

This reverts commits
ee691970a9a85470948ada623c31f0ab8773617c (D113741),
79d3132998b2828be8f7d2ec411f91fb11b3e01f (D114705)

due to lldb and dexter test failures.

2 years agoAMDGPU: Enable fixed function ABI by default
Matt Arsenault [Sat, 14 Aug 2021 23:10:46 +0000 (19:10 -0400)]
AMDGPU: Enable fixed function ABI by default

Code using indirect calls is broken without this, and there isn't
really much value in supporting the old attempt to vary the argument
placement based on uses. This resulted in more argument shuffling code
anyway.

Also have the option stop implying all inputs need to be passed. This
will now rely on the amdgpu-no-* attributes to avoid passing
unnecessary values.

2 years ago[BasicAA] Add atomic mem intrinsic tests.
Florian Hahn [Sat, 4 Dec 2021 15:20:03 +0000 (15:20 +0000)]
[BasicAA] Add atomic mem intrinsic tests.

2 years agoAMDGPU: Assume all amdhsa kernarg passed implicit arguments by default
Matt Arsenault [Tue, 26 Oct 2021 01:30:42 +0000 (21:30 -0400)]
AMDGPU: Assume all amdhsa kernarg passed implicit arguments by default

Previously we would require adding an attribute to kernels to enable
the inputs passed in the kernarg segment, accessed by
llvm.amdgcn.implicitarg.ptr. This violates the principle of being
correct by default. Some OpenMP testcases were broken recently since
it wasn't correctly setting this attribute, and no known frontends are
setting this to anything other than the maximum.

Most of the test changes are from load widening of argument loads
since there are now more implied dereferenceable bytes.

2 years agoAMDGPU: Optimize out implicit kernarg argument allocation if unused
Matt Arsenault [Mon, 25 Oct 2021 19:30:55 +0000 (15:30 -0400)]
AMDGPU: Optimize out implicit kernarg argument allocation if unused

We already annotate whether llvm.amdgcn.implicitarg.ptr is known to be
unused. Start using it to avoid allocating the implicit arguments if
unneeded.

2 years ago[DwarfDebug] Support emitting function-local declaration for a lexical block
Kristina Bessonova [Sat, 4 Dec 2021 15:12:47 +0000 (17:12 +0200)]
[DwarfDebug] Support emitting function-local declaration for a lexical block

This is another attempt to make function-local declarations
(like static variables, structs/classes and others) be correctly
emitted within a lexical (bracketed) block.

Fixes https://bugs.llvm.org/show_bug.cgi?id=19238.

Differential Revision: https://reviews.llvm.org/D113741

2 years agoApply the permutation map on each affine nest
Hugo Pompougnac [Sat, 4 Dec 2021 01:42:23 +0000 (07:12 +0530)]
Apply the permutation map on each affine nest

When using -test-loop-permutation="permutation-map=...", applies the
permutation map on each affine nest in the function (and not only the
first one). If the size of the permutation map and the size of a nest
are not consistent, do nothing on this particular nest (instead of
making MLIR crash).

Differential Revision: https://reviews.llvm.org/D112947
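
The size check described in the commit above can be sketched in plain C++ (an illustration only, not MLIR's API). `permuteNest` and `permuteAllNests` are hypothetical helpers, and a nest is modeled as a list of loop names: the map is applied to every nest, and a nest whose depth does not match the map size is left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: permMap[i] is the new position of loop i in the nest.
// Assumes permMap is a valid permutation (no repeated entries).
// Returns false and leaves `nest` unchanged when the sizes are inconsistent.
bool permuteNest(std::vector<std::string> &nest,
                 const std::vector<unsigned> &permMap) {
  if (permMap.size() != nest.size())
    return false; // skip this nest instead of crashing
  std::vector<std::string> permuted(nest.size());
  for (std::size_t i = 0; i < nest.size(); ++i) {
    assert(permMap[i] < nest.size() && "permutation entry out of range");
    permuted[permMap[i]] = nest[i];
  }
  nest = std::move(permuted);
  return true;
}

// Applies the map to every nest and returns how many nests were permuted.
std::size_t permuteAllNests(std::vector<std::vector<std::string>> &nests,
                            const std::vector<unsigned> &permMap) {
  std::size_t count = 0;
  for (auto &nest : nests)
    if (permuteNest(nest, permMap))
      ++count;
  return count;
}
```

With a three-entry map, a three-deep nest is reordered while a two-deep nest in the same function is skipped.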

2 years ago[DwarfDebug] Move emission of global vars, types and imports to endModule()
Kristina Bessonova [Sat, 4 Dec 2021 12:08:10 +0000 (14:08 +0200)]
[DwarfDebug] Move emission of global vars, types and imports to endModule()

This patch proposes to move emission of global variables, types,
imported entities, etc from DwarfDebug::beginModule() to DwarfDebug::endModule().
Effectively, this changes nothing but the order of debug entities which
will be as follows:
* subprograms (including related context, local variables/labels,
  local imported entities; related types can be created as a part of
  the emission of local entities of an abstract subprogram);
* global variables (including related context and types);
* retained types and enums;
* non-local-scoped imported entities;
* basic types;
* other types left (as a part of local variables attributes emission).

Note that the order of emitted compile units may also be changed as now we emit
units that contain subprograms first and then all other non-empty units.

The motivation behind this change is the following:
(1) DwarfDebug::beginModule() is run at the very beginning of the backend's pipeline;
    after that point the IR can be significantly changed by target-specific passes.
    If it happens for debug metadata of global entities, those changes will not
    be reflected in the emitted DWARF.
(2) imported subprogram names should refer to an abstract subprogram if it exists,
    but it isn't known in DwarfDebug::beginModule() (it's possible to make some
    guesses based on location info, but it's not quite reliable);
(3) the aforementioned entities, if they are scoped within a bracketed block
    (subject of D113741), couldn't be emitted in DwarfDebug::beginModule()
    (they need their parent emitted first). Another problem is that if we try to
    gather some information about local entities and defer their emission
    (till the subprogram's processing or DwarfDebug::endModule()), all the gathered
    details might be irrelevant / invalid by the time the entities are being
    emitted (because of (1)).

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114705

2 years agotsan: disable dlopen_static_tls.cpp test on aarch64
Dmitry Vyukov [Sat, 4 Dec 2021 11:23:37 +0000 (12:23 +0100)]
tsan: disable dlopen_static_tls.cpp test on aarch64

Fails on bots: https://lab.llvm.org/buildbot#builders/184/builds/1580

Differential Revision: https://reviews.llvm.org/D115095

2 years ago[Passes] Move AggressiveInstCombine after InstCombine
Anton Afanasyev [Mon, 1 Nov 2021 13:48:52 +0000 (16:48 +0300)]
[Passes] Move AggressiveInstCombine after InstCombine

Swap the neighbouring AIC and IC passes in the pipeline. This looks more natural and
has almost no effect for now (three slightly touched tests of test-suite). Also,
this could be the first step towards merging AIC (or part of it) into the -O2 pipeline.

After several changes in AIC (like D108091, D108201, D107766, D109515, D109236)
several regressions were observed (like PR52078, PR52253, PR52289)
that were fixed in different passes (see D111330, D112721) by extending their
functionality, but these regressions were exposed because the changed AIC prevents IC
from making some early optimizations.

This is a common problem, and it should be fixed by simply moving AIC after IC,
which also looks more logical by itself: perform aggressive instruction combining
only after the ordinary one has failed.

Fixes PR52289

Reviewed By: spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D113179

2 years ago[AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args
Jay Foad [Thu, 2 Dec 2021 12:26:59 +0000 (12:26 +0000)]
[AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

The ray_origin, ray_dir and ray_inv_dir arguments should all be vec3 to
match how the hardware instruction works.

Don't change the API of the corresponding OpenCL builtins.

Differential Revision: https://reviews.llvm.org/D115032

2 years ago[IR,TableGen] Add support for vec3 intrinsic arguments
Jay Foad [Thu, 2 Dec 2021 12:25:00 +0000 (12:25 +0000)]
[IR,TableGen] Add support for vec3 intrinsic arguments

Add generic support for vec3 types, and in particular define
llvm_v3f32_ty which will be used by AMDGPU's
llvm.amdgcn.image.bvh.intersect.ray intrinsic.

Differential Revision: https://reviews.llvm.org/D114956

2 years ago[AMDGPU] Generate checks for llvm.amdgcn.image.bvh.intersect.ray
Jay Foad [Thu, 2 Dec 2021 13:17:14 +0000 (13:17 +0000)]
[AMDGPU] Generate checks for llvm.amdgcn.image.bvh.intersect.ray

Differential Revision: https://reviews.llvm.org/D114955

2 years ago[PhaseOrdering] Add test for incorrect merge function scheduling
Nikita Popov [Fri, 3 Dec 2021 22:26:54 +0000 (23:26 +0100)]
[PhaseOrdering] Add test for incorrect merge function scheduling

Add an -enable-merge-functions option to allow testing of function
merging as it will actually happen in the optimization pipeline.
Based on that add a test where we currently produce two identical
functions without merging them due to incorrect pass scheduling
under the new pass manager.

2 years ago[clang-tidy][NFC] Move CachedGlobList to GlobList.h
Carlos Galvez [Sat, 4 Dec 2021 08:36:50 +0000 (08:36 +0000)]
[clang-tidy][NFC] Move CachedGlobList to GlobList.h

Currently it's hidden inside ClangTidyDiagnosticConsumer,
so it's hard to know it exists.

Given that there are multiple uses of globs in clang-tidy,
it makes sense to have these classes publicly available
for other use cases that might benefit from it.

Also, add unit test by converting the existing tests
for GlobList into typed tests.

Reviewed By: salman-javed-nz

Differential Revision: https://reviews.llvm.org/D113422

2 years ago[Test][PhaseOrdering] Precommit test for PR52289
Anton Afanasyev [Sat, 4 Dec 2021 07:58:16 +0000 (10:58 +0300)]
[Test][PhaseOrdering] Precommit test for PR52289

2 years ago[sanitizer] Hook up LZW into stack store
Vitaly Buka [Tue, 23 Nov 2021 05:23:46 +0000 (21:23 -0800)]
[sanitizer] Hook up LZW into stack store

Depends on D114503.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D114924

2 years ago[CodeGen] Use range-based for loops (NFC)
Kazu Hirata [Sat, 4 Dec 2021 04:45:59 +0000 (20:45 -0800)]
[CodeGen] Use range-based for loops (NFC)

2 years ago[Sparc] Create an error when `__builtin_longjmp` is used
Tee KOBAYASHI [Sat, 4 Dec 2021 04:23:09 +0000 (23:23 -0500)]
[Sparc] Create an error when `__builtin_longjmp` is used

Support for builtin setjmp/longjmp was removed by https://reviews.llvm.org/D51487. An
error should be created when compiling C code using __builtin_setjmp or __builtin_longjmp.

Reviewed By: dcederman

Differential Revision: https://reviews.llvm.org/D108901

2 years ago[mlir] Support collecting logs from notifyMatchFailure().
Chia-hung Duan [Sat, 4 Dec 2021 04:35:24 +0000 (04:35 +0000)]
[mlir] Support collecting logs from notifyMatchFailure().

Let users register their own handler to process the match failure
information.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D110896

2 years agoUse LLVM_ATTRIBUTE_UNUSED to silent warning for static function used in assert only...
Mehdi Amini [Sat, 4 Dec 2021 04:23:21 +0000 (04:23 +0000)]
Use LLVM_ATTRIBUTE_UNUSED to silent warning for static function used in assert only (NFC)
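
For context, a static helper that is referenced only inside `assert` has no remaining callers once `NDEBUG` is defined, which is what triggers the unused-function warning. A minimal sketch of the pattern, using the standard C++17 `[[maybe_unused]]` attribute as a stand-in for `LLVM_ATTRIBUTE_UNUSED`; the function names are hypothetical and not taken from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Referenced only from the assert below, so release (NDEBUG) builds never
// call it. The attribute suppresses the resulting unused-function warning.
[[maybe_unused]] static bool isSorted(const std::vector<int> &values) {
  for (std::size_t i = 1; i < values.size(); ++i)
    if (values[i - 1] > values[i])
      return false;
  return true;
}

// Returns the smallest element of a vector that the caller promises is sorted.
int smallest(const std::vector<int> &values) {
  assert(!values.empty() && isSorted(values) && "expected sorted input");
  return values.front();
}
```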

2 years agoSplit the locking of the queue and the threads vector in the ThreadPool implementation
Mehdi Amini [Fri, 3 Dec 2021 21:51:55 +0000 (21:51 +0000)]
Split the locking of the queue and the threads vector in the ThreadPool implementation

This allows releasing the QueueLock early and creating threads
independently of the queue processing.

Differential Revision: https://reviews.llvm.org/D115078

2 years ago[mlir][linalg][bufferize] Implement equivalence analysis
Matthias Springer [Sat, 4 Dec 2021 02:49:07 +0000 (11:49 +0900)]
[mlir][linalg][bufferize] Implement equivalence analysis

Instead of checking buffer equivalence during bufferization, gather buffer equivalence information right after the analysis. This is in preparation of decoupling bufferization from BufferizationAliasInfo.

This change also fixes equivalence analysis for scf.if op results, which was not fully implemented. scf.if op results are equivalent to their corresponding yield values if both yield values are equivalent.

Differential Revision: https://reviews.llvm.org/D114774

2 years agoFix build for ThreadPool when using -DLLVM_ENABLE_THREADS=OFF
Mehdi Amini [Sat, 4 Dec 2021 02:19:53 +0000 (02:19 +0000)]
Fix build for ThreadPool when using -DLLVM_ENABLE_THREADS=OFF

Differential Revision: https://reviews.llvm.org/D115019

2 years ago[MLIR] Fix affine.for unroll for multi-result upper bound maps
Uday Bondhugula [Sat, 4 Dec 2021 01:49:16 +0000 (07:19 +0530)]
[MLIR] Fix affine.for unroll for multi-result upper bound maps

Fix affine.for unroll for multi-result upper bound maps: these can't be
unrolled/unroll-and-jammed in cases where the trip count isn't known to
be a multiple of the unroll factor.

Fix and clean up repeated/unnecessary checks/comments at helper callees.

Also, fix clang-tidy variable naming warnings and redundant includes.

Differential Revision: https://reviews.llvm.org/D114662

2 years ago[mlir][linalg][bufferize][NFC] Add inPlaceAnalysis overload
Matthias Springer [Sat, 4 Dec 2021 01:27:23 +0000 (10:27 +0900)]
[mlir][linalg][bufferize][NFC] Add inPlaceAnalysis overload

Differential Revision: https://reviews.llvm.org/D114773

2 years ago[mlir] Allow shape dimensions larger than 2^32
River Riddle [Sat, 4 Dec 2021 01:09:30 +0000 (01:09 +0000)]
[mlir] Allow shape dimensions larger than 2^32

Internally we use int64_t to hold shapes, but for some
reason the parser was limiting shapes to unsigned. This
change updates the parser to properly handle int64_t shape
dimensions.

Differential Revision: https://reviews.llvm.org/D115086
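
To illustrate why the integer type matters (a sketch only, not the MLIR parser's code): a dimension such as 2^32 does not fit in a 32-bit unsigned value, so it has to be accumulated in 64 bits. `parseDim` is a hypothetical helper that parses a decimal dimension into `int64_t` and rejects overflow.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: parses a non-negative decimal dimension.
// Returns std::nullopt on empty or malformed input, or on int64_t overflow.
std::optional<int64_t> parseDim(const std::string &text) {
  if (text.empty())
    return std::nullopt;
  int64_t value = 0;
  for (char c : text) {
    if (c < '0' || c > '9')
      return std::nullopt;
    int digit = c - '0';
    // value * 10 + digit must not exceed INT64_MAX.
    if (value > (INT64_MAX - digit) / 10)
      return std::nullopt;
    value = value * 10 + digit;
  }
  return value;
}
```

Narrowing the parsed value of "4294967296" to 32 bits would yield 0, which is the kind of silent truncation a 64-bit parse avoids.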

2 years ago[MLIR] Improve error message on missing getArgument() override on pass
Uday Bondhugula [Mon, 29 Nov 2021 22:59:27 +0000 (04:29 +0530)]
[MLIR] Improve error message on missing getArgument() override on pass

Improve error message while registering a pass with a missing getArgument() override.

Differential Revision: https://reviews.llvm.org/D114744

2 years ago[MLIR] NFC. Rename test cases in test/mlir-cpu-runner per convention
Uday Bondhugula [Sun, 28 Nov 2021 08:47:10 +0000 (14:17 +0530)]
[MLIR] NFC. Rename test cases in test/mlir-cpu-runner per convention

Test case files in most places in MLIR use hyphens and not underscores.
A counter-pattern of using underscores was somehow started in some places.
Rename test cases in test/mlir-cpu-runner to use hyphens so that it's
consistent at least within its directory.

Differential Revision: https://reviews.llvm.org/D114672

2 years ago[LICM] Remove profile driven restriction on hoisting
Philip Reames [Sat, 4 Dec 2021 01:11:58 +0000 (17:11 -0800)]
[LICM] Remove profile driven restriction on hoisting

This reverts change 2c391a5/D87551.  As noted in the llvm-dev thread "LICM as canonical form" sent earlier today, introducing this was a major design change made without sufficient cause.

A profile driven LICM is not an unreasonable design, it simply is not what we have.  Switching to such a model requires a lot more work than just this patch, and broad agreement that it is the right direction for the optimizer as a whole.

Worth noting is that all the tests included in the reverted change are probably handled if we allow running unconstrained LICM, and later run LoopSink.  As such, we have no public examples which motivate a profit based hoisting approach.

2 years ago[mlir][linalg][bufferize][NFC] Use same OpBuilder throughout bufferization
Matthias Springer [Sat, 4 Dec 2021 00:55:31 +0000 (09:55 +0900)]
[mlir][linalg][bufferize][NFC] Use same OpBuilder throughout bufferization

Also set insertion point right before calling `bufferize`. No need to put an InsertionGuard anymore.

Differential Revision: https://reviews.llvm.org/D114928

2 years agoImprove error message when declarativeAssembly contains invalid literals
Mehdi Amini [Sat, 4 Dec 2021 00:26:49 +0000 (00:26 +0000)]
Improve error message when declarativeAssembly contains invalid literals

Differential Revision: https://reviews.llvm.org/D115085

2 years ago[NFC][sanitizer] Add test for command line flag for enable-noundef-analysis.
Kevin Athey [Wed, 24 Nov 2021 00:26:43 +0000 (16:26 -0800)]
[NFC][sanitizer] Add test for command line flag for enable-noundef-analysis.

A simple unit test to demonstrate the flags working correctly.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D114485

2 years ago[mlir][sparse] Adding a stress test
wren romano [Thu, 18 Nov 2021 01:40:06 +0000 (17:40 -0800)]
[mlir][sparse] Adding a stress test

Addresses https://bugs.llvm.org/show_bug.cgi?id=52410
Depends on D114192

Reviewed By: aartbik, mehdi_amini

Differential Revision: https://reviews.llvm.org/D114118

2 years ago[NFC] const-ify some methods on CommandReturnObject
Jordan Rupprecht [Fri, 3 Dec 2021 22:49:42 +0000 (14:49 -0800)]
[NFC] const-ify some methods on CommandReturnObject

2 years ago[gn build] (semiautomatically) port 98bb198693ca
Nico Weber [Fri, 3 Dec 2021 22:48:27 +0000 (17:48 -0500)]
[gn build] (semiautomatically) port 98bb198693ca

2 years ago[ELF][test] Fix typo in aarch64-cortex-a53-843419-recognize.s
Fangrui Song [Fri, 3 Dec 2021 22:38:56 +0000 (14:38 -0800)]
[ELF][test] Fix typo in aarch64-cortex-a53-843419-recognize.s

2 years agoRevert "[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization"
natashaknk [Fri, 3 Dec 2021 22:34:59 +0000 (14:34 -0800)]
Revert "[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization"

This reverts commit 13bdb7ab4a7acaea7144a042fe583d45fbb9b5c4. The commit introduced/uncovered an unintended bug in models containing Conv2D.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D115079

2 years ago[clang][ARM] relax -mtp=cp15 for non-thumb cases
Nick Desaulniers [Fri, 3 Dec 2021 21:59:46 +0000 (13:59 -0800)]
[clang][ARM] relax -mtp=cp15 for non-thumb cases

Building -march=armv6k Linux kernels with -mtp=cp15 fails to
compile:

error: hardware TLS register is not supported for the arm
sub-architecture

@ardb found docs for ARM1176JZF-S (ARMv6K) that reference hard thread
pointer.

Relax our ARMv6 check for cases where we're targeting ARM via -marm (vs
Thumb1 via -mthumb).  This more closely matches the KConfig requirements
for where we plan to use these (ie. ARMv6K, ARMv7 (arm or thumb2)).

As @peter.smith mentions:
  on armv5 we can write the instruction to read/write to CP15 C13 with
  the ThreadID opcode. However on no armv5 implementation will the CP15
  C13 have a Thread ID register. The GCC intent seems to be whether the
  instruction is encodable rather than check what the CPU supports.

Link: https://github.com/ClangBuiltLinux/linux/issues/1502
Link: https://developer.arm.com/documentation/ddi0301/h/system-control-coprocessor/system-control-processor-registers/c13--thread-and-process-id-registers
Reviewed By: ardb, peter.smith

Differential Revision: https://reviews.llvm.org/D114116

2 years agoThreadPool: grow the pool only as needed
Benoit Jacob [Fri, 3 Dec 2021 21:40:28 +0000 (21:40 +0000)]
ThreadPool: grow the pool only as needed

On my 96-core cloudtop 'machine', it seems unnecessary to always start
96 threads upfront... particularly as the ThreadPool is created even
with -mlir-disable-threading. Things like the resulting spew in GDB and
the obfuscated output of `(gdb) info threads` are my motivation here,
but it probably also doesn't hurt for at least some efficiency metrics to
avoid creating many threads upfront.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D115019

2 years ago[DebugInfo] Check DIEnumerator bit width when comparing for equality
Arthur Eubanks [Fri, 3 Dec 2021 19:01:25 +0000 (11:01 -0800)]
[DebugInfo] Check DIEnumerator bit width when comparing for equality

As mentioned in D106585, this causes non-determinism, which can also be
shown by this test case being flaky without this patch.

We were using the APSInt's bit width for hashing, but not for checking
for equality. APInt::isSameValue() does not check bit width.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D115054
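
The invariant behind this fix is that equality has to distinguish everything the hash distinguishes; otherwise whether two keys are treated as the same entry can depend on where they land in the table. A simplified sketch with a hypothetical `EnumKey` type standing in for the real uniquing key (this is not LLVM's implementation):

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an enumerator uniquing key.
struct EnumKey {
  int64_t value;
  unsigned bitWidth;
  std::string name;

  // Equality compares every field the hash uses, including bitWidth.
  bool operator==(const EnumKey &other) const {
    return value == other.value && bitWidth == other.bitWidth &&
           name == other.name;
  }
};

struct EnumKeyHash {
  std::size_t operator()(const EnumKey &key) const {
    std::size_t h = std::hash<int64_t>{}(key.value);
    h = h * 31 + std::hash<unsigned>{}(key.bitWidth);
    h = h * 31 + std::hash<std::string>{}(key.name);
    return h;
  }
};

// Counts how many distinct keys survive uniquing.
std::size_t countUnique(const std::vector<EnumKey> &keys) {
  std::unordered_set<EnumKey, EnumKeyHash> unique(keys.begin(), keys.end());
  return unique.size();
}
```

Two keys with the same value and name but different bit widths stay distinct, and the result no longer depends on hash bucket placement.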

2 years ago[test-release.sh] Do not run chrpath on AIX.
Amy Kwan [Fri, 3 Dec 2021 21:35:57 +0000 (15:35 -0600)]
[test-release.sh] Do not run chrpath on AIX.

Upon testing the use of test-release.sh on AIX, the script initially fails
because chrpath is not present on AIX. This patch adds checks for AIX and allows
the script to continue running to completion.

Differential Revision: https://reviews.llvm.org/D115046

2 years ago[sanitizer] Add Lempel–Ziv–Welch encoder/decoder
Vitaly Buka [Wed, 1 Dec 2021 00:33:06 +0000 (16:33 -0800)]
[sanitizer] Add Lempel–Ziv–Welch encoder/decoder

It's very simple, fast and efficient for the stack depot compression if used on entire pointers.

Reviewed By: morehouse, kstoimenov

Differential Revision: https://reviews.llvm.org/D114918

2 years ago[NFC][sanitizer] Iterator adaptors for Leb128 encoding
Vitaly Buka [Tue, 30 Nov 2021 23:29:03 +0000 (15:29 -0800)]
[NFC][sanitizer] Iterator adaptors for Leb128 encoding

It's similar to back_insert_iterator

Needed for D114924

Reviewed By: morehouse, kstoimenov

Differential Revision: https://reviews.llvm.org/D114980
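
A rough sketch of the idea, assuming unsigned LEB128 and a `std::vector<uint8_t>` sink. `Leb128Appender` and `encodeAll` are hypothetical names and not the sanitizer's actual classes; like `back_insert_iterator`, assigning through the iterator appends to the underlying container.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical output-iterator adaptor: every assigned value is appended to
// the byte vector as unsigned LEB128 (7 payload bits per byte, high bit set
// on all bytes except the last).
class Leb128Appender {
public:
  using iterator_category = std::output_iterator_tag;
  using value_type = void;
  using difference_type = void;
  using pointer = void;
  using reference = void;

  explicit Leb128Appender(std::vector<uint8_t> &out) : out_(&out) {}

  Leb128Appender &operator=(uint64_t value) {
    do {
      uint8_t byte = static_cast<uint8_t>(value & 0x7f);
      value >>= 7;
      if (value != 0)
        byte |= 0x80; // more bytes follow
      out_->push_back(byte);
    } while (value != 0);
    return *this;
  }

  // Dereference and increment are no-ops, as with back_insert_iterator.
  Leb128Appender &operator*() { return *this; }
  Leb128Appender &operator++() { return *this; }
  Leb128Appender operator++(int) { return *this; }

private:
  std::vector<uint8_t> *out_;
};

// Encodes a sequence of values with a standard algorithm.
std::vector<uint8_t> encodeAll(const std::vector<uint64_t> &values) {
  std::vector<uint8_t> bytes;
  std::copy(values.begin(), values.end(), Leb128Appender(bytes));
  return bytes;
}
```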

2 years ago[sanitizer] Support IsRssLimitExceeded in all sanitizers
Vitaly Buka [Thu, 2 Dec 2021 22:25:30 +0000 (14:25 -0800)]
[sanitizer] Support IsRssLimitExceeded in all sanitizers

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115000

2 years ago[Sanitizer] Use CreateDirectoryA for report dirs
Choongwoo Han [Fri, 3 Dec 2021 19:36:11 +0000 (11:36 -0800)]
[Sanitizer] Use CreateDirectoryA for report dirs

Using `_mkdir` of CRT in Asan Init leads to launch failure and hanging on Windows.

You can trigger it by calling:
> set ASAN_OPTIONS=log_path=a/a/a
> .\asan_program.exe

And their crash dump shows the following stack trace:
```
_guard_dispatch_icall_nop()
__acrt_get_utf8_acp_compatibility_codepage()
_mkdir(const char * path)
```

I guess there could be a cfg guard in CRT, which may lead to calling an uninitialized cfg guard function address. Also, `_mkdir` supports UTF-8 encoding of the path and calls _wmkdir, but that's not necessary for this case since other file APIs in sanitizer_win.cpp assume only the ANSI case, so it makes sense to use CreateDirectoryA, matching other file API calls in the same file.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D114760

2 years ago[Passes] Adjust SLPVectorizer placement in test.
Florian Hahn [Fri, 3 Dec 2021 20:26:49 +0000 (20:26 +0000)]
[Passes] Adjust SLPVectorizer placement in test.

SLPVectorizer runs *after* the extra vector passes.

2 years ago[Passes] Improve opt-pipeline-vector-passes.ll test.
Florian Hahn [Fri, 3 Dec 2021 20:15:59 +0000 (20:15 +0000)]
[Passes] Improve opt-pipeline-vector-passes.ll test.

Add -NOT lines to ensure that no extra passes are run if
-extra-vectorizer-passes is not specified.

Also add a loop that actually gets vectorized in preparation for
D115052.

2 years agoCodeGen: Strip exception specifications from function types in CFI type names.
Peter Collingbourne [Fri, 3 Dec 2021 19:48:57 +0000 (14:48 -0500)]
CodeGen: Strip exception specifications from function types in CFI type names.

With C++17 the exception specification has been made part of the
function type, and therefore part of mangled type names.

However, it's valid to convert function pointers with an exception
specification to function pointers with the same argument and return
types but without an exception specification, which means that e.g. a
function of type "void () noexcept" can be called through a pointer
of type "void ()". We must therefore consider the two types to be
compatible for CFI purposes.

We can do this by stripping the exception specification before mangling
the type name, which is what this patch does.

Differential Revision: https://reviews.llvm.org/D115015
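
The conversion that the commit message relies on can be checked directly in standard C++17. The function names below are illustrative only and do not come from the patch:

```cpp
#include <type_traits>

int answer() noexcept { return 42; }

// Since C++17 the exception specification is part of the function type,
// so these two pointer types are distinct...
static_assert(!std::is_same_v<int (*)() noexcept, int (*)()>);
// ...but a noexcept function pointer converts implicitly to the plain one.
static_assert(std::is_convertible_v<int (*)() noexcept, int (*)()>);

// Calls a noexcept function through a pointer type without noexcept. This is
// the kind of indirect call a CFI type check has to accept.
int callThroughPlainPointer() {
  int (*fp)() = answer;
  return fp();
}
```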

2 years ago[msan] Don't block SIGSYS in ScopedBlockSignals
Hans Wennborg [Fri, 3 Dec 2021 19:22:47 +0000 (20:22 +0100)]
[msan] Don't block SIGSYS in ScopedBlockSignals

Seccomp-BPF-sandboxed processes rely on being able to process SIGSYS
signals.

Differential revision: https://reviews.llvm.org/D115057
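
A minimal sketch of the resulting mask, written against the public POSIX API rather than the sanitizer-internal wrappers the runtime uses. `makeBlockAllButSigsys` is a hypothetical name:

```cpp
#include <cassert>
#include <signal.h>

// Builds a signal set containing every signal except SIGSYS, so that a
// seccomp-BPF sandbox can still deliver SIGSYS while the set is blocked.
sigset_t makeBlockAllButSigsys() {
  sigset_t set;
  int rc = sigfillset(&set);
  assert(rc == 0 && "sigfillset failed");
  rc = sigdelset(&set, SIGSYS);
  assert(rc == 0 && "sigdelset failed");
  (void)rc; // silence unused-variable warnings in NDEBUG builds
  return set;
}
```

A scoped guard would then pass this set to `pthread_sigmask(SIG_BLOCK, ...)` and restore the previous mask on destruction.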

2 years ago[libunwind] Fix unwind_leaffunction test
Leonard Chan [Fri, 3 Dec 2021 19:20:06 +0000 (11:20 -0800)]
[libunwind] Fix unwind_leaffunction test

It's possible for this test not to pass if the libc used does not provide
unwind info for raise. We can replace it with __builtin_trap, which can lead
to a SIGTRAP on x86_64 and a SIGILL on aarch64.

Using this alternative, a nop is needed before the __builtin_trap. This is
because libunwind incorrectly decrements pc, which can cause pc to jump into
the previous function and use the incorrect FDE.

Differential Revision: https://reviews.llvm.org/D114818

2 years ago[CFG] Handle calls with funclet bundle
Choongwoo Han [Fri, 3 Dec 2021 18:44:18 +0000 (10:44 -0800)]
[CFG] Handle calls with funclet bundle

When a Control Flow Guard check was inserted, the funclet bundle was not checked. Therefore, code was not generated correctly when a target function had a "funclet" bundle.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D114914

2 years ago[HWASan] Try 'google' prefixed apex directories in symbolizer.
Mitch Phillips [Fri, 3 Dec 2021 18:34:57 +0000 (10:34 -0800)]
[HWASan] Try 'google' prefixed apex directories in symbolizer.

Google-signed apexes appear on Android build servers' symbol files as
being under /apex/com.google.android.<foo>/. In reality, the apexes are
always installed as /apex/com.android.<foo>/ (note the lack of
'google'). In order for local symbolization under hwasan_symbolize to
work correctly, we also try the 'google' directory.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D114919
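
The lookup described above amounts to trying a second candidate path. The sketch below shows that idea in C++ for illustration; it is not the code of the hwasan_symbolize script, and `apexCandidates` is a hypothetical helper:

```cpp
#include <string>
#include <vector>

// Returns the paths to try for a binary, in order: the path as installed on
// the device, then the 'google' prefixed variant used on build servers.
std::vector<std::string> apexCandidates(const std::string &path) {
  std::vector<std::string> candidates{path};
  const std::string prefix = "/apex/com.android.";
  if (path.compare(0, prefix.size(), prefix) == 0)
    candidates.push_back("/apex/com.google.android." +
                         path.substr(prefix.size()));
  return candidates;
}
```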

2 years ago[AMDGPU] Fixed incomplete definitions in twoaddr-fma.mir. NFC.
Stanislav Mekhanoshin [Fri, 3 Dec 2021 18:18:03 +0000 (10:18 -0800)]
[AMDGPU] Fixed incomplete definitions in twoaddr-fma.mir. NFC.

2 years ago[AMDGPU] Kill def when folding immediate in two-addr pass
Stanislav Mekhanoshin [Thu, 2 Dec 2021 22:02:01 +0000 (14:02 -0800)]
[AMDGPU] Kill def when folding immediate in two-addr pass

Two-address pass works right before RA and if an immediate
was folded into an instruction there is nothing to remove
the dead def. We end up with something like:

v_mov_b32_e32 v14, 0xc1700000
v_mov_b32_e32 v14, 0x41200000
v_fmaak_f32 v51, s67, v19, 0xc1700000
v_fmaak_f32 v38, v51, v19, 0x41200000

The patch kills the dead move instruction right in the folding.

Differential Revision: https://reviews.llvm.org/D114999

2 years ago[DAG] PromoteIntRes_FunnelShift - rename shift Amount variable to Amt to prevent...
Simon Pilgrim [Fri, 3 Dec 2021 17:24:32 +0000 (17:24 +0000)]
[DAG] PromoteIntRes_FunnelShift - rename shift Amount variable to Amt to prevent line overflow. NFC.

2 years ago[funcattrs] Fix a bug in recently introduced writeonly argument inference
Philip Reames [Fri, 3 Dec 2021 16:52:40 +0000 (08:52 -0800)]
[funcattrs] Fix a bug in recently introduced writeonly argument inference

This fixes a bug in 740057d.  There are two ways to describe the issue:
* One caller hasn't yet proven nocapture on the argument.  Given that, the inference routine is responsible for bailing out on a potential capture.
* Even if we know the argument is nocapture, the access inference needs to traverse the exact set of users the capture tracking would (or exit conservatively).  Even if capture tracking can prove a store is non-capturing (e.g. to a local alloc which doesn't escape), we still need to track the copy of the pointer to see if it's later reloaded and accessed again.

Note that all the test changes except the newly added ones appear to be false negatives.  That is, cases where we could prove writeonly, but the current code isn't strong enough.  That's why I didn't spot this originally.

2 years ago[IR][AutoUpgrade] Merge x86 mask load intrinsic upgrades. NFC.
Simon Pilgrim [Fri, 3 Dec 2021 16:53:47 +0000 (16:53 +0000)]
[IR][AutoUpgrade] Merge x86 mask load intrinsic upgrades. NFC.

Helps appease MSVC which is complaining about "fatal error C1061: compiler limit: blocks nested too deeply" - we already do the same thing for avx512.mask.store intrinsics.

This is only a stopgap solution until another else-if case needs adding - we really need to refactor this chain of ifs properly.

2 years ago[LLDB] XFAIL on Arm/Linux minidebuginfo-set-and-hit-breakpoint.test
Muhammad Omair Javaid [Fri, 3 Dec 2021 16:46:07 +0000 (21:46 +0500)]
[LLDB] XFAIL on Arm/Linux minidebuginfo-set-and-hit-breakpoint.test

minidebuginfo-set-and-hit-breakpoint.test is failing on Arm/Linux, most
probably due to an ill-formed binary after removal of certain sections
from the executable. I am marking it as XFAIL for further investigation.

2 years ago[ARM] Separate ARM autoupgrade code into a separate function
David Green [Fri, 3 Dec 2021 16:45:26 +0000 (16:45 +0000)]
[ARM] Separate ARM autoupgrade code into a separate function

Try to appease the Microsoft compiler which is apparently running out of
if statements. Separate the new ARM code into a separate function to
keep it simpler.

2 years ago[ARM] Replace if's with a switch, NFC
David Green [Fri, 3 Dec 2021 16:16:30 +0000 (16:16 +0000)]
[ARM] Replace if's with a switch, NFC

I'm not having a lot of luck with the Microsoft compiler recently. Maybe
this will help it with its errors:
llvm\lib\IR\AutoUpgrade.cpp(3726): fatal error C1061: compiler limit: blocks nested too deeply

If not, it's a good code cleanup anyway.

2 years ago[libc] Fix invalid include for SqrtLongDouble.h
Guillaume Chatelet [Fri, 3 Dec 2021 16:13:35 +0000 (16:13 +0000)]
[libc] Fix invalid include for SqrtLongDouble.h

2 years ago[gn build] Build with Fission on non-mac non-win when using lld
Nico Weber [Fri, 3 Dec 2021 14:59:16 +0000 (09:59 -0500)]
[gn build] Build with Fission on non-mac non-win when using lld

In release+sym builds (-O2 -g), reduces time to link `clang`
from 2.3s to 1.3s (-42%).

In debug builds (-g), reduces time to link `clang`
from 5.4s to 4.5s (-17.4%).

See the phab review for full `ministat` numbers.

In the CMake build this is opt-in via LLVM_USE_SPLIT_DWARF.
Since the GN build is targeted at developers, enabling it by default
seems like a better default setting here. (If it turns out to cause
problems, we can add an opt-out.)

Time to load the binary into gdb and to set a breakpoint is unchanged.
Time from `run` to hitting a breakpoint in `main` feels a bit faster
(~4s -> ~2s), but I didn't do a careful statistical analysis for this.

Differential Revision: https://reviews.llvm.org/D115040

2 years ago[MemoryLocation] Move DSE intrinsic handling to MemoryLocation. (NFC)
Florian Hahn [Fri, 3 Dec 2021 16:00:39 +0000 (16:00 +0000)]
[MemoryLocation] Move DSE intrinsic handling to MemoryLocation. (NFC)

Suggested in D114872.

2 years ago[libc] Select FPUtils implementations via code instead of build
Guillaume Chatelet [Fri, 3 Dec 2021 15:48:41 +0000 (15:48 +0000)]
[libc] Select FPUtils implementations via code instead of build

We want to simplify the build system and rely on code to do the implementation selection.
This is in preparation of adding a Bazel configuration (D114712).

Differential Revision: https://reviews.llvm.org/D115034

2 years ago[clang-tidy][docs][NFC] Improve documentation of bugprone-unhandled-exception-at-new
Balázs Kéri [Fri, 3 Dec 2021 15:52:27 +0000 (16:52 +0100)]
[clang-tidy][docs][NFC] Improve documentation of bugprone-unhandled-exception-at-new

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D114602

2 years ago[DebugInfo] Attempt to preserve more information during tail duplication
Stephen Tozer [Fri, 3 Dec 2021 13:35:25 +0000 (13:35 +0000)]
[DebugInfo] Attempt to preserve more information during tail duplication

Prior to this patch, tail duplication handled debug info poorly -
specifically, debug instructions would be dropped instead of being set
undef, potentially extending the lifetimes of prior debug values that
should be killed. The pass was also very aggressive with dropping debug
info, dropping debug info even when the SSA value it referred to was
still present. This patch attempts to handle debug info more carefully,
checking to see whether each affected debug value can still be live,
setting it undef if not.
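
The difference between dropping a debug value and setting it undef can be sketched with a toy model. This is only an illustration of the reasoning above, not the pass's actual code: the record format and the helper names `final_location`, `drop_record` and `kill_record` are hypothetical.

```python
# Toy model: debug records are (variable, location) pairs in program order.
# A record stays in effect until a later record for the same variable replaces it.
UNDEF = "undef"


def final_location(records, variable):
    """Return the location a debugger would report after the last record."""
    location = None
    for var, loc in records:
        if var == variable:
            location = loc
    return location


def drop_record(records, index):
    """Old behaviour (modelled): remove the record entirely."""
    return records[:index] + records[index + 1:]


def kill_record(records, index):
    """New behaviour (modelled): keep the record but mark its location undef."""
    var, _ = records[index]
    return records[:index] + [(var, UNDEF)] + records[index + 1:]


# "x" first lives in %a, then in %b. Suppose %b becomes unavailable.
original = [("x", "%a"), ("x", "%b")]

stale = final_location(drop_record(original, 1), "x")   # earlier value leaks through
killed = final_location(kill_record(original, 1), "x")  # variable reported as unavailable
```

In the dropped case the earlier location `%a` remains visible past the point where it should have ended, which is the lifetime extension the message describes.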

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D106875

2 years ago[ARM] Use v2i1 for MVE and CDE intrinsics
David Green [Fri, 3 Dec 2021 15:27:58 +0000 (15:27 +0000)]
[ARM] Use v2i1 for MVE and CDE intrinsics

This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.

AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 by converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.

Differential Revision: https://reviews.llvm.org/D114455

2 years ago[libc] Fix bugs with negative and mixed normal/denormal inputs in hypot implementation.
Tue Ly [Mon, 29 Nov 2021 18:33:11 +0000 (13:33 -0500)]
[libc] Fix bugs with negative and mixed normal/denormal inputs in hypot implementation.

Fix a bug with negative and mixed normal/denormal inputs in hypot implementation.

Differential Revision: https://reviews.llvm.org/D114726

2 years ago[PowerPC] Handle base load with reservation mnemonic
Nemanja Ivanovic [Fri, 3 Dec 2021 12:56:29 +0000 (06:56 -0600)]
[PowerPC] Handle base load with reservation mnemonic

The Power ISA defined l[bhwdq]arx as both base and
extended mnemonics. The base mnemonic takes the EH
bit as an operand and the extended mnemonic omits
it, making it implicitly zero. The existing
implementation only handles the base mnemonic when
EH is 1 and internally produces a different
instruction. There are historical reasons for this.
This patch simply removes the limitation introduced
by this implementation that disallows the base
mnemonic with EH = 0 in the ASM parser.

This resolves an issue that prevented some files
in the Linux kernel from being built with
-fintegrated-as.

Also fix a crash if the value is not an integer immediate.
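
A minimal sketch of the operand handling described above, written as a standalone toy parser rather than the real ASM parser. The function name `parse_larx`, the returned dictionary, and the restriction of EH to the literals 0 and 1 are assumptions made for illustration.

```python
import re

# Base/extended load-with-reservation mnemonics named in the message: l[bhwdq]arx.
LARX_MNEMONICS = {"lbarx", "lharx", "lwarx", "ldarx", "lqarx"}


def parse_larx(line):
    """Parse 'l?arx RT, RA, RB[, EH]' and return the operands with an explicit EH."""
    mnemonic, _, rest = line.strip().partition(" ")
    if mnemonic not in LARX_MNEMONICS:
        raise ValueError(f"not a load-with-reservation mnemonic: {mnemonic!r}")
    operands = [op.strip() for op in rest.split(",")]
    if len(operands) == 3:
        # Extended mnemonic: EH is omitted and implicitly zero.
        eh = 0
    elif len(operands) == 4:
        # Base mnemonic: EH is an explicit operand; both 0 and 1 are accepted.
        if not re.fullmatch(r"[01]", operands[3]):
            # Report an error instead of crashing on a non-integer operand.
            raise ValueError(f"EH must be an integer immediate 0 or 1: {operands[3]!r}")
        eh = int(operands[3])
    else:
        raise ValueError(f"expected 3 or 4 operands, got {len(operands)}")
    rt, ra, rb = operands[:3]
    return {"mnemonic": mnemonic, "rt": rt, "ra": ra, "rb": rb, "eh": eh}


extended = parse_larx("lwarx 3, 4, 5")        # EH implicitly 0
base_eh0 = parse_larx("lwarx 3, 4, 5, 0")     # base mnemonic with EH = 0
base_eh1 = parse_larx("lwarx 3, 4, 5, 1")     # base mnemonic with EH = 1
```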

2 years ago[TrivialDeadness] Introduce API separating two different usages
Anna Thomas [Wed, 1 Dec 2021 18:53:58 +0000 (13:53 -0500)]
[TrivialDeadness] Introduce API separating two different usages

The earlier usage of wouldInstructionBeTriviallyDead is based on the
assumption that the use_count of the instruction being checked will be
zero. This patch separates the API into two different ones:

1. The strictly conservative one where the instruction is trivially dead iff the uses are dead.
2. The slightly relaxed form, where an instruction is dead along paths where it is not used.

The second form can be used in identifying instructions that are valid
to sink down to uses (D109917).

Reviewed-By: reames
Differential Revision: https://reviews.llvm.org/D114647

2 years ago[lldb-vscode] Report supportsModulesRequest=true
Andy Yankovsky [Fri, 3 Dec 2021 09:59:13 +0000 (10:59 +0100)]
[lldb-vscode] Report supportsModulesRequest=true

The adapter does support the `Modules` request, implemented in 39239f9.

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D115033

2 years ago[OPENMP]Fix PR52117: Crash caused by target region inside of task construct.
Alexey Bataev [Wed, 24 Nov 2021 17:52:46 +0000 (09:52 -0800)]
[OPENMP]Fix PR52117: Crash caused by target region inside of task construct.

The captured expressions in the clauses need to be analyzed.
Previously the compiler ignored them, which could lead to a compiler
crash when trying to get the address of the mapped variables.

Differential Revision: https://reviews.llvm.org/D114546

2 years ago[InstSimplify] Add test case for logic 'or' fold; NFC
Mehrnoosh Heidarpour [Thu, 2 Dec 2021 22:31:52 +0000 (17:31 -0500)]
[InstSimplify] Add test case for logic 'or' fold; NFC

2 years ago[mlir][linalg][bufferize][NFC] Map only tensors in BufferizationState
Matthias Springer [Fri, 3 Dec 2021 13:52:22 +0000 (22:52 +0900)]
[mlir][linalg][bufferize][NFC] Map only tensors in BufferizationState

BufferizationState had map/lookup overloads for non-tensor values. This was necessary for IREE. There is now a better way to do this, so these overloads can be removed.

Differential Revision: https://reviews.llvm.org/D114929

2 years ago[ARM] Make MVE v2i1 predicates legal
David Green [Fri, 3 Dec 2021 14:05:41 +0000 (14:05 +0000)]
[ARM] Make MVE v2i1 predicates legal

MVE can treat v16i1, v8i1, v4i1 and v2i1 as different views onto the
same 16bit VPR.P0 register, with v2i1 holding two 8 bit values for the
two halves. This was never treated as a legal type in llvm in the past
as there are not many 64bit instructions and no 64bit compares. There
are a few instructions that could use it though, notably a VSELECT (as
it can handle any size using the underlying v16i8 VPSEL), AND/OR/XOR for
similar reasons, some gathers/scatters and long multiplies and VCTP64
instructions.
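
The "different views onto the same register" idea can be illustrated with a small bit-packing model. This is a sketch under assumptions, not LLVM code: `pack_predicate` and `unpack_predicate` are hypothetical helpers, lane 0 is assumed to occupy the low bits, and an active lane is assumed to set every bit it covers.

```python
VPR_BITS = 16  # width of the predicate register described above


def pack_predicate(lanes):
    """Pack a list of 2, 4, 8 or 16 lane booleans into one 16-bit value."""
    count = len(lanes)
    if count not in (2, 4, 8, 16):
        raise ValueError("lane count must be 2, 4, 8 or 16")
    width = VPR_BITS // count          # bits covered by each lane
    lane_mask = (1 << width) - 1
    value = 0
    for index, active in enumerate(lanes):
        if active:
            value |= lane_mask << (index * width)
    return value


def unpack_predicate(value, count):
    """Read the same 16-bit value back as `count` lanes."""
    width = VPR_BITS // count
    lane_mask = (1 << width) - 1
    return [((value >> (i * width)) & lane_mask) == lane_mask for i in range(count)]


# A v2i1 with only the first lane active fills the low 8 bits.
v2 = pack_predicate([True, False])
# The same register contents, viewed as v4i1 and v16i1.
as_v4 = unpack_predicate(v2, 4)
as_v16 = unpack_predicate(v2, 16)
```

In this model each v2i1 lane covers 8 bits, matching the "two 8 bit values for the two halves" wording, and reinterpreting the value at a finer granularity needs no conversion.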

This patch goes through and makes v2i1 a legal type, handling all the
cases that fall out of that. It also makes VSELECT legal for v2i64 as a
side benefit. A lot of the codegen changes as a result - usually in a way
that is a little better or a little worse, but still expensive. Costs
can change a little too in the process, again in a way that expensive
things remain expensive. A lot of the tests that changed are mainly to
ensure correctness - the code can hopefully be improved in the future
where it comes up in practice.

The intrinsics currently remain using the v4i1 they previously did to
emulate a v2i1. This will be changed in a followup patch but this one
was already large enough.

Differential Revision: https://reviews.llvm.org/D114449

2 years ago[AMDGPU] Add some more GFX10 test coverage
Jay Foad [Fri, 3 Dec 2021 14:03:20 +0000 (14:03 +0000)]
[AMDGPU] Add some more GFX10 test coverage

2 years ago[fir] Add fir character builder
Valentin Clement [Fri, 3 Dec 2021 13:55:35 +0000 (14:55 +0100)]
[fir] Add fir character builder

This patch adds the FIR builder to generate the numeric intrinsic
runtime call.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D114900

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>

2 years ago[fir] Add fir derived type runtime builder
Valentin Clement [Fri, 3 Dec 2021 13:49:07 +0000 (14:49 +0100)]
[fir] Add fir derived type runtime builder

This patch adds the builder to generate derived type runtime API calls.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D114472

Co-authored-by: Peter Klausler <pklausler@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

2 years ago[AMDGPU] Add some more GFX10 GlobalISel test coverage
Jay Foad [Fri, 3 Dec 2021 13:40:01 +0000 (13:40 +0000)]
[AMDGPU] Add some more GFX10 GlobalISel test coverage

2 years ago[mlir][linalg][bufferize][NFC] Provide default implementation of getAliasingOpOperand
Matthias Springer [Fri, 3 Dec 2021 13:24:49 +0000 (22:24 +0900)]
[mlir][linalg][bufferize][NFC] Provide default implementation of getAliasingOpOperand

This simplifies op interface implementations.

Differential Revision: https://reviews.llvm.org/D115025

2 years ago[SelectionDAG] Add newline to a debug message
Jay Foad [Fri, 3 Dec 2021 13:33:32 +0000 (13:33 +0000)]
[SelectionDAG] Add newline to a debug message

2 years ago[MemoryLocation] Use None instead of {}. (NFC)
Florian Hahn [Fri, 3 Dec 2021 13:19:00 +0000 (13:19 +0000)]
[MemoryLocation] Use None instead of {}. (NFC)

2 years ago[libc][NFC] Fix typo in CMakeLists documentation
Guillaume Chatelet [Fri, 3 Dec 2021 12:52:09 +0000 (13:52 +0100)]
[libc][NFC] Fix typo in CMakeLists documentation

2 years ago[mlir][NFC] Use const reference for loop variables.
Adrian Kuegel [Fri, 3 Dec 2021 12:07:54 +0000 (13:07 +0100)]
[mlir][NFC] Use const reference for loop variables.

2 years ago[PowerPC] Add non-constant fcopysign f128 test coverage
Simon Pilgrim [Fri, 3 Dec 2021 12:02:13 +0000 (12:02 +0000)]
[PowerPC] Add non-constant fcopysign f128 test coverage

As discussed on D114589, the constant case gets affected by SimplifyDemandedBits a lot, while the non-constant case currently falls back to copysignl libcalls.

2 years agoAMDGPU/GlobalISel: Add clamp combine
Petar Avramovic [Fri, 3 Dec 2021 11:14:28 +0000 (12:14 +0100)]
AMDGPU/GlobalISel: Add clamp combine

Add clamp combine. Source is fminnum(fmaxnum(Val, 0.0), 1.0) or
fmaxnum(fminnum(Val, 1.0), 0.0) or fmed3 intrinsic with 0.0 and
1.0 as two out of three operands.
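
As a rough check of why these three source patterns can all be treated as a clamp to [0.0, 1.0], here is a small numeric model. It is not the combiner's code: the Python functions below merely stand in for the named operations, and NaN and signed-zero behaviour is deliberately ignored.

```python
def fmaxnum(a, b):
    return max(a, b)


def fminnum(a, b):
    return min(a, b)


def fmed3(a, b, c):
    """Median of three values."""
    return sorted([a, b, c])[1]


def clamp_reference(val):
    """Reference clamp to the range [0.0, 1.0]."""
    if val < 0.0:
        return 0.0
    if val > 1.0:
        return 1.0
    return val


def pattern_min_of_max(val):
    return fminnum(fmaxnum(val, 0.0), 1.0)


def pattern_max_of_min(val):
    return fmaxnum(fminnum(val, 1.0), 0.0)


def pattern_med3(val):
    return fmed3(val, 0.0, 1.0)


samples = [-2.0, 0.0, 0.25, 1.0, 3.5]
results = [
    (clamp_reference(v), pattern_min_of_max(v), pattern_max_of_min(v), pattern_med3(v))
    for v in samples
]
```

For every sample, the four entries of each tuple in `results` agree.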

Differential Revision: https://reviews.llvm.org/D90052

2 years agoAMDGPU/GlobalISel: Add floating point med3 combine
Petar Avramovic [Fri, 3 Dec 2021 11:06:14 +0000 (12:06 +0100)]
AMDGPU/GlobalISel: Add floating point med3 combine

Add floating point version of med3 combine.
Source is fminnum(fmaxnum(Val, K0), K1) or fmaxnum(fminnum(Val, K1), K0)
where K0 and K1 are constants and K0 <= K1.
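
The K0 <= K1 precondition matters, as a small numeric model shows. This is only an illustration with plain Python floats (NaN handling ignored); `min_of_max` and `med3` are hypothetical stand-ins for the operations named above.

```python
def med3(a, b, c):
    """Median of three values."""
    return sorted([a, b, c])[1]


def min_of_max(val, k0, k1):
    """Models fminnum(fmaxnum(Val, K0), K1)."""
    return min(max(val, k0), k1)


# With K0 <= K1 the min/max chain is the median of the three values.
ordered = [(v, min_of_max(v, 1.0, 3.0), med3(v, 1.0, 3.0)) for v in (-4.0, 2.0, 7.5)]

# With K0 > K1 the two can disagree, so the combine would be wrong.
chain_result = min_of_max(5.0, 3.0, 1.0)   # min(max(5, 3), 1) = 1.0
median_result = med3(5.0, 3.0, 1.0)        # median of {5, 3, 1} = 3.0
```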

Differential Revision: https://reviews.llvm.org/D90051

2 years agoAMDGPU/GlobalISel: Do not fcanonicalize const splat padded with undef
Petar Avramovic [Fri, 3 Dec 2021 10:59:47 +0000 (11:59 +0100)]
AMDGPU/GlobalISel: Do not fcanonicalize const splat padded with undef

Recognize constant splat padded with undef in isCanonicalized.
Fcanonicalize will be removed by RemoveFcanonicalize in post-legalizer
combiner. We will treat undef as a value that will result in a splat
in clamp combine after regbankselect.
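
A minimal sketch of what "constant splat padded with undef" means, using a plain Python list as the vector. The sentinel `UNDEF`, the helper `constant_splat_value`, and the choice to reject an all-undef vector are assumptions for illustration, not the behaviour of isCanonicalized itself.

```python
UNDEF = object()  # sentinel standing in for an undef vector element


def constant_splat_value(elements):
    """Return the splat constant if all defined elements are equal, else None.

    Undef elements are ignored, so [1.0, UNDEF] counts as a splat of 1.0.
    A vector with no defined element is rejected in this sketch.
    """
    defined = [e for e in elements if e is not UNDEF]
    if not defined:
        return None
    first = defined[0]
    if all(e == first for e in defined):
        return first
    return None


padded = constant_splat_value([1.0, UNDEF, 1.0, UNDEF])  # splat of 1.0
mixed = constant_splat_value([1.0, 2.0, UNDEF])          # not a splat
```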

Differential Revision: https://reviews.llvm.org/D104408

2 years ago[mlir] support recursive type conversion of named LLVM structs
Alex Zinenko [Mon, 22 Nov 2021 12:20:16 +0000 (13:20 +0100)]
[mlir] support recursive type conversion of named LLVM structs

A previous commit added support for converting elemental types contained in
LLVM dialect types in case they were not compatible with the LLVM dialect. It
was missing support for named structs as they could be recursive, which was not
supported by the conversion infra. Now that it is, add support for converting
such named structs.
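
One common way to convert self-referential named types is to register the struct before converting its body, so that a recursive reference finds the entry instead of looping forever. The sketch below shows that general technique on a toy type representation; it is not MLIR's conversion infrastructure, and `convert_structs`, the tuple encoding and the `index` to `i64` mapping are all hypothetical.

```python
def convert_structs(structs, convert_leaf):
    """Convert every named struct in `structs`, tolerating recursion.

    Types are leaf strings, ("ptr", T) or ("struct", name).
    `structs` maps a struct name to its list of field types.
    """
    converted = {}

    def convert(ty):
        if isinstance(ty, tuple) and ty[0] == "struct":
            name = ty[1]
            if name not in converted:
                # Register a placeholder first so recursive references terminate.
                converted[name] = None
                converted[name] = [convert(field) for field in structs[name]]
            return ("struct", name)
        if isinstance(ty, tuple) and ty[0] == "ptr":
            return ("ptr", convert(ty[1]))
        return convert_leaf(ty)

    for name in structs:
        convert(("struct", name))
    return converted


# A linked-list node that points to itself and holds an incompatible leaf type.
source = {"node": ["index", ("ptr", ("struct", "node"))]}
result = convert_structs(source, lambda leaf: "i64" if leaf == "index" else leaf)
```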

Depends On D113579

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D113580

2 years ago[mlir][linalg][bufferize][NFC] Move FuncOp boundary bufferization to ModuleBufferization
Matthias Springer [Fri, 3 Dec 2021 11:28:58 +0000 (20:28 +0900)]
[mlir][linalg][bufferize][NFC] Move FuncOp boundary bufferization to ModuleBufferization

Differential Revision: https://reviews.llvm.org/D114670

2 years ago[mlir][linalg][bufferize] Allow unbufferizable ops in input
Matthias Springer [Fri, 3 Dec 2021 11:19:46 +0000 (20:19 +0900)]
[mlir][linalg][bufferize] Allow unbufferizable ops in input

Allow ops that are not bufferizable in the input IR. (Deactivated by default.)

bufferization::ToMemrefOp and bufferization::ToTensorOp are generated at the bufferization boundaries.

Differential Revision: https://reviews.llvm.org/D114669

2 years ago[RISCV][VP] Add RVV codegen for vp.select
Victor Perez [Fri, 3 Dec 2021 10:00:09 +0000 (10:00 +0000)]
[RISCV][VP] Add RVV codegen for vp.select

Lower the vp.select intrinsic to VSELECT_VL.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D114629

2 years ago[flang] Add missing LABEL in test. NFC
Diana Picus [Thu, 2 Dec 2021 04:40:29 +0000 (04:40 +0000)]
[flang] Add missing LABEL in test. NFC

2 years ago[fir] TargetRewrite: Rewrite fir.address_of(func)
Diana Picus [Thu, 2 Dec 2021 04:27:18 +0000 (04:27 +0000)]
[fir] TargetRewrite: Rewrite fir.address_of(func)

Rewrite AddrOfOp if taking the address of a function.

Differential Revision: https://reviews.llvm.org/D114925

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

2 years ago[mlir][linalg][bufferize][NFC] Move BufferizationOptions to op interface
Matthias Springer [Fri, 3 Dec 2021 10:50:37 +0000 (19:50 +0900)]
[mlir][linalg][bufferize][NFC] Move BufferizationOptions to op interface

Also store a reference to BufferizationOptions in BufferizationState. This is in preparation for adding support for partial bufferization.

Differential Revision: https://reviews.llvm.org/D114661

2 years ago[fir] Add fircg.ext_embox conversion
Valentin Clement [Fri, 3 Dec 2021 10:44:47 +0000 (11:44 +0100)]
[fir] Add fircg.ext_embox conversion

Convert a fircg.ext_embox operation to LLVM IR dialect.
A fircg.ext_embox is converted to a sequence of operations that
create, allocate if needed, and populate a descriptor.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D114148

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

2 years ago[llvm-dwarfdump] Do not print preceding :: for local types
Kristina Bessonova [Fri, 3 Dec 2021 10:27:29 +0000 (12:27 +0200)]
[llvm-dwarfdump] Do not print preceding :: for local types

Reviewed By: dblaikie, jhenderson

Differential Revision: https://reviews.llvm.org/D114892

2 years ago[PowerPC] [Clang] Fix alignment adjustment of single-elemented float128
Qiu Chaofan [Fri, 3 Dec 2021 10:05:46 +0000 (18:05 +0800)]
[PowerPC] [Clang] Fix alignment adjustment of single-elemented float128

This does a similar thing to 6b1341e, but fixes the single-element 128-bit
float type: `struct { long double x; }`.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D114937