platform/upstream/llvm.git
2 years ago[clang][dataflow] Implement a basic algorithm for dataflow analysis
Stanislav Gatev [Fri, 10 Dec 2021 09:37:07 +0000 (10:37 +0100)]
[clang][dataflow] Implement a basic algorithm for dataflow analysis

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed By: xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D115235
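
For readers unfamiliar with the technique, a basic dataflow analysis is commonly written as a worklist fixpoint iteration over the control-flow graph. The sketch below is a generic illustration only, not the Clang implementation: the function name `solve_forward`, the toy CFG, and the "may be assigned" analysis are all assumptions made for the example.

```python
from collections import deque


def solve_forward(cfg, transfer, join, bottom):
    """Run a forward dataflow analysis to a fixpoint with a worklist.

    cfg maps each block name to the list of its successors. The result
    maps each block name to the state at the end of that block.
    """
    preds = {block: [] for block in cfg}
    for block, succs in cfg.items():
        for succ in succs:
            preds[succ].append(block)

    out_state = {block: bottom for block in cfg}
    worklist = deque(cfg)  # every block is visited at least once
    while worklist:
        block = worklist.popleft()
        # The input state is the join of all predecessor outputs.
        in_state = bottom
        for pred in preds[block]:
            in_state = join(in_state, out_state[pred])
        new_out = transfer(block, in_state)
        if new_out != out_state[block]:
            out_state[block] = new_out
            worklist.extend(cfg[block])  # successors must be revisited
    return out_state


# Hypothetical example: which variables may have been assigned so far?
CFG = {"entry": ["loop"], "loop": ["body", "exit"], "body": ["loop"], "exit": []}
ASSIGNS = {"entry": {"i"}, "loop": set(), "body": {"sum"}, "exit": set()}

may_assigned = solve_forward(
    CFG,
    transfer=lambda block, state: state | frozenset(ASSIGNS[block]),
    join=lambda a, b: a | b,
    bottom=frozenset(),
)
```

The loop terminates here because the states are finite sets that only grow; a real framework would need the same monotonicity guarantee from its lattice.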

2 years ago[llvm] [Debuginfo] Add llvm-debuginfod-find tool and end-to-end-tests.
Noah Shutty [Fri, 10 Dec 2021 10:22:15 +0000 (10:22 +0000)]
[llvm] [Debuginfo] Add llvm-debuginfod-find tool and end-to-end-tests.

This implements the `llvm-debuginfod-find` tool, which wraps the Debuginfod library (D112758) to query debuginfod servers for artifacts according to the [[ https://www.mankier.com/8/debuginfod#Webapi | specification ]].

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D112759

2 years ago[VPlan] Add InductionDescriptor to VPWidenIntOrFpInduction. (NFC)
Florian Hahn [Fri, 10 Dec 2021 09:55:09 +0000 (09:55 +0000)]
[VPlan] Add InductionDescriptor to VPWidenIntOrFpInduction. (NFC)

This allows easier access to the induction descriptor from VPlan,
without needing to go through Legal. VPReductionPHIRecipe already
contains a RecurrenceDescriptor in a similar fashion.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D115111

2 years ago[ARM][libcxxabi] Add PACBTI-M support to libcxxabi
Ties Stuij [Fri, 10 Dec 2021 09:36:19 +0000 (09:36 +0000)]
[ARM][libcxxabi] Add PACBTI-M support to libcxxabi

This change consists of just adding 'BTI' to the prologue of Arm assembly
functions, which is just the one: __cxa_end_cleanup

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Mikhail Maltsev

Reviewed By: lenary, danielkiss

Differential Revision: https://reviews.llvm.org/D112432

2 years ago[msan] Implement -msan-disable-checks.
Alexander Potapenko [Tue, 7 Dec 2021 12:20:12 +0000 (13:20 +0100)]
[msan] Implement -msan-disable-checks.

To ease the deployment of KMSAN, we need a way to apply
__attribute__((no_sanitize("kernel-memory"))) to the whole source file.

Passing -msan-disable-checks=1 to the compiler will make it
treat every function in the file as if it was lacking the
sanitize_memory attribute.

Differential Revision: https://reviews.llvm.org/D115236

2 years ago[gn build] Port 1d0244aed781
LLVM GN Syncbot [Fri, 10 Dec 2021 09:08:48 +0000 (09:08 +0000)]
[gn build] Port 1d0244aed781

2 years agoReapply CycleInfo: Introduce cycles as a generalization of loops
Sameer Sahasrabuddhe [Fri, 10 Dec 2021 09:06:43 +0000 (14:36 +0530)]
Reapply CycleInfo: Introduce cycles as a generalization of loops

Reverts 02940d6d2202. Fixes breakage in the modules build.

LLVM loops cannot represent irreducible structures in the CFG. This
change introduces the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.

The cycle analysis is implemented as a generic template and then
instantiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.

This review is a restart of an older review request:
https://reviews.llvm.org/D83094

Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

Differential Revision: https://reviews.llvm.org/D112696
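
To illustrate why loops alone are not enough, the sketch below finds the outermost cycles of a small CFG and their entry blocks. A cycle with more than one entry is irreducible and has no natural-loop header. This is a simplified illustration and not the CycleInfo algorithm: it uses plain reachability instead of the depth-first construction from Havlak's paper, it reports only outermost cycles rather than the nested hierarchy, and all names are hypothetical.

```python
def reachable_from(cfg, start):
    """Return all blocks reachable from `start` via at least one edge."""
    seen, stack = set(), list(cfg[start])
    while stack:
        block = stack.pop()
        if block not in seen:
            seen.add(block)
            stack.extend(cfg[block])
    return seen


def outermost_cycles(cfg):
    """Return a list of (members, entries) pairs, one per outermost cycle."""
    reach = {block: reachable_from(cfg, block) for block in cfg}
    cycles, assigned = [], set()
    for block in cfg:
        if block in assigned or block not in reach[block]:
            continue  # already handled, or not part of any cycle
        # Blocks that reach each other form one strongly connected region.
        members = frozenset(m for m in reach[block] if block in reach[m])
        assigned |= members
        # An entry is a member with a predecessor outside the cycle.
        entries = frozenset(
            m for m in members
            if any(p not in members and m in cfg[p] for p in cfg)
        )
        cycles.append((members, entries))
    return cycles


# Reducible: a single entry ("header") dominates the cycle.
REDUCIBLE = {"entry": ["header"], "header": ["body", "exit"],
             "body": ["header"], "exit": []}
# Irreducible: control can enter the cycle at either "a" or "b".
IRREDUCIBLE = {"entry": ["a", "b"], "a": ["b"], "b": ["a"]}
```

With these inputs, `outermost_cycles(REDUCIBLE)` reports one cycle with the single entry `header`, while `outermost_cycles(IRREDUCIBLE)` reports one cycle whose entries are both `a` and `b`.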

2 years agoRemove one change from https://reviews.llvm.org/D115431
Jason Molenda [Fri, 10 Dec 2021 09:01:17 +0000 (01:01 -0800)]
Remove one change from https://reviews.llvm.org/D115431

Remove the change to ArchSpec::SetArchitecture that was setting the
ObjectFile of a mach-o binary to llvm::Triple::MachO.  It's not
necessary for my patch, and it changes the output of image list -t,
causing TestUniversal.py to fail on x86_64 systems.  The bots
turned up the failure; I was developing and testing this on
an Apple Silicon mac.

2 years ago[flang][nfc] Fix formatting
Andrzej Warzynski [Fri, 10 Dec 2021 08:56:57 +0000 (08:56 +0000)]
[flang][nfc] Fix formatting

2 years ago[flang][codegen] Add a conversion for `!fir.coordinate_of` - part 1
Andrzej Warzynski [Wed, 17 Nov 2021 10:03:19 +0000 (10:03 +0000)]
[flang][codegen] Add a conversion for `!fir.coordinate_of` - part 1

This patch extends the `FIRToLLVMLowering` pass in Flang by adding a
hook to transform `!fir.coordinate_of` into a sequence of LLVM MLIR
instructions.

The following cases are currently supported:
  1.  the input object is a `!fir.complex` (wrapped in e.g. `!fir.ref` or
      `!fir.box`)
  2.  the input object is wrapped in a `!fir.box` (including e.g.
      `!fir.array`).
Note that `!fir.complex` inside a `!fir.box` falls under case 1. above
(i.e. it's a special case regardless of the wrapping type).

This is part of the upstreaming effort from the `!fir-dev` branch in [1].

Differential Revision: https://reviews.llvm.org/D114159

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
2 years ago[AMDGPU] Add AV class spill pseudo instructions
Christudasan Devadasan [Thu, 9 Dec 2021 07:55:21 +0000 (02:55 -0500)]
[AMDGPU] Add AV class spill pseudo instructions

While enabling vector superclasses with D109301,
the AV spills are converted into VGPR spills by
introducing appropriate copies. The whole thing
ended up adding two instructions per spill (a copy
+ vgpr spill pseudo) and caused an incorrect
live range update in the inline spiller.

This patch adds the pseudo instructions for all
AV spills from 32b to 1024b and handles them in
the way all other spills are lowered.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115439

2 years ago[PowerPC] Require htm feature for HTM builtins
Qiu Chaofan [Fri, 10 Dec 2021 08:01:45 +0000 (16:01 +0800)]
[PowerPC] Require htm feature for HTM builtins

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D114569

2 years ago[Inline] Add test for exponential deferred inlining (NFC)
Nikita Popov [Fri, 10 Dec 2021 07:59:59 +0000 (08:59 +0100)]
[Inline] Add test for exponential deferred inlining (NFC)

This shows a case where deferred inlining produces an exponential
result. The test case demonstrates the basic exponential behavior,
but is nowhere close to the worst case. For example, the file at
https://gist.github.com/nikic/1262b5f7d27278e1b34a190ae10947f5
currently gets expanded from <100 lines to nearly 500000 lines of
IR by opt -inline.

2 years ago[GlobalISel] Fix IRTranslator for constexpr fcmp
Konstantin Schwarz [Thu, 9 Dec 2021 16:18:42 +0000 (17:18 +0100)]
[GlobalISel] Fix IRTranslator for constexpr fcmp

The existing code assumed fcmp to always be an Instruction, but it can also be a ConstExpr.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115450

2 years ago[NFC] Format the newly added table for coro.end in coroutines.rst
Chuanqi Xu [Fri, 10 Dec 2021 07:18:53 +0000 (15:18 +0800)]
[NFC] Format the newly added table for coro.end in coroutines.rst

The intention should be formatted in two lines instead of one.

2 years agoSet a default number of address bits on Darwin arm64 systems
Jason Molenda [Fri, 10 Dec 2021 06:43:39 +0000 (22:43 -0800)]
Set a default number of address bits on Darwin arm64 systems

With arm64e ARMv8.3 pointer authentication, lldb needs to know how
many bits are used for addressing and how many are used for pointer
auth signing.  This should be determined dynamically from the inferior
system / corefile, but there are some workflows where it still isn't
recorded and we fall back on a default value that is correct on some
Darwin environments.

This patch also explicitly sets the vendor of mach-o binaries to
Apple, so we select an Apple ABI instead of a random other ABI.

It adds a function pointer formatter for systems where pointer
authentication is in use, and we can strip the ptrauth bits off
of the function pointer address and get a different value that
points to an actual symbol.

Differential Revision: https://reviews.llvm.org/D115431
rdar://84644661
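
As a rough illustration of why the number of address bits matters, the sketch below strips the non-address bits from a signed pointer by masking. It is an assumption-laden example rather than lldb's code: the function name is hypothetical, the value 47 is chosen only for the example and is not claimed to be the default this patch sets, and only low-half (user-space) addresses are handled.

```python
def strip_pointer_auth(pointer, addressable_bits):
    """Clear every bit above the addressable range of a 64-bit pointer.

    Assumes a low-half (user-space) address, where the stripped bits
    are all zero in the real pointer.
    """
    mask = (1 << addressable_bits) - 1
    return pointer & mask


# Hypothetical signed function pointer: the top bits carry a signature.
signed_pointer = 0x0023000100004000
plain_pointer = strip_pointer_auth(signed_pointer, addressable_bits=47)
```

With too few address bits the mask would cut into the real address, and with too many it would leave signature bits behind, so the resulting value would not resolve to a symbol.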

2 years ago[libc][NFC] Add NOLINT annotations at call sites to immintrin functions.
Siva Chandra Reddy [Fri, 10 Dec 2021 06:36:50 +0000 (06:36 +0000)]
[libc][NFC] Add NOLINT annotations at call sites to immintrin functions.

These annotations are intended to be temporary while we understand why
clang-tidy is not able to treat them as builtin exceptions.

2 years ago[sanitizer] Update symbols after D113717
Vitaly Buka [Fri, 10 Dec 2021 05:50:32 +0000 (21:50 -0800)]
[sanitizer] Update symbols after D113717

2 years ago[RISCV] Unify depedency check and extension implication parsing logics
eopXD [Sat, 23 Oct 2021 10:18:24 +0000 (03:18 -0700)]
[RISCV] Unify depedency check and extension implication parsing logics

Originally there were two places that do parsing - `parseArchString` and
`parseFeatures`, each with its own code for dependency checks and implication.
This patch extracts the common parts of the two as functions of `RISCVISAInfo`
and lets both of them use it.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112359

2 years ago[RISCV] Fix arch string parsing for multi-character extensions
eopXD [Sat, 23 Oct 2021 05:25:57 +0000 (22:25 -0700)]
[RISCV] Fix arch string parsing for multi-character extensions

The current implementation can't parse extension names that contain digits
correctly (e.g. `zvl128b`). This patch fixes it.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D109215
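
The difficulty is that digits can be part of the name (as in `zvl128b`) or part of a trailing version (as in `zba1p0`). One way to separate the two, sketched below, is to require the name to end in a letter and treat only the digits after that letter as the version. This is an illustrative regular expression under that assumption, not the parser LLVM uses, and `split_extension` is a hypothetical name.

```python
import re

# Name: letters and digits ending in a letter. Version: "<major>" or
# "<major>p<minor>" at the very end of the string, both optional.
_EXTENSION = re.compile(r"([a-z0-9]*?[a-z])(?:(\d+)(?:p(\d+))?)?")


def split_extension(text):
    """Split an extension string into (name, major, minor)."""
    match = _EXTENSION.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed extension: {text!r}")
    name, major, minor = match.groups()
    return (
        name,
        int(major) if major is not None else None,
        int(minor) if minor is not None else None,
    )
```

Under this rule `split_extension("zvl128b")` keeps `128` inside the name, whereas a parser that stops the name at the first digit would read it as `zvl` followed by a version.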

2 years agoFix crash from use of a temporary after its scope exit
Mehdi Amini [Fri, 10 Dec 2021 05:02:25 +0000 (05:02 +0000)]
Fix crash from use of a temporary after its scope exit

Introduced in D110448 and broke some bots (reported by ASAN).

Differential Revision: https://reviews.llvm.org/D110448

2 years ago[llvm] Use llvm::count (NFC)
Kazu Hirata [Fri, 10 Dec 2021 04:50:38 +0000 (20:50 -0800)]
[llvm] Use llvm::count (NFC)

2 years ago[AArch64][GlobalISel] Split vector stores of zero.
Amara Emerson [Fri, 10 Dec 2021 00:05:14 +0000 (16:05 -0800)]
[AArch64][GlobalISel] Split vector stores of zero.

This results in a very minor improvement in most cases, generating
stores of xzr instead of moving zero to a vector register.

Differential Revision: https://reviews.llvm.org/D115479

2 years ago[dfsan] Add missing test for the new pass manager with -dfsan-ignore-personality...
Taewook Oh [Thu, 9 Dec 2021 23:49:00 +0000 (15:49 -0800)]
[dfsan] Add missing test for the new pass manager with -dfsan-ignore-personality-routine

A test for the new pass manager was missing from the original diff D115317.

Reviewed By: browneee

Differential Revision: https://reviews.llvm.org/D115477

2 years agoSupport: Avoid using SmallVector::set_size() in MemoryBuffer
Duncan P. N. Exon Smith [Wed, 8 Dec 2021 01:22:47 +0000 (17:22 -0800)]
Support: Avoid using SmallVector::set_size() in MemoryBuffer

Update getMemoryBufferForStream() to use `resize_for_overwrite()` and
`truncate()` instead of `reserve()` and `set_size()`.

Differential Revision: https://reviews.llvm.org/D115384

2 years agoRevert "[X86][clang] Emit diagnostic for float and double when we have features ...
Phoebe Wang [Fri, 10 Dec 2021 02:31:09 +0000 (10:31 +0800)]
Revert "[X86][clang] Emit diagnostic for float and double when we have features -x87 and -sse on 64-bits"

This reverts commit 4a2c827b178f89d4cdeb56153d9440ad4ba786a3.

Need to fix the problem when using `-mno-sse` together with "x86intrin.h"

2 years agoRevert "[ASan] Shared optimized callbacks implementation."
Kirill Stoimenov [Fri, 10 Dec 2021 00:08:33 +0000 (00:08 +0000)]
Revert "[ASan] Shared optimized callbacks implementation."

This reverts commit 428ed61a921c092b638ee512c73d48352af915e6.

Build bot failure:
https://lab.llvm.org/buildbot/#/builders/37
https://lab.llvm.org/buildbot/#/builders/37/builds/9041

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115489

2 years ago[mlir] Add filegroup for Conversion/PassDetail
Chia-hung Duan [Fri, 10 Dec 2021 02:03:50 +0000 (02:03 +0000)]
[mlir] Add filegroup for Conversion/PassDetail

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D115487

2 years ago[flang] Fix folding of ac-implied-do indices in structure c'tors
Peter Klausler [Thu, 9 Dec 2021 21:00:54 +0000 (13:00 -0800)]
[flang] Fix folding of ac-implied-do indices in structure c'tors

Array constructors with implied DO loops that oversee structure
constructors were being prematurely folded into invalid constants
containing symbolic references to the ac-implied-do indices,
because they are indeed "constant expressions" as that term is
used in the Fortran standard and implemented as IsConstantExpr().
What's actually needed in structure constructor folding is a
test for actual constant values, which is what results from
folding them later with repetition in the context of folding
an ac-implied-do.

Differential Revision: https://reviews.llvm.org/D115470

2 years ago[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.
Noah Shutty [Fri, 10 Dec 2021 01:27:28 +0000 (01:27 +0000)]
[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.

Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Error::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Updates compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh to include Debuginfod library to fix sanitizer-x86_64-linux breakage.

Reviewed By: jhenderson, vitalybuka

Differential Revision: https://reviews.llvm.org/D113717

2 years ago[X86][MS-InlineAsm] Make the constraint *m to be simple place holder
Phoebe Wang [Fri, 10 Dec 2021 00:41:02 +0000 (08:41 +0800)]
[X86][MS-InlineAsm] Make the constraint *m to be simple place holder

D113096 solved the "undefined reference to xxx" issue by adding
constraint *m for the global var. But it has a strong side effect because
the symbol in the assembly is replaced with the constraint variable.
This leads to some lowering failures. https://godbolt.org/z/h3nWoerPe

This patch fixes the problem by using the constraint *m as a placeholder
rather than a real constraint. It has a negligible effect on the existing
code generation.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D115225

2 years agoRevert "[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer."
Noah Shutty [Fri, 10 Dec 2021 00:59:13 +0000 (00:59 +0000)]
Revert "[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer."

This reverts commit e2ad4f1756027cd27f6c82db620042e9877f900c because it
does not correctly fix the sanitizer buildbot breakage.

2 years ago[amdgpu][nfc] Delete dead code in LowerModuleLDS
Jon Chesterfield [Fri, 10 Dec 2021 00:43:23 +0000 (00:43 +0000)]
[amdgpu][nfc] Delete dead code in LowerModuleLDS

2 years ago[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.
Noah Shutty [Fri, 10 Dec 2021 00:22:48 +0000 (00:22 +0000)]
[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.

Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Error::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Adds new symbolizer symbols to `global_symbols.txt`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D113717

2 years ago[NFC][tools] Return underlying strings directly instead of OS.str()
Logan Smith [Thu, 9 Dec 2021 23:03:19 +0000 (15:03 -0800)]
[NFC][tools] Return underlying strings directly instead of OS.str()

This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5226b84889b923bae884ba395ad084d,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374

2 years ago[NFC][Sema] Return underlying strings directly instead of OS.str()
Logan Smith [Thu, 9 Dec 2021 23:02:58 +0000 (15:02 -0800)]
[NFC][Sema] Return underlying strings directly instead of OS.str()

This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5226b84889b923bae884ba395ad084d,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374

2 years ago[NFC][clang] Return underlying strings directly instead of OS.str()
Logan Smith [Thu, 9 Dec 2021 23:02:35 +0000 (15:02 -0800)]
[NFC][clang] Return underlying strings directly instead of OS.str()

This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5226b84889b923bae884ba395ad084d,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374

2 years ago[NFC][testing] Return underlying strings directly instead of OS.str()
Logan Smith [Thu, 9 Dec 2021 22:55:44 +0000 (14:55 -0800)]
[NFC][testing] Return underlying strings directly instead of OS.str()

This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5226b84889b923bae884ba395ad084d,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374

2 years ago[NFC][AST] Return underlying strings directly instead of OS.str()
Logan Smith [Thu, 9 Dec 2021 22:52:30 +0000 (14:52 -0800)]
[NFC][AST] Return underlying strings directly instead of OS.str()

This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5226b84889b923bae884ba395ad084d,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374

2 years ago[NFC][analyzer] Return underlying strings directly instead of OS.str()
Logan Smith [Thu, 9 Dec 2021 22:51:24 +0000 (14:51 -0800)]
[NFC][analyzer] Return underlying strings directly instead of OS.str()

This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5226b84889b923bae884ba395ad084d,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374

2 years ago[ASan] Fixed Windows test by excluding macro instantiated INTERFACE_FUNCTION.
Kirill Stoimenov [Fri, 10 Dec 2021 00:03:51 +0000 (00:03 +0000)]
[ASan] Fixed Windows test by excluding macro instantiated INTERFACE_FUNCTION.

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115478

2 years ago[ASan] Fix Windows build by excluding a test which requires assembly callback versions.
Kirill Stoimenov [Thu, 9 Dec 2021 23:38:54 +0000 (23:38 +0000)]
[ASan] Fix Windows build by excluding a test which requires assembly callback versions.

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115475

2 years ago[libc] Use intrinsics for x86-64 fma and optimize PolyEval for x86-64 with degree...
Tue Ly [Wed, 8 Dec 2021 15:14:16 +0000 (10:14 -0500)]
[libc] Use intrinsics for x86-64 fma and optimize PolyEval for x86-64 with degree 3 & 5 polynomials.

- Use intrinsics for x86-64 fma
- Optimize PolyEval for x86-64 with degree 3 & 5 polynomials.
- There might be a slight loss of accuracy compared to Horner's scheme due to usages of higher powers x^2 and x^3 in the computations.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D115347
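
The trade-off mentioned above can be seen by comparing Horner's scheme with an evaluation that precomputes x^2. The split form has a shorter dependency chain, so its two halves can be computed in parallel, but it rounds differently. The sketch below is an illustration only: the grouping chosen for the degree 3 case is an assumption and not necessarily the one used in the patch, and plain multiply and add stand in for fma.

```python
def horner(x, coeffs):
    """Evaluate a0 + a1*x + ... + an*x^n with one serial chain."""
    acc = 0.0
    for coeff in reversed(coeffs):
        acc = acc * x + coeff
    return acc


def poly3_split(x, a0, a1, a2, a3):
    """Evaluate a degree 3 polynomial as (a0 + a1*x) + x^2 * (a2 + a3*x).

    The two parenthesized halves do not depend on each other.
    """
    x2 = x * x
    return (a0 + a1 * x) + x2 * (a2 + a3 * x)
```

For inputs whose intermediate results are exactly representable the two functions agree; in general they can differ in the last bits, which matches the accuracy caveat in the commit message.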

2 years ago[ASan] Fix Windows build by excluding asan_rtl_x86_64.S.
Kirill Stoimenov [Thu, 9 Dec 2021 23:23:21 +0000 (23:23 +0000)]
[ASan] Fix Windows build by excluding asan_rtl_x86_64.S.

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115473

2 years ago[flang] Updated FIR dialect to _Both
Jacques Pienaar [Thu, 9 Dec 2021 03:49:00 +0000 (19:49 -0800)]
[flang] Updated FIR dialect to _Both

Change the dialect (and remove now redundant accessors) so that both
forms of accessors are generated. Tried to keep this change
reasonably minimal (this also includes keeping the note about not generating
the getType accessor to avoid shadowing).

Differential Revision: https://reviews.llvm.org/D115420

2 years ago[ASan] Shared optimized callbacks implementation.
Kirill Stoimenov [Thu, 9 Dec 2021 00:42:50 +0000 (00:42 +0000)]
[ASan] Shared optimized callbacks implementation.

This change moves optimized callbacks from each .o file to compiler-rt. Instead of using code generation it uses direct assembly implementation. Please note that the 'or' version is not implemented and it will produce unresolved external if somehow 'or' version is requested.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D114558

2 years ago[bazel] Exclude MLModelRunnerTest.cpp
Mircea Trofin [Thu, 9 Dec 2021 22:53:50 +0000 (14:53 -0800)]
[bazel] Exclude MLModelRunnerTest.cpp

Until we figure out MLGO + bazel, exclude this unittest (same as
TFUtilsTest.cpp)

Differential Revision: https://reviews.llvm.org/D115472

2 years ago[lldb/Target] Refine source display warning for artificial locations (NFC)
Med Ismail Bennani [Thu, 9 Dec 2021 19:46:05 +0000 (11:46 -0800)]
[lldb/Target] Refine source display warning for artificial locations (NFC)

This is a post-review update for D115313, to rephrase source display
warning messages for artificial locations, making them more
understandable for the end-user.

Differential Revision: https://reviews.llvm.org/D115461

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2 years ago[PGO] Adjust BFI verification option default values [NFC]
Rong Xu [Thu, 9 Dec 2021 22:15:28 +0000 (14:15 -0800)]
[PGO] Adjust BFI verification option default values [NFC]

Slightly changed the default option values.
Also avoided some bogus output.

2 years ago[mlir][sparse] minor corrections and updates in sparse compiler doc
Aart Bik [Thu, 9 Dec 2021 21:30:03 +0000 (13:30 -0800)]
[mlir][sparse] minor corrections and updates in sparse compiler doc

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115467

2 years ago[libc++abi][AIX] Add 2 LIT tests for the AIX unwinder
Xing Xue [Thu, 9 Dec 2021 21:43:32 +0000 (16:43 -0500)]
[libc++abi][AIX] Add 2 LIT tests for the AIX unwinder

Summary:
This patch creates sub-directory libcxxabi/test/vendor/ibm and adds 2 LIT test cases for the AIX EH under the directory. One tests the restoration of the condition register and the other tests the restoration of vector registers. Both are saved on the stack by the function prologue.

Reviewed by: compnerd, libc++abi

Differential Revision: https://reviews.llvm.org/D114445

2 years ago[OpenMP][FIX] Pass the num_threads value directly to parallel_51
Joseph Huber [Thu, 9 Dec 2021 21:10:46 +0000 (16:10 -0500)]
[OpenMP][FIX] Pass the num_threads value directly to parallel_51

The problem with the old scheme is that we would need to keep track of
the "next region" and reset the num_threads value after it. The new RT
doesn't do it and an assertion is triggered. The old RT doesn't do it
either; I haven't tested it, but I assume a num_threads clause might
impact multiple parallel regions "accidentally". Further, in SPMD mode
num_threads was simply ignored, for some reason beyond me.

In any case, parallel_51 is designed to take the clause value directly,
so let's do that instead.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D113623

2 years ago[flang] Simplify RaggedArrayHeader and make it plain C struct
Valentin Clement [Thu, 9 Dec 2021 21:27:49 +0000 (22:27 +0100)]
[flang] Simplify RaggedArrayHeader and make it plain C struct

- Join indirection and rank into a single value `flags`
- Make the struct a plain C struct.

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D115464
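
Joining a boolean and a small integer into one field is usually done by bit packing. The sketch below shows the idea with a hypothetical layout (bit 0 holds the indirection flag, the remaining bits hold the rank); the actual bit assignment in `RaggedArrayHeader` is not stated in this message and may differ.

```python
INDIRECTION_BIT = 1  # hypothetical: bit 0 marks an indirection header


def pack_flags(indirection, rank):
    """Combine the indirection flag and the rank into a single integer."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return (rank << 1) | (INDIRECTION_BIT if indirection else 0)


def unpack_flags(flags):
    """Recover (indirection, rank) from a packed flags value."""
    return bool(flags & INDIRECTION_BIT), flags >> 1
```

A single integer field of this kind maps directly onto a plain C struct member, which fits the second bullet of the change.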

2 years ago[InstSimplify] Simplify bool icmp with not in LHS
Hasyimi Bahrudin [Thu, 9 Dec 2021 17:03:59 +0000 (12:03 -0500)]
[InstSimplify] Simplify bool icmp with not in LHS

Refer to https://llvm.org/PR52546.

Simplifies the following cases:
    not(X) == 0 -> X != 0 -> X
    not(X) <=u 0 -> X >u 0 -> X
    not(X) >=s 0 -> X <s 0 -> X
    not(X) != 1 -> X == 1 -> X
    not(X) <u 1 -> X >=u 1 -> X
    not(X) >s 1 -> X <=s -1 -> X

Differential Revision: https://reviews.llvm.org/D114666
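
Because the operands are single bits, each fold can be checked exhaustively. The sketch below models an i1 value as a bit that reads as 0 or 1 when unsigned and as 0 or -1 when signed, so the constant written as 1 is -1 under the signed predicates. It is a stand-alone check written for this note, not code from InstSimplify. The fifth case is checked with the strict `<u` predicate, since `<=u 1` holds for every i1 value and therefore cannot fold to X.

```python
import operator


def as_unsigned(bit):
    return bit


def as_signed(bit):
    return -bit  # the set bit of an i1 is the sign bit


PREDICATES = {
    "eq": (as_unsigned, operator.eq),
    "ne": (as_unsigned, operator.ne),
    "ule": (as_unsigned, operator.le),
    "ult": (as_unsigned, operator.lt),
    "sge": (as_signed, operator.ge),
    "sgt": (as_signed, operator.gt),
}


def icmp(pred, lhs_bit, rhs_bit):
    """Evaluate an integer comparison on two i1 values given as bits."""
    interpret, compare = PREDICATES[pred]
    return compare(interpret(lhs_bit), interpret(rhs_bit))


def folds_to_x(pred, rhs_bit):
    """True if `icmp pred not(X), rhs` equals X for both values of X."""
    return all(icmp(pred, x ^ 1, rhs_bit) == bool(x) for x in (0, 1))


# (predicate, constant) pairs corresponding to the six cases above.
FOLDS = [("eq", 0), ("ule", 0), ("sge", 0), ("ne", 1), ("ult", 1), ("sgt", 1)]
all_folds_hold = all(folds_to_x(pred, rhs) for pred, rhs in FOLDS)
```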

2 years ago[MLIR] PresburgerSetTest: expectEqual: pass by ref, not value
Arjun P [Thu, 9 Dec 2021 17:38:03 +0000 (23:08 +0530)]
[MLIR] PresburgerSetTest: expectEqual: pass by ref, not value

2 years ago[NFC] Use getAlign() instead of getAlignment() in haveSameSpecialState()
Arthur Eubanks [Thu, 9 Dec 2021 21:19:09 +0000 (13:19 -0800)]
[NFC] Use getAlign() instead of getAlignment() in haveSameSpecialState()

getAlignment() is deprecated.

2 years ago[gn build] Port cfb075089128
LLVM GN Syncbot [Thu, 9 Dec 2021 21:10:50 +0000 (21:10 +0000)]
[gn build] Port cfb075089128

2 years agoUnify libstdcpp and libcxx formatters for `std::optional`
Alisamar Husain [Thu, 9 Dec 2021 20:22:30 +0000 (12:22 -0800)]
Unify libstdcpp and libcxx formatters for `std::optional`

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D115178

2 years ago[AArch64][GlobalISel] Add regbankselect support for G_FMAXIMUM/G_FMINIMUM
Jessica Paquette [Wed, 8 Dec 2021 20:24:19 +0000 (12:24 -0800)]
[AArch64][GlobalISel] Add regbankselect support for G_FMAXIMUM/G_FMINIMUM

These always use FPRs only.

Differential Revision: https://reviews.llvm.org/D115376

2 years ago[TargetInstrInfo][PowerPC] Remove virtual function that is only called from PPC speci...
Craig Topper [Thu, 9 Dec 2021 20:39:48 +0000 (12:39 -0800)]
[TargetInstrInfo][PowerPC] Remove virtual function that is only called from PPC specific code.

There are two signatures of setSpecialOperandAttr in TargetInstrInfo.
One of them is only called from PPCInstrInfo which has an override
of it.

Remove it from TargetInstrInfo and make it a non-virtual method in
PPCInstrInfo.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D115404

2 years ago[mlir][tosa] Fix quantized type for tosa.conv2d canonicalization
Rob Suderman [Thu, 9 Dec 2021 20:32:15 +0000 (12:32 -0800)]
[mlir][tosa] Fix quantized type for tosa.conv2d canonicalization

The wrong type was used for the result type in the tosa.conv_2d canonicalization.
The result element type should match that of the result type,
not the input element type.

Differential Revision: https://reviews.llvm.org/D115463

2 years ago[GlobalISel] Make G_PTR_ADD pattern matcher non-commutative.
Daniel Thornburgh [Thu, 9 Dec 2021 20:00:05 +0000 (12:00 -0800)]
[GlobalISel] Make G_PTR_ADD pattern matcher non-commutative.

G_PTR_ADD takes arguments of two different types, so it probably shouldn't be
considered commutative just on that basis. A recent G_PTR_ADD reassociation
optimization (https://reviews.llvm.org/D109528) can emit erroneous code if the
pattern matcher commutes the arguments; this can happen when the base pointer
was created by G_INTTOPTR of a G_CONSTANT and the offset register is variable.

This was discovered on the llvm-mos fork, but I added a failing test case that
should apply to AArch64 (and more generally).

Differential Revision: https://reviews.llvm.org/D114655

2 years ago[ifs] Add options to allow llvm-ifs to generate multiple outputs
Haowei Wu [Fri, 3 Dec 2021 07:25:38 +0000 (23:25 -0800)]
[ifs] Add options to allow llvm-ifs to generate multiple outputs

This change adds options to llvm-ifs to allow it to generate multiple
types of stub files at a single invocation.

Differential Revision: https://reviews.llvm.org/D115024

2 years ago[libFuzzer] Remove entropic-scale-per-exec-time.test.
Matt Morehouse [Thu, 9 Dec 2021 20:19:40 +0000 (12:19 -0800)]
[libFuzzer] Remove entropic-scale-per-exec-time.test.

The test has been flaky for years, and I think we should remove it to
eliminate noise on the buildbot.

Neither dokyungs nor I have been able to fully deflake the test, and it
tests a non-default Entropic flag.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D115453

2 years ago[lsan] Move out suppression of invalid PCs from StopTheWorld
Vitaly Buka [Thu, 9 Dec 2021 19:21:16 +0000 (11:21 -0800)]
[lsan] Move out suppression of invalid PCs from StopTheWorld

This removes the last use of StackDepot from StopTheWorld.

Depends on D115284.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D115319

2 years ago[NFC][sanitizer] Relax InternalLowerBound interface
Vitaly Buka [Wed, 8 Dec 2021 08:44:51 +0000 (00:44 -0800)]
[NFC][sanitizer] Relax InternalLowerBound interface

val can be of any type accepted by Compare.

2 years ago[mlir][sparse] reenable asan for sampled mm integration test
Aart Bik [Wed, 8 Dec 2021 18:13:14 +0000 (10:13 -0800)]
[mlir][sparse] reenable asan for sampled mm integration test

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115364

2 years ago[AArch64][GlobalISel] Legalize scalar G_FMAXIMUM + G_FMINIMUM
Jessica Paquette [Wed, 8 Dec 2021 20:00:47 +0000 (12:00 -0800)]
[AArch64][GlobalISel] Legalize scalar G_FMAXIMUM + G_FMINIMUM

Necessary for implementing some combines on floating point selects.

Differential Revision: https://reviews.llvm.org/D115372

2 years ago[lsan] Reduce StopTheWorld access to StackDepot
Vitaly Buka [Wed, 8 Dec 2021 05:40:20 +0000 (21:40 -0800)]
[lsan] Reduce StopTheWorld access to StackDepot

StackDepot locks some stuff. As is, there is a small probability of
deadlock if we stop a thread which has locked the Depot.

We need to either Lock/Unlock StackDepot for StopTheWorld, or not
interact with StackDepot from there.

This patch does not run LeakReport under StopTheWorld. LeakReport
contains most of StackDepot access.

As a bonus this patch will help to resolve kMaxLeaksConsidered FIXME.

Depends on D114498.

Reviewed By: morehouse, kstoimenov

Differential Revision: https://reviews.llvm.org/D115284

2 years ago[benchmark] Reapply fix for -Wcovered-switch-default warning
Martin Storsjö [Thu, 9 Dec 2021 09:09:39 +0000 (11:09 +0200)]
[benchmark] Reapply fix for -Wcovered-switch-default warning

This reapplies a fix from 948ce4e6edec6ad3cdf1911fc3e8e9569140d4ff,
which wasn't originally submitted upstream. It has now been merged upstream
though, in https://github.com/google/benchmark/pull/1302.

When benchmarks were unified in
5dda2efde574d3a200d04c371f561a77ee9f4aff, it lost this change,
but it also lost another local modification, where benchmark's
CMakeLists.txt was modified to comment out adding -Werror.
(This change was part of the original import in
0addd170ab0880941fa4089c2717f3f3a0e4e25a.)

As the benchmark library is built automatically by default, when
building all of LLVM (contrary to the copy in libcxx, which wasn't
built by default), building it with -Werror by default is very brittle.

This fixes building LLVM with MinGW. (It wasn't broken in MSVC
mode, as the benchmark library doesn't add -Werror or anything
equivalent in MSVC mode, and it's unclear if this warning is
enabled in that mode at all.)

Differential Revision: https://reviews.llvm.org/D115434

2 years agoReapply #2 of [runtimes] Fix building initial libunwind+libcxxabi+libcxx with compile...
Martin Storsjö [Sat, 23 Oct 2021 22:11:20 +0000 (01:11 +0300)]
Reapply #2 of [runtimes] Fix building initial libunwind+libcxxabi+libcxx with compiler implied -lunwind

This does mostly the same as D112126, but for the runtimes cmake files.
Most of that is straightforward, but the interdependency between
libcxx and libunwind is tricky:

Libunwind is built at the same time as libcxx, but libunwind is not
installed yet. LIBCXXABI_USE_LLVM_UNWINDER makes libcxx link directly
against the just-built libunwind, but the compiler implicit -lunwind
isn't found. This patch avoids that by adding --unwindlib=none if
supported, if we are going to link explicitly against a newly built
unwinder anyway.

Since the previous attempt, this no longer uses
llvm_enable_language_nolink (and thus doesn't set
CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY during the compiler
sanity checks). Setting CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY
during compiler sanity checks makes cmake not learn about some
aspects of the compiler, which can make further find_library or
find_package fail. This caused OpenMP to not detect libelf and libffi,
disabling some OpenMP target plugins.

Instead, require the caller to set CMAKE_{C,CXX}_COMPILER_WORKS=YES
when building in a configuration with an incomplete toolchain.

Differential Revision: https://reviews.llvm.org/D113253

2 years ago[NFC][lsan] Change LeakSuppressionContext interface
Vitaly Buka [Wed, 8 Dec 2021 05:37:02 +0000 (21:37 -0800)]
[NFC][lsan] Change LeakSuppressionContext interface

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D115318

2 years ago[SLP]Fix comparator for cmp instruction vectorization.
Alexey Bataev [Tue, 7 Dec 2021 18:16:14 +0000 (10:16 -0800)]
[SLP]Fix comparator for cmp instruction vectorization.

The comparator for the sort functions should provide a strict weak
ordering relation between parameters. The current solution causes a compiler
crash with some standard C++ library implementations, because it does
not meet this criterion. This patch attempts to fix it, and it also improves
the overall vectorization result.
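
As an illustration of the requirement above (not the comparator from this patch), here is a minimal sketch of a comparator that satisfies strict weak ordering next to one that does not. `CmpKey`, `cmpKeyLess`, `cmpKeyBroken`, and `sortedKeys` are hypothetical names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a compare instruction: a predicate id plus
// two operand ids. Not an LLVM type.
struct CmpKey {
  int Predicate;
  int Lhs;
  int Rhs;
};

// Lexicographic comparison through std::tie is irreflexive and transitive,
// and its induced equivalence is transitive too, so it is a valid strict
// weak ordering for std::sort.
inline bool cmpKeyLess(const CmpKey &A, const CmpKey &B) {
  return std::tie(A.Predicate, A.Lhs, A.Rhs) <
         std::tie(B.Predicate, B.Lhs, B.Rhs);
}

// Broken on purpose: '<=' makes cmpKeyBroken(A, A) true, which violates
// irreflexivity. Passing this to std::sort is undefined behavior.
inline bool cmpKeyBroken(const CmpKey &A, const CmpKey &B) {
  return A.Predicate <= B.Predicate;
}

// Sorts a copy of the keys with the valid comparator.
inline std::vector<CmpKey> sortedKeys(std::vector<CmpKey> Keys) {
  std::sort(Keys.begin(), Keys.end(), cmpKeyLess);
  assert(std::is_sorted(Keys.begin(), Keys.end(), cmpKeyLess));
  return Keys;
}
```

A quick sanity check for any comparator is that `Cmp(A, A)` is false for every `A`; some standard library implementations assert on this in debug mode.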

Differential Revision: https://reviews.llvm.org/D115268

2 years ago[reductions] Delete another piece of dead flag handling [NFC]
Philip Reames [Thu, 9 Dec 2021 18:52:35 +0000 (10:52 -0800)]
[reductions] Delete another piece of dead flag handling [NFC]

The code claimed to handle nsw/nuw, but those aren't passed via builder state and the explicit IR construction just above never sets them.

The only case this bit of code is actually relevant for is FMF flags.  However, dropPoisonGeneratingFlags currently doesn't know about FMF at all, so this was a noop.  It's also unneeded, as the caller explicitly configures the flags on the builder before this call, and the flags on the individual ops should be controlled by the intrinsic flags anyways.  If any of the flags aren't safe to propagate, the caller needs to make that change.

2 years ago[dsymutil][NFC] Fix typo in help message
Ellis Hoag [Thu, 9 Dec 2021 18:41:13 +0000 (10:41 -0800)]
[dsymutil][NFC] Fix typo in help message

Just a simple typo fix that allows me to test landing a commit now that
I have commit access.

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D115414

2 years ago[recurrence] Delete dead flag/fmf handling [NFC]
Philip Reames [Thu, 9 Dec 2021 18:42:18 +0000 (10:42 -0800)]
[recurrence] Delete dead flag/fmf handling [NFC]

The recurrence lowering code has handling which claims to be about flag intersection, but all the callers pass empty arrays to the arguments.  The sole exception is a caller of a method which has the argument, but no implementation.

I don't know what the intent was here, but it certainly doesn't actually do anything today.

2 years ago[asan] Run background thread for asan only on THUMB
Vitaly Buka [Wed, 8 Dec 2021 20:28:12 +0000 (12:28 -0800)]
[asan] Run background thread for asan only on THUMB

As in D114934; otherwise lsan crashes on the same bot.

2 years ago[sanitizer] Run Stack compression in background thread
Vitaly Buka [Tue, 23 Nov 2021 05:24:10 +0000 (21:24 -0800)]
[sanitizer] Run Stack compression in background thread

Depends on D114495.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D114498

2 years ago[instcombine] Do demanded elts last when visiting extractelement
Philip Reames [Thu, 9 Dec 2021 17:54:45 +0000 (09:54 -0800)]
[instcombine] Do demanded elts last when visiting extractelement

This reorders existing transforms to put demanded elements last. The reasoning here is that when we have an example which can be scalarized or handled via demanded bits, we should prefer scalarization as that doesn't require dropping flags on arithmetic instructions.

This doesn't show major changes in the tests today, but once I add support for fast math flags to dropPoisonGeneratingFlags this becomes glaringly obvious.

Differential Revision: https://reviews.llvm.org/D115394

2 years agoThread safety analysis: Remove unused variable. NFC.
Benjamin Kramer [Thu, 9 Dec 2021 17:56:13 +0000 (18:56 +0100)]
Thread safety analysis: Remove unused variable. NFC.

2 years agoRevert "[sanitizer] Run Stack compression in background thread"
Petr Hosek [Thu, 9 Dec 2021 17:55:49 +0000 (09:55 -0800)]
Revert "[sanitizer] Run Stack compression in background thread"

This reverts commit e5c2a46c5e8fc038b9f6c898df9628f9524dc10e as this
change introduced a linker error when building sanitizer runtimes:

  ld.lld: error: undefined symbol: __sanitizer::internal_start_thread(void* (*)(void*), void*)
  >>> referenced by sanitizer_stackdepot.cpp:133 (compiler-rt/lib/sanitizer_common/sanitizer_stackdepot.cpp:133)
  >>>               compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonSymbolizer.x86_64.dir/sanitizer_stackdepot.cpp.obj:(__sanitizer::(anonymous namespace)::CompressThread::NewWorkNotify())

2 years agoCompute estimated trip counts for multiple exit loops
Philip Reames [Thu, 9 Dec 2021 17:40:03 +0000 (09:40 -0800)]
Compute estimated trip counts for multiple exit loops

This change allows us to estimate trip count from profile metadata for all multiple exit loops. We still do the estimate only from the latch, but that's fine as it causes us to overestimate the trip count at worst.
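
A minimal sketch of a latch-only estimate, assuming it is computed as the backedge weight divided by the latch exit weight, rounded to nearest, plus one. The function name, the parameter names, and the use of 0 to mean "no estimate" are hypothetical and not taken from the patch.

```cpp
#include <cstdint>

// Estimate the trip count from the two branch weights on the loop latch.
// BackedgeWeight: profile weight of the edge that stays in the loop.
// LatchExitWeight: profile weight of the edge that leaves via the latch.
// Returns 0 when no estimate can be made (hypothetical convention).
inline uint64_t estimateTripCountFromLatch(uint64_t BackedgeWeight,
                                           uint64_t LatchExitWeight) {
  if (LatchExitWeight == 0)
    return 0; // The latch exit was never observed; nothing to divide by.
  // Backedge-taken count, rounded to the nearest integer.
  uint64_t BackedgeTaken =
      (BackedgeWeight + LatchExitWeight / 2) / LatchExitWeight;
  // The body runs once more than the backedge is taken.
  return BackedgeTaken + 1;
}
```

Because exits other than the latch are ignored, iterations that actually left through another exit are still attributed to the backedge, so the result can only be too high, not too low.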

Reviewing the uses of the API, all but one are cases where we restrict a loop transformation (unroll, and vectorize respectively) when we know the trip count is short enough. So, as a result, the change makes these passes strictly less aggressive. The test change illustrates a case where we'd previously have runtime unrolled a loop which ran fewer iterations than the unroll factor. This is definitely unprofitable.

The one case where an upper bound on the estimated trip count could drive a more aggressive transform is peeling, and I duplicated the logic being removed from the generic estimation there to keep it the same. The resulting heuristic makes no sense and should probably be immediately removed, but we can do that in a separate change.

This was noticed when analyzing regressions on D113939.

I plan to come back and incorporate estimated trip counts from other exits, but that's a minor improvement which can follow separately.

Differential Revision: https://reviews.llvm.org/D115362

2 years ago[InstCombine] (~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))
Stanislav Mekhanoshin [Wed, 17 Nov 2021 20:33:31 +0000 (12:33 -0800)]
[InstCombine] (~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))

Transform
```
(~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))
```
And swapped case:
```
(~a | b | c) & ~(a & b & c) -> ~a | (b ^ c)
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %b, %a
  %or2 = or i4 %or1, %c
  %not1 = xor i4 %or2, 15
  %not2 = xor i4 %a, 15
  %and1 = and i4 %b, %not2
  %and2 = and i4 %and1, %c
  %or3 = or i4 %and2, %not1
  ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %1 = xor i4 %c, %b
  %2 = or i4 %1, %a
  %or3 = xor i4 %2, 15
  ret i4 %or3
}
Transformation seems to be correct!
```
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %and2 = and i4 %and1, %c
  %not1 = xor i4 %and2, 15
  %not2 = xor i4 %a, 15
  %or1 = or i4 %not2, %b
  %or2 = or i4 %or1, %c
  %and3 = and i4 %or2, %not1
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %not = xor i4 %a, 15
  %or = or i4 %xor, %not
  ret i4 %or
}
Transformation seems to be correct!
```
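
The two identities can also be checked exhaustively outside of the verifier. The following is a small brute-force sketch over all 4-bit inputs, mirroring the `i4` type used above; the function names are hypothetical.

```cpp
#include <cassert>

// Keep only the low 4 bits, matching the i4 type in the proofs.
constexpr unsigned Mask = 0xF;

// (~a & b & c) | ~(a | b | c)
constexpr unsigned src1(unsigned a, unsigned b, unsigned c) {
  return ((~a & b & c) | ~(a | b | c)) & Mask;
}
// ~(a | (b ^ c))
constexpr unsigned tgt1(unsigned a, unsigned b, unsigned c) {
  return ~(a | (b ^ c)) & Mask;
}
// (~a | b | c) & ~(a & b & c)
constexpr unsigned src2(unsigned a, unsigned b, unsigned c) {
  return ((~a | b | c) & ~(a & b & c)) & Mask;
}
// ~a | (b ^ c)
constexpr unsigned tgt2(unsigned a, unsigned b, unsigned c) {
  return (~a | (b ^ c)) & Mask;
}

// Returns true if both folds agree with their sources on every i4 input.
inline bool foldsAgreeOnAllI4() {
  for (unsigned a = 0; a <= Mask; ++a)
    for (unsigned b = 0; b <= Mask; ++b)
      for (unsigned c = 0; c <= Mask; ++c)
        if (src1(a, b, c) != tgt1(a, b, c) ||
            src2(a, b, c) != tgt2(a, b, c))
          return false;
  return true;
}
```

Since all operations are bitwise, agreement on a single bit position already implies agreement at any width; the 4-bit loop is just a direct mirror of the proofs.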

Differential Revision: https://reviews.llvm.org/D112966

2 years agoPrevent abseil-cleanup-ctad check from stomping on surrounding context
CJ Johnson [Thu, 9 Dec 2021 17:28:07 +0000 (17:28 +0000)]
Prevent abseil-cleanup-ctad check from stomping on surrounding context

This change applies two fixes to the abseil-cleanup-ctad check. It uses hasSingleDecl() to ensure only declStmt()s with one varDecl() are matched (leaving compound declStmt()s unchanged). It also addresses a bug in the handling of comments that surround the absl::MakeCleanup() calls by switching to the callArgs() combinator from Clang Transformer.

Reviewed By: ymandel

Differential Revision: https://reviews.llvm.org/D115452

2 years ago[llvm] Use range-based for loops (NFC)
Kazu Hirata [Thu, 9 Dec 2021 17:37:29 +0000 (09:37 -0800)]
[llvm] Use range-based for loops (NFC)

2 years ago[mlir][Vector] Avoid infinite loop in InnerOuterDimReductionConversion.
MaheshRavishankar [Thu, 9 Dec 2021 17:18:00 +0000 (09:18 -0800)]
[mlir][Vector] Avoid infinite loop in InnerOuterDimReductionConversion.

This pattern tries to convert an inner (outer) dim reduction to an
outer (inner) dim reduction. Doing this on a 1D or 0D vector results
in an infinite loop since the converted op is the same as the original
operation. Just returning failure when source rank <= 1 fixes the
issue.

Differential Revision: https://reviews.llvm.org/D115426

2 years ago[fir] Keep runtime function name in comment
Valentin Clement [Thu, 9 Dec 2021 17:28:36 +0000 (18:28 +0100)]
[fir] Keep runtime function name in comment

Some of the function names in the comments didn't match
their actual names in the runtime. This patch fixes that.

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D115076

2 years agoRevert "tsan: new runtime (v3)"
Jonas Devlieghere [Thu, 9 Dec 2021 17:18:10 +0000 (09:18 -0800)]
Revert "tsan: new runtime (v3)"

This reverts commit 5a33e412815b8847610425a2a3b86d2c7c313b71 because it
breaks LLDB.

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/39208/

2 years ago[RISCV] Use MULHU for more division by constant cases.
Craig Topper [Thu, 9 Dec 2021 16:53:46 +0000 (08:53 -0800)]
[RISCV] Use MULHU for more division by constant cases.

D113805 improved handling of i32 divu/remu on RV64. The basic idea
from that can be extended to (mul (and X, C2), C1) where C2 is any
mask constant.

We can replace the and with an SLLI by shifting by the number of
leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C2)
bits. This will give the full product XLen additional trailing zeros,
putting the result in the output of MULHU. If we can't use ANDI,
ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register.
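
The arithmetic can be sketched in plain C++ for XLen = 64. This is an illustration only, not the isel code: it assumes C2 is a mask of contiguous low bits, and it takes the shift amount for C1 to be XLen minus the leading-zero count of C2, which is what gives the full product XLen trailing zeros. The helper names are hypothetical, and `unsigned __int128` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

// High 64 bits of the unsigned 128-bit product, i.e. what MULHU yields
// on RV64.
inline uint64_t mulhu(uint64_t A, uint64_t B) {
  return static_cast<uint64_t>((static_cast<unsigned __int128>(A) * B) >> 64);
}

// Number of leading zero bits in a non-zero 64-bit value.
inline unsigned lzcnt64(uint64_t V) {
  assert(V != 0);
  unsigned N = 0;
  while ((V >> 63) == 0) {
    V <<= 1;
    ++N;
  }
  return N;
}

// Reference form: (mul (and X, C2), C1).
inline uint64_t mulAndMask(uint64_t X, uint64_t C2, uint64_t C1) {
  return (X & C2) * C1;
}

// Rewritten form: shift X left instead of masking it, pre-shift C1, and
// read the result from the high half of the product.
inline uint64_t mulViaMulhu(uint64_t X, uint64_t C2, uint64_t C1) {
  assert(C2 != 0 && (C2 & (C2 + 1)) == 0); // C2 is a low-bit mask
  unsigned Lz = lzcnt64(C2);
  assert(Lz >= 1);            // otherwise there is nothing to mask
  assert((C1 >> Lz) == 0);    // shifted C1 must still fit in 64 bits
  uint64_t ShiftedX = X << Lz;           // drops the bits the AND would clear
  uint64_t ShiftedC1 = C1 << (64 - Lz);  // could be precomputed off the critical path
  return mulhu(ShiftedX, ShiftedC1);
}
```

With `C2 = 0xFFFFFFFF` the two shifts are both by 32, which corresponds to the i32 divu/remu case on RV64 that the message builds on.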

The downside is it may take 1 additional instruction to create C1.
But since that's not on the critical path, it can hopefully be
interleaved with other operations.

The previous tablegen pattern is replaced by custom isel code.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D115310

2 years ago[amdgpu][nfc] Drop dead PtrSet, fix a comment
Jon Chesterfield [Thu, 9 Dec 2021 16:47:41 +0000 (16:47 +0000)]
[amdgpu][nfc] Drop dead PtrSet, fix a comment

2 years ago[lldb] Remove unused lldb.cpp
Haojian Wu [Thu, 9 Dec 2021 11:10:58 +0000 (12:10 +0100)]
[lldb] Remove unused lldb.cpp

lldb.cpp is unused after https://github.com/llvm/llvm-project/commit/ccf1469a4cdb03cb2bc7868f76164e85d90ebee1

Differential Revision: https://reviews.llvm.org/D115438

2 years ago[NFC] Replace some deprecated getAlignment() calls with getAlign()
Arthur Eubanks [Wed, 8 Dec 2021 19:49:12 +0000 (11:49 -0800)]
[NFC] Replace some deprecated getAlignment() calls with getAlign()

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D115370

2 years ago[RISCV] Reduce duplicate FP test cases.
Craig Topper [Thu, 9 Dec 2021 16:04:43 +0000 (08:04 -0800)]
[RISCV] Reduce duplicate FP test cases.

-Remove feq, fle, flt tests from *-arith.ll in favor of *-fcmp.ll which tests all predicates.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D113703

2 years agoAvoid unnecessary output buffer allocation and initialization.
Bixia Zheng [Tue, 7 Dec 2021 23:15:42 +0000 (15:15 -0800)]
Avoid unnecessary output buffer allocation and initialization.

The sparse tensor code generator allocates memory for the output tensor. As
such, we only need to allocate a MemRefDescriptor to receive the output tensor
and do not need to allocate and initialize the storage for the tensor.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D115292

2 years ago[NFC][mlir][OpenMP] Added documentation for omp.atomic ops
Shraiysh Vaishay [Thu, 9 Dec 2021 14:28:25 +0000 (19:58 +0530)]
[NFC][mlir][OpenMP] Added documentation for omp.atomic ops

This patch adds the documentation for the operations `omp.atomic.read`,
`omp.atomic.write` and `omp.atomic.update`.

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D115445

2 years ago[AArch64][Analysis] Add on overhead costs for SVE gathers and scatters
David Sherwood [Mon, 6 Dec 2021 11:02:29 +0000 (11:02 +0000)]
[AArch64][Analysis] Add on overhead costs for SVE gathers and scatters

This patch adds on an overhead cost for gathers and scatters, which
is a rough estimate based on performance investigations I have
performed on SVE hardware for various micro-benchmarks.

Differential Revision: https://reviews.llvm.org/D115143

2 years ago[MLIR][GPU] Define gpu.printf op and its lowerings
Krzysztof Drewniak [Wed, 8 Dec 2021 23:28:06 +0000 (23:28 +0000)]
[MLIR][GPU] Define gpu.printf op and its lowerings

- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments.
- Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered.

This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.

And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels

This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.

In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110448

2 years ago[LoopVectorize][AArch64] Add vectoriser cost model tests for gathers/scatters
David Sherwood [Wed, 1 Dec 2021 14:55:10 +0000 (14:55 +0000)]
[LoopVectorize][AArch64] Add vectoriser cost model tests for gathers/scatters

I've added some tests that were previously missing for the gather-scatter costs
being calculated by the vectorizer for AArch64:

  Transforms/LoopVectorize/AArch64/sve-gather-scatter-cost.ll

The costs are sometimes different to the ones in

  Analysis/CostModel/AArch64/sve-gather.ll

because the vectorizer also adds on the address computation cost.

2 years agoRevert "[xray] add support for hexagon"
Brian Cain [Thu, 9 Dec 2021 15:30:27 +0000 (07:30 -0800)]
Revert "[xray] add support for hexagon"

This reverts commit 543a9ad7c460bb8d641b1b7c67bbc032c9bfdb45.

2 years ago[mlir] AsyncParallelFor: align block size to be a multiple of inner loops iterations
Eugene Zhulenev [Thu, 9 Dec 2021 11:21:22 +0000 (03:21 -0800)]
[mlir] AsyncParallelFor: align block size to be a multiple of inner loops iterations

Depends On D115263

By aligning the block size to the inner loop iterations in parallel_compute_fn, LLVM can later unroll and vectorize some of the inner loops that have small trip counts. Up to 2x speedup in multiple benchmarks.
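
A minimal sketch of the alignment arithmetic, assuming the block size is rounded up to the next multiple of the combined inner-loop iteration count. Whether the pass rounds up or down, and the names `alignBlockSize` and `InnerIterations`, are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Round BlockSize up to the nearest multiple of InnerIterations, so that
// every block covers whole runs of the inner loops.
inline int64_t alignBlockSize(int64_t BlockSize, int64_t InnerIterations) {
  assert(BlockSize > 0 && InnerIterations > 0);
  return ((BlockSize + InnerIterations - 1) / InnerIterations) *
         InnerIterations;
}
```

For example, a block size of 10 with 4 inner iterations becomes 12, so each block spans exactly three full inner loops and none is split across blocks.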

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D115436