Sanjay Patel [Tue, 23 Nov 2021 21:46:55 +0000 (16:46 -0500)]
[InstSimplify] fold xor logic of 2 variables
(a & b) ^ (~a | b) --> ~a
I was looking for a shortcut to reduce some of the complex logic
folds that are currently up for review (D113216
and others in that stack), and I found this missing from
instcombine/instsimplify.
There is a trade-off in putting it into instsimplify: because
we can't create new values here, we need a strict 'not' op (no
undef elements). Otherwise, the fold is not valid:
https://alive2.llvm.org/ce/z/k_AGGj
If this was in instcombine instead, we could create the proper
'not'. But having the fold here benefits other passes like GVN
that use instsimplify as an analysis.
There is a related fold where 'and' and 'or' are swapped, and
that is planned as a follow-up commit.
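The identity behind this fold can be checked by brute force on small bit widths. The sketch below is only an illustration and is not part of the patch; the helper name `fold_holds` is hypothetical, and it models plain integers, so it says nothing about the undef-element restriction discussed above.

```python
def fold_holds(width: int) -> bool:
    """Check (a & b) ^ (~a | b) == ~a for every pair of width-bit values."""
    mask = (1 << width) - 1
    for a in range(mask + 1):
        not_a = ~a & mask  # bitwise 'not' truncated to the chosen width
        for b in range(mask + 1):
            lhs = (a & b) ^ (not_a | b)
            if lhs != not_a:
                return False
    return True


# Exhaustively test all 1-bit through 6-bit operand pairs.
print(all(fold_holds(w) for w in range(1, 7)))
```

Because every bit position is independent, the 1-bit case already covers the whole truth table; the wider widths are just a sanity check.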
Differential Revision: https://reviews.llvm.org/D114462
Vitaly Buka [Tue, 23 Nov 2021 21:49:41 +0000 (13:49 -0800)]
[NFC][sanitizer] Make method const
Vitaly Buka [Tue, 23 Nov 2021 21:48:25 +0000 (13:48 -0800)]
[NFC][sanitizer] Extract StackTraceHeader struct
Rong Xu [Mon, 22 Nov 2021 22:03:32 +0000 (14:03 -0800)]
[SampleFDO] Recompute BFI if the sample loader changes BPI
The MIR sample loader changes the branch probability but not BFI.
Here we force a recompute of BFI if the branch probabilities are
changed.
Also register the MIR FSAFDO passes properly.
Differential Revision: https://reviews.llvm.org/D114400
Vitaly Buka [Tue, 16 Nov 2021 04:58:51 +0000 (20:58 -0800)]
[NFC][sanitizer] Add StackStoreTest
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D114463
Dimitry Andric [Tue, 23 Nov 2021 19:47:38 +0000 (20:47 +0100)]
[lldb] Move create_relative_symlink function up in CMake hierarchy
Configuring lldb with `LLDB_ENABLE_PYTHON=OFF` and `LLDB_ENABLE_LUA=ON` results in a CMake error:
CMake Error at lldb/bindings/lua/CMakeLists.txt:47 (create_relative_symlink):
Unknown CMake command "create_relative_symlink".
Call Stack (most recent call first):
lldb/CMakeLists.txt:117 (finish_swig_lua)
This is because the CMake function `create_relative_symlink` only exists in `lldb/bindings/python/CMakeLists.txt`, and not in `lldb/bindings/lua/CMakeLists.txt`.
Move the function to `lldb/bindings/CMakeLists.txt`, so it is available for all language bindings.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D114465
Vitaly Buka [Tue, 23 Nov 2021 20:51:12 +0000 (12:51 -0800)]
[NFC][sanitizer] Early return for empty StackTraces
Current callers should filter them out anyway,
but with this patch we don't need to rely on that assumption.
Vitaly Buka [Tue, 23 Nov 2021 20:41:28 +0000 (12:41 -0800)]
[NFC][sanitizer] Move StackStore::Allocated into cpp file
Sanjay Patel [Tue, 23 Nov 2021 17:10:03 +0000 (12:10 -0500)]
[InstSimplify] add tests for xor logic fold; NFC
Rob Suderman [Wed, 10 Nov 2021 22:02:54 +0000 (14:02 -0800)]
[mlir][tosa] Materialize tosa.pad value and fold noop pads
Padding now can explicitly specify the padding value when non-zero is wanted.
This also includes bypassing pads when the pad does nothing.
Differential Revision: https://reviews.llvm.org/D113611
Rob Suderman [Tue, 23 Nov 2021 03:43:06 +0000 (19:43 -0800)]
[mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support
Transpose convolution decomposition is now performed in a separate pass. This
allows padding / constant propagation to be performed at the TOSA level. It
also adds support for striding when there is no dilation.
Differential Revision: https://reviews.llvm.org/D114409
LLVM GN Syncbot [Tue, 23 Nov 2021 20:11:07 +0000 (20:11 +0000)]
[gn build] Port
1392b654ff65
Mehdi Amini [Tue, 23 Nov 2021 20:10:36 +0000 (20:10 +0000)]
Revert "profi - a flow-based profile inference algorithm: Part I (out of 3)"
This reverts commit
884b6dd311422bbfac62b8a90fbfff8e77ba8121.
The windows build is broken with a linker error.
MaheshRavishankar [Tue, 23 Nov 2021 18:21:52 +0000 (10:21 -0800)]
[mlir][Linalg] Add pad vectorization patterns into LinalgStrategyVectorize passes.
Add an option to control whether these patterns are added to the
pattern list or not.
Differential Revision: https://reviews.llvm.org/D114290
Mehrnoosh Heidarpour [Tue, 23 Nov 2021 18:50:13 +0000 (13:50 -0500)]
[InstCombine] Add test cases for D114339; NFC
Adding test cases for XOR logic folds with base result.
Differential Revision: https://reviews.llvm.org/D114436
LLVM GN Syncbot [Tue, 23 Nov 2021 19:09:46 +0000 (19:09 +0000)]
[gn build] Port
884b6dd31142
Quinn Pham [Thu, 18 Nov 2021 21:03:03 +0000 (15:03 -0600)]
[NFC][llvm] Inclusive language: remove instance of master in LiveRangeUtils.h
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with primary in `LiveRangeUtils.h`.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D114191
spupyrev [Tue, 23 Nov 2021 16:47:23 +0000 (08:47 -0800)]
profi - a flow-based profile inference algorithm: Part I (out of 3)
The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.
Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.
The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However, careful engineering and extensive evaluation show that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.
The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.
Reviewed By: wenlei, hoy
Differential Revision: https://reviews.llvm.org/D109860
wren romano [Thu, 18 Nov 2021 21:06:25 +0000 (13:06 -0800)]
[mlir][sparse] Moving integration tests that merely use the Python API
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D114192
Fangrui Song [Tue, 23 Nov 2021 18:30:11 +0000 (10:30 -0800)]
[ELF] Support non-RAX/non-adjacent R_X86_64_GOTPC32_TLSDESC/R_X86_64_TLSDESC_CALL
The current TLSDESC optimization code assumes:
```
leaq x@tlsdesc(%rip), %rax
call *x@tlscall(%rax) # adjacent
```
From https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665 , it seems that the
two instructions may not be adjacent in GCC 10's output:
```
leaq x@tlsdesc(%rip), %rax
something else
call *x@tlscall(%rax)
```
This patch supports the case. While here, support non-RAX registers for
R_X86_64_GOTPC32_TLSDESC, in case the compiler generates inefficient:
```
leaq x@tlsdesc(%rip), %rcx # or %rdx, %rbx, %rdi, ...
movq %rcx, %rax
call *x@tlscall(%rax) # GNU ld/gold error for non-RAX
```
Differential Revision: https://reviews.llvm.org/D114416
Zarko Todorovski [Tue, 23 Nov 2021 18:22:21 +0000 (13:22 -0500)]
[llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts
Reworded some comments and asserts to avoid usage of `sanity check/test`
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114372
Pirama Arumuga Nainar [Tue, 23 Nov 2021 18:03:04 +0000 (10:03 -0800)]
[compiler-rt/profile] Include __llvm_profile_get_magic in module signature
The INSTR_PROF_RAW_MAGIC_* number in profraw files should match during
profile merging. This causes an error with 32-bit and 64-bit variants
of the same code. The module signatures for the two binaries are
identical but they use different INSTR_PROF_RAW_MAGIC_* causing a
failure when profile-merging is used. Including it when computing the
module signature yields different signatures for the 32-bit and 64-bit
profiles.
Differential Revision: https://reviews.llvm.org/D114054
Philip Reames [Tue, 23 Nov 2021 17:57:30 +0000 (09:57 -0800)]
[indvars] Fix lftr crash when preheader is terminated by switch
This was found by oss-fuzz. The switch will get canonicalized to a branch, but if it hasn't been when we run LFTR, we crashed on an unneeded assert.
Nemanja Ivanovic [Tue, 23 Nov 2021 13:32:45 +0000 (07:32 -0600)]
[PowerPC] Add BCD add/sub/cmp builtins
Support for builtins that use bcdadd./bcdsub. to add/subtract
Binary Coded Decimal values as well as to determine validity
and compare BCD values.
Differential revision: https://reviews.llvm.org/D114088
Florian Hahn [Tue, 23 Nov 2021 17:37:12 +0000 (17:37 +0000)]
[LAA] Turn aggregate type check into assertion (NFCI).
getPtrStride should not be called with aggregate access types. There's
also an old TODO.
Turn the check into an assertion.
Philip Reames [Tue, 23 Nov 2021 17:18:28 +0000 (09:18 -0800)]
Revert "profi - a flow-based profile inference algorithm: Part I (out of 3)"
This reverts commit
b00fc198224efa038a7469e068dd920b3f1aba75. This change fails to build (link) on Ubuntu x86.
Philip Reames [Tue, 23 Nov 2021 17:10:41 +0000 (09:10 -0800)]
[unroll] Remove two dead variable assignments [nfc]
These variables are not out-params, and we immediately return after assigning them. Thus, the assignments are dead and just confusing.
I believe these used to be out-params, but they're not any more.
spupyrev [Tue, 23 Nov 2021 16:47:23 +0000 (08:47 -0800)]
profi - a flow-based profile inference algorithm: Part I (out of 3)
The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.
Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.
The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However, careful engineering and extensive evaluation show that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.
The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.
Reviewed By: wenlei, hoy
Differential Revision: https://reviews.llvm.org/D109860
Yaxun (Sam) Liu [Mon, 8 Nov 2021 21:20:22 +0000 (16:20 -0500)]
[HIP] Fix device stub name for Windows
This is a follow up of https://reviews.llvm.org/D68578
where device stub name is changed for Itanium
mangling but not Microsoft mangling.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D113491
Philip Reames [Tue, 23 Nov 2021 17:01:23 +0000 (09:01 -0800)]
[unroll] Use early return in shouldFullUnroll [nfc]
Dmitry Vyukov [Tue, 23 Nov 2021 10:50:49 +0000 (11:50 +0100)]
tsan: disable signal_sync2.cpp test on powerpc64
Fails 1 out of 10 runs on powerpc bots:
https://lab.llvm.org/buildbot/#/builders/121/builds/13391
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D114426
Dmitry Vyukov [Tue, 23 Nov 2021 15:58:32 +0000 (16:58 +0100)]
[lldb] Deflake TestTsanBasic.py
The test flaked on bots:
http://green.lab.llvm.org/green/job/lldb-cmake/38666/
The test expects that tsan will detect a single race
with concurrent memory accesses. TSan doesn't do this reliably.
Run 100 iterations of the racing threads, which should
make the race much more likely to be detected.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114444
Kazu Hirata [Tue, 23 Nov 2021 16:54:47 +0000 (08:54 -0800)]
[llvm] Use range-based for loops (NFC)
Paul Robinson [Tue, 23 Nov 2021 16:42:16 +0000 (08:42 -0800)]
[PS4][TLI] Remove redundant line
alex-t [Fri, 19 Nov 2021 17:27:35 +0000 (20:27 +0300)]
[AMDGPU] Enable fneg and fabs divergence-driven instruction selection.
Detailed description: We currently have a set of patterns to select ISD::FNEG and ISD::FABS to the bitwise operations. We need to make them predicated to select the VALU or SALU bitwise operation variant according to the SDNode divergence bit.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D114257
Yaxun (Sam) Liu [Thu, 4 Nov 2021 17:49:43 +0000 (13:49 -0400)]
[NFC] Let Microsoft mangler accept GlobalDecl
This is a follow up of https://reviews.llvm.org/D75700
where support of GlobalDecl with Microsoft mangler
is incomplete.
Reviewed by: Artem Belevich, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D113490
Yaxun (Sam) Liu [Tue, 23 Nov 2021 15:46:51 +0000 (10:46 -0500)]
Fix warning due to default switch label
Fix warning due to default label in switch which covers all enumeration values
Simon Moll [Tue, 23 Nov 2021 14:08:02 +0000 (15:08 +0100)]
[VP] Canonicalize macros of VPIntrinsics.def
Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>. Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro).
A follow-up patch has documentation on how the macros are (intended) to be used.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D114144
Gabor Marton [Thu, 11 Nov 2021 13:55:24 +0000 (14:55 +0100)]
[Analyzer][Core] Better simplification in SimpleSValBuilder::evalBinOpNN
Make the SValBuilder capable of simplifying existing
SVals based on newly added constraints when evaluating a BinOp.
Before this patch, we called `simplify` only in some edge cases.
However, we can and should investigate the constraints in all cases.
Differential Revision: https://reviews.llvm.org/D113753
Yaxun (Sam) Liu [Mon, 22 Nov 2021 19:37:02 +0000 (14:37 -0500)]
[HIP] Add HIP scope atomic operations
Add an AtomicScopeModel for HIP and support for OpenCL builtins
that are missing in HIP.
Patch by: Michael Liao
Revised by: Anshil Ghandi
Reviewed by: Yaxun Liu
Differential Revision: https://reviews.llvm.org/D113925
Jinsong Ji [Tue, 23 Nov 2021 15:08:49 +0000 (15:08 +0000)]
[PowerPC] Remove FreeBSD test in mm-malloc.c due to cross-compilation limitation
Fix failures on powerpc BE buildbots
https://lab.llvm.org/buildbot/#/builders/93/builds/6031
https://lab.llvm.org/buildbot/#/builders/100/builds/10836
https://lab.llvm.org/buildbot/#/builders/52/builds/12719
Sanjay Patel [Tue, 23 Nov 2021 14:50:24 +0000 (09:50 -0500)]
[InstCombine] enhance bitwise select matching
I noticed that adding a seemingly unrelated fold for xor caused
regressions on similar patterns, and this is one of the
underlying causes.
This could also be a variation for code as seen in:
https://llvm.org/PR34047
...although that exact example should be fixed after:
D113035 /
c36b7e21bd8f
The vector test shows that we are actually missing a potential
canonicalization for bitcast-of-sext-of-not or the inverse.
The scalar test shows that even if we had that canonicalization,
it would still be possible to see this pattern due to extra uses.
https://alive2.llvm.org/ce/z/y2BAgi
Sanjay Patel [Tue, 23 Nov 2021 13:55:49 +0000 (08:55 -0500)]
[InstCombine] add tests for logical select; NFC
Louis Dionne [Mon, 22 Nov 2021 20:40:12 +0000 (15:40 -0500)]
[libc++] Tidy up how %T and %t are created during configuration checks
Instead of having ad-hoc cleanup in various places, handle all creation
and removal of temporary files and directories inside _makeConfigTest.
As a fly-by, also remove testPrefix since we don't keep any source file
around anymore. Setting a prefix for the files is hence not useful anymore.
Differential Revision: https://reviews.llvm.org/D114390
David Green [Tue, 23 Nov 2021 14:24:58 +0000 (14:24 +0000)]
[ARM] Expand rev.ll test with more triples. NFC
Useful in showing Thumb2 and Thumb1 rev instructions as well as the arm
already tested, as well as testing the more canonical llvm.bswap.i16
form.
Zahira Ammarguellat [Tue, 23 Nov 2021 13:00:57 +0000 (08:00 -0500)]
Revert "The _Float16 type is supported on x86 systems with SSE2 enabled."
This reverts commit
6623c02d70c3732dbea59c6d79c69501baf9627b.
The change seems to be breaking build of compiler-rt on Debian.
Nicolas Vasilache [Tue, 23 Nov 2021 12:01:53 +0000 (12:01 +0000)]
[mlir][Vector] Thread 0-d vectors through InsertElementOp.
This revision makes concrete use of 0-d vectors to extend the semantics of
InsertElementOp.
Reviewed By: dcaballe, pifon2a
Differential Revision: https://reviews.llvm.org/D114388
Nicolas Vasilache [Tue, 23 Nov 2021 12:01:12 +0000 (12:01 +0000)]
[mlir][Vector] Thread 0-d vectors through ExtractElementOp.
This revision starts making concrete use of 0-d vectors to extend the semantics of
ExtractElementOp.
In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in.
Differential Revision: https://reviews.llvm.org/D114387
Matthias Springer [Tue, 23 Nov 2021 12:27:03 +0000 (21:27 +0900)]
[mlir][linalg][bufferize][NFC] Specify bufferize traversal in `bufferize`
The interface method `bufferize` controls how (and in what order) nested ops are traversed. This simplifies bufferization of scf::ForOps and scf::IfOps, which used to need special rules in scf::YieldOp.
Differential Revision: https://reviews.llvm.org/D114057
Diana Picus [Thu, 18 Nov 2021 12:40:48 +0000 (12:40 +0000)]
[fir] Set !fir.len_param_index conversion to unimplemented
This patch is part of the upstreaming effort from fir-dev.
The conversion of len_param_index in fir-dev is incomplete, so for now
we're marking this as unimplemented until we can settle on a design for
the runtime support of LEN parameters.
Differential Revision: https://reviews.llvm.org/D114241
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Tonko Sabolčec [Tue, 23 Nov 2021 11:43:46 +0000 (12:43 +0100)]
[lldb] Fix lookup for global constants in namespaces
LLDB uses the mangled name to construct a fully qualified name for global
variables. Sometimes the DW_AT_linkage_name attribute is missing from
debug info, so LLDB has to rely on parent entries to construct the
fully qualified name.
Currently, the fallback is handled when the parent DW_TAG is either
DW_TAG_compile_unit or DW_TAG_partial_unit, which may not work well
for global constants in namespaces. For example:
namespace ns {
const int x = 10;
}
may produce the following debug info:
<1><2a>: Abbrev Number: 2 (DW_TAG_namespace)
<2b> DW_AT_name : (indirect string, offset: 0x5e): ns
<2><2f>: Abbrev Number: 3 (DW_TAG_variable)
<30> DW_AT_name : (indirect string, offset: 0x61): x
<34> DW_AT_type : <0x3c>
<38> DW_AT_decl_file : 1
<39> DW_AT_decl_line : 2
<3a> DW_AT_const_value : 10
Since the fallback didn't handle the case when the parent tag is
DW_TAG_namespace, LLDB wasn't able to match the variable by its fully
qualified name "ns::x". This change fixes this with an additional check
for whether the parent is a DW_TAG_namespace.
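As a rough illustration of the fallback, the sketch below rebuilds a qualified name by walking parent entries. It is not LLDB's code: the `Die` class and `qualified_name` helper are hypothetical stand-ins for a debug info entry and the name-building logic, and only namespace parents are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    """Hypothetical, minimal model of a DWARF debugging information entry."""
    tag: str
    name: Optional[str] = None
    parent: Optional["Die"] = None


def qualified_name(die: Die) -> str:
    """Build a qualified name from enclosing DW_TAG_namespace parents."""
    parts = [die.name or ""]
    parent = die.parent
    # Stop at the first non-namespace parent (for example the compile unit).
    while parent is not None and parent.tag == "DW_TAG_namespace":
        parts.append(parent.name or "(anonymous namespace)")
        parent = parent.parent
    return "::".join(reversed(parts))


# Mirrors the debug info shown above: variable 'x' nested in namespace 'ns'.
cu = Die("DW_TAG_compile_unit")
ns = Die("DW_TAG_namespace", "ns", cu)
x = Die("DW_TAG_variable", "x", ns)
print(qualified_name(x))
```

A variable whose parent is the compile unit keeps its plain name, which corresponds to the case the fallback already handled.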
Reviewed By: werat, clayborg
Differential Revision: https://reviews.llvm.org/D112147
Jay Foad [Tue, 23 Nov 2021 11:33:10 +0000 (11:33 +0000)]
[AMDGPU] Fix the name of a test case
Dmitry Vyukov [Tue, 27 Apr 2021 11:55:41 +0000 (13:55 +0200)]
tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Differential Revision: https://reviews.llvm.org/D112603
mydeveloperday [Tue, 23 Nov 2021 10:43:27 +0000 (10:43 +0000)]
[clang-format] [NFC] build clang-format with -Wall
When building clang-format with -Wall on Visual Studio 2019 we see the following warning. Fix it, since it is the only -Wall warning:
```
..FormatTokenLexer.cpp(45) : warning C4868: compiler may not enforce left-to-right evaluation order in braced initializer list
```
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D113844
mydeveloperday [Tue, 23 Nov 2021 10:35:05 +0000 (10:35 +0000)]
[clang-format] [PR52527] can join * with /* to form an outside of comment error C4138
https://bugs.llvm.org/show_bug.cgi?id=52527
The following patch ensures there is always a space between * and /* to prevent transforming
```
void foo(* /* comment */)(int bar);
```
into
```
void foo(*/* comment */)(int bar);
```
Differential Revision: https://reviews.llvm.org/D114142
Evgeniy Brevnov [Mon, 22 Nov 2021 12:52:57 +0000 (19:52 +0700)]
[DSE][NFC] Introduce "doesn't overwrite" return code for isOverwrite
Add OR_None code to indicate that there is no overwrite. This has no effect on current uses but will be used in one of the next patches building support for PHI translation.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D105098
Florian Hahn [Tue, 23 Nov 2021 10:06:08 +0000 (10:06 +0000)]
[ThreadPool] Do not return shared futures.
The only user of the futures returned from ThreadPool is llvm-reduce after
D113857.
There should be no cases where multiple threads wait on the same future,
so there should be no need to return std::shared_future<>. Instead return
plain std::future<>.
If users need to share a future between multiple threads, they can share
the futures themselves.
Reviewed By: Meinersbur, mehdi_amini
Differential Revision: https://reviews.llvm.org/D114363
Alexander Belyaev [Tue, 23 Nov 2021 09:04:47 +0000 (10:04 +0100)]
Revert "Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td.""
This reverts and fixes commit
de18b7dee6a81e5e790c8e8060065b1ef72d13ed.
David Green [Tue, 23 Nov 2021 09:47:56 +0000 (09:47 +0000)]
[SDAG] Use UnknownSize for masked load/store MMO size
A masked load or store will load a potentially unknown number of bytes
from a memory location - that is not generally known at compile time.
They do not necessarily load/store the entire vector width, and treating
them as such can lead to incorrect aliasing information (for example, if
the underlying object is smaller than the size of the vector).
This makes sure that the MMO is given an unknown size to represent this,
which is less accurate than "may load/store from up to 16 bytes", but
less incorrect than "will load/store from 16 bytes".
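A simplified model of why the recorded size matters: alias analysis may assume that an access larger than an entire object cannot be an access to that object. The sketch below is an illustration under that assumption only; `may_access_object` is a hypothetical helper, not an LLVM API, and `None` stands in for an unknown size.

```python
from typing import Optional


def may_access_object(access_size: Optional[int], object_size: int) -> bool:
    """Return False only when the access provably cannot touch the object."""
    if access_size is None:
        # Unknown size: stay conservative and assume the access may touch it.
        return True
    # An access larger than the whole object is assumed not to be to it.
    return access_size <= object_size


# A masked load of a 16-byte vector from an 8-byte object, with only the
# first lanes enabled, really does read the object.
print(may_access_object(16, 8))    # full vector width recorded
print(may_access_object(None, 8))  # unknown size recorded
```

With the full vector width recorded, the model wrongly rules the object out; with an unknown size it keeps the conservative answer.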
Differential Revision: https://reviews.llvm.org/D113888
Qiu Chaofan [Tue, 23 Nov 2021 09:21:17 +0000 (17:21 +0800)]
[PowerPC] Implement more fusion types for Power10
This implements the rest of Power10 instruction fusion pairs, according
to user manual, including 'wide immediate', 'load compare', 'zero move'
and 'SHA3 assist'.
Only 'SHA3 assist' is enabled by default.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D112912
David Green [Tue, 23 Nov 2021 08:41:47 +0000 (08:41 +0000)]
[X86] Regenerate X86/vmaskmov-offset.ll check lines as per new mir format. NFC
David Green [Tue, 23 Nov 2021 08:27:06 +0000 (08:27 +0000)]
[ARM] Add an test for showing the incorrect aliasing info around masked loads/stores. NFC
Martin Storsjö [Sat, 13 Nov 2021 08:13:42 +0000 (10:13 +0200)]
[LLD] [COFF] Omit section symbols and IMAGE_SYM_CLASS_LABEL from the PE symbol table
The section symbols aren't of much practical use when looking at
a linked image. This shrinks one observed mingw style unstripped
binary by 14%.
IMAGE_SYM_CLASS_LABEL is in spirit the same as a temporary assembler
label that isn't emitted on the object file level at all.
Differential Revision: https://reviews.llvm.org/D113866
Martin Storsjö [Wed, 10 Nov 2021 17:10:43 +0000 (19:10 +0200)]
[AArch64] [COFF] Move jump tables back to the readonly section
This essentially reverts
f5884d255e78305d41c28c6e001a460ff83981d8
(D57277).
That commit was made as a workaround since LLVM back then didn't
support cross-section relative relocations (IMAGE_REL_ARM64_REL32)
in COFF for ARM64. Support for this was implemented later,
in
d5c5cf5ce8d921fc8c5e1b608c298a1ffa688d37 (D99572) and
382c505d9cfca8adaec47aea2da7bbcbc00fc05c (D102217).
The commit that moved jump tables to the function section noted
that it would be ideal to utilize IMAGE_REL_ARM64_REL32.
Differential Revision: https://reviews.llvm.org/D113576
Martin Storsjö [Sat, 20 Nov 2021 17:55:18 +0000 (19:55 +0200)]
[LLD] [COFF] Interpret the immediate in ARM64 adr/adrp relocations as signed 21 bit
This matches how MS link.exe interprets this relocation.
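For illustration, interpreting a 21-bit field as signed amounts to a sign extension. The sketch below is a generic helper (`sign_extend` is a hypothetical name) and does not model how the immediate bits are laid out in the instruction encoding.

```python
def sign_extend(value: int, bits: int = 21) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # Flipping the sign bit and subtracting it maps the upper half of the
    # unsigned range onto the negative numbers.
    return (value ^ sign_bit) - sign_bit


print(sign_extend(0x0FFFFF))  # largest positive 21-bit value
print(sign_extend(0x1FFFFF))  # all ones
```

Read as unsigned, the second value would be a large positive offset; read as signed it is a small negative one.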
Differential Revision: https://reviews.llvm.org/D114347
Martin Storsjö [Fri, 19 Nov 2021 23:05:07 +0000 (01:05 +0200)]
[COFF] [ARM64] Create symbols with regular intervals for relocations against temporary symbols
For relocations against temporary symbols (that don't persist in
the object file), we normally adjust them to reference the start of
the section.
For adrp relocations, the immediate offset from the referenced
symbol is stored in the opcode as the 21 bit signed immediate; this
means that the relocation target must be within +/- 1 MB of the
referenced symbol.
Create label symbols with regular intervals (1 MB intervals). For
relocations against temporary symbols, pick the preceding added
offset symbol and make the relocation against that instead of
against the start of the section.
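The selection of the preceding offset symbol can be sketched as follows. This is a minimal illustration, assuming the label symbols sit at exact multiples of 1 MB from the section start; the names `pick_offset_symbol` and `fits_signed` are hypothetical and the actual placement in the patch may differ.

```python
INTERVAL = 1 << 20  # 1 MB spacing between the added label symbols (assumed)


def pick_offset_symbol(target_offset: int, interval: int = INTERVAL):
    """Return (symbol_offset, addend) for a target inside a section."""
    symbol_offset = (target_offset // interval) * interval
    return symbol_offset, target_offset - symbol_offset


def fits_signed(value: int, bits: int = 21) -> bool:
    """Check whether value is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


target = 0x250000  # a temporary symbol about 2.3 MB into the section
symbol, addend = pick_offset_symbol(target)
print(hex(symbol), hex(addend), fits_signed(addend), fits_signed(target))
```

Relative to the section start the offset does not fit in the signed 21-bit immediate, while the addend relative to the preceding interval symbol always stays below 1 MB.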
This should fix the root issue behind
https://bugs.llvm.org/show_bug.cgi?id=52378.
Differential Revision: https://reviews.llvm.org/D114340
Nicolas Vasilache [Mon, 22 Nov 2021 10:22:37 +0000 (10:22 +0000)]
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:
```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```
The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.
The inline_asm version needs roughly 17% fewer cycles than the intrinsic version (2015 vs. 2415), as reported by llvm-mca:
After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations: 100
Instructions: 5900
Total Cycles: 2415
Total uOps: 7300
Dispatch Width: 6
uOps Per Cycle: 3.02
IPC: 2.44
Block RThroughput: 24.0
Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
Resource Pressure [ 89.65% ]
- SKXPort1 [ 0.04% ]
- SKXPort2 [ 12.42% ]
- SKXPort3 [ 12.42% ]
- SKXPort5 [ 89.52% ]
Data Dependencies: [ 37.06% ]
- Register Dependencies [ 37.06% ]
- Memory Dependencies [ 0.00% ]
```
After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations: 100
Instructions: 6300
Total Cycles: 2015
Total uOps: 7700
Dispatch Width: 6
uOps Per Cycle: 3.82
IPC: 3.13
Block RThroughput: 20.0
Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
Resource Pressure [ 83.18% ]
- SKXPort0 [ 14.49% ]
- SKXPort1 [ 14.54% ]
- SKXPort2 [ 19.70% ]
- SKXPort3 [ 19.70% ]
- SKXPort5 [ 83.03% ]
- SKXPort6 [ 14.49% ]
Data Dependencies: [ 39.75% ]
- Register Dependencies [ 39.75% ]
- Memory Dependencies [ 0.00% ]
```
An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).
Differential Revision: https://reviews.llvm.org/D114393
Sandeep Dasgupta [Tue, 23 Nov 2021 06:05:41 +0000 (06:05 +0000)]
[mlir] Refactoring a few Parser APIs
Refactored two new parser APIs parseGenericOperationAfterOperands and
parseCustomOperationName out of parseGenericOperation and parseCustomOperation.
Motivation: Sometimes an op can be printed in a special way if certain criteria
are met. While parsing, we need to handle all the versions.
`parseGenericOperationAfterOperands` is handy in situations where we have already
parsed the operands and decide to fall back to default parsing.
`parseCustomOperationName` is useful when we need to know details (dialect,
operation name etc.) about a parsed token meant to be an mlir operation.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D113719
Kazu Hirata [Tue, 23 Nov 2021 04:33:27 +0000 (20:33 -0800)]
[llvm] Use range-based for loops (NFC)
Matthias Springer [Tue, 23 Nov 2021 03:19:53 +0000 (12:19 +0900)]
[mlir][linalg][bufferize] Limited support for scf.execute_region
Add support for analysis only.
Differential Revision: https://reviews.llvm.org/D114055
Matthias Springer [Tue, 23 Nov 2021 02:20:27 +0000 (11:20 +0900)]
[mlir][linalg][bufferize][NFC] Move helper function to op interface
This is in preparation of changing the op traversal during bufferization.
Differential Revision: https://reviews.llvm.org/D114040
Matthias Springer [Tue, 23 Nov 2021 02:12:38 +0000 (11:12 +0900)]
[mlir][linalg][bufferize][NFC] Remove special casing of CallOps
Differential Revision: https://reviews.llvm.org/D113966
Matthias Springer [Tue, 23 Nov 2021 01:27:57 +0000 (10:27 +0900)]
[mlir][linalg][bufferize][NFC] Clean up headers and function visibility
Differential Revision: https://reviews.llvm.org/D113964
Walter Erquinigo [Tue, 23 Nov 2021 00:33:11 +0000 (16:33 -0800)]
Attempt to fix
e3dea5cf0e326366ab95a49d167fde8b0816e292
https://lab.llvm.org/buildbot/#/builders/17/builds/13728 found an issue
in the optional formatter.
Peter Klausler [Fri, 19 Nov 2021 23:17:55 +0000 (15:17 -0800)]
[flang] Correct the argument keyword for AIMAG(Z=...)
It was X= in the intrinsics table.
Differential Revision: https://reviews.llvm.org/D114296
Walter Erquinigo [Mon, 22 Nov 2021 21:46:49 +0000 (13:46 -0800)]
[formatters] Add a formatter for libstdc++ optional
Besides adding the formatter and the summary, this makes the libcxx
tests also work for this case.
This is the polished version of https://reviews.llvm.org/D114266,
authored by Danil Stefaniuc.
Differential Revision: https://reviews.llvm.org/D114403
Huihui Zhang [Mon, 22 Nov 2021 22:58:15 +0000 (14:58 -0800)]
[InstCombine] Enable fold select into operand for FAdd, FMul, FSub and FDiv.
For FAdd, FMul, FSub and FDiv, fold select into one of the operands to enable
further optimizations, i.e., floating-point reduction detection.
Turn code:
%C = fadd %A, %B
%D = select %cond, %C, %A
into:
%C = select %cond, %B, -0.000000e+00
%D = fadd %A, %C
Alive2 verification (with --disable-undef-input), timed out otherwise.
FAdd - https://alive2.llvm.org/ce/z/eUxN4Y
FMul - https://alive2.llvm.org/ce/z/5SWZz4
FSub - https://alive2.llvm.org/ce/z/Dhj8dU
FDiv - https://alive2.llvm.org/ce/z/Yj_NA2
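The FAdd case can be spot-checked on ordinary doubles. The sketch below is an illustration only, not a substitute for the Alive2 proofs; the helper names are hypothetical, and results are compared bit for bit so that the sign of zero matters (all NaNs are treated as equal).

```python
import math
import struct


def same(x: float, y: float) -> bool:
    """Bitwise equality for doubles, treating all NaNs as equal."""
    if math.isnan(x) or math.isnan(y):
        return math.isnan(x) and math.isnan(y)
    return struct.pack("<d", x) == struct.pack("<d", y)


def before(cond: bool, a: float, b: float) -> float:
    c = a + b               # %C = fadd %A, %B
    return c if cond else a  # %D = select %cond, %C, %A


def after(cond: bool, a: float, b: float) -> float:
    c = b if cond else -0.0  # %C = select %cond, %B, -0.0
    return a + c             # %D = fadd %A, %C


SAMPLES = [0.0, -0.0, 1.5, -2.25, 5e-324, math.inf, -math.inf, math.nan]


def fold_matches() -> bool:
    return all(
        same(before(c, a, b), after(c, a, b))
        for c in (False, True)
        for a in SAMPLES
        for b in SAMPLES
    )


print(fold_matches())
```

The constant has to be -0.0 rather than +0.0: adding +0.0 to -0.0 yields +0.0, so the unselected path would no longer return %A unchanged.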
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D113442
Peter Klausler [Fri, 19 Nov 2021 23:49:16 +0000 (15:49 -0800)]
[flang] Remove typo that affected complex namelist input
A recent patch to real/complex formatted input included what must
have been an editing hiccup: "++ ++p" instead of "++p". This
compiles, and it broke the consumption of the trailing ')' of a
complex value in namelist input by skipping over the character.
Extend existing test to cover this case.
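As a rough illustration of how one extra increment can make a scanner miss a delimiter, here is a toy Python parser for a `(re,im)` literal. It is not the flang runtime code: the function `read_complex`, its `buggy` flag, and the exact place where the increment sits are assumptions made for the sketch.

```python
def read_complex(text, buggy=False):
    """Parse a toy '(re,im)' literal one character at a time.

    Returns the parsed value, or None if the closing ')' is not found
    where the scanner expects it.
    """
    p = 0
    if text[p] != "(":
        return None
    p += 1
    start = p
    while text[p] != ",":
        p += 1
    re_part = float(text[start:p])
    p += 1                      # step over ','
    start = p
    while p + 1 < len(text) and text[p + 1] != ")":
        p += 1                  # stop on the last character of the imaginary part
    im_part = float(text[start:p + 1])
    p += 1                      # intended "++p": move onto ')'
    if buggy:
        p += 1                  # stray extra increment, as in "++ ++p"
    if p >= len(text) or text[p] != ")":
        return None             # ')' was skipped, so the value is rejected
    return complex(re_part, im_part)
```

With `buggy=False` the input `"(1.5,2.0)"` parses normally. With `buggy=True` the cursor lands one character past the `)` and the same input is rejected.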
Differential Revision: https://reviews.llvm.org/D114297
Shoaib Meenai [Mon, 22 Nov 2021 22:34:42 +0000 (14:34 -0800)]
[MachO] Fix struct size assertion
std::vector can have different sizes depending on the STL's debug level,
so account for its size separately. (You could argue that we should be
accounting for all the other members separately as well, but that would
be very unergonomic, and std::vector is the only one that's caused
problems so far.)
Jon Chesterfield [Mon, 22 Nov 2021 23:00:19 +0000 (23:00 +0000)]
[openmp][amdgpu] Make plugin robust to presence of explicit implicit arguments
OpenMP (compiler) does not currently request any implicit kernel
arguments. OpenMP (runtime) allocates and initialises a reasonable guess at
the implicit kernel arguments anyway.
This change makes the plugin check the number of explicit arguments, instead
of all arguments, and puts the pointer to hostcall buffer in both the current
location and at the offset expected when implicit arguments are added to the
metadata by D113538.
This is intended to keep things running while fixing the oversight in the
compiler (in D113538). Once that patch lands, and a following one marks
openmp kernels that use printf such that the backend emits an args element
with the right type (instead of hidden_node), the over-allocation can be
removed and the hardcoded 8*e+3 offset replaced with one read from the
.offset of the corresponding metadata element.
Reviewed By: estewart08
Differential Revision: https://reviews.llvm.org/D114274
Fangrui Song [Mon, 22 Nov 2021 21:59:23 +0000 (13:59 -0800)]
[ELF] Simplify a condition with config->copyRelocs. NFC
Benjamin Kramer [Mon, 22 Nov 2021 21:11:45 +0000 (22:11 +0100)]
[mlir][memref] Fix expanded shape ops memref.cast folding with changed type
`memref.expand_shape` has verification logic to make sure the
result dim is static if all the collapsing src dims are static.
This can be relaxed once expand_shape supports more dynamism.
Differential Revision: https://reviews.llvm.org/D114391
Jan Beich [Mon, 22 Nov 2021 16:32:58 +0000 (11:32 -0500)]
[Driver] Default to libc++ on FreeBSD
All supported FreeBSD releases use libc++, so default to it if the
target's major version is not specified.
Reviewed by: dim, emaste
Differential Revision: https://reviews.llvm.org/D77776
Christian Ulmann [Mon, 22 Nov 2021 21:30:02 +0000 (03:00 +0530)]
[mlir] FlatAffineConstraint parsing for unit tests
This patch adds functionality to parse FlatAffineConstraints from a
StringRef with the intention to be used for unit tests. This should
make the construction of FlatAffineConstraints easier for testing
purposes.
The patch contains an example usage of the functionality in a unit test that
uses FlatAffineConstraints.
Reviewed By: bondhugula, grosser
Differential Revision: https://reviews.llvm.org/D113275
Snehasish Kumar [Fri, 19 Nov 2021 21:13:02 +0000 (13:13 -0800)]
[memprof] Remove the "Live on exit:" print for text format.
We dropped the printing of live on exit blocks in rG1243cef245f6 -
the commit changed the insertOrMerge logic. Remove the message since it
is no longer needed (all live blocks are inserted into the hashmap
before serializing/printing the profile). Furthermore, the original
intent was to capture evicted blocks so it wasn't entirely correct.
Also update the binary format test invocation to remove the redundant
print_text directive now that it is the default.
Differential Revision: https://reviews.llvm.org/D114285
Groverkss [Mon, 22 Nov 2021 21:18:03 +0000 (02:48 +0530)]
[MLIR] Fix incorrect removal of source loop in loop fusion
This patch fixes a bug in loop fusion pass where the source loop was removed
even when the fused loop did not cover all iterations of the source loop.
This was because the fast heuristic check for whether the source loop and
the fused loop have the same iterations did not take the loop steps into account.
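To see why the step matters, consider two loops with identical bounds but different steps. The Python sketch below is only an illustration, not the MLIR implementation. It models a loop as a hypothetical `(lb, ub, step)` tuple, and the two check functions are assumed names.

```python
def iterations(lb, ub, step=1):
    # Induction-variable values visited by `for i = lb to ub step step`.
    return list(range(lb, ub, step))

def same_iterations_ignoring_step(src, fused):
    # Hypothetical fast check that compares only the bounds.
    return src[:2] == fused[:2]

def same_iterations(src, fused):
    # Check that also accounts for the step.
    return iterations(*src) == iterations(*fused)

src_loop = (0, 10, 1)     # visits 0, 1, 2, ..., 9
fused_loop = (0, 10, 2)   # visits 0, 2, 4, 6, 8
```

Here the bounds-only check reports the loops as equivalent, although the fused loop covers only half of the source loop's iterations, so removing the source loop would drop work.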
Reviewed By: dcaballe, bondhugula
Differential Revision: https://reviews.llvm.org/D114164
Bill Wendling [Mon, 22 Nov 2021 21:21:24 +0000 (13:21 -0800)]
[llvm-diff] Implement diff of PHI nodes
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114211
Florian Hahn [Mon, 22 Nov 2021 21:20:55 +0000 (21:20 +0000)]
[ThreadPool] Support returning futures with results.
This patch adjusts ThreadPool::async to return futures that wrap
the result type of the passed in callable.
To do so, ThreadPool::asyncImpl first creates a shared promise. The
result of the promise is set in a new callable that first executes the
task. The callable is added to the task queue.
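The pattern described above (a shared promise, a wrapper callable that runs the task and then sets the result, and a queue of wrappers) can be sketched in Python as an analogy. This is not the LLVM ThreadPool code: `TinyPool`, `async_` and `shutdown` are hypothetical names, `concurrent.futures.Future` stands in for the shared promise, and the exception forwarding is an addition of the sketch.

```python
import queue
import threading
from concurrent.futures import Future

class TinyPool:
    """Single-worker pool whose async_ returns a future for the task's result."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def async_(self, fn, *args):
        promise = Future()          # shared between the caller and the wrapper

        def wrapper():
            # Run the task first, then publish its result (or its exception).
            try:
                promise.set_result(fn(*args))
            except Exception as exc:
                promise.set_exception(exc)

        self._tasks.put(wrapper)    # the queue only holds no-argument callables
        return promise

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:        # sentinel: stop the worker
                break
            task()

    def shutdown(self):
        self._tasks.put(None)
        self._worker.join()
```

Because the wrapper takes no arguments and returns nothing, the task queue can keep a single callable type regardless of what each task returns. The typed result travels through the future instead.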
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114183
Matt Morehouse [Mon, 22 Nov 2021 21:12:47 +0000 (13:12 -0800)]
[HWASan] Move LTO test to separate file.
The test fails on Android for an unknown reason but is still worth
having for x86.
Walter Erquinigo [Mon, 22 Nov 2021 21:13:43 +0000 (13:13 -0800)]
Revert "[lldb] Load the fblldb module automatically"
This reverts commit 2e6a0a8b81d7be948491ce39d241695dc1385429.
It was pushed by mistake.
Danil Stefaniuc [Mon, 22 Nov 2021 20:54:28 +0000 (12:54 -0800)]
[formatters] Add a libstdcpp formatter for unordered_map, unordered_set, unordered_multimap, unordered_multiset
This diff adds a data formatter and tests for libstdcpp's unordered_map, unordered_set, unordered_multimap, unordered_multiset.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D113760
Walter Erquinigo [Thu, 24 Jun 2021 20:35:11 +0000 (13:35 -0700)]
[lldb] Load the fblldb module automatically
Summary:
```
// Facebook only:
// We want to load automatically the fblldb python module as soon as lldb or
// lldb-vscode start. This will ensure that logging and formatters are enabled
// by default.
//
// As we want to have a mechanism for not triggering this by default, if the
// user is starting lldb disabling .lldbinit support, then we also don't load
// this module. This is equivalent to appending this line to all .lldbinit
// files.
//
// We don't have the fblldb module on windows, so we don't include it for that
// build.
```
Test Plan:
the fbsymbols module is loaded automatically
```
./bin/lldb
(lldb) help fbsymbols
Facebook {mini,core}dump utility. Expects 'raw' input (see 'help raw-input'.)
```
Reviewers: wanyi
Reviewed By: wanyi
Subscribers: mnovakovic, serhiyr, phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D29372804
Tags: accept2ship
Signature: 29372804:1624567770:07836e50e576bd809124ed80a6bc01082190e48f
[lldb] Load fblldbinit instead of fblldb
Summary: Once accepted, I'll merge it with the existing commit in our branch so that we keep the commit list as short as possible.
Test Plan: https://www.internalfb.com/diff/D30293094
Reviewers: aadsm, wanyi
Reviewed By: aadsm
Subscribers: mnovakovic, serhiyr
Differential Revision: https://phabricator.intern.facebook.com/D30293211
Tags: accept2ship
Signature: 30293211:1628880953:423e2e543cade107df69da0ebf458e581e54ae3a
LLVM GN Syncbot [Mon, 22 Nov 2021 20:49:36 +0000 (20:49 +0000)]
[gn build] Port 8e2fd879e6f9
Haowei Wu [Fri, 19 Nov 2021 18:43:31 +0000 (10:43 -0800)]
[compiler-rt] Explicitly set dependency on libcxx for MemProfUnitTest
MemProfUnitTest now depends on libcxx but the dependency is not
explicitly expressed in build system, causing build races. This patch
addresses this issue.
Differential Revision: https://reviews.llvm.org/D114267
Peter Klausler [Mon, 22 Nov 2021 20:42:51 +0000 (12:42 -0800)]
[flang] Move IsCoarray() to fix shared library build
The predicate IsCoarray() needs to be in libFortranEvaluate so that
IsSaved() can call it without breaking the shared library build.
Pushed without pre-commit review as I'm moving code around and
the fix to the shared build is confirmed.
Alfredo Dal'Ava Junior [Mon, 22 Nov 2021 18:55:35 +0000 (18:55 +0000)]
[PowerPC] [Clang] Enable Intel intrinsics support on FreeBSD
This enables Intel intrinsics support on FreeBSD.
Thanks to @pkubaj, who noticed this feature was missing.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D113451
Quinn Pham [Fri, 19 Nov 2021 21:04:22 +0000 (15:04 -0600)]
[NFC][llvm] Inclusive language: replace master with main in 2007-04-02-RegScavengerAssert.ll
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `2007-04-02-RegScavengerAssert.ll`.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114276
Jay Foad [Mon, 22 Nov 2021 10:53:18 +0000 (10:53 +0000)]
[AMDGPU] Allow VOP3 source modifiers in fpow expansion
Differential Revision: https://reviews.llvm.org/D114353
Alexander Belyaev [Mon, 22 Nov 2021 20:35:20 +0000 (21:35 +0100)]
Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td."
This reverts commit 3028bca6a987e424365ca67f6dc29e037e52ea11.
For some reason using FallbackModel works with CMake and does not work
with bazel. Using `ExternalModel` works. I will check what's going on
and resubmit tomorrow.
Quinn Pham [Wed, 17 Nov 2021 18:21:58 +0000 (12:21 -0600)]
[NFC][clang] Inclusive language: rename master variable to controller in debug-info tests
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with controller in these tests.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D114108