Jeremy Morse [Wed, 24 Nov 2021 10:20:03 +0000 (10:20 +0000)]
[DebugInfo][InstrRef] Avoid crash when values optimised out late in sdag
It appears that we can emit all the instructions for a function, including
debug instructions, and then optimise some of the values out late.
Specifically, in the attached test case, an argument gets optimised out
after DBG_VALUE / DBG_INSTR_REFs are created. This confuses
MachineFunction::finalizeDebugInstrRefs, which expects to be able to find a
defining instruction, and crashes instead.
Fix this by identifying when there's no defining instruction, and
translating that instead into a DBG_VALUE $noreg.
Differential Revision: https://reviews.llvm.org/D114476
David Green [Wed, 24 Nov 2021 10:22:20 +0000 (10:22 +0000)]
[ARM] Fold floating point select(binop) patterns
Similar to D84091, which added extra predicated folds for integer operations
using the identity element of the operation, this adds them for floating
point operations of the form `BinOp(x, select(p, y, Identity))`. They are
folded back to predicated versions of the operator, with fadd having the
identity -0.0, fsub using the identity 0.0 and fmul using 1.0.
Differential Revision: https://reviews.llvm.org/D113574
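As an illustration of the identity elements named in the ARM select(binop) fold above, the following minimal Python sketch checks that `x op Identity` reproduces `x` exactly, including the sign of zero. It is not the ARM lowering; the helper names (`same_float`, `is_identity`, `IDENTITIES`, `SAMPLES`) are hypothetical and exist only for this example.

```python
import math

def same_float(a, b):
    """Exact comparison that tells -0.0 from 0.0 and treats NaN as equal to NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)

# (operation, identity) pairs as stated in the commit message.
IDENTITIES = {
    "fadd": (lambda x, y: x + y, -0.0),
    "fsub": (lambda x, y: x - y, 0.0),
    "fmul": (lambda x, y: x * y, 1.0),
}

# A few values chosen to exercise signed zeros, infinities and NaN.
SAMPLES = [0.0, -0.0, 1.5, -2.25, math.inf, -math.inf, math.nan]

def is_identity(op, identity):
    """True if `op(x, identity)` gives back x unchanged for every sample."""
    return all(same_float(op(x, identity), x) for x in SAMPLES)

if __name__ == "__main__":
    for name, (op, identity) in IDENTITIES.items():
        print(name, identity, is_identity(op, identity))
    # +0.0 is not an identity for fadd because -0.0 + 0.0 yields +0.0.
    print("fadd with +0.0:", is_identity(lambda x, y: x + y, 0.0))
```

The last check shows why fadd needs -0.0 rather than 0.0: adding +0.0 would turn a -0.0 input into +0.0.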
Dmitry Vyukov [Mon, 22 Nov 2021 14:44:00 +0000 (15:44 +0100)]
tsan: extend mmap test
Test size larger than clear_shadow_mmap_threshold,
which is handled differently.
Depends on D114348.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D114366
David Green [Wed, 24 Nov 2021 09:51:33 +0000 (09:51 +0000)]
[ARM] Add fma and update fadd/fmul predicated select tests. NFC
mydeveloperday [Wed, 24 Nov 2021 09:44:35 +0000 (09:44 +0000)]
[clang-format] NFC - recent changes caused clang-format to no longer be clang-formatted.
The following 2 commits caused files in clang-format to no longer be clang-formatted.
We would lose our "clean" status: https://releases.llvm.org/13.0.0/tools/clang/docs/ClangFormattedStatus.html
c2271926a4fc - Make clang-format fuzz through Lexing with asserts enabled (https://github.com/llvm/llvm-project/commit/c2271926a4fc)
84bf5e328664 - Fix various problems found by fuzzing. (https://github.com/llvm/llvm-project/commit/84bf5e328664)
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D114430
Matthias Springer [Wed, 24 Nov 2021 09:20:00 +0000 (18:20 +0900)]
[mlir][linalg][bufferize][NFC] Move tensor interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the tensor dialect.
Differential Revision: https://reviews.llvm.org/D114217
Florian Hahn [Wed, 24 Nov 2021 09:23:52 +0000 (09:23 +0000)]
[llvm-reduce] Add parallel chunk processing.
This patch adds parallel processing of chunks. When reducing very large
inputs, e.g. functions with 500k basic blocks, processing chunks in
parallel can significantly speed up the reduction.
To allow modifying clones of the original module in parallel, each clone
needs their own LLVMContext object. To achieve this, each job parses the
input module with their own LLVMContext. In case a job successfully
reduced the input, it serializes the result module as bitcode into a
result array.
To ensure parallel reduction produces the same results as serial
reduction, only the first successfully reduced result is used, and
results of other successful jobs are dropped. Processing resumes after
the chunk that was successfully reduced.
The number of threads to use can be configured using the -j option.
It defaults to 1, which means serial processing.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113857
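The "first successful result wins, then resume after that chunk" rule from the llvm-reduce commit above can be sketched in Python as follows. This is a toy model under stated assumptions, not llvm-reduce itself: the input is a plain list, a chunk is a contiguous slice to delete, and `is_interesting`, `try_remove` and `reduce_chunks` are hypothetical names. The per-job LLVMContext and bitcode serialization are not modelled.

```python
from concurrent.futures import ThreadPoolExecutor

def try_remove(items, chunk, is_interesting):
    """Return a copy of items without chunk if it is still interesting, else None."""
    lo, hi = chunk
    candidate = items[:lo] + items[hi:]
    return candidate if is_interesting(candidate) else None

def reduce_chunks(items, chunk_size, is_interesting, jobs=1):
    """Delete chunks while the result stays interesting; jobs=1 is serial."""
    items = list(items)
    pos = 0
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while pos < len(items):
            # Up to `jobs` consecutive chunks, all tested against the same input.
            starts = [pos + i * chunk_size for i in range(jobs)
                      if pos + i * chunk_size < len(items)]
            chunks = [(s, min(s + chunk_size, len(items))) for s in starts]
            current = items
            results = list(pool.map(
                lambda c: try_remove(current, c, is_interesting), chunks))
            for (lo, _hi), result in zip(chunks, results):
                if result is not None:
                    # Keep only the first success (in chunk order) and drop
                    # later ones, so the outcome matches the serial order.
                    items = result
                    pos = lo  # the next untested chunk now starts here
                    break
            else:
                pos = chunks[-1][1]  # nothing reducible in this batch
    return items

if __name__ == "__main__":
    keep_3_and_7 = lambda c: 3 in c and 7 in c
    print(reduce_chunks(range(10), 2, keep_3_and_7, jobs=1))
    print(reduce_chunks(range(10), 2, keep_3_and_7, jobs=4))
```

Every chunk before the first success failed against exactly the input the serial run would have used, which is why discarding the later successes keeps the two modes in agreement.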
Pavel Labath [Wed, 24 Nov 2021 08:59:16 +0000 (09:59 +0100)]
[lldb/gdb-remote] Remove more non-stop mode remnants
The read thread handling is completely dead code now that non-stop mode
no longer exists.
Rosie Sumpter [Tue, 12 Oct 2021 08:51:42 +0000 (09:51 +0100)]
[LoopVectorize][CostModel] Update cost model for fmuladd intrinsic
This patch updates the cost model for ordered reductions so that a call
to the llvm.fmuladd intrinsic is modelled as a normal fmul instruction
plus the cost of an ordered fadd reduction.
Differential Revision: https://reviews.llvm.org/D111630
Rosie Sumpter [Tue, 16 Nov 2021 11:52:19 +0000 (11:52 +0000)]
[LoopVectorize] Print fast-math flags for VPReductionRecipe
Rosie Sumpter [Wed, 3 Nov 2021 12:40:14 +0000 (12:40 +0000)]
[LoopVectorize] Propagate fast-math flags for VPInstruction
In-loop vector reductions which use the llvm.fmuladd intrinsic involve
the creation of two recipes; a VPReductionRecipe for the fadd and a
VPInstruction for the fmul. If the call to llvm.fmuladd has fast-math flags
these should be propagated through to the fmul instruction, so an
interface setFastMathFlags has been added to the VPInstruction class to
enable this.
Differential Revision: https://reviews.llvm.org/D113125
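The split described in the VPInstruction commit above (an fmul step feeding an in-order fadd reduction) can be illustrated with a small Python sketch. It assumes the unfused reading of llvm.fmuladd, where the multiply and the add are rounded separately; a fused multiply-add could round differently. The function names and the vector width `vf` are hypothetical.

```python
def scalar_fmuladd_reduction(a, b, start=0.0):
    """Scalar loop: acc = a[i] * b[i] + acc, with multiply and add unfused."""
    acc = start
    for x, y in zip(a, b):
        acc = x * y + acc
    return acc

def in_loop_ordered_reduction(a, b, vf=4, start=0.0):
    """Per vector iteration: one 'vector fmul', then an in-order fadd reduction."""
    acc = start
    for i in range(0, len(a), vf):
        products = [x * y for x, y in zip(a[i:i + vf], b[i:i + vf])]  # fmul step
        for p in products:                                            # ordered fadd
            acc = p + acc
    return acc

if __name__ == "__main__":
    a = [0.1, 0.2, 0.3, 1e16, -1e16, 0.7, 3.0]
    b = [1.5, 2.5, 3.5, 1.0, 1.0, 0.25, 1e-3]
    print(scalar_fmuladd_reduction(a, b) == in_loop_ordered_reduction(a, b))
```

Because the additions happen in the original order, the two functions agree exactly, which is the property an ordered (strict) reduction has to preserve.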
Rosie Sumpter [Mon, 11 Oct 2021 14:50:44 +0000 (15:50 +0100)]
[LoopVectorize] Add vector reduction support for fmuladd intrinsic
Enables LoopVectorize to handle reduction patterns involving the
llvm.fmuladd intrinsic.
Differential Revision: https://reviews.llvm.org/D111555
Butygin [Fri, 19 Nov 2021 22:56:23 +0000 (01:56 +0300)]
[mlir][scf] Canonicalize scf.while with unused results
Differential Revision: https://reviews.llvm.org/D114291
Clement Courbet [Fri, 19 Nov 2021 15:42:32 +0000 (16:42 +0100)]
[clang-tidy] performance-unnecessary-copy-initialization: Fix false negative.
`isConstRefReturningMethodCall` should be considering
`CXXOperatorCallExpr` in addition to `CXXMemberCallExpr`. Clang considers
these to be distinct (`CXXOperatorCallExpr` derives from `CallExpr`, not
`CXXMemberCallExpr`), but we don't care in the context of this
check.
This is important because of
`std::vector<Expensive>::operator[](size_t) const`.
Differential Revision: https://reviews.llvm.org/D114249
Vitaly Buka [Wed, 24 Nov 2021 06:12:31 +0000 (22:12 -0800)]
[sanitizer] Add Abs<T>
Abinav Puthan Purayil [Mon, 8 Nov 2021 05:35:22 +0000 (11:05 +0530)]
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift
from OpenCL front-ends. This change copies the
X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in
the shift PatFrag predicates.
Differential Revision: https://reviews.llvm.org/D113448
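The idea behind an "unneeded shift mask" in the AMDGPU commit above can be sketched in Python. The sketch assumes a 32-bit shift whose hardware reads only the low 5 bits of the shift amount, and it covers only the constant-mask case; the real isUnneededShiftMask() may use additional information. All function names here are hypothetical.

```python
def count_trailing_ones(value):
    """Number of consecutive 1 bits starting at bit 0 of a non-negative integer."""
    if value < 0:
        raise ValueError("mask must be non-negative")
    n = 0
    while value & 1:
        value >>= 1
        n += 1
    return n

def is_unneeded_shift_mask(mask, amount_bits):
    """A mask is redundant if it keeps every bit the shift actually reads."""
    return count_trailing_ones(mask) >= amount_bits

def hw_shl32(value, amount):
    """32-bit left shift that, like the assumed hardware, uses amount mod 32."""
    return (value << (amount & 31)) & 0xFFFFFFFF

if __name__ == "__main__":
    for mask in (31, 63, 15):
        redundant = is_unneeded_shift_mask(mask, 5)
        same = all(hw_shl32(0x12345678, s & mask) == hw_shl32(0x12345678, s)
                   for s in range(256))
        print(mask, redundant, same)
```

For masks 31 and 63 the masked and unmasked shifts always agree, so the `and` can be dropped; for mask 15 they differ (for example at amount 16), so the mask has to stay.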
Igor Kudrin [Wed, 24 Nov 2021 05:17:03 +0000 (12:17 +0700)]
[ELF] Support the "read-only" memory region attribute
The attribute 'r' allows (or disallows for the negative case) read-only
sections, i.e. ones without the SHF_WRITE flag, to be assigned to the
memory region. Before the patch, lld could put a section in the wrong
region or fail with "error: no memory region specified for section".
Differential Revision: https://reviews.llvm.org/D113771
Vitaly Buka [Wed, 24 Nov 2021 04:05:25 +0000 (20:05 -0800)]
[sanitizer] Fail instead of crash without real_pthread_create
Bixia Zheng [Mon, 22 Nov 2021 23:24:52 +0000 (15:24 -0800)]
Accept symmetric sparse matrix in Matrix Market Exchange Format.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D114402
Weverything [Wed, 24 Nov 2021 01:51:32 +0000 (17:51 -0800)]
Revert "tsan: new runtime (v3)"
This reverts commit ebd47b0fb78fa11758da6ffcd3e6b415cbb8fa28.
This was causing unexpected behavior in programs.
Uday Bondhugula [Mon, 22 Nov 2021 10:52:41 +0000 (16:22 +0530)]
[MLIR] Remove duplicate `Pass` suffix from ViewOpGraph class name
Remove duplicate `Pass` suffix from view-op-graph pass class name. The
extra suffix would lead to methods like registerViewOpGraphPassPass
being generated.
Differential Revision: https://reviews.llvm.org/D114459
wren romano [Mon, 22 Nov 2021 21:14:17 +0000 (13:14 -0800)]
[mlir][sparse] Adding wrappers for constantOverheadTypeEncoding
Minor code cleanup
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D114392
Jun Ma [Wed, 24 Nov 2021 02:10:22 +0000 (10:10 +0800)]
Revert "[Taildup] Don't tail-duplicate loop header with multiple successors as its latches"
This reverts commit 1f9fa549841a2ec55aa5a131bfaf83f0383c4713.
Jun Ma [Wed, 24 Nov 2021 02:09:53 +0000 (10:09 +0800)]
Revert "Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""""
This reverts commit c93f93b2e3f28997f794265089fb8138dd5b5f13.
Mehdi Amini [Tue, 23 Nov 2021 06:35:37 +0000 (06:35 +0000)]
Update fir.insert_on_range syntax to make the range more explicit (NFC)
Also replace ArrayAttr with IndexElementsAttr to model subscript dimensions.
An array of attributes is sparse, inefficient storage, with an API that
requires unpacking/repacking integers at every call site.
Instead, we can store a dense array of integers as IndexElementsAttr.
Reviewed By: clementval, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D112899
Zequan Wu [Tue, 23 Nov 2021 20:27:26 +0000 (12:27 -0800)]
[LLDB][NativePDB] Allow find functions by full names
I don't see a reason not to. If we allow looking up functions by full names,
I can change the test case in D113930 to use `lldb-test symbols --find=function --name=full::name --function-flags=full ...`,
though the duplicate method decl problem is still there for `lldb-test symbols --dump-ast`.
That's a separate bug; we can fix it later.
Differential Revision: https://reviews.llvm.org/D114467
Vitaly Buka [Wed, 24 Nov 2021 00:52:02 +0000 (16:52 -0800)]
[NFC][sanitizer] Limit StackStore stack size/tag to 1 byte
Nothing uses more than 8 bits now, so the rest of the header can store other data.
kStackTraceMax is 256 now, but all sanitizers by default store just 20-30 frames here.
Vitaly Buka [Sun, 21 Nov 2021 00:46:27 +0000 (16:46 -0800)]
[NFC][sanitizer] Test for b80affb8a149
Stanislav Mekhanoshin [Fri, 19 Nov 2021 22:42:29 +0000 (14:42 -0800)]
[AMDGPU] Remove a no-op check in the gfx90a hazard recognizer
Also rename helper function accordingly.
Differential Revision: https://reviews.llvm.org/D114289
Butygin [Thu, 28 Oct 2021 16:04:35 +0000 (19:04 +0300)]
[mlir][spirv] Add math to OpenCL conversion
Differential Revision: https://reviews.llvm.org/D113780
Florian Mayer [Mon, 22 Nov 2021 23:49:54 +0000 (15:49 -0800)]
[hwasan] support python3 in hwasan_sanitize
Verified that no diff exists between the previous version, the new version under python 2, and the new version under python 3 for an example stack.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D114404
Florian Mayer [Wed, 3 Nov 2021 00:29:13 +0000 (00:29 +0000)]
[stack-safety] Check SCEV constraints at memory instructions.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D113160
Vitaly Buka [Tue, 23 Nov 2021 23:16:29 +0000 (15:16 -0800)]
[NFC][sanitizer] Reuse forEach for operator==
Vitaly Buka [Tue, 23 Nov 2021 03:38:44 +0000 (19:38 -0800)]
[sanitizer] Add DenseMap::forEach
Nemanja Ivanovic [Tue, 23 Nov 2021 22:45:34 +0000 (16:45 -0600)]
[PowerPC] Allow scalars for asm constraint "v" with VSX
Similarly to what GCC does, we should allow scalars with
the "v" constraint rather than introducing unnecessary
new constraints for scalars in Altivec registers.
Differential revision: https://reviews.llvm.org/D113635
Matt Arsenault [Thu, 4 Nov 2021 01:01:53 +0000 (21:01 -0400)]
PrologEpilogInserter: Use explicit control for scavenge slot placement
AMDGPU is unusual in that the stack is indexed in the same
direction as stack growth (up). We therefore always need the emergency
stack slots placed as low as possible to ensure they are in range of
load/store instruction immediate offsets. The existing logic is mostly
OK, but failed if we required stack realignment.
I don't understand what the existing control isFPCloseToIncomingSP is
supposed to mean, but it can only be used to stop placing the scavenge
slots earlier. Make this explicit so that targets can opt-in rather
than opt-out only.
Florian Hahn [Tue, 23 Nov 2021 22:47:26 +0000 (22:47 +0000)]
[LAA] Move visitPointers up in file (NFC).
This allows easier re-use in earlier functions.
Walter Erquinigo [Tue, 23 Nov 2021 22:23:34 +0000 (14:23 -0800)]
Fix a48501150b9ef64fd61d24f8cef2645237facc44
Issue in https://lab.llvm.org/buildbot/#/builders/96/builds/14682.
Making the test deterministic.
Danil Stefaniuc [Tue, 23 Nov 2021 22:11:42 +0000 (14:11 -0800)]
[formatters] List and forward_list capping_size determination and application
This diff adds capping_size determination for the list and forward_list formatters, to limit the number of children displayed. It also modifies and unifies the tests for the libcxx and libstdcpp list data formatters.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D114433
Rahul Joshi [Tue, 23 Nov 2021 21:25:26 +0000 (13:25 -0800)]
Move dependency llvm:AllTargetsAsmParsers from Translation to ExecutionEngine.
- Fixes a minor issue in https://reviews.llvm.org/D114338, which seems to have incorrectly
added the llvm:AllTargetsAsmParsers dependency to Translation in the bazel build files.
Differential Revision: https://reviews.llvm.org/D114471
Danil Stefaniuc [Tue, 23 Nov 2021 22:02:05 +0000 (14:02 -0800)]
[formatters] Capping size limitation avoidance for the libcxx and libcpp bitset data formatters.
This diff is avoiding the size limitation introduced by the capping size for the libcxx and libcpp bitset data formatters.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D114461
Walter Erquinigo [Tue, 23 Nov 2021 17:32:30 +0000 (09:32 -0800)]
Make some libstd++ formatters safer
We need to add checks that ensure that some core variables are valid, so
that we avoid printing out garbage data. The worst that could happen is
that a non-initialized variable is being printed as something with
123123432 children instead of 0.
Differential Revision: https://reviews.llvm.org/D114458
Walter Erquinigo [Tue, 23 Nov 2021 17:16:59 +0000 (09:16 -0800)]
Improve optional formatter
As suggested by @labath in https://reviews.llvm.org/D114403, we should
make the formatter more resilient to corrupted data. The Libcxx version
explicitly checks for engaged = 1, so we can do that as well for safety.
Differential Revision: https://reviews.llvm.org/D114450
Sanjay Patel [Tue, 23 Nov 2021 21:46:55 +0000 (16:46 -0500)]
[InstSimplify] fold xor logic of 2 variables
(a & b) ^ (~a | b) --> ~a
I was looking for a shortcut to reduce some of the complex logic
folds that are currently up for review (D113216 and others in that
stack), and I found this missing from
instcombine/instsimplify.
There is a trade-off in putting it into instsimplify: because
we can't create new values here, we need a strict 'not' op (no
undef elements). Otherwise, the fold is not valid:
https://alive2.llvm.org/ce/z/k_AGGj
If this was in instcombine instead, we could create the proper
'not'. But having the fold here benefits other passes like GVN
that use instsimplify as an analysis.
There is a related fold where 'and' and 'or' are swapped, and
that is planned as a follow-up commit.
Differential Revision: https://reviews.llvm.org/D114462
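The InstSimplify fold above, `(a & b) ^ (~a | b) --> ~a`, can be checked exhaustively on small integers. The following Python sketch does so for all 8-bit values; the width and the names `lhs`, `rhs` and `fold_holds` are arbitrary choices for this example, and the undef-element caveat from the commit message is not modelled.

```python
MASK = 0xFF  # model 8-bit integers; the identity is bitwise, so width is arbitrary

def lhs(a, b):
    """(a & b) ^ (~a | b), truncated to 8 bits."""
    return ((a & b) ^ ((~a & MASK) | b)) & MASK

def rhs(a):
    """~a, truncated to 8 bits."""
    return ~a & MASK

def fold_holds():
    """Check the fold for every pair of 8-bit inputs."""
    return all(lhs(a, b) == rhs(a) for a in range(256) for b in range(256))

if __name__ == "__main__":
    print(fold_holds())
```

Per bit the reasoning is short: if a is 0, the expression is `0 ^ 1 = 1`; if a is 1, it is `b ^ b = 0`. Both match `~a`.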
Vitaly Buka [Tue, 23 Nov 2021 21:49:41 +0000 (13:49 -0800)]
[NFC][sanitizer] Make method const
Vitaly Buka [Tue, 23 Nov 2021 21:48:25 +0000 (13:48 -0800)]
[NFC][sanitizer] Extract StackTraceHeader struct
Rong Xu [Mon, 22 Nov 2021 22:03:32 +0000 (14:03 -0800)]
[SampleFDO] Recompute BFI if the sample loader changes BPI
The MIR sample loader changes the branch probability but not BFI.
Here we force a recompute of BFI if the branch probabilities are
changed.
Also register the MIR FSAFDO passes properly.
Differential Revision: https://reviews.llvm.org/D114400
Vitaly Buka [Tue, 16 Nov 2021 04:58:51 +0000 (20:58 -0800)]
[NFC][sanitizer] Add StackStoreTest
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D114463
Dimitry Andric [Tue, 23 Nov 2021 19:47:38 +0000 (20:47 +0100)]
[lldb] Move create_relative_symlink function up in CMake hierarchy
Configuring lldb with `LLDB_ENABLE_PYTHON=OFF` and `LLDB_ENABLE_LUA=ON` results in a CMake error:
CMake Error at lldb/bindings/lua/CMakeLists.txt:47 (create_relative_symlink):
Unknown CMake command "create_relative_symlink".
Call Stack (most recent call first):
lldb/CMakeLists.txt:117 (finish_swig_lua)
This is because the CMake function `create_relative_symlink` only exists in `lldb/bindings/python/CMakeLists.txt`, and not in `lldb/bindings/lua/CMakeLists.txt`.
Move the function to `lldb/bindings/CMakeLists.txt`, so it is available for all language bindings.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D114465
Vitaly Buka [Tue, 23 Nov 2021 20:51:12 +0000 (12:51 -0800)]
[NFC][sanitizer] Early return for empty StackTraces
Current callers should filter them out anyway,
but with this patch we don't need to rely on that assumption.
Vitaly Buka [Tue, 23 Nov 2021 20:41:28 +0000 (12:41 -0800)]
[NFC][sanitizer] Move StackStore::Allocated into cpp file
Sanjay Patel [Tue, 23 Nov 2021 17:10:03 +0000 (12:10 -0500)]
[InstSimplify] add tests for xor logic fold; NFC
Rob Suderman [Wed, 10 Nov 2021 22:02:54 +0000 (14:02 -0800)]
[mlir][tosa] Materialize tosa.pad value and fold noop pads
Padding now can explicitly specify the padding value when non-zero is wanted.
This also includes bypassing pads when the pad does nothing.
Differential Revision: https://reviews.llvm.org/D113611
Rob Suderman [Tue, 23 Nov 2021 03:43:06 +0000 (19:43 -0800)]
[mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support
Transpose convolution decomposition is now performed in a separate pass. This
allows padding / constant propagation to be performed at the TOSA level. It
also adds support for striding when there is no dilation.
Differential Revision: https://reviews.llvm.org/D114409
LLVM GN Syncbot [Tue, 23 Nov 2021 20:11:07 +0000 (20:11 +0000)]
[gn build] Port 1392b654ff65
Mehdi Amini [Tue, 23 Nov 2021 20:10:36 +0000 (20:10 +0000)]
Revert "profi - a flow-based profile inference algorithm: Part I (out of 3)"
This reverts commit 884b6dd311422bbfac62b8a90fbfff8e77ba8121.
The windows build is broken with a linker error.
MaheshRavishankar [Tue, 23 Nov 2021 18:21:52 +0000 (10:21 -0800)]
[mlir][Linalg] Add pad vectorization patterns into LinalgStrategyVectorize passes.
Add an option to control whether these patterns are added to the
pattern list or not.
Differential Revision: https://reviews.llvm.org/D114290
Mehrnoosh Heidarpour [Tue, 23 Nov 2021 18:50:13 +0000 (13:50 -0500)]
[InstCombine] Add test cases for D114339; NFC
Adding test cases for XOR logic folds with base result.
Differential Revision: https://reviews.llvm.org/D114436
LLVM GN Syncbot [Tue, 23 Nov 2021 19:09:46 +0000 (19:09 +0000)]
[gn build] Port 884b6dd31142
Quinn Pham [Thu, 18 Nov 2021 21:03:03 +0000 (15:03 -0600)]
[NFC][llvm] Inclusive language: remove instance of master in LiveRangeUtils.h
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with primary in `LiveRangeUtils.h`.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D114191
spupyrev [Tue, 23 Nov 2021 16:47:23 +0000 (08:47 -0800)]
profi - a flow-based profile inference algorithm: Part I (out of 3)
The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.
Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.
The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However, careful engineering and extensive evaluation show that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.
The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.
Reviewed By: wenlei, hoy
Differential Revision: https://reviews.llvm.org/D109860
wren romano [Thu, 18 Nov 2021 21:06:25 +0000 (13:06 -0800)]
[mlir][sparse] Moving integration tests that merely use the Python API
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D114192
Fangrui Song [Tue, 23 Nov 2021 18:30:11 +0000 (10:30 -0800)]
[ELF] Support non-RAX/non-adjacent R_X86_64_GOTPC32_TLSDESC/R_X86_64_TLSDESC_CALL
The current TLSDESC optimization code assumes:
```
leaq x@tlsdesc(%rip), %rax
call *x@tlscall(%rax) # adjacent
```
From https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665 , it seems that the
two instructions may not be adjacent in GCC 10's output:
```
leaq x@tlsdesc(%rip), %rax
something else
call *x@tlscall(%rax)
```
This patch supports the case. While here, support non-RAX registers for
R_X86_64_GOTPC32_TLSDESC, in case the compiler generates inefficient:
```
leaq x@tlsdesc(%rip), %rcx # or %rdx, %rbx, %rdi, ...
movq %rcx, %rax
call *x@tlscall(%rax) # GNU ld/gold error for non-RAX
```
Differential Revision: https://reviews.llvm.org/D114416
Zarko Todorovski [Tue, 23 Nov 2021 18:22:21 +0000 (13:22 -0500)]
[llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts
Reworded some comments and asserts to avoid usage of `sanity check/test`
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114372
Pirama Arumuga Nainar [Tue, 23 Nov 2021 18:03:04 +0000 (10:03 -0800)]
[compiler-rt/profile] Include __llvm_profile_get_magic in module signature
The INSTR_PROF_RAW_MAGIC_* number in profraw files should match during
profile merging. This causes an error with 32-bit and 64-bit variants
of the same code. The module signatures for the two binaries are
identical but they use different INSTR_PROF_RAW_MAGIC_* causing a
failure when profile-merging is used. Including it when computing the
module signature yields different signatures for the 32-bit and 64-bit
profiles.
Differential Revision: https://reviews.llvm.org/D114054
Philip Reames [Tue, 23 Nov 2021 17:57:30 +0000 (09:57 -0800)]
[indvars] Fix lftr crash when preheader is terminated by switch
This was found by oss-fuzz. The switch will get canonicalized to a branch, but if it hasn't been when we run LFTR, we crashed on an unneeded assert.
Nemanja Ivanovic [Tue, 23 Nov 2021 13:32:45 +0000 (07:32 -0600)]
[PowerPC] Add BCD add/sub/cmp builtins
Support for builtins that use bcdadd./bcdsub. to add/subtract
Binary Coded Decimal values as well as to determine validity
and compare BCD values.
Differential revision: https://reviews.llvm.org/D114088
Florian Hahn [Tue, 23 Nov 2021 17:37:12 +0000 (17:37 +0000)]
[LAA] Turn aggregate type check into assertion (NFCI).
getPtrStride should not be called with aggregate access types. There's
also an old TODO.
Turn the check into an assertion.
Philip Reames [Tue, 23 Nov 2021 17:18:28 +0000 (09:18 -0800)]
Revert "profi - a flow-based profile inference algorithm: Part I (out of 3)"
This reverts commit b00fc198224efa038a7469e068dd920b3f1aba75. This change fails to build (link) on Ubuntu x86.
Philip Reames [Tue, 23 Nov 2021 17:10:41 +0000 (09:10 -0800)]
[unroll] Remove two dead variable assignments [nfc]
These variables are not out-params, and we immediately return after assigning them. Thus, the assignments are dead and just confusing.
I believe these used to be out-params, but they're not any more.
spupyrev [Tue, 23 Nov 2021 16:47:23 +0000 (08:47 -0800)]
profi - a flow-based profile inference algorithm: Part I (out of 3)
The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.
Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.
The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However, careful engineering and extensive evaluation show that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.
The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.
Reviewed By: wenlei, hoy
Differential Revision: https://reviews.llvm.org/D109860
Yaxun (Sam) Liu [Mon, 8 Nov 2021 21:20:22 +0000 (16:20 -0500)]
[HIP] Fix device stub name for Windows
This is a follow up of https://reviews.llvm.org/D68578
where device stub name is changed for Itanium
mangling but not Microsoft mangling.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D113491
Philip Reames [Tue, 23 Nov 2021 17:01:23 +0000 (09:01 -0800)]
[unroll] Use early return in shouldFullUnroll [nfc]
Dmitry Vyukov [Tue, 23 Nov 2021 10:50:49 +0000 (11:50 +0100)]
tsan: disable signal_sync2.cpp test on powerpc64
Fails 1 out of 10 runs on powerpc bots:
https://lab.llvm.org/buildbot/#/builders/121/builds/13391
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D114426
Dmitry Vyukov [Tue, 23 Nov 2021 15:58:32 +0000 (16:58 +0100)]
[lldb] Deflake TestTsanBasic.py
The test flaked on bots:
http://green.lab.llvm.org/green/job/lldb-cmake/38666/
The test expects that tsan will detect a single race
with concurrent memory accesses. TSan doesn't do this reliably.
Run 100 iterations of the racing threads, which should
make the race much more likely to be detected.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114444
Kazu Hirata [Tue, 23 Nov 2021 16:54:47 +0000 (08:54 -0800)]
[llvm] Use range-based for loops (NFC)
Paul Robinson [Tue, 23 Nov 2021 16:42:16 +0000 (08:42 -0800)]
[PS4][TLI] Remove redundant line
alex-t [Fri, 19 Nov 2021 17:27:35 +0000 (20:27 +0300)]
[AMDGPU] Enable fneg and fabs divergence-driven instruction selection.
Detailed description: We currently have a set of patterns to select ISD::FNEG and ISD::FABS to the bitwise operations. We need to make them predicated to select the VALU or SALU bitwise operation variant according to the SDNode divergence bit.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D114257
Yaxun (Sam) Liu [Thu, 4 Nov 2021 17:49:43 +0000 (13:49 -0400)]
[NFC] Let Microsoft mangler accept GlobalDecl
This is a follow up of https://reviews.llvm.org/D75700
where support of GlobalDecl with Microsoft mangler
is incomplete.
Reviewed by: Artem Belevich, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D113490
Yaxun (Sam) Liu [Tue, 23 Nov 2021 15:46:51 +0000 (10:46 -0500)]
Fix warning due to default switch label
Fix warning due to default label in switch which covers all enumeration values
Simon Moll [Tue, 23 Nov 2021 14:08:02 +0000 (15:08 +0100)]
[VP] Canonicalize macros of VPIntrinsics.def
Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>. Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro).
A follow-up patch has documentation on how the macros are (intended) to be used.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D114144
Gabor Marton [Thu, 11 Nov 2021 13:55:24 +0000 (14:55 +0100)]
[Analyzer][Core] Better simplification in SimpleSValBuilder::evalBinOpNN
Make the SValBuilder capable of simplifying existing
SVals based on newly added constraints when evaluating a BinOp.
Before this patch, we called `simplify` only in some edge cases.
However, we can and should investigate the constraints in all cases.
Differential Revision: https://reviews.llvm.org/D113753
Yaxun (Sam) Liu [Mon, 22 Nov 2021 19:37:02 +0000 (14:37 -0500)]
[HIP] Add HIP scope atomic operations
Add an AtomicScopeModel for HIP and support for OpenCL builtins
that are missing in HIP.
Patch by: Michael Liao
Revised by: Anshil Ghandi
Reviewed by: Yaxun Liu
Differential Revision: https://reviews.llvm.org/D113925
Jinsong Ji [Tue, 23 Nov 2021 15:08:49 +0000 (15:08 +0000)]
[PowerPC] Remove FreeBSD test in mm-malloc.c due to cross-compilation limitation
Fix failures on powerpc BE buildbots
https://lab.llvm.org/buildbot/#/builders/93/builds/6031
https://lab.llvm.org/buildbot/#/builders/100/builds/10836
https://lab.llvm.org/buildbot/#/builders/52/builds/12719
Sanjay Patel [Tue, 23 Nov 2021 14:50:24 +0000 (09:50 -0500)]
[InstCombine] enhance bitwise select matching
I noticed that adding a seemingly unrelated fold for xor caused
regressions on similar patterns, and this is one of the
underlying causes.
This could also be a variation for code as seen in:
https://llvm.org/PR34047
...although that exact example should be fixed after:
D113035 / c36b7e21bd8f
The vector test shows that we are actually missing a potential
canonicalization for bitcast-of-sext-of-not or the inverse.
The scalar test shows that even if we had that canonicalization,
it would still be possible to see this pattern due to extra uses.
https://alive2.llvm.org/ce/z/y2BAgi
Sanjay Patel [Tue, 23 Nov 2021 13:55:49 +0000 (08:55 -0500)]
[InstCombine] add tests for logical select; NFC
Louis Dionne [Mon, 22 Nov 2021 20:40:12 +0000 (15:40 -0500)]
[libc++] Tidy up how %T and %t are created during configuration checks
Instead of having ad-hoc cleanup in various places, handle all creation
and removal of temporary files and directories inside _makeConfigTest.
As a fly-by, also remove testPrefix since we don't keep any source file
around anymore. Setting a prefix for the files is hence not useful anymore.
Differential Revision: https://reviews.llvm.org/D114390
David Green [Tue, 23 Nov 2021 14:24:58 +0000 (14:24 +0000)]
[ARM] Expand rev.ll test with more triples. NFC
Useful for showing Thumb2 and Thumb1 rev instructions in addition to the
Arm ones already tested, and for testing the more canonical llvm.bswap.i16
form.
Zahira Ammarguellat [Tue, 23 Nov 2021 13:00:57 +0000 (08:00 -0500)]
Revert "The _Float16 type is supported on x86 systems with SSE2 enabled."
This reverts commit 6623c02d70c3732dbea59c6d79c69501baf9627b.
The change seems to be breaking build of compiler-rt on Debian.
Nicolas Vasilache [Tue, 23 Nov 2021 12:01:53 +0000 (12:01 +0000)]
[mlir][Vector] Thread 0-d vectors through InsertElementOp.
This revision makes concrete use of 0-d vectors to extend the semantics of
InsertElementOp.
Reviewed By: dcaballe, pifon2a
Differential Revision: https://reviews.llvm.org/D114388
Nicolas Vasilache [Tue, 23 Nov 2021 12:01:12 +0000 (12:01 +0000)]
[mlir][Vector] Thread 0-d vectors through ExtractElementOp.
This revision starts making concrete use of 0-d vectors to extend the semantics of
ExtractElementOp.
In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in.
Differential Revision: https://reviews.llvm.org/D114387
Matthias Springer [Tue, 23 Nov 2021 12:27:03 +0000 (21:27 +0900)]
[mlir][linalg][bufferize][NFC] Specify bufferize traversal in `bufferize`
The interface method `bufferize` controls how (and in what order) nested ops are traversed. This simplifies bufferization of scf::ForOps and scf::IfOps, which used to need special rules in scf::YieldOp.
Differential Revision: https://reviews.llvm.org/D114057
Diana Picus [Thu, 18 Nov 2021 12:40:48 +0000 (12:40 +0000)]
[fir] Set !fir.len_param_index conversion to unimplemented
This patch is part of the upstreaming effort from fir-dev.
The conversion of len_param_index in fir-dev is incomplete, so for now
we're marking this as unimplemented until we can settle on a design for
the runtime support of LEN parameters.
Differential Revision: https://reviews.llvm.org/D114241
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Tonko Sabolčec [Tue, 23 Nov 2021 11:43:46 +0000 (12:43 +0100)]
[lldb] Fix lookup for global constants in namespaces
LLDB uses the mangled name to construct a fully qualified name for global
variables. Sometimes the DW_AT_linkage_name attribute is missing from
debug info, so LLDB has to rely on parent entries to construct the
fully qualified name.
Currently, the fallback is handled when the parent DW_TAG is either
DW_TAG_compile_unit or DW_TAG_partial_unit, which may not work well
for global constants in namespaces. For example:
namespace ns {
const int x = 10;
}
may produce the following debug info:
<1><2a>: Abbrev Number: 2 (DW_TAG_namespace)
<2b> DW_AT_name : (indirect string, offset: 0x5e): ns
<2><2f>: Abbrev Number: 3 (DW_TAG_variable)
<30> DW_AT_name : (indirect string, offset: 0x61): x
<34> DW_AT_type : <0x3c>
<38> DW_AT_decl_file : 1
<39> DW_AT_decl_line : 2
<3a> DW_AT_const_value : 10
Since the fallback didn't handle the case when the parent tag is
DW_TAG_namespace, LLDB wasn't able to match the variable by its fully
qualified name "ns::x". This change fixes that by adding a check for
whether the parent is a DW_TAG_namespace.
Reviewed By: werat, clayborg
Differential Revision: https://reviews.llvm.org/D112147
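The fallback described above can be sketched as a walk over parent entries. This is a simplified illustration and not LLDB's code: the `Tag` enum, the `Die` struct, and the `qualifiedName` and `demoLookup` functions are hypothetical stand-ins for the real DWARF data structures.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced set of DWARF tags for this sketch.
enum class Tag { CompileUnit, PartialUnit, Namespace, Variable };

// Hypothetical debug-info entry: a tag, a name, and a link to its parent.
struct Die {
  Tag tag;
  std::string name;
  const Die *parent; // nullptr for the unit root
};

// Build a name such as "ns::x" by prepending each enclosing namespace.
// The walk stops at the first parent that is not a namespace, for example
// the compile unit or partial unit.
inline std::string qualifiedName(const Die &die) {
  std::string result = die.name;
  for (const Die *p = die.parent; p != nullptr; p = p->parent) {
    if (p->tag != Tag::Namespace)
      break;
    result = p->name + "::" + result;
  }
  return result;
}

// Mirrors the `namespace ns { const int x = 10; }` example from the message.
inline bool demoLookup() {
  Die cu{Tag::CompileUnit, "", nullptr};
  Die ns{Tag::Namespace, "ns", &cu};
  Die x{Tag::Variable, "x", &ns};
  assert(qualifiedName(x) == "ns::x");
  return true;
}
```

A variable whose parent is the unit itself keeps its bare name, which matches the previously handled case.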
Jay Foad [Tue, 23 Nov 2021 11:33:10 +0000 (11:33 +0000)]
[AMDGPU] Fix the name of a test case
Dmitry Vyukov [Tue, 27 Apr 2021 11:55:41 +0000 (13:55 +0200)]
tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Differential Revision: https://reviews.llvm.org/D112603
mydeveloperday [Tue, 23 Nov 2021 10:43:27 +0000 (10:43 +0000)]
[clang-format] [NFC] build clang-format with -Wall
When building clang-format with -Wall on Visual Studio 2019 we see the following warning. Prevent it, since it is the only -Wall diagnostic:
```
..FormatTokenLexer.cpp(45) : warning C4868: compiler may not enforce left-to-right evaluation order in braced initializer list
```
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D113844
mydeveloperday [Tue, 23 Nov 2021 10:35:05 +0000 (10:35 +0000)]
[clang-format] [PR52527] can join * with /* to form an outside of comment error C4138
https://bugs.llvm.org/show_bug.cgi?id=52527
The following patch ensures there is always a space between * and /* to prevent transforming
```
void foo(* /* comment */)(int bar);
```
into
```
void foo(*/* comment */)(int bar);
```
Differential Revision: https://reviews.llvm.org/D114142
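The rule can be illustrated with a small token-joining helper. This is a hypothetical sketch, not clang-format's implementation; `joinTokens` is a name invented for the example.

```cpp
#include <cassert>
#include <string>

// Join two adjacent token texts. A space is kept when the left token ends in
// '*' and the right token starts a block comment, so the result can never
// contain "*/" formed across the token boundary.
inline std::string joinTokens(const std::string &left,
                              const std::string &right) {
  bool leftEndsInStar = !left.empty() && left.back() == '*';
  bool rightStartsComment = right.rfind("/*", 0) == 0; // prefix test
  if (leftEndsInStar && rightStartsComment)
    return left + " " + right;
  return left + right;
}
```

With this rule, `joinTokens("*", "/* comment */")` yields `* /* comment */`, while other token pairs are joined without an extra space.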
Evgeniy Brevnov [Mon, 22 Nov 2021 12:52:57 +0000 (19:52 +0700)]
[DSE][NFC] Introduce "doesn't overwrite" return code for isOverwrite
Add OR_None code to indicate that there is no overwrite. This has no effect on current uses but will be used in one of the next patches building support for PHI translation.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D105098
Florian Hahn [Tue, 23 Nov 2021 10:06:08 +0000 (10:06 +0000)]
[ThreadPool] Do not return shared futures.
The only user of futures returned from ThreadPool is llvm-reduce after
D113857.
There should be no cases where multiple threads wait on the same future,
so there should be no need to return std::shared_future<>. Instead return
plain std::future<>.
If users need to share a future between multiple threads, they can share
the futures themselves.
Reviewed By: Meinersbur, mehdi_amini
Differential Revision: https://reviews.llvm.org/D114363
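One way a caller can still let several threads wait on the same result is to convert a plain std::future<> with `share()`. The sketch below uses only the standard library, with std::async standing in for a task submitted to a pool; `sumFromTwoWaiters` is a hypothetical name and this is not code from ThreadPool or llvm-reduce.

```cpp
#include <cassert>
#include <future>
#include <thread>

// Produce one value asynchronously and let two threads wait on it.
inline int sumFromTwoWaiters() {
  // A plain future has a single owner and can be read only once.
  std::future<int> result = std::async(std::launch::async, [] { return 21; });

  // share() moves the state into a copyable shared_future.
  std::shared_future<int> shared = result.share();
  assert(!result.valid()); // the original future no longer owns the state

  int a = 0;
  int b = 0;
  // Each thread holds its own copy of the shared_future and may call get().
  std::thread t1([shared, &a] { a = shared.get(); });
  std::thread t2([shared, &b] { b = shared.get(); });
  t1.join();
  t2.join();
  return a + b;
}
```

Callers that never share the result keep the cheaper single-owner std::future<>, and the conversion is opt-in at the point where sharing is actually needed.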