Vedant Kumar [Tue, 18 May 2021 23:35:21 +0000 (16:35 -0700)]
[profile] Skip mmap() if there are no counters
If there are no counters, an mmap() of the counters section would fail
due to the size argument being too small (EINVAL).
rdar://78175925
Differential Revision: https://reviews.llvm.org/D102735
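A minimal sketch of the guard described above, assuming a POSIX system. The helper name `mapCounters` is hypothetical and an anonymous mapping stands in for the counters section; this is not the compiler-rt code itself.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper for illustration: maps `bytes` of anonymous memory
// standing in for the counters section. Returns nullptr when there is
// nothing to map or when the mapping fails.
void *mapCounters(std::size_t bytes) {
  if (bytes == 0)
    return nullptr; // skip mmap(): a zero length would be rejected with EINVAL
  void *p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}
```

Callers would treat a null result for a zero-sized section as "nothing to do" rather than as an error.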
Jessica Paquette [Wed, 19 May 2021 16:18:52 +0000 (09:18 -0700)]
Recommit "[GlobalISel] Simplify G_ICMP to true/false when the result is known"
Add missing REQUIRES line to
prelegalizer-combiner-icmp-to-true-false-known-bits.
Hongtao Yu [Mon, 17 May 2021 06:13:39 +0000 (23:13 -0700)]
[CSSPGO] Overwrite branch weight annotated in previous pass.
Sample profile loader can be run in both LTO prelink and postlink. Currently the counts annotation in postlink doesn't fully overwrite what's done in prelink. I'm adding a switch (`-overwrite-existing-weights=1`) to enable a full overwrite, which includes:
1. Clear old metadata for calls when their parent block has a zero count. This could be caused by prelink code duplication.
2. Clear indirect call metadata if somehow all the remaining targets have a sum of zero count.
3. Overwrite branch weight for basic blocks.
With a CS profile, I was seeing #1 and #2 help reduce code size by preventing post-sample ICP and the CGSCC inliner from working on obsolete metadata, which comes from a partial global inlining in prelink. It's not expected to work well for the non-CS case with a less-accurate post-inline count quality.
It's worth calling out that some prelink optimizations can damage counts quality in an irreversible way. One example is the loop rotate optimization. Due to lack of exact loop entry count (profiling can only give loop iteration count and loop exit count), moving one iteration out of the loop body leaves the remaining iteration count unknown. We had to turn off prelink loop rotate to achieve a better postlink counts quality. An even better postlink counts quality can be achieved by turning off prelink CGSCC inlining, which is not context-sensitive.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D102537
Amy Huang [Wed, 19 May 2021 15:47:30 +0000 (08:47 -0700)]
Revert "Do actual DCE in LoopUnroll (try 3)"
This reverts commit b6320eeb8622f05e4a5d4c7f5420523357490fca as it causes clang to assert; see
https://reviews.llvm.org/rGb6320eeb8622f05e4a5d4c7f5420523357490fca.
Nicolas Vasilache [Wed, 19 May 2021 15:41:54 +0000 (15:41 +0000)]
[mlir][SCF] NFC - Drop SCF EDSC usage
Drop the SCF dialect EDSC subdirectory and update all uses.
Differential Revision: https://reviews.llvm.org/D102780
Mariusz Ceier [Wed, 19 May 2021 15:07:39 +0000 (11:07 -0400)]
Fix lld macho standalone build by including llvm/Config/llvm-config.h instead of llvm/Config/config.h
lld/MachO/Driver.cpp and lld/MachO/SyntheticSections.cpp include
llvm/Config/config.h which doesn't exist when building standalone lld.
This patch replaces the llvm/Config/config.h include with llvm/Config/llvm-config.h,
just as in lld/ELF/Driver.cpp, replaces HAVE_LIBXAR with LLVM_HAVE_LIBXAR, and
moves LLVM_HAVE_LIBXAR from config.h to llvm-config.h.
It also adds LLVM_HAVE_LIBXAR to LLVMConfig.cmake and links liblldMachO2.so
with XAR_LIB if LLVM_HAVE_LIBXAR is set.
Differential Revision: https://reviews.llvm.org/D102084
Simon Moll [Wed, 19 May 2021 15:08:20 +0000 (17:08 +0200)]
[VP] make getFunctionalOpcode return an Optional
The operations of some VP intrinsics do/will not map to regular
instruction opcodes. Returning 'None' seems more intuitive here than
'Instruction::Call'.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D102778
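The interface change can be sketched as follows. This is an illustration only: `std::optional` stands in for LLVM's `Optional`, and the two enums (including `vp_other`) are hypothetical placeholders rather than LLVM's real opcode or intrinsic lists.

```cpp
#include <optional>

// Hypothetical stand-ins for illustration; not LLVM's real enums.
enum class Opcode { Add, Sub, Call };
enum class VPIntrinsicID { vp_add, vp_sub, vp_other };

// Returns the matching regular opcode, or std::nullopt when the intrinsic
// has no such counterpart (instead of falling back to Opcode::Call).
std::optional<Opcode> getFunctionalOpcode(VPIntrinsicID id) {
  switch (id) {
  case VPIntrinsicID::vp_add:
    return Opcode::Add;
  case VPIntrinsicID::vp_sub:
    return Opcode::Sub;
  default:
    return std::nullopt;
  }
}
```

With this shape a caller has to check for the empty case explicitly, which is harder to overlook than a sentinel opcode.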
Anirudh Prasad [Wed, 19 May 2021 15:05:00 +0000 (11:05 -0400)]
[AsmParser][SystemZ][z/OS] Introducing HLASM Parser support to AsmParser - Part 1
- This patch is one in a series of patches that introduce HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html | main RFC here ]])
- This patch in particular introduces HLASM Parser support for Z machine instructions.
- The approach taken here was to subclass `AsmParser`, and make various functions and variables "protected" wherever appropriate.
- The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well.
The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf | HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format):
```
<TokA><spaces.*><TokB><spaces.*><TokC><spaces.*><TokD>
```
1. TokA is referred to as the Name Entry. This token is optional
2. TokB is referred to as the Operation Entry. This token is mandatory.
3. TokC is referred to as the Operand Entry. This token is mandatory
4. TokD is referred to as the Remarks Entry. This token is optional
- If TokA is provided, then we either parse TokA as a possible comment or as a label (Name Entry), TokB as the Operation Entry and so on.
- If TokA is not provided (i.e. we have one or more spaces and then the first token), then we will parse the first token (i.e. TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction and TokD as a possible Remark field.
- For TokC (Operand Entry), no spaces are allowed between operand entries. If a space occurs, it is classified as an error.
- TokD if provided is taken as is, and emitted as a comment.
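The statement layout above can be sketched as a small standalone splitter. This is only an illustration of the four-field format under simplifying assumptions (space-separated fields, no comment handling, no error reporting); `HLASMStatement` and `splitStatement` are hypothetical names and not part of the patch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical holder for the four entries of one statement.
struct HLASMStatement {
  std::string Name, Operation, Operand, Remarks;
};

// Splits one line into Name/Operation/Operand/Remarks. A line that starts
// with a space has no Name Entry, so its first token is the Operation Entry.
HLASMStatement splitStatement(const std::string &Line) {
  HLASMStatement S;
  std::size_t Pos = 0;
  auto skipSpaces = [&] {
    while (Pos < Line.size() && Line[Pos] == ' ')
      ++Pos;
  };
  auto takeToken = [&] {
    std::size_t Start = Pos;
    while (Pos < Line.size() && Line[Pos] != ' ')
      ++Pos;
    return Line.substr(Start, Pos - Start);
  };
  if (!Line.empty() && Line[0] != ' ')
    S.Name = takeToken(); // TokA: optional Name Entry
  skipSpaces();
  S.Operation = takeToken(); // TokB: Operation Entry
  skipSpaces();
  S.Operand = takeToken(); // TokC: operands, no embedded spaces
  skipSpaces();
  S.Remarks = Line.substr(Pos); // TokD: rest of the line, taken as is
  return S;
}
```

For example, `splitStatement("LOOP LR 1,2 copy r2")` yields the name `LOOP`, operation `LR`, operand `1,2` and remarks `copy r2`, while a line with leading spaces leaves the name empty.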
The following additional approach was examined, but not taken:
- Adding custom private-only functions to the base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of no use to every other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would also have the same disadvantage.
Testing:
- This patch doesn't have tests added with it, for the sole reason that MCStreamer support and Object File support haven't been added for the z/OS target (yet). Hence, it's not possible to generate code outright for the z/OS target. They are in the process of being committed / worked on.
- Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated.
Reviewed By: Kai, uweigand
Differential Revision: https://reviews.llvm.org/D98276
Andy Yankovsky [Wed, 19 May 2021 13:36:13 +0000 (15:36 +0200)]
[lldb] Enable TestCppBitfields on Windows
The test works correctly on Windows, the linked bug has been resolved.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D102769
Melanie Blower [Wed, 19 May 2021 14:57:04 +0000 (10:57 -0400)]
[clang][patch] Add support for option -fextend-arguments={32,64}: widen integer arguments to int64 in unprototyped function calls
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D101640
Wang, Pengfei [Wed, 19 May 2021 10:01:11 +0000 (18:01 +0800)]
Reapply "[X86] Limit X86InterleavedAccessGroup to handle the same type case only"
The current implementation assumes the destination type of shuffle is the same as the decomposed ones. Add the check to avoid a crash when the condition is not satisfied.
This fixes PR37616.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102751
Simon Pilgrim [Wed, 19 May 2021 13:59:58 +0000 (14:59 +0100)]
Revert rG528bc10e95d5f9d6a338f9bab5e91d7265d1cf05 : "[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB"
Reports on D101970 indicate this is causing failures on multi-stage compiles.
Jan Kratochvil [Wed, 19 May 2021 13:51:54 +0000 (15:51 +0200)]
[lldb] 2/2: Fix DW_AT_ranges DW_FORM_sec_offset not using DW_AT_rnglists_base (used by GCC)
DW_AT_ranges can use DW_FORM_sec_offset (instead of DW_FORM_rnglistx).
In such case DW_AT_rnglists_base does not need to be present.
DWARF-5 spec:
"If the offset_entry_count is zero, then DW_FORM_rnglistx cannot
be used to access a range list; DW_FORM_sec_offset must be used
instead. If the offset_entry_count is non-zero, then
DW_FORM_rnglistx may be used to access a range list;"
This fix is for TestTypeCompletion.py category `dwarf` using GCC with DWARF-5.
The fix just provides GetRnglist() lazy getter for `m_rnglist_table`.
The testcase is easier to review by:
diff -u lldb/test/Shell/SymbolFile/DWARF/DW_AT_low_pc-addrx.s \
lldb/test/Shell/SymbolFile/DWARF/DW_AT_range-DW_FORM_sec_offset.s
Differential Revision: https://reviews.llvm.org/D98289
Jan Kratochvil [Wed, 19 May 2021 13:49:14 +0000 (15:49 +0200)]
[nfc] [lldb] 1/2: Fix DW_AT_ranges DW_FORM_sec_offset not using DW_AT_rnglists_base (used by GCC)
Refactor code only for D98289.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D99653
Simon Pilgrim [Wed, 19 May 2021 13:13:41 +0000 (14:13 +0100)]
[X86][AVX] createVariablePermute - generalize the PR50356 fix for smaller indices vector as well
Generalize the fix from rGd0902a8665b1 by ensuring we widen/narrow the indices subvector first and then perform the ZERO_EXTEND_VECTOR_INREG (if necessary), which should allow us to perform the variable permutes with source/destination/indices vectors of any widths.
Simon Pilgrim [Wed, 19 May 2021 12:51:34 +0000 (13:51 +0100)]
[X86][Atom] Fix vector integer shift by immediate resource/throughputs
Match what's documented in the Intel AOM (and Agner/instlatx64 agree) - these are all Port0 only.
Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.
Tobias Gysi [Wed, 19 May 2021 13:10:28 +0000 (13:10 +0000)]
[mlir][Python][linalg] Support OpDSL extensions in C++.
The patch extends the yaml code generation to support the following new OpDSL constructs:
- captures
- constants
- iteration index accesses
- predefined types
These changes have been introduced by revision
https://reviews.llvm.org/D101364.
Differential Revision: https://reviews.llvm.org/D102075
Andy Yankovsky [Tue, 18 May 2021 12:43:20 +0000 (14:43 +0200)]
[lldb] Encode `bool` as unsigned int
`bool` is considered to be unsigned according to `std::is_unsigned<bool>::value` (and `Type::GetTypeInfo`). Encoding it as signed int works fine for normal variables and fields, but breaks when reading the values of boolean bitfields. If the field is declared as `bool b : 1` and has a value of `0b1`, the call to `SBValue::GetValueAsSigned()` will return `-1`.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D102685
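The `-1` result follows from ordinary two's-complement sign extension of a 1-bit field. A minimal sketch, using hypothetical helpers `readUnsigned` and `readSigned` that are not lldb APIs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; they only model how the low
// `width` bits of a raw value are interpreted.
std::uint64_t readUnsigned(std::uint64_t raw, unsigned width) {
  assert(width >= 1 && width < 64 && "sketch only handles 1..63 bits");
  return raw & ((std::uint64_t{1} << width) - 1);
}

std::int64_t readSigned(std::uint64_t raw, unsigned width) {
  const std::uint64_t value = readUnsigned(raw, width);
  const std::uint64_t signBit = std::uint64_t{1} << (width - 1);
  // XOR-then-subtract sign extension: the top bit of the field counts as
  // -2^(width-1) instead of +2^(width-1).
  return static_cast<std::int64_t>(value ^ signBit) -
         static_cast<std::int64_t>(signBit);
}
```

For a field of width 1 holding `0b1`, the signed reading is `-1` and the unsigned reading is `1`, which is why a `bool` bitfield has to be read as unsigned.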
Raphael Isemann [Wed, 19 May 2021 13:22:03 +0000 (15:22 +0200)]
[lldb][NFC] Remove sample test boilerplate from TestBreakOnCPP11Initializers
Nico Weber [Wed, 19 May 2021 13:02:27 +0000 (09:02 -0400)]
Revert "[GlobalISel] Simplify G_ICMP to true/false when the result is known"
This reverts commit 892497c806306a4b7185ead16d60b0ebcca0a304.
Breaks tests, see comments on https://reviews.llvm.org/D102542
Peter Waller [Thu, 13 May 2021 14:44:53 +0000 (14:44 +0000)]
[llvm][AArch64][SVE] Model FFR-using intrinsics with inaccessiblemem
Intrinsics reading or writing the FFR register need to model the fact
there is additional state being read/written.
Model this state as inaccessible memory.
* setffr => write inaccessiblememonly
* rdffr => read inaccessiblememonly
* ldff* => read arg memory, write inaccessiblemem
* ldnf => read arg memory, write inaccessiblemem
Nicolas Vasilache [Wed, 19 May 2021 12:34:52 +0000 (12:34 +0000)]
[mlir][Vector] NFC - Drop vector EDSC usage
Drop the vector dialect EDSC subdirectory and update all uses.
Wang, Pengfei [Wed, 19 May 2021 12:34:47 +0000 (20:34 +0800)]
Revert "[X86] Limit X86InterleavedAccessGroup to handle the same type case only"
This reverts commit ca23a38e373142a18ab56700ba4f3b947bfe9db0.
Revert due to EXPENSIVE_CHECKS fail.
David Sherwood [Wed, 19 May 2021 11:13:13 +0000 (12:13 +0100)]
Remove scalable vector assert from InnerLoopVectorizer::setDebugLocFromInst
In InnerLoopVectorizer::setDebugLocFromInst we were previously
asserting that the VF is not scalable. This is because we want to
use the number of elements to create a duplication factor for the
debug profiling data. However, for scalable vectors we only know the
minimum number of elements. I've simply removed the assert for now
and added a FIXME saying that we assume vscale is always 1. When
vscale is not 1 it just means that the profiling data isn't as
accurate, but shouldn't cause any functional problems.
Kristina Bessonova [Thu, 6 May 2021 20:51:30 +0000 (22:51 +0200)]
[ARM][NEON] Combine base address updates for vst1x intrinsics
Differential Revision: https://reviews.llvm.org/D102256
Sanjay Patel [Wed, 19 May 2021 10:20:45 +0000 (06:20 -0400)]
[SDAG] propagate FMF from target-specific IR intrinsics
This is a step towards relying more on node-level FMF rather than function-wide
or target settings.
I think it was just an oversight that we didn't get this path in D87361
or follow-on patches.
The lack of FMF propagation is blocking D90901 from converting tests to IR-level FMF.
We can't do much more than this currently because we also fail to propagate flags
from x86-specific node to generic FMA node. That would be another patch, so the
test just verifies that we can transfer from IR to initial SDAG node.
Differential Revision: https://reviews.llvm.org/D102725
Michael Spencer [Wed, 19 May 2021 11:04:56 +0000 (13:04 +0200)]
Reapply "[clang][deps] Support inferred modules"
This reapplies commit 95033eb3 that reverted commit 1d9e8e13.
The tests were failing on Windows due to spaces and backslashes in paths not being handled carefully.
Haojian Wu [Wed, 19 May 2021 08:26:03 +0000 (10:26 +0200)]
[clang] Fix a crash on CheckArgAlignment.
We might encounter an undeduced type before calling getTypeAlignInChars.
NOTE: this retrieves the fix from 8f80c66bd2982788a8eede4419684ca72f48b9a2,
which was removed in Adam's followup fix
fbfcfdbf6828b8d36f4ec0ff5f4eac11fb1411a5. We originally
thought the crash was caused by recovery-ast, but it turns out it can
occur for other cases, e.g. typo-correction.
Differential Revision: https://reviews.llvm.org/D102750
Simon Pilgrim [Wed, 19 May 2021 10:33:58 +0000 (11:33 +0100)]
[X86] Atom (pre-SLM) doesn't support PTEST instructions
Simon Pilgrim [Wed, 19 May 2021 10:09:19 +0000 (11:09 +0100)]
[X86] Remove copy + paste typos in AtomWriteResPair comment.
Remnants from when the Atom model was copied from the Btver2 model.....
Bjorn Pettersson [Tue, 18 May 2021 13:02:45 +0000 (15:02 +0200)]
[HIP] Tighten checks in hip-include-path.hip test case
The checks (both positive and negative checks) in the test case
hip-include-path.hip could mistakenly end up matching the string
"clang" from the InstalledDir in case the build dir for example
was named "/home/username/build-clang/". Intention with this
patch is to tighten up the checks a bit to filter out the
part of the paths that match with InstalledDir when doing the
checks, as well as matching "/lib/clang/" rather than
just "clang/".
Problem was found when building with
-DCLANG_DEFAULT_RTLIB=compiler-rt
-DCLANG_DEFAULT_CXX_STDLIB=libc++
and having "clang/" in the path to the build dir.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D102723
Roman Lebedev [Wed, 19 May 2021 10:53:36 +0000 (13:53 +0300)]
[NFCI][SimplifyCFG] removeEmptyCleanup(): use DeleteDeadBlock()
This required some changes: instead of eagerly making the PHIs
in the UnwindDest valid as if BB were already not a predecessor,
they are now kept valid while BB is still a predecessor.
Roman Lebedev [Wed, 19 May 2021 09:48:40 +0000 (12:48 +0300)]
[NFCI][SimplifyCFG] removeEmptyCleanup(): streamline PHI node updating
Roman Lebedev [Wed, 19 May 2021 09:19:45 +0000 (12:19 +0300)]
[NFC][SimplifyCFG] removeEmptyCleanup(): use BasicBlock::phis()
Dmitry Vyukov [Fri, 7 May 2021 09:16:03 +0000 (11:16 +0200)]
tsan: mark sigwait as blocking
Add a test case reported in:
https://github.com/google/sanitizers/issues/1401
and fix it.
The code assumes sigwait will process other signals.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102057
Frederik Gossen [Wed, 19 May 2021 10:39:14 +0000 (12:39 +0200)]
[x86] Fix FMF propagation test
Kristóf Umann [Tue, 18 May 2021 11:06:02 +0000 (13:06 +0200)]
[analyzer] Check the checker name, rather than the ProgramPointTag when silencing a checker
The program point created by the checker, even if it is an error node,
might not be the same as the name under which the report is emitted.
Make sure we're checking the name of the checker, because that's what
we're silencing after all.
Differential Revision: https://reviews.llvm.org/D102683
Wang, Pengfei [Wed, 19 May 2021 10:01:11 +0000 (18:01 +0800)]
[X86] Limit X86InterleavedAccessGroup to handle the same type case only
The current implementation assumes the destination type of shuffle is the same as the decomposed ones. Add the check to avoid a crash when the condition is not satisfied.
This fixes PR37616.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102751
Simon Giesecke [Fri, 14 May 2021 11:10:18 +0000 (11:10 +0000)]
Use a non-recursive mutex in GsymCreator.
There doesn't seem to be a need to support recursive locking,
and a recursive mutex is unnecessarily inefficient.
Differential Revision: https://reviews.llvm.org/D102486
Simon Giesecke [Fri, 14 May 2021 11:06:39 +0000 (11:06 +0000)]
Move FunctionInfo in addFunctionInfo rather than copying.
Differential Revision: https://reviews.llvm.org/D102485
Simon Giesecke [Fri, 14 May 2021 10:59:51 +0000 (10:59 +0000)]
Avoid calculating the string hash twice in GsymCreator::insertString.
Do the single hash calculation before acquiring the lock, to reduce
lock contention. If Copy is true, and the string was not yet contained
in the StringStorage, use the new address from StringStorage, but
reuse the hash we already calculated.
Differential Revision: https://reviews.llvm.org/D102484
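The idea of hashing once, outside the critical section, can be sketched like this. `StringTable` is a hypothetical stand-in and not the GsymCreator code; it uses `std::hash` and ignores hash collisions for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical string table for illustration only.
class StringTable {
  std::mutex Mutex; // plain, non-recursive mutex
  std::unordered_map<std::size_t, std::string> Strings;

public:
  std::size_t insert(const std::string &S) {
    // Compute the hash exactly once, before taking the lock, so that the
    // critical section only covers the table update.
    const std::size_t Hash = std::hash<std::string>{}(S);
    std::lock_guard<std::mutex> Guard(Mutex);
    Strings.emplace(Hash, S); // no-op if the hash is already present
    return Hash;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> Guard(Mutex);
    return Strings.size();
  }
};
```

Keeping the hash computation out of the locked region shortens the time each thread holds the mutex, which is where the reduced contention comes from.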
Simon Giesecke [Fri, 7 May 2021 15:32:02 +0000 (15:32 +0000)]
Reformat GSYMCreator.cpp
Differential Revision: https://reviews.llvm.org/D102483
Tim Northover [Tue, 11 May 2021 08:57:18 +0000 (09:57 +0100)]
MachineBasicBlock: add liveout iterator aware of which liveins are defined by the runtime.
Using this in RegAlloc fast reduces register pressure, and in some cases allows
x86 code to compile that wouldn't before.
Sander de Smalen [Thu, 8 Apr 2021 11:19:44 +0000 (12:19 +0100)]
[LV] Add -scalable-vectorization=<option> flag.
This patch adds a new option to the LoopVectorizer to control how
scalable vectors can be used.
Initially, this suggests three levels to control scalable
vectorization, although other more aggressive options can be added in
the future.
The possible options are:
- Disabled: Disables vectorization with scalable vectors.
- Enabled: Vectorize loops using scalable vectors or fixed-width
vectors, but favors fixed-width vectors when the cost
is a tie.
- Preferred: Like 'Enabled', but favoring scalable vectors when the
cost-model is inconclusive.
Reviewed By: paulwalker-arm, vkmr
Differential Revision: https://reviews.llvm.org/D101945
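The tie-breaking behaviour of the three levels can be sketched as below. This is a simplified model under stated assumptions: the enum and function names are hypothetical, costs are plain integers, and the real cost model in the LoopVectorizer is considerably more involved.

```cpp
// Hypothetical names for illustration; not the LoopVectorizer's API.
enum class ScalableVectorization { Disabled, Enabled, Preferred };
enum class VectorKind { FixedWidth, Scalable };

// Picks a vector kind from two candidate costs (lower is better).
VectorKind pickVectorKind(ScalableVectorization Mode, unsigned FixedCost,
                          unsigned ScalableCost) {
  if (Mode == ScalableVectorization::Disabled)
    return VectorKind::FixedWidth; // scalable vectors are never considered
  if (ScalableCost != FixedCost)
    return ScalableCost < FixedCost ? VectorKind::Scalable
                                    : VectorKind::FixedWidth;
  // Tie: Enabled favors fixed-width, Preferred favors scalable.
  return Mode == ScalableVectorization::Preferred ? VectorKind::Scalable
                                                  : VectorKind::FixedWidth;
}
```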
Roman Lebedev [Wed, 19 May 2021 08:54:27 +0000 (11:54 +0300)]
[NFCI][SimplifyCFG] simplifyUnreachable(): use DeleteDeadBlock()
Roman Lebedev [Wed, 19 May 2021 08:50:06 +0000 (11:50 +0300)]
[NFCI][SimplifyCFG] simplifyReturn(): use DeleteDeadBlock()
Roman Lebedev [Wed, 19 May 2021 08:49:16 +0000 (11:49 +0300)]
[NFCI][SimplifyCFG] simplifySingleResume(): use DeleteDeadBlock()
Roman Lebedev [Wed, 19 May 2021 08:44:43 +0000 (11:44 +0300)]
[NFCI][SimplifyCFG] simplifyCommonResume(): use DeleteDeadBlock()
Sergey Dmitriev [Wed, 19 May 2021 08:11:53 +0000 (01:11 -0700)]
[llvm-objcopy] Add support for '--' for delimiting options from input/output files
This allows llvm-objcopy to be used with file names that begin with dashes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102665
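The `--` convention can be sketched with a small standalone splitter. It is an illustration only: `ParsedArgs` and `splitArgs` are hypothetical names, and llvm-objcopy's real option parsing is not reproduced here.

```cpp
#include <string>
#include <vector>

// Hypothetical result type for illustration.
struct ParsedArgs {
  std::vector<std::string> Options, Files;
};

// Everything after the first "--" is treated as a file name, even if it
// starts with a dash. Before that, dash-prefixed arguments are options.
ParsedArgs splitArgs(const std::vector<std::string> &Args) {
  ParsedArgs P;
  bool OnlyFiles = false;
  for (const std::string &A : Args) {
    if (!OnlyFiles && A == "--") {
      OnlyFiles = true; // the delimiter itself is dropped
      continue;
    }
    if (!OnlyFiles && A.size() > 1 && A[0] == '-')
      P.Options.push_back(A);
    else
      P.Files.push_back(A);
  }
  return P;
}
```

With the arguments `--strip-all -- -in.o -out.o`, the sketch keeps `--strip-all` as the only option and treats `-in.o` and `-out.o` as files.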
Fraser Cormack [Tue, 18 May 2021 16:17:21 +0000 (17:17 +0100)]
[RISCV] Support INSERT_VECTOR_ELT into i1 vectors
Like the element extraction of these vectors, we choose to promote up to
an i8 vector type and perform the insertion there.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102697
Roman Lebedev [Wed, 19 May 2021 08:31:53 +0000 (11:31 +0300)]
[NFCI] SimplifyCFGPass: mergeEmptyReturnBlocks(): use DeleteDeadBlocks()
In this case, it does the same thing as the original pattern does.
SimplifyCFG has a few lurking miscompilations about deleting blocks that
have their address taken, and consistently using DeleteDeadBlocks() instead
of a hand-rolled pattern will make it easier to weed those cases out.
Haojian Wu [Tue, 18 May 2021 19:53:32 +0000 (21:53 +0200)]
[clang-tidy] Fix a crash on invalid code for memset-usage check.
Differential Revision: https://reviews.llvm.org/D102714
Rong Xu [Wed, 19 May 2021 05:40:30 +0000 (22:40 -0700)]
Fix sanitizer test errors from commit 886629a8
Explicitly handle the empty string in the Hash calculation.
Matthias Springer [Mon, 17 May 2021 05:37:32 +0000 (14:37 +0900)]
[mlir] Use VectorTransferPermutationMapLoweringPatterns in VectorToSCF
VectorTransferPermutationMapLoweringPatterns can be enabled via a pass option. These additional patterns lower permutation maps to minor identity maps with broadcasting, if possible, allowing for more efficient vector load/stores. The option is deactivated by default.
Differential Revision: https://reviews.llvm.org/D102593
Vitaly Buka [Wed, 19 May 2021 05:39:36 +0000 (22:39 -0700)]
[libfuzzer] Update doc mentioning removed flags.
MaheshRavishankar [Wed, 19 May 2021 05:08:12 +0000 (22:08 -0700)]
[mlir][Linalg] Break unnecessary dependency through unused `outs` tensor.
LinalgOps that are all parallel do not use the value of `outs`
tensor. The semantics is that the `outs` tensor is fully
overwritten. Using anything other than `init_tensor` can add false
dependencies between operations, when the use is just for the shape of
the tensor. Adding a canonicalization to always use `init_tensor` in
such cases breaks this dependence.
Differential Revision: https://reviews.llvm.org/D102561
Arthur Eubanks [Fri, 7 May 2021 21:32:20 +0000 (14:32 -0700)]
[NewPM] Add options to PrintPassInstrumentation
To bring D99599's implementation in line with the existing
PrintPassInstrumentation, and to fix a FIXME, add more customizability
to PrintPassInstrumentation.
Introduce three new options. The first takes over the existing
"-debug-pass-manager-verbose" cl::opt.
The second and third options are specific to -fdebug-pass-structure. They
allow indentation, and also don't print analysis queries.
To avoid more golden file tests than necessary, prune down the
-fdebug-pass-structure tests.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102196
Senran Zhang [Wed, 19 May 2021 03:40:59 +0000 (23:40 -0400)]
[Utils][vim] Highlight CHECK-EMPTY: & CHECK-COUNT: directives
Reviewed By: porglezomp
Differential Revision: https://reviews.llvm.org/D101135
Vladimir Vereschaka [Wed, 19 May 2021 02:04:49 +0000 (19:04 -0700)]
[CMake] Update Cmake cache file for Win to ARM Linux cross builds. NFC
Parametrize the cache file with TARGET_TRIPLE parameter. Normalize
the target triple to follow the runtime library installation directory.
Explicitly enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR option.
Wenyi Zhao [Wed, 19 May 2021 02:11:33 +0000 (02:11 +0000)]
Enhance InferShapedTypeOpInterface to make it accessible during dialect conversion
Original interfaces are not safe to be called during dialect conversion.
This is because some ops (e.g. `dynamic_reshape(input, target_shape)`)
depend on the values of their operands to calculate the output shape.
However the operands may be out of reach during dialect conversion (e.g.
converting from tensor world to buffer world). This patch provides a new
kind of interface which accepts user-provided operands to solve this
problem.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D102317
Richard Smith [Wed, 19 May 2021 00:43:06 +0000 (17:43 -0700)]
Revert "[IR] Add a Location to BlockArgument." and follow-on commit
"[mlir] Speed up Lexer::getEncodedSourceLocation"
This reverts commit 3043be9d2db4d0cdf079adb5e1bdff032405e941 and commit
861d69a5259653f60d59795597493a7939b794fe.
This change resulted in printing textual MLIR that can't be parsed; see
review thread https://reviews.llvm.org/D102567 for details.
Joseph Huber [Wed, 19 May 2021 00:10:05 +0000 (20:10 -0400)]
[Attributor] Change AAExecutionDomain to only accept intrinsics
Summary:
The OpenMP runtime functions don't always provide unique thread IDs to
determine if a basic block is truly single-threaded. Change the implementation
to only check NVPTX intrinsics for now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102700
Guozhi Wei [Wed, 19 May 2021 01:02:36 +0000 (18:02 -0700)]
[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass are to ensure the operands of the ADD
instruction have the expected order (the dest register of LEA should be the src register
of ADD).
Differential Revision: https://reviews.llvm.org/D101970
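The rewrite relies on the identity reg4 - (reg1 + reg2) = (reg4 - reg1) - reg2, which also holds under wraparound. A small sketch that models the registers as 64-bit unsigned values (the function names are illustrative, not from the patch):

```cpp
#include <cstdint>

// Models "lea (reg1, reg2), reg3; sub reg3, reg4" in AT&T operand order.
std::uint64_t leaThenSub(std::uint64_t reg1, std::uint64_t reg2,
                         std::uint64_t reg4) {
  const std::uint64_t reg3 = reg1 + reg2; // lea
  return reg4 - reg3;                     // sub reg3, reg4
}

// Models "sub reg1, reg4; sub reg2, reg4".
std::uint64_t subThenSub(std::uint64_t reg1, std::uint64_t reg2,
                         std::uint64_t reg4) {
  reg4 -= reg1;
  reg4 -= reg2;
  return reg4;
}
```

Both functions return the same value for any inputs because unsigned arithmetic is modular. The sketch only models the value left in reg4, not flags or scheduling.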
Thomas Köppe [Tue, 18 May 2021 23:44:25 +0000 (23:44 +0000)]
Add a helper function to convert LogicalResult to int for return from main
At present, a lot of code contains main function bodies like "return failed(mlir::MlirOptMain(...));". This is unfortunate for two reasons: a) it uses ADL, which is maybe not what the free "failed" function was designed for; and b) it is a bit awkward to read, requiring the reader to understand both the boolean nature of the value and the semantics of main's return value. (It is also not portable, since 1 is not a portable failure value.)
The replacement code, `return mlir::AsMainReturnCode(mlir::MlirOptMain(...))` is a bit more self-explanatory.
The change applies the new function to a few internal uses of MlirOptMain, too.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102641
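A minimal sketch of such a helper, assuming a tiny stand-in `Result` type in place of `mlir::LogicalResult` so that the example is self-contained. The function name follows the commit text; the stand-in type and its `succeeded` helper are hypothetical.

```cpp
#include <cstdlib>

// Minimal stand-in for a success/failure result type, for illustration only.
struct Result {
  bool IsSuccess;
};

inline bool succeeded(Result R) { return R.IsSuccess; }

// Maps success/failure onto the portable exit codes from <cstdlib>.
inline int AsMainReturnCode(Result R) {
  return succeeded(R) ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A main function can then end with `return AsMainReturnCode(run());`, where `run` is whatever produces the result, without relying on a bool-to-int conversion.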
River Riddle [Tue, 18 May 2021 23:36:19 +0000 (16:36 -0700)]
[mlir] Speed up Lexer::getEncodedSourceLocation
We currently use SourceMgr::getLineAndColumn to get the line and column for an SMLoc, but this includes a call to StringRef::find_last_of that ends up dominating compile time. In D102567, we started creating locations from the input file for block arguments, which resulted in an extreme performance regression for modules with very large numbers of block arguments. This revision switches to just using a pointer offset from the beginning of the line to calculate the column (all MLIR files are simple ASCII), resulting in a compile time reduction from 4700 seconds (1 hour and 18 minutes) to 8 seconds.
Differential Revision: https://reviews.llvm.org/D102734
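The column computation can be sketched as an offset from the start of the current line. This is an assumption-laden illustration: `columnOf` is a hypothetical helper, and how the real Lexer locates the line start is not shown in the commit message, so the backwards scan below is just one simple way to do it.

```cpp
#include <cstddef>
#include <string_view>

// Returns the 1-based column of Buffer[Offset], assuming one byte per
// column (plain ASCII) and Offset <= Buffer.size().
std::size_t columnOf(std::string_view Buffer, std::size_t Offset) {
  std::size_t LineStart = Offset;
  while (LineStart > 0 && Buffer[LineStart - 1] != '\n')
    --LineStart; // walk back to just after the previous newline
  return Offset - LineStart + 1;
}
```

Because each byte is one column in ASCII input, no character decoding is needed and the cost is bounded by the length of the current line.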
Amy Huang [Mon, 15 Mar 2021 21:20:49 +0000 (14:20 -0700)]
Apply [[standalone_debug]] to some types in the STL.
Add this attribute to some types to ensure that they have
debug info.
The debug info for these classes is required for debuggers to display
some STL types. With constructor homing (a new debug info optimization)
their debug info isn't emitted because their constructors are never
called.
The types with the attribute added are __hash_value_type,
__value_type, __tree_node_base, __tree_node, __hash_node, __list_node,
and __forward_list_node.
Differential Revision: https://reviews.llvm.org/D98750
Arthur O'Dwyer [Thu, 13 May 2021 03:04:03 +0000 (23:04 -0400)]
[libc++] Alphabetize header inclusions and include-what-you-use <__debug>. NFCI.
Arthur O'Dwyer [Wed, 12 May 2021 17:09:26 +0000 (13:09 -0400)]
[libc++] Some fixes to the <bit> utilities.
Fix __bitop_unsigned_integer and rename to __libcpp_is_unsigned_integer.
There are only five unsigned integer types, so we should just list them out.
Also provide `__libcpp_is_signed_integer`, even though the Standard doesn't
consume that trait anywhere yet.
Notice that `concept uniform_random_bit_generator` is specifically specified
to rely on `concept unsigned_integral` and *not* `__is_unsigned_integer`.
Instantiating `std::ranges::sample` with a type `U` satisfying
`uniform_random_bit_generator` where `unsigned_integral<U::result_type>`
and not `__is_unsigned_integer<U::result_type>` is simply IFNDR.
Orthogonally, fix an undefined behavior in std::countr_zero(__uint128_t).
Orthogonally, improve tests for the <bit> manipulation functions.
It was these new tests that detected the bug in countr_zero.
Differential Revision: https://reviews.llvm.org/D102328
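"Just list them out" can be sketched as a trait with one explicit specialization per standard unsigned integer type. The name `is_unsigned_integer` is a hypothetical stand-in for the libc++-internal trait; cv-qualified types and any extended integer types are left out for brevity.

```cpp
#include <type_traits>

// Hypothetical trait for illustration: true only for the five standard
// unsigned integer types, false for everything else (bool, char, ...).
template <class T> struct is_unsigned_integer : std::false_type {};
template <> struct is_unsigned_integer<unsigned char> : std::true_type {};
template <> struct is_unsigned_integer<unsigned short> : std::true_type {};
template <> struct is_unsigned_integer<unsigned int> : std::true_type {};
template <> struct is_unsigned_integer<unsigned long> : std::true_type {};
template <> struct is_unsigned_integer<unsigned long long> : std::true_type {};
```

Unlike `std::is_unsigned`, a trait written this way rejects `bool` and the character types, since they are simply not on the list.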
Rong Xu [Tue, 18 May 2021 23:52:07 +0000 (16:52 -0700)]
Fix a buildbot failure from commit 886629a8
LLVM GN Syncbot [Tue, 18 May 2021 23:27:42 +0000 (23:27 +0000)]
[gn build] Port 886629a8c9e5
Rong Xu [Tue, 18 May 2021 23:08:38 +0000 (16:08 -0700)]
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This patch implements the first part of Flow Sensitive SampleFDO (FSAFDO).
It has the following changes:
(1) disable current discriminator encoding scheme,
(2) new hierarchical discriminator for FSAFDO.
For this patch, option "-enable-fs-discriminator=true" turns on the new
functionality. Option "-enable-fs-discriminator=false" (the default)
keeps the current SampleFDO behavior. When the fs-discriminator is
enabled, we insert a flag variable, namely, llvm_fs_discriminator, to
the object. This symbol will be checked by the create_llvm_prof tool, and used
to generate a profile with FS-AFDO discriminators enabled. If this
happens, for an extbinary format profile, create_llvm_prof tool
will add a flag to profile summary section.
Differential Revision: https://reviews.llvm.org/D102246
Mike Rice [Tue, 18 May 2021 16:18:17 +0000 (09:18 -0700)]
[OpenMP] Stabilize OpenMP/parallel_for_codegen.cpp test (NFC)
Revert recent commit to require x86-registered-target (e4b790c5e3653053819182a67c593bc65de860ac).
Remove -O1 from the run lines so they are less dependent on backend passes.
Update the CHECK6 and CHECK10 lines with script.
Differential Revision: https://reviews.llvm.org/D102720
Tomasz Miąsko [Wed, 19 May 2021 00:00:00 +0000 (00:00 +0000)]
[Demangle][Rust] Speculative fix for bot build failure
> error: ‘InType’ is not a class, namespace, or enumeration
Alex Orlov [Tue, 18 May 2021 22:38:13 +0000 (02:38 +0400)]
[symbolizer] Added StartAddress for the resolved function.
In many cases it is helpful to know at what address the resolved function starts.
This patch adds a new StartAddress member to the DILineInfo structure.
Reviewed By: jhenderson, dblaikie
Differential Revision: https://reviews.llvm.org/D102316
Fabian Sommer [Tue, 18 May 2021 22:06:08 +0000 (15:06 -0700)]
Default stack alignment of x86 NaCl to 16 bytes
X86 NaCl generally requires the stack to be aligned to 16 bytes.
This change was already implemented in two downstream NaCl compilers
based on llvm.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D102610
Tomasz Miąsko [Tue, 18 May 2021 16:15:00 +0000 (18:15 +0200)]
[Demangle][Rust] Parse tuples
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102579
Tomasz Miąsko [Tue, 18 May 2021 16:14:43 +0000 (18:14 +0200)]
[Demangle][Rust] Parse slice type
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102578
Tomasz Miąsko [Tue, 18 May 2021 16:14:02 +0000 (18:14 +0200)]
[Demangle][Rust] Parse array type
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102573
Tomasz Miąsko [Tue, 18 May 2021 16:13:21 +0000 (18:13 +0200)]
[Demangle][Rust] Parse named types
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102571
Peter Collingbourne [Tue, 18 May 2021 19:57:19 +0000 (12:57 -0700)]
scudo: Test realloc on increasing size buffers.
While developing a change to the allocator I ended up breaking
realloc on secondary allocations with increasing sizes. That didn't
cause any of the unit tests to fail, which indicated that we're
missing some test coverage here. Add a unit test for that case.
Differential Revision: https://reviews.llvm.org/D102716
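A minimal sketch of the kind of check the scudo commit above describes: grow one buffer through increasing sizes and verify the old contents survive each realloc. It uses the C library allocator rather than scudo's own test harness, and the helper name, fill byte, and doubling schedule are assumptions made for illustration, not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the scudo test suite): grows a buffer from
// StartSize up to MaxSize and checks that bytes written earlier are still
// intact after every realloc call.
bool reallocPreservesContents(std::size_t StartSize, std::size_t MaxSize) {
  assert(StartSize > 0 && StartSize <= MaxSize);
  unsigned char *P = static_cast<unsigned char *>(std::malloc(StartSize));
  if (!P)
    return false;
  std::memset(P, 0xAB, StartSize);

  std::size_t Size = StartSize;
  while (Size < MaxSize) {
    // Double the size, clamped to MaxSize so the last step cannot overshoot.
    std::size_t NewSize = Size > MaxSize / 2 ? MaxSize : Size * 2;
    unsigned char *Q = static_cast<unsigned char *>(std::realloc(P, NewSize));
    if (!Q) {
      std::free(P);
      return false;
    }
    P = Q;
    // The old prefix must be unchanged whether the block grew in place or moved.
    for (std::size_t I = 0; I < Size; ++I) {
      if (P[I] != 0xAB) {
        std::free(P);
        return false;
      }
    }
    // Fill the new tail so the next iteration checks it as well.
    std::memset(P + Size, 0xAB, NewSize - Size);
    Size = NewSize;
  }
  std::free(P);
  return true;
}
```

Calling reallocPreservesContents(64, 1u << 20) walks the buffer from small to large sizes; which sizes land in a given allocator's secondary path depends on that allocator's configuration.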
Sanjay Patel [Tue, 18 May 2021 20:08:28 +0000 (16:08 -0400)]
[x86] add FMF propagation test for target-specific intrinsic; NFC
Sanjay Patel [Tue, 18 May 2021 18:02:11 +0000 (14:02 -0400)]
[x86] trim zeros from constants for readability; NFC
River Riddle [Tue, 18 May 2021 21:31:33 +0000 (14:31 -0700)]
[mlir] Allow derived rewrite patterns to define a non-virtual `initialize` hook
This hook allows a pattern to provide custom initialization (e.g. marking that it has bounded recursion, setting the debug name, etc.) without needing to define a custom constructor. A non-virtual hook was chosen to avoid polluting the vtable with code that we really just want inlined when constructing the pattern. The alternative would be to define a constructor for each pattern, but that creates a lot of otherwise unnecessary boilerplate for many patterns; a hook provides a much simpler and cleaner interface for the very common case.
Differential Revision: https://reviews.llvm.org/D102440
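A rough sketch of how an optional, non-virtual `initialize` hook can be wired up in plain C++. It does not use MLIR and is not the code from the patch above: PatternBase, callInitialize, createPattern, WithHook and WithoutHook are all hypothetical names. The factory calls `initialize()` only when the derived class declares it, so nothing is added to a vtable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Stand-in base class for illustration; this is NOT mlir::RewritePattern.
struct PatternBase {
  std::string DebugName;
  bool BoundedRecursion = false;
};

// Preferred overload: only viable when T has a callable initialize() member.
template <typename T>
auto callInitialize(T &P, int) -> decltype(P.initialize(), void()) {
  P.initialize();
}

// Fallback overload: chosen when T has no initialize() member; does nothing.
template <typename T>
void callInitialize(T &, long) {}

// Hypothetical factory: construct the pattern, then run the hook if present.
template <typename T, typename... Args>
std::unique_ptr<T> createPattern(Args &&...As) {
  std::unique_ptr<T> Pattern(new T(std::forward<Args>(As)...));
  callInitialize(*Pattern, 0); // the literal 0 prefers the int overload
  return Pattern;
}

// A pattern that opts in to the hook instead of writing a constructor.
struct WithHook : PatternBase {
  void initialize() {
    DebugName = "WithHook";
    BoundedRecursion = true;
  }
};

// A pattern that does not define the hook; it still constructs normally.
struct WithoutHook : PatternBase {};
```

With this shape, createPattern<WithHook>() returns an object whose DebugName and BoundedRecursion were set by the hook, while createPattern<WithoutHook>() leaves the defaults untouched.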
River Riddle [Tue, 18 May 2021 21:31:22 +0000 (14:31 -0700)]
[mlir-docs] Add a blurb on recursion during pattern application
We currently do not document how the pattern rewriter infra treats recursion when it gets detected. This revision adds a blurb on recursion in patterns, and how patterns can signal that they are equipped to handle it.
Differential Revision: https://reviews.llvm.org/D102439
Arthur Eubanks [Tue, 18 May 2021 21:38:12 +0000 (14:38 -0700)]
[docs] Fix broken docs after 1c7f32334
Arthur Eubanks [Sun, 2 May 2021 04:27:47 +0000 (21:27 -0700)]
[NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().
A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.
The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.
This is a reland after an MSan fix in D102667.
Differential Revision: https://reviews.llvm.org/D101713
Arthur Eubanks [Tue, 4 May 2021 01:00:50 +0000 (18:00 -0700)]
[TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.
This is a reland after fixing MSan issues in D102667.
[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D101806
Arthur Eubanks [Tue, 18 May 2021 05:11:06 +0000 (22:11 -0700)]
[MSan] Set zeroext on call arguments to msan functions with zeroext parameter attribute
ABI attributes need to match between the caller and callee.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D102667
Konstantin Zhuravlyov [Tue, 18 May 2021 20:56:23 +0000 (16:56 -0400)]
AMDGPU/Docs: Remove reserved MACH 0x3E (it is no longer reserved), sort MACHs by value
Neumann Hon [Tue, 18 May 2021 19:02:11 +0000 (15:02 -0400)]
[SystemZ] [z/OS] Add XPLINK64 Calling Convention to SystemZ
This patch adds the XPLINK64 calling convention to the SystemZ
backend. It specifies and implements the argument passing and
return value conventions.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D101010
Martin Storsjö [Fri, 14 May 2021 20:34:51 +0000 (23:34 +0300)]
[compiler-rt] [builtins] Provide a SEH specific __gcc_personality_seh0
This matches how __gxx_personality_seh0 is hooked up in libcxxabi.
Differential Revision: https://reviews.llvm.org/D102530
Arthur Eubanks [Thu, 6 May 2021 23:30:39 +0000 (16:30 -0700)]
[NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved; they won't be invalidated unless the analyses they depend
on are.
SCEVAAResults was the one exception to this: it was treated like a
typical analysis result. Make it like the others and don't invalidate it
unless SCEV is invalidated.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102032
Mateusz Mikuła [Tue, 18 May 2021 20:36:50 +0000 (23:36 +0300)]
[MinGW] Fix the cmake condition for -mbig-obj
This is a correction to D102419, fixing the condition to the
form that actually works as intended.
Arthur Eubanks [Thu, 13 May 2021 22:44:21 +0000 (15:44 -0700)]
[OpaquePtr] Make loads and stores work with opaque pointers
Don't check that types match when the pointer operand is an opaque
pointer.
I would separate the Assembler and Verifier changes, but
verify-uselistorder in the Assembler test ends up running the verifier.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102450
Petr Hosek [Tue, 18 May 2021 19:59:57 +0000 (12:59 -0700)]
[CMake] Use -O0 for unittests under full LTO as well
We already use -O0 for unittests under ThinLTO; do the same for full LTO,
where the tradeoff between compile-time cost and runtime benefit is even worse.
Differential Revision: https://reviews.llvm.org/D102718
Reid Kleckner [Tue, 18 May 2021 19:34:02 +0000 (12:34 -0700)]
[PDB] Improve error handling when writes fail
Handle PDB writing errors like any other error in LLD: emit an error and
continue. This allows the linker to print timing data and summary data
after linking, which can be helpful for finding PDB size problems. Also
report how large the file would have been.
Example output:
lld-link: error: Output data is larger than 4 GiB. File size would have been 6,937,108,480
lld-link: error: failed to write PDB file ./chrome.dll.pdb
Summary
--------------------------------------------------------------------------------
33282 Input OBJ files (expanded from all cmd-line inputs)
4 PDB type server dependencies
0 Precomp OBJ dependencies
33396931 Input type records
... snip ...
Input File Reading: 59756 ms ( 45.5%)
GC: 7500 ms ( 5.7%)
ICF: 3336 ms ( 2.5%)
Code Layout: 6329 ms ( 4.8%)
PDB Emission (Cumulative): 46192 ms ( 35.2%)
Add Objects: 27609 ms ( 21.0%)
Type Merging: 16740 ms ( 12.8%)
Symbol Merging: 10761 ms ( 8.2%)
Publics Stream Layout: 9383 ms ( 7.1%)
TPI Stream Layout: 1678 ms ( 1.3%)
Commit to Disk: 3461 ms ( 2.6%)
--------------------------------------------------
Total Link Time: 131244 ms (100.0%)
Differential Revision: https://reviews.llvm.org/D102713
River Riddle [Tue, 18 May 2021 19:57:36 +0000 (12:57 -0700)]
[mlir-lsp-server] Add support for recording text document versions
The version is used by LSP clients to ignore stale diagnostics, and can be used in a followup to help verify incremental changes.
Differential Revision: https://reviews.llvm.org/D102644
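To illustrate why recording the document version is useful, here is a small sketch of a staleness check: diagnostics computed against an older version than the latest one recorded are dropped. The DocumentVersions class and its methods are hypothetical and are not part of mlir-lsp-server; treating an unknown document as stale is also an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical store mapping a document URI to the last version recorded.
class DocumentVersions {
  std::map<std::string, std::int64_t> Versions;

public:
  // Record the version reported when a document is opened or changed.
  void update(const std::string &Uri, std::int64_t Version) {
    Versions[Uri] = Version;
  }

  // Diagnostics are stale if they were computed for an older version, or if
  // the document is not tracked at all (an assumption of this sketch).
  bool isStale(const std::string &Uri, std::int64_t DiagVersion) const {
    auto It = Versions.find(Uri);
    return It == Versions.end() || DiagVersion < It->second;
  }
};
```

For example, after update("file:///a.mlir", 3), diagnostics tagged with version 2 are reported as stale and those tagged with version 3 are not.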
Sam Clegg [Tue, 18 May 2021 18:08:32 +0000 (11:08 -0700)]
[lld][WebAssembly] Convert test to assembly. NFC.
Differential Revision: https://reviews.llvm.org/D102704
Simon Pilgrim [Tue, 18 May 2021 19:25:42 +0000 (20:25 +0100)]
[X86][AVX] createVariablePermute - correctly extend same-sized-vector indices (PR50356)
D101838 incorrectly handled index vectors of the same size but with a higher element count by just bitcasting them to the target index type instead of performing a ZERO_EXTEND_VECTOR_INREG.
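The difference between the two operations can be sketched on plain arrays, assuming a 128-bit index vector of 16 x i8 that has to become 4 x i32. This is a scalar model for illustration only, not the SelectionDAG code; the type aliases and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar stand-ins for a 128-bit vector seen as 16 x i8 and as 4 x i32.
using V16i8 = std::array<std::uint8_t, 16>;
using V4i32 = std::array<std::uint32_t, 4>;

// Bitcast: reinterpret the same 128 bits. Each i32 lane ends up holding four
// packed i8 indices, which is not a usable 32-bit index.
V4i32 bitcastIndices(const V16i8 &In) {
  static_assert(sizeof(V16i8) == sizeof(V4i32), "vectors must be same size");
  V4i32 Out;
  std::memcpy(Out.data(), In.data(), sizeof(Out));
  return Out;
}

// Zero-extend in register: widen the low four i8 lanes to i32, one index per
// output lane, discarding the upper twelve input lanes.
V4i32 zeroExtendLowIndices(const V16i8 &In) {
  V4i32 Out;
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = In[I];
  return Out;
}
```

With indices {3, 2, 1, 0, ...} in the low lanes, the zero-extended form yields the four lanes 3, 2, 1, 0, while the bitcast packs all four bytes into the first 32-bit lane.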
Sam Clegg [Wed, 12 May 2021 23:48:34 +0000 (16:48 -0700)]
[lld][WebAssembly] Enable string tail merging in debug sections
This is a followup to https://reviews.llvm.org/D97657 which
applied string tail merging to data segments.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48828
Differential Revision: https://reviews.llvm.org/D102436
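For readers unfamiliar with string tail merging, a simplified sketch of the idea follows: a string that is a suffix of an already placed string reuses that string's bytes instead of being stored again. This is a quadratic toy version written for illustration, not lld's implementation; MergedTable and tailMerge are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: the merged table plus each string's offset in it.
struct MergedTable {
  std::string Data; // NUL-terminated strings, concatenated
  std::map<std::string, std::size_t> Offsets;
};

MergedTable tailMerge(std::vector<std::string> Strings) {
  // Process longest strings first, so a string that is the tail of another
  // is always visited after the string that contains it.
  std::sort(Strings.begin(), Strings.end(),
            [](const std::string &A, const std::string &B) {
              return A.size() > B.size();
            });

  MergedTable T;
  std::vector<std::pair<std::string, std::size_t>> Placed;
  for (const std::string &S : Strings) {
    if (T.Offsets.count(S))
      continue; // exact duplicate, already assigned an offset
    bool Reused = false;
    for (const auto &Entry : Placed) {
      const std::string &Str = Entry.first;
      // Reuse the tail of a placed string when S is a suffix of it.
      if (Str.size() >= S.size() &&
          Str.compare(Str.size() - S.size(), S.size(), S) == 0) {
        T.Offsets[S] = Entry.second + (Str.size() - S.size());
        Reused = true;
        break;
      }
    }
    if (!Reused) {
      // No containing string found: append S and its terminator.
      T.Offsets[S] = T.Data.size();
      Placed.emplace_back(S, T.Data.size());
      T.Data += S;
      T.Data.push_back('\0');
    }
  }
  return T;
}
```

Given "bar", "foobar" and "baz", the table stores only "foobar" and "baz"; "bar" points three bytes into "foobar" and shares its terminator.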