Richard Smith [Wed, 12 May 2021 21:19:23 +0000 (14:19 -0700)]
Clean up handling of constrained parameters in lambdas.
No functionality change intended.
Richard Smith [Wed, 12 May 2021 20:25:52 +0000 (13:25 -0700)]
Add test for substitutability of variable templates in closure type
mangling.
Pushpinder Singh [Wed, 12 May 2021 11:14:36 +0000 (11:14 +0000)]
[AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.
Reviewed By: JonChesterfield, ronlieb
Differential Revision: https://reviews.llvm.org/D102065
Change-Id: I10c0127ab7357787769fdf9a2edd4b3071e790a1
Peter Collingbourne [Wed, 12 May 2021 22:29:41 +0000 (15:29 -0700)]
scudo: Require fault address to be in bounds for UAF.
The bounds check that we previously had here was suitable for secondary
allocations but not for UAF on primary allocations, where it is likely
to result in false positives. Fix it by using a different bounds check
for UAF that requires the fault address to be in bounds.
Differential Revision: https://reviews.llvm.org/D102376
Christopher Di Bella [Sun, 2 May 2021 02:21:28 +0000 (02:21 +0000)]
[libcxx] modifies `_CmpUnspecifiedParam` to ignore types outside its domain
D85051's honeypot solution was a bit too aggressive and swallowed up the
comparison types, which made comparing objects of different ordering
types ambiguous.
Depends on D101707.
Differential Revision: https://reviews.llvm.org/D101708
Aart Bik [Wed, 12 May 2021 21:51:16 +0000 (14:51 -0700)]
[mlir][sparse][capi][python] add sparse tensor passes
First set of "boilerplate" to get sparse tensor
passes available through CAPI and Python.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D102362
Stanislav Mekhanoshin [Wed, 12 May 2021 21:18:16 +0000 (14:18 -0700)]
[AMDGPU] Refactor shouldExpandAtomicRMWInIR(). NFC.
This is a logic simplification for better readability.
Differential Revision: https://reviews.llvm.org/D102371
MaheshRavishankar [Wed, 12 May 2021 23:06:02 +0000 (16:06 -0700)]
[mlir][Linalg] Add interface methods to get lhs and rhs of contraction
Differential Revision: https://reviews.llvm.org/D102301
Justin Bogner [Wed, 12 May 2021 22:29:29 +0000 (15:29 -0700)]
Change the context instruction for computeKnownBits in LoadStoreVectorizer pass
This change enables cases for which the index value for the first
load/store instruction in a pair could be a function argument. This
allows using llvm.assume to provide known bits information in such
cases.
Patch by Viacheslav Nikolaev. Thanks!
Differential Revision: https://reviews.llvm.org/D101680
Greg Clayton [Wed, 12 May 2021 20:20:31 +0000 (13:20 -0700)]
Optimize GSymCreator::finalize.
The algorithm removing duplicates from the Funcs list used to have
amortized quadratic time complexity because it was potentially
removing each entry using std::vector::erase individually. This
patch now uses an erase-remove idiom with an adapted
removeIfBinary algorithm.
The old approach was probably chosen under the assumption that these
removals are rare, but there are cases where duplicate entries occur
frequently. In these cases, the actual runtime was very
poor, taking hours to process a single binary of around 1 GiB
including debug info. Another factor contributing to that was the
frequent output of the warning, which is now removed.
This seems to be an issue particularly with GCC-compiled binaries,
rather than clang-built binaries.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D102219
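The adapted algorithm itself is not shown in this log. As a rough sketch of the
idea only, the `removeIfBinary` below is a hypothetical stand-in (its signature
and behavior are assumptions, not the LLVM code): it compacts the range in one
linear pass by comparing each element with the last one kept, and the tail is
then erased once instead of calling std::vector::erase per duplicate.

```cpp
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical sketch, not the LLVM implementation: compacts [first, last) in
// a single pass. An element is dropped when pred(last kept element, element)
// returns true. Returns the new logical end, like std::remove_if does.
template <typename It, typename Pred>
It removeIfBinary(It first, It last, Pred pred) {
  if (first == last)
    return last;
  It kept = first; // position of the last element kept so far
  for (It cur = std::next(first); cur != last; ++cur) {
    if (!pred(*kept, *cur)) {
      ++kept;
      if (kept != cur)
        *kept = std::move(*cur);
    }
  }
  return std::next(kept);
}

// Erase-remove: one linear compaction followed by a single erase of the tail.
std::vector<int> dropAdjacentDuplicates(std::vector<int> funcs) {
  funcs.erase(removeIfBinary(funcs.begin(), funcs.end(),
                             [](int a, int b) { return a == b; }),
              funcs.end());
  return funcs;
}
```

Calling `dropAdjacentDuplicates` on a sorted list touches each element once,
whereas erasing each duplicate individually shifts the remaining tail every time.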
Eugene Zhulenev [Wed, 12 May 2021 21:30:09 +0000 (14:30 -0700)]
[mlir] Fix ssa values naming bug
Address comments in https://reviews.llvm.org/D102226 to fix the bug + style violations
Differential Revision: https://reviews.llvm.org/D102368
Sam Clegg [Tue, 11 May 2021 22:16:00 +0000 (15:16 -0700)]
[lld][WebAssembly] Allow data symbols to extend past end of segment
This fixes a bug with string merging with string symbols that contain
NULLs, as is the case in the `merge-string.s` test.
The bug only showed when we ran with `--relocatable` and then tried to read
the resulting object back in. In this case we would end up with string
symbols that extend past the end of the segment in which they live.
The problem comes from the fact that sections which are flagged as
string mergeable assume that all strings are NULL terminated. The
merging algorithm will drop trailing chars that follow a NULL since they
are essentially unreachable. However, the "size" attribute (in the
symbol table) of such a truncated symbol is not updated, resulting in a
symbol size that can overlap the end of the segment.
I verified that this can happen in ELF too given the right conditions
and that it's harmless enough. In practice, strings that contain embedded
NULLs should not be part of a mergeable section.
Differential Revision: https://reviews.llvm.org/D102281
Sam Clegg [Mon, 10 May 2021 21:58:47 +0000 (14:58 -0700)]
[WebAssembly] Add TLS data segment flag: WASM_SEG_FLAG_TLS
Previously the linker was relying solely on the name of the segment
to imply TLS.
Differential Revision: https://reviews.llvm.org/D102202
Heejin Ahn [Wed, 12 May 2021 20:19:30 +0000 (13:19 -0700)]
[WebAssembly] Allow Wasm EH with Emscripten SjLj
We explicitly made it error out in D101403, with the good intention that
the error message would make people less confused. It turns out we weren't
failing all cases of wasm EH + SjLj; only a few cases were failing, and
our client was able to work around them by fixing source code. But D101403
made it fail for all cases, including those that previously succeeded,
which we didn't intend. This reverts that change.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D102364
Craig Topper [Wed, 12 May 2021 19:38:06 +0000 (12:38 -0700)]
[RISCV] Remove RISCVII::VSEW enum. Make encodeVTYPE operate directly on SEW.
The VSEW encoding isn't a useful value to pass around. It's better
to use SEW or log2(SEW) directly. The only real ugliness is that
the vsetvli IR intrinsics use the VSEW encoding, but it's easy
enough to decode that when the intrinsic is processed.
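As an illustration of why the encoded value is easy to decode on demand, here is
a minimal sketch. It assumes the RISC-V vector spec's encoding, where field
value 0 means SEW = 8 and each increment doubles the width (so the field is
log2(SEW) - 3). The helper names are hypothetical and are not LLVM's API.

```cpp
// Hypothetical helpers, not LLVM's API. Assumes the vsew field stores
// log2(SEW) - 3, i.e. field value 0 corresponds to SEW = 8 bits.
constexpr unsigned sewFromField(unsigned vsewField) { return 8u << vsewField; }

// Inverse mapping; expects sew to be a power of two that is at least 8.
constexpr unsigned fieldFromSew(unsigned sew) {
  unsigned field = 0;
  while ((8u << field) < sew)
    ++field;
  return field;
}
```

Passing SEW (or its log2) around keeps call sites readable, and the conversion
is only needed at the point where the intrinsic operand is read.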
Richard Smith [Thu, 6 May 2021 01:56:58 +0000 (18:56 -0700)]
Fix bad mangling of <data-member-prefix> for a closure in the initializer of a variable at global namespace scope.
This implements the direction proposed in
https://github.com/itanium-cxx-abi/cxx-abi/pull/126.
Differential Revision: https://reviews.llvm.org/D101968
River Riddle [Wed, 12 May 2021 20:02:09 +0000 (13:02 -0700)]
[mlir-lsp-server][NFC] Add newline between Protocol JSON serialization methods and class definitions.
River Riddle [Wed, 12 May 2021 20:01:59 +0000 (13:01 -0700)]
[mlir-lsp-server] Add support for sending diagnostics to the client
This allows for diagnostics emitted during parsing/verification to be surfaced to the user by the language client, as opposed to just being emitted to the logs like they are now.
Differential Revision: https://reviews.llvm.org/D102293
Shoaib Meenai [Wed, 12 May 2021 19:59:23 +0000 (12:59 -0700)]
[flang] Fix standalone builds
Flang's CMake modules directory was being added to the CMake module path
twice, and AddFlang was being included after the first addition. Remove
the unnecessary first addition and move the AddFlang include down to the
second one. This way, it occurs after LLVM's CMake modules have been
included for a standalone build, so it can make use of those modules.
Erich Keane [Tue, 11 May 2021 13:40:48 +0000 (06:40 -0700)]
Suppress Deferred Diagnostics in discarded statements.
It doesn't really make sense to emit language-specific diagnostics
in a discarded statement, and suppressing these diagnostics enables a
programming pattern that many users will find quite useful.
Basically, this makes sure we only emit errors from the 'true' side of a
'constexpr if'.
It does this by giving the ExprEvaluatorBase type an opt-in setting
that controls whether it visits discarded cases.
Differential Revision: https://reviews.llvm.org/D102251
Suraj Sudhir [Wed, 12 May 2021 19:39:25 +0000 (12:39 -0700)]
[mlir][tosa] Remove tosa.identityn operator
Removes the identityn operator from TOSA MLIR definition.
Removes TosaToLinAlg mappings
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D102329
Fangrui Song [Wed, 12 May 2021 19:38:27 +0000 (12:38 -0700)]
[ELF][AVR] Add explicit relocation types to getRelExpr
Martin Storsjö [Sun, 9 May 2021 20:27:35 +0000 (23:27 +0300)]
[LLD] [COFF] Fix including the personality function for DWARF EH when linking with --gc-sections
Since c579a5b1d92a9bc2046d00ee2d427832e0f5ddec we don't traverse
.eh_frame when doing GC. But the exception handling personality
function needs to be included, and is only referenced from within
.eh_frame.
Differential Revision: https://reviews.llvm.org/D102138
Martin Storsjö [Sun, 2 May 2021 21:13:51 +0000 (00:13 +0300)]
[libcxx] [test] Fix fs.op.last_write_time for Windows
Don't use stat and lstat on Windows; lstat is missing, stat only provides
the modification times with second granularity (and does the wrong thing
regarding symlinks). Instead do a minimal reimplementation using the
native Windows APIs.
Differential Revision: https://reviews.llvm.org/D101731
Shoaib Meenai [Wed, 12 May 2021 19:14:20 +0000 (12:14 -0700)]
[cmake] Fix typo in function name
Not sure how my local testing didn't trigger this path. Should fix
https://lab.llvm.org/buildbot/#/builders/132/builds/5494
Mark de Wever [Wed, 12 May 2021 15:46:24 +0000 (17:46 +0200)]
[libc++][nfc] remove duplicated __to_unsigned.
Both `<type_traits>` and `<charconv>` implemented this function with
different names and a slightly different behavior. This removes the
version in `<charconv>` and improves the version in `<type_traits>`.
- The code can be used again in C++11.
- The original claimed C++14 support, but `[[nodiscard]]` is not
available in C++14.
- Adds `_LIBCPP_INLINE_VISIBILITY`.
Reviewed By: zoecarver, #libc, Quuxplusone
Differential Revision: https://reviews.llvm.org/D102332
Nikita Popov [Tue, 11 May 2021 21:01:29 +0000 (23:01 +0200)]
[InstCombine] Support one-hot merge for logical and/or
If a logical and/or is used, we need to be careful not to propagate
a potential poison value from the RHS by inserting a freeze
instruction. Otherwise it works the same way as bitwise and/or.
This is intended to address the regression reported at
https://reviews.llvm.org/D101191#2751002.
Differential Revision: https://reviews.llvm.org/D102279
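For readers unfamiliar with the fold, the scalar identity behind a one-hot merge
can be sketched as follows. This ignores poison entirely (plain C++ has no
equivalent, and the freeze described above exists only at the IR level), and the
function names are made up for illustration.

```cpp
#include <cstdint>

// Two separate single-bit tests: is either bit clear in x?
bool eitherBitClear(std::uint32_t x, std::uint32_t k1, std::uint32_t k2) {
  return (x & k1) == 0 || (x & k2) == 0;
}

// Merged form: one mask and one compare. Equivalent to the form above only
// when k1 and k2 each have exactly one bit set.
bool eitherBitClearMerged(std::uint32_t x, std::uint32_t k1, std::uint32_t k2) {
  const std::uint32_t mask = k1 | k2;
  return (x & mask) != mask;
}

// Exhaustively compares both forms for all 8-bit x and all one-hot k1, k2
// within the low 8 bits.
bool formsAgreeForOneHotMasks() {
  for (std::uint32_t x = 0; x < 256; ++x)
    for (unsigned a = 0; a < 8; ++a)
      for (unsigned b = 0; b < 8; ++b)
        if (eitherBitClear(x, 1u << a, 1u << b) !=
            eitherBitClearMerged(x, 1u << a, 1u << b))
          return false;
  return true;
}
```

In the logical (short-circuit) form the second operand is not always evaluated,
which is why the IR-level fold has to guard against poison on the RHS.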
Pratyush Das [Wed, 12 May 2021 17:28:41 +0000 (17:28 +0000)]
Add type information to integral template argument if required.
Non-comprehensive list of cases:
* Dumping template arguments;
* Corresponding parameter contains a deduced type;
* Template arguments are for a DeclRefExpr that hadMultipleCandidates()
Type information is added in the form of prefixes (u8, u, U, L),
suffixes (U, L, UL, LL, ULL) or explicit casts to the printed integral template
argument, if MSVC codeview mode is disabled.
Differential revision: https://reviews.llvm.org/D77598
Nico Weber [Wed, 12 May 2021 18:53:50 +0000 (14:53 -0400)]
Revert "Produce warning for performing pointer arithmetic on a null pointer."
This reverts commit dfc1e31d49fe1380c9bab43373995df5fed15e6d.
See discussion on https://reviews.llvm.org/D98798
Stephen Concannon [Wed, 12 May 2021 18:25:22 +0000 (20:25 +0200)]
[clang-tidy] Allow opt-in or out of some commonly occurring patterns in NarrowingConversionsCheck.
Within clang-tidy's NarrowingConversionsCheck:
* Allow opt-out of some commonly occurring patterns, such as:
- Implicit casts between types of equivalent bit widths.
- Implicit casts occurring from the return of a ::size() method.
- Implicit casts on size_type and difference_type.
* Allow opt-in of errors within template instantiations.
This will help projects adopt these guidelines iteratively.
Developed in conjunction with Yitzhak Mandelbaum (ymandel).
Patch by Stephen Concannon!
Differential Revision: https://reviews.llvm.org/D99543
Florian Hahn [Wed, 12 May 2021 18:16:36 +0000 (19:16 +0100)]
[PhaseOrdering] Add test for missing vectorization with NewPM.
Florian Hahn [Wed, 12 May 2021 16:02:49 +0000 (17:02 +0100)]
[SCEV] Add loop-guard pessimizing test with step = 2.
Stelios Ioannou [Tue, 11 May 2021 13:47:19 +0000 (14:47 +0100)]
[LoopFlatten] Simplify loops so that the pass can operate on unsimplified loops.
The loop flattening pass requires loops to be in simplified form. If the
loops are not in simplified form, the pass cannot operate. This patch
simplifies all loops before flattening. As a result, all loops will be
simplified regardless of whether anything ends up being flattened.
This change was inspired by observing a certain loop that was not flattened
because the loops were not in simplified form. This loop is added as a
test to verify that it is now flattened.
Differential Revision: https://reviews.llvm.org/D102249
Change-Id: I45bcabe70fb99b0d89f0effafc82eb9e0585ec30
Shoaib Meenai [Thu, 1 Oct 2020 21:35:28 +0000 (14:35 -0700)]
[cmake] Add support for multiple distributions
LLVM's build system contains support for configuring a distribution, but
it can often be useful to be able to configure multiple distributions
(e.g. if you want separate distributions for the tools and the
libraries). Add this support to the build system, along with
documentation and usage examples.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D89177
Rob Suderman [Tue, 11 May 2021 22:31:20 +0000 (15:31 -0700)]
[mlir][linalg] Fixed issue generating reassociation map with Rank-0 types
Rank-0 case causes a crash during linalg reshape operation.
Differential Revision: https://reviews.llvm.org/D102282
Benjamin Kramer [Wed, 12 May 2021 17:51:21 +0000 (19:51 +0200)]
Remove AST inclusion from Basic include
That's a cyclic dependency. NFC.
Valentin Clement [Wed, 12 May 2021 15:41:03 +0000 (11:41 -0400)]
[mlir][openacc] Add OpenACC translation to LLVM IR (enter_data op create/copyin)
This patch begins to translate the acc.enter_data operation to calls to the tgt runtime.
It currently only translates create/copyin operands of memref type. This acts as a basis to add support
for FIR types in the Flang/OpenACC support. It follows more or less the same path as clang
with `omp target enter data map` directives.
This patch takes a different approach than D100678: it performs a translation to LLVM IR
and makes use of the OpenMPIRBuilder instead of doing a conversion to the LLVMIR dialect.
OpenACC support in Flang will rely on the current OpenMP runtime where 1:1 lowering can be
applied. Some extensions will be added where features are not available yet.
A large part of this code will be shared with other standalone data operations in the OpenACC
dialect such as acc.exit_data and acc.update.
It is likely that parts of the lowering can also be shared later with the ops for
standalone data directives in the OpenMP dialect when they are introduced.
This is an initial translation and it probably needs more work.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D101504
Roman Lebedev [Wed, 12 May 2021 17:22:22 +0000 (20:22 +0300)]
[NFC][clang][Codegen] Split ThunkInfo into its own header
Otherwise we'll have issues with forward definition of GlobalDecl.
Split off from https://reviews.llvm.org/D100388
Roman Lebedev [Tue, 13 Apr 2021 12:10:39 +0000 (15:10 +0300)]
[NFCI][clang][Codegen] CodeGenVTables::addVTableComponent(): use getGlobalDecl
It does the same thing.
Split off from https://reviews.llvm.org/D100388
Fangrui Song [Wed, 12 May 2021 17:34:32 +0000 (10:34 -0700)]
[X86] Fix -Wunused-lambda-capture
Fangrui Song [Wed, 12 May 2021 17:22:26 +0000 (10:22 -0700)]
[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions
llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html
In an ELF shared object, a default visibility defined symbol is preemptible by default.
This creates some missed optimization opportunities. -fno-semantic-interposition can optimize -fPIC:
* in Clang: avoid GOT/PLT cost for variable access/function calls to external linkage definition in the same TU
* in GCC: enable interprocedural optimizations (including inlining) and avoid PLT
See https://gist.github.com/MaskRay/2d4dfcfc897341163f734afb59f689c6 for more information.
-Bsymbolic-functions is more aggressive than -fvisibility-inlines-hidden (present since 2012) as it applies
to all function definitions. It can
* avoid PLT for cross-TU function calls && reduce dynamic symbol lookup
* reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64
With both options, the libLLVM.so and libclang-cpp.so performance should
be closer to PIE binary linking against `libLLVM*.a` and `libclang*.a`.
(In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313.
The built clang with `-DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` is significantly faster.
See the Linux kernel build result at https://bugs.archlinux.org/task/70697 for details.)
Some implications:
Interposing a subset of functions is no longer supported.
(This is fragile anyway and cannot really be supported. For Mach-O we don't use
`ld -interpose`, so interposition is not supported on Mach-O at all.)
Compiling a program which takes the address of any LLVM function with
`{gcc,clang} -fno-pic` and expects the address to be equal to the address taken
from libLLVM.so or libclang-cpp.so is unsupported. I am fairly confident that
llvm-project shouldn't have different behaviors depending on such pointer
equality (as we've been using -fvisibility-inlines-hidden which applies to
inline functions for a long time), but if we accidentally do, users should be
aware that they should not make assumptions about pointer equality in `-fno-pic`
mode.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D102090
Inho Seo [Wed, 12 May 2021 17:28:37 +0000 (10:28 -0700)]
Update static bound checker for Linalg to cover decreasing cases
The current static checker for linalg does not handle the decreasing
index cases well. Update the static bound checker
for linalg to cover decreasing index cases.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D102302
Simon Pilgrim [Wed, 12 May 2021 16:34:11 +0000 (17:34 +0100)]
[X86][AVX] Fold concat(ps*lq(x,32),ps*lq(y,32)) -> shuffle(concat(x,y),zero) (PR46621)
On AVX1 targets we can handle v4i64 logical shifts by 32 bits as a pair of v8f32 shuffles with zero.
I was hoping to put this in LowerScalarImmediateShift, but performing that early causes regressions where other instructions were resplitting the subvectors.
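To see why a 64-bit shift by 32 is just a 32-bit shuffle with zero, the sketch
below models the lanes with plain arrays. It uses integer words as bit patterns
rather than v8f32, assumes the little-endian lane order used on x86, and all
type and function names are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4i64 = std::array<std::uint64_t, 4>;
using V8i32 = std::array<std::uint32_t, 8>;

// View four 64-bit lanes as eight 32-bit words, low word first.
V8i32 toWords(const V4i64 &v) {
  V8i32 w{};
  for (std::size_t i = 0; i < 4; ++i) {
    w[2 * i] = static_cast<std::uint32_t>(v[i]);
    w[2 * i + 1] = static_cast<std::uint32_t>(v[i] >> 32);
  }
  return w;
}

// Inverse of toWords.
V4i64 toQuads(const V8i32 &w) {
  V4i64 v{};
  for (std::size_t i = 0; i < 4; ++i)
    v[i] = (static_cast<std::uint64_t>(w[2 * i + 1]) << 32) | w[2 * i];
  return v;
}

// Reference: logical shift left of every 64-bit lane by 32.
V4i64 shl32(const V4i64 &v) {
  V4i64 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = v[i] << 32;
  return r;
}

// Same result as a word shuffle: each low word moves to the high slot and the
// low slot is taken from a zero vector.
V4i64 shl32AsShuffle(const V4i64 &v) {
  const V8i32 w = toWords(v);
  V8i32 out{}; // starts out as the zero vector
  for (std::size_t i = 0; i < 4; ++i)
    out[2 * i + 1] = w[2 * i];
  return toQuads(out);
}
```

The logical right shift is the mirror image: the high word moves to the low
slot and the high slot is zeroed.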
Aart Bik [Tue, 11 May 2021 23:14:00 +0000 (16:14 -0700)]
[mlir][sparse] keep runtime support library signature consistent
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102285
Amara Emerson [Wed, 12 May 2021 08:04:55 +0000 (01:04 -0700)]
[AArch64][GlobalISel] Add MMOs to constant pool loads to allow LICM hoisting.
Without MMOs these loads could not be hoisted, which caused performance regressions vs SDAG on SingleSource/Benchmarks/Adobe-C++.
Greg McGary [Tue, 30 Mar 2021 00:33:48 +0000 (17:33 -0700)]
[lld-macho] Implement branch-range-extension thunks
Extend the range of calls beyond an architecture's limited branch range by first calling a thunk, which loads the far address into a scratch register (x16 on ARM64) and branches through it.
Other ports (COFF, ELF) use multiple passes with successively-refined guesses regarding the expansion of text-space imposed by thunk-space overhead. This MachO algorithm places thunks during MergedOutputSection::finalize() in a single pass using exact thunk-space overheads. Thunks are kept in a separate vector to avoid the overhead of inserting into the `inputs` vector of `MergedOutputSection`.
FIXME:
* arm64-stubs.s test is broken
* add thunk tests
* Handle thunks to DylibSymbol in MergedOutputSection::finalize()
Differential Revision: https://reviews.llvm.org/D100818
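As a small illustration of the range limit involved, the check below decides
whether a direct call can reach its target. It relies on the ARM64 B/BL
encoding (a signed 26-bit word offset, so a reach of -128 MiB to +128 MiB - 4).
The function name and the standalone form are hypothetical and are not taken
from lld.

```cpp
#include <cstdint>

// ARM64 B/BL encodes a signed 26-bit offset counted in 4-byte words, so a
// direct branch reaches targets in [-128 MiB, +128 MiB - 4] from the call site.
constexpr std::int64_t kBranchRange = 128LL * 1024 * 1024;

// Hypothetical helper: true if the target is out of direct branch range, in
// which case the call would have to go through a thunk.
bool needsThunk(std::uint64_t callSite, std::uint64_t target) {
  // Unsigned subtraction wraps; the cast recovers the signed displacement.
  const std::int64_t delta = static_cast<std::int64_t>(target - callSite);
  return delta < -kBranchRange || delta >= kBranchRange;
}
```

Placing thunks in a single pass, as described above, requires knowing exact
addresses at the moment such a check is made, since every inserted thunk shifts
the text that follows it.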
Jon Chesterfield [Wed, 12 May 2021 16:30:40 +0000 (17:30 +0100)]
[libomptarget][amdgpu][nfc] Expand errorcheck macros
These macros expand to continue, which is confusing, or exit,
which is incompatible with continuing execution on offloading fail.
Expanding the macros in place makes the code look untidy but the
control flow obvious and amenable to improving. In particular, exit
becomes easier to eliminate.
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D102230
Abhina Sreeskantharajan [Wed, 12 May 2021 16:26:00 +0000 (12:26 -0400)]
[SystemZ][z/OS] Fix warning caused by umask returning a signed integer type
On z/OS, umask() returns an int because mode_t is type int; however, it is being compared to an unsigned int. This patch fixes the following warning we see when compiling Path.cpp.
```
comparison of integers of different signs: 'const int' and 'const unsigned int'
```
Reviewed By: muiez
Differential Revision: https://reviews.llvm.org/D102326
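This log does not show how Path.cpp was changed. One common way to silence this
kind of warning is an explicit cast at the comparison, sketched below with a
hypothetical helper; whether the patch does exactly this is an assumption.

```cpp
// Hypothetical helper, not the code from Path.cpp: compares a mask that was
// returned as a signed int (as umask() does on z/OS, per the message above)
// against an unsigned expected value without a sign-compare warning.
bool maskEquals(int maskFromUmask, unsigned expected) {
  // A file mode mask is never negative, so the conversion preserves the value.
  return static_cast<unsigned>(maskFromUmask) == expected;
}
```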
Malcolm Parsons [Wed, 12 May 2021 16:11:19 +0000 (17:11 +0100)]
[docs] Fix documentation for bugprone-dangling-handle
string_view isn't experimental anymore.
This check has always handled both forms.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D102313
Fabian Schuiki [Wed, 21 Apr 2021 08:57:29 +0000 (10:57 +0200)]
[MLIR] Factor pass timing out into a dedicated timing manager
This factors out the pass timing code into a separate `TimingManager`
that can be plugged into the `PassManager` from the outside. Users are
able to provide their own implementation of this manager, and use it to
time additional code paths outside of the pass manager. Also allows for
multiple `PassManager`s to run and contribute to a single timing report.
More specifically, moves most of the existing infrastructure in
`Pass/PassTiming.cpp` into a new `Support/Timing.cpp` file and adds a
public interface in `Support/Timing.h`. The `PassTiming` instrumentation
becomes a wrapper around the new timing infrastructure which adapts the
instrumentation callbacks to the new timers.
Reviewed By: rriddle, lattner
Differential Revision: https://reviews.llvm.org/D100647
Victor Huang [Wed, 12 May 2021 15:56:54 +0000 (10:56 -0500)]
[PowerPC] Fix definitions of CMPRB8, CMPEQB, CMPRB, SETB in PPCInstr64Bit.td and PPCInstrInfo.td
Baptiste Saleil [Wed, 12 May 2021 15:37:06 +0000 (11:37 -0400)]
[AMDGPU] Disable the SIFormMemoryClauses pass at -O1
This patch disables the SIFormMemoryClauses pass at -O1. This pass has a
significant impact on compilation time, so we only want it to be enabled
starting from -O2.
Differential Revision: https://reviews.llvm.org/D101939
Paul Robinson [Wed, 12 May 2021 15:48:50 +0000 (08:48 -0700)]
Fix grammar in README.md
Simon Pilgrim [Wed, 12 May 2021 15:25:08 +0000 (16:25 +0100)]
[X86][AVX] combineConcatVectorOps - add ConcatSubOperand helper. NFCI.
Pull out repeated code to create a concat_vectors of the same operand from all subvecs.
Simon Pilgrim [Wed, 12 May 2021 14:46:52 +0000 (15:46 +0100)]
[X86][AVX] Add v4i64 shift-by-32 tests
AVX1 could perform this as a v8f32 shuffle instead of splitting - based off PR46621
Fraser Cormack [Fri, 7 May 2021 14:25:40 +0000 (15:25 +0100)]
[TargetLowering] Improve legalization of scalable vector types
This patch extends the vector type-conversion and legalization capabilities of
scalable vector types.
Firstly, `vscale x 1` types now behave more like the corresponding `vscale x
2+` types. This enables the integer promotion legalization of extended scalable
types, such as the promotion of `<vscale x 1 x i5>` to `<vscale x 1 x i8>`.
These `vscale x 1` types are also now better handled by
`getVectorTypeBreakdown`, where what looks like older handling for 1-element
fixed-length vector types was spuriously updated to include scalable types.
Widening of scalable types is now better supported, by using `INSERT_SUBVECTOR`
to insert the smaller scalable vector "value" type into the wider scalable
vector "part" type. This allows AArch64 to pass and return `vscale x 1` types
by value by widening.
There are still cases where we are unable to legalize `vscale x 1` types, such
as where expansion would require splitting the vector in two.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D102073
Valentin Clement [Wed, 12 May 2021 15:31:18 +0000 (11:31 -0400)]
[mlir][openacc] Conversion of data operand to LLVM IR dialect
Add a conversion pass to convert higher-level types before translation.
This conversion extracts meaningful information and packs it into a struct that
the translation (D101504) will be able to understand.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D102170
Anastasia Stulova [Wed, 12 May 2021 15:19:13 +0000 (16:19 +0100)]
[OpenCL] Remove pragma requirement from Arm dot extension.
This removes the pointless need for the extension pragma, since
the pragma doesn't disable anything properly and doesn't need to
enable anything that is not possible to disable.
The change doesn't break existing kernels, since it allows
more cases to compile (i.e. without pragma statements), and the
pragma continues to be accepted.
Differential Revision: https://reviews.llvm.org/D100985
Jordan Rupprecht [Wed, 12 May 2021 01:36:53 +0000 (18:36 -0700)]
[llvm-cov][test] Add test coverage for "gcov" implying "llvm-cov gcov" compatibility.
Much like other LLVM binary utilities, `llvm-cov` has a symlink compatibility feature where it runs in `gcov` compatibility mode if the binary name ends in `gcov`. This is identical to invoking `llvm-cov gcov ...`.
Differential Revision: https://reviews.llvm.org/D102299
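The dispatch described above can be sketched as a simple suffix check on the
program name. The function below is a hypothetical illustration, not the code
in llvm-cov, and it ignores details such as stripping a platform-specific file
extension.

```cpp
#include <string>

// Hypothetical sketch of symlink-based dispatch: returns true when the name
// the binary was invoked as (argv[0]) ends in "gcov".
bool impliesGcovMode(const std::string &programName) {
  const std::string suffix = "gcov";
  return programName.size() >= suffix.size() &&
         programName.compare(programName.size() - suffix.size(),
                             suffix.size(), suffix) == 0;
}
```

With this rule, a symlink named `gcov` that points at `llvm-cov` behaves like
`llvm-cov gcov ...`, while invoking `llvm-cov` directly does not.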
Yaxun (Sam) Liu [Tue, 11 May 2021 14:09:38 +0000 (10:09 -0400)]
[CUDA][HIP] Fix device template variables
Currently clang does not emit device template variables
instantiated only in host functions; however, nvcc is
able to do that:
https://godbolt.org/z/fneEfferY
This patch fixes this issue by refactoring and extending
the existing mechanism for emitting static device
variables ODR-used only by host code. Basically, clang records
device variables ODR-used by host code and forces
them to be emitted in device compilation. The existing
mechanism makes sure these device variables ODR-used
by host code are added to llvm.compiler-used, therefore
they are guaranteed not to be deleted.
It also fixes a bug where non-ODR-use of a static device variable by host code
caused the static device variable to be emitted and registered,
which should not happen.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D102237
Craig Topper [Wed, 12 May 2021 14:27:52 +0000 (07:27 -0700)]
[ValueTypes] Rename MVT::getVectorNumElements() to MVT::getVectorMinNumElements(). Fix some misuses of getVectorNumElements()
getVectorNumElements() returns a value for scalable vectors
without any warning so it is effectively getVectorMinNumElements().
By renaming it and making getVectorNumElements() forward to
it, we can insert a check for scalable vectors into getVectorNumElements()
similar to EVT. I didn't do that in this patch because there are still more
fixes needed, but I was able to temporarily do it and passed the RISCV
lit tests with these changes.
The changes to isPow2VectorType and getPow2VectorType are copied from EVT.
The change to TypeInfer::EnforceSameNumElts reduces the size of AArch64's isel table.
We're now considering SameNumElts to require the scalable property to match which
removes some unneeded type checks.
This was motivated by the bug I fixed yesterday in 80b9510806cf11c57f2dd87191d3989fc45defa8
Reviewed By: frasercrmck, sdesmalen
Differential Revision: https://reviews.llvm.org/D102262
Stefan Pintilie [Wed, 12 May 2021 14:42:09 +0000 (09:42 -0500)]
Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This reverts commit 6c80361b8474535852afb2f7201370fb5f410091.
Breaks PowerPC Big Endian buildbots.
Hendrik Greving [Tue, 11 May 2021 15:57:18 +0000 (08:57 -0700)]
[DAGCombiner] Fix DAG combine store elimination, different address space.
Fixes a bug in the DAG combiner that eliminated stores because it failed
to inspect the address space of the pointers.
%v = load %ptr_as1
// no chain side effect
store %v, %ptr_as2
As well as
store %v, %ptr_as1
store %v, %ptr_as2
Fixes a test for the above in X86.
Differential Revision: https://reviews.llvm.org/D102096
Hendrik Greving [Tue, 11 May 2021 14:16:35 +0000 (07:16 -0700)]
[DAGCombiner] Add test exposing bug in DAG combine.
Adds a test in X86, exposing a bug in DAG combine eliminating stores that
are the same value but not the same address space.
Differential Revision: https://reviews.llvm.org/D102243
Peter Waller [Tue, 27 Apr 2021 12:29:42 +0000 (12:29 +0000)]
[CodeGen][AArch64][SVE] Fold [rdffr, ptest] => rdffrs; bugfix for optimizePTestInstr
When a ptest is used to set flags from the output of rdffr, the ptest
can be eliminated, using a flags-setting rdffrs instead.
Additionally, check that nothing consumes flags between rdffr and ptest;
this case appears to have been missed previously.
* There is no unpredicated RDFFRS instruction.
* If substituting RDFFR_PP, require that the mask argument of the
PTEST matches that of the RDFFR_PP.
* Move some precondition code up inside optimizePTestInstr, so that it
covers the new code paths for RDFFR which return earlier.
* Only consider RDFFR, PTEST in same basic block.
* Check for other flag setting instructions between the two, abort if
found.
* Drop an old TODO comment about removing dead PTEST instructions.
RDFFR_P to follow in later patch.
Differential Revision: https://reviews.llvm.org/D101357
Ben Shi [Wed, 12 May 2021 14:01:28 +0000 (22:01 +0800)]
[clang][AVR] Redefine some types to be compatible with avr-gcc
Reviewed By: dylanmckay
Differential Revision: https://reviews.llvm.org/D100701
David Sherwood [Wed, 12 May 2021 13:49:04 +0000 (14:49 +0100)]
[NFC] Use variable GEP index in vec_demanded_elts tests
I've changed a test in each of these files:
Transforms/InstCombine/vec_demanded_elts.ll
Transforms/InstCombine/vec_demanded_elts-inseltpoison.ll
to use a variable GEP index instead of a constant value so that
we're testing the more general case.
Martin Storsjö [Wed, 12 May 2021 09:03:54 +0000 (12:03 +0300)]
[Passes] Reenable the relative lookup table converter pass for ELF and COFF on aarch64
The bug (PR50227, affecting COFF) that caused the revert in
6f5670a4c3d8c079d4b676140ee69e5cc235d5a8 has been fixed in
382c505d9cfca8adaec47aea2da7bbcbc00fc05c now, so it should be safe
to reenable the pass for that target (and ELF).
In PR50227 it's also mentioned that the same pass seems to cause
problems on aarch64 on darwin, so leaving it disabled there for now.
Greg McGary [Mon, 3 May 2021 02:08:02 +0000 (19:08 -0700)]
[llvm-objdump] Exclude __mh_*_header symbols during MachO disassembly
`__mh_(execute|dylib|dylinker|bundle|preload|object)_header` are special symbols whose values hold the VMA of the Mach header to support introspection. They are attached to the first section in `__TEXT`, even though their addresses are outside `__TEXT`, and they do not refer to code.
It is normally harmless, but when the first section of `__TEXT` has no other symbols, `__mh_*_header` is considered by the disassembler when determining function boundaries. Since `__mh_*_header` refers to an address outside `__TEXT`, the boundary determination fails and disassembly quits.
Since `__TEXT,__text` normally has symbols, this bug is obscured. Experiments placing `__stubs` and `__stub_helper` first exposed the bug, since neither has symbols.
Differential Revision: https://reviews.llvm.org/D101786
Julien Pagès [Wed, 12 May 2021 13:14:56 +0000 (14:14 +0100)]
[AMDGPU] Improve Codegen for build_vector
Improve the code generation of build_vector.
Use the v_pack_b32_f16 instruction instead of
v_and_b32 + v_lshl_or_b32
Differential Revision: https://reviews.llvm.org/D98081
Patch by Julien Pagès!
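The bit arithmetic behind this change can be sketched in Python. This is a simplified integer model, assuming 32-bit registers holding 16-bit payloads; it ignores any floating-point handling the real pack instruction may apply, and the function names are hypothetical.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def and_then_lshl_or(lo, hi):
    """Two-step form: mask the low half, then shift the high half and OR."""
    masked = lo & MASK16                     # models v_and_b32
    return ((hi << 16) | masked) & MASK32    # models v_lshl_or_b32


def pack_16(lo, hi):
    """Single-step form: place two 16-bit payloads side by side."""
    return ((hi & MASK16) << 16) | (lo & MASK16)


samples = [(0x3C00, 0x4000), (0xFFFF1234, 0xABCD5678), (0, 0xFFFF)]
results = [(and_then_lshl_or(lo, hi), pack_16(lo, hi)) for lo, hi in samples]
```

In this model both forms produce the same 32-bit value, which is why a single pack can stand in for the two-instruction sequence.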
Roman Lebedev [Wed, 12 May 2021 13:09:37 +0000 (16:09 +0300)]
[InstCombine] ~(C + X) --> ~C - X (PR50308)
We can not rely on (C+X)-->(X+C) already happening,
because we might not have visited that `add` yet.
The added testcase would get stuck in an endless combine loop.
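The identity follows from ~Y == -Y - 1 in two's complement, so ~(C + X) == -C - X - 1 == ~C - X. A small Python sketch, assuming 8-bit wraparound arithmetic purely for illustration, checks it exhaustively:

```python
BITS = 8                      # assumption: 8-bit integers keep the check small
MASK = (1 << BITS) - 1


def not_of_add(c, x):
    """~(C + X) with wraparound."""
    return ~(c + x) & MASK


def sub_from_not(c, x):
    """~C - X with wraparound."""
    return (~c - x) & MASK


mismatches = [(c, x)
              for c in range(1 << BITS)
              for x in range(1 << BITS)
              if not_of_add(c, x) != sub_from_not(c, x)]
```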
Jay Foad [Tue, 11 May 2021 14:14:04 +0000 (15:14 +0100)]
[TargetRegisterInfo] Speed up getAllocatableSet. NFCI.
MachineRegisterInfo caches the reserved register set that is computed
by TargetRegisterInfo::getReservedRegs, so call into MRI to get the
reserved regs to avoid recomputing them.
In particular this speeds up AMDGPU's SIFormMemoryClauses pass because
AMDGPU has a particularly complicated reserved set that is expensive to
compute.
Differential Revision: https://reviews.llvm.org/D102318
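The general idea, reusing a cached reserved set rather than recomputing it on every query, can be sketched as follows. This is a toy Python model, not LLVM's classes; `TargetInfo`, `FunctionRegInfo`, and the register names are hypothetical.

```python
class TargetInfo:
    """Stand-in for a target whose reserved set is costly to build."""

    def __init__(self):
        self.compute_calls = 0

    def compute_reserved(self):
        self.compute_calls += 1
        return frozenset({"sp", "fp"})   # hypothetical reserved registers


class FunctionRegInfo:
    """Stand-in for per-function state that caches the reserved set."""

    def __init__(self, target):
        self._target = target
        self._reserved = None

    def reserved(self):
        # Compute once, then hand back the cached set.
        if self._reserved is None:
            self._reserved = self._target.compute_reserved()
        return self._reserved


def allocatable(all_regs, reg_info):
    """Registers that are not reserved, using the cached reserved set."""
    return set(all_regs) - reg_info.reserved()


target = TargetInfo()
info = FunctionRegInfo(target)
sets = [allocatable(["r0", "r1", "sp", "fp"], info) for _ in range(3)]
```

However many times the allocatable set is requested, the expensive computation runs only once in this model.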
Tobias Gysi [Wed, 12 May 2021 12:43:34 +0000 (12:43 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgInterchangePattern...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102245
Piotr Sobczak [Wed, 12 May 2021 12:52:02 +0000 (14:52 +0200)]
[AMDGPU] Remove assert
Remove assert introduced in D101177, following post-commit feedback.
Sanjay Patel [Wed, 12 May 2021 12:03:11 +0000 (08:03 -0400)]
[x86] try harder to lower to PCMPGT instead of not-of-PCMPEQ
This is motivated by the example in https://llvm.org/PR50055 ,
but it doesn't do anything for that bug currently because we
don't actually have a zero-extended setcc there.
Proof for the generic transform (inverse of what we would
try to do in combining):
https://alive2.llvm.org/ce/z/aBL-Mg
Differential Revision: https://reviews.llvm.org/D102275
Sanjay Patel [Tue, 11 May 2021 19:52:43 +0000 (15:52 -0400)]
[x86] add test for pcmpeq with 0; NFC
Nathan James [Wed, 12 May 2021 12:18:40 +0000 (13:18 +0100)]
[clang-tidy][NFC] Simplify a lot of bugprone-sizeof-expression matchers
There should be a follow up to this for changing the traversal mode, but some of the tests don't like that.
Reviewed By: steveire
Differential Revision: https://reviews.llvm.org/D101614
Tobias Gysi [Wed, 12 May 2021 12:00:08 +0000 (12:00 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgBufferize...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102308
David Spickett [Wed, 12 May 2021 12:12:28 +0000 (13:12 +0100)]
Revert "[scudo] Enable arm32 arch"
This reverts commit
b1a77e465e37fc400c16f9fda2a637f11c698bb9.
Which has a failing test on our armv7 bots:
https://lab.llvm.org/buildbot/#/builders/59/builds/1812
Hana Joo [Wed, 12 May 2021 11:57:17 +0000 (12:57 +0100)]
[clang-tidy] Enable the use of IgnoreArray flag in pro-type-member-init rule
The `IgnoreArray` flag was not used before while running the rule. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=47288 | b/47288 ]]
Reviewed By: njames93
Differential Revision: https://reviews.llvm.org/D101239
Tobias Gysi [Wed, 12 May 2021 11:34:13 +0000 (11:34 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgToStandard...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102236
Kristina Bessonova [Sun, 9 May 2021 17:29:56 +0000 (19:29 +0200)]
[libcxx] NFC. Correct wordings of _LIBCPP_ASSERT debug messages
Differential Revision: https://reviews.llvm.org/D102195
Simon Pilgrim [Wed, 12 May 2021 11:02:06 +0000 (12:02 +0100)]
[X86][AVX] canonicalizeShuffleMaskWithHorizOp - improve support for 256/512-bit vectors
Extend the HOP(HOP(X,Y),HOP(Z,W)) and SHUFFLE(HOP(X,Y),HOP(Z,W)) folds to handle repeating 256/512-bit vector cases.
This allows us to drop the UNPACK(HOP(),HOP()) custom fold in combineTargetShuffle.
This required isRepeatedTargetShuffleMask to be tweaked to support target shuffle masks taking more than 2 inputs.
gbreynoo [Wed, 12 May 2021 11:09:08 +0000 (12:09 +0100)]
[llvm-readelf] Unhide short options to match the command guide
The readelf command guide shows the short options used as aliases but
these are not found in the help text unless --show-hidden is used; other
tools show aliases with --help. This change fixes the help output to be
consistent with the command guide.
Differential Revision: https://reviews.llvm.org/D102173
gbreynoo [Wed, 12 May 2021 11:04:54 +0000 (12:04 +0100)]
[llvm-symbolizer] Place Mach-O options into the Mach-O option group.
In the help output of other tools and in the symbolizer command guide,
Mach-O specific options are in their own section. This change fixes the
symbolizer help output to be consistent.
Differential Revision: https://reviews.llvm.org/D102178
David Sherwood [Wed, 21 Apr 2021 15:36:11 +0000 (16:36 +0100)]
[LoopVectorize] Fix scalarisation crash in widenPHIInstruction for scalable vectors
In InnerLoopVectorizer::widenPHIInstruction there are cases where we have
to scalarise a pointer induction variable after vectorisation. For scalable
vectors we already deal with the case where the pointer induction variable
is uniform, but we currently crash if it is not uniform. For fixed-width vectors
we calculate every lane of the scalarised pointer induction variable for a
given VF; however, this cannot work for scalable vectors. In this case I
have added support for caching the whole vector value for each unrolled
part so that we can always extract an arbitrary element. Additionally, we
still continue to cache the known minimum number of lanes too in order
to improve code quality by avoiding an extractelement operation.
I have adapted an existing test `pointer_iv_mixed` from the file:
Transforms/LoopVectorize/consecutive-ptr-uniforms.ll
and added it here for scalable vectors instead:
Transforms/LoopVectorize/AArch64/sve-widen-phi.ll
Differential Revision: https://reviews.llvm.org/D101294
Peter Waller [Thu, 29 Apr 2021 15:40:34 +0000 (15:40 +0000)]
[AArch64][SVE] Improve sve.convert.to.svbool lowering
The sve.convert.to.svbool lowering has the effect of widening a logical
<M x i1> vector representing lanes into a physical <16 x i1> vector
representing bits in a predicate register.
In general, if converting to svbool, the contents of lanes in the
physical register might not be known. For sve.convert.to.svbool the new
lanes are specified to be zeroed, requiring 'and' instructions to mask
off the new lanes. For lanes coming from a ptrue or a comparison,
however, they are known to be zero.
CodeGen Before:
ptrue p0.s, vl16
ptrue p1.s
ptrue p2.b
and p0.b, p2/z, p0.b, p1.b
ret
After:
ptrue p0.s, vl16
ret
Differential Revision: https://reviews.llvm.org/D101544
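Why the masking is redundant for a ptrue can be seen in a simplified Python model of predicate registers. It assumes a 512-bit vector (one predicate bit per byte, so 64 bits) and models only the behaviour relevant here; `ptrue` and `and_z` are hypothetical helpers, not real APIs.

```python
VL_BYTES = 64  # assumption: 512-bit vector, one predicate bit per byte


def ptrue(elem_bytes, count=None):
    """Set the lowest bit of each of the first `count` elements; others stay 0."""
    total = VL_BYTES // elem_bytes
    if count is None:
        count = total            # "all elements" pattern
    if count > total:
        count = 0                # pattern does not fit: all false
    return [1 if i % elem_bytes == 0 and i // elem_bytes < count else 0
            for i in range(VL_BYTES)]


def and_z(governing, a, b):
    """Zeroing AND: a & b where the governing predicate is set, else 0."""
    return [x & y if g else 0 for g, x, y in zip(governing, a, b)]


p0 = ptrue(4, count=16)   # ptrue p0.s, vl16
p1 = ptrue(4)             # ptrue p1.s
p2 = ptrue(1)             # ptrue p2.b
masked = and_z(p2, p0, p1)  # and p0.b, p2/z, p0.b, p1.b
```

In this model `masked` equals `p0`: the extra lanes of a ptrue result are already zero, so the `and` changes nothing.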
Michał Górny [Wed, 5 May 2021 11:06:55 +0000 (13:06 +0200)]
[Process/elf-core] Read PID from FreeBSD prpsinfo
Add a function to read NT_PRPSINFO note from FreeBSD core dumps. This
is necessary to get the process ID (NT_PRSTATUS has only thread ID).
Move the lp64 check from NT_PRSTATUS parsing to the parseFreeBSDNotes()
to avoid repeating it.
Differential Revision: https://reviews.llvm.org/D101893
Michał Górny [Thu, 22 Apr 2021 17:21:50 +0000 (19:21 +0200)]
[lldb] [Process/elf-core] Fix reading FPRs from FreeBSD/i386 cores
The FreeBSD coredumps from i386 systems contain only FSAVE-style
NT_FPREGSET. Since we do not really support reading that kind of data
anymore, just use NT_X86_XSTATE to get FXSAVE-style data when available.
Differential Revision: https://reviews.llvm.org/D101086
Stephen Tozer [Mon, 10 May 2021 13:00:01 +0000 (14:00 +0100)]
Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"
Previous crashes caused by this patch were the result of machine
subregisters being incorrectly handled in updateDbgUsersToReg; this has
been fixed by using RegUnits to determine overlapping registers, instead
of using the register values directly.
Differential Revision: https://reviews.llvm.org/D101523
This reverts commit
7ca26c5fa2df253878cab22e1e2f0d6f1b481218.
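A minimal Python sketch of why register units give the right overlap answer where comparing register values directly does not. The register table below is hypothetical and only mimics the shape of the idea: two registers overlap when they share at least one unit.

```python
# Hypothetical register file: each register maps to the set of units it covers.
REG_UNITS = {
    "AL": {"al"},
    "AH": {"ah"},
    "AX": {"al", "ah"},
    "BX": {"bl", "bh"},
}


def overlaps(reg_a, reg_b):
    """Two registers overlap if they share at least one register unit."""
    return not REG_UNITS[reg_a].isdisjoint(REG_UNITS[reg_b])


checks = {
    ("AX", "AL"): overlaps("AX", "AL"),   # subregister of AX
    ("AL", "AH"): overlaps("AL", "AH"),   # disjoint halves
    ("AX", "BX"): overlaps("AX", "BX"),   # unrelated registers
}
```

A plain equality test would report "AX" and "AL" as unrelated, which is the kind of subregister case the commit message describes.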
Neal (nealsid) [Wed, 12 May 2021 08:46:35 +0000 (09:46 +0100)]
Remove Windows editline from LLDB
I don't mean to undo others' work but it looks like the hand-rolled EditLine for LLDB on Windows isn't used. It'd be easier to make changes to bring the other platforms' Editline wrapper up to date (e.g. simplifying char vs wchar_t) without modifying/testing this one too.
Reviewed By: amccarth
Differential Revision: https://reviews.llvm.org/D102208
Piotr Sobczak [Wed, 12 May 2021 07:23:59 +0000 (09:23 +0200)]
[AMDGPU] Skip invariant loads when avoiding WAR conflicts
No need to handle invariant loads when avoiding WAR conflicts, as
there cannot be a vector store to the same memory location.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D101177
Qiu Chaofan [Wed, 12 May 2021 08:51:52 +0000 (16:51 +0800)]
Revert "[PowerPC] [Clang] Enable float128 feature on VSX targets"
This commit broke the build in some f128-related tests, but that is
not the root cause. There are some differences between Clang's and
GCC's definitions of 128-bit float types on PPC, so macros/functions in
glibc may not work well with clang -mfloat128. We need to handle this
carefully and reland it.
Tomas Matheson [Tue, 11 May 2021 16:15:07 +0000 (17:15 +0100)]
[ARM] Prevent spilling between ldrex/strex pairs
Based on the same for AArch64:
4751cadcca45984d7671e594ce95aed8fe030bf1
At -O0, the fast register allocator may insert spills between the ldrex and
strex instructions inserted by AtomicExpandPass when expanding atomicrmw
instructions in LL/SC loops. To avoid this, expand to cmpxchg loops and
therefore expand the cmpxchg pseudos after register allocation.
Required a tweak to ARMExpandPseudo::ExpandCMP_SWAP to use the 4-byte encoding
of UXT, since the pseudo instruction can be allocated a high register (R8-R15)
which the 2-byte encoding doesn't support. However, the 4-byte encodings
are not present for ARM v8-M Baseline. To enable this, two new pseudos are
added for Thumb which are only valid for v8mbase, tCMP_SWAP_8 and
tCMP_SWAP_16.
The previously committed attempt in D101164 had to be reverted due to runtime
failures in the test suites. Rather than spending time fixing that
implementation (adding another implementation of atomic operations and more
divergence between backends) I have chosen to follow the approach taken in
D101163.
Differential Revision: https://reviews.llvm.org/D101898
Depends on D101912
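The shape of a cmpxchg loop that implements an atomic read-modify-write can be sketched in Python. This is a single-threaded model for illustration only; `Cell`, `compare_exchange`, and `atomic_rmw` are hypothetical names, and no real atomicity is provided.

```python
class Cell:
    """Stand-in for a memory location with a compare-and-swap primitive."""

    def __init__(self, value):
        self.value = value

    def compare_exchange(self, expected, desired):
        # Models one indivisible step; returns (success, value observed).
        observed = self.value
        if observed == expected:
            self.value = desired
            return True, observed
        return False, observed


def atomic_rmw(cell, op):
    """Apply `op` by retrying compare_exchange; returns the old value."""
    expected = cell.value
    while True:
        ok, observed = cell.compare_exchange(expected, op(expected))
        if ok:
            return observed
        expected = observed   # someone else changed it: retry with new value


cell = Cell(5)
old = atomic_rmw(cell, lambda v: v + 3)
```

Because the whole compare-and-swap is treated as one unit, nothing can be inserted between its load and store halves, which is the property the commit relies on by expanding the pseudo after register allocation.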
Tomas Matheson [Wed, 5 May 2021 14:51:21 +0000 (15:51 +0100)]
[ARM] Precommit test for D101898
Differential Revision: https://reviews.llvm.org/D101912
Alex Orlov [Wed, 12 May 2021 08:39:30 +0000 (12:39 +0400)]
Fixed llvm-objcopy to add correct symbol table for ELF with program headers.
This fixes the following bug:
https://bugs.llvm.org/show_bug.cgi?id=43935
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102258
Djordje Todorovic [Tue, 11 May 2021 08:23:31 +0000 (01:23 -0700)]
[NFC][llvm-dwarfdump] Avoid passing std::string by value in collectStatsForDie()
Guillaume Chatelet [Wed, 12 May 2021 07:24:53 +0000 (07:24 +0000)]
[libc] Simplifies multi implementations
This is a roll forward of D101895 with two additional fixes.
Original Patch description:
> This is a follow up on D101524 which:
>
> - simplifies cpu features detection and usage,
> - flattens target dependent optimizations so it's obvious which implementations are generated,
> - provides an implementation targeting the host (march/mtune=native) for the mem* functions,
> - makes sure all implementations are unittested (provided the host can run them).
Additional fixes:
- Fix uninitialized ALL_CPU_FEATURES
- Use non pseudo microarch as it is only supported from Clang 12 on
Differential Revision: https://reviews.llvm.org/D102233
Dmitry Vyukov [Wed, 12 May 2021 07:07:00 +0000 (09:07 +0200)]
scudo: fix CheckFailed-related build breakage
I was running:
$ ninja check-sanitizer check-msan check-asan \
check-tsan check-lsan check-ubsan check-cfi \
check-profile check-memprof check-xray check-hwasan
but missed check-scudo...
Differential Revision: https://reviews.llvm.org/D102314
Ulysse Beaugnon [Wed, 12 May 2021 07:07:44 +0000 (09:07 +0200)]
[MLIR] Enable conversion from llvm::SMLoc to mlir::Location with OpAsmParser.
DialectAsmParser already allows converting an llvm::SMLoc location to a
mlir::Location location. This commit adds the same functionality to OpAsmParser.
Implementation is copied from DialectAsmParser.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D102165