Nathan James [Wed, 12 May 2021 12:18:40 +0000 (13:18 +0100)]
[clang-tidy][NFC] Simplify a lot of bugprone-sizeof-expression matchers
There should be a follow up to this for changing the traversal mode, but some of the tests don't like that.
Reviewed By: steveire
Differential Revision: https://reviews.llvm.org/D101614
Tobias Gysi [Wed, 12 May 2021 12:00:08 +0000 (12:00 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgBufferize...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102308
David Spickett [Wed, 12 May 2021 12:12:28 +0000 (13:12 +0100)]
Revert "[scudo] Enable arm32 arch"
This reverts commit b1a77e465e37fc400c16f9fda2a637f11c698bb9,
which has a failing test on our armv7 bots:
https://lab.llvm.org/buildbot/#/builders/59/builds/1812
Hana Joo [Wed, 12 May 2021 11:57:17 +0000 (12:57 +0100)]
[clang-tidy] Enable the use of IgnoreArray flag in pro-type-member-init rule
The `IgnoreArray` flag was not used before while running the rule. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=47288 | b/47288 ]]
Reviewed By: njames93
Differential Revision: https://reviews.llvm.org/D101239
Tobias Gysi [Wed, 12 May 2021 11:34:13 +0000 (11:34 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgToStandard...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102236
Kristina Bessonova [Sun, 9 May 2021 17:29:56 +0000 (19:29 +0200)]
[libcxx] NFC. Correct wordings of _LIBCPP_ASSERT debug messages
Differential Revision: https://reviews.llvm.org/D102195
Simon Pilgrim [Wed, 12 May 2021 11:02:06 +0000 (12:02 +0100)]
[X86][AVX] canonicalizeShuffleMaskWithHorizOp - improve support for 256/512-bit vectors
Extend the HOP(HOP(X,Y),HOP(Z,W)) and SHUFFLE(HOP(X,Y),HOP(Z,W)) folds to handle repeating 256/512-bit vector cases.
This allows us to drop the UNPACK(HOP(),HOP()) custom fold in combineTargetShuffle.
This required isRepeatedTargetShuffleMask to be tweaked to support target shuffle masks taking more than 2 inputs.
gbreynoo [Wed, 12 May 2021 11:09:08 +0000 (12:09 +0100)]
[llvm-readelf] Unhide short options to match the command guide
The readelf command guide shows the short options used as aliases, but
these are not found in the help text unless --show-hidden is used. Other
tools show aliases with --help. This change fixes the help output to be
consistent with the command guide.
Differential Revision: https://reviews.llvm.org/D102173
gbreynoo [Wed, 12 May 2021 11:04:54 +0000 (12:04 +0100)]
[llvm-symbolizer] Place Mach-O options into the Mach-O option group.
In the help output of other tools and in the symbolizer command guide,
Mach-O specific options are in their own section. This change fixes the
symbolizer help output to be consistent.
Differential Revision: https://reviews.llvm.org/D102178
David Sherwood [Wed, 21 Apr 2021 15:36:11 +0000 (16:36 +0100)]
[LoopVectorize] Fix scalarisation crash in widenPHIInstruction for scalable vectors
In InnerLoopVectorizer::widenPHIInstruction there are cases where we have
to scalarise a pointer induction variable after vectorisation. For scalable
vectors we already deal with the case where the pointer induction variable
is uniform, but we currently crash if not uniform. For fixed width vectors
we calculate every lane of the scalarised pointer induction variable for a
given VF, however this cannot work for scalable vectors. In this case I
have added support for caching the whole vector value for each unrolled
part so that we can always extract an arbitrary element. Additionally, we
still continue to cache the known minimum number of lanes too in order
to improve code quality by avoiding an extractelement operation.
I have adapted an existing test `pointer_iv_mixed` from the file:
Transforms/LoopVectorize/consecutive-ptr-uniforms.ll
and added it here for scalable vectors instead:
Transforms/LoopVectorize/AArch64/sve-widen-phi.ll
Differential Revision: https://reviews.llvm.org/D101294
Peter Waller [Thu, 29 Apr 2021 15:40:34 +0000 (15:40 +0000)]
[AArch64][SVE] Improve sve.convert.to.svbool lowering
The sve.convert.to.svbool lowering has the effect of widening a logical
<M x i1> vector representing lanes into a physical <16 x i1> vector
representing bits in a predicate register.
In general, if converting to svbool, the contents of lanes in the
physical register might not be known. For sve.convert.to.svbool the new
lanes are specified to be zeroed, requiring 'and' instructions to mask
off the new lanes. For lanes coming from a ptrue or a comparison,
however, they are known to be zero.
CodeGen Before:
ptrue p0.s, vl16
ptrue p1.s
ptrue p2.b
and p0.b, p2/z, p0.b, p1.b
ret
After:
ptrue p0.s, vl16
ret
Differential Revision: https://reviews.llvm.org/D101544
Michał Górny [Wed, 5 May 2021 11:06:55 +0000 (13:06 +0200)]
[Process/elf-core] Read PID from FreeBSD prpsinfo
Add a function to read NT_PRPSINFO note from FreeBSD core dumps. This
is necessary to get the process ID (NT_PRSTATUS has only thread ID).
Move the lp64 check from NT_PRSTATUS parsing to the parseFreeBSDNotes()
to avoid repeating it.
Differential Revision: https://reviews.llvm.org/D101893
Michał Górny [Thu, 22 Apr 2021 17:21:50 +0000 (19:21 +0200)]
[lldb] [Process/elf-core] Fix reading FPRs from FreeBSD/i386 cores
The FreeBSD coredumps from i386 systems contain only FSAVE-style
NT_FPREGSET. Since we do not really support reading that kind of data
anymore, just use NT_X86_XSTATE to get FXSAVE-style data when available.
Differential Revision: https://reviews.llvm.org/D101086
Stephen Tozer [Mon, 10 May 2021 13:00:01 +0000 (14:00 +0100)]
Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"
Previous crashes caused by this patch were the result of machine
subregisters being incorrectly handled in updateDbgUsersToReg; this has
been fixed by using RegUnits to determine overlapping registers, instead
of using the register values directly.
Differential Revision: https://reviews.llvm.org/D101523
This reverts commit 7ca26c5fa2df253878cab22e1e2f0d6f1b481218.
Neal (nealsid) [Wed, 12 May 2021 08:46:35 +0000 (09:46 +0100)]
Remove Windows editline from LLDB
I don't mean to undo others' work but it looks like the hand-rolled EditLine for LLDB on Windows isn't used. It'd be easier to make changes to bring the other platforms' Editline wrapper up to date (e.g. simplifying char vs wchar_t) without modifying/testing this one too.
Reviewed By: amccarth
Differential Revision: https://reviews.llvm.org/D102208
Piotr Sobczak [Wed, 12 May 2021 07:23:59 +0000 (09:23 +0200)]
[AMDGPU] Skip invariant loads when avoiding WAR conflicts
No need to handle invariant loads when avoiding WAR conflicts, as
there cannot be a vector store to the same memory location.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D101177
Qiu Chaofan [Wed, 12 May 2021 08:51:52 +0000 (16:51 +0800)]
Revert "[PowerPC] [Clang] Enable float128 feature on VSX targets"
This commit broke the build in some f128-related tests, but that is
not the root cause. There exist some differences between Clang's and
GCC's definitions of 128-bit float types on PPC, so macros/functions in
glibc may not work well with clang -mfloat128. We need to handle this
carefully and reland it.
Tomas Matheson [Tue, 11 May 2021 16:15:07 +0000 (17:15 +0100)]
[ARM] Prevent spilling between ldrex/strex pairs
Based on the same for AArch64:
4751cadcca45984d7671e594ce95aed8fe030bf1
At -O0, the fast register allocator may insert spills between the ldrex and
strex instructions inserted by AtomicExpandPass when expanding atomicrmw
instructions in LL/SC loops. To avoid this, expand to cmpxchg loops and
therefore expand the cmpxchg pseudos after register allocation.
Required a tweak to ARMExpandPseudo::ExpandCMP_SWAP to use the 4-byte encoding
of UXT, since the pseudo instruction can be allocated a high register (R8-R15)
which the 2-byte encoding doesn't support. However, the 4-byte encodings
are not present for ARM v8-M Baseline. To enable this, two new pseudos are
added for Thumb which are only valid for v8mbase, tCMP_SWAP_8 and
tCMP_SWAP_16.
The previously committed attempt in D101164 had to be reverted due to runtime
failures in the test suites. Rather than spending time fixing that
implementation (adding another implementation of atomic operations and more
divergence between backends) I have chosen to follow the approach taken in
D101163.
Differential Revision: https://reviews.llvm.org/D101898
Depends on D101912
Tomas Matheson [Wed, 5 May 2021 14:51:21 +0000 (15:51 +0100)]
[ARM] Precommit test for D101898
Differential Revision: https://reviews.llvm.org/D101912
Alex Orlov [Wed, 12 May 2021 08:39:30 +0000 (12:39 +0400)]
Fixed llvm-objcopy to add correct symbol table for ELF with program headers.
This fixes the following bug:
https://bugs.llvm.org/show_bug.cgi?id=43935
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102258
Djordje Todorovic [Tue, 11 May 2021 08:23:31 +0000 (01:23 -0700)]
[NFC][llvm-dwarfdump] Avoid passing std::string by value in collectStatsForDie()
Guillaume Chatelet [Wed, 12 May 2021 07:24:53 +0000 (07:24 +0000)]
[libc] Simplifies multi implementations
This is a roll forward of D101895 with two additional fixes:
Original Patch description:
> This is a follow up on D101524 which:
>
> - simplifies cpu features detection and usage,
> - flattens target dependent optimizations so it's obvious which implementations are generated,
> - provides an implementation targeting the host (march/mtune=native) for the mem* functions,
> - makes sure all implementations are unittested (provided the host can run them).
Additional fixes:
- Fix uninitialized ALL_CPU_FEATURES
- Use non pseudo microarch as it is only supported from Clang 12 on
Differential Revision: https://reviews.llvm.org/D102233
Dmitry Vyukov [Wed, 12 May 2021 07:07:00 +0000 (09:07 +0200)]
scudo: fix CheckFailed-related build breakage
I was running:
$ ninja check-sanitizer check-msan check-asan \
check-tsan check-lsan check-ubsan check-cfi \
check-profile check-memprof check-xray check-hwasan
but missed check-scudo...
Differential Revision: https://reviews.llvm.org/D102314
Ulysse Beaugnon [Wed, 12 May 2021 07:07:44 +0000 (09:07 +0200)]
[MLIR] Enable conversion from llvm::SMLoc to mlir::Location with OpAsmParser.
DialectAsmParser already allows converting an llvm::SMLoc location to a
mlir::Location location. This commit adds the same functionality to OpAsmParser.
Implementation is copied from DialectAsmParser.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D102165
Dumitru Potop [Wed, 12 May 2021 06:45:25 +0000 (08:45 +0200)]
[mlir] Support alignment in LLVM dialect GlobalOp
First step in adding alignment as an attribute to MLIR global definitions. Alignment can be specified for global objects in LLVM IR. It can also be specified as a named attribute in the LLVMIR dialect of MLIR. However, this attribute has no standing and is discarded during translation from MLIR to LLVM IR. This patch does two things: First, it adds the attribute to the syntax of the llvm.mlir.global operation, and by doing this it also adds accessors and verifications. The syntax is "align=XX" (with XX being an integer), placed right after the value of the operation. Second, it allows transforming this operation to and from LLVM IR. It is checked whether the value is an integer power of 2.
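The power-of-2 requirement on the alignment value can be sketched as below. This is only an illustration: `isPowerOfTwo` and `alignTo` are hypothetical names chosen here, not the functions used by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the MLIR verifier): true if `value` is a
// non-zero integer power of 2, i.e. exactly one bit is set.
inline bool isPowerOfTwo(uint64_t value) {
  return value != 0 && (value & (value - 1)) == 0;
}

// Hypothetical helper: rounds `offset` up to the next multiple of
// `alignment`. The mask trick is only valid for power-of-2 alignments,
// which is one reason such a check is useful.
inline uint64_t alignTo(uint64_t offset, uint64_t alignment) {
  assert(isPowerOfTwo(alignment) && "alignment must be a power of 2");
  return (offset + alignment - 1) & ~(alignment - 1);
}
```

With these definitions, `alignTo(13, 8)` yields 16, while an alignment such as 12 is rejected by the assertion.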
Reviewed By: ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D101492
Dmitry Vyukov [Wed, 12 May 2021 06:54:34 +0000 (08:54 +0200)]
tsan: fix syscall test on aarch64
Add missing includes and use SYS_pipe2 instead of SYS_pipe
as it's not present on some arches.
Differential Revision: https://reviews.llvm.org/D102311
Martin Storsjö [Tue, 11 May 2021 07:04:02 +0000 (10:04 +0300)]
[COFF] Fix ARM and ARM64 REL32 relocations to be relative to the end of the relocation
This matches how they are defined on X86.
This should fix the relative lookup tables pass for COFF, allowing
it to be reenabled.
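As a rough sketch of what "relative to the end of the relocation" means for a 4-byte REL32 field: the stored value is the distance from the byte following the field to the target. `rel32Value` and its parameter names are hypothetical and are not taken from the lld sources.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: value stored in a 4-byte REL32 field located at
// `fieldVA` that refers to `symbolVA`. The reference point is the end of
// the field (fieldVA + 4), not its start.
inline int32_t rel32Value(uint64_t symbolVA, uint64_t fieldVA) {
  const int64_t delta = static_cast<int64_t>(symbolVA - (fieldVA + 4));
  assert(delta >= std::numeric_limits<int32_t>::min() &&
         delta <= std::numeric_limits<int32_t>::max() &&
         "target out of range for a 32-bit relative relocation");
  return static_cast<int32_t>(delta);
}
```

A target placed immediately after the field therefore encodes as 0, and a target at the field's own address encodes as -4.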
Differential Revision: https://reviews.llvm.org/D102217
Dmitry Vyukov [Tue, 11 May 2021 08:18:48 +0000 (10:18 +0200)]
sanitizer_common: deduplicate CheckFailed
We have some significant amount of duplication around
CheckFailed functionality. Each sanitizer copy-pasted
a chunk of code. Some got random improvements like
dealing with recursive failures better. These improvements
could benefit all sanitizers, but they don't.
Deduplicate CheckFailed logic across sanitizers and let each
sanitizer only print the current stack trace.
I've tried to dedup stack printing as well,
but this got me into cmake hell. So let's keep this part
duplicated in each sanitizer for now.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102221
Qiu Chaofan [Wed, 12 May 2021 06:32:37 +0000 (14:32 +0800)]
[PowerPC] [Clang] Enable float128 feature on VSX targets
Reviewed By: nemanjai, steven.zhang
Differential Revision: https://reviews.llvm.org/D92815
Kristina Bessonova [Mon, 10 May 2021 07:06:09 +0000 (09:06 +0200)]
[libcxx][test] Split more debug mode tests
Split a few more debug mode tests missed in D100592.
Differential Revision: https://reviews.llvm.org/D102194
Dmitry Vyukov [Mon, 10 May 2021 11:27:06 +0000 (13:27 +0200)]
sanitizer_common: don't write into .rodata
setlocale interceptor imitates a write into result,
which may be located in .rodata section.
This is the only interceptor that tries to do this and
I think the intention was to initialize the range for msan.
So do that instead. Writing into .rodata shouldn't happen
(without crashing later on the actual write) and this
traps on my local tsan experiments.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102161
Vitaly Buka [Wed, 12 May 2021 05:51:36 +0000 (22:51 -0700)]
[symbolizer] Fix leak after D96883
Dmitry Vyukov [Mon, 10 May 2021 11:43:20 +0000 (13:43 +0200)]
sanitizer_common: fix SIG_DFL warning
Currently we have:
sanitizer_posix_libcdep.cpp:146:27: warning: cast between incompatible
function types from ‘__sighandler_t’ {aka ‘void (*)(int)’} to ‘sa_sigaction_t’
146 | sigact.sa_sigaction = (sa_sigaction_t)SIG_DFL;
We don't set SA_SIGINFO, so we need to assign to sa_handler.
And SIG_DFL is meant for sa_handler, so this gets rid of the
compiler warning, the type cast and potential runtime misbehavior.
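A minimal sketch of the pattern, assuming a POSIX system. `resetToDefaultHandler` is a hypothetical name and this is not the sanitizer_common code itself.

```cpp
#include <cassert>
#include <signal.h>
#include <string.h>

// Hypothetical helper: restore the default disposition for `signum`.
// SA_SIGINFO is not set in sa_flags, so the sa_handler member is the one
// that is consulted, and SIG_DFL already has the sa_handler type, so no
// cast is needed.
inline int resetToDefaultHandler(int signum) {
  assert(signum > 0 && "signal numbers are positive");
  struct sigaction sigact;
  memset(&sigact, 0, sizeof(sigact));
  sigact.sa_handler = SIG_DFL;
  sigemptyset(&sigact.sa_mask);
  return sigaction(signum, &sigact, nullptr);
}
```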
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102162
Dmitry Vyukov [Mon, 10 May 2021 07:04:20 +0000 (09:04 +0200)]
tsan: declare annotations in test.h
We already declare a subset of annotations in test.h.
But some are duplicated and declared in tests.
Move all annotation declarations to test.h.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102152
Qiu Chaofan [Wed, 12 May 2021 05:18:20 +0000 (13:18 +0800)]
[VectorCombine] Restrict single-element-store index to inbounds constant
The vector single-element update optimization landed in 2db4979, but
its scope needs restricting. This patch restricts the index to an
inbounds constant and requires the vector to be fixed-size. In the
future, we may use value tracking to relax the constant restriction.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D102146
Dmitry Vyukov [Fri, 7 May 2021 09:16:03 +0000 (11:16 +0200)]
tsan: mark sigwait as blocking
Add a test case reported in:
https://github.com/google/sanitizers/issues/1401
and fix it.
The code assumes sigwait will process other signals.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102057
Dmitry Vyukov [Tue, 11 May 2021 08:37:48 +0000 (10:37 +0200)]
tsan: add a simple syscall test
Add a simple test that uses syscall annotations.
Just to ensure at least basic functionality works.
Also factor out annotated syscall wrappers into a separate
header file as they may be useful for future tests.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102223
Chia-hung Duan [Wed, 12 May 2021 03:21:25 +0000 (11:21 +0800)]
[mlir][AsmPrinter] Remove recursion while SSA naming
Address the TODO of removing recursion while SSA naming.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102226
Vitaly Buka [Wed, 12 May 2021 02:03:50 +0000 (19:03 -0700)]
[NFC][msan] Move setlocale test into sanitizer_common
Congzhe Cao [Wed, 12 May 2021 01:25:16 +0000 (21:25 -0400)]
[LoopInterchange] Handle lcssa PHIs with multiple predecessors
This is a bugfix in the transformation phase.
If the original outer loop header branches to both the inner loop
(header) and the outer loop latch, and if there is an lcssa PHI
node outside the loop nest, then after interchange the new outer latch
will have an lcssa PHI node inserted which has two predecessors, i.e.,
the original outer header and the original outer latch. Currently
the transformation assumes it has only one predecessor (the original
outer latch) and crashes, since the inserted lcssa PHI node does
not take both predecessors as incoming BBs.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D100792
Jim Ingham [Wed, 12 May 2021 01:26:22 +0000 (18:26 -0700)]
Removing test...
Actually, I don't think this test is going to be stable enough
to be worthwhile. Let me see if I can think of a better way to
test this.
Matt Arsenault [Mon, 10 May 2021 13:22:45 +0000 (09:22 -0400)]
AMDGPU: Fix SILoadStoreOptimizer for gfx90a
This was hardcoding the register class to use for the newly created
pointer registers, violating the aligned VGPR requirement.
Jim Ingham [Wed, 12 May 2021 01:09:51 +0000 (18:09 -0700)]
This test is failing on Linux, skip while I investigate.
The gdb-remote tests are a bit artificial, depending on
Python threading, and sleeps. So I'm not 100% surprised it doesn't
work straight up on another system.
Sam Clegg [Tue, 11 May 2021 15:58:13 +0000 (08:58 -0700)]
[lld][WebAssembly] Fix for string merging + negative addends
Don't include the relocation addend when calculating the
virtual address of a symbol. Instead just pass the symbol's
offset and add the addend afterwards.
Without this fix we hit the `offset is outside the section`
error in MergeInputSegment::getSegmentPiece.
This fixes a real-world error we are seeing in emscripten.
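The ordering described above can be sketched as follows. The `Piece` layout and the function names are hypothetical simplifications made for illustration; they are not the lld data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplification of a merged segment: each piece maps a run
// of input bytes starting at `inputOff` to `outputOff` in the output.
// Pieces are sorted by `inputOff` and the first one starts at 0.
struct Piece {
  uint64_t inputOff;
  uint64_t outputOff;
};

// Translates an input offset to an output offset. The offset must lie
// inside the input section; a wrapped (negative) value would not.
inline uint64_t translateOffset(const std::vector<Piece> &pieces,
                                uint64_t inputSize, uint64_t offset) {
  assert(!pieces.empty() && offset < inputSize &&
         "offset is outside the section");
  const Piece *piece = &pieces.front();
  for (const Piece &candidate : pieces)
    if (candidate.inputOff <= offset)
      piece = &candidate;
  return piece->outputOff + (offset - piece->inputOff);
}

// Translate the symbol's own offset first, then apply the addend, so a
// negative addend never takes part in the piece lookup.
inline uint64_t symbolVA(const std::vector<Piece> &pieces, uint64_t inputSize,
                         uint64_t symbolOffset, int64_t addend) {
  return translateOffset(pieces, inputSize, symbolOffset) +
         static_cast<uint64_t>(addend);
}
```

Passing `symbolOffset + addend` to the lookup instead would, for a symbol at offset 0 and a negative addend, wrap around and trip the range check.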
Differential Revision: https://reviews.llvm.org/D102271
Richard Smith [Wed, 12 May 2021 00:46:18 +0000 (17:46 -0700)]
Revert "Fix bad mangling of <data-member-prefix> for a closure in the initializer of a variable at global namespace scope."
This reverts commit 697ac15a0fc71888c372667bdbc5583ab42d4695, for which
review was not complete. That change was accidentally pushed when
an unrelated change was pushed.
Richard Smith [Wed, 12 May 2021 00:34:14 +0000 (17:34 -0700)]
Add test for PR50039.
I believe Clang's behavior is correct according to the standard here,
but this is an unusual situation for which we had no test coverage, so
I'm adding some.
Richard Smith [Thu, 6 May 2021 01:56:58 +0000 (18:56 -0700)]
Fix bad mangling of <data-member-prefix> for a closure in the initializer of a variable at global namespace scope.
This implements the direction proposed in
https://github.com/itanium-cxx-abi/cxx-abi/pull/126.
Differential Revision: https://reviews.llvm.org/D101968
Matt Arsenault [Thu, 6 May 2021 00:25:31 +0000 (20:25 -0400)]
GlobalISel: Don't hardcode varargs=false in resultsCompatible
Matt Arsenault [Tue, 11 May 2021 21:12:33 +0000 (17:12 -0400)]
AMDGPU: Fix assert on constant load from addrspacecasted pointer
This was trying to create a bitcast between different address spaces.
Matt Arsenault [Wed, 12 May 2021 00:10:55 +0000 (20:10 -0400)]
GlobalISel: Make constant fields const
Matt Arsenault [Tue, 4 May 2021 22:12:38 +0000 (18:12 -0400)]
GlobalISel: Split ValueHandler into assignment and emission classes
Currently the ValueHandler handles both selecting the type and
location for arguments, as well as inserting instructions needed to
handle them. Split this so that the determination of the argument
handling is independent of the function state. Currently the checks
for tail call compatibility do not follow the full assignment logic,
so it misses cases where arguments require nontrivial legalization.
This should help avoid targets ending up in a buggy state where the
argument evaluation may change in different contexts.
Matt Arsenault [Tue, 4 May 2021 21:32:09 +0000 (17:32 -0400)]
GlobalISel: Move AArch64 AssignFnVarArg to base class
We can handle the distinction easily enough in the generic code, and
this makes it easier to abstract the selection of type/location from
the code to insert code.
Jordan Rupprecht [Tue, 11 May 2021 23:08:53 +0000 (16:08 -0700)]
Revert "[GVN] Clobber partially aliased loads."
This reverts commit 6c570442318e2d3b8b13e95c2f2f588d71491acb.
It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543:
```
$ cat repro.ll
; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%struct.widget = type { i32 }
%struct.baz = type { i32, %struct.snork }
%struct.snork = type { %struct.spam }
%struct.spam = type { i32, i32 }
@global = external local_unnamed_addr global %struct.widget, align 4
@global.1 = external local_unnamed_addr global i8, align 1
@global.2 = external local_unnamed_addr global i32, align 4
define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 {
bb:
%tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1
%tmp1 = bitcast %struct.snork* %tmp to i64*
%tmp2 = load i64, i64* %tmp1, align 4
%tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1
%tmp4 = icmp ugt i64 %tmp2, 4294967295
br label %bb5
bb5: ; preds = %bb14, %bb
%tmp6 = load i32, i32* %tmp3, align 4
%tmp7 = icmp ne i32 %tmp6, 0
%tmp8 = select i1 %tmp7, i1 %tmp4, i1 false
%tmp9 = zext i1 %tmp8 to i8
store i8 %tmp9, i8* @global.1, align 1
%tmp10 = load i32, i32* @global.2, align 4
switch i32 %tmp10, label %bb11 [
i32 1, label %bb12
i32 2, label %bb12
]
bb11: ; preds = %bb5
br label %bb14
bb12: ; preds = %bb5, %bb5
%tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4
br label %bb14
bb14: ; preds = %bb12, %bb11
br label %bb5
}
$ opt -O2 repro.ll -disable-output
opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value *llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst *, unsigned int, llvm::Type *, llvm::Instruction *, const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output
...
```
Lang Hames [Tue, 11 May 2021 23:04:00 +0000 (16:04 -0700)]
[JITLink] Fix bogus format string.
Leonard Chan [Thu, 6 May 2021 22:54:28 +0000 (15:54 -0700)]
[clang][Fuchsia] Introduce compat multilibs
These are GCC-compatible multilibs that use the generic Itanium C++ ABI
instead of the Fuchsia C++ ABI.
Differential Revision: https://reviews.llvm.org/D102030
Congzhe Cao [Tue, 11 May 2021 22:34:32 +0000 (18:34 -0400)]
[LoopInterchange] Fix legality for triangular loops
This is a bug fix in legality check.
When we encounter triangular loops such as the following form:
for (int i = 0; i < m; i++)
for (int j = 0; j < i; j++), or
for (int i = 0; i < m; i++)
for (int j = 0; j*i < n; j++),
we should not perform interchange since the number of executions
of the loop body will be different before and after interchange,
resulting in incorrect results.
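A small sketch of why the inner bound's dependence on `i` matters for the first form. Keeping the same iteration space with `j` outermost requires rewriting both bounds, which a plain swap of loop order does not do. The function names are hypothetical and this is not what LoopInterchange emits.

```cpp
#include <cassert>

// Body executions of the triangular nest from the commit message:
//   for (i = 0; i < m; i++) for (j = 0; j < i; j++)
inline long countOriginal(int m) {
  assert(m >= 0);
  long count = 0;
  for (int i = 0; i < m; i++)
    for (int j = 0; j < i; j++)
      ++count;
  return count;
}

// The same set of (i, j) pairs visited with j outermost. Both bounds had
// to change: j now runs to m - 1 and i starts at j + 1.
inline long countWithRewrittenBounds(int m) {
  assert(m >= 0);
  long count = 0;
  for (int j = 0; j < m - 1; j++)
    for (int i = j + 1; i < m; i++)
      ++count;
  return count;
}
```

Both functions agree for any non-negative `m`, but only because the bounds were rewritten by hand.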
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D101305
Petr Hosek [Fri, 9 Apr 2021 18:53:59 +0000 (11:53 -0700)]
[Coverage] Support overriding compilation directory
When making compilation relocatable, for example in distributed
compilation scenarios, we want to set compilation dir to a relative
value like `.` but this presents a problem when generating reports
because if the file path is relative as well, for example `..`, you
may end up writing files outside of the output directory.
This change introduces a flag that allows overriding the compilation
directory that's stored inside the profile with a different value that
is absolute.
Differential Revision: https://reviews.llvm.org/D100232
Lang Hames [Tue, 11 May 2021 21:47:40 +0000 (14:47 -0700)]
[JITLink][MachO/x86_64] Expose API for creating eh-frame fixing passes.
These can be used to create eh-frame section fixing passes outside the usual
linker pipeline, which can be useful for tests and tools that just want to
verify or dump graphs.
Lang Hames [Tue, 11 May 2021 21:45:14 +0000 (14:45 -0700)]
[JITLink][x86-64] Add an x86_64 PointerSize constexpr.
This can be used in place of magic '8' values in generic x86-64 utilities.
Lang Hames [Tue, 11 May 2021 21:09:49 +0000 (14:09 -0700)]
[JITLink] Make LinkGraph debug dumps more readable.
This commit reorders some fields and fixes the width of others to try to
maintain more consistent columns. It also switches to long-hand scope
and linkage names, since LinkGraph dumps aren't read often enough for
single-character codes to be memorable.
Victor Huang [Tue, 11 May 2021 21:35:13 +0000 (16:35 -0500)]
[AIX][TLS] Diagnose use of unimplemented TLS models
Add front end diagnostics to report error for unimplemented TLS models set by
- compiler option `-ftls-model`
- attributes like `__thread int __attribute__((tls_model("local-exec"))) var_name;`
Reviewed by: aaron.ballman, nemanjai, PowerPC
Differential Revision: https://reviews.llvm.org/D102070
Congzhe Cao [Tue, 11 May 2021 22:06:41 +0000 (18:06 -0400)]
Revert "[LoopInterchange] Fix legality for triangular loops"
This reverts commit 29342291d25b83da97e74d75004b177ba41114fc.
The test case requires an assert build. Will add REQUIRES and re-commit.
Petr Hosek [Thu, 15 Apr 2021 08:22:04 +0000 (01:22 -0700)]
[llvm-cov] Support for v4 format in convert-for-testing
v4 moves function records to a dedicated section so we need to write
and read it separately.
https://reviews.llvm.org/D100535
Evandro Menezes [Tue, 11 May 2021 17:17:26 +0000 (12:17 -0500)]
[RISCV] Move instruction information into the RISCVII namespace (NFC)
Move instruction attributes into the `RISCVII` namespace and add associated helper functions.
Differential Revision: https://reviews.llvm.org/D102268
Nikita Popov [Tue, 11 May 2021 20:51:16 +0000 (22:51 +0200)]
[InstCombine] Clean up one-hot merge optimization (NFC)
Remove the requirement that the instruction is a BinaryOperator,
make the predicate check more compact and use slightly more
meaningful naming for the and operands.
Rob Suderman [Tue, 11 May 2021 20:40:03 +0000 (13:40 -0700)]
[mlir][tosa] Tosa elementwise broadcasting had some minor bugs
Updated tests to include broadcast of left and right. Includes
bypass if in-type and out-type match shape (no broadcasting).
Differential Revision: https://reviews.llvm.org/D102276
River Riddle [Tue, 11 May 2021 19:40:27 +0000 (12:40 -0700)]
[mlir] Elide large elements attrs when printing Operations in diagnostics
Diagnostics are intended to be read by users, and in most cases displayed in a terminal. When not eliding huge element attributes, in some cases we end up dumping hundreds of megabytes (or gigabytes) to the terminal (or logs), completely obfuscating the main diagnostic being shown.
Differential Revision: https://reviews.llvm.org/D102272
Alex Orlov [Tue, 11 May 2021 20:46:00 +0000 (00:46 +0400)]
Removed unnecessary introduction of semi-colons.
Austin Kerbow [Tue, 11 May 2021 16:29:48 +0000 (09:29 -0700)]
[AMDGPU] Fix extra waitcnt being added with BUFFER_INVL2
The waitcnt pass would increment the number of vmem events for some buffer
invalidates that were not handled by the pass.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D102252
Evgenii Stepanov [Wed, 5 May 2021 18:56:52 +0000 (11:56 -0700)]
[hwasan] Stress test for thread creation.
This test has two modes - testing reused threads with multiple loops of
batch create/join, and testing new threads with a single loop of
create/join per fork.
The non-reuse variant catches the problem that was fixed in D101881 with
a high probability.
Differential Revision: https://reviews.llvm.org/D101936
Craig Topper [Tue, 11 May 2021 19:24:56 +0000 (12:24 -0700)]
[RISCV] Regenerate stepvector.ll. NFC
It looks like the RV32 and RV64 prefixes were removed from the
RUN lines while another patch was in review that added check
lines that used them.
Christopher Pulido [Tue, 11 May 2021 20:03:04 +0000 (23:03 +0300)]
[OpenMP] Changes to enable MSVC ARM64 build of libomp
This is the first in a series of changes to the OpenMP runtime
that have been done internally by Microsoft. This patch makes
the necessary changes to enable libomp.dll to build with
the MSVC compiler targeting ARM64.
Differential Revision: https://reviews.llvm.org/D101173
Albion Fung [Tue, 11 May 2021 19:56:24 +0000 (14:56 -0500)]
[PowerPC] Improve codegen for int-to-fp conversion of subword vector extract
When an integer is converted into floating point in subword vector extract,
it can be done in 2 instructions instead of the 3+ instructions it generates
right now. This patch removes the unnecessary generation.
Differential: https://reviews.llvm.org/D100604
Amara Emerson [Tue, 11 May 2021 00:57:47 +0000 (17:57 -0700)]
[AArch64][GlobalISel] Mark target generic instructions as HasNoSideEffects.
One test needed updating because the newly side-effect-free instructions were
now being DCE'd.
Vitaly Buka [Tue, 11 May 2021 07:15:17 +0000 (00:15 -0700)]
[NFC][LSAN] Limit the number of concurrent threads in the test
Test still fails with D88184 reverted.
The test was flaky on https://bugs.chromium.org/p/chromium/issues/detail?id=1206745 and
https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D102218
River Riddle [Tue, 11 May 2021 19:09:17 +0000 (12:09 -0700)]
[mlir] Move move capture in SparseElementsAttr::getValues
This was a TODO for the move to C++14. Now that the move has been completed, we can resolve it.
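For reference, a C++14 move capture (init capture) looks roughly like this. `makeLookup` is a hypothetical stand-in and is unrelated to the actual SparseElementsAttr code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical example: the returned callable owns `values`. The init
// capture `values = std::move(values)` moves the vector into the closure
// instead of copying it, which plain C++11 captures could not express.
inline std::function<int(std::size_t)> makeLookup(std::vector<int> values) {
  return [values = std::move(values)](std::size_t index) {
    assert(index < values.size() && "index out of range");
    return values[index];
  };
}
```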
Sam Clegg [Tue, 11 May 2021 18:08:14 +0000 (11:08 -0700)]
[lld][WebAssembly] Remove relocation target verification
We have this extra step in wasm-ld that doesn't exist in other lld
backend which verifies the existing contents of the relocation targets.
This was originally intended as an extra form of double checking and an
aid to compiler developers. However it has always been somewhat
controversial and there have been suggestions in the past that we simply
remove it.
My motivation for removing it now is that it's causing me a headache
when trying to fix an issue with negative addends. In the case of
negative addends the final result can be wrapped/negative but this
checking code would require significant modification to be able to deal
with that case. For example with some test cases I'm looking at I'm
seeing errors like this:
```
wasm-ld: warning: /usr/local/google/home/sbc/dev/wasm/llvm-build/tools/lld/test/wasm/Output/merge-string.s.tmp.o:(.rodata_relocs): unexpected existing value for R_WASM_MEMORY_ADDR_I32: existing=FFFFFFFA expected=FFFFFFFFFFFFFFFA
```
Rather than try to refactor `calcExpectedValue` to somehow return two
different types of results (32 and 64-bit) depending on the relocation
type, I think we can just remove this code.
Differential Revision: https://reviews.llvm.org/D102265
Jim Ingham [Thu, 6 May 2021 21:14:35 +0000 (14:14 -0700)]
Add an "interrupt timeout" to Process, and pipe that through the
ProcessGDBRemote plugin layers.
Also fix a bug where if we tried to interrupt, but the ReadPacket
wakeup timer woke us up just after the timeout, we would break out
the switch, but then since we immediately check if the response is
empty & fail if it is, we could end up actually only giving a
small interval to the interrupt.
Differential Revision: https://reviews.llvm.org/D102085
Vladimir Vereschaka [Tue, 11 May 2021 18:39:15 +0000 (11:39 -0700)]
[libc++] Run `substitutes-in-compile-flags.sh.cpp` test on Windows.
Fix substitutes-in-compile-flags.sh.cpp so that it runs properly on Windows.
Differential Revision: https://reviews.llvm.org/D102048
Mike Rice [Sat, 20 Mar 2021 00:39:04 +0000 (17:39 -0700)]
[OpenMP] Use compound operators for reduction combiner if available.
The OpenMP spec seems to require that the compound operators be used for
+, *, &, |, and ^ reductions. So use these if a class has those operators.
If not, try the simple operators as we did previously, to limit the impact
on existing code.
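The "compound operator first, simple operator as fallback" rule can be sketched in plain C++ as below. This is only an illustration of the selection logic, not Clang's Sema code; `has_plus_assign`, `combine`, `OnlyCompound`, and `OnlySimple` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename...> struct make_void { typedef void type; };

// Detects whether `T& += const T&` is a well-formed expression.
template <typename T, typename = void>
struct has_plus_assign : std::false_type {};
template <typename T>
struct has_plus_assign<
    T, typename make_void<decltype(std::declval<T&>() +=
                                   std::declval<const T&>())>::type>
    : std::true_type {};

// Preferred combiner: use the compound operator when the type has one.
template <typename T>
void combineImpl(T& out, const T& in, std::true_type) { out += in; }
// Fallback combiner: use the simple operator plus assignment.
template <typename T>
void combineImpl(T& out, const T& in, std::false_type) { out = out + in; }

template <typename T>
void combine(T& out, const T& in) {
  combineImpl(out, in, has_plus_assign<T>());
}

// A class that only provides operator+=.
struct OnlyCompound {
  int v;
  OnlyCompound& operator+=(const OnlyCompound& o) { v += o.v; return *this; }
};

// A class that only provides operator+.
struct OnlySimple { int v; };
inline OnlySimple operator+(const OnlySimple& a, const OnlySimple& b) {
  return OnlySimple{a.v + b.v};
}
```

Both classes can be combined through the same `combine` call; only the operator that ends up being used differs.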
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48584
Differential Revision: https://reviews.llvm.org/D101941
Fangrui Song [Tue, 11 May 2021 18:38:32 +0000 (11:38 -0700)]
[clang] Support -fpic -fno-semantic-interposition for RISCV
-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```
-fpic (var and fun are dso_preemptable)
```
test:
.LBB1_1:
auipc a0, %got_pcrel_hi(var)
ld a0, %pcrel_lo(.LBB1_1)(a0)
lw a0, 0(a0)
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
tail fun@plt
```
vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test:
.Ltest$local:
.LBB1_1:
auipc a0, %pcrel_hi(.Lvar$local)
addi a0, a0, %pcrel_lo(.LBB1_1)
lw a0, 0(a0)
// The assembler either resolves .Lfun$local at assembly time (-mno-relax
// -fno-function-sections), or produces a relocation referencing a non-preemptible
// local symbol (which can avoid PLT).
tail .Lfun$local
```
Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.
Depends on D101875
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D101876
Sam Clegg [Tue, 11 May 2021 18:15:45 +0000 (11:15 -0700)]
[lld][WebAssembly] Convert test to assembly. NFC.
Differential Revision: https://reviews.llvm.org/D102264
Roman Lebedev [Tue, 11 May 2021 18:19:41 +0000 (21:19 +0300)]
[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): canonicalize to integer type
This way we don't have to duplicate i32/f32 and i64/f64 entries,
which had already been forgotten for a few tuples.
Fangrui Song [Tue, 11 May 2021 18:34:37 +0000 (11:34 -0700)]
[GlobalOpt] Remove heap SROA
GlobalOpt implements a heap SROA (SROA for a malloc-allocated struct or array
of structs) which is largely undertested (heap-sra-[1234].ll are basically the
same test with very little difference) and does not trigger at all when
bootstrapping clang (it only supports the case of one single store).
The heap SROA implementation causes PR50027 (GEP is not properly handled; crash or miscompile).
Just drop the implementation. I have deleted some obviously duplicated tests
but kept `heap-sra-[12]{,-no-nullopt}.ll`.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102257
Amara Emerson [Sun, 24 Jan 2021 08:35:15 +0000 (00:35 -0800)]
[AArch64][GlobalISel] Support truncstorei8/i16 w/ combine to form truncating G_STOREs.
This needs some tablegen changes so that we can actually import the patterns properly.
Differential Revision: https://reviews.llvm.org/D102204
Fangrui Song [Tue, 11 May 2021 18:29:45 +0000 (11:29 -0700)]
[RISCV] Prefer to lower MC_GlobalAddress operands to .Lfoo$local
Similar to X86 D73230 and AArch64 D101872
With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode,
for default visibility external linkage non-ifunc-non-COMDAT definitions.
For such dso_local definitions, variable access/taking the address of a
function/calling a function will go through a local alias to avoid GOT/PLT.
Reviewed By: jrtc27, luismarques
Differential Revision: https://reviews.llvm.org/D101875
Eli Friedman [Tue, 20 Oct 2020 20:08:07 +0000 (13:08 -0700)]
[ArgumentPromotion] Fix byval alignment handling.
Make sure the alignment of the generated operations matches the
alignment of the byval argument. Previously, we were just ignoring
alignment and getting lucky.
While I'm here, also delete the unnecessary "tail" handling.
Passing a pointer to a byval argument to a "tail" call is UB, so
rewriting to an alloca doesn't require any special handling.
Differential Revision: https://reviews.llvm.org/D89819
Sean Silva [Mon, 10 May 2021 21:30:22 +0000 (14:30 -0700)]
[mlir][ODS]: Add per-op cppNamespace.
This is useful for dialects that have logical subparts.
Differential Revision: https://reviews.llvm.org/D102200
Martin Storsjö [Fri, 26 Feb 2021 12:37:26 +0000 (14:37 +0200)]
[libcxx] [test] Fix filesystem permission tests for windows
On Windows, the permission bits are mapped down to essentially only
two possible states: readonly or readwrite. Normalize the checked
permission bitmask to match what the implementation will return.
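A minimal sketch of such a normalization follows. It assumes that any write bit means "readwrite" (reported as 0777) and that everything else means "readonly" (reported as 0555). Both the helper name `normalizeExpectedPerms` and these exact masks are assumptions for illustration, not taken from the test suite.

```cpp
#include <cassert>

// Assumed masks for the two states a Windows file can report.
constexpr unsigned kAnyWrite = 0222;
constexpr unsigned kReadOnly = 0555;
constexpr unsigned kReadWrite = 0777;

// Hypothetical helper: collapses a POSIX-style permission mask to the
// value the implementation is assumed to return on Windows.
unsigned normalizeExpectedPerms(unsigned mode) {
  return (mode & kAnyWrite) != 0 ? kReadWrite : kReadOnly;
}
```

A test would then compare the observed permissions against `normalizeExpectedPerms(requested)` rather than against the requested mask itself.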
Differential Revision: https://reviews.llvm.org/D101728
Pirama Arumuga Nainar [Wed, 5 May 2021 00:41:40 +0000 (17:41 -0700)]
[git-clang-format] Do not apply clang-format to symlinks
This fixes PR46992.
Git stores symlinks as text files and we should not format them even if
they have one of the requested extensions.
(Move the call to `cd_to_toplevel()` up a few lines so we can also print
the skipped symlinks during verbose output.)
Differential Revision: https://reviews.llvm.org/D101878
Nico Weber [Tue, 11 May 2021 15:43:48 +0000 (11:43 -0400)]
[lld/mac] Implement -sectalign
clang sometimes passes this flag along (see D68351), so we should implement it.
Differential Revision: https://reviews.llvm.org/D102247
Lang Hames [Tue, 11 May 2021 17:13:52 +0000 (10:13 -0700)]
Re-apply "[ORC-RT] Add unit test infrastructure, extensible_rtti..."
This reapplies 6d263b6f1c9 (which was reverted in 1c7c6f2b106) with a fix for a
CMake issue.
Peter Steinfeld [Tue, 11 May 2021 02:39:05 +0000 (19:39 -0700)]
[flang] Allow large and erroneous ac-implied-do's
We sometimes unroll an ac-implied-do of an array constructor into a flat list
of values. We then re-analyze the array constructor that contains the
resulting list of expressions. Such a list may or may not contain errors.
But when processing an array constructor with an unrolled ac-implied-do, the
compiler was building an expression to represent the extent of the resulting
array constructor containing the list of values. The number of operands
in this extent expression was based on the number of elements in the
unrolled list of values. For very large lists, this created an
expression so large that it could not be evaluated by the compiler
without overflowing the stack.
I fixed this by continuously folding the extent expression as each operand is
added to it. I added the test .../flang/test/Semantics/array-constr-big.f90
that will cause the compiler to seg fault without this change.
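The idea of folding the extent as each operand is added can be sketched as follows. The `Expr` type and the helper names are hypothetical and much simpler than flang's expression representation; the point is only that folding constants eagerly keeps the expression from growing with the number of elements.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical expression node: a constant leaf or an addition.
struct Expr {
  std::int64_t value = 0;          // meaningful only for constant leaves
  std::unique_ptr<Expr> lhs, rhs;  // both null for a constant leaf
  bool isConstant() const { return !lhs && !rhs; }
};

std::unique_ptr<Expr> makeConstant(std::int64_t v) {
  std::unique_ptr<Expr> e(new Expr());
  e->value = v;
  return e;
}

// Adds two expressions, folding immediately when both are constants so
// that no Add node is created for them.
std::unique_ptr<Expr> addFolded(std::unique_ptr<Expr> a,
                                std::unique_ptr<Expr> b) {
  assert(a && b);
  if (a->isConstant() && b->isConstant())
    return makeConstant(a->value + b->value);
  std::unique_ptr<Expr> sum(new Expr());
  sum->lhs = std::move(a);
  sum->rhs = std::move(b);
  return sum;
}

// Accumulates the total extent one operand at a time; with constant
// extents the result stays a single leaf however long the list is.
std::unique_ptr<Expr> totalExtent(const std::vector<std::int64_t>& extents) {
  std::unique_ptr<Expr> total = makeConstant(0);
  for (std::int64_t e : extents)
    total = addFolded(std::move(total), makeConstant(e));
  return total;
}
```

Without the folding step, the same loop would build a chain of Add nodes whose depth equals the number of elements, which is the shape that overflows the stack when evaluated recursively.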
Also, when the unrolled ac-implied-do expression contains errors, we were
repeating the same error message referencing the same source line for every
instance of the erroneous expression in the unrolled list. This potentially
resulted in a very long list of messages for a single error in the source code.
I fixed this by comparing the message being emitted to the previously emitted
message. If they are the same, I do not emit the message. This change is also
tested by the new test array-constr-big.f90.
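The comparison against the previously emitted message can be sketched like this. `emitWithoutRepeats` is a hypothetical helper operating on plain strings, not flang's message-handling API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the messages that would be emitted when
// a message identical to the previously emitted one is skipped.
std::vector<std::string> emitWithoutRepeats(
    const std::vector<std::string>& messages) {
  std::vector<std::string> emitted;
  for (const std::string& m : messages) {
    if (!emitted.empty() && emitted.back() == m)
      continue;  // same as the last emitted message: do not repeat it
    emitted.push_back(m);
  }
  return emitted;
}
```

Only consecutive repeats are dropped, so the same message can still appear again later if a different message was emitted in between.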
Several of the existing tests had duplicate error messages for the same source
line, and this change caused differences in their output. So I adjusted the
tests to match the new message emitting behavior.
Differential Revision: https://reviews.llvm.org/D102210
Sam Powell [Tue, 11 May 2021 16:57:52 +0000 (09:57 -0700)]
[TextAPI] Reformat llvm_unreachable message
Change llvm_unreachable message from "Unknown llvm.MachO.PlatformKind
enum" to "Unknown llvm::MachO::PlatformKind enum".
Differential revision: https://reviews.llvm.org/D102250
Lang Hames [Tue, 11 May 2021 16:51:12 +0000 (09:51 -0700)]
Revert "[ORC-RT] Add unit test infrastructure, extensible_rtti..."
This reverts commit 6d263b6f1c9 while I investigate the CMake failures that it
causes in some configurations.
Alan Phipps [Tue, 11 May 2021 16:40:11 +0000 (11:40 -0500)]
Reland "[Coverage] Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation"
Originally landed in:
6400905a615282c83a2fc6e49e57ff716aa8b4de
Reverted in:
668dccc396da4f593ac87c92dc0eb7bc983b5762
Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation
groups.
This change corrects the implementation for the branch coverage summary to do
the same thing for branches that is done for lines and regions. That is,
across function instantiations in an instantiation group, the maximum branch
coverage found in any of those instantiations is returned, with the total
number of branches being the same across instantiations.
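The merging rule described above can be sketched as follows. `BranchSummary` and `mergeInstantiations` are hypothetical names and do not reflect the actual `FunctionCoverageSummary` interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-instantiation branch summary.
struct BranchSummary {
  std::size_t numBranches;  // total branches in the function
  std::size_t covered;      // branches covered in this instantiation
};

// Merges an instantiation group: the total is the same for every
// instantiation, and the covered count is the maximum seen in any of them.
BranchSummary mergeInstantiations(const std::vector<BranchSummary>& group) {
  assert(!group.empty());
  BranchSummary merged = group.front();
  for (const BranchSummary& s : group) {
    assert(s.numBranches == merged.numBranches);
    merged.covered = std::max(merged.covered, s.covered);
  }
  return merged;
}
```

For a group with covered counts 1, 3, and 2 out of 4 branches, the merged summary reports 3 of 4 rather than a sum across instantiations.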
Differential Revision: https://reviews.llvm.org/D102193
Simon Pilgrim [Tue, 11 May 2021 16:30:31 +0000 (17:30 +0100)]
[X86][SSE] Add tests for permute(phaddw(phaddw(x,y),phaddw(z,w))) -> phaddw(phaddw(),phaddw()) folds.
We currently only fold if NumEltsPerLane == 4
zoecarver [Tue, 11 May 2021 16:43:14 +0000 (09:43 -0700)]
[libcxx][tests] Fix incomplete.verify tests by disabling them on clang-10.
For some reason clang-10 can't match the expected errors produced by
passing incomplete arrays to range access functions. Disabling the tests
is a stop-gap solution to fix the bots.
Craig Topper [Tue, 11 May 2021 16:32:19 +0000 (09:32 -0700)]
[RISCV] Use fractional LMULs for fixed length types smaller than riscv-v-vector-bits-min.
My thought process is that if v2i64 is an LMUL=1 type then v2i32
should be an LMUL=1/2 type. We limit the fractional LMUL so that
SEW=64 clips to LMUL=1, SEW=32 clips to LMUL=1/2, etc. This
ensures there's always a fractional LMUL available to truncate a type.
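One way to read the rule above is sketched below, with LMUL expressed in eighths so that fractions stay integral. The function name `lmulInEighths` is hypothetical. The sketch assumes power-of-two type sizes and that the clip is SEW/64, and it ignores everything else the backend takes into account.

```cpp
#include <algorithm>
#include <cassert>

// Returns LMUL in eighths (8 means LMUL=1, 4 means LMUL=1/2, 1 means
// LMUL=1/8) for a fixed-length vector of numElts elements of width sew
// bits, given the assumed minimum vector register size minVLen in bits.
unsigned lmulInEighths(unsigned numElts, unsigned sew, unsigned minVLen) {
  assert(sew >= 8 && sew <= 64 && minVLen != 0);
  const unsigned typeBits = numElts * sew;
  // Fraction of one minimum-size register that the type occupies.
  const unsigned bySize = typeBits * 8 / minVLen;
  // Clip: SEW=64 -> LMUL=1, SEW=32 -> 1/2, SEW=16 -> 1/4, SEW=8 -> 1/8.
  const unsigned smallest = sew / 8;
  return std::max(bySize, smallest);
}
```

With a minimum of 128 bits, v2i64 maps to 8 (LMUL=1) and v2i32 maps to 4 (LMUL=1/2), matching the example in the message.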
This does reduce the number of vsetvlis in some cases.
Some tests increase vsetvlis because the best container type for a
mask type is dependent on the LMUL+SEW that the mask was produced
from, but you can't tell that from the type. I think this is
something we need to solve in the machine IR when optimizing
vsetvlis.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D101215
Roman Lebedev [Tue, 11 May 2021 16:35:41 +0000 (19:35 +0300)]
[X86][Codegen] Shift amount mod: sh? i64 x, (32-y) --> sh? i64 x, -(y+32)
I've seen this in the RawSpeed's BitPumpMSB*::push() hotpath,
after fixing the buffer abstraction to a more sane one,
when looking into a +5% runtime regression.
I was hoping that this would fix it, but it does not look like it does.
This seems to be at least not worse than the original pattern.
But I'm actually mainly interested in the case where we already
compute `(y+32)` (see last test),
https://alive2.llvm.org/ce/z/ZCzJio
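The equivalence relies on the shift amount of a 64-bit shift being taken modulo 64: (32 - y) and -(y + 32) differ by exactly 64. The sketch below checks this with unsigned arithmetic; `shlMod64` and `rewriteIsEquivalent` are hypothetical helpers that model the masking explicitly and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models a 64-bit left shift whose amount is masked to its low 6 bits.
std::uint64_t shlMod64(std::uint64_t x, std::uint64_t amount) {
  return x << (amount & 63);
}

// Checks that shifting by (32 - y) and by -(y + 32) gives the same
// result once the amount is reduced modulo 64.
bool rewriteIsEquivalent(std::uint64_t x, std::uint64_t y) {
  const std::uint64_t original = std::uint64_t(32) - y;
  const std::uint64_t rewritten = std::uint64_t(0) - (y + 32);
  return shlMod64(x, original) == shlMod64(x, rewritten);
}
```

For example, with y = 5 both amounts reduce to 27 after masking.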
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D101944