Timm Bäder [Tue, 8 Jun 2021 10:33:40 +0000 (12:33 +0200)]
[NFC] Remove some include cycles
These files include themselves directly.
Simon Moll [Tue, 8 Jun 2021 11:55:21 +0000 (13:55 +0200)]
[VE][NFC] IRBuilder<> -> IRBuilderBase
VE's TTI broke with the switch from IRBuilder<> to IRBuilderBase.
Follow that change so that VE compiles again.
Nathan Sidwell [Tue, 4 May 2021 14:59:17 +0000 (07:59 -0700)]
[clang] p1099 using enum part 1
This adds support for p1099's 'using SCOPED_ENUM::MEMBER;'
functionality, bringing a member of an enumerator into the current
scope. The novel feature here is that there need not be a class
hierarchical relationship between the current scope and the scope of
the SCOPED_ENUM. That's a new thing; the closest equivalent is a
typedef or alias declaration. But this means that
Sema::CheckUsingDeclQualifier needs adjustment. (a) one can't call it
until one knows the set of decls that are being referenced -- if
exactly one is an enumerator, we're in the new territory. Thus it
needs calling later in some cases. Also (b) there are two ways we hold
the set of such decls. During parsing (or instantiating a dependent
scope) we have a lookup result, and during instantiation we have a set
of shadow decls. Thus two optional arguments, at most one of which
should be non-null.
Differential Revision: https://reviews.llvm.org/D100276
maekawatoshiki [Tue, 8 Jun 2021 11:29:48 +0000 (20:29 +0900)]
[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.
Also, a crash problem on legacy pass manager is fixed.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D99149
Vignesh Balasubramanian [Tue, 8 Jun 2021 11:03:39 +0000 (16:33 +0530)]
[OpenMP][OMPD] Implementation of OMPD debugging library - libompd.
This is the first of seven patches that implement OMPD, a debugging interface to support debugging of OpenMP programs.
It contains support code required in "openmp/runtime" for OMPD implementation.
Reviewed By: @hbae
Differential Revision: https://reviews.llvm.org/D100181
Kerry McLaughlin [Tue, 8 Jun 2021 09:49:22 +0000 (10:49 +0100)]
[CostModel] Return an invalid cost for memory ops with unsupported types
Fixes getTypeConversion to return `TypeScalarizeScalableVector` when a scalable vector
type cannot be legalized by widening/splitting. When this is the method of legalization
found, getTypeLegalizationCost will return an Invalid cost.
The getMemoryOpCost, getMaskedMemoryOpCost & getGatherScatterOpCost functions already call
getTypeLegalizationCost and will now also return an Invalid cost for unsupported types.
Reviewed By: sdesmalen, david-arm
Differential Revision: https://reviews.llvm.org/D102515
Sven van Haastregt [Tue, 8 Jun 2021 10:50:29 +0000 (11:50 +0100)]
[OpenCL] Add memory_scope_all_devices
Add the `memory_scope_all_devices` enum value, which is restricted to
OpenCL 3.0 or newer and the `__opencl_c_atomic_scope_all_devices`
feature. Also guard `memory_scope_all_svm_devices` accordingly, which
is already available in OpenCL 2.0.
The `__opencl_c_atomic_scope_all_devices` feature is header-only, so
set its define to 1 in `opencl-c-base.h`. This is done
unconditionally at the moment, as the mechanism for disabling
header-only options hasn't been decided yet.
This patch only adds a negative test for now. Ideally adding a CL3.0
run line to atomic-ops.cl should suffice as a positive test, but we
cannot do that yet until (at least) generic address spaces and program
scope variables are supported in OpenCL 3.0 mode.
Differential Revision: https://reviews.llvm.org/D103241
Fraser Cormack [Tue, 8 Jun 2021 10:05:09 +0000 (11:05 +0100)]
[RISCV] Add a test case showing inefficient vector codegen
Caroline Concatto [Tue, 11 May 2021 14:22:27 +0000 (15:22 +0100)]
[InstCombine] Add instcombine fold for extractelement + splat for scalable vectors
This patch lets scalable vectors use the fold that already exists for
fixed vectors, but only when the lane index is lower than the minimum
number of elements of the vector.
Differential Revision: https://reviews.llvm.org/D102404
Simon Pilgrim [Mon, 7 Jun 2021 17:19:44 +0000 (18:19 +0100)]
OptBisect.cpp - remove unused include. NFCI.
StringRef.h is included in OptBisect.h and we have no uses of std::string.
Simon Pilgrim [Mon, 7 Jun 2021 17:01:55 +0000 (18:01 +0100)]
[CostModel][X86] Improve AVX1/AVX2 truncation costs
Based on the worst-case numbers generated by D103695, we were overestimating the cost of a number of vector truncations:
AVX2: v2i32->v2i8, v2i64->v2i16 + v4i64->v4i32
AVX1: v2i32->v2i8, v4i64->v4i16 + v16i16->v16i8
Once we have a working set of conversion costs, the intention is to cleanup the tables and use legalized types a lot more to reduce the number of entries we currently have.
Simon Pilgrim [Mon, 7 Jun 2021 16:15:55 +0000 (17:15 +0100)]
MemCpyOptimizer.cpp - hasUndefContentsMSSA - Pass DataLayout by reference. NFCI.
Simon Pilgrim [Mon, 7 Jun 2021 16:10:59 +0000 (17:10 +0100)]
ValueTrackingTest.cpp - Pass DataLayout by reference. NFCI.
Simon Pilgrim [Mon, 7 Jun 2021 16:10:19 +0000 (17:10 +0100)]
NVPTXTargetLowering::LowerReturn - Pass DataLayout by reference. NFCI.
Kerry McLaughlin [Tue, 8 Jun 2021 08:16:07 +0000 (09:16 +0100)]
[LoopVectorize] Don't use strict reductions when reordering is allowed
If the `-enable-strict-reductions` flag is set to true, then currently we will
always choose to vectorize the loop with strict in-order reductions. This is
not necessary where we allow the reordering of FP operations, such as
when loop hints are passed via metadata.
This patch moves useOrderedReductions so that we can also check whether
loop hints allow reordering, in which case we should use the default
behaviour of vectorizing with unordered reductions.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D103814
Alex Zinenko [Tue, 8 Jun 2021 09:30:31 +0000 (11:30 +0200)]
[mlir] fix shared-libs build
David Green [Tue, 8 Jun 2021 09:18:58 +0000 (10:18 +0100)]
[DAG] Allow isNullOrNullSplat to see truncated zeroes
This sets the AllowTruncation flag on isConstOrConstSplat in
isNullOrNullSplat, allowing it to see truncated constant zeroes on
architectures such as AArch64, where only i32 and i64 are legal. As a
truncation of 0 is always 0, this should always be valid, allowing some
extra folding to happen including some of the cases from D103755.
Differential Revision: https://reviews.llvm.org/D103756
Martin Storsjö [Mon, 31 May 2021 20:59:56 +0000 (23:59 +0300)]
[clang] Apply MS ABI details on __builtin_ms_va_list on non-windows platforms on x86_64
This fixes inconsistencies in the ms_abi.c testcase.
Also add a couple cases of missing double pointers in the windows part
of the testcase; the outcome of building that testcase on windows hasn't
changed, but the previous form of the test was imprecise (checking
for "%[[STRUCT_FOO]]*" when clang actually generates "%[[STRUCT_FOO]]**"),
which still used to match.
Ideally this would share code with the native Windows case, but
X86_64ABIInfo and WinX86_64ABIInfo aren't superclasses/subclasses of
each other so it's impractical, and the code to share currently only
consists of a couple lines.
Differential Revision: https://reviews.llvm.org/D103837
Alex Zinenko [Mon, 7 Jun 2021 16:33:42 +0000 (18:33 +0200)]
[mlir] support memref of memref in standard-to-llvm conversion
Now that memref supports arbitrary element types, add support for memref of
memref and make sure it is properly converted to the LLVM dialect. The type
support itself avoids adding the interface to the memref type itself similarly
to other built-in types. This allows the shape, and therefore byte size, of the
memref descriptor to remain a lowering aspect that is easier to customize and
evolve as opposed to sanctifying it in the data layout specification for the
memref type itself.
Factor out the code previously in a testing pass to live in a dedicated data
layout analysis and use that analysis in the conversion to compute the
allocation size for memref of memref. Other conversions will be ported
separately.
Depends On D103827
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D103828
Alex Zinenko [Mon, 7 Jun 2021 16:33:29 +0000 (18:33 +0200)]
[mlir] Make MemRef element type extensible
Historically, MemRef only supported a restricted list of element types that
were known to be storable in memory. This is unnecessarily restrictive given
the open nature of MLIR's type system. Allow types to opt into being used as
MemRef elements by implementing a type interface. For now, the interface is
merely a declaration with no methods. Later, methods to query, e.g., the type
size or whether a type can alias elements of another type may be added.
Harden the "standard"-to-LLVM conversion against memrefs with non-builtin
types.
See https://llvm.discourse.group/t/rfc-memref-of-custom-types/3558.
Depends On D103826
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D103827
Alex Zinenko [Mon, 7 Jun 2021 16:33:18 +0000 (18:33 +0200)]
[mlir] fix integer type mismatch in alloc conversion to LLVM
Some places in the alloc-like op conversion use the converted index type
whereas other places use the pointer-sized integer type, which may not be the
same. Consistently use the converted index type, similarly to other address
calculations.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D103826
Javier Setoain [Tue, 8 Jun 2021 09:02:19 +0000 (10:02 +0100)]
Revert "[mlir][ArmSVE] Add basic mask generation operations"
This reverts commit 392af6a78bb7dfb87a24ed66db598c1d09ac756b.
Lang Hames [Sat, 5 Jun 2021 16:48:29 +0000 (09:48 -0700)]
[JITLink] Clarify LinkGraph::splitBlock contract in comment.
David Spickett [Fri, 4 Jun 2021 15:57:13 +0000 (16:57 +0100)]
[lldb] Set return status to failed when adding a command error
There is a common pattern:
result.AppendError(...);
result.SetStatus(eReturnStatusFailed);
I found that some commands don't actually "fail" but only
print "error: ..." because the second line got missed.
This can cause you to miss a failed command when you're
using the Python interface during testing.
(and produce some confusing script results)
I did not find any place where you would want to add
an error without setting the return status, so just
set eReturnStatusFailed whenever you add an error to
a command result.
This change does not remove any of the now redundant
SetStatus. This should allow us to see if there are any
tests that have commands unexpectedly fail with this change.
(the test suite passes for me but I don't have access to all
the systems we cover so there could be some corner cases)
Some tests that failed on x86 and AArch64 have been modified
to work with the new behaviour.
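The idea can be sketched with a small stand-in class. This is only an illustration, assuming a hypothetical `ToyCommandResult` type and `ToyStatus` enum; it is not LLDB's actual command result class, and the real change may differ in detail.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a command result object.
// The names here are illustrative assumptions, not LLDB's real API.
enum class ToyStatus { Success, Failed };

class ToyCommandResult {
public:
  // Adding an error always marks the result as failed, so a caller
  // cannot forget the separate SetStatus call.
  void AppendError(const std::string &message) {
    errors_.push_back("error: " + message);
    status_ = ToyStatus::Failed;
  }

  // Still available; calling it after AppendError is redundant but harmless.
  void SetStatus(ToyStatus status) { status_ = status; }

  bool Succeeded() const { return status_ == ToyStatus::Success; }
  const std::vector<std::string> &Errors() const { return errors_; }

private:
  ToyStatus status_ = ToyStatus::Success;
  std::vector<std::string> errors_;
};
```

With this shape, a script that checks `Succeeded()` sees the failure even when the command only called `AppendError`.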
Differential Revision: https://reviews.llvm.org/D103701
Tomasz Miąsko [Mon, 7 Jun 2021 21:35:50 +0000 (23:35 +0200)]
[Demangle][Rust] Parse const backreferences
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103848
Tomasz Miąsko [Mon, 7 Jun 2021 21:35:25 +0000 (23:35 +0200)]
[Demangle][Rust] Parse type backreferences
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103847
Tomasz Miąsko [Mon, 7 Jun 2021 21:34:34 +0000 (23:34 +0200)]
[Demangle][Rust] Parse path backreferences
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103459
Javier Setoain [Thu, 22 Apr 2021 08:29:02 +0000 (09:29 +0100)]
[mlir][ArmSVE] Add basic mask generation operations
These `arm_sve.cmp` functions are needed to generate scalable vector
masks as long as scalable vectors are not part of the standard types.
Once in standard, these can be removed and `std.cmp` can be used
instead.
Differential Revision: https://reviews.llvm.org/D103473
Denys Petrov [Mon, 7 Jun 2021 12:01:53 +0000 (15:01 +0300)]
[analyzer] [NFC] Implement a wrapper SValBuilder::getCastedMemRegionVal for similar functionality on region cast
Summary: Replaced code on region cast with a function-wrapper SValBuilder::getCastedMemRegionVal. This is a next step of code refining due to suggestions in D103319.
Differential Revision: https://reviews.llvm.org/D103803
Petr Hosek [Wed, 28 Apr 2021 18:23:54 +0000 (11:23 -0700)]
[Driver] Support libc++ in MSVC
This implements support for using libc++ headers and library in the MSVC
toolchain. We only support libc++ that is a part of the toolchain, and
not headers installed elsewhere on the system.
Differential Revision: https://reviews.llvm.org/D101479
Craig Topper [Tue, 8 Jun 2021 05:40:54 +0000 (22:40 -0700)]
[RISCV] Use 0 for Log2SEW for vle1/vse1 intrinsics to enable vsetvli optimization.
Missed in D103299.
Craig Topper [Tue, 8 Jun 2021 04:43:42 +0000 (21:43 -0700)]
[RISCV] Masked compares should use a tail agnostic policy.
Writes of a mask result are always tail agnostic.
Unfortunately, this seems to have made codegen worse. I can only
think this must be because the vsetvli was acting as some sort
of barrier that prevented some code movement in the scheduler.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D103331
Craig Topper [Tue, 8 Jun 2021 04:15:18 +0000 (21:15 -0700)]
[RISCV] Use AVL Operand instead of GPR for tied mask pseudo for vwadd.wv and similar.
I mistakenly copied this from an older version of our internal
repo.
Yonghong Song [Mon, 7 Jun 2021 21:54:23 +0000 (14:54 -0700)]
BPF: fix relocation types in lib/Object/RelocationResolver.cpp
Commit 6a2ea84600ba ("BPF: Add more relocation kinds")
added new relocations R_BPF_64_ABS64 and R_BPF_64_ABS32
for normal 64-bit and 32-bit data relocations.
They take over some of the functionality of
R_BPF_64_64 and R_BPF_64_32, so that the new R_BPF_64_64
and R_BPF_64_32 semantics are for ld_imm64 and
call instructions only.
The BPF support in lib/Object/RelocationResolver.cpp
is used to perform normal data relocations for
the case like DWARFObjInMemory with an object file
(search function getRelocationResolver() in file
DebugInfo/DWARF/DWARFContext.cpp) or llvm-readobj
to dump ".stack_sizes" section data.
In all these cases, normal 64-bit and 32-bit relocations
are performed, and such resolution
is exactly what is implemented in RelocationResolver.cpp.
But commit 6a2ea84600ba failed to change
R_BPF_64_64/R_BPF_64_32 to R_BPF_64_ABS64/R_BPF_64_ABS32.
This patch fixes the issue and adds a test for it
with llvm-readobj dumping the ".stack_sizes" section.
Differential Revision: https://reviews.llvm.org/D103864
Jez Ng [Tue, 8 Jun 2021 03:48:16 +0000 (23:48 -0400)]
[lld-macho] Implement -force_load_swift_libs
It causes libraries whose names start with "swift" to be force-loaded.
Note that unlike the more general `-force_load`, this flag only applies
to libraries specified via LC_LINKER_OPTIONS, and not those passed on
the command-line. This is what ld64 does.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D103709
Jez Ng [Tue, 8 Jun 2021 03:47:12 +0000 (23:47 -0400)]
[lld-macho] Implement cstring deduplication
Our implementation draws heavily from LLD-ELF's, which in turn delegates
its string deduplication to llvm-mc's StringTableBuilder. The messiness of
this diff is largely due to the fact that we've previously assumed that
all InputSections get concatenated together to form the output. This is
no longer true with CStringInputSections, which split their contents into
StringPieces. StringPieces are much more lightweight than InputSections,
which is important as we create a lot of them. They may also overlap in
the output, which makes it possible for strings to be tail-merged. In
fact, the initial version of this diff implemented tail merging, but
I've dropped it for reasons I'll explain later.
**Alignment Issues**
Mergeable cstring literals are found under the `__TEXT,__cstring`
section. In contrast to ELF, which puts strings that need different
alignments into different sections, clang's Mach-O backend puts them all
in one section. Strings that need to be aligned have the `.p2align`
directive emitted before them, which simply translates into zero padding
in the object file.
I *think* ld64 extracts the desired per-string alignment from this data
by preserving each string's offset from the last section-aligned
address. I'm not entirely certain since it doesn't seem consistent about
doing this; but perhaps this can be chalked up to cases where ld64 has
to deduplicate strings with different offset/alignment combos -- it
seems to pick one of their alignments to preserve. This doesn't seem
correct in general; we can in fact induce ld64 to produce a crashing
binary just by linking in an additional object file that only contains
cstrings and no code. See PR50563 for details.
Moreover, this scheme seems rather inefficient: since unaligned and
aligned strings are all put in the same section, which has a single
alignment value, it doesn't seem possible to tell whether a given string
doesn't have any alignment requirements. Preserving offset+alignments
for strings that don't need it is wasteful.
In practice, the crashes seen so far seem to stem from x86_64 SIMD
operations on cstrings. X86_64 requires SIMD accesses to be
16-byte-aligned. So for now, I'm thinking of just aligning all strings
to 16 bytes on x86_64. This is indeed wasteful, but implementation-wise
it's simpler than preserving per-string alignment+offsets. It also
avoids the aforementioned crash after deduplication of
differently-aligned strings. Finally, the overhead is not huge: using
16-byte alignment (vs no alignment) is only a 0.5% size overhead when
linking chromium_framework.
With these alignment requirements, it doesn't make sense to attempt tail
merging -- most strings will not be eligible since their overlaps aren't
likely to start at a 16-byte boundary. Tail-merging (with alignment) for
chromium_framework only improves size by 0.3%.
It's worth noting that LLD-ELF only does tail merging at `-O2`. By
default (at `-O1`), it just deduplicates w/o tail merging. @thakis has
also mentioned that they saw it regress compressed size in some cases
and therefore turned it off. `ld64` does not seem to do tail merging at
all.
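As a rough illustration of the approach described above (deduplicate whole strings, align every string to 16 bytes, no tail merging), here is a minimal sketch. The names `ToyCStringLayout`, `alignTo`, and `layoutCStrings` are hypothetical and this is not lld's actual implementation, which works on StringPieces and a string table builder.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type: each distinct string gets one output offset.
struct ToyCStringLayout {
  std::unordered_map<std::string, uint64_t> offsets; // string -> output offset
  uint64_t size = 0;                                  // total output size
};

// Round value up to the next multiple of alignment (alignment must be > 0).
inline uint64_t alignTo(uint64_t value, uint64_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// Assign offsets to strings in input order. Duplicates reuse the first
// offset; every new string starts at an aligned offset. No tail merging.
inline ToyCStringLayout layoutCStrings(const std::vector<std::string> &strings,
                                       uint64_t alignment = 16) {
  ToyCStringLayout layout;
  for (const std::string &s : strings) {
    if (layout.offsets.count(s))
      continue; // duplicate: keep the existing offset
    uint64_t offset = alignTo(layout.size, alignment);
    layout.offsets.emplace(s, offset);
    layout.size = offset + s.size() + 1; // +1 for the NUL terminator
  }
  return layout;
}
```

The padding between strings is where the size overhead mentioned above comes from; the sketch trades that space for not having to track per-string alignment.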
**Performance Numbers**
CString deduplication reduces chromium_framework from 250MB to 242MB, or
about a 3.2% reduction.
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:
    N           Min           Max        Median           Avg        Stddev
x  20          3.91          4.03         3.935          3.95   0.034641016
+  20          3.99          4.14         4.015        4.0365     0.0492336
Difference at 95.0% confidence
        0.0865 +/- 0.027245
        2.18987% +/- 0.689746%
        (Student's t, pooled s = 0.0425673)
As expected, cstring merging incurs some non-trivial overhead.
When passing `--no-literal-merge`, it seems that performance is the
same, i.e. the refactoring in this diff didn't cost us.
    N           Min           Max        Median           Avg        Stddev
x  20          3.91          4.03         3.935          3.95   0.034641016
+  20          3.89          4.02         3.935        3.9435   0.043197831
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D102964
Esme-Yi [Tue, 8 Jun 2021 03:00:52 +0000 (03:00 +0000)]
[yaml2obj] Fix buildbot-issue-4886
XCOFFEmitter.cpp:67:16: runtime error: null pointer passed as argument 2,
which is declared to never be null
Carl Ritson [Tue, 8 Jun 2021 02:31:08 +0000 (11:31 +0900)]
[AMDGPU] Allow oversize vaddr in GFX10 MIMG assembly
As a follow up to D103672, we should allow vaddr to be larger than
required when assembling GFX10 MIMG instructions.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D103733
Jake.Egan [Tue, 8 Jun 2021 02:36:07 +0000 (22:36 -0400)]
[AIX] Define __STDC_NO_ATOMICS__ and __STDC_NO_THREADS__
Revert/reapply to fix Git authorship metadata
Differential Revision: https://reviews.llvm.org/D103707
Chris Bowler [Tue, 8 Jun 2021 02:34:15 +0000 (22:34 -0400)]
Revert "[AIX] Define __STDC_NO_ATOMICS__ and __STDC_NO_THREADS__ predefined macros"
This reverts commit e6629be31e67190f0a524f009752d73410894560.
Carl Ritson [Tue, 8 Jun 2021 02:10:53 +0000 (11:10 +0900)]
[AMDGPU] Add v5f32/VReg_160 support for MIMG instructions
Avoid having to round up to v8f32/VReg_256 when only 5 VGPRs are
required for a MIMG address operand.
Maintain _V8 instruction variants of pseudo instructions allowing
assembly prior to GFX10 to work as-is. Currently the validator
can tell for GFX10 what the correct size is, so will disallow
oversize address registers.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103672
=Jake Egan [Tue, 8 Jun 2021 01:56:47 +0000 (21:56 -0400)]
[AIX] Define __STDC_NO_ATOMICS__ and __STDC_NO_THREADS__ predefined macros
Differential Revision: https://reviews.llvm.org/D103707
Vitaly Buka [Tue, 8 Jun 2021 01:54:14 +0000 (18:54 -0700)]
[NFC][scudo] Print errno of fork failure
This fork sometimes fails on the sanitizer-x86_64-linux-qemu bot.
Craig Topper [Mon, 7 Jun 2021 23:35:27 +0000 (16:35 -0700)]
[RISCV] Use bitfields to shrink the size of the vector load/store intrinsics to pseudo instruction lookup tables.
Vitaly Buka [Tue, 8 Jun 2021 00:10:22 +0000 (17:10 -0700)]
[NFC][LSAN] Limit the number of concurrent threads in the test
Test still fails with D88184 reverted.
The test was flaky on https://bugs.chromium.org/p/chromium/issues/detail?id=1206745 and
https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D102218
George Balatsouras [Fri, 4 Jun 2021 22:34:02 +0000 (15:34 -0700)]
[dfsan] Add full fast8 support
Complete support for fast8:
- amend shadow size and mapping in runtime
- remove fast16 mode and -dfsan-fast-16-labels flag
- remove legacy mode and make fast8 mode the default
- remove dfsan-fast-8-labels flag
- remove functions in dfsan interface only applicable to legacy
- remove legacy-related instrumentation code and tests
- update documentation.
Reviewed By: stephan.yichao.zhao, browneee
Differential Revision: https://reviews.llvm.org/D103745
LLVM GN Syncbot [Tue, 8 Jun 2021 00:16:13 +0000 (00:16 +0000)]
[gn build] Port 692d7166f771
Petr Hosek [Mon, 7 Jun 2021 19:35:02 +0000 (12:35 -0700)]
Revert "[libcxx][gardening] Move all algorithms into their own headers."
This reverts commit 7ed7d4ccb8991e2b5b95334b508f8cec2faee737 as it
uncovered a Clang bug, PR50592.
Petr Hosek [Mon, 7 Jun 2021 18:29:03 +0000 (11:29 -0700)]
Revert "[libcxx][module-map] creates submodules for private headers"
This reverts commit f1417eb9b1f51b689c78dd8cb0114c1749dd2845 as it
uncovered a Clang bug, PR50592.
Ben Shi [Mon, 7 Jun 2021 23:26:00 +0000 (07:26 +0800)]
[RISCV] Optimize bitwise and with constant for the Zbs extension
This patch optimizes (and r i) to
(BCLRI (BCLRI r, i0), i1) in which i = ~((1<<i0) | (1<<i1)).
or
(BCLRI (ANDI r, i0), i1) in which i = i0 & ~(1<<i1).
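The two rewrites can be checked with plain integer arithmetic. The sketch below models `bclri` and `andi` as ordinary 64-bit operations; the helper names are hypothetical, and the immediate-range limits of the real instructions are deliberately not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Models the effect of Zbs "bclri" on a 64-bit value: clear bit shamt.
inline uint64_t bclri(uint64_t r, unsigned shamt) {
  assert(shamt < 64);
  return r & ~(uint64_t{1} << shamt);
}

// Models "andi" as a plain bitwise and. The real instruction only accepts
// a small sign-extended immediate; that restriction is not modelled here.
inline uint64_t andi(uint64_t r, uint64_t imm) { return r & imm; }

// First form: equals r & i where i = ~((1 << i0) | (1 << i1)).
inline uint64_t andViaTwoBclri(uint64_t r, unsigned i0, unsigned i1) {
  return bclri(bclri(r, i0), i1);
}

// Second form: equals r & i where i = i0 & ~(1 << i1).
inline uint64_t andViaAndiBclri(uint64_t r, uint64_t i0, unsigned i1) {
  return bclri(andi(r, i0), i1);
}
```

Both forms rely only on the fact that clearing a bit is the same as and-ing with a mask that has that single bit unset.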
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103743
Arthur Eubanks [Mon, 7 Jun 2021 22:54:35 +0000 (15:54 -0700)]
Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
Needs to be discussed more.
This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46
This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1
This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23
This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd
Craig Topper [Mon, 7 Jun 2021 22:14:26 +0000 (15:14 -0700)]
[RISCV] Store Log2 of EEW in the vector load/store intrinsic to pseudo lookup tables. NFCI
This uses 3 bits of data instead of 7. I'm wondering if we can use
bitfields for the lookup table key where this would matter.
I also renamed the shift_amount template to log2 since it is used
with more than just an srl now.
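The "3 bits instead of 7" saving can be reproduced with a small calculation, assuming the largest EEW stored in the table is 64 (an assumption for this sketch, not something stated above). The helper names are hypothetical.

```cpp
// Number of bits needed to store value as an unsigned integer.
constexpr unsigned bitsNeeded(unsigned value) {
  unsigned bits = 0;
  while (value != 0) {
    ++bits;
    value >>= 1;
  }
  return bits;
}

// Log2 of an exact power of two (caller must pass a power of two).
constexpr unsigned log2Exact(unsigned value) {
  unsigned log = 0;
  while (value > 1) {
    ++log;
    value >>= 1;
  }
  return log;
}

// Storing EEW = 64 directly needs 7 bits; storing log2(64) = 6 needs 3.
static_assert(bitsNeeded(64) == 7, "raw EEW of 64 needs 7 bits");
static_assert(bitsNeeded(log2Exact(64)) == 3, "log2 of 64 fits in 3 bits");
```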
Stanislav Mekhanoshin [Mon, 7 Jun 2021 22:30:06 +0000 (15:30 -0700)]
[AMDGPU] Handle constant LDS uses from different kernels
This allows lowering an LDS variable into a kernel structure
even if there is a constant expression used from different
kernels.
Differential Revision: https://reviews.llvm.org/D103655
hsmahesha [Mon, 7 Jun 2021 22:28:13 +0000 (03:58 +0530)]
[AMDGPU] Introduce command line switch to control super aligning of LDS.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103817
hsmahesha [Mon, 7 Jun 2021 21:15:18 +0000 (02:45 +0530)]
[IR] Add utility to convert constant expression operands (of an instruction) to instructions.
In the situation where we need to replace a constant operand C from a constant expression CE
by an instruction NI, it is not possible without converting CE itself into an instruction. This
utility helps to convert the given set of constant expression operands from an instruction I
into a corresponding set of instructions.
The current use-case for this utility is from the patches - https://reviews.llvm.org/D103225
and https://reviews.llvm.org/D103655.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103661
Philip Reames [Mon, 7 Jun 2021 21:46:57 +0000 (14:46 -0700)]
[SCEV] Properly guard reasoning about infinite loops being UB on mustprogress
Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location.
Differential Revision: https://reviews.llvm.org/D103834
Daniil Suchkov [Fri, 4 Jun 2021 23:17:02 +0000 (23:17 +0000)]
[BasicAA] Handle PHIs without incoming values gracefully
Fix a bug introduced by f6f6f6375d1a4bced8a6e79a78726ab32b8dd879.
Now for empty PHIs, instead of crashing on assert(hasVal()) in
Optional's internals, we'll return NoAlias, as we did before that patch.
Differential Revision: https://reviews.llvm.org/D103831
Daniil Suchkov [Fri, 4 Jun 2021 22:55:35 +0000 (22:55 +0000)]
[Test] Add a JumpThreading test exposing a bug in BasicAA.
Jian Cai [Mon, 7 Jun 2021 21:30:32 +0000 (14:30 -0700)]
Revert "[AArch64] handle -Wa,-march="
This reverts commit fd11a26d368c5a909fb88548fef2cee7a6c2c931.
River Riddle [Mon, 7 Jun 2021 21:00:00 +0000 (14:00 -0700)]
[mlir-lsp-server] Fix bug in symbol use/def tracking
We were accidentally only using the first found reference, instead of all of them. This revision fixes this by properly tracking all references to a symbol.
Differential Revision: https://reviews.llvm.org/D103730
River Riddle [Mon, 7 Jun 2021 20:59:50 +0000 (13:59 -0700)]
[mlir-lsp-server] Add support for hover on symbol references
For now the hover simply shows the same information as hovering on the operation
name. If necessary this can be tweaked to something symbol specific later.
Differential Revision: https://reviews.llvm.org/D103728
River Riddle [Mon, 7 Jun 2021 20:59:39 +0000 (13:59 -0700)]
[mlir-lsp-server] Add support for hover on region operations
This revision adds support for hover on region operations, by temporarily removing the regions during printing. This revision also tweaks the hover format for operations to include symbol information, now that FuncOp can be shown in the hover.
Differential Revision: https://reviews.llvm.org/D103727
Nico Weber [Mon, 7 Jun 2021 15:00:34 +0000 (11:00 -0400)]
[lld/mac] Add reexports after reexporter to inputFiles
When a library "host"'s reexports change their installName with
`$ld$os10.11$install_name$host`, we used to write a load command for "host" but
write the version numbers of the reexport instead of "host". This fixes that.
I first thought that the rule is to take the version numbers from the library
that originally had that install name (implemented in D103819), but that's not
what ld64 seems to be doing: It takes the version number from the first dylib
with that install name it loads, and it loads the reexporting library before
the reexports. We already did most of that, we just added reexports before the
reexporter. After this change, we add the reexporter before the reexports.
Addresses https://bugs.llvm.org/show_bug.cgi?id=49800#c11 part 1.
(ld64 seems to add reexports after processing _all_ files on the command line,
while we add them right after the reexporter. For the common case of reexport +
$ld$ symbol changing back to the exporter name, this doesn't make a difference,
but you can construct a case where it does. I expect this to not make a
difference in practice though.)
Differential Revision: https://reviews.llvm.org/D103821
Amir Ayupov [Mon, 7 Jun 2021 20:17:00 +0000 (13:17 -0700)]
[ELF] getRelocatedSection: remove the check for ET_REL object file
getRelocatedSection interface should not check that the object file is
relocatable, as executable files may have relocations preserved with
`--emit-relocs` linker flag. The relocations are useful in context of post-link
binary analysis for function reference identification. For example, BOLT relies
on relocations to perform function reordering.
Reviewed By: MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D102296
Harald van Dijk [Mon, 7 Jun 2021 19:48:39 +0000 (20:48 +0100)]
[X32] Add Triple::isX32(), use it.
So far, support for x86_64-linux-gnux32 has been handled by explicit
comparisons of Triple.getEnvironment() to GNUX32. This worked as long as
x86_64-linux-gnux32 was the only X32 environment to worry about, but we
now have x86_64-linux-muslx32 as well. To support this, this change adds
an isX32() function and uses it. It replaces all checks for GNUX32 or
MuslX32 by isX32(), except for the following:
- Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat
GNUX32 and MuslX32 differently.
- computeTargetTriple() needs to be able to transform triples to add or
remove X32 from the environment and needs to map GNU to GNUX32, and
Musl to MuslX32.
- getMultiarchTriple() completely lacks any Musl support and retains the
explicit check for GNUX32 as it can only return x86_64-linux-gnux32.
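The shape of the change can be sketched with a trimmed-down enum. `ToyEnvironment` and `toX32` are hypothetical names for illustration; the real `llvm::Triple` has many more environments and this is not its implementation.

```cpp
// Hypothetical, trimmed-down environment list; not llvm::Triple's enum.
enum class ToyEnvironment { GNU, GNUX32, Musl, MuslX32, Other };

// One query that covers both X32 environments, replacing scattered
// comparisons against GNUX32 alone.
constexpr bool isX32(ToyEnvironment env) {
  return env == ToyEnvironment::GNUX32 || env == ToyEnvironment::MuslX32;
}

// The GNU -> GNUX32 and Musl -> MuslX32 mapping that a triple-adjusting
// function still has to spell out explicitly.
constexpr ToyEnvironment toX32(ToyEnvironment env) {
  switch (env) {
  case ToyEnvironment::GNU:
    return ToyEnvironment::GNUX32;
  case ToyEnvironment::Musl:
    return ToyEnvironment::MuslX32;
  default:
    return env; // already X32, or no X32 counterpart in this sketch
  }
}
```

Callers that only care about the ABI use `isX32`, while code that needs to tell GNU from Musl keeps its explicit checks, which matches the exceptions listed above.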
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D103777
Martin Storsjö [Mon, 31 May 2021 20:55:26 +0000 (23:55 +0300)]
[clang] Fix reading long doubles with va_arg on x86_64 mingw
On x86_64 mingw, long doubles are always passed indirectly as
arguments (see an existing case in WinX86_64ABIInfo::classify);
generalize the existing code for reading varargs - any non-aggregate
type that is larger than 64 bits (which would be both long double
in mingw, and __int128) are passed indirectly too.
This makes reading varargs consistent with how they're passed,
fixing interop with both gcc and clang callers, for long double
and __int128.
Differential Revision: https://reviews.llvm.org/D103452
Nikita Popov [Sat, 5 Jun 2021 08:49:51 +0000 (10:49 +0200)]
[LoopUnroll] Clamp unroll count to MaxTripCount
Unrolling with more iterations than MaxTripCount is pointless, as
those iterations can never be executed. As such, we clamp ULO.Count
to MaxTripCount if it is known. This means we no longer need to
consider iterations after MaxTripCount for exit folding, and the
CompletelyUnroll flag becomes independent of ULO.TripCount.
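A minimal sketch of the clamping rule, assuming a hypothetical helper in which a max trip count of 0 stands for "unknown" (that convention is an assumption of this sketch):

```cpp
#include <algorithm>

// Hypothetical helper: never unroll by more iterations than can execute.
// maxTripCount == 0 is treated as "not known" and leaves count unchanged.
inline unsigned clampUnrollCount(unsigned count, unsigned maxTripCount) {
  if (maxTripCount == 0)
    return count;
  return std::min(count, maxTripCount);
}
```

For example, a requested count of 8 with a known max trip count of 3 is reduced to 3, since iterations 4 through 8 could never run.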
Differential Revision: https://reviews.llvm.org/D103748
Peyton, Jonathan L [Fri, 4 Jun 2021 19:26:08 +0000 (14:26 -0500)]
[OpenMP][runtime] add .clang-tidy file
Use same checks as compiler-rt which removes checks for readability-*
and llvm-header style.
Differential Revision: https://reviews.llvm.org/D103711
AndreyChurbanov [Mon, 7 Jun 2021 18:42:51 +0000 (21:42 +0300)]
[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type
Refactored the dependence processing code and added the new inoutset dependence type.
The compiler can set the dependence flag to 0x8 when calling __kmpc_omp_task_with_deps.
The size of the dependence flag type changed from 1 to 4 bytes in clang.
All dependence flags the library receives so far, with the corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.
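The flag values listed above can be written down as an enum. The enum name, enumerator names, and the `depFlagName` helper are hypothetical and only mirror the numbers in this message; the 32-bit underlying type reflects the 4-byte size mentioned for clang, not the runtime's actual declarations.

```cpp
#include <cstdint>
#include <string>

// Values copied from the list above; all names are illustrative.
enum ToyDepFlag : uint32_t {
  DepIn = 1,
  DepOut = 2,
  DepInOut = 3,
  DepMutexInOutSet = 4,
  DepInOutSet = 8, // the new OpenMP 5.1 inoutset type
};

static_assert(sizeof(ToyDepFlag) == 4, "flag is modelled as 4 bytes");

// Map a raw flag value to the dependence type name used above.
inline std::string depFlagName(uint32_t flag) {
  switch (flag) {
  case DepIn:
    return "IN";
  case DepOut:
    return "OUT";
  case DepInOut:
    return "INOUT";
  case DepMutexInOutSet:
    return "MUTEXINOUTSET";
  case DepInOutSet:
    return "INOUTSET";
  default:
    return "UNKNOWN";
  }
}
```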
Differential Revision: https://reviews.llvm.org/D97085
Matt Arsenault [Sat, 5 Jun 2021 23:10:38 +0000 (19:10 -0400)]
AMDGPU: Move codegen test out of MIR test directory
This is testing an actual pass, not the MIR parser/printer.
Matt Arsenault [Sun, 6 Jun 2021 15:41:48 +0000 (11:41 -0400)]
GlobalISel: Use MMO helper for getting the size in bits
Matt Arsenault [Mon, 7 Jun 2021 18:11:52 +0000 (14:11 -0400)]
GlobalISel: Remove unnecessary .getReg(0)s
Philip Reames [Mon, 7 Jun 2021 18:16:23 +0000 (11:16 -0700)]
[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics
The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be:
for (unsigned i = 0; i < N; i += 2) { body; }
Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body".
The basic intuition behind this patch is as follows:
* A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential corner cases where we exit after the n-th wrap through uint32_max.
* Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this. In LLVM, this is tied to the mustprogress attribute.
Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop.
A couple notes for those reading along:
* I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules.
* We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path.
Differential Revision: https://reviews.llvm.org/D103118
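The arithmetic behind this intuition can be checked with a small standalone sketch. It is not SCEV code: tripCount is a hypothetical helper, and it assumes a 32-bit induction variable starting at 0 with a power-of-two step (so the step evenly divides 2^32). Under those assumptions the loop either exits before any unsigned wrap or never exits at all, which is the case the mustprogress reasoning rules out for bodies without side effects.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Trip count of:  for (uint32_t i = 0; i < N; i += Step) { body; }
// Returns std::nullopt when the loop can never exit: i only visits
// multiples of Step, the largest of which is 2^32 - Step, so the exit
// test never fails if N is larger than that.
inline std::optional<uint64_t> tripCount(uint32_t N, uint32_t Step) {
  assert(Step != 0 && (Step & (Step - 1)) == 0 &&
         "sketch assumes a power-of-two step");
  const uint64_t Space = uint64_t{1} << 32;
  if (uint64_t{N} > Space - Step)
    return std::nullopt; // wraps forever through the same values
  return (uint64_t{N} + Step - 1) / Step; // ceil(N / Step), no wrap needed
}
```

For Step == 2 the only non-terminating bound is N == UINT32_MAX; every other N gives a trip count reached before overflow.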
William S. Moses [Sat, 5 Jun 2021 01:27:15 +0000 (21:27 -0400)]
[MLIR][GPU] Simplify memcpy of cast
Introduce a simplification that allows memcpy of a cast to simply use the underlying op.
Differential Revision: https://reviews.llvm.org/D103830
Louis Dionne [Mon, 7 Jun 2021 17:48:45 +0000 (13:48 -0400)]
[libc++] Rename 'and' to '&&'
Nico Weber [Mon, 7 Jun 2021 14:22:25 +0000 (10:22 -0400)]
[lld/mac] Add a test for -reexport_library + -dead_strip_dylibs
Our behavior here already matched ld64, now we have a test for it.
(ld64 even strips the library here if you also pass -needed_library bar.dylib.
That seems wrong to me, and lld honors needed_library in that case.)
Differential Revision: https://reviews.llvm.org/D103812
William S. Moses [Sun, 2 May 2021 04:08:24 +0000 (00:08 -0400)]
[MLIR] Conditional Branch Argument Propagation
Within an operation in the true/false dest of a branch,
one can assume that the branch condition itself was true/false if
only that edge can reach the operation.
Differential Revision: https://reviews.llvm.org/D101709
Craig Topper [Mon, 7 Jun 2021 17:21:53 +0000 (10:21 -0700)]
[RISCV] Lower i8/i16 bswap/bitreverse to grevi/greviw with Zbp.
Include known bits support so we know we don't need to zext the
output if the input was already zero extended.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D103757
Philip Reames [Mon, 7 Jun 2021 17:20:45 +0000 (10:20 -0700)]
[RS4GC] Treat inttoptr as base pointer
This is a modified version of a patch by tolziplohu with a style change, and most importantly, a revised commit message.
inttoptr for a non-integral address space is currently ill defined in the LangRef. Figuring out exactly what the dynamic semantics of such a cast would be is hard, and not yet settled. Despite that, we still need to go ahead and implement something in RS4GC for a couple of reasons.
First, as a simple consistency argument. We apparently added support for constexpr inttoptrs a while back, and even have tests which exercised them. Having a lack of constant folding trigger a crash during lowering is non-ideal.
Second, and more fundamentally, the optimizer is allowed to insert undefined constructs in unreachable code. At the same time, we can't assume that dynamically dead code is always pruned before lowering. As a result, we must assume that inttoptrs can occur (even if completely ill defined) along dead paths. We need the lowering to not crash. The stackmaps produced can be garbage (as the assumption is the code is dynamically dead), but the lowering itself can't crash.
Differential Revision: https://reviews.llvm.org/D103492
jasonliu [Mon, 7 Jun 2021 14:52:55 +0000 (14:52 +0000)]
[XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing
Add the ability to parse the symbol table of 64-bit objects.
Reviewed By: jhenderson, DiggerLin
Differential Revision: https://reviews.llvm.org/D85774
Sanjay Patel [Mon, 7 Jun 2021 17:05:04 +0000 (13:05 -0400)]
[InstCombine] intersect nsz and ninf fast-math-flags (FMF) for fneg(fdiv) fold
https://alive2.llvm.org/ce/z/3KPvih
https://llvm.org/PR49654
Sanjay Patel [Mon, 7 Jun 2021 15:48:45 +0000 (11:48 -0400)]
[InstCombine] refactor match clauses; NFC
We need to adjust the FMF propagation on at least
one of these transforms as discussed in:
https://llvm.org/PR49654
...so this should make it easier to intersect flags.
Sanjay Patel [Mon, 7 Jun 2021 14:42:28 +0000 (10:42 -0400)]
[InstCombine] add tests for FMF propagation via -(C/X); NFC
There are bugs here as discussed in:
https://llvm.org/PR49654
Craig Topper [Mon, 7 Jun 2021 06:31:43 +0000 (23:31 -0700)]
[RISCV] Don't enable loop vectorizer interleaving if the V extension isn't enabled.
This can cause the vectorizer to generate interleaved scalar
code which might be ok for some CPUs, but definitely not all.
Disable it to restore the previous scalar behavior.
Differential Revision: https://reviews.llvm.org/D103787
Tomasz Miąsko [Mon, 7 Jun 2021 17:11:16 +0000 (19:11 +0200)]
[Demangle][Rust] Parse instantiating crate
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103460
Jian Cai [Mon, 7 Jun 2021 17:12:13 +0000 (10:12 -0700)]
[AArch64] handle -Wa,-march=
This fixed PR#48894 for AArch64. The issue has been fixed for Arm in
https://reviews.llvm.org/D95872
The following rules apply to -Wa,-march with this change:
- Only compiler options apply to non assembly files
- Compiler and assembler options apply to assembly files
- For assembly files, we prefer the assembler option(s) if we have both kinds of option
- Of the options that apply (or are preferred), the last value wins (it's not additive)
Reviewed By: DavidSpickett, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D103184
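The four rules above can be read as a small selection function. The sketch below is an illustration of those rules only, not the clang driver code: effectiveMarch and its parameters are hypothetical, with each vector holding the -march values in command-line order (CompilerMarch for plain -march=, AssemblerMarch for -Wa,-march=).

```cpp
#include <optional>
#include <string>
#include <vector>

// Pick the -march value that takes effect for one input file.
inline std::optional<std::string>
effectiveMarch(bool IsAssemblyFile,
               const std::vector<std::string> &CompilerMarch,
               const std::vector<std::string> &AssemblerMarch) {
  // Assembler options apply only to assembly files, and are preferred
  // there when both kinds are present. The last value wins.
  if (IsAssemblyFile && !AssemblerMarch.empty())
    return AssemblerMarch.back();
  // Otherwise fall back to the compiler options; again the last value wins.
  if (!CompilerMarch.empty())
    return CompilerMarch.back();
  return std::nullopt; // no -march given in any form
}
```

For example, with compiler values {"armv8-a"} and assembler values {"armv8.1-a", "armv8.2-a"}, an assembly file gets "armv8.2-a" while a C file gets "armv8-a".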
Florian Hahn [Mon, 7 Jun 2021 16:45:19 +0000 (17:45 +0100)]
[VPlan] Print successors of VPRegionBlocks.
The non-DOT printing does not include the successors of VPRegionBlocks.
This patch uses the same style for printing successors as for
VPBasicBlock.
I think the printing of successors could be improved a bit further, as
at the moment it is hard to ensure a check line matches all successors.
But that can be done as a follow-up.
Reviewed By: a.elovikov
Differential Revision: https://reviews.llvm.org/D103515
Jianzhou Zhao [Mon, 7 Jun 2021 16:55:56 +0000 (16:55 +0000)]
[dfsan] Fix internal build errors because of more strict warning checks
Krzysztof Parzyszek [Mon, 7 Jun 2021 13:51:11 +0000 (08:51 -0500)]
[docs] Set Phabricator as the tool for pre-commit reviews
Differential Revision: https://reviews.llvm.org/D103811
Louis Dionne [Fri, 4 Jun 2021 17:31:22 +0000 (13:31 -0400)]
[libc++] Simplify a few macros in __config
Several macros were guarded with a check along the lines of:
#ifndef MACRO
# define MACRO ...
#endif
However, some of these macros are never intended to be defined by users,
so it's pointless to make this check (i.e. the first #ifndef is always
true). This commit removes those checks.
The motivation for doing this cleanup is to remove the impression that
arbitrary configuration macros can be defined by users when including
libc++ headers, which doesn't work reliably and leads to macro spaghetti.
If one needs to be able to override a knob in the __config, that's fine,
but the proper way to do that is to document the macro as being a public
facing knob in the documentation, and most likely to migrate that macro
to __config_site (depending on the nature of the macro).
Differential Revision: https://reviews.llvm.org/D103705
Raphael Isemann [Mon, 7 Jun 2021 16:45:03 +0000 (18:45 +0200)]
[lldb] Fix TypeSystemClang compilation after D101777
We apparently now need to pass the DeclName of the target decl to the
constructor.
Raphael Isemann [Mon, 7 Jun 2021 16:43:00 +0000 (18:43 +0200)]
[NFC] Add missing include to LaneBitmask.h to fix modules build
Sander de Smalen [Mon, 7 Jun 2021 15:55:44 +0000 (16:55 +0100)]
[CostModel][AArch64] NFC: Simplify some cost model tests for SVE.
* Merged some functions into a single function, to make the costs more obvious.
* Moved scalable-mem-op-cost-model.ll -> sve-ldst.ll to be more consistent with other filenames.
Sander de Smalen [Mon, 7 Jun 2021 12:02:38 +0000 (13:02 +0100)]
[CostModel] Return Invalid cost in getArithmeticCost instead of crashing for scalable vectors.
This fixes an issue in BasicTTIImpl.h where it tries to do a
cast<FixedVectorType> on a scalable vector type in order to get the
scalarization cost. Because scalarization of scalable vectors is not
supported, we return Invalid instead.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D103798
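The idea of returning an invalid cost instead of asserting can be sketched without any LLVM types. Everything below is hypothetical: VecType and scalarizationCost are stand-ins invented for this illustration, and std::optional plays the role of a cost that may be Invalid.

```cpp
#include <optional>

// Hypothetical vector type: a minimum element count plus a flag saying
// whether the real count is a runtime multiple of it (scalable).
struct VecType {
  unsigned MinElts;
  bool Scalable;
};

// Cost of scalarizing an operation on Ty. A scalable vector has no
// compile-time element count, so there is nothing to multiply by:
// report "invalid" (std::nullopt here) rather than crash.
inline std::optional<unsigned> scalarizationCost(const VecType &Ty,
                                                 unsigned PerElementCost) {
  if (Ty.Scalable)
    return std::nullopt;
  return Ty.MinElts * PerElementCost;
}
```

A caller can then treat a missing value as "do not pick this option" instead of relying on a cast that only holds for fixed-width vectors.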
Tomasz Miąsko [Mon, 7 Jun 2021 16:14:06 +0000 (18:14 +0200)]
[Demangle][Rust] Parse dyn-trait-assoc-binding
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103364
Tomasz Miąsko [Mon, 7 Jun 2021 16:13:19 +0000 (18:13 +0200)]
[Demangle][Rust] Parse dyn-trait
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103361
Tomasz Miąsko [Mon, 7 Jun 2021 16:12:13 +0000 (18:12 +0200)]
[Demangle][Rust] Parse dyn-bounds
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103151
Valentin Clement [Mon, 7 Jun 2021 16:09:25 +0000 (12:09 -0400)]
[mlir][openacc] Add conversion for if operand to scf.if for standalone data operation
This patch converts the if condition on standalone data operations such as acc.update,
acc.enter_data and acc.exit_data to an scf.if with the operation in the if region.
It removes the operation when the if condition is constant and false. It removes
the condition if it is constant and true.
Conversion to scf.if is done in order to use the translation to LLVM IR dialect out of the box.
Not sure whether this is the best approach or whether we should perform this during the translation from OpenACC
to LLVM IR dialect. Any thoughts welcome.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103325
Valentin Clement [Mon, 7 Jun 2021 15:40:26 +0000 (11:40 -0400)]
[mlir][openacc] Add canonicalization for standalone data operations for if condition
This patch adds canonicalization for standalone data operations with a constant if condition.
It is extracted from D103325.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103712
Hsiangkai Wang [Thu, 20 May 2021 02:22:08 +0000 (10:22 +0800)]
[Clang][CodeGen] Set the size of llvm.lifetime to unknown for scalable types.
If the memory object is of a scalable type, we do not know its exact
size at compile time. Set the size of the lifetime marker to unknown if
the object is a scalable one.
Differential Revision: https://reviews.llvm.org/D102822