platform/upstream/llvm.git
3 years ago[GlobalISel] Add a G_LROUND instruction
Jessica Paquette [Thu, 19 Aug 2021 22:41:36 +0000 (15:41 -0700)]
[GlobalISel] Add a G_LROUND instruction

Meant to represent the `@llvm.lround.*` family.

Add the opcode, docs, and verification.

Differential Revision: https://reviews.llvm.org/D108417

3 years ago[libomptarget][amdcgn] Add build dependency for llvm-link and opt
Joachim Protze [Thu, 19 Aug 2021 21:28:51 +0000 (23:28 +0200)]
[libomptarget][amdcgn] Add build dependency for llvm-link and opt

D107156 and D107320 are not sufficient when OpenMP is built as llvm runtime
(LLVM_ENABLE_RUNTIMES=openmp) because dependencies only work within the same
cmake instance.

We could limit the dependency to cases where libomptarget/plugins are really
built. But compared to the whole llvm project, building openmp runtime is
negligible and postponing the build of OpenMP runtime after the dependencies
are ready seems reasonable.

The direct dependency introduced in D107156 and D107320 is necessary for the
case where OpenMP is built as llvm project (LLVM_ENABLE_PROJECTS=openmp).

Differential Revision: https://reviews.llvm.org/D108404

3 years agoRevert "[InstrProfiling] Make COFF use the ELF comdat scheme (drop link.exe compatibi...
Fangrui Song [Thu, 19 Aug 2021 23:42:57 +0000 (16:42 -0700)]
Revert "[InstrProfiling] Make COFF use the ELF comdat scheme (drop link.exe compatibility)"

This reverts commit fbb8e772ec501a1b71643db90e9c6445e17d7cac.

Accidentally pushed.

3 years ago[AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization.
Amara Emerson [Wed, 18 Aug 2021 07:19:58 +0000 (00:19 -0700)]
[AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization.

For some reductions like G_VECREDUCE_OR on AArch64, we need to scalarize
completely if the source is <= 64b. This change adds support for that in
the legalizer. If the source has a pow-2 num elements, then we can do
a tree reduction using the scalar operation in the individual elements.
Otherwise, we just create a sequential chain of operations.

For AArch64, we only need to scalarize if the input is <64b. If it's greater than
64b then we can first do a fewerElements step to 64b, taking advantage of vector
instructions until we reach the point of scalarization.

I also had to relax the verifier checks for reductions because the intrinsics
support <1 x EltTy> types, which we lower to scalars for GlobalISel.
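
A minimal sketch of the two scalarized strategies described above, written
as plain C++ over an array of lanes rather than as legalizer code. The
function name `reduceOr` and the use of `std::vector` are assumptions made
for illustration only; they are not part of the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: OR-reduce already-scalarized lanes.
uint8_t reduceOr(std::vector<uint8_t> lanes) {
  if (lanes.empty())
    return 0;
  std::size_t n = lanes.size();
  bool isPow2 = (n & (n - 1)) == 0;
  if (isPow2) {
    // Tree reduction: combine adjacent pairs, halving the count each round.
    while (n > 1) {
      for (std::size_t i = 0; i < n / 2; ++i)
        lanes[i] = lanes[2 * i] | lanes[2 * i + 1];
      n /= 2;
    }
    return lanes[0];
  }
  // Non-power-of-2 element count: sequential chain of scalar operations.
  uint8_t acc = lanes[0];
  for (std::size_t i = 1; i < n; ++i)
    acc |= lanes[i];
  return acc;
}
```

Both paths compute the same value; the tree form only shortens the
dependency chain from n-1 operations to log2(n) levels.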

Differential Revision: https://reviews.llvm.org/D108276

3 years ago[InstrProfiling] Make COFF use the ELF comdat scheme (drop link.exe compatibility)
Fangrui Song [Thu, 19 Aug 2021 23:38:32 +0000 (16:38 -0700)]
[InstrProfiling] Make COFF use the ELF comdat scheme (drop link.exe compatibility)

The COFF specific `DataReferencedByCode` complexity (D103372 D103717) is due to
a link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE is
not really dropped, so it can cause duplicate definition error.

3 years ago[test] Split icall.ll into comdat/nocomdat variants
Fangrui Song [Thu, 19 Aug 2021 23:36:58 +0000 (16:36 -0700)]
[test] Split icall.ll into comdat/nocomdat variants

darwin/aix don't support comdat. Using IR comdat is incorrect.

3 years ago[lld][WebAssembly] Handle weakly defined symbols in shared libraries.
Sam Clegg [Wed, 18 Aug 2021 16:57:34 +0000 (12:57 -0400)]
[lld][WebAssembly] Handle weakly defined symbols in shared libraries.

In the case of weakly defined symbols in shared libraries we now
generate both an import and an export.  The dynamic linker can then
choose a winner from among all the shared libraries that define a
given symbol.

Previously any direct usage of a weakly defined symbol would use the
DSO-local definition (for example, even though there would be a single
address for a weakly defined function, each DSO could end up directly
calling its local version).

Fixes: https://github.com/emscripten-core/emscripten/issues/13773

Differential Revision: https://reviews.llvm.org/D108413

3 years ago[WebAssembly] Make bitmask instructions return unsigned ints
Thomas Lively [Thu, 19 Aug 2021 23:23:47 +0000 (16:23 -0700)]
[WebAssembly] Make bitmask instructions return unsigned ints

Since they are bitmasks, it will be more common for them to be used and
potentially extended to 64-bit integers as unsigned values rather than signed
values.

Differential Revision: https://reviews.llvm.org/D108401

3 years ago[AArch64][GlobalISel] Fix miscompile of <16 x s8> G_EXTRACT_VECTOR_ELT.
Amara Emerson [Thu, 19 Aug 2021 22:45:50 +0000 (15:45 -0700)]
[AArch64][GlobalISel] Fix miscompile of <16 x s8> G_EXTRACT_VECTOR_ELT.

When support for copying vector s8 lanes was added recently, this also
had the side effect of fixing a fallback for <16 x s8> extracts since
both used the same helper. However, there was a bug in another helper
to get the regclass for a specific FPR-native type, which was assigning
FPR16 to s8 instead of FPR8.

3 years ago[libc++][NFC] Update and alphabetize CREDITS.TXT
Kent Ross [Thu, 19 Aug 2021 23:12:37 +0000 (23:12 +0000)]
[libc++][NFC] Update and alphabetize CREDITS.TXT

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D108263

3 years ago[libc++] [doc] Add issue tracking for spaceship operator<=> implementation
Kent Ross [Thu, 19 Aug 2021 23:10:47 +0000 (23:10 +0000)]
[libc++] [doc] Add issue tracking for spaceship operator<=> implementation

Add issue tracking and assignment for the implementation of P1614R2: The Mothership has Landed.

Reviewed By: cjdb, #libc, Mordante, Quuxplusone

Differential Revision: https://reviews.llvm.org/D107877

3 years ago[libc++][NFC] Remove unused include in <compare>.
Kent Ross [Thu, 19 Aug 2021 23:06:52 +0000 (23:06 +0000)]
[libc++][NFC] Remove unused include in <compare>.

`<type_traits>` was included in the first iteration of `<compare>` when
it was created as a monolithic header, then never removed. Removing it
now is a beneficial no-op since it is not guaranteed by the standard
and is already included by all of its subheaders.

Reviewed By: cjdb, #libc, Quuxplusone

Differential Revision: https://reviews.llvm.org/D107801

3 years ago[WebAssembly] Add explicit casts to silence -Wc++11-narrowing
Thomas Lively [Thu, 19 Aug 2021 23:00:07 +0000 (16:00 -0700)]
[WebAssembly] Add explicit casts to silence -Wc++11-narrowing

3 years agoRevert "[DebugInfo] generate btf_tag annotations for DIComposite types"
Yonghong Song [Thu, 19 Aug 2021 22:54:38 +0000 (15:54 -0700)]
Revert "[DebugInfo] generate btf_tag annotations for DIComposite types"

This reverts commit 2fded193e7a8fb5bd8fb339f00fd9de686390530.

Buildbot reports some test failures. Revert now so I can take time
to fix the issues.

3 years ago[DebugInfo] generate btf_tag annotations for DIComposite types
Yonghong Song [Mon, 19 Jul 2021 06:43:48 +0000 (23:43 -0700)]
[DebugInfo] generate btf_tag annotations for DIComposite types

Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
A field "annotations" is introduced to DIComposite, and the
annotations are represented as a DINodeArray, similar to
DIComposite elements. The following example illustrates
how annotations are encoded in IR:
  distinct !DICompositeType(..., annotations: !10)
  !10 = !{!11, !12}
  !11 = !{!"btf_tag", !"a"}
  !12 = !{!"btf_tag", !"b"}
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotation, as in the above example.

Differential Revision: https://reviews.llvm.org/D106615

3 years ago[WebAssembly] More convert_low and promote_low codegen
Thomas Lively [Thu, 19 Aug 2021 22:37:12 +0000 (15:37 -0700)]
[WebAssembly] More convert_low and promote_low codegen

The convert_low and promote_low instructions can widen the lower two lanes of a
four-lane vector, but we were previously scalarizing patterns that widened lanes
besides the low two lanes. The commit adds a shuffle to move the widened lanes
into the low lane positions so the convert_low and promote_low instructions can
be used instead of scalarizing.

Depends on D108266.

Differential Revision: https://reviews.llvm.org/D108341

3 years ago[WebAssembly] Pattern match SIMD convert_low and promote_low during ISel
Thomas Lively [Thu, 19 Aug 2021 22:24:28 +0000 (15:24 -0700)]
[WebAssembly] Pattern match SIMD convert_low and promote_low during ISel

Since the simplest DAG patterns for convert_low and promote_low instructions
involved v2i32, v2f32, v4i64, and v4f64 types, which are not legal in the
WebAssembly backend and would be eliminated by type legalization, we were
previously matching those patterns in a DAG combine before the type legalization
stage. However in cases where the vectors were wider than 128 bits, the patterns
we matched were not created until the type legalization stage when the wide
vectors were split up. Type legalization would continue to eliminate the illegal
types we were matching as well, so the code ended up scalarized.

To make the ISel for these instructions more robust, match the scalarized
patterns rather than the patterns containing illegal types. Add tests with
double-wide vectors to show that this works as intended.

Fixes PR51098.
Depends on D107502.

Differential Revision: https://reviews.llvm.org/D108266

3 years agoRefactor inlineRetainOrClaimRVCalls. NFC
Akira Hatanaka [Thu, 19 Aug 2021 21:55:45 +0000 (14:55 -0700)]
Refactor inlineRetainOrClaimRVCalls. NFC

This is in preparation for committing https://reviews.llvm.org/D103000.

3 years agoUpdate logic to close inherited file descriptors.
Rumeet Dhindsa [Thu, 19 Aug 2021 21:38:04 +0000 (14:38 -0700)]
Update logic to close inherited file descriptors.

This patch adds support for closing all fds inherited by the child
process by iterating over /proc/self/fd entries.

Differential Revision: https://reviews.llvm.org/D105732

3 years ago[NFC] Cleanup AttributeList::getStackAlignment()
Arthur Eubanks [Thu, 19 Aug 2021 21:20:59 +0000 (14:20 -0700)]
[NFC] Cleanup AttributeList::getStackAlignment()

So that we don't use a confusing index.

3 years ago[Support] Update `MD5` to follow other hashes.
Alexandre Rames [Wed, 18 Aug 2021 22:27:08 +0000 (15:27 -0700)]
[Support] Update `MD5` to follow other hashes.

Introduce `StringRef final()` and `StringRef result()`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D107781

3 years ago[NFC] Replace some attribute methods that use confusing indexes
Arthur Eubanks [Thu, 19 Aug 2021 21:02:11 +0000 (14:02 -0700)]
[NFC] Replace some attribute methods that use confusing indexes

3 years ago[NFC][Support] Move `MD5` members in `InternalState`.
Alexandre Rames [Thu, 19 Aug 2021 17:09:53 +0000 (10:09 -0700)]
[NFC][Support] Move `MD5` members in `InternalState`.

This prepares an update to follow other hashes.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108388

3 years ago[MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op
Morten Borup Petersen [Thu, 19 Aug 2021 19:34:43 +0000 (20:34 +0100)]
[MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op

Previously, ExecuteRegionOps with multiple return values would fail a round-trip test due to missing parentheses around the types.

Differential Revision: https://reviews.llvm.org/D108402

3 years ago[hwasan] re-enable stack safety by default.
Florian Mayer [Thu, 19 Aug 2021 15:37:02 +0000 (16:37 +0100)]
[hwasan] re-enable stack safety by default.

The failed assertion was fixed in D108337.

Reviewed By: vitalybuka, eugenis

Differential Revision: https://reviews.llvm.org/D108381

3 years agoAdd implicit map for a list item appears in a reduction clause.
Jennifer Yu [Mon, 16 Aug 2021 14:18:34 +0000 (07:18 -0700)]
Add implicit map for a list item appears in a reduction clause.

A new rule was added in OpenMP 5.0:
If a list item appears in a reduction, lastprivate or linear clause
on a combined target construct then it is treated as if it also appears
in a map clause with a map-type of tofrom.

Currently, map clauses are added implicitly for all captured variables,
but not for list items that are array-element or array-section
expressions.

This change adds an implicit map clause for array elements used in a
reduction clause. The map clause is skipped if the expression is not
mappable.
Note: for linear and lastprivate, only variable names are accepted,
so the map has already been added through the captured variables.

To do so:
During the mappable check, if an error occurs, suppress the diagnostic
and skip adding the implicit map clause.

The changes:
1> Add code to generate the implicit map in ActOnOpenMPExecutableDirective,
   for omp 5.0 and up.
2> Add an extra default parameter NoDiagnose to ActOnOpenMPMapClause:
Use it to skip the error as well as skip adding the implicit map during
the mappable check.

Note: there are only two places that need to check NoDiagnose. In the
rest, either the check is for < omp 5.0 or the error has already been
generated for the reduction clause.

Differential Revision: https://reviews.llvm.org/D108132

3 years ago[openmp] Disable the tests that block CI for amdgpu and host offloading.
Jon Chesterfield [Thu, 19 Aug 2021 19:43:05 +0000 (20:43 +0100)]
[openmp] Disable the tests that block CI for amdgpu and host offloading.

3 years agoMove function definition out-of-line to fix the modularized build (NFC)
Adrian Prantl [Thu, 19 Aug 2021 19:24:36 +0000 (12:24 -0700)]
Move function definition out-of-line to fix the modularized build (NFC)

3 years ago[WebAssembly] Legalize vector types by widening
Thomas Lively [Thu, 19 Aug 2021 19:07:32 +0000 (12:07 -0700)]
[WebAssembly] Legalize vector types by widening

The default legalization of unsupported vector types is to promote the integers
in each lane, which leads to extra sign or zero extending and masking when
moving data into and out of vectors. Switch our preferred type legalization from
the default to vector widening, which keeps the data in the low lanes of the
vector rather than in the low bits of each lane. The unused high lanes can be
ignored.

Half-wide vectors are now loaded from memory into the low 64 bits of the v128
rather than spread out among the lanes. As a result, v128.load64_splat is a much
more common operation, so add new patterns to support it.

Differential Revision: https://reviews.llvm.org/D107502

3 years ago[sanitizer] Fix for CMAKE_CXX_FLAGS update
Brian Cain [Thu, 19 Aug 2021 13:26:41 +0000 (06:26 -0700)]
[sanitizer] Fix for CMAKE_CXX_FLAGS update

With unquoted ${CMAKE_CXX_FLAGS}, the REGEX fails when it's empty:

```
CMake Error at lib/scudo/standalone/CMakeLists.txt:14 (string):
string sub-command REGEX, mode REPLACE needs at least 6 arguments total to
command.
```

3 years ago[libc][Obvious] Fix llvm_libc_ext.td.
Siva Chandra Reddy [Thu, 19 Aug 2021 18:41:18 +0000 (18:41 +0000)]
[libc][Obvious] Fix llvm_libc_ext.td.

3 years agoRevert "[mlir][Linalg] Allow all build methods of Structured ops to specify additiona...
MaheshRavishankar [Thu, 19 Aug 2021 18:52:46 +0000 (11:52 -0700)]
Revert "[mlir][Linalg] Allow all build methods of Structured ops to specify additional attributes."

This reverts commit 95ddc8341ae2c27229ad3dcf1d55abebcec15d02.

Differential Revision: https://reviews.llvm.org/D108396

3 years ago[LLDB][GUI] Handle return key for compound fields
Omar Emara [Wed, 18 Aug 2021 22:07:55 +0000 (15:07 -0700)]
[LLDB][GUI] Handle return key for compound fields

This patch handles the return key for compound fields like lists and
mapping fields. The return key, if not handled by the field, will select
the next primary element, skipping secondary elements like remove
buttons and the like.

Differential Revision: https://reviews.llvm.org/D108331

3 years ago[runtimeunroll] Support multiple exits to latch exit w/prolog loop
Philip Reames [Thu, 19 Aug 2021 18:41:09 +0000 (11:41 -0700)]
[runtimeunroll] Support multiple exits to latch exit w/prolog loop

This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used.

This is the prolog companion to D107381. Since this was LGTMed, a problem with DT updating was reported against that patch.  I rolled in the analogous fix here as it seemed obvious, and not worth re-review.

As an aside, our prolog form leaves a lot of potential value on the floor when there is an invariant load or invariant condition in the loop being runtime unrolled. We should probably consider a "required prolog" heuristic.  (Alternatively, maybe we should be peeling these cases more aggressively?)

Differential Revision: https://reviews.llvm.org/D108262

3 years ago[AMDGPU] Add alias.scope metadata to lowered LDS struct
Stanislav Mekhanoshin [Tue, 17 Aug 2021 19:58:07 +0000 (12:58 -0700)]
[AMDGPU] Add alias.scope metadata to lowered LDS struct

Without it, alias analysis is unable to disambiguate accesses to the
structure fields, unlike accesses to distinct variables. As a result, we
cannot combine ds_read and ds_write operations when there is any store in
between, since it is always considered clobbering.

Differential Revision: https://reviews.llvm.org/D108315

3 years ago[GuardWidening] Preserve MemorySSA
Nikita Popov [Thu, 19 Aug 2021 16:42:27 +0000 (18:42 +0200)]
[GuardWidening] Preserve MemorySSA

As reported on https://bugs.llvm.org/show_bug.cgi?id=51020, the
guard widening pass doesn't preserve MemorySSA, so it can no
longer be scheduled in the same loop pass manager as LICM. However,
the loop-schedule.ll test indicates that this is supposed to work.

Fix this by preserving MemorySSA if available, as this seems to be
trivial in this case (we only need to drop the memory access for
the removed guards).

Differential Revision: https://reviews.llvm.org/D108386

3 years ago[mlir][Linalg] Allow all build methods of Structured ops to specify additional attrib...
MaheshRavishankar [Thu, 19 Aug 2021 17:08:59 +0000 (10:08 -0700)]
[mlir][Linalg] Allow all build methods of Structured ops to specify additional attributes.

Differential Revision: https://reviews.llvm.org/D108338

3 years ago[lldb][NFC] Remove unused header include
Alex Langford [Thu, 19 Aug 2021 18:06:41 +0000 (11:06 -0700)]
[lldb][NFC] Remove unused header include

3 years ago[runtimeunroll] Fix reported DT verification error after 94d0914
Philip Reames [Thu, 19 Aug 2021 18:04:31 +0000 (11:04 -0700)]
[runtimeunroll] Fix reported DT verification error after 94d0914

In 94d0914, I added support for unrolling of multiple exit loops which have multiple exits reaching the latch.  Per reports on the review post commit, I'd missed updating the domtree for one case.  This fix addresses that omission.

There's no new test as this is covered by existing tests with expensive verification turned on.

3 years ago[libc] add atoi, atol, and atoll
Michael Jones [Wed, 18 Aug 2021 17:24:14 +0000 (17:24 +0000)]
[libc] add atoi, atol, and atoll

This is based on the work done to add strtoll and the other strto
functions. The atoi functions also were added to stdc and
entrypoints.txt.
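
As a rough illustration of building the atoi family on top of the strto
functions, each function can forward to its strto counterpart with base 10
and no end pointer. This is a sketch using the host C library, and the
`sketch` namespace is hypothetical; it is not the llvm-libc implementation.

```cpp
#include <cstdlib>

namespace sketch {
// Each function parses a base-10 integer and ignores trailing characters.
int atoi(const char *str) {
  return static_cast<int>(std::strtol(str, nullptr, 10));
}
long atol(const char *str) { return std::strtol(str, nullptr, 10); }
long long atoll(const char *str) { return std::strtoll(str, nullptr, 10); }
} // namespace sketch
```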

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D108330

3 years ago[libc++][NFCI] Remove unnecessary exception-throwing base classes
Louis Dionne [Thu, 19 Aug 2021 16:15:31 +0000 (12:15 -0400)]
[libc++][NFCI] Remove unnecessary exception-throwing base classes

__split_buffer_common was entirely unused, and __deque_base_common
was unused except for two calls to __throw_out_of_range(), which have
been inlined.

The usual intent of the __xxx_base_common base classes is to localize
where the exception-throwing code is instantiated, however that wasn't
the case here because we never explicitly instantiated those base classes
in the shared library, unlike what we do for basic_string and vector.

Differential Revision: https://reviews.llvm.org/D108384

3 years ago[SLP][X86] Add llvm.isnan intrinsic test coverage
Simon Pilgrim [Thu, 19 Aug 2021 17:44:26 +0000 (18:44 +0100)]
[SLP][X86] Add llvm.isnan intrinsic test coverage

We still need to tag the llvm.isnan.? intrinsic as vectorizable

3 years ago[SLP][X86] Regenerate intrinsic.ll test checks
Simon Pilgrim [Thu, 19 Aug 2021 17:35:14 +0000 (18:35 +0100)]
[SLP][X86] Regenerate intrinsic.ll test checks

3 years ago[libc] Add a trivial implementation for bcmp
Guillaume Chatelet [Thu, 19 Aug 2021 17:55:16 +0000 (17:55 +0000)]
[libc] Add a trivial implementation for bcmp

Differential Revision: https://reviews.llvm.org/D108225
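
For context, a trivial bcmp can be sketched as a byte-by-byte loop. bcmp
only has to report whether the two ranges differ, not how they order, so any
nonzero value is an acceptable result on mismatch. The `sketch` namespace is
hypothetical and this is not necessarily the code that was committed.

```cpp
#include <cstddef>

namespace sketch {
// Returns 0 if the first `count` bytes are equal, nonzero otherwise.
int bcmp(const void *lhs, const void *rhs, std::size_t count) {
  const unsigned char *l = static_cast<const unsigned char *>(lhs);
  const unsigned char *r = static_cast<const unsigned char *>(rhs);
  for (std::size_t i = 0; i < count; ++i)
    if (l[i] != r[i])
      return 1;
  return 0;
}
} // namespace sketch
```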

3 years ago[libomptarget][nfc] Move lanemask_t type into target_impl.h
Jon Chesterfield [Thu, 19 Aug 2021 17:42:23 +0000 (18:42 +0100)]
[libomptarget][nfc] Move lanemask_t type into target_impl.h

3 years agoAArch64: copy all parts of the mem operand across when combining a store
Tim Northover [Thu, 19 Aug 2021 14:15:37 +0000 (15:15 +0100)]
AArch64: copy all parts of the mem operand across when combining a store

In particular we were dropping volatility, which can lead to unwanted
transformations.

3 years ago[CostModel][X86] Add isnan half/float/double costs tests
Simon Pilgrim [Thu, 19 Aug 2021 17:06:52 +0000 (18:06 +0100)]
[CostModel][X86] Add isnan half/float/double costs tests

3 years ago[InstCombine] Avoid folding GEPs across loop boundaries
Chang-Sun Lin, Jr [Thu, 19 Aug 2021 17:01:34 +0000 (20:01 +0300)]
[InstCombine] Avoid folding GEPs across loop boundaries

Folding a GEP from outside to inside a loop will materialize an add where there wasn't an equivalent operation before. Check the containing loops before making this fold.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107935

3 years ago[NFC][InstCombine] Add test for one-use one-index geps in different loops
Chang-Sun Lin, Jr [Thu, 19 Aug 2021 16:53:48 +0000 (19:53 +0300)]
[NFC][InstCombine] Add test for one-use one-index geps in different loops

3 years ago[OpaquePtr][Inline] Use byval type instead of pointee type
Arthur Eubanks [Fri, 9 Jul 2021 16:37:50 +0000 (09:37 -0700)]
[OpaquePtr][Inline] Use byval type instead of pointee type

Reviewed By: #opaque-pointers, dblaikie

Differential Revision: https://reviews.llvm.org/D105711

3 years agoUse v16i8 rather than v2i64 as the VT for memset expansion on AArch64.
Owen Anderson [Thu, 19 Aug 2021 08:00:29 +0000 (08:00 +0000)]
Use v16i8 rather than v2i64 as the VT for memset expansion on AArch64.

This allows the instruction selector to realize that it can directly
broadcast the low byte of the memset value, rather than replicating
it to a 64-bit GPR before broadcasting.

This fixes PR50985.

Differential Revision: https://reviews.llvm.org/D108354

3 years agoFix unknown parameter Wdocumentation warnings. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 16:45:26 +0000 (17:45 +0100)]
Fix unknown parameter Wdocumentation warnings. NFC.

3 years ago[clang] Do not warn unused -enable-trivial-auto-var-init-zero-knowing-it-will-be...
Yi Kong [Wed, 18 Aug 2021 08:24:04 +0000 (16:24 +0800)]
[clang] Do not warn unused -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang

Android enables zero initialisation globally by default, but also allows
subprojects to override it with a different option. Clang complains about
the above flag being unused in this case.

Instead of adding a 75 char long -no-* flag, don't warn about an unused
argument for this flag.

Differential Revision: https://reviews.llvm.org/D108278

3 years agoMemoryBuiltins: trailing , on collection literal
Augie Fackler [Thu, 19 Aug 2021 15:17:39 +0000 (11:17 -0400)]
MemoryBuiltins: trailing , on collection literal

This was probably bugging me more than is reasonable, but it makes merging
changes in this file slightly less annoying to have the trailing comma
here. I only noticed this because Rust is currently carrying a patch to
this file and it kept making life a little difficult.

3 years agoFix CodeGen/X86/fsafdo_test2.ll fail in release
Thomas Preud'homme [Thu, 19 Aug 2021 11:04:42 +0000 (12:04 +0100)]
Fix CodeGen/X86/fsafdo_test2.ll fail in release

Require debug build for CodeGen/X86/fsafdo_test2.ll since it checks for
messages only printed in debug mode.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D108364

3 years agoFix empty paragraph passed to parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 15:48:09 +0000 (16:48 +0100)]
Fix empty paragraph passed to parameter Wdocumentation warning. NFC.

3 years agoRevert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand."
Craig Topper [Thu, 19 Aug 2021 15:42:05 +0000 (08:42 -0700)]
Revert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand."

This reverts commit add08c874147638e52d89eb07e40797dbc98d73b.

There was a compile time jump on tramp3d-v4 on https://llvm-compile-time-tracker.com/
Want to see if it goes away with this reverted.

3 years ago[CRT][LIT] build the target_cflags for Popen properly
Jinsong Ji [Thu, 19 Aug 2021 15:37:50 +0000 (15:37 +0000)]
[CRT][LIT] build the target_cflags for Popen properly

We recently enabled crt for powerpc in
https://reviews.llvm.org/rGb7611ad0b16769d3bf172e84fa9296158f8f1910.

And we started to see some unexpected error message when running
check-runtimes.

eg:
https://lab.llvm.org/buildbot/#/builders/57/builds/9488/steps/6/logs/stdio
line 100 - 103:

"
clang-14: error: unknown argument: '-m64 -fno-function-sections'
clang-14: error: unknown argument: '-m64 -fno-function-sections'
clang-14: error: unknown argument: '-m64 -fno-function-sections'
clang-14: error: unknown argument: '-m64 -fno-function-sections'
"

Looks like we shouldn't strip the space at the beginning,
or else the command line passed to subprocess won't work well.

Reviewed By: phosek, MaskRay

Differential Revision: https://reviews.llvm.org/D108329

3 years ago[Clang][AST][NFC] Resolve FIXME: Make CXXRecordDecl *Record const.
Alfsonso Gregory [Thu, 19 Aug 2021 15:36:05 +0000 (16:36 +0100)]
[Clang][AST][NFC] Resolve FIXME: Make CXXRecordDecl *Record const.

Differential Revision: https://reviews.llvm.org/D107477

3 years ago[docs] Document how to install sphinx and recommonmark on Ubuntu
Yaron Keren [Thu, 19 Aug 2021 13:57:15 +0000 (16:57 +0300)]
[docs] Document how to install sphinx and recommonmark on Ubuntu

Differential Revision: https://reviews.llvm.org/D108374

3 years ago[ISel] Expand saddsat and ssubsat via asr and xor
David Green [Thu, 19 Aug 2021 15:08:07 +0000 (16:08 +0100)]
[ISel] Expand saddsat and ssubsat via asr and xor

This changes the lowering of saddsat and ssubsat so that instead of
using:
  r,o = saddo x, y
  c = setcc r < 0
  s = c ? INTMAX : INTMIN
  ret o ? s : r
it uses asr and xor to materialize the INTMAX/INTMIN constants:
  r,o = saddo x, y
  s = ashr r, BW-1
  x = xor s, INTMIN
  ret o ? x : r
https://alive2.llvm.org/ce/z/TYufgD

This seems to reduce the instruction count in most testcases across most
architectures. X86 has some custom lowering added to compensate for
cases where it can increase instruction count.
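
The two expansions can be sketched in C++ for a 32-bit saturating add. This
is only an illustration of the arithmetic, not the SelectionDAG code. It
assumes the GCC/Clang builtin `__builtin_add_overflow` (which stores the
wrapped sum) and an arithmetic right shift of negative values, which is
guaranteed from C++20 and is what GCC and Clang do in earlier modes. The
function names are hypothetical.

```cpp
#include <cstdint>

// Old form: choose the saturation constant with a compare and a select.
int32_t saddsatSelect(int32_t x, int32_t y) {
  int32_t r;
  bool o = __builtin_add_overflow(x, y, &r); // r holds the wrapped sum
  int32_t s = (r < 0) ? INT32_MAX : INT32_MIN;
  return o ? s : r;
}

// New form: materialize the constant with an arithmetic shift and an xor.
int32_t saddsatXor(int32_t x, int32_t y) {
  int32_t r;
  bool o = __builtin_add_overflow(x, y, &r);
  int32_t s = r >> 31;         // 0 if r >= 0, -1 (all ones) if r < 0
  int32_t sat = s ^ INT32_MIN; // 0 -> INT32_MIN, -1 -> INT32_MAX
  return o ? sat : r;
}
```

On overflow the wrapped result has the opposite sign of the true sum, which
is why a negative `r` selects INT32_MAX in both forms.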

Differential Revision: https://reviews.llvm.org/D105853

3 years ago[AIX] Remove XFAIL from macro-same-context
Jinsong Ji [Thu, 19 Aug 2021 14:52:41 +0000 (14:52 +0000)]
[AIX] Remove XFAIL from macro-same-context

We have enabled inline asm integrated assembler support,
so this test is passing now.

3 years agoFix unknown parameter Wdocumentation warnings. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:39:53 +0000 (15:39 +0100)]
Fix unknown parameter Wdocumentation warnings. NFC.

3 years agoFix unknown parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:30:06 +0000 (15:30 +0100)]
Fix unknown parameter Wdocumentation warning. NFC.

3 years agoFix unknown parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:23:53 +0000 (15:23 +0100)]
Fix unknown parameter Wdocumentation warning. NFC.

3 years agoFix unknown parameter Wdocumentation warning. NFC.
Simon Pilgrim [Thu, 19 Aug 2021 14:22:05 +0000 (15:22 +0100)]
Fix unknown parameter Wdocumentation warning. NFC.

3 years ago[CostModel][X86] Add VPOPCNTDQ/BITALG ctpop costs
Simon Pilgrim [Thu, 19 Aug 2021 13:25:17 +0000 (14:25 +0100)]
[CostModel][X86] Add VPOPCNTDQ/BITALG ctpop costs

VPOPCNTDQ + BITALG add ctpop instructions for vXi64/vXi32 + vXi16/vXi8 vector types respectively

3 years ago[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand.
Craig Topper [Thu, 19 Aug 2021 14:18:30 +0000 (07:18 -0700)]
[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand.

Previously we pre-calculated this and cached it for every
instruction in the function. Most of the calculated results will
never be used. So instead calculate it only on the first use, and
then cache it.

The cache was originally added to fix a compile time issue which
caused r216066 to be reverted.
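
The general compute-on-first-use pattern looks roughly like the sketch
below. The class `LazyCache`, its integer keys and the placeholder
`expensive` computation are all hypothetical stand-ins; the real change
operates on instructions and extend types inside SelectionDAGBuilder.

```cpp
#include <unordered_map>

// Hypothetical stand-in for a per-function cache of derived values.
class LazyCache {
  std::unordered_map<int, int> cache;
  int computeCount = 0;

  // Placeholder for the expensive per-instruction analysis.
  int expensive(int key) {
    ++computeCount;
    return key * 2;
  }

public:
  // Compute the value on first request, then serve it from the cache.
  int get(int key) {
    auto it = cache.find(key);
    if (it != cache.end())
      return it->second;
    int value = expensive(key);
    cache.emplace(key, value);
    return value;
  }

  int computations() const { return computeCount; }
};
```

Keys that are never queried cost nothing, which is the point of moving away
from pre-calculating a value for every instruction.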

This change exposed that we weren't pre-computing the Value for
Arguments. I've explicitly disabled that for now as it seemed to
regress some tests on AArch64 which has sext built into its compare
instructions.

Spotted while investigating how to improve heuristics to work better
with RISCV preferring sign extend for unsigned compares for i32 on RV64.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D107976

3 years ago[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC
Craig Topper [Wed, 18 Aug 2021 22:02:33 +0000 (15:02 -0700)]
[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC

This matches how they are called and allows some isa/cast/dyn_cast
to be removed.

Differential Revision: https://reviews.llvm.org/D108333

3 years ago[RISCV] Reduce duplicate code for calling SimplifyDemandedBits.
Craig Topper [Wed, 18 Aug 2021 19:21:04 +0000 (12:21 -0700)]
[RISCV] Reduce duplicate code for calling SimplifyDemandedBits.

This encapsulates the APInt creation and worklist management into
a helper function.

To keep one common interface I've used Log2_32 in places that
previously created a mask by subtracting 1 from a power of 2.

Differential Revision: https://reviews.llvm.org/D108324

3 years ago[ARM] Add MVE min/max intrinsic tests. NFC
David Green [Thu, 19 Aug 2021 13:33:34 +0000 (14:33 +0100)]
[ARM] Add MVE min/max intrinsic tests. NFC

3 years ago[DWARF][Verifier][NFC] Use reference to DWARFAddressRangesVector to avoid copying.
Alexey Lapshin [Thu, 19 Aug 2021 11:19:07 +0000 (14:19 +0300)]
[DWARF][Verifier][NFC] Use reference to DWARFAddressRangesVector to avoid copying.

Avoid copying when accessing RangesOrError.get().

3 years ago[CostModel][X86] Add VPOPCNT/BITALG test coverage for ctpop/cttz costs
Simon Pilgrim [Thu, 19 Aug 2021 12:49:50 +0000 (13:49 +0100)]
[CostModel][X86] Add VPOPCNT/BITALG test coverage for ctpop/cttz costs

3 years ago[RISCV][test] Improve tests for (add (mul x, c1), c2)
Ben Shi [Thu, 19 Aug 2021 13:03:46 +0000 (21:03 +0800)]
[RISCV][test] Improve tests for (add (mul x, c1), c2)

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107710

3 years ago[mlir][SparseTensor] Split scf.for loop into masked/unmasked parts
Matthias Springer [Thu, 19 Aug 2021 12:46:12 +0000 (21:46 +0900)]
[mlir][SparseTensor] Split scf.for loop into masked/unmasked parts

Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores.

Differential Revision: https://reviews.llvm.org/D107733

3 years agoRevert "[CVP] processSwitch: Remove default case when switch cover all possible values."
Sanjay Patel [Thu, 19 Aug 2021 12:43:51 +0000 (08:43 -0400)]
Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."

This reverts commit 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e.
This patch may cause miscompiles because it missed a constraint
as shown in the examples from:
https://llvm.org/PR51531

3 years ago[InstCombine] add min/max intrinsics as freely invertible candidates
Sanjay Patel [Wed, 18 Aug 2021 22:45:51 +0000 (18:45 -0400)]
[InstCombine] add min/max intrinsics as freely invertible candidates

In the optimized test, we are able to peek through the
min/max that has 2 min/max operands and invert them all:
https://alive2.llvm.org/ce/z/7gYMN5
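
The underlying identity can be sketched in C++ (this is an illustration, not the InstCombine code): since bitwise 'not' reverses signed order, not(max(not a, not b)) equals min(a, b). The function names are hypothetical, and the check is exhaustive only for 8-bit values.

```cpp
#include <algorithm>
#include <cstdint>

// not(smax(not A, not B)): a max whose operands and result are all inverted.
inline int8_t invertedMaxOfNots(int8_t A, int8_t B) {
  const int8_t NotA = static_cast<int8_t>(~A);
  const int8_t NotB = static_cast<int8_t>(~B);
  return static_cast<int8_t>(~std::max(NotA, NotB));
}

// Exhaustively compare against smin(A, B) for all pairs of 8-bit values.
inline bool identityHoldsForAllI8() {
  for (int A = -128; A <= 127; ++A) {
    for (int B = -128; B <= 127; ++B) {
      const int8_t X = static_cast<int8_t>(A);
      const int8_t Y = static_cast<int8_t>(B);
      if (invertedMaxOfNots(X, Y) != std::min(X, Y))
        return false;
    }
  }
  return true;
}
```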

3 years ago[InstCombine] add tests for min/max with inverts; NFC
Sanjay Patel [Wed, 18 Aug 2021 22:27:15 +0000 (18:27 -0400)]
[InstCombine] add tests for min/max with inverts; NFC

3 years ago[InstCombine] add one-use check for min/max fold with not operands; NFC
Sanjay Patel [Wed, 18 Aug 2021 21:02:55 +0000 (17:02 -0400)]
[InstCombine] add one-use check for min/max fold with not operands; NFC

This makes the intrinsic logic match the cmp+select idiom folds
just below. It's not clearly a win either way unless we think
that a 'not' op costs more than min/max.

The cmp+select folds on these patterns are more extensive than
the intrinsics currently and may have some complicated interactions,
so I'm trying to make those line up and bring the optimizations
for intrinsics up to parity.

3 years ago[openmp][nfc] Replace OMPGridValues array with struct
Jon Chesterfield [Thu, 19 Aug 2021 12:25:41 +0000 (13:25 +0100)]
[openmp][nfc] Replace OMPGridValues array with struct

[nfc] Replaces enum indices into an array with a struct. Named the
fields to match the enum, leaves memory layout and initialization unchanged.

Motivation is to later safely remove dead fields and replace redundant ones
with (compile time) computation. It should also be possible to factor some
common fields into a base and introduce a gfx10 amdgpu instance with less
duplication than the arrays of integers require.
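
The shape of the change can be sketched as follows. The field names and values are made up for illustration and are not the actual OMPGridValues contents; the point is only that a struct with fields in the same order keeps the layout and the brace initializer unchanged.

```cpp
#include <cstddef>

namespace Before {
// Values live in a plain array and are addressed through enum indices.
enum GridIndex { GV_Warp_Size = 0, GV_Max_Teams = 1, GV_Default_WG_Size = 2, GV_NumFields };
constexpr unsigned Values[GV_NumFields] = {64, 128, 256};
} // namespace Before

namespace After {
// One named field per former enum entry, declared in the same order.
struct GridValues {
  unsigned GV_Warp_Size;
  unsigned GV_Max_Teams;
  unsigned GV_Default_WG_Size;
};
constexpr GridValues Values = {64, 128, 256};
} // namespace After

// Same size, same initializer, but named member access instead of indexing.
static_assert(sizeof(After::GridValues) == sizeof(Before::Values),
              "memory layout is unchanged");
static_assert(After::Values.GV_Max_Teams == Before::Values[Before::GV_Max_Teams],
              "named field matches the old indexed value");
```

Once the data is a struct, removing a dead field becomes a compile error at every remaining use instead of a silently shifted index.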

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D108339

3 years ago[LoopFlatten] Fix assertion failure
Rosie Sumpter [Tue, 17 Aug 2021 10:43:34 +0000 (11:43 +0100)]
[LoopFlatten] Fix assertion failure

There is an assertion failure in computeOverflowForUnsignedMul
(used in checkOverflow) due to the inner and outer trip counts
having different types. This occurs when the IV has been widened,
but the loop components are not successfully rediscovered.
This is fixed by some refactoring of the code in findLoopComponents
which identifies the trip count of the loop.

Differential Revision: https://reviews.llvm.org/D108107

3 years ago[LegalizeTypes][VP] Add widening support for binary VP ops
Fraser Cormack [Wed, 11 Aug 2021 12:09:14 +0000 (13:09 +0100)]
[LegalizeTypes][VP] Add widening support for binary VP ops

This patch adds the beginnings of more thorough support in the
legalizers for vector-predicated (VP) operations.

The first step is the ability to widen illegal vectors. The more
complicated scenario in which the result/operands need widening but the
mask doesn't has not been handled here. That would require a lot of code
without an in-tree target on which to test it.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D107904

3 years ago[CodeCompletion] Provide placeholders for known attribute arguments
Sam McCall [Fri, 13 Aug 2021 08:17:59 +0000 (10:17 +0200)]
[CodeCompletion] Provide placeholders for known attribute arguments

Completion now looks more like function/member completion:

  used
  alias(Aliasee)
  abi_tag(Tags...)

Differential Revision: https://reviews.llvm.org/D108109

3 years ago[AArch64][SVE] Teach cost model that masked loads/stores are cheap
Matthew Devereau [Thu, 19 Aug 2021 10:42:20 +0000 (11:42 +0100)]
[AArch64][SVE] Teach cost model that masked loads/stores are cheap

Reduce the cost of VLS masked loads/stores to make the vectorizer emit them more frequently.

3 years ago[RISCV][test] Add new tests for add optimization in the zba extension
Ben Shi [Tue, 17 Aug 2021 06:21:24 +0000 (14:21 +0800)]
[RISCV][test] Add new tests for add optimization in the zba extension

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D108188

3 years ago[X86] Regenerate store_op_load_fold.ll test checks
Simon Pilgrim [Thu, 19 Aug 2021 11:30:46 +0000 (12:30 +0100)]
[X86] Regenerate store_op_load_fold.ll test checks

3 years ago[CodeComplete] Only complete attributes that match the current LangOpts
Sam McCall [Mon, 16 Aug 2021 09:40:24 +0000 (11:40 +0200)]
[CodeComplete] Only complete attributes that match the current LangOpts

Differential Revision: https://reviews.llvm.org/D108111

3 years ago[tsan] Fix pthread_once() on Mac OS X
Marco Elver [Thu, 19 Aug 2021 11:17:45 +0000 (13:17 +0200)]
[tsan] Fix pthread_once() on Mac OS X

Change 636428c727cd enabled BlockingRegion hooks for pthread_once().
Unfortunately this seems to cause crashes on Mac OS X which uses
pthread_once() from locations that seem to result in crashes:

| ThreadSanitizer:DEADLYSIGNAL
| ==31465==ERROR: ThreadSanitizer: stack-overflow on address 0x7ffee73fffd8 (pc 0x00010807fd2a bp 0x7ffee7400050 sp 0x7ffee73fffb0 T93815)
|     #0 __tsan::MetaMap::GetSync(__tsan::ThreadState*, unsigned long, unsigned long, bool, bool) tsan_sync.cpp:195 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x78d2a)
|     #1 __tsan::MutexPreLock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned int) tsan_rtl_mutex.cpp:143 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x6cefc)
|     #2 wrap_pthread_mutex_lock sanitizer_common_interceptors.inc:4240 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x3dae0)
|     #3 flockfile <null>:2 (libsystem_c.dylib:x86_64+0x38a69)
|     #4 puts <null>:2 (libsystem_c.dylib:x86_64+0x3f69b)
|     #5 wrap_puts sanitizer_common_interceptors.inc (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x34d83)
|     #6 __tsan::OnPotentiallyBlockingRegionBegin() cxa_guard_acquire.cpp:8 (foo:x86_64+0x100000e48)
|     #7 wrap_pthread_once tsan_interceptors_posix.cpp:1512 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x2f6e6)

From the stack trace it can be seen that the caller is unknown, and the
resulting stack-overflow seems to indicate that whoever the caller is
does not have enough stack space or otherwise is running in a limited
environment not yet ready for full instrumentation.

Fix it by reverting behaviour on Mac OS X to not call BlockingRegion
hooks from pthread_once().

Reported-by: azharudd
Reviewed By: glider

Differential Revision: https://reviews.llvm.org/D108305

3 years agoAvoid unused variable when NDEBUG
Frederik Gossen [Thu, 19 Aug 2021 10:51:14 +0000 (12:51 +0200)]
Avoid unused variable when NDEBUG

3 years ago[OpenCL] Fix as_type(vec3) invalid store creation
Sven van Haastregt [Thu, 19 Aug 2021 10:57:09 +0000 (11:57 +0100)]
[OpenCL] Fix as_type(vec3) invalid store creation

With -fpreserve-vec3-type enabled, a cast was not created when
converting from a vec3 type to a non-vec3 type, even though a
conversion to vec4 was performed.  This resulted in creation of
invalid store instructions.

Differential Revision: https://reviews.llvm.org/D107963

3 years ago[NewPM] Make some sanitizer passes parameterized in the PassRegistry
Bjorn Pettersson [Mon, 28 Jun 2021 09:16:40 +0000 (11:16 +0200)]
[NewPM] Make some sanitizer passes parameterized in the PassRegistry

Refactored implementation of AddressSanitizerPass and
HWAddressSanitizerPass to use pass options similar to passes like
MemorySanitizerPass. This makes sure that there is a single mapping
from class name to pass name (needed by D108298), and options like
-debug-only and -print-after make a bit more sense (even though it is
the unparameterized pass name that should be used in those
options).

A result of the above is that some pass names are removed in favor
of the parameterized versions:
- "khwasan" is now "hwasan<kernel;recover>"
- "kasan" is now "asan<kernel>"
- "kmsan" is now "msan<kernel>"

Differential Revision: https://reviews.llvm.org/D105007

3 years ago[docs] Document that psutil should be installed in non-user location
Yaron Keren [Thu, 19 Aug 2021 09:27:49 +0000 (12:27 +0300)]
[docs] Document that psutil should be installed in non-user location

Differential Revision: https://reviews.llvm.org/D108356

3 years agoUpdate {Small}BitVector size_type definition
Renato Golin [Wed, 18 Aug 2021 10:50:15 +0000 (11:50 +0100)]
Update {Small}BitVector size_type definition

SmallBitVector implements a level of indirection over BitVector by
storing a smaller bit-vector in a pointer-sized element, or in case the
number of elements exceeds the bucket size, it creates a new pointer to
a BitVector and uses that as its storage.

However, the functions returning the vector size were using `unsigned`,
which is ok for BitVector, but not for SmallBitVector, which is actually
`uintptr_t`.

This commit reuses the `size_type` definition for more than just `count`
and propagates it into range iteration, size calculation, etc.

This is a continuation of D108124.

I haven't changed all occurrences of `unsigned` or `uintptr_t` to
`size_type`, just those that were directly related.

Following directions from clang-tidy on case of variables.

Differential Revision: https://reviews.llvm.org/D108290

3 years ago[OptTable] Refine how `printHelp` treats empty help texts
Andrzej Warzynski [Thu, 5 Aug 2021 11:42:30 +0000 (11:42 +0000)]
[OptTable] Refine how `printHelp` treats empty help texts

Currently, `printHelp` behaves differently for options that:
  * do not define `HelpText` (such options _are not printed_), and
  * define their `HelpText` as `HelpText<"">` (such options _are printed_).
In practice, both approaches lead to no help text and `printHelp` should
treat them consistently. This patch addresses that by making
`printHelp` check the length of the help text to be printed.

All affected tests have been updated accordingly. The option definitions
for llvm-cvtres have been updated with a short description or "Not
  implemented" for options that are ignored by the tool.
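
A minimal sketch of the consistency rule described above, using a hypothetical `OptionInfo` record rather than the real OptTable data structures: an option is listed only when its help text is present and non-empty.

```cpp
#include <cstring>

// Hypothetical stand-in for an option table entry.
struct OptionInfo {
  const char *Name;
  const char *HelpText; // nullptr when no HelpText was defined
};

// Treat a missing help text and an empty help text the same way.
inline bool hasPrintableHelp(const OptionInfo &Opt) {
  return Opt.HelpText != nullptr && std::strlen(Opt.HelpText) > 0;
}
```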

Differential Revision: https://reviews.llvm.org/D107557

3 years ago[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64
Martin Storsjö [Fri, 23 Jul 2021 21:04:10 +0000 (00:04 +0300)]
[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64

The code is based on the same __mulh and __umulh intrinsics for
x86.

This should fix PR51128.
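
For reference, the semantics of these intrinsics can be sketched in portable-ish C++: each returns the high 64 bits of the full 128-bit product. The sketch below assumes a compiler that provides `__int128` (a GCC/Clang extension on 64-bit targets) and an arithmetic right shift for negative values; the function names are hypothetical and this is not the code added to clang.

```cpp
#include <cstdint>

// High 64 bits of the signed 64x64 -> 128-bit product.
inline int64_t mulhSketch(int64_t A, int64_t B) {
  const __int128 Product = static_cast<__int128>(A) * B;
  return static_cast<int64_t>(Product >> 64);
}

// High 64 bits of the unsigned 64x64 -> 128-bit product.
inline uint64_t umulhSketch(uint64_t A, uint64_t B) {
  const unsigned __int128 Product = static_cast<unsigned __int128>(A) * B;
  return static_cast<uint64_t>(Product >> 64);
}
```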

Differential Revision: https://reviews.llvm.org/D106721

3 years ago[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64
David Sherwood [Fri, 2 Jul 2021 10:12:16 +0000 (11:12 +0100)]
[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64

I have added a new TTI interface called enableOrderedReductions() that
controls whether or not ordered reductions should be enabled for a
given target. By default this returns false, whereas for AArch64 it
returns true and we rely upon the cost model to make sensible
vectorisation choices. It is still possible to override the new TTI
interface by setting the command line flag:

  -force-ordered-reductions=true|false

I have added a new RUN line to show that we use ordered reductions by
default for SVE and Neon:

  Transforms/LoopVectorize/AArch64/strict-fadd.ll
  Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll

Differential Revision: https://reviews.llvm.org/D106653

3 years ago[flang][driver] Add print function name Plugin example
Stuart Ellis [Thu, 19 Aug 2021 08:07:45 +0000 (08:07 +0000)]
[flang][driver] Add print function name Plugin example

Replace the Hello World example Plugin with one that counts and prints the names of
functions and subroutines.
This involves changing the `PluginParseTreeAction` Plugin base class to
inherit from `PrescanAndSemaAction` class to get access to the Parse Tree
so that the Plugin can walk it.
Additionally, there are tests of this new Plugin to check it prints the correct
things in different circumstances.

Depends on: D106137

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D107089

3 years ago[mlir][scf] Simplify affine.min ops after loop peeling
Matthias Springer [Thu, 19 Aug 2021 08:08:21 +0000 (17:08 +0900)]
[mlir][scf] Simplify affine.min ops after loop peeling

Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.

affine.min ops such as:
```
#map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #map(%iv)[%step, %ub]
```
are rewritten into (in the case of the peeled loop):
```
%r = %step
```

To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.
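
The arithmetic behind the rewrite can be checked with a small C++ sketch. It assumes the peeled main loop runs from `lb` up to `ub - (ub - lb) % step`; within that range `ub - iv` is never smaller than `step`, so the minimum is always `step`. The function name is hypothetical, and this brute-force check is only an illustration of what the patch proves symbolically with FlatAffineConstraints.

```cpp
#include <algorithm>
#include <cstdint>

// Returns true if min(Step, Ub - Iv) == Step for every induction value Iv
// visited by the peeled main loop.
inline bool minFoldsToStepInMainLoop(int64_t Lb, int64_t Ub, int64_t Step) {
  if (Step <= 0 || Ub <= Lb)
    return true; // no iterations to check
  const int64_t Split = Ub - (Ub - Lb) % Step;
  for (int64_t Iv = Lb; Iv < Split; Iv += Step) {
    // Mirrors the map (s0, -d0 + s1) with d0 = Iv, s0 = Step, s1 = Ub.
    if (std::min(Step, Ub - Iv) != Step)
      return false;
  }
  return true;
}
```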

Differential Revision: https://reviews.llvm.org/D107222

3 years ago[flang] Add POSIX implementation for SYSTEM_CLOCK
Diana Picus [Tue, 13 Jul 2021 11:37:43 +0000 (11:37 +0000)]
[flang] Add POSIX implementation for SYSTEM_CLOCK

This is very similar to CPU_TIME, except that we return nanoseconds
rather than seconds. This means we're potentially dealing with rather
large numbers, so we'll have to wrap around to avoid overflows.
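
One way such a wrap-around could look is sketched below. This is an assumption-laden illustration, not the flang runtime code: the constants, the function name, and the choice to wrap on whole seconds are all hypothetical. It expects `Seconds >= 0` and `0 <= Nanos < 1'000'000'000`.

```cpp
#include <cstdint>

constexpr int64_t NanosPerSecond = 1'000'000'000;
// Largest whole number of seconds whose nanosecond count fits in int64_t.
constexpr int64_t MaxWholeSeconds = INT64_MAX / NanosPerSecond;

// Convert a (seconds, nanoseconds) reading into a nanosecond count that
// wraps back to zero instead of overflowing int64_t.
inline int64_t wrappedNanoseconds(int64_t Seconds, int64_t Nanos) {
  // Reduce the seconds first so the multiplication below cannot overflow.
  const int64_t WrappedSeconds = Seconds % MaxWholeSeconds;
  return WrappedSeconds * NanosPerSecond + Nanos;
}
```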

Differential Revision: https://reviews.llvm.org/D105970

3 years agoSimplify setting up LLVM as bazel external repo
Christian Sigg [Wed, 18 Aug 2021 07:14:42 +0000 (09:14 +0200)]
Simplify setting up LLVM as bazel external repo

Only require one intermediate repository instead of two.
Fewer parameters in llvm_config.

Second attempt of https://reviews.llvm.org/D107714, this time also updating `third_party_build` and `deps_impl` paths.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D108274