platform/upstream/llvm.git
3 years ago[OptTable] Refine how `printHelp` treats empty help texts
Andrzej Warzynski [Thu, 5 Aug 2021 11:42:30 +0000 (11:42 +0000)]
[OptTable] Refine how `printHelp` treats empty help texts

Currently, `printHelp` behaves differently for options that:
  * do not define `HelpText` (such options _are not printed_), and
  * define their `HelpText` as `HelpText<"">` (such options _are printed_).
In practice, both approaches lead to no help text and `printHelp` should
treat them consistently. This patch addresses that by making
`printHelp` check the length of the help text to be printed.
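
As an illustration only (this is not the OptTable code), the length check can be sketched in Python. The `Option` tuple and `print_help` function are hypothetical names; `None` stands in for an option that defines no `HelpText` at all.

```python
from typing import NamedTuple, Optional


class Option(NamedTuple):
    name: str
    help_text: Optional[str]  # None models "HelpText not defined"


def print_help(options):
    """Return the help lines, skipping options with missing or empty help text."""
    lines = []
    for opt in options:
        # A single emptiness/length check treats None and "" the same way.
        if not opt.help_text:
            continue
        lines.append(f"  {opt.name:<12} {opt.help_text}")
    return lines


opts = [
    Option("--verbose", "Print extra output"),
    Option("--hidden", None),  # no HelpText
    Option("--blank", ""),     # HelpText<"">
]
for line in print_help(opts):
    print(line)
```

Only `--verbose` is listed; both of the other options are dropped for the same reason.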

All affected tests have been updated accordingly. The option definitions
for llvm-cvtres have been updated with a short description or "Not
  implemented" for options that are ignored by the tool.

Differential Revision: https://reviews.llvm.org/D107557

3 years ago[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64
Martin Storsjö [Fri, 23 Jul 2021 21:04:10 +0000 (00:04 +0300)]
[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64

The code is based on the same __mulh and __umulh intrinsics for
x86.

This should fix PR51128.

Differential Revision: https://reviews.llvm.org/D106721

3 years ago[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64
David Sherwood [Fri, 2 Jul 2021 10:12:16 +0000 (11:12 +0100)]
[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64

I have added a new TTI interface called enableOrderedReductions() that
controls whether or not ordered reductions should be enabled for a
given target. By default this returns false, whereas for AArch64 it
returns true and we rely upon the cost model to make sensible
vectorisation choices. It is still possible to override the new TTI
interface by setting the command line flag:

  -force-ordered-reductions=true|false

I have added a new RUN line to show that we use ordered reductions by
default for SVE and Neon:

  Transforms/LoopVectorize/AArch64/strict-fadd.ll
  Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll

Differential Revision: https://reviews.llvm.org/D106653

3 years ago[flang][driver] Add print function name Plugin example
Stuart Ellis [Thu, 19 Aug 2021 08:07:45 +0000 (08:07 +0000)]
[flang][driver] Add print function name Plugin example

Replacing Hello World example Plugin with one that counts and prints the names of
functions and subroutines.
This involves changing the `PluginParseTreeAction` Plugin base class to
inherit from `PrescanAndSemaAction` class to get access to the Parse Tree
so that the Plugin can walk it.
Additionally, there are tests of this new Plugin to check it prints the correct
things in different circumstances.

Depends on: D106137

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D107089

3 years ago[mlir][scf] Simplify affine.min ops after loop peeling
Matthias Springer [Thu, 19 Aug 2021 08:08:21 +0000 (17:08 +0900)]
[mlir][scf] Simplify affine.min ops after loop peeling

Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.

affine.min ops such as:
```
#map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #map(%iv)[%step, %ub]
```
are rewritten into (in the case of the peeled loop):
```
%r = %step
```

To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.

Differential Revision: https://reviews.llvm.org/D107222

3 years ago[flang] Add POSIX implementation for SYSTEM_CLOCK
Diana Picus [Tue, 13 Jul 2021 11:37:43 +0000 (11:37 +0000)]
[flang] Add POSIX implementation for SYSTEM_CLOCK

This is very similar to CPU_TIME, except that we return nanoseconds
rather than seconds. This means we're potentially dealing with rather
large numbers, so we'll have to wrap around to avoid overflows.
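
A minimal sketch of the wrap-around idea, assuming a 64-bit signed count and a rate of one billion ticks per second. The names `COUNT_MAX`, `COUNT_RATE` and `system_clock_count` are made up for this example and are not taken from the runtime. Python integers do not overflow, so the sketch only shows the arithmetic, not how the C++ code avoids overflow while computing it.

```python
COUNT_RATE = 1_000_000_000   # assumed: nanosecond ticks per second
COUNT_MAX = 2**63 - 1        # assumed: largest value of a 64-bit signed count


def system_clock_count(seconds, nanoseconds, count_max=COUNT_MAX):
    """Combine a (seconds, nanoseconds) timestamp into a count in [0, count_max]."""
    total = seconds * COUNT_RATE + nanoseconds
    # Wrap around instead of exceeding count_max.
    return total % (count_max + 1)


print(system_clock_count(1, 5))
# A deliberately tiny count_max makes the wrap-around visible.
print(system_clock_count(0, 250, count_max=99))
```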

Differential Revision: https://reviews.llvm.org/D105970

3 years agoSimplify setting up LLVM as bazel external repo
Christian Sigg [Wed, 18 Aug 2021 07:14:42 +0000 (09:14 +0200)]
Simplify setting up LLVM as bazel external repo

Only require one intermediate repository instead of two.
Fewer parameters in llvm_config.

Second attempt of https://reviews.llvm.org/D107714, this time also updating `third_party_build` and `deps_impl` paths.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D108274

3 years ago[MLIR] [Python] Add `owner` to `mlir.ir.Block`
John Demme [Thu, 19 Aug 2021 07:02:09 +0000 (00:02 -0700)]
[MLIR] [Python] Add `owner` to `mlir.ir.Block`

Provides a way for python users to access the owning Operation from a Block.

3 years ago[mlir][linalg] Set result types in all builders.
Tobias Gysi [Thu, 19 Aug 2021 06:17:41 +0000 (06:17 +0000)]
[mlir][linalg] Set result types in all builders.

Add code to set the result types in all yaml op builders.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D108273

3 years ago[CSSPGO] Track and use context-sensitive post-optimization function size to drive...
Wenlei He [Tue, 17 Aug 2021 01:29:07 +0000 (18:29 -0700)]
[CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen

This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions.

To do this, BinarySizeContextTracker is introduced to track function byte size under different inline contexts during disassembling. In the preinliner, we can now query context byte size under the switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep the size of functions under different contexts (callee as parent, caller as child), and it can give the best/longest possible matching context size for a given input context.
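
The reverse trie lookup can be sketched roughly as follows. This is an illustrative Python model, not the llvm-profgen implementation; `ContextTrieNode`, `SizeContextTracker`, `add_size` and `lookup` are hypothetical names, and a context is written callee first, then its callers outward.

```python
class ContextTrieNode:
    def __init__(self):
        self.children = {}  # caller name -> node
        self.size = None    # byte size recorded for this exact context, if any


class SizeContextTracker:
    def __init__(self):
        self.root = ContextTrieNode()

    def add_size(self, context, size):
        """Accumulate `size` bytes for `context` (callee first, callers after)."""
        node = self.root
        for frame in context:
            node = node.children.setdefault(frame, ContextTrieNode())
        node.size = (node.size or 0) + size

    def lookup(self, context):
        """Return the size of the longest recorded prefix of `context`, or None."""
        node = self.root
        best = None
        for frame in context:
            node = node.children.get(frame)
            if node is None:
                break
            if node.size is not None:
                best = node.size
        return best


tracker = SizeContextTracker()
tracker.add_size(["foo"], 100)         # foo's size with no known inline context
tracker.add_size(["foo", "bar"], 40)   # foo's size when inlined into bar
print(tracker.lookup(["foo", "bar", "main"]))
print(tracker.lookup(["foo", "baz"]))
```

The first query falls back from `foo <- bar <- main` to the recorded `foo <- bar` entry; the second falls back to the context-free size of `foo`.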

The new size cost is off by default. There are a few TODOs that need to be addressed: 1) avoid dangling string from `Offset2LocStackMap`, which will be addressed in split context work; 2) using inlinee's entry probe to make sure we have correct zero size for inlinee that's completely optimized away after inlining. Some tuning is also needed.

Differential Revision: https://reviews.llvm.org/D108180

3 years agoRevert "[HIP] Allow target addr space in target builtins"
Anshil Gandhi [Thu, 19 Aug 2021 03:37:53 +0000 (21:37 -0600)]
Revert "[HIP] Allow target addr space in target builtins"

This reverts commit a35008955fa606487f79a050f5cc80fc7ee84dda.

3 years ago[examples] Fix Kaleidoscope for Windows
Lang Hames [Thu, 19 Aug 2021 03:17:35 +0000 (13:17 +1000)]
[examples] Fix Kaleidoscope for Windows

This fixes "Resolving symbol with incorrect flags" errors when running the
Kaleidoscope tutorials on Windows.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108348

3 years ago[WebAssembly] Avoid unused function imports in PIC mode
Sam Clegg [Thu, 19 Aug 2021 02:20:55 +0000 (22:20 -0400)]
[WebAssembly] Avoid unused function imports in PIC mode

In PIC mode we import function address via `GOT.mem` imports but for
direct function calls we still import the first class function.
However, if the function is never directly called we can avoid the first
class import completely.

Differential Revision: https://reviews.llvm.org/D108345

3 years ago[JITLink] Optimize GOTPCRELX Relocations
luxufan [Thu, 19 Aug 2021 02:13:40 +0000 (10:13 +0800)]
[JITLink] Optimize GOTPCRELX Relocations

This patch optimizes the GOTPCRELX relocations, as described in chapter B.2 of the x86-64 psABI. Not all optimizations from that chapter are implemented:

1. Converting call and jmp has been implemented.
2. Converting mov has been implemented, except for the optimization that converts the memory operand of `mov` into an immediate operand when the symbol is defined in the lower 32-bit address space.
3. Converting test and binop has not been implemented.

The new test file named ELF_got_plt_optimizations.s has been added, and I moved some test cases about optimization of got/plt from ELF_x86_64_small_pic_relocations.s to the new test file.

Following lld, the `Convert call and jmp` optimization is not the same as what the psABI describes; I have explained the difference in a comment.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108280

3 years ago[mlir][linalg] Canonicalize dim ops of tiled_loop block args
Matthias Springer [Thu, 19 Aug 2021 02:23:36 +0000 (11:23 +0900)]
[mlir][linalg] Canonicalize dim ops of tiled_loop block args

E.g.:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %x, %c0 : tensor<...>
}
```

is rewritten to:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %y, %c0 : tensor<...>
}
```

Differential Revision: https://reviews.llvm.org/D108272

3 years ago[ORC] Handle void and no-argument async wrapper calls.
Lang Hames [Thu, 19 Aug 2021 02:19:36 +0000 (12:19 +1000)]
[ORC] Handle void and no-argument async wrapper calls.

3 years ago[WebAssembly][lld] Convert signature-mismatch.ll test to asm. NFC
Sam Clegg [Thu, 19 Aug 2021 00:30:58 +0000 (20:30 -0400)]
[WebAssembly][lld] Convert signature-mismatch.ll test to asm. NFC

Differential Revision: https://reviews.llvm.org/D108346

3 years ago[sanitizer] Use TMPDIR in Android test
Vitaly Buka [Thu, 19 Aug 2021 02:02:02 +0000 (19:02 -0700)]
[sanitizer] Use TMPDIR in Android test

TMPDIR was added a long time ago, so there is no need to use EXTERNAL_STORAGE.

3 years ago[mlir][linalg] Remove ConstraintsSet class
Matthias Springer [Thu, 19 Aug 2021 01:47:17 +0000 (10:47 +0900)]
[mlir][linalg] Remove ConstraintsSet class

The same functionality can be implemented with FlatAffineValueConstraints.

Differential Revision: https://reviews.llvm.org/D108179

3 years ago[gn build] Port 5fdaaf7fd8f3
LLVM GN Syncbot [Thu, 19 Aug 2021 01:52:47 +0000 (01:52 +0000)]
[gn build] Port 5fdaaf7fd8f3

3 years agoStackLifetime: Remove asserts for multiple lifetime intrinsics.
Peter Collingbourne [Wed, 18 Aug 2021 22:03:03 +0000 (15:03 -0700)]
StackLifetime: Remove asserts for multiple lifetime intrinsics.

According to the langref, it is valid to have multiple consecutive
lifetime start or end intrinsics on the same object.

For llvm.lifetime.start:
"If ptr [...] is a stack object that is already alive, it simply
fills all bytes of the object with poison."

For llvm.lifetime.end:
"Calling llvm.lifetime.end on an already dead alloca is no-op."

However, we currently fail an assertion in such cases. I've observed
the assertion failure when the loop vectorization pass duplicates
the intrinsic.

We can conservatively handle these intrinsics by ignoring all but
the first one, which can be implemented by removing the assertions.
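
A simplified model of "ignore all but the first one", assuming a single alloca and straight-line code (the real analysis works over the CFG). The function name `live_ranges` and the string markers are illustrative only.

```python
def live_ranges(markers):
    """Compute (begin, end) index pairs from 'start'/'end' markers for one object.

    Redundant markers are ignored: a 'start' on an object that is already
    alive and an 'end' on one that is already dead change nothing.
    """
    ranges = []
    begin = None
    for index, marker in enumerate(markers):
        if marker == "start":
            if begin is None:      # only the first consecutive start counts
                begin = index
        elif marker == "end":
            if begin is not None:  # only the first consecutive end counts
                ranges.append((begin, index))
                begin = None
    return ranges


print(live_ranges(["start", "start", "end", "end"]))
print(live_ranges(["start", "end", "start", "end"]))
```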

Differential Revision: https://reviews.llvm.org/D108337

3 years ago[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
Rong Xu [Wed, 18 Aug 2021 23:59:02 +0000 (16:59 -0700)]
[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader

This patch implements Flow Sensitive Sample FDO (FSAFDO) profile
loader. We have two profile loaders for FS profile,
one before RegAlloc and one before BlockPlacement.

To enable it, when -fprofile-sample-use=<profile> is specified,
add "-enable-fs-discriminator=true \
     -disable-ra-fsprofile-loader=false \
     -disable-layout-fsprofile-loader=false"
to turn on the FS profile loaders.

Differential Revision: https://reviews.llvm.org/D107878

3 years ago[mlir][Analysis][NFC] FlatAffineConstraints: Use BoundType enum in functions
Matthias Springer [Thu, 19 Aug 2021 00:53:39 +0000 (09:53 +0900)]
[mlir][Analysis][NFC] FlatAffineConstraints: Use BoundType enum in functions

Differential Revision: https://reviews.llvm.org/D108185

3 years ago[scudo] Don't build SCUDO for Android
Vitaly Buka [Thu, 19 Aug 2021 01:22:28 +0000 (18:22 -0700)]
[scudo] Don't build SCUDO for Android

Android 11 uses scudo_standalone as the default
allocator, making it difficult to test legacy scudo.

3 years ago[openmp] Annotate tmp variables with omp_thread_mem_alloc
Jon Chesterfield [Thu, 19 Aug 2021 01:22:10 +0000 (02:22 +0100)]
[openmp] Annotate tmp variables with omp_thread_mem_alloc

Fixes miscompile of calls into ocml. Bug 51445.

The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.

This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107971

3 years ago[NFC][DebugInfo] getDwarfCompileUnitID
Kyungwoo Lee [Thu, 19 Aug 2021 00:14:46 +0000 (17:14 -0700)]
[NFC][DebugInfo] getDwarfCompileUnitID

This is a refactoring for the use in https://reviews.llvm.org/D108261

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D108271

3 years ago[libomptarget] Apply D106710 to amdgcn devicertl
Jon Chesterfield [Thu, 19 Aug 2021 00:34:33 +0000 (01:34 +0100)]
[libomptarget] Apply D106710 to amdgcn devicertl

3 years ago[mlir][sparse] use shared util for DimOp generation
Aart Bik [Wed, 18 Aug 2021 17:39:14 +0000 (10:39 -0700)]
[mlir][sparse] use shared util for DimOp generation

This shares more code with existing utilities. Also, to be consistent,
we moved dimension permutation on the DimOp to the tensor lowering phase.
This way, both pre-existing DimOps on sparse tensors (not likely but
possible) as well as compiler generated DimOps are handled consistently.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108309

3 years ago[libomptarget][nfc][devicertl] Delete unused enums
Jon Chesterfield [Wed, 18 Aug 2021 23:12:33 +0000 (00:12 +0100)]
[libomptarget][nfc][devicertl] Delete unused enums

3 years ago[NFC][libcxxabi] Run clang-format on libcxxabi/src/cxa_guard_impl.h
Daniel McIntosh [Wed, 4 Aug 2021 17:37:57 +0000 (13:37 -0400)]
[NFC][libcxxabi] Run clang-format on libcxxabi/src/cxa_guard_impl.h

I'm about to submit a change which involves re-writing most of
cxa_guard_impl.h. Running clang-format on the whole file first seems like a
good idea.

Reviewed By: ldionne, #libc_abi

Differential Revision: https://reviews.llvm.org/D108231

3 years ago[mlir] Fix typo in SuperVectorizer
Diego Caballero [Wed, 18 Aug 2021 22:21:23 +0000 (22:21 +0000)]
[mlir] Fix typo in SuperVectorizer

NFC.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D108334

3 years ago[LLDB][GUI] Add Process Launch form
Omar Emara [Wed, 18 Aug 2021 18:59:57 +0000 (11:59 -0700)]
[LLDB][GUI] Add Process Launch form

This patch adds a process launch form. Additionally, a LazyBoolean field
was implemented and numerous utility methods were added to various
fields to get the launch form working.

Differential Revision: https://reviews.llvm.org/D107869

3 years ago[clang-format] Improve detection of parameter declarations in K&R C
owenca [Sun, 15 Aug 2021 21:03:17 +0000 (14:03 -0700)]
[clang-format] Improve detection of parameter declarations in K&R C

Clean up the detection of parameter declarations in K&R C function
definitions. Also make it more precise by requiring the second
token after the r_paren to be either a star or keyword/identifier.

Differential Revision: https://reviews.llvm.org/D108094

3 years ago[LLDB][GUI] Fix text field incorrect key handling
Omar Emara [Wed, 18 Aug 2021 22:06:05 +0000 (15:06 -0700)]
[LLDB][GUI] Fix text field incorrect key handling

The isprint libc function was used to determine if the key code
represents a printable character. The problem is that the specification
leaves the behavior undefined if the key is not representable as an
unsigned char, which is the case for many ncurses keys. This patch adds
an explicit check for this undefined behavior and makes it consistent.

The llvm::isPrint function didn't work correctly for some reason, most
likely because it takes a char instead of an int, which I guess makes it
unsuitable for checking ncurses key codes.
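
The shape of the explicit check can be sketched as below. This is a Python illustration of the idea, not the LLDB code: `is_printable_key` is a hypothetical helper, the printable range assumes the "C" locale, and `KEY_DOWN = 0o402` is the value ncurses typically assigns to that key.

```python
KEY_DOWN = 0o402  # assumed: typical ncurses value (258), not an unsigned char


def is_printable_key(key):
    """Return True only for key codes that are printable ASCII characters."""
    # Reject anything not representable as an unsigned char first, so the
    # character classification below is never asked about such values.
    if key < 0 or key > 255:
        return False
    # Printable range in the "C" locale: space (32) through tilde (126).
    return 32 <= key <= 126


print(is_printable_key(ord("a")))
print(is_printable_key(KEY_DOWN))
```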

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D108327

3 years ago[gn build] Port d8bbfe8a4897
LLVM GN Syncbot [Wed, 18 Aug 2021 21:58:30 +0000 (21:58 +0000)]
[gn build] Port d8bbfe8a4897

3 years ago[DWARF] Expose raw bytes in DWARFExpression
Rafael Auler [Thu, 5 Aug 2021 00:15:29 +0000 (17:15 -0700)]
[DWARF] Expose raw bytes in DWARFExpression

This information is necessary for clients of DebugInfo that
do not want to process a DWARF expression, but just treat it as a blob
of data. In BOLT, for example, we need to read these expressions in
CFIs and write them back to the binary, unchanged, so having access to
the original expression encoding is a shortcut to avoid the need to
re-encode the entire expression when re-writing exception handling
info (CFIs).

This patch is an alternative to https://reviews.llvm.org/D98301, in
which we implement the support to re-encode these expressions. But
since we don't really need to change anything in these expressions,
we can just copy their bytes.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D107515

3 years agoEnables inferring return types for Shape op if possible
Chia-hung Duan [Wed, 18 Aug 2021 20:46:26 +0000 (20:46 +0000)]
Enables inferring return types for Shape op if possible

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D102565

3 years ago[AArch64][GlobalISel] Don't allow s128 for G_ISNAN
Jessica Paquette [Wed, 18 Aug 2021 20:57:42 +0000 (13:57 -0700)]
[AArch64][GlobalISel] Don't allow s128 for G_ISNAN

getAPFloatFromSize doesn't support s128, so we can't lower this without
asserting right now.

To fix the buildbots, don't allow any scalars other than s16, s32, and s64.

3 years agogn build: Build libclang.so and libLTO.so on ELF platforms.
Peter Collingbourne [Mon, 23 Nov 2020 19:45:06 +0000 (11:45 -0800)]
gn build: Build libclang.so and libLTO.so on ELF platforms.

This requires changing the ELF build to enable -fPIC, consistent
with other platforms.

Differential Revision: https://reviews.llvm.org/D108223

3 years ago[AArch64][GlobalISel] Mark G_FMINNUM/G_FMAXNUM as floating point opcodes
Jessica Paquette [Wed, 18 Aug 2021 00:40:23 +0000 (17:40 -0700)]
[AArch64][GlobalISel] Mark G_FMINNUM/G_FMAXNUM as floating point opcodes

We need to ensure that these end up on FPR to allow imported patterns to
select them.

This will also ensure that we get good regbank selection when dealing with
instructions like G_PHI/G_LOAD/G_STORE which deduce their banks from their
uses/users.

Differential Revision: https://reviews.llvm.org/D108260

3 years ago[AArch64][GlobalISel] Legalize scalar G_FMINNUM + G_FMAXNUM
Jessica Paquette [Wed, 18 Aug 2021 00:26:48 +0000 (17:26 -0700)]
[AArch64][GlobalISel] Legalize scalar G_FMINNUM + G_FMAXNUM

For subtargets with full FP16, this is legal for s16, s32, and s64. Without
full FP16, it's legal for s32 and s64.

For s128, this is a libcall.

We also support some vector types, but for now, let's just support scalars.

Differential Revision: https://reviews.llvm.org/D108259

3 years ago[libomptarget][devicertl] Replace lanemask with uint64 at interface
Jon Chesterfield [Wed, 18 Aug 2021 19:47:33 +0000 (20:47 +0100)]
[libomptarget][devicertl] Replace lanemask with uint64 at interface

Use uint64_t for lanemask on all GPU architectures at the interface
with clang. Updates tests. The deviceRTL is always linked as IR so the zext
and trunc introduced for wave32 architectures will fold after inlining.

Simplification partly motivated by amdgpu gfx10 which will be wave32 and
is awkward to express in the current arch-dependent typedef interface.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108317

3 years ago[AggressiveInstCombine] Add logical shift right instr to `TruncInstCombine` DAG
Anton Afanasyev [Tue, 17 Aug 2021 10:49:53 +0000 (13:49 +0300)]
[AggressiveInstCombine] Add logical shift right instr to `TruncInstCombine` DAG

Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB
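
The two conditions can also be checked by brute force for a small width. The sketch below is a Python stand-in for the Alive2 proofs, using an 8-bit target type; `trunc` and `narrowing_is_safe` are names chosen for this example.

```python
def trunc(value, bits):
    """Keep only the low `bits` bits of value."""
    return value & ((1 << bits) - 1)


def narrowing_is_safe(x, shift, dst_bits):
    """Compare trunc(lshr(x, shift)) with lshr(trunc(x), shift)."""
    wide = trunc(x >> shift, dst_bits)
    narrow = trunc(x, dst_bits) >> shift
    return wide == narrow


DST_BITS = 8
# Every x that fits in DST_BITS has its to-be-truncated bits equal to zero,
# and every shift amount is below the target bitwidth.
all_ok = all(
    narrowing_is_safe(x, shift, DST_BITS)
    for x in range(1 << DST_BITS)
    for shift in range(DST_BITS)
)
print(all_ok)

# With a set bit above the target width, the narrowed form loses information.
print(narrowing_is_safe(0x100, 1, DST_BITS))
```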

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201

3 years ago[Test][AggressiveInstCombine] Add one more test for shifts
Anton Afanasyev [Wed, 18 Aug 2021 14:23:09 +0000 (17:23 +0300)]
[Test][AggressiveInstCombine] Add one more test for shifts

3 years ago[mlir][tosa] Fix clamp to restrict only within valid bitwidth range
Robert Suderman [Wed, 18 Aug 2021 18:55:54 +0000 (11:55 -0700)]
[mlir][tosa] Fix clamp to restrict only within valid bitwidth range

It's possible for the clamp to have invalid min/max values on its range. To fix
this we validate the range of the min/max and clamp to a valid range.
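
A rough sketch of clamping the bounds themselves, assuming a signed integer element type. `clamp_bounds` is a hypothetical helper and not the TOSA lowering code.

```python
def clamp_bounds(min_val, max_val, bitwidth):
    """Restrict clamp bounds to what a signed `bitwidth`-bit integer can hold."""
    lowest = -(1 << (bitwidth - 1))
    highest = (1 << (bitwidth - 1)) - 1
    new_min = max(lowest, min(min_val, highest))
    new_max = max(lowest, min(max_val, highest))
    return new_min, new_max


print(clamp_bounds(-1000, 1000, 8))  # bounds too wide for i8
print(clamp_bounds(-5, 5, 8))        # bounds already valid
```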

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108256

3 years ago[Polly] Introduce caching for the isErrorBlock function. NFC.
Michael Kruse [Wed, 18 Aug 2021 18:36:17 +0000 (13:36 -0500)]
[Polly] Introduce caching for the isErrorBlock function. NFC.

Compilation of the file insn-attrtab.c of the SPEC CPU 2017 502.gcc_r
benchmark takes excessive time (> 30min) with Polly enabled. Most time
is spent in the isErrorBlock function querying the DominatorTree.
The isErrorBlock is invoked redundantly over the course of ScopDetection
and ScopBuilder. This patch introduces a caching mechanism for its
result.
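
The caching mechanism amounts to memoizing the query per block. A minimal Python sketch, assuming the expensive check is a pure function of the block; `ScopDetectionSketch` and its members are illustrative names only.

```python
class ScopDetectionSketch:
    """Owns the cache, mirroring the move of the query into the analysis object."""

    def __init__(self, expensive_check):
        self._check = expensive_check
        self._cache = {}   # block -> cached result
        self.misses = 0    # how often the expensive check actually ran

    def is_error_block(self, block):
        if block not in self._cache:
            self.misses += 1
            self._cache[block] = self._check(block)
        return self._cache[block]


detection = ScopDetectionSketch(lambda block: block.startswith("err"))
for _ in range(1000):
    detection.is_error_block("err.bb")
    detection.is_error_block("loop.body")
print(detection.misses)
```

The lookup mutates the cache, which is why callers of such a query can no longer be `const` in the C++ code.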

Instead of a free function, isErrorBlock is moved to ScopDetection where
its cache map resides. This also means that many functions directly or
indirectly calling isErrorBlock are not "const" anymore. The
DetectionContextMap was marked as "mutable", but IMHO it never should
have been since it stores the detection result.

502.gcc_r only takes excessive time with the new pass manager. The
reason seems to be that it invalidates the ScopDetection analysis more
often than the legacy pass manager, for unknown reasons.

3 years agoReapply: [NFC] factor out unrolling decision logic
Ali Sedaghati [Wed, 18 Aug 2021 18:57:56 +0000 (11:57 -0700)]
Reapply: [NFC] factor out unrolling decision logic

reverting ffd8a268bdc518f87e9ba7524aba0458f4b9979c (reapplying
4d559837e887c278d7c27274f4f6b1b78b97c00d) - removed spurious inclusion
of <optional>

Differential Revision: https://reviews.llvm.org/D106001

3 years ago[X86][NFC] Pre-commit tests for PR51494
Andrea Di Biagio [Wed, 18 Aug 2021 18:40:35 +0000 (19:40 +0100)]
[X86][NFC] Pre-commit tests for PR51494

3 years ago[PowerPC] Regenerate 2007-09-08-unaligned.ll test checks
Simon Pilgrim [Wed, 18 Aug 2021 18:53:57 +0000 (19:53 +0100)]
[PowerPC] Regenerate 2007-09-08-unaligned.ll test checks

3 years ago[tsan] Disable all Trace unit tests on Mac
Azharuddin Mohammed [Wed, 18 Aug 2021 18:41:09 +0000 (11:41 -0700)]
[tsan] Disable all Trace unit tests on Mac

In an earlier commit (7338be0e6e8d), only the MemoryAccessSize unit test
was disabled whereas the other tests which are also failing were not.

3 years agoRevert "[NFC] factor out unrolling decision logic"
Geoffrey Martin-Noble [Wed, 18 Aug 2021 18:36:25 +0000 (11:36 -0700)]
Revert "[NFC] factor out unrolling decision logic"

This patch added a requirement for C++17, while LLVM is supposed to
build with C++14
(https://llvm.org/docs/CodingStandards.html#c-standard-versions). Posted
a note to the original review thread (https://reviews.llvm.org/D106001).

This reverts commit 4d559837e887c278d7c27274f4f6b1b78b97c00d.

Differential Revision: https://reviews.llvm.org/D108314

3 years ago[AMDGPU] Fix atomic float max/min intrinsics
Joe Nash [Thu, 12 Aug 2021 19:00:19 +0000 (15:00 -0400)]
[AMDGPU] Fix atomic float max/min intrinsics

Hooked up raw.buffer.atomic.fmin/max.f64
This instruction should be available on GFX6, GFX7, and GFX10.
It was implemented for GFX90a with a different name.

Added intrinsic def for image_atomic_fmin/fmax; the instruction
defs were already there.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D108208

Change-Id: I473f98d28b2afbeeb2c27822d9686b5e86634e2f

3 years ago[hwasan] Don't report short-granule shadow as overwritten.
Mitch Phillips [Wed, 18 Aug 2021 16:36:48 +0000 (09:36 -0700)]
[hwasan] Don't report short-granule shadow as overwritten.

The shadow for a short granule is stored in the last byte of the
granule. Currently, if there's a tail-overwrite report (a
buffer-overflow-write in uninstrumented code), we report the shadow byte
as a mismatch against the magic.

Fix this bug by slapping the shadow into the expected value. This also
makes sure that if the uninstrumented WRITE does clobber the shadow
byte, it reports the shadow was actually clobbered as well.

Reviewed By: eugenis, fmayer

Differential Revision: https://reviews.llvm.org/D107938

3 years ago[LICM] Remove AST-based implementation
Nikita Popov [Tue, 17 Aug 2021 19:31:35 +0000 (21:31 +0200)]
[LICM] Remove AST-based implementation

MSSA-based LICM has been enabled by default for a few years now.
This drops the old AST-based implementation. Using loop(licm) will
result in a fatal error, the use of loop-mssa(licm) is required
(or just licm, which defaults to loop-mssa).

Note that the core canSinkOrHoistInst() logic has to retain AST
support for now, because it is shared with LoopSink.

Differential Revision: https://reviews.llvm.org/D108244

3 years ago[NFC] factor out unrolling decision logic
Ali Sedaghati [Wed, 18 Aug 2021 18:09:04 +0000 (11:09 -0700)]
[NFC] factor out unrolling decision logic

Decoupling the unrolling logic into three different functions. The shouldPragmaUnroll() covers the 1st and 2nd priorities of the previous code, the shouldFullUnroll() covers the 3rd, and the shouldPartialUnroll() covers the 5th. The output of each function, Optional<unsigned>, could be a value for UP.Count, which means unrolling factor has been set, or None, which means decision hasn't been made yet and should try the next priority.
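
The "try each priority until one decides" structure can be sketched as follows. The heuristics inside each function are invented placeholders for illustration; only the chaining through an optional result reflects the description above. The dictionary keys (`pragma_count`, `trip_count`, `body_size`, `threshold`) are hypothetical.

```python
def should_pragma_unroll(loop):
    # Placeholder: an explicit pragma count wins if present.
    return loop.get("pragma_count")


def should_full_unroll(loop):
    # Placeholder: fully unroll when the unrolled body fits the threshold.
    trip = loop.get("trip_count")
    if trip is not None and trip * loop["body_size"] <= loop["threshold"]:
        return trip
    return None


def should_partial_unroll(loop):
    # Placeholder: largest factor under the threshold that divides the trip count.
    trip = loop.get("trip_count")
    if trip is None:
        return None
    count = loop["threshold"] // loop["body_size"]
    while count > 1 and trip % count != 0:
        count -= 1
    return count if count > 1 else None


def compute_unroll_count(loop):
    """Return the first decision made, or None if no priority decided."""
    for decide in (should_pragma_unroll, should_full_unroll, should_partial_unroll):
        count = decide(loop)
        if count is not None:
            return count
    return None


print(compute_unroll_count({"pragma_count": 4, "trip_count": 8, "body_size": 10, "threshold": 100}))
print(compute_unroll_count({"trip_count": 12, "body_size": 10, "threshold": 50}))
```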

Reviewed By: mtrofin, jdoerfert

Differential Revision: https://reviews.llvm.org/D106001

3 years ago[Bazel] Don't set HAVE_[DE]REGISTER_FRAME on Windows
Geoffrey Martin-Noble [Wed, 18 Aug 2021 18:19:31 +0000 (11:19 -0700)]
[Bazel] Don't set HAVE_[DE]REGISTER_FRAME on Windows

This is also done based on OS in the GN build
(https://github.com/llvm/llvm-project/blob/24b0df8686/llvm/utils/gn/secondary/llvm/include/llvm/Config/BUILD.gn#L193-L203).
Of course the right way would be to set up platform detection, but that
remains TODO.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D107375

3 years ago[NFC] A couple more removeAttribute() cleanups
Arthur Eubanks [Wed, 18 Aug 2021 18:11:02 +0000 (11:11 -0700)]
[NFC] A couple more removeAttribute() cleanups

3 years ago[NFC] Remove some unnecessary AttributeList methods
Arthur Eubanks [Wed, 18 Aug 2021 17:43:17 +0000 (10:43 -0700)]
[NFC] Remove some unnecessary AttributeList methods

These rely on methods I'm trying to cleanup.

3 years ago[hwasan] Flag stack safety check as requiring aarch64
Christopher Tetreault [Tue, 17 Aug 2021 21:16:14 +0000 (14:16 -0700)]
[hwasan] Flag stack safety check as requiring aarch64

Reviewed By: fmayer

Differential Revision: https://reviews.llvm.org/D108241

3 years ago[RISCV] Remove sext_inreg+add/sub/mul/shl isel patterns.
Craig Topper [Wed, 18 Aug 2021 17:46:09 +0000 (10:46 -0700)]
[RISCV] Remove sext_inreg+add/sub/mul/shl isel patterns.

Let the sext_inreg be selected to sext.w. Remove unneeded sext.w
during PostProcessISelDAG.

This gives opportunities for some other isel patterns to match
like the ADDIPair or matching mul with immediate to shXadd.

This becomes possible after D107658 started selecting W instructions
based on users. The sext.w will be considered a W user so isel
will often select a W instruction for the sext.w input and we can
just remove the sext.w. Otherwise we can combine the sext.w with
a ADD/SUB/MUL/SLLI to create a new W instruction in parallel
to the the original instruction.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107708

3 years ago[GlobalISel] Implement lowering for G_ISNAN + use it in AArch64
Jessica Paquette [Tue, 17 Aug 2021 17:39:18 +0000 (10:39 -0700)]
[GlobalISel] Implement lowering for G_ISNAN + use it in AArch64

GlobalISel equivalent to `TargetLowering::expandISNAN`.

Use it in AArch64 and add a testcase.

Differential Revision: https://reviews.llvm.org/D108227

3 years ago[NFC][loop-idiom] Rename Stores to IgnoredInsts; Fix a typo
Han Zhu [Wed, 18 Aug 2021 06:39:29 +0000 (23:39 -0700)]
[NFC][loop-idiom] Rename Stores to IgnoredInsts; Fix a typo

When dealing with memmove, we also add the load instruction to the ignored
instructions list passed to `mayLoopAccessLocation`. Renaming "Stores" to
"IgnoredInsts" to be more precise.

Differential Revision: https://reviews.llvm.org/D108275

3 years ago[GlobalISel] Add IRTranslator support for G_ISNAN
Jessica Paquette [Tue, 17 Aug 2021 16:49:44 +0000 (09:49 -0700)]
[GlobalISel] Add IRTranslator support for G_ISNAN

Translate the `@llvm.isnan` intrinsic to G_ISNAN when we see it.

This is pretty much the same as the associated SelectionDAGBuilder code. Main
difference is that we don't expand it here. It makes more sense to do that
during legalization in GlobalISel. GlobalISel will just legalize the generated
illegal types.

Differential Revision: https://reviews.llvm.org/D108226

3 years ago[InstrProfiling] Support relative CountersPtr for PlatformOther
Jinsong Ji [Wed, 18 Aug 2021 17:20:18 +0000 (17:20 +0000)]
[InstrProfiling] Support relative CountersPtr for PlatformOther

D104556 changed the CountersPtr to be relative; however, it did not
update the pointer initialization in __llvm_profile_register_function,
so platforms (e.g. AIX) that use __llvm_profile_register_function are now
totally broken: any PGO code will SEGV.

This patch updates the code to reflect that the Data->CountersPtr is now
relative.

Reviewed By: MaskRay, davidxl

Differential Revision: https://reviews.llvm.org/D108304

3 years ago[RISCV] Insert sext_inreg when type legalizing add/sub/mul with constant LHS.
Craig Topper [Wed, 18 Aug 2021 17:37:00 +0000 (10:37 -0700)]
[RISCV] Insert sext_inreg when type legalizing add/sub/mul with constant LHS.

We already do this for a non-constant RHS. This just removes the
special case. I believe the special case may have been needed
because the ANY_EXTEND of a constant used to create zero extended
constants, but we recently changed that to produce sign extended
constants.

D107658 is needed to prevent some regressions.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107697

3 years ago[GlobalISel] Add G_ISNAN
Jessica Paquette [Tue, 17 Aug 2021 16:45:23 +0000 (09:45 -0700)]
[GlobalISel] Add G_ISNAN

Add a generic opcode equivalent to the `llvm.isnan` intrinsic +
MachineVerifier support for it.

We need an opcode here because we may want target-specific lowering later on.

Differential Revision: https://reviews.llvm.org/D108222

3 years ago[Polly] Break early when the result is known. NFC.
Michael Kruse [Wed, 18 Aug 2021 17:22:21 +0000 (12:22 -0500)]
[Polly] Break early when the result is known. NFC.

3 years ago[RISCV] Improve constant materialization for stores of i16 or i32 negative constants.
Craig Topper [Wed, 18 Aug 2021 17:23:59 +0000 (10:23 -0700)]
[RISCV] Improve constant materialization for stores of i16 or i32 negative constants.

DAGCombiner::visitStore can clear the upper bits of constants
used by stores. This prevents them from being recognized as
sign extended negative values, making them more expensive to
materialize.

This patch uses the hasAllNBitUsers method from D107658 to make
a negative constant if none of the users care about the upper bits.
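
As an illustration only (this is not code from the patch), the sketch below
models why the sign extended form is cheaper. It assumes the usual RV64 rule
that a value fitting a signed 12-bit immediate can be materialized with a
single ADDI; the helper names fits_simm12 and sign_extend16 are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: true if v fits a signed 12-bit immediate,
// i.e. the range a single RISC-V ADDI can materialize.
bool fits_simm12(int64_t v) { return v >= -2048 && v <= 2047; }

// Hypothetical helper: reinterpret the low 16 bits of v as a signed
// value. The narrowing conversion to int16_t is modular in C++20 and
// behaves the same way on mainstream compilers in earlier modes.
int64_t sign_extend16(int64_t v) {
  return static_cast<int64_t>(static_cast<int16_t>(static_cast<uint16_t>(v)));
}

// The bits an i16 store actually writes to memory.
uint16_t stored_i16(int64_t v) { return static_cast<uint16_t>(v); }
```

For an i16 store of -1, the constant with cleared upper bits is 0xFFFF, which
does not fit a 12-bit immediate, while sign_extend16(0xFFFF) is -1, which
does. Both forms write the same 16 bits, so the swap is safe when no user
reads the upper bits.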

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D108052

3 years ago[RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are...
Craig Topper [Wed, 18 Aug 2021 16:40:57 +0000 (09:40 -0700)]
[RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are used.

We normally select these when the root node is a sext_inreg, but
SimplifyDemandedBits can sometimes bypass the sext_inreg for some
users. This can create a situation where sext_inreg+add/sub/mul/shl
is selected to a W instruction, and then the add/sub/mul/shl is
separately selected to a non-W instruction with the same inputs.

This patch tries to detect when it would still be ok to use a W
instruction without the sext_inreg by checking the direct users.
This can allow the W instruction to CSE with one created for a
sext_inreg+add/sub/mul/shl. To minimize complexity and cost of
checking, we make no attempt to determine if the CSE will happen
and just always use a W instruction when we can.
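
A small model of the reasoning, offered as a sketch rather than as the isel
code itself: ADDW adds the low 32 bits and sign extends the result, so it
agrees with a plain ADD in the low 32 bits even when the full 64-bit values
differ. The function names add64, addw and low32 are made up for this
illustration.

```cpp
#include <cstdint>

// Model of RV64 ADD: full 64-bit wrapping addition.
uint64_t add64(uint64_t a, uint64_t b) { return a + b; }

// Model of RV64 ADDW: add the low 32 bits, then sign extend the 32-bit
// result to 64 bits. The conversion to int32_t is modular in C++20 and
// behaves the same way on mainstream compilers in earlier modes.
uint64_t addw(uint64_t a, uint64_t b) {
  uint32_t lo = static_cast<uint32_t>(a) + static_cast<uint32_t>(b);
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(lo)));
}

// Model of a user that only reads the lower 32 bits (e.g. a 32-bit store).
uint32_t low32(uint64_t v) { return static_cast<uint32_t>(v); }
```

With a = 0x7fffffff and b = 1, add64 yields 0x0000000080000000 and addw
yields 0xffffffff80000000, yet low32 of both is 0x80000000. A user that only
looks at the lower 32 bits cannot tell the two apart, which is the condition
the patch checks for on the direct users.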

Differential Revision: https://reviews.llvm.org/D107658

3 years ago[X86] avx512bw-intrinsics-upgrade.ll - cleanup whitespace and use nounwind to avoid...
Simon Pilgrim [Wed, 18 Aug 2021 16:53:41 +0000 (17:53 +0100)]
[X86] avx512bw-intrinsics-upgrade.ll - cleanup whitespace and use nounwind to avoid unnecessary cfi tags. NFCI.

3 years ago[RISCV] Add zext.h/zext.w to RISCVTTIImpl::getIntImmCostInst.
Craig Topper [Wed, 18 Aug 2021 16:32:10 +0000 (09:32 -0700)]
[RISCV] Add zext.h/zext.w to RISCVTTIImpl::getIntImmCostInst.

If we have these instructions, we don't need to hoist the immediate
for an AND that would match them.
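
For reference, a sketch of the AND patterns in question, assuming the usual
definitions of zext.h (zero extend the low 16 bits) and zext.w (zero extend
the low 32 bits). The C++ function names are only illustrative.

```cpp
#include <cstdint>

// AND with 0xFFFF: the operation zext.h performs in one instruction.
uint64_t zext_h(uint64_t x) { return x & 0xFFFFull; }

// AND with 0xFFFFFFFF: the operation zext.w performs in one instruction.
uint64_t zext_w(uint64_t x) { return x & 0xFFFFFFFFull; }
```

Neither mask fits a 12-bit ANDI immediate, so without these instructions the
mask would have to be materialized in a register first. When the instructions
are available the mask never needs to exist as a separate constant.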

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107783

3 years ago[NFC] Cleanup calls to CallBase::getAttribute()
Arthur Eubanks [Wed, 18 Aug 2021 16:37:01 +0000 (09:37 -0700)]
[NFC] Cleanup calls to CallBase::getAttribute()

3 years ago[hwasan] Default -hwasan-use-stack-safety to off.
Florian Mayer [Wed, 18 Aug 2021 09:21:27 +0000 (10:21 +0100)]
[hwasan] Default -hwasan-use-stack-safety to off.

This very occasionally causes an assertion failure in the compiler.
Turning off until we can get to the bottom of this.

Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D108282

3 years ago[Bitcode] Remove unused declaration writeGlobalVariableMetadataAttachment (NFC)
Kazu Hirata [Wed, 18 Aug 2021 16:16:05 +0000 (09:16 -0700)]
[Bitcode] Remove unused declaration writeGlobalVariableMetadataAttachment (NFC)

The declaration was introduced without a corresponding definition on
May 31, 2016 in commit cceae7feda8e33194d1a6c5963bd4114bb8d2b36.

3 years ago[Analysis][AArch64] Make fixed-width ordered reductions slightly more expensive
David Sherwood [Wed, 18 Aug 2021 08:40:21 +0000 (09:40 +0100)]
[Analysis][AArch64] Make fixed-width ordered reductions slightly more expensive

For tight loops like this:

  float r = 0;
  for (int i = 0; i < n; i++) {
    r += a[i];
  }

it's better not to vectorise at -O3 using fixed-width ordered reductions
on AArch64 targets. Although the resulting number of instructions in the
generated code ends up being comparable to not vectorising at all, there
may be additional costs on some CPUs, for example perhaps the scheduling
is worse. It makes sense to deter vectorisation in tight loops.
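
To show why the reduction has to stay ordered in the first place, here is a
minimal sketch (not taken from the patch) comparing the in-order sum with a
reassociated four-lane sum. It assumes IEEE-754 single precision evaluated
without fast-math or excess precision; both function names are hypothetical.

```cpp
#include <cstddef>

// Strict (ordered) reduction: adds elements in source order, exactly
// like the scalar loop above.
float ordered_sum(const float *a, std::size_t n) {
  float r = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    r += a[i];
  return r;
}

// Reassociated reduction: four independent partial sums, combined at
// the end, similar in spirit to an unordered 4-lane vector reduction.
float reassociated_sum(const float *a, std::size_t n) {
  float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    for (int l = 0; l < 4; ++l)
      lane[l] += a[i + l];
  float r = (lane[0] + lane[1]) + (lane[2] + lane[3]);
  for (; i < n; ++i) // scalar tail
    r += a[i];
  return r;
}
```

For the input {1e8f, 1.0f, -1e8f, 1.0f} the ordered sum is 1 while the
reassociated sum is 0, because 1e8f + 1.0f rounds back to 1e8f. Without
fast-math the vectoriser must preserve the first result, which is what an
ordered reduction does.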

Differential Revision: https://reviews.llvm.org/D108292

3 years ago[OpenMP][NFC] Improve debug message for shared memory
Joseph Huber [Wed, 18 Aug 2021 15:48:15 +0000 (11:48 -0400)]
[OpenMP][NFC] Improve debug message for shared memory

Summary:
Make the debug message for HeapToShared more helpful by showing the
actual call.

3 years ago[libc++] Skip logic for detecting C11 features when using_if_exists is supported
Louis Dionne [Tue, 17 Aug 2021 15:23:48 +0000 (11:23 -0400)]
[libc++] Skip logic for detecting C11 features when using_if_exists is supported

In the future, we'll want to rely exclusively on using_if_exists for this
job, but for now, only rely on it when the compiler supports that attribute.
That removes the possibility for getting the logic wrong.

Differential Revision: https://reviews.llvm.org/D108297

3 years ago[libc++] Split off tests for aligned_alloc & friends into separate test files
Louis Dionne [Wed, 18 Aug 2021 12:31:40 +0000 (08:31 -0400)]
[libc++] Split off tests for aligned_alloc & friends into separate test files

This allows testing the rest of those headers on most platforms, instead
of XFAILing the whole test just because of a few functions.

As a fly-by fix, remove std/utilities/time/date.time/ctime.pass.cpp,
which was a duplicate of std/language.support/support.runtime/ctime.pass.cpp.

Differential Revision: https://reviews.llvm.org/D108295

3 years agoAdd some Function method definitions accidentally removed
Arthur Eubanks [Wed, 18 Aug 2021 15:27:45 +0000 (08:27 -0700)]
Add some Function method definitions accidentally removed

In cc327bd5231126006b4177b8ce0946ce52e2f645.

3 years ago[OpenMP] Change AAKernelInfo to ignore non-kernels
Joseph Huber [Wed, 18 Aug 2021 00:29:54 +0000 (20:29 -0400)]
[OpenMP] Change AAKernelInfo to ignore non-kernels

Currently, AAKernelInfo will fail on an assertion if we attempt to run
it on a kernel without the init / deinit runtime calls. However, this
occurs for global constructors on the device. This will cause OpenMPOpt
to crash whenever global constructors are present. This patch removes
this assertion and just gives up instead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108258

3 years ago[Libomptarget] Correctly default to Generic if exec_mode is not present
Joseph Huber [Tue, 17 Aug 2021 23:19:31 +0000 (19:19 -0400)]
[Libomptarget] Correctly default to Generic if exec_mode is not present

Currently, the runtime returns an error when the `exec_mode` global is
not present. The expected behaviour is that the region will default to
Generic. This prevents global constructors from being called because
they do not contain execution mode globals.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108255

3 years ago[clang-offload-wrapper] Disabled ELF offload notes embedding by default.
Vyacheslav Zakharin [Wed, 18 Aug 2021 15:18:03 +0000 (08:18 -0700)]
[clang-offload-wrapper] Disabled ELF offload notes embedding by default.

This change-set puts 93d08acaacec951dbb302f77eeae51974985b6b2 functionality
under -add-omp-offload-notes switch that is OFF by default.
CUDA toolchain is not able to handle ELF images with LLVMOMPOFFLOAD
notes for an unknown reason (see https://reviews.llvm.org/D99551#2950272).
I disable the ELF notes embedding until the CUDA issue is triaged and resolved.

Differential Revision: https://reviews.llvm.org/D108246

3 years ago[MLIR] Correct linkage of lowered globalop
William S. Moses [Tue, 17 Aug 2021 22:22:04 +0000 (18:22 -0400)]
[MLIR] Correct linkage of lowered globalop

LLVM considers global variables marked as external to be defined within the module if they are initialized (including to an undef). Other external globals are considered to be defined externally and imported into the current translation unit. Lowering of MLIR Global Ops does not properly propagate undefined initializers, resulting in a global that is expected to be defined within the current TU not being defined.

Differential Revision: https://reviews.llvm.org/D108252

3 years ago[PowerPC][AIX] llvm-readobj: Convert some errors to warnings.
Maryam Benimmar [Tue, 17 Aug 2021 17:06:30 +0000 (13:06 -0400)]
[PowerPC][AIX] llvm-readobj: Convert some errors to warnings.

Report warnings rather than errors, so that llvm-readobj doesn't bail
out on malformed inputs.

Differential Revision: https://reviews.llvm.org/D106783

3 years ago[mlir][spirv] Add (InBounds)PtrAccessChain ops
Butygin [Sat, 14 Aug 2021 08:57:02 +0000 (11:57 +0300)]
[mlir][spirv] Add (InBounds)PtrAccessChain ops

Differential Revision: https://reviews.llvm.org/D108070

3 years ago[X86] [AMX] Fix the test case failure caused by D107544.
Bing1 Yu [Wed, 18 Aug 2021 05:40:19 +0000 (13:40 +0800)]
[X86] [AMX] Fix the test case failure caused by D107544.

The issue can be reproduced when EXPENSIVE_CHECKS is specified for the LLVM
build. Thanks to Simon for reporting this issue at
https://bugs.llvm.org/show_bug.cgi?id=51513. We need to return the correct
value for the changed IR.

Reviewed By: RKSimon, LuoYuanke

Differential Revision: https://reviews.llvm.org/D108269

3 years ago[gn build] Port 38812f4ac122
LLVM GN Syncbot [Wed, 18 Aug 2021 14:02:48 +0000 (14:02 +0000)]
[gn build] Port 38812f4ac122

3 years ago[libc++] Implement structured binding for std::ranges::subrange.
Arthur O'Dwyer [Fri, 13 Aug 2021 20:20:13 +0000 (16:20 -0400)]
[libc++] Implement structured binding for std::ranges::subrange.

The `get` half of this machinery was already implemented, but the `tuple_size`
and `tuple_element` parts were hiding in [ranges.syn] and therefore missed.

Differential Revision: https://reviews.llvm.org/D108054

3 years ago[libc++] [P1614] Implement std::compare_three_way_result.
Arthur O'Dwyer [Thu, 29 Jul 2021 04:03:01 +0000 (00:03 -0400)]
[libc++] [P1614] Implement std::compare_three_way_result.

Differential Revision: https://reviews.llvm.org/D103581

3 years agoUse a more general test here.
Aaron Ballman [Wed, 18 Aug 2021 13:30:36 +0000 (09:30 -0400)]
Use a more general test here.

The interesting bit about that triple isn't the architecture, it's the
fact that ps4 implies C99 as the standard rather than a newer C mode.
Specify the language standard rather than the triple so the test is a
bit more general.

3 years agoSimplify a .mailmap entry
Nico Weber [Wed, 18 Aug 2021 13:16:16 +0000 (09:16 -0400)]
Simplify a .mailmap entry

The old entry mapped the email address `<compnerd@compnerd.org>` to user name
`Saleem Abdulrasool` and email address `<compnerd@compnerd.org>`. Since the two
addresses are identical, that's a needless detail.

The new entry just maps email address `<compnerd@compnerd.org>` to user name
`Saleem Abdulrasool`.

No behavior change.

Differential Revision: https://reviews.llvm.org/D108079

3 years agoDo not emit diagnostics for invalid unicode characters in preprocessing mode
Corentin Jabot [Wed, 18 Aug 2021 13:10:34 +0000 (09:10 -0400)]
Do not emit diagnostics for invalid unicode characters in preprocessing mode

This amends 4e80636db71a1b6123d15ed1f9eda3979b4292de with a fix for
https://lab.llvm.org/buildbot/#/builders/139/builds/8943

3 years ago[tsan] Disable Trace.MemoryAccessSize on Mac
Alexander Potapenko [Wed, 18 Aug 2021 12:33:14 +0000 (14:33 +0200)]
[tsan] Disable Trace.MemoryAccessSize on Mac

According to comments at https://reviews.llvm.org/D107911,
Trace.MemoryAccessSize fails on Mac buildbots.
Because this test is newly introduced, and is the only user of the code
added in that patch, disable the test on Mac till the problem is
resolved.

Differential Revision: https://reviews.llvm.org/D108294

3 years ago[libc++] Remove workarounds for the lack of deduction guides in C++17
Louis Dionne [Tue, 17 Aug 2021 15:59:07 +0000 (11:59 -0400)]
[libc++] Remove workarounds for the lack of deduction guides in C++17

All supported compilers have supported deduction guides in C++17 for a
while, so this isn't necessary anymore.
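
As a reminder of the feature being relied on, a minimal C++17 sketch of class
template argument deduction with an implicit and a user-defined deduction
guide. The Box type is hypothetical and is not part of libc++ or its tests.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class template used only for this illustration.
template <class T>
struct Box {
  T value;
  explicit Box(T v) : value(std::move(v)) {}
};

// User-defined deduction guide: a C string argument deduces
// Box<std::string> instead of Box<const char*>.
Box(const char *) -> Box<std::string>;
```

With this in place, Box b(42) deduces Box<int> from the constructor, and
Box s("hi") deduces Box<std::string> through the explicit guide, with no
template arguments spelled out at the call site.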

Differential Revision: https://reviews.llvm.org/D108213

3 years ago[libc++][NFC] Fix copy-paste errors in tests
Louis Dionne [Wed, 18 Aug 2021 12:54:18 +0000 (08:54 -0400)]
[libc++][NFC] Fix copy-paste errors in tests

The test precision_type.pass.cpp was a duplicate of precision.pass.cpp,
so it is removed. atomic_flag_test.pass.cpp was a duplicate of
atomic_flag_test_and_set.pass.cpp, so instead I wrote a proper
test for it. Those duplicate tests were detected with

     find libcxx ! -empty -type f -exec md5sum {} + | sort | uniq -w32 -dD

3 years ago[libc++] Convert test-suite workarounds for some C11 features to XFAILs
Louis Dionne [Tue, 17 Aug 2021 15:21:09 +0000 (11:21 -0400)]
[libc++] Convert test-suite workarounds for some C11 features to XFAILs

Instead of trying to sniff out what features are supported by the
library being tested, the way we normally handle these things is with
Lit annotations. This should not be treated differently.

Differential Revision: https://reviews.llvm.org/D108209

3 years ago[NFC][X86][Codegen] Add exhaustive test coverage for PR50971
Roman Lebedev [Wed, 18 Aug 2021 12:02:25 +0000 (15:02 +0300)]
[NFC][X86][Codegen] Add exhaustive test coverage for PR50971

Produced via https://godbolt.org/z/5hEdGY5x3

3 years agoImplement P1949
Corentin Jabot [Wed, 18 Aug 2021 11:33:14 +0000 (07:33 -0400)]
Implement P1949

This adds the Unicode 13 data for XID_Start and XID_Continue.
The definition of valid identifier is changed in all C++ modes
as P1949 (https://wg21.link/p1949) was accepted by WG21 as a defect
report.

3 years ago[Sema] CheckObjCBridgeNSCast - fix dead code warning. NFCI.
Simon Pilgrim [Wed, 18 Aug 2021 10:02:39 +0000 (11:02 +0100)]
[Sema] CheckObjCBridgeNSCast - fix dead code warning. NFCI.

Target is only ever non-null when we find an existing type, so move its declaration inside that case, and remove the dead code where Target was always null.

3 years ago[gn build] Port 45ac5f544181
LLVM GN Syncbot [Wed, 18 Aug 2021 10:43:22 +0000 (10:43 +0000)]
[gn build] Port 45ac5f544181