platform/upstream/llvm.git
Kerry McLaughlin [Thu, 6 May 2021 09:50:51 +0000 (10:50 +0100)]
[SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences

Adds support for scalable vectorization of loops containing first-order recurrences, e.g.:
```
for (int i = 0; i < n; i++)
  b[i] = a[i] + a[i - 1];
```
This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into
account when inserting into and extracting from the last lane of a vector.
CreateVectorSplice has been added to construct a vector for the recurrence, which
returns a splice intrinsic for scalable types. For fixed-width the behaviour
remains unchanged as CreateVectorSplice will return a shufflevector instead.
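
As a rough illustration of the splice described above (a sketch only, not the patch's code): `VF`, `Vec`, `spliceLast` and both loop functions are hypothetical names, the width is fixed rather than scalable, and `init` stands in for the value of `a[-1]`. Each vector iteration needs the `a[i - 1]` values, which are the last lane of the previous vector followed by all but the last lane of the current one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical fixed vector width
using Vec = std::array<int, VF>;

// Last lane of `prev` followed by the first VF - 1 lanes of `cur`.
Vec spliceLast(const Vec &prev, const Vec &cur) {
  Vec out{};
  out[0] = prev[VF - 1];
  for (std::size_t l = 1; l < VF; ++l)
    out[l] = cur[l - 1];
  return out;
}

// Scalar reference: b[i] = a[i] + a[i - 1], with `init` used for a[-1].
std::vector<int> scalarRecurrence(const std::vector<int> &a, int init) {
  std::vector<int> b(a.size());
  int prev = init;
  for (std::size_t i = 0; i < a.size(); ++i) {
    b[i] = a[i] + prev;
    prev = a[i];
  }
  return b;
}

// Vector-style form; a.size() must be a multiple of VF (no tail handling).
std::vector<int> vectorRecurrence(const std::vector<int> &a, int init) {
  assert(a.size() % VF == 0);
  std::vector<int> b(a.size());
  Vec prev{};
  prev[VF - 1] = init; // seed the last lane with the initial value
  for (std::size_t i = 0; i < a.size(); i += VF) {
    Vec cur{};
    for (std::size_t l = 0; l < VF; ++l)
      cur[l] = a[i + l];
    Vec shifted = spliceLast(prev, cur); // lane l holds a[i + l - 1]
    for (std::size_t l = 0; l < VF; ++l)
      b[i + l] = cur[l] + shifted[l];
    prev = cur;
  }
  return b;
}
```

For scalable vectors the lane count is only known at run time, which is why the last lane has to be addressed relative to vscale instead of a constant index.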

The tests included here are the same as those in test/Transforms/LoopVectorize/first-order-recurrence.ll

Reviewed By: david-arm, fhahn

Differential Revision: https://reviews.llvm.org/D101076

Eliza Velasquez [Thu, 6 May 2021 10:12:05 +0000 (12:12 +0200)]
[clang-format] Rename common types between C#/JS

Reviewed By: curdeius

Differential Revision: https://reviews.llvm.org/D101862

Eliza Velasquez [Thu, 6 May 2021 10:06:00 +0000 (12:06 +0200)]
[clang-format] Fix C# nullable-related errors

This fixes two errors:

Previously, clang-format was splitting up type identifiers from the
nullable ?. This changes this behavior so that the type name sticks with
the operator.

Additionally, nullable operators attached to return types in interface
functions were not parsed correctly. Digging deeper, it looks like
interface bodies were being parsed differently than classes and structs,
causing MustBeDeclaration to be incorrect for interface members. They
now share the same logic.

One other change is reintroducing the CSharpNullable type independent of
JsTypeOptionalQuestion. Despite having a similar semantic purpose, their
actual syntax differs quite a bit.

Reviewed By: MyDeveloperDay, curdeius

Differential Revision: https://reviews.llvm.org/D101860

Eliza Velasquez [Thu, 6 May 2021 09:22:31 +0000 (11:22 +0200)]
[clang-format] Add more support for C# 8 nullables

This adds support for the null-coalescing assignment and null-forgiving
operators.

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-coalescing-operator

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-forgiving

Reviewed By: krasimir, curdeius

Differential Revision: https://reviews.llvm.org/D101702

Jay Foad [Wed, 7 Apr 2021 12:49:07 +0000 (13:49 +0100)]
[AMDGPU] SIFoldOperands: clean up tryConstantFoldOp

First clean up the strange API of tryConstantFoldOp where it took an
immediate operand value, but no indication of which operand it was the
value for.

Second clean up the loop that calls tryConstantFoldOp so that it does
not have to restart from the beginning every time it folds an
instruction.

This is NFCI but there are some minor changes caused by the order in
which things are folded.

Differential Revision: https://reviews.llvm.org/D100031

Andrzej Warzynski [Fri, 23 Apr 2021 14:46:35 +0000 (14:46 +0000)]
[flang] Remove `%f18` from LIT configuration files

`%f18` was originally introduced to represent the old Flang driver,
`f18`. With the introduction of the new driver, `flang-new`, we have
been switching to `%flang` (compiler driver) and `%flang_fc1` (frontend
driver) as more generic alternatives.

As most tests have been ported to use the new LIT variables instead of
`%f18`, this is a good time to remove it from lit.cfg.py. There's only one
test left that requires the old driver to run. It's updated with:
```
! REQUIRES: old-flang-driver
```
This way we preserve its semantics while reducing the number of
variables in LIT configuration.

Differential Revision: https://reviews.llvm.org/D101281

Malhar Jajoo [Thu, 6 May 2021 00:38:20 +0000 (01:38 +0100)]
[ARM] Transforming memcpy to Tail predicated Loop

This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

From an implementation point of view, the patch

- adds an ARM-specific SDAG node (to which the llvm.memcpy intrinsic is lowered during the first phase of ISel),
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
  on matching the above node, and
- adds a custom inserter function that expands the pseudo instruction into MIR suitable
  to be converted (by later passes) into a WLSTP loop.

Note: A CLI option is used to control the conversion of memcpy to TP
loop and this option is currently disabled by default. It may be enabled
in the future after further downstream testing.
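
To illustrate what tail predication buys for a memcpy, here is a minimal scalar model (a sketch, not the generated code): `kLanes` and `predicatedCopy` are hypothetical names, and the inner loop stands in for a single predicated vector load/store. Lanes beyond the remaining byte count are masked off, so no separate scalar epilogue is needed.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 16; // bytes per 128-bit MVE vector

// Copies n bytes in vector-sized steps. In the final step only the lanes
// below the remaining count are active (the "tail predicate").
void predicatedCopy(std::uint8_t *dst, const std::uint8_t *src, std::size_t n) {
  std::size_t remaining = n;
  while (remaining > 0) {
    std::size_t active = remaining < kLanes ? remaining : kLanes;
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      if (lane < active) // predicate: inactive lanes touch no memory
        dst[lane] = src[lane];
    }
    dst += active;
    src += active;
    remaining -= active;
  }
}
```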

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99723

James Henderson [Wed, 5 May 2021 10:56:46 +0000 (11:56 +0100)]
[lit] Report tool path from use_llvm_tool if found via env variable

Previously, if the search_env argument was specified, and the tool was
found at that location, the path was not reported, unlike other
situations when this function was called. Adding the reporting makes the
function consistent.

Reviewed by: thopre

Differential Revision: https://reviews.llvm.org/D101896

Tim Renouf [Tue, 4 May 2021 09:10:41 +0000 (10:10 +0100)]
[llvm-objdump] Use std::make_unique

Fix up my recent commit rG1128311a19179ceca799ff0fbc4dd206ab56e560 to
use std::make_unique instead of std::unique_ptr(new), as requested by
David Blaikie.
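
For reference, the two spellings look like this in isolation (a sketch; `Printer`, `makeOld` and `makeNew` are hypothetical and not taken from llvm-objdump):

```cpp
#include <memory>
#include <string>
#include <utility>

struct Printer { // hypothetical type, for illustration only
  explicit Printer(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Old spelling: the type is named twice and a raw `new` appears.
std::unique_ptr<Printer> makeOld(const std::string &n) {
  return std::unique_ptr<Printer>(new Printer(n));
}

// Preferred spelling: no raw `new`, the type is named once.
std::unique_ptr<Printer> makeNew(const std::string &n) {
  return std::make_unique<Printer>(n);
}
```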

Differential Revision: https://reviews.llvm.org/D101822

Guillaume Chatelet [Thu, 6 May 2021 07:46:19 +0000 (07:46 +0000)]
[llvm][NFC] Remove CallingConvLower deprecated alignment functions

Differential Revision: https://reviews.llvm.org/D101910

Guillaume Chatelet [Thu, 6 May 2021 07:44:14 +0000 (07:44 +0000)]
[llvm][NFC] Remove SelectionDag alignment deprecated functions

Differential Revision: https://reviews.llvm.org/D101909

Guillaume Chatelet [Thu, 6 May 2021 07:40:18 +0000 (07:40 +0000)]
[llvm][NFC] Remove deprecated InterleaveGroup::getAlignment() function.

Differential Revision: https://reviews.llvm.org/D101907

Guillaume Chatelet [Thu, 6 May 2021 07:28:00 +0000 (07:28 +0000)]
[llvm][NFC] Remove deprecated DataLayout::getPreferredAlignment functions

Differential Revision: https://reviews.llvm.org/D101906

Guillaume Chatelet [Thu, 6 May 2021 07:21:23 +0000 (07:21 +0000)]
[llvm][NFC] Remove deprecated Alignment::None()

Differential Revision: https://reviews.llvm.org/D101905

Johannes Doerfert [Thu, 22 Apr 2021 05:57:28 +0000 (00:57 -0500)]
[OpenMP] Overhaul `declare target` handling

This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.

This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.

I tried to clean up things while I was at it, e.g., we should not "parse
declarations and definitions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.

Highlights:
  - new diagnostics for restrictions specified in the standard,
  - delayed emission of globals not mentioned in an explicit
    list of a declare target,
  - omission of `nohost` globals on the host and `host` globals on the
    device,
  - no explicit parsing of declarations in-between `omp [begin] declare
    variant` and the corresponding end anymore, regular parsing instead,
  - precedence for explicit mentions in `declare target` lists over
    implicit mentions in the declaration-definition-seq, and
  - `omp allocate` declarations will now replace an earlier emitted
    global, if necessary.

---

Notes:

The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.

After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.

---

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D101030

Johannes Doerfert [Fri, 16 Apr 2021 05:44:50 +0000 (00:44 -0500)]
[OpenMP] Ensure the DefaultMapperId has a location

A user reported an assertion (below) but without a reproducer. I failed to
create a test myself but from the assertion one can derive the problem.
I set the DefaultMapperId location now to make sure this doesn't cause
trouble.

```
clang-13: .../DeclTemplate.h:1940:
void clang::ClassTemplateSpecializationDecl::setPointOfInstantiation(clang::SourceLocation):
Assertion `Loc.isValid() && "point of instantiation must be valid!"' failed.
```

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100621

Johannes Doerfert [Fri, 16 Apr 2021 05:35:29 +0000 (00:35 -0500)]
[OpenMP] Make sure classes work on the device as they do on the host

We do provide `operator delete(void*)` in `<new>` but it should be
available by default. This is mostly boilerplate to test it and the
unconditional include of `<new>` in the header we always include
on the device.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100620

Navdeep Kumar [Thu, 6 May 2021 06:35:07 +0000 (12:05 +0530)]
[MLIR][GPU][NVVM] Add warp synchronous matrix-multiply accumulate ops

Add warp synchronous matrix-multiply accumulate ops to the GPU and NVVM
dialects. Add the following three ops to the GPU dialect:
  1.) subgroup_mma_load_matrix
  2.) subgroup_mma_store_matrix
  3.) subgroup_mma_compute
Add the following three ops to the NVVM dialect:
  1.) wmma.m16n16k16.load.[a,b,c].[f16,f32].row.stride
  2.) wmma.m16n16k16.store.d.[f16,f32].row.stride
  3.) wmma.m16n16k16.mma.row.row.[f16,f32].[f16,f32]

Reviewed By: bondhugula, ftynse, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D95330

Queen Dela Cruz [Thu, 6 May 2021 06:22:32 +0000 (08:22 +0200)]
[clangd] Check if macro is already in the IdentifierTable before loading it

Having nested macros in the C code could cause clangd to fail an assert in clang::Preprocessor::setLoadedMacroDirective() and crash.

 #1 0x00000000007ace30 PrintStackTraceSignalHandler(void*) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
 #2 0x00000000007aaded llvm::sys::RunSignalHandlers() /qdelacru/llvm-project/llvm/lib/Support/Signals.cpp:76:20
 #3 0x00000000007ac7c1 SignalHandler(int) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007f096604db20 __restore_rt (/lib64/libpthread.so.0+0x12b20)
 #5 0x00007f0964b307ff raise (/lib64/libc.so.6+0x377ff)
 #6 0x00007f0964b1ac35 abort (/lib64/libc.so.6+0x21c35)
 #7 0x00007f0964b1ab09 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21b09)
 #8 0x00007f0964b28de6 (/lib64/libc.so.6+0x2fde6)
 #9 0x0000000001004d1a clang::Preprocessor::setLoadedMacroDirective(clang::IdentifierInfo*, clang::MacroDirective*, clang::MacroDirective*) /qdelacru/llvm-project/clang/lib/Lex/PPMacroExpansion.cpp:116:5

An example of the code that causes the assert failure:
```
...
```

During code completion in clangd, the macros will be loaded in loadMainFilePreambleMacros() by iterating over the macro names and calling PreambleIdentifiers->get(). Since these macro names are stored in a StringSet (which has a StringMap as its underlying container), the order of the iterator is not guaranteed to be the same as the order seen in the source code.

When clangd is trying to resolve nested macros it sometimes attempts to load them out of order which causes a macro to be stored twice. In the example above, ECHO2 macro gets resolved first, but since it uses another macro that has not been resolved it will try to resolve/store that as well. Now there are two MacroDirectives stored in the Preprocessor, ECHO and ECHO2. When clangd tries to load the next macro, ECHO, the preprocessor fails an assert in clang::Preprocessor::setLoadedMacroDirective() because there is already a MacroDirective stored for that macro name.

In this diff, I check if the macro is already inside the IdentifierTable and, if it is, skip it so that it is not resolved twice.
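
The failure mode and the added check can be modelled outside clangd roughly as follows. This is a sketch under assumptions: `MacroDeps` and `Loader` are hypothetical, the `loaded` set stands in for the IdentifierTable lookup, and `duplicateStores` counts the situations in which the real assert would fire.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the preamble: macro name -> macros used in its body.
using MacroDeps = std::map<std::string, std::vector<std::string>>;

struct Loader {
  explicit Loader(const MacroDeps &d) : deps(d) {}

  const MacroDeps &deps;
  std::set<std::string> loaded; // models "already in the IdentifierTable"
  int duplicateStores = 0;      // each one models a failed assert

  void store(const std::string &name) {
    if (!loaded.insert(name).second)
      ++duplicateStores;
  }

  // Resolving a macro first resolves (and stores) the macros it uses.
  void resolve(const std::string &name) {
    auto it = deps.find(name);
    if (it != deps.end())
      for (const auto &dep : it->second)
        if (!loaded.count(dep))
          resolve(dep);
    store(name);
  }

  // `order` is the (arbitrary) iteration order of the name set.
  void loadAll(const std::vector<std::string> &order, bool skipLoaded) {
    for (const auto &name : order) {
      if (skipLoaded && loaded.count(name))
        continue; // the check this diff adds
      resolve(name);
    }
  }
};
```

With the iteration order ECHO2, ECHO and ECHO2 depending on ECHO, the unchecked loader stores ECHO twice, while the checked one stores each macro once.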

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D101870

Giorgis Georgakoudis [Wed, 5 May 2021 22:13:14 +0000 (15:13 -0700)]
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks

This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101849

Jessica Clarke [Thu, 6 May 2021 03:01:20 +0000 (04:01 +0100)]
[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics

Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.
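
The arithmetic behind this can be sketched as below. This is an illustration only: `Extend` and `atomicLoadSignBits` are hypothetical names, not the SelectionDAG API. `memBits` is the width loaded from memory and `vtBits` the width of the result type.

```cpp
#include <cassert>

enum class Extend { Sign, Zero }; // hypothetical: how the target extends atomic loads

// Number of high bits known to be copies of the sign bit (always at least 1).
unsigned atomicLoadSignBits(unsigned memBits, unsigned vtBits, Extend ext) {
  assert(memBits >= 1 && memBits <= vtBits);
  if (memBits == vtBits)
    return 1; // no extension happened: only the sign bit itself is known
  if (ext == Extend::Sign)
    return vtBits - memBits + 1; // new high bits replicate the loaded sign bit
  return vtBits - memBits;       // new high bits are all zero
}
```

The `memBits == vtBits` case is the one the re-land fixes: for zero extension `vtBits - memBits` would be 0 there, but the minimum valid answer is 1.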

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342

Jinsong Ji [Thu, 6 May 2021 02:38:31 +0000 (02:38 +0000)]
[BPF][Test] Disable codegen test on AIX

https://reviews.llvm.org/D101194 changed the default getMultiarchTriple in toolchain.
So -march=bpf on AIX will get triple of bpf-ibm-aix now,
this is unexpected and causing test failures.

BPF on AIX is not supported (yet), disable the codegen test on AIX in lit cfg.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D101866

Lang Hames [Thu, 6 May 2021 02:30:24 +0000 (19:30 -0700)]
[ORC] Add missing library dependency on IRReader.

Giorgis Georgakoudis [Thu, 6 May 2021 01:28:23 +0000 (18:28 -0700)]
[OpenMP] Fix non-determinism in clang copyin codegen

Codegen for OpenMP copyin has non-deterministic IR output due to the unspecified evaluation order in a codegen conditional branch, which makes automatic test generation unreliable. This patch refactors codegen code to avoid this non-determinism.
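
The general pattern behind this kind of non-determinism, sketched with a hypothetical `Emitter` that stands in for an IR builder (none of these names come from Clang): when two emitting calls are nested as arguments of a third call, C++ leaves their relative order unspecified, so different host compilers can emit the instructions in different orders. Hoisting the calls into named locals fixes the order.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Emitter { // hypothetical stand-in for an IR builder
  std::vector<std::string> out;

  // Appends an "instruction" and returns its index.
  int emit(const std::string &text) {
    out.push_back(text);
    return static_cast<int>(out.size()) - 1;
  }

  int emitCmp(int lhs, int rhs) {
    assert(lhs >= 0 && rhs >= 0 && "operands must be emitted first");
    return emit("cmp %" + std::to_string(lhs) + ", %" + std::to_string(rhs));
  }
};

// The two inner emit() calls may run in either order.
int emitUnordered(Emitter &e) {
  return e.emitCmp(e.emit("ptrtoint a"), e.emit("ptrtoint b"));
}

// Named locals sequence the calls, so the output is the same everywhere.
int emitOrdered(Emitter &e) {
  int lhs = e.emit("ptrtoint a");
  int rhs = e.emit("ptrtoint b");
  return e.emitCmp(lhs, rhs);
}
```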

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101952

Lang Hames [Thu, 6 May 2021 01:29:26 +0000 (18:29 -0700)]
[ORC] Introduce C API for adding object buffers directly to an object layer.

This can be useful for clients constructing custom JIT stacks: If the C API
for your custom stack exposes API to obtain a reference to an object layer
(e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added
LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT
functions can be used to add objects directly to that layer.

Christopher Ferris [Thu, 6 May 2021 02:00:30 +0000 (19:00 -0700)]
[scudo] Add initialization for TSDRegistrySharedT

Fixes compilation on Android which has a TSDSharedRegistry object in the config.

Reviewed By: cryptoad, vitalybuka

Differential Revision: https://reviews.llvm.org/D101951

Stanislav Mekhanoshin [Wed, 5 May 2021 23:28:09 +0000 (16:28 -0700)]
[AMDGPU] Switch AnnotateUniformValues to MemorySSA

This should speed up compilation and also remove the threshold
limitations used by memory dependency analysis.

It also seems to fix the bug in coalescer_remat.ll
where an SMRD load was used in presence of a potentially
clobbering store.

Fixes: SWDEV-272132

Differential Revision: https://reviews.llvm.org/D101962

Austin Kerbow [Tue, 27 Apr 2021 16:29:27 +0000 (09:29 -0700)]
[AMDGPU] Move insertion of function entry waitcnt later

This allows tracking these as preexisting waitcnt.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D101380

Min-Yih Hsu [Thu, 6 May 2021 00:46:56 +0000 (17:46 -0700)]
[M68k][test][NFC] Scrubbing some tests

Remove unnecessary labels and assembly directives. NFC.

Fangrui Song [Thu, 6 May 2021 00:41:56 +0000 (17:41 -0700)]
[AArch64] Replace fixup_aarch64_tlsdesc_call with FirstLiteralRelocationKind + R_AARCH64_{,P32_}TLSDESC_CALL

Fangrui Song [Thu, 6 May 2021 00:41:04 +0000 (17:41 -0700)]
[test] Delete redundant arm64-tls-relocs.s

It just replicates tls-relocs.s

Juneyoung Lee [Sun, 2 May 2021 03:28:16 +0000 (12:28 +0900)]
[InstCombine] Fully disable select to and/or i1 folding

This is a patch that disables the poison-unsafe select -> and/or i1 folding.

It has been blocking D72396 and also has been the source of a few miscompilations
described in llvm.org/pr49688.
D99674 conditionally blocked this folding and successfully fixed the latter one.
The former one was still blocked, and this patch addresses it.
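
Why the fold is poison-unsafe can be shown with a three-valued model of `i1` (a sketch; the enum and both functions are hypothetical and only mimic the IR semantics). `select c, x, false` does not observe `x` when `c` is false, whereas `and c, x` is poison whenever either operand is poison, so rewriting the former into the latter can turn a well-defined `false` into poison.

```cpp
enum class I1 { False, True, Poison }; // Poison models a poison i1 value

// select c, x, false: x is only observed when c is true.
I1 selectAnd(I1 c, I1 x) {
  if (c == I1::Poison)
    return I1::Poison;
  return c == I1::True ? x : I1::False;
}

// and c, x: poison in either operand poisons the result.
I1 bitwiseAnd(I1 c, I1 x) {
  if (c == I1::Poison || x == I1::Poison)
    return I1::Poison;
  return (c == I1::True && x == I1::True) ? I1::True : I1::False;
}
```

The two forms agree on all non-poison inputs and differ only for `c == False, x == Poison`.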

Note that a few test functions that have the `_logical` suffix are now deoptimized.
These were created by @nikic to check the impact of disabling this optimization
by copying existing original functions and replacing and/or with select.

I can see that most of these are poison-unsafe; they can be revived by introducing
a freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll)
because I think they are good targets for a freeze fix.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101191

Austin Kerbow [Fri, 9 Apr 2021 17:54:21 +0000 (10:54 -0700)]
[AMDGPU] Revise handling of preexisting waitcnt

Preexisting waitcnt may not update the scoreboard if the instruction
being examined needed to wait on fewer counters than what was encoded in
the old waitcnt instruction. Fixing this results in the elimination of
some redundant waitcnt.

These changes also enable combining consecutive waitcnt into a single
S_WAITCNT or S_WAITCNT_VSCNT instruction.
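
Conceptually, combining two consecutive waits means keeping the stricter requirement for each counter. A minimal model, assuming each field is the maximum number of outstanding operations allowed (the `Waitcnt` struct and `combine` here are hypothetical and are not the backend's types):

```cpp
#include <algorithm>

// Hypothetical model of a wait: per-counter limit on outstanding operations.
// A smaller value is a stricter wait.
struct Waitcnt {
  unsigned vm;
  unsigned exp;
  unsigned lgkm;
};

// One wait that satisfies both inputs: the minimum of each counter.
Waitcnt combine(const Waitcnt &a, const Waitcnt &b) {
  return {std::min(a.vm, b.vm), std::min(a.exp, b.exp),
          std::min(a.lgkm, b.lgkm)};
}
```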

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D100281

Malhar Jajoo [Wed, 5 May 2021 19:20:46 +0000 (20:20 +0100)]
[ARM] Simplification to ARMBlockPlacement Pass.

It simplifies the logic by moving the predecessor (preHeader or its predecessor) above the target (or loopExit),
instead of moving the target to after the predecessor.

Since the loopExit is no longer being moved, directions of any branches within/to it are unaffected.

While the predecessor is being moved, the backwards movement simplifies some considerations,
and the only consideration now required is that a forward WLS to the predecessor should not become backwards.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100094

Jianzhou Zhao [Wed, 5 May 2021 00:53:51 +0000 (00:53 +0000)]
 [dfsan] extend a test case to measure origin memory usage

This is to support D101204.

Reviewed By: gbalats

Differential Revision: https://reviews.llvm.org/D101877

Min-Yih Hsu [Sun, 2 May 2021 21:31:38 +0000 (14:31 -0700)]
[M68k][AsmParser] Fix invalid register name parsing logics

Adjust the sanity check in the register parsing function to allow register
names with more than 2 characters (e.g. ccr).

Differential Revision: https://reviews.llvm.org/D101733

Min-Yih Hsu [Sun, 2 May 2021 21:27:33 +0000 (14:27 -0700)]
[M68k][AsmParser] Support negative integer constants

Parse negative integer constants as expressions.

Differential Revision: https://reviews.llvm.org/D101732

Min-Yih Hsu [Tue, 27 Apr 2021 16:51:57 +0000 (09:51 -0700)]
[M68k][test] Initial migration of MC tests

As described in bug 49865[1], we are migrating tests under
`test/CodeGen/M68k/Encoding`, which was originally used to test
instruction encoding using MIR file as input, into `test/MC/M68k`. We
are also adding test directives for AsmParser using the same set of
inputs.

Currently we are converting the original MIR test files into assembly
code as well as translating the original LIT "RUN" statement into one
that only uses built-in LLVM tools (i.e. get rid of `extract-section`).

However, since AsmParser is not yet complete, many of these
original test cases fail. Thus, this patch only migrates test files
that pass with the current implementation of AsmParser (and
MCCodeEmitter). The remaining tests (under test/CodeGen/M68k/Encoding)
will be ported along with the patch that fixes the related issues.

[1]: https://bugs.llvm.org/show_bug.cgi?id=49865

Differential Revision: https://reviews.llvm.org/D101410

Heejin Ahn [Tue, 4 May 2021 04:48:38 +0000 (21:48 -0700)]
[WebAssembly] Fix JS code mentions in LowerEmscriptenEHSjLj

- Removes the mention of fastcomp, which is deprecated.
- Some functions in Emscripten have moved from JS glue code to
  compiler-rt/emscripten_setjmp.c and
  compiler-rt/emscripten_exception_builtins.c. This fixes comments about
  that.

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D101812

peter klausler [Fri, 23 Apr 2021 23:30:34 +0000 (16:30 -0700)]
[flang] Provide access to constant character array data

Allow direct access to constant character array data (for creating a hash ID of a constant).

Differential Revision: https://reviews.llvm.org/D101208

Emilio Cota [Wed, 5 May 2021 23:41:22 +0000 (16:41 -0700)]
[mlir] Check generated IR of math_polynomial_approx.mlir

Instead of just checking that we emit something.

Differential Revision: https://reviews.llvm.org/D101940

Nicolai Hähnle [Wed, 5 May 2021 23:36:45 +0000 (01:36 +0200)]
[tests] Update Transforms/FunctionAttrs/nosync.ll

Commit generated by running update_test_checks.py, to reflect the fact
that we now add the `mustprogress` attribute.

Fangrui Song [Wed, 5 May 2021 23:28:39 +0000 (16:28 -0700)]
[AArch64] Deleted unused AsmBackend functions

RamNalamothu [Wed, 5 May 2021 23:18:59 +0000 (04:48 +0530)]
[MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling

This change enables emitting CFI unwind information for debugging purposes
for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None.

Currently generating CFI unwind information is entangled with supporting
the exceptions, even when AsmPrinter explicitly recognizes that the unwind
tables are being generated as debug information.

In fact, the unwind information is not generated even if we specify
--force-dwarf-frame-section, unless exceptions are enabled. The LIT test
llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior.

Enable this option for AMDGPU to prepare for future patches which add
complete CFI support.

Reviewed By: dblaikie, MaskRay

Differential Revision: https://reviews.llvm.org/D78778

MaheshRavishankar [Wed, 5 May 2021 23:05:44 +0000 (16:05 -0700)]
[mlir][Linalg] Fix test to use new reshape op form.

Differential Revision: https://reviews.llvm.org/D101956

Coplin, Jared [Wed, 20 Jan 2021 22:11:49 +0000 (16:11 -0600)]
Attach metadata to simplified masked loads and stores

Alex Reinking [Wed, 5 May 2021 22:54:17 +0000 (15:54 -0700)]
Allow /STACK in #pragma comment(linker, ...)

The Halide project uses `#pragma comment(linker, "/STACK:...")` to set
the stack size high enough for our embedded compiler to run in end-user
programs on Windows.

Unfortunately, lld-link.exe breaks on this when embedded in a COFF
object, despite supporting the flag on the command line. MSVC's link.exe
supports this fine. This patch extends support for this to lld-link.exe
for better compatibility with MSVC projects.

Differential Revision: https://reviews.llvm.org/D99680

Matt Arsenault [Wed, 5 May 2021 22:40:58 +0000 (18:40 -0400)]
AMDGPU: Fix lit test

MaheshRavishankar [Wed, 5 May 2021 22:38:25 +0000 (15:38 -0700)]
[mlir][Linalg] Fix element type of results when folding reshapes.

Fixing a minor bug which led to the element type of the output being
modified when folding reshapes with a generic op.

Differential Revision: https://reviews.llvm.org/D101942

Fangrui Song [Wed, 5 May 2021 22:27:16 +0000 (15:27 -0700)]
[AArch64] Fix some coding standard issues related to namespace llvm

https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions

Petr Hosek [Tue, 4 May 2021 05:05:27 +0000 (22:05 -0700)]
[Driver] Move -print-runtime-dir and -print-resource-dir tests

Put these into separate files to match other -print-* option tests.

Differential Revision: https://reviews.llvm.org/D101813

Vang Thao [Wed, 14 Apr 2021 00:51:58 +0000 (17:51 -0700)]
[AMDGPU][GlobalISel] Widen 1 and 2 byte scalar loads

Widen 1 and 2 byte scalar loads to 4 bytes when sufficiently
aligned to avoid using a global load.
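
The idea of widening can be pictured in plain C++ as follows (a sketch only; `loadByteWidened` is a hypothetical helper and says nothing about the actual instruction selection): when the address is 4-byte aligned, the whole word is read in one access and only the originally requested byte is kept.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the byte at `p` by loading the full 4-byte word that starts there.
// Only valid when `p` is 4-byte aligned and all four bytes are readable.
std::uint8_t loadByteWidened(const unsigned char *p) {
  assert(reinterpret_cast<std::uintptr_t>(p) % 4 == 0);
  unsigned char word[4];
  std::memcpy(word, p, 4); // one 4-byte load
  return word[0];          // keep only the byte that was asked for
}
```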

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D100430

Nico Weber [Wed, 5 May 2021 22:15:07 +0000 (18:15 -0400)]
[gn build] (semi-manually) port 0b10bb7ddd3c more

Dave Lee [Thu, 29 Apr 2021 23:03:46 +0000 (16:03 -0700)]
[lldb] Handle missing SBStructuredData copy assignment cases

Fix cases that can crash `SBStructuredData::operator=`.

This happened in a case where `rhs` had a null `SBStructuredDataImpl`.
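
The class of bug is easy to reproduce with any pimpl-style handle. In the sketch below, `Impl` and `Handle` are hypothetical stand-ins rather than the LLDB classes: a copy assignment that dereferences `rhs`'s implementation pointer unconditionally would crash for an empty `rhs`, so the null case is handled explicitly.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Impl { // hypothetical payload
  std::string json;
};

class Handle { // hypothetical stand-in, not the LLDB class
public:
  Handle() = default;
  explicit Handle(std::string s)
      : impl_(std::make_unique<Impl>(Impl{std::move(s)})) {}
  Handle(const Handle &rhs) { *this = rhs; }

  Handle &operator=(const Handle &rhs) {
    if (this == &rhs)
      return *this;
    if (rhs.impl_)
      impl_ = std::make_unique<Impl>(*rhs.impl_); // deep copy
    else
      impl_.reset(); // rhs is empty: do not dereference it
    return *this;
  }

  bool valid() const { return impl_ != nullptr; }

private:
  std::unique_ptr<Impl> impl_;
};
```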

Differential Revision: https://reviews.llvm.org/D101585

Vy Nguyen [Tue, 4 May 2021 20:23:21 +0000 (16:23 -0400)]
[lld-macho] Check simulator platforms to avoid issuing false positive errors.

Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator.

Differential Revision: https://reviews.llvm.org/D101855

Nico Weber [Wed, 5 May 2021 22:06:52 +0000 (18:06 -0400)]
[gn build] (semi-manually) port 0b10bb7ddd3c

Matt Arsenault [Sun, 14 Mar 2021 17:52:31 +0000 (13:52 -0400)]
AMDGPU: Add a few more tail call tests

Add some cases I noticed were missing when porting to GlobalISel. The
cases that required any argument splitting did not work at first.

Matt Arsenault [Wed, 5 May 2021 21:22:10 +0000 (17:22 -0400)]
ARM/GlobalISel: Don't store a MachineInstrBuilder reference

This is basically a pointer anyway

Richard Smith [Wed, 5 May 2021 21:44:49 +0000 (14:44 -0700)]
When performing template argument deduction to select a partial
specialization while substituting a partial template parameter pack,
don't try to extend the existing deduction.

This caused us to select the wrong partial specialization in some rare
cases. A recent change to libc++ caused this to happen in practice for
code using std::conjunction.

Stanislav Mekhanoshin [Mon, 3 May 2021 18:01:13 +0000 (11:01 -0700)]
[AMDGPU] Improve global SADDR selection

An address can be a uniform sum of two 64-bit values.
That regularly happens in a loop where index is an induction
variable promoted to 64 bit by the LSR. We can materialize
zero in a VGPR and still use SADDR form of the load.

Differential Revision: https://reviews.llvm.org/D101591

3 years ago[clangd] Split CC and refs limit and increase refs limit to 1000
Kirill Bobyrev [Wed, 5 May 2021 21:39:37 +0000 (23:39 +0200)]
[clangd] Split CC and refs limit and increase refs limit to 1000

Related discussion: https://github.com/clangd/clangd/discussions/761

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D101902

3 years agoGlobalISel: Update documentation
Matt Arsenault [Wed, 5 May 2021 17:55:24 +0000 (13:55 -0400)]
GlobalISel: Update documentation

3 years agoAMDGPU/GlobalISel: Remove unnecessary override
Matt Arsenault [Wed, 5 May 2021 02:29:30 +0000 (22:29 -0400)]
AMDGPU/GlobalISel: Remove unnecessary override

This is the same as the default implementation

3 years agoX86/GlobalISel: Use generic version of splitToValueTypes
Matt Arsenault [Sun, 28 Feb 2021 16:35:37 +0000 (11:35 -0500)]
X86/GlobalISel: Use generic version of splitToValueTypes

The custom insert of an unmerge and the callback weirdness should be
unnecessary. Since handleAssignments should now use
getRegisterTypeForCallingConv as SelectionDAG builder would, this
should now just be able to use the generic code. X86-32 relies on the
generated CCAssignFns not seeing illegal types and sharing code with
x86_64, so i64 values would incorrectly be assigned to 64-bit
registers.

3 years agoGlobalISel: Use DAG call lowering infrastructure in a more compatible way
Matt Arsenault [Tue, 13 Apr 2021 17:45:35 +0000 (13:45 -0400)]
GlobalISel: Use DAG call lowering infrastructure in a more compatible way

Unfortunately the current call lowering code is built on top of the
legacy MVT/DAG based code. However, GlobalISel was not using it the
same way. In short, the DAG passes legalized types to the assignment
function, and GlobalISel was passing the original raw type if it was
simple.

I do believe the DAG lowering is conceptually broken since it requires
picking a type up front before knowing how/where the value will be
passed. This ends up being a problem for AArch64, which wants to pass
i1/i8/i16 values as a different size if passed on the stack or in
registers.

The argument type decision is split across 3 different places which is
hard to follow. SelectionDAG builder uses
getRegisterTypeForCallingConv to pick a legal type, tablegen gives the
illusion of controlling the type, and the target may have additional
hacks in the C++ part of the call lowering. AArch64 hacks around this
by not using the standard AnalyzeFormalArguments and special casing
i1/i8/i16 by looking at the underlying type of the original IR
argument.

I believe people have generally assumed the calling convention code is
processing the original types, and I've discovered a number of dead
paths in several targets.

x86 actually relies on the opposite behavior from AArch64, and relies
on x86_32 and x86_64 sharing calling convention code where the 64-bit
cases implicitly do not work on x86_32 due to using the pre-legalized
types.

AMDGPU targets without legal i16/f16 have always used a broken ABI
that promotes to i32/f32. GlobalISel accidentally fixed this to be the
ABI we should have, but this fixes it so we're using the worse ABI
that is compatible with the DAG. Ideally we would fix the DAG to match
the old GlobalISel behavior, but I don't wish to fight that battle.

A new native GlobalISel call lowering framework should let the target
process the incoming types directly.

CCValAssigns select a "ValVT" and "LocVT" but the meanings of these
aren't entirely clear. Different targets don't use them consistently,
even within their own call lowering code. My current belief is the
intent was "ValVT" is supposed to be the legalized value type to use
in the end, and LocVT was supposed to be the ABI passed type
(which is also legalized).

With the default CCState::Analyze functions always passing the same
type for these arguments, these only differ when the TableGen part of
the lowering decides to promote the type from one legal type to
another. AArch64's i1/i8/i16 hack ends up inverting the meanings of
these values, so I had to add an additional hack to let the target
interpret how large the argument memory is.

Since targets don't consistently interpret ValVT and LocVT, this
doesn't produce quite equivalent code to the initial DAG
lowerings. I've opted to consistently interpret LocVT as the in-memory
size for stack passed values, and ValVT as the register type to assign
from that memory. We therefore produce extending loads directly out of
the IRTranslator, whereas the DAG would emit regular loads of smaller
values. This will also produce loads/stores that are wider than the
argument value if the allocated stack slot is larger (and there will
be undef padding bytes). If we had the optimizations to reduce
load/stores based on truncated values, this wouldn't produce a
different end result.

Since ValVT/LocVT are more consistently interpreted, we now will emit
more G_BITCASTs as requested by the CCAssignFn. For example AArch64
was directly assigning types to some physical vector registers which
according to the tablegen spec should have been cast to a vector
with a different element type.

This also moves the responsibility for inserting
G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the
generic code, which is closer to how SelectionDAGBuilder works.

I had to xfail an x86 test since I don't see a quick way to fix it
right now (I filed bug 50035 for this). It's broken independently of
this change, and only triggers since now we end up with more ands
which hit the improperly handled selection pattern.

I also observed that FP arguments that need promotion (e.g. f16 passed
as f32) are broken, and use regular G_TRUNC and G_ANYEXT.

TL;DR: the current call lowering infrastructure is bad and nobody has
ever understood how it chooses types.

3 years ago[mlir] Add polynomial approximation for math::ExpM1
Emilio Cota [Wed, 5 May 2021 21:26:50 +0000 (14:26 -0700)]
[mlir] Add polynomial approximation for math::ExpM1

This approximation matches the one in Eigen.

```
name                      old cpu/op  new cpu/op  delta
BM_mlir_Expm1_f32/10      90.9ns ± 4%  52.2ns ± 4%  -42.60%    (p=0.000 n=74+87)
BM_mlir_Expm1_f32/100      837ns ± 3%   231ns ± 4%  -72.43%    (p=0.000 n=79+69)
BM_mlir_Expm1_f32/1k      8.43µs ± 3%  1.58µs ± 5%  -81.30%    (p=0.000 n=77+83)
BM_mlir_Expm1_f32/10k     83.8µs ± 3%  15.4µs ± 5%  -81.65%    (p=0.000 n=83+69)
BM_eigen_s_Expm1_f32/10   68.8ns ±17%  72.5ns ±14%   +5.40%  (p=0.000 n=118+115)
BM_eigen_s_Expm1_f32/100   694ns ±11%   717ns ± 2%   +3.34%   (p=0.000 n=120+75)
BM_eigen_s_Expm1_f32/1k   7.69µs ± 2%  7.97µs ±11%   +3.56%   (p=0.000 n=95+117)
BM_eigen_s_Expm1_f32/10k  88.0µs ± 1%  89.3µs ± 6%   +1.45%   (p=0.000 n=74+106)
BM_eigen_v_Expm1_f32/10   44.3ns ± 6%  45.0ns ± 8%   +1.45%   (p=0.018 n=81+111)
BM_eigen_v_Expm1_f32/100   351ns ± 1%   360ns ± 9%   +2.58%    (p=0.000 n=73+99)
BM_eigen_v_Expm1_f32/1k   3.31µs ± 1%  3.42µs ± 9%   +3.37%   (p=0.000 n=71+100)
BM_eigen_v_Expm1_f32/10k  33.7µs ± 8%  34.1µs ± 9%   +1.04%    (p=0.007 n=99+98)
```
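
For context on why expm1 gets its own approximation rather than being
computed as exp(x) - 1: the subtraction cancels badly for small |x|. The
sketch below is illustrative only and is not the MLIR or Eigen code;
`naiveExpm1` and `correctedExpm1` are hypothetical names, and the log-based
correction is a commonly used trick that is assumed here, not taken from
this patch.

```cpp
#include <cmath>

// Hypothetical helper: the straightforward formula. For tiny x, exp(x)
// rounds to a value at or next to 1 and the subtraction loses the answer.
float naiveExpm1(float X) { return std::exp(X) - 1.0f; }

// Hypothetical helper: rescale (exp(x) - 1) by x / log(exp(x)) to cancel
// most of the rounding error that exp introduced.
float correctedExpm1(float X) {
  float U = std::exp(X);
  if (U == 1.0f)
    return X; // exp(x) rounded to exactly 1, so expm1(x) is x to float precision
  float Um1 = U - 1.0f;
  if (Um1 == -1.0f)
    return -1.0f; // exp(x) underflowed to 0
  if (std::isinf(U))
    return U; // exp(x) overflowed; avoid computing inf / inf below
  return Um1 * X / std::log(U);
}
```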

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D101852

3 years ago[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs
Michael Kitzan [Sat, 1 May 2021 02:50:54 +0000 (19:50 -0700)]
[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs

- Move the code preventing CSE of `isConvergent` instrs into
  `ProcessBlockCSE` (from `isProfitableToCSE`)
- Add comments explaining why `isConvergent` is used to prevent
  CSE of non-local instrs in MachineCSE and the new test

3 years ago[Utils][NFC] Rename replace-function-regex in update_cc_test_checks
Giorgis Georgakoudis [Wed, 5 May 2021 18:46:02 +0000 (11:46 -0700)]
[Utils][NFC] Rename replace-function-regex in update_cc_test_checks

This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value besides functions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101934

3 years agoPreserve metadata on masked intrinsics in auto-upgrade
Krzysztof Parzyszek [Fri, 23 Apr 2021 20:07:00 +0000 (15:07 -0500)]
Preserve metadata on masked intrinsics in auto-upgrade

When auto-upgrade was replacing a call to a masked intrinsic, it would
not copy the metadata from the original call.

If an intrinsic had metadata, but did not need any updates, the metadata
would stay, but if an update was needed, the metadata would end up being removed.
A similar effect could be observed with masked_expandload and
masked_compressstore, which at the moment are not handled by auto-upgrade:
the metadata remained untouched.

Differential Revision: https://reviews.llvm.org/D101201

3 years ago[NFC][X86][Codegen] Add some tests for 64-bit shift by (32-x)
Roman Lebedev [Wed, 5 May 2021 20:46:35 +0000 (23:46 +0300)]
[NFC][X86][Codegen] Add some tests for 64-bit shift by (32-x)

3 years ago[WebAssembly] Add SIMD const_splat intrinsics
Thomas Lively [Wed, 5 May 2021 20:46:45 +0000 (13:46 -0700)]
[WebAssembly] Add SIMD const_splat intrinsics

These intrinsics do not correspond to their own underlying instruction, but are
a convenience for the common case of materializing a constant vector that has
the same value in each lane.

Differential Revision: https://reviews.llvm.org/D101885

3 years ago[lld] Convert LLVM_CMAKE_PATH to a CMake path
Isuru Fernando [Wed, 28 Apr 2021 17:50:51 +0000 (12:50 -0500)]
[lld] Convert LLVM_CMAKE_PATH to a CMake path

Otherwise I get the following error on windows.
```
CMake Error at D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2 (set):
  Syntax error in cmake code at

    D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2

  when parsing string

    D:\bld\lld_1569206597988\_h_env\Library\lib\cmake\llvm

  Invalid character escape '\b'.

CMake Error at D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:100 (try_compile):
  Failed to configure test project build system.
Call Stack (most recent call first):
  D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:57 (__CHECK_SYMBOL_EXISTS_IMPL)
  D:/bld/lld_1569206597988/_h_env/Library/lib/cmake/llvm/HandleLLVMOptions.cmake:943 (check_symbol_exists)
  CMakeLists.txt:56 (include)
```

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D68158

3 years ago[mlir][tosa] Add tosa.depthwise lowering to existing linalg.depthwise_conv
Rob Suderman [Wed, 5 May 2021 20:10:49 +0000 (13:10 -0700)]
[mlir][tosa] Add tosa.depthwise lowering to existing linalg.depthwise_conv

Implements support for undilated depthwise convolution using the existing
depthwise convolution operation. Once convolutions migrate to yaml defined
versions we can rewrite for cleaner implementation.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D101579

3 years ago[scudo] Align objects with alignas
Vitaly Buka [Tue, 4 May 2021 23:34:59 +0000 (16:34 -0700)]
[scudo] Align objects with alignas

Operator new must align allocations for types with large alignment.

Before C++17 the behavior was implementation-defined, and both clang and g++
before 11 ignored alignment. Misaligned objects mysteriously crashed
tests on Ubuntu 14.

Alternatives are to compile with -std=c++17 or -faligned-new, but they were
discarded as less portable.
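
A minimal sketch of the alignas idea, not the actual scudo test code: the
type `Wide` and the helper names are hypothetical. Storage declared with
alignas is aligned by the compiler in C++11 and later, so constructing the
object into it with placement new does not depend on aligned operator new.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical over-aligned type, e.g. one that wants a whole cache line.
struct alignas(64) Wide {
  char Bytes[64];
};

// True if P is a multiple of Alignment.
bool isAligned(const void *P, std::size_t Alignment) {
  return reinterpret_cast<std::uintptr_t>(P) % Alignment == 0;
}

// Constructs a Wide inside alignas-qualified storage and reports whether
// the resulting object honours alignof(Wide).
bool constructsAligned() {
  alignas(Wide) unsigned char Storage[sizeof(Wide)];
  Wide *W = new (Storage) Wide();
  bool Ok = isAligned(W, alignof(Wide));
  W->~Wide();
  return Ok;
}
```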

Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D101874

3 years ago[libc++] [LIBCXX-DEBUG-FIXME] Stop using invalid iterators to insert into sets/maps.
Arthur O'Dwyer [Tue, 20 Apr 2021 19:38:57 +0000 (15:38 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Stop using invalid iterators to insert into sets/maps.

This simply applies Howard's commit 4c80bfbd53caf consistently
across all the associative and unordered container tests.

"unord.set/insert_hint_const_lvalue.pass.cpp" failed with `-D_LIBCPP_DEBUG=1`
before this patch; it was the only one that incorrectly reused
invalid iterator `e`. The others already used valid iterators
(generally `c.end()`); I'm just making them all match the same pattern
of usage: "e, then r, then c.end() for the rest."

Differential Revision: https://reviews.llvm.org/D101679

3 years ago[libc++] [LIBCXX-DEBUG-FIXME] std::advance shouldn't use ADL `>=` on the _Distance...
Arthur O'Dwyer [Tue, 20 Apr 2021 19:59:22 +0000 (15:59 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] std::advance shouldn't use ADL `>=` on the _Distance type.

Convert to a primitive type first; then use primitive `>=` on that value.
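
As an illustration only (this is not the libc++ implementation), the
pattern can be sketched as below. `Dist`, its `operator>=` and
`advanceSketch` are hypothetical; the point is that the comparison runs on
the iterator's difference_type, so a user-provided `>=` is never consulted.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical distance type that converts to a primitive integer.
struct Dist {
  std::ptrdiff_t Value;
  operator std::ptrdiff_t() const { return Value; }
};

// Deliberately misleading overload: an expression such as `D >= 0` on a
// Dist would select this instead of the built-in comparison.
inline bool operator>=(const Dist &, int) { return false; }

// Sketch of advance for bidirectional iterators: convert first, then use
// only primitive comparisons on the converted value.
template <class Iter, class Distance>
void advanceSketch(Iter &It, Distance Orig) {
  typedef typename std::iterator_traits<Iter>::difference_type Diff;
  Diff N = static_cast<Diff>(Orig);
  if (N >= 0) {
    for (; N > 0; --N)
      ++It;
  } else {
    for (; N < 0; ++N)
      --It;
  }
}
```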

Differential Revision: https://reviews.llvm.org/D101678

3 years ago[libc++] [LIBCXX-DEBUG-FIXME] Our `__debug_less` breaks some complexity guarantees.
Arthur O'Dwyer [Tue, 20 Apr 2021 22:21:59 +0000 (18:21 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Our `__debug_less` breaks some complexity guarantees.

`__debug_less` ends up running the comparator up-to-twice per comparison,
because whenever `(x < y)` it goes on to verify that `!(y < x)`.
This breaks the strict "Complexity" guarantees of algorithms like
`inplace_merge`, which we test in the test suite. So, just skip the
complexity assertions in debug mode.
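
The doubling can be seen with a small stand-in. This is a sketch assuming a
wrapper shaped like the description above, not the real `__debug_less`;
`CountingLess` and `CheckedLess` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical comparator that counts how often it is invoked.
struct CountingLess {
  std::size_t *Calls;
  bool operator()(int A, int B) const {
    ++*Calls;
    return A < B;
  }
};

// Hypothetical checking wrapper: whenever Comp(x, y) is true it also
// evaluates Comp(y, x) to verify the ordering is consistent.
template <class Compare> struct CheckedLess {
  Compare Comp;
  template <class T> bool operator()(const T &X, const T &Y) {
    bool Result = Comp(X, Y);
    if (Result && Comp(Y, X)) // second comparator call
      std::abort();           // both x < y and y < x: not a valid ordering
    return Result;
  }
};
```

A true result costs two comparator calls and a false result costs one, so
an algorithm that stays within its comparison budget with the raw
comparator can exceed it through the wrapper.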

Differential Revision: https://reviews.llvm.org/D101677

3 years ago[libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB.
Arthur O'Dwyer [Wed, 21 Apr 2021 01:51:41 +0000 (21:51 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB.

The range of char pointers [data, data+size] is a valid closed range,
but the range [begin, end) is valid only half-open.
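
A short sketch of the distinction; the helper names are hypothetical.
Reading `data()[size()]` yields the terminator and is well-defined since
C++11, whereas dereferencing `end()` is undefined behavior.

```cpp
#include <cstddef>
#include <string>

// Walks the half-open iterator range [begin, end) and never dereferences
// end().
std::size_t countViaIterators(const std::string &S) {
  std::size_t N = 0;
  for (std::string::const_iterator It = S.begin(); It != S.end(); ++It)
    ++N;
  return N;
}

// Reads the last element of the closed pointer range [data, data + size],
// which is the null terminator.
char terminatorViaPointer(const std::string &S) {
  return S.data()[S.size()];
}
```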

Differential Revision: https://reviews.llvm.org/D101676

3 years ago[libc++] [LIBCXX-DEBUG-FIXME] Fix an iterator-invalidation issue in string::assign.
Arthur O'Dwyer [Tue, 27 Apr 2021 13:10:04 +0000 (09:10 -0400)]
[libc++] [LIBCXX-DEBUG-FIXME] Fix an iterator-invalidation issue in string::assign.

This appears to be a bug in our string::assign: when assigning into
a longer string, from a shorter snippet of itself, we invalidate
iterators before doing the copy. We should invalidate them afterward.
Also drive-by improve the formatting of a function header.
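
The scenario looks roughly like this from the user's side. This is a sketch
with a hypothetical helper name, not the test added by the patch: the source
characters live inside the destination string, so they must still be
readable while the copy happens.

```cpp
#include <cstddef>
#include <string>

// Assigns to S a shorter slice of its own buffer, starting at Pos and
// running for Len characters. Iterators into S obtained before the call
// must be treated as invalid afterwards.
std::string assignFromSelf(std::string S, std::size_t Pos, std::size_t Len) {
  S.assign(S.data() + Pos, Len);
  return S;
}
```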

Differential Revision: https://reviews.llvm.org/D101675

3 years ago[libc++] Move <__sso_allocator> out of include/ into src/. NFCI.
Arthur O'Dwyer [Mon, 26 Apr 2021 13:56:50 +0000 (09:56 -0400)]
[libc++] Move <__sso_allocator> out of include/ into src/. NFCI.

This allocator is not intended for libc++'s users to use;
it's strictly an implementation detail of `src/locale.cpp`.
So, move it to the `src/include/` directory.

Drive-by const-qualify its comparison operators.

For consistency with `__hidden_allocator` (defined in `src/thread.cpp`),
do *not* remove it from "libcxx/lib/libc++unexp.exp",
"libcxx/utils/symcheck-blacklists/linux_blacklist.txt", etc.

Differential Revision: https://reviews.llvm.org/D101293

3 years ago[WebAssembly] Fix constness of pointer params to load intrinsics
Thomas Lively [Wed, 5 May 2021 20:16:55 +0000 (13:16 -0700)]
[WebAssembly] Fix constness of pointer params to load intrinsics

Update the SIMD builtin load functions to take pointers to const data and update
the intrinsics themselves to not cast away constness.

Differential Revision: https://reviews.llvm.org/D101884

3 years ago[WebAssembly] Update narrowing builtin function operand types
Thomas Lively [Wed, 5 May 2021 20:04:04 +0000 (13:04 -0700)]
[WebAssembly] Update narrowing builtin function operand types

Make the inputs to all narrowing builtins signed, which is how they are
interpreted by the underlying instructions (only the result changes sign
between instructions).

Differential Revision: https://reviews.llvm.org/D101883

3 years agoAdd fuzzer for Rust demangler
Tomasz Miąsko [Wed, 5 May 2021 19:29:03 +0000 (12:29 -0700)]
Add fuzzer for Rust demangler

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101823

3 years ago[lld-macho] Try to unbreak build
Jez Ng [Wed, 5 May 2021 19:46:42 +0000 (15:46 -0400)]
[lld-macho] Try to unbreak build

Looks like the PointerUnion casting cares about const-ness...

3 years ago[libcxx] [ci] Add a Windows CI configuration for a statically linked libc++
Martin Storsjö [Mon, 5 Apr 2021 21:17:30 +0000 (00:17 +0300)]
[libcxx] [ci] Add a Windows CI configuration for a statically linked libc++

On Windows, static vs DLL linking affects details in quite a few
cases, so it's good to have coverage for both cases.

Testing with static linking also increases coverage for a number of
cases and individual checks that have had to be waived for the DLL
case, and allows testing libc++experimental, increasing the number
of test cases actually executed by 180 (176 new tests from
libc++experimental and 4 ones that are XFAIL windows-dll).

Also drop the "generic-" prefix from these configuration names, as
they are perhaps not what the "generic" prefix was originally intended to
mean in the other generic-posix configurations.

Differential Revision: https://reviews.llvm.org/D101565

3 years ago[libc++] NFC: Remove stray semicolon in from-scratch config files
Louis Dionne [Wed, 5 May 2021 19:05:45 +0000 (15:05 -0400)]
[libc++] NFC: Remove stray semicolon in from-scratch config files

3 years ago[WebAssembly] Set alignment to 1 for SIMD memory intrinsics
Thomas Lively [Wed, 5 May 2021 18:59:33 +0000 (11:59 -0700)]
[WebAssembly] Set alignment to 1 for SIMD memory intrinsics

The WebAssembly SIMD intrinsics in wasm_simd128.h generally try not to require
any particular alignment for memory operations to be maximally flexible. For
builtin memory access functions and their corresponding LLVM IR intrinsics,
there's no way to set the expected alignment, so the best we can do is set the
alignment to 1 in the backend. This change means that the alignment hints in the
emitted code will no longer be incorrect when users use the intrinsics to access
unaligned data.

Differential Revision: https://reviews.llvm.org/D101850

3 years ago[libomptarget] Initial documentation on amdgpu offload
Jon Chesterfield [Wed, 5 May 2021 18:58:51 +0000 (19:58 +0100)]
[libomptarget] Initial documentation on amdgpu offload

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101927

3 years ago[hwasan] Fix missing synchronization in AllocThread.
Evgenii Stepanov [Wed, 5 May 2021 02:16:30 +0000 (19:16 -0700)]
[hwasan] Fix missing synchronization in AllocThread.

The problem was introduced in D100348.

It's really hard to trigger the bug in a stress test - the race is just too
narrow - but the new checks in Thread::Init should at least provide a usable
diagnostic if the problem ever returns.

Differential Revision: https://reviews.llvm.org/D101881

3 years ago[lld-macho] Preliminary support for ARM_RELOC_BR24
Jez Ng [Wed, 5 May 2021 18:40:41 +0000 (14:40 -0400)]
[lld-macho] Preliminary support for ARM_RELOC_BR24

ARM_RELOC_BR24 is used for BL/BLX instructions from within ARM (i.e. not
Thumb) code. This diff just handles the basic case: branches from ARM to
ARM, or from ARM to Thumb where no shimming is required. (See comments
in ARM.cpp for why shims are required.)
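
For readers unfamiliar with the encoding, the ARM-to-ARM case can be
sketched as follows. This is not the lld code; `encodeArmBranch24` is a
hypothetical helper that relies only on the architectural encoding of B/BL:
a signed 24-bit word offset relative to the instruction address plus 8. The
ARM-to-Thumb BLX form is not covered.

```cpp
#include <cstdint>

// Hypothetical helper: writes the imm24 field of an ARM-mode B/BL
// instruction so that it branches from InsnAddr to TargetAddr.
// Returns false if the target is misaligned or out of range.
bool encodeArmBranch24(uint32_t &Insn, uint32_t InsnAddr, uint32_t TargetAddr) {
  // In ARM state the PC reads as the instruction address plus 8.
  int64_t Delta = int64_t(TargetAddr) - (int64_t(InsnAddr) + 8);
  if (Delta % 4 != 0)
    return false; // ARM-mode targets are word-aligned
  const int64_t Limit = int64_t(1) << 25; // 24-bit signed field, scaled by 4
  if (Delta < -Limit || Delta >= Limit)
    return false;
  uint32_t Imm24 = uint32_t(Delta / 4) & 0x00ffffffu;
  Insn = (Insn & 0xff000000u) | Imm24; // keep condition and opcode bits
  return true;
}
```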

Note: I will likely be deprioritizing ARM work for the near future to
focus on other parts of LLD. Apologies for the half-done state of this;
I'm just trying to wrap up what I've already worked on.

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D101814

3 years ago[lld-macho] Have --reproduce account for path rerooting
Jez Ng [Wed, 5 May 2021 18:38:36 +0000 (14:38 -0400)]
[lld-macho] Have --reproduce account for path rerooting

We need to account for path rerooting when generating the response
file. We could either reroot the paths before generating the file, or pass
through the original filenames and change just the syslibroot. I've opted for
the latter, in order that the reproduction run more closely mirrors the
original.

We must also be careful *not* to make an absolute path relative if it is
shadowed by a rerooted path. See repro6.tar in reroot-path.s for
details.

I've moved the call to `createResponseFile()` after the initialization of
`config->systemLibraryRoots`, since it now needs to know what those roots are.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D101224

3 years agoMake clangd CompletionModel not depend on directory layout.
Harald van Dijk [Wed, 5 May 2021 18:25:34 +0000 (19:25 +0100)]
Make clangd CompletionModel not depend on directory layout.

The current code accounts for two possible layouts, but there is at
least a third supported layout: clang-tools-extra may also be checked
out as clang/tools/extra with the releases, which was not yet handled.
Rather than treating that as a special case, use the location of
CompletionModel.cmake to handle all three cases. This should address the
problems that prompted D96787 and the problems that prompted the
proposed revert D100625.

Reviewed By: usaxena95

Differential Revision: https://reviews.llvm.org/D101851

3 years ago[Clang] remove text extension from diag::err_drv_invalid_value_with_suggestion
Nick Desaulniers [Wed, 5 May 2021 18:01:33 +0000 (11:01 -0700)]
[Clang] remove text extension from diag::err_drv_invalid_value_with_suggestion

This hinders translations, as per:
https://clang.llvm.org/docs/InternalsManual.html#the-format-string

Reviewed By: MaskRay, xbolva00

Differential Revision: https://reviews.llvm.org/D101387

3 years ago[NFC][SimplifyCFG] Update documentation comments for SinkCommonCodeFromPredecessors...
Roman Lebedev [Wed, 5 May 2021 17:34:43 +0000 (20:34 +0300)]
[NFC][SimplifyCFG] Update documentation comments for SinkCommonCodeFromPredecessors() after 1886aad

3 years ago[llvm-objcopy][ELF] --only-keep-debug: set offset/size of segments with no sections...
Fangrui Song [Wed, 5 May 2021 17:26:57 +0000 (10:26 -0700)]
[llvm-objcopy][ELF] --only-keep-debug: set offset/size of segments with no sections to zero

PR50160: we currently ignore non-PT_PHDR segments with no sections, not
accounting for its p_offset and p_filesz: this can cause an out-of-bounds write
in `writeSegmentData` if the p_offset+p_filesz is larger than the total file
size.

This can be fixed by setting p_offset=p_filesz=0. The logic nicely unifies with
the logic added in D90897.
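
The failure mode and the fix can be sketched like this. It is a simplified
model, not the llvm-objcopy code: `SegmentSketch` and both helpers are
hypothetical, and the PT_PHDR exception is left out.

```cpp
#include <cstdint>

// Hypothetical, simplified view of a program header.
struct SegmentSketch {
  uint64_t Offset;   // p_offset
  uint64_t FileSize; // p_filesz
  bool HasSections;  // whether any section is assigned to the segment
};

// True if [Offset, Offset + FileSize) lies inside a file of TotalSize
// bytes. Written to avoid overflow in Offset + FileSize.
bool fitsInFile(const SegmentSketch &S, uint64_t TotalSize) {
  return S.Offset <= TotalSize && S.FileSize <= TotalSize - S.Offset;
}

// Mirrors the described fix: a segment with no sections occupies no bytes
// in the output, so its offset and size are set to zero.
void zeroEmptySegment(SegmentSketch &S) {
  if (!S.HasSections) {
    S.Offset = 0;
    S.FileSize = 0;
  }
}
```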

Reviewed By: jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D101560

3 years agoRISCV: clang-format RISC-V AsmParser (NFC)
Saleem Abdulrasool [Wed, 5 May 2021 17:15:14 +0000 (10:15 -0700)]
RISCV: clang-format RISC-V AsmParser (NFC)

This corrects a few issues identified by `clang-format`.  This is meant
to be preparation for a subsequent change.

3 years ago[NFC][X86][CostModel] Add tests for byteswap intrinsic
Roman Lebedev [Wed, 5 May 2021 16:21:22 +0000 (19:21 +0300)]
[NFC][X86][CostModel] Add tests for byteswap intrinsic

3 years ago[MC] Untangle MCContext and MCObjectFileInfo
Philipp Krones [Wed, 5 May 2021 17:03:02 +0000 (10:03 -0700)]
[MC] Untangle MCContext and MCObjectFileInfo

This untangles the MCContext and the MCObjectFileInfo. There is a circular
dependency between MCContext and MCObjectFileInfo. Currently this dependency
also exists during construction: You can't construct a MOFI without a MCContext
without constructing the MCContext with a dummy version of that MOFI first.
This removes this dependency during construction. In a perfect world,
MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the
MCContext, like other MC information. This is future work.

This also shifts/adds more information to the MCContext making it more
available to the different targets. Namely:

- TargetTriple
- ObjectFileType
- SubtargetInfo
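
The construction order this enables can be sketched with two stand-in
classes. `ContextSketch` and `ObjectFileInfoSketch` are hypothetical and
only model the dependency; they are not the MC API.

```cpp
#include <string>

class ObjectFileInfoSketch;

// Stand-in for the context: constructible on its own and owns target-wide
// facts such as the triple.
class ContextSketch {
public:
  explicit ContextSketch(const std::string &T) : Triple(T) {}
  void setObjectFileInfo(const ObjectFileInfoSketch *Info) { OFI = Info; }
  const ObjectFileInfoSketch *getObjectFileInfo() const { return OFI; }
  const std::string &getTargetTriple() const { return Triple; }

private:
  std::string Triple;
  const ObjectFileInfoSketch *OFI = nullptr;
};

// Stand-in for the object file info: initialised from an existing context
// and attached to it afterwards, so no dummy instance is needed up front.
class ObjectFileInfoSketch {
public:
  void init(ContextSketch &C) { Ctx = &C; }
  ContextSketch *getContext() const { return Ctx; }

private:
  ContextSketch *Ctx = nullptr;
};
```

With this shape the context is built first, the object file info is then
initialised from it, and finally the context is told about the info.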

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101462

3 years ago[LV] Workaround PR49900 (a crash due to analyzing partially mutated IR)
Philip Reames [Wed, 5 May 2021 16:55:09 +0000 (09:55 -0700)]
[LV] Workaround PR49900 (a crash due to analyzing partially mutated IR)

LoopVectorize has a fairly deeply baked in design problem where it will try to query analysis (primarily SCEV, but also ValueTracking) in the midst of mutating IR. In particular, the intermediate IR state does not represent the semantics of the original (or final) program.

Fixing this for real is hard, but all of the cases seen so far share a common symptom. In cases seen to date, the analysis being queried is the computation of the original loop's trip count. We can fix this particular instance of the issue by simply computing the trip count early, and caching it.

I want to be really clear that this is nothing but a workaround. It does nothing to fix the root issue, and at best, delays the time until we have to fix this for real. Florian and I have discussed an eventual solution in the review comments for https://reviews.llvm.org/D100663, but it's a lot of work.

Test taken from https://reviews.llvm.org/D100663.

Differential Revision: https://reviews.llvm.org/D101487

3 years ago[mlir][ArmSVE] Add masked arithmetic operations
Javier Setoain [Mon, 19 Apr 2021 14:37:29 +0000 (15:37 +0100)]
[mlir][ArmSVE] Add masked arithmetic operations

These instructions map to SVE-specific intrinsics that accept a
predicate operand to support control flow in vector code.

Differential Revision: https://reviews.llvm.org/D100982