platform/upstream/llvm.git
2 years ago[AMDGPU] Initialize a couple more Subtarget fields
Jay Foad [Wed, 13 Apr 2022 15:24:31 +0000 (16:24 +0100)]
[AMDGPU] Initialize a couple more Subtarget fields

This is just for consistency. The fields are never actually used
so it is NFC.

2 years ago[libunwind][AIX] implementation of the unwinder for AIX
Xing Xue [Wed, 13 Apr 2022 15:29:37 +0000 (11:29 -0400)]
[libunwind][AIX] implementation of the unwinder for AIX

Summary:
This is an add-on patch to address comments.
- Replace #elif in file <assembly.h> with #else as suggested;
- Reversed the indentation changes in the main patch.

Differential Revision: https://reviews.llvm.org/D100132

2 years agoRecommit "[LICM] Only create load in pre-header when promoting load."
Florian Hahn [Wed, 13 Apr 2022 15:20:39 +0000 (17:20 +0200)]
Recommit "[LICM] Only create load in pre-header when promoting load."

This reverts the revert commit 1ddc719680c21f3.

This version of the patch sets the initial available value to poison,
which resolves an issue with the SSAUpdater breaking LCSSA form.

2 years ago[gn build] Port a85da649b9ac
LLVM GN Syncbot [Wed, 13 Apr 2022 15:05:20 +0000 (15:05 +0000)]
[gn build] Port a85da649b9ac

2 years ago[libunwind][AIX] implementation of the unwinder for AIX
Xing Xue [Wed, 13 Apr 2022 15:01:59 +0000 (11:01 -0400)]
[libunwind][AIX] implementation of the unwinder for AIX

Summary:
This patch contains the implementation of the unwinder for IBM AIX.

AIX does not support the eh_frame section. Instead, the traceback table located at the end of each function provides the information for stack unwinding and EH. In this patch, the macro _LIBUNWIND_SUPPORT_TBTAB_UNWIND is used to guard code for AIX traceback table based unwinding. Functions getInfoFromTBTable() and stepWithTBTable() are added to get the EH information from the traceback table and to step up the stack respectively.

There are two kinds of LSDA information for EH on AIX, the state table and the range table. The state table is used by the previous version of the IBM XL compiler, i.e., xlC and xlclang++. The DWARF based range table is used by AIX clang++. The traceback table has flags to differentiate these cases. For the range table, relative addresses are calculated using a base of DW_EH_PE_datarel, which is the TOC base of the module where the function of the current frame belongs.

Two personality routines are employed to handle these two different LSDAs, __xlcxx_personality_v0() for the state table and __xlcxx_personality_v1() for the range table. Since the traceback table does not have the information of the personality for the state table approach, its personality __xlcxx_personality_v0() is dynamically resolved as the handler for the state table. For the range table, the locations of the LSDA and its associated personality routine are found in the traceback table.

Assembly code for 32- and 64-bit PowerPC in UnwindRegistersRestore.S and UnwindRegistersSave.S is modified so that it can be consumed by the GNU flavor assembler and the AIX assembler. The restoration of vector registers does not check VRSAVE on AIX because VRSAVE is not used in the AIX ABI.

Reviewed by: MaskRay, compnerd, cebowleratibm, sfertile, libunwind

Differential Revision: https://reviews.llvm.org/D100132

2 years ago[CUDA][HIP] Fix host used external kernel in archive
Yaxun (Sam) Liu [Sat, 9 Apr 2022 03:56:07 +0000 (23:56 -0400)]
[CUDA][HIP] Fix host used external kernel in archive

For -fgpu-rdc, a host function may call an external kernel
which is defined in an archive of bitcode. Since this external
kernel is only referenced in the host function, the device
bitcode does not contain a reference to it, so the linker
will not try to resolve this external kernel in the archive.

To fix this issue, host-used external kernels and device
variables are tracked. A global array containing pointers
to these external kernels and variables is emitted, which
serves as an artificial reference to the external kernels
and variables used by the host.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123441

2 years ago[SimplifyLibCalls] Don't mark memchr() memory as fully dereferenceable
Nikita Popov [Wed, 13 Apr 2022 09:28:45 +0000 (11:28 +0200)]
[SimplifyLibCalls] Don't mark memchr() memory as fully dereferenceable

C11 specifies memchr() as follows:

> The memchr function locates the first occurrence of c (converted
> to an unsigned char) in the initial n characters (each interpreted
> as unsigned char) of the object pointed to by s. The implementation
> shall behave as if it reads the characters sequentially and stops
> as soon as a matching character is found.

In particular, it is well-defined to specify a memchr size larger
than the underlying object, as long as the character is found before
the end of the object.
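
A minimal sketch of the case described above, assuming a hypothetical helper
`offset_of_first` (not an LLVM or libc API). The size argument exceeds the
object, but the match is found before the end of the object:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the offset of the first byte equal to `c` in `s`, or -1 if
// memchr reports no match within the first `n` bytes.
// Hypothetical helper for illustration only.
long offset_of_first(const char *s, int c, std::size_t n) {
  assert(s != nullptr && "memchr requires a valid pointer");
  const void *hit = std::memchr(s, c, n);
  if (hit == nullptr)
    return -1;
  return static_cast<long>(static_cast<const char *>(hit) - s);
}

// The object is 4 bytes long ("abc" plus the terminator), but the size
// argument is 100. Per the C11 wording quoted above, the search behaves
// as if it stops at 'b', so an optimizer must not assume that all 100
// bytes are dereferenceable.
long demo_oversized_search() {
  static const char object[] = "abc";
  return offset_of_first(object, 'b', 100);
}
```

Some compilers may warn about the oversized bound here; that is a
diagnostic choice, not a statement about the call being undefined.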

Differential Revision: https://reviews.llvm.org/D123665

2 years ago[clang-format] Fix SeparateDefinitionBlocks breaking up function-try-block.
Marek Kurdej [Fri, 25 Mar 2022 10:01:40 +0000 (11:01 +0100)]
[clang-format] Fix SeparateDefinitionBlocks breaking up function-try-block.

Fixes https://github.com/llvm/llvm-project/issues/54536.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D122468

2 years ago[NFC] Simplify /noimplib argument logic
Tobias Hieta [Wed, 13 Apr 2022 14:39:22 +0000 (16:39 +0200)]
[NFC] Simplify /noimplib argument logic

2 years ago[LLD][COFF] Add support for /noimplib
Tobias Hieta [Tue, 12 Apr 2022 11:59:45 +0000 (13:59 +0200)]
[LLD][COFF] Add support for /noimplib

Mostly for compatibility reasons with link.exe this flag
makes sure we don't write an implib - not even when /implib
is also passed, that's how link.exe works.

Differential Revision: https://reviews.llvm.org/D123591

2 years ago[SimplifyCFG] add tests for switch to select; NFC
chenglin.bi [Wed, 13 Apr 2022 14:34:30 +0000 (22:34 +0800)]
[SimplifyCFG] add tests for switch to select; NFC

Baseline tests for D122485 (issue #39957)

2 years agoRevert "[SimplifyCFG] add tests for switch to select; NFC"
chenglin.bi [Wed, 13 Apr 2022 14:32:22 +0000 (22:32 +0800)]
Revert "[SimplifyCFG] add tests for switch to select; NFC"

This reverts commit e2d77a160c5b8141eca3db1fca6dafd97e78288d.

2 years ago[OpenMP] Lowering to MLIR of ordered threads directive
PeixinQiao [Wed, 13 Apr 2022 14:30:52 +0000 (22:30 +0800)]
[OpenMP] Lowering to MLIR of ordered threads directive

This patch supports lowering parse-tree to MLIR of ordered threads
directive following Section 2.19.9 of the OpenMP 5.1 standard.

This is part of the upstreaming effort from the fir-dev branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project

Reviewed By: shraiysh

Differential Revision: https://reviews.llvm.org/D123590

2 years ago[flang][OpenMP] Add semantic checks of nesting of region about ordered construct
PeixinQiao [Wed, 13 Apr 2022 14:27:58 +0000 (22:27 +0800)]
[flang][OpenMP] Add semantic checks of nesting of region about ordered construct

This patch supports the following checks for ORDERED construct:

```
[5.1] 2.19.9 ORDERED Construct
The worksharing-loop or worksharing-loop SIMD region to which an ordered
region corresponding to an ordered construct without a depend clause
binds must have an ordered clause without the parameter specified on the
corresponding worksharing-loop or worksharing-loop SIMD directive.
The worksharing-loop region to which an ordered region that corresponds
to an ordered construct with any depend clauses binds must have an
ordered clause with the parameter specified on the corresponding
worksharing-loop directive.
An ordered construct with the depend clause specified must be closely
nested inside a worksharing-loop (or parallel worksharing-loop)
construct.
An ordered region that corresponds to an ordered construct with the simd
clause specified must be closely nested inside a simd or
worksharing-loop SIMD region.
```

Reviewed By: kiranchandramohan, shraiysh, NimishMishra

Differential Revision: https://reviews.llvm.org/D113399

2 years ago[mlir][docs] Fix broken links
Marius Brehler [Wed, 13 Apr 2022 14:17:52 +0000 (16:17 +0200)]
[mlir][docs] Fix broken links

2 years ago[libc++] Mark completed paper as complete
Louis Dionne [Wed, 13 Apr 2022 14:16:39 +0000 (10:16 -0400)]
[libc++] Mark completed paper as complete

2 years ago[gn build] Port 2fb026ee4d1a
LLVM GN Syncbot [Wed, 13 Apr 2022 13:52:22 +0000 (13:52 +0000)]
[gn build] Port 2fb026ee4d1a

2 years ago[libc++] Post-commit adjustments after rebasing D117656
Louis Dionne [Tue, 12 Apr 2022 22:56:40 +0000 (18:56 -0400)]
[libc++] Post-commit adjustments after rebasing D117656

2 years agoImplement move_sentinel and C++20 move_iterator.
Arthur O'Dwyer [Wed, 19 Jan 2022 11:26:52 +0000 (06:26 -0500)]
Implement move_sentinel and C++20 move_iterator.

Differential Revision: https://reviews.llvm.org/D117656

2 years ago[lldb] Fixup af921006d3792f for non-linux platforms
Pavel Labath [Wed, 13 Apr 2022 13:37:44 +0000 (15:37 +0200)]
[lldb] Fixup af921006d3792f for non-linux platforms

2 years ago[SimplifyCFG] add tests for switch to select; NFC
chenglin.bi [Wed, 13 Apr 2022 13:27:06 +0000 (21:27 +0800)]
[SimplifyCFG] add tests for switch to select; NFC

Baseline tests for D122968 (issue #54649)

2 years ago[gn build] Port 2b424f4ea82e
LLVM GN Syncbot [Wed, 13 Apr 2022 13:04:50 +0000 (13:04 +0000)]
[gn build] Port 2b424f4ea82e

2 years ago[libc++] Implement ranges::filter_view
Louis Dionne [Wed, 14 Jul 2021 21:01:25 +0000 (17:01 -0400)]
[libc++] Implement ranges::filter_view

Differential Revision: https://reviews.llvm.org/D109086

2 years ago[SystemZ] Implement adjustInliningThreshold().
Jonas Paulsson [Fri, 14 Jan 2022 00:52:16 +0000 (18:52 -0600)]
[SystemZ] Implement adjustInliningThreshold().

This patch boosts the inlining threshold for a particular type of function:
one that uses an incoming argument only as a memcpy source.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D121341

2 years ago[clangd] Export preamble AST and serialized size as metrics
Sam McCall [Wed, 13 Apr 2022 12:02:19 +0000 (14:02 +0200)]
[clangd] Export preamble AST and serialized size as metrics

Differential Revision: https://reviews.llvm.org/D123672

2 years ago[lldb] Remove the global platform list
Pavel Labath [Fri, 25 Feb 2022 13:47:27 +0000 (14:47 +0100)]
[lldb] Remove the global platform list

This patch moves the platform creation and selection logic into the
per-debugger platform lists. I've tried to keep functional changes to a
minimum -- the main (only) observable difference in this change is that
APIs that select a platform by name (e.g.,
Debugger::SetCurrentPlatform) will not automatically pick up a platform
associated with another debugger (or no debugger at all).

I've also added several tests for this functionality -- one of the
pleasant consequences of the debugger isolation is that it is now
possible to test the platform selection and creation logic.

This is a product of the discussion at
<https://discourse.llvm.org/t/multiple-platforms-with-the-same-name/59594>.

Differential Revision: https://reviews.llvm.org/D120810

2 years ago[compiler-rt] Don't explicitly ad-hoc sign dylibs on APPLE if ld is new enough
Nico Weber [Mon, 11 Apr 2022 01:28:29 +0000 (21:28 -0400)]
[compiler-rt] Don't explicitly ad-hoc sign dylibs on APPLE if ld is new enough

ld64 implicitly ad-hoc code-signs as of Xcode 12, and `strip` and friends know
how to keep this special ad-hoc signature valid.

So this should have no effective behavior change, except that you can now strip
libclang_rt.asan_osx_dynamic.dylib and it'll still have a valid ad-hoc
signature, instead of strip printing "warning: changes being made to the file
will invalidate the code signature in:" and making the ad-hoc code signature
invalid.

Differential Revision: https://reviews.llvm.org/D123475

2 years ago[mlir][Tensor] Fix wrong comment (NFC)
Adrian Kuegel [Wed, 13 Apr 2022 12:30:01 +0000 (14:30 +0200)]
[mlir][Tensor] Fix wrong comment (NFC)

2 years agoCorrectly diagnose prototype redeclaration errors in C
Aaron Ballman [Wed, 13 Apr 2022 12:20:19 +0000 (08:20 -0400)]
Correctly diagnose prototype redeclaration errors in C

We did not implement C99 6.7.5.3p15 fully in that we missed the rule
for compatible function types where a prior declaration has a prototype
and a subsequent definition (not just declaration) has an empty
identifier list or an identifier list with a mismatch in parameter
arity. This addresses that situation by issuing an error on code like:

void f(int);
void f() {} // type conflicts with previous declaration

(Note: we already diagnose the other type conflict situations
appropriately, this was the only situation we hadn't covered that I
could find.)

2 years ago[X86] Convert unsigned int 0 to floating-point with FILD instruction.
Liu, Chen3 [Wed, 13 Apr 2022 07:25:12 +0000 (15:25 +0800)]
[X86] Convert unsigned int 0 to floating-point with FILD instruction.

Unsigned int 0 is converted to float/double -0.0 when the rounding
mode is set to 'FE_DOWNWARD'. Use the FILD instruction instead of SSE
instructions on 32-bit targets if strictfp is enabled.
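
A minimal sketch of why a -0.0 can appear, under an assumption that is not
stated in the commit message: that the SSE sequence converts by subtracting
a magic constant (here 2^52) from a biased value. Under IEEE 754
round-toward-negative, the exact difference of two equal values is -0.0.
The function name is hypothetical.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Returns true if (2^52 + 0) - 2^52 evaluates to negative zero while the
// rounding mode is FE_DOWNWARD. The volatile variables keep the compiler
// from folding the subtraction at compile time or moving it across the
// rounding-mode changes.
bool zero_converts_to_negative_zero() {
  const int saved = std::fegetround();
  const int rc = std::fesetround(FE_DOWNWARD);
  assert(rc == 0 && "FE_DOWNWARD must be supported for this sketch");
  (void)rc;
  volatile double magic = 4503599627370496.0;   // 2^52
  volatile double biased = 4503599627370496.0;  // 2^52 + x, with x == 0
  volatile double result = biased - magic;      // exact zero; sign follows the rounding mode
  std::fesetround(saved);                       // restore the caller's mode
  return std::signbit(result);
}
```

An integer load-and-convert instruction such as FILD does not go through
this subtraction, which fits the fix described above.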

Differential Revision: https://reviews.llvm.org/D123660

2 years ago[DAG] Enable SimplifyVBinOp folds on add/sub sat intrinsics
Simon Pilgrim [Wed, 13 Apr 2022 11:53:15 +0000 (12:53 +0100)]
[DAG] Enable SimplifyVBinOp folds on add/sub sat intrinsics

2 years ago[AMDGPU][MC][GFX10] Removed unsupported 64bit DPP opcodes
Dmitry Preobrazhensky [Wed, 13 Apr 2022 11:16:40 +0000 (14:16 +0300)]
[AMDGPU][MC][GFX10] Removed unsupported 64bit DPP opcodes

Removed 64bit DPP opcodes from asm matcher tables.

Differential Revision: https://reviews.llvm.org/D123611

2 years ago[X86] Add tests showing failure to pull common shuffles through add/sub sat intrinsics
Simon Pilgrim [Wed, 13 Apr 2022 11:35:38 +0000 (12:35 +0100)]
[X86] Add tests showing failure to pull common shuffles through add/sub sat intrinsics

2 years ago[SimplifyCFG] make a debug option for case max when converting switch to select
Sanjay Patel [Wed, 13 Apr 2022 10:54:08 +0000 (06:54 -0400)]
[SimplifyCFG] make a debug option for case max when converting switch to select

This should be "NFC" as written, but it will make D122485 smaller
and give us more flexibility to experiment with optimization level
vs. compile-time.

Differential Revision: https://reviews.llvm.org/D123625

2 years ago[InlineAsm] Add support for address operands ("p").
Jonas Paulsson [Tue, 22 Mar 2022 09:39:07 +0000 (10:39 +0100)]
[InlineAsm] Add support for address operands ("p").

This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which has been marked with a "TODO" comment there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220

2 years ago[flang][nfc] Simplify TargetMachine initialisation
Andrzej Warzynski [Wed, 13 Apr 2022 10:28:49 +0000 (10:28 +0000)]
[flang][nfc] Simplify TargetMachine initialisation

2 years ago[AMDGPU][GFX10] Enabled op_sel for v_add_nc_u16 and v_sub_nc_u16
Dmitry Preobrazhensky [Wed, 13 Apr 2022 10:09:11 +0000 (13:09 +0300)]
[AMDGPU][GFX10] Enabled op_sel for v_add_nc_u16 and v_sub_nc_u16

Differential Revision: https://reviews.llvm.org/D123594

2 years ago[BOLT] Fix two aarch64 tests
Vladislav Khmelevsky [Thu, 7 Apr 2022 18:56:05 +0000 (21:56 +0300)]
[BOLT] Fix two aarch64 tests

The tls-lld test might break because the compiler might optimize away the
PLT function call and use the address directly from the GOT. The test is
removed since plt-gnu-ld checks the same functionality plus versioning
symbol matching, so there is no need to keep both tests.
The toolchain might also optimize relocations in the runtime-relocs test,
so replace the test compilation with YAML files.

Differential Revision: https://reviews.llvm.org/D123332

2 years ago[DAG] Add non-uniform vector support to (shl (srl x, c1), c2) -> (and (shift x, c3))
Simon Pilgrim [Wed, 13 Apr 2022 10:37:24 +0000 (11:37 +0100)]
[DAG] Add non-uniform vector support to (shl (srl x, c1), c2) -> (and (shift x, c3))

Another part of D77804 yak shaving
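
A per-lane sketch of the fold named in the title, written on plain 32-bit
integers rather than DAG nodes. All function names are hypothetical, and
"non-uniform" is taken to mean that each lane has its own c1 and c2:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Original form for one lane: (x >> c1) << c2.
std::uint32_t shl_of_srl(std::uint32_t x, unsigned c1, unsigned c2) {
  assert(c1 < 32 && c2 < 32);
  return (x >> c1) << c2;
}

// Folded form: a single shift by |c1 - c2| followed by a mask that clears
// the bits the two original shifts would have discarded.
std::uint32_t shift_then_mask(std::uint32_t x, unsigned c1, unsigned c2) {
  assert(c1 < 32 && c2 < 32);
  const std::uint32_t mask = (0xFFFFFFFFu >> c1) << c2;
  const std::uint32_t shifted =
      (c2 >= c1) ? (x << (c2 - c1)) : (x >> (c1 - c2));
  return shifted & mask;
}

// Checks that both forms agree on four lanes with different shift amounts.
bool demo_lanes_agree() {
  const std::array<std::uint32_t, 4> x = {0xDEADBEEFu, 0x12345678u,
                                          0xFFFFFFFFu, 1u};
  const std::array<unsigned, 4> c1 = {3u, 8u, 31u, 0u};
  const std::array<unsigned, 4> c2 = {5u, 2u, 31u, 7u};
  for (std::size_t i = 0; i < x.size(); ++i)
    if (shl_of_srl(x[i], c1[i], c2[i]) != shift_then_mask(x[i], c1[i], c2[i]))
      return false;
  return true;
}
```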

Differential Revision: https://reviews.llvm.org/D123523

2 years ago[flang][driver] Add support for generating LLVM bytecode files
Andrzej Warzynski [Wed, 6 Apr 2022 11:59:28 +0000 (11:59 +0000)]
[flang][driver] Add support for generating LLVM bytecode files

Support for generating LLVM BC files is added in Flang's compiler and
frontend drivers. This requires the `BitcodeWriterPass` pass to be run
on the input LLVM IR module and is implemented as a dedicated frontend
action. The new functionality as seen by the user (compiler driver):
```
flang-new -c -emit-llvm file.f90
```
or (frontend driver):
```
flang-new -fc1 -emit-llvm-bc file.f90
```

The new behaviour is consistent with `clang` and `clang -cc1`.

Differential Revision: https://reviews.llvm.org/D123211

2 years ago[RISCV][NFC] Reorganize check prefixes in some tests to reduce redundant lines
Ping Deng [Wed, 13 Apr 2022 09:55:07 +0000 (09:55 +0000)]
[RISCV][NFC] Reorganize check prefixes in some tests to reduce redundant lines

Reviewed By: benshi001, craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D123176

2 years ago[AArch64] Add missing HasNEON predicate in scalar FABD patterns
Alex Richardson [Tue, 12 Apr 2022 20:59:41 +0000 (20:59 +0000)]
[AArch64] Add missing HasNEON predicate in scalar FABD patterns

I was trying to compile with -march=+nosimd and hit the following assertion:
`Attempting to emit FABD64 instruction but the Feature_HasNEON predicate(s) are not met`.
This adds a HasNEON predicate to the patterns which was omitted in commit
21d9b33d62772c58267cc0aa725e35ac9a4661db for some reason.
The new code generation matches GCC with -mcpu=<cpu>+nosimd:
https://godbolt.org/z/n1Y7xh5jo

Differential Revision: https://reviews.llvm.org/D123491

2 years ago[AArch64] Baseline test for D123491
Alex Richardson [Mon, 11 Apr 2022 08:01:05 +0000 (08:01 +0000)]
[AArch64] Baseline test for D123491

2 years ago[AutoUpgrade] Don't lose attributes when upgrading mem intrinsics
Alex Richardson [Thu, 17 Mar 2022 23:28:11 +0000 (23:28 +0000)]
[AutoUpgrade] Don't lose attributes when upgrading mem intrinsics

The original AutoUpgrade code from 1e68724d24ba38de7c7cdb2e1939d78c8b37cc0d
did not retain existing attributes. I noticed this in some downstream test
cases, but it turns out there are also two affected test cases upstream.

Differential Revision: https://reviews.llvm.org/D121971

2 years ago[AArch64][SVE] Fix lowering of "fcmp ueq/one" when using SVE
David Sherwood [Thu, 17 Mar 2022 11:51:59 +0000 (11:51 +0000)]
[AArch64][SVE] Fix lowering of "fcmp ueq/one" when using SVE

We were previously lowering to the incorrect instructions for the
setcc DAG node when using the SETUEQ and SETONE floating point
condition codes. I have fixed this by marking the SETONE code
as Expand and letting the SETUNE code be legal. I have also
fixed up the patterns for FCMNE_PPzZZ and FCMNE_PPzZ0 to use
the correct opcode.

Differential Revision: https://reviews.llvm.org/D121905

2 years ago[RISCV][NFC] Refactor the type promotion of fsl/fsr/becompress/bdecompress/bfp
Liqin Weng [Wed, 13 Apr 2022 02:45:28 +0000 (02:45 +0000)]
[RISCV][NFC] Refactor the type promotion of fsl/fsr/becompress/bdecompress/bfp

Reviewed By: asb, jrtc27, craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D123181

2 years ago[Test] Add tests showing duplicate PHIs generated by RS4GC (NFC)
Dmitry Makogon [Wed, 13 Apr 2022 08:46:12 +0000 (15:46 +0700)]
[Test] Add tests showing duplicate PHIs generated by RS4GC (NFC)

2 years ago[LTO] Remove legacy PM support
Nikita Popov [Wed, 13 Apr 2022 08:35:42 +0000 (10:35 +0200)]
[LTO] Remove legacy PM support

We don't have any places setting NewPM=false anymore, so drop the
support code in LTOBackend.

2 years agoRevert "[ubsan] Simplify ubsan_GetStackTrace"
Nikita Popov [Wed, 13 Apr 2022 08:36:11 +0000 (10:36 +0200)]
Revert "[ubsan] Simplify ubsan_GetStackTrace"

This reverts commit 63f2d1f4d4b8ee284b4ab977242e322a9458a168.

I don't quite understand why, but this causes a linker error for
me and a number of buildbots:

/home/npopov/repos/llvm-project/compiler-rt/lib/ubsan/../sanitizer_common/sanitizer_stacktrace.h:130: error: undefined reference to '__sanitizer::BufferedStackTrace::UnwindImpl(unsigned long, unsigned long, void*, bool, unsigned int)'

2 years ago[LLD][COFF] Add support for /noimplib
Tobias Hieta [Tue, 12 Apr 2022 14:06:46 +0000 (16:06 +0200)]
[LLD][COFF] Add support for /noimplib

Mostly for compatibility reasons with link.exe this flag
makes sure we don't write an implib - not even when /implib
is also passed, that's how link.exe works.

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D123591

2 years ago[Clang] Remove support for legacy pass manager
Nikita Popov [Mon, 11 Apr 2022 10:52:31 +0000 (12:52 +0200)]
[Clang] Remove support for legacy pass manager

This removes the -flegacy-pass-manager and
-fno-experimental-new-pass-manager options, and the corresponding
support code in BackendUtil. The -fno-legacy-pass-manager and
-fexperimental-new-pass-manager options are retained as no-ops.

Differential Revision: https://reviews.llvm.org/D123609

2 years ago[clang][ASTImporter] Fix an import error handling related bug.
Balázs Kéri [Wed, 13 Apr 2022 07:41:40 +0000 (09:41 +0200)]
[clang][ASTImporter] Fix an import error handling related bug.

This bug can cause more import errors to be generated than necessary,
so that many objects fail to import. The chance of an invalid AST after
these imports increases.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D122525

2 years ago[clang] NFC, move CompilerInvocation::setLangDefaults to LangOptions.h
Haojian Wu [Fri, 8 Apr 2022 11:23:12 +0000 (13:23 +0200)]
[clang] NFC, move CompilerInvocation::setLangDefaults to LangOptions.h

The function is moved from clangFrontend to clangBasic, which allows tools
(e.g. clang pseudoparser) that don't depend on clangFrontend to use it.

Differential Revision: https://reviews.llvm.org/D121375

2 years ago[ubsan] Simplify ubsan_GetStackTrace
Fangrui Song [Wed, 13 Apr 2022 07:32:10 +0000 (00:32 -0700)]
[ubsan] Simplify ubsan_GetStackTrace

Suggested by Vitaly Buka

2 years agoSupport the min of module flags when linking, use for AArch64 BTI/PAC-RET
Daniel Kiss [Wed, 13 Apr 2022 07:31:25 +0000 (09:31 +0200)]
Support the min of module flags when linking, use for AArch64 BTI/PAC-RET

LTO objects might be compiled with different `mbranch-protection` flags, which will cause an error in the linker.
Such a setup is allowed in the normal build; with this change it is also possible with LTO.
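
A minimal sketch of the "min" merge rule named in the title, modelling
module flags as a plain map instead of LLVM's metadata. `ModuleFlags`,
`merge_with_min` and the treatment of flags that exist on only one side
are assumptions of this sketch, not taken from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

using ModuleFlags = std::map<std::string, unsigned>;

// Merges `src` into a copy of `dst`. A flag present on both sides takes
// the smaller value. A flag present only in `src` is copied unchanged
// (sketch assumption; the commit message does not cover this case).
ModuleFlags merge_with_min(const ModuleFlags &dst, const ModuleFlags &src) {
  ModuleFlags merged = dst;
  for (const auto &flag : src) {
    auto it = merged.find(flag.first);
    if (it == merged.end())
      merged.insert(flag);
    else
      it->second = std::min(it->second, flag.second);
  }
  assert(merged.size() >= dst.size());
  return merged;
}

// One object built with branch target enforcement and one without:
// the merged module ends up without it instead of failing to link.
unsigned demo_merged_bti() {
  const ModuleFlags with_bti = {{"branch-target-enforcement", 1u}};
  const ModuleFlags without_bti = {{"branch-target-enforcement", 0u}};
  return merge_with_min(with_bti, without_bti).at("branch-target-enforcement");
}
```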

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D123493

2 years ago[clangd] Fix incorrect operator< impl for HighlightingToken
Nathan Ridge [Wed, 13 Apr 2022 06:54:12 +0000 (02:54 -0400)]
[clangd] Fix incorrect operator< impl for HighlightingToken

Differential Revision: https://reviews.llvm.org/D123478

2 years ago[gn build] Port e53c461bf3f0
LLVM GN Syncbot [Wed, 13 Apr 2022 05:30:23 +0000 (05:30 +0000)]
[gn build] Port e53c461bf3f0

2 years ago[libc++][ranges] Implement `lazy_split_view`.
Konstantin Varlamov [Wed, 13 Apr 2022 05:27:07 +0000 (22:27 -0700)]
[libc++][ranges] Implement `lazy_split_view`.

Note that this class was called just `split_view` in the original One
Ranges Proposal and was renamed to `lazy_split_view` by
[P2210](https://wg21.link/p2210).

Co-authored-by: zoecarver <z.zoelec2@gmail.com>
Differential Revision: https://reviews.llvm.org/D107500

2 years ago[clang][preprocessor] Allow calling DumpToken() on annotation tokens
Timm Bäder [Tue, 29 Mar 2022 14:58:45 +0000 (16:58 +0200)]
[clang][preprocessor] Allow calling DumpToken() on annotation tokens

Differential Revision: https://reviews.llvm.org/D122659

2 years ago[X86][test] Add encoding/decoding tests for VEX instruction w/ address-size prefix
Shengchen Kan [Wed, 13 Apr 2022 04:49:51 +0000 (12:49 +0800)]
[X86][test] Add encoding/decoding tests for VEX instruction w/ address-size prefix

This patch also contains a regression test for D122448

Reviewed By: hvdijk, RKSimon

Differential Revision: https://reviews.llvm.org/D122449

2 years ago[clang-format] Allow empty .clang-format file
owenca [Wed, 13 Apr 2022 00:51:28 +0000 (17:51 -0700)]
[clang-format] Allow empty .clang-format file

Differential Revision: https://reviews.llvm.org/D123535

2 years ago[libomptarget][amdgpu] Add hidden_heap_v1 kernarg metadata
Saiyedul Islam [Mon, 11 Apr 2022 17:28:07 +0000 (17:28 +0000)]
[libomptarget][amdgpu] Add hidden_heap_v1 kernarg metadata

Code object version 5 adds support of hidden_heap_v1 kernarg
metadata field [1]. It is a global address space pointer to an
initialized memory buffer that conforms to the requirements of the
malloc/free device library V1 version implementation.

[1] https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table-v5

Reviewed By: carlo.bertolli

Differential Revision: https://reviews.llvm.org/D123527

2 years ago[lldb] Re-enable TestStepNoDebug.py on AS
Jonas Devlieghere [Wed, 13 Apr 2022 03:27:10 +0000 (20:27 -0700)]
[lldb] Re-enable TestStepNoDebug.py on AS

This test showed up as an unexpected pass and is now consistently
passing on Apple Silicon.

2 years ago[lldb] Print diagnostic prefixes (error, warning) in color
Jonas Devlieghere [Wed, 13 Apr 2022 03:26:37 +0000 (20:26 -0700)]
[lldb] Print diagnostic prefixes (error, warning) in color

Print diagnostic prefixes (error, warning) in their respective colors
when colors are enabled.

2 years ago[NFC][sanitizer] Consolidate malloc hook invocations
Vitaly Buka [Wed, 13 Apr 2022 03:07:34 +0000 (20:07 -0700)]
[NFC][sanitizer] Consolidate malloc hook invocations

2 years ago[mlir][LLVM-IR] Added support for global variable attributes
Shraiysh Vaishay [Wed, 13 Apr 2022 02:50:56 +0000 (08:20 +0530)]
[mlir][LLVM-IR] Added support for global variable attributes

This patch adds thread_local to llvm.mlir.global and adds translation for dso_local and addr_space to and from LLVM IR.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D123412

2 years ago[NFC] [AST] Reduce the size of TemplateParmPosition
Chuanqi Xu [Thu, 7 Apr 2022 10:53:55 +0000 (18:53 +0800)]
[NFC] [AST] Reduce the size of TemplateParmPosition

I found this when reading the code. I think it makes sense to reduce
the space for TemplateParmPosition. It is hard to imagine the depth of a
template parameter being larger than 2^20 or the index being larger than
2^12. So I think the patch might be reasonable.
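
A minimal sketch of the idea, assuming a 20-bit depth and a 12-bit index
packed into one `unsigned` via bit-fields. `PackedParmPosition` and its
members are hypothetical names, not the actual Clang class:

```cpp
#include <cassert>

class PackedParmPosition {
public:
  static constexpr unsigned DepthWidth = 20;
  static constexpr unsigned IndexWidth = 12;
  static constexpr unsigned MaxDepth = (1u << DepthWidth) - 1;
  static constexpr unsigned MaxIndex = (1u << IndexWidth) - 1;

  PackedParmPosition(unsigned depth, unsigned index)
      : Depth(depth), Index(index) {
    // Values outside the bit-field range would be silently truncated.
    assert(depth <= MaxDepth && index <= MaxIndex);
  }

  unsigned depth() const { return Depth; }
  unsigned index() const { return Index; }

private:
  unsigned Depth : DepthWidth;  // template nesting depth
  unsigned Index : IndexWidth;  // position within the parameter list
};

// Bit-field layout is implementation-defined; this holds on the common
// ABIs where `unsigned` is 32 bits wide.
static_assert(sizeof(PackedParmPosition) == sizeof(unsigned),
              "both fields should share one unsigned");
```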

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123298

2 years ago[NFC][sanitizer] Remove unnecessary HOOK macros
Vitaly Buka [Wed, 13 Apr 2022 02:19:44 +0000 (19:19 -0700)]
[NFC][sanitizer] Remove unnecessary HOOK macros

2 years ago[InstCombine] [NFC] Add a test for fneg.ll
Chenbing Zheng [Wed, 13 Apr 2022 02:33:54 +0000 (10:33 +0800)]
[InstCombine] [NFC] Add a test for fneg.ll

2 years ago[clang][test] Disable opaque pointers in test
Arthur Eubanks [Wed, 13 Apr 2022 02:14:52 +0000 (19:14 -0700)]
[clang][test] Disable opaque pointers in test

Was missed in opaque pointer switch due to not being run on x86.

2 years ago[mlir][Arithmetic] Add common constant folder function for type cast ops.
jacquesguan [Mon, 11 Apr 2022 09:24:43 +0000 (09:24 +0000)]
[mlir][Arithmetic] Add common constant folder function for type cast ops.

This revision replaces the current type cast constant folders with a new common type cast constant folder function template.
It covers all the former folders and also supports folding constant splats and vectors.

Differential Revision: https://reviews.llvm.org/D123489

2 years ago[NFC][msan] Rename SymbolizerScope to UnwinderScope and hide
Vitaly Buka [Wed, 13 Apr 2022 01:57:01 +0000 (18:57 -0700)]
[NFC][msan] Rename SymbolizerScope to UnwinderScope and hide

2 years ago[NFC][sanitizer] Clang format some code
Vitaly Buka [Wed, 13 Apr 2022 01:43:22 +0000 (18:43 -0700)]
[NFC][sanitizer] Clang format some code

2 years ago[NFC][msan] Switch pointer to a reference
Vitaly Buka [Tue, 12 Apr 2022 22:29:13 +0000 (15:29 -0700)]
[NFC][msan] Switch pointer to a reference

2 years ago[lldb] Escape semicolons for all shells
Raphael Isemann [Wed, 13 Apr 2022 01:12:18 +0000 (18:12 -0700)]
[lldb] Escape semicolons for all shells

LLDB supports having globbing regexes in the process launch arguments
that will be resolved using the user's shell. This requires that we pass
the launch args to the shell and then read back the expanded arguments
using LLDB's argdumper utility.

As the shell will not just expand the globbing regexes but all special
characters, we need to escape all non-globbing characters such as $, &,
<, >, etc. as those otherwise are interpreted and removed in the step
where we expand the globbing characters. Also because the special
characters are shell-specific, LLDB needs to maintain a list of all the
characters that need to be escaped for each specific shell.

This patch adds the missing semicolon character to the escape list for
all currently supported shells. Without this, having a semicolon in the
binary path or in the launch arguments will cause the
argdumping process to fail. E.g., lldb -- ./calc "a;b" was failing
before but is working now.
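
A minimal sketch of the escaping step described above. `escape_for_shell`,
`escape_for_demo_shell` and the escape list are hypothetical; LLDB's real
per-shell tables are not reproduced here:

```cpp
#include <cassert>
#include <string>

// Prefixes every character of `arg` that appears in `specials` with a
// backslash, leaving all other characters (including globbing characters
// that are not listed) untouched.
std::string escape_for_shell(const std::string &arg,
                             const std::string &specials) {
  assert(!specials.empty() && "escape list must not be empty");
  std::string out;
  out.reserve(arg.size());
  for (char ch : arg) {
    if (specials.find(ch) != std::string::npos)
      out.push_back('\\');
    out.push_back(ch);
  }
  return out;
}

// Illustrative escape list for one imaginary shell. It includes ';' and
// leaves out '*' so that globbing still works.
std::string escape_for_demo_shell(const std::string &arg) {
  return escape_for_shell(arg, " '\"<>()&;");
}
```

With this list, the argument `a;b` from the example above becomes `a\;b`,
while a pattern such as `*.txt` is passed through unchanged.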

Fixes rdar://55776943

Differential revision: https://reviews.llvm.org/D104629

2 years ago[SLP]Improve reductions analysis and emission, part 1.
Alexey Bataev [Thu, 18 Nov 2021 16:08:01 +0000 (08:08 -0800)]
[SLP]Improve reductions analysis and emission, part 1.

Currently the SLP vectorizer walks through the instructions and selects
3 main classes of values: 1) reduction operations - instructions with the same
reduction opcode (add, mul, min/max, etc.), which build the reduction,
2) reduced values - instructions with the same opcodes, but different
from the reduction opcode, 3) extra arguments - all other values,
instructions from a basic block other than that of the root node, and
instructions with too many/too few uses.

This scheme is not very efficient. It excludes some instructions and all
non-instruction values from the reductions (constants, proficient
gathers), and too many possibly reduced values are marked as extra arguments.
The patch improves this process by introducing a slightly extended analysis
stage. During this stage, we still try to select 3 classes of
values: 1) reduction operations - same as before, 2) possibly reduced
values - all instructions from the current block/non-instructions, which
may build a vectorization tree, 3) extra arguments - instructions from
different basic blocks. Additionally, an extra sorting of the
possibly reduced values occurs to build the scalar sequences which
will most likely be vectorized, e.g. loads are grouped by the
distance between them, constants are grouped together, cmp instructions
are sorted by their compare types and predicates, extractelement
instructions are sorted by the vector operand, etc. Also, these groups
are reordered by their length so the longest group is the first in the
list of the possibly reduced values.
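
The grouping and "longest group first" ordering can be sketched as follows.
This is an illustration on a toy `Candidate` type with a string key, not the
SLP vectorizer's data structures; all names are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A possibly reduced value, tagged with the key it is grouped by
// (for example "load", "const" or "cmp").
struct Candidate {
  std::string kind;
  int id;
};

// Groups candidates by kind, then orders the groups by descending length
// so that the longest group is tried first.
std::vector<std::vector<Candidate>>
group_longest_first(const std::vector<Candidate> &values) {
  std::map<std::string, std::vector<Candidate>> by_kind;
  for (const Candidate &c : values)
    by_kind[c.kind].push_back(c);

  std::vector<std::vector<Candidate>> groups;
  for (auto &entry : by_kind)
    groups.push_back(std::move(entry.second));

  std::stable_sort(groups.begin(), groups.end(),
                   [](const std::vector<Candidate> &a,
                      const std::vector<Candidate> &b) {
                     return a.size() > b.size();
                   });
  assert(groups.empty() || groups.front().size() >= groups.back().size());
  return groups;
}

// Three loads, two constants and one compare: the loads come first.
std::string demo_first_group_kind() {
  const std::vector<Candidate> values = {{"load", 0}, {"const", 1},
                                         {"load", 2}, {"cmp", 3},
                                         {"load", 4}, {"const", 5}};
  return group_longest_first(values).front().front().kind;
}
```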

The vectorization process tries to emit the reductions for all these
groups. These reductions, remaining non-vectorized possible reduced
values and extra arguments are then combined into the final expression
just like it was before.

Differential Revision: https://reviews.llvm.org/D114171

2 years agoAMDGPU: Update reqd-work-group-size optimization for umin intrinsic
Matt Arsenault [Thu, 7 Apr 2022 17:50:08 +0000 (13:50 -0400)]
AMDGPU: Update reqd-work-group-size optimization for umin intrinsic

This code was pattern matching the ID computation expression as it
appears in the library. This was a compare and select, but now that
umin is canonical, we were no longer matching. Update to match the
intrinsic instead.
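
As a rough illustration of why the old matcher stopped firing, the sketch below encodes expressions as nested tuples and matches both forms. The tuple encoding, the `icmp_ult` predicate name, and both function names are assumptions made for this example; the real code matches LLVM IR.

```python
def matches_old_pattern(expr):
    # Old form: ("select", ("icmp_ult", x, y), x, y)
    return (isinstance(expr, tuple) and len(expr) == 4
            and expr[0] == "select"
            and isinstance(expr[1], tuple)
            and expr[1][0] == "icmp_ult"
            and expr[1][1:] == (expr[2], expr[3]))


def matches_new_pattern(expr):
    # Canonical form: ("umin", x, y)
    return isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "umin"
```

Once the compare and select is canonicalized to `umin`, only `matches_new_pattern` succeeds, which is the situation this change addresses.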

2 years ago Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"
Muhammad Omair Javaid [Tue, 12 Apr 2022 23:51:25 +0000 (04:51 +0500)]
Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"

This reverts commit 64b6192e812977092242ae34d6eafdcd42fea39d.

This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage:

https://lab.llvm.org/buildbot/#/builders/176/builds/1515

llvm-tblgen crashes after applying this patch.

2 years ago [test][DSE] Precommit test
Arthur Eubanks [Tue, 12 Apr 2022 23:20:49 +0000 (16:20 -0700)]
[test][DSE] Precommit test

2 years ago RegAllocGreedy: Fix illegal eviction assert for urgent evictions
Matt Arsenault [Tue, 29 Mar 2022 12:48:21 +0000 (08:48 -0400)]
RegAllocGreedy: Fix illegal eviction assert for urgent evictions

The condition in canEvictInterferenceBasedOnCost is slightly different
from the assertion in evictInterference.
canEvictInterferenceBasedOnCost uses a <= check for the cascade number
for legality, but the assert was checking for <. For equal cascade
numbers for an urgent eviction, canEvictInterferenceBasedOnCost could
return success. The actual eviction would then hit this assert. Avoid
ever returning true for equivalent cascade numbers.
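
The mismatch can be sketched as follows. This is a deliberately simplified Python model: the function names, the parameters, and the exact shape of the urgent path are assumptions for illustration and do not mirror the real RegAllocGreedy code.

```python
def can_evict_before_fix(evictor_cascade, victim_cascade, urgent):
    # Hypothetical simplification: for an urgent eviction the legality
    # check accepted equal cascade numbers (a <= comparison).
    if urgent:
        return victim_cascade <= evictor_cascade
    return victim_cascade < evictor_cascade


def can_evict_after_fix(evictor_cascade, victim_cascade, urgent):
    # Never report success for equal cascade numbers.
    return victim_cascade < evictor_cascade


def evict(evictor_cascade, victim_cascade):
    # The eviction step asserts a strict comparison.
    assert victim_cascade < evictor_cascade, "illegal eviction"
    return True
```

With equal cascade numbers and `urgent=True`, the first check returns `True` and the subsequent `evict` call trips the assertion; the second check returns `False`, so `evict` is never reached.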

The resulting failed allocation seems a bit off to me. For example, in
illegal-eviction-assert.mir, I would assume %0 gets allocated starting
at $vgpr0. That was its initial allocation choice, but it was later
evicted. In this example no evictions can help improve anything.

2 years ago [AMDGPU] Split unaligned 4 DWORD DS operations
Stanislav Mekhanoshin [Mon, 11 Apr 2022 22:25:11 +0000 (15:25 -0700)]
[AMDGPU] Split unaligned 4 DWORD DS operations

Similarly to 3 DWORD operations, it is better for performance
to split unaligned operations as long as these are at least
DWORD aligned. Performance data:

```
Using platform: AMD Accelerated Parallel Processing
Using device: gfx900:xnack-

ds_write_b128                      aligned by 16:  4.9 sec
ds_write2_b64                      aligned by 16:  5.1 sec
ds_write2_b32 * 2                  aligned by 16:  5.5 sec
ds_write_b128                      aligned by  1:  8.1 sec
ds_write2_b64                      aligned by  1:  8.7 sec
ds_write2_b32 * 2                  aligned by  1: 14.0 sec
ds_write_b128                      aligned by  2:  8.1 sec
ds_write2_b64                      aligned by  2:  8.7 sec
ds_write2_b32 * 2                  aligned by  2: 14.0 sec
ds_write_b128                      aligned by  4:  5.6 sec
ds_write2_b64                      aligned by  4:  8.7 sec
ds_write2_b32 * 2                  aligned by  4:  5.6 sec
ds_write_b128                      aligned by  8:  5.6 sec
ds_write2_b64                      aligned by  8:  5.1 sec
ds_write2_b32 * 2                  aligned by  8:  5.6 sec
ds_read_b128                       aligned by 16:  3.8 sec
ds_read2_b64                       aligned by 16:  3.8 sec
ds_read2_b32 * 2                   aligned by 16:  4.0 sec
ds_read_b128                       aligned by  1:  4.6 sec
ds_read2_b64                       aligned by  1:  8.1 sec
ds_read2_b32 * 2                   aligned by  1: 14.0 sec
ds_read_b128                       aligned by  2:  4.6 sec
ds_read2_b64                       aligned by  2:  8.1 sec
ds_read2_b32 * 2                   aligned by  2: 14.0 sec
ds_read_b128                       aligned by  4:  4.6 sec
ds_read2_b64                       aligned by  4:  8.1 sec
ds_read2_b32 * 2                   aligned by  4:  4.0 sec
ds_read_b128                       aligned by  8:  4.6 sec
ds_read2_b64                       aligned by  8:  3.8 sec
ds_read2_b32 * 2                   aligned by  8:  4.0 sec

Using platform: AMD Accelerated Parallel Processing
Using device: gfx1030

ds_write_b128                      aligned by 16:  6.2 sec
ds_write2_b64                      aligned by 16:  7.1 sec
ds_write2_b32 * 2                  aligned by 16:  7.6 sec
ds_write_b128                      aligned by  1: 24.1 sec
ds_write2_b64                      aligned by  1: 25.2 sec
ds_write2_b32 * 2                  aligned by  1: 43.7 sec
ds_write_b128                      aligned by  2: 24.1 sec
ds_write2_b64                      aligned by  2: 25.1 sec
ds_write2_b32 * 2                  aligned by  2: 43.7 sec
ds_write_b128                      aligned by  4: 14.4 sec
ds_write2_b64                      aligned by  4: 25.1 sec
ds_write2_b32 * 2                  aligned by  4:  7.6 sec
ds_write_b128                      aligned by  8: 14.4 sec
ds_write2_b64                      aligned by  8:  7.1 sec
ds_write2_b32 * 2                  aligned by  8:  7.6 sec
ds_read_b128                       aligned by 16:  6.2 sec
ds_read2_b64                       aligned by 16:  6.3 sec
ds_read2_b32 * 2                   aligned by 16:  7.5 sec
ds_read_b128                       aligned by  1: 12.5 sec
ds_read2_b64                       aligned by  1: 24.0 sec
ds_read2_b32 * 2                   aligned by  1: 43.6 sec
ds_read_b128                       aligned by  2: 12.5 sec
ds_read2_b64                       aligned by  2: 24.0 sec
ds_read2_b32 * 2                   aligned by  2: 43.6 sec
ds_read_b128                       aligned by  4: 12.5 sec
ds_read2_b64                       aligned by  4: 24.0 sec
ds_read2_b32 * 2                   aligned by  4:  7.5 sec
ds_read_b128                       aligned by  8: 12.5 sec
ds_read2_b64                       aligned by  8:  6.3 sec
ds_read2_b32 * 2                   aligned by  8:  7.5 sec
```
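
The "aligned by 4" rows are the ones that matter for this change, and they can be compared directly. The sketch below copies those timings from the table and computes how much faster the two `2_b32` instructions are than the single `b128` instruction; the dictionary layout and the `split_speedup` helper are made up for this illustration.

```python
# Seconds, copied from the "aligned by 4" rows of the table above.
TIMINGS = {
    ("gfx900", "write"): {"b128": 5.6, "2_b64": 8.7, "2_b32 * 2": 5.6},
    ("gfx900", "read"): {"b128": 4.6, "2_b64": 8.1, "2_b32 * 2": 4.0},
    ("gfx1030", "write"): {"b128": 14.4, "2_b64": 25.1, "2_b32 * 2": 7.6},
    ("gfx1030", "read"): {"b128": 12.5, "2_b64": 24.0, "2_b32 * 2": 7.5},
}


def split_speedup(device, op):
    # Ratio > 1 means the split form is faster than the b128 form.
    t = TIMINGS[(device, op)]
    return t["b128"] / t["2_b32 * 2"]
```

With these numbers the ratio is about 1.9 for gfx1030 writes and exactly 1.0 for gfx900 writes, so on this data the split form is never slower at DWORD alignment.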

Differential Revision: https://reviews.llvm.org/D123634

2 years ago [docs][ORC] Fix RST error in dfffb7df24e.
Lang Hames [Tue, 12 Apr 2022 23:05:01 +0000 (16:05 -0700)]
[docs][ORC] Fix RST error in dfffb7df24e.

2 years ago Revert "[clang-format] Allow empty .clang-format file"
owenca [Tue, 12 Apr 2022 23:04:59 +0000 (16:04 -0700)]
Revert "[clang-format] Allow empty .clang-format file"

This reverts commit 4e814a6f2db90046914734fac4f9e3110c7e0424.

2 years ago RegAllocGreedy: Roll back successful recolorings on failure
Matt Arsenault [Thu, 17 Mar 2022 17:12:36 +0000 (13:12 -0400)]
RegAllocGreedy: Roll back successful recolorings on failure

This is a replacement for the original fix attempted in
c46aab01c002b7a04135b8b7f1f52d8c9ae23a58.

This fixes "overlapping insert" assertion failures when trying to
unwind an unsuccessful recoloring attempt.

The problem would occur when there are multiple recoloring candidates
which recursively required recoloring. If one recoloring candidate was
successfully recolored at one level, and the next recoloring candidate
was unsuccessful, we would not roll back the first candidate's
successful recoloring. The forgotten successful recoloring may have
been assigned to something that conflicts with a register that needs
to be restored in a parent recoloring attempt.

See the testcase added in issue48473 for a more concrete example with
explanation.
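
The rollback idea can be sketched independently of the allocator. In the Python illustration below, `try_recolor_all`, the `assignment` dictionary, and the `try_recolor` callback are hypothetical names; the real code works on live intervals and physical register assignments.

```python
def try_recolor_all(candidates, assignment, try_recolor):
    """Recolor every candidate, or leave `assignment` as it was on entry."""
    saved = {}  # original register of every candidate touched at this level
    for vreg in candidates:
        saved[vreg] = assignment.get(vreg)
        if not try_recolor(vreg, assignment):
            # Undo everything done at this level, including earlier successes.
            for v, old in saved.items():
                if old is None:
                    assignment.pop(v, None)
                else:
                    assignment[v] = old
            return False
    return True
```

If the first candidate is recolored successfully and the second one fails, the first candidate's original register is restored before the function reports failure, so a parent attempt sees the assignment it expects.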

2 years ago [docs] Update OrcV2 doc to include some notes on code removal.
Lang Hames [Tue, 12 Apr 2022 22:23:42 +0000 (15:23 -0700)]
[docs] Update OrcV2 doc to include some notes on code removal.

2 years ago [clang-format] Allow empty .clang-format file
owenca [Tue, 12 Apr 2022 21:35:58 +0000 (14:35 -0700)]
[clang-format] Allow empty .clang-format file

Differential Revision: https://reviews.llvm.org/D123535

2 years ago Fix libcxx build after cd0a5889d71c62ae7cefc
Yuanfang Chen [Tue, 12 Apr 2022 22:42:21 +0000 (15:42 -0700)]
Fix libcxx build after cd0a5889d71c62ae7cefc

2 years ago [ArgPromo][OpaquePointer] Don't promote mismatched function types
Arthur Eubanks [Tue, 12 Apr 2022 22:16:11 +0000 (15:16 -0700)]
[ArgPromo][OpaquePointer] Don't promote mismatched function types

A call whose function type does not match the callee's function type is
considered an indirect call.

Fixes crash in https://reviews.llvm.org/D123300#3446023.

2 years ago [examples][ORC] Add a new example showing the ORCv2 removable code APIs.
Lang Hames [Tue, 12 Apr 2022 21:47:07 +0000 (14:47 -0700)]
[examples][ORC] Add a new example showing the ORCv2 removable code APIs.

2 years ago [MSan] Ensure argument shadow initialized on memcpy
Nikita Popov [Tue, 12 Apr 2022 20:45:53 +0000 (13:45 -0700)]
[MSan] Ensure argument shadow initialized on memcpy

We need to explicitly query the shadow here, because it is lazily
initialized for byval arguments. Without opaque pointers this used to
mostly work out, because there would be a bitcast to `i8*` present, and
that would query, and copy in case of byval, the argument shadow.

Reviewed By: vitalybuka, eugenis

Differential Revision: https://reviews.llvm.org/D123602

2 years ago Revert "[MSan] Ensure argument shadow initialized on memcpy"
Vitaly Buka [Tue, 12 Apr 2022 21:51:00 +0000 (14:51 -0700)]
Revert "[MSan] Ensure argument shadow initialized on memcpy"

Invalid author.

This reverts commit 163a9f4552bea71b2d53126a5f74f9a1b47d2865.

2 years ago [Reland][lit] Use sharding for GoogleTest format
Yuanfang Chen [Tue, 12 Apr 2022 19:09:34 +0000 (12:09 -0700)]
[Reland][lit] Use sharding for GoogleTest format

This helps lit unit test performance by a lot, especially on Windows. The performance gain comes from launching one gtest executable for many subtests instead of one executable per subtest, which is the current situation.

The shards are executed by the test runner and the results are stored in the
JSON format supported by GoogleTest. Later, in the test reporting stage,
all test results in the JSON file are retrieved to continue with the test
results summary etc.

On my Win10 desktop, before this patch: `check-clang-unit`: 177s, `check-llvm-unit`: 38s; after this patch: `check-clang-unit`: 37s, `check-llvm-unit`: 11s.
On my Linux machine, before this patch: `check-clang-unit`: 46s, `check-llvm-unit`: 8s; after this patch: `check-clang-unit`: 7s, `check-llvm-unit`: 4s.
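
The source of the gain can be sketched with a toy model. The round-robin split and both helper names below are assumptions for illustration; GoogleTest itself exposes sharding through the `GTEST_TOTAL_SHARDS` and `GTEST_SHARD_INDEX` environment variables, and this message does not spell out how lit chooses the number of shards.

```python
def shard_tests(tests, total_shards):
    # Hypothetical round-robin split: test i goes to shard i % total_shards.
    return [tests[i::total_shards] for i in range(total_shards)]


def process_launches(num_tests, total_shards=None):
    # One process per test without sharding, at most one per shard with it.
    if total_shards is None:
        return num_tests
    return min(num_tests, total_shards)
```

In this model, 1000 subtests cost 1000 process launches without sharding and 8 launches with 8 shards, which is where most of the saved time would come from on platforms with expensive process creation.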

Reviewed By: yln, rnk, abrachet

Differential Revision: https://reviews.llvm.org/D122251

2 years ago [MSan] Ensure argument shadow initialized on memcpy
Vitaly Buka [Tue, 12 Apr 2022 20:45:53 +0000 (13:45 -0700)]
[MSan] Ensure argument shadow initialized on memcpy

We need to explicitly query the shadow here, because it is lazily
initialized for byval arguments. Without opaque pointers this used to
mostly work out, because there would be a bitcast to `i8*` present, and
that would query, and copy in case of byval, the argument shadow.

Reviewed By: vitalybuka, eugenis

Differential Revision: https://reviews.llvm.org/D123602

2 years ago [GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics
Johannes Doerfert [Mon, 11 Apr 2022 18:32:22 +0000 (13:32 -0500)]
[GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics

This is a long-standing problem that resurfaces once in a while [0].
There might actually be two problems because I'm not 100% sure if the
issue underlying https://reviews.llvm.org/D115302 would be solved by
this or not. Anyway.

In 2008 we thought intrinsics do not read/write globals passed to them:
https://github.com/llvm/llvm-project/commit/d4133ac31535ce5176f97e9fc81825af8a808760
This is not correct given that intrinsics can synchronize threads and
cause effects to effectively become visible.

NOTE: I did not yet modify any tests but only tried out the reproducer
      of https://github.com/llvm/llvm-project/issues/54851.

Fixes: https://github.com/llvm/llvm-project/issues/54851

[0] https://discourse.llvm.org/t/bug-gvn-memdep-bug-in-the-presence-of-intrinsics/59402

Differential Revision: https://reviews.llvm.org/D123531

2 years ago [NVPTX][FIX] Allow __nvvm_reflect in the presence of opaque pointers
Johannes Doerfert [Mon, 11 Apr 2022 17:23:50 +0000 (12:23 -0500)]
[NVPTX][FIX] Allow __nvvm_reflect in the presence of opaque pointers

Differential Revision: https://reviews.llvm.org/D123522

2 years ago [OpenMP][FIX] Ensure to set the context for wait events if necessary
Johannes Doerfert [Sat, 9 Apr 2022 05:12:44 +0000 (00:12 -0500)]
[OpenMP][FIX] Ensure to set the context for wait events if necessary

Differential Revision: https://reviews.llvm.org/D123445

2 years ago AMDGPU: Don't use unreachable on stores to unhandled address space
Matt Arsenault [Mon, 21 Feb 2022 23:00:20 +0000 (18:00 -0500)]
AMDGPU: Don't use unreachable on stores to unhandled address space

For stores to constant address space, this will now consistently hit a
selection error instead of hitting unreachable in an asserts build.

I'm not sure what we should really do here. We could either just
codegen as if it were global, delete the instruction, or declare the
IR invalid (we really should have a target IR verifier to enforce it).

2 years ago Revert "[clang-format] Allow empty .clang-format file"
owenca [Tue, 12 Apr 2022 21:28:02 +0000 (14:28 -0700)]
Revert "[clang-format] Allow empty .clang-format file"

This reverts commit 6eafda0ef0543cad4b190002e9dae93b036a4ded.

2 years ago [clang-format] Allow empty .clang-format file
owenca [Mon, 11 Apr 2022 19:00:03 +0000 (12:00 -0700)]
[clang-format] Allow empty .clang-format file

Differential Revision: https://reviews.llvm.org/D123535

2 years ago GlobalISel: Implement MoreElements for select of vector conditions
Matt Arsenault [Tue, 12 Apr 2022 01:31:15 +0000 (21:31 -0400)]
GlobalISel: Implement MoreElements for select of vector conditions