platform/upstream/llvm.git
2 years ago[libc++] Implement P0798R8 (Monadic operations for std::optional)
Nikolas Klauser [Wed, 15 Dec 2021 19:54:24 +0000 (20:54 +0100)]
[libc++] Implement P0798R8 (Monadic operations for std::optional)

Implement P0798R8

Reviewed By: #libc, ldionne, Quuxplusone

Spies: tcanens, Quuxplusone, ldionne, Wmbat, arichardson, Mordante, libcxx-commits

Differential Revision: https://reviews.llvm.org/D113408
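
For illustration, a minimal sketch of what the three operations in P0798R8 do. The free functions in the hypothetical `sketch` namespace are stand-ins for the C++23 members `and_then`, `transform` and `or_else`, written so the example builds as C++17; `parse_digit` and `doubled_digit` are made-up helpers, and none of this is libc++'s implementation.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
// and_then: f itself returns an optional; an empty input short-circuits.
template <class T, class F>
auto and_then(const std::optional<T>& o, F&& f)
    -> std::invoke_result_t<F, const T&> {
  if (o) return std::forward<F>(f)(*o);
  return std::nullopt;
}

// transform: f returns a plain value, which is wrapped in an optional.
template <class T, class F>
auto transform(const std::optional<T>& o, F&& f)
    -> std::optional<std::invoke_result_t<F, const T&>> {
  if (o) return std::forward<F>(f)(*o);
  return std::nullopt;
}

// or_else: f is only called to supply a fallback when the input is empty.
template <class T, class F>
std::optional<T> or_else(const std::optional<T>& o, F&& f) {
  if (o) return o;
  return std::forward<F>(f)();
}
}  // namespace sketch

// Hypothetical helper: accepts exactly one decimal digit.
std::optional<int> parse_digit(const std::string& s) {
  if (s.size() == 1 && s[0] >= '0' && s[0] <= '9') return s[0] - '0';
  return std::nullopt;
}

// Chains the steps without any explicit has_value() checks.
std::optional<int> doubled_digit(const std::string& s) {
  return sketch::transform(
      sketch::and_then(std::optional<std::string>(s), parse_digit),
      [](int v) { return v * 2; });
}
```

With the C++23 members the same chain would be spelled as member calls on the optional; the control flow is the same.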

2 years ago[flang] Avoid code duplication in mixed expressions
Peter Klausler [Mon, 13 Dec 2021 21:53:45 +0000 (13:53 -0800)]
[flang] Avoid code duplication in mixed expressions

Rather than represent the mixed real/complex subexpression x*(a,b)
as (x*a,x*b), use (x,0)*(a,b) to avoid a potential code duplication
in current lowering code.  Same for mixed division, and for mixed
integer*complex and integer/complex cases.

Differential Review: https://reviews.llvm.org/D115732
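
A small sketch of the two representations, using `std::complex<double>` as a stand-in for Fortran complex values. The function names are hypothetical. For ordinary finite operands both forms give the same result; special values (infinities, NaNs, signed zeros) are not examined here.

```cpp
#include <complex>

using C = std::complex<double>;

// Form 1: (x*a, x*b). The real operand x appears twice in the result.
C mul_distributed(double x, C z) { return C(x * z.real(), x * z.imag()); }

// Form 2: (x,0)*(a,b). The real operand is promoted once and a single
// complex multiplication is performed.
C mul_promoted(double x, C z) { return C(x, 0.0) * z; }
```

In form 2 the expression for `x` is referenced only once, which is the duplication the commit message refers to.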

2 years ago[ELF] Replace make<Defined> with makeDefined. NFC
Fangrui Song [Wed, 15 Dec 2021 21:15:02 +0000 (13:15 -0800)]
[ELF] Replace make<Defined> with makeDefined. NFC

This removes SpecificAlloc<Defined> and makes my lld executable 1.5k smaller.
This drops the small memory waste due to the separate BumpPtrAllocator.

2 years ago[ELF] ObjFile<ELFT>::initializeSymbols: Simplify this->symbols[i]. NFC
Fangrui Song [Wed, 15 Dec 2021 21:02:38 +0000 (13:02 -0800)]
[ELF] ObjFile<ELFT>::initializeSymbols: Simplify this->symbols[i]. NFC

2 years ago[AST] Add more testcases to QualTypeNamesTest. NFC
Sam McCall [Wed, 15 Dec 2021 20:59:54 +0000 (21:59 +0100)]
[AST] Add more testcases to QualTypeNamesTest. NFC

These all currently pass, but are tricky cases not currently covered.
https://reviews.llvm.org/D114251 would break them in its current state.

2 years ago[AST] Fix QualTypeNamesTest, which was spuriously passing
Sam McCall [Wed, 15 Dec 2021 20:53:58 +0000 (21:53 +0100)]
[AST] Fix QualTypeNamesTest, which was spuriously passing

The empty VisitDecl() meant all assertions were skipped.
Meanwhile the assertions have rotted as some type printing has changed.

The test is still in the wrong directory, because it requires TestVisitor.h
which uses Tooling APIs.

2 years ago[ELF] ObjFile<ELFT>::initializeSymbols: Batch allocate local symbols
Fangrui Song [Wed, 15 Dec 2021 20:54:38 +0000 (12:54 -0800)]
[ELF] ObjFile<ELFT>::initializeSymbols: Batch allocate local symbols

and detangle local/global symbol initialization.

My x86-64 lld executable is 8k smaller due to the removal of SpecificAlloc<Undefined>.

2 years ago[clang-format] C# switch expression formatting differs from normal switch formatting
mydeveloperday [Wed, 15 Dec 2021 19:36:22 +0000 (19:36 +0000)]
[clang-format] C# switch expression formatting differs from normal switch formatting

https://github.com/llvm/llvm-project/issues/52677

clang-format doesn't format C# switch expressions very well.

Start with this small use case and try to improve the output. I'll look for other examples to add as tests.

Reviewed By: curdeius

Differential Revision: https://reviews.llvm.org/D115673

Fixes #52677

2 years ago[gn build] Port 8179e1fd519d
LLVM GN Syncbot [Wed, 15 Dec 2021 19:34:58 +0000 (19:34 +0000)]
[gn build] Port 8179e1fd519d

2 years ago[clang][dataflow] Add simplistic constant-propagation analysis.
Yitzhak Mandelbaum [Tue, 14 Dec 2021 17:18:09 +0000 (17:18 +0000)]
[clang][dataflow] Add simplistic constant-propagation analysis.

Adds a very simple constant-propagation analysis for demo and testing purposes.

Differential Revision: https://reviews.llvm.org/D115740

2 years ago[ADT] Add new type traits for type pack indexes
Scott Linder [Tue, 14 Dec 2021 19:27:36 +0000 (19:27 +0000)]
[ADT] Add new type traits for type pack indexes

Similar versions of these already exist; this effectively just
factors them out into STLExtras. I plan to use these in future patches.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D100672
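
As a rough idea of what type-pack index traits look like, here is a sketch with hypothetical names (`first_index_of`, `type_at_index`); it is not claimed to match the spelling or implementation used in STLExtras.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Index of the first occurrence of T in the pack Us...
// Left undefined for an empty pack, so a missing type is a compile error.
template <class T, class... Us>
struct first_index_of;

// T is at the front of the pack: index 0.
template <class T, class... Us>
struct first_index_of<T, T, Us...> : std::integral_constant<std::size_t, 0> {};

// Otherwise skip one element and add 1 to the index found in the rest.
template <class T, class U, class... Us>
struct first_index_of<T, U, Us...>
    : std::integral_constant<std::size_t, 1 + first_index_of<T, Us...>::value> {};

// The I-th type of the pack Ts...
template <std::size_t I, class... Ts>
using type_at_index = std::tuple_element_t<I, std::tuple<Ts...>>;

static_assert(first_index_of<int, char, int, long>::value == 1, "");
static_assert(std::is_same<type_at_index<2, char, int, long>, long>::value, "");
```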

2 years ago[SCEV] Add test where result depends on order loop guards are applied.
Florian Hahn [Wed, 15 Dec 2021 19:07:25 +0000 (19:07 +0000)]
[SCEV] Add test where result depends on order loop guards are applied.

This patch adds 2 test cases where we fail to determine a tight bound on
the backedge taken count because the ULT condition is applied before the
signed conditions. The order the conditions are applied impacts which
min/max folds are applied.

2 years ago[RISCV] Rename Zbs test cases to match instruction names. NFC
Craig Topper [Wed, 15 Dec 2021 19:03:57 +0000 (11:03 -0800)]
[RISCV] Rename Zbs test cases to match instruction names. NFC

The Zbs instructions used to start with 'sb' but now start with 'b'.
Update test names accordingly.

2 years ago[SLP][NFC] Add a test for inefficient reordering, NFC.
Alexey Bataev [Wed, 15 Dec 2021 19:05:28 +0000 (11:05 -0800)]
[SLP][NFC] Add a test for inefficient reordering, NFC.

2 years ago[ASTMatchers] Make ParamIndex unsigned.
Felix Berger [Wed, 15 Dec 2021 18:35:31 +0000 (13:35 -0500)]
[ASTMatchers] Make ParamIndex unsigned.

This fixes a compiler error/warning in
https://lab.llvm.org/buildbot/#/builders/36/builds/15377.

Differential Revision: https://reviews.llvm.org/D115809

Reviewed-by: sammccall

2 years ago[Sema] Mark explicit specialization declaration in a friend invalid
Yuanfang Chen [Wed, 15 Dec 2021 18:26:52 +0000 (10:26 -0800)]
[Sema] Mark explicit specialization declaration in a friend invalid

Down the path, if there is an implicit instantiation, this may trigger
the assertion "Member specialization must be an explicit specialization"
in `clang::FunctionDecl::setFunctionTemplateSpecialization`.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D113245

2 years agoRevert "[Sema] Mark explicit specialization declaration in a friend invalid"
Yuanfang Chen [Wed, 15 Dec 2021 18:25:37 +0000 (10:25 -0800)]
Revert "[Sema] Mark explicit specialization declaration in a friend invalid"

This reverts commit 8cb6ecbc4da2b0cfd8dcf04f612dc413716d27a1.

Nothing is wrong with the commit; it was missing the Phabricator information.

2 years ago[Sema] Mark explicit specialization declaration in a friend invalid
Yuanfang Chen [Tue, 14 Dec 2021 20:27:07 +0000 (12:27 -0800)]
[Sema] Mark explicit specialization declaration in a friend invalid

Down the path, if there is an implicit instantiation, this may trigger
the assertion "Member specialization must be an explicit specialization"
in `clang::FunctionDecl::setFunctionTemplateSpecialization`.

2 years ago[ELF] Slightly speed up -z keep-text-section-prefix
Fangrui Song [Wed, 15 Dec 2021 18:20:10 +0000 (10:20 -0800)]
[ELF] Slightly speed up -z keep-text-section-prefix

2 years agoTeach the backend to make references to swift_async_extendedFramePointerFlags weak...
Arnold Schwaighofer [Mon, 13 Dec 2021 20:33:15 +0000 (12:33 -0800)]
Teach the backend to make references to swift_async_extendedFramePointerFlags weak if it emits it

When references to the symbol `swift_async_extendedFramePointerFlags`
are emitted they have to be weak.

References to the symbol `swift_async_extendedFramePointerFlags` get
emitted only by frame lowering code. Therefore, the backend needs to track
references to the symbol and mark them weak.

Differential Revision: https://reviews.llvm.org/D115672

2 years ago[libc][NFC][bazel] remove unneeded bzl_library
Guillaume Chatelet [Wed, 15 Dec 2021 17:50:32 +0000 (17:50 +0000)]
[libc][NFC][bazel] remove unneeded bzl_library

2 years ago[InstCombine] (~a & b & c) | ~(a | b) -> (c | ~b) & ~a
Stanislav Mekhanoshin [Wed, 15 Dec 2021 17:17:25 +0000 (09:17 -0800)]
[InstCombine] (~a & b & c) | ~(a | b) -> (c | ~b) & ~a

Transform
```
(~a & b & c) | ~(a | b) -> (c | ~b) & ~a
```
and swapped case
```
(~a | b | c) & ~(a & b) -> (c & ~b) | ~a
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %b, %a
  %not1 = xor i4 %or1, 15
  %not2 = xor i4 %a, 15
  %and1 = and i4 %b, %not2
  %and2 = and i4 %and1, %c
  %or2 = or i4 %and2, %not1
  ret i4 %or2
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %notb = xor i4 %b, 15
  %or = or i4 %notb, %c
  %nota = xor i4 %a, 15
  %and = and i4 %or, %nota
  ret i4 %and
}
Transformation seems to be correct!
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %not1 = xor i4 %and1, 15
  %not2 = xor i4 %a, 15
  %or1 = or i4 %b, %not2
  %or2 = or i4 %or1, %c
  %and2 = and i4 %or2, %not1
  ret i4 %and2
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %notb = xor i4 %b, 15
  %and = and i4 %notb, %c
  %nota = xor i4 %a, 15
  %or = or i4 %and, %nota
  ret i4 %or
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D113037
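
The two identities can also be checked by brute force, in the spirit of the i4 proofs above. The sketch below (function name hypothetical) enumerates every combination of 4-bit inputs and compares source and target for both folds.

```cpp
// Exhaustively checks both folds over all 16*16*16 combinations of i4 values.
bool folds_hold_for_i4() {
  const unsigned m = 0xF;  // mask that keeps results within 4 bits
  for (unsigned a = 0; a <= m; ++a)
    for (unsigned b = 0; b <= m; ++b)
      for (unsigned c = 0; c <= m; ++c) {
        const unsigned na = ~a & m;
        const unsigned nb = ~b & m;
        // (~a & b & c) | ~(a | b)  ->  (c | ~b) & ~a
        const unsigned src1 = (na & b & c) | (~(a | b) & m);
        const unsigned tgt1 = (c | nb) & na;
        // (~a | b | c) & ~(a & b)  ->  (c & ~b) | ~a
        const unsigned src2 = (na | b | c) & (~(a & b) & m);
        const unsigned tgt2 = (c & nb) | na;
        if (src1 != tgt1 || src2 != tgt2) return false;
      }
  return true;
}
```

Because the operations are purely bitwise, each bit position behaves independently, so agreement on i4 is a reasonable sanity check for wider types as well.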

2 years ago[clang] ASTMatchers: Fix out-of-bounds access in foreachArgumentWithParamType.
Felix Berger [Wed, 24 Nov 2021 18:45:36 +0000 (13:45 -0500)]
[clang] ASTMatchers: Fix out-of-bounds access in foreachArgumentWithParamType.

The matcher crashes when a variadic function pointer is invoked because the
FunctionProtoType has fewer parameters than arguments.

Matching of non-variadic arguments now works.

Differential Revision: https://reviews.llvm.org/D114559
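
A simplified model of the situation, not the matcher's actual code: when a call has more arguments than the prototype has parameters (the variadic case), pairing must stop at the last declared parameter instead of indexing past it. All names below are hypothetical and plain strings stand in for AST nodes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Pairs each argument with the type of its declared parameter. Arguments
// beyond the declared parameters (variadic extras) are left unpaired.
std::vector<std::pair<std::string, std::string>> pair_args_with_param_types(
    const std::vector<std::string>& args,
    const std::vector<std::string>& param_types) {
  std::vector<std::pair<std::string, std::string>> out;
  const std::size_t n = std::min(args.size(), param_types.size());
  for (std::size_t i = 0; i < n; ++i) out.emplace_back(args[i], param_types[i]);
  return out;
}
```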

Reviewed-by: sammccall

2 years ago[WebAssembly] Add simd-vector-trunc.ll test missing from 2a4a229
Jing Bao [Wed, 15 Dec 2021 17:22:40 +0000 (09:22 -0800)]
[WebAssembly] Add simd-vector-trunc.ll test missing from 2a4a229

This test was authored as part of the same revision, D109481, but I (tlively)
accidentally left it out when committing.

2 years ago[clang] Require x86 target for tbaa test
David Spickett [Wed, 15 Dec 2021 16:39:00 +0000 (16:39 +0000)]
[clang] Require x86 target for tbaa test

Added in https://reviews.llvm.org/D115320.
Failing on our bots that only build Arm/AArch64 targets:
https://lab.llvm.org/buildbot/#/builders/188/builds/6951

2 years ago[mlir] Flip Complex & SCF dialects to _Both (NFC)
Jacques Pienaar [Wed, 15 Dec 2021 16:21:38 +0000 (08:21 -0800)]
[mlir] Flip Complex & SCF dialects to _Both (NFC)

Following
https://llvm.discourse.group/t/psa-ods-generated-accessors-will-change-to-have-a-get-prefix-update-you-apis/4476

2 years ago[clang-tidy][#51939] Exempt placement-new expressions from 'bugprone-throw-keyword...
Markus Böck [Wed, 15 Dec 2021 15:59:04 +0000 (16:59 +0100)]
[clang-tidy][#51939] Exempt placement-new expressions from 'bugprone-throw-keyword-missing'

The purpose of this check is to flag a missing throw keyword; it does so by looking for the construction of an exception class whose result is then unused.
This works well, except that placement-new expressions are also flagged, since they construct an object too, even though that object is not a temporary (its lifetime depends on the storage).
This patch fixes the issue by exempting the match if it is within a placement-new expression.

Fixes https://github.com/llvm/llvm-project/issues/51939

Differential Revision: https://reviews.llvm.org/D115576
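
Roughly, the two patterns look like this. The function names are hypothetical, and the example only illustrates the shape of the code involved, not the check's matcher.

```cpp
#include <new>
#include <stdexcept>
#include <string>

// The pattern the check targets: an exception object is constructed and
// immediately discarded; `throw` was most likely intended.
void forgot_throw(bool bad) {
  if (bad) std::runtime_error("bad input");  // temporary, never thrown
}

// The pattern exempted by this patch: placement new constructs the exception
// in storage supplied by the caller, so the object is not a discarded
// temporary.
std::string construct_in_place() {
  alignas(std::runtime_error) unsigned char storage[sizeof(std::runtime_error)];
  auto* err = new (storage) std::runtime_error("stored, not thrown");
  std::string message = err->what();
  err->~runtime_error();  // objects created by placement new are destroyed manually
  return message;
}
```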

2 years ago[LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam
Zaara Syeda [Wed, 15 Dec 2021 15:31:45 +0000 (15:31 +0000)]
[LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam

If a loop isn't forced to be unrolled, we want to avoid unrolling it when there
is an explicit unroll-and-jam pragma. This is to prevent automatic unrolling
from interfering with the user requested transformation.
The earlier commit missed adding the test case.

Differential Revision: https://reviews.llvm.org/D114886

2 years ago[clang][deps] NFC: Move entry initialization into member functions
Jan Svoboda [Tue, 14 Dec 2021 11:36:59 +0000 (12:36 +0100)]
[clang][deps] NFC: Move entry initialization into member functions

This is a prep-patch for making `CachedFileSystemEntry` initialization more lazy.

2 years ago[OpenMP] Increase opportunity for parallel kernel launch in AMDGPUs: add multiple...
Carlo Bertolli [Wed, 15 Dec 2021 15:33:17 +0000 (15:33 +0000)]
[OpenMP] Increase opportunity for parallel kernel launch in AMDGPUs: add multiple hsa queue's per device in plugin

This patch extends the AMDGPU plugin for OpenMP target offloading from using a single HSA queue to multiple queues (four in this patch) per device. This enables multiple threads to submit kernel launches to the same GPU concurrently.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D115771

2 years ago[clang][deps] NFC: Use clearer wording around entry initialization
Jan Svoboda [Fri, 10 Dec 2021 13:16:29 +0000 (14:16 +0100)]
[clang][deps] NFC: Use clearer wording around entry initialization

The code and documentation around `CachedFileSystemEntry` use the following terms:
* "invalid stat" for `llvm::ErrorOr<llvm::vfs::Status>` that is *not* an error and contains an unknown status,
* "initialized entry" for an entry that contains "invalid stat",
* "valid entry" for an entry that contains "invalid stat", synonymous to "initialized" entry.

Having an entry be "valid" while it contains an "invalid" status object is counter-intuitive.
This patch cleans up the wording by referring to the status as "unknown" and to the entry as either "initialized" or "uninitialized".

2 years ago[AMDGPU] Extract helper function in AsmParser. NFC
Joe Nash [Tue, 14 Dec 2021 19:28:42 +0000 (14:28 -0500)]
[AMDGPU] Extract helper function in AsmParser. NFC

NFC refactor to extract useful helper function isRegOrInline.

Reviewed By: rampitec, dp

Differential Revision: https://reviews.llvm.org/D115753

Change-Id: Ief52db9a62615c053fb5f429248657b97cb41453

2 years ago[mlir][scf] Add getNumRegionInvocations to IfOp
Mogball [Wed, 15 Dec 2021 06:42:36 +0000 (06:42 +0000)]
[mlir][scf] Add getNumRegionInvocations to IfOp

Implements the RegionBranchOpInterface method getNumRegionInvocations to `scf::IfOp` so that, when the condition is constant, the number of region executions can be analyzed by `NumberOfExecutions`.

Reviewed By: jpienaar, ftynse

Differential Revision: https://reviews.llvm.org/D115087

2 years ago[X86] LowerRotate - use vXi8 custom lowering for non-uniform constant amounts
Simon Pilgrim [Wed, 15 Dec 2021 14:30:26 +0000 (14:30 +0000)]
[X86] LowerRotate - use vXi8 custom lowering for non-uniform constant amounts

Instead of bailing and using the default expansion, we can more efficiently use the shl(unpack(x,x),unpack(amt,zero)) pattern for vXi8 rotl, as we'll then use vXi16 fast PMULLW (or PSLLVW).

This required some minor changes to improve constant folding during unpack shuffle creation and convertShiftLeftToScale to support constants that have already been lowered to constant pools.
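
A scalar model of the pattern may help: duplicating a byte into both halves of a 16-bit lane and shifting left leaves the rotated byte in the high half. The sketch below uses hypothetical function names and models one lane only; the multiply variant corresponds to shifting by multiplying with a power of two, as a 16-bit multiply such as PMULLW would.

```cpp
#include <cstdint>

// Reference 8-bit rotate-left.
std::uint8_t rotl8_ref(std::uint8_t x, unsigned amt) {
  amt &= 7;
  return static_cast<std::uint8_t>((x << amt) | (x >> ((8 - amt) & 7)));
}

// shl(unpack(x,x), amt): place x in both bytes of a 16-bit lane, shift left,
// and keep the high byte.
std::uint8_t rotl8_unpack(std::uint8_t x, unsigned amt) {
  amt &= 7;
  const std::uint16_t lane = static_cast<std::uint16_t>((x << 8) | x);
  const std::uint16_t shifted = static_cast<std::uint16_t>(lane << amt);
  return static_cast<std::uint8_t>(shifted >> 8);
}

// Same idea with the shift expressed as a 16-bit multiply by 2^amt.
std::uint8_t rotl8_mul(std::uint8_t x, unsigned amt) {
  amt &= 7;
  const std::uint16_t lane = static_cast<std::uint16_t>((x << 8) | x);
  const std::uint16_t product = static_cast<std::uint16_t>(lane * (1u << amt));
  return static_cast<std::uint8_t>(product >> 8);
}

// Compares all three versions for every byte value and rotate amount.
bool rotate_models_agree() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned amt = 0; amt < 8; ++amt) {
      const std::uint8_t v = static_cast<std::uint8_t>(x);
      if (rotl8_unpack(v, amt) != rotl8_ref(v, amt)) return false;
      if (rotl8_mul(v, amt) != rotl8_ref(v, amt)) return false;
    }
  return true;
}
```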

2 years ago[SLP]Do not represent splats as node with the reused scalars.
Alexey Bataev [Fri, 10 Dec 2021 15:43:52 +0000 (07:43 -0800)]
[SLP]Do not represent splats as node with the reused scalars.

There is no need to represent splats as a node with the reused scalars; it may
increase the cost (currently the pass just ignores the extra shuffle cost, and
that is still not correct).

Differential Revision: https://reviews.llvm.org/D115800

2 years ago[mlir][linalg][bufferize] Replace remaining bvm usage with new API
Matthias Springer [Wed, 15 Dec 2021 14:18:51 +0000 (23:18 +0900)]
[mlir][linalg][bufferize] Replace remaining bvm usage with new API

* Call `replaceOp` instead of `mapBuffer`.
* Remove bvm and all helper functions around bvm.
* Simplify FuncOp bufferization and rely on existing functionality to generate ToMemrefOps for function BlockArguments.

Differential Revision: https://reviews.llvm.org/D115515

2 years agoEmbed licence into package
Guillaume Chatelet [Wed, 15 Dec 2021 14:17:24 +0000 (15:17 +0100)]
Embed licence into package

2 years ago[Driver] Default to contemporary FreeBSD profiling behaviour
Ed Maste [Mon, 22 Nov 2021 21:56:35 +0000 (16:56 -0500)]
[Driver] Default to contemporary FreeBSD profiling behaviour

Prior to FreeBSD 14, FreeBSD provided special _p.a libraries for use
with -pg.  They are no longer used or provided.  If the target does
not specify a major version (e.g. amd64-unknown-freebsd, rather than
amd64-unknown-freebsd12) default to the new behaviour.

Differential Revision: https://reviews.llvm.org/D114396

2 years ago[SLP][NFC]Add a test for broadcast cost with undefs, NFC.
Alexey Bataev [Wed, 15 Dec 2021 13:57:58 +0000 (05:57 -0800)]
[SLP][NFC]Add a test for broadcast cost with undefs, NFC.

2 years ago[clangd] Add ) to signature-help triggers
Kadir Cetinkaya [Wed, 15 Dec 2021 13:33:34 +0000 (14:33 +0100)]
[clangd] Add ) to signature-help triggers

It is important for nested function calls.

Differential Revision: https://reviews.llvm.org/D115799

2 years ago[AArch64][SVE] Lower shuffles to permute instructions: rev/revb/revh/revw
Andrew Wei [Wed, 15 Dec 2021 12:52:00 +0000 (20:52 +0800)]
[AArch64][SVE] Lower shuffles to permute instructions: rev/revb/revh/revw

Attempt to lower a shuffle as a permute instruction(rev/revb/revh/revw) for fixed length SVE.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D114960

2 years ago[CodeGen] Avoid some pointer element type accesses
Nikita Popov [Wed, 15 Dec 2021 13:10:27 +0000 (14:10 +0100)]
[CodeGen] Avoid some pointer element type accesses

2 years ago[Sema] Add FixIt when a C++ out-of-line method has extra/missing const
Sam McCall [Sat, 11 Dec 2021 02:14:05 +0000 (03:14 +0100)]
[Sema] Add FixIt when a C++ out-of-line method has extra/missing const

Differential Revision: https://reviews.llvm.org/D115567

2 years ago[AMDGPU] Use v_fma_f16 on GFX10
Jay Foad [Tue, 14 Dec 2021 15:47:44 +0000 (15:47 +0000)]
[AMDGPU] Use v_fma_f16 on GFX10

Teach convertToThreeAddress to use the V_FMA_F16_gfx9 pseudo (i.e. the
standard instruction in GFX9 onwards) instead of V_FMA_F16 (the legacy
pseudo for GFX8 compatibility, which is no longer supported in GFX10).
This follows the example of macToMad in SIFoldOperands.

Differential Revision: https://reviews.llvm.org/D115731

2 years ago[AMDGPU] Improve zeroesHigh16BitsOfDest for GFX9 legacy opcodes
Jay Foad [Tue, 14 Dec 2021 15:08:31 +0000 (15:08 +0000)]
[AMDGPU] Improve zeroesHigh16BitsOfDest for GFX9 legacy opcodes

Pseudos like V_MAD_U16 and V_FMA_F16 map down to what GFX9 calls
v_mad_legacy_u16 and v_fma_legacy_f16, which are documented to have the
same zeroing behaviour as on GFX8.

Differential Revision: https://reviews.llvm.org/D115729

2 years ago[CodeGen] Pass element type to EmitCheckedInBoundsGEP()
Nikita Popov [Wed, 15 Dec 2021 12:06:28 +0000 (13:06 +0100)]
[CodeGen] Pass element type to EmitCheckedInBoundsGEP()

Same as for other GEP creation methods.

2 years ago[docs] Give the reason why the support for coroutine is partial
Chuanqi Xu [Wed, 15 Dec 2021 12:59:50 +0000 (20:59 +0800)]
[docs] Give the reason why the support for coroutine is partial

This helps users know (roughly) what level of support there
is for the coroutine feature.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D115778

2 years ago[mlir][LLVMIR] Add `llvm.umin` and `llvm.umax` intrinsics
Markus Böck [Wed, 15 Dec 2021 12:54:23 +0000 (13:54 +0100)]
[mlir][LLVMIR] Add `llvm.umin` and `llvm.umax` intrinsics

Ops for the signed counterparts "llvm.smin" and "llvm.smax" already exist. This patch adds the unsigned versions as well.

Differential Revision: https://reviews.llvm.org/D115796

2 years ago[AMDGPU] Skip some work on subtargets without scalar stores. NFC.
Jay Foad [Wed, 15 Dec 2021 12:28:28 +0000 (12:28 +0000)]
[AMDGPU] Skip some work on subtargets without scalar stores. NFC.

2 years ago[DAG] SelectionDAG::isSplatValue - add *_EXTEND_VECTOR_INREG handling
Simon Pilgrim [Wed, 15 Dec 2021 12:21:18 +0000 (12:21 +0000)]
[DAG] SelectionDAG::isSplatValue - add *_EXTEND_VECTOR_INREG handling

Fixes #52719

2 years ago[X86] Add PR52719 test cases
Simon Pilgrim [Wed, 15 Dec 2021 12:01:50 +0000 (12:01 +0000)]
[X86] Add PR52719 test cases

2 years ago[mlir][linalg] Replace LinalgOps.h and LinalgTypes.h by a single header.
gysit [Wed, 15 Dec 2021 12:14:35 +0000 (12:14 +0000)]
[mlir][linalg] Replace LinalgOps.h and LinalgTypes.h by a single header.

After removing the range type, Linalg does not define any type. The revision thus consolidates the LinalgOps.h and LinalgTypes.h into a single Linalg.h header. Additionally, LinalgTypes.cpp is renamed to LinalgDialect.cpp to follow the convention adopted by other dialects such as the tensor dialect.

Depends On D115727

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115728

2 years agoExplicitly convert StringRef to std::string for compatibility with libstdc++ 5.4.0
Dmitri Gribenko [Wed, 15 Dec 2021 11:47:10 +0000 (12:47 +0100)]
Explicitly convert StringRef to std::string for compatibility with libstdc++ 5.4.0

For some reason, the user-defined implicit conversion from StringRef to
std::string is not invoked by std::map::emplace in libstdc++ 5.4.0, even
though it works fine on modern systems.
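
The workaround can be sketched as follows. `MiniRef` is a hypothetical stand-in for a string-reference type with a user-defined conversion to `std::string`; it is not LLVM's `StringRef`, and `count_after_insert` is a made-up helper.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical string-reference type with a user-defined conversion.
struct MiniRef {
  const char* data;
  std::size_t size;
  operator std::string() const { return std::string(data, size); }
};

int count_after_insert(MiniRef key) {
  std::map<std::string, int> counts;
  // Converting up front means emplace receives a std::string directly and
  // never depends on the library applying the conversion itself.
  counts.emplace(std::string(key), 1);
  return counts.at(std::string(key));
}
```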

2 years ago[CodeGen] Avoid some deprecated Address constructors
Nikita Popov [Wed, 15 Dec 2021 10:55:16 +0000 (11:55 +0100)]
[CodeGen] Avoid some deprecated Address constructors

Some of these are on the critical path towards making something
minimal work with opaque pointers.

2 years ago[X86] Enable v16i8/v32i8/v64i8 rotation on AVX512 targets
Simon Pilgrim [Wed, 15 Dec 2021 11:17:10 +0000 (11:17 +0000)]
[X86] Enable v16i8/v32i8/v64i8 rotation on AVX512 targets

We currently rely on generic promotion to vXi16/vXi32 types for rotation lowering on various AVX512 targets.

We can more efficiently perform this by making use of the shl(unpack(x,x),amt) style pattern that we already use for vXi8 rotation by splat amounts, either by widening to a larger vector type or unpacking lo/hi halves of the subvectors so we can access whatever vXi16/vXi32 per-element shifts are supported.

This uncovered an issue in the supportedVectorShiftWithImm/supportedVectorVarShift legality checkers, which were using hasAVX512() instead of useAVX512Regs() to detect support for 512-bit vector shifts.

NOTE: I'm actually hoping to eventually reuse this code for shl(unpack(y,x),amt) funnel shift lowering (vXi8 and wider), but initially I just want to ensure we have efficient ISD::ROTL lowering for all targets.

Differential Revision: https://reviews.llvm.org/D115180

2 years ago[CodeGen] Prefer CreateElementBitCast() where possible
Nikita Popov [Wed, 15 Dec 2021 10:24:27 +0000 (11:24 +0100)]
[CodeGen] Prefer CreateElementBitCast() where possible

CreateElementBitCast() can preserve the pointer element type in
the presence of opaque pointers, so use it in place of CreateBitCast()
in some places. This also sometimes simplifies the code a bit.

2 years ago[analyzer] Expand conversion check to check more expressions for overflow and underflow
Gabor Marton [Tue, 14 Dec 2021 15:46:20 +0000 (16:46 +0100)]
[analyzer] Expand conversion check to check more expressions for overflow and underflow

This expands checking for more expressions. This will check underflow
and loss of precision when using call expressions like:

  void foo(unsigned);
  int i = -1;
  foo(i);

This also includes other expressions as well, so it can catch negative
indices to std::vector, since it uses unsigned integers for the [] and .at()
functions.

Patch by: @pfultz2

Differential Revision: https://reviews.llvm.org/D46081
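
To make the underlying problem concrete, the sketch below shows what the callee actually receives in the situation described above. The function names are hypothetical; the example demonstrates the language behaviour only, not the checker.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Receives the argument after the implicit int -> unsigned conversion.
unsigned as_seen_by_callee(unsigned u) { return u; }

unsigned call_with_negative() {
  int i = -1;
  return as_seen_by_callee(i);  // -1 is converted modulo 2^N, giving UINT_MAX
}

// A negative index becomes a very large size_t before any lookup happens,
// so it can never be a valid position in the vector.
bool negative_index_in_range(const std::vector<int>& v, int index) {
  return static_cast<std::size_t>(index) < v.size();
}
```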

2 years ago[VE] SHL,SRA,SRL v256i32|64 isel and tests
Simon Moll [Wed, 15 Dec 2021 10:31:37 +0000 (11:31 +0100)]
[VE] SHL,SRA,SRL v256i32|64 isel and tests

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D115734

2 years ago[clangd] Disable the NOLINTBEGIN testcase in clangd.
Haojian Wu [Wed, 15 Dec 2021 10:30:23 +0000 (11:30 +0100)]
[clangd] Disable the NOLINTBEGIN testcase in clangd.

NOLINTBEGIN is disabled in 529833377ccdf4381f8bc9961bfa96ec4f5e2eed.

2 years ago[bazel] drop some unnecessary dependencies in mlir
Alex Zinenko [Wed, 15 Dec 2021 10:30:46 +0000 (11:30 +0100)]
[bazel] drop some unnecessary dependencies in mlir

2 years ago[mlir] Use rewriter in linalg Detensorize
Tres Popp [Sat, 23 Oct 2021 11:23:28 +0000 (13:23 +0200)]
[mlir] Use rewriter in linalg Detensorize

This is to allow rollbacks on failures of dialect lowering to succeed.

Differential Revision: https://reviews.llvm.org/D115789

2 years ago[CodeGen] Avoid some uses of deprecated Address constructor
Nikita Popov [Wed, 15 Dec 2021 10:03:58 +0000 (11:03 +0100)]
[CodeGen] Avoid some uses of deprecated Address constructor

Explicitly pass in the element type instead.

2 years ago[mlir][OpenMP] omp.sections and omp.section lowering to LLVM IR
Shraiysh Vaishay [Wed, 15 Dec 2021 09:53:30 +0000 (15:23 +0530)]
[mlir][OpenMP] omp.sections and omp.section lowering to LLVM IR

This patch adds lowering from omp.sections and omp.section (simple lowering along with the nowait clause) to LLVM IR.
Tests for the same are also added.

Reviewed By: ftynse, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D115030

2 years ago[clangd] Disable support for clang-tidy suppression blocks (NOLINTBEGIN)
Sam McCall [Mon, 13 Dec 2021 18:16:33 +0000 (19:16 +0100)]
[clangd] Disable support for clang-tidy suppression blocks (NOLINTBEGIN)

The implementation is very inefficient and we pay the cost even when the feature is not used

Differential Revision: https://reviews.llvm.org/D115650

2 years ago[bazel] Adjust Bazel BUILD files for a4830d14edbb2a21eb35f3d79d1f64bd09db8b1c
Dmitri Gribenko [Wed, 15 Dec 2021 09:57:35 +0000 (10:57 +0100)]
[bazel] Adjust Bazel BUILD files for a4830d14edbb2a21eb35f3d79d1f64bd09db8b1c

2 years ago[mlir][linalg][bufferize] Reimplementation of TiledLoopOp bufferization
Matthias Springer [Wed, 15 Dec 2021 09:43:24 +0000 (18:43 +0900)]
[mlir][linalg][bufferize] Reimplementation of TiledLoopOp bufferization

Instead of modifying the existing linalg.tiled_loop op, create a new op with memref input/outputs and delete the old op.

Differential Revision: https://reviews.llvm.org/D115493

2 years ago[CodeGen] Avoid deprecated ConstantAddress constructor
Nikita Popov [Wed, 15 Dec 2021 08:38:48 +0000 (09:38 +0100)]
[CodeGen] Avoid deprecated ConstantAddress constructor

Change all uses of the deprecated constructor to pass the
element type explicitly, then drop the constructor.

For cases where the correct element type was not immediately
obvious to me or would require a slightly larger change I'm
falling back to explicitly calling getPointerElementType() for now.

2 years ago[mlir][linalg][bufferize] Reimplementation of scf.if bufferization
Matthias Springer [Wed, 15 Dec 2021 09:32:13 +0000 (18:32 +0900)]
[mlir][linalg][bufferize] Reimplementation of scf.if bufferization

Instead of modifying the existing scf.if op, create a new op with memref OpOperands/OpResults and delete the old op.

New allocations / other memrefs can now be yielded from the op. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.

Differential Revision: https://reviews.llvm.org/D115491

2 years ago[mlir][RFC] Add scalable dimensions to VectorType
Javier Setoain [Tue, 12 Oct 2021 13:26:01 +0000 (14:26 +0100)]
[mlir][RFC] Add scalable dimensions to VectorType

With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also gets
rid of those.

Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.

Scalable dimensions are indicated with the scalable dimensions
between square brackets:

        vector<[4]xf32>

This is a scalable vector of 4 single-precision floating-point elements.

More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:

        vector<2x[4x4]xf32>

This is a vector with 2 scalable 4x4 vectors of single-precision floating-point
elements.

The scale of the scalable dimensions can be obtained with the Vector
operation:

        %vs = vector.vscale

This change is being discussed in the discourse RFC:

https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484

Differential Revision: https://reviews.llvm.org/D111819

2 years ago[mlir][linalg][bufferize] Reimplementation of scf.for bufferization
Matthias Springer [Wed, 15 Dec 2021 09:26:27 +0000 (18:26 +0900)]
[mlir][linalg][bufferize] Reimplementation of scf.for bufferization

Instead of modifying the existing scf.for op, create a new op with memref OpOperands/OpResults and delete the old op.

New allocations / other memrefs can now be yielded from the loop. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.

This change also introduces `replaceOp`, which will be utilized by all other `bufferize` implementations in future commits. Bufferization will then no longer rely on old (pre-bufferize) ops to DCE away. Instead old ops are deleted on the spot. This improves debuggability because there won't be any duplicate ops anymore (bufferized + not-yet-bufferized) when dumping IR during bufferization. It is also less fragile because unbufferized IR can no longer silently "hang around" due to an implementation bug.

Differential Revision: https://reviews.llvm.org/D114926

2 years ago[ELF] --gc-sections: Change startwith(".jcr") to exact match
Fangrui Song [Wed, 15 Dec 2021 09:27:08 +0000 (01:27 -0800)]
[ELF] --gc-sections: Change startwith(".jcr") to exact match

GNU ld's internal linker script keeps `.jcr`, but not other sections
starting with `.jcr`.

2 years ago[mlir] Added documentation for bufferization to memref conversion pass.
Julian Gross [Wed, 8 Dec 2021 10:29:47 +0000 (11:29 +0100)]
[mlir] Added documentation for bufferization to memref conversion pass.

Added documentation to clarify the purpose of the bufferization to memref pass
and added some remarks.

Differential Revision: https://reviews.llvm.org/D115326

2 years ago[ELF] --gc-sections: Change startwith(".init") (and ".fini") to exact match
Fangrui Song [Wed, 15 Dec 2021 09:16:25 +0000 (01:16 -0800)]
[ELF] --gc-sections: Change startwith(".init") (and ".fini") to exact match

GNU ld's internal linker script keeps `.init`, but not other sections starting
with `.init`. .fini is similar.
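
The difference between the two matching rules, as a sketch with hypothetical function names (this is not lld's code, and `.init.foo` is a made-up section name):

```cpp
#include <string>

bool starts_with(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Prefix rule: also retains any section whose name merely begins with
// ".init" or ".fini".
bool kept_by_prefix(const std::string& name) {
  return starts_with(name, ".init") || starts_with(name, ".fini");
}

// Exact rule: retains only the sections named exactly ".init" or ".fini".
bool kept_by_exact_match(const std::string& name) {
  return name == ".init" || name == ".fini";
}
```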

2 years ago[ELF] Change objectFiles to ELFFileBase *
Fangrui Song [Wed, 15 Dec 2021 08:37:10 +0000 (00:37 -0800)]
[ELF] Change objectFiles to ELFFileBase *

This can sometimes avoid `cast<ObjFile<...>>`.

I intentionally do not touch postScanRelocations to wait for its stabilization.

2 years ago[CodeGen] Avoid some pointer element type accesses
Nikita Popov [Wed, 15 Dec 2021 08:27:49 +0000 (09:27 +0100)]
[CodeGen] Avoid some pointer element type accesses

2 years ago[ELF] Adjust getOutputSectionName prefix order
Fangrui Song [Wed, 15 Dec 2021 08:18:58 +0000 (00:18 -0800)]
[ELF] Adjust getOutputSectionName prefix order

Sorting the prefixes by decreasing frequency can improve performance.
.gcc_except_table is relatively frequent, so move it ahead.
.ctors and .dtors mostly disappear and should be the last.
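
Why the order matters can be seen in a first-match-wins loop like the sketch below: names matching an early prefix need fewer comparisons. The function name, the prefix list and its order are illustrative assumptions, not the list used by lld.

```cpp
#include <string>
#include <vector>

// Maps an input section name to an output section name by prefix.
// Prefixes are tried in order, so frequent ones are placed first.
std::string output_section_name(const std::string& name) {
  static const std::vector<std::string> prefixes = {
      ".text", ".data", ".rodata", ".gcc_except_table",
      ".bss",  ".ctors", ".dtors"};
  for (const std::string& p : prefixes) {
    // Match either the prefix itself or the prefix followed by '.'.
    if (name == p || name.compare(0, p.size() + 1, p + ".") == 0) return p;
  }
  return name;  // no known prefix: keep the input name
}
```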

2 years ago[CodeGen] Store ElementType in Address
Nikita Popov [Tue, 14 Dec 2021 13:37:23 +0000 (14:37 +0100)]
[CodeGen] Store ElementType in Address

Explicitly track the pointer element type in Address, rather than
deriving it from the pointer type, which will no longer be possible
with opaque pointers. This just adds the basic facility; for now
everything is still going through the deprecated constructors.

I had to adjust one place in the LValue implementation to satisfy
the new assertions: Global registers are represented as a
MetadataAsValue, which does not have a pointer type. We should
avoid using Address in this case.

This implements a part of D103465.

Differential Revision: https://reviews.llvm.org/D115725

2 years ago[ELF] Slightly speed up getOutputSectionName. NFC
Fangrui Song [Wed, 15 Dec 2021 07:43:00 +0000 (23:43 -0800)]
[ELF] Slightly speed up getOutputSectionName. NFC

2 years ago[libc++][NFC] Use _LIBCPP_DEBUG_ASSERT in <string>
Nikolas Klauser [Wed, 15 Dec 2021 00:32:30 +0000 (01:32 +0100)]
[libc++][NFC] Use _LIBCPP_DEBUG_ASSERT in <string>

Use `_LIBCPP_DEBUG_ASSERT` instead of `_LIBCPP_ASSERT` guarded by `_LIBCPP_DEBUG_LEVEL == 2`.

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D115765

2 years ago[DebugInfo][DWARF] emit DW_AT_accessibility attribute for class/struct/union types.
Esme-Yi [Wed, 15 Dec 2021 07:38:12 +0000 (07:38 +0000)]
[DebugInfo][DWARF] emit DW_AT_accessibility attribute for class/struct/union types.

Summary:
This patch emits the DW_AT_accessibility attribute for
class/struct/union types in the LLVM part.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D115606

2 years ago[mlir][linalg] Remove RangeOp and RangeType.
gysit [Wed, 15 Dec 2021 07:10:32 +0000 (07:10 +0000)]
[mlir][linalg] Remove RangeOp and RangeType.

Remove the RangeOp and the RangeType that are not actively used anymore. After removing RangeType, the LinalgTypes header only includes the generated dialect header.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115727

2 years ago[adt] Fix compiler warning in test
Matthias Springer [Wed, 15 Dec 2021 07:12:17 +0000 (16:12 +0900)]
[adt] Fix compiler warning in test

Differential Revision: https://reviews.llvm.org/D115589

2 years ago[ELF] Remove dead code from SymbolTable::find
Fangrui Song [Wed, 15 Dec 2021 06:41:52 +0000 (22:41 -0800)]
[ELF] Remove dead code from SymbolTable::find

2 years agoPrint the sign of negative infinity
Logan Chien [Sat, 16 Oct 2021 00:43:25 +0000 (17:43 -0700)]
Print the sign of negative infinity

Differential Revision: https://reviews.llvm.org/D111917

2 years ago[RISCV] Add more curly braces to constexpr array initialization to hopefully appease...
Craig Topper [Wed, 15 Dec 2021 05:44:23 +0000 (21:44 -0800)]
[RISCV] Add more curly braces to constexpr array initialization to hopefully appease gcc 5.

Build bot failure found after D115668.

2 years ago[ELF] Use SmallVector for SharedFile and simplify parseVerdefs
Fangrui Song [Wed, 15 Dec 2021 05:11:45 +0000 (21:11 -0800)]
[ELF] Use SmallVector for SharedFile and simplify parseVerdefs

SHT_GNU_verdef is typically small, so it's unnecessary to reserve the vector.

While here, fix a hypothetical issue when SHT_GNU_verdef has non-increasing
version indexes, which doesn't happen with the output of GNU ld, gold, or ld.lld.

My x86-64 lld executable is 256 bytes smaller.

2 years ago[ELF] Make InputFile smaller
Fangrui Song [Wed, 15 Dec 2021 04:55:32 +0000 (20:55 -0800)]
[ELF] Make InputFile smaller

sizeof(ObjFile<ELF64LE>) is decreased from 344 to 272 on an ELF64 system.
In a large link with 30000 ObjFiles, this may save 2+ MiB.
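
The estimate can be checked with a short compile-time calculation. The constant names are illustrative; the three input numbers are the ones quoted above.

```cpp
#include <cstddef>

constexpr std::size_t kOldSize = 344;        // sizeof(ObjFile<ELF64LE>) before
constexpr std::size_t kNewSize = 272;        // sizeof(ObjFile<ELF64LE>) after
constexpr std::size_t kNumObjFiles = 30000;  // size of the example link

// 72 bytes saved per ObjFile, times 30000 files.
constexpr std::size_t kSavedBytes = (kOldSize - kNewSize) * kNumObjFiles;

static_assert(kSavedBytes == 2160000);
// 2 MiB is 2097152 bytes, so the saving is slightly more than 2 MiB.
static_assert(kSavedBytes > 2 * 1024 * 1024);
```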

Change std::vector members to SmallVector, and std::string members to
SmallString<0> (these members typically don't benefit from small string optimization).
On Linux x86-64 the lld executable is ~6k smaller.

2 years ago[gn build] (manually) port b45ad7363c30 (LLVM_WITH_Z3)
Nico Weber [Wed, 15 Dec 2021 04:11:42 +0000 (23:11 -0500)]
[gn build] (manually) port b45ad7363c30 (LLVM_WITH_Z3)

2 years ago[compiler-rt][AArch64] Add a workaround for Exynos 9810
Stephen Hines [Wed, 15 Dec 2021 01:20:06 +0000 (17:20 -0800)]
[compiler-rt][AArch64] Add a workaround for Exynos 9810

Big.LITTLE Heterogeneous architectures, as described by ARM [1],
require that the instruction set architecture of the big and little
cores be compatible. However, the Samsung Exynos 9810 is known to
have different ISAs across its cores.
According to [2], some cores are ARMv8.2 and others are ARMv8.0.

Since LSE is for ARMv8.1 and later, it should be disabled
for this broken CPU.

[1] https://developer.arm.com/documentation/den0024/a/big-LITTLE-Technology
[2] https://github.com/golang/go/issues/28431

Patch by: Byoungchan Lee (byoungchan.lee@gmx.com)
Reviewed By: srhines

Differential Revision: https://reviews.llvm.org/D114523

2 years ago[llvm-jitlink] Update handling of library options.
Lang Hames [Fri, 10 Dec 2021 20:53:59 +0000 (07:53 +1100)]
[llvm-jitlink] Update handling of library options.

Adds -L<search-path> and -l<library> options that are analogous to ld's
versions.

Each instance of -L<search-path> or -l<library> will apply to the most recent
-jd option on the command line (-jd <name> creates a JITDylib with the given
name). Library names will match against JITDylibs first, then llvm-jitlink will
look through the search paths for files named <search-path>/lib<library>.dylib
or <search-path>/lib<library>.a.
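
The lookup order described above can be sketched as follows. This is a simplified model, not llvm-jitlink's implementation: `Resolution` and `resolveLibrary` are hypothetical names, the function only builds candidate paths without touching the file system, and trying `.dylib` before `.a` within each search path is an assumption.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: either the library names a JITDylib, or we get
// the list of on-disk paths that would have to be probed.
struct Resolution {
  bool isJITDylib;
  std::vector<std::string> candidates;
};

Resolution resolveLibrary(const std::string &lib,
                          const std::set<std::string> &jitDylibs,
                          const std::vector<std::string> &searchPaths) {
  assert(!lib.empty() && "library name must not be empty");
  // Library names are matched against JITDylibs first.
  if (jitDylibs.count(lib))
    return {true, {}};
  // Otherwise build <search-path>/lib<library>.dylib and .a candidates.
  Resolution result{false, {}};
  for (const std::string &path : searchPaths) {
    result.candidates.push_back(path + "/lib" + lib + ".dylib");
    result.candidates.push_back(path + "/lib" + lib + ".a");
  }
  return result;
}
```

With JITDylibs `Foo` and `Baz` and one search path, `-lBaz` resolves to the JITDylib, while `-lbar` yields the two on-disk candidates for that path.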

The default "main" JITDylib will link against all JITDylibs created by -jd
options, and all JITDylibs will link against the process symbols (unless
-no-process-symbols is specified).

The -dlopen option is renamed -preload, and will load dylibs into the JITDylib
for the ORC runtime only.

The effect of these changes is to make it easier to describe a non-trivial
program layout to llvm-jitlink for testing purposes. E.g. the following
invocation describes a program consisting of three JITDylibs: "main" (created
implicitly) containing main.o, "Foo" containing foo1.o and foo2.o, and linking
against library "bar" (not a JITDylib, so it must be a .dylib or .a on disk)
and "Baz" (which is a JITDylib), and "Baz" containing baz.o.

llvm-jitlink \
  main.o \
  -jd Foo foo1.o foo2.o -L${HOME}/lib -lbar -lBaz \
  -jd Baz baz.o

2 years ago[clang] Use usual lit pattern for CLANG_DEFAULT_PIE_ON_LINUX and LLVM_WITH_Z3
Nico Weber [Tue, 14 Dec 2021 20:10:41 +0000 (15:10 -0500)]
[clang] Use usual lit pattern for CLANG_DEFAULT_PIE_ON_LINUX and LLVM_WITH_Z3

See D28294 for context.

Differential Revision: https://reviews.llvm.org/D115751

2 years ago[ASan] Added NO_EXEC_STACK_DIRECTIVE to assembly callback file.
Kirill Stoimenov [Fri, 10 Dec 2021 21:44:14 +0000 (21:44 +0000)]
[ASan] Added NO_EXEC_STACK_DIRECTIVE to assembly callback file.

This is present in our assembly files. It should fix decorate_proc_maps.cpp failures because of shadow memory being allocated as executable.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D115552

2 years ago[llvm-profgen] Turn on preinliner by default
Wenlei He [Wed, 15 Dec 2021 01:10:47 +0000 (17:10 -0800)]
[llvm-profgen] Turn on preinliner by default

The preinliner has been tuned on large server workloads and is now ready to be turned on by default. This change also updates the thresholds based on tuning.

Differential Revision: https://reviews.llvm.org/D115770

2 years agoAvoid setting tbaa on the store of return type of call to inline assembler.
Sindhu Chittireddy [Wed, 15 Dec 2021 01:40:33 +0000 (17:40 -0800)]
Avoid setting tbaa on the store of return type of call to inline assembler.

In 32-bit mode, attaching TBAA metadata to the store following the call
to inline assembler results in describing the wrong type by making a
fake lvalue (i.e., whatever the inline assembler happens to leave in
EAX:EDX). Even if the inline assembler somehow describes the correct type,
setting TBAA information on the return type of a call to inline assembler is
likely not correct, since TBAA rules need not apply to inline assembler.

Differential Revision: https://reviews.llvm.org/D115320

2 years ago[gn build] Port 4299d8d0ce42
LLVM GN Syncbot [Wed, 15 Dec 2021 01:15:06 +0000 (01:15 +0000)]
[gn build] Port 4299d8d0ce42

2 years ago[clangd] Cleanup unneeded use of shared_ptr. NFC
Sam McCall [Wed, 15 Dec 2021 01:13:26 +0000 (02:13 +0100)]
[clangd] Cleanup unneeded use of shared_ptr. NFC

2 years ago[ORC] Add MaterializationUnit::Interface parameter to ObjectLayer::add.
Lang Hames [Wed, 8 Dec 2021 09:25:53 +0000 (20:25 +1100)]
[ORC] Add MaterializationUnit::Interface parameter to ObjectLayer::add.

Also moves object interface building functions out of Mangling.h and into the
new ObjectFileInterfaces.h header, and updates the llvm-jitlink tool to use
custom object interfaces rather than a custom link layer.

ObjectLayer::add overloads are added to match the old signatures (which
do not take a MaterializationUnit::Interface). These overloads use the
standard getObjectFileInterface function to build an interface.

Passing a MaterializationUnit::Interface explicitly makes it easier to alter
the effective interface of the object file being added, e.g. by changing symbol
visibility/linkage, or renaming symbols (in both cases the changes will need to
be mirrored by a JITLink pass at link time to update the LinkGraph to match the
explicit interface). Altering interfaces in this way can be useful when lazily
compiling (e.g. for renaming function bodies) or emulating linker options (e.g.
demoting all symbols to hidden visibility to emulate -load_hidden).

2 years ago[gn build] Port 3f630cff65fc
LLVM GN Syncbot [Wed, 15 Dec 2021 00:46:46 +0000 (00:46 +0000)]
[gn build] Port 3f630cff65fc

2 years ago[CSSPGO][llvm-profgen] Fix external address issues of perf reader (return to external...
wlei [Tue, 14 Dec 2021 04:33:33 +0000 (20:33 -0800)]
[CSSPGO][llvm-profgen] Fix external address issues of perf reader (return to external addr part)

Previously we had an issue with artificial LBRs whose source is a return. Recall that internal code (A) can return to an external address, and a call from that external address into other internal code (B) then produces an artificial branch that looks like a return from A to B, which can confuse the unwinder. We simply ignored the LBRs after such an artificial LBR, which could miss some samples. This change fixes that by unwinding them correctly instead of ignoring them.

Some typical scenarios covered by this change:

1) Multiple sequential callbacks happen at external addresses, e.g.

```
[ext, call, foo] [foo, return, ext] [ext, call, bar]
```
The unwinder should avoid having foo return from bar. The wrong call stack would look like [foo, bar].

2) The call stack before and after an external call should be correctly unwound.
```
 {call stack1}                                            {call stack2}
 [foo, call, ext]  [ext, call, bar]  [bar, return, ext]  [ext, return, foo ]
```
Call stack 1 should be the same as call stack 2. Neither should be truncated.

3) The call stack should be truncated after a call into external code, since we can't do inlining with external code.

```
 [foo, call, ext]  [ext, call, bar]  [bar, call, baz] [baz, return, bar ] [bar, return, ext]
```
The call stack of code in baz should not include foo.

### Implementation:

We leverage artificial frames to fix #2 and #3: when we get a return artificial LBR, we push an extra artificial frame onto the stack. When we pop a frame, we check whether the parent is an artificial frame and pop it as well (fix #2). As a result, a call/return artificial LBR behaves just like a regular LBR and keeps the call stack intact.

While recording context on the trie, the artificial frame is used as a tag indicating that we should truncate the call stack (fix #3).

To differentiate #1 and #2, we leverage `getCallAddrFromFrameAddr`. Normally the target of the return should be the instruction following a call instruction, and `getCallAddrFromFrameAddr` will return the address of that call instruction. Otherwise, `getCallAddrFromFrameAddr` will return 0, which is the case of #1.
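
The check that separates #1 from #2 can be sketched with a toy model. Everything here is hypothetical: the addresses, the `kCalls` table, and the names `callAddrFromFrameAddr` and `classifyReturnTarget` merely stand in for the real lookup, which llvm-profgen performs against the profiled binary.

```cpp
#include <array>
#include <cstdint>

// Toy model of a binary: each entry is a call instruction {address, size}.
struct CallInst {
  std::uint64_t addr;
  std::uint64_t size;
};
constexpr std::array<CallInst, 2> kCalls = {{{0x1000, 5}, {0x2000, 5}}};

// Toy stand-in for getCallAddrFromFrameAddr: if `frameAddr` is the
// instruction right after a call, return that call's address, else 0.
constexpr std::uint64_t callAddrFromFrameAddr(std::uint64_t frameAddr) {
  for (const CallInst &call : kCalls)
    if (call.addr + call.size == frameAddr)
      return call.addr;
  return 0;
}

enum class ReturnKind {
  ReturnsToCallSite,  // scenario #2: target follows a call instruction
  NotACallSite        // scenario #1: target is not preceded by a call
};

constexpr ReturnKind classifyReturnTarget(std::uint64_t target) {
  return callAddrFromFrameAddr(target) != 0 ? ReturnKind::ReturnsToCallSite
                                            : ReturnKind::NotACallSite;
}

static_assert(classifyReturnTarget(0x1005) == ReturnKind::ReturnsToCallSite);
static_assert(classifyReturnTarget(0x3000) == ReturnKind::NotACallSite);
```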

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D115550

2 years ago[llvm-profgen] Fix to use getUntrackedCallsites outside the loop
wlei [Sun, 12 Dec 2021 09:42:53 +0000 (01:42 -0800)]
[llvm-profgen] Fix to use getUntrackedCallsites outside the loop

Unwinder is hoisted out in https://reviews.llvm.org/D115550, so fix the usage of getUntrackedCallsites.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D115760