platform/upstream/llvm.git
2 years ago[gn build] Port ca875539f788
LLVM GN Syncbot [Wed, 18 May 2022 13:39:15 +0000 (13:39 +0000)]
[gn build] Port ca875539f788

2 years agoAssert on polymorphic pointer intrinsic param
Thomas Preud'homme [Tue, 17 May 2022 10:47:43 +0000 (11:47 +0100)]
Assert on polymorphic pointer intrinsic param

Opaque pointers cannot be polymorphic on the pointed type given their
lack thereof. However they are currently accepted by tablegen but the
intrinsic signature verifier trips when verifying any further
polymorphic type because the opaque pointer codepath for pointers will
not push the pointed type in ArgTys.

This commit adds an assert to easily catch such cases instead of having
the generic signature match failure.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D125764

2 years ago[CGP] Regenerate test checks (NFC)
Nikita Popov [Wed, 18 May 2022 13:35:00 +0000 (15:35 +0200)]
[CGP] Regenerate test checks (NFC)

2 years ago[lit] pass LLVM_SYMBOLIZER_PATH through to tests.
Sam McCall [Wed, 18 May 2022 13:28:43 +0000 (15:28 +0200)]
[lit] pass LLVM_SYMBOLIZER_PATH through to tests.

Currently several buildbots give unsymbolized traces on crash.
I suspect these are configuring the symbolizer in this way and regressed in
D122251 or thereabouts.

Trying this coupled with a reland of patch that failed on a couple of bots with
no useful stacktrace...

2 years agoReland(2) "[clangd] Indexing of standard library"
Sam McCall [Wed, 18 May 2022 12:23:02 +0000 (14:23 +0200)]
Reland(2) "[clangd] Indexing of standard library"

This reverts commit 6aabf60f2fb7589430c0ecc8fe95913c973fa248.

2 years ago[InstCombine] avoid crash on fold of icmp with cast operand
Sanjay Patel [Wed, 18 May 2022 13:07:07 +0000 (09:07 -0400)]
[InstCombine] avoid crash on fold of icmp with cast operand

We could do better by inserting a bitcast from scalar int
to vector int or using an insertelement (the alternate test
does not crash because there's an independent fold like that).

But this doesn't seem like a likely pattern, so just bail out
for now.

Fixes issue #55516.

2 years ago[InstCombine] reduce code duplication for checking types; NFC
Sanjay Patel [Wed, 18 May 2022 12:34:45 +0000 (08:34 -0400)]
[InstCombine] reduce code duplication for checking types; NFC

2 years ago[lldb][AArch64] Fix corefile memory reads when there are non-address bits
David Spickett [Mon, 4 Apr 2022 13:42:24 +0000 (14:42 +0100)]
[lldb][AArch64] Fix corefile memory reads when there are non-address bits

Previously if you read a code/data mask before there was a valid thread
you would get the top byte mask. This meant the value was "valid" as in,
don't read it again.

When using a corefile we ask for the data mask very early on and this
meant that later once you did have a thread it wouldn't read the
register to get the rest of the mask.

This fixes that and adds a corefile test generated from the same program
as in my previous change on this theme.

Depends on D118794

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D122411

2 years ago[InstCombine] Remove disable-verify tests (NFC)
Nikita Popov [Wed, 18 May 2022 12:47:38 +0000 (14:47 +0200)]
[InstCombine] Remove disable-verify tests (NFC)

InstCombine is not required to do anything sensible if it receives
invalid IR.

These tests seem to be testing self-referential instructions that
may occur in unreachable code -- but InstCombine actually goes out
of the way to remove such instructions ahead of time so it doesn't
need to deal with them.

2 years ago[AMDGPU][MC][GFX940] Correct tied operand decoding for smfmac opcodes
Dmitry Preobrazhensky [Wed, 18 May 2022 12:37:30 +0000 (15:37 +0300)]
[AMDGPU][MC][GFX940] Correct tied operand decoding for smfmac opcodes

Differential Revision: https://reviews.llvm.org/D125790

2 years ago[libcxx] [test] Include header for strverscmp
Jonathan Wakely [Wed, 18 May 2022 12:36:20 +0000 (14:36 +0200)]
[libcxx] [test] Include header for strverscmp

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D122570

2 years ago[libcxx] [test] Add missing header for std::numeric_limits
Jonathan Wakely [Wed, 18 May 2022 12:35:57 +0000 (14:35 +0200)]
[libcxx] [test] Add missing header for std::numeric_limits

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D122571

2 years ago[X86] addcarry.ll - add nounwind to prevent cfi noise on tests
Simon Pilgrim [Wed, 18 May 2022 12:30:43 +0000 (13:30 +0100)]
[X86] addcarry.ll - add nounwind to prevent cfi noise on tests

2 years ago[AArch64] neon-vmull-high-p64.ll - fix name/check mismatch identified in D125604
Simon Pilgrim [Wed, 18 May 2022 12:24:20 +0000 (13:24 +0100)]
[AArch64] neon-vmull-high-p64.ll - fix name/check mismatch identified in D125604

Typos meant that we weren't actually checking the function name, which wasn't accounting for mangling

2 years ago[Security Group] Update representative for Rust.
Pietro Albini [Wed, 18 May 2022 12:16:39 +0000 (14:16 +0200)]
[Security Group] Update representative for Rust.

Steve Klabnik recently left the Rust project. Josh Stone (the other member of
the Rust Security Response WG) replaces him as one of the vendor contacts for
Rust.

Differential Revision: https://reviews.llvm.org/D119137

2 years ago[AMDGPU][MC][GFX7] Disable cache policy modifiers with SMRD
Dmitry Preobrazhensky [Wed, 18 May 2022 12:12:01 +0000 (15:12 +0300)]
[AMDGPU][MC][GFX7] Disable cache policy modifiers with SMRD

Differential Revision: https://reviews.llvm.org/D125799

2 years ago[ARM] Don't Enable AES Pass for Generic Cores
Archibald Elliott [Wed, 18 May 2022 12:10:22 +0000 (13:10 +0100)]
[ARM] Don't Enable AES Pass for Generic Cores

This brings clang/llvm into line with GCC. The Pass is still enabled for
the affected cores, but is now opt-in when using `-march=`.

I also took the opportunity to add release notes for this change.

Reviewed By: john.brawn

Differential Revision: https://reviews.llvm.org/D125775

2 years ago[mlir][complex] Add pow/sqrt/tanh ops and lowering to libm
Benjamin Kramer [Fri, 13 May 2022 15:12:11 +0000 (17:12 +0200)]
[mlir][complex] Add pow/sqrt/tanh ops and lowering to libm

Lowering through libm gives us a baseline version, even though it's not
going to be particularly fast. This is similar to what we do for some
math dialect ops.

Differential Revision: https://reviews.llvm.org/D125550

2 years ago[OpenCL] Add cl_khr_subgroup_rotate builtins
Sven van Haastregt [Wed, 18 May 2022 12:02:17 +0000 (13:02 +0100)]
[OpenCL] Add cl_khr_subgroup_rotate builtins

Differential Revision: https://reviews.llvm.org/D124256

2 years ago[AMDGPU][MC][NFC] MUBUF code cleanup
Dmitry Preobrazhensky [Wed, 18 May 2022 11:58:45 +0000 (14:58 +0300)]
[AMDGPU][MC][NFC] MUBUF code cleanup

Removed code that is no longer used after https://reviews.llvm.org/D124485.

Differential Revision: https://reviews.llvm.org/D125811

2 years ago[lldb] Remove non-address bits from read/write addresses in lldb
David Spickett [Fri, 28 Jan 2022 13:47:30 +0000 (13:47 +0000)]
[lldb] Remove non-address bits from read/write addresses in lldb

Non-address bits are not part of the virtual address in a pointer.
So they must be removed before passing to interfaces like ptrace.

Some of them we get away with not removing, like AArch64's top byte.
However this is only because of a hardware feature that ignores them.

This change updates all the Process/Target Read/Write memory methods
to remove non-address bits before using addresses.

Doing it in this way keeps lldb-server simple and also fixes the
memory caching when differently tagged pointers for the same location
are read.

Removing the bits is done at the ReadMemory level not DoReadMemory
because particularly for process, many subclasses override DoReadMemory.
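
As a rough illustration (not lldb's actual code), the masking step and its
effect on caching could look like the sketch below. The mask value and the
names remove_non_address_bits and CachingReader are assumptions made for this
example only.

```python
# Sketch only: the mask value and helper names are assumptions, not lldb's API.
TOP_BYTE_MASK = 0xFF00_0000_0000_0000  # AArch64 top byte (bits 56-63)


def remove_non_address_bits(pointer, mask=TOP_BYTE_MASK):
    """Clear every bit that is set in `mask`, leaving the virtual address."""
    return pointer & ~mask & 0xFFFF_FFFF_FFFF_FFFF


class CachingReader:
    """Toy memory cache keyed by the stripped address."""

    def __init__(self, memory):
        self.memory = memory      # dict: address -> byte value
        self.cache = {}
        self.backend_reads = 0

    def read_byte(self, pointer):
        # Strip the bits before the cache lookup, so differently tagged
        # pointers to the same location share one cache entry.
        addr = remove_non_address_bits(pointer)
        if addr not in self.cache:
            self.backend_reads += 1
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]
```

Because the stripping happens in the outer read method, a subclass that only
overrides the inner backend read would never see a tagged address.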

Tests have been added for read/write at the command and API level,
for process and target. This includes variants like
Read<sometype>FromMemory. Commands are tested to make sure we remove
at the command and API level.

"memory find" is not included because:
* There is no API for it.
* It already has its own address handling tests.

Software breakpoints do use these methods but they are not tested
here because there are bigger issues to fix with those. This will
happen in another change.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D118794

2 years agoRevert "[lldb] Add --all option to "memory region""
David Spickett [Wed, 18 May 2022 11:57:20 +0000 (11:57 +0000)]
Revert "[lldb] Add --all option to "memory region""

This reverts commit 8e648f195c3d57e573fdd8023edcfd80e0516c61
due to test failures on Windows:
https://lab.llvm.org/buildbot/#/builders/83/builds/19094

2 years ago[AArch64] fp16-v8-instructions.ll - remove some old defunct CHECKS identified in...
Simon Pilgrim [Wed, 18 May 2022 11:49:05 +0000 (12:49 +0100)]
[AArch64] fp16-v8-instructions.ll - remove some old defunct CHECKS identified in D125604

Typos meant that the update script never removed them

2 years ago[DebugInfo][X86] debug-info-template-parameter.ll - fix broken DW_AT_default_value...
Simon Pilgrim [Wed, 18 May 2022 11:39:02 +0000 (12:39 +0100)]
[DebugInfo][X86] debug-info-template-parameter.ll - fix broken DW_AT_default_value checks identified in D125604

2 years ago[X86] statepoint-vreg-details.ll - fix CHECK-VREG-LABEL typo identified in D125604
Simon Pilgrim [Wed, 18 May 2022 11:33:53 +0000 (12:33 +0100)]
[X86] statepoint-vreg-details.ll - fix CHECK-VREG-LABEL typo identified in D125604

2 years ago[X86] lvi-hardening-indirectbr.ll - fix X64-NOT typo identified in D125604
Simon Pilgrim [Wed, 18 May 2022 11:33:27 +0000 (12:33 +0100)]
[X86] lvi-hardening-indirectbr.ll - fix X64-NOT typo identified in D125604

2 years ago[X86] copy-propagation.ll - fix CHECK-NEXT typo identified in D125604
Simon Pilgrim [Wed, 18 May 2022 11:32:31 +0000 (12:32 +0100)]
[X86] copy-propagation.ll - fix CHECK-NEXT typo identified in D125604

2 years ago[X86] coalesce-dead-lanes.mir - fix CHECK-LABEL typo identified in D125604
Simon Pilgrim [Wed, 18 May 2022 11:31:42 +0000 (12:31 +0100)]
[X86] coalesce-dead-lanes.mir - fix CHECK-LABEL typo identified in D125604

2 years ago[X86] Regenerate select-ext.ll test for D125604
Simon Pilgrim [Wed, 18 May 2022 11:25:35 +0000 (12:25 +0100)]
[X86] Regenerate select-ext.ll test for D125604

GlobalISel tests are barely supported on X86, so just regenerate for now to avoid a blocker

2 years agoAArch64: fall back to DWARF instead of crashing on weird .cfi directives
Tim Northover [Wed, 18 May 2022 10:40:46 +0000 (11:40 +0100)]
AArch64: fall back to DWARF instead of crashing on weird .cfi directives

CodeGen will only produce fixed format prologues, but hand-written assembly
can have .cfi directives in any combination they want. This should cause a
fallback to DWARF rather than an assertion failure (or an incorrect compact
unwind if assertions are disabled).

2 years ago[lldb] Add --all option to "memory region"
David Spickett [Wed, 18 May 2022 10:30:14 +0000 (10:30 +0000)]
[lldb] Add --all option to "memory region"

This adds an option to the memory region command
to print all regions at once, as you can do by
starting at address 0 and repeating the command
manually.

memory region [-a] [<address-expression>]

(lldb) memory region --all
[0x0000000000000000-0x0000000000400000) ---
[0x0000000000400000-0x0000000000401000) r-x <...>/a.out PT_LOAD[0]
<...>
[0x0000fffffffdf000-0x0001000000000000) rw- [stack]
[0x0001000000000000-0xffffffffffffffff) ---

The output matches exactly what you'd get from
repeating the command, including the unmapped
areas between the mapped regions.

(this is why Process GetMemoryRegions is not
used, that skips unmapped areas)

Help text has been updated to show that you can have
an address or --all but not both.
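
The "start at address 0 and repeat" walk can be sketched roughly as below.
This is an illustration under assumptions, not lldb's implementation: the
`mapped` list, the tuple layout and both function names are hypothetical.

```python
# Sketch only: `mapped` and the tuple layout are hypothetical, not lldb's API.
ADDRESS_SPACE_END = 0xFFFF_FFFF_FFFF_FFFF


def region_containing(address, mapped):
    """Return (start, end, perms) for the region that holds `address`.

    `mapped` is a sorted list of non-overlapping (start, end, perms) tuples.
    Unmapped gaps are reported as regions with perms '---'.
    """
    gap_start = 0
    for start, end, perms in mapped:
        if address < start:
            return (gap_start, start, "---")
        if address < end:
            return (start, end, perms)
        gap_start = end
    return (gap_start, ADDRESS_SPACE_END, "---")


def all_regions(mapped):
    """Start at address 0 and repeat the lookup from each region's end."""
    regions = []
    address = 0
    while address < ADDRESS_SPACE_END:
        region = region_containing(address, mapped)
        regions.append(region)
        address = region[1]
    return regions
```

Walking by lookup rather than iterating over `mapped` directly is what makes
the unmapped gaps show up in the result.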

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D111791

2 years ago[AMDGPU][MC][GFX10] Add missing s_scratch_load tests.
Ivan Kosarev [Wed, 18 May 2022 09:47:24 +0000 (10:47 +0100)]
[AMDGPU][MC][GFX10] Add missing s_scratch_load tests.

Completes
https://reviews.llvm.org/D125117

Reviewed By: dp, arsenm

Differential Revision: https://reviews.llvm.org/D125753

2 years ago[JumpThreading] Look through freeze in getPredicateAt() fold
Nikita Popov [Wed, 18 May 2022 09:26:56 +0000 (11:26 +0200)]
[JumpThreading] Look through freeze in getPredicateAt() fold

This code is valid for any icmp, so we can safely look through a
freeze when trying to find one.

A caveat here is that replaceFoldableUses() may not end up replacing
any uses in this case. It might make sense to use the freeze as the
context instruction (rather than the terminator) if there is a
freeze, to ensure that it always gets folded. This would require
some changes to how replaceFoldedUses() works though, as it
currently assumes that the value is valid at the end of the block.

2 years ago[AMDGPU] Aggressively fold immediates in SIShrinkInstructions
Jay Foad [Mon, 16 May 2022 14:53:03 +0000 (15:53 +0100)]
[AMDGPU] Aggressively fold immediates in SIShrinkInstructions

Fold immediates regardless of how many uses they have. This is expected
to increase overall code size, but decrease register usage.

Differential Revision: https://reviews.llvm.org/D114644

2 years ago[JumpThreading] Add additional freeze tests (NFC)
Nikita Popov [Wed, 18 May 2022 10:03:15 +0000 (12:03 +0200)]
[JumpThreading] Add additional freeze tests (NFC)

These are for the getPredicateAt() codepath.

2 years ago[IR] Report whether replaceUsesOfWith() changed something (NFC)
Nikita Popov [Wed, 18 May 2022 09:45:10 +0000 (11:45 +0200)]
[IR] Report whether replaceUsesOfWith() changed something (NFC)

With change reporting in transformation passes in mind.

2 years ago[llvm][fix-irreducible] ensure that loop subtree under child is correctly reconnected...
Sun Ziping [Wed, 18 May 2022 09:18:50 +0000 (10:18 +0100)]
[llvm][fix-irreducible] ensure that loop subtree under child is correctly reconnected to new loop

The modified function was incorrectly (not unnecessarily) ignoring grandchild
loops, and this change fixes the bug. In particular, this fixes the handling of
the loop { inner, body }. The TODO in the same function is talking about the b1
self loop, which may be "unnecessarily" lost, but that is a different issue.

2 years ago[MLIR] Make `parseDimensionListRanked` configurable wrt parsing a trailing `x`
Frederik Gossen [Wed, 18 May 2022 09:14:43 +0000 (05:14 -0400)]
[MLIR] Make `parseDimensionListRanked` configurable wrt parsing a trailing `x`

Differential Revision: https://reviews.llvm.org/D125797

2 years ago[JumpThreading] Simplify getPredicateAt() based folding
Nikita Popov [Wed, 18 May 2022 09:16:08 +0000 (11:16 +0200)]
[JumpThreading] Simplify getPredicateAt() based folding

It's sufficient to just fold the icmp to true/false here, and then
let constant terminator folding take care of the rest.

It should be noted that while replaceFoldableUses() may not replace
all uses of the icmp, at least the use in the terminator we're
working on is always replaceable, so terminator constant folding
should be reliably enabled as a subsequent step.

2 years ago[AMDGPU] Aggressively fold immediates in SIFoldOperands
Jay Foad [Mon, 16 May 2022 14:48:11 +0000 (15:48 +0100)]
[AMDGPU] Aggressively fold immediates in SIFoldOperands

Previously SIFoldOperands::foldInstOperand would only fold a
non-inlinable immediate into a single user, so as not to increase code
size by adding the same 32-bit literal operand to many instructions.

This patch removes that restriction, so that a non-inlinable immediate
will be folded into any number of users. The rationale is:
- It reduces the number of registers used for holding constant values,
  which might increase occupancy. (On the other hand, many of these
  registers are SGPRs which no longer affect occupancy on GFX10+.)
- It reduces ALU stalls between the instruction that loads a constant
  into a register, and the instruction that uses it.
- The above benefits are expected to outweigh any increase in code size.

Differential Revision: https://reviews.llvm.org/D114643

2 years ago[mlir:GreedyDriver] Return WalkResult::skip after deleting a known constant
River Riddle [Wed, 18 May 2022 09:14:02 +0000 (02:14 -0700)]
[mlir:GreedyDriver] Return WalkResult::skip after deleting a known constant

This avoids use-after-free when trying to access the regions after visiting
the operation.

2 years ago[AMDGPU] Shrink F16 MAD/FMA to MADAK/MADMK/FMAAK/FMAMK on GFX10
Jay Foad [Tue, 17 May 2022 15:54:13 +0000 (16:54 +0100)]
[AMDGPU] Shrink F16 MAD/FMA to MADAK/MADMK/FMAAK/FMAMK on GFX10

Differential Revision: https://reviews.llvm.org/D125803

2 years ago[lldb] const a couple of getters on MemoryRegionInfo
David Spickett [Tue, 17 May 2022 13:43:08 +0000 (13:43 +0000)]
[lldb] const a couple of getters on MemoryRegionInfo

GetDirtyPageList was being assigned to const & in most places anyway.
If you wanted to change the list you'd make a new one and call
SetDirtyPageList.

GetPageSize is just an int so no issues being const.

Differential Revision: https://reviews.llvm.org/D125786

2 years ago[JumpThreading] Use common code to skip freeze (NFC)
Nikita Popov [Wed, 18 May 2022 08:48:38 +0000 (10:48 +0200)]
[JumpThreading] Use common code to skip freeze (NFC)

There are multiple places that want to look through freeze, so
store condition without freeze in a separate variable.

2 years ago[clang][analyzer][ctu] Make CTU a two phase analysis
Gabor Marton [Thu, 14 Apr 2022 09:03:44 +0000 (11:03 +0200)]
[clang][analyzer][ctu] Make CTU a two phase analysis

This new CTU implementation is the natural extension of the normal single TU
analysis. The approach consists of two analysis phases. During the first phase,
we do a normal single TU analysis. During this phase, if we find a foreign
function (that could be inlined from another TU), we don’t inline it
immediately; instead, we mark it to be analysed later.
When the first phase is finished, we start the second phase, the CTU phase.
In this phase, we continue the analysis from that point (exploded node)
which had been enqueued during the first phase. We gradually extend the
exploded graph of the single TU analysis with the new node that was
created by the inlining of the foreign function.

We count the number of analysis steps of the first phase and we limit the
second (ctu) phase with this number.
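
A toy model of the two phases and the step budget might look like this. It is
only a sketch: the graph representation and the names analyze, successors and
is_foreign are hypothetical and do not correspond to the analyzer's classes.
As a simplification, foreign calls discovered during the second phase are
explored directly.

```python
from collections import deque

# Sketch only: the node model and names are hypothetical, not the analyzer's API.


def analyze(entry, successors, is_foreign):
    """Two-phase exploration.

    successors(node) -> iterable of next nodes
    is_foreign(node) -> True if exploring `node` would inline a function
                        defined in another translation unit
    Returns (nodes visited in phase 1, nodes visited in phase 2).
    """
    first_phase, ctu_phase = [], []
    deferred = deque()  # foreign calls postponed to phase 2
    seen = {entry}

    # Phase 1: normal single-TU analysis; count the steps taken.
    worklist = deque([entry])
    steps = 0
    while worklist:
        node = worklist.popleft()
        if is_foreign(node):
            deferred.append(node)  # mark for later instead of inlining
            continue
        first_phase.append(node)
        steps += 1
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)

    # Phase 2: continue from the deferred nodes, using the phase-1 step
    # count as the budget.
    budget = steps
    while deferred and budget > 0:
        node = deferred.popleft()
        ctu_phase.append(node)
        budget -= 1
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                deferred.append(nxt)
    return first_phase, ctu_phase
```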

This new implementation makes it convenient for users to run the
single-TU and the CTU analysis in one go; they don't need to run the two
analyses separately. Thus, we name this new implementation "onego" CTU.

Discussion:
https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728

Differential Revision: https://reviews.llvm.org/D123773

2 years ago[clang][ASTImporter] Add isNewDecl
Gabor Marton [Fri, 13 May 2022 12:57:14 +0000 (14:57 +0200)]
[clang][ASTImporter] Add isNewDecl

Summary:
Add a new function with which we can query if a Decl had been newly
created during the import process. This feature is a must if we want to
have a different static analysis strategy for such newly created
declarations.

This is a dependent patch that is needed for the new CTU implementation
described at
https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728

Differential Revision:
https://reviews.llvm.org/D123685

2 years ago[LV] set Header earlier, use variable instead of repeated access (NFC).
Florian Hahn [Wed, 18 May 2022 08:29:59 +0000 (09:29 +0100)]
[LV] set Header earlier, use variable instead of repeated access (NFC).

2 years ago[test, x86] Fix spurious x86-target-features.c failure
Thomas Preud'homme [Fri, 6 May 2022 08:46:55 +0000 (09:46 +0100)]
[test, x86] Fix spurious x86-target-features.c failure

x86-target-features.c can spuriously fail when checking for absence of
the string "lvi" in the compiler output due to the temporary path used
for the output file. For example:
"-o" "/tmp/lit-tmp-981j7lvi/x86-target-features-670b86.o"
will make the test fail. This commit checks specifically for lvi as a
target feature, in a similar way to the positive CHECK directive just
above.
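
The difference between the two checks can be illustrated with a small sketch.
The command line below is modeled on the example above; the feature name
"+lvi-cfi", the "+sse2" entry and both helper names are assumptions used for
illustration, not the test's actual CHECK lines.

```python
import re

# Sketch only: the command line and the feature names are assumptions.
cc1_line = (
    '"-target-feature" "+sse2" '
    '"-o" "/tmp/lit-tmp-981j7lvi/x86-target-features-670b86.o"'
)


def naive_has_lvi(line):
    """Substring check: also matches 'lvi' inside the temporary path."""
    return "lvi" in line


def feature_has_lvi(line):
    """Only match 'lvi' inside a quoted -target-feature argument."""
    return re.search(r'"-target-feature" "\+lvi[^"]*"', line) is not None
```

With this input the naive check reports a match even though no lvi feature is
enabled, while the anchored check does not.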

Test Plan: fails when using -mlvi-hardening and passes otherwise

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D125084

2 years ago[flang][driver] Support parsing response files
Diana Picus [Tue, 3 May 2022 09:44:35 +0000 (09:44 +0000)]
[flang][driver] Support parsing response files

Add support for reading response files in the flang driver. Response
files contain command line arguments and are used whenever a command
becomes longer than the shell/environment limit. Response files are
recognized via the special "@path/to/response/file.rsp" syntax, which
distinguishes them from other file inputs.

This patch hardcodes GNU tokenization, since we don't have a CL mode for
the driver. In the future we might want to add a --rsp-quoting command
line option, like clang has, to accommodate Windows platforms.
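
For readers unfamiliar with response files, the expansion step can be sketched
as below. This is not the flang driver's code: expand_response_files and
read_file are hypothetical helpers, Python's shlex.split (POSIX shell rules)
stands in for GNU tokenization, and nested response files are not handled.

```python
import shlex

# Sketch only: hypothetical helpers; shlex.split approximates GNU tokenization.


def read_file(path):
    with open(path, encoding="utf-8") as f:
        return f.read()


def expand_response_files(argv, read_text=read_file):
    """Replace each '@path' argument with the arguments read from that file."""
    expanded = []
    for arg in argv:
        if arg.startswith("@"):
            try:
                expanded.extend(shlex.split(read_text(arg[1:])))
                continue
            except OSError:
                pass  # not a readable file: keep the argument unchanged
        expanded.append(arg)
    return expanded
```

The reader is passed in as a parameter only so the sketch can be exercised
without touching the file system.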

Differential Revision: https://reviews.llvm.org/D124846

2 years ago[SelectionDAGBuilder] Pass fast math flags to most of VP SDNodes.
Yeting Kuo [Fri, 13 May 2022 23:25:36 +0000 (07:25 +0800)]
[SelectionDAGBuilder] Pass fast math flags to most of VP SDNodes.

The patch does not pass math flags to float VPCmpIntrinsics because LLParser
could not identify float VPCmpIntrinsics as FPMathOperators.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125600

2 years ago[flang][Runtime] Use proper prototypes in Fortran_main. NFCI
Diana Picus [Mon, 16 May 2022 07:58:09 +0000 (07:58 +0000)]
[flang][Runtime] Use proper prototypes in Fortran_main. NFCI

This is compiled as C code, so it's a good idea to be explicit about the
prototype. Clang complains about this when -Wstrict-prototypes is used.

Differential Revision: https://reviews.llvm.org/D125672

2 years ago[PowerPC] Treat llvm.fmuladd intrinsic as using CTR
Qiu Chaofan [Wed, 18 May 2022 07:55:02 +0000 (15:55 +0800)]
[PowerPC] Treat llvm.fmuladd intrinsic as using CTR

This fixes bug 55463, similar to D78668. This is a temporary fix since
we will switch to post-isel CTR loop determination in the future.

Reviewed By: dim, shchenz

Differential Revision: https://reviews.llvm.org/D125746

2 years ago[GreedyPatternRewriter] Avoid reversing constant order
rkayaith [Wed, 18 May 2022 07:38:42 +0000 (00:38 -0700)]
[GreedyPatternRewriter] Avoid reversing constant order

The previous fix from af371f9f98da only applied when using a bottom-up
traversal. The change here applies the constant preprocessing logic to the
top-down case as well. This resolves the issue with the canonicalizer pass still
reordering constants, since it uses a top-down traversal by default.

Fixes #51892

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125623

2 years ago[SCEVExpander] Expand umin_seq using freeze
Nikita Popov [Wed, 11 May 2022 10:34:16 +0000 (12:34 +0200)]
[SCEVExpander] Expand umin_seq using freeze

%x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y).
This patch changes the expansion to umin(%x, freeze %y) instead
(https://alive2.llvm.org/ce/z/wujUhp).

The motivation for this change is the test cases affected by
D124910, where the freeze expansion ultimately produces better
optimization results. This is largely because
`(%x umin_seq %y) == %x` is a common expansion pattern, which
reliably optimizes in freeze representation, but only sometimes
with the zero comparison (in particular, if %x == 0 can fold to
something else, we generally won't be able to recover reasonable
code from this.)
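
The two expansions can be compared with a toy model of poison. This is only a
sketch: POISON is a Python sentinel, `frozen_choice` stands in for the
arbitrary value that freeze may pick, and %x is assumed not to be poison.

```python
POISON = object()  # sentinel standing in for an LLVM poison value


def umin(a, b):
    """Unsigned min; poison in either operand poisons the result."""
    if a is POISON or b is POISON:
        return POISON
    return min(a, b)


def expand_with_select(x, y):
    """Old form: %x == 0 ? 0 : umin(%x, %y). Assumes %x is not poison."""
    return 0 if x == 0 else umin(x, y)


def expand_with_freeze(x, y, frozen_choice=0):
    """New form: umin(%x, freeze %y).

    freeze turns poison into some arbitrary fixed value; `frozen_choice`
    plays the role of that arbitrary value here.
    """
    frozen_y = frozen_choice if y is POISON else y
    return umin(x, frozen_y)
```

When %x is 0, the freeze form yields 0 whatever value freeze picks, which is
the case the zero comparison was guarding. When %x is nonzero and %y is
poison, the old form is poison, so any concrete result is an acceptable
refinement.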

Differential Revision: https://reviews.llvm.org/D125372

2 years ago[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits
Nikita Popov [Tue, 17 May 2022 09:26:14 +0000 (11:26 +0200)]
[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits

When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first iteration,
such that we never branch on the latch exit condition. As such, we
need to freeze the condition of the new branch that is introduced
before the loop, as it now executes unconditionally.

Differential Revision: https://reviews.llvm.org/D125754

2 years ago[llvm-nm] Always use opaque pointers (PR55506)
Nikita Popov [Tue, 17 May 2022 08:50:18 +0000 (10:50 +0200)]
[llvm-nm] Always use opaque pointers (PR55506)

Always enable opaque pointers in llvm-nm, because the tool doesn't
actually care, and this allows us to read both typed pointer and
opaque pointer bitcode files in one archive. Previously this
depended on the order inside the archive (it would work with an
opaque pointer bitcode file first, but fail with a typed pointer
bitcode file first).

Fixes https://github.com/llvm/llvm-project/issues/55506.

Differential Revision: https://reviews.llvm.org/D125751

2 years ago[mlir][Canonicalize] Fix command-line options
rkayaith [Wed, 18 May 2022 07:27:54 +0000 (00:27 -0700)]
[mlir][Canonicalize] Fix command-line options

The canonicalize command-line options currently have no effect, as the pass is
reading the pass options in its constructor, before they're actually
initialized. This results in the default values of the options always being used.

The change here moves the initialization of the `GreedyRewriteConfig` out of the
constructor, so that it runs after the pass options have been parsed.
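
The ordering problem can be shown with a small stand-alone sketch. The classes
below are hypothetical and do not mirror MLIR's pass infrastructure; the
option name is used purely as an example.

```python
# Sketch only: these classes are hypothetical and do not mirror MLIR's API.


class Pass:
    def __init__(self):
        self.options = {"max-iterations": 10}  # default values

    def parse_options(self, text):
        """Called by the (imaginary) pass manager after construction."""
        for item in text.split():
            key, value = item.split("=")
            self.options[key] = int(value)

    def initialize(self):
        pass


class BuggyCanonicalizer(Pass):
    def __init__(self):
        super().__init__()
        # Bug pattern: options are read before they have been parsed.
        self.config = dict(self.options)


class FixedCanonicalizer(Pass):
    def initialize(self):
        # Runs after parse_options, so it sees the user's values.
        self.config = dict(self.options)


def run(pass_cls, option_text):
    p = pass_cls()
    p.parse_options(option_text)
    p.initialize()
    return p.config
```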

Fixes #55466

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125621

2 years ago[mlir:PDLL] Don't append / for directory code completion
River Riddle [Wed, 18 May 2022 01:32:46 +0000 (18:32 -0700)]
[mlir:PDLL] Don't append / for directory code completion

This allows for properly using / as a trigger character, i.e.
more easily allows chaining include directory completions.

2 years ago[mlir:PDLL] Improve the location ranges of several expressions during parsing
River Riddle [Wed, 18 May 2022 00:49:28 +0000 (17:49 -0700)]
[mlir:PDLL] Improve the location ranges of several expressions during parsing

This allows for the range to encompass more of the source associated
with the full expression, making diagnostics easier to see/tooling easier/etc.

2 years ago[mlir:PDLL] Drop space as a completion commit character
River Riddle [Wed, 18 May 2022 00:57:18 +0000 (17:57 -0700)]
[mlir:PDLL] Drop space as a completion commit character

This causes annoyances when attempting to use space as
a trigger character (to start a different completion).

2 years ago[llvm-readobj] Fix printing of Windows ARM unwind opcodes, add tests
Martin Storsjö [Mon, 22 Nov 2021 22:44:58 +0000 (00:44 +0200)]
[llvm-readobj] Fix printing of Windows ARM unwind opcodes, add tests

The existing code was essentially untested; in some cases, it used
too narrow variable types to fit all the bits, in some cases the
bit manipulation operations were incorrect.

For the "ldr lr, [sp], #x" opcode, there's nothing in the documentation
that says it cannot be used in a prologue. (In practice, it would
probably seldom be used there, but technically there's nothing
stopping it from being used.) The documentation only specifies the
operation to replay for unwinding it, but the corresponding mirror
instruction to be printed for a prologue is "str lr, [sp, #-x]!".

Also improve printing of register masks, by aggregating registers
into ranges where possible, and make the printing of the terminating
branches clearer, as "bx <reg>" and "b.w <target>".
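
Aggregating a register mask into ranges could be done along these lines. The
sketch is an assumption-laden illustration: the braces, separators and the
choice to merge only r0-r12 are made up for this example and may differ from
llvm-readobj's actual output.

```python
# Sketch only: the output format and naming are assumptions,
# not llvm-readobj's exact output.
NAMES = {13: "sp", 14: "lr", 15: "pc"}


def reg_name(n):
    return NAMES.get(n, f"r{n}")


def format_register_mask(mask):
    """Render a 16-bit register mask, merging consecutive rN into ranges."""
    parts = []
    n = 0
    while n < 16:
        if not mask & (1 << n):
            n += 1
            continue
        start = n
        # Extend the run only across the general registers r0-r12.
        while n + 1 <= 12 and mask & (1 << (n + 1)):
            n += 1
        if start == n:
            parts.append(reg_name(start))
        else:
            parts.append(f"{reg_name(start)}-{reg_name(n)}")
        n += 1
    return "{" + ", ".join(parts) + "}"
```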

Differential Revision: https://reviews.llvm.org/D125643

2 years ago[ArgPromotion] Add unused-argument.ll test (NFC)
Pavel Samolysov [Wed, 18 May 2022 07:05:13 +0000 (10:05 +0300)]
[ArgPromotion] Add unused-argument.ll test (NFC)

If a pointer argument is unused within the callee, this argument should
be removed from the function's signature while all used pointer
arguments should be promoted as expected. The ArgumentPromotion
pass doesn't touch unused non-pointer arguments at all.

2 years agoRevert "[clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro...
Marek Kurdej [Wed, 18 May 2022 05:25:12 +0000 (07:25 +0200)]
Revert "[clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing parenthesis is followed by a newline."

This reverts commit 50cd52d9357224cce66a9e00c9a0417c658a5655.

It provoked regressions in C++ and ObjectiveC as described in https://reviews.llvm.org/D123676#3515949.

Reproducers:
```
MACRO_BEGIN
#if A
int f();
#else
int f();
#endif
```

```
NS_SWIFT_NAME(A)
@interface B : C
@property(readonly) D value;
@end
```

2 years ago[MLIR][Presburger] Cleanup getMaybeValues in FACV
Groverkss [Wed, 18 May 2022 04:14:14 +0000 (09:44 +0530)]
[MLIR][Presburger] Cleanup getMaybeValues in FACV

This patch cleans up multiple getMaybeValue functions to take an IdKind instead
of special functions.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D125617

2 years ago[MLIR][Presburger] Attach values only to non-local identifiers in FAVC
Groverkss [Wed, 18 May 2022 03:43:30 +0000 (09:13 +0530)]
[MLIR][Presburger] Attach values only to non-local identifiers in FAVC

This patch changes `FlatAffineValueConstraints` to only allow attaching
values to non-local identifiers.

The reasoning for this change is:
1. Information attached to local identifiers can be lost since local identifiers
  can be removed for output size optimizations.
2. There are no current use cases for attaching values to Local identifiers.
3. Attaching a value to a local identifier does not make sense since a local
  identifier represents existential quantification.

This patch also adds some additional asserts to the affected functions.

Reviewed By: arjunp, bondhugula

Differential Revision: https://reviews.llvm.org/D125613

2 years ago[BasicAA] Remove unneeded special case for malloc/calloc
Philip Reames [Wed, 18 May 2022 03:34:19 +0000 (20:34 -0700)]
[BasicAA] Remove unneeded special case for malloc/calloc

This code pre-exists the generic handling for inaccessiblememonly.  If we remove it and update one test with inaccessiblememonly, nothing else changes.  Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.

2 years ago[NFC][Clang] Modify expect of fail test or XFAIL because CSKY align is different
Zi Xuan Wu (Zeson) [Wed, 11 May 2022 08:48:40 +0000 (16:48 +0800)]
[NFC][Clang] Modify expect of fail test or XFAIL because CSKY align is different

CSKY always uses 4-byte alignment, even for the long long type.
A global aggregate variable is 4-byte aligned if its size is greater than or equal to 4 bytes.

Differential Revision: https://reviews.llvm.org/D124977

2 years ago[NFC][AMDGPU][CodeGen] Use ArrayRef in TargetLowering functions
Shao-Ce SUN [Wed, 18 May 2022 00:19:25 +0000 (08:19 +0800)]
[NFC][AMDGPU][CodeGen] Use ArrayRef in TargetLowering functions

Based on D123467.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D124508

2 years ago[InstCombine] add tests for icmp-fsh
Chenbing Zheng [Wed, 18 May 2022 02:01:44 +0000 (10:01 +0800)]
[InstCombine] add tests for icmp-fsh

2 years ago[JumpThreading] Let ProcessImpliedCondition look into freeze instructions
Juneyoung Lee [Wed, 18 May 2022 01:41:31 +0000 (10:41 +0900)]
[JumpThreading] Let ProcessImpliedCondition look into freeze instructions

This patch makes JumpThreading's ProcessImpliedCondition deal with frozen
conditions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84941

2 years agoPrecommit a test file for D84941
Juneyoung Lee [Wed, 18 May 2022 01:41:25 +0000 (10:41 +0900)]
Precommit a test file for D84941

2 years ago[lld][ELF] Support BFD name elf32-avr
Ben Shi [Thu, 12 May 2022 13:26:15 +0000 (13:26 +0000)]
[lld][ELF] Support BFD name elf32-avr

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125544

2 years ago[mlir][tosa] Rework tosa.apply_scale lowering for 32-bit
Robert Suderman [Tue, 17 May 2022 23:00:04 +0000 (16:00 -0700)]
[mlir][tosa] Rework tosa.apply_scale lowering for 32-bit

Added handling of rounding behavior in 32 bits where possible. This
avoids kernel compilation generating scalarized code on platforms where
64-bit vectors are not available.

As the 48-bit lowering requires 64-bit anyway, we added a full 64-bit
solution simplifying the old path.

Reviewed By: dcaballe, mravishankar

Differential Revision: https://reviews.llvm.org/D125583

2 years agoRevert "[RISCV] Enable strict assertions in InsertVSETVLI data flow"
Philip Reames [Tue, 17 May 2022 22:51:41 +0000 (15:51 -0700)]
Revert "[RISCV] Enable strict assertions in InsertVSETVLI data flow"

This reverts commit 79a66ec97b4fb8cbc4e0a81ead356caf5507a6ea.

The stronger asserts served their purpose; I stumbled across another bug.  Will reapply once this one is also fixed.

The bug appears to be a variant of a previous one:
* We mutate an instruction in one block.
* That mutation changes the phase3 results of another block.

This is very similar to a previous issue, except cross block instead of within a single block.

2 years ago[mlir][SCF] Fix scf.while bufferization
Matthias Springer [Tue, 17 May 2022 20:58:54 +0000 (22:58 +0200)]
[mlir][SCF] Fix scf.while bufferization

Before this fix, the bufferization implementation made the incorrect assumption that the values yielded from the "before" region must match with the values yielded from the "after" region.

Differential Revision: https://reviews.llvm.org/D125835

2 years ago[pseudo] Design notes from discussion today. NFC
Sam McCall [Tue, 17 May 2022 22:08:47 +0000 (00:08 +0200)]
[pseudo] Design notes from discussion today. NFC

2 years ago[ConstantRange] Improve the implementation of binaryAnd
Alexander Shaposhnikov [Tue, 17 May 2022 21:30:44 +0000 (21:30 +0000)]
[ConstantRange] Improve the implementation of binaryAnd

This diff adjusts binaryAnd to take advantage of the analysis
based on KnownBits.
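
As a rough illustration of how known bits can tighten the range of an AND, here is a minimal sketch on plain 32-bit unsigned inclusive ranges. It is not LLVM's `ConstantRange` or `KnownBits` implementation; `Range`, `Known`, `knownBitsOfRange`, and `binaryAndSketch` are hypothetical names, and wrapped ranges are not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for this sketch only (not the LLVM classes).
struct Range { uint32_t lo, hi; };    // inclusive, non-wrapping, unsigned
struct Known { uint32_t zero, one; }; // bits known to be 0 / known to be 1

// Every value in [lo, hi] shares the leading bits on which lo and hi agree.
Known knownBitsOfRange(Range r) {
  assert(r.lo <= r.hi && "sketch does not model wrapped ranges");
  uint32_t unknown = r.lo ^ r.hi;
  // Smear the highest differing bit downward: all lower bits are unknown.
  unknown |= unknown >> 1;
  unknown |= unknown >> 2;
  unknown |= unknown >> 4;
  unknown |= unknown >> 8;
  unknown |= unknown >> 16;
  uint32_t prefix = ~unknown;
  return {~r.lo & prefix, r.lo & prefix};
}

// Bounds for a & b: a result bit is 0 if it is 0 in either operand, and 1
// only if it is 1 in both operands.
Range binaryAndSketch(Range a, Range b) {
  Known ka = knownBitsOfRange(a);
  Known kb = knownBitsOfRange(b);
  uint32_t zero = ka.zero | kb.zero;
  uint32_t one = ka.one & kb.one;
  // a & b can never exceed either operand, nor set a known-zero bit.
  uint32_t upper = std::min({~zero, a.hi, b.hi});
  return {one, upper};
}
```

For example, with operands in [0xF8, 0xFF] and [0xFC, 0xFF] the sketch returns [0xF8, 0xFF], where a bound that ignores known bits would have to start at 0.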

Differential revision: https://reviews.llvm.org/D125603

Test plan:
1/ ninja check-llvm
2/ ninja check-llvm-unit

2 years ago[BOLT][NFC] Suppress unused variable warnings
Amir Ayupov [Tue, 17 May 2022 21:30:00 +0000 (14:30 -0700)]
[BOLT][NFC] Suppress unused variable warnings

Addresses the warnings emitted by Apple Clang 13.1.6 (Xcode 13.3.1).
Tip @tschuett issue #55404.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125733

2 years ago[BOLT][NFC] Move BinaryDominatorTree out of BinaryLoop header
Amir Ayupov [Tue, 17 May 2022 21:19:43 +0000 (14:19 -0700)]
[BOLT][NFC] Move BinaryDominatorTree out of BinaryLoop header

Split up the BinaryLoop header and move BinaryDominatorTree into its own header,
preparing it for a standalone use.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125664

2 years ago[RISCV] Add additional test coverage of 11a7e77c and related transforms
Philip Reames [Tue, 17 May 2022 20:31:51 +0000 (13:31 -0700)]
[RISCV] Add additional test coverage of 11a7e77c and related transforms

2 years ago[docs][LangRef] Fix typo in llvm.smul.fix example
Nuno Lopes [Tue, 17 May 2022 20:36:36 +0000 (21:36 +0100)]
[docs][LangRef] Fix typo in llvm.smul.fix example

2 years ago[libc] add snprintf
Michael Jones [Tue, 17 May 2022 19:03:23 +0000 (12:03 -0700)]
[libc] add snprintf

After adding sprintf, snprintf is simple. The functions are very
similar. The tests only cover the behavior of the max length since the
sprintf tests should cover the other behavior.
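
The maximum-length behavior those tests focus on can be illustrated with the standard C `snprintf` contract. This minimal sketch calls the host C library through `<cstdio>`, not the LLVM libc entrypoint itself; `SnprintfResult` and `formatTruncated` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical result type: what ended up in the buffer, and what
// snprintf returned.
struct SnprintfResult {
  std::string stored; // characters actually written (terminator excluded)
  int wouldWrite;     // length the untruncated output would have had
};

// Formats `text` into a buffer limited to `capacity` bytes.
SnprintfResult formatTruncated(const char *text, std::size_t capacity) {
  char buffer[64];
  assert(capacity > 0 && capacity <= sizeof(buffer));
  // At most capacity - 1 characters are stored, followed by a terminator.
  int written = std::snprintf(buffer, capacity, "%s", text);
  return {std::string(buffer), written};
}
```

Calling `formatTruncated("hello world", 6)` stores `"hello"` while reporting 11, the length the full output would have needed.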

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D125826

2 years ago[clang][dataflow] Fix double visitation of nested logical operators
Eric Li [Tue, 17 May 2022 18:08:25 +0000 (18:08 +0000)]
[clang][dataflow] Fix double visitation of nested logical operators

Sub-expressions that are logical operators are not spelled out
separately in basic blocks, so we need to manually visit them when we
encounter them. We do this in both the `TerminatorVisitor`
(conditionally) and the `TransferVisitor` (unconditionally), which can
cause an expression to be visited twice when the binary
operators are nested 2+ times.

This changes the visit in `TransferVisitor` to check if it has been
evaluated before trying to visit the sub-expression.
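
The effect of such a guard can be sketched with a toy expression tree. This is a simplified model, not the clang dataflow classes: `Expr`, `ToyVisitor`, and `visitsOfInner` are hypothetical names, and the "terminator" and "transfer" visits are represented by two successive calls to the same `visit` function.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>

// Toy expression node: a leaf when both children are null.
struct Expr {
  const Expr *lhs = nullptr;
  const Expr *rhs = nullptr;
};

struct ToyVisitor {
  bool checkEvaluated; // the guard under discussion
  std::unordered_set<const Expr *> evaluated;
  std::unordered_map<const Expr *, int> visitCount;

  void visit(const Expr &e) {
    ++visitCount[&e];
    evaluated.insert(&e);
    for (const Expr *sub : {e.lhs, e.rhs}) {
      if (!sub)
        continue;
      // With the guard, a sub-expression that was already evaluated is
      // not visited a second time.
      if (checkEvaluated && evaluated.count(sub))
        continue;
      visit(*sub);
    }
  }
};

// Models (a && b) && c: the inner operator is visited on its own first,
// then again as a sub-expression of the outer operator unless guarded.
int visitsOfInner(bool checkEvaluated) {
  Expr a, b, c;
  Expr inner{&a, &b};
  Expr outer{&inner, &c};
  ToyVisitor v{checkEvaluated, {}, {}};
  v.visit(inner); // stands in for the earlier visit of the condition
  v.visit(outer); // stands in for the later visit of the enclosing operator
  return v.visitCount[&inner];
}
```

In this model `visitsOfInner(false)` reports two visits of the inner operator, and `visitsOfInner(true)` reports one.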

Differential Revision: https://reviews.llvm.org/D125821

2 years ago[gn build] Port 1f49714d3e19
LLVM GN Syncbot [Tue, 17 May 2022 19:47:10 +0000 (19:47 +0000)]
[gn build] Port 1f49714d3e19

2 years ago[gn build] Port 1188faa7ab4b
LLVM GN Syncbot [Tue, 17 May 2022 19:47:09 +0000 (19:47 +0000)]
[gn build] Port 1188faa7ab4b

2 years ago[trace][intelpt] Support system-wide tracing [6] - Break IntelPTCollector into smalle...
Walter Erquinigo [Thu, 5 May 2022 21:42:54 +0000 (14:42 -0700)]
[trace][intelpt] Support system-wide tracing [6] - Break IntelPTCollector into smaller files and minor refactor

IntelPTCollector is very big and has 3 classes in it. It's actually cleaner if each one has its own file. This also gives more visibility to the developer about the different kinds of "tracers" that we have.

Besides that, I'm now restricting the creation of the BinaryData chunks to GetState() instead of having it in different places, which is not very clean, because the gdb-remote protocol should be as restricted as possible.

Differential Revision: https://reviews.llvm.org/D125047

2 years ago[trace][intelpt] Support system-wide tracing [5] - Disable/enable per-core tracing...
Walter Erquinigo [Wed, 4 May 2022 20:24:49 +0000 (13:24 -0700)]
[trace][intelpt] Support system-wide tracing [5] - Disable/enable per-core tracing based on the process state

When tracing on per-core mode, we are tracing all processes, which means
that after hitting a breakpoint, our process will stop running (thus
producing no more tracing data) but other processes will continue
writing to our trace buffers. This causes a big data loss for our trace.
As a way to remediate this, I'm adding some logic to pause and unpause
tracing based on the target's state. The earlier we do it the better,
however, I'm not adding the trigger at the earliest possible point for
simplicity of this diff. Later we can improve that part.
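
The pause/unpause idea can be sketched as a small state-driven filter. This is a toy model under stated assumptions, not the lldb-server code: `ProcessState`, `ToyPerCoreTrace`, and `recordedAfterStop` are hypothetical names, and a trace buffer is modeled as a list of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class ProcessState { Running, Stopped };

// Toy per-core trace: events from any process land in the same buffer, so
// collection is paused whenever the traced target is not running.
class ToyPerCoreTrace {
public:
  void onTargetStateChanged(ProcessState state) {
    collecting = (state == ProcessState::Running);
  }

  void record(const std::string &event) {
    if (collecting)
      buffer.push_back(event);
  }

  const std::vector<std::string> &events() const { return buffer; }

private:
  bool collecting = true;
  std::vector<std::string> buffer;
};

// Replays a short scenario and returns how many events were kept.
std::size_t recordedAfterStop() {
  ToyPerCoreTrace trace;
  trace.record("target: step 1");
  trace.onTargetStateChanged(ProcessState::Stopped); // e.g. breakpoint hit
  trace.record("other process: noise");              // dropped while paused
  trace.onTargetStateChanged(ProcessState::Running);
  trace.record("target: step 2");
  return trace.events().size();
}
```

In this scenario only the two events recorded while the target was running are kept, so the buffer is not filled by other processes while the target sits at a breakpoint.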

Differential Revision: https://reviews.llvm.org/D124962

2 years ago[trace][intelpt] Support system-wide tracing [4] - Support per core tracing on lldb...
Walter Erquinigo [Tue, 3 May 2022 02:10:39 +0000 (19:10 -0700)]
[trace][intelpt] Support system-wide tracing [4] - Support per core tracing on lldb-server

This diff implements per-core tracing on lldb-server. It also includes tests that ensure that tracing can be initiated from the client and that the jLLDBGetState packet returns the list of trace buffers per core.

This doesn't include any decoder changes.

Finally, this makes some little changes here and there improving the existing code.

A specific piece of code that can't reliably be tested is when tracing
per core fails due to permissions. In this case we add a
troubleshooting message and this is the manual test:

```
/proc/sys/kernel/perf_event_paranoid set to 1

(lldb) process trace start --per-core-tracing
error: perf event syscall failed: Permission denied
 You might need that /proc/sys/kernel/perf_event_paranoid has a value of 0 or -1.
```

Differential Revision: https://reviews.llvm.org/D124858

2 years ago[gn build] Port 6aabf60f2fb7
LLVM GN Syncbot [Tue, 17 May 2022 19:38:35 +0000 (19:38 +0000)]
[gn build] Port 6aabf60f2fb7

2 years ago[AMDGPU] Add llvm.amdgcn.global.load.lds intrinsic
Stanislav Mekhanoshin [Tue, 17 May 2022 18:25:45 +0000 (11:25 -0700)]
[AMDGPU] Add llvm.amdgcn.global.load.lds intrinsic

Differential Revision: https://reviews.llvm.org/D125279

2 years agoRevert "Reland "[clangd] Indexing of standard library""
Sam McCall [Tue, 17 May 2022 19:32:45 +0000 (21:32 +0200)]
Revert "Reland "[clangd] Indexing of standard library""

This reverts commit ccdb56ac10eef3048135169a67d239328c2b1de6.

Still seeing windows failures on GN bots: http://45.33.8.238/win/58316/step_9.txt

Unfortunately I can't debug these at all - it's a bare unsymbolized
stacktrace, and I can't reproduce the failure.

2 years ago[AMDGPU] Enable FLAT LDS DMA on gfx9/10 before gfx940
Stanislav Mekhanoshin [Thu, 5 May 2022 22:44:16 +0000 (15:44 -0700)]
[AMDGPU] Enable FLAT LDS DMA on gfx9/10 before gfx940

We always had global and scratch loads to LDS in the gfx9,
but did not handle it. These were available via the 'lds'
encoding bit. In gfx940 this bit was reused as 'svs' which
resulted in new '_lds' opcodes effectively pushing this
bit into the opcode, but functionally it is the same. These
instructions are also available on gfx10.

Differential Revision: https://reviews.llvm.org/D125126

2 years ago[gn build] Port ccdb56ac10ee
LLVM GN Syncbot [Tue, 17 May 2022 19:07:18 +0000 (19:07 +0000)]
[gn build] Port ccdb56ac10ee

2 years ago[RISCV] Minor reorganization of VSETVLIInfo::operator== for readability [NFC]
Philip Reames [Tue, 17 May 2022 19:05:11 +0000 (12:05 -0700)]
[RISCV] Minor reorganization of VSETVLIInfo::operator== for readability [NFC]

2 years agoReland "[clangd] Indexing of standard library"
Sam McCall [Tue, 17 May 2022 18:04:02 +0000 (20:04 +0200)]
Reland "[clangd] Indexing of standard library"

This reverts commit 76ddbb1ca747366417be64fdf79218df099a5973.

2 years ago[clang][dataflow] Weaken guard to only check for storage location
Eric Li [Tue, 17 May 2022 18:48:23 +0000 (18:48 +0000)]
[clang][dataflow] Weaken guard to only check for storage location

Weaken the guard for whether a sub-expression has been evaluated to
only check for the storage location, instead of checking for the
value. It should be sufficient to check for the storage location, as
we don't necessarily guarantee that a value will be set for the
location (although this is currently the case).

Differential Revision: https://reviews.llvm.org/D125823

2 years ago[RISCV] Canonicalize AVL=setvli to AVL=Imm or AVL=VLMAX
Philip Reames [Tue, 17 May 2022 18:29:39 +0000 (11:29 -0700)]
[RISCV] Canonicalize AVL=setvli to AVL=Imm or AVL=VLMAX

This patch adds a transform to the local prepass in InsertVSETVLI which canonicalizes an AVL of a register from another vsetvli into immediate or VLMAX when VTYPE is the same. In this patch, I chose to be conservative and avoid arbitrary vreg forwarding due to profitability concerns about possibly overlapping live ranges.

This has the effect of eliminating vsetvli instructions in loops which are walking either VLMAX or a constant number of lanes per iteration.

Differential Revision: https://reviews.llvm.org/D125812

2 years ago[libc] add sprintf
Michael Jones [Thu, 12 May 2022 20:43:15 +0000 (13:43 -0700)]
[libc] add sprintf

This adds the sprintf entrypoint, as well as unit tests. Currently
sprintf only supports %%, %s, and %c, but the other conversions are on
the way.
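
For reference, the three conversions mentioned above behave as follows under the standard C `sprintf` contract. This minimal sketch uses the host C library through `<cstdio>` rather than the new LLVM libc entrypoint; `formatGrade` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: formats a label, a one-character grade, and a
// literal percent sign using only the %s, %c, and %% conversions.
std::string formatGrade(const char *label, char grade) {
  char buffer[64];
  // sprintf performs no bounds checking, so the caller must guarantee
  // that the output fits in the buffer.
  assert(std::strlen(label) < 32 && "label must fit the fixed buffer");
  int written = std::sprintf(buffer, "%s: %c (100%%)", label, grade);
  assert(written >= 0);
  return std::string(buffer, static_cast<std::size_t>(written));
}
```

With these inputs, `formatGrade("job", 'A')` yields `job: A (100%)`.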

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D125573

2 years ago[InstCombine] fold more shuffles with FP<->Int cast operands
Sanjay Patel [Tue, 17 May 2022 17:58:51 +0000 (13:58 -0400)]
[InstCombine] fold more shuffles with FP<->Int cast operands

shuffle (cast X), (cast Y), Mask --> cast (shuffle X, Y, Mask)

This extends the transform added with 0353c2c996c5.

If the casts are to a larger element type, the transform
reduces shuffle bit width, so that should be a win for
most codegen (if not, it can be inverted).
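
The equivalence behind this fold can be checked on a small model of vectors. This is a sketch under simplifying assumptions, not InstCombine code: `shuffle`, `castToFloat`, `shuffleOfCasts`, and `castOfShuffle` are hypothetical names, the cast is a signed 16-bit integer to `float` conversion, and undefined or poison mask elements are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models a two-input shuffle: mask indices 0..N-1 select from x and
// N..2N-1 select from y.
template <typename T>
std::vector<T> shuffle(const std::vector<T> &x, const std::vector<T> &y,
                       const std::vector<std::size_t> &mask) {
  assert(x.size() == y.size());
  std::vector<T> out;
  out.reserve(mask.size());
  for (std::size_t idx : mask) {
    assert(idx < 2 * x.size());
    out.push_back(idx < x.size() ? x[idx] : y[idx - x.size()]);
  }
  return out;
}

// Element-wise signed int -> float cast (widening from 16 to 32 bits).
std::vector<float> castToFloat(const std::vector<int16_t> &v) {
  std::vector<float> out;
  out.reserve(v.size());
  for (int16_t e : v)
    out.push_back(static_cast<float>(e));
  return out;
}

// Before the fold: shuffle (cast X), (cast Y), Mask
std::vector<float> shuffleOfCasts(const std::vector<int16_t> &x,
                                  const std::vector<int16_t> &y,
                                  const std::vector<std::size_t> &mask) {
  return shuffle(castToFloat(x), castToFloat(y), mask);
}

// After the fold: cast (shuffle X, Y, Mask); the shuffle now moves
// 16-bit elements instead of 32-bit ones.
std::vector<float> castOfShuffle(const std::vector<int16_t> &x,
                                 const std::vector<int16_t> &y,
                                 const std::vector<std::size_t> &mask) {
  return castToFloat(shuffle(x, y, mask));
}
```

Because the cast is applied per element, both forms select the same lanes and produce the same values; the folded form needs one cast instead of two and shuffles the narrower element type.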

2 years ago[pseudo] benchmark cleanups. NFC
Sam McCall [Tue, 10 May 2022 13:45:38 +0000 (15:45 +0200)]
[pseudo] benchmark cleanups. NFC

- add missing benchmark for lex/preprocess steps
- name benchmarks after the function they're benchmarking, when appropriate
- remove unergonomic "run" prefixes from benchmark names
- give a useful error message if --grammar or --source are missing
- Use realistic example of how to run, run all benchmarks by default.
  (for someone who doesn't know the commands, this is the most useful action)
- Improve typos/wording in comment
- clean up unused vars
- avoid "parseable stream" name, which isn't a great name & not one I expected
  to escape from ClangPseudoMain

Differential Revision: https://reviews.llvm.org/D125312