review.tizen.org Git - platform/upstream/llvm.git/log


commit | commitdiff | tree

Benjamin Kramer [Wed, 24 Nov 2021 13:42:54 +0000 (14:42 +0100)]

Revert "[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl"

This reverts commit 3cf4a2c6203b5777d56f0c04fb743b85a041d6f9.

It makes llc hang on the following test case.
```
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define dso_local void @_PyUnicode_EncodeUTF16() local_unnamed_addr #0 {
entry:
  br label %while.body117.i

while.body117.i:                                  ; preds = %cleanup149.i, %entry
  %out.6269.i = phi i16* [ undef, %cleanup149.i ], [ undef, %entry ]
  %0 = load i16, i16* undef, align 2
  %1 = icmp eq i16 undef, -10240
  br i1 %1, label %fail.i, label %cleanup149.i

cleanup149.i:                                     ; preds = %while.body117.i
  %or130.i = call i16 @llvm.bswap.i16(i16 %0) #2
  store i16 %or130.i, i16* %out.6269.i, align 2
  br label %while.body117.i

fail.i:                                           ; preds = %while.body117.i
  ret void
}

; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
declare i16 @llvm.bswap.i16(i16) #1

attributes #0 = { "target-features"="+neon,+v8a" }
attributes #1 = { nofree nosync nounwind readnone speculatable willreturn }
attributes #2 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+neon,+v8a" }
```

commit | commitdiff | tree

Florian Hahn [Wed, 24 Nov 2021 13:32:24 +0000 (13:32 +0000)]

[LV] Use patterns in some induction tests, to make more robust. (NFC)

commit | commitdiff | tree

Simon Pilgrim [Wed, 24 Nov 2021 13:24:27 +0000 (13:24 +0000)]

[X86] Add BMI test coverage for or-lea with no common bits tests

Ensure D113970 handles andnot patterns as well.

commit | commitdiff | tree

Omer Aviram [Wed, 24 Nov 2021 13:22:15 +0000 (13:22 +0000)]

[X86] Add D113970 test cases for or-lea with no common bits.

Added tests are permutations of the pattern: (X & ~M) or (Y & M).

Differential Revision: https://reviews.llvm.org/D114078

commit | commitdiff | tree

Sanjay Patel [Tue, 23 Nov 2021 22:19:12 +0000 (17:19 -0500)]

[InstSimplify] fold xor logic of 2 variables, part 2

(~a & b) ^ (a | b) --> a

This is the swapped and/or (De Morgan?) sibling fold for
the fold added with D114462 ( 892648b18a8c ).

This case is easier to specify because we are returning
a root value, not a 'not':
https://alive2.llvm.org/ce/z/SRzj4f
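
As a quick sanity check alongside the Alive2 proof, the fold can be verified exhaustively on a small bit width. This is only a sketch: it models i8 with Python integers, and the helper name `part2_fold_holds` is made up for illustration.

```python
MASK = 0xFF  # model i8; Python's ~ is unbounded, so mask the result


def part2_fold_holds(a: int, b: int) -> bool:
    """Check (~a & b) ^ (a | b) == a for one pair of 8-bit values."""
    not_a = ~a & MASK
    return ((not_a & b) ^ (a | b)) == a


# Exhaustive over all 65536 pairs of 8-bit inputs.
part2_all_ok = all(part2_fold_holds(a, b) for a in range(256) for b in range(256))
print(part2_all_ok)
```

Because the fold is bitwise, the 8-bit result carries over to wider integer types.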

commit | commitdiff | tree

Sanjay Patel [Tue, 23 Nov 2021 22:07:31 +0000 (17:07 -0500)]

[InstSimplify] add tests for xor logic; NFC

commit | commitdiff | tree

Djordje Todorovic [Wed, 24 Nov 2021 12:46:35 +0000 (13:46 +0100)]

[llvm-dwarfdump][Statistics] Handle LTO cases with cross CU referencing

With link-time optimizations enabled, resulting DWARF may end up containing
cross CU references (through the DW_AT_abstract_origin attribute).
Consider the following example:

// sum.c
__attribute__((always_inline)) int sum(int a, int b)
{
     return a + b;
}
// main.c
extern int sum(int, int);
int main()
{
     int a = 5, b = 10, c = sum(a, b);
     return 0;
}

Compiled as follows:

$ clang -g -flto -fuse-ld=lld main.c sum.c -o main

Results in the following DWARF:

-- sum.c CU: abstract instance tree
...
0x000000b0:   DW_TAG_subprogram
                DW_AT_name ("sum")
                DW_AT_decl_file ("sum.c")
                DW_AT_decl_line (1)
                DW_AT_prototyped (true)
                DW_AT_type (0x000000d3 "int")
                DW_AT_external (true)
                DW_AT_inline (DW_INL_inlined)

0x000000bc:     DW_TAG_formal_parameter
                  DW_AT_name ("a")
                  DW_AT_decl_file ("sum.c")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x000000d3 "int")

0x000000c7:     DW_TAG_formal_parameter
                  DW_AT_name ("b")
                  DW_AT_decl_file ("sum.c")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x000000d3 "int")
...
-- main.c CU: concrete inlined instance tree
...
0x0000006d:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x00000000000000b0 "sum")
                  DW_AT_low_pc (0x00000000002016ef)
                  DW_AT_high_pc (0x00000000002016f1)
                  DW_AT_call_file ("main.c")
                  DW_AT_call_line (5)
                  DW_AT_call_column (0x19)

0x00000081:       DW_TAG_formal_parameter
                    DW_AT_location (DW_OP_reg0 RAX)
                    DW_AT_abstract_origin (0x00000000000000bc "a")

0x00000088:       DW_TAG_formal_parameter
                    DW_AT_location (DW_OP_reg2 RCX)
                    DW_AT_abstract_origin (0x00000000000000c7 "b")
...

Note that each entry within the concrete inlined instance tree in
the main.c CU has a DW_AT_abstract_origin attribute which
refers to a corresponding entry within the abstract instance
tree in the sum.c CU.
llvm-dwarfdump --statistics did not properly report
DW_TAG_formal_parameters/DW_TAG_variables from concrete inlined
instance trees which had 0% location coverage and which
referred to a different CU, mainly because information about abstract
instance trees and their parameters/variables was stored
locally - just for the currently processed CU,
rather than globally - for all CUs.
In particular, if the concrete inlined instance tree from
the example above was to look like this
(i.e. parameter b has 0% location coverage, hence why it's missing):

0x0000006d:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x00000000000000b0 "sum")
                  DW_AT_low_pc (0x00000000002016ef)
                  DW_AT_high_pc (0x00000000002016f1)
                  DW_AT_call_file ("main.c")
                  DW_AT_call_line (5)
                  DW_AT_call_column (0x19)

0x00000081:       DW_TAG_formal_parameter
                    DW_AT_location (DW_OP_reg0 RAX)
                    DW_AT_abstract_origin (0x00000000000000bc "a")

llvm-dwarfdump --statistics would not have reported b as such.

Patch by Dimitrije Milosevic.

Differential revision: https://reviews.llvm.org/D113465
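
The local-versus-global bookkeeping described above can be sketched roughly as follows. This is a toy model, not the llvm-dwarfdump implementation: the dictionary layout and the names `record_abstract` and `missing_params` are hypothetical, and DIE offsets are taken from the example dump.

```python
def record_abstract(table: dict, cu: dict) -> None:
    """Record each abstract subprogram's parameter offsets in a table shared by all CUs."""
    for sub_offset, params in cu.get("abstract", {}).items():
        table[sub_offset] = set(params)


def missing_params(table: dict, cu: dict) -> list:
    """Return abstract parameter offsets with no concrete counterpart (0% coverage)."""
    missing = []
    for inlined in cu.get("inlined", []):
        expected = table.get(inlined["origin"], set())
        missing.extend(sorted(expected - set(inlined["param_origins"])))
    return missing


# sum.c CU holds the abstract instance tree; main.c CU holds the inlined instance.
sum_cu = {"abstract": {0xB0: [0xBC, 0xC7]}}
main_cu = {"inlined": [{"origin": 0xB0, "param_origins": [0xBC]}]}

global_table = {}
for cu in (sum_cu, main_cu):
    record_abstract(global_table, cu)

# With a global table, parameter b (0xc7) is reported as missing.
print([hex(off) for off in missing_params(global_table, main_cu)])

# With a per-CU table, main.c knows nothing about sum's parameters.
local_table = {}
record_abstract(local_table, main_cu)
print(missing_params(local_table, main_cu))
```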

commit | commitdiff | tree

Nemanja Ivanovic [Wed, 24 Nov 2021 10:34:01 +0000 (04:34 -0600)]

[PowerPC] Provide XL-compatible vec_round implementation

The XL implementation of vec_round for vector double uses
"round-to-nearest, ties to even" just as the vector float
version does. However, clang and gcc use "round-to-nearest-away"
for vector double and "round-to-nearest, ties to even"
for vector float.

The XL behaviour is implemented under the __XL_COMPAT_ALTIVEC__
macro similarly to other instances of incompatibility.

Differential revision: https://reviews.llvm.org/D113642
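
The two rounding modes differ only on exact ties. A small scalar illustration, assuming Python's `decimal` module as a stand-in for the vector instructions (the function names are made up for this sketch):

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_ties_even(x: float) -> float:
    """Round to nearest integer, ties to even."""
    return float(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_EVEN))


def round_ties_away(x: float) -> float:
    """Round to nearest integer, ties away from zero."""
    return float(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP))


for value in (2.5, 3.5, -2.5, 2.4):
    print(value, round_ties_even(value), round_ties_away(value))
```

Only the tie inputs (2.5, 3.5, -2.5) can show a difference; 2.4 rounds to 2.0 in both modes.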

commit | commitdiff | tree

Jeremy Morse [Wed, 24 Nov 2021 12:28:15 +0000 (12:28 +0000)]

[DebugInfo] Adjust x86 location-list tests for instruction referencing

This patch updates location lists in various x86 tests to reflect what
instruction referencing produces. There are two flavours of change:
* Not following a register copy immediately, because instruction
referencing can make some slightly smarter decisions,
* Extended ranges, due to having additional information.

The register changes aren't that interesting, it's just a choice between
equally legitimate registers that instr-ref does differently. The extended
ranges are largely due to following stack restores better.

Differential Revision: https://reviews.llvm.org/D114362

commit | commitdiff | tree

Dmitry Vyukov [Wed, 24 Nov 2021 08:44:08 +0000 (09:44 +0100)]

tsan: add another fork deadlock test

The test tries to provoke the internal allocator into being locked during fork
and then force the child process to use the internal allocator.
This test sometimes deadlocks with the new tsan runtime.

Depends on D114514.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114515

commit | commitdiff | tree

Dmitry Vyukov [Wed, 24 Nov 2021 08:39:05 +0000 (09:39 +0100)]

sanitizer_common: remove SANITIZER_USE_MALLOC

It was introduced in:
9cffc9550b75 tsan: allow to force use of __libc_malloc in sanitizer_common
and used in:
512a18e51819 tsan: add standalone deadlock detector
and later used for Go support.
But now both uses are gone. Nothing defines SANITIZER_USE_MALLOC.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114514

commit | commitdiff | tree

Manuel Klimek [Wed, 24 Nov 2021 11:17:19 +0000 (12:17 +0100)]

Clean up clang-format tech debt.

Make all code go through FormatTokenSource instead of going around it, which
makes changes to TokenSource brittle.

Add LLVM_DEBUG in FormatTokenSource to be able to follow the token stream.

commit | commitdiff | tree

Jeremy Morse [Wed, 24 Nov 2021 11:45:43 +0000 (11:45 +0000)]

[DebugInfo] Check both instr-ref and DBG_VALUE modes of sdag tests

In these test updates for instruction referencing, I've added specific
instr-ref RUN lines, and kept the DBG_VALUE-based variable location check
lines too. This is because argument handling is really fiddly, and I figure
it's worth duplicating the testing to ensure it's definitely correct.

There's also dbg-value-superreg-copy2.mir, a test for where variable
locations go when virtual registers are coalesced together. I don't think
there's an instruction referencing specific test for this, so I have
duplicated that too for instruction referencing.

Differential Revision: https://reviews.llvm.org/D114262

commit | commitdiff | tree

Simon Pilgrim [Wed, 24 Nov 2021 11:28:28 +0000 (11:28 +0000)]

[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl

If we only demand bits from one half of a rotation pattern, see if we can simplify to a logical shift.

For the ARM rev16 patterns, I had to drop a fold to prevent srl(bswap()) -> rotr(bswap) -> srl(bswap) infinite loops. I've replaced this with an isel PatFrag which should do the same task.

https://alive2.llvm.org/ce/z/iroxki (rol -> shl by amt iff demanded bits has at least as many trailing zeros as the shift amount)
https://alive2.llvm.org/ce/z/4ez_U- (ror -> shl by revamt iff demanded bits has at least as many trailing zeros as the reverse shift amount)
https://alive2.llvm.org/ce/z/cD7dR- (ror -> lshr by amt iff demanded bits has at least as many leading zeros as the shift amount)
https://alive2.llvm.org/ce/z/_XGHtQ (rol -> lshr by revamt iff demanded bits has at least as many leading zeros as the reverse shift amount)

Differential Revision: https://reviews.llvm.org/D114354
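
The first condition listed above (rol -> shl) can be checked by brute force on a narrow type. The sketch below models 8-bit values in Python; the helper names are illustrative and are not LLVM APIs.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def rotl(x: int, amt: int) -> int:
    """Rotate an 8-bit value left by amt (0 <= amt < WIDTH)."""
    return ((x << amt) | (x >> (WIDTH - amt))) & MASK


def shl(x: int, amt: int) -> int:
    """Logical shift left, truncated to 8 bits."""
    return (x << amt) & MASK


def trailing_zeros(v: int) -> int:
    """Number of trailing zero bits; WIDTH when v is zero."""
    return WIDTH if v == 0 else (v & -v).bit_length() - 1


def rotl_matches_shl(amt: int, demanded: int) -> bool:
    """True if rotl and shl agree on every demanded bit for all inputs."""
    return all(
        (rotl(x, amt) & demanded) == (shl(x, amt) & demanded)
        for x in range(1 << WIDTH)
    )


# The replacement is valid exactly when the demanded bits have at least
# as many trailing zeros as the shift amount.
condition_is_exact = all(
    rotl_matches_shl(amt, demanded) == (trailing_zeros(demanded) >= amt)
    for amt in range(WIDTH)
    for demanded in range(1 << WIDTH)
)
print(condition_is_exact)
```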

commit | commitdiff | tree

Jay Foad [Fri, 12 Nov 2021 18:02:58 +0000 (18:02 +0000)]

[AMDGPU] Implement widening multiplies with v_mad_i64_i32/v_mad_u64_u32

Select SelectionDAG ops smul_lohi/umul_lohi to
v_mad_i64_i32/v_mad_u64_u32 respectively, with an addend of 0.
v_mul_lo, v_mul_hi and v_mad_i64/u64 are all quarter-rate instructions
so it is better to use one instruction than two.

Further improvements are possible to make better use of the addend
operand, but this is already a strict improvement over what we have
now.

Differential Revision: https://reviews.llvm.org/D113986
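
The arithmetic behind the selection can be illustrated with a simple integer model. This is only a sketch of the unsigned case: `mad_u64_u32` here is a Python function modelling a 32x32->64 multiply-add, not the instruction's full semantics (it ignores the carry-out, for example).

```python
U32 = 0xFFFFFFFF
U64 = 0xFFFFFFFFFFFFFFFF


def mad_u64_u32(a: int, b: int, c: int) -> int:
    """Model: (a * b + c) mod 2^64, with a and b treated as unsigned 32-bit."""
    return ((a & U32) * (b & U32) + (c & U64)) & U64


def umul_lohi(a: int, b: int) -> tuple:
    """Widening multiply as one multiply-add with an addend of 0."""
    wide = mad_u64_u32(a, b, 0)
    return wide & U32, wide >> 32  # (low half, high half)


lo, hi = umul_lohi(0xFFFFFFFF, 0xFFFFFFFF)
print(hex(lo), hex(hi))
```

A single 64-bit result yields both halves, which is why one instruction can replace a separate low and high multiply.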

commit | commitdiff | tree

Jay Foad [Fri, 19 Nov 2021 16:40:29 +0000 (16:40 +0000)]

[AMDGPU] Only select VOP3 forms of VOP2 instructions

Change VOP_PAT_GEN to default to not generating an instruction selection
pattern for the VOP2 (e32) form of an instruction, only for the VOP3
(e64) form. This allows SIFoldOperands maximum freedom to fold copies
into the operands of an instruction, before SIShrinkInstructions tries
to shrink it back to the smaller encoding.

This affects the following VOP2 instructions:
v_min_i32
v_max_i32
v_min_u32
v_max_u32
v_and_b32
v_or_b32
v_xor_b32
v_lshr_b32
v_ashr_i32
v_lshl_b32

A further cleanup could simplify or remove VOP_PAT_GEN, since its
optional second argument is never used.

Differential Revision: https://reviews.llvm.org/D114252

commit | commitdiff | tree

SYNOPSYS\georgiev [Wed, 24 Nov 2021 11:13:17 +0000 (11:13 +0000)]

[LLDB/test] lldbutil check_breakpoint() - check target instance

Check test.target instance type before we attempt to get the breakpoint.
This fix is suggested by 'clayborg'.
Ref: https://reviews.llvm.org/D111899#inline-1090156

commit | commitdiff | tree

Carl Ritson [Wed, 24 Nov 2021 08:57:01 +0000 (17:57 +0900)]

[AMDGPU] Only allow implicit WQM in pixel shaders

Implicit derivatives are only valid in pixel shaders,
hence only implicitly enable WQM for pixel shaders.
This avoids unintended WQM in other shader types (e.g. compute)
when image sampling instructions are used.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D114414

commit | commitdiff | tree

David Green [Wed, 24 Nov 2021 10:41:00 +0000 (10:41 +0000)]

[ARM] Fold (fadd x, (vselect c, y, -0.0)) into (vselect c, (fadd x, y), x)

This is similar to D113574, but as a DAG combine, not tablegen patterns.
Doing the fold as a DAG combine allows the fadd to be folded with a
fmul, finally producing a predicated vfma. It performs the same fold of
fadd(x, vselect(p, y, -0.0)) to vselect(p, fadd(x, y), x) using -0.0 as
the identity value of a fadd.

Differential Revision: https://reviews.llvm.org/D113584

commit | commitdiff | tree

Matthias Springer [Wed, 24 Nov 2021 10:32:33 +0000 (19:32 +0900)]

[mlir][linalg][bufferize][NFC] Move vector interface impl to new build target

This makes ComprehensiveBufferize entirely independent of the vector dialect.

Differential Revision: https://reviews.llvm.org/D114218

commit | commitdiff | tree

David Sherwood [Tue, 23 Nov 2021 16:44:55 +0000 (16:44 +0000)]

[NFC] Tidy up SelectionDAGBuilder::visitIntrinsicCall to use existing sdl debug loc

In quite a few places we were calling getCurSDLoc() to get the debug
location, but this is already a local variable `sdl`.

Differential Revision: https://reviews.llvm.org/D114447

commit | commitdiff | tree

Jeremy Morse [Wed, 24 Nov 2021 10:20:03 +0000 (10:20 +0000)]

[DebugInfo][InstrRef] Avoid crash when values optimised out late in sdag

It appears that we can emit all the instructions for a function, including
debug instructions, and then optimise some of the values out late.
Specifically, in the attached test case, an argument gets optimised out
after DBG_VALUE / DBG_INSTR_REFs are created. This confuses
MachineFunction::finalizeDebugInstrRefs, which expects to be able to find a
defining instruction, and crashes instead.

Fix this by identifying when there's no defining instruction, and
translating that instead into a DBG_VALUE $noreg.

Differential Revision: https://reviews.llvm.org/D114476

commit | commitdiff | tree

David Green [Wed, 24 Nov 2021 10:22:20 +0000 (10:22 +0000)]

[ARM] Fold floating point select(binop) patterns

Similar to D84091 which added extra predicated folds for integer operations
using the identity element of the operation, this adds them for floating
point operations for the form `BinOp(x, select(p, y, Identity))`. They are
folded back to predicated versions of the operator, with fadd having the
identity -0.0, fsub using the identity 0.0 and fmul using 1.0.

Differential Revision: https://reviews.llvm.org/D113574
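
The choice of identity values matters because of signed zeros. A small check using Python floats (IEEE-754 doubles in the default rounding mode); the helper `same_float` is made up for this sketch:

```python
import math


def same_float(a: float, b: float) -> bool:
    """Equality that also distinguishes -0.0 from +0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


samples = [0.0, -0.0, 1.5, -2.25, float("inf"), float("-inf")]

fadd_neg_zero_ok = all(same_float(x + (-0.0), x) for x in samples)
fadd_pos_zero_ok = all(same_float(x + 0.0, x) for x in samples)
fsub_pos_zero_ok = all(same_float(x - 0.0, x) for x in samples)
fmul_one_ok = all(same_float(x * 1.0, x) for x in samples)

print(fadd_neg_zero_ok, fadd_pos_zero_ok, fsub_pos_zero_ok, fmul_one_ok)
```

`fadd_pos_zero_ok` is the odd one out: -0.0 + 0.0 produces +0.0, so +0.0 is not an identity for fadd, while -0.0 is.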

commit | commitdiff | tree

Dmitry Vyukov [Mon, 22 Nov 2021 14:44:00 +0000 (15:44 +0100)]

tsan: extend mmap test

Test size larger than clear_shadow_mmap_threshold,
which is handled differently.

Depends on D114348.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D114366

commit | commitdiff | tree

David Green [Wed, 24 Nov 2021 09:51:33 +0000 (09:51 +0000)]

[ARM] Add fma and update fadd/fmul predicated select tests. NFC

commit | commitdiff | tree

mydeveloperday [Wed, 24 Nov 2021 09:44:35 +0000 (09:44 +0000)]

[clang-format] NFC - recent changes caused clang-format to no longer be clang-formatted.

The following 2 commits caused files in clang-format to no longer be clang-formatted.

We would lose our "clean" status https://releases.llvm.org/13.0.0/tools/clang/docs/ClangFormattedStatus.html

c2271926a4fc - Make clang-format fuzz through Lexing with asserts enabled (https://github.com/llvm/llvm-project/commit/c2271926a4fc )

84bf5e328664 - Fix various problems found by fuzzing. (https://github.com/llvm/llvm-project/commit/84bf5e328664)

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D114430

commit | commitdiff | tree

Matthias Springer [Wed, 24 Nov 2021 09:20:00 +0000 (18:20 +0900)]

[mlir][linalg][bufferize][NFC] Move tensor interface impl to new build target

This makes ComprehensiveBufferize entirely independent of the tensor dialect.

Differential Revision: https://reviews.llvm.org/D114217

commit | commitdiff | tree

Florian Hahn [Wed, 24 Nov 2021 09:23:52 +0000 (09:23 +0000)]

[llvm-reduce] Add parallel chunk processing.

This patch adds parallel processing of chunks. When reducing very large
inputs, e.g. functions with 500k basic blocks, processing chunks in
parallel can significantly speed up the reduction.

To allow modifying clones of the original module in parallel, each clone
needs their own LLVMContext object. To achieve this, each job parses the
input module with their own LLVMContext. In case a job successfully
reduced the input, it serializes the result module as bitcode into a
result array.

To ensure parallel reduction produces the same results as serial
reduction, only the first successfully reduced result is used, and
results of other successful jobs are dropped. Processing resumes after
the chunk that was successfully reduced.

The number of threads to use can be configured using the -j option.
It defaults to 1, which means serial processing.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113857
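
The "first successful result wins" rule can be sketched as follows. This is a simplified model, not llvm-reduce's code: it uses Python threads instead of per-job LLVMContext objects, and `reduce_in_parallel` and `try_chunk` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def reduce_in_parallel(chunks, try_chunk, jobs=1):
    """Return (index, result) for the first chunk, in chunk order, that reduced.

    try_chunk returns a reduced result, or None if the chunk could not be removed.
    Results of later successful jobs are dropped so that the outcome matches
    what serial processing would have produced.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(try_chunk, chunk) for chunk in chunks]
        for index, future in enumerate(futures):
            result = future.result()
            if result is not None:
                return index, result
    return None


def demo_try_chunk(chunk):
    """Toy predicate: even chunks 'reduce' successfully."""
    return chunk * 10 if chunk % 2 == 0 else None


print(reduce_in_parallel([1, 3, 4, 6], demo_try_chunk, jobs=1))
print(reduce_in_parallel([1, 3, 4, 6], demo_try_chunk, jobs=4))
```

Both calls pick chunk index 2 regardless of the thread count, because results are inspected in chunk order rather than completion order.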

commit | commitdiff | tree

Pavel Labath [Wed, 24 Nov 2021 08:59:16 +0000 (09:59 +0100)]

[lldb/gdb-remote] Remove more non-stop mode remnants

The read thread handling is completely dead code now that non-stop mode
no longer exists.

commit | commitdiff | tree

Rosie Sumpter [Tue, 12 Oct 2021 08:51:42 +0000 (09:51 +0100)]

[LoopVectorize][CostModel] Update cost model for fmuladd intrinsic

This patch updates the cost model for ordered reductions so that a call
to the llvm.fmuladd intrinsic is modelled as a normal fmul instruction
plus the cost of an ordered fadd reduction.

Differential Revision: https://reviews.llvm.org/D111630

commit | commitdiff | tree

Rosie Sumpter [Tue, 16 Nov 2021 11:52:19 +0000 (11:52 +0000)]

[LoopVectorize] Print fast-math flags for VPReductionRecipe

commit | commitdiff | tree

Rosie Sumpter [Wed, 3 Nov 2021 12:40:14 +0000 (12:40 +0000)]

[LoopVectorize] Propagate fast-math flags for VPInstruction

In-loop vector reductions which use the llvm.fmuladd intrinsic involve
the creation of two recipes; a VPReductionRecipe for the fadd and a
VPInstruction for the fmul. If the call to llvm.fmuladd has fast-math flags
these should be propagated through to the fmul instruction, so an
interface setFastMathFlags has been added to the VPInstruction class to
enable this.

Differential Revision: https://reviews.llvm.org/D113125

commit | commitdiff | tree

Rosie Sumpter [Mon, 11 Oct 2021 14:50:44 +0000 (15:50 +0100)]

[LoopVectorize] Add vector reduction support for fmuladd intrinsic

Enables LoopVectorize to handle reduction patterns involving the
llvm.fmuladd intrinsic.

Differential Revision: https://reviews.llvm.org/D111555

commit | commitdiff | tree

Butygin [Fri, 19 Nov 2021 22:56:23 +0000 (01:56 +0300)]

[mlir][scf] Canonicalize scf.while with unused results

Differential Revision: https://reviews.llvm.org/D114291

commit | commitdiff | tree

Clement Courbet [Fri, 19 Nov 2021 15:42:32 +0000 (16:42 +0100)]

[clang-tidy] performance-unnecessary-copy-initialization: Fix false negative.

`isConstRefReturningMethodCall` should be considering
`CXXOperatorCallExpr` in addition to `CXXMemberCallExpr`. Clang considers
these to be distinct (`CXXOperatorCallExpr` derives from `CallExpr`, not
`CXXMemberCallExpr`), but we don't care in the context of this
check.

This is important because of
`std::vector<Expensive>::operator[](size_t) const`.

Differential Revision: https://reviews.llvm.org/D114249

commit | commitdiff | tree

Vitaly Buka [Wed, 24 Nov 2021 06:12:31 +0000 (22:12 -0800)]

[sanitizer] Add Abs<T>

commit | commitdiff | tree

Abinav Puthan Purayil [Mon, 8 Nov 2021 05:35:22 +0000 (11:05 +0530)]

[AMDGPU] Check for unneeded shift mask in shift PatFrags.

The existing constrained shift PatFrags only dealt with masked shift
from OpenCL front-ends. This change copies the
X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in
the shift PatFrag predicates.

Differential Revision: https://reviews.llvm.org/D113448
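
The idea behind an "unneeded" shift mask can be sketched as below, under the assumption that the hardware shift only reads the low log2(width) bits of the amount. This is a simplified model with made-up helper names; it looks only at a constant mask and ignores any known-bits reasoning.

```python
U32 = 0xFFFFFFFF


def trailing_ones(v: int) -> int:
    """Count consecutive set bits starting from bit 0."""
    count = 0
    while v & 1:
        v >>= 1
        count += 1
    return count


def is_unneeded_shift_mask(mask: int, shift_width_bits: int) -> bool:
    """True if 'amount & mask' keeps every bit the shift actually reads."""
    needed = (shift_width_bits - 1).bit_length()  # 5 for 32-bit, 6 for 64-bit
    return trailing_ones(mask) >= needed


def hw_shl32(x: int, amount: int) -> int:
    """Model of a 32-bit shift that only uses the low 5 bits of the amount."""
    return (x << (amount & 31)) & U32


# With mask 31 the explicit 'and' changes nothing for a 32-bit shift.
mask_is_redundant = all(
    hw_shl32(0x12345678, y & 31) == hw_shl32(0x12345678, y) for y in range(256)
)
print(is_unneeded_shift_mask(31, 32), is_unneeded_shift_mask(15, 32), mask_is_redundant)
```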

commit | commitdiff | tree

Igor Kudrin [Wed, 24 Nov 2021 05:17:03 +0000 (12:17 +0700)]

[ELF] Support the "read-only" memory region attribute

The attribute 'r' allows (or disallows for the negative case) read-only
sections, i.e. ones without the SHF_WRITE flag, to be assigned to the
memory region. Before the patch, lld could put a section in the wrong
region or fail with "error: no memory region specified for section".

Differential Revision: https://reviews.llvm.org/D113771

commit | commitdiff | tree

Vitaly Buka [Wed, 24 Nov 2021 04:05:25 +0000 (20:05 -0800)]

[sanitizer] Fail instead of crash without real_pthread_create

commit | commitdiff | tree

Bixia Zheng [Mon, 22 Nov 2021 23:24:52 +0000 (15:24 -0800)]

Accept symmetric sparse matrix in Matrix Market Exchange Format.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114402

commit | commitdiff | tree

Weverything [Wed, 24 Nov 2021 01:51:32 +0000 (17:51 -0800)]

Revert "tsan: new runtime (v3)"

This reverts commit ebd47b0fb78fa11758da6ffcd3e6b415cbb8fa28.
This was causing unexpected behavior in programs.

commit | commitdiff | tree

Uday Bondhugula [Mon, 22 Nov 2021 10:52:41 +0000 (16:22 +0530)]

[MLIR] Remove duplicate `Pass` suffix from ViewOpGraph class name

Remove duplicate `Pass` suffix from view-op-graph pass class name. The
extra suffix would lead to methods like registerViewOpGraphPassPass
being generated.

Differential Revision: https://reviews.llvm.org/D114459

commit | commitdiff | tree

wren romano [Mon, 22 Nov 2021 21:14:17 +0000 (13:14 -0800)]

[mlir][sparse] Adding wrappers for constantOverheadTypeEncoding

Minor code cleanup

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114392

commit | commitdiff | tree

Jun Ma [Wed, 24 Nov 2021 02:10:22 +0000 (10:10 +0800)]

Revert "[Taildup] Don't tail-duplicate loop header with multiple successors as its latches"

This reverts commit 1f9fa549841a2ec55aa5a131bfaf83f0383c4713.

commit | commitdiff | tree

Jun Ma [Wed, 24 Nov 2021 02:09:53 +0000 (10:09 +0800)]

Revert "Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""""

This reverts commit c93f93b2e3f28997f794265089fb8138dd5b5f13.

commit | commitdiff | tree

Mehdi Amini [Tue, 23 Nov 2021 06:35:37 +0000 (06:35 +0000)]

Update fir.insert_on_range syntax to make the range more explicit (NFC)

Also replace ArrayAttr with IndexElementsAttr to model subscript dimensions.
An array of attributes is sparse, inefficient storage, with an API that
requires unpacking/repacking integers at every call site.
Instead we can store a dense array of integers as IndexElementsAttr.

Reviewed By: clementval, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D112899

commit | commitdiff | tree

Zequan Wu [Tue, 23 Nov 2021 20:27:26 +0000 (12:27 -0800)]

[LLDB][NativePDB] Allow find functions by full names

I don't see a reason not to. If we allow looking up functions by full names,
I can change the test case in D113930 to use `lldb-test symbols --find=function --name=full::name --function-flags=full ...`,
though the duplicate method decl problem is still there for `lldb-test symbols --dump-ast`.
That's a separate bug; we can fix it later.

Differential Revision: https://reviews.llvm.org/D114467

commit | commitdiff | tree

Vitaly Buka [Wed, 24 Nov 2021 00:52:02 +0000 (16:52 -0800)]

[NFC][sanitizer] Limit StackStore stack size/tag to 1 byte

Nothing uses more than 8bit now. So the rest of the headers can store other data.
kStackTraceMax is 256 now, but all sanitizers by default store just 20-30 frames here.

commit | commitdiff | tree

Vitaly Buka [Sun, 21 Nov 2021 00:46:27 +0000 (16:46 -0800)]

[NFC][sanitizer] Test for b80affb8a149

commit | commitdiff | tree

Stanislav Mekhanoshin [Fri, 19 Nov 2021 22:42:29 +0000 (14:42 -0800)]

[AMDGPU] Remove a no-op check in the gfx90a hazard recognizer

Also rename helper function accordingly.

Differential Revision: https://reviews.llvm.org/D114289

commit | commitdiff | tree

Butygin [Thu, 28 Oct 2021 16:04:35 +0000 (19:04 +0300)]

[mlir][spirv] Add math to OpenCL conversion

Differential Revision: https://reviews.llvm.org/D113780

commit | commitdiff | tree

Florian Mayer [Mon, 22 Nov 2021 23:49:54 +0000 (15:49 -0800)]

[hwasan] support python3 in hwasan_sanitize

Verified that no diff exists between the previous version, the new version under Python 2, and the new version under Python 3 for an example stack.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D114404

commit | commitdiff | tree

Florian Mayer [Wed, 3 Nov 2021 00:29:13 +0000 (00:29 +0000)]

[stack-safety] Check SCEV constraints at memory instructions.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113160

commit | commitdiff | tree

Vitaly Buka [Tue, 23 Nov 2021 23:16:29 +0000 (15:16 -0800)]

[NFC][sanitizer] Reuse forEach for operator==

commit | commitdiff | tree

Vitaly Buka [Tue, 23 Nov 2021 03:38:44 +0000 (19:38 -0800)]

[sanitizer] Add DenseMap::forEach

commit | commitdiff | tree

Nemanja Ivanovic [Tue, 23 Nov 2021 22:45:34 +0000 (16:45 -0600)]

[PowerPC] Allow scalars for asm constraint "v" with VSX

Similarly to what GCC does, we should allow scalars with
the "v" constraint rather than introducing unnecessary
new constraints for scalars in Altivec registers.

Differential revision: https://reviews.llvm.org/D113635

commit | commitdiff | tree

Matt Arsenault [Thu, 4 Nov 2021 01:01:53 +0000 (21:01 -0400)]

PrologEpilogInserter: Use explicit control for scavenge slot placement

AMDGPU is unusual in that the stack is indexed in the same
direction as stack growth (up). We therefore always need the emergency
stack slots placed as low as possible to ensure they are in range of
load/store instruction immediate offsets. The existing logic is mostly
OK, but failed if we required stack realignment.

I don't understand what the existing control isFPCloseToIncomingSP is
supposed to mean, but it can only be used to stop placing the scavenge
slots earlier. Make this explicit so that targets can opt-in rather
than opt-out only.

commit | commitdiff | tree

Florian Hahn [Tue, 23 Nov 2021 22:47:26 +0000 (22:47 +0000)]

[LAA] Move visitPointers up in file (NFC).

This allows easier re-use in earlier functions.

commit | commitdiff | tree

Walter Erquinigo [Tue, 23 Nov 2021 22:23:34 +0000 (14:23 -0800)]

Fix a48501150b9ef64fd61d24f8cef2645237facc44

Issue in https://lab.llvm.org/buildbot/#/builders/96/builds/14682.

Making the test deterministic.

commit | commitdiff | tree

Danil Stefaniuc [Tue, 23 Nov 2021 22:11:42 +0000 (14:11 -0800)]

[formatters] List and forward_list capping_size determination and application

This diff is adding the capping_size determination for the list and forward list, to limit the number of children to be displayed. Also it modifies and unifies tests for libcxx and libstdcpp list data formatter.

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D114433

commit | commitdiff | tree

Rahul Joshi [Tue, 23 Nov 2021 21:25:26 +0000 (13:25 -0800)]

Move dependency llvm:AllTargetsAsmParsers from Translation to ExecutionEngine.

- Fixes a minor issue in https://reviews.llvm.org/D114338, which seems to have incorrectly
added the llvm:AllTargetsAsmParsers dependency to Translation in bazel build files.

Differential Revision: https://reviews.llvm.org/D114471

commit | commitdiff | tree

Danil Stefaniuc [Tue, 23 Nov 2021 22:02:05 +0000 (14:02 -0800)]

[formatters] Capping size limitation avoidance for the libcxx and libcpp bitset data formatters.

This diff is avoiding the size limitation introduced by the capping size for the libcxx and libcpp bitset data formatters.

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D114461

commit | commitdiff | tree

Walter Erquinigo [Tue, 23 Nov 2021 17:32:30 +0000 (09:32 -0800)]

Make some libstd++ formatters safer

We need to add checks that ensure that some core variables are valid, so
that we avoid printing out garbage data. The worst that could happen is
that an uninitialized variable is being printed as something with
123123432 children instead of 0.

Differential Revision: https://reviews.llvm.org/D114458

commit | commitdiff | tree

Walter Erquinigo [Tue, 23 Nov 2021 17:16:59 +0000 (09:16 -0800)]

Improve optional formatter

As suggested by @labath in https://reviews.llvm.org/D114403, we should
make the formatter more resilient to corrupted data. The Libcxx version
explicitly checks for engaged = 1, so we can do that as well for safety.

Differential Revision: https://reviews.llvm.org/D114450

commit | commitdiff | tree

Sanjay Patel [Tue, 23 Nov 2021 21:46:55 +0000 (16:46 -0500)]

[InstSimplify] fold xor logic of 2 variables

(a & b) ^ (~a | b) --> ~a

I was looking for a shortcut to reduce some of the complex logic
folds that are currently up for review (D113216
and others in that stack), and I found this missing from
instcombine/instsimplify.

There is a trade-off in putting it into instsimplify: because
we can't create new values here, we need a strict 'not' op (no
undef elements). Otherwise, the fold is not valid:
https://alive2.llvm.org/ce/z/k_AGGj

If this was in instcombine instead, we could create the proper
'not'. But having the fold here benefits other passes like GVN
that use instsimplify as an analysis.

There is a related fold where 'and' and 'or' are swapped, and
that is planned as a follow-up commit.

Differential Revision: https://reviews.llvm.org/D114462
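
For fully defined inputs, the fold can be confirmed exhaustively on a small bit width. The sketch below models i8 in Python with an illustrative helper name; it says nothing about the undef-element caveat discussed above, which needs a tool like Alive2.

```python
I8_MASK = 0xFF  # model i8; Python's ~ is unbounded, so mask the result


def xor_fold_holds(a: int, b: int) -> bool:
    """Check (a & b) ^ (~a | b) == ~a for one pair of 8-bit values."""
    not_a = ~a & I8_MASK
    return ((a & b) ^ (not_a | b)) == not_a


# Exhaustive over all 65536 pairs of 8-bit inputs.
xor_fold_all_ok = all(xor_fold_holds(a, b) for a in range(256) for b in range(256))
print(xor_fold_all_ok)
```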

commit | commitdiff | tree

Vitaly Buka [Tue, 23 Nov 2021 21:49:41 +0000 (13:49 -0800)]

[NFC][sanitizer] Make method const

commit | commitdiff | tree

Vitaly Buka [Tue, 23 Nov 2021 21:48:25 +0000 (13:48 -0800)]

[NFC][sanitizer] Extract StackTraceHeader struct

commit | commitdiff | tree

Rong Xu [Mon, 22 Nov 2021 22:03:32 +0000 (14:03 -0800)]

[SampleFDO] Recompute BFI if the sample loader changes BPI

The MIR sample loader changes the branch probability but not BFI.
Here we force a recompute of BFI if the branch probabilities are
changed.

Also register the MIR FSAFDO passes properly.

Differential Revision: https://reviews.llvm.org/D114400

commit | commitdiff | tree

Vitaly Buka [Tue, 16 Nov 2021 04:58:51 +0000 (20:58 -0800)]

[NFC][sanitizer] Add StackStoreTest

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D114463

commit | commitdiff | tree

Dimitry Andric [Tue, 23 Nov 2021 19:47:38 +0000 (20:47 +0100)]

[lldb] Move create_relative_symlink function up in CMake hierarchy

Configuring lldb with `LLDB_ENABLE_PYTHON=OFF` and `LLDB_ENABLE_LUA=ON` results in a CMake error:

    CMake Error at lldb/bindings/lua/CMakeLists.txt:47 (create_relative_symlink):
      Unknown CMake command "create_relative_symlink".
    Call Stack (most recent call first):
      lldb/CMakeLists.txt:117 (finish_swig_lua)

This is because the CMake function `create_relative_symlink` only exists in `lldb/bindings/python/CMakeLists.txt`, and not in `lldb/bindings/lua/CMakeLists.txt`.

Move the function to `lldb/bindings/CMakeLists.txt`, so it is available for all language bindings.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D114465

commit | commitdiff | tree

Vitaly Buka [Tue, 23 Nov 2021 20:51:12 +0000 (12:51 -0800)]

[NFC][sanitizer] Early return for empty StackTraces

Current callers should filter them out anyway,
but with this patch we don't need to rely on that assumption.

commit | commitdiff | tree

Vitaly Buka [Tue, 23 Nov 2021 20:41:28 +0000 (12:41 -0800)]

[NFC][sanitizer] Move StackStore::Allocated into cpp file

commit | commitdiff | tree

Sanjay Patel [Tue, 23 Nov 2021 17:10:03 +0000 (12:10 -0500)]

[InstSimplify] add tests for xor logic fold; NFC

commit | commitdiff | tree

Rob Suderman [Wed, 10 Nov 2021 22:02:54 +0000 (14:02 -0800)]

[mlir][tosa] Materialize tosa.pad value and fold noop pads

Padding can now explicitly specify the padding value when a non-zero value is wanted.
This also bypasses pads that do nothing.
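
The two behaviours can be sketched on a 1-D list, assuming a hypothetical `pad_1d` helper (an illustration only, not the MLIR/TOSA API): an explicit pad value, and a pad with zero amounts that is bypassed by returning the input unchanged.

```python
def pad_1d(values, low, high, pad_value=0):
    """Pad `values` with `low` leading and `high` trailing copies of `pad_value`."""
    if low == 0 and high == 0:
        # No-op pad: hand back the input itself instead of building a copy.
        return values
    return [pad_value] * low + list(values) + [pad_value] * high


data = [1, 2]
padded = pad_1d(data, 1, 2, pad_value=-1)
bypassed = pad_1d(data, 0, 0)
print(padded, bypassed is data)
```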

Differential Revision: https://reviews.llvm.org/D113611

commit | commitdiff | tree

Rob Suderman [Tue, 23 Nov 2021 03:43:06 +0000 (19:43 -0800)]

[mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support

Transpose convolution decomposition is now performed in a separate pass. This
allows padding / constant propagation to be performed at the TOSA level. It
also adds support for striding when there is no dilation.

Differential Revision: https://reviews.llvm.org/D114409

commit | commitdiff | tree

LLVM GN Syncbot [Tue, 23 Nov 2021 20:11:07 +0000 (20:11 +0000)]

[gn build] Port 1392b654ff65

commit | commitdiff | tree

Mehdi Amini [Tue, 23 Nov 2021 20:10:36 +0000 (20:10 +0000)]

Revert "profi - a flow-based profile inference algorithm: Part I (out of 3)"

This reverts commit 884b6dd311422bbfac62b8a90fbfff8e77ba8121.
The Windows build is broken with a linker error.

commit | commitdiff | tree

MaheshRavishankar [Tue, 23 Nov 2021 18:21:52 +0000 (10:21 -0800)]

[mlir][Linalg] Add pad vectorization patterns into LinalgStrategyVectorize passes.

Add an option to control whether these patterns are added to the
pattern list or not.

Differential Revision: https://reviews.llvm.org/D114290

commit | commitdiff | tree

Mehrnoosh Heidarpour [Tue, 23 Nov 2021 18:50:13 +0000 (13:50 -0500)]

[InstCombine] Add test cases for D114339; NFC

Adding test cases for XOR logic folds with base result.

Differential Revision: https://reviews.llvm.org/D114436

commit | commitdiff | tree

LLVM GN Syncbot [Tue, 23 Nov 2021 19:09:46 +0000 (19:09 +0000)]

[gn build] Port 884b6dd31142

commit | commitdiff | tree

Quinn Pham [Thu, 18 Nov 2021 21:03:03 +0000 (15:03 -0600)]

[NFC][llvm] Inclusive language: remove instance of master in LiveRangeUtils.h

[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with primary in `LiveRangeUtils.h`.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D114191

commit | commitdiff | tree

spupyrev [Tue, 23 Nov 2021 16:47:23 +0000 (08:47 -0800)]

profi - a flow-based profile inference algorithm: Part I (out of 3)

The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.

Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.
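
The sketch below is not the profi algorithm; it only illustrates, under simplifying assumptions, the flow-conservation property that a consistent profile should satisfy (for every block, incoming edge counts equal the block count, which equals outgoing edge counts). The CFG, the counts, and the `flow_violations` helper are all hypothetical.

```python
def flow_violations(block_counts, edge_counts, entry, exit_block):
    """Return the blocks whose count disagrees with their edge flow."""
    violations = {}
    for block, count in block_counts.items():
        inflow = sum(c for (_, dst), c in edge_counts.items() if dst == block)
        outflow = sum(c for (src, _), c in edge_counts.items() if src == block)
        if block != entry and inflow != count:
            violations[block] = ("in", inflow, count)
        elif block != exit_block and outflow != count:
            violations[block] = ("out", outflow, count)
    return violations


# Diamond CFG where sampling under-counted the "right" block (25, not 30).
sampled_blocks = {"entry": 100, "left": 70, "right": 25, "exit": 100}
sampled_edges = {("entry", "left"): 70, ("entry", "right"): 25,
                 ("left", "exit"): 70, ("right", "exit"): 25}
violations = flow_violations(sampled_blocks, sampled_edges, "entry", "exit")

# The same profile after adjusting "right" so that flow is conserved.
fixed_blocks = dict(sampled_blocks, right=30)
fixed_edges = dict(sampled_edges)
fixed_edges[("entry", "right")] = 30
fixed_edges[("right", "exit")] = 30
fixed = flow_violations(fixed_blocks, fixed_edges, "entry", "exit")
print(sorted(violations), fixed)
```

An inference algorithm in this spirit would search for the cheapest adjustment that removes all such violations; here the adjustment is simply written by hand.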

The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However, careful engineering and extensive evaluation show that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.

The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D109860

commit | commitdiff | tree

wren romano [Thu, 18 Nov 2021 21:06:25 +0000 (13:06 -0800)]

[mlir][sparse] Moving integration tests that merely use the Python API

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114192

commit | commitdiff | tree

Fangrui Song [Tue, 23 Nov 2021 18:30:11 +0000 (10:30 -0800)]

[ELF] Support non-RAX/non-adjacent R_X86_64_GOTPC32_TLSDESC/R_X86_64_TLSDESC_CALL

The current TLSDESC optimization code assumes:
```
leaq x@tlsdesc(%rip), %rax
call *x@tlscall(%rax)       # adjacent
```

From https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665 , it seems that the
two instructions may not be adjacent in GCC 10's output:
```
leaq x@tlsdesc(%rip), %rax
something else
call *x@tlscall(%rax)
```

This patch supports the case. While here, support non-RAX registers for
R_X86_64_GOTPC32_TLSDESC, in case the compiler generates inefficient code:

```
leaq x@tlsdesc(%rip), %rcx  # or %rdx, %rbx, %rdi, ...
movq %rcx, %rax
call *x@tlscall(%rax)       # GNU ld/gold error for non-RAX
```
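
To make the non-RAX case concrete, here is a small sketch of rewriting the opcode bytes of `leaq x@tlsdesc(%rip), %reg` into `movq $imm32, %reg` for an arbitrary 64-bit register. The rewrite target is an assumption chosen for illustration, the function name is hypothetical, and the 4-byte displacement is left untouched (a linker would write the resolved offset there).

```python
def relax_tlsdesc_lea(insn: bytes) -> bytes:
    """Rewrite `leaq disp32(%rip), %reg` (REX.W 8D /r) to `movq $imm32, %reg`."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    # Accept REX.W with or without REX.R, opcode 8D, and RIP-relative ModRM.
    if (rex & 0xFB) != 0x48 or opcode != 0x8D or (modrm & 0xC7) != 0x05:
        raise ValueError("not a REX.W leaq disp32(%rip), %reg")
    # The destination moves from ModRM.reg to ModRM.rm, so REX.R becomes REX.B.
    new_rex = 0x48 | ((rex >> 2) & 1)
    new_modrm = 0xC0 | ((modrm >> 3) & 7)
    return bytes([new_rex, 0xC7, new_modrm]) + insn[3:]


lea_rcx = bytes.fromhex("488d0d00000000")  # leaq 0(%rip), %rcx
lea_r8 = bytes.fromhex("4c8d0500000000")   # leaq 0(%rip), %r8
print(relax_tlsdesc_lea(lea_rcx).hex(), relax_tlsdesc_lea(lea_r8).hex())
```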

Differential Revision: https://reviews.llvm.org/D114416

commit | commitdiff | tree

Zarko Todorovski [Tue, 23 Nov 2021 18:22:21 +0000 (13:22 -0500)]

[llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts

Reworded some comments and asserts to avoid usage of `sanity check/test`.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114372

commit | commitdiff | tree

Pirama Arumuga Nainar [Tue, 23 Nov 2021 18:03:04 +0000 (10:03 -0800)]

[compiler-rt/profile] Include __llvm_profile_get_magic in module signature

The INSTR_PROF_RAW_MAGIC_* number in profraw files must match during
profile merging. This causes an error with 32-bit and 64-bit variants
of the same code: the module signatures for the two binaries are
identical, but they use different INSTR_PROF_RAW_MAGIC_* values, so
profile merging fails. Including the magic number when computing the
module signature yields different signatures for the 32-bit and 64-bit
profiles.
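
The idea can be sketched as follows, assuming a hypothetical `module_signature` helper and placeholder magic values (this is not how compiler-rt computes its signature): mixing the magic number into the hash separates builds that would otherwise collide.

```python
import hashlib


def module_signature(code_identity: bytes, raw_magic: bytes = b"") -> str:
    """Hash the module identity, optionally mixing in the raw-profile magic."""
    digest = hashlib.sha256()
    digest.update(code_identity)
    digest.update(raw_magic)
    return digest.hexdigest()


code = b"same-source-module"
# Placeholder values standing in for the 32-bit and 64-bit magic numbers.
MAGIC_32 = b"hypothetical-magic-32"
MAGIC_64 = b"hypothetical-magic-64"

old_32 = module_signature(code)
old_64 = module_signature(code)
new_32 = module_signature(code, MAGIC_32)
new_64 = module_signature(code, MAGIC_64)
print(old_32 == old_64, new_32 == new_64)
```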

Differential Revision: https://reviews.llvm.org/D114054

commit | commitdiff | tree

Philip Reames [Tue, 23 Nov 2021 17:57:30 +0000 (09:57 -0800)]

[indvars] Fix lftr crash when preheader is terminated by switch

This was found by oss-fuzz. The switch will get canonicalized to a branch, but if that has not happened by the time LFTR runs, we crash on an unneeded assert.

commit | commitdiff | tree

Nemanja Ivanovic [Tue, 23 Nov 2021 13:32:45 +0000 (07:32 -0600)]

[PowerPC] Add BCD add/sub/cmp builtins

Add support for builtins that use bcdadd./bcdsub. to add/subtract
Binary Coded Decimal values as well as to determine validity
and compare BCD values.
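
For readers unfamiliar with BCD arithmetic, here is a simplified sketch of unsigned packed-BCD addition and a validity check. It deliberately ignores the sign handling and condition-register results of the actual instructions, and the helper names `bcd_add` and `is_valid_bcd` are hypothetical.

```python
def is_valid_bcd(x: int, digits: int = 8) -> bool:
    """True if each of the low `digits` nibbles of `x` is a decimal digit."""
    return all(((x >> (4 * i)) & 0xF) <= 9 for i in range(digits))


def bcd_add(x: int, y: int, digits: int = 8) -> int:
    """Add two unsigned packed-BCD values digit by digit with decimal carry."""
    result, carry = 0, 0
    for i in range(digits):
        total = ((x >> (4 * i)) & 0xF) + ((y >> (4 * i)) & 0xF) + carry
        carry, digit = divmod(total, 10)
        result |= digit << (4 * i)
    return result  # a carry out of the top digit is dropped


print(hex(bcd_add(0x1299, 0x0001)), is_valid_bcd(0x12A9))
```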

Differential revision: https://reviews.llvm.org/D114088

commit | commitdiff | tree

Florian Hahn [Tue, 23 Nov 2021 17:37:12 +0000 (17:37 +0000)]

[LAA] Turn aggregate type check into assertion (NFCI).

getPtrStride should not be called with aggregate access types. There's
also an old TODO.

Turn the check into an assertion.

commit | commitdiff | tree

Philip Reames [Tue, 23 Nov 2021 17:18:28 +0000 (09:18 -0800)]

Revert "profi - a flow-based profile inference algorithm: Part I (out of 3)"

This reverts commit b00fc198224efa038a7469e068dd920b3f1aba75. This change fails to build (link) on Ubuntu x86.

commit | commitdiff | tree

Philip Reames [Tue, 23 Nov 2021 17:10:41 +0000 (09:10 -0800)]

[unroll] Remove two dead variable assignments [nfc]

These variables are not out-params, and we immediately return after assigning them. Thus, the assignments are dead and just confusing.

I believe these used to be out-params, but they're not any more.

commit | commitdiff | tree

spupyrev [Tue, 23 Nov 2021 16:47:23 +0000 (08:47 -0800)]

profi - a flow-based profile inference algorithm: Part I (out of 3)

The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.

Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.

The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However, careful engineering and extensive evaluation show that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.

The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D109860

commit | commitdiff | tree

Yaxun (Sam) Liu [Mon, 8 Nov 2021 21:20:22 +0000 (16:20 -0500)]

[HIP] Fix device stub name for Windows

This is a follow-up to https://reviews.llvm.org/D68578,
where the device stub name was changed for Itanium
mangling but not for Microsoft mangling.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D113491

commit | commitdiff | tree

Philip Reames [Tue, 23 Nov 2021 17:01:23 +0000 (09:01 -0800)]

[unroll] Use early return in shouldFullUnroll [nfc]

commit | commitdiff | tree

Dmitry Vyukov [Tue, 23 Nov 2021 10:50:49 +0000 (11:50 +0100)]

tsan: disable signal_sync2.cpp test on powerpc64

Fails 1 out of 10 runs on powerpc bots:
https://lab.llvm.org/buildbot/#/builders/121/builds/13391

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D114426

commit | commitdiff | tree

Dmitry Vyukov [Tue, 23 Nov 2021 15:58:32 +0000 (16:58 +0100)]

[lldb] Deflake TestTsanBasic.py

The test flaked on bots:
http://green.lab.llvm.org/green/job/lldb-cmake/38666/
The test expects that tsan will detect a single race
with concurrent memory accesses. TSan doesn't do this reliably.
Run 100 iterations of the racing threads, which should
make the race much more likely to be detected.
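
A back-of-the-envelope sketch of why repeating the race helps, assuming each iteration is detected independently with the same probability (both the independence assumption and the 5% figure are hypothetical, not measured):

```python
def detection_probability(p_single: float, iterations: int) -> float:
    """Chance that at least one of `iterations` independent tries is detected."""
    return 1.0 - (1.0 - p_single) ** iterations


once = detection_probability(0.05, 1)
hundred = detection_probability(0.05, 100)
print(once, hundred)
```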

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114444

commit | commitdiff | tree

Kazu Hirata [Tue, 23 Nov 2021 16:54:47 +0000 (08:54 -0800)]

[llvm] Use range-based for loops (NFC)

commit | commitdiff | tree

Paul Robinson [Tue, 23 Nov 2021 16:42:16 +0000 (08:42 -0800)]

[PS4][TLI] Remove redundant line

commit | commitdiff | tree

alex-t [Fri, 19 Nov 2021 17:27:35 +0000 (20:27 +0300)]

[AMDGPU] Enable fneg and fabs divergence-driven instruction selection.

Detailed description: We currently have a set of patterns that select ISD::FNEG and ISD::FABS to bitwise operations. They need to be predicated so that the VALU or SALU variant of the bitwise operation is selected according to the SDNode divergence bit.
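
The bitwise operations in question can be sketched for IEEE-754 single precision: negation flips the sign bit with an xor, and absolute value clears it with an and. The helper names below are hypothetical and the sketch says nothing about VALU/SALU selection.

```python
import struct

SIGN_BIT = 0x80000000


def f32_bits(x: float) -> int:
    """Reinterpret a float as its 32-bit IEEE-754 single-precision pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fneg(x: float) -> float:
    return bits_f32(f32_bits(x) ^ SIGN_BIT)  # flip the sign bit


def fabs(x: float) -> float:
    return bits_f32(f32_bits(x) & (SIGN_BIT - 1))  # clear the sign bit


print(fneg(1.5), fabs(-2.0))
```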

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D114257

commit | commitdiff | tree

Yaxun (Sam) Liu [Thu, 4 Nov 2021 17:49:43 +0000 (13:49 -0400)]

[NFC] Let Microsoft mangler accept GlobalDecl

This is a follow-up to https://reviews.llvm.org/D75700,
where support for GlobalDecl in the Microsoft mangler
was left incomplete.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D113490
