review.tizen.org Git - platform/upstream/llvm.git/log

Kadir Cetinkaya [Mon, 28 Sep 2020 12:17:02 +0000 (14:17 +0200)]

[clangd] Introduce MemoryTrees

A structure that can be used to represent memory usage of a nested
set of systems.

Differential Revision: https://reviews.llvm.org/D88411
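
As a rough illustration of the idea (not the class added by this patch), a nested memory-usage tree could look like the sketch below. The name `MemoryNode` and its members are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical sketch of a nested memory-usage tree; not clangd's real API.
class MemoryNode {
public:
  // Returns the child component with the given name, creating it on first use.
  MemoryNode &child(const std::string &Name) {
    std::unique_ptr<MemoryNode> &Slot = Children[Name];
    if (!Slot)
      Slot = std::make_unique<MemoryNode>();
    return *Slot;
  }

  // Records bytes owned directly by this component.
  void addUsage(std::size_t Bytes) { Self += Bytes; }

  // Bytes owned directly by this component, excluding children.
  std::size_t self() const { return Self; }

  // Bytes owned by this component plus all nested components.
  std::size_t total() const {
    std::size_t Sum = Self;
    for (const auto &Entry : Children)
      Sum += Entry.second->total();
    return Sum;
  }

private:
  std::size_t Self = 0;
  std::map<std::string, std::unique_ptr<MemoryNode>> Children;
};
```

A caller would record usage at the leaves, e.g. `Root.child("index").child("symbols").addUsage(100)`, and read `Root.total()` to get the aggregate.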

Simon Pilgrim [Mon, 12 Oct 2020 13:10:18 +0000 (14:10 +0100)]

[DAG][ARM][MIPS][RISCV] Improve funnel shift promotion to use 'double shift' patterns

Based on a discussion on D88783, if we're promoting a funnel shift to a width at least twice the size of the original type, then we can use the 'double shift' patterns (shifting the concatenated sources).

Differential Revision: https://reviews.llvm.org/D89139
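
The 'double shift' idea can be modelled in scalar C++ as follows. This is only a sketch of the arithmetic for an 8-bit funnel shift promoted to 16 bits; the helper names `fshl8` and `fshr8` are made up for the example and the code is not taken from the patch.

```cpp
#include <cstdint>

// fshl on 8-bit values, computed in a 16-bit type: concatenate the sources,
// shift the concatenation left, and keep the high half.
inline std::uint8_t fshl8(std::uint8_t A, std::uint8_t B, unsigned Amt) {
  Amt %= 8; // funnel shift amounts are interpreted modulo the bit width
  std::uint16_t Concat = static_cast<std::uint16_t>((A << 8) | B);
  std::uint16_t Shifted = static_cast<std::uint16_t>(Concat << Amt);
  return static_cast<std::uint8_t>(Shifted >> 8);
}

// fshr on 8-bit values: shift the concatenation right and keep the low half.
inline std::uint8_t fshr8(std::uint8_t A, std::uint8_t B, unsigned Amt) {
  Amt %= 8;
  std::uint16_t Concat = static_cast<std::uint16_t>((A << 8) | B);
  return static_cast<std::uint8_t>(Concat >> Amt);
}
```

Because the promoted type is at least twice as wide as the original, the concatenation fits and a single wide shift replaces the shl/lshr/or sequence.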

Alexander Kornienko [Mon, 12 Oct 2020 13:05:42 +0000 (15:05 +0200)]

[clang-tidy] Fix IncludeInserter usage example in a comment.

Kadir Cetinkaya [Mon, 12 Oct 2020 12:24:05 +0000 (14:24 +0200)]

[clangd][NFC] Fix formatting in ClangdLSPServer

Christian Sigg [Thu, 8 Oct 2020 14:37:44 +0000 (16:37 +0200)]

[mlir][gpu] Adding gpu runtime wrapper functions for async execution.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89037

Kazushi (Jam) Marukawa [Sun, 11 Oct 2020 08:33:47 +0000 (17:33 +0900)]

[VE] Support copysign math function

VE doesn't have an instruction for copysign, so expand it. Also add a
regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89228

Pavel Labath [Fri, 9 Oct 2020 09:23:17 +0000 (11:23 +0200)]

[lldb/Utility] Introduce UnimplementedError

This is essentially a replacement for the PacketUnimplementedError
previously present in the gdb-remote server code.

The reason I am introducing a generic error is because I wanted the
native process classes to be able to signal that they do not support
some functionality. They could not use PacketUnimplementedError as they
are independent of a specific transport protocol. Putting the error
class in the native process code was also not ideal because the
gdb-remote code is also used for lldb-server's platform mode, which does
not (should not) know how to debug individual processes.

I'm putting it under Utility, as I think it can be generally useful for
notifying about unsupported/unimplemented functionality (and in
particular, for programmatically testing whether something is
unsupported).

Differential Revision: https://reviews.llvm.org/D89121

Sam McCall [Fri, 9 Oct 2020 15:22:24 +0000 (17:22 +0200)]

[clangd] Stop capturing trace args if the tracer doesn't need them.

The tracer is now expected to allocate+free the args itself.

Differential Revision: https://reviews.llvm.org/D89135

Jan Kratochvil [Mon, 12 Oct 2020 09:25:47 +0000 (11:25 +0200)]

[nfc] [lldb] Simplify calling SymbolFileDWARF::GetDWARFCompileUnit

Only SymbolFileDWARF::ParseCompileUnit creates a CompileUnit and it uses
DWARFCompileUnit for that.

Differential Revision: https://reviews.llvm.org/D89165

Nicolas Vasilache [Mon, 12 Oct 2020 11:21:43 +0000 (11:21 +0000)]

[mlir][Linalg] NFC - Automate the printing of canonicalizers and folders for named Linalg ops.

This revision reduces the number of places that specific information needs to be modified when adding new named Linalg ops.

Differential Revision: https://reviews.llvm.org/D89223

Nicolas Vasilache [Mon, 12 Oct 2020 10:09:50 +0000 (10:09 +0000)]

[mlir][Linalg] Add named Linalg ops on tensor to buffer support.

This revision introduces support for buffer allocation for any named linalg op.
To avoid template instantiating many ops, a new ConversionPattern is created to capture the LinalgOp interface.

Some APIs are updated to remain consistent with MLIR style:
`OwningRewritePatternList * -> OwningRewritePatternList &`
`BufferAssignmentTypeConverter * -> BufferAssignmentTypeConverter &`

Differential revision: https://reviews.llvm.org/D89226

Sam McCall [Fri, 9 Oct 2020 14:06:46 +0000 (16:06 +0200)]

[clangd] Validate optional fields more strictly.

Differential Revision: https://reviews.llvm.org/D89131

Sam McCall [Fri, 9 Oct 2020 13:33:56 +0000 (15:33 +0200)]

[JSON] Add ObjectMapper::mapOptional to validate optional data.

Currently the idiom for mapping optional fields is:
  ObjectMapper O(Val, P);
  if (!O.map("required1", Out.R1) || !O.map("required2", Out.R2))
    return false;
  O.map("optional1", Out.O1); // ignore result
  return true;

If `optional1` is present but malformed, then we won't detect/report
that error. We may even leave `Out` in an incomplete state while returning true.
Instead, we'd often prefer to ignore `optional1` if it is absent, but otherwise
behave just like map().

Differential Revision: https://reviews.llvm.org/D89128
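
The intended behaviour can be sketched with a toy mapper. Everything here (`ToyValue`, `ToyObject`, `ToyMapper`, `Params`, `parseParams`) is a hypothetical stand-in written for this illustration; it is not the llvm::json API and only handles integer fields.

```cpp
#include <map>
#include <string>
#include <variant>

// Toy stand-ins for a JSON value and object; not the llvm::json types.
using ToyValue = std::variant<int, std::string>;
using ToyObject = std::map<std::string, ToyValue>;

class ToyMapper {
public:
  explicit ToyMapper(const ToyObject &Obj) : Obj(Obj) {}

  // Required field: fails if the key is missing or has the wrong type.
  bool map(const std::string &Key, int &Out) const {
    auto It = Obj.find(Key);
    if (It == Obj.end())
      return false;
    const int *V = std::get_if<int>(&It->second);
    if (!V)
      return false;
    Out = *V;
    return true;
  }

  // Optional field: an absent key is fine and leaves Out untouched,
  // but a present key is validated exactly like map().
  bool mapOptional(const std::string &Key, int &Out) const {
    if (Obj.find(Key) == Obj.end())
      return true;
    return map(Key, Out);
  }

private:
  const ToyObject &Obj;
};

struct Params {
  int R1 = 0;
  int O1 = -1; // default used when "optional1" is absent
};

inline bool parseParams(const ToyObject &Obj, Params &Out) {
  ToyMapper O(Obj);
  return O.map("required1", Out.R1) && O.mapOptional("optional1", Out.O1);
}
```

With this shape, a malformed `optional1` makes `parseParams` return false instead of being silently ignored, while a missing `optional1` keeps the default.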

Simon Pilgrim [Mon, 12 Oct 2020 10:38:52 +0000 (11:38 +0100)]

Revert rGb97093e520036f8 - "[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw"

This reverts commit b97093e520036f88c5b39e572966f1c8c387661e.

Funnel shift argument commutation isn't working correctly

Kazushi (Jam) Marukawa [Sun, 11 Oct 2020 08:33:47 +0000 (17:33 +0900)]

[VE] Support fneg and frem

VE doesn't have fneg or frem instructions, so change them to expand. Also
add regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89205

Kazushi (Jam) Marukawa [Tue, 6 Oct 2020 14:15:42 +0000 (23:15 +0900)]

[VE] Change to expand BRCOND

VE doesn't have a BRCOND instruction, so we need to expand it. Also add
a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89173

Haojian Wu [Mon, 12 Oct 2020 10:04:44 +0000 (12:04 +0200)]

Fix buildbot failure for 702529d899c87e9268bb33d836dbc91b6bce0b16.

Evgeny Leviant [Mon, 12 Oct 2020 09:49:56 +0000 (12:49 +0300)]

Add test for cortex-a57/ARM sched model. NFC

sstefan1 [Mon, 12 Oct 2020 09:25:52 +0000 (11:25 +0200)]

[IR][FIX] Intrinsics - don't apply default willreturn if IntrNoReturn is specified

Summary: Since willreturn will soon be added as a default attribute, we can end up with both noreturn and willreturn on the same intrinsic. This was exposed by llvm.wasm.throw which has IntrNoReturn.

Reviewers: jdoerfert, arsenm

Differential Revision: https://reviews.llvm.org/D88644

Haojian Wu [Mon, 12 Oct 2020 09:24:45 +0000 (11:24 +0200)]

[AST][RecoveryExpr] Don't perform early typo correction in C.

The dependent mechanism for C error-recovery is mostly finished;
this is the only place we have missed.

Differential Revision: https://reviews.llvm.org/D89045

Haojian Wu [Mon, 12 Oct 2020 09:12:58 +0000 (11:12 +0200)]

[AST][RecoveryExpr] Build dependent callexpr in C for error-recovery.

See whole context: https://reviews.llvm.org/D85025

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D84304

Georgii Rymar [Mon, 5 Oct 2020 10:25:59 +0000 (13:25 +0300)]

[llvm-readobj/elf] - Ignore the hash table when on EM_S390/EM_ALPHA platforms.

Specification for `SHT_HASH` table says (https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html#hash)
that it contains `Elf32_Word` entries for both `32/64` bit objects.

But there is a problem with `EM_S390` and `ELF::EM_ALPHA` platforms: they use 8-byte entries.
(see the issue reported: https://bugs.llvm.org/show_bug.cgi?id=47681).

Currently we might infer the size of the dynamic symbol table from the hash table,
but because of the issue mentioned, the calculation is wrong. We also don't dump the hash table
properly.

I am not sure if we want to support 8-byte entries as they violate the specification and also the
`.hash` table is kind of deprecated by itself (the `.gnu.hash` table is used nowadays).
So, the solution this patch suggests is to ban use of the hash table on `EM_S390/EM_ALPHA` platforms.

Differential revision: https://reviews.llvm.org/D88817
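
To see why 8-byte entries break the inference, recall that a `SHT_HASH` section is laid out as `nbucket`, `nchain`, `bucket[nbucket]`, `chain[nchain]`, and that `nchain` equals the number of symbol table entries. The sketch below uses a hypothetical helper, `symbolCountFromHash`, that reads the section as 32-bit words; it is an illustration under those assumptions, not the llvm-readobj code.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: infers the dynamic symbol count from a SHT_HASH
// section that has already been decoded into 32-bit words.
// Returns std::nullopt if the header does not fit the section size.
inline std::optional<std::uint32_t>
symbolCountFromHash(const std::vector<std::uint32_t> &Words) {
  if (Words.size() < 2)
    return std::nullopt;
  std::uint64_t NBucket = Words[0];
  std::uint64_t NChain = Words[1];
  // The section must hold the two header words plus both arrays.
  if (2 + NBucket + NChain > Words.size())
    return std::nullopt;
  return static_cast<std::uint32_t>(NChain);
}
```

If a producer writes 8-byte entries (for example `nbucket = 1`, `nchain = 3` stored little-endian), the same bytes read as 32-bit words start with `1, 0, 3, 0`, so the helper sees `nchain == 0` and reports the wrong symbol count.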

Haojian Wu [Mon, 12 Oct 2020 08:45:37 +0000 (10:45 +0200)]

[clang] Fix returning the underlying VarDecl as top-level decl for VarTemplateDecl.

Given the following VarTemplateDecl AST,

```
VarTemplateDecl col:26 X
|-TemplateTypeParmDecl typename depth 0 index 0
`-VarDecl X 'bool' cinit
`-CXXBoolLiteralExpr 'bool' true
```

previously, we returned the VarDecl as the top-level decl, which was not
correct; the top-level decl should be the VarTemplateDecl.

Differential Revision: https://reviews.llvm.org/D89098

Nicolas Vasilache [Mon, 12 Oct 2020 08:23:54 +0000 (08:23 +0000)]

Revert "Revert "Give attributes C++ namespaces.""

This reverts commit df295fac6cd14977672b2874700572e0f77b77da.

Reactivates a spuriously rolled back change.

Alexander Belyaev [Sun, 11 Oct 2020 18:09:27 +0000 (20:09 +0200)]

[mlir] Move Linalg tensors-to-buffers tests to Linalg tests.

The buffer placement preparation tests in
test/Transforms/buffer-placement-preparation* are using Linalg as a test
dialect which leads to confusion and "copy-pasta", i.e. Linalg is being
extended now and when TensorsToBuffers.cpp is changed, TestBufferPlacement is
sometimes kept in-sync, which should not be the case.

This has led to an unnoticed bug, because the tests were in a different directory and the patterns were slightly off.

Differential Revision: https://reviews.llvm.org/D89209

David Sherwood [Mon, 12 Oct 2020 08:04:42 +0000 (09:04 +0100)]

Fix build failure caused by c5ba0d33cc060cc06a28a5d9101060afd1c0ee9a

Roman Lebedev [Mon, 12 Oct 2020 08:00:52 +0000 (11:00 +0300)]

[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown

While we indeed can't treat them as no-ops, I believe we can/should
do better than just modelling them as `unknown`. The `inttoptr` story
is complicated, but for `ptrtoint`, it seems straightforward
to model it just as a zext-or-trunc of unknown.

This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D88806

David Sherwood [Wed, 23 Sep 2020 13:05:15 +0000 (14:05 +0100)]

[SVE] Make ElementCount and TypeSize use a new PolySize class

I have introduced a new template PolySize class, where the template
parameter determines the type of quantity, i.e. for an element
count this is just an unsigned value. The ElementCount class is
now just a simple derivation of PolySize<unsigned>, whereas TypeSize
is more complicated because it still needs to contain the uint64_t
cast operator, since there are still many places in the code that
rely upon this implicit cast. As such the class also still needs
some of its own operators.

I've tried to minimise the amount of code in the base PolySize
class, which led to a couple of changes:

1. In some places we were relying on '==' operator comparisons
between ElementCounts and the scalar value 1. I didn't put this
operator in the new PolySize class, and thought it was actually
clearer to use the isScalar() function instead.
2. I removed the isByteSized function and replaced it with calls
to isKnownMultipleOf(8).

I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so
that it's more consistent with coefficientDivideBy.

Differential Revision: https://reviews.llvm.org/D88409

Kito Cheng [Wed, 7 Oct 2020 09:01:18 +0000 (17:01 +0800)]

[Tablegen][SubtargetEmitter] Print TuneCPU in Subtarget::ParseSubtargetFeatures

Let the user know which -tune-cpu is in use.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D88951

Vitaly Buka [Sat, 3 Oct 2020 04:08:13 +0000 (21:08 -0700)]

[NFC][Asan] Remove unused macro

John McCall [Mon, 12 Oct 2020 05:09:25 +0000 (01:09 -0400)]

Revert "[SYCL] Implement __builtin_unique_stable_name."

This reverts commit b5a034e771d0e4d7d8e71fc545b230d98e5a1f42.

This feature was added without following the proper process.

Fangrui Song [Sun, 11 Oct 2020 22:51:55 +0000 (15:51 -0700)]

[SchedDAGInstrs] Delete redundant contains(). NFC

Jonas Devlieghere [Mon, 12 Oct 2020 03:14:00 +0000 (20:14 -0700)]

Revert "PR47792: Include the type of a pointer or reference non-type template"

This reverts commit 849c60541b630ddf8cabf9179fa771b3f4207ec8 because it
results in a stage 2 build failure:

llvm-project/clang/include/clang/AST/ExternalASTSource.h:409:20: error:
definition with same mangled name
'_ZN5clang25LazyGenerationalUpdatePtrIPKNS_4DeclEPS1_XadL_ZNS_17ExternalASTSource19CompleteRedeclChainES3_EEE9makeValueERKNS_10ASTContextES4_'
as another definition

static ValueType makeValue(const ASTContext &Ctx, T Value);

Qiu Chaofan [Mon, 12 Oct 2020 02:40:19 +0000 (10:40 +0800)]

[NFC] Move PPC strict-fp MIR test to dedicated file

fp-strict-conv-f128.ll is generated by a script, but some manual MIR tests
exist in it. Move them to another file to satisfy the script when updating.

Valentin Clement [Mon, 12 Oct 2020 01:26:54 +0000 (21:26 -0400)]

[mlir][openacc] Introduce acc.enter_data operation

This patch introduces the acc.enter_data operation that represents an OpenACC Enter Data directive.
Operands and attributes are derived from clauses in the spec 2.6.6.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88941

Richard Smith [Sun, 11 Oct 2020 22:57:52 +0000 (15:57 -0700)]

PR47792: Include the type of a pointer or reference non-type template
parameter in its notion of template argument identity.

We already did this for all the other kinds of non-type template
argument. We're still missing the type from the mangling, so we continue
to be able to see collisions at link time; that's an open ABI issue.

Craig Topper [Sun, 11 Oct 2020 20:31:15 +0000 (13:31 -0700)]

[ValueTracking] Use KnownBits::countMaxLeadingZeros/countMaxTrailingZeros to make code more readable. NFC

Richard Smith [Sun, 11 Oct 2020 21:20:01 +0000 (14:20 -0700)]

Fix arc lint's clang-format rule: only format the file we were asked to format.

This avoids diffs being applied in the work tree to files that are
supposed to be excluded (clang tests), allows arc to properly provide
interactive feedback for the formatting fixes, and reduces the number of
files that we format, in a change affecting N files, from N^2 to N.

Christian Iversen [Sun, 11 Oct 2020 21:19:25 +0000 (14:19 -0700)]

[ELF] Fix broken bitstream linking with lld when e_machine > 255

In ELF/InputFiles.cpp, getBitcodeMachineKind() is limited to uint8_t return
type. This works as long as EM_xxx is < 256, which is true for common
architectures, but not for some newly assigned or unofficial EM_* values.

The corresponding ELF field (e_machine) can hold uint16_t.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D89185
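
The truncation is easy to reproduce in isolation. In the sketch below, `machineKindNarrow` and `machineKindWide` are hypothetical stand-ins for a function returning the machine kind with the old and the new return type; `0x1234` is a made-up machine value above 255, while 62 is `EM_X86_64`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the old behaviour: a uint8_t return type
// silently keeps only the low 8 bits of the 16-bit e_machine value.
inline std::uint8_t machineKindNarrow(std::uint16_t EMachine) {
  return static_cast<std::uint8_t>(EMachine);
}

// Hypothetical stand-in for the fixed behaviour: the return type matches
// the width of the e_machine field, so no information is lost.
inline std::uint16_t machineKindWide(std::uint16_t EMachine) {
  return EMachine;
}
```

For values below 256 both variants agree, which is why the problem goes unnoticed on common architectures.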

Tres Popp [Fri, 9 Oct 2020 14:45:50 +0000 (16:45 +0200)]

[mlir] Type erase inputs to select statements in shape.broadcast lowering.

This is required or broadcasting with operands of different ranks will lead to
failures as the select op requires both possible outputs and its output type to
be the same.

Differential Revision: https://reviews.llvm.org/D89134

Nathan Ridge [Sat, 5 Sep 2020 23:22:59 +0000 (19:22 -0400)]

[clangd] Avoid relations being overwritten in a header shard

Fixes https://github.com/clangd/clangd/issues/510

Differential Revision: https://reviews.llvm.org/D87256

Roman Lebedev [Sun, 11 Oct 2020 17:17:09 +0000 (20:17 +0300)]

[InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load

And another step towards transforms not introducing inttoptr and/or
ptrtoint casts that weren't there already.

As we've been establishing (see D88788/D88789), if there is an int<->ptr cast,
it basically must stay as-is; we can't do much with it.

I've looked, and the main source of such new casts being introduced,
as far as I can tell, is this transform, which, ironically,
tries to reduce the number of casts.

On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in
33.58% fewer `IntToPtr`s (19014 -> 12629)
and 76.20% more `PtrToInt`s (18589 -> 32753),
which is an increase of 20.69% in total.

However, just on RawSpeed, where I know there are basically
no `IntToPtr`s in the original source code,
this results in 99.27% fewer `IntToPtr`s (2724 -> 20)
and 82.92% more `PtrToInt`s (4513 -> 8255),
which is again an increase of 14.34% in total.

To me this does seem like a step in the right direction:
we end up with strictly fewer `IntToPtr`s, but strictly more `PtrToInt`s,
which seems like a reasonable trade-off.

See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995
for some more discussion on the subject.

(Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair`
should be taught about this, yes)

Reviewed By: nlopes, nikic

Differential Revision: https://reviews.llvm.org/D88979

Fangrui Song [Sun, 11 Oct 2020 05:21:47 +0000 (22:21 -0700)]

[X86] Define __LAHF_SAHF__ if feature 'sahf' is set or 32-bit mode

GCC 11 will define this macro.

In LLVM, the feature flag only applies to 64-bit mode and we always define the
macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can
suppress the macro. The discrepancy is unlikely to cause trouble.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89198

David Green [Sun, 11 Oct 2020 15:58:34 +0000 (16:58 +0100)]

[LV] Tail folded inloop reductions.

This expands upon the inloop reductions added in e9761688e41cb9e976,
allowing them to be inserted into tail folded loops. Reductions are
generated with the form:

  x = select(mask, vecop, zero)
  v = vecreduce.add(x)
  c = add chain, v

Where zero here is chosen as the identity value for add reductions. The
backend is then expected to fold the select and the vecreduce into a
single predicated instruction.

Most of the code is fairly straightforward, except for the creation of
blockmasks which need to ensure they are created in dominance order. The
order they are added is altered to be after any phis, keeping the
requirements for the underlying IR.

Differential Revision: https://reviews.llvm.org/D84451
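
The select/vecreduce/add form above can be modelled in scalar C++. This is a sketch only: the vector width `VF` and the function name `tailFoldedSum` are assumptions made for the illustration, and the lane loop stands in for a single masked vector operation.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor

// Scalar model of a tail-folded in-loop add reduction.
inline int tailFoldedSum(const std::vector<int> &Data) {
  int Chain = 0;
  for (std::size_t Base = 0; Base < Data.size(); Base += VF) {
    std::array<int, VF> X{};
    for (std::size_t Lane = 0; Lane < VF; ++Lane) {
      // The mask is false for lanes past the end of the data.
      bool Mask = Base + Lane < Data.size();
      // x = select(mask, vecop, zero): inactive lanes get the add identity.
      X[Lane] = Mask ? Data[Base + Lane] : 0;
    }
    int V = 0;
    for (int Elem : X)
      V += Elem; // v = vecreduce.add(x)
    Chain += V;  // c = add chain, v
  }
  return Chain;
}
```

Because masked-off lanes contribute zero, the final partial iteration needs no scalar epilogue.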

Zinovy Nis [Sat, 10 Oct 2020 18:32:51 +0000 (21:32 +0300)]

[clang-tidy] Fix crash in readability-function-cognitive-complexity on weak refs

Fix for https://bugs.llvm.org/show_bug.cgi?id=47779

Differential Revision: https://reviews.llvm.org/D89194

Nikita Popov [Sun, 11 Oct 2020 15:07:28 +0000 (17:07 +0200)]

[MemCpyOpt] Add lifetime may alias test (NFC)

Test the case where a lifetime intrinsic may alias the memcpy
source. Other cases test must or no alias.

David Green [Sun, 11 Oct 2020 14:06:21 +0000 (15:06 +0100)]

[LV] Extra predicated inloop reduction tests. NFC

Nikita Popov [Sun, 11 Oct 2020 13:21:49 +0000 (15:21 +0200)]

[MemCpyOpt] Add additional byval tests (NFC)

Test read/write clobbers and the non-local case.

Sanjay Patel [Sat, 10 Oct 2020 15:08:50 +0000 (11:08 -0400)]

[InstCombine] allow vector splats for add+xor --> shifts

Sanjay Patel [Sat, 10 Oct 2020 14:35:18 +0000 (10:35 -0400)]

[InstCombine] add one-use check to add+xor transform

As shown in the affected test, we could increase instruction
count without this limitation. There's another test with extra
use that shows we still convert directly to a real "sext" if
possible.

Sanjay Patel [Sat, 10 Oct 2020 14:24:15 +0000 (10:24 -0400)]

[InstCombine] add tests with extra uses for add+xor transform; NFC

Sanjay Patel [Sat, 10 Oct 2020 13:50:34 +0000 (09:50 -0400)]

[InstCombine] add/adjust tests for add+xor -> shifts; NFC

Kazushi (Jam) Marukawa [Sun, 11 Oct 2020 10:34:12 +0000 (19:34 +0900)]

[VE][NFC] Clean VEISelLowering.cpp

Clean the order of setOperationActions and others.

Differential Revision: https://reviews.llvm.org/D89203

Simon Pilgrim [Sun, 11 Oct 2020 10:25:22 +0000 (11:25 +0100)]

Fix Wdocumentation warning. NFCI.

Add a space after /param names before any commas otherwise the doxygen parsers get confused.

Simon Pilgrim [Sun, 11 Oct 2020 10:21:23 +0000 (11:21 +0100)]

[X86][SSE2] Use smarter instruction patterns for lowering UMIN/UMAX with v8i16.

This is my first LLVM patch, so please tell me if there are any process issues.

The main observation for this patch is that we can lower UMIN/UMAX with v8i16 by using unsigned saturated subtractions in a clever way. Previously this operation was lowered by flipping the sign bit of both inputs and the output, which turns the unsigned minimum/maximum into a signed one.

We could use this trick in reverse for lowering SMIN/SMAX with v16i8 instead. In terms of latency/throughput this is the needs one large move instruction. It's just that the sign bit turning has an increased chance of being optimized further. This is particularly apparent in the "reduce" test cases. However due to the slight regression in the single use case, this patch no longer proposes this.

Unfortunately this argument also applies in reverse to the new lowering of UMIN/UMAX with v8i16, which regresses the "horizontal-reduce-umax", "horizontal-reduce-umin", "vector-reduce-umin" and "vector-reduce-umax" test cases a bit with this patch. Maybe some extra casework would be possible to avoid this. However, independent of that, I believe that the benefits in the common case of just 1 to 3 chained min/max instructions outweigh the downsides in that specific case.

Patch By: @TomHender (Tom Hender) ActuallyaDeviloper

Differential Revision: https://reviews.llvm.org/D87236
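
One well-known way to build unsigned min/max from an unsigned saturating subtract is shown below as a scalar model of a single 16-bit lane. The helper names are hypothetical, and the identities are given as an illustration of the trick; the exact instruction sequence emitted by the patch may differ.

```cpp
#include <cstdint>

// Scalar model of one lane of an unsigned saturating subtract
// (the per-lane behaviour of SSE2 PSUBUSW): clamps at zero.
inline std::uint16_t subus16(std::uint16_t A, std::uint16_t B) {
  return A > B ? static_cast<std::uint16_t>(A - B) : 0;
}

// umin(a, b) = a - subus(a, b): subtracts the excess of a over b, if any.
inline std::uint16_t umin16(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint16_t>(A - subus16(A, B));
}

// umax(a, b) = subus(a, b) + b: adds the excess of a over b, if any, to b.
inline std::uint16_t umax16(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint16_t>(subus16(A, B) + B);
}
```

Neither identity can overflow: the subtraction in `umin16` never goes below zero and the addition in `umax16` never exceeds the larger input.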

Simon Pilgrim [Sun, 11 Oct 2020 09:39:51 +0000 (10:39 +0100)]

[InstCombine] Remove accidental unnecessary ConstantExpr qualification added in rGb752daa26b64155

MSVC didn't complain but everything else did....

Simon Pilgrim [Sun, 11 Oct 2020 09:37:20 +0000 (10:37 +0100)]

[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw

If value tracking can confirm that a shift value is less than the type bitwidth then we can more confidently fold general or(shl(a,x),lshr(b,sub(bw,x))) patterns to a funnel/rotate intrinsic pattern without causing bad codegen regressions in the backend (see D89139).

Differential Revision: https://reviews.llvm.org/D88783
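
The equivalence behind the fold can be checked with a scalar model for a 32-bit width. The function names are made up for this sketch; `fshl32` is written via a 64-bit concatenation so that it is independent of the or-of-shifts form it is compared against.

```cpp
#include <cstdint>

// The source pattern: or(shl(a, x), lshr(b, sub(32, x))).
// Only meaningful for 0 < X < 32; at X == 0 the right shift by 32 is
// undefined in C++ (and poison in LLVM IR), hence the bound on x.
inline std::uint32_t orOfShifts(std::uint32_t A, std::uint32_t B, unsigned X) {
  return (A << X) | (B >> (32 - X));
}

// Reference model of a 32-bit fshl: shift the 64-bit concatenation a:b
// left and keep the high 32 bits. The amount is taken modulo 32.
inline std::uint32_t fshl32(std::uint32_t A, std::uint32_t B, unsigned X) {
  X %= 32;
  std::uint64_t Concat = (static_cast<std::uint64_t>(A) << 32) | B;
  return static_cast<std::uint32_t>((Concat << X) >> 32);
}
```

For every in-range shift amount the two functions agree, which is the property the fold relies on once value tracking has established the bound.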

Simon Pilgrim [Sun, 11 Oct 2020 09:31:17 +0000 (10:31 +0100)]

[InstCombine] Replace getLogBase2 internal helper with ConstantExpr::getExactLogBase2. NFCI.

This exposes the helper for other power-of-2 instcombine folds that I'm intending to add vector support to.

The helper only operated on power-of-2 constants so getExactLogBase2 is a more accurate name.

Tobias Gysi [Sun, 11 Oct 2020 08:40:28 +0000 (10:40 +0200)]

[mlir] add scf.if op canonicalization pattern that removes unused results

The patch adds a canonicalization pattern that removes the unused results of scf.if operation. As a result, cse may remove unused computations in the then and else regions of the scf.if operation.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D89029

Xun Li [Sun, 11 Oct 2020 05:21:34 +0000 (22:21 -0700)]

[Coroutines] Refactor/Rewrite Spill and Alloca processing

This patch is a refactoring of how we process spills and allocas during CoroSplit.
In the previous implementation, everything that needs to go to the heap is put into Spills, including all the values defined by allocas.
And the way to identify a Spill, is to check whether there exists a use-def relationship that crosses suspension points.

This approach is fundamentally confusing, and unfortunately, incorrect.
First of all, allocas are always processed differently than spills, hence it's quite confusing to put them together. It's much cleaner to separate them and process them separately.
Doing so simplifies lots of code and makes the logic clearer and easier to reason about.

Secondly, use-def relationship is insufficient to decide whether a value defined by AllocaInst needs to go to the heap.
There are many cases where a value defined by AllocaInst can implicitly be used across suspension points without a direct use-def relationship.
For example, you can store the address of an alloca into the heap, and load that address after suspension. Or you can escape the address into an object through a function call.
Or you can have a PHINode that takes two allocas, and this PHINode is used across a suspension point (when this happens, the existing implementation will spill the PHINode, a.k.a. a stack address, to the heap!).
All these issues suggest that we need to separate spill and alloca in order to properly implement this.
This patch does not yet fix these bugs, however it sets up the code in a better shape so that we can start fixing them in the next patch.

The core idea of this patch is to add a new struct called FrameDataInfo, which contains all Spills, all Allocas, and a map from each definition to its layout index in the frame (FieldIndexMap).
Spills and Allocas are identified, stored and processed independently. When they are initially added to the frame, we record their field index through FieldIndexMap. When the frame layout is finalized, we update each index into their final layout index.

In doing so, I also cleaned up a few things and also discovered a few other bugs.

Cleanups:
1. Found out that PromiseFieldId is not used, delete it.
2. Previously, SpillInfo is a vector, which is strange because every def can have multiple users. This patch cleans it up by turning it into a map from def to users.
3. Previously, a frame Field struct contains a list of Spills that field corresponds to. This isn't necessary since we only need the layout index for each given definition. This patch removes that list. Instead, we connect each field and definition using the FieldIndexMap.
4. All the loops that process Spills are simplified now because we use a map instead of a vector.

Bugs:
It seems that we are only keeping llvm.dbg.declare intrinsics in the .resume part of the function. The ramp function will no longer have them. This means we are dropping some debug information in the ramp function.

The next step is to start fixing the bugs where the implementation fails to identify some allocas that should live on the frame.

Differential Revision: https://reviews.llvm.org/D88872

Craig Topper [Sun, 11 Oct 2020 04:34:37 +0000 (21:34 -0700)]

[X86] Redefine X86ISD::PEXTRB/W and X86ISD::PINSRB/PINSRW to use an i8 TargetConstant for the immediate instead of a ptr constant.

This is more consistent with other target specific ISD opcodes that
require immediates.

Craig Topper [Sun, 11 Oct 2020 03:11:26 +0000 (20:11 -0700)]

[X86] AMX intrinsics should have ImmArg for the register numbers and use timm in isel patterns.

Craig Topper [Sun, 11 Oct 2020 02:18:02 +0000 (19:18 -0700)]

[X86] Add an X86ISD::BEXTRI to distinguish the case where the control must be a constant.

The bextri intrinsic has an ImmArg attribute which will be converted
in SelectionDAG using TargetConstant. We previously converted this
to a plain Constant to allow X86ISD::BEXTR to call SimplifyDemandedBits
on it.

But while trying to decide if D89178 was safe, I realized that
this conversion of TargetConstant to Constant would be one case
where that would break.

So this patch adds a new opcode specifically for the immediate case.
And then teaches computeKnownBits and SimplifyDemandedBits to also
handle it, but not try to SimplifyDemandedBits on it. To make up
for that, I immediately masked the constant to 16 bits when
converting from the intrinsic node to the X86ISD node.
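
Masking the control to 16 bits is safe because BEXTR only reads bits 7:0 (start position) and bits 15:8 (length) of its control operand. The scalar model below, with the made-up name `bextr32`, sketches the 32-bit form of the instruction to show this; it is not code from the patch.

```cpp
#include <cstdint>

// Scalar model of 32-bit BEXTR: extract Len bits of Src starting at bit
// Start, where Start = Control[7:0] and Len = Control[15:8].
// Higher control bits are ignored.
inline std::uint32_t bextr32(std::uint32_t Src, std::uint32_t Control) {
  unsigned Start = Control & 0xFF;
  unsigned Len = (Control >> 8) & 0xFF;
  if (Start >= 32 || Len == 0)
    return 0; // nothing to extract
  std::uint64_t Shifted = static_cast<std::uint64_t>(Src) >> Start;
  if (Len >= 32)
    return static_cast<std::uint32_t>(Shifted); // length covers all bits
  return static_cast<std::uint32_t>(Shifted & ((1ull << Len) - 1));
}
```

For any control value `C`, `bextr32(Src, C)` equals `bextr32(Src, C & 0xFFFF)`, so dropping the upper bits up front loses nothing.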

Krzysztof Parzyszek [Sat, 10 Oct 2020 01:17:50 +0000 (20:17 -0500)]

[Hexagon] Replace HexagonISD::VSPLAT with ISD::SPLAT_VECTOR

This removes VSPLAT and VZERO. VZERO is now SPLAT_VECTOR of (i32 0).

Included is also a testcase for the previous (target-independent)
commit.

Krzysztof Parzyszek [Sat, 10 Oct 2020 20:37:32 +0000 (15:37 -0500)]

[SDAG] Remember to set UndefElts in isSplatValue for SPLAT_VECTOR

Fangrui Song [Sat, 10 Oct 2020 21:05:48 +0000 (14:05 -0700)]

[X86] Delete redundant 'static' from namespace scope 'static constexpr'. NFC

This saves 7 lines as a result of packing more bits on one line.

Simon Pilgrim [Sat, 10 Oct 2020 19:28:50 +0000 (20:28 +0100)]

[InstCombine] getLogBase2(undef) -> 0.

Move the undef element handling into the getLogBase2 helper instead of pre-empting with replaceUndefsWith.

Alex Denisov [Sat, 10 Oct 2020 19:22:40 +0000 (21:22 +0200)]

Fix CMake configuration error when run with -Werror/-Wall

The following code doesn't compile

  uint64_t i = x.load(std::memory_order_relaxed);
  return 0;

when CMAKE_C_FLAGS is set to -Werror -Wall, thus incorrectly
breaking the CMake configuration step:

  -- Looking for __atomic_load_8 in atomic
  -- Looking for __atomic_load_8 in atomic - not found
  CMake Error at cmake/modules/CheckAtomic.cmake:79 (message):
    Host compiler appears to require libatomic for 64-bit operations, but
    cannot find it.
  Call Stack (most recent call first):
    cmake/config-ix.cmake:360 (include)
    CMakeLists.txt:671 (include)

Simon Pilgrim [Sat, 10 Oct 2020 19:09:55 +0000 (20:09 +0100)]

[InstCombine] getLogBase2 - no need to specify Type. NFCI.

In all the getLogBase2 uses, the specified Type is always the same as the constant being folded.

Simon Pilgrim [Sat, 10 Oct 2020 18:13:01 +0000 (19:13 +0100)]

Remove %tmp variables from test cases to appease update_test_checks.py

Simon Pilgrim [Sat, 10 Oct 2020 18:09:58 +0000 (19:09 +0100)]

[PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it

Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).

Krzysztof Parzyszek [Sat, 10 Oct 2020 01:16:09 +0000 (20:16 -0500)]

Define splat_vector for ISD::SPLAT_VECTOR in TargetSelectionDAG.td

Martin Storsjö [Sat, 10 Oct 2020 11:33:10 +0000 (14:33 +0300)]

[lldb] [Windows] Remove unused functions. NFC.

These became unused in 51117e3c51754f3732e.

Martin Storsjö [Sat, 10 Oct 2020 11:26:32 +0000 (14:26 +0300)]

[lldb] [Windows] Add missing 'override', silencing warnings. NFC.

Also remove superfluous 'virtual' in overridden methods.

Simon Pilgrim [Sat, 10 Oct 2020 17:18:57 +0000 (18:18 +0100)]

[PowerPC] Add ppc32 funnel shift test coverage

commit | commitdiff | tree

Simon Pilgrim [Sat, 10 Oct 2020 15:28:59 +0000 (16:28 +0100)]

[InstCombine] Add test case showing rotate intrinsic being split by SimplifyDemandedBits

Noticed while triaging regression report on D88834

commit | commitdiff | tree

Michał Górny [Sat, 10 Oct 2020 16:54:52 +0000 (18:54 +0200)]

[lldb] [Process/FreeBSDRemote] Fix double semicolon

commit | commitdiff | tree

Michał Górny [Sat, 10 Oct 2020 07:36:57 +0000 (09:36 +0200)]

[lldb] [Process/FreeBSDRemote] Kill process via PT_KILL

Use PT_KILL to kill the stopped process. This ensures that the process
termination is reported properly and fixes delay/error on killing it.

Differential Revision: https://reviews.llvm.org/D89182

commit | commitdiff | tree

Michał Górny [Sat, 10 Oct 2020 07:23:15 +0000 (09:23 +0200)]

[lldb] [Process/FreeBSD] Mark methods override in RegisterContext*

Differential Revision: https://reviews.llvm.org/D89181

commit | commitdiff | tree

Philip Reames [Sat, 10 Oct 2020 16:48:02 +0000 (09:48 -0700)]

Step down from security group

Resigning from security group as Azul representative as I have left Azul. Previously communicated via email with security group.

Differential Revision: https://reviews.llvm.org/D88933

commit | commitdiff | tree

Tim Renouf [Tue, 6 Oct 2020 17:23:59 +0000 (18:23 +0100)]

[AMDGPU] Add gfx602, gfx705, gfx805 targets

At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:

* The "Oland" and "Hainan" variants of gfx601 are now split out into
  gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
  that to avoid using the shaderZExport workaround on gfx602.

* One variant of gfx703 is now split out into gfx705. LLPC and other
  front-ends could use that to avoid using the
  shaderSpiCsRegAllocFragmentation workaround on gfx705.

* The "TongaPro" variant of gfx802 is now split out into gfx805.
  TongaPro has a faster 64-bit shift than its former friends in gfx802,
  and a subtarget feature could be set up for that to take advantage of
  it. This commit does not make that change; it just adds the target.

V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
    so fix the GPUKind order.

Differential Revision: https://reviews.llvm.org/D88916

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d

commit | commitdiff | tree

Florian Hahn [Sat, 10 Oct 2020 15:39:48 +0000 (16:39 +0100)]

[SCEV] Add test cases where the max BTC is imprecise, due to step != 1.

Add a test case where we fail to compute a tight max backedge taken
count, due to the step being != 1.

This is part of the issue with PR40961.

commit | commitdiff | tree

Florian Hahn [Sat, 10 Oct 2020 15:20:37 +0000 (16:20 +0100)]

[SCEV] Handle ULE in applyLoopGuards.

Handle ULE predicate in similar fashion to ULT predicate in
applyLoopGuards.
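
As a rough model of why the two predicates can be handled alike: a guard `x ULT bound` caps `x` at `bound - 1`, while `x ULE bound` caps it at `bound` itself. The sketch below works on plain integers rather than SCEV expressions. `Pred` and `maxUnderGuard` are hypothetical names chosen for illustration and are not part of ScalarEvolution.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical predicate kinds, mirroring unsigned "less than" and
// unsigned "less than or equal".
enum class Pred { ULT, ULE };

// Largest unsigned value x can take when the guard `x Pred bound` holds.
// Returns std::nullopt when no value satisfies the guard (x ULT 0).
std::optional<std::uint64_t> maxUnderGuard(Pred p, std::uint64_t bound) {
  switch (p) {
  case Pred::ULE:
    return bound; // x <= bound: the bound itself is reachable
  case Pred::ULT:
    if (bound == 0)
      return std::nullopt; // x < 0 is unsatisfiable for unsigned x
    return bound - 1; // x < bound: the largest value is bound - 1
  }
  return std::nullopt;
}
```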

commit | commitdiff | tree

Florian Hahn [Sat, 10 Oct 2020 11:26:01 +0000 (12:26 +0100)]

[SCEV] Add a test case with ULE loop guard.

commit | commitdiff | tree

Nikita Popov [Sat, 10 Oct 2020 14:09:15 +0000 (16:09 +0200)]

[MemCpyOpt] Add test for incorrect memset DSE (NFC)

We can't shorten the memset if there's a throwing call in between
and the destination is non-local.

commit | commitdiff | tree

David Green [Sat, 10 Oct 2020 13:50:25 +0000 (14:50 +0100)]

[ARM] Attempt to make Tail predication / RDA more resilient to empty blocks

There are a number of places in RDA where we assume the block will not
be empty. This isn't necessarily true for tail predicated loops where we
have removed instructions. This attempts to make the pass more resilient
to empty blocks, not casting pointers to machine instructions where they
would be invalid.

The test contains a case that was previously failing, but has recently been
hidden on trunk. It contains an empty block to begin with to show a
similar error.

Differential Revision: https://reviews.llvm.org/D88926

commit | commitdiff | tree

Alok Kumar Sharma [Sat, 10 Oct 2020 12:18:35 +0000 (17:48 +0530)]

[DebugInfo] Support for DWARF attribute DW_AT_rank

This patch adds support for DWARF attribute DW_AT_rank.

Summary:
Fortran assumed rank arrays have dynamic rank. DWARF attribute
DW_AT_rank is needed to support that.

Testing:
unit test cases added (hand-written)
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D89141

commit | commitdiff | tree

Benjamin Kramer [Sat, 10 Oct 2020 12:13:42 +0000 (14:13 +0200)]

[clangd] Map bits/stdint-intn.h and bits/stdint-uintn.h to cstdint.

These are private glibc headers containing parts of the implementation
of stdint.h.
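
The idea can be sketched as a lookup from a private implementation header to the public header a user should include instead. This is a minimal illustration under assumptions: `canonicalHeader` and the table layout are hypothetical and do not reflect clangd's actual data structures. Only the two header names and the `cstdint` target come from the commit.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the public header to suggest for a given
// include path, or the path itself when no mapping is known.
std::string canonicalHeader(const std::string &path) {
  static const std::unordered_map<std::string, std::string> mapping = {
      {"bits/stdint-intn.h", "<cstdint>"},
      {"bits/stdint-uintn.h", "<cstdint>"},
  };
  auto it = mapping.find(path);
  return it == mapping.end() ? path : it->second;
}
```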

commit | commitdiff | tree

David Green [Sat, 10 Oct 2020 09:15:43 +0000 (10:15 +0100)]

[AArch64][LV] Move vectorizer test to Transforms/LoopVectorize/AArch64. NFC

commit | commitdiff | tree

David Green [Sat, 10 Oct 2020 09:04:28 +0000 (10:04 +0100)]

[TblGen][Scheduling] Fix debug output. NFC

This just moves some newlines to the expected places.

commit | commitdiff | tree

Tatiana Shpeisman [Sat, 10 Oct 2020 08:45:05 +0000 (14:15 +0530)]

[mlir][scf] Fix a bug in scf::ForOp loop unroll with an epilogue

Fixes a bug in formation and simplification of an epilogue loop generated
during loop unroll of scf::ForOp (https://bugs.llvm.org/show_bug.cgi?id=46689)

Differential Revision: https://reviews.llvm.org/D87583

commit | commitdiff | tree

Nikita Popov [Fri, 9 Oct 2020 19:09:16 +0000 (21:09 +0200)]

[MemCpyOpt] Don't hoist store that's not guaranteed to execute

MemCpyOpt can hoist stores while converting load+store pairs into memcpy.
This hoisting can currently result in stores being executed that
weren't guaranteed to execute in the original program.

Differential Revision: https://reviews.llvm.org/D89154

commit | commitdiff | tree

Denis Antrushin [Wed, 7 Oct 2020 18:32:50 +0000 (01:32 +0700)]

[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty.

Currently we allow passing pointers from deopt bundle on VReg only if
they were seen in list of gc-live pointers passed on VRegs.
This means that for the case of empty gc-live bundle we spill deopt
bundle's pointers. This change allows lowering deopt pointers to VRegs
in case of empty gc-live bundle. In case of non-empty gc-live bundle,
behavior does not change.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D88999

commit | commitdiff | tree

Zi Xuan Wu [Tue, 29 Sep 2020 04:31:36 +0000 (12:31 +0800)]

[CSKY 1/n] Add basic stub or infra of csky backend

This patch introduces files that are just enough for lib/Target/CSKY to compile.
Notably a basic CSKYTargetMachine and CSKYTargetInfo.

Differential Revision: https://reviews.llvm.org/D88466

commit | commitdiff | tree

Fangrui Song [Sat, 10 Oct 2020 01:28:31 +0000 (18:28 -0700)]

[PowerPC] Fix signed overflow in decomposeMulByConstant after D88201

Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll
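
To illustrate the class of bug: computing `Imm + 1` or `Imm - 1` on a signed 64-bit value is undefined behaviour at the extremes. One common remedy is to do the arithmetic in an unsigned type, where wraparound is well defined. The sketch below is an assumption about the general technique, not the actual patch. `isPowerOf2` and `isShiftAddCandidate` are hypothetical names.

```cpp
#include <cstdint>

// True if v is a non-zero power of two.
constexpr bool isPowerOf2(std::uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical check: is imm + 1 or imm - 1 a power of two?
// The arithmetic is done in uint64_t, where overflow wraps, so the
// INT64_MAX + 1 and INT64_MIN - 1 cases are well defined.
constexpr bool isShiftAddCandidate(std::int64_t imm) {
  const std::uint64_t u = static_cast<std::uint64_t>(imm);
  return isPowerOf2(u + 1) || isPowerOf2(u - 1);
}
```

In this sketch `INT64_MAX` qualifies because adding one in unsigned arithmetic gives 2^63, while `INT64_MIN` does not qualify because neither neighbour of 2^63 is a power of two.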

commit | commitdiff | tree

Xiang1 Zhang [Sat, 10 Oct 2020 00:59:27 +0000 (08:59 +0800)]

[X86] Add CET test, NFC

commit | commitdiff | tree

Valentin Clement [Sat, 10 Oct 2020 01:02:29 +0000 (21:02 -0400)]

[mlir][openacc] Introduce acc.exit_data operation

This patch introduces the acc.exit_data operation that represents an OpenACC Exit Data directive.
Operands and attributes are derived from clauses in the spec 2.6.6.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88969

commit | commitdiff | tree

Sean Silva [Sat, 10 Oct 2020 00:31:42 +0000 (17:31 -0700)]

[mlir] Rename BufferPlacement.h to Bufferize.h

Context: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/14

Differential Revision: https://reviews.llvm.org/D89174

commit | commitdiff | tree

Walter Erquinigo [Sat, 3 Oct 2020 19:23:12 +0000 (12:23 -0700)]

[intel pt] Refactor parsing

With the feedback I was getting in different diffs, I realized that splitting the parsing logic into two classes was not easy to deal with. I do see value in doing that, but I'd rather leave that as a refactor after most of the intel-pt logic is in place. Thus, I'm merging the common parser into the intel pt one, having thus only one that is fully aware of Intel PT during parsing and object creation.

Besides, based on the feedback in https://reviews.llvm.org/D88769, I'm creating a ThreadIntelPT class that will be able to orchestrate decoding of its own trace and can handle the stop events correctly.

This leaves the TraceIntelPT class as an initialization class that glues together different components. Right now it can initialize a trace session from a json file, and in the future will be able to initialize a trace session from a live process.

Besides, I'm renaming SettingsParser to SessionParser, which I think is a better name, as the json object represents a trace session of possibly many processes.

With the current set of targets, we have the following

- Trace: main interface for dealing with trace sessions
- TraceIntelPT: plugin Trace for dealing with intel pt sessions
- TraceIntelPTSessionParser: a parser of a json trace session file that can create a corresponding TraceIntelPT instance along with Targets, ProcessTraces (to be created in https://reviews.llvm.org/D88769), and ThreadIntelPT threads.
- ProcessTrace: (to be created in https://reviews.llvm.org/D88769) can handle the correct state of the traces as the user traverses the trace. I don't think there'll be a need for an intel-pt-specific implementation of this class.
- ThreadIntelPT: a thread implementation that can handle the decoding of its own trace file, along with keeping track of the current position the user is looking at when doing reverse debugging.

Differential Revision: https://reviews.llvm.org/D88841

commit | commitdiff | tree

Aart Bik [Fri, 9 Oct 2020 23:56:11 +0000 (16:56 -0700)]

[mlir] [standard] fixed typo in comment

There is an atomic_rmw and a generic_atomic_rmw operation.
The doc of the latter incorrectly referred to the former though.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D89172
