review.tizen.org Git - platform/upstream/llvm.git/log

Eli Friedman [Tue, 4 Aug 2020 21:57:16 +0000 (14:57 -0700)]

[AArch64][SVE] Allow llvm.aarch64.sve.st2/3/4 with vectors of pointers.

This isn't necessary for ACLE, but could be useful in other situations.
And the change is simple.

Differential Revision: https://reviews.llvm.org/D85251

Eli Friedman [Fri, 31 Jul 2020 00:32:39 +0000 (17:32 -0700)]

[clang codegen] Use IR "align" attribute for static array arguments.

Without the "align" attribute, marking the argument dereferenceable is
basically useless. See also D80166.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46876 .

Differential Revision: https://reviews.llvm.org/D84992

Craig Topper [Tue, 18 Aug 2020 19:29:58 +0000 (12:29 -0700)]

[X86] Don't call SemaBuiltinConstantArg from CheckX86BuiltinTileDuplicate if Argument is Type or Value Dependent.

SemaBuiltinConstantArg has an early exit for that case that doesn't
produce an error and doesn't update the APInt. We need to detect that
case and not use the APInt value.

While there, delete the signature of CheckX86BuiltinTileArgumentsRange
that takes a single Argument index to check. There's another version
that takes an ArrayRef, and a single value is convertible to an ArrayRef.

Mehdi Amini [Tue, 18 Aug 2020 19:03:40 +0000 (19:03 +0000)]

Remove MLIREDSCInterface library which isn't used anywhere (NFC)

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D85042

Jessica Paquette [Tue, 18 Aug 2020 17:37:10 +0000 (10:37 -0700)]

[GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList

It's annoying to have to maintain multiple, nearly identical chains of if
statements which all set the same attributes.

Add a helper function, `addFlagsUsingAttrFn` which performs the attribute
setting.

Then, use wrappers for that function in `lowerCall` and `setArgFlags`.

(Note that the flag-setting code in `setArgFlags` was missing the returned
attribute. There's no selection for this yet, so no test. It's an example of
the kind of thing this lets us avoid, though.)

Differential Revision: https://reviews.llvm.org/D86159

Jessica Paquette [Tue, 18 Aug 2020 16:23:48 +0000 (09:23 -0700)]

[GlobalISel][CallLowering] Don't tail call with non-forwarded explicit sret

Similar to this commit:

faf8065a99817bcb10e6f09b558fe3e0972c35ce

Testcase is pretty much the same as

test/CodeGen/AArch64/tailcall-explicit-sret.ll

Except it uses i64 (since we don't handle the i1024 return values yet), and
doesn't have indirect tail call testcases (because we can't translate those
yet).

Differential Revision: https://reviews.llvm.org/D86148

Siva Chandra Reddy [Tue, 18 Aug 2020 18:04:58 +0000 (11:04 -0700)]

[libc][obvious] Fix link order of math tests.

Tue Ly [Tue, 28 Jul 2020 05:35:18 +0000 (01:35 -0400)]

[libc] Add ULP function to MPFRNumber class to test correctly rounded functions such as SQRT, FMA.

Add ULP function to MPFRNumber class to test correctly rounded functions.

Differential Revision: https://reviews.llvm.org/D84725

Matt Arsenault [Tue, 28 Jul 2020 02:00:50 +0000 (22:00 -0400)]

GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT

Add unit tests since AMDGPU will only trigger this for gigantic
vectors, and won't use the annoying odd sized breakdown case.

David Blaikie [Fri, 14 Aug 2020 14:56:29 +0000 (07:56 -0700)]

[WIP][DebugInfo] Lazily parse debug_loclist offsets

Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing
down memory usage of symbolizers that don't need to parse this data at
all. There's not much benefit to caching these anyway - since they are
O(1) lookup and reading once you know where the offset list starts (and
can do bounds checking with the offset list size too).
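
A minimal sketch of the O(1), bounds-checked lookup described above. This is
not the DWARFUnit API: `readListOffset` and its parameters are hypothetical,
and it assumes 4-byte (DWARF32) little-endian offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: fetch entry `index` of an offset table that starts at
// byte `tableStart` of `section` and holds `count` 4-byte entries.
std::optional<uint32_t> readListOffset(const std::vector<uint8_t> &section,
                                       size_t tableStart, uint32_t count,
                                       uint32_t index) {
  if (index >= count)
    return std::nullopt; // bounds check against the offset list size
  const size_t pos = tableStart + static_cast<size_t>(index) * 4;
  if (pos + 4 > section.size())
    return std::nullopt; // table claims more entries than the section holds
  // Assemble the little-endian value, most significant byte first.
  uint32_t value = 0;
  for (int i = 3; i >= 0; --i)
    value = (value << 8) | section[pos + i];
  return value;
}
```

Nothing is cached here: each lookup re-reads four bytes, which is the
trade-off the message argues is cheap enough.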

In general, I think it might be time to start paying down some of the
technical debt of loc/loclist/range/rnglist parsing to try to unify it a
bit more.

eg:

* Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection,
  LocSectionBase, LocTable, RngListTable, LoclistTableHeader (be nice if
  these were all wrapped up in two variables - one for loclists, one for
  rnglists)

* rnglists and loclists are handled differently (see:
  LoclistTableHeader, but no RnglistTableHeader)

* maybe all these types could be less stateful - lazily parse what they
  need to, even reparsing rather than caching because it doesn't seem
  too expensive, for instance. (though admittedly so long as it's
  constant cost/overhead per compilation that's probably adequate)

* Maybe implementing and using a DWARFDataExtractor that can be
  sub-ranged (so we could slice it up to just the single contribution) -
  though maybe that's not so useful because loc/ranges need to refer to
  it by absolute, not contribution-relative mechanisms

Differential Revision: https://reviews.llvm.org/D86110

Tim Keith [Tue, 18 Aug 2020 17:47:52 +0000 (10:47 -0700)]

[flang] Improve error messages for procedures in expressions

When a procedure name was used on the RHS of an assignment we were not
reporting the error. When one was used in an expression the error
message wasn't very good (e.g. "Operands of + must be numeric; have
INTEGER(4) and untyped").

Detect these cases in ArgumentAnalyzer and emit better messages,
depending on whether the named procedure is a function or subroutine.

Procedure names may appear as actual arguments to function and
subroutine calls so don't report errors in those cases. That is the same
case where assumed type arguments are allowed, so rename `isAssumedType_`
to `isProcedureCall_` and use that to decide if it is an error.

Differential Revision: https://reviews.llvm.org/D86107

Amara Emerson [Fri, 14 Aug 2020 09:00:07 +0000 (02:00 -0700)]

[GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x

This is restricted to single-use loads; folding them to sextloads lets us
find better addressing modes on AArch64.

This also fixes an overload of the MachineFunction::getMachineMemOperand() method
which was incorrectly using the MF alignment instead of the MMO alignment.

Differential Revision: https://reviews.llvm.org/D85966

Amara Emerson [Fri, 14 Aug 2020 08:58:00 +0000 (01:58 -0700)]

[GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c'

By detecting this sign extend pattern early, we can uncover opportunities for
more optimizations.
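
To illustrate the pattern, here is a sketch in plain C++ rather than in
GlobalISel terms. The function names are hypothetical, and the code assumes
32-bit values, `1 <= bits <= 32`, and an arithmetic right shift on signed
integers (guaranteed from C++20, implementation-defined but near-universal
before that).

```cpp
#include <cstdint>

// The shl/ashr pair: shift the low `bits` bits to the top, then shift back
// arithmetically so the sign bit is replicated.
int32_t sextViaShifts(int32_t x, unsigned bits) {
  const unsigned c = 32 - bits;
  return static_cast<int32_t>(static_cast<uint32_t>(x) << c) >> c;
}

// The same result computed directly, standing in for a sign-extend-in-register.
int32_t sextInReg(int32_t x, unsigned bits) {
  const uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  const uint32_t low = static_cast<uint32_t>(x) & mask;
  const uint32_t sign = 1u << (bits - 1);
  return static_cast<int32_t>((low ^ sign) - sign);
}
```

Both functions agree for every input under those assumptions, which is what
makes the rewrite safe.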

Differential Revision: https://reviews.llvm.org/D85965

Rob Suderman [Thu, 13 Aug 2020 21:59:58 +0000 (14:59 -0700)]

Added std.floor operation to match std.ceil

There should be an equivalent std.floor op to std.ceil. This includes
matching lowerings for SPIRV, NVVM, ROCDL, and LLVM.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D85940

Arthur Eubanks [Sat, 15 Aug 2020 00:09:23 +0000 (17:09 -0700)]

[gn build] Add support for expensive checks

Reviewed By: hans, MaskRay

Differential Revision: https://reviews.llvm.org/D86007

Simon Pilgrim [Tue, 18 Aug 2020 16:08:49 +0000 (17:08 +0100)]

[X86][AVX] lowerShuffleWithVPMOV - add non-VLX support.

We can efficiently handle non-VLX cases now that we have the getAVX512TruncNode helper.

Arthur Eubanks [Tue, 18 Aug 2020 16:49:05 +0000 (09:49 -0700)]

Revert "[TSan][libdispatch] Add interceptors for dispatch_async_and_wait()"

This reverts commit d137db80297f286f3a19eacc63d4a980646da437.

Breaks builds on older SDKs.

Mauricio Sifontes [Tue, 18 Aug 2020 16:47:06 +0000 (16:47 +0000)]

Create Optimization Pass Wrapper for MLIR Reduce

Create a reduction pass that accepts an optimization pass as argument
and only replaces the golden module in the pipeline if the output of the
optimization pass is smaller than the input and still exhibits the
interesting behavior.
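
The acceptance rule can be sketched as follows. This is not the MLIR Reduce
API: `Module` is a hypothetical stand-in (a string whose length plays the role
of module size), and `reduceWith` and its callbacks are invented for
illustration.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a module; its length stands in for its size.
using Module = std::string;

// Run `optPass` on the golden module and keep the result only if it is
// strictly smaller and still exhibits the interesting behavior.
Module reduceWith(const Module &golden,
                  const std::function<Module(const Module &)> &optPass,
                  const std::function<bool(const Module &)> &isInteresting) {
  Module candidate = optPass(golden);
  if (candidate.size() < golden.size() && isInteresting(candidate))
    return candidate;
  return golden; // otherwise the golden module is left untouched
}
```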

Add a -test-pass option to test individual passes in the MLIR Reduce
tool.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D84783

Fangrui Song [Tue, 18 Aug 2020 16:20:05 +0000 (09:20 -0700)]

[ARM] Fix build after D86087

Mott, Jeffrey T [Fri, 17 Jul 2020 16:50:08 +0000 (09:50 -0700)]

Disable use of _ExtInt with '__atomic' builtins

We're (temporarily) disabling ExtInt for the '__atomic' builtins so we can better design their behavior later. The idea is until we do an audit/design for the way atomic builtins are supposed to work with _ExtInt, we should leave them restricted so they don't limit our future options, such as by binding us to a sub-optimal implementation via ABI.

Example after this change:

    $ cat test.c

        void f(_ExtInt(64) *ptr) {
          __atomic_fetch_add(ptr, 1, 0);
        }

    $ clang -c test.c

        test.c:2:22: error: argument to atomic builtin of type '_ExtInt' is not supported
          __atomic_fetch_add(ptr, 1, 0);
                             ^
        1 error generated.

Differential Revision: https://reviews.llvm.org/D84049

David Green [Tue, 18 Aug 2020 16:15:45 +0000 (17:15 +0100)]

[ARM] Allow tail predication of VLDn

VLD2/4 instructions cannot be predicated, so we cannot tail predicate
them from autovec. From intrinsics though, they should be valid as they
will just end up loading extra values into off vector lanes, not
affecting the on lanes. The same is true for loads in general where so
long as we are not using the other vector lanes, an unpredicated load
can be converted to a predicated one.

This marks VLD2 and VLD4 instructions as validForTailPredication and
allows any unpredicated load in tail predication loop, which seems to be
valid given the other checks we have.

Differential Revision: https://reviews.llvm.org/D86022

Jan Kratochvil [Tue, 18 Aug 2020 16:09:55 +0000 (18:09 +0200)]

[lldb] [testsuite] Add split-file for check-lldb dependencies

D85968 started to use `split-file`, and while the buildbots run fine, when
doing `make check-lldb` by hand I get:

.../llvm-monorepo-clangassert/tools/lldb/test/SymbolFile/DWARF/Output/DW_AT_declaration-with-children.s.script: line 2: split-file: command not found
failed:
lldb-shell :: SymbolFile/DWARF/DW_AT_declaration-with-children.s

Differential Revision: https://reviews.llvm.org/D86144

Sam Tebbs [Mon, 17 Aug 2020 15:03:55 +0000 (16:03 +0100)]

[ARM] Use mov operand if the mov cannot be moved while tail predicating

There are some cases where the instruction that sets up the iteration
count for a tail predicated loop cannot be moved before the dlstp,
stopping tail predication entirely. This patch checks if the mov operand
can be used and if so, uses that instead.

Differential Revision: https://reviews.llvm.org/D86087

George Mitenkov [Tue, 18 Aug 2020 15:42:23 +0000 (18:42 +0300)]

[MLIR][SPIRVToLLVM] Additional conversions for spirv-runner

This patch adds more op/type conversion support
necessary for `spirv-runner`:
- EntryPoint/ExecutionMode: currently removed since we assume
having only one kernel function in the kernel module.
- StorageBuffer storage class is now supported. We are not
concerned with multithreading so this is fine for now.
- Type conversion enhanced, now regular offsets and strides
for structs and arrays are supported (based on
`VulkanLayoutUtils`).
- Support of `spv.AccessChain` that is modelled with GEP op
in LLVM dialect.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D86109

Dokyung Song [Wed, 5 Aug 2020 23:12:19 +0000 (23:12 +0000)]

[libFuzzer] Fix arguments of InsertPartOf/CopyPartOf calls in CrossOver mutator.

The CrossOver mutator is meant to cross over two given buffers (referred to as
the first/second buffer henceforth). Previously InsertPartOf/CopyPartOf calls
used in the CrossOver mutator incorrectly inserted/copied part of the second
buffer into a "scratch buffer" (MutateInPlaceHere of the size
CurrentMaxMutationLen), rather than the first buffer. This is not intended
behavior, because the scratch buffer does not always (i) contain the content of
the first buffer, and (ii) have the same size as the first buffer;
CurrentMaxMutationLen is typically a lot larger than the size of the first
buffer. This patch fixes the issue by using the first buffer instead of the
scratch buffer in InsertPartOf/CopyPartOf calls.
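
A simplified sketch of the intended behavior, not libFuzzer's actual code or
signatures: the hypothetical `insertPartOf` below builds its result from the
first buffer, so the output has the first buffer's content and size plus the
inserted bytes, rather than the size of a larger scratch buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Insert from[fromPos, fromPos + len) into a copy of `first` at `toPos`.
// Returns `first` unchanged if a range is invalid or the result would
// exceed `maxSize`.
Bytes insertPartOf(const Bytes &from, size_t fromPos, size_t len,
                   const Bytes &first, size_t toPos, size_t maxSize) {
  if (fromPos > from.size() || len > from.size() - fromPos ||
      toPos > first.size() || first.size() + len > maxSize)
    return first;
  Bytes out = first; // start from the first buffer, not a scratch buffer
  out.insert(out.begin() + toPos, from.begin() + fromPos,
             from.begin() + fromPos + len);
  return out;
}
```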

A FuzzBench experiment was run to make sure that this change does not
inadvertently degrade the performance. The performance is largely the same; more
details can be found at:
https://storage.googleapis.com/fuzzer-test-suite-public/fixcrossover-report/index.html

This patch also adds two new tests, namely "cross_over_insert" and
"cross_over_copy", which specifically target InsertPartOf and CopyPartOf,
respectively.

- cross_over_insert.test checks if the fuzzer can use InsertPartOf to trigger
the crash.

- cross_over_copy.test checks if the fuzzer can use CopyPartOf to trigger the
crash.

These newly added tests were designed to pass with the current patch, but not
without it (with 790878f291fa5dc58a1c560cb6cc76fd1bfd1c5a these tests do not
pass). To achieve this, -max_len was intentionally given a high value. Without
this patch, InsertPartOf/CopyPartOf will generate larger inputs, possibly with
unpredictable data in it, thereby failing to trigger the crash.

The test pass condition for these new tests is narrowed down by (i) limiting
mutation depth to 1 (i.e., a single CrossOver mutation should be able to trigger
the crash) and (ii) checking whether the mutation sequence of "CrossOver-" leads
to the crash.

Also note that these newly added tests and an existing test (cross_over.test)
all use "-reduce_inputs=0" flags to prevent reducing inputs; it's easier to
force the fuzzer to keep original input string this way than tweaking
cov-instrumented basic blocks in the source code of the fuzzer executable.

Differential Revision: https://reviews.llvm.org/D85554

Fangrui Song [Tue, 18 Aug 2020 16:07:38 +0000 (09:07 -0700)]

[llvm-dwarfdump][test] Add a --statistics test for a DW_AT_artificial variable

There is an untested but useful case: `this` (even if not written) is counted as a
source variable.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D86044

Jamie Schmeiser [Tue, 18 Aug 2020 16:05:20 +0000 (16:05 +0000)]

[NFC] Add raw_ostream parameter to printIR routines

This is a non-functional-change to generalize the printIR routines so that
the output can be saved and manipulated rather than being directly output
to dbgs(). This is a prerequisite change for many upcoming changes that
allow new ways of examining changes made to the IR in the new pass manager.

Reviewed By: aeubanks (Arthur Eubanks)

Differential Revision: https://reviews.llvm.org/D85999

Fangrui Song [Thu, 13 Aug 2020 16:00:26 +0000 (09:00 -0700)]

[ELF] Assign file offsets of non-SHF_ALLOC after SHF_ALLOC and set sh_addr=0 to non-SHF_ALLOC

* GNU ld places non-SHF_ALLOC sections after SHF_ALLOC sections. This has the
  advantage that the file offsets of a non-SHF_ALLOC cannot be contained in
  a PT_LOAD. This patch matches the behavior.
* For non-SHF_ALLOC non-orphan sections, GNU ld may assign non-zero sh_addr and
  treat them similar to SHT_NOBITS (not advance location counter). This
  is an alternative approach to what we have done in D85100.
  By placing non-SHF_ALLOC sections at the end, we can drop special
  cases in createSection and findOrphanPos added by D85100.

  Different from GNU ld, we set sh_addr to 0 for non-SHF_ALLOC sections. 0
  arguably is better because non-SHF_ALLOC sections don't appear in the memory
  image.

ELF spec says:

> sh_addr - If the section will appear in the memory image of a process, this
> member gives the address at which the section's first byte should
> reside. Otherwise, the member contains 0.

D85100 appeared to take a detour. If we take a combined view on D85100 and this
patch, the overall complexity slightly increases (one more 3-line loop) and
compatibility with GNU ld improves.

The behavior we don't want to match is the special treatment of .symtab
.shstrtab .strtab: they can be matched in LLD but not in GNU ld.

Reviewed By: jhenderson, psmith

Differential Revision: https://reviews.llvm.org/D85867

Jessica Paquette [Mon, 17 Aug 2020 23:42:28 +0000 (16:42 -0700)]

[GlobalISel][CallLowering] Look through call parameters for flags

We weren't looking through the parameters on calls at all.

E.g., say you had

```
declare i32 @zext(i32 zeroext %x)

...
%y = call i32 @zext(i32 %something)
...

```

At the point of the call, we wouldn't know that the %something should have the
zeroext attribute.

This sets flags in about the same way as
TargetLoweringBase::ArgListEntry::setAttributes.

Differential Revision: https://reviews.llvm.org/D86125

jasonliu [Tue, 18 Aug 2020 14:18:53 +0000 (14:18 +0000)]

[XCOFF] emit .rename for .lcomm when necessary

Summary:

This is a follow up for D82481. For .lcomm directive, although it's
not necessary to have .rename emitted, it's still desirable to do
it so that we do not see the internal 'Rename..' name printed in the
symbol table. And we could have consistent naming between TC entry
and .lcomm. And also have consistent naming between IR and final
object file.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D86075

MaheshRavishankar [Tue, 18 Aug 2020 15:16:25 +0000 (08:16 -0700)]

[mlir][Linalg] Canonicalize tensor_reshape(splat-constant) -> splat-constant.

When the operand to the linalg.tensor_reshape op is a splat constant,
the result can be replaced with a splat constant of the same value but
different type.
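
As a rough illustration outside MLIR, with a hypothetical `SplatConstant`
struct standing in for a splat constant attribute: reshaping keeps the single
splat value and only swaps the shape, provided the element counts match.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of a splat constant: one value repeated over a shape.
struct SplatConstant {
  double value;
  std::vector<int64_t> shape;
};

int64_t numElements(const std::vector<int64_t> &shape) {
  int64_t n = 1;
  for (int64_t dim : shape)
    n *= dim;
  return n;
}

// Fold reshape(splat) into a splat of the same value with the result shape.
SplatConstant foldReshapeOfSplat(const SplatConstant &operand,
                                 std::vector<int64_t> resultShape) {
  // A reshape never changes the number of elements.
  assert(numElements(operand.shape) == numElements(resultShape));
  return SplatConstant{operand.value, std::move(resultShape)};
}
```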

Differential Revision: https://reviews.llvm.org/D86117

Simon Pilgrim [Tue, 18 Aug 2020 15:08:15 +0000 (16:08 +0100)]

[X86] Regenerate load-slice test labels. NFCI.

Pulled out a superfluous diff from D66004

David Green [Tue, 18 Aug 2020 15:02:21 +0000 (16:02 +0100)]

[LV] Predicated reduction tests. NFC

Nathan James [Tue, 18 Aug 2020 14:52:37 +0000 (15:52 +0100)]

[NFC][clang-tidy] Put abseil headers in alphabetical order

Simon Pilgrim [Tue, 18 Aug 2020 14:46:02 +0000 (15:46 +0100)]

[X86][AVX] lowerShuffleWithPERMV - pad 128/256-bit shuffles on non-VLX targets

Allow non-VLX targets to use 512-bits VPERMV/VPERMV3 for 128/256-bit shuffles.

TBH I'm not sure these targets actually exist in the wild, but we're testing for them and it's good test coverage for shuffle lowering/combines across different subvector widths.

Simon Pilgrim [Tue, 18 Aug 2020 14:24:28 +0000 (15:24 +0100)]

[X86][AVX] lowerShuffleWithVTRUNC - extend to support v16i16/v32i8 binary shuffles.

This requires a few additional SrcVT vs DstVT padding cases in getAVX512TruncNode.

Sanjay Patel [Tue, 18 Aug 2020 14:14:07 +0000 (10:14 -0400)]

[SLP] remove instcombine dependency from regression test; NFC

InstCombine doesn't do that much here - sinks some instructions
and improves alignments - but that should not be part of the
SLP pass unit testing.

Simon Pilgrim [Tue, 18 Aug 2020 13:52:23 +0000 (14:52 +0100)]

[X86][AVX] lowerShuffleWithVTRUNC - pull out TRUNCATE/VTRUNC creation into helper code. NFCI.

Prep work toward adding v16i16/v32i8 support for lowerShuffleWithVTRUNC and improving lowerShuffleWithVPMOV.

Matt Arsenault [Sun, 26 Jul 2020 19:43:48 +0000 (15:43 -0400)]

AMDGPU/GlobalISel: Select llvm.amdgcn.groupstaticsize

Previously, it would successfully select and assert if not HSA or PAL
when expanding the pseudoinstruction. We don't need the
pseudoinstruction anymore since we know the total size after
legalization.

Matt Arsenault [Sat, 25 Jul 2020 17:21:31 +0000 (13:21 -0400)]

AMDGPU/GlobalISel: Fix selection of s1/s16 G_[F]CONSTANT

The code to determine the value size was overcomplicated and only
correct in the case where the result register already had a register
class assigned. We can always take the size directly from the
register's type.

Georgii Rymar [Wed, 12 Aug 2020 13:54:49 +0000 (16:54 +0300)]

[llvm-readobj/elf] - Refine testing of broken Android's packed relocation sections.

This uses modern `split-file` tool to merge 5 `packed-relocs-error*.s` tests to a
new `packed-relocs-errors.s` and adds testing for GNU style.

Differential revision: https://reviews.llvm.org/D85835

Sanjay Patel [Tue, 18 Aug 2020 13:19:03 +0000 (09:19 -0400)]

[InstCombine] fold fabs of select with negated operand

This is the FP example shown in:
https://bugs.llvm.org/PR39474
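
A sketch of the identity the title suggests, assuming the select chooses
between a value and its negation (the exact operand order in the patch is not
stated here). `before` and `after` are hypothetical names.

```cpp
#include <cmath>

// fabs applied to a select between -x and x ...
double before(bool cond, double x) { return std::fabs(cond ? -x : x); }

// ... gives the same result as fabs applied to x, whatever the condition.
double after(bool /*cond*/, double x) { return std::fabs(x); }
```

fabs only clears the sign bit, so the negation on either arm of the select
cannot affect the result.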

Sanjay Patel [Tue, 18 Aug 2020 12:24:37 +0000 (08:24 -0400)]

[InstCombine] add tests for fneg+fabs; NFC

Georgii Rymar [Tue, 18 Aug 2020 12:52:09 +0000 (15:52 +0300)]

[yaml2obj] - Don't crash when `FileHeader` declares an empty `Flags` key in specific situations.

We currently call the `llvm_unreachable` for the following YAML:

```
--- !ELF
FileHeader:
  Class:   ELFCLASS32
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_NONE
  Flags:   [ ]
```

This happens because the `Flags` key is present, though `EM_NONE` is a
machine type that has no known `EF_*` values, so we call `llvm_unreachable` by mistake.

Differential revision: https://reviews.llvm.org/D86138

Alexey Bataev [Wed, 5 Aug 2020 15:48:35 +0000 (11:48 -0400)]

[OPENMP]Do not capture base pointer by reference if it is used as a base for array-like reduction.

If the declaration is used in the reduction clause, it is captured by
reference by default. But if the declaration is a pointer and it is a
base for array-like reduction, this declaration can be captured by
value, since the pointee is reduced but not the original declaration.

Differential Revision: https://reviews.llvm.org/D85321

Eduardo Caldas [Fri, 14 Aug 2020 09:53:45 +0000 (09:53 +0000)]

[SyntaxTree] Use Annotations based tests for expressions

In this process we also create some other tests, in order to not lose
coverage when focusing on the annotated code

Differential Revision: https://reviews.llvm.org/D85962

Eduardo Caldas [Fri, 14 Aug 2020 09:43:20 +0000 (09:43 +0000)]

[SyntaxTree] Implement annotation-based test infrastructure

We add the method `SyntaxTreeTest::treeDumpEqualOnAnnotations`, which
allows us to compare the treeDump of only annotated code. This will remove a
lot of noise from our `BuildTreeTest` tests and make them shorter and easier to
read.

Ronak Chauhan [Tue, 18 Aug 2020 12:42:41 +0000 (18:12 +0530)]

[ELF] Hide target specific methods as private

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86136

Simon Pilgrim [Tue, 18 Aug 2020 12:38:10 +0000 (13:38 +0100)]

[X86][AVX] lowerShuffleWithVTRUNC - avoid unnecessary division in element counts. NFCI.

(256 / SrcEltBits) == ((2 * EltSizeInBits * NumElts) / (EltSizeInBits * Scale)) == (2 * (NumElts / Scale)) == NumSrcElts
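
The chain can be spot-checked for concrete types. The sketch below assumes,
as the equalities imply, that the shuffled type is 128 bits wide (so
2 * EltSizeInBits * NumElts == 256), that SrcEltBits == EltSizeInBits * Scale,
and that NumElts is a multiple of Scale. The helper names are hypothetical.

```cpp
// Left-hand side of the chain: 256 / SrcEltBits.
constexpr unsigned numSrcEltsByDivision(unsigned EltSizeInBits, unsigned Scale) {
  return 256 / (EltSizeInBits * Scale);
}

// Right-hand side of the chain: 2 * (NumElts / Scale).
constexpr unsigned numSrcEltsDirect(unsigned NumElts, unsigned Scale) {
  return 2 * (NumElts / Scale);
}

// v16i8 with Scale 2 and 4, and v8i16 with Scale 2.
static_assert(numSrcEltsByDivision(8, 2) == numSrcEltsDirect(16, 2), "v16i8, Scale 2");
static_assert(numSrcEltsByDivision(8, 4) == numSrcEltsDirect(16, 4), "v16i8, Scale 4");
static_assert(numSrcEltsByDivision(16, 2) == numSrcEltsDirect(8, 2), "v8i16, Scale 2");
```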

Nico Weber [Tue, 18 Aug 2020 12:40:36 +0000 (08:40 -0400)]

Revert "PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units"

This reverts commit be3ef93bf58aa5546c7baadfb21d43b75fbb4e24.
Test fails on macOS and Windows, e.g. http://45.33.8.238/win/22216/step_11.txt

Ronak Chauhan [Fri, 24 Jul 2020 09:51:46 +0000 (15:21 +0530)]

[llvm-objdump][AMDGPU] Detect CPU string

AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly.
However, the AMDGPU target CPU is stored in e_flags in the ELF object.

This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags.
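
A minimal sketch of the idea, not the code added to llvm-objdump. The function
name is hypothetical, and the constants are believed to match
`EF_AMDGPU_MACH` and two of the `EF_AMDGPU_MACH_AMDGCN_*` values in
llvm/BinaryFormat/ELF.h; check them against the header before relying on them.

```cpp
#include <cstdint>
#include <string>

// Assumed values, see the note above.
constexpr uint32_t kAmdgpuMachMask = 0x0ff; // EF_AMDGPU_MACH
constexpr uint32_t kMachGfx900 = 0x02c;     // EF_AMDGPU_MACH_AMDGCN_GFX900
constexpr uint32_t kMachGfx906 = 0x02f;     // EF_AMDGPU_MACH_AMDGCN_GFX906

// Map the machine field of e_flags to a CPU string usable as -mcpu.
// Returns an empty string when the value is not recognized.
std::string cpuFromEFlags(uint32_t eFlags) {
  switch (eFlags & kAmdgpuMachMask) {
  case kMachGfx900:
    return "gfx900";
  case kMachGfx906:
    return "gfx906";
  default:
    return "";
  }
}
```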

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D84519

Luboš Luňák [Wed, 5 Aug 2020 11:00:37 +0000 (13:00 +0200)]

[lldb][gui] use left/right in the source view to scroll

I intentionally decided not to reset the column automatically
anywhere, because I don't know where and if at all that should happen.
There should be always an indication of being scrolled (too much)
to the right, so I'll leave this to whoever has an opinion.

Differential Revision: https://reviews.llvm.org/D85290

Alex Zinenko [Tue, 18 Aug 2020 08:26:30 +0000 (10:26 +0200)]

[mlir] expose standard types to C API

Provide C API for MLIR standard types. Since standard types live under lib/IR
in core MLIR, place the C APIs in the IR library as well (standard ops will go
into a separate library). This also defines a placeholder for affine maps that
are necessary to construct a memref, but are not yet exposed to the C API.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D86094

Paul Walker [Thu, 13 Aug 2020 14:52:00 +0000 (15:52 +0100)]

[SVE] Fix shift-by-imm patterns used by asr, lsl & lsr intrinsics.

Right shift patterns will no longer incorrectly accept a shift
amount of zero. At the same time they will allow larger shift
amounts that are now saturated to their upper bound.
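
One reading of that rule, as a sketch rather than the actual TableGen
patterns: a right-shift immediate of zero is rejected, and anything above the
element width is clamped to it. `rightShiftImmediate` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Returns the immediate to encode for a right shift on `eltBits`-wide
// elements, or nullopt when the immediate form does not apply.
std::optional<unsigned> rightShiftImmediate(uint64_t amount, unsigned eltBits) {
  if (amount == 0)
    return std::nullopt; // zero is not a valid right-shift immediate
  // Saturate oversized amounts to the upper bound (the element width).
  return static_cast<unsigned>(std::min<uint64_t>(amount, eltBits));
}
```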

Patterns have been extended to enable immediate forms for shifts
taking an arbitrary predicate.

This patch also unifies the code path for immediate parsing so the
i64 based shifts are no longer treated specially.

Differential Revision: https://reviews.llvm.org/D86084

Sam Parker [Tue, 18 Aug 2020 10:21:03 +0000 (11:21 +0100)]

[NFC] Add some more Arm tests for IndVarSimplify

Copy some generic functions and apply minsize for arm.

Paul Walker [Mon, 17 Aug 2020 11:46:55 +0000 (12:46 +0100)]

[SVE] Lower fixed length vector ISD::SPLAT_VECTOR operations.

Also strengthens the CHECK lines for scalable vector splat tests.

Differential Revision: https://reviews.llvm.org/D86070

Simon Pilgrim [Tue, 18 Aug 2020 10:11:58 +0000 (11:11 +0100)]

[X86][AVX] Lower v16i8/v8i16 binary shuffles using VTRUNC/TRUNCATE

This patch adds lowerShuffleWithVTRUNC to handle basic binary shuffles that can be lowered either as a pure ISD::TRUNCATE or a X86ISD::VTRUNC (with undef/zero values in the remaining upper elements).

We concat the binary sources together into a single 256-bit source vector. To avoid regressions we perform this after we've tried to lower with PACKS/PACKUS which typically does a cleaner job than a concat.

For non-AVX512VL cases we have to canonicalize VTRUNC cases to use 512-bit source vectors (inserting undefs/zeros in the upper elements as necessary), truncate and then (possibly) extract the 128-bit result.

This should address the last regressions in D66004

Differential Revision: https://reviews.llvm.org/D86093

sameeran joshi [Tue, 18 Aug 2020 09:35:51 +0000 (15:05 +0530)]

[Flang] Move markdown files(.MD) from documentation/ to docs/

Summary:
Other LLVM sub-projects use docs/ folder for documentation files.
Follow LLVM project policy.
Modify `documentation/` references in sources to `docs/`.
This patch doesn't modify files to reStructuredText(.rst) file format.

Reviewed By: DavidTruby, sscalpone

Differential Revision: https://reviews.llvm.org/D85884

QingShan Zhang [Tue, 18 Aug 2020 09:40:37 +0000 (09:40 +0000)]

[Test][NFC] Add a new test to verify if scheduler can cluster two ld/st
even with different preds

Rainer Orth [Tue, 18 Aug 2020 09:32:51 +0000 (11:32 +0200)]

[compiler-rt][test] XFAIL two tests on 32-bit sparc

Two tests `FAIL` on 32-bit sparc:

  Profile-sparc :: Posix/instrprof-gcov-parallel.test
  UBSan-Standalone-sparc :: TestCases/Float/cast-overflow.cpp

The failure mode is similar:

  Undefined                       first referenced
   symbol                             in file
  __atomic_store_4                    /var/tmp/instrprof-gcov-parallel-6afe8d.o
  __atomic_load_4                     /var/tmp/instrprof-gcov-parallel-6afe8d.o

  Undefined                       first referenced
   symbol                             in file
  __atomic_load_1                     /var/tmp/cast-overflow-72a808.o

This is a known bug: `clang` doesn't inline atomics on 32-bit sparc, unlike
`gcc`.

The patch therefore `XFAIL`s the tests.

Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D85346

LLVM GN Syncbot [Tue, 18 Aug 2020 09:10:43 +0000 (09:10 +0000)]

[gn build] Port 00d7b7d014f

Shinji Okumura [Tue, 18 Aug 2020 09:04:47 +0000 (18:04 +0900)]

[Attributor] Deduce noundef attribute

This patch introduces a new abstract attribute `AANoUndef`, which corresponds to the `noundef` IR attribute, and deduces it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85184

Georgii Rymar [Mon, 17 Aug 2020 13:38:56 +0000 (16:38 +0300)]

[llvm-readobj/elf] - Refine the malformed-pt-dynamic.test.

This is split out from D85519, but significantly reworked.

Changes:
1) This test was changed to stop using python.
2) Use NoHeaders: true instead of `llvm-objcopy --strip-sections`.
3) Test llvm-readelf too (not just llvm-readobj).
4) Simplify the YAML used a bit (e.g. remove PT_LOAD).
5) Test 2 different cases: objects with section header table and without.

Differential revision: https://reviews.llvm.org/D86073

Georgii Rymar [Mon, 17 Aug 2020 14:58:14 +0000 (17:58 +0300)]

[llvm-readobj/elf] - Merge mips-got-overlapped.test to mips-got.test and refine testing.

The `mips-got-overlapped.test` was introduced in D16968 and its intention is
to check that when there is an empty section at the same address as `.got`,
then we are able to locate `.got` and dump it.

The issue is that this test does not test llvm-readelf and uses a precompiled
object. This patch starts using YAML instead and merges
mips-got-overlapped.test to mips-got.test.

Differential revision: https://reviews.llvm.org/D86080

Alex Zinenko [Mon, 17 Aug 2020 18:25:28 +0000 (20:25 +0200)]

[mlir] Fix printing of unranked memrefs in non-default memory space

The type printer was ignoring the memory space on unranked memrefs.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D86096

Jakub Lichman [Tue, 18 Aug 2020 07:11:52 +0000 (07:11 +0000)]

[mlir] VectorToSCF bug in setAllocAtFunctionEntry fixed.

The function makes too strong an assumption regarding the parent FuncOp,
which gets broken when the FuncOp is first lowered to an LLVM function.
In this fix we generalize the assumption to an allocation scope and
add an assertion to produce a user-friendly message in case our assumption
is broken.

Differential Revision: https://reviews.llvm.org/D86086

Nathan Ridge [Thu, 12 Mar 2020 23:27:18 +0000 (19:27 -0400)]

[clangd] Target member of dependent base made visible via a using-decl

Fixes https://github.com/clangd/clangd/issues/307

Differential Revision: https://reviews.llvm.org/D86047

commit | commitdiff | tree

David Blaikie [Tue, 18 Aug 2020 04:27:19 +0000 (21:27 -0700)]

PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units

Theory was that we should never reach a non-type unit (eg: type in an
anonymous namespace) when we're already in the invalid "encountered an
address-use, so stop emitting types for now, until we throw out the
whole type tree to restart emitting in non-type unit" state. But that's
not the case (prior commit cleaned up one reason this wasn't exposed
sooner - but also makes it easier to test/demonstrate this issue)

commit | commitdiff | tree

David Blaikie [Tue, 18 Aug 2020 01:17:38 +0000 (18:17 -0700)]

DebugInfo: Emit class template parameters first, before members

This reads more like what you'd expect the DWARF to look like (from the
lexical order of C++ - template parameters come before members, etc),
and also happens to make it easier to tickle (& thus test) a bug related
to type units and Split DWARF I'm about to fix.

commit | commitdiff | tree

Johannes Doerfert [Sun, 2 Aug 2020 05:44:08 +0000 (00:44 -0500)]

[Attributor] Bail early if AAMemoryLocation cannot derive anything

Before this change we looked through all memory operations in a function
even if the first was an unknown call that could do anything. This did
cost a lot of time but there is little use to do so. We also avoid
creating AAs for things that we would have looked at in case no other AA
will; that is the reason for the test changes.

Running only the attributor-cgscc pass on an IR version of
`llvm-test-suite/MultiSource/Applications/SPASS/clause.c` reduced the
time we spend in `AAMemoryLocation::update` from 4% total to
0.9% (disclaimer: no accurate measurements).
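
The early-bail idea described above can be illustrated with a toy model. This is only a sketch and not the Attributor's code: the `UNKNOWN_CALL` marker, the string encoding of memory operations, and the function name `summarize_accesses` are all hypothetical.

```python
# Hypothetical marker for a call that may touch any memory.
UNKNOWN_CALL = "unknown-call"


def summarize_accesses(memory_ops):
    """Return (locations, ops_visited).

    `locations` is None when nothing can be derived, which is the
    pessimistic result in this toy model.
    """
    locations = set()
    for visited, op in enumerate(memory_ops, start=1):
        if op == UNKNOWN_CALL:
            # Nothing useful can be derived any more, so stop scanning
            # instead of walking the remaining operations.
            return None, visited
        locations.add(op)
    return locations, len(memory_ops)
```

When the first operation is the unknown call, the scan visits one operation instead of the whole list, which is the saving the message refers to.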

commit | commitdiff | tree

Johannes Doerfert [Sun, 2 Aug 2020 05:31:30 +0000 (00:31 -0500)]

[Attributor] We (should) keep the CG updated so we can mark it as preserved

commit | commitdiff | tree

Johannes Doerfert [Sat, 1 Aug 2020 06:49:28 +0000 (01:49 -0500)]

[Attributor][NFC] Directly return proper type to avoid casts

commit | commitdiff | tree

Johannes Doerfert [Tue, 18 Aug 2020 00:54:42 +0000 (19:54 -0500)]

[Attributor][FIX] Handle function pointers properly in AANonNull

Before, we tried to create a dominator tree for a declaration when we
wanted to determine if the function pointer is `nonnull`. We now avoid
looking at global values if `Value::getPointerDereferenceableBytes` has not
already determined `nonnull`.

commit | commitdiff | tree

Nathan Ridge [Mon, 27 Jul 2020 02:45:24 +0000 (22:45 -0400)]

[clang] Fix visitation of ConceptSpecializationExpr in constrained-parameter

Summary: RecursiveASTVisitor needs to traverse TypeConstraint::ImmediatelyDeclaredConstraint

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84136

commit | commitdiff | tree

Nathan Ridge [Sun, 16 Aug 2020 22:22:04 +0000 (18:22 -0400)]

[clangd] Index refs to main-file symbols as well

Summary: This will be needed to support call hierarchy

Reviewers: kadircet

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83536

commit | commitdiff | tree

Harmen Stoppels [Tue, 18 Aug 2020 02:51:11 +0000 (19:51 -0700)]

Use find_library for ncurses

Currently it is hard to avoid having LLVM link to the system install of
ncurses, since it uses check_library_exists to find e.g. libtinfo and
not find_library or find_package.

With this change the ncurses lib is found with find_library, which also
considers CMAKE_PREFIX_PATH. This solves an issue for the spack package
manager, where we want to use the zlib installed by spack, and spack
provides the CMAKE_PREFIX_PATH for it.

This is a similar change to https://reviews.llvm.org/D79219, which just
landed in master.

Differential revision: https://reviews.llvm.org/D85820

commit | commitdiff | tree

Amy Kwan [Wed, 12 Aug 2020 14:23:05 +0000 (09:23 -0500)]

[PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang

This patch implements the vec_extractm function prototypes in altivec.h in
order to utilize the vector extract with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82675

commit | commitdiff | tree

Arthur Eubanks [Tue, 18 Aug 2020 00:48:04 +0000 (17:48 -0700)]

[NewPM] Pin various tests under Other/ to legacy PM

These all are legacy PM-specific or have a corresponding NPM RUN line.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D86124

commit | commitdiff | tree

Hamilton Tobon Mosquera [Tue, 18 Aug 2020 01:18:21 +0000 (20:18 -0500)]

[OpenMPOpt][HideMemTransfersLatency] Split __tgt_target_data_begin_mapper into its "issue" and "wait" counterparts.

WIP that tries to hide the latency of runtime calls that involve host to
device memory transfers by splitting them into their "issue" and "wait"
versions. The "issue" is moved upwards as much as possible. The "wait" is
moved downwards as much as possible. The "issue" issues the memory transfer
asynchronously, returning a handle. The "wait" waits on the returned
handle for the memory transfer to finish. The movement itself is not yet implemented.
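
The "issue"/"wait" split can be pictured with a small analogy. The sketch below uses Python futures and is not the OpenMP runtime interface: `transfer`, `transfer_issue`, `transfer_wait` and `run` are hypothetical names, and the "device copy" is just a list copy.

```python
from concurrent.futures import ThreadPoolExecutor

# Single worker standing in for an asynchronous transfer engine.
_pool = ThreadPoolExecutor(max_workers=1)


def transfer(host_data):
    """Stand-in for a blocking host-to-device copy."""
    return list(host_data)


def transfer_issue(host_data):
    """Start the copy asynchronously and return a handle."""
    return _pool.submit(transfer, host_data)


def transfer_wait(handle):
    """Block on the handle until the copy has finished."""
    return handle.result()


def run(host_data):
    handle = transfer_issue(host_data)    # "issue" moved as early as possible
    independent = sum(range(1000))        # work that does not need the data
    device_data = transfer_wait(handle)   # "wait" moved as late as possible
    return independent, device_data
```

The further apart the two calls end up, the more independent work can overlap with the transfer.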

commit | commitdiff | tree

Leonard Chan [Tue, 18 Aug 2020 01:11:56 +0000 (18:11 -0700)]

Revert "[libc++] Use CMake interface targets to setup benchmark flags"

This reverts commit da0592e4c8df95efad4e42d63646f8a5336a7edc.

Reverting because this is incompatible with cmake 3.13.5, with the
minimum supported version being 3.13.4.

See https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8871967816877544224.

commit | commitdiff | tree

Hongtao Yu [Tue, 18 Aug 2020 00:41:49 +0000 (17:41 -0700)]

[llvm-objdump] Attempt to fix html doc generation issue.

https://reviews.llvm.org/D84191 caused an HTML doc build issue with the changes in `llvm-objdump.rst`. It looks like a blank line is missing from the `code-block` directives.

Test Plan:

Differential Revision: https://reviews.llvm.org/D86123

commit | commitdiff | tree

Aditya Kumar [Sun, 16 Aug 2020 03:35:10 +0000 (20:35 -0700)]

NFC: [GVNHoist] Outline functions from the class

Reviewers: sebpop
Reviewed By: hiraditya

Differential Revision: https://reviews.llvm.org/D86032

commit | commitdiff | tree

Craig Topper [Mon, 17 Aug 2020 23:31:51 +0000 (16:31 -0700)]

[X86] When manually creating intrinsic nodes in X86ISelLowering, make sure we use getTargetConstant and pointer type for the intrinsic ID.

Doesn't really matter in practice but that's how the nodes are
normally created by SelectionDAGBuilder. So we should match.

Found by temporarily hacking type checks into isel table.

commit | commitdiff | tree

Craig Topper [Mon, 17 Aug 2020 22:52:13 +0000 (15:52 -0700)]

[X86] Rename INTR_TYPE_4OP to INTR_TYPE_4OP_IMM8 and truncate immediates to MVT::i8

This makes sure VPTERNLOG is generated with MVT::i8 immediate
as its SDNode declaration in X86InstrFragmentsSIMD.td declares.

commit | commitdiff | tree

Craig Topper [Mon, 17 Aug 2020 22:59:32 +0000 (15:59 -0700)]

[X86] Truncate immediate to i8 for INTR_TYPE_3OP_IMM8

This is used for DBPSADBW which has an i32 immediate for its
intrinsic and an i8 immediate in tablegen isel patterns.

commit | commitdiff | tree

Craig Topper [Mon, 17 Aug 2020 20:46:41 +0000 (13:46 -0700)]

[X86] Make PreprocessISelDAG create X86ISD::VRNDSCALE nodes with i32 constants instead of i8.

This is the type declared in X86InstrFragmentsSIMD.td. ISel pattern
matching doesn't check so it doesn't matter in practice. Maybe for
SelectionDAG CSE it would matter.

commit | commitdiff | tree

Mehdi Amini [Fri, 14 Aug 2020 09:38:37 +0000 (09:38 +0000)]

Fix method name to start with lower case to match style guide (NFC)

commit | commitdiff | tree

Mircea Trofin [Mon, 10 Aug 2020 16:36:18 +0000 (09:36 -0700)]

[MLInliner] In development mode, obtain the output specs from a file

Different training algorithms may produce models that, besides the main
policy output (i.e. inline/don't inline), produce additional outputs
that are necessary for the next training stage. To facilitate this, in
development mode, we require the training policy infrastructure to produce
a description of the outputs that are interesting to it, in the form of
a JSON file. We special-case the first entry in the JSON file as the
inlining decision - we care about its value, so we can guide inlining
during training - but treat the rest as opaque data that we just copy
over to the training log.
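
A rough sketch of the special-casing described above, under loud assumptions: the message does not give the file schema, so the JSON list layout, the `"name"` key, and the helper names `split_output_specs` and `log_step` are hypothetical and only illustrate "first entry is the decision, the rest is copied through".

```python
import json


def split_output_specs(json_text):
    """Split a JSON list of output descriptions into (decision, extras)."""
    specs = json.loads(json_text)
    if not isinstance(specs, list) or not specs:
        raise ValueError("expected a non-empty JSON list of output specs")
    # The first entry is treated as the inlining decision; the rest is opaque.
    return specs[0], specs[1:]


def log_step(specs_text, model_outputs):
    """Return (inline_decision, opaque_values_for_the_training_log)."""
    decision_spec, extra_specs = split_output_specs(specs_text)
    # Only the decision's value is interpreted.
    decision = bool(model_outputs[decision_spec["name"]])
    # Everything else is copied over without being inspected.
    opaque = {s["name"]: model_outputs[s["name"]] for s in extra_specs}
    return decision, opaque
```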

Differential Revision: https://reviews.llvm.org/D85674

commit | commitdiff | tree

Hongtao Yu [Mon, 20 Jul 2020 16:45:32 +0000 (09:45 -0700)]

[llvm-objdump] Symbolize binary addresses for low-noisy asm diff.

When diffing the disassembly of two binaries, I see a lot of noise from mismatched jump target addresses and global data references, which causes unnecessary diffs in every function and makes the diff impractical. This change symbolizes the raw binary addresses to minimize that noise.
In this change, a local branch target is modeled as a label and the branch target operand is simply printed as that label. Local labels are collected by a separate pre-decoding pass. A global data memory operand is printed as a global symbol instead of the raw data address. Unfortunately, because of the way the disassembler is set up, and to keep the change less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable for checking code quality, since on most targets an instruction can have at most one memory operand.

So far only the X86 disassemblers are supported.

Test Plan:

llvm-objdump -d  --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:

<_start>:
                push rax
                mov dword ptr [rsp + 4], 0
                mov dword ptr [rsp], 0
                mov eax, dword ptr [rsp]
                cmp eax, dword ptr [rip + 4112]  # 202182 <g>
                jge 0x20117e <_start+0x25>
                call 0x201158 <foo>
                inc dword ptr [rsp]
                jmp 0x201169 <_start+0x10>
                xor eax, eax
                pop rcx
                ret
```

llvm-objdump -d  **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:

<_start>:
                push rax
                mov dword ptr [rsp + 4], 0
                mov dword ptr [rsp], 0
<L1>:
                mov eax, dword ptr [rsp]
                cmp eax, dword ptr  <g>
                jge <L0>
                call <foo>
                inc dword ptr [rsp]
                jmp <L1>
<L0>:
                xor eax, eax
                pop rcx
                ret
```

Note that without this work, jump instructions like `jge 0x20117e <_start+0x25>` are printed with a real target address and an offset from the leading symbol. When a change in the optimizer adds or deletes an instruction, the address and offset may shift for targets placed after that instruction. This is a problem when diffing the disassembly produced by two optimizers, because such branch target address changes show up as false positives. With `--symbolize-operands`, a label is printed for a branch target instead, which reduces the false positives. Similarly, the disassembly of PC-relative global variable references is also prone to instruction insertion/deletion.
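
The two-pass scheme (collect local branch targets first, then print labels) can be sketched as a toy text transformation. This is not llvm-objdump's implementation: the `(address, text)` input format, the regular expression, and the function name `symbolize` are assumptions made for illustration, and only operands of `j*` mnemonics are handled.

```python
import re

# Toy model: a branch is a "j..." mnemonic followed by a hex target address.
BRANCH_RE = re.compile(r"^(j\w+)\s+0x([0-9a-f]+)\b.*$")


def symbolize(instructions):
    """Replace local branch target addresses with labels, in two passes."""
    addresses = {addr for addr, _ in instructions}

    # Pass 1: collect branch targets that land inside this function.
    labels = {}
    for _, text in instructions:
        m = BRANCH_RE.match(text)
        if m:
            target = int(m.group(2), 16)
            if target in addresses and target not in labels:
                labels[target] = f"L{len(labels)}"

    # Pass 2: print, emitting a label line before each collected target.
    out = []
    for addr, text in instructions:
        if addr in labels:
            out.append(f"<{labels[addr]}>:")
        m = BRANCH_RE.match(text)
        if m and int(m.group(2), 16) in labels:
            text = f"{m.group(1)} <{labels[int(m.group(2), 16)]}>"
        out.append(f"    {text}")
    return out
```

Because the output no longer contains absolute addresses, inserting an instruction earlier in the function changes only the lines that actually differ.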

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D84191

commit | commitdiff | tree

Johannes Doerfert [Mon, 17 Aug 2020 23:17:17 +0000 (18:17 -0500)]

[Attributor] Properly use the call site argument position

commit | commitdiff | tree

Johannes Doerfert [Mon, 17 Aug 2020 23:16:08 +0000 (18:16 -0500)]

[Attributor][FIX] Do not request an AANonNull for non-pointer types

commit | commitdiff | tree

Hamilton Tobon Mosquera [Mon, 17 Aug 2020 23:16:16 +0000 (18:16 -0500)]

[OpenMPOpt][HideMemTransfersLatency] Update regression test with new runtime calls.

commit | commitdiff | tree

Kazushi (Jam) Marukawa [Mon, 17 Aug 2020 14:02:24 +0000 (23:02 +0900)]

[VE] Modify ISelLowering following clang-tidy

Modify case style of function names following clang-tidy.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D86076

commit | commitdiff | tree

Roman Lebedev [Mon, 17 Aug 2020 21:57:35 +0000 (00:57 +0300)]

[NFC][clang] Adjust test/CodeGenCXX/nrvo.cpp after 03127f795b8244c1039c18d4391374707a3dc75e

commit | commitdiff | tree

Roman Lebedev [Mon, 17 Aug 2020 20:15:30 +0000 (23:15 +0300)]

[InstCombine] PHI-aware aggregate reconstruction: correctly detect "use" basic block

While the original implementation added in D85787 / ae7f08812e0995481eb345cecc5dd4529829ba44
is not incorrect, it is known to be suboptimal.

In particular, while it is not incorrect to use the basic block
in which the original `insertvalue` instruction is located
as the merge point, that is not necessarily optimal,
as `@test6` shows.

We should look at all the AggElts, and, if they are all defined
in the same basic block, then that is the basic block we should use.

On the RawSpeed library, this catches +4% (+50) more cases.
On the vanilla LLVM test-suite, this catches +12% (+92) more cases.
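
The block-selection rule stated above can be written down as a tiny sketch. It is not the InstCombine code: blocks are modeled as plain strings, `pick_use_block` is a hypothetical name, and returning `None` when the elements come from different blocks is an assumption of this sketch, since the message does not say what happens in that case.

```python
def pick_use_block(agg_elt_blocks):
    """Return the single block defining all aggregate elements, else None.

    `agg_elt_blocks` holds, for each aggregate element, the name of the
    basic block in which that element is defined.
    """
    blocks = set(agg_elt_blocks)
    if len(blocks) == 1:
        # All elements are defined in the same block: use it as the merge point.
        return blocks.pop()
    # Mixed or empty input: this sketch gives up (assumption, see lead-in).
    return None
```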

commit | commitdiff | tree

Roman Lebedev [Mon, 17 Aug 2020 19:53:23 +0000 (22:53 +0300)]

[NFC][InstCombine] PHI-aware aggregate reconstruction: don't capture UseBB in lambdas, take it as argument

In a following patch, UseBB will be detected later,
so capturing it is potentially error-prone (capture by ref vs by val).

Also, parametrized UseBB will likely be needed
for multiple levels of PHI indirections later on anyways.

commit | commitdiff | tree

Roman Lebedev [Mon, 17 Aug 2020 19:59:48 +0000 (22:59 +0300)]

[NFC][InstCombine] PHI-aware aggregate reconstruction: insert PHI node manually

This is NFC at the moment, because right now we always insert the PHI
into the same basic block in which the original `insertvalue` instruction
is, but that will change.

Also, fixes addition of the suffix to the value names.

commit | commitdiff | tree

Roman Lebedev [Mon, 17 Aug 2020 12:55:58 +0000 (15:55 +0300)]

[NFC][InstCombine] Add more tests for aggregate reconstruction w/ PHI handling

Even without handling several layers of PHI nodes,
we can handle more cases, as `@test6` shows.

commit | commitdiff | tree

Adrian Prantl [Mon, 17 Aug 2020 19:52:23 +0000 (12:52 -0700)]

Convert to early exit (NFC)

commit | commitdiff | tree

Adrian Prantl [Mon, 17 Aug 2020 19:50:24 +0000 (12:50 -0700)]

Simplify error reporting (NFC)
