review.tizen.org Git - platform/upstream/llvm.git/log


Simon Pilgrim [Thu, 13 May 2021 17:23:23 +0000 (18:23 +0100)]

[X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.
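
As a rough illustration of why the parameter type matters, the sketch below uses a hypothetical `TrackedLoc` type whose copy constructor bumps a counter, standing in for the bookkeeping a `TrackingMDNodeRef` copy performs. `TrackedLoc`, `CopyCount`, `lineByValue`, and `lineByConstRef` are made-up names for this example and are not LLVM's classes or functions.

```cpp
// Hypothetical stand-in for a type whose copy is not free, such as a
// DebugLoc wrapping a TrackingMDNodeRef. Not LLVM's actual class.
static int CopyCount = 0;

struct TrackedLoc {
  int Line;
  explicit TrackedLoc(int L) : Line(L) {}
  // Each copy does extra bookkeeping; here it just bumps a counter.
  TrackedLoc(const TrackedLoc &Other) : Line(Other.Line) { ++CopyCount; }
};

// Taking the argument by value copies the caller's object on every call.
int lineByValue(TrackedLoc DL) { return DL.Line; }

// Taking it by const reference reads the caller's object in place.
int lineByConstRef(const TrackedLoc &DL) { return DL.Line; }
```

Calling `lineByValue` with an existing `TrackedLoc` increments `CopyCount`, while `lineByConstRef` leaves it unchanged.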

Max Kazantsev [Fri, 14 May 2021 09:57:05 +0000 (16:57 +0700)]

[Test] Add test on missing opportunity in Loop Deletion

We can break the backedge in some cases when we can evaluate some of the
values and conditions on the 1st iteration.
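
A loop of the kind described might look like the sketch below. This is an illustrative shape only, written in C++ rather than IR, and is not the test added by the commit; `runsExactlyOnce` is a made-up name.

```cpp
// Illustrative loop shape only; this is not the test added by the commit.
// On entry I == 0, so on the first iteration Sub == 4 and the latch
// condition (4 < 4) is false: the backedge is never taken.
int runsExactlyOnce() {
  int Sum = 0;
  int I = 0;
  do {
    int Sub = 4 - I; // first iteration: 4 - 0 == 4
    Sum += Sub;
    I = Sub;
  } while (I < 4);   // first iteration: 4 < 4 is false, so the loop exits
  return Sum;
}
```

Evaluating `Sub` and the latch condition with the entry value of `I` is enough to see that the body runs once.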


Tim Northover [Wed, 21 Apr 2021 11:12:28 +0000 (12:12 +0100)]

AArch64: support i128 cmpxchg in GlobalISel.

There are three essentially different cases to handle:

  * -O1, no LSE. The IR is expanded to ldxp/stxp and we need patterns to select
    them.
  * -O0, no LSE. We get G_ATOMIC_CMPXCHG, and need to produce CMP_SWAP_N
    pseudos. The registers are all 64-bit so this is easy.
  * LSE. We get G_ATOMIC_CMPXCHG and need to produce a CASP instruction with
    XSeqPair registers.

The last case is by far the hardest, and adds 128-bit GPR support as a
byproduct.


Sander de Smalen [Fri, 14 May 2021 08:00:47 +0000 (09:00 +0100)]

NFCI: Remove VF argument from isScalarWithPredication

As discussed in D102437, the VF argument to isScalarWithPredication
seems redundant, so this is intended to be a non-functional change. It
seems wrong to query the widening decision at this point. Removing the
operand and code to get the widening decision causes no unit/regression
tests to fail. I've also found no issues running the LLVM test-suite.

This subsequently removes the VF argument from isPredicatedInst as well,
since it is no longer required.


Jay Foad [Fri, 20 Dec 2019 15:13:57 +0000 (15:13 +0000)]

[AMDGPU] getMemOperandsWithOffset: add vaddr operand for stack access BUF instructions

A consequence is that checkInstOffsetsDoNotOverlap can now distinguish
sp+offset from fp+offset, so it knows that it shouldn't try to work out
whether the accesses overlap just by comparing the offsets. For example
in these two instructions:

MIR:
BUFFER_STORE_DWORD_OFFSET %0:vgpr_32(s32), $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 4 into stack + 4, addrspace 5)
%4:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN %stack.0.alloca, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 4 from `i8 addrspace(5)* undef`, addrspace 5)

ISA:
buffer_store_dword v0, off, s[0:3], s32 offset:4
buffer_load_dword v0, off, s[0:3], s34
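
The idea of refusing to compare offsets taken from different bases can be sketched with a toy model. This is not the code of checkInstOffsetsDoNotOverlap; `BaseReg`, `Overlap`, `StackAccess`, and `compareAccesses` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Toy model of a stack access: which register the address is based on,
// plus a byte offset and access width. All names here are hypothetical.
enum class BaseReg { StackPtr, FramePtr };
enum class Overlap { Disjoint, MayOverlap, Unknown };

struct StackAccess {
  BaseReg Base;
  int64_t Offset;
  int64_t Width;
};

Overlap compareAccesses(const StackAccess &A, const StackAccess &B) {
  // Offsets from different bases are not comparable, so give up early.
  if (A.Base != B.Base)
    return Overlap::Unknown;
  const StackAccess &Lo = A.Offset <= B.Offset ? A : B;
  const StackAccess &Hi = A.Offset <= B.Offset ? B : A;
  // Same base: the ranges are disjoint if the lower one ends first.
  return Lo.Offset + Lo.Width <= Hi.Offset ? Overlap::Disjoint
                                           : Overlap::MayOverlap;
}
```

In this model, a 4-byte access at sp+4 and a 4-byte access at fp+0 yield `Overlap::Unknown` instead of being treated as disjoint.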

Differential Revision: https://reviews.llvm.org/D73957


Alexandros Lamprineas [Fri, 14 May 2021 08:37:44 +0000 (09:37 +0100)]

[llvm-mc][AArch64] HINT instruction disassembled as BTI

The Arm Architecture Reference Manual says that the SystemHintOp_BTI
opcode is preferred when CRm:op2 matches 0100:xx0, but llvm-mc
currently accepts 0100:xxx, which isn't right.
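
The difference between the two bit patterns can be shown with plain masks. The sketch assumes the 7-bit HINT immediate is laid out as CRm:op2; the function names are hypothetical and this is not the llvm-mc decoder.

```cpp
#include <cstdint>

// HINT carries a 7-bit immediate laid out as CRm:op2 (CRm in bits 6..3,
// op2 in bits 2..0). Function names below are hypothetical.

// Strict match for 0100:xx0: CRm must be 0b0100 and op2 bit 0 must be 0.
constexpr bool isBTIEncoding(uint8_t Imm) {
  return (Imm & 0x79) == 0x20;
}

// Looser match for 0100:xxx, the behaviour the commit message describes
// as wrong: it ignores op2 bit 0.
constexpr bool isLooseBTIEncoding(uint8_t Imm) {
  return (Imm & 0x78) == 0x20;
}
```

Under the strict mask, immediates 32, 34, 36 and 38 match, while 33, 35, 37 and 39 do not.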

Differential Revision: https://reviews.llvm.org/D102415


Martin Storsjö [Thu, 11 Mar 2021 09:03:46 +0000 (11:03 +0200)]

[libcxx] [test] Change the generic_string_alloc test to test conversions to all char types

On Windows, the native path char type is wchar_t - therefore, this test
didn't actually do the conversion that the test was supposed to exercise.

The charset conversions on Windows do cause extra allocations outside of
the provided allocator though, so that bit of the test has to be waived
now that the test actually does something. (Other tests have similar
TEST_NOT_WIN32() for allocation checks for charset conversions.)

Also fix a typo, and amend the path.native.obs/string_alloc test to
test char8_t, too.

Differential Revision: https://reviews.llvm.org/D102360


Roman Lebedev [Fri, 14 May 2021 08:27:04 +0000 (11:27 +0300)]

[X86] AMD Zen 3: same-reg AVX YMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.


Roman Lebedev [Fri, 14 May 2021 08:26:12 +0000 (11:26 +0300)]

[X86] AMD Zen 3: same-reg AVX XMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.


Roman Lebedev [Fri, 14 May 2021 08:24:22 +0000 (11:24 +0300)]

[X86] AMD Zen 3: same-reg SSE XMM XORPD is a 1-cycle(!) dep-breaking zero-idiom

Same as with its float friend, unlike their AVX versions.
As confirmed by exegesis, and ref docs.


Roman Lebedev [Fri, 14 May 2021 08:20:15 +0000 (11:20 +0300)]

[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPD tests


Roman Lebedev [Fri, 14 May 2021 08:20:10 +0000 (11:20 +0300)]

[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPD tests


Roman Lebedev [Fri, 14 May 2021 08:19:59 +0000 (11:19 +0300)]

[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM XORPD tests


Roman Lebedev [Fri, 14 May 2021 08:11:45 +0000 (11:11 +0300)]

[X86] AMD Zen 3: same-reg AVX YMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis, and ref docs.


Roman Lebedev [Fri, 14 May 2021 08:10:22 +0000 (11:10 +0300)]

[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPS tests


Roman Lebedev [Fri, 14 May 2021 08:06:39 +0000 (11:06 +0300)]

[X86] AMD Zen 3: same-reg AVX XMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom

Unlike its legacy SSE XMM XORPS version, which measures as being 1-cycle,
this one is certainly a zero-cycle instruction, in addition to both of them
being dependency breaking.

As confirmed by exegesis measurements, and ref docs.


Roman Lebedev [Fri, 14 May 2021 08:02:03 +0000 (11:02 +0300)]

[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPS tests


Pooja Yadav [Fri, 14 May 2021 08:39:03 +0000 (14:09 +0530)]

[docs] Added llvm/cmake section

Added information about the cmake inside llvm.

Reviewed By: xgupta, jroelofs

Differential Revision: https://reviews.llvm.org/D101925


David Stuttard [Fri, 7 May 2021 10:43:29 +0000 (11:43 +0100)]

[AMDGPU] Fix codegen of image intrinsics for g16 and a16

For gfx10 gradient (g16) and address (a16) can be independent. Previous
implementation assumed that a16 implied g16.

There are some other changes that fix the verification (as well as asm/disasm)
that are required for the included test to pass - the XFAIL will be removed in
those changes.

This also includes required fixes for GlobalISel

Differential Revision: https://reviews.llvm.org/D102066

Change-Id: I7d171cc90994de05f41669b66a6d0ffa2ed05d09


David Stuttard [Fri, 30 Apr 2021 10:37:41 +0000 (11:37 +0100)]

[AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling

A16 support for image instructions assembly/disassembly (gfx10) was missing

Also refactor MIMG op addr size calcs to common function

The same operation was being done in 3 places.

One test is now marked XFAIL until a related codegen patch is in place

Differential Revision: https://reviews.llvm.org/D102231

Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb


David Spickett [Tue, 11 May 2021 14:33:53 +0000 (14:33 +0000)]

[llvm][AsmPrinter] Restore source location to register clobber warning

Since 5de2d189e6ad466a1f0616195e8c524a4eb3cbc0 this particular warning
hasn't had the location of the source file containing the inline
assembly.

Fix this by reporting via LLVMContext, which means that we no longer
have the "instantiated into assembly here" lines, but they were going to
point to the start of the inline asm string anyway.

This message is already tested via IR in llvm. However we won't have
the required location info there so I've added a C file test in clang
to cover it.
(though strictly, this is testing llvm code)

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D102244


Alexey Bader [Fri, 14 May 2021 05:18:49 +0000 (08:18 +0300)]

New tag for ittapi - fix an error related to cross-compiling ITTAPI in LLVM with mingw

Fix was implemented in the ittapi repo to solve an error about cross-compiling ITTAPI in LLVM with mingw.
The problem occurred in the cross-compilation environment for Julia's dependencies.
The corresponding issue item in ittapi repo: https://github.com/intel/ittapi/issues/19
A new tag was created in ittapi repo for that fix.

This patch contains changes to update the ittapi tag in LLVM.

Reviewed By: bader

Differential Revision: https://reviews.llvm.org/D102471


dfukalov [Fri, 9 Apr 2021 10:37:13 +0000 (13:37 +0300)]

[GVN] Clobber partially aliased loads.

Use offsets stored in `AliasResult` implemented in D98718.

Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543


David Green [Fri, 14 May 2021 08:16:51 +0000 (09:16 +0100)]

[DSE] Move isOverwrite into DSEState. NFC

This moves the isOverwrite function into the DSEState so that it can
share the analyses and members from the state.

A few extra loop tests were also added to test stores in and around
multi block loops for D100464.


Lang Hames [Fri, 14 May 2021 05:47:35 +0000 (22:47 -0700)]

[ORC] Add JITLink dependence for ObjectLinkingLayerTest.

This aims to fix the failure at
https://lab.llvm.org/buildbot/#/builders/61/builds/9590.


LLVM GN Syncbot [Fri, 14 May 2021 04:56:03 +0000 (04:56 +0000)]

[gn build] Port 0fda4c4745b8


Lang Hames [Fri, 14 May 2021 04:35:34 +0000 (21:35 -0700)]

[ORC] Add support for adding LinkGraphs directly to ObjectLinkingLayer.

This is separate from (but builds on) the support added in ec6b71df70a for
emitting LinkGraphs in the context of an active materialization. This commit
makes LinkGraphs a first-class data structure with features equivalent to
object files within ObjectLinkingLayer.


Lang Hames [Fri, 14 May 2021 01:59:26 +0000 (18:59 -0700)]

[JITLink] Fix missing 'static' keyword in unit test.


Fangrui Song [Fri, 14 May 2021 04:26:31 +0000 (21:26 -0700)]

[sanitizer] Simplify __sanitizer::BufferedStackTrace::UnwindImpl implementations

Intended to be NFC. D102046 relies on the refactoring for stack boundaries.


Carl Ritson [Fri, 14 May 2021 03:29:54 +0000 (12:29 +0900)]

[AMDGPU] Do not clause NSA instructions

To ensure correct behaviour NSA instructions should not be claused.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D102211


Reid Kleckner [Fri, 14 May 2021 03:26:50 +0000 (20:26 -0700)]

Use enum comparison instead of generated switch/case, NFC

Clang's coverage data for auto-generated switch cases is really, really
large. Before this change, when I enable code coverage, SemaDeclAttr.obj
is 4.0GB. Naturally, this fails to link.

Replacing the RISCV builtin id check with a comparison reduces object
file size from 4.0GB to 330MB. Replacing the AArch64 SVE range check
reduces the size again down to 17MB, which is reasonable.

I think the RISCV switch is larger in coverage data because it uses more
levels of macro expansion, while the SVE intrinsics only use one. In any
case, please try to avoid switches with 1000+ cases; they usually don't
optimize well.
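
The general technique can be sketched as follows, assuming the IDs of interest are contiguous in the enum. The enum and function names are hypothetical and much smaller than the generated builtin tables the commit deals with.

```cpp
// Hypothetical builtin IDs; the real tables are generated and far larger.
enum BuiltinID : unsigned {
  BI_scalar_add,
  FirstVectorBuiltin,
  BI_vec_add = FirstVectorBuiltin,
  BI_vec_sub,
  BI_vec_mul,
  LastVectorBuiltin = BI_vec_mul,
  BI_scalar_mul
};

// Switch form: one case label per builtin, so it grows with the table.
bool isVectorBuiltinSwitch(unsigned ID) {
  switch (ID) {
  case BI_vec_add:
  case BI_vec_sub:
  case BI_vec_mul:
    return true;
  default:
    return false;
  }
}

// Range form: two comparisons, independent of how many builtins exist.
bool isVectorBuiltinRange(unsigned ID) {
  return ID >= FirstVectorBuiltin && ID <= LastVectorBuiltin;
}
```

Both functions agree on every ID, but the range form only stays correct while the vector builtins remain contiguous between the two marker enumerators.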


Reid Kleckner [Fri, 14 May 2021 02:32:49 +0000 (19:32 -0700)]

[COFF] Remove a truncation assertion from setRVA

LLD already produces a nice error message when sections exceed 4GB, and
this setRVA assertion causes LLD to crash instead of diagnosing the
error properly.

No test because we don't want slow tests that create 4GB files.


Matthias Springer [Fri, 14 May 2021 01:45:13 +0000 (10:45 +0900)]

[mlir] VectorToSCF cleanup

Group functions/structs in namespaces for better code readability.

Depends On D102123

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102124


Rahul Joshi [Fri, 14 May 2021 01:33:02 +0000 (18:33 -0700)]

[MLIR] Fix build failures due to unused variables in non-debug builds.

Differential Revision: https://reviews.llvm.org/D102458


Lang Hames [Fri, 14 May 2021 01:22:33 +0000 (18:22 -0700)]

[ORC] Remove the OrcExecutionTest class. It is no longer used.


Lang Hames [Fri, 14 May 2021 01:11:33 +0000 (18:11 -0700)]

[ORC] Remove unused RTDyldObjectLinkingLayerExecutionTest class from unit test.


Lang Hames [Fri, 14 May 2021 00:32:36 +0000 (17:32 -0700)]

[ORC] Remove some stale unit test utils.

This code was used to test ORCv1, which has been removed. It is not useful for
testing ORCv2.


Matthias Springer [Fri, 14 May 2021 00:56:28 +0000 (09:56 +0900)]

[mlir] VectorToSCF target rank is a pass option

Make "target rank" a pass option of VectorToSCF.

Depends On D102101

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102123


Chen Zheng [Tue, 11 May 2021 01:31:27 +0000 (21:31 -0400)]

[Debug-Info] change Tag type to dwarf::Tag for createAndAddDIE; NFC

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102207


Peter Collingbourne [Thu, 13 May 2021 20:55:16 +0000 (13:55 -0700)]

scudo: Fix MTE error reporting for zero-sized allocations.

With zero-sized allocations we don't actually end up storing the
address tag to the memory tag space, so store it in the first byte of
the chunk instead so that we can find it later in getInlineErrorInfo().

Differential Revision: https://reviews.llvm.org/D102442


Peter Collingbourne [Wed, 12 May 2021 23:49:19 +0000 (16:49 -0700)]

scudo: Check for UAF in ring buffer before OOB in more distant blocks.

It's more likely that we have a UAF than an OOB in blocks that are
more than 1 block away from the fault address, so the UAF should
appear first in the error report.

Differential Revision: https://reviews.llvm.org/D102379


Arthur Eubanks [Fri, 14 May 2021 01:12:55 +0000 (18:12 -0700)]

[test] Fix new-pm-lto-defaults.ll to work on all platforms

https://lab.llvm.org/buildbot/#/builders/119/builds/3775/steps/8/logs/FAIL__LLVM__new-pm-lto-defaults_ll

Followup to D102345.


H.J. Lu [Fri, 14 May 2021 01:07:11 +0000 (18:07 -0700)]

[sanitizer] Use size_t on g_tls_size to fix build on x32

On x32 size_t == unsigned int, not unsigned long int:

../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp: In function ‘void __sanitizer::InitTlsSize()’:
../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55: error: invalid conversion from ‘__sanitizer::uptr*’ {aka ‘long unsigned int*’} to ‘size_t*’ {aka ‘unsigned int*’} [-fpermissive]
  209 |   ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, &tls_align);
      |                                                       ^~~~~~~~~~~
      |                                                       |
      |                                                       __sanitizer::uptr* {aka long unsigned int*}

Fix this by using size_t for g_tls_size. This fixes:

https://bugs.llvm.org/show_bug.cgi?id=50332
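
The underlying language rule can be checked in isolation: pointers to distinct integer types do not convert implicitly, even when the two types have the same size. In the sketch below, `getStaticTlsInfo`, `TlsSize`, and `initTlsSize` are hypothetical stand-ins (with made-up values), not the sanitizer runtime's code.

```cpp
#include <cstddef>
#include <type_traits>

// Pointers to distinct integer types never convert implicitly, even when
// the two types happen to have the same size on the target.
static_assert(!std::is_convertible<unsigned long *, unsigned int *>::value,
              "unsigned long* does not convert to unsigned int*");

// Hypothetical stand-in for a libc hook that writes through size_t*.
// The values written are made up for the example.
void getStaticTlsInfo(std::size_t *Size, std::size_t *Align) {
  *Size = 64;
  *Align = 16;
}

// Declaring the destination as size_t matches the callee on every ABI.
static std::size_t TlsSize = 0;

std::size_t initTlsSize() {
  std::size_t TlsAlign = 0;
  getStaticTlsInfo(&TlsSize, &TlsAlign);
  return TlsSize;
}
```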

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D102446


Chen Zheng [Mon, 10 May 2021 02:57:36 +0000 (22:57 -0400)]

[Debug-Info] make DIE attributes generation under strict DWARF control

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101024


Amara Emerson [Thu, 13 May 2021 18:42:17 +0000 (11:42 -0700)]

[AArch64][GlobalISel] Fix a crash during unsuccessful G_CTPOP <2 x s64> legalization.

The legalization rule for scalar-same-as doesn't handle vectors. Until we
implement custom legalization for this, at least fall back properly.


Valentin Clement [Fri, 14 May 2021 00:27:37 +0000 (20:27 -0400)]

[mlir][openacc][NFC] add anonymous namespace around LegalizeDataOpForLLVMTranslation class

Add missing anonymous namespace around LegalizeDataOpForLLVMTranslation class.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D102380


Reid Kleckner [Thu, 13 May 2021 23:03:01 +0000 (16:03 -0700)]

[gn] Don't pass -fprofile-instr-generate to linker on Windows

Avoids a warning from the linker. The user still has to put the resource
directory on the linker search path, and I can't find a clean way to do
that automatically in gn.


Matt Arsenault [Thu, 13 May 2021 23:00:13 +0000 (19:00 -0400)]

AMDGPU/GlobalISel: Don't hardcode stack alignment in assert message


Matt Arsenault [Tue, 12 Jan 2021 22:56:57 +0000 (17:56 -0500)]

AMDGPU/GlobalISel: Implement tail calls

Or at least the sibling call cases which the DAG already handles.


Nicolas Vasilache [Thu, 13 May 2021 20:57:57 +0000 (20:57 +0000)]

[mlir][Linalg] Add support for vector.transfer ops to comprehensive bufferization (2/n).

Differential revision: https://reviews.llvm.org/D102395


Nicolas Vasilache [Thu, 13 May 2021 20:42:24 +0000 (20:42 +0000)]

[mlir][Linalg] Add ComprehensiveBufferize for functions (step 1/n)

This is the first step towards upstreaming comprehensive bufferization following the
discourse post: https://llvm.discourse.group/t/rfc-linalg-on-tensors-update-and-comprehensive-bufferization-rfc/3373/6.

This first commit introduces a basic pass for bufferizing within function boundaries,
assuming that the inplaceable function boundaries have been marked as such.

Differential revision: https://reviews.llvm.org/D101693


Weston Carvalho [Mon, 10 May 2021 17:50:55 +0000 (10:50 -0700)]

Widen `name` stencil to support `TypeLoc` nodes.

Differential Revision: https://reviews.llvm.org/D102185


Arthur Eubanks [Sun, 2 May 2021 02:04:42 +0000 (19:04 -0700)]

[IR] Introduce the opaque pointer type

The opaque pointer type is essentially just a normal pointer type with a
null pointee type.

This also adds support for the opaque pointer type to the bitcode
reader/writer, as well as to textual IR.

To avoid confusion with existing pointer types, we disallow creating a
pointer to an opaque pointer.

Opaque pointer types should not be widely used at this point since many
parts of LLVM still do not support them. The next steps are to add some
very simple use cases of opaque pointers to make sure they work, then
start pretending that all pointers are opaque pointers and see what
breaks.

https://lists.llvm.org/pipermail/llvm-dev/2021-May/150359.html

Reviewed By: dblaikie, dexonsmith, pcc

Differential Revision: https://reviews.llvm.org/D101704


Michael Kruse [Thu, 13 May 2021 22:12:23 +0000 (17:12 -0500)]

[Clang][OpenMP] Allow unified_shared_memory for Pascal-generation GPUs.

The Pascal architecture supports the page migration engine required for
unified_shared_memory, as indicated by NVIDIA:
* https://developer.nvidia.com/blog/unified-memory-cuda-beginners/
* https://developer.nvidia.com/blog/beyond-gpu-memory-limits-unified-memory-pascal/
* https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements

The limitation was introduced in D54493 which justified the cut-off by
the requirement for unified addressing. However, Unified Virtual
Addressing (UVA) is already available with sm20 (Fermi, Kepler,
Maxwell):
* https://docs.nvidia.com/cuda/gpudirect-rdma/index.html#basics-of-uva-cuda-memory-management

Unified shared memory might even be possible with these, but with
migration of entire allocations on kernel startup.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D101595


cynecx [Thu, 13 May 2021 20:45:53 +0000 (21:45 +0100)]

Don't run MachineVerifier on sjlj-unwind-inline-asm test because of known issue (PR39439)

Fixes buildbot failure (https://lab.llvm.org/buildbot/#/builders/16/builds/10825).

Reviewed By: Amanieu

Differential Revision: https://reviews.llvm.org/D102433


Arthur Eubanks [Wed, 12 May 2021 00:02:12 +0000 (17:02 -0700)]

[docs] Add page on opaque pointer types

Reviewed By: dblaikie, dexonsmith

Differential Revision: https://reviews.llvm.org/D102292


Lang Hames [Thu, 13 May 2021 21:33:33 +0000 (14:33 -0700)]

[clang-repl] Temporarily disable the execute.cpp test on ppc64.

This test is failing on some builders (see [1]) with the following error:

error: Added modules have incompatible data layouts:
e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs
E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)

The JIT layout is correct, but some IR module added to the JIT is using a
little-endian layout instead.

This commit disables the test on ppc64 until we can investigate further and
fix the bug.

[1] https://lab.llvm.org/staging/#/builders/126/builds/371


Nikita Popov [Thu, 13 May 2021 21:03:46 +0000 (23:03 +0200)]

[CaptureTracking] Use isIdentifiedFunctionLocal() (NFC)

These conditions together exactly match isIdentifiedFunctionLocal(),
and this is also what we logically want to check for here.


Nikita Popov [Thu, 13 May 2021 20:59:19 +0000 (22:59 +0200)]

[AA] Use isIdentifiedFunctionLocal() (NFC)

This condition is equivalent to isIdentifiedFunctionLocal(),
and this is also what we semantically want to check here.


Roman Lebedev [Thu, 13 May 2021 20:55:51 +0000 (23:55 +0300)]

Revert "[X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again"

As reported in post-commit feedback, this has issues with e.g. <16 x i1>:
https://llvm.godbolt.org/z/jxPvdGEW4

This reverts commit c02476f3158f2908ef0a6f628210b5380bd33695.


Roman Lebedev [Thu, 13 May 2021 20:55:28 +0000 (23:55 +0300)]

Revert "[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()"

Depends on a commit that is about to be reverted.

This reverts commit 69ed93a4355123a45c1d7216aea7cd53d07a361b.


Roman Lebedev [Thu, 13 May 2021 20:54:11 +0000 (23:54 +0300)]

[X86] AMD Zen 3: same-reg SSE XMM XORPS is a 1-cycle(!) dep-breaking zero-idiom

While both the SOG and Agner insist that it is zero-cycle,
I cannot confirm that claim. While it clearly breaks the dependency,
I cannot come up with a snippet, or measurement approach,
to end up with IPC bigger than 4, which, to me, means that it actually
consumes an execution resource of an FP unit for a cycle.


Roman Lebedev [Thu, 13 May 2021 20:48:49 +0000 (23:48 +0300)]

[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM XORPS test


Rob Suderman [Mon, 3 May 2021 20:56:00 +0000 (13:56 -0700)]

[mlir][tosa] Add lowering to tosa.abs for integer cases

Integer case requires decomposing to simple LLVM operations.

Differential Revision: https://reviews.llvm.org/D101809


Siva Chandra Reddy [Thu, 13 May 2021 20:48:45 +0000 (20:48 +0000)]

[libc] Enable fmaf and fma on x86_64.

They require clang-11 or above for building and hence had to be disabled
as the bots did not have clang-11 or higher. Bots have now been upgraded
so we can enable these functions now.


Fangrui Song [Thu, 13 May 2021 20:44:57 +0000 (13:44 -0700)]

[CMake][ELF] Link libLLVM.so and libclang-cpp.so with -Bsymbolic-functions

llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html

In an ELF shared object, a default visibility defined symbol is preemptible by
default. This creates some missed optimization opportunities.
-Bsymbolic-functions is more aggressive than our current -fvisibility-inlines-hidden
(present since 2012) as it applies to all function definitions. It can

* avoid PLT for cross-TU function calls && reduce dynamic symbol lookup
* reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64

In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313.
The built clang with `-DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` is significantly faster.
See the Linux kernel build result https://bugs.archlinux.org/task/70697

Note: the performance of -fno-semantic-interposition -Bsymbolic-functions
libLLVM.so and libclang-cpp.so is close to a PIE binary linking against
`libLLVM*.a` and `libclang*.a`. When the host compiler is Clang,
-Bsymbolic-functions is the major contributor. On x86-64 (with GOTPCRELX) and
ppc64 ELFv2, the GOT/TOC relocations can be optimized.

Some implications:

Interposing a subset of functions is no longer supported.
(This is fragile on ELF and not supported at all on Mach-O. For Mach-O we don't
use `ld -interpose` or `-flat_namespace`)

Compiling a program which takes the address of any LLVM function with
`{gcc,clang} -fno-pic` and expects the address to be equal to the address taken
from libLLVM.so or libclang-cpp.so is unsupported. I am fairly confident that
llvm-project shouldn't have different behaviors depending on such pointer
equality (as we've been using -fvisibility-inlines-hidden which applies to
inline functions for a long time), but if we accidentally do, users should be
aware that they should not make assumptions about pointer equality in `-fno-pic`
mode.

See more on https://maskray.me/blog/2021-05-09-fno-semantic-interposition

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D102090


Joseph Huber [Thu, 13 May 2021 19:54:22 +0000 (15:54 -0400)]

[OpenMP] Prevent Attributor from deleting functions in OpenMPOptCGSCC pass

Summary:
This patch prevents the Attributor instances made in the CGSCC pass from
deleting functions. This prevents the attributor from changing the call
graph while OpenMPOpt is working with it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102363


natashaknk [Thu, 13 May 2021 20:15:57 +0000 (13:15 -0700)]

[mlir][tosa] Add tosa.div integer lowering to linalg.generic.

Lowering div elementwise op to the linalg dialect. Since tosa only supports integer division, that is the only version that is currently implemented.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D102430


Sean Silva [Wed, 12 May 2021 21:59:12 +0000 (14:59 -0700)]

[mlir][NFC] Add helper for common pattern of replaceAllUsesExcept

This covers the extremely common case of replacing all uses of a Value
with a new op that is itself a user of the original Value.

This should also be a little bit more efficient than the
`SmallPtrSet<Operation *, 1>{op}` idiom that was being used before.
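
The pattern being described can be modelled without MLIR. The sketch below is a toy: values are plain integers, `ToyOp` is a hypothetical operation type, and this free function only mimics the idea of the helper, not MLIR's actual signature or use-list implementation.

```cpp
#include <vector>

// Toy operation: just a list of operand value IDs. Hypothetical type.
struct ToyOp {
  std::vector<int> Operands;
};

// Rewrite every use of OldValue to NewValue, except inside the single
// excepted user (typically the new op that itself consumes OldValue).
void replaceAllUsesExcept(std::vector<ToyOp> &Ops, int OldValue, int NewValue,
                          const ToyOp *Except) {
  for (ToyOp &Op : Ops) {
    if (&Op == Except)
      continue; // leave the excepted user's operands untouched
    for (int &Operand : Op.Operands)
      if (Operand == OldValue)
        Operand = NewValue;
  }
}
```

Passing a single excepted user avoids building a one-element set just to skip one operation.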

Differential Revision: https://reviews.llvm.org/D102373


Martin Storsjö [Tue, 11 May 2021 06:19:52 +0000 (09:19 +0300)]

[llvm-nm] Support the -V option, print that the tool is compatible with GNU nm

This unlocks some codepaths in libtool.

Differential Revision: https://reviews.llvm.org/D102321


Siva Chandra Reddy [Thu, 13 May 2021 19:22:11 +0000 (19:22 +0000)]

[libc][NFC] Instead of erroring, skip math targets with missing implementations.

Fixes Aarch64 bot.


Siva Chandra Reddy [Wed, 12 May 2021 23:05:13 +0000 (23:05 +0000)]

[libc] Add x86_64 implementations of double precision cos, sin and tan.

The implementations use the x86_64 FPU instructions. These instructions
are extremely slow compared to a polynomial based software
implementation. Also, their accuracy falls drastically once the input
goes beyond 2PI. To improve both the speed and accuracy, we will be
taking the following approach going forward:
1. As a follow up to this CL, we will implement a range reduction algorithm
which will expand the accuracy to the entire double precision range.
2. After that, we will replace the HW instructions with a polynomial
implementation to improve the run time.

After step 2, the implementations will be accurate, performant and target
architecture independent.
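
The two planned steps can be sketched in a few lines, with heavy caveats: the commit does not say which range reduction algorithm or polynomial will be used, so the code below uses a naive `std::remainder` reduction and a truncated Taylor series purely for illustration. All function names are hypothetical, and the reduction here is not accurate over the entire double precision range.

```cpp
#include <cmath>

// Step 1 (naive sketch): fold X into [-pi/2, pi/2] without changing sin(X).
double reduceToHalfPi(double X) {
  const double Pi = 3.14159265358979323846;
  double R = std::remainder(X, 2.0 * Pi); // R is now in [-pi, pi]
  if (R > Pi / 2)
    R = Pi - R;  // sin(pi - r) == sin(r)
  else if (R < -Pi / 2)
    R = -Pi - R; // sin(-pi - r) == sin(r)
  return R;
}

// Step 2 (sketch): odd Taylor polynomial up to x^11, evaluated in Horner
// form on x^2. Only meant for inputs already reduced to [-pi/2, pi/2].
double sinPoly(double X) {
  double X2 = X * X;
  return X * (1.0 + X2 * (-1.0 / 6 + X2 * (1.0 / 120 + X2 * (-1.0 / 5040 +
              X2 * (1.0 / 362880 + X2 * (-1.0 / 39916800))))));
}

double sketchSin(double X) { return sinPoly(reduceToHalfPi(X)); }
```

A production implementation would need a far more careful reduction for large inputs and a tuned polynomial instead of raw Taylor coefficients.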

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D102384


Arnamoy Bhattacharyya [Thu, 13 May 2021 18:56:07 +0000 (14:56 -0400)]

[flang][OpenMP] Add semantic check for close nesting of `master` regions

This patch implements the following semantic check:
```
A master region may not be closely nested inside a work-sharing, loop, atomic, task, or taskloop region.
```

Adds a test case and also modifies a couple of existing test cases to include the check.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D100228


Aaron En Ye Shi [Mon, 10 May 2021 18:20:42 +0000 (18:20 +0000)]

[HIP] Clean up llvm intrinsics using __asm

Instead of using inline asm, use clang builtins
for llvm intrinsics.

Differential Revision: https://reviews.llvm.org/D102427


peter klausler [Wed, 12 May 2021 19:03:21 +0000 (12:03 -0700)]

[flang] Support legacy extension OPEN(ACCESS='APPEND')

It should of course be POSITION='APPEND' but Sun Fortran
supported it on ACCESS=.

Differential Revision: https://reviews.llvm.org/D102350


zoecarver [Thu, 13 May 2021 18:48:43 +0000 (11:48 -0700)]

[libcxx][docs] Add two locks: transform_view and take_view.

Assign myself both of these views.


zoecarver [Thu, 13 May 2021 18:45:22 +0000 (11:45 -0700)]

[libcxx][docs] Update the One Ranges Proposal Status with open revisions.

1. Moves the names into the names column.
2. Changes the names to reflect who's actually working on what.
3. Adds open revisions.


Aakanksha Patil [Thu, 13 May 2021 18:21:40 +0000 (14:21 -0400)]

[AMDGPU] Add gfx1034 target

Differential Revision: https://reviews.llvm.org/D102306


Artem Dergachev [Tue, 11 May 2021 23:44:49 +0000 (16:44 -0700)]

[clang-tidy] bugprone-infinite-loop: React to ObjC ivars and messages.

If the loop condition is a value of an instance variable, a property value,
or a message result value, it's a good indication that the loop is not infinite
and we have a really hard time proving the opposite so suppress the warning.

Differential Revision: https://reviews.llvm.org/D102294


Artem Dergachev [Tue, 11 May 2021 03:09:32 +0000 (20:09 -0700)]

[clang-tidy] bugprone-infinite-loop: forFunction() -> forCallable().

Take advantage of the new ASTMatcher added in D102213 to fix massive false negatives of the infinite loop checker on Objective-C.

Differential Revision: https://reviews.llvm.org/D102214


Artem Dergachev [Wed, 12 May 2021 03:22:58 +0000 (20:22 -0700)]

[ASTMatchers] Add forCallable(), a generalization of forFunction().

The new matcher additionally covers blocks and Objective-C methods.

This matcher actually makes sure that the statement truly belongs
to that declaration's body. forFunction() incorrectly reported that
a statement in a nested block belonged to the surrounding function.

forFunction() is now deprecated due to the above footgun, in favor of
forCallable(functionDecl()) when only functions need to be considered.

Differential Revision: https://reviews.llvm.org/D102213

commit | commitdiff | tree

Artem Dergachev [Wed, 12 May 2021 03:21:26 +0000 (20:21 -0700)]

[ASTMatchers] NFC: Fix formatting around forFunction().

Differential Revision: https://reviews.llvm.org/D102303

commit | commitdiff | tree

Roman Lebedev [Thu, 13 May 2021 18:22:38 +0000 (21:22 +0300)]

[NFC] Delete two newly-added test cases

Failing on bots in unobvious ways.

commit | commitdiff | tree

peter klausler [Wed, 12 May 2021 19:10:28 +0000 (12:10 -0700)]

[flang] (NFC) Expose internal idiom as utility API

Add overloads to AsGenericExpr() in Evaluate/tools.h to take care
of wrapping an untyped DataRef or bare Symbol in a typed Designator
wrapped up in a generic Expr<SomeType>. Use the new overloads to
replace a few instances of code that was calling TypedWrapper<>()
with a dynamic type.

This new tool will be useful in lowering to drive some code that
works with typed expressions (viz., list-directed I/O list items)
when starting with only a bare Symbol (viz., NAMELIST).

Differential Revision: https://reviews.llvm.org/D102352

commit | commitdiff | tree

Roman Lebedev [Thu, 13 May 2021 18:16:44 +0000 (21:16 +0300)]

[NFC] Try to fix CodeGenCXX/thunk-wrong-return-type.cpp test

commit | commitdiff | tree

cynecx [Thu, 13 May 2021 18:05:11 +0000 (19:05 +0100)]

Support unwinding from inline assembly

I've taken the following steps to add unwinding support from inline assembly:

1.) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:

```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
to label %exit unwind label %uexit
```

2.) Add Bitcode writing/reading support + LLVM-IR parsing.

3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.

4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.

5.) Add clang support by introducing a new clobber: "unwind", which lowers to `canThrow` being enabled.

6.) Don't allow unwinding callbr.

Reviewed By: Amanieu

Differential Revision: https://reviews.llvm.org/D95745

commit | commitdiff | tree

Roman Lebedev [Thu, 13 May 2021 18:09:45 +0000 (21:09 +0300)]

[NFC] Try to fix CodeGenCXX/thunk-wrong-this.cpp test

commit | commitdiff | tree

Stefan Pintilie [Thu, 13 May 2021 14:58:59 +0000 (09:58 -0500)]

[PowerPC] Add ROP Protection to prologue and epilogue

Added hashst to the prologue and hashchk to the epilogue.
The hash for the prologue and epilogue must always be stored as the first
element in the local variable space on the stack.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D99377

commit | commitdiff | tree

peter klausler [Wed, 12 May 2021 19:07:51 +0000 (12:07 -0700)]

[flang] Implement DOT_PRODUCT in the runtime

API, implementation, and basic tests for the transformational
reduction intrinsic function DOT_PRODUCT in the runtime support
library.

Differential Revision: https://reviews.llvm.org/D102351
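
For readers unfamiliar with the intrinsic, here is a minimal C++ sketch of the semantics Fortran defines for DOT_PRODUCT: SUM(a*b) for integer and real operands, SUM(CONJG(a)*b) for complex operands, and ANY(a .AND. b) for logical operands. The function names below are hypothetical and chosen for illustration; this is not the flang runtime API, which this log entry does not show.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Integer/real case: DOT_PRODUCT(a, b) is SUM(a * b).
// (Hypothetical helper, not part of the flang runtime.)
template <typename T>
T dotProductNumeric(const std::vector<T> &a, const std::vector<T> &b) {
  assert(a.size() == b.size() && "operands must have the same extent");
  T sum{};
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += a[i] * b[i];
  return sum;
}

// Complex case: the first operand is conjugated, SUM(CONJG(a) * b).
template <typename T>
std::complex<T> dotProductComplex(const std::vector<std::complex<T>> &a,
                                  const std::vector<std::complex<T>> &b) {
  assert(a.size() == b.size() && "operands must have the same extent");
  std::complex<T> sum{};
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += std::conj(a[i]) * b[i];
  return sum;
}

// Logical case: ANY(a .AND. b).
bool dotProductLogical(const std::vector<bool> &a, const std::vector<bool> &b) {
  assert(a.size() == b.size() && "operands must have the same extent");
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] && b[i])
      return true;
  return false;
}
```

The three cases are kept as separate functions so that the conjugation in the complex case and the short-circuit in the logical case stay visible; a real runtime would also have to handle mixed operand types and kinds, which this sketch ignores.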

commit | commitdiff | tree

Duncan P. N. Exon Smith [Fri, 30 Apr 2021 22:23:47 +0000 (15:23 -0700)]

Modules: Simplify how DisableGeneratingGlobalModuleIndex is set, likely NFC

DisableGeneratingGlobalModuleIndex was being set by
CompilerInstance::findOrCompileModuleAndReadAST most of (but not all of)
the times it returned `nullptr` as a "normal" failure. Pull that up to
the caller, CompilerInstance::loadModule, to simplify the code. This
resolves a number of FIXMEs added during the refactoring in
5cca622310c10fdf6f921b6cce26f91d9f14c762.

The extra cases where this is set are all some version of a fatal error,
and the only client of the field, shouldBuildGlobalModuleIndex, seems
to be unreachable in that case. Even if there is some corner case where
this has an effect, it seems like the right/consistent behaviour.

Differential Revision: https://reviews.llvm.org/D101672

commit | commitdiff | tree

Roman Lebedev [Thu, 13 May 2021 15:20:37 +0000 (18:20 +0300)]

Return "[CGCall] Annotate `this` argument with alignment"

The original change was reverted because it was discovered
that clang mishandles thunks, and they receive wrong
attributes for their this/return types - the ones for the function
they will call, not the ones they have.

While I have tried to fix this in https://reviews.llvm.org/D100388,
that patch has been up and stuck for a month now,
with little sign of progress.

So while it will be good to solve this for real,
for now we can simply avoid introducing the bug,
by not annotating this/return for thunks.

This reverts commit 6270b3a1eafaba4279e021418c5a2c5a35abc002,
relanding 0aa0458f1429372038ca6a4edc7e94c96cd9a753.

commit | commitdiff | tree

Roman Lebedev [Thu, 13 May 2021 11:48:26 +0000 (14:48 +0300)]

[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs

As it was discovered in post-commit feedback
for 0aa0458f1429372038ca6a4edc7e94c96cd9a753,
we handle thunks incorrectly, and end up annotating
their this/return with attributes that are valid
for their callees, not for thunks themselves.

While it would be good to fix this properly,
and keep annotating them on thunks,
I've tried doing that in https://reviews.llvm.org/D100388
with little success, and the patch has been stuck for a month now.

So for now, as a stopgap measure, do not annotate the this/return types of thunks at all.
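
As background on why a thunk's `this` differs from its callee's, the following is a small sketch of a class hierarchy in which a virtual call through a secondary base typically goes through a this-adjusting thunk. The class and function names are hypothetical, and whether a thunk is actually emitted depends on the ABI and on devirtualization; the sketch only shows that the pointer seen through `Base2` is not the address of the whole `Derived` object, so attributes valid for a `Derived *` need not hold for it.

```cpp
#include <cassert>

struct Base1 {
  virtual ~Base1() = default;
  virtual int first() const { return 1; }
  int pad1 = 0;
};

struct Base2 {
  virtual ~Base2() = default;
  virtual int second() const { return 2; }
  int pad2 = 0;
};

struct Derived : Base1, Base2 {
  int payload = 42;
  // Overrides a function of the secondary base. A call made through a
  // Base2 pointer must adjust `this` from the Base2 subobject back to
  // the Derived object before this body runs.
  int second() const override { return payload; }
};

// Calls second() through the Base2 interface only.
int callThroughBase2(const Base2 &b) { return b.second(); }

// Reports whether the Base2 subobject lives at a different address
// than the complete Derived object.
bool base2IsOffset(const Derived &d) {
  const void *whole = &d;
  const void *sub = static_cast<const Base2 *>(&d);
  return whole != sub;
}
```

For a `Derived d;`, `callThroughBase2(d)` still returns the `Derived` payload, while `base2IsOffset(d)` reports that the two pointers differ. That difference is the adjustment a thunk performs, and it is the reason alignment or dereferenceable-size facts about the callee's `this` cannot simply be copied onto the thunk's `this`.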

commit | commitdiff | tree

Roman Lebedev [Thu, 13 May 2021 11:46:41 +0000 (14:46 +0300)]

[NFC][Clang][Codegen] Add tests with wrong attributes on this/return of thunks

From https://reviews.llvm.org/D100388

commit | commitdiff | tree

David Green [Thu, 13 May 2021 17:31:01 +0000 (18:31 +0100)]

[ARM] Constrain CMPZ shift combine to a single use

We currently prefer t2CMPrs over t2CMPri when the node contains a shift.
This can introduce more nodes if the shift has multiple uses though, as
the value from the shift will be needed anyway, and a t2CMPri that
compares with zero is more likely to be removed entirely.

Differential Revision: https://reviews.llvm.org/D101688

commit | commitdiff | tree

Jonas Devlieghere [Thu, 13 May 2021 17:12:00 +0000 (10:12 -0700)]

[lldb] Fixup indirect symbols as they are signed.

This fixes a bunch of test failures on Apple Silicon (arm64e).

commit | commitdiff | tree

Jonas Devlieghere [Thu, 13 May 2021 16:41:09 +0000 (09:41 -0700)]

[lldb] Fixup more code addresses

The Swift async task pointers are signed on arm64e and we need to fix up
the addresses in the CFA and DWARF expressions.

commit | commitdiff | tree

Duncan P. N. Exon Smith [Fri, 30 Apr 2021 22:09:09 +0000 (15:09 -0700)]

Modules: Rename ModuleBuildFailed => DisableGeneratingGlobalModuleIndex, NFC

Rename CompilerInstance's ModuleBuildFailed field to
DisableGeneratingGlobalModuleIndex, which more precisely describes its
role. Otherwise, it's hard to suss out how it's different from
ModuleLoader::HadFatalFailure, and what sort of code simplifications are
safe.

Differential Revision: https://reviews.llvm.org/D101670

commit | commitdiff | tree

Weiwei Li [Thu, 13 May 2021 17:06:53 +0000 (13:06 -0400)]

[mlir][spirv] Define spv.ImageQuerySize operation

Support OpImageQuerySize in spirv dialect

co-authored-by: Alan Liu <alanliu.yf@gmail.com>

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D102029

commit | commitdiff | tree

Valeriy Savchenko [Tue, 11 May 2021 14:30:02 +0000 (17:30 +0300)]

[analyzer][solver] Prevent use of a null state

rdar://77686137

Differential Revision: https://reviews.llvm.org/D102240

commit | commitdiff | tree

zoecarver [Thu, 13 May 2021 17:09:15 +0000 (10:09 -0700)]

[pstl] Use logical operator for loop condition in tests

Fix a probable typo in two PSTL tests that causes warnings with GCC.

Patch by Jonathan Wakely (jwakely).

Reviewed By: zoecarver

Differential Revision: https://reviews.llvm.org/D102327
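
The commit text does not say what the typo was. One common way to get this kind of warning is a comma operator in a loop condition where `&&` was intended, and the sketch below assumes that case. The function name is hypothetical and is not taken from the PSTL tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Walks two sequences in lockstep and returns how many steps were taken.
// (Hypothetical helper for illustration only.)
std::size_t pairedSteps(const std::vector<int> &a, const std::vector<int> &b) {
  std::size_t steps = 0;
  // Had the condition been written `i < a.size(), j < b.size()`, the
  // comma operator would evaluate and discard the first comparison, so
  // only `j < b.size()` would bound the loop. The logical operator
  // checks both bounds.
  for (std::size_t i = 0, j = 0; i < a.size() && j < b.size(); ++i, ++j)
    ++steps;
  return steps;
}
```

With `&&`, the loop stops at the shorter of the two sequences. Under the comma form it would keep going until the second sequence ends, even when the first is shorter.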
