review.tizen.org Git - platform/upstream/llvm.git/log

Matthias Springer [Tue, 5 Jul 2022 14:54:38 +0000 (16:54 +0200)]

[mlir][interfaces][NFC] Remove ViewLikeInterface::expandToRank

This helper function is no longer needed.

Differential Revision: https://reviews.llvm.org/D129145

Nikita Popov [Tue, 5 Jul 2022 08:56:54 +0000 (10:56 +0200)]

Revert "[SimplifyCFG] Thread branches on same condition in more cases (PR54980)"

This reverts commit 4e545bdb355a470d601e9bb7f7b2693c99e61a3e.

The newly added test is the third infinite combine loop caused by
this change. In this case, it's a combination of the branch to
common dest and jump threading folds that keeps peeling off loop
iterations.

The core problem here is that we ideally would not thread over
loop backedges, both because it is potentially non-profitable
(it may break canonical loop structure) and because it may result
in these kinds of loops. Unfortunately, due to the lack of a
dominator tree in SimplifyCFG, there is no good way to prevent
this. While we have LoopHeaders, this is an optional structure and
we don't do a good job of keeping it up to date. It would be fine
for a profitability check, but is not suitable for a correctness
check.

So for now I'm just giving up here, as I don't see a good way to
robustly prevent infinite combine loops.

Fixes https://github.com/llvm/llvm-project/issues/56203.

Matthias Springer [Tue, 5 Jul 2022 14:39:29 +0000 (16:39 +0200)]

[mlir][memref] Improve type inference for rank-reducing subviews

The result shape of a rank-reducing subview cannot be inferred in the general case. Knowing only the result rank is not enough. The only thing that we can infer is the layout map.

This change also improves the bufferization patterns of tensor.extract_slice and tensor.insert_slice to fully support rank-reducing operations.

Differential Revision: https://reviews.llvm.org/D129144
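
The ambiguity described above can be illustrated with a small sketch. It is not MLIR code: the helper name `candidate_result_shapes` is hypothetical, and it assumes that a rank-reducing operation may only drop dimensions of size 1. Given the slice sizes and a result rank, it enumerates every shape that could result.

```python
from itertools import combinations


def candidate_result_shapes(sizes, result_rank):
    """Return every shape reachable by dropping unit dims to reach result_rank.

    Hypothetical helper for illustration; assumes only size-1 dims may be dropped.
    """
    to_drop = len(sizes) - result_rank
    if to_drop < 0:
        raise ValueError("result rank cannot exceed the source rank")
    unit_dims = [i for i, size in enumerate(sizes) if size == 1]
    shapes = set()
    for dropped in combinations(unit_dims, to_drop):
        shapes.add(tuple(s for i, s in enumerate(sizes) if i not in dropped))
    return sorted(shapes)


# Sizes (1, 4, 1) reduced to rank 2 admit two different shapes, so the rank
# alone does not determine the result shape.
print(candidate_result_shapes((1, 4, 1), 2))
```

With sizes `(1, 4, 8)` and rank 2 there is a single candidate, which is why the inference only fails "in the general case".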

Joe Nash [Mon, 27 Jun 2022 17:20:21 +0000 (13:20 -0400)]

[AMDGPU] gfx11 CodeGen for new DPP instructions

Modifies the GCNDPPCombine pass to enable DPP formation for the new DPP
instruction in gfx11, namely VOP3 encoded instructions with DPP and VOPC
with DPP.

Depends on D128656

Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D128682

Matthias Springer [Tue, 5 Jul 2022 13:36:02 +0000 (15:36 +0200)]

[mlir][tensor][bufferize][NFC] Clean up test case

Insert -split-input-file flag to make the test cases more stable.

Differential Revision: https://reviews.llvm.org/D129143

LLVM GN Syncbot [Tue, 5 Jul 2022 13:57:20 +0000 (13:57 +0000)]

[gn build] Port d1af09ad9617

Haojian Wu [Fri, 1 Jul 2022 12:50:07 +0000 (14:50 +0200)]

[pseudo] Implement guard extension.

- Extend the GLR parser to allow conditional reduction based on the
guard functions;
- Implement two simple guards (contextual-override/final) for cxx.bnf;
- layering: clangPseudoCXX depends on clangPseudo (as the guard function needs
to access the TokenStream);

Differential Revision: https://reviews.llvm.org/D127448

Nikita Popov [Wed, 29 Jun 2022 12:27:04 +0000 (14:27 +0200)]

[ConstExpr] Don't create div/rem expressions

This removes creation of udiv/sdiv/urem/srem constant expressions,
in preparation for their removal. I've added a
ConstantExpr::isDesirableBinOp() predicate to determine whether
an expression should be created for a certain operator.

With this patch, div/rem expressions can still be created through
explicit IR/bitcode; forbidding them entirely will be the next step.

Differential Revision: https://reviews.llvm.org/D128820

Eric Li [Mon, 4 Jul 2022 20:46:07 +0000 (20:46 +0000)]

[clang][dataflow] Handle null pointers of type std::nullptr_t

Treat `std::nullptr_t` as a regular scalar type to avoid tripping
assertions when analyzing code that uses `std::nullptr_t`.

Differential Revision: https://reviews.llvm.org/D129097

Joe Nash [Thu, 23 Jun 2022 19:57:01 +0000 (15:57 -0400)]

[AMDGPU] gfx11 Generate VOPD Instructions

We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a legal VOPD, namely that the matching operands
(e.g. src0x and src0y, src1x and src1y) must be in different register
banks. We add a PostRA scheduler
mutation to put possible VOPD components back-to-back.

Depends on D128442, D128270

Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D128656

Haojian Wu [Tue, 5 Jul 2022 13:40:02 +0000 (15:40 +0200)]

[pseudo] Fix the build for the benchmark tool.

Vladislav Khmelevsky [Tue, 28 Jun 2022 16:54:59 +0000 (19:54 +0300)]

[RuntimeDyld] Fix R_AARCH64_TSTBR14 relocation

The wrong mask was used to extract the branch instruction's imm value.

Differential Revision: https://reviews.llvm.org/D128740
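
To show why the mask matters, here is a minimal sketch of patching a test-and-branch immediate. It is not the RuntimeDyld code: the function names are hypothetical, and it assumes the A64 TBZ/TBNZ layout in which the 14-bit immediate occupies bits 18:5 and holds the byte offset divided by 4.

```python
# Assumed A64 TBZ/TBNZ layout: the 14-bit branch immediate sits in bits 18:5.
IMM14_SHIFT = 5
IMM14_FIELD = 0x3FFF << IMM14_SHIFT  # 0x0007FFE0


def patch_tstbr14(insn: int, byte_offset: int) -> int:
    """Return insn with its imm14 field replaced by byte_offset / 4."""
    if byte_offset % 4 != 0:
        raise ValueError("branch target must be 4-byte aligned")
    if not -(1 << 15) <= byte_offset < (1 << 15):
        raise ValueError("offset does not fit the signed 14-bit word range")
    imm14 = (byte_offset >> 2) & 0x3FFF
    # Clear only the immediate field; every other bit of the opcode survives.
    cleared = insn & ~IMM14_FIELD & 0xFFFFFFFF
    return cleared | (imm14 << IMM14_SHIFT)


def read_tstbr14(insn: int) -> int:
    """Recover the signed byte offset stored in the imm14 field."""
    imm14 = (insn & IMM14_FIELD) >> IMM14_SHIFT
    if imm14 & 0x2000:  # sign bit of the 14-bit field
        imm14 -= 0x4000
    return imm14 * 4
```

A mask that is too wide would clobber the register or bit-number fields next to the immediate; a mask that is too narrow would leave stale offset bits behind.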

Nikita Popov [Mon, 27 Jun 2022 13:09:24 +0000 (15:09 +0200)]

[SCEV] Fix isImpliedViaMerge() with values from previous iteration (PR56242)

When trying to prove an implied condition on a phi by proving it
for all incoming values, we need to be careful about values coming
from a backedge, as these may refer to a previous loop iteration.
A variant of this issue was fixed in D101829, but the dominance
condition used there isn't quite right: It checks that the value
dominates the incoming block, which doesn't exclude backedges
(values defined in a loop will usually dominate the loop latch,
which is the incoming block of the backedge).

Instead, we should be checking for domination of the phi block.
Any values defined inside the loop will not dominate the loop
header phi.

Fixes https://github.com/llvm/llvm-project/issues/56242.

Differential Revision: https://reviews.llvm.org/D128640

Andi-Bogdan Postelnicu [Tue, 5 Jul 2022 09:07:15 +0000 (09:07 +0000)]

[Compiler-RT] Remove FlushViewOfFile call when unmapping gcda files on win32.

This patch was pushed for calixte@mozilla.com

- this function (Windows only) is called when gcda are dumped on disk;
- according to its documentation, it's only useful in case of hard failures, which are highly improbable;
- it drastically decreases the time in the tests and consequently it avoids timeouts when we use slow disks.

Differential Revision: https://reviews.llvm.org/D129128

Haojian Wu [Mon, 4 Jul 2022 12:15:51 +0000 (14:15 +0200)]

[pseudo] Use the prebuilt cxx grammar for the lit tests, NFC.

Differential Revision: https://reviews.llvm.org/D129074

Ivan Kosarev [Tue, 5 Jul 2022 13:06:36 +0000 (14:06 +0100)]

[AMDGPU][NFC] Refine matching SMRD offsets.

Tell the matcher what we are looking for instead of matching everything
and then discarding the result if it doesn't fit.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D128171

Ivan Kosarev [Tue, 5 Jul 2022 12:39:46 +0000 (13:39 +0100)]

[AMDGPU][GlobalISel] Support register offsets for SMRDs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D128836

Sam McCall [Mon, 4 Jul 2022 18:35:40 +0000 (20:35 +0200)]

[pseudo] Eliminate LRTable::Action. NFC

The last remaining uses are in tests/test builders.
Replace with a builder struct.

Differential Revision: https://reviews.llvm.org/D129093

Nikita Popov [Tue, 5 Jul 2022 10:21:06 +0000 (12:21 +0200)]

[SimplifyCFG] Thread all predecessors with same value at once

If there are multiple predecessors that have the same condition
value (and thus same "real destination"), these were previously
handled by copying the threaded block for each predecessor.
Instead, we can reuse one block for all of them. This makes the
behavior of SimplifyCFG's jump threading match that of the
actual JumpThreading pass.

This also avoids the infinite combine loop reported in:
https://reviews.llvm.org/D124159#3624387

Florian Hahn [Tue, 5 Jul 2022 11:58:13 +0000 (12:58 +0100)]

[LV] Remove stray dbgs() call after 774fc63490939.

Nikita Popov [Tue, 5 Jul 2022 11:57:16 +0000 (13:57 +0200)]

[SimplifyCFG] Add additional jump threading test (NFC)

A case where multiple predecessors can be threaded over the same
edge, with a phi node in the threaded block.

Tobias Hieta [Tue, 5 Jul 2022 11:45:32 +0000 (13:45 +0200)]

[clang-extdef-mapping] Directly process .ast files

When doing CTU analysis setup you pre-compile .cpp to .ast and then
you run clang-extdef-mapping on the .cpp file as well. This is a
pretty slow process since we have to recompile the file each time.

With this patch you can now run clang-extdef-mapping directly on
the .ast file. That saves a lot of time.

I tried this on llvm/lib/AsmParser/Parser.cpp and running
extdef-mapping on the .cpp file took 5.4s on my machine.

While running it on the .ast file it took 2s.

This can save a lot of time for the setup phase of CTU analysis.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D128704

Thomas Symalla [Tue, 5 Jul 2022 11:37:44 +0000 (13:37 +0200)]

[NFC] Fix wrong comment.

Muhammad Omair Javaid [Tue, 5 Jul 2022 11:26:14 +0000 (15:26 +0400)]

[LLDB] Fix decorator import in TestTwoHitsOneActual.py

Groverkss [Tue, 5 Jul 2022 11:17:24 +0000 (12:17 +0100)]

[MLIR][Affine] Allow `<=` in IntegerSet constraints

This patch extends the affine parser to allow affine constraints with `<=`.
This is useful for writing unit tests for the Presburger library and for tests in general.

The internal storage and printing of IntegerSet is still in the original format.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D129046
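
Since the internal storage keeps its original format, a `<=` constraint has to be rewritten into a form the set already supports. The sketch below shows one plausible canonicalization, `lhs <= rhs` becoming `rhs - lhs >= 0`. It is not the MLIR parser: the dictionary representation (term name to coefficient, with `"1"` for the constant) and both function names are assumptions made for illustration.

```python
def normalize_le(lhs, rhs):
    """Rewrite `lhs <= rhs` as `expr >= 0` and return expr = rhs - lhs.

    Expressions are dicts mapping a term name to its coefficient; the key "1"
    stands for the constant term (an assumed representation).
    """
    expr = dict(rhs)
    for term, coeff in lhs.items():
        expr[term] = expr.get(term, 0) - coeff
    return {term: coeff for term, coeff in expr.items() if coeff != 0}


def evaluate(expr, values):
    """Evaluate an expression dict for concrete variable values."""
    return sum(
        coeff * (1 if term == "1" else values[term])
        for term, coeff in expr.items()
    )


# d0 <= 7 becomes -d0 + 7 >= 0.
print(normalize_le({"d0": 1}, {"1": 7}))
```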

Alexey Bader [Tue, 5 Jul 2022 11:11:45 +0000 (07:11 -0400)]

Updating office hours

Muhammad Omair Javaid [Tue, 5 Jul 2022 11:00:53 +0000 (15:00 +0400)]

[LLDB] Skip TestTwoHitsOneActual.py on Arm/AArch64 Linux

This test has a race condition that makes it hang on the LLDB
Arm/AArch64 Linux buildbot. I am marking it as skipped until we
investigate what's going wrong.

Kazushi (Jam) Marukawa [Tue, 5 Jul 2022 10:38:06 +0000 (19:38 +0900)]

[VE] Restructure eliminateFrameIndex

Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D129034

Jun Zhang [Tue, 5 Jul 2022 10:43:21 +0000 (18:43 +0800)]

Correct XFAIL according to bot owner's advice

Signed-off-by: Jun Zhang <jun@junz.org>

serge-sans-paille [Tue, 5 Jul 2022 10:19:00 +0000 (12:19 +0200)]

[clang-tidy] By-pass portability issues in confusable-identifiers test

Kazushi (Jam) Marukawa [Tue, 5 Jul 2022 10:35:12 +0000 (19:35 +0900)]

Revert "[VE] Restructure eliminateFrameIndex"

This reverts commit 98e52e8bff525b1fb2b269f74b27f0a984588c9c.

Kazushi (Jam) Marukawa [Sat, 2 Jul 2022 05:06:17 +0000 (14:06 +0900)]

Jun Zhang [Tue, 5 Jul 2022 04:32:12 +0000 (12:32 +0800)]

Reland "Reland "[NFC] Add a missing test for for clang-repl""

This reverts commit 6956840b5c0029d7f8e043b3c77bb1ffc230e4d5.
Try to use `XFAIL: windows-msvc || ps4` to disable all unsupported targets.

Signed-off-by: Jun Zhang <jun@junz.org>

Muhammad Omair Javaid [Tue, 5 Jul 2022 09:16:23 +0000 (13:16 +0400)]

[LLDB] Disable TestGdbRemoteFork* for Arm/AArch64 Linux

This test is causing some trouble with the LLDB Arm/AArch64 Linux buildbot.
I am disabling it temporarily to make the buildbot green.

Archibald Elliott [Tue, 5 Jul 2022 09:43:31 +0000 (10:43 +0100)]

[ARM] Add Support for Cortex-M85

This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is
an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI,
both of which are enabled by default.

Parts have been coauthored by Mark Murray, Alexandros Lamprineas and
David Green.

Differential Revision: https://reviews.llvm.org/D128415

Yi Kong [Tue, 5 Jul 2022 09:26:34 +0000 (17:26 +0800)]

Fix tests with non-default CLANG_DEFAULT_LINKER

Force -fuse-ld option, as some other tests in the same file do.

Sven van Haastregt [Tue, 5 Jul 2022 09:22:34 +0000 (10:22 +0100)]

[OpenCL] Remove fast_ half geometric builtins

These are not mentioned in the OpenCL C Specification nor in the
OpenCL Extension Specification.

Differential Revision: https://reviews.llvm.org/D128436

Florian Hahn [Tue, 5 Jul 2022 09:21:33 +0000 (10:21 +0100)]

[IndVars] Precommit test with redundant FPToSI.

Test for #55505.

Chenbing Zheng [Tue, 5 Jul 2022 09:14:22 +0000 (17:14 +0800)]

[InstCombine] improve fold for icmp_eq_and to icmp_ult

In D95959, the improved analysis for "C >> X" broke the fold
((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two.

It simplifies C, after which C no longer satisfies the fold condition.
This patch tries to restore C before the fold.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D128790
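
The fold itself can be checked exhaustively at a small bit width. The sketch below is only a brute-force sanity check over 8-bit values, not InstCombine code; the width and helper names are chosen for illustration.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def is_power_of_two(value: int) -> bool:
    return value != 0 and (value & (value - 1)) == 0


def fold_holds(c: int) -> bool:
    """Check ((x & c) == 0) == (x u< -c) for every WIDTH-bit x."""
    neg_c = (-c) & MASK  # two's complement negation at WIDTH bits
    return all(((x & c) == 0) == (x < neg_c) for x in range(1 << WIDTH))


# Constants whose negation is a power of two: 128, 192, 224, ..., 255.
valid_constants = [c for c in range(1 << WIDTH) if is_power_of_two((-c) & MASK)]
print(valid_constants, all(fold_holds(c) for c in valid_constants))
```

For a constant such as 6, whose negation is not a power of two, `fold_holds` returns `False`, which is why the precondition has to be tested on the original C.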

Peter Waller [Mon, 4 Jul 2022 14:10:02 +0000 (14:10 +0000)]

[gn build] (manually) port 6b3956e123db

Differential Revision: https://reviews.llvm.org/D129080

Chenbing Zheng [Tue, 5 Jul 2022 09:02:52 +0000 (17:02 +0800)]

[InstCombine] [NFC] use C.isNegatedPowerOf2() instead of (~C + 1).isPowerOf2()

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D129103
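
The two spellings in the title are equivalent because `~C + 1` is the two's complement negation of C. The sketch below models this with plain Python integers at a fixed width; it does not use LLVM's APInt, and both function names are hypothetical.

```python
def is_negated_power_of_two(c: int, width: int) -> bool:
    """True if c, as a width-bit pattern, is a run of ones followed by zeros."""
    mask = (1 << width) - 1
    c &= mask
    if c == 0:
        return False
    trailing_zeros = (c & -c).bit_length() - 1
    return c == (mask >> trailing_zeros) << trailing_zeros


def complement_plus_one_is_power_of_two(c: int, width: int) -> bool:
    """The older spelling: (~c + 1) is a power of two at the given width."""
    mask = (1 << width) - 1
    value = (~c + 1) & mask
    return value != 0 and (value & (value - 1)) == 0


# The two predicates agree on every 8-bit constant.
print(all(
    is_negated_power_of_two(c, 8) == complement_plus_one_is_power_of_two(c, 8)
    for c in range(256)
))
```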

Nico Weber [Tue, 5 Jul 2022 08:59:41 +0000 (10:59 +0200)]

[gn build] (manually) port dfb77f2e99a1

Kaining Zhong [Tue, 5 Jul 2022 08:50:37 +0000 (10:50 +0200)]

[lldb] Add support to load object files from thin archives

This fixes https://github.com/llvm/llvm-project/issues/50114 where lldb/mac
can't load object files from thin archives. This patch allows lldb to identify
thin archives, and load object files contained in them.

Differential Revision: https://reviews.llvm.org/D126464
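
Identifying a thin archive comes down to its leading magic string. The sketch below is not lldb code; it only assumes the conventional `ar` magics, `!<arch>\n` for regular archives and `!<thin>\n` for thin ones, and the function name is hypothetical.

```python
REGULAR_MAGIC = b"!<arch>\n"
THIN_MAGIC = b"!<thin>\n"


def classify_archive(data: bytes) -> str:
    """Classify a buffer by its archive magic (hypothetical helper)."""
    if data.startswith(THIN_MAGIC):
        # Thin archives reference member files on disk instead of embedding them.
        return "thin"
    if data.startswith(REGULAR_MAGIC):
        return "regular"
    return "not an archive"


print(classify_archive(b"!<thin>\n" + b"rest of the file"))
```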

Chenbing Zheng [Tue, 5 Jul 2022 08:48:49 +0000 (16:48 +0800)]

[InstCombine] add negative tests for (%x & C) == 0 -> %x u< (-C). nfc

David Sherwood [Wed, 15 Jun 2022 14:10:16 +0000 (15:10 +0100)]

[AArch64][SME] Add SME addha/va intrinsics

This patch adds the following new SME intrinsics:

@llvm.aarch64.sme.addva
@llvm.aarch64.sme.addha

Differential Revision: https://reviews.llvm.org/D127861

Ben Dunbobbin [Fri, 1 Jul 2022 15:45:09 +0000 (16:45 +0100)]

[LLD][ELF] Add FORCE_LLD_DIAGNOSTICS_CRASH to force LLD to crash

Add FORCE_LLD_DIAGNOSTICS_CRASH inspired by the existing
FORCE_CLANG_DIAGNOSTICS_CRASH.

This is particularly useful for people customizing LLD as they may
want to modify the crash reporting behavior.

Differential Revision: https://reviews.llvm.org/D128195

Florian Hahn [Tue, 5 Jul 2022 08:41:58 +0000 (09:41 +0100)]

[LV] Consider minimum vscale assumption for RT check cost.

For scalable VFs, the minimum assumed vscale needs to be included in the
cost-computation, otherwise a smaller VF may be used for RT check cost
computation than was used for earlier cost computations.

Fixes a RISCV test failing with UBSan due to both scalar and vector
loops having the same cost.

Nicolas Vasilache [Mon, 4 Jul 2022 16:48:18 +0000 (09:48 -0700)]

[mlir][Linalg] Add DropUnitDims support for tensor::ParallelInsertSliceOp.

ParallelInsertSlice behaves similarly to tensor::InsertSliceOp in its
rank-reducing properties.
This revision extends rank-reducing rewrite behavior and reuses most of the
existing implementation.

Differential Revision: https://reviews.llvm.org/D129091

Manuel Klimek [Tue, 5 Jul 2022 08:20:09 +0000 (08:20 +0000)]

Fix use of pointer arithmetic instead of iterators.

Nikolas Klauser [Tue, 5 Jul 2022 08:13:38 +0000 (10:13 +0200)]

[libc++] Fix __split_buffer::__construct_at_end definition to match declaration

Nikolas Klauser [Mon, 4 Jul 2022 20:45:49 +0000 (22:45 +0200)]

[libc++] Use __is_exactly_{input, forward}_iterator

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D128646

Nikita Popov [Tue, 5 Jul 2022 07:29:11 +0000 (09:29 +0200)]

Revert "[VectorCombine] Improve shuffle select shuffle-of-shuffles"

This reverts commit 19a1e20b8a0f69da2a871eae6cbd03d1314ee02d.

Clang crashes while linking bullet from llvm-test-suite in
ReleaseLTO-g cmake configuration.

Jean Perier [Tue, 5 Jul 2022 07:13:07 +0000 (09:13 +0200)]

[flang] Avoid opaque pointer issue with character array substring addressing

When addressing a substring of a character array, codegen emits two
GEPs: one to compute the address of the base element, and a second
one to address the first character of the substring within that element.

The first GEP still returns the LLVM array type (if the FIR array type could
be translated to an array type), so zero indexes must be added to the second
GEP in this case to cover the Fortran array dimensions before inserting the
substring offset index.

Surprisingly, the previous code worked fine when MLIR emitted non-opaque
pointers. But with opaque pointers, the two GEPs are folded into an
invalid GEP where the substring offset becomes an offset for the outer
array dimension.

Note that I tried to fix the issue by modifying the first GEP to return the
element type, but this still gave bad results (here something might be
wrong with opaque pointer in MLIR or LLVM).

Differential Revision: https://reviews.llvm.org/D129079

serge-sans-paille [Tue, 28 Jun 2022 08:34:46 +0000 (10:34 +0200)]

[clang-tidy] Fix confusable identifiers interaction with DeclContext

Properly check the enclosing DeclContext, and add the related test case.
It would be great to be able to use Sema to check conflicting scopes, but that's
not something clang-tidy seems to be able to do :-/

Fix #56221

Differential Revision: https://reviews.llvm.org/D128715

Craig Topper [Tue, 5 Jul 2022 05:33:15 +0000 (22:33 -0700)]

[RISCV] Replace an explicit check with an assert.

Shift amounts should never be 0 or more than bitwidth - 1.

Craig Topper [Tue, 5 Jul 2022 05:28:08 +0000 (22:28 -0700)]

[RISCV] Rename some variables for clarity. NFC

Stephan Bergmann [Wed, 29 Jun 2022 06:17:58 +0000 (08:17 +0200)]

[test] Check for more -fsanitize=array-bounds behavior

...that had temporarily regressed with (since reverted)
<https://github.com/llvm/llvm-project/commit/886715af962de2c92fac4bd37104450345711e4a>
"[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible
arrays", and had then been seen to cause issues in the wild:

For one, the HarfBuzz project has various "fake" flexible array members of the
form

> Type                arrayZ[HB_VAR_ARRAY];

in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-open-type.hh>, where
HB_VAR_ARRAY is a macro defined as

> #ifndef HB_VAR_ARRAY
> #define HB_VAR_ARRAY 1
> #endif

in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-machinery.hh>.

For another, the Firebird project in
<https://github.com/FirebirdSQL/firebird/blob/master/src/lock/lock_proto.h> uses
a trailing member

>         srq lhb_hash[1];                        // Hash table

as a "fake" flexible array, but declared in a

> struct lhb : public Firebird::MemoryHeader

that is not a standard-layout class (because the Firebird::MemoryHeader base
class also declares non-static data members).

(The second case is specific to C++.  Extend the test setup so that all the
other tests are now run for both C and C++, just in case the behavior could ever
start to diverge for those two languages.)

A third case where -fsanitize=array-bounds differs from -Warray-bounds (and
which is also specific to C++, but which doesn't appear to have been encountered
in the wild) is when the "fake" flexible array member's size results from
template argument substitution.

Differential Revision: https://reviews.llvm.org/D128783

Daniel Bertalan [Sun, 3 Jul 2022 08:58:39 +0000 (10:58 +0200)]

[lld-macho] Handle LOH_ARM64_ADRP_LDR_GOT optimization hints

This hint instructs the linker to perform the AdrpLdr or AdrpAdd
transformation depending on whether the GOT load has been relaxed to
load a local symbol's address.

Differential Revision: https://reviews.llvm.org/D129059

Christian Sigg [Mon, 4 Jul 2022 06:11:30 +0000 (08:11 +0200)]

[mlir] Add InferIntRangeInterface to gpu.launch

Infers block/grid dimensions/indices or ranges of such dimensions/indices.

Reviewed By: krzysz00

Differential Revision: https://reviews.llvm.org/D129036

Fangrui Song [Tue, 5 Jul 2022 04:45:19 +0000 (21:45 -0700)]

[llvm-objcopy] -O binary: align sh_offset for section changed from SHT_NOBITS

For a SHT_NOBITS section like .bss, its sh_offset is typically not
aligned by sh_addralign. If it is converted to SHT_PROGBITS by
`--set-section-flags .bss=alloc,contents`, we should conceptually align
it when computing the output size for -O binary. Otherwise the output
size may be smaller than the output produced by GNU objcopy.

* binary-no-paddr.test has a case with a nonsensical p_paddr=1 whose
behavior has changed. Update it.

Close https://github.com/llvm/llvm-project/issues/55246

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D128961
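
The alignment step can be sketched as a round-up to the next multiple of `sh_addralign`. This is not the llvm-objcopy implementation; `align_to` is a hypothetical helper, and the only ELF fact it relies on is that an `sh_addralign` of 0 or 1 means the section has no alignment constraint.

```python
def align_to(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    if alignment <= 1:
        # sh_addralign of 0 or 1 means "no alignment constraint" in ELF.
        return offset
    return (offset + alignment - 1) // alignment * alignment


# An unaligned .bss offset of 0x1001 with sh_addralign = 16 moves to 0x1010,
# so the computed output size grows by the padding.
print(hex(align_to(0x1001, 16)))
```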

wanglei [Tue, 5 Jul 2022 01:54:30 +0000 (09:54 +0800)]

[LoongArch] Add initial support for function calls

Note that this is just enough for simple function call examples to
generate working code.

A good portion of this patch is the extra functions that needed to be
implemented to support the test case, e.g. storeRegToStackSlot,
loadRegFromStackSlot, eliminateFrameIndex.

Differential Revision: https://reviews.llvm.org/D128429

wanglei [Tue, 5 Jul 2022 01:49:11 +0000 (09:49 +0800)]

[LoongArch] Add codegen support for conditional branches

Setting ISD::BR_CC to Expand makes it much easier to deal with
matching the expanded form.

Differential Revision: https://reviews.llvm.org/D128428

wanglei [Tue, 5 Jul 2022 01:46:19 +0000 (09:46 +0800)]

[LoongArch] Add codegen support for load/store operations

This patch also supports lowering global addresses.

Differential Revision: https://reviews.llvm.org/D128427

Yeting Kuo [Sun, 3 Jul 2022 11:20:28 +0000 (19:20 +0800)]

[RISCV][Clang] Teach RISCVEmitter to generate BitCast for pointer operands.

RVV C intrinsics use pointers to scalars for the base address, but their
corresponding IR intrinsics use pointers to vectors. As a result, some vector
load intrinsics need specific ManualCodegen and MaskedManualCodegen just to add
a bitcast when lowering to IR.

To simplify riscv_vector.td, this patch makes RISCVEmitter detect pointer
operands and bitcast them.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D129043

Chuanqi Xu [Tue, 5 Jul 2022 02:55:14 +0000 (10:55 +0800)]

[NFC] Remove unused test inputs

phyBrackets [Tue, 5 Jul 2022 02:41:00 +0000 (08:11 +0530)]

[NFC][ASTImporter] remove the unnecessary condition checks in ASTImporter.cpp

I think that these conditions are unnecessary because in VisitClassTemplateDecl we import the definition via the templated CXXRecordDecl and in VisitVarTemplateDecl via the templated VarDecl. These are named ToTemplated and DTemplated respectively.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D128608

jacquesguan [Mon, 4 Jul 2022 09:15:32 +0000 (17:15 +0800)]

[RISCV][NFC] Merge the isolated declaration into foreach.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D129063

zhongyunde [Tue, 5 Jul 2022 01:14:29 +0000 (09:14 +0800)]

[InstCombine] Make use of low zero bits to determine exact int->fp cast

According to the comment at https://reviews.llvm.org/D127854#inline-1226805,
we could also make use of these low zero bits: https://alive2.llvm.org/ce/z/GYxTRu

Reviewed By: spatel, nikic, xbolva00

Differential Revision: https://reviews.llvm.org/D128895

Sanjay Patel [Mon, 4 Jul 2022 22:55:24 +0000 (18:55 -0400)]

[InstCombine] fold sub of min/max of sub with common operand

x - max(x - y, 0) --> min(x, y)
x - min(x - y, 0) --> max(x, y)

https://alive2.llvm.org/ce/z/2YkqFe

issue #55470
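
The two identities above can be sanity-checked by brute force. The sketch below uses Python's unbounded integers, so it ignores fixed-width overflow entirely; any no-wrap preconditions the real fold needs are not modeled here, and the function name is hypothetical.

```python
def sub_of_minmax_identities_hold(values) -> bool:
    """Check both folds for every pair (x, y) drawn from values."""
    values = list(values)
    for x in values:
        for y in values:
            if x - max(x - y, 0) != min(x, y):
                return False
            if x - min(x - y, 0) != max(x, y):
                return False
    return True


print(sub_of_minmax_identities_hold(range(-8, 9)))
```

The reasoning is a two-case split: when x >= y the max picks x - y and the subtraction leaves y; otherwise it picks 0 and leaves x. The min case is the mirror image.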

Sanjay Patel [Mon, 4 Jul 2022 22:46:40 +0000 (18:46 -0400)]

[InstCombine] add tests for sub of smin/smax; NFC

issue #55470

Sanjay Patel [Mon, 4 Jul 2022 21:39:54 +0000 (17:39 -0400)]

[InstCombine] add helper function for sub-of-min/max folds; NFC

The test diffs are cosmetic -- but improvements -- because we
let instcombine handle replacement. Instead of dropping the
old value name, it propagates to the new instruction.

Joseph Huber [Mon, 4 Jul 2022 21:32:47 +0000 (17:32 -0400)]

[OffloadPackager] Use appropriate kind for LTO bitcode

Summary:
Currently we just check the extension to set the image kind. This
incorrectly labels the `.o` files created during LTO as object files.
This patch simply adds a check for the bitcode magic bytes instead.
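
A content-based check of this kind might look like the sketch below. It is not the OffloadPackager code: the function name and the returned kind strings are hypothetical. It assumes the LLVM bitcode magics, `BC` followed by `0xC0DE` for raw bitcode and `0x0B17C0DE` (stored little-endian) for the wrapper format.

```python
RAW_BITCODE_MAGIC = b"BC\xc0\xde"      # 'B' 'C' 0xC0 0xDE
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"    # 0x0B17C0DE stored little-endian


def image_kind(filename: str, contents: bytes) -> str:
    """Pick an image kind from the file contents first, the extension second."""
    if contents.startswith((RAW_BITCODE_MAGIC, WRAPPER_MAGIC)):
        return "bitcode"
    if filename.endswith(".o"):
        return "object"
    return "unknown"


# An LTO-produced ".o" that actually holds bitcode is labeled by its contents.
print(image_kind("kernel.o", b"BC\xc0\xde" + b"\x00" * 8))
```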

Jonas Hahnfeld [Mon, 4 Jul 2022 17:27:49 +0000 (19:27 +0200)]

[Orc][LLJIT] Use JITLink on RISC-V

RuntimeDyld does not support RISC-V, so it makes sense to enable
JITLink by default. This also makes relocations work without support
for a large code model.

Differential Revision: https://reviews.llvm.org/D129092

Simon Pilgrim [Mon, 4 Jul 2022 20:43:40 +0000 (21:43 +0100)]

[X86] Regenerate fold-tied-op.ll test checks

Florian Hahn [Mon, 4 Jul 2022 20:37:16 +0000 (21:37 +0100)]

[LV] Consider runtime checks profitable if scalar cost is zero.

This fixes a UBSan failure after 644a965c1efef. When using
user-provided VFs/ICs (via the force-vector-width /
force-vector-interleave options) the scalar cost is zero, which would
cause divide-by-zero.

When forcing vectorization using the options, the cost of the runtime
checks should not block vectorization.
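
The guard can be pictured with a toy cost model. Everything in this sketch is hypothetical, including the function name, the formula and the `budget_fraction` parameter; it is not LoopVectorize's actual computation. It only shows how dividing by a zero scalar cost is avoided by treating the runtime checks as always profitable.

```python
def min_profitable_trip_count(rt_check_cost: int, scalar_cost: int,
                              budget_fraction: int = 10) -> int:
    """Smallest trip count at which the checks cost at most 1/budget_fraction
    of the scalar loop (a hypothetical model, not LLVM's formula)."""
    if scalar_cost == 0:
        # Forced VF/IC: the scalar cost is zero, so dividing would fail.
        # Treat the runtime checks as profitable for any trip count.
        return 0
    # Ceiling division without floating point.
    return -(-rt_check_cost * budget_fraction // scalar_cost)


print(min_profitable_trip_count(7, 4), min_profitable_trip_count(7, 0))
```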

Nico Weber [Sun, 3 Jul 2022 20:14:48 +0000 (22:14 +0200)]

[clang-format] Update documentation

- Update `clang-format --help` output after b1f0efc06acc.
- Update `clang-format-diff.py` help text, which apparently hasn't
been updated in a while. Since git and svn examples are now part
of the help text, remove them in the text following the help text.

Differential Revision: https://reviews.llvm.org/D129050

owenca [Sun, 3 Jul 2022 23:42:00 +0000 (16:42 -0700)]

[clang-format] Break on AfterColon only if not followed by comment

Break after a constructor initializer colon only if it's not followed by a
comment on the same line.

Fixes #41128.
Fixes #43246.

Differential Revision: https://reviews.llvm.org/D129057

Valentin Clement [Mon, 4 Jul 2022 19:16:13 +0000 (21:16 +0200)]

[flang] Make code more homogenous in CodeGen

This patch just makes the code more uniform
across conversions.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D129071

Sam McCall [Fri, 24 Jun 2022 01:01:45 +0000 (03:01 +0200)]

[pseudo] Store shift and goto actions in a compact structure with faster lookup.

The actions table is very compact but the binary search to find the
correct action is relatively expensive.
A hashtable is faster but pretty large (64 bits per value, plus empty
slots, and lookup is constant time but not trivial due to collisions).

The structure in this patch uses 1.25 bits per entry (whether present or absent)
plus the size of the values, and lookup is trivial.

The Shift table is 119KB = 27KB values + 92KB keys.
The Goto table is 86KB = 30KB values + 57KB keys.
(Goto has a smaller keyspace as #nonterminals < #terminals, and more entries).

This patch improves glrParse speed by 28%: 4.69 => 5.99 MB/s
Overall the table grows by 60%: 142 => 228KB.

By comparison, DenseMap<unsigned, StateID> is "only" 16% faster (5.43 MB/s),
and results in a 285% larger table (547 KB) vs the baseline.

Differential Revision: https://reviews.llvm.org/D128485
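
One way to get roughly one bit per key plus a small per-word overhead is a presence bitmap with per-word rank checkpoints. The sketch below illustrates that general technique only; the class name and layout are assumptions, and it is not claimed to match the structure used in the patch.

```python
class RankBitmapTable:
    """Sparse key -> value table: presence bitmap plus per-word checkpoints."""

    WORD_BITS = 64

    def __init__(self, entries: dict, key_space: int):
        self.key_space = key_space
        n_words = (key_space + self.WORD_BITS - 1) // self.WORD_BITS
        self.words = [0] * n_words
        for key in entries:
            self.words[key // self.WORD_BITS] |= 1 << (key % self.WORD_BITS)
        # checkpoints[i] = number of present keys in all words before word i.
        self.checkpoints = []
        total = 0
        for word in self.words:
            self.checkpoints.append(total)
            total += bin(word).count("1")
        # Values are stored densely, ordered by key.
        self.values = [entries[key] for key in sorted(entries)]

    def get(self, key: int):
        if not 0 <= key < self.key_space:
            return None
        word_index, bit = divmod(key, self.WORD_BITS)
        word = self.words[word_index]
        if not (word >> bit) & 1:
            return None  # key absent
        # Rank = checkpoint + popcount of the set bits below this key's bit.
        below = word & ((1 << bit) - 1)
        return self.values[self.checkpoints[word_index] + bin(below).count("1")]


table = RankBitmapTable({3: "shift 7", 64: "shift 9", 130: "goto 2"}, key_space=200)
print(table.get(64), table.get(4))
```

Lookup needs one bit test, one checkpoint read and one popcount, with no probing or binary search, which is the kind of "trivial" lookup the message contrasts with a hash table.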

Jeff Bailey [Sun, 3 Jul 2022 03:42:58 +0000 (03:42 +0000)]

Use add_llvm_install_targets for install-llvmlibc

Using the LLVM rules for install ensures that DESTDIR and other expected
variables for an LLVM install work correctly.

Tested:
Manually with DESTDIR=/tmp/testinstall/ ninja install-llvmlibc

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D129041

Benoit Jacob [Mon, 4 Jul 2022 15:33:50 +0000 (15:33 +0000)]

CombineContractBroadcast should not create dims unused in LHS+RHS

Differential Revision: https://reviews.llvm.org/D129087

Florian Hahn [Mon, 4 Jul 2022 16:23:47 +0000 (17:23 +0100)]

[LV] Add back CantReorderMemOps remark.

Add back remark unintentionally dropped by 644a965c1efef68f.

I will add a LV test separately, so we do not have to rely on a Clang
test to catch this.

Nicolas Vasilache [Mon, 4 Jul 2022 16:00:03 +0000 (09:00 -0700)]

[mlir][Linalg][NFC] Make getReassociationMapForFoldingUnitDims a visible helper function

Sander de Smalen [Mon, 4 Jul 2022 15:47:36 +0000 (15:47 +0000)]

[AArch64] Add support for insert/extract for nxv1i1 types.

This patch adds patterns and tests for subvector insert/extract
intrinsics to/from all legal predicate types.

Reviewed By: david-arm, kmclaughlin

Differential Revision: https://reviews.llvm.org/D128975

Craig Topper [Mon, 4 Jul 2022 15:33:21 +0000 (08:33 -0700)]

[X86] Disable combineVectorSizedSetCCEquality for soft float.

The vector types aren't legal with soft float.
Also disable under NoImplicitFloat for good measure.

Fixes PR56351.

Differential Revision: https://reviews.llvm.org/D129060

Shraiysh Vaishay [Mon, 4 Jul 2022 08:22:35 +0000 (13:52 +0530)]

[mlir][OpenMP] omp.task translation to LLVM IR

This patch adds translation for omp.task from OpenMPDialect to LLVM IR
Dialect and adds tests for the same.

Depends on D71989

Reviewed By: ftynse, kiranchandramohan, peixin, Meinersbur

Differential Revision: https://reviews.llvm.org/D123919

Sanjay Patel [Mon, 4 Jul 2022 14:54:16 +0000 (10:54 -0400)]

[SLP] add test for load combining + shuffling; NFC

issue #38821

Nikita Popov [Mon, 4 Jul 2022 14:45:13 +0000 (16:45 +0200)]

[InstCombine] Avoid ConstantExpr::get() in phi binop fold

Use ConstantFoldBinaryOpOperands() instead, in preparation for not
all binops having a supported constant expression.

commit | commitdiff | tree

Nikita Popov [Mon, 4 Jul 2022 14:40:07 +0000 (16:40 +0200)]

[Bitcode] Use bitcode input for test (NFC)

The constant expression used in the test will become invalid in
the future. Convert the input into bitcode, so we test that auto-
upgrade happens gracefully once this is the case.

commit | commitdiff | tree

Florian Hahn [Mon, 4 Jul 2022 14:20:52 +0000 (15:20 +0100)]

[LTO] Update remark test after 644a965c1efef6.

commit | commitdiff | tree

Peter Waller [Mon, 4 Jul 2022 14:06:38 +0000 (14:06 +0000)]

[LoopVectorize][NFC] Reinstate TTICapture workaround for gcc-6

Fixes #56374.

commit | commitdiff | tree

luxufan [Sun, 19 Jun 2022 12:01:25 +0000 (20:01 +0800)]

[RISCV] Add ADDI instr for computing FrameIndex address

RVV doesn't have an immediate field for memory addressing. Currently
we build MachineInstructions in PEI to compute the stack offset for
RVV load/store instructions. These instructions are added too late to
be optimized by passes such as CSE and LICM.

This patch prevents FrameIndex SDNodes from being matched in the RVV
load/store instruction selection patterns, so that the FrameIndex SDNodes
are instead selected as `ADDI GPR, targetframeindex`.

This change has two advantages:
1. Address computation for stack objects can be optimized by machine function
passes.
2. Since the ADDI instruction's destination register can be used as a
temp register, we can save an emergency spill slot.

Differential Revision: https://reviews.llvm.org/D128187

commit | commitdiff | tree

Florian Hahn [Mon, 4 Jul 2022 14:10:48 +0000 (15:10 +0100)]

[LV] Vectorize cases with larger number of RT checks, execute only if profitable.

This patch replaces the tight hard cut-off for the number of runtime
checks with a more accurate cost-driven approach.

The new approach allows vectorization with a larger number of runtime
checks in general, but only executes the vector loop (and runtime checks) if
considered profitable at runtime. Profitable here means that the cost-model
indicates that the runtime check cost + vector loop cost < scalar loop cost.

To do that, LV computes the minimum trip count for which runtime check cost
+ vector-loop-cost < scalar loop cost.
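
As a rough illustration of the inequality above, the minimum trip count can be derived by rearranging runtime check cost + vector loop cost < scalar loop cost. This is only a sketch under simplifying assumptions: the struct, field names, and function are hypothetical and not taken from LoopVectorize, costs are plain integers, and the scalar remainder loop is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical cost inputs; none of these names come from LoopVectorize.
struct LoopCosts {
  uint64_t ScalarIterCost;   // cost of one scalar iteration
  uint64_t VectorIterCost;   // cost of one vector iteration (covers VF elements)
  uint64_t RuntimeCheckCost; // one-off cost of all runtime checks
  uint64_t VF;               // vectorization factor
};

// Smallest trip count TC for which
//   RuntimeCheckCost + VectorIterCost * TC / VF < ScalarIterCost * TC.
// Returns std::nullopt when the vector loop is never cheaper per element.
std::optional<uint64_t> minProfitableTripCount(const LoopCosts &C) {
  assert(C.VF > 0 && "vectorization factor must be positive");
  uint64_t ScalarPerVF = C.ScalarIterCost * C.VF;
  if (ScalarPerVF <= C.VectorIterCost)
    return std::nullopt;
  // Rearranged: TC > RuntimeCheckCost * VF / (ScalarIterCost * VF - VectorIterCost).
  uint64_t Num = C.RuntimeCheckCost * C.VF;
  uint64_t Den = ScalarPerVF - C.VectorIterCost;
  return Num / Den + 1; // strict inequality, so step past the floor
}
```

With a scalar cost of 4, a vector cost of 6, a check cost of 20 and VF = 4, the break-even point is 8 iterations, so the sketch reports 9 as the first profitable trip count.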

Note that there is still a hard cut-off to avoid excessive compile-time/code-size
increases, but it is much larger than the original limit.

The performance impact on standard test-suites like SPEC2006/SPEC2017/MultiSource
is mostly neutral, but the new approach can give substantial gains in cases where
we failed to vectorize before due to the over-aggressive cut-offs.

On AArch64 with -O3, I didn't observe any regressions outside the noise level (<0.4%)
and there are the following execution time improvements. Both `IRSmk` and `srad` are relatively short running, but the changes are far above the noise level for them on my benchmark system.

```
CFP2006/447.dealII/447.dealII    -1.9%
CINT2017rate/525.x264_r/525.x264_r    -2.2%
ASC_Sequoia/IRSmk/IRSmk       -9.2%
Rodinia/srad/srad     -36.1%
```

`size` regressions on AArch64 with -O3 are

```
MultiSource/Applications/hbd/hbd                 90256.00   106768.00 18.3%
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000     240676.00   257268.00  6.9%
MultiSourc...enchmarks/mafft/pairlocalalign     472603.00   489131.00  3.5%
External/S...2017rate/525.x264_r/525.x264_r     613831.00   630343.00  2.7%
External/S...NT2006/464.h264ref/464.h264ref     818920.00   835448.00  2.0%
External/S...te/538.imagick_r/538.imagick_r    1994730.00  2027754.00  1.7%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4    1236471.00  1253015.00  1.3%
MultiSource/Applications/oggenc/oggenc         2108147.00  2124675.00  0.8%
External/S.../CFP2006/447.dealII/447.dealII    4742999.00  4759559.00  0.3%
External/S...rate/510.parest_r/510.parest_r   14206377.00 14239433.00  0.2%
```

Reviewed By: lebedev.ri, ebrevnov, dmgreen

Differential Revision: https://reviews.llvm.org/D109368

commit | commitdiff | tree

Stella Laurenzo [Mon, 4 Jul 2022 14:06:16 +0000 (07:06 -0700)]

Fix MLIR Python CMake bug causing duplicate sources target.

The refactor in https://reviews.llvm.org/D128230 introduced a new target and the name is not scoped properly, leading to name collisions on larger projects. It is done properly on the target just below, so applying the same pattern here fixes the issue.

commit | commitdiff | tree

Nikita Popov [Mon, 4 Jul 2022 14:01:12 +0000 (16:01 +0200)]

[BPI] Avoid ConstantExpr::get()

Use ConstantFoldBinaryOpOperands() instead, to prepare for the case
where not all binary operators have a constant expression form.

I believe this code actually intended to set OnlyIfReduced=true,
however ConstantExpr::get() actually accepts a Flags argument at
that position (and OnlyIfReducedTy as the next argument), so this
ended up creating a constant expression with some random flag
(probably exact or nuw depending on which).
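
The mix-up described above can be reproduced with a small stand-in. The function below is hypothetical and only mirrors the parameter order mentioned in the message (a flags word followed by the "only if reduced" argument); it is not the LLVM API itself.

```cpp
#include <cassert>

// Hypothetical stand-ins; only the parameter order mirrors the description.
struct Type {};

struct BinOpRequest {
  unsigned Flags;
  const Type *OnlyIfReducedTy;
};

BinOpRequest getBinOp(unsigned Opcode, unsigned Flags = 0,
                      const Type *OnlyIfReducedTy = nullptr) {
  (void)Opcode; // the opcode does not matter for this illustration
  return {Flags, OnlyIfReducedTy};
}

// The caller presumably meant "OnlyIfReduced=true", but the bool is
// implicitly converted to unsigned and lands in the Flags slot.
BinOpRequest buggyCall() { return getBinOp(/*Opcode=*/13, true); }

void exampleUsage() {
  BinOpRequest R = buggyCall();
  assert(R.Flags == 1);                 // `true` became flag bit 0
  assert(R.OnlyIfReducedTy == nullptr); // the intended argument was never set
}
```

Because the conversion from bool to unsigned is implicit, a call like this compiles without complaint, which is presumably how the stray flag went unnoticed.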

commit | commitdiff | tree

Valentin Clement [Mon, 4 Jul 2022 14:02:42 +0000 (16:02 +0200)]

[flang] Avoid segfault when defining op is not a fir::Convert

The previous code made the assumption that the defining
operation is a fir::ConvertOp without checking. This results in a
segmentation fault in code like the added test.
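
A checked cast avoids this class of crash. The sketch below uses a hypothetical miniature op hierarchy and `dynamic_cast` in place of the MLIR/FIR classes and casting utilities, so it illustrates the pattern rather than the actual fix.

```cpp
#include <cassert>

// Hypothetical miniature op hierarchy; the real code uses MLIR/FIR classes.
struct Operation {
  virtual ~Operation() = default;
};
struct ConvertOp : Operation {
  int Source = 0;
};
struct LoadOp : Operation {};

// Returns the conversion's source, or Fallback when the defining op is
// missing or is some other kind of operation.
int sourceOrFallback(const Operation *DefiningOp, int Fallback) {
  if (const auto *Conv = dynamic_cast<const ConvertOp *>(DefiningOp))
    return Conv->Source;
  return Fallback;
}

void exampleUsage() {
  ConvertOp Conv;
  Conv.Source = 7;
  LoadOp Load;
  assert(sourceOrFallback(&Conv, -1) == 7);
  assert(sourceOrFallback(&Load, -1) == -1);   // not a conversion
  assert(sourceOrFallback(nullptr, -1) == -1); // no defining op at all
}
```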

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D129077

commit | commitdiff | tree

Tue Ly [Sat, 2 Jul 2022 08:50:31 +0000 (08:50 +0000)]

[libc] Add a separate algorithm_test.

Differential Revision: https://reviews.llvm.org/D128994

commit | commitdiff | tree

gbreynoo [Mon, 4 Jul 2022 13:21:45 +0000 (14:21 +0100)]

[llvm-ar][test] Add additional MRI script testing

This commit adds:
- Additional test coverage of the DELETE and END commands.
- File names to be read in the line endings test.
- A use of ADDLIB in the nonascii test.

Differential Revision: https://reviews.llvm.org/D128838

commit | commitdiff | tree

David Green [Mon, 4 Jul 2022 13:22:50 +0000 (14:22 +0100)]

[SLP] Peek into loads when hitting the RecursionMaxDepth

This patch slightly extends the limit on the RecursionMaxDepth inside
the SLP vectorizer. It does so only when it hits a load (or zext/sext of
a load), which allows it to peek through in the places where it will be
the most valuable, without ballooning out the O(..) by any 2^n factors.
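
One way to picture the exception is a depth check that is waived for loads and for extensions of loads. The node model and function names below are hypothetical and much simpler than the SLP vectorizer's own data structures; this is a sketch of the idea only.

```cpp
#include <cassert>

// Hypothetical node model, far simpler than the vectorizer's tree entries.
enum class Kind { Load, ZExt, SExt, Other };

struct Node {
  Kind K;
  const Node *Operand; // single operand, or nullptr for leaves
};

// True for a load, or for a zext/sext whose operand is a load.
bool isLoadLike(const Node &N) {
  if (N.K == Kind::Load)
    return true;
  bool IsExt = N.K == Kind::ZExt || N.K == Kind::SExt;
  return IsExt && N.Operand && N.Operand->K == Kind::Load;
}

// Stop recursing at the depth limit unless the node is load-like. Loads are
// leaves, so the extra step cannot fan out exponentially.
bool shouldStop(unsigned Depth, unsigned MaxDepth, const Node &N) {
  assert(MaxDepth > 0 && "depth limit must be positive");
  return Depth >= MaxDepth && !isLoadLike(N);
}
```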

Differential Revision: https://reviews.llvm.org/D122148

commit | commitdiff | tree

Nikita Popov [Mon, 4 Jul 2022 13:17:22 +0000 (15:17 +0200)]

[Reassociate] Avoid ConstantExpr::get()

Use ConstantFoldBinaryOpOperands() instead, to handle the case
where not all binary ops have a constant expression variant.

This is a bit awkward because we only want to pop the element from
Ops once we're sure that it has folded.
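
The "pop only after a successful fold" ordering can be sketched as follows. The folder here is a hypothetical stand-in that refuses division by zero, playing the role of a binary op without a constant expression form; it is not the Reassociate code.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical folder: refuses to fold a division by zero, standing in for a
// binary op that cannot be folded to a constant.
std::optional<int> tryFoldDiv(int LHS, int RHS) {
  if (RHS == 0)
    return std::nullopt;
  return LHS / RHS;
}

// Folds the last operand into Acc. Ops is left untouched when folding fails.
bool foldLastInto(std::vector<int> &Ops, int &Acc) {
  if (Ops.empty())
    return false;
  std::optional<int> Folded = tryFoldDiv(Acc, Ops.back());
  if (!Folded)
    return false; // keep the operand in Ops for later handling
  Acc = *Folded;
  Ops.pop_back(); // only now is it safe to drop the operand
  return true;
}

void exampleUsage() {
  std::vector<int> Ops{0, 4};
  int Acc = 20;
  assert(foldLastInto(Ops, Acc));  // 20 / 4 folds, operand is popped
  assert(Acc == 5 && Ops.size() == 1);
  assert(!foldLastInto(Ops, Acc)); // 5 / 0 does not fold, nothing is popped
  assert(Acc == 5 && Ops.size() == 1);
}
```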
