review.tizen.org Git - platform/upstream/llvm.git/log

Joseph Huber [Mon, 4 Jul 2022 21:32:47 +0000 (17:32 -0400)]

[OffloadPackager] Use appropriate kind for LTO bitcode

Summary:
Currently we just check the extension to set the image kind. This
incorrectly labels the `.o` files created during LTO as object files.
This patch simply adds a check for the bitcode magic bytes instead.
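
A minimal sketch of the idea, not the actual OffloadPackager code: rather than trusting a `.o` extension, inspect the first bytes of the buffer. Raw LLVM bitcode begins with the bytes `'B' 'C' 0xC0 0xDE`. The enum and function names below are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical image kinds, for illustration only.
enum class ImageKind { Object, Bitcode };

// Returns true if the buffer begins with the raw bitcode magic 'B' 'C' 0xC0 0xDE.
inline bool startsWithBitcodeMagic(const std::string &Buffer) {
  static const unsigned char Magic[4] = {'B', 'C', 0xC0, 0xDE};
  if (Buffer.size() < 4)
    return false;
  for (std::size_t I = 0; I < 4; ++I)
    if (static_cast<unsigned char>(Buffer[I]) != Magic[I])
      return false;
  return true;
}

// Classify by content rather than by file extension.
inline ImageKind classifyImage(const std::string &Buffer) {
  return startsWithBitcodeMagic(Buffer) ? ImageKind::Bitcode
                                        : ImageKind::Object;
}
```

With this approach a `.o` file produced during LTO is still classified as bitcode, because only its contents are examined.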

Jonas Hahnfeld [Mon, 4 Jul 2022 17:27:49 +0000 (19:27 +0200)]

[Orc][LLJIT] Use JITLink on RISC-V

RuntimeDyld does not support RISC-V, so it makes sense to enable
JITLink by default. This also makes relocations work without support
for a large code model.

Differential Revision: https://reviews.llvm.org/D129092

Simon Pilgrim [Mon, 4 Jul 2022 20:43:40 +0000 (21:43 +0100)]

[X86] Regenerate fold-tied-op.ll test checks

Florian Hahn [Mon, 4 Jul 2022 20:37:16 +0000 (21:37 +0100)]

[LV] Consider runtime checks profitable if scalar cost is zero.

This fixes a UBSan failure after 644a965c1efef. When using
user-provided VFs/ICs (via the force-vector-width /
force-vector-interleave options) the scalar cost is zero, which would
cause a division by zero.

When forcing vectorization using the options, the cost of the runtime
checks should not block vectorization.

Nico Weber [Sun, 3 Jul 2022 20:14:48 +0000 (22:14 +0200)]

[clang-format] Update documentation

- Update `clang-format --help` output after b1f0efc06acc.
- Update `clang-format-diff.py` help text, which apparently hasn't
been updated in a while. Since git and svn examples are now part
of the help text, remove them in the text following the help text.

Differential Revision: https://reviews.llvm.org/D129050

owenca [Sun, 3 Jul 2022 23:42:00 +0000 (16:42 -0700)]

[clang-format] Break on AfterColon only if not followed by comment

Break after a constructor initializer colon only if it's not followed by a
comment on the same line.

Fixes #41128.
Fixes #43246.

Differential Revision: https://reviews.llvm.org/D129057

Valentin Clement [Mon, 4 Jul 2022 19:16:13 +0000 (21:16 +0200)]

[flang] Make code more homogenous in CodeGen

This patch just makes the code more similar
in each conversion.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D129071

Sam McCall [Fri, 24 Jun 2022 01:01:45 +0000 (03:01 +0200)]

[pseudo] Store shift and goto actions in a compact structure with faster lookup.

The actions table is very compact but the binary search to find the
correct action is relatively expensive.
A hashtable is faster but pretty large (64 bits per value, plus empty
slots, and lookup is constant time but not trivial due to collisions).

The structure in this patch uses 1.25 bits per entry (whether present or absent)
plus the size of the values, and lookup is trivial.
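
As a rough illustration, here is one layout that yields 1.25 bits per key: a presence bitmap (1 bit per key) plus a 16-bit running count per 64-key word (16/64 = 0.25 bit per key). This is a sketch under that assumption and may differ from the patch's layout. `CompactTable` and its members are hypothetical names.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Sparse map from dense keys [0, NumKeys) to 16-bit values.
// Limited to fewer than 65536 present entries by the 16-bit counts.
class CompactTable {
  std::vector<uint64_t> HasValue;    // presence bitmap, 1 bit per key
  std::vector<uint16_t> Checkpoints; // set bits before each 64-key word
  std::vector<uint16_t> Values;      // values of present keys, in key order

public:
  // Sorted must be ordered by key and contain no duplicate keys.
  CompactTable(unsigned NumKeys,
               const std::vector<std::pair<unsigned, uint16_t>> &Sorted)
      : HasValue((NumKeys + 63) / 64, 0), Checkpoints((NumKeys + 63) / 64, 0) {
    for (const auto &KV : Sorted) {
      HasValue[KV.first / 64] |= uint64_t(1) << (KV.first % 64);
      Values.push_back(KV.second);
    }
    // Running count of present keys before each word.
    unsigned Count = 0;
    for (std::size_t W = 0; W < HasValue.size(); ++W) {
      Checkpoints[W] = static_cast<uint16_t>(Count);
      Count += static_cast<unsigned>(std::bitset<64>(HasValue[W]).count());
    }
  }

  // Returns a pointer to the value for Key, or nullptr if the key is absent.
  const uint16_t *find(unsigned Key) const {
    unsigned Word = Key / 64, Bit = Key % 64;
    if (Word >= HasValue.size() || !((HasValue[Word] >> Bit) & 1))
      return nullptr;
    // Rank = set bits before this word + set bits below Bit in this word.
    uint64_t Below = HasValue[Word] & ((uint64_t(1) << Bit) - 1);
    return &Values[Checkpoints[Word] + std::bitset<64>(Below).count()];
  }
};
```

Lookup is one bit test, one table read, and one population count, with no search and no collision handling.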

The Shift table is 119KB = 27KB values + 92KB keys.
The Goto table is 86KB = 30KB values + 57KB keys.
(Goto has a smaller keyspace as #nonterminals < #terminals, and more entries).

This patch improves glrParse speed by 28%: 4.69 => 5.99 MB/s
Overall the table grows by 60%: 142 => 228KB.

By comparison, DenseMap<unsigned, StateID> is "only" 16% faster (5.43 MB/s),
and results in a 285% larger table (547 KB) vs the baseline.

Differential Revision: https://reviews.llvm.org/D128485

Jeff Bailey [Sun, 3 Jul 2022 03:42:58 +0000 (03:42 +0000)]

Use add_llvm_install_targets for install-llvmlibc

Using the LLVM rules for install ensures that DESTDIR and other expected
variables for an LLVM install work correctly.

Tested:
Manually with DESTDIR=/tmp/testinstall/ ninja install-llvmlibc

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D129041

Benoit Jacob [Mon, 4 Jul 2022 15:33:50 +0000 (15:33 +0000)]

CombineContractBroadcast should not create dims unused in LHS+RHS

Differential Revision: https://reviews.llvm.org/D129087

Florian Hahn [Mon, 4 Jul 2022 16:23:47 +0000 (17:23 +0100)]

[LV] Add back CantReorderMemOps remark.

Add back remark unintentionally dropped by 644a965c1efef68f.

I will add a LV test separately, so we do not have to rely on a Clang
test to catch this.

Nicolas Vasilache [Mon, 4 Jul 2022 16:00:03 +0000 (09:00 -0700)]

[mlir][Linalg][NFC] Make getReassociationMapForFoldingUnitDims a visible helper function

Sander de Smalen [Mon, 4 Jul 2022 15:47:36 +0000 (15:47 +0000)]

[AArch64] Add support for insert/extract for nxv1i1 types.

This patch adds patterns and tests for subvector insert/extract
intrinsics to/from all legal predicate types.

Reviewed By: david-arm, kmclaughlin

Differential Revision: https://reviews.llvm.org/D128975

Craig Topper [Mon, 4 Jul 2022 15:33:21 +0000 (08:33 -0700)]

[X86] Disable combineVectorSizedSetCCEquality for soft float.

The vector types aren't legal with soft float.
Also disable under NoImplicitFloat for good measure.

Fixes PR56351.

Differential Revision: https://reviews.llvm.org/D129060

Shraiysh Vaishay [Mon, 4 Jul 2022 08:22:35 +0000 (13:52 +0530)]

[mlir][OpenMP] omp.task translation to LLVM IR

This patch adds translation for omp.task from OpenMPDialect to LLVM IR
Dialect and adds tests for the same.

Depends on D71989

Reviewed By: ftynse, kiranchandramohan, peixin, Meinersbur

Differential Revision: https://reviews.llvm.org/D123919

Sanjay Patel [Mon, 4 Jul 2022 14:54:16 +0000 (10:54 -0400)]

[SLP] add test for load combining + shuffling; NFC

issue #38821

Nikita Popov [Mon, 4 Jul 2022 14:45:13 +0000 (16:45 +0200)]

[InstCombine] Avoid ConstantExpr::get() in phi binop fold

Use ConstantFoldBinaryOpOperands() instead, in preparation for not
all binops having a supported constant expression.

Nikita Popov [Mon, 4 Jul 2022 14:40:07 +0000 (16:40 +0200)]

[Bitcode] Use bitcode input for test (NFC)

The constant expression used in the test will become invalid in
the future. Convert the input into bitcode, so we test that auto-
upgrade happens gracefully once this is the case.

Florian Hahn [Mon, 4 Jul 2022 14:20:52 +0000 (15:20 +0100)]

[LTO] Update remark test after 644a965c1efef6.

Peter Waller [Mon, 4 Jul 2022 14:06:38 +0000 (14:06 +0000)]

[LoopVectorize][NFC] Reinstate TTICapture workaround for gcc-6

Fixes #56374.

luxufan [Sun, 19 Jun 2022 12:01:25 +0000 (20:01 +0800)]

[RISCV] Add ADDI instr for computing FrameIndex address

RVV doesn't have an immediate field for memory addressing. Currently
we build MachineInstructions in PEI to compute the stack offset for
RVV load/store instructions. These instructions are added too late to
be optimized by passes such as CSE and LICM.

This patch prevents FrameIndex SDNodes from being matched in RVV load/store
instruction selection patterns, so that the FrameIndex SDNodes are
selected as `ADDI GPR, targetframeindex`.

This change has two advantages:
1. Address computation for stack objects can be optimized by machine
function passes.
2. Since the ADDI instruction's destination register can be used as a
temp register, we can save an emergency spill slot.

Differential Revision: https://reviews.llvm.org/D128187

Florian Hahn [Mon, 4 Jul 2022 14:10:48 +0000 (15:10 +0100)]

[LV] Vectorize cases with larger number of RT checks, execute only if profitable.

This patch replaces the tight hard cut-off for the number of runtime
checks with a more accurate cost-driven approach.

The new approach allows vectorization with a larger number of runtime
checks in general, but only executes the vector loop (and runtime checks) if
considered profitable at runtime. Profitable here means that the cost-model
indicates that the runtime check cost + vector loop cost < scalar loop cost.

To do that, LV computes the minimum trip count for which runtime check cost
+ vector-loop-cost < scalar loop cost.
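
The inequality above can be turned into a threshold with a little algebra. The sketch below is a simplification, not the vectorizer's actual cost code. It assumes `VecC` is the cost of one vector iteration covering `VF` scalar iterations, and it ignores epilogue and remainder handling. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Smallest trip count TC satisfying
//   RtC + VecC * TC / VF  <  ScalarC * TC.
// Multiplying through by VF gives
//   RtC * VF  <  TC * (ScalarC * VF - VecC).
inline uint64_t minProfitableTripCount(uint64_t RtC, uint64_t ScalarC,
                                       uint64_t VecC, uint64_t VF) {
  assert(VF > 0 && ScalarC * VF > VecC &&
         "vector loop must be cheaper per scalar iteration");
  uint64_t Gain = ScalarC * VF - VecC; // saving per VF scalar iterations
  // Smallest integer strictly greater than RtC * VF / Gain.
  return RtC * VF / Gain + 1;
}
```

For example, with a runtime check cost of 20, a scalar cost of 4, a vector cost of 8 and VF = 4, trip count 10 only breaks even (40 vs. 40), so the threshold is 11.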

Note that there is still a hard cut-off to avoid excessive compile-time/code-size
increases, but it is much larger than the original limit.

The performance impact on standard test-suites like SPEC2006/SPEC2017/MultiSource
is mostly neutral, but the new approach can give substantial gains in cases where
we failed to vectorize before due to the over-aggressive cut-offs.

On AArch64 with -O3, I didn't observe any regressions outside the noise level (<0.4%)
and there are the following execution time improvements. Both `IRSmk` and `srad` are relatively short running, but the changes are far above the noise level for them on my benchmark system.

```
CFP2006/447.dealII/447.dealII    -1.9%
CINT2017rate/525.x264_r/525.x264_r    -2.2%
ASC_Sequoia/IRSmk/IRSmk       -9.2%
Rodinia/srad/srad     -36.1%
```

`size` regressions on AArch64 with -O3 are

```
MultiSource/Applications/hbd/hbd                 90256.00   106768.00 18.3%
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000     240676.00   257268.00  6.9%
MultiSourc...enchmarks/mafft/pairlocalalign     472603.00   489131.00  3.5%
External/S...2017rate/525.x264_r/525.x264_r     613831.00   630343.00  2.7%
External/S...NT2006/464.h264ref/464.h264ref     818920.00   835448.00  2.0%
External/S...te/538.imagick_r/538.imagick_r    1994730.00  2027754.00  1.7%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4    1236471.00  1253015.00  1.3%
MultiSource/Applications/oggenc/oggenc         2108147.00  2124675.00  0.8%
External/S.../CFP2006/447.dealII/447.dealII    4742999.00  4759559.00  0.3%
External/S...rate/510.parest_r/510.parest_r   14206377.00 14239433.00  0.2%
```

Reviewed By: lebedev.ri, ebrevnov, dmgreen

Differential Revision: https://reviews.llvm.org/D109368

Stella Laurenzo [Mon, 4 Jul 2022 14:06:16 +0000 (07:06 -0700)]

Fix MLIR Python CMake bug causing duplicate sources target.

The refactor in https://reviews.llvm.org/D128230 introduced a new target and the name is not scoped properly, leading to name collisions on larger projects. It is done properly on the target just below, so applying the same pattern here fixes the issue.

Nikita Popov [Mon, 4 Jul 2022 14:01:12 +0000 (16:01 +0200)]

[BPI] Avoid ConstantExpr::get()

Use ConstantFoldBinaryOpOperands() instead, to prepare for the case
where not all binary operators have a constant expression form.

I believe this code actually intended to set OnlyIfReduced=true,
however ConstantExpr::get() actually accepts a Flags argument at
that position (and OnlyIfReducedTy as the next argument), so this
ended up creating a constant expression with some random flag
(probably exact or nuw depending on which).
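
The pitfall described here is easy to reproduce with a mock. The function below is a hypothetical stand-in that only mirrors the parameter order from the message (a flags word, then the "only if reduced" argument); it is not the LLVM API.

```cpp
// What the callee ends up seeing, for illustration.
struct FoldRequest {
  unsigned Flags;
  bool OnlyIfReduced;
};

// Hypothetical stand-in: Flags precedes the "only if reduced" parameter.
inline FoldRequest getExpr(unsigned /*Opcode*/, int /*LHS*/, int /*RHS*/,
                           unsigned Flags = 0,
                           const void *OnlyIfReducedTy = nullptr) {
  return {Flags, OnlyIfReducedTy != nullptr};
}

// A caller that meant "only if reduced" but passed `true` positionally:
// the bool silently converts to Flags == 1 and the intended option stays off.
inline FoldRequest buggyCall() { return getExpr(13, 2, 3, true); }
```

Because `bool` converts implicitly to `unsigned`, the compiler accepts the call without a diagnostic, which is why the mistake went unnoticed.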

Valentin Clement [Mon, 4 Jul 2022 14:02:42 +0000 (16:02 +0200)]

[flang] Avoid segfault when defining op is not a fir::Convert

The previous code made the assumption that the defining
operation is a fir::ConvertOp without checking. This results in
segmentation fault in code like the added test.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D129077

Tue Ly [Sat, 2 Jul 2022 08:50:31 +0000 (08:50 +0000)]

[libc] Add a separate algorithm_test.

Differential Revision: https://reviews.llvm.org/D128994

gbreynoo [Mon, 4 Jul 2022 13:21:45 +0000 (14:21 +0100)]

[llvm-ar][test] Add additional MRI script testing

This commit adds:
- Additional test coverage of the DELETE and END commands.
- File names to be read in the line endings test.
- A use of ADDLIB in the nonascii test.

Differential Revision: https://reviews.llvm.org/D128838

David Green [Mon, 4 Jul 2022 13:22:50 +0000 (14:22 +0100)]

[SLP] Peek into loads when hitting the RecursionMaxDepth

This patch slightly extends the limit on the RecursionMaxDepth inside
the SLP vectorizer. It does it only when it hits a load (or zext/sext of
a load), which allows it to peek through in the places where it will be
the most valuable, without ballooning out the O(..) by any 2^n factors.

Differential Revision: https://reviews.llvm.org/D122148

Nikita Popov [Mon, 4 Jul 2022 13:17:22 +0000 (15:17 +0200)]

[Reassociate] Avoid ConstantExpr::get()

Use ConstantFoldBinaryOpOperands() instead, to handle the case
where not all binary ops have a constant expression variant.

This is a bit awkward because we only want to pop the element from
Ops once we're sure that it has folded.
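
The "pop only after a successful fold" pattern can be sketched in isolation. This is not the Reassociate code: `tryFold` is a hypothetical fallible folder over plain integers, with division by zero standing in for a fold that has no constant result.

```cpp
#include <vector>

enum class Op { Add, SDiv };

// Hypothetical fallible folder: returns false when no folded constant exists.
inline bool tryFold(Op Opcode, int L, int R, int &Out) {
  switch (Opcode) {
  case Op::Add:
    Out = L + R;
    return true;
  case Op::SDiv:
    if (R == 0)
      return false; // cannot fold
    Out = L / R;
    return true;
  }
  return false;
}

// Folds trailing operands pairwise. On the first failed fold, Ops is left
// exactly as it was before that attempt.
inline void foldTrailing(Op Opcode, std::vector<int> &Ops) {
  while (Ops.size() >= 2) {
    int Folded = 0;
    // Peek at the last two operands without removing them.
    if (!tryFold(Opcode, Ops[Ops.size() - 2], Ops.back(), Folded))
      return;
    Ops.pop_back();      // safe to drop the operand only now
    Ops.back() = Folded; // replace the remaining one with the result
  }
}
```

Popping before the fold would lose an operand whenever folding fails, which is the awkwardness the message refers to.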

Nikita Popov [Mon, 4 Jul 2022 12:56:23 +0000 (14:56 +0200)]

[SCEVExpander] Avoid ConstantExpr::get() (NFCI)

Use ConstantFoldBinaryOpOperands() instead. This will be important
when not all binops have constant expression variants.

LLVM GN Syncbot [Mon, 4 Jul 2022 12:44:50 +0000 (12:44 +0000)]

[gn build] Port 25607d143d1d

Hui Xie [Sun, 26 Jun 2022 15:13:43 +0000 (16:13 +0100)]

[libc++] Implement `std::ranges::merge`

Implement `std::ranges::merge` and add unit tests.

Differential Revision: https://reviews.llvm.org/D128611

David Green [Mon, 4 Jul 2022 12:38:43 +0000 (13:38 +0100)]

[VectorCombine] Improve shuffle select shuffle-of-shuffles

This is an extension to the code added in D123911 which added vector
combine folding of shuffle-select patterns, attempting to reduce the
total amount of shuffling required in patterns like:
  %x = shuffle %i1, %i2
  %y = shuffle %i1, %i2
  %a = binop %x, %y
  %b = binop %x, %y
  shuffle %a, %b, selectmask

This patch extends the handling of shuffles that are dependent on one
another, which can arise from the SLP vectorizer, as in:
  %x = shuffle %i1, %i2
  %y = shuffle %x

The input shuffles can also be omitted, in which case they are treated
like identity shuffles. This patch also attempts to calculate a better
ordering of input shuffles, which can help produce lower-cost input
shuffles, pushing complex shuffles further down the tree.

Differential Revision: https://reviews.llvm.org/D128732

Nikita Popov [Mon, 4 Jul 2022 10:55:42 +0000 (12:55 +0200)]

[AMDGPUCodeGenPrepare] Check result of ConstantFoldBinaryOpOperands()

This function will become fallible once we don't support constant
expressions for all binops, so make sure to check the result.

Nikita Popov [Mon, 4 Jul 2022 10:49:52 +0000 (12:49 +0200)]

[ConstantFolding] Check return value of ConstantFoldInstOperandsImpl()

This operation is fallible, but ConstantFoldConstantImpl() is not.
If we fail to fold, we should simply return the original expression.

I don't think this can cause any issues right now, but it becomes
a problem once we make ConstantFoldInstOperandsImpl() not create a
constant expression for everything it possibly could.

Valentin Clement [Mon, 4 Jul 2022 10:56:47 +0000 (12:56 +0200)]

[flang] Add TODO for derived types with final procedure

Finalization is F2003 and although the runtime supports it already,
lowering does not ensure that all derived types are finalized properly
when they should be. This will require surveying the places where lowering
needs to call it. Add a hard TODO for now.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D129069

Co-authored-by: Jean Perier <jperier@nvidia.com>

Sander de Smalen [Mon, 4 Jul 2022 09:36:13 +0000 (09:36 +0000)]

[AArch64] NFC: Move safe predicate casting to a separate function.

This patch puts the code to safely bitcast a predicate, and possibly zero
any undefined lanes when doing a widening cast, into one place and merges
the functionality with lowerConvertToSVBool.

This is some cleanup inspired by D128665.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D128926

Dmitry Preobrazhensky [Mon, 27 Jun 2022 16:49:44 +0000 (19:49 +0300)]

[AMDGPU][GFX10][DOC][NFC] Update assembler syntax description

Summary of changes:
- Update MUBUF lds syntax (see https://reviews.llvm.org/D124485).
- Add v_cvt_pkrtz_f16_f32_dpp, v_cvt_pkrtz_f16_f32_sdwa.
- Update SMEM syntax (see https://reviews.llvm.org/D127314).
- Enable op_sel for v_add_nc_u16, v_sub_nc_u16 (see https://reviews.llvm.org/D123594).
- Minor bug fixing and improvements.

Simon Pilgrim [Mon, 4 Jul 2022 10:23:24 +0000 (11:23 +0100)]

[DAG] visitTRUNCATE - move GetDemandedBits AFTER SimplifyDemandedBits.

Another cleanup step before removing GetDemandedBits entirely.

Daniil Dudkin [Mon, 4 Jul 2022 10:22:12 +0000 (13:22 +0300)]

[mlir][NFC] Fix various warnings generated by GCC 9

Currently, there are a lot of warnings when building MLIR.
This change fixes the warnings listed below.

  .../SparseTensorUtils.cpp: In instantiation of ‘...::openSparseTensorCOO(...) [with ...]’:
  .../SparseTensorUtils.cpp:1672:3:   required from here
  .../SparseTensorUtils.cpp:87:21: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘PrimaryType’ [-Wformat=]

  .../OptUtils.cpp:36:5: warning: this statement may fall through [-Wimplicit-fallthrough=]

  .../AffineOps.cpp:1741:32: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
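
For readers unfamiliar with these diagnostics, the sketch below shows the usual way each of the three is silenced. These are typical fixes and not necessarily the ones applied in the patch. The `PrimaryType` enumerators and the function names are hypothetical.

```cpp
#include <cstdio>
#include <string>

enum class PrimaryType : int { kF64 = 1, kF32 = 2 };

// -Wformat: a scoped enum is not an int, so cast explicitly before using %d.
inline std::string describe(PrimaryType T) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "type=%d", static_cast<int>(T));
  return Buf;
}

// -Wimplicit-fallthrough: either break, or mark the fallthrough as intended.
inline int levelToPasses(int Level) {
  int Passes = 0;
  switch (Level) {
  case 2:
    Passes += 2;
    [[fallthrough]]; // C++17 attribute: falling into case 1 is deliberate
  case 1:
    Passes += 1;
    break;
  default:
    break;
  }
  return Passes;
}

// -Wparentheses: make the grouping of && inside || explicit.
inline bool inRangeOrForced(bool Forced, int V, int Lo, int Hi) {
  return Forced || (V >= Lo && V < Hi);
}
```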

Reviewed By: aartbik, wrengr, aeubanks

Differential Revision: https://reviews.llvm.org/D128993

Chuanqi Xu [Mon, 4 Jul 2022 09:33:57 +0000 (17:33 +0800)]

[AST] Use canonical constraint declaration for ASTContext::getAutoType

When we do profiling in ASTContext::getAutoType, we do not consider
the canonical declaration for the type constraint. This is bad since it
causes a negative ODR mismatch even though we already know the type
constraint declaration is a redeclaration of the previous one. Using the
canonical declaration here should also be harmless.

Nicolas Vasilache [Fri, 1 Jul 2022 06:56:44 +0000 (23:56 -0700)]

[mlir][Tensor] Update ParallelInsertSliceOp semantics to match that of InsertSliceOp

This revision updates the op semantics to also allow rank-reducing behavior and
updates the implementation to reuse code between the sequential and the parallel
versions of the op.

Depends on D128920

Differential Revision: https://reviews.llvm.org/D128985

Haojian Wu [Mon, 4 Jul 2022 09:30:11 +0000 (11:30 +0200)]

[pseudo] Remove duplicated code in ClangPseudo.cpp

The code was added accidentally during the rebase when landing fe66aebd.

Edd Barrett [Mon, 4 Jul 2022 06:02:25 +0000 (07:02 +0100)]

Revise outdated parts of the developer policy.

Specifically:

- Diffs are not passed around on mailing lists any more.
- Diffs should be `-U999999`.
- Clarify part about automated emails.

Differential review: https://reviews.llvm.org/D128645

Nikolas Klauser [Sun, 3 Jul 2022 23:21:44 +0000 (01:21 +0200)]

[libc++][NFC] Replace enable_if with __enable_if_t in a few places

Reviewed By: ldionne, #libc

Spies: jloser, libcxx-commits

Differential Revision: https://reviews.llvm.org/D128400

Nikita Popov [Mon, 4 Jul 2022 08:52:12 +0000 (10:52 +0200)]

[SimplifyCFG] Remove redundant checks for hoisting (NFCI)

These conditions are later checked in the HoistTerminator code
path. Checking them here is somewhat confusing, because this code
only checks the first instruction in the block, which is not
necessarily the terminator.

Nicolas Vasilache [Thu, 30 Jun 2022 11:27:41 +0000 (04:27 -0700)]

[mlir][Tensor] Move ParallelInsertSlice to the tensor dialect

This is mostly NFC and will allow tensor.parallel_insert_slice to gain
rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl.

Depends on D128857

Differential Revision: https://reviews.llvm.org/D128920

Florian Hahn [Mon, 4 Jul 2022 08:29:21 +0000 (09:29 +0100)]

[AArch64] Add additional tests for D120481.

Florian Hahn [Mon, 4 Jul 2022 08:25:26 +0000 (09:25 +0100)]

[LV] Simplify setDebugLocFromInst by using early exit (NFC).

Suggested as separate improvement in D128657.

Nikita Popov [Tue, 28 Jun 2022 09:25:54 +0000 (11:25 +0200)]

[IR] Remove support for insertvalue constant expression

This removes the insertvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
This is very similar to the extractvalue removal from D125795.
insertvalue is also not supported in bitcode, so no auto-upgrade
is necessary.

ConstantExpr::getInsertValue() can be replaced with
IRBuilder::CreateInsertValue() or ConstantFoldInsertValueInstruction(),
depending on whether a constant result is required (with the latter
being fallible).

The ConstantExpr::hasIndices() and ConstantExpr::getIndices()
methods also go away here, because there are no longer any constant
expressions with indices.

Differential Revision: https://reviews.llvm.org/D128719

Shraiysh Vaishay [Mon, 4 Jul 2022 05:08:58 +0000 (10:38 +0530)]

[mlir][openmp] Added omp.taskloop

This patch adds omp.taskloop operation to OpenMP Dialect along with
tests.

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D127380

Craig Topper [Mon, 4 Jul 2022 04:07:25 +0000 (21:07 -0700)]

[RISCV] Add more SHXADD patterns.

This handles the code we get for the following function:

int foo(unsigned x, int *y) {
  return y[x >> 3];
}

The srl and shl implied by the array index will be combined to
form (srl (and X, C2), C1). We need to reverse this to get back
the shl to fold into SHXADD.
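
To make the equivalence concrete, the sketch below evaluates both forms of the byte offset for `y[x >> 3]` with 4-byte elements. The constants `C2 = ~7` and `C1 = 1` are worked out for this particular example and are an assumption, not taken from the patch. The helper names are hypothetical.

```cpp
#include <cstdint>

// Byte offset as written in the source: (x >> 3) scaled by sizeof(int) == 4.
inline uint64_t offsetAsWritten(uint32_t X) {
  return (uint64_t(X) >> 3) << 2;
}

// The same offset in the combined form (srl (and X, C2), C1),
// here with C2 = ~7 and C1 = 1.
inline uint64_t offsetAfterCombine(uint32_t X) {
  return (uint64_t(X) & ~uint64_t(7)) >> 1;
}

// SH2ADD computes Base + (Index << 2), so recovering the shift by 2
// lets the scaling fold into the add.
inline uint64_t sh2add(uint64_t Index, uint64_t Base) {
  return (Index << 2) + Base;
}
```

Both offset forms agree for every 32-bit `x`, so rewriting the combined form back into an index shifted left by 2 changes nothing except which instruction can absorb the shift.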

Craig Topper [Sun, 3 Jul 2022 21:41:36 +0000 (14:41 -0700)]

[RISCV] Move some SHXADD matching cases into a ComplexPattern. NFC

Some more complex cases require checking the relationship of
operands on different nodes of the match. They also require
additional instructions to be created. Using a ComplexPattern
gives us that flexibility.

I'll be adding another pattern in a future patch.

Argyrios Kyrtzidis [Sat, 2 Jul 2022 00:18:00 +0000 (17:18 -0700)]

[Driver] Ignore the clang modules validation-related flags if clang modules are not enabled

If clang modules are not enabled it becomes unnecessary to read the session timestamp file in order
to pass `-fbuild-session-timestamp` to the `cc1` invocation.

Differential Revision: https://reviews.llvm.org/D129030

esmeyi [Mon, 4 Jul 2022 03:16:16 +0000 (23:16 -0400)]

[AIX] Handling the label alignment of a global variable with its multiple aliases.

This patch handles the case where a variable has
multiple aliases.
AIX's assembly directive .set is not usable for
aliasing purposes, and using different labels allows
AIX to emulate symbol aliases. If a value is emitted
between any two labels, meaning they are not aligned,
XCOFF will automatically calculate the offset for them.

This patch does the following:
1) Emits the label of the alias just before emitting
the value of the sub-element that the alias refers to.
2) Aligns a set of aliases that refer to the same offset.
3) We didn't emit aliasing labels for common and
zero-initialized local symbols in
PPCAIXAsmPrinter::emitGlobalVariableHelper, but
emitted linkage for them in
AsmPrinter::emitGlobalAlias, which caused a FAILURE.
This patch fixes the bug by not emitting linkage
for an alias without a label.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D124654

jacquesguan [Fri, 1 Jul 2022 07:08:58 +0000 (15:08 +0800)]

[mlir][Vector] Fold ShuffleOp(SplatOp(X), SplatOp(X)) to SplatOp(X).

This patch folds ShuffleOp(SplatOp(X), SplatOp(X)) to SplatOp(X).

Differential Revision: https://reviews.llvm.org/D128969

Chen Zheng [Wed, 29 Jun 2022 09:21:04 +0000 (05:21 -0400)]

[SCEV] recognize llvm.annotation intrinsic

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D127835

Nicolas van Kempen [Sun, 3 Jul 2022 22:27:57 +0000 (16:27 -0600)]

[clang-tidy] Properly forward clang-tidy output when running tests

When running tests, the check_clang_tidy script encodes the output
string, making it hard to read when debugging checks. This removes the
.encode() call.

Test Plan:
Making a new default check for testing (as of right now, it includes a
failing test):

[~/llvm-project/clang-tools-extra] python3 clang-tidy/add_new_check.py
bugprone example
<...>
Pre-changes:

[~/llvm-project/build] ninja check-clang-tools
<...>
------------------------ clang-tidy output -----------------------
b"1 warning
generated.\n/data/users/nvankempen/llvm-project/build/Debug/tools/clang/tools/extra/test/clang-tidy/checkers/Output/bugprone-example.cpp.tmp.cpp:4:6:
warning: function 'f' is insufficiently awesome [bugprone-example]\nvoid
f();\n
^\n/data/users/nvankempen/llvm-project/build/Debug/tools/clang/tools/extra/test/clang-tidy/checkers/Output/bugprone-example.cpp.tmp.cpp:4:6:
note: insert 'awesome'\nvoid f();\n     ^\n     awesome_\n"

------------------------------------------------------------------
<...>
Post-changes:

[~/llvm-project/build] ninja check-clang-tools
<...>
------------------------ clang-tidy output -----------------------
1 warning generated.
/data/users/nvankempen/llvm-project/build/Debug/tools/clang/tools/extra/test/clang-tidy/checkers/Output/bugprone-example.cpp.tmp.cpp:4:6:
warning: function 'f' is insufficiently awesome [bugprone-example]
void f();
     ^
/data/users/nvankempen/llvm-project/build/Debug/tools/clang/tools/extra/test/clang-tidy/checkers/Output/bugprone-example.cpp.tmp.cpp:4:6:
note: insert 'awesome'
void f();
     ^
     awesome_

------------------------------------------------------------------
<...>

Differential Revision: https://reviews.llvm.org/D127807

Ishaan Gandhi [Sun, 3 Jul 2022 20:42:31 +0000 (14:42 -0600)]

[clang-tidy] Don't treat invalid branches as identical

The clang-tidy check bugprone-branch-clone has a false positive if some
symbols are undefined. This patch silences the warning when the two
sides of a branch are invalid.

Fixes #56057

Differential Revision: https://reviews.llvm.org/D128402

Sunho Kim [Sun, 3 Jul 2022 20:30:56 +0000 (05:30 +0900)]

[clang] Fix gcc-6 compilation error. (NFC)

Fix https://github.com/llvm/llvm-project/issues/55626.

Differential Revision: https://reviews.llvm.org/D129049

Nico Weber [Fri, 1 Jul 2022 11:37:29 +0000 (13:37 +0200)]

[clang-format] Tweak help text a bit

In particular, make it clear that `--style=file` is the default,
since there's some confusion about this, e.g. here:
https://stackoverflow.com/questions/61455148/

Differential Revision: https://reviews.llvm.org/D128984

Sanjay Patel [Sun, 3 Jul 2022 16:23:29 +0000 (12:23 -0400)]

[InstCombine] fold negated low-bit-mask to cmp+select

(-(X & 1)) & Y --> (X & 1) == 0 ? 0 : Y
https://alive2.llvm.org/ce/z/rhpH3i

This is noted as a missing IR canonicalization in issue #55618.
We already managed to fix codegen to the expected form.
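
The fold can be checked on plain unsigned integers. The two hypothetical helpers below compute the left-hand and right-hand sides of the rewrite; unsigned negation wraps, so `0 - 1` is the all-ones mask.

```cpp
#include <cstdint>

// Left-hand side: (-(X & 1)) & Y. The negation yields 0 or an all-ones mask.
inline uint32_t maskForm(uint32_t X, uint32_t Y) {
  return (0u - (X & 1u)) & Y;
}

// Right-hand side: (X & 1) == 0 ? 0 : Y.
inline uint32_t selectForm(uint32_t X, uint32_t Y) {
  return (X & 1u) == 0 ? 0u : Y;
}
```

When the low bit of `X` is clear the mask is zero and the result is zero. When it is set the mask is all ones and the result is `Y`, which is exactly what the select expresses.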

Sanjay Patel [Sun, 3 Jul 2022 15:03:16 +0000 (11:03 -0400)]

[InstCombine] add tests for and-of-negated-lowbitmask; NFC

LLVM GN Syncbot [Sun, 3 Jul 2022 16:05:49 +0000 (16:05 +0000)]

[gn build] Port 2aea8af25136

Nikolas Klauser [Sun, 3 Jul 2022 14:52:22 +0000 (16:52 +0200)]

[libc++] Make _LIBCPP_DEBUG_RANDOMIZE_RANGE a function

Reviewed By: ldionne, Mordante, var-const, #libc

Spies: mgorny, libcxx-commits

Differential Revision: https://reviews.llvm.org/D128181

Craig Topper [Sun, 3 Jul 2022 15:57:51 +0000 (08:57 -0700)]

[RISCV] Replace call to APInt::countTrailingZeros with uint64_t version. NFC

We know the number of bits is 64 or 32 so we can use the uint64_t
version directly. This saves the APInt needing to check for the
small vs large size.

Groverkss [Sun, 3 Jul 2022 15:22:35 +0000 (16:22 +0100)]

[MLIR][Affine] Allow affine-expr on RHS in IntegerSet

Currently, the parser for IntegerSet only allows constraints like:

```
affine-constraint ::= affine-expr `>=` `0`
| affine-expr `==` `0`
```

This form is sometimes unreadable and painful to use when writing unit tests
for the Presburger library and tests in general.

This patch extends the parser to allow affine constraints with affine-expr on
the RHS:

```
affine-constraint ::= affine-expr `>=` `affine-expr`
| affine-expr `==` `affine-expr`
```

The internal storage and printing of IntegerSet is still in the original format.
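
One way to see why the stored form can stay unchanged: a constraint `lhs >= rhs` is the same as `lhs - rhs >= 0`. The sketch below performs that subtraction on a simple coefficient-vector representation. It is not the MLIR parser's implementation, and `LinearExpr` is a hypothetical type.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Affine expression Coeffs[0]*d0 + Coeffs[1]*d1 + ... + Constant.
struct LinearExpr {
  std::vector<int64_t> Coeffs;
  int64_t Constant;
};

// Rewrites `LHS >= RHS` (or `LHS == RHS`) as `Result >= 0` (or `Result == 0`)
// by computing Result = LHS - RHS.
inline LinearExpr moveRhsToLhs(const LinearExpr &LHS, const LinearExpr &RHS) {
  LinearExpr Result = LHS;
  if (Result.Coeffs.size() < RHS.Coeffs.size())
    Result.Coeffs.resize(RHS.Coeffs.size(), 0);
  for (std::size_t I = 0; I < RHS.Coeffs.size(); ++I)
    Result.Coeffs[I] -= RHS.Coeffs[I];
  Result.Constant -= RHS.Constant;
  return Result;
}
```

For instance, `d0 + 2 >= d1 - 3` becomes `d0 - d1 + 5 >= 0`, which already fits the original grammar.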

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D128915

David Green [Sun, 3 Jul 2022 14:49:16 +0000 (15:49 +0100)]

[AArch64] Regenerate more tests. NFC

Also includes some adjustments for asm.py to handle updating more cases
successfully.

Nuno Lopes [Sun, 3 Jul 2022 13:33:47 +0000 (14:33 +0100)]

[NFC] Switch a few uses of undef to poison as placeholders for unreachable code

luxufan [Sat, 18 Jun 2022 15:44:09 +0000 (23:44 +0800)]

[RISCV] Add a scavenge spill slot when using ADDI to compute scalable stack offset

Computing a scalable offset needs up to two scratch registers. We add
scavenge spill slots according to the result of `RISCV::isRVVSpill`
and `RVVStackSize`. Since ADDI is not included in `RISCV::isRVVSpill`,
PEI doesn't add scavenge spill slots for scratch registers when using
ADDI to get scalable stack offsets.

The ADDI instruction has a destination register which can be used as
a scratch register, so one scavenge spill slot is sufficient for
computing scalable stack offsets.

Differential Revision: https://reviews.llvm.org/D128188

Jun Zhang [Sun, 3 Jul 2022 11:40:56 +0000 (19:40 +0800)]

Revert "Reland "[NFC] Add a missing test for for clang-repl""

This reverts commit 8679cbc29fb76195544956fe233060bb7a1a6453.
See https://lab.llvm.org/buildbot/#/builders/216/builds/6799

Nuno Lopes [Sun, 3 Jul 2022 10:56:29 +0000 (11:56 +0100)]

[LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]

Jun Zhang [Sun, 3 Jul 2022 10:04:52 +0000 (18:04 +0800)]

Reland "[NFC] Add a missing test for for clang-repl"

This reverts 3668d1264e2d246f7e222338b8a5cab18ce1bdab.
As far as we know, `__attribute__((weak))` support has been really bad
in RuntimeDyld, so we just disable it on Windows for now. This
should fix the angry Windows buildbot.

Differential Revision: https://reviews.llvm.org/D129042

Serge Pavlov [Fri, 1 Jul 2022 11:32:26 +0000 (18:32 +0700)]

[FPEnv] Allow CompoundStmt to keep FP options

This is a recommit of b822efc7404bf09ccfdc1ab7657475026966c3b2,
reverted in dc34d8df4c48b3a8f474360970cae8a58e6c84f0. The commit caused
failures because the test ast-print-fp-pragmas.c did not specify a particular
target, and it failed on targets which do not support constrained
intrinsics. The original commit message is below.

The AST does not have special nodes for pragmas. Instead a pragma modifies
some state variables of Sema, which in turn results in modified
attributes of AST nodes. This technique applies to floating point
operations as well. Every AST node that can depend on FP options keeps
the current set of them.

This technique works well for options like exception behavior or fast
math options. They tell the compiler how to modify code generation for
the affected nodes. However, this technique has problems with FP control
modes. Modifying an FP control mode (like rounding direction) usually
requires operations on hardware, like writing to control registers. It
must be done prior to the first operation that depends on the control
mode. In particular, such operations are required to implement
`pragma STDC FENV_ROUND`: the compiler should set up the necessary
rounding direction at the beginning of the compound statement where the
pragma occurs. As there is no representation for pragmas in the AST,
code generation becomes a complicated task in this case.

To solve this issue, FP options are kept inside CompoundStmt. Unlike FP
options in expressions, these do not affect any operation on FP values,
but only inform the codegen about the FP options that act in the body of
the statement. As all pragmas that modify the FP environment may occur only
at the start of a compound statement or at global level, this solution
works for all relevant pragmas. The options are kept as a difference
from the options in the enclosing compound statement or the default
options, which helps codegen set only the changed control modes.
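
The "kept as a difference" idea can be illustrated with a small sketch. This is only a model of the bookkeeping, not Clang's implementation: the function name `fp_options_delta` and the option names `rounding` and `exceptions` are hypothetical, chosen for illustration.

```python
def fp_options_delta(enclosing: dict, current: dict) -> dict:
    """Return only the options whose value differs from the enclosing scope.

    Hypothetical helper: models storing FP options on a compound
    statement as a difference from the enclosing statement's options.
    """
    return {key: value for key, value in current.items()
            if enclosing.get(key) != value}


# Hypothetical option sets for the default state and two nested blocks.
defaults = {"rounding": "tonearest", "exceptions": "ignore"}
outer = {"rounding": "upward", "exceptions": "ignore"}
inner = {"rounding": "upward", "exceptions": "strict"}

# Only the rounding mode changed when entering the outer block.
outer_delta = fp_options_delta(defaults, outer)
# Only the exception behavior changed when entering the inner block.
inner_delta = fp_options_delta(outer, inner)
```

With a delta like this, code generation for the inner block would only need to touch the exception behavior, since the rounding mode is inherited unchanged from the outer block.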

Differential Revision: https://reviews.llvm.org/D123952

commit | commitdiff | tree

NAKAMURA Takumi [Sun, 3 Jul 2022 06:23:10 +0000 (15:23 +0900)]

[Bazel] Make `builtin_headers_gen` as subset of CMake's `clang-resource-headers`

At the moment, two files are not installed by CMake.

- `lib/Headers/openmp_wrappers/time.h`
- `lib/Headers/ppc_wrappers/nmmintrin.h`

`builtin_headers_gen` is available as the source of rules_pkg.
Differences in the layout of installed headers make cache hits harder.

commit | commitdiff | tree

Craig Topper [Sun, 3 Jul 2022 06:11:14 +0000 (23:11 -0700)]

[RISCV] Add more SHXADD isel patterns.

This handles the code we get for

  int foo(int* x, unsigned y) {
    return x[y >> 1];
  }

The shift right and the shl will get DAG combined into
(shl (and X, 0xfffffffe), 1). We have custom isel to match the
shl+and, but with Zba the (add (shl X, 1), Y) part will get
matched and leave the and to be iseled by itself. This commit
adds a larger pattern that includes the and.
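
As a sanity check on the DAG combine described above, the following sketch compares the byte offset as written in the C source with the combined form. It assumes 4-byte `int` elements and a 32-bit unsigned index; the function names are hypothetical and the code models only the arithmetic, not the isel patterns themselves.

```python
MASK32 = 0xFFFFFFFF


def offset_as_written(y: int) -> int:
    """Byte offset of x[y >> 1] for 4-byte elements: (y >> 1) scaled by 4."""
    y &= MASK32  # y is a 32-bit unsigned value
    return (y >> 1) << 2


def offset_combined(y: int) -> int:
    """The combined form from the commit message: (shl (and y, 0xfffffffe), 1)."""
    y &= MASK32
    return (y & 0xFFFFFFFE) << 1


# Both forms clear bit 0 of y and then scale, so they agree for every index.
for y in (0, 1, 2, 5, 0x7FFFFFFF, 0xFFFFFFFF):
    assert offset_as_written(y) == offset_combined(y)
```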

commit | commitdiff | tree

Vitaly Buka [Sun, 3 Jul 2022 03:14:51 +0000 (20:14 -0700)]

[lsan] malloc_usable_size returns 0 for nullptr

commit | commitdiff | tree

lewuathe [Sun, 3 Jul 2022 00:26:41 +0000 (09:26 +0900)]

[mlir][complex] Inverse canonicalization between exp and log

We can canonicalize consecutive complex.exp and complex.log, which are inverse functions of each other.
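
The underlying identity can be checked numerically. This sketch uses Python's `cmath` and illustrates only the mathematics; it says nothing about how the MLIR pattern is written, and `exp_of_log` is a hypothetical name.

```python
import cmath


def exp_of_log(z: complex) -> complex:
    """Compute exp(log(z)); mathematically this equals z for any nonzero z."""
    return cmath.exp(cmath.log(z))


for z in (1 + 2j, -3.5 + 0.25j, 0.1 - 7j):
    # Floating-point rounding means the comparison is approximate, not exact.
    assert cmath.isclose(exp_of_log(z), z, rel_tol=1e-12)
```

Note that in the other order, log(exp(z)) returns z only when the imaginary part of z lies in (-π, π], because the complex logarithm is taken on its principal branch.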

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128966

commit | commitdiff | tree

Craig Topper [Sat, 2 Jul 2022 16:51:00 +0000 (09:51 -0700)]

[RISCV] Match RISCVISD::ADD_LO in SelectAddrRegImm.

This allows us to fold global and constant pool addresses into
load/store during isel instead of in the post-isel peephole. I
did not copy the alignment check for ConstantPoolSDNode because it
wasn't tested.

This is a step towards being able to remove the post-isel
peephole.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D128738

commit | commitdiff | tree

Florian Hahn [Sat, 2 Jul 2022 14:18:16 +0000 (15:18 +0100)]

[VPlan] Move setDebugLocFromInst to VPTransformState (NFC).

The moved helpers are only used for codegen. It will allow moving the
remaining ::execute implementations out of LoopVectorize.cpp.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D128657

commit | commitdiff | tree

Luo, Yuanke [Sat, 2 Jul 2022 08:49:02 +0000 (16:49 +0800)]

[globalisel] Add test case for regbank selection.

commit | commitdiff | tree

lorenzo chelini [Fri, 1 Jul 2022 18:21:09 +0000 (20:21 +0200)]

[MLIR] Rename FusePadOpWithLinalgConsumer -> FusePadOpWithLinalgProducer (NFC)

Follow-up to D128978, where I mistakenly renamed the file. The linalg op is
fused with its producer, not the consumer.

commit | commitdiff | tree

Craig Topper [Sat, 2 Jul 2022 07:57:35 +0000 (00:57 -0700)]

[RISCV] isel (shl (and X, C2), C) -> (slli (srliw X, C3), C3+C).

where C2 has 32 leading zeros and C3 trailing zeros.

When the shl is used by an add and C is 1, 2, or 3, we end up matching
(add (shl X, C), Y) first. This leaves an and with a constant that
is harder to materialize.
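
The rewrite can be checked with a small model of the two instructions on 64-bit registers. This is a sketch under stated assumptions: C2 is taken to be a contiguous mask covering bits C3 through 31, C3 is assumed to be at least 1, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def srliw(x: int, shamt: int) -> int:
    """Model SRLIW: logical right shift of the low 32 bits, sign-extended to 64."""
    result = (x & MASK32) >> shamt
    if result & 0x80000000:
        result |= MASK64 ^ MASK32
    return result


def slli(x: int, shamt: int) -> int:
    """Model SLLI on a 64-bit register."""
    return (x << shamt) & MASK64


def shl_of_and(x: int, c3: int, c: int) -> int:
    """Original form: (shl (and X, C2), C), with C2 covering bits c3..31."""
    c2 = (MASK32 >> c3) << c3
    return ((x & c2) << c) & MASK64


def slli_of_srliw(x: int, c3: int, c: int) -> int:
    """Rewritten form: (slli (srliw X, C3), C3+C)."""
    return slli(srliw(x, c3), c3 + c)


# The two forms agree for the sampled inputs; no mask constant is needed.
for x in (0, 1, 0xFFFFFFFF, 0x123456789ABCDEF0, MASK64):
    for c3 in range(1, 32):
        for c in (1, 2, 3):
            assert shl_of_and(x, c3, c) == slli_of_srliw(x, c3, c)
```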

commit | commitdiff | tree

Craig Topper [Sat, 2 Jul 2022 06:32:30 +0000 (23:32 -0700)]

[RISCV] isel (add (and X, 0xFFFFFFFE), Y) as (SH1ADD (SRLIW X, 1), Y).

Similar for SH2ADD and SH3ADD.

This is what we get from

  int foo(int* x, unsigned y) {
    return x[y >> 1];
  }

This allows us to avoid materializing 0xFFFFFFFE into a register.

commit | commitdiff | tree

Petr Hosek [Sat, 2 Jul 2022 04:51:16 +0000 (04:51 +0000)]

Revert "[CMake][Fuchsia] Use libunwind as the default unwinder"

This reverts commit 6213dba19fc0d65ab8b366b6d78c56cbd63c9d7d since
this broke Fuchsia builders.

commit | commitdiff | tree

owenca [Sat, 2 Jul 2022 04:20:16 +0000 (21:20 -0700)]

[clang-format][NFC] Replace an EXPECT_EQ with a verifyFormat

commit | commitdiff | tree

Joseph Huber [Sat, 2 Jul 2022 03:24:22 +0000 (23:24 -0400)]

[llvm-objdump] Ensure offloading sections have proper alignment

Summary:
A previous patch added support for dumping offloading sections. The
tests for this feature added dummy input to the required section using
`llvm-objcopy`. This binary format has a required alignment of `8` which
was not being respected by the file copied with llvm-objcopy and would
cause failures on architectures sensitive to alignment problems or with
sanitizers. This patch adds the proper alignment and adds an error check
at least for the binary format so it's not completely opaque. This
should be improved so users actually get a helpful message.
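
The kind of check involved can be sketched as follows. The value `8` comes from the commit message; the function names are hypothetical, and this is not the code added to `llvm-objdump`.

```python
REQUIRED_ALIGNMENT = 8  # alignment of the offloading binary format, per the message


def is_aligned(offset: int, alignment: int = REQUIRED_ALIGNMENT) -> bool:
    """Return True if offset is a multiple of alignment."""
    return offset % alignment == 0


def padding_for(offset: int, alignment: int = REQUIRED_ALIGNMENT) -> int:
    """Number of padding bytes needed to round offset up to the alignment."""
    return (-offset) % alignment


# A section starting at offset 12 is misaligned and needs 4 bytes of padding.
assert not is_aligned(12)
assert padding_for(12) == 4
```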

commit | commitdiff | tree

Yeting Kuo [Fri, 1 Jul 2022 03:11:21 +0000 (11:11 +0800)]

[RISCV] Restore "Enable shrink wrap by default"

This reverts commit 7af3d4ab3d5da07256e1a7a0438e7308e14b9fd5.

RISC-V reverted the shrink wrap patch for bug 53662. Since the bug is fixed
by D123679, this commit re-enables it.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D128965

commit | commitdiff | tree

Johannes Doerfert [Fri, 1 Jul 2022 19:35:10 +0000 (14:35 -0500)]

[Attributor] Move heap2stack allocas to the entry block if possible

If we are certainly not in a loop we can directly emit the heap2stack
allocas in the function entry block. This will help to get rid of them
(SROA) and avoid stacksave/restore intrinsics when the function is
inlined.

commit | commitdiff | tree

Johannes Doerfert [Mon, 27 Jun 2022 22:45:17 +0000 (17:45 -0500)]

[OpenMP][NFC] Reuse check lines for Clang/OpenMP tests

I used a script to reuse existing check lines rather than creating new
ones. There are more opportunities to reduce the line count but the
"check generated functions" logic makes that somewhat tricky.

FWIW, we really should redo the update script with all these use cases
in mind...

Differential Revision: https://reviews.llvm.org/D128686

commit | commitdiff | tree

owenca [Sat, 2 Jul 2022 01:59:52 +0000 (18:59 -0700)]

[clang-format] Run dump_format_style.py for LK_Verilog

commit | commitdiff | tree

wren romano [Fri, 1 Jul 2022 23:17:00 +0000 (16:17 -0700)]

[mlir][sparse] Silencing some -Wunused-function in unittests

This is a followup to D128058.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D129027

commit | commitdiff | tree

Yeting Kuo [Fri, 1 Jul 2022 07:25:54 +0000 (15:25 +0800)]

[RISCV][NFC] Simplify condition of IsTU.

Just simplify code.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D128972

commit | commitdiff | tree

LLVM GN Syncbot [Sat, 2 Jul 2022 01:13:41 +0000 (01:13 +0000)]

[gn build] Port d2d8b0aa4f80

commit | commitdiff | tree

LLVM GN Syncbot [Sat, 2 Jul 2022 01:13:40 +0000 (01:13 +0000)]

[gn build] Port 228c8f9cc0b2

commit | commitdiff | tree

Joseph Huber [Thu, 2 Jun 2022 18:25:49 +0000 (14:25 -0400)]

[llvm-objdump] Add support for dumping embedded offloading data

In Clang/LLVM we are moving towards a new binary format to store many
embedded object files to create a fatbinary. This patch adds support for
dumping these embedded images in the `llvm-objdump` tool. This will
allow users to query information about what is stored inside the binary.
This has very similar functionality to the `cuobjdump` tool for those familiar
with the Nvidia utilities. The proposed use is as follows:
```
$ clang input.c -fopenmp --offload-arch=sm_70 --offload-arch=sm_52 -c
$ llvm-objdump -O input.o

input.o:        file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind            cubin
arch            sm_52
triple          nvptx64-nvidia-cuda
producer        openmp

OFFLOADING IMAGE [1]:
kind            cubin
arch            sm_70
triple          nvptx64-nvidia-cuda
producer        openmp
```

This will be expanded further once we start embedding more information
into these offloading images. Right now we are planning on adding
flags and entries for debug level, optimization, LTO usage, target
features, among others.

This patch only supports printing these sections, later we will want to
support dumping files the user may be interested in via another flag. I
am unsure if this should go here in `llvm-objdump` or `llvm-objcopy`.

Reviewed By: MaskRay, tra, jhenderson, JonChesterfield

Differential Revision: https://reviews.llvm.org/D126904

commit | commitdiff | tree

Joseph Huber [Tue, 14 Jun 2022 19:04:39 +0000 (15:04 -0400)]

[ObjectYAML] Add offloading binary implementations for obj2yaml and yaml2obj

This patch adds the necessary code for inspecting or creating offloading
binaries using the existing `obj2yaml` and `yaml2obj` features in LLVM.

Depends on D127774

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D127776

commit | commitdiff | tree

Jennifer Yu [Tue, 14 Jun 2022 17:11:10 +0000 (10:11 -0700)]

Generate the capture for the field when the field is used in openmp
region with implicit default inside the member function.

This fixes an assert when a field is referenced in an OpenMP region with a
default (first|private) clause inside a member function.

The assert fires because no capture is generated for the field.

This patch generates a capture when the field is used with an implicit
default, uses it in the code, and saves the capture so that it is
considered from that point on, and adds the first/private clauses.

1> Add a new field, ImplicitDefaultFirstprivateFDs, to SharingMapTy to
   store info about generated capture fields.
2> In function isOpenMPCaptureDecl: the capture is generated and saved
   in ImplicitDefaultFirstprivateFDs.
3> Add new helper functions:
   getImplicitFDCapExprDecl
   isImplicitDefaultFirstprivateFD
   addImplicitDefaultFirstprivateFD
4> Add an additional argument to hasDSA to check the default attribute
   for default(first|private).
5> Use isImplicitDefaultFirstprivateFD in VisitDeclRefExpr to
   build the implicit clause.
6> Add a new parameter, "Context", to buildCaptureDecl, because the
   parent context is needed when capturing a field.
7> Change isOpenMPPrivateDecl to stop propagating the capture from
   the enclosing region for private variables.
8> In ActOnOpenMPFirstprivate/ActOnOpenMPPrivate, use the captured info
   to generate the first|private clause.
9> Add a new function, isOpenMPRebuildMemberExpr, to determine whether a
   field needs to be rebuilt during template instantiation.

Differential Revision: https://reviews.llvm.org/D127803

commit | commitdiff | tree

LLVM GN Syncbot [Fri, 1 Jul 2022 23:35:58 +0000 (23:35 +0000)]

[gn build] Port 94c7b89fe5b0

commit | commitdiff | tree

Konstantin Varlamov [Fri, 1 Jul 2022 23:34:08 +0000 (16:34 -0700)]

[libc++][ranges] Implement `ranges::stable_sort`.

Differential Revision: https://reviews.llvm.org/D127834
