review.tizen.org Git - platform/upstream/llvm.git/log

Alan Zhao [Wed, 14 Jun 2023 21:10:31 +0000 (14:10 -0700)]

Revert "[InstSimplify] Fold all global variables with initializers"

This reverts commit 17b7df3daee85c1a4d1d955e558d42b34ce17549.

Reason: causes chrome builds to crash: https://crbug.com/1454861

Jason Molenda [Wed, 14 Jun 2023 20:43:53 +0000 (13:43 -0700)]

Clear non-addressable bits from pc/fp/sp in unwinds

Some Darwin corefiles can have the pc/fp/sp/lr in the
live register context signed with pointer authentication;
this patch changes RegisterContextUnwind to strip those
bits off of those values as we try to walk the stack.
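
As a rough illustration of what stripping the non-addressable bits means, the sketch below clears the high bits of a 64-bit register value with a mask. The 47-bit address width and the names `kAddressableBits` and `ClearNonAddressableBits` are assumptions made up for this example, not lldb's API; the real mask depends on the target and is obtained through the ABI plugin. The sketch also ignores high (kernel) addresses, where the upper bits would need to be set rather than cleared.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: number of low bits that carry the virtual address.
constexpr unsigned kAddressableBits = 47;

// Clear every bit above the addressable range, e.g. a pointer-authentication
// signature stored in the high bits of pc/fp/sp/lr.
inline std::uint64_t ClearNonAddressableBits(
    std::uint64_t value, unsigned addressable_bits = kAddressableBits) {
  assert(addressable_bits > 0 && addressable_bits < 64);
  const std::uint64_t mask = (std::uint64_t{1} << addressable_bits) - 1;
  return value & mask;
}
```

An unwinder built along these lines would apply the helper to each saved pc/fp/sp value before using it to locate the next frame.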

Differential Revision: https://reviews.llvm.org/D152861
rdar://109185291

Jason Molenda [Wed, 14 Jun 2023 20:40:54 +0000 (13:40 -0700)]

Add Fix*Address methods to Process, call into ABI

We need to clear non-addressable bits from addresses across
the lldb sources. Currently these need to use an ABI method
to clear those bits from addresses, which you do by taking a
Process, getting the current ABI, then calling the method.

Simplify this by providing methods in Process which call into
the ABI methods themselves.

Differential Revision: https://reviews.llvm.org/D152863

Zequan Wu [Thu, 8 Jun 2023 21:38:57 +0000 (17:38 -0400)]

[Clang][MS] Remove assertion on BaseOffset can't be smaller than Size.

This assertion triggered when we have two base classes sharing the same offset
and the first base is empty and the second class is non-empty.
Remove it for correctness.

I can't add a test case for this because -foverride-record-layout doesn't read
base class info at all. I can add that support later for testing if needed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D152472

Leonard Chan [Wed, 14 Jun 2023 20:36:27 +0000 (20:36 +0000)]

Revert "[LLD] Allow usage of LLD as a library"

This reverts commit 2700da5fe28d8b17c66e5c960d2188276a6ced39.

Reverting since this causes some test failures on our builders: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8778372807208184913/overview

Yaxun (Sam) Liu [Wed, 14 Jun 2023 20:03:29 +0000 (16:03 -0400)]

Revert "[HIP] Allow std::malloc in device function"

This reverts commit f5033c37025db46df95a7859d7189d09b5e3433e.

Revert this patch since it causes regressions for Tensile. A
reduced test case is:

#include <cstdlib>
#include <memory>

int main()
{
    std::shared_ptr<float> a;
    a = std::shared_ptr<float>(
        (float*)std::malloc(sizeof(float) * 100),
        std::free
    );
    return 0;
}

Will fix the issue then re-commit.

Fixes: SWDEV-405317

Peiming Liu [Wed, 14 Jun 2023 01:29:20 +0000 (01:29 +0000)]

[mlir][sparse] fix crashes when the tensor that defines the loop bound can not be found

Reviewed By: aartbik, K-Wu

Differential Revision: https://reviews.llvm.org/D152877

Fangrui Song [Wed, 14 Jun 2023 20:16:49 +0000 (13:16 -0700)]

[asan] Fix shadow load alignment for unaligned 128-bit load/store

When a 128-bit load/store is aligned by 8, we incorrectly emit `load i16, ptr ..., align 2`
while the shadow memory address may not be aligned by 2.

This manifests as possibly-misaligned shadow memory load with `-mstrict-align`,
e.g. `clang --target=aarch64-linux -O2 -mstrict-align -fsanitize=address`
```
__attribute__((noinline)) void foo(unsigned long *ptr) {
  ptr[0] = 3;
  ptr[1] = 3;
}
// ldrh    w8, [x9, x8]  // the shadow memory load may not be aligned by 2
```

Infer the shadow memory alignment from the load/store alignment to set the
correct alignment. The generated code now uses two ldrb and one orr.
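
The relationship between access alignment and shadow alignment can be sketched as follows. This is a simplified model, assuming the usual ASan mapping of one shadow byte per 8 application bytes and a shadow offset that is itself sufficiently aligned; the helper names are made up for the example and are not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr unsigned kShadowScale = 3;  // one shadow byte per 8 application bytes

// Shadow address for an application address (the offset is target-specific).
inline std::uint64_t ShadowAddress(std::uint64_t app_addr,
                                   std::uint64_t shadow_offset) {
  return (app_addr >> kShadowScale) + shadow_offset;
}

// Alignment that may be assumed for the shadow load, derived from the
// alignment of the application access. Assumes a well-aligned shadow offset.
inline std::uint64_t ShadowAlignment(std::uint64_t access_align) {
  // The access alignment must be a non-zero power of two.
  assert(access_align != 0 && (access_align & (access_align - 1)) == 0);
  return std::max<std::uint64_t>(1, access_align >> kShadowScale);
}
```

Under this model a 16-byte access aligned to 8 covers two shadow bytes, but only 1-byte alignment can be assumed for them, which is consistent with the move from a single `ldrh` to two `ldrb` instructions.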

Fix https://github.com/llvm/llvm-project/issues/63258

Differential Revision: https://reviews.llvm.org/D152663

William S. Moses [Wed, 14 Jun 2023 16:23:44 +0000 (12:23 -0400)]

[LoopIdiom] Preserve alias information for memset_pattern

TBAA/NoAlias/AliasScope and other information is currently preserved
when upgrading to a memcpy/memset. However, this is missing when upgrading to
the macOS memset_pattern function. This adds the same alias information preservation
to memset_pattern.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D152934

Jonas Devlieghere [Wed, 14 Jun 2023 20:00:38 +0000 (13:00 -0700)]

[DebugInfo] Always emit `.debug_names` with DWARF 5 for Apple platforms

On Apple platforms, we generate .apple_names, .apple_types,
.apple_namespaces and .apple_objc Apple accelerator tables for DWARF 4
and earlier. For DWARF 5 we should generate .debug_names, but instead we
get no accelerator tables at all.

In the backend we are correctly determining that we should be emitting
.debug_names instead of .apple_names. However, when we get to the point
of emitting the section, if the CU debug name table kind is not
"default", the accelerator table emission is skipped.

This patch sets the DebugNameTableKind to Apple in the frontend when
targeting an Apple target. That way we know that the CU was compiled with
the intent of emitting accelerator tables. For DWARF 4 and earlier, that
means Apple accelerator tables. For DWARF 5 and later, that means
.debug_names.

Differential revision: https://reviews.llvm.org/D118754

Peiming Liu [Tue, 6 Jun 2023 22:51:32 +0000 (22:51 +0000)]

[mlir][sparse] unifying enterLoopOverTensorAtLvl and enterCoIterationOverTensorsAtLvls

The tensor levels are now explicitly categorized into different `LoopCondKind`s to instruct LoopEmitter to generate different code for different kinds of conditions (e.g., `SparseCond`, `SparseSliceCond`, `SparseAffineIdxCond`, etc.).

The process of generating a while loop is now split into three steps, which are dispatched to the corresponding `LoopCondKind` handlers:
1. Generate LoopCondition (e.g., `pos <= posHi` for `SparseCond`, `slice.isNonEmpty` for `SparseAffineIdxCond`)
2. Generate LoopBody (e.g., compute the coordinates)
3. Generate ExtraChecks (e.g., `if (onSlice(crd))` for `SparseSliceCond`)
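
The three steps above can be pictured with a small stand-alone model. This is only a sketch in plain C++, not MLIR code: `LoopCondHandler`, `MakeSparseSliceHandler` and `RunLoop` are hypothetical names, and the loop condition uses an exclusive upper bound for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one LoopCondKind handler.
struct LoopCondHandler {
  std::function<bool(std::size_t)> loopCondition;  // step 1
  std::function<int(std::size_t)> loopBody;        // step 2: compute coordinate
  std::function<bool(int)> extraCheck;             // step 3
};

// Handler for a sparse level restricted to the slice [lo, hi).
inline LoopCondHandler MakeSparseSliceHandler(std::vector<int> crds, int lo,
                                              int hi) {
  assert(lo <= hi);
  const std::size_t posHi = crds.size();  // exclusive bound in this sketch
  LoopCondHandler h;
  h.loopCondition = [posHi](std::size_t pos) { return pos < posHi; };
  h.loopBody = [crds = std::move(crds)](std::size_t pos) { return crds[pos]; };
  h.extraCheck = [lo, hi](int crd) { return lo <= crd && crd < hi; };
  return h;
}

// Generic driver: the same while loop serves every handler kind.
inline std::vector<int> RunLoop(const LoopCondHandler &h) {
  std::vector<int> visited;
  std::size_t pos = 0;
  while (h.loopCondition(pos)) {
    const int crd = h.loopBody(pos);
    if (h.extraCheck(crd)) visited.push_back(crd);
    ++pos;
  }
  return visited;
}
```

Keeping the driver generic and moving the per-kind logic into handlers is the design idea the commit message describes; the actual emitter generates IR rather than executing the loop.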

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D152464

Philip Reames [Wed, 14 Jun 2023 19:57:49 +0000 (12:57 -0700)]

[RISCV] Remove dead code from doPeepholeMaskedRVV [nfc]

This is after lowering of undef to IMPLICIT_DEF, so the condition is always false. Rather than fixing the intent (which was to match implicit_def per the comment), just delete it. We're in the process of migrating away from the TA pseudos, so using _TA more often is fine.

Simon Pilgrim [Wed, 14 Jun 2023 19:48:35 +0000 (20:48 +0100)]

[GlobalIsel][X86] Fix copy+pasta typo in the G_EXTRACT/G_INSERT legal type pairs

Benoit Jacob [Fri, 9 Jun 2023 02:33:51 +0000 (02:33 +0000)]

bazel build --incompatible_no_implicit_file_export

The Bazel build was relying, for the two files enumerated in this diff, on the legacy implicit-export semantics described here:
https://bazel.build/reference/be/functions#exports_files

This documentation page encourages migrating away from this legacy behavior. Indeed, a user reported a Bazel build error, and it appears that they were already using the new, stricter behavior:
https://github.com/openxla/iree/pull/13982
While examining fixes on our side and trying to get a clean Bazel build, I ran into this similar issue in the LLVM overlay.

It would arguably be cleaner (in the sense of more structured) to rely on `filegroup` to export this, but I am insufficiently familiar with the Clang build (the dependent targets seem to be below Clang) to do this myself. The present `exports_files` solution has the merit of being localized in these few lines here.

Differential Revision: https://reviews.llvm.org/D152491

AMS21 [Wed, 14 Jun 2023 18:47:32 +0000 (18:47 +0000)]

[clang-tidy] Fix wrong code generation for `modernize-loop-convert` with structured bindings.

Fixes llvm#62951

Reviewed By: PiotrZSL

Differential Revision: https://reviews.llvm.org/D152852

Eli Friedman [Wed, 14 Jun 2023 18:44:02 +0000 (11:44 -0700)]

[clang docs] Rescue some deleted bits of the command-line reference.

Back when the command-line reference rst was in-tree, a lot of people missed
the "DO NOT EDIT" comment at the top, and then changes were
effectively reverted when the file was regenerated. I went through the
changes, and rescued the interesting bits of documentation that were
destroyed.

Additional notes:

- I'm intentionally leaving out D73459 because I'm not sure how to port
  the changes to -march.
- Some options have help text in Options.td, but that text doesn't make
  it into the reference. Incomplete list of such options:
  -fc++-static-destructors, -frtti-data, -fplt, -fstrict-return,
  -funique-section-names, -fuse-init-array.  Not sure what's happening.

Differential Revision: https://reviews.llvm.org/D152396

Saleem Abdulrasool [Wed, 14 Jun 2023 18:14:17 +0000 (11:14 -0700)]

[lit] Avoid os.path.realpath in lit.py due to MAX_PATH limitations on Windows

lit.py uses os.path.realpath on file paths. Somewhere between Python 3.7
and 3.9, os.path.realpath was updated to resolve substitute drives on
Windows (subst S: C:\Long\Path\To\My\Code). This is a problem because it
prevents using substitute drives to work around MAX_PATH path length
limitations on Windows.

We ran into this while building and testing the Swift compiler on
Windows, which uses a substitute drive in CI to shorten the workspace
directory. cmake builds without resolving the substitute drive and can
apply its logic to avoid output files exceeding MAX_PATH. However, when
running tests, lit.py's use of os.path.realpath will resolve the
substitute drive (with newer Python versions), resulting in some paths
being longer than MAX_PATH, which causes all kinds of failures (for
example, rd in tests fails, or link.exe fails, etc.).

How tested: Ran check-all, and lit tests, saw no failures
```
> ninja -C build check-all
Testing Time: 262.63s
  Skipped          :    24
  Unsupported      :  2074
  Passed           : 51812
  Expectedly Failed:   167

> python utils\lit\lit.py --path ..\build\bin utils\lit\tests
Testing Time: 12.17s
  Unsupported:  6
  Passed     : 47
```

Patch by Tristan Labelle!

Differential Revision: https://reviews.llvm.org/D152709
Reviewed By: rnk, compnerd

Dimitry Andric [Tue, 13 Jun 2023 08:58:07 +0000 (10:58 +0200)]

[Clang] Show type in enum out of range diagnostic

When the diagnostic for an out of range enum value is printed, it
currently does not show the actual enum type in question, for example:

    v8/src/base/bit-field.h:43:29: error: integer value 7 is outside the valid range of values [0, 3] for this enumeration type [-Wenum-constexpr-conversion]
      static constexpr T kMax = static_cast<T>(kNumValues - 1);
                                ^

This can make it cumbersome to find the cause for the problem. Add the
enum type to the diagnostic message, to make it easier.
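
For context, the `[0, 3]` range in the diagnostic above follows from the C++ rule that an enumeration without a fixed underlying type can only hold the values of the smallest bit-field able to represent all of its enumerators. A minimal sketch of that computation for non-negative enumerators is shown below; `EnumRangeMax` and `IsInEnumRange` are illustrative names, not Clang functions, and the sketch only covers a positive largest enumerator.

```cpp
#include <cassert>
#include <cstdint>

// Upper end of the value range of an enumeration without a fixed underlying
// type whose enumerators are all non-negative: the smallest 2^M - 1 that is
// not less than the largest enumerator.
inline std::uint64_t EnumRangeMax(std::uint64_t max_enumerator) {
  assert(max_enumerator > 0 && "sketch only covers a positive largest enumerator");
  std::uint64_t range_max = 0;
  while (range_max < max_enumerator)
    range_max = (range_max << 1) | 1;
  return range_max;
}

// True if `value` lies inside the range computed above.
inline bool IsInEnumRange(std::uint64_t value, std::uint64_t max_enumerator) {
  return value <= EnumRangeMax(max_enumerator);
}
```

With a largest enumerator of 3 the range is `[0, 3]`, so the value 7 from the example falls outside it; with a largest enumerator of 4 the range would grow to `[0, 7]`.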

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D152788

Joseph Huber [Wed, 14 Jun 2023 18:33:00 +0000 (13:33 -0500)]

[libc][obvious] Fix the FMA implementation on the GPU

Summary:
This doesn't include the type_traits to perform the indirection, nor
does it return the value.

Simon Pilgrim [Wed, 14 Jun 2023 18:32:17 +0000 (19:32 +0100)]

Fix MSVC "'std::max': no matching overloaded function found" error. NFCI.

Amaury Séchet [Wed, 14 Jun 2023 17:59:15 +0000 (17:59 +0000)]

[NFC] Autogenerate several AArch64 tests.

Joseph Huber [Wed, 14 Jun 2023 14:35:59 +0000 (09:35 -0500)]

[libc] Add support for FMA in the GPU utilities

This adds the generic FMA utilities for the GPU. We implement these
through the builtins which map to the FMA instructions in the ISA. These
may not have strict compliance with other assumptions in the `libc`
such as rounding modes. I've included the relevant information on how
the GPU vendors map the behaviour. This should help make it easier to
implement some future generic versions.

Depends on D152486

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152923

Joseph Huber [Thu, 8 Jun 2023 22:51:43 +0000 (17:51 -0500)]

[libc] Begin implementing a 'libmgpu.a' for math on the GPU

This patch adds an outline to begin adding a `libmgpu.a` file for
providing math on the GPU. Currently, this is most likely going to be
wrapping around existing vendor libraries and placing them in a more
usable format. Long term, we would like to provide our own
implementations of math functions that can be used instead.

This patch works by simply forwarding the calls to the standard C math
library calls like `sin` to the appropriate vendor call like `__nv_sin`.
Currently, we will use the vendor libraries directly and link them in
via `-mlink-builtin-bitcode`. This is necessary because of bizarre
interactions with the generic bitcode: `-mlink-builtin-bitcode`
internalizes and only links in the used symbols. Furthermore, it
propagates the target's default attributes, and it's the only "truly"
correct way to pull in these vendor bitcode libraries without error.

If the vendor libraries are not available at build time, we will still
create the `libmgpu.a`, but we will expect that the vendor library
definitions will be provided by the user's compilation as is made
possible by https://reviews.llvm.org/D152442.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152486

Kazu Hirata [Wed, 14 Jun 2023 17:56:22 +0000 (10:56 -0700)]

[lldb] Fix a warning

This patch fixes:

  lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4843:13:
  error: 225 enumeration values not handled in switch: 'RvvInt8mf8x2',
  'RvvInt8mf8x3', 'RvvInt8mf8x4'... [-Werror,-Wswitch]

Kazu Hirata [Wed, 14 Jun 2023 17:53:11 +0000 (10:53 -0700)]

[CodeGen] Fix a warning

This patch fixes:

  llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:1790:3: error:
  default label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
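
For readers unfamiliar with this pair of warnings, the following sketch shows the pattern they encourage: every enumerator gets its own `case` and there is no `default:` label. The `Op` enumeration and `OpName` function are invented for the example and have nothing to do with the code touched by the patch.

```cpp
#include <cassert>

enum class Op { Add, Sub, Mul };

// Every enumerator has a case and there is no `default:` label, so
// -Wcovered-switch-default stays quiet, and -Wswitch can flag this function
// if a new enumerator is added later without a matching case.
inline const char *OpName(Op op) {
  switch (op) {
  case Op::Add:
    return "add";
  case Op::Sub:
    return "sub";
  case Op::Mul:
    return "mul";
  }
  // Only reachable if `op` holds a value that is not a declared enumerator.
  assert(false && "unhandled Op");
  return "";
}
```

The trade-off is that the compiler, rather than a catch-all `default:`, becomes responsible for noticing unhandled enumerators.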

Nathan Ridge [Fri, 9 Jun 2023 06:39:28 +0000 (02:39 -0400)]

[clangd] Unwrap type sugar in HeuristicResolver::resolveTypeToRecordDecl()

Fixes https://github.com/clangd/clangd/issues/1663

Differential Revision: https://reviews.llvm.org/D152500

Amaury Séchet [Wed, 14 Jun 2023 17:46:34 +0000 (17:46 +0000)]

[NFC] Autogenerate several AArch64 tests.

Neumann Hon [Wed, 14 Jun 2023 17:37:46 +0000 (13:37 -0400)]

[SystemZ][z/OS] Correct value of length/4 of params field in PPA1.

The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.

Differential Revision: https://reviews.llvm.org/D152920

Reviewed By: uweigand

Christopher Ferris [Wed, 14 Jun 2023 01:56:12 +0000 (18:56 -0700)]

[scudo] Fix MallocIterateBoundary test on 32 bit Android.

On Android, the min alignment is 16 bytes. This test needs
the BlockDelta to match the min alignment, so set this value
differently for Android.

Update the comment to explain these details.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D152884

Neumann Hon [Wed, 14 Jun 2023 17:34:16 +0000 (13:34 -0400)]

Revert "[SystemZ][z/OS] Correct value of length/4 of params field in PPA1."

This reverts commit e0f7b0e0f704dc3759925602e474b9e669270fcb.

Igor Kirillov [Fri, 2 Jun 2023 19:14:07 +0000 (19:14 +0000)]

[CodeGen] Add support for reductions in ComplexDeinterleaving pass

This commit enhances the ComplexDeinterleaving pass to handle unordered
reductions in simple one-block vectorized loops, supporting both
SVE and Neon architectures.

Differential Revision: https://reviews.llvm.org/D152022

Neumann Hon [Wed, 14 Jun 2023 17:20:45 +0000 (13:20 -0400)]

[SystemZ][z/OS] Correct value of length/4 of params field in PPA1.

The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.

Differential revision: https://reviews.llvm.org/D119049

Reviewed By: uweigand

Simon Pilgrim [Wed, 14 Jun 2023 17:09:59 +0000 (18:09 +0100)]

[GlobalIsel][X86] G_EXTRACT/G_INSERT subvector ops

Replace the legacy legalizer versions

Amaury Séchet [Wed, 14 Jun 2023 17:09:55 +0000 (17:09 +0000)]

[NFC] Autogenerate CodeGen/AArch64/sve-vl-arith.ll

Artem Belevich [Wed, 14 Jun 2023 16:21:21 +0000 (09:21 -0700)]

Revert "[NVPTX] Allow using v4i32 for memcpy lowering."

The patch may trigger a hang:
https://github.com/llvm/llvm-project/issues/63294

This reverts commit c16b7e54ac5b4da05c1d19e350ee8e75bf5f8980.

Amaury Séchet [Wed, 14 Jun 2023 16:58:18 +0000 (16:58 +0000)]

[NFC] Autogenerate a couple of AArch64 tests.

Alex Langford [Wed, 7 Jun 2023 01:48:31 +0000 (18:48 -0700)]

[lldb][NFCI] Platforms should own their SDKBuild and SDKRootDirectory strings

These don't need to be ConstStrings. They don't really benefit much from
deduplication and comparing them isn't on a hot path, so they don't
really benefit much from quick comparisons.

Differential Revision: https://reviews.llvm.org/D152331

Philip Reames [Wed, 14 Jun 2023 16:49:58 +0000 (09:49 -0700)]

[RISCV] Enable SLP by default (when vectors are available)

I propose that we go ahead and enable SLP by default. Over the last few weeks, @luke and I have been working through codegen issues seen at small VLs from a couple of SPEC workloads. We still have a ways to go to get optimal codegen, but we're at the point where having a single configuration we're all tuning against is probably the right default.

As a bit of history, I introduced this TTI hook back in a310637132 back in August of last year to unblock enabling LoopVectorizer. At the time, we had a couple known issues: constant materialization, address generation, and a general lack of maturity of small fixed vector codegen. By now, each of these has had significant investment. I can't say any of them are completely fixed, but we're no longer seeing instances of them every place we look.

What we're mostly seeing at this point is a long tail of code gen opportunities, many involving build vectors, shuffles, and extract patterns. I have a couple patches up to continue iterating on those issues, but I don't think they need to be blockers for enabling SLP.

Differential Revision: https://reviews.llvm.org/D152750

Philip Reames [Wed, 14 Jun 2023 16:47:24 +0000 (09:47 -0700)]

[RISCV][InsertVSETVLI] Rework code structure to make reasoning about undefined lanes explicit [NFC]

We already have several places in this code which reason about whether the inactive lanes are defined, and are about to add one more in D151653. Let's go ahead and common the code so that we don't have the same concept repeating in multiple places.

Differential Revision: https://reviews.llvm.org/D152844

Amaury Séchet [Wed, 14 Jun 2023 16:35:38 +0000 (16:35 +0000)]

[NFC] Regenerate CodeGen/AArch64/sve-streaming-mode-fixed-length-*.ll

Philip Reames [Wed, 14 Jun 2023 16:21:31 +0000 (09:21 -0700)]

[RISCV] Canonicalize towards vid w/passthrough representation

This patch teaches performCombineVMergeAndVOps how to handle a True instruction (the one being merged into) which is a _TU pseudo, but with an implicit_def passthrough operand. These are semantically equivalent to the unsuffixed "TA" pseudos, and we can handle them as such.

This is a companion to D152380, and demonstrates the unsuffixed to _TA pseudo transition for a non-VMERGE case. Between the two of them, these should cover all the changes required to the post-ISEL combines, and other arithmetic-like instructions should be just TD changes.

See https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295 for context on the patch series.

Differential Revision: https://reviews.llvm.org/D152740

Amaury Séchet [Wed, 14 Jun 2023 16:27:58 +0000 (16:27 +0000)]

[NFC] Automatically generate arm64-dagcombiner-dead-indexed-load.ll

Shilei Tian [Wed, 14 Jun 2023 16:23:24 +0000 (12:23 -0400)]

[OpenMP] Fix the issue in openmp/runtime/test/parallel/bug63197.c

If the system has 32 threads, then the test will fail because of partial match.

Aart Bik [Wed, 14 Jun 2023 00:51:13 +0000 (17:51 -0700)]

[mlir][sparse] refine single condition set up for semi-ring ops

Reviewed By: Peiming, K-Wu

Differential Revision: https://reviews.llvm.org/D152874

Amaury Séchet [Wed, 14 Jun 2023 16:19:49 +0000 (16:19 +0000)]

[NFC] Regenerate several VE codegen tests.

Krzysztof Parzyszek [Tue, 6 Jun 2023 19:34:40 +0000 (12:34 -0700)]

[RDF] Minor refactoring for clarity, NFC

Krzysztof Parzyszek [Tue, 6 Jun 2023 19:21:45 +0000 (12:21 -0700)]

[RDF] Remove unused variant of getNextShadow, NFC

Amaury Séchet [Wed, 14 Jun 2023 16:08:31 +0000 (16:08 +0000)]

[NFC] Regen CodeGen/AArch64/bitfield-insert.ll

Igor Kirillov [Fri, 2 Jun 2023 19:28:43 +0000 (19:28 +0000)]

[CodeGen] Add pre-commit tests for D152022 and D152558

Differential Revision: https://reviews.llvm.org/D152025

Florian Hahn [Wed, 14 Jun 2023 15:53:33 +0000 (16:53 +0100)]

[LV] Fix crash when stride isn't a constant.

In some cases, the stride may not be a constant. Just skip those cases
for now. This should only happen for cases where LV interleaves only; if
the loop is vectorized, the stride needs to be versioned to a constant.

Craig Topper [Wed, 14 Jun 2023 15:52:51 +0000 (08:52 -0700)]

[SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND.

We don't have VP_ANY_EXTEND or VP_SIGN_EXTEND_INREG yet so I've
deviated a little from the non-VP lowering.

My goal was to fix the crashes that occur on these test cases without this patch.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D152854

Shilei Tian [Wed, 14 Jun 2023 15:51:51 +0000 (11:51 -0400)]

[OpenMP] Use 0 instead of false in the test bug63197.c

Craig Topper [Wed, 14 Jun 2023 15:47:23 +0000 (08:47 -0700)]

[RISCV] Reduce the number of ExtInfo_rr permutations in tablegen.

Add ExtraPreds parameter to FPUnaryOp_r_frm_m to pass IsRV64 so we
don't need RV64 versions of ExtInfo_rr.

Differential Revision: https://reviews.llvm.org/D152890

Shilei Tian [Wed, 14 Jun 2023 15:45:49 +0000 (11:45 -0400)]

[OpenMP] Fix the issue where `num_threads` still takes effect incorrectly

This patch fixes the issue that, if we have a compile-time serialized parallel
region (such as `if (0)`) with `num_threads`, followed by a regular parallel
region, the regular parallel region will pick up the value set in the serialized
parallel region incorrectly. The reason is, in the front end, if we can prove a
parallel region has to be serialized, instead of emitting `__kmpc_fork_call`, the
front end directly emits `__kmpc_serialized_parallel`, body, and `__kmpc_end_serialized_parallel`.
However, this "optimization" doesn't consider the case where `num_threads` is
used such that `__kmpc_push_num_threads` is still emitted. Since we don't reset
the value in `__kmpc_serialized_parallel`, it will affect the next parallel region
followed by it.
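
The stale-state problem described above can be modeled with a toy runtime. This is a sketch only: `MockRuntime` and its members are invented for illustration and do not mirror libomp's data structures, the default of 8 threads is arbitrary, and resetting the pending value when entering the serialized region is one plausible reading of the fix rather than a copy of it.

```cpp
#include <cassert>

// Toy model of the relevant runtime state; not the real libomp structures.
struct MockRuntime {
  int default_threads = 8;
  int pushed_threads = 0;  // 0 means "no num_threads clause pending"

  // Models the effect of a num_threads clause being pushed.
  void PushNumThreads(int n) {
    assert(n > 0);
    pushed_threads = n;
  }

  // Regular parallel region: consume the pending value, if any.
  int ForkCall() {
    const int n = pushed_threads ? pushed_threads : default_threads;
    pushed_threads = 0;
    return n;
  }

  // Serialized region: always runs on one thread. With reset_pending == false
  // the stale value leaks into the next ForkCall, as described above.
  int SerializedParallel(bool reset_pending) {
    if (reset_pending) pushed_threads = 0;
    return 1;
  }
};
```

Calling `PushNumThreads(4)`, then `SerializedParallel(false)`, then `ForkCall()` yields 4 threads for the second region in this model, while `SerializedParallel(true)` restores the default.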

Fix #63197.

Reviewed By: tlwilmar

Differential Revision: https://reviews.llvm.org/D152883

Adrian Prantl [Wed, 14 Jun 2023 00:31:32 +0000 (17:31 -0700)]

Add support for __debug_line_str in Mach-O

This patch resolves an issue that currently accounts for the vast
majority of failures on the matrix bot.

Differential Revision: https://reviews.llvm.org/D152872

zhongyunde [Wed, 14 Jun 2023 15:28:46 +0000 (23:28 +0800)]

[LegalizeTypes][AArch64] Use scalar_to_vector to eliminate bitcast

```
Legalize t3: v2i16 = bitcast i32
with (v2i16 extract_subvector (v4i16 bitcast (v2i32 scalar_to_vector (i32 in))), 0)
```
Fix https://github.com/llvm/llvm-project/issues/61638

NOTE: Don't touch getPreferredVectorAction like X86 as this will touch
too many test cases.

Reviewed By: dmgreen, paulwalker-arm, efriedma
Differential Revision: https://reviews.llvm.org/D147678

zhongyunde [Wed, 14 Jun 2023 15:26:07 +0000 (23:26 +0800)]

[test] Update the checking base for LE and BE

Precommit tests for D147678, as we need tests to cover BE too.

Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D152815

Nikita Popov [Wed, 14 Jun 2023 15:23:51 +0000 (17:23 +0200)]

[InstCombine] Add tests for binop of shift fold (NFC)

Kelvin Li [Fri, 9 Jun 2023 02:57:27 +0000 (22:57 -0400)]

[flang] rename PPC specific intrinsic modules (NFC)

Ricardo Jesus [Wed, 22 Feb 2023 16:02:47 +0000 (16:02 +0000)]

[AArch64] Neoverse V2 scheduling model

This adds a scheduling model for the Neoverse V2. All information is
taken from the Neoverse V2 Software Optimisation Guide:

https://developer.arm.com/documentation/PJDOC-466751330-593177/r0p2

Differential Revision: https://reviews.llvm.org/D151894

Valentin Clement [Wed, 14 Jun 2023 15:17:00 +0000 (08:17 -0700)]

[flang][openacc] Add lowering for min operator

Add lowering support for the min operator
in reduction clause.

Depends on D151565

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D151671

Tue Ly [Wed, 14 Jun 2023 14:59:33 +0000 (10:59 -0400)]

[libc] Fix merging issue with test/src/math/exhaustive/expm1f_test

Kirill Stoimenov [Wed, 14 Jun 2023 14:55:31 +0000 (14:55 +0000)]

[HWASAN] Fix bot test failure caused by D152763 by switching to unaligned memory tagging

Tue Ly [Wed, 14 Jun 2023 14:52:23 +0000 (10:52 -0400)]

[libc] Enable hermetic floating point tests again.

Fix an issue where LLVM libc's fenv.h defined rounding mode macros
differently from the system libc, making get_round() return different values
from fegetround(). Also let math tests skip rounding modes that cannot be
set. This should allow math tests to be run on platforms on which fenv.h is
not implemented yet.

This allows us to re-enable the hermetic floating point tests from
https://reviews.llvm.org/D151123 and revert https://reviews.llvm.org/D152742.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D152873

Kelvin Li [Tue, 30 May 2023 23:37:44 +0000 (19:37 -0400)]

[flang] semantic checking for unsupported features in PPC vector type

Assumed-shape, deferred-shape and assumed rank entities of PPC vector
type are not supported.

Differential Revision: https://reviews.llvm.org/D152864

Simon Pilgrim [Wed, 14 Jun 2023 14:34:02 +0000 (15:34 +0100)]

[GlobalIsel][X86] Update legalization of G_UADDE

Replace the legacy legalizer versions. This is still WIP but matches the existing s32 handling; we should be able to add full scalar support for G_UADDO/G_USUBE/G_USUBO as well very easily.

Alex Brachet [Wed, 14 Jun 2023 14:07:58 +0000 (14:07 +0000)]

[libc][NFC] Fix some issues with LIBC_INLINE

We define LIBC_INLINE to include [[clang::internal_linkage]], and these
must appear before other specifiers. Additionally, there was also a
missing cast that was causing warnings.
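
The ordering constraint can be illustrated with a standard attribute. In this sketch `EXAMPLE_INLINE` is a hypothetical macro standing in for `LIBC_INLINE`, and `[[maybe_unused]]` stands in for `[[clang::internal_linkage]]` so that the example does not depend on a Clang-specific attribute; it requires C++17.

```cpp
// Hypothetical macro; the real LIBC_INLINE definition lives in LLVM libc.
#define EXAMPLE_INLINE [[maybe_unused]] inline

// OK: the attribute-carrying macro comes first and the remaining
// specifiers follow it.
EXAMPLE_INLINE static int Twice(int x) { return 2 * x; }

// Not valid if uncommented: the C++ grammar does not allow an attribute list
// between two declaration specifiers (here, between `static` and `inline`).
// static EXAMPLE_INLINE int Thrice(int x) { return 3 * x; }
```

Placing the macro at the very start of each declaration keeps it valid regardless of which attributes it expands to.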

Differential Revision: https://reviews.llvm.org/D152865

Viktoriia Bakalova [Fri, 9 Jun 2023 15:11:13 +0000 (15:11 +0000)]

[clangd] Use include_cleaner spelling strategies in clangd.

Differential Revision: https://reviews.llvm.org/D152913

Simon Pilgrim [Wed, 14 Jun 2023 14:01:06 +0000 (15:01 +0100)]

[GlobalIsel][X86] Regenerate legalize-add.mir with common CHECK prefix

Guillaume Chatelet [Wed, 14 Jun 2023 11:55:29 +0000 (11:55 +0000)]

[libc] Enable custom logging in LibcTest

This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630

David Spickett [Tue, 6 Jun 2023 08:18:12 +0000 (09:18 +0100)]

[lldb][AArch64] Add Scalable Matrix Extension option to QEMU launch script

The Scalable Matrix Extension (SME) does not require extra options
beyond setting the cpu to "max".

https://qemu-project.gitlab.io/qemu/system/arm/cpu-features.html#sme-cpu-property-examples

SME depends on SVE, so that will be enabled too even if you don't ask
for it by name.

--sve --sme -> SVE and SME
--sme -> SVE and SME
--sve -> Only SVE

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D152519

Krzysztof Parzyszek [Tue, 6 Jun 2023 19:20:37 +0000 (12:20 -0700)]

[RDF] Print something useful for NodeId == 0 instead of crashing

Krzysztof Parzyszek [Tue, 6 Jun 2023 14:51:27 +0000 (07:51 -0700)]

[RDF] Remove unused parameter AllRefs from buildPhis

Francesco Petrogalli [Wed, 14 Jun 2023 13:11:44 +0000 (15:11 +0200)]

[MISched] Fix non-debug builds.

As reported in https://github.com/llvm/llvm-project/issues/63225, we
need to make sure we can use the `&operator<<` on instances of the
`ResourceSegments` class for builds that set `NDEBUG`.

Reviewed By: sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D152817

Amaury Séchet [Wed, 14 Jun 2023 12:57:02 +0000 (12:57 +0000)]

[NFC] Add test cases for isTruncateOf for D151916

Nikita Popov [Wed, 14 Jun 2023 12:56:10 +0000 (14:56 +0200)]

[InstCombine] Avoid infinite loop in insert/extract combine

Fix the infinite loop reported on https://reviews.llvm.org/D151807#4420467.

collectShuffleElements() will widen vectors and replace extracts
via replaceExtractElements(), to allow the next call of
collectShuffleElements() to fold. However, it's possible for another
fold to run first, and break the expected sequence again. To ensure
this does not happen, directly rerun the collectShuffleElements()
fold if we have adjusted extracts.

Kristina Bessonova [Mon, 13 Mar 2023 12:29:13 +0000 (13:29 +0100)]

[DwarfDebug] Move emission of imported entities from beginModule() to endModule() (2/7)

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

!Note! Extracted from the following patch for review purpose only, should
be squashed with the next patch (D144004) before committing.

Currently the back-end emits imported entities in `DwarfDebug::beginModule()`.
However in case an imported declaration is a function, it must point to an
abstract subprogram if it exists (see PR51501). But in `DwarfDebug::beginModule()`
the DWARF generator doesn't have information to identify if an abstract
subprogram needs to be created.

Only upon entering `DwarfDebug::endModule()` have all subprograms been processed,
so it's clear which subprogram DIE should be referred to. Hence, the patch moves
the emission there.

The patch is needed to fix PR51501, but it only does the preliminary
work. Since it changes the order of debug entities in emitted DWARF and
therefore affects many tests, it's separated from the fix for the sake of
simplifying review.

Note that there are other issues with handling an imported declaration in
`DwarfDebug::beginModule()`. They are described in more detail in D114705.

Differential Revision: https://reviews.llvm.org/D143985

Depends on D143984

commit | commitdiff | tree

Takuya Shimizu [Wed, 14 Jun 2023 12:43:03 +0000 (21:43 +0900)]

[clang][Sema] Fix diagnostic message for unused constant variable templates

Before this patch, unused const-qualified variable templates such as `template <typename T> const double var_t = 0;` were diagnosed as `unused variable 'var_t'`.
This patch changes the message to `unused variable template 'var_t'`.

Differential Revision: https://reviews.llvm.org/D152796

commit | commitdiff | tree

Krishna Narayanan [Wed, 14 Jun 2023 12:28:35 +0000 (08:28 -0400)]

Update with warning message for comparison to NULL pointer

The tautological comparison warning was not properly looking through
parenthesized expressions, which is now fixed.

Fixes https://github.com/llvm/llvm-project/issues/42992
Differential Revision: https://reviews.llvm.org/D149000

commit | commitdiff | tree

Jay Foad [Wed, 14 Jun 2023 10:44:41 +0000 (11:44 +0100)]

[update_mir_test_checks] Tolerate -simplify-mir output

D135579 added support for fixedStack, but did not cope with the output
of -simplify-mir which does not include the fixedStack section by
default.

Differential Revision: https://reviews.llvm.org/D152896

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jun 2023 12:07:57 +0000 (13:07 +0100)]

[docs] Add missing label

commit | commitdiff | tree

Matt Arsenault [Sat, 10 Jun 2023 17:22:34 +0000 (13:22 -0400)]

LowerMemIntrinsics: Check address space aliasing for memmove expansion

For cases where we cannot insert an addrspacecast, we can still expand
like a memcpy if we know the address spaces cannot alias. Normally
non-aliasing memmoves are optimized to memcpy, but we cannot rely on
that for lowering. If a target has aliasing address spaces that cannot
be cast between, we still have to give up lowering this.

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jun 2023 11:49:07 +0000 (12:49 +0100)]

[docs] Add missing empty line at start of code-block

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jun 2023 11:25:59 +0000 (12:25 +0100)]

[X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets (REAPPLIED)

lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases, resulting in a lot of situations where a full width vector load that has failed to fold but is loading splat constant values could use a broadcast load instruction just as cheaply, and save constant pool space.

This is an updated commit of ab4b924832ce26c21b88d7f82fcf4992ea8906bb after being reverted at 78de45fd4a902066617fcc9bb88efee11f743bc6

commit | commitdiff | tree

Nemanja Ivanovic [Wed, 14 Jun 2023 11:45:05 +0000 (06:45 -0500)]

[clang-tidy] Fix build bot break after 474a2b9367ad

The commit added clang-tidy checks without adding
the required library to the link step.
This caused failures with shared library builds.

commit | commitdiff | tree

Ivan Kosarev [Wed, 14 Jun 2023 10:53:12 +0000 (11:53 +0100)]

[AMDGPU][AsmParser][NFC] Get rid of custom default operand handlers.

Removes the need to add and remove them manually depending on whether
they are used in cvt*() functions. Also removes the compiler warnings
about unused handlers when it happens to be the case.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D151688

commit | commitdiff | tree

Jay Foad [Wed, 14 Jun 2023 11:05:38 +0000 (12:05 +0100)]

[AMDGPU] Use a common check prefix in regbankselect-amdgcn.s.buffer.load.ll

commit | commitdiff | tree

Mikael Holmen [Mon, 12 Jun 2023 12:08:28 +0000 (14:08 +0200)]

[SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize

We don't want the existence of debug instructions to affect codegen, so we now
ignore debug instructions and other "isAssumeLikeIntrinsics" in the
"extend schedule region" search loop in
BoUpSLP::BlockScheduling::extendSchedulingRegion.

Differential Revision: https://reviews.llvm.org/D152441

commit | commitdiff | tree

Ivan Kosarev [Wed, 14 Jun 2023 10:40:48 +0000 (11:40 +0100)]

[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 2.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152715

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jun 2023 10:39:12 +0000 (11:39 +0100)]

[docs] Add missing empty line before lists

commit | commitdiff | tree

Guillaume Chatelet [Wed, 14 Jun 2023 10:31:49 +0000 (10:31 +0000)]

Revert D152630 "[libc] Enable custom logging in LibcTest"

Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707
This reverts commit 9a7b4c934893d6bc571e1ce8efab2127ae5f4e45.

commit | commitdiff | tree

Guillaume Chatelet [Wed, 14 Jun 2023 09:18:42 +0000 (09:18 +0000)]

commit | commitdiff | tree

Simon Pilgrim [Wed, 14 Jun 2023 10:06:15 +0000 (11:06 +0100)]

[CostModel][X86] Tweak SSE2 v2i64 multiply costs based off D46276 script

It looks like we were trying to account for SLM costs, which are actually handled separately

Fixes #62969

commit | commitdiff | tree

Simon Pilgrim [Tue, 13 Jun 2023 19:06:21 +0000 (20:06 +0100)]

[TTI][X86] Recognise PMULUDQ costs for vXi64 multiplies

Addresses part of Issue #62969 - if the upper 32-bits of the vXi64 elements are known to be zero, then a multiply simplifies to a single (fast) PMULUDQ instruction

We still have the problem that minRequiredElementSize can't determine that the upper bits are zero for the test case from Issue #62969 - I'll take a look at that next.
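
The arithmetic behind this can be illustrated on a single scalar lane. The following is a minimal sketch, not code from the patch: `upperHalfIsZero` and `mulLow32Widening` are hypothetical helper names, and `mulLow32Widening` only models what PMULUDQ computes per 64-bit lane (an unsigned 32x32 -> 64-bit multiply of the low halves).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the upper 32 bits of a 64-bit lane are zero.
inline bool upperHalfIsZero(std::uint64_t v) { return (v >> 32) == 0; }

// Scalar model of one PMULUDQ lane: multiply the low 32 bits of each operand
// as unsigned values and keep the full 64-bit product.
inline std::uint64_t mulLow32Widening(std::uint64_t a, std::uint64_t b) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(a)) *
         static_cast<std::uint64_t>(static_cast<std::uint32_t>(b));
}

// Full 64-bit multiply (wraps modulo 2^64), i.e. the general vXi64 case.
inline std::uint64_t mulFull64(std::uint64_t a, std::uint64_t b) {
  return a * b;
}
```

When `upperHalfIsZero` holds for both operands, `mulFull64` and `mulLow32Widening` agree, which is why the cheaper cost applies; if either operand has upper bits set, the two results can differ.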

commit | commitdiff | tree

Théo Degioanni [Wed, 14 Jun 2023 08:43:10 +0000 (08:43 +0000)]

[mlir][llvm] Add memset support for mem2reg/sroa

This revision introduces support for memset intrinsics in SROA and
mem2reg for the LLVM dialect. This is achieved for SROA by breaking
memsets of aggregates into multiple memsets of scalars, and for mem2reg
by promoting memsets of single integer slots into the value the memset
operation would yield.

The SROA logic supports breaking memsets of static size operating at the
start of a memory slot. The intended most common case is for memsets
covering the entirety of a struct, most often as a way to initialize it
to 0.

The mem2reg logic supports dynamic values and static sizes as input to
promotable memsets. This is achieved by lowering memsets into
`ceil(log_2(n))` LeftShift operations, `ceil(log_2(n))` Or operations
and up to one ZExt operation (for n the byte width of the integer),
computing in registers the integer value the memset would create. Only
byte-aligned integers are supported, more types could easily be added
afterwards.
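
The shift/or doubling scheme described above can be sketched on plain integers. This is only an illustration under assumptions, not the MLIR implementation: `broadcastByte` is a hypothetical name, and the sketch is limited to integers of 1 to 8 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: compute the integer that a memset of `byte`
// over an integer of `numBytes` bytes (1..8) would produce.
// If `steps` is non-null, it receives the number of shift/or pairs used,
// which is ceil(log2(numBytes)).
inline std::uint64_t broadcastByte(std::uint8_t byte, unsigned numBytes,
                                   unsigned *steps = nullptr) {
  std::uint64_t value = byte; // plays the role of the single ZExt
  unsigned count = 0;
  // Each iteration doubles the number of populated bytes.
  for (unsigned shift = 8; shift < 8 * numBytes; shift *= 2) {
    value |= value << shift;
    ++count;
  }
  // Truncate to the target width for widths below 64 bits.
  if (numBytes < 8)
    value &= (1ull << (8 * numBytes)) - 1;
  if (steps)
    *steps = count;
  return value;
}
```

For a 4-byte integer this uses two shift/or pairs (shifts by 8 and 16), and for an 8-byte integer three (shifts by 8, 16 and 32).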

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D152367

commit | commitdiff | tree

Cullen Rhodes [Wed, 14 Jun 2023 09:02:53 +0000 (09:02 +0000)]

Revert "[mlir][ArmSME] Add initial dialect with basic lowering of vector.transfer write to zero"

Apologies, I shouldn't have committed this; we need to wait until the planned
MLIR ODM:

https://discourse.llvm.org/t/rfc-creating-a-armsme-dialect/67208/76

This reverts commit a48fe898857c95a063fa6c201343dca969bc098a.

commit | commitdiff | tree

Cullen Rhodes [Wed, 14 Jun 2023 08:26:44 +0000 (08:26 +0000)]

[mlir][ArmSME] Add initial dialect with basic lowering of vector.transfer write to zero

This patch adds support for lowering a `vector.transfer_write` of zeroes
and type `vector<[16x16]xi8>` to the SME `zero {za}` instruction [1],
which zeroes the entire accumulator.

This contributes to supporting a path from `linalg.fill` to SME.

[1] https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/ZERO--Zero-a-list-of-64-bit-element-ZA-tiles-

Reviewed By: awarzynski, dcaballe

Differential Revision: https://reviews.llvm.org/D152508

commit | commitdiff | tree

Guillaume Chatelet [Tue, 13 Jun 2023 14:41:17 +0000 (14:41 +0000)]

[libc] Dispatch memmove to memcpy when buffers are disjoint

Most of the time `memmove` is called on buffers that are disjoint; in that case we can use `memcpy`, which is faster.
The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip).
On x86 this patch adds a latency of 2 to 3 cycles.

Before
```
--------------------------------------------------------------------------------
Benchmark                      Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------
BM_Memmove/0/0_median       5.00 ns         5.00 ns           10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       6.21 ns         6.21 ns           10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       8.09 ns         8.09 ns           10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       5.95 ns         5.95 ns           10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       5.63 ns         5.63 ns           10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       5.68 ns         5.68 ns           10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       7.46 ns         7.46 ns           10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       5.40 ns         5.40 ns           10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       5.62 ns         5.62 ns           10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median        101 ns          101 ns           10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096
```
After
```
BM_Memmove/0/0_median       3.57 ns         3.57 ns           10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       4.52 ns         4.52 ns           10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       5.70 ns         5.70 ns           10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       4.47 ns         4.47 ns           10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       4.53 ns         4.53 ns           10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       4.19 ns         4.19 ns           10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       5.02 ns         5.02 ns           10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       4.03 ns         4.03 ns           10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       4.70 ns         4.70 ns           10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median       90.7 ns         90.7 ns           10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096
```
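
The dispatch idea can be sketched as follows. This is a minimal illustration and not the LLVM libc implementation: `is_disjoint` and `memmove_sketch` are hypothetical names, the sketch falls back to the standard `std::memmove` for overlapping buffers, and no claim is made that it compiles to branchless code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when [dst, dst + n) and [src, src + n)
// do not overlap, based on the distance between the start addresses.
inline bool is_disjoint(const void *dst, const void *src, std::size_t n) {
  const auto d = reinterpret_cast<std::uintptr_t>(dst);
  const auto s = reinterpret_cast<std::uintptr_t>(src);
  const std::uintptr_t dist = d > s ? d - s : s - d;
  return dist >= n;
}

// Hypothetical wrapper: use memcpy when the buffers are disjoint,
// otherwise fall back to an overlap-safe memmove.
inline void *memmove_sketch(void *dst, const void *src, std::size_t n) {
  if (is_disjoint(dst, src, n))
    return std::memcpy(dst, src, n);
  return std::memmove(dst, src, n);
}
```

The extra comparison is paid on every call, which matches the small added latency mentioned above; the gain comes from the common disjoint case taking the `memcpy` path.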

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D152811

commit | commitdiff | tree

Vitaly Buka [Wed, 14 Jun 2023 08:15:59 +0000 (01:15 -0700)]

[test][hwasan] Allow test for any platform with tagging

commit | commitdiff | tree

Carl Ritson [Wed, 14 Jun 2023 08:13:32 +0000 (17:13 +0900)]

[AMDGPU] Pre-commit test for D152892 (NFC)

Domain: System / Toolchain;
