platform/upstream/llvm.git
17 months ago[Demangle] fix comment NFC
Nick Desaulniers [Mon, 15 May 2023 21:32:38 +0000 (14:32 -0700)]
[Demangle] fix comment NFC

The second and third parameters of itaniumDemangle were removed in
commit 7277a72b908d ("[Demangle] remove unused params of itaniumDemangle")
Update a comment to reflect this.

Reviewed By: nathanchance

Differential Revision: https://reviews.llvm.org/D149975

17 months ago[lldb] Set CMAKE_CXX_STANDARD before including LLDBStandalone
Jonas Devlieghere [Mon, 15 May 2023 21:29:20 +0000 (14:29 -0700)]
[lldb] Set CMAKE_CXX_STANDARD before including LLDBStandalone

Set the C++ language standard before including LLDBStandalone.cmake.
Otherwise we risk building some of our dependencies (such as llvm_gtest)
without C++17 support.

This should fix the standalone bot [1] which is currently failing with the
following error:

  test-port.h:841:12: error: no member named 'tuple' in namespace 'std'
  using std::tuple;

[1] https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake-standalone

17 months ago[lldb] Change definition of DisassemblerCreateInstance
Alex Langford [Wed, 10 May 2023 00:04:37 +0000 (17:04 -0700)]
[lldb] Change definition of DisassemblerCreateInstance

DisassemblerCreateInstance is a function pointer whose return type is
`Disassembler *`. But Disassembler::FindPlugin always returns a
DisassemblerSP, so there's no reason why we can't just create a
DisassemblerSP in the first place.
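
A minimal sketch of the idea, using hypothetical stand-in types rather than
the real lldb classes: the create callback can hand back the shared pointer
directly, so the caller no longer has to wrap an owning raw pointer. All
names below (the toy Disassembler, findPluginOld, findPluginNew) are
assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the real Disassembler class.
struct Disassembler {
  explicit Disassembler(int flavorId) : flavor(flavorId) {}
  int flavor;
};
using DisassemblerSP = std::shared_ptr<Disassembler>;

// Before: the callback returns an owning raw pointer.
using RawCreateInstance = Disassembler *(*)(int flavorId);
// After: the callback returns the shared pointer directly.
using SPCreateInstance = DisassemblerSP (*)(int flavorId);

Disassembler *createRaw(int flavorId) { return new Disassembler(flavorId); }
DisassemblerSP createSP(int flavorId) {
  return std::make_shared<Disassembler>(flavorId);
}

// Old shape: the caller must remember to take ownership of the raw pointer.
DisassemblerSP findPluginOld(RawCreateInstance create, int flavorId) {
  assert(create != nullptr);
  return DisassemblerSP(create(flavorId));
}

// New shape: ownership is explicit in the callback's return type.
DisassemblerSP findPluginNew(SPCreateInstance create, int flavorId) {
  assert(create != nullptr);
  return create(flavorId);
}
```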

Differential Revision: https://reviews.llvm.org/D150235

17 months ago[clang][modules] NFC: Only sort interesting identifiers
Jan Svoboda [Mon, 15 May 2023 20:28:07 +0000 (13:28 -0700)]
[clang][modules] NFC: Only sort interesting identifiers

In 9c254184 `ASTWriter` stopped writing identifiers that are not interesting. Taking it a bit further, we don't need to sort the whole identifier table, just the interesting identifiers. This reduces the size of the sorted vector from ~10k (including lots of builtins) to 2 (`__VA_ARGS__` and `__VA_OPT__`) in a typical Xcode project, improving `clang-scan-deps` performance.
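
The filter-then-sort idea can be sketched as follows. This is not the
`ASTWriter` code; the `Identifier` struct and `sortedInterestingNames` are
hypothetical names used only to show that sorting happens after the
uninteresting entries have been dropped.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an identifier table entry.
struct Identifier {
  std::string name;
  bool interesting;
};

// Collect only the interesting identifiers, then sort that small vector
// instead of sorting the whole table.
std::vector<std::string>
sortedInterestingNames(const std::vector<Identifier> &table) {
  std::vector<std::string> names;
  for (const Identifier &id : table)
    if (id.interesting)
      names.push_back(id.name);
  std::sort(names.begin(), names.end());
  return names;
}
```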

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D150494

17 months ago[LLDB] Fix TestDataFormatterSynthVal.py for AArch64/Windows
Muhammad Omair Javaid [Mon, 15 May 2023 20:09:13 +0000 (00:09 +0400)]
[LLDB] Fix TestDataFormatterSynthVal.py for AArch64/Windows

Since 44363f2 various tests have started passing, but that commit also
introduced an expression evaluation failure in TestDataFormatterSynthVal.py.
This patch marks the expression evaluation part as skipped while the rest
of the test passes.
This patch also introduces a new helper isAArch64Windows in lldbtest.py.

17 months agoasan-rt: Silence a few -Wformat=pedantic's in asan_mac.cpp
Jon Roelofs [Mon, 15 May 2023 19:57:39 +0000 (12:57 -0700)]
asan-rt: Silence a few -Wformat=pedantic's in asan_mac.cpp

Differential revision: https://reviews.llvm.org/D150604

17 months ago[clang-tidy] Extract areStatementsIdentical
Piotr Zegar [Mon, 15 May 2023 19:23:56 +0000 (19:23 +0000)]
[clang-tidy] Extract areStatementsIdentical

Move areStatementsIdentical from BranchCloneCheck into ASTUtils.
Add small improvements. Use it in LoopConvertUtils.

Reviewed By: carlosgalvezp

Differential Revision: https://reviews.llvm.org/D148995

17 months agoRevert "Emit the correct flags for the PROC CodeView Debug Symbol"
Muhammad Omair Javaid [Mon, 15 May 2023 19:13:19 +0000 (23:13 +0400)]
Revert "Emit the correct flags for the PROC CodeView Debug Symbol"

This reverts commit e48826e016e2f427f3b7b1274166aa9aa0ea7f4f.

https://lab.llvm.org/buildbot/#/builders/219/builds/2520

lldb-shell :: SymbolFile/PDB/function-nested-block.test

Differential Revision: https://reviews.llvm.org/D148761

17 months ago[libc++] Revert moving the pre-release checklist
Louis Dionne [Mon, 15 May 2023 19:35:15 +0000 (12:35 -0700)]
[libc++] Revert moving the pre-release checklist

I had not seen https://reviews.llvm.org/D150585 which supersedes
this, and I want to avoid merge conflicts for D150585.

17 months ago[clang] Fix emitVoidPtrVAArg for non-zero default alloca address space
Jessica Clarke [Mon, 15 May 2023 19:26:49 +0000 (20:26 +0100)]
[clang] Fix emitVoidPtrVAArg for non-zero default alloca address space

Indirect arguments are passed on the stack and so va_arg should use the
default alloca address space, not hard-code 0, for pointers to those.
The only in-tree target with a non-zero default alloca address space is
AMDGPU, but that does not support variadic arguments, so we cannot test
this upstream. However, downstream in CHERI LLVM (and Morello LLVM, a
further fork of that) we have targets that do both and so require this
change.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D132247

17 months agoFix ConstShapeOp::inferReturnTypes to be resilient to lack of properties
Mehdi Amini [Mon, 15 May 2023 19:12:28 +0000 (12:12 -0700)]
Fix ConstShapeOp::inferReturnTypes to be resilient to lack of properties

The Python bindings tests aren't using properties yet, this is a bit
of a hack to support this here, but hopefully it'll be temporary.

17 months agoAdd an operator == and != to properties, use it in DuplicateFunctionElimination
Mehdi Amini [Mon, 15 May 2023 18:03:24 +0000 (11:03 -0700)]
Add an operator == and != to properties, use it in DuplicateFunctionElimination

Differential Revision: https://reviews.llvm.org/D150596

17 months agoRe-land "[-Wunsafe-buffer-usage] Remove an unnecessary const-qualifier"
ziqingluo-90 [Mon, 15 May 2023 18:55:35 +0000 (11:55 -0700)]
Re-land "[-Wunsafe-buffer-usage] Remove an unnecessary const-qualifier"

Re-land 7a0900fd3e2d34bc1d513a97cf8fbdc1754252d7, which included too
many clang-format changes.  This re-land gets rid of the format changes.

17 months ago[docs] Use doxygen to describe the field `StartAtCycle`. [NFCI]
Francesco Petrogalli [Mon, 15 May 2023 13:49:04 +0000 (15:49 +0200)]
[docs] Use doxygen to describe the field `StartAtCycle`. [NFCI]

Differential Revision: https://reviews.llvm.org/D150572

17 months agoRevert "[lldb] Refactor SBFileSpec::GetDirectory"
Muhammad Omair Javaid [Mon, 15 May 2023 10:25:52 +0000 (14:25 +0400)]
Revert "[lldb] Refactor SBFileSpec::GetDirectory"

This reverts commit 2bea2d7b070dc5df723ce2b92dbc654b8bb1847e.

It introduced the following failures on buildbot lldb-aarch64-windows:

lldb-api :: functionalities/process_save_core/TestProcessSaveCore.py
lldb-api :: python_api/symbol-context/TestSymbolContext.py

Differential Revision: https://reviews.llvm.org/D149625

17 months agoRevert "[-Wunsafe-buffer-usage] Remove an unnecessary const-qualifier"
ziqingluo-90 [Mon, 15 May 2023 18:25:52 +0000 (11:25 -0700)]
Revert "[-Wunsafe-buffer-usage] Remove an unnecessary const-qualifier"

This reverts commit 7a0900fd3e2d34bc1d513a97cf8fbdc1754252d7.

The commit includes too many clang-format changes.

17 months agoCleanup uses of getAttrDictionary() in MLIR to use getDiscardableAttrDictionary(...
Mehdi Amini [Mon, 15 May 2023 05:39:50 +0000 (22:39 -0700)]
Cleanup uses of getAttrDictionary() in MLIR to use getDiscardableAttrDictionary() when possible

This also speeds up some benchmarks compiling a simple Fortran file by 2x!
Fixes #62687

Differential Revision: https://reviews.llvm.org/D150540

17 months ago[libc++][NFC] Reformat test
Louis Dionne [Mon, 15 May 2023 18:34:54 +0000 (11:34 -0700)]
[libc++][NFC] Reformat test

I didn't notice in the review that clang-format did a poor job of
formatting the test, so I went back and did it manually.

17 months ago[libc++][NFC] Use angle brackets to include ranges_mismatch.h
Louis Dionne [Mon, 15 May 2023 18:19:23 +0000 (11:19 -0700)]
[libc++][NFC] Use angle brackets to include ranges_mismatch.h

17 months ago[gn build] Port 61d5671c1697
LLVM GN Syncbot [Mon, 15 May 2023 18:29:44 +0000 (18:29 +0000)]
[gn build] Port 61d5671c1697

17 months ago[gn build] Port 205175578e0d
LLVM GN Syncbot [Mon, 15 May 2023 18:29:43 +0000 (18:29 +0000)]
[gn build] Port 205175578e0d

17 months ago[libc++] Removes _LIBCPP_ABI_OLD_LOGNORMAL_DISTRIBUTION
Mark de Wever [Sun, 7 May 2023 17:50:41 +0000 (19:50 +0200)]
[libc++] Removes _LIBCPP_ABI_OLD_LOGNORMAL_DISTRIBUTION

This was planned for LLVM 15 but was never done.

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D150580

17 months ago[flang][openacc] Lower host_data construct
Valentin Clement [Mon, 15 May 2023 18:22:12 +0000 (11:22 -0700)]
[flang][openacc] Lower host_data construct

Lower host_data construct to the acc.host_data operation.

Depends on D150289

Reviewed By: razvanlupusoru, jeanPerier

Differential Revision: https://reviews.llvm.org/D150290

17 months ago[SLP][NFC]Add missing finalize params in the CostEstimator, NFC.
Alexey Bataev [Fri, 5 May 2023 21:14:39 +0000 (14:14 -0700)]
[SLP][NFC]Add missing finalize params in the CostEstimator, NFC.

Prepare functions for generalization of codegen/cost estimation.

Differential Revision: https://reviews.llvm.org/D150121

17 months ago[libc++] Implement ranges::starts_with
zijunzhao [Mon, 8 May 2023 22:04:00 +0000 (22:04 +0000)]
[libc++] Implement ranges::starts_with

17 months ago[LLD][ELF] change CHECK to CHECK-NEXT in overlay-phdr.test NFCI
Peter Smith [Mon, 15 May 2023 18:01:05 +0000 (19:01 +0100)]
[LLD][ELF] change CHECK to CHECK-NEXT in overlay-phdr.test NFCI

A code-review comment to change a couple of CHECK to CHECK-NEXT that I
forgot to apply prior to committing.

Differential Revision: https://reviews.llvm.org/D150445

17 months agoRevert "[libc++][PSTL] Implement std::copy{,_n}"
Nikolas Klauser [Mon, 15 May 2023 17:33:40 +0000 (10:33 -0700)]
Revert "[libc++][PSTL] Implement std::copy{,_n}"

This reverts commit b049fc0481bc387f57fd61da7239f85ef91096c1.

The wrong patch was landed.

17 months ago[mlir][sparse][gpu] end-to-end integration test of GPU libgen approach
Aart Bik [Mon, 15 May 2023 17:27:39 +0000 (10:27 -0700)]
[mlir][sparse][gpu] end-to-end integration test of GPU libgen approach

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D150172

17 months agoRevert "[AIX][tests] XFAIL -ftime-trace test for now"
Jake Egan [Mon, 15 May 2023 17:48:05 +0000 (13:48 -0400)]
Revert "[AIX][tests] XFAIL -ftime-trace test for now"

The test was fixed by 2f999327534f7cc660d2747ce294f50184dc1f97.

This reverts commit 25dc215ddaa6cb3e206858008fe4bc6844ea0d9c.

17 months ago[flang][runtime] Fixed memory leak in Assign().
Slava Zakharin [Mon, 15 May 2023 16:52:14 +0000 (09:52 -0700)]
[flang][runtime] Fixed memory leak in Assign().

The temporary descriptor must be either Pointer or Allocatable,
otherwise its memory will not be freed.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D150534

17 months ago[flang][runtime] Fixed dimension offset computation for MayAlias.
Slava Zakharin [Mon, 15 May 2023 16:52:07 +0000 (09:52 -0700)]
[flang][runtime] Fixed dimension offset computation for MayAlias.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D150533

17 months ago[gn build] Port b049fc0481bc
LLVM GN Syncbot [Mon, 15 May 2023 17:38:45 +0000 (17:38 +0000)]
[gn build] Port b049fc0481bc

17 months ago[libc++][docs] Move the pre-release check-list
Louis Dionne [Mon, 15 May 2023 17:34:17 +0000 (10:34 -0700)]
[libc++][docs] Move the pre-release check-list

It was confusing to some contributors because it appeared in a
prominent place on the Contributing page.

17 months ago[libc++][PSTL] Implement std::copy{,_n}
Nikolas Klauser [Fri, 5 May 2023 16:24:58 +0000 (09:24 -0700)]
[libc++][PSTL] Implement std::copy{,_n}

Reviewed By: ldionne, #libc

Spies: jloser, libcxx-commits

Differential Revision: https://reviews.llvm.org/D149706

17 months agoEnable frame pointer for all non-leaf functions on riscv64 Android
AdityaK [Mon, 15 May 2023 17:17:39 +0000 (10:17 -0700)]
Enable frame pointer for all non-leaf functions on riscv64 Android

Bringing parity with aarch64-android https://github.com/google/android-riscv64/issues/9#issuecomment-1535454205

Reviewers: enh, danalbert, pirama, srhines

Differential Revision: https://reviews.llvm.org/D150490

17 months agoFix build failure caused by https://reviews.llvm.org/D150352
Amy Kwan [Mon, 15 May 2023 16:53:12 +0000 (11:53 -0500)]
Fix build failure caused by https://reviews.llvm.org/D150352

This patch fixes the following build error on the clang-ppc64le-rhel bot seen
in https://lab.llvm.org/buildbot/#/builders/57/builds/26816/steps/5/logs/stdio:

FAILED: tools/clang/tools/extra/clang-tidy/bugprone/CMakeFiles/obj.clangTidyBugproneModule.dir/UncheckedOptionalAccessCheck.cpp.o
.../clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy/bugprone/UncheckedOptionalAccessCheck.cpp:43:27: error: 'build' is deprecated: Use the version that takes a const Decl & instead [-Werror,-Wdeprecated-declarations]
      ControlFlowContext::build(&FuncDecl, *FuncDecl.getBody(), ASTCtx);
                          ^
.../ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h:41:3: note: 'build' has been explicitly marked deprecated here
  LLVM_DEPRECATED("Use the version that takes a const Decl & instead", "")
  ^
.../clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Compiler.h:143:50: note: expanded from macro 'LLVM_DEPRECATED'
#define LLVM_DEPRECATED(MSG, FIX) __attribute__((deprecated(MSG, FIX)))
                                                 ^
1 error generated.

17 months ago[clang] Convert a few OpenMP tests to use opaque pointers
Sergei Barannikov [Sun, 14 May 2023 17:46:59 +0000 (20:46 +0300)]
[clang] Convert a few OpenMP tests to use opaque pointers

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D150530

17 months ago[SLP][NFC] Cleanup: Separate vectorization of Inserts and CmpInsts.
Vasileios Porpodas [Fri, 5 May 2023 20:23:21 +0000 (13:23 -0700)]
[SLP][NFC] Cleanup: Separate vectorization of Inserts and CmpInsts.

This deprecates `vectorizeSimpleInstructions()` and replaces it with separate
functions that vectorize CmpInsts and Inserts.

Differential Revision: https://reviews.llvm.org/D149993

17 months ago[mlir] Fix a warning
Kazu Hirata [Mon, 15 May 2023 17:06:15 +0000 (10:06 -0700)]
[mlir] Fix a warning

This patch fixes:

  mlir/lib/Dialect/MemRef/Utils/MemRefUtils.cpp:45:2: error: extra ';'
  outside of a function is incompatible with C++98
  [-Werror,-Wc++98-compat-extra-semi]

17 months ago[gn build] Port 6851d078c54e
LLVM GN Syncbot [Mon, 15 May 2023 16:58:34 +0000 (16:58 +0000)]
[gn build] Port 6851d078c54e

17 months ago[libc++][PSTL] Implement std::transform
Nikolas Klauser [Mon, 15 May 2023 14:07:45 +0000 (07:07 -0700)]
[libc++][PSTL] Implement std::transform

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D149615

17 months ago[test] Fix const-str-array-decay.cl failure on PowerPC
Sergei Barannikov [Mon, 15 May 2023 16:54:40 +0000 (19:54 +0300)]
[test] Fix const-str-array-decay.cl failure on PowerPC

D150520 converted the test to use opaque pointers. The updated version
fails on PowerPC because the function has a different return type there.
This patch resolves the failure by removing the return type check;
it also makes the test look more like it was before the conversion to
prevent other potential issues caused by ABI differences across targets.

17 months ago[clang][AIX] Remove Newly Added Target Dependent Test Case
Qiongsi Wu [Mon, 15 May 2023 16:37:32 +0000 (12:37 -0400)]
[clang][AIX] Remove Newly Added Target Dependent Test Case

https://reviews.llvm.org/D144190 added a test case that is target dependent and requires assembly code generation, which fails on x64 and aarch64 buildbots. This patch removes the test case. We have test cases for code generation added in https://reviews.llvm.org/D144189 already and this removed case was nice to have.

Differential Revision: https://reviews.llvm.org/D150586

17 months ago[flang][hlfir] Fixed copy-in for polymorphic arguments.
Slava Zakharin [Mon, 15 May 2023 16:02:14 +0000 (09:02 -0700)]
[flang][hlfir] Fixed copy-in for polymorphic arguments.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D150502

17 months ago[flang][hlfir] Fixed lowering for intrinsic calls with null() box argument.
Slava Zakharin [Mon, 15 May 2023 16:02:14 +0000 (09:02 -0700)]
[flang][hlfir] Fixed lowering for intrinsic calls with null() box argument.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D150501

17 months agoFix test from b763d6a4ed4650c74c6846d743156468563b0e31
Erich Keane [Mon, 15 May 2023 16:43:25 +0000 (09:43 -0700)]
Fix test from b763d6a4ed4650c74c6846d743156468563b0e31

17 months ago[Mips] Remove MipsRegisterInfo::requiresRegisterScavenging. NFC.
Jay Foad [Mon, 15 May 2023 16:29:00 +0000 (17:29 +0100)]
[Mips] Remove MipsRegisterInfo::requiresRegisterScavenging. NFC.

This method is unused since MipsRegisterInfo is abstract and it is
overridden in both concrete subclasses.

17 months ago[libc][NFC] Clean up the memory buffer handling for RPC
Joseph Huber [Mon, 15 May 2023 14:46:56 +0000 (09:46 -0500)]
[libc][NFC] Clean up the memory buffer handling for RPC

We do a lot of arithmetic on void pointers here, so include a helper and
make some more consistent names. Changes no functionality.
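
A sketch of what such a helper might look like. The names `advance` and
`slotAt` are assumptions for illustration, not necessarily the names used in
the libc RPC code; the point is only that the byte-wise cast lives in one
place instead of being repeated at every call site.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: move a void pointer forward by a number of bytes.
// Arithmetic on void * is not valid C++, so cast to a byte pointer first.
inline void *advance(void *ptr, std::size_t bytes) {
  assert(ptr != nullptr);
  return static_cast<unsigned char *>(ptr) + bytes;
}

// Hypothetical use: address of the index-th fixed-size slot in a buffer.
inline void *slotAt(void *base, std::size_t slotSize, std::size_t index) {
  return advance(base, slotSize * index);
}
```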

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D150576

17 months ago[AMDGPU] Trim zero components from buffer and image stores
Mateja Marjanovic [Mon, 15 May 2023 16:20:50 +0000 (18:20 +0200)]
[AMDGPU] Trim zero components from buffer and image stores

For image and buffer stores the default behaviour on GFX11 and
older is to set all unset components to zero. So if we pass
only X component it will be the same as X000, or XY same as XY00.

This patch simplifies the passed vector of components in InstCombine
by removing zero components from the end.

For image stores it also trims DMask if necessary.
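
The component trimming can be illustrated with a small standalone sketch.
It is not the InstCombine implementation: `trimTrailingZeroComponents` is a
hypothetical function on a plain vector of floats, DMask handling is not
modeled, and keeping at least one component and leaving negative zero alone
are assumptions of this sketch.

```cpp
#include <cmath>
#include <vector>

// Drop zero components from the end of the stored vector, so that
// X000 becomes X and XY00 becomes XY. Zeros in the middle are kept.
std::vector<float> trimTrailingZeroComponents(std::vector<float> components) {
  while (components.size() > 1 && components.back() == 0.0f &&
         !std::signbit(components.back())) // keep -0.0, its bits are not zero
    components.pop_back();
  return components;
}
```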

Reviewed By: foad, arsenm
Differential Revision: https://reviews.llvm.org/D146737

17 months ago[AArch64][CostModel] Add costs for fixed operations when using fixed vectors over...
Dinar Temirbulatov [Mon, 15 May 2023 16:18:45 +0000 (16:18 +0000)]
[AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE.

Currently any cast operation with fixed-length vectors uses NEON costs.
If those operations end up using SVE instructions, then we estimate
those operations based upon SVE costs.

Differential Revision: https://reviews.llvm.org/D133955

17 months ago[clang][USR] Prevent crashes on incomplete FunctionDecls
Kadir Cetinkaya [Wed, 3 May 2023 08:50:46 +0000 (10:50 +0200)]
[clang][USR] Prevent crashes on incomplete FunctionDecls

FunctionDecls can be created with null types (D124351 added such a new
code path), to be filled in later. But parsing can stop before
completing the Decl (e.g. if code completion point is reached).
Unfortunately most of the methods in FunctionDecl and its derived
classes assume a complete decl and don't perform null-checks.
Since we're not encountering crashes in the wild along other code paths
today introducing extra checks into quite a lot of places didn't feel
right (due to extra complexity && run time checks).
I believe another alternative would be to change Parser & Sema to never
create decls with invalid types, but I can't really see an easy way of
doing that, as most of the pieces are structured around filling that
information as parsing proceeds.

Differential Revision: https://reviews.llvm.org/D149733

17 months agoAdd C++26 compile flags.
Erich Keane [Fri, 12 May 2023 14:30:21 +0000 (07:30 -0700)]
Add C++26 compile flags.

Now that we've updated to C++23, we need to add C++26/C++2c command line
flags, as discussed in
https://discourse.llvm.org/t/rfc-lets-just-call-it-c-26-and-forget-about-the-c-2c-business-at-least-internally/70383

Differential Revision: https://reviews.llvm.org/D150450

17 months agoRevert "[X86] Use the CFA as the DWARF frame base for better variable locations aroun...
J. Ryan Stinnett [Mon, 15 May 2023 15:52:43 +0000 (16:52 +0100)]
Revert "[X86] Use the CFA as the DWARF frame base for better variable locations around calls."

This reverts commit d421f5226048e4a5d88aab157d0f4d434c43f208.

LLDB tests are failing as shown in
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/55133/testReport/

17 months ago[mlir][sparse][gpu] first implementation of the GPU libgen approach
Aart Bik [Fri, 12 May 2023 19:43:35 +0000 (12:43 -0700)]
[mlir][sparse][gpu] first implementation of the GPU libgen approach

The sparse compiler now has two prototype strategies for GPU acceleration:

* CUDA codegen: this converts sparsified code to CUDA threads
* CUDA libgen: this converts pre-sparsified code to cuSPARSE library calls

This revision introduces the first steps required for the second approach.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150170

17 months ago[AIX][clang] Storage Locations for Constant Pointers
Qiongsi Wu [Mon, 15 May 2023 15:14:05 +0000 (11:14 -0400)]
[AIX][clang] Storage Locations for Constant Pointers

This patch adds clang options `-mxcoff-roptr` and `-mno-xcoff-roptr` to specify storage locations for constant pointers on AIX.

When the `-mxcoff-roptr` option is in effect, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. When the `-mno-xcoff-roptr` option is in effect, pointers, virtual function tables, and virtual type tables are placed in read/write storage.

This patch depends on https://reviews.llvm.org/D144189.

Reviewed By: hubert.reinterpretcast, stephenpeckham

Differential Revision: https://reviews.llvm.org/D144190

17 months ago[KnownBitsTest] Remove stray semicolons
Jay Foad [Mon, 15 May 2023 15:20:06 +0000 (16:20 +0100)]
[KnownBitsTest] Remove stray semicolons

17 months ago[mlir][memref] Extract isStaticShapeAndContiguousRowMajor as a util function.
Oleg Shyshkov [Mon, 15 May 2023 15:04:03 +0000 (17:04 +0200)]
[mlir][memref] Extract isStaticShapeAndContiguousRowMajor as a util function.

Differential Revision: https://reviews.llvm.org/D150543

17 months ago[OpenMP] Implement task record and replay mechanism
Chenle Yu [Mon, 15 May 2023 14:56:48 +0000 (09:56 -0500)]
[OpenMP] Implement task record and replay mechanism

This patch implements the "task record and replay" mechanism.  The idea is to be able to store tasks and their dependencies in the runtime so that we do not pay the cost of task creation and dependency resolution for future executions. The objective is to improve fine-grained task performance, both for those from "omp task" and "taskloop".

The entry point of the recording phase is __kmpc_start_record_task, and the end of record is triggered by __kmpc_end_record_task.

Tasks encapsulated between a record start and a record end are saved, meaning that the runtime stores their dependencies and structures, referred to as TDG, in order to replay them in subsequent executions. In these TDG replays, we start the execution by scheduling all root tasks (tasks that do not have input dependencies), and there will be no involvement of a hash table to track the dependencies, yet tasks do not need to be created again.

At the beginning of __kmpc_start_record_task, we must check if a TDG has already been recorded. If yes, the function returns 0 and starts to replay the TDG by calling __kmp_exec_tdg; if not, we start to record, and the function returns 1.

An integer uniquely identifies TDGs. Currently, this identifier needs to be incremented manually in the source code. Still, depending on how this feature would eventually be used in the library, the caller function must do it; also, the caller function needs to implement a mechanism to skip the associated region, according to the return value of __kmpc_start_record_task.
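
The return-value protocol described above can be modeled with a toy recorder.
This is not the OpenMP runtime and does not call `__kmpc_start_record_task`;
`ToyTaskRecorder`, `submit`, and `runRegion` are hypothetical names, and task
dependencies are ignored (tasks are simply replayed in submission order).

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

class ToyTaskRecorder {
public:
  // Returns 1 if the caller must execute the region (recording starts),
  // or 0 if a recording for this id existed and has just been replayed.
  int startRecord(int tdgId) {
    auto it = recorded_.find(tdgId);
    if (it != recorded_.end()) {
      for (const auto &task : it->second)
        task();
      return 0;
    }
    activeId_ = tdgId;
    pending_.clear();
    recording_ = true;
    return 1;
  }

  // Runs the task now and remembers it if a recording is in progress.
  void submit(std::function<void()> task) {
    task();
    if (recording_)
      pending_.push_back(std::move(task));
  }

  // Stores the recorded tasks under the active id; no-op after a replay.
  void endRecord() {
    if (!recording_)
      return;
    recorded_[activeId_] = std::move(pending_);
    pending_.clear();
    recording_ = false;
  }

private:
  std::map<int, std::vector<std::function<void()>>> recorded_;
  std::vector<std::function<void()>> pending_;
  int activeId_ = 0;
  bool recording_ = false;
};

// Caller-side pattern: skip the region when startRecord reports a replay.
int runRegion(ToyTaskRecorder &recorder, int tdgId, int &counter) {
  int mustExecute = recorder.startRecord(tdgId);
  if (mustExecute) {
    recorder.submit([&counter] { counter += 1; });
    recorder.submit([&counter] { counter += 10; });
  }
  recorder.endRecord();
  return mustExecute;
}
```

The first call to `runRegion` with a given id records and executes both
tasks; later calls with the same id replay them without re-creating them.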

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D146642

17 months ago[KnownBitsTest] Align with ConstantRange test infrastructure (NFC)
Nikita Popov [Mon, 15 May 2023 13:41:16 +0000 (15:41 +0200)]
[KnownBitsTest] Align with ConstantRange test infrastructure (NFC)

Align the way we perform exhaustive tests for KnownBits with what
we do for ConstantRange. Test each case separately by specifying
a function on KnownBits and one on APInts. Additionally, specify
a callback that determines which cases are supposed to be optimal,
rather than only correct. Unlike the ConstantRange case there is
a well-defined, unique notion of optimality for KnownBits.

If a failure occurs, print out the inputs, computed result and
exact result. Adjust the printing function to produce the output
in a format that is meaningful for KnownBits, i.e. print the
actual known bits, using ? to signify unknowns and ! to signify
conflicts.
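
A small sketch of such a printing scheme, using a hypothetical
`ToyKnownBits` struct rather than llvm::KnownBits. The bit order (most
significant bit first) and the default width are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in: a set bit in `zero` means "known to be 0",
// a set bit in `one` means "known to be 1".
struct ToyKnownBits {
  std::uint8_t zero;
  std::uint8_t one;
};

// Print one character per bit, most significant bit first:
// '0' / '1' for known bits, '?' for unknown, '!' for a conflict.
std::string formatKnownBits(ToyKnownBits bits, unsigned width = 4) {
  assert(width <= 8);
  std::string out;
  for (unsigned i = width; i-- > 0;) {
    bool knownZero = (bits.zero >> i) & 1u;
    bool knownOne = (bits.one >> i) & 1u;
    if (knownZero && knownOne)
      out += '!';
    else if (knownZero)
      out += '0';
    else if (knownOne)
      out += '1';
    else
      out += '?';
  }
  return out;
}
```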

17 months agoUpdate __cplusplus for C++23, add C++23 diag group alias.
Erich Keane [Mon, 15 May 2023 14:09:07 +0000 (07:09 -0700)]
Update __cplusplus for C++23, add C++23 diag group alias.

This came up during the C++26 flag discussion, so split this out into a
separate patch.
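
For reference, the published `__cplusplus` values are 201703L for C++17,
202002L for C++20, and 202302L for C++23. The helper below is a hypothetical
illustration of how code might classify that value; it is not part of the
patch.

```cpp
#include <string>

// Map a __cplusplus value to the newest standard it is at least as new as.
std::string standardName(long cplusplusValue) {
  if (cplusplusValue >= 202302L)
    return "C++23 or later";
  if (cplusplusValue >= 202002L)
    return "C++20";
  if (cplusplusValue >= 201703L)
    return "C++17";
  return "older than C++17";
}
```

Calling `standardName(__cplusplus)` reports the mode the translation unit is
being compiled in.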

17 months ago[LLVM][Uniformity] Propagate temporal divergence explicitly
Sameer Sahasrabuddhe [Mon, 15 May 2023 11:36:06 +0000 (17:06 +0530)]
[LLVM][Uniformity] Propagate temporal divergence explicitly

At a cycle C with divergent exits, UA was using a naive traversal of the exiting
edges to locate blocks that may use values defined inside C. But this traversal
fails when it encounters a cycle. This is now replaced with a much simpler
propagation that iterates over every instruction in C and checks any uses that
are outside C. But such an iteration can be expensive when C is very large; the
original strategy may need to be reconsidered if there is a regression in
compilation times.

Also fixed lit tests that should have originally caught the missed propagation
of temporal divergence.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D149646

17 months ago[MLIR][ROCDL] add gpu to rocdl erf support
Manupa Karunaratne [Mon, 15 May 2023 14:41:49 +0000 (14:41 +0000)]
[MLIR][ROCDL] add gpu to rocdl erf support

This commit adds lowering of lib func
call to support erf in rocdl.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150355

17 months ago[mlir] allow repeated payload in structured.fuse_into_containing
Alex Zinenko [Mon, 15 May 2023 12:28:21 +0000 (12:28 +0000)]
[mlir] allow repeated payload in structured.fuse_into_containing

Structured fusion proceeds by iteratively finding the next suitable
producer to be fused into the loop. Therefore, it shouldn't matter if
the same producer is listed multiple times (e.g., it is used as multiple
operands). Adjust the implementation of the transform op to support this
case.

Also fix the checking code in the interpreter to actually respect the
TransformOpInterface indication that repeated payload is allowed; it
seems to have been accidentally dropped in one of the refactorings.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D150561

17 months ago[X86] Use the CFA as the DWARF frame base for better variable locations around calls.
Kyle Huey [Mon, 15 May 2023 14:08:18 +0000 (15:08 +0100)]
[X86] Use the CFA as the DWARF frame base for better variable locations around calls.

Prior to this patch, for the DWARF frame base LLVM uses the frame pointer
register if available, otherwise the stack pointer register. If the stack
pointer register is being used and a call or other code modifies the stack
pointer during the body of the function this results in the locations being
wrong and the debugger displaying the wrong values for variables.

By using DW_OP_call_frame_cfa in these situations the emitted location for
the variable will automatically handle changes in the stack pointer.
The CFA needs to be adjusted for the offset between the frame pointer/stack
pointer to allow the variable locations themselves to remain unchanged by
this patch.

Reviewed By: #debug-info, scott.linder, jryans

Differential Revision: https://reviews.llvm.org/D143463

17 months ago[AArch64] Add test case where widening mull could be used.
Florian Hahn [Mon, 15 May 2023 14:05:10 +0000 (15:05 +0100)]
[AArch64] Add test case where widening mull could be used.

Extra test using mull for D150482.

17 months ago[ConstantFold] use StoreSize for VectorType folding
khei4 [Mon, 15 May 2023 13:33:15 +0000 (22:33 +0900)]
[ConstantFold] use StoreSize for VectorType folding
Differential Revision: https://reviews.llvm.org/D150515
Reviewed By: nikic

17 months agoRevert "[libc++][PSTL] Implement std::transform"
Nikolas Klauser [Mon, 15 May 2023 13:56:40 +0000 (06:56 -0700)]
Revert "[libc++][PSTL] Implement std::transform"

This reverts commit cbd9e5454741ebe6b39521fe1a8ed4eed5c2c801.

The wrong patch was landed.

17 months ago[unittests][llvm-exegesis] Remove build warnings [NFCI]
Francesco Petrogalli [Mon, 15 May 2023 13:26:32 +0000 (15:26 +0200)]
[unittests][llvm-exegesis] Remove build warnings [NFCI]

Remove the warning caused by a missing field initializer.

The field is `StartAtCycle` of `struct MCWriteProcResEntry`.

It has been set to the default `StartAtCycle = 0`.

Differential Revision: https://reviews.llvm.org/D150569

17 months ago[libc++][PSTL] Implement std::transform
Nikolas Klauser [Fri, 5 May 2023 16:16:05 +0000 (09:16 -0700)]
[libc++][PSTL] Implement std::transform

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D149615

17 months ago[mlir][scf][bufferize] Fix bug in WhileOp analysis verification
Matthias Springer [Mon, 15 May 2023 13:39:35 +0000 (15:39 +0200)]
[mlir][scf][bufferize] Fix bug in WhileOp analysis verification

Block arguments and yielded values are not equivalent if there are not enough block arguments. This fixes #59442.

Differential Revision: https://reviews.llvm.org/D145575

17 months ago[mlir][bufferization] Add option to dump alias sets
Matthias Springer [Mon, 15 May 2023 13:34:11 +0000 (15:34 +0200)]
[mlir][bufferization] Add option to dump alias sets

This is useful for debugging.

Differential Revision: https://reviews.llvm.org/D143314

17 months agoclang-format: [JS] terminate import sorting on `export type X = Y`
Jan Kuhle [Mon, 15 May 2023 13:33:33 +0000 (15:33 +0200)]
clang-format: [JS] terminate import sorting on `export type X = Y`

Contributed by @jankuehle!

https://reviews.llvm.org/D150116 introduced a bug. `export type X = Y` was considered an export declaration and took part in import sorting. This is not correct. With this change `export type X = Y` properly terminates import sorting.

Reviewed By: krasimir

Differential Revision: https://reviews.llvm.org/D150563

17 months ago[mlir][bufferization] Improve findValueInReverseUseDefChain signature
Matthias Springer [Mon, 15 May 2023 13:26:13 +0000 (15:26 +0200)]
[mlir][bufferization] Improve findValueInReverseUseDefChain signature

Instead of passing traversal options as a long list of arguments, store them in a TraversalConfig object and pass that object.
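
The refactoring pattern can be sketched on a toy reverse traversal. Nothing
here is the MLIR API: `ToyNode`, `ToyTraversalConfig`, its field names, and
`findInReverseChain` are hypothetical, and the chain is assumed to be
acyclic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy use-def chain: each node has at most one producer (-1 means none).
struct ToyNode {
  int producer;
  bool equivalentToProducer;
};

// Options that would otherwise be a long list of bool parameters.
// Field names are illustrative assumptions.
struct ToyTraversalConfig {
  bool followEquivalentOnly = false;
  bool alwaysIncludeLeaves = true;
};

// Walk from `start` towards producers until `condition` matches. If the
// walk cannot continue, the last node is a leaf and is optionally reported.
std::vector<int>
findInReverseChain(const std::vector<ToyNode> &nodes, int start,
                   const std::function<bool(int)> &condition,
                   const ToyTraversalConfig &config = ToyTraversalConfig()) {
  assert(start >= 0 && static_cast<std::size_t>(start) < nodes.size());
  std::vector<int> result;
  int current = start;
  while (true) {
    if (condition(current)) {
      result.push_back(current);
      break;
    }
    const ToyNode &node = nodes[current];
    bool canFollow =
        node.producer >= 0 &&
        (!config.followEquivalentOnly || node.equivalentToProducer);
    if (!canFollow) {
      if (config.alwaysIncludeLeaves)
        result.push_back(current);
      break;
    }
    current = node.producer;
  }
  return result;
}
```

New options can then be added as fields with defaults, without touching
existing call sites.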

Differential Revision: https://reviews.llvm.org/D143927

17 months ago[AMDGPU] Simplify liveins in some MIR tests
Jay Foad [Mon, 15 May 2023 13:15:46 +0000 (14:15 +0100)]
[AMDGPU] Simplify liveins in some MIR tests

We can use the following 16-VGPR tuple directly instead of splitting it
into smaller parts:

$vgpr240_vgpr241_vgpr242_vgpr243_vgpr244_vgpr245_vgpr246_vgpr247_vgpr248_vgpr249_vgpr250_vgpr251_vgpr252_vgpr253_vgpr254_vgpr255

17 months agoFix build error caused by https://reviews.llvm.org/D149718
Manna, Soumi [Mon, 15 May 2023 12:58:52 +0000 (05:58 -0700)]
Fix build error caused by https://reviews.llvm.org/D149718

The patch (https://reviews.llvm.org/D149718) broke the buildbot:

../../clang/include/clang/Sema/ParsedAttr.h:705:18: error: explicitly defaulted move assignment operator is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
  AttributePool &operator=(AttributePool &&pool) = default;
                 ^
../../clang/include/clang/Sema/ParsedAttr.h:674:21: note: move assignment operator of 'AttributePool' is implicitly deleted because field 'Factory' is of reference type 'clang::AttributeFactory &'
  AttributeFactory &Factory;
                    ^
1 error generated.

This patch fixes the build error.
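
The diagnostic above can be reproduced in isolation. The sketch below uses hypothetical types (`Factory`, `PoolWithRef`, `PoolWithPtr`) rather than the real `AttributePool`, and the pointer-based variant is only one possible way to regain move assignment, not necessarily what this patch does.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Factory {}; // hypothetical stand-in for the referenced object

// A reference member cannot be reseated, so the compiler cannot generate
// a usable assignment operator for this class.
struct PoolWithRef {
  Factory &factory;
  explicit PoolWithRef(Factory &f) : factory(f) {}
  PoolWithRef(PoolWithRef &&) = default;
  // Writing `PoolWithRef &operator=(PoolWithRef &&) = default;` here would
  // be implicitly deleted and would trigger -Wdefaulted-function-deleted.
};

static_assert(std::is_move_constructible<PoolWithRef>::value,
              "move construction still works with a reference member");
static_assert(!std::is_move_assignable<PoolWithRef>::value,
              "move assignment is unavailable with a reference member");

// Assumption for illustration: storing a pointer instead of a reference
// makes the defaulted move assignment well-formed.
struct PoolWithPtr {
  Factory *factory;
  explicit PoolWithPtr(Factory &f) : factory(&f) {}
  PoolWithPtr(PoolWithPtr &&) = default;
  PoolWithPtr &operator=(PoolWithPtr &&) = default;
};

static_assert(std::is_move_assignable<PoolWithPtr>::value,
              "a pointer member can be reassigned");
```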

17 months ago[Pipelines] Don't skip GlobalDCE in ThinLTO pre-link
Nikita Popov [Fri, 28 Apr 2023 13:29:49 +0000 (15:29 +0200)]
[Pipelines] Don't skip GlobalDCE in ThinLTO pre-link

GlobalDCE will only remove functions with available externally
linkage if they are unreferenced. As such, I don't believe there
is any problem with running this pass as part of the ThinLTO pre-link
pipeline. It will only remove functions that are completely dead in
that module, and I don't think there is any benefit to keeping them
around for the post-link phase.

There is no compile-time impact from the additional pass.

This is a followup to one of the side discussions in D146776.

Differential Revision: https://reviews.llvm.org/D149446

17 months ago[ValueTracking] Fix computeKnownFPClass with canonicalize
Piotr Sobczak [Mon, 15 May 2023 12:13:51 +0000 (14:13 +0200)]
[ValueTracking] Fix computeKnownFPClass with canonicalize

Update code that assumes llvm.canonicalize only handles scalars,
by adding a call to getScalarType().
This is fine, as the intrinsic is trivially vectorizable.

Introduced in D147870, and uncovered by D148065.

Differential Revision: https://reviews.llvm.org/D150556

17 months ago[mlir][IR][tests] Fix incorrect API usage in RewritePatterns
Matthias Springer [Mon, 15 May 2023 12:39:50 +0000 (14:39 +0200)]
[mlir][IR][tests] Fix incorrect API usage in RewritePatterns

Incorrect API usage was detected by D144552.

Differential Revision: https://reviews.llvm.org/D145167

17 months ago[mlir][bufferization] Fix unknown ops in BufferViewFlowAnalysis
Matthias Springer [Mon, 15 May 2023 12:31:26 +0000 (14:31 +0200)]
[mlir][bufferization] Fix unknown ops in BufferViewFlowAnalysis

If an op is unknown to the analysis, it must be treated conservatively: assume that every operand aliases with every result.
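
A minimal sketch of this conservative rule, using a hypothetical simplified op model (`Op`, `Value`, `computeAliases`) rather than the MLIR classes used by BufferViewFlowAnalysis:

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Value = int;
using AliasPair = std::pair<Value, Value>; // (operand, result)

// Hypothetical op model; not the MLIR Operation class.
struct Op {
  bool knownToAnalysis;
  std::vector<Value> operands;
  std::vector<Value> results;
  std::vector<AliasPair> preciseAliases; // only meaningful for known ops
};

// Returns the (operand, result) pairs that may alias.
std::set<AliasPair> computeAliases(const Op &op) {
  if (op.knownToAnalysis)
    return std::set<AliasPair>(op.preciseAliases.begin(),
                               op.preciseAliases.end());
  // Unknown op: assume every operand may alias every result.
  std::set<AliasPair> aliases;
  for (Value operand : op.operands)
    for (Value result : op.results)
      aliases.insert({operand, result});
  return aliases;
}
```

The over-approximation can only make later decisions more cautious; it never hides a real alias.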

Differential Revision: https://reviews.llvm.org/D150546

17 months ago[clangd] Fix fixAll not shown when there is only one unused-include and missing-inclu...
Haojian Wu [Wed, 3 May 2023 10:18:17 +0000 (12:18 +0200)]
[clangd] Fix fixAll not shown when there is only one unused-include and missing-include diagnostics.

Discovered during the review in https://reviews.llvm.org/D149437#inline-1444851.

Differential Revision: https://reviews.llvm.org/D149822

17 months ago[clang][parser] Fix namespace dropping after malformed declarations
Alejandro Álvarez Ayllón [Mon, 15 May 2023 11:39:58 +0000 (07:39 -0400)]
[clang][parser] Fix namespace dropping after malformed declarations

* After a malformed top-level declaration
* After a malformed templated class method declaration

In both cases, when there is a malformed declaration, any following
namespace is dropped from the AST. This can trigger a cascade of
confusing diagnostics that may hide the original error. An example:
```
// Start #include "SomeFile.h"
template <class T>
void Foo<T>::Bar(void* aRawPtr) {
    (void)(aRawPtr);
}
// End #include "SomeFile.h"

int main() {}
```
We get the original error, plus 19 others from the standard library.
With this patch, we only get the original error.

clangd can also benefit from this patch, as namespaces following the
malformed declaration are now preserved, e.g.
```

MACRO_FROM_MISSING_INCLUDE("X")

namespace my_namespace {
    //...
}
```
Before this patch, my_namespace is not visible for auto-completion.

Differential Revision: https://reviews.llvm.org/D150258

17 months ago[libc] Cache ownership of the shared buffer in the port
Joseph Huber [Mon, 15 May 2023 21:22:27 +0000 (16:22 -0500)]
[libc] Cache ownership of the shared buffer in the port

This patch adds another variable to cache cases where we know that we
own the buffer. This allows us to skip the atomic load on the inbox
because we already know its state. This is legal immediately after
opening a port, or when sending immediately after a receive. This
caching nets a significant (~17%) speedup for the basic open, send,
receive combination.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D150516

17 months ago[libc] Make the bump pointer explicitly return null on buffer overrun
Joseph Huber [Mon, 24 Apr 2023 19:35:38 +0000 (14:35 -0500)]
[libc] Make the bump pointer explicitly return null on buffer overrun

We use a simple bump ptr in the `libc` tests. If we run out of data we
can currently return other static memory and have weird failure cases.
We should fail more explicitly here by returning a null pointer instead.
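
The behavior described here can be sketched as a small bump allocator. This is an illustration under assumptions, not the allocator used by the `libc` tests: the class name `BumpAllocator`, its template parameter, and the alignment handling are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity bump allocator.
template <std::size_t Capacity>
class BumpAllocator {
  alignas(std::max_align_t) unsigned char buffer[Capacity];
  std::size_t offset = 0; // first unused byte in `buffer`

public:
  // Returns `size` bytes at an offset aligned to `align`, or nullptr when
  // the request does not fit. Alignment is relative to the buffer start,
  // which is itself aligned for std::max_align_t.
  void *allocate(std::size_t size,
                 std::size_t align = alignof(std::max_align_t)) {
    assert(align != 0 && (align & (align - 1)) == 0 &&
           "align must be a power of two");
    std::size_t aligned = (offset + align - 1) & ~(align - 1);
    // Fail explicitly instead of handing out memory past the buffer.
    if (aligned > Capacity || size > Capacity - aligned)
      return nullptr;
    offset = aligned + size;
    return buffer + aligned;
  }
};
```

A failed request leaves `offset` unchanged, so a smaller allocation can still succeed afterwards.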

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D150529

17 months ago[clang][dataflow] Don't analyze templated declarations.
Martin Braenne [Mon, 15 May 2023 09:23:44 +0000 (09:23 +0000)]
[clang][dataflow] Don't analyze templated declarations.

Attempting to analyze templated code doesn't have a good cost-benefit ratio. We
have so far done a best-effort attempt at this, but maintaining this support has
an ongoing high maintenance cost because the AST for templates can violate a lot
of the invariants that otherwise hold for the AST of concrete code. As just one
example, in concrete code the operand of a UnaryOperator '*' is always a prvalue
(https://godbolt.org/z/s3e5xxMd1), but in templates this isn't true
(https://godbolt.org/z/6W9xxGvoM).

Further rationale for not analyzing templates:

* The semantics of a template itself are weakly defined; semantics can depend
  strongly on the concrete template arguments. Analyzing the template itself (as
  opposed to an instantiation) therefore has limited value.

* Analyzing templates requires a lot of special-case code that isn't necessary
  for concrete code because dependent types are hard to deal with and the AST
  violates invariants that otherwise hold for concrete code (see above).

* There's precedent in that neither Clang Static Analyzer nor the flow-sensitive
  warnings in Clang (such as uninitialized variables) support analyzing
  templates.

Reviewed By: gribozavr2, xazax.hun

Differential Revision: https://reviews.llvm.org/D150352

17 months ago[VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map
Florian Hahn [Mon, 15 May 2023 10:49:16 +0000 (11:49 +0100)]
[VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map

Update VPReplicateRecipe to use VPRecipeWithIRFlags for IR flag
handling. Retire separate MayGeneratePoisonRecipes map.

Depends on D149082.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D150027

17 months ago[RegScavenger] Simplify forward(MachineBasicBlock::iterator). NFC.
Jay Foad [Mon, 15 May 2023 10:37:56 +0000 (11:37 +0100)]
[RegScavenger] Simplify forward(MachineBasicBlock::iterator). NFC.

17 months ago[X86] LowerRotate: prefer unpack-based algorithm
Ivan Chikish [Mon, 15 May 2023 10:25:32 +0000 (11:25 +0100)]
[X86] LowerRotate: prefer unpack-based algorithm

Split out from, and improving on, https://reviews.llvm.org/D146357

When running tests for LowerShift, I discovered some poor codegen in rotate and funnel shift tests. This patch attempts to address some of them.

Using unpack for splitting and using double-bitwidth shifts may improve performance according to https://uica.uops.info tests.

* No cross-lane shuffles
* No dirtying double-width registers
* Massive improvement for AVX2 rotates in some cases (var_funnnel_v8i16, var_funnnel_v16i16), because unpack is currently only used for vXi8 vectors.

Differential Revision: https://reviews.llvm.org/D149071

17 months ago[flang][hlfir] lower hlfir.any into fir runtime call
Jacob Crawley [Fri, 12 May 2023 13:59:37 +0000 (13:59 +0000)]
[flang][hlfir] lower hlfir.any into fir runtime call

Depends on: D150272

Differential Revision: https://reviews.llvm.org/D150451

17 months ago[flang] lower any intrinsic to hlfir.any operation
Jacob Crawley [Wed, 10 May 2023 14:23:48 +0000 (14:23 +0000)]
[flang] lower any intrinsic to hlfir.any operation

Carries out the lowering of the any intrinsic into HLFIR

Depends on: D149964

Differential Revision: https://reviews.llvm.org/D150272

17 months ago[flang] add hlfir.any intrinsic
Jacob Crawley [Fri, 5 May 2023 14:51:15 +0000 (14:51 +0000)]
[flang] add hlfir.any intrinsic

Adds a HLFIR operation for the ANY intrinsic according to the
design set out in flang/docs/HighLevel.md

Differential Revision: https://reviews.llvm.org/D149964

17 months ago[LLD][ELF] Add missing program header parsing to OVERLAY
Peter Smith [Thu, 11 May 2023 11:00:19 +0000 (12:00 +0100)]
[LLD][ELF] Add missing program header parsing to OVERLAY

In D72756 the change to add INPUT_SECTION_FLAGS inadvertently
removed the line to parse the program header assignment information for
OutputSections within an OVERLAY.

This change adds back the missing line and adds a test for it.

Differential Revision: https://reviews.llvm.org/D150445

17 months ago[docs] Add Python coding standard to documentation
Tobias Hieta [Mon, 15 May 2023 08:58:20 +0000 (10:58 +0200)]
[docs] Add Python coding standard to documentation

As discussed on the forums:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style/

Reviewed By: jhenderson, JDevlieghere

Differential Revision: https://reviews.llvm.org/D143852

17 months ago[TableGen][SubtargetEmitter] Add the StartAtCycles field in the WriteRes class.
Francesco Petrogalli [Fri, 12 May 2023 15:45:07 +0000 (17:45 +0200)]
[TableGen][SubtargetEmitter] Add the StartAtCycles field in the WriteRes class.

Conditions that need to be met:

1. count(StartAtCycles) == count(ReservedCycles);
2. For each i: StartAtCycles[i] < ReservedCycles[i];
3. For each i: StartAtCycles[i] >= 0;
4. If left unspecified, the elements are set to 0.
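
The four conditions can be written as a small check. This is a sketch in plain C++, not the TableGen SubtargetEmitter code; the function name `resolveStartAtCycles` is hypothetical, and applying conditions 2 and 3 to the defaulted zeros is an assumption, since the message does not say whether defaulted values are re-checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Validates `startAtCycles` against `reservedCycles`, filling in defaults.
// Returns false if any of the listed conditions is violated.
bool resolveStartAtCycles(const std::vector<int> &reservedCycles,
                          std::vector<int> &startAtCycles) {
  // Condition 4: an unspecified list defaults to all zeros.
  if (startAtCycles.empty())
    startAtCycles.assign(reservedCycles.size(), 0);
  // Condition 1: both lists must have the same number of elements.
  if (startAtCycles.size() != reservedCycles.size())
    return false;
  for (std::size_t i = 0; i < startAtCycles.size(); ++i) {
    // Condition 3: start cycles are non-negative.
    if (startAtCycles[i] < 0)
      return false;
    // Condition 2: a resource must start before its reservation ends.
    if (startAtCycles[i] >= reservedCycles[i])
      return false;
  }
  return true;
}
```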

Differential Revision: https://reviews.llvm.org/D150310

17 months ago[mlir][transform] Use TrackingListener-aware iterator for getPayloadOps
Matthias Springer [Mon, 15 May 2023 07:24:01 +0000 (09:24 +0200)]
[mlir][transform] Use TrackingListener-aware iterator for getPayloadOps

Instead of returning an `ArrayRef<Operation *>`, return an iterator that skips ops that were erased/replaced while iterating over the payload ops.

This fixes an issue in conjunction with TrackingListener, where a tracked op was erased during the iteration. Elements may not be removed from an array while iterating over it; this invalidates the iterator.

When ops are erased/removed via `replacePayloadOp`, they are not immediately removed from the mappings data structure. Instead, they are set to `nullptr`. `nullptr`s are not enumerated by `getPayloadOps`. At the end of each transformation, `nullptr`s are removed from the mapping data structure.
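
The tombstone scheme described above can be sketched with a plain vector. The names here (`PayloadMapping`, `forEachPayloadOp`, `compact`) are hypothetical and much simpler than the transform dialect's mapping data structure; only the idea of nulling out slots and cleaning up later is taken from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Operation { int id; }; // hypothetical stand-in for a payload op

class PayloadMapping {
  std::vector<Operation *> ops;

public:
  void add(Operation *op) {
    assert(op != nullptr && "nullptr is reserved as the erased marker");
    ops.push_back(op);
  }

  // Erasing only nulls out the slot, so the vector is never resized while
  // a traversal is in progress.
  void erase(Operation *op) {
    std::replace(ops.begin(), ops.end(), op,
                 static_cast<Operation *>(nullptr));
  }

  // Visits live ops only; slots nulled out during the walk are skipped.
  template <typename Fn> void forEachPayloadOp(Fn fn) {
    for (std::size_t i = 0; i < ops.size(); ++i)
      if (Operation *op = ops[i])
        fn(op);
  }

  // Called once the transformation is done: drop the nulled-out slots.
  void compact() {
    ops.erase(std::remove(ops.begin(), ops.end(),
                          static_cast<Operation *>(nullptr)),
              ops.end());
  }

  std::size_t slots() const { return ops.size(); }
};
```

Deferring the actual removal to `compact` is what keeps the traversal valid: positions of the remaining elements do not shift while the callback runs.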

Differential Revision: https://reviews.llvm.org/D149847

17 months ago[libc] Add optimized memset for RISCV
Guillaume Chatelet [Fri, 12 May 2023 09:10:06 +0000 (09:10 +0000)]
[libc] Add optimized memset for RISCV

This patch adds two versions of `memset` optimized for architectures where unaligned accesses are either illegal or extremely slow.
It is currently enabled for RISCV 64 and RISCV 32 but it could be used for ARM 32 architectures as well.

Here is the before / after output of libc.benchmarks.memory_functions.opt_host --benchmark_filter=BM_Memset on a quad core Linux starfive RISCV 64 board running at 1.5GHz.

Before
```
Run on (4 X 1500 MHz CPU s)
CPU Caches:
  L1 Instruction 32 KiB (x4)
  L1 Data 32 KiB (x4)
  L2 Unified 2048 KiB (x1)
------------------------------------------------------------------------
Benchmark              Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------
BM_Memset/0/0        506 ns          252 ns      2883584 bytes_per_cycle=0.238312/s bytes_per_second=340.908M/s items_per_second=3.96043M/s __llvm_libc::memset,memset Google A
BM_Memset/1/0        296 ns          189 ns      2900992 bytes_per_cycle=0.234589/s bytes_per_second=335.583M/s items_per_second=5.29382M/s __llvm_libc::memset,memset Google B
BM_Memset/2/0       2110 ns         1049 ns       678912 bytes_per_cycle=0.24687/s bytes_per_second=353.151M/s items_per_second=953.527k/s __llvm_libc::memset,memset Google D
BM_Memset/3/0        397 ns          254 ns      3055616 bytes_per_cycle=0.238479/s bytes_per_second=341.147M/s items_per_second=3.93224M/s __llvm_libc::memset,memset Google L
BM_Memset/4/0       1119 ns          621 ns      1079296 bytes_per_cycle=0.244925/s bytes_per_second=350.368M/s items_per_second=1.61047M/s __llvm_libc::memset,memset Google M
BM_Memset/5/0        605 ns          349 ns      1644544 bytes_per_cycle=0.241364/s bytes_per_second=345.274M/s items_per_second=2.8614M/s __llvm_libc::memset,memset Google Q
BM_Memset/6/0        472 ns          271 ns      2310144 bytes_per_cycle=0.238615/s bytes_per_second=341.341M/s items_per_second=3.68799M/s __llvm_libc::memset,memset Google S
BM_Memset/7/0        262 ns          143 ns      3956736 bytes_per_cycle=0.225812/s bytes_per_second=323.026M/s items_per_second=7.0087M/s __llvm_libc::memset,memset Google U
BM_Memset/8/0        454 ns          261 ns      2940928 bytes_per_cycle=0.238883/s bytes_per_second=341.725M/s items_per_second=3.82716M/s __llvm_libc::memset,memset Google W
BM_Memset/9/0       8768 ns         5998 ns       115712 bytes_per_cycle=0.249196/s bytes_per_second=356.478M/s items_per_second=166.724k/s __llvm_libc::memset,uniform 384 to 4096
```

After
```
BM_Memset/0/0        117 ns         69.5 ns      9761792 bytes_per_cycle=0.935152/s bytes_per_second=1.30639G/s items_per_second=14.3834M/s __llvm_libc::memset,memset Google A
BM_Memset/1/0       97.8 ns         58.5 ns     13002752 bytes_per_cycle=0.892814/s bytes_per_second=1.24725G/s items_per_second=17.0848M/s __llvm_libc::memset,memset Google B
BM_Memset/2/0        326 ns          163 ns      5156864 bytes_per_cycle=1.54408/s bytes_per_second=2.15706G/s items_per_second=6.1192M/s __llvm_libc::memset,memset Google D
BM_Memset/3/0        132 ns         65.4 ns     11455488 bytes_per_cycle=0.876411/s bytes_per_second=1.22433G/s items_per_second=15.2803M/s __llvm_libc::memset,memset Google L
BM_Memset/4/0        222 ns          120 ns      6405120 bytes_per_cycle=1.44398/s bytes_per_second=2.01722G/s items_per_second=8.30758M/s __llvm_libc::memset,memset Google M
BM_Memset/5/0        119 ns         79.2 ns      8930304 bytes_per_cycle=1.13327/s bytes_per_second=1.58317G/s items_per_second=12.6189M/s __llvm_libc::memset,memset Google Q
BM_Memset/6/0        123 ns         64.0 ns     11609088 bytes_per_cycle=1.008/s bytes_per_second=1.40817G/s items_per_second=15.6365M/s __llvm_libc::memset,memset Google S
BM_Memset/7/0       85.9 ns         52.1 ns     12423168 bytes_per_cycle=0.641164/s bytes_per_second=917.192M/s items_per_second=19.1937M/s __llvm_libc::memset,memset Google U
BM_Memset/8/0        114 ns         67.1 ns     10347520 bytes_per_cycle=0.911968/s bytes_per_second=1.274G/s items_per_second=14.9015M/s __llvm_libc::memset,memset Google W
BM_Memset/9/0       1326 ns          785 ns       907264 bytes_per_cycle=1.89716/s bytes_per_second=2.6503G/s items_per_second=1.27348M/s __llvm_libc::memset,uniform 384 to 4096
```

Again not as good as current glibc but it's a first step in the right direction.
```
BM_Memset/0/0        108 ns         53.6 ns     12894208 bytes_per_cycle=1.02858/s bytes_per_second=1.4369G/s items_per_second=18.668M/s glibc::memset,memset Google A
BM_Memset/1/0       84.6 ns         47.6 ns     14284800 bytes_per_cycle=1.00197/s bytes_per_second=1.39974G/s items_per_second=21.0256M/s glibc::memset,memset Google B
BM_Memset/2/0        160 ns         85.8 ns      8927232 bytes_per_cycle=3.30805/s bytes_per_second=4.62129G/s items_per_second=11.6596M/s glibc::memset,memset Google D
BM_Memset/3/0       78.9 ns         53.6 ns     13326336 bytes_per_cycle=1.14058/s bytes_per_second=1.59338G/s items_per_second=18.674M/s glibc::memset,memset Google L
BM_Memset/4/0       99.2 ns         60.8 ns     11460608 bytes_per_cycle=2.54751/s bytes_per_second=3.55884G/s items_per_second=16.4587M/s glibc::memset,memset Google M
BM_Memset/5/0       93.0 ns         56.1 ns     12219392 bytes_per_cycle=1.73379/s bytes_per_second=2.42207G/s items_per_second=17.8157M/s glibc::memset,memset Google Q
BM_Memset/6/0       89.4 ns         47.2 ns     14692352 bytes_per_cycle=1.34846/s bytes_per_second=1.88377G/s items_per_second=21.1795M/s glibc::memset,memset Google S
BM_Memset/7/0       84.0 ns         50.0 ns     14468096 bytes_per_cycle=0.911198/s bytes_per_second=1.27293G/s items_per_second=19.994M/s glibc::memset,memset Google U
BM_Memset/8/0       93.4 ns         52.8 ns     13063168 bytes_per_cycle=1.06642/s bytes_per_second=1.48977G/s items_per_second=18.9524M/s glibc::memset,memset Google W
BM_Memset/9/0        438 ns          241 ns      2853888 bytes_per_cycle=6.1185/s bytes_per_second=8.54744G/s items_per_second=4.15064M/s glibc::memset,uniform 384 to 4096
```

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D150433

17 months ago[NFC] Refactor GlobalVariable Ctor
Guillaume Chatelet [Fri, 12 May 2023 15:50:44 +0000 (15:50 +0000)]
[NFC] Refactor GlobalVariable Ctor

Reuse logic from other ctor and remove code duplication.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D150453

17 months ago[IR] Drop const in DILocation::getMergedLocation
Christian Ulmann [Mon, 15 May 2023 07:04:59 +0000 (07:04 +0000)]
[IR] Drop const in DILocation::getMergedLocation

This commit removes constness from DILocation::getMergedLocation and
fixes all its users accordingly.

Having constness on the parameters forced the return type to be const
as well, which in turn forced the use of `const_cast` whenever the
location needed to be used in metadata nodes.
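
The effect on callers can be illustrated with a toy model. The types and functions below (`Location`, `MetadataNode`, `getMergedConst`, `getMerged`) are hypothetical, and the merge rule (pick the smaller line) is a placeholder, not what `DILocation::getMergedLocation` computes.

```cpp
#include <cassert>

struct Location { int line; };          // hypothetical location type
struct MetadataNode { Location *loc; }; // metadata stores a non-const pointer

// Before: const parameters force a const result.
const Location *getMergedConst(const Location *a, const Location *b) {
  return a->line <= b->line ? a : b;
}

// After: no const on parameters or result.
Location *getMerged(Location *a, Location *b) {
  return a->line <= b->line ? a : b;
}

// With the const signature, callers have to cast the result back.
MetadataNode attachOld(Location *a, Location *b) {
  return MetadataNode{const_cast<Location *>(getMergedConst(a, b))};
}

// With the non-const signature, the result can be used directly.
MetadataNode attachNew(Location *a, Location *b) {
  return MetadataNode{getMerged(a, b)};
}
```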

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D149942

17 months ago[AMDGPU] Improve PHI-breaking heuristics in CGP
pvanhout [Wed, 10 May 2023 12:59:18 +0000 (14:59 +0200)]
[AMDGPU] Improve PHI-breaking heuristics in CGP

D147786 made the transform more conservative by adding heuristics,
which was a good idea. However, the transform got a bit
too conservative at times.

This caused a surprise in some rocRAND benchmarks because D143731 greatly helped a few of them.
For instance, a few xorwow-uniform tests saw a +30% boost in performance after that pass, which was lost when D147786 landed.

This patch is an attempt at reaching a middle ground that makes
the pass a bit more permissive. It continues in the same spirit as
D147786 but makes the following changes:
- PHI users of a PHI node are now recursively checked. When loops are encountered, we consider the PHIs non-breakable. (Considering them breakable had a very negative effect in one app I tested.)
- `shufflevector` is now considered interesting, given that it satisfies a few trivial checks.

Reviewed By: arsenm, #amdgpu, jmmartinez

Differential Revision: https://reviews.llvm.org/D150266

17 months ago[AMDGPU][MC] Don't accept attr > 32 for param_load
Diana Picus [Wed, 10 May 2023 09:52:00 +0000 (11:52 +0200)]
[AMDGPU][MC] Don't accept attr > 32 for param_load

The docs say the interpolation attribute should be between 0..32 [1][2],
but we currently accept values all the way up to 63.

This patch makes the ASMParser error out for values > 32. It does not
touch codegen though because we're currently not checking anything at
all for codegen (llvm.amdgcn.lds.param.load will happily accept even 128
as an attr, although that won't fit in the encoding).

[1] https://llvm.org/docs/AMDGPU/gfx8_attr.html#amdgpu-synid-gfx8-attr
[2] https://llvm.org/docs/AMDGPU/gfx11_attr.html#amdgpu-synid-gfx11-attr

Differential Revision: https://reviews.llvm.org/D150261

17 months ago[Driver][test] Add -fintegrated-as after D150282
Fangrui Song [Mon, 15 May 2023 06:09:31 +0000 (23:09 -0700)]
[Driver][test] Add -fintegrated-as after D150282

D150282 does not add support for derived trace file names with
-fno-integrated-as, e.g. `clang -c -fno-integrated-as a.c -o e/a.o`.

Add -fintegrated-as to fix AIX.