platform/upstream/llvm.git
2 years ago[Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type
yronglin [Sat, 3 Sep 2022 15:24:37 +0000 (23:24 +0800)]
[Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type

Avoid a __builtin_assume_aligned crash when the 1st arg is an array type (or a string literal).
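
For illustration, a minimal sketch of the kind of call this change concerns: an array or a string literal passed as the first argument, where it decays to a pointer. The buffer name and the alignment values are assumptions made for the example; this is not the reproducer from the issue.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte-aligned buffer used only for this example.
alignas(16) static char buffer[64];

// Passes an array as the first argument; the array decays to a pointer.
bool arrayArgumentIsAligned() {
  void *p = __builtin_assume_aligned(buffer, 16);
  assert(p == buffer); // the builtin returns its first argument
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Passes a string literal as the first argument; an alignment of 1 always holds.
const char *stringLiteralArgument() {
  return static_cast<const char *>(__builtin_assume_aligned("abc", 1));
}
```

Both functions are expected to compile and run on a compiler that provides the builtin (Clang or GCC).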

Open issue: https://github.com/llvm/llvm-project/issues/57169

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D133202

2 years ago[CostModel][X86] Add CostKinds handling for fdiv ops
Simon Pilgrim [Sat, 3 Sep 2022 14:48:33 +0000 (15:48 +0100)]
[CostModel][X86] Add CostKinds handling for fdiv ops

This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'

As the uop count (used for TCK_SizeAndLatency) for divss/divps is typically so low, we need to override isExpensiveToSpeculativelyExecute to ensure we keep fdiv calls behind branches - although for some very recent cpu targets it might not be necessary any more and could be relaxed.

2 years ago[SCCP] add helper function for replacing signed operations; NFC
Sanjay Patel [Sat, 3 Sep 2022 14:12:13 +0000 (10:12 -0400)]
[SCCP] add helper function for replacing signed operations; NFC

Preliminary refactoring for planned enhancement in D133198.

2 years ago[X86] Fix fdiv throughput/latency/uops counts
Simon Pilgrim [Sat, 3 Sep 2022 14:23:41 +0000 (15:23 +0100)]
[X86] Fix fdiv throughput/latency/uops counts

Matches znver1/2 numbers from AMD SoG + Agner - no additional uops for folded instructions and znver1 double pumps 256-bit vectors

Matches skylake/icelake throughput numbers from Intel AoM + Agner/instlatx64

Noticed while adding fdiv CostKinds support

2 years ago[MLIR] Single lit config attribute for CMAKE_LIBRARY_OUTPUT_DIRECTORY
Christian Sigg [Sat, 3 Sep 2022 07:15:34 +0000 (09:15 +0200)]
[MLIR] Single lit config attribute for CMAKE_LIBRARY_OUTPUT_DIRECTORY

Replace the following config attributes with `mlir_lib_dir`:
- `mlir_runner_utils_dir`
- `linalg_test_lib_dir`
- `spirv_wrapper_library_dir`
- `vulkan_wrapper_library_dir`
- `mlir_integration_test_dir`

I'm going to clean up substitutions in separate changes.

Reviewed By: aartbik, mehdi_amini

Differential Revision: https://reviews.llvm.org/D133217

2 years ago[InstCombine] reduce another or-xor bitwise logic pattern
Sanjay Patel [Sat, 3 Sep 2022 12:57:00 +0000 (08:57 -0400)]
[InstCombine] reduce another or-xor bitwise logic pattern

~(A & ?) | (A ^ B) --> ~((A & ?) & B)
https://alive2.llvm.org/ce/z/mxex6V

This is similar to 9d218b61cc50 where we peeked through
another logic op to find a common operand.
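
As a sanity check independent of the Alive2 proof linked above, the identity can be brute-forced over all 8-bit values. This is only a sketch; `C` stands in for the `?` operand and the function name is made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks ~(A & C) | (A ^ B) == ~((A & C) & B) over 8-bit values.
// C plays the role of the "?" operand in the pattern.
bool orXorFoldHolds() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      for (unsigned C = 0; C < 256; ++C) {
        std::uint8_t Before = static_cast<std::uint8_t>(~(A & C) | (A ^ B));
        std::uint8_t After = static_cast<std::uint8_t>(~((A & C) & B));
        if (Before != After)
          return false;
      }
  return true;
}
```

Because every operation is bitwise, checking single-bit values would already be sufficient; the wider loop is just easier to read.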

2 years ago[InstCombine] add tests for or-xor-nand; NFC
Sanjay Patel [Fri, 2 Sep 2022 19:03:13 +0000 (15:03 -0400)]
[InstCombine] add tests for or-xor-nand; NFC

2 years ago[CostModel][X86] Add fdiv(double) throughput x87 costs for
Simon Pilgrim [Sat, 3 Sep 2022 13:08:25 +0000 (14:08 +0100)]
[CostModel][X86] Add fdiv(double) throughput x87 costs for

2 years ago[AMDGPU] Add -verify-machineinstrs to attr-amdgpu-flat-work-group-size* tests
Simon Pilgrim [Sat, 3 Sep 2022 12:47:41 +0000 (13:47 +0100)]
[AMDGPU] Add -verify-machineinstrs to attr-amdgpu-flat-work-group-size* tests

These were affected by D131825 (and reported on Issue #57149) - adding the verification will help ensure that we don't hit this again on builds with EXPENSIVE_CHECKS enabled

2 years ago[DAG] canCreateUndefOrPoison - add freeze(insert_subvector(x,y,c)) -> insert_subvecto...
Simon Pilgrim [Sat, 3 Sep 2022 12:41:33 +0000 (13:41 +0100)]
[DAG] canCreateUndefOrPoison - add freeze(insert_subvector(x,y,c)) -> insert_subvector(freeze(x),freeze(y),c) support

We already have plenty of assertions in place to ensure that the insertion index is constant and in range

2 years ago[X86] Add test showing failure to fold freeze(insert_subvector(x,y,c)) -> insert_subv...
Simon Pilgrim [Sat, 3 Sep 2022 12:27:08 +0000 (13:27 +0100)]
[X86] Add test showing failure to fold freeze(insert_subvector(x,y,c)) -> insert_subvector(freeze(x),freeze(y),c)

If at least one of x and y is known to never be poison.

2 years ago[TTI] Add isExpensiveToSpeculativelyExecute wrapper
Simon Pilgrim [Sat, 3 Sep 2022 12:12:15 +0000 (13:12 +0100)]
[TTI] Add isExpensiveToSpeculativelyExecute wrapper

CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if it's better to move an expensive instruction used in a select behind a branch instead.

This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86, as we need to use TCK_SizeAndLatency as a uop count (so it's compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example). By adding an isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.
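
The shape of the idea can be sketched with toy types. Everything below (`ToyTTI`, `ToyX86TTI`, `ToyOpcode`, the cost values and the threshold) is hypothetical and only mirrors the description above; it is not the LLVM interface.

```cpp
#include <cassert>

// Toy stand-ins for illustration only; these are not the LLVM classes.
enum class ToyOpcode { Add, FDiv };
constexpr int ToyExpensiveThreshold = 4; // hypothetical "expensive" cutoff

struct ToyTTI {
  virtual ~ToyTTI() = default;
  // Hypothetical size-and-latency cost of an operation.
  virtual int sizeAndLatencyCost(ToyOpcode Op) const {
    return Op == ToyOpcode::FDiv ? 4 : 1;
  }
  // Default wrapper: keeps the raw "cost >= expensive" comparison.
  virtual bool isExpensiveToSpeculativelyExecute(ToyOpcode Op) const {
    return sizeAndLatencyCost(Op) >= ToyExpensiveThreshold;
  }
};

struct ToyX86TTI : ToyTTI {
  // The cost now models a uop count, so a divide looks cheap.
  int sizeAndLatencyCost(ToyOpcode) const override { return 1; }
  // Target override: still treat divides as expensive to speculate.
  bool isExpensiveToSpeculativelyExecute(ToyOpcode Op) const override {
    return Op == ToyOpcode::FDiv ||
           ToyTTI::isExpensiveToSpeculativelyExecute(Op);
  }
};
```

Callers ask the wrapper rather than comparing costs directly, so a target can change what its cost numbers mean without changing which instructions stay behind a branch.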

2 years ago[libc++] Implement P2273R3 (`constexpr` `unique_ptr`)
Igor Zhukov [Sat, 3 Sep 2022 11:49:50 +0000 (18:49 +0700)]
[libc++] Implement P2273R3 (`constexpr` `unique_ptr`)

Reviewed By: mordante, #libc

Differential Revision: https://reviews.llvm.org/D131315

2 years ago[NFC][libc++] Uses the new way to mark Standard includes.
Mark de Wever [Sat, 3 Sep 2022 11:35:48 +0000 (13:35 +0200)]
[NFC][libc++] Uses the new way to mark Standard includes.

2 years ago[NFC][libc++][format] Removes unused code.
Mark de Wever [Sat, 3 Sep 2022 11:34:14 +0000 (13:34 +0200)]
[NFC][libc++][format] Removes unused code.

The code was for backwards compatibility with code no longer present in
format.

2 years ago[NFC][libc++] Removes GCC-11 support.
Mark de Wever [Sat, 3 Sep 2022 11:20:10 +0000 (13:20 +0200)]
[NFC][libc++] Removes GCC-11 support.

GCC-11 isn't supported in libc++ so remove UNSUPPORTED directives.

2 years ago[X86] Fix fmul throughput/latency/uops counts
Simon Pilgrim [Sat, 3 Sep 2022 10:10:51 +0000 (11:10 +0100)]
[X86] Fix fmul throughput/latency/uops counts

Matches numbers from AMD SoG + Agner - should always be on FPU Pipes 0+1, no additional uops for folded instructions and znver1 double pumps 256-bit vectors and is always latency = 4cy for f64 multiplies

Noticed while adding fmul CostKinds support to the x86 cost models in rG0735200e3f50 and znver1 wasn't being flagged as requiring 2uop for 256-bit vectors

2 years ago[CostModel][X86] Add CostKinds handling for fmul ops
Simon Pilgrim [Sat, 3 Sep 2022 09:42:20 +0000 (10:42 +0100)]
[CostModel][X86] Add CostKinds handling for fmul ops

This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'

2 years ago[CostModel][X86] Remove unused float x87 costs
Simon Pilgrim [Sat, 3 Sep 2022 08:59:14 +0000 (09:59 +0100)]
[CostModel][X86] Remove unused float x87 costs

We only need the double costs for SSE1 fallback

2 years agoRevert "[Clang] change default storing path of `-ftime-trace`"
Junduo Dong [Sat, 3 Sep 2022 08:38:37 +0000 (01:38 -0700)]
Revert "[Clang] change default storing path of `-ftime-trace`"

This reverts commit 38941da066a7b785ba4771710189172e94e37824.

2 years agoRevert "[driver][clang] remove the check-time-trace test on the platform "PS4/PS5...
Junduo Dong [Sat, 3 Sep 2022 08:37:55 +0000 (01:37 -0700)]
Revert "[driver][clang] remove the check-time-trace test on the platform "PS4/PS5/Hexagon""

This reverts commit 39221ad55752c246bb8448a181847103432e12b2.

2 years ago[DWARFLinker] Refactor clang modules loading code.
Alexey Lapshin [Wed, 31 Aug 2022 12:13:26 +0000 (15:13 +0300)]
[DWARFLinker] Refactor clang modules loading code.

The current implementation of the registerModuleReference() function not only
"registers" module reference, but also clones referenced module
(inside loadClangModule()). That may lead to cloning the module with
incorrect options (registerModuleReference() examines module references
and additionally accumulates MaxDwarfVersion and accel tables info).
Since accumulated options may differ from the current values,
it is incorrect to clone modules before options are fully accumulated.

This patch separates the "cloning" code from the "registering" code, so
that options are accumulated in the "registering" stage and "cloning"
is done after all modules are registered and options are accumulated.
It also adds a callback for loaded compile units, which can be used for
D132755 and D132371 (to allow doing options accumulation outside
of DWARFLinker).
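
A toy model of the two stages, assuming a single accumulated option (the maximum DWARF version). The types and function names are hypothetical and do not correspond to the DWARFLinker API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model for illustration; names are hypothetical, not the DWARFLinker API.
struct ToyModule { int DwarfVersion; };
struct ToyClone { int ClonedWithMaxVersion; };

// Stage 1 ("registering"): only accumulate options across all modules.
int accumulateMaxDwarfVersion(const std::vector<ToyModule> &Modules) {
  int Max = 0;
  for (const ToyModule &M : Modules)
    Max = std::max(Max, M.DwarfVersion);
  return Max;
}

// Stage 2 ("cloning"): runs after stage 1, so every clone sees final options.
std::vector<ToyClone> cloneAll(const std::vector<ToyModule> &Modules) {
  const int Max = accumulateMaxDwarfVersion(Modules);
  std::vector<ToyClone> Clones;
  for (std::size_t I = 0; I < Modules.size(); ++I)
    Clones.push_back(ToyClone{Max});
  return Clones;
}
```

If cloning were interleaved with registration, the first module would be cloned with whatever maximum had been seen so far, which is the ordering problem described above.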

Differential Revision: https://reviews.llvm.org/D133047

2 years ago[libc++] Fixes generated output CI job.
Mark de Wever [Sat, 3 Sep 2022 08:19:35 +0000 (10:19 +0200)]
[libc++] Fixes generated output CI job.

It seems there was another file with the same issue, which didn't show
up initially.

2 years ago[NFC][libc++] Moves transitive includes location.
Mark de Wever [Fri, 2 Sep 2022 15:53:28 +0000 (17:53 +0200)]
[NFC][libc++] Moves transitive includes location.

As discussed in D132284 they will be moved to the end.

Reviewed By: #libc, Mordante

Differential Revision: https://reviews.llvm.org/D133212

2 years ago[libc++] Fixes generated output CI job.
Mark de Wever [Sat, 3 Sep 2022 08:04:44 +0000 (10:04 +0200)]
[libc++] Fixes generated output CI job.

2 years ago[bazel] Port f7b8a70e7a1738e0fc6574e3cf8faa4fa1f34eba
Benjamin Kramer [Sat, 3 Sep 2022 07:55:20 +0000 (09:55 +0200)]
[bazel] Port f7b8a70e7a1738e0fc6574e3cf8faa4fa1f34eba

2 years agoResubmit "[MLIR] Remove unused config attributes from lit.site.cfg.py"
Christian Sigg [Sat, 3 Sep 2022 06:49:51 +0000 (08:49 +0200)]
Resubmit "[MLIR] Remove unused config attributes from lit.site.cfg.py"

This resubmits commit 0816b62, reverted in commit 328bbab, but without removing the config.target_triple.

Lit checks UNSUPPORTED tags in the input against the config.target_triple (https://llvm.org/docs/TestingGuide.html#constraining-test-execution).

The original commit made the following bots start failing, because unsupported tests were no longer skipped:
- s390x: https://lab.llvm.org/buildbot/#/builders/199/builds/9247
- Windows: https://lab.llvm.org/buildbot/#/builders/13/builds/25321
- Sanitizer: https://lab.llvm.org/buildbot/#/builders/5/builds/27187

2 years ago[clang-format] Fix a bug in merging blocks with a wrapped l_brace
owenca [Thu, 1 Sep 2022 06:19:08 +0000 (23:19 -0700)]
[clang-format] Fix a bug in merging blocks with a wrapped l_brace

When the opening brace of a control statement block is wrapped, we
must check the previous line to determine whether to try to merge
the block.

Fixes #38639.
Fixes #48007.
Fixes #57421.

Differential Revision: https://reviews.llvm.org/D133093

2 years ago[ORC-RT] Refactor ORC runtime CMake for future test tool(s).
Lang Hames [Sat, 3 Sep 2022 03:55:35 +0000 (20:55 -0700)]
[ORC-RT] Refactor ORC runtime CMake for future test tool(s).

We want to move functionality from the LLVM ORCTargetProcess library into the
ORC runtime, and this will mean implementing remote-executor testing tools
(like llvm-jitlink-executor and lli-child-target) in the ORC runtime.

This patch refactors the ORC runtime build system to introduce an
add_orc_tool function that can be used to add new test tools. The code is
modeled on existing functions for adding unit tests.

A placeholder orc-rt-executor tool and test are added to verify that the
config changes behave as expected.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D133084

2 years ago[driver][clang] remove the check-time-trace test on the platform "PS4/PS5/Hexagon"
Junduo Dong [Sat, 3 Sep 2022 02:45:26 +0000 (19:45 -0700)]
[driver][clang] remove the check-time-trace test on the platform "PS4/PS5/Hexagon"

One of the test cases in that test is designed to test the compiling
jobs with a linking stage, but the PS4/PS5/Hexagon platform requires
an external linker that isn't present.

So this test does not support PS4/PS5/Hexagon.

2 years ago[gn build] Port bc8fd9c6335f
LLVM GN Syncbot [Sat, 3 Sep 2022 02:43:17 +0000 (02:43 +0000)]
[gn build] Port bc8fd9c6335f

2 years agoRevert "[libc++] Granularize the rest of memory"
Vitaly Buka [Sat, 3 Sep 2022 02:35:10 +0000 (19:35 -0700)]
Revert "[libc++] Granularize the rest of memory"

Breaks buildbots.

This reverts commit 30adaa730c4768b5eb06719c808b2884fcf53cf3.

2 years ago[gn build] Port 3a49cffe3add
LLVM GN Syncbot [Sat, 3 Sep 2022 02:22:45 +0000 (02:22 +0000)]
[gn build] Port 3a49cffe3add

2 years ago[libc++] Implement P2445R1 (`std::forward_like`)
Igor Zhukov [Sun, 21 Aug 2022 15:21:08 +0000 (22:21 +0700)]
[libc++] Implement P2445R1 (`std::forward_like`)

Co-authored-by: A. Jiang <de34@live.cn>
Reviewed By: philnik, huixie90, #libc

Differential Revision: https://reviews.llvm.org/D132327

2 years agoWork around Windows buildbot failure.
Richard Smith [Sat, 3 Sep 2022 02:08:04 +0000 (19:08 -0700)]
Work around Windows buildbot failure.

-fmodules-local-submodule-visibility and -fdelayed-template-parsing
don't work properly together because the template is parsed in the
visibility context of the wrong module.

2 years ago[Clang] Fix lambda CheckForDefaultedFunction(...) so that it checks the CXXMethodDecl...
Shafik Yaghmour [Sat, 3 Sep 2022 00:45:44 +0000 (17:45 -0700)]
[Clang] Fix lambda CheckForDefaultedFunction(...) so that it checks the CXXMethodDecl is not deleted before attempting to call DefineDefaultedFunction(...)

I discovered this additional bug at the end of working on D132906

Sema::CheckCompletedCXXClass(...) uses a lambda CheckForDefaultedFunction to
verify that each CXXMethodDecl holds to the expected invariants before passing it
on to DefineDefaultedFunction.

The lambda is currently missing a check that the method is not deleted; this adds
that check and a test that crashed without it.
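
For context, a small example of a method that is explicitly defaulted yet ends up deleted. This is a generic illustration with made-up type names, not the test case from the issue.

```cpp
#include <cassert>
#include <type_traits>

struct NoCopy {
  NoCopy() = default;
  NoCopy(const NoCopy &) = delete;
};

// The copy constructor is explicitly defaulted, but because the member cannot
// be copied it is defined as deleted. Nothing should try to define its body.
struct Wrapper {
  NoCopy Member;
  Wrapper() = default;
  Wrapper(const Wrapper &) = default;
};

bool wrapperIsCopyConstructible() {
  return std::is_copy_constructible<Wrapper>::value;
}
```

Some compilers warn that the defaulted copy constructor is implicitly deleted; that warning is expected here.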

This fixes: https://github.com/llvm/llvm-project/issues/57516

Differential Revision: https://reviews.llvm.org/D133177

2 years ago[Clang] change default storing path of `-ftime-trace`
Junduo Dong [Tue, 9 Aug 2022 05:04:38 +0000 (22:04 -0700)]
[Clang] change default storing path of `-ftime-trace`

1. This implementation changes only the default storing behavior of -ftime-trace.

That is, if the compiling job contains a linking action, the executable file's directory may be seen as the main work directory.
Thus the time trace files are stored in the same directory as the linking result.

With this approach, the user can easily find the time-trace files in the main work directory. The improved demo results:

```
$ clang++ -ftime-trace -o main.out /demo/main.cpp
$ ls .
main.out   main-[random-string].json
```

2. In addition, the main code for inferring the time-trace files' path has been refactored.

* The <path> of -ftime-trace=<path> is inferred in the clang driver
* After that, -ftime-trace=<path> can be added to clang's options

With this approach, the dirty work of path processing and judging is done in the driver layer, so that clang can focus on its main work.

 #   $ clang -ftime-trace -o xxx.out xxx.cpp

Differential Revision: https://reviews.llvm.org/D131469

2 years agoRevert "[mlir][cmake] Don't add dependencies on mlir-(generic-)headers"
Mehdi Amini [Sat, 3 Sep 2022 01:43:21 +0000 (01:43 +0000)]
Revert "[mlir][cmake] Don't add dependencies on mlir-(generic-)headers"

This reverts commit 7691b69d5b2f5e9d8b210add22926335b3541444.

Bots are broken because we're missing CMake dependencies all around now.

2 years agoAttempt to make AIX bot happier.
Richard Smith [Sat, 3 Sep 2022 01:43:27 +0000 (18:43 -0700)]
Attempt to make AIX bot happier.

2 years ago[mlir:vscode] Add support for viewing and editing a bytecode file as .mlir
River Riddle [Fri, 2 Sep 2022 23:24:17 +0000 (16:24 -0700)]
[mlir:vscode] Add support for viewing and editing a bytecode file as .mlir

This commit adds support for interacting with a (valid) bytecode file in the same
way as .mlir. This allows editing, using all of the traditional LSP features, etc. but
still using bytecode as the on-disk serialization format. Loading a bytecode file this
way will fail if the bytecode is invalid, and saving will fail if the edited .mlir is invalid.

Differential Revision: https://reviews.llvm.org/D132970

2 years agoFix out-of-bounds memory access in test
Adrian Prantl [Sat, 3 Sep 2022 01:14:12 +0000 (18:14 -0700)]
Fix out-of-bounds memory access in test

2 years ago[NFC][clang] LLVM_FALLTHROUGH => [[fallthrough]]
Sheng [Sat, 3 Sep 2022 00:58:31 +0000 (08:58 +0800)]
[NFC][clang] LLVM_FALLTHROUGH => [[fallthrough]]
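
For readers unfamiliar with the attribute, a minimal example of the standard C++17 spelling. The function is made up for illustration and is not taken from the patch.

```cpp
#include <cassert>

// Counts how many case bodies run when control enters the switch at Start.
int casesVisitedFrom(int Start) {
  int Visited = 0;
  switch (Start) {
  case 0:
    ++Visited;
    [[fallthrough]]; // standard C++17 attribute marking intentional fallthrough
  case 1:
    ++Visited;
    break;
  default:
    break;
  }
  return Visited;
}
```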

2 years ago[DFSan] Increase size of buffer to fix possibly-flakey test.
Andrew Browne [Fri, 2 Sep 2022 19:12:16 +0000 (12:12 -0700)]
[DFSan] Increase size of buffer to fix possibly-flakey test.

Observed a test failure where "Returned length: 3054".

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D133227

2 years ago[mlir][sparse] Introduce sparse_tensor.storage operator to create a sparse tensor...
Peiming Liu [Fri, 2 Sep 2022 20:31:47 +0000 (20:31 +0000)]
[mlir][sparse] Introduce sparse_tensor.storage operator to create a sparse tensor storage tuple

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133231

2 years agoRevert "[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape`"
Mehdi Amini [Fri, 2 Sep 2022 23:34:52 +0000 (23:34 +0000)]
Revert "[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape`"

This reverts commit 5711957875738c1318f89afd7bf4be388f85a087.

A circular dependency is introduced here from Dialect/Utils/ to the
ViewLikeInterface, but it already depends on Dialect/Utils.

Also this introduces a dependency from lib/Dialect/Tensor to Linalg,
which isn't obviously correct from a layering point of view.

2 years ago[ODRHash diagnostics] Transform method `ASTReader::diagnoseOdrViolations` into a...
Volodymyr Sapsai [Fri, 2 Sep 2022 22:37:23 +0000 (15:37 -0700)]
[ODRHash diagnostics] Transform method `ASTReader::diagnoseOdrViolations` into a class `ODRDiagsEmitter`. NFC.

Preparing to use diagnostics about ODR hash mismatches outside of ASTReader.

Differential Revision: https://reviews.llvm.org/D128490

2 years agoRevert "[AggressiveInstCombine] Lower Table Based CTTZ"
Richard Smith [Fri, 2 Sep 2022 23:18:12 +0000 (16:18 -0700)]
Revert "[AggressiveInstCombine] Lower Table Based CTTZ"

This reverts commit fec01ee3f5244bb9a04bc4310fc892c56c5b6bab.

According to asan, this patch introduces a heap use after free.

2 years ago[Matrix] Use print instead of dump for matrix-print-after-transpose-opt
Francis Visoiu Mistrih [Fri, 2 Sep 2022 23:05:31 +0000 (16:05 -0700)]
[Matrix] Use print instead of dump for matrix-print-after-transpose-opt

We should be able to use this option even if LLVM_ENABLE_DUMP is not on.

(should fix the bots too)

2 years agoAdd driver test for -fmodule-name and -fmodule-map-file use without -fmodules.
Richard Smith [Fri, 2 Sep 2022 21:47:05 +0000 (14:47 -0700)]
Add driver test for -fmodule-name and -fmodule-map-file use without -fmodules.

2 years ago[NFC] Remove duplicate code in SBTypeCategory
Jorge Gorbe Moya [Fri, 2 Sep 2022 19:59:39 +0000 (12:59 -0700)]
[NFC] Remove duplicate code in SBTypeCategory

TypeCategoryImpl has its own implementation of these, so it makes no
sense to have the same logic inlined in SBTypeCategory.

There are other methods in SBTypeCategory that are directly implemented
there, instead of delegating to TypeCategoryImpl (which IMO kinda
defeats the point of having an "opaque" member pointer in the SB type),
but they don't have equivalent implementations in TypeCategoryImpl, so
this patch only picks the low-hanging fruit for now.

2 years ago[Matrix] Simplify matmuls with scalars
Francis Visoiu Mistrih [Thu, 1 Sep 2022 19:06:59 +0000 (12:06 -0700)]
[Matrix] Simplify matmuls with scalars

If one of the operands is a transposed splat, the transpose can be
removed.

This is useful to simplify when transposes are distributed to operands
of a matmul:

* k^T -> k
* (A * k)^T -> A^T * k
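
The second rule can be checked on a concrete 2x2 case. This is a plain C++ sketch with hypothetical helper names; it does not use the LLVM matrix intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat2 = std::array<std::array<int, 2>, 2>;

Mat2 transpose(const Mat2 &M) {
  Mat2 R{};
  for (std::size_t I = 0; I < 2; ++I)
    for (std::size_t J = 0; J < 2; ++J)
      R[J][I] = M[I][J];
  return R;
}

// Multiplies every element by the scalar K (a matmul with a splat of K).
Mat2 scale(const Mat2 &M, int K) {
  Mat2 R{};
  for (std::size_t I = 0; I < 2; ++I)
    for (std::size_t J = 0; J < 2; ++J)
      R[I][J] = M[I][J] * K;
  return R;
}

// Checks that transposing (A * k) equals (A transposed) * k for one instance.
bool transposeDistributesOverScalar() {
  const Mat2 A = {{{{1, 2}}, {{3, 4}}}};
  const int K = 5;
  return transpose(scale(A, K)) == scale(transpose(A), K);
}
```

The scalar is unchanged by the transpose, which is why the transpose on a splat operand can simply be dropped.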

Differential Revision: https://reviews.llvm.org/D130177

2 years ago[clang-tidy] Skip copy assignment operators with nonstandard return types
Alexander Shaposhnikov [Wed, 31 Aug 2022 09:13:21 +0000 (09:13 +0000)]
[clang-tidy] Skip copy assignment operators with nonstandard return types

Skip copy assignment operators with nonstandard return types
since they cannot be defaulted.
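
A short illustration of the distinction, with made-up type names. The second operator has a body that looks defaultable, but its return type rules out `= default`.

```cpp
#include <cassert>

// Standard form: returns a reference to the class, so it can be defaulted.
struct Standard {
  int Value = 0;
  Standard &operator=(const Standard &) = default;
};

// Nonstandard return type: writing "= default" here would be ill-formed,
// so a fix-it suggesting it would not produce valid code.
struct NonStandard {
  int Value = 0;
  void operator=(const NonStandard &Other) { Value = Other.Value; }
};
```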

Test plan: ninja check-clang-tools

Differential revision: https://reviews.llvm.org/D133006

2 years ago[analyzer] Add more information to the Exploded Graph
isuckatcs [Thu, 4 Aug 2022 17:49:05 +0000 (19:49 +0200)]
[analyzer] Add more information to the Exploded Graph

This patch dumps every state trait in the egraph. Empty state
traits are no longer dumped; instead they are treated as null
by the egraph rewriter script, which solves backward
compatibility issues.

Differential Revision: https://reviews.llvm.org/D131187

2 years ago[clang-tidy] Restrict use-equals-default to c++11-or-later
Alexander Shaposhnikov [Fri, 2 Sep 2022 22:19:06 +0000 (22:19 +0000)]
[clang-tidy] Restrict use-equals-default to c++11-or-later

Restrict use-equals-default to c++11-or-later.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D132998

2 years ago[CostModel][AArch64] Fix ctpop intrinsic cost when NEON is disabled.
Eli Friedman [Fri, 2 Sep 2022 22:17:55 +0000 (15:17 -0700)]
[CostModel][AArch64] Fix ctpop intrinsic cost when NEON is disabled.

If we don't have NEON, we use the generic fallback, which takes 12
instructions. Make sure the costs reflect that.

(On a related note, we could optimize the generic fallback a bit. It
currently uses sequences like lsr+and+add; if we use and+lsr+add
instead, we can fold the lsr into the add.)
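
The parenthetical can be illustrated with the textbook bit-parallel population count. This is an assumption about the general shape of such a fallback, not necessarily the exact sequence the backend emits; the two functions differ only in whether the middle step shifts before or after masking.

```cpp
#include <cassert>
#include <cstdint>

// Textbook bit-parallel population count for 32-bit values.
std::uint32_t popcountShiftThenMask(std::uint32_t V) {
  V = V - ((V >> 1) & 0x55555555u);
  V = (V & 0x33333333u) + ((V >> 2) & 0x33333333u); // shift, mask, add
  V = (V + (V >> 4)) & 0x0F0F0F0Fu;
  return (V * 0x01010101u) >> 24;
}

// Same computation with the middle step reordered as mask, shift, add.
std::uint32_t popcountMaskThenShift(std::uint32_t V) {
  V = V - ((V >> 1) & 0x55555555u);
  V = (V & 0x33333333u) + ((V & 0xCCCCCCCCu) >> 2); // mask, shift, add
  V = (V + (V >> 4)) & 0x0F0F0F0Fu;
  return (V * 0x01010101u) >> 24;
}
```

In the reordered form the shift feeds the add directly, which is the position where a target with shifted-register operands can fold it.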

Differential Revision: https://reviews.llvm.org/D133154

2 years ago[clang-format] Fix annotating when deleting array of pointers
jackh [Tue, 30 Aug 2022 05:56:45 +0000 (13:56 +0800)]
[clang-format] Fix annotating when deleting array of pointers

Fixes https://github.com/llvm/llvm-project/issues/57418

The token `*` below should be annotated as `UnaryOperator`.

```
delete[] *ptr;
```

Reviewed By: owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D132911

2 years ago[mlir][spirv] Convert some 0-D vector extract/insertelement ops
Lei Zhang [Fri, 2 Sep 2022 21:47:31 +0000 (17:47 -0400)]
[mlir][spirv] Convert some 0-D vector extract/insertelement ops

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133183

2 years agoRevert "[MLIR] Remove unused config attributes from lit.site.cfg.py"
Mitch Phillips [Fri, 2 Sep 2022 21:39:05 +0000 (14:39 -0700)]
Revert "[MLIR] Remove unused config attributes from lit.site.cfg.py"

This reverts commit 0816b629c9da5aa8885c4cb3fbbf5c905d37f0ee.

Reason: Broke the sanitizer buildbots. More information available in the
original phabricator review: https://reviews.llvm.org/D132726

2 years ago[mlir][NVGPU] Adding Support for cp_async_zfill via Inline Asm
Manish Gupta [Fri, 2 Sep 2022 21:20:11 +0000 (21:20 +0000)]
[mlir][NVGPU] Adding Support for cp_async_zfill via Inline Asm

`cp_async_zfill` is currently not present in the nvvm backend; this patch adds `cp_async_zfill` support by emitting inline asm when lowering from `nvgpu` to `nvvm`.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D132269

2 years ago[mlir][spirv] Support more max/min vector.reduction
Lei Zhang [Fri, 2 Sep 2022 21:21:57 +0000 (17:21 -0400)]
[mlir][spirv] Support more max/min vector.reduction

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133168

2 years ago[mlir][spirv] Add some folders for spv.CompositeExtract
Lei Zhang [Fri, 2 Sep 2022 21:17:45 +0000 (17:17 -0400)]
[mlir][spirv] Add some folders for spv.CompositeExtract

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133167

2 years ago[libc][NFC] clang-format
Alex Brachet [Fri, 2 Sep 2022 21:17:07 +0000 (21:17 +0000)]
[libc][NFC] clang-format

2 years ago[mlir][spirv] Add support for converting gpu.shuffle xor
Lei Zhang [Fri, 2 Sep 2022 21:14:53 +0000 (17:14 -0400)]
[mlir][spirv] Add support for converting gpu.shuffle xor

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133054

2 years ago[mlir][spirv] Define various spv.GroupNonUniformShuffle ops
Lei Zhang [Fri, 2 Sep 2022 21:06:52 +0000 (17:06 -0400)]
[mlir][spirv] Define various spv.GroupNonUniformShuffle ops

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133041

2 years ago[mlir][spirv] Fix MaxVersion for ops after supporting v1.6
Lei Zhang [Fri, 2 Sep 2022 21:02:00 +0000 (17:02 -0400)]
[mlir][spirv] Fix MaxVersion for ops after supporting v1.6

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D133234

2 years ago[mlir][openacc][NFC] Fix typo
Valentin Clement [Fri, 2 Sep 2022 21:00:59 +0000 (23:00 +0200)]
[mlir][openacc][NFC] Fix typo

2 years agouse LLVM_USE_STATIC_ZSTD
Cole [Fri, 2 Sep 2022 21:00:07 +0000 (21:00 +0000)]
use LLVM_USE_STATIC_ZSTD

Removes LLVM_PREFER_STATIC_ZSTD in favor of LLVM_USE_STATIC_ZSTD.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D133222

2 years ago[TEST][msan] Reformat RUN lines
Vitaly Buka [Fri, 2 Sep 2022 20:24:56 +0000 (13:24 -0700)]
[TEST][msan] Reformat RUN lines

2 years ago[HLSL] Generate buffer subscript operators
Chris Bieneman [Fri, 2 Sep 2022 19:32:24 +0000 (14:32 -0500)]
[HLSL] Generate buffer subscript operators

In HLSL, buffer types support array subscripting syntax for loads and
stores. This change fleshes out the subscript operators to become array
accesses on the underlying handle pointer. This will allow LLVM
optimization passes to optimize resource accesses the same way any other
memory access would be optimized.
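
In plain C++ terms the idea looks roughly like the following. `ToyRWBuffer` and its `Handle` member are hypothetical stand-ins, not the HLSL resource types or the Clang implementation.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a buffer resource; not the HLSL or Clang implementation.
template <typename T> class ToyRWBuffer {
  T *Handle; // hypothetical underlying handle pointer

public:
  explicit ToyRWBuffer(T *H) : Handle(H) {}
  // Subscripting becomes a plain array access on the handle pointer.
  T &operator[](std::size_t Index) { return Handle[Index]; }
  const T &operator[](std::size_t Index) const { return Handle[Index]; }
};
```

Because the operator reduces to ordinary pointer indexing, loads and stores through it look like any other memory access to the optimizer.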

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131268

2 years ago[gn build] Port 30adaa730c47
LLVM GN Syncbot [Fri, 2 Sep 2022 19:47:21 +0000 (19:47 +0000)]
[gn build] Port 30adaa730c47

2 years ago[libc++][NFC] Copy the whole union instead of a member; also remove __zero()
Nikolas Klauser [Tue, 30 Aug 2022 15:43:14 +0000 (17:43 +0200)]
[libc++][NFC] Copy the whole union instead of a member; also remove __zero()

This doesn't affect code-gen

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D132951

2 years ago[libc++] Granularize the rest of memory
Nikolas Klauser [Fri, 2 Sep 2022 14:24:11 +0000 (16:24 +0200)]
[libc++] Granularize the rest of memory

Reviewed By: ldionne, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D132790

2 years ago[Driver] Remove cc1 Separate form -fvisibility
Fangrui Song [Fri, 2 Sep 2022 19:40:00 +0000 (12:40 -0700)]
[Driver] Remove cc1 Separate form -fvisibility

2 years ago[libc++] Remove noexcept specifier from operator""s
Nikolas Klauser [Fri, 2 Sep 2022 14:20:28 +0000 (16:20 +0200)]
[libc++] Remove noexcept specifier from operator""s

For some reason `operator""s(const char8_t*, size_t)` was marked `noexcept`. Remove it and add regression tests.
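
A small sketch of why `noexcept` is a poor fit for this operator: the literal constructs a `std::string`, which may allocate. The example uses the `char` overload so it does not require `char8_t` support, and the helper names are made up.

```cpp
#include <cassert>
#include <string>

using namespace std::string_literals;

// The literal builds a std::string, which may allocate and therefore throw.
std::string makeGreeting() { return "hello"s; }

// Reports whether this standard library declares the char overload noexcept.
// The result is implementation-specific, so no value is assumed here.
bool charLiteralOperatorIsNoexcept() { return noexcept("hello"s); }
```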

Reviewed By: ldionne, huixie90, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D132340

2 years ago[test] Change cc1 -fvisibility to -fvisibility=
Fangrui Song [Fri, 2 Sep 2022 19:36:44 +0000 (12:36 -0700)]
[test] Change cc1 -fvisibility to -fvisibility=

2 years ago[libc++] Make the naming of private member variables consistent and enforce it throug...
Nikolas Klauser [Fri, 2 Sep 2022 14:19:07 +0000 (16:19 +0200)]
[libc++] Make the naming of private member variables consistent and enforce it through readability-identifier-naming

Reviewed By: ldionne, #libc

Spies: aheejin, sstefan1, libcxx-commits

Differential Revision: https://reviews.llvm.org/D129386

2 years ago[libc++] Enable [[nodiscard]] extensions by default
Nikolas Klauser [Thu, 1 Sep 2022 10:02:58 +0000 (12:02 +0200)]
[libc++] Enable [[nodiscard]] extensions by default

Adding `[[nodiscard]]` to functions is a conforming extension and done extensively in the MSVC STL.
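
As a reminder of what the attribute does, a minimal C++17 example with hypothetical helper names, unrelated to any particular libc++ function:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: ignoring the result is almost certainly a mistake,
// so it is marked [[nodiscard]] (C++17).
[[nodiscard]] bool isEmpty(const std::vector<int> &Values) {
  return Values.empty();
}

int countIfNotEmpty(const std::vector<int> &Values) {
  // A bare "isEmpty(Values);" statement here would draw a compiler warning.
  return isEmpty(Values) ? 0 : static_cast<int>(Values.size());
}
```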

Reviewed By: ldionne, EricWF, #libc

Spies: #libc_vendors, cjdb, mgrang, jloser, libcxx-commits

Differential Revision: https://reviews.llvm.org/D128267

2 years ago[mlir][cmake] Don't add dependencies on mlir-(generic-)headers
Jeff Niu [Thu, 1 Sep 2022 18:06:46 +0000 (11:06 -0700)]
[mlir][cmake] Don't add dependencies on mlir-(generic-)headers

Every dialect was dependent on `mlir-headers`, which was causing the
build of any single MLIR dialect to pull in a bunch of extra
dependencies that aren't needed. Now, MLIR dialects will need to
explicitly depend on `MLIR*IncGen` targets to pull in any needed
headers.

This does not impact the actual `mlir-headers` target.

Consider the "simple" Arithmetic dialect. Before:

```
% ninja MLIRArithmeticDialect
[151/812] Building CXX object lib/TableGen/CMakeFiles/LLVMTableGen.dir/JSONBackend.cpp.o
```

After:

```
% ninja MLIRArithmeticDialect
[207/374] Building CXX object tools/mlir/lib/TableGen/CMakeFiles/MLIRTableGen.dir/GenInfo.cpp.o
```

(Both clean builds)

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D133132

2 years ago[mlir][cf-to-llvm] Fix error message
Jeff Niu [Fri, 2 Sep 2022 19:13:07 +0000 (12:13 -0700)]
[mlir][cf-to-llvm] Fix error message

2 years ago[libc][NFC] Use no_sanitize("all")
Alex Brachet [Fri, 2 Sep 2022 19:07:32 +0000 (19:07 +0000)]
[libc][NFC] Use no_sanitize("all")

This function cannot have any instrumentation because its
assembly must match exactly what the debugger is expecting.

Previously it was just a list of the sanitizers we expect
libc to be sanitized with, but this is untenable.

2 years ago[NFC] Make MultiplexExternalSemaSource own sources
Chris Bieneman [Thu, 1 Sep 2022 21:31:23 +0000 (16:31 -0500)]
[NFC] Make MultiplexExternalSemaSource own sources

This change refactors the MultiplexExternalSemaSource to take ownership
of the underlying sources. As a result it makes a larger cleanup of
external source ownership in Sema and the ChainedIncludesSource.

Reviewed By: aaron.ballman, aprantl

Differential Revision: https://reviews.llvm.org/D133158

2 years ago[clang] Change cc1 -fvisibility's canonical spelling to -fvisibility=
Fangrui Song [Fri, 2 Sep 2022 18:49:38 +0000 (11:49 -0700)]
[clang] Change cc1 -fvisibility's canonical spelling to -fvisibility=

2 years ago[flang][docs] Add lowering design doc for parameterized derived-type
Valentin Clement [Fri, 2 Sep 2022 18:45:30 +0000 (20:45 +0200)]
[flang][docs] Add lowering design doc for parameterized derived-type

This document aims to give insight into the representation of parameterized
derived-types (PDTs) in FIR and how PDTs are lowered to FIR and interact
with the runtime.

Reviewed By: jeanPerier, klausler

Differential Revision: https://reviews.llvm.org/D133096

2 years ago[flang] Use APInt to lower 128 bits integer constants
Valentin Clement [Fri, 2 Sep 2022 18:44:44 +0000 (20:44 +0200)]
[flang] Use APInt to lower 128 bits integer constants

Lowering was truncating 128-bit integers to 64 bits. This
patch makes use of APInt to lower 128-bit integers correctly.

```
program bug
  print *, 170141183460469231731687303715884105727_16
end

! Before patch: 18446744073709551615
! With patch: 170141183460469231731687303715884105727
```
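
The "before" value is what remains when only the low 64 bits of the constant are kept. A sketch of that arithmetic, assuming the `unsigned __int128` extension provided by GCC and Clang; the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// 2^127 - 1, the largest signed 128-bit value (the constant in the example).
// unsigned __int128 is a GCC/Clang extension, assumed to be available here.
unsigned __int128 maxSigned128() {
  return (static_cast<unsigned __int128>(1) << 127) - 1;
}

// Keeping only the low 64 bits reproduces the "before patch" value.
std::uint64_t truncateTo64(unsigned __int128 V) {
  return static_cast<std::uint64_t>(V);
}
```

The low 64 bits of 2^127 - 1 are all ones, that is 2^64 - 1 = 18446744073709551615, matching the output shown above.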

Reviewed By: vdonaldson

Differential Revision: https://reviews.llvm.org/D133206

2 years ago[flang] Make sure allocatable components are initialized for temp derived-type
Valentin Clement [Fri, 2 Sep 2022 18:43:18 +0000 (20:43 +0200)]
[flang] Make sure allocatable components are initialized for temp derived-type

Runtime functions expect a clean, unallocated state for the descriptor. This
patch adds a call to the runtime function that initializes the temporary
derived type that is created.

Reviewed By: vdonaldson

Differential Revision: https://reviews.llvm.org/D133189

2 years ago[HLSL] Restrict to supported targets
Chris Bieneman [Fri, 2 Sep 2022 15:45:53 +0000 (10:45 -0500)]
[HLSL] Restrict to supported targets

Someday we would like to support HLSL on a wider range of targets, but
today targeting anything other than `dxil` is likely to cause lots of
headaches. This adds an error and tests to validate that the expected
target is `dxil-?-shadermodel`.

We will continue to do a best effort to ensure the code we write makes
it easy to support other targets (like SPIR-V), but this error will
prevent users from hitting frustrating errors for unsupported cases.
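
As a rough illustration of the `dxil-?-shadermodel` shape being validated, here is a small Python sketch. The function name and the string-splitting approach are hypothetical and are not how clang implements the check; the sketch only assumes the usual `arch-vendor-os[-environment]` triple layout.

```python
def is_supported_hlsl_triple(triple: str) -> bool:
    """Return True if `triple` looks like dxil-<vendor>-shadermodel*.

    Hypothetical helper for illustration only.
    """
    parts = triple.split("-")
    if len(parts) < 3:
        return False
    arch, os_name = parts[0], parts[2]
    # Any vendor is accepted; only the architecture and OS are checked.
    return arch == "dxil" and os_name.startswith("shadermodel")


for candidate in ("dxil-pc-shadermodel6.3-library", "x86_64-unknown-linux-gnu"):
    verdict = "ok" if is_supported_hlsl_triple(candidate) else "unsupported"
    print(candidate, "->", verdict)
```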

Reviewed By: jcranmer-intel, Anastasia

Differential Revision: https://reviews.llvm.org/D132056

2 years ago[mlir][sparse] extend codegen test cases with an additional step after storage expansion
Peiming Liu [Fri, 2 Sep 2022 17:31:54 +0000 (17:31 +0000)]
[mlir][sparse] extend codegen test cases with an additional step after storage expansion

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133219

2 years ago[mlir][sparse] add conversion rules for storage_get/set/callOp
Peiming Liu [Thu, 1 Sep 2022 20:34:05 +0000 (20:34 +0000)]
[mlir][sparse] add conversion rules for storage_get/set/callOp

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133175

2 years ago[Attributor] Simplify offset calculation for a constant GEP
Sameer Sahasrabuddhe [Fri, 2 Sep 2022 18:23:51 +0000 (23:53 +0530)]
[Attributor] Simplify offset calculation for a constant GEP

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D132931

2 years ago[mlir][ods] Add ArrayOfAttr for creating custom array attributes
Jeff Niu [Thu, 1 Sep 2022 16:30:54 +0000 (09:30 -0700)]
[mlir][ods] Add ArrayOfAttr for creating custom array attributes

`ArrayOfAttr` can be used to easily create an attribute that just
contains an array of something. The elements can be other attributes,
in which case the custom parsers and printers are invoked directly for
nice syntax, or any C++ type that supports parsing and printing, either
through custom `printer` and `parser` methods or `FieldParser`.

An array of integers:

```
def ArrayOfInts : ArrayOfAttr<Test_Dialect, "ArrayOfInts", "array_of_ints",
                              "int32_t">;
```

When embedded in an op's assembly format, it will look like

```
foo.ints value = [1, 2, 3]
```

An array of enums, when embedded in an op's assembly format, will look
like:

```
foo.enums value = [first, second, last]
```

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D133131

2 years ago[Verifier] Skip debug location check for some non-inlinable functions
Yuanfang Chen [Fri, 2 Sep 2022 17:40:37 +0000 (10:40 -0700)]
[Verifier] Skip debug location check for some non-inlinable functions

If a callee function is not interposable, skip the debug location check for its call sites. Doing this is instrumentation-friendly; otherwise, under some conditions, this check triggers for some un-inlinable call sites.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D133060

2 years ago[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creat...
Arthur Eubanks [Wed, 24 Aug 2022 18:18:52 +0000 (11:18 -0700)]
[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests

The current code is basically just emulating what the analysis manager does.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D132581

2 years ago[docs] -fivisibility= allows protected and internal
Fangrui Song [Fri, 2 Sep 2022 17:49:10 +0000 (10:49 -0700)]
[docs] -fivisibility= allows protected and internal

2 years ago[test] Remove problematic thread from MainLoopTest to fix flakiness
Jordan Rupprecht [Fri, 2 Sep 2022 17:21:23 +0000 (10:21 -0700)]
[test] Remove problematic thread from MainLoopTest to fix flakiness

This test, specifically `TwoSignalCallbacks`, can be a little bit flaky, failing in around 5/2000 runs.

POSIX says:

> If the value of pid causes sig to be generated for the sending process, and if sig is not blocked for the calling thread and if no other thread has sig unblocked or is waiting in a sigwait() function for sig, either sig or at least one pending unblocked signal shall be delivered to the sending thread before kill() returns.

The problem is that in test setup, we create a new thread with `std::async` and that is occasionally not cleaned up. This leaves that thread available to eat the signal we're polling for.

The need for this to be async does not apply anymore, so we can just make it synchronous.

This makes the test pass in 10000 runs.
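
To put the 10000 clean runs in context, here is a back-of-the-envelope Python sketch. It assumes that runs are independent and that the pre-fix failure rate was exactly 5/2000; both are simplifications, not measurements from the patch.

```python
# Observed flake rate before the fix: about 5 failures in 2000 runs.
failure_rate = 5 / 2000

# Assumption: each run is independent and the old rate still applies.
runs = 10000
p_all_pass = (1 - failure_rate) ** runs

# Probability of seeing 10000 consecutive passes if nothing had changed.
print(f"{p_all_pass:.2e}")
```

Under those assumptions the chance of 10000 consecutive passes without a fix is far below one in a billion, so the clean run is strong evidence that the flakiness is gone.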

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D133181

2 years ago[docs] Regenerate clang/docs/ClangCommandLineReference.rst
Fangrui Song [Fri, 2 Sep 2022 17:29:37 +0000 (10:29 -0700)]
[docs] Regenerate clang/docs/ClangCommandLineReference.rst

2 years ago[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape`
Christopher Bate [Tue, 16 Aug 2022 23:27:06 +0000 (17:27 -0600)]
[mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape`

This change adds a set of utilities to replace the result of a
`tensor.collapse_shape -> tensor.extract_slice` chain with the
equivalent result formed by aggregating slices of the
`tensor.collapse_shape` source. In general, it is not possible to
commute `extract_slice` and `collapse_shape` if linearized dimensions
are sliced. The i-th dimension of the `tensor.collapse_shape`
result is a "linearized sliced dimension" if:

1) The reassociation indices of `tensor.collapse_shape` in the i-th position
   have size greater than 1 (multiple dimensions of the input are collapsed)
2) The i-th dimension is sliced by `tensor.extract_slice`.

We can work around this by stitching together the result of
`tensor.extract_slice` by iterating over any linearized sliced dimensions.
This is equivalent to "tiling" the linearized-and-sliced dimensions of
the `tensor.collapse_shape` operation in order to manifest the result
tile (the result of the `tensor.extract_slice`). The user of the
utilities must provide the mechanism to create the tiling (e.g. a loop).
In the tests, it is demonstrated how to apply the utilities using either
`scf.for` or `scf.foreach_thread`.

The below example illustrates the pattern using `scf.for`:

```
%0 = linalg.generic ... -> tensor<3x7x11x10xf32>
%1 = tensor.collapse_shape %0 [[0, 1, 2], [3]] : ... to tensor<231x10xf32>
%2 = tensor.extract_slice %1 [13, 0] [10, 10] [2, 1] : .... tensor<10x10xf32>
```

We can construct %2 by generating the following IR:

```
%dest = linalg.init_tensor() : tensor<10x10xf32>
%2 = scf.for %iv = %c0 to %c10 step %c1 iter_args(%arg0 = %dest) -> tensor<10x10xf32> {
   // Step 1: Map this output idx (%iv) to a multi-index for the input (%3):
   %linear_index = affine.apply affine_map<(d0)[]->(d0*2 + 13)>(%iv)
   %3:3 = arith.delinearize_index %linear_index into (3, 7, 11)
   // Step 2: Extract the slice from the input
   %4 = tensor.extract_slice %0 [%3#0, %3#1, %3#2, 0] [1, 1, 1, 10] [1, 1, 1, 1] :
         tensor<3x7x11x10xf32> to tensor<1x1x1x10xf32>
   %5 = tensor.collapse_shape %4 [[0, 1, 2], [3]] :
         tensor<1x1x1x10xf32> into tensor<1x10xf32>
   // Step 3: Insert the slice into the destination
   %6 = tensor.insert_slice %5 into %arg0 [%iv, 0] [1, 10] [1, 1] :
         tensor<1x10xf32> into tensor<10x10xf32>
   scf.yield %6 : tensor<10x10xf32>
}
```
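
The index arithmetic in Step 1 can be checked outside of MLIR. The following Python sketch uses hypothetical helper names and plain integers rather than any MLIR API. It takes the offset (13), size (10), and stride (2) from the `tensor.extract_slice` above and the basis (3, 7, 11) from the collapsed dimensions, and computes which source multi-index each output row reads from.

```python
def delinearize(linear_index, basis):
    """Split a row-major linear index into one index per entry of `basis`."""
    indices = []
    for size in reversed(basis):
        indices.append(linear_index % size)
        linear_index //= size
    return tuple(reversed(indices))


def sliced_rows(offset, size, stride, basis):
    """Source multi-index for each row of the strided slice."""
    return [delinearize(offset + iv * stride, basis) for iv in range(size)]


rows = sliced_rows(offset=13, size=10, stride=2, basis=(3, 7, 11))
for iv, (i, j, k) in enumerate(rows):
    # Re-linearizing must give back the row of the collapsed tensor.
    assert (i * 7 + j) * 11 + k == 13 + 2 * iv
    print(iv, (i, j, k))
```

Each tuple plays the role of `%3#0, %3#1, %3#2` for one loop iteration; the loop body then extracts a `1x1x1x10` slice at that position and inserts it as row `%iv` of the destination.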

The pattern was discussed in the RFC here: https://discourse.llvm.org/t/rfc-tensor-extracting-slices-from-tensor-collapse-shape/64034

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D129699

2 years ago[AArch64][TTI] Add cost table entry for trunc over vector of integers.
Mingming Liu [Tue, 30 Aug 2022 05:54:41 +0000 (22:54 -0700)]
[AArch64][TTI] Add cost table entry for trunc over vector of integers.

1) Tablegen patterns exist to use 'xtn' and 'uzp1' for trunc [1]. Cost table entries are updated based on the actual number of {xtn, uzp1} instructions generated.
2) Without this, an IR instruction like trunc <8 x i16> %v to <8 x i8> is considered free and might be sunk to other basic blocks. As a result, the sunk 'trunc' ends up in a different basic block from its (usually not-free) vector operand and misses the chance to be combined during instruction selection (examples in [2]).
3) It's a lot of effort to teach CodeGenPrepare.cpp to sink the operand of trunc without introducing regressions, since the instruction to compute the operand of trunc could be faster (e.g., throughput) than the instruction corresponding to "trunc (bin-vector-op)". For instance in [3], sinking %1 (as trunc operand) into bb.1 and bb.2 means replacing 2 xtn with 2 shrn (shrn has a throughput of 1 and only utilizes the V1 pipeline), which is not necessarily good, especially since the ushr result needs to be preserved for the store operation in bb.0. Meanwhile, it's too optimistic (for the CodeGenPrepare pass) to assume machine-cse will always be able to de-dup shrn from various basic blocks into one shrn.

[1] For {v8i16->v8i8, v4i32->v4i16, v2i64->v2i32}, https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L4472.
    For concat (trunc, trunc) -> uzp1, https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L5428-L5437
[2] examples
    - trunc(umin(X, 255)) -> UQXTN v8i8 (and other {u,s}x{min,max} pattern for v8i16 operands) from https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L4515-L4528
    - trunc (AArch64vlshr v8i16, imm) -> SHRNv8i8 (same missed for SHRNv2i32) from https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L6743-L6748
[3]
    ---
    ; instruction latency / throughput / pipeline on `neoverse-n1`
    bb.0:
      %1 = lshr <8 x i16> %10, <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4>   ; ushr, latency 2, throughput 1, pipeline V1
      %2 = trunc <8 x i16> %1 to <8 x i8>  ; xtn, latency 2, throughput 2, pipeline V
      store <8 x i16> %1, ptr %addr
      br cond i1 cond, label bb.1, label bb.2

    bb.1:
      %4 = trunc <8 x i16> %1 to <8 x i8> ; xtn

    bb.2:
      %5 = trunc <8 x i16> %1 to <8 x i8> ; xtn
    ---

Differential Revision: https://reviews.llvm.org/D132784

2 years ago[LiveIntervals] Split live intervals on any dead def
Daniil Fukalov [Mon, 25 Jul 2022 12:18:51 +0000 (15:18 +0300)]
[LiveIntervals] Split live intervals on any dead def

Dead defs of the same virtual register must be split into multiple
virtual registers with separate live intervals to avoid a MachineVerifier error.

Partially fixes https://github.com/llvm/llvm-project/issues/56050 and
https://github.com/llvm/llvm-project/issues/56051

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D130477

2 years ago[MinGW] Ignore -fvisibility/-fvisibility-inlines-hidden for dllexport
Fangrui Song [Fri, 2 Sep 2022 16:59:16 +0000 (09:59 -0700)]
[MinGW] Ignore -fvisibility/-fvisibility-inlines-hidden for dllexport

Similar to 123ce97fac78bc4519afd5d2aba17c59c5717aad for dllimport: dllexport
expresses a non-hidden visibility intention. We can consider it explicit and
therefore it should override the global visibility setting (see AST/Decl.cpp
"NamedDecl Implementation").

Adding the special case to CodeGenModule::setGlobalVisibility is somewhat weird,
but allows us to add the code in one place instead of many in AST/Decl.cpp.

Differential Revision: https://reviews.llvm.org/D133180

2 years agoMigrate "CheckUses" pass to the auto-generated constructor (NFC)
Mehdi Amini [Fri, 2 Sep 2022 16:04:00 +0000 (16:04 +0000)]
Migrate "CheckUses" pass to the auto-generated constructor (NFC)

See #57475

Differential Revision: https://reviews.llvm.org/D133215