platform/upstream/llvm.git
2 years ago[NFC] Remove duplicate code in SBTypeCategory
Jorge Gorbe Moya [Fri, 2 Sep 2022 19:59:39 +0000 (12:59 -0700)]
[NFC] Remove duplicate code in SBTypeCategory

TypeCategoryImpl has its own implementation of these, so it makes no
sense to have the same logic inlined in SBTypeCategory.

There are other methods in SBTypeCategory that are directly implemented
there, instead of delegating to TypeCategoryImpl (which IMO kinda
defeats the point of having an "opaque" member pointer in the SB type),
but they don't have equivalent implementations in TypeCategoryImpl, so
this patch only picks the low-hanging fruit for now.

2 years ago[Matrix] Simplify matmuls with scalars
Francis Visoiu Mistrih [Thu, 1 Sep 2022 19:06:59 +0000 (12:06 -0700)]
[Matrix] Simplify matmuls with scalars

If one of the operands is a transposed splat, the transpose can be
removed.

This is useful to simplify when transposes are distributed to operands
of a matmul:

* k^T -> k
* (A * k)^T -> A^T * k
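
The two identities above can be checked on small matrices. This is a minimal sketch in plain C++, not the LLVM matrix lowering itself: the `Matrix` alias and the `splat`, `transpose`, and `scale` helpers are hypothetical names, and multiplying by a splat is modelled here as scaling every element by `k`.

```cpp
#include <array>
#include <cstddef>

// Small fixed-size matrix type used only for this illustration.
template <std::size_t R, std::size_t C>
using Matrix = std::array<std::array<int, C>, R>;

// Build an R x C matrix whose elements all equal k (a "splat").
template <std::size_t R, std::size_t C>
Matrix<R, C> splat(int k) {
  Matrix<R, C> m{};
  for (auto &row : m)
    row.fill(k);
  return m;
}

// Return the transpose of m.
template <std::size_t R, std::size_t C>
Matrix<C, R> transpose(const Matrix<R, C> &m) {
  Matrix<C, R> t{};
  for (std::size_t i = 0; i < R; ++i)
    for (std::size_t j = 0; j < C; ++j)
      t[j][i] = m[i][j];
  return t;
}

// Multiply every element of m by the scalar k.
template <std::size_t R, std::size_t C>
Matrix<R, C> scale(const Matrix<R, C> &m, int k) {
  Matrix<R, C> out = m;
  for (auto &row : out)
    for (int &v : row)
      v *= k;
  return out;
}
```

Transposing a splat only changes its shape, never its element values, which is why the transpose can be dropped.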

Differential Revision: https://reviews.llvm.org/D130177

2 years ago[clang-tidy] Skip copy assignment operators with nonstandard return types
Alexander Shaposhnikov [Wed, 31 Aug 2022 09:13:21 +0000 (09:13 +0000)]
[clang-tidy] Skip copy assignment operators with nonstandard return types

Skip copy assignment operators with nonstandard return types
since they cannot be defaulted.
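
As an illustration of why such operators are skipped, here is a minimal sketch with two hypothetical types. Only the first has the conventional `T &` return type that `= default` requires.

```cpp
// Conventional signature: returns T&, so the operator may be defaulted.
struct Conventional {
  int value = 0;
  Conventional &operator=(const Conventional &) = default;
};

// Nonstandard return type: `= default` would be ill-formed here, so the
// body has to stay hand-written even though it only copies the member.
struct ReturnsVoid {
  int value = 0;
  void operator=(const ReturnsVoid &other) { value = other.value; }
};
```

Suggesting `= default` for `ReturnsVoid::operator=` would produce code that does not compile, so the check has nothing useful to offer there.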

Test plan: ninja check-clang-tools

Differential revision: https://reviews.llvm.org/D133006

2 years ago[analyzer] Add more information to the Exploded Graph
isuckatcs [Thu, 4 Aug 2022 17:49:05 +0000 (19:49 +0200)]
[analyzer] Add more information to the Exploded Graph

This patch dumps every state trait in the egraph. Empty state
traits are no longer dumped; instead, the egraph rewriter script
treats them as null, which solves backward compatibility issues.

Differential Revision: https://reviews.llvm.org/D131187

2 years ago[clang-tidy] Restrict use-equals-default to c++11-or-later
Alexander Shaposhnikov [Fri, 2 Sep 2022 22:19:06 +0000 (22:19 +0000)]
[clang-tidy] Restrict use-equals-default to c++11-or-later

Restrict use-equals-default to c++11-or-later.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D132998

2 years ago[CostModel][AArch64] Fix ctpop intrinsic cost when NEON is disabled.
Eli Friedman [Fri, 2 Sep 2022 22:17:55 +0000 (15:17 -0700)]
[CostModel][AArch64] Fix ctpop intrinsic cost when NEON is disabled.

If we don't have NEON, we use the generic fallback, which takes 12
instructions. Make sure the costs reflect that.

(On a related note, we could optimize the generic fallback a bit. It
currently uses sequences like lsr+and+add; if we use and+lsr+add
instead, we can fold the lsr into the add.)
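
For readers unfamiliar with this kind of fallback, the sketch below shows the classic branch-free population count in C++. It is an assumption that this resembles the generic expansion; the exact instruction sequence LLVM emits may differ. The function name `popcount64` is hypothetical.

```cpp
#include <cstdint>

// Classic SWAR population count: sum bits in 2-, 4-, then 8-bit groups,
// then add all byte counts together with one multiply.
inline std::uint64_t popcount64(std::uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  // Shift, then mask, then add: the lsr+and+add shape mentioned above.
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return (x * 0x0101010101010101ULL) >> 56;
}
```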

Differential Revision: https://reviews.llvm.org/D133154

2 years ago[clang-format] Fix annotating when deleting array of pointers
jackh [Tue, 30 Aug 2022 05:56:45 +0000 (13:56 +0800)]
[clang-format] Fix annotating when deleting array of pointers

Fixes https://github.com/llvm/llvm-project/issues/57418

The token `*` below should be annotated as `UnaryOperator`.

```
delete[] *ptr;
```

Reviewed By: owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D132911

2 years ago[mlir][spirv] Convert some 0-D vector extract/insertelement ops
Lei Zhang [Fri, 2 Sep 2022 21:47:31 +0000 (17:47 -0400)]
[mlir][spirv] Convert some 0-D vector extract/insertelement ops

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133183

2 years agoRevert "[MLIR] Remove unused config attributes from lit.site.cfg.py"
Mitch Phillips [Fri, 2 Sep 2022 21:39:05 +0000 (14:39 -0700)]
Revert "[MLIR] Remove unused config attributes from lit.site.cfg.py"

This reverts commit 0816b629c9da5aa8885c4cb3fbbf5c905d37f0ee.

Reason: Broke the sanitizer buildbots. More information available in the
original phabricator review: https://reviews.llvm.org/D132726

2 years ago [mlir][NVGPU] Adding Support for cp_async_zfill via Inline Asm
Manish Gupta [Fri, 2 Sep 2022 21:20:11 +0000 (21:20 +0000)]
 [mlir][NVGPU] Adding Support for cp_async_zfill via Inline Asm

`cp_async_zfill` is currently not present in the nvvm backend. This patch adds `cp_async_zfill` support by emitting inline asm when lowering from `nvgpu` to `nvvm`.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D132269

2 years ago[mlir][spirv] Support more max/min vector.reduction
Lei Zhang [Fri, 2 Sep 2022 21:21:57 +0000 (17:21 -0400)]
[mlir][spirv] Support more max/min vector.reduction

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133168

2 years ago[mlir][spirv] Add some folders for spv.CompositeExtract
Lei Zhang [Fri, 2 Sep 2022 21:17:45 +0000 (17:17 -0400)]
[mlir][spirv] Add some folders for spv.CompositeExtract

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133167

2 years ago[libc][NFC] clang-format
Alex Brachet [Fri, 2 Sep 2022 21:17:07 +0000 (21:17 +0000)]
[libc][NFC] clang-format

2 years ago[mlir][spirv] Add support for converting gpu.shuffle xor
Lei Zhang [Fri, 2 Sep 2022 21:14:53 +0000 (17:14 -0400)]
[mlir][spirv] Add support for converting gpu.shuffle xor

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133054

2 years ago[mlir][spirv] Define various spv.GroupNonUniformShuffle ops
Lei Zhang [Fri, 2 Sep 2022 21:06:52 +0000 (17:06 -0400)]
[mlir][spirv] Define various spv.GroupNonUniformShuffle ops

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D133041

2 years ago[mlir][spirv] Fix MaxVersion for ops after supporting v1.6
Lei Zhang [Fri, 2 Sep 2022 21:02:00 +0000 (17:02 -0400)]
[mlir][spirv] Fix MaxVersion for ops after supporting v1.6

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D133234

2 years ago[mlir][openacc][NFC] Fix typo
Valentin Clement [Fri, 2 Sep 2022 21:00:59 +0000 (23:00 +0200)]
[mlir][openacc][NFC] Fix typo

2 years agouse LLVM_USE_STATIC_ZSTD
Cole [Fri, 2 Sep 2022 21:00:07 +0000 (21:00 +0000)]
use LLVM_USE_STATIC_ZSTD

Removes LLVM_PREFER_STATIC_ZSTD in favor of LLVM_USE_STATIC_ZSTD.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D133222

2 years ago[TEST][msan] Reformat RUN lines
Vitaly Buka [Fri, 2 Sep 2022 20:24:56 +0000 (13:24 -0700)]
[TEST][msan] Reformat RUN lines

2 years ago[HLSL] Generate buffer subscript operators
Chris Bieneman [Fri, 2 Sep 2022 19:32:24 +0000 (14:32 -0500)]
[HLSL] Generate buffer subscript operators

In HLSL, buffer types support array subscripting syntax for loads and
stores. This change fleshes out the subscript operators to become array
accesses on the underlying handle pointer. This will allow LLVM
optimization passes to optimize resource accesses the same way any other
memory access would be optimized.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131268

2 years ago[gn build] Port 30adaa730c47
LLVM GN Syncbot [Fri, 2 Sep 2022 19:47:21 +0000 (19:47 +0000)]
[gn build] Port 30adaa730c47

2 years ago[libc++][NFC] Copy the whole union instead of a member; also remove __zero()
Nikolas Klauser [Tue, 30 Aug 2022 15:43:14 +0000 (17:43 +0200)]
[libc++][NFC] Copy the whole union instead of a member; also remove __zero()

This doesn't affect code-gen

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D132951

2 years ago[libc++] Granularize the rest of memory
Nikolas Klauser [Fri, 2 Sep 2022 14:24:11 +0000 (16:24 +0200)]
[libc++] Granularize the rest of memory

Reviewed By: ldionne, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D132790

2 years ago[Driver] Remove cc1 Separate form -fvisibility
Fangrui Song [Fri, 2 Sep 2022 19:40:00 +0000 (12:40 -0700)]
[Driver] Remove cc1 Separate form -fvisibility

2 years ago[libc++] Remove noexcept specifier from operator""s
Nikolas Klauser [Fri, 2 Sep 2022 14:20:28 +0000 (16:20 +0200)]
[libc++] Remove noexcept specifier from operator""s

For some reason `operator""s(const char8_t*, size_t)` was marked `noexcept`. Remove it and add regression tests.

Reviewed By: ldionne, huixie90, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D132340

2 years ago[test] Change cc1 -fvisibility to -fvisibility=
Fangrui Song [Fri, 2 Sep 2022 19:36:44 +0000 (12:36 -0700)]
[test] Change cc1 -fvisibility to -fvisibility=

2 years ago[libc++] Make the naming of private member variables consistent and enforce it throug...
Nikolas Klauser [Fri, 2 Sep 2022 14:19:07 +0000 (16:19 +0200)]
[libc++] Make the naming of private member variables consistent and enforce it through readability-identifier-naming

Reviewed By: ldionne, #libc

Spies: aheejin, sstefan1, libcxx-commits

Differential Revision: https://reviews.llvm.org/D129386

2 years ago[libc++] Enable [[nodiscard]] extensions by default
Nikolas Klauser [Thu, 1 Sep 2022 10:02:58 +0000 (12:02 +0200)]
[libc++] Enable [[nodiscard]] extensions by default

Adding `[[nodiscard]]` to functions is a conforming extension and done extensively in the MSVC STL.

Reviewed By: ldionne, EricWF, #libc

Spies: #libc_vendors, cjdb, mgrang, jloser, libcxx-commits

Differential Revision: https://reviews.llvm.org/D128267

2 years ago[mlir][cmake] Don't add dependencies on mlir-(generic-)headers
Jeff Niu [Thu, 1 Sep 2022 18:06:46 +0000 (11:06 -0700)]
[mlir][cmake] Don't add dependencies on mlir-(generic-)headers

Every dialect was dependent on `mlir-headers`, which was causing the
build of any single MLIR dialect to pull in a bunch of extra
dependencies that aren't needed. Now, MLIR dialects will need to
explicitly depend on `MLIR*IncGen` targets to pull in any needed
headers.

This does not impact the actual `mlir-headers` target.

Consider the "simple" Arithmetic dialect. Before:

```
% ninja MLIRArithmeticDialect
[151/812] Building CXX object lib/TableGen/CMakeFiles/LLVMTableGen.dir/JSONBackend.cpp.o
```

After:

```
% ninja MLIRArithmeticDialect
[207/374] Building CXX object tools/mlir/lib/TableGen/CMakeFiles/MLIRTableGen.dir/GenInfo.cpp.o
```

(Both clean builds)

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D133132

2 years ago[mlir][cf-to-llvm] Fix error message
Jeff Niu [Fri, 2 Sep 2022 19:13:07 +0000 (12:13 -0700)]
[mlir][cf-to-llvm] Fix error message

2 years ago[libc][NFC] Use no_sanitize("all")
Alex Brachet [Fri, 2 Sep 2022 19:07:32 +0000 (19:07 +0000)]
[libc][NFC] Use no_sanitize("all")

This function cannot have any instrumentation because its
assembly must match exactly what the debugger is expecting.

Previously the attribute listed the sanitizers we expected
libc to be built with, but that is untenable.
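
A minimal sketch of the attribute in use follows. The macro `NO_SANITIZE_ALL` and the function `debugger_hook` are hypothetical names, not the ones in libc, and the attribute is guarded so the snippet still compiles on compilers other than Clang.

```cpp
// Clang accepts no_sanitize("all"); other compilers get an empty macro.
#if defined(__clang__)
#define NO_SANITIZE_ALL __attribute__((no_sanitize("all")))
#else
#define NO_SANITIZE_ALL
#endif

// Hypothetical function whose generated code must stay free of
// sanitizer instrumentation.
NO_SANITIZE_ALL int debugger_hook(int x) { return x + 1; }
```

Using `"all"` avoids having to extend a per-sanitizer list every time libc is built with a new sanitizer.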

2 years ago[NFC] Make MultiplexExternalSemaSource own sources
Chris Bieneman [Thu, 1 Sep 2022 21:31:23 +0000 (16:31 -0500)]
[NFC] Make MultiplexExternalSemaSource own sources

This change refactors the MultiplexExternalSemaSource to take ownership
of the underlying sources. As a result it makes a larger cleanup of
external source ownership in Sema and the ChainedIncludesSource.

Reviewed By: aaron.ballman, aprantl

Differential Revision: https://reviews.llvm.org/D133158

2 years ago[clang] Change cc1 -fvisibility's canonical spelling to -fvisibility=
Fangrui Song [Fri, 2 Sep 2022 18:49:38 +0000 (11:49 -0700)]
[clang] Change cc1 -fvisibility's canonical spelling to -fvisibility=

2 years ago[flang][docs] Add lowering design doc for parameterized derived-type
Valentin Clement [Fri, 2 Sep 2022 18:45:30 +0000 (20:45 +0200)]
[flang][docs] Add lowering design doc for parameterized derived-type

This document aims to give insight into the representation of parameterized
derived-type (PDTs) in FIR and how PDTs are lowered to FIR and interact
with the runtime.

Reviewed By: jeanPerier, klausler

Differential Revision: https://reviews.llvm.org/D133096

2 years ago[flang] Use APInt to lower 128 bits integer constants
Valentin Clement [Fri, 2 Sep 2022 18:44:44 +0000 (20:44 +0200)]
[flang] Use APInt to lower 128 bits integer constants

Lowering was truncating 128-bit integers to 64 bits. This
patch makes use of APInt to lower 128-bit integers correctly.

```
program bug
  print *, 170141183460469231731687303715884105727_16
end

! Before patch: 18446744073709551615
! With patch: 170141183460469231731687303715884105727
```
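
The "before" value is what remains when only the low 64 bits of the constant are kept. The sketch below models that truncation in C++. It does not use APInt or the flang lowering code, the helper names are hypothetical, and `unsigned __int128` is a GCC/Clang extension rather than standard C++.

```cpp
#include <cstdint>

// unsigned __int128 is a GCC/Clang extension, not standard C++.
using u128 = unsigned __int128;

// Largest signed 128-bit value, 2^127 - 1, the constant in the example.
inline u128 maxInt128() { return (static_cast<u128>(1) << 127) - 1; }

// Keep only the low 64 bits, modelling the truncation described above.
inline std::uint64_t truncateTo64(u128 v) {
  return static_cast<std::uint64_t>(v);
}

// The high 64 bits that the truncation throws away.
inline std::uint64_t high64(u128 v) {
  return static_cast<std::uint64_t>(v >> 64);
}
```

The low half of 2^127 - 1 is all ones, which is exactly 18446744073709551615.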

Reviewed By: vdonaldson

Differential Revision: https://reviews.llvm.org/D133206

2 years ago[flang] Make sure allocatable components are initialized for temp derived-type
Valentin Clement [Fri, 2 Sep 2022 18:43:18 +0000 (20:43 +0200)]
[flang] Make sure allocatable components are initialized for temp derived-type

Runtime functions expect clean unallocated state for descriptor. This
patch adds a call to the runtime function to initialize the temporary
derived-type created.

Reviewed By: vdonaldson

Differential Revision: https://reviews.llvm.org/D133189

2 years ago[HLSL] Restrict to supported targets
Chris Bieneman [Fri, 2 Sep 2022 15:45:53 +0000 (10:45 -0500)]
[HLSL] Restrict to supported targets

Someday we would like to support HLSL on a wider range of targets, but
today targeting anything other than `dxil` is likely to cause lots of
headaches. This adds an error and tests to validate that the expected
target is `dxil-?-shadermodel`.

We will continue to do a best effort to ensure the code we write makes
it easy to support other targets (like SPIR-V), but this error will
prevent users from hitting frustrating errors for unsupported cases.

Reviewed By: jcranmer-intel, Anastasia

Differential Revision: https://reviews.llvm.org/D132056

2 years ago[mlir][sparse] extend codegen test cases with an additional step after storage expansion
Peiming Liu [Fri, 2 Sep 2022 17:31:54 +0000 (17:31 +0000)]
[mlir][sparse] extend codegen test cases with an additional step after storage expansion

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133219

2 years ago[mlir][sparse] add conversion rules for storage_get/set/callOp
Peiming Liu [Thu, 1 Sep 2022 20:34:05 +0000 (20:34 +0000)]
[mlir][sparse] add conversion rules for storage_get/set/callOp

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133175

2 years ago[Attributor] Simplify offset calculation for a constant GEP
Sameer Sahasrabuddhe [Fri, 2 Sep 2022 18:23:51 +0000 (23:53 +0530)]
[Attributor] Simplify offset calculation for a constant GEP

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D132931

2 years ago[mlir][ods] Add ArrayOfAttr for creating custom array attributes
Jeff Niu [Thu, 1 Sep 2022 16:30:54 +0000 (09:30 -0700)]
[mlir][ods] Add ArrayOfAttr for creating custom array attributes

`ArrayOfAttr` can be used to easily create an attribute that just
contains an array of something. The elements can be other attributes,
in which case the custom parsers and printers are invoked directly for
nice syntax, or any C++ type that supports parsing and printing, either
through custom `printer` and `parser` methods or `FieldParser`.

An array of integers:

```
def ArrayOfInts : ArrayOfAttr<Test_Dialect, "ArrayOfInts", "array_of_ints",
                              "int32_t">;
```

When embedded in an op's assembly format, it will look like

```
foo.ints value = [1, 2, 3]
```

An array of enums, when embedded in an op's assembly format, will look
like:

```
foo.enums value = [first, second, last]
```

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D133131

2 years ago[Verifier] Skip debug location check for some non-inlinable functions
Yuanfang Chen [Fri, 2 Sep 2022 17:40:37 +0000 (10:40 -0700)]
[Verifier] Skip debug location check for some non-inlinable functions

If a callee function is not interposable, skip the debug location check for its call sites. This is instrumentation-friendly; otherwise, under some conditions, the check triggers for some un-inlinable call sites.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D133060

2 years ago[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creat...
Arthur Eubanks [Wed, 24 Aug 2022 18:18:52 +0000 (11:18 -0700)]
[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests

The current code is basically just emulating what the analysis manager does.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D132581

2 years ago[docs] -fvisibility= allows protected and internal
Fangrui Song [Fri, 2 Sep 2022 17:49:10 +0000 (10:49 -0700)]
[docs] -fvisibility= allows protected and internal

2 years ago[test] Remove problematic thread from MainLoopTest to fix flakiness
Jordan Rupprecht [Fri, 2 Sep 2022 17:21:23 +0000 (10:21 -0700)]
[test] Remove problematic thread from MainLoopTest to fix flakiness

This test, specifically `TwoSignalCallbacks`, can be a little bit flaky, failing in around 5/2000 runs.

POSIX says:

> If the value of pid causes sig to be generated for the sending process, and if sig is not blocked for the calling thread and if no other thread has sig unblocked or is waiting in a sigwait() function for sig, either sig or at least one pending unblocked signal shall be delivered to the sending thread before kill() returns.

The problem is that in test setup, we create a new thread with `std::async` and that is occasionally not cleaned up. This leaves that thread available to eat the signal we're polling for.

The need for this to be async does not apply anymore, so we can just make it synchronous.

With this change, the test passes in 10000 runs.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D133181

2 years ago[docs] Regenerate clang/docs/ClangCommandLineReference.rst
Fangrui Song [Fri, 2 Sep 2022 17:29:37 +0000 (10:29 -0700)]
[docs] Regenerate clang/docs/ClangCommandLineReference.rst

2 years ago[mlir][Tensor] Add rewrites to extract slices through `tensor.collapse_shape`
Christopher Bate [Tue, 16 Aug 2022 23:27:06 +0000 (17:27 -0600)]
[mlir][Tensor] Add rewrites to extract slices through `tensor.collapse_shape`

This change adds a set of utilities to replace the result of a
`tensor.collapse_shape -> tensor.extract_slice` chain with the
equivalent result formed by aggregating slices of the
`tensor.collapse_shape` source. In general, it is not possible to
commute `extract_slice` and `collapse_shape` if linearized dimensions
are sliced. The i-th dimension of the `tensor.collapse_shape`
result is a "linearized sliced dimension" if:

1) The reassociation indices of `tensor.collapse_shape` in the i-th position
   have size greater than 1 (multiple dimensions of the input are collapsed).
2) The i-th dimension is sliced by `tensor.extract_slice`.

We can work around this by stitching together the result of
`tensor.extract_slice` by iterating over any linearized sliced dimensions.
This is equivalent to "tiling" the linearized-and-sliced dimensions of
the `tensor.collapse_shape` operation in order to manifest the result
tile (the result of the `tensor.extract_slice`). The user of the
utilities must provide the mechanism to create the tiling (e.g. a loop).
In the tests, it is demonstrated how to apply the utilities using either
`scf.for` or `scf.foreach_thread`.

The below example illustrates the pattern using `scf.for`:

```
%0 = linalg.generic ... -> tensor<3x7x11x10xf32>
%1 = tensor.collapse_shape %0 [[0, 1, 2], [3]] : ... to tensor<231x10xf32>
%2 = tensor.extract_slice %1 [13, 0] [10, 10] [2, 1] : .... tensor<10x10xf32>
```

We can construct %2 by generating the following IR:

```
%dest = linalg.init_tensor() : tensor<10x10xf32>
%2 = scf.for %iv = %c0 to %c10 step %c1 iter_args(%arg0) -> tensor<10x10xf32> {
   // Step 1: Map this output idx (%iv) to a multi-index for the input (%3):
   %linear_index = affine.apply affine_map<(d0)[]->(d0*2 + 13)>(%iv)
   %3:3 = arith.delinearize_index %linear_index into (3, 7, 11)
   // Step 2: Extract the slice from the input
   %4 = tensor.extract_slice %0 [%3#0, %3#1, %3#2, 0] [1, 1, 1, 10] [1, 1, 1, 1] :
         tensor<3x7x11x10xf32> to tensor<1x1x1x10xf32>
   %5 = tensor.collapse_shape %4 [[0, 1, 2], [3]] :
         tensor<1x1x1x10xf32> into tensor<1x10xf32>
   // Step 3: Insert the slice into the destination
   %6 = tensor.insert_slice %5 into %arg0 [%iv, 0] [1, 10] [1, 1] :
         tensor<1x10xf32> into tensor<10x10xf32>
   scf.yield %6 : tensor<10x10xf32>
}
```
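
The index arithmetic of Step 1 can be sketched in plain C++. This is an illustration under assumptions, not the utility added by the patch: the helper names are hypothetical, the offset 13 and stride 2 are taken from the `tensor.extract_slice` in the example, and the basis (3, 7, 11) is the shape of the collapsed dimensions.

```cpp
#include <array>
#include <cstdint>

// Linear index into the collapsed dimension for output row iv, using the
// offset (13) and stride (2) of the tensor.extract_slice in the example.
inline std::int64_t linearIndex(std::int64_t iv) { return iv * 2 + 13; }

// Split a linear index into a row-major multi-index over basis (3, 7, 11).
inline std::array<std::int64_t, 3> delinearize(std::int64_t linear) {
  const std::array<std::int64_t, 3> basis = {3, 7, 11};
  std::array<std::int64_t, 3> idx{};
  // Peel off the fastest-varying (last) dimension first.
  for (int d = 2; d >= 0; --d) {
    idx[d] = linear % basis[d];
    linear /= basis[d];
  }
  return idx;
}
```

Each loop iteration in the IR computes one such multi-index and uses it as the offsets of a rank-preserving `tensor.extract_slice` on the uncollapsed source.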

The pattern was discussed in the RFC here: https://discourse.llvm.org/t/rfc-tensor-extracting-slices-from-tensor-collapse-shape/64034

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D129699

2 years ago[AArch64][TTI] Add cost table entry for trunc over vector of integers.
Mingming Liu [Tue, 30 Aug 2022 05:54:41 +0000 (22:54 -0700)]
[AArch64][TTI] Add cost table entry for trunc over vector of integers.

1) Tablegen patterns exist to use 'xtn' and 'uzp1' for trunc [1]. Cost table entries are updated based on the actual number of {xtn, uzp1} instructions generated.
2) Without this, an IR instruction like trunc <8 x i16> %v to <8 x i8> is considered free and might be sunk to other basic blocks. As a result, the sunk 'trunc' is in a different basic block from its (usually not-free) vector operand and misses the chance to be combined during instruction selection. (examples in [2])
3) It's a lot of effort to teach CodeGenPrepare.cpp to sink the operand of trunc without introducing regressions, since the instruction to compute the operand of trunc could be faster (e.g., throughput) than the instruction corresponding to "trunc (bin-vector-op)". For instance in [3], sinking %1 (as trunc operand) into bb.1 and bb.2 means replacing 2 xtn with 2 shrn (shrn has a throughput of 1 and only utilizes the V1 pipeline), which is not necessarily good, especially since the ushr result needs to be preserved for the store operation in bb.0. Meanwhile, it's too optimistic (for the CodeGenPrepare pass) to assume machine-cse will always be able to de-dup shrn from various basic blocks into one shrn.

[1] For {v8i16->v8i8, v4i32->v4i16, v2i64->v2i32}, https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L4472.
    For concat (trunc, trunc) -> uzp1, https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L5428-L5437
[2] examples
    - trunc(umin(X, 255)) -> UQXTRN v8i8 (and other {u,s}x{min,max} pattern for v8i16 operands) from https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L4515-L4528
    - trunc (AArch64vlshr v8i16, imm) -> SHRNv8i8 (same missed for SHRNv2i32) from https://github.com/llvm/llvm-project/blob/813ae2871d71f32cce46768e63185cd64651f6e9/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L6743-L6748
[3]
    ---
    ; instruction latency / throughput / pipeline on `neoverse-n1`
    bb.0:
      %1 = lshr <8 x i16> %10, <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4>   ; ushr, latency 2, throughput 1, pipeline V1
      %2 = trunc <8 x i16> %1 to <8 x i8>  ; xtn, latency 2, throughput 2, pipeline V
      %3 = store <8 x i8> %1, ptr %addr
      br cond i1 cond, label bb.1, label bb.2

    bb.1:
      %4 = trunc <8 x i16> %1 to <8 x i8> ; xtn

    bb.2:
      %5 = trunc <8 x i16> %1 to <8 x i8> ; xtn
    ---

Differential Revision: https://reviews.llvm.org/D132784

2 years ago[LiveIntervals] Split live intervals on any dead def
Daniil Fukalov [Mon, 25 Jul 2022 12:18:51 +0000 (15:18 +0300)]
[LiveIntervals] Split live intervals on any dead def

Each dead def of the same virtual register is required to be split into multiple
virtual registers with separate live intervals to avoid MachineVerifier error.

Partially fixes https://github.com/llvm/llvm-project/issues/56050 and
https://github.com/llvm/llvm-project/issues/56051

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D130477

2 years ago[MinGW] Ignore -fvisibility/-fvisibility-inlines-hidden for dllexport
Fangrui Song [Fri, 2 Sep 2022 16:59:16 +0000 (09:59 -0700)]
[MinGW] Ignore -fvisibility/-fvisibility-inlines-hidden for dllexport

Similar to 123ce97fac78bc4519afd5d2aba17c59c5717aad for dllimport: dllexport
expresses a non-hidden visibility intention. We can consider it explicit and
therefore it should override the global visibility setting (see AST/Decl.cpp
"NamedDecl Implementation").

Adding the special case to CodeGenModule::setGlobalVisibility is somewhat weird,
but allows us to add the code in one place instead of many in AST/Decl.cpp.

Differential Revision: https://reviews.llvm.org/D133180

2 years agoMigrate "CheckUses" pass to the auto-generated constructor (NFC)
Mehdi Amini [Fri, 2 Sep 2022 16:04:00 +0000 (16:04 +0000)]
Migrate "CheckUses" pass to the auto-generated constructor (NFC)

See #57475

Differential Revision: https://reviews.llvm.org/D133215

2 years ago[Driver] Unsupport --print-multiarch
Fangrui Song [Fri, 2 Sep 2022 16:51:02 +0000 (09:51 -0700)]
[Driver] Unsupport --print-multiarch

* If GCC is configured with `--disable-multi-arch`, `--print-multiarch` output is an empty line.
* If GCC is configured with `--enable-multi-arch`, `--print-multiarch` output may be a normalized triple or (on Debian, 'vendor' is omitted) `x86_64-linux-gnu`.

The Clang support D101400 just prints the Debian multiarch style triple
unconditionally, but the string is not really expected for non-Debian systems.

AIUI many Linux distributions and non-Linux OSes don't configure GCC with `--enable-multi-arch`.
Instead of taking on the trouble of supporting all kinds of variants, drop the support, restoring the behavior from before D101400.

Close https://github.com/llvm/llvm-project/issues/51469

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D133170

2 years ago[bazel] fix libc build
Mikhail Goncharov [Fri, 2 Sep 2022 16:32:27 +0000 (18:32 +0200)]
[bazel] fix libc build

For a4d48e3b0b66cacb8c42be8421608f7efd170c24

2 years agoRevert rG11765b77be84d793ebedc5b5436c463490746131 "[CostModel][X86] Add CostKinds...
Simon Pilgrim [Fri, 2 Sep 2022 16:21:25 +0000 (17:21 +0100)]
Revert rG11765b77be84d793ebedc5b5436c463490746131 "[CostModel][X86] Add CostKinds handling for fmul ops"

I need to address some x87 codegen changes before re-committing this.

2 years ago[CostModel][X86] Add CostKinds handling for fmul ops
Simon Pilgrim [Fri, 2 Sep 2022 15:55:12 +0000 (16:55 +0100)]
[CostModel][X86] Add CostKinds handling for fmul ops

This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'

2 years ago[lldb] From unordered_map synthetic provider, return std::pair children
Dave Lee [Fri, 14 Jan 2022 05:17:02 +0000 (21:17 -0800)]
[lldb] From unordered_map synthetic provider, return std::pair children

Change the behavior of the libc++ `unordered_map` synthetic provider to present
children as `std::pair` values, just like `std::map` does.

The synthetic provider for libc++ `std::unordered_map` has returned children
that expose a level of internal structure (over top of the key/value pair). For
example, given an unordered map initialized with `{{1,2}, {3, 4}}`, the output
is:

```
(std::unordered_map<int, int, std::hash<int>, std::equal_to<int>, std::allocator<std::pair<const int, int> > >) map = size=2 {
  [0] = {
    __cc = (first = 3, second = 4)
  }
  [1] = {
    __cc = (first = 1, second = 2)
  }
}
```

It's not ideal/necessary to have the numbered children embedded in the `__cc`
field.

Note: the numbered children have type
`std::__hash_node<std::__hash_value_type<Key, T>, void *>::__node_value_type`,
and the `__cc` fields have type `std::__hash_value_type<Key, T>::value_type`.

Compare this output to `std::map`:

```
(std::map<int, int, std::less<int>, std::allocator<std::pair<const int, int> > >) map = size=2 {
  [0] = (first = 1, second = 2)
  [1] = (first = 3, second = 4)
}
```

Where the numbered children have type `std::pair<const Key, T>`.

This changes the behavior of the synthetic provider for `unordered_map` to also
present children as `pairs`, just like `std::map`.

It appears the synthetic provider implementation for `unordered_map` was meant
to provide this behavior, but was maybe incomplete (see
d22a94377f7554a7e9df050f6dfc3ee42384e3fe). It has both an `m_node_type` and an
`m_element_type`, but uses only the former. The latter is exactly the type
needed for the children pairs. With this existing code, it's not much of a
change to make this work.

Differential Revision: https://reviews.llvm.org/D117383

2 years agoRevert "[debuginfo-tests] Un-XFAIL no passing unused-merged-value.c test"
Adrian Prantl [Fri, 2 Sep 2022 15:46:38 +0000 (08:46 -0700)]
Revert "[debuginfo-tests] Un-XFAIL no passing unused-merged-value.c test"

This reverts commit 96f00f63b2d65ebe759ae1746c30115e73cbd4f2 because D128830 has been reverted.

2 years ago[CostModel][X86] Add CostKinds to SSE42 fadd/fsub/fneg ops
Simon Pilgrim [Fri, 2 Sep 2022 15:04:39 +0000 (16:04 +0100)]
[CostModel][X86] Add CostKinds to SSE42 fadd/fsub/fneg ops

These were missed in an earlier commit. The latency/codesize/size-latency numbers aren't different from the SSE2 values that the lookup was falling through to, hence no test change, but it did mean we were wasting a lookup.

2 years ago[AggressiveInstCombine] Lower Table Based CTTZ
Djordje Todorovic [Thu, 1 Sep 2022 07:21:34 +0000 (09:21 +0200)]
[AggressiveInstCombine] Lower Table Based CTTZ

This patch introduces recognition of table-based ctz implementation
during the AggressiveInstCombine.

This fixes [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=46434
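
For context, a table-based ctz typically looks like the sketch below, which uses the well-known de Bruijn multiplication for 32-bit values. This is one common form and an assumption about the kind of source the patch targets; the exact set of variants it recognizes is not stated here. The function name `cttzTable` is hypothetical.

```cpp
#include <cstdint>

// Table-based count-trailing-zeros using a de Bruijn multiplication.
// Precondition: x != 0 (for x == 0 this returns table entry 0).
inline unsigned cttzTable(std::uint32_t x) {
  static const unsigned char table[32] = {
      0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
      31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9};
  // x & (0 - x) isolates the lowest set bit; the multiply and shift map
  // each of the 32 possible single-bit values to a distinct table slot.
  return table[((x & (0u - x)) * 0x077CB531u) >> 27];
}
```

Recognizing this shape lets the compiler replace the multiply, shift, and load with a single cttz operation.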

Differential Revision: https://reviews.llvm.org/D113291

2 years ago[lsan][darwin] Unmask camouflaged class_rw_t pointers
Leonard Grey [Thu, 1 Sep 2022 14:58:13 +0000 (10:58 -0400)]
[lsan][darwin] Unmask camouflaged class_rw_t pointers

Detailed motivation here: https://docs.google.com/document/d/1xUNo5ovPKJMYxitiHUQVRxGI3iUmspI51Jm4w8puMwo

check-asan (with LSAN enabled) and check-lsan are currently broken on recent macOS versions, due to pervasive false positives. Whenever the Objective-C runtime realizes a class, it allocates data for it, then stores that data with flags in the low bits. This means LSAN can not recognize it as a pointer while scanning.

This change checks every potential pointer on Apple platforms, and if the high bit is set, attempts to extract a pointer by masking out the high bit and flags. This is ugly, but it's also the best approach I could think of (see doc above); very open to other suggestions.
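
The masking step can be sketched as follows. The bit layout is purely an assumption for illustration (bit 63 as the marker, three low flag bits); the real Objective-C runtime layout and the masks used by the patch may differ, and all names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical layout, for illustration only: bit 63 marks a camouflaged
// value and the low 3 bits hold flags. The real runtime masks may differ.
constexpr std::uint64_t kHighBit = 1ULL << 63;
constexpr std::uint64_t kFlagMask = 0x7ULL;

// Return the word to treat as a potential pointer during the leak scan.
inline std::uint64_t maybeUnmask(std::uint64_t word) {
  if ((word & kHighBit) == 0)
    return word; // Not camouflaged: scan the value as-is.
  return word & ~(kHighBit | kFlagMask);
}
```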

Differential Revision: https://reviews.llvm.org/D133126

2 years ago[VP] Correct LEGALPOS for more VP nodes.
Craig Topper [Fri, 2 Sep 2022 15:04:27 +0000 (08:04 -0700)]
[VP] Correct LEGALPOS for more VP nodes.

LEGALPOS appears to only be used by LegalizeVectorOps. It needs
to point at a vector operand. Stores need to point at the second
operand since the result and the first operand are MVT::Other.
Reductions need to point at the second operand since the result
and the first operand are scalars.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D133048

2 years ago[AMDGPU][NFC] Fix typo in comment: replace SiMemOpInfo by SIMemOpInfo
Juan Manuel MARTINEZ CAAMAÑO [Fri, 2 Sep 2022 12:26:38 +0000 (14:26 +0200)]
[AMDGPU][NFC] Fix typo in comment: replace SiMemOpInfo by SIMemOpInfo

2 years ago[libc] Add Buildbot to External Links
Jeff Bailey [Fri, 2 Sep 2022 06:44:08 +0000 (06:44 +0000)]
[libc] Add Buildbot to External Links

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D133186

2 years agoExpose QualType::getNonReferenceType in libclang
Luca Di Sera [Fri, 2 Sep 2022 13:54:10 +0000 (09:54 -0400)]
Expose QualType::getNonReferenceType in libclang

The method is now wrapped by clang_getNonReferenceType.

A declaration for clang_getNonReferenceType was added to clang-c/Index.h
to expose it to users of the library.

An implementation for clang_getNonReferenceType was introduced in
CXType.cpp, wrapping the equivalent method of the underlying QualType of
a CXType.

An export symbol for the new function was added to libclang.map under
the LLVM_16 version entry.

A test was added to LibclangTest.cpp that tests the removal of
ref-qualifiers for some CXTypes.

The release notes for the clang project were updated to include a
note about the new addition under the "libclang" section.

Differential Revision: https://reviews.llvm.org/D133195

2 years agoApply clang-tidy fixes for llvm-qualified-auto in DecomposeLinalgOps.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 10:48:50 +0000 (10:48 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in DecomposeLinalgOps.cpp (NFC)

2 years agoApply clang-tidy fixes for llvm-qualified-auto in ConstantFold.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 10:48:06 +0000 (10:48 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in ConstantFold.cpp (NFC)

2 years ago[llvm][Support] Add DenseMapInfo for std::variant
Kadir Cetinkaya [Fri, 2 Sep 2022 12:35:40 +0000 (14:35 +0200)]
[llvm][Support] Add DenseMapInfo for std::variant

Differential Revision: https://reviews.llvm.org/D133200

2 years ago[InstCombine] Baseline tests for reducing test-for-overflow of shifted value
Tian Zhou [Fri, 2 Sep 2022 12:59:23 +0000 (08:59 -0400)]
[InstCombine] Baseline tests for reducing test-for-overflow of shifted value

Baseline tests for this patch https://reviews.llvm.org/D132888

Differential Revision: https://reviews.llvm.org/D133182

2 years ago[mlir][SCF] foreach_thread: Capture shared output tensors explicitly
Matthias Springer [Fri, 2 Sep 2022 12:48:35 +0000 (14:48 +0200)]
[mlir][SCF] foreach_thread: Capture shared output tensors explicitly

This change refines the semantics of scf.foreach_thread. Tensors that the terminator inserts into must now be passed to the region explicitly via `shared_outs`. Inside the body of the op, those tensors are then accessed via block arguments.

The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments.

As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again.

Differential Revision: https://reviews.llvm.org/D133114

2 years ago[mlir][bufferize] Add isRepetitiveRegion to BufferizableOpInterface
Matthias Springer [Fri, 2 Sep 2022 12:32:04 +0000 (14:32 +0200)]
[mlir][bufferize] Add isRepetitiveRegion to BufferizableOpInterface

This method allows declaring regions as "repetitive" even if the parent op does not implement the RegionBranchOpInterface.

This is needed to support loop-like ops that have parallel semantics but do not branch between regions.

Differential Revision: https://reviews.llvm.org/D133113

2 years agoUpdate the docs about IRC
Aaron Ballman [Fri, 2 Sep 2022 12:41:49 +0000 (08:41 -0400)]
Update the docs about IRC

We haven't had a Geordi bot in years and we moved the build bot to
another channel. This updates the documentation to be more reflective
of reality, and it also identifies whether a channel is actively
moderated or not.

Differential Revision: https://reviews.llvm.org/D133155

2 years ago[LoopLoadElim] Add stores with matching sizes as load-store candidates
Jolanta Jensen [Fri, 8 Jul 2022 10:14:08 +0000 (11:14 +0100)]
[LoopLoadElim] Add stores with matching sizes as load-store candidates

We are not building up a proper list of load-store candidates because
we are throwing away stores whose type doesn't match the load.
This patch adds stores with matching store sizes as candidates.
Author of the original patch: David Sherwood.

Differential Revision: https://reviews.llvm.org/D130233

2 years ago[TypePromotionPass] Rename variable to avoid name conflict. NFC
David Green [Fri, 2 Sep 2022 11:35:15 +0000 (12:35 +0100)]
[TypePromotionPass] Rename variable to avoid name conflict. NFC

2 years ago[Clang][Comments] Fix `Index/comment-lots-of-unknown-commands.c`
Egor Zhdan [Thu, 1 Sep 2022 12:30:22 +0000 (13:30 +0100)]
[Clang][Comments] Fix `Index/comment-lots-of-unknown-commands.c`

This re-enables a test after it was disabled in https://reviews.llvm.org/D133009.

Fixes #57484.

Differential Revision: https://reviews.llvm.org/D133105

2 years agoRevert "[InstCombine] Treat passing undef to noundef params as UB"
Muhammad Omair Javaid [Fri, 2 Sep 2022 11:09:50 +0000 (16:09 +0500)]
Revert "[InstCombine] Treat passing undef to noundef params as UB"

This reverts commit c911befaec494c52a63e3b957e28d449262656fb.

It has broken LLDB Arm/AArch64 Linux buildbots. I don't really understand
the underlying reason. Reverting for now to make the buildbots green.

https://reviews.llvm.org/D133036

2 years ago[bazel] additional build fixes for libc
Mikhail Goncharov [Fri, 2 Sep 2022 11:04:51 +0000 (13:04 +0200)]
[bazel] additional build fixes for libc

after fe41529755df946521df64ca9932b58c8eecb52b

2 years ago[CostModel][X86] Add CostKinds handling for fadd/fsub/fneg ops
Simon Pilgrim [Fri, 2 Sep 2022 10:49:46 +0000 (11:49 +0100)]
[CostModel][X86] Add CostKinds handling for fadd/fsub/fneg ops

This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695 which I'll update shortly

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'

2 years ago[GlobalOpt] Add test case for #56762.
Florian Hahn [Fri, 2 Sep 2022 10:33:06 +0000 (11:33 +0100)]
[GlobalOpt] Add test case for #56762.

Add test case where GlobalOpt fails to remove loads to global fields
with struct types.

2 years ago[clang] Skip re-building lambda expressions in parameters to consteval fns.
Utkarsh Saxena [Tue, 30 Aug 2022 14:57:07 +0000 (16:57 +0200)]
[clang] Skip re-building lambda expressions in parameters to consteval fns.

As discussed in this [comment](https://github.com/llvm/llvm-project/issues/56183#issuecomment-1224331699),
we end up building the lambda twice: once while parsing the function calls and then again while handling the immediate invocation.

This happens especially when removing nested immediate invocations,
e.g. when another consteval function appears as a parameter along with this lambda expression: `foo(bar([]{}))`, `foo(bar(), []{})`

While removing such nested immediate invocations, we should not rebuild this lambda. (IIUC, rebuilding a lambda would always generate a new type which will never match the original type from parsing)

Fixes: https://github.com/llvm/llvm-project/issues/56183
Fixes: https://github.com/llvm/llvm-project/issues/51695
Fixes: https://github.com/llvm/llvm-project/issues/50455
Fixes: https://github.com/llvm/llvm-project/issues/54872
Fixes: https://github.com/llvm/llvm-project/issues/54587

Differential Revision: https://reviews.llvm.org/D132945

2 years ago[GlobalOpt] Fix debug variance problem in hasOnlyColdCalls
Mikael Holmen [Fri, 2 Sep 2022 08:56:11 +0000 (10:56 +0200)]
[GlobalOpt] Fix debug variance problem in hasOnlyColdCalls

hasOnlyColdCalls skipped over calls to intrinsics, but it did so after
checking the linkage of the called function. This meant that the presence
of a call to a debug intrinsic could affect the outcome of the
optimization.

In my original reproducer (for an out of tree target) it was particularly
interesting, because the actual IR after GlobalOpt was not different with
debug intrinsics present, so -print-after-all printouts didn't show
anything there.

However, without debuginfo, GlobalOpt went further and ran
BlockFrequencyAnalysis and (more importantly) LoopAnalysis, and later on in
the pipeline, instcombine behaved in different ways when LoopInfo was
present.

So a call to a dbg.declare prevented running LoopAnalysis in
GlobalOpt, which later prevented InstCombine from doing an optimization.

The dbg-intrinsic-loopanalysis.ll testcase tries to expose this.

Then I also noted that adding a dbg.declare actually made the existing
testcase colccc_coldsites.ll generate different code, so I modified that
to now test it behaves the same way with and without the dbg.declare.

Reviewed By: nikic, fhahn

Differential Revision: https://reviews.llvm.org/D133193

2 years ago[JumpThreading] Process range comparisons with non-local cmp instructions
Sergey Kachkov [Fri, 2 Sep 2022 10:21:05 +0000 (12:21 +0200)]
[JumpThreading] Process range comparisons with non-local cmp instructions

Use the getPredicateOnEdge method if the value is a non-local
compare-with-a-constant instruction, since that can give more precise
results than getConstantOnEdge.

Differential Revision: https://reviews.llvm.org/D131956

2 years ago[SPIRV] Add tests to improve test coverage
Andrey Tretyakov [Tue, 30 Aug 2022 01:14:49 +0000 (04:14 +0300)]
[SPIRV] Add tests to improve test coverage

Differential Revision: https://reviews.llvm.org/D132903

2 years ago[LoongArch][test] Replace bashism `|&` to `2>&1 |` (NFC)
wanglei [Fri, 2 Sep 2022 10:10:43 +0000 (18:10 +0800)]
[LoongArch][test] Replace bashism `|&` to `2>&1 |` (NFC)

The bash syntax `|&` is unsupported on other shells.

Differential Revision: https://reviews.llvm.org/D133187

2 years ago[TTI] Improve description of TargetCostKind enums to aid targets in choosing cost...
Simon Pilgrim [Fri, 2 Sep 2022 10:08:57 +0000 (11:08 +0100)]
[TTI] Improve description of TargetCostKind enums to aid targets in choosing cost values

I'm not sure how much to add to the description as we've tried to allow targets to interpret the TargetCostKind enums in their own way. But we need to make it clear that certain cost kinds need to match threshold numbers used by various passes (and vice-versa when passes are determining a cost-benefit threshold).

I'm not keen on the "The weighted sum of size and latency" description, but it's very difficult to come up with anything else that's suitably generic (e.g. X86 will use uop counts here to easily work with LoopMicroOpBufferSize thresholds, even though high latency fdiv/fsqrt instructions still often have low uop counts).

Differential Revision: https://reviews.llvm.org/D132288

2 years ago[LoongArch] Support lowering br_jt
WANG Xuerui [Fri, 2 Sep 2022 09:57:29 +0000 (17:57 +0800)]
[LoongArch] Support lowering br_jt

Jump tables cannot be generated yet, due to missing support for emitting
local addresses.

Differential Revision: https://reviews.llvm.org/D132653

2 years ago[FLANG][NFCI]De-duplicate code in SimplifyIntrinsics
Mats Petersson [Fri, 19 Aug 2022 16:45:35 +0000 (17:45 +0100)]
[FLANG][NFCI]De-duplicate code in SimplifyIntrinsics

This removes a bunch of duplicated code, by adding an intermediate
function simplifyReduction that takes a std::function argument
for the actual replacement of the code.

No functional change intended.
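
The general shape of this kind of de-duplication can be sketched as below. This is an analogy on plain integers, not the Flang code: `simplifyReductionSketch`, `BodyGenerator`, `addStep` and `maxStep` are hypothetical names, and a plain function pointer stands in for the `std::function` argument so that the sketch can be checked at compile time.

```cpp
// Hypothetical stand-in for the callback type; the commit passes a
// std::function instead of a function pointer.
using BodyGenerator = int (*)(int acc, int element);

// Shared skeleton: walks the elements once and delegates the per-element
// step to the callback, so each reduction only supplies its own step.
constexpr int simplifyReductionSketch(const int *data, int n, int init,
                                      BodyGenerator body) {
  int acc = init;
  for (int i = 0; i < n; ++i)
    acc = body(acc, data[i]);
  return acc;
}

// Two reductions that previously would each have carried their own loop.
constexpr int addStep(int acc, int x) { return acc + x; }
constexpr int maxStep(int acc, int x) { return x > acc ? x : acc; }

constexpr int kData[4] = {3, 9, 2, 5};
```

Each reduction then reduces to a one-line call such as `simplifyReductionSketch(kData, 4, 0, addStep)`.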

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D132588

2 years ago[LICM] Add test for missed load promotion opportunity (NFC)
Nikita Popov [Fri, 2 Sep 2022 09:35:26 +0000 (11:35 +0200)]
[LICM] Add test for missed load promotion opportunity (NFC)

2 years ago[NFC] Cleanup lookup for coroutine allocation/deallocation
Chuanqi Xu [Fri, 2 Sep 2022 08:13:05 +0000 (16:13 +0800)]
[NFC] Cleanup lookup for coroutine allocation/deallocation

2 years ago[mlir][Vector] Refactor vector distribution and fix an issue related to non-homogenou...
Nicolas Vasilache [Thu, 1 Sep 2022 12:47:32 +0000 (05:47 -0700)]
[mlir][Vector] Refactor vector distribution and fix an issue related to non-homogenous transfer indices.

Running: `mlir-opt -test-vector-warp-distribute=rewrite-warp-ops-to-scf-if -canonicalize -verify-each=0`.

Prior to this revision, IR resembling the following would be produced:
```
  %4 = "vector.load"(%3, %arg0) : (memref<1x32xf32, 3>, index) -> vector<1x1xf32>
```
This fails verification since it needs 2 indices to load but only 1 is provided.

Differential Revision: https://reviews.llvm.org/D133106

2 years ago[MLIR] Remove unused config attributes from lit.site.cfg.py
Christian Sigg [Fri, 26 Aug 2022 10:04:32 +0000 (12:04 +0200)]
[MLIR] Remove unused config attributes from lit.site.cfg.py

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D132726

2 years ago[SPIRV] Add tests to improve test coverage
Andrey Tretyakov [Sun, 28 Aug 2022 23:19:14 +0000 (02:19 +0300)]
[SPIRV] Add tests to improve test coverage

Differential Revision: https://reviews.llvm.org/D132817

2 years ago[mlir][Linalg] Apply ClangTidy performance finding.
Adrian Kuegel [Fri, 2 Sep 2022 08:57:28 +0000 (10:57 +0200)]
[mlir][Linalg] Apply ClangTidy performance finding.

Loop variable is copied but only used as const reference.
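
For readers unfamiliar with this class of finding, a minimal sketch of the fix pattern follows. The `Tile` type and `sumFirstSizes` function are hypothetical and unrelated to the Linalg code; the message does not name the specific clang-tidy check, so none is assumed here.

```cpp
// Hypothetical element type that is large enough for a copy to matter.
struct Tile {
  int sizes[8];
};

constexpr int sumFirstSizes(const Tile (&tiles)[3]) {
  int total = 0;
  // Writing `for (Tile tile : tiles)` would copy every element even though
  // the body only reads it. Binding by const reference avoids the copy.
  for (const Tile &tile : tiles)
    total += tile.sizes[0];
  return total;
}

constexpr Tile kTiles[3] = {{{1}}, {{2}}, {{3}}};
```

The behaviour is unchanged by the fix; only the per-iteration copy goes away.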

2 years ago[cmake] Append CLANG_LIBDIR_SUFFIX to scan-build-py installation destination
Sinan Lin [Fri, 2 Sep 2022 08:16:23 +0000 (16:16 +0800)]
[cmake] Append CLANG_LIBDIR_SUFFIX to scan-build-py installation destination

This issue shows up when building llvm with config LLVM_LIBDIR_SUFFIX=64:
the installation destination of scan-build-py does not respect the given
suffix.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D133160

2 years ago[X86] Add missing key feature for core2
Freddy Ye [Fri, 2 Sep 2022 03:53:43 +0000 (11:53 +0800)]
[X86] Add missing key feature for core2

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D133094

2 years ago[flang] Avoid copyin/copyout if the actual argument is contiguous at runtime
Valentin Clement [Fri, 2 Sep 2022 07:46:01 +0000 (09:46 +0200)]
[flang] Avoid copyin/copyout if the actual argument is contiguous at runtime

This patch adds a runtime contiguity check to avoid copyin/copyout
when the actual argument is in fact contiguous.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D133097

2 years agoRevert "[DSE] Eliminate noop store even through has clobbering between LoadI and...
Nikita Popov [Fri, 2 Sep 2022 07:28:48 +0000 (09:28 +0200)]
Revert "[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI"

This reverts commit cd8f3e75813995c1d2da35370ffcf5af3aff9c2f.

As pointed out by Eli on the review, this is missing an alignment
check. The value might be written at an offset.

2 years ago[LICM] Allow load-only scalar promotion in the presence of unwinding
Nikita Popov [Thu, 1 Sep 2022 12:33:55 +0000 (14:33 +0200)]
[LICM] Allow load-only scalar promotion in the presence of unwinding

Currently, we bail out of scalar promotion if the loop may unwind
and the memory may be visible on unwind. This is because we can't
insert stores of the promoted value on unwind edges.

However, nowadays scalar promotion also has support for only
promoting loads, while leaving stores in place. This kind of
promotion is safe even in the presence of unwinding.
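
A source-level picture of load-only promotion, as a rough sketch: the transformation happens on LLVM IR, and the functions below (`mayUnwind`, `accumulateBefore`, `accumulateAfter`, `run`) are hypothetical names that only mimic its effect in C++.

```cpp
// Stand-in for a call that may unwind but does not touch *p.
constexpr void mayUnwind(int i) {
  if (i < 0)
    throw i;
}

// Before: the location is loaded and stored on every iteration.
constexpr void accumulateBefore(int *p, int n) {
  for (int i = 0; i < n; ++i) {
    *p = *p + i;
    mayUnwind(i);
  }
}

// After load-only promotion: the load is hoisted into a local, but the store
// stays inside the loop, so memory is up to date whenever the call unwinds.
constexpr void accumulateAfter(int *p, int n) {
  int v = *p;
  for (int i = 0; i < n; ++i) {
    v = v + i;
    *p = v;
    mayUnwind(i);
  }
}

// Small driver that runs either variant on a local cell.
constexpr int run(void (*fn)(int *, int), int start, int n) {
  int cell = start;
  fn(&cell, n);
  return cell;
}
```

Because the store is never sunk out of the loop, no store has to be inserted on an unwind edge, which is the case the bail-out was guarding against.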

Differential Revision: https://reviews.llvm.org/D133111

2 years ago[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI
luxufan [Wed, 24 Aug 2022 13:51:58 +0000 (13:51 +0000)]
[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI

For a noop store of the form LoadI followed by StoreI, the invariant
that must hold is that the memory state of the related MemoryLoc
before LoadI is the same as before StoreI.
For this example:
```
define void @pr49927(i32* %q, i32* %p) {
  %v = load i32, i32* %p, align 4
  store i32 %v, i32* %q, align 4
  store i32 %v, i32* %p, align 4
  ret void
}
```
Here the definition of the store's destination is different from the
definition of the load's location, so it seems that the invariant
mentioned above is broken. But the definition of the store's
destination writes the value produced by LoadI, so the invariant
actually still holds and the clobber can safely be ignored.

Differential Revision: https://reviews.llvm.org/D132657

2 years ago[ORC-RT] Fix typo.
Lang Hames [Fri, 2 Sep 2022 06:16:45 +0000 (23:16 -0700)]
[ORC-RT] Fix typo.

Removes the stray '$' that slipped in to c1c585a065e5.

2 years ago[ORC-RT] Don't unconditionally add dependence on llvm-jitlink.
Lang Hames [Fri, 2 Sep 2022 05:42:39 +0000 (22:42 -0700)]
[ORC-RT] Don't unconditionally add dependence on llvm-jitlink.

Commit 4adc5bead4a moved a dependence on llvm-jitlink from
SANITIZER_COMMON_LIT_TEST_DEPS to ORC_TEST_DEPS, but in doing so it moved it
out from under a 'NOT COMPILER_RT_STANDALONE_BUILD ...' conditional. This led
to failures on standalone builds.

This commit adds the conditional to the ORC_TEST_DEPS assignment to work
around the issue while we look for a longer-term fix.

rdar://99453446