platform/upstream/llvm.git
22 months ago[mlir] Add missing dependency in bazel build
Quentin Colombet [Thu, 8 Sep 2022 18:49:47 +0000 (18:49 +0000)]
[mlir] Add missing dependency in bazel build

The MemRefTransform library now depends on ArithmeticUtils because
of the newly added SimplifyExtractStridedMetadata pass.

22 months agoRevert "[lldb] Use just-built libcxx for tests when available"
Felipe de Azevedo Piovezan [Thu, 8 Sep 2022 18:37:46 +0000 (14:37 -0400)]
Revert "[lldb] Use just-built libcxx for tests when available"

This reverts commit c38eeecbc7d929c9601f2189214a7a90d3982a47.

22 months ago[mlir][PDL] Infer result types from a `replace` as the last resort
River Riddle [Thu, 1 Sep 2022 17:36:00 +0000 (10:36 -0700)]
[mlir][PDL] Infer result types from a `replace` as the last resort

This prevents situations where explicit result types were provided,
which have different types than the operation being replaced. This
is useful for supporting dialect conversion, which will have proper
support added in a followup.

Differential Revision: https://reviews.llvm.org/D133141

22 months agoMake llvm-tli-checker support static libraries
Paul Robinson [Wed, 7 Sep 2022 19:45:35 +0000 (12:45 -0700)]
Make llvm-tli-checker support static libraries

The original implementation assumed dynamic libraries and so looked
only at the dynamic symbol table.  Use the regular symbol table for
ET_REL files.

Differential Revision: https://reviews.llvm.org/D133448

22 months ago[mlir][arith] Add wide integer emulation pass
Jakub Kuderski [Thu, 8 Sep 2022 17:50:21 +0000 (13:50 -0400)]
[mlir][arith] Add wide integer emulation pass

In this first patch in a series to add wide integer emulation:
*  Set up the initial pass structure
*  Add a custom type converter
*  Handle func ops

The initial implementation supports power-of-two integer types only. We
emulate wide integer operations by splitting original i2N integer types
into two iN halves.

My immediate use case is to emulate i64 operations using i32 ones
on mobile GPUs that do not support i64.
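
The splitting idea can be illustrated outside of MLIR. The following is a minimal sketch, not code from this patch (which only sets up the pass, the type converter, and func op handling): the `WideInt` struct, its low-half-first layout, and the helper names are assumptions made for illustration. It shows an i64 addition carried out with 32-bit operations only.

```cpp
#include <cstdint>

// Hypothetical representation of an emulated i64 as two i32 halves.
struct WideInt {
  uint32_t lo; // bits 0..31
  uint32_t hi; // bits 32..63
};

// Split a 64-bit value into its two 32-bit halves.
WideInt splitI64(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

// Reassemble the halves; used here only to check results.
uint64_t joinI64(WideInt w) {
  return (static_cast<uint64_t>(w.hi) << 32) | w.lo;
}

// 64-bit addition expressed with 32-bit arithmetic only.
WideInt addWide(WideInt a, WideInt b) {
  uint32_t lo = a.lo + b.lo;            // wraps modulo 2^32
  uint32_t carry = lo < a.lo ? 1u : 0u; // the sum wrapped iff it is smaller than an operand
  uint32_t hi = a.hi + b.hi + carry;    // propagate the carry into the high half
  return {lo, hi};
}
```

For example, adding `splitI64(0xFFFFFFFF)` and `splitI64(1)` produces a carry out of the low half, so the result has `lo == 0` and `hi == 1`.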

Reviewed By: antiagainst, Mogball

Differential Revision: https://reviews.llvm.org/D133135

22 months ago[lldb] Use just-built libcxx for tests when available
Felipe de Azevedo Piovezan [Tue, 30 Aug 2022 13:28:14 +0000 (09:28 -0400)]
[lldb] Use just-built libcxx for tests when available

This commit improves upon cc0b5ebf7fc8, which added support for
specifying which libcxx to use when testing LLDB. That patch honored
requests by tests that had `USE_LIBCPP=1` defined in their makefiles.
Now, we also use a non-default libcxx if all conditions below are true:

1. The test is not explicitly requesting the use of libstdcpp
(USE_LIBSTDCPP=1).
2. The test is not explicitly requesting the use of the system's
library (USE_SYSTEM_STDLIB=1).
3. A path to libcxx was either provided by the user through CMake flags
or libcxx was built together with LLDB.

Condition (2) is new and introduced in this patch in order to support
tests that are either:

* Cross-platform (such as API/macosx/macCatalyst and
API/tools/lldb-server). The just-built libcxx is usually not built for
platforms other than the host's.
* Cross-language (such as API/lang/objc/exceptions). In this case, the
Objective C runtime throws an exception that always goes through the
system's libcxx, instead of the just built libcxx. Fixing this would
require either changing the install-name of the just built libcxx in Mac
systems, or tuning the DYLD_LIBRARY_PATH variable at runtime.

Some other tests expose limitations of LLDB when running with a debug
standard library. TestDbgInfoContentForwardLists had an assertion
removed, as it was checking for buggy LLDB behavior (which now
crashes). TestFixIts had a variable renamed, as the old name clashes
with a standard library name when debug info is present. This is a known
issue: https://github.com/llvm/llvm-project/issues/34391.

For `TestSBModule`, the way the "main" module is found was changed to
look for the "a.out" module, instead of relying on the index being 0. In
some systems, the index 0 is dyld when a custom standard library is
used.

Differential Revision: https://reviews.llvm.org/D132940

22 months ago[LLDB][NativePDB] Fix PdbAstBuilder::GetParentDeclContext when ICF happens.
Zequan Wu [Tue, 6 Sep 2022 22:38:54 +0000 (15:38 -0700)]
[LLDB][NativePDB] Fix PdbAstBuilder::GetParentDeclContext when ICF happens.

Removed `GetParentDeclContextForSymbol` as it is not necessary. We can get
the demangled names from CVSymbol and then use them to create a tag decl or
namespace decl. This also fixes a bug that occurred when ICF was applied.

Differential Revision: https://reviews.llvm.org/D133243

22 months ago[ARM] Add tests on instructions fusion with comparison with zero; NFC
Filipp Zhinkin [Thu, 8 Sep 2022 13:19:43 +0000 (16:19 +0300)]
[ARM] Add tests on instructions fusion with comparison with zero; NFC

Baseline tests for D131786

22 months ago[clang] extend getCommonSugaredType to merge sugar nodes
Matheus Izvekov [Tue, 19 Jul 2022 09:02:32 +0000 (11:02 +0200)]
[clang] extend getCommonSugaredType to merge sugar nodes

This continues D111283 by extending the getCommonSugaredType
implementation to also merge non-canonical type nodes.

We merge these nodes by going up starting from the canonical
node, calculating their merged properties on the way.

If we reach a pair that is too different, or which we could not
otherwise unify, we bail out and don't try to keep going on to
the next pair, in effect stripping out all the remaining top-level
sugar nodes. This avoids mismatching 'companion' nodes, such as
ElaboratedType, so that they don't end up elaborating some other
unrelated thing.

Depends on D111509

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D130308

22 months ago[clang] use getCommonSugar in an assortment of places
Matheus Izvekov [Sun, 10 Oct 2021 13:28:37 +0000 (15:28 +0200)]
[clang] use getCommonSugar in an assortment of places

For this patch, a simple search was performed for patterns where there are
two types (usually an LHS and an RHS) which are structurally the same, and there
is some result type which is resolved as either one of them (typically LHS for
consistency).

We change those cases to resolve as the common sugared type between those two,
utilizing the new infrastructure created for this purpose.

Depends on D111283

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D111509

22 months ago[clang] template / auto deduction deduces common sugar
Matheus Izvekov [Wed, 25 May 2022 20:00:58 +0000 (22:00 +0200)]
[clang] template / auto deduction deduces common sugar

After upgrading the type deduction machinery to retain type sugar in
D110216, we were left with a situation where there is no general
well behaved mechanism in Clang to unify the type sugar of multiple
deductions of the same type parameter.

So we ended up making an arbitrary choice: keep the sugar of the first
deduction, ignore subsequent ones.

In general, we already had this problem, but on a smaller scale.
The result of the conditional operator and many other binary ops
could benefit from such a mechanism.

This patch implements such a type sugar unification mechanism.

The basics:

This patch introduces a `getCommonSugaredType(QualType X, QualType Y)`
method to ASTContext which implements this functionality, and uses it
for unifying the results of type deduction and return type deduction.
This will return the most derived type sugar which occurs in both X and
Y.

Example:

Suppose we have these types:
```
using Animal = int;
using Cat = Animal;
using Dog = Animal;

using Tom = Cat;
using Spike = Dog;
using Tyke = Dog;
```
For `X = Tom, Y = Spike`, this will result in `Animal`.
For `X = Spike, Y = Tyke`, this will result in `Dog`.

How it works:

We take two types, X and Y, which we wish to unify as input.
These types must have the same (qualified or unqualified) canonical
type.

We dive down fast through top-level type sugar nodes, to the
underlying canonical node. If these canonical nodes differ, we
build a common one out of the two, unifying any sugar they had.
Note that this might involve a recursive call to unify any children
of those. We then return that canonical node, handling any qualifiers.

If they don't differ, we walk up the list of sugar type nodes we dived
through, finding the last identical pair, and returning that as the
result, again handling qualifiers.

Note that this patch will not unify sugar nodes if they are not
identical already. We will simply strip off top-level sugar nodes that
differ between X and Y. This sugar node unification will instead be
implemented in a subsequent patch.
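
The "walk up to the last identical pair" step can be modeled with a toy data structure. This is a simplified sketch and not Clang's implementation: a type is represented as a list of names from the outermost sugar down to the canonical type, and `SugarChain` and `commonSugar` are hypothetical names used only here. Qualifiers and child types are ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A type modeled as its chain of names, outermost sugar first,
// canonical type last (e.g. Tom -> Cat -> Animal -> int).
using SugarChain = std::vector<std::string>;

// Starting from the canonical end, walk up both chains while the nodes are
// identical, and return the outermost node that still matches.
// Returns an empty string if even the canonical types differ.
std::string commonSugar(const SugarChain &x, const SugarChain &y) {
  std::size_t i = x.size();
  std::size_t j = y.size();
  std::string result;
  while (i > 0 && j > 0 && x[i - 1] == y[j - 1]) {
    result = x[i - 1];
    --i;
    --j;
  }
  return result;
}
```

With the aliases from the example, the chains `{"Tom", "Cat", "Animal", "int"}` and `{"Spike", "Dog", "Animal", "int"}` share the suffix `Animal, int`, so the model returns `Animal`; `Spike` and `Tyke` share `Dog, Animal, int`, so it returns `Dog`.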

This patch also implements a few users of this mechanism:
* Template argument deduction.
* Auto deduction, for functions returning auto / decltype(auto), with
  special handling for initializer_list as well.

Further users will be implemented in a subsequent patch.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D111283

22 months ago[lld][COFF] Add support for overriding weak symbols in LLVM bitcode input
Alan Zhao [Wed, 31 Aug 2022 20:45:40 +0000 (16:45 -0400)]
[lld][COFF] Add support for overriding weak symbols in LLVM bitcode input

LLVM bitcode contains support for weak symbols, so we can add support
for overriding weak symbols in the output COFF even though COFF doesn't
have inherent support for weak symbols.

The motivation for this patch is that Chromium is trying to use libc++'s
assertion handler mechanism, which relies on weak symbols [0], but we're
unable to perform a ThinLTO build on Windows due to this problem [1].

[0]: https://reviews.llvm.org/D121478
[1]: https://crrev.com/c/3863576

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D133165

22 months ago[mlir][MemRef] Simplify extract_strided_metadata(subview)
Quentin Colombet [Thu, 1 Sep 2022 22:11:14 +0000 (22:11 +0000)]
[mlir][MemRef] Simplify extract_strided_metadata(subview)

Add a dedicated pass to simplify
extract_strided_metadata(other_op(memref)).

Currently the pass features only one pattern:
extract_strided_metadata(subview).
The goal is to get rid of the subview while materializing its effects on
the offset, sizes, and strides with respect to the base object.

In other words, this simplification replaces:
```
baseBuffer, offset, sizes, strides =
    extract_strided_metadata(
        subview(memref, subOffset, subSizes, subStrides))
```

With

```
baseBuffer, baseOffset, baseSizes, baseStrides =
    extract_strided_metadata(memref)
strides#i = baseStrides#i * subStrides#i
offset = baseOffset + sum(subOffset#i * baseStrides#i)
sizes = subSizes
```
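
The arithmetic behind this rewrite can be sketched with plain integers. This is an illustration only, assuming the usual subview semantics in which each resulting stride is the base stride multiplied by the subview stride and the offset accumulates each subview offset scaled by the base stride. `StridedMetadata` and `composeSubview` are hypothetical names, not part of the pass.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-data stand-in for the results of extract_strided_metadata.
struct StridedMetadata {
  int64_t offset;
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;
};

// Fold a subview (offsets, sizes, strides) into the metadata of its base.
StridedMetadata composeSubview(const StridedMetadata &base,
                               const std::vector<int64_t> &subOffsets,
                               const std::vector<int64_t> &subSizes,
                               const std::vector<int64_t> &subStrides) {
  StridedMetadata result{base.offset, subSizes, {}};
  for (std::size_t i = 0; i < base.strides.size(); ++i) {
    // Moving subOffsets[i] elements along dimension i of the base.
    result.offset += subOffsets[i] * base.strides[i];
    // One step in the subview is subStrides[i] steps in the base.
    result.strides.push_back(base.strides[i] * subStrides[i]);
  }
  return result;
}
```

For a row-major 4x8 base (offset 0, strides 8 and 1) and a subview with offsets (1, 2), sizes (2, 3), and strides (2, 1), this yields offset 10, sizes (2, 3), and strides (16, 1).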

Differential Revision: https://reviews.llvm.org/D133166

22 months ago[mlir][bufferization] fix typo in example code for bufferization.alloc_tensor
Peiming Liu [Thu, 8 Sep 2022 16:39:17 +0000 (16:39 +0000)]
[mlir][bufferization] fix typo in example code for bufferization.alloc_tensor

See BufferizationOps.cpp:408, the dynamic sizes are enclosed by "()" not "[]"

https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Bufferization/IR/BufferizationOps.cpp#L408

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133505

22 months ago[flang] Control flow with empty select case blocks
V Donaldson [Thu, 8 Sep 2022 04:22:59 +0000 (21:22 -0700)]
[flang] Control flow with empty select case blocks

Fix control flow for empty select case blocks such as:

  select case (2)
    case (1)
      print*, '1'
    case (2)
    ! print*, '2'
    case default
      print*, 'default'
  end select

22 months ago[mlir][tosa] Add dynamic width/height for pooling in tosa to linalg
natashaknk [Wed, 7 Sep 2022 22:40:17 +0000 (15:40 -0700)]
[mlir][tosa] Add dynamic width/height for pooling in tosa to linalg

Needed to support dynamic width/height for pooling inputs using
the similar convolution work.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D133389

22 months agoRevert "Support: Add mapped_file_region::sync(), equivalent to msync"
raghavmedicherla [Thu, 8 Sep 2022 16:04:12 +0000 (12:04 -0400)]
Revert "Support: Add mapped_file_region::sync(), equivalent to msync"

This reverts commit 142f51fc2f448845f6a32e767ffaa2b665eea11f.

This shouldn't have been committed; it was committed accidentally.

22 months ago[mlir][NFC] Provide accessor for TableGen record
Mathieu Fehr [Thu, 8 Sep 2022 15:46:59 +0000 (08:46 -0700)]
[mlir][NFC] Provide accessor for TableGen record

Constraint and Predicate classes did not expose their underlying
tablegen records. This means that for Constraints, it is not possible
to get the underlying base constraint of a variadic constraint.
For Predicate, it is not possible to get the predicates forming an
`Or` predicate for instance.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D133264

22 months ago[InstCombine] fold icmp of truncated left shift, part 2
Sanjay Patel [Thu, 8 Sep 2022 16:09:26 +0000 (12:09 -0400)]
[InstCombine] fold icmp of truncated left shift, part 2

(trunc (1 << Y) to iN) == 2**C --> Y == C
(trunc (1 << Y) to iN) != 2**C --> Y != C
https://alive2.llvm.org/ce/z/xnFPo5

Follow-up to d9e1f9d7591b0d3e4d. This was a suggested
enhancement mentioned in issue #51889.
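
The equivalence can also be checked exhaustively for one concrete instance. The sketch below is an illustration rather than part of the patch: it assumes a 32-bit source type truncated to i8, in-range shift amounts (Y < 32), and a constant power of two that fits in the narrow type (C < 8). The function name is hypothetical.

```cpp
#include <cstdint>

// Checks (trunc (1 << Y) to i8) == 2**C  <=>  Y == C
// for every Y in [0, 32) and every C in [0, 8).
bool truncShlEqPow2FoldHolds() {
  for (uint32_t y = 0; y < 32; ++y) {
    for (uint32_t c = 0; c < 8; ++c) {
      uint8_t truncated = static_cast<uint8_t>(uint32_t{1} << y);
      bool before = truncated == (1u << c); // original comparison
      bool after = (y == c);                // folded comparison
      if (before != after)
        return false;
    }
  }
  return true;
}
```

The fold holds because the truncated value is either 0 (when the set bit is shifted out of the low 8 bits) or exactly 2**Y, and 0 never equals a power of two.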

22 months ago[flang] Add co_sum to the list of intrinsics and update test
Katherine Rasmussen [Wed, 24 Aug 2022 22:34:20 +0000 (15:34 -0700)]
[flang] Add co_sum to the list of intrinsics and update test

Add the collective subroutine, co_sum, to the list of intrinsics.
In accordance with 16.9.50 and 16.9.137, add a check that emits an
error if coindexed objects are passed to certain arguments
in co_sum and in move_alloc. Add a semantics test to check that
this error is successfully caught in calls to move_alloc. Remove
the XFAIL directive, update the ERROR directives and add
standard-conforming and non-standard conforming calls in the
semantics test for co_sum.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D114134

22 months ago[clangd] Set Incompleteness for spec fuzzyfind requests
Kadir Cetinkaya [Thu, 8 Sep 2022 09:19:40 +0000 (11:19 +0200)]
[clangd] Set Incompleteness for spec fuzzyfind requests

Differential Revision: https://reviews.llvm.org/D133479

22 months ago[test] Fix compress-debug-sections-zlib-unavailable.s
Fangrui Song [Thu, 8 Sep 2022 16:32:12 +0000 (09:32 -0700)]
[test] Fix compress-debug-sections-zlib-unavailable.s

22 months ago[clang] Fix a crash in constant evaluation
Kadir Cetinkaya [Fri, 2 Sep 2022 12:24:03 +0000 (14:24 +0200)]
[clang] Fix a crash in constant evaluation

22 months ago[LV] Use safe-divisor lowering for fixed vectors if profitable
Philip Reames [Thu, 8 Sep 2022 16:07:08 +0000 (09:07 -0700)]
[LV] Use safe-divisor lowering for fixed vectors if profitable

This extends the safe-divisor widening scheme recently added for scalable vectors to handle fixed vectors as well.

Differential Revision: https://reviews.llvm.org/D132591

22 months ago[SystemZ] Fix new test case
Jonas Paulsson [Thu, 8 Sep 2022 16:03:38 +0000 (18:03 +0200)]
[SystemZ] Fix new test case

Add 'REQUIRES: systemz-registered-target'.

22 months ago[AArch64] Add test for vscale nontemporal loads larger than 256.
Zain Jaffal [Thu, 8 Sep 2022 15:42:11 +0000 (16:42 +0100)]
[AArch64] Add test for vscale nontemporal loads larger than 256.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D133498

22 months ago[libc++] Removes Clang 13 support.
Mark de Wever [Wed, 7 Sep 2022 16:55:23 +0000 (18:55 +0200)]
[libc++] Removes Clang 13 support.

Reviewed By: #libc, ldionne, jloser

Differential Revision: https://reviews.llvm.org/D133435

22 months ago[mlir][dataflow] Remove Lattice::isUninitialized().
Zhixun Tan [Thu, 8 Sep 2022 15:43:47 +0000 (08:43 -0700)]
[mlir][dataflow] Remove Lattice::isUninitialized().

Currently, for sparse analyses, we always store an `Optional<ValueT>` in each lattice element. When it's `None`, we consider the lattice element as `uninitialized`.

However:

* Not all lattices have an `uninitialized` state. For example, `Executable` and `PredecessorState` have default values so they are always initialized.

* In dense analyses, we don't have the concept of an `uninitialized` state.

Given these inconsistencies, this patch removes `Lattice::isUninitialized()`. Individual analysis states are now default-constructed. If the default state of an analysis can be considered as "uninitialized" then this analysis should implement the following logic:

* Special join rule: `join(uninitialized, any) == any`.

* Special bail out logic: if any of the input states is uninitialized, exit the transfer function early.
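
The two rules above can be sketched in a standalone form. This is a minimal illustration that does not use the MLIR dataflow framework API: `ConstantState`, `makeConstant`, and `transferAdd` are hypothetical names for a constant-propagation-like analysis whose default-constructed state plays the role of "uninitialized".

```cpp
// Hypothetical analysis state; the default-constructed value is "uninitialized".
struct ConstantState {
  enum Kind { Uninitialized, Constant, Overdefined };
  Kind kind = Uninitialized;
  int value = 0;

  bool isUninitialized() const { return kind == Uninitialized; }

  // Special join rule: join(uninitialized, any) == any.
  static ConstantState join(const ConstantState &lhs, const ConstantState &rhs) {
    if (lhs.isUninitialized())
      return rhs;
    if (rhs.isUninitialized())
      return lhs;
    if (lhs.kind == Constant && rhs.kind == Constant && lhs.value == rhs.value)
      return lhs;
    ConstantState result;
    result.kind = Overdefined;
    return result;
  }
};

ConstantState makeConstant(int v) {
  ConstantState s;
  s.kind = ConstantState::Constant;
  s.value = v;
  return s;
}

// Transfer function for an addition, with the special bail-out logic:
// if any input is uninitialized, leave the result uninitialized.
ConstantState transferAdd(const ConstantState &a, const ConstantState &b) {
  ConstantState result;
  if (a.isUninitialized() || b.isUninitialized())
    return result;
  if (a.kind == ConstantState::Constant && b.kind == ConstantState::Constant)
    return makeConstant(a.value + b.value);
  result.kind = ConstantState::Overdefined;
  return result;
}
```

Keeping the "uninitialized" notion inside the state type means analyses that have no such state, such as the `Executable` example above, simply do not implement these rules.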

Depends On D132086

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D132800

22 months ago[AMDGPU] Fix shrinking of F16 FMA on newer subtargets
Jay Foad [Thu, 8 Sep 2022 13:36:23 +0000 (14:36 +0100)]
[AMDGPU] Fix shrinking of F16 FMA on newer subtargets

D125803 introduced shrinking of F16 FMA to FMAAK/FMAMK in
SIShrinkInstructions (useful on GFX10+ where VOP3 instructions may have
a literal operand) but failed to handle the V_FMA_F16_gfx9_e64 form of
the opcode which is used on GFX9+.

Differential Revision: https://reviews.llvm.org/D133489

22 months ago[SystemZ] Improve handling of vector alignments.
Jonas Paulsson [Thu, 4 Aug 2022 10:16:44 +0000 (12:16 +0200)]
[SystemZ] Improve handling of vector alignments.

Make the DataLayout string always hold a vector alignment of 8 bytes,
regardless of the vector ABI. This makes the datalayout depend only on the
target triple which is the general expectation (in assertions).

On older architectures where vectors use the natural alignment (16 bytes),
the front end will maintain the same behavior and produce an overalignment
compared to the datalayout.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D131158

22 months ago[clangd][ObjC] Improve completions for protocols + category names
David Goldman [Wed, 31 Aug 2022 17:33:09 +0000 (13:33 -0400)]
[clangd][ObjC] Improve completions for protocols + category names

- Render protocols as interfaces to differentiate them from classes
  since a protocol and class can have the same name. Take this one step
  further though, and only recommend protocols in ObjC protocol completions.

- Properly call `includeSymbolFromIndex` even with a cached
  speculative fuzzy find request

- Don't use the index to provide completions for category names,
  symbols there don't make sense

Differential Revision: https://reviews.llvm.org/D132962

22 months ago[mlir][sparse] fix bug in workspace dimension computation
Aart Bik [Thu, 8 Sep 2022 05:46:31 +0000 (22:46 -0700)]
[mlir][sparse] fix bug in workspace dimension computation

Access pattern expansion is always done along the innermost stored
dimension, but this was incorrectly reordered due to using a
general utility typically used by original dimensions only.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D133472

22 months ago[WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32`
Thomas Lively [Thu, 8 Sep 2022 15:07:48 +0000 (08:07 -0700)]
[WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32`

As proposed in https://github.com/WebAssembly/relaxed-simd/issues/77. Only an
LLVM intrinsic and a clang builtin are implemented. Since there is no bfloat16
type, use u16 to represent the bfloats in the builtin function arguments.

Differential Revision: https://reviews.llvm.org/D133428

22 months ago[llvm] Use std::size instead of llvm::array_lengthof
Joe Loser [Wed, 7 Sep 2022 00:06:58 +0000 (18:06 -0600)]
[llvm] Use std::size instead of llvm::array_lengthof

LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.

Change call sites to use `std::size` instead.
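
As a small illustration of the replacement (the array and function names below are made up for this example), `std::size` from `<iterator>` returns the element count of a C-style array, which is what `llvm::array_lengthof` was used for. This requires C++17.

```cpp
#include <cstddef>
#include <iterator>

// A C-style array whose length is deduced from its initializer.
static const int kPrimes[] = {2, 3, 5, 7, 11};

// Before: llvm::array_lengthof(kPrimes)
// After:  std::size(kPrimes)
constexpr std::size_t primeCount() { return std::size(kPrimes); }
```

Like the helper it replaces, `std::size` only accepts real arrays (or containers with a `size()` member), so passing a pointer that an array has decayed to is a compile error rather than a silent miscount.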

Differential Revision: https://reviews.llvm.org/D133429

22 months agoRevert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
Djordje Todorovic [Thu, 8 Sep 2022 15:00:36 +0000 (17:00 +0200)]
Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""

This reverts commit f87993915768772d113bfd524347ce4341b843cf.

22 months ago[Hexagon] Handle shifts of short vectors of i8
Krzysztof Parzyszek [Thu, 8 Sep 2022 14:06:05 +0000 (07:06 -0700)]
[Hexagon] Handle shifts of short vectors of i8

22 months ago[NFC][Regalloc] Introduce the RegAllocPriorityAdvisorAnalysis
Eric Wang [Thu, 8 Sep 2022 00:12:29 +0000 (17:12 -0700)]
[NFC][Regalloc] Introduce the RegAllocPriorityAdvisorAnalysis

This patch introduces the priority analysis and the priority advisor,
the default implementation, and the scaffolding for introducing the
other implementations of the advisor.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D132835

22 months ago[InstCombine] Fold icmp of truncated left shift
Sanjay Patel [Thu, 8 Sep 2022 14:44:49 +0000 (10:44 -0400)]
[InstCombine] Fold icmp of truncated left shift

(trunc (1 << Y) to iN) == 0 --> Y u>= N
(trunc (1 << Y) to iN) != 0 --> Y u<  N

These can be generalized in several ways as noted by the TODO
items, but this handles the pattern in the motivating bug report.
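
One concrete instance of this fold can be checked by brute force. The sketch below is only an illustration: it assumes a 32-bit source type truncated to i8 (N = 8) and in-range shift amounts (Y < 32), and the function name is hypothetical.

```cpp
#include <cstdint>

// Checks (trunc (1 << Y) to i8) == 0  <=>  Y u>= 8
// and    (trunc (1 << Y) to i8) != 0  <=>  Y u<  8
// for every Y in [0, 32).
bool truncShlEqZeroFoldHolds() {
  for (uint32_t y = 0; y < 32; ++y) {
    uint8_t truncated = static_cast<uint8_t>(uint32_t{1} << y);
    if ((truncated == 0) != (y >= 8))
      return false;
    if ((truncated != 0) != (y < 8))
      return false;
  }
  return true;
}
```

The single set bit survives the truncation exactly when it lands in the low N bits, which is why the comparison against zero reduces to an unsigned comparison of Y against N.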

Fixes #51889

Differential Revision: https://reviews.llvm.org/D115480

22 months ago[AArch64] Add tests for lowering trunc to i8 using tbl.
Florian Hahn [Thu, 8 Sep 2022 14:45:32 +0000 (15:45 +0100)]
[AArch64] Add tests for lowering trunc to i8 using tbl.

22 months agoRecommit "[AggressiveInstCombine] Lower Table Based CTTZ"
Djordje Todorovic [Wed, 7 Sep 2022 09:57:43 +0000 (11:57 +0200)]
Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"

22 months ago[AMDGPU] Only raise wave priority if there is a long enough sequence of VALU instruct...
Ivan Kosarev [Thu, 8 Sep 2022 13:56:37 +0000 (14:56 +0100)]
[AMDGPU] Only raise wave priority if there is a long enough sequence of VALU instructions.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D124671

22 months ago[AMDGPU] Add basic tests for emitting v_fma_f16 and friends
Jay Foad [Thu, 8 Sep 2022 13:13:17 +0000 (14:13 +0100)]
[AMDGPU] Add basic tests for emitting v_fma_f16 and friends

22 months ago[VPlan] Only generate single instr for loads uniform across all parts.
Florian Hahn [Thu, 8 Sep 2022 13:27:58 +0000 (14:27 +0100)]
[VPlan] Only generate single instr for loads uniform across all parts.

VPReplicateRecipe::isUniform actually means uniform-per-parts, hence a
scalar instruction is generated per-part.

This is a potential alternative to D132892. For now the current patch only
catches cases where the address is trivially invariant (defined outside
VPlan), while D132892 catches any address that is considered invariant
by SCEV AFAICT.

It should be possible to hoist fully invariant recipes feeding loads out
of the vector loop region as well, but in practice LICM should do that
already.

This version of the patch artificially limits this to loads to make it
easier to compare, but this restriction should be easily liftable.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133019

22 months ago[AArch64] Add tests for shuffle (tbl2, tbl2) -> tbl4 fold.
Florian Hahn [Thu, 8 Sep 2022 13:01:11 +0000 (14:01 +0100)]
[AArch64] Add tests for shuffle (tbl2, tbl2) -> tbl4 fold.

Add extra tests where shuffle (tbl2, tbl2) can be folded to tbl4.
Regenerate check lines automatically as well.

22 months ago[LICM] Add test for sret with conditional store (NFC)
Nikita Popov [Thu, 8 Sep 2022 12:52:35 +0000 (14:52 +0200)]
[LICM] Add test for sret with conditional store (NFC)

22 months ago[lit] Test changes to make it work with bazel
Christian Sigg [Wed, 7 Sep 2022 21:59:18 +0000 (23:59 +0200)]
[lit] Test changes to make it work with bazel

These non-functional changes will make it easier to add the lit tests to the bazel build (see utils/bazel).

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D133416

22 months ago[AArch64][SVE] Add out of range SVE arg CC test
Matt Devereau [Thu, 8 Sep 2022 11:30:06 +0000 (11:30 +0000)]
[AArch64][SVE] Add out of range SVE arg CC test

Add calling convention test for callee functions that have SVE
parameters outside of the z0-z7 range

22 months ago[AARCH64][COST] Improve cost of reverse shuffles for AArch64
liqinweng [Thu, 8 Sep 2022 10:33:29 +0000 (18:33 +0800)]
[AARCH64][COST] Improve cost of reverse shuffles for AArch64

Update the comments for reverse shuffles and add tests

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D132730

22 months ago[RISCV][COST] Add cost model for mask vector select instruction when its condition...
liqinweng [Thu, 8 Sep 2022 10:25:20 +0000 (18:25 +0800)]
[RISCV][COST] Add cost model for mask vector select instruction when its condition is a scalar type

Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D132992

22 months ago[unittests] Change operands of Add in AsmWriterTest from Undef to Poison
Manuel Brito [Thu, 8 Sep 2022 10:35:59 +0000 (11:35 +0100)]
[unittests] Change operands of Add in AsmWriterTest from Undef to Poison

Replace UndefValue with PoisonValue in this test, where it is used as a dummy value,
in light of the efforts to remove undef from LLVM.

Differential Revision: https://reviews.llvm.org/D133481

22 months ago[gn build] port a0365abad811
Nico Weber [Thu, 8 Sep 2022 10:28:45 +0000 (06:28 -0400)]
[gn build] port a0365abad811

22 months ago[LLVM][ARM] Remove options for armv2, 2A, 3 and 3M
David Spickett [Thu, 1 Sep 2022 13:30:39 +0000 (13:30 +0000)]
[LLVM][ARM] Remove options for armv2, 2A, 3 and 3M

Fixes #57486

These pre-v4 architectures are not specifically supported
by codegen, as demonstrated in the linked issue.

GCC has not supported 3M since GCC 9 and presumably
2 and 2A earlier than that. So we are aligned in that sense.

(see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2abd6e34fcf3bd9f9ffafcaa47cdc3ed443f9add)

This removes the options and associated testing.

The Pre_v4 build attribute remains mainly because its absence
would be more confusing. It will not be used other than to
complete the list of build attributes as shown in the ABI.

https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#3352the-target-related-attributes

Reviewed By: nickdesaulniers, peter.smith, rengolin

Differential Revision: https://reviews.llvm.org/D133109

22 months agoFix clang-format.
Johannes Reifferscheid [Thu, 8 Sep 2022 09:04:33 +0000 (11:04 +0200)]
Fix clang-format.

22 months ago[AArch64] add i56 load store pair test case; NFC
chenglin.bi [Thu, 8 Sep 2022 08:52:25 +0000 (16:52 +0800)]
[AArch64] add i56 load store pair test case; NFC

22 months ago[Driver] Support -gz=zstd
Fangrui Song [Thu, 8 Sep 2022 08:39:06 +0000 (01:39 -0700)]
[Driver] Support -gz=zstd

The driver option translates to --compress-debug-sections=zstd cc1/cc1as/GNU
assembler/linker options.

`clang -g -gz=zstd -c a.c` generates ELFCOMPRESS_ZSTD compressed debug info
sections if compression decreases size.

22 months ago[ConstantExpr] Remove fneg expression
Nikita Popov [Wed, 7 Sep 2022 09:36:19 +0000 (11:36 +0200)]
[ConstantExpr] Remove fneg expression

As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
this removes the fneg constant expression (which is, incidentally,
the only unary operator expression).

Differential Revision: https://reviews.llvm.org/D133418

22 months agoOne-shot-bufferize: fix for inconsistent while arg types in before/after.
Johannes Reifferscheid [Thu, 8 Sep 2022 07:38:50 +0000 (09:38 +0200)]
One-shot-bufferize: fix for inconsistent while arg types in before/after.

Currently, if the `before` and `after` regions of a while op have
tensor args in different indices, this leads to a crash.

This moves the pass-through check for args to the handling of the
condition block, since that is where the results are produced, so
it's also where copies must be made.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D133477

22 months agoC++/ObjC++: switch to gnu++17 as the default standard
Fangrui Song [Thu, 8 Sep 2022 08:22:04 +0000 (08:22 +0000)]
C++/ObjC++: switch to gnu++17 as the default standard

Clang's default C++ standard is now `gnu++17` instead of `gnu++14`:
https://discourse.llvm.org/t/c-objc-switch-to-gnu-17-as-the-default-standard/64360

* CUDA/HIP are unchanged: C++14 from D103221.
* Sony PS4/PS5 are unchanged: https://discourse.llvm.org/t/c-objc-switch-to-gnu-17-as-the-default-standard/64360/6
* lit feature `default-std-cxx` is added to keep CLANG_DEFAULT_STD_CXX=xxx tests working.
  Whether the cmake variable should be retained is discussed in D133375.

Depends on D131464

Reviewed By: #clang-language-wg, aaron.ballman

Differential Revision: https://reviews.llvm.org/D131465

22 months agoImprove diagnostic when emitting operations with regions
Uday Bondhugula [Sat, 3 Sep 2022 05:56:29 +0000 (11:26 +0530)]
Improve diagnostic when emitting operations with regions

This has a broad impact on diagnostics that attach an operation. Ops
with one or more regions will now be printed on a new line. It was
confusing and hard to read with a trailing first line for ops
with regions.

Before:

```
<unknown>:0: note: see current operation: affine.for %arg3 = 0 to 8192 {
  affine.for %arg4 = 0 to 8192 step 512 {
    affine.for %arg5 = 0 to 8192 step 128 {

    ...
```

After:

```
<unknown>:0: note: see current operation:
affine.for %arg3 = 0 to 8192 {
  affine.for %arg4 = 0 to 8192 step 512 {
    affine.for %arg5 = 0 to 8192 step 128 {
      ...
```

Differential Revision: https://reviews.llvm.org/D132645

22 months ago[flang] Deallocate intent(out) allocatables
Valentin Clement [Thu, 8 Sep 2022 08:15:36 +0000 (10:15 +0200)]
[flang] Deallocate intent(out) allocatables

From Fortran 2018 standard 9.7.3.2 point 6:
When a procedure is invoked, any allocated allocatable object that is an actual
argument corresponding to an INTENT (OUT) allocatable dummy argument is
deallocated; any allocated allocatable object that is a subobject of an actual
argument corresponding to an INTENT (OUT) dummy argument is deallocated.

Deallocation is done on the callee side. For BIND(C) procedure, the deallocation
is also done on the caller side.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D133348

22 months ago[clang][Interp] Only initialize initmaps for primitive arrays
Timm Bäder [Thu, 8 Sep 2022 08:02:36 +0000 (10:02 +0200)]
[clang][Interp] Only initialize initmaps for primitive arrays

As the comment states, this code should only run for primitive arrays.

This should fix the memory sanitizer builds.

22 months ago[MC] Support writing ELFCOMPRESS_ZSTD compressed debug info sections
Fangrui Song [Thu, 8 Sep 2022 08:00:06 +0000 (01:00 -0700)]
[MC] Support writing ELFCOMPRESS_ZSTD compressed debug info sections

and add --compress-debug-sections=zstd to llvm-mc for testing.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D130724

22 months ago[llvm-objcopy] Support --{,de}compress-debug-sections for zstd
Fangrui Song [Thu, 8 Sep 2022 07:59:14 +0000 (00:59 -0700)]
[llvm-objcopy] Support --{,de}compress-debug-sections for zstd

Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal:
https://groups.google.com/g/generic-abi/c/satyPkuMisk
("Add new ch_type value: ELFCOMPRESS_ZSTD")

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Reviewed By: jhenderson, dblaikie

Differential Revision: https://reviews.llvm.org/D130458

22 months ago[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}
Fangrui Song [Thu, 8 Sep 2022 07:58:54 +0000 (00:58 -0700)]
[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}

as high-level API on top of `llvm::compression::{zlib,zstd}::*`:

* getReasonIfUnsupported: return nullptr if the specified format is
  supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...`
* compress: dispatch to zlib::compress or zstd::compress
* decompress: dispatch to zlib::uncompress or zstd::uncompress

Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic
dependency. There are 40+ uses in llvm-project.

Add another enum class `llvm::compression::Format` to represent supported
compression formats, which may be a superset of ELF compression formats.

See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use
case.
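
The shape of such a dispatch layer can be sketched as follows. This is a self-contained illustration only: the `sketch` namespace, the `kHaveZlib`/`kHaveZstd` flags, the `Bytes` alias and the tag-byte placeholder backends are all assumptions made for the example and do not reproduce the real LLVM signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace sketch {
enum class Format { Zlib, Zstd };
using Bytes = std::vector<std::uint8_t>;

// Assumed build configuration; a real build would derive this at configure time.
constexpr bool kHaveZlib = true;
constexpr bool kHaveZstd = false;

// Returns nullptr when the format is usable, otherwise a reason string.
inline const char *getReasonIfUnsupported(Format f) {
  switch (f) {
  case Format::Zlib:
    return kHaveZlib ? nullptr : "built without zlib support";
  case Format::Zstd:
    return kHaveZstd ? nullptr : "built without zstd support";
  }
  return "unknown format";
}

// Placeholder backend: prepends a tag byte instead of really compressing.
inline Bytes tagCompress(std::uint8_t tag, const Bytes &in) {
  Bytes out(1, tag);
  out.insert(out.end(), in.begin(), in.end());
  return out;
}
// Placeholder backend: strips the tag byte again.
inline Bytes tagDecompress(const Bytes &in) {
  return in.empty() ? Bytes() : Bytes(in.begin() + 1, in.end());
}

// compress dispatches to a compress backend, decompress to a decompress backend.
inline Bytes compress(Format f, const Bytes &in) {
  return tagCompress(f == Format::Zlib ? 0x01 : 0x02, in);
}
inline Bytes decompress(Format, const Bytes &in) { return tagDecompress(in); }
} // namespace sketch
```

A caller would check `getReasonIfUnsupported` first and report the returned string as a diagnostic, calling `compress` or `decompress` only when it returns `nullptr`.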

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

---

Note: this patch alone will cause a -Wswitch warning in llvm/lib/ObjCopy/ELF/ELFObject.cpp

Reviewed By: ckissane, dblaikie

Differential Revision: https://reviews.llvm.org/D130506

22 months agoRevert "C++/ObjC++: switch to gnu++17 as the default standard"
Nikita Popov [Thu, 8 Sep 2022 07:45:50 +0000 (09:45 +0200)]
Revert "C++/ObjC++: switch to gnu++17 as the default standard"

This reverts commit e321c8dd2cea8365045ed44ae1c3c00c6a977d2e.

This causes many failures in llvm-test-suite, for example:

    /home/npopov/repos/llvm-test-suite/build-O3/tools/timeit --summary MultiSource/Applications/lambda-0.1.3/CMakeFiles/lambda.dir/token_stream.cc.o.time /home/npopov/repos/llvm-project/build/bin/clang++ -DNDEBUG -I/home/npopov/repos/llvm-test-suite/MultiSource/Applications/lambda-0.1.3 -O3   -w -Werror=date-time -MD -MT MultiSource/Applications/lambda-0.1.3/CMakeFiles/lambda.dir/token_stream.cc.o -MF MultiSource/Applications/lambda-0.1.3/CMakeFiles/lambda.dir/token_stream.cc.o.d -o MultiSource/Applications/lambda-0.1.3/CMakeFiles/lambda.dir/token_stream.cc.o -c /home/npopov/repos/llvm-test-suite/MultiSource/Applications/lambda-0.1.3/token_stream.cc
    /home/npopov/repos/llvm-test-suite/MultiSource/Applications/lambda-0.1.3/token_stream.cc:192:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
            register char chr;
            ^~~~~~~~~

22 months agoRevert "[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}"
Nikita Popov [Thu, 8 Sep 2022 07:32:54 +0000 (09:32 +0200)]
Revert "[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}"

This reverts commit 19dc3cff0f771bb8933136ef68e782553e920d04.
This reverts commit 5b19a1f8e88da9ec92b995bfee90043795c2c252.
This reverts commit 9397648ac8ad192f7e6e6a8e6894c27bf7e024e9.
This reverts commit 10842b44759f987777b08e7714ef77da2526473a.

Breaks the GCC build, as reported here:
https://reviews.llvm.org/D130506#3776415

22 months ago[Support] Work around GCC's enum support
Fangrui Song [Thu, 8 Sep 2022 07:13:25 +0000 (00:13 -0700)]
[Support] Work around GCC's enum support

22 months ago[MC] Support writing ELFCOMPRESS_ZSTD compressed debug info sections
Fangrui Song [Thu, 8 Sep 2022 07:03:39 +0000 (00:03 -0700)]
[MC] Support writing ELFCOMPRESS_ZSTD compressed debug info sections

and add --compress-debug-sections=zstd to llvm-mc for testing.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D130724

22 months ago[llvm-objcopy] Support --{,de}compress-debug-sections for zstd
Fangrui Song [Thu, 8 Sep 2022 06:53:40 +0000 (23:53 -0700)]
[llvm-objcopy] Support --{,de}compress-debug-sections for zstd

Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal:
https://groups.google.com/g/generic-abi/c/satyPkuMisk
("Add new ch_type value: ELFCOMPRESS_ZSTD")

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Reviewed By: jhenderson, dblaikie

Differential Revision: https://reviews.llvm.org/D130458

22 months ago[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}
Fangrui Song [Thu, 8 Sep 2022 06:53:14 +0000 (23:53 -0700)]
[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}

as high-level API on top of `llvm::compression::{zlib,zstd}::*`:

* getReasonIfUnsupported: return nullptr if the specified format is
  supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...`
* compress: dispatch to zlib::compress or zstd::compress
* decompress: dispatch to zlib::uncompress or zstd::uncompress

Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic
dependency. There are 40+ uses in llvm-project.

Add another enum class `llvm::compression::Format` to represent supported
compression formats, which may be a superset of ELF compression formats.

See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use
case.

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Differential Revision: https://reviews.llvm.org/D130506

22 months ago[mlir][Math] Add constant folder for RoundOp.
jacquesguan [Wed, 7 Sep 2022 06:28:46 +0000 (14:28 +0800)]
[mlir][Math] Add constant folder for RoundOp.

This patch uses round/roundf of libm to fold RoundOp of constant.
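
As a rough illustration of what folding through libm implies for results, the sketch below applies `std::round` to a constant. The helper names `foldRoundF64` and `foldRoundF32` are hypothetical; the actual folder works on MLIR attributes rather than raw scalars.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar stand-ins for the constant folder.
// std::round rounds halfway cases away from zero, matching libm's round/roundf.
inline double foldRoundF64(double c) { return std::round(c); }
inline float foldRoundF32(float c) { return std::round(c); }
```

The point to note is the tie-breaking rule: `2.5` folds to `3.0` and `-2.5` folds to `-3.0`, unlike round-half-to-even.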

Differential Revision: https://reviews.llvm.org/D133401

22 months ago[LoongArch] Add codegen support for atomicrmw xchg operation on LA32
gonglingqin [Thu, 8 Sep 2022 05:56:59 +0000 (13:56 +0800)]
[LoongArch] Add codegen support for atomicrmw xchg operation on LA32

Depends on D131228

Differential Revision: https://reviews.llvm.org/D131229

22 months ago[LoongArch] Add codegen support for atomicrmw xchg operation on LA64
gonglingqin [Tue, 19 Jul 2022 08:39:05 +0000 (16:39 +0800)]
[LoongArch] Add codegen support for atomicrmw xchg operation on LA64

To keep the patch from growing too large, the atomicrmw xchg operation
on LA32 will be added later.

Differential Revision: https://reviews.llvm.org/D131228

22 months ago[MLIR] NFC: add back exports_files(["run_lit.sh"]).
Christian Sigg [Thu, 8 Sep 2022 05:46:10 +0000 (07:46 +0200)]
[MLIR] NFC: add back exports_files(["run_lit.sh"]).

22 months ago[clang][Interp][NFC] Use constexpr if when possible in Integral.h
Timm Bäder [Tue, 6 Sep 2022 07:34:23 +0000 (09:34 +0200)]
[clang][Interp][NFC] Use constexpr if when possible in Integral.h

22 months ago[clang][Interp][NFC] Context::classify() can be const
Timm Bäder [Mon, 29 Aug 2022 18:29:19 +0000 (20:29 +0200)]
[clang][Interp][NFC] Context::classify() can be const

22 months ago[clang][Interp] Implement array initializers and subscript expressions
Timm Bäder [Wed, 31 Aug 2022 14:09:40 +0000 (16:09 +0200)]
[clang][Interp] Implement array initializers and subscript expressions

Differential Revision: https://reviews.llvm.org/D132727

22 months ago[clang][Interp] Handle missing local initializers better
Timm Bäder [Mon, 29 Aug 2022 08:20:24 +0000 (10:20 +0200)]
[clang][Interp] Handle missing local initializers better

This is illegal in a constexpr context. We can already figure that out,
but we'd still run into an assertion later on when trying to visit the
missing initializer or run the invalid function.

Differential Revision: https://reviews.llvm.org/D132832

22 months ago[clang][Interp] Handle SubstNonTypeTemplateParmExprs
Timm Bäder [Sat, 27 Aug 2022 15:37:26 +0000 (17:37 +0200)]
[clang][Interp] Handle SubstNonTypeTemplateParmExprs

Differential Revision: https://reviews.llvm.org/D132831

22 months ago[clang][Interp] Implement ImplicitValueInitExprs
Timm Bäder [Sat, 27 Aug 2022 06:21:59 +0000 (08:21 +0200)]
[clang][Interp] Implement ImplicitValueInitExprs

Take the existing Zero opcode and emit it.

Differential Revision: https://reviews.llvm.org/D132829

22 months ago[clang][Interp] Implement IntegralToBoolean casts
Timm Bäder [Fri, 26 Aug 2022 13:39:17 +0000 (15:39 +0200)]
[clang][Interp] Implement IntegralToBoolean casts

Redo how we do IntegralCasts and implement IntegralToBoolean casts using
the already existing cast op.

Differential Revision: https://reviews.llvm.org/D132739

22 months ago[clang][Interp] Implement function calls
Timm Bäder [Fri, 19 Aug 2022 11:45:11 +0000 (13:45 +0200)]
[clang][Interp] Implement function calls

Add Call() and CallVoid() ops and use them to call functions. Only
FunctionDecls are supported for now.

Differential Revision: https://reviews.llvm.org/D132286

22 months ago[clang] Perform implicit lvalue-to-rvalue cast with new interpreter
Timm Bäder [Thu, 18 Aug 2022 14:06:08 +0000 (16:06 +0200)]
[clang] Perform implicit lvalue-to-rvalue cast with new interpreter

The EvaluateAsRValue() documentation mentions that an implicit
lvalue-to-rvalue cast is being performed if the result is an lvalue.
However, that was not being done if the new constant interpreter was in
use.

Just always do it.

Differential Revision: https://reviews.llvm.org/D132136

22 months ago[NewPM] Switch -filter-passes from ClassName to pass-name
Fangrui Song [Thu, 8 Sep 2022 05:02:26 +0000 (22:02 -0700)]
[NewPM] Switch -filter-passes from ClassName to pass-name

NewPM -filter-passes (D86360) uses ClassName instead of pass-name as used in
`-passes`, `-print-after`, etc. D87216 has added a mechanism to map
ClassName to pass-name. Adopt it for -filter-passes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133263

22 months ago[NFC][MachineFunctionPass] Only lookup pass name if we request printing
Arthur Eubanks [Thu, 8 Sep 2022 04:38:00 +0000 (21:38 -0700)]
[NFC][MachineFunctionPass] Only lookup pass name if we request printing

Should fix the small compile time regression reported in D133055.

22 months ago[X86] Pre-commit test for PR57576. NFC
Craig Topper [Thu, 8 Sep 2022 03:54:49 +0000 (20:54 -0700)]
[X86] Pre-commit test for PR57576. NFC

22 months ago[BOLT][TEST] Remove functions with dynamic exception specification
Amir Ayupov [Thu, 8 Sep 2022 02:17:39 +0000 (19:17 -0700)]
[BOLT][TEST] Remove functions with dynamic exception specification

Clang has switched to gnu++17 by default with https://reviews.llvm.org/D131465.
C++17 removes dynamic exception specifications. Remove their use as it wasn't
properly tested.
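
For readers unfamiliar with the removed feature, this small sketch contrasts the old spelling with what still compiles under C++17. The function names are made up for the example and are unrelated to the BOLT test sources.

```cpp
#include <cassert>

// Ill-formed since C++17 (kept as a comment so this snippet still compiles):
//   void mayThrow() throw(int);

// C++17-compatible spellings: only "may throw" versus "never throws" remains.
inline void mayThrow() noexcept(false) { throw 42; }
inline void neverThrows() noexcept {}

static_assert(!noexcept(mayThrow()), "mayThrow is potentially throwing");
static_assert(noexcept(neverThrows()), "neverThrows is non-throwing");
```

Calling `mayThrow()` inside a `try` block and catching `int` works as before; only the declaration syntax changed.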

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D133467

22 months ago[InstCombine] extractvalue (any_mul_with_overflow X, 2^n), 0 -> X << n
Chenbing Zheng [Thu, 8 Sep 2022 03:12:55 +0000 (11:12 +0800)]
[InstCombine] extractvalue (any_mul_with_overflow X, 2^n), 0 -> X << n

Alive2: https://alive2.llvm.org/ce/z/JLmabt (umul)
        https://alive2.llvm.org/ce/z/J_ruXR  (smul)
        https://alive2.llvm.org/ce/z/o9SVSz (vector)

Reviewed By: spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D133188
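
The identity behind the fold can be checked on plain integers. The sketch below uses 32-bit unsigned arithmetic as a stand-in for field 0 of the overflow intrinsic (the wrapped product); the function names are illustrative, and `n` is assumed to be smaller than the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Field 0 of a mul-with-overflow result: the product wrapped to the bit width.
inline std::uint32_t mulLowBits(std::uint32_t x, std::uint32_t y) {
  return x * y; // unsigned arithmetic wraps modulo 2^32
}

// The replacement expression: a plain left shift.
inline std::uint32_t shlBy(std::uint32_t x, unsigned n) { return x << n; }

// True when the fold is value-preserving for this x and n (requires n < 32).
inline bool foldHolds(std::uint32_t x, unsigned n) {
  return mulLowBits(x, std::uint32_t{1} << n) == shlBy(x, n);
}
```

Because the low bits of a two's complement product do not depend on signedness, the same reasoning covers both the umul and smul variants; only the overflow bit (field 1) differs.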

22 months ago[NFC][DSE] Add a masked dead store test that should rely on additional guards for...
Michael Berg [Thu, 8 Sep 2022 02:20:13 +0000 (19:20 -0700)]
[NFC][DSE] Add a masked dead store test that should rely on additional guards for removal.

22 months ago[libunwind] Fix a few libunwind includes
Ryan Prichard [Wed, 7 Sep 2022 21:27:57 +0000 (17:27 -0400)]
[libunwind] Fix a few libunwind includes

In UnwindCursor.hpp, include config.h before checking _LIBUNWIND_SUPPORT_SEH_UNWIND.

Include libunwind_ext.h for UNW_STEP_SUCCESS.

Differential Revision: https://reviews.llvm.org/D86766

22 months ago[flang] Write semantics test for atomic_fetch_add
Katherine Rasmussen [Wed, 31 Aug 2022 19:12:00 +0000 (12:12 -0700)]
[flang] Write semantics test for atomic_fetch_add

Write a semantics test for the atomic intrinsic subroutine,
atomic_fetch_add.

Reviewed By: rouson

Differential Revision: https://reviews.llvm.org/D133139

22 months agoApply clang-tidy fixes for readability-identifier-naming in CRunnerUtils.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 11:24:57 +0000 (11:24 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in CRunnerUtils.cpp (NFC)

22 months agoApply clang-tidy fixes for performance-unnecessary-value-param in VectorDistribute...
Mehdi Amini [Mon, 29 Aug 2022 11:22:27 +0000 (11:22 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in VectorDistribute.cpp (NFC)

22 months ago[mlir:PassTiming] Always use parentInfo for determining pipeline parent scope
River Riddle [Tue, 30 Aug 2022 22:25:55 +0000 (15:25 -0700)]
[mlir:PassTiming] Always use parentInfo for determining pipeline parent scope

This fixes a bug where, depending on thread usage, a pipeline may be
misattributed to a timer that wasn't its parent.

Differential Revision: https://reviews.llvm.org/D132979

22 months ago[BOLT] Distinguish sections in heatmap
Fabian Parzefall [Wed, 7 Sep 2022 23:28:01 +0000 (16:28 -0700)]
[BOLT] Distinguish sections in heatmap

Output different letters for different sections in the heatmap to
visually separate sections.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D133068

22 months ago[AMDGPU] Drop _oneuse checks from med3 patterns
Justin Bogner [Wed, 24 Aug 2022 22:35:02 +0000 (15:35 -0700)]
[AMDGPU] Drop _oneuse checks from med3 patterns

We use _oneuse checks to make sure combines won't accidentally
increase code size, but this prevents the optimization in cases where
we happen to want to clamp multiple values to the same range.

It's safe to drop these checks for two reasons:

1. The pattern of max/min operations for med3 is complicated enough
   it's unlikely to come up by accident, so this will still only fire
   when appropriate to do so
2. Even if every intermediate is used and we don't save a single
   operation, we still won't end up with more operations since the
   med3 replaces the final max/min.

In pathological cases we could potentially end up with a larger
encoding size or possibly slightly increased vgpr pressure, but the
risk of that is low, especially considering the upside.
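
To make the pattern concrete, here is a scalar sketch of the max/min chain and the median-of-three it is equivalent to. The helpers `med3` and `clampMinMax` are illustrative C++ functions, not the AMDGPU instruction or its selection patterns, and NaN handling is ignored.

```cpp
#include <algorithm>
#include <cassert>

// Median of three values, written with min/max only.
inline float med3(float a, float b, float c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// The clamp shape the pattern matches: min(max(x, lo), hi), assuming lo <= hi.
inline float clampMinMax(float x, float lo, float hi) {
  return std::min(std::max(x, lo), hi);
}
```

For `lo <= hi` the two agree on every input, so each clamp collapses to one median operation. When several values are clamped to the same `lo` and `hi`, the bounds have multiple uses, which is the kind of case a one-use restriction would block.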

Differential Revision: https://reviews.llvm.org/D132621

22 months ago[AMDGPU] Fix liveness verifier error in hazard recognizer
Stanislav Mekhanoshin [Wed, 7 Sep 2022 22:24:11 +0000 (15:24 -0700)]
[AMDGPU] Fix liveness verifier error in hazard recognizer

After D133067 we are inserting swaps to use a new physical
register. I have noticed verifier errors about undefined
physical register uses if we are tracking liveness post RA.

We have no access to LIS at this point, so mark new register
uses as undef to calm down the verifier. Liveness should not
matter at this point anyway.

Note the description of the RegState::Undef: "Value of the
register doesn't matter." I.e. it does not say it is strictly
undefined. In fact that is what we really need: this value
does not matter.

I also had to modify the test a bit since with tracking enabled
it does not pass verification even before the recognizer.

Differential Revision: https://reviews.llvm.org/D133459

22 months ago[libc][math] Implement asinf function correctly rounded for all rounding modes.
Tue Ly [Wed, 7 Sep 2022 06:20:45 +0000 (02:20 -0400)]
[libc][math] Implement asinf function correctly rounded for all rounding modes.

Implement asinf function correctly rounded for all rounding modes.

For `|x| <= 0.5`, we approximate `asin(x)` by
```
  asin(x) = x * P(x^2)
```
where `P(X^2) = Q(X)` is a degree-20 minimax even polynomial approximating
`asin(x)/x` on `[0, 0.5]` generated by Sollya with:
```
  > Q = fpminimax(asin(x)/x, [|0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20|],
                 [|1, D...|], [0, 0.5]);
```

When `|x| > 0.5`, we perform range reduction as follows:
Assume further that `0.5 < x <= 1`, and let:
```
  y = asin(x)
```
We will use the double angle formula:
```
  cos(2X) = 1 - 2 sin^2(X)
```
and the complement angle identity:
```
  x = sin(y) = cos(pi/2 - y)
              = 1 - 2 sin^2 (pi/4 - y/2)
```
So:
```
  sin(pi/4 - y/2) = sqrt( (1 - x)/2 )
```
And hence:
```
  pi/4 - y/2 = asin( sqrt( (1 - x)/2 ) )
```
Equivalently:
```
  asin(x) = y = pi/2 - 2 * asin( sqrt( (1 - x)/2 ) )
```
Let `u = (1 - x)/2`, then
```
  asin(x) = pi/2 - 2 * asin(sqrt(u))
```
Moreover, since `0.5 < x <= 1`,
```
  0 <= u < 1/4, and 0 <= sqrt(u) < 0.5.
```
And hence we can reuse the same polynomial approximation of `asin(x)` when
`|x| <= 0.5`:
```
  asin(x) = pi/2 - 2 * sqrt(u) * P(u).
```
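
The range reduction can be sketched in a few lines of C++. This is not the libc implementation: `asinCore` uses `std::asin` as a stand-in for the minimax polynomial `x * P(x^2)` (whose coefficients are not listed here), and the names `asinCore` and `asinReduced` are hypothetical. Inputs are assumed to satisfy `|x| <= 1`.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for x * P(x^2); only ever called with |x| <= 0.5.
inline double asinCore(double x) { return std::asin(x); }

// asin(x) for |x| <= 1 using the identity
//   asin(x) = pi/2 - 2 * asin(sqrt((1 - x)/2))   for 0.5 < x <= 1,
// extended to negative x by the odd symmetry asin(-x) = -asin(x).
inline double asinReduced(double x) {
  const double kHalfPi = 1.5707963267948966; // pi/2 rounded to double
  const double ax = std::fabs(x);
  if (ax <= 0.5)
    return asinCore(x);
  const double u = (1.0 - ax) / 2.0; // 0 <= u < 1/4, so sqrt(u) < 0.5
  const double r = kHalfPi - 2.0 * asinCore(std::sqrt(u));
  return std::copysign(r, x);
}
```

The core routine is only ever evaluated on `[-0.5, 0.5]`, which is what lets a single polynomial serve the whole domain.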

Performance benchmark using the `perf` tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf
CORE-MATH reciprocal throughput   : 23.418
System LIBC reciprocal throughput : 27.310
LIBC reciprocal throughput        : 22.741

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency   : 58.884
System LIBC latency : 62.055
LIBC latency        : 62.037
```

Reviewed By: orex, zimmermann6

Differential Revision: https://reviews.llvm.org/D133400

22 months ago[mlir][sparse] change variable dimension to fixed attribute pointers/indices
Aart Bik [Wed, 7 Sep 2022 22:08:29 +0000 (15:08 -0700)]
[mlir][sparse] change variable dimension to fixed attribute pointers/indices

The "sparsification" pass does not need the ability to use runtime values for
the dimension, so the only source for variability would have been user code.
Restricting the dimension to constants simplifies code generation.

Reviewed By: Peiming, wrengr

Differential Revision: https://reviews.llvm.org/D133458

22 months ago[libc] Return correct values for hypot when overflowed.
Tue Ly [Tue, 6 Sep 2022 18:18:18 +0000 (14:18 -0400)]
[libc] Return correct values for hypot when overflowed.

Hypot incorrectly returns +Inf when overflowed with FE_DOWNWARD and
FE_TOWARDZERO rounding modes.
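
The expected behaviour follows from the IEEE 754 overflow rules, which the sketch below encodes for a positive result such as hypot's. The function name `positiveOverflowResult` is hypothetical and the code is not taken from the libc sources; it assumes the platform defines all four `FE_*` rounding macros.

```cpp
#include <cassert>
#include <cfenv>
#include <limits>

// Value a positive overflowing result should take under each rounding mode.
inline double positiveOverflowResult(int roundingMode) {
  switch (roundingMode) {
  case FE_DOWNWARD:
  case FE_TOWARDZERO:
    // Rounding toward -inf or toward zero never reaches +inf:
    // the result is the largest finite value.
    return std::numeric_limits<double>::max();
  default: // FE_TONEAREST and FE_UPWARD
    return std::numeric_limits<double>::infinity();
  }
}
```

Since hypot never returns a negative value, the negative-overflow cases do not arise here.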

Reviewed By: sivachandra, zimmermann6

Differential Revision: https://reviews.llvm.org/D133370