platform/upstream/llvm.git
23 months ago[SelectOpti] Remove test on loop-level analysis
Sotiris Apostolakis [Wed, 17 Aug 2022 15:38:28 +0000 (15:38 +0000)]
[SelectOpti] Remove test on loop-level analysis

Remove a test that relied on the underlying instruction latency modeling.
Such dependency blocks efforts such as D79483 to improve this cost modeling.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132029

23 months ago[bazel] Add --config=ci
Arthur Eubanks [Wed, 17 Aug 2022 16:10:47 +0000 (09:10 -0700)]
[bazel] Add --config=ci

To speed up builds/tests.

23 months ago[MLIR]Add support for Arith MAX & MIN operations
Mats Petersson [Thu, 4 Aug 2022 19:11:32 +0000 (20:11 +0100)]
[MLIR]Add support for Arith MAX & MIN operations

Some of this is supported in various places, but the
basic conversion of single operations to LLVM was not supported.

Adding this to allow Flang to use these.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D131912

23 months ago[libc++][format] Uglyfies format buffer.
Mark de Wever [Sat, 13 Aug 2022 11:09:41 +0000 (13:09 +0200)]
[libc++][format] Uglyfies format buffer.

While working on D129964 I noticed some code hadn't been uglyfied, this
rectifies the issue.

Depends on D129964

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D131834

23 months ago[libc++][CI] Updates and improves the Docker image.
Mark de Wever [Sat, 6 Aug 2022 14:19:30 +0000 (16:19 +0200)]
[libc++][CI] Updates and improves the Docker image.

Since LLVM has branched, install Clang 16 and remove Clang 12.

Currently our Docker installs 4 versions of Clang so our CI can use the
same image for both the main and the release branch. This wasn't done for
the other Clang tools so they always use the same version for testing
the main and the release branch. Instead install 2 versions for the
tools.

However it seems the default for Clang and its tools were the latest
released version instead of the ToT. To lessen the risk of breaking the
release CI, version 14 is installed hard-coded as a temporary solution.

Updating the main branch to use the Clang 16 compiler will be done in a
separate patch.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D131324

23 months agoAdd N2653 to the C2x status page
Aaron Ballman [Wed, 17 Aug 2022 15:29:51 +0000 (11:29 -0400)]
Add N2653 to the C2x status page

This was accidentally missing from the Feb 2022 paper section.

23 months ago[ModuloSchedule] Add interface call to accept/reject SMS schedules
David Penry [Thu, 30 Jun 2022 18:03:50 +0000 (11:03 -0700)]
[ModuloSchedule] Add interface call to accept/reject SMS schedules

This interface allows a target to reject a proposed
SMS schedule.  For Hexagon/PowerPC, all schedules
are accepted, leaving behavior unchanged.  For ARM,
schedules which exceed register pressure limits are
rejected.

Also, two RegisterPressureTracker methods now need to be public so
that register pressure can be computed by more callers.
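
A minimal sketch of what such an accept/reject hook might look like. All names here (`ProposedSchedule`, `PipelinerHooks`, `shouldUseSchedule`, `PressureLimitedHooks`) are hypothetical and chosen for illustration; they are not taken from the patch:

```cpp
// Hypothetical summary of a proposed software-pipelined schedule.
struct ProposedSchedule {
  unsigned MaxRegisterPressure; // peak number of simultaneously live registers
};

// Hypothetical target hook interface; the default accepts every schedule,
// which leaves behavior unchanged for targets that do not override it.
struct PipelinerHooks {
  virtual ~PipelinerHooks() = default;
  virtual bool shouldUseSchedule(const ProposedSchedule &) const {
    return true;
  }
};

// A target that rejects schedules exceeding its register pressure limit.
struct PressureLimitedHooks : PipelinerHooks {
  unsigned Limit;
  explicit PressureLimitedHooks(unsigned L) : Limit(L) {}
  bool shouldUseSchedule(const ProposedSchedule &S) const override {
    return S.MaxRegisterPressure <= Limit;
  }
};
```

Keeping the default implementation permissive is what lets targets opt in one at a time.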

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D128941

23 months ago[lldb][Tests] Skip static-only tests in TestConstStaticIntegralMember.py for dsym...
Michael Buch [Tue, 16 Aug 2022 23:48:45 +0000 (00:48 +0100)]
[lldb][Tests] Skip static-only tests in TestConstStaticIntegralMember.py for dsym variant

This test fails for Clang versions < 14.0 for `dsym` variants.
`dsymutil` strips debug info for classes with only static members.
Thus move the failing assertions into the XFAIL test case.

Differential Revision: https://reviews.llvm.org/D132004

23 months ago[clang][llvm][NFC] Change misexpect's tolerance option to be 32-bit
Paul Kirth [Tue, 16 Aug 2022 01:14:28 +0000 (01:14 +0000)]
[clang][llvm][NFC] Change misexpect's tolerance option to be 32-bit

In D131869 we noticed that we jump through some hoops because we parse the
tolerance option used in MisExpect.cpp into a 64-bit integer. This is
unnecessary, since the value can only be in the range [0, 100).

This patch changes the underlying type to be 32-bit from where it is
parsed in Clang through to its use in LLVM.
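
To illustrate why 32 bits are plenty, here is a rough sketch of a range-checked parse. The helper `parseTolerance` is hypothetical and is not how Clang actually parses the option; it only shows that a value in [0, 100) fits trivially in a `uint32_t`:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: parse a tolerance percentage into a 32-bit value,
// accepting only decimal integers in the range [0, 100).
bool parseTolerance(const std::string &Text, std::uint32_t &Out) {
  // Values below 100 have at most two decimal digits.
  if (Text.empty() || Text.size() > 2)
    return false;
  std::uint32_t Value = 0;
  for (char C : Text) {
    if (C < '0' || C > '9')
      return false;
    Value = Value * 10 + static_cast<std::uint32_t>(C - '0');
  }
  Out = Value; // only written on success
  return true;
}
```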

Reviewed By: jloser

Differential Revision: https://reviews.llvm.org/D131935

23 months ago[InstrProf] Add option to avoid instrumenting small functions
Ellis Hoag [Wed, 17 Aug 2022 13:22:50 +0000 (06:22 -0700)]
[InstrProf] Add option to avoid instrumenting small functions

If a function only has a few instructions, instrumentation can significantly increase the size and performance overhead of that function. Add the `-pgo-function-size-threshold` option to select a size threshold so these small functions are not instrumented.

A similar option `-fxray-instruction-threshold=<N>` is used for XRay to reduce binary size overhead [1].

[1] https://www.llvm.org/docs/XRay.html

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D131816

23 months ago[CostModel][X86] Add cost kinds test coverage for integer arithmetic operators
Simon Pilgrim [Wed, 17 Aug 2022 13:36:21 +0000 (14:36 +0100)]
[CostModel][X86] Add cost kinds test coverage for integer arithmetic operators

23 months agoRe-apply "Deferred Concept Instantiation Implementation""
Erich Keane [Fri, 1 Jul 2022 18:21:21 +0000 (11:21 -0700)]
Re-apply "Deferred Concept Instantiation Implementation""

This reverts commit 258c3aee54e11bc5c5d8ac137eb15e8d5bbcc7e4.

This should fix the libc++ issue that caused the revert, by re-designing
slightly how we determined when we should evaluate the constraints.
Additionally, many of the other components to the original patch (the
NFC parts) were committed separately to shrink the size of this patch
for review.

Differential Revision: https://reviews.llvm.org/D126907

23 months agoFix unused variable (introduced in
Thomas Joerg [Wed, 17 Aug 2022 13:18:37 +0000 (15:18 +0200)]
Fix unused variable (introduced in
c248219b09c1e724468d4603f647466b3e282330)

23 months ago[AMDGPU][MC][NFC] Refine SMEM store, probe and discard definitions.
Ivan Kosarev [Wed, 17 Aug 2022 12:38:30 +0000 (13:38 +0100)]
[AMDGPU][MC][NFC] Refine SMEM store, probe and discard definitions.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D131968

23 months ago[RISCV] Avoid redundant branch-to-branch when expanding cmpxchg
Alex Bradbury [Wed, 17 Aug 2022 12:48:05 +0000 (13:48 +0100)]
[RISCV] Avoid redundant branch-to-branch when expanding cmpxchg

If the success value of a cmpxchg is used in a branch, the expanded
cmpxchg sequence ends up with a redundant branch-to-branch (as the
backend atomics expansion happens as late as possible, passes to
optimise such cases have already run). This patch identifies this case
and avoids it when expanding the cmpxchg.

Note that a similar optimisation is possible for a BEQ on the cmpxchg
success value. As it's hard to imagine a case where real-world code may
do that, this patch doesn't handle that case.
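
For illustration, source of roughly the following shape is an assumed example of the pattern described, where the success value of a compare-exchange feeds a branch directly. The function `tryClaim` is hypothetical, and the exact IR and assembly depend on the compiler version and flags:

```cpp
#include <atomic>

// Claims the slot by swapping 0 -> 1. The boolean result of the
// compare-exchange is used immediately as a branch condition.
int tryClaim(std::atomic<int> &Slot) {
  int Expected = 0;
  if (Slot.compare_exchange_strong(Expected, 1))
    return 1; // success path
  return 0;   // failure path: Slot already held a non-zero value
}
```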

Differential Revision: https://reviews.llvm.org/D130192

23 months ago[NFC][Flang] Add simd collapse test case
Dominik Adamski [Wed, 17 Aug 2022 08:49:46 +0000 (03:49 -0500)]
[NFC][Flang] Add simd collapse test case

Flang supports lowering collapse clause to MLIR for worksharing loops
and simd loops. Simd collapse clause is represented in MLIR code as a
simd-loop having a list of indices, bounds and steps where the size of the list
is equal to the collapse value.

Support for simd collapse clause was added by several patches:
https://reviews.llvm.org/D118065 -> basic support for simd-loop in MLIR.
                                    Loop collapsing is done in the same way as
                                    for worksharing loops:
                                    https://reviews.llvm.org/D105706
https://reviews.llvm.org/D125282 -> support for lowering simd clause from
                                    Fortran into MLIR
https://reviews.llvm.org/D125302 -> lowering collapse clause from Fortran to
                                    MLIR. Lowering collapse clause is done
                                    before simd-specific function is called.
https://reviews.llvm.org/D128338 -> modified the MLIR representation of
                                    collapse clause. Removed collapse keyword
                                    in OpenMP MLIR dialect. Use loop list to
                                    represent collapse clause. This loop list
                                    is created by changes from patch:
                                    https://reviews.llvm.org/D125302 and it is
                                    passed to function responsible for lowering
                                    of simd directive which was implemented in
                                    patch: https://reviews.llvm.org/D125282

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D132023

Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
23 months ago[pseudo] Eliminate the type-name identifier ambiguities in the grammar.
Haojian Wu [Tue, 16 Aug 2022 19:23:11 +0000 (21:23 +0200)]
[pseudo] Eliminate the type-name identifier ambiguities in the grammar.

See https://reviews.llvm.org/D130626 for motivation.

Identifiers in the grammar fall into different categories (type-name, template-name,
namespace-name), which require semantic information to resolve. This patch is
to eliminate the "local" ambiguities in type-name, and namespace-name, which
gives us a performance boost of the parser:

  - eliminate all different type rules (class-name, enum-name, typedef-name), and
    fold them into a unified type-name, this removes the #1 type-name ambiguity, and
    gives us a big performance boost;
  - remove the namespace-alias rules, as they're hard and uninteresting;

Note that we could eliminate more and gain more performance (like folding template-name,
type-name, and namespace-name together), but at the current stage, we'd like to keep all existing
categories of the identifier (as they might assist in correlated disambiguation &
keep the representation of important concepts uniform).

| file                 | ambiguous nodes | forest size      | glrParse performance |
| SemaCodeComplete.cpp | 11k -> 5.7k     | 10.4MB -> 7.9MB  | 7.1MB/s -> 9.98MB/s  |
| AST.cpp              | 1.3k -> 0.73k   | 0.99MB -> 0.77MB | 6.7MB/s -> 8.4MB/s   |

Differential Revision: https://reviews.llvm.org/D130747

23 months agoUpdate the status of some more C99 features
Aaron Ballman [Wed, 17 Aug 2022 12:11:07 +0000 (08:11 -0400)]
Update the status of some more C99 features

This also adds some test coverage to demonstrate we implement what was
standardized.

23 months ago[Driver] Override default location of config files
Serge Pavlov [Wed, 17 Aug 2022 04:08:30 +0000 (11:08 +0700)]
[Driver] Override default location of config files

If directory for config files was specified in project configuration
using parameters CLANG_CONFIG_FILE_SYSTEM_DIR or CLANG_CONFIG_FILE_USER_DIR,
it was not overridden by command-line option `--config-system-dir=` or
`--config-user-dir=` that specified empty path.

This change corrects the behavior. It fixes the issue
https://github.com/llvm/llvm-project/issues/56836 ([clang] [test]
test/Driver/config-file-errs.c fails if CLANG_CONFIG_FILE_SYSTEM_DIR is
specified).

23 months ago[CostModel][X86] intrinsic-cost-kinds.ll - add fcopysign costs tests
Simon Pilgrim [Wed, 17 Aug 2022 11:36:05 +0000 (12:36 +0100)]
[CostModel][X86] intrinsic-cost-kinds.ll - add fcopysign costs tests

23 months ago[flang]Avoid asking for operands when there are none
Mats Petersson [Tue, 9 Aug 2022 15:51:00 +0000 (16:51 +0100)]
[flang]Avoid asking for operands when there are none

Fix one encountered (issue #57072) and two potential scenarios where the
code would ask for an operand that isn't there.

Add test for the encountered case.

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D131671

23 months ago[LLVM][IndvarSimplify] Move test that requires X86
David Spickett [Wed, 17 Aug 2022 11:13:27 +0000 (11:13 +0000)]
[LLVM][IndvarSimplify] Move test that requires X86

This is failing on our bots that only build Arm/AArch64.

https://lab.llvm.org/buildbot/#/builders/171/builds/19033/steps/5/logs/FAIL__LLVM__pr57187_ll

23 months ago[AArch64] Add pattern for SQDML*Lv1i32_indexed
OverMighty [Wed, 17 Aug 2022 11:00:47 +0000 (12:00 +0100)]
[AArch64] Add pattern for SQDML*Lv1i32_indexed

There was no pattern to fold into these instructions. This patch adds
the pattern obtained from the following ACLE intrinsics so that they
generate sqdmlal/sqdmlsl instructions instead of separate sqdmull and
sqadd/sqsub instructions:
 - vqdmlalh_s16, vqdmlslh_s16
 - vqdmlalh_lane_s16, vqdmlalh_laneq_s16, vqdmlslh_lane_s16,
   vqdmlslh_laneq_s16 (when the lane index is 0)

It also modifies the result of the existing pattern for the latter, when
the lane index is not 0, to use the v1i32_indexed instructions instead
of the v4i16_indexed ones.

Fixes #49997.

Differential Revision: https://reviews.llvm.org/D131700

23 months ago[Sparc] Don't use SunStyleELFSectionSwitchSyntax
Rainer Orth [Wed, 17 Aug 2022 10:59:29 +0000 (12:59 +0200)]
[Sparc] Don't use SunStyleELFSectionSwitchSyntax

As discussed in D85414 <https://reviews.llvm.org/D85414>, two tests
currently `FAIL` on Sparc since that backend uses the Sun assembler syntax
for the `.section` directive, controlled by
`SunStyleELFSectionSwitchSyntax`.

Instead of adapting the affected tests, this patch changes that default.
The internal assembler still accepts both forms as input, only the output
syntax is affected.

Current support for the Sun syntax is cursory at best: the built-in
assembler cannot even assemble some of the directives emitted by GCC, and
the set supported by the Solaris assembler is even larger: SPARC Assembly
Language Reference Manual, 3.4 Pseudo-Op Attributes
<https://docs.oracle.com/cd/E37838_01/html/E61063/gmabi.html#scrolltoc>.

A few Sparc test cases need to be adjusted. At the same time, the patch
fixes the failures from D85414 <https://reviews.llvm.org/D85414>.

Tested on `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D85415

23 months ago[TTI] Remove getInstructionThroughput cost helper.
Simon Pilgrim [Wed, 17 Aug 2022 10:41:38 +0000 (11:41 +0100)]
[TTI] Remove getInstructionThroughput cost helper.

Pulled out of D79483 - we can just as easily use getUserCost directly

23 months ago[SLP] Update TODO comment about shuffle mask decoding
Simon Pilgrim [Wed, 17 Aug 2022 10:17:20 +0000 (11:17 +0100)]
[SLP] Update TODO comment about shuffle mask decoding

This is handled in ShuffleVectorInst/getShuffleCost - getInstructionThroughput is (slowly) being removed.

23 months ago[instcombine] Optimise for zero initialisation of product given fast flags are enabled
Zain Jaffal [Wed, 17 Aug 2022 10:12:15 +0000 (11:12 +0100)]
[instcombine] Optimise for zero initialisation of product given fast flags are enabled

Currently, Clang ignores the zero initialisation under finite math.
For example:

```
double f_prod = 0;
double arr[1000];
for (size_t i = 0; i < 1000; i++) {
  f_prod *= arr[i];
}
```
Clang will ignore that `f_prod` is set to zero and it will generate assembly to iterate over the loop.
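
As a small worked check of why the fold is sound only under finite-math assumptions: with finite inputs every partial product stays zero, whereas under strict IEEE semantics `0 * NaN` and `0 * infinity` are both NaN, so the loop cannot be dropped in general. The function name `productFromZero` is made up for this sketch:

```cpp
#include <cstddef>

// Multiplies every element into an accumulator that starts at zero.
// With only finite inputs (no NaN or infinity), every partial product
// compares equal to zero, so the loop contributes nothing.
double productFromZero(const double *Arr, std::size_t N) {
  double Prod = 0;
  for (std::size_t I = 0; I < N; ++I)
    Prod *= Arr[I];
  return Prod;
}
```

Note that a negative element yields `-0.0`, which still compares equal to `0.0`.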

Reviewed By: fhahn, spatel

Differential Revision: https://reviews.llvm.org/D131672

23 months ago[flang] Add Solaris/x86 support to Optimizer/CodeGen/Target.cpp
Rainer Orth [Wed, 17 Aug 2022 09:54:38 +0000 (11:54 +0200)]
[flang] Add Solaris/x86 support to Optimizer/CodeGen/Target.cpp

When testing LLVM 15.0.0 rc1 on Solaris, I found that 50+ flang tests
`FAIL`ed with

  error:
/vol/llvm/src/llvm-project/local/flang/lib/Optimizer/CodeGen/Target.cpp:310:
not yet implemented: target not implemented

This patch fixes that for Solaris/x86, where the fix is trivial (just
handling it like the other x86 OSes).

Tested on `amd64-pc-solaris2.11`; only a single failure remains now.

Differential Revision: https://reviews.llvm.org/D131054

23 months ago[NFC][OpenMP] Update simd loop collapse support description
Dominik Adamski [Tue, 16 Aug 2022 10:18:14 +0000 (05:18 -0500)]
[NFC][OpenMP] Update simd loop collapse support description

Simdloop collapse clause is supported in the same way
as collapse clause for worksharing loops.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D131674

Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
23 months ago[TypePromotion] Don't promote PHI + ZExt if wider than RegisterBitWidth
Andre Vieira [Wed, 17 Aug 2022 08:54:15 +0000 (09:54 +0100)]
[TypePromotion] Don't promote PHI + ZExt if wider than RegisterBitWidth

Differential Revision: https://reviews.llvm.org/D131966

23 months ago[LLDB][ARM] Remove expected failure from AnonTypedef test
David Spickett [Wed, 17 Aug 2022 08:51:57 +0000 (08:51 +0000)]
[LLDB][ARM] Remove expected failure from AnonTypedef test

Thanks to ff9efe240c4711572d2892f9058fd94a8bd5336e this test
is now passing.

https://lab.llvm.org/buildbot/#/builders/17/builds/26270

23 months ago[LAA] Handle forked pointers with add/sub instructions
Graham Hunter [Thu, 21 Jul 2022 13:22:27 +0000 (14:22 +0100)]
[LAA] Handle forked pointers with add/sub instructions

Handle cases where a forked pointer has an add or sub instruction
before reaching a select.

Reviewed By: fhahn
Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D130278

23 months ago[Test] Add miscompiled test for PR57187
Max Kazantsev [Wed, 17 Aug 2022 08:25:47 +0000 (15:25 +0700)]
[Test] Add miscompiled test for PR57187

Details at https://github.com/llvm/llvm-project/issues/57187

23 months agoRevert "DynamicMemRefType: iteration and access by indices"
Thomas Joerg [Wed, 17 Aug 2022 08:35:00 +0000 (10:35 +0200)]
Revert "DynamicMemRefType: iteration and access by indices"

This reverts commit b8ecf32f81bb8073320ad5d4722a1680f615d133.

This commit introduces undefined behavior according to UBSan:
UndefinedBehaviorSanitizer: nullptr-with-nonzero-offset third_party/llvm/llvm-project/mlir/include/mlir/ExecutionEngine/CRunnerUtils.h:377:5

23 months ago[clang] fix a typo in da6187f566b7881cb835
Yuanfang Chen [Wed, 17 Aug 2022 08:26:52 +0000 (01:26 -0700)]
[clang] fix a typo in da6187f566b7881cb835

23 months ago[clang] Apply FixIts to members declared via `using` in derived classes
Denis Fatkulin [Wed, 17 Aug 2022 08:08:19 +0000 (10:08 +0200)]
[clang] Apply FixIts to members declared via `using` in derived classes

FixIts didn't switch `.` to `->` for members brought into a derived class with `using`.

Example code:
```
struct Bar {
  void foo();
};
struct Baz : Bar {
  using Bar::foo;
};
void test(Baz* ptr) {
  ptr.^
}
```

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D131088

23 months ago[clang] Give priority to Class context while parsing declarations
Furkan Usta [Wed, 17 Aug 2022 07:44:17 +0000 (09:44 +0200)]
[clang] Give priority to Class context while parsing declarations

Fixes https://github.com/clangd/clangd/issues/290.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D130363

23 months ago[lldb][ClangExpression] Add asm() label to all FunctionDecls we create from DWARF
Michael Buch [Tue, 16 Aug 2022 15:35:20 +0000 (16:35 +0100)]
[lldb][ClangExpression] Add asm() label to all FunctionDecls we create from DWARF

When resolving symbols during IR execution, lldb makes a last effort attempt
to resolve external symbols from object files by approximate name matching.
It currently uses `CPlusPlusNameParser` to parse the demangled function name
and arguments for the unresolved symbol and its candidates. However, this
hand-rolled C++ parser doesn’t support ABI tags which, depending on the demangler,
get demangled into `[abi:tag]`. This lack of parsing support causes lldb to never
consider a candidate mangled function name that has ABI tags.

The issue reproduces by calling an ABI-tagged template function from the
expression evaluator. This is particularly problematic with the recent
addition of ABI tags to numerous libcxx APIs.

The issue stems from the fact that `clang::CodeGen` emits
function calls using the mangled name inferred from the `FunctionDecl`
LLDB constructs from DWARF. Debug info often lacks information for
us to construct a perfect FunctionDecl resulting in subtle mangled
name inaccuracies.

This patch side-steps the problem of inaccurate `FunctionDecl`s by
attaching an `asm()` label to each `FunctionDecl` LLDB creates from DWARF.
`clang::CodeGen` consults this label to get the mangled name as one of
the first courses of action when emitting a function call.

LLDB already does this for C++ member functions as of
[675767a5910d2ec77ef8b51c78fe312cf9022896](https://reviews.llvm.org/D40283)

**Testing**

* Added API tests

Differential Revision: https://reviews.llvm.org/D131974

23 months ago[libc][NFC] Make IntegerToString simpler to use at call-sites.
Siva Chandra Reddy [Fri, 12 Aug 2022 21:26:22 +0000 (21:26 +0000)]
[libc][NFC] Make IntegerToString simpler to use at call-sites.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D131943

23 months ago[clangd] Support for standard type hierarchy
Kadir Cetinkaya [Mon, 8 Aug 2022 09:22:31 +0000 (11:22 +0200)]
[clangd] Support for standard type hierarchy

This is mostly a mechanical change to adapt standard type hierarchy
support proposed in LSP 3.17 on top of clangd's existing extension support.

This does mainly two things:
- Incorporate symbol IDs for all the parents inside resolution parameters, so
  that they can be retrieved from index later on. This is a new code path, as
  extension always resolved them eagerly.
- Propagate parent information when resolving children, so that at least one
  branch of parents is always preserved. This is to address a shortcoming in the
  extension.

This doesn't drop support for the extension, but it's deprecated from now on and
will be deleted in upcoming releases. Currently we use the same struct
internally but don't serialize extra fields.

Fixes https://github.com/clangd/clangd/issues/826.

Differential Revision: https://reviews.llvm.org/D131385

23 months ago[CMake] Explicit bootstrap options override any passthrough ones.
Carlos Alberto Enciso [Wed, 17 Aug 2022 07:16:10 +0000 (08:16 +0100)]
[CMake] Explicit bootstrap options override any passthrough ones.

https://reviews.llvm.org/D53014 added CMAKE_BUILD_TYPE to
the list of BOOTSTRAP_DEFAULT_PASSTHROUGH variables.

The downside is that both stage-1 and stage-2 configurations
are the same. So it is not possible to build different stage
configurations.

This patch allows explicit bootstrap options to override any
passthrough ones.

For instance, the following settings would build:
  stage-1 (Release) and stage-2(Debug)

-DCMAKE_BUILD_TYPE=Release -DBOOTSTRAP_CMAKE_BUILD_TYPE=Debug

Reviewed By: @beanz

Differential Revision: https://reviews.llvm.org/D131755

23 months ago[ELF] Fix .plt.got comments. NFC
Fangrui Song [Wed, 17 Aug 2022 06:29:01 +0000 (23:29 -0700)]
[ELF] Fix .plt.got comments. NFC

23 months ago[LLDB][JIT] Set processor for ARM architecture
Pavel Kosov [Wed, 17 Aug 2022 06:10:21 +0000 (09:10 +0300)]
[LLDB][JIT] Set processor for ARM architecture

This patch sets the ARM CPU before compiling JIT code. This enables FastISel for armv6 and higher CPUs and allows using the hardware FPU.

~~~

OS Laboratory. Huawei RRI. Saint-Petersburg

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D131783

23 months ago[lld-macho] Honor weak and thread-local flags for TAPI symbols
Daniel Bertalan [Tue, 16 Aug 2022 21:09:46 +0000 (23:09 +0200)]
[lld-macho] Honor weak and thread-local flags for TAPI symbols

Differential Revision: https://reviews.llvm.org/D131995

23 months ago[RISCV] Allow lowerSELECT to fold integer setcc with FP select.
Craig Topper [Wed, 17 Aug 2022 04:28:51 +0000 (21:28 -0700)]
[RISCV] Allow lowerSELECT to fold integer setcc with FP select.

We'd pick it up in DAG combine later even if we didn't handle it here.
No test changes because we get it in DAG combine anyway.

23 months ago[RISCV] Add test coverage for (select (icmp X, Y), float, float). NFC
Craig Topper [Wed, 17 Aug 2022 04:23:09 +0000 (21:23 -0700)]
[RISCV] Add test coverage for (select (icmp X, Y), float, float). NFC

We fold integer setcc into SELECT_CC during DAG combine even if
the SELECT_CC has FP result type, but we had no test coverage.

23 months agofold assert-only variable into assert to address non-assert -Wunused-variable
David Blaikie [Wed, 17 Aug 2022 04:10:59 +0000 (04:10 +0000)]
fold assert-only variable into assert to address non-assert -Wunused-variable
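
The general pattern behind this kind of fix, sketched on a made-up function (`firstElement` is hypothetical and unrelated to the code touched by the commit):

```cpp
#include <cassert>
#include <vector>

// Before: a local that is only read inside assert() becomes unused when
// NDEBUG compiles the assert away, which triggers -Wunused-variable:
//   bool NonEmpty = !V.empty();
//   assert(NonEmpty && "expected non-empty input");
// After: the expression is folded into the assert itself.
int firstElement(const std::vector<int> &V) {
  assert(!V.empty() && "expected non-empty input");
  return V.front();
}
```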

23 months agoRevert "[AArch64] Add `foldCSELOfCSEl` DAG combine"
Vitaly Buka [Wed, 17 Aug 2022 03:29:37 +0000 (20:29 -0700)]
Revert "[AArch64] Add `foldCSELOfCSEl` DAG combine"

Breaks ubsan on buildbot, details in D125504

This reverts commit 6f9423ef06926a70af84b77cb290c91214cf791a.

23 months ago[clang-format] Handle comments between access specifier and colon
owenca [Tue, 16 Aug 2022 04:30:36 +0000 (21:30 -0700)]
[clang-format] Handle comments between access specifier and colon

Fixes #56740.

Differential Revision: https://reviews.llvm.org/D131940

23 months ago[RISCV] Reuse existing VT variable instead of calling getValueType() repeatedly. NFC
Craig Topper [Wed, 17 Aug 2022 02:56:33 +0000 (19:56 -0700)]
[RISCV] Reuse existing VT variable instead of calling getValueType() repeatedly. NFC

23 months ago[Clang] followup D128745, add a missing ClangABICompat check
Yuanfang Chen [Wed, 17 Aug 2022 01:28:49 +0000 (18:28 -0700)]
[Clang] followup D128745, add a missing ClangABICompat check

23 months agoRevert "[LLDB][NFC] Fix optons parsing and misc. reliability in CommandObjectThread"
Stella Stamenova [Wed, 17 Aug 2022 01:11:28 +0000 (18:11 -0700)]
Revert "[LLDB][NFC] Fix optons parsing and misc. reliability in CommandObjectThread"

This very much non-NFC change broke the windows lldb bot: https://lab.llvm.org/buildbot/#/builders/83/builds/22557

This reverts commit 461b410159426fdc6da77e0fb653737e04e0ebe9.

23 months ago[lldb] Automatically unwrap parameter packs in template argument accessors
Jonas Devlieghere [Wed, 17 Aug 2022 00:53:34 +0000 (17:53 -0700)]
[lldb]  Automatically unwrap parameter packs in template argument accessors

When looking at template arguments in LLDB, we usually care about what
the user passed in their code, not whether some of those arguments were
passed as a variadic parameter pack.

This patch extends all the C++ APIs to look at template parameters to
take an additional 'expand_pack' boolean that automatically unwraps the
potential argument packs. The equivalent SBAPI calls have been changed
to pass true for this parameter.
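
A rough model of what an `expand_pack`-style flag does to an accessor. The `TemplateArg` struct and `numTemplateArgs` function are hypothetical stand-ins for illustration, not LLDB's actual types or APIs:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: a template argument is either a single type name or
// a parameter pack holding several type names.
struct TemplateArg {
  std::string Name;              // used when IsPack is false
  std::vector<std::string> Pack; // elements, used when IsPack is true
  bool IsPack;
};

// Counts arguments. With ExpandPack set, a pack contributes its elements
// instead of counting as a single argument.
std::size_t numTemplateArgs(const std::vector<TemplateArg> &Args,
                            bool ExpandPack) {
  std::size_t Count = 0;
  for (const TemplateArg &A : Args)
    Count += (ExpandPack && A.IsPack) ? A.Pack.size() : 1;
  return Count;
}
```

For something like `Tuple<int, char, long>` where the last two arguments arrive as a pack, the unexpanded count is 2 and the expanded count is 3.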

A byproduct of the patch is to also fix the support for template types
that have only a parameter pack as argument (like the OnlyPack type in
the test). Those were not recognized as template instantiations before.

The added test verifies that the SBAPI is able to iterate over the
arguments of a variadic template.

The original patch was written by Fred Riss almost 4 years ago.

Differential revision: https://reviews.llvm.org/D51387

23 months ago[RISCV] Add scheduling class for vector pseudo segment instructions.
Monk Chiang [Thu, 21 Jul 2022 02:30:43 +0000 (19:30 -0700)]
[RISCV] Add scheduling class for vector pseudo segment instructions.

D128886 added scheduling resources for vector segment load/store instructions,
but I missed adding scheduling resources for the pseudo segment instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D130222

23 months agoDebugInfo: Remove auto return type representation support
David Blaikie [Mon, 15 Aug 2022 23:47:23 +0000 (23:47 +0000)]
DebugInfo: Remove auto return type representation support

Seems this complicated lldb sufficiently for some cases that it hasn't
been worth supporting/fixing there - and it so far hasn't provided any
new use cases/value for debug info consumers, so let's remove it until
someone has a use case for it.

(Side note: the original implementation of this still had a bug (I
should've caught it in review): we still didn't produce
auto-returning function declarations in types where the function wasn't
instantiated. That requires a fix to remove the `if
getContainedAutoType` condition in
`CGDebugInfo::CollectCXXMemberFunctions`. Without that, auto-returning
functions were still being handled the same as member function templates
and special member functions: never added to the member list, only
attached to the type via the declaration chain from the definition.)

Further discussion about this in D123319

This reverts commit 5ff992bca208a0e37ca6338fc735aec6aa848b72: [DEBUG-INFO] Change how we handle auto return types for lambda operator() to be consistent with gcc

This reverts commit c83602fdf51b2692e3bacb06bf861f20f74e987f: [DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions.

Differential Revision: https://reviews.llvm.org/D131933

23 months ago[LLDB][NativePDB] Add nullptr checking.
Zequan Wu [Wed, 17 Aug 2022 00:01:41 +0000 (17:01 -0700)]
[LLDB][NativePDB] Add nullptr checking.

23 months agoLibfuzzer fix for Ctrl + c not working with -fork and -ignore_crashes=1
Maxim Schessler [Mon, 15 Aug 2022 18:44:06 +0000 (11:44 -0700)]
Libfuzzer fix for Ctrl + c not working with -fork and -ignore_crashes=1

In some cases, running Libfuzzer in fork mode with -ignore_crashes=1 counts Ctrl+C as a crash and restarts.

Thread: https://github.com/google/oss-fuzz/issues/4547

Credit: Marcel Boehme <marcel.boehme@acm.org>

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130990

23 months ago[RISCV] Add test cases to show missed opportunity to fold (sub C, (xor (setcc), 1...
Craig Topper [Tue, 16 Aug 2022 23:11:37 +0000 (16:11 -0700)]
[RISCV] Add test cases to show missed opportunity to fold (sub C, (xor (setcc), 1)). NFC

(sub C, (xori X, 1)) can be folded to (add X, C-1) if X is 0 or 1.

This would avoid the xori and in some cases remove an instruction
needed to materialize the constant.
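
The identity can be checked directly: for X in {0, 1}, `X ^ 1` equals `1 - X`, so `C - (X ^ 1) = C - 1 + X`. A small sketch that verifies this for a given constant (the helper name `foldHolds` is made up):

```cpp
// Checks C - (X ^ 1) == X + (C - 1) for both X = 0 and X = 1.
// Unsigned arithmetic is used so wraparound is well defined.
bool foldHolds(unsigned C) {
  for (unsigned X = 0; X <= 1; ++X)
    if (C - (X ^ 1u) != X + (C - 1u))
      return false;
  return true;
}
```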

23 months ago[InstCombine] convert second std::min argument to same type as first
Martin Sebor [Tue, 16 Aug 2022 23:10:10 +0000 (17:10 -0600)]
[InstCombine] convert second std::min argument to same type as first

Ensure both arguments to std::min have the same type in all data models.
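
A short illustration of the underlying issue, using a hypothetical function `clampLength` rather than the code touched by the commit. `std::min` deduces a single template type from both arguments, so mixing two integer types compiles only on platforms where they happen to be the same type:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// std::min(Len, Limit) would fail to compile wherever size_t and uint64_t
// are distinct types. Converting the second argument to the type of the
// first makes the call valid in every data model.
std::size_t clampLength(std::size_t Len, std::uint64_t Limit) {
  return std::min(Len, static_cast<std::size_t>(Limit));
}
```

The cast can truncate when `Limit` exceeds the range of `size_t`, so this form is only appropriate when that cannot happen or does not matter.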

23 months ago[compiler-rt] Build with C++17 explicitly
Shoaib Meenai [Tue, 16 Aug 2022 23:24:42 +0000 (16:24 -0700)]
[compiler-rt] Build with C++17 explicitly

We've started using C++17 constructs in compiler-rt now (e.g.
string_view in ORC), but when using the bootstrapping build, we won't
inherit the C++ standard from LLVM, and compilation may fail if we
default to an older standard. Explicitly build compiler-rt with C++17 in
a standalone build, which matches what other subprojects (e.g. Clang and
LLD) do.

23 months agoUntangle the mess which is MachineBasicBlock::hasAddressTaken().
Eli Friedman [Tue, 16 Aug 2022 23:15:44 +0000 (16:15 -0700)]
Untangle the mess which is MachineBasicBlock::hasAddressTaken().

There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code.  Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.

Mixing these together causes a problem: if target-specific passes are
marking random blocks "address-taken", if we have a BlockAddress, we
can't actually tell which MachineBasicBlock corresponds to the
BlockAddress.

So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.
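
A simplified sketch of the split, assuming a plain struct with two flags. The names `BlockFlags`, `IRBlockAddressTaken` and `MachineBlockAddressTaken` are illustrative and should not be read as the exact members added by the patch:

```cpp
// Hypothetical model of a machine basic block carrying two separate bits.
struct BlockFlags {
  bool IRBlockAddressTaken = false;      // referenced by an IR BlockAddress
  bool MachineBlockAddressTaken = false; // marked by target-specific code

  // Combined query for callers that only care whether any address exists.
  bool hasAddressTaken() const {
    return IRBlockAddressTaken || MachineBlockAddressTaken;
  }
};
```

With separate bits, a pass looking for the block behind a BlockAddress can check only the IR-level flag and is not confused by blocks that a target marked for its own control-flow constructs.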

Discovered while trying to sort out related stuff on D102817.

Differential Revision: https://reviews.llvm.org/D124697

23 months ago[Clang][BPF]: Force sign/zero extension for return values in caller
Yonghong Song [Tue, 9 Aug 2022 00:33:23 +0000 (17:33 -0700)]
[Clang][BPF]: Force sign/zero extension for return values in caller

Currently bpf supports calling kernel functions (x86_64, arm64, etc.)
in bpf programs. Tejun discovered a problem where the x86_64 func
return value (an unsigned char type) is stored in 8-bit subregister %al
and the other 56 bits in %rax might be garbage. But based on current
bpf ABI, the bpf program assumes the whole %rax holds the correct value
as the callee is supposed to do necessary sign/zero extension.
This mismatch between bpf and x86_64 caused the incorrect results.

To resolve this problem, this patch forces the caller to do the needed
sign/zero extension for 8/16-bit return values as well. Note that
32-bit return values already had sign/zero extension even without
this patch.

For example, for the test case attached to this patch:

  $  cat t.c
  _Bool bar_bool(void);
  unsigned char bar_char(void);
  short bar_short(void);
  int bar_int(void);
  int foo_bool(void) {
        if (bar_bool() != 1) return 0; else return 1;
  }
  int foo_char(void) {
        if (bar_char() != 10) return 0; else return 1;
  }
  int foo_short(void) {
        if (bar_short() != 10) return 0; else return 1;
  }
  int foo_int(void) {
        if (bar_int() != 10) return 0; else return 1;
  }

Without this patch, the generated call insns in IR look like:
    %call = call zeroext i1 @bar_bool()
    %call = call zeroext i8 @bar_char()
    %call = call signext i16 @bar_short()
    %call = call i32 @bar_int()
So it is assumed that zero extension has been done for the return values
of bar_bool() and bar_char(). Sign extension has been done for the return
value of bar_short(). The return value of bar_int() carries no
assumption, so the caller needs to do the necessary shifting to get the
correct 32-bit value.

With this patch, the generated call insns in IR look like:
    %call = call i1 @bar_bool()
    %call = call i8 @bar_char()
    %call = call i16 @bar_short()
    %call = call i32 @bar_int()
There are no assumptions for return values of the above four function calls,
so shifting or masking is necessary for all of them.

The following is the objdump file difference for function foo_char().
Without this patch:
  0000000000000010 <foo_char>:
       2:       85 10 00 00 ff ff ff ff call -1
       3:       bf 01 00 00 00 00 00 00 r1 = r0
       4:       b7 00 00 00 01 00 00 00 r0 = 1
       5:       15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
       6:       b7 00 00 00 00 00 00 00 r0 = 0
  0000000000000038 <LBB1_2>:
       7:       95 00 00 00 00 00 00 00 exit

With this patch:
  0000000000000018 <foo_char>:
       3:       85 10 00 00 ff ff ff ff call -1
       4:       bf 01 00 00 00 00 00 00 r1 = r0
       5:       57 01 00 00 ff 00 00 00 r1 &= 255
       6:       b7 00 00 00 01 00 00 00 r0 = 1
       7:       15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
       8:       b7 00 00 00 00 00 00 00 r0 = 0
  0000000000000048 <LBB1_2>:
       9:       95 00 00 00 00 00 00 00 exit
The zero extension of the return 'char' value is done here.
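
The effect of the extra `r1 &= 255` can be modelled in plain C++. This is a
sketch under stated assumptions: the register contents and all function
names below are made up for illustration and are not part of the patch.

```cpp
#include <cstdint>

// Models r0 after a native callee returned the 8-bit value 10 in %al and left
// unrelated bits in the rest of %rax. The upper bits are invented.
inline std::uint64_t rawCharReturn() { return 0xDEADBEEFCAFE000AULL; }

// Without the patch: the caller compares the whole 64-bit register.
inline bool isTenNoExtension(std::uint64_t R0) { return R0 == 10; }

// With the patch: the caller zero-extends the low 8 bits first.
inline bool isTenCallerZext8(std::uint64_t R0) { return (R0 & 0xff) == 10; }

// Caller-side sign extension of a 16-bit return value, written without
// relying on implementation-defined shifts or narrowing conversions.
inline std::int64_t callerSext16(std::uint64_t R0) {
  std::uint64_t Low = R0 & 0xffff;
  return (Low & 0x8000) ? static_cast<std::int64_t>(Low) - 0x10000
                        : static_cast<std::int64_t>(Low);
}
```

With the made-up register value, the unextended comparison fails even though
the callee returned 10, while the masked comparison succeeds.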

Differential Revision: https://reviews.llvm.org/D131598

23 months agoRecommit "[RISCV] Use setcc's original SDLoc when inverting it in performSUBCombine."
Craig Topper [Tue, 16 Aug 2022 22:50:24 +0000 (15:50 -0700)]
Recommit "[RISCV] Use setcc's original SDLoc when inverting it in performSUBCombine."

This time using N1 instead of N0 since N1 points to the original
setcc. This now affects scheduling as I expected.

Original commit message:
We change seteq<->setne but it doesn't change the semantics
of the setcc. We should keep original debug location. This is
consistent with visitXor in the generic DAGCombiner.

23 months agoRevert "[RISCV] Use setcc's original SDLoc when inverting it in performSUBCombine."
Craig Topper [Tue, 16 Aug 2022 22:46:21 +0000 (15:46 -0700)]
Revert "[RISCV] Use setcc's original SDLoc when inverting it in performSUBCombine."

This reverts commit 1380b21ceba7b7b19e960da5df68dcd5cba1b091.

I mixed up N0 and N1 and didn't do what I intended.

23 months ago[InstCombine] Add support for strlcpy folding
Martin Sebor [Wed, 27 Jul 2022 21:55:22 +0000 (15:55 -0600)]
[InstCombine] Add support for strlcpy folding

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D130666

23 months ago[RISCV] Use setcc's original SDLoc when inverting it in performSUBCombine.
Craig Topper [Tue, 16 Aug 2022 22:19:51 +0000 (15:19 -0700)]
[RISCV] Use setcc's original SDLoc when inverting it in performSUBCombine.

We change seteq<->setne but it doesn't change the semantics
of the setcc. We should keep original debug location. This is
consistent with visitXor in the generic DAGCombiner.

23 months ago[libc][Obvious] Rearrange few header targets to satisfy dependency order.
Siva Chandra Reddy [Tue, 16 Aug 2022 22:32:29 +0000 (22:32 +0000)]
[libc][Obvious] Rearrange few header targets to satisfy dependency order.

23 months ago[LLDB][NFC] Fix optons parsing and misc. reliability in CommandObjectThread
Slava Gurevich [Mon, 15 Aug 2022 08:51:49 +0000 (01:51 -0700)]
[LLDB][NFC] Fix optons parsing and misc. reliability in CommandObjectThread

* Improve reliability by checking return results for calls to FindLineEntryByAddress()
* Fix broken option parsing in SetOptionValue()

Differential Revision: https://reviews.llvm.org/D131983

23 months ago[unittests/CodeGen] Remove unique_ptr from the result of createTargetMachine
Guozhi Wei [Tue, 16 Aug 2022 22:06:50 +0000 (22:06 +0000)]
[unittests/CodeGen] Remove unique_ptr from the result of createTargetMachine

The object contained in unique_ptr will be automatically deleted at the end of
the current scope. In createMachineFunction,

  auto TM = createTargetMachine();

creates a TM contained in unique_ptr, and a reference to the TM is stored
in a MachineFunction object. But at the end of the function the TM is
deleted, so later access to the TM (and the contained STI, TRI ...) through
the MachineFunction object is invalid.

So we should not use unique_ptr<BogusTargetMachine> in functions
createMachineFunction and createTargetMachine.
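
The lifetime problem can be reproduced with toy types. This is a sketch
only: `ToyTarget`, `ToyFunction` and the helpers are hypothetical stand-ins,
and returning the owner alongside the user is just one possible remedy, not
necessarily what the patch does.

```cpp
#include <memory>
#include <utility>

// Counts live ToyTarget objects so destruction can be observed.
inline int &liveTargets() {
  static int Count = 0;
  return Count;
}

// Toy stand-in for the target machine.
struct ToyTarget {
  ToyTarget() { ++liveTargets(); }
  ~ToyTarget() { --liveTargets(); }
  ToyTarget(const ToyTarget &) = delete;
  ToyTarget &operator=(const ToyTarget &) = delete;
};

// Toy stand-in for the object that keeps referring to the target.
struct ToyFunction {
  const ToyTarget *TM;
};

// Problematic shape: the unique_ptr destroys the target on return, so the
// returned object holds a dangling pointer that must never be dereferenced.
inline ToyFunction makeWithLocalOwner() {
  std::unique_ptr<ToyTarget> TM(new ToyTarget());
  return ToyFunction{TM.get()};
}

// One possible remedy: hand ownership to the caller together with the user.
struct OwnedFunction {
  std::unique_ptr<ToyTarget> Owner;
  ToyFunction F;
};

inline OwnedFunction makeWithReturnedOwner() {
  std::unique_ptr<ToyTarget> TM(new ToyTarget());
  ToyFunction F{TM.get()};
  return OwnedFunction{std::move(TM), F};
}
```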

Differential Revision: https://reviews.llvm.org/D131790

23 months ago[RISCV] Remove C!=0 restriction from (sub C, (setcc x, y, eq/neq)) -> (add C-1, ...
Craig Topper [Tue, 16 Aug 2022 21:39:41 +0000 (14:39 -0700)]
[RISCV] Remove C!=0 restriction from (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc x, y, neq/eq)).

While (sub 0, X) can use x0 for the 0, I believe (add X, -1) is
still preferable. (addi X, -1) can be compressed, sub with x0 on
the LHS is never compressible.
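
The identity behind the fold, including the C == 0 case, can be checked in
wrapping 64-bit arithmetic. A minimal sketch; the function names are
illustrative and this is not the DAG combine itself.

```cpp
#include <cstdint>

// (sub C, (seteq x, y)) in wrapping 64-bit arithmetic.
inline std::uint64_t subOfSetEq(std::uint64_t C, std::uint64_t X,
                                std::uint64_t Y) {
  return C - static_cast<std::uint64_t>(X == Y);
}

// (add C-1, (setne x, y)): the inverted compare plus the adjusted constant.
inline std::uint64_t addOfSetNe(std::uint64_t C, std::uint64_t X,
                                std::uint64_t Y) {
  return (C - 1) + static_cast<std::uint64_t>(X != Y);
}
```

Both forms agree for every C, because (x != y) equals 1 - (x == y).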

23 months ago[InstCombine] Remove assumptions about int having 32 bits
Martin Sebor [Mon, 15 Aug 2022 16:06:12 +0000 (10:06 -0600)]
[InstCombine] Remove assumptions about int having 32 bits

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D131731

23 months ago[LLDB][NFC] Fix memory leak in IntstumentationRuntimeTSan.cpp
Slava Gurevich [Mon, 15 Aug 2022 09:25:42 +0000 (02:25 -0700)]
[LLDB][NFC] Fix memory leak in IntstumentationRuntimeTSan.cpp

ConvertToStructuredArray() relies on its caller to deallocate the heap-allocated object pointer it returns. One of its call-sites, in GetRenumberedThreadIds(), fails to deallocate causing a memory/resource leak. Fix the memory leak by converting the return type to shared_ptr, and clean up the rest of the file to use the typedef-ed shared_ptr types for StructuredData for safety and consistency.

Differential Revision: https://reviews.llvm.org/D131900

23 months ago[Intrinsics] Add initial support for NonNull attribute
Alexander Shaposhnikov [Tue, 16 Aug 2022 21:25:19 +0000 (21:25 +0000)]
[Intrinsics] Add initial support for NonNull attribute

Add initial support for NonNull attribute.
(https://github.com/llvm/llvm-project/issues/57113)

Test plan:

verify that for

  __thread int x;
  int main() {
    int* y = &x;
    return *y;
  }

(with this patch) clang -O -fsanitize=null -S -emit-llvm -o -
doesn't emit a null-pointer check

Differential revision: https://reviews.llvm.org/D131872

23 months agoCodeGen: correct handling of debug info generation for aliases
Saleem Abdulrasool [Tue, 16 Aug 2022 21:22:27 +0000 (21:22 +0000)]
CodeGen: correct handling of debug info generation for aliases

When aliasing a static array, the aliasee is going to be a GEP which
points to the value.  We should strip pointer casts before forming the
reference.  This was occluded by the use of opaque pointers.

This problem has existed since the introduction of the debug info
generation for aliases in b1ea0191a42074341847d767609f66a26b6d5a41.  The
test case would assert due to the invalid cast with or without
`-no-opaque-pointers` at that revision.

Fixes: #57179

23 months ago[mlir][sparse] Refactoring: remove Operation * from the argument list in utility...
Peiming Liu [Tue, 16 Aug 2022 20:47:02 +0000 (20:47 +0000)]
[mlir][sparse] Refactoring: remove Operation * from the argument list in utility functions

This patch removes the Operation *op from the argument list in utility functions, and directly passes the Location instead of calling op->getLoc().

This should make the code clearer, as the utility functions (logically) do not rely on the operation that we are currently rewriting, and they behave the same regardless of the operation.
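
The shape of the refactoring, sketched with toy types (`ToyLocation`,
`ToyOperation` and `describeAt` are hypothetical stand-ins, not MLIR
classes): the utility takes only the location it needs, and the caller
extracts it from the operation.

```cpp
#include <string>

// Toy stand-ins; the real code uses MLIR's Location and Operation.
struct ToyLocation {
  std::string File;
  unsigned Line;
};

struct ToyOperation {
  ToyLocation Loc;
  ToyLocation getLoc() const { return Loc; }
};

// After the refactoring: the utility depends on the location only, so it can
// be called from any context that has one.
inline std::string describeAt(const ToyLocation &Loc) {
  return Loc.File + ":" + std::to_string(Loc.Line);
}
```

A caller that still has an operation passes `Op.getLoc()` at the call site.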

Reviewed By: aartbik, wrengr

Differential Revision: https://reviews.llvm.org/D131991

23 months ago[clang][deps] Compute command-lines for dependencies immediately
Ben Langmuir [Tue, 16 Aug 2022 00:54:00 +0000 (17:54 -0700)]
[clang][deps] Compute command-lines for dependencies immediately

Instead of delaying the generation of command-lines to after all
dependencies are reported, compute them immediately. This is partly in
preparation for splitting the TU driver command into its constituent cc1
and other jobs, but it also just simplifies working with the compiler
invocation for modules if they are not "without paths".

Also change the computation of the default output path in
clang-scan-deps to scrape the implicit module cache from the
command-line rather than get it from the dependency, since that is now
unavailable at the time we make the callback.

Differential Revision: https://reviews.llvm.org/D131934

23 months ago[RISCV] Don't fold (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc x, y, neq/eq...
Craig Topper [Tue, 16 Aug 2022 21:08:38 +0000 (14:08 -0700)]
[RISCV] Don't fold (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc x, y, neq/eq)) if C-1 isn't simm12.

We still need to materialize the constant in a register and we
may not be removing all uses of the original constant so it may
increase code size.
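
A sketch of the guard, assuming the usual 12-bit signed immediate range of
ADDI (-2048 to 2047). The helper names are hypothetical; in-tree code would
more likely use an existing utility such as llvm::isInt<12>.

```cpp
#include <cstdint>

// True if V fits in a 12-bit signed immediate.
inline bool fitsSimm12(std::int64_t V) { return V >= -2048 && V <= 2047; }

// Hypothetical predicate: only fold when C-1 can become an ADDI immediate.
inline bool shouldFoldSubOfSetCC(std::int64_t C) {
  if (C == INT64_MIN) // C-1 would overflow; it cannot fit in simm12 anyway
    return false;
  return fitsSimm12(C - 1);
}
```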

23 months ago[RISCV] Add more test cases for (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc...
Craig Topper [Tue, 16 Aug 2022 21:04:19 +0000 (14:04 -0700)]
[RISCV] Add more test cases for (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc x, y, neq/eq)). NFC

In these test cases we do the transform, but the immediate is too
large to form an ADDI so it didn't save any instructions.

If the constant is opaque or has additional users we shouldn't do
the transform if it doesn't form an ADDI.

23 months ago[RISCV] Move test from setcc-logic.ll to select-const.ll. NFC
Craig Topper [Tue, 16 Aug 2022 20:49:44 +0000 (13:49 -0700)]
[RISCV] Move test from setcc-logic.ll to select-const.ll. NFC

Also add setne version of the test.

Add some common prefixes to reduce number of identical CHECK lines.

23 months ago[mlir][sparse] Implements concatenate operation for sparse tensor
Peiming Liu [Thu, 4 Aug 2022 20:50:55 +0000 (20:50 +0000)]
[mlir][sparse] Implements concatenate operation for sparse tensor

This patch implements the conversion rule for operation introduced in https://reviews.llvm.org/D131200.
It also contains an integration test for correctness.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D131200

23 months ago[libc][Obvious] Convert an add_header target to add_header_library target.
Siva Chandra Reddy [Tue, 16 Aug 2022 20:33:24 +0000 (20:33 +0000)]
[libc][Obvious] Convert an add_header target to add_header_library target.

23 months ago[RISCV] (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc x, y, neq/eq)) fold for...
Craig Topper [Tue, 16 Aug 2022 19:57:05 +0000 (12:57 -0700)]
[RISCV] (sub C, (setcc x, y, eq/neq)) -> (add C-1, (setcc x, y, neq/eq)) fold for FP setcc.

This introduces an xori in some cases. I don't believe it was the
intention of the original patch. This was an accident because
nonan FP equality compares also use SETEQ/SETNE.

Also pass the correct type to getSetCCInverse.

23 months ago[RISCV] Add test cases to show where we inverted a fp setcc and introduced an extra...
Craig Topper [Tue, 16 Aug 2022 19:35:48 +0000 (12:35 -0700)]
[RISCV] Add test cases to show where we inverted a fp setcc and introduced an extra xori.

In these tests we had (sub C, (seteq X, Y)), which we converted to
(add (setne X, Y), C-1). We don't have a FNE compare instruction,
so this created an XORI to invert an FEQ instruction.

This might be a good idea since it can save a constant materialization,
but does not appear to be the intention of the original patch.

23 months ago[RISCV] Minor cleanups to performSUBCombine. NFC
Craig Topper [Tue, 16 Aug 2022 19:24:18 +0000 (12:24 -0700)]
[RISCV] Minor cleanups to performSUBCombine. NFC

-Rename variable NnzC -> N0C.
-Use SelectionDAG::getSetCC to reduce code.
-Use SDValue::getOperand instead of operator-> and SDNode::getOperand.

Initial steps to add another similar combine to this code.

23 months ago[lldb] Fix warnings
Kazu Hirata [Tue, 16 Aug 2022 19:33:21 +0000 (12:33 -0700)]
[lldb] Fix warnings

This patch fixes:

  lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h:34:5:
  error: default label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]

and:

  lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp:194:21:
  error: comparison of integers of different signs: 'int' and 'size_t'
  (aka 'unsigned long') [-Werror,-Wsign-compare]
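
For reference, code shaped like the following avoids both diagnostics. It
is an illustrative sketch with made-up names, not the code changed in
EmulateInstructionRISCV.

```cpp
#include <cstddef>
#include <vector>

enum class ToyOpcode { Add, Sub };

// Every enumerator has a case and there is no `default:` label, so
// -Wcovered-switch-default has nothing to report.
inline int evaluate(ToyOpcode Op, int A, int B) {
  switch (Op) {
  case ToyOpcode::Add:
    return A + B;
  case ToyOpcode::Sub:
    return A - B;
  }
  return 0; // not reached for valid enumerators; keeps -Wreturn-type quiet
}

// Using size_t for the index avoids comparing 'int' with 'size_t'
// (-Wsign-compare).
inline int sum(const std::vector<int> &Values) {
  int Total = 0;
  for (std::size_t I = 0; I < Values.size(); ++I)
    Total += Values[I];
  return Total;
}
```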

23 months ago[Sema] Fix friend destructor declarations after D130936
Roy Jacobson [Tue, 16 Aug 2022 19:27:36 +0000 (22:27 +0300)]
[Sema] Fix friend destructor declarations after D130936

I accidentally broke friend destructor declarations in D130936.

Modify it to skip performing the destructor name check if we have a dependent friend declaration.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D131541

23 months ago[FlattenCFG] avoid crash on malformed code
Sanjay Patel [Tue, 16 Aug 2022 18:37:41 +0000 (14:37 -0400)]
[FlattenCFG] avoid crash on malformed code

We don't have a dominator tree in this pass, so we
can't bail out sooner by checking for unreachable
code, but this is a minimal fix for the example in
issue #56875.

23 months ago[NFC][PowerPC] Add missing NOCOMPAT checks for builtins-ppc-xlcompat.c
Lei Huang [Wed, 10 Aug 2022 21:43:29 +0000 (16:43 -0500)]
[NFC][PowerPC] Add missing NOCOMPAT checks for builtins-ppc-xlcompat.c

Followup patch to address request from https://reviews.llvm.org/D124093

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D131622

23 months ago[clang][dataflow] Use llvm::is_contained()
Dmitri Gribenko [Tue, 16 Aug 2022 16:27:41 +0000 (18:27 +0200)]
[clang][dataflow] Use llvm::is_contained()

Reviewed By: samestep, xazax.hun

Differential Revision: https://reviews.llvm.org/D131975

23 months agoFix subrange liveness checking at rematerialization
Nicolas Miller [Tue, 16 Aug 2022 17:23:38 +0000 (10:23 -0700)]
Fix subrange liveness checking at rematerialization

This patch fixes an issue where an instruction reading a whole register would be moved during register allocation into a spot where one of the subregisters was dead.

The code to check whether an instruction can be rematerialized at a given point was already checking subranges to ensure that subregisters are live, but only when the instruction being moved was using a subregister. This patch changes that so the subranges are checked even when the moved instruction uses the full register.

This patch also adds a case to the original test for the subrange checking that triggers the issue described above.

The original subrange checking code was introduced in this revision: https://reviews.llvm.org/D115278

And I've encountered this issue on AMDGPUs while working with DPC++: https://github.com/intel/llvm/issues/6209

Essentially the greedy register allocator attempts to move the following instruction:

```
%3961:vreg_64 = V_LSHLREV_B64_e64 3, %3078:vreg_64, implicit $exec
```

From `@3440` into the body of a loop `@16312`, but `%3078` has the following live ranges:

```
%3078 [2224r,2240r:0)[2240r,3488B:1)[16192B,38336B:1) 0@2224r 1@2240r  L0000000000000003 [2224r,3440r:0) 0@2224r  L000000000000000C [2240r,3488B:0)[16192B,38336B:0) 0@2240r
```

So at `@16312e`, `%3078.sub1` is alive but `%3078.sub0` is dead. Moving this instruction there leads to invalid memory accesses, because `%3078.sub0` ends up being trashed and the result of this instruction is used as part of an address calculation for a load.
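
The check can be sketched on a simplified model of the live ranges above.
Assumptions: slot indexes are reduced to plain integers, segments are
half-open, and all type and function names are hypothetical rather than
LLVM's LiveInterval API.

```cpp
#include <cstdint>
#include <vector>

struct Segment {
  unsigned Start, End; // half-open interval [Start, End)
};

struct SubRange {
  std::uint64_t LaneMask;
  std::vector<Segment> Segments;
};

inline bool liveAt(const SubRange &SR, unsigned Idx) {
  for (const Segment &S : SR.Segments)
    if (S.Start <= Idx && Idx < S.End)
      return true;
  return false;
}

// Every subrange that overlaps the lanes read by the instruction must be live
// at Idx. A full-register use passes the mask of all lanes.
inline bool allUsedLanesLiveAt(const std::vector<SubRange> &SubRanges,
                               std::uint64_t UsedLanes, unsigned Idx) {
  for (const SubRange &SR : SubRanges)
    if ((SR.LaneMask & UsedLanes) != 0 && !liveAt(SR, Idx))
      return false;
  return true;
}

// The two subranges of %3078 from the dump above (lane masks 0x3 and 0xC).
inline std::vector<SubRange> exampleSubRanges() {
  std::vector<SubRange> R(2);
  R[0].LaneMask = 0x3;
  R[0].Segments.push_back(Segment{2224, 3440});
  R[1].LaneMask = 0xC;
  R[1].Segments.push_back(Segment{2240, 3488});
  R[1].Segments.push_back(Segment{16192, 38336});
  return R;
}
```

In this model a full-register read (mask 0xF) at 16312 is rejected because
the 0x3 lanes are dead there, while a read of only the 0xC lanes is accepted.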

On the original ticket this issue showed up on gfx906 and gfx90a but not on gfx908. This turned out to be because on gfx908, instead of moving the shift instruction into the loop, its value is spilled into an ACC register. gfx906 doesn't have ACC registers, and on gfx90a ACC registers are used like regular vector registers and so aren't used for spilling.

With this patch the original application from the DPC++ ticket works properly on gfx906, and the result of the shift instruction is correctly spilled instead of the instruction being moved into the loop.

Original Author: npmiller

Reviewed by: rampitec

Submitted by: rampitec

Differential Revision: https://reviews.llvm.org/D131884

23 months agoRevert "flang: Fix flang build with -Wctad-maybe-unsupported"
David Blaikie [Tue, 16 Aug 2022 17:43:40 +0000 (17:43 +0000)]
Revert "flang: Fix flang build with -Wctad-maybe-unsupported"

-Wctad-maybe-unsupported is now disabled for flang so these explicit
deduction guides are not required.

This reverts commit 248591aabee7fcc5246b67879b6a71b0bbbc0b9c.

23 months agoRevert "Some more from-the-hip ctad-maybe-unsupported fixes for flang"
David Blaikie [Tue, 16 Aug 2022 17:43:08 +0000 (17:43 +0000)]
Revert "Some more from-the-hip ctad-maybe-unsupported fixes for flang"

-Wctad-maybe-unsupported is now disabled for flang so these explicit
deduction guides are not required.

This reverts commit ec3956b6e63c1524d6b024ba5db9ffcd7281ada0.

23 months agoDisable -Wctad-maybe-unsupported in flang since it already uses the feature a lot
David Blaikie [Tue, 16 Aug 2022 17:42:45 +0000 (17:42 +0000)]
Disable -Wctad-maybe-unsupported in flang since it already uses the feature a lot

23 months ago[LLDB][NativePDB] Add nullptr checking.
Zequan Wu [Tue, 16 Aug 2022 02:34:13 +0000 (19:34 -0700)]
[LLDB][NativePDB] Add nullptr checking.

23 months ago[libc++] Improve updating data files.
Mark de Wever [Wed, 13 Jul 2022 17:24:12 +0000 (19:24 +0200)]
[libc++] Improve updating data files.

This change makes it easier to update the Unicode data files used for
the Extended Grapheme Clustering as added in D126971.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D129668

23 months ago[libc++][format] Improve format buffer.
Mark de Wever [Sat, 16 Jul 2022 15:03:27 +0000 (17:03 +0200)]
[libc++][format] Improve format buffer.

Allow bulk output operations on the buffer instead of adding one
code unit at a time. This has a huge performance benefit at the cost of
larger binary. This doesn't implement @vitaut's earlier suggestion to
avoid buffering for std::string when writing strings. That can be done
in a follow-up patch.

There are some minor complications for the non-buffered format_to_n.
When writing one character at a time it's easy to detect when reaching
the limit n. This is solved by adding a small overhead for format_to_n.
When the next write would overflow it stores the data in the internal
buffer and copies that up to n code units. The overhead isn't measured,
but it's expected to only be an issue for small values of n; for larger
values the general improvements will outweigh the new overhead.
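
The idea of bulk writes with a limit can be sketched as follows. This is a
toy class with hypothetical names, not libc++'s internal buffer: it accepts
whole chunks, stores at most the first n code units, and keeps counting the
total size that would have been written.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Toy sink: bulk writes, output truncated to Limit code units.
class BoundedSink {
public:
  explicit BoundedSink(std::size_t Limit) : Limit(Limit) {}

  // Appends a whole chunk at once instead of one code unit at a time.
  void write(const char *Data, std::size_t Size) {
    std::size_t Room = Limit - Out.size(); // Out never exceeds Limit
    Out.append(Data, std::min(Size, Room));
    Total += Size; // total requested size, including truncated output
  }

  const std::string &str() const { return Out; }
  std::size_t total() const { return Total; }

private:
  std::size_t Limit;
  std::string Out;
  std::size_t Total = 0;
};
```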

```
   text    data     bss     dec     hex filename
 349081    6096     440  355617   56d21 format.libcxx.out-baseline
 344442    6088     440  350970   55afa formatted_size.libcxx.out-baseline
4567980   57272     424 4625676  46950c formatter_float.libcxx.out-baseline
 718800   12472     488  731760   b2a70 formatter_int.libcxx.out-baseline
 376341    6096     552  382989   5d80d format_to.libcxx.out-beaseline

 370169    6096     440  376705   5bf81 format.libcxx.out
 365530    6088     440  372058   5ad5a formatted_size.libcxx.out
4575116   57272     424 4632812  46b0ec formatter_float.libcxx.out
 725936   12472     488  738896   b4650 formatter_int.libcxx.out
 397429    6096     552  404077   62a6d format_to.libcxx.out
```

For very small strings the new method is slower; from 4 characters on
there's already a small gain.

```
Comparing ./format.libcxx.out-baseline to ./format.libcxx.out
Benchmark                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------
BM_format_string<char>/1                         +0.0268         +0.0268            43            44            43            44
BM_format_string<char>/2                         +0.0133         +0.0133            22            22            22            22
BM_format_string<char>/4                         -0.0248         -0.0248            12            11            12            11
BM_format_string<char>/8                         -0.0831         -0.0831             6             6             6             6
BM_format_string<char>/16                        -0.2976         -0.2976             4             3             4             3
BM_format_string<char>/32                        -0.4369         -0.4369             3             2             3             2
BM_format_string<char>/64                        -0.6375         -0.6375             3             1             3             1
BM_format_string<char>/128                       -0.7685         -0.7685             2             1             2             1

```

The int benchmark has benefits for the simple formatting, but shines for
the complex formatting:
```
Comparing ./formatter_int.libcxx.out-baseline to ./formatter_int.libcxx.out
Benchmark                                                               Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------
BM_Basic<uint32_t>                                                   -0.2307         -0.2307            60            46            60            46
BM_Basic<int32_t>                                                    -0.1985         -0.1985            61            49            61            49
BM_Basic<uint64_t>                                                   -0.3478         -0.3479            81            53            81            53
BM_Basic<int64_t>                                                    -0.3475         -0.3475            81            53            81            53
BM_BasicLow<__uint128_t>                                             -0.3388         -0.3388            86            57            86            57
BM_BasicLow<__int128_t>                                              -0.3431         -0.3431            86            57            86            57
BM_Basic<__uint128_t>                                                -0.2822         -0.2822           236           170           236           170
BM_Basic<__int128_t>                                                 -0.3107         -0.3107           219           151           219           151
Integral_LocFalse_BaseBin_AlignNone_Int64                            -0.5781         -0.5781           178            75           178            75
Integral_LocFalse_BaseBin_AlignmentLeft_Int64                        -0.9231         -0.9231          1156            89          1156            89
Integral_LocFalse_BaseBin_AlignmentCenter_Int64                      -0.9179         -0.9179          1107            91          1107            91
Integral_LocFalse_BaseBin_AlignmentRight_Int64                       -0.9238         -0.9238          1147            87          1147            87
Integral_LocFalse_BaseBin_ZeroPadding_Int64                          -0.9170         -0.9170          1137            94          1137            94
Integral_LocFalse_BaseBin_AlignNone_Uint64                           -0.5923         -0.5923           175            71           175            71
Integral_LocFalse_BaseBin_AlignmentLeft_Uint64                       -0.9251         -0.9251          1154            86          1154            86
Integral_LocFalse_BaseBin_AlignmentCenter_Uint64                     -0.9204         -0.9204          1105            88          1105            88
Integral_LocFalse_BaseBin_AlignmentRight_Uint64                      -0.9242         -0.9242          1125            85          1125            85
Integral_LocFalse_BaseBin_ZeroPadding_Uint64                         -0.9232         -0.9232          1139            88          1139            88
Integral_LocFalse_BaseOct_AlignNone_Int64                            -0.3241         -0.3241           100            67           100            67
Integral_LocFalse_BaseOct_AlignmentLeft_Int64                        -0.9322         -0.9322          1166            79          1166            79
Integral_LocFalse_BaseOct_AlignmentCenter_Int64                      -0.9251         -0.9251          1108            83          1108            83
Integral_LocFalse_BaseOct_AlignmentRight_Int64                       -0.9303         -0.9303          1136            79          1136            79
Integral_LocFalse_BaseOct_ZeroPadding_Int64                          -0.9264         -0.9264          1156            85          1156            85
Integral_LocFalse_BaseOct_AlignNone_Uint64                           -0.3116         -0.3116            96            66            96            66
Integral_LocFalse_BaseOct_AlignmentLeft_Uint64                       -0.9310         -0.9310          1168            81          1168            81
Integral_LocFalse_BaseOct_AlignmentCenter_Uint64                     -0.9281         -0.9281          1128            81          1128            81
Integral_LocFalse_BaseOct_AlignmentRight_Uint64                      -0.9299         -0.9299          1148            80          1148            80
Integral_LocFalse_BaseOct_ZeroPadding_Uint64                         -0.9288         -0.9288          1153            82          1153            82
Integral_LocFalse_BaseDec_AlignNone_Int64                            -0.3342         -0.3342            95            63            95            63
Integral_LocFalse_BaseDec_AlignmentLeft_Int64                        -0.9360         -0.9360          1157            74          1157            74
Integral_LocFalse_BaseDec_AlignmentCenter_Int64                      -0.9303         -0.9303          1128            79          1128            79
Integral_LocFalse_BaseDec_AlignmentRight_Int64                       -0.9369         -0.9369          1164            73          1164            73
Integral_LocFalse_BaseDec_ZeroPadding_Int64                          -0.9323         -0.9323          1157            78          1157            78
Integral_LocFalse_BaseDec_AlignNone_Uint64                           -0.3198         -0.3198            93            63            93            63
Integral_LocFalse_BaseDec_AlignmentLeft_Uint64                       -0.9351         -0.9351          1158            75          1158            75
Integral_LocFalse_BaseDec_AlignmentCenter_Uint64                     -0.9298         -0.9298          1128            79          1128            79
Integral_LocFalse_BaseDec_AlignmentRight_Uint64                      -0.9361         -0.9361          1157            74          1157            74
Integral_LocFalse_BaseDec_ZeroPadding_Uint64                         -0.9333         -0.9333          1151            77          1151            77
Integral_LocFalse_BaseHex_AlignNone_Int64                            -0.3020         -0.3020            89            62            89            62
Integral_LocFalse_BaseHex_AlignmentLeft_Int64                        -0.9357         -0.9357          1174            75          1174            75
Integral_LocFalse_BaseHex_AlignmentCenter_Int64                      -0.9319         -0.9319          1129            77          1129            77
Integral_LocFalse_BaseHex_AlignmentRight_Int64                       -0.9350         -0.9350          1161            75          1161            75
Integral_LocFalse_BaseHex_ZeroPadding_Int64                          -0.9293         -0.9293          1150            81          1150            81
Integral_LocFalse_BaseHex_AlignNone_Uint64                           -0.3056         -0.3057            86            59            86            59
Integral_LocFalse_BaseHex_AlignmentLeft_Uint64                       -0.9378         -0.9378          1174            73          1174            73
Integral_LocFalse_BaseHex_AlignmentCenter_Uint64                     -0.9341         -0.9341          1129            74          1130            74
Integral_LocFalse_BaseHex_AlignmentRight_Uint64                      -0.9361         -0.9361          1157            74          1157            74
Integral_LocFalse_BaseHex_ZeroPadding_Uint64                         -0.9315         -0.9315          1147            79          1147            79
Integral_LocFalse_BaseHexUpper_AlignNone_Int64                       -0.0019         -0.0019            91            90            91            90
Integral_LocFalse_BaseHexUpper_AlignmentLeft_Int64                   -0.9099         -0.9099          1162           105          1162           105
Integral_LocFalse_BaseHexUpper_AlignmentCenter_Int64                 -0.9041         -0.9041          1121           108          1121           108
Integral_LocFalse_BaseHexUpper_AlignmentRight_Int64                  -0.9086         -0.9086          1162           106          1162           106
Integral_LocFalse_BaseHexUpper_ZeroPadding_Int64                     -0.9057         -0.9057          1164           110          1164           110
Integral_LocFalse_BaseHexUpper_AlignNone_Uint64                      +0.0110         +0.0110            86            87            86            87
Integral_LocFalse_BaseHexUpper_AlignmentLeft_Uint64                  -0.9136         -0.9136          1161           100          1161           100
Integral_LocFalse_BaseHexUpper_AlignmentCenter_Uint64                -0.9078         -0.9078          1133           104          1133           104
Integral_LocFalse_BaseHexUpper_AlignmentRight_Uint64                 -0.9132         -0.9132          1177           102          1177           102
Integral_LocFalse_BaseHexUpper_ZeroPadding_Uint64                    -0.9091         -0.9091          1160           105          1160           105
```
Other benchmarks give similar results.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D129964

23 months ago[test][libcxx] Don't XFAIL passing test with HWASAN
Vitaly Buka [Tue, 16 Aug 2022 16:35:56 +0000 (09:35 -0700)]
[test][libcxx] Don't XFAIL passing test with HWASAN

23 months ago[mlir][math] Added basic support for FPowI operation.
Slava Zakharin [Thu, 14 Jul 2022 18:00:15 +0000 (11:00 -0700)]
[mlir][math] Added basic support for FPowI operation.

The operation computes pow(b, p), where 'b' is floating point
and 'p' is a signed integer. The result's type matches 'b' type.
The operands must have the same shape.
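
For scalar operands the computation can be sketched with exponentiation by
squaring. This is an illustration of pow(b, p) with an integer exponent, not
the lowering used in MLIR; the name `fpowi` and the handling of negative
exponents via a reciprocal are assumptions made for the example.

```cpp
// pow(Base, Power) for a floating-point base and a signed integer exponent.
inline double fpowi(double Base, int Power) {
  long long Wide = Power; // widen first so negating the minimum int is safe
  bool Negative = Wide < 0;
  unsigned long long N =
      static_cast<unsigned long long>(Negative ? -Wide : Wide);
  double Result = 1.0;
  double Factor = Base;
  while (N != 0) {
    if (N & 1)
      Result *= Factor; // multiply in the current power of two
    Factor *= Factor;
    N >>= 1;
  }
  return Negative ? 1.0 / Result : Result;
}
```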

Differential Revision: https://reviews.llvm.org/D129811

23 months ago[CMake] Cleanup the descriptions for gRPC options
Steven Wu [Tue, 16 Aug 2022 16:03:32 +0000 (09:03 -0700)]
[CMake] Cleanup the descriptions for gRPC options

As a followup to https://reviews.llvm.org/D131593, clean up gRPC related
option names and messages to make them more generic.

23 months agoSome more from-the-hip ctad-maybe-unsupported fixes for flang
David Blaikie [Tue, 16 Aug 2022 16:03:30 +0000 (16:03 +0000)]
Some more from-the-hip ctad-maybe-unsupported fixes for flang