review.tizen.org Git - platform/upstream/llvm.git/log

Roman Lebedev [Sun, 3 Oct 2021 20:37:22 +0000 (23:37 +0300)]

[X86][Costmodel] Load/store i16 Stride=3 VF=16 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/1T6MMzeh3 - for intels `Block RThroughput: =28.0`; for ryzens, `Block RThroughput: <=8.5`
So pick cost of `28`.

For store we have:
https://godbolt.org/z/1T6MMzeh3 - for intels `Block RThroughput: <=27.0`; for ryzens, `Block RThroughput: <=7.0`
So pick cost of `27`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
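
The costs picked above are consistent with taking the worst (largest) `Block RThroughput` across the listed sched models. A minimal sketch of that selection, assuming the rule is a plain maximum rounded up to a whole number (the commit message does not spell the rule out, and `pick_cost` is a hypothetical helper, not an LLVM API):

```python
import math

def pick_cost(rthroughput_by_vendor):
    # Assumed rule: take the slowest (largest) reciprocal throughput
    # among the sched models and round it up to a whole number.
    return math.ceil(max(rthroughput_by_vendor.values()))

# Bounds quoted in the commit message for i16 Stride=3 VF=16.
load_cost = pick_cost({"intel": 28.0, "ryzen": 8.5})
store_cost = pick_cost({"intel": 27.0, "ryzen": 7.0})
```

Taking the maximum keeps the cost conservative: it never underestimates the slowest of the modelled CPUs.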

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111017

Roman Lebedev [Sun, 3 Oct 2021 20:37:18 +0000 (23:37 +0300)]

[X86][Costmodel] Load/store i16 Stride=3 VF=8 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/Mh9MnnT8W - for intels `Block RThroughput: =9.0`; for ryzens, `Block RThroughput: <=2.3`
So pick cost of `9`.

For store we have:
https://godbolt.org/z/Mh9MnnT8W - for intels `Block RThroughput: <=12.0`; for ryzens, `Block RThroughput: <=3.3`
So pick cost of `12`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111016

Roman Lebedev [Sun, 3 Oct 2021 20:37:13 +0000 (23:37 +0300)]

[X86][Costmodel] Load/store i16 Stride=3 VF=4 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/sP4j1173f - for intels `Block RThroughput: =7.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `7`.

For store we have:
https://godbolt.org/z/sP4j1173f - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `6`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111015

Roman Lebedev [Sun, 3 Oct 2021 20:37:09 +0000 (23:37 +0300)]

[X86][Costmodel] Load/store i16 Stride=3 VF=2 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/xnE988aej - for intels `Block RThroughput: =5.0`; for ryzens, `Block RThroughput: <=2.5`
So pick cost of `5`.

For store we have:
https://godbolt.org/z/rMGT31Tnh - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111014

Roman Lebedev [Sun, 3 Oct 2021 20:23:13 +0000 (23:23 +0300)]

[X86][Costmodel] Load/store i8 Stride=6 VF=32 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/c1jjKqP7b - for intels `Block RThroughput: <=82.0`; for ryzens, `Block RThroughput: <=26.0`
So pick cost of `82`.

For store we have:
https://godbolt.org/z/YM4ErY8x7 - for intels `Block RThroughput: <=90.0`; for ryzens, `Block RThroughput: <=25.5`
So pick cost of `90`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111013

Roman Lebedev [Sun, 3 Oct 2021 20:23:13 +0000 (23:23 +0300)]

[X86][Costmodel] Load/store i8 Stride=6 VF=16 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/Gz8hhqfTM - for intels `Block RThroughput: <=43.0`; for ryzens, `Block RThroughput: <=14.0`
So pick cost of `43`.

For store we have:
https://godbolt.org/z/9vrdssYa8 - for intels `Block RThroughput: <=27.0`; for ryzens, `Block RThroughput: <=12.0`
So pick cost of `27`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111012

Roman Lebedev [Sun, 3 Oct 2021 20:23:08 +0000 (23:23 +0300)]

[X86][Costmodel] Load/store i8 Stride=6 VF=8 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/v98qPTTf6 - for intels `Block RThroughput: =18.0`; for ryzens, `Block RThroughput: =6.0`
So pick cost of `18`.

For store we have:
https://godbolt.org/z/rn5T9E8q6 - for intels `Block RThroughput: <=16.0`; for ryzens, `Block RThroughput: <=4.5`
So pick cost of `16`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111011

Roman Lebedev [Sun, 3 Oct 2021 20:23:03 +0000 (23:23 +0300)]

[X86][Costmodel] Load/store i8 Stride=6 VF=4 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/4sWhs396o - for intels `Block RThroughput: =14.0`; for ryzens, `Block RThroughput: <=7.0`
So pick cost of `14`.

For store we have:
https://godbolt.org/z/4sWhs396o - for intels `Block RThroughput: =9.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `9`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111010

Roman Lebedev [Sun, 3 Oct 2021 20:22:58 +0000 (23:22 +0300)]

[X86][Costmodel] Load/store i8 Stride=6 VF=2 interleaving costs

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/jvj6jzns5 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `6`.

For store we have:
https://godbolt.org/z/ros7eebMP - for intels `Block RThroughput: =7.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `7`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111008

Yuanfang Chen [Sun, 3 Oct 2021 19:49:14 +0000 (12:49 -0700)]

[Clang][NFC] Fix the comment for Sema::DiagIfReachable

Michał Górny [Sat, 2 Oct 2021 09:52:08 +0000 (11:52 +0200)]

[mlir] [test] Add missing tool substitutions

Add missing mlir-capi-*-test tool substitutions in order to fix CAPI
test failures when mlir is not installed yet.

Differential Revision: https://reviews.llvm.org/D110991

David Green [Sun, 3 Oct 2021 18:30:08 +0000 (19:30 +0100)]

[ARM] Mark <= -1 immediate constant as cheap

A <= -1 constant on a compare can be converted to a < 0 operation, which
is usually cheap. If we mark the constant as cheap, preventing hoisting,
we allow that fold to happen even across different blocks.
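
The equivalence behind this fold can be checked directly. The sketch below is only an illustration on plain integers (the helper names are hypothetical): for signed values, `x <= -1` and `x < 0` agree, and both reduce to testing the sign bit of the two's-complement encoding.

```python
def le_minus_one(x):
    return x <= -1

def lt_zero(x):
    return x < 0

def sign_bit_set(x, bits=32):
    # Two's-complement encoding of x, then test the top bit.
    return ((x & ((1 << bits) - 1)) >> (bits - 1)) == 1
```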

Differential Revision: https://reviews.llvm.org/D109360

Simon Pilgrim [Sun, 3 Oct 2021 17:38:47 +0000 (18:38 +0100)]

[X86] Split Cannonlake + Icelake Tuning. NFC

The Ice/Tiger/RocketLake specs were inheriting the tuning settings from CannonLake, a previous architecture. We shouldn't have this dependency, so I've copied the current tuning settings so we can make future adjustments to both CNL + ICL etc. more easily.

Simon Pilgrim [Sun, 3 Oct 2021 16:16:45 +0000 (17:16 +0100)]

[CostModel][X86] X86TTIImpl::getCmpSelInstrCost - try to use Predicate argument directly first (PR48337)

There are still a lot of cases where callers of getCmpSelInstrCost fail to specify a predicate; once those are fixed we should be able to remove the fallback to the Instruction argument entirely.

David Green [Sun, 3 Oct 2021 15:32:31 +0000 (16:32 +0100)]

[ARM] Tests for constant hoisting -1 immediates

Kazu Hirata [Sun, 3 Oct 2021 15:22:19 +0000 (08:22 -0700)]

[Analysis, CodeGen] Migrate from arg_operands to args (NFC)

Note that arg_operands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.

Roman Lebedev [Sun, 3 Oct 2021 14:50:51 +0000 (17:50 +0300)]

[NFC][X86][Codegen] Add test coverage for interleaved i64 load/store stride=3

Roman Lebedev [Sun, 3 Oct 2021 14:34:21 +0000 (17:34 +0300)]

[NFC][X86][LV] Add costmodel test coverage for interleaved i64/f64 load/store stride=3

Sanjay Patel [Sun, 3 Oct 2021 14:37:22 +0000 (10:37 -0400)]

[InstCombine] fold cast of right-shift if high bits are not demanded (3rd try)

The first two tries at this were reverted because they caused an
infinite loop in instcombine.
That should be fixed after a series of patches that ended with
removing the faulty opposing transform:
3fabd98e5b3e

Original commit message:
(masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C

Narrowing the shift should be better for analysis and can lead
to follow-on transforms as shown.

Attempt at a general proof in Alive2:
https://alive2.llvm.org/ce/z/tRnnSF

Here are a couple of the specific tests:
https://alive2.llvm.org/ce/z/bCnTp-
https://alive2.llvm.org/ce/z/TfaHnb
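
The condition "high bits are not demanded" can be illustrated with plain integer arithmetic. This is a sketch on unsigned Python integers, not the InstCombine code; the 16-bit narrow width and the helper names are assumptions chosen for the example. After narrowing first, the top `C` bits of the shifted result are zero instead of coming from the wide value, so the two forms agree only when the mask ignores those bits.

```python
def trunc(x, bits):
    # Keep only the low `bits` bits, like an integer truncation.
    return x & ((1 << bits) - 1)

def trunc_of_shift(x, c, mask, narrow=16):
    # (masked) trunc (lshr X, C)
    return trunc(x >> c, narrow) & mask

def shift_of_trunc(x, c, mask, narrow=16):
    # (masked) lshr (trunc X), C
    return (trunc(x, narrow) >> c) & mask
```

With `c = 4`, a mask of `0x0FFF` demands only the low 12 bits, where both forms match; a mask of `0xFFFF` also demands the top 4 bits, where they can differ.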

Differential Revision: https://reviews.llvm.org/D110170

Sanjay Patel [Sun, 3 Oct 2021 14:35:59 +0000 (10:35 -0400)]

[InstCombine] add test for shl + demanded bits; NFC

This is a reduction of a test that would infinite loop with D110170.

Nikita Popov [Sun, 3 Oct 2021 11:26:14 +0000 (13:26 +0200)]

[InstSimplify] Add additional load from constant test (NFC)

This case does not get folded, because the GEP indexes too deeply
(to the i8), making the bitcast logic not apply (on the [8 x i8]).

Roman Lebedev [Sun, 3 Oct 2021 13:48:45 +0000 (16:48 +0300)]

[NFC][X86][Codegen] Add test coverage for interleaved i32 load/store stride=3

Roman Lebedev [Sun, 3 Oct 2021 13:31:23 +0000 (16:31 +0300)]

[NFC][X86][LV] Add costmodel test coverage for interleaved i32/f32 load/store stride=3

Dávid Bolvanský [Sun, 3 Oct 2021 12:52:42 +0000 (14:52 +0200)]

Fixed warnings in target/parser codes produced by -Wbitwise-instead-of-logical

Dávid Bolvanský [Sun, 3 Oct 2021 11:57:57 +0000 (13:57 +0200)]

Fixed more warnings in LLVM produced by -Wbitwise-instead-of-logical

Roman Lebedev [Sun, 3 Oct 2021 10:41:30 +0000 (13:41 +0300)]

[NFC][X86][Codegen] Add test coverage for interleaved i8 load/store stride=6

Roman Lebedev [Sun, 3 Oct 2021 10:30:49 +0000 (13:30 +0300)]

[NFC][X86][LV] Add costmodel test coverage for interleaved i8 load/store stride=6

Simon Pilgrim [Sun, 3 Oct 2021 11:31:22 +0000 (12:31 +0100)]

[X86] Add SSE2/AVX1/AVX512BW test coverage to interleaved load/store tests

Extension to PR51979 so codegen tests keep close to the costmodel tests

Dávid Bolvanský [Sun, 3 Oct 2021 11:19:04 +0000 (13:19 +0200)]

Unbreak hexagon-check-builtins.c due to rGb1fcca388441

mydeveloperday [Sun, 3 Oct 2021 11:08:24 +0000 (12:08 +0100)]

[clang-format] allow clang-format to be passed a file of filenames so we can add a regression suite of "clean clang-formatted files" from LLVM

This change now generates that list, and the change to clang-format allows
us to run clang-format quickly over these files via the list of files.

clang-format.exe -verbose -n --files=./clang/docs/tools/clang-formatted-files.txt

```
Clang-formating 7926 files
Formatting [1/7925] clang/bindings/python/tests/cindex/INPUTS/header1.h
..
Formatting [7925/7925] utils/bazel/llvm-project-overlay/llvm/include/llvm/Config/config.h
```

This is needed because putting all those files on the command line is too
long, and invoking 7900+ clang-formats is much slower (too slow to be honest)

Using this method it takes only 7.5 minutes (on my machine) to run
`clang-format -n` over all of the files (7925), so we can
test any change quickly and easily.

We should be able to rerun clang-format over this list to ensure that we don't regress
clang-format over a large code base, and also to ensure that all of the
files which were previously 100% clang-formatted remain so.
(which the LLVM premerge checks should be enforcing)
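
As a rough sketch of the workflow described here, the snippet below renders a list of filenames and builds the single invocation quoted above. Nothing is executed; the helper names are hypothetical, and the one-path-per-line layout of the list file is an assumption rather than something the commit message states.

```python
def render_file_list(paths):
    # Assumed layout: one path per line.
    return "\n".join(paths) + "\n"

def build_command(list_path):
    # Flags copied from the commit message; the command is only built, not run.
    return ["clang-format", "-verbose", "-n", f"--files={list_path}"]
```

A single process reading the list avoids both the command-line length limit and the cost of starting thousands of separate clang-format processes.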

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D111000

Dávid Bolvanský [Sun, 3 Oct 2021 11:05:09 +0000 (13:05 +0200)]

Reland "[Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects"

This reverts commit a4933f57f3f0a45e1db1075f7285f0761a80fc06. New warnings were fixed.

Dávid Bolvanský [Sun, 3 Oct 2021 11:04:18 +0000 (13:04 +0200)]

Fixed warnings in LLVM produced by -Wbitwise-instead-of-logical

Dávid Bolvanský [Sun, 3 Oct 2021 10:47:12 +0000 (12:47 +0200)]

Revert "[Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects"

This reverts commit f62d18ff140f67a8776a7a3c62a75645d8d540b5. Found some cases in LLVM itself.

Dávid Bolvanský [Sun, 3 Oct 2021 09:06:19 +0000 (11:06 +0200)]

[Clang] Extend -Wbool-operation to warn about bitwise and of bools with side effects

Motivation: https://arstechnica.com/gadgets/2021/07/google-pushed-a-one-character-typo-to-production-bricking-chrome-os-devices/

Warn for the pattern boolA & boolB or boolA | boolB where boolA and boolB have possible side effects.

Casting one operand to int is enough to silence this warning: for example (int)boolA & boolB or boolA | (int)boolB
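
The hazard this warning targets is that a bitwise operator evaluates both operands, while the logical operator short-circuits. The snippet below is a Python analogue of that difference, not Clang or C++ code; `check` is a hypothetical function with a visible side effect.

```python
calls = []

def check(name, value):
    # Side effect: record that this operand was evaluated.
    calls.append(name)
    return value

def bitwise():
    calls.clear()
    result = check("a", False) & check("b", True)
    return result, list(calls)

def logical():
    calls.clear()
    result = check("a", False) and check("b", True)
    return result, list(calls)
```

Both forms return `False`, but only the bitwise one runs the second operand's side effect, which is easy to miss when `&` was typed in place of `&&`.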

Fixes https://bugs.llvm.org/show_bug.cgi?id=51216

Differential Revision: https://reviews.llvm.org/D108003

hyeongyu kim [Sun, 3 Oct 2021 08:57:05 +0000 (17:57 +0900)]

[LSV] Change the default value of InsertElement to poison

This patch is changing the InsertElement's placeholder to poison without changing the LSV's behavior.

Regardless of whether `StoreTy` is FixedVectorType or not, the poison value will be overwritten with a different value.
Therefore, whether the InsertElement's placeholder is poison or undef will not affect the result of the program.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D111005

Hsiangkai Wang [Sun, 3 Oct 2021 07:43:38 +0000 (15:43 +0800)]

[NFC][RISCV] Update test cases through update_cc_test_checks.py.

Michał Górny [Sat, 2 Oct 2021 09:59:15 +0000 (11:59 +0200)]

[mlir] [test] Include mlir_tools_dir in PATH to fix mlir-reduce

Include mlir_tools_dir in the PATH used in test environment,
as otherwise mlir-reduce is unable to find mlir-opt when building
standalone (and hence mlir_tools_dir != llvm_tools_dir).

Differential Revision: https://reviews.llvm.org/D110992

Mehdi Amini [Sun, 3 Oct 2021 01:24:07 +0000 (01:24 +0000)]

Fix ASAN execution for the MLIR Python tests

First the leak sanitizer has to be disabled, as even an empty script
leads to leak detection with Python.
Then we need to preload the ASAN runtime, as the main binary (python)
won't be linked against it. This will only work on Linux right now.
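
A minimal sketch of the environment this describes, assuming the usual `ASAN_OPTIONS=detect_leaks=0` and `LD_PRELOAD` mechanisms on Linux. The helper name is hypothetical, the runtime path must be supplied by the caller, and this is not the lit configuration from the patch.

```python
def asan_env_for_python(base_env, asan_runtime_path):
    env = dict(base_env)
    # Disable leak detection: even an empty script reports leaks under Python.
    # Note: this sketch overwrites any existing ASAN_OPTIONS value.
    env["ASAN_OPTIONS"] = "detect_leaks=0"
    # Preload the ASan runtime because the python binary is not linked against it.
    env["LD_PRELOAD"] = asan_runtime_path
    return env
```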

Differential Revision: https://reviews.llvm.org/D111004

Mehdi Amini [Sun, 3 Oct 2021 01:25:10 +0000 (01:25 +0000)]

Exclude MLIR python binding tests from Sanitizer tests for now

This requires more config to work reliably during lit execution.
But also I see many leaks when running manually right now.

Mehdi Amini [Sun, 3 Oct 2021 05:04:03 +0000 (05:04 +0000)]

Fix last leaky MLIR integration test (NFC)

Min-Yih Hsu [Sun, 12 Sep 2021 05:49:06 +0000 (22:49 -0700)]

[IR]PATCH 2/2: Add MDNode::printTree and dumpTree

This patch adds the functionalities to print MDNode in tree shape. For
example, instead of printing a MDNode like this:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
```
The printTree/dumpTree functions can give you:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
  <0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0)
  <0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir")
  <0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2)
    <0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype")
```
This is useful in a debugger, where printing the
whole module to see all MDNodes is sometimes too expensive.
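
The tree shape shown above amounts to a depth-first walk that indents each operand under its user. A small sketch of that idea on a plain dictionary graph follows; the data layout and `format_tree` are hypothetical, and expanding each node only once is an assumption of this sketch, not a statement about how `printTree` treats shared or cyclic nodes.

```python
def format_tree(root, operands, text):
    """Return one line per visited node, indented two spaces per level."""
    lines = []
    expanded = set()

    def visit(node, depth):
        lines.append("  " * depth + text[node])
        if node in expanded:
            # Already expanded once; do not recurse again (guards against cycles).
            return
        expanded.add(node)
        for operand in operands.get(node, []):
            visit(operand, depth + 1)

    visit(root, 0)
    return lines
```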

Differential Revision: https://reviews.llvm.org/D110113

Min-Yih Hsu [Sun, 12 Sep 2021 03:44:27 +0000 (20:44 -0700)]

[IR]PATCH 1/2: Add AsmWriterContext into AsmWriter

AsmWriterContext is a simple compound that stores TypePrinting,
SlotTracker (i.e. "Machine" in AsmWriter), and Module instances -- three
of the most commonly used objects in the AsmWriter infrastructure.
Previously these three objects were passed as separate function arguments
to most of the printer functions in this file. Bundling them makes the
printer functions easier to refactor in the future (e.g. when we
want to pass additional objects to all printer functions).

NOTE: Theoretically, this patch should be NFC.

Differential Revision: https://reviews.llvm.org/D110112

Dan Liew [Fri, 1 Oct 2021 20:04:13 +0000 (13:04 -0700)]

Use standard separator for TSan options in `stress.cpp` test case.

Use of space as a separator for options is problematic for wrapper
scripts (i.e. implementations of `%run`) that have to marshal
environment variables to a target different from the host.

Rather than requiring every implementation of `%run` to support spaces
in `TSAN_OPTIONS` it is simpler to fix this single test case.
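
To illustrate why spaces are fragile here, the sketch below models a naive wrapper that assembles one command string and re-splits it on whitespace. It assumes the standard separator is `:`, which the commit message does not name explicitly; the option names and both helpers are hypothetical.

```python
def join_options(options, sep=":"):
    return sep.join(f"{name}={value}" for name, value in options.items())

def naive_wrapper_argv(env_assignment, program):
    # Hypothetical wrapper: builds one string, then splits it on whitespace.
    return f"env {env_assignment} {program}".split()

opts = {"opt_a": 1, "opt_b": 0}
with_colon = naive_wrapper_argv("TSAN_OPTIONS=" + join_options(opts), "./stress")
with_space = naive_wrapper_argv("TSAN_OPTIONS=" + join_options(opts, " "), "./stress")
```

With `:` the assignment survives as a single token; with a space the second option is split off and would be treated as a separate argument.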

rdar://83637067

Differential Revision: https://reviews.llvm.org/D110967

Uday Bondhugula [Sat, 2 Oct 2021 10:23:57 +0000 (15:53 +0530)]

[MLIR][NFC] Drop unnecessary use of OpBuilder in build trip count map

NFC. Drop unnecessary use of OpBuilder in buildTripCountMapAndOperands.
Rename this to getTripCountMapAndOperands and remove stale comments.

Differential Revision: https://reviews.llvm.org/D110993

Mehdi Amini [Sun, 3 Oct 2021 03:42:19 +0000 (03:42 +0000)]

Disable leak check for the MLIR Linalg CPU integration tests (NFC)

See http://llvm.org/pr52047 for tracking.

Mehdi Amini [Sun, 3 Oct 2021 03:33:22 +0000 (03:33 +0000)]

Disable leak check for the MLIR Sparse CPU integration tests (NFC)

See http://llvm.org/pr52046 for tracking.

Mehdi Amini [Sun, 3 Oct 2021 03:27:54 +0000 (03:27 +0000)]

Fix memory leaks in MLIR integration tests for vector dialect (NFC)

Alfsonso Gregory [Sun, 3 Oct 2021 02:37:12 +0000 (08:07 +0530)]

[LLVM][IR] Fixed input arguments for Verifier getter

ParameterABIAttributes functions work with unsigned integers as the index, so having the getter be signed makes no sense. Additionally, for this reason, the loop vars that were signed were changed to unsigned too.

Reviewed By: jeroen.dobbelaere

Differential Revision: https://reviews.llvm.org/D110344

Takafumi Arakaki [Sun, 3 Oct 2021 01:31:59 +0000 (21:31 -0400)]

Re-apply the fix on DwarfEHPrepare and add a test

This patch re-introduces the fix in the commit https://github.com/llvm/llvm-project/commit/66b0cebf7f736 by @yrnkrn

> In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling
> pointer to a dead function. To make sure it's valid, doFinalization nullptrs
> RewindFunction just like the constructor and so it will be found on next run.
>
> llvm-svn: 217737

It seems that the fix was not migrated to `DwarfEHPrepareLegacyPass`.
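
The failure mode can be modelled without LLVM: a pass object caches a handle that belongs to one module, and unless the cache is cleared in finalization, a second run reuses the handle from the previous module. The classes below are hypothetical stand-ins for illustration, not the `DwarfEHPrepareLegacyPass` code.

```python
class Module:
    def __init__(self, name):
        self.name = name
        self.functions = {}

    def get_or_insert(self, fn_name):
        # The handle records which module owns the function.
        return self.functions.setdefault(fn_name, (self.name, fn_name))

class CachingPass:
    def __init__(self, reset_on_finalize):
        self.rewind_function = None
        self.reset_on_finalize = reset_on_finalize

    def run(self, module):
        if self.rewind_function is None:
            self.rewind_function = module.get_or_insert("_Unwind_Resume")
        return self.rewind_function

    def do_finalization(self):
        if self.reset_on_finalize:
            # Drop the cached handle so the next run looks it up again.
            self.rewind_function = None
```

Running the pass twice on different modules without the reset yields a handle owned by the first module, which mirrors the "Referencing function in another module!" verifier error quoted below.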

This patch also updates `llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` to include `-run-twice` to exercise the cleanup. Without this patch `llvm-lit -v llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` fails with

```
-- Testing: 1 tests, 1 workers --
FAIL: LLVM :: CodeGen/X86/dwarf-eh-prepare.ll (1 of 1)
******************** TEST 'LLVM :: CodeGen/X86/dwarf-eh-prepare.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice < /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -S | /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll
--
Exit Code: 2

Command Output (stderr):
--
Referencing function in another module!
  call void @_Unwind_Resume(i8* %ehptr) #1
; ModuleID = '<stdin>'
void (i8*)* @_Unwind_Resume
; ModuleID = '<stdin>'
in function simple_cleanup_catch
LLVM ERROR: Broken function found, compilation aborted!
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice -S
1.      Running pass 'Function Pass Manager' on module '<stdin>'.
2.      Running pass 'Module Verifier' on function '@simple_cleanup_catch'
#0 0x000056121b570a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:0
#1 0x000056121b56eb64 llvm::sys::RunSignalHandlers() /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Signals.cpp:97:0
#2 0x000056121b56f28e SignalHandler(int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:397:0
#3 0x00007fc7e9b22980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
#4 0x00007fc7e87d3fb7 raise /build/glibc-S7xCS9/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
#5 0x00007fc7e87d5921 abort /build/glibc-S7xCS9/glibc-2.27/stdlib/abort.c:81:0
#6 0x000056121b4e1386 llvm::raw_svector_ostream::raw_svector_ostream(llvm::SmallVectorImpl<char>&) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:674:0
#7 0x000056121b4e1386 llvm::report_fatal_error(llvm::Twine const&, bool) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/ErrorHandling.cpp:114:0
#8 0x000056121b4e1528 (/home/arakaki/build/llvm-project/main/bin/opt+0x29e3528)
#9 0x000056121adfd03f llvm::raw_ostream::operator<<(llvm::StringRef) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:218:0
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll

--

********************
********************
Failed Tests (1):
  LLVM :: CodeGen/X86/dwarf-eh-prepare.ll

Testing Time: 0.22s
  Failed: 1
```

Reviewed By: loladiro

Differential Revision: https://reviews.llvm.org/D110979

Arthur O'Dwyer [Mon, 27 Sep 2021 04:58:56 +0000 (00:58 -0400)]

[libc++] [ranges] Uncomment operator<=> in transform and iota iterators.

The existing tests for transform_view::iterator weren't quite right,
and can be simplified now that we have more of C++20 available to us.
Having done that, let's use the same pattern for iota_view::iterator
as well.

Differential Revision: https://reviews.llvm.org/D110774

Mehdi Amini [Sat, 2 Oct 2021 23:55:25 +0000 (23:55 +0000)]

Fix memory leak in MLIR SPIRV ModuleCombiner

Mehdi Amini [Sat, 2 Oct 2021 23:53:02 +0000 (23:53 +0000)]

Fix/disable more MLIR tests exposing leaks in ASAN builds (NFC)

Mehdi Amini [Sat, 2 Oct 2021 23:16:35 +0000 (23:16 +0000)]

Fix multiple memory leaks in mlir-cpu-runner tests (NFC)

Mehdi Amini [Sat, 2 Oct 2021 23:07:39 +0000 (23:07 +0000)]

Fix memory leak in mlir-cpu-runner/sgemm_naive_codegen.mlir (NFC)

Mehdi Amini [Sat, 2 Oct 2021 21:28:28 +0000 (21:28 +0000)]

Fix Undefined Behavior in MLIR Diagnostic: don't call memcpy with a nullptr source

This happens when streaming an empty Twine as part of a diagnostic.

Differential Revision: https://reviews.llvm.org/D111002

Mehdi Amini [Sat, 2 Oct 2021 21:31:17 +0000 (21:31 +0000)]

Fix memory leaks in MLIR unit-tests (NFC)

Mehdi Amini [Sat, 2 Oct 2021 21:05:22 +0000 (21:05 +0000)]

Fix memory leaks in mlir/unittests/MLIRTableGenTests

Trying to get MLIR ASAN-clean.

Philip Reames [Sat, 2 Oct 2021 19:38:50 +0000 (12:38 -0700)]

[SCEV] Split isSCEVExprNeverPoison reasoning explicitly into scope and mustexecute parts [NFC]

Inspired by the needs of D111001 and D109845. The separation of concerns also makes it easier to reason about correctness and completeness.

Kazu Hirata [Sat, 2 Oct 2021 19:06:29 +0000 (12:06 -0700)]

[Target] Migrate from getNumArgOperands to arg_size (NFC)

Note that getNumArgOperands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.

Lang Hames [Sat, 2 Oct 2021 18:28:14 +0000 (11:28 -0700)]

[llvm-jitlink] Sink getPageSize call in Session::Create.

The page size for the host process is only needed in the in-process use case.

Simon Pilgrim [Fri, 1 Oct 2021 20:53:00 +0000 (21:53 +0100)]

[X86][Atom] Fix BSR/BSF uops + port usage

Both ports are required for BitScan ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner reports as well.

Craig Topper [Sat, 2 Oct 2021 17:44:05 +0000 (10:44 -0700)]

Revert "[RISCV] Add an GPR def to the Zvlseg SPILL/RELOAD pseudos"

This reverts commit 1f161919065fbfa2b39b8f373553a64b89f826f8.

We're seeing some issues with this internally. It seems that when
the spill is created by register allocation, the GPR doesn't get
allocated and an assertion fires during virtual register rewriting.

The .mir test case contains the spill before register allocation so
register allocation sees it as any other instruction.

mydeveloperday [Sat, 2 Oct 2021 17:04:32 +0000 (18:04 +0100)]

[clang-format] NFC 1% improvement in the overall clang-formatted status

Mehdi Amini [Sat, 2 Oct 2021 05:16:44 +0000 (05:16 +0000)]

Free memory leak on duplicate interface registration

I guess this is why we should use unique_ptr as much as possible.
Also fix the InterfaceAttachmentTest.cpp test.

Differential Revision: https://reviews.llvm.org/D110984

Simon Pilgrim [Sat, 2 Oct 2021 14:30:58 +0000 (15:30 +0100)]

[X86][SSE] Fix typo + infinite-loop in HOP(HOP'(X,X),HOP'(Y,Y)) fold (PR52040)

PR52040 identified several issues with the HOP(HOP'(X,X),HOP'(Y,Y)) -> HOP(PERMUTE(HOP'(X,Y)),PERMUTE(HOP'(X,Y))) slow-HOP fold.

Not only was there a copy+paste typo when accessing the inner HOP operands, but the (unnecessary) ReplaceAllUsesOfValueWith call was missing one-use checks.

Now that we have better shuffle combines of HOPs we can just return a new HOP() sequence and not use ReplaceAllUsesOfValueWith at all - this actually improved pair_sum_v8i32_v4i32 codegen as it kicks off further shuffle combines.

Josh Learn [Sat, 2 Oct 2021 12:22:49 +0000 (13:22 +0100)]

[clang-format] Constructor initializer lists format with pp directives

Currently constructor initializer lists sometimes format incorrectly
when there is a preprocessor directive in the middle of the list.
This patch fixes the issue when parsing the initializer list by
ignoring the preprocessor directive when checking if a block is
part of an initializer list.

rdar://82554274

Reviewed By: MyDeveloperDay, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D109951

mydeveloperday [Sat, 2 Oct 2021 12:18:00 +0000 (13:18 +0100)]

[clang-format] [docs] [NFC] improve clarity in the QualifierAlignment warning

Improve the clarity and guidance of the warning when using code-modifying options in clang-format; see {D69764}

Reviewed By: HazardyKnusperkeks, curdeius

Differential Revision: https://reviews.llvm.org/D110801

Mark de Wever [Sat, 2 Oct 2021 11:47:27 +0000 (13:47 +0200)]

[NFC][libc++] Use TEST_HAS_NO_EXCEPTIONS in tests.

Mark de Wever [Sat, 2 Oct 2021 11:41:05 +0000 (13:41 +0200)]

[libc++][doc] Update format status.

Updated based on recent commits, new reviews and work continuing for
P2216.

Simon Pilgrim [Fri, 1 Oct 2021 17:53:02 +0000 (18:53 +0100)]

[X86] decomposeMulByConstant - decompose legal vXi32 multiplies on SlowPMULLD targets and all vXi64 multiplies

X86's decomposeMulByConstant never permits mul decomposition to shift+add/sub if the vector multiply is legal.

Unfortunately this isn't great for SSE41+ targets, which have PMULLD for vXi32 multiplies, but it is often quite slow. This patch proposes to allow decomposition if the target has the SlowPMULLD flag (i.e. Silvermont). We also always decompose legal vXi64 multiplies - even the latest IceLake has really poor latencies for PMULLQ.
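
For readers unfamiliar with the transform, here is a sketch of the general idea of turning a multiply by a constant into a shift plus an add or subtract. It covers only constants of the form 2^n + 1 and 2^n - 1 and uses 32-bit wraparound; which constants the X86 hook actually accepts is not stated in this commit message, and both helpers are hypothetical.

```python
def decompose_mul(c):
    """Describe x*c as (op, shift) when c is 2**n + 1 or 2**n - 1, else None."""
    if c > 2 and (c - 1) & (c - 2) == 0:      # c - 1 is a power of two
        return ("add", (c - 1).bit_length() - 1)
    if c > 1 and (c + 1) & c == 0:            # c + 1 is a power of two
        return ("sub", (c + 1).bit_length() - 1)
    return None

def apply_plan(x, plan, bits=32):
    op, shift = plan
    mask = (1 << bits) - 1
    shifted = (x << shift) & mask
    # x*(2**n + 1) == (x << n) + x ; x*(2**n - 1) == (x << n) - x
    return (shifted + x) & mask if op == "add" else (shifted - x) & mask
```

For example, a multiply by 9 becomes `(x << 3) + x` and a multiply by 7 becomes `(x << 3) - x`, which avoids the slow vector multiply entirely.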

Differential Revision: https://reviews.llvm.org/D110588

Simon Pilgrim [Thu, 30 Sep 2021 11:28:02 +0000 (12:28 +0100)]

[X86] Atom SSE shift-by-variable take 2uops/3uops not 1uop

Based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well.

Roman Lebedev [Sat, 2 Oct 2021 10:40:09 +0000 (13:40 +0300)]

[X86][Costmodel] Load/store i8 Stride=4 VF=32 interleaving costs

While we already model this tuple, the load cost is divergent from reality, so fix it.

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/zWMhhnPYa - for intels `Block RThroughput: =56.0`; for ryzens, `Block RThroughput: <=24.0`
So pick cost of `56`.

For store we have:
https://godbolt.org/z/vnqqjWx51 - for intels `Block RThroughput: =12.0`; for ryzens, `Block RThroughput: <=4.0`
So pick cost of `12`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110971

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:40:09 +0000 (13:40 +0300)]

[X86][Costmodel] Load/store i8 Stride=4 VF=16 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/TrGW7cKsE - for intels `Block RThroughput: =24.0`; for ryzens, `Block RThroughput: <=12.0`
So pick cost of `24`.

For store we have:
https://godbolt.org/z/Mh7qaqEfe - for intels `Block RThroughput: =8.0`; for ryzens, `Block RThroughput: <=4.0`
So pick cost of `8`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110970

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:40:04 +0000 (13:40 +0300)]

[X86][Costmodel] Load/store i8 Stride=4 VF=8 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/v7746Wcf7 - for intels `Block RThroughput: =12.0`; for ryzens, `Block RThroughput: <=6.0`
So pick cost of `12`.

For store we have:
https://godbolt.org/z/aEeEohEbP - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110969

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:39:58 +0000 (13:39 +0300)]

[X86][Costmodel] Load/store i8 Stride=4 VF=4 interleaving costs

While we already model this tuple, the store cost is divergent from reality, so fix it.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/1n4bPh7Tn - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

For store we have:
https://godbolt.org/z/r8K9sveqo - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.
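
For readers unfamiliar with the (Stride, VF) tuple, here is a scalar reference sketch of what an i8 Stride=4 VF=4 interleaved load and store compute. It only shows the semantics, not the shuffle sequence llc emits; the function names are hypothetical:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t Stride = 4; // number of interleaved streams
constexpr std::size_t VF = 4;     // lanes per de-interleaved vector

// Interleaved load: element i of stream s lives at memory index i * Stride + s.
std::array<std::array<uint8_t, VF>, Stride>
deinterleave(const std::array<uint8_t, Stride * VF> &mem) {
  std::array<std::array<uint8_t, VF>, Stride> streams{};
  for (std::size_t i = 0; i < VF; ++i)
    for (std::size_t s = 0; s < Stride; ++s)
      streams[s][i] = mem[i * Stride + s];
  return streams;
}

// Interleaved store: write the streams back in the same memory layout.
std::array<uint8_t, Stride * VF>
interleave(const std::array<std::array<uint8_t, VF>, Stride> &streams) {
  std::array<uint8_t, Stride * VF> mem{};
  for (std::size_t i = 0; i < VF; ++i)
    for (std::size_t s = 0; s < Stride; ++s)
      mem[i * Stride + s] = streams[s][i];
  return mem;
}
```

The cost entries in these commits estimate how expensive the vector shuffles are that implement these two loops for a given element type, stride and VF.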

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110968

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:39:54 +0000 (13:39 +0300)]

[X86][Costmodel] Load/store i8 Stride=4 VF=2 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/KP6nn36zs - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

For store we have:
https://godbolt.org/z/ov95zhrq6 - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110966

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:39:15 +0000 (13:39 +0300)]

[X86][Costmodel] Load/store i8 Stride=3 VF=32 interleaving costs

For VF=16, costs are correct.
For VF=32, load cost is divergent.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/qKjevqf4W - for intels `Block RThroughput: <=14.0`; for ryzens, `Block RThroughput: <=4.5`
So pick cost of `14`.

For store we have:
https://godbolt.org/z/xTssTq319 - for intels `Block RThroughput: =13.0`; for ryzens, `Block RThroughput: <=5.5`
So pick cost of `13`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110961

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:39:15 +0000 (13:39 +0300)]

[X86][Costmodel] Load/store i8 Stride=3 VF=8 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/1jeocxj55 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=3.0`
So pick cost of `6`.

For store we have:
https://godbolt.org/z/fr7xfa3K5 - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `6`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110960

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:39:10 +0000 (13:39 +0300)]

[X86][Costmodel] Load/store i8 Stride=3 VF=4 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/obWz3PrfK - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=1.5`
So pick cost of `3`.

For store we have:
https://godbolt.org/z/orjPshn3h - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110958

commit | commitdiff | tree

Roman Lebedev [Sat, 2 Oct 2021 10:39:05 +0000 (13:39 +0300)]

[X86][Costmodel] Load/store i8 Stride=3 VF=2 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/WYscYMcW4 - for intels `Block RThroughput: =3.0`; for ryzens, `Block RThroughput: <=1.5`
So pick cost of `3`.

For store we have:
https://godbolt.org/z/e9qvYdbbs - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110956

commit | commitdiff | tree

Mark de Wever [Tue, 25 May 2021 18:32:38 +0000 (20:32 +0200)]

[libc++][format] Implement Unicode support.

This adds the width estimation functions to the std-format-spec.

Implements parts of:
- P0645 Text Formatting
- P1868 width: clarifying units of width and precision in std::format

Reviewed By: #libc, ldionne, vitaut

Differential Revision: https://reviews.llvm.org/D103413

commit | commitdiff | tree

Tomasz Miąsko [Sat, 2 Oct 2021 05:58:54 +0000 (07:58 +0200)]

[llvm-cxxfilt] Replace isalnum with isAlnum from StringExtras

D104366 introduced a new llvm-cxxfilt test with non-ASCII characters,
which caused a failure on llvm-clang-x86_64-expensive-checks-win
builder, with a stack trace suggesting issue in a call to isalnum.

The argument to isalnum should be either EOF or a value that is
representable in the type unsigned char. The llvm-cxxfilt does not
perform a cast from char to unsigned char before the call, so the
value might be out of valid range.

Replace the call to isalnum with isAlnum from StringExtras, which takes
a char as the argument. This also makes the check independent of the
current locale.
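
The two safe options described above can be sketched as follows. This is an illustration only, not the StringExtras implementation; both function names are hypothetical, and the range checks assume an ASCII-compatible execution character set:

```cpp
#include <cctype>

// Locale-independent check that accepts a plain char directly.
// Assumes ASCII, where the letter ranges are contiguous.
bool isAsciiAlnum(char c) {
  return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
         (c >= 'A' && c <= 'Z');
}

// If std::isalnum is kept, the char must be converted to unsigned char first;
// passing a negative char value (other than EOF) is undefined behaviour.
bool isAlnumViaCast(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) != 0;
}
```

The first form never consults the locale, so a byte such as `0xC3` from a UTF-8 sequence is simply rejected regardless of whether `char` is signed.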

Differential Revision: https://reviews.llvm.org/D110986

commit | commitdiff | tree

Amara Emerson [Sat, 2 Oct 2021 04:51:46 +0000 (21:51 -0700)]

[AArch64][GlobalISel] Lower G_SMULH/G_UMULH unless it's one of the supported types.

s32 was also incorrectly marked as a supported type, and was causing fallbacks
because we don't support it.

commit | commitdiff | tree

Alexey Lapshin [Thu, 23 Sep 2021 09:26:25 +0000 (12:26 +0300)]

[DWARF][NFC] add ParentIdx and SiblingIdx to DWARFDebugInfoEntry for faster navigation.

This patch implements a suggestion made while reviewing D102634. It adds two fields:
ParentIdx and SiblingIdx. These fields allow fast navigation to a die's parent and
sibling. These fields are set at the moment when dies are loaded.
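
The general idea of precomputed parent and sibling indices can be sketched as below. This is not the DWARFDebugInfoEntry code: the `Entry` layout, the `NoIdx` marker and `loadEntries` are hypothetical, and the input is assumed to be a valid pre-order listing of nesting depths:

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t NoIdx = UINT32_MAX; // hypothetical "no such entry" marker

struct Entry {
  uint32_t Depth = 0;
  uint32_t ParentIdx = NoIdx;  // index of the enclosing entry
  uint32_t SiblingIdx = NoIdx; // index of the next entry with the same parent
};

// Fill in ParentIdx/SiblingIdx once, while the entries are being loaded.
// Assumes each depth is at most one greater than the previous entry's depth.
std::vector<Entry> loadEntries(const std::vector<uint32_t> &depths) {
  std::vector<Entry> entries;
  std::vector<uint32_t> path; // path[d] = index of the latest entry at depth d
  for (uint32_t idx = 0; idx < depths.size(); ++idx) {
    const uint32_t d = depths[idx];
    Entry e;
    e.Depth = d;
    if (d > 0 && d <= path.size())
      e.ParentIdx = path[d - 1];
    if (d < path.size())
      entries[path[d]].SiblingIdx = idx; // previous entry at this depth
    path.resize(d + 1);
    path[d] = idx;
    entries.push_back(e);
  }
  return entries;
}
```

After loading, moving to a parent or a sibling is a single array index rather than a scan over the entries in between.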

dsymutil works 2% faster with this patch (run on clang binary).

Differential Revision: https://reviews.llvm.org/D110363

commit | commitdiff | tree

Mehdi Amini [Sat, 2 Oct 2021 04:45:40 +0000 (04:45 +0000)]

Fix memory leaks in mlir/test/CAPI/ir.c

commit | commitdiff | tree

Mehdi Amini [Sat, 2 Oct 2021 04:06:17 +0000 (04:06 +0000)]

Add a `check-mlir-build-only` build target that only builds the dependencies of the `check-mlir` test target (NFC)

commit | commitdiff | tree

Nimish Mishra [Thu, 30 Sep 2021 19:11:57 +0000 (00:41 +0530)]

[flang][OpenMP] Added OpenMP 5.0 specification based semantic checks for sections construct and test case for simd construct

According to OpenMP 5.0 spec document, the following semantic restrictions have been dealt with in this patch.

1. [sections construct] Orphaned section directives are prohibited. That is, the section directives must appear within the sections construct and must not be encountered elsewhere in the sections region.

Semantic checks for the following are not necessary, since use of orphaned section construct (i.e. without an enclosing sections directive) throws parser errors and control flow never reaches the semantic checking phase. Added a test case for the same.

2. [sections construct] Must be a structured block

Added test case and made changes to branching logic

3. [simd construct] Must be a structured block / A program that branches in or out of a function with declare simd is non conforming

4. Fixed !$omp do's handling of unlabeled CYCLEs

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D108904

commit | commitdiff | tree

Shivam Gupta [Sat, 2 Oct 2021 02:05:15 +0000 (07:35 +0530)]

[libc++][Docs] Update benchmark doc wrt monorepo

It seems this section has not been updated since we transitioned to the llvm-project monorepo.
At the start, we build libcxx under the monorepo configuration, but later try to make a separate configuration for building libcxx
and running the benchmarks.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D110722

commit | commitdiff | tree

LLVM GN Syncbot [Sat, 2 Oct 2021 00:21:42 +0000 (00:21 +0000)]

[gn build] Port 657f02d45804

commit | commitdiff | tree

Daniel Rodríguez Troitiño [Fri, 1 Oct 2021 21:30:21 +0000 (14:30 -0700)]

Revert "Extract LC_CODE_SIGNATURE related implementation out of LLD"

This reverts commit cc8229603b67763e77a46894f88f7d3ddd04de34.

As discussed in the review of https://reviews.llvm.org/D109972, this was not
the right approach, so we are reverting to start with a different approach.

Differential Revision: https://reviews.llvm.org/D110974

commit | commitdiff | tree

Philip Reames [Fri, 1 Oct 2021 23:39:23 +0000 (16:39 -0700)]

[test] add coverage for a SCEVUnknown scoped value in isSCEVExprNeverPoison

Note that a couple of the "negative" tests also end up showing miscompiles due to D109845 which is not yet fixed.

commit | commitdiff | tree

Philip Reames [Fri, 1 Oct 2021 23:30:44 +0000 (16:30 -0700)]

[SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes

This fixes a violation of the wrap flag rules introduced in c4048d8f. This was also noted in the (very old) PR23527.

The issue being fixed is that we treat the inbounds flag on any GEP as implying that all users of *any* gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true.

In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when it's legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics.

The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow-up ideas mentioned in PR51817.

It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before nor after exploits that.

Differential Revision: https://reviews.llvm.org/D109789

commit | commitdiff | tree

Philip Reames [Fri, 1 Oct 2021 22:57:37 +0000 (15:57 -0700)]

[SCEV] Remove invariant requirement from isSCEVExprNeverPoison

This code is attempting to prove that I must execute if we enter the defining scope of the SCEV which will be created from I. In the case where it found a defining addrec scope, it had a rather odd restriction that all of the other operands must be loop invariant in that addrec's loop.

As near as I can tell here, we really only need an upper bound on the defining scope. If we can prove the stronger property, then we must also have proven the property on the exact defining scope as well.

In practice, the actual effect of this change is narrow. The compile time restriction at the top of the routine basically limits us to I being an arithmetic instruction in some loop L with both an addrec operand in L and an unknown operand in L. Possible to demonstrate, but the main value of the change is removing unneeded code.

Differential Revision: https://reviews.llvm.org/D110892

commit | commitdiff | tree

Philip Reames [Fri, 1 Oct 2021 22:34:58 +0000 (15:34 -0700)]

[test] split flags-from-poison.ll to allow ease of autogen update

commit | commitdiff | tree

Jessica Paquette [Fri, 1 Oct 2021 16:22:51 +0000 (09:22 -0700)]

[AArch64][GlobalISel] Change G_ANYEXT fed by scalar G_ICMP to G_ZEXT

This is a common pattern:

```
    %icmp:_(s32) = G_ICMP intpred(eq), ...
    %ext:_(s64) = G_ANYEXT %icmp(s32)
    %and:_(s64) = G_AND %ext, 1
```

Here's an example: https://godbolt.org/z/T13f6o8zE

This pattern appears because of the following combine in the
LegalizationArtifactCombiner:

```
// zext(trunc x) - > and (aext/copy/trunc x), mask
```

Which kicks in when we widen the result of G_ICMP from 1 bit to 32 bits.

We know that, on AArch64, a scalar G_ICMP will produce 0 or 1. So the result
of `%ext` will always be 0 or 1 as well.

We have some KnownBits combines which eliminate redundant G_ANDs with masks.
These combines don't kick in with G_ANYEXT.

So, if we replace the G_ANYEXT with G_ZEXT in this situation, the KnownBits
based combines can remove the redundant G_AND.
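
Why the extension kind matters can be shown with a toy known-bits model. This is a simplified sketch, not GlobalISel's KnownBits API; the type and function names are hypothetical, and it assumes the compare result is 0 or 1 as stated above:

```cpp
#include <cstdint>

// Toy model: a set bit in Mask means that bit of the value is proven zero.
struct KnownZero64 {
  uint64_t Mask;
};

// A 32-bit compare result known to be 0 or 1: bits 1..31 are zero.
constexpr uint64_t IcmpKnownZero32 = 0xFFFFFFFEull;

// Any-extend: the upper 32 bits are unspecified, so nothing is known there.
KnownZero64 afterAnyExt(uint64_t srcKnownZero32) {
  return {srcKnownZero32 & 0xFFFFFFFFull};
}

// Zero-extend: the upper 32 bits are known to be zero.
KnownZero64 afterZExt(uint64_t srcKnownZero32) {
  return {(srcKnownZero32 & 0xFFFFFFFFull) | 0xFFFFFFFF00000000ull};
}

// x & andMask is redundant when every bit the mask clears is already zero.
bool andIsRedundant(KnownZero64 k, uint64_t andMask) {
  return (~andMask & ~k.Mask) == 0;
}
```

In this model `andIsRedundant(afterZExt(IcmpKnownZero32), 1)` holds, while the any-extended value leaves the top 32 bits unknown and the `G_AND` with 1 must stay.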

I wasn't sure if it would be more appropriate to

* Take this route
* Put this in the LegalizationArtifactCombiner.
* Allow 64 bit G_ICMP destinations

I decided on this route because

1) It's simple

2) I'm not sure if philosophically-speaking, we should be handling non-artifact
instructions + target-specific details like TargetBooleanContents in the
LegalizationArtifactCombiner

3) There is a lot of existing code which assumes we only have 32 bit G_ICMP
destinations. So, adding support for 64-bit destinations seems rather invasive
right now. I think that adding support for 64-bit destinations, or modelling
G_ICMP as ADDS/SUBS/etc is probably cleaner long term though.

This gives minor code size savings on all CTMark benchmarks.

Differential Revision: https://reviews.llvm.org/D110959

commit | commitdiff | tree

Stefan Pintilie [Fri, 1 Oct 2021 21:46:46 +0000 (16:46 -0500)]

[NFC][PowerPC] Add test case for byval store.

Added a test case for situations where a struct of size 1-7 bytes is
passed by value.

commit | commitdiff | tree

Daniil Suchkov [Fri, 1 Oct 2021 21:49:38 +0000 (21:49 +0000)]

Revert "[DomTree] Assert that blocks in queries aren't from another function"

This reverts commit 86046516e4f4527213c595c154c9971d81a49601.
This assertion fails on https://lab.llvm.org/buildbot/#/builders/98/builds/6690
Reverting it for now.

commit | commitdiff | tree

Amy Kwan [Fri, 1 Oct 2021 21:38:20 +0000 (16:38 -0500)]

Revert "tsan: fix and test detection of TLS races"

This reverts commit b4c1e5cb73bd26e5853af77c2a235ca9f35e2577.

Reverting this as it contains a test that is currently failing on the PPC BE bots.

commit | commitdiff | tree

Amy Kwan [Fri, 1 Oct 2021 21:35:15 +0000 (16:35 -0500)]

Revert "tsan: fix tls_race3 test on darwin"

This reverts commit ade5023c54cffcbefe0557b5473d55b06e40809b.

Reverting this commit as it is dependent on a test breaking the PPC BE bots.

commit | commitdiff | tree

Amy Kwan [Fri, 1 Oct 2021 21:32:32 +0000 (16:32 -0500)]

Revert "tsan: print a meaningful frame for stack races"

This reverts commit ccc83ac7c501c8e117753af0729414350aa9c117.

Reverting this commit as it is dependent on additional commits breaking the
PPC BE bots.
