platform/upstream/llvm.git
4 years ago[NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595)
Roman Lebedev [Sun, 13 Oct 2019 17:11:16 +0000 (17:11 +0000)]
[NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595)

While that pattern is indirectly handled via
reassociateShiftAmtsOfTwoSameDirectionShifts(),
that incurs a one-use restriction on truncation,
which is pointless since we know that we'll produce a single instruction.

Additionally, *if* we are only looking for sign bit,
we don't need shifts to be identical,
which isn't the case in general,
and is the blocker for me in the bug in question:

https://bugs.llvm.org/show_bug.cgi?id=43595
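
For illustration only, here is a minimal sketch in plain C++ rather than LLVM IR of
a sign bit test done with two right shifts around a truncation. The shift amounts
and the helper names are hypothetical and are not taken from the new tests; the
point is only that the shifted form and the signed comparison agree.

```cpp
#include <cstdint>

// Shift right, truncate to 32 bits, shift right again: only the original
// bit 63 (the sign bit of the 64-bit value) survives in bit 0.
inline bool signBitViaShifts(uint64_t x) {
  uint32_t hi = static_cast<uint32_t>(x >> 32);
  return (hi >> 31) != 0;
}

// The single comparison that the shifted form is equivalent to.
inline bool signBitViaCompare(uint64_t x) {
  return static_cast<int64_t>(x) < 0;
}
```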

llvm-svn: 374726

4 years ago[X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with...
Simon Pilgrim [Sun, 13 Oct 2019 17:03:11 +0000 (17:03 +0000)]
[X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with KnownUndef/Zero results.

llvm-svn: 374725

4 years ago[X86] getTargetShuffleInputs - add KnownUndef/Zero output support
Simon Pilgrim [Sun, 13 Oct 2019 17:03:02 +0000 (17:03 +0000)]
[X86] getTargetShuffleInputs - add KnownUndef/Zero output support

Adjust SimplifyDemandedVectorEltsForTargetNode to use the known elts masks instead of recomputing it locally.

llvm-svn: 374724

4 years ago[libc++][test] std::variant test cleanup
Casey Carter [Sun, 13 Oct 2019 16:46:16 +0000 (16:46 +0000)]
[libc++][test] std::variant test cleanup

* Add the conventional `return 0` to `main` in `variant.assign/conv.pass.cpp` and `variant.ctor/conv.pass.cpp`

* Fix some MSVC signed-to-unsigned conversion warnings by replacing `int` literals with `unsigned int` literals

llvm-svn: 374723

4 years ago[libc++][test] <=> now has a feature-test macro
Casey Carter [Sun, 13 Oct 2019 16:46:12 +0000 (16:46 +0000)]
[libc++][test] <=> now has a feature-test macro

...which `test/support/test_macros.h` can use to detect compiler support.
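
A minimal sketch of this kind of detection, assuming the macro in question is
`__cpp_impl_three_way_comparison` (the commit message does not name it).
`EXAMPLE_HAS_SPACESHIP` and `threeWay` are hypothetical names used only here and
are not what `test_macros.h` defines.

```cpp
// Assumption: __cpp_impl_three_way_comparison signals compiler support for <=>.
#if defined(__cpp_impl_three_way_comparison) && defined(__has_include)
#  if __has_include(<compare>)
#    include <compare>
#    define EXAMPLE_HAS_SPACESHIP 1  // hypothetical macro name
#  endif
#endif

// Returns -1, 0 or 1, using <=> when available and a fallback otherwise.
inline int threeWay(int a, int b) {
#ifdef EXAMPLE_HAS_SPACESHIP
  auto c = a <=> b;
  return c < 0 ? -1 : (c > 0 ? 1 : 0);
#else
  return a < b ? -1 : (a > b ? 1 : 0);
#endif
}
```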

llvm-svn: 374722

4 years agogn build: (manually) merge r374720
Nico Weber [Sun, 13 Oct 2019 15:25:13 +0000 (15:25 +0000)]
gn build: (manually) merge r374720

llvm-svn: 374721

4 years ago[clang-format] Proposal for clang-format to give compiler style warnings
Paul Hoad [Sun, 13 Oct 2019 14:51:45 +0000 (14:51 +0000)]
[clang-format] Proposal for clang-format to give compiler style warnings

relanding {D68554} with fixed lit tests, checked on Windows and MacOS

llvm-svn: 374720

4 years ago[X86][AVX] Add i686 avx splat tests
Simon Pilgrim [Sun, 13 Oct 2019 13:18:07 +0000 (13:18 +0000)]
[X86][AVX] Add i686 avx splat tests

llvm-svn: 374719

4 years agoMake most clangd unittests pass on Windows
Nico Weber [Sun, 13 Oct 2019 13:15:27 +0000 (13:15 +0000)]
Make most clangd unittests pass on Windows

The Windows triple currently turns on delayed template parsing, which
confuses several unit tests that use templates.

For now, just explicitly disable delayed template parsing. This isn't
ideal, but:

- the Windows triple will soon no longer use delayed template parsing
  by default

- there's precedent for this in the clangd unit tests already

- let's get the clangd tests passing on Windows first before making
  behavioral changes

Part of PR43592.

llvm-svn: 374718

4 years agoBlockInCriticalSectionChecker - silence static analyzer dyn_cast null dereference...
Simon Pilgrim [Sun, 13 Oct 2019 11:30:06 +0000 (11:30 +0000)]
BlockInCriticalSectionChecker - silence static analyzer dyn_cast null dereference warning. NFCI.

The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly, and if not, the assert will fire for us.

llvm-svn: 374717

4 years agoIRTranslator - silence static analyzer null dereference warnings. NFCI.
Simon Pilgrim [Sun, 13 Oct 2019 11:29:35 +0000 (11:29 +0000)]
IRTranslator - silence static analyzer null dereference warnings. NFCI.

The CmpInst::getType() calls can be replaced by calling getType() on the User that the CmpInst was dyn_cast from; we then need to assert that any default predicate cases came from a CmpInst.

llvm-svn: 374716

4 years ago[clang-tidy] bugprone-not-null-terminated-result: checker adjustments 4
Csaba Dabis [Sun, 13 Oct 2019 10:59:30 +0000 (10:59 +0000)]
[clang-tidy] bugprone-not-null-terminated-result: checker adjustments 4

llvm-svn: 374715

4 years ago[clang-tidy] bugprone-not-null-terminated-result: checker adjustments 3
Csaba Dabis [Sun, 13 Oct 2019 10:41:13 +0000 (10:41 +0000)]
[clang-tidy] bugprone-not-null-terminated-result: checker adjustments 3

On Windows the signed/unsigned int conversions of APInt seem broken, so
two of the test files are marked as unsupported on Windows as a hotfix.

llvm-svn: 374713

4 years ago[clang-tidy] bugprone-not-null-terminated-result: checker adjustments 2
Csaba Dabis [Sun, 13 Oct 2019 10:20:58 +0000 (10:20 +0000)]
[clang-tidy] bugprone-not-null-terminated-result: checker adjustments 2

llvm-svn: 374712

4 years ago[clang-tidy] bugprone-not-null-terminated-result: checker adjustments
Csaba Dabis [Sun, 13 Oct 2019 09:46:56 +0000 (09:46 +0000)]
[clang-tidy] bugprone-not-null-terminated-result: checker adjustments

llvm-svn: 374711

4 years ago[clang-tidy] bugprone-not-null-terminated-result: Sphinx adjustments 2
Csaba Dabis [Sun, 13 Oct 2019 08:49:43 +0000 (08:49 +0000)]
[clang-tidy] bugprone-not-null-terminated-result: Sphinx adjustments 2

llvm-svn: 374710

4 years ago[clang-tidy] bugprone-not-null-terminated-result: Sphinx adjustments
Csaba Dabis [Sun, 13 Oct 2019 08:41:24 +0000 (08:41 +0000)]
[clang-tidy] bugprone-not-null-terminated-result: Sphinx adjustments

llvm-svn: 374709

4 years agogn build: Merge r374707
GN Sync Bot [Sun, 13 Oct 2019 08:33:14 +0000 (08:33 +0000)]
gn build: Merge r374707

llvm-svn: 374708

4 years ago[clang-tidy] New checker for not null-terminated result caused by strlen(), size...
Csaba Dabis [Sun, 13 Oct 2019 08:28:27 +0000 (08:28 +0000)]
[clang-tidy] New checker for not null-terminated result caused by strlen(), size() or equal length

Summary:
New checker called bugprone-not-null-terminated-result. This checker finds
function calls where it is possible to cause a not null-terminated result.
Usually the proper length of a string is `strlen(src) + 1` or an expression
of equal value, because the null terminator needs an extra space.
Without the null terminator it can result in undefined behaviour when the
string is read.

The following and their respective `wchar_t` based functions are checked:

`memcpy`, `memcpy_s`, `memchr`, `memmove`, `memmove_s`, `strerror_s`,
`strncmp`, `strxfrm`

The following is a real-world example where the programmer forgot to
increase the passed third argument, which is `size_t length`.
That is why the length of the allocated memory is not enough to hold the
null terminator.

```
    static char *stringCpy(const std::string &str) {
      char *result = reinterpret_cast<char *>(malloc(str.size()));
      memcpy(result, str.data(), str.size());
      return result;
    }
```

In addition to issuing warnings, fix-it rewrites all the necessary code.
It also tries to adjust the capacity of the destination array:

```
    static char *stringCpy(const std::string &str) {
      char *result = reinterpret_cast<char *>(malloc(str.size() + 1));
      strcpy(result, str.data());
      return result;
    }
```

Note: It cannot guarantee rewriting every one of the path-sensitive memory
allocations.
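
As a complementary sketch (this is not the check's fix-it output): if `memcpy`
is kept instead of being replaced by `strcpy`, the same `size() + 1` rule can be
applied by hand, with the terminator written explicitly. `copyWithTerminator` is
a hypothetical name.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Allocates room for the characters plus the null terminator, copies the
// characters, then terminates the buffer. The caller frees the result.
static char *copyWithTerminator(const std::string &str) {
  char *result = static_cast<char *>(std::malloc(str.size() + 1));
  if (!result)
    return nullptr;
  std::memcpy(result, str.data(), str.size());
  result[str.size()] = '\0';
  return result;
}
```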

Reviewed By: JonasToth, aaron.ballman, whisperity, alexfh

Tags: #clang-tools-extra, #clang

Differential Revision: https://reviews.llvm.org/D45050

llvm-svn: 374707

4 years ago[X86] Add a one use check on the setcc to the min/max canonicalization code in combin...
Craig Topper [Sun, 13 Oct 2019 06:48:05 +0000 (06:48 +0000)]
[X86] Add a one use check on the setcc to the min/max canonicalization code in combineSelect.

This seems to improve std::midpoint code where we have a min and
a max with the same condition. If we split the setcc we can end
up with two compares if one of the operands is a constant,
since we aggressively canonicalize compares with constants.
For non-constants it can interfere with our ability to share
control flow if we need to expand cmovs into control flow.

I'm also not sure I understand this min/max canonicalization code.
The motivating case talks about comparing with 0. But we don't
check for 0 explicitly.

Removes one instruction from the codegen for PR43658.

llvm-svn: 374706

4 years ago[X86] Enable v4i32->v4i16 and v8i16->v8i8 saturating truncates to use pack instructio...
Craig Topper [Sun, 13 Oct 2019 05:47:47 +0000 (05:47 +0000)]
[X86] Enable v4i32->v4i16 and v8i16->v8i8 saturating truncates to use pack instructions with avx512.

llvm-svn: 374705

4 years ago[X86] Add v2i64->v2i32/v2i16/v2i8 test cases to the trunc packus/ssat/usat tests...
Craig Topper [Sun, 13 Oct 2019 05:47:42 +0000 (05:47 +0000)]
[X86] Add v2i64->v2i32/v2i16/v2i8 test cases to the trunc packus/ssat/usat tests. NFC

llvm-svn: 374704

4 years ago[Attributor][FIX] Avoid splitting blocks if possible
Johannes Doerfert [Sun, 13 Oct 2019 05:27:09 +0000 (05:27 +0000)]
[Attributor][FIX] Avoid splitting blocks if possible

Before, we eagerly split blocks even if it was not necessary, e.g., they
had a single unreachable instruction and only a single predecessor.

llvm-svn: 374703

4 years ago[Attributor][FIX] Remove leftover, now unused, variable
Johannes Doerfert [Sun, 13 Oct 2019 05:19:17 +0000 (05:19 +0000)]
[Attributor][FIX] Remove leftover, now unused, variable

llvm-svn: 374702

4 years ago[Attributor] Remove unused verification flag
Johannes Doerfert [Sun, 13 Oct 2019 05:07:00 +0000 (05:07 +0000)]
[Attributor] Remove unused verification flag

We now use the verify max iteration, which is more reliable.

llvm-svn: 374701

4 years ago[Attributor][NFC] Expose call site traversal without QueryingAA
Johannes Doerfert [Sun, 13 Oct 2019 04:16:02 +0000 (04:16 +0000)]
[Attributor][NFC] Expose call site traversal without QueryingAA

llvm-svn: 374700

4 years ago[Attributor][FIX] Ensure h2s doesn't trigger on escaped pointers
Johannes Doerfert [Sun, 13 Oct 2019 04:14:15 +0000 (04:14 +0000)]
[Attributor][FIX] Ensure h2s doesn't trigger on escaped pointers

We do not yet perform h2s because we know something is freed; we do
it because we know the pointer does not escape. Storing the pointer
allows it to escape so we have to prevent that.
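
A source-level sketch of the distinction, reading h2s as the heap-to-stack
conversion. All names are hypothetical and nothing here is taken from the
Attributor tests. In the first function the allocation never leaves the function;
in the second it is stored to a global, so it escapes and must stay on the heap.

```cpp
#include <cstdlib>

static int *Saved = nullptr;  // hypothetical global that captures a pointer

// The pointer is allocated, used and freed locally: it does not escape.
int localOnly() {
  int *p = static_cast<int *>(std::malloc(sizeof(int)));
  if (!p)
    return -1;
  *p = 42;
  int v = *p;
  std::free(p);
  return v;
}

// The pointer is stored to a global: it escapes this function.
int storesPointer() {
  int *p = static_cast<int *>(std::malloc(sizeof(int)));
  if (!p)
    return -1;
  *p = 7;
  Saved = p;  // the store lets the pointer outlive the call
  return *Saved;
}

// Releases the allocation captured by storesPointer().
void releaseSaved() {
  std::free(Saved);
  Saved = nullptr;
}
```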

llvm-svn: 374699

4 years ago[Attributor][FIX] Do not apply h2s for arbitrary mallocs
Johannes Doerfert [Sun, 13 Oct 2019 03:54:08 +0000 (03:54 +0000)]
[Attributor][FIX] Do not apply h2s for arbitrary mallocs

H2S did apply to mallocs of non-constant sizes if the uses were OK. This
is now forbidden through reordering of the "good" and "bad" cases in the
conditional.

llvm-svn: 374698

4 years ago[Attributor][FIX] Add missing function declaration in test case
Johannes Doerfert [Sun, 13 Oct 2019 02:42:09 +0000 (02:42 +0000)]
[Attributor][FIX] Add missing function declaration in test case

llvm-svn: 374696

4 years ago[Attributor][FIX] Avoid modifying naked/optnone functions
Johannes Doerfert [Sun, 13 Oct 2019 02:24:02 +0000 (02:24 +0000)]
[Attributor][FIX] Avoid modifying naked/optnone functions

The check for naked/optnone was insufficient for different reasons. We
now check before we initialize an abstract attribute and we do it for
all abstract attributes.

llvm-svn: 374694

4 years ago[SROA] Reuse existing lifetime markers if possible
Johannes Doerfert [Sun, 13 Oct 2019 02:21:23 +0000 (02:21 +0000)]
[SROA] Reuse existing lifetime markers if possible

Summary:
If the underlying alloca did not change, we do not necessarily need new
lifetime markers. This patch adds a check and reuses the old ones if
possible.

Reviewers: reames, ssarda, t.p.northover, hfinkel

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68900

llvm-svn: 374692

4 years agoRevert r374663 "[clang-format] Proposal for clang-format to give compiler style warnings"
Nico Weber [Sat, 12 Oct 2019 22:58:34 +0000 (22:58 +0000)]
Revert r374663 "[clang-format] Proposal for clang-format to give compiler style warnings"

The test fails on macOS and looks a bit wrong, see comments on the review.

Also revert follow-up r374686.

llvm-svn: 374688

4 years agogn build: (manually) merge r374663
Nico Weber [Sat, 12 Oct 2019 22:24:56 +0000 (22:24 +0000)]
gn build: (manually) merge r374663

llvm-svn: 374686

4 years ago[libc++][test] Silence MSVC warning in std::optional test
Casey Carter [Sat, 12 Oct 2019 19:01:46 +0000 (19:01 +0000)]
[libc++][test] Silence MSVC warning in std::optional test

`make_optional<string>(4, 'X')` passes `4` (an `int`) as the first argument to `string`'s `(size_t, charT)` constructor, triggering a signed/unsigned mismatch warning when compiling with MSVC at `/W4`. The incredibly simple fix is to instead use an unsigned literal (`4u`).
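
A standalone sketch of the fixed form (requires C++17; `fourXs` is a hypothetical
name, not the test's code):

```cpp
#include <optional>
#include <string>

// The unsigned literal matches the size parameter of string's (count, char)
// constructor without a signed-to-unsigned conversion.
inline std::optional<std::string> fourXs() {
  return std::make_optional<std::string>(4u, 'X');
}
```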

llvm-svn: 374684

4 years agoRevert r374648: "Reland r374388: [lit] Make internal diff work in pipelines"
Joel E. Denny [Sat, 12 Oct 2019 18:52:46 +0000 (18:52 +0000)]
Revert r374648: "Reland r374388: [lit] Make internal diff work in pipelines"

This series of patches still breaks a Windows bot.

llvm-svn: 374683

4 years agoRevert r374649: "Reland r374389: [lit] Clean up internal diff's encoding handling"
Joel E. Denny [Sat, 12 Oct 2019 18:52:31 +0000 (18:52 +0000)]
Revert r374649: "Reland r374389: [lit] Clean up internal diff's encoding handling"

This series of patches still breaks a Windows bot.

llvm-svn: 374682

4 years agoRevert r374650: "Reland r374390: [lit] Extend internal diff to support `-` argument"
Joel E. Denny [Sat, 12 Oct 2019 18:52:18 +0000 (18:52 +0000)]
Revert r374650: "Reland r374390: [lit] Extend internal diff to support `-` argument"

This series of patches still breaks a Windows bot.

llvm-svn: 374681

4 years agoRevert 374651: "Reland r374392: [lit] Extend internal diff to support -U"
Joel E. Denny [Sat, 12 Oct 2019 18:52:05 +0000 (18:52 +0000)]
Revert 374651: "Reland r374392: [lit] Extend internal diff to support -U"

This series of patches still breaks a Windows bot.

llvm-svn: 374680

4 years agoRevert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it"
Joel E. Denny [Sat, 12 Oct 2019 18:51:51 +0000 (18:51 +0000)]
Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it"

This series of patches still breaks a Windows bot.

llvm-svn: 374679

4 years agoRevert r374653: "[lit] Fix a few oversights in r374651 that broke some bots"
Joel E. Denny [Sat, 12 Oct 2019 18:51:34 +0000 (18:51 +0000)]
Revert r374653: "[lit] Fix a few oversights in r374651 that broke some bots"

This series of patches still breaks a Windows bot.

llvm-svn: 374678

4 years agoRevert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots"
Joel E. Denny [Sat, 12 Oct 2019 18:51:18 +0000 (18:51 +0000)]
Revert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots"

This series of patches still breaks a Windows bot.

llvm-svn: 374677

4 years agoRevert r374666: "[lit] Adjust error handling for decode introduced by r374665"
Joel E. Denny [Sat, 12 Oct 2019 18:51:08 +0000 (18:51 +0000)]
Revert r374666: "[lit] Adjust error handling for decode introduced by r374665"

This series of patches still breaks a Windows bot.

llvm-svn: 374676

4 years agoRevert r374671: "[lit] Try errors="ignore" for decode introduced by r374665"
Joel E. Denny [Sat, 12 Oct 2019 18:50:57 +0000 (18:50 +0000)]
Revert r374671: "[lit] Try errors="ignore" for decode introduced by r374665"

This series of patches still breaks a Windows bot.

llvm-svn: 374675

4 years ago[X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings
Simon Pilgrim [Sat, 12 Oct 2019 18:33:47 +0000 (18:33 +0000)]
[X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings

llvm-svn: 374674

4 years agoSymbolRecord - consistently use explicit for single operand constructors
Simon Pilgrim [Sat, 12 Oct 2019 17:55:09 +0000 (17:55 +0000)]
SymbolRecord - consistently use explicit for single operand constructors
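
A small sketch of what `explicit` changes for a single-operand constructor. The
two record types are hypothetical stand-ins, not the actual SymbolRecord classes.

```cpp
#include <cstdint>
#include <type_traits>

// Without explicit, a bare integer silently converts to the record type.
struct ImplicitRecord {
  ImplicitRecord(uint32_t Kind) : Kind(Kind) {}
  uint32_t Kind;
};

// With explicit, the conversion has to be spelled out at the call site.
struct ExplicitRecord {
  explicit ExplicitRecord(uint32_t Kind) : Kind(Kind) {}
  uint32_t Kind;
};

static_assert(std::is_convertible<uint32_t, ImplicitRecord>::value,
              "implicit conversion is allowed");
static_assert(!std::is_convertible<uint32_t, ExplicitRecord>::value,
              "implicit conversion is rejected");
```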

llvm-svn: 374673

4 years agoSymbolRecord - fix uninitialized variable warnings. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 17:55:01 +0000 (17:55 +0000)]
SymbolRecord - fix uninitialized variable warnings. NFCI.

llvm-svn: 374672

4 years ago[lit] Try errors="ignore" for decode introduced by r374665
Joel E. Denny [Sat, 12 Oct 2019 17:23:25 +0000 (17:23 +0000)]
[lit] Try errors="ignore" for decode introduced by r374665

Still trying to fix the same error as in r374666.

llvm-svn: 374671

4 years ago[NFC][LoopIdiom] Adjust FIXME to be self-explanatory
Roman Lebedev [Sat, 12 Oct 2019 16:48:16 +0000 (16:48 +0000)]
[NFC][LoopIdiom] Adjust FIXME to be self-explanatory

llvm-svn: 374670

4 years agoReplace for-loop of SmallVector::push_back with SmallVector::append. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 16:37:02 +0000 (16:37 +0000)]
Replace for-loop of SmallVector::push_back with SmallVector::append. NFCI.
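
The shape of this cleanup, sketched with `std::vector` as a stand-in so the
example needs no LLVM headers: `std::vector::insert` with an iterator range plays
the role that the range form of `SmallVector::append` plays in the real change.
Function names are hypothetical.

```cpp
#include <vector>

// Before: one push_back per element.
inline std::vector<int> appendWithLoop(std::vector<int> Dst,
                                       const std::vector<int> &Src) {
  for (int V : Src)
    Dst.push_back(V);
  return Dst;
}

// After: a single range append.
inline std::vector<int> appendWithRange(std::vector<int> Dst,
                                        const std::vector<int> &Src) {
  Dst.insert(Dst.end(), Src.begin(), Src.end());
  return Dst;
}
```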

llvm-svn: 374669

4 years agoFix cppcheck shadow variable name warnings. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 16:36:52 +0000 (16:36 +0000)]
Fix cppcheck shadow variable name warnings. NFCI.

llvm-svn: 374668

4 years ago[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 16:36:44 +0000 (16:36 +0000)]
[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI.
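
A sketch of this style of rewrite on a shuffle mask held in a `std::vector<int>`.
The predicates are hypothetical examples rather than the functions touched by the
commit, and the use of -1 for an undef element is an assumption of the sketch.

```cpp
#include <algorithm>
#include <vector>

constexpr int SentinelUndef = -1;  // assumed marker for an undef mask element

// Hand-written loop form.
inline bool isAllUndefLoop(const std::vector<int> &Mask) {
  for (int M : Mask)
    if (M != SentinelUndef)
      return false;
  return true;
}

// Same predicate expressed with std::all_of.
inline bool isAllUndef(const std::vector<int> &Mask) {
  return std::all_of(Mask.begin(), Mask.end(),
                     [](int M) { return M == SentinelUndef; });
}

// True if at least one element is undef, expressed with std::any_of.
inline bool hasAnyUndef(const std::vector<int> &Mask) {
  return std::any_of(Mask.begin(), Mask.end(),
                     [](int M) { return M == SentinelUndef; });
}
```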

llvm-svn: 374667

4 years ago[lit] Adjust error handling for decode introduced by r374665
Joel E. Denny [Sat, 12 Oct 2019 16:25:46 +0000 (16:25 +0000)]
[lit] Adjust error handling for decode introduced by r374665

On that decode, Windows bots fail with:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

That's the same error as before r374665 except it's now at the decode
before the write to stdout.

llvm-svn: 374666

4 years ago[lit] Try yet again to fix new tests that fail on Windows bots
Joel E. Denny [Sat, 12 Oct 2019 16:00:35 +0000 (16:00 +0000)]
[lit] Try yet again to fix new tests that fail on Windows bots

I seem to have misread the bot logs on my last attempt.  When lit's
internal diff runs on Windows under Python 2.7, it's text diffs not
binary diffs that need decoding to avoid this error when writing the
diff to stdout:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

There is no `decode` attribute in this case under Python 3.6.8 under
Ubuntu, so this patch checks for the `decode` attribute before using
it here.  Hopefully nothing else is needed when `decode` isn't
available.

It might take a couple more attempts to figure out what error
handling, if any, is needed for this decoding.

llvm-svn: 374665

4 years agoRevert r374657: "[lit] Try again to fix new tests that fail on Windows bots"
Joel E. Denny [Sat, 12 Oct 2019 16:00:25 +0000 (16:00 +0000)]
Revert r374657: "[lit] Try again to fix new tests that fail on Windows bots"

llvm-svn: 374664

4 years ago[clang-format] Proposal for clang-format to give compiler style warnings
Paul Hoad [Sat, 12 Oct 2019 15:36:05 +0000 (15:36 +0000)]
[clang-format] Proposal for clang-format to give compiler style warnings

Summary:
Related somewhat to {D29039}

On seeing a quote on twitter by @invalidop

> If it's not formatted with clang-format it's a build error.

This made me want to change the way I use clang-format into a tool that could optionally show me where my source code violates clang-format style.

When I'm making a change to clang-format itself, one thing I like to do to test the change is to ensure I didn't cause a huge wave of changes. What I want to do is simply run this on a known formatted directory and see if any new differences appear, in a manner I'm used to.

This started me thinking that we should allow build systems to run clang-format on a whole tree and emit compiler-style warnings about files that fail clang-format, in a form that would show up as a warning in most build systems. Because those build systems range in their construction, I don't think it's unreasonable to NOT expect them to do the directory searching or parse the output replacements themselves, but simply to transform that into an error code when there are changes required.

I am starting this by suggesting adding a -n or -dry-run command line argument which would emit warnings/errors of the form shown in the example below.

Support for various common compiler command line arguments like '-Werror' and '-ferror-limit' could make this very flexible to be integrated into build systems and CI systems.

```
> $ /usr/bin/clang-format --dry-run ClangFormat.cpp -ferror-limit=3 -fcolor-diagnostics
> ClangFormat.cpp:54:29: warning: code should be clang-formatted [-Wclang-format-violations]
> static cl::list<std::string>
>                             ^
> ClangFormat.cpp:55:20: warning: code should be clang-formatted [-Wclang-format-violations]
> LineRanges("lines", cl::desc("<start line>:<end line> - format a range of\n"
>                    ^
> ClangFormat.cpp:55:77: warning: code should be clang-formatted [-Wclang-format-violations]
> LineRanges("lines", cl::desc("<start line>:<end line> - format a range of\n"
>                                                                             ^
```

Reviewers: mitchell-stellar, klimek, owenpan

Reviewed By: klimek

Subscribers: mgorny, cfe-commits

Tags: #clang-format, #clang-tools-extra, #clang

Differential Revision: https://reviews.llvm.org/D68554

llvm-svn: 374663

4 years ago[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition
Roman Lebedev [Sat, 12 Oct 2019 15:35:32 +0000 (15:35 +0000)]
[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition

Summary:
This is a recommit; this originally landed in rL370454 but was
subsequently reverted in rL370788 due to
https://bugs.llvm.org/show_bug.cgi?id=43206
The reduced testcase was added to bcmp-negative-tests.ll
as @pr43206_different_loops - we must ensure that the SCEVs
we got are both for the same loop we are currently investigating.

Original commit message:

@mclow.lists brought this issue up on IRC.
It is a reasonably common problem to compare some two values for equality.
Those may be just some integers, strings or arrays of integers.

In C, there are the `memcmp()` and `bcmp()` functions.
In C++, there exists `std::equal()` algorithm.
One can also write that function manually.

libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for
various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ

libc++ does not do anything like that; it simply relies on plain C++
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)

So there is likely a performance opportunity here.
Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that
is using `memcmp()` (in this case, compiled with modified compiler). {F8768213}

```
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>

#include "benchmark/benchmark.h"

template <class T>
bool equal(T* a, T* a_end, T* b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b) return false;
  }
  return true;
}

template <typename T>
std::vector<T> getVectorOfRandomNumbers(size_t count) {
  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
                                       std::numeric_limits<T>::max());
  std::vector<T> v;
  v.reserve(count);
  std::generate_n(std::back_inserter(v), count,
                  [&dis, &gen]() { return dis(gen); });
  assert(v.size() == count);
  return v;
}

struct Identical {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto Tmp = getVectorOfRandomNumbers<T>(count);
    return std::make_pair(Tmp, std::move(Tmp));
  }
};

struct InequalHalfway {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto V0 = getVectorOfRandomNumbers<T>(count);
    auto V1 = V0;
    V1[V1.size() / size_t(2)]++;  // just change the value.
    return std::make_pair(std::move(V0), std::move(V1));
  }
};

template <class T, class Gen>
void BM_bcmp(benchmark::State& state) {
  const size_t Length = state.range(0);

  const std::pair<std::vector<T>, std::vector<T>> Data =
      Gen::template Gen<T>(Length);
  const std::vector<T>& a = Data.first;
  const std::vector<T>& b = Data.second;
  assert(a.size() == Length && b.size() == a.size());

  benchmark::ClobberMemory();
  benchmark::DoNotOptimize(a);
  benchmark::DoNotOptimize(a.data());
  benchmark::DoNotOptimize(b);
  benchmark::DoNotOptimize(b.data());

  for (auto _ : state) {
    const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
    benchmark::DoNotOptimize(is_equal);
  }
  state.SetComplexityN(Length);
  state.counters["eltcnt"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant);
  state.counters["eltcnt/sec"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate);
  const size_t BytesRead = 2 * sizeof(T) * Length;
  state.counters["bytes_read/iteration"] =
      benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
                         benchmark::Counter::OneK::kIs1024);
  state.counters["bytes_read/sec"] = benchmark::Counter(
      BytesRead, benchmark::Counter::kIsIterationInvariantRate,
      benchmark::Counter::OneK::kIs1024);
}

template <typename T>
static void CustomArguments(benchmark::internal::Benchmark* b) {
  const size_t L2SizeBytes = []() {
    for (const benchmark::CPUInfo::CacheInfo& I :
         benchmark::CPUInfo::Get().caches) {
      if (I.level == 2) return I.size;
    }
    return 0;
  }();
  // What is the largest range we can check to always fit within given L2 cache?
  const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
                        /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2;
  b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical)
    ->Apply(CustomArguments<uint64_t>);

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway)
    ->Apply(CustomArguments<uint64_t>);
```
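
As a worked check of the `MaxLen` arithmetic in `CustomArguments`: the L2 size
below is an assumption (2,048,000 bytes), chosen because it reproduces the
largest sizes that appear in the results that follow (512000, 256000, 128000 and
64000); the benchmark itself queries the value at run time. `maxLenFor` and
`AssumedL2SizeBytes` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Same formula as MaxLen: L2 size / two buffers / element size / safety margin.
template <typename T>
constexpr std::size_t maxLenFor(std::size_t L2SizeBytes) {
  return L2SizeBytes / 2 / sizeof(T) / 2;
}

constexpr std::size_t AssumedL2SizeBytes = 2048000;  // assumption, see above

static_assert(maxLenFor<std::uint8_t>(AssumedL2SizeBytes) == 512000, "");
static_assert(maxLenFor<std::uint16_t>(AssumedL2SizeBytes) == 256000, "");
static_assert(maxLenFor<std::uint32_t>(AssumedL2SizeBytes) == 128000, "");
static_assert(maxLenFor<std::uint64_t>(AssumedL2SizeBytes) == 64000, "");
```

With these lengths, both buffers together occupy `2 * sizeof(T) * Length` =
1,024,000 bytes for every element type, which matches the
`bytes_read/iteration=1000k` counter in the results.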
{F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench
RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx
2019-04-25 21:17:11
Running build-old/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000           432131 ns       432101 ns         1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO               0.86 N          0.86 N
BM_bcmp<uint8_t, Identical>_RMS                   8 %             8 %
<...>
BM_bcmp<uint16_t, Identical>/256000          161408 ns       161409 ns         4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO              0.67 N          0.67 N
BM_bcmp<uint16_t, Identical>_RMS                 25 %            25 %
<...>
BM_bcmp<uint32_t, Identical>/128000           81497 ns        81488 ns         8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO              0.71 N          0.71 N
BM_bcmp<uint32_t, Identical>_RMS                 42 %            42 %
<...>
BM_bcmp<uint64_t, Identical>/64000            50138 ns        50138 ns        10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO              0.84 N          0.84 N
BM_bcmp<uint64_t, Identical>_RMS                 27 %            27 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000      192405 ns       192392 ns         3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.38 N          0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS              3 %             3 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000     127858 ns       127860 ns         5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS             0 %             0 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000      49140 ns        49140 ns        14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.40 N          0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            18 %            18 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000       32101 ns        32099 ns        21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS             1 %             1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29
Running build-new/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000            18593 ns        18590 ns        37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO               0.04 N          0.04 N
BM_bcmp<uint8_t, Identical>_RMS                  37 %            37 %
<...>
BM_bcmp<uint16_t, Identical>/256000           18950 ns        18948 ns        37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO              0.08 N          0.08 N
BM_bcmp<uint16_t, Identical>_RMS                 34 %            34 %
<...>
BM_bcmp<uint32_t, Identical>/128000           18627 ns        18627 ns        37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO              0.16 N          0.16 N
BM_bcmp<uint32_t, Identical>_RMS                 35 %            35 %
<...>
BM_bcmp<uint64_t, Identical>/64000            18855 ns        18855 ns        37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO              0.32 N          0.32 N
BM_bcmp<uint64_t, Identical>_RMS                 33 %            33 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000        9570 ns         9569 ns        73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.02 N          0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS             29 %            29 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000       9547 ns         9547 ns        74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.04 N          0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS            29 %            29 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000       9396 ns         9394 ns        73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.08 N          0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            30 %            30 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000        9499 ns         9498 ns        73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.16 N          0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS            28 %            28 %
Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench
Benchmark                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000                      -0.9570         -0.9570        432131         18593        432101         18590
<...>
BM_bcmp<uint16_t, Identical>/256000                     -0.8826         -0.8826        161408         18950        161409         18948
<...>
BM_bcmp<uint32_t, Identical>/128000                     -0.7714         -0.7714         81497         18627         81488         18627
<...>
BM_bcmp<uint64_t, Identical>/64000                      -0.6239         -0.6239         50138         18855         50138         18855
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000                 -0.9503         -0.9503        192405          9570        192392          9569
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000                -0.9253         -0.9253        127858          9547        127860          9547
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000                -0.8088         -0.8088         49140          9396         49140          9394
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000                 -0.7041         -0.7041         32101          9499         32099          9498
```

What can we tell from the benchmark?
* Performance of the naive equality check improves somewhat with element size,
  maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s
  for uint64_t. I think that instability implies performance problems.
* Performance of the `memcmp()`-aware benchmark always maxes out at around
  bytes_read/sec=51.2991G/s for every type. That is about 2.7x the throughput of
  the naive variant!
* The eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
  eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so about 23x) and
  decreases linearly with element size.
  For uint64_t, it's ~4x+ the elements/second.
* The call is obviously more expensive than the loop for small element counts.
  As can be seen from the full output {F8768210}, `memcmp()` is almost
  universally worse, independent of the element size (and thus buffer size), when
  the element count is less than 8.

So, all in all, the bcmp idiom does indeed offer untapped performance headroom.
This diff implements said idiom recognition. I think reasonable test
coverage is present, but do tell if anything obvious is missing.
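
As a rough illustration of the idiom in question, here is the kind of element-wise equality loop being discussed, next to the library-call form it is equivalent to. This is a minimal sketch, not the test code from this diff: the function names are made up, and the equivalence assumes element types without padding bits (such as the fixed-width unsigned integers used in the benchmark).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Naive element-wise equality check: exits early on the first mismatch.
template <typename T>
bool equalNaive(const T *a, const T *b, std::size_t count) {
  for (std::size_t i = 0; i != count; ++i)
    if (a[i] != b[i])
      return false;
  return true;
}

// The same question asked through a single library call over the raw bytes.
template <typename T>
bool equalViaMemcmp(const T *a, const T *b, std::size_t count) {
  return std::memcmp(a, b, count * sizeof(T)) == 0;
}
```

Only the equal/not-equal result is consumed here, which is why `bcmp()` semantics are sufficient and the ordering result of `memcmp()` is not needed.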

Now, quality. This builds and passes the test-suite, at least
without any non-bundled elements. {F8768216} {F8768217}
This transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json
Tests: 1149
Metric: loop-idiom.NumBCmp

Program                                         result-new

MultiSourc...Benchmarks/7zip/7zip-benchmark    79.00
MultiSource/Applications/d/make_dparser         3.00
SingleSource/UnitTests/vla                      2.00
MultiSource/Applications/Burg/burg              1.00
MultiSourc.../Applications/JM/lencod/lencod     1.00
MultiSource/Applications/lemon/lemon            1.00
MultiSource/Benchmarks/Bullet/bullet            1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs     1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     1.00
MultiSourc...Prolangs-C/simulator/simulator     1.00
```
I'm not sure what's going on with SingleSource/UnitTests/vla.test yet; I did not look.
The size changes are:
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash
Tests: 1149
Same hash: 907 (filtered out)
Remaining: 242
Metric: size..text

Program                                        result-old result-new diff
test-suite...ingleSource/UnitTests/vla.test   753.00     833.00     10.6%
test-suite...marks/7zip/7zip-benchmark.test   1001697.00 966657.00  -3.5%
test-suite...ngs-C/simulator/simulator.test   32369.00   32321.00   -0.1%
test-suite...plications/d/make_dparser.test   89585.00   89505.00   -0.1%
test-suite...ce/Applications/Burg/burg.test   40817.00   40785.00   -0.1%
test-suite.../Applications/lemon/lemon.test   47281.00   47249.00   -0.1%
test-suite...TimberWolfMC/timberwolfmc.test   250065.00  250113.00   0.0%
test-suite...chmarks/MallocBench/gs/gs.test   149889.00  149873.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   769585.00  769569.00  -0.0%
test-suite.../Benchmarks/Bullet/bullet.test   770049.00  770049.00   0.0%
test-suite...HMARK_ANISTROPIC_DIFFUSION/128    NaN        NaN        nan%
test-suite...HMARK_ANISTROPIC_DIFFUSION/256    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/64    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/32    NaN        NaN        nan%
test-suite...ENCHMARK_BILATERAL_FILTER/64/4    NaN        NaN        nan%
Geomean difference                                                   nan%
         result-old    result-new       diff
count  1.000000e+01  10.00000      10.000000
mean   3.152090e+05  311695.40000  0.006749
std    3.790398e+05  372091.42232  0.036605
min    7.530000e+02  833.00000    -0.034981
25%    4.243300e+04  42401.00000  -0.000866
50%    1.197370e+05  119689.00000 -0.000392
75%    6.397050e+05  639705.00000 -0.000005
max    1.001697e+06  966657.00000  0.106242
```

I don't have timings though.

And now to the code. The basic idea is to completely replace the whole loop.
If we can't fully kill it, don't transform.
I have left one or two comments in the code, so hopefully it can be understood.

Also, there are a few TODOs that I have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???

Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet

Reviewed By: courbet

Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61144

llvm-svn: 374662

4 years ago[NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206.
Roman Lebedev [Sat, 12 Oct 2019 15:35:16 +0000 (15:35 +0000)]
[NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206.

The transform forgot to check SCEV loop scopes.

https://bugs.llvm.org/show_bug.cgi?id=43206

llvm-svn: 374661

4 years ago[NFC][LoopIdiom] Move one bcmp test into the proper place
Roman Lebedev [Sat, 12 Oct 2019 15:35:09 +0000 (15:35 +0000)]
[NFC][LoopIdiom] Move one bcmp test into the proper place

llvm-svn: 374660

4 years agoremove a useless allocation found by scan-build - the new Dead nested assignment...
Sylvestre Ledru [Sat, 12 Oct 2019 15:24:00 +0000 (15:24 +0000)]
remove a useless allocation found by scan-build - the new Dead nested assignment check

llvm-svn: 374659

4 years ago[X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction
Simon Pilgrim [Sat, 12 Oct 2019 15:19:13 +0000 (15:19 +0000)]
[X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction

This should go away once D66004 has landed and we can simplify shuffle chains using demanded elts.

llvm-svn: 374658

4 years ago[lit] Try again to fix new tests that fail on Windows bots
Joel E. Denny [Sat, 12 Oct 2019 14:58:43 +0000 (14:58 +0000)]
[lit] Try again to fix new tests that fail on Windows bots

Based on the bot logs, when lit's internal diff runs on Windows, it
looks like binary diffs must be decoded also for Python 2.7.
Otherwise, writing the diff to stdout fails with:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

I did not need to decode using Python 2.7.15 under Ubuntu.  When I do
it anyway in that case, `errors="backslashreplace"` fails for me:

```
TypeError: don't know how to handle UnicodeDecodeError in error callback
```

However, `errors="ignore"` works, so this patch uses that, hoping
it'll work on Windows as well.

This patch leaves `errors="backslashreplace"` for Python >= 3.5 as
there's no evidence yet that it doesn't work and it produces more
informative binary diffs.  This patch also adjusts some lit tests to
succeed for either error handler.

This patch adjusts changes introduced by D68664.

llvm-svn: 374657

4 years agoRevert r374654: "[lit] Try to fix new tests that fail on Windows bots"
Joel E. Denny [Sat, 12 Oct 2019 14:58:30 +0000 (14:58 +0000)]
Revert r374654: "[lit] Try to fix new tests that fail on Windows bots"

llvm-svn: 374656

4 years ago[CostModel][X86] Improve sum reduction costs.
Simon Pilgrim [Sat, 12 Oct 2019 13:21:50 +0000 (13:21 +0000)]
[CostModel][X86] Improve sum reduction costs.

I can't see any notable differences in costs between SSE2 and SSE42 arches for FADD/ADD reduction, so I've lowered the target to just SSE2.

I've also added vXi8 sum reduction costs in line with the PSADBW codegen and discussions on PR42674.

llvm-svn: 374655

4 years ago[lit] Try to fix new tests that fail on Windows bots
Joel E. Denny [Sat, 12 Oct 2019 13:08:21 +0000 (13:08 +0000)]
[lit] Try to fix new tests that fail on Windows bots

llvm-svn: 374654

4 years ago[lit] Fix a few oversights in r374651 that broke some bots
Joel E. Denny [Sat, 12 Oct 2019 12:32:00 +0000 (12:32 +0000)]
[lit] Fix a few oversights in r374651 that broke some bots

llvm-svn: 374653

4 years ago[lit] Fix internal diff's --strip-trailing-cr and use it
Joel E. Denny [Sat, 12 Oct 2019 11:58:30 +0000 (11:58 +0000)]
[lit] Fix internal diff's --strip-trailing-cr and use it

Using GNU diff, `--strip-trailing-cr` removes a `\r` appearing before
a `\n` at the end of a line.  Without this patch, lit's internal diff
only removes `\r` if it appears as the last character.  That seems
useless.  This patch fixes that.

This patch also adds `--strip-trailing-cr` to some tests that fail on
Windows bots when D68664 is applied.  Based on what I see in the bot
logs, I think the following is happening.  In each test there, lit
diff is comparing a file with `\r\n` line endings to a file with `\n`
line endings.  Without D68664, lit diff reads those files with
Python's universal newlines support activated, causing `\r` to be
dropped.  However, with D68664, lit diff reads the files in binary
mode instead and thus reports that every line is different, just as
GNU diff does (at least under Ubuntu).  Adding `--strip-trailing-cr`
to those tests restores the previous behavior while permitting the
behavior of lit diff to be more like GNU diff.
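
The GNU-like behavior described above (drop a `\r` only when it directly precedes a `\n`) can be sketched as follows. lit itself is written in Python; this C++ function, including its name `stripTrailingCR`, is only an illustration of the intended text transformation and is not code from the patch.

```cpp
#include <cstddef>
#include <string>

// Drop every '\r' that is immediately followed by '\n'.
// A '\r' elsewhere in a line is left untouched.
std::string stripTrailingCR(const std::string &text) {
  std::string out;
  out.reserve(text.size());
  for (std::size_t i = 0; i < text.size(); ++i) {
    // Skip a carriage return only when a line feed follows it.
    if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n')
      continue;
    out.push_back(text[i]);
  }
  return out;
}
```

With this, a file with `\r\n` line endings and a file with `\n` line endings compare equal after both are passed through the function.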

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D68839

llvm-svn: 374652

4 years agoReland r374392: [lit] Extend internal diff to support -U
Joel E. Denny [Sat, 12 Oct 2019 11:58:03 +0000 (11:58 +0000)]
Reland r374392: [lit] Extend internal diff to support -U

To avoid breaking some tests, D66574, D68664, D67643, and D68668
landed together.  However, D68664 introduced an issue now addressed by
D68839, with which these are now all relanding.

Differential Revision: https://reviews.llvm.org/D68668

llvm-svn: 374651

4 years agoReland r374390: [lit] Extend internal diff to support `-` argument
Joel E. Denny [Sat, 12 Oct 2019 11:57:41 +0000 (11:57 +0000)]
Reland r374390: [lit] Extend internal diff to support `-` argument

To avoid breaking some tests, D66574, D68664, D67643, and D68668
landed together.  However, D68664 introduced an issue now addressed by
D68839, with which these are now all relanding.

Differential Revision: https://reviews.llvm.org/D67643

llvm-svn: 374650

4 years agoReland r374389: [lit] Clean up internal diff's encoding handling
Joel E. Denny [Sat, 12 Oct 2019 11:57:20 +0000 (11:57 +0000)]
Reland r374389: [lit] Clean up internal diff's encoding handling

To avoid breaking some tests, D66574, D68664, D67643, and D68668
landed together.  However, D68664 introduced an issue now addressed by
D68839, with which these are now all relanding.

Differential Revision: https://reviews.llvm.org/D68664

llvm-svn: 374649

4 years agoReland r374388: [lit] Make internal diff work in pipelines
Joel E. Denny [Sat, 12 Oct 2019 11:56:57 +0000 (11:56 +0000)]
Reland r374388: [lit] Make internal diff work in pipelines

To avoid breaking some tests, D66574, D68664, D67643, and D68668
landed together.  However, D68664 introduced an issue now addressed by
D68839, with which these are now all relanding.

Differential Revision: https://reviews.llvm.org/D66574

llvm-svn: 374648

4 years ago[Attributor] Extend anonymous namespace. NFC.
Benjamin Kramer [Sat, 12 Oct 2019 11:01:52 +0000 (11:01 +0000)]
[Attributor] Extend anonymous namespace. NFC.

llvm-svn: 374647

4 years ago[LV] Merge LLVM_DEBUG blocks.
Benjamin Kramer [Sat, 12 Oct 2019 10:57:22 +0000 (10:57 +0000)]
[LV] Merge LLVM_DEBUG blocks.

Avoids unused variable warnings about the range-based for loops in
there. NFCI.

llvm-svn: 374646

4 years ago[X86] Use pack instructions for packus/ssat truncate patterns when 256-bit is the...
Craig Topper [Sat, 12 Oct 2019 07:59:29 +0000 (07:59 +0000)]
[X86] Use pack instructions for packus/ssat truncate patterns when 256-bit is the largest legal vector and the result type is at least 256 bits.

Since the input type is larger than 256 bits, we'll need to do some
concatenating to reassemble the results. The pack instructions'
ability to concatenate while packing makes this a shorter/faster
sequence.

llvm-svn: 374643

4 years ago[X86] Test SKX cpu in the vector-trunc-packus/ssat/usat.ll tests instead of min-legal...
Craig Topper [Sat, 12 Oct 2019 07:59:24 +0000 (07:59 +0000)]
[X86] Test SKX cpu in the vector-trunc-packus/ssat/usat.ll tests instead of min-legal-vector-width.ll

This adds "min-legal-vector-width"="256" function attributes to
all the tests for a larger than 256-bit input. Also switch any
larger than 512-bit inputs to use a load. This makes the
arguments consistent with min-legal-vector-width attribute which
should usually be at least as large as the arguments.

The SKX configuration will avoid using zmm registers on the
modified test cases. For many of them we should use something
closer to the AVX2 codegen with pack instructions instead of
the avx512 saturating truncates.

llvm-svn: 374642

4 years ago[mips] Rely on GPR size, not ABI, when selecting the instruction to load a value into a register
Simon Atanasyan [Sat, 12 Oct 2019 07:42:51 +0000 (07:42 +0000)]
[mips] Rely on GPR size, not ABI, when selecting the instruction to load a value into a register

llvm-svn: 374641

4 years ago[mips] Fix `loadImmediate` calls when loading non-address values.
Simon Atanasyan [Sat, 12 Oct 2019 07:42:44 +0000 (07:42 +0000)]
[mips] Fix `loadImmediate` calls when loading non-address values.

llvm-svn: 374640

4 years ago[lit] Remove setting of the target-windows feature
Martin Storsjo [Sat, 12 Oct 2019 06:40:24 +0000 (06:40 +0000)]
[lit] Remove setting of the target-windows feature

No other OSes use a target-<os> feature, and no tests depend on it
any longer.

Differential Revision: https://reviews.llvm.org/D68450

llvm-svn: 374639

4 years ago[clang][IFS] Fixing spelling errors in interface-stubs OPT flag (NFC).
Puyan Lotfi [Sat, 12 Oct 2019 06:25:07 +0000 (06:25 +0000)]
[clang][IFS] Fixing spelling errors in interface-stubs OPT flag (NFC).

This is just a long standing spelling error that was found recently.

llvm-svn: 374638

4 years ago[llvm-lipo] Pass ArrayRef by value.
Alexander Shaposhnikov [Sat, 12 Oct 2019 06:14:02 +0000 (06:14 +0000)]
[llvm-lipo] Pass ArrayRef by value.

Pass ArrayRef by value, fix formatting. NFC.

Test plan: make check-all

llvm-svn: 374637

4 years agoRevert 374629 "[sancov] Accommodate sancov and coverage report server for use under...
Vitaly Buka [Sat, 12 Oct 2019 05:23:43 +0000 (05:23 +0000)]
Revert 374629 "[sancov] Accommodate sancov and coverage report server for use under Windows"

http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/27650/steps/ninja%20check%201/logs/stdio
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/31759
http://lab.llvm.org:8011/builders/clang-s390x-linux-lnt/builds/15095
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/21075

llvm-svn: 374636

4 years agoNFC: clang-format rL374420 and adjust comment wording
Hubert Tong [Sat, 12 Oct 2019 04:08:31 +0000 (04:08 +0000)]
NFC: clang-format rL374420 and adjust comment wording

The commit of rL374420 had various formatting issues, including lines
that exceed 80 columns. This patch applies `git clang-format` on the
changes from commit 13bd3ef40d8b1586f26a022e01b21e56c91e05bd.

It further adjusts a comment to clarify the domain of inputs upon which
a newly added function is meant to operate. The adjustment to the
comment was suggested in a post-commit comment on D68721 and discussed
off-list with @sfertile.

llvm-svn: 374635

4 years agorecommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separatel...
Zi Xuan Wu [Sat, 12 Oct 2019 02:53:04 +0000 (02:53 +0000)]
recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize

In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently, it does not
estimate register pressure separately for different register classes (especially for scalar types:
float types should not be counted together with int types), so the estimate is inaccurate. Specifically,
this causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance.

So we need to classify the register classes at the IR level. Importantly, these are abstract register classes,
not the target register classes that the backend provides in its td files. They are used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types.

For example, on the POWER target the number of registers is special when VSX is enabled. When VSX is enabled, the number of int scalar registers is 32 (GPR),
float is 64 (VSR), and int and float vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is enabled,
and 3 kinds of register class when VSX is NOT enabled.
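
The VSX-enabled case can be sketched as a small classification. Everything here is illustrative: the names `RegClass`, `classify`, `numRegisters`, and `fitsInRegisters` are hypothetical, and the "live values times interleave count" check is an assumed stand-in for the real pressure estimate, not the heuristic used by the vectorizer. Only the register counts (32 GPR, 64 VSR) come from the description above.

```cpp
// Abstract register classes for the VSX-enabled case described above.
enum class RegClass { GPR, VSR };

// Scalar integers live in GPRs; scalar floats and all vectors live in VSRs.
RegClass classify(bool isVector, bool isFloat) {
  return (!isVector && !isFloat) ? RegClass::GPR : RegClass::VSR;
}

// Register counts quoted in the commit message for VSX-enabled POWER.
unsigned numRegisters(RegClass rc) {
  return rc == RegClass::GPR ? 32u : 64u;
}

// Assumed toy check: each class is tested against its own budget instead of
// one shared budget, so heavy integer pressure does not hide float headroom.
bool fitsInRegisters(unsigned liveGPR, unsigned liveVSR, unsigned interleave) {
  return liveGPR * interleave <= numRegisters(RegClass::GPR) &&
         liveVSR * interleave <= numRegisters(RegClass::VSR);
}
```

In this toy model, a loop with 10 live scalar integers and 20 live float or vector values fits at an interleave count of 3 but not at 4, because the GPR budget is exhausted first.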

On the POWER target, this gives a big (~+30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017 and no obvious degradations elsewhere.

Differential revision: https://reviews.llvm.org/D67148

llvm-svn: 374634

4 years ago[clang][IFS] Updating tests to pass on -fvisibility=hidden builds (NFCi).
Puyan Lotfi [Sat, 12 Oct 2019 02:46:57 +0000 (02:46 +0000)]
[clang][IFS] Updating tests to pass on -fvisibility=hidden builds (NFCi).

Special thanks to JamesNagurne who got to the bottom of this; landing this on
his behalf.

Differential Revision: https://reviews.llvm.org/D68897

llvm-svn: 374632

4 years ago[platform process list] add a flag for showing the processes of all users
Walter Erquinigo [Sat, 12 Oct 2019 02:36:16 +0000 (02:36 +0000)]
[platform process list] add a flag for showing the processes of all users

Summary:
For context: https://reviews.llvm.org/D68293

We need a way to show all the processes on Android regardless of the user id.
When you run `platform process list`, you only see the processes with the same user as the user that launched lldb-server. However, it's quite useful to see all the processes, and it will lay a foundation for full apk debugging support from lldb.

Before:
```
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
3234   1                 aarch64-unknown-linux-android adbd
8034   3234              aarch64-unknown-linux-android sh
9096   3234              aarch64-unknown-linux-android sh
9098   9096              aarch64-unknown-linux-android lldb-server
(lldb) ^D
```

Now:
```
(lldb) platform process list -x
205 matching processes were found on "remote-android"
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
1      0                                          init
524    1                                          init
525    1                                          init
531    1                                          ueventd
568    1                                          logd
569    1                 aarch64-unknown-linux-android servicemanager
570    1                 aarch64-unknown-linux-android hwservicemanager
571    1                 aarch64-unknown-linux-android vndservicemanager
577    1                 aarch64-unknown-linux-android qseecomd
580    577               aarch64-unknown-linux-android qseecomd
...
23816  979                                        com.android.providers.calendar
24600  979                                        com.verizon.mips.services
27888  979                                        com.hualai
28043  2378                                       com.android.chrome:sandboxed_process0
31449  979                                        com.att.shm
31779  979                                        com.samsung.android.authfw
31846  979                                        com.samsung.android.server.iris
32014  979                                        com.samsung.android.MtpApplication
32045  979                                        com.samsung.InputEventApp
```

Reviewers: labath,xiaobai,aadsm,clayborg

Subscribers:

> llvm-svn: 374584

llvm-svn: 374631

4 years agoRevert "[platform process list] add a flag for showing the processes of all users"
Walter Erquinigo [Sat, 12 Oct 2019 02:31:22 +0000 (02:31 +0000)]
Revert "[platform process list] add a flag for showing the processes of all users"

This reverts commit f670a5edfc70066872e1795d650ed6e1ac62b6a8.

llvm-svn: 374630

4 years ago[sancov] Accommodate sancov and coverage report server for use under Windows
Vitaly Buka [Sat, 12 Oct 2019 02:29:26 +0000 (02:29 +0000)]
[sancov] Accommodate sancov and coverage report server for use under Windows

Summary:
This patch makes the following changes to SanCov and its complementary Python script in order to resolve issues pertaining to non-UNIX file paths in JSON symbolization information:
* Convert all paths to use forward slash.
* Update `coverage-report-server.py` to correctly handle paths to sources which contain spaces.
* Remove Linux platform restriction for all SanCov unit tests. All SanCov tests passed when run on my local Windows machine.
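
The first item in the list above, normalizing path separators, amounts to something like the following. This is a standalone sketch using only the standard library; the function name `toForwardSlashes` is hypothetical and the patch itself may well rely on LLVM's own path utilities instead.

```cpp
#include <algorithm>
#include <string>

// Replace every backslash with a forward slash so that Windows-style paths
// appear in the same form as UNIX-style paths in the JSON output.
std::string toForwardSlashes(std::string path) {
  std::replace(path.begin(), path.end(), '\\', '/');
  return path;
}
```

Spaces in the path are left alone, so `C:\src\my file.cpp` becomes `C:/src/my file.cpp`.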

Patch by Douglas Gliner.

Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman

Reviewed By: vitalybuka

Subscribers: vsk, Dor1s, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D51018

llvm-svn: 374629

4 years ago[sancov] Use LLVM Support library JSON writer in favor of individual implementation
Vitaly Buka [Sat, 12 Oct 2019 02:29:24 +0000 (02:29 +0000)]
[sancov] Use LLVM Support library JSON writer in favor of individual implementation

Summary:
In this diff, I've replaced the individual implementation of `JSONWriter` with `json::OStream` provided by `llvm/Support/JSON.h`.

Important Note: The output format of the JSON is considerably different compared to the original implementation. Important differences include:
* New line for each entry in an array (should make diffs cleaner)
* No space between keys and colon in attributed object entries.
* Attributes with empty strings will now print the attribute name and a quote pair rather than excluding the attribute altogether

Examples of these differences can be seen in the changes to the sancov tests which compare the JSON output.

Patch by Douglas Gliner.

Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman

Subscribers: mehdi_amini, dexonsmith, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D68752

llvm-svn: 374628

4 years agoSlightly relax restriction on exact order arguments must appear.
Douglas Yung [Sat, 12 Oct 2019 02:22:36 +0000 (02:22 +0000)]
Slightly relax restriction on exact order arguments must appear.

llvm-svn: 374627

4 years ago[platform process list] add a flag for showing the processes of all users
Walter Erquinigo [Sat, 12 Oct 2019 02:08:35 +0000 (02:08 +0000)]
[platform process list] add a flag for showing the processes of all users

Summary:
For context: https://reviews.llvm.org/D68293

We need a way to show all the processes on Android regardless of the user id.
When you run `platform process list`, you only see the processes with the same user as the user that launched lldb-server. However, it's quite useful to see all the processes, and it will lay a foundation for full apk debugging support from lldb.

Before:
```
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
3234   1                 aarch64-unknown-linux-android adbd
8034   3234              aarch64-unknown-linux-android sh
9096   3234              aarch64-unknown-linux-android sh
9098   9096              aarch64-unknown-linux-android lldb-server
(lldb) ^D
```

Now:
```
(lldb) platform process list -x
205 matching processes were found on "remote-android"
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
1      0                                          init
524    1                                          init
525    1                                          init
531    1                                          ueventd
568    1                                          logd
569    1                 aarch64-unknown-linux-android servicemanager
570    1                 aarch64-unknown-linux-android hwservicemanager
571    1                 aarch64-unknown-linux-android vndservicemanager
577    1                 aarch64-unknown-linux-android qseecomd
580    577               aarch64-unknown-linux-android qseecomd
...
23816  979                                        com.android.providers.calendar
24600  979                                        com.verizon.mips.services
27888  979                                        com.hualai
28043  2378                                       com.android.chrome:sandboxed_process0
31449  979                                        com.att.shm
31779  979                                        com.samsung.android.authfw
31846  979                                        com.samsung.android.server.iris
32014  979                                        com.samsung.android.MtpApplication
32045  979                                        com.samsung.InputEventApp
```

Reviewers: labath,xiaobai,aadsm,clayborg

Subscribers:

> llvm-svn: 374584

llvm-svn: 374626

4 years agoRevert "[platform process list] add a flag for showing the processes of all users"
Walter Erquinigo [Sat, 12 Oct 2019 02:01:33 +0000 (02:01 +0000)]
Revert "[platform process list] add a flag for showing the processes of all users"

This reverts commit 90d0de4999354a5223f08ad714222b0a5dca3cad.

llvm-svn: 374625

4 years ago[libunwind] Fix issues introduced in r374606
Petr Hosek [Sat, 12 Oct 2019 01:50:57 +0000 (01:50 +0000)]
[libunwind] Fix issues introduced in r374606

There are a few differences in compile flags introduced in r374606
that cause libcxx-libcxxabi-libunwind-armv8-linux to fail.
This change should address all of those; I've compared the generated
build file from before r374606 and with this change, and the set of
flags is the same modulo order.

llvm-svn: 374624

4 years ago[asan] Return true from instrumentModule
Vitaly Buka [Sat, 12 Oct 2019 01:50:36 +0000 (01:50 +0000)]
[asan] Return true from instrumentModule

createSanitizerCtorAndInitFunctions always changes the module.

llvm-svn: 374623

4 years ago[platform process list] add a flag for showing the processes of all users
Walter Erquinigo [Sat, 12 Oct 2019 01:33:21 +0000 (01:33 +0000)]
[platform process list] add a flag for showing the processes of all users

Summary:
For context: https://reviews.llvm.org/D68293

We need a way to show all the processes on Android regardless of the user id.
When you run `platform process list`, you only see the processes with the same user as the user that launched lldb-server. However, it's quite useful to see all the processes, and it will lay a foundation for full apk debugging support from lldb.

Before:
```
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
3234   1                 aarch64-unknown-linux-android adbd
8034   3234              aarch64-unknown-linux-android sh
9096   3234              aarch64-unknown-linux-android sh
9098   9096              aarch64-unknown-linux-android lldb-server
(lldb) ^D
```

Now:
```
(lldb) platform process list -x
205 matching processes were found on "remote-android"
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
1      0                                          init
524    1                                          init
525    1                                          init
531    1                                          ueventd
568    1                                          logd
569    1                 aarch64-unknown-linux-android servicemanager
570    1                 aarch64-unknown-linux-android hwservicemanager
571    1                 aarch64-unknown-linux-android vndservicemanager
577    1                 aarch64-unknown-linux-android qseecomd
580    577               aarch64-unknown-linux-android qseecomd
...
23816  979                                        com.android.providers.calendar
24600  979                                        com.verizon.mips.services
27888  979                                        com.hualai
28043  2378                                       com.android.chrome:sandboxed_process0
31449  979                                        com.att.shm
31779  979                                        com.samsung.android.authfw
31846  979                                        com.samsung.android.server.iris
32014  979                                        com.samsung.android.MtpApplication
32045  979                                        com.samsung.InputEventApp
```

Reviewers: labath,xiaobai,aadsm,clayborg

Subscribers:

> llvm-svn: 374584

llvm-svn: 374622

4 years agoRevert "[platform process list] add a flag for showing the processes of all users"
Walter Erquinigo [Sat, 12 Oct 2019 01:08:50 +0000 (01:08 +0000)]
Revert "[platform process list] add a flag for showing the processes of all users"

This reverts commit 08781f4c53a177662c029d3da9c407ba65ae6747.

llvm-svn: 374621

4 years ago [platform process list] add a flag for showing the processes of all users
Walter Erquinigo [Sat, 12 Oct 2019 00:44:50 +0000 (00:44 +0000)]
[platform process list] add a flag for showing the processes of all users

Summary:
For context: https://reviews.llvm.org/D68293

We need a way to show all the processes on Android regardless of the user ID.
When you run `platform process list`, you only see the processes owned by the same user that launched lldb-server. It is quite useful to see all the processes, however, and doing so lays a foundation for full APK debugging support in lldb.

Before:
```
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
3234   1                 aarch64-unknown-linux-android adbd
8034   3234              aarch64-unknown-linux-android sh
9096   3234              aarch64-unknown-linux-android sh
9098   9096              aarch64-unknown-linux-android lldb-server
(lldb) ^D
```

Now:
```
(lldb) platform process list -x
205 matching processes were found on "remote-android"
PID    PARENT USER       TRIPLE                   NAME
====== ====== ========== ======================== ============================
1      0                                          init
524    1                                          init
525    1                                          init
531    1                                          ueventd
568    1                                          logd
569    1                 aarch64-unknown-linux-android servicemanager
570    1                 aarch64-unknown-linux-android hwservicemanager
571    1                 aarch64-unknown-linux-android vndservicemanager
577    1                 aarch64-unknown-linux-android qseecomd
580    577               aarch64-unknown-linux-android qseecomd
...
23816  979                                        com.android.providers.calendar
24600  979                                        com.verizon.mips.services
27888  979                                        com.hualai
28043  2378                                       com.android.chrome:sandboxed_process0
31449  979                                        com.att.shm
31779  979                                        com.samsung.android.authfw
31846  979                                        com.samsung.android.server.iris
32014  979                                        com.samsung.android.MtpApplication
32045  979                                        com.samsung.InputEventApp
```
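
The behaviour described in the summary can be sketched roughly as follows. This is only an illustrative model in Python, not the lldb implementation: `ProcessInfo`, `list_processes`, the `show_all_users` parameter, and the numeric user IDs in `SAMPLE` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    """Hypothetical record for one row of a process listing."""
    pid: int
    parent_pid: int
    uid: int
    name: str


def list_processes(processes, server_uid, show_all_users=False):
    """Return the processes to display.

    By default only processes owned by the user that launched the
    debug server are kept; show_all_users=True disables that filter.
    """
    if show_all_users:
        return list(processes)
    return [p for p in processes if p.uid == server_uid]


# Made-up sample data: the user IDs are assumptions, not real device output.
SAMPLE = [
    ProcessInfo(1, 0, 0, "init"),
    ProcessInfo(3234, 1, 2000, "adbd"),
    ProcessInfo(9098, 9096, 2000, "lldb-server"),
    ProcessInfo(23816, 979, 10050, "com.android.providers.calendar"),
]

if __name__ == "__main__":
    for proc in list_processes(SAMPLE, server_uid=2000):
        print(proc.pid, proc.name)
    print("---")
    for proc in list_processes(SAMPLE, server_uid=2000, show_all_users=True):
        print(proc.pid, proc.name)
```

The default call keeps only the rows whose owner matches the server's user, which mirrors the short "Before" listing; passing the flag returns every row, as in the "Now" listing.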

Reviewers: labath,xiaobai,aadsm,clayborg

Subscribers:

> llvm-svn: 374584

llvm-svn: 374620

4 years ago DebugInfo: Fix msan use-of-uninitialized exposed by r374600
David Blaikie [Sat, 12 Oct 2019 00:27:12 +0000 (00:27 +0000)]
DebugInfo: Fix msan use-of-uninitialized exposed by r374600

llvm-svn: 374619

4 years ago [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual
Vedant Kumar [Sat, 12 Oct 2019 00:23:15 +0000 (00:23 +0000)]
[llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual

As pointed out in https://reviews.llvm.org/D66979 post-commit, making
this test textual would make it more maintainable.

Differential Revision: https://reviews.llvm.org/D68718

llvm-svn: 374617

4 years ago Temporarily Revert [platform process list] add a flag for showing the processes of...
Adrian Prantl [Sat, 12 Oct 2019 00:03:40 +0000 (00:03 +0000)]
Temporarily Revert [platform process list] add a flag for showing the processes of all users

as it breaks the bots.

This reverts r374609 (git commit 696d3cf8ad6f3a0b3019c87526d561bb77ad538e)

llvm-svn: 374616

4 years ago [X86] Fold a VTRUNCS/VTRUNCUS+store into a saturating truncating store.
Craig Topper [Sat, 12 Oct 2019 00:01:08 +0000 (00:01 +0000)]
[X86] Fold a VTRUNCS/VTRUNCUS+store into a saturating truncating store.

We already did this for VTRUNCUS with a specific combination of
types. This extends this to VTRUNCS and handles any types where
a truncating store is legal.
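
For readers unfamiliar with the operation, the sketch below models what a saturating truncating store does to each element: the value is clamped to the range of the narrower type and then written to memory. It is a Python illustration of the arithmetic only, not the DAG combine added by this commit; the function names are hypothetical, and it assumes VTRUNCS denotes signed saturation and VTRUNCUS unsigned saturation.

```python
def saturating_trunc_signed(value, dst_bits):
    """Clamp a signed integer to the signed range of dst_bits bits."""
    lo = -(1 << (dst_bits - 1))
    hi = (1 << (dst_bits - 1)) - 1
    return max(lo, min(hi, value))


def saturating_trunc_unsigned(value, dst_bits):
    """Clamp a non-negative integer to the unsigned range of dst_bits bits."""
    if value < 0:
        raise ValueError("unsigned saturation expects a non-negative value")
    return min((1 << dst_bits) - 1, value)


def saturating_truncating_store(memory, offset, elements, dst_bits, signed):
    """Saturate each element and write it little-endian into memory.

    Models the fused operation: truncation and store happen in one step
    instead of a separate truncate followed by a plain store.
    """
    width = dst_bits // 8
    for index, value in enumerate(elements):
        if signed:
            narrowed = saturating_trunc_signed(value, dst_bits)
        else:
            narrowed = saturating_trunc_unsigned(value, dst_bits)
        start = offset + index * width
        memory[start:start + width] = narrowed.to_bytes(
            width, "little", signed=signed
        )


if __name__ == "__main__":
    buffer = bytearray(4)
    saturating_truncating_store(buffer, 0, [300, -300, 5, -1], 8, signed=True)
    print(buffer.hex())
```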

llvm-svn: 374615

4 years ago [X86] Add test case showing missing opportunity to fold vmovsdb into a store after...
Craig Topper [Sat, 12 Oct 2019 00:00:59 +0000 (00:00 +0000)]
[X86] Add test case showing missing opportunity to fold vmovsdb into a store after type legalization. NFC

llvm-svn: 374614