platform/upstream/llvm.git
5 years ago[X86] Replace more calls to getZeroVector with regular getConstant.
Craig Topper [Tue, 20 Nov 2018 06:54:01 +0000 (06:54 +0000)]
[X86] Replace more calls to getZeroVector with regular getConstant.

getZeroVector produces a specifically canonicalized zero vector, but we can just let DAG legalization take care of it.

The test changes are because MULH lowering happens later than it should and this change gave us the opportunity to constant fold away a multiply during a DAG combine before the build_vector got legalized with a bitcast.

llvm-svn: 347290

5 years agoRecommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches"
Max Kazantsev [Tue, 20 Nov 2018 05:43:32 +0000 (05:43 +0000)]
Recommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches"

The initial version of the patch did not update Phi nodes in the destinations of
removed edges. This version adds that update, along with tests for this situation.

Differential Revision: https://reviews.llvm.org/D54021

llvm-svn: 347289

5 years ago[PowerPC] Don't combine to bswap store on 1-byte truncating store
Nemanja Ivanovic [Tue, 20 Nov 2018 04:42:31 +0000 (04:42 +0000)]
[PowerPC] Don't combine to bswap store on 1-byte truncating store

Turns out that there was no check for a store that truncates down
to a single byte when combining a (store (bswap...)) into a byte-swapping
store. This patch just adds that check.
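
As a rough scalar illustration of why the one-byte case is special (a sketch with hypothetical helper names, not the PowerPC backend code): truncating a byte-swapped 32-bit value to one byte keeps the most significant byte of the original value, and a single stored byte has no byte order left to reverse.

```cpp
#include <cstdint>

// Reverse the byte order of a 32-bit value.
constexpr uint32_t byteSwap32(uint32_t V) {
  return ((V & 0x000000FFu) << 24) | ((V & 0x0000FF00u) << 8) |
         ((V & 0x00FF0000u) >> 8) | ((V & 0xFF000000u) >> 24);
}

// Models "store the low byte of bswap(V)": swap first, then truncate to i8.
// The byte that ends up in memory is the top byte of the original V.
constexpr uint8_t truncStoreOfBswap(uint32_t V) {
  return static_cast<uint8_t>(byteSwap32(V));
}
```

For `0x11223344` the stored byte is `0x11`, which a plain byte store of the unswapped value would not produce.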

Fixes https://bugs.llvm.org/show_bug.cgi?id=39478.

llvm-svn: 347288

5 years ago[SelectionDAG] Compute known bits and num sign bits for live out vector registers...
Craig Topper [Tue, 20 Nov 2018 04:30:26 +0000 (04:30 +0000)]
[SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks

Summary:
We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this lets us see that the upper bits are zero, so we can use fewer multiply instructions to emulate a 64-bit multiply.
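
The saving can be sketched in scalar C++ (an illustration only; the function names are hypothetical and each `mulLo32` call stands in for one 32x32->64 multiply such as a pmuludq lane). A general 64-bit multiply needs three such multiplies, but when the upper 32 bits of both operands are known to be zero, one is enough.

```cpp
#include <cstdint>

// 32x32 -> 64-bit unsigned multiply of the low halves of A and B.
inline uint64_t mulLo32(uint64_t A, uint64_t B) {
  return (A & 0xFFFFFFFFu) * (B & 0xFFFFFFFFu);
}

// General 64-bit multiply (mod 2^64) built from three 32x32 multiplies.
inline uint64_t mul64General(uint64_t A, uint64_t B) {
  uint64_t LoLo = mulLo32(A, B);
  uint64_t LoHi = mulLo32(A, B >> 32);
  uint64_t HiLo = mulLo32(A >> 32, B);
  return LoLo + ((LoHi + HiLo) << 32);
}

// If the upper halves of both inputs are known zero, the cross terms vanish
// and a single multiply gives the full result.
inline uint64_t mul64UpperZero(uint64_t A, uint64_t B) {
  return mulLo32(A, B);
}
```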

This should help with this ispc issue that a coworker pointed me to https://github.com/ispc/ispc/issues/1362

Reviewers: spatel, efriedma, RKSimon, arsenm

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D54725

llvm-svn: 347287

5 years ago[XRay] Add a test for allocator exhaustion
Dean Michael Berris [Tue, 20 Nov 2018 03:56:04 +0000 (03:56 +0000)]
[XRay] Add a test for allocator exhaustion

Use a more representative test of allocating small chunks for
oddly-sized (small) objects from an allocator that has a page's worth of
memory.

llvm-svn: 347286

5 years agoEnsure FileManagerTest expects "\\" as path separator on Windows platforms
Matthew Voss [Tue, 20 Nov 2018 03:30:28 +0000 (03:30 +0000)]
Ensure FileManagerTest expects "\\" as path separator on Windows platforms

llvm-svn: 347284

5 years agoSkip TestTargetCreateDeps
Jonas Devlieghere [Tue, 20 Nov 2018 01:18:49 +0000 (01:18 +0000)]
Skip TestTargetCreateDeps

Skip this test because Windows deals differently with shared libraries.

llvm-svn: 347283

5 years agoDriver: SCS is compatible with every other sanitizer.
Peter Collingbourne [Tue, 20 Nov 2018 01:01:49 +0000 (01:01 +0000)]
Driver: SCS is compatible with every other sanitizer.

Because SCS relies on system-provided runtime support, we can use it
together with any other sanitizer simply by linking the runtime for
the other sanitizer.

Differential Revision: https://reviews.llvm.org/D54735

llvm-svn: 347282

5 years ago[ExecutionEngine][Interpreter] Fix out-of-bounds array access.
Lang Hames [Tue, 20 Nov 2018 01:01:26 +0000 (01:01 +0000)]
[ExecutionEngine][Interpreter] Fix out-of-bounds array access.

If args is empty then accessing element 0 is illegal.

https://reviews.llvm.org/D53556

Patch by Eugene Sharygin. Thanks Eugene!

llvm-svn: 347281

5 years ago[XRay] Move buffer extents back to the heap
Dean Michael Berris [Tue, 20 Nov 2018 01:00:26 +0000 (01:00 +0000)]
[XRay] Move buffer extents back to the heap

Summary:
This change addresses an issue which shows up with the synchronised race
between threads writing into a buffer, and another thread reading the
buffer.

In a lot of cases, we cannot guarantee that threads will always see the
signal to finalise their buffers in time despite the grace periods and
state machine maintained through atomic variables. This change addresses
it by ensuring that the instance being updated to indicate how much
of the buffer is "used" by the writing thread is the same instance being
read by the thread processing the buffer to be written out to disk or
handled through the iterators.

To do this, we ensure that all the "extents" instances live in their own
backing store, in a different contiguous page from the
buffer-specific backing store. We also take precautions to ensure that
the atomic variables are cache-line-sized to prevent false-sharing from
unnecessarily causing cache contention on unrelated writes/reads.
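
A minimal sketch of the cache-line padding idea, assuming a 64-byte cache line (the type and function names here are hypothetical and are not the XRay runtime's): each atomic counter is aligned and padded to its own line, so writes to one slot do not contend with reads of its neighbours.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Real code would pick this per target.
constexpr std::size_t kAssumedCacheLineSize = 64;

// One "extents" slot per buffer, padded so that no two slots share a line.
struct alignas(kAssumedCacheLineSize) ExtentsSlot {
  std::atomic<uint64_t> Size{0};
};

static_assert(sizeof(ExtentsSlot) == kAssumedCacheLineSize,
              "each slot should occupy exactly one assumed cache line");
static_assert(alignof(ExtentsSlot) == kAssumedCacheLineSize,
              "each slot should start on an assumed cache-line boundary");

// The writer publishes how much of its buffer is used...
inline void recordUsed(ExtentsSlot &Slot, uint64_t Bytes) {
  Slot.Size.store(Bytes, std::memory_order_release);
}

// ...and the reader observes it through the very same object.
inline uint64_t usedBytes(const ExtentsSlot &Slot) {
  return Slot.Size.load(std::memory_order_acquire);
}
```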

It's feasible that we may in the future be able to move the storage of
the extents objects into the single backing store, slightly changing the
way to compute the size(s) of the buffers, but in the meantime we'll
settle for the isolation afforded by having a different backing store
for the extents instances.

Reviewers: mboerger

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D54684

llvm-svn: 347280

5 years ago[compiler-rt] Use zx_futex_wait_deprecated for Fuchsia sanitizer runtime
Petr Hosek [Tue, 20 Nov 2018 00:55:20 +0000 (00:55 +0000)]
[compiler-rt] Use zx_futex_wait_deprecated for Fuchsia sanitizer runtime

This change is part of the soft-transition to the new synchronization
primitives which implement priority inheritance.

Differential Revision: https://reviews.llvm.org/D54727

llvm-svn: 347279

5 years ago[DAGCombiner] reduce code duplication in visitXOR; NFC
Sanjay Patel [Tue, 20 Nov 2018 00:51:45 +0000 (00:51 +0000)]
[DAGCombiner] reduce code duplication in visitXOR; NFC

llvm-svn: 347278

5 years ago[WebAssembly] Remove unused function return types (NFC)
Heejin Ahn [Tue, 20 Nov 2018 00:38:10 +0000 (00:38 +0000)]
[WebAssembly] Remove unused function return types (NFC)

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54734

llvm-svn: 347277

5 years ago[CodeView] Don't print PointerAttributes when dumping.
Zachary Turner [Tue, 20 Nov 2018 00:10:27 +0000 (00:10 +0000)]
[CodeView] Don't print PointerAttributes when dumping.

PointerAttributes is a bitwise-or of several other fields, each of
which is already printed on its own line with a better explanation.
So this doesn't really help much.

llvm-svn: 347275

5 years agoImplement computeKnownBits for scalar_to_vector
Stanislav Mekhanoshin [Mon, 19 Nov 2018 23:34:07 +0000 (23:34 +0000)]
Implement computeKnownBits for scalar_to_vector

Differential Revision: https://reviews.llvm.org/D54728

llvm-svn: 347274

5 years ago[WebAssembly] Fix inaccurate comments / assertion messages
Heejin Ahn [Mon, 19 Nov 2018 23:31:28 +0000 (23:31 +0000)]
[WebAssembly] Fix inaccurate comments / assertion messages

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54688

llvm-svn: 347273

5 years ago[WebAssembly] Make starting indices calculation simpler (NFC)
Heejin Ahn [Mon, 19 Nov 2018 23:21:25 +0000 (23:21 +0000)]
[WebAssembly] Make starting indices calculation simpler (NFC)

Summary:
At the beginning of the `assignIndexes()` function, when the `FunctionIndex` and
`GlobalIndex` variables are created, the `InputFunctions` and `InputGlobals`
vectors are guaranteed to be empty, because those vectors are only
populated in the `assignIndexes()` function. The current code reads as if they
could be nonempty, so this patch deletes those references for better readability.

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54687

llvm-svn: 347272

5 years agoIt's its
Paul Robinson [Mon, 19 Nov 2018 22:53:42 +0000 (22:53 +0000)]
It's its

llvm-svn: 347271

5 years agoAdd interceptor for the setvbuf(3) from NetBSD
Kamil Rytarowski [Mon, 19 Nov 2018 22:44:26 +0000 (22:44 +0000)]
Add interceptor for the setvbuf(3) from NetBSD

Summary:
setvbuf(3) is a routine to setup stream buffering.

Enable the interceptor for NetBSD.

Add dedicated tests for setvbuf(3) and functions
on top of this interface: setbuf, setbuffer, setlinebuf.

Based on original work by Yang Zheng.

Reviewers: joerg, vitalybuka

Reviewed By: vitalybuka

Subscribers: devnexen, tomsun.0.7, kubamracek, llvm-commits, mgorny, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D54548

llvm-svn: 347270

5 years ago[Transforms] Prefer static and avoid namespaces, NFC
Reid Kleckner [Mon, 19 Nov 2018 22:19:05 +0000 (22:19 +0000)]
[Transforms] Prefer static and avoid namespaces, NFC

Put 'static' on three functions in an anonymous namespace as per our
coding style.

Remove the 'namespace llvm {}' around the .cpp file and explicitly
declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'.
I prefer this style for free functions because the compiler will error
out if the .h and .cpp files don't agree on the function name or
prototype.
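
The style described above can be sketched as follows (hypothetical names in a `demo` namespace, not the actual LLVM sources). Because the definition is written with a qualified name, it only compiles when a matching declaration already exists in that namespace, so a drifted name or prototype becomes a compile error rather than a silently separate function.

```cpp
// --- What would live in the header (hypothetical declaration) ---
namespace demo {
int countCtors(int NumEntries);
} // namespace demo

// --- What would live in the .cpp file ---

// File-local helper: marked 'static' rather than wrapped in an anonymous
// namespace.
static bool isValidCount(int N) { return N >= 0; }

// Qualified definition: changing the parameter type or misspelling the name
// here would be rejected, since nothing in 'demo' would match it.
int demo::countCtors(int NumEntries) {
  return isValidCount(NumEntries) ? NumEntries : 0;
}
```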

llvm-svn: 347269

5 years ago[X86] Rename combineVSZext->combineExtendVectorInreg. NFC
Craig Topper [Mon, 19 Nov 2018 22:18:47 +0000 (22:18 +0000)]
[X86] Rename combineVSZext->combineExtendVectorInreg. NFC

Now that we no longer have target specific vector extend nodes let's make the function name match the nodes we do use.

llvm-svn: 347268

5 years ago[NFC][libcxx] Fix incorrect comments
Louis Dionne [Mon, 19 Nov 2018 22:06:42 +0000 (22:06 +0000)]
[NFC][libcxx] Fix incorrect comments

llvm-svn: 347267

5 years ago[X86] Add test case to show missed opportunity to use a single pmuludq to implement...
Craig Topper [Mon, 19 Nov 2018 22:04:12 +0000 (22:04 +0000)]
[X86] Add test case to show missed opportunity to use a single pmuludq to implement a multiply when a zext lives in another basic block.

This can occur when one of the inputs to the multiply is loop invariant, though my test cases just use two basic blocks with an unconditional jump, which we won't merge until after isel in the codegen pipeline.

For scalars, I believe SelectionDAGBuilder can add an AssertZExt to pass knowledge across basic blocks but it's explicitly disabled for vectors.

llvm-svn: 347266

5 years agoAMDGPU: Fix V_FMA_F16 selection on GFX9
Konstantin Zhuravlyov [Mon, 19 Nov 2018 21:10:16 +0000 (21:10 +0000)]
AMDGPU: Fix V_FMA_F16 selection on GFX9

GFX9 should select opsel version.

Differential Revision: https://reviews.llvm.org/D54545

llvm-svn: 347265

5 years ago[libcxx] Fix XFAIL for GCC 4.9
Louis Dionne [Mon, 19 Nov 2018 20:53:38 +0000 (20:53 +0000)]
[libcxx] Fix XFAIL for GCC 4.9

The XFAIL started passing since we're only testing for trivial-copyability of
reference_wrapper in C++14 and above. This commit constrains the XFAIL to
gcc-4.9 with C++14 (it would also fail on C++17 and above, but those standards
are not available with GCC 4.9).

llvm-svn: 347264

5 years ago[libcxx] Update test of trivial copyability of reference_wrapper
Louis Dionne [Mon, 19 Nov 2018 20:21:45 +0000 (20:21 +0000)]
[libcxx] Update test of trivial copyability of reference_wrapper

N4151 is not an extension anymore, it was standardized in C++14.

llvm-svn: 347263

5 years ago[Coverage] Fix PR39258: support coverage regions that start deeper than they end
Vedant Kumar [Mon, 19 Nov 2018 20:10:22 +0000 (20:10 +0000)]
[Coverage] Fix PR39258: support coverage regions that start deeper than they end

popRegions used to assume that the start location of a region can't be
nested deeper than the end location, which is not always true.

Patch by Orivej Desh!

Differential Revision: https://reviews.llvm.org/D53244

llvm-svn: 347262

5 years ago[Sema] Fix PR38987: keep end location of a direct initializer list
Vedant Kumar [Mon, 19 Nov 2018 20:10:21 +0000 (20:10 +0000)]
[Sema] Fix PR38987: keep end location of a direct initializer list

If PerformConstructorInitialization of a direct initializer list constructor is
called while instantiating a template, it has brace locations in its BraceLoc
arguments but not in the Kind argument.

This reverts the hunk https://reviews.llvm.org/D41921#inline-468844.

Patch by Orivej Desh!

Differential Revision: https://reviews.llvm.org/D53231

llvm-svn: 347261

5 years agoRevert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches"
Benjamin Kramer [Mon, 19 Nov 2018 20:01:20 +0000 (20:01 +0000)]
Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches"

This reverts commits r347183 & r347184. Crashes while building libxml.

llvm-svn: 347260

5 years ago[AMDGPU] Restored selection of scalar_to_vector (v2x16)
Stanislav Mekhanoshin [Mon, 19 Nov 2018 19:58:13 +0000 (19:58 +0000)]
[AMDGPU] Restored selection of scalar_to_vector (v2x16)

This works if DAG combiner is enabled, but without combining
we cannot select scalar_to_vector of <2 x half> and <2 x i16>.

Differential Revision: https://reviews.llvm.org/D54718

llvm-svn: 347259

5 years ago[clang][CodeGen] Implicit Conversion Sanitizer: discover the world of CompoundAssign...
Roman Lebedev [Mon, 19 Nov 2018 19:56:43 +0000 (19:56 +0000)]
[clang][CodeGen] Implicit Conversion Sanitizer: discover the world of CompoundAssign operators

Summary:
As reported by @regehr (thanks!) on twitter (https://twitter.com/johnregehr/status/1057681496255815686),
we (I) completely forgot about the binary assignment operator.
In AST, it isn't represented as separate `ImplicitCastExpr`'s,
but as a single `CompoundAssignOperator`, that does all the casts internally.
Which means, out of these two, only the first one is diagnosed:
```
auto foo() {
    unsigned char c = 255;
    c = c + 1;
    return c;
}
auto bar() {
    unsigned char c = 255;
    c += 1;
    return c;
}
```
https://godbolt.org/z/JNyVc4

This patch does handle the `CompoundAssignOperator`:
```
int main() {
  unsigned char c = 255;
  c += 1;
  return c;
}
```
```
$ ./bin/clang -g -fsanitize=integer /tmp/test.c && ./a.out
/tmp/test.c:3:5: runtime error: implicit conversion from type 'int' of value 256 (32-bit, signed) to type 'unsigned char' changed the value to 0 (8-bit, unsigned)
    #0 0x2392b8 in main /tmp/test.c:3:5
    #1 0x7fec4a612b16 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22b16)
    #2 0x214029 in _start (/build/llvm-build-GCC-release/a.out+0x214029)
```

However, the pre/post increment/decrement is still not handled.

Reviewers: rsmith, regehr, vsk, rjmccall, #sanitizers

Reviewed By: rjmccall

Subscribers: mclow.lists, cfe-commits, regehr

Tags: #clang, #sanitizers

Differential Revision: https://reviews.llvm.org/D53949

llvm-svn: 347258

5 years ago[InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi
Vedant Kumar [Mon, 19 Nov 2018 19:55:02 +0000 (19:55 +0000)]
[InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi

Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi
improves backtrace quality.

Fixes llvm.org/PR38083.

llvm-svn: 347257

5 years ago[IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock
Vedant Kumar [Mon, 19 Nov 2018 19:54:27 +0000 (19:54 +0000)]
[IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock

Add methods to BasicBlock which make it easier to efficiently check
whether a block has N (or more) predecessors.

This can be more efficient than using pred_size(), which is a linear
time operation.
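
The general idea can be sketched on plain iterators (a standalone illustration with hypothetical helper names, not the BasicBlock methods themselves): instead of counting the whole range and comparing, walk at most N steps and stop as soon as the answer is known.

```cpp
#include <cstddef>
#include <forward_list>

// True if [Begin, End) holds exactly N elements. Advances at most N times,
// regardless of how long the range really is.
template <typename It>
bool hasExactlyN(It Begin, It End, std::size_t N) {
  for (; N; --N, ++Begin)
    if (Begin == End)
      return false; // Ran out of elements before reaching N.
  return Begin == End;
}

// True if [Begin, End) holds N or more elements. Also advances at most N
// times.
template <typename It>
bool hasAtLeastN(It Begin, It End, std::size_t N) {
  for (; N; --N, ++Begin)
    if (Begin == End)
      return false;
  return true;
}
```

With a long singly linked list, a check such as `hasAtLeastN(L.begin(), L.end(), 2)` touches two nodes rather than all of them.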

We might consider adding similar methods for successors. I haven't done
so in this patch because succ_size() is already O(1).

With this patch applied, I measured a 0.065% compile-time reduction in
user time for running `opt -O3` on the sqlite3 amalgamation (30 trials).
The change in mergeStoreIntoSuccessor alone saves 45 million linked list
iterations in a stage2 Release build of llc.

See llvm.org/PR39702 for a harder but more general way of achieving
similar results.

Differential Revision: https://reviews.llvm.org/D54686

llvm-svn: 347256

5 years ago[DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands...
Simon Pilgrim [Mon, 19 Nov 2018 19:37:59 +0000 (19:37 +0000)]
[DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207)

Consistently use (!LegalOperations || isOperationLegalOrCustom) for all node pairs.

Differential Revision: https://reviews.llvm.org/D53478

llvm-svn: 347255

5 years agoFix clang test suite on Windows by reverting part of r347216
Reid Kleckner [Mon, 19 Nov 2018 19:36:28 +0000 (19:36 +0000)]
Fix clang test suite on Windows by reverting part of r347216

Otherwise, the clang analyzer tests fail on Windows when attempting to
unpickle AnalyzerTest objects in the worker processes. The pattern of,
add to path, import, remove from path, serialize, deserialize, doesn't
work. Once something gets added to the path, if we want to move it
across the wire for multiprocessing, we need to keep the module on
sys.path.

llvm-svn: 347254

5 years agoFix Wdocumentation warning. NFCI.
Simon Pilgrim [Mon, 19 Nov 2018 19:18:33 +0000 (19:18 +0000)]
Fix Wdocumentation warning. NFCI.

llvm-svn: 347253

5 years agoFix unused function warning.
Simon Pilgrim [Mon, 19 Nov 2018 19:18:00 +0000 (19:18 +0000)]
Fix unused function warning.

llvm-svn: 347252

5 years ago[TargetLowering] expandFP_TO_UINT - improve fp16 support
Simon Pilgrim [Mon, 19 Nov 2018 19:16:13 +0000 (19:16 +0000)]
[TargetLowering] expandFP_TO_UINT - improve fp16 support

As discussed on D53794, for float types with ranges smaller than the destination integer type, we should be able to just use a regular FP_TO_SINT opcode.

I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary.

Differential Revision: https://reviews.llvm.org/D54703

llvm-svn: 347251

5 years ago[IR] DISubprogram::toSPFlags(): fix "enumeral and non-enumeral type in conditional...
Roman Lebedev [Mon, 19 Nov 2018 19:07:03 +0000 (19:07 +0000)]
[IR] DISubprogram::toSPFlags(): fix "enumeral and non-enumeral type in conditional expression"

/build/llvm/include/llvm/IR/DebugInfoMetadata.h: In static member function ‘static llvm::DISubprogram::DISPFlags llvm::DISubprogram::toSPFlags(bool, bool, bool, unsigned int)’:
/build/llvm/include/llvm/IR/DebugInfoMetadata.h:1636:50: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
                                   (IsLocalToUnit ? SPFlagLocalToUnit : 0) |
                                    ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
/build/llvm/include/llvm/IR/DebugInfoMetadata.h:1637:49: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
                                   (IsDefinition ? SPFlagDefinition : 0) |
                                    ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
/build/llvm/include/llvm/IR/DebugInfoMetadata.h:1638:48: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
                                   (IsOptimized ? SPFlagOptimized : 0));
                                    ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
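
One common way to avoid this class of warning, shown here on a hypothetical flag enum (the enumerator names and bit positions are made up, and this is not necessarily the change the commit made), is to give both arms of each conditional the same enumeration type by using a zero-valued enumerator instead of the literal `0`.

```cpp
// Hypothetical flag enum; only its shape matters for the illustration.
enum DemoFlags : unsigned {
  FlagZero = 0,
  FlagLocalToUnit = 1u << 2,
  FlagDefinition = 1u << 3,
  FlagOptimized = 1u << 4,
};

inline DemoFlags toDemoFlags(bool IsLocalToUnit, bool IsDefinition,
                             bool IsOptimized) {
  // Each ?: now has enumerators on both sides, so there is no enum/int mix.
  return static_cast<DemoFlags>(
      (IsLocalToUnit ? FlagLocalToUnit : FlagZero) |
      (IsDefinition ? FlagDefinition : FlagZero) |
      (IsOptimized ? FlagOptimized : FlagZero));
}
```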

llvm-svn: 347250

5 years agoAdd missing stream operator for Polynomial class to fix debug builds.
Simon Pilgrim [Mon, 19 Nov 2018 18:57:49 +0000 (18:57 +0000)]
Add missing stream operator for Polynomial class to fix debug builds.

llvm-svn: 347249

5 years ago[X86][CostModel] Don't lookup intrinsic cost tables if the intrinsic isn't one we...
Craig Topper [Mon, 19 Nov 2018 18:57:31 +0000 (18:57 +0000)]
[X86][CostModel] Don't lookup intrinsic cost tables if the intrinsic isn't one we care about

We're seeing some issues internally where we sent some intrinsics into the cost model that the getTypeLegalizationCost call fails on, but X86 specific tables don't care about. Our base class implementation takes care of them. We'd just like X86 backend to ignore them.

This patch makes sure the switch returned something X86 cares about and skips the table lookups and type legalization call if not. Probably more efficient too since we don't go scanning the tables for every intrinsic we could possibly see.

Differential Revision: https://reviews.llvm.org/D54711

llvm-svn: 347248

5 years agoAdd missing closing bracket.
Simon Pilgrim [Mon, 19 Nov 2018 18:54:34 +0000 (18:54 +0000)]
Add missing closing bracket.

llvm-svn: 347247

5 years agoFix build break from r347239
Paul Robinson [Mon, 19 Nov 2018 18:51:11 +0000 (18:51 +0000)]
Fix build break from r347239

llvm-svn: 347246

5 years agoFix Wdocumentation warning. NFCI.
Simon Pilgrim [Mon, 19 Nov 2018 18:46:40 +0000 (18:46 +0000)]
Fix Wdocumentation warning. NFCI.

llvm-svn: 347245

5 years agoAdd docker configurations used by the buildbots.
Eric Fiselier [Mon, 19 Nov 2018 18:43:31 +0000 (18:43 +0000)]
Add docker configurations used by the buildbots.

These are the scripts I use to create the docker images for
the build bots and run them.

llvm-svn: 347244

5 years ago[lldbsuite] Invoke sed on Windows to determine the cache dir for clang
Stella Stamenova [Mon, 19 Nov 2018 18:41:33 +0000 (18:41 +0000)]
[lldbsuite] Invoke sed on Windows to determine the cache dir for clang

Summary: In order to invoke sed on Windows, we need to quote the command correctly. Since we already have commands which do that, move the definitions to the beginning of the file and then re-use them for each command.

Reviewers: aprantl, zturner

Subscribers: teemperor, lldb-commits

Differential Revision: https://reviews.llvm.org/D54709

llvm-svn: 347243

5 years ago[X86][SSE] Remove unnecessary bit-and in pshufb vector ctlz (PR39703)
Simon Pilgrim [Mon, 19 Nov 2018 18:40:59 +0000 (18:40 +0000)]
[X86][SSE] Remove unnecessary bit-and in pshufb vector ctlz (PR39703)

SSE PSHUFB vector ctlz lowering works at the i4 nibble level. As detailed in PR39703, we were masking the lower nibble off but we only actually use it in the case where the upper nibble is known to be zero, making it safe to remove the mask and save an instruction.
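
A scalar analogue of the nibble-level scheme may make the argument easier to follow (a sketch only: the real lowering uses pshufb table lookups and blends rather than branches, and the names below are hypothetical). The low-nibble lookup only matters when the upper nibble is zero, and in that case the byte is already in the range 0..15, so masking it first changes nothing.

```cpp
#include <cstdint>

// Leading-zero count of a 4-bit value, indexed by that value (0 maps to 4).
static const uint8_t kNibbleCtlz[16] = {4, 3, 2, 2, 1, 1, 1, 1,
                                        0, 0, 0, 0, 0, 0, 0, 0};

// Count leading zeros of an 8-bit value using two 4-bit table lookups.
inline unsigned ctlz8(uint8_t X) {
  uint8_t Hi = X >> 4;
  if (Hi != 0)
    return kNibbleCtlz[Hi]; // The low nibble is irrelevant here.
  // Hi == 0, so X itself is already a valid 4-bit index: no "& 0x0F" needed.
  return 4 + kNibbleCtlz[X];
}
```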

Differential Revision: https://reviews.llvm.org/D54707

llvm-svn: 347242

5 years ago[InterleavedLoadCombine] Fix warnings
Martin Elshuber [Mon, 19 Nov 2018 18:35:31 +0000 (18:35 +0000)]
[InterleavedLoadCombine] Fix warnings

* remove unused function
* fix compare

llvm-svn: 347241

5 years ago[X86] Attempt to improve v32i8/v64i8 multiply lowering by applying the v16i8 non...
Craig Topper [Mon, 19 Nov 2018 18:32:53 +0000 (18:32 +0000)]
[X86] Attempt to improve v32i8/v64i8 multiply lowering by applying the v16i8 non-avx2 algorithm to each 128-bit lane.

Previously we split the vectors in half to allow the two halves to be any-extended, then concatenated the results back together.

This patch instead extends the v16i8 SSE algorithm to extend half of each 128-bit lane using punpcklbw/punpckhbw. It multiplies all the low half lanes and high half lanes together in separate operations, then merges the half lane results back together using packuswb.

Unfortunately, some of the cases in vector-reduce-mul.ll regress because we aren't narrowing the vector width of the multiplies as we reduce. The splitting was somewhat making up for that before by causing halves to be discarded after the split.

Differential Revision: https://reviews.llvm.org/D54668

llvm-svn: 347240

5 years ago[DebugInfo] DISubprogram flags get their own flags word. NFC.
Paul Robinson [Mon, 19 Nov 2018 18:29:28 +0000 (18:29 +0000)]
[DebugInfo] DISubprogram flags get their own flags word. NFC.

This will hold flags specific to subprograms. In the future
we could potentially free up scarce bits in DIFlags by moving
subprogram-specific flags from there to the new flags word.

This patch does not change IR/bitcode formats, that will be
done in a follow-up.

Differential Revision: https://reviews.llvm.org/D54597

llvm-svn: 347239

5 years ago[ARM] Attempt to fix arm selfhost bots after rL347191
Sam Parker [Mon, 19 Nov 2018 18:08:46 +0000 (18:08 +0000)]
[ARM] Attempt to fix arm selfhost bots after rL347191

llvm-svn: 347238

5 years agoAddress comments.
Kadir Cetinkaya [Mon, 19 Nov 2018 18:06:36 +0000 (18:06 +0000)]
Address comments.

llvm-svn: 347237

5 years agoUse digest size instead of hardcoding it.
Kadir Cetinkaya [Mon, 19 Nov 2018 18:06:33 +0000 (18:06 +0000)]
Use digest size instead of hardcoding it.

llvm-svn: 347236

5 years ago[clangd] Store source file hash in IndexFile{In,Out}
Kadir Cetinkaya [Mon, 19 Nov 2018 18:06:29 +0000 (18:06 +0000)]
[clangd] Store source file hash in IndexFile{In,Out}

Summary:
Puts the digest of the source file that generated the index into
the serialized index and stores it back on load, if it exists.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54693

llvm-svn: 347235

5 years ago[AMDGPU] Fix -Wunused-variable
Fangrui Song [Mon, 19 Nov 2018 17:54:27 +0000 (17:54 +0000)]
[AMDGPU] Fix -Wunused-variable

llvm-svn: 347234

5 years ago[libcxx] Fix incorrect #include for std::hash
Louis Dionne [Mon, 19 Nov 2018 17:40:16 +0000 (17:40 +0000)]
[libcxx] Fix incorrect #include for std::hash

Reviewed as https://reviews.llvm.org/D54705.
Thanks to Andrey Maksimov for the patch.

llvm-svn: 347233

5 years ago[libcxx] Add missing <cstddef> includes in tests
Louis Dionne [Mon, 19 Nov 2018 17:39:50 +0000 (17:39 +0000)]
[libcxx] Add missing <cstddef> includes in tests

Some tests use type std::max_align_t, but don't include <cstddef> header
directly. As a result, these tests won't compile against some conformant
libraries.

Reviewed as https://reviews.llvm.org/D54645.
Thanks to Andrey Maksimov for the patch.

llvm-svn: 347232

5 years ago[AMDGPU] Convert insert_vector_elt into set of selects
Stanislav Mekhanoshin [Mon, 19 Nov 2018 17:39:20 +0000 (17:39 +0000)]
[AMDGPU] Convert insert_vector_elt into set of selects

This avoids scratch use or indirect VGPR addressing for
small vectors.
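
In scalar terms the transformation looks roughly like this (an illustrative sketch with a hypothetical helper, not the AMDGPU lowering code): an insert at a runtime index becomes one compare and one select per lane, so no lane is ever addressed indirectly.

```cpp
#include <array>
#include <cstddef>

// Insert Val at runtime index Idx by rewriting every lane with a select.
// An out-of-range Idx leaves the vector unchanged in this sketch.
template <typename T, std::size_t N>
std::array<T, N> insertViaSelects(std::array<T, N> Vec, T Val,
                                  std::size_t Idx) {
  for (std::size_t I = 0; I < N; ++I)
    Vec[I] = (Idx == I) ? Val : Vec[I]; // One compare + select per lane.
  return Vec;
}
```

The cost grows with the number of lanes, which is presumably why the commit restricts this to small vectors.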

Differential Revision: https://reviews.llvm.org/D54606

llvm-svn: 347231

5 years ago[llvm-nm] Fix use-after-free for MachOUniversalBinaries
Francis Visoiu Mistrih [Mon, 19 Nov 2018 17:19:50 +0000 (17:19 +0000)]
[llvm-nm] Fix use-after-free for MachOUniversalBinaries

MachOObjectFile::getHostArch() returns a temporary, and getArchName
returns a StringRef pointing to a temporary std::string.
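
The same lifetime hazard can be shown without LLVM types (a sketch that uses `const char *` as a stand-in for a StringRef; `makeArchName` and `archNameLength` are hypothetical). A non-owning view taken from a temporary string dangles once the full expression ends, so the owner has to be kept alive for as long as the view is used.

```cpp
#include <cstring>
#include <string>

// Stand-in for a function that returns its result by value, i.e. a temporary.
inline std::string makeArchName() { return std::string("x86_64") + "-demo"; }

// Unsafe pattern, left as a comment on purpose:
//   const char *Name = makeArchName().c_str(); // temporary dies at the ';'
//   std::strlen(Name);                         // reads freed memory

// Safe pattern: bind the owning string to a local, then take the view.
inline std::size_t archNameLength() {
  std::string Owner = makeArchName();
  const char *Name = Owner.c_str(); // Valid for as long as Owner lives.
  return std::strlen(Name);
}
```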

No tests since it doesn't trigger any errors except with the sanitizers.

llvm-svn: 347230

5 years ago[InterleavedLoadCombine] Fix warning unused variable
Martin Elshuber [Mon, 19 Nov 2018 17:11:48 +0000 (17:11 +0000)]
[InterleavedLoadCombine] Fix warning unused variable

Differential Revision: https://reviews.llvm.org/D52653

llvm-svn: 347229

5 years ago[WebAssembly] replaced .param/.result by .functype
Wouter van Oortmerssen [Mon, 19 Nov 2018 17:10:36 +0000 (17:10 +0000)]
[WebAssembly] replaced .param/.result by .functype

Summary:
This makes it easier/cleaner to generate a single signature from
this directive. Also:
- Adds the symbol name, such that we don't depend on the location
  of this directive anymore.
- Actually constructs the signature in the assembler, and make the
  assembler own it.
- Refactor the use of MVT vs ValType in the streamer and assembler
  to require less conversions overall.
- Changed 700 or so tests to use it.

Reviewers: sbc100, dschuff

Subscribers: jgravelle-google, eraman, aheejin, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D54652

llvm-svn: 347228

5 years ago[SelectionDAG] simplify vector select with undef operand(s)
Sanjay Patel [Mon, 19 Nov 2018 17:06:05 +0000 (17:06 +0000)]
[SelectionDAG] simplify vector select with undef operand(s)

llvm-svn: 347227

5 years ago[InterleavedLoadCombine] Remove unused include. NFC.
Benjamin Kramer [Mon, 19 Nov 2018 17:01:19 +0000 (17:01 +0000)]
[InterleavedLoadCombine] Remove unused include. NFC.

llvm-svn: 347226

5 years agoRevert "[LICM] Make LICM able to hoist phis"
Benjamin Kramer [Mon, 19 Nov 2018 16:51:57 +0000 (16:51 +0000)]
Revert "[LICM] Make LICM able to hoist phis"

This reverts commit r347190.

llvm-svn: 347225

5 years ago[lit] On Windows, don't error if MSVC is not in PATH.
Zachary Turner [Mon, 19 Nov 2018 16:47:06 +0000 (16:47 +0000)]
[lit] On Windows, don't error if MSVC is not in PATH.

We had some logic backwards, and as a result if MSVC was not found
in PATH we would throw a string concatenation exception.

llvm-svn: 347224

5 years agoRemove non-ASCII characters at the beginning of file.
Zachary Turner [Mon, 19 Nov 2018 16:41:31 +0000 (16:41 +0000)]
Remove non-ASCII characters at the beginning of file.

It's not clear how these ended up in the file, but this fixes it.

llvm-svn: 347223

5 years ago[AMDGPU] Derive GCNSubtarget from MF to get overridden target features
David Stuttard [Mon, 19 Nov 2018 15:44:20 +0000 (15:44 +0000)]
[AMDGPU] Derive GCNSubtarget from MF to get overridden target features

Summary:
AMDGPUAsmPrinter has a getSTI function that derives a GCNSubtarget from the
TM. However, this means that overridden target features are not detected and can
result in incorrect behaviour.

Switch to using STM which is a GCNSubtarget derived from the MF (used elsewhere
in the same function).

Change-Id: Ib6328ad667b7fcdc87e9c06344e59859207db9b0

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D54301

llvm-svn: 347221

5 years ago[LV] Avoid vectorizing unsafe dependencies in uniform address
Anna Thomas [Mon, 19 Nov 2018 15:39:59 +0000 (15:39 +0000)]
[LV] Avoid vectorizing unsafe dependencies in uniform address

Summary:
Currently, when vectorizing stores to uniform addresses, the only
case in which we prevent vectorization is when there are multiple stores to the
same uniform address causing an unsafe dependency.
This patch teaches LAA to avoid vectorizing loops that have an unsafe
cross-iteration dependency between a load and a store to the same uniform address.
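
To see why such a loop is unsafe to vectorize, consider this toy model (hypothetical functions; the "chunked" version is a deliberately naive stand-in for a vectorized loop, not what the loop vectorizer emits). Each scalar iteration loads the value the previous iteration stored to the same address, but a version that performs one load per chunk of `VF` iterations sees a stale value.

```cpp
#include <cstddef>
#include <vector>

// Scalar semantics: iteration I loads what iteration I-1 stored to Cell.
inline std::vector<int> scalarLoop(const std::vector<int> &A, int &Cell) {
  std::vector<int> B(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    B[I] = Cell; // Load from the uniform address.
    Cell = A[I]; // Store to the same uniform address.
  }
  return B;
}

// Naive chunked model: one load of Cell is reused by all VF "lanes".
inline std::vector<int> naiveChunkedLoop(const std::vector<int> &A, int &Cell,
                                         std::size_t VF) {
  if (VF == 0)
    VF = 1; // Guard so the loop below always makes progress.
  std::vector<int> B(A.size());
  for (std::size_t Base = 0; Base < A.size(); Base += VF) {
    std::size_t End = Base + VF < A.size() ? Base + VF : A.size();
    int Loaded = Cell; // Single load shared by the whole chunk.
    for (std::size_t I = Base; I < End; ++I)
      B[I] = Loaded;
    Cell = A[End - 1]; // Only the last lane's store survives.
  }
  return B;
}
```

For `A = {1, 2, 3, 4}` and an initial cell value of 0, the scalar loop produces `{0, 1, 2, 3}` while the chunked model with `VF = 4` produces `{0, 0, 0, 0}`.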

Fixes PR39653.

Reviewers: Ayal, efriedma

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D54538

llvm-svn: 347220

5 years ago[libcxx] Add availability markup for bad_optional_access, bad_variant_access and...
Louis Dionne [Mon, 19 Nov 2018 15:37:04 +0000 (15:37 +0000)]
[libcxx] Add availability markup for bad_optional_access, bad_variant_access and bad_any_cast

Reviewers: dexonsmith, EricWF

Subscribers: christof, arphaman, libcxx-commits

Differential Revision: https://reviews.llvm.org/D53256

llvm-svn: 347219

5 years ago[Hexagon] make test immune to improvements in undef simplification
Sanjay Patel [Mon, 19 Nov 2018 15:34:09 +0000 (15:34 +0000)]
[Hexagon] make test immune to improvements in undef simplification

llvm-svn: 347218

5 years ago[x86] add/make tests immune to improvements in undef simplification
Sanjay Patel [Mon, 19 Nov 2018 15:33:44 +0000 (15:33 +0000)]
[x86] add/make tests immune to improvements in undef simplification

llvm-svn: 347217

5 years agoFix some issues with LLDB's lit configuration files.
Zachary Turner [Mon, 19 Nov 2018 15:12:34 +0000 (15:12 +0000)]
Fix some issues with LLDB's lit configuration files.

Recently I tried to port LLDB's lit configuration files over to use a
on the surface, but broke some cases that weren't broken before and also
exposed some additional problems with the old approach that we were just
getting lucky with.

When we set up a lit environment, the goal is to make it as hermetic as
possible. We should not be relying on PATH and enabling the use of
arbitrary shell commands. Instead, only whitelisted commands should be
allowed. These are, generally speaking, the lit builtins such as echo,
cd, etc, as well as anything for which substitutions have been
explicitly set up. These substitutions should map to the build
output directory, but in some cases it's useful to be able to override
this (for example to point to an installed tools directory).

This is, of course, how it's supposed to work. What was actually
happening is that we were bringing in PATH and LD_LIBRARY_PATH and then
just running the given run line as a shell command. This led to problems
such as finding the wrong version of clang-cl on PATH since it wasn't
even a substitution, and flakiness / non-determinism since the
environment the tests were running in would change per-machine. On the
other hand, it also made other things possible. For example, we had some
tests that were explicitly running cl.exe and link.exe instead of
clang-cl and lld-link and the only reason it worked at all is because it
was finding them on PATH. Unfortunately we can't entirely get rid of
these tests, because they support a few things in debug info that
clang-cl and lld-link don't (notably, the LF_UDT_MOD_SRC_LINE record,
without which some of the tests fail).

The high level changes introduced in this patch are:

1. Removal of functionality - The lit test suite no longer respects
   LLDB_TEST_C_COMPILER and LLDB_TEST_CXX_COMPILER. This means there is no
   more support for gcc, but nobody was using this anyway (note: The
   functionality is still there for the dotest suite, just not the lit test
   suite). There is no longer a single substitution %cxx and %cc which maps
   to <arbitrary-compiler>, you now explicitly specify the compiler with a
   substitution like %clang or %clangxx or %clang_cl. We can revisit this
   in the future when someone needs gcc.

2. Introduction of the LLDB_LIT_TOOLS_DIR directory. This does in spirit
   what LLDB_TEST_C_COMPILER and LLDB_TEST_CXX_COMPILER used to do, but now
   more friendly. If this is not specified, all tools are expected to be
   the just-built tools. If it is specified, the tools which are not
   themselves being tested but are being used to construct and run checks
   (e.g. clang, FileCheck, llvm-mc, etc) will be searched for in this
   directory first, then the build output directory.

3. Changes to core llvm lit files. The use_lld() and use_clang()
   functions were introduced long ago in anticipation of using them in
   lldb, but since they were never actually used anywhere but their
   respective projects, there were some issues to be resolved regarding
   generality and ability to use them outside their project.

4. Changes to .test files - These are all just replacing things like
   clang-cl with %clang_cl and %cxx with %clangxx, etc.

5. Changes to lit.cfg.py - Previously we would load up some system
   environment variables and then add some new things to them. Then do a
   bunch of work building out our own substitutions. First, we delete the
   system environment variable code, making the environment hermetic. Then,
   we refactor the substitution logic into two separate helper functions,
   one which sets up substitutions for the tools we want to test (which
   must come from the build output directory), and another which sets up
   substitutions for support tools (like compilers, etc).

6. New substitutions for MSVC -- Previously we relied on finding
   MSVC by bringing in the entire parent's PATH and letting
   subprocess.Popen just run the command line. Now we set up real
   substitutions that should have the same effect. We use PATH to find
   them, and then look for INCLUDE and LIB to construct a substitution
   command line with appropriate /I and /LIBPATH: arguments. The nice thing
   about this is that it opens the door to having separate %msvc-cl32 and
   %msvc-cl64 substitutions, rather than only requiring the user to run
   vcvars first, because we can deduce the path to 32-bit libraries from
   64-bit library directories, and vice versa. Without these substitutions
   this would have been impossible.
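
The step in item 6 that turns INCLUDE and LIB into /I and /LIBPATH: arguments can be sketched roughly as follows. This is a minimal illustration, not the code from the patch (the real logic lives in the Python lit configuration); the helper name `prefixEach` is hypothetical, and the sketch assumes the usual ';'-separated format of those environment variables.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a ';'-separated Windows search path (the
// format of INCLUDE and LIB) and prefix every non-empty entry.
std::vector<std::string> prefixEach(const std::string &paths,
                                    const std::string &prefix) {
  assert(!prefix.empty());
  std::vector<std::string> args;
  std::size_t begin = 0;
  while (begin <= paths.size()) {
    std::size_t end = paths.find(';', begin);
    if (end == std::string::npos)
      end = paths.size();
    if (end > begin) // skip empty entries such as a trailing ';'
      args.push_back(prefix + paths.substr(begin, end - begin));
    begin = end + 1;
  }
  return args;
}
```

Calling `prefixEach(includeValue, "/I")` and `prefixEach(libValue, "/LIBPATH:")` would give the two argument lists to append to the compiler and linker substitutions respectively.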

Differential Revision: https://reviews.llvm.org/D54567

llvm-svn: 347216

5 years ago[LoopPass] fixing 'Modification' messages in -debug-pass=Executions for loop passes
Fedor Sergeev [Mon, 19 Nov 2018 15:10:59 +0000 (15:10 +0000)]
[LoopPass] fixing 'Modification' messages in -debug-pass=Executions for loop passes

The legacy loop pass manager issues a "Made Modification" message after each loop pass
run, but the condition for issuing it is accumulated across all the runs.
That leads to confusing 'modification' messages as soon as the first modification is done.

Change the condition to "current pass made modifications", similar to how
it is done in all other pass managers.
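
The difference between the two conditions can be modelled with a small sketch. The functions below are hypothetical and only model when a message would be printed; they are not the pass manager code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: each input entry says whether that pass run changed
// the IR. The result says, per run, whether a message would be printed.

// Old condition: the flag is accumulated, so it stays set after the first
// change and every later run reports a modification.
std::vector<bool> messagesAccumulated(const std::vector<bool> &runChanged) {
  std::vector<bool> printed;
  bool changed = false;
  for (bool local : runChanged) {
    changed = changed || local;
    printed.push_back(changed);
  }
  assert(printed.size() == runChanged.size());
  return printed;
}

// New condition: only the result of the current run decides.
std::vector<bool> messagesPerRun(const std::vector<bool> &runChanged) {
  std::vector<bool> printed;
  for (bool local : runChanged)
    printed.push_back(local);
  assert(printed.size() == runChanged.size());
  return printed;
}
```

For the runs `{false, true, false, false}`, the accumulated form reports a modification for the last three runs, while the per-run form reports it only for the second.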

llvm-svn: 347215

5 years ago[OpenMP] Check target architecture supports unified shared memory for requires direct...
Patrick Lyster [Mon, 19 Nov 2018 15:09:33 +0000 (15:09 +0000)]
[OpenMP] Check target architecture supports unified shared memory for requires directive. Differential Review: https://reviews.llvm.org/D54493

llvm-svn: 347214

5 years agoDon't use -O in lit tests.
Zachary Turner [Mon, 19 Nov 2018 15:06:10 +0000 (15:06 +0000)]
Don't use -O in lit tests.

Because of different shell quoting rules, and the fact that LLDB
commands often contain spaces, -O is not portable for writing command
lines. Instead, we should use explicit lldbinit files.

Differential Revision: https://reviews.llvm.org/D54680

llvm-svn: 347213

5 years ago[SelectionDAG] simplify select FP with undef condition
Sanjay Patel [Mon, 19 Nov 2018 14:42:28 +0000 (14:42 +0000)]
[SelectionDAG] simplify select FP with undef condition

llvm-svn: 347212

5 years ago[x86] add test for select FP with undef condition; NFC
Sanjay Patel [Mon, 19 Nov 2018 14:39:57 +0000 (14:39 +0000)]
[x86] add test for select FP with undef condition; NFC

llvm-svn: 347211

5 years ago[SelectionDAG] add simplifySelect() to reduce code duplication; NFC
Sanjay Patel [Mon, 19 Nov 2018 14:35:22 +0000 (14:35 +0000)]
[SelectionDAG] add simplifySelect() to reduce code duplication; NFC

This should be extended to handle FP and vectors in follow-up patches.

llvm-svn: 347210

5 years ago[llvm-exegesis][NFC] More tests for ExegesisTarget::fillMemoryOperands().
Clement Courbet [Mon, 19 Nov 2018 14:31:43 +0000 (14:31 +0000)]
[llvm-exegesis][NFC] More tests for ExegesisTarget::fillMemoryOperands().

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54304

llvm-svn: 347209

5 years agoSubject: [PATCH] [CodeGen] Add pass to combine interleaved loads.
Martin Elshuber [Mon, 19 Nov 2018 14:26:10 +0000 (14:26 +0000)]
Subject: [PATCH] [CodeGen] Add pass to combine interleaved loads.

This patch defines an interleaved-load-combine pass. The pass searches
for ShuffleVector instructions that represent interleaved loads. Matches are
converted such that they will be captured by the InterleavedAccessPass.

The pass extends LLVM's capabilities to use target specific instruction
selection of interleaved load patterns (e.g.: ld4 on AArch64
architectures).
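
For intuition, a factor-4 interleaved load (the ld4 case) corresponds to shuffle masks that pick every fourth element of a wide load: <0, 4, 8, 12>, <1, 5, 9, 13>, and so on. The check below is a minimal sketch of that mask shape; `isStridedMask` is a hypothetical helper and not the matching logic of the new pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if `mask` selects every `factor`-th element
// starting at `first`, i.e. mask[i] == first + i * factor for all i.
bool isStridedMask(const std::vector<int> &mask, int first, int factor) {
  assert(factor > 0 && "stride must be positive");
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i] != first + static_cast<int>(i) * factor)
      return false;
  return !mask.empty();
}
```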

Differential Revision: https://reviews.llvm.org/D52653

llvm-svn: 347208

5 years ago[ThinLTO] Fix comment. NFC
Eugene Leviant [Mon, 19 Nov 2018 14:19:37 +0000 (14:19 +0000)]
[ThinLTO] Fix comment. NFC

llvm-svn: 347207

5 years ago[SelectionDAG] fix formatting; NFC
Sanjay Patel [Mon, 19 Nov 2018 14:03:07 +0000 (14:03 +0000)]
[SelectionDAG] fix formatting; NFC

llvm-svn: 347206

5 years ago[FileManager] getFile(open=true) after getFile(open=false) should open the file.
Sam McCall [Mon, 19 Nov 2018 13:37:46 +0000 (13:37 +0000)]
[FileManager] getFile(open=true) after getFile(open=false) should open the file.

Summary:
Old behavior is to just return the cached entry regardless of opened-ness.
That feels buggy (though I guess nobody ever actually needed this).

This came up in the context of clangd+clang-tidy integration: we're
going to getFile(open=false) to replay preprocessor actions obscured by
the preamble, but the compilation may subsequently getFile(open=true)
for non-preamble includes.
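
The intended behaviour can be sketched with a toy cache. `FileCache` and `Entry` below are hypothetical stand-ins and do not reflect clang's actual FileManager types or signatures; the sketch only shows that a cached, never-opened entry gets opened on a later open=true request.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; these are NOT clang's FileManager types.
struct Entry {
  bool opened = false;
};

class FileCache {
  std::map<std::string, Entry> entries;

public:
  int openCount = 0; // how many times a file was actually opened

  const Entry &getFile(const std::string &name, bool open) {
    assert(!name.empty());
    Entry &e = entries[name]; // creates an unopened entry on first use
    // A cached entry that was never opened must still be opened when a
    // later caller asks for open=true.
    if (open && !e.opened) {
      e.opened = true;
      ++openCount;
    }
    return e;
  }
};
```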

Reviewers: ilya-biryukov

Subscribers: ioeric, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54691

llvm-svn: 347205

5 years ago[llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery...
Roman Lebedev [Mon, 19 Nov 2018 13:28:41 +0000 (13:28 +0000)]
[llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery(): reserve for the upper bound of Neighbors

Summary:
As was pointed out in D54388+D54390, the maximal size of `Neighbors` is known:
it will contain at most Points_.size() minus one (the center of the cluster).

While that is the upper bound, meaning that in most cases the actual count
will be much smaller, since D54390 made the allocation persistent,
we no longer have to worry about overly-optimistically `reserve()`ing.
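
A rough sketch of the idea, using `std::vector` and one-dimensional points as simplifying assumptions (the real code works on benchmark measurements and `llvm::SmallVectorImpl`); the name `rangeQueryInto` is hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Collect into `neighbors` the indices of all points within `eps` of
// points[center], excluding the center itself. The caller owns and reuses
// `neighbors`, so reserving the upper bound costs at most one allocation
// over the lifetime of the buffer.
void rangeQueryInto(const std::vector<double> &points, std::size_t center,
                    double eps, std::vector<std::size_t> &neighbors) {
  assert(center < points.size());
  neighbors.clear();
  // Upper bound: every point except the center can be a neighbor.
  neighbors.reserve(points.size() - 1);
  for (std::size_t i = 0; i < points.size(); ++i)
    if (i != center && std::fabs(points[i] - points[center]) <= eps)
      neighbors.push_back(i);
}
```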

Old: (D54393)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6553.167456      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.21% )
...
            6.5547 +- 0.0134 seconds time elapsed  ( +-  0.20% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6315.057872      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.24% )
...
            6.3187 +- 0.0160 seconds time elapsed  ( +-  0.25% )
```
And that is another -~4%.

Since this is the last (as of this moment) patch in this patch series,
it is a good time to summarize:
Old: (svn trunk, as stated in D54381)
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m24.884s
user    0m24.099s
sys     0m0.785s
```
So these patches, on the given benchmark,
have decreased llvm-exegesis analysis time by 74.62%.
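
The 74.62% figure is consistent with comparing the 24.884 s wall-clock time of the trunk run against the 6315.06 ms task-clock of the final run; which two measurements were actually compared is an assumption here. The arithmetic, as a small sketch:

```cpp
#include <cassert>

// Relative decrease, in percent, from `before` to `after`
// (both in the same unit, here seconds).
double percentDecrease(double before, double after) {
  assert(before > 0.0);
  return (before - after) / before * 100.0;
}
```

`percentDecrease(24.884, 6.315057872)` evaluates to roughly 74.62.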

There surely is more room for further improvements.
D54514 may improve things by -11.5% more (relative to this patch).
Parallelization may improve things further significantly, too.

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54415

llvm-svn: 347204

5 years ago[llvm-exegesis] Move InstructionBenchmarkClustering::isNeighbour() into header
Roman Lebedev [Mon, 19 Nov 2018 13:28:36 +0000 (13:28 +0000)]
[llvm-exegesis] Move InstructionBenchmarkClustering::isNeighbour() into header

Summary:
Old: (D54390)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7432.421721      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.15% )
...
            7.4336 +- 0.0115 seconds time elapsed  ( +-  0.15% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       6569.936144      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.22% )
...
            6.5711 +- 0.0143 seconds time elapsed  ( +-  0.22% )
```
And another -12%. You'd think it would be `inline`d anyway, but no! :)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54393

llvm-svn: 347203

5 years ago[llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): write into llvm::SmallV...
Roman Lebedev [Mon, 19 Nov 2018 13:28:31 +0000 (13:28 +0000)]
[llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): write into llvm::SmallVectorImpl& output parameter

Summary:
I do believe this is the correct fix.
We call `rangeQuery()` *very* often. And many times its output vector is large (tens of thousands of entries), so small-size-opt won't help.

Old: (D54389)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7934.528363      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.19% )
...
            7.9354 +- 0.0148 seconds time elapsed  ( +-  0.19% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7383.793440      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.47% )
...
            7.3868 +- 0.0340 seconds time elapsed  ( +-  0.46% )
```
And another -7%. And that isn't even the good bit yet.

Old:
* calls to allocation functions: 2081419
* temporary allocations: 219658 (10.55%)
* bytes allocated in total (ignoring deallocations): 4.31 GB

New:
* calls to allocation functions: 1880295 (-10%)
* temporary allocations: 18758 (1%) (-91% *sic*)
* bytes allocated in total (ignoring deallocations): 545.15 MB (-88% *sic*)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54390

llvm-svn: 347202

5 years ago[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): replace std::vector<> with...
Roman Lebedev [Mon, 19 Nov 2018 13:28:26 +0000 (13:28 +0000)]
[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): replace std::vector<> with std::deque<> in llvm::SetVector<>

Summary:
Old: (D54388)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       8606.323981      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.11% )
...
           8.60773 +- 0.00978 seconds time elapsed  ( +-  0.11% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7971.403653      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.14% )
...
            7.9728 +- 0.0113 seconds time elapsed  ( +-  0.14% )
```
Another -~7%.

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, RKSimon

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54389

llvm-svn: 347201

5 years ago[llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): use llvm::SmallVector...
Roman Lebedev [Mon, 19 Nov 2018 13:28:22 +0000 (13:28 +0000)]
[llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): use llvm::SmallVector<size_t, 0> for storage.

Summary:
Old: (D54383)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       9098.781978      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.16% )
...
            9.1015 +- 0.0148 seconds time elapsed  ( +-  0.16% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       8553.352480      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.12% )
...
            8.5539 +- 0.0105 seconds time elapsed  ( +-  0.12% )
```
So another -6%.
That is because the `SmallVector` **doubles** its size when reallocating, which is great here,
because we can't `reserve()`: we can't know how many `Neighbors` we will have.
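
To see why doubling helps when the final size is unknown, the sketch below counts reallocations for a buffer that grows by doubling. It is a simplified model under stated assumptions (growth starts from capacity 0 and the first allocation holds one element); real containers pick different starting sizes.

```cpp
#include <cassert>
#include <cstddef>

// Count how many reallocations are needed to append `n` elements one at a
// time if the capacity doubles whenever it is exhausted.
std::size_t reallocationsWithDoubling(std::size_t n) {
  std::size_t capacity = 0, reallocations = 0;
  for (std::size_t size = 0; size < n; ++size) {
    if (size == capacity) {
      capacity = capacity == 0 ? 1 : capacity * 2;
      ++reallocations;
    }
  }
  assert(capacity >= n);
  return reallocations;
}
```

Under this model the number of reallocations grows only logarithmically with the number of elements.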

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54388

llvm-svn: 347200

5 years ago[llvm-exegesis] Analysis: writeMeasurementValue(): don't alloc string for double...
Roman Lebedev [Mon, 19 Nov 2018 13:28:17 +0000 (13:28 +0000)]
[llvm-exegesis] Analysis: writeMeasurementValue(): don't alloc string for double each time.

Summary:
Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time: (D54382)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       9024.354355      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.18% )
...
            9.0262 +- 0.0161 seconds time elapsed  ( +-  0.18% )
```
New time:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       8996.541057      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.19% )
...
            9.0045 +- 0.0172 seconds time elapsed  ( +-  0.19% )
```
-~0.3%, not that much. But this isn't the important part.

Old:
* calls to allocation functions: 2109712
* temporary allocations: 33112
* bytes allocated in total (ignoring deallocations): 4.43 GB

New:
* calls to allocation functions: 2095345 (-0.68%)
* temporary allocations: 18745 (-43.39% !!!)
* bytes allocated in total (ignoring deallocations): 4.31 GB (-2.71%)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54383

llvm-svn: 347199

5 years ago[llvm-exegesis] Analysis::writeSnippet(): be smarter about memory allocations.
Roman Lebedev [Mon, 19 Nov 2018 13:28:14 +0000 (13:28 +0000)]
[llvm-exegesis] Analysis::writeSnippet(): be smarter about memory allocations.

Summary:
Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time: (D54381)
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m10.487s
user    0m9.745s
sys     0m0.740s
```
New time:
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m9.599s
user    0m8.824s
sys     0m0.772s

```
Not that much, around -9%. But that is not the good part yet, again.

Old:
* calls to allocation functions: 3347676
* temporary allocations: 277818
* bytes allocated in total (ignoring deallocations): 10.52 GB

New:
* calls to allocation functions: 2109712 (-36%)
* temporary allocations: 33112 (-88%)
* bytes allocated in total (ignoring deallocations): 4.43 GB (-58% *sic*)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54382

llvm-svn: 347198

5 years ago[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): use llvm::SetVector<> inste...
Roman Lebedev [Mon, 19 Nov 2018 13:28:09 +0000 (13:28 +0000)]
[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): use llvm::SetVector<> instead of ILLEGAL std::unordered_set<>

Summary:
Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time:
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m24.884s
user    0m24.099s
sys     0m0.785s
```
New time:
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m10.469s
user    0m9.797s
sys     0m0.672s
```
So -60%. And that isn't the good bit yet.

Old:
* calls to allocation functions: 106560180  (yes, 107 *million* allocations.)
* bytes allocated in total (ignoring deallocations): 12.17 GB

New:
* calls to allocation functions: 3347676  (-96.86%)  (just 3 mil)
* bytes allocated in total (ignoring deallocations): 10.52 GB (~2GB less)

---

Three points i want to raise:
* `std::unordered_set<>` should not have been used there in the first place.
  It is banned by the https://llvm.org/docs/ProgrammersManual.html#other-set-like-container-options
* There are no tests, so i'm not fully sure this is correct.
  Since it was an unordered set, i guess there are zero restrictions on the order, and anything will be ok?
* I tried other containers suggested in https://llvm.org/docs/ProgrammersManual.html#set-like-containers-std-set-smallset-setvector-etc,
  this `llvm::SetVector<>` seems to be best here.
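
For readers unfamiliar with the container, the idea behind a set-vector is a set for uniqueness checks plus a vector that records insertion order, which gives deterministic iteration. The class below is a minimal illustration built on the standard library; it is not LLVM's implementation, and `OrderedIndexSet` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// A set for uniqueness checks plus a vector that records insertion order.
class OrderedIndexSet {
  std::set<std::size_t> seen;
  std::vector<std::size_t> order;

public:
  // Returns true if `v` was newly inserted, false if it was already present.
  bool insert(std::size_t v) {
    if (!seen.insert(v).second)
      return false;
    order.push_back(v);
    assert(seen.size() == order.size());
    return true;
  }

  // Elements in the order they were first inserted.
  const std::vector<std::size_t> &items() const { return order; }
};
```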

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet

Subscribers: kristina, bobsayshilol, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54381

llvm-svn: 347197

5 years agoFixed uninitialized variable issue.
Anastasia Stulova [Mon, 19 Nov 2018 12:43:39 +0000 (12:43 +0000)]
Fixed uninitialized variable issue.

This commit should fix failing bots.

llvm-svn: 347196

5 years ago[X86] Add codegen tests for slow-shld scalar funnel shifts
Simon Pilgrim [Mon, 19 Nov 2018 12:29:41 +0000 (12:29 +0000)]
[X86] Add codegen tests for slow-shld scalar funnel shifts

llvm-svn: 347195

5 years agoTest commit - delete trailing space.
Michael Platings [Mon, 19 Nov 2018 12:16:05 +0000 (12:16 +0000)]
Test commit - delete trailing space.

llvm-svn: 347194

5 years agoTest commit - delete a trailing space.
Michael Platings [Mon, 19 Nov 2018 12:10:07 +0000 (12:10 +0000)]
Test commit - delete a trailing space.

llvm-svn: 347193

5 years agoAMDGPU/InsertWaitcnts: Some more const-correctness
Nicolai Haehnle [Mon, 19 Nov 2018 12:03:11 +0000 (12:03 +0000)]
AMDGPU/InsertWaitcnts: Some more const-correctness

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54225

llvm-svn: 347192

5 years ago[ARM] Remove trunc sinks in ARM CGP
Sam Parker [Mon, 19 Nov 2018 11:34:40 +0000 (11:34 +0000)]
[ARM] Remove trunc sinks in ARM CGP

Truncs are treated as sources if they produce a value of the same
type as the one we are currently trying to promote. Truncs used to be
considered as a sink if their operand was the same value type.

We now allow smaller types in the search, so we should search through
truncs that produce a smaller value. These truncs can then be
converted to an AND mask.
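
The equivalence this relies on can be shown at the source level: once the value is kept in the wider type, truncating to 8 bits and zero-extending back gives the same result as masking the low 8 bits. A small sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Truncate a 32-bit value to 8 bits, then zero-extend back to 32 bits.
std::uint32_t truncThenZext(std::uint32_t x) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(x));
}

// The same result computed entirely in the wide type with an AND mask.
std::uint32_t maskLow8(std::uint32_t x) { return x & 0xFFu; }
```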

This leaves sinks as being:
  - points where the value in the register is being observed, such as
    an icmp, switch or store.
  - points where value types have to match, such as calls and returns.
  - zext are included to ease the transformation and are generally
    removed later on.

During this change, it also became apparent that truncating sinks was
broken: if a sink used a source, its type information had already
been lost by the time the truncation happens. So I've changed the
method of caching the type information.

Differential Revision: https://reviews.llvm.org/D54515

llvm-svn: 347191

5 years ago[LICM] Make LICM able to hoist phis
John Brawn [Mon, 19 Nov 2018 11:31:24 +0000 (11:31 +0000)]
[LICM] Make LICM able to hoist phis

The general approach taken is to make note of loop invariant branches, then when
we see something conditional on that branch, such as a phi, we create a copy of
the branch and (empty versions of) its successors and hoist using that.

This has no impact by itself that I've been able to see, as LICM typically
doesn't see such phis as they will have been converted into selects by the time
LICM is run, but once we start doing phi-to-select conversion later it will be
important.
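
A source-level analogy of the transformation, as a sketch only (the pass works on IR, and these functions are hypothetical): the branch on `flag` does not depend on the loop, so the merged value, which would be a phi in SSA form, can be computed once before the loop.

```cpp
#include <cassert>

// Before: the branch on `flag` is loop invariant, but the merged value
// is recomputed on every iteration.
int sumBefore(bool flag, int a, int b, int n) {
  assert(n >= 0);
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    int v;
    if (flag)
      v = a * 2;
    else
      v = b + 1;
    sum += v + i;
  }
  return sum;
}

// After: the branch and the merged value are evaluated once, outside the loop.
int sumAfter(bool flag, int a, int b, int n) {
  assert(n >= 0);
  int v;
  if (flag)
    v = a * 2;
  else
    v = b + 1;
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += v + i;
  return sum;
}
```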

Differential Revision: https://reviews.llvm.org/D52827

llvm-svn: 347190

5 years ago[OpenCL] Fix address space deduction in template args.
Anastasia Stulova [Mon, 19 Nov 2018 11:00:14 +0000 (11:00 +0000)]
[OpenCL] Fix address space deduction in template args.

Don't deduce address spaces for non-pointer-like types
in template args.

Fixes PR38603!

Differential Revision: https://reviews.llvm.org/D54634

llvm-svn: 347189

5 years agoRemove unused variable. NFC.
Benjamin Kramer [Mon, 19 Nov 2018 10:59:12 +0000 (10:59 +0000)]
Remove unused variable. NFC.

llvm-svn: 347188