Andrea Di Biagio [Wed, 20 Feb 2019 14:53:18 +0000 (14:53 +0000)]
[MCA][ResourceManager] Add a table that maps processor resource indices to processor resource identifiers.
This patch adds a lookup table to speed up resource queries in the ResourceManager.
This patch also moves helper function 'getResourceStateIndex()' from
ResourceManager.cpp to Support.h, so that we can reuse that logic in the
SummaryView (and potentially other views in llvm-mca).
No functional change intended.
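The sketch below illustrates the general idea of an index-to-identifier lookup table; it is not the code from the patch. It assumes each processor resource is described by a 64-bit mask whose highest set bit is unique, and the names `highestSetBit` and `buildIndexToIdTable` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: position of the highest set bit in a resource mask.
unsigned highestSetBit(uint64_t Mask) {
  assert(Mask != 0 && "a resource mask needs at least one bit set");
  unsigned Index = 0;
  while (Mask >>= 1)
    ++Index;
  return Index;
}

// Build a table so that Table[highestSetBit(Masks[ID])] == ID.
// Masks[ID] is the (assumed unique) mask of processor resource ID;
// zero masks are treated as invalid entries and skipped.
std::vector<unsigned> buildIndexToIdTable(const std::vector<uint64_t> &Masks) {
  std::vector<unsigned> Table(64, 0);
  for (unsigned ID = 0; ID < Masks.size(); ++ID)
    if (Masks[ID] != 0)
      Table[highestSetBit(Masks[ID])] = ID;
  return Table;
}
```

With such a table, going from a resource index back to its identifier is a single array access rather than a search over all resources.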
llvm-svn: 354470
Hans Wennborg [Wed, 20 Feb 2019 14:50:08 +0000 (14:50 +0000)]
Fix the build with gcc/libstdc++ 4.8.2 after r354441
llvm-svn: 354469
Simon Atanasyan [Wed, 20 Feb 2019 14:47:02 +0000 (14:47 +0000)]
[mips] Put some MIPS-specific sections to separate segments
Three MIPS-specific sections `.reginfo`, `.MIPS.options`, and `.MIPS.abiflags`
are used by the loader to read their contents and set up the environment for running
a program. The loader looks up this data in the corresponding segments:
`PT_MIPS_REGINFO`, `PT_MIPS_OPTIONS`, and `PT_MIPS_ABIFLAGS` respectively.
This patch puts these sections in separate segments, as we already do
for the ARM `SHT_ARM_EXIDX` section.
Differential Revision: http://reviews.llvm.org/D58381
llvm-svn: 354468
Sanjay Patel [Wed, 20 Feb 2019 14:34:00 +0000 (14:34 +0000)]
[InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201
The previous attempt at this in rL354406 had a logic bug that
actually triggered a regression test failure, but I failed to
notice it the first time.
llvm-svn: 354467
Michal Gorny [Wed, 20 Feb 2019 14:31:06 +0000 (14:31 +0000)]
[lldb] [ObjectFile/ELF] Fix recognizing NetBSD images
Split the recognition into NetBSD executables & shared libraries
and core(5) files.
Introduce new owner type: "NetBSD-CORE", as core(5) files are not tagged
in the same way as regular NetBSD executables.
Stop incorrectly using ABI_TAG and ABI_SIZE. Introduce IDENT_TAG,
IDENT_DECSZ, IDENT_NAMESZ and PROCINFO.
The new values correctly detect NetBSD images.
The patch was originally written by Kamil Rytarowski. I've added
tests and applied minor code changes per review. The work has been
sponsored by the NetBSD Foundation.
Differential Revision: https://reviews.llvm.org/D42870
llvm-svn: 354466
George Rimar [Wed, 20 Feb 2019 14:01:02 +0000 (14:01 +0000)]
[yaml2elf] - Rename a variable. NFC.
Was suggested during review of D58441.
llvm-svn: 354463
George Rimar [Wed, 20 Feb 2019 13:58:43 +0000 (13:58 +0000)]
[yaml2obj] - Simplify implementation. NFCI.
Given how the types are declared for 32/64-bit platforms:
https://github.com/llvm-mirror/llvm/blob/master/include/llvm/BinaryFormat/ELF.h#L28
it is possible to slightly simplify the code that writes a binary.
The patch does that.
Differential revision: https://reviews.llvm.org/D58441
llvm-svn: 354462
Petar Avramovic [Wed, 20 Feb 2019 13:42:44 +0000 (13:42 +0000)]
[MIPS MSA] Avoid some DAG combines for vector shifts
The DAG combiner combines two shifts into a shift plus an AND with a bitmask.
Avoid such combines for vectors since leaving two vector shifts
as they are produces better end results.
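For illustration only, the scalar form of the rewrite can be sketched as follows. This is not code from the patch; the function names are hypothetical, and the sketch assumes logical (unsigned) shifts on 32-bit values with `C1 >= C2`.

```cpp
#include <cassert>
#include <cstdint>

// Original form: a left shift followed by a logical right shift.
uint32_t twoShifts(uint32_t X, unsigned C1, unsigned C2) {
  assert(C2 <= C1 && C1 < 32);
  return (X << C1) >> C2;
}

// Combined form: one shift by the difference plus an AND with a bitmask
// that clears the top C2 bits.
uint32_t shiftAndMask(uint32_t X, unsigned C1, unsigned C2) {
  assert(C2 <= C1 && C1 < 32);
  return (X << (C1 - C2)) & (UINT32_MAX >> C2);
}
```

Both functions compute the same value; the commit's point is that for vectors the two-shift form tends to lower to better code, so the combine is skipped there.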
Differential Revision: https://reviews.llvm.org/D58225
llvm-svn: 354461
Ilya Biryukov [Wed, 20 Feb 2019 12:31:44 +0000 (12:31 +0000)]
[clangd] Fix a typo. NFC
The documentation for -index-file mentioned clang-index instead of
clangd-indexer.
llvm-svn: 354456
Petar Avramovic [Wed, 20 Feb 2019 12:13:11 +0000 (12:13 +0000)]
[MIPS MSA] Add test for vector shift combines
Add test for vector shift combines.
llvm-svn: 354455
Simon Pilgrim [Wed, 20 Feb 2019 12:04:54 +0000 (12:04 +0000)]
[SLPVectorizer][X86] Add add/sub/mul overflow tests
Baseline tests - overflow intrinsics aren't flagged as vectorizable yet
llvm-svn: 354454
Kadir Cetinkaya [Wed, 20 Feb 2019 11:45:20 +0000 (11:45 +0000)]
[clangd] Revert r354442 and r354444
Looks like sysroot is only working on linux.
llvm-svn: 354453
Krasimir Georgiev [Wed, 20 Feb 2019 11:44:21 +0000 (11:44 +0000)]
[clang-format] Do not emit replacements if Java imports are OK
Summary:
Currently clang-format would always emit a replacement for a block of Java imports even if it is correctly formatted:
```
% cat /tmp/Aggregator.java
import X;
% clang-format /tmp/Aggregator.java
import X;
% clang-format -output-replacements-xml /tmp/Aggregator.java
<?xml version='1.0'?>
<replacements xml:space='preserve' incomplete_format='false'>
<replacement offset='0' length='9'>import X;</replacement>
</replacements>
%
```
This change makes clang-format not emit replacements in this case. Note that
there is logic to not emit replacements in this case for C++.
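The idea can be sketched as follows. This is a hypothetical, self-contained illustration and not clang-format's actual implementation; the `Replacement` struct and `replacementsFor` function are assumed names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a text replacement: replace Length bytes at Offset
// with Text.
struct Replacement {
  std::size_t Offset;
  std::size_t Length;
  std::string Text;
};

// Return a replacement for Code[Offset, Offset + Length) only when the new
// text actually differs from what is already there.
std::vector<Replacement> replacementsFor(const std::string &Code,
                                         std::size_t Offset, std::size_t Length,
                                         const std::string &NewText) {
  assert(Offset <= Code.size() && "offset must lie within the code");
  std::vector<Replacement> Result;
  if (Code.compare(Offset, Length, NewText) != 0)
    Result.push_back(Replacement{Offset, Length, NewText});
  return Result;
}
```

For the `import X;` example above, the existing text and the formatted text are identical, so such a check yields no replacement.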
Reviewers: ioeric
Reviewed By: ioeric
Subscribers: jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58436
llvm-svn: 354452
H.J. Lu [Wed, 20 Feb 2019 11:43:43 +0000 (11:43 +0000)]
[sanitizers] Restore internal_readlink for x32
r316591 has
@@ -389,13 +383,11 @@ uptr internal_dup2(int oldfd, int newfd) {
}
uptr internal_readlink(const char *path, char *buf, uptr bufsize) {
-#if SANITIZER_NETBSD
- return internal_syscall_ptr(SYSCALL(readlink), path, buf, bufsize);
-#elif SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
+#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
return internal_syscall(SYSCALL(readlinkat), AT_FDCWD,
(uptr)path, (uptr)buf, bufsize);
#else
- return internal_syscall(SYSCALL(readlink), (uptr)path, (uptr)buf, bufsize);
+ return internal_syscall_ptr(SYSCALL(readlink), path, buf, bufsize);
#endif
}
which dropped the (uptr) cast and broke x32. This patch puts back the
(uptr) cast to restore x32 and fixes:
https://bugs.llvm.org/show_bug.cgi?id=40783
Differential Revision: https://reviews.llvm.org/D58413
llvm-svn: 354451
Fangrui Song [Wed, 20 Feb 2019 11:34:18 +0000 (11:34 +0000)]
ELF: Remove field for .gdb_index in InStruct. NFC.
Summary: This field is unreferenced outside of createSyntheticSections.
Reviewers: ruiu, pcc, espindola, grimar
Reviewed By: grimar
Subscribers: grimar, emaste, arichardson, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58423
llvm-svn: 354449
Kadir Cetinkaya [Wed, 20 Feb 2019 10:32:04 +0000 (10:32 +0000)]
[clangd] Try to fix windows build bots
llvm-svn: 354444
David Green [Wed, 20 Feb 2019 10:22:18 +0000 (10:22 +0000)]
[Codegen] Remove dead flags on Physical Defs in machine cse
We may leave behind incorrect dead flags on instructions that are CSE'd. Make
sure we remove the dead flags on physical registers to prevent other incorrect
code motion.
Differential Revision: https://reviews.llvm.org/D58115
llvm-svn: 354443
Kadir Cetinkaya [Wed, 20 Feb 2019 09:41:26 +0000 (09:41 +0000)]
[clangd] Testcase for bug 39811
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58133
llvm-svn: 354442
Roman Lebedev [Wed, 20 Feb 2019 09:14:04 +0000 (09:14 +0000)]
[llvm-exegesis] Opcode stabilization / reclusterization (PR40715)
Summary:
Given an instruction `Opcode`, we can make benchmarks (measurements) of the
instruction characteristics/performance. Then, to facilitate further analysis
we group the benchmarks with *similar* characteristics into clusters.
Now, this is all not entirely deterministic. Some instructions have variable
characteristics, depending on their arguments. And thus, if we do several
benchmarks of the same instruction `Opcode`, we may end up with *different*
performance characteristics measurements. And when we then do clustering,
these several benchmarks of the same instruction `Opcode` may end up being
clustered into *different* clusters. This is not great for further analysis.
We shall find every `Opcode` with benchmarks not in just one cluster, and move
*all* the benchmarks of said `Opcode` into one new unstable cluster per `Opcode`.
I have solved this by making `ClusterId` a bit field, adding a `IsUnstable` bit,
and introducing `-analysis-display-unstable-clusters` switch to toggle between
displaying stable-only clusters and unstable-only clusters.
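A minimal sketch of the approach described above, under simplifying assumptions: this is not the llvm-exegesis code, and the `Benchmark` struct, the `stabilize` function and the field widths are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Cluster identifier stored as a bit field, with one bit marking instability.
struct ClusterId {
  unsigned Id : 31;
  unsigned IsUnstable : 1;
};

// Hypothetical benchmark point: which opcode it measured and its cluster.
struct Benchmark {
  unsigned Opcode;
  ClusterId Cluster;
};

// For every opcode whose benchmarks landed in more than one cluster, move all
// of its benchmarks into one fresh cluster flagged as unstable.
void stabilize(std::vector<Benchmark> &Points, unsigned &NextClusterId) {
  std::map<unsigned, std::set<unsigned>> ClustersPerOpcode;
  for (const Benchmark &B : Points)
    ClustersPerOpcode[B.Opcode].insert(B.Cluster.Id);

  // std::map iterates in key order, so new ids are assigned deterministically.
  std::map<unsigned, unsigned> NewClusterForOpcode;
  for (const auto &Entry : ClustersPerOpcode)
    if (Entry.second.size() > 1)
      NewClusterForOpcode[Entry.first] = NextClusterId++;

  for (Benchmark &B : Points) {
    auto It = NewClusterForOpcode.find(B.Opcode);
    if (It == NewClusterForOpcode.end())
      continue;
    B.Cluster.Id = It->second;
    B.Cluster.IsUnstable = 1;
  }
}
```

A display filter can then show either the clusters with `IsUnstable == 0` or those with `IsUnstable == 1`, which mirrors the behaviour of the new switch.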
The reclusterization is deterministic and stable: it produces identical reports
between runs. (Or at least that is what I'm seeing, maybe it isn't)
Timings/comparisons:
old (current trunk/head) {F8303582}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs):
6624.73 msec task-clock # 0.999 CPUs utilized ( +- 0.53% )
172 context-switches # 25.965 M/sec ( +- 29.89% )
0 cpu-migrations # 0.042 M/sec ( +- 56.54% )
31073 page-faults # 4690.754 M/sec ( +- 0.08% )
26538711696 cycles # 4006230.292 GHz ( +- 0.53% ) (83.31%)
2017496807 stalled-cycles-frontend # 7.60% frontend cycles idle ( +- 0.93% ) (83.32%)
13403650062 stalled-cycles-backend # 50.51% backend cycles idle ( +- 0.33% ) (33.37%)
19770706799 instructions # 0.74 insn per cycle
# 0.68 stalled cycles per insn ( +- 0.04% ) (50.04%)
4419821812 branches # 667207369.714 M/sec ( +- 0.03% ) (66.69%)
121741669 branch-misses # 2.75% of all branches ( +- 0.28% ) (83.34%)
6.6283 +- 0.0358 seconds time elapsed ( +- 0.54% )
```
patch, with reclustering but without filtering (i.e. outputting all the stable *and* unstable clusters) {F8303586}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html' (25 runs):
6475.29 msec task-clock # 0.999 CPUs utilized ( +- 0.31% )
213 context-switches # 32.952 M/sec ( +- 23.81% )
1 cpu-migrations # 0.130 M/sec ( +- 43.84% )
31287 page-faults # 4832.057 M/sec ( +- 0.08% )
25939086577 cycles # 4006160.279 GHz ( +- 0.31% ) (83.31%)
1958812858 stalled-cycles-frontend # 7.55% frontend cycles idle ( +- 0.68% ) (83.32%)
13218961512 stalled-cycles-backend # 50.96% backend cycles idle ( +- 0.29% ) (33.37%)
19752995402 instructions # 0.76 insn per cycle
# 0.67 stalled cycles per insn ( +- 0.04% ) (50.04%)
4417079244 branches # 682195472.305 M/sec ( +- 0.03% ) (66.70%)
121510065 branch-misses # 2.75% of all branches ( +- 0.19% ) (83.34%)
6.4832 +- 0.0229 seconds time elapsed ( +- 0.35% )
```
Funnily, *this* measurement shows that said reclustering actually improved performance.
patch, with reclustering, only the stable clusters {F8303594}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html' (25 runs):
6387.71 msec task-clock # 0.999 CPUs utilized ( +- 0.13% )
133 context-switches # 20.792 M/sec ( +- 23.39% )
0 cpu-migrations # 0.063 M/sec ( +- 61.24% )
31318 page-faults # 4903.256 M/sec ( +- 0.08% )
25591984967 cycles # 4006786.266 GHz ( +- 0.13% ) (83.31%)
1881234904 stalled-cycles-frontend # 7.35% frontend cycles idle ( +- 0.25% ) (83.33%)
13209749965 stalled-cycles-backend # 51.62% backend cycles idle ( +- 0.16% ) (33.36%)
19767554347 instructions # 0.77 insn per cycle
# 0.67 stalled cycles per insn ( +- 0.04% ) (50.03%)
4417480305 branches # 691618858.046 M/sec ( +- 0.03% ) (66.68%)
118676358 branch-misses # 2.69% of all branches ( +- 0.07% ) (83.33%)
6.3954 +- 0.0118 seconds time elapsed ( +- 0.18% )
```
Performance improved even further?! Makes sense I guess, fewer clusters to print.
patch, with reclustering, only the unstable clusters {F8303601}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters' (25 runs):
6124.96 msec task-clock # 1.000 CPUs utilized ( +- 0.20% )
194 context-switches # 31.709 M/sec ( +- 20.46% )
0 cpu-migrations # 0.039 M/sec ( +- 49.77% )
31413 page-faults # 5129.261 M/sec ( +- 0.06% )
24536794267 cycles # 4006425.858 GHz ( +- 0.19% ) (83.31%)
1676085087 stalled-cycles-frontend # 6.83% frontend cycles idle ( +- 0.46% ) (83.32%)
13035595603 stalled-cycles-backend # 53.13% backend cycles idle ( +- 0.16% ) (33.36%)
18260877653 instructions # 0.74 insn per cycle
# 0.71 stalled cycles per insn ( +- 0.05% ) (50.03%)
4112411983 branches # 671484364.603 M/sec ( +- 0.03% ) (66.68%)
114066929 branch-misses # 2.77% of all branches ( +- 0.11% ) (83.32%)
6.1278 +- 0.0121 seconds time elapsed ( +- 0.20% )
```
This tells us that the actual `-analysis-inconsistencies-output-file=` outputting only takes ~0.4 sec for 43970 benchmark points (3 whole sweeps)
(Also, wow this is fast, it used to take several minutes originally)
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40715 | PR40715 ]].
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, jdoerfert, llvm-commits, RKSimon
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58355
llvm-svn: 354441
Mikael Holmen [Wed, 20 Feb 2019 07:14:39 +0000 (07:14 +0000)]
[RegAllocGreedy] Take last chance recoloring into account in split and assign
Summary:
This is a follow-up to r353988 where tryEvict was extended to take last
chance recoloring into account. Now we do the same thing for trySplit and
tryAssign.
Now we always pass a "FixedRegisters" argument to canEvictInterference and
tryEvict so it doesn't need to have a default value anymore.
The need for this was found long ago in an out-of-tree target.
Unfortunately I don't have a reproducer for an in-tree target.
Reviewers: qcolombet, rudkx
Reviewed By: qcolombet, rudkx
Subscribers: rudkx, MatzeB, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58376
llvm-svn: 354439
Chen Zheng [Wed, 20 Feb 2019 07:01:04 +0000 (07:01 +0000)]
[NFC] add/modify wrapper function for findRegisterDefOperand().
llvm-svn: 354438
Chijun Sima [Wed, 20 Feb 2019 05:49:01 +0000 (05:49 +0000)]
[DTU] Refine the document of mutation APIs [NFC] (PR40528)
Summary:
It was pointed out in [[ https://bugs.llvm.org/show_bug.cgi?id=40528 | Bug 40528 ]] that it is not clear whether insert/deleteEdge can be used to perform multiple updates and [[ https://reviews.llvm.org/D57316#1388344 | a comment in D57316 ]] reveals that the difference between several ways to update the DominatorTree is confusing.
This patch tries to address issues above.
Reviewers: mkazantsev, kuhar, asbirlea, chandlerc, brzycki
Reviewed By: mkazantsev, kuhar, brzycki
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57881
llvm-svn: 354437
Craig Topper [Wed, 20 Feb 2019 05:39:11 +0000 (05:39 +0000)]
[X86] Remove FeatureSlowIncDec from Sandy Bridge and later Intel Core CPUs
Summary:
Inc and Dec were at one point slow on Intel CPUs due to their tendency to cause partial flag stalls on P6 derived CPU cores. This is because these instructions are defined to preserve the carry flag. This partial flag stall issue persisted until Sandy Bridge when flag merging was changed to be handled as a data dependency instead of as a stall until retirement. Sandy Bridge and later CPUs rename the C flag separately from OSPAZ so there is no flag merge needed on INC/DEC to preserve the C flag.
Given these improvements I don't know why INC/DEC was ever considered slow on Sandy Bridge. If anything they should have been disabled on the earlier CPUs instead.
Note after this patch, INC/DEC are still considered slow on Silvermont, Goldmont, Knights Landing and our generic "x86-64" CPU.
Reviewers: spatel, RKSimon, chandlerc
Reviewed By: chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D58412
llvm-svn: 354436
Leonard Chan [Wed, 20 Feb 2019 05:07:14 +0000 (05:07 +0000)]
Limit new PM tests to X86 registered targets.
llvm-svn: 354435
Eric Christopher [Wed, 20 Feb 2019 04:42:07 +0000 (04:42 +0000)]
Temporarily Revert "[X86][SLP] Enable SLP vectorization for 128-bit horizontal X86 instructions (add, sub)"
As this has broken the lto bootstrap build for 3 days and is
showing a significant regression on the Dither_benchmark results (from
the LLVM benchmark suite) -- specifically, on the
BENCHMARK_FLOYD_DITHER_128, BENCHMARK_FLOYD_DITHER_256, and
BENCHMARK_FLOYD_DITHER_512; the others are unchanged. These have
regressed by about 28% on Skylake, 34% on Haswell, and over 40% on
Sandybridge.
This reverts commit r353923.
llvm-svn: 354434
Fangrui Song [Wed, 20 Feb 2019 04:39:42 +0000 (04:39 +0000)]
[Dominators] Simplify and optimize path compression used in link-eval forest.
Summary:
* NodeToInfo[*] have been allocated so the addresses are stable. We can store them instead of NodePtr to save NumToNode lookups.
* Nodes are traversed twice. Using `Visited` to check the traversal number is expensive and obscure. Just split the two traversals into two loops explicitly.
* The check `VInInfo.DFSNum < LastLinked` is redundant as it is implied by `VInInfo->Parent < LastLinked`
* VLabelInfo and PLabelInfo are used to save a NodeToInfo lookup in the second traversal.
Also add some comments explaining eval().
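The two-loop structure can be sketched as below. This is a simplified illustration and not the code from the patch: nodes are identified by their DFS numbers, `InfoRec` is reduced to three fields, and the function takes a plain `std::vector` (all assumptions made for the sake of a self-contained example).

```cpp
#include <cassert>
#include <vector>

// Reduced per-node record, indexed by DFS number.
struct InfoRec {
  unsigned Parent; // DFS number of the parent in the link-eval forest
  unsigned Semi;   // DFS number of the semidominator candidate
  unsigned Label;  // node with the smallest Semi on the compressed path
};

unsigned eval(std::vector<InfoRec> &Info, unsigned V, unsigned LastLinked) {
  if (Info[V].Parent < LastLinked)
    return Info[V].Label;

  // First loop: collect V and its linked ancestors, excluding the root of
  // the virtual tree.
  std::vector<unsigned> Stack;
  unsigned Cur = V;
  do {
    Stack.push_back(Cur);
    Cur = Info[Cur].Parent;
  } while (Info[Cur].Parent >= LastLinked);

  // Second loop: walk back down, pointing each node at the root's parent and
  // propagating the label with the smallest Semi seen so far.
  unsigned P = Cur;
  unsigned PLabel = Info[P].Label;
  do {
    Cur = Stack.back();
    Stack.pop_back();
    Info[Cur].Parent = Info[P].Parent;
    unsigned CurLabel = Info[Cur].Label;
    if (Info[PLabel].Semi < Info[CurLabel].Semi)
      Info[Cur].Label = PLabel;
    else
      PLabel = CurLabel;
    P = Cur;
  } while (!Stack.empty());
  return Info[V].Label;
}
```

Carrying `PLabel` across iterations is what avoids a second lookup per node in the downward pass, which corresponds to the last bullet above.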
This shows a ~4.5% improvement (9.8444s -> 9.3996s) on
perf stat -r 10 taskset -c 0 opt -passes=$(printf '%.0srequire<domtree>,invalidate<domtree>,' {1..1000})'require<domtree>' -disable-output sqlite-autoconf-3270100/sqlite3.bc
Reviewers: kuhar, sanjoy, asbirlea
Reviewed By: kuhar
Subscribers: brzycki, NutshellySima, kristina, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58327
llvm-svn: 354433
Leonard Chan [Wed, 20 Feb 2019 04:35:28 +0000 (04:35 +0000)]
Remove test on incompatible mips target.
llvm-svn: 354432
Leonard Chan [Wed, 20 Feb 2019 03:50:11 +0000 (03:50 +0000)]
[NewPM] Add other sanitizers at O0
This allows for MSan and TSan to be used without optimizations required.
Differential Revision: https://reviews.llvm.org/D58424
llvm-svn: 354431
Kito Cheng [Wed, 20 Feb 2019 03:31:32 +0000 (03:31 +0000)]
[RISCV] Implement pseudo instructions for load/store from a symbol address.
Summary:
These pseudo-instructions allow load/store instructions to
load/store from/to a symbol, and they always use PC-relative addressing
to generate the symbol address.
Reviewers: asb, apazos, rogfer01, jrtc27
Differential Revision: https://reviews.llvm.org/D50496
llvm-svn: 354430
Fangrui Song [Wed, 20 Feb 2019 02:35:24 +0000 (02:35 +0000)]
[Dominators] Delete UpdateLevelsAfterInsertion in edge insertion of depth-based search for release builds
Summary:
After insertion of (From, To), v is affected iff
depth(NCD)+1 < depth(v) && path P from To to v exists where every w on P s.t. depth(v) <= depth(w)
All affected vertices change their idom to NCD.
If a vertex u has changed its depth, it must be a descendant of an
affected vertex v. Its depth must have been updated by UpdateLevel()
called by setIDom() of the first affected ancestor.
So UpdateLevelsAfterInsertion and its bookkeeping variable VisitedNotAffectedQueue are redundant.
Run them only in debug builds as a sanity check.
Reviewers: kuhar
Reviewed By: kuhar
Subscribers: kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58369
llvm-svn: 354429
Peter Collingbourne [Wed, 20 Feb 2019 02:32:53 +0000 (02:32 +0000)]
ELF: Remove field for .interp in InStruct. NFC.
This field is unreferenced outside of createSyntheticSections.
Differential Revision: https://reviews.llvm.org/D58422
llvm-svn: 354428
Chen Zheng [Wed, 20 Feb 2019 02:30:06 +0000 (02:30 +0000)]
[PowerPC] exploit P9 instruction maddld.
Differential Revision: https://reviews.llvm.org/D58364
llvm-svn: 354427
Thomas Lively [Wed, 20 Feb 2019 02:22:36 +0000 (02:22 +0000)]
[WebAssembly] Generalize section ordering constraints
Summary:
Changes from using a total ordering of known sections to using a
dependency graph approach. This allows our tools to accept and process
binaries that are compliant with the spec and tool conventions that
would have been previously rejected. It also means our own tools can
do less work to enforce an artificially imposed ordering. Using a
general mechanism means fewer special cases and exceptions in the
ordering logic.
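A dependency-graph check of this kind might look like the sketch below. It is an illustration under assumptions, not the code from the patch: sections are plain integers, and `Predecessors[S]` is a hypothetical table listing the sections that must appear before section `S`.

```cpp
#include <cassert>
#include <vector>

// Returns true if Order respects the "must come before" edges.
// Predecessors[S] lists the sections that may only appear before S.
bool isValidSectionOrder(const std::vector<unsigned> &Order,
                         const std::vector<std::vector<unsigned>> &Predecessors) {
  std::vector<bool> Disallowed(Predecessors.size(), false);
  for (unsigned S : Order) {
    assert(S < Predecessors.size() && "unknown section id");
    if (Disallowed[S])
      return false;
    // Once S has been seen, none of its transitive predecessors may follow.
    std::vector<unsigned> Worklist(Predecessors[S].begin(),
                                   Predecessors[S].end());
    while (!Worklist.empty()) {
      unsigned P = Worklist.back();
      Worklist.pop_back();
      if (Disallowed[P])
        continue; // already marked, together with its own predecessors
      Disallowed[P] = true;
      Worklist.insert(Worklist.end(), Predecessors[P].begin(),
                      Predecessors[P].end());
    }
  }
  return true;
}
```

Sections with no edges in the graph can appear anywhere, which is the flexibility a single total ordering cannot express.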
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58312
llvm-svn: 354426
Jonas Devlieghere [Wed, 20 Feb 2019 01:49:16 +0000 (01:49 +0000)]
[TestModuleCXX] Use UNSUPPORTED instead of REQUIRES
The requires value turns out to be bogus and the test gets skipped on
macOS.
llvm-svn: 354425
Jonas Devlieghere [Wed, 20 Feb 2019 01:49:13 +0000 (01:49 +0000)]
[Instrumentation] Make API logging unconditional
We should always log API calls in addition to logging whether the call
was recorded as part of the reproducer. Since we already have the macro
we might as well put that logic there.
llvm-svn: 354424
Jonas Devlieghere [Wed, 20 Feb 2019 01:49:10 +0000 (01:49 +0000)]
[lldb-instr] Group RECORD macros
Group LLDB_RECORD macros per input file.
llvm-svn: 354423
Tom Stellard [Wed, 20 Feb 2019 01:40:35 +0000 (01:40 +0000)]
ELF: Fix typo in --build-id option description
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: emaste, arichardson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58265
llvm-svn: 354422
Heejin Ahn [Wed, 20 Feb 2019 01:29:34 +0000 (01:29 +0000)]
[WebAssembly] Refactor atomic operation definitions (NFC)
Summary:
- Make `ATOMIC_I`, `ATOMIC_NRI`, `AtomicLoad`, `AtomicStore` classes and
make other operations inherit from them
- Factor the common opcode prefix '0xfe' out from the opcodes into the
common class
- Reorder instructions in the order of increasing opcodes
Reviewers: tlively
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58338
llvm-svn: 354421
Sanjay Patel [Wed, 20 Feb 2019 01:24:59 +0000 (01:24 +0000)]
[InstCombine] regenerate test checks; NFC
llvm-svn: 354420
Heejin Ahn [Wed, 20 Feb 2019 01:14:36 +0000 (01:14 +0000)]
[WebAssembly] Fix load/store name detection for atomic instructions
Summary:
Fixed a bug in the routine in AsmParser that determines whether the
current instruction is a load or a store. Atomic instructions' prefixes
are not `atomic_` but `atomic.`, and all atomic instructions are also
memory instructions. Also fixed the printing format of atomic
instructions to match other memory instructions and added encoding tests
for atomic instructions.
Reviewers: aardappel, tlively
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58337
llvm-svn: 354419
Adrian Prantl [Wed, 20 Feb 2019 01:14:05 +0000 (01:14 +0000)]
Move -fcxx-modules to MANDATORY_MODULE_BUILD_CFLAGS (NFC)
llvm-svn: 354418
Tom Stellard [Wed, 20 Feb 2019 01:11:05 +0000 (01:11 +0000)]
CMake: Fix stand-alone clang builds since r353268
Summary:
Handle the case where LLVM_MAIN_SRC_DIR is not set and also use
LLVM_CMAKE_DIR for locating installed cmake files rather than
LLVM_CMAKE_PATH.
Reviewers: phosek, andrewrk, smeenai
Reviewed By: phosek, andrewrk, smeenai
Subscribers: mgorny, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D58204
llvm-svn: 354417
Wouter van Oortmerssen [Wed, 20 Feb 2019 00:55:59 +0000 (00:55 +0000)]
[WebAssembly] Fixed disassembler not knowing about OPERAND_EVENT
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58414
llvm-svn: 354416
Davide Italiano [Wed, 20 Feb 2019 00:54:10 +0000 (00:54 +0000)]
[lldbtest] Remove some accidentally commented out code.
llvm-svn: 354415
Davide Italiano [Wed, 20 Feb 2019 00:54:07 +0000 (00:54 +0000)]
[testsuite] Fix TestUnicodeString to work with Py2 and Py3.
llvm-svn: 354414
Nico Weber [Wed, 20 Feb 2019 00:34:19 +0000 (00:34 +0000)]
gn build: Merge r354365 more
llvm-svn: 354413
Philip Reames [Wed, 20 Feb 2019 00:31:28 +0000 (00:31 +0000)]
[GVN] Small tweaks to comments, style, and missed vector handling
Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches. The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast.
llvm-svn: 354412
Nico Weber [Wed, 20 Feb 2019 00:30:08 +0000 (00:30 +0000)]
gn build: Merge r354365
llvm-svn: 354411
Bob Haarman [Wed, 20 Feb 2019 00:26:01 +0000 (00:26 +0000)]
[lld-link] preserve @llvm.used symbols in LTO
Summary:
We translate @llvm.used to COFF by generating /include directives
in the .drectve section. However, in LTO links, this happens after
directives have already been processed, so the new directives do
not take effect. This change marks @llvm.used symbols as GCRoots
so that they are preserved as intended.
Fixes PR40733.
Reviewers: rnk, pcc, ruiu
Reviewed By: ruiu
Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58255
llvm-svn: 354410
Yonghong Song [Wed, 20 Feb 2019 00:22:19 +0000 (00:22 +0000)]
[BPF] make test case reloc-btf.ll tolerable for old compilers
The test case reloc-btf.ll is generated with an IR containing
spFlags introduced by https://reviews.llvm.org/rL347806.
In the case of BTF backporting, the old compiler may not
have this patch, so this test will fail during
validation.
This patch removed spFlags from IR in the test case
and used the old way for various flags.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 354409
Sanjay Patel [Wed, 20 Feb 2019 00:20:38 +0000 (00:20 +0000)]
Revert "[InstSimplify] use any-zero matcher for fcmp folds"
This reverts commit
058bb8351351d56d2a4e8a772570231f9e5305e5.
Forgot to update another test affected by this change.
llvm-svn: 354408
Philip Reames [Wed, 20 Feb 2019 00:15:54 +0000 (00:15 +0000)]
[GVN] Fix last crasher w/non-integral pointers
Same case as for memset and memcpy, but this time for clobbering stores and loads. We still can't allow coercion to or from non-integrals, regardless of the transform.
Now that I'm done with the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers.
My apologies, I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues. My thanks to Cherry Zhang for helping debug.
llvm-svn: 354407
Sanjay Patel [Wed, 20 Feb 2019 00:09:50 +0000 (00:09 +0000)]
[InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201
llvm-svn: 354406
Rui Ueyama [Wed, 20 Feb 2019 00:01:21 +0000 (00:01 +0000)]
Sort enum members so that arch-dependent members are at the right place. NFC.
llvm-svn: 354405
Sanjay Patel [Tue, 19 Feb 2019 23:58:02 +0000 (23:58 +0000)]
[InstSimplify] add vector tests for fcmp+fabs; NFC
llvm-svn: 354404
Philip Reames [Tue, 19 Feb 2019 23:49:38 +0000 (23:49 +0000)]
[GVN] Fix a crash bug w/non-integral pointers and memtransfers
The problem is very similar to the one fixed for memsets in r354399: we try to coerce a value to a non-integral type, and then crash while trying to do so. Since we shouldn't be doing such coercions to start with, the fix is easy. From inspection, I see two other cases which look to be similar and will follow up with more test cases and fixes if confirmed.
llvm-svn: 354403
Evgeniy Stepanov [Tue, 19 Feb 2019 23:41:42 +0000 (23:41 +0000)]
[msan] Fix name_to_handle_at test on overlayfs.
Udev supports name_to_handle_at. Use /dev/null instead of /bin/cat.
llvm-svn: 354402
Philip Reames [Tue, 19 Feb 2019 23:19:51 +0000 (23:19 +0000)]
[GVN] Fix a non-integral pointer bug w/vector types
GVN generally doesn't forward structs or array types, but it *will* forward vector types to non-vectors and vice versa. As demonstrated in tests, we need to inhibit the same set of transforms for vectors of non-integral pointers as for non-integral pointers themselves.
llvm-svn: 354401
Jonas Devlieghere [Tue, 19 Feb 2019 23:13:29 +0000 (23:13 +0000)]
[lldb-instr] Don't print REGISTER macro when RECORD is already present
Currently we'd always print the LLDB_REGISTER macro, even if the
LLDB_RECORD macro was already present. This patch changes that to make
it easier to incrementally update the macros.
Note that it's still possible for the RECORD and REGISTER macros to get
out of sync.
llvm-svn: 354400
Philip Reames [Tue, 19 Feb 2019 23:07:15 +0000 (23:07 +0000)]
[GVN] Fix a crash bug around non-integral pointers
If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed. Such a forward is not legal in general, but we can forward null pointers. Tests for both cases are included.
llvm-svn: 354399
Philip Reames [Tue, 19 Feb 2019 22:57:30 +0000 (22:57 +0000)]
[Test] Autogenerate existing tests before adding more
llvm-svn: 354398
Thomas Lively [Tue, 19 Feb 2019 22:56:19 +0000 (22:56 +0000)]
[WebAssembly] Update MC for bulk memory
Summary:
Rename MemoryIndex to InitFlags and implement logic for determining
data segment layout in ObjectYAML and MC. Also adds a "passive" flag
for the .section assembler directive although this cannot be assembled
yet because the assembler does not support data sections.
Reviewers: sbc100, aardappel, aheejin, dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57938
llvm-svn: 354397
Craig Topper [Tue, 19 Feb 2019 22:37:00 +0000 (22:37 +0000)]
[X86] Mark FP32_TO_INT16_IN_MEM/FP32_TO_INT32_IN_MEM/FP32_TO_INT64_IN_MEM as clobbering EFLAGS to prevent mis-scheduling during conversion from SelectionDAG to MIR.
After r354178, these instructions expand to a sequence that uses an OR instruction. That OR clobbers EFLAGS, so we need to state that to avoid accidentally using the clobbered flags.
Our tests show the bug, but I didn't notice because the SETcc instructions didn't move after r354178 since it used to be safe to do the fp->int conversion first.
We should probably convert this whole sequence to SelectionDAG instead of a custom inserter to avoid mistakes like this.
Fixes PR40779
llvm-svn: 354395
Sanjay Patel [Tue, 19 Feb 2019 22:35:12 +0000 (22:35 +0000)]
[LangRef] add to description of alloca instruction
As mentioned in D58359, we can explicitly state that the
memory allocated is uninitialized and reading that memory
produces undef.
llvm-svn: 354394
Sanjay Patel [Tue, 19 Feb 2019 22:14:21 +0000 (22:14 +0000)]
[InstCombine] reduce even more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
Name: uaddsat, -1 fval
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %notx, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
Name: uaddsat, -1 fval + ult
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %y, %notx
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
https://rise4fun.com/Alive/nTp
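The two folds above can be sanity-checked by brute force. The following is a minimal sketch, not the InstCombine implementation: it models the IR in Python at 8 bits instead of i32 (an assumption made only to keep the search exhaustive), and the function names are hypothetical.

```python
# Model of the two "before" patterns and the shared "after" pattern.
# BITS = 8 is an assumption for exhaustive checking; the commit uses i32.
BITS = 8
MASK = (1 << BITS) - 1  # all-ones value, i.e. -1 at this width


def before_ugt(x, y):
    notx = x ^ MASK                  # %notx = xor %x, -1
    a = (x + y) & MASK               # %a = add %x, %y
    return a if notx > y else MASK   # select (icmp ugt %notx, %y), %a, -1


def before_ult(x, y):
    notx = x ^ MASK                  # %notx = xor %x, -1
    a = (x + y) & MASK               # %a = add %x, %y
    return a if y < notx else MASK   # select (icmp ult %y, %notx), %a, -1


def after(x, y):
    a = (x + y) & MASK               # %a = add %x, %y
    return MASK if y > a else a      # select (icmp ugt %y, %a), -1, %a


def folds_agree():
    # Compare all three forms on every pair of unsigned 8-bit inputs.
    values = range(MASK + 1)
    return all(
        before_ugt(x, y) == after(x, y) == before_ult(x, y)
        for x in values
        for y in values
    )
```

The "after" form compares against the sum itself, which is the shape the commit says it wants for matching with m_UAddWithOverflow.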
llvm-svn: 354393
Kostya Serebryany [Tue, 19 Feb 2019 22:11:50 +0000 (22:11 +0000)]
[libFuzzer] docs: add a FAQ entry about dlclose
llvm-svn: 354392
Rui Ueyama [Tue, 19 Feb 2019 22:06:44 +0000 (22:06 +0000)]
Move MinGW-specific code out of LinkerDriver::link. NFC.
LinkerDriver::link is getting too long; it's time to simplify it.
Differential Revision: https://reviews.llvm.org/D58395
llvm-svn: 354391
Renato Golin [Tue, 19 Feb 2019 22:06:27 +0000 (22:06 +0000)]
second test on git-llvm-push
llvm-svn: 354390
Daniel Sanders [Tue, 19 Feb 2019 22:02:38 +0000 (22:02 +0000)]
Fix builds with llvm/runtimes/compiler-rt after r354365
Compiler-rt doesn't include config-ix which was providing CheckSymbolExists to
the LLVM build. Add it to HandleLLVMOptions to fix this
llvm-svn: 354389
Craig Topper [Tue, 19 Feb 2019 21:58:23 +0000 (21:58 +0000)]
[ArgumentPromotion] Add a lit.local.cfg to disable X86 specific tests if the X86 target doesn't exist.
Hopefully this fixes some buildbot failures after r354376
llvm-svn: 354388
Martin Storsjo [Tue, 19 Feb 2019 21:57:49 +0000 (21:57 +0000)]
[MinGW] Hook up the --exclude-all-symbols option
Differential Revision: https://reviews.llvm.org/D58380
llvm-svn: 354387
Martin Storsjo [Tue, 19 Feb 2019 21:57:44 +0000 (21:57 +0000)]
[COFF] Add -exclude-all-symbols for MinGW
This is a private undocumented option, intended to be used by
the MinGW driver frontend.
Also restructure the condition to put if (Config->MinGW) first.
This changes the behaviour for the tautological combination of
-export-all-symbols without -lldmingw.
Differential Revision: https://reviews.llvm.org/D58380
llvm-svn: 354386
Greg Clayton [Tue, 19 Feb 2019 21:48:34 +0000 (21:48 +0000)]
Add Facebook Minidump directory streams and options to dump them.
Facebook creates minidump files that contain specific information about why things crash. Adding ways to dump these allows tools to be made that can auto download symbols based on the information that is contained in the minidump files.
Differential Revision: https://reviews.llvm.org/D58398
llvm-svn: 354385
Sanjay Patel [Tue, 19 Feb 2019 21:46:13 +0000 (21:46 +0000)]
[InstCombine] rearrange saturated add folds; NFC
This is no-functional-change-intended, but that was also
true when it was part of rL354276, and I managed to lose
2 predicates for the fold with constant...causing much bot
distress. So this time I'm adding a couple of negative tests
to avoid that.
llvm-svn: 354384
Renato Golin [Tue, 19 Feb 2019 21:32:05 +0000 (21:32 +0000)]
Testing git-llvm-push script
llvm-svn: 354383
Jinsong Ji [Tue, 19 Feb 2019 21:25:13 +0000 (21:25 +0000)]
PowerPC: Fix typos in comments
llvm-svn: 354382
Andrew Scheidecker [Tue, 19 Feb 2019 21:21:54 +0000 (21:21 +0000)]
[ConstantFold] Fix misfolding fcmp of a ConstantExpr NaN with itself.
The code incorrectly inferred that the relationship of a constant expression
to itself is FCMP_OEQ (ordered and equal), when it's actually FCMP_UEQ
(unordered *or* equal). This change corrects that, and adds some more limited
folds that can be done in this case.
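The difference between the two predicates can be illustrated outside LLVM. This is a small sketch using Python floats; the helper names are hypothetical and only mirror the meaning of the predicates named above (ordered means neither operand is NaN).

```python
import math


def fcmp_oeq(a, b):
    # Ordered and equal: neither operand is NaN, and they compare equal.
    return not (math.isnan(a) or math.isnan(b)) and a == b


def fcmp_ueq(a, b):
    # Unordered or equal: either operand is NaN, or they compare equal.
    return math.isnan(a) or math.isnan(b) or a == b


def holds_for_self(pred, x):
    # Does "x pred x" hold? This is the relationship the fold reasons about.
    return pred(x, x)
```

For a value that may be NaN, only the unordered-or-equal relationship with itself is guaranteed: `holds_for_self(fcmp_ueq, x)` is true for every float, while `holds_for_self(fcmp_oeq, float("nan"))` is false.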
Differential revision: https://reviews.llvm.org/D51216
llvm-svn: 354381
Andrew Scheidecker [Tue, 19 Feb 2019 21:03:20 +0000 (21:03 +0000)]
[ConstantFold] Fix misfolding of icmp with a bitcast FP second operand.
In the process of trying to eliminate the bitcast, this was producing a
malformed icmp with FP operands.
Differential revision: https://reviews.llvm.org/D51215
llvm-svn: 354380
Vedant Kumar [Tue, 19 Feb 2019 20:45:00 +0000 (20:45 +0000)]
[llvm-cov] Add support for gcov --hash-filenames option
The patch adds support for --hash-filenames to llvm-cov. This option adds md5
hash of the source path to the name of the generated .gcov file. The option is
crucial for cases where you have multiple files with the same name but can't
use --preserve-paths as resulting filenames exceed the limit.
from gcov(1):
```
-x
--hash-filenames
By default, gcov uses the full pathname of the source files to
create an output filename. This can lead to long filenames that
can overflow filesystem limits. This option creates names of the
form source-file##md5.gcov, where the source-file component is
the final filename part and the md5 component is calculated from
the full mangled name that would have been used otherwise.
```
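The naming scheme quoted above can be sketched as follows. This is an illustration only, not the llvm-cov code: the function name is hypothetical, and the mangling step (replacing `/` with `#`) is an assumption standing in for "the full mangled name that would have been used otherwise".

```python
import hashlib
import os


def hashed_gcov_name(source_path):
    # Assumption: the mangled name is the path with '/' replaced by '#'.
    mangled = source_path.replace("/", "#")
    digest = hashlib.md5(mangled.encode("utf-8")).hexdigest()
    # Result has the form source-file##md5.gcov.
    return f"{os.path.basename(source_path)}##{digest}.gcov"
```

Two files that share a basename but live in different directories get distinct, fixed-length output names, which is the case the option is meant to handle.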
Patch by Igor Ignatev!
Differential Revision: https://reviews.llvm.org/D58370
llvm-svn: 354379
Andrew Scheidecker [Tue, 19 Feb 2019 20:38:51 +0000 (20:38 +0000)]
Testing commit access
llvm-svn: 354378
Vitaly Buka [Tue, 19 Feb 2019 20:36:52 +0000 (20:36 +0000)]
[msan] Remove cxa_atexit_race.cc
Summary:
The goal of the test is to check that msan does not crash when code is racy on __cxa_atexit. The original crash was caused by a race condition in glibc. With
the msan patch, msan no longer crashes; however, the race is still there and the test triggers it.
Because the test relies on triggering undefined behavior, results are not
very predictable and it may occasionally crash or hang.
I don't see how to reasonably improve the test, so I am removing it.
Reviewers: eugenis, peter.smith
Subscribers: jfb, jdoerfert, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D58396
llvm-svn: 354377
Craig Topper [Tue, 19 Feb 2019 20:12:20 +0000 (20:12 +0000)]
[X86] Don't consider functions ABI compatible for ArgumentPromotion pass if they view 512-bit vectors differently.
The use of the -mprefer-vector-width=256 command line option mixed with functions
using vector intrinsics can create situations where one function thinks 512-bit vectors
are legal, but another function does not.
If a 512 bit vector is passed between them via a pointer, it's possible ArgumentPromotion
might try to pass by value instead. This will result in type legalization for the two
functions handling the 512 bit vector differently leading to runtime failures.
Had the 512 bit vector been passed by value from clang codegen, both functions would
have been tagged with a min-legal-vector-width=512 function attribute. That would
make them be legalized the same way.
I observed this issue in 32-bit mode where a union containing a 512 bit vector was
being passed by a function that used intrinsics to one that did not. The caller
ended up passing in zmm0 and the callee tried to read it from ymm0 and ymm1.
The fix implemented here is just to consider it a mismatch if two functions
would handle 512 bit differently without looking at the types that are being
considered. This is the easiest and safest fix, but it can be improved in the future.
Differential Revision: https://reviews.llvm.org/D58390
llvm-svn: 354376
Matthew Voss [Tue, 19 Feb 2019 19:46:08 +0000 (19:46 +0000)]
Revert "Revert "[llvm-objdump] Allow short options without arguments to be grouped""
- Tests that use multiple short switches now test them grouped and ungrouped.
- Ensure the output of ungrouped and grouped variants is identical
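What "grouped" means for short options can be sketched like this. It is a hypothetical helper, not the llvm-objdump option parser, and the flag letters passed in are placeholders for whichever argument-less short options a tool accepts.

```python
def expand_grouped(args, short_flags):
    """Expand grouped short options such as -dr into -d -r.

    short_flags is the set of single-letter options that take no argument
    (an assumption supplied by the caller).
    """
    expanded = []
    for arg in args:
        letters = arg[1:]
        is_group = (
            arg.startswith("-")
            and not arg.startswith("--")
            and len(letters) > 1
            and all(letter in short_flags for letter in letters)
        )
        if is_group:
            expanded.extend("-" + letter for letter in letters)
        else:
            expanded.append(arg)
    return expanded
```

Under this model the grouped and ungrouped command lines expand to the same list, which is the property the updated tests check through identical output.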
Differential Revision: https://reviews.llvm.org/D57904
llvm-svn: 354375
Daniel Sanders [Tue, 19 Feb 2019 19:45:03 +0000 (19:45 +0000)]
Fix builds for older macOS deployment targets after r354365
Surprisingly, check_symbol_exists is not sufficient. The macOS linker checks the
called functions against a compatibility list for the given deployment target
and check_symbol_exists doesn't trigger this check as it never calls the
function.
This fixes the GreenDragon bots where the deployment target is 10.9
llvm-svn: 354374
Kostya Serebryany [Tue, 19 Feb 2019 19:28:08 +0000 (19:28 +0000)]
[sanitizers] add a regression test for the bug fixed in r354366
llvm-svn: 354373
Jonathan Peyton [Tue, 19 Feb 2019 19:00:29 +0000 (19:00 +0000)]
[OpenMP] Remove XFAIL for cancellation tests using gcc
llvm-svn: 354370
Jonathan Peyton [Tue, 19 Feb 2019 18:51:11 +0000 (18:51 +0000)]
[OpenMP 5.0] Add omp_get_supported_active_levels()
This patch adds the new 5.0 API function omp_get_supported_active_levels().
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58211
llvm-svn: 354368
Jonathan Peyton [Tue, 19 Feb 2019 18:47:57 +0000 (18:47 +0000)]
[OpenMP] Adding GOMP compatible cancellation
Remove fatal error messages from the cancellation API for GOMP
Add __kmp_barrier_gomp_cancel() to implement cancellation of parallel regions.
This new function uses the linear barrier algorithm with a cancellable
nonsleepable wait loop.
Differential Revision: https://reviews.llvm.org/D57969
llvm-svn: 354367
Kostya Serebryany [Tue, 19 Feb 2019 18:43:24 +0000 (18:43 +0000)]
[sanitizer] fix a memory safety bug (!!!) in sanitizer suppressions code, discovered by Aaron Jacobs
llvm-svn: 354366
Daniel Sanders [Tue, 19 Feb 2019 18:18:31 +0000 (18:18 +0000)]
Annotate timeline in Instruments with passes and other timed regions.
Summary:
Instruments is a useful tool for finding performance issues in LLVM but it can
be difficult to identify regions of interest on the timeline that we can use
to filter the profiler or allocations instrument. Xcode 10 and the latest
macOS/iOS/etc. added support for the os_signpost() API which allows us to
annotate the timeline with information that's meaningful to LLVM.
This patch causes timer start and end events to emit signposts. When used with
-time-passes, this causes the passes to be annotated on the Instruments timeline.
In addition to visually showing the duration of passes on the timeline, it also
allows us to filter the profile and allocations instrument down to an individual
pass allowing us to find the issues within that pass without being drowned out
by the noise from other parts of the compiler.
Using this in conjunction with the Time Profiler (in high frequency mode) and
the Allocations instrument is how I found the SparseBitVector that should have
been a BitVector and the DenseMap that could be replaced by a sorted vector a
couple months ago. I added NamedRegionTimers to TableGen and used the resulting
annotations to identify the slow portions of the Register Info Emitter. Some of
these were placed according to educated guesses while others were placed
according to hot functions from a previous profile. From there I filtered the
profile to a slow portion and the aforementioned issues stood out in the
profile.
To use this feature enable LLVM_SUPPORT_XCODE_SIGNPOSTS in CMake and run the
compiler under Instruments with -time-passes like so:
instruments -t 'Time Profiler' bin/llc -time-passes -o - input.ll
Then open the resulting trace in Instruments.
There was a talk at WWDC 2018 that explained the feature which can be found at
https://developer.apple.com/videos/play/wwdc2018/405/ if you'd like to know
more about it.
Reviewers: bogner
Reviewed By: bogner
Subscribers: jdoerfert, mgorny, kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D52954
llvm-svn: 354365
Jordan Rupprecht [Tue, 19 Feb 2019 18:14:44 +0000 (18:14 +0000)]
[libObject][NFC] Use sys::path::convert_to_slash.
Summary: As suggested in rL353995
Reviewers: compnerd
Reviewed By: compnerd
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58298
llvm-svn: 354364
Simon Pilgrim [Tue, 19 Feb 2019 18:05:42 +0000 (18:05 +0000)]
[X86][SSE] Generalize X86ISD::BLENDI support to more value types
D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required.
With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors.
This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.
I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold, there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).
The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.
Reapplied after reversion at rL353699 - AVX2 isel fix was applied at rL354358, additional test at rL354360/rL354361
Differential Revision: https://reviews.llvm.org/D57888
llvm-svn: 354363
Serge Guelton [Tue, 19 Feb 2019 18:03:47 +0000 (18:03 +0000)]
[NFC] Remove unused headers in Optional.h
llvm-svn: 354362
Simon Pilgrim [Tue, 19 Feb 2019 17:57:36 +0000 (17:57 +0000)]
Fix stupid assembly comment typo
llvm-svn: 354361
Simon Pilgrim [Tue, 19 Feb 2019 17:56:14 +0000 (17:56 +0000)]
[X86][SSE] Add pblendw commuted load test case
Reduced test case for the regression caused in D57888/rL353610
llvm-svn: 354360
Nikita Popov [Tue, 19 Feb 2019 17:37:55 +0000 (17:37 +0000)]
[SDAG] Use shift amount type in MULO promotion; NFC
Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths that do not fit into the shift amount type do not assert.
Split out from D57997.
llvm-svn: 354359
Simon Pilgrim [Tue, 19 Feb 2019 17:23:55 +0000 (17:23 +0000)]
[X86][AVX2] Hide VPBLENDD instructions behind AVX2 predicate
This was the cause of the regression in D57888 - the commuted load pattern wasn't hidden by the predicate, so once we enabled v4i32 blends on SSE41+ targets, isel incorrectly matched AVX2+ instructions.
llvm-svn: 354358
Craig Topper [Tue, 19 Feb 2019 17:16:23 +0000 (17:16 +0000)]
[X86] Bugfix for nullptr check by klocwork
klocwork critical issues in CG files:
Patch by Xiang Zhang (xiangzhangllvm)
Differential Revision: https://reviews.llvm.org/D58363
llvm-svn: 354357
Craig Topper [Tue, 19 Feb 2019 17:13:40 +0000 (17:13 +0000)]
X86AsmParser AVX-512: Return error instead of hitting assert
When parsing a sequence of tokens beginning with {, it will hit an assert and crash if the token afterwards is not an identifier. Instead of this, return a more verbose error as seen elsewhere in the function.
Patch by Brandon Jones (BrandonTJones)
Differential Revision: https://reviews.llvm.org/D57375
llvm-svn: 354356
Craig Topper [Tue, 19 Feb 2019 17:05:11 +0000 (17:05 +0000)]
[X86] Filter out tuning feature flags and a few ISA feature flags when checking for function inline compatibility.
Tuning flags don't have any effect on the available instructions so aren't a good reason to prevent inlining.
There are also some ISA flags that don't have any intrinsics or ABI requirements that we can exclude. I've put only the most basic ones like cmpxchg16b and lahfsahf. These are interesting because they aren't present in all 64-bit CPUs, but we have codegen workarounds when they aren't present.
Loosening these checks can help with scenarios where a caller has a more specific CPU than a callee. The default tuning flags on our generic 'x86-64' CPU can currently make it inline incompatible with other CPUs. I've also added an example test for 'nocona' and 'prescott' where 'nocona' is just a 64-bit capable version of 'prescott' but in 32-bit mode they should be completely compatible.
I've based the implementation here on the similar code in AMDGPU.
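The filtering idea can be sketched with feature sets. This is a simplified model, not the X86 or AMDGPU code: it assumes the underlying rule is that the callee's features must be a subset of the caller's, the function name is hypothetical, and the `tune-*` strings are placeholder tuning flags (only cmpxchg16b and lahfsahf are named in the message above).

```python
# Features ignored for inline compatibility. The tune-* names are
# placeholders; cmpxchg16b and lahfsahf come from the commit message.
IGNORED_FOR_INLINING = frozenset(
    {"tune-example-a", "tune-example-b", "cmpxchg16b", "lahfsahf"}
)


def inline_compatible(caller_features, callee_features):
    # Drop ignored flags from both sides, then require (as an assumed rule)
    # that every remaining callee feature is also enabled in the caller.
    caller = set(caller_features) - IGNORED_FOR_INLINING
    callee = set(callee_features) - IGNORED_FOR_INLINING
    return callee <= caller
```

With the filter in place, a callee that differs from its caller only in tuning flags or in the listed ISA flags no longer blocks inlining, while a callee that needs an instruction-set feature the caller lacks still does.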
Differential Revision: https://reviews.llvm.org/D58371
llvm-svn: 354355