Chris Jackson [Fri, 30 Aug 2019 10:17:16 +0000 (10:17 +0000)]
[llvm-objcopy] Allow the visibility of symbols created by --binary and
--add-symbol to be specified with --new-symbol-visibility
llvm-svn: 370458
Balazs Keri [Fri, 30 Aug 2019 10:12:14 +0000 (10:12 +0000)]
[ASTImporter] Propagate errors during import of overridden methods.
Summary:
If importing overridden methods fails for a method it can be seen
incorrectly as non-virtual. To avoid this inconsistency the method
is marked with import error to avoid later use of it.
Reviewers: martong, a.sidorin, shafik, a_sidorin
Reviewed By: martong, shafik
Subscribers: rnkovacs, dkrupp, Szelethus, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66933
llvm-svn: 370457
Hideto Ueno [Fri, 30 Aug 2019 10:00:32 +0000 (10:00 +0000)]
[Attributor] Implement AANoAliasCallSiteArgument initialization
Summary: This patch adds an appropriate `initialize` method for `AANoAliasCallSiteArgument`.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66927
llvm-svn: 370456
Shaurya Gupta [Fri, 30 Aug 2019 09:57:56 +0000 (09:57 +0000)]
[Clangd] ExtractFunction Added checks for broken control flow
Summary:
- Added checks for broken control flow
- Added unittests
Reviewers: sammccall, kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66732
llvm-svn: 370455
Roman Lebedev [Fri, 30 Aug 2019 09:51:23 +0000 (09:51 +0000)]
[LoopIdiomRecognize] BCmp loop idiom recognition
Summary:
@mclow.lists brought up this issue up in IRC.
It is a reasonably common problem to compare some two values for equality.
Those may be just some integers, strings or arrays of integers.
In C, there is `memcmp()`, `bcmp()` functions.
In C++, there exists `std::equal()` algorithm.
One can also write that function manually.
libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for
various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ
libc++ does not do anything like that, it simply relies on simple C++'s
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)
So likely, there exists a certain performance opportunities.
Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that
is using `memcmp()` (in this case, compiled with modified compiler). {
F8768213}
```
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>
#include "benchmark/benchmark.h"
template <class T>
bool equal(T* a, T* a_end, T* b) noexcept {
for (; a != a_end; ++a, ++b) {
if (*a != *b) return false;
}
return true;
}
template <typename T>
std::vector<T> getVectorOfRandomNumbers(size_t count) {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
std::numeric_limits<T>::max());
std::vector<T> v;
v.reserve(count);
std::generate_n(std::back_inserter(v), count,
[&dis, &gen]() { return dis(gen); });
assert(v.size() == count);
return v;
}
struct Identical {
template <typename T>
static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
auto Tmp = getVectorOfRandomNumbers<T>(count);
return std::make_pair(Tmp, std::move(Tmp));
}
};
struct InequalHalfway {
template <typename T>
static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
auto V0 = getVectorOfRandomNumbers<T>(count);
auto V1 = V0;
V1[V1.size() / size_t(2)]++; // just change the value.
return std::make_pair(std::move(V0), std::move(V1));
}
};
template <class T, class Gen>
void BM_bcmp(benchmark::State& state) {
const size_t Length = state.range(0);
const std::pair<std::vector<T>, std::vector<T>> Data =
Gen::template Gen<T>(Length);
const std::vector<T>& a = Data.first;
const std::vector<T>& b = Data.second;
assert(a.size() == Length && b.size() == a.size());
benchmark::ClobberMemory();
benchmark::DoNotOptimize(a);
benchmark::DoNotOptimize(a.data());
benchmark::DoNotOptimize(b);
benchmark::DoNotOptimize(b.data());
for (auto _ : state) {
const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
benchmark::DoNotOptimize(is_equal);
}
state.SetComplexityN(Length);
state.counters["eltcnt"] =
benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant);
state.counters["eltcnt/sec"] =
benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate);
const size_t BytesRead = 2 * sizeof(T) * Length;
state.counters["bytes_read/iteration"] =
benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
benchmark::Counter::OneK::kIs1024);
state.counters["bytes_read/sec"] = benchmark::Counter(
BytesRead, benchmark::Counter::kIsIterationInvariantRate,
benchmark::Counter::OneK::kIs1024);
}
template <typename T>
static void CustomArguments(benchmark::internal::Benchmark* b) {
const size_t L2SizeBytes = []() {
for (const benchmark::CPUInfo::CacheInfo& I :
benchmark::CPUInfo::Get().caches) {
if (I.level == 2) return I.size;
}
return 0;
}();
// What is the largest range we can check to always fit within given L2 cache?
const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
/*maximal elt size*/ sizeof(T) / /*safety margin*/ 2;
b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}
BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical)
->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical)
->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical)
->Apply(CustomArguments<uint64_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway)
->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway)
->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway)
->Apply(CustomArguments<uint64_t>);
```
{
F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench
RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx
2019-04-25 21:17:11
Running build-old/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
L1 Data 16K (x8)
L1 Instruction 64K (x4)
L2 Unified 2048K (x4)
L3 Unified 8192K (x1)
Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000 432131 ns 432101 ns 1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO 0.86 N 0.86 N
BM_bcmp<uint8_t, Identical>_RMS 8 % 8 %
<...>
BM_bcmp<uint16_t, Identical>/256000 161408 ns 161409 ns 4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO 0.67 N 0.67 N
BM_bcmp<uint16_t, Identical>_RMS 25 % 25 %
<...>
BM_bcmp<uint32_t, Identical>/128000 81497 ns 81488 ns 8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO 0.71 N 0.71 N
BM_bcmp<uint32_t, Identical>_RMS 42 % 42 %
<...>
BM_bcmp<uint64_t, Identical>/64000 50138 ns 50138 ns 10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO 0.84 N 0.84 N
BM_bcmp<uint64_t, Identical>_RMS 27 % 27 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000 192405 ns 192392 ns 3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO 0.38 N 0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS 3 % 3 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000 127858 ns 127860 ns 5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO 0.50 N 0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS 0 % 0 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000 49140 ns 49140 ns 14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO 0.40 N 0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS 18 % 18 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000 32101 ns 32099 ns 21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO 0.50 N 0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS 1 % 1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29
Running build-new/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
L1 Data 16K (x8)
L1 Instruction 64K (x4)
L2 Unified 2048K (x4)
L3 Unified 8192K (x1)
Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000 18593 ns 18590 ns 37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO 0.04 N 0.04 N
BM_bcmp<uint8_t, Identical>_RMS 37 % 37 %
<...>
BM_bcmp<uint16_t, Identical>/256000 18950 ns 18948 ns 37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO 0.08 N 0.08 N
BM_bcmp<uint16_t, Identical>_RMS 34 % 34 %
<...>
BM_bcmp<uint32_t, Identical>/128000 18627 ns 18627 ns 37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO 0.16 N 0.16 N
BM_bcmp<uint32_t, Identical>_RMS 35 % 35 %
<...>
BM_bcmp<uint64_t, Identical>/64000 18855 ns 18855 ns 37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO 0.32 N 0.32 N
BM_bcmp<uint64_t, Identical>_RMS 33 % 33 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000 9570 ns 9569 ns 73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO 0.02 N 0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS 29 % 29 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000 9547 ns 9547 ns 74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO 0.04 N 0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS 29 % 29 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000 9396 ns 9394 ns 73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO 0.08 N 0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS 30 % 30 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000 9499 ns 9498 ns 73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO 0.16 N 0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS 28 % 28 %
Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench
Benchmark Time CPU Time Old Time New CPU Old CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000 -0.9570 -0.9570 432131 18593 432101 18590
<...>
BM_bcmp<uint16_t, Identical>/256000 -0.8826 -0.8826 161408 18950 161409 18948
<...>
BM_bcmp<uint32_t, Identical>/128000 -0.7714 -0.7714 81497 18627 81488 18627
<...>
BM_bcmp<uint64_t, Identical>/64000 -0.6239 -0.6239 50138 18855 50138 18855
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000 -0.9503 -0.9503 192405 9570 192392 9569
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000 -0.9253 -0.9253 127858 9547 127860 9547
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000 -0.8088 -0.8088 49140 9396 49140 9394
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000 -0.7041 -0.7041 32101 9499 32099 9498
```
What can we tell from the benchmark?
* Performance of naive equality check somewhat improves with element size,
maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s
for uint64_t. I think, that instability implies performance problems.
* Performance of `memcmp()`-aware benchmark always maxes out at around
bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the
naive variant!
* eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and
linearly decreases with element size.
For uint64_t, it's ~4x+ the elements/second.
* The call obvious is more pricey than the loop, with small element count.
As it can be seen from the full output {
F8768210}, the `memcmp()` is almost
universally worse, independent of the element size (and thus buffer size) when
element count is less than 8.
So all in all, bcmp idiom does indeed pose untapped performance headroom.
This diff does implement said idiom recognition. I think a reasonable test
coverage is present, but do tell if there is anything obvious missing.
Now, quality. This does succeed to build and pass the test-suite, at least
without any non-bundled elements. {
F8768216} {
F8768217}
This transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json
Tests: 1149
Metric: loop-idiom.NumBCmp
Program result-new
MultiSourc...Benchmarks/7zip/7zip-benchmark 79.00
MultiSource/Applications/d/make_dparser 3.00
SingleSource/UnitTests/vla 2.00
MultiSource/Applications/Burg/burg 1.00
MultiSourc.../Applications/JM/lencod/lencod 1.00
MultiSource/Applications/lemon/lemon 1.00
MultiSource/Benchmarks/Bullet/bullet 1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs 1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc 1.00
MultiSourc...Prolangs-C/simulator/simulator 1.00
```
The size changes are:
I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look.
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash
Tests: 1149
Same hash: 907 (filtered out)
Remaining: 242
Metric: size..text
Program result-old result-new diff
test-suite...ingleSource/UnitTests/vla.test 753.00 833.00 10.6%
test-suite...marks/7zip/7zip-benchmark.test
1001697.00 966657.00 -3.5%
test-suite...ngs-C/simulator/simulator.test 32369.00 32321.00 -0.1%
test-suite...plications/d/make_dparser.test 89585.00 89505.00 -0.1%
test-suite...ce/Applications/Burg/burg.test 40817.00 40785.00 -0.1%
test-suite.../Applications/lemon/lemon.test 47281.00 47249.00 -0.1%
test-suite...TimberWolfMC/timberwolfmc.test 250065.00 250113.00 0.0%
test-suite...chmarks/MallocBench/gs/gs.test 149889.00 149873.00 -0.0%
test-suite...ications/JM/lencod/lencod.test 769585.00 769569.00 -0.0%
test-suite.../Benchmarks/Bullet/bullet.test 770049.00 770049.00 0.0%
test-suite...HMARK_ANISTROPIC_DIFFUSION/128 NaN NaN nan%
test-suite...HMARK_ANISTROPIC_DIFFUSION/256 NaN NaN nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/64 NaN NaN nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/32 NaN NaN nan%
test-suite...ENCHMARK_BILATERAL_FILTER/64/4 NaN NaN nan%
Geomean difference nan%
result-old result-new diff
count 1.
000000e+01 10.00000 10.000000
mean 3.
152090e+05 311695.40000 0.006749
std 3.
790398e+05 372091.42232 0.036605
min 7.
530000e+02 833.00000 -0.034981
25% 4.
243300e+04 42401.00000 -0.000866
50% 1.
197370e+05 119689.00000 -0.000392
75% 6.
397050e+05 639705.00000 -0.000005
max 1.
001697e+06 966657.00000 0.106242
```
I don't have timings though.
And now to the code. The basic idea is to completely replace the whole loop.
If we can't fully kill it, don't transform.
I have left one or two comments in the code, so hopefully it can be understood.
Also, there is a few TODO's that i have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???
Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet
Reviewed By: courbet
Subscribers: hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61144
llvm-svn: 370454
Roman Lebedev [Fri, 30 Aug 2019 09:51:02 +0000 (09:51 +0000)]
[NFC] SCEVExpander: add SetCurrentDebugLocation() / getCurrentDebugLocation() wrappers
Summary:
The internal `Builder` is private, which means there is
currently no way to set the debuginfo locations for `SCEVExpander`.
This only adds the wrappers, but does not use them anywhere.
Reviewers: mkazantsev, sanjoy, gberry, jyknight, dneilson
Reviewed By: sanjoy
Subscribers: javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61007
llvm-svn: 370453
Johan Vikstrom [Fri, 30 Aug 2019 09:33:27 +0000 (09:33 +0000)]
[clangd] Collecting main file macro expansion locations in ParsedAST.
Summary: TokenBuffer does not collect macro expansions inside macro arguments which is needed for semantic higlighting. Therefore collects macro expansions in the main file in a PPCallback when building the ParsedAST instead.
Reviewers: hokein, ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66928
llvm-svn: 370452
Dmitri Gribenko [Fri, 30 Aug 2019 09:29:34 +0000 (09:29 +0000)]
[Tooling] Migrated APIs that take ownership of objects to unique_ptr
Subscribers: jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66960
llvm-svn: 370451
Pavel Labath [Fri, 30 Aug 2019 09:07:42 +0000 (09:07 +0000)]
dotest: improvements to the pexpect tests
Summary:
While working on r370054, i've found it frustrating that the test output
was compeletely unhelpful in case of failures. Therefore I've decided to
improve that. In this I reuse the PExpectTest class, which was one of
our mechanisms for running pexpect tests, but which has gotten orhpaned
in the mean time.
I've replaced the existing send methods with a "expect" method, which
I've tried to design so that it has a similar interface to the expect
method in regular non-pexpect dotest tests (as it essentially does
something very similar). I've kept the ability to dump the transcript of
the pexpect communication to stdout in the "trace" mode, as that is a
very handy way to figure out what the test is doing. I've also removed
the "expect_string" method used in the existing tests -- I've found this
to be unhelpful because it hides the message that would be normally
displayed by the EOF exception. Although vebose, this message includes
some important information, like what strings we were searching for,
what were the last bits of lldb output, etc. I've also beefed up the
class to automatically disable the debug info test duplication, and
auto-skip tests when the host platform does not support pexpect.
This patch ports TestMultilineCompletion and TestIOHandlerCompletion to
the new class. It also deletes TestFormats as it is not testing anything
(definitely not formats) -- it was committed with the test code
commented out (r228207), and then the testing code was deleted in
r356000.
Reviewers: teemperor, JDevlieghere, davide
Subscribers: aprantl, lldb-commits
Differential Revision: https://reviews.llvm.org/D66954
llvm-svn: 370449
David Stenberg [Fri, 30 Aug 2019 09:06:50 +0000 (09:06 +0000)]
[LiveDebugValues] Insert entry values after bundles
Summary:
Change LiveDebugValues so that it inserts entry values after the bundle
which contains the clobbering instruction. Previously it would insert
the debug value after the bundle head using insertAfter(), breaking the
bundle.
Reviewers: djtodoro, NikolaPrica, aprantl, vsk
Reviewed By: vsk
Subscribers: hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D66888
llvm-svn: 370448
Haojian Wu [Fri, 30 Aug 2019 09:06:18 +0000 (09:06 +0000)]
[clangd] Add .vscode-test to .gitignore.
Reviewers: jvikstrom
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66949
llvm-svn: 370446
Alexander Potapenko [Fri, 30 Aug 2019 08:58:46 +0000 (08:58 +0000)]
[CodeGen]: fix error message for "=r" asm constraint
Summary:
Nico Weber reported that the following code:
char buf[9];
asm("" : "=r" (buf));
yields the "impossible constraint in asm: can't store struct into a register"
error message, although |buf| is not a struct (see
http://crbug.com/999160).
Make the error message more generic and add a test for it.
Also make sure other tests in x86_64-PR42672.c check for the full error
message.
Reviewers: eli.friedman, thakis
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66948
llvm-svn: 370444
Sven van Haastregt [Fri, 30 Aug 2019 08:52:55 +0000 (08:52 +0000)]
vim: add `immarg` keyword
The `immarg` attribute was added in r355981.
llvm-svn: 370443
Nico Weber [Fri, 30 Aug 2019 08:26:37 +0000 (08:26 +0000)]
gn build: Merge r370441
llvm-svn: 370442
Dmitri Gribenko [Fri, 30 Aug 2019 08:21:55 +0000 (08:21 +0000)]
[ADT] Removed VariadicFunction
Summary:
It is not used. It uses macro-based unrolling instead of variadic
templates, so it is not idiomatic anymore, and therefore it is a
questionable API to keep "just in case".
Subscribers: mgorny, dmgreen, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66961
llvm-svn: 370441
Raphael Isemann [Fri, 30 Aug 2019 07:44:29 +0000 (07:44 +0000)]
[lldb][NFC] Move Clang-specific flags to ClangUserExpression
LLVMUserExpression doesn't use these variables and they are all specific to Clang.
Also removes m_const_object as this was actually never used by anyone (and Clang
didn't report it as we assigned it in the constructor which seems to count as use).
llvm-svn: 370440
Fangrui Song [Fri, 30 Aug 2019 07:10:30 +0000 (07:10 +0000)]
[ELF] Set `referenced` bit of Undefined created by BitcodeFile
D64136 and D65584, while fixing STB_WEAK issues and improving our
compatibility with ld.bfd, can cause another STB_WEAK problem related to
LTO:
If %tundef.o has an undefined reference on f,
and %tweakundef.o has a weak undefined reference on f,
%tdef.o has a definition of f
```
ld.lld %tundef.o %tweakundef.o --start-lib %tdef.o --end-lib
```
1) `%tundef.o` doesn't set the `referenced` bit.
2) `%weakundef.o` changes the binding from STB_GLOBAL to STB_WEAK
3) `%tdef.o` is not fetched because the binding is weak.
Step (1) is incorrect. This patch sets the `referenced` bit of Undefined
created by bitcode files.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66992
llvm-svn: 370437
Martin Storsjo [Fri, 30 Aug 2019 06:56:33 +0000 (06:56 +0000)]
[LLD] [COFF] Support merging resource object files
Extend WindowsResourceParser to support using a ResourceSectionRef for
loading resources from an object file.
Only allow merging resource object files in mingw mode; keep the
existing error on multiple resource objects in link mode.
If there only is one resource object file and no .res resources,
don't parse and recreate the .rsrc section, but just link it in without
inspecting it. This allows users to produce any .rsrc section (outside
of what the parser supports), just like before. (I don't have a specific
need for this, but it reduces the risk of this new feature.)
Separate out the .rsrc section chunks in InputFiles.cpp, and only include
them in the list of section chunks to link if we've determined that there
only was one single resource object. (We need to keep other chunks from
those object files, as they can legitimately contain other sections as
well, in addition to .rsrc section chunks.)
Differential Revision: https://reviews.llvm.org/D66824
llvm-svn: 370436
Martin Storsjo [Fri, 30 Aug 2019 06:56:02 +0000 (06:56 +0000)]
[WindowsResource] Remove use of global variables in WindowsResourceParser
Instead of updating a global variable counter for the next index of
strings and data blobs, pass along a reference to actual data/string
vectors and let the TreeNode insertion methods add their data/strings to
the vectors when a new entry is needed.
Additionally, if the resource tree had duplicates, that were ignored
with -force:multipleres in lld, we no longer store all versions of the
duplicated resource data, now we only keep the one that actually ends
up referenced.
Differential Revision: https://reviews.llvm.org/D66823
llvm-svn: 370435
Martin Storsjo [Fri, 30 Aug 2019 06:55:54 +0000 (06:55 +0000)]
[WindowsResource] Avoid duplicating the input filenames for each resource. NFC.
Differential Revision: https://reviews.llvm.org/D66821
llvm-svn: 370434
Martin Storsjo [Fri, 30 Aug 2019 06:55:49 +0000 (06:55 +0000)]
[COFF] Add a ResourceSectionRef method for getting resource contents
This allows llvm-readobj to print the contents of each resource
when printing resources from an object file or executable, like it
already does for plain .res files.
This requires providing the whole COFFObjectFile to ResourceSectionRef.
This supports both object files and executables. For executables,
the DataRVA field is used as is to look up the right section.
For object files, ideally we would need to complete linking of them
and fix up all relocations to know what the DataRVA field would end up
being. In practice, the only thing that makes sense for an RVA field
is an ADDR32NB relocation. Thus, find a relocation pointing at this
field, verify that it has the expected type, locate the symbol it
points at, look up the section the symbol points at, and read from the
right offset in that section.
This works both for GNU windres object files (which use one single
.rsrc section, with all relocations against the base of the .rsrc
section, with the original value of the DataRVA field being the
offset of the data from the beginning of the .rsrc section) and
cvtres object files (with two separate .rsrc$01 and .rsrc$02 sections,
and one symbol per data entry, with the original pre-relocated DataRVA
field being set to zero).
Differential Revision: https://reviews.llvm.org/D66820
llvm-svn: 370433
Petar Avramovic [Fri, 30 Aug 2019 05:51:12 +0000 (05:51 +0000)]
[MIPS GlobalISel] Lower uitofp
Add custom lowering for G_UITOFP for MIPS32.
Differential Revision: https://reviews.llvm.org/D66930
llvm-svn: 370432
Petar Avramovic [Fri, 30 Aug 2019 05:44:02 +0000 (05:44 +0000)]
[MIPS GlobalISel] Lower fptoui
Add lower for G_FPTOUI. Algorithm is similar to the SDAG version
in TargetLowering::expandFP_TO_UINT.
Lower G_FPTOUI for MIPS32.
Differential Revision: https://reviews.llvm.org/D66929
llvm-svn: 370431
Dan Gohman [Fri, 30 Aug 2019 04:33:22 +0000 (04:33 +0000)]
[CodeGen] Fix lowering for returning the result of an extractvalue
When the number of return values exceeds the number of registers available,
SelectionDAGBuilder::visitRet transforms a function's return to use a
pointer to a buffer to hold return values. When the returned value is an
operator such as extractvalue, the value may have a non-zero result number.
Add that number to the indexing when obtaining the values to store.
This fixes https://bugs.llvm.org/show_bug.cgi?id=43132.
Differential Revision: https://reviews.llvm.org/D66978
llvm-svn: 370430
Nathan Ridge [Fri, 30 Aug 2019 03:37:24 +0000 (03:37 +0000)]
[clangd] Add distinct highlightings for static fields and methods
Reviewers: hokein, ilya-biryukov, jvikstrom
Reviewed By: hokein
Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66828
llvm-svn: 370429
Jinsong Ji [Fri, 30 Aug 2019 03:16:41 +0000 (03:16 +0000)]
[PowerPC][NFC] Use inline Subtarget->isPPC64()
To be consistent with all the other instances.
llvm-svn: 370428
Jinsong Ji [Fri, 30 Aug 2019 02:57:33 +0000 (02:57 +0000)]
[PowerPC][NFC] Use -mtriple in RUN line, remove target triple in tls.ll
To avoid confusion, especially when -mtriple are also added for PPC32.
llvm-svn: 370427
Fangrui Song [Fri, 30 Aug 2019 02:20:49 +0000 (02:20 +0000)]
[PPC32] Emit R_PPC_GOT_TPREL16 instead R_PPC_GOT_TPREL16_LO
Unlike ppc64, which has ADDISgotTprelHA+LDgotTprelL pairs,
ppc32 just uses LDgotTprelL32, so it does not make lots of sense to use
_LO without a paired _HA.
Emit R_PPC_GOT_TPREL16 instead R_PPC_GOT_TPREL16_LO to match GCC, and
get better linker relocation check. Note, R_PPC_GOT_TPREL16_{HA,LO}
don't have good linker support:
(a) lld does not support R_PPC_GOT_TPREL16_{HA,LO}.
(b) Top of tree ld.bfd does not support R_PPC_GOT_REL16_HA Initial-Exec -> Local-Exec relaxation:
// a.o
addis 3, 3, tsd_tls@got@tprel@ha
lwz 3, tsd_tls@got@tprel@l(3)
add 3, 3, tsd_tls@tls
// b.o
.section .tdata,"awT"; .globl tsd_tls; tsd_tls:
// ld/ld-new a.o b.o
internal error, aborting at ../../bfd/elf32-ppc.c:7952 in ppc_elf_relocate_section
Reviewed By: adalava
Differential Revision: https://reviews.llvm.org/D66925
llvm-svn: 370426
Alex Lorenz [Fri, 30 Aug 2019 01:25:57 +0000 (01:25 +0000)]
[clang-scan-deps] NFC, refactor the DependencyScanningWorker to use a consumer
to report the dependencies to the client
This will allow the scanner to report modular dependencies to the consumer.
This will also allow the scanner to accept regular cc1 clang invocations, e.g.
in an implementation of a libclang C API for clang-scan-deps, that I will add
follow-up patches for in the future.
llvm-svn: 370425
Craig Topper [Fri, 30 Aug 2019 00:54:36 +0000 (00:54 +0000)]
[X86] Explicitly list all the always trivially rematerializable instructions.
Add a default with an llvm_unreachable for anything we don't expect.
This seems safer that just blindly returning true for anything
missing from the switch.
llvm-svn: 370424
Saleem Abdulrasool [Fri, 30 Aug 2019 00:16:02 +0000 (00:16 +0000)]
DebugInfo: add CodeView register mapping for ARM NT
Add the core registers and NEON registers mapping to the CodeView
register ID. This is sufficient to compile a basic C program with debug
info using CodeView debug info.
llvm-svn: 370423
Bruno Cardoso Lopes [Thu, 29 Aug 2019 23:14:08 +0000 (23:14 +0000)]
[Modules] Make ReadModuleMapFileBlock errors reliable
This prevents a crash when an error should be emitted instead.
During implicit module builds, there are cases where ReadASTCore is called with
ImportedBy set to nullptr, which breaks expectations in ReadModuleMapFileBlock,
leading to crashes.
Fix this by improving ReadModuleMapFileBlock to handle ImportedBy correctly.
This only happens non deterministically in the wild, when the underlying file
system changes while concurrent compiler invocations use implicit modules,
forcing rebuilds which see an inconsistent filesystem state. That said, there's
no much to do w.r.t. writing tests here.
rdar://problem/
48828801
llvm-svn: 370422
Petr Hosek [Thu, 29 Aug 2019 23:12:06 +0000 (23:12 +0000)]
[CMake][Fuchsia] Enable experimental pass manager by default
We plan on using experimental new pass manager for Fuchsia toolchain.
Differential Revision: https://reviews.llvm.org/D58214
llvm-svn: 370421
Alex Lorenz [Thu, 29 Aug 2019 22:56:38 +0000 (22:56 +0000)]
[clang-scan-deps] reuse the file manager across invocations of
the dependency scanner on a single worker thread
This behavior can be controlled using the new `-reuse-filemanager` clang-scan-deps
option. By default the file manager is reused.
The added test/ClangScanDeps/symlink.cpp is able to pass with
the reused filemanager after the related FileEntryRef changes
landed earlier. The test test/ClangScanDeps/subframework_header_dir_symlink.m
still fails when the file manager is reused (I run the FileCheck with not to
make it PASS). I will address this in a follow-up patch that improves
the DirectoryEntry name modelling in the FileManager.
llvm-svn: 370420
Richard Smith [Thu, 29 Aug 2019 22:49:34 +0000 (22:49 +0000)]
Fix silent wrong-code bugs and crashes with designated initialization.
We failed to correctly handle the 'holes' left behind by designated
initializers in VerifyOnly mode. This would result in us thinking that a
designated initialization would be valid, only to find that it is not
actually valid when we come to build it. In a +Asserts build, that would
assert, and in a -Asserts build, that would silently lose some part of
the initialization or crash.
With this change, when an InitListExpr contains any designators, we now
always build a structured list so that we can track the locations of the
'holes' that we need to go back and fill in.
We could in principle do better: we only need the structured form if
there is a designator that jumps backwards (and can otherwise check for
the holes as we progress through the initializer list), but dealing with
that turns out to be rather complicated, so it's not done as part of
this patch.
llvm-svn: 370419
Richard Smith [Thu, 29 Aug 2019 22:49:33 +0000 (22:49 +0000)]
Refactor InitListChecker to check only a single (explicit) initializer
list, rather than recursively checking multiple lists in C.
This simplification is in preparation for making InitListChecker
maintain more state that's specific to the explicit initializer list,
particularly when handling designated initialization.
llvm-svn: 370418
Richard Smith [Thu, 29 Aug 2019 22:49:32 +0000 (22:49 +0000)]
Refactor InitListChecker to make it a bit clearer that hasError is only
set to true in VerifyOnly mode in cases where it's also set to true when
actually building the initializer list.
Add FIXMEs for the two cases where that's not true. No functionality
change intended.
llvm-svn: 370417
Dan Gohman [Thu, 29 Aug 2019 22:41:05 +0000 (22:41 +0000)]
[WebAssembly] Implement NO_STRIP
This patch implements support for the NO_STRIP flag, which will allow
__attribute__((used)) to be implemented.
This accompanies https://reviews.llvm.org/D62542, which moves to setting the
NO_STRIP flag, and will continue to set EXPORTED for Emscripten targets for
compatibility.
Differential Revision: https://reviews.llvm.org/D66968
llvm-svn: 370416
Dan Gohman [Thu, 29 Aug 2019 22:40:00 +0000 (22:40 +0000)]
[WebAssembly] Make __attribute__((used)) not imply export.
Add an WASM_SYMBOL_NO_STRIP flag, so that __attribute__((used)) doesn't
need to imply exporting. When targeting Emscripten, have
WASM_SYMBOL_NO_STRIP imply exporting.
Differential Revision: https://reviews.llvm.org/D62542
llvm-svn: 370415
Philip Reames [Thu, 29 Aug 2019 22:08:17 +0000 (22:08 +0000)]
[Tests] Precommit a few cases where we're missing oppurtunities for block local simplications off assumes.
llvm-svn: 370414
Jonas Devlieghere [Thu, 29 Aug 2019 22:02:28 +0000 (22:02 +0000)]
[lit] Print exit code in for unresolved (lldb)tests.
A test is marked unresolved when we're unable to find PASSED or FAILED
in the dotest output. Usually this is because we crashed and when that
happens the exit code can give a clue as to why. This patch adds the
exit code to the lit output to make it easier to investigate those
issues.
Differential revision: https://reviews.llvm.org/D66975
llvm-svn: 370413
Nandor Licker [Thu, 29 Aug 2019 21:57:47 +0000 (21:57 +0000)]
[NFC] Test commit - sorted headers.
llvm-svn: 370412
Jinsong Ji [Thu, 29 Aug 2019 21:53:59 +0000 (21:53 +0000)]
[PowerPC] Support extended mnemonics mffprwz etc.
Summary:
Reported in https://github.com/opencv/opencv/issues/15413.
We have serveral extended mnemonics for Move To/From Vector-Scalar Register Instructions
eg: mffprd,mtfprd etc.
We only support one of them, this patch add the others.
Reviewers: nemanjai, steven.zhang, hfinkel, #powerpc
Reviewed By: hfinkel
Subscribers: wuzish, qcolombet, hiraditya, kbarton, MaskRay, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66963
llvm-svn: 370411
Jessica Paquette [Thu, 29 Aug 2019 21:53:58 +0000 (21:53 +0000)]
[AArch64][GlobalISel] Select arithmetic extended register patterns
This teaches GISel to select patterns which fold an extend plus optional shift
into the addressing mode. In particular, adds and subs.
Factor out the arith extended register ComplexPatterns in AArch64InstrFormats.td
and create GISel equivalents.
Add some equivalent functions to the ones in AArch64ISelDAGToDAG:
- `selectArithExtendedRegister`
- `narrowExtendRegIfNeeded`
- `getExtendTypeForInst`
`getExtendTypeForInst` includes the checks for loads and stores. This will be
used for WRO addressing modes in loads + stores.
Teach selectCopy to properly handle subregister copies on the same bank in
order to support `narrowExtendRegIfNeeded`. The extended register must be a
GPR32, so we need to support same-bank subregister copies.
Fix a bug in getSubRegForClass which would cause registers on things like
GPR32common to end up getting ssub. Just change the check to look for FPR32
rather than GPR32.
For tests:
- Add select-arith-extended-reg.mir
- Update addsub_ext.ll to include GlobalISel checks
Differential Revision: https://reviews.llvm.org/D66835
llvm-svn: 370410
Reid Kleckner [Thu, 29 Aug 2019 21:24:41 +0000 (21:24 +0000)]
[X86] Don't emit unreachable stack adjustments
Summary:
This is a minor improvement on our past attempts to do this. Fixes
PR43155.
Reviewers: hans
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66905
llvm-svn: 370409
Reid Kleckner [Thu, 29 Aug 2019 21:15:02 +0000 (21:15 +0000)]
Allow '@' to appear in x86 mingw symbols
Summary:
There is no reason to differ in assembler behavior here between -msvc
and -gnu targets. Without this setting, the text after the '@' is
interpreted as a symbol variable, like foo@IMGREL.
Reviewers: mstorsjo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66974
llvm-svn: 370408
Sanjay Patel [Thu, 29 Aug 2019 20:57:50 +0000 (20:57 +0000)]
[InstCombine] add possible bswap as widening shuffle test; NFC
Goes with the proposal in D66965.
llvm-svn: 370407
Artem Dergachev [Thu, 29 Aug 2019 20:37:28 +0000 (20:37 +0000)]
[CFG] Fix CFG for statement-expressions in return values.
We're building the CFG from bottom to top, so when the return-value expression
has a non-trivial CFG on its own, we need to continue building from the entry
to the return-value expression CFG rather than from the block to which
we've just appended the return statement.
Fixes a false positive warning "control may reach end of non-void function".
llvm-svn: 370406
Reid Kleckner [Thu, 29 Aug 2019 20:32:53 +0000 (20:32 +0000)]
Fix the build for MSVC builds using M_PI
llvm-svn: 370405
Simon Pilgrim [Thu, 29 Aug 2019 20:22:08 +0000 (20:22 +0000)]
[X86][SSE] combinePMULDQ - pmuldq(x, 0) -> zero vector (PR43159)
ISD::isBuildVectorAllZeros permits undef elements to be present, which means we can't return it as a zero vector. PMULDQ/PMULUDQ is an extending multiply so a multiply by zero of the lower 32-bits should result in a zero 64-bit element.
llvm-svn: 370404
Julian Lettner [Thu, 29 Aug 2019 20:20:05 +0000 (20:20 +0000)]
[ASan] Version mismatch check follow-up
Follow-up for:
[ASan] Make insertion of version mismatch guard configurable
3ae9b9d5e40d1d9bdea1fd8e6fca322df920754a
This tiny change makes sure that this test passes on our internal bots
as well.
llvm-svn: 370403
Matt Arsenault [Thu, 29 Aug 2019 20:06:48 +0000 (20:06 +0000)]
AMDGPU/GlobalISel: Legalize sin/cos
llvm-svn: 370402
Aaron Ballman [Thu, 29 Aug 2019 20:00:40 +0000 (20:00 +0000)]
Avoid crash when dumping NULL Type as JSON.
Patch by Bert Belder.
llvm-svn: 370401
Volodymyr Sapsai [Thu, 29 Aug 2019 19:51:25 +0000 (19:51 +0000)]
Remove `FileManager::invalidateCache` as it has no callers anymore. NFC.
llvm-svn: 370400
Sanjay Patel [Thu, 29 Aug 2019 19:36:18 +0000 (19:36 +0000)]
[InstCombine] reduce duplicated code; NFC
llvm-svn: 370399
Jordan Rupprecht [Thu, 29 Aug 2019 19:03:58 +0000 (19:03 +0000)]
Revert [MBP] Disable aggressive loop rotate in plain mode
This reverts r369664 (git commit
51f48295cbe8fa3a44db263b528dd9f7bae7bf9a)
It causes many benchmark regressions, internally and in llvm's benchmark suite.
llvm-svn: 370398
Alina Sbirlea [Thu, 29 Aug 2019 19:01:23 +0000 (19:01 +0000)]
Revert enabling MemorySSA.
Breaks sanitizers bots.
Differential Revision: https://reviews.llvm.org/D58311
llvm-svn: 370397
Michael Kruse [Thu, 29 Aug 2019 18:55:55 +0000 (18:55 +0000)]
[DependenceInfo] Compute WAR dependence info using ISL kills. NFC.
When reading code of Dependences::calculateDependences, I noticed that
WAR is computed specifically by buildWAR. Given ISL now
supports "kills" in approximate dataflow analysis, this patch takes
advantage of it.
This patch also cleans up a couple lines redundant codes.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D66741
llvm-svn: 370396
Raphael Isemann [Thu, 29 Aug 2019 18:53:20 +0000 (18:53 +0000)]
[lldb][NFC] Document options parameter in ClangUserExpression constructor
Somehow this option was only documented in the swift branch.
llvm-svn: 370395
Jonas Devlieghere [Thu, 29 Aug 2019 18:37:05 +0000 (18:37 +0000)]
[test] Fix various module cache bugs and inconsistencies
Currently, lit tests don't set neither the module cache for building
inferiors nor the module cache used by lldb when running tests.
Furthermore, we have several places where we rely on the path to the
module cache being always the same, rather than passing the correct
value around. This makes it hard to specify a different module cache
path when debugging a a test.
This patch reworks how we determine and pass around the module cache
paths and fixes the omission on the lit side. It also adds a sanity
check to the lit and dotest suites.
Differential revision: https://reviews.llvm.org/D66966
llvm-svn: 370394
Craig Topper [Thu, 29 Aug 2019 18:09:02 +0000 (18:09 +0000)]
[X86] Remove what little support we had for MPX
-Deprecate -mmpx and -mno-mpx command line options
-Remove CPUID detection of mpx for -march=native
-Remove MPX from all CPUs
-Remove MPX preprocessor define
I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything.
gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX
Differential Revision: https://reviews.llvm.org/D66669
llvm-svn: 370393
Matt Arsenault [Thu, 29 Aug 2019 17:55:05 +0000 (17:55 +0000)]
GlobalISel: Don't compute known bits for non-integral GEP
llvm-svn: 370392
Florian Hahn [Thu, 29 Aug 2019 17:47:58 +0000 (17:47 +0000)]
[LoopUnrollAndJam] Use Lazy strategy for DTU.
We can also apply the earlier updates to the lazy DTU, instead of
applying them directly.
Reviewers: kuhar, brzycki, asbirlea, SjoerdMeijer
Reviewed By: brzycki, asbirlea, SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D66918
llvm-svn: 370391
Matthew G McGovern [Thu, 29 Aug 2019 17:47:43 +0000 (17:47 +0000)]
[cmake] enable x86 libfuzzer on Windows
- recent commit https://reviews.llvm.org/D66433 enabled libfuzzer
to build on windows, this just enables the option to build as part
of the the regular build.
llvm-svn: 370390
Matt Arsenault [Thu, 29 Aug 2019 17:24:36 +0000 (17:24 +0000)]
GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits
I dropped the DemandedElts since it seems to be missing from some of
the new interfaces, but not others.
llvm-svn: 370389
Matt Arsenault [Thu, 29 Aug 2019 17:24:32 +0000 (17:24 +0000)]
GlobalISel: Add known bits to InstructionSelector
AMDGPU uses this for some addressing mode selection patterns. The
analysis run itself doesn't do anything so it seems easier to just
always require this than adding a way to opt in.
llvm-svn: 370388
Jonas Devlieghere [Thu, 29 Aug 2019 17:19:00 +0000 (17:19 +0000)]
[dotest] Remove deprecated loggin through env variables.
It used to be possible to enable logging through environment variables
read by dotest. This approach is deprecated, as stated in the dotest
help output. Instead --channel should be used.
Differential revision: https://reviews.llvm.org/D66920
llvm-svn: 370387
Jonas Devlieghere [Thu, 29 Aug 2019 17:18:57 +0000 (17:18 +0000)]
[dotest] Remove the curses result formatter.
This removes the curses result formatter which appears to be broken.
Passing --curses to dotest.py screws up my terminal and doesn't run any
tests. It even crashes Python on occasion.
Differential revision: https://reviews.llvm.org/D66917
llvm-svn: 370386
Davide Italiano [Thu, 29 Aug 2019 17:14:25 +0000 (17:14 +0000)]
Revert "[TSanRuntime] Upstream thread swift race detector."
Sometimes it's easier to resolve merge conflict than arguing.
llvm-svn: 370385
Alina Sbirlea [Thu, 29 Aug 2019 17:08:13 +0000 (17:08 +0000)]
[MemorySSA & LoopPassManager] Enable MemorySSA as loop dependency. Update tests.
Summary:
I'm not planning to check this in at the moment, but feedback is very welcome, in particular how this affects performance.
The feedback obtains here will guide the next steps towards enabling this.
This patch enables the use of MemorySSA in the loop pass manager.
Passes that currently use MemorySSA:
- EarlyCSE
Passes that use MemorySSA after this patch:
- EarlyCSE
- LICM
- SimpleLoopUnswitch
Loop passes that update MemorySSA (and do not use it yet, but could use it after this patch):
- LoopInstSimplify
- LoopSimplifyCFG
- LoopUnswitch
- LoopRotate
- LoopSimplify
- LCSSA
Loop passes that do *not* update MemorySSA:
- IndVarSimplify
- LoopDelete
- LoopIdiom
- LoopSink
- LoopUnroll
- LoopInterchange
- LoopUnrollAndJam
- LoopVectorize
- LoopReroll
- IRCE
Reviewers: chandlerc, george.burgess.iv, davide, sanjoy, gberry
Subscribers: jlebar, Prazek, dmgreen, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58311
llvm-svn: 370384
Dmitri Gribenko [Thu, 29 Aug 2019 16:58:13 +0000 (16:58 +0000)]
Added 'inline' to functions defined in headers to avoid ODR violations
llvm-svn: 370383
Jessica Paquette [Thu, 29 Aug 2019 16:55:55 +0000 (16:55 +0000)]
[GlobalISel][AArch64] Select llvm.aarch64.stxr* intrinsics.
Add a GISelPredicateCode to the stxr_* PatFrags in AArch64InstrAtomics.td.
This allows us to select these intrinsics.
Differential Revision: https://reviews.llvm.org/D65779
llvm-svn: 370382
Sanjay Patel [Thu, 29 Aug 2019 16:48:00 +0000 (16:48 +0000)]
[InstCombine] add tests for bswap disguised as shuffle; NFC
Somewhat motivating case In PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146
But that's a lot more complicated.
llvm-svn: 370381
Jessica Paquette [Thu, 29 Aug 2019 16:45:19 +0000 (16:45 +0000)]
[GlobalISel][AArch64] Use a GISelPredicateCode to select llvm.aarch64.stlxr.*
Remove manual selection code for this intrinsic and use a GISelPredicateCode
instead.
This allows us to fully select this intrinsic without any tricky custom C++
matching.
Differential Revision: https://reviews.llvm.org/D65780
llvm-svn: 370380
Dmitri Gribenko [Thu, 29 Aug 2019 16:38:36 +0000 (16:38 +0000)]
Changed FrontendActionFactory::create to return a std::unique_ptr
Subscribers: jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66947
llvm-svn: 370379
Jessica Paquette [Thu, 29 Aug 2019 16:33:01 +0000 (16:33 +0000)]
[AArch64][GlobalISel] Select @llvm.aarch64.ldxr.* intrinsics
Same thing as D66897, but for ldxr.* instead. Add a GISelPredicateCode to the
ldxr_* definitions, which allows us to import them.
Add select-ldxr-intrin.mir, and update arm64-ldxr-stxr.ll.
Differential Revision: https://reviews.llvm.org/D66898
llvm-svn: 370378
Jessica Paquette [Thu, 29 Aug 2019 16:16:38 +0000 (16:16 +0000)]
[AArch64][GlobalISel] Select @llvm.aarch64.ldaxr.* intrinsics
Add a GISelPredicateCode to ldaxr_*. This allows us to import the patterns for
@llvm.aarch64.ldaxr.*, and thus select them.
Add `isLoadStoreOfNumBytes` for the GISelPredicateCode, since each of these
intrinsics involves the same check.
Add select-ldaxr-intrin.mir, and update arm64-ldxr-stxr.ll.
Differential Revision: https://reviews.llvm.org/D66897
llvm-svn: 370377
Michael Liao [Thu, 29 Aug 2019 16:12:05 +0000 (16:12 +0000)]
[SimplifyCFG] Skip sinking common lifetime markers of `alloca`.
Summary:
- Similar to the workaround in fix of PR30188, skip sinking common
lifetime markers of `alloca`. They are mostly left there after
inlining functions in branches.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66950
llvm-svn: 370376
Jinsong Ji [Thu, 29 Aug 2019 15:38:02 +0000 (15:38 +0000)]
[PowerPC][NFC] Update fp-int-conversions-direct-moves.ll using script
Also add -ppc-asm-full-reg-names,-ppc-vsr-nums-as-vr.
llvm-svn: 370375
Pavel Labath [Thu, 29 Aug 2019 15:30:52 +0000 (15:30 +0000)]
Fix GetDIEForDeclContext so it only returns entries matching the provided context
Currently, we return all the entries such that their decl_ctx pointer >= decl_ctx provided.
Instead, we should return only the ones that decl_ctx pointer == decl_ctx provided.
Differential Revision: https://reviews.llvm.org/D66357
Patch by Guilherme Andrade <guiandrade@google.com>.
llvm-svn: 370374
Pavel Labath [Thu, 29 Aug 2019 15:21:26 +0000 (15:21 +0000)]
Remove DWARFExpression::LocationListSize
Summary:
The only reason for this function's existance is so that we could pass
the correct size into the DWARFExpression constructor. However, there is
no harm in passing the entire data extractor into the DWARFExpression,
since the same code is performing the size determination as well as the
subsequent parse. So, if we get malformed input or there's a bug in the
parser, we'd compute the wrong size anyway.
Additionally, reducing the number of entry points into the location list
parsing machinery makes it easier to switch the llvm debug_loc(lists)
parsers.
While inside, I added a couple of tests for invalid location list
handling.
Reviewers: JDevlieghere, clayborg
Subscribers: aprantl, javed.absar, kristof.beyls, lldb-commits
Differential Revision: https://reviews.llvm.org/D66789
llvm-svn: 370373
Shaurya Gupta [Thu, 29 Aug 2019 15:11:59 +0000 (15:11 +0000)]
[Clangd] NFC: Added fixme for checking for local/anonymous types for extracted parameters
llvm-svn: 370372
Haojian Wu [Thu, 29 Aug 2019 14:57:32 +0000 (14:57 +0000)]
[clangd] Update out-of-date links in readme, NFC.
llvm-svn: 370371
Roman Lebedev [Thu, 29 Aug 2019 14:46:49 +0000 (14:46 +0000)]
[NFC][SimplifyCFG] 'Safely extract low bits' pattern will also benefit from -phi-node-folding-threshold=3
This is the naive implementation of x86 BZHI/BEXTR instruction:
it takes input and bit count, and extracts low nbits up to bit width.
I.e. unlike shift it does not have any UB when nbits >= bitwidth.
Which means we don't need a while PHI here, simple select will do.
And if it's a select, it should then be trivial to fix codegen
to select it to BEXTR/BZHI.
See https://bugs.llvm.org/show_bug.cgi?id=34704
llvm-svn: 370369
Michael Kruse [Thu, 29 Aug 2019 14:42:41 +0000 (14:42 +0000)]
[ScopBuilder] Remove superfluous while loop in buildDomains. NFC.
The while loop iterating parent loop in ScopBuilder::buildDomains is
unnecessary because either L or LD are later unused, this is a simple
patch removing it.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D66698
llvm-svn: 370368
Kadir Cetinkaya [Thu, 29 Aug 2019 14:38:02 +0000 (14:38 +0000)]
[clangd][NFC] Update background-index command line description
Summary:
We didn't change this in D64019 just in case we revert it back.
Deleting it now.
Reviewers: hokein, sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66943
llvm-svn: 370367
Simon Pilgrim [Thu, 29 Aug 2019 14:34:07 +0000 (14:34 +0000)]
[DAGCombine] Fix shadow variable warnings. NFCI.
llvm-svn: 370365
Pavel Labath [Thu, 29 Aug 2019 14:26:05 +0000 (14:26 +0000)]
DWARFDebugLoc: Make parsing and error reporting more robust
Summary:
While examining this class for possible use in lldb, I noticed two
things:
- it spits out parsing errors directly to stderr
- the loclists parser can incorrectly return valid location lists when
parsing malformed (truncated) data
I improve the stderr situation by making the parseOneLocationList
functions return Expected<T>s. The errors are still dumped to stderr by
their callers, so this is only a partial fix, but it is enough for my
use case, as I intend to parse the locations lists one by one.
I fix the behavior in the truncated scenario by using the newly
introduced DataExtractor Cursor API.
I also add tests for handling the error cases, as they currently have no
coverage.
Reviewers: dblaikie, JDevlieghere, probinson
Subscribers: lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63591
llvm-svn: 370363
Dmitri Gribenko [Thu, 29 Aug 2019 14:21:05 +0000 (14:21 +0000)]
Removed a function declaration that doesn't have a definition
llvm-svn: 370361
Luis Marques [Thu, 29 Aug 2019 14:05:59 +0000 (14:05 +0000)]
[RISCV] Fix callee-saved-gprs.ll test ABIs
llvm-svn: 370359
Joerg Sonnenberger [Thu, 29 Aug 2019 13:22:30 +0000 (13:22 +0000)]
Allow replaceAndRecursivelySimplify to list unsimplified visitees.
This is part of D65280 and split it to avoid ABI changes on the 9.0
release branch.
llvm-svn: 370355
Simon Atanasyan [Thu, 29 Aug 2019 13:19:50 +0000 (13:19 +0000)]
[mips] Inline emitStoreWithSymOffset and emitLoadWithSymOffset methods. NFC
Both methods `MipsTargetStreamer::emitStoreWithSymOffset` and
`MipsTargetStreamer::emitLoadWithSymOffset` are almost the same and
differ argument names only. These methods are used in the single place
so it's better to inline their code and remove original methods.
llvm-svn: 370354
Simon Atanasyan [Thu, 29 Aug 2019 13:19:38 +0000 (13:19 +0000)]
[mips] Fix expanding `lw/sw $reg1, symbol($reg2)` instruction
When a "base" in the `lw/sw $reg1, symbol($reg2)` instruction is
a register and generated code is position independent, backend
does not add the "base" value to the symbol address.
```
lw $reg1, %got(symbol)($gp)
lw/sw $reg1, 0($reg1)
```
This patch fixes the bug and adds the missed `addu` instruction by
passing `BaseReg` into the `loadAndAddSymbolAddress` routine and handles
the case when the `BaseReg` is the zero register to escape redundant
`move reg, reg` instruction:
```
lw $reg1, %got(symbol)($gp)
addu $reg1, $reg1, $reg2
lw/sw $reg1, 0($reg1)
```
Differential Revision: https://reviews.llvm.org/D66894
llvm-svn: 370353
Roman Lebedev [Thu, 29 Aug 2019 12:48:04 +0000 (12:48 +0000)]
[InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` inverted overflow bit
Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow or zero
%iszero = icmp eq i4 %y, 0
%umul = smul_overflow i4 %x, %y
%umul.ov = extractvalue {i4, i1} %umul, 1
%umul.ov.not = xor %umul.ov, -1
%retval.0 = or i1 %iszero, %umul.ov.not
ret i1 %retval.0
=>
%iszero = icmp eq i4 %y, 0
%umul = smul_overflow i4 %x, %y
%umul.ov = extractvalue {i4, i1} %umul, 1
%umul.ov.not = xor %umul.ov, -1
%retval.0 = or i1 %iszero, %umul.ov.not
ret i1 %umul.ov.not
Done: 1
Optimization is correct!
```
Note that this is inverted from what we have in a previous patch,
here we are looking for the inverted overflow bit.
And that inversion is kinda problematic - given this particular
pattern we neither hoist that `not` closer to `ret` (then the pattern
would have been identical to the one without inversion,
and would have been handled by the previous patch), neither
do the opposite transform. But regardless, we should handle this too.
I've filled [[ https://bugs.llvm.org/show_bug.cgi?id=42720 | PR42720 ]].
Reviewers: nikic, spatel, xbolva00, RKSimon
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65151
llvm-svn: 370351
Roman Lebedev [Thu, 29 Aug 2019 12:47:50 +0000 (12:47 +0000)]
[InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` overflow bit
Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow and not zero
%iszero = icmp ne i4 %y, 0
%umul = umul_overflow i4 %x, %y
%umul.ov = extractvalue {i4, i1} %umul, 1
%retval.0 = and i1 %iszero, %umul.ov
ret i1 %retval.0
=>
%iszero = icmp ne i4 %y, 0
%umul = umul_overflow i4 %x, %y
%umul.ov = extractvalue {i4, i1} %umul, 1
%retval.0 = and i1 %iszero, %umul.ov
ret %umul.ov
Done: 1
Optimization is correct!
```
Reviewers: nikic, spatel, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65150
llvm-svn: 370350
Roman Lebedev [Thu, 29 Aug 2019 12:47:34 +0000 (12:47 +0000)]
[SimplifyCFG] FoldTwoEntryPHINode(): don't bailout on i1 PHI's if we can hoist a 'not' from incoming values
Summary:
As it can be seen in the tests in D65143/D65144, even though we have formed an '@llvm.umul.with.overflow'
and got rid of potential for division-by-zero, the control flow remains, we still have that branch.
We have this condition:
```
// Don't fold i1 branches on PHIs which contain binary operators
// These can often be turned into switches and other things.
if (PN->getType()->isIntegerTy(1) &&
(isa<BinaryOperator>(PN->getIncomingValue(0)) ||
isa<BinaryOperator>(PN->getIncomingValue(1)) ||
isa<BinaryOperator>(IfCond)))
return false;
```
which was added back in rL121764 to help with `select` formation i think?
That check prevents us to flatten the CFG here, even though we know
we no longer need that guard and will be able to drop everything
but the '@llvm.umul.with.overflow' + `not`.
As it can be seen from tests, we end here because the `not` is being
sinked into the PHI's incoming values by InstCombine,
so we can't workaround this by hoisting it to after PHI.
Thus i suggest that we relax that check to not bailout if we'd get to hoist the `not`.
Reviewers: craig.topper, spatel, fhahn, nikic
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65147
llvm-svn: 370349
Roman Lebedev [Thu, 29 Aug 2019 12:47:20 +0000 (12:47 +0000)]
[InstCombine] Fold '((%x * %y) u/ %x) != %y' to '@llvm.umul.with.overflow' + overflow bit extraction
Summary:
`((%x * %y) u/ %x) != %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
$ /repositories/alive2/build-Clang-unknown/alive -root-only ~/llvm-patch1.ll
Processing /home/lebedevri/llvm-patch1.ll..
----------------------------------------
Name: no overflow
%o0 = mul i4 %y, %x
%o1 = udiv i4 %o0, %x
%r = icmp ne i4 %o1, %y
ret i1 %r
=>
%n0 = umul_overflow i4 %x, %y
%o0 = extractvalue {i4, i1} %n0, 0
%o1 = udiv %o0, %x
%r = extractvalue {i4, i1} %n0, 1
ret %r
Done: 1
Optimization is correct!
----------------------------------------
Name: no overflow
%o0 = mul i4 %y, %x
%o1 = udiv i4 %o0, %x
%r = icmp eq i4 %o1, %y
ret i1 %r
=>
%n0 = umul_overflow i4 %x, %y
%o0 = extractvalue {i4, i1} %n0, 0
%o1 = udiv %o0, %x
%n1 = extractvalue {i4, i1} %n0, 1
%r = xor %n1, -1
ret i1 %r
Done: 1
Optimization is correct!
```
Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65144
llvm-svn: 370348
Roman Lebedev [Thu, 29 Aug 2019 12:47:08 +0000 (12:47 +0000)]
[InstCombine] Fold '(-1 u/ %x) u< %y' to '@llvm.umul.with.overflow' + overflow bit extraction
Summary:
`(-1 u/ %x) u< %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
----------------------------------------
Name: no overflow
%o0 = udiv i4 -1, %x
%r = icmp ult i4 %o0, %y
=>
%o0 = udiv i4 -1, %x
%n0 = umul_overflow i4 %x, %y
%r = extractvalue {i4, i1} %n0, 1
Done: 1
Optimization is correct!
----------------------------------------
Name: no overflow, swapped
%o0 = udiv i4 -1, %x
%r = icmp ugt i4 %y, %o0
=>
%o0 = udiv i4 -1, %x
%n0 = umul_overflow i4 %x, %y
%r = extractvalue {i4, i1} %n0, 1
Done: 1
Optimization is correct!
----------------------------------------
Name: overflow
%o0 = udiv i4 -1, %x
%r = icmp uge i4 %o0, %y
=>
%o0 = udiv i4 -1, %x
%n0 = umul_overflow i4 %x, %y
%n1 = extractvalue {i4, i1} %n0, 1
%r = xor %n1, -1
Done: 1
Optimization is correct!
----------------------------------------
Name: overflow
%o0 = udiv i4 -1, %x
%r = icmp ule i4 %y, %o0
=>
%o0 = udiv i4 -1, %x
%n0 = umul_overflow i4 %x, %y
%n1 = extractvalue {i4, i1} %n0, 1
%r = xor %n1, -1
Done: 1
Optimization is correct!
```
As it can be observed from tests, while simply forming the `@llvm.umul.with.overflow`
is easy, if we were looking for the inverted answer, then more work needs to be done
to cleanup the now-pointless control-flow that was guarding against division-by-zero.
This is being addressed in follow-up patches.
Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon
Reviewed By: nikic, xbolva00
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65143
llvm-svn: 370347
Simon Pilgrim [Thu, 29 Aug 2019 12:41:19 +0000 (12:41 +0000)]
Fix variable ‘IsInitCapturePack’ set but not used warning. NFCI.
llvm-svn: 370345
Simon Pilgrim [Thu, 29 Aug 2019 12:37:02 +0000 (12:37 +0000)]
Fix MSVC "not all control paths return a value" warning. NFCI.
llvm-svn: 370343