Lang Hames [Fri, 28 Sep 2018 01:41:33 +0000 (01:41 +0000)]
[ORC] clang-format the ThreadSafeModule code.
Evidently I forgot to do this before committing r343055.
llvm-svn: 343288
Lang Hames [Fri, 28 Sep 2018 01:41:33 +0000 (01:41 +0000)]
[ORC] Add a const version of ThreadSafeModule::getModule().
llvm-svn: 343287
Lang Hames [Fri, 28 Sep 2018 01:41:29 +0000 (01:41 +0000)]
[ORC] Lock ThreadSafeContext during module destruction in ThreadSafeModule's
move constructor.
This is basically the same fix as r343261, but applied to the move constructor:
Failure to lock the context during module destruction can lead to data races if
other threads are operating on the context.
llvm-svn: 343286
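The locking pattern described in r343286 can be sketched in isolation. This is a minimal illustration, not the ORC implementation: `Context`, `Module`, and `SafeModule` are hypothetical stand-ins, and the sketch takes the lock on the move-assignment and destructor paths, assuming those are where an owned module gets destroyed.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-ins for illustration only; not the real ORC classes.
struct Context {
  std::mutex Mutex;
  int Destroyed = 0; // shared state that every module destructor touches
};

struct Module {
  explicit Module(Context &C) : Ctx(C) {}
  ~Module() { ++Ctx.Destroyed; } // unsafe unless the context is locked
  Context &Ctx;
};

class SafeModule {
public:
  SafeModule(std::shared_ptr<Context> C, std::unique_ptr<Module> M)
      : Ctx(std::move(C)), Mod(std::move(M)) {}

  SafeModule(SafeModule &&) = default;

  SafeModule &operator=(SafeModule &&Other) {
    if (this != &Other) {
      destroyModuleLocked(); // the old module dies under the context lock
      Ctx = std::move(Other.Ctx);
      Mod = std::move(Other.Mod);
    }
    return *this;
  }

  ~SafeModule() { destroyModuleLocked(); }

private:
  void destroyModuleLocked() {
    if (!Mod)
      return; // moved-from or empty: nothing to destroy
    std::lock_guard<std::mutex> Lock(Ctx->Mutex);
    Mod.reset();
  }

  std::shared_ptr<Context> Ctx; // declared first so it outlives Mod
  std::unique_ptr<Module> Mod;
};
```

The point of the sketch is only that the module is never destroyed outside the lock, including when an existing module is overwritten by a move.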
Richard Smith [Fri, 28 Sep 2018 01:16:43 +0000 (01:16 +0000)]
[cxx2a] P0641R2: (Some) type mismatches on defaulted functions only
render the function deleted instead of rendering the program ill-formed.
This change also adds an enabled-by-default warning for the case where
an explicitly-defaulted special member function of a non-template class
is implicitly deleted by the type checking rules. (This fires either due
to this language change or due to pre-C++20 reasons for the member being
implicitly deleted). I've tested this on a large codebase and found only
bugs (where the program means something that's clearly different from
what the programmer intended), so this is enabled by default, but we
should revisit this if there are problems with this being enabled by
default.
llvm-svn: 343285
Craig Topper [Fri, 28 Sep 2018 01:06:13 +0000 (01:06 +0000)]
[ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.
llvm-svn: 343284
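The rule stated in r343284 can be written as a one-line helper. This is a sketch under assumptions: `scalarStoreAlign` is a hypothetical name, both arguments are byte counts, and both are powers of two. The intuition is that element i of the vector lives at offset i * size, so only element 0 inherits the full original alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: alignment that is safe to claim for each scalar store
// produced when a masked vector store is expanded element by element.
inline uint32_t scalarStoreAlign(uint32_t OrigAlign, uint32_t ScalarSizeBytes) {
  assert(OrigAlign != 0 && ScalarSizeBytes != 0 && "expects non-zero byte counts");
  return std::min(OrigAlign, ScalarSizeBytes);
}
```

For example, a 16-byte-aligned store of four 4-byte elements yields scalar stores that can only be assumed 4-byte aligned.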
Craig Topper [Fri, 28 Sep 2018 01:06:09 +0000 (01:06 +0000)]
[ScalarizeMaskedMemIntrin] Add test cases for masked store expansion. Increase alignment of one of the masked load test cases.
The masked store alignment is being miscalculated, but masked load is correct.
llvm-svn: 343283
Fangrui Song [Thu, 27 Sep 2018 23:59:57 +0000 (23:59 +0000)]
[XRay] Fix argv0-log-file-name.cc race when tests are executed in parallel
`rm xray-log.*` may delete log files of other tests and cause them to
fail when tests are executed in parallel.
llvm-svn: 343282
Craig Topper [Thu, 27 Sep 2018 23:25:10 +0000 (23:25 +0000)]
[X86] Add the test case from PR38986.
The assembly for this test should be optimal now after changes to the ScalarizeMaskedMemIntrin patch.
llvm-svn: 343281
Dean Michael Berris [Thu, 27 Sep 2018 23:15:05 +0000 (23:15 +0000)]
[XRay] Add LD_LIBRARY_PATH to env variables for Unit Tests
Summary:
This change allows us to use the library path from which the LLVM
libraries are installed, in case the LLVM installation generates shared
libraries.
This should address llvm.org/PR39070.
Reviewers: mboerger, eizan
Subscribers: mgorny, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52597
llvm-svn: 343280
Richard Smith [Thu, 27 Sep 2018 22:47:04 +0000 (22:47 +0000)]
[cxx2a] P0624R2: Lambdas with no capture-default are
default-constructible and assignable.
llvm-svn: 343279
Craig Topper [Thu, 27 Sep 2018 22:31:42 +0000 (22:31 +0000)]
[ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.
It's possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case.
llvm-svn: 343278
Craig Topper [Thu, 27 Sep 2018 22:31:40 +0000 (22:31 +0000)]
[ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI
cast will take care of asserting internally.
llvm-svn: 343277
George Karpenkov [Thu, 27 Sep 2018 22:31:13 +0000 (22:31 +0000)]
[analyzer] Hotfix for the bug in exploded graph printing
llvm-svn: 343276
Derek Schuff [Thu, 27 Sep 2018 22:20:33 +0000 (22:20 +0000)]
WebAssembly: Rename GetSignature to GetLibcallSignature [NFC]
llvm-svn: 343275
Craig Topper [Thu, 27 Sep 2018 21:28:59 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.
llvm-svn: 343273
Craig Topper [Thu, 27 Sep 2018 21:28:55 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] Add some IR only test cases for masked gather expansion.
llvm-svn: 343272
Craig Topper [Thu, 27 Sep 2018 21:28:52 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.
llvm-svn: 343271
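A scalar model of the expansion order described in r343271, written as plain C++ rather than IR. This is only an illustration of the semantics: `maskedLoad` is a hypothetical function, and the pass itself emits conditional loads and insertelement instructions rather than a loop.

```cpp
#include <array>
#include <cstddef>

// Scalar model of a masked load: lanes whose mask bit is set read memory,
// all other lanes keep the passthru value.
template <typename T, std::size_t N>
std::array<T, N> maskedLoad(const T *Ptr, const std::array<bool, N> &Mask,
                            const std::array<T, N> &PassThru) {
  std::array<T, N> Result = PassThru; // start from passthru, not from undef
  for (std::size_t I = 0; I < N; ++I)
    if (Mask[I])
      Result[I] = Ptr[I]; // conditional load inserted over its own element
  return Result;
}
```

Starting from the passthru value means no final select is needed to merge the loaded lanes with the untouched ones.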
Craig Topper [Thu, 27 Sep 2018 21:28:46 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.
So instead just check for Constant and use getAggregateElement which will do the dirty work for us.
llvm-svn: 343270
Craig Topper [Thu, 27 Sep 2018 21:28:43 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] Add dedicated IR only tests for masked load expansion so I can begin making modifications.
llvm-svn: 343269
Craig Topper [Thu, 27 Sep 2018 21:28:41 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.
llvm-svn: 343268
Craig Topper [Thu, 27 Sep 2018 21:28:39 +0000 (21:28 +0000)]
[ScalarizeMaskedMemIntrin] Cleanup comments. NFC
llvm-svn: 343267
Lang Hames [Thu, 27 Sep 2018 21:13:07 +0000 (21:13 +0000)]
[ORC] Add definition for IRLayer::setCloneToNewContextOnEmit, use it to set the
flag to true in LLJIT when running in multithreaded mode.
The IRLayer::setCloneToNewContextOnEmit method sets a flag within the IRLayer
that causes modules added to that layer to be moved to a new context (by
serializing to/from a memory buffer) when they are emitted. This allows modules
that were all loaded on the same context to be compiled in parallel.
llvm-svn: 343266
Sam Clegg [Thu, 27 Sep 2018 21:06:25 +0000 (21:06 +0000)]
[WebAssembly] Add --[no]-export-dynamic to replace --export-default
In a very recent change I introduced a --no-export-default flag,
but after conferring with others it seems that this feature already
exists in GNU ld and lld in the form of the --export-dynamic flag,
which is off by default.
This change replaces export-default with export-dynamic and also
changes the default to match the traditional linker behaviour.
Now, by default, only the entry point is exported. If other symbols
are required by the embedder then --export-dynamic or --export can
be used to export all visibility hidden symbols or individual
symbols respectively.
This change touches a lot of tests that were relying on symbols
being exported by default. I imagine it will also affect many
users, but I do think the change is worth it to match the traditional
behaviour and flag names.
Differential Revision: https://reviews.llvm.org/D52587
llvm-svn: 343265
Konstantin Zhuravlyov [Thu, 27 Sep 2018 20:49:00 +0000 (20:49 +0000)]
AMDGPU: Split HasExt into HasExtDPP/SDWA/SDWA9
llvm-svn: 343264
Erik Pilkington [Thu, 27 Sep 2018 20:36:28 +0000 (20:36 +0000)]
NFC: Fix some darwin linker warnings introduced in r338385
The darwin linker was complaining that Toolchains/RISCV.cpp and
Toolchains/Arch/RISCV.cpp had the same name. The fix is to rename
Toolchains/RISCV.cpp to Toolchains/RISCVToolchain.cpp.
Differential revision: https://reviews.llvm.org/D52574
llvm-svn: 343263
Lang Hames [Thu, 27 Sep 2018 20:36:10 +0000 (20:36 +0000)]
[ORC] Make LocalIndirectStubsManager's operations thread-safe.
Locks stub management operations and switches to atomic update for stub
pointers.
llvm-svn: 343262
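The two ingredients named in r343262, a lock around stub management and atomic updates of the stub pointers, might look roughly like this in miniature. `StubTable` and its methods are hypothetical and far simpler than LocalIndirectStubsManager; the sketch assumes that readers hold on to a slot and read it without taking the lock.

```cpp
#include <atomic>
#include <map>
#include <mutex>
#include <string>

// Hypothetical miniature of a stub table; not the ORC class.
class StubTable {
public:
  // Management operation: create a stub or retarget an existing one.
  void setTarget(const std::string &Name, void *Target) {
    std::lock_guard<std::mutex> Lock(Mutex);
    Stubs[Name].store(Target, std::memory_order_release);
  }

  // Management operation: find the slot for a stub, or nullptr if unknown.
  // std::map nodes are address-stable, so the slot stays valid after unlock.
  std::atomic<void *> *slot(const std::string &Name) {
    std::lock_guard<std::mutex> Lock(Mutex);
    auto It = Stubs.find(Name);
    return It == Stubs.end() ? nullptr : &It->second;
  }

private:
  std::mutex Mutex;                                 // guards the map itself
  std::map<std::string, std::atomic<void *>> Stubs; // pointers update atomically
};
```

The mutex protects the table structure, while the atomic slot lets a caller observe a retargeted pointer without a torn read.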
Lang Hames [Thu, 27 Sep 2018 20:36:08 +0000 (20:36 +0000)]
[ORC] Lock ThreadSafeContext during Module destructing in ThreadSafeModule.
Failure to lock the context can lead to data races if other threads are
operating on other ThreadSafeModules that share the same context.
llvm-svn: 343261
Gheorghe-Teodor Bercea [Thu, 27 Sep 2018 20:29:00 +0000 (20:29 +0000)]
[OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing
Summary: Set default schedule for parallel for loops to schedule(static, 1) when using SPMD mode on the NVPTX device offloading toolchain to ensure coalescing.
Reviewers: ABataev, Hahnfeld, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D52629
llvm-svn: 343260
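Why schedule(static, 1) helps coalescing in r343260 can be seen from the iteration-to-thread mapping. The helper below is a hypothetical model of OpenMP's round-robin chunk assignment for static schedules, not code from the toolchain. With a chunk size of 1, threads 0..T-1 touch consecutive iterations at each step, which is the access pattern a GPU can coalesce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: iterations executed by thread Tid under
// schedule(static, Chunk), where chunks are dealt out round-robin.
std::vector<std::size_t> iterationsFor(std::size_t Tid, std::size_t NumThreads,
                                       std::size_t NumIters, std::size_t Chunk) {
  assert(NumThreads > 0 && Chunk > 0 && "need at least one thread and chunk");
  std::vector<std::size_t> Its;
  for (std::size_t Base = Tid * Chunk; Base < NumIters;
       Base += NumThreads * Chunk)
    for (std::size_t I = Base; I < Base + Chunk && I < NumIters; ++I)
      Its.push_back(I);
  return Its;
}
```

With 4 threads and 10 iterations, thread 1 runs iterations 1, 5, 9 under a chunk size of 1, but 3, 4, 5 under a chunk size of 3, so neighbouring threads no longer access neighbouring elements in the same step.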
Konstantin Zhuravlyov [Thu, 27 Sep 2018 19:46:41 +0000 (19:46 +0000)]
AMDGPU: Split VOP2Inst into VOP2Inst_e32/e64/sdwa
llvm-svn: 343259
Patrick Lyster [Thu, 27 Sep 2018 19:30:32 +0000 (19:30 +0000)]
Test commit. NFC
llvm-svn: 343258
Lang Hames [Thu, 27 Sep 2018 19:27:20 +0000 (19:27 +0000)]
[ORC] Coalesce all of ORC's symbol renaming / linkage-promotion utilities into
one SymbolLinkagePromoter utility.
SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.
llvm-svn: 343257
Lang Hames [Thu, 27 Sep 2018 19:27:20 +0000 (19:27 +0000)]
[ORC] LastKey needs to be protected to prevent data races.
llvm-svn: 343256
Lang Hames [Thu, 27 Sep 2018 19:27:19 +0000 (19:27 +0000)]
[lli] Fix ArgV setup bug when running in -jit-kind=orc-lazy mode.
ArgV[ArgC] should be null.
llvm-svn: 343255
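The convention behind r343255 is the usual C one: the argument vector holds ArgC pointers followed by a terminating null pointer. A minimal sketch of building such a vector, where `makeArgV` is a hypothetical helper and not the lli code:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a C-style argument vector. The returned
// pointers refer into Args, so Args must outlive the result.
std::vector<const char *> makeArgV(const std::vector<std::string> &Args) {
  std::vector<const char *> ArgV;
  ArgV.reserve(Args.size() + 1);
  for (const std::string &A : Args)
    ArgV.push_back(A.c_str());
  ArgV.push_back(nullptr); // ArgV[ArgC] must be null
  return ArgV;
}
```

Here ArgC is `ArgV.size() - 1`, and code that walks the vector until it sees a null pointer terminates correctly.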
Konstantin Zhuravlyov [Thu, 27 Sep 2018 19:24:05 +0000 (19:24 +0000)]
AMDGPU/NFC: Simplify VOP_MAC_F16/F32
llvm-svn: 343254
Gheorghe-Teodor Bercea [Thu, 27 Sep 2018 19:22:56 +0000 (19:22 +0000)]
[OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing
Summary: For the OpenMP NVPTX toolchain choose a default distribute schedule that ensures coalescing on the GPU when in SPMD mode. This significantly increases the performance of offloaded target code and reduces the number of registers used on the GPU side.
Reviewers: ABataev, caomhin, Hahnfeld
Reviewed By: ABataev, Hahnfeld
Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D52434
llvm-svn: 343253
Kostya Kortchinsky [Thu, 27 Sep 2018 19:15:40 +0000 (19:15 +0000)]
[sanitizer] Disable failing Android test after D52371
Summary:
The default values used for Space/Size for the new SizeClassMap do not work
with Android. The Compact map appears to be in the same boat.
Disable the test on Android for now to turn the bots green, but there is no
reason Compact & Dense should not have an Android test.
Added a FIXME, I will revisit this soon.
Reviewers: eugenis
Subscribers: srhines, kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D52623
llvm-svn: 343252
Roman Lebedev [Thu, 27 Sep 2018 19:07:48 +0000 (19:07 +0000)]
[clang][ubsan][NFC] Slight test cleanup in preparation for D50901
Reviewers: vsk, vitalybuka, filcab
Reviewed By: vitalybuka
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52589
llvm-svn: 343251
Roman Lebedev [Thu, 27 Sep 2018 19:07:47 +0000 (19:07 +0000)]
[compiler-rt][ubsan][NFC] Slight test cleanup in preparation for D50902.
Reviewers: vsk, vitalybuka, filcab
Reviewed By: vitalybuka
Subscribers: kubamracek, dberris, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D52590
llvm-svn: 343250
Stanislav Mekhanoshin [Thu, 27 Sep 2018 18:55:20 +0000 (18:55 +0000)]
[AMDGPU] Fold copy (copy vgpr)
This reduces the number of VGPRs used in some cases.
Differential Revision: https://reviews.llvm.org/D52577
llvm-svn: 343249
Eric Liu [Thu, 27 Sep 2018 18:46:00 +0000 (18:46 +0000)]
[clangd] Initial support for cross-namespace global code completion.
Summary:
When no scope qualifier is specified, allow completing index symbols
from any scope and insert the proper qualifier automatically. This is still
experimental and hidden behind a flag.
Things missing:
- Scope proximity based scoring.
- FuzzyFind supports weighted scopes.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: kbobyrev, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52364
llvm-svn: 343248
Eric Liu [Thu, 27 Sep 2018 18:23:23 +0000 (18:23 +0000)]
[clangd] Add more tracing to index queries. NFC
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52611
llvm-svn: 343247
Kostya Kortchinsky [Thu, 27 Sep 2018 18:20:42 +0000 (18:20 +0000)]
[sanitizer] Introduce a new SizeClassMap with minimal amount of cached entries
Summary:
_Note_: I am not attached to the name `DenseSizeClassMap`, so if someone has a
better idea, feel free to suggest it.
The current pre-defined `SizeClassMap`s hold a decent amount of cached entries,
either in sheer number or in amount of memory cached.
Empirical testing shows that more compact per-class arrays (whose sizes are
directly correlated to the number of cached entries) are beneficial to
performances, particularly in highly threaded environments.
The new proposed `SizeClassMap` has the following properties:
```
c00 => s: 0 diff: +0 00% l 0 cached: 0 0; id 0
c01 => s: 16 diff: +16 00% l 4 cached: 8 128; id 1
c02 => s: 32 diff: +16 100% l 5 cached: 8 256; id 2
c03 => s: 48 diff: +16 50% l 5 cached: 8 384; id 3
c04 => s: 64 diff: +16 33% l 6 cached: 8 512; id 4
c05 => s: 80 diff: +16 25% l 6 cached: 8 640; id 5
c06 => s: 96 diff: +16 20% l 6 cached: 8 768; id 6
c07 => s: 112 diff: +16 16% l 6 cached: 8 896; id 7
c08 => s: 128 diff: +16 14% l 7 cached: 8 1024; id 8
c09 => s: 144 diff: +16 12% l 7 cached: 7 1008; id 9
c10 => s: 160 diff: +16 11% l 7 cached: 6 960; id 10
c11 => s: 176 diff: +16 10% l 7 cached: 5 880; id 11
c12 => s: 192 diff: +16 09% l 7 cached: 5 960; id 12
c13 => s: 208 diff: +16 08% l 7 cached: 4 832; id 13
c14 => s: 224 diff: +16 07% l 7 cached: 4 896; id 14
c15 => s: 240 diff: +16 07% l 7 cached: 4 960; id 15
c16 => s: 256 diff: +16 06% l 8 cached: 4 1024; id 16
c17 => s: 320 diff: +64 25% l 8 cached: 3 960; id 49
c18 => s: 384 diff: +64 20% l 8 cached: 2 768; id 50
c19 => s: 448 diff: +64 16% l 8 cached: 2 896; id 51
c20 => s: 512 diff: +64 14% l 9 cached: 2 1024; id 48
c21 => s: 640 diff: +128 25% l 9 cached: 1 640; id 49
c22 => s: 768 diff: +128 20% l 9 cached: 1 768; id 50
c23 => s: 896 diff: +128 16% l 9 cached: 1 896; id 51
c24 => s: 1024 diff: +128 14% l 10 cached: 1 1024; id 48
c25 => s: 1280 diff: +256 25% l 10 cached: 1 1280; id 49
c26 => s: 1536 diff: +256 20% l 10 cached: 1 1536; id 50
c27 => s: 1792 diff: +256 16% l 10 cached: 1 1792; id 51
c28 => s: 2048 diff: +256 14% l 11 cached: 1 2048; id 48
c29 => s: 2560 diff: +512 25% l 11 cached: 1 2560; id 49
c30 => s: 3072 diff: +512 20% l 11 cached: 1 3072; id 50
c31 => s: 3584 diff: +512 16% l 11 cached: 1 3584; id 51
c32 => s: 4096 diff: +512 14% l 12 cached: 1 4096; id 48
c33 => s: 5120 diff: +1024 25% l 12 cached: 1 5120; id 49
c34 => s: 6144 diff: +1024 20% l 12 cached: 1 6144; id 50
c35 => s: 7168 diff: +1024 16% l 12 cached: 1 7168; id 51
c36 => s: 8192 diff: +1024 14% l 13 cached: 1 8192; id 48
c37 => s: 10240 diff: +2048 25% l 13 cached: 1 10240; id 49
c38 => s: 12288 diff: +2048 20% l 13 cached: 1 12288; id 50
c39 => s: 14336 diff: +2048 16% l 13 cached: 1 14336; id 51
c40 => s: 16384 diff: +2048 14% l 14 cached: 1 16384; id 48
c41 => s: 20480 diff: +4096 25% l 14 cached: 1 20480; id 49
c42 => s: 24576 diff: +4096 20% l 14 cached: 1 24576; id 50
c43 => s: 28672 diff: +4096 16% l 14 cached: 1 28672; id 51
c44 => s: 32768 diff: +4096 14% l 15 cached: 1 32768; id 48
c45 => s: 40960 diff: +8192 25% l 15 cached: 1 40960; id 49
c46 => s: 49152 diff: +8192 20% l 15 cached: 1 49152; id 50
c47 => s: 57344 diff: +8192 16% l 15 cached: 1 57344; id 51
c48 => s: 65536 diff: +8192 14% l 16 cached: 1 65536; id 48
c49 => s: 81920 diff: +16384 25% l 16 cached: 1 81920; id 49
c50 => s: 98304 diff: +16384 20% l 16 cached: 1 98304; id 50
c51 => s: 114688 diff: +16384 16% l 16 cached: 1 114688; id 51
c52 => s: 131072 diff: +16384 14% l 17 cached: 1 131072; id 48
c53 => s: 64 diff: +0 00% l 0 cached: 8 512; id 4
Total cached: 864928 (152/432)
```
It holds a bit less than 1MB of cached entries at most, and the cache fits in a
page.
The plan is to use this map by default for Scudo once we make sure that there
is no unforeseen impact on any current use case.
Benchmarks give the largest increase in performance (with Scudo) when looking at
highly threaded/contentious environments. For example, rcp2-benchmark
experiences a 10K QPS increase (~3%), and a decrease of 50MB for the max RSS
(~10%). On platforms like Android where we only have a couple of caches,
performance remains similar.
Reviewers: eugenis, kcc
Reviewed By: eugenis
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D52371
llvm-svn: 343246
Jordan Rupprecht [Thu, 27 Sep 2018 18:13:01 +0000 (18:13 +0000)]
[compiler-rt] [builtins] Restore tests from r342917 (disabled in r343095) on Windows.
Summary:
-lm is needed for these tests on Linux, but the lit config for this package automatically adds it for Linux and excludes it for Windows. So we should be able to get these tests running again by just dropping -lm and letting the lit config add it when possible.
I was under the impression that -lm worked across platforms because it exists in other tests without any 'UNSUPPORTED: windows' commands (e.g. divsc3_test.c), but those are actually excluded because they 'REQUIRES: c99-complex', which is excluded from windows platforms (also by the local lit config).
I don't have easy access to a windows machine to verify this patch, but I can trigger a build bot run on clang-x64-ninja-win7 shortly after submitting.
Reviewers: hans
Subscribers: dberris, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D52563
llvm-svn: 343245
Craig Topper [Thu, 27 Sep 2018 18:01:48 +0000 (18:01 +0000)]
[ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.
llvm-svn: 343244
Greg Clayton [Thu, 27 Sep 2018 17:55:36 +0000 (17:55 +0000)]
Fixes for GDB remote packet disassembler:
- Add latency timings to GDB packet log summary if timestamps are on log
- Add the ability to plot the latencies for each packet type with --plot
- Don't crash the script when target xml register info is in a weird format
llvm-svn: 343243
Greg Clayton [Thu, 27 Sep 2018 17:45:14 +0000 (17:45 +0000)]
Add an interactive mode to BSD archive parser.
llvm-svn: 343242
Simon Pilgrim [Thu, 27 Sep 2018 17:29:13 +0000 (17:29 +0000)]
[X86] Remove BT/BTC/BTR/BTS rr/ri overrides
llvm-svn: 343241
Jonas Hahnfeld [Thu, 27 Sep 2018 17:27:48 +0000 (17:27 +0000)]
Fix greedy FileCheck expression in test/Driver/mips-abi.c
'ld{{.*}}"' seems to match the complete line for me which is failing
the test. Only allow an optional '.exe' for Windows systems as most
other tests do.
Another possibility would be to collapse the greedy expression with
the next check to avoid matching the full line.
Differential Revision: https://reviews.llvm.org/D52619
llvm-svn: 343240
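The greedy-match problem in r343240 can be reproduced outside FileCheck. The sketch below uses `std::regex` as a stand-in for FileCheck's regex engine (an assumption; the syntax details differ, but `.*` is greedy in both), and `firstMatch` is a hypothetical helper.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text of the first match of Pattern in
// Line, or an empty string if there is no match.
std::string firstMatch(const std::string &Line, const std::string &Pattern) {
  const std::regex Re(Pattern);
  std::smatch M;
  if (!std::regex_search(Line, M, Re))
    return std::string();
  return M.str(0);
}
```

On a line such as `"/usr/bin/ld" "-o" "a.out"`, the pattern `ld.*"` runs from `ld` to the last quote on the line, while `ld(\.exe)?"` stops at the quote that closes the linker path, leaving the rest of the line for later checks.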
George Karpenkov [Thu, 27 Sep 2018 17:26:41 +0000 (17:26 +0000)]
[analyzer] Highlight nodes which have error reports in them in red in exploded graph
Differential Revision: https://reviews.llvm.org/D52584
llvm-svn: 343239
Simon Pilgrim [Thu, 27 Sep 2018 17:13:57 +0000 (17:13 +0000)]
[X86][Btver2] (V)MPSADBW instructions take 3uops not 1
llvm-svn: 343238
Kadir Cetinkaya [Thu, 27 Sep 2018 17:13:07 +0000 (17:13 +0000)]
Introduce completionItemKind capability support.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, ioeric, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D52616
llvm-svn: 343237
Luke Cheeseman [Thu, 27 Sep 2018 16:48:04 +0000 (16:48 +0000)]
Revert r343193 together with r343192
llvm-svn: 343236
Luke Cheeseman [Thu, 27 Sep 2018 16:47:30 +0000 (16:47 +0000)]
Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
Simon Pilgrim [Thu, 27 Sep 2018 16:39:52 +0000 (16:39 +0000)]
[X86][Btver2] BTC/BTR/BTS instructions take 2uops not 1
llvm-svn: 343234
Simon Pilgrim [Thu, 27 Sep 2018 16:24:42 +0000 (16:24 +0000)]
[X86] Split BT and BTC/BTR/BTS scheduler classes
llvm-svn: 343233
Simon Pilgrim [Thu, 27 Sep 2018 16:21:35 +0000 (16:21 +0000)]
[Sparc] EXPENSIVE_CHECKS now passes all machine verifier errors (PR27461)
Now that D51487 has landed, the last machine verifier tests that failed EXPENSIVE_CHECKS builds have been fixed/removed, so we can remove @MatzeB's isMachineVerifierClean() hack for sparc targets.
Differential Revision: https://reviews.llvm.org/D52612
llvm-svn: 343232
Oliver Stannard [Thu, 27 Sep 2018 16:19:04 +0000 (16:19 +0000)]
[AArch64] Refactor immediate details out of add/sub tblgen class (NFCI)
Bits [23-22] are used in Add and Sub to specify the shift. The value of the
shift field must be 0x; values of 1x are unallocated. MTE adds some instructions
that use such encodings, and this patch refactors the Add/Sub class so that
another class could derive from this one to implement other encodings and other
formats of bitfields.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52489
llvm-svn: 343231
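The encoding constraint mentioned in r343231 can be expressed with two small helpers. This is a sketch based only on the commit text: the function names are hypothetical, and the only facts assumed are that the shift field occupies bits [23:22] of the 32-bit instruction word and that values with the top bit set (1x) are unallocated for Add/Sub immediate.

```cpp
#include <cstdint>

// Extracts the two-bit shift field from bits [23:22] of an instruction word.
inline unsigned shiftField(uint32_t Insn) { return (Insn >> 22) & 0x3u; }

// For Add/Sub immediate, only shift values of the form 0x are allocated.
inline bool isAllocatedAddSubShift(uint32_t Insn) {
  return (shiftField(Insn) & 0x2u) == 0;
}
```

Encodings for which `isAllocatedAddSubShift` returns false are the space that, according to the commit message, other instructions can occupy.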
Jonas Hahnfeld [Thu, 27 Sep 2018 16:12:32 +0000 (16:12 +0000)]
[OpenMP] Improve search for libomptarget-nvptx
When looking for the bclib Clang considered the default library
path first while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because during development it allows using a non-default
runtime library. So change the search for the bclib to start
looking in directories given by LIBRARY_PATH.
Additionally add a new option --libomptarget-nvptx-path= which
will be searched first. This will be handy for testing purposes.
Differential Revision: https://reviews.llvm.org/D51686
llvm-svn: 343230
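The search order described in r343230 can be summarised as a list builder. This is an illustrative sketch, not the Clang driver code: `bclibSearchDirs` is a hypothetical function, the three parameters stand for the --libomptarget-nvptx-path= value, the LIBRARY_PATH environment variable, and the default library directory, and a ':' separator is assumed for LIBRARY_PATH.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: directories to probe for the bclib, in priority order.
std::vector<std::string> bclibSearchDirs(const std::string &ExplicitPath,
                                         const std::string &LibraryPathEnv,
                                         const std::string &DefaultDir) {
  std::vector<std::string> Dirs;
  if (!ExplicitPath.empty())
    Dirs.push_back(ExplicitPath); // the explicit option is searched first
  std::stringstream SS(LibraryPathEnv);
  for (std::string Dir; std::getline(SS, Dir, ':');)
    if (!Dir.empty())
      Dirs.push_back(Dir); // then each LIBRARY_PATH entry, in order
  Dirs.push_back(DefaultDir); // the default library path comes last
  return Dirs;
}
```

The first directory in the list that contains the library wins, which is what lets a development build of the runtime shadow the installed one.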
Oliver Stannard [Thu, 27 Sep 2018 16:09:05 +0000 (16:09 +0000)]
[AArch64][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52483
llvm-svn: 343229
Sanjay Patel [Thu, 27 Sep 2018 15:59:24 +0000 (15:59 +0000)]
[InstCombine] Without infinities, fold (C / X) < 0.0 --> (X < 0)
When C is not zero and infinities are not allowed, (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.
E.g.:
foo(double X) {
if ((-2.0 / X) <= 0) ...
}
=>
foo(double X) {
if (X >= 0) ...
}
Patch by: @marels (Martin Elshuber)
Differential Revision: https://reviews.llvm.org/D51942
llvm-svn: 343228
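The example from r343228 can be checked numerically in ordinary C++. This is only a sanity check of the arithmetic, not InstCombine code: the function names are hypothetical, and X is assumed to be finite and non-zero so that the quotient is never an infinity.

```cpp
// The comparison as written in the source program.
inline bool beforeFold(double X) { return (-2.0 / X) <= 0.0; }

// The folded form: C is negative, so the predicate is swapped.
inline bool afterFold(double X) { return X >= 0.0; }
```

For any finite non-zero X the two functions agree. They differ at X = -0.0, where the quotient becomes +infinity, which is why the fold is restricted to the case where infinities are not allowed.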
Simon Pilgrim [Thu, 27 Sep 2018 14:57:57 +0000 (14:57 +0000)]
[X86][Btver2] BLSI/BLSMSK/BLSR instructions take 2uops not 1 (same as TZCNT)
llvm-svn: 343227
Teresa Johnson [Thu, 27 Sep 2018 14:55:32 +0000 (14:55 +0000)]
[WPD] Fix incorrect devirtualization after indirect call promotion
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.
Before this patch the code was incorrectly devirtualizing the fallback
indirect call.
See the new test and the example described there for more details.
Reviewers: pcc, vitalybuka
Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D52514
llvm-svn: 343226
Oliver Stannard [Thu, 27 Sep 2018 14:54:33 +0000 (14:54 +0000)]
[AArch64][v8.5A] Add Branch Target Identification instructions
This adds new instructions used by the Branch Target Identification
feature. When this is enabled, these are the only instructions which can
be targeted by indirect branch instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52485
llvm-svn: 343225
Eric Liu [Thu, 27 Sep 2018 14:50:24 +0000 (14:50 +0000)]
[Tooling] Get rid of uses of llvm::Twine::str which is slow. NFC
llvm-svn: 343224
Eric Liu [Thu, 27 Sep 2018 14:27:02 +0000 (14:27 +0000)]
[clangd] Make IncludeInserter less slow. NFC
llvm-svn: 343223
Sanjay Patel [Thu, 27 Sep 2018 14:24:29 +0000 (14:24 +0000)]
[InstCombine] add tests for FP sign-bit cmp optimization with fdiv; NFC
These are baseline tests for D51942.
Patch by: @marels (Martin Elshuber)
llvm-svn: 343222
Kadir Cetinkaya [Thu, 27 Sep 2018 14:21:07 +0000 (14:21 +0000)]
Tell whether file/folder for include completions.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, ioeric, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D52547
llvm-svn: 343221
Oliver Stannard [Thu, 27 Sep 2018 14:20:59 +0000 (14:20 +0000)]
[AArch64][v8.5A] Test optional Armv8.5-A random number extension
The implementation of this is in TargetParser, so we only need to add a
test for it in clang.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52492
llvm-svn: 343220
Oliver Stannard [Thu, 27 Sep 2018 14:05:46 +0000 (14:05 +0000)]
[AArch64][v8.5A] Add speculation restriction system registers
This adds some new system registers which can be used to restrict
certain types of speculative execution.
Patch by Pablo Barrio and David Spickett!
Differential revision: https://reviews.llvm.org/D52482
llvm-svn: 343218
Oliver Stannard [Thu, 27 Sep 2018 14:01:40 +0000 (14:01 +0000)]
[AArch64][v8.5A] Add Armv8.5-A random number instructions
This adds two new system registers, used to generate random numbers.
This is an optional extension to v8.5-A, and will be controlled by the
"+rng" modifier of the -march= and -mcpu= options.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52481
llvm-svn: 343217
Oliver Stannard [Thu, 27 Sep 2018 13:53:35 +0000 (13:53 +0000)]
[AArch64][v8.5A] Add Armv8.5-A "DC CVADP" instruction
This adds a new variant of the DC system instruction for persistent
memory.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52480
llvm-svn: 343216
Simon Pilgrim [Thu, 27 Sep 2018 13:49:52 +0000 (13:49 +0000)]
The llvm-exegesis output file is a html file not a txt file.
llvm-svn: 343215
Oliver Stannard [Thu, 27 Sep 2018 13:47:40 +0000 (13:47 +0000)]
[AArch64][v8.5A] Add prediction invalidation instructions to AArch64
This adds new system instructions which act as barriers to speculative
execution based on earlier execution within a particular execution
context.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52479
llvm-svn: 343214
Oliver Stannard [Thu, 27 Sep 2018 13:41:14 +0000 (13:41 +0000)]
[ARM][v8.5A] Add speculation barrier to ARM & Thumb instruction sets
This is a new barrier which limits speculative execution of the
instructions following it.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52477
llvm-svn: 343213
Michael Kruse [Thu, 27 Sep 2018 13:39:37 +0000 (13:39 +0000)]
[IslAst] Fix InParallelFor nesting.
IslAst could mark two nested outer loops as "OutermostParallel". This
caused the code generator to try to OpenMP-parallelize both loops,
which it is not prepared for.
This happened because the recursive AST build algorithm managed a flag
"InParallelFor" to ensure that no nested loop is also marked as
"OutermostParallel". Unfortunately the same flag was used by nodes
marked as SIMD, and reset to false after the SIMD node. Since loops can
be marked as SIMD inside "OutermostParallel" loops, the recursive
algorithm again tried to mark loops as "OutermostParallel" although
still nested inside another "OutermostParallel" loop.
The fix exposed another bug: The function "astScheduleDimIsParallel" was
only called when a loop was potentially "OutermostParallel" or
"InnermostParallel", but as a side-effect also determines the minimum
dependence distance. Hence, changing when we need to know whether a loop
is "OutermostParallel" also changed which loop was annotated with
"#pragma minimal dependence distance".
Moreover, some complex condition linked with "InParallelFor" determined
whether a loop should be an "InnermostParallel" loop. It missed some
situations where a loop would not be marked as such although it was inside
an SIMD mark node, and would therefore not be annotated using "#pragma simd".
The changes in particular:
1. Split the "InParallelFor" flag into an "InParallelFor" and an
"InSIMD" flag.
2. Unconditionally call "astScheduleDimIsParallel" for its side-effects
and store the result in "InParallel" for later use.
3. Simplify the condition when a loop is "InnermostParallel".
Fixes llvm.org/PR33153 and llvm.org/PR38073.
llvm-svn: 343212
Oliver Stannard [Thu, 27 Sep 2018 13:39:06 +0000 (13:39 +0000)]
[AArch64][v8.5A] Add speculation barrier to AArch64 instruction set
This is a new barrier which limits speculative execution of the
instructions following it.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52476
llvm-svn: 343211
Daniel Cederman [Thu, 27 Sep 2018 13:32:54 +0000 (13:32 +0000)]
[Sparc] Remove the support for builtin setjmp/longjmp
Summary: It is currently broken and for Sparc there is not much benefit
in using a builtin version compared to a library version. Both versions
need to store the same four values in setjmp and flush the register
windows in longjmp. If the need for a builtin setjmp/longjmp arises there
is an improved implementation available at https://reviews.llvm.org/D50969.
Reviewers: jyknight, joerg, venkatra
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D51487
llvm-svn: 343210
Oliver Stannard [Thu, 27 Sep 2018 13:32:06 +0000 (13:32 +0000)]
[AArch64][v8.5A] Add FRINT[32,64][Z,X] instructions
These are some new variants of the "Floating-point Round to Integral"
family of instructions, which round to the nearest floating-point value
which fits in a 32- or 64-bit integer.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52475
llvm-svn: 343209
Clement Courbet [Thu, 27 Sep 2018 13:26:37 +0000 (13:26 +0000)]
[llvm-exegesis] Fix PR39096.
Summary: The key is now the resource name, not the resource id.
Reviewers: gchatelet
Subscribers: tschuett, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D52607
llvm-svn: 343208
Sven van Haastregt [Thu, 27 Sep 2018 13:20:29 +0000 (13:20 +0000)]
[OpenCL] Improve extension-version.cl and to_addr_builtin.cl tests
Add cl_khr_depth_images to extension-version.cl.
Extend to_addr_builtin.cl to additionally test the built-in methods
to_private and to_local, and test assignment with to_global to
incorrect types.
Patch by Alistair Davies.
Differential Revision: https://reviews.llvm.org/D52020
llvm-svn: 343207
Kristof Umann [Thu, 27 Sep 2018 12:46:37 +0000 (12:46 +0000)]
Revert unintentionally committed changes
llvm-svn: 343205
Kristof Umann [Thu, 27 Sep 2018 12:40:16 +0000 (12:40 +0000)]
[Lex] TokenConcatenation now takes const Preprocessor
Differential Revision: https://reviews.llvm.org/D52502
llvm-svn: 343204
Daniel Cederman [Thu, 27 Sep 2018 12:34:53 +0000 (12:34 +0000)]
[Sparc] Add unimp alias
Summary: Use 0 as the default immediate for the UNIMP instruction.
This matches the behavior in gas.
Reviewers: jyknight, venkatra
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D51526
llvm-svn: 343203
Daniel Cederman [Thu, 27 Sep 2018 12:34:48 +0000 (12:34 +0000)]
[Sparc] Add support for the partial write PSR instruction
Summary:
Partial write %PSR (WRPSR) is a SPARC V8e option that allows WRPSR
instructions to only affect the %PSR.ET field. It is supported by
the GR740 and GR716.
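The effect of a partial write can be sketched as a masked update. This is only a model of the register semantics, not the backend code from this patch: `partialWritePsrSketch` is a hypothetical name, and the position of ET (bit 5) is an assumption based on the SPARC V8 %psr layout.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: ET (enable traps) is bit 5 of the SPARC V8 %psr.
constexpr uint32_t PSR_ET = 1u << 5;

// Hypothetical model of a partial %psr write: only the ET bit is taken
// from Value, every other bit of the old %psr is preserved.
uint32_t partialWritePsrSketch(uint32_t OldPsr, uint32_t Value) {
  return (OldPsr & ~PSR_ET) | (Value & PSR_ET);
}

// Small usage example.
void partialWritePsrSketchUsage() {
  // Clearing ET leaves the other bits of 0xE7 untouched.
  assert(partialWritePsrSketch(0xE7u, 0u) == 0xC7u);
  // Setting ET from an all-ones value changes nothing but bit 5.
  assert(partialWritePsrSketch(0x80u, 0xFFFFFFFFu) == 0xA0u);
}
```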
Reviewers: jyknight, venkatra
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D48644
llvm-svn: 343202
Jonas Toth [Thu, 27 Sep 2018 12:30:44 +0000 (12:30 +0000)]
[clang-tidy] use CHECK-NOTES in tests for bugprone suspicious-enum-usage
Reviewers: alexfh, aaron.ballman, hokein
Subscribers: xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D52229
llvm-svn: 343201
Simon Pilgrim [Thu, 27 Sep 2018 12:28:47 +0000 (12:28 +0000)]
[X86][Btver2] TZCNT instructions take 2uops not 1
llvm-svn: 343200
Jonas Toth [Thu, 27 Sep 2018 12:22:48 +0000 (12:22 +0000)]
[clang-tidy] use CHECK-NOTES in tests for bugprone-use-after-move
Reviewers: alexfh, aaron.ballman, hokein
Subscribers: xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D52228
llvm-svn: 343199
Jonas Toth [Thu, 27 Sep 2018 12:17:59 +0000 (12:17 +0000)]
[clang-tidy] use CHECK-NOTES in tests for bugprone-forward-declaration-namespace
Reviewers: aaron.ballman, alexfh, hokein
Subscribers: xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D52185
llvm-svn: 343198
Kadir Cetinkaya [Thu, 27 Sep 2018 12:12:42 +0000 (12:12 +0000)]
Improve diagnostics range reporting.
Summary:
If a clang diagnostic carries range information, promote that range
even if it does not contain the diagnostic location.
Reviewers: sammccall, ioeric
Reviewed By: ioeric
Subscribers: ilya-biryukov, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D52544
llvm-svn: 343197
Peter Smith [Thu, 27 Sep 2018 12:07:47 +0000 (12:07 +0000)]
[COFF] Add missing Requires x86 to fix buildbot
Add REQUIRES: x86 to pdb-debug-f.s as this is causing the Arm and
AArch64 buildbots to fail as they do not have the x86 backend.
Differential Revision: https://reviews.llvm.org/D52606
llvm-svn: 343196
Nemanja Ivanovic [Thu, 27 Sep 2018 11:49:47 +0000 (11:49 +0000)]
[PowerPC] [NFC] Refactor code for printing register operands
We have an unfortunate situation in our back end where we have to keep pairs of
functions synchronized. Needless to say, this is not an ideal situation, as
it is very difficult to enforce. Even without bugs, it's annoying to have to do
the same thing in two places.
This patch just refactors the code so that the two pairs of those functions that
pertain to printing register operands are unified:
- stripRegisterPrefix() - this just removes the letter prefixes from registers
for the InstrPrinter and AsmPrinter. This patch provides this as a static
member of PPCRegisterInfo.
- Handling of PPCII::UseVSXReg - there are 3 places where we do something
special for instructions with that flag set. Each of those places does its
own checking of this flag and implements code customization. Any changes to
how we print/encode VSX/VMX registers require modifying all 3 places. This
patch unifies this into a static function in PPCInstrInfo that returns the
register number adjusted as needed.
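To make the first item concrete, here is a minimal sketch of what stripping a register-name prefix can look like. It is not the code from this patch: `stripRegisterPrefixSketch` is a hypothetical name, and the set of prefixes handled ("r", "f", "v", "vs", "cr") is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: drop a leading register-class prefix and return
// the remaining (numeric) part of the name. Names without one of the
// assumed prefixes are returned unchanged.
std::string stripRegisterPrefixSketch(const std::string &Name) {
  // Two-letter prefixes first, so "vs34" is not treated as "v" + "s34".
  if (Name.compare(0, 2, "vs") == 0 || Name.compare(0, 2, "cr") == 0)
    return Name.substr(2);
  if (!Name.empty() && (Name[0] == 'r' || Name[0] == 'f' || Name[0] == 'v'))
    return Name.substr(1);
  return Name;
}

// Small usage example.
void stripRegisterPrefixSketchUsage() {
  assert(stripRegisterPrefixSketch("r3") == "3");
  assert(stripRegisterPrefixSketch("vs34") == "34");
  assert(stripRegisterPrefixSketch("cr7") == "7");
}
```

Keeping a single helper like this means both printers agree on the spelling by construction, which is the point of the refactoring.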
Differential revision: https://reviews.llvm.org/D52467
llvm-svn: 343195
Simon Pilgrim [Thu, 27 Sep 2018 11:40:26 +0000 (11:40 +0000)]
[X86][Btver2] Add uops counter for exegesis reports
llvm-svn: 343194
Luke Cheeseman [Thu, 27 Sep 2018 10:42:14 +0000 (10:42 +0000)]
Update CallFrameString API to account for r343114
- CallFrameString now takes an Arch parameter to account for multiplexing
overlapping CFI directives
llvm-svn: 343193
Luke Cheeseman [Thu, 27 Sep 2018 10:39:20 +0000 (10:39 +0000)]
Reapply changes reverted in r343114, lldb patch to follow shortly
llvm-svn: 343192
Raphael Isemann [Thu, 27 Sep 2018 10:12:54 +0000 (10:12 +0000)]
Refactor ClangUserExpression::GetLanguageForExpr
Summary:
The `ClangUserExpression::GetLanguageForExpr` method is currently a big
source of sadness, as its name implies that it's an accessor method, but it actually
also initializes some variables that we need for parsing. As a result, we
currently call this getter just for its side effects while ignoring its return value,
which is confusing for the reader.
This patch renames it to `UpdateLanguageForExpr` and merges all calls to the
method into a single call in `ClangUserExpression::PrepareForParsing` (as calling
this method is mandatory for parsing to succeed anyway).
While looking at the code, I also found that we actually have two language
variables in this class hierarchy. The normal `Language` from the UserExpression
class and the `LanguageForExpr` that we implemented in this subclass. The two
don't seem to actually contain the same value, so we probably should look at this
next.
Reviewers: xbolva00
Reviewed By: xbolva00
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D52561
llvm-svn: 343191
Nicola Zaghen [Thu, 27 Sep 2018 10:08:38 +0000 (10:08 +0000)]
[InstCombine] Add new tests in preparation for a combine of icmp (mul nsw/nuw X, C2), C
Proofs for the future optimisations are here:
- eq/neq: https://rise4fun.com/Alive/9PBA
- sgt/ugt: https://rise4fun.com/Alive/58yr
- slt/ult: https://rise4fun.com/Alive/VCQ
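For intuition on the eq case only, the sketch below brute-forces one plausible folded form over all 8-bit values. It is neither the Alive proof linked above nor InstCombine code: the folded form (compare X against C / C2 when C2 divides C, otherwise false) is an assumption about the combine these tests prepare for, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when X * C2 fits in a signed 8-bit value, i.e. the multiply
// would not violate nsw for i8 operands.
bool mulFitsInI8(int X, int C2) {
  int P = X * C2;
  return P >= INT8_MIN && P <= INT8_MAX;
}

// Assumed folded form of `icmp eq (mul nsw X, C2), C` for C2 != 0:
// the product can only equal C if C2 divides C, and then X == C / C2.
bool foldedEq(int X, int C2, int C) {
  return C % C2 == 0 && X == C / C2;
}

// Compare the original and folded predicates for every i8 triple where
// the multiply does not overflow (overflowing cases are poison and may
// be folded to anything).
bool eqFoldHoldsForAllI8() {
  for (int C2 = INT8_MIN; C2 <= INT8_MAX; ++C2) {
    if (C2 == 0)
      continue;
    for (int C = INT8_MIN; C <= INT8_MAX; ++C)
      for (int X = INT8_MIN; X <= INT8_MAX; ++X) {
        if (!mulFitsInI8(X, C2))
          continue;
        bool Original = (X * C2 == C);
        if (Original != foldedEq(X, C2, C))
          return false;
      }
  }
  return true;
}

// Small usage example.
void eqFoldUsage() { assert(eqFoldHoldsForAllI8()); }
```

The arithmetic is done in `int`, so the 8-bit products never overflow in the checker itself.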
Differential Revision: https://reviews.llvm.org/D51625
llvm-svn: 343190
Hans Wennborg [Thu, 27 Sep 2018 09:59:27 +0000 (09:59 +0000)]
Revert r342942 "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units"
It seems to have broken several targets, see comments on the llvm-commits thread.
> Change the copy tracker to keep a single map of register units instead
> of 3 maps of registers. This gives a very significant compile time
> performance improvement to the pass. I measured a 30-40% decrease in
> time spent in MCP on x86 and AArch64 and much more significant
> improvements on out of tree targets with more registers.
>
> Differential Revision: https://reviews.llvm.org/D52374
llvm-svn: 343189
Guillaume Chatelet [Thu, 27 Sep 2018 09:23:04 +0000 (09:23 +0000)]
[llvm-exegesis][NFC] moving code around.
Summary: Renaming InstructionBuilder into InstructionTemplate and moving code generation tools from MCInstrDescView to CodeTemplate.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52592
llvm-svn: 343188
Oliver Stannard [Thu, 27 Sep 2018 09:11:27 +0000 (09:11 +0000)]
[AArch64][v8.5A] Add PSTATE manipulation instructions XAFlag and AXFlag
These new instructions manipulate the NZCV bits, to convert between the
regular Arm floating-point compare format and an alternative format.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52473
llvm-svn: 343187
Simon Atanasyan [Thu, 27 Sep 2018 08:51:18 +0000 (08:51 +0000)]
[mips] Add support for MIPS r6 Debian triples
Debian uses different triples and paths for MIPS r6. Here we use SubArch
to determine whether the target is r6, which is set if `r6' is found in the
CPU section of the triple.
These new triples include:
mipsisa32r6-linux-gnu
mipsisa32r6el-linux-gnu
mipsisa64r6-linux-gnuabi64
mipsisa64r6el-linux-gnuabi64
mipsisa64r6-linux-gnuabin32
mipsisa64r6el-linux-gnuabin32
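The check on the CPU section can be illustrated at the string level as follows. This is a hedged sketch, not the LLVM Triple API used by the patch; `cpuSectionNamesR6` is a hypothetical helper that treats everything before the first `-` as the CPU section.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true if the CPU (first) component of a
// target triple contains "r6".
bool cpuSectionNamesR6(const std::string &Triple) {
  // With no '-', find() returns npos and substr() keeps the whole string.
  std::string Cpu = Triple.substr(0, Triple.find('-'));
  return Cpu.find("r6") != std::string::npos;
}

// Small usage example.
void cpuSectionNamesR6Usage() {
  assert(cpuSectionNamesR6("mipsisa32r6-linux-gnu"));
  assert(cpuSectionNamesR6("mipsisa64r6el-linux-gnuabin32"));
  assert(!cpuSectionNamesR6("mips64el-linux-gnuabi64"));
}
```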
Patch by YunQiang Su.
Differential revision: https://reviews.llvm.org/D50857
llvm-svn: 343185