Alex Lorenz [Thu, 9 Nov 2017 17:54:49 +0000 (17:54 +0000)]
Add _LIBCPP_INLINE_VISIBILITY to __compressed_pair_elem members
The commit r300140 changed the implementation of compressed_pair, but didn't add
_LIBCPP_INLINE_VISIBILITY to the constructors and get members of the
compressed_pair_elem class. This patch adds the visibility annotation.
I didn't find a way to test this change with libc++ regression tests.
rdar://35352579
Differential Revision: https://reviews.llvm.org/D39751
llvm-svn: 317816
Nuno Lopes [Thu, 9 Nov 2017 17:35:36 +0000 (17:35 +0000)]
revert r317812 [BasicAA] fix build break by converting the previously introduced assert into an if stmt
The code has a bug, but some tests regress.
I'll discuss this further on the mailing list.
llvm-svn: 317815
Weiming Zhao [Thu, 9 Nov 2017 17:32:57 +0000 (17:32 +0000)]
[Builtins] Do not use tailcall for Thumb1
Summary:
The `b` instruction in Thumb1 has limited range, which may cause link-time errors if the jump target is far away.
This patch guards the tailcalls so that they are only used for non-Thumb1 targets.
Reviewers: peter.smith, compnerd, rengolin, eli.friedman
Reviewed By: rengolin
Subscribers: joerg, dalias, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D39700
llvm-svn: 317814
Alexey Bataev [Thu, 9 Nov 2017 17:32:15 +0000 (17:32 +0000)]
[OPENMP] Codegen for `#pragma omp target parallel for simd`.
Added codegen for `#pragma omp target parallel for simd` and clauses.
llvm-svn: 317813
Nuno Lopes [Thu, 9 Nov 2017 17:06:42 +0000 (17:06 +0000)]
[BasicAA] fix build break by converting the previously introduced assert into an if stmt
Apparently V1Size == -1 doesn't imply V2Size == -1, which is a bit surprising to me.
llvm-svn: 317812
Alexey Bataev [Thu, 9 Nov 2017 17:01:35 +0000 (17:01 +0000)]
[OPENMP] Treat '#pragma omp target parallel for simd' as simd directive.
`#pragma omp target parallel for simd` mistakenly was not treated as a
simd directive, fixed this problem.
llvm-svn: 317811
Sanjay Patel [Thu, 9 Nov 2017 16:46:04 +0000 (16:46 +0000)]
revert r317809 - [Reassociate] regenerate test checks; NFC
The reassociate pass generates named values such as "%tmp2" which trips up the script's regexes
because the script uses a 'TMP' prefix for unnamed values (%2).
llvm-svn: 317810
Sanjay Patel [Thu, 9 Nov 2017 16:35:30 +0000 (16:35 +0000)]
[Reassociate] regenerate test checks; NFC
llvm-svn: 317809
Michael Kruse [Thu, 9 Nov 2017 16:33:29 +0000 (16:33 +0000)]
Update formatting to reflect change in clang-format. NFC.
clang-format has changed its algorithm
for sorting includes in r317794.
llvm-svn: 317808
Ulrich Weigand [Thu, 9 Nov 2017 16:31:57 +0000 (16:31 +0000)]
[SystemZ] Add support for the "o" inline asm constraint
We don't really need any special handling of "offsettable"
memory addresses, but since some existing code uses inline
asm statements with the "o" constraint, add support for this
constraint for compatibility purposes.
llvm-svn: 317807
Sanjay Patel [Thu, 9 Nov 2017 16:30:19 +0000 (16:30 +0000)]
[Reassociate] regenerate test checks; NFC
llvm-svn: 317806
Sanjay Patel [Thu, 9 Nov 2017 16:25:35 +0000 (16:25 +0000)]
[Reassociate] add check lines; NFC
llvm-svn: 317805
Sanjay Patel [Thu, 9 Nov 2017 16:23:32 +0000 (16:23 +0000)]
[Reassociate] add tests with 'reassoc' FMF and regenerate checks; NFC
llvm-svn: 317804
Nuno Lopes [Thu, 9 Nov 2017 16:16:46 +0000 (16:16 +0000)]
[BasicAA] add assertion for corner case in aliasGEP()
llvm-svn: 317803
Bill Seurer [Thu, 9 Nov 2017 16:14:57 +0000 (16:14 +0000)]
[PowerPC][msan] Update msan to handle changed memory layouts in newer kernels
In more recent Linux kernels (including those with 47 bit VMAs) the layout of
virtual memory for powerpc64 changed causing the memory sanitizer to not
work properly. This patch adjusts the memory ranges in the tables for the
memory sanitizer to work on the newer kernels while continuing to work on the
older ones as well.
Tested on several 4.x and 3.x kernel releases.
llvm-svn: 317802
Simon Dardis [Thu, 9 Nov 2017 16:02:18 +0000 (16:02 +0000)]
[mips] Correct microMIPS jump and add unconditional branch pseudo
Correct the definition of 'j' as being unavailable for microMIPS32R6 and
provide the 'b' assembly idiom for codegen purposes for microMIPS32r3.
Provide the necessary 'br' pattern for microMIPS32R6 as it no longer
incorrectly uses the 'j' instruction.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D39741
llvm-svn: 317801
Ben Hamilton [Thu, 9 Nov 2017 16:01:16 +0000 (16:01 +0000)]
[VirtualFileSystem] InMemoryFileSystem::addFile(): Type and Perms
Summary:
This implements a FIXME in InMemoryFileSystem::addFile(), allowing
clients to specify User, Group, Type, and/or Perms when creating a
file in an in-memory filesystem.
New tests included. Ran tests with:
% ninja BasicTests && ./tools/clang/unittests/Basic/BasicTests
Fixes PR#35172 (https://bugs.llvm.org/show_bug.cgi?id=35172)
Reviewers: bkramer, hokein
Reviewed By: bkramer, hokein
Subscribers: alexfh
Differential Revision: https://reviews.llvm.org/D39572
llvm-svn: 317800
Krasimir Georgiev [Thu, 9 Nov 2017 15:54:59 +0000 (15:54 +0000)]
[clang-format] Keep Sphinx happy after r317794
llvm-svn: 317799
Jonas Hahnfeld [Thu, 9 Nov 2017 15:52:29 +0000 (15:52 +0000)]
Add const to some variables to avoid const_casts
In these places the const attribute seems correct and doesn't
need any other change, so let's do it.
Differential Revision: https://reviews.llvm.org/D39756
llvm-svn: 317798
Jonas Hahnfeld [Thu, 9 Nov 2017 15:52:25 +0000 (15:52 +0000)]
Remove const from variables with dynamic memory
Allocated memory is typically not 'const' if it needs to be freed.
This patch removes around 50 wrong const attributes, modifies the
corresponding functions and finally gets rid of some const_casts.
These have especially been strange for __kmp_str_fname_free() that
added a 'const' to call __kmp_str_free() which removed it again.
Two minor cleanups that I performed in this process:
* __kmp_tool_libraries now lives in kmp_settings.cpp as it is
used nowhere else.
* __kmp_msg_empty was removed as it was never used and Clang
now complained that it was assigned a string literal that
is 'const char *'.
Differential Revision: https://reviews.llvm.org/D39755
llvm-svn: 317797
Alex Bradbury [Thu, 9 Nov 2017 15:45:42 +0000 (15:45 +0000)]
[RISCV] Re-generate test/CodeGen/RISCV/alu32.ll using update_llc_test_checks.py
No real change, but makes it marginally easier to merge the remainder of the
out-of-tree patches.
llvm-svn: 317796
Pavel Labath [Thu, 9 Nov 2017 15:45:09 +0000 (15:45 +0000)]
llgs-tests: Replace the "log+return false" pattern with llvm::Error
Summary:
These tests used to log the error message and return plain bool mainly
because at the time they were written, we did not have a nice way to
assert on llvm::Error values. That is no longer true, so replace this
pattern with a more idiomatic approach.
As a part of this patch, I also move the formatting of
GDBRemoteCommunication::PacketResult values out of the test code, as
that can be useful elsewhere.
Reviewers: zturner, eugene
Subscribers: mgorny, lldb-commits
Differential Revision: https://reviews.llvm.org/D39790
llvm-svn: 317795
Krasimir Georgiev [Thu, 9 Nov 2017 15:41:23 +0000 (15:41 +0000)]
[clang-format] Sort using declarations by splitting on '::'
Summary: This patch improves using declarations sorting.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D39786
llvm-svn: 317794
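
To illustrate the idea behind r317794, here is a minimal sketch of ordering qualified names by splitting them on '::' and comparing component by component. It is not the clang-format implementation: the helper names (splitOnScope, lessByComponents, sortUsingNames) are hypothetical, and details such as case handling are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Split a qualified name such as "a::b::c" into {"a", "b", "c"}.
std::vector<std::string> splitOnScope(const std::string &Name) {
  std::vector<std::string> Parts;
  std::size_t Start = 0;
  while (true) {
    std::size_t Pos = Name.find("::", Start);
    if (Pos == std::string::npos) {
      Parts.push_back(Name.substr(Start));
      return Parts;
    }
    Parts.push_back(Name.substr(Start, Pos - Start));
    Start = Pos + 2; // skip over the "::" separator
  }
}

// Hypothetical comparator: compare names one scope component at a time
// (std::vector's operator< is a lexicographical comparison).
bool lessByComponents(const std::string &A, const std::string &B) {
  return splitOnScope(A) < splitOnScope(B);
}

// Return the names sorted by components; equal names keep their order.
std::vector<std::string> sortUsingNames(std::vector<std::string> Names) {
  std::stable_sort(Names.begin(), Names.end(), lessByComponents);
  return Names;
}

// Usage: a plain string comparison puts "a1::b" before "a::z" because
// '1' sorts before ':'; the component-wise comparison does not.
void usingSortDemo() {
  assert(std::string("a1::b") < std::string("a::z"));
  std::vector<std::string> Sorted = sortUsingNames({"a1::b", "a::z"});
  assert(Sorted.front() == "a::z");
}
```

Comparing per component keeps all names from one namespace together, which a raw character comparison does not guarantee.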
Krasimir Georgiev [Thu, 9 Nov 2017 15:12:17 +0000 (15:12 +0000)]
[clang-format] Apply a clang-tidy suggestion, NFC
llvm-svn: 317793
Pavel Labath [Thu, 9 Nov 2017 15:06:31 +0000 (15:06 +0000)]
Add a unit test for ClangASTContext template arguments handling
I am planning to make changes to this piece of code, so I wrote this
test to add more coverage to it first.
llvm-svn: 317792
Alex Bradbury [Thu, 9 Nov 2017 15:00:03 +0000 (15:00 +0000)]
[RISCV] MC layer support for the standard RV32A instruction set extension
llvm-svn: 317791
Simon Pilgrim [Thu, 9 Nov 2017 14:56:17 +0000 (14:56 +0000)]
Fix 'not all control paths return a value' warning on MSVC builds
llvm-svn: 317790
Dave Lee [Thu, 9 Nov 2017 14:53:43 +0000 (14:53 +0000)]
Reapply: Allow yaml2obj to order implicit sections for ELF
Summary:
This change allows yaml input to control the order of implicitly added sections
(`.symtab`, `.strtab`, `.shstrtab`). The order is controlled by adding a
placeholder section of the given name to the Sections field.
This change is to support changes in D39582, where it is desirable to control
the location of the `.dynsym` section.
This reapplied version fixes:
1. use of a function call within an assert
2. failing lld test which has an unnamed section
3. incorrect section count when given an unnamed section
Additionally, one more test to cover the unnamed section failure.
Reviewers: compnerd, jakehehrlich
Reviewed By: jakehehrlich
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39749
llvm-svn: 317789
Alex Bradbury [Thu, 9 Nov 2017 14:46:30 +0000 (14:46 +0000)]
[RISCV] MC layer support for the standard RV32M instruction set extension
llvm-svn: 317788
Jonas Hahnfeld [Thu, 9 Nov 2017 14:26:14 +0000 (14:26 +0000)]
[OMPT] Fix test cancel_parallel.c
If a parallel region is cancelled, execution resumes at the end
of the structured block. That is why this test cannot use the
"normal" macros that print right after inserting the label.
Instead it previously printed the addresses before the pragma
and swapped the checks compared to the other tests.
However, this does not work because FileCheck's '*' is greedy
so that RETURN_ADDRESS always matched the second address. This
makes the test fail when an "overflow" occurs and the first
address matches the value of codeptr_ra.
I discovered this on my MacBook but I'm unable to reproduce the
failure with the current version. Nevertheless we should fix this
problem to avoid that this test fails later after an unrelated change.
Differential Revision: https://reviews.llvm.org/D39708
llvm-svn: 317787
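
The greedy matching problem described in r317787 can be reproduced outside FileCheck. The sketch below uses std::regex as a stand-in (FileCheck has its own pattern syntax, and the line contents here are made up): with a leading greedy ".*", the capture group ends up on the last address in the line rather than the first.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return the first capture group of Pattern in Line, or "" if nothing matches.
std::string firstCapture(const std::string &Line, const std::string &Pattern) {
  std::smatch M;
  if (std::regex_search(Line, M, std::regex(Pattern)))
    return M[1].str();
  return "";
}

void greedyDemo() {
  // Hypothetical test output with two addresses on one line.
  const std::string Line = "codeptr_ra=0x1000 or 0x2000";
  // The greedy ".*" consumes as much as it can, so the group captures
  // the second address.
  assert(firstCapture(Line, ".*(0x[0-9a-f]+)") == "0x2000");
  // Without the greedy prefix the leftmost address is captured.
  assert(firstCapture(Line, "(0x[0-9a-f]+)") == "0x1000");
}
```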
Jonas Hahnfeld [Thu, 9 Nov 2017 14:26:12 +0000 (14:26 +0000)]
[OMPT] Add support for testing return addresses on POWER
Return addresses are determined based on the address of a label
that is inserted directly after a pragma / API call. In some cases
the tests can assume a known number of instructions between the
addresses. However, the instructions and their encoded lengths
depend on the target that the test is compiled on.
Firstly, this patch refactors the macro print_current_address() to
allow such target dependent modifications and adds information for
the observed instructions on POWER. Secondly, it adapts the related
macro print_fuzzy_address() to reuse much of "hacky" code and fixes
the used formatting strings in the printf() call. Finally, it also
adds documentation about how these macros are intended to work.
Differential Revision: https://reviews.llvm.org/D39699
llvm-svn: 317786
Andrew V. Tischenko [Thu, 9 Nov 2017 14:19:59 +0000 (14:19 +0000)]
Sched model improving on btver2: JFPU01 resource, vtestp* for xmm.
Differential Revision: https://reviews.llvm.org/D39802
llvm-svn: 317785
Krasimir Georgiev [Thu, 9 Nov 2017 13:22:03 +0000 (13:22 +0000)]
[clang-format] Fix a clang-tidy finding, NFC
llvm-svn: 317784
Krasimir Georgiev [Thu, 9 Nov 2017 13:19:14 +0000 (13:19 +0000)]
[clang-format] Fix argument name comment, NFC
llvm-svn: 317783
Andrew V. Tischenko [Thu, 9 Nov 2017 12:45:40 +0000 (12:45 +0000)]
Add -print-schedule scheduling comments to inline asm.
Differential Revision: https://reviews.llvm.org/D39728
llvm-svn: 317782
Simon Atanasyan [Thu, 9 Nov 2017 12:10:14 +0000 (12:10 +0000)]
[MIPS] Fix calculation of the R_MICROMIPS_LO16 / HI16 relocations
llvm-svn: 317781
Haojian Wu [Thu, 9 Nov 2017 11:30:04 +0000 (11:30 +0000)]
[clangd] Add rename support.
Summary:
Make clangd handle "textDocument/rename" request. The rename
functionality comes from the "local-rename" sub-tool of clang-refactor.
Currently clangd only supports local rename (only symbol occurrences in
the main file will be renamed).
Reviewers: sammccall, ilya-biryukov
Reviewed By: sammccall
Subscribers: cfe-commits, ioeric, arphaman, mgorny
Differential Revision: https://reviews.llvm.org/D39676
llvm-svn: 317780
Pavel Labath [Thu, 9 Nov 2017 10:43:16 +0000 (10:43 +0000)]
Simplify NativeProcessProtocol::GetArchitecture/GetByteOrder
Summary:
These functions used to return bool to signify whether they were able to
retrieve the data. This is redundant because the ArchSpec and ByteOrder
already have their own "invalid" states, *and* because both of the
current implementations (linux, netbsd) can always provide a valid
result.
This allows us to simplify bits of the code handling these values.
Reviewers: eugene, krytarowski
Subscribers: javed.absar, lldb-commits
Differential Revision: https://reviews.llvm.org/D39733
llvm-svn: 317779
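
As a rough sketch of the simplification in r317779: when the returned type already has an "invalid" state, a separate bool result carries no extra information. ArchSketch and both functions below are hypothetical stand-ins, not the LLDB classes or signatures.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a value type that has its own "invalid" state.
struct ArchSketch {
  std::string Triple; // an empty triple means "invalid"
  bool IsValid() const { return !Triple.empty(); }
};

// Old shape: a success flag plus an out-parameter.
bool GetArchitectureOld(ArchSketch &Arch) {
  Arch.Triple = "x86_64-unknown-linux";
  return true; // always succeeds, so the flag is redundant
}

// New shape: return the value; callers check IsValid() if they need to.
ArchSketch GetArchitectureNew() { return ArchSketch{"x86_64-unknown-linux"}; }

void archDemo() {
  ArchSketch Old;
  bool Ok = GetArchitectureOld(Old);
  assert(Ok && Old.IsValid());
  assert(GetArchitectureNew().IsValid());
  assert(!ArchSketch{}.IsValid()); // the default state is "invalid"
}
```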
Simon Atanasyan [Thu, 9 Nov 2017 10:42:22 +0000 (10:42 +0000)]
[MIPS] Setup less-significant bit in a symbol value in microMIPS thunks
The less-significant bit signals about microMIPS code for jump/branch instructions.
llvm-svn: 317778
Sam McCall [Thu, 9 Nov 2017 10:37:39 +0000 (10:37 +0000)]
[Tooling] Use FixedCompilationDatabase when `compile_flags.txt` is found.
Summary:
This is an alternative to JSONCompilationDatabase for simple projects that
don't use a build system such as CMake.
(You can also drop one in ~, to make your tools use e.g. C++11 by default)
There's no facility for varying flags per-source-file or per-machine.
Possibly this could be accommodated backwards-compatibly using cpp, but even if
not the simplicity seems worthwhile for the cases that are addressed.
Tested with clangd, works great! (requires clangd restart)
Reviewers: klimek
Subscribers: ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D39799
llvm-svn: 317777
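
For illustration, a compile_flags.txt for the C++11 case mentioned in r317777 might contain the single line "-std=c++11". The sketch below parses such a file's contents into an argument list, assuming one argument per line; the function name parseFlags and the treatment of blank lines and surrounding whitespace are assumptions, not the Tooling implementation.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Assumption: each non-blank line of the file is exactly one argument.
std::vector<std::string> parseFlags(const std::string &Contents) {
  std::vector<std::string> Args;
  std::istringstream In(Contents);
  std::string Line;
  while (std::getline(In, Line)) {
    // Trim surrounding whitespace (this also drops a trailing '\r').
    std::size_t B = Line.find_first_not_of(" \t\r");
    if (B == std::string::npos)
      continue; // skip blank lines
    std::size_t E = Line.find_last_not_of(" \t\r");
    Args.push_back(Line.substr(B, E - B + 1));
  }
  return Args;
}

void flagsDemo() {
  std::vector<std::string> Args = parseFlags("-std=c++11\n-Iinclude\n");
  assert(Args.size() == 2);
  assert(Args[0] == "-std=c++11");
}
```

Every source file gets the same flags, which matches the commit's note that flags cannot vary per file.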
John McCall [Thu, 9 Nov 2017 09:32:32 +0000 (09:32 +0000)]
Fix a bug with the use of __builtin_bzero in a conditional expression.
Patch by Bharathi Seshadri!
llvm-svn: 317776
Craig Topper [Thu, 9 Nov 2017 08:26:26 +0000 (08:26 +0000)]
[X86] Give priority to EVEX FMA instructions over FMA4 instructions.
No existing processor has both so it doesn't really matter what we do here. But we were previously just relying on pattern order which gave FMA4 priority.
llvm-svn: 317775
Vitaly Buka [Thu, 9 Nov 2017 07:53:06 +0000 (07:53 +0000)]
[sanitizers] Rename GetStackTraceWithPcBpAndContext
Name does not need to enumerate arguments.
llvm-svn: 317774
Vitaly Buka [Thu, 9 Nov 2017 07:48:53 +0000 (07:48 +0000)]
[msan] Add context argument into GetStackTrace
llvm-svn: 317773
Vitaly Buka [Thu, 9 Nov 2017 07:46:30 +0000 (07:46 +0000)]
[lsan] Add "static" to internal function
llvm-svn: 317772
Vitaly Buka [Thu, 9 Nov 2017 07:46:13 +0000 (07:46 +0000)]
Fix "default label in switch which covers all enumeration values" warning
llvm-svn: 317771
Sanjoy Das [Thu, 9 Nov 2017 06:31:33 +0000 (06:31 +0000)]
[SectionMemoryManager] Abstract out mmap, munmap, mprotect even more ; NFC
Summary:
This will let ORC JIT clients plug in custom logic for the mmap, munmap and
mprotect paths.
Reviewers: loladiro, dblaikie
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D39300
llvm-svn: 317770
Craig Topper [Thu, 9 Nov 2017 06:17:05 +0000 (06:17 +0000)]
[X86] Make X86ISD::FMADDS3 isel patterns commutable.
This was missed when FMADDS3 was split from X86ISD::FMADDS3_RND.
llvm-svn: 317769
Serguei Katkov [Thu, 9 Nov 2017 06:02:18 +0000 (06:02 +0000)]
[GVN PRE] Patch the source for Phi node in PRE
We must patch all existing incoming values of the Phi node;
otherwise we may see poison
where the program does not expect to see it.
This is similar to what GVN does.
The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an
example of a wrong optimization done by jump threading because
GVN PRE did not patch the existing incoming value.
Reviewers: mkazantsev, wmi, dberlin, davide
Reviewed By: dberlin
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D39637
llvm-svn: 317768
Kostya Serebryany [Thu, 9 Nov 2017 05:49:28 +0000 (05:49 +0000)]
[libFuzzer] allow merge to resume after being preempted
llvm-svn: 317767
Craig Topper [Thu, 9 Nov 2017 04:10:46 +0000 (04:10 +0000)]
[X86] Rename the VEX scalar fma builtins to end with a '3' to match gcc
I think we need to use different builtins for the FMA4 instructions since those instructions zero the upper bits and FMA3 instructions pass the bits through.
So this moves the existing builtins to be the FMA3 versions. New versions will be added for FMA4.
llvm-svn: 317766
Craig Topper [Thu, 9 Nov 2017 04:10:42 +0000 (04:10 +0000)]
[X86] Rename the VEX scalar fma builtins to end with a '3' to match gcc
I think we need to use different builtins for the FMA4 instructions since those instructions zero the upper bits and FMA3 instructions pass the bits through.
So this moves the existing builtins to be the FMA3 versions. New versions will be added for FMA4.
llvm-svn: 317765
Vedant Kumar [Thu, 9 Nov 2017 02:50:24 +0000 (02:50 +0000)]
[llvm-cov] Fix more -path-equivalence test bugs
llvm-svn: 317764
Vedant Kumar [Thu, 9 Nov 2017 02:42:34 +0000 (02:42 +0000)]
[llvm-cov] Fix a -path-equivalence bug in a test
llvm-svn: 317763
Vedant Kumar [Thu, 9 Nov 2017 02:33:44 +0000 (02:33 +0000)]
[llvm-cov] Don't render empty region marker lines
This fixes an issue where llvm-cov prints an empty line, thinking it
needs to display region markers, when it actually doesn't.
llvm-svn: 317762
Vedant Kumar [Thu, 9 Nov 2017 02:33:43 +0000 (02:33 +0000)]
[Coverage] Use the wrapped segment when a line has entry segments
We've worked around bugs in the frontend by ignoring the count from
wrapped segments when a line has at least one region entry segment.
Those frontend bugs are now fixed, so it's time to regenerate the
checked-in covmapping files and remove the workaround.
llvm-svn: 317761
Vedant Kumar [Thu, 9 Nov 2017 02:33:40 +0000 (02:33 +0000)]
[Coverage] Emit deferred regions in headers
There are some limitations with emitting regions in macro expansions
because we don't gather file IDs within the expansions. Fix the check
that prevents us from emitting deferred regions in expansions to make an
exception for headers, which is something we can handle.
rdar://35373009
llvm-svn: 317760
Vedant Kumar [Thu, 9 Nov 2017 02:33:39 +0000 (02:33 +0000)]
[Coverage] Complete top-level deferred regions before labels
The area immediately after a terminated region in the function top-level
should have the same count as the label it precedes.
This solves another problem with wrapped segments. Consider:
1| a:
2| return 0;
3| b:
4| return 1;
Without a gap area starting after the first return, the wrapped segment
from line 2 would make it look like line 3 is executed, when it's not.
rdar://35373009
llvm-svn: 317759
Vedant Kumar [Thu, 9 Nov 2017 02:33:38 +0000 (02:33 +0000)]
[Coverage] Emit a gap area after if conditions
The area immediately after the closing right-paren of an if condition
should have a count equal to the 'then' block's count. Use a gap region
to set this count, so that region highlighting for the 'then' block
remains precise.
This solves a problem we have with wrapped segments. Consider:
1| if (false)
2| foo();
Without a gap area starting after the condition, the wrapped segment
from line 1 would make it look like line 2 is executed, when it's not.
rdar://35373009
llvm-svn: 317758
Peter Collingbourne [Thu, 9 Nov 2017 02:22:07 +0000 (02:22 +0000)]
ubsan: Allow programs to use setenv to configure ubsan_standalone.
Previously ubsan_standalone used the GetEnv function to read the
environment variables UBSAN_OPTIONS and UBSAN_SYMBOLIZER_PATH. The
problem with GetEnv is that it does not respect changes to the
environment variables made using the libc setenv function, which
prevents clients from setting environment variables to configure
ubsan before loading ubsan-instrumented libraries.
The reason why we have GetEnv is that some runtimes need to read
environment variables while they initialize using .preinit_array,
and getenv does not work while .preinit_array functions are being
called. However, it is unnecessary for ubsan_standalone to initialize
that early. So this change switches ubsan_standalone to using getenv
and removes the .preinit_array entry. The static version of the runtime
still ends up being initialized using a C++ constructor that exists
to support the shared runtime.
Differential Revision: https://reviews.llvm.org/D39827
llvm-svn: 317757
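
A minimal sketch of the client-side pattern that r317757 enables: set UBSAN_OPTIONS with the libc setenv function before any ubsan-instrumented library is loaded. The helper name configureUbsanOptions is hypothetical, setenv is POSIX-only, and the actual loading step (for example via dlopen) is left out.

```cpp
#include <cassert>
#include <cstdlib>  // std::getenv
#include <cstring>  // std::strcmp
#include <stdlib.h> // setenv (POSIX)

// Hypothetical helper: publish the options through the libc environment
// so that a runtime calling getenv later can see them.
bool configureUbsanOptions(const char *Options) {
  return setenv("UBSAN_OPTIONS", Options, /*overwrite=*/1) == 0;
}

void ubsanEnvDemo() {
  assert(configureUbsanOptions("print_stacktrace=1"));
  // getenv observes the change made by setenv.
  const char *Value = std::getenv("UBSAN_OPTIONS");
  assert(Value && std::strcmp(Value, "print_stacktrace=1") == 0);
  // A real client would load its ubsan-instrumented library at this point.
}
```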
Kostya Serebryany [Thu, 9 Nov 2017 02:13:43 +0000 (02:13 +0000)]
[libFuzzer] mechanically simplify a test, NFC
llvm-svn: 317756
Marek Olsak [Thu, 9 Nov 2017 01:52:55 +0000 (01:52 +0000)]
AMDGPU: Merge BUFFER_STORE_DWORD_OFFEN/OFFSET into x2, x4
Summary:
Only 56 shaders (out of 48486) are affected.
Totals from affected shaders (changed stats only):
SGPRS: 2420 -> 2460 (1.65 %)
Spilled VGPRs: 94 -> 112 (19.15 %)
Scratch size: 524 -> 528 (0.76 %) dwords per thread
Code Size: 187400 -> 184992 (-1.28 %) bytes
One DiRT Showdown shader spills 6 more VGPRs.
One Grid Autosport shader spills 12 more VGPRs.
The other 54 shaders only have a decrease in code size.
(I'm ignoring the SGPR noise)
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D39012
llvm-svn: 317755
Marek Olsak [Thu, 9 Nov 2017 01:52:48 +0000 (01:52 +0000)]
AMDGPU: Lower buffer store and atomic intrinsics manually
Summary:
Without this, SIMemoryLegalizer inserts s_waitcnt vmcnt(0) before every
buffer store and atomic instruction.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D39060
llvm-svn: 317754
Marek Olsak [Thu, 9 Nov 2017 01:52:36 +0000 (01:52 +0000)]
AMDGPU: Merge BUFFER_LOAD_DWORD_OFFSET into x2, x4
Summary: Only 3 (out of 48486) shaders are affected.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D38951
llvm-svn: 317753
Marek Olsak [Thu, 9 Nov 2017 01:52:30 +0000 (01:52 +0000)]
AMDGPU: Merge BUFFER_LOAD_DWORD_OFFEN into x2, x4
Summary:
-9.9% code size decrease in affected shaders.
Totals (changed stats only):
SGPRS: 2151462 -> 2170646 (0.89 %)
VGPRS: 1634612 -> 1640288 (0.35 %)
Spilled SGPRs: 8942 -> 8940 (-0.02 %)
Code Size: 52940672 -> 51727288 (-2.29 %) bytes
Max Waves: 373066 -> 371718 (-0.36 %)
Totals from affected shaders:
SGPRS: 283520 -> 302704 (6.77 %)
VGPRS: 227632 -> 233308 (2.49 %)
Spilled SGPRs: 3966 -> 3964 (-0.05 %)
Code Size: 12203080 -> 10989696 (-9.94 %) bytes
Max Waves: 44070 -> 42722 (-3.06 %)
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D38950
llvm-svn: 317752
Marek Olsak [Thu, 9 Nov 2017 01:52:23 +0000 (01:52 +0000)]
AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4
Summary:
Only constant offsets (*_IMM opcodes) are merged.
It reuses code for LDS load/store merging.
It relies on the scheduler to group loads.
The results are mixed, I think they are mostly positive. Most shaders are
affected, so here are total stats only:
SGPRS: 2072198 -> 2151462 (3.83 %)
VGPRS: 1628024 -> 1634612 (0.40 %)
Spilled SGPRs: 7883 -> 8942 (13.43 %)
Spilled VGPRs: 97 -> 101 (4.12 %)
Scratch size: 1488 -> 1492 (0.27 %) dwords per thread
Code Size: 60222620 -> 52940672 (-12.09 %) bytes
Max Waves: 374337 -> 373066 (-0.34 %)
There is a 13.4% increase in SGPR spilling, DiRT Showdown spills a few more
VGPRs (now 37), but there is a 12% decrease in code size.
These are the new stats for SGPR spilling. We already spill a lot of SGPRs,
so it's uncertain whether more spilling will make any difference since
SGPRs are always spilled to VGPRs:
SGPR SPILLING APPS     Shaders  SpillSGPR  AvgPerSh
alien_isolation           2938        100       0.0
batman_arkham_origins      589          6       0.0
bioshock-infinite         1769          4       0.0
borderlands2              3968         22       0.0
counter_strike_glob..     1142         60       0.1
deus_ex_mankind_div..     1410         79       0.1
dirt-showdown              533          4       0.0
dirt_rally                 364       1163       3.2
divinity                  1052          2       0.0
dota2                     1747          7       0.0
f1-2015                    776       1515       2.0
grid_autosport            1767       1505       0.9
hitman                    1413        273       0.2
left_4_dead_2             1762          4       0.0
life_is_strange           1296         26       0.0
mad_max                    358         96       0.3
metro_2033_redux          2670         60       0.0
payday2                   1362         22       0.0
portal                     474          3       0.0
saints_row_iv             1704          8       0.0
serious_sam_3_bfe          392       1348       3.4
shadow_of_mordor          1418         12       0.0
shadow_warrior            3956        239       0.1
talos_principle            324       1735       5.4
thea                       172         17       0.1
tomb_raider               1449        215       0.1
total_war_warhammer        242         56       0.2
ue4_effects_cave           295         55       0.2
ue4_elemental              572         12       0.0
unigine_tropics            210         56       0.3
unigine_valley             278        152       0.5
victor_vran               1262         84       0.1
yofrankie                   82          2       0.0
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D38949
llvm-svn: 317751
Marek Olsak [Thu, 9 Nov 2017 01:52:17 +0000 (01:52 +0000)]
AMDGPU: Fold immediate offset into BUFFER_LOAD_DWORD lowered from SMEM
Summary:
-5.3% code size in affected shaders.
Changed stats only:
48486 shaders in 30489 tests
Totals:
SGPRS: 2086406 -> 2072430 (-0.67 %)
VGPRS: 1626872 -> 1627960 (0.07 %)
Spilled SGPRs: 7865 -> 7912 (0.60 %)
Code Size: 60978060 -> 60188764 (-1.29 %) bytes
Max Waves: 374530 -> 374342 (-0.05 %)
Totals from affected shaders:
SGPRS: 299664 -> 285688 (-4.66 %)
VGPRS: 233844 -> 234932 (0.47 %)
Spilled SGPRs: 3959 -> 4006 (1.19 %)
Code Size: 14905272 -> 14115976 (-5.30 %) bytes
Max Waves: 46202 -> 46014 (-0.41 %)
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D38915
llvm-svn: 317750
Kostya Serebryany [Thu, 9 Nov 2017 01:45:59 +0000 (01:45 +0000)]
[libFuzzer] fix a test (and hopefully, the bot)
llvm-svn: 317749
Craig Topper [Thu, 9 Nov 2017 01:06:47 +0000 (01:06 +0000)]
[X86] Make sure we don't read too many operands from X86ISD::FMADDS1/FMADDS3 nodes when doing FNEG combine.
r317453 added new ISD nodes without rounding modes that were added to an existing if/else chain. But all the previous nodes handled there included a rounding mode. The final code after this if/else chain expected an extra operand that isn't present for the new nodes.
llvm-svn: 317748
Kostya Serebryany [Thu, 9 Nov 2017 01:05:29 +0000 (01:05 +0000)]
[libFuzzer] allow user to specify the merge control file
llvm-svn: 317747
Petr Hosek [Thu, 9 Nov 2017 00:21:29 +0000 (00:21 +0000)]
[CMake] Passthrough CMAKE_SYSROOT to external projects
Differential Revision: https://reviews.llvm.org/D39029
llvm-svn: 317744
Mitch Phillips [Thu, 9 Nov 2017 00:18:31 +0000 (00:18 +0000)]
[cfi-verify] Adds blacklist blame behaviour to cfi-verify.
Adds the blacklist behaviour to llvm-cfi-verify. It now calculates which lines caused expected failures in the blacklist and reports the number of affected indirect CF instructions for each blacklist entry.
Also moved DWARF checking after instruction analysis to improve performance significantly - unrolling the inlining stack is expensive.
Reviewers: vlad.tsyrklevich
Subscribers: aprantl, pcc, kcc, llvm-commits
Differential Revision: https://reviews.llvm.org/D39750
llvm-svn: 317743
Petr Hosek [Wed, 8 Nov 2017 23:44:27 +0000 (23:44 +0000)]
[CMake][runtimes] Fix the variable name
This typo causes the llvm-lit path resolution to fail.
Differential Revision: https://reviews.llvm.org/D39811
llvm-svn: 317742
Simon Atanasyan [Wed, 8 Nov 2017 23:34:34 +0000 (23:34 +0000)]
[MIPS] Setup less-significant bit in the .got and .got.plt entries in case of microMIPS code
The less-significant bit signals about microMIPS code for jump/branch
instructions.
llvm-svn: 317741
Rafael Espindola [Wed, 8 Nov 2017 23:07:32 +0000 (23:07 +0000)]
Handle "-" in tryCreateFile.
Otherwise we would fail with -M if we didn't have write
permissions to the current directory.
llvm-svn: 317740
Rui Ueyama [Wed, 8 Nov 2017 22:57:48 +0000 (22:57 +0000)]
[FileOutputBuffer] Move factory methods out of their classes.
InMemoryBuffer and OnDiskBuffer classes have both factory methods and
public constructors, and that looks a bit odd. This patch makes factory
methods non-member functions to fix it.
Differential Revision: https://reviews.llvm.org/D39693
llvm-svn: 317739
Evgeniy Stepanov [Wed, 8 Nov 2017 22:51:09 +0000 (22:51 +0000)]
[Sanitizers, CMake] Also use version script for libclang_rt.asan-i386.so
When building LLVM on x86_64-pc-linux-gnu (Fedora 25) with the bundled gcc 6.4.1
which uses gld 2.26.1-1.fc25, the dynamic/Asan-i386-calls-Dynamic-Test and
dynamic/Asan-i386-inline-Dynamic-Test tests failed to link with
/usr/bin/ld: /var/scratch/gcc/llvm/dist/lib/clang/6.0.0/lib/linux/libclang_rt.asan-i386.so: fork: invalid version 21 (max 0)
/var/scratch/gcc/llvm/dist/lib/clang/6.0.0/lib/linux/libclang_rt.asan-i386.so: error adding symbols: Bad value
I tried building with a self-compiled gcc 7.1.0 using gld 2.28, but the error remained.
It seems the error has been hit before (cf. https://reviews.llvm.org/rL314085), but
no real explanation has been found.
However, the problem goes away when linking the i386 libclang_rt.asan with a version
script just like every other variant is. Not using the version script in this single case
dates back to the initial introduction of the version script in r236551, but this change
was just checked in without any explanation AFAICT.
Since I've not found any other workaround and no reason for not always using the
version script, I propose to do so.
Tested on x86_64-pc-linux-gnu.
Patch by Rainer Orth.
Differential Revision: https://reviews.llvm.org/D39795
llvm-svn: 317738
Alex Lorenz [Wed, 8 Nov 2017 22:47:15 +0000 (22:47 +0000)]
Remove redundant copy-pasted comment in test file from r317736
llvm-svn: 317737
Alex Lorenz [Wed, 8 Nov 2017 22:44:34 +0000 (22:44 +0000)]
[ObjC] Fix function signature handling for blocks literals with attributes
Block literals can have a type with attributes in its signature, e.g.
ns_returns_retained. The code that inspected the type loc of the block when
declaring its parameters didn't account for this fact, and only looked through
paren type loc. This commit ensures that getAsAdjusted is used instead of
IgnoreParens to find the block's FunctionProtoTypeLoc. This ensures that
block parameters are declared correctly in the block and avoids the
'undeclared identifier' error.
rdar://35416160
llvm-svn: 317736
Kamil Rytarowski [Wed, 8 Nov 2017 22:34:17 +0000 (22:34 +0000)]
Correct atexit(3) support in TSan/NetBSD
Summary:
The NetBSD specific implementation of cxa_atexit() does not
preserve the 2nd argument if dso is equal to NULL.
Changes:
- Split paths of handling intercepted __cxa_atexit() and atexit(3).
This affects all supported Operating Systems.
- Add a local stack-like structure to hold the __cxa_atexit() context.
atexit(3) is documented in the C standard as calling callbacks from the
most recently registered to the oldest entry. This path also fixes a potential ABI
problem of passing an argument to a function from the atexit(3)
callback mechanism.
- Add new test to ensure LIFO style of atexit(3) callbacks: atexit3.cc
Proposal to change the behavior of __cxa_atexit() in NetBSD has been rejected.
With the above changes TSan/NetBSD with the current tsan_interceptors.cc
can bootstrap into operation.
Sponsored by <The NetBSD Foundation>
Reviewers: vitalybuka, dvyukov, joerg, kcc, eugenis
Reviewed By: dvyukov
Subscribers: kubamracek, llvm-commits, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D39619
llvm-svn: 317735
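
The LIFO ordering that the new atexit3.cc test checks in r317735 can be modelled with a small stack of callbacks. This is only a sketch of the ordering, not the TSan interceptor code; AtExitStackSketch and runOrderDemo are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stack-like holder for registered exit callbacks.
class AtExitStackSketch {
public:
  void Register(std::function<void()> Fn) { Handlers.push_back(std::move(Fn)); }

  // Run callbacks in reverse order of registration (last in, first out).
  void RunAll() {
    while (!Handlers.empty()) {
      std::function<void()> Fn = std::move(Handlers.back());
      Handlers.pop_back();
      Fn();
    }
  }

private:
  std::vector<std::function<void()>> Handlers;
};

// Register three callbacks and report the order in which they ran.
std::string runOrderDemo() {
  std::string Order;
  AtExitStackSketch Stack;
  Stack.Register([&Order] { Order += "first "; });
  Stack.Register([&Order] { Order += "second "; });
  Stack.Register([&Order] { Order += "third "; });
  Stack.RunAll();
  return Order;
}

void atexitOrderDemo() { assert(runOrderDemo() == "third second first "); }
```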
Volodymyr Sapsai [Wed, 8 Nov 2017 22:30:29 +0000 (22:30 +0000)]
[libcxx] Mark test cxa_deleted_virtual.pass.cpp as failing for previous libcxx versions.
r313500 added a fix for undefined "___cxa_deleted_virtual" symbol.
Previous libcxx versions don't have the fix and corresponding test
should be failing.
rdar://problem/34521053
Reviewers: EricWF, mclow.lists, ahatanak
Reviewed By: ahatanak
Subscribers: mehdi_amini, cfe-commits
Differential Revision: https://reviews.llvm.org/D39776
llvm-svn: 317734
Craig Topper [Wed, 8 Nov 2017 22:26:41 +0000 (22:26 +0000)]
[X86] X86MaskedGatherSDNode shouldn't inherit from MaskedGatherScatterSDNode
The classof implementation in MaskedGatherScatterSDNode doesn't consider X86MaskedGatherSDNode so it's misleading.
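A toy model of why such an inheritance is misleading, assuming an LLVM-style `classof` scheme. All names here (`Kind`, `Node`, `GatherScatterNode`, `TargetGatherNode`, `is_a`) are illustrative stand-ins, not the real SelectionDAG classes: the derived type is a subclass in C++ terms, yet an isa-style query against the base says no, because the base's `classof` does not list the derived kind.

```cpp
#include <cassert>

// Toy node kinds; illustrative only, not LLVM opcodes.
enum class Kind { Load, Gather, Scatter, TargetGather };

struct Node {
  explicit Node(Kind k) : kind(k) {}
  Kind kind;
};

// Base class whose classof only recognizes the generic gather/scatter kinds.
struct GatherScatterNode : Node {
  explicit GatherScatterNode(Kind k) : Node(k) {}
  static bool classof(const Node *n) {
    return n->kind == Kind::Gather || n->kind == Kind::Scatter;
  }
};

// Target-specific node that inherits from the base but carries a kind the
// base's classof does not know about.
struct TargetGatherNode : GatherScatterNode {
  TargetGatherNode() : GatherScatterNode(Kind::TargetGather) {}
  static bool classof(const Node *n) { return n->kind == Kind::TargetGather; }
};

// Minimal isa-style query that defers to the target type's classof.
template <typename To>
bool is_a(const Node *n) {
  assert(n && "is_a requires a non-null node");
  return To::classof(n);
}
```

In this sketch `is_a<GatherScatterNode>` returns false for a `TargetGatherNode`, even though the C++ type system treats it as one. Dropping the inheritance removes that mismatch.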
llvm-svn: 317733
Craig Topper [Wed, 8 Nov 2017 22:26:39 +0000 (22:26 +0000)]
[X86] Preserve memory refs when folding loads into divides.
This is similar to what we already do for multiplies. Without this we can't unfold and hoist an invariant load.
llvm-svn: 317732
Craig Topper [Wed, 8 Nov 2017 22:26:37 +0000 (22:26 +0000)]
[X86] Remove an if check on the result of a cast. NFC
cast takes a non-null input and produces a non-null output. So this if can never fail.
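The reasoning can be illustrated with a simplified model of the two cast styles. `checked_cast` and `dynamic_kind_cast` below are hypothetical helpers written for this sketch, not LLVM's templates. The first asserts on a mismatch and therefore never hands back null for valid input; the second reports a mismatch by returning null, so only its result is worth testing.

```cpp
#include <cassert>

enum class Kind { Add, Load };

struct Node {
  explicit Node(Kind k) : kind(k) {}
  Kind kind;
};

struct LoadNode : Node {
  LoadNode() : Node(Kind::Load) {}
  static bool classof(const Node *n) { return n->kind == Kind::Load; }
};

// Asserting cast: the input must be non-null and of the requested kind, so
// the returned pointer is never null.
template <typename To>
To *checked_cast(Node *n) {
  assert(n && To::classof(n) && "checked_cast on null or incompatible node");
  return static_cast<To *>(n);
}

// Checking cast: returns null on a kind mismatch, so callers must test it.
template <typename To>
To *dynamic_kind_cast(Node *n) {
  return (n && To::classof(n)) ? static_cast<To *>(n) : nullptr;
}
```

Writing `if (auto *L = checked_cast<LoadNode>(n))` is therefore redundant in this model: either the assertion fires or the condition is true.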
llvm-svn: 317731
Adrian Prantl [Wed, 8 Nov 2017 22:04:43 +0000 (22:04 +0000)]
Let replaceVTableHolder accept any type.
In Rust, a trait can be implemented for any type, and if a trait
object pointer is used for the type, then a virtual table will be
emitted for that trait/type combination.
We would like debuggers to be able to inspect trait objects, which
requires finding the concrete type associated with a given vtable.
This patch changes LLVM so that any type can be passed to
replaceVTableHolder. This allows the Rust compiler to emit the needed
debug info -- associating a vtable with the concrete type for which it
was emitted.
This is a DWARF extension: DWARF only specifies the meaning of
DW_AT_containing_type in one specific situation. This style of DWARF
extension is routine, though, and LLVM already has one such case for
DW_AT_containing_type.
Patch by Tom Tromey!
Differential Revision: https://reviews.llvm.org/D39503
llvm-svn: 317730
Dan Gohman [Wed, 8 Nov 2017 21:59:51 +0000 (21:59 +0000)]
Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].
Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.
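A toy model of the idea, under loud assumptions: `Inst`, `pure_inst`, `sideeffect_marker`, `may_delete_loop` and `blocks_hoisting` are invented for this sketch and do not correspond to real LLVM passes or APIs. It shows the two halves of the description: a marker that looks like a side effect to a loop-deletion check, and another transformation that has been taught to skip the marker.

```cpp
#include <cassert>
#include <vector>

// Toy instruction carrying only the properties this sketch needs.
struct Inst {
  bool has_side_effects;
  bool is_sideeffect_marker;  // stands in for a call to llvm.sideeffect()
};

// Hypothetical helpers for building toy instructions.
inline Inst pure_inst() { return {false, false}; }
inline Inst sideeffect_marker() { return {true, true}; }

// In this model a loop may be deleted only if nothing in its body has side
// effects. The marker counts as a side effect here, so it keeps the loop.
inline bool may_delete_loop(const std::vector<Inst> &body) {
  for (const Inst &inst : body)
    if (inst.has_side_effects)
      return false;
  return true;
}

// A transformation "taught to ignore" the marker: the marker alone does not
// block it, while any other side-effecting instruction still does.
inline bool blocks_hoisting(const Inst &inst) {
  return inst.has_side_effects && !inst.is_sideeffect_marker;
}
```

A loop body containing only `pure_inst()` is deletable in this model, while adding `sideeffect_marker()` keeps it alive without blocking the hoisting check.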
As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.
[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html
Differential Revision: https://reviews.llvm.org/D38336
llvm-svn: 317729
Teresa Johnson [Wed, 8 Nov 2017 21:48:27 +0000 (21:48 +0000)]
[ThinLTO] New test needs to require LTO
Fix buildbot failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/5262/steps/annotate/logs/stdio
llvm-svn: 317728
Alex Lorenz [Wed, 8 Nov 2017 21:33:15 +0000 (21:33 +0000)]
[ObjC] Boxed strings should use the nullability from stringWithUTF8String's return type
Objective-C NSString has a class method stringWithUTF8String that creates a new
NSString from a C string. Objective-C box expression @(...) can be used to
create an NSString instead of invoking the stringWithUTF8String method directly
(The compiler lowers it down to the invocation though). This commit ensures that
the type of @(string-value) gets the same nullability attributes as the return
type of stringWithUTF8String to ensure that the diagnostics are consistent
between the two.
rdar://33847186
Differential Revision: https://reviews.llvm.org/D39762
llvm-svn: 317727
Reid Kleckner [Wed, 8 Nov 2017 21:31:14 +0000 (21:31 +0000)]
Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317579, originally committed as r317100.
There is a design issue with marking CFI instructions duplicatable. Not
all targets support the CFIInstrInserter pass, and targets like Darwin
can't cope with duplicated prologue setup CFI instructions. The compact
unwind info emission fails.
When the following code is compiled for arm64 on Mac at -O3, the CFI
instructions end up getting tail duplicated, which causes compact unwind
info emission to fail:
int a, c, d, e, f, g, h, i, j, k, l, m;
void n(int o, int *b) {
  if (g)
    f = 0;
  for (; f < o; f++) {
    m = a;
    if (l > j * k > i)
      j = i = k = d;
    h = b[c] - e;
  }
}
We get assembly that looks like this:
; BB#1: ; %if.then
Lloh3:
  adrp x9, _f@GOTPAGE
Lloh4:
  ldr x9, [x9, _f@GOTPAGEOFF]
  mov w8, wzr
Lloh5:
  str wzr, [x9]
  stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill
  .cfi_def_cfa_offset 16
  .cfi_offset w19, -8
  .cfi_offset w20, -16
  cmp w8, w0
  b.lt LBB0_3
  b LBB0_7
LBB0_2: ; %entry.if.end_crit_edge
Lloh6:
  adrp x8, _f@GOTPAGE
Lloh7:
  ldr x8, [x8, _f@GOTPAGEOFF]
Lloh8:
  ldr w8, [x8]
  stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill
  .cfi_def_cfa_offset 16
  .cfi_offset w19, -8
  .cfi_offset w20, -16
  cmp w8, w0
  b.ge LBB0_7
LBB0_3: ; %for.body.lr.ph
Note the multiple .cfi_def* directives. Compact unwind info emission
can't handle that.
llvm-svn: 317726
Vedant Kumar [Wed, 8 Nov 2017 21:26:40 +0000 (21:26 +0000)]
[cmake] Allow LLVM_BUILD_INSTRUMENTED to be set to IR or Frontend
- This deprecates LLVM_ENABLE_IR_PGO but keeps it around for now.
- Errors out when LLVM_BUILD_INSTRUMENTED and LLVM_BUILD_INSTRUMENTED_COVERAGE
are both set.
Motivated by bogner's post-commit review of r313770.
llvm-svn: 317725
Rafael Espindola [Wed, 8 Nov 2017 21:15:21 +0000 (21:15 +0000)]
Make sure an error is always handled.
llvm-svn: 317724
Teresa Johnson [Wed, 8 Nov 2017 20:27:28 +0000 (20:27 +0000)]
[ThinLTO] Ensure sanitizer passes are run
Recommit new test as linux-only.
llvm-svn: 317723
Marshall Clow [Wed, 8 Nov 2017 20:25:47 +0000 (20:25 +0000)]
Added include for <cassert>
llvm-svn: 317722
Alex Bradbury [Wed, 8 Nov 2017 20:19:16 +0000 (20:19 +0000)]
Set hasSideEffects=0 for PHI and fix affected passes
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred
as 1. D37065 sets the previously inferred properties explicitly. This patch sets
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has
been updated so it still returns false for PHI.
Additionally, HexagonBitSimplify relied on a PHI node having the
hasUnmodeledSideEffects property. This patch fixes that assumption.
Differential Revision: https://reviews.llvm.org/D37097
llvm-svn: 317721
Craig Topper [Wed, 8 Nov 2017 20:17:33 +0000 (20:17 +0000)]
[X86] Correct the implementation of BEXTR load folding to use the shift as the parent node and pass a separate root.
We were calling tryFoldLoad with the 'and' node as both the root and the parent node of the load. But the parent of the load should be the shift that precedes the 'and', while the 'and' node is correctly the root node.
To fix this I had to make tryFoldLoad take a separate use and root input. I've added a convenience version with the old signature to avoid updating the other call sites.
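The shape of that change can be sketched as a pair of overloads. This is a simplified illustration, not the real signature: `Node`, `FoldRequest` and `try_fold_load` are hypothetical, and the actual tryFoldLoad takes more parameters and reports success rather than returning a record. The point is only that the general form takes root and parent separately, while the old-style form forwards the parent as the root.

```cpp
#include <cassert>

struct Node {
  int id;
};

// Records which nodes the general entry point received, so the forwarding
// done by the convenience overload is observable.
struct FoldRequest {
  const Node *root;
  const Node *parent;
  const Node *load;
};

// General form: root and parent are passed separately.
inline FoldRequest try_fold_load(const Node *root, const Node *parent,
                                 const Node *load) {
  assert(root && parent && load);
  return FoldRequest{root, parent, load};
}

// Convenience form with the old shape: the parent doubles as the root, so
// existing call sites do not need to change.
inline FoldRequest try_fold_load(const Node *parent, const Node *load) {
  return try_fold_load(parent, parent, load);
}
```

In the BEXTR case described above, the three-argument form would be called with the 'and' node as root and the shift as parent; other callers keep using the two-argument form.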
llvm-svn: 317720
Alexey Bataev [Wed, 8 Nov 2017 20:16:14 +0000 (20:16 +0000)]
[OPENMP] Codegen for `#pragma omp target parallel for`.
llvm-svn: 317719
Sam Clegg [Wed, 8 Nov 2017 20:14:06 +0000 (20:14 +0000)]
[WebAssembly] Update test expectations
I believe these were fixed in rL317707
Differential Revision: https://reviews.llvm.org/D39813
llvm-svn: 317718
Teresa Johnson [Wed, 8 Nov 2017 20:08:15 +0000 (20:08 +0000)]
Revert "[ThinLTO] Ensure sanitizer passes are run"
This reverts commit r317715. It failed a Windows buildbot since
ThinLTO is presumably not supported, leading to a corrupt file error
on the object file:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19395/steps/run%20check-asan/logs/stdio
Will re-commit the new ThinLTO part of the test to a linux only test
file.
llvm-svn: 317717
David L. Jones [Wed, 8 Nov 2017 20:03:11 +0000 (20:03 +0000)]
Add a missing "REQUIRES: system-windows" to a Windows-only test.
This un-breaks builds on other platforms. Otherwise, they fail due to warnings like:
warning: unable to find a Visual Studio installation; try running Clang from a developer command prompt [-Wmsvc-not-found]
llvm-svn: 317716
Teresa Johnson [Wed, 8 Nov 2017 19:46:25 +0000 (19:46 +0000)]
[ThinLTO] Ensure sanitizer passes are run
Summary:
Test fix to pass manager for ThinLTO.
Depends on D39565.
Reviewers: pcc
Subscribers: kubamracek, mehdi_amini, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D39566
llvm-svn: 317715