Aaron Ballman [Tue, 19 Jul 2016 17:46:55 +0000 (17:46 +0000)]
This code block breaks the docs build (lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11920/steps/docs-llvm-html/logs/stdio), but I cannot see anything immediately wrong with it and cannot reproduce the diagnostic locally. Setting the code highlighting to none instead of nasm to hopefully get the bot stumbling back towards green.
llvm-svn: 275998
Ed Maste [Tue, 19 Jul 2016 17:28:38 +0000 (17:28 +0000)]
libunwind: sync some comments with NetBSD's version
NetBSD's system unwinder is a modified version of LLVM's libunwind.
Slightly reduce diffs by updating comments to match theirs where
appropriate.
llvm-svn: 275997
Ed Maste [Tue, 19 Jul 2016 17:15:50 +0000 (17:15 +0000)]
libunwind: Use conventional DWARF capitalization in comments and errors
llvm-svn: 275996
Sanjay Patel [Tue, 19 Jul 2016 17:07:35 +0000 (17:07 +0000)]
add tests related to PR28466
llvm-svn: 275995
Simon Pilgrim [Tue, 19 Jul 2016 17:04:28 +0000 (17:04 +0000)]
[X86][AVX512] Added AVX512 subvector broadcast tests
llvm-svn: 275994
Matthias Gehre [Tue, 19 Jul 2016 17:02:54 +0000 (17:02 +0000)]
cppcoreguidelines-pro-bounds-constant-array-index: ignore implicit constructor
Summary:
The code
struct A {
int x[3];
};
gets a compiler-generated copy constructor that uses ArraySubscriptExpr (see below).
Previously, the check would generate a warning on that copy constructor.
This commit disables the warning on implicitly generated code.
AST:
|-CXXConstructorDecl 0x337b3c8 <col:8> col:8 implicit used constexpr A 'void (const struct A &) noexcept' inline
| |-ParmVarDecl 0x337b510 <col:8> col:8 used 'const struct A &'
| |-CXXCtorInitializer Field 0x3379238 'x' 'int [3]'
| | `-ImplicitCastExpr 0x337e158 <col:8> 'int' <LValueToRValue>
| | `-ArraySubscriptExpr 0x337e130 <col:8> 'const int' lvalue
| | |-ImplicitCastExpr 0x337e118 <col:8> 'const int *' <ArrayToPointerDecay>
| | | `-MemberExpr 0x337dfc8 <col:8> 'int const[3]' lvalue .x 0x3379238
| | | `-DeclRefExpr 0x337dfa0 <col:8> 'const struct A' lvalue ParmVar 0x337b510 '' 'const struct A &'
| | `-ImplicitCastExpr 0x337e098 <col:8> 'unsigned long' <LValueToRValue>
| | `-DeclRefExpr 0x337e070 <col:8> 'unsigned long' lvalue Var 0x337e010 '__i0' 'unsigned long'
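For illustration, a minimal sketch of the situation described above. The helper names `copyOf` and `elementAt` are hypothetical and not part of the patch. The first function only triggers the implicit copy constructor, so the only array subscript lives in compiler-generated code; the second contains a user-written subscript with a non-constant index, which is the kind of code the check is aimed at.

```cpp
#include <cassert>

struct A {
  int x[3];
};

// Copies `a` through the implicit (compiler-generated) copy constructor.
// No subscript is written by the user here; the element-wise copy of `x`
// exists only inside the generated constructor.
A copyOf(const A &a) { return A(a); }

// User-written subscript with a runtime index (hypothetical helper).
int elementAt(const A &a, unsigned i) {
  assert(i < 3 && "index must stay inside the array");
  return a.x[i];
}
```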
Reviewers: alexfh, aaron.ballman
Subscribers: aemerson, nemanjai, cfe-commits
Differential Revision: https://reviews.llvm.org/D22381
llvm-svn: 275993
Simon Pilgrim [Tue, 19 Jul 2016 16:52:05 +0000 (16:52 +0000)]
[X86][AVX] Fixed typo in test names
llvm-svn: 275992
Chad Rosier [Tue, 19 Jul 2016 16:50:57 +0000 (16:50 +0000)]
[DSE] Add additional debug output. NFC.
llvm-svn: 275991
Sanjay Patel [Tue, 19 Jul 2016 16:49:55 +0000 (16:49 +0000)]
add missing test for simplifySelectBitTest()
llvm-svn: 275990
Tobias Grosser [Tue, 19 Jul 2016 16:39:17 +0000 (16:39 +0000)]
[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:
> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.
Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:
`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`
This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation
Therefore we now remove this check and replace it with a more fine-grained one in `shouldOptimizeCast()` that rejects only those `logic(zext(icmp), zext(icmp))` that `transformZExtICmp()` would be able to optimize, which also avoids the endless loop mentioned above. That means we are now also able to simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).
As an example, consider the following IR snippet
```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```
which would now be transformed to
```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```
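The equivalence behind this fold can be sketched in C++. This is only a model of the two IR shapes above, not InstCombine code; the function names are made up for the example, and `bool` stands in for `i1` while `std::uint8_t` stands in for `i8`.

```cpp
#include <cstdint>

// Shape before the fold: widen each comparison to 8 bits, then combine.
std::uint8_t andOfZExts(std::int64_t a, std::int64_t b, std::int64_t c) {
  std::uint8_t lhs = static_cast<std::uint8_t>(a > b); // zext i1 -> i8
  std::uint8_t rhs = static_cast<std::uint8_t>(a < c); // zext i1 -> i8
  return static_cast<std::uint8_t>(lhs & rhs);         // and i8
}

// Shape after the fold: combine the 1-bit results, then widen once.
std::uint8_t zextOfAnd(std::int64_t a, std::int64_t b, std::int64_t c) {
  bool both = (a > b) & (a < c);          // and i1
  return static_cast<std::uint8_t>(both); // zext i1 -> i8
}
```

Both functions return the same value for every input, which is why the second shape, with a single cast, is a valid replacement for the first.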
This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. As shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.
Reviewers: grosser, vtjnash, majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22511
Contributed-by: Matthias Reisinger
llvm-svn: 275989
Matt Arsenault [Tue, 19 Jul 2016 16:27:56 +0000 (16:27 +0000)]
AMDGPU: Only use legal inline immediates with kill pseudo
All that matters is whether the value is negative or positive,
so use a constant that doesn't require an instruction to
materialize.
These should really just emit the write exec directly,
but for now stick with the kill pseudo-terminator.
llvm-svn: 275988
Tobias Grosser [Tue, 19 Jul 2016 15:56:25 +0000 (15:56 +0000)]
GPGPU: Bail out of scops with hoisted invariant loads
This is currently not supported and will only be added later. Also update the
test cases to ensure no invariant code hoisting is applied.
llvm-svn: 275987
NAKAMURA Takumi [Tue, 19 Jul 2016 15:53:11 +0000 (15:53 +0000)]
clangRename: Update libdeps to add clangASTMatchers.
Note, ClangRenameTests is linking USRFindingAction.cpp directly.
llvm-svn: 275986
NAKAMURA Takumi [Tue, 19 Jul 2016 15:33:14 +0000 (15:33 +0000)]
ClangRenameTests: Update libdeps. r275958 introduced clangASTMatchers.
llvm-svn: 275985
Etienne Bergeron [Tue, 19 Jul 2016 15:30:22 +0000 (15:30 +0000)]
fix compiler warnings [NFC]
llvm-svn: 275984
Ed Maste [Tue, 19 Jul 2016 15:28:02 +0000 (15:28 +0000)]
Typo corrections identified by codespell
Submitted by giffunip@yahoo.com; I fixed a couple of nearby errors and
incorrect changes in the patch.
llvm.org/pr27634
llvm-svn: 275983
Etienne Bergeron [Tue, 19 Jul 2016 15:27:23 +0000 (15:27 +0000)]
[compiler-rt] Fix Asan imports/exports unittest
Summary:
Avoid mismatch between imports/exports for 32-bit and 64-bit versions.
The test is running grep over macros to detect which functions are
intercepted. Unfortunately, exception handlers differ in 32-bit and
64-bit.
This patch is removing the exception handlers from the test.
Reviewers: rnk
Subscribers: llvm-commits, wang0109, kubabrecka, chrisha
Differential Revision: https://reviews.llvm.org/D22484
llvm-svn: 275982
Simon Pilgrim [Tue, 19 Jul 2016 15:07:43 +0000 (15:07 +0000)]
[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
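To make the mismatch concrete, here is a small software model of the truncating conversion as described above. It is an assumption-laden sketch, not the LLVM or clang implementation: the name `truncateLikeSse` is hypothetical, and the model covers only the scalar float to 32-bit integer case.

```cpp
#include <cstdint>
#include <limits>

// Models a truncating float -> int32 conversion in which NaN, +/-INF and
// out-of-range inputs all yield 0x80000000, as the commit message states.
std::int32_t truncateLikeSse(float v) {
  const float lo = -2147483648.0f; // -2^31, exactly representable
  const float hi = 2147483648.0f;  //  2^31, first value that is out of range
  if (!(v >= lo && v < hi))        // also true for NaN and +/-INF
    return std::numeric_limits<std::int32_t>::min(); // bit pattern 0x80000000
  return static_cast<std::int32_t>(v); // in range: truncate toward zero
}
```

A generic float-to-int cast gives no such guarantee for the NaN, INF and out-of-range inputs, which is why constant folding of the generic form can disagree with the instruction.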
A companion clang patch is at D22105
Differential Revision: https://reviews.llvm.org/D22106
llvm-svn: 275981
Haojian Wu [Tue, 19 Jul 2016 14:49:04 +0000 (14:49 +0000)]
[include-fixer] A refactoring of IncludeFixerContext.
Summary:
No functional changes in this patch. It is a refactoring (pull out a
structure representing the symbol being queried).
This is a preparatory step for inserting missing namespace qualifiers to all
instances of an unidentified symbol.
Reviewers: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D22510
llvm-svn: 275980
Sam Parker [Tue, 19 Jul 2016 14:44:05 +0000 (14:44 +0000)]
[ARM] Refactor Thumb2 Mul and Mla instr descs
Recommitting after r274347 was reverted. This patch introduces some
classes to refactor the 3 and 4 register Thumb2 multiplication
instruction descriptions, plus improved tests for some of those
instructions.
Differential Revision: https://reviews.llvm.org/D21929
llvm-svn: 275979
Pankaj Gode [Tue, 19 Jul 2016 14:30:21 +0000 (14:30 +0000)]
[AArch64] PredictableSelectIsExpensive for Vulcan.
Adding PredictableSelectIsExpensive for Vulcan
Differential Revision: https://reviews.llvm.org/D22448
llvm-svn: 275978
Peter Smith [Tue, 19 Jul 2016 14:15:33 +0000 (14:15 +0000)]
Add support for tlsldm assembler operator to ARM target
The standard local dynamic model for TLS on ARM systems needs two
relocations:
- R_ARM_TLS_LDM32 (module idx)
- R_ARM_TLS_LDO32 (offset of object from origin of module TLS block)
In GNU style assembler we use symbol(tlsldm) and symbol(tlsldo) to
produce these relocations.
llvm-mc for ARM supports symbol(tlsldo) but does not support symbol(tlsldm).
This patch wires up the existing symbol(tlsldm) to R_ARM_TLS_LDM32.
TLS for ARM is defined in Addenda to, and Errata in, the ABI for the
ARM Architecture
Differential Revision: https://reviews.llvm.org/D22461
llvm-svn: 275977
Simon Pilgrim [Tue, 19 Jul 2016 14:12:45 +0000 (14:12 +0000)]
[AARCH64] Fix linu triple typo
As promised in D22191
llvm-svn: 275976
Sylvestre Ledru [Tue, 19 Jul 2016 14:00:57 +0000 (14:00 +0000)]
Add support of the latest Ubuntu (Yakkety Yak - 16.10)
llvm-svn: 275975
Dmitry Polukhin [Tue, 19 Jul 2016 13:35:15 +0000 (13:35 +0000)]
Fix for failing bot sanitizer-x86_64-linux-fast after r275970
More info http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/14774/steps/check-clang%20msan/logs/stdio
llvm-svn: 275974
Simon Pilgrim [Tue, 19 Jul 2016 13:35:11 +0000 (13:35 +0000)]
[AARCH64] Enable AARCH64 lit tests on windows dev machines
As discussed on PR27654, this patch fixes the triples of a lot of aarch64 tests and enables lit tests on Windows.
This will hopefully help stop cases where Windows developers break the aarch64 target.
Differential Revision: https://reviews.llvm.org/D22191
llvm-svn: 275973
Rafael Espindola [Tue, 19 Jul 2016 12:33:46 +0000 (12:33 +0000)]
Fix build with gcc 6.
llvm-svn: 275972
Simon Pilgrim [Tue, 19 Jul 2016 12:26:51 +0000 (12:26 +0000)]
Get rid of VS2015 operator precedence warning. NFCI.
llvm-svn: 275971
Dmitry Polukhin [Tue, 19 Jul 2016 11:29:16 +0000 (11:29 +0000)]
Deprecated (legacy) string literal conversion to 'char *' causes strange overloading resolution
It's a patch for PR28050. Seems like overloading resolution wipes out
the first standard conversion sequence (before user-defined conversion)
in case of deprecated string literal conversion.
Differential revision: https://reviews.llvm.org/D21228
Patch by Alexander Makarov
llvm-svn: 275970
Tobias Grosser [Tue, 19 Jul 2016 11:13:58 +0000 (11:13 +0000)]
GPGPU: Disable invariant load hoisting for GPU code generation
This simplifies the upcoming patches to add code generation for ScopStmts. Load
hoisting support will later be added in a separate commit. This commit will
be implicitly tested by the subsequent GPGPU changes.
llvm-svn: 275969
Daniel Sanders [Tue, 19 Jul 2016 10:58:06 +0000 (10:58 +0000)]
[mips][ias] R_MIPS_GOT_(PAGE|OFST) do not need symbols
Reviewers: sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: https://reviews.llvm.org/D22458
llvm-svn: 275968
Daniel Sanders [Tue, 19 Jul 2016 10:49:03 +0000 (10:49 +0000)]
[mips] Correct label prefixes for N32 and N64.
Summary:
N32 and N64 follow the standard ELF conventions (.L) whereas O32 uses its own
($).
This fixes the majority of object differences between -fintegrated-as and
-fno-integrated-as.
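The rule above amounts to a small lookup. The sketch below uses a made-up enum and function name purely for illustration; it is not the code from the MIPS backend.

```cpp
#include <string>

// Hypothetical ABI tag for the example.
enum class MipsAbi { O32, N32, N64 };

// Returns the private label prefix described in the summary:
// "$" for O32, the standard ELF ".L" for N32 and N64.
std::string privateLabelPrefix(MipsAbi abi) {
  return abi == MipsAbi::O32 ? "$" : ".L";
}
```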
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D22412
llvm-svn: 275967
Daniel Sanders [Tue, 19 Jul 2016 10:22:19 +0000 (10:22 +0000)]
[mips] Recognise the triple used by Debian stretch for mips64el.
Summary:
The triple used for this distribution is mips64el-linux-gnuabi64.
Reviewers: sdardis
Subscribers: sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D22406
llvm-svn: 275966
Eugene Leviant [Tue, 19 Jul 2016 09:25:43 +0000 (09:25 +0000)]
[ELF] Minimal PHDRS parser and section to segment assignment support
llvm-svn: 275965
Tobias Grosser [Tue, 19 Jul 2016 09:06:08 +0000 (09:06 +0000)]
[InstCombine] Minor cleanup of cast simplification code [NFC]
Summary:
This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and their rationale are summarized below:
- Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter.
- Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there.
- Simplify interface of `shouldOptimizeCast()`.
- Minor code style adaptations in `shouldOptimizeCast()`.
- Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant.
- Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format.
- Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability.
- Simplify the interface of `isEliminableCastPair()`.
- Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration.
- Renamed parameter `DoXform` of `transformZExtICmp()` to `DoTransform` to make its intention clearer.
- Moved documentation of `transformZExtICmp()` from its definition to its declaration and made it doxygen-compliant.
Reviewers: vtjnash, grosser
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22449
Contributed-by: Matthias Reisinger
llvm-svn: 275964
Tobias Grosser [Tue, 19 Jul 2016 09:01:46 +0000 (09:01 +0000)]
Style: drop some unnecessary ';' [NFC]
llvm-svn: 275963
Tobias Grosser [Tue, 19 Jul 2016 07:47:27 +0000 (07:47 +0000)]
test: Add missing 'REQUIRES' line
llvm-svn: 275962
George Rimar [Tue, 19 Jul 2016 07:42:07 +0000 (07:42 +0000)]
Reformat comment from 3 to 2 lines. NFC.
llvm-svn: 275961
Tobias Grosser [Tue, 19 Jul 2016 07:39:54 +0000 (07:39 +0000)]
test: Add missing 'REQUIRES' line
llvm-svn: 275960
George Rimar [Tue, 19 Jul 2016 07:39:07 +0000 (07:39 +0000)]
Fixed comment. NFC.
llvm-svn: 275959
Kirill Bobyrev [Tue, 19 Jul 2016 07:37:43 +0000 (07:37 +0000)]
[clang-rename] add support for overridden functions
Reviewers: klimek
Differential Revision: https://reviews.llvm.org/D22408
llvm-svn: 275958
Tobias Grosser [Tue, 19 Jul 2016 07:33:16 +0000 (07:33 +0000)]
GPGPU: Emit in-kernel synchronization statements
We use this opportunity to further classify the different user statements that
can arise and add TODOs for the ones not yet implemented.
llvm-svn: 275957
Tobias Grosser [Tue, 19 Jul 2016 07:33:11 +0000 (07:33 +0000)]
GPGPU: generate control flow within the kernel
llvm-svn: 275956
Tobias Grosser [Tue, 19 Jul 2016 07:33:06 +0000 (07:33 +0000)]
GPGPU: add scop parameters to kernel arguments
llvm-svn: 275955
Tobias Grosser [Tue, 19 Jul 2016 07:32:55 +0000 (07:32 +0000)]
GPGPU: add host iterators to kernel arguments
llvm-svn: 275954
Tobias Grosser [Tue, 19 Jul 2016 07:32:44 +0000 (07:32 +0000)]
GPGPU: add intrinsic functions to obtain a kernel's thread and block ids
llvm-svn: 275953
Tobias Grosser [Tue, 19 Jul 2016 07:32:38 +0000 (07:32 +0000)]
GPGPU: create kernel function skeleton
Create for each kernel a separate LLVM-IR module containing a single function
marked as kernel function and taking one pointer for each array referenced
by this kernel. Add debugging output to verify the kernels are generated
correctly.
llvm-svn: 275952
Simon Atanasyan [Tue, 19 Jul 2016 07:23:15 +0000 (07:23 +0000)]
[driver][mips] Remove empty folder from test inputs
llvm-svn: 275951
Elena Demikhovsky [Tue, 19 Jul 2016 07:14:21 +0000 (07:14 +0000)]
AVX-512: Fixed BT instruction selection.
The condition expression (a >> n) & 1 is converted to a "bt a, n" instruction. This works on all Intel targets.
But on AVX-512 it was broken because the expression is modified to (truncate (a >> n) to i1).
I added the new sequence (truncate (a >> n) to i1) to the BT pattern.
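The two forms of the bit test compute the same result, which can be sketched in C++. The function names are hypothetical, and the 1-bit bit-field is only a stand-in for truncation to `i1`; this is not the instruction selection code.

```cpp
#include <cassert>
#include <cstdint>

// Masked form: (a >> n) & 1.
bool testBitMasked(std::uint64_t a, unsigned n) {
  assert(n < 64 && "shift amount must be smaller than the bit width");
  return ((a >> n) & 1) != 0;
}

// A 1-bit unsigned bit-field keeps only the low bit of what is stored in it.
struct OneBit {
  unsigned value : 1;
};

// Truncated form: (truncate (a >> n) to i1).
bool testBitTruncated(std::uint64_t a, unsigned n) {
  assert(n < 64 && "shift amount must be smaller than the bit width");
  OneBit bit{};
  bit.value = static_cast<unsigned>(a >> n); // drops everything but bit 0
  return bit.value != 0;
}
```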
Differential Revision: https://reviews.llvm.org/D22354
llvm-svn: 275950
Simon Atanasyan [Tue, 19 Jul 2016 07:09:48 +0000 (07:09 +0000)]
[driver][mips] Support MIPS targets in modern Android NDK
Initial patch provided by Duane Sand.
llvm-svn: 275949
Derek Bruening [Tue, 19 Jul 2016 05:06:48 +0000 (05:06 +0000)]
[esan|wset] Fix flaky sampling tests
Adds a new esan public interface routine __esan_get_sample_count() and uses
it to ensure that tests of sampling receive the minimum number of samples.
llvm-svn: 275948
Alexey Bataev [Tue, 19 Jul 2016 05:06:39 +0000 (05:06 +0000)]
[OPENMP] Removed loop statement as its body executes at most once, NFC.
Removed an unneeded loop statement, addressing comments from Richard
Smith.
llvm-svn: 275947
Derek Bruening [Tue, 19 Jul 2016 05:03:38 +0000 (05:03 +0000)]
[esan] Fix sideline thread flaky assert
Fixes an esan sideline thread CHECK that failed to account for the sideline
thread reaching its code before the internal_clone() return value was
assigned in the parent.
llvm-svn: 275946
Alexey Bataev [Tue, 19 Jul 2016 04:21:09 +0000 (04:21 +0000)]
[OPENMP] Improved processing of 'priority' clause, NFC.
Removed some old comments + improved handling of 'priority' clause value
during codegen after comments from Richard Smith.
llvm-svn: 275945
Jason Molenda [Tue, 19 Jul 2016 02:37:07 +0000 (02:37 +0000)]
Ignore clang-module-cache directories that may be created
in the testsuite directory while it runs.
llvm-svn: 275944
Saleem Abdulrasool [Tue, 19 Jul 2016 02:13:08 +0000 (02:13 +0000)]
clang-rename: fix referenced variable in vim-script
llvm-svn: 275943
Craig Topper [Tue, 19 Jul 2016 02:00:38 +0000 (02:00 +0000)]
[AVX512] Give priority to EVEX encoded PSHUFB over the VEX versions.
llvm-svn: 275942
Craig Topper [Tue, 19 Jul 2016 02:00:35 +0000 (02:00 +0000)]
[X86] Remove superfluous parameter from a multiclass. All instantiations passed the same value.
llvm-svn: 275941
George Burgess IV [Tue, 19 Jul 2016 01:29:15 +0000 (01:29 +0000)]
[MemorySSA] Update to the new shiny walker.
This patch updates MemorySSA's use-optimizing walker to be more
accurate and, in some cases, faster.
Essentially, this changed our core walking algorithm from a
cache-as-you-go DFS to an iteratively expanded DFS, with all of the
caching happening at the end. Said expansion happens when we hit a Phi,
P; we'll try to do the smallest amount of work possible to see if
optimizing above that Phi is legal in the first place. If so, we'll
expand the search to see if we can optimize to the next phi, etc.
An iteratively expanded DFS lets us potentially quit earlier (because we
don't assume that we can optimize above all phis) than our old walker.
Additionally, because we don't cache as we go, we can now optimize above
loops.
As an added bonus, this patch adds a ton of verification (if
EXPENSIVE_CHECKS are enabled), so finding bugs is easier.
Differential Revision: https://reviews.llvm.org/D21777
llvm-svn: 275940
Craig Topper [Tue, 19 Jul 2016 01:26:19 +0000 (01:26 +0000)]
[X86] Rename VINSERTzrr to use a capital Z to match other instructions. NFC
llvm-svn: 275939
Vedant Kumar [Tue, 19 Jul 2016 01:17:20 +0000 (01:17 +0000)]
Retry: [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.
I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:
No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total
Changes since the initial commit:
- When handling odd-length inputs, call ThreadPool::wait() before merging the
last profile. Should fix a race/off-by-one (see r275937).
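The odd-length handling can be sketched as a pairwise parallel reduction. This is not the llvm-profdata code: it uses `std::async` instead of LLVM's ThreadPool, and `Counters`, `mergeInto` and `mergeAll` are names invented for the example. The point it shows is the ordering: all pairwise merges are waited on before the leftover element is merged.

```cpp
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Stand-in for a profile: a flat list of counters.
using Counters = std::vector<std::uint64_t>;

// Adds Src into Dst element-wise, growing Dst if needed.
void mergeInto(Counters &Dst, const Counters &Src) {
  if (Dst.size() < Src.size())
    Dst.resize(Src.size(), 0);
  for (std::size_t I = 0; I < Src.size(); ++I)
    Dst[I] += Src[I];
}

// Repeatedly merges the second half of Parts into the first half in parallel.
Counters mergeAll(std::vector<Counters> Parts) {
  if (Parts.empty())
    return {};
  while (Parts.size() > 1) {
    const std::size_t Mid = Parts.size() / 2;
    std::vector<std::future<void>> Jobs;
    for (std::size_t I = 0; I < Mid; ++I)
      Jobs.push_back(std::async(std::launch::async, [&Parts, I, Mid] {
        mergeInto(Parts[I], Parts[I + Mid]); // pairs touch disjoint slots
      }));
    for (auto &Job : Jobs)
      Job.get(); // wait for every pairwise merge to finish
    // Odd length: merge the leftover only after the wait above, so it
    // cannot race with the job that is also writing to Parts[0].
    if (Parts.size() % 2 == 1)
      mergeInto(Parts[0], Parts.back());
    Parts.resize(Mid);
  }
  return Parts[0];
}
```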
Differential Revision: https://reviews.llvm.org/D22438
llvm-svn: 275938
Vedant Kumar [Tue, 19 Jul 2016 00:57:09 +0000 (00:57 +0000)]
Revert "[llvm-profdata] Speed up merging by using a thread pool"
This reverts commit r275921. It broke the ppc64be bot:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/3537
I'm not sure why it broke, but based on the output, it looks like an
off-by-one (one profile left un-merged).
llvm-svn: 275937
Wei Mi [Tue, 19 Jul 2016 00:50:43 +0000 (00:50 +0000)]
Recommit the patch "Use uniforms set to populate VecValuesToIgnore".
Instructions in the uniform set will not have vector versions, so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.
The change will make the vector RegUsages estimation less conservative.
Differential Revision: https://reviews.llvm.org/D20474
The recommit fixed the testcase global_alias.ll.
llvm-svn: 275936
Matt Arsenault [Tue, 19 Jul 2016 00:35:22 +0000 (00:35 +0000)]
AMDGPU/SI: Fix SI scheduler refcount issue
Without this fix, releaseSuccessors when InOrOutBlock is
false could release SUs outside the schedule BasicBlock.
Patch by Axel Davy
llvm-svn: 275935
Matt Arsenault [Tue, 19 Jul 2016 00:35:03 +0000 (00:35 +0000)]
AMDGPU: Expand register indexing pseudos in custom inserter
This is to help moveSILowerControlFlow to before regalloc.
There are a couple of tradeoffs with this. The complete CFG
is visible to more passes, the loop body avoids an extra copy of m0,
vcc isn't required, and immediate offsets can be shrunk into s_movk_i32.
The disadvantage is the register allocator doesn't understand that
the single lane's vector is dead within the loop body, so an extra
register is used to outlive the loop block when expanding the
VGPR -> m0 loop. This also now results in worse waitcnt insertion
before the loop instead of after for pending operations at the point
of the indexing, but that should be fixed by future improvements to
cross block waitcnt insertion.
v_movreld_b32's operands are now modeled more correctly since vdst
is not a true output. This is kind of a hack to treat vdst as a
use operand. Extra checking is required in the verifier since
I can't seem to get tablegen to emit an implicit operand for a
virtual register.
llvm-svn: 275934
Lang Hames [Tue, 19 Jul 2016 00:25:52 +0000 (00:25 +0000)]
[Kaleidoscope][BuildingAJIT] More work on the text for Chapter 3.
Add an overview of stubs and compile callbacks before the discussion of the
source changes.
llvm-svn: 275933
Sanjoy Das [Tue, 19 Jul 2016 00:23:54 +0000 (00:23 +0000)]
[LoopReroll] Reroll loops with unordered atomic memory accesses
Reviewers: hfinkel, jfb, reames
Subscribers: mcrosier, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D22385
llvm-svn: 275932
Samuel Antao [Tue, 19 Jul 2016 00:01:12 +0000 (00:01 +0000)]
Append clang system include path for offloading tool chains.
Summary:
This patch adds clang system include path when offloading tool chains, e.g. CUDA, are used in the current compilation.
This fixes an issue detected by @rsmith in response to r275645.
Reviewers: rsmith, tra
Subscribers: rsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D22490
llvm-svn: 275931
Samuel Antao [Mon, 18 Jul 2016 23:22:11 +0000 (23:22 +0000)]
[OpenMP] Remove dead code in conditional of mappable expressions SEMA.
llvm-svn: 275930
Matt Arsenault [Mon, 18 Jul 2016 23:20:46 +0000 (23:20 +0000)]
TableGen: Allow custom register operand decoder method
This is for a situation where the encoding for a register may be
different depending on the specific operand. For some instructions,
we want to apply additional restrictions beyond the encoding's
constraints.
In AMDGPU some operands are VSrc_32, using the VS_32 pseudo register
class which accepts VGPRs, SGPRs, or immediates in the encoding.
Some specific instructions with the same encoding operand do not want
to allow immediates or SGPRs, but the encoding format is different
in this case than a regular VGPR_32 operand.
This allows specifying that the encoding should be treated the same
without introducing yet another dummy register class.
llvm-svn: 275929
Matt Arsenault [Mon, 18 Jul 2016 23:09:51 +0000 (23:09 +0000)]
AMDGPU: Fix test name and broken CHECK-LABEL
llvm-svn: 275928
Vedant Kumar [Mon, 18 Jul 2016 22:50:10 +0000 (22:50 +0000)]
[utils] Generate html reports with the code coverage utility script
Instead of extracting raw coverage mappings into an artifact directory,
actually generate useful html reports for a given list of binaries with
symbol demangling turned on.
No tests, but this is actively being used to drive the (still nascent)
coverage bot.
llvm-svn: 275927
Kelvin Li [Mon, 18 Jul 2016 22:49:16 +0000 (22:49 +0000)]
[OpenMP] Fix incorrect diagnostics in map clause
Having the following code pattern will result in an incorrect diagnostic:
int main() {
int arr[10];
#pragma omp target data map(arr[:])
#pragma omp target map(arr)
{}
}
t.cpp:4:24: error: original storage of expression in data environment is shared
but data environment do not fully contain mapped expression storage
#pragma omp target map(arr)
                       ^~~
t.cpp:3:29: note: used here
#pragma omp target data map(arr[:])
                            ^~~~~~
1 error generated.
Patch by David S.
Differential Revision: https://reviews.llvm.org/D22075
llvm-svn: 275926
Richard Smith [Mon, 18 Jul 2016 22:37:35 +0000 (22:37 +0000)]
Fix some minor issues found by Coverity.
llvm-svn: 275925
Vedant Kumar [Mon, 18 Jul 2016 22:32:02 +0000 (22:32 +0000)]
[Coverage] Remove '..' from filenames *after* getting an absolute path
Failure to do this breaks relative paths which begin with '..'.
This issue was caught by the (still nascent) coverage bot.
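Why the order matters can be shown with `std::filesystem` (C++17). This is a sketch, not the coverage code, and the function names are hypothetical; the exact failure in the coverage tooling may have differed from what the second function shows.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Resolve against the base directory first, then fold ".." away.
fs::path absoluteThenNormalize(const fs::path &base, const fs::path &file) {
  return (base / file).lexically_normal();
}

// Normalize first, then resolve: a leading ".." has nothing to cancel
// against yet, so it survives into the final path.
fs::path normalizeThenAbsolute(const fs::path &base, const fs::path &file) {
  return base / file.lexically_normal();
}
```

With base `/work/build` and file `../src/a.cpp`, the first function yields `/work/src/a.cpp`, while the second leaves `/work/build/../src/a.cpp`.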
llvm-svn: 275924
Matt Arsenault [Mon, 18 Jul 2016 22:12:46 +0000 (22:12 +0000)]
Fix -Wreturn-type with gcc 4.8 and libc++
llvm-svn: 275922
Vedant Kumar [Mon, 18 Jul 2016 22:02:39 +0000 (22:02 +0000)]
[llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads
used. Auto-detect NumThreads when it isn't specified, and avoid spawning
threads when they wouldn't be beneficial.
I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:
No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total
Differential Revision: https://reviews.llvm.org/D22438
llvm-svn: 275921
Artem Belevich [Mon, 18 Jul 2016 21:58:48 +0000 (21:58 +0000)]
[NVPTX] Make sure we adjust alignment at all call sites
.. including calls from kernel functions that were
ignored by mistake before.
llvm-svn: 275920
Dehao Chen [Mon, 18 Jul 2016 21:41:50 +0000 (21:41 +0000)]
[PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop Strength Reduce pass to new PM
Reviewers: davidxl, silvas
Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D22468
llvm-svn: 275919
Mehdi Amini [Mon, 18 Jul 2016 21:29:24 +0000 (21:29 +0000)]
Update doxygen description for `WriteBitcodeToFile()` API (NFC)
llvm-svn: 275917
Teresa Johnson [Mon, 18 Jul 2016 21:22:24 +0000 (21:22 +0000)]
[PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.
Reviewers: mehdi_amini, davide
Subscribers: davidxl, llvm-commits
Differential Revision: https://reviews.llvm.org/D22475
llvm-svn: 275916
Wei Mi [Mon, 18 Jul 2016 21:14:43 +0000 (21:14 +0000)]
Revert rL275912.
llvm-svn: 275915
Chaoren Lin [Mon, 18 Jul 2016 21:11:43 +0000 (21:11 +0000)]
Add missing headers after header cleanup in r275882.
llvm-svn: 275914
Vedant Kumar [Mon, 18 Jul 2016 21:01:27 +0000 (21:01 +0000)]
[Coverage] Normalize '..' out of filename strings
This fixes the issue of having duplicate entries for the same file in a
coverage report such that none of the entries actually displayed the correct
coverage information.
llvm-svn: 275913
Wei Mi [Mon, 18 Jul 2016 20:59:53 +0000 (20:59 +0000)]
Use uniforms set to populate VecValuesToIgnore.
Instructions in the uniform set will not have vector versions, so
add them to VecValuesToIgnore.
For induction vars, those only used in uniform instructions or consecutive
ptrs instructions have already been added to VecValuesToIgnore above. For
those induction vars which are only used in uniform instructions or
non-consecutive/non-gather scatter ptr instructions, the related phi and
update will also be added into VecValuesToIgnore set.
The change will make the vector RegUsages estimation less conservative.
Differential Revision: https://reviews.llvm.org/D20474
llvm-svn: 275912
Sanjay Patel [Mon, 18 Jul 2016 20:56:53 +0000 (20:56 +0000)]
refactor SimplifySelectInst; NFCI
llvm-svn: 275911
Justin Lebar [Mon, 18 Jul 2016 20:40:35 +0000 (20:40 +0000)]
Write isUInt using template specializations to work around an incorrect MSVC warning.
Summary:
Per D22441, MSVC warns on our old implementation of isUInt<64>. It sees
uint64_t(1) << 64 and doesn't realize that it's not going to be
executed. Writing as a template specialization is ugly, but prevents
the warning.
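A minimal sketch of the idea, under the assumption that a plain explicit specialization is enough to illustrate it. The name `isUIntSketch` is made up, and the real patch may be structured differently. Because the 64-bit case has its own body, the expression `1 << 64` is never written for it.

```cpp
#include <cstdint>

// General case: N-bit range check, only valid for 0 < N < 64.
template <unsigned N> constexpr bool isUIntSketch(std::uint64_t X) {
  static_assert(N > 0 && N < 64, "use the 64-bit specialization for N == 64");
  return X < (std::uint64_t(1) << N);
}

// 64-bit case: every uint64_t fits, and no shift by 64 is ever formed.
template <> constexpr bool isUIntSketch<64>(std::uint64_t) { return true; }
```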
Reviewers: RKSimon
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22472
llvm-svn: 275909
Sanjay Patel [Mon, 18 Jul 2016 20:37:51 +0000 (20:37 +0000)]
add tests for missed sext transform
llvm-svn: 275908
Bruno Cardoso Lopes [Mon, 18 Jul 2016 20:37:06 +0000 (20:37 +0000)]
[Sema] Create a separate group for incompatible function pointer warning
Give incompatible function pointer warning its own diagnostic group
but still leave it as a subgroup of incompatible-pointer-types. This is in
preparation to promote -Wincompatible-function-pointer-types to error on
darwin.
Differential Revision: https://reviews.llvm.org/D22248
rdar://problem/12907612
llvm-svn: 275907
Mehdi Amini [Mon, 18 Jul 2016 20:33:09 +0000 (20:33 +0000)]
Add missing header in ClangFuzzer (after r275882 cleanup)
llvm-svn: 275906
Bob Wilson [Mon, 18 Jul 2016 20:29:14 +0000 (20:29 +0000)]
Allow iOS and tvOS version numbers with 2-digit major version numbers.
rdar://problem/26921601
llvm-svn: 275905
Marshall Clow [Mon, 18 Jul 2016 20:27:19 +0000 (20:27 +0000)]
Bump version # to 4.0.0
llvm-svn: 275904
Hans Wennborg [Mon, 18 Jul 2016 20:26:46 +0000 (20:26 +0000)]
build_llvm_package.bat: update version to 4.0.0
llvm-svn: 275903
Vedant Kumar [Mon, 18 Jul 2016 20:07:27 +0000 (20:07 +0000)]
[interception] Remove extra whitespace to appease linters (NFC)
Attempt to fix:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/7774
llvm-svn: 275901
Sanjay Patel [Mon, 18 Jul 2016 20:06:51 +0000 (20:06 +0000)]
auto-generate checks
llvm-svn: 275899
Hans Wennborg [Mon, 18 Jul 2016 20:06:27 +0000 (20:06 +0000)]
Revert r273099 "If the revision number starts with r, drop it. It will get added back"
This doesn't seem to work with Bash:
$ /work/llvm/utils/release/merge.sh --proj llvm --rev r275870
/work/llvm/utils/release/merge.sh: line 34: ${$1#r}: bad substitution
I get the same error with and without a leading 'r'.
llvm-svn: 275898
Vedant Kumar [Mon, 18 Jul 2016 19:56:38 +0000 (19:56 +0000)]
[Driver] Compute effective target triples once per job (NFCI)
Compute an effective target triple exactly once in ConstructJob(), and
then simply pass around references to it. This eliminates wasteful
re-computation of effective triples (e.g. in getARMFloatABI()).
Differential Revision: https://reviews.llvm.org/D22290
llvm-svn: 275895
Vedant Kumar [Mon, 18 Jul 2016 19:56:33 +0000 (19:56 +0000)]
[Driver] Make Driver::DefaultTargetTriple private (NFCI)
No in-tree targets access this `DefaultTargetTriple` directly, and usage
of default triples is generally discouraged. Make the field private.
This is part of an effort to make the clang driver use effective triples
more pervasively.
Differential Revision: https://reviews.llvm.org/D22289
llvm-svn: 275894
Artem Belevich [Mon, 18 Jul 2016 19:54:56 +0000 (19:54 +0000)]
[NVPTX] Force minimum alignment of 4 for byval arguments of device-side functions.
Taking address of a byval variable in PTX is legal, but currently runs
into miscompilation by ptxas on sm_50+ (NVIDIA issue 1789042).
Work around the issue by enforcing minimum alignment on byval arguments
of device functions.
The change is a no-op on SASS level for sm_3x where ptxas already aligns
local copy by at least 4.
Differential Revision: https://reviews.llvm.org/D22428
llvm-svn: 275893
Etienne Bergeron [Mon, 18 Jul 2016 19:50:55 +0000 (19:50 +0000)]
[compiler-rt] Fix incorrect handling of indirect load.
Summary:
Indirect loads use offsets relative to RIP.
The current trampoline implementation incorrectly
copies these instructions, which makes some unit tests
crash.
This patch does not fix the unit tests, but it fixes
the crashes. The functions are no longer hooked.
Patches to fix these unit tests will come soon.
Reviewers: rnk
Subscribers: llvm-commits, wang0109, chrisha
Differential Revision: https://reviews.llvm.org/D22410
llvm-svn: 275892