Matt Arsenault [Sat, 3 Sep 2016 17:25:39 +0000 (17:25 +0000)]
AMDGPU: Fix adding duplicate implicit exec uses
I'm not sure if this should be considered a bug in
copyImplicitOps or not, but implicit operands that are part
of the static instruction definition should not be copied.
llvm-svn: 280594
Craig Topper [Sat, 3 Sep 2016 17:20:07 +0000 (17:20 +0000)]
[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test.
llvm-svn: 280593
Craig Topper [Sat, 3 Sep 2016 16:28:03 +0000 (16:28 +0000)]
[AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent.
llvm-svn: 280592
Aaron Ballman [Sat, 3 Sep 2016 15:36:52 +0000 (15:36 +0000)]
Fix the attribute documentation build.
llvm-svn: 280591
Nicolai Haehnle [Sat, 3 Sep 2016 12:26:38 +0000 (12:26 +0000)]
AMDGPU: Reduce the duration of whole-quad-mode
Summary:
This contains two changes that reduce the time spent in WQM, with the
intention of reducing bandwidth required by VMEM loads:
1. Sampling instructions by themselves don't need to run in WQM, only their
coordinate inputs need it (unless of course there is a dependent sampling
instruction). The initial scanInstructions step is modified accordingly.
2. When switching back from WQM to Exact, switch back as soon as possible.
This affects the logic in processBlock.
This should always be a win or at best neutral.
There are also some cleanups (e.g. remove unused ExecExports) and some new
debugging output.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D22092
llvm-svn: 280590
Nicolai Haehnle [Sat, 3 Sep 2016 12:26:32 +0000 (12:26 +0000)]
AMDGPU: Fix an interaction between WQM and polygon stippling
Summary:
This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.
The underlying problem is as follows: the prolog part contains the polygon
stippling sequence, i.e. a kill. The main part then enables WQM based on the
_reduced_ exec mask, effectively undoing most of the polygon stippling.
Since we cannot know whether polygon stippling will be used, the main part
of a non-monolithic shader must always return to exact mode to fix this
problem.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23131
llvm-svn: 280589
Eric Fiselier [Sat, 3 Sep 2016 08:07:40 +0000 (08:07 +0000)]
Fix PR30202 - notify_all_at_thread_exit seg faults if run from a raw pthread context.
Summary:
This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`.
Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should *never* be old thread local data because pthread *should* null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec:
>
> Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread.
>
> An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits.
>
> If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop.
However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit".
Reviewers: mclow.lists, bcraig, EricWF
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24159
llvm-svn: 280588
Niels Ole Salscheider [Sat, 3 Sep 2016 07:13:54 +0000 (07:13 +0000)]
Replace the Radeon GCN GPU family names by more descriptive ones
Differential Revision: https://reviews.llvm.org/D23957
llvm-svn: 280587
Matt Arsenault [Sat, 3 Sep 2016 07:06:58 +0000 (07:06 +0000)]
AMDGPU: Do basic folding of class intrinsic
This allows more of the OCML builtin library to be
constant folded.
llvm-svn: 280586
Eric Fiselier [Sat, 3 Sep 2016 07:05:40 +0000 (07:05 +0000)]
memory_resource still needs init_priority when built with GCC 4.9
llvm-svn: 280585
Matt Arsenault [Sat, 3 Sep 2016 06:57:55 +0000 (06:57 +0000)]
AMDGPU: Fix spilling of m0
readlane/writelane do not support using m0 as the output/input.
Constrain the register class of spill vregs to try to avoid this,
but also handle spilling of the physreg when necessary by inserting
an additional copy to a normal SGPR.
llvm-svn: 280584
Matt Arsenault [Sat, 3 Sep 2016 06:57:49 +0000 (06:57 +0000)]
Improve debug error message with register name
llvm-svn: 280583
Craig Topper [Sat, 3 Sep 2016 04:37:50 +0000 (04:37 +0000)]
[AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables.
llvm-svn: 280581
Nico Weber [Sat, 3 Sep 2016 04:27:14 +0000 (04:27 +0000)]
Add a test Aaron asked for that I forgot to add before landing r280578.
llvm-svn: 280580
NAKAMURA Takumi [Sat, 3 Sep 2016 04:06:37 +0000 (04:06 +0000)]
Make lit/util.py py3-compatible.
llvm-svn: 280579
Nico Weber [Sat, 3 Sep 2016 03:25:22 +0000 (03:25 +0000)]
[ms] Add support for parsing uuid as a Microsoft attribute.
Some Windows SDK classes, for example
Windows::Storage::Streams::IBufferByteAccess, use the ATL way of spelling
attributes:
[uuid("....")] class IBufferByteAccess {};
To be able to use __uuidof() to grab the uuid off these types, clang needs to
support uuid as a Microsoft attribute. There was already code to skip Microsoft
attributes, extend that to look for uuid and parse it. Use the new "Microsoft"
attribute type added in r280575 (and r280574, r280576) for this.
Final part of https://reviews.llvm.org/D23895
llvm-svn: 280578
Nico Weber [Sat, 3 Sep 2016 03:18:49 +0000 (03:18 +0000)]
Revert r280549.
The test it added doesn't pass:
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15318/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Apdbdump-yaml-types.test
Command Output (stdout):
--
$ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\llvm-pdbdump.EXE" "pdb2yaml" "-tpi-stream" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB/Inputs/empty.pdb"
$ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\FileCheck.EXE" "-check-prefix=YAML" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test"
# command stderr:
D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test:36:7: error: expected string not found in input
YAML: Name: apartment
^
<stdin>:153:10: note: scanning from here
Value: 161
^
llvm-svn: 280577
Nico Weber [Sat, 3 Sep 2016 03:01:32 +0000 (03:01 +0000)]
Let Microsoft attributes apply to the type, not the variable.
There was already a function that moved attributes off the declspec into
an attribute list for attributes applying to the type, teach that function to
also move Microsoft attributes around and rename it to match its new broader
role.
Nothing uses Microsoft attributes yet, so no behavior change.
Part of https://reviews.llvm.org/D23895
llvm-svn: 280576
Nico Weber [Sat, 3 Sep 2016 02:55:10 +0000 (02:55 +0000)]
Add plumbing for new attribute type "Microsoft".
This is for attributes in []-delimited lists preceding a class, like e.g.
`[uuid("...")] class Foo {};` Not used by anything yet, so no behavior change.
Part of https://reviews.llvm.org/D23895
llvm-svn: 280575
Nico Weber [Sat, 3 Sep 2016 02:48:03 +0000 (02:48 +0000)]
Move calls of MaybeParseMicrosoftAttributes() before ParseExternalDeclaration()
into ParseDeclOrFunctionDefInternal() (which is called by
MaybeParseMicrosoftAttributes()), so that the attributes can be stored in
the DeclSpec. No behavior change yet, part of https://reviews.llvm.org/D23895
llvm-svn: 280574
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 02:43:42 +0000 (02:43 +0000)]
ADT: Use std::list in SparseBitVector, NFC
The only intrusive thing about SparseBitVector's usage of ilist<> was
that new was usually called externally. There were no custom traits.
It seems like the reason to switch to ilist in r41855 was to avoid
pointer invalidation, but std::list<> has that feature too. Maybe
std::list<>::emplace makes this a little more obvious than it was then.
Switch over to std::list<> and simplify the code.
llvm-svn: 280573
Nico Weber [Sat, 3 Sep 2016 02:41:17 +0000 (02:41 +0000)]
Remove function name from comment.
The comment starting with "ParseDeclarationOrFunctionDefinition -" is above
a function called ParseDeclOrFunctionDefInternal. Fix the comment by not
mentioning a function name, like the style guide requests nowadays. No behavior
change.
llvm-svn: 280572
Hal Finkel [Sat, 3 Sep 2016 02:31:44 +0000 (02:31 +0000)]
[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics
PowerPC assembly code in the wild, so it seems, has things like this:
bc+ 12, 28, .L9
This is a bit odd because the '+' here becomes part of the BO field, and the BO
field is otherwise the first operand. Nevertheless, the ISA specification does
clearly say that the +- hint syntax applies to all conditional-branch mnemonics
(that test either CTR or a condition register, although not the forms which
check both), both basic and extended, so this is supposed to be valid.
This introduces some asm-parser-only definitions which take only the upper
three bits from the specified BO value, and the lower two bits are implied by
the +- suffix (via some associated aliases).
Fixes PR23646.
llvm-svn: 280571
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 02:27:35 +0000 (02:27 +0000)]
ADT: Do not inherit from std::iterator in ilist_iterator
Inheriting from std::iterator uses more boiler-plate than manual
typedefs. Avoid that in both ilist_iterator and
MachineInstrBundleIterator.
This has the side effect of removing ilist_iterator from certain ADL
lookups in namespace std; calls to std::next need to be qualified by
"std::" that didn't have to before. The one case of this in-tree was
operating on a temporary, so I used the more compact operator++.
llvm-svn: 280570
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 02:07:45 +0000 (02:07 +0000)]
ADT: Split out iplist_impl from iplist, NFC
Split out iplist_impl from iplist, and change SymbolTableList to inherit
directly from iplist_impl. This makes it more straightforward to add
new template paramaters to iplist [*]:
- iplist_impl takes a "base" list that provides the intrusive
functionality (usually simple_ilist<T>) and a traits class.
- iplist no longer takes a "Traits" template parameter. It only takes
the value_type, T, and instantiates iplist_impl with simple_ilist<T>
and ilist_traits<T>.
- SymbolTableList now inherits from iplist_impl, instead of iplist.
Note for out-of-tree code: if you have an iplist whose second template
parameter was *not* the default (i.e., not ilist_traits<YourT>), you
have three options:
- Stop using a custom traits class, and instead specialize
ilist_traits<YourT>. This is the usual thing to do.
- Specialize iplist<YourT> to pass your custom traits class into
iplist_impl.
- Create your own trivial list type that passes your custom traits class
into iplist_impl (see SymbolTableList<> for an example).
[*]: The eventual goal is to start tracking a sentinel bit on the
MachineInstr list even when LLVM_ENABLE_ABI_BREAKING_CHECKS is off,
which will enable MachineBasicBlock::reverse_iterator to have normal
list invalidation semantics that matching the new
iplist<>::reverse_iterator from r280032.
llvm-svn: 280569
Wei Mi [Sat, 3 Sep 2016 01:43:28 +0000 (01:43 +0000)]
Fix buildbot error.
Add -mtriple=x86_64-unknown-linux-gnu for the test and move it to CodeGen/X86.
llvm-svn: 280568
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 01:42:40 +0000 (01:42 +0000)]
ADT: Rename NodeTy to T in iplist/ilist template parameters
And use other typedefs so that the next rename has a smaller diff.
llvm-svn: 280567
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 01:29:36 +0000 (01:29 +0000)]
ReaderWriter: Use ilist_noalloc_traits for TrieEdge, NFC
Adopt r280128 in lld, specializing ilist_alloc_traits rather than
reinventing the wheel.
llvm-svn: 280566
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 01:22:56 +0000 (01:22 +0000)]
ADT: Remove external uses of ilist_iterator, NFC
Delete the dead code for Write(ilist_iterator) in the IR Verifier,
inline report(ilist_iterator) at its call sites in the MachineVerifier,
and use simple_ilist<>::iterator in SymbolTableListTraits.
The only remaining reference to ilist_iterator outside of the ilist
implementation is from MachineInstrBundleIterator. I'll get rid of that
in a follow-up.
llvm-svn: 280565
Duncan P. N. Exon Smith [Sat, 3 Sep 2016 01:06:08 +0000 (01:06 +0000)]
ADT: Fix up IListTest.privateNode and get it passing
This test was using the wrong type, and so not actually testing much.
ilist_iterator constructors weren't going through ilist_node_access, so
they didn't actually work with private inheritance.
llvm-svn: 280564
Jason Henline [Sat, 3 Sep 2016 00:32:07 +0000 (00:32 +0000)]
[SE] Add getByteCount methods for device memory
Summary:
Simple utility methods will prevent users from making mistakes when
converting element counts to byte counts.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24197
llvm-svn: 280563
George Burgess IV [Sat, 3 Sep 2016 00:28:25 +0000 (00:28 +0000)]
[Sema] Fix how we set implicit conversion kinds.
We have invariants we like to guarantee for the
`ImplicitConversionKind`s in a `StandardConversionSequence`. These
weren't being upheld in code that r280553 touched, so Richard suggested
that we should fix that. See D24113.
I'm not entirely sure how to go about testing this, so no test case is
included. Suggestions welcome.
llvm-svn: 280562
Eric Fiselier [Sat, 3 Sep 2016 00:11:33 +0000 (00:11 +0000)]
Define _LIBCPP_SAFE_STATIC __attribute__((require_constant_initialization)), and apply it to memory_resource
llvm-svn: 280561
Hal Finkel [Fri, 2 Sep 2016 23:42:01 +0000 (23:42 +0000)]
[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev
These few book-III instructions are used by the Linux kernel.
Partially fixes PR24796.
llvm-svn: 280560
Hal Finkel [Fri, 2 Sep 2016 23:41:54 +0000 (23:41 +0000)]
[PowerPC] Add support for the extended dcbf form and mnemonics
dcbf has an optional hint-like field, add support for the extended form and the
associated mnemonics (dcbfl and dcbflp).
Partially fixes PR24796.
llvm-svn: 280559
Tobias Grosser [Fri, 2 Sep 2016 23:40:15 +0000 (23:40 +0000)]
Dependences: Only create flat StmtSchedule in presence of reductions
Without reductions we do not need a flat union_map schedule describing
the computation we want to perform, but can work purely on the schedule
tree. This reduces the dependence computation and scheduling time from 33ms
to 25ms. Another 30% reduction.
llvm-svn: 280558
Tobias Grosser [Fri, 2 Sep 2016 23:29:38 +0000 (23:29 +0000)]
Dependences: Exit early, if no reduction dependences are needed.
In case we do not compute reduction dependences or dependences that are more
fine-grained than statement level dependences, we can avoid the corresponding
part of the dependence analysis all together. For the 3mm benchmark, this
reduces scheduling + dependence analysis time from 62ms to 33ms for a no-asserts
build. The majority of the compile time is anyhow spent in the LLVM backends,
when doing code generation. Nevertheless, there is no need to waste compile time
either.
llvm-svn: 280557
Yunzhong Gao [Fri, 2 Sep 2016 23:16:06 +0000 (23:16 +0000)]
(clang part) Implement MASM-flavor intel syntax behavior for inline MS asm block.
Clang tests for verifying the following syntaxes:
1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not.
0xNN and NNh may come with optional U or L suffix.
2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not.
NNb may come with optional U or L suffix.
Differential Revision: https://reviews.llvm.org/D22112
llvm-svn: 280556
Yunzhong Gao [Fri, 2 Sep 2016 23:15:29 +0000 (23:15 +0000)]
(LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block:
1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not.
0xNN and NNh may come with optional U or L suffix.
2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not.
NNb may come with optional U or L suffix.
Differential Revision: https://reviews.llvm.org/D22112
llvm-svn: 280555
Tobias Grosser [Fri, 2 Sep 2016 23:05:42 +0000 (23:05 +0000)]
Introduce option to run isl AST generation, but no IR generation.
We replace the options
-polly-code-generator=none
=isl
with the options
-polly-code-generation=none
=ast
=full
This allows us to measure the overhead of Polly itself, versus the compile
time increases due to us generating more IR and consequently the LLVM backends
spending more time on this IR.
We also use this opportunity to rename the option. The original name was
introduced at a point where we still had two code generators. CLooG and the
isl AST generator. Since we only have one AST generator left, there is no need
to distinguish between 'isl' and something else. However, being able to disable
code generation all together has been shown useful for debugging. Hence, we
rename and extend this option to make it a good fit for its new use case.
llvm-svn: 280554
George Burgess IV [Fri, 2 Sep 2016 22:59:57 +0000 (22:59 +0000)]
[Sema] Relax overloading restrictions in C.
This patch allows us to perform incompatible pointer conversions when
resolving overloads in C. So, the following code will no longer fail to
compile (though it will still emit warnings, assuming the user hasn't
opted out of them):
```
void foo(char *) __attribute__((overloadable));
void foo(int) __attribute__((overloadable));
void callFoo() {
unsigned char bar[128];
foo(bar); // selects the char* overload.
}
```
These conversions are ranked below all others, so:
A. Any other viable conversion will win out
B. If we had another incompatible pointer conversion in the example
above (e.g. `void foo(int *)`), we would complain about
an ambiguity.
Differential Revision: https://reviews.llvm.org/D24113
llvm-svn: 280553
Ron Lieberman [Fri, 2 Sep 2016 22:56:24 +0000 (22:56 +0000)]
Make sure to maintain register liveness when generating predicated instructions.
Author: Krzysztof Parzyszek <kparzysz@codeaurora.org>
Differential Revision: https://reviews.llvm.org/D24209
llvm-svn: 280552
Gor Nishanov [Fri, 2 Sep 2016 22:54:26 +0000 (22:54 +0000)]
gitignore: ignore VS Code editor files
Summary: VS code creates .vscode folder to keep its stuff that we really don't need in git.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24211
llvm-svn: 280551
Ivan Krasin [Fri, 2 Sep 2016 22:31:24 +0000 (22:31 +0000)]
lit: print process output, if getting the list of google-tests failed.
Summary:
This is a follow up to r280455, where a check for the process exit code
was introduced. Some ASAN bots throw this error now, but it's impossible
to understand what's wrong with them, and the issue is not reproducible.
Reviewers: vitalybuka
Differential Revision: https://reviews.llvm.org/D24210
llvm-svn: 280550
Zachary Turner [Fri, 2 Sep 2016 22:19:01 +0000 (22:19 +0000)]
[codeview] Make FieldList records print as a yaml sequence.
Before we were kind of imitating the behavior of a Yaml sequence
by outputting each record one after the other. This makes it a
little cumbersome when we want to go the other direction -- from
Yaml to Pdb. So this treats FieldList records as no different than
any other list of records, by printing them as a Yaml sequence with
the exact same format.
llvm-svn: 280549
Rui Ueyama [Fri, 2 Sep 2016 22:15:08 +0000 (22:15 +0000)]
Update comments.
llvm-svn: 280548
Xinliang David Li [Fri, 2 Sep 2016 22:03:40 +0000 (22:03 +0000)]
[Profile] handle select instruction in 'expect' lowering
Builtin expect lowering currently ignores select. This patch
fixes the issue
Differential Revision: http://reviews.llvm.org/D24166
llvm-svn: 280547
Simon Atanasyan [Fri, 2 Sep 2016 21:54:35 +0000 (21:54 +0000)]
[ELF] PR30221 - linker script expression parser does not accept '~'
The patch adds support for both '-' and '~' unary expressions. Also it
brings support for signed numbers is expressions.
https://llvm.org/bugs/show_bug.cgi?id=30221
Differential revision: https://reviews.llvm.org/D24128
llvm-svn: 280546
Hal Finkel [Fri, 2 Sep 2016 21:37:07 +0000 (21:37 +0000)]
[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha
When we have an offset into a global, etc. that is accessed relative to the TOC
base pointer, and the offset is larger than the minimum alignment of the global
itself and the TOC base pointer (which is 8-byte aligned), we can still fold
the @toc@ha into the memory access, but we must update the addis instruction's
symbol reference with the offset as the symbol addend. When there is only one
use of the addi to be folded and only one use of the addis that would need its
symbol's offset adjusted, then we can make the adjustment and fold the @toc@l
into the memory access.
llvm-svn: 280545
George Rimar [Fri, 2 Sep 2016 21:17:20 +0000 (21:17 +0000)]
[ELF] - Use std::regex instead of hand written logic in elf::globMatch()
Use std::regex instead of hand written matcher.
Patch based on code and ideas of Rui Ueyama.
Differential revision: https://reviews.llvm.org/D23829
llvm-svn: 280544
Dimitry Andric [Fri, 2 Sep 2016 21:02:11 +0000 (21:02 +0000)]
Avoid narrowing warnings in __bitset constructor
When <bitset> is compiled with warnings enabled, on a platform where
size_t is 4 bytes, it results in errors similar to:
bitset:265:16: error: non-constant-expression cannot be narrowed
from type 'unsigned long long' to '__storage_type' (aka 'unsigned
int') in initializer list [-Wc++11-narrowing]
: __first_{__v, __v >> __bits_per_word}
^~~
bitset:676:52: note: in instantiation of member function
'std::__1::__bitset<2, 53>::__bitset' requested here
bitset(unsigned long long __v) _NOEXCEPT : base(__v) {}
^
Fix these by casting the initializer list elements to __storage_type.
Reviewers: mclow.lists, EricWF
Differential Revision: https://reviews.llvm.org/D23960
llvm-svn: 280543
Jonathan Peyton [Fri, 2 Sep 2016 20:54:58 +0000 (20:54 +0000)]
Move function into cpp file under KMP_AFFINITY_SUPPORTED guard.
When affinity isn't supported, __kmp_affinity_compact doesn't exist. The
problem is that in kmp_affinity.h there is a function which uses it without the
proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to
ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on
it, but I think it is cleaner to have it under the proper guard. Since the
function is only used in the kmp_affinity.cpp file and there aren't any plans to
have it elsewhere. I have moved it there.
llvm-svn: 280542
Rui Ueyama [Fri, 2 Sep 2016 20:46:06 +0000 (20:46 +0000)]
Move a test file to the right place.
llvm-svn: 280541
Rui Ueyama [Fri, 2 Sep 2016 20:40:53 +0000 (20:40 +0000)]
Remove useless file prefix.
Differential Revision: https://reviews.llvm.org/D24207
llvm-svn: 280540
Hans Wennborg [Fri, 2 Sep 2016 20:39:46 +0000 (20:39 +0000)]
Remove link to clang's release notes; keeping it up-to-date is hard
llvm-svn: 280539
Jonathan Peyton [Fri, 2 Sep 2016 20:35:47 +0000 (20:35 +0000)]
Decouple the kmp_affin_mask_t type from determining if affinity is capable
the __kmp_affinity_determine_capable() functions are highly operating system
specific. This change has the functions use the type they expect explicitly.
llvm-svn: 280538
James Y Knight [Fri, 2 Sep 2016 20:29:11 +0000 (20:29 +0000)]
[Sparc] Mark i128 shift libcalls unavailable in 32-bit mode.
Recently, llvm wants to emit calls to these functions, while it didn't
seem to be an issue before. Not sure why. Nor do I know why only these
three are important to disable, out of all of the i128 libcalls.
Nevertheless, many other targets have this snippet of code, so, just
copying it to sparc as well, to unbreak things.
llvm-svn: 280537
Rui Ueyama [Fri, 2 Sep 2016 20:20:04 +0000 (20:20 +0000)]
Use od instead of hexdump.
od is defined by POSIX and exists since version 1 AT&T Unix.
hexdump is not part of any standard as far as I know.
So od is a better choice than hexdump.
Differential Revision: https://reviews.llvm.org/D24205
llvm-svn: 280536
Jan Vesely [Fri, 2 Sep 2016 20:13:19 +0000 (20:13 +0000)]
AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements.
Fixes R600 piglit regressions since r280298
Differential Revision: https://reviews.llvm.org/D24174
llvm-svn: 280535
Sjoerd Meijer [Fri, 2 Sep 2016 19:51:34 +0000 (19:51 +0000)]
Setting fp trapping mode and denormal type: this an improvement of
r280246 and calculates compatibility of functions attributes in
a better way.
Differential Revision: https://reviews.llvm.org/D24070
llvm-svn: 280534
Rui Ueyama [Fri, 2 Sep 2016 19:49:27 +0000 (19:49 +0000)]
Simplify. NFC.
llvm-svn: 280533
Krzysztof Parzyszek [Fri, 2 Sep 2016 19:48:55 +0000 (19:48 +0000)]
Do not consider subreg defs as reads when computing subrange liveness
Subregister definitions are considered uses for the purpose of tracking
liveness of the whole register. At the same time, when calculating live
interval subranges, subregister defs should not be treated as uses.
Differential Revision: https://reviews.llvm.org/D24190
llvm-svn: 280532
Sanjay Patel [Fri, 2 Sep 2016 19:38:37 +0000 (19:38 +0000)]
[InstCombine] auto-generate assertions for tighter checking
llvm-svn: 280531
Jonathan Peyton [Fri, 2 Sep 2016 19:37:12 +0000 (19:37 +0000)]
Replace a bad instance of __kmp_free() with KMP_CPU_FREE_ARRAY() macro.
llvm-svn: 280530
Rui Ueyama [Fri, 2 Sep 2016 19:36:29 +0000 (19:36 +0000)]
Rename UnresolvedPolicy::Error -> UnresolvedPolicy::ReportError.
"Error" looks like it is indicating a parse error. "Error" actually
instructs the later process to report an error if there's an error
condition. Thus the new name.
llvm-svn: 280529
Rui Ueyama [Fri, 2 Sep 2016 19:20:33 +0000 (19:20 +0000)]
Add -nostdlib.
llvm-svn: 280528
Chad Rosier [Fri, 2 Sep 2016 19:09:50 +0000 (19:09 +0000)]
[SLP] Don't pass a global CL option as an argument. NFC.
Differential Revision: https://reviews.llvm.org/D24199
llvm-svn: 280527
Jan Vesely [Fri, 2 Sep 2016 19:07:06 +0000 (19:07 +0000)]
AMDGPU/R600: Expand unaligned writes to local and global AS
LOCAL and GLOBAL AS only
PRIVATE needs special treatment
Differential Revision: https://reviews.llvm.org/D23971
llvm-svn: 280526
Eric Fiselier [Fri, 2 Sep 2016 18:53:31 +0000 (18:53 +0000)]
Implement __attribute__((require_constant_initialization)) for safe static initialization.
Summary:
This attribute specifies expectations about the initialization of static and
thread local variables. Specifically that the variable has a
[constant initializer](http://en.cppreference.com/w/cpp/language/constant_initialization)
according to the rules of [basic.start.static]. Failure to meet this expectation
will result in an error.
Static objects with constant initializers avoid hard-to-find bugs caused by
the indeterminate order of dynamic initialization. They can also be safely
used by other static constructors across translation units.
This attribute acts as a compile time assertion that the requirements
for constant initialization have been met. Since these requirements change
between dialects and have subtle pitfalls it's important to fail fast instead
of silently falling back on dynamic initialization.
```c++
// -std=c++14
#define SAFE_STATIC __attribute__((require_constant_initialization)) static
struct T {
constexpr T(int) {}
~T();
};
SAFE_STATIC T x = {42}; // OK.
SAFE_STATIC T y = 42; // error: variable does not have a constant initializer
// copy initialization is not a constant expression on a non-literal type.
```
This attribute can only be applied to objects with static or thread-local storage
duration.
Reviewers: majnemer, rsmith, aaron.ballman
Subscribers: jroelofs, cfe-commits
Differential Revision: https://reviews.llvm.org/D23385
llvm-svn: 280525
Rui Ueyama [Fri, 2 Sep 2016 18:52:41 +0000 (18:52 +0000)]
Dispatch without hash table lookup.
Cmd used to be the single central place to dispatch. It is not longer
the case because we have a logic for readProvideOrAssignment().
This patch removes the hash table so that evrything is in a single
function. This is slightly verbose but should improve readability.
Differential Revision: https://reviews.llvm.org/D24200
llvm-svn: 280524
Jan Vesely [Fri, 2 Sep 2016 18:52:28 +0000 (18:52 +0000)]
AMDGPU: Reorganize store tests
Split by AS.
Merge with some prviously failing tests.
Differential Revision: https://reviews.llvm.org/D23969
llvm-svn: 280523
Reid Kleckner [Fri, 2 Sep 2016 18:43:27 +0000 (18:43 +0000)]
[codeview] Use the correct max CV record length of 0xFF00
Previously we were splitting our records at 0xFFFF bytes, which the
Microsoft tools don't like.
Should fix failure on the new Windows self-host buildbot.
This length appears in microsoft-pdb/PDB/dbi/dbiimpl.h
llvm-svn: 280522
Eric Fiselier [Fri, 2 Sep 2016 18:43:25 +0000 (18:43 +0000)]
Revert r280516 since it contained accidental changes.
llvm-svn: 280521
Aaron Ballman [Fri, 2 Sep 2016 18:31:31 +0000 (18:31 +0000)]
Based on post-commit feedback over IRC with dblaikie, ideally, we should have a SmallVector constructor that accepts anything which can supply a range via ADL begin()/end() calls so that we can construct the SmallVector directly from anything range-like.
Since that doesn't exist right now, use a local variable instead of calling getAssocExprs() twice; NFC.
llvm-svn: 280520
Jonathan Peyton [Fri, 2 Sep 2016 18:29:45 +0000 (18:29 +0000)]
Use 'critical' reduction method when 'atomic' is not available but requested.
In case atomic reduction method is not available (the compiler can't generate
it) the assertion failure occurred if KMP_FORCE_REDUCTION=atomic was specified.
This change replaces the assertion with a warning and sets the reduction method
to the default one - 'critical'.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D23990
llvm-svn: 280519
Kyle Butt [Fri, 2 Sep 2016 18:29:28 +0000 (18:29 +0000)]
IfConversion: Add assertions that both sides of a diamond don't pred-clobber.
One side of a diamond may end with a predicate clobbering instruction.
That side of the diamond has to be if-converted second. Both sides can't
clobber the predicate or the ifconversion is invalid. This is checked
elsewhere, but add an assert as a safety check. NFC
llvm-svn: 280518
Kyle Butt [Fri, 2 Sep 2016 18:29:26 +0000 (18:29 +0000)]
IfConversion: Fix bug introduced by rescanning diamonds.
Passing the wrong values for predicate-clobbering. Simple to miss.
Added an assert to make this easier to catch in the future.
llvm-svn: 280517
Eric Fiselier [Fri, 2 Sep 2016 18:25:29 +0000 (18:25 +0000)]
Implement __attribute__((require_constant_initialization)) for safe static initialization.
Summary:
This attribute specifies expectations about the initialization of static and
thread local variables. Specifically that the variable has a
[constant initializer](http://en.cppreference.com/w/cpp/language/constant_initialization)
according to the rules of [basic.start.static]. Failure to meet this expectation
will result in an error.
Static objects with constant initializers avoid hard-to-find bugs caused by
the indeterminate order of dynamic initialization. They can also be safely
used by other static constructors across translation units.
This attribute acts as a compile time assertion that the requirements
for constant initialization have been met. Since these requirements change
between dialects and have subtle pitfalls it's important to fail fast instead
of silently falling back on dynamic initialization.
```c++
// -std=c++14
#define SAFE_STATIC __attribute__((require_constant_initialization)) static
struct T {
constexpr T(int) {}
~T();
};
SAFE_STATIC T x = {42}; // OK.
SAFE_STATIC T y = 42; // error: variable does not have a constant initializer
// copy initialization is not a constant expression on a non-literal type.
```
This attribute can only be applied to objects with static or thread-local storage
duration.
Reviewers: majnemer, rsmith, aaron.ballman
Subscribers: jroelofs, cfe-commits
Differential Revision: https://reviews.llvm.org/D23385
llvm-svn: 280516
Rui Ueyama [Fri, 2 Sep 2016 18:19:00 +0000 (18:19 +0000)]
Add comments.
llvm-svn: 280515
Enrico Granata [Fri, 2 Sep 2016 18:15:48 +0000 (18:15 +0000)]
Check for null
llvm-svn: 280513
Jason Henline [Fri, 2 Sep 2016 18:07:48 +0000 (18:07 +0000)]
[SE] Remove broken doc ref
llvm-svn: 280512
Jason Henline [Fri, 2 Sep 2016 17:59:12 +0000 (17:59 +0000)]
[SE] Doc tweaks
Summary:
* Sections on main page.
* Use std algorithm for equality check in example.
* Add tree view on left side.
* Add extra CSS sheet to restrict content width.
* Add mild background color.
* Restrict alphabetic indexes to 1 column.
* Round corners of content boxes.
* Rename example to CUDASaxpy.cpp.
* Add CUDASaxpy.cpp to "Examples" section.
Reviewers: jprice
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24198
llvm-svn: 280511
Rui Ueyama [Fri, 2 Sep 2016 17:34:17 +0000 (17:34 +0000)]
Remove temoprary files.
Previously, we created temporary files using llvm::sys::fs::createTemporaryFile
and removed them using llvm::FileRemover. This is error-prone as it is easy to
forget creating FileRemover instances after creating temporary files.
There is actually a temporary file leak bug.
This patch introduces a new class, TemporaryFile, to manage temporary files
in the RAII style.
Differential Revision: https://reviews.llvm.org/D24176
llvm-svn: 280510
Jason Henline [Fri, 2 Sep 2016 17:22:42 +0000 (17:22 +0000)]
[SE] GlobalDeviceMemory owns its handle
Summary:
Final step in getting GlobalDeviceMemory to own its handle.
* Make GlobalDeviceMemory movable, but no longer copyable.
* Make Device::freeDeviceMemory function private and make
GlobalDeviceMemoryBase a friend of Device so GlobalDeviceMemoryBase
can free its memory in its destructor.
* Make GlobalDeviceMemory constructor private and make Device a friend
so it can construct GlobalDeviceMemory.
* Remove SharedDeviceMemoryBase class because it is never used.
* Remove explicit memory freeing from example code.
This change just consumes any errors generated during device memory freeing.
The real error handling will be added in a future patch.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24195
llvm-svn: 280509
Adam Nemet [Fri, 2 Sep 2016 17:20:32 +0000 (17:20 +0000)]
Fix up comment from r280442, noticed by Justin.
llvm-svn: 280508
Rui Ueyama [Fri, 2 Sep 2016 17:19:28 +0000 (17:19 +0000)]
Fix potential test failures.
Windows does not allow opened files to be removed. This patch
fixes two types of errors.
- Output file being the same as input file. Because LLD itself
holds a file descriptor of the input file, it cannot create an
output file with the same name as a new file.
- Removing files before releasing MemoryBuffer objects.
These tests are not failing no because MemoryBuffer happens to
decide not to use mmap on these files. But we shouldn't rely on
that behavior.
llvm-svn: 280507
Jason Henline [Fri, 2 Sep 2016 17:19:19 +0000 (17:19 +0000)]
[SE] Add "install" actions to cmake build
The "install" build target will now copy the StreamExecutor library and
headers to the appropriate subdirectories of CMAKE_INSTALL_PREFIX.
llvm-svn: 280506
Wei Mi [Fri, 2 Sep 2016 17:17:04 +0000 (17:17 +0000)]
Split the store of a wide value merged from an int-fp pair into multiple stores.
For the store of a wide value merged from a pair of values, especially int-fp pair,
sometimes it is more efficent to split it into separate narrow stores, which can
remove the bitwise instructions or sink them to colder places.
Now the feature is only enabled on x86 target, and only store of int-fp pair is
splitted. It is possible that the application scope gets extended with perf evidence
support in the future.
Differential Revision: https://reviews.llvm.org/D22840
llvm-svn: 280505
Sanjay Patel [Fri, 2 Sep 2016 17:05:43 +0000 (17:05 +0000)]
[InsttCombine] fold insertelement of constant into shuffle with constant operand (PR29126)
The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards
shrinking that to a single shufflevector.
Note that the transform is intentionally limited to shuffles that are equivalent to vector
selects to avoid creating arbitrary shuffle masks that may not lower well.
This should solve PR29126:
https://llvm.org/bugs/show_bug.cgi?id=29126
Differential Revision: https://reviews.llvm.org/D23886
llvm-svn: 280504
Davide Italiano [Fri, 2 Sep 2016 16:37:31 +0000 (16:37 +0000)]
[lib/LTO] Simplify. No functional change intended.
llvm-svn: 280503
Reid Kleckner [Fri, 2 Sep 2016 16:33:15 +0000 (16:33 +0000)]
Quick fix to make LIT_PRESERVES_TMP work again
llvm-svn: 280502
Reid Kleckner [Fri, 2 Sep 2016 16:29:24 +0000 (16:29 +0000)]
[lit] Clean up temporary files created by tests
Do this by creating a temp directory in the normal system temp
directory, and cleaning it up on exit.
It is still possible for this temp directory to leak if Python exits
abnormally, but this is probably good enough for now.
Fixes PR18335
llvm-svn: 280501
Derek Schuff [Fri, 2 Sep 2016 16:26:24 +0000 (16:26 +0000)]
[WebAssembly] Update known test failures
Fixed an issue with the experimental C headers
llvm-svn: 280498
Matthew Simpson [Fri, 2 Sep 2016 16:19:22 +0000 (16:19 +0000)]
[LV] Ensure reverse interleaved group GEPs remain uniform
For uniform instructions, we're only required to generate a scalar value for
the first vector lane of each unroll iteration. Thus, if we have a reverse
interleaved group, computing the member index off the scalar GEP corresponding
to the last vector lane of its pointer operand technically makes the GEP
non-uniform. We should compute the member index off the first scalar GEP
instead.
I've added the updated member index computation to the existing reverse
interleaved group test.
llvm-svn: 280497
Jason Henline [Fri, 2 Sep 2016 16:10:51 +0000 (16:10 +0000)]
[SE] Don't pack raw device mem args
Summary:
Step 4 of getting GlobalDeviceMemory to own its handle.
Take out code to pack untyped device memory types as kernel arguments.
When GlobalDeviceMemory owns its handle, users will never touch untyped
device memory types, so they will never pass them as kernel args.
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24177
llvm-svn: 280496
George Rimar [Fri, 2 Sep 2016 16:01:42 +0000 (16:01 +0000)]
[ELF] - Linkerscript: add support for suffixes in numbers.
Both bfd and gold accept:
foo = 1K;
bar = 1M;
zed = 1H;
And lowercase forms: k, m, h.
Patch adds support for that.
Differential revision: https://reviews.llvm.org/D24194
llvm-svn: 280494
Tamas Berghammer [Fri, 2 Sep 2016 15:56:33 +0000 (15:56 +0000)]
Fix build breakage caused by r280490
llvm-svn: 280492
Andrea Di Biagio [Fri, 2 Sep 2016 15:55:25 +0000 (15:55 +0000)]
Simplify code a bit. No functional change intended.
We don't need to call `GetCompareTy(LHS)' every single time true or false is
returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142.
llvm-svn: 280491
Tamas Berghammer [Fri, 2 Sep 2016 15:52:19 +0000 (15:52 +0000)]
Fix 2 waring in the OCaml AST context
llvm-svn: 280490
Sanjay Patel [Fri, 2 Sep 2016 15:43:25 +0000 (15:43 +0000)]
fix documentation comments; NFC
llvm-svn: 280489