Sanjay Patel [Mon, 22 Jun 2020 13:26:46 +0000 (09:26 -0400)]
[VectorCombine] add helper to replace uses and rename
The tests are regenerated to show a path that missed renaming,
but there should be no functional difference from this patch.
Valentin Clement [Mon, 22 Jun 2020 13:56:14 +0000 (09:56 -0400)]
Revert commit 9e52530 because of dependencies issue
This reverts commit
9e525309fb3cbea4ab341b54d127d97831962285.
Valentin Clement [Mon, 22 Jun 2020 13:32:47 +0000 (09:32 -0400)]
[openmp] Base of tablegen generated OpenMP common declaration
Summary:
As discussed previously when landing the patch for OpenMP in Flang, the idea is
to share the common part of the OpenMP declarations between the different frontends.
While doing this, it was thought that moving to tablegen instead of macros would also
give a cleaner and more powerful way of generating these declarations.
This first part of a future series of patches sets up the base .td file for
DirectiveLanguage as well as the OpenMP version of it. The base file is meant to
be used by other directive languages such as OpenACC.
In this first patch, the Directive and Clause enums are generated with tablegen
instead of the macros in OMPConstants.h. The next patch will extend this
to other enums and move the Flang frontend to use it.
Reviewers: jdoerfert, DavidTruby, fghanim, ABataev, jdenny, hfinkel, jhuber6, kiranchandramohan, kiranktp
Reviewed By: jdoerfert, jdenny
Subscribers: cfe-commits, mgorny, yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits
Tags: #llvm, #openmp, #clang
Differential Revision: https://reviews.llvm.org/D81736
Xing GUO [Mon, 22 Jun 2020 13:35:51 +0000 (21:35 +0800)]
[DWARFYAML][debug_info] Add support for error handling.
This patch helps add support for error handling in `DWARFYAML::emitDebugInfo()`.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82275
Xing GUO [Mon, 22 Jun 2020 13:33:00 +0000 (21:33 +0800)]
[DWARFYAML][debug_info] Use 'AbbrCode' to index the abbreviation.
Before this patch, we use `(uint32_t)AbbrCode - (uint32_t)FirstAbbrCode` to index the abbreviation. That makes it impossible to use an abbreviation that precedes the one used by the previous DIE (e.g., if the previous DIE's `AbbrCode` is 2, we are unable to use the abbreviation with index 1). In this patch, we use `AbbrCode` to index the abbreviation directly.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82173
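The difference between the two indexing schemes can be sketched as follows. This is a minimal illustration, not the DWARFYAML code: the `Abbrev` struct and both lookup helpers are hypothetical, and it assumes abbreviation codes are 1-based and stored in order.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of an abbreviation table.
struct Abbrev {
  uint32_t Code;   // 1-based abbreviation code (assumption)
  std::string Tag; // placeholder payload
};

// Relative scheme: index = AbbrCode - FirstAbbrCode. If AbbrCode is smaller
// than FirstAbbrCode, the unsigned subtraction wraps and the lookup fails.
const Abbrev *lookupRelative(const std::vector<Abbrev> &Table,
                             uint32_t AbbrCode, uint32_t FirstAbbrCode) {
  uint32_t Index = AbbrCode - FirstAbbrCode;
  return Index < Table.size() ? &Table[Index] : nullptr;
}

// Direct scheme: the code itself selects the entry (code N is in slot N - 1).
const Abbrev *lookupDirect(const std::vector<Abbrev> &Table,
                           uint32_t AbbrCode) {
  if (AbbrCode == 0 || AbbrCode > Table.size())
    return nullptr;
  return &Table[AbbrCode - 1];
}
```

With a two-entry table, `lookupRelative(Table, 1, 2)` finds nothing, while `lookupDirect(Table, 1)` returns the first entry.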
Simon Pilgrim [Mon, 22 Jun 2020 13:17:02 +0000 (14:17 +0100)]
[DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI.
We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.
David Spickett [Mon, 22 Jun 2020 13:18:54 +0000 (14:18 +0100)]
Revert "[clang][Driver] Correct tool search path priority"
Revert
028571d60843cb87e2637ef69ee09090d4526c62 to investigate
MacOS failure.
(also the review link was incorrect, should be
https://reviews.llvm.org/D79842)
Raphael Isemann [Sat, 20 Jun 2020 17:30:20 +0000 (19:30 +0200)]
[lldb][NFC] Add more tests for builtin formats
Reland
90c1af106a20785ffd01c0d6a41db8bc0160fd11 . This changes the char format
tests which were printing the pointer value of the C-string instead of its
contents, so this test failed on other machines. Now they just print the
bytes in a uint128_t.
Original commit description:
The previous tests apparently missed a few code branches in DumpDataExtractor
code. Also renames the 'test_instruction' which had the same name as another
test (and Python therefore ignored the test entirely).
Sanjay Patel [Mon, 22 Jun 2020 12:57:37 +0000 (08:57 -0400)]
[VectorCombine] add/use pass-level IRBuilder
This saves creating/destroying a builder every time we
perform some transform.
The tests show instruction ordering diffs resulting from
always inserting at the root instruction now, but those
should be benign.
Jay Foad [Fri, 19 Jun 2020 13:27:15 +0000 (14:27 +0100)]
[AMDGPU] Update more live intervals in SIWholeQuadMode
This fixes various assertion failures that would otherwise be triggered
by a later patch to move SIWholeQuadMode later in the pass pipeline.
Differential Revision: https://reviews.llvm.org/D82190
Georgii Rymar [Fri, 19 Jun 2020 13:21:57 +0000 (16:21 +0300)]
[llvm-readelf] - Do not crash when dumping the dynamic symbol table when its sh_entsize == 0.
We have a division by zero crash currently when
the sh_entsize of the dynamic symbol table is 0.
Differential revision: https://reviews.llvm.org/D82180
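A guard of the kind described above might look like the sketch below. The helper name and signature are hypothetical and are not taken from llvm-readelf; the point is only that the entry size is checked before it is used as a divisor.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: computes how many entries a table section holds.
// Returns false and fills Err instead of dividing by a zero entry size.
bool getEntryCount(uint64_t SectionSize, uint64_t EntSize, uint64_t &Count,
                   std::string &Err) {
  if (EntSize == 0) {
    Err = "section has sh_entsize == 0";
    return false;
  }
  Count = SectionSize / EntSize;
  return true;
}
```

A caller can then report `Err` as a warning and skip the dump rather than crash.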
Yaxun (Sam) Liu [Mon, 22 Jun 2020 12:00:57 +0000 (08:00 -0400)]
Let HIP default include respect -nogpuinc and -nogpulib
Sanjay Patel [Mon, 22 Jun 2020 12:32:55 +0000 (08:32 -0400)]
[VectorCombine] improve IR debugging by providing/salvaging value names
The tests are regenerated to show the diffs, but there should be no
functional change from this patch.
Tim Corringham [Mon, 22 Jun 2020 11:28:24 +0000 (12:28 +0100)]
[AMDGPU] clang-format of SIModeRegister.cpp
Ran clang-format just to ease future reviews. No functional changes.
Georgii Rymar [Fri, 19 Jun 2020 15:37:15 +0000 (18:37 +0300)]
[llvm-readobj] - Validate the DT_STRSZ value to avoid crash.
It is possible to trigger a crash when a dynamic symbol has a
broken (too large) st_name and the DT_STRSZ is also broken.
We have the following code in the `Elf_Sym_Impl<ELFT>::getName`:
```
template <class ELFT>
Expected<StringRef> Elf_Sym_Impl<ELFT>::getName(StringRef StrTab) const {
  uint32_t Offset = this->st_name;
  if (Offset >= StrTab.size())
    return createStringError(object_error::parse_failed,
                             "st_name (0x%" PRIx32
                             ") is past the end of the string table"
                             " of size 0x%zx",
                             Offset, StrTab.size());
  ...
```
The problem is that `StrTab` here is an `ELFDumper::DynamicStringTab` member
which is not validated properly on initialization. So it is possible to bypass the
`if` even when the `st_name` is huge.
This patch fixes the issue.
Differential revision: https://reviews.llvm.org/D82201
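One way to validate a string table's extent before trusting it is sketched below. This is an assumption about the shape of such a check, not the code from the patch; `isRangeInFile` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical check: does [Offset, Offset + Size) lie inside a file of
// FileSize bytes? Written without computing Offset + Size, which could
// overflow for a broken (huge) Size such as a corrupt DT_STRSZ.
bool isRangeInFile(uint64_t FileSize, uint64_t Offset, uint64_t Size) {
  return Offset <= FileSize && Size <= FileSize - Offset;
}
```

If the check fails, the dumper can fall back to an empty string table, so the `Offset >= StrTab.size()` test in `getName` rejects every `st_name`.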
Anton Korobeynikov [Mon, 22 Jun 2020 11:29:32 +0000 (14:29 +0300)]
Attempt to unbreak the test introduced in
359fae6eb094 on Windows
Simon Pilgrim [Mon, 22 Jun 2020 11:11:11 +0000 (12:11 +0100)]
[DAG] SimplifyMultipleUseDemandedBits - drop unnecessary *_EXTEND_VECTOR_INREG cases
For little endian targets, if we only need the lowest element and none of the extended bits then we can just use the (bitcasted) source vector directly.
We already do this in SimplifyDemandedBits, this adds the SimplifyMultipleUseDemandedBits equivalent.
Simon Pilgrim [Sat, 20 Jun 2020 14:59:46 +0000 (15:59 +0100)]
OptimizationRemarkEmitter.h - reduce unnecessary Function.h include to forward declaration. NFC.
Jakub Lichman [Mon, 22 Jun 2020 11:23:39 +0000 (13:23 +0200)]
[mlir] Fix linalg.generic matmul example in the doc
The example Matmul implementation using the linalg.generic operation contained a few mistakes that could puzzle newcomers trying to run the example.
Differential Revision: https://reviews.llvm.org/D82289
Denys Petrov [Mon, 22 Jun 2020 11:17:43 +0000 (14:17 +0300)]
[analyzer] Handle `\l` symbol in string literals in exploded-graph-rewriter
Fix for test due to build-bot complaints.
Tres Popp [Mon, 22 Jun 2020 11:06:18 +0000 (13:06 +0200)]
Revert "[CGP] Enable CodeGenPrepares phi type convertion."
This reverts commit
67121d7b82ed78a47ea32f0c87b7317e2b469ab2.
This is causing compile times to be 2x slower on some large binaries.
Loïc Joly [Mon, 22 Jun 2020 10:41:39 +0000 (12:41 +0200)]
[ASTMatcher] Correct memoization bug ignoring direction (descendants or ancestors)
Summary:
In ASTMatcher, when we have `has(...)` and `hasParent(...)` called with the same internal matcher on the same node, the memoization process will mix up the two calls because the direction of the traversal is not part of the memoization key.
This patch adds this information.
Reviewers: klimek
Reviewed By: klimek
Subscribers: Godin, njames93, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80025
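The idea of making the traversal direction part of the memoization key can be sketched like this. All names here (`TraversalDirection`, `MemoKey`, `MemoCache`) are hypothetical and chosen for illustration; they are not the identifiers used in ASTMatchFinder.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical direction tag for a memoized traversal.
enum class TraversalDirection { Descendants, Ancestors };

// Hypothetical memoization key. Without the Direction field, has(...) and
// hasParent(...) on the same matcher and node would share one cache slot.
struct MemoKey {
  std::uintptr_t MatcherID;
  std::uintptr_t NodeID;
  TraversalDirection Direction;

  bool operator<(const MemoKey &Other) const {
    return std::tie(MatcherID, NodeID, Direction) <
           std::tie(Other.MatcherID, Other.NodeID, Other.Direction);
  }
};

// Cache from key to the memoized match result.
using MemoCache = std::map<MemoKey, bool>;
```

Two keys that differ only in `Direction` now occupy separate cache entries, so a result computed for descendants is never returned for an ancestor query.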
Serguei Katkov [Mon, 22 Jun 2020 10:33:00 +0000 (17:33 +0700)]
Revert "[Peeling] Extend the scope of peeling a bit"
This reverts commit
29b2c1ca72096ca06415b5e626e6728c42ef1e74.
The patch causes a DT verifier failure like:
DominatorTree is different than a freshly computed one!
Not sure the patch itself is wrong, but reverting to investigate the failure.
Vitaly Buka [Mon, 22 Jun 2020 08:28:25 +0000 (01:28 -0700)]
[StackSafety] Check variable lifetime
We can't consider variable safe if out-of-lifetime access is possible.
So if StackLifetime can't prove that the instruction always uses
the variable when it's still alive, we consider it unsafe.
Vitaly Buka [Mon, 22 Jun 2020 07:45:08 +0000 (00:45 -0700)]
[StackSafety] Ignore unreachable instructions
Usually DominatorTree provides this info, but here we use
StackLifetime. The reason is that in the next patch StackLifetime
will be used for actual lifetime checks and we can avoid
forwarding the DominatorTree into this code.
Denys Petrov [Mon, 22 Jun 2020 10:11:54 +0000 (13:11 +0300)]
[analyzer] Handle `\l` symbol in string literals in exploded-graph-rewriter
Summary:
Handle `\l` separately because a string literal can be in code like "string\\literal" with the `\l` inside. Also, on Windows the `__FILE__` macro produces `\` as the path delimiter, and a directory or file name may start with the letter `l`.
Fix:
Use regex for replacing all `\l` (like `,\l`, `}\l`, `[\l`) except `\\l`, because a literal as a rule contains multiple `\` before `\l`.
Differential Revision: https://reviews.llvm.org/D82092
Anton Korobeynikov [Mon, 22 Jun 2020 10:36:52 +0000 (13:36 +0300)]
Revert "[MSP430] Update register names"
This reverts commit
8f6620f663031da2bb35b788239f4b607271af84.
David Zarzycki [Mon, 22 Jun 2020 10:34:49 +0000 (06:34 -0400)]
Make ninja smart console builds more pretty
Summary: CMake's `find_package` outputs to the console on success, which confuses the smart console mode of the `ninja` build system. Let's quiet the success message and manually warn instead.
Reviewers: tstellar, phosek, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82276
Anatoly Trosinenko [Mon, 22 Jun 2020 10:22:59 +0000 (13:22 +0300)]
[MSP430] Update register names
When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4.
The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it.
Differential Revision: https://reviews.llvm.org/D82184
Momchil Velikov [Mon, 22 Jun 2020 10:14:53 +0000 (11:14 +0100)]
[LTO] Use StringRef instead of C-style strings in setCodeGenDebugOptions
Fixes an issue with missing nul-terminators and saves us some string
copying, compared to a version which would insert nul-terminators.
Differential Revision: https://reviews.llvm.org/D82033
Anatoly Trosinenko [Mon, 22 Jun 2020 10:14:02 +0000 (13:14 +0300)]
[MSP430] Enable some basic support for debug information
This commit technically permits LLVM to emit the debug information for ELF files for MSP430 architecture. Aside from this, it only defines the register numbers as defined by part 10.1 of MSP430 EABI specification (assuming the 1-byte subregisters share the register numbers with corresponding full-size registers).
This commit was basically tested by me with TI-provided GCC 8.3.1 toolchain by compiling an example program with `clang` (please note manual linking may be required due to upstream `clang` not yet handling the `-msim` option necessary to run binaries on the GDB-provided simulator) and then running it and single-stepping with `msp430-elf-gdb` like this:
```
$sysroot/bin/msp430-elf-gdb ./test -ex "target sim" -ex "load ./test"
(gdb) ... traditional GDB commands follow ...
```
While this implementation is most probably far from completeness and is considered experimental, it can already help with debugging MSP430 programs as well as finding issues in LLVM debug info support for MSP430 itself.
One of the use cases includes trying to find a point where UBSan check in a trap-on-error mode was triggered.
The expected debug information format is described in the [MSP430 Embedded Application Binary Interface](http://www.ti.com/lit/an/slaa534/slaa534.pdf) specification, part 10.
Differential Revision: https://reviews.llvm.org/D81488
Anatoly Trosinenko [Mon, 22 Jun 2020 10:08:20 +0000 (13:08 +0300)]
[DebugInfo] Explicitly permit addr_size = 0x02 when parsing DWARF data
Current LLVM implementation uses `MCAsmInfo::CodePointerSize` as addr_size when emitting the DWARF data. llvm-dwarfdump, on the other hand, handles `addr_size`s of 4 and 8 properly and considers all other sizes as an error. This works for most of mainline targets except for MSP430 and AVR.
msp430-gcc v8.3.1 emits DWARF32 with addr_size = 4 (DWARF32 does not imply addr_size = 4, 32 refers to internal offset width of 4 bytes) that is handled by llvm-dwarfdump already. Still, emitting 2-byte target pointers on MSP430 seems correct as well (but not for MSP430X that is supported by msp430-gcc but not by LLVM and has 20-bit address space).
This patch makes it possible for MSP430 debug info support to be tested with llvm-dwarfdump.
Differential Revision: https://reviews.llvm.org/D82055
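The accepted set of address sizes described above amounts to a predicate like the following. This is a sketch based only on the description (4 and 8 were already handled, 2 is newly permitted); the function name is hypothetical and the actual parser may be structured differently.

```cpp
#include <cstdint>

// Hypothetical predicate mirroring the commit description: 4 and 8 were
// already accepted, and 2 is now accepted as well (MSP430, AVR).
bool isAddressSizeSupported(uint8_t AddrSize) {
  return AddrSize == 2 || AddrSize == 4 || AddrSize == 8;
}
```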
Nathan James [Mon, 22 Jun 2020 10:07:21 +0000 (11:07 +0100)]
[clang-tidy] Improved accuracy of check list updater script
- Added `FixItHint` comments to Check files for the script to mark those checks as offering fix-its when the fix-its are generated in another file.
- Case insensitive file searching when looking for the file a checker code resides in.
Also regenerated the list, sphinx had no issue generating the docs after this.
Reviewed By: sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D81932
Florian Hahn [Mon, 22 Jun 2020 09:57:44 +0000 (10:57 +0100)]
[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).
Nathan James [Mon, 22 Jun 2020 09:56:04 +0000 (10:56 +0100)]
Fixed ASTMatchers registry and regen ast docs
Tobias Gysi [Mon, 22 Jun 2020 09:37:01 +0000 (11:37 +0200)]
[mlir] make the bitwidth of device side index computations configurable
The patch makes the index type lowering of the GPU to NVVM/ROCDL
conversion configurable. It introduces a pass option that controls the
bitwidth used when lowering index computations.
Differential Revision: https://reviews.llvm.org/D80285
Balázs Kéri [Mon, 22 Jun 2020 07:04:05 +0000 (09:04 +0200)]
[Analyzer][StreamChecker] Add note tags for file opening.
Summary:
Bug reports of resource leak are now improved.
If there are multiple resource leak paths for the same stream,
only one will be reported.
Reviewers: Szelethus, xazax.hun, baloghadamsoftware, NoQ
Reviewed By: Szelethus, NoQ
Subscribers: NoQ, rnkovacs, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gamesh411, Charusso, martong, ASDenysPetrov, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81407
Djordje Todorovic [Mon, 22 Jun 2020 07:01:59 +0000 (09:01 +0200)]
[CSInfo][MIPS] Don't describe parameters loaded by sub/super reg copy
When describing parameter value loaded by a COPY instruction, consider
case where needed Reg value is a sub- or super- register of the COPY
instruction's destination register. Without this patch, compile process
will crash with the assertion "TargetInstrInfo::describeLoadedValue
can't describe super- or sub-regs for copy instructions".
Patch by Nikola Tesic
Differential revision: https://reviews.llvm.org/D82000
David Spickett [Mon, 11 May 2020 16:13:00 +0000 (17:13 +0100)]
[clang][Driver] Correct tool search path priority
Summary:
As seen in:
https://bugs.llvm.org/show_bug.cgi?id=45693
When clang looks for a tool it has a set of
possible names for it, in priority order.
Previously it would look for these names in
the program path. Then look for all the names
in the PATH.
This means that aarch64-none-elf-gcc on the PATH
would lose to gcc in the program path.
(which was /usr/bin in the bug's case)
This changes that logic to search each name in both
possible locations, then move to the next name.
Which is more what you would expect to happen when
using a non default triple.
(-B prefixes maybe should follow this logic too,
but are not changed in this patch)
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79988
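The change in search order can be sketched as below. This is an illustration of the two orderings only: the function names, the `ExistsFn` callback, and the directory lists are hypothetical, and the real driver code handles more cases (such as -B prefixes).

```cpp
#include <functional>
#include <string>
#include <vector>

using ExistsFn = std::function<bool(const std::string &)>;

// Returns the first Dir/Name that exists, or "" if none does.
static std::string findIn(const std::vector<std::string> &Dirs,
                          const std::string &Name, const ExistsFn &Exists) {
  for (const std::string &Dir : Dirs) {
    std::string Candidate = Dir + "/" + Name;
    if (Exists(Candidate))
      return Candidate;
  }
  return "";
}

// Previous order: every name in the program path, then every name in PATH.
std::string findToolLocationFirst(const std::vector<std::string> &Names,
                                  const std::vector<std::string> &ProgramDirs,
                                  const std::vector<std::string> &PathDirs,
                                  const ExistsFn &Exists) {
  for (const std::string &Name : Names) {
    std::string Found = findIn(ProgramDirs, Name, Exists);
    if (!Found.empty())
      return Found;
  }
  for (const std::string &Name : Names) {
    std::string Found = findIn(PathDirs, Name, Exists);
    if (!Found.empty())
      return Found;
  }
  return "";
}

// New order: each name is tried in both locations before the next name.
std::string findToolNameFirst(const std::vector<std::string> &Names,
                              const std::vector<std::string> &ProgramDirs,
                              const std::vector<std::string> &PathDirs,
                              const ExistsFn &Exists) {
  for (const std::string &Name : Names) {
    std::string Found = findIn(ProgramDirs, Name, Exists);
    if (Found.empty())
      Found = findIn(PathDirs, Name, Exists);
    if (!Found.empty())
      return Found;
  }
  return "";
}
```

With names `aarch64-none-elf-gcc` then `gcc`, a plain `gcc` in the program path, and the triple-prefixed tool only on the PATH, the first function picks `gcc` and the second picks the prefixed tool.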
Stephan Herhut [Fri, 19 Jun 2020 12:14:30 +0000 (14:14 +0200)]
[mlir] Add for loop specialization
Summary:
We already had a parallel loop specialization pass that is used to
enable unrolling and consecutive vectorization by rewriting loops
whose bound is defined as a min of a constant and a dynamic value
into a loop with static bound (the constant) and the minimum as
bound, wrapped into a conditional to dispatch between the two.
This adds the same rewriting for for loops.
Differential Revision: https://reviews.llvm.org/D82189
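The rewriting described above can be pictured in plain C++ rather than MLIR. This is only an analogy under assumptions: `kStaticBound` stands for the constant operand of the min, and both functions are hypothetical. The specialized version dispatches on whether the minimum equals the constant, so one branch has a compile-time trip count that later unrolling or vectorization can exploit.

```cpp
#include <algorithm>

constexpr int kStaticBound = 4; // the constant operand of the min (assumed)

// Original form: the loop bound is min(constant, dynamic value).
int sumOriginal(const int *Data, int N) {
  int Bound = std::min(kStaticBound, N);
  int Sum = 0;
  for (int I = 0; I < Bound; ++I)
    Sum += Data[I];
  return Sum;
}

// Specialized form: a conditional dispatches between a loop with the static
// bound and a fallback loop that keeps the minimum as its bound.
int sumSpecialized(const int *Data, int N) {
  int Bound = std::min(kStaticBound, N);
  int Sum = 0;
  if (Bound == kStaticBound) {
    for (int I = 0; I < kStaticBound; ++I) // trip count known statically
      Sum += Data[I];
  } else {
    for (int I = 0; I < Bound; ++I)
      Sum += Data[I];
  }
  return Sum;
}
```

Both functions compute the same result for every `N`; only the shape of the loop nest differs.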
Vassil Vassilev [Fri, 19 Jun 2020 07:01:02 +0000 (07:01 +0000)]
Return false if the identifier is not in the global module index.
This allows clients to use the idiom:
  if (GlobalIndex->lookupIdentifier(Name, FoundModules)) {
    // work on the FoundModules
  }
This is also a minor performance improvement for clang.
Differential Revision: https://reviews.llvm.org/D81077
Serguei Katkov [Wed, 3 Jun 2020 10:56:08 +0000 (17:56 +0700)]
[Peeling] Extend the scope of peeling a bit
Currently we allow peeling of loops if there is an exiting latch block
and all other exits are blocks ending with deopt.
What we actually want is that each such exit ends up at a deopt unconditionally;
it is not required that the exit block itself ends with deopt.
Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
sameeran joshi [Mon, 22 Jun 2020 04:54:28 +0000 (10:24 +0530)]
[flang]Fix individual tests with lit when building out of tree
Summary:
Fix individual check tests with lit when building out-of-tree:
`ninja check-flang-<folder>` was not working.
The CMakeLists.txt was looking for the lit tests in the source directory
instead of the build directory.
This commit extends @CarolineConcatto's previous patch [D81002].
Reviewers: DavidTruby, sscalpone, tskeith, CarolineConcatto, jdoerfert
Reviewed By: DavidTruby
Subscribers: flang-commits, llvm-commits, CarolineConcatto
Tags: #flang, #llvm
Differential Revision: https://reviews.llvm.org/D82120
Craig Topper [Mon, 22 Jun 2020 03:30:13 +0000 (20:30 -0700)]
[X86] Add an AVX check prefix to bitcast-vector-bool.ll to combine checks where AVX1/2/512 are all the same. NFC
Craig Topper [Mon, 22 Jun 2020 00:46:33 +0000 (17:46 -0700)]
[X86] Add test file that was supposed to go with D81327.
Must have forgotten to git add the file.
Michael Liao [Fri, 19 Jun 2020 04:09:20 +0000 (00:09 -0400)]
[amdgpu] Fix REL32 relocations with negative offsets.
Summary: - The offset should be treated as a signed one.
Reviewers: rampitec, arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82234
Jez Ng [Mon, 15 Jun 2020 07:03:24 +0000 (00:03 -0700)]
[lld-macho] Refactor segment/section creation, sorting, and merging
Summary:
There were a few issues with the previous setup:
1. The section sorting comparator used a declarative map of section names to
determine the correct order, but it turns out we need to match on more than
just names -- in particular, an upcoming diff will sort based on whether the
S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but
flexible form.
2. We were sorting OutputSections stored in a MapVector, which left the
MapVector in an inconsistent state -- the wrong keys map to the wrong values!
In practice, we weren't doing key lookups (only container iteration) after the
sort, so this was fine, but it was still a dubious state of affairs. This diff
copies the OutputSections to a vector before sorting them.
3. We were adding unneeded OutputSections to OutputSegments and then filtering
them out later, which meant that we had to remember whether an OutputSegment
was in a pre- or post-filtered state. This diff only adds the sections to the
segments if they are needed.
In addition to those major changes, two minor ones worth noting:
1. I renamed all OutputSection variable names to `osec`, to parallel `isec`.
Previously we were using some inconsistent combination of `osec`, `os`, and
`section`.
2. I added a check (and a test) for InputSections with names that clashed with
those of our synthetic OutputSections.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81887
Craig Topper [Sun, 21 Jun 2020 23:20:24 +0000 (16:20 -0700)]
[X86] Add cooperlake and tigerlake to the enum in cpu_model.c
I forgot to do this when I added them to _cpu_indicator_init.
Craig Topper [Sun, 21 Jun 2020 20:24:43 +0000 (13:24 -0700)]
[X86] Assign a feature priority to 'tigerlake' so it won't assert when used with function multiversioning
Also test cooperlake since it was also just added to function
multiversioning when it was enabled for __builtin_cpu_is.
Sanjay Patel [Sun, 21 Jun 2020 19:57:07 +0000 (15:57 -0400)]
[VectorCombine] create class for pass to hold analyses, etc; NFC
This doesn't change anything currently, but it would make sense
to create a class-level IRBuilder instead of recreating that
everywhere. As we expand to more optimizations, we will probably
also want to hold things like the DataLayout or other constant
refs in here too.
Craig Topper [Sun, 21 Jun 2020 18:30:00 +0000 (11:30 -0700)]
[X86] Add 'cooperlake' and 'tigerlake' to __builtin_cpu_is.
Cooperlake can be detected by compiler-rt now, but not libgcc yet.
Tigerlake can't be detected by either. Both names are accepted by
gcc. Hopefully the detection code will be in place soon.
Craig Topper [Sun, 21 Jun 2020 07:03:44 +0000 (00:03 -0700)]
[X86] Add cooperlake detection to _cpu_indicator_init.
libgcc has had this enum encoding defined for a while, but their
detection code is missing. I've raised a bug with them so that
should get fixed soon.
Nathan James [Sun, 21 Jun 2020 18:01:09 +0000 (19:01 +0100)]
[clang-tidy] Implement storeOptions for checks missing it.
Just adds the storeOptions for Checks that weren't already storing their options.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D82223
Luboš Luňák [Sun, 21 Jun 2020 16:59:51 +0000 (18:59 +0200)]
fix clang/PCH/delayed-pch-instantiate test
-target must match between PCH creation and use.
David Green [Sun, 21 Jun 2020 11:56:30 +0000 (12:56 +0100)]
[CGP] Enable CodeGenPrepares phi type convertion.
Florian Hahn [Sun, 21 Jun 2020 15:34:54 +0000 (16:34 +0100)]
[DSE,MSSA] Move reachability check to main loop.
As we traverse the CFG backwards, we could end up reaching unreachable
blocks. For unreachable blocks, we won't have computed post order
numbers and because DomAccess is reachable, unreachable blocks cannot be
on any path from it.
This fixes a crash with unreachable blocks.
Luboš Luňák [Sun, 19 Apr 2020 15:49:47 +0000 (17:49 +0200)]
add option to instantiate templates already in the PCH
Add -fpch-instantiate-templates which makes template instantiations be
performed already in the PCH instead of it being done in every single
file that uses the PCH (but every single file will still do it as well
in order to handle its own instantiations). I can see 20-30% build
time saved with the few tests I've tried.
The change may reorder compiler output and also generated code, but
should be generally safe and produce functionally identical code.
There are some rare cases that do not compile with it,
such as test/PCH/pch-instantiate-templates-forward-decl.cpp. If
template instantiation bailed out instead of reporting the error,
these instantiations could even be postponed, which would make them
work.
Enable this by default for clang-cl. MSVC creates PCHs by compiling
them using an empty .cpp file, which means templates are instantiated
while building the PCH and so the .h needs to be self-contained,
making test/PCH/pch-instantiate-templates-forward-decl.cpp fail
with MSVC anyway. So the option being enabled for clang-cl matches this.
Differential Revision: https://reviews.llvm.org/D69585
David Green [Sun, 21 Jun 2020 10:28:31 +0000 (11:28 +0100)]
[CGP] Convert phi types
If a collection of interconnected phi nodes is only ever loaded, stored
or bitcast then we can convert the whole set to the bitcast type,
potentially helping to reduce the number of register moves needed as the
phis are passed across basic block boundaries. This has to be done in
CodegenPrepare as it naturally straddles basic blocks.
The algorithm just looks from phi nodes, looking at uses and operands for
a collection of nodes that all together are bitcast between float and
integer types. We record visited phi nodes to not have to process them
more than once. The whole subgraph is then replaced with a new type.
Loads and Stores are bitcast to the correct type, which should then be
folded into the load/store, changing its type.
This comes up in the biquad testcase due to the way MVE needs to keep
values in integer registers. I have also seen it come up from aarch64
partner example code, where a complicated set of sroa/inlining produced
integer phis, where float would have been a better choice.
I also added undef and extract element handling which increased the
potency in some cases.
This adds it with an option that defaults to off, and disabled for 32bit
X86 due to potential issues around canonicalizing NaNs.
Differential Revision: https://reviews.llvm.org/D81827
David Green [Sun, 21 Jun 2020 10:07:07 +0000 (11:07 +0100)]
[CGP][AArch64] Convert Phi type tests. NFC
Nikita Popov [Sat, 20 Jun 2020 11:59:24 +0000 (13:59 +0200)]
[ValueTracking, BasicAA] Don't simplify instructions
GetUnderlyingObject() (and by required symmetry
DecomposeGEPExpression()) will call SimplifyInstruction() on the
passed value if other checks fail. This simplification is very
expensive, but has little effect in practice. This patch removes
the SimplifyInstruction() call, and replaces it with a check for
single-argument phis (which can occur in canonical IR in LCSSA
form), which is the only useful simplification case I was able to
identify.
At O3 the geomean CTMark improvement is -1.7%. The largest
improvement is SPASS with ThinLTO at -6%.
In test-suite, I see only two tests with a hash difference and
no code size difference (PAQ8p, Ptrdist), which indicates that
the simplification only ends up being useful very rarely. (I would
have liked to figure out which simplification is responsible here,
but wasn't able to spot it looking at transformation logs.)
The AMDGPU test case that is updated was using two selects with
undef condition, in which case GetUnderlyingObject will return
the first select operand as the underlying object. This will of
course not happen with non-undef conditions, so this was not
testing anything realistic. Additionally this illustrates potential
unsoundness: While GetUnderlyingObject will pick the first operand,
the select might be later replaced by the second operand, resulting
in inconsistent assumptions about the undef value.
Differential Revision: https://reviews.llvm.org/D82261
Bruno Ricci [Sun, 21 Jun 2020 13:30:39 +0000 (14:30 +0100)]
Revert "Add --hot-func-list to llvm-profdata show for sample profiles"
This reverts commit
7348b951fe74f306970f6ac567fe5dddbb1c42d4.
It is causing Asan failures.
Sanjay Patel [Sun, 21 Jun 2020 12:50:29 +0000 (08:50 -0400)]
[ValueTracking] improve analysis for fdiv with same operands
(The 'nnan' variant of this pattern is already tested to produce '1.0'.)
https://alive2.llvm.org/ce/z/D4hPBy
define i1 @src(float %x, i32 %y) {
%0:
%d = fdiv float %x, %x
%uge = fcmp uge float %d, 0.000000
ret i1 %uge
}
=>
define i1 @tgt(float %x, i32 %y) {
%0:
ret i1 1
}
Transformation seems to be correct!
Sanjay Patel [Sun, 21 Jun 2020 12:18:24 +0000 (08:18 -0400)]
[InstSimplify] add test for fdiv signbit; NFC
Bruno Ricci [Sun, 21 Jun 2020 12:49:27 +0000 (13:49 +0100)]
[clang][test][NFC] Also test for serialization in AST dump tests, part 3/n.
The outputs between the direct ast-dump test and the ast-dump test after
deserialization should match modulo a few differences.
For hand-written tests, strip the "<undeserialized declarations>"s and
the "imported"s with sed.
For tests generated with "make-ast-dump-check.sh", regenerate the output.
Part 3/n.
Bruno Ricci [Sun, 21 Jun 2020 12:35:15 +0000 (13:35 +0100)]
[clang][test][NFC] Also test for serialization in AST dump tests, part 2/n.
The outputs between the direct ast-dump test and the ast-dump test after
deserialization should match modulo a few differences.
For hand-written tests, strip the "<undeserialized declarations>"s and
the "imported"s with sed.
For tests generated with "make-ast-dump-check.sh", regenerate the
output.
Part 2/n.
Bruno Ricci [Sun, 21 Jun 2020 12:32:10 +0000 (13:32 +0100)]
[clang][NFC] Regenerate test/AST/ast-dump-lambda.cpp with --match-full-lines.
Bruno Ricci [Sun, 21 Jun 2020 12:29:06 +0000 (13:29 +0100)]
[clang][utils] Minor tweak to make-ast-dump-check.sh
Remove the space after the "CHECK:" on each line. This space makes the use
of FileCheck --match-full-lines impossible.
Bruno Ricci [Sun, 21 Jun 2020 12:02:48 +0000 (13:02 +0100)]
[clang][Serialization] Fix the serialization of ConstantExpr.
The serialization of ConstantExpr has currently a number of problems:
- Some fields are just not serialized (ConstantExprBits.APValueKind and
ConstantExprBits.IsImmediateInvocation).
- ASTStmtReader::VisitConstantExpr forgets to add the trailing APValue
to the list of objects to be destroyed when the APValue needs cleanup.
While we are at it, bring the serialization of ConstantExpr more in-line
with what is done with the other expressions by doing the following NFCs:
- Get rid of ConstantExpr::DefaultInit. It is better to not initialize
the fields of an empty ConstantExpr since this will allow msan to
detect if a field was not deserialized.
- Move the initialization of the fields of ConstantExpr to the constructor;
ConstantExpr::Create allocates the memory and ConstantExpr::ConstantExpr
is responsible for the initialization.
Review after commit since this is a straightforward mechanical fix
similar to the other serialization fixes.
Bruno Ricci [Sun, 21 Jun 2020 11:47:18 +0000 (12:47 +0100)]
[clang][NFC] Fix typos/wording in the comments of ConstantExpr.
It is "trailing objects" and "tail-allocated storage".
Nikita Popov [Sun, 21 Jun 2020 11:51:07 +0000 (13:51 +0200)]
[LangRef] Fix sphinx warnings
Nikita Popov [Sun, 21 Jun 2020 11:45:43 +0000 (13:45 +0200)]
[Docs] Fix code block in MemorySSA docs (NFC)
Simon Pilgrim [Sun, 21 Jun 2020 10:16:07 +0000 (11:16 +0100)]
[X86][SSE] Add SimplifyDemandedVectorEltsForTargetShuffle to handle target shuffle variable masks
Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries.
This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements.
To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code.
Differential Revision: https://reviews.llvm.org/D81791
clfbbn [Sun, 21 Jun 2020 06:18:47 +0000 (14:18 +0800)]
[Attributor][NFC] Fix indentation
Summary: The patch D81022 seems to have broken the indentation of the `cleanupIR()` function. This patch fixes that.
Reviewers: jdoerfert, sstefan1, uenoku
Reviewed By: jdoerfert
Subscribers: hiraditya, uenoku, kuter, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82260
Wenlei He [Fri, 19 Jun 2020 17:25:31 +0000 (10:25 -0700)]
[Remarks] Add callsite locations to inline remarks
Summary:
Add call site location info to inline remarks so we can differentiate inline sites.
This can be useful for inliner tuning. We can also reconstruct the full hierarchical inline
tree by parsing such remarks. The message of the inline remark is also tweaked so we can
differentiate SampleProfileLoader inlining from CGSCC inlining.
Reviewers: wmi, davidxl, hoy
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82213
Jonas Devlieghere [Sun, 21 Jun 2020 06:28:22 +0000 (23:28 -0700)]
[lldb/Lua] Remove redundant variable (NFC)
Jonas Devlieghere [Sun, 21 Jun 2020 05:38:26 +0000 (22:38 -0700)]
[lldb] Remove unused <iostream> includes (NFC)
Amy Kwan [Sat, 20 Jun 2020 23:29:16 +0000 (18:29 -0500)]
[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector unsigned char vec_clrr (vector unsigned char a, unsigned int n);
```
Differential Revision: https://reviews.llvm.org/D81707
Eric Christopher [Sat, 20 Jun 2020 23:02:27 +0000 (16:02 -0700)]
[clang/llvm] As part of using inclusive language within
the llvm project, migrate away from the use of blacklist and whitelist.
Craig Topper [Sat, 20 Jun 2020 22:36:04 +0000 (15:36 -0700)]
[X86] Set the cpu_vendor in __cpu_indicator_init to VENDOR_OTHER if cpuid isn't supported on the CPU.
We need to set the cpu_vendor to a non-zero value to indicate
that we already called __cpu_indicator_init once.
This should only happen on a 386 or 486 CPU.
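The reasoning above (a non-zero vendor doubles as an "already initialized" marker) can be sketched as follows. This is a minimal model, not the actual compiler-rt code: `CpuModel`, `indicatorInit`, `ProbeCount`, and the numeric vendor values are hypothetical names chosen for illustration; only `VENDOR_OTHER` and the idea of a non-zero `cpu_vendor` come from the commit message.

```cpp
#include <cassert>

// Hypothetical vendor values; 0 is reserved to mean "init has not run yet".
enum Vendor : unsigned { VendorUnset = 0, VendorKnown = 1, VendorOther = 2 };

// Hypothetical stand-in for the global CPU model record.
struct CpuModel {
  unsigned Vendor = VendorUnset;
  int ProbeCount = 0; // how many times the detection path actually ran
};

// Returns true if detection ran, false if it was skipped because a
// previous call had already stored a non-zero vendor.
bool indicatorInit(CpuModel &M, bool CpuidSupported) {
  if (M.Vendor != VendorUnset)
    return false; // already initialized, nothing to do

  ++M.ProbeCount;
  if (!CpuidSupported) {
    // Without this store the vendor would stay 0 and every later call
    // would repeat the probe.
    M.Vendor = VendorOther;
    return true;
  }

  M.Vendor = VendorKnown; // stands in for the real cpuid-based detection
  return true;
}
```

Calling `indicatorInit` twice on a model with `CpuidSupported == false` runs the probe only once, because the first call leaves `Vendor` non-zero.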
Eric Christopher [Sat, 20 Jun 2020 22:20:11 +0000 (15:20 -0700)]
[clang-tidy] As part of using inclusive language within
the llvm project, migrate away from the use of blacklist and whitelist.
Eric Christopher [Sat, 20 Jun 2020 21:44:41 +0000 (14:44 -0700)]
Update comment to be more clear.
Eric Christopher [Sat, 20 Jun 2020 21:37:29 +0000 (14:37 -0700)]
Rename function to more accurately reflect what it does.
Eric Christopher [Sat, 20 Jun 2020 21:20:51 +0000 (14:20 -0700)]
Temporarily Revert "[lldb][NFC] Add more test for builtin formats"
as it's failing on the debian buildbots:
http://lab.llvm.org:8011/builders/lldb-x86_64-debian/builds/12531
This reverts commit
90c1af106a20785ffd01c0d6a41db8bc0160fd11.
Eric Schweitz [Fri, 19 Jun 2020 18:42:23 +0000 (11:42 -0700)]
[flang] Add BoxValue.h
The bridge uses internal boxes of related ssa-values to track all the
information associated with a Fortran variable. Variables may have a
location and a value, but may also carry other properties such as rank,
shape, LEN parameters, etc. in Fortran.
Differential revision: https://reviews.llvm.org/D82228
Eric Christopher [Sat, 20 Jun 2020 21:04:48 +0000 (14:04 -0700)]
Typos around a -> an.
Sanjay Patel [Sat, 20 Jun 2020 19:18:27 +0000 (15:18 -0400)]
[VectorCombine] fix assert for type of compare operand
As shown in the post-commit comment for D81661 - we need to
loosen the type assertion to allow scalarization of a compare
for vectors of pointers.
Raphael Isemann [Sat, 20 Jun 2020 17:30:20 +0000 (19:30 +0200)]
[lldb][NFC] Add more test for builtin formats
The previous tests apparently missed a few code branches in the DumpDataExtractor
code. Also renames the 'test_instruction' test, which had the same name as another
test (and Python therefore ignored the test entirely).
weihe [Sat, 20 Jun 2020 17:13:02 +0000 (10:13 -0700)]
Add --hot-func-list to llvm-profdata show for sample profiles
Summary: Add the --hot-func-list feature to llvm-profdata show for sample profiles. This feature prints a list of hot functions whose max sample counts are above the 99% threshold, with their numbers of total samples, total samples percentage, max samples, entry samples, and their function names.
Reviewers: wmi, hoyFB, wenlei
Reviewed By: wmi
Subscribers: hoyFB, wenlei, llvm-commits, weihe
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81800
Sanjay Patel [Sat, 20 Jun 2020 15:47:00 +0000 (11:47 -0400)]
[InstCombine] remove unused parameter and add assert; NFC
Sanjay Patel [Sat, 20 Jun 2020 15:07:23 +0000 (11:07 -0400)]
[InstCombine] add tests for fmul/fdiv with fabs operands; NFC
Simon Pilgrim [Sat, 20 Jun 2020 14:57:05 +0000 (15:57 +0100)]
ProfileSummaryInfo.h - reduce unnecessary Function.h include to forward declaration. NFC.
Simon Pilgrim [Sat, 20 Jun 2020 14:30:11 +0000 (15:30 +0100)]
RegionPass.h - remove unnecessary Function.h include. NFC.
Forward declaration is already used.
Sanjay Patel [Sat, 20 Jun 2020 14:20:21 +0000 (10:20 -0400)]
[InstCombine] fabs(X) / fabs(X) -> X / X
Also, consolidate related folds so we don't miss/repeat these.
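A small sketch of why this fold is plausible: both operands are the same value, so any sign cancels and the NaN-producing inputs (zero and infinity) stay NaN-producing. This is an illustration in ordinary C++ on `double`, not the InstCombine implementation; the function names are hypothetical, and the check treats any two NaN results as equal (it does not model NaN sign bits, payloads, or fast-math flags).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Original form: both operands wrapped in fabs.
double beforeFold(double X) { return std::fabs(X) / std::fabs(X); }

// Folded form: fabs dropped from both operands.
double afterFold(double X) { return X / X; }

// Two results agree if they compare equal or are both NaN.
bool sameResult(double A, double B) {
  return (std::isnan(A) && std::isnan(B)) || A == B;
}

// Compares both forms on representative inputs, including the ones
// that produce NaN (zeros, infinities, NaN itself).
bool foldHoldsOnSamples() {
  const double Inf = std::numeric_limits<double>::infinity();
  const double NaN = std::numeric_limits<double>::quiet_NaN();
  const double Samples[] = {1.5, -1.5, 0.0, -0.0, Inf, -Inf, NaN};
  for (double X : Samples)
    if (!sameResult(beforeFold(X), afterFold(X)))
      return false;
  return true;
}
```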
Sanjay Patel [Sat, 20 Jun 2020 13:52:12 +0000 (09:52 -0400)]
[InstCombine] add tests for fabs(x) / fabs (x); NFC
Simon Pilgrim [Sat, 20 Jun 2020 11:35:24 +0000 (12:35 +0100)]
[X86] combineSetCCMOVMSK - consistently use CmpBits variable. NFCI.
The comparison value should be the same size - I've added an assert to be absolutely certain.
Simon Pilgrim [Fri, 19 Jun 2020 15:05:04 +0000 (16:05 +0100)]
[X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) != -1 -> !PTESTZ(X,X) allof patterns
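The equivalence behind this fold can be checked with a portable scalar model. The sketch below assumes 16 byte-sized lanes (the title does not state the element width) and emulates the instructions with plain loops; `pcmpeqZero`, `movmsk`, and `ptestz` are hypothetical helpers, not intrinsics. With 16 lanes the "-1" in the title corresponds to the all-ones mask 0xFFFF.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<std::uint8_t, 16>;

// Models PCMPEQ(X, 0) on bytes: 0xFF where the lane is zero, else 0x00.
Vec16 pcmpeqZero(const Vec16 &X) {
  Vec16 R{};
  for (std::size_t I = 0; I < X.size(); ++I)
    R[I] = (X[I] == 0) ? 0xFF : 0x00;
  return R;
}

// Models MOVMSK on bytes: gathers the top bit of each lane into a mask.
std::uint32_t movmsk(const Vec16 &V) {
  std::uint32_t Mask = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    Mask |= static_cast<std::uint32_t>(V[I] >> 7) << I;
  return Mask;
}

// Models the ZF result of PTEST(A, B): true when (A & B) is all zero.
bool ptestz(const Vec16 &A, const Vec16 &B) {
  for (std::size_t I = 0; I < A.size(); ++I)
    if (A[I] & B[I])
      return false;
  return true;
}

// "Not all lanes are zero", written the MOVMSK way.
bool beforeFold(const Vec16 &X) { return movmsk(pcmpeqZero(X)) != 0xFFFFu; }

// The same predicate, written the PTEST way.
bool afterFold(const Vec16 &X) { return !ptestz(X, X); }
```

Both predicates are false for an all-zero vector and true as soon as any lane is non-zero.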
Nikita Popov [Sat, 20 Jun 2020 11:01:54 +0000 (13:01 +0200)]
[CVP] Add another non null test (NFC)
Nikita Popov [Sat, 20 Jun 2020 10:52:53 +0000 (12:52 +0200)]
[JumpThreading] Make test more robust (NFC)
Optimizing away this comparison is not the point of this test,
so make sure it cannot be optimized away.
Nikita Popov [Sat, 20 Jun 2020 10:49:08 +0000 (12:49 +0200)]
[LVI] Extract addValueHandle() method (NFC)
There will be more places registering value handles.
Nikita Popov [Sat, 13 Jun 2020 13:15:39 +0000 (15:15 +0200)]
[LVI] Use find_as() where possible (NFC)
This prevents us from creating temporary PoisoningVHs and
AssertingVHs while performing hashmap lookups. As such, it only
matters in assertion-enabled builds.
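The idea of looking up with a cheaper key type, so that no wrapper object is built per lookup, can be sketched with the standard library. This is an analogy only: it uses `std::map` with a transparent comparator rather than LLVM's `DenseMap::find_as()`, and `Handle`, `HandleLess`, `Cache`, and `contains` are hypothetical names. `Handle` stands in for a value handle whose construction has a cost worth avoiding.

```cpp
#include <cassert>
#include <functional>
#include <map>

// Stand-in for a value handle: wraps a raw pointer and counts how many
// times one is built from a raw pointer.
struct Handle {
  static int Constructed;
  const int *Ptr;
  explicit Handle(const int *P) : Ptr(P) { ++Constructed; }
};
int Handle::Constructed = 0;

// Transparent comparator: can compare a Handle against a raw pointer
// directly, so lookups do not need a temporary Handle.
struct HandleLess {
  using is_transparent = void;
  bool operator()(const Handle &A, const Handle &B) const {
    return std::less<const int *>()(A.Ptr, B.Ptr);
  }
  bool operator()(const Handle &A, const int *B) const {
    return std::less<const int *>()(A.Ptr, B);
  }
  bool operator()(const int *A, const Handle &B) const {
    return std::less<const int *>()(A, B.Ptr);
  }
};

using Cache = std::map<Handle, int, HandleLess>;

// Looks up by raw pointer; no Handle is constructed here.
bool contains(const Cache &C, const int *Key) {
  return C.find(Key) != C.end();
}
```

After inserting an entry, repeated calls to `contains` leave `Handle::Constructed` unchanged, which is the property the commit message describes for the hashmap lookups.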