Nadav Rotem [Thu, 16 Jul 2020 00:12:48 +0000 (17:12 -0700)]
[LiveVariables] Replace std::vector with SmallVector.
Replace std::vector with SmallVector to reduce the number of mallocs.
This method is frequently executed, and the number of elements in the
vector is typically small.
https://reviews.llvm.org/D83920
Thomas Lively [Thu, 16 Jul 2020 18:37:25 +0000 (11:37 -0700)]
[WebAssembly] Implement v128.select
Although the SIMD spec proposal does not specifically include a
select instruction, the select instruction in MVP WebAssembly is
polymorphic over the selected types, so it is able to work on v128
values when they are enabled. This patch introduces a new variant of
the select instruction for each legal vector type. Additional ISel
patterns are adapted from the SELECT_I32 and SELECT_I64 patterns.
Depends on D83736.
Differential Revision: https://reviews.llvm.org/D83737
Nadav Rotem [Thu, 16 Jul 2020 05:36:32 +0000 (22:36 -0700)]
[TableGen] Change std::vector to SmallVector
The size of VTList that is pushed into this container is usually 1, but
often 6 or 7. Change the vector to SmallVector to eliminate frequent
mallocs. This happens hundreds of thousands of times in each tablegen
execution during the LLVM/clang build.
https://reviews.llvm.org/D83849
Matt Arsenault [Wed, 1 Jul 2020 16:30:24 +0000 (12:30 -0400)]
AMDGPU: Move handling of AGPR copies to a separate function
This is in preparation for fixing multiple problems with the way AGPR
copies are handled, but this change is NFC itself. First, it's relying
on recursively calling copyPhysReg, which is losing information
necessary to get correct super register handling.
Second, it's constructing a new RegScavenger and doing a O(N^2) walk
on every single sub-spill for every AGPR tuple copy. Third, it's using
the forward form of the scavenger, and not using the preferred
backwards scan.
Fangrui Song [Thu, 16 Jul 2020 18:27:16 +0000 (11:27 -0700)]
[Driver] Make -B take precedence over COMPILER_PATH
There is currently no COMPILER_PATH test. A subsequent --ld-path patch
will improve the coverage here.
Arthur Eubanks [Thu, 16 Jul 2020 18:09:47 +0000 (11:09 -0700)]
[SCEV] Fix ScalarEvolution tests under NPM
Many tests use opt's -analyze feature, which does not translate well to
NPM and has better alternatives. The alternative here is to explicitly
add a pass that calls ScalarEvolution::print().
The legacy pass manager RUNs aren't changing, but they are now pinned to
the legacy pass manager. For each legacy pass manager RUN, I added a
corresponding NPM RUN using the 'print<scalar-evolution>' pass. For
compatibility with update_analyze_test_checks.py and existing test
CHECKs, 'print<scalar-evolution>' now prints what -analyze prints per
function.
This was generated by the following Python script and failures were
manually fixed up:
import sys
for i in sys.argv:
with open(i, 'r') as f:
s = f.read()
with open(i, 'w') as f:
for l in s.splitlines():
if "RUN:" in l and ' -analyze ' in l and '\\' not in l:
f.write(l.replace(' -analyze ', ' -analyze -enable-new-pm=0 '))
f.write('\n')
f.write(l.replace(' -analyze ', ' -disable-output ').replace(' -scalar-evolution ', ' "-passes=print<scalar-evolution>" ').replace(" | ", " 2>&1 | "))
f.write('\n')
else:
f.write(l)
There are a couple failures still in ScalarEvolution under NPM, but
those are due to other unrelated naming conflicts.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D83798
Thomas Lively [Thu, 16 Jul 2020 18:19:08 +0000 (11:19 -0700)]
[WebAssembly] Autogenerate tests for simd-select.ll
Updating the simd-select.ll tests manually with consistent named
regexps for the register numbers was taking more time than it was
worth, so this patch updates that test file to have autogenerated
output. This is not a significant readability regression because the
tests in that file are all very small.
Depends on D83734.
Differential Revision: https://reviews.llvm.org/D83736
Thomas Lively [Thu, 16 Jul 2020 18:11:19 +0000 (11:11 -0700)]
[WebAssembly] Lower vselect to v128.bitselect
We were previously expanding vselect and matching on the expansion to
generate bitselects, but in some cases the expansion would be further
combined and a bitselect would not get generated. This patch improves
codegen in those cases by legalizing vselect and lowering it to
v128.bitselect. The old pattern that matches the expansion is still
useful for lowering IR that already uses the expansion rather than a
select operation.
Differential Revision: https://reviews.llvm.org/D83734
Craig Topper [Thu, 16 Jul 2020 17:33:20 +0000 (10:33 -0700)]
[X86] Add test case for PR46455.
Matt Arsenault [Tue, 14 Jul 2020 12:19:39 +0000 (08:19 -0400)]
DAG: Try scalarizing when expanding saturating add/sub
In an upcoming AMDGPU patch, the scalar cases will be legal and vector
ops should be scalarized, rather than producing a long sequence of
vector ops which will also need to be scalarized.
Use a lazy heuristic that seems to work and improves the thumb2 MVE
test.
George Rokos [Thu, 16 Jul 2020 18:02:55 +0000 (11:02 -0700)]
[clang] Fix compilation warnings in OpenMP declare mapper codegen.
This patch fixes the compilation warnings that L is not a reference.
Thanks to Lingda Li for providing the patch.
Differential Revision: https://reviews.llvm.org/D83959
Denis Antrushin [Wed, 10 Jun 2020 12:53:54 +0000 (19:53 +0700)]
MIR Statepoint refactoring. Part 1: Basic MI level changes.
Basic support for variadic-def MIR Statepoint:
- Change TableGen STATEPOINT description to variadic out list
(For self-documentation purpose; by itself it does not affect
code generation in any way).
- Update StatepointOpers helper class to handle variadic defs.
- Update MachineVerifier to properly handle them, too.
With this change, new Statepoint instruction can be passed through
backend (excluding ISEL) without errors.
Full change set is available at D81603.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D81645
Matt Arsenault [Fri, 10 Jul 2020 01:02:25 +0000 (21:02 -0400)]
Use findEnumAttribute helper for preallocated
Matt Arsenault [Mon, 29 Jun 2020 19:13:32 +0000 (15:13 -0400)]
IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref
When the byref attribute is added, there will need to be two similar
functions for the existing cases which have an associate value copy,
and byref which does not. Most, but not all of the existing uses will
use the existing version.
The associated size function added by D82679 also needs to
contextually differ, and will help eliminate a few places still
relying on pointee element types.
Matt Arsenault [Thu, 25 Jun 2020 23:12:55 +0000 (19:12 -0400)]
ValueTracking: Fix isKnownNonZero for non-0 null pointers for byval
The IR doesn't have a proper concept of invalid pointers, and "null"
constants are just all zeros (though it really needs one).
I think it's not possible to break this for AMDGPU due to the copy
semantics of byval. If you have an original stack object at 0, the
byval copy will be placed above it so I don't think it's really
possible to hit a 0 address.
Julian Lettner [Thu, 4 Jun 2020 19:33:30 +0000 (12:33 -0700)]
[Darwin] Fix OS version checks inside simulators
compiler-rt checks OS versions by querying the Darwin kernel version.
This is not necessarily correct inside the simulators if the simulator
runtime is not aligned with the host macOS. Let's instead check the
`SIMULATOR_RUNTIME_VERSION` env var.
Note that we still use the old code path as a fallback in case the
`SIMULATOR_RUNTIME_VERSION` environment variable isn't set.
rdar://
63031937
Reviewers: delcypher
Differential Revision: https://reviews.llvm.org/D79979
Nico Weber [Thu, 16 Jul 2020 17:46:45 +0000 (13:46 -0400)]
[gn build] Fix merge script mishap
Fred Riss [Wed, 15 Jul 2020 22:00:52 +0000 (15:00 -0700)]
[lldb/ObjectFileMachO] Fetch shared cache images from our own shared cache
Summary:
On macOS 11, the libraries that have been integrated in the system
shared cache are not present on the filesystem anymore. LLDB was
using those files to get access to the symbols of those libraries.
LLDB can get the images from the target process memory though.
This has 2 consequences:
- LLDB cannot load the images before the process starts, reporting
an error if someone tries to break on a system symbol.
- Loading the symbols by downloading the data from the inferior
is super slow. It takes tens of seconds at the start of the
debug session to populate the Module list.
To fix this, we can use the library images LLDB has in its own
mapping of the shared cache. Shared cache images are somewhat
special as their LINKEDIT segment is moved to the end of the cache
and thus the images are not contiguous in memory. All of this can
hidden in ObjectFileMachO.
This patch fixes a number of test failures on macOS 11 due to the
first problem described above and adds some specific unittesting
for the new SharedCache Host utilities.
Reviewers: jasonmolenda, labath
Subscribers: llvm-commits, lldb-commits
Tags: #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D83023
Matt Arsenault [Wed, 15 Jul 2020 00:51:45 +0000 (20:51 -0400)]
AMDGPU: Rename gfx9 version of v_add_i32/v_sub_i32
The carry-out opcode is renamed, so eliminate the deceptive _gfx9,
which looked like the encoded instruction. The real encoded version
was named _gfx9_gfx9.
Move it into the VI encoding namespace. The gfx9 namespace is just to
deal with the renamed instructions that reinterpret the opcode. When
codegened, it would fail to find the real instruction since it wasn't
in the right namespace.
Jinsong Ji [Thu, 16 Jul 2020 17:25:21 +0000 (17:25 +0000)]
[docs] fix ident in llvm-exegesis.rst
Matt Arsenault [Tue, 14 Jul 2020 13:18:36 +0000 (09:18 -0400)]
AMDGPU: Rename add/sub with carry out instructions
The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to have the gfx9/gfx10 names. We were using the original SI/CI
v_add_i32/v_sub_i32 names. Later targets reintroduced these names as
carryless instructions with a saturating clamp bit, which we do not
define. Do this rename so we can unambiguously add these missing
instructions.
The carry-in versions should also be renamed, but at least those had a
consistent _u32 name to begin with. The 16-bit instructions were also
renamed, but aren't ambiguous.
This does regress assembler error message quality in some cases. In
mismatched wave32/wave64 situations, this will switch from
"unsupported instruction" to "invalid operand", with the error
pointing at the wrong position. I couldn't quite follow how the
assembler selects these, but the previous behavior seemed accidental
to me. It looked like there was a partial attempt to handle this which
was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it
isn't used for anything).
Jinsong Ji [Thu, 16 Jul 2020 17:07:12 +0000 (17:07 +0000)]
[docs][lldb] Fix lldb item in releasenotes
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D83962
Michael Forster [Thu, 16 Jul 2020 16:48:34 +0000 (18:48 +0200)]
Add hashing support for std::tuple
Summary:
All tuple values are passed directly to hash_combine. This is inspired by the implementation used for Swift:
https://github.com/llvm/llvm-project-staging/commit/
4a1b4edbe1d1969284c1528e2950ac81b25edc8f
https://github.com/llvm/llvm-project-staging/commit/
845f3829b91522920a59c351b9011af01c5c7f87
Reviewers: gribozavr2
Reviewed By: gribozavr2
Subscribers: dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83887
Florian Hahn [Thu, 16 Jul 2020 16:55:54 +0000 (17:55 +0100)]
[Matrix] Add test for running matrix lowering with -O0.
Louis Dionne [Thu, 16 Jul 2020 16:30:11 +0000 (12:30 -0400)]
[runtimes] Move the enable_rtti Lit parameter to the DSL
Kostya Kortchinsky [Wed, 17 Jun 2020 17:31:53 +0000 (10:31 -0700)]
[scudo][standalone] Release smaller blocks less often
Summary:
Releasing smaller blocks is costly and only yields significant
results when there is a large percentage of free bytes for a given
size class (see numbers below).
This CL introduces a couple of additional checks for sizes lower
than 256. First we want to make sure that there is enough free bytes,
relatively to the amount of allocated bytes. We are looking at 8X% to
9X% (smaller blocks require higher percentage). We also want to make
sure there has been enough activity with the freelist to make it
worth the time, so we now check that the bytes pushed to the freelist
is at least 1/16th of the allocated bytes for those classes.
Additionally, we clear batches before destroying them now - this
could have prevented some releases to occur (class id 0 rarely
releases anyway).
Here are the numbers, for about 1M allocations in multiple threads:
Size: 16
85% freed -> 0% released
86% freed -> 0% released
87% freed -> 0% released
88% freed -> 0% released
89% freed -> 0% released
90% freed -> 0% released
91% freed -> 0% released
92% freed -> 0% released
93% freed -> 0% released
94% freed -> 0% released
95% freed -> 0% released
96% freed -> 0% released
97% freed -> 2% released
98% freed -> 7% released
99% freed -> 27% released
Size: 32
85% freed -> 0% released
86% freed -> 0% released
87% freed -> 0% released
88% freed -> 0% released
89% freed -> 0% released
90% freed -> 0% released
91% freed -> 0% released
92% freed -> 0% released
93% freed -> 0% released
94% freed -> 0% released
95% freed -> 1% released
96% freed -> 3% released
97% freed -> 7% released
98% freed -> 17% released
99% freed -> 41% released
Size: 48
85% freed -> 0% released
86% freed -> 0% released
87% freed -> 0% released
88% freed -> 0% released
89% freed -> 0% released
90% freed -> 0% released
91% freed -> 0% released
92% freed -> 0% released
93% freed -> 0% released
94% freed -> 1% released
95% freed -> 3% released
96% freed -> 7% released
97% freed -> 13% released
98% freed -> 27% released
99% freed -> 52% released
Size: 64
85% freed -> 0% released
86% freed -> 0% released
87% freed -> 0% released
88% freed -> 0% released
89% freed -> 0% released
90% freed -> 0% released
91% freed -> 0% released
92% freed -> 1% released
93% freed -> 2% released
94% freed -> 3% released
95% freed -> 6% released
96% freed -> 11% released
97% freed -> 20% released
98% freed -> 35% released
99% freed -> 59% released
Size: 80
85% freed -> 0% released
86% freed -> 0% released
87% freed -> 0% released
88% freed -> 0% released
89% freed -> 0% released
90% freed -> 1% released
91% freed -> 1% released
92% freed -> 2% released
93% freed -> 4% released
94% freed -> 6% released
95% freed -> 10% released
96% freed -> 17% released
97% freed -> 26% released
98% freed -> 41% released
99% freed -> 64% released
Size: 96
85% freed -> 0% released
86% freed -> 0% released
87% freed -> 0% released
88% freed -> 0% released
89% freed -> 1% released
90% freed -> 1% released
91% freed -> 3% released
92% freed -> 4% released
93% freed -> 6% released
94% freed -> 10% released
95% freed -> 14% released
96% freed -> 21% released
97% freed -> 31% released
98% freed -> 47% released
99% freed -> 68% released
Size: 112
85% freed -> 0% released
86% freed -> 1% released
87% freed -> 1% released
88% freed -> 2% released
89% freed -> 3% released
90% freed -> 4% released
91% freed -> 6% released
92% freed -> 8% released
93% freed -> 11% released
94% freed -> 16% released
95% freed -> 22% released
96% freed -> 30% released
97% freed -> 40% released
98% freed -> 55% released
99% freed -> 74% released
Size: 128
85% freed -> 0% released
86% freed -> 1% released
87% freed -> 1% released
88% freed -> 2% released
89% freed -> 3% released
90% freed -> 4% released
91% freed -> 6% released
92% freed -> 8% released
93% freed -> 11% released
94% freed -> 16% released
95% freed -> 22% released
96% freed -> 30% released
97% freed -> 40% released
98% freed -> 55% released
99% freed -> 74% released
Size: 144
85% freed -> 1% released
86% freed -> 2% released
87% freed -> 3% released
88% freed -> 4% released
89% freed -> 6% released
90% freed -> 7% released
91% freed -> 10% released
92% freed -> 13% released
93% freed -> 17% released
94% freed -> 22% released
95% freed -> 28% released
96% freed -> 37% released
97% freed -> 47% released
98% freed -> 61% released
99% freed -> 78% released
Size: 160
85% freed -> 1% released
86% freed -> 2% released
87% freed -> 3% released
88% freed -> 4% released
89% freed -> 5% released
90% freed -> 7% released
91% freed -> 10% released
92% freed -> 13% released
93% freed -> 17% released
94% freed -> 22% released
95% freed -> 28% released
96% freed -> 37% released
97% freed -> 47% released
98% freed -> 61% released
99% freed -> 78% released
Size: 176
85% freed -> 2% released
86% freed -> 3% released
87% freed -> 4% released
88% freed -> 6% released
89% freed -> 7% released
90% freed -> 9% released
91% freed -> 12% released
92% freed -> 15% released
93% freed -> 20% released
94% freed -> 25% released
95% freed -> 32% released
96% freed -> 40% released
97% freed -> 51% released
98% freed -> 64% released
99% freed -> 80% released
Size: 192
85% freed -> 4% released
86% freed -> 5% released
87% freed -> 6% released
88% freed -> 8% released
89% freed -> 10% released
90% freed -> 13% released
91% freed -> 16% released
92% freed -> 20% released
93% freed -> 24% released
94% freed -> 30% released
95% freed -> 37% released
96% freed -> 45% released
97% freed -> 55% released
98% freed -> 68% released
99% freed -> 82% released
Size: 224
85% freed -> 8% released
86% freed -> 10% released
87% freed -> 12% released
88% freed -> 14% released
89% freed -> 17% released
90% freed -> 20% released
91% freed -> 23% released
92% freed -> 28% released
93% freed -> 33% released
94% freed -> 39% released
95% freed -> 46% released
96% freed -> 53% released
97% freed -> 63% released
98% freed -> 73% released
99% freed -> 85% released
Size: 240
85% freed -> 8% released
86% freed -> 10% released
87% freed -> 12% released
88% freed -> 14% released
89% freed -> 17% released
90% freed -> 20% released
91% freed -> 23% released
92% freed -> 28% released
93% freed -> 33% released
94% freed -> 39% released
95% freed -> 46% released
96% freed -> 54% released
97% freed -> 63% released
98% freed -> 73% released
99% freed -> 85% released
Reviewers: cferris, pcc, hctim, eugenis
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D82031
Louis Dionne [Thu, 16 Jul 2020 16:42:26 +0000 (12:42 -0400)]
[libc++abi] NFC: Fix indentation
Dmitri Gribenko [Thu, 16 Jul 2020 16:14:59 +0000 (18:14 +0200)]
Use TestClangConfig in AST Matchers tests and run them in more configurations
Summary:
I am changing tests for AST Matchers to run in multiple language standards
versions, and under multiple triples that have different behavior with regards
to templates. This change is similar to https://reviews.llvm.org/D82179.
To keep the size of the patch manageable, in this patch I'm only migrating one
file to get the process started and get feedback on this approach.
Reviewers: ymandel
Reviewed By: ymandel
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83868
Rahul Joshi [Wed, 15 Jul 2020 05:34:26 +0000 (22:34 -0700)]
[NFC] Use appropriate names for `for_each` and `transform` template parameters
Differential Revision: https://reviews.llvm.org/D83848
Rahul Joshi [Wed, 15 Jul 2020 01:27:51 +0000 (18:27 -0700)]
[MLIR][TableGen] Add default value for named attributes for 2 more build methods
- Added more default values for `attributes` parameter for 2 more build methods
- Extend the op-decls.td unit test to test these build methods.
Differential Revision: https://reviews.llvm.org/D83839
David Green [Thu, 16 Jul 2020 14:42:05 +0000 (15:42 +0100)]
[BasicAA] Fix -basicaa-recphi for geps with negative offsets
As shown in D82998, the basic-aa-recphi option can cause miscompiles for
gep's with negative constants. The option checks for recursive phi, that
recurse through a contant gep. If it finds one, it performs aliasing
calculations using the other phi operands with an unknown size, to
specify that an unknown number of elements after the initial value are
potentially accessed. This works fine expect where the constant is
negative, as the size is still considered to be positive. So this patch
expands the check to make sure that the constant is also positive.
Differential Revision: https://reviews.llvm.org/D83576
LLVM GN Syncbot [Thu, 16 Jul 2020 16:14:13 +0000 (16:14 +0000)]
[gn build] Port
1360e140cc7
Vy Nguyen [Thu, 18 Jun 2020 02:45:31 +0000 (22:45 -0400)]
[llvm-exegesis] Add benchmark latency option on X86 that uses LBR for more precise measurements.
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
Rahul Joshi [Thu, 16 Jul 2020 14:35:46 +0000 (07:35 -0700)]
[flang] Adopt NoRegionArguments (WhereOp) and ParentOneOf (ResultOp) traits
Differential Revision: https://reviews.llvm.org/D83522
Sjoerd Meijer [Thu, 16 Jul 2020 15:14:47 +0000 (16:14 +0100)]
And now really disable that test.
Sjoerd Meijer [Thu, 16 Jul 2020 15:11:27 +0000 (16:11 +0100)]
Last attempt for rG3a624c327add: one test fails with the NPM,
so disable that one for now.
Rahul Joshi [Thu, 16 Jul 2020 05:27:17 +0000 (22:27 -0700)]
[MLIR] Add OpPrintingFlags to IRPrinterConfig.
- This will enable tweaking IR printing options when enabling printing (for ex,
tweak elideLargeElementsAttrs to create smaller IR logs)
Differential Revision: https://reviews.llvm.org/D83930
Louis Dionne [Wed, 22 Apr 2020 15:16:27 +0000 (11:16 -0400)]
[CMake] Enforce the minimum CMake version to be at least 3.13.4
This commit changes the warning for CMake < 3.13.4 into a fatal error.
The intent is to revert and re-apply this simple commit until all build
bots are migrated to CMake >= 3.13.4.
This is part of the effort discussed on llvm-dev here:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html
Differential Revision: https://reviews.llvm.org/D78646
Louis Dionne [Thu, 16 Jul 2020 10:03:00 +0000 (06:03 -0400)]
[runtimes][NFC] Remove unused or unnecessary CMake variables
Frederik Gossen [Thu, 16 Jul 2020 14:43:42 +0000 (14:43 +0000)]
[MLIR][Shape] Lower `shape.shape_eq` to `scf`
Lower `shape.shape_eq` to the `scf` (and `std`) dialect. For now, this lowering
is limited to extent tensor operands.
Differential Revision: https://reviews.llvm.org/D82530
Xiangling Liao [Sat, 11 Jul 2020 21:34:53 +0000 (17:34 -0400)]
[AIX]Generate debug info for static init related functions
Set the debug location for static init related functions(__dtor
and __finalize) so we can generate valid debug info on AIX by invoking
-g with clang or -debug-info-kind=limited with clang_cc1.
This also works for any other future targets who may use sinit and
sterm functions for static initialization, where a direct call to
dtor will be generated within finalize function body.
This patch also aims at validating that the debug info generated
is correct for AIX sinit related functions.
Differential Revision: https://reviews.llvm.org/D83702
Sjoerd Meijer [Thu, 16 Jul 2020 14:36:01 +0000 (15:36 +0100)]
Follow up of rG3a624c327add: pacify buildbot, add "REQUIRES: aarch64" to test
Florian Hahn [Thu, 16 Jul 2020 14:22:29 +0000 (15:22 +0100)]
[SCCP] Add test cases for adding !range to call-sites.
Xing GUO [Thu, 16 Jul 2020 14:32:31 +0000 (22:32 +0800)]
[DWARFYAML] Implement the .debug_str_offsets section.
This patch helps add support for emitting the .debug_str_offsets section
to yaml2elf.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D83853
David Green [Thu, 16 Jul 2020 12:37:57 +0000 (13:37 +0100)]
[BasicAA] Add additional negative phi tests. NFC
Petar Avramovic [Thu, 16 Jul 2020 14:31:57 +0000 (16:31 +0200)]
AMDGPU/GlobalISel: Legalize s64->s16 G_SITOFP/G_UITOFP
Add widenScalar for TypeIdx == 0 for G_SITOFP/G_UITOFP.
Legailize, using widenScalar, as s64->s32 G_SITOFP/G_UITOFP
followed by s32->s16 G_FPTRUNC.
Differential Revision: https://reviews.llvm.org/D83880
Joachim Protze [Thu, 16 Jul 2020 14:15:21 +0000 (16:15 +0200)]
[TSan] Optimize handling of racy address
This patch splits the handling of racy address and racy stack into separate
functions. If a race was already reported for the address, we can avoid the
cost for collecting the involved stacks.
This patch also removes the race condition in storing the racy address / racy
stack. This race condition allowed all threads to report the race.
This patch changes the transitive suppression of reports. Previously
suppression could transitively chain memory location and racy stacks.
Now racy memory and racy stack are separate suppressions.
Commit again, now with fixed tests.
Reviewed by: dvyukov
Differential Revision: https://reviews.llvm.org/D83625
Jay Foad [Thu, 16 Jul 2020 12:10:09 +0000 (13:10 +0100)]
[PowerPC] Precommit 64-bit funnel shift test cases
Sjoerd Meijer [Thu, 16 Jul 2020 12:11:30 +0000 (13:11 +0100)]
[Matrix] Add the matrix test from D83570. NFC.
Florian Hahn [Thu, 16 Jul 2020 11:08:54 +0000 (12:08 +0100)]
[SCCP] Only track returns of functions with non-void ret ty (NFC).
There is no need to add functions with void return types to the set of
tracked return values. This does not change functionality, because we
such functions do not have return values and we never update or access
them.
James Y Knight [Thu, 16 Jul 2020 14:01:52 +0000 (10:01 -0400)]
Remove TwoAddressInstructionPass::sink3AddrInstruction.
This function has a bug which will incorrectly reschedule instructions
after an INLINEASM_BR (which can branch). (The bug may also allow
scheduling past a throwing-CALL, I'm not certain.)
I could fix that bug, but, as the removed FIXME notes, it's better to
attempt rescheduling before converting to 3-addr form, as that may
remove the need to convert in the first place. In fact, the code to do
such reordering was added to this pass only a few months later, in
2011, via the addition of the function rescheduleMIBelowKill. That
code does not contain the same bug.
The removal of the sink3AddrInstruction function is not a no-op: in
some cases it would move an instruction post-conversion, when
rescheduleMIBelowKill would not move the instruction pre-converison.
However, this does not appear to be important: the machine instruction
scheduler can reorder the after-conversion instructions, in any case.
This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when
built with clang after
4b0aa5724feaa89a9538dcab97e018110b0e4bc3.
Link: https://github.com/ClangBuiltLinux/linux/issues/1085
Differential Revision: https://reviews.llvm.org/D83708
Frederik Gossen [Thu, 16 Jul 2020 13:57:56 +0000 (13:57 +0000)]
[MLIR][Shape] Use callback builder again
The issue that callback builders caused during rollback of conversion patterns
has been resolved. We can use them again.
See https://bugs.llvm.org/show_bug.cgi?id=46731
Differential Revision: https://reviews.llvm.org/D83932
Frederik Gossen [Thu, 16 Jul 2020 13:55:08 +0000 (13:55 +0000)]
[MLIR] Lower `shape.reduce` to `scf.for` only when argument is `tensor<?xindex>`
To make it clear when shape error values cannot occur the shape operations can
operate on extent tensors. This change updates the lowering for `shape.reduce`
accordingly.
Differential Revision: https://reviews.llvm.org/D83944
Frederik Gossen [Thu, 16 Jul 2020 13:52:42 +0000 (13:52 +0000)]
[MLIR][Shape] Allow `shape.reduce` to operate on extent tensors
Allow `shape.reduce` to take both `shape.shape` and `tensor<?xindex>` as an
argument.
Differential Revision: https://reviews.llvm.org/D83943
David Truby [Thu, 16 Jul 2020 10:27:03 +0000 (11:27 +0100)]
[flang] Add missing link dependencies to FrontendOpenACC.
Summary:
These link dependencies are required for shared library builds to
work correctly.
Reviewers: clementval
Reviewed By: clementval
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83938
Jay Foad [Thu, 16 Jul 2020 12:41:29 +0000 (13:41 +0100)]
[PowerPC] Use CHECK-LABEL for better diagnostics
Roman Lebedev [Thu, 16 Jul 2020 12:04:47 +0000 (15:04 +0300)]
Reland "[NFC] SimplifyCFG: refactor/deduplicate command-line settings override handling"
Initially i forgot to stage the SimplifyCFGPass::SimplifyCFGPass() change
to actually take the passed params..
David Green [Thu, 16 Jul 2020 11:28:42 +0000 (12:28 +0100)]
[ARM] Add a PreferNoCSEL option. NFC
This disables CSEL, falling back to the old predicated move behaviour
for cases where that is useful for debugging.
Paul Walker [Thu, 16 Jul 2020 11:22:22 +0000 (11:22 +0000)]
[SVE] Add lowering for scalable vector fadd, fdiv, fmul and fsub operations.
Lower the operations to predicated variants. This is prep work
required for fixed length code generation but also fixes a bug
whereby these operations fail selection when "unpacked" vector
types (e.g. MVT::nxv2f32) are used.
This patch also adds the missing "unpacked" patterns for FMA.
Differential Revision: https://reviews.llvm.org/D83765
AndreyChurbanov [Thu, 16 Jul 2020 11:28:09 +0000 (14:28 +0300)]
[openmp] libomp: added itt notifications for task, taskwait, taskgroup
Add releasing->acquire edges for child task->taskwait and
child task->end of taskgroup.
Differential Revision: https://reviews.llvm.org/D83804
Roman Lebedev [Thu, 16 Jul 2020 11:26:32 +0000 (14:26 +0300)]
Revert "[NFC] SimplifyCFG: refactor/deduplicate command-line settings override handling"
Seems to be breaking the bots.
This reverts commit
740a1da108ab9097268b509c85ed9ede7f4d5df5.
Georgii Rymar [Thu, 16 Jul 2020 10:33:01 +0000 (13:33 +0300)]
[yaml2obj] - Fix an issue with NoHeaders key.
When setting the NoHeaders to false,
the e_shnum field wasn't set correctly.
This patch fixes this bug.
Differential revision: https://reviews.llvm.org/D83941
Ilya Golovenko [Thu, 16 Jul 2020 10:47:44 +0000 (12:47 +0200)]
[clang] Fix printing of lambdas with capture expressions
Patch by @walrus !
Reviewers: lattner, kadircet
Reviewed By: kadircet
Subscribers: riccibruno, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83855
Roman Lebedev [Thu, 16 Jul 2020 10:25:44 +0000 (13:25 +0300)]
[NFC] SimplifyCFG: refactor/deduplicate command-line settings override handling
Roman Lebedev [Thu, 16 Jul 2020 10:24:12 +0000 (13:24 +0300)]
[NFC] SimplifyCFGPass::SimplifyCFGPass(): use default SimplifyCFGOptions - we aren't deviating from them here
Roman Lebedev [Thu, 16 Jul 2020 10:18:45 +0000 (13:18 +0300)]
Reland "[NFC] SimplifyCFGOptions: drop multi-parameter ctor, use default member-init"
This reverts commit
5831e86190966d58385678eb74b26aefacbfd101,
which reverted commit
90c1b0442a031d6cad686fdc4e5d3db03c3603a6
in preparation for reverting
commit
b2018198c32a0535bb1f5bb5b40fbcf50d8d47b7 in
commit
1067d3e176ea7b0b1942c163bf8c6c90107768c1 due to the introducton
of a dependency cycle.
Now that the other revert is reverted with a fix, this can be relanded.
Roman Lebedev [Thu, 16 Jul 2020 09:52:55 +0000 (12:52 +0300)]
Reland "[NFCI] createCFGSimplificationPass(): migrate to also take SimplifyCFGOptions"
This reverts commit
1067d3e176ea7b0b1942c163bf8c6c90107768c1,
which reverted commit
b2018198c32a0535bb1f5bb5b40fbcf50d8d47b7,
because it introduced a Dependency Cycle between Transforms/Scalar and
Transforms/Utils.
So let's just move SimplifyCFGOptions.h into Utils/, thus avoiding
the cycle.
Kadir Cetinkaya [Thu, 16 Jul 2020 09:27:31 +0000 (11:27 +0200)]
[clangd] Always retrieve ProjectInfo from Base in OverlayCDB
Summary:
Clangd is returning current working directory for overriden commands.
This can cause inconsistencies between:
- header and the main files, as OverlayCDB only contains entries for the main
files it direct any queries for the headers to the base, creating a
discrepancy between the two.
- different clangd instances, as the results will be different depending on the
timing of execution of the query and override of the command. hence clangd
might see two different project infos for the same file between different
invocations.
- editors and the way user has invoked it, as current working directory of
clangd will depend on those, hence even when there's no underlying base CWD
might change depending on the editor, or the directory user has started the
editor in.
This patch gets rid of that discrepency by always directing queries to base or
returning llvm::None in absence of it.
For a sample bug see https://reviews.llvm.org/D83099#2154185.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83934
Pavel Iliin [Tue, 14 Jul 2020 18:36:56 +0000 (19:36 +0100)]
[ARM] VBIT/VBIF support added.
Vector bitwise selects are matched by pseudo VBSP instruction
and expanded to VBSL/VBIT/VBIF after register allocation
depend on operands registers to minimize extra copies.
Sjoerd Meijer [Thu, 16 Jul 2020 10:05:02 +0000 (11:05 +0100)]
Follow up of
2b3c505d0f6e: fixed a typo, and added some more formatting. NFC.
David Green [Tue, 14 Jul 2020 09:04:55 +0000 (10:04 +0100)]
[ARM] CSEL generation
This adds a peephole optimisation to turn a t2MOVccr that could not be
folded into any other instruction into a CSEL on 8.1-m. The t2MOVccr
would usually be expanded into a conditional mov, that becomes an IT;
MOV pair. We can instead generate a CSEL instruction, which can
potentially be smaller and allows better register allocation freedom,
which can help reduce codesize. Performance is more variable and may
depend on the micrarchitecture details, but initial results look good.
If we need to control this per-cpu, we can add a subtarget feature as we
need it.
Original patch by David Penry.
Differential Revision: https://reviews.llvm.org/D83566
Kerry McLaughlin [Thu, 16 Jul 2020 09:12:41 +0000 (10:12 +0100)]
[SVE][CodeGen] Legalisation of masked loads and stores
Summary:
This patch modifies IncrementMemoryAddress to use a vscale
when calculating the new address if the data type is scalable.
Also adds tablegen patterns which match an extract_subvector
of a legal predicate type with zip1/zip2 instructions
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: efriedma, david-arm
Subscribers: tschuett, hiraditya, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83137
Florian Hahn [Thu, 16 Jul 2020 08:55:34 +0000 (09:55 +0100)]
[Matrix] Also run lowering during -O0.
Currently the backends cannot lower the matrix intrinsics directly and
rely on the lowering to vector instructions happening in the middle-end.
At the moment, this means the backend crashes when matrix types
extension code is compiled with -O0, e.g.
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O0-g/7902/
This patch enables also runs the lowering with -O0 in the middle-end as
a temporary solution. Long term, a lightweight version of the lowering
should run in the backend, on demand.
Max Kazantsev [Thu, 16 Jul 2020 09:21:01 +0000 (16:21 +0700)]
[Test] Add test that shows how SimplifyCFG may insert redunant Phi
It happens when a block cannot be threaded because of a convergent function.
David Truby [Tue, 14 Jul 2020 14:04:38 +0000 (15:04 +0100)]
[flang] Fix shared library builds for lib/Lower.
Summary:
This adds missing definitions for functions in the Lower directory
that were causing failures in shared library builds.
The definitions for these are taken from the fir-dev branch on github.
Reviewers: sscalpone, schweitz, jeanPerier, klausler
Reviewed By: schweitz
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83771
Petar Avramovic [Thu, 16 Jul 2020 09:09:35 +0000 (11:09 +0200)]
AMDGPU/GlobalISel: Select G_FREEZE
Select G_FREEZE in the same way that COPY is selected.
Differential Revision: https://reviews.llvm.org/D83031
Max Kazantsev [Thu, 16 Jul 2020 06:29:16 +0000 (13:29 +0700)]
Re-enable "[InstCombine] Simplify boolean Phis with const inputs using CFG"
This reverts commit
b893822e32ffe3c1dcf4d5ac0571a282582d72b2.
+ Clang test fixes
+ Insertion point fix for landing pads
Adrian Kuegel [Thu, 16 Jul 2020 08:54:10 +0000 (10:54 +0200)]
Revert "[NFCI] createCFGSimplificationPass(): migrate to also take SimplifyCFGOptions"
This reverts commit
b2018198c32a0535bb1f5bb5b40fbcf50d8d47b7.
This commit introduced a Dependency Cycle between Transforms/Scalar and
Transforms/Utils. Transforms/Scalar already depends on Transforms/Utils,
so if SimplifyCFGOptions.h is moved to Scalar, and Utils/Local.h still
depends on it, we have a cycle.
Adrian Kuegel [Thu, 16 Jul 2020 08:32:50 +0000 (10:32 +0200)]
Revert "[NFC] SimplifyCFGOptions: drop multi-parameter ctor, use default member-init"
This reverts commit
90c1b0442a031d6cad686fdc4e5d3db03c3603a6.
This is based on another commit which also needs to be reverted.
The other commit introduced a Dependency Cycle between Transforms/Scalar
and TransformUtils. Scalar already depends (in many ways) on
TransformUtils, so making TransformUtils depend on Scalar should be
avoided.
Mikael Holmen [Thu, 16 Jul 2020 07:28:34 +0000 (09:28 +0200)]
[clangd] Fix a few gcc warnings [NFC]
Mikael Holmen [Thu, 16 Jul 2020 07:27:26 +0000 (09:27 +0200)]
[MasmParser] Remove unused method emitStructValue to silence warning
The method was added in
bc8e262afe83 and has been unused ever since so
remove it to silence a gcc warning.
Jaroslav Sevcik [Wed, 15 Jul 2020 07:18:20 +0000 (09:18 +0200)]
[lldb] Desugar template specializations
Template specializations are not handled in many of the
TypeSystemClang methods. For example, GetNumChildren does not handle
the TemplateSpecialization type class, so template specializations
always look like empty objects.
This patch just desugars template specializations in the existing
RemoveWrappingTypes desugaring helper.
Differential Revision: https://reviews.llvm.org/D83858
Craig Topper [Thu, 16 Jul 2020 06:50:29 +0000 (23:50 -0700)]
[X86] Allow lsl/lar to be parsed with a GR16, GR32, or GR64 as source register.
This matches GNU assembler behavior. Operand size is determined
only from the destination register.
Max Kazantsev [Thu, 16 Jul 2020 05:58:39 +0000 (12:58 +0700)]
Revert "[InstCombine] Simplify boolean Phis with const inputs using CFG"
This reverts commit
00472067c34ccbceb2fad4b905524f3c780bb7d5.
Need to fix failing clang tests.
Amy Kwan [Thu, 16 Jul 2020 05:10:54 +0000 (00:10 -0500)]
[PowerPC][Power10] Fix VINS* (vector insert byte/half/word) instructions to have i32 arguments.
Previously, the vins* intrinsic was incorrectly defined to have its second and
third argument arguments as an i64. This patch fixes the second and third
argument of the vins* instruction and intrinsic to have i32s instead.
Differential Revision: https://reviews.llvm.org/D83497
Max Kazantsev [Thu, 16 Jul 2020 05:04:01 +0000 (12:04 +0700)]
[InstCombine] Simplify boolean Phis with const inputs using CFG
This patch adds simplification for pattern:
```
if (cond)
/ \
... ...
\ /
p = phi [true] [false]
...
br p, succ_1, succ_2
```
If we can prove that top block's branches dominate respective
inputs of a block that has a Phi with constant inputs, we can
use the branch condition (maybe inverted) instead of Phi.
This will make proofs of implication for further jump threading
more transparent.
Differential Revision: https://reviews.llvm.org/D81375
Reviewed By: xbolva00
Craig Topper [Thu, 16 Jul 2020 03:15:30 +0000 (20:15 -0700)]
Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches
This reverts most of the following patches due to reports of miscompiles.
I've left the added test cases with comments updated to be FIXMEs.
1cf6f210a2e [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2d [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc9 [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2f [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329af [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
Kiran Kumar T P [Thu, 16 Jul 2020 04:40:59 +0000 (10:10 +0530)]
[flang][OpenMP] Enhance parser support for taskwait construct to OpenMP 5.0
Summary:
This patch enhances parser support for taskwait construct to OpenMP 5.0.
2.17.5 taskwait Construct
!$omp taskwait [clause[ [,] clause] ... ]
where clause is one of the following:
depend([depend-modifier,]dependence-type : locator-list)
The patch includes code changes and testcase modifications.
Reviewed By: Valentin Clement, Kiran Chandramohan
Differential Revision: https://reviews.llvm.org/D82255
Aden Grue [Thu, 16 Jul 2020 03:46:08 +0000 (03:46 +0000)]
Standardize `linalg.generic` on `args_in`/`args_out` instead of `inputCount`/`outputCount`
This also fixes the outdated use of `n_views` in the documentation.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D83795
George Rokos [Thu, 16 Jul 2020 03:30:34 +0000 (20:30 -0700)]
Fix lit test related to declare mapper patch D67833.
Carl Ritson [Thu, 16 Jul 2020 02:07:26 +0000 (11:07 +0900)]
[AMDGPU] Update VMEM scalar write hazard mitigation sequence
Using s_waitcnt_depctr 0xffe3 is potentially faster than v_nop.
Reviewed By: rampitec, foad
Differential Revision: https://reviews.llvm.org/D83872
Ryan Prichard [Tue, 14 Jul 2020 05:06:47 +0000 (22:06 -0700)]
[libunwind] Fix getSLEB128 on large values
Previously, for large-enough values, getSLEB128 would attempt to shift
a signed int in the range [0..0x7f] by 28, 35, 42... bits, which is
undefined behavior and likely to fail.
Avoid shifting (-1ULL) by 70 for large values. e.g. For INT64_MAX, the
last two bytes will be:
- 0x7f [bit==56]
- 0x00 [bit==63]
Differential Revision: https://reviews.llvm.org/D83742
Ryan Prichard [Tue, 14 Jul 2020 05:06:22 +0000 (22:06 -0700)]
[libunwind] Fix CIE v1 return address parsing
- For CIE version 1 (e.g. in DWARF 2.0.0), the return_address_register
field is a ubyte [0..255].
- For CIE version 3 (e.g. in DWARF 3), the field is instead a ULEB128
constant.
Previously, libunwind accepted a CIE version of 1 or 3, but always
parsed the field as ULEB128.
Clang always outputs CIE version 1 into .eh_frame. (It can output CIE
version 3 or 4, but only into .debug_frame.)
Differential Revision: https://reviews.llvm.org/D83741
George Rokos [Thu, 16 Jul 2020 01:10:57 +0000 (18:10 -0700)]
[OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime
This patch implements the code generation to use OpenMP 5.0 declare mapper (a.k.a. user-defined mapper) constructs.
Patch written by Lingda Li.
Differential Revision: https://reviews.llvm.org/D67833
George Rokos [Wed, 15 Jul 2020 20:24:03 +0000 (13:24 -0700)]
[OpenMP][Offload] Declare mapper runtime implementation
Libomptarget patch adding runtime support for "declare mapper".
Patch co-developed by Lingda Li and George Rokos.
Differential revision: https://reviews.llvm.org/D68100
Quentin Colombet [Thu, 16 Jul 2020 00:26:37 +0000 (17:26 -0700)]
[CalcSpillWeights] Propagate the fact that a live-interval is not spillable
When we calculate the weight of a live-interval, add some code to
check if the original live-interval was markied as not spillable and
if so, progagate that information down to the new interval.
Previously we would just recompute a weight for the new interval,
thus, we could in theory just spill live-intervals marked as not
spillable by just splitting them. That goes against the spirit of
a non-spillable live-interval.
E.g., previously we could do:
v1 = // v1 must not be spilled
...
= v1
Split:
v1 = // v1 must not be spilled
...
v2 = v1 // v2 can be spilled
...
v3 = v2 // v3 can be spilled
= v3
There's no test case for that one as we would need to split a
non-spillable live-interval without using LiveRangeEdit to see this
happening.
RegAlloc inserts non-spillable intervals only as part of the spilling
mechanism, thus at this point the intervals are not splittable anymore.
On top of that, RegAlloc uses the LiveRangeEdit API, which already
properly propagate that information.
In other words, this could only happen if a target was to mark
a live-interval as not spillable before register allocation and
split it without using LRE, e.g., through
LiveIntervals::splitSeparateComponent.
dfukalov [Wed, 15 Jul 2020 23:25:01 +0000 (02:25 +0300)]
[AMDGPU][CostModel] Improve cost estimation for fused {fadd|fsub}(a,fmul(b,c))
Summary:
If result of fmul(b,c) has one use, in almost all cases (except denormals are
IEEE) the pair of operations will be fused in one fma/mad/mac/etc.
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits, kerbowa
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83919
Roman Lebedev [Wed, 15 Jul 2020 22:39:37 +0000 (01:39 +0300)]
[NFC] SimplifyCFGOptions: drop multi-parameter ctor, use default member-init
Likewise, just use the builder pattern.
Taking multiple params is unmaintainable.
Jonas Devlieghere [Wed, 15 Jul 2020 22:39:24 +0000 (15:39 -0700)]
[lldb/Test] Skip async process connect tests with reproducers
Reproducers only support synchronous mode.
Adrian Prantl [Wed, 15 Jul 2020 22:25:50 +0000 (15:25 -0700)]
Add missing include