Nicolas Vasilache [Thu, 6 Aug 2020 13:00:38 +0000 (09:00 -0400)]
[mlir][Vector] Add 2-D vector contract lowering to ReduceOp
This new pattern mixes vector.transpose and direct lowering to vector.reduce.
This allows more progressive lowering than immediately going to insert/extract and
composes more nicely with other canonicalizations.
This has 2 use cases:
1. for very wide vectors the generated IR may be much smaller
2. when we have a custom lowering for transpose ops we can target it directly
rather than rely on LLVM
Differential Revision: https://reviews.llvm.org/D85428
Max Kazantsev [Fri, 7 Aug 2020 09:46:39 +0000 (16:46 +0700)]
[Test] Added test showing missing range check elimination opportunity in IndVars
Seems that SCEV is not powerful enough to handle this.
Oliver Stannard [Fri, 7 Aug 2020 09:41:16 +0000 (10:41 +0100)]
[AArch64] Disable waitid.cpp test for AArch64
This test is failing intermittently on the AArch64 build bots, disable
it for now to keep the bots green while we investigate it.
Haojian Wu [Fri, 7 Aug 2020 09:36:33 +0000 (11:36 +0200)]
[clangd] Include the underlying decls in go-to-definition.
Fixes https://github.com/clangd/clangd/issues/277
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D74054
Nathan James [Fri, 7 Aug 2020 09:22:26 +0000 (10:22 +0100)]
[NFC] Replace hasName in loop for hasAnyName
Kazushi (Jam) Marukawa [Tue, 4 Aug 2020 07:41:12 +0000 (16:41 +0900)]
[VE] Change to expand multiply related instructions
Change to expand MULHU/MULHS/UMUL_LOHI/SMUL_LOHI for i32 and i64 since
those instructions are not available on Aurora SX VE. Some of them
are used in expansion of i128 multiply, so need to modify them to
support i128. Then, update basic arithmetic regression tests of
i128 and signed/unsigned i32 typed integer values.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85490
Kazushi (Jam) Marukawa [Wed, 24 Jun 2020 11:55:40 +0000 (20:55 +0900)]
[VE] Remove obsoleted getVEAsmModeForCPU function
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85507
Haojian Wu [Fri, 7 Aug 2020 09:14:37 +0000 (11:14 +0200)]
[AST][RecoveryExpr] Fix the missing type when rebuilding RecoveryExpr in TreeTransform.
Differential Revision: https://reviews.llvm.org/D85423
Igor Kudrin [Thu, 6 Aug 2020 10:48:57 +0000 (17:48 +0700)]
[DebugInfo] Remove DwarfUnit::getDwarfVersion(). NFC.
This helper method was used only in one place, which can easily use the
direct call.
Differential revision: https://reviews.llvm.org/D85438
Igor Kudrin [Thu, 6 Aug 2020 10:49:05 +0000 (17:49 +0700)]
[DebugInfo] Clean up DIEUnit. NFC.
This removes members of the DIEUnit class which were used only in unit
tests. Note also that child classes shadowed some of these methods,
namely, getDwarfVersion() was overridden in DwarfUnit and getLength()
was overridden in DwarfCompileUnit.
Differential Revision: https://reviews.llvm.org/D85436
Eduardo Caldas [Thu, 6 Aug 2020 12:13:36 +0000 (12:13 +0000)]
[SyntaxTree][NFC] remove redundant namespace-specifiers
Differential Revision: https://reviews.llvm.org/D85427
Shinji Okumura [Fri, 7 Aug 2020 08:06:42 +0000 (17:06 +0900)]
[Attributor] AAPotentialValues Interface
This is a split patch of D80991.
This patch introduces AAPotentialValues and its interface only.
For more detail of AAPotentialValues abstract attribute, see the original patch.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D83283
Thomas Preud'homme [Wed, 5 Aug 2020 22:03:21 +0000 (23:03 +0100)]
[clang, test, Darwin] Fix tests expecting Darwin target
Clang tests Driver/apple-arm64-arch.c and
Driver/darwin-warning-options.c test Darwin driver functionality but
only require the host system to be Darwin. This leads the tests to fail
when building a cross-compiler on Darwin and to be marked unsupported
when cross-compiling to Darwin from another system. This commit changes
the requirements for those tests to require the target to be Darwin.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D85367
Christian Kühnel [Fri, 7 Aug 2020 07:36:49 +0000 (09:36 +0200)]
Revert "[CMake] Simplify CMake handling for zlib"
This reverts commit 1adc494bce44f6004994deed61b30d4b71fe1d05.
This patch broke the Windows compilation on buildbot and pre-merge testing:
http://lab.llvm.org:8011/builders/mlir-windows/builds/5945
https://buildkite.com/llvm-project/llvm-master-build/builds/780
Max Kazantsev [Fri, 7 Aug 2020 07:11:53 +0000 (14:11 +0700)]
[Test] Add one more test on IndVars that was failing on one of older builds
Nathan Ridge [Fri, 7 Aug 2020 05:31:03 +0000 (01:31 -0400)]
[clangd] Highlight structured bindings at local scope as LocalVariable
Differential Revision: https://reviews.llvm.org/D85500
QingShan Zhang [Fri, 7 Aug 2020 07:09:48 +0000 (07:09 +0000)]
[NFC] Add the stats for load/store cluster
We have the stats for MacroFusion but miss it for load/store cluster.
David Sherwood [Thu, 6 Aug 2020 15:53:13 +0000 (16:53 +0100)]
[SVE][CodeGen] Fix bug with store of unpacked FP scalable vectors
Fixed an incorrect pattern in lib/Target/AArch64/AArch64SVEInstrInfo.td
for storing out <vscale x 2 x f32> unpacked scalable vectors. Added
a couple of tests to
test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll
Differential Revision: https://reviews.llvm.org/D85441
Jonas Devlieghere [Fri, 7 Aug 2020 06:18:18 +0000 (23:18 -0700)]
[LLDB] Mark test_launch_simple as a no-debug-info test
No need to run this test with the multiple variants.
biplmish [Fri, 7 Aug 2020 06:02:29 +0000 (01:02 -0500)]
[PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang
This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D84622
QingShan Zhang [Fri, 7 Aug 2020 05:16:36 +0000 (05:16 +0000)]
[PowerPC] Support constrained fp operation for setcc
The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281.
This patch adds support for it to the PowerPC backend.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D81727
QingShan Zhang [Fri, 7 Aug 2020 04:53:37 +0000 (04:53 +0000)]
[Scheduling] Create the missing dependency edges for store cluster
For a load cluster, we don't need to create the dependency edge (SUb->reg) from SUb to SUa,
as they both depend on the base register "reg":
         +-------+
    +---->  reg  |
    |    +---+---+
    |        ^
    |        |
    |        |
    |        |
    |    +---+---+
    |    |  SUa  |  Load 0(reg)
    |    +---+---+
    |        ^
    |        |
    |        |
    |    +---+---+
    +----+  SUb  |  Load 4(reg)
         +-------+
But for a store cluster, we need to create it, as shown below, so that the instruction the store
depends on is not scheduled in between SUb and SUa.
         +-------+
    +---->  reg  |
    |    +---+---+
    |        ^
    |        |         Missing       +-------+
    |        | +-------------------->+   y   |
    |        | |                     +---+---+
    |    +---+-+-+                       ^
    |    |  SUa  |  Store x 0(reg)       |
    |    +---+---+                       |
    |        ^                           |
    |        |  +------------------------+
    |        |  |
    |    +---+--++
    +----+  SUb  |  Store y 4(reg)
         +-------+
Reviewed By: evandro, arsenm, rampitec, foad, fhahn
Differential Revision: https://reviews.llvm.org/D72031
Michał Górny [Wed, 5 Aug 2020 08:22:32 +0000 (10:22 +0200)]
[Polly] Support linking ScopPassManager against LLVM dylib
Link ScopPassManager to LLVM dylib target if LLVM_LINK_LLVM_DYLIB
is enabled. This fixes build failures on systems where static LLVM
libraries are not installed.
Differential Revision: https://reviews.llvm.org/D85281
Sameer Sahasrabuddhe [Fri, 7 Aug 2020 03:54:52 +0000 (09:24 +0530)]
[AArch64][NFC] require aarch64 support for hwasan test
This was breaking builds where the target is not enabled.
Reviewed By: danielkiss, eugenis
Differential Revision: https://reviews.llvm.org/D85412
Vitaly Buka [Fri, 7 Aug 2020 03:46:02 +0000 (20:46 -0700)]
[StackSafety,NFC] Fix tests in debug
Tim Keith [Fri, 7 Aug 2020 03:33:59 +0000 (20:33 -0700)]
[flang] Improve message for assignment to subprogram
In the example below we were producing the error message
"Assignment to constant 'f' is not allowed":
```
function f() result(r)
f = 1.0
end
```
This changes it to a more helpful message when the LHS is a subprogram
name and also mentions the function result name when it's a function.
Differential Revision: https://reviews.llvm.org/D85483
Shinji Okumura [Fri, 7 Aug 2020 02:40:53 +0000 (11:40 +0900)]
[Attributor] Check violation of returned position nonnull and noundef attribute in AAUndefinedBehavior
This patch is a follow up of D84733.
If a function has noundef attribute in returned position, instructions that return undef or poison value cause UB.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85178
Vitaly Buka [Fri, 7 Aug 2020 02:49:26 +0000 (19:49 -0700)]
[StackSafety,NFC] Add more tests
Vitaly Buka [Fri, 7 Aug 2020 02:46:52 +0000 (19:46 -0700)]
[StackSafety,NFC] Sort llvm-lto2 resolutions in tests
Vitaly Buka [Thu, 6 Aug 2020 12:19:12 +0000 (05:19 -0700)]
[StackSafety,NFC] Add debug counters
Vitaly Buka [Thu, 6 Aug 2020 12:09:44 +0000 (05:09 -0700)]
[StackSafety,NFC] Use CHECK-EMPTY in tests
Vitaly Buka [Fri, 7 Aug 2020 02:16:39 +0000 (19:16 -0700)]
[LLParser,NFC] Simplify forward GV refs update
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85238
Michael Kruse [Fri, 7 Aug 2020 02:07:29 +0000 (21:07 -0500)]
[polly] Unbreak buildbot.
The test failed since commit
bc10888dc "DomTree: Make PostDomTree indifferent to block successors swap"
which is a re-commit of
c35585e20 "DomTree: Make PostDomTree immune to block successors swap"
Vitaly Buka [Fri, 7 Aug 2020 02:10:02 +0000 (19:10 -0700)]
[StackSafety] Skip ambiguous lifetime analysis
If we can't identify the alloca used in a lifetime marker, we
need to assume the worst-case scenario.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D84630
Richard Smith [Fri, 7 Aug 2020 02:07:50 +0000 (19:07 -0700)]
Reinstate check that we don't crash.
Richard Smith [Fri, 7 Aug 2020 02:03:23 +0000 (19:03 -0700)]
Disable clang-tidy test that started failing after clang commit ed5a18f.
This checker appears to be intentionally not diagnosing cases where an
operator appearing in a duplicated expression might have side-effects;
Clang is now modeling fold-expressions as having an unresolved operator
name within them, so they now trip up this check.
Vitaly Buka [Fri, 7 Aug 2020 01:52:35 +0000 (18:52 -0700)]
[LTO,NFC] Skip generateParamAccessSummary when empty
addGlobalValueSummary can check newly added FunctionSummary
and set HasParamAccess to mark that generateParamAccessSummary
is needed.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85182
Arthur Eubanks [Thu, 6 Aug 2020 22:14:50 +0000 (15:14 -0700)]
[NewPM] Add callback for skipped passes
Parallel to https://reviews.llvm.org/D84772.
Will use this for printing when a pass is skipped.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D85478
Nathan Ridge [Wed, 5 Aug 2020 03:33:41 +0000 (23:33 -0400)]
[clangd] Semantic highlighting for dependent template name in template argument
Fixes https://github.com/clangd/clangd/issues/484
Differential Revision: https://reviews.llvm.org/D85272
Nico Weber [Fri, 7 Aug 2020 01:02:41 +0000 (21:02 -0400)]
fix doc typo to cycle bots
Kazushi (Jam) Marukawa [Tue, 4 Aug 2020 07:41:12 +0000 (16:41 +0900)]
[VE] Optimize trunc related instructions
Change to not generate truncate instructions if all uses of a truncate
operation don't care about the higher bits. For example, an i32 add
instruction doesn't care about the higher 32 bits in 64-bit registers.
Also updates regression tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85418
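The reasoning behind dropping the truncate can be checked with a small model. The sketch below is only an illustration in Python, not the VE backend code; the helper names and register values are made up for the example. It shows that the low 32 bits of a 64-bit add depend only on the low 32 bits of the inputs, so truncating the operands first changes nothing.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_to_i32(reg):
    """Model an explicit truncate: keep only the low 32 bits of a 64-bit register."""
    return reg & MASK32


def add_i32_in_64bit_regs(a_reg, b_reg):
    """Model an i32 add performed on 64-bit registers; only the low 32 bits are the result."""
    return ((a_reg + b_reg) & MASK64) & MASK32


# Hypothetical register contents with garbage in the upper 32 bits.
a = 0xDEADBEEF_00000005
b = 0x12345678_FFFFFFFE

with_trunc = add_i32_in_64bit_regs(trunc_to_i32(a), trunc_to_i32(b))
without_trunc = add_i32_in_64bit_regs(a, b)
```

Both results agree, which is why a truncate feeding only such users can be omitted.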
Richard Smith [Thu, 6 Aug 2020 22:57:35 +0000 (15:57 -0700)]
PR30738: Implement two-phase name lookup for fold-expressions.
Jessica Paquette [Wed, 5 Aug 2020 23:32:55 +0000 (16:32 -0700)]
[GlobalISel] Fix computing known bits for loads with range metadata
In GlobalISel, if you have a load into a small type with a range, you'll hit
an assert if you try to compute known bits on it starting at a larger type.
e.g.
```
%x:_(s8) = G_LOAD %whatever(p0) :: (load 1 ... !range !n)
...
%y:_(s32) = G_SOMETHING %x
```
When we walk through G_SOMETHING and hit the load, the width of our known bits
is 32. However, the width of the range is going to be 8. This will cause us
to hit an assert.
To fix this, make computeKnownBitsFromRangeMetadata zero extend or truncate
the range type to match the bitwidth of the known bits we're calculating.
Add a testcase in CodeGen/GlobalISel/KnownBitsTest.cpp to reflect that this
works now.
https://reviews.llvm.org/D85375
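A rough model of the idea, assuming a non-wrapping unsigned range and covering only the zero-extension direction: the function name and the example range below are hypothetical, and this is a simplified Python sketch rather than the logic of computeKnownBitsFromRangeMetadata itself. Bits above the highest position where the range bounds differ are known, and widening the range to the requested width makes the extra high bits known zero.

```python
def known_bits_from_range(lo, hi, known_width):
    """Known-zero/known-one masks at known_width for a value in the unsigned range [lo, hi)."""
    assert 0 <= lo < hi <= (1 << known_width)
    # Zero extension leaves unsigned bounds unchanged, so they can be used directly.
    umin, umax = lo, hi - 1
    # Number of low bits that may differ between values in the range.
    differing = (umin ^ umax).bit_length()
    full_mask = (1 << known_width) - 1
    prefix_mask = full_mask & ~((1 << differing) - 1)
    known_one = umax & prefix_mask
    known_zero = ~umax & prefix_mask
    return known_zero, known_one


# An s8 load with a hypothetical !range of [16, 24), queried at 8 and at 32 bits.
zero8, one8 = known_bits_from_range(16, 24, 8)
zero32, one32 = known_bits_from_range(16, 24, 32)
```

At 32 bits the known-one mask is the same as at 8 bits, while the known-zero mask additionally covers the upper 24 bits.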
Adrian Prantl [Wed, 5 Aug 2020 22:22:20 +0000 (15:22 -0700)]
Factor out common code from the iPhone/AppleTV/WatchOS simulator platform plugins. (NFC)
The implementation of these classes was copied & pasted from the
iPhone simulator plugin with only a handful of configuration
parameters substituted. This patch moves the redundant implementations
into the base class PlatformAppleSimulator.
Differential Revision: https://reviews.llvm.org/D85243
Matt Arsenault [Tue, 28 Jul 2020 01:13:40 +0000 (21:13 -0400)]
GlobalISel: Implement lower for G_INSERT_VECTOR_ELT
Mark Mentovai [Thu, 6 Aug 2020 22:57:56 +0000 (18:57 -0400)]
[gn build] mac: use frameworks instead of libs where appropriate
As of GN 3028c6a426a4, the hack that transformed "libs" ending in
".framework" from -l arguments to -framework arguments has been removed.
Instead, "frameworks" must be used, and the toolchain must provide
support.
Differential Revision: https://reviews.llvm.org/D84219
Arthur Eubanks [Thu, 6 Aug 2020 04:12:08 +0000 (21:12 -0700)]
[NewPM][GuardWidening] Fix loop guard widening tests under NPM
Reviewed By: ychen, asbirlea
Differential Revision: https://reviews.llvm.org/D85394
Fangrui Song [Thu, 6 Aug 2020 19:34:16 +0000 (12:34 -0700)]
[ELF] Change tombstone values to (.debug_ranges/.debug_loc) 1 and (other .debug_*) 0
tl;dr See D81784 for the 'tombstone value' concept. This patch changes our behavior to be almost the same as GNU ld (except that we also use 1 for .debug_loc):
* .debug_ranges & .debug_loc: 1 (LLD<11: 0+addend; GNU ld uses 1 for .debug_ranges)
* .debug_*: 0 (LLD<11: 0+addend; GNU ld uses 0; future LLD: -1)
We make the tweaks because:
1) The new tombstone is novel and needs more time to be adopted by consumers before it's the default.
2) The old (gold) strategy had problems with zero-length functions - so rather than going back to that, we're going to the GNU ld strategy which doesn't have that problem.
3) One slight tweak to (2) is to apply the .debug_ranges workaround to .debug_loc for the same reasons it applies to debug_ranges - to avoid terminating lists early.
-----
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143482.html
The tombstone value -1 in .debug_line caused problems for lldb (fixed by D83957;
will be included in 11.0.0) and breakpad (fixed by
https://crrev.com/c/2321300). It may potentially affect other DWARF consumers.
For .debug_ranges & .debug_loc: 1, an argument preferring 1 (GNU ld for .debug_ranges) over -2 is that:
```
{-1, -2} <<< base address selection entry
{0, length} <<< address range
```
may create a situation where low_pc is greater than high_pc. So we use
1, the GNU ld behavior for .debug_ranges
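One way to read the example above, sketched in Python with 64-bit wrapping arithmetic: the base address selection entry sets the base to -2, and the following range is resolved relative to that base. The helper name and the length value are hypothetical, chosen only for illustration.

```python
MASK64 = (1 << 64) - 1


def resolve_range(base, start_offset, end_offset):
    """Resolve a .debug_ranges offset pair against a base address, wrapping at 64 bits."""
    low_pc = (base + start_offset) & MASK64
    high_pc = (base + end_offset) & MASK64
    return low_pc, high_pc


# {-1, -2}: base address selection entry, so the base becomes -2 (as an unsigned 64-bit value).
base = -2 & MASK64
# {0, length}: address range relative to that base; 0x10 is an arbitrary example length.
length = 0x10
low_pc, high_pc = resolve_range(base, 0, length)
```

Here low_pc stays near the top of the address space while high_pc wraps around to a small value, so low_pc ends up greater than high_pc.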
For other .debug_* sections, there haven't been many reports. One issue is that
bloaty (src/dwarf.cc) can incorrectly count address ranges in .debug_ranges. To
reduce similar disruption, this patch changes the tombstone values to be similar to GNU ld.
This does mean another behavior change to the default trunk behavior. Sorry
about it. The default trunk behavior will be similar to release/11.x while we work on a transition plan for LLD users.
Reviewed By: dblaikie, echristo
Differential Revision: https://reviews.llvm.org/D84825
Yonghong Song [Thu, 6 Aug 2020 22:22:03 +0000 (15:22 -0700)]
BPF: fix libLLVMBPFCodeGen.so build failure
Buildbot reported a build failure when building shared
library libLLVMBPFCodeGen.so with unknown reference to
"createCFGSimplificationPass".
Commit 87cba434027b ("BPF: add a SimplifyCFG IR pass during
generic Scalar/IPO optimization") added a SimplifyCFG IR pass
from the BPF target. The commit called the function
createCFGSimplificationPass() defined in the "Scalar" library.
Add this library in Target/BPF/LLVMBuild.txt so that the
shared library build can succeed.
Tony [Thu, 6 Aug 2020 22:04:20 +0000 (22:04 +0000)]
[AMDGPU] Correct missing sram-ecc target feature for gfx906
Differential Revision: https://reviews.llvm.org/D85476
Matt Arsenault [Sun, 26 Jul 2020 21:47:59 +0000 (17:47 -0400)]
AMDGPU/GlobalISel: Enable s_{and|or}n2_{b32|b64} patterns
Evgenii Stepanov [Wed, 5 Aug 2020 19:32:17 +0000 (12:32 -0700)]
[msan] Support %ms in scanf.
Differential Revision: https://reviews.llvm.org/D85350
Michael Kruse [Thu, 6 Aug 2020 20:44:47 +0000 (15:44 -0500)]
[flang][msvc] Do not use gcc/clang command line options for msvc.
The command line options `-Wno-error` and `-Wno-unused-parameter` are specific to gcc/clang, do not use them when compiling with other compilers.
This patch is part of the series to [[ http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html | make flang compilable with MS Visual Studio ]].
Reviewed By: isuruf
Differential Revision: https://reviews.llvm.org/D85355
Roman Lebedev [Thu, 6 Aug 2020 20:04:05 +0000 (23:04 +0300)]
[InstCombine] Fold (x + C1) * (-1<<C2) --> (-C1 - x) * (1<<C2)
Negator knows how to do this, but the one-use reasoning is getting
a bit muddy here, we don't really want to increase instruction count,
so we need to both lie that "IsNegation" and have a one-use check
on the outermost LHS value.
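The arithmetic identity behind the fold can be checked exhaustively for a small type. The following is a Python sketch that models i8 wrapping arithmetic with a mask; it is not the InstCombine/Negator code, and the function names are made up for the example.

```python
BITS = 8
MASK = (1 << BITS) - 1


def before_fold(x, c1, c2):
    """(x + C1) * (-1 << C2), wrapped to i8."""
    return ((x + c1) * (-1 << c2)) & MASK


def after_fold(x, c1, c2):
    """(-C1 - x) * (1 << C2), wrapped to i8."""
    return ((-c1 - x) * (1 << c2)) & MASK


# Compare both forms for every i8 value of x and C1 and every in-range shift amount C2.
fold_is_sound = all(
    before_fold(x, c1, c2) == after_fold(x, c1, c2)
    for x in range(1 << BITS)
    for c1 in range(1 << BITS)
    for c2 in range(BITS)
)
```

The folded form multiplies by a true power of two, which is the shape that can later become a left shift.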
Roman Lebedev [Thu, 6 Aug 2020 18:10:43 +0000 (21:10 +0300)]
[InstCombine] Generalize %x * (-1<<C) --> (-%x) * (1<<C) fold
Multiplication is commutative, and either of the operands can be negative,
so if the RHS is a negated power-of-two, we should try to make it
a true power-of-two (which will allow us to turn it into a left-shift),
by trying to sink the negation down into the LHS op.
But, we shouldn't re-invent the logic for sinking negation,
let's just use Negator for that.
Tests and original patch by: Simon Pilgrim @RKSimon!
Differential Revision: https://reviews.llvm.org/D85446
Roman Lebedev [Thu, 6 Aug 2020 19:43:39 +0000 (22:43 +0300)]
[NFC][InstCombine] Add some more tests for negation sinking into mul
Roman Lebedev [Thu, 6 Aug 2020 18:08:30 +0000 (21:08 +0300)]
[InstCombine] Fold sdiv exact X, -1<<C --> -(ashr exact X, C)
While that does increase instruction count,
shift is obviously better than a division.
Name: base
Pre: (1<<C1) >= 0
%o0 = shl i8 1, C1
%r = sdiv exact i8 C0, %o0
=>
%r = ashr exact i8 C0, C1
Name: neg
%o0 = shl i8 -1, C1
%r = sdiv exact i8 C0, %o0
=>
%t0 = ashr exact i8 C0, C1
%r = sub i8 0, %t0
Name: reverse
Pre: C1 != 0 && C1 u< 8
%t0 = ashr exact i8 C0, C1
%r = sub i8 0, %t0
=>
%o0 = shl i8 -1, C1
%r = sdiv exact i8 C0, %o0
https://rise4fun.com/Alive/MRplf
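The "neg" case above can also be sanity-checked by brute force over i8. This Python sketch is only a model: it skips inputs where the original sdiv is not exact or overflows (-128 / -1), since the source has no defined result there, and the helper names are hypothetical.

```python
BITS = 8


def to_signed(v):
    """Wrap an integer to a signed i8 value."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v >> (BITS - 1) else v


def sdiv_exact(x, d):
    """Signed division, only used when d divides x exactly."""
    assert d != 0 and x % d == 0
    return x // d


def folded(x, c):
    """-(ashr exact X, C), written as `sub i8 0, (ashr exact X, C)`."""
    t0 = x >> c  # Python's >> on a negative int is an arithmetic shift
    return to_signed(0 - t0)


def check_neg_fold():
    """Return the number of defined (x, C) pairs checked; raises on any mismatch."""
    checked = 0
    for c in range(BITS):
        divisor = to_signed(-1 << c)
        for x in range(-(1 << (BITS - 1)), 1 << (BITS - 1)):
            if x % divisor != 0:
                continue  # division is not exact: result is poison
            if x == -(1 << (BITS - 1)) and divisor == -1:
                continue  # signed overflow: undefined in the source
            assert sdiv_exact(x, divisor) == folded(x, c)
            checked += 1
    return checked


pairs_checked = check_neg_fold()
```

The check only covers 8-bit scalars; the Alive link above is the authoritative proof.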
Roman Lebedev [Thu, 6 Aug 2020 18:07:45 +0000 (21:07 +0300)]
[NFC][InstCombine] Negator: add a comment about negating exact arithmetic shift
Roman Lebedev [Thu, 6 Aug 2020 17:18:55 +0000 (20:18 +0300)]
[InstCombine] Generalize sdiv exact X, 1<<C --> ashr exact X, C fold to handle non-splat vectors
Roman Lebedev [Thu, 6 Aug 2020 18:04:03 +0000 (21:04 +0300)]
[NFC][InstCombine] Better tests for x s/EXACT (1 << y) pattern
Roman Lebedev [Thu, 6 Aug 2020 16:57:33 +0000 (19:57 +0300)]
[NFC][InstCombine] Tests for x s/EXACT (-1 << y) pattern
Adrian Prantl [Thu, 6 Aug 2020 17:52:16 +0000 (10:52 -0700)]
Unify the code that updates the ArchSpec after finding a fat binary
with how it is done for a lean binary
In particular this affects how target create --arch is handled — it
allowed us to override the deployment target (a useful feature for the
expression evaluator), but the fat binary case didn't.
rdar://problem/66024437
Differential Revision: https://reviews.llvm.org/D85049
(cherry picked from commit 470bdd3caaab0b6e0ffed4da304244be40b78668)
Richard Smith [Tue, 4 Aug 2020 22:33:57 +0000 (15:33 -0700)]
Add -Wtautological-value-range-compare warning.
This warning diagnoses cases where an expression is compared to a
constant, and the comparison is tautological due to the form of the
expression (but not merely due to its type). This applies in cases such
as comparisons of bit-fields and the result of bit-masks.
The new warning is added to the Clang diagnostic group
-Wtautological-constant-in-range-compare but not to the
formerly-equivalent GCC-compatibility diagnostic group -Wtype-limits,
which retains its old meaning of diagnosing only tautological
comparisons to extremal values of a type (eg, int > INT_MAX).
Reviewed By: rtrieu
Differential Revision: https://reviews.llvm.org/D85256
Craig Topper [Thu, 6 Aug 2020 19:44:30 +0000 (12:44 -0700)]
[LegalTypes] Move VSELECT node creation out of WidenVSELECTAndMask and push to 2 of the 3 callers.
One of the callers only wants the condition, but the vselect can
be simplified by getNode making it hard or impossible to retrieve
the condition.
Instead, return the condition and make the other 2 callers
responsible for creating the vselect node using the condition.
Rename the function to WidenVSELECTMask accordingly.
Differential Revision: https://reviews.llvm.org/D85468
Craig Topper [Thu, 6 Aug 2020 18:07:21 +0000 (11:07 -0700)]
[X86] Optimize out a few extra strlen calls in getX86TargetCPU. NFCI
We had a conversion from const char * to StringRef and a conversion
from const char * to std::string. Each of these does its own
strlen call if the compiler doesn't figure out how to share them.
By adding the temporary StringRef we can convert it to std::string
instead.
The other case is to use a StringSwitch<StringRef> instead of
StringSwitch<const char *> since the output values of the switch
are string literals. This allows the length to be computed at
compile time. Otherwise we have to convert from const char *
to std::string after the StringSwitch.
Craig Topper [Thu, 6 Aug 2020 16:23:22 +0000 (09:23 -0700)]
[X86] Make getX86TargetCPU return std::string instead of const char *. Remove call to MakeArgString. NFCI
I believe this function used to be called directly from X86
specific code and was used to immediately create -target-cpu
command line. A later refactoring changed it to be called from
a generic getCPU function that returns std::string. So on some
paths we created a string using MakeArgString, converted that to
std::string, then called MakeArgString again on that.
Instead just return std::string directly like the other targets.
Yonghong Song [Wed, 5 Aug 2020 21:11:35 +0000 (14:11 -0700)]
BPF: add a SimplifyCFG IR pass during generic Scalar/IPO optimization
The following bpf linux kernel selftest failed with latest
llvm:
$ ./test_progs -n 7/10
...
The sequence of 8193 jumps is too complex.
verification time 126272 usec
stack depth 320
processed 114799 insns (limit 1000000)
...
libbpf: failed to load object 'pyperf600_nounroll.o'
test_bpf_verif_scale:FAIL:110
#7/10 pyperf600_nounroll.o:FAIL
#7 bpf_verif_scale:FAIL
After some investigation, I found the following llvm patch
https://reviews.llvm.org/D84108
is responsible. The patch disabled hoisting common instructions
in SimplifyCFG by default. Later on, the code changes and a
SimplifyCFG phase with hoisting on cannot do the work any more.
A test is provided to demonstrate the problem.
The IR before simplifyCFG looks like:
for.cond:
%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
%cmp = icmp ult i32 %i.0, 6
br i1 %cmp, label %for.body, label %for.cond.cleanup
for.cond.cleanup:
%2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
%cmp2 = icmp eq i8* %2, null
%conv = zext i1 %cmp2 to i32
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3
ret i32 %conv
for.body:
%3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
%tobool.not = icmp eq i8* %3, null
br i1 %tobool.not, label %for.inc, label %land.lhs.true
The first two insns of `for.cond.cleanup` and `for.body`, load and
icmp, can be hoisted to `for.cond` block. With Patch D84108, the
optimization is delayed. But unfortunately, later on loop rotation
added additional phi nodes to `for.body` and hoisting cannot
be done any more.
Note such a hoisting is beneficial to bpf programs as
bpf verifier does path sensitive analysis and verification.
The hoisting prevents reloading from stack which will assume
conservative value and increase exploited insns. In this case,
it caused verifier failure.
To fix this problem, I added an IR pass from the bpf target
to perform an additional simplifycfg with hoisting of common
instructions enabled.
Differential Revision: https://reviews.llvm.org/D85434
Snehasish Kumar [Thu, 6 Aug 2020 00:11:48 +0000 (17:11 -0700)]
[NFC] Rename BBSectionsPrepare -> BasicBlockSections.
Rename the BBSectionsPrepare pass as suggested by the review comment in
https://reviews.llvm.org/D85368.
Differential Revision: https://reviews.llvm.org/D85380
Adrian Prantl [Thu, 6 Aug 2020 20:07:06 +0000 (13:07 -0700)]
Add missing override to Makefile
Sanjay Patel [Thu, 6 Aug 2020 20:05:04 +0000 (16:05 -0400)]
[InstSimplify] avoid crashing by trying to rem-by-zero
Bug was noted in the post-commit comments for:
rGe8760bb9a8a3
Jonas Devlieghere [Thu, 6 Aug 2020 20:02:52 +0000 (13:02 -0700)]
[LLDB] Skip test_launch_simple from TestTargetAPI.py when remote
Matt Arsenault [Wed, 6 May 2020 00:24:53 +0000 (20:24 -0400)]
clang: Use byref for aggregate kernel arguments
Add address space to indirect abi info and use it for kernels.
Previously, indirect arguments assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.
Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).
This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.
This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
Sanjay Patel [Thu, 6 Aug 2020 16:54:55 +0000 (12:54 -0400)]
[VectorCombine] add tests for load+insert; NFC
Adrian Prantl [Wed, 5 Aug 2020 20:57:14 +0000 (13:57 -0700)]
Correctly detect legacy iOS simulator Mach-O objectfiles
The code in ObjectFileMachO didn't disambiguate between ios and
ios-simulator object files for Mach-O objects using the legacy
ambiguous LC_VERSION_MIN load commands. This did not matter before we
taught ArchSpec that ios and ios-simulator are no longer compatible.
rdar://problem/66545307
Differential Revision: https://reviews.llvm.org/D85358
cgyurgyik [Thu, 6 Aug 2020 19:21:07 +0000 (15:21 -0400)]
[libc] Add tolower, toupper implementation.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D85326
Anton Afanasyev [Tue, 14 Jul 2020 15:04:25 +0000 (18:04 +0300)]
[SLP] Fix order of `insertelement`/`insertvalue` seed operands
Summary:
This patch takes the index operands of `insertelement`/`insertvalue`
into account when generating seed elements for `findBuildAggregate()`.
Previously, this function kept the original order of the `insert`s.
This patch also optimizes `findBuildAggregate()` by avoiding
redundant temporary vector allocations and repeated reversals.
Fixes llvm.org/pr44067
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83779
Evgenii Stepanov [Thu, 6 Aug 2020 18:32:33 +0000 (11:32 -0700)]
Fix CFI issues in <future>
This change fixes errors reported by Control Flow Integrity (CFI) checking when using `std::packaged_task`. The errors mostly stem from casting the underlying storage (`__buf_`) to `__base*`, even if it is uninitialized. The solution is to wrap `__base*` access to `__buf_` behind a getter marked with _LIBCPP_NO_CFI.
Differential Revision: https://reviews.llvm.org/D82627
Matt Arsenault [Thu, 6 Aug 2020 18:57:11 +0000 (14:57 -0400)]
Add freeze keyword to IR emacs mode
MaheshRavishankar [Thu, 6 Aug 2020 18:48:49 +0000 (11:48 -0700)]
[mlir][SPIR-V] Fix wrongly placed Rationale section.
Differential Revision: https://reviews.llvm.org/D85461
Jonas Devlieghere [Thu, 6 Aug 2020 18:45:12 +0000 (11:45 -0700)]
[lldb] Use target.GetLaunchInfo() instead of creating an empty one.
Update tests that were creating an empty LaunchInfo instead of using the
one coming from the target. This ensures target properties are honored.
Aleksandr Platonov [Thu, 6 Aug 2020 18:44:08 +0000 (21:44 +0300)]
[clangd] Fix crash in bugprone-bad-signal-to-kill-thread clang-tidy check.
Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to `Token::PtrData == nullptr` for tokens that the macro is defined to.
E.g. `#define SIGTERM 15`:
- Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
- Token::UintData == 2
- Token::PtrData == nullptr
As a result, the bugprone-bad-signal-to-kill-thread check crashes with a null dereference inside clangd.
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D85417
dfukalov [Fri, 31 Jul 2020 00:56:54 +0000 (03:56 +0300)]
[AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation.
Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.
Fixed line endings in test.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D84995
Matt Arsenault [Mon, 27 Jul 2020 13:58:17 +0000 (09:58 -0400)]
GlobalISel: Implement fewerElementsVector for G_EXTRACT_VECTOR_ELT
Use the same basic strategy as LegalizeVectorTypes. Try to index into
smaller pieces if there's a constant index, and otherwise fall back to
a stack temporary.
Arthur Eubanks [Wed, 5 Aug 2020 20:49:00 +0000 (13:49 -0700)]
[NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch
As mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html,
loop-unswitch has not been ported to the NPM. Instead people are using
simple-loop-unswitch.
Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all
other uses of loop-unswitch with simple-loop-unswitch.
One test that didn't fit into the above was
2014-06-21-congruent-constant.ll which seems to only pass with
loop-unswitch. That is also pinned to legacy PM.
Now all tests containing "-loop-unswitch" anywhere in the test succeed with
NPM turned on by default.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D85360
Aaron En Ye Shi [Wed, 5 Aug 2020 19:53:30 +0000 (19:53 +0000)]
[HIP] Ignore invalid ar linker options
Instead of accepting the same arguments as regular linker,
the static linker will only accept input files.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D85442
Fred Riss [Thu, 6 Aug 2020 17:38:06 +0000 (10:38 -0700)]
[lldb/testsuite] Change get_debugserver_exe to support Rosetta
In order to be able to run the debugserver tests against the Rosetta
debugserver, detect the Rosetta run configuration and return the
system Rosetta debugserver.
Arthur Eubanks [Thu, 6 Aug 2020 04:32:26 +0000 (21:32 -0700)]
[NewPM] Pin -assumption-cache-tracker tests to legacy PM
All tests have corresponding NPM RUN lines.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D85395
Matt Arsenault [Tue, 4 Aug 2020 21:42:47 +0000 (17:42 -0400)]
AMDGPU: Define raw/struct variants of buffer atomic fadd
Somehow the new FP atomic buffer intrinsics ended up using the legacy
style for buffer intrinsics.
Sterling Augustine [Thu, 6 Aug 2020 17:16:23 +0000 (10:16 -0700)]
Remove unused variable "saved_opts".
wattr_get is a macro, and the documentation states:
"The parameter opts is reserved for future use,
applications must supply a null pointer."
In practice, passing a variable there is harmless, except
that it is unused inside the macro, which causes unused
variable warnings.
The various places where
Matt Arsenault [Thu, 6 Aug 2020 17:10:08 +0000 (13:10 -0400)]
AArch64/GlobalISel: Fix verifier error after selecting returnaddress
This was caching the wrong register to re-use later.
Simon Pilgrim [Thu, 6 Aug 2020 17:00:06 +0000 (18:00 +0100)]
[SLP][X86] Regenerate sdiv test noticed in D83779. NFC.
Mircea Trofin [Thu, 6 Aug 2020 16:21:14 +0000 (09:21 -0700)]
[NFC][MLInliner] Point out the tests' model dependencies
Matt Arsenault [Fri, 31 Jul 2020 23:31:07 +0000 (19:31 -0400)]
AMDGPU: Fix spilling of 96-bit AGPRs
Matt Arsenault [Thu, 2 Jul 2020 02:34:16 +0000 (22:34 -0400)]
AMDGPU/GlobalISel: Start trying to handle AGPR bank
Try to use AGPR banks for the various merge/unmerge type
operations. Previously these would introduce copies to VGPR.
Matt Arsenault [Thu, 16 Jul 2020 21:18:43 +0000 (17:18 -0400)]
GlobalISel: Define InvalidRegBankID enum value
Alexey Bataev [Thu, 6 Aug 2020 16:36:52 +0000 (12:36 -0400)]
[OPENMP]Fix for Windows buildbots, NFC.
Alexey Bataev [Fri, 26 Jun 2020 21:42:31 +0000 (17:42 -0400)]
[OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation.
Summary:
Introduced OMPChildren class to handle all associated clauses, statement
and child expressions/statements. It allows to represent some directives
more correctly (like flush, depobj etc. with pseudo clauses, ordered
depend directives, which are standalone, and target data directives).
Also, it will make easier to avoid using of CapturedStmt in directives,
if required (atomic, tile etc. directives).
Also, it simplifies serialization/deserialization of the
executable/declarative directives.
Reduces number of allocation operations for mapper declarations.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83261
Simon Pilgrim [Thu, 6 Aug 2020 16:13:13 +0000 (17:13 +0100)]
[InstCombine] Add tests for mul(add(x,c),negpow2) -> mul(sub(-c,x),pow2) fold
Also fix some undef vector elements in the similar vector tests that I missed.
Mircea Trofin [Thu, 6 Aug 2020 16:04:15 +0000 (09:04 -0700)]
[llvm][MLInliner] Don't log 'mandatory' events
We don't want mandatory events in the training log. We do want to handle
them, to keep the native size accounting accurate, but that's all.
Fixed the code, also expanded the test to capture this.
Differential Revision: https://reviews.llvm.org/D85373
Joel E. Denny [Thu, 6 Aug 2020 15:36:27 +0000 (11:36 -0400)]
[OpenMP] Fix ref count dec for implicit map of partial data
D85342 broke this case. The new test case presents an example.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D85369