Anastasia Stulova [Fri, 10 Jul 2020 18:04:49 +0000 (19:04 +0100)]
[OpenCL] Fixed typo for ctor stub name in UsersManual
Alexandre Ganea [Fri, 10 Jul 2020 16:47:58 +0000 (12:47 -0400)]
Re-land [CodeView] Add full repro to LF_BUILDINFO record
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).
Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.
For more information see PR36198 and D43002.
Differential Revision: https://reviews.llvm.org/D80833
Craig Topper [Fri, 10 Jul 2020 17:41:46 +0000 (10:41 -0700)]
[IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
This matches the recent change to InstSimplify from D83440.
Differential Revision: https://reviews.llvm.org/D83535
Luke Geeson [Wed, 1 Jul 2020 11:50:36 +0000 (12:50 +0100)]
[ARM] Add Cortex-A78 and Cortex-X1 Support for Clang and LLVM
This patch upstreams support for the Arm-v8 Cortex-A78 and Cortex-X1
processors for AArch64 and ARM.
In detail:
- Adding cortex-a78 and cortex-x1 as cpu options for aarch64 and arm targets in clang
- Adding Cortex-A78 and Cortex-X1 CPU names and ProcessorModels in llvm
Details of the CPUs can be found here:
https://www.arm.com/products/cortex-x
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78
The following people contributed to this patch:
- Luke Geeson
- Mikhail Maltsev
Reviewers: t.p.northover, dmgreen
Reviewed By: dmgreen
Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D83206
Benjamin Kramer [Fri, 10 Jul 2020 17:13:47 +0000 (19:13 +0200)]
[CGProfile] Fix layering, IPO depends in Instrumentation.
Sergej Jaskiewicz [Fri, 10 Jul 2020 17:01:50 +0000 (20:01 +0300)]
Revert "[compiler-rt] [test] Use the parent process env as base env in tests"
This reverts commit 5ab446cfe5503fd4431a94db4d741cf3b5fdcd15.
That commit caused memory sanitizer test failures on PowerPC buildbots.
Daniel Grumberg [Fri, 10 Jul 2020 16:54:44 +0000 (17:54 +0100)]
Remove clang options that were added back when merging the TableGen files
Saleem Abdulrasool [Fri, 10 Jul 2020 16:33:54 +0000 (09:33 -0700)]
repair standalone clang builds
Add missing C++ language standard setup for clang standalone build.
Patch by Michele Scandale!
Differential Revision: https://reviews.llvm.org/D83426
Eduardo Caldas [Fri, 10 Jul 2020 09:23:09 +0000 (09:23 +0000)]
Use FileRange::text instead of Lexer::getSpelling
* as we are using them only for integer and floating literals they have
the same behavior
* FileRange::text is simpler to call and is within the context of
syntax trees
Eduardo Caldas [Thu, 9 Jul 2020 15:49:15 +0000 (15:49 +0000)]
Add kinded UDL for raw literal operator and numeric literal operator template
Eduardo Caldas [Mon, 15 Jun 2020 17:08:39 +0000 (17:08 +0000)]
Fix crash on `user defined literals`
Summary:
Given a UserDefinedLiteral `1.2_w`:
Problem: Lexer generates one Token for the literal, but ClangAST
references two source locations
Fix: Ignore the operator and interpret it as the underlying literal.
e.g.: `1.2_w` token generates syntax node IntegerLiteral(1.2_w)
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82157
Kang Zhang [Fri, 10 Jul 2020 16:08:07 +0000 (16:08 +0000)]
[NFC][PowerPC] Add a new MIR file to test mi-peephole pass
Zequan Wu [Thu, 9 Jul 2020 21:56:06 +0000 (14:56 -0700)]
[Lexer] Fix missing coverage line after #endif
Summary: bug reported here: https://bugs.llvm.org/show_bug.cgi?id=46660
Reviewers: vsk, efriedma, arphaman
Reviewed By: vsk
Subscribers: dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83514
Zequan Wu [Wed, 8 Jul 2020 19:30:28 +0000 (12:30 -0700)]
[LPM] Port CGProfilePass from NPM to LPM
Reviewers: hans, chandlerc!, asbirlea, nikic
Reviewed By: hans, nikic
Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D83013
Roman Lebedev [Fri, 10 Jul 2020 15:54:48 +0000 (18:54 +0300)]
Revert "[OpenMPOpt] ICV Tracking"
There appears to be some kind of memory corruption/use-after-free/etc
going on here. In particular, in `OpenMPOpt::deleteParallelRegions()`,
in `DeleteCallCB()`, `CI` is garbage.
Will post reproducer in the original review.
This reverts commit 6c4a5e9257bac022ffe60e466686ba7fc96ffd1a.
Daniel Grumberg [Fri, 10 Jul 2020 15:11:54 +0000 (16:11 +0100)]
Delete CC1Options.td, since it should have happened in D82574
Florian Hahn [Fri, 10 Jul 2020 15:43:35 +0000 (16:43 +0100)]
[ARM] Pass -verify-machineinstr to test and XFAIL until fixed.
Some bots run with -verify-machineinstr enabled. Add it to the new test
and XFAIL it until fixed.
Johannes Doerfert [Fri, 10 Jul 2020 15:37:31 +0000 (10:37 -0500)]
[Attributor][NFC] Update tests after recent changes
Attributor tests are mostly updated using the auto upgrade scripts but
sometimes we forget. If we do it manually or continue using old check
lines that still match we see unrelated changes down the line. This is
just a cleanup.
Florian Hahn [Fri, 10 Jul 2020 15:39:15 +0000 (16:39 +0100)]
[DomTreeUpdater] Use const auto * when iterating over pointers (NFC).
This silences the warning below:
llvm-project/llvm/lib/Analysis/DomTreeUpdater.cpp:510:20: warning: loop variable 'BB' is always a copy because the range of type 'const SmallPtrSet<llvm::BasicBlock *, 8>' does not return a reference [-Wrange-loop-analysis]
for (const auto &BB : DeletedBBs) {
^
llvm-project/llvm/lib/Analysis/DomTreeUpdater.cpp:510:8: note: use non-reference type 'llvm::BasicBlock *'
for (const auto &BB : DeletedBBs) {
^~~~~~~~~~~~~~~~
1 warning generated.
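As a minimal illustration of the fix described above (not the actual DomTreeUpdater code): when the elements being iterated are already pointers, `const auto *` states the intent directly and avoids binding a reference to a temporary copy. `Block` and `countNamed` are hypothetical names, and `std::vector` stands in for `SmallPtrSet` here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::BasicBlock.
struct Block {
  bool HasName;
};

// Counts the named blocks. Each element is already a pointer, so the loop
// variable is declared as a pointer-to-const rather than as a reference.
std::size_t countNamed(const std::vector<Block *> &Blocks) {
  std::size_t N = 0;
  for (const auto *BB : Blocks)
    if (BB->HasName)
      ++N;
  return N;
}
```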
Florian Hahn [Fri, 10 Jul 2020 15:08:25 +0000 (16:08 +0100)]
[ARM] Add test with tcreturn and debug value.
In the attached test case, a non-terminator instruction (DBG_VALUE) is
inserted after a terminator, producing an invalid MBB.
Sanjay Patel [Fri, 10 Jul 2020 00:03:49 +0000 (20:03 -0400)]
[AArch64][x86] add tests for rotated store merge; NFC
Sanjay Patel [Thu, 9 Jul 2020 21:49:14 +0000 (17:49 -0400)]
[DAGCombiner] move/rename variables for readability; NFC
Nicolas Vasilache [Fri, 10 Jul 2020 13:49:22 +0000 (09:49 -0400)]
[mlir][Vector] Add ExtractOp folding when fed by a TransposeOp
TransposeOp are often followed by ExtractOp.
In certain cases however, it is unnecessary (and even detrimental) to lower a TransposeOp to either a flat transpose (llvm.matrix intrinsics) or to unrolled scalar insert / extract chains.
Providing foldings of ExtractOp mitigates some of the unnecessary complexity.
Differential revision: https://reviews.llvm.org/D83487
Joel E. Denny [Fri, 10 Jul 2020 11:50:42 +0000 (07:50 -0400)]
[FileCheck] Implement -dump-input-filter
This makes the input dump filtering implemented by D82203 more
configurable. D82203 enables filtering out everything but the initial
input lines of error diagnostics (plus some context). This patch
enables including any line with any kind of annotation.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D83097
Joel E. Denny [Fri, 10 Jul 2020 11:50:31 +0000 (07:50 -0400)]
[FileCheck] In input dump, elide only if ellipsis is shorter
For example, given `-dump-input-context=3 -vv`, the following now
shows more leading context for the error than requested because a
leading ellipsis would occupy the same number of lines as it would
elide:
```
<<<<<<
1: foo6
2: foo5
3: foo4
4: foo3
5: foo2
6: foo1
7: hello world
check:1 ^~~~~
check:2 X~~~~ error: no match found
8: foo1
check:2 ~~~~
9: foo2
check:2 ~~~~
10: foo3
check:2 ~~~~
.
.
.
>>>>>>
```
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D83526
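The elision rule can be sketched roughly as follows. This is an illustration under assumptions, not FileCheck's implementation: `fillShortGaps` is a hypothetical helper, lines are 1-based (index 0 is unused), and the ellipsis is assumed to occupy three lines as in the dump above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Show[i] says whether input line i (1-based) is selected for the dump.
// A run of hidden lines is only worth eliding when the ellipsis is shorter
// than the run, so any run of at most EllipsisLines lines is shown instead.
std::vector<bool> fillShortGaps(std::vector<bool> Show,
                                std::size_t EllipsisLines = 3) {
  std::size_t I = 1;
  while (I < Show.size()) {
    if (Show[I]) {
      ++I;
      continue;
    }
    std::size_t J = I;
    while (J < Show.size() && !Show[J])
      ++J; // [I, J) is a maximal run of hidden lines
    if (J - I <= EllipsisLines)
      for (std::size_t K = I; K < J; ++K)
        Show[K] = true;
    I = J;
  }
  return Show;
}
```

With lines 4 to 10 selected out of 20, the three leading lines are shown as well, while the ten trailing lines stay elided.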
Joel E. Denny [Fri, 10 Jul 2020 11:50:00 +0000 (07:50 -0400)]
[FileCheck] Implement -dump-input-context
This patch is motivated by discussions at each of:
* <https://reviews.llvm.org/D81422>
* <http://lists.llvm.org/pipermail/llvm-dev/2020-June/142369.html>
When input is dumped as specified by `-dump-input=fail`, this patch
filters the dump to show only input lines that are the starting lines
of error diagnostics plus the number of contextual lines specified by
`-dump-input-context` (defaults to 5).
When `-dump-input=always`, there might not be any errors, so all
input lines are printed, as without this patch.
Here's some sample output with `-dump-input-context=3 -vv`:
```
<<<<<<
.
.
.
13: foo
14: foo
15: hello world
check:1 ^~~~~~~~~~~
16: foo
check:2'0 X~~ error: no match found
17: foo
check:2'0 ~~~
18: foo
check:2'0 ~~~
19: foo
check:2'0 ~~~
.
.
.
27: foo
check:2'0 ~~~
28: foo
check:2'0 ~~~
29: foo
check:2'0 ~~~
30: goodbye word
check:2'0 ~~~~~~~~~~~~
check:2'1 ? possible intended match
31: foo
check:2'0 ~~~
32: foo
check:2'0 ~~~
33: foo
check:2'0 ~~~
.
.
.
>>>>>>
```
Reviewed By: mehdi_amini, arsenm, jhenderson, rsmith, SjoerdMeijer, Meinersbur, lattner
Differential Revision: https://reviews.llvm.org/D82203
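A rough sketch of the filtering described in this commit, assuming 1-based line numbers. `selectDumpLines` and its parameters are hypothetical names for illustration and do not come from FileCheck itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a vector where element i (1-based; index 0 is unused) says whether
// input line i is shown: a line is shown when it lies within Context lines
// of the starting line of some error diagnostic.
std::vector<bool> selectDumpLines(std::size_t NumLines,
                                  const std::vector<std::size_t> &DiagStartLines,
                                  std::size_t Context) {
  std::vector<bool> Show(NumLines + 1, false);
  for (std::size_t Line : DiagStartLines) {
    if (Line == 0 || Line > NumLines)
      continue; // ignore out-of-range line numbers
    const std::size_t First = Line > Context ? Line - Context : 1;
    const std::size_t Last = std::min(NumLines, Line + Context);
    for (std::size_t I = First; I <= Last; ++I)
      Show[I] = true;
  }
  return Show;
}
```

With diagnostics starting on lines 16 and 30 and a context of 3, this selects lines 13 to 19 and 27 to 33, which matches the sample output above.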
Alexandre Ganea [Fri, 10 Jul 2020 14:38:29 +0000 (10:38 -0400)]
[PDB] Fix out-of-bounds access when sorting GSI buckets
When building in Debug on Windows-MSVC after b7402edce315, a lot of tests were failing because we were dereferencing an element past the end of HashRecords. This happened towards the end of the table, in unused slots.
Sam McCall [Fri, 10 Jul 2020 14:08:14 +0000 (16:08 +0200)]
[clangd] Update semanticTokens support to reflect latest LSP draft
Summary: Mostly a few methods and message names have been renamed.
Reviewers: hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83556
Roman Lebedev [Fri, 10 Jul 2020 13:55:44 +0000 (16:55 +0300)]
Reland "[InstCombine] Lower infinite combine loop detection thresholds"
This relands commit cd7f8051ac7b6f08734102446482c1e5d951bfcc, which was
reverted because the lower thresholds successfully found an issue.
Now that the issue is fixed, let's wait until the next one is reported.
This reverts commit caa423eef0d128f35ac11ddbce34964caafb61c1.
Roman Lebedev [Fri, 10 Jul 2020 13:40:15 +0000 (16:40 +0300)]
[InstCombine] After merging store into successor, queue prev. store to be visited (PR46661)
We can end up with many stores eligible for the transform, but because of
our visitation order (top to bottom), once we have processed the first
eligible instruction we would not try to reprocess the previous
instructions that are now also eligible.
So after we've successfully merged a store that was the second-to-last
instruction into the successor, if the now-second-to-last instruction is
also an eligible store, add it to the worklist to be revisited.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46661
Roman Lebedev [Fri, 10 Jul 2020 13:30:23 +0000 (16:30 +0300)]
[NFCI][InstCombine] PR46661: multiple stores eligible for merging into successor - worklist issue
The testcase should pass with a single instcombine iteration.
Kevin P. Neal [Fri, 10 Jul 2020 14:31:41 +0000 (10:31 -0400)]
[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support.
Use the new -fexperimental-strict-floating-point flag in more cases to
fix the arm and aarch64 bots.
Differential Revision: https://reviews.llvm.org/D80952
Nicolas Vasilache [Fri, 10 Jul 2020 14:21:45 +0000 (10:21 -0400)]
[mlir][Linalg] Generalize Vectorization of Linalg contractions
This revision adds support for vectorizing named and generic contraction ops to vector.contract. Cases in which the memref is 0-D are special cased to emit std.load/std.store instead of vector.transfer. Relevant tests are added.
Differential revision: https://reviews.llvm.org/D83307
Haojian Wu [Fri, 10 Jul 2020 14:18:10 +0000 (16:18 +0200)]
[clangd] Fix hover crash on InitListExpr.
Fixes https://github.com/clangd/clangd/issues/455
Differential Revision: https://reviews.llvm.org/D83546
Nicolas Vasilache [Fri, 10 Jul 2020 13:31:02 +0000 (09:31 -0400)]
[mlir][Vector] Fold chains of ExtractOp
This revision adds folding to ExtractOp by simply concatenating the position attributes.
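The idea can be sketched as follows, assuming a position is a plain list of indices. `Position` and `foldExtractChain` are hypothetical names for illustration, not MLIR APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Position = std::vector<std::int64_t>;

// An extract at position Second applied to the result of an extract at
// position First addresses the same element (or sub-vector) as one extract
// from the original source at position First followed by Second.
Position foldExtractChain(const Position &First, const Position &Second) {
  Position Folded(First);
  Folded.insert(Folded.end(), Second.begin(), Second.end());
  return Folded;
}
```

For example, extracting at `[1, 2]` and then at `[3]` folds to a single extract at `[1, 2, 3]`.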
Daniel Grumberg [Fri, 10 Jul 2020 12:57:59 +0000 (13:57 +0100)]
Normalize default value for -triple correctly
Kevin P. Neal [Fri, 10 Jul 2020 12:46:09 +0000 (08:46 -0400)]
Reland "[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support."
We currently have strict floating point/constrained floating point enabled
for all targets. Constrained SDAG nodes get converted to the regular ones
before reaching the target layer. In theory this should be fine.
However, the changes are exposed to users through multiple clang options
already in use in the field, and the changes are _completely_ _untested_
on almost all of our targets. Bugs have already been found, like
"https://bugs.llvm.org/show_bug.cgi?id=45274".
This patch disables constrained floating point options in clang everywhere
except X86 and SystemZ. A warning will be printed when this happens.
Use the new -fexperimental-strict-floating-point flag to force allowing
strict floating point on hosts that aren't already marked as supporting
it (X86 and SystemZ).
Differential Revision: https://reviews.llvm.org/D80952
David Green [Fri, 10 Jul 2020 12:16:17 +0000 (13:16 +0100)]
Revert "[BasicAA] Enable -basic-aa-recphi by default"
This reverts commit af839a96187e3538d63ad57571e4bdf01e2b15c5.
Some issues appear to be caused by this. Reverting whilst we
investigate.
Sam McCall [Thu, 9 Jul 2020 21:33:46 +0000 (23:33 +0200)]
[clangd] Config: If.PathExclude
Reviewers: hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83511
Victor Huang [Fri, 10 Jul 2020 11:47:47 +0000 (06:47 -0500)]
[PowerPC] Implement R_PPC64_REL24_NOTOC calls, callee also has no TOC
The PC Relative code allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.
This patch is added to support local calls to callees that also do not have a TOC.
Reviewed By: sfertile, MaskRay, stefanp
Differential Revision: https://reviews.llvm.org/D82816
Ulrich Weigand [Fri, 10 Jul 2020 11:52:47 +0000 (13:52 +0200)]
[ABI] Handle C++20 [[no_unique_address]] attribute
Many platform ABIs have special support for passing aggregates that
either just contain a single member of floating-point type, or else
a homogeneous set of members of the same floating-point type.
When making this determination, any extra "empty" members of the
aggregate type will typically be ignored. However, in C++ (at least
in all prior versions), no data member would actually count as empty,
even if its type is an empty record -- it would still be considered
to take up at least one byte of space, and therefore make those ABI
special cases not apply.
This is now changing in C++20, which introduced the [[no_unique_address]]
attribute. Members of empty record type, if they also carry this
attribute, now do *not* take up any space in the type, and therefore
the ABI special cases for single-element or homogeneous aggregates
should apply.
The C++ Itanium ABI has been updated accordingly, and GCC 10 has
added support for this new case. This patch now adds support to
LLVM. This is cross-platform; it affects all platforms that use
the single-element or homogeneous aggregate ABI special case and
implement this using any of the following common subroutines
in lib/CodeGen/TargetInfo.cpp:
isEmptyField
isEmptyRecord
isSingleElementStruct
isHomogeneousAggregate
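A small sketch of the language-level effect that motivates this change. The struct names are hypothetical, and whether the attributed member really takes no space depends on the compiler and ABI (it does under the Itanium C++ ABI with compilers that implement the attribute), so the sketch only exposes the sizes rather than asserting exact values.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};

// Without the attribute, the empty member still occupies at least one byte,
// so this type is not treated as a single-element float aggregate.
struct Plain {
  float Value;
  Empty Tag;
};

// With the attribute, the empty member may occupy no storage at all, and the
// type can then qualify for the single-element ABI special case.
struct Tagged {
  float Value;
  [[no_unique_address]] Empty Tag;
};

constexpr std::size_t plainSize() { return sizeof(Plain); }
constexpr std::size_t taggedSize() { return sizeof(Tagged); }
```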
Simon Pilgrim [Fri, 10 Jul 2020 11:47:02 +0000 (12:47 +0100)]
DomTreeUpdater::dump() - use const auto& iterator in for-range-loop.
Avoids unnecessary copies and silences clang tidy warning.
Nathan James [Fri, 10 Jul 2020 11:27:08 +0000 (12:27 +0100)]
[clang-tidy] Use Options priority in enum options where it was missing
Simon Pilgrim [Fri, 10 Jul 2020 11:07:37 +0000 (12:07 +0100)]
StackSafetyAnalysis.cpp - pass ConstantRange arg as const reference.
Avoids unnecessary copies and silences clang tidy warning - we do this in most places, there are just a few that were missed.
Simon Pilgrim [Fri, 10 Jul 2020 10:20:46 +0000 (11:20 +0100)]
[X86][SSE] Use shouldUseHorizontalOp helper to determine whether to use (F)HADD. NFCI.
dstuttar [Wed, 8 Jul 2020 10:02:47 +0000 (11:02 +0100)]
[NFC] Change isFPPredicate comparison to ignore lower bound
Summary:
Since the Predicate enum was changed to be unsigned, isFPPredicate no longer
needs to check the lower bound, since that check will always evaluate to true.
Also fixed a similar issue in SIISelLowering.cpp by removing the need for
comparing to FIRST and LAST predicates.
Added an assert to the isFPPredicate comparison to flag if the
FIRST_FCMP_PREDICATE is ever changed to anything other than 0, in which case the
logic will break.
Without this change warnings are generated in VS.
Change-Id: I358f0daf28c0628c7bda8ad4cab4e1757b761bab
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83540
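A simplified sketch of the pattern described in the summary. The enum below is a cut-down stand-in that lists only a few values, and a `static_assert` is used here to express the guard; the commit itself only says an assert was added, so the exact form is an assumption.

```cpp
#include <cassert>

// Cut-down stand-in for the predicate enum; most values are omitted.
enum Predicate : unsigned {
  FCMP_FALSE = 0,
  FCMP_OEQ = 1,
  FCMP_TRUE = 15,
  FIRST_FCMP_PREDICATE = FCMP_FALSE,
  LAST_FCMP_PREDICATE = FCMP_TRUE,
  ICMP_EQ = 32
};

// The upper-bound-only check below is valid only while the first FP
// predicate is 0; this guard flags any future change to that value.
static_assert(FIRST_FCMP_PREDICATE == 0,
              "isFPPredicate relies on FIRST_FCMP_PREDICATE being 0");

bool isFPPredicate(Predicate P) {
  // For an unsigned value, P >= 0 is always true, so comparing against the
  // lower bound would only trigger a compiler warning.
  return P <= LAST_FCMP_PREDICATE;
}
```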
Paul Walker [Fri, 10 Jul 2020 10:37:19 +0000 (10:37 +0000)]
[SVE] Code generation for fixed length vector truncates.
Lower fixed length vector truncates to a sequence of SVE UZP1 instructions.
Differential Revision: https://reviews.llvm.org/D83395
Pavel Labath [Fri, 10 Jul 2020 09:53:47 +0000 (11:53 +0200)]
[lldb/pecoff] Use a different llvm createBinary overload for parsing
Change the code to use the version which accepts a memory buffer,
instead of the one taking a file name.
This ensures we are not loading the file into memory twice
(ObjectFilePECOFF also loads a copy), reducing our memory footprint, as
well as enabling additional goodies in the future, like being able to
open files which don't exist on disk (D83512).
Haojian Wu [Fri, 10 Jul 2020 09:42:04 +0000 (11:42 +0200)]
[clang-tidy] More strict on matching the standard memset function in memset-usage check.
The check assumed the matched function call has 3 arguments, but the
matcher didn't guarantee that.
Differential Revision: https://reviews.llvm.org/D83301
Florian Hahn [Fri, 10 Jul 2020 08:45:02 +0000 (09:45 +0100)]
[LV] Pick vector loop body as insert point for SCEV expansion.
Currently the DomTree is not kept up to date for additional blocks
generated in the vector loop, for example when vectorizing with
predication. SCEVExpander relies on dominance checks when looking for
existing instructions to re-use and in some cases that can lead to the
expander picking instructions that do not actually dominate their insert
point (e.g. as in PR46525).
Unfortunately keeping the DT up-to-date is a bit tricky, because the CFG
is only patched up after generating code for a block. For now, we can
just use the vector loop header, as this ensures the inserted
instructions dominate all uses in the vector loop. There should be no
noticeable impact on the generated code, as other passes should sink
those instructions, if profitable.
Fixes PR46525.
Reviewers: Ayal, gilr, mkazantsev, dmgreen
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D83288
Mirko Brkusanin [Fri, 10 Jul 2020 09:32:32 +0000 (11:32 +0200)]
[AMDGPU][GlobalISel] Fix G_AMDGPU_TBUFFER_STORE_FORMAT mapping
Add missing mappings and tablegen definitions for TBUFFER_STORE_FORMAT.
Differential Revision: https://reviews.llvm.org/D83240
Simon Pilgrim [Thu, 9 Jul 2020 16:36:26 +0000 (17:36 +0100)]
extractConstantWithoutWrapping - use const APInt& returned by SCEVConstant::getAPInt()
Avoids unnecessary APInt copies and silences clang tidy warning.
Vitaly Buka [Fri, 10 Jul 2020 08:24:58 +0000 (01:24 -0700)]
Fix check-all with -DLLVM_USE_SANITIZER=Address
Simon Pilgrim [Fri, 10 Jul 2020 08:33:10 +0000 (09:33 +0100)]
[X86][AVX] Attempt to fold PACK(SHUFFLE(X,Y),SHUFFLE(X,Y)) -> SHUFFLE(PACK(X,Y)).
Truncations lowered as shuffles of multiple (concatenated) vectors often leave us with lane-crossing shuffles that feed a PACKSS/PACKUS, if both shuffles are fed from the same 2 vector sources, then we can PACK the sources directly and shuffle the result instead.
This is currently limited to whole i128 lanes in a 256-bit vector, but we can extend this if the need arises (but I'm not seeing many examples in real world code).
Valeriy Savchenko [Fri, 10 Jul 2020 08:20:36 +0000 (11:20 +0300)]
[analyzer][tests] Fix zip unpacking
Differential Revision: https://reviews.llvm.org/D83374
Valeriy Savchenko [Fri, 10 Jul 2020 08:20:20 +0000 (11:20 +0300)]
[analyzer][tests] Make test interruption safe
Differential Revision: https://reviews.llvm.org/D83373
Valeriy Savchenko [Fri, 10 Jul 2020 07:54:18 +0000 (10:54 +0300)]
[analyzer][tests] Measure peak memory consumption for every project
Differential Revision: https://reviews.llvm.org/D82967
Danila Kutenin [Fri, 10 Jul 2020 07:46:57 +0000 (09:46 +0200)]
[builtins] Optimize udivmodti4 for many platforms.
Summary:
While benchmarking uint128 division we found out that it has huge latency for small divisors
https://reviews.llvm.org/D83027
```
Benchmark                                                   Time(ns)  CPU(ns)  Iterations
-----------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>      13.0      13.0     55000000
BM_DivideIntrinsic128UniformDivisor<__int128>               14.3      14.3     50000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>   13.5      13.5     52000000
BM_RemainderIntrinsic128UniformDivisor<__int128>            14.1      14.1     50000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>        153       153      5000000
BM_DivideIntrinsic128SmallDivisor<__int128>                 170       170      3000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>     153       153      5000000
BM_RemainderIntrinsic128SmallDivisor<__int128>              155       155      5000000
```
This patch suggests a more optimized version of the division:
If the divisor fits in 64 bits, we can proceed with the divq instruction on x86 or with constant multiplication mechanisms on other platforms. Once both divisor and dividend are not less than 2**64, we use a branch-free subtract algorithm, which takes at most 64 cycles. After that, our benchmarks improved significantly:
```
Benchmark                                                   Time(ns)  CPU(ns)  Iterations
-----------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>      11.0      11.0     64000000
BM_DivideIntrinsic128UniformDivisor<__int128>               13.8      13.8     51000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>   11.6      11.6     61000000
BM_RemainderIntrinsic128UniformDivisor<__int128>            13.7      13.7     52000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>        27.1      27.1     26000000
BM_DivideIntrinsic128SmallDivisor<__int128>                 29.4      29.4     24000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>     27.9      27.8     26000000
BM_RemainderIntrinsic128SmallDivisor<__int128>              29.1      29.1     25000000
```
If not using divq intrinsics, it is still much better:
```
Benchmark                                                   Time(ns)  CPU(ns)  Iterations
-----------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>      12.2      12.2     58000000
BM_DivideIntrinsic128UniformDivisor<__int128>               13.5      13.5     52000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>   12.7      12.7     56000000
BM_RemainderIntrinsic128UniformDivisor<__int128>            13.7      13.7     51000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>        30.2      30.2     24000000
BM_DivideIntrinsic128SmallDivisor<__int128>                 33.2      33.2     22000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>     31.4      31.4     23000000
BM_RemainderIntrinsic128SmallDivisor<__int128>              33.8      33.8     21000000
```
PowerPC benchmarks:
Was
```
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>      22.3      22.3     32000000
BM_DivideIntrinsic128UniformDivisor<__int128>               23.8      23.8     30000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>   22.5      22.5     32000000
BM_RemainderIntrinsic128UniformDivisor<__int128>            24.9      24.9     29000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>        394       394      2000000
BM_DivideIntrinsic128SmallDivisor<__int128>                 397       397      2000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>     399       399      2000000
BM_RemainderIntrinsic128SmallDivisor<__int128>              397       397      2000000
```
With this patch
```
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>      21.7      21.7     33000000
BM_DivideIntrinsic128UniformDivisor<__int128>               23.0      23.0     31000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>   21.9      21.9     33000000
BM_RemainderIntrinsic128UniformDivisor<__int128>            23.9      23.9     30000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>        32.7      32.6     23000000
BM_DivideIntrinsic128SmallDivisor<__int128>                 33.4      33.4     21000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>     31.1      31.1     22000000
BM_RemainderIntrinsic128SmallDivisor<__int128>              33.2      33.2     22000000
```
My email: danilak@google.com, I don't have commit rights
Reviewers: howard.hinnant, courbet, MaskRay
Reviewed By: courbet
Subscribers: steven.zhang, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D81809
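The shift-and-subtract part of the strategy described in this commit's summary can be sketched as below. This is an illustration under stated assumptions, not the compiler-rt code: `udivmod128` and `clz128` are hypothetical names, the 64-bit-divisor fast path (divq or constant multiplication) is left out entirely, and the sketch relies on the GCC/Clang extensions `unsigned __int128` and `__builtin_clzll`. The loop body avoids `if` by turning the comparison into an all-ones or all-zeros mask; whether the compiler emits branch-free machine code is up to it.

```cpp
#include <cassert>
#include <cstdint>

__extension__ typedef unsigned __int128 u128;

// Number of leading zero bits in a non-zero 128-bit value.
static int clz128(u128 X) {
  const std::uint64_t Hi = static_cast<std::uint64_t>(X >> 64);
  const std::uint64_t Lo = static_cast<std::uint64_t>(X);
  return Hi != 0 ? __builtin_clzll(Hi) : 64 + __builtin_clzll(Lo);
}

// Returns N / D and stores N % D in *Rem. Requires D != 0.
u128 udivmod128(u128 N, u128 D, u128 *Rem) {
  if (D > N) {
    *Rem = N;
    return 0;
  }
  // Align the most significant bit of the divisor with that of the dividend.
  int Shift = clz128(D) - clz128(N);
  D <<= Shift;
  u128 Q = 0;
  for (; Shift >= 0; --Shift) {
    // Mask is all ones when the shifted divisor fits into N, else zero.
    const u128 Mask = -static_cast<u128>(N >= D);
    Q = (Q << 1) | (Mask & 1);
    N -= D & Mask;
    D >>= 1;
  }
  *Rem = N;
  return Q;
}
```

When both operands are at least 2**64, the two leading-zero counts differ by at most 63, so the loop runs at most 64 times.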
Diogo Sampaio [Fri, 10 Jul 2020 07:01:04 +0000 (08:01 +0100)]
[BDCE] SExt -> ZExt when no sign bits are used and instruction has multiple uses
Summary: This allows converting any SExt to a ZExt when we know none of the extended bits are used, especially in cases where there are multiple uses of the value.
Reviewers: dmgreen, eli.friedman, spatel, lebedev.ri, nikic
Reviewed By: lebedev.ri, nikic
Subscribers: hiraditya, dmgreen, craig.topper, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60413
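The reasoning behind this transform can be checked with a small sketch: sign extension and zero extension differ only in the newly added high bits, so a user that reads only the original low bits cannot tell them apart. The helper names are hypothetical, and the sketch assumes the usual two's-complement conversion from `uint8_t` to `int8_t`.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit value to 32 bits (the high bits copy bit 7).
std::uint32_t signExtend8(std::uint8_t V) {
  return static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int8_t>(V)));
}

// Zero-extend an 8-bit value to 32 bits (the high bits are zero).
std::uint32_t zeroExtend8(std::uint8_t V) { return V; }

// A user that only demands the low 8 bits sees the same value either way.
bool lowByteAgrees(std::uint8_t V) {
  return (signExtend8(V) & 0xFFu) == (zeroExtend8(V) & 0xFFu);
}
```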
David Sherwood [Fri, 3 Jul 2020 12:27:21 +0000 (13:27 +0100)]
[CodeGen] Replace calls to getVectorNumElements() in DAGTypeLegalizer::SetSplitVector
In DAGTypeLegalizer::SetSplitVector I have changed calls in the assert
from getVectorNumElements() to getVectorElementCount(), since this
code path works for both fixed and scalable vectors.
This fixes up one warning in the test:
sve-sext-zext.ll
Differential Revision: https://reviews.llvm.org/D83196
Thomas Lively [Fri, 10 Jul 2020 07:18:59 +0000 (00:18 -0700)]
[WebAssembly][NFC] Simplify vector shift lowering and add tests
This patch builds on 0d7286a652 by simplifying the code for detecting
splat values and adding new tests demonstrating the lowering of
splatted absolute value shift amounts, which are common in code
generated by Halide. The lowering is very bad right now, but
subsequent patches will improve it considerably. The tests will be
useful for evaluating the improvements in those patches.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D83493
George Mitenkov [Thu, 9 Jul 2020 21:53:15 +0000 (00:53 +0300)]
[MLIR][SPIRVToLLVM] Conversion of SPIR-V struct type without offset
This patch introduces type conversion for SPIR-V structs. Since
handling offset case requires thorough testing, it was left out
for now. Hence, only structs with no offset are currently
supported. Also, structs containing member decorations cannot
be translated.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D83403
David Sherwood [Mon, 6 Jul 2020 09:17:16 +0000 (10:17 +0100)]
[CodeGen] Replace calls to getVectorNumElements() in SelectionDAG::SplitVector
This patch replaces some invalid calls to getVectorNumElements() with calls
to getVectorMinNumElements() instead, since the code paths changed in this
patch work for both fixed and scalable vector types.
Fixes warnings in this test:
sve-sext-zext.ll
Differential Revision: https://reviews.llvm.org/D83203
Muhammad Omair Javaid [Tue, 7 Jul 2020 20:22:35 +0000 (01:22 +0500)]
[LLDB] Update AArch64 Dwarf and EH frame register numbers
This patch updates ARM64_ehframe_Registers.h and ARM64_DWARF_Registers.h
with latest register numbers in line with AArch64 SVE support.
For reference, take a look at the "DWARF for the ARM® 64-bit Architecture (AArch64)
with SVE support" manual from Arm.
Version used: abi_sve_aadwarf_100985_0000_00_en.pdf
Daniel Grumberg [Tue, 30 Jun 2020 13:25:23 +0000 (14:25 +0100)]
Add diagnostic option backing field for -fansi-escape-codes
Summary:
Keep track of -fansi-escape-codes in DiagnosticOptions and move the
option to the new option parsing system.
Depends on D82860
Reviewers: Bigcheese
Subscribers: dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82874
Zakk Chen [Thu, 30 Apr 2020 10:24:19 +0000 (03:24 -0700)]
[RISCV] Refactor FeatureRVCHints to make ProcessorModel more intuitive
Reviewers: luismarques, asb, evandro
Reviewed By: asb, evandro
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77030
Nathan Ridge [Wed, 8 Jul 2020 06:43:38 +0000 (02:43 -0400)]
[clangd] Factor out some helper functions related to heuristic resolution in TargetFinder
Summary:
Two helpers are introduced:
* Some of the logic previously in TargetFinder::Visit*() methods is
factored out into resolveDependentExprToDecls().
* Some of the logic in getMembersReferencedViaDependentName() is
factored out into resolveTypeToRecordDecl().
D82739 will build on this and use these functions in new ways.
Reviewers: hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83371
SharmaRithik [Fri, 10 Jul 2020 05:46:06 +0000 (11:16 +0530)]
[CodeMoverUtils] Move OrderedInstructions to CodeMoverUtils
Summary: This patch moves OrderedInstructions to CodeMoverUtils as it was
the only place where OrderedInstructions is required.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto, fhahn, nikic
Reviewed By: Whitney, nikic
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80643
Fangrui Song [Fri, 10 Jul 2020 05:39:56 +0000 (22:39 -0700)]
[llvm-symbolizer][test] Fix options-from-env.test
options-from-env.test (D71668) does not test what it intended to test:
`llvm-symbolizer 0x20112f` prints `0x20112f` in the absence of an environment
variable.
Guillaume Chatelet [Fri, 10 Jul 2020 04:27:39 +0000 (04:27 +0000)]
[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to enable storing alignment for AtomicCmpXchgInst.
See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168
Differential Revision: https://reviews.llvm.org/D83375
Richard Smith [Fri, 10 Jul 2020 04:08:39 +0000 (21:08 -0700)]
[demangler] More properly save and restore the template parameter state
when parsing an encoding.
Petr Hosek [Fri, 10 Jul 2020 04:07:44 +0000 (21:07 -0700)]
[CMake][Fuchsia] Move runtimes to outer scope
This is needed for runtimes to be properly configured, addressing an
issue introduced in 53e38c85.
Stella Laurenzo [Fri, 10 Jul 2020 00:35:05 +0000 (17:35 -0700)]
Add Python bindings guide.
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D83527
Richard Smith [Fri, 10 Jul 2020 03:36:04 +0000 (20:36 -0700)]
[demangler] Don't allow the template parameters from the <encoding> in a
<local-name> to leak out into later parts of the name.
This caused us to fail to demangle certain constructs involving generic
lambdas.
Oliver Hunt [Fri, 10 Jul 2020 03:27:03 +0000 (20:27 -0700)]
CrashTracer: clang at clang: llvm::BitstreamWriter::ExitBlock
Add a guard for re-entering an SDiagsWriter's HandleDiagnostics
method after we've started finalizing. This is a generic catch-all
for unexpected fatal errors so we don't recursively crash inside
the generic llvm error handler.
We also add logic to handle the actual error case in
llvm::~raw_fd_ostream caused by failing to clear errors before
it is destroyed.
<rdar://problem/63335596>
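The re-entrancy guard described above can be sketched roughly as follows. This is a toy model only: `DiagWriterSketch`, `handleDiagnostic` and `finish` are hypothetical names and do not reflect the actual SDiagsWriter code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a diagnostics writer; not Clang's SDiagsWriter.
class DiagWriterSketch {
  bool Finalizing = false;          // set once finalization has begun
  std::vector<std::string> Records; // diagnostics accepted so far

public:
  // Returns true if the diagnostic was recorded, false if it was dropped
  // because the writer has already started finalizing.
  bool handleDiagnostic(const std::string &Message) {
    if (Finalizing)
      return false; // guard: ignore diagnostics that arrive too late
    Records.push_back(Message);
    return true;
  }

  // Marks the writer as finalizing and returns the number of records kept.
  std::size_t finish() {
    Finalizing = true;
    return Records.size();
  }
};
```

A diagnostic raised after `finish()` is dropped instead of being written into a stream that is already being closed.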
Chen Zheng [Fri, 10 Jul 2020 00:01:32 +0000 (20:01 -0400)]
[SCEV][IndVarSimplify] insert point should not be block front.
The block front may be a PHI node; inserting cast instructions like
BitCast, PtrToInt, IntToPtr among PHIs is not right.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D80975
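As a rough illustration of why the insert point matters, the toy model below treats a block as a list of opcodes and places a cast after the leading PHIs rather than at the front. `OpSketch`, `firstNonPhiIndex` and `insertCastAfterPhis` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcode set; a real block holds Instruction objects.
enum class OpSketch { Phi, Cast, Add, Br };

// Returns the index of the first instruction that is not a PHI
// (or the block size if the block holds only PHIs).
std::size_t firstNonPhiIndex(const std::vector<OpSketch> &Block) {
  std::size_t I = 0;
  while (I < Block.size() && Block[I] == OpSketch::Phi)
    ++I;
  return I;
}

// Inserts a cast after the leading PHIs so the PHIs stay grouped together.
void insertCastAfterPhis(std::vector<OpSketch> &Block) {
  auto Pos = Block.begin() +
             static_cast<std::ptrdiff_t>(firstNonPhiIndex(Block));
  Block.insert(Pos, OpSketch::Cast);
}
```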
Jordan Rupprecht [Fri, 10 Jul 2020 01:38:49 +0000 (18:38 -0700)]
[lldb] Declare extern template instantiation to fix linking issues.
NativeProcessELF::GetELFImageInfoAddress<...>() is declared in NativeProcessELF.h, but only defined in NativeProcessELF.cpp. In some optimized builds (e.g. ThinLTO), this instantiation may be removed when it is used in a different TU (NativeProcessELFTest.cpp).
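The general pattern can be sketched in a single file as below. The function name `imageInfoAddressSketch` and its body are hypothetical and only stand in for a template that is declared in a header and defined in a .cpp file; the actual change may differ.

```cpp
#include <cassert>

// --- would live in the header ---
template <typename T> T imageInfoAddressSketch(T Base);
// Explicit instantiation declaration: tells other translation units that an
// instantiation for unsigned long is provided elsewhere.
extern template unsigned long
imageInfoAddressSketch<unsigned long>(unsigned long);

// --- would live in the .cpp file ---
template <typename T> T imageInfoAddressSketch(T Base) { return Base + 16; }
// Explicit instantiation definition: emits the instantiation in this TU so
// that users in other TUs can link against it.
template unsigned long imageInfoAddressSketch<unsigned long>(unsigned long);
```

With the declaration and definition paired this way, a test file that includes only the header no longer depends on an implicit instantiation surviving optimization.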
Vitaly Buka [Fri, 10 Jul 2020 01:00:44 +0000 (18:00 -0700)]
[StackSafety,NFC] Reduce FunctionSummary size
Most compiler invocations will not need ParamAccess,
so we can optimize memory usage there with smaller unique_ptr
instead of empty vector.
Suggested in D80908 review.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D83458
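A minimal sketch of the idea, assuming a rarely used list that can be held behind a pointer. The type and member names here are hypothetical and are not the FunctionSummary layout.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical element type standing in for a per-parameter access record.
struct ParamAccessSketch {
  std::uint64_t ParamNo;
};

// Baseline: every summary pays for a full vector object, even when empty.
struct SummaryWithVector {
  std::vector<ParamAccessSketch> ParamAccesses;
};

// Alternative: pay for one pointer, allocate the vector only on first use.
struct SummaryWithPointer {
  std::unique_ptr<std::vector<ParamAccessSketch>> ParamAccesses;

  // Returns the stored accesses, or a shared empty vector if none were added.
  const std::vector<ParamAccessSketch> &paramAccesses() const {
    static const std::vector<ParamAccessSketch> Empty;
    return ParamAccesses ? *ParamAccesses : Empty;
  }

  // Allocates the vector lazily, then appends the record.
  void addParamAccess(ParamAccessSketch PA) {
    if (!ParamAccesses)
      ParamAccesses = std::make_unique<std::vector<ParamAccessSketch>>();
    ParamAccesses->push_back(PA);
  }
};
```

On common standard library implementations the pointer-based struct is the size of one pointer, while the vector-based one is typically three.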
Julian Lettner [Tue, 30 Jun 2020 20:19:25 +0000 (13:19 -0700)]
[Sanitizer] Update macOS version checking
Support macOS 11 in our runtime version checking code and update
`GetMacosAlignedVersionInternal()` accordingly. This follows the
implementation of `Triple::getMacOSXVersion()` in the Clang driver.
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D82918
Richard Smith [Thu, 9 Jul 2020 21:11:21 +0000 (14:11 -0700)]
PR46648: Do not eagerly instantiate default arguments for a generic
lambda when instantiating a call operator specialization.
We previously incorrectly thought that such substitution was happening
in the context of substitution into a local scope, which is a context
where we should perform eager default argument instantiation.
Richard Smith [Thu, 9 Jul 2020 21:57:30 +0000 (14:57 -0700)]
Push parameters into the local instantiation scope before instantiating
a default argument.
Default arguments can (after recent language changes) refer to
parameters of the same function. Make sure they're added to the local
instantiation scope before transforming a default argument so that we
can remap such references to them properly.
Richard Smith [Thu, 9 Jul 2020 21:31:19 +0000 (14:31 -0700)]
Move default argument instantiation to SemaTemplateInstantiateDecl.cpp.
No functionality change intended.
ergawy [Thu, 9 Jul 2020 23:08:51 +0000 (19:08 -0400)]
[MLIR][SPIRV] Support two memory access attributes in OpCopyMemory.
This commit augments spv.CopyMemory's implementation to support 2 memory
access operands, hence more closely following the spec. The following
changes are introduced:
- Customize logic for spv.CopyMemory serialization and deserialization.
- Add 2 additional attributes for source memory access operand.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D83241
Amara Emerson [Fri, 10 Jul 2020 00:04:09 +0000 (17:04 -0700)]
[AArch64][GlobalISel] Add more specific debug info tests for
613f12dd8e2403f5630ab299d2a1bb2cb111ead1.
As requested, these tests check for specific debug locs on the output of the
legalizer. The only one that I couldn't write was for moreElementsVector, which
AFAICT we don't trigger on AArch64.
Arthur Eubanks [Thu, 9 Jul 2020 23:49:48 +0000 (16:49 -0700)]
[NFC] Derive from PassInfoMixin for no-op/printing passes
PassInfoMixin should be used for all NPM passes, rather than a custom
`name()`.
This caused ambiguous references in LegacyPassManager.cpp, so had to
remove "using namespace llvm::legacy" and move some things around.
The passes had to be moved to the llvm namespace, or else they would get
printed as "(anonymous namespace)::FooPass".
Reviewed By: ychen, asbirlea
Differential Revision: https://reviews.llvm.org/D83498
Wei Mi [Wed, 8 Jul 2020 18:19:59 +0000 (11:19 -0700)]
[NFC] Change getEntryForPercentile to be a static function in ProfileSummaryBuilder.
Change file static function getEntryForPercentile to be a static member function
in ProfileSummaryBuilder so it can be used by other files.
Differential Revision: https://reviews.llvm.org/D83439
Wei Mi [Thu, 9 Jul 2020 23:12:43 +0000 (16:12 -0700)]
[NFC] Extract the code to write instr profile into function writeInstrProfile
So that the function writeInstrProfile can be used in other places.
Differential Revision: https://reviews.llvm.org/D83521
Stella Laurenzo [Tue, 7 Jul 2020 06:05:46 +0000 (23:05 -0700)]
Initial boiler-plate for python bindings.
Summary:
* Native '_mlir' extension module.
* Python mlir/__init__.py trampoline module.
* Lit test that checks a message.
* Uses some cmake configurations that have worked for me in the past but likely needs further elaboration.
Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D83279
Eli Friedman [Thu, 9 Jul 2020 00:05:56 +0000 (17:05 -0700)]
[AArch64][SVE] Add lowering for llvm.fma.
This is currently bare-bones; we aren't taking advantage of any of the
FMA variant instructions. But it's enough to at least generate
code.
Differential Revision: https://reviews.llvm.org/D83444
Eric Schweitz [Thu, 9 Jul 2020 23:08:45 +0000 (16:08 -0700)]
[flang] ifdef to avoid warning about supposedly dead function
peter klausler [Thu, 9 Jul 2020 18:08:41 +0000 (11:08 -0700)]
[flang] Fix frontend build with -DBUILD_SHARED_LIBS=On
Fix frontend shared library builds by eliminating dependences of
the parser on other component libraries, moving some code around that
wasn't in the right library, and making some dependences
explicit in the CMakeLists.txt files. The lowering library
does not yet build as a shared library due to some undefined
names.
Reviewed By: tskeith
Differential Revision: https://reviews.llvm.org/D83515
Zequan Wu [Thu, 9 Jul 2020 22:49:56 +0000 (15:49 -0700)]
Revert "[Lexer] Fix missing coverage line after #endif"
This reverts commit 672ae621e91ff5cdefb2535bdd530641536685ea.
Pete Steinfeld [Thu, 9 Jul 2020 16:38:22 +0000 (09:38 -0700)]
[flang] Fix a crash when creating generics from a copy
Summary:
When a program unit creates a generic based on one defined in a module, the
function `CopyFrom()` is called to create the `GenericDetails`. This function
copied the `specificProcs_` but failed to copy the `bindingNames_`. If the
function `CheckGeneric()` then gets called, it tries to index into the empty
binding names and causes the crash.
I fixed this by adding code to `CopyFrom()` to copy the binding names.
I also added a test that causes the crash.
Reviewers: klausler, tskeith, DavidTruby
Subscribers: llvm-commits
Tags: #llvm, #flang
Differential Revision: https://reviews.llvm.org/D83491
Amy Huang [Wed, 29 Apr 2020 23:21:05 +0000 (16:21 -0700)]
Switch to using -debug-info-kind=constructor as default (from =limited)
Summary:
-debug-info-kind=constructor reduces the amount of class debug info that
is emitted; this patch switches to using this as the default.
Constructor homing emits the complete type info for a class only when the
constructor is emitted, so it is expected that there will be some classes that
are not defined in the debug info anymore because they are never constructed,
and we shouldn't need debug info for these classes.
I compared the PDB files for clang, and there are 273 class types that are defined with `=limited`
but not with `=constructor` (out of ~60,000 total class types).
We've looked at a number of the types that are no longer defined with =constructor. The vast
majority of cases are something like class A is used as a parameter in a member function of
some other class B, which is emitted. But the function that uses class A is never called, and class A
is never constructed, and therefore isn't emitted in the debug info.
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Subscribers: aprantl, cfe-commits, lldb-commits
Tags: #clang, #lldb
Differential Revision: https://reviews.llvm.org/D79147
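The source pattern described above (class A used only as a parameter of an uncalled member function of B, and never constructed) looks roughly like the sketch below. The class bodies are hypothetical; which debug info is actually emitted depends on the compiler version and flags, so this only illustrates the shape of the code.

```cpp
#include <cassert>

// A has a user-provided constructor, but nothing in this program calls it.
class A {
public:
  explicit A(int V) : Value(V) {}
  int Value;
};

// B is constructed and used, so its constructor is emitted.
class B {
public:
  B() : Stored(7) {}
  // Uses A as a parameter, but this member function is never called.
  int readFrom(const A &Arg) const { return Arg.Value; }
  int stored() const { return Stored; }

private:
  int Stored;
};
```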
Zequan Wu [Thu, 9 Jul 2020 21:56:06 +0000 (14:56 -0700)]
[Lexer] Fix missing coverage line after #endif
Summary: bug reported here: https://bugs.llvm.org/show_bug.cgi?id=46660
Reviewers: vsk, efriedma, arphaman
Reviewed By: vsk
Subscribers: dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83514
Albion Fung [Thu, 9 Jul 2020 22:03:41 +0000 (17:03 -0500)]
[PowerPC][Power10] Add Instruction definition/MC Tests for Load/Store Rightmost VSX Vector
This patch adds the instruction definitions and the assembly/disassembly
tests for the Load/Store VSX Vector Rightmost instructions.
Differential Revision: https://reviews.llvm.org/D83364
Joel E. Denny [Thu, 9 Jul 2020 21:31:31 +0000 (17:31 -0400)]
[FileCheck] Improve -dump-input documentation
Document the default of `fail` in `-help`. Extend `-dump-input=help`
to help users find related command-line options, but let `-help`
provide their full documentation.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D83091
Craig Topper [Thu, 9 Jul 2020 21:52:16 +0000 (14:52 -0700)]
Recommit "[X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def."
This time without the change to make operator| use operator&=.
That seems to be the source of the gcc 5.3 miscompile.
Original commit message:
These represent the same thing but 64BIT only showed up from
getHostCPUFeatures providing a list of features to clang. While
EM64T showed up from getting the features for a named CPU.
EM64T didn't have a string specifically so it would not be passed
up to clang when getting features for a named CPU. While 64bit
needed a name since that's how it is indexed.
Merge them by filtering 64bit out before sending features to clang
for named CPUs.
Stanislav Mekhanoshin [Thu, 18 Jun 2020 23:32:17 +0000 (16:32 -0700)]
[AMDGPU] Return restricted number of regs from TTI
This is practically NFC at the moment because nothing really
asks the real number or does anything useful with it.
Differential Revision: https://reviews.llvm.org/D82202
Sanjay Patel [Thu, 9 Jul 2020 15:53:51 +0000 (11:53 -0400)]
[DAGCombiner] convert if-chain in store merging to switch; NFC