review.tizen.org Git - platform/upstream/llvm.git/log

[InstCombine] add helper function to reduce code duplication; NFC

llvm-svn: 347604

[stack-safety] Local analysis implementation

Summary:
Analysis produces StackSafetyInfo which contains information about how allocas
and parameters were used in functions.

From prototype by Evgenii Stepanov and Vlad Tsyrklevich.

Reviewers: eugenis, vlad.tsyrklevich, pcc, glider

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54504

llvm-svn: 347603

[stack-safety] Empty local passes for Stack Safety Local Analysis

Reviewers: eugenis, vlad.tsyrklevich

Subscribers: mgorny, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54502

llvm-svn: 347602

[cfi] Help sanstats to find binary if they are not at the original location

Summary:
By default, sanstats searches for binaries at the location where they were when
the stats were collected. Sometimes you cannot print the report immediately, or
you need to move post-processing to another workstation. To support this use
case, when the original binary is missing sanstats will fall back to the
directory containing the sanstats file.
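
The fallback described above can be sketched as follows. This is a minimal illustration and not the sanstats source: the name `resolve_binary`, the injectable `exists` parameter, and the assumption that the moved binary keeps its original file name are all hypothetical.

```python
import os


def resolve_binary(recorded_path, stats_file, exists=os.path.exists):
    """Return a usable path for the binary named in a stats record, or None."""
    # Prefer the location recorded when the stats were collected.
    if exists(recorded_path):
        return recorded_path
    # Assumed fallback: the same file name, next to the sanstats file.
    candidate = os.path.join(os.path.dirname(stats_file),
                             os.path.basename(recorded_path))
    return candidate if exists(candidate) else None
```

Passing `exists` explicitly only serves to make the lookup order easy to exercise without touching the file system.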

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53857

llvm-svn: 347601

[cfi] Make sanstats print address of the check

Summary: Help with off-line symbolization or other types of debugging.

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53606

llvm-svn: 347600

[AArch64] Refactor the scheduling predicates (3/3) (NFC)

Refactor the scheduling predicates based on `MCInstPredicate`. In this
case, `AArch64InstrInfo::hasExtendedReg()`.

Differential revision: https://reviews.llvm.org/D54822

llvm-svn: 347599

[AArch64] Refactor the scheduling predicates (2/3) (NFC)

Refactor the scheduling predicates based on `MCInstPredicate`. In this
case, `AArch64InstrInfo::hasShiftedReg()`.

Differential revision: https://reviews.llvm.org/D54820

llvm-svn: 347598

[AArch64] Refactor the scheduling predicates (1/3) (NFC)

Refactor the scheduling predicates based on `MCInstPredicate`. In this
case, `AArch64InstrInfo::isScaledAddr()`.

Differential revision: https://reviews.llvm.org/D54777

llvm-svn: 347597

Support for inserting profile-directed cache prefetches

Summary:
Support for profile-driven cache prefetching (X86)

This change is part of a larger system, consisting of a cache prefetches recommender, create_llvm_prof (https://github.com/google/autofdo), and LLVM.

A proof of concept recommender is DynamoRIO's cache miss analyzer. It processes memory access traces obtained from a running binary and identifies patterns in cache misses. Based on them, it produces a csv file with recommendations. The expectation is that, by leveraging such recommendations, we can reduce the amount of clock cycles spent waiting for data from memory. A microbenchmark based on the DynamoRIO analyzer is available as a proof of concept: https://goo.gl/6TM2Xp.

The recommender makes prefetch recommendations in terms of:

* the binary offset of an instruction with a memory operand;
* a delta;
* and a type (nta, t0, t1, t2)

meaning: a prefetch of that type should be inserted right before the instruction at that binary offset, and the prefetch should be for an address delta away from the memory address the instruction will access.

For example:

0x400ab2,64,nta

and assuming the instruction at 0x400ab2 is:

movzbl (%rbx,%rdx,1),%edx

means that the recommender determined it would be beneficial for a prefetchnta instruction to be inserted right before this instruction, as such:

prefetchnta 0x40(%rbx,%rdx,1)
movzbl (%rbx, %rdx, 1), %edx
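
As a rough sketch of how one recommendation line maps to the inserted instruction, the following parses the `offset,delta,type` triple and renders the prefetch text for a given memory operand. The names `PrefetchHint`, `parse_hint`, and `render_prefetch` are hypothetical and are not part of the recommender, create_llvm_prof, or LLVM; the operand is assumed to carry no displacement of its own.

```python
from typing import NamedTuple

VALID_TYPES = ("nta", "t0", "t1", "t2")


class PrefetchHint(NamedTuple):
    offset: int  # binary offset of the instruction with a memory operand
    delta: int   # distance from the address that instruction accesses
    kind: str    # prefetch type: nta, t0, t1 or t2


def parse_hint(line):
    """Parse one 'offset,delta,type' line, e.g. '0x400ab2,64,nta'."""
    offset, delta, kind = (field.strip() for field in line.split(","))
    if kind not in VALID_TYPES:
        raise ValueError(f"unknown prefetch type: {kind}")
    return PrefetchHint(int(offset, 16), int(delta), kind)


def render_prefetch(hint, mem_operand):
    """Render the prefetch for an AT&T-syntax operand with no displacement."""
    return f"prefetch{hint.kind} {hex(hint.delta)}{mem_operand}"
```

For the example above, `render_prefetch(parse_hint("0x400ab2,64,nta"), "(%rbx,%rdx,1)")` yields `prefetchnta 0x40(%rbx,%rdx,1)`.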

The workflow for prefetch cache instrumentation is as follows (the proof of concept script details these steps as well):

1. build binary, making sure -gmlt -fdebug-info-for-profiling is passed. The latter option will enable the X86DiscriminateMemOps pass, which ensures instructions with memory operands are uniquely identifiable (this causes ~2% size increase in total binary size due to the additional debug information).

2. collect memory traces, run analysis to obtain recommendations (see above-referenced DynamoRIO demo as a proof of concept).

3. use create_llvm_prof to convert recommendations to reference insertion locations in terms of debug info locations.

4. rebuild binary, using the exact same set of arguments used initially, to which -mllvm -prefetch-hints-file=<file> needs to be added, using the afdo file obtained at step 3.

Note that if sample profiling feedback-driven optimization is also desired, that happens before step 1 above. In this case, the sample profile afdo file that was used to produce the binary at step 1 must also be included in step 4.

The data needed by the compiler in order to identify prefetch insertion points is very similar to what is needed for sample profiles. For this reason, and given that the overall approach (memory tracing-based cache recommendation mechanisms) is under active development, we use the afdo format as a syntax for capturing this information. We avoid confusing semantics with sample profile afdo data by feeding the two types of information to the compiler through separate files and compiler flags. Should the approach prove successful, we can investigate improvements to this encoding mechanism.

Reviewers: davidxl, wmi, craig.topper

Reviewed By: davidxl, wmi, craig.topper

Subscribers: davide, danielcdh, mgorny, aprantl, eraman, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D54052

llvm-svn: 347596

AMDGPU: Record SGPR spills when restoring too

It's possible in some cases to have a restore present
without a corresponding spill. Due to an apparent bug
in D54366 <https://reviews.llvm.org/D54366>, only the
restore for a register was emitted. It's probably
always a bug for this to happen, but due to how SGPR
spilling is implemented, this makes the issue appear
worse than it is.

llvm-svn: 347595

ELF: ICF: Include contents of referenced sections in initial partitioning hash. NFCI.

On my machine this reduced median link time of lld-speed-test/chrome
from 2.68s to 2.41s. It also reduces link time of Chrome for Android
with a prototype compiler change that causes the compiler to create
large numbers of identical (modulo relocations) sections from >15
minutes to a few seconds.
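
A toy sketch of why this helps, assuming a simplified model in which each section is a byte string plus a list of referenced section names. It is not lld's implementation; `initial_partitions` and the data layout are hypothetical. Sections that are identical modulo relocations fall into one large initial bucket when only their own bytes are hashed, and are separated immediately when the referenced contents are hashed too.

```python
import hashlib
from collections import defaultdict


def initial_partitions(sections, include_referenced):
    """Group section names by an initial hash.

    sections maps name -> (contents, [names of referenced sections]).
    """
    buckets = defaultdict(list)
    for name, (data, refs) in sections.items():
        h = hashlib.sha256(data)
        if include_referenced:
            # Mix in the contents of every section this one points at.
            for ref in refs:
                h.update(sections[ref][0])
        buckets[h.digest()].append(name)
    return sorted(sorted(names) for names in buckets.values())
```

Fewer and smaller initial buckets mean less work for the later refinement rounds, which is consistent with the link-time numbers quoted above.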

Differential Revision: https://reviews.llvm.org/D54773

llvm-svn: 347594

[LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use SplitVecOp_TruncateHelper for FP_TO_SINT/UINT.

SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86 which doesn't support FP_TO_UINT until AVX512.

This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral.

Differential Revision: https://reviews.llvm.org/D54906

llvm-svn: 347593

[ThinLTO] Consolidate cache key computation between new/old LTO APIs

Summary:
The old legacy LTO API had a separate cache key computation, which was
a subset of the cache key computation in the new LTO API (from what I
can tell this is largely just because certain features such as CFI,
dsoLocal, etc are only utilized via the new LTO API). However, having
separate computations is unnecessary (much of the code is duplicated),
and can lead to bugs when adding new optimizations if both cache
computation algorithms aren't updated properly - it's much easier to
maintain if we have a single facility.

This patch refactors the old LTO API code to use the cache key
computation from the new LTO API. To do this, we set up an lto::Config
object and fill in the fields that the old LTO was hashing (the others
will just use the defaults).

There are two notable changes:
- I added a Freestanding flag to the LTO Config. Currently this is only
used by the legacy LTO API. In the patch that added it (D30791) I had
asked about adding it to the new LTO API, but it looks like that was not
addressed. This should probably be discussed as a follow up to this
change, as it is orthogonal.
- The legacy LTO API had some code that was hashing the GUID of all
preserved symbols defined in the module. I looked back at the history of
this (which was added with the original hashing in the legacy LTO API in
D18494), and there is a comment in the review thread that it was added
in preparation for future internalization. We now do the internalization
of course, and that is handled in the new LTO API cache key computation
by hashing the recorded linkage type of all defined globals. Therefore I
didn't try to move over and keep the preserved symbols handling.
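
The consolidation can be pictured with a small sketch: one key function hashes every field of a config object, and the legacy entry point only fills in the fields it knows about, leaving the rest at their defaults. The `Config` fields and function names below are illustrative assumptions and do not mirror `lto::Config` or the real hashing code.

```python
import hashlib
from dataclasses import dataclass, fields


@dataclass
class Config:
    # Hypothetical subset of options; defaults stand in for "not set".
    opt_level: int = 2
    cpu: str = ""
    freestanding: bool = False


def compute_cache_key(conf, module_hash):
    """Single key computation shared by both APIs (sketch)."""
    h = hashlib.sha1()
    for f in fields(conf):
        h.update(f"{f.name}={getattr(conf, f.name)!r};".encode())
    h.update(module_hash)
    return h.hexdigest()


def legacy_cache_key(opt_level, freestanding, module_hash):
    """Legacy path: build a Config, then defer to the shared computation."""
    conf = Config(opt_level=opt_level, freestanding=freestanding)
    return compute_cache_key(conf, module_hash)
```

With a single facility, adding a new field to the config changes the key for both paths at once.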

Reviewers: steven_wu, pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D54635

llvm-svn: 347592

[SelectionDAG] Teach BaseIndexOffset::match to unwrap the base after looking through an add/or

We might find a target specific node that needs to be unwrapped after we look through an add/or. Otherwise we get inconsistent results if one pointer is just X86WrapperRIP and the other is (add X86WrapperRIP, C)
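
A minimal sketch of the idea, using nested tuples in place of SelectionDAG nodes. The node encoding and the `unwrap` and `match` helpers are hypothetical, and treating `or` like `add` is only valid when the operands share no set bits, which this toy model does not check.

```python
def unwrap(node):
    """Strip target-specific wrappers: ('wrapper', sym) -> sym."""
    while isinstance(node, tuple) and node[0] == "wrapper":
        node = node[1]
    return node


def match(ptr):
    """Return (base, constant_offset) for a pointer expression."""
    base, offset = unwrap(ptr), 0
    while (isinstance(base, tuple) and base[0] in ("add", "or")
           and isinstance(base[2], int)):
        offset += base[2]
        # Unwrap again after looking through the add/or, so that
        # (add (wrapper g), C) and (wrapper g) agree on the base.
        base = unwrap(base[1])
    return base, offset
```

Without the second `unwrap`, the two pointer forms would report different bases even though they refer to the same symbol.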

Differential Revision: https://reviews.llvm.org/D54818

llvm-svn: 347591

[X86] Add test case for D54818

llvm-svn: 347590

Add basic_string::__resize_default_init (from P1072)

This patch adds an implementation of __resize_default_init as
described in P1072R2. Additionally, it uses it in filesystem to
demonstrate its intended utility.

Once P1072 lands, or if it changes its interface, I will adjust
the internal libc++ implementation to match.

llvm-svn: 347589

Revert "[clang][slh] add attribute for speculative load hardening"

This reverts commit 801eaf91221ba6dd6996b29ff82659ad6359e885.

llvm-svn: 347588

[COFF] ICF: use parallelForEach{,N}

Summary: They have an additional `ThreadsEnabled` check, which does not matter much.

Reviewers: pcc, ruiu, rnk

Reviewed By: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54812

llvm-svn: 347587

[clang][slh] add attribute for speculative load hardening

Summary:
LLVM IR already has an attribute for speculative_load_hardening. Before
this commit, when a user passed the -mspeculative-load-hardening flag to
Clang, every function would have this attribute added to it. This Clang
attribute will allow users to opt into SLH on a function by function basis.

This can be applied to functions and Objective C methods.

Reviewers: chandlerc, echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54555

llvm-svn: 347586

[libcxx] Fix XFAILs for aligned allocation tests

In r339743, I marked several aligned allocation tests as downright
unsupported on macosx in an attempt to unbreak the build. It turns
out that marking them as unsupported whenever we're on OS X is way
too coarse grained. This commit marks the tests as XFAIL with more
granularity.

llvm-svn: 347585

[CodeGen] Support custom format of stack maps

Summary:
Add a hook to the GCMetadataPrinter for emitting stack maps in
custom format. The hook will be called at stack map generation
time. The default stack map format is used if there is no hook.

For this to be useful a few data structures and accessors are
exposed from the StackMaps class, so the custom printer can
access the stack map data.

This patch authored by Cherry Zhang <cherryyz@google.com>.

Reviewers: thanm, apilipenko, reames

Reviewed By: reames

Subscribers: reames, apilipenko, nemanjai, javed.absar, kbarton, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D53892

llvm-svn: 347584

[OPENMP][NVPTX]Emit default locations with the correct Exec|Runtime
modes.

If the region is inside target|teams|distribute region, we can emit the
locations with the correct info for execution mode and runtime mode.
Patch adds this ability to the NVPTX codegen to help the optimizer to
produce better code.

llvm-svn: 347583

[clang][slh] Forward mSLH only to Clang CC1

Summary:
-mno-speculative-load-hardening isn't a cc1 option, therefore,
before this change:

clang -mno-speculative-load-hardening hello.cpp

would have the following error:

error: unknown argument: '-mno-speculative-load-hardening'

This change will only ever forward -mspeculative-load-hardening
which is a CC1 option based on which flag was passed to clang.

Also added a test that uses this option that fails if an error like the
above is ever thrown.
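
The forwarding rule can be sketched like this. It is a toy model rather than the Clang driver code: `cc1_args` is a hypothetical name, and "the last occurrence of the flag pair wins" is an assumption about how the pair is resolved.

```python
POSITIVE = "-mspeculative-load-hardening"
NEGATIVE = "-mno-speculative-load-hardening"


def cc1_args(driver_args):
    """Return the SLH-related arguments to forward to cc1."""
    enabled = False
    for arg in driver_args:
        if arg == POSITIVE:
            enabled = True
        elif arg == NEGATIVE:
            enabled = False
    # Only the positive form is a cc1 option; the negative form is
    # consumed by the driver and never forwarded.
    return [POSITIVE] if enabled else []
```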

Thank you ericwf for help debugging and fixing this error.

Reviewers: chandlerc, EricWF

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54763

llvm-svn: 347582

Delete dead code introduced in r347354.

ParentTy is never used other than an assignment, and since it is a
pointer, there is no side effect. Some versions of GCC notice and warn
on this.

Change-Id: I37dc1a18c7b58040419afb803621de13d8904a8f
llvm-svn: 347581

[libcxx] Fix XFAIL for aligned deallocation test with trunk Clang

The test was marked as failing whenever the deployment target was 10.12
or older, but in reality the test passes when the deployment target is
10.12 on recent Clangs. This happens because only older clangs do not
honor the -faligned-allocation flag, which disables any availability
error related to aligned allocation support, regardless of the
deployment target.

llvm-svn: 347580

[lit] Fully qualify lit_config to avoid runtime crashes.

llvm-svn: 347579

[Cmake] Add missing dependency to `count`.

llvm-svn: 347578

[NFC] Replace magic numbers with CodeGenOpt enums

Use enum values from llvm/Support/CodeGen.h for the optimisation
levels in CompilerInvocation.

llvm-svn: 347577

AMDGPU: Cleanup / relax tests for future changes

llvm-svn: 347576

[ASTImporter] Set MustBuildLookupTable on PrimaryContext

Summary: SetMustBuildLookupTable() must always be called on a primary context.

Reviewers: labath, shafik, a.sidorin

Subscribers: rnkovacs, dkrupp, Szelethus, gamesh411

Differential Revision: https://reviews.llvm.org/D54863

llvm-svn: 347575

[clangd] Do not drop diagnostics from macros

if they still end up being in the main file.

llvm-svn: 347574

AMDGPU: Don't optimize exec masks at -O0

llvm-svn: 347573

AMDGPU: Only add implicit super-reg def for first subreg

llvm-svn: 347572

[AArch64] Add aarch64_vector_pcs function attribute to Clang

This is the Clang patch to complement the following LLVM patches:
  https://reviews.llvm.org/D51477
  https://reviews.llvm.org/D51479

More information describing the vector ABI and procedure call standard
can be found here:

https://developer.arm.com/products/software-development-tools/\
                          hpc/arm-compiler-for-hpc/vector-function-abi

Patch by Kerry McLaughlin.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D54425

llvm-svn: 347571

[clang-tidy] Improving narrowing conversions

Summary:
Newly flagged narrowing conversions:
- integer to narrower signed integer (this is compiler implementation defined),
- integer - floating point narrowing conversions,
- floating point - integer narrowing conversions,
- constants with narrowing conversions (even in ternary operator).

Reviewers: hokein, alexfh, aaron.ballman, JonasToth

Reviewed By: aaron.ballman, JonasToth

Subscribers: lebedev.ri, courbet, nemanjai, xazax.hun, kbarton, cfe-commits

Tags: #clang-tools-extra

Differential Revision: https://reviews.llvm.org/D53488

llvm-svn: 347570

[CodeGen] Take SPAdj into account for STATEPOINT liveness args

Summary:
STATEPOINT records its args' locations on stack relative to SP.
If the SP is changed, take that into account.

This patch authored by Cherry Zhang <cherryyz@google.com>.

Reviewers: thanm, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D53603

llvm-svn: 347569

[libcxx] Use a type that is always an aggregate in variant's tests

Summary:
In PR39232, we noticed that some variant tests started failing in C++2a mode
with recent Clangs, because the rules for literal types changed in C++2a. As
a result, a temporary fix was checked in (enabling the test only in C++17).

This commit is what I believe should be the long term fix: I removed the
tests that checked constexpr default-constructibility with a weird type
from the tests for index() and valueless_by_exception(), and instead I
added tests for those using an obviously literal type in the test for the
default constructor.

Reviewers: EricWF, mclow.lists

Subscribers: christof, jkorous, dexonsmith, arphaman, libcxx-commits, rsmith

Differential Revision: https://reviews.llvm.org/D54767

llvm-svn: 347568

[clangd] Enable auto-index behind a flag.

Summary:
Ownership and configuration:
The auto-index (background index) is maintained by ClangdServer, like Dynamic.
(This means ClangdServer will be able to enqueue preamble indexing in future).
For now it's enabled by a simple boolean flag in ClangdServer::Options, but
we probably want to eventually allow injecting the storage strategy.

New 'sync' command:
In order to meaningfully test the integration (not just unit-test components)
we need a way for tests to ensure the asynchronous index reads/writes occur
before a certain point.
Because these tests and assertions are few, I think exposing an explicit "sync"
command for use in tests is simpler than allowing threading to be completely
disabled in the background index (as we do for TUScheduler).

Bugs:
I fixed a couple of trivial bugs I found while testing, but there's one I can't.
JSONCompilationDatabase::getAllFiles() may return relative paths, and currently
we trigger an assertion that assumes they are absolute.
There's no efficient way to resolve them (you have to retrieve the corresponding
command and then resolve against its directory property). In general I think
this behavior is broken and we should fix it in JSONCompilationDatabase and
require CompilationDatabase::getAllFiles() to be absolute.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54894

llvm-svn: 347567

[clangd] Fix compilation of IndexBenchmark

llvm-svn: 347566

Remove an unnecessary file; NFC.

This source file has not been needed since r346522 and was triggering diagnostics in MSVC about an object file which exports no public symbols (LNK4221).

llvm-svn: 347565

[ASTImporter][Structural Eq] Check for isBeingDefined

Summary:
If one definition is currently being defined, we do not compare for
equality and we assume that the decls are equal.

Reviewers: a_sidorin, a.sidorin, shafik

Reviewed By: a_sidorin

Subscribers: gamesh411, shafik, rnkovacs, dkrupp, Szelethus, cfe-commits

Differential Revision: https://reviews.llvm.org/D53697

llvm-svn: 347564

[clangd] Fix use-after-free with expected types in indexing

llvm-svn: 347563

[clangd] Add type boosting in code completion

Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52276

llvm-svn: 347562

[DemandedBits] Add support for funnel shifts

Add support for funnel shifts to the DemandedBits analysis. The
demanded bits of the first two operands can be determined if the
shift amount is constant. The demanded bits of the third operand
(shift amount) can be determined if the bitwidth is a power of two.

This is basically the same functionality as implemented in D54869
and D54478, but for DemandedBits rather than InstCombine.
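
The rule can be sketched on plain integers. The helpers below are illustrative and are not the DemandedBits code: `a_out` is the mask of demanded result bits, `bw` is the bit width, and all function names are assumptions.

```python
def fshl(a, b, s, bw):
    """High bw bits of the 2*bw-bit value a:b shifted left by s % bw."""
    mask = (1 << bw) - 1
    s %= bw
    return ((((a & mask) << bw) | (b & mask)) << s >> bw) & mask


def fshr(a, b, s, bw):
    """Low bw bits of the 2*bw-bit value a:b shifted right by s % bw."""
    mask = (1 << bw) - 1
    s %= bw
    return ((((a & mask) << bw) | (b & mask)) >> s) & mask


def demanded_data_bits(a_out, s, bw, is_fshr=False):
    """Demanded bits of the first two operands for a constant shift amount."""
    mask = (1 << bw) - 1
    s %= bw
    if is_fshr:
        s = bw - s  # normalize to a left funnel shift
    demanded_a = (a_out >> s) & mask
    demanded_b = (a_out << (bw - s)) & mask
    return demanded_a, demanded_b


def demanded_shift_amount_bits(bw):
    """Demanded bits of the shift amount operand."""
    # For a power-of-two width, s % bw == s & (bw - 1).
    if bw & (bw - 1) == 0:
        return bw - 1
    return (1 << bw) - 1  # otherwise every bit may matter
```

For example, with `bw = 8`, `a_out = 0x0F`, and a left funnel shift by 4, only the top four bits of the second operand are demanded and none of the first.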

Differential Revision: https://reviews.llvm.org/D54876

llvm-svn: 347561

[clangd] Collect and store expected types in the index

Summary:
And add a hidden option to control whether the types are collected.
For experiments, will be removed when expected types implementation
is stabilized.

The index size is almost unchanged, e.g. the YAML index for all clangd
sources increased from 53MB to 54MB.

Reviewers: ioeric, sammccall

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52274

llvm-svn: 347560

[clangd] Initial implementation of expected types

Summary:
Provides facilities to model the C++ conversion rules without the AST.
The introduced representation can be stored in the index and used to
implement type-based ranking improvements for index-based completions.

Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: malaperle, mgorny, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52273

llvm-svn: 347559

[Index] Expose USR generation for types

Summary: Used in clangd.

Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52275

llvm-svn: 347558

[x86] promote all multiply i8 by constant to i32

We have these 2 "isDesirable" promotion hooks (I'm not sure why we need both of them, but that's
independent of this patch), and we can adjust them to promote "mul i8 X, C" to i32. Then, all of
our existing LEA and other multiply expansion magic happens as it would for i32 ops.
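
The promotion is safe because the low 8 bits of a product depend only on the low 8 bits of the operands. A small check of that arithmetic, with hypothetical helper names and an arbitrary `high_bits` value standing in for whatever the extended upper bits happen to contain:

```python
def mul_i8(x, c):
    """8-bit multiply with wraparound."""
    return (x * c) & 0xFF


def mul_i8_via_i32(x, c, high_bits=0):
    """Multiply in 32 bits, then truncate the result back to 8 bits."""
    widened = (x & 0xFF) | (high_bits << 8)  # upper bits are don't-care
    product = (widened * c) & 0xFFFFFFFF
    return product & 0xFF
```

Both functions agree for every 8-bit `x` and `c`, regardless of `high_bits`.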

Some of the test diffs show that we could end up with an actual 32-bit mul instruction here
because we choose not to expand to simpler ops. That instruction could be slower depending on the
subtarget. On the plus side, this means we don't need a separate instruction to load the constant
operand and possibly an extra instruction to move the result. If we need to tune mul i32 further,
we could add a later transform that tries to shrink it back to i8 based on subtarget timing.

I did not bother to duplicate all of the 32-bit test file RUNs and target settings that exist to
test whether LEA expansion is cheap or not. The diffs here assume a default target, so that means
LEA is generally cheap.

Differential Revision: https://reviews.llvm.org/D54803

llvm-svn: 347557

[PowerPC] Vector load/store builtins overstate alignment of pointers

A number of builtins in altivec.h load/store vectors from pointers to scalar
types. Currently they just cast the pointer to a vector pointer, but expressions
like that have the alignment of the target type. Of course, the input pointer
did not have that alignment so this triggers UBSan (and rightly so).

This resolves https://bugs.llvm.org/show_bug.cgi?id=39704

Differential revision: https://reviews.llvm.org/D54787

llvm-svn: 347556

Create a diagnostic group for warn_call_to_pure_virtual_member_function_from_ctor_dtor, so it can be turned into an error using Werror

Summary: Patch by Arnaud Bienner

Reviewers: davide, rsmith, jkorous

Reviewed By: jkorous

Subscribers: jkorous, sylvestre.ledru, cfe-commits

Differential Revision: https://reviews.llvm.org/D53807

llvm-svn: 347555

[clangd] Fix missing include from r347538 - fix windows buildbots

llvm-svn: 347554

[LLD][ELF] - Added a test for "-image-base: number expected" message. NFC.

We had no such test.

llvm-svn: 347553

[LLD][ELF] - Add a test for "unbalanced --push-state/--pop-state" error.

We had no such test.

llvm-svn: 347552

[clang-tidy] No warning for auto new expression in smart check

Summary: The fix for `auto` new expression is illegal.

Reviewers: aaron.ballman

Subscribers: xazax.hun, cfe-commits

Differential Revision: https://reviews.llvm.org/D54832

llvm-svn: 347551

[LLD][ELF] - Add a check for --split-stack-adjust-size error message. NFCI.

"--split-stack-adjust-size: size must be >= 0" message
was never tested.

llvm-svn: 347550

[LLD][ELF] - Do not crash when parsing the -defsym option from an error state.

When we are in an error state, script parser will not parse the -defsym
expression and hence will not tokenize it. Then ScriptLexer::Pos will be 0
and LLD will assert and crash here:

MemoryBufferRef ScriptLexer::getCurrentMB() {
assert(!MBs.empty() && Pos > 0); // Bang !

Solution - stop parsing the defsym in an error state. That is consistent
with the regular case (when we parse the linker script).

llvm-svn: 347549

[clangd] Tune down scope boost for global scope

Summary:
This improves cross-namespace completions and has ignorable
impact on other completion types.

Metrics
```
==================================================================================================
                                        OVERALL (excl. CROSS_NAMESPACE)
==================================================================================================
  Total measurements: 109367 (-6)
  All measurements:
MRR: 68.11 (+0.04) Top-1: 58.59% (+0.03%) Top-5: 80.00% (+0.01%) Top-100: 95.92% (-0.02%)
  Full identifiers:
MRR: 98.35 (+0.09) Top-1: 97.87% (+0.17%) Top-5: 98.96% (+0.01%) Top-100: 99.03% (+0.00%)
  Filter length 0-5:
MRR:      23.20 (+0.05) 58.72 (+0.01) 70.16 (-0.03) 73.44 (+0.03) 76.24 (+0.00) 80.79 (+0.14)
Top-1:    11.90% (+0.03%) 45.07% (+0.03%) 58.49% (-0.05%) 62.44% (-0.02%) 66.31% (-0.05%) 72.10% (+0.07%)
Top-5:    35.51% (+0.08%) 76.94% (-0.01%) 85.10% (-0.13%) 87.40% (-0.02%) 88.65% (+0.01%) 91.84% (+0.17%)
Top-100:  83.25% (-0.02%) 96.61% (-0.15%) 98.15% (-0.02%) 98.43% (-0.01%) 98.53% (+0.01%) 98.66% (+0.02%)

==================================================================================================
                                        CROSS_NAMESPACE
==================================================================================================
  Total measurements: 17702 (+27)
  All measurements:
MRR: 28.12 (+3.26) Top-1: 21.07% (+2.70%) Top-5: 35.11% (+4.48%) Top-100: 74.31% (+1.02%)
  Full identifiers:
MRR: 79.20 (+3.72) Top-1: 71.78% (+4.86%) Top-5: 88.39% (+2.84%) Top-100: 98.99% (+0.00%)
  Filter length 0-5:
MRR:      0.92 (-0.10) 5.51 (+0.57) 18.30 (+2.34) 21.62 (+3.76) 32.00 (+6.00) 41.55 (+7.61)
Top-1:    0.56% (-0.08%) 2.44% (+0.15%) 9.82% (+1.47%) 12.59% (+2.16%) 21.17% (+4.47%) 30.05% (+6.72%)
Top-5:    1.20% (-0.15%) 7.14% (+1.04%) 25.17% (+3.91%) 29.74% (+5.90%) 43.29% (+9.59%) 54.75% (+9.79%)
Top-100:  5.49% (-0.01%) 56.22% (+2.59%) 86.69% (+1.08%) 89.03% (+2.04%) 93.74% (+0.78%) 96.99% (+0.59%)
```

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54851

llvm-svn: 347548

[clangd] Use testPath in the test.

llvm-svn: 347547

[clang-tidy] PrintStackTraceOnErrorSignal

llvm-svn: 347546

[ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEF

We can now select CLZ via the TableGen'erated code, so support G_CTLZ
and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32.

Legalizer:
If the CLZ instruction is available, use it for both G_CTLZ and
G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and
lower G_CTLZ in terms of it.

In order to achieve this we need to add support to the LegalizerHelper
for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2).

We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF
if that is supported as a libcall, as opposed to just if it is Legal or
Custom. Due to a minor refactoring of the helper function in charge of
this, we will also allow the same behaviour for G_CTTZ and G_CTPOP.
This is not going to be a problem in practice since we don't yet have
support for treating G_CTTZ and G_CTPOP as libcalls (not even in
DAGISel).
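
Lowering G_CTLZ in terms of G_CTLZ_ZERO_UNDEF amounts to guarding the zero input, since the ZERO_UNDEF form leaves that case undefined. A sketch on plain integers, with hypothetical function names and the zero-undef variant modelled with `int.bit_length` rather than a libcall:

```python
def ctlz_zero_undef(x, bits=32):
    """Count leading zeros; the result for x == 0 is undefined."""
    assert 0 < x < (1 << bits), "undefined for zero or out-of-range input"
    return bits - x.bit_length()


def ctlz(x, bits=32):
    """Count leading zeros, defined for zero as the full bit width."""
    return bits if x == 0 else ctlz_zero_undef(x, bits)
```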

Reg bank select:
Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point.

Instruction select:
Nothing to do.

llvm-svn: 347545

Fix typo in comment. NFC

llvm-svn: 347544

[LLD][ELF] - Remove the excessive safety return. NFC.

We explicitly call finalizeContents() only once for
DynamicSection. The code testing we do not do it twice is
just excessive.

It could be an assert, but we don't do
that for other sections, so it does not seem we
should do it here either.

llvm-svn: 347543

[ARM] Prevent parallel macs for unsigned values

Both zext and sext are currently allowed during the search for narrow
sequences and sext operands are later added to the mac candidates.
But operands of muls are also added, without checking whether they're
sext or zext, which means we can generate a signed smlad when we
shouldn't.
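
A small numeric sketch of why the distinction matters. The helpers are illustrative models, not the ARM backend: `smlad` here follows the instruction's signed dual 16-bit multiply-accumulate semantics but ignores 32-bit wraparound.

```python
def sext16(v):
    """Interpret the low 16 bits of v as a signed value."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def smlad(rn, rm, acc):
    """acc + lo(rn)*lo(rm) + hi(rn)*hi(rm), halfwords treated as signed."""
    return (acc
            + sext16(rn) * sext16(rm)
            + sext16(rn >> 16) * sext16(rm >> 16))


def unsigned_dot(rn, rm, acc):
    """What the source computes when the halfwords are zero-extended."""
    return (acc
            + (rn & 0xFFFF) * (rm & 0xFFFF)
            + ((rn >> 16) & 0xFFFF) * ((rm >> 16) & 0xFFFF))
```

With a low halfword of `0xFFFF`, the signed instruction sees -1 while the zero-extended source sees 65535, so the two results diverge.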

Differential Revision: https://reviews.llvm.org/D54790

llvm-svn: 347542

Revert "[TTI] Reduction costs only need to include a single extract element cost"

This reverts commit r346970.
It was causing PR39774, a crash in slp-vectorizer on a rather simple loop
with just a bunch of 'and's in the body.

llvm-svn: 347541

[LLD][ELF] - Add llvm_unreachable. NFC.

We should never call writeTo() for a BSS section.

llvm-svn: 347540

[clangd] Cleanup after landing documentSymbol. NFC

- fix compile error on older gcc in Protocol.cpp,
- remove redundant 'llvm::' qualifiers from Protocol.cpp,
- remove unused variables in AST.cpp

llvm-svn: 347539

[clangd] Auto-index watches global CDB for changes.

Summary:
Instead of receiving compilation commands, auto-index is triggered by just
filenames to reindex, and gets commands from the global comp DB internally.
This has advantages:
- more of the work can be done asynchronously (fetching compilation commands
upfront can be slow for large CDBs)
- we get access to the CDB which can be used to retrieve interpolated commands
for headers (useful in some cases where the original TU goes away)
- fits nicely with the filename-only change observation from r347297

The interface to GlobalCompilationDatabase gets extended: when retrieving a
compile command, the GCDB can optionally report the project the file belongs to.
This naturally fits together with getCompileCommand: it's hard to implement one
without the other. But because most callers don't care, I've ended up with an
awkward optional-out-param-in-virtual method pattern - maybe there's a better
one.

This is the main missing integration point between ClangdServer and
BackgroundIndex, after this we should be able to add an auto-index flag.

Reviewers: ioeric, kadircet

Subscribers: MaskRay, jkorous, arphaman, cfe-commits, ilya-biryukov

Differential Revision: https://reviews.llvm.org/D54865

llvm-svn: 347538

[clang-tidy] Don't generate incorrect fixes for class with deleted copy constructor in smart_ptr check.

Summary:
The fix for aggregate initialization (`std::make_unique<Foo>(Foo {1, 2})`) needs
to see Foo's copy constructor, otherwise we will have a compiler error. So we
only emit the check warning.

Reviewers: JonasToth, aaron.ballman

Subscribers: xazax.hun, cfe-commits

Differential Revision: https://reviews.llvm.org/D54745

llvm-svn: 347537

[ELF] - Added test case for invalid relocation target errors. NFCI.

We had a proper error reporting, but no test cases.

llvm-svn: 347536

[LLD][ELF] - Add a test for non-null terminated mergeable strings section. NFCI.

LLD reports an error in this case, but we had no test.

llvm-svn: 347535

Revert "[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction"

This reverts commit r347532. I forgot to add the option
-mtriple powerpc64-unknown-linux-gnu, so the test fails on every platform
except PowerPC.

llvm-svn: 347534

[X86] Add test cases to show bad type legalization of fptosi/fptoui v16f32->v16i8 and v8f64->v8i16 on pre-AVX512 targets.

When splitting the v16f32/v8f64 result type, type legalization will try to promote the integer result type before a concat and an explicit truncate. But for the fptoui test case this is particularly bad since fptoui isn't supported on X86 until AVX512. We could use an fptosi since the result range would fit in a signed 32-bit value, but the generic type legalization doesn't do that transformation when splitting. It does do this when promoting.

llvm-svn: 347533

[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction

Summary:
There are 4 instructions which have an inconsistent ImmMustBeMultipleOf in the
function PPCInstrInfo::instrHasImmForm: LFS, LFD, STFS, and STFD.
These four instructions should set ImmMustBeMultipleOf to 1 instead of 4.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D54738

llvm-svn: 347532

A "constexpr" is evaluated in a constant context. Make sure this is reflected
if a __builtin_constant_p() is a part of a constexpr.

llvm-svn: 347531

[Support/FileSystem] Add sub-second precision for atime/mtime of sys::fs::file_status on unix platforms

Summary:
getLastAccessedTime() and getLastModificationTime() provided times in nanoseconds but with only 1 second resolution, even when the underlying file system could provide more precise times than that.
These changes add sub-second precision for unix platforms that support improved precision.

Also add some comments to make sure people are aware that the resolution of times can vary across different file systems.
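
As a rough sketch of the idea (not the code from this patch): POSIX `stat`
exposes a nanosecond field next to the whole seconds, and both can be combined
into one time point. The names `TimePointNS`, `toTimePointNS` and
`modificationTime` are hypothetical, and the field names assume Linux
(`st_mtim`) or macOS (`st_mtimespec`).

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <sys/stat.h>

// Time point with nanosecond resolution on the system clock.
using TimePointNS = std::chrono::time_point<std::chrono::system_clock,
                                            std::chrono::nanoseconds>;

// Combine whole seconds and the sub-second nanosecond part.
TimePointNS toTimePointNS(std::time_t Sec, long NSec) {
  return TimePointNS(std::chrono::seconds(Sec) +
                     std::chrono::nanoseconds(NSec));
}

// Returns false if stat() fails; otherwise stores the mtime in Out.
bool modificationTime(const char *Path, TimePointNS &Out) {
  struct stat St;
  if (::stat(Path, &St) != 0)
    return false;
#if defined(__APPLE__)
  Out = toTimePointNS(St.st_mtimespec.tv_sec, St.st_mtimespec.tv_nsec);
#else
  Out = toTimePointNS(St.st_mtim.tv_sec, St.st_mtim.tv_nsec);
#endif
  return true;
}
```

Whether the nanosecond part is ever non-zero still depends on the file system.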

Reviewers: labath, zturner, aaron.ballman, kristina

Reviewed By: aaron.ballman, kristina

Subscribers: lebedev.ri, mgorny, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D54826

llvm-svn: 347530

[CodeComplete] Simplify CodeCompleteConsumer.cpp, NFC

Use range-based for loops
Use XStr.compare(YStr) < 0
Format misaligned code

llvm-svn: 347529

[MetadataTest] Fix off-by-one strncpy warning reported by gcc8. (NFC)

llvm-svn: 347528

[CodeGen] translate MS rotate builtins to LLVM funnel-shift intrinsics

This was originally part of:
D50924

and should resolve PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387

...but it was reverted because some bots using a gcc host compiler
would crash for unknown reasons with this included in the patch.
Trying again now to see if that's still a problem.

llvm-svn: 347527

[x86] limit transform for select-of-fp-constants

This should likely be adjusted to limit this transform
further, but these diffs should be clear wins.

If we have blendv/conditional move, then we should assume
those are cheap ops. The loads become independent of the
compare, so those can be speculated before we need to use
the values in the blend/mov.

llvm-svn: 347526

[x86] add tests for select-of-fp-constants; NFC

There are many options here depending on subtarget,
but we are uniformly relying on a transform that was
driven by performance for a 32-bit SSE2 target in 2009.

Note: The same motivation was apparently used to do this
transform for *all* targets, so non-x86 may want to look
at this too.

llvm-svn: 347525

[IPSCCP] Use input operand instead of OriginalOp for ssa_copy.

OriginalOp of a Predicate refers to the original IR value,
before renaming. While solving in IPSCCP, we have to use
the operand of the ssa_copy instead, to avoid missing
updates for nested conditions on the same IR value.

Fixes PR39772.

llvm-svn: 347524

[SelectionDAG] move constant or splat functions to common location

rL347502 moved the null sibling, so we should group all of these
together. I'm not sure why these aren't methods of the SDValue
class itself, but that's another patch if that's possible.

llvm-svn: 347523

[llvm-mca] Add support for instructions with a variadic number of operands.

By default, llvm-mca conservatively assumes that a register operand from the
variadic sequence is both a register read and a register write. That is because
MCInstrDesc doesn't describe extra variadic operands; we don't have enough
dataflow information to tell which register operands from the variadic sequence
are definitions, and which are uses instead.

However, if a variadic instruction is flagged 'mayStore' (but not 'mayLoad'),
and it has no 'unmodeledSideEffects', then llvm-mca (very) optimistically
assumes that any register operand in the variadic sequence is a register read
only. Conversely, if a variadic instruction is marked as 'mayLoad' (but not
'mayStore'), and it has no 'unmodeledSideEffects', then llvm-mca optimistically
assumes that any extra register operand is a register definition only.
These assumptions work quite well for variadic load/store multiple instructions
defined by the ARM backend.
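
The rules above can be summarized with a small standalone sketch. It is not
the llvm-mca implementation; `InstrFlags`, `VariadicAccess` and
`classifyVariadicOperands` are hypothetical names that only mirror the flags
mentioned in this message.

```cpp
#include <cassert>

// Hypothetical stand-in for the instruction flags named above.
struct InstrFlags {
  bool MayLoad;
  bool MayStore;
  bool HasUnmodeledSideEffects;
};

enum class VariadicAccess { ReadOnly, WriteOnly, ReadAndWrite };

// How the extra register operands of a variadic instruction are treated.
VariadicAccess classifyVariadicOperands(const InstrFlags &F) {
  if (!F.HasUnmodeledSideEffects) {
    // Store-only: assume the extra registers are only read.
    if (F.MayStore && !F.MayLoad)
      return VariadicAccess::ReadOnly;
    // Load-only: assume the extra registers are only written.
    if (F.MayLoad && !F.MayStore)
      return VariadicAccess::WriteOnly;
  }
  // Conservative default: each register is both read and written.
  return VariadicAccess::ReadAndWrite;
}
```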

llvm-svn: 347522

add Kang Zhang(shkzhang@cn.ibm.com) to the CREDITS.TXT

llvm-svn: 347521

A bit of AST matcher cleanup, NFC.

Removed the uses of the allOf() matcher inside node matchers that are implicit
allOf(). Replaced uses of allOf() with the explicit node matcher where it makes
matchers more readable. Replaced anyOf(hasName(), hasName(), ...) with the more
efficient and readable hasAnyName().

llvm-svn: 347520

[X86][compiler-rt] Add missing semicolon

llvm-svn: 347519

[X86] Synchronize a macro in getAvailableFeatures in Host.cpp with the same macro in compiler-rt to fix a negative shift amount warning.

llvm-svn: 347518

[X86] Make conversion of feature bits into a mask explicitly unsigned by using 1U instead of 1.

llvm-svn: 347517

[X86][compiler-rt] Attempt to fix a warning about a shift amount being negative in a macro expansion.

llvm-svn: 347516

[InstCombine] Determine demanded and known bits for funnel shifts

Support funnel shifts in InstCombine demanded bits simplification.
If the shift amount is constant, we can determine both the demanded
bits of the operands, as well as the known bits of the result.

If one of the operands has no demanded bits, it will be replaced
by undef and the funnel shift will be simplified into a simple shift
due to the simplifications added in D54778.
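
A small sketch of the demanded-bits reasoning for a 32-bit `fshl` with a
constant, non-zero shift amount S (after reduction modulo the bit width). This
is an illustration on plain integers, not the InstCombine code; `fshl32`,
`OperandMasks` and `demandedOperandBits` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned BitWidth = 32;

// fshl on 32-bit values for 0 < S < 32: high word of (A:B) << S.
uint32_t fshl32(uint32_t A, uint32_t B, unsigned S) {
  return (A << S) | (B >> (BitWidth - S));
}

struct OperandMasks {
  uint32_t LHS; // bits of A that can affect the demanded result bits
  uint32_t RHS; // bits of B that can affect the demanded result bits
};

// Map the demanded bits of the result back onto the two operands.
OperandMasks demandedOperandBits(uint32_t DemandedResult, unsigned S) {
  return {DemandedResult >> S, DemandedResult << (BitWidth - S)};
}
```

For example, with a demanded mask of 0x0000FF00 and S = 4, the mask for the
second operand is zero, so that operand does not matter and the funnel shift
behaves like a plain `shl` of the first operand.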

Differential Revision: https://reviews.llvm.org/D54869

llvm-svn: 347515

[llvm-mca] InstrBuilder: warnings for call/ret instructions are only reported once.

llvm-svn: 347514

[analyzer] INT50-CPP. Do not cast to an out-of-range enumeration checker

This checker implements a solution to the "INT50-CPP. Do not cast to an
out-of-range enumeration value" rule [1].
It lands in alpha for now, and a number of followup patches are planned in order
to enable it by default.
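
For illustration, a sketch of the kind of cast the rule is about and a checked
alternative. The enum `Color` and the helper `toColor` are hypothetical; the
exact set of casts diagnosed by the checker is not described here.

```cpp
#include <cassert>

enum Color { Red = 0, Green = 1, Blue = 2 };

// Checked conversion: validate the integer before casting, so that
// static_cast<Color>(Value) is never reached with a value that does not
// correspond to an enumerator.
bool toColor(int Value, Color &Out) {
  if (Value < Red || Value > Blue)
    return false;
  Out = static_cast<Color>(Value);
  return true;
}
```

An unchecked `static_cast<Color>(7)` is the pattern the rule warns about.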

[1] https://www.securecoding.cert.org/confluence/display/cplusplus/INT50-CPP.+Do+not+cast+to+an+out-of-range+enumeration+value

Patch by: Endre Fülöp and Alexander Zaitsev!

Differential Revision: https://reviews.llvm.org/D33672

llvm-svn: 347513

isEvaluatable() implies a constant context.

Assume that we're in a constant context if we're asking if the expression can
be compiled into a constant initializer. This fixes the issue where a
__builtin_constant_p() in a compound literal was diagnosed as not being
constant, even though it's always possible to convert the builtin into a
constant.

llvm-svn: 347512

Revert unapproved commit

llvm-svn: 347511

[AArch64] Enable libm vectorized functions via SLEEF

This changeset is modeled after Intel's submission for SVML. It enables
trigonometry functions vectorization via SLEEF: http://sleef.org/.

* A new vectorization library enum is added to TargetLibraryInfo.h: SLEEF.
* A new option is added to TargetLibraryInfoImpl - ClVectorLibrary: SLEEF.
* A comprehensive test case is included in this changeset.
* In a separate changeset (for clang), a new vectorization library argument is
added to -fveclib: -fveclib=SLEEF.

Trigonometry functions that are vectorized by sleef:

acos
asin
atan
atanh
cos
cosh
exp
exp2
exp10
lgamma
log10
log2
log
sin
sinh
sqrt
tan
tanh
tgamma

Patch by Stefan Teleman
Differential Revision: https://reviews.llvm.org/D53927

llvm-svn: 347510

[clangd] Add 'Switch header/source' command in clangd-vscode

Summary:
Alt+o is used on Windows/Linux and Option+Cmd+o on macOS.

Signed-off-by: Marc-Andre Laperle <malaperle@gmail.com>
Reviewers: hokein, ilya-biryukov, ioeric

Reviewed By: ioeric

Subscribers: sammccall, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54781

llvm-svn: 347509

[CodeComplete] Delete unused variable in rC342449

llvm-svn: 347508

[CodeComplete] Format SemaCodeComplete.cpp and improve code consistency

There are some mis-indented places and missing spaces here and there. Just format the whole file.

Also, newer code (from 2014 onwards) in this file prefers const auto *X = dyn_cast to not repeat the Decl type name. Make other occurrences consistent.
Remove two anonymous namespaces that are not very necessary: 1) a typedef 2) a local function (should use static)

llvm-svn: 347507

[ARM] Add dependency from ARMAsmParser to ARMAsmPrinter after r347494

This fixes -DBUILD_SHARED_LIBS=on

llvm-svn: 347506

[InstCombine] Simplify funnel shift with zero/undef operand to shift

The following simplifications are implemented:

* `fshl(X, 0, C) -> shl X, C%BW`
* `fshl(X, undef, C) -> shl X, C%BW` (assuming undef = 0)
* `fshl(0, X, C) -> lshr X, BW-C%BW`
* `fshl(undef, X, C) -> lshr X, BW-C%BW` (assuming undef = 0)
* `fshr(X, 0, C) -> shl X, BW-C%BW`
* `fshr(X, undef, C) -> shl X, BW-C%BW` (assuming undef = 0)
* `fshr(0, X, C) -> lshr X, C%BW`
* `fshr(undef, X, C) -> lshr X, C%BW` (assuming undef = 0)

The simplification is only performed if the shift amount C is constant,
because we can explicitly compute C%BW and BW-C%BW in this case.
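
The identities can be checked on plain integers with a reference model of the
32-bit funnel shifts (concatenate the two operands, shift by the amount modulo
the bit width). This is a sketch for illustration, not code from the patch;
`fshl`, `fshr` and `identitiesHold` are hypothetical helpers. When C%BW is 0
the sketch only checks the two identities that shift by C%BW, because a C++
shift by the full bit width is undefined.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned BW = 32;

// Funnel shift left: high word of (Hi:Lo) << (Amt % BW).
uint32_t fshl(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  uint64_t Concat = (uint64_t(Hi) << BW) | Lo;
  return uint32_t((Concat << (Amt % BW)) >> BW);
}

// Funnel shift right: low word of (Hi:Lo) >> (Amt % BW).
uint32_t fshr(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  uint64_t Concat = (uint64_t(Hi) << BW) | Lo;
  return uint32_t(Concat >> (Amt % BW));
}

// Returns true if the listed simplifications agree with the model.
bool identitiesHold(uint32_t X, uint32_t C) {
  uint32_t S = C % BW;
  bool Ok = fshl(X, 0, C) == (X << S) && fshr(0, X, C) == (X >> S);
  if (S == 0)
    return Ok; // BW - S would be an out-of-range shift amount in C++
  return Ok && fshl(0, X, C) == (X >> (BW - S)) &&
         fshr(X, 0, C) == (X << (BW - S));
}
```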

Differential Revision: https://reviews.llvm.org/D54778

llvm-svn: 347505