review.tizen.org Git - platform/upstream/llvm.git/log

[AMDGPU] Fix bug with tracking processed blocks in SIInsertWaitcnts

BlockWaitcntProcessedSet was not being cleared between calls, so it was
producing incorrect counts in cases where MBB addresses happened to coincide
across multiple calls.

Differential Revision: https://reviews.llvm.org/D48391

llvm-svn: 335268

AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z
and everything that comes with it from implementation
and v3 header files.

Leave definition in v2 header files for backwards
compatibility.

Differential Revision: https://reviews.llvm.org/D48191

llvm-svn: 335267

[InstCombine] add tests for shuffled cmps; NFC

llvm-svn: 335266

[tsan] Use DARWIN_osx_LINK_FLAGS when building unit tests to match ASan behavior.

llvm-svn: 335265

[DebugInfo] Ignore DBG_VALUE instructions in PostRA Machine Sink

Summary:
The logic for handling the sinking of COPY instructions was generating
different code when building with debug flags.

The original code did not take into consideration debug instructions. This
resulted in the registers in the DBG_VALUE instructions being treated as used,
and prevented the COPY from being sunk. This patch avoids analyzing debug
instructions when trying to sink COPY instructions.

This patch also creates a routine from the code in MachineSinking::SinkInstruction to
perform the logic of sinking an instruction along with its debug instructions.
This functionality is used in multiple places, including the code for sinking COPY instrs.

Reviewers: junbuml, javed.absar, MatzeB, bjope

Reviewed By: bjope

Subscribers: aprantl, probinson, thegameg, jonpa, bjope, vsk, kristof.beyls, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D45637

llvm-svn: 335264

Fix an issue where DW_OP_deref might be dereferencing a file address. Convert the file address to a load address so this works.

https://bugs.llvm.org/show_bug.cgi?id=36871

llvm-svn: 335263

[InstCombine] use constant pattern matchers with icmp+sext

The previous code worked with vectors, but it failed when the
vector constants contained undef elements.
The matchers handle those cases.

llvm-svn: 335262

[InstCombine] add vector icmp tests with undefs; NFC

llvm-svn: 335261

Partially revert r335236

Jim pointed out that XCode has build configurations that build without
python and removing the ifdefs around the python code breaks them.

This reverts the #ifdef part of the above patch, while keeping the cmake
parts.

llvm-svn: 335260

[Intrinsics] Add/move some builtin declarations in intrin.h to get ms-intrinsics.c to not issue warnings

ud2 and int2c were missing declarations entirely. And the bitscans were only under x86_64, but they seem to be in BuiltinsARM.def as well and are tested by ms_intrinsics.c

Differential Revision: https://reviews.llvm.org/D48187

llvm-svn: 335259

[InstCombine] simplify binops before trying other folds

This is outwardly NFC from what I can tell, but it should be more efficient
to simplify first (despite the name, SimplifyAssociativeOrCommutative does
not actually simplify as InstSimplify does - it creates/morphs instructions).

This should make it easier to refactor duplicated code that runs for all binops.

llvm-svn: 335258

[LoopVectorize] regenerate full checks; NFC

llvm-svn: 335257

[X86] Update fast-isel tests for clang r335253.

The new IR fixes a mismatch in the final extractelement for the i32 intrinsics. Previously we extracted a 64-bit element even though we only wanted 32 bits.

SimplifyDemandedElts isn't able to make FP elements undef now and the shuffle mask I used prevents the use of horizontal add we had before. Not sure we should have been using horizontal add anyway. It's implemented on Intel with two port 5 shuffles and an add. So we have on less shuffle now, but an additional instruction to decode.

Differential Revision: https://reviews.llvm.org/D48347

llvm-svn: 335256

[DebugInfo] Inline for without DebugLocation

Summary:
This test is a strip down version of a function inside the
amalgamated sqlite source. When converted to IR clang produces
a phi instruction without debug location.

This patch fixes the above issue.

Differential Revision: https://reviews.llvm.org/D47720

llvm-svn: 335255

[DWARF] Warn on and ignore ".file 0" for DWARF v4 and earlier.

This had been messing with the directory table for prior versions, and
also could induce a crash when generating asm output.

llvm-svn: 335254

[X86] Rewrite the add/mul/or/and reduction intrinsics to make better use of other intrinsics and remove undef shuffle indices.

Similar to what was done to max/min recently.

These already reduced the vector width to 256 and 128 bit as we go unlike the original max/min code.

Differential Revision: https://reviews.llvm.org/D48346

llvm-svn: 335253

[clang-tidy] Remove the google-readability-redundant-smartptr-get alias

I don't remember why I added it, but it's definitely not needed, since the check
doesn't have any options and the check doesn't have any special relation to the
Google C++ style.

llvm-svn: 335252

Revert "[AArch64] Coalesce Copy Zero during instruction selection"

This reverts commit d8f57105010cc7e78026e511d5def873fc91e0e7.

Original Commit:

Author: Haicheng Wu <haicheng@codeaurora.org>
Date:   Sun Feb 18 13:51:33 2018 +0000

    [AArch64] Coalesce Copy Zero during instruction selection

    Add special case for copy of zero to avoid a double copy.

    Differential Revision: https://reviews.llvm.org/D36104

Author's intention is to remove a BB that has one mov instruction. In
order to do that, d8f571050 pessmizes MachineSinking by introducing a
copy, such that mov instruction is NOT moved to the BB. Optimization
downstream gets rid of the BB with only mov instruction. This works well
if we have only one fall through branch as there is only one "extra"
mov instruction.

If we have multiple fall throughs, we will have a lot of redundant movs.
In such a case, it's better to have this BB which has one mov instruction.

This is causing degradation in jpeg, fft and other codebases. I believe
if we want to remove a BB with only one branch instruction, we should not
pessimize Machine Sinking at all, and find some other solution.

llvm-svn: 335251

DAG combine "and|or (select c, -1, 0), x" -> "select c, x, 0|-1"

Allowed folding for "and/or" binops with non-constant operand if
arguments of select are 0/-1 values.

Normally this code with "and" opcode does not get to a DAG combiner
and simplified yet in the InstCombine. However AMDGPU produces it
during lowering and InstCombine has no chance to optimize it out.

In turn the same pattern with "or" opcode can reach DAG.

Differential Revision: https://reviews.llvm.org/D48301

llvm-svn: 335250

[ARM] Enable useAA() for the in-order Cortex-R52

This option allows codegen (such as DAGCombine or MI scheduling) to use alias
analysis information, which can help with the codegen on in-order cpu's,
especially machine scheduling. Here I have done things the same way as AArch64,
adding a subtarget feature to enable this for specific cores, and enabled it for
the R52 where we have a schedule to make use of it.

Differential Revision: https://reviews.llvm.org/D48074

llvm-svn: 335249

Fix macos build for r335244

I've made the code accept only 16 byte UUIDs, which is technically not
NFC (previously it would also accept 20 byte ones, but use only the
first 16 bytes), but this should be more correct as mac UUIDs are always
16 byte long.

llvm-svn: 335247

Remove UUID::SetFromCString

Replace uses with SetFromStringRef. NFC.

llvm-svn: 335246

[sanitizer] Stop running tests against 32-bit iOS simulator

llvm-svn: 335245

Modernize UUID class

Instead of a separate GetBytes + GetByteSize methods I introduce a
single GetBytes method returning an ArrayRef.

This is NFC cleanup now, but it should make handling arbitrarily-sized
UUIDs cleaner, should we choose to go that way. I also took the
opportunity to add some unit tests for this class.

llvm-svn: 335244

[WebAssembly] Only mark non-hidden symbols as live if they are also defined

Previously we were also marking undefined symbols (i.e. imports)
as live.

Differential Revision: https://reviews.llvm.org/D48299

llvm-svn: 335243

[InstCombine] make div/rem vector constant utility function; NFCI

This was originally in D48401 and will be used there.

llvm-svn: 335242

[NFC][ARM] ldrd/strd negative tests

Add negative tests for load and stores of alignment 2.

llvm-svn: 335241

[llvm-exegesis][NFC] Simplify BenchmarkRunner.

Get rid of createExecutableFunction().

llvm-svn: 335240

[RISCV] Tail calls don't need to save return address

Summary:
When expanding the PseudoTail in expandFunctionCall() we were using X6
to save the return address. Since this is a tail call the return
address is not needed, this patch replaces it with X0 to be ignored.

This matches the behaviour listed in the ISA V2.2 document page 110.
tail offset -----> jalr x0, x6, offset

GCC exhibits the same behavior.

Reviewers: apazos, asb, mgrang

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01

Differential Revision: https://reviews.llvm.org/D48343

llvm-svn: 335239

[x86] Lower some trunc + shuffle patterns to vpmov[q|d][b|w]

This should help in lowering the following four intrinsics:
_mm256_cvtepi32_epi8
_mm256_cvtepi64_epi16
_mm256_cvtepi64_epi8
_mm512_cvtepi64_epi8

Differential Revision: https://reviews.llvm.org/D46957

llvm-svn: 335238

[llvm-exegesis][NFC] Simplify LLVMState.

Summary: Pretty much everything we need is in llvm::TargetMachine.

Reviewers: gchatelet

Subscribers: llvm-commits, tschuett

Differential Revision: https://reviews.llvm.org/D48428

llvm-svn: 335237

ScriptInterpreterPython cleanup

Instead of #ifdef-ing the contents of all files in the plugin for all
non-python builds, just disable the plugin at the cmake level. Also,
remove spurious extra linking of the Python plugin in liblldb. This
plugin is already included as a part of LLDB_ALL_PLUGINS variable.

llvm-svn: 335236

Disable gmodules tests on linux

These tests are extremely environment-dependent. if the environment is
not module-enabled (which is the likely scenario), they won't test
anything. If one happens to have a module-enabled libc++, then the he
will start running into problems.

The first one is that the debug info in pcm file contains relocations
that ObjectFileELF doesn't handle (particularly on non-x86
architectures), but even after that is resolved, it seems we still are
unable to pull debug info out of the pcm file. I've filed pr37893 to
track that, and I am disabling gmodules tests on linux until these
issues are resolved.

llvm-svn: 335235

[liblang] Remove DOS line endings in libclang.exports

Undefined                       first referenced
symbol                             in file
clang_getCompletionFixIt           /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getTokenLocation             /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getToken                     /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getTemplateCursorKind        /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getTUResourceUsageName       /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionChunkKind       /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionChunkText       /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getSpellingLocation          /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionParent          /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionChunkCompletionString /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionPriority        /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionNumFixIts       /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getTokenExtent               /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getCompletionNumAnnotations  /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
clang_getTokenKind                 /var/gcc/llvm/obj/local/tools/clang/tools/libclang/libclang.exports
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
make[2]: *** [tools/clang/tools/libclang/CMakeFiles/libclang.dir/build.make:651: lib/libclang.so.7] Error 1

It turns out that this is caused by https://reviews.llvm.org/D46862: it added a
couple of CRs (^M) to some lines.  Solaris ld takes them to be part of the symbol
names, which of course are missing from the input objects.  GNU ld handles this
just fine.  Fixed by removing the CRs.

Bootstrapped on i386-pc-solaris2.11.  I guess this is obvious.

Differential Revision: https://reviews.llvm.org/D48423

llvm-svn: 335234

[CodeGen] Avoid handling DBG_VALUE in LiveRegUnits::stepBackward

Patch by Jesper Antonsson.

Differential Revision: https://reviews.llvm.org/D48420

llvm-svn: 335233

AMDGPU: Remove redundant MIMG instruction variants

Summary:
For sample and gather ops, we can accurately determine the set of
vaddr-size instruction variants that are required. This reduces
the size of instruction tables by ~5%.

The number of machine instruction opcodes is reduced from 10002
to 9476.

Change-Id: Ie7fc65d3657b762c7816017fe70b2e9bec644a8a

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D48168

llvm-svn: 335232

AMDGPU: Remove old-style image intrinsics

Summary:
This also removes the need for atomic pseudo instructions, since
we select the correct encoding directly in SITargetLowering::lowerImage
for dimension-aware image intrinsics.

Mesa uses dimension-aware image intrinsics since
commit a9a7993441.

Change-Id: I7473d20009476a4ed6d919cae4e6dca9ff42e77a

Reviewers: arsenm, rampitec, mareko, tpr, b-sumner

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48167

llvm-svn: 335231

InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded

Summary:
Use the expanded features of the TableGen generic tables to avoid manually
adding the combinatorially exploded set of intrinsics. The
getAMDGPUImageDimIntrinsic lookup function is early-out,
i.e. non-AMDGPU intrinsics will never look at the underlying table.

Use a generic approach for getting the new intrinsic overload to keep the
code simple, and make the image dmask handling more generic:
- handle non-sampler image loads
- handle the case where the set of demanded elements is not a prefix

There is some overlap between this code and an optimization that happens
in the backend during code generation. They currently complement each other:

- only the codegen optimization can generate vec3 loads
- only the InstCombine optimization can handle D16

The InstCombine optimization also likely covers more cases since the
codegen optimization is fairly ad-hoc. Ideally, we'll remove the optimization
in codegen once the infrastructure for vec3 is in place (which will probably
take a long time).

Modify the test cases to use dimension-aware intrinsics. This makes it
easier to see that the test coverage for the new intrinsics is equivalent,
and the old style intrinsics will be removed in a follow-up commit anyway.

Change-Id: I4b91ea661413d13004956fe4ef7d13d41b8ce3ad

Reviewers: arsenm, rampitec, majnemer

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48165

llvm-svn: 335230

AMDGPU: Convert test cases to the dimension-aware intrinsics

Summary:
Also explicitly port over some tests in llvm.amdgcn.image.* that were
missing. Some tests are removed because they no longer apply (i.e.
explicitly testing building an address vector via insertelement).

This is in preparation for the eventual removal of the old-style
intrinsics.

Some additional notes:
- constant-address-space-32bit.ll: change some GCN-NEXT to GCN because
  the instruction schedule was subtly altered
- insert_vector_elt.ll: the old test didn't actually test anything,
  because %tmp1 was not used; remove the load, because it doesn't work
  (Because of the amdgpu_ps calling convention? In any case, it's
  orthogonal to what the test claims to be testing.)

Change-Id: Idfa99b6512ad139e755e82b8b89548ab08f0afcf

Reviewers: arsenm, rampitec

Subscribers: MatzeB, qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D48018

llvm-svn: 335229

AMDGPU: Select MIMG instructions manually in SITargetLowering

Summary:
Having TableGen patterns for image intrinsics is hitting limitations:
for D16 we already have to manually pre-lower the packing of data
values, and we will have to do the same for A16 eventually.

Since there is already some custom C++ code anyway, it is arguably easier
to just do everything in C++, now that we can use the beefed-up generic
tables backend of TableGen to provide all the required metadata and map
intrinsics to corresponding opcodes. With this approach, all image
intrinsic lowering happens in SITargetLowering::lowerImage. That code is
dense due to all the cases that it handles, but it should still be easier
to follow than what we had before, by virtue of it all being done in a
single location, and by virtue of not relying on the TableGen pattern
magic that very few people really understand.

This means that we will have MachineSDNodes with MIMG instructions
during DAG combining, but that seems alright: previously we had
intrinsic nodes instead, but those are similarly opaque to the generic
CodeGen infrastructure, and the final pattern matching just did a 1:1
translation to machine instructions anyway. If anything, the fact that
we now merge the address words into a vector before DAG combine should
be an advantage.

Change-Id: I417f26bd88f54ce9781c1668acc01f3f99774de6

Reviewers: arsenm, rampitec, rtaylor, tstellar

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48017

llvm-svn: 335228

AMDGPU: Refactor MIMG instruction TableGen using generic tables

Summary:
This allows us to access rich information about MIMG opcodes from C++ code.
Simplifying the mapping between equivalent opcodes of different data size
becomes quite natural.

This also flattens the MIMG-related class and multiclass hierarchy a little,
and collapses together some of the scaffolding for sample and gather4 opcodes.

Change-Id: I1a2549fdc1e881ff100e5393d2d87e73729a0ccd

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48016

llvm-svn: 335227

AMDGPU: Use generic tables instead of SearchableTable

Summary:

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48014

Change-Id: Ibb43f90d955275571aff17d0c3ecfb5e5b299641
llvm-svn: 335226

TableGen/SearchableTables: Support more generic enums and tables

Summary:
This is essentially a rewrite of the backend which introduces TableGen
base classes GenericEnum, GenericTable, and SearchIndex. They allow
generating custom enums and tables with lookup functions using
separately defined records as the underlying database.

Also added as part of this change:

- Lookup functions may use indices composed of multiple fields.

- Instruction fields are supported similar to Intrinsic fields.

- When the lookup key has contiguous numeric values, the lookup
function will directly index into the table instead of using a binary
search.

The existing SearchableTable functionality is internally mapped to the
new primitives.

Change-Id: I444f3490fa1dbfb262d7286a1660a2c4308e9932

Reviewers: arsenm, tra, t.p.northover

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D48013

llvm-svn: 335225

AMDGPU: Pass AMDGPUSampleVariant to MIMG_{Sampler,Gather}(_WQM)

Summary:
This will allows us to provide rich metadata about the instructions
in tables that are accessible by custom C++ code.

Change-Id: Id9305a26304ab6a6cceb6c65c8cd49141cc0101d

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48011

llvm-svn: 335224

AMDGPU: Add implicit def of SCC to kill and indirect pseudos

Summary:
Kill instructions sometimes do use SCC in unusual circumstances, when
v_cmpx cannot be used due to the operands that are involved.

Additionally, even if SCC was never defined by the expansion, kill pseudos
could previously occur between an s_cmp and an s_cbranch_scc, which breaks
the SCC liveness tracking when the pseudo is expanded to split the basic
block. While it would be possible to explicitly mark the SCC as live-in for
the successor basic block, it's simpler to just mark the pseudo as using SCC,
so that such a sequence is never emitted by instruction selection in the
first place.

A similar issue affects indirect source/dest pseudos in principle, although
I haven't been able to come up with a test case where it actually matters
(this affects instruction selection, so a MIR test can't be used).

Fixes: dEQP-GLES3.functional.shaders.discard.dynamic_loop_always
Change-Id: Ica8d82ecff1a763b892a1112cf1b06c948863a4f

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47761

llvm-svn: 335223

AMDGPU: Turn D16 for MIMG instructions into a regular operand

Summary:
This allows us to reduce the number of different machine instruction
opcodes, which reduces the table sizes and helps flatten the TableGen
multiclass hierarchies.

We can do this because for each hardware MIMG opcode, we have a full set
of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata
and vaddr registers. Instead of having separate D16 machine instructions,
a packed D16 instructions loading e.g. 4 components can simply use the
same V2 opcode variant that non-D16 instructions use.

We still require a TSFlag for D16 buffer instructions, because the
D16-ness of buffer instructions is part of the opcode. Renaming the flag
should help avoid future confusion.

The one non-obvious code change is that for gather4 instructions, the
disassembler can no longer automatically decide whether to use a V2 or
a V4 variant. The existing logic which choose the correct variant for
other MIMG instruction is extended to cover gather4 as well.

As a bonus, some of the assembler error messages are now more helpful
(e.g., complaining about a wrong data size instead of a non-existing
instruction).

While we're at it, delete a whole bunch of dead legacy TableGen code.

Change-Id: I89b02c2841c06f95e662541433e597f5d4553978

Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor

Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47434

llvm-svn: 335222

TableGen: Allow foreach in multiclass to depend on template args

Summary:
This also allows inner foreach loops to have a list that depends on
the iteration variable of an outer foreach loop. The test cases show
some very simple examples of how this can be used.

This was perhaps the last remaining major non-orthogonality in the
TableGen frontend.

Change-Id: I79b92d41a5c0e7c03cc8af4000c5e1bda5ef464d

Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D47431

llvm-svn: 335221

Fix line endings in recently updated test file

llvm-svn: 335220

[llvm-mca] Updates comment in code, and remove some stale comments. NFC

Also, rename fields `TotalMappings` and `NumUsedMappings` in struct
RegisterMappingTracker into `NumPhysRegs` and `NumUsedPhysRegs`.

llvm-svn: 335219

[clangd] Expose 'shouldCollectSymbol' helper from SymbolCollector.

Summary: This allows tools to examine symbols that would be collected in a symbol index. For example, a tool that measures index-based completion quality would be interested in references to symbols that are collected in the index.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48418

llvm-svn: 335218

[DA] Enable -da-delinearize by default

This enables da-delinearize in Dependence Analysis for delinearizing array
accesses into multiple dimensions. This can help to increase the power of
Dependence analysis on multi-dimensional arrays and prevent having to fall
back to the slower and less accurate MIV tests. It adds static checks on the
bounds of the arrays to ensure that one dimension doesn't overflow into
another, and brings our code in line with our tests.

Differential Revision: https://reviews.llvm.org/D45872

llvm-svn: 335217

[X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882)

These were being over cautious for costs for one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does.

llvm-svn: 335216

[SLPVectorizer][X86] Add horizontal add/sub tests

Shows PR37882 perf regression

llvm-svn: 335215

[DebugInfo] Make sure all DBG_VALUEs' reguse operands have IsDebug property

Summary:
In some cases, these operands lacked the IsDebug property, which is meant to signal that
they should not affect codegen. This patch adds a check for this property in the
MachineVerifier and adds it where it was missing.

This includes refactorings to use MachineInstrBuilder construction functions instead of
manually setting up the intrinsic everywhere.

Patch by: JesperAntonsson

Reviewers: aprantl, rnk, echristo, javed.absar

Reviewed By: aprantl

Subscribers: qcolombet, sdardis, nemanjai, JDevlieghere, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D48319

llvm-svn: 335214

CODE_OWNERS: Take ownership of the MIPS backend

As agreed with Simon Dardis, I'm taking over as a code owner
for the MIPS backend.

llvm-svn: 335213

[clangd] Fix proximity signal output format. NFC

llvm-svn: 335212

[Sema] Fix overloaded static functions for templates

Apply almost the same fix as https://reviews.llvm.org/D36390 but for templates.

Differential Revision: https://reviews.llvm.org/D43453

llvm-svn: 335211

[DAGCombine] Fix alignment for offset loads/stores

The alignment parameter to getExtLoad is treated as a base alignment,
not the alignment of the load (base + offset). When we infer a better
alignment for a Ptr we need to ensure that it applies to the base to
prevent the alignment on the load from being wrong.

This fixes a bug where the alignment could then be used to incorrectly
prove noalias between a load and a store, leading to a miscompile.

Differential Revision: https://reviews.llvm.org/D48029

llvm-svn: 335210

Fix double-float constant truncation warnings. NFCI.

llvm-svn: 335209

Remove FIXME comment about WIP.
This is the only line other than the function signature remaining
of the original patch.

llvm-svn: 335208

Add some explanatory text to the associated symbol support.

llvm-svn: 335207

Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.

r335150 should resolve the issues with the clang-with-thin-lto-ubuntu
and clang-with-lto-ubuntu builders.

Original message:
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

llvm-svn: 335206

[DebugInfo] Keep DBG_VALUE undef in LiveDebugVariables

Summary:
Fixes PR36579.

For cases where we had e.g.

DBG_VALUE 42
[...]
DBG_VALUE undef

LiveDebugVariables would discard all undef DBG_VALUEs and then it would
look like the variable had the value 42 throughout the rest of the
function, which is incorrect.

With this patch we don't remove all undef DBG_VALUEs in LiveDebugVariables
so they will be kept after register allocation just like other DBG_VALUEs
which will yield more correct debug information.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: bjope, Ka-Ka, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48277

llvm-svn: 335205

[X86] Go through some tests that still reference old intrinsics that have been autoupgraded and replace them with the upgraded IR.

This is mostly the stack folding tests and is by no means a thorough audit of tests.

llvm-svn: 335204

[PM/LoopUnswitch] Add partial non-trivial unswitching for invariant
conditions feeding a chain of `and`s or `or`s for a branch.

Much like with full non-trivial unswitching, we rely on the pass manager
to handle iterating until all of the profitable unswitches have been
done. This is to allow other more profitable unswitches to fire on any
of the cloned, simpler versions of the loop if viable.

Threading the partial unswiching through the non-trivial unswitching
logic motivated some minor refactorings. If those are too disruptive to
make it reasonable to review this patch, I can separate them out, but
it'll be somewhat timeconsuming so I wanted to send it for initial
review as-is. Feel free to tell me whether it warrants pulling apart.

I've tried to re-use (and factor out) logic form the partial trivial
unswitching, but not as much could be shared as I had haped. Still, this
wasn't as bad as I naively expected.

Some basic testing is added, but I probably need more. Suggestions for
things you'd like to see tested more than welcome. One thing I'd like to
do is add some testing that when we schedule this with loop-instsimplify
it effectively cleans up the cruft created.

Last but not least, this uncovered a bug that has been in loop cloning
the entire time for non-trivial unswitching. Specifically, we didn't
correctly add the outer-most cloned loop to the list of cloned loops.
This meant that LCSSA wouldn't be updated for it hypothetically, and
more significantly that we would never visit it in the loop pass
manager. I noticed this while checking loop-instsimplify by hand. I'll
try to separate this bugfix out into its own patch with a more focused
test. But it is just one line, so shouldn't significantly confuse the
review here.

After this patch, the only missing "feature" in this unswitch I'm aware
of us non-trivial unswitching of switches. I'll try implementing *full*
non-trivial unswitching of switches (which is at least a sound thing to
implement), but *partial* non-trivial unswitching of switches is
something I don't see any sound and principled way to implement. I also
have no interesting test cases for the latter, so I'm not really
worried. The rest of the things that need to be ported are bug-fixes and
more narrow / targeted support for specific issues.

Differential Revision: https://reviews.llvm.org/D47522

llvm-svn: 335203

[RISC-V] Fix a test case to not include label names as those aren't
stable in non-asserts builds. This fixes a test failure in release
config.

llvm-svn: 335202

ProvenanceAnalysis: Store WeakTrackingVH instead of Value* in UnderlyingValue Cache.

Summary:
Since the value stored in the cache might be deleted or replaced with
something else, we need to use tracking ValueHandlers instead of plain
Value pointers. It was discovered in one of internal builds, and
unfortunately there is no small reproducer for the issue.

The cache was introduced in rL327328.

Reviewers: ahatanak, pete

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48407

llvm-svn: 335201

[X86] Remove masking from the 512-bit floating point max/min builtins. Use select in IR instead.

llvm-svn: 335200

[X86] Remove masking from 512-bit floating max/min intrinsics. Use select instruction instead.

llvm-svn: 335199

Revert "[SCEV] Improve zext(A /u B) and zext(A % B)"

This reverts commit r335197, as some bots are not happy.

llvm-svn: 335198

[SCEV] Improve zext(A /u B) and zext(A % B)

Summary:
Try to match udiv and urem patterns, and sink zext down to the leaves.

I'm not entirely sure why some unrelated tests change, but the added <nsw>s seem right.

Reviewers: sanjoy

Subscribers: jlebar, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D48338

llvm-svn: 335197

Revert "Add python tool to dump and construct header maps"

This reverts commit fcfa2dd517ec1a6045a81e8247e346d630a22618.

Broke bots:

http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11315
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/10411/steps/test-check-all/logs/stdio

llvm-svn: 335196

Revert "Warning for framework headers using double quote includes"

This reverts commit 9b5ff2db7e31c4bb11a7d468260b068b41c7c285.

Broke bots:

http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11315
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/10411/steps/test-check-all/logs/stdio

llvm-svn: 335195

Revert "Fix hmaptool cmake file to work on Windows"

This reverts commit 63711c3cd337a0d22617579a904af07481139611, due to
breaking bots:

http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11315
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/10411/steps/test-check-all/logs/stdio

llvm-svn: 335194

ASan docs: no_sanitize("address") works on globals.

Summary: Mention that no_sanitize attribute can be used with globals.

Reviewers: alekseyshl

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D48390

llvm-svn: 335193

[WebAssembly] Error on mismatched function signature in final output

During symbol resolution, emit warnings for function signature
mismatches. During GC, if any mismatched symbol is marked as live
then generate an error.

This means that we only error out if the mismatch is written to the
final output. i.e. if we would generate an invalid wasm file.

Differential Revision: https://reviews.llvm.org/D48394

llvm-svn: 335192

When a dependent alignas is applied to a non-dependent typedef,
prioritize the error for the bad subject over the error for the
dependent / non-dependent mismatch.

llvm-svn: 335191

Fix hmaptool cmake file to work on Windows

Unbreak a few windows buildbots:
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11315
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/10411/steps/test-check-all/logs/stdio

llvm-svn: 335190

Use cast instead of dyn_cast_or_null.

This addresses John's post-commit review feedback.

https://reviews.llvm.org/rC335021#inline-2038

llvm-svn: 335189

[DWARF] Improved error reporting for range lists.
Errors found processing the DW_AT_ranges attribute are propagated by lower level
routines and reported by their callers.

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D48344

llvm-svn: 335188

[WebAssembly] Minor cleanup to test inputs. NFC.

Update load-undefined.test such that it doesn't rely on
ret32 and ret64 having default visibility.

Split out from: https://reviews.llvm.org/D48394

Differential Revision: https://reviews.llvm.org/D48403

llvm-svn: 335187

[WebAssembly] Update function signature mismatch error message. NFC.

We don't start our error messages with capital letters.

Split out from https://reviews.llvm.org/D48394

Differential Revision: https://reviews.llvm.org/D48400

llvm-svn: 335186

[mips] Add microMIPS specific addressing patterns.

These are identical but use microMIPS instructions instead of MIPS instructions.

Also, flatten the 'let AdditionalPredicates = [InMicroMips]' by using the
ISA_MICROMIPS adjective. Add tests for constant materialization.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D48275

llvm-svn: 335185

Warning for framework headers using double quote includes

Introduce -Wquoted-include-in-framework-header, which should fire a warning
whenever a quote include appears in a framework header and suggest a fix-it.
For instance, for header A.h added in the tests, this is how the warning looks
like:

./A.framework/Headers/A.h:2:10: warning: double-quoted include "A0.h" in framework header, expected angle-bracketed instead [-Wquoted-include-in-framework-header]
#include "A0.h"
         ^~~~~~
         <A/A0.h>
./A.framework/Headers/A.h:3:10: warning: double-quoted include "B.h" in framework header, expected angle-bracketed instead [-Wquoted-include-in-framework-header]
#include "B.h"
         ^~~~~
         <B.h>

This helps users to prevent frameworks from using local headers when in fact
they should be targetting system level ones.

The warning is off by default.

Differential Revision: https://reviews.llvm.org/D47157

rdar://problem/37077034

llvm-svn: 335184

Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.

Summary:
Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor

Prior to the patch:
1. MergeBasicBlockIntoOnlyPred
Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)

2. MergeBlockIntoPredecessor
Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:

1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
Updated in this patch. No challenges.

2. lib/CodeGen/CodeGenPrepare.cpp
Updated in this patch.
  i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
  ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
Some interesting aspects:
  - Since Pred is not deleted (BB is), the entry block does not need updating.
  - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
  - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
  - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
  - adding some test owner as subscribers for the interesting tests modified:
    test/CodeGen/X86/avx-cmp.ll
    test/CodeGen/AMDGPU/nested-loop-conditions.ll
    test/CodeGen/AMDGPU/si-annotate-cf.ll
    test/CodeGen/X86/hoist-spill.ll
    test/CodeGen/X86/2006-11-17-IllegalMove.ll

3. lib/Transforms/Scalar/JumpThreading.cpp
Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.

Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar

Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D48202

llvm-svn: 335183

Related to PR37768: improve diagnostics for class name shadowing.

Diagnose the name of the class being shadowed by using declarations, and
improve the diagnostics for the case where the name of the class is
shadowed by a non-static data member in a class with constructors. In
the latter case, we now always give the "member with the same name as
its class" diagnostic regardless of the relative order of the member and
the constructor, rather than giving an inscrutible diagnostic if the
constructor appears second.

llvm-svn: 335182

Fix WasmEHFuncInfo.h to include what it uses

This fixes clang+llvm build with Modules and local submodule visibility.

llvm-svn: 335181

Improve SBThread's stepping API using SBError parameter.

Summary: The new methods will allow to get error messages from stepping API.

Reviewers: aprantl, clayborg, labath, jingham

Reviewed By: aprantl, clayborg, jingham

Subscribers: apolyakov, labath, jingham, clayborg, lemo, lldb-commits

Differential Revision: https://reviews.llvm.org/D47991

llvm-svn: 335180

[MemorySSA] Add convenience APIs in updater to avoid needing MSSA.

Summary:
Ideally passes should not need to pass MSSA around and do all updates through the updater.
Add convenience APIs to help with that.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D48334

llvm-svn: 335179

Remove myself from the release testers list. (NFC)

llvm-svn: 335178

Add python tool to dump and construct header maps

Header maps are binary files used by Xcode, which are used to map
header names or paths to other locations. Clang has support for
those since its inception, but there's not a lot of header map
testing around.

Since it's a binary format, testing becomes pretty much brittle
and its hard to even know what's inside if you don't have the
appropriate tools.

Add a python based tool that allows creating and dumping header
maps based on a json description of those. While here, rewrite
tests to use the tool and remove the binary files from the tree.

This tool was initially written by Daniel Dunbar.

Differential Revision: https://reviews.llvm.org/D46485

rdar://problem/39994722

llvm-svn: 335177

[Dominators] Simplify child lists and make them deterministic

This fixes an extremely subtle non-determinism that can only be
triggered by an unfortunate alignment of passes. In my case:

- JumpThreading does large dominator tree updates
- CorrelatedValuePropagation preserves domtree now
- LICM codegen depends on the order of children on domtree nodes

The last part is non-deterministic if the update was stored in a set.
But it turns out that the set is completely unnecessary, updates are
deduplicated at an earlier stage so we can just use a vector, which is
both more efficient and doesn't destroy the input ordering.

I didn't manage to get the 240 MB IR file reduced enough, triggering
this bug requires a lot of jump threading, so landing this without a
test case.

Differential Revision: https://reviews.llvm.org/D48392

llvm-svn: 335176

[MS] Make sure __GetExceptionInfo works on types with no linkage

Fixes PR36327

llvm-svn: 335175

[MemorySSA] Verify Phi incoming blocks are block predecessors.

Summary: Make the MemorySSA verify also check that all Phi incoming blocks are block predecessors.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D48333

llvm-svn: 335174

[X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel

I don't believe there is any real reason to have separate X86 specific opcodes for vector compares. Setcc has the same behavior just uses a different encoding for the condition code.

I had to change the CondCodeAction for SETLT and SETLE to prevent some transforms from changing SETGT lowering.

Differential Revision: https://reviews.llvm.org/D43608

llvm-svn: 335173

[CMake] Convert paths to the right form in standalone builds on Windows

The paths output from llvm-config --cmakedir and from clang
--print-libgcc-file-name can contain backslashes, while CMake
can't handle the paths in this form.

This matches what compiler-rt already does (since SVN r203789
and r293195).

Differential Revision: https://reviews.llvm.org/D48356

llvm-svn: 335172

[CMake] Convert paths to the right form in standalone builds on Windows

The paths output from llvm-config --cmakedir and from clang
--print-libgcc-file-name can contain backslashes, while CMake
can't handle the paths in this form.

This matches what compiler-rt already does (since SVN r203789
and r293195).

Differential Revision: https://reviews.llvm.org/D48355

llvm-svn: 335171

[SLPVectorizer] Provide InstructionsState down the BoUpSLP vectorization call tree

As described in D48359, this patch pushes InstructionsState down the BoUpSLP call hierarchy instead of the corresponding raw OpValue. This makes it easier to track the alternate opcode etc. and avoids us having to call getAltOpcode which makes it difficult to support more than one alternate opcode.

Differential Revision: https://reviews.llvm.org/D48382

llvm-svn: 335170

[CMake] Convert paths to the right form in standalone builds on Windows

The paths output from llvm-config --cmakedir and from clang
--print-libgcc-file-name can contain backslashes, while CMake
can't handle the paths in this form.

This matches what compiler-rt already does (since SVN r203789
and r293195).

Differential Revision: https://reviews.llvm.org/D48353

llvm-svn: 335169

[CUDA] Removed unused __nvvm_* builtins with non-generic pointers.

They were hot even hooked into CGBuiltin's machinery. Even if they were,
CUDA does not support AS-specific pointers, so there would be no legal way
no way to call these builtins.

This came up in D47154.

Differential Revision: https://reviews.llvm.org/D47845

llvm-svn: 335168