review.tizen.org Git - platform/upstream/llvm.git/log

[MachineCopyPropagation] Reimplement CopyTracker in terms of register units

Change the copy tracker to keep a single map of register units instead
of 3 maps of registers. This gives a very significant compile time
performance improvement to the pass. I measured a 30-40% decrease in
time spent in MCP on x86 and AArch64 and much more significant
improvements on out of tree targets with more registers.
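
A rough illustration of the idea, not the pass's actual code: the sketch below keeps one map keyed by register unit, so registers that alias each other are handled through the units they share. `ToyCopyTracker`, `UnitInfo`, the `unitsOf` table and all register/unit numbers are hypothetical.

```cpp
#include <unordered_map>
#include <vector>

using Reg = unsigned;
using Unit = unsigned;

// Per-unit state: the copy currently defining this unit (if any) and the
// destinations of copies that read this unit as part of their source.
struct UnitInfo {
  bool hasDef = false;
  Reg src = 0;
  std::vector<Reg> readers;
};

struct ToyCopyTracker {
  // Assumed register-to-units table supplied by the caller; overlapping
  // registers share units.
  std::unordered_map<Reg, std::vector<Unit>> unitsOf;
  // The single map: register unit -> copy information.
  std::unordered_map<Unit, UnitInfo> byUnit;

  // Record "dst = COPY src".
  void trackCopy(Reg dst, Reg src) {
    for (Unit u : unitsOf.at(dst)) {
      UnitInfo &info = byUnit[u];
      info.hasDef = true;
      info.src = src;
    }
    for (Unit u : unitsOf.at(src)) byUnit[u].readers.push_back(dst);
  }

  // A write to reg kills copies defining any of its units and copies whose
  // source overlaps it.
  void clobber(Reg reg) {
    for (Unit u : unitsOf.at(reg)) {
      auto it = byUnit.find(u);
      if (it == byUnit.end()) continue;
      std::vector<Reg> readers = it->second.readers;  // copy before mutating
      for (Reg dst : readers) {
        for (Unit du : unitsOf.at(dst)) {
          auto d = byUnit.find(du);
          if (d != byUnit.end()) d->second.hasDef = false;
        }
      }
      byUnit.erase(u);
    }
  }

  // True if every unit of dst is still defined by a copy from src.
  bool isCopyOf(Reg dst, Reg src) const {
    for (Unit u : unitsOf.at(dst)) {
      auto it = byUnit.find(u);
      if (it == byUnit.end() || !it->second.hasDef || it->second.src != src)
        return false;
    }
    return true;
  }
};
```

Clobbering a sub-register invalidates the copy into its super-register without a separate alias walk, because both touch the same unit.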

Differential Revision: https://reviews.llvm.org/D52374

llvm-svn: 342942

Revert "[ORC] Switch to asynchronous resolution in JITSymbolResolver."

This reverts commit r342939.

MSVC's promise/future implementation does not like types that are not default
constructible. Reverting while I figure out a solution.

llvm-svn: 342941

[MachineCopyPropagation] Rework how we manage RegMask clobbers

Instead of updating the CopyTracker's maps each time we come across a
RegMask, defer checking for this kind of interference until we're
actually trying to propagate a copy. This avoids the need to
repeatedly iterate over maps in the cases where we don't end up doing
any work.

This is a slight compile time improvement for MachineCopyPropagation
as is, but it also enables a much bigger improvement that I'll follow
up with soon.

Differential Revision: https://reviews.llvm.org/D52370

llvm-svn: 342940

[ORC] Switch to asynchronous resolution in JITSymbolResolver.

Asynchronous resolution (where the caller receives a callback once the requested
set of symbols is resolved) is a core part of the new concurrent ORC APIs. This
change extends the asynchronous resolution model down to RuntimeDyld, which is
necessary to prevent deadlocks when compiling/linking on a fixed number of
threads: If RuntimeDyld's linking process were a blocking operation, then any
complete K-graph in a program will require at least K threads to link in the
worst case, as each thread would block waiting for all the others to complete.
Using callbacks instead allows the work to be passed between dependent threads
until it is complete.

For backwards compatibility, all existing RuntimeDyld functions will continue
to operate in blocking mode as before. This change will enable the introduction
of a new async finalization process in a subsequent patch to enable asynchronous
JIT linking.

llvm-svn: 342939

Revert r342936 "Remove redundant null pointer check in operator delete"

A review for the change was opened in https://reviews.llvm.org/D52401
but the change was committed before being approved by any of the code
owners for libc++.

llvm-svn: 342938

[WebAssembly] SIMD sqrt

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52387

llvm-svn: 342937

Remove redundant null pointer check in operator delete

C89 4.10.3.2 The free function
C99 7.20.3.2 The free function
C11 7.22.3.3 The free function

If ptr is a null pointer, no action shall occur.

_aligned_free on MSDN:

If memblock is a NULL pointer, this function simply performs no actions.
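
A minimal sketch of what the quoted guarantee permits, using a hypothetical `release_block` helper rather than the libc++ sources: the pointer is passed straight to `std::free` with no preceding null test.

```cpp
#include <cstdlib>

// Releases a block obtained from std::malloc. No null check is needed:
// the C standard specifies that free(NULL) performs no action.
void release_block(void *ptr) noexcept {
  std::free(ptr);
}
```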

Reviewers: EricWF, mclow.lists

Subscribers: christof, ldionne, cfe-commits, libcxx-commits

Differential Revision: https://reviews.llvm.org/D52401

llvm-svn: 342936

[AMDGPU] Remove useless check from test. NFC.

The check for assignment of zero is practically useless
while the assignment moves around with different scheduling.

llvm-svn: 342935

[X86] Don't create FILD ISD nodes when X87 is disabled.

The included test case previously asserted because the type legalizer tried to soften the FILD ISD node.

Fixes PR38819.

llvm-svn: 342934

[X86] Remove superfluous curly braces. NFC

llvm-svn: 342933

[X86] Update comment. Use 'glued' instead of 'flagged' NFC

llvm-svn: 342932

[WebAssembly] Move .debug_line section address of dead function outside section range

Summary:
Currently we point all debug information that refers to removed function code
to the beginning of the code section (offset = 0). A debugger may want to
resolve a code offset to its debug information, and these entries will collide
with the offsets of the live functions.

Move the offsets of dead functions outside the code section range.

Reviewers: sbc100

Reviewed By: sbc100

Subscribers: dblaikie, ruiu, alexcrichton, dschuff, aprantl, jgravelle-google, aheejin, sunfish, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D49446

llvm-svn: 342930

Driver: render arguments for the embedded bitcode correctly

When embedding bitcode, only a subset of the arguments should be recorded into
the bitcode compilation commandline. The frontend job is split into two jobs,
one which will generate the bitcode. Ensure that the arguments for the
compilation to bitcode are properly stripped so that the embedded arguments are
the permitted subset.

llvm-svn: 342929

[WebAssembly][NFC] Fix hardcoded stack indices in tests

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52388

llvm-svn: 342928

[www] Change 'Clang 7' items from yellow to green now Clang 7 is
released.

llvm-svn: 342927

[www] Update cxx_status to mark P0962R1 as done.

llvm-svn: 342926

P0962R1: only use the member form of 'begin' and 'end' in a range-based
for loop if both members exist.

This resolves a DR whereby an errant 'begin' or 'end' member in a base
class could result in a derived class not being usable as a range with
non-member 'begin' and 'end'.
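
A small sketch of the situation described, with hypothetical types: the base class has an unrelated member named `begin` but no `end`, and the range is meant to be iterated through non-member functions. This assumes a compiler that implements P0962R1.

```cpp
// Hypothetical types for illustration only.
struct Base {
  int begin = 0;  // an unrelated member that happens to be named 'begin'
};

struct Numbers : Base {
  int values[3] = {1, 2, 3};
};

// Non-member begin/end, found by argument-dependent lookup.
const int *begin(const Numbers &n) { return n.values; }
const int *end(const Numbers &n) { return n.values + 3; }

int sum(const Numbers &n) {
  int total = 0;
  // Under P0962R1 the member 'begin' alone does not select the member form,
  // because there is no member 'end'; the non-member functions are used.
  for (int v : n) total += v;
  return total;
}
```

Under the earlier rule, finding either member was enough to commit to the member form, so the loop in `sum` would have been rejected.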

llvm-svn: 342925

[CUDA] Added basic support for compiling with CUDA-10.0

llvm-svn: 342924

[hwasan] Record and display stack history in stack-based reports.

Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342923

Revert "[hwasan] Record and display stack history in stack-based reports."

This reverts commit r342921: test failures on clang-cmake-arm* bots.

llvm-svn: 342922

[hwasan] Record and display stack history in stack-based reports.

Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342921

[analyzer] Prevent crashes in FindLastStoreBRVisitor

This patch is a band-aid. A proper solution would be to change
trackNullOrUndefValue to only try to dereference the pointer when it is
relevant to the problem.

Differential Revision: https://reviews.llvm.org/D52435

llvm-svn: 342920

Re-submitting changes in D51550 because it failed to patch.

Reviewers: javed.absar, trentxintong, courbet

Reviewed By: trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52433

llvm-svn: 342919

[InstCombine] add bitcast+extelt helper function; NFC

We can handle patterns where the elements have different
sizes, so refactoring ahead of trying to add another blob
within these clauses.

llvm-svn: 342918

[compiler-rt] [builtins] Add logb/logbf/logbl methods to compiler-rt to avoid libm dependencies when possible.

Summary:
The complex division builtins (div?c3) use logb methods from libm to scale numbers during division and avoid rounding issues. However, these come from libm, meaning anyone that uses --rtlib=compiler-rt also has to include -lm. Implement logb* methods for standard ieee 754 floats so we can avoid -lm on those platforms, falling back to the old behavior (using either logb() or `__builtin_logb()`) when not supported.

These new methods are defined internally as `__compiler_rt_logb` so as not to conflict with the libm definitions in any way.

This fixes just the libm methods mentioned in PR32279 and PR28652. libc is still required, although that seems to not be an issue.

Note: this is proposed as an alternative to just adding -lm: D49330.

Reviewers: efriedma, compnerd, scanon, echristo

Reviewed By: echristo

Subscribers: jsji, echristo, nemanjai, dberris, mgorny, kbarton, delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D49514

llvm-svn: 342917

[X86] Remove shift/rotate by CL memory (RMW) overrides

The uops are slightly different to the register variant, so this requires a +1uop tweak

llvm-svn: 342916

[lldb-mi] Fix hanging of target-select-so-path.test

Summary:
The target-select-so-path test might hang on
some platforms. The cause was incorrect
use of the FileCheck and lldb-mi
processes. Instead of redirecting lldb-mi's output
to FileCheck, we should run the lldb-mi session,
finish the session, collect its output and then pass
it to FileCheck.
Also, this patch adds a timer to the test to prevent
it from hanging in the future.

Reviewers: tatyana-krasnukha, aprantl, teemperor

Reviewed By: tatyana-krasnukha, teemperor

Subscribers: apolyakov, aprantl, teemperor, ki.stfu, abidh, lldb-commits

Differential Revision: https://reviews.llvm.org/D52139

llvm-svn: 342915

[X86] Infer 64bit feature support from the CPUID results in getHostCPUFeatures.

After r341022, we more strictly check the 64bit feature in X86Subtargets constructor when a 64-bit triple is used. If we don't infer this feature for autodetected CPUs we might incorrectly report an error if the CPU name wasn't autodetected to a CPU that supports 64-bit.

llvm-svn: 342914

[profile] Revert commit https://reviews.llvm.org/rL342718

llvm-svn: 342913

[CodeGen] Revert commit https://reviews.llvm.org/rL342717

llvm-svn: 342912

[Power9] [CLANG] Add __float128 exponent GET and SET builtins

Added

__builtin_vsx_scalar_extract_expq
__builtin_vsx_scalar_insert_exp_qp

Builtins should behave the same way as in GCC.

Differential Revision: https://reviews.llvm.org/D48184

llvm-svn: 342911

[Power9] [LLVM] Add __float128 exponent GET and SET builtins

Added

__builtin_vsx_scalar_extract_expq
__builtin_vsx_scalar_insert_exp_qp

Builtins should behave the same way as in GCC.

Differential Revision: https://reviews.llvm.org/D48185

llvm-svn: 342910

Fix the type of 1<<31 integer constants.

Shifting into the sign bit is technically undefined behavior. No known
compiler exploits it though.
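
As an illustration of the kind of change described (the constant and helper names here are hypothetical, and a 32-bit `unsigned int` is assumed), giving the literal an unsigned type keeps the shift out of signed arithmetic altogether:

```cpp
#include <cstdint>

// Unsigned literal: the shift is performed in unsigned arithmetic, so the
// sign bit never comes into play.
constexpr std::uint32_t kTopBit = 1u << 31;

static_assert(kTopBit == 0x80000000u, "bit 31 set, all others clear");

// Hypothetical helper showing a typical use of such a constant.
constexpr bool hasTopBit(std::uint32_t value) {
  return (value & kTopBit) != 0;
}
```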

llvm-svn: 342909

[X86][AVX] Add truncation as shuffle test for PR31451

llvm-svn: 342908

Reland r342494 after fixing LIT checks.

llvm-svn: 342907

[Analysis] add comment to generalize finding a scalar op from vector; NFC

llvm-svn: 342906

[InstCombine] add/move tests for extractelement; NFC

llvm-svn: 342905

[X86] Remove WriteDiv/WriteIDiv schedule overrides - use classes directly. NFCI.

We're missing quite a bit of data for these instructions, removing the overrides makes this obvious - inconsistent reg/mem variants is a concern as well.

Also, we have Divider resources (HWDivider etc.) but they aren't actually used consistently.

llvm-svn: 342904

[clangd] Fix uninit bool in r342888

llvm-svn: 342903

[InstCombine] improve variable name and use 'match'; NFC

'width' of a vector usually refers to the bit-width.

https://bugs.llvm.org/show_bug.cgi?id=39016
shows a case where we could extend this fold to handle
a case where the number of elements in the bitcasted
vector is not equal to the resulting value.

llvm-svn: 342902

Reverting r342895

- The used builtins do not compile for pre arm v8.3a targets with gcc

llvm-svn: 342901

[ARM] Adjust the cost model for Exynos

Tune `MaxInterleaveFactor` and `LdStMultipleTiming` and remove
`PartialUpdateClearance` for the Exynos processors.

llvm-svn: 342900

[ARM] Adjust the feature set for Exynos

Enable crypto and literals fusion for the Exynos processors.

llvm-svn: 342899

[Thumb1] Any imm8 should have cost of 1

A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.

Differential Revision: https://reviews.llvm.org/D52257

llvm-svn: 342898

[python] [tests] Update test_code_completion

Update expected completions to match output generated by clang-7.0.

Differential Revision: https://reviews.llvm.org/D50171

llvm-svn: 342897

[New PM][PassInstrumentation] IR printing support for New Pass Manager

Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.

- PrintIR routines implement printing callbacks.

- StandardInstrumentations class provides a central place to manage all
the "standard" in-tree pass instrumentations. Currently it registers
PrintIR callbacks.

Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923

llvm-svn: 342896

[AArch64] Unwinding support for return address signing

- When return address signing is enabled, the LR may be signed on function entry
- When an exception is thrown the return address is inspected and used to unwind the call stack
- Before this happens, the return address must be correctly authenticated to avoid causing an abort by dereferencing the signed pointer

Differential Revision: https://reviews.llvm.org/D51432

llvm-svn: 342895

[lld-link] Generalize handling of /debug and /debug:{none,full,fastlink,ghash,symtab}

Implement final argument precedence if multiple /debug arguments are passed on the command-line to match expected link.exe behavior.
Support /debug:none and emit warning for /debug:fastlink with automatic fallback to /debug:full.
Emit error if last /debug:option is unknown.
Emit warning if last /debugtype:option is unknown.
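
The precedence rules above can be sketched roughly as follows. This is not lld's code: `DebugKind`, `DebugChoice` and `pickDebugKind` are hypothetical names, arguments are assumed to be lowercase already, and treating a bare /debug as "full" is an assumption of the sketch.

```cpp
#include <string>
#include <vector>

enum class DebugKind { NotRequested, None, Full, GHash, Symtab, Unknown };

struct DebugChoice {
  DebugKind kind = DebugKind::NotRequested;
  bool fastlinkFallback = false;  // true if /debug:fastlink was mapped to full
};

// Scans the arguments in order; each /debug argument replaces the decision
// made by any earlier one, so the last one wins.
DebugChoice pickDebugKind(const std::vector<std::string> &args) {
  const std::string prefix = "/debug:";
  DebugChoice choice;
  for (const std::string &arg : args) {
    std::string value;
    if (arg == "/debug")
      value = "full";  // assumption: bare /debug behaves like /debug:full
    else if (arg.compare(0, prefix.size(), prefix) == 0)
      value = arg.substr(prefix.size());
    else
      continue;  // unrelated argument, e.g. /debugtype:...

    choice = DebugChoice{};
    if (value == "none") {
      choice.kind = DebugKind::None;
    } else if (value == "full") {
      choice.kind = DebugKind::Full;
    } else if (value == "fastlink") {
      choice.kind = DebugKind::Full;  // caller would emit the warning
      choice.fastlinkFallback = true;
    } else if (value == "ghash") {
      choice.kind = DebugKind::GHash;
    } else if (value == "symtab") {
      choice.kind = DebugKind::Symtab;
    } else {
      choice.kind = DebugKind::Unknown;  // caller would emit the error
    }
  }
  return choice;
}
```

Because only the final /debug argument survives, an unknown option earlier on the command line is silently overridden, which matches the "last option" wording above.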

https://reviews.llvm.org/D50404

llvm-svn: 342894

Revert "rL342883: [Clang][CodeGen][ObjC]: Fix CoreFoundation on ELF with `-fconstant-cfstrings`."

Seems to be causing buildbot failures, need to look into it.

llvm-svn: 342893

[X86] Split WriteIMul into 8/16/32/64 implementations (PR36931)

Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.

This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.

llvm-svn: 342892

[Arm][AsmParser] Restrict register list size for VSTM/VLDM

- The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified
- The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers
- This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52082

llvm-svn: 342891

[CFString][ELF] Fix a missed test causing buildbot failures from 342883.

Accidentally forgot to update it, big sorry.

llvm-svn: 342890

[VFS] Use llvm::StringMap instead of std::map. NFC

llvm-svn: 342889

[clangd] Do bounds checks while reading data, otherwise var-length records are too painful. NFC

llvm-svn: 342888

Correct RISC-V link in release notes

llvm-svn: 342887

[DAGCombiner] use UADDO to optimize saturated unsigned add

This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for
use in the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q

But we're not using 'uaddo' in those cases via DAG transforms. This happens in
CGP after D8889 without checking target lowering to see if the op is supported.
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned
saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title
(unlike x86 which sees improvements for all sizes because all sizes are 'custom').
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases.
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO,
so this patch will fire on those tests.

Another possibility given the existing behavior: we could remove the legal-or-custom
check altogether because we're assuming that a UADDO sequence is canonical/optimal
before we ever reach here. But that seems like a bug to me. If the target doesn't
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining
using a UADDO node. This is similar justification for why we don't canonicalize IR to
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first
place.
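
For readers unfamiliar with the source-level shapes involved, here is a sketch of two ways an unsigned saturating add is commonly written. The mapping of these two functions to the "using_cmp_sum" and "using_cmp_notval" test names is my reading of those names, not something taken from the test file.

```cpp
#include <cstdint>
#include <limits>

// Compare the wrapped sum against an operand: overflow happened exactly when
// the sum is smaller than one of the inputs.
std::uint32_t uaddSatCmpSum(std::uint32_t a, std::uint32_t b) {
  std::uint32_t sum = a + b;  // wraps modulo 2^32 on overflow
  return sum < a ? std::numeric_limits<std::uint32_t>::max() : sum;
}

// Compare an operand against the inverted other operand: a + b overflows
// exactly when a > (max - b), and max - b equals ~b.
std::uint32_t uaddSatCmpNotVal(std::uint32_t a, std::uint32_t b) {
  return a > ~b ? std::numeric_limits<std::uint32_t>::max() : a + b;
}
```

Both forms compute the same result; an add that also produces an overflow flag lets either one be lowered without a separate compare.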

Differential Revision: https://reviews.llvm.org/D51929

llvm-svn: 342886

Revert "We allow implicit function declarations as an extension in all C dialects. Remove OpenCL special case."

Discussed on cfe-commits (Week-of-Mon-20180820), this change leads to
the generation of invalid IR for OpenCL without giving an error.
Therefore, the conclusion was to revert.

llvm-svn: 342885

[Mips][FastISel] Fix selectBranch on icmp i1

The r337288 tried to fix result of icmp i1 when its input is not sanitized
by falling back to DagISel. While it now produces the correct result for
bit 0, the other bits can still hold arbitrary value which is not supported
by MipsFastISel branch lowering. This patch fixes the issue by falling back
to DagISel in this case.

Patch by Dragan Mladjenovic.

Differential Revision: https://reviews.llvm.org/D52045

llvm-svn: 342884

[Clang][CodeGen][ObjC]: Fix CoreFoundation on ELF with `-fconstant-cfstrings`.

[Clang][CodeGen][ObjC]: Fix non-bridged CoreFoundation builds on ELF targets
that use `-fconstant-cfstrings`. The original changes from differential
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.

This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.

Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.

Differential Revision: https://reviews.llvm.org/D52344

llvm-svn: 342883

[PowerPC] Support operand modifier 'x' in inline asm

gcc uses operand modifier 'x' in inline asm for VSX registers.
Without this modifier, instructions which use VSX numbering for their
operands are printed as VMX registers. This patch adds support for the
operand modifier 'x'.

Differential Revision: https://reviews.llvm.org/D52244

llvm-svn: 342882

[dsymutil] Set LSan blacklist whenever sanitizers are enabled.

LSan can be enabled by itself or as part of the address sanitizer.
Rather than checking the enabled sanitizers for both, just set the LSan
env options whenever a sanitizer is enabled.

llvm-svn: 342881

[NFC][CodeGen][X86][AArch64] More tests for 'bit field extract' w/ constants

It would be best to introduce ISD::BitFieldExtract,
because clearly more than one backend faces the same problem.
But for now let's solve this in the x86-specific DAG combine.

https://bugs.llvm.org/show_bug.cgi?id=38938

llvm-svn: 342880

AMDGPU: Fix private handling for allowsMisalignedMemoryAccesses

If the alignment is at least 4, this should report true.

Something still seems off with how < 4-byte types are
handled here though.

Fixing this seems to change how some combines get
to where they get, but somehow isn't changing the net
result.

llvm-svn: 342879

Fix some missing opcodes in bcanalyzer

llvm-svn: 342878

[llvm-mca] Improve code comments in LSUnit.{h, cpp}. NFC

llvm-svn: 342877

Fix Wundef NDEBUG warning; NFC

Check for definedness of the NDEBUG macro rather than its value,
to be consistent with other uses.
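
A short sketch of the pattern, with hypothetical names: testing whether NDEBUG is defined behaves the same however the macro was supplied, whereas evaluating its value in an `#if` touches an undefined identifier in asserts-enabled builds, which is what -Wundef reports.

```cpp
// Check definedness, not value: this works for -DNDEBUG, -DNDEBUG=1, or no
// definition at all, and never evaluates an undefined macro.
#ifndef NDEBUG
constexpr bool kAssertionsEnabled = true;
#else
constexpr bool kAssertionsEnabled = false;
#endif

// Hypothetical helper: callers can branch on the build mode at compile time.
constexpr bool assertionsEnabled() { return kAssertionsEnabled; }
```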

llvm-svn: 342876

Add NativeProcessProtocol unit tests

Summary:
NativeProcessProtocol is an abstract class, but it still contains a
significant amount of code. Some of that code is tested via tests of
specific derived classes, but these tests don't run everywhere, as they
are OS and arch-specific. They are also relatively high-level, which
means some functionalities (particularly the failure cases) are
hard/impossible to test.

In this approach, I replace the abstract methods with mocks, which
allows me to inject failures into the lowest levels of breakpoint
setting code and test the class behavior in this situation.

Reviewers: zturner, teemperor

Subscribers: mgorny, lldb-commits

Differential Revision: https://reviews.llvm.org/D52152

llvm-svn: 342875

[ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33

A sequence of VMUL and VADD instructions always gives the same or better
performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33.
Executing the VMUL and VADD back-to-back requires the same cycles, but
having separate instructions allows scheduling to avoid the hazard between
these 2 instructions.

Differential Revision: https://reviews.llvm.org/D52289

llvm-svn: 342874

Revert r341932 "[ARM] Enable ARMCodeGenPrepare by default"

This caused miscompilation of WebRTC for Android: PR39060.

> We've had the pass enabled downstream for a couple of weeks and it
> seems to be okay, so enable it by default.
>
> Differential Revision: https://reviews.llvm.org/D51920

llvm-svn: 342873

[ARM][ARMLoadStoreOptimizer]

- The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers
- This is an UNPREDICTABLE instruction and shouldn't be done
- It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch
- This fixes https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52085

llvm-svn: 342872

[deadargelim] Update dbg.value of 'unused' parameters

DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to it. As a consequence the debug info
metadata in the final executable was wrong.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D51968

llvm-svn: 342871

[ARM] bottom-top mul support ARMParallelDSP

Originally committed in rL342210 but was reverted in rL342260 because
it was causing issues in vectorized code, because I had forgotten to
ensure that we're operating on scalar values.

Original commit message:

On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls which we
can then use smultb, smulbt, smultt with a wide load.

Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 342870

When running the ios/iossim prepare script show the script output when it returns with a non-zero exit code.

Summary:
Previously we'd just show the exception and not the output from the
executed script. This is unhelpful in the case that the script actually
reports some useful information on the failure.

Now we print the output and re-raise the exception.

Reviewers: kubamracek, george.karpenkov

Subscribers: #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D52350

llvm-svn: 342869

Fix the configuration of the Primary allocator for Darwin ARM64 by
changing the value of `SANITIZER_MMAP_RANGE_SIZE` to something more
sensible. The available VMA is at most 64GiB and not 256TiB that
was previously being used.

This change gives us several wins:

* Drastically improves LeakSanitizer performance on
  Darwin ARM64 devices. On a simple synthetic benchmark
  this took leak detection time from ~30 seconds to 0.5 seconds
  due to the `ForEachChunk(...)` method enumerating a much smaller
  number of regions. Previously we would pointlessly iterate
  over a large portion of the SizeClassAllocator32's ByteMap
  that could never be set due to it being configured for a much
  larger VM space than is actually available.

* Decreases the memory required for the Primary allocator.
  Previously the ByteMap inside the allocator used
  an array of pointers that took 512KiB of space. Now the required
  space for the array is 128 bytes.
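
The two array sizes quoted above can be reproduced with a little arithmetic. The layout parameters below (1 MiB regions, 4096 regions per second-level map, 8-byte pointers) are assumptions chosen for this sketch, and `firstLevelBytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Assumed layout parameters for the two-level byte map.
constexpr std::uint64_t kRegionSizeLog = 20;   // bytes per region: 1 MiB
constexpr std::uint64_t kSecondLevelLog = 12;  // regions per second-level map
constexpr std::uint64_t kPointerBytes = 8;     // 64-bit pointers

// Bytes used by the first-level array of pointers for a given VMA size.
constexpr std::uint64_t firstLevelBytes(std::uint64_t vmaBytes) {
  return ((vmaBytes >> kRegionSizeLog) >> kSecondLevelLog) * kPointerBytes;
}

constexpr std::uint64_t kOldRange = 1ULL << 48;  // 256 TiB
constexpr std::uint64_t kNewRange = 1ULL << 36;  // 64 GiB

static_assert(firstLevelBytes(kOldRange) == 512 * 1024, "512 KiB before");
static_assert(firstLevelBytes(kNewRange) == 128, "128 bytes after");
```

The range shrinks by a factor of 2^12, and the first-level array shrinks by the same factor.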

rdar://problem/43509428

Differential Revision: https://reviews.llvm.org/D51173

llvm-svn: 342868

[clangd] Force Dex to respect symbol collector flags

`Dex` should utilize `FuzzyFindRequest.RestrictForCodeCompletion` flags
and omit symbols not meant for code completion when asked for it.

The measurements below were conducted with setting
`FuzzyFindRequest.RestrictForCodeCompletion` to `true` (so that it's
more realistic). Sadly, the average latency goes up, I suspect that is
mostly because of the empty queries where the number of posting lists is
critical.

| Metrics | Before | After | Relative difference |
| ----- | ----- | ----- | ----- |
| Cumulative query latency (7000 `FuzzyFindRequest`s over LLVM static index) | 6182735043 ns | 7202442053 ns | +16% |
| Whole Index size | 81.24 MB | 81.79 MB | +0.6% |

Out of 292252 symbols collected from LLVM codebase 136926 appear to be
restricted for code completion.

Reviewers: ioeric

Differential Revision: https://reviews.llvm.org/D52357

llvm-svn: 342866

[llvm-exegesis] Fix PR39021.

Summary:
The `set` statements were incorrectly reading the value of the local variable and
setting the value of the parent variable.

Reviewers: tycho, gchatelet, john.brawn

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52343

llvm-svn: 342865

Fix llvm-diff anon-func.ll test

llvm-svn: 342864

Remove debug printf leftover from r342397

llvm-svn: 342863

[ARM][AArch64] Add feature +fp16fml

Armv8.4-A adds a few FP16 instructions that can optionally be implemented
in CPUs of Armv8.2-A and above.

This patch adds a feature to clang to permit selection of these
instructions. This interacts with the +fp16 option as follows:

Prior to Armv8.4-A:
*) +fp16fml implies +fp16
*) +nofp16 implies +nofp16fml

From Armv8.4-A:
*) The above conditions apply, additionally: +fp16 implies +fp16fml
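
One way to picture the implications listed above is the sketch below. It is not clang's option handling: `Fp16Features` and `resolveFp16` are hypothetical, both features are assumed to start disabled, and modifiers are assumed to be applied left to right.

```cpp
#include <string>
#include <vector>

struct Fp16Features {
  bool fp16 = false;
  bool fp16fml = false;
};

// Applies extension modifiers in order. isV84OrLater selects the additional
// rule that applies from Armv8.4-A onwards.
Fp16Features resolveFp16(const std::vector<std::string> &modifiers,
                         bool isV84OrLater) {
  Fp16Features f;
  for (const std::string &m : modifiers) {
    if (m == "+fp16fml") {
      f.fp16fml = true;
      f.fp16 = true;  // +fp16fml implies +fp16
    } else if (m == "+nofp16fml") {
      f.fp16fml = false;
    } else if (m == "+fp16") {
      f.fp16 = true;
      if (isV84OrLater) f.fp16fml = true;  // from Armv8.4-A only
    } else if (m == "+nofp16") {
      f.fp16 = false;
      f.fp16fml = false;  // +nofp16 implies +nofp16fml
    }
  }
  return f;
}
```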

Patch by Bernard Ogden.

Differential Revision: https://reviews.llvm.org/D50229

llvm-svn: 342862

Add inherited attributes before parsed attributes.

Currently, attributes from previous declarations ('inherited attributes')
are added to the end of a declaration's list of attributes. Before
r338800, the attribute list was in reverse. r338800 changed the order
of non-inherited (parsed from the current declaration) attributes, but
inherited attributes are still appended to the end of the list.

This patch appends inherited attributes after other inherited
attributes, but before any non-inherited attribute. This is to make the
order of attributes in the AST correspond to the order in the source
code.
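
The insertion rule can be sketched with a plain vector. `Attr`, `addParsedAttr` and `addInheritedAttr` are hypothetical stand-ins for the real AST classes, and the sketch assumes inherited attributes always form a prefix of the list.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

struct Attr {
  std::string name;
  bool inherited;
};

// Attributes parsed from the current declaration go at the end.
void addParsedAttr(std::vector<Attr> &attrs, std::string name) {
  attrs.push_back(Attr{std::move(name), false});
}

// Inherited attributes go after the other inherited ones but before the
// first non-inherited attribute.
void addInheritedAttr(std::vector<Attr> &attrs, std::string name) {
  auto firstParsed =
      std::find_if(attrs.begin(), attrs.end(),
                   [](const Attr &a) { return !a.inherited; });
  attrs.insert(firstParsed, Attr{std::move(name), true});
}
```

With this rule, attributes from an earlier declaration precede those of the current one, mirroring source order.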

Differential Revision: https://reviews.llvm.org/D50214

llvm-svn: 342861

[X86] Add 512-bit test cases to setcc-wide-types.ll. NFC

llvm-svn: 342860

[XRay] Clean up XRay build configuration

Summary:
This change spans both LLVM and compiler-rt, where we do the following:

- Add XRay to the LLVMBuild system, to allow for distributing the XRay
trace loading library along with the LLVM distributions.

- Use `llvm-config` better in the compiler-rt XRay implementation, to
depend on the potentially already-distributed LLVM XRay library.

While this is tested with the standalone compiler-rt build, it does
require that the LLVMXRay library (and LLVMSupport as well) are
available during the build. In case the static libraries are available,
the unit tests will build and work fine. We're still having issues with
attempting to use a shared library version of the LLVMXRay library since
the shared library might not be accessible from the standard shared
library lookup paths.

The larger change here is the inclusion of the LLVMXRay library in the
distribution, which allows for building tools around the XRay traces and
profiles that the XRay runtime already generates.

Reviewers: echristo, beanz

Subscribers: mgorny, hiraditya, mboerger, llvm-commits

Differential Revision: https://reviews.llvm.org/D52349

llvm-svn: 342859

Fix asserts when linking wrong address space declarations

llvm-svn: 342858

llvm-diff: Fix crash on anonymous functions

Not sure what the correct behavior is for this.
Skip them and report how many there were.

llvm-svn: 342857

[DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTOR

This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015.

llvm-svn: 342856

[libcxx] Fix the binder deprecation tests on Clang 5.

Tested on Docker containers with Clang 4, 5 and 6.

llvm-svn: 342855

[libcxx] Fix buildbots on Debian

Debian build bots are running Clang 4, which apparently does not support
the "deprecated" attribute properly. Clang pretends to support the attribute,
but the attribute doesn't do anything.

(live example: https://wandbox.org/permlink/0De69aXns0t1D59r)

On a separate note, I'm not sure I understand why we're even running the
libc++ tests under Clang-4. Is this a configuration we support? I can
understand that libc++ should _build_ with Clang 4, but it's not clear
to me that new libc++ headers should be usable under older compilers
like that.

llvm-svn: 342854

[ORC] Add some debugging output to Core.h/Core.cpp

Core now logs when materialization units are dispatched or return to JITDylibs.

llvm-svn: 342853

[X86] Split WriteShift/WriteRotate schedule classes by CL usage.

Variable Shifts/Rotates using the CL register have different behaviours to the immediate instructions - split accordingly to help remove yet more repeated overrides from the schedule models.

llvm-svn: 342852

[DAGCombiner] Clarify a comment. NFC

This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed.

llvm-svn: 342851

[LegalizeTypes] Fix bad indentation. NFC

llvm-svn: 342850

[libcxx] Document new symbols __u64toa and __u32toa on Darwin

Summary:
This is the counterpart for https://reviews.llvm.org/D50130 and
https://reviews.llvm.org/D52391 on Darwin.

Reviewers: EricWF

Subscribers: christof, dexonsmith, cfe-commits, libcxx-commits, lichray

Differential Revision: https://reviews.llvm.org/D52396

llvm-svn: 342849

[X86] Remove unnecessary WriteRotate override. NFCI.

SNB was the last override for ROT(L|R)r(1|i) - they now all use WriteRotate correctly.

llvm-svn: 342848

Fix line ending mismatches. NFCI.

llvm-svn: 342847

[X86] ROR*mCL instruction models should match ROL*mCL etc.

Confirmed with Craig Topper - fix a typo that was missing a Port4 uop for ROR*mCL instructions on some Intel models.

Yet another step on the scheduler model cleanup marathon......

llvm-svn: 342846

[Aarch64] Fix memcpy that was copying 4x too many bytes

Found by asan.

llvm-svn: 342845

[DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation

This is an alternative to https://reviews.llvm.org/D37896. We can't decompose
multiplies generically without a target hook to tell us when it's profitable.

ARM and AArch64 may be able to remove some existing code that overlaps with
this transform.

This extends D52195 and may resolve PR34474:
https://bugs.llvm.org/show_bug.cgi?id=34474
(still an open question about transforming legal vector multiplies, but we
could open another bug report for those)
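
To make the arithmetic concrete, here is a sketch of the identities behind such a decomposition, assuming the extension covers constants of the form -(2^n - 1) and -(2^n + 1). The function names are hypothetical, and unsigned types are used so that wraparound is well defined.

```cpp
#include <cstdint>

// x * -(2^n - 1)  ==  x - (x << n); here n = 3, so the multiplier is -7.
std::uint32_t mulByNeg7(std::uint32_t x) { return x - (x << 3); }

// x * -(2^n + 1)  ==  -((x << n) + x); here n = 3, so the multiplier is -9.
std::uint32_t mulByNeg9(std::uint32_t x) { return 0u - ((x << 3) + x); }
```

Each form replaces the multiply with one shift plus one or two add/subtract operations, which is why profitability depends on the target.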

llvm-svn: 342844

[libc++] Add deprecated attributes to many deprecated components

Summary:
These deprecation warnings are opt-in: they are only enabled when the
_LIBCXX_DEPRECATION_WARNINGS macro is defined, which is not the case
by default. Note that this is a first step in the right direction, but
I wasn't able to get an exhaustive list of all deprecated components
per standard, so there's certainly stuff that's missing. The list of
components this commit marks as deprecated is:

in C++11:
- auto_ptr, auto_ptr_ref
- binder1st, binder2nd, bind1st(), bind2nd()
- pointer_to_unary_function, pointer_to_binary_function, ptr_fun()
- mem_fun_t, mem_fun1_t, const_mem_fun_t, const_mem_fun1_t, mem_fun()
- mem_fun_ref_t, mem_fun1_ref_t, const_mem_fun_ref_t, const_mem_fun1_ref_t, mem_fun_ref()

in C++14:
- random_shuffle()

in C++17:
- unary_negate, binary_negate, not1(), not2()

<rdar://problem/18168350>

Reviewers: mclow.lists, EricWF

Subscribers: christof, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48912

llvm-svn: 342843

[X86] Added missing RCL/RCR schedule overrides to the generic SNB model

The SandyBridge model was missing schedule values for the RCL/RCR values - instead using the (incredibly optimistic) WriteShift (now WriteRotate) defaults.

I've added overrides with more realistic (slow) values, based on a mixture of Agner/instlatx64 numbers and what later Intel models do as well.

This is necessary to allow WriteRotate to be updated to remove other rotate overrides.

It'd probably be a good idea to investigate a WriteRotateCarry class at some point but it's not high priority given the unusualness of these instructions.

llvm-svn: 342842

[X86] Remove unnecessary WriteRotate overrides. NFCI.

llvm-svn: 342841