review.tizen.org Git - platform/upstream/llvm.git/log

ELF: Create synthetic sections for loadable partitions.

We create several types of synthetic sections for loadable partitions, including:
- The dynamic symbol table. This allows code outside of the loadable partitions
  to find entry points with dlsym.
- Creating a dynamic symbol table also requires the creation of several other
  synthetic sections for the partition, such as the dynamic table and hash table
  sections.
- The partition's ELF header is represented as a synthetic section in the
  combined output file, and will be used by llvm-objcopy to extract partitions.

Differential Revision: https://reviews.llvm.org/D62350

llvm-svn: 362819

llvm-objcopy: Implement --extract-partition and --extract-main-partition.

This implements the functionality described in
https://lld.llvm.org/Partitions.html. It works as follows:

- Reads the section headers using the ELF header at file offset 0;
- If extracting a loadable partition:
  - Finds the section containing the required partition ELF header by looking it up in the section table;
  - Reads the ELF and program headers from the section.
- If extracting the main partition:
  - Reads the ELF and program headers from file offset 0.
- Filters the section table according to which sections are in the program headers that it read:
  - If ParentSegment != nullptr or section is not SHF_ALLOC, then it goes in.
  - Sections containing partition ELF headers or program headers are excluded as there are no headers for these in ordinary ELF files.

Differential Revision: https://reviews.llvm.org/D62364

llvm-svn: 362818

AMDGPU: Fix MIR test verifier error

llvm-svn: 362817

Revert rL362792 : [Support][Test] Time profiler: add regression test

Summary:
Add output to `llvm::errs()` when `-ftime-trace` option is enabled,
add regression test checking this option works as expected.

Reviewers: thakis, aganea

Subscribers: cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61914
........
Breaks buildbots - @anton-afanasyev please can you take a look?

llvm-svn: 362816

[dsymutil] Use the number of threads specified.

Before this patch we used either a single thread, or the number of
hardware threads available, effectively ignoring the number of threads
specified on the command line.

llvm-svn: 362815

[ARM] Add ACLE feature macros for MVE.

Fixup uninitialised variable.

llvm-svn: 362814

[docs]Move llvm-readobj from "Developer Tools" to "Basic Commands"

On the Command Guide page, there are multiple sections with links to the
different documentation pages available for LLVM tools. The "Basic
Tools" section includes tools like llvm-objdump, llvm-nm and so on. The
"Developer Tools" section contains things like FileCheck and lit. This
change moves llvm-readobj into the former block, from the latter.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D63011

llvm-svn: 362813

AST Matchers tutorial requests to enable clang-tools-extra. NFC

Otherwise the examples do not build.

llvm-svn: 362812

[clangd] Return empty results on spurious completion triggers

Summary:
We currently return an error, this causes `coc.nvim` and VSCode to
show an error message in the logs.

Returning empty list of completions allows to avod the error message
without altering other user-visible behavior.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D62999

llvm-svn: 362811

[Analysis] simplify code for getSplatValue(); NFC

AFAIK, this is only currently called by TTI, but it could be
used from instcombine or CGP to help solve problems like:
https://bugs.llvm.org/show_bug.cgi?id=37428
https://bugs.llvm.org/show_bug.cgi?id=42174

llvm-svn: 362810

Attempt to fix nm-archive.test after r362798

llvm-lib now needs a `target triple` for bitcode, so add a new file
that's like trivial.ll but has one, and use that in the test.
(trivial.ll had a comment that looked like it wasn't supposed to be used
in tests directly, so I don't want to change that file.)

llvm-svn: 362809

Build with _XOPEN_SOURCE defined on AIX

Summary:
It is useful to build with _XOPEN_SOURCE defined on AIX, enabling X/Open
and POSIX compatibility mode, to work around stray macros and other
bugs in the headers provided by the system and build compiler.

This patch adds the config to cmake to build with _XOPEN_SOURCE defined
on AIX with a few exceptions. Google Test internals require access to
platform specific thread info constructs on AIX so in that case we build
with _ALL_SOURCE defined instead. Libclang also uses header which needs
_ALL_SOURCE on AIX so we leave that as is as well.

We also add building on AIX with the large file API and doing CMake
header checks with X/OPEN definitions so the results are consistent with
the environment that will be present in the build.

Reviewers: hubert.reinterpretcast, xingxue, andusy

Reviewed By: hubert.reinterpretcast

Subscribers: mgorny, jsji, cfe-commits, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D62533

llvm-svn: 362808

[ARM] Add ACLE feature macros for MVE

If MVE is present at all, then the macro __ARM_FEATURE_MVE is defined
to a value which has bit 0 set for integer MVE, and bit 1 set for
floating-point MVE.

(Floating-point MVE implies integer MVE, so if this macro is defined
at all then it will be set to 1 or 3, never 2.)

Patch mostly by Simon Tatham

Differential Revision: https://reviews.llvm.org/D60710

llvm-svn: 362806

[MachineScheduler] checkResourceLimit boundary condition update

When we call checkResourceLimit in bumpCycle or bumpNode, and we
know the resource count has just reached the limit (the equations
are equal). We should return true to mark that we are resource
limited for next schedule, or else we might continue to schedule
in favor of latency for 1 more schedule and create a schedule that
actually overbook the resource.

When we call checkResourceLimit to estimate the resource limite before
scheduling, we don't need to return true even if the equations are
equal, as it shouldn't limit the schedule for it .

Differential Revision: https://reviews.llvm.org/D62345

llvm-svn: 362805

[CMake] Add special case for processing LLDB_DOTEST_ARGS

Summary:
Allow to run the test suite when building LLDB Standalone with Xcode against a provided LLVM build-tree that used a single-configuration generator like Ninja.

So far both test drivers, lit-based `check-lldb` as well as `lldb-dotest`, were looking for test dependencies (clang/dsymutil/etc.) in a subdirectory with the configuration name (Debug/Release/etc.). It was implicitly assuming that both, LLDB and the provided LLVM used the same generator. In practice, however, the opposite is quite common: build the dependencies with Ninja and LLDB with Xcode for development*. With this patch it becomes the default.

(* In fact, it turned out that the Xcode<->Xcode variant didn't even build out of the box. It's fixed since D62879)

Once this is sound, I'm planning the following steps:
* add stage to the [lldb-cmake-standalone bot](http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake-standalone/) to build and test it
* update `Building LLDB with Xcode` section in the docs
* bring the same mechanism to swift-lldb
* fade out the manually maintained Xcode project

On macOS build and test like this:
```
$ git clone https://github.com/llvm/llvm-project.git /path/to/llvm-project

$ cd /path/to/lldb-dev-deps
$ cmake -GNinja -C/path/to/llvm-project/lldb/cmake/caches/Apple-lldb-macOS.cmake -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi" /path/to/llvm-project/llvm
$ ninja

$ cd /path/to/lldb-dev-xcode
$ cmake -GXcode -C/path/to/llvm-project/lldb/cmake/caches/Apple-lldb-macOS.cmake -DLLDB_BUILD_FRAMEWORK=Off -DLLVM_DIR=/path/to/lldb-dev-deps/lib/cmake/llvm -DClang_DIR=/path/to/lldb-dev-deps/lib/cmake/clang /path/to/llvm-project/lldb
$ xcodebuild -configuration Debug -target check-lldb
$ xcodebuild -configuration Debug -target lldb-dotest
$ ./Debug/bin/lldb-dotest
```

Reviewers: JDevlieghere, jingham, xiaobai, stella.stamenova, labath

Reviewed By: stella.stamenova

Subscribers: labath, mgorny, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D62859

llvm-svn: 362803

test-commit

llvm-svn: 362802

[NFC] Added tests for D63004

llvm-svn: 362801

TailDuplicator: Remove no-op analyzeBranch call

This could fail, which looked concerning. However nothing was actually
using the results of this. I assume this was intended to use the
anti-feature of analyzeBranch of removing instructions, but wasn't
actually calling it with AllowModify = true.

Fixes bug 42162.

llvm-svn: 362800

[NFC] Don't export helpers of ConstantFoldCall

llvm-svn: 362799

llvm-lib: Disallow mixing object files with different machine types

lib.exe doesn't allow creating .lib files with object files that have
differing machine types. Update llvm-lib to match.

The motivation is to make it possible to infer the machine type of a
.lib file in lld, so that it can warn when e.g. a 32-bit .lib file is
passed to a 64-bit link (PR38965).

Fixes PR38782.

Differential Revision: https://reviews.llvm.org/D62913

llvm-svn: 362798

[x86] narrow extract subvector of vector select

This is a potentially large perf win for AVX1 targets because of the way we
auto-vectorize to 256-bit but then expect the backend to legalize/optimize
for the half-implemented AVX1 ISA.

On the motivating example from PR37428 (even though this patch doesn't solve
the vector shift issue):
https://bugs.llvm.org/show_bug.cgi?id=37428
...there's a 16% speedup when compiling with "-mavx" (perf tested on Haswell)
because we eliminate the remaining 256-bit vblendv ops.

I added comments on a couple of tests that require further work. If we have
256-bit logic ops separating the vselect and extract, we should probably narrow
everything to 128-bit, but that requires a larger pattern match.

Differential Revision: https://reviews.llvm.org/D62969

llvm-svn: 362797

gn build: Merge r362766

llvm-svn: 362796

gn build: Merge r362774

llvm-svn: 362795

gn build: Run `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format`

llvm-svn: 362794

[ELF][AArch64] Support for BTI and PAC

Branch Target Identification (BTI) and Pointer Authentication (PAC) are
architecture features introduced in v8.5a and 8.3a respectively. The new
instructions have been added in the hint space so that binaries take
advantage of support where it exists yet still run on older hardware. The
impact of each feature is:

BTI: For executable pages that have been guarded, all indirect branches
must have a destination that is a BTI instruction of the appropriate type.
For the static linker, this means that PLT entries must have a "BTI c" as
the first instruction in the sequence. BTI is an all or nothing
property for a link unit, any indirect branch not landing on a valid
destination will cause a Branch Target Exception.

PAC: The dynamic loader encodes with PACIA the address of the destination
that the PLT entry will load from the .plt.got, placing the result in a
subset of the top-bits that are not valid virtual addresses. The PLT entry
may authenticate these top-bits using the AUTIA instruction before
branching to the destination. Use of PAC in PLT sequences is a contract
between the dynamic loader and the static linker, it is independent of
whether the relocatable objects use PAC.

BTI and PAC are independent features that can be combined. So we can have
several combinations of PLT:
- Standard with no BTI or PAC
- BTI PLT with "BTI c" as first instruction.
- PAC PLT with "AUTIA1716" before the indirect branch to X17.
- BTIPAC PLT with "BTI c" as first instruction and "AUTIA1716" before the
first indirect branch to X17.

The use of BTI and PAC in relocatable object files are encoded by feature
bits in the .note.gnu.property section in a similar way to Intel CET. There
is one AArch64 specific program property GNU_PROPERTY_AARCH64_FEATURE_1_AND
and two target feature bits defined:
- GNU_PROPERTY_AARCH64_FEATURE_1_BTI
-- All executable sections are compatible with BTI.
- GNU_PROPERTY_AARCH64_FEATURE_1_PAC
-- All executable sections have return address signing enabled.

Due to the properties of FEATURE_1_AND the static linker can tell when all
input relocatable objects have the BTI and PAC feature bits set. The static
linker uses this to enable the appropriate PLT sequence.
Neither -> standard PLT
GNU_PROPERTY_AARCH64_FEATURE_1_BTI -> BTI PLT
GNU_PROPERTY_AARCH64_FEATURE_1_PAC -> PAC PLT
Both properties -> BTIPAC PLT

In addition to the .note.gnu.properties there are two new command line
options:
--force-bti : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_BTI and warn for every relocatable object
that does not.
--pac-plt : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_PAC. As PAC is a contract between the loader
and static linker no warning is given if it is not present in an input.

Two processor specific dynamic tags are used to communicate that a non
standard PLT sequence is being used.
DTI_AARCH64_BTI_PLT and DTI_AARCH64_BTI_PAC.

Differential Revision: https://reviews.llvm.org/D62609

llvm-svn: 362793

[Support][Test] Time profiler: add regression test

Summary:
Add output to `llvm::errs()` when `-ftime-trace` option is enabled,
add regression test checking this option works as expected.

Reviewers: thakis, aganea

Subscribers: cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61914

llvm-svn: 362792

[ARM] Fix bugs introduced by the fp64/d32 rework.

Change D60691 caused some knock-on failures that weren't caught by the
existing tests. Firstly, selecting a CPU that should have had a
restricted FPU (e.g. `-mcpu=cortex-m4`, which should have 16 d-regs
and no double precision) could give the unrestricted version, because
`ARM::getFPUFeatures` returned a list of features including subtracted
ones (here `-fp64`,`-d32`), but `ARMTargetInfo::initFeatureMap` threw
away all the ones that didn't start with `+`. Secondly, the
preprocessor macros didn't reliably match the actual compilation
settings: for example, `-mfpu=softvfp` could still set `__ARM_FP` as
if hardware FP was available, because the list of features on the cc1
command line would include things like `+vfp4`,`-vfp4d16` and clang
didn't realise that one of those cancelled out the other.

I've fixed both of these issues by rewriting `ARM::getFPUFeatures` so
that it returns a list that enables every FP-related feature
compatible with the selected FPU and disables every feature not
compatible, which is more verbose but means clang doesn't have to
understand the dependency relationships between the backend features.
Meanwhile, `ARMTargetInfo::handleTargetFeatures` is testing for all
the various forms of the FP feature names, so that it won't miss cases
where it should have set `HW_FP` to feed into feature test macros.

That in turn caused an ordering problem when handling `-mcpu=foo+bar`
together with `-mfpu=something_that_turns_off_bar`. To fix that, I've
arranged that the `+bar` suffixes on the end of `-mcpu` and `-march`
cause feature names to be put into a separate vector which is
concatenated after the output of `getFPUFeatures`.

Another side effect of all this is to fix a bug where `clang -target
armv8-eabi` by itself would fail to set `__ARM_FEATURE_FMA`, even
though `armv8` (aka Arm v8-A) implies FP-Armv8 which has FMA. That was
because `HW_FP` was being set to a value including only the `FPARMV8`
bit, but that feature test macro was testing only the `VFP4FPU` bit.
Now `HW_FP` ends up with all the bits set, so it gives the right
answer.

Changes to tests included in this patch:

* `arm-target-features.c`: I had to change basically all the expected
  results. (The Cortex-M4 test in there should function as a
  regression test for the accidental double-precision bug.)
* `arm-mfpu.c`, `armv8.1m.main.c`: switched to using `CHECK-DAG`
  everywhere so that those tests are no longer sensitive to the order
  of cc1 feature options on the command line.
* `arm-acle-6.5.c`: been updated to expect the right answer to that
  FMA test.
* `Preprocessor/arm-target-features.c`: added a regression test for
  the `mfpu=softvfp` issue.

Reviewers: SjoerdMeijer, dmgreen, ostannard, samparker, JamesNagurne

Reviewed By: ostannard

Subscribers: srhines, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D62998

llvm-svn: 362791

[RISCV] Support Bit-Preserving FP in F/D Extensions

Summary:
This allows some integer bitwise operations to instead be performed by
hardware fp instructions. This is correct because the RISC-V spec
requires the F and D extensions to use the IEEE-754 standard
representation, and fp register loads and stores to be bit-preserving.

This is tested against the soft-float ABI, but with hardware float
extensions enabled, so that the tests also ensure the optimisation also
fires in this case.

Reviewers: asb, luismarques

Reviewed By: asb

Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62900

llvm-svn: 362790

[AMDGPU] Constrain the AMDGPU inliner on maximum number of basic blocks in a caller function (compile time performance)

Differential revision: https://reviews.llvm.org/D62917

llvm-svn: 362789

[ELF] Delete R_PPC64_CALL_PLT from isRelExpr()

It was added by D46654 but is actually never used.
R_PPC64_CALL_PLT (was: R_PPC_CALL_PLT) is a static link-time constant.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62994

llvm-svn: 362788

[X86][test] Add test cases using immediates to builtins-x86.c

These builtins should work with immediate or variable shift operand for
gcc compatibility.

Differential Revision: https://reviews.llvm.org/D62850

llvm-svn: 362786

[CodeComplete] Improve overload handling for C++ qualified and ref-qualified methods.

Summary:
- when a method is not available because of the target value kind (e.g. an &&
  method on a Foo& variable), then don't offer it.
- when a method is effectively shadowed by another method from the same class
  with a) an identical argument list and b) superior qualifiers, then don't
  offer it.

Reviewers: ilya-biryukov

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D62582

llvm-svn: 362785

Fix some signed/unsigned comparison warnings

llvm-svn: 362784

DWARF: Simplify SymbolFileDWARF::GetDWARFCompileUnit

Summary:
The DWARFCompileUnit is set as the "user data" of the lldb compile unit
directly in the constructor (see ParseCompileUnit).

This means that instead of going through unit indexes, we can just fetch
the DWARF unit directly from there.

Reviewers: clayborg, JDevlieghere

Subscribers: aprantl, jdoerfert, lldb-commits

Differential Revision: https://reviews.llvm.org/D62943

llvm-svn: 362783

Work around a circular dependency between IR and MC introduced in r362735

I replaced the circular library dependency with a forward declaration,
but it is only a workaround, not a real fix.

llvm-svn: 362782

[X86] -march=cooperlake (clang)

Support intel -march=cooperlake in clang

Patch by Shengchen Kan (skan)

Differential Revision: https://reviews.llvm.org/D62835

llvm-svn: 362781

[AArch64][AsmParser] error on unexpected SVE predicate type suffix

Summary:
This patch fixes a bug in the assembler that permitted a type suffix on
predicate registers when not expected. For instance, the following was
previously valid:

    faddv h0, p0.q, z1.h

This bug was present in all SVE instructions containing predicates with
no type suffix and no predication form qualifier, i.e. /z or /m. The
latter instructions are already caught with an appropiate error message
by the assembler, e.g.:

            .text
    <stdin>:1:13: error: not expecting size suffix
    cmpne p1.s, p0.b/z, z2.s, 0
                ^

A similar issue for SVE vector registers was fixed in:

  https://reviews.llvm.org/D59636

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62942

llvm-svn: 362780

[AArch64][AsmParser] Provide better diagnostics for SVE predicates

Patch by Sander de Smalen (sdesmalen)

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62941

llvm-svn: 362779

[llvm-objcopy] - Emit error and don't crash if program header reaches past end of file.

This is https://bugs.llvm.org/show_bug.cgi?id=42122.

If an object file has a size less than program header's file [offset + size]
(i.e. if we have overflow), llvm-objcopy crashes instead of reporting a
error.

The patch fixes this issue.

Differential revision: https://reviews.llvm.org/D62898

llvm-svn: 362778

[yaml2elf] - Refactoring followup for D62809

This is a refactoring follow-up for D62809
"Change how we handle implicit sections.".
It allows to simplify the code.

Differential revision: https://reviews.llvm.org/D62912

llvm-svn: 362777

[X86] -march=cooperlake (llvm)

Support intel -march=cooperlake in llvm

Patch by Shengchen Kan (skan)

Differential Revision: https://reviews.llvm.org/D62836

llvm-svn: 362776

Fix for lld buildbot

Removed unused (in non-debug builds) variable.

llvm-svn: 362775

[CodeGen] Generic Hardware Loop Support

Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.

Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774

[AVR] Expand 16-bit rotations during the legalization stage

In r356860, the legalization logic for BSWAP was modified to ISD::ROTL,
rather than the old ISD::{SHL, SRL, OR} nodes.

This works fine on AVR for 8-bit rotations, but 16-bit rotations are
currently unimplemented - they always trigger an assertion error in the
AVRExpandPseudoInsts pass ("RORW unimplemented").

This patch instructions the legalizer to expand 16-bit rotations into
the previous SHL, SRL, OR pattern it did previously.

This fixes the 'issue-cannot-select-bswap.ll' test. Interestingly, this
test failure seems flaky - it passes successfully on the avr-build-01
buildbot, but fails locally on my Arch Linux install.

llvm-svn: 362773

[NFC] Delete trailing whitespace character.

llvm-svn: 362772

[llvm-objdump] Print source when subsequent lines in the translation unit come from the same line in two different headers.

Reviewers: grimar, rupprecht, jhenderson

Reviewed By: grimar, jhenderson

Subscribers: llvm-commits, jhenderson

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62461

llvm-svn: 362771

[lld] Allow args::getInterger to parse args larger than 2^31-1

Differential Revision: https://reviews.llvm.org/D62933

llvm-svn: 362770

[WebAssembly] Fix for discarded init functions

When a function is excluded via comdat we shouldn't add it to the
final list of init functions.

Differential Revision: https://reviews.llvm.org/D62983

llvm-svn: 362769

[llvm-objdump] Add warning if --disassemble-functions specifies an unknown symbol

Summary: Fixes Bug 41904 https://bugs.llvm.org/show_bug.cgi?id=41904

Reviewers: jhenderson, rupprecht, grimar, MaskRay

Reviewed By: jhenderson, rupprecht, MaskRay

Subscribers: dexonsmith, rupprecht, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62275

llvm-svn: 362768

[MC][ELF] Don't create relocations with section symbols for STB_LOCAL ifunc

We should keep the symbol type (STT_GNU_IFUNC) for a local ifunc because
it may result in an IRELATIVE reloc that the dynamic loader will use to
resolve the address at startup time.

There is another problem that is not fixed by this patch: a PC relative
relocation should also create a relocation with the ifunc symbol.

llvm-svn: 362767

[ADT] Enable set_difference() to be used on StringSet

Subscribers: mgorny, mgrang, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62992

llvm-svn: 362766

Set an output file name for the override-new-delete.cpp test.

The android_compile.py script requires one.

llvm-svn: 362764

[NFC] Test commit.

llvm-svn: 362763

[LV] Fix -Wunused-function after r362736

llvm-svn: 362762

AMDGPU: Don't count mask branch pseudo towards skip threshold

llvm-svn: 362761

AMDGPU: Insert skips for blocks with FLAT

This already forced a skip for VMEM, so it should also be done for
flat. I'm somewhat skeptical about the benefit of this though.

llvm-svn: 362760

[PowerPC] Exploit the vector min/max instructions

Use the PPC vector min/max instructions for computing the corresponding
operation as these should be faster than the compare/select sequences
we currently emit.

Differential revision: https://reviews.llvm.org/D47332

llvm-svn: 362759

Change GWP-ASan build to use '-pthread' instead of '-lpthread' in order
to try and fix android buildbot. Also make sure that the empty dummy
test contains an output file name so the android_build.py wrapper script
doesn't check fail.

llvm-svn: 362758

Factor out duplicated code building a MemberExpr and marking it
referenced.

This reinstates r362563, reverted in r362597.

llvm-svn: 362757

Convert MemberExpr creation and serialization to work the same way as
most / all other Expr subclasses.

This reinstates r362551, reverted in r362597, with a fix to a bug that
caused MemberExprs to sometimes have a null FoundDecl after a round-trip
through an AST file.

llvm-svn: 362756

Revert [ELF] Simplify the condition to create .interp

This reverts r362355 (git commit c78c999a9cd7a77b9d13c610c9faebac5d560a55)

This causes some internal tests to fail; details provided offthread.

llvm-svn: 362755

AMDGPU: Insert skip branches over return blocks

SIInsertSkips really doesn't understand the control flow, and makes
very stupid assumptions about the block layout. This was able to get
away with not skipping return blocks, since usually after
structurization there is only one placed at the end of the
function. Tail duplication can break this assumption.

llvm-svn: 362754

[NFC] Test commit, whitespace change

As per the Developer Policy, upon obtaining commit access.

llvm-svn: 362753

[NFC][CodeGen] Add unary fneg tests to X86/fma4-intrinsics-x86.ll

llvm-svn: 362752

[DebugInfo] Incorrect debug info record generated for loop counter.

Incorrect Debug Variable Range was calculated while "COMPUTING LIVE DEBUG VARIABLES" stage.
Range for Debug Variable("i") computed according to current state of instructions
inside of basic block. But Register Allocator creates new instructions which were not taken
into account when Live Debug Variables computed. In the result DBG_VALUE instruction for
the "i" variable was put after these newly inserted instructions. This is incorrect.
Debug Value for the loop counter should be inserted before any loop instruction.

Differential Revision: https://reviews.llvm.org/D62650

llvm-svn: 362750

[AMDGPU] Partial revert for the ba447bae7448435c9986eece0811da1423972fdd

   "Divergence driven ISel. Assign register class for cross block values
       according to the divergence."
       that discovered the design flaw leading to several issues that
       required to be solved before.

       This change reverts AMDGPU specific changes and keeps common part
       unaffected.

llvm-svn: 362749

[NFC][CodeGen] Add unary fneg tests to X86/fma-intrinsics-x86.ll

llvm-svn: 362748

[X86] Make a bunch of merge masked binops commutable for loading folding.

This primarily affects add/fadd/mul/fmul/and/or/xor/pmuludq/pmuldq/max/min/fmaxc/fminc/pmaddwd/pavg.

We already commuted the unmasked and zero masked versions.

I've added 512-bit stack folding tests for most of the instructions
affected. I've tested needing commuting and not commuting across
unmasked, merged masked, and zero masked. The 128/256 bit instructions
should behave similarly.

llvm-svn: 362746

Add cdb test for global constants

Summary: This creates an integration test for global constants

Reviewers: rnk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62974

llvm-svn: 362745

Revert "Revert "[ELF] Suppress "STT_SECTION symbol should be defined" on .eh_frame, .debug*, .zdebug* and .gcc_except_table""

This reverts commit f49f58527a6d8147524d8d6f2eb1feb70f856292.

llvm-svn: 362744

Revert "Revert "Reland D61583 [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded""

This reverts commit 729111cf1824159bb4dd331cab8a829eab30313f.

Reverting the previous commit breaks other LLD buildbots.

llvm-svn: 362743

[InstSimplify] add tests for fcmp with known-never-nan operands; NFC

llvm-svn: 362742

[NFC][CodeGen] Add unary fneg tests to X86/fma-scalar-combine.ll

llvm-svn: 362741

clang-format: better handle namespace macros

Summary:
Other macros are used to declare namespaces, and should thus be handled
similarly. This is the case for crpcut's TESTSUITE macro, or for
unittest-cpp's SUITE macro:

      TESTSUITE(Foo) {
      TEST(MyFirstTest) {
        assert(0);
      }
      } // TESTSUITE(Foo)

This patch deals with this cases by introducing a new option to specify
lists of namespace macros. Internally, it re-uses the system already in
place for foreach and statement macros, to ensure there is no impact on
performance.

Reviewers: krasimir, djasper, klimek

Reviewed By: klimek

Subscribers: acoomans, cfe-commits, klimek

Tags: #clang

Differential Revision: https://reviews.llvm.org/D37813

llvm-svn: 362740

Revert "Reland D61583 [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded"

This reverts commit 5d3b3188f722456a6470c7effcacf17656406429.

Breaks the PowerPC multi-stage buildbot.

llvm-svn: 362739

Revert "[ELF] Suppress "STT_SECTION symbol should be defined" on .eh_frame, .debug*, .zdebug* and .gcc_except_table"

This reverts commit dcba4828a9ead5f5b1fa27f0853823618075d0e0.

This commit builds on dcba4828a9ead5f5b1fa27f0853823618075d0e0 which breaks the
multi-staged PowerPC buildbot.

llvm-svn: 362738

[CFLGraph] Add support for unary fneg instruction.

Differential Revision: https://reviews.llvm.org/D62791

llvm-svn: 362737

[LV] Wrap LV illegality reporting in a function. NFC.

A function for loop vectorization illegality reporting has been
introduced:

void LoopVectorizationLegality::reportVectorizationFailure(
const StringRef DebugMsg, const StringRef OREMsg,
const StringRef ORETag, Instruction * const I) const;

The function prints a debug message when the debug for the compilation
unit is enabled as well as invokes the optimization report emitter to
generate a message with a specified tag. The function doesn't cover any
complicated logic when a custom lambda should be passed to the emitter,
only generating a message with a tag is supported.

The function always prints the instruction `I` after the debug message
whenever the instruction is specified, otherwise the debug message
ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.'

Patch by Pavel Samolysov <samolisov@gmail.com>

llvm-svn: 362736

[AIX] Implement function descriptor on SDAG

Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
* A function descriptor (Name)
* A function entry point (.Name)

The descriptor structure on AIX is the same as those in the ELF V1 ABI:
* The address of the entry point of the function.
* The TOC base address for the function.
* The environment pointer.

The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".

Which symbol gets referenced depends on the context:
* Taking the address of the function references the descriptor symbol.
* Calling the function references the entry point symbol.

(2) Speaking of implementation on AIX, for direct function call target, we
create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
replace original TargetGlobalAddress SDNode. Then down the path, we can
take advantage of this MCSymbol.

Patch by: Xiangling_L

Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara

Differential Revision: https://reviews.llvm.org/D62532

llvm-svn: 362735

[NFC][CodeGen] Add unary fneg tests to X86/fma4-fneg-combine.ll

llvm-svn: 362733

[InlineCost] Add support for unary fneg.

This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.

Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now.

Differential Revision: https://reviews.llvm.org/D62699

llvm-svn: 362732

[clang][HeaderSearch] Consider all path separators equal

Reviewers: ilya-biryukov, sammccall

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D62965

llvm-svn: 362731

[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns.ll

llvm-svn: 362730

Fixing ppc tests: sed -i 's/# REQUIES: ppc/# REQUIRES: ppc/g'

llvm-svn: 362728

[LoopPred] Fix a bug in unconditional latch bailout introduced in r362284

This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops.

llvm-svn: 362727

[ScheduleTreeTransform] Silence compiler warning. NFC.

Use size_t for position which is the return type type ArrayRef::size()
it is compared to.

llvm-svn: 362724

[DAGCombine] MergeConsecutiveStores - improve non-temporal load\store handling (PR42123)

This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores:

1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another.

2 - The merged load\store node must retain the non-temporal flag.

Differential Revision: https://reviews.llvm.org/D62910

llvm-svn: 362723

[PPC32] Support GD/LD/IE/LE TLS models and their relaxations

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62940

llvm-svn: 362722

[PPC32] Improve the 32-bit PowerPC port

Many -static/-no-pie/-shared/-pie applications linked against glibc or musl
should work with this patch. This also helps FreeBSD PowerPC64 to migrate
their lib32 (PR40888).

* Fix default image base and max page size.
* Support new-style Secure PLT (see below). Old-style BSS PLT is not
  implemented, so it is not suitable for FreeBSD rtld now because it doesn't
  support Secure PLT yet.
* Support more initial relocation types:
  R_PPC_ADDR32, R_PPC_REL16*, R_PPC_LOCAL24PC, R_PPC_PLTREL24, and R_PPC_GOT16.
  The addend of R_PPC_PLTREL24 is special: it decides the call stub PLT type
  but it should be ignored for the computation of target symbol VA.
* Support GNU ifunc
* Support .glink used for lazy PLT resolution in glibc
* Add a new thunk type: PPC32PltCallStub that is similar to PPC64PltCallStub.
  It is used by R_PPC_REL24 and R_PPC_PLTREL24.

A PLT stub used in -fPIE/-fPIC usually loads an address relative to
.got2+0x8000 (-fpie/-fpic code uses _GLOBAL_OFFSET_TABLE_ relative
addresses).
Two .got2 sections in two object files have different addresses, thus a PLT stub
can't be shared by two object files. To handle this incompatibility,
change the parameters of Thunk::isCompatibleWith to
`const InputSection &, const Relocation &`.

PowerPC psABI specified an old-style .plt (BSS PLT) that is both
writable and executable. Linkers don't make separate RW- and RWE segments,
which causes all initially writable memory (think .data) executable.
This is a big security concern so a new PLT scheme (secure PLT) was developed to
address the security issue.

TLS will be implemented in D62940.

glibc older than ~2012 requires .rela.dyn to include .rela.plt, it can
not handle the DT_RELA+DT_RELASZ == DT_JMPREL case correctly. A hack
(not included in this patch) in LinkerScript.cpp addOrphanSections() to
work around the issue:

    if (Config->EMachine == EM_PPC) {
      // Older glibc assumes .rela.dyn includes .rela.plt
      Add(In.RelaDyn);
      if (In.RelaPlt->isLive() && !In.RelaPlt->Parent)
        In.RelaDyn->getParent()->addSection(In.RelaPlt);
    }

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62464

llvm-svn: 362721

[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns_wide.ll

llvm-svn: 362720

gn build: Merge r362685

llvm-svn: 362719

Remove unused PPC.h includes under llvm/lib/Target/PowerPC.

llvm-svn: 362718

[X86] Make masked floating point equality/ordered compares commutable for load folding purposes.

Same as what is supported for the unmasked form.

llvm-svn: 362717

[Profile]: Add runtime interface to specify file handle for profile data (Part-II)

Test cases

Author: Sajjad Mirza

Differential Revision: http://reviews.llvm.org/D62541

llvm-svn: 362716

[NFC][CodeGen] Add unary fneg tests to fmul-combines.ll fnabs.ll

llvm-svn: 362715

[PowerPC] Add R_PPC_IRELATIVE

This will be used by lld's powerpc port.

llvm-svn: 362713

[NFC][CodeGen] Add unary fneg tests to fp-fast.ll fp-fold.ll fp-in-intregs.ll fp-stack-compare-cmov.ll fp-stack-compare.ll fsxor-alignment.ll

llvm-svn: 362712

[DA] Add an option to control delinearization validity checks

Summary: Dependence Analysis performs static checks to confirm validity
of delinearization. These checks often fail for 64-bit targets due to
type conversions and integer wrapping that prevent simplification of the
SCEV expressions. These checks would also fail at compile-time if the
lower bound of the loops are compile-time unknown.

For example:

void foo(int n, int m, int a[][m]) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      a[i][j] = a[i+1][j-2];
    }
}

opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-pass-remarks=.* -debug-pass=Arguments
-da-permissive-validity-checks=false k3.ll -analyze -da
will produce the following by default:

da analyze - anti [* *|<]!
but will produce the following expected dependence vector if the
validity checks are disabled:

da analyze - consistent anti [1 -2]!
This revision will introduce a debug option that will leave the validity
checks in place by default, but allow them to be turned off. New tests
are added for cases where it cannot be proven at compile-time that the
individual subscripts stay in-bound with respect to a particular
dimension of an array. These tests enable the option to provide user
guarantee that the subscripts do not over/under-flow into other
dimensions, thereby producing more accurate dependence vectors.

For prior discussion on this topic, leading to this change, please see
the following thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-May/132372.html

Reviewers: Meinersbur, jdoerfert, kbarton, dmgreen, fhahn
Reviewed By: Meinersbur, jdoerfert, dmgreen
Subscribers: fhahn, hiraditya, javed.absar, llvm-commits, Whitney,
etiotto
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D62610

llvm-svn: 362711

[NFC][CodeGen] Remove duplicate test in fp-fast.ll

@test10 is the same as @test11.

llvm-svn: 362710

gn build: Add new tidy checks to gn files

The checks were added in r362673 and r362672.

llvm-svn: 362709