platform/upstream/llvm.git
7 years ago[LTOs] Allow generation of hotness information
Adam Nemet [Fri, 2 Dec 2016 17:53:56 +0000 (17:53 +0000)]
[LTOs] Allow generation of hotness information

The flag is passed by the clang driver.

Differential Revision: https://reviews.llvm.org/D27331

llvm-svn: 288519

7 years agoMake LTO opt-remarks tests matching stricter
Adam Nemet [Fri, 2 Dec 2016 17:53:49 +0000 (17:53 +0000)]
Make LTO opt-remarks tests matching stricter

This ensures that we don't generate the hotness attribute by default.

llvm-svn: 288518

7 years agofix check-label
Sanjay Patel [Fri, 2 Dec 2016 17:50:14 +0000 (17:50 +0000)]
fix check-label

llvm-svn: 288517

7 years agoDo not allow multiple possibly aliasing ptrs in an expression
Johannes Doerfert [Fri, 2 Dec 2016 17:49:52 +0000 (17:49 +0000)]
Do not allow multiple possibly aliasing ptrs in an expression

  Relational comparisons should not involve multiple potentially
  aliasing pointers. Similarly this should hold for switch conditions
  and the two conditions involved in equality comparisons (separately!).
  This is a heuristic based on the C semantics, which only allow such
  operations when the base pointers point into the same object.
  Since this makes aliasing likely we will bail out early instead of
  producing a probably failing runtime check.

llvm-svn: 288516

7 years ago[x86] add tests to show missing demanded bits analysis; NFC
Sanjay Patel [Fri, 2 Dec 2016 17:48:48 +0000 (17:48 +0000)]
[x86] add tests to show missing demanded bits analysis; NFC

llvm-svn: 288515

7 years agoRerun mem2reg after the inliner
Johannes Doerfert [Fri, 2 Dec 2016 17:43:57 +0000 (17:43 +0000)]
Rerun mem2reg after the inliner

After the inliner finished, we could still end up with promotable
allocas in a function. We now run mem2reg to make sure everything is
promoted if possible.

llvm-svn: 288514

7 years ago[CUDA] Forward sanitizer support to host toolchain
Jason Henline [Fri, 2 Dec 2016 17:32:18 +0000 (17:32 +0000)]
[CUDA] Forward sanitizer support to host toolchain

Summary:
This is an improvement on rL288448 where address sanitization was listed
as supported for the CudaToolChain. Since the intent is for the
CudaToolChain not to reject any flags supported by the host compiler,
this patch switches to forwarding the CudaToolChain sanitizer support to
the host toolchain rather than explicitly whitelisting address
sanitization.
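
The forwarding described above can be sketched roughly as follows. This is a
simplified illustration, not the actual clang code: the SanitizerMask alias,
the Address/Thread constants, and the LinuxToolChain struct are hypothetical
stand-ins, and only the names CudaToolChain and getSupportedSanitizers come
from the commit message.

```cpp
#include <cstdint>

// Hypothetical stand-in for a sanitizer bit mask.
using SanitizerMask = std::uint32_t;
constexpr SanitizerMask Address = 1u << 0;
constexpr SanitizerMask Thread = 1u << 1;

// Simplified base class: by default no sanitizer is supported.
struct ToolChain {
  virtual ~ToolChain() = default;
  virtual SanitizerMask getSupportedSanitizers() const { return 0; }
};

// Hypothetical host toolchain that supports two sanitizers.
struct LinuxToolChain : ToolChain {
  SanitizerMask getSupportedSanitizers() const override {
    return Address | Thread;
  }
};

// The CUDA toolchain keeps a reference to the host toolchain and forwards
// the query instead of listing individual sanitizers itself.
struct CudaToolChain : ToolChain {
  explicit CudaToolChain(const ToolChain &Host) : HostTC(Host) {}
  SanitizerMask getSupportedSanitizers() const override {
    return HostTC.getSupportedSanitizers();
  }

private:
  const ToolChain &HostTC;
};
```

With this shape, any sanitizer the host toolchain starts to support is
accepted for CUDA compilation without a further change to CudaToolChain.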

Thanks to hfinkel for this suggestion.

Reviewers: jlebar

Subscribers: hfinkel, cfe-commits

Differential Revision: https://reviews.llvm.org/D27351

llvm-svn: 288512

7 years agoRemoved a wrong assertion about non-colorable sections.
Rui Ueyama [Fri, 2 Dec 2016 17:23:58 +0000 (17:23 +0000)]
Removed a wrong assertion about non-colorable sections.

The assertion asserted that colorable sections can never have
a reference to non-colorable sections, but that was simply wrong.
They can have references to non-colorable sections. If that's the
case, referenced sections must be the same in terms of pointer
comparison.

llvm-svn: 288511

7 years ago[InstCombine] Add vector urem tests
Simon Pilgrim [Fri, 2 Dec 2016 17:16:21 +0000 (17:16 +0000)]
[InstCombine] Add vector urem tests

Demonstrate missed opportunity for urem -> and combine for power-of-2-or-zero non-uniform constant divisors

llvm-svn: 288510

7 years ago[InstCombine] Regenerate vector srem tests
Simon Pilgrim [Fri, 2 Dec 2016 17:12:56 +0000 (17:12 +0000)]
[InstCombine] Regenerate vector srem tests

llvm-svn: 288509

7 years agoRevert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."
Renato Golin [Fri, 2 Dec 2016 16:56:26 +0000 (16:56 +0000)]
Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."

This reverts commit r288497, as it broke the AArch64 build of Compiler-RT's
builtins (twice: once in r288412 and once in r288497). We should investigate
this offline.

llvm-svn: 288508

7 years agoRevert "Compiler-rt part of D26230: Add (constant) masked load/store support (Try...
Filipe Cabecinhas [Fri, 2 Dec 2016 16:19:14 +0000 (16:19 +0000)]
Revert "Compiler-rt part of D26230: Add (constant) masked load/store support (Try #2)"

This reverts commit r288504.

clang-bpf-build fails with no details:
******************** TEST 'AddressSanitizer-x86_64-linux ::
TestCases/masked-ops.cpp' FAILED ********************
Script:
--
/mnt/buildbot/slave-root/clang-bpf-build/stage1/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -o /mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp -mavx -O1
not /mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp l1 2>&1 | FileCheck -check-prefix=CHECK-L1 /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp
/mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp l6 2>&1 | FileCheck -check-prefix=CHECK-L6 /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp
/mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp la 2>&1 | FileCheck -check-prefix=CHECK-LA /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp
not /mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp s1 2>&1 | FileCheck -check-prefix=CHECK-S1 /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp
/mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp s6 2>&1 | FileCheck -check-prefix=CHECK-S6 /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp
/mnt/buildbot/slave-root/clang-bpf-build/stage1/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/masked-ops.cpp.tmp sa 2>&1 | FileCheck -check-prefix=CHECK-SA /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp
--
Exit Code: 2

Command Output (stderr):
--
FileCheck error: '-' is empty.
FileCheck command line:  FileCheck -check-prefix=CHECK-L6 /mnt/buildbot/slave-root/clang-bpf-build/llvm/projects/compiler-rt/test/asan/TestCases/masked-ops.cpp

--

********************

llvm-svn: 288507

7 years ago[DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default
Nicolai Haehnle [Fri, 2 Dec 2016 16:06:18 +0000 (16:06 +0000)]
[DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default

Summary:
When X = 0 and Y = inf, the original code produces inf, but the transformed
code produces nan. So this transform (and its relatives) should only be
used when the no-infs-fp-math flag is explicitly enabled.

Also disable the transform using fmad (intermediate rounding) when unsafe-math
is not enabled, since it can reduce the precision of the result; consider this
example with binary floating point numbers with two bits of mantissa:

  x = 1.01
  y = 111

  x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step)

  x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero)

The example relies on rounding towards zero at least in the second step.
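
The X = 0, Y = inf case from the summary can be reproduced with a small
sketch. It assumes IEEE-754 doubles and default (non-fast-math) compilation,
and it uses std::fma as a stand-in for the fused form; the function names are
made up for illustration.

```cpp
#include <cmath>
#include <limits>

// Original form: (x + 1) * y with ordinary double arithmetic.
double mulOfAdd(double X, double Y) { return (X + 1.0) * Y; }

// Transformed form: x * y + y, computed here as a fused multiply-add.
double fusedForm(double X, double Y) { return std::fma(X, Y, Y); }
```

For X = 0 and Y = inf, mulOfAdd computes 1 * inf, which is inf, while
fusedForm has to evaluate 0 * inf first, which is NaN, so the two forms
disagree unless infinities are excluded.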

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578

Reviewers: RKSimon, tstellarAMD, spatel, arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26602

llvm-svn: 288506

7 years agoTidyup code with indentation and clang-format. NFCI.
Simon Pilgrim [Fri, 2 Dec 2016 15:44:30 +0000 (15:44 +0000)]
Tidyup code with indentation and clang-format. NFCI.

llvm-svn: 288505

7 years agoCompiler-rt part of D26230: Add (constant) masked load/store support (Try #2)
Filipe Cabecinhas [Fri, 2 Dec 2016 15:33:04 +0000 (15:33 +0000)]
Compiler-rt part of D26230: Add (constant) masked load/store support (Try #2)

Summary:
Unfortunately, there is no way to emit an llvm masked load/store in
clang without optimizations, and AVX enabled. Unsure how we should go
about making sure this test only runs if it's possible to execute AVX
code.

Reviewers: kcc, RKSimon, pgousseau

Subscribers: kubabrecka, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D26506

llvm-svn: 288504

7 years ago[Sparc] Fix parsing of double-precision %f18, %f20, and %f22
Daniel Cederman [Fri, 2 Dec 2016 15:05:26 +0000 (15:05 +0000)]
[Sparc] Fix parsing of double-precision %f18, %f20, and %f22

Summary: They are currently being parsed as %f14, %f16, and %f18.

Reviewers: venkatra, jyknight

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27342

llvm-svn: 288503

7 years ago[clang-tidy] Do not trigger unnecessary-value-param check on methods marked as final
Felix Berger [Fri, 2 Dec 2016 14:44:16 +0000 (14:44 +0000)]
[clang-tidy] Do not trigger unnecessary-value-param check on methods marked as final

Summary: Virtual method overrides of dependent types cannot be recognized unless
they are marked as override or final.

Exclude methods marked as final from check and add test.
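
A minimal sketch of the kind of code this is about, with hypothetical names
(Base, Wrapper, length). Because the base class of Wrapper is a template
parameter, the only syntactic hint that length() overrides a virtual method
is the final keyword.

```cpp
#include <cstddef>
#include <string>

struct Base {
  virtual ~Base() = default;
  // The interface fixes the parameter type: it is passed by value.
  virtual std::size_t length(std::string S) = 0;
};

template <typename T>
struct Wrapper : T {
  // T is a dependent base, so the override cannot be seen before
  // instantiation; only 'final' marks this method as an override.
  std::size_t length(std::string S) final { return S.size(); }
};
```

Changing the parameter to a const reference here would no longer match the
signature in Base, which is presumably why such methods are excluded from
the check.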

Reviewers: sbenza, hokein, alexfh

Subscribers: malcolm.parsons, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D27248

llvm-svn: 288502

7 years ago[X86][SSE] Renamed shuffle combine test.
Simon Pilgrim [Fri, 2 Dec 2016 14:43:39 +0000 (14:43 +0000)]
[X86][SSE] Renamed shuffle combine test.

We're trying to combine to vpunpckhbw not vpunpckhwd

llvm-svn: 288501

7 years agoCODE_OWNERS: Take ownership of IR Linker as discussed on llvm-dev
Teresa Johnson [Fri, 2 Dec 2016 14:06:53 +0000 (14:06 +0000)]
CODE_OWNERS: Take ownership of IR Linker as discussed on llvm-dev

llvm-svn: 288500

7 years ago[X86][SSE] Add support for extracting constant bit data from broadcasted constants
Simon Pilgrim [Fri, 2 Dec 2016 13:16:08 +0000 (13:16 +0000)]
[X86][SSE] Add support for extracting constant bit data from broadcasted constants

llvm-svn: 288499

7 years ago[clang-move] some tweaks.
Haojian Wu [Fri, 2 Dec 2016 12:39:39 +0000 (12:39 +0000)]
[clang-move] some tweaks.

* Don't save SourceManager for each declaration.
* Rename some out-dated methods.

No functionality change.

llvm-svn: 288498

7 years ago[SLP] Fix for PR6246: vectorization for scalar ops on vector elements.
Alexey Bataev [Fri, 2 Dec 2016 12:20:22 +0000 (12:20 +0000)]
[SLP] Fix for PR6246: vectorization for scalar ops on vector elements.

When trying to vectorize trees that start at insertelement instructions
function tryToVectorizeList() uses vectorization factor calculated as
MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree
cost for this fixed vectorization factor is too high.
Patch tries to improve the situation. It tries different vectorization
factors from max(PowerOf2Floor(NumberOfVectorizedValues),
MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries
to choose the best one.
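
A rough sketch of how such a list of candidate factors could be enumerated.
It is not the SLP vectorizer's code: the helper names are hypothetical, and
stepping down by halving (with a power-of-two minimum) is an assumption, since
the message only gives the two end points.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Largest power of two that is <= N, or 0 when N is 0.
unsigned powerOf2Floor(unsigned N) {
  if (N == 0)
    return 0;
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Candidate vectorization factors, from the largest down to the minimum.
// Assumption: factors are halved at each step and the minimum is a power
// of two.
std::vector<unsigned> candidateFactors(unsigned NumValues,
                                       unsigned MinVecRegSize,
                                       unsigned ScalarTypeSize) {
  assert(ScalarTypeSize > 0 && "scalar type must have a size");
  unsigned MinVF = MinVecRegSize / ScalarTypeSize;
  unsigned MaxVF = std::max(powerOf2Floor(NumValues), MinVF);
  std::vector<unsigned> VFs;
  for (unsigned VF = MaxVF; VF > 0 && VF >= MinVF; VF /= 2)
    VFs.push_back(VF);
  return VFs;
}
```

For 20 values of a 32-bit type and a 128-bit minimum register size this
yields 16, 8 and 4; a cost model would then pick the cheapest of them.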

Differential Revision: https://reviews.llvm.org/D27215

llvm-svn: 288497

7 years ago[X86] Refactored getTargetConstantBitsFromNode to allow for expansion. NFCI.
Simon Pilgrim [Fri, 2 Dec 2016 11:58:05 +0000 (11:58 +0000)]
[X86] Refactored getTargetConstantBitsFromNode to allow for expansion. NFCI.

getTargetConstantBitsFromNode currently only extracts constant pool vector data, but it will need to be generalized to support broadcast and scalar constant pool data as well.

Converted Constant bit extraction and Bitset splitting to helper lambda functions.

llvm-svn: 288496

7 years agoFix a buildbot failure in include-fixer.
Eric Liu [Fri, 2 Dec 2016 11:23:07 +0000 (11:23 +0000)]
Fix a buildbot failure in include-fixer.

llvm-svn: 288495

7 years agoReplace __ANDROID_NDK__ with __ANDROID__
Pavel Labath [Fri, 2 Dec 2016 11:15:15 +0000 (11:15 +0000)]
Replace __ANDROID_NDK__ with __ANDROID__

Summary:
This replaces all the uses of the __ANDROID_NDK__ define with __ANDROID__. This
is a preparatory step to remove our custom android toolchain file and rely on
the standard android NDK one instead, which does not provide this define.
Instead, I rely on __ANDROID__, which is set by the compiler.

I haven't yet removed the cmake variable with the same name, as we will need to
do something completely different there -- NDK toolchain defines
CMAKE_SYSTEM_NAME to Android, while our current one pretends it's linux.

Reviewers: tberghammer, zturner

Subscribers: danalbert, srhines, mgorny, lldb-commits

Differential Revision: https://reviews.llvm.org/D27305

llvm-svn: 288494

7 years ago[ClangFormat] Only insert #include into the #include block in the beginning of the...
Eric Liu [Fri, 2 Dec 2016 11:01:43 +0000 (11:01 +0000)]
[ClangFormat] Only insert #include into the #include block in the beginning of the file.

Summary:
This avoids inserting #include into:
- raw string literals containing #include.
- #if block.
- Special #include among declarations (e.g. functions).
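
The idea of restricting insertion to the leading #include block can be
sketched as below. This is a simplified, hypothetical helper working on
whole lines, not clang-format's implementation; for instance, it makes no
attempt to look through a header guard.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the line index at which a new #include may be inserted: directly
// after the last #include of the block at the beginning of the file.
// Blank lines and '//' comment lines inside that block are skipped; any
// other line (code, #if, ...) ends the block.
std::size_t includeInsertionLine(const std::vector<std::string> &Lines) {
  std::size_t InsertAt = 0;
  for (std::size_t I = 0; I < Lines.size(); ++I) {
    const std::string &L = Lines[I];
    if (L.rfind("#include", 0) == 0) { // line starts with "#include"
      InsertAt = I + 1;
      continue;
    }
    if (L.empty() || L.rfind("//", 0) == 0)
      continue;
    break; // first line that is neither include, blank, nor comment
  }
  return InsertAt;
}
```

Because scanning stops at the first line of real code or at an #if, later
#include lines inside conditionals, string literals, or between
declarations never influence the insertion point.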

Reviewers: djasper

Subscribers: cfe-commits, klimek

Differential Revision: https://reviews.llvm.org/D26909

llvm-svn: 288493

7 years ago[SLPVectorizer][X86] Add tests for vectorization of buildvector of scalar fp-ops...
Simon Pilgrim [Fri, 2 Dec 2016 10:54:46 +0000 (10:54 +0000)]
[SLPVectorizer][X86] Add tests for vectorization of buildvector of scalar fp-ops (PR6246)

llvm-svn: 288492

7 years ago[Frontend] Fix an issue where a quoted search path is incorrectly
Alex Lorenz [Fri, 2 Dec 2016 09:51:51 +0000 (09:51 +0000)]
[Frontend] Fix an issue where a quoted search path is incorrectly
removed as a duplicate header search path

The commit r126167 started passing the First index into RemoveDuplicates, but
forgot to update 0 to First in the loop that looks for the duplicate. This
resulted in a bug where an -iquoted search path was incorrectly removed if you
passed in the same path into -iquote and more than one time into -isystem.
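
The class of bug can be illustrated with a much simplified sketch. The
function below is hypothetical and works on plain strings rather than on
clang's header search entries; the point is only that the inner scan has to
start at First, not at 0.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Removes every entry at index >= First that repeats an earlier entry at
// index >= First. Entries before First (for example quoted search paths)
// are neither removed nor used as duplicate candidates.
void removeDuplicatesFrom(std::vector<std::string> &Paths, std::size_t First) {
  assert(First <= Paths.size() && "First must be a valid start index");
  for (std::size_t I = First; I < Paths.size();) {
    bool Seen = false;
    // The scan for an earlier copy starts at First; starting it at 0 would
    // also match entries outside the range being deduplicated.
    for (std::size_t J = First; J < I; ++J) {
      if (Paths[J] == Paths[I]) {
        Seen = true;
        break;
      }
    }
    if (Seen)
      Paths.erase(Paths.begin() + static_cast<std::ptrdiff_t>(I));
    else
      ++I;
  }
}
```

With the list {"dir", "dir", "dir"} and First = 1, the entry at index 0
stays untouched and only the second copy inside the range is dropped.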

rdar://23991350

Differential Revision: https://reviews.llvm.org/D27298

llvm-svn: 288491

7 years agocompiler-rt/test/profile/Linux/lit.local.cfg: [Py3] Use text mode (universal_newlines...
NAKAMURA Takumi [Fri, 2 Dec 2016 08:17:17 +0000 (08:17 +0000)]
compiler-rt/test/profile/Linux/lit.local.cfg: [Py3] Use text mode (universal_newlines=True).

llvm-svn: 288490

7 years ago[ScopInfo] Fold constant coefficients in array dimensions to the right
Tobias Grosser [Fri, 2 Dec 2016 08:10:56 +0000 (08:10 +0000)]
[ScopInfo] Fold constant coefficients in array dimensions to the right

This allows us to delinearize code such as the one below, where the array
sizes are A[][2 * n] as there are n times two elements in the innermost
dimension. Alternatively, we could try to generate another dimension for the
struct in the innermost dimension, but as the struct has constant size,
recovering this dimension is easy.

   struct com {
     double Real;
     double Img;
   };

   void foo(long n, struct com A[][n]) {
     for (long i = 0; i < 100; i++)
       for (long j = 0; j < 1000; j++)
         A[i][j].Real += A[i][j].Img;
   }

   int main() {
     struct com A[100][1000];
     foo(1000, A);
   }
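
The "n times two elements" remark can be checked with a little index
arithmetic. The helper below is a hypothetical sketch, not Polly code; it
assumes struct com consists of exactly two doubles without padding and
counts positions in units of double.

```cpp
#include <cassert>
#include <cstddef>

// Position, counted in doubles, of field Field (0 = Real, 1 = Img) of
// A[I][J] when every row holds N structs, i.e. 2 * N doubles.
std::size_t flatDoubleIndex(std::size_t I, std::size_t J, std::size_t Field,
                            std::size_t N) {
  assert(Field < 2 && "struct com has exactly two double fields");
  return I * (2 * N) + 2 * J + Field;
}
```

The constant coefficient 2 in front of J is what gets folded into the
innermost dimension, so the access can be viewed as an access to a double
array of shape [][2 * n].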

llvm-svn: 288489

7 years ago[sanitizer] Add a bunch of ifdefs for sparc targets to avoid build failures.
Maxim Ostapenko [Fri, 2 Dec 2016 08:07:35 +0000 (08:07 +0000)]
[sanitizer] Add a bunch of ifdefs for sparc targets to avoid build failures.

Differential Revision: https://reviews.llvm.org/D27301

llvm-svn: 288488

7 years agoPort parallel ICF to COFF.
Rui Ueyama [Fri, 2 Dec 2016 08:03:58 +0000 (08:03 +0000)]
Port parallel ICF to COFF.

LLD used to take 11.73 seconds to link Clang. Now it is 6.94 seconds.
MSVC link takes 83.02 seconds. Note that ICF is enabled by default on
Windows, so a low latency ICF is more important than in ELF.

llvm-svn: 288487

7 years agoDon't include system header inside namespace
Stephan Bergmann [Fri, 2 Dec 2016 08:03:57 +0000 (08:03 +0000)]
Don't include system header inside namespace

...causes build failure at least with GCC 6.2.1, as smmintrin.h indirectly
includes cstdlib, which then runs into problems.

llvm-svn: 288486

7 years agoIgnore R_X86_64_NONE.
Rafael Espindola [Fri, 2 Dec 2016 08:00:09 +0000 (08:00 +0000)]
Ignore R_X86_64_NONE.

It looks like the way dtrace works is

* The user creates .o files that reference magical symbol names.
* dtrace reads those files, collects the info it needs and changes the
  relocation to R_X86_64_NONE expecting the linker to ignore them.

llvm-svn: 288485

7 years ago[AVX-512] Add EVEX vpshuflw/vpshufhw/vpshufd instructions to load folding tables.
Craig Topper [Fri, 2 Dec 2016 07:57:11 +0000 (07:57 +0000)]
[AVX-512] Add EVEX vpshuflw/vpshufhw/vpshufd instructions to load folding tables.

llvm-svn: 288484

7 years agoFix a bug in ICF involving COFF associative sections.
Rui Ueyama [Fri, 2 Dec 2016 07:46:12 +0000 (07:46 +0000)]
Fix a bug in ICF involving COFF associative sections.

Associative sections are sections that need to be linked if their associated
sections are linked. Associative sections are used to append auxiliary data
such as debug info.

Previously, we compared all associative sections when comparing two comdat
sections. Because associative sections are usually not mergeable sections,
we missed a lot of mergeable sections. MSVC linker doesn't seem to check
the identity of associative sections.

This patch makes LLD ignore associative sections when doing ICF.

llvm-svn: 288483

7 years ago[AVX-512] Add EVEX PSHUFB instructions to load folding tables.
Craig Topper [Fri, 2 Dec 2016 07:06:30 +0000 (07:06 +0000)]
[AVX-512] Add EVEX PSHUFB instructions to load folding tables.

llvm-svn: 288482

7 years ago[AVX-512] Add masked VINSERTF/VINSERTI instructions to load folding tables.
Craig Topper [Fri, 2 Dec 2016 06:24:38 +0000 (06:24 +0000)]
[AVX-512] Add masked VINSERTF/VINSERTI instructions to load folding tables.

llvm-svn: 288481

7 years agoFix the worst-case performance of ICF.
Rui Ueyama [Fri, 2 Dec 2016 05:35:46 +0000 (05:35 +0000)]
Fix the worst-case performance of ICF.

r288228 seems to have regressed ICF performance in some cases in which
a lot of sections are actually mergeable. In r288228, I made a change
to create a Range object for each new color group. So every time we
split a group, we allocated and added a new group to a list of groups.

This patch essentially reverted r288228 with an improvement to
parallelize the original algorithm.

Now the ICF main loop is entirely allocation-free and lock-free.

Just like pre-r288228, we search for group boundaries by linear scan
instead of managing the information using Range class. r288228 was
performance-neutral, and so is this patch.
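
Finding group boundaries by linear scan can be sketched as follows. This is
an illustration under assumptions, not LLD's code: sections are reduced to a
vector of color values, sections of one group are assumed to be adjacent,
and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the exclusive end of the group that starts at Begin, i.e. the
// first index whose color differs from Colors[Begin].
std::size_t findBoundary(const std::vector<unsigned> &Colors,
                         std::size_t Begin) {
  assert(Begin < Colors.size() && "group start out of range");
  std::size_t I = Begin + 1;
  while (I < Colors.size() && Colors[I] == Colors[Begin])
    ++I;
  return I;
}

// Walks all groups without allocating any per-group object.
std::size_t countGroups(const std::vector<unsigned> &Colors) {
  std::size_t Groups = 0;
  for (std::size_t Begin = 0; Begin < Colors.size();
       Begin = findBoundary(Colors, Begin))
    ++Groups;
  return Groups;
}
```

The scan costs a few extra comparisons per group, but splitting a group no
longer needs to allocate or append anything, which fits the
allocation-free main loop described above.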

I confirmed that this produces the exact same result as before
using chromium and clang as tests.

llvm-svn: 288480

7 years ago[ScopInfo] Separate construction and finalization of memory accesses [NFC]
Tobias Grosser [Fri, 2 Dec 2016 05:21:22 +0000 (05:21 +0000)]
[ScopInfo] Separate construction and finalization of memory accesses [NFC]

After having built memory accesses we perform some additional transformations
on them to increase the chances that our delinearization guesses the right
shape. Only after these transformations, we take the assumptions that the
array shape we predict is such that no out-of-bounds memory accesses arise.

Before this change, the construction of the memory access, the access folding
that improves the representation for certain parametric subscripts, and taking
the assumption were all done right after a memory access was created. In this
change we split this now into three separate iterations over all memory
accesses. This means only after all memory accesses have been built, we start
to canonicalize accesses, and to take assumptions. This split prepares for
future canonicalizations that must consider all memory accesses for deriving
additional beneficial transformations.

llvm-svn: 288479

7 years agoclang/test/Driver/defsym.s: Appease targeting msc. It is incapable of external assemb...
NAKAMURA Takumi [Fri, 2 Dec 2016 05:09:21 +0000 (05:09 +0000)]
clang/test/Driver/defsym.s: Appease targeting msc. It is incapable of external assembler in trunk.

llvm-svn: 288478

7 years agoAdd a test documenting how we handle addends on Elf_Rela.
Rafael Espindola [Fri, 2 Dec 2016 04:20:47 +0000 (04:20 +0000)]
Add a test documenting how we handle addends on Elf_Rela.

llvm-svn: 288477

7 years agoIR: Move NumElements field from {Array,Vector}Type to SequentialType.
Peter Collingbourne [Fri, 2 Dec 2016 03:20:58 +0000 (03:20 +0000)]
IR: Move NumElements field from {Array,Vector}Type to SequentialType.

Now that PointerType is no longer a SequentialType, all SequentialTypes
have an associated number of elements, so we can move that information to
the base class, allowing for a number of simplifications.

Differential Revision: https://reviews.llvm.org/D27122

llvm-svn: 288464

7 years agoChange LoopUnrollPass cost from int to unsigned to make it consistent. (NFC)
Dehao Chen [Fri, 2 Dec 2016 03:17:07 +0000 (03:17 +0000)]
Change LoopUnrollPass cost from int to unsigned to make it consistent. (NFC)

llvm-svn: 288463

7 years agoIR: Change PointerType to derive from Type rather than SequentialType.
Peter Collingbourne [Fri, 2 Dec 2016 03:05:41 +0000 (03:05 +0000)]
IR: Change PointerType to derive from Type rather than SequentialType.

As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html

This is for a couple of reasons:

- Values of type PointerType are unlike the other SequentialTypes (arrays
  and vectors) in that they do not hold values of the element type. By moving
  PointerType we can unify certain aspects of how the other SequentialTypes
  are handled.
- PointerType will have no place in the SequentialType hierarchy once
  pointee types are removed, so this is a necessary step towards removing
  pointee types.

Differential Revision: https://reviews.llvm.org/D26595

llvm-svn: 288462

7 years agoAllow duplicated abs symbols with the same value.
Rafael Espindola [Fri, 2 Dec 2016 02:58:21 +0000 (02:58 +0000)]
Allow duplicated abs symbols with the same value.

This is a fairly reasonable bfd extension since there is one obvious value.

dtrace depends on this feature as it creates multiple absolute
symbols with the same value.
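
The rule can be sketched with a toy symbol table. This is a hypothetical
model, not LLD's symbol resolution: the table is a std::map from name to
value and defineAbsolute is a made-up helper.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Records an absolute symbol. Returns false only when the name is already
// defined with a different value; a repeated definition with the same
// value is accepted.
bool defineAbsolute(std::map<std::string, std::uint64_t> &Table,
                    const std::string &Name, std::uint64_t Value) {
  auto Result = Table.emplace(Name, Value);
  bool Inserted = Result.second;
  return Inserted || Result.first->second == Value;
}
```

A second definition with an identical value leaves the table unchanged,
since there is only one value the symbol could have.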

llvm-svn: 288461

7 years agoFix GlobalISel build.
Peter Collingbourne [Fri, 2 Dec 2016 02:55:30 +0000 (02:55 +0000)]
Fix GlobalISel build.

llvm-svn: 288460

7 years agoConstantFolding: Factor code into helper function
Matt Arsenault [Fri, 2 Dec 2016 02:26:02 +0000 (02:26 +0000)]
ConstantFolding: Factor code into helper function

llvm-svn: 288459

7 years agoIR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Peter Collingbourne [Fri, 2 Dec 2016 02:24:42 +0000 (02:24 +0000)]
IR: Change the gep_type_iterator API to avoid always exposing the "current" type.

Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458

7 years agoUpdate implementation of ABI support for throwing noexcept function pointers
Richard Smith [Fri, 2 Dec 2016 02:06:53 +0000 (02:06 +0000)]
Update implementation of ABI support for throwing noexcept function pointers
and catching as non-noexcept to match the final design per discussion on
cxx-abi-dev.

llvm-svn: 288457

7 years ago[CUDA] Fix faulty test from rL288448
Jason Henline [Fri, 2 Dec 2016 02:04:43 +0000 (02:04 +0000)]
[CUDA] Fix faulty test from rL288448

Summary:
The test introduced by rL288448 is currently failing because
unimportant but unexpected errors appear as output from a test compile
line. This patch looks for a more specific error message, in order to
avoid false positives.

Reviewers: jlebar

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D27328

Switch to more specific error

llvm-svn: 288453

7 years agop0012r1: define corresponding feature test macro
Richard Smith [Fri, 2 Dec 2016 02:02:23 +0000 (02:02 +0000)]
p0012r1: define corresponding feature test macro

llvm-svn: 288452

7 years agoWrite the addend to got entries when using Elf_Rel.
Rafael Espindola [Fri, 2 Dec 2016 01:57:24 +0000 (01:57 +0000)]
Write the addend to got entries when using Elf_Rel.

llvm-svn: 288451

7 years ago[DWARF] Put linkage-name on abstract origin even when there's a declaration.
Paul Robinson [Fri, 2 Dec 2016 01:55:17 +0000 (01:55 +0000)]
[DWARF] Put linkage-name on abstract origin even when there's a declaration.

In r266692, we made it possible to emit linkage names for just inlined
functions, putting the attribute on the abstract origin. Make sure we
don't think the linkage-name was already emitted on a declaration.

Differential Revision: http://reviews.llvm.org/D27320

llvm-svn: 288450

7 years agoRecover better from an incompatible .pcm file being provided by -fmodule-file=.
Richard Smith [Fri, 2 Dec 2016 01:52:28 +0000 (01:52 +0000)]
Recover better from an incompatible .pcm file being provided by -fmodule-file=.
We try to include the headers of the module textually in this case, still
enforcing the modules semantic rules. In order to make that work, we need to
still track that we're entering and leaving the module. Also, if the module was
also marked as unavailable (perhaps because it was missing a file), we
shouldn't mark the module unavailable -- we don't need the module to be
complete if we're going to enter it textually.

llvm-svn: 288449

7 years ago[CUDA] "Support" ASAN arguments in CudaToolChain
Jason Henline [Fri, 2 Dec 2016 01:42:54 +0000 (01:42 +0000)]
[CUDA] "Support" ASAN arguments in CudaToolChain

This fixes a bug that was introduced in rL287285. The bug made it
illegal to pass -fsanitize=address during CUDA compilation because the
CudaToolChain class was switched from deriving from the Linux toolchain
class to deriving directly from the ToolChain toolchain class. When
CudaToolChain derived from Linux, it used Linux's getSupportedSanitizers
method, and that method allowed ASAN, but when it switched to deriving
directly from ToolChain, it inherited a getSupportedSanitizers method
that didn't allow for ASAN.

This patch fixes that bug by creating a getSupportedSanitizers method
for CudaToolChain that supports ASAN.

This patch also fixes the test that checks that -fsanitize=address is
passed correctly for CUDA builds. That test didn't use to notice if an
error message was emitted, and that's why it didn't catch this bug when
it was first introduced. With the fix from this patch, that test will
now catch any similar bug in the future.

llvm-svn: 288448

7 years ago[WebAssembly] Add an -mdirect flag for the direct wasm object feature.
Dan Gohman [Fri, 2 Dec 2016 01:12:40 +0000 (01:12 +0000)]
[WebAssembly] Add an -mdirect flag for the direct wasm object feature.

Add a target flag for enabling the new direct wasm object emission
feature.

llvm-svn: 288447

7 years ago[ThinLTO] Stop importing constant global vars as copies in the backend
Teresa Johnson [Fri, 2 Dec 2016 01:02:30 +0000 (01:02 +0000)]
[ThinLTO] Stop importing constant global vars as copies in the backend

Summary:
We were doing an optimization in the ThinLTO backends of importing
constant unnamed_addr globals unconditionally as a local copy (regardless
of whether the thin link decided to import them). This should be done in
the thin link instead, so that resulting exported references are marked
and promoted appropriately, but will need a summary enhancement to mark
these variables as constant unnamed_addr.

The function import logic during the thin link was trying to handle
this proactively, by conservatively marking all values referenced in
the initializer lists of exported global variables as also exported.
However, this only handled values referenced directly from the
initializer list of an exported global variable. If the value is itself
a constant unnamed_addr variable, we could end up exporting its
references as well. This caused multiple issues. The first is that the
transitively exported references weren't promoted. Secondly, some could
not be promoted/renamed (e.g. they had a section or other constraint).
recursively, instead of just adding the first level of initializer list
references to the ExportList directly.

Remove this optimization and the associated handling in the function
import backend. SPEC measurements indicate we weren't getting much
from it in any case.

Fixes PR31052.

Reviewers: mehdi_amini

Subscribers: krasin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26880

llvm-svn: 288446

7 years agoAMDGPU: Use wider scalar spills for SGPR spilling
Matt Arsenault [Fri, 2 Dec 2016 00:54:45 +0000 (00:54 +0000)]
AMDGPU: Use wider scalar spills for SGPR spilling

Since the spill is for the whole wave, these
don't have the swizzling problems that vector stores do
and a single 4-byte allocation is enough to spill a 64 element
register. This should reduce the number of spill instructions and
put all the spills for a register in the same cacheline.

This should save allocated private size, but for now it doesn't.
The extra slots are allocated for each component, but never used
because the frame layout is essentially finalized before frame
indices are replaced. For always using the scalar store path,
this should probably be moved into processFunctionBeforeFrameFinalized.

llvm-svn: 288445

7 years agoDelete tautological assertion.
Jonathan Roelofs [Fri, 2 Dec 2016 00:51:58 +0000 (00:51 +0000)]
Delete tautological assertion.

After r256463, both the LHS and RHS now refer to the same variable. Before,
they referred to the member and the parameter, respectively. Now GCC6's
-Wtautological-compare complains.

llvm-svn: 288444

7 years agoFix undefined behavior.
Rui Ueyama [Fri, 2 Dec 2016 00:38:15 +0000 (00:38 +0000)]
Fix undefined behavior.

New items can be added to Ranges here, and that invalidates
an iterator that previously pointed to the end of the vector.
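
The hazard and one common way around it can be sketched as below. The
example is hypothetical and unrelated to the actual ICF code: it appends to
a vector while walking it, and avoids stale iterators by using an index and
re-reading size() on every iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Caps every element at Limit and appends the remainder as a new element.
// Appending may reallocate the vector, which invalidates all iterators,
// including a previously saved end(); the index I stays valid.
void splitLarge(std::vector<int> &Ranges, int Limit) {
  assert(Limit > 0 && "a non-positive limit would never terminate");
  for (std::size_t I = 0; I < Ranges.size(); ++I) {
    if (Ranges[I] > Limit) {
      int Rest = Ranges[I] - Limit;
      Ranges[I] = Limit;
      Ranges.push_back(Rest); // newly added items are visited later
    }
  }
}
```

Starting from {5} with a limit of 2, the loop produces {2, 2, 1}; a loop
written against a cached end iterator would have undefined behavior after
the first push_back that reallocates.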

llvm-svn: 288443

7 years agoWhen instructions are hoisted out of loops by MachineLICM, remove their debug loc.
Wolfgang Pieb [Fri, 2 Dec 2016 00:37:57 +0000 (00:37 +0000)]
When instructions are hoisted out of loops by MachineLICM, remove their debug loc.
This prevents erratic stepping behavior as well as incorrect source attribution
for sample profiling.

Reviewers: dblakie

Subscribers: llvm-commit

Differential Revision: https://reviews.llvm.org/D27290

llvm-svn: 288442

7 years agoSDAG: Avoid a large, usually empty SmallVector in a recursive function
Justin Bogner [Fri, 2 Dec 2016 00:11:01 +0000 (00:11 +0000)]
SDAG: Avoid a large, usually empty SmallVector in a recursive function

This SmallVector is using up 128 bytes on the stack every time despite
almost always being empty[1], and since this function can recurse quite
deeply that adds up to a lot of overhead. We've seen this run afoul of
ulimits in some cases with ASAN on.

Replacing the SmallVector with a std::vector trades an occasional heap
allocation for vastly less stack usage.

[1]: I gathered some stats on an internal test suite and the vector
was non-empty in only 45,000 of 10,000,000 calls to this function.

llvm-svn: 288441

7 years agoStruct GEPs must use i32, not whatever size_t is. It should be safe
John McCall [Thu, 1 Dec 2016 23:51:30 +0000 (23:51 +0000)]
Struct GEPs must use i32, not whatever size_t is.  It should be safe
to do this unconditionally, given that the indices will always be small
constant integers anyway.

llvm-svn: 288440

7 years ago[AArch64] Fold more spilled/refilled COPYs.
Geoff Berry [Thu, 1 Dec 2016 23:43:55 +0000 (23:43 +0000)]
[AArch64] Fold more spilled/refilled COPYs.

Summary:
Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all
full COPYs between register classes of the same size that are either
spilled or refilled.

Reviewers: MatzeB, qcolombet

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27271

llvm-svn: 288439

7 years ago[libclang] Add APIs to check the result of an integer expression in CXEvalResult...
Argyrios Kyrtzidis [Thu, 1 Dec 2016 23:41:27 +0000 (23:41 +0000)]
[libclang] Add APIs to check the result of an integer expression in CXEvalResult without overflow

Patch by Emilio Cobos Álvarez!
See https://reviews.llvm.org/D26788

llvm-svn: 288438

7 years ago[MC] Refactor emitELFSize to make usage more consistent. NFC.
Dan Gohman [Thu, 1 Dec 2016 23:39:08 +0000 (23:39 +0000)]
[MC] Refactor emitELFSize to make usage more consistent. NFC.

Move the cast<MCSymbolELF> inside emitELFSize, so that:
 - it's done in one place instead of at each call
 - it's more consistent with similar functions like EmitCOFFSafeSEH
 - ambiguity between cast<> and dyn_cast<> is avoided (which also
   eliminates an unnecessary dyn_cast call)

This also makes it easier to experiment with using ".size" directives on
non-ELF targets.

llvm-svn: 288437

7 years agoExtend CompilationDatabase by a field for the output filename
Joerg Sonnenberger [Thu, 1 Dec 2016 23:37:45 +0000 (23:37 +0000)]
Extend CompilationDatabase by a field for the output filename

In bigger projects like an Operating System, the same source code is
often compiled in slightly different ways. This could be the difference
between PIC and non-PIC code for static vs dynamic libraries, it could
also be the difference between size optimised versions of tools for
ramdisk images. At the moment, the compilation database has no way to
distinguish such cases. As a first step, add a field in the JSON format
for it and process it accordingly.

Differential Revision: https://reviews.llvm.org/D27138

llvm-svn: 288436

7 years agollvm-modextract: Call keep() on the output stream before exiting.
Peter Collingbourne [Thu, 1 Dec 2016 23:13:11 +0000 (23:13 +0000)]
llvm-modextract: Call keep() on the output stream before exiting.

llvm-svn: 288435

7 years ago[ARM] Fix for 64-bit CAS expansion on ARM32 with -O0
Oleg Ranevskyy [Thu, 1 Dec 2016 22:58:35 +0000 (22:58 +0000)]
[ARM] Fix for 64-bit CAS expansion on ARM32 with -O0

Summary:
This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion.

Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. However, CMP might leave the carry unset for unequal arguments if the first one is greater than or equal to the second. This might cause the comparison logic to detect false equality.

Example of the broken C++ code:
```
std::atomic<long long> at(2);

long long ll = 1;
std::atomic_compare_exchange_strong(&at, &ll, 3);
```
Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3.

The patch replaces SBC with CMPEQ.
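
The failure can be reproduced with a small flag model. This is a sketch
under assumptions, not the backend expansion itself: only the Z and C flags
are modelled, the helper names are hypothetical, and C follows the ARM
convention of being set when a subtraction does not borrow (so "carry" in
the description above corresponds to a borrow in this model).

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the two condition flags that matter here.
struct Flags {
  bool Z; // set when the last result was zero
  bool C; // set when the last subtraction did NOT borrow
};

std::uint32_t lowWord(std::uint64_t V) { return static_cast<std::uint32_t>(V); }
std::uint32_t highWord(std::uint64_t V) { return static_cast<std::uint32_t>(V >> 32); }

// CMP A, B: computes A - B and sets the flags.
Flags cmp(std::uint32_t A, std::uint32_t B) { return Flags{A == B, A >= B}; }

// SBCS A, B: computes A - B - (C ? 0 : 1) and sets the flags.
Flags sbcs(std::uint32_t A, std::uint32_t B, Flags In) {
  std::uint64_t Subtrahend = std::uint64_t(B) + (In.C ? 0u : 1u);
  std::uint32_t Result = static_cast<std::uint32_t>(A - Subtrahend);
  return Flags{Result == 0, A >= Subtrahend};
}

// CMPEQ A, B: runs CMP only when Z is already set; otherwise keeps the flags.
Flags cmpeq(std::uint32_t A, std::uint32_t B, Flags In) {
  return In.Z ? cmp(A, B) : In;
}

// Old sequence: CMP on the low words, SBC on the high words.
bool looksEqualWithSbc(std::uint64_t Loaded, std::uint64_t Expected) {
  Flags F = cmp(lowWord(Loaded), lowWord(Expected));
  F = sbcs(highWord(Loaded), highWord(Expected), F);
  return F.Z;
}

// New sequence: CMP on the low words, CMPEQ on the high words.
bool looksEqualWithCmpeq(std::uint64_t Loaded, std::uint64_t Expected) {
  Flags F = cmp(lowWord(Loaded), lowWord(Expected));
  F = cmpeq(highWord(Loaded), highWord(Expected), F);
  return F.Z;
}
```

For the values from the example (2 loaded, 1 expected) the old sequence
reports equality because the high-word subtraction yields zero without a
borrow, while the CMPEQ sequence keeps Z clear from the low-word compare.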

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, llvm-commits, asl

Differential Revision: https://reviews.llvm.org/D27315

llvm-svn: 288433

7 years agoRevert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."
Artem Belevich [Thu, 1 Dec 2016 22:52:15 +0000 (22:52 +0000)]
Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements."

This reverts r288412 which causes severe compile-time regression.

llvm-svn: 288431

7 years agoRegisterCoalescer: Only coalesce complete reserved registers.
Matthias Braun [Thu, 1 Dec 2016 22:39:51 +0000 (22:39 +0000)]
RegisterCoalescer: Only coalesce complete reserved registers.

The coalescer eliminates copies from reserved registers of the form:
   %vregX = COPY %rY
in the case where %rY is a reserved register. However this turns out to
be invalid if only some of the subregisters are reserved (see also
https://reviews.llvm.org/D26648).

Differential Revision: https://reviews.llvm.org/D26687

llvm-svn: 288428

7 years agoFix broken buildbots because of r288424 (NFC).
Eugene Zelenko [Thu, 1 Dec 2016 22:26:55 +0000 (22:26 +0000)]
Fix broken buildbots because of r288424 (NFC).

llvm-svn: 288426

7 years ago[ADT, Support, TableGen] Fix some Clang-tidy modernize-use-default and Include What...
Eugene Zelenko [Thu, 1 Dec 2016 22:13:24 +0000 (22:13 +0000)]
[ADT, Support, TableGen] Fix some Clang-tidy modernize-use-default and Include What You Use warnings; other minor fixes (NFC).

llvm-svn: 288424

7 years ago[dsymutil] Simplify a lazy-init condition/expression
David Blaikie [Thu, 1 Dec 2016 22:04:16 +0000 (22:04 +0000)]
[dsymutil] Simplify a lazy-init condition/expression

llvm-svn: 288423

7 years agobuild: fix building for Windows after SVN r287465
Saleem Abdulrasool [Thu, 1 Dec 2016 22:00:54 +0000 (22:00 +0000)]
build: fix building for Windows after SVN r287465

The previous change for enabling MinGW did not preserve the Win32 check and
added the EABI specific routines to a Windows build which does not use the EABI
routines.  Correct the conditional check for that.

llvm-svn: 288422

7 years ago[debug info] Minor cleanup from D27170/r288399
David Blaikie [Thu, 1 Dec 2016 21:59:09 +0000 (21:59 +0000)]
[debug info] Minor cleanup from D27170/r288399

llvm-svn: 288421

7 years ago[SelectionDAG] getRawSubclassData should not return HasDebugValue.
Chih-Hung Hsieh [Thu, 1 Dec 2016 21:56:33 +0000 (21:56 +0000)]
[SelectionDAG] getRawSubclassData should not return HasDebugValue.

This change fixes a regression in r279537 and
makes getRawSubclassData behave like r279536.
Without this change, the fp128-g.ll test case will have an
infinite loop involving SoftenFloatRes_LOAD.

Differential Revision: http://reviews.llvm.org/D26942

llvm-svn: 288420

7 years agoAdd an assert instead of ignoring an impossible condition.
Rui Ueyama [Thu, 1 Dec 2016 21:41:06 +0000 (21:41 +0000)]
Add an assert instead of ignoring an impossible condition.

llvm-svn: 288419

7 years agoAArch64: fix 128-bit cmpxchg at -O0 (again, again).
Tim Northover [Thu, 1 Dec 2016 21:31:59 +0000 (21:31 +0000)]
AArch64: fix 128-bit cmpxchg at -O0 (again, again).

This time the issue is fortunately just a simple mistake rather than a horrible
design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for
comparing two 64-bit values, but they don't.

The fix is slightly clunkier in AArch64 because we can't use conditional
execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to
an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a
CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway.
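
The two ways of testing "icmp ne i128" mentioned above can be sketched on a
pair of 64-bit halves. This is an illustration under assumptions: the
function names are hypothetical, and reading CSET/CINC as "count the halves
that differ" is one plausible interpretation of the sequence, not a copy of
the emitted code.

```cpp
#include <cassert>
#include <cstdint>

// EOR/EOR/ORR/CBNZ style: XOR each half, OR the results, test for nonzero.
bool differsXorOr(std::uint64_t LoA, std::uint64_t HiA,
                  std::uint64_t LoB, std::uint64_t HiB) {
  return ((LoA ^ LoB) | (HiA ^ HiB)) != 0;
}

// CSET/CINC/CBNZ style: set a counter to 1 if the low halves differ,
// increment it if the high halves differ, then test for nonzero.
bool differsCsetCinc(std::uint64_t LoA, std::uint64_t HiA,
                     std::uint64_t LoB, std::uint64_t HiB) {
  unsigned Mismatches = (LoA != LoB) ? 1u : 0u; // CSET after the first CMP
  if (HiA != HiB)                               // CINC after the second CMP
    ++Mismatches;
  return Mismatches != 0;                       // CBNZ
}
```

Both functions agree on every input; the second form only needs one scratch
value for the counter, which matches the register-pressure argument above.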

Thanks to Anton Korobeynikov for pointing out the issue.

llvm-svn: 288418

7 years agoImprove documentation on MSVC workaround for AlignedCharArray (NFC)
Mehdi Amini [Thu, 1 Dec 2016 20:54:29 +0000 (20:54 +0000)]
Improve documentation on MSVC workaround for AlignedCharArray (NFC)

The comment only mentioned "old version of MSVC".

Differential Revision: https://reviews.llvm.org/D27312

llvm-svn: 288417

7 years agoFix unused variable warning in Release builds. NFC.
Benjamin Kramer [Thu, 1 Dec 2016 20:49:34 +0000 (20:49 +0000)]
Fix unused variable warning in Release builds. NFC.

llvm-svn: 288416

7 years ago[PR29121] Don't fold if it would produce atomic vector loads or stores
Philip Reames [Thu, 1 Dec 2016 20:17:06 +0000 (20:17 +0000)]
[PR29121] Don't fold if it would produce atomic vector loads or stores

The instcombine code which folds loads and stores into their use types can trip up if the use is a bitcast to a type which we can't directly load or store in the IR. In principle, such types shouldn't exist, but in practice they do today. This is a workaround to avoid a bug while we work towards the long term goal.

Differential Revision: https://reviews.llvm.org/D24365

llvm-svn: 288415

7 years agoAdd a space in a run line. NFC.
George Burgess IV [Thu, 1 Dec 2016 20:16:56 +0000 (20:16 +0000)]
Add a space in a run line. NFC.

llvm-svn: 288414

7 years agoFactor out common parts of LVI and Float2Int into ConstantRange [NFCI]
Philip Reames [Thu, 1 Dec 2016 20:08:47 +0000 (20:08 +0000)]
Factor out common parts of LVI and Float2Int into ConstantRange [NFCI]

This just extracts out the transfer rules for constant ranges into a single shared point. As it happens, neither bit of code actually overlaps in terms of the handled operators, but with this change that could easily be tweaked in the future.

I also want to have this separated out to make it easier to experiment with an eager value info implementation and possibly a ValueTracking-like fixed-depth recursion peephole version. There's no reason all four of these can't share a common implementation, which reduces the chance of bugs.

Differential Revision: https://reviews.llvm.org/D27294

llvm-svn: 288413

7 years ago[SLP] Fix for PR6246: vectorization for scalar ops on vector elements.
Alexey Bataev [Thu, 1 Dec 2016 20:06:53 +0000 (20:06 +0000)]
[SLP] Fix for PR6246: vectorization for scalar ops on vector elements.

When trying to vectorize trees that start at insertelement instructions,
the function tryToVectorizeList() uses a vectorization factor calculated
as MinVecRegSize/ScalarTypeSize. Sometimes this does not work, because the
tree cost for this fixed vectorization factor is too high.
The patch tries to improve the situation. It tries different vectorization
factors from max(PowerOf2Floor(NumberOfVectorizedValues),
MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries
to choose the best one.
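
A sketch of the candidate range described above, not the SLP vectorizer's
code. The function names are hypothetical, and stepping down by halving is
an assumption: the message only names the two endpoints. ScalarTypeSize is
assumed to be nonzero and the minimum factor a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Largest power of two that is <= N (0 for N == 0).
unsigned powerOf2Floor(unsigned N) {
  if (N == 0)
    return 0;
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Candidate vectorization factors from the largest to the smallest,
// assuming each step halves the previous factor.
std::vector<unsigned> candidateFactors(unsigned NumValues,
                                       unsigned MinVecRegSize,
                                       unsigned ScalarTypeSize) {
  unsigned MinVF = MinVecRegSize / ScalarTypeSize;
  unsigned MaxVF = std::max(powerOf2Floor(NumValues), MinVF);
  std::vector<unsigned> Factors;
  for (unsigned VF = MaxVF; VF > 0 && VF >= MinVF; VF /= 2)
    Factors.push_back(VF);
  return Factors;
}
```

For 16 values of a 32-bit type and a 128-bit minimum register size, the
candidates run from 16 down to 4; with only 3 values the range collapses to
the single fixed factor used before the patch.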

Differential Revision: https://reviews.llvm.org/D27215

llvm-svn: 288412

7 years ago[WebAssembly] Define more wasm binary encoding constants.
Dan Gohman [Thu, 1 Dec 2016 20:02:12 +0000 (20:02 +0000)]
[WebAssembly] Define more wasm binary encoding constants.

llvm-svn: 288411

7 years agoRefactored X86InterleavedAccess into a class. NFCI.
David L Kreitzer [Thu, 1 Dec 2016 19:56:39 +0000 (19:56 +0000)]
Refactored X86InterleavedAccess into a class. NFCI.

Patch by Farhana Aleen

Differential Revision: https://reviews.llvm.org/D25986

llvm-svn: 288410

7 years agoUpdates file comments and variable names.
Rui Ueyama [Thu, 1 Dec 2016 19:45:22 +0000 (19:45 +0000)]
Updates file comments and variable names.

Use "color" instead of "group id" to describe the ICF algorithm.

llvm-svn: 288409

7 years ago[tablegen] Delete duplicates from a vector without skipping elements
Vedant Kumar [Thu, 1 Dec 2016 19:38:50 +0000 (19:38 +0000)]
[tablegen] Delete duplicates from a vector without skipping elements

Tablegen's -gen-instr-info pass has a bug in its emitEnums() routine.
The function intends for values in a vector to be deduplicated, but it
accidentally skips over elements after performing a deletion.

I think there are smarter ways of doing this deduplication, but we can
do that in a follow-up commit if there's interest. See the thread:
[PATCH] TableGen InstrMapping Bug fix.
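
A minimal sketch of the bug class, not the emitEnums() code: the function
name and element type are made up. After erase() the next element slides
into the erased slot, so the loop must continue from the iterator that
erase() returns instead of advancing past it.

```cpp
#include <cassert>
#include <vector>

// Removes later duplicates in place and keeps first occurrences in order.
void removeDuplicates(std::vector<int> &Values) {
  for (auto I = Values.begin(); I != Values.end(); ++I) {
    for (auto J = I + 1; J != Values.end();) {
      if (*J == *I)
        J = Values.erase(J); // erase() returns the element that moved up
      else
        ++J;                 // advance only when nothing was erased
    }
  }
}
```

Incrementing J unconditionally would skip the element that follows each
erased one, which leaves a duplicate behind whenever two or more equal
values are adjacent.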

Patch by Tyler Kenney!

llvm-svn: 288408

7 years agoRemove unused header, NFC.
Vedant Kumar [Thu, 1 Dec 2016 19:38:48 +0000 (19:38 +0000)]
Remove unused header, NFC.

llvm-svn: 288407

7 years agoSend compiler output to /dev/null in defsym.s test.
Artem Belevich [Thu, 1 Dec 2016 19:34:35 +0000 (19:34 +0000)]
Send compiler output to /dev/null in defsym.s test.

Fixes test failures if tests are run in a read-only source tree.

llvm-svn: 288406

7 years agoMove most EH from MachineModuleInfo to MachineFunction
Matthias Braun [Thu, 1 Dec 2016 19:32:15 +0000 (19:32 +0000)]
Move most EH from MachineModuleInfo to MachineFunction

Recommitting r288293 with some extra fixes for GlobalISel code.

Most of the exception handling members in MachineModuleInfo are actually
per-function data (they talk about the "current function"), so it is better
to keep them in the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

llvm-svn: 288405

7 years ago[CodeGen][ARM] Make sure the value and type used to create a bitcast
Akira Hatanaka [Thu, 1 Dec 2016 19:25:14 +0000 (19:25 +0000)]
[CodeGen][ARM] Make sure the value and type used to create a bitcast
have the same size.

This fixes an assert that is triggered when an address of a boolean
variable is passed to __builtin_arm_ldrex or __builtin_arm_strex.

rdar://problem/29269006

llvm-svn: 288404

7 years agoHandle empty strings when looking for a CFString's encoding.
Sean Callanan [Thu, 1 Dec 2016 19:14:55 +0000 (19:14 +0000)]
Handle empty strings when looking for a CFString's encoding.
Should fix the bots.

llvm-svn: 288403

7 years agoFix a bug with llvm-size and the -m option with multiple files not printing the file...
Kevin Enderby [Thu, 1 Dec 2016 19:12:55 +0000 (19:12 +0000)]
Fix a bug with llvm-size and the -m option with multiple files not printing the file names.

llvm-svn: 288402

7 years agoFix unused variable warning in Release builds. NFC.
Benjamin Kramer [Thu, 1 Dec 2016 19:10:10 +0000 (19:10 +0000)]
Fix unused variable warning in Release builds. NFC.

llvm-svn: 288401

7 years agoFix module map to create a module for the configured header Config/abi-breaking.h
Mehdi Amini [Thu, 1 Dec 2016 19:08:38 +0000 (19:08 +0000)]
Fix module map to create a module for the configured header Config/abi-breaking.h

A client of a header that relies on ABI breaking should get the macro
exported there.
Before this, the unittest for Support/Error including Support/Error.h
didn't get the macro exported by the Support module, because the
latter only re-exports its submodules and included modules, not
textual headers.

Hopefully, it'll also fix the build with local submodule visibility,
since the LLVM_Utils module contains two submodules: ADT and Support. They
both include abi-breaking.h, which defines a symbol. The textual
inclusion led to a double definition of the symbol, which broke
the parent module.

Differential Revision: https://reviews.llvm.org/D27273

llvm-svn: 288400

7 years agoThis change removes the dependency on DwarfDebug that was used for DW_FORM_ref_addr...
Greg Clayton [Thu, 1 Dec 2016 18:56:29 +0000 (18:56 +0000)]
This change removes the dependency on DwarfDebug that was used for DW_FORM_ref_addr by making a new DIEUnit class in DIE.cpp.

The DIEUnit class represents a compile or type unit and it owns the unit DIE as an instance variable. This allows anyone with a DIE to get the unit DIE, and then get back to its DIEUnit without adding any new ivars to the DIE class.

Why was this needed? The DIE class has an Offset that is always the CU-relative DIE offset, not the "offset in debug info section" as was commented in the header file (the comment has been corrected). This is great for performance because most DIE references are compile unit relative, which means most code that accessed the DIE's offset didn't need to convert it into a compile unit relative offset because it already was one. When we needed to emit a DW_FORM_ref_addr, though, we needed to find the absolute offset of the DIE by finding the DIE's compile/type unit. The unit class did have the absolute debug info/type offset, which could be added to the CU-relative offset to compute the absolute offset. With this change we can easily get back to a DIE's DIEUnit, which will have this needed offset. Prior to this change, doing so required having a DwarfDebug and calling:

DwarfCompileUnit *DwarfDebug::lookupUnit(const DIE *CU) const;
Now we can use the DIEUnit class to do so without needing DwarfDebug. All clients now use DIEUnit objects (the DwarfDebug stack and the DwarfLinker). A follow on patch for the DWARF generator will also take advantage of this.
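
The offset arithmetic described above can be sketched with simplified
stand-in types. These are not LLVM's DIE or DIEUnit classes: the struct
and field names are hypothetical, and the sketch stores a plain pointer from
a DIE to its unit, whereas the change reaches the unit through the unit DIE
without adding a field to DIE.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a compile or type unit.
struct Unit {
  std::uint64_t DebugSectionOffset; // where the unit starts in the debug section
};

// Hypothetical stand-in for a DIE; Offset is relative to the owning unit.
struct Die {
  const Unit *Owner;    // simplification: direct link to the owning unit
  std::uint64_t Offset; // unit-relative offset
};

// Absolute offset as needed for a DW_FORM_ref_addr style reference:
// unit start plus the unit-relative DIE offset.
std::uint64_t absoluteOffset(const Die &D) {
  return D.Owner->DebugSectionOffset + D.Offset;
}
```

Unit-relative references can keep using Offset directly; only the absolute
form needs the extra hop to the unit.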

Differential Revision: https://reviews.llvm.org/D27170

llvm-svn: 288399

7 years ago[SLP] Fixed cost model for horizontal reduction.
Alexey Bataev [Thu, 1 Dec 2016 18:42:42 +0000 (18:42 +0000)]
[SLP] Fixed cost model for horizontal reduction.

Currently, when the cost of scalar operations is evaluated, the vector type
is used for the scalar operations. The patch fixes this issue and fixes the
evaluation of the vector operations cost.
Several tests showed that the vector cost model is too optimistic. It
allowed vectorization of 8 or fewer add/fadd operations, though scalar
code is faster. Actually, vector code provides better performance only
for 16 or more operations.

Differential Revision: https://reviews.llvm.org/D26277

llvm-svn: 288398