platform/upstream/llvm.git
4 years ago[Attributor] Propagate known align from arguments to call sites arguments
Johannes Doerfert [Tue, 31 Dec 2019 07:27:50 +0000 (01:27 -0600)]
[Attributor] Propagate known align from arguments to call sites arguments

Since the information is known we can simply use it at the call site.
This is especially useful for callbacks but also helps regular calls.

The test changes are mechanical.

4 years ago[Attributor] Use abstract call sites to determine associated arguments
Johannes Doerfert [Thu, 10 Oct 2019 06:19:57 +0000 (01:19 -0500)]
[Attributor] Use abstract call sites to determine associated arguments

This is the second step after D67871 to make use of abstract call sites.
In this patch the argument we associate with an abstract call site
argument can be the one in the callback callee instead of the one in the
callback broker.

Caveat: We cannot allow no-alias arguments for problematic callbacks:
As described in [1], adding no-alias (or restrict) to arguments could
break synchronization as the synchronization effect, e.g., a barrier,
does not "alias" with the pointer anymore. This disables no-alias
annotation for potentially problematic arguments until we implement the
fix described in [1].

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D68008

[1] Compiler Optimizations for OpenMP, J. Doerfert and H. Finkel,
    International Workshop on OpenMP 2018,
    http://compilers.cs.uni-saarland.de/people/doerfert/par_opt18.pdf

4 years ago[Attributor] Annotate the memory behavior of call site arguments
Johannes Doerfert [Tue, 31 Dec 2019 06:57:00 +0000 (00:57 -0600)]
[Attributor] Annotate the memory behavior of call site arguments

Especially for callbacks, annotating the call site arguments is
important. Doing so exposed a too strong dependence of AAMemoryBehavior
on AANoCapture since we handle the case of potentially captured pointers
explicitly.

The changes to the tests are all mechanical.

4 years ago[NFC] Make X86MCCodeEmitter::isPCRel32Branch static
Shengchen Kan [Tue, 31 Dec 2019 07:10:08 +0000 (15:10 +0800)]
[NFC] Make X86MCCodeEmitter::isPCRel32Branch static

4 years agoRevert "DebugInfo: Fix rangesBaseAddress DICompileUnit bitcode serialization/deserial...
David Blaikie [Tue, 31 Dec 2019 06:32:08 +0000 (22:32 -0800)]
Revert "DebugInfo: Fix rangesBaseAddress DICompileUnit bitcode serialization/deserialization"

Seeing some curious CFI failures internally - which makes little sense
to me, as I don't think anyone is using this flag (even us,
internally)... so sounds like a bug in my code somewhere (possibly a
latent one that propagating this flag exposed, not sure). Reverting
while I investigate.

This reverts commit c51b45e32ef7f35c11891f60871aa9c2c04cd991.

4 years ago[NFC] Style cleanup
Shengchen Kan [Tue, 31 Dec 2019 06:23:07 +0000 (14:23 +0800)]
[NFC] Style cleanup

1. Remove function is64BitMode() and use STI.hasFeature(X86::Mode16Bit) directly
2. Use Doxygen features in comment
3. Rename functions to make them start with a lower case letter
4. Format the code with clang-format

4 years ago[mlir] Refactor operation results to use a single use list for all results of the...
River Riddle [Tue, 31 Dec 2019 04:49:47 +0000 (20:49 -0800)]
[mlir] Refactor operation results to use a single use list for all results of the operation.

Summary: A new class is added, IRMultiObjectWithUseList, that allows for representing an IR use list that holds multiple sub-values (used in this case for OpResults). This class provides all of the same functionality as the base IRObjectWithUseList, but for specific sub-values. This saves a word per operation result and is a necessary step in optimizing the layout of operation results. For now the use list is placed on the operation itself, so zero-result operations grow by a word. When the work for optimizing layout is finished, this can be moved back to being a trailing object based on memory/runtime benchmarking.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D71955

4 years ago[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead...
Craig Topper [Tue, 31 Dec 2019 03:07:36 +0000 (19:07 -0800)]
[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead of creating a MERGE_VALUES node. NFCI

This allows us to clean up some places that were peeking through
the MERGE_VALUES node after the call. By returning the SDValues
directly, we can clean that up.

Unfortunately, there are several call sites in AMDGPU that wanted
the MERGE_VALUES and now need to create their own.

4 years ago[SelectionDAG] Fix copy/paste mistake in comment. NFC
Craig Topper [Tue, 31 Dec 2019 02:01:59 +0000 (18:01 -0800)]
[SelectionDAG] Fix copy/paste mistake in comment. NFC

I think this was copied from scalarizeVectorLoad where that is
what happens.

4 years ago[NFC] Add comments in unit test aix-xcoff-toc.ll to clarify the intent
jasonliu [Tue, 31 Dec 2019 03:29:50 +0000 (03:29 +0000)]
[NFC] Add comments in unit test aix-xcoff-toc.ll to clarify the intent

Address David's post review comment in https://reviews.llvm.org/D71667.
Add comments to clarify what we are testing in that file.

4 years ago[X86] Add test case for PR44412. NFC
Craig Topper [Mon, 30 Dec 2019 22:40:56 +0000 (14:40 -0800)]
[X86] Add test case for PR44412. NFC

4 years ago[CodeGen] Use IRBuilder::CreateFNeg for __builtin_conj
Craig Topper [Mon, 30 Dec 2019 21:25:23 +0000 (13:25 -0800)]
[CodeGen] Use IRBuilder::CreateFNeg for __builtin_conj

This replaces the fsub -0.0 idiom with an fneg instruction. We didn't seem to have a test that showed the current codegen, just some tests for constant folding and a test that was only checking the declare lines for libcalls. The latter just checked that we did not have a declare for @conj when using __builtin_conj.
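
As a small illustration of why the old idiom subtracts from -0.0 rather than from 0.0, the sketch below compares both forms against plain negation. The helper names are made up for this example and are not part of clang or LLVM; default (non fast-math) floating-point semantics are assumed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, for illustration only.

// The old "fsub -0.0, x" idiom.
double negViaFSubNegZero(double x) { return -0.0 - x; }

// Subtracting from +0.0 is not a negation: 0.0 - 0.0 yields +0.0.
double negViaFSubPosZero(double x) { return 0.0 - x; }

// What a dedicated negation computes: the sign bit is flipped.
double negViaFNeg(double x) { return -x; }
```

For x = +0.0 the first and third helpers return -0.0, while the second returns +0.0, which is why the idiom needed the negative zero.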

Differential Revision: https://reviews.llvm.org/D72012

4 years ago[CodeGen] Use CreateFNeg in buildFMulAdd
Craig Topper [Mon, 30 Dec 2019 21:24:08 +0000 (13:24 -0800)]
[CodeGen] Use CreateFNeg in buildFMulAdd

We have an fneg instruction now and should use it instead of the fsub -0.0 idiom. Looks like we had no test that showed that we handled the negation cases here so I've added new tests.

Differential Revision: https://reviews.llvm.org/D72010

4 years agoRemove a redundant `default:` on an exhaustive switch(enum).
Eric Astor [Mon, 30 Dec 2019 21:11:28 +0000 (16:11 -0500)]
Remove a redundant `default:` on an exhaustive switch(enum).
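
A minimal sketch of the pattern, using a hypothetical enum that is not taken from the patch: when every enumerator has a case, leaving out `default:` lets -Wswitch report a missing case if an enumerator is added later, and avoids clang's -Wcovered-switch-default warning.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum, for illustration only.
enum class Kind { Load, Store, Call };

std::string kindName(Kind k) {
  switch (k) { // every enumerator is handled, so no `default:` label
  case Kind::Load:
    return "load";
  case Kind::Store:
    return "store";
  case Kind::Call:
    return "call";
  }
  // Only reached if k holds a value outside the declared enumerators.
  return "unknown";
}
```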

4 years ago[OpenMP][FIX] Generalize a test check line
Johannes Doerfert [Mon, 30 Dec 2019 20:58:28 +0000 (14:58 -0600)]
[OpenMP][FIX] Generalize a test check line

The new check line is compatible with the clang code generation check
line as it allows a 64 and 32 bit value.

I hope this makes the llvm-clang-win-x-armv7l buildbot happy.

4 years ago[libomptarget][nfc] Change unintentional target_impl prefix to kmpc_impl
Jon Chesterfield [Mon, 30 Dec 2019 20:49:56 +0000 (20:49 +0000)]
[libomptarget][nfc] Change unintentional target_impl prefix to kmpc_impl

4 years ago[PowerPC][docs] Update Embedded PowerPC docs in Compiler Writers Info page
Jinsong Ji [Mon, 30 Dec 2019 20:21:46 +0000 (20:21 +0000)]
[PowerPC][docs] Update Embedded PowerPC docs in Compiler Writers Info page

Summary:
Embedded PowerPC targets are still actively supported, especially SPE,
so update some important references here:

* adding EREF
* adding SPE/VLE ref

Move deprecated ones into "Other documents..".

Reviewers: #powerpc, jhibbits, hfinkel

Reviewed By: #powerpc, jhibbits

Subscribers: wuzish, merge_guards_bot, nemanjai, shchenz, steven.zhang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72008

4 years ago[OpenMP] Use the OpenMPIRBuilder for `omp parallel`
Johannes Doerfert [Thu, 26 Dec 2019 17:23:38 +0000 (11:23 -0600)]
[OpenMP] Use the OpenMPIRBuilder for `omp parallel`

This allows us to use the OpenMPIRBuilder for parallel regions. Code was
extracted from D61953 and adapted to work with the new version (D70109).

All but one feature should be supported. An update of this patch will
provide test coverage and privatization other than shared.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D70290

4 years ago[OpenMP] Use the OpenMPIRBuilder for `omp cancel`
Johannes Doerfert [Fri, 27 Dec 2019 21:53:37 +0000 (15:53 -0600)]
[OpenMP] Use the OpenMPIRBuilder for `omp cancel`

An `omp cancel parallel` needs to be emitted by the OpenMPIRBuilder if
the `parallel` was emitted by the OpenMPIRBuilder. This patch makes
this possible. The cancel logic is shared with the cancel barriers.
Testing is done via unit tests and the clang cancel_codegen.cpp file
once D70290 lands.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D71948

4 years ago[X86][AsmParser] re-introduce 'offset' operator
Eric Astor [Mon, 30 Dec 2019 19:33:56 +0000 (14:33 -0500)]
[X86][AsmParser] re-introduce 'offset' operator

Summary:
Amend MS offset operator implementation, to more closely fit with its MS counterpart:

    1. InlineAsm: evaluate non-local source entities to their (address) location
    2. Provide a means by which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous)
    3. address PR32530

Based on http://llvm.org/D37461

Fix broken test where the break appears unrelated.

- Set up appropriate memory-input rewrites for variable references.

- Intel-dialect assembly printing now correctly handles addresses by adding "offset".

- Pass offsets as immediate operands (using "r" constraint for offsets of locals).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D71436

4 years agoAMDGPU/GlobalISel: Select mul24 intrinsics
Matt Arsenault [Sun, 8 Sep 2019 22:11:51 +0000 (18:11 -0400)]
AMDGPU/GlobalISel: Select mul24 intrinsics

4 years agoTableGen: Fix assert on PatFrags with predicate code
Matt Arsenault [Mon, 30 Dec 2019 17:05:25 +0000 (12:05 -0500)]
TableGen: Fix assert on PatFrags with predicate code

This assumed a single pattern if there was a predicate. Relax this a
bit, and allow multiple patterns as long as they have the same class.

This was only broken for the DAG path. GlobalISel seems to have
handled this correctly already.

4 years ago[X86] Add X86ISD::PCMPGT to SimplifyMultipleUseDemandedBitsForTargetNode.
Craig Topper [Mon, 30 Dec 2019 18:50:04 +0000 (10:50 -0800)]
[X86] Add X86ISD::PCMPGT to SimplifyMultipleUseDemandedBitsForTargetNode.

If only the sign bit is demanded, and the LHS is all zeroes, then
we can bypass the PCMPGT.
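
The reasoning can be checked on a single lane. The sketch below models one 32-bit lane of PCMPGT with a hypothetical helper (not LLVM code): with an all-zero LHS the result is all ones exactly when the RHS is negative, so the sign bit of the result equals the sign bit of the RHS.

```cpp
#include <cassert>
#include <cstdint>

// One 32-bit lane of a signed greater-than compare:
// all ones if lhs > rhs, otherwise all zeros. Hypothetical model.
uint32_t pcmpgtLane(int32_t lhs, int32_t rhs) {
  return lhs > rhs ? 0xFFFFFFFFu : 0u;
}

// Extract the sign (most significant) bit of a 32-bit value.
bool signBit(uint32_t v) { return (v >> 31) != 0; }

// With lhs == 0, the sign bit of the compare matches the sign bit of rhs.
bool signBitMatches(int32_t rhs) {
  return signBit(pcmpgtLane(0, rhs)) == signBit(static_cast<uint32_t>(rhs));
}
```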

4 years ago[test] do not parse ls output for file size; NFCI
Bryan Chan [Fri, 27 Dec 2019 22:26:24 +0000 (17:26 -0500)]
[test] do not parse ls output for file size; NFCI

Parsing `ls -l` output to obtain the size of a file is unreliable; the
exact output format is not specified, and some user or group names may
contain multiple words, causing `cut -f5 -d' '` to extract an incorrect
value. `wc -c`, on the other hand, is portable, and there are precedents
of its use in test cases.
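
To make the failure mode concrete, here is a small model of `cut -f5 -d' '` applied to two made-up `ls -l` lines. The helper and the sample lines are illustrative assumptions, not taken from the tests.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mimics `cut -f<n> -d' '`: split on every single space and return the
// 1-based field n, or an empty string if there is no such field.
std::string cutField(const std::string &line, std::size_t n) {
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type end = line.find(' ', start);
    fields.push_back(line.substr(start, end - start));
    if (end == std::string::npos)
      break;
    start = end + 1;
  }
  return n >= 1 && n <= fields.size() ? fields[n - 1] : std::string();
}
```

With a one-word group name, field 5 is the size; with a group such as "Domain Users", field 5 is the second word of the group name instead.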

4 years agoAMDGPU/GlobalISel: Re-use MRI available in selector
Matt Arsenault [Sun, 29 Dec 2019 14:05:56 +0000 (09:05 -0500)]
AMDGPU/GlobalISel: Re-use MRI available in selector

4 years agoIgnore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor of ...
Fangrui Song [Wed, 25 Dec 2019 02:12:15 +0000 (18:12 -0800)]
Ignore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor of "frame-pointer"

D56351 (included in LLVM 8.0.0) introduced "frame-pointer".  All tests
which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf"
have been migrated to use "frame-pointer".

Implement UpgradeFramePointerAttributes to upgrade the two obsoleted
function attributes for bitcode. Their semantics are ignored.

Differential Revision: https://reviews.llvm.org/D71863

4 years ago[InstCombine] remove stale comment on test; NFC
Sanjay Patel [Mon, 30 Dec 2019 17:38:49 +0000 (12:38 -0500)]
[InstCombine] remove stale comment on test; NFC

4 years ago[MIPS GlobalISel] Select bitreverse. Recommit
Petar Avramovic [Mon, 30 Dec 2019 17:06:29 +0000 (18:06 +0100)]
[MIPS GlobalISel] Select bitreverse. Recommit

G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang generates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.

Recommit notes:
Introduce temporary variables in order to make sure
instructions get inserted into MachineFunction in same order
regardless of compiler used to build llvm.
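
One common source of this kind of host-compiler dependence is that C++ leaves the evaluation order of function arguments unspecified. The sketch below assumes that is the situation being fixed; `emit` and `combine` are hypothetical stand-ins, not GlobalISel APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a builder call that inserts an instruction
// and returns its id.
int emit(std::vector<int> &inserted, int id) {
  inserted.push_back(id);
  return id;
}

int combine(int a, int b) { return a * 10 + b; }

// The two emit() calls may run in either order, so the insertion order
// can be {1, 2} or {2, 1} depending on the compiler.
int buildUnordered(std::vector<int> &inserted) {
  return combine(emit(inserted, 1), emit(inserted, 2));
}

// Temporaries sequence the calls, so the insertion order is always {1, 2}.
int buildOrdered(std::vector<int> &inserted) {
  int first = emit(inserted, 1);
  int second = emit(inserted, 2);
  return combine(first, second);
}
```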

Differential Revision: https://reviews.llvm.org/D71363

4 years agoAMDGPU/GlobalISel: Select llvm.amdgcn.fmad.ftz
Matt Arsenault [Sun, 8 Sep 2019 21:44:09 +0000 (17:44 -0400)]
AMDGPU/GlobalISel: Select llvm.amdgcn.fmad.ftz

4 years ago[InstCombine] propagate sign argument through nested copysigns
Sanjay Patel [Mon, 30 Dec 2019 16:04:00 +0000 (11:04 -0500)]
[InstCombine] propagate sign argument through nested copysigns

This is another optimization suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153
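
A quick check of the underlying identity, assuming the fold has the shape copysign(x, copysign(y, s)) -> copysign(x, s); the commit message does not spell out the exact pattern, and the helper names are made up.

```cpp
#include <cassert>
#include <cmath>

// Nested form: only the sign of the inner result reaches the outer call,
// and that sign is the sign of signArg.
double nestedCopysign(double x, double y, double signArg) {
  return std::copysign(x, std::copysign(y, signArg));
}

// Folded form: take the sign from signArg directly.
double foldedCopysign(double x, double signArg) {
  return std::copysign(x, signArg);
}
```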

4 years ago[ARM][Thumb][FIX] Add unwinding information to t4
Diogo Sampaio [Mon, 30 Dec 2019 15:43:32 +0000 (15:43 +0000)]
[ARM][Thumb][FIX] Add unwinding information to t4

Summary:
Add missing part of patch D71361. Now that the stack-frame
can be operated using an addw/subw instruction, they should
appear in the unwinding list.

Reviewers: dmgreen, efriedma

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72000

4 years agoAMDGPU/GlobalISel: Add select test for fexp2
Matt Arsenault [Sun, 8 Sep 2019 19:39:00 +0000 (15:39 -0400)]
AMDGPU/GlobalISel: Add select test for fexp2

4 years agoGlobalISel: moreElementsVector for FP min/max
Matt Arsenault [Sat, 27 Jul 2019 21:47:08 +0000 (17:47 -0400)]
GlobalISel: moreElementsVector for FP min/max

4 years agoAMDGPU: Improve llvm.round.f64 lowering for CI+
Matt Arsenault [Tue, 24 Dec 2019 16:07:45 +0000 (11:07 -0500)]
AMDGPU: Improve llvm.round.f64 lowering for CI+

The path already used for f16/f32 works a lot better when v_trunc_f64
is available.
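
The commit message does not show the f16/f32 path. A plausible shape for a trunc-based round, given here only as a sketch under that assumption, is: truncate, then add one unit with the sign of the input when the discarded fraction is at least one half.

```cpp
#include <cassert>
#include <cmath>

// Round half away from zero, built from trunc. Illustrative sketch only.
double roundViaTrunc(double x) {
  double t = std::trunc(x);
  double absDiff = std::fabs(x - t);
  // Adjust by +/-1 (matching the sign of x) when the fraction is >= 0.5.
  double adjust = absDiff >= 0.5 ? std::copysign(1.0, x) : 0.0;
  return t + adjust;
}
```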

4 years agoAMDGPU: Generate check lines
Matt Arsenault [Tue, 24 Dec 2019 16:24:30 +0000 (11:24 -0500)]
AMDGPU: Generate check lines

4 years agoAMDGPU/GlobalISel: Account for G_PHI result bank
Matt Arsenault [Sat, 21 Dec 2019 20:34:30 +0000 (15:34 -0500)]
AMDGPU/GlobalISel: Account for G_PHI result bank

Sometimes the result bank of the phi is already assigned to something,
and should not be ignored. This is in preparation for additional
boolean phi handling changes.

Also refine the logic to fix some cases that were incorrectly deciding
to use SGPRs.

4 years ago[PowerPC] Legalize rounding nodes
Nemanja Ivanovic [Mon, 30 Dec 2019 13:38:27 +0000 (07:38 -0600)]
[PowerPC] Legalize rounding nodes

VSX provides a full complement of rounding instructions yet we somehow ended up
with some of them legal and others not. This just legalizes all of the FP
rounding nodes and the FP -> int rounding nodes with unsafe math.

Differential revision: https://reviews.llvm.org/D69949

4 years agoRevert "[MIPS GlobalISel] Select bitreverse"
Dmitri Gribenko [Mon, 30 Dec 2019 13:28:56 +0000 (14:28 +0100)]
Revert "[MIPS GlobalISel] Select bitreverse"

This reverts commit dbc136e0fe7e14c64dcb78e72321bb41af60afa4.
It broke buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066

4 years ago[ARM] Sink splat to ICmp
David Green [Mon, 30 Dec 2019 09:39:14 +0000 (09:39 +0000)]
[ARM] Sink splat to ICmp

This adds ICmp to the list of instructions that we sink a splat to in a
loop, allowing the register forms of instructions to be selected more
often. It does not add FCmp yet as the results look a little odd, trying
to keep the register in a float reg and having to move it back to a GPR.

Differential Revision: https://reviews.llvm.org/D70997

4 years ago[ARM] MVE sink ICmp test. NFC
David Green [Mon, 30 Dec 2019 09:28:10 +0000 (09:28 +0000)]
[ARM] MVE sink ICmp test. NFC

4 years ago[LV][NFC] Keep dominator tree up to date during vectorization.
Evgeniy Brevnov [Mon, 9 Dec 2019 08:11:30 +0000 (15:11 +0700)]
[LV][NFC] Keep dominator tree up to date during vectorization.

4 years ago[LV][NFC] Some refactoring and renaming to facilitate next change.
Evgeniy Brevnov [Thu, 5 Dec 2019 10:57:27 +0000 (17:57 +0700)]
[LV][NFC] Some refactoring and renaming to facilitate next change.

4 years ago[ARM][THUMB2] Allow emitting T3 types of add and sub
Diogo Sampaio [Mon, 30 Dec 2019 10:59:45 +0000 (10:59 +0000)]
[ARM][THUMB2] Allow emitting T3 types of add and sub

Summary:
This patch allows emitting thumb2 add and sub
instructions with 12 bit immediates in the
emitT2RegPlusImmediate function.
- Splitting parts of the D70680

Reviewers: eli.friedman, olista01, efriedma

Reviewed By: efriedma

Subscribers: efriedma, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71361

4 years ago[OpenCL] Add mipmap builtin functions
Sven van Haastregt [Mon, 30 Dec 2019 10:47:58 +0000 (10:47 +0000)]
[OpenCL] Add mipmap builtin functions

Add the mipmap builtin functions from the OpenCL extension
specification.

Patch by Pierre Gondois and Sven van Haastregt.

4 years ago[MIPS GlobalISel] Select bitreverse
Petar Avramovic [Mon, 30 Dec 2019 10:26:45 +0000 (11:26 +0100)]
[MIPS GlobalISel] Select bitreverse

G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang generates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
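
For readers unfamiliar with what such a lowering expands to, a bit reversal can be written with shifts and masks that swap progressively larger groups. This is a generic sketch of the technique, not necessarily the exact sequence the legalizer emits.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bits of a 32-bit value by swapping adjacent bits, then
// pairs, nibbles, bytes, and finally the two 16-bit halves.
uint32_t bitreverse32(uint32_t v) {
  v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1);
  v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2);
  v = ((v >> 4) & 0x0F0F0F0Fu) | ((v & 0x0F0F0F0Fu) << 4);
  v = ((v >> 8) & 0x00FF00FFu) | ((v & 0x00FF00FFu) << 8);
  return (v >> 16) | (v << 16);
}
```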

Differential Revision: https://reviews.llvm.org/D71363

4 years ago[MIPS GlobalISel] Select bswap
Petar Avramovic [Mon, 30 Dec 2019 10:13:22 +0000 (11:13 +0100)]
[MIPS GlobalISel] Select bswap

G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang generates
these intrinsics from __builtin_bswap32 and __builtin_bswap64.
Add lower and narrowscalar for G_BSWAP.
Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.
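
On targets without a native byte-swap sequence, the operation is typically expanded into shifts, masks and ORs. The function below is a generic sketch of that expansion for 32 bits, not a transcription of the legalizer code.

```cpp
#include <cassert>
#include <cstdint>

// Swap the byte order of a 32-bit value using shifts and masks.
uint32_t bswap32(uint32_t v) {
  return (v << 24) |                 // byte 0 -> byte 3
         ((v & 0x0000FF00u) << 8) |  // byte 1 -> byte 2
         ((v >> 8) & 0x0000FF00u) |  // byte 2 -> byte 1
         (v >> 24);                  // byte 3 -> byte 0
}
```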

Differential Revision: https://reviews.llvm.org/D71362

4 years ago[MCP] Add stats for backward copy propagation. NFC.
Kai Luo [Mon, 30 Dec 2019 08:31:41 +0000 (16:31 +0800)]
[MCP] Add stats for backward copy propagation. NFC.

4 years ago[opt] Fix run-twice crash and detection problem
Peter Kokai [Mon, 30 Dec 2019 08:22:55 +0000 (00:22 -0800)]
[opt] Fix run-twice crash and detection problem

1. Executing `opt -run-twice a.ll` within a terminal will crash.
   https://bugs.llvm.org/show_bug.cgi?id=44382
2. `-run-twice` saves output into two buffers and compares them.
   When outputting the result is disabled, that produces two empty strings, so
   they always compare equal, giving false-positive results.

The proposed solution is to generate the results even if the output will not be
emitted, as that is required for the comparison.

Differential Revision: https://reviews.llvm.org/D71967

4 years ago[Diagnostic] Add ftabstop to -Wmisleading-indentation
Tyker [Sun, 29 Dec 2019 23:14:20 +0000 (00:14 +0100)]
[Diagnostic] Add ftabstop to -Wmisleading-indentation

Summary:
This allows much better support of codebases like the Linux kernel that mix tabs and spaces.

-ftabstop=//Width// allows specifying how large tabs are considered to be.
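
As a rough sketch of why the tab width matters for an indentation check, the helper below computes the visual width of a line's leading whitespace. It assumes a tab advances to the next multiple of the tab stop; the function name is hypothetical and this is not clang's implementation.

```cpp
#include <cassert>
#include <string>

// Visual width of the leading whitespace of `line`. A tab advances the
// column to the next multiple of tabStop (assumed behaviour).
unsigned indentWidth(const std::string &line, unsigned tabStop) {
  assert(tabStop > 0 && "tab stop must be positive");
  unsigned col = 0;
  for (char c : line) {
    if (c == ' ')
      ++col;
    else if (c == '\t')
      col += tabStop - (col % tabStop);
    else
      break; // first non-whitespace character ends the indentation
  }
  return col;
}
```

With a tab stop of 8, a line indented by one tab and a line indented by eight spaces have the same width; with a tab stop of 4 they do not, which changes whether two statements look aligned.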

Reviewers: xbolva00, aaron.ballman, rsmith

Reviewed By: aaron.ballman

Subscribers: jyknight, riccibruno, rsmith, nathanchance

Differential Revision: https://reviews.llvm.org/D71037

4 years ago[NFC] Add test for load-insert-store pattern
Qiu Chaofan [Mon, 30 Dec 2019 08:09:07 +0000 (16:09 +0800)]
[NFC] Add test for load-insert-store pattern

This patch adds the necessary test cases for the load-update-store pattern,
which only updates a single element of a vector.

Differential Revision: https://reviews.llvm.org/D71886

4 years ago[Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifest
Hideto Ueno [Mon, 30 Dec 2019 08:08:48 +0000 (17:08 +0900)]
[Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifest

Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71972

4 years ago[ELF][PPC64] Improve "call lacks nop" diagnostic and make it compatible with GCC...
Fangrui Song [Wed, 18 Dec 2019 00:45:04 +0000 (16:45 -0800)]
[ELF][PPC64] Improve "call lacks nop" diagnostic and make it compatible with GCC<5.5 and GCC<6.4

GCC before r245813 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79439)
did not emit nop after b/bl. This can happen with recursive calls.
r245813 was back ported to GCC 5.5 and GCC 6.4.

This is common, for example, libstdc++.a(locale.o) shipped with GCC 4.9
and many objects in netlib lapack can cause lld to error.  gold allows
such calls to the same section. Our __plt_foo symbol's `section` field
is used for ThunkSection, so we can't implement a similarly loosened rule
easily. But we can make use of its `file` field which is currently NULL.

Differential Revision: https://reviews.llvm.org/D71639

4 years ago[ELF][PPC32] Implement IPLT code sequence for non-preemptible IFUNC
Fangrui Song [Tue, 17 Dec 2019 19:23:37 +0000 (11:23 -0800)]
[ELF][PPC32] Implement IPLT code sequence for non-preemptible IFUNC

Similar to D71509 (EM_PPC64), on EM_PPC, the IPLT code sequence should
be similar to a PLT call stub. Unlike EM_PPC64, EM_PPC -msecure-plt has
small/large PIC model differences.

* -fpic/-fpie: R_PPC_PLTREL24 r_addend=0.  The call stub loads an address relative to `_GLOBAL_OFFSET_TABLE_`.
* -fPIC/-fPIE: R_PPC_PLTREL24 r_addend=0x8000. (A partial linked object
  file may have an addend larger than 0x8000.) The call stub loads an address relative to .got2+0x8000.

Just assume large PIC model for now. This patch makes:

  // clang -fuse-ld=lld -msecure-plt -fno-pie -no-pie a.c
  // clang -fuse-ld=lld -msecure-plt -fPIE -pie a.c
  #include <stdio.h>
  static void impl(void) { puts("meow"); }
  void thefunc(void) __attribute__((ifunc("resolver")));
  void *resolver(void) { return &impl; }
  int main(void) {
    thefunc();
    void (*theptr)(void) = &thefunc;
    theptr();
  }

work on Linux glibc. -fpie will crash because the compiler and the
linker do not agree on the value which r30 stores (_GLOBAL_OFFSET_TABLE_
vs .got2+0x8000).

Differential Revision: https://reviews.llvm.org/D71621

4 years ago[ELF][PPC64] Implement IPLT code sequence for non-preemptible IFUNC
Fangrui Song [Sat, 14 Dec 2019 02:30:21 +0000 (18:30 -0800)]
[ELF][PPC64] Implement IPLT code sequence for non-preemptible IFUNC

Non-preemptible IFUNC are placed in in.iplt (.glink on EM_PPC64).  If
there is a non-GOT non-PLT relocation, for pointer equality, we change
the type of the symbol from STT_IFUNC to STT_FUNC and bind it to the
.glink entry.

On EM_386, EM_X86_64, EM_ARM, and EM_AARCH64, the PLT code sequence
loads the address from its associated .got.plt slot. An IPLT also has an
associated .got.plt slot and can use the same code sequence.

On EM_PPC64, the PLT code sequence is actually a bl instruction in
.glink. It jumps to `__glink_PLTresolve` (the PLT header), and
`__glink_PLTresolve` computes the .plt slot (relocated by
R_PPC64_JUMP_SLOT).

An IPLT does not have an associated R_PPC64_JUMP_SLOT, so we cannot use
`bl` in .iplt. Instead, create a call stub which has a code
sequence similar to PPC64PltCallStub. We don't save the TOC pointer, so such
scenarios will not work: a function pointer to a non-preemptible ifunc,
which resolves to a function defined in another DSO. This is the
restriction described by https://sourceware.org/glibc/wiki/GNU_IFUNC
(though on many architectures it works in practice):

  Requirement (a): Resolver must be defined in the same translation unit as the implementations.

If the address of an ifunc is taken but it is never called, technically we
don't need an entry for it, but we currently create one.

This patch makes

  // clang -fuse-ld=lld -fno-pie -no-pie a.c
  // clang -fuse-ld=lld -fPIE -pie a.c
  #include <stdio.h>
  static void impl(void) { puts("meow"); }
  void thefunc(void) __attribute__((ifunc("resolver")));
  void *resolver(void) { return &impl; }
  int main(void) {
    thefunc();
    void (*theptr)(void) = &thefunc;
    theptr();
  }

work on Linux glibc and FreeBSD. Calling a function pointer pointing to
a Non-preemptible IFUNC never worked before.

Differential Revision: https://reviews.llvm.org/D71509

4 years ago[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm
Fangrui Song [Mon, 30 Dec 2019 04:17:31 +0000 (20:17 -0800)]
[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm

Indirect C_Immediate or C_Other constraints have been excluded.

Also simplify an unneeded change to indirect 'X' by D60942.

4 years ago[CMake] Added remote test execution support into CrossWinToARMLinux CMake cache file.
Vladimir Vereschaka [Tue, 17 Dec 2019 20:44:26 +0000 (12:44 -0800)]
[CMake] Added remote test execution support into CrossWinToARMLinux CMake cache file.

Added two configuration arguments to provide a host name and SSH user name
to run the tests on the remote target host.

* REMOTE_TEST_HOST  - remote host name or address.
* REMOTE_TEST_USER  - passwordless SSH account name.

Differential Revision: https://reviews.llvm.org/D71625

4 years ago[PowerPC] Exploit the rlwinm instructions for "and" with constant
QingShan Zhang [Mon, 30 Dec 2019 03:18:31 +0000 (03:18 +0000)]
[PowerPC] Exploit the rlwinm instructions for "and" with constant

Currently, PowerPC uses several instructions to materialize the constant and "and" it in the following case:

define i32 @test1(i32 %a) {
  %and = and i32 %a, -2
  ret i32 %and
}

However, we could exploit it with the rotate mask instructions.
               MB  ME
+----------------------+
|xxxxxxxxxxx00011111000|
+----------------------+
 0         32         64
Note that we can only do it if the MB is larger than 32 and MB <= ME, as
RLWINM will replace the content of [0 - 32) with [32 - 64) even if we don't rotate it.
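
To see why an "and" with a constant such as -2 fits a rotate-and-mask instruction, the sketch below models rlwinm on a plain 32-bit word. It uses 32-bit PowerPC bit numbering (bit 0 is the most significant bit, so MB and ME range over 0..31), unlike the 64-bit numbering in the diagram above; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Mask with bits mb..me set, where bit 0 is the most significant bit of
// the 32-bit word. Requires mb <= me <= 31.
uint32_t rlwinmMask(unsigned mb, unsigned me) {
  assert(mb <= me && me <= 31);
  return (0xFFFFFFFFu >> mb) & (0xFFFFFFFFu << (31 - me));
}

// Model of "rlwinm rD, rS, sh, mb, me" on a 32-bit value:
// rotate left by sh, then AND with the mb..me mask.
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
  sh &= 31;
  uint32_t rotated = sh == 0 ? rs : (rs << sh) | (rs >> (32 - sh));
  return rotated & rlwinmMask(mb, me);
}
```

In this model the constant -2 (0xFFFFFFFE) is the mask for MB = 0, ME = 30, so `and i32 %a, -2` corresponds to a rotate by zero with that mask.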

Differential Revision: https://reviews.llvm.org/D71829

4 years ago[X86] Use APInt::isOneValue and ConstantSDNode::isOne. NFC
Craig Topper [Mon, 30 Dec 2019 00:41:28 +0000 (16:41 -0800)]
[X86] Use APInt::isOneValue and ConstantSDNode::isOne. NFC

These are implemented slightly more efficiently than comparing
to 1 in the case that the value is more than 64 bits.

4 years ago[X86] Use isOneConstant to simplify some code. NFC
Craig Topper [Mon, 30 Dec 2019 00:22:10 +0000 (16:22 -0800)]
[X86] Use isOneConstant to simplify some code. NFC

4 years ago[X86] Remove dyn_casts to ConstantSDNode for operand 1 of X86ISD::VSRLI/VSRAI/VSRLI...
Craig Topper [Mon, 30 Dec 2019 00:19:32 +0000 (16:19 -0800)]
[X86] Remove dyn_casts to ConstantSDNode for operand 1 of X86ISD::VSRLI/VSRAI/VSRLI. Use getConstantOperandVal and APInt operations.

These nodes should only ever be formed with an i8 TargetConstant
so we don't need to check for it to be a constant. It's also
always 8-bits so we don't need to use APInt compare functions.

4 years ago[SelectionDAG] Disallow indirect "i" constraint
Fangrui Song [Sun, 29 Dec 2019 23:53:46 +0000 (15:53 -0800)]
[SelectionDAG] Disallow indirect "i" constraint

This allows us to delete InlineAsm::Constraint_i workarounds in
SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and
TargetLowering::getInlineAsmMemConstraint overrides.

They were introduced to X86 in r237517 to prevent crashes for
constraints like "=*imr". They were later copied to other targets.

4 years ago[lldb][NFC] Simplify ClangASTContext::GetTypeForDecl
Raphael Isemann [Sun, 29 Dec 2019 22:01:53 +0000 (23:01 +0100)]
[lldb][NFC] Simplify ClangASTContext::GetTypeForDecl

Also removes the GetASTContext call from this code.

4 years ago[lldb][NFC] Make integer types functions in ClangASTContext not static
Raphael Isemann [Sun, 29 Dec 2019 20:28:31 +0000 (21:28 +0100)]
[lldb][NFC] Make integer types functions in ClangASTContext not static

These functions need a ClangASTContext instance that we would otherwise
recalculate by calling GetASTContext (which is no longer necessary with
this patch).

4 years agoFix formatting in previous commits
Stephen Kelly [Sun, 29 Dec 2019 19:41:07 +0000 (19:41 +0000)]
Fix formatting in previous commits

4 years ago[lldb][NFC] Delete static versions of ClangASTContext::CreateFunctionType
Raphael Isemann [Sun, 29 Dec 2019 19:10:29 +0000 (20:10 +0100)]
[lldb][NFC] Delete static versions of ClangASTContext::CreateFunctionType

We can always call the member function version of this function.

4 years ago[X86] Make the AVX1 check lines in vec-strict-inttofp-256.ll test 'avx' instead of...
Craig Topper [Sun, 29 Dec 2019 19:11:26 +0000 (11:11 -0800)]
[X86] Make the AVX1 check lines in vec-strict-inttofp-256.ll test 'avx' instead of 'avx2'. Add AVX2 checks. NFC

4 years ago[mlir] Update mlir/CMakeLists.txt to install *.td files
Kern Handa [Sun, 29 Dec 2019 17:04:09 +0000 (18:04 +0100)]
[mlir] Update mlir/CMakeLists.txt to install *.td files

Currently when you build the `install` target, TableGen files don't get
installed.

TableGen files are needed when authoring new MLIR dialects, but right
now they're missing when using the pre-built binaries.

Differential Revision: https://reviews.llvm.org/D71958

4 years ago[lldb][NFC] Remove most GetASTContext calls in AST metadata code
Raphael Isemann [Sat, 28 Dec 2019 22:37:53 +0000 (23:37 +0100)]
[lldb][NFC] Remove most GetASTContext calls in AST metadata code

4 years agoFix use of named values surrounded by newlines in clang-query
Stephen Kelly [Sun, 29 Dec 2019 14:51:22 +0000 (14:51 +0000)]
Fix use of named values surrounded by newlines in clang-query

4 years agoFix newline handling in clang-query parser
Stephen Kelly [Sun, 29 Dec 2019 14:48:47 +0000 (14:48 +0000)]
Fix newline handling in clang-query parser

Don't prematurely remove characters from the end of the string

4 years agoFix handling of newlines in clang-query
Stephen Kelly [Sun, 29 Dec 2019 14:38:33 +0000 (14:38 +0000)]
Fix handling of newlines in clang-query

Replace assert with diagnostic for missing newline.

4 years ago[Attributor] AAUndefinedBehavior: Check for branches on undef value.
Hideto Ueno [Sun, 29 Dec 2019 08:34:08 +0000 (17:34 +0900)]
[Attributor] AAUndefinedBehavior: Check for branches on undef value.

A branch is considered UB if it depends on an undefined / uninitialized value.
At this point this handles simple UB branches in the form: `br i1 undef, ...`
We query `AAValueSimplify` to get a value for the branch condition, so the branch
can be more complicated than just: `br i1 undef, ...`.

Patch By: Stefanos Baziotis (@baziotis)

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D71799

4 years ago[X86] Stop accidentally custom type legalizing v4i32->v4f32 on SSE1 only targets.
Craig Topper [Sun, 29 Dec 2019 07:09:48 +0000 (23:09 -0800)]
[X86] Stop accidentally custom type legalizing v4i32->v4f32 on SSE1 only targets.

We had a Custom operation action for v4i32 on SSE1. But since
v4i32 isn't legal until SSE2 this was not what was intended. The
code that gets executed was intended for op legalization and
creates a bunch of v4i32 nodes that all end up scalarized.

4 years ago[LV] Use getMask() when printing recipe [NFCI]
Gil Rapaport [Sat, 28 Dec 2019 17:59:31 +0000 (19:59 +0200)]
[LV] Use getMask() when printing recipe [NFCI]

Use dedicated API for getting the mask instead of duplicating it.

Differential Revision: https://reviews.llvm.org/D71964

4 years ago[X86] Remove a redundant (scalar_to_vector (extract_vector_elt X))) in LowerUINT_TO_F...
Craig Topper [Sun, 29 Dec 2019 05:49:16 +0000 (21:49 -0800)]
[X86] Remove a redundant (scalar_to_vector (extract_vector_elt X))) in LowerUINT_TO_FP_i32. NFCI

4 years ago[X86] Fix -enable-machine-outliner for x86-32 after D48683
Fangrui Song [Sun, 29 Dec 2019 01:25:12 +0000 (17:25 -0800)]
[X86] Fix -enable-machine-outliner for x86-32 after D48683

D48683 accidentally disabled -enable-machine-outliner for x86-32.

4 years ago[mlir] Fix the wrong computation of dynamic strides for lowering AllocOp to LLVM
Tung Le Duc [Tue, 24 Dec 2019 07:02:10 +0000 (16:02 +0900)]
[mlir] Fix the wrong computation of dynamic strides for lowering AllocOp to LLVM

Leftover change from before the MLIR merge, reviewed and accepted at
https://github.com/tensorflow/mlir/pull/338.
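
The message does not say what was wrong with the computation. For orientation only, here is a sketch of what strides look like for a contiguous row-major buffer whose sizes are only known at run time. It assumes the default (identity) layout, and `rowMajorStrides` is a hypothetical helper, not a function from the MLIR code base.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes strides (in elements) for a contiguous row-major buffer.
// The innermost dimension has stride 1; every other stride is the product
// of the sizes of all dimensions to its right.
std::vector<std::int64_t>
rowMajorStrides(const std::vector<std::int64_t> &Sizes) {
  std::vector<std::int64_t> Strides(Sizes.size(), 1);
  std::int64_t Running = 1;
  // Walk from the innermost dimension outwards, accumulating the product.
  for (std::size_t I = Sizes.size(); I-- > 0;) {
    Strides[I] = Running;
    Running *= Sizes[I];
  }
  return Strides;
}
```

For sizes `{2, 3, 4}` this yields strides `{12, 4, 1}`. When a size is dynamic, the same products have to be formed from run-time values rather than folded into constants.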

4 years agoRevert "[COFF] Make the autogenerated .weak.<name>.default symbols static"
Martin Storsjö [Sat, 28 Dec 2019 21:38:41 +0000 (23:38 +0200)]
Revert "[COFF] Make the autogenerated .weak.<name>.default symbols static"

This reverts commit 7ca86ee6494d4307333b300bae80e42df4a5140f.

Apparently this change causes MS link.exe to error out with
"LNK1235: corrupt or invalid COFF symbol table".

4 years ago[lldb][NFC] Remove GetASTContext call in ClangPersistentVariables
Raphael Isemann [Sat, 28 Dec 2019 21:00:27 +0000 (22:00 +0100)]
[lldb][NFC] Remove GetASTContext call in ClangPersistentVariables

We try to build a CompilerType from the persistent decls so we need
a ClangASTContext. With this patch the ClangPersistentVariables store
the associated ClangASTContext of the persistent decls (which is
always the scratch ClangASTContext) and no longer call GetASTContext
to map back from clang::ASTContext to ClangASTContext.

4 years agoAllow redeclaration of __declspec(uuid)
Zachary Henkel [Sat, 28 Dec 2019 21:06:13 +0000 (13:06 -0800)]
Allow redeclaration of __declspec(uuid)

MSVC allows a subsequent declaration of a uuid attribute on a
struct/class. Mirror this behavior in clang-cl.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D71439

4 years ago[COFF] Make the autogenerated .weak.<name>.default symbols static
Martin Storsjö [Thu, 19 Dec 2019 08:01:40 +0000 (10:01 +0200)]
[COFF] Make the autogenerated .weak.<name>.default symbols static

If we have references to the same extern_weak in multiple objects,
all of them would generate external symbols with the same name. Make
them static to avoid duplicate definitions; nothing should need to
refer to this symbol outside of the current object.

GCC/binutils seem to handle the same case by not using a fixed string
for the ".default" suffix, but instead using the name of some other
defined external symbol from the same object (which is supposed to
be unique among objects unless there are other duplicate definitions).

Differential Revision: https://reviews.llvm.org/D71711

4 years ago[CMake] Fix lld detection after D69685
Fangrui Song [Sat, 28 Dec 2019 00:01:03 +0000 (16:01 -0800)]
[CMake] Fix lld detection after D69685

D69685 actually broke lld detection for my build (probably due to CMake
processing order).

Before:

```
build projects/compiler-rt/lib/sanitizer_common/tests/Sanitizer-x86_64-Test-Nolibc: ... bin/clang || ...
```

After:

```
build projects/compiler-rt/lib/sanitizer_common/tests/Sanitizer-x86_64-Test-Nolibc: ... bin/clang bin/lld || ...
```

Differential Revision: https://reviews.llvm.org/D71950

4 years ago[X86] Add test cases for v4i64->v4f32 and v8i64->v8f32 strict_sint_to_fp/strict_uint_...
Craig Topper [Sat, 28 Dec 2019 19:17:49 +0000 (11:17 -0800)]
[X86] Add test cases for v4i64->v4f32 and v8i64->v8f32 strict_sint_to_fp/strict_uint_to_fp to vec-strict-inttofp-256.ll and vec-strict-inttofp-512.ll. NFC

4 years agoFix bots after a9ad65a2b34f
Nemanja Ivanovic [Sat, 28 Dec 2019 19:07:18 +0000 (13:07 -0600)]
Fix bots after a9ad65a2b34f

In the last commit, I neglected to initialize the new subtarget feature
I added which caused failures on a few bots. This should fix that.

4 years ago[PowerPC] Change default for unaligned FP access for older subtargets
Nemanja Ivanovic [Sat, 28 Dec 2019 17:20:36 +0000 (11:20 -0600)]
[PowerPC] Change default for unaligned FP access for older subtargets

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554

Some CPUs trap to the kernel on unaligned floating point access and there are
kernels that do not handle the interrupt. The program then fails with a SIGBUS
according to the PR. This just switches the default for unaligned access to only
allow it on recent server CPUs that are known to allow this.

Differential revision: https://reviews.llvm.org/D71954

4 years agoSimplifyDemandedBits - Remove duplicate getOperand() call. NFC.
Simon Pilgrim [Sat, 28 Dec 2019 16:41:47 +0000 (16:41 +0000)]
SimplifyDemandedBits - Remove duplicate getOperand() call. NFC.

Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.

4 years agoFix crash in getFullyQualifiedName for inline namespace
Alexey Bader [Sat, 28 Dec 2019 13:13:33 +0000 (16:13 +0300)]
Fix crash in getFullyQualifiedName for inline namespace

Summary: The ICE happens when the outermost namespace is an inline namespace.
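
The reproducer is not part of this message. As a sketch of the shape of code involved, the snippet below declares an inline namespace as the outermost namespace; the names `v1`, `geometry`, `Point` and `manhattan` are made up for illustration.

```cpp
#include <type_traits>

// An inline namespace at the outermost level: its members are also visible
// in the enclosing (here: global) scope.
inline namespace v1 {
namespace geometry {
struct Point {
  int x;
  int y;
};

// Sum of the coordinates; assumes both are non-negative.
inline int manhattan(const Point &P) { return P.x + P.y; }
} // namespace geometry
} // namespace v1

// Both spellings name the same type, so a fully qualified name may either
// include or omit the inline namespace.
static_assert(std::is_same<geometry::Point, v1::geometry::Point>::value,
              "inline namespace members are reachable without the prefix");
```

Code that walks the enclosing declaration contexts to print a qualified name has to cope with there being no non-inline namespace above `v1`.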

Reviewers: bkramer, ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: ebevhan, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71962

4 years ago[lldb][NFC] Remove GetASTContext call in ClangDeclVendor
Raphael Isemann [Sat, 28 Dec 2019 13:35:07 +0000 (14:35 +0100)]
[lldb][NFC] Remove GetASTContext call in ClangDeclVendor

Instead of returning NamedDecls and then calling GetASTContext
to map back to the ClangASTContext we used, we can just implement the
FindDecl variant that returns CompilerDecls (and implement the
other function by throwing away the ClangASTContext part of the
compiler decl).

4 years ago[PowerPC] Modify the hasSideEffects of some VSX instructions from 1 to 0
Kang Zhang [Sat, 28 Dec 2019 09:04:54 +0000 (09:04 +0000)]
[PowerPC] Modify the hasSideEffects of some VSX instructions from 1 to 0

Summary:
If the hasSideEffects bit is not set in our td file, `llvm-tblgen`
sets it to true for instructions that have no match pattern.
The 6 instructions below don't set the hasSideEffects flag and don't have a match
pattern, so llvm-tblgen sets their hasSideEffects flag to true.

In fact, these instructions don't modify any special register and don't have
other side effects, so they shouldn't be marked as having side effects.
This patch changes hasSideEffects for the instructions below from 1 to 0.

```
VEXTUHLX
VEXTUHRX
VEXTUWLX
VEXTUWRX
VSPLTBs
VSPLTHs
```

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D71391

4 years ago[TargetLowering] Update comment to reference the correct compiler-rt function the...
Craig Topper [Sat, 28 Dec 2019 02:33:16 +0000 (18:33 -0800)]
[TargetLowering] Update comment to reference the correct compiler-rt function the code is based on. NFC

4 years ago[mlir] Merge the successor operand count into BlockOperand.
River Riddle [Sat, 28 Dec 2019 04:33:53 +0000 (20:33 -0800)]
[mlir] Merge the successor operand count into BlockOperand.

Summary: The successor operand counts are directly tied to block operands anyway, and this simplifies the trailing objects of Operation (i.e. one less computation to perform).

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D71949

4 years ago[ADT] Fix FoldingSet documentation typos
Brian Gesiak [Sat, 28 Dec 2019 02:21:05 +0000 (21:21 -0500)]
[ADT] Fix FoldingSet documentation typos

* "If found then M with be non-NULL" should be "will be non-NULL".
* The documentation examples (1) and (2) declare and use a variable
  `MyNode *M`, but examples (3) and (4) switch midway to using a
  variable named `N`. Unify the examples to all use `M`.
* The examples demonstrate the use of member functions of
  `FoldingSet`, but (3) and (4) invoke these as if they were free
  functions. Modify them to call member functions on the `MyFoldingSet`
  object constructed in the code above example (1).

4 years agoDelete setjmp_undefined_for_msvc workaround after llvm.setjmp was removed
Fangrui Song [Sat, 28 Dec 2019 02:07:03 +0000 (18:07 -0800)]
Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removed

4 years ago[Intrinsic] Delete tablegen rules of llvm.{sig,}{setjmp,longjmp}
Fangrui Song [Sat, 28 Dec 2019 01:53:16 +0000 (17:53 -0800)]
[Intrinsic] Delete tablegen rules of llvm.{sig,}{setjmp,longjmp}

4 years agolld: Remove explicit copy ops from AssociatedIterator, relying on implicit operators
David Blaikie [Sat, 28 Dec 2019 01:27:20 +0000 (17:27 -0800)]
lld: Remove explicit copy ops from AssociatedIterator, relying on implicit operators
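
A minimal sketch of the general pattern, not lld's actual class: an iterator that only wraps a pointer needs no user-written copy constructor or copy assignment, because the implicitly generated ones copy the pointer member. `Node`, `NodeIterator` and `sumList` are hypothetical names.

```cpp
#include <type_traits>

// A singly linked node, used only to give the iterator something to walk.
struct Node {
  int Value;
  Node *Next;
};

// Forward iterator over a Node chain. No copy constructor or copy assignment
// operator is declared; the compiler supplies both.
class NodeIterator {
public:
  explicit NodeIterator(Node *N) : Cur(N) {}

  int &operator*() const { return Cur->Value; }
  NodeIterator &operator++() {
    Cur = Cur->Next;
    return *this;
  }
  bool operator==(const NodeIterator &O) const { return Cur == O.Cur; }
  bool operator!=(const NodeIterator &O) const { return !(*this == O); }

private:
  Node *Cur;
};

// The implicit operations exist and are trivial for this class.
static_assert(std::is_trivially_copyable<NodeIterator>::value,
              "implicit copy operations are sufficient");

// Adds up every value reachable from Head.
inline int sumList(Node *Head) {
  int Total = 0;
  for (NodeIterator It(Head), End(nullptr); It != End; ++It)
    Total += *It;
  return Total;
}
```

Copies made with `NodeIterator J = I;` or `I = J;` behave as expected without any hand-written operator.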

4 years agoDebugInfo: Fix rangesBaseAddress DICompileUnit bitcode serialization/deserialization
David Blaikie [Sat, 28 Dec 2019 01:18:50 +0000 (17:18 -0800)]
DebugInfo: Fix rangesBaseAddress DICompileUnit bitcode serialization/deserialization

Follow-up to r346788 review feedback from Adrian Prantl.

4 years agoAMDGPU: Adjust test so it will work with GlobalISel
Matt Arsenault [Sat, 28 Dec 2019 00:35:00 +0000 (19:35 -0500)]
AMDGPU: Adjust test so it will work with GlobalISel

This is mostly a workaround for not handling the mubuf store path yet.

4 years ago[ELF] Improve the condition to create .interp
Fangrui Song [Fri, 27 Dec 2019 21:24:41 +0000 (13:24 -0800)]
[ELF] Improve the condition to create .interp

This restores commit 1417558e4a61794347c6bfbafaff7cd96985b2c3 and its follow-up, reverted by commit c3dbd782f1e0578c7ebc342f2e92f54d9644cff7.

After this commit:

clang -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created
clang -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created

clang -fuse-ld=gold -no-pie -nostdlib a.c => .interp not created
clang -fuse-ld=gold -pie -fPIE -nostdlib a.c => .interp created

clang -fuse-ld=lld -no-pie -nostdlib a.c => .interp created
clang -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp created

4 years ago[sanitizer] Link Sanitizer-x86_64-Test-Nolibc with -static
Fangrui Song [Fri, 27 Dec 2019 23:08:10 +0000 (15:08 -0800)]
[sanitizer] Link Sanitizer-x86_64-Test-Nolibc with -static

Pass -static so that clang will not pass -Wl,--dynamic-linker,... to the
linker. The test is not expected to run under a ld.so. (Technically it
works under a ld.so but glibc expects to see a PT_DYNAMIC. lld
intentionally does not follow GNU ld's complex rules regarding
PT_DYNAMIC.)

This allows commit 1417558e4a61794347c6bfbafaff7cd96985b2c3 to be
relanded.

4 years agoAMDGPU/GlobalISel: Use SReg_32 for readfirstlane constraining
Matt Arsenault [Fri, 27 Dec 2019 22:41:16 +0000 (17:41 -0500)]
AMDGPU/GlobalISel: Use SReg_32 for readfirstlane constraining

This matches the DAG behavior where we don't use SReg_32_XM0
everywhere anymore, and fixes not coalescing the copies into m0.