platform/upstream/llvm.git
4 years agoFix frame pointer layout on AArch64 Linux.
Owen Anderson [Wed, 26 Aug 2020 16:09:28 +0000 (16:09 +0000)]
Fix frame pointer layout on AArch64 Linux.

When floating point callee-saved registers were used, the frame pointer would
incorrectly point to the bottom of the CSR space (containing saved floating-point
registers), rather than to the frame record.

While all frame offsets were calculated consistently, resulting in working code,
this prevented stack walkers from being about to traverse the frame list.

4 years ago[LV] Fallback strategies if tail-folding fails
Sjoerd Meijer [Wed, 26 Aug 2020 15:55:25 +0000 (16:55 +0100)]
[LV] Fallback strategies if tail-folding fails

This implements 2 different vectorisation fallback strategies if tail-folding
fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue. This
can be controlled with option -prefer-predicate-over-epilogue, that has been
changed to take a numeric value corresponding to the tail-folding preference
and preferred fallback.

Patch by: Pierre van Houtryve, Sjoerd Meijer.

Differential Revision: https://reviews.llvm.org/D79783

4 years ago[LiveDebugValues][NFC] Add instr-ref tests, adapt old tests
Jeremy Morse [Tue, 25 Aug 2020 15:43:47 +0000 (16:43 +0100)]
[LiveDebugValues][NFC] Add instr-ref tests, adapt old tests

This patch adds a few tests in DebugInfo/MIR/InstrRef/ of interesting
behaviour that the instruction referencing implementation of
LiveDebugValues has. Mostly, these tests exist to ensure that if you
give the "-experimental-debug-variable-locations" command line switch,
the right implementation runs; and to ensure it behaves the same way as
the VarLoc LiveDebugValues implementation.

I've also touched roughly 30 other tests, purely to make the tests less
rigid about what output to accept. DBG_VALUE instructions are usually
printed with a trailing !debug-location indicating its scope:

  !debug-location !1234

However InstrRefBasedLDV produces new DebugLoc instances on the fly,
meaning there sometimes isn't a numbered node when they're printed,
making the output:

  !debug-location !DILocation(line: 0, blah blah)

Which causes a ton of these tests to fail. This patch removes checks for
that final part of each DBG_VALUE instruction. None of them appear to
be actually checking the scope is correct, just that it's present, so
I don't believe there's any loss in coverage here.

Differential Revision: https://reviews.llvm.org/D83054

4 years ago[libc++] Use xcrun to find Ninja in the macOS backdeployment CI too
Louis Dionne [Wed, 26 Aug 2020 15:32:40 +0000 (11:32 -0400)]
[libc++] Use xcrun to find Ninja in the macOS backdeployment CI too

4 years ago[Clang] Fix tests following rG087047144210
Alexandre Ganea [Wed, 26 Aug 2020 15:32:17 +0000 (11:32 -0400)]
[Clang] Fix tests following rG087047144210

4 years ago[clang][NFC] Properly fix a GCC warning in ASTImporterTest.cpp
Raphael Isemann [Wed, 26 Aug 2020 14:37:35 +0000 (16:37 +0200)]
[clang][NFC] Properly fix a GCC warning in ASTImporterTest.cpp

Follow up to c9b45ce1fd97531c228e092bedee719b971f82a3 which just defined
the function instead of just 'using' the function from the base class (thanks
David).

4 years ago[clangd] Use string[] for allCommitCharacters
Kirill Bobyrev [Wed, 26 Aug 2020 15:08:00 +0000 (17:08 +0200)]
[clangd] Use string[] for allCommitCharacters

As per LSP specification, allCommitCharacters should be string[] instead of
string:

https://microsoft.github.io/language-server-protocol/specification#textDocument_completion

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D86604

4 years ago[libc++] Always run Ninja through xcrun in the macOS CI scripts
Louis Dionne [Wed, 26 Aug 2020 14:44:38 +0000 (10:44 -0400)]
[libc++] Always run Ninja through xcrun in the macOS CI scripts

Ninja isn't installed by default on OSX, so run it through xcrun to find
the one in the developer tools if needed.

4 years ago[clangd] Enable recovery-ast-type by default.
Haojian Wu [Wed, 26 Aug 2020 14:46:24 +0000 (16:46 +0200)]
[clangd] Enable recovery-ast-type by default.

Differential Revision: https://reviews.llvm.org/D86602

4 years agoBump -len_control value in fuzzer-custommutator.test (PR47286)
Hans Wennborg [Wed, 26 Aug 2020 14:44:55 +0000 (16:44 +0200)]
Bump -len_control value in fuzzer-custommutator.test (PR47286)

to make the test more stable, as suggested by mmoroz.

4 years agoFix failing tests after VCTOOLSDIR change
Zachary Henkel [Wed, 26 Aug 2020 14:30:33 +0000 (16:30 +0200)]
Fix failing tests after VCTOOLSDIR change

Switch from hardcoded x64 arch to a regex in the target triple

Differential revision: https://reviews.llvm.org/D86622

4 years ago[lldb][NFC] Simplify string literal in GDBRemoteCommunicationClient
Raphael Isemann [Wed, 26 Aug 2020 09:49:28 +0000 (11:49 +0200)]
[lldb][NFC] Simplify string literal in GDBRemoteCommunicationClient

4 years ago[AMDGPU] Make more use of Subtarget reference in SIInstrInfo
Jay Foad [Wed, 26 Aug 2020 13:53:41 +0000 (14:53 +0100)]
[AMDGPU] Make more use of Subtarget reference in SIInstrInfo

4 years ago[OpenMP] Fix build on macOS sdk 10.12 and newer
AndreyChurbanov [Wed, 26 Aug 2020 13:52:46 +0000 (16:52 +0300)]
[OpenMP] Fix build on macOS sdk 10.12 and newer

Patch by nihui (Ni Hui)

Differential Revision: https://reviews.llvm.org/D76755

4 years ago[libc] Add implementations for sqrt, sqrtf, and sqrtl.
Tue Ly [Tue, 28 Jul 2020 05:41:36 +0000 (01:41 -0400)]
[libc] Add implementations for sqrt, sqrtf, and sqrtl.

Differential Revision: https://reviews.llvm.org/D84726

4 years ago[LegalizeTypes] Add ROTL/ROTR to ScalarizeVectorResult.
Jay Foad [Wed, 26 Aug 2020 08:41:23 +0000 (09:41 +0100)]
[LegalizeTypes] Add ROTL/ROTR to ScalarizeVectorResult.

We can scalarize these just like any other binary operation.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47303 caused by D77152.

Differential Revision: https://reviews.llvm.org/D86601

4 years ago[Support] Allow printing the stack trace only for a given depth
Dibya Ranjan Mishra [Wed, 26 Aug 2020 13:21:34 +0000 (09:21 -0400)]
[Support] Allow printing the stack trace only for a given depth

Differential Revision: https://reviews.llvm.org/D85458

4 years agoAMDGPU: Use Subtarget reference in SIInstrInfo
Matt Arsenault [Wed, 26 Aug 2020 13:15:25 +0000 (09:15 -0400)]
AMDGPU: Use Subtarget reference in SIInstrInfo

4 years agoAdd clang-cl "vctoolsdir" option to specify the location of the msvc toolchain
Zachary Henkel [Wed, 26 Aug 2020 12:58:39 +0000 (14:58 +0200)]
Add clang-cl "vctoolsdir" option to specify the location of the msvc toolchain

Add an option to directly specify where the msvc toolchain lives for
clang-cl and avoid unwanted file and registry probes.

Differential revision: https://reviews.llvm.org/D85998

4 years agoAMDGPU/GlobalISel: Tolerate negated control flow intrinsic outputs
Matt Arsenault [Tue, 25 Aug 2020 23:53:32 +0000 (19:53 -0400)]
AMDGPU/GlobalISel: Tolerate negated control flow intrinsic outputs

If the condition output is negated, swap the branch targets. This is
similar to what SelectionDAG does for when SelectionDAGBuilder
decides to invert the condition and swap the branches.

This is leaving behind a dead constant def for some reason.

4 years agoGlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD
Matt Arsenault [Sun, 16 Aug 2020 15:49:41 +0000 (11:49 -0400)]
GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD

This produces less work for addressing mode matching. I think this is
safe since I don't think machine IR is supposed to give the same
aliasing properties as getelementptr in the IR.

4 years ago[AMDGPU][GlobalISel] Eliminate barrier if workgroup size is not greater than wavefron...
Jay Foad [Wed, 26 Aug 2020 10:41:39 +0000 (11:41 +0100)]
[AMDGPU][GlobalISel] Eliminate barrier if workgroup size is not greater than wavefront size

If a workgroup size is known to be not greater than wavefront size
the s_barrier instruction is not needed since all threads are guaranteed
to come to the same point at the same time.

This is the same optimization that was implemented for SelectionDAG in
D31731.

Differential Revision: https://reviews.llvm.org/D86609

4 years ago[DWARFYAML] Make the unit_length and header_length fields optional.
Xing GUO [Wed, 26 Aug 2020 12:34:36 +0000 (20:34 +0800)]
[DWARFYAML] Make the unit_length and header_length fields optional.

This patch makes the unit_length and header_length fields of line tables
optional. yaml2obj is able to infer them for us.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86590

4 years ago[Scheduling] Implement a new way to cluster loads/stores
QingShan Zhang [Wed, 26 Aug 2020 12:26:21 +0000 (12:26 +0000)]
[Scheduling] Implement a new way to cluster loads/stores

Before calling target hook to determine if two loads/stores are clusterable,
we put them into different groups to avoid fake cluster due to dependency.
For now, we are putting the loads/stores into the same group if they have
the same predecessor. We assume that, if two loads/stores have the same
predecessor, it is likely that, they didn't have dependency for each other.

However, one SUnit might have several predecessors and for now, we just
pick up the first predecessor that has non-data/non-artificial dependency,
which is too arbitrary. And we are struggling to fix it.

So, I am proposing some better implementation.
1. Collect all the loads/stores that has memory info first to reduce the complexity.
2. Sort these loads/stores so that we can stop the seeking as early as possible.
3. For each load/store, seeking for the first non-dependency instruction with the
   sorted order, and check if they can cluster or not.

Reviewed By: Jay Foad

Differential Revision: https://reviews.llvm.org/D85517

4 years ago[mlir][PDL] Add a PDL Interpreter Dialect
River Riddle [Wed, 26 Aug 2020 12:12:07 +0000 (05:12 -0700)]
[mlir][PDL] Add a PDL Interpreter Dialect

The PDL Interpreter dialect provides a lower level abstraction compared to the PDL dialect, and is targeted towards low level optimization and interpreter code generation. The dialect operations encapsulates low-level pattern match and rewrite "primitives", such as navigating the IR (Operation::getOperand), creating new operations (OpBuilder::create), etc. Many of the operations within this dialect also fuse branching control flow with some form of a predicate comparison operation. This type of fusion reduces the amount of work that an interpreter must do when executing.

An example of this representation is shown below:

```mlir
// The following high level PDL pattern:
pdl.pattern : benefit(1) {
  %resultType = pdl.type
  %inputOperand = pdl.input
  %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
  pdl.rewrite %root {
    pdl.replace %root with (%inputOperand)
  }
}

// May be represented in the interpreter dialect as follows:
module {
  func @matcher(%arg0: !pdl.operation) {
    pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
  ^bb1:
    pdl_interp.return
  ^bb2:
    pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
  ^bb3:
    pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
  ^bb4:
    %0 = pdl_interp.get_operand 0 of %arg0
    pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
  ^bb5:
    %1 = pdl_interp.get_result 0 of %arg0
    pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
  ^bb6:
    pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
  }
  module @rewriters {
    func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
      pdl_interp.replace %arg1 with(%arg0)
      pdl_interp.return
    }
  }
}
```

Differential Revision: https://reviews.llvm.org/D84579

4 years ago[llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should.
Georgii Rymar [Tue, 25 Aug 2020 14:44:39 +0000 (17:44 +0300)]
[llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should.

Currently, `dyn_cast<XCOFFObjectFile>` always does cast and returns a pointer,
even when we pass `ELF`/`Wasm`/`Mach-O` or `COFF` instead of `XCOFF`.

It happens because `XCOFFObjectFile` class does not implement `classof`.
I've fixed it and added a unit test.

Differential revision: https://reviews.llvm.org/D86542

4 years ago[ARM] Increase MVE gather/scatter cost by MVECostFactor.
David Green [Wed, 26 Aug 2020 12:03:46 +0000 (13:03 +0100)]
[ARM] Increase MVE gather/scatter cost by MVECostFactor.

MVE Gather scatter codegeneration is looking a lot better than it used
to, but still has some issues. The instructions we currently model as 1
cycle per element, which is a bit low for some cases. Increasing the
cost by the MVECostFactor brings them in-line with our other instruction
costs. This will have the effect of only generating then when the extra
benefit is more likely to overcome some of the issues. Notably in
running out of registers and vectorizing loops that could otherwise be
SLP vectorized.

In the short-term whilst we look at other ways of dealing with those
more directly, we can increase the costs of gathers to make them more
likely to be beneficial when created.

Differential Revision: https://reviews.llvm.org/D86444

4 years ago[llvm-readobj][test] - Commit trivial.obj.elf-amdhsa-gfx803 binary back.
Georgii Rymar [Wed, 26 Aug 2020 11:56:40 +0000 (14:56 +0300)]
[llvm-readobj][test] - Commit trivial.obj.elf-amdhsa-gfx803 binary back.

It was removed in rGcbedbd12e9837e049f0a936636a82ff39b75692b by mistake.

4 years ago[llvm-readobj/elf][test] - Add testing for EM_* specific OS/ABI values.
Georgii Rymar [Fri, 21 Aug 2020 11:24:12 +0000 (14:24 +0300)]
[llvm-readobj/elf][test] - Add testing for EM_* specific OS/ABI values.

We have no tests for OS/ABI values specific to
EM_TI_C6000, ELFOSABI_AMDGPU_MESA3D and ELFOSABI_ARM machines.

Also, related arrays in the code are not grouped together.
(That is why such testing was missed I guess).

The patch fixes that all.

Differential revision: https://reviews.llvm.org/D86341

4 years ago[RDA] Don't visit the BB of the instruction in getReachingUniqueMIDef
Sam Tebbs [Wed, 26 Aug 2020 10:11:15 +0000 (11:11 +0100)]
[RDA] Don't visit the BB of the instruction in getReachingUniqueMIDef

If the basic block of the instruction passed to getUniqueReachingMIDef
is a transitive predecessor of itself and has a definition of the
register, the function will return that definition even if it is after
the instruction given to the function. This patch stops the function
from scanning the instruction's basic block to prevent this.

Differential Revision: https://reviews.llvm.org/D86607

4 years ago[gn build] Port 357dc1ed125
LLVM GN Syncbot [Wed, 26 Aug 2020 11:33:42 +0000 (11:33 +0000)]
[gn build] Port 357dc1ed125

4 years ago[libunwind] Convert x86, x86_64, arm64 register restore functions to C calling conven...
Martin Storsjö [Sun, 16 Aug 2020 12:12:55 +0000 (15:12 +0300)]
[libunwind] Convert x86, x86_64, arm64 register restore functions to C calling convention and name mangling

Currently, the assembly functions for restoring register state have
been direct implementations of the Registers_*::jumpto() method
(contrary to the functions for saving register state, which are
implementations of the extern C function __unw_getcontext). This has
included having the assembly function name match the C++ mangling of
that method name (and having the function match the C++ member
function calling convention). To simplify the interface of the assembly
implementations, make the functions have C calling conventions and
name mangling.

This fixes building the library in with a MSVC C++ ABI with clang-cl,
which uses a significantly different method name mangling scheme.
(The library might not be of much use as C++ exception unwinder in such
an environment, but the libunwind.h interface for stepwise unwinding
still is usable, as is the _Unwind_Backtrace function.)

Differential Revision: https://reviews.llvm.org/D86041

4 years agoCopy m_plan_is_for_signal_trap member.
Benjamin Kramer [Wed, 26 Aug 2020 11:25:41 +0000 (13:25 +0200)]
Copy m_plan_is_for_signal_trap member.

Otherwise it would stay uninitialized. Found by msan.

4 years ago[obj2yaml] - Cleanup error reporting (remove Error.cpp/.h files)
Georgii Rymar [Tue, 25 Aug 2020 11:55:25 +0000 (14:55 +0300)]
[obj2yaml] - Cleanup error reporting (remove Error.cpp/.h files)

This removes Error.cpp/.h files from obj2yaml.
These files are not needed because we are
using `Error`s instead of error codes widely and do
not need a logic related to obj2yaml specific
error codes anymore.

I had to adjust just a few lines of tool's code
to remove remaining dependencies.

Differential revision: https://reviews.llvm.org/D86536

4 years ago[lldb/DWARF] More DW_AT_const_value fixes
Pavel Labath [Fri, 21 Aug 2020 13:13:29 +0000 (15:13 +0200)]
[lldb/DWARF] More DW_AT_const_value fixes

This fixes several issues in handling of DW_AT_const_value attributes:
- the first is that the size of the data given by data forms does not
  need to match the size of the underlying variable. We already had the
  case to handle this for DW_FORM_(us)data -- this extends the handling
  to other data forms. The main reason this was not picked up is because
  clang uses leb forms in these cases while gcc prefers the fixed-size
  ones.
- The handling of DW_AT_strp form was completely broken -- we would end
  up using the pointer value as the result. I've reorganized this code
  so that it handles all string forms uniformly.
- In case of a completely bogus form we would crash due to
  strlen(nullptr).

Depends on D86311.

Differential Revision: https://reviews.llvm.org/D86348

4 years ago[llvm-readobj] - Don`t crash when --section-symbols is requested for an object w...
Georgii Rymar [Tue, 25 Aug 2020 10:37:32 +0000 (13:37 +0300)]
[llvm-readobj] - Don`t crash when --section-symbols is requested for an object w/o .symtab.

llvm-readobj crashes when `-S --section-symbols` is used
on an object that has no symbol table.

The patch fixes it.

Differential revision: https://reviews.llvm.org/D86520

4 years ago[llvm-readelf][test] - Refine the sections-ext.test
Georgii Rymar [Mon, 24 Aug 2020 13:24:36 +0000 (16:24 +0300)]
[llvm-readelf][test] - Refine the sections-ext.test

The `sections-ext.test` is a test for ELF that is used
to test `--st`, `--sr` and `--sd` extension options for `-S`.

There are 2 problems with it:
1) It is broken, because for CHECK lines it contains there is
no corresponding `FileCheck` call.

2) It uses the precompiled object: `trivial.obj.elf-i386`.
This is the last ELF test where `trivial.obj.elf-i386` is used so we can get
rid of the binary and use an YAML description.

Also, there is a `Inputs/trivial.ll` file that describes how `trivial*` objects
in `Inputs` folders are created. I've removed it from `ELF`, because it is not
actual anymore (we have no more input binaries created with the use of trivial.ll there)
and copied the refined versions of it to `COFF`, `MachO` and `wasm` Input folders.

Differential revision: https://reviews.llvm.org/D86462

4 years ago[SystemZ/ZOS] Additions to the build system.
Kai Nacke [Thu, 9 Jul 2020 12:17:33 +0000 (14:17 +0200)]
[SystemZ/ZOS] Additions to the build system.

This change extend the CMake files with the necessary additions
to build LLVM for z/OS.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D83866

4 years ago[lldb] Correct wording of EXP_MSG
David Spickett [Tue, 25 Aug 2020 15:50:03 +0000 (16:50 +0100)]
[lldb] Correct wording of EXP_MSG

EXP_MSG generates a message to show on assert
failure. Currently it looks like:
AssertionError: False is not True : '<cmd>'
returns expected result, got '<actual output>'

Which seems to say that the test failed but
also got the expected result.

It should say:
AssertionError: False is not True : '<cmd>'
returned unexpected result, got '<actual output>'

Reviewed By: teemperor, #lldb

Differential Revision: https://reviews.llvm.org/D86603

4 years ago[X86] Make sure we do not clobber RBX with mwaitx when used as a base
Pierre Gousseau [Wed, 15 Jan 2020 18:33:28 +0000 (18:33 +0000)]
[X86] Make sure we do not clobber RBX with mwaitx when used as a base
pointer.

mwaitx uses EBX as one of its argument.
Using this instruction clobbers RBX as it is defined to hold one of the
input. When the backend uses dynamically allocated stack, RBX is used as
a reserved register for the base pointer.

This patch is adapted from @qcolombet patch for cmpxchg at r263325.

This fixes PR43528.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D73475

4 years ago[GlobalISel] Fix and tidy up documentation for ValueMapping class (NFC)
Gabriel Hjort Åkerlund [Wed, 26 Aug 2020 08:53:02 +0000 (10:53 +0200)]
[GlobalISel] Fix and tidy up documentation for ValueMapping class (NFC)

The documentation was missing a '*/' in '/*<2x32-bit> vadd {0, 64, VPR}',
and the example code are now aligned to improve readability.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86201

4 years ago[TableGen][GlobalISel] Fix tblgen optimization bug
Gabriel Hjort Åkerlund [Wed, 26 Aug 2020 08:48:15 +0000 (10:48 +0200)]
[TableGen][GlobalISel] Fix tblgen optimization bug

When optimizing the table, PointerToAnyOperandMatchers would be
incorrectly reported as identical even though they have different
SizeInBits values. This bug was due to failing to overload the
isIdentical() method, which this patch addresses.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86199

4 years agoReland [IR] Intrinsics default attributes and opt-out flag
sstefan1 [Wed, 26 Aug 2020 08:37:48 +0000 (10:37 +0200)]
Reland [IR] Intrinsics default attributes and opt-out flag

Intrinsic properties can now be set to default and applied to all
intrinsics. If the attributes are not needed, the user can opt-out by
setting the DisableDefaultAttributes flag to true.

Differential Revision: https://reviews.llvm.org/D70365

4 years ago[AArch64][AsmParser] Fix bug in operand printer
Cullen Rhodes [Tue, 25 Aug 2020 12:27:15 +0000 (12:27 +0000)]
[AArch64][AsmParser] Fix bug in operand printer

The switch in AArch64Operand::print was changed in D45688 so the shift
can be printed after printing the register. This is implemented with
LLVM_FALLTHROUGH and was broken in D52485 when BTIHint was put between
the register and shift operands.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D86535

4 years ago[analyzer] Add modeling of assignment operator in smart ptr
Nithin Vadukkumchery Rajendrakumar [Wed, 26 Aug 2020 09:22:55 +0000 (11:22 +0200)]
[analyzer] Add modeling of assignment operator in smart ptr

Summary: Support for 'std::unique_ptr>::operator=' in SmartPtrModeling

Reviewers: NoQ, Szelethus, vsavchenko, xazax.hun

Reviewed By: NoQ, vsavchenko, xazax.hun

Subscribers: martong, cfe-commits
Tags: #clang

Differential Revision: https://reviews.llvm.org/D86293

4 years ago[AArch64][SVE] Fix calculation restore point for SVE callee saves.
Sander de Smalen [Wed, 19 Aug 2020 10:06:51 +0000 (11:06 +0100)]
[AArch64][SVE] Fix calculation restore point for SVE callee saves.

This fixes an issue where the restore point of callee-saves in the
function epilogues was incorrectly calculated when the basic block
consisted of only a RET instruction. This caused dealloc instructions
to be inserted in between the block of callee-save restore instructions,
rather than before it.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D86099

4 years ago[NFC] Fix some spelling errors in clang Driver Options.td
Yang Zhihui [Wed, 26 Aug 2020 08:28:50 +0000 (04:28 -0400)]
[NFC] Fix some spelling errors in clang Driver Options.td

Differential Revision: https://reviews.llvm.org/D86427

4 years ago[clangd] Compute the inactive code range for semantic highlighting.
Haojian Wu [Wed, 26 Aug 2020 08:50:31 +0000 (10:50 +0200)]
[clangd] Compute the inactive code range for semantic highlighting.

Differential Revision: https://reviews.llvm.org/D85635

4 years ago[Support] Speedup llvm-dwarfdump 3.9x
Jan Kratochvil [Wed, 26 Aug 2020 07:57:53 +0000 (09:57 +0200)]
[Support] Speedup llvm-dwarfdump 3.9x

Currently `strace llvm-dwarfdump x.debug >/tmp/file`:

  ioctl(1, TCGETS, 0x7ffd64d7f340)        = -1 ENOTTY (Inappropriate ioctl for device)
  write(1, "           DW_AT_decl_line\t(89)\n"..., 4096) = 4096
  ioctl(1, TCGETS, 0x7ffd64d7f400)        = -1 ENOTTY (Inappropriate ioctl for device)
  ioctl(1, TCGETS, 0x7ffd64d7f410)        = -1 ENOTTY (Inappropriate ioctl for device)
  ioctl(1, TCGETS, 0x7ffd64d7f400)        = -1 ENOTTY (Inappropriate ioctl for device)

After this patch:

  write(1, "0000000000001102 \"strlen\")\n     "..., 4096) = 4096
  write(1, "site\n                  DW_AT_low"..., 4096) = 4096
  write(1, "d53)\n\n0x000e4d4d:       DW_TAG_G"..., 4096) = 4096

The same speedup can be achieved by `--color=0` but that is not much convenient.

This implementation has been suggested by Joerg Sonnenberger.

Differential Revision: https://reviews.llvm.org/D86406

4 years ago[lldb] XFAIL TestMemoryHistory on Linux
Raphael Isemann [Wed, 26 Aug 2020 08:22:04 +0000 (10:22 +0200)]
[lldb] XFAIL TestMemoryHistory on Linux

This test appears to have never worked on Linux but it seems none of the current
bots ever ran this test as it required enabling compiler-rt (otherwise it
would have just been skipped).

This just copies over the XFAIL decorator that are already on all other sanitizer
tests.

4 years ago[SelectionDAG] Handle non-power-of-2 bitwidths in expandROT
Jay Foad [Mon, 24 Aug 2020 12:24:21 +0000 (13:24 +0100)]
[SelectionDAG] Handle non-power-of-2 bitwidths in expandROT

Differential Revision: https://reviews.llvm.org/D86449

4 years ago[mlir] Fix bug in block merging when the types of the operands differ
River Riddle [Wed, 26 Aug 2020 08:13:01 +0000 (01:13 -0700)]
[mlir] Fix bug in block merging when the types of the operands differ

The merging algorithm was previously not checking for type equivalence.

Fixes PR47314

Differential Revision: https://reviews.llvm.org/D86594

4 years ago[Attributor] Provide an edge-based interface in AAIsDead
Shinji Okumura [Wed, 26 Aug 2020 07:57:52 +0000 (16:57 +0900)]
[Attributor] Provide an edge-based interface in AAIsDead

This patch produces an edge-based interface in AAIsDead.
By this, we can query a set of basic blocks that are directly reachable from a given basic block.
This is specifically useful for implementation of AAReachability.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85547

4 years ago[SyntaxTree] Fix C++ versions on tests of `BuildTreeTest.cpp`
Eduardo Caldas [Wed, 26 Aug 2020 07:18:01 +0000 (07:18 +0000)]
[SyntaxTree] Fix C++ versions on tests of `BuildTreeTest.cpp`

Differential Revision: https://reviews.llvm.org/D86591

4 years ago[SyntaxTree] Add support for `CallExpression`
Eduardo Caldas [Tue, 25 Aug 2020 07:47:52 +0000 (07:47 +0000)]
[SyntaxTree] Add support for `CallExpression`

* Generate `CallExpression` syntax node for all semantic nodes inheriting from
`CallExpr` with call-expression syntax - except `CUDAKernelCallExpr`.
* Implement all the accessors
* Arguments of `CallExpression` have their own syntax node which is based on
the `List` base API

Differential Revision: https://reviews.llvm.org/D86544

4 years ago[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad
Roman Lebedev [Wed, 26 Aug 2020 06:08:24 +0000 (09:08 +0300)]
[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad

While since D86306 we do it's sibling fold for `insertvalue`,
we should also do this for `extractvalue`'s.

And unlike that one, the results here are, quite honestly, shocking,
as it can be observed here on vanilla llvm test-suite + RawSpeed results:

```
| statistic name                                     | baseline  | proposed  |       Δ |       % |    |%| |
|----------------------------------------------------|-----------|-----------|--------:|--------:|-------:|
| asm-printer.EmittedInsts                           | 7945095   | 7942507   |   -2588 |  -0.03% |  0.03% |
| assembler.ObjectBytes                              | 273209920 | 273069800 | -140120 |  -0.05% |  0.05% |
| early-cse.NumCSE                                   | 2183363   | 2183398   |      35 |   0.00% |  0.00% |
| early-cse.NumSimplify                              | 541847    | 550017    |    8170 |   1.51% |  1.51% |
| instcombine.NumAggregateReconstructionsSimplified  | 2139      | 108       |   -2031 | -94.95% | 94.95% |
| instcombine.NumCombined                            | 3601364   | 3635448   |   34084 |   0.95% |  0.95% |
| instcombine.NumConstProp                           | 27153     | 27157     |       4 |   0.01% |  0.01% |
| instcombine.NumDeadInst                            | 1694521   | 1765022   |   70501 |   4.16% |  4.16% |
| instcombine.NumPHIsOfExtractValues                 | 0         | 37546     |   37546 |   0.00% |  0.00% |
| instcombine.NumSunkInst                            | 63158     | 63686     |     528 |   0.84% |  0.84% |
| instcount.NumBrInst                                | 874304    | 871857    |   -2447 |  -0.28% |  0.28% |
| instcount.NumCallInst                              | 1757657   | 1758402   |     745 |   0.04% |  0.04% |
| instcount.NumExtractValueInst                      | 45623     | 11483     |  -34140 | -74.83% | 74.83% |
| instcount.NumInsertValueInst                       | 4983      | 580       |   -4403 | -88.36% | 88.36% |
| instcount.NumInvokeInst                            | 61018     | 59478     |   -1540 |  -2.52% |  2.52% |
| instcount.NumLandingPadInst                        | 35334     | 34215     |   -1119 |  -3.17% |  3.17% |
| instcount.NumPHIInst                               | 344428    | 331116    |  -13312 |  -3.86% |  3.86% |
| instcount.NumRetInst                               | 100773    | 100772    |      -1 |   0.00% |  0.00% |
| instcount.TotalBlocks                              | 1081154   | 1077166   |   -3988 |  -0.37% |  0.37% |
| instcount.TotalFuncs                               | 101443    | 101442    |      -1 |   0.00% |  0.00% |
| instcount.TotalInsts                               | 8890201   | 8833747   |  -56454 |  -0.64% |  0.64% |
| instsimplify.NumSimplified                         | 75822     | 75707     |    -115 |  -0.15% |  0.15% |
| simplifycfg.NumHoistCommonCode                     | 24203     | 24197     |      -6 |  -0.02% |  0.02% |
| simplifycfg.NumHoistCommonInstrs                   | 48201     | 48195     |      -6 |  -0.01% |  0.01% |
| simplifycfg.NumInvokes                             | 2785      | 4298      |    1513 |  54.33% | 54.33% |
| simplifycfg.NumSimpl                               | 997332    | 1018189   |   20857 |   2.09% |  2.09% |
| simplifycfg.NumSinkCommonCode                      | 7088      | 6464      |    -624 |  -8.80% |  8.80% |
| simplifycfg.NumSinkCommonInstrs                    | 15117     | 14021     |   -1096 |  -7.25% |  7.25% |
```
... which tells us that this new fold fires whopping 38k times,
increasing the amount of SimplifyCFG's `invoke`->`call` transforms by +54% (+1513) (again, D85787 did that last time),
decreasing total instruction count by -0.64% (-56454),
and sharply decreasing count of `insertvalue`'s (-88.36%, i.e. 9 times less)
and `extractvalue`'s (-74.83%, i.e. four times less).

This causes geomean -0.01% binary size decrease
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=size-text
and, ignoring `O0-g`, is a geomean -0.01%..-0.05% compile-time improvement
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=instructions

The other thing that tells is, is that while this is a massive win for `invoke`->`call` transform
`InstCombinerImpl::foldAggregateConstructionIntoAggregateReuse()` fold,
which is supposed to be dealing with such aggregate reconstructions,
fires a lot less now. There are two reasons why:
1. After this fold, as it can be seen in tests, we may (will) end up with trivially redundant PHI nodes.
   We don't CSE them in InstCombine presently, which means that EarlyCSE needs to run and then InstCombine rerun.
2. But then, EarlyCSE not only manages to fold such redundant PHI's,
   it also sees that the extract-insert chain recreates the original aggregate,
   and replaces it with the original aggregate.

The take-aways are
1. We maybe should do most trivial, same-BB PHI CSE in InstCombine
2. I need to check if what other patterns remain, and how they can be resolved.
   (i.e. i wonder if `foldAggregateConstructionIntoAggregateReuse()` might go away)

This is a reland of the original commit fcb51d8c2460faa23b71e06abb7e826243887dd6,
because originally i forgot to ensure that the base aggregate types match.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86530

4 years ago[NFC][InstCombine] Add a PHI-of-insertvalues test with different base aggregate types
Roman Lebedev [Wed, 26 Aug 2020 06:28:22 +0000 (09:28 +0300)]
[NFC][InstCombine] Add a PHI-of-insertvalues test with different base aggregate types

4 years agoAdd assertion in PatternRewriter::create<> to defend the same way as OpBuilder::creat...
Mehdi Amini [Wed, 26 Aug 2020 05:43:43 +0000 (05:43 +0000)]
Add assertion in PatternRewriter::create<> to defend the same way as OpBuilder::create<> against missing dialect registration (NFC)

The code would have failed a few line later, but that way the error
message is more clear/friendly to debug.

4 years agoAdjust assertion when casting to an unregistered operation
Mehdi Amini [Wed, 26 Aug 2020 05:08:59 +0000 (05:08 +0000)]
Adjust assertion when casting to an unregistered operation

This assertion does not achieve what it meant to do originally, as it
would fire only when applied to an unregistered operation, which is a
fairly rare circumstance (it needs a dialect or context allowing
unregistered operation in the input in the first place).
Instead we relax it to only fire when it should have matched but didn't
because of the misconfiguration.

Differential Revision: https://reviews.llvm.org/D86588

4 years ago[libc][NFC] For remquo quotient, compare only 3 bits of MPFR and libc results.
Siva Chandra Reddy [Wed, 26 Aug 2020 06:32:55 +0000 (23:32 -0700)]
[libc][NFC] For remquo quotient, compare only 3 bits of MPFR and libc results.

4 years ago[LLD][MinGW] Handle allow-multiple-definition flag
Mateusz Mikuła [Wed, 26 Aug 2020 06:25:52 +0000 (09:25 +0300)]
[LLD][MinGW] Handle allow-multiple-definition flag

Basically copied from ELF driver.

Differential Revision: https://reviews.llvm.org/D86512

4 years ago[LLD][MinGW] Cleanup Options.td file. NFC.
Mateusz Mikuła [Wed, 26 Aug 2020 06:24:37 +0000 (09:24 +0300)]
[LLD][MinGW] Cleanup Options.td file. NFC.

Based on ELF driver Options.td.

Differential Revision: https://reviews.llvm.org/D86509

4 years ago[MC] [Win64EH] Update the AArch64/seh.s test slightly. NFC.
Martin Storsjö [Thu, 20 Aug 2020 08:53:56 +0000 (11:53 +0300)]
[MC] [Win64EH] Update the AArch64/seh.s test slightly. NFC.

Update the comment stating the aim of the test - this is currently
only checking that these assembler directives doesn't cause the
assembler to fail, but the results of the testcase aren't particularly
correct yet.

Remove bits of the testcase that are even less likely to be found in
the wild (the .seh_startchained/.seh_endchained block), where the
testcase currently doesn't really generate anything interesting
anyway.

Differential Revision: https://reviews.llvm.org/D86524

4 years ago[llvm-readobj] Fix arm64 unwind opcode disassembly printing
Martin Storsjö [Mon, 24 Aug 2020 09:29:56 +0000 (12:29 +0300)]
[llvm-readobj] Fix arm64 unwind opcode disassembly printing

Add a missing minus, fix vertical alignment of instructions for one opcode.

Differential Revision: https://reviews.llvm.org/D86523

4 years ago[mlir][spirv] Infer converted type of scf.for from the init value
Thomas Raoux [Wed, 26 Aug 2020 06:19:09 +0000 (23:19 -0700)]
[mlir][spirv] Infer converted type of scf.for from the init value

Instead of using the TypeConverter infer the value of the alloca created based
on the init value. This will allow some ambiguous types like multidimensional
vectors to be converted correctly.

Differential Revision: https://reviews.llvm.org/D86582

4 years agoRevert "[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are...
Roman Lebedev [Wed, 26 Aug 2020 06:23:22 +0000 (09:23 +0300)]
Revert "[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad"

This reverts commit fcb51d8c2460faa23b71e06abb7e826243887dd6.

As buildbots report, there's apparently some missing check to ensure
that the types of incoming values match the type of PHI.
Let's revert for a moment.

4 years ago[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad
Roman Lebedev [Wed, 26 Aug 2020 06:08:24 +0000 (09:08 +0300)]
[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad

While since D86306 we do it's sibling fold for `insertvalue`,
we should also do this for `extractvalue`'s.

And unlike that one, the results here are, quite honestly, shocking,
as it can be observed here on vanilla llvm test-suite + RawSpeed results:

```
| statistic name                                     | baseline  | proposed  |       Δ |       % |    |%| |
|----------------------------------------------------|-----------|-----------|--------:|--------:|-------:|
| asm-printer.EmittedInsts                           | 7945095   | 7942507   |   -2588 |  -0.03% |  0.03% |
| assembler.ObjectBytes                              | 273209920 | 273069800 | -140120 |  -0.05% |  0.05% |
| early-cse.NumCSE                                   | 2183363   | 2183398   |      35 |   0.00% |  0.00% |
| early-cse.NumSimplify                              | 541847    | 550017    |    8170 |   1.51% |  1.51% |
| instcombine.NumAggregateReconstructionsSimplified  | 2139      | 108       |   -2031 | -94.95% | 94.95% |
| instcombine.NumCombined                            | 3601364   | 3635448   |   34084 |   0.95% |  0.95% |
| instcombine.NumConstProp                           | 27153     | 27157     |       4 |   0.01% |  0.01% |
| instcombine.NumDeadInst                            | 1694521   | 1765022   |   70501 |   4.16% |  4.16% |
| instcombine.NumPHIsOfExtractValues                 | 0         | 37546     |   37546 |   0.00% |  0.00% |
| instcombine.NumSunkInst                            | 63158     | 63686     |     528 |   0.84% |  0.84% |
| instcount.NumBrInst                                | 874304    | 871857    |   -2447 |  -0.28% |  0.28% |
| instcount.NumCallInst                              | 1757657   | 1758402   |     745 |   0.04% |  0.04% |
| instcount.NumExtractValueInst                      | 45623     | 11483     |  -34140 | -74.83% | 74.83% |
| instcount.NumInsertValueInst                       | 4983      | 580       |   -4403 | -88.36% | 88.36% |
| instcount.NumInvokeInst                            | 61018     | 59478     |   -1540 |  -2.52% |  2.52% |
| instcount.NumLandingPadInst                        | 35334     | 34215     |   -1119 |  -3.17% |  3.17% |
| instcount.NumPHIInst                               | 344428    | 331116    |  -13312 |  -3.86% |  3.86% |
| instcount.NumRetInst                               | 100773    | 100772    |      -1 |   0.00% |  0.00% |
| instcount.TotalBlocks                              | 1081154   | 1077166   |   -3988 |  -0.37% |  0.37% |
| instcount.TotalFuncs                               | 101443    | 101442    |      -1 |   0.00% |  0.00% |
| instcount.TotalInsts                               | 8890201   | 8833747   |  -56454 |  -0.64% |  0.64% |
| instsimplify.NumSimplified                         | 75822     | 75707     |    -115 |  -0.15% |  0.15% |
| simplifycfg.NumHoistCommonCode                     | 24203     | 24197     |      -6 |  -0.02% |  0.02% |
| simplifycfg.NumHoistCommonInstrs                   | 48201     | 48195     |      -6 |  -0.01% |  0.01% |
| simplifycfg.NumInvokes                             | 2785      | 4298      |    1513 |  54.33% | 54.33% |
| simplifycfg.NumSimpl                               | 997332    | 1018189   |   20857 |   2.09% |  2.09% |
| simplifycfg.NumSinkCommonCode                      | 7088      | 6464      |    -624 |  -8.80% |  8.80% |
| simplifycfg.NumSinkCommonInstrs                    | 15117     | 14021     |   -1096 |  -7.25% |  7.25% |
```
... which tells us that this new fold fires whopping 38k times,
increasing the amount of SimplifyCFG's `invoke`->`call` transforms by +54% (+1513) (again, D85787 did that last time),
decreasing total instruction count by -0.64% (-56454),
and sharply decreasing count of `insertvalue`'s (-88.36%, i.e. 9 times less)
and `extractvalue`'s (-74.83%, i.e. four times less).

This causes geomean -0.01% binary size decrease
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=size-text
and, ignoring `O0-g`, is a geomean -0.01%..-0.05% compile-time improvement
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=instructions

The other thing that tells is, is that while this is a massive win for `invoke`->`call` transform
`InstCombinerImpl::foldAggregateConstructionIntoAggregateReuse()` fold,
which is supposed to be dealing with such aggregate reconstructions,
fires a lot less now. There are two reasons why:
1. After this fold, as it can be seen in tests, we may (will) end up with trivially redundant PHI nodes.
   We don't CSE them in InstCombine presently, which means that EarlyCSE needs to run and then InstCombine rerun.
2. But then, EarlyCSE not only manages to fold such redundant PHI's,
   it also sees that the extract-insert chain recreates the original aggregate,
   and replaces it with the original aggregate.

The take-aways are
1. We maybe should do most trivial, same-BB PHI CSE in InstCombine
2. I need to check if what other patterns remain, and how they can be resolved.
   (i.e. i wonder if `foldAggregateConstructionIntoAggregateReuse()` might go away)

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86530

4 years agoFix a 32-bit overflow issue when reading LTO-generated bitcode files whose strtab...
Jianzhou Zhao [Tue, 25 Aug 2020 00:36:24 +0000 (00:36 +0000)]
Fix a 32-bit overflow issue when reading LTO-generated bitcode files whose strtab are of size > 2^29

This happens when using -flto and -Wl,--plugin-opt=emit-llvm to create a linked LTO bitcode file, and the bitcode file has a strtab with size > 2^29.

All the issues relate to a pattern like this
  size_t x64 = y64 + z32 * C
  When z32 is >= (2^32)/C, z32 * C overflows.

Reviewed-by: MaskRay
Differential Revision: https://reviews.llvm.org/D86500

4 years agoRemove the use of global dialect registration from the standalone-translate.cpp examp...
Mehdi Amini [Wed, 26 Aug 2020 05:08:32 +0000 (05:08 +0000)]
Remove the use of global dialect registration from the standalone-translate.cpp example (NFC)

4 years ago[libc][obvious] Add back the accidentally removed MPFRNumber destructor.
Siva Chandra Reddy [Wed, 26 Aug 2020 04:57:46 +0000 (21:57 -0700)]
[libc][obvious] Add back the accidentally removed MPFRNumber destructor.

4 years ago[libc] Extend MPFRMatcher to handle multiple-input-multiple-output functions.
Siva Chandra Reddy [Fri, 21 Aug 2020 05:36:53 +0000 (22:36 -0700)]
[libc] Extend MPFRMatcher to handle multiple-input-multiple-output functions.

Tests for frexp[f|l] now use the new capability. Not all input-output
combinations have been addressed by this change. Support for newer combinations
can be added in future as needed.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D86506

4 years ago[DWARFYAML] Use writeDWARFOffset() to write the prologue_length field. NFC.
Xing GUO [Wed, 26 Aug 2020 04:30:33 +0000 (12:30 +0800)]
[DWARFYAML] Use writeDWARFOffset() to write the prologue_length field. NFC.

Use writeDWARFOffset() to simplify the logic. NFC.

4 years ago[llvm-lipo] Add support for bitcode files
Adrien Guinet [Wed, 26 Aug 2020 01:55:50 +0000 (18:55 -0700)]
[llvm-lipo] Add support for bitcode files

A Mach-O universal binary may contain bitcode as a slice.
This diff adds proper handling of such binaries to llvm-lipo.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D85740

4 years agoAh, one test too many updated. This one should be unmodified.
Jason Molenda [Wed, 26 Aug 2020 04:03:39 +0000 (21:03 -0700)]
Ah, one test too many updated.  This one should be unmodified.

4 years agoUpdate UnwindPlan dump to list if it is a trap handler func; also Command
Jason Molenda [Wed, 26 Aug 2020 03:53:01 +0000 (20:53 -0700)]
Update UnwindPlan dump to list if it is a trap handler func; also Command

Update the "image show-unwind" command output to show if the function
being shown is listed as a user-setting or platform trap handler.

Update the individual UnwindPlan dumps to show whether the unwind plan
is registered as a trap handler.

4 years ago[Docs] Document --lto-whole-program-visibility
Teresa Johnson [Wed, 4 Mar 2020 23:38:45 +0000 (15:38 -0800)]
[Docs] Document --lto-whole-program-visibility

Summary:
Documents interaction of linker option added in D71913 with LTO
visibility.

Reviewers: pcc

Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75655

4 years agoAdd Z3 to system libraries list if enabled
Mikhail R. Gadelha [Tue, 25 Aug 2020 23:19:58 +0000 (19:19 -0400)]
Add Z3 to system libraries list if enabled

Without this trying to link static LLVM libraries (built with Z3 enabled) fails because `llvm-config` doesn't print `-lz3`.
We are already using this patch at MSYS2: https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-clang/0013-Add-Z3-to-system-libraries-list-if-enabled.patch

Reviewed By: mikhail.ramalho

Differential Revision: https://reviews.llvm.org/D85195

4 years ago[X86] Add an isel pattern for (i8 (trunc (i16 (bitconvert (v16i1 X))))) to avoid...
Craig Topper [Wed, 26 Aug 2020 01:20:41 +0000 (18:20 -0700)]
[X86] Add an isel pattern for (i8 (trunc (i16 (bitconvert (v16i1 X))))) to avoid an extra EXTRACT_SUBREG

Since we can only copy to GR32 we had to EXTRACT from GR32, but
we would first go to GR16 and then the truncate would extra again
to GR8. This adds a special case to go directly from GR32 to GR8.
This would eventually get cleaned up, but though maybe we should
avoid doing it in the first place. Our k-register handling is weird
and we could probably stand to have some more special ISD nodes
for the conversions so the i32 type would be explicit.

4 years ago[Modules] Improve error message when cannot find parent module for submodule definition.
Volodymyr Sapsai [Thu, 23 Jul 2020 19:47:16 +0000 (12:47 -0700)]
[Modules] Improve error message when cannot find parent module for submodule definition.

Before the change the diagnostic for

    module unknown.submodule {}

was "error: expected module name" which is incorrect and misleading
because both "unknown" and "submodule" are valid module names.

We already have a better error message when a parent module is a
submodule itself and is missing. Make the error for a missing top-level
module more like the one for a submodule.

rdar://problem/64424407

Reviewed By: bruno

Differential Revision: https://reviews.llvm.org/D84458

4 years agoRemove global registration from the test dialect in MLIR (NFC)
Mehdi Amini [Tue, 25 Aug 2020 23:21:15 +0000 (23:21 +0000)]
Remove global registration from the test dialect in MLIR (NFC)

4 years ago[X86] Remove extra getOperand(0) call from recently introduced store(extract_element...
Craig Topper [Tue, 25 Aug 2020 23:00:12 +0000 (16:00 -0700)]
[X86] Remove extra getOperand(0) call from recently introduced store(extract_element(vtrunc)) to truncated store combine.

The IsExtractedElement already called getOperand(0) so Extract
here is the source vector. We shouldn't call getOperand(0). This
worked for the original test cases because the result was a
bitcast so the getOperand(0) accidently peeked through the bitcast
which is what we wanted.

In the failing case here, the operand turns out to be undef so
the getOperand(0) asserts because undef has no operands.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=25184

Differential Revision: https://reviews.llvm.org/D86428

4 years agoAdd llvm_unreachable after fully covered switch to silence some warnings from GCC...
Mehdi Amini [Tue, 25 Aug 2020 23:08:43 +0000 (23:08 +0000)]
Add llvm_unreachable after fully covered switch to silence some warnings from GCC (NFC)

4 years agoRevert "[Coverage] Enable emitting gap area between macros"
Zequan Wu [Tue, 25 Aug 2020 22:04:33 +0000 (15:04 -0700)]
Revert "[Coverage] Enable emitting gap area between macros"

This reverts commit a31c89c1b7a0a2fd3e2c0b8a587a60921abf4abd.

4 years ago[X86] Remove a redundant COPY_TO_REGCLASS for VK16 after a KMOVWkr in an isel output...
Craig Topper [Tue, 25 Aug 2020 22:16:50 +0000 (15:16 -0700)]
[X86] Remove a redundant COPY_TO_REGCLASS for VK16 after a KMOVWkr in an isel output pattern.

KMOVWkr produces VK16, there's no reason to copy it to VK16 again.

Test changes are presumably because we were scheduling based on
the COPY that is no longer there.

4 years ago[llvm-libtool-darwin] Address post-commit feedback
Shoaib Meenai [Tue, 25 Aug 2020 22:03:37 +0000 (15:03 -0700)]
[llvm-libtool-darwin] Address post-commit feedback

Address James Henderson's comments on https://reviews.llvm.org/D86359.

4 years agoRemove unused/misnamed SetObjectModificationTime
Dave Lee [Mon, 24 Aug 2020 20:46:23 +0000 (13:46 -0700)]
Remove unused/misnamed SetObjectModificationTime

Remove `SetObjectModificationTime` which is not currently used, and assigns to the wrong member.

Differential Revision: https://reviews.llvm.org/D86493

4 years ago[MLInliner] Simplify TFUTILS_SUPPORTED_TYPES
Mircea Trofin [Tue, 25 Aug 2020 16:58:49 +0000 (09:58 -0700)]
[MLInliner] Simplify TFUTILS_SUPPORTED_TYPES

We only need the C++ type and the corresponding TF Enum. The other
parameter was used for the output spec json file, but we can just
standardize on the C++ type name there.

Differential Revision: https://reviews.llvm.org/D86549

4 years ago[AMDGPU] Remove unsound dependency on ISA version in waitcnt
Stanislav Mekhanoshin [Tue, 25 Aug 2020 18:58:23 +0000 (11:58 -0700)]
[AMDGPU] Remove unsound dependency on ISA version in waitcnt

Differential Revision: https://reviews.llvm.org/D86566

4 years ago[TargetLoweringObjectFileImpl] Make .llvmbc and .llvmcmd non-SHF_ALLOC
Fangrui Song [Tue, 25 Aug 2020 20:37:29 +0000 (13:37 -0700)]
[TargetLoweringObjectFileImpl] Make .llvmbc and .llvmcmd non-SHF_ALLOC

There are two ways .llvmbc can be produced:

* clang -c -fembed-bitcode=all (which also produces .llvmcmd)
* LTO backend: ld.lld -mllvm -lto-embed-bitcode or -plugin-opt=-lto-embed-bitcode

.llvmbc and .llvmcmd have the SHF_ALLOC flag, so they can be dropped by
--gc-sections.

This patch sets SectionKind::Metadata to drop the SHF_ALLOC flag. This
is conceptually correct: the two sections are not part of the process
image, so SHF_ALLOC is not appropriate.

`test/LTO/X86/embed-bitcode.ll`: changed `llvm-objcopy -O binary --only-section` to
`llvm-objcopy --dump-section`. `-O binary` does not dump non-SHF_ALLOC sections.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D86374

4 years ago[SDAG] Improve MemSDNode::getBasePtr
Krzysztof Parzyszek [Mon, 24 Aug 2020 18:25:06 +0000 (13:25 -0500)]
[SDAG] Improve MemSDNode::getBasePtr

It returned getOperand(1), except for STORE for which it returned
getOperand(2). Handle MSTORE, MGATHER, and MSCATTER as well.

4 years ago[mlir] [LLVMIR] Mark reductions as side-effect free
aartbik [Tue, 25 Aug 2020 19:26:28 +0000 (12:26 -0700)]
[mlir] [LLVMIR] Mark reductions as side-effect free

Attribute was missing from original base class.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D86569

4 years ago[OpenMP] Pack first-private arguments to improve efficiency of data transfer
Shilei Tian [Tue, 25 Aug 2020 20:04:59 +0000 (16:04 -0400)]
[OpenMP] Pack first-private arguments to improve efficiency of data transfer

In this patch, we pack all small first-private arguments, allocate and transfer them all at once to reduce the number of data transfer which is very expensive.

Let's take the test case as example.
```
int main() {
  int data1[3] = {1}, data2[3] = {2}, data3[3] = {3};
  int sum[16] = {0};
#pragma omp target teams distribute parallel for map(tofrom: sum) firstprivate(data1, data2, data3)
  for (int i = 0; i < 16; ++i) {
    for (int j = 0; j < 3; ++j) {
      sum[i] += data1[j];
      sum[i] += data2[j];
      sum[i] += data3[j];
    }
  }
}
```
Here `data1`, `data2`, and `data3` are three first-private arguments of the target region. In the previous `libomptarget`, it called data allocation and data transfer three times, each of which allocated and transferred 12 bytes. With this patch, it only calls allocation and transfer once. The size is `(12+4)*3=48` where 12 is the size of each array and 4 is the padding to keep the address aligned with 8. It is implemented in this way:
1. First collect all information for those *first*-private arguments. _private_ arguments are not the case because private arguments don't need to be mapped to target device. It just needs a data allocation. With the patch for memory manager, the data allocation could be very cheap, especially for the small size. For each qualified argument, push a place holder pointer `nullptr` to the `vector` for kernel arguments, and we will update them later.
2. After we have all information, create a buffer that can accommodate all arguments plus their paddings. Copy the arguments to the buffer at the right place, i.e. aligned address.
3. Allocate a target memory with the same size as the host buffer, transfer the host buffer to target device, and finally update all place holder pointers in the arguments `vector`.

The reason we only consider small arguments is, the data transfer is asynchronous. Therefore, for the large argument, we could continue to do things on the host side meanwhile, hopefully, the data is also being transferred. The "small" is defined by that the argument size is less than a predefined value. Currently it is 1024. I'm not sure whether it is a good one, and that is an open question. Another question is, do we need to make it configurable via an environment variable?

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D86307

4 years ago[AMDGPU] Switch to named simm16 in vscnt insertion
Stanislav Mekhanoshin [Tue, 25 Aug 2020 19:13:24 +0000 (12:13 -0700)]
[AMDGPU] Switch to named simm16 in vscnt insertion

Differential Revision: https://reviews.llvm.org/D86568

4 years ago[Hexagon] Check if EVT is simple type in HVX lowering
Ankit Aggarwal [Tue, 25 Aug 2020 19:47:44 +0000 (14:47 -0500)]
[Hexagon] Check if EVT is simple type in HVX lowering

4 years ago[lldb] Make Reproducer compatbile with SubsystemRAII (NFC)
Jonas Devlieghere [Tue, 25 Aug 2020 19:49:30 +0000 (12:49 -0700)]
[lldb] Make Reproducer compatbile with SubsystemRAII  (NFC)

Make Reproducer compatbile with SubsystemRAII and use it in
LocateSymbolFileTest.

4 years ago[SystemZ][z/OS] Add z/OS Target and define macros
Abhina Sreeskantharajan [Tue, 25 Aug 2020 17:50:17 +0000 (13:50 -0400)]
[SystemZ][z/OS] Add z/OS Target and define macros

This patch adds the z/OS target and defines macros as a stepping stone
towards enabling a native build on z/OS.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D85324

4 years ago[ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands
Juneyoung Lee [Sun, 23 Aug 2020 17:37:10 +0000 (02:37 +0900)]
[ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands

This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands.

Instead of special-casing llvm.assume, I think it is also a viable option to
add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86477

4 years ago[ValueTracking] Add a noundef test for D86477; NFC
Juneyoung Lee [Mon, 24 Aug 2020 17:34:17 +0000 (02:34 +0900)]
[ValueTracking] Add a noundef test for D86477; NFC

4 years agoReland "[DebugInfo] Move constructor homing case in shouldOmitDefinition."
Amy Huang [Tue, 25 Aug 2020 17:36:03 +0000 (10:36 -0700)]
Reland "[DebugInfo] Move constructor homing case in shouldOmitDefinition."

For some reason the ctor homing case was before the template
specialization case, and could have returned false too early.
I moved the code out into a separate function to avoid this.

This reverts commit 05777ab941063192b9ccb1775358a83a2700ccc1.

4 years ago[MemDep] Use BatchAA when computing pointer dependencies
Nikita Popov [Thu, 25 Jun 2020 19:35:41 +0000 (21:35 +0200)]
[MemDep] Use BatchAA when computing pointer dependencies

We're not changing IR while running a single MemDep query, so it's
safe to cache alias analysis results using BatchAA. This adds BatchAA
usage to getSimplePointerDependencyFrom(), which is non-intrusive --
covering larger parts (like a whole processNonLocalLoad query) is
also possible, but requires threading BatchAA through a bunch of APIs.

For the ThinLTO configuration, this is a 1% geomean improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D85583