platform/upstream/llvm.git
4 years ago[NFC][asan] Don't unwind stack before pool check
Vitaly Buka [Fri, 28 Aug 2020 08:59:27 +0000 (01:59 -0700)]
[NFC][asan] Don't unwind stack before pool check

4 years ago[mlir][Linalg] Enhance Linalg fusion on generic op and tensor_reshape op.
Hanhan Wang [Fri, 28 Aug 2020 08:53:59 +0000 (01:53 -0700)]
[mlir][Linalg] Enhance Linalg fusion on generic op and tensor_reshape op.

The tensor_reshape op was fusible only if it was a collapsing case. Now we
propagate the op to all the operands so there is a further chance to fuse it
with a generic op. The pre-conditions are:

1) The producer is not an indexed_generic op.
2) All the shapes of the operands are the same.
3) All the indexing maps are identity.
4) All the loops are parallel loops.
5) The producer has a single user.

It would also be possible to fuse the ops if the producer is an indexed_generic
op, because the original indices can still be computed. E.g., if the reshape op
collapses d0 and d1, we can use DimOp to get the width of d1, calculate the
index `d0 * width + d1`, and replace all the uses with it. However, this pattern
is not implemented in the patch.
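
The index arithmetic mentioned above can be sketched as follows. This is only an
illustration of the formula; `collapsedIndex` is a hypothetical helper and is not
part of the patch or of MLIR.

```cpp
#include <cstdint>

// Hypothetical helper (not an MLIR API): linear index of element (d0, d1)
// after dimensions d0 and d1 are collapsed into one. `width` is the extent
// of d1, i.e. the value the commit message suggests reading via DimOp.
constexpr std::int64_t collapsedIndex(std::int64_t d0, std::int64_t d1,
                                      std::int64_t width) {
  return d0 * width + d1;
}
```

For example, with a d1 extent of 5, element (2, 3) maps to linear index 13.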

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D86314

4 years ago[BuildLibCalls] Add argmemonly to more lib calls.
Florian Hahn [Fri, 28 Aug 2020 08:37:01 +0000 (09:37 +0100)]
[BuildLibCalls] Add argmemonly to more lib calls.

strspn, strncmp, strcspn, strcasecmp, strncasecmp, memcmp, memchr,
memrchr, memcpy, memmove, mempcpy, strchr, strrchr, bcmp
should all only access memory through their arguments.

I broke out strcoll, strcasecmp, strncasecmp because the result
depends on the locale, which might get accessed through memory.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86724

4 years ago[llvm-readobj] - Simplify the code that creates dumpers. NFCI.
Georgii Rymar [Thu, 27 Aug 2020 15:20:13 +0000 (18:20 +0300)]
[llvm-readobj] - Simplify the code that creates dumpers. NFCI.

We have a few helper functions like the following:
```
std::error_code create*Dumper(...)
```

In fact we do not need or want to use `std::error_code` and the code
can be simpler if we just return `std::unique_ptr<ObjDumper>`.

This patch makes this change and refines the signature of `createDumper`
as well.
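
A minimal sketch of the shape of this change, under assumptions: the type and
function names below (`DumperSketch`, `ElfDumperSketch`, `createDumperOld`,
`createDumperNew`) are hypothetical, and the out-parameter in the "before"
version is a guess at what the elided `(...)` looked like, not the real
llvm-readobj signature.

```cpp
#include <memory>
#include <system_error>

// Hypothetical stand-ins for the dumper hierarchy.
struct DumperSketch {
  virtual ~DumperSketch() = default;
};
struct ElfDumperSketch : DumperSketch {};

// Before (assumed shape): status code returned, result via out-parameter.
inline std::error_code createDumperOld(std::unique_ptr<DumperSketch> &Result) {
  Result = std::make_unique<ElfDumperSketch>();
  return std::error_code(); // default-constructed code means "no error"
}

// After: the factory cannot fail, so it simply returns the owning pointer.
inline std::unique_ptr<DumperSketch> createDumperNew() {
  return std::make_unique<ElfDumperSketch>();
}
```

Callers of the second form no longer need a separate variable and an error
check for a failure that never happens.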

Differential revision: https://reviews.llvm.org/D86718

4 years ago[llvm-readobj][test] - Test "Format" values.
Georgii Rymar [Fri, 21 Aug 2020 14:08:25 +0000 (17:08 +0300)]
[llvm-readobj][test] - Test "Format" values.

This adds testing for the "Format" field printed with `--file-headers`.

llvm-readelf doesn't use them, so only llvm-readobj needs to be tested.

All possible values are defined and tested in `ELFObjectFile<ELFT>::getFileFormatName()`.
Here we test just a few arbitrary ones.

Differential revision: https://reviews.llvm.org/D86350

4 years ago[unittests/Object] - Add testing for missing ELF formats.
Georgii Rymar [Wed, 26 Aug 2020 14:48:23 +0000 (17:48 +0300)]
[unittests/Object] - Add testing for missing ELF formats.

This adds all missing format values that are defined in
ELFObjectFile<ELFT>::getFileFormatName().

Differential revision: https://reviews.llvm.org/D86625

4 years ago[llvm-reduce] Skip chunks that lead to broken modules.
Florian Hahn [Fri, 28 Aug 2020 07:40:40 +0000 (08:40 +0100)]
[llvm-reduce] Skip chunks that lead to broken modules.

Some reduction passes may create invalid IR. I am not aware of any use
case where we would like to proceed reducing invalid IR. Various utils
used here, including CloneModule, assume the module to clone is valid
and crash otherwise.

Ideally, no reduction pass would create invalid IR, but some currently
do. ReduceInstructions can be fixed relatively easily (D86210), but
others are harder. For example, ReduceBasicBlocks may result in
invalid PHI nodes.

For now, skip the chunks. If we get to the point where all reduction
passes result in valid IR, we may want to turn this into an assertion.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D86212

4 years ago[ValueTracking] Remove a stray semicolon. NFC.
Martin Storsjö [Fri, 28 Aug 2020 06:23:57 +0000 (09:23 +0300)]
[ValueTracking] Remove a stray semicolon. NFC.

This silences warnings when built with GCC at least.

4 years ago[MC] [Win64EH] Avoid producing malformed xdata records
Martin Storsjö [Thu, 20 Aug 2020 07:38:17 +0000 (10:38 +0300)]
[MC] [Win64EH] Avoid producing malformed xdata records

If there are no unwinding opcodes, omit writing the xdata/pdata records.

Previously, this generated truncated xdata records, and llvm-readobj
would error out when trying to print them.

If writing of an xdata record is forced via the .seh_handlerdata
directive, skip it if there's no info to make a sensible unwind
info structure out of, and clearly error out if such info appeared
later in the process.

Differential Revision: https://reviews.llvm.org/D86527

4 years ago[gn build] Port b1f4e5979b7
LLVM GN Syncbot [Fri, 28 Aug 2020 05:56:49 +0000 (05:56 +0000)]
[gn build] Port b1f4e5979b7

4 years ago(Expensive) Check for Loop, SCC and Region pass return status
serge-sans-paille [Thu, 27 Aug 2020 15:31:54 +0000 (17:31 +0200)]
(Expensive) Check for Loop, SCC and Region pass return status

This generalizes the logic introduced in https://reviews.llvm.org/D80916 to
other passes.

It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report
their status.

Differential Revision: https://reviews.llvm.org/D86589

4 years agoAdd a global flag to disable the global dialect registry "process wise"
Mehdi Amini [Fri, 28 Aug 2020 02:26:27 +0000 (02:26 +0000)]
Add a global flag to disable the global dialect registry "process wise"

This is intended to ease the transition for clients with a lot of
dependencies. It'll be removed in the coming weeks.

Differential Revision: https://reviews.llvm.org/D86755

4 years agoAdd an unsigned shift base sanitizer
JF Bastien [Fri, 14 Aug 2020 21:05:57 +0000 (14:05 -0700)]
Add an unsigned shift base sanitizer

It's not undefined behavior for an unsigned left shift to overflow (i.e. to
shift bits out), but it has been the source of bugs and exploits in certain
codebases in the past. As we do in other parts of UBSan, this patch adds a
dynamic checker which acts beyond UBSan and checks other sources of errors. The
option is enabled as part of -fsanitize=integer.

The flag is named: -fsanitize=unsigned-shift-base
This matches shift-base and shift-exponent flags.
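
To illustrate the condition being diagnosed, here is a small sketch that reports
whether a 32-bit unsigned left shift would shift set bits out. The helper
`shiftsBitsOut` is hypothetical and only models the condition described above;
it is not how the sanitizer is implemented.

```cpp
#include <cstdint>

// Hypothetical helper: true when `Base << Amount` would lose set bits of a
// 32-bit unsigned value. Amounts of 32 or more are a shift-exponent problem
// rather than a shift-base one, so they are not reported here.
constexpr bool shiftsBitsOut(std::uint32_t Base, unsigned Amount) {
  return Amount != 0 && Amount < 32 && (Base >> (32 - Amount)) != 0;
}
```

For instance, `0x80000000u << 1` loses the top bit, while `1u << 31` loses
nothing.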

<rdar://problem/46129047>

Differential Revision: https://reviews.llvm.org/D86000

4 years ago[flang][openacc] Fix gang-argument parsing and add validity tests for !$acc loop
Valentin Clement [Fri, 28 Aug 2020 02:32:29 +0000 (22:32 -0400)]
[flang][openacc] Fix gang-argument parsing and add validity tests for !$acc loop

This patch fixes the parsing of the gang-arg values for the gang clause. It also adds
some clause validity tests for the loop construct.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D86584

4 years ago[MSAN] Add fiber switching APIs
Justin Cady [Fri, 28 Aug 2020 02:21:59 +0000 (19:21 -0700)]
[MSAN] Add fiber switching APIs

Add functions exposed via the MSAN interface to enable MSAN within
binaries that perform manual stack switching (e.g. through using fibers
or coroutines).

This functionality is analogous to the fiber APIs available for ASAN and TSAN.

Fixes google/sanitizers#1232

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D86471

4 years ago[flang][openacc] Add check for tile clause restriction
Valentin Clement [Fri, 28 Aug 2020 02:13:29 +0000 (22:13 -0400)]
[flang][openacc] Add check for tile clause restriction

The tile clause in OpenACC 3.0 imposes some restrictions. Each element in the tile size list is either * or a
constant positive integer expression. If there are n tile sizes in the list, the loop construct must be immediately
followed by n tightly-nested loops.
This patch implements these restrictions and adds some tests.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D86655

4 years ago[PowerPC] PPCBoolRetToInt: Don't translate Constant's operands
Kai Luo [Fri, 28 Aug 2020 01:56:12 +0000 (01:56 +0000)]
[PowerPC] PPCBoolRetToInt: Don't translate Constant's operands

When collecting `i1` values via `findAllDefs`, ignore Constant's
operands, since Constant's operands might not be `i1`.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46923 which causes ICE
```
llvm-project/llvm/lib/IR/Constants.cpp:1924: static llvm::Constant *llvm::ConstantExpr::getZExt(llvm::Constant *, llvm::Type *, bool): Assertion `C->getType()->getScalarSizeInBits() < Ty->getScalarSizeInBits()&& "SrcTy must be smaller than DestTy for ZExt!"' failed.
```

Differential Revision: https://reviews.llvm.org/D85007

4 years ago[MemorySSA] Assert defining access is not a MemoryUse.
Alina Sbirlea [Thu, 27 Aug 2020 23:39:53 +0000 (16:39 -0700)]
[MemorySSA] Assert defining access is not a MemoryUse.

4 years agoRevert "Use find_library for ncurses"
Harmen Stoppels [Fri, 28 Aug 2020 00:57:26 +0000 (17:57 -0700)]
Revert "Use find_library for ncurses"

The introduction of find_library for ncurses caused more problems than it solved. The current open issue is that it makes the static build of LLVM fail. It is better to revert for now, and get back to it later.

Revert "[CMake] Fix an issue where get_system_libname creates an empty regex capture on windows"
This reverts commit 1ed1e16ab83f55d85c90ae43a05cbe08a00c20e0.

Revert "Fix msan build"
This reverts commit 34fe9613dda3c7d8665b609136a8c12deb122382.

Revert "[CMake] Always mark terminfo as unavailable on Windows"
This reverts commit 76bf26236f6fd453343666c3cd91de8f74ffd89d.

Revert "[CMake] Fix OCaml build failure because of absolute path in system libs"
This reverts commit 8e4acb82f71ad4effec8895b8fc957189ce95933.

Revert "[CMake] Don't look for terminfo libs when LLVM_ENABLE_TERMINFO=OFF"
This reverts commit 495f91fd33d492941c39424a32cf24bcfe192f35.

Revert "Use find_library for ncurses"
This reverts commit a52173a3e56553d7b795bcf3cdadcf6433117107.

Differential revision: https://reviews.llvm.org/D86521

4 years ago[lld-macho][NFC] Define isHidden() in LinkEditSection
Jez Ng [Fri, 28 Aug 2020 00:43:19 +0000 (17:43 -0700)]
[lld-macho][NFC] Define isHidden() in LinkEditSection

Since it's always true

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86749

4 years ago[lld-macho] Weak locals should be relaxed too
Jez Ng [Fri, 28 Aug 2020 00:43:16 +0000 (17:43 -0700)]
[lld-macho] Weak locals should be relaxed too

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86746

4 years ago[lld-macho] Support GOT relocations to __dso_handle
Jez Ng [Thu, 27 Aug 2020 22:59:48 +0000 (15:59 -0700)]
[lld-macho] Support GOT relocations to __dso_handle

Found such a relocation while testing some real world programs.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86642

4 years ago[lld-macho] Implement GOT_LOAD relaxation
Jez Ng [Thu, 27 Aug 2020 22:59:45 +0000 (15:59 -0700)]
[lld-macho] Implement GOT_LOAD relaxation

We can have GOT_LOAD relocations that reference `__dso_handle`.
However, our binding opcode encoder doesn't support binding to the DSOHandle
symbol. Instead of adding support for that, I decided it would be cleaner to
implement GOT_LOAD relaxation since `__dso_handle`'s location is always
statically known.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86641

4 years ago[lld-macho] Emit binding opcodes for defined symbols that override weak dysyms
Jez Ng [Thu, 27 Aug 2020 22:59:30 +0000 (15:59 -0700)]
[lld-macho] Emit binding opcodes for defined symbols that override weak dysyms

These opcodes tell dyld to coalesce the overridden weak dysyms to this
particular symbol definition.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86575

4 years ago[lld-macho] Emit the right header flags for weak bindings/symbols
Jez Ng [Thu, 27 Aug 2020 22:59:15 +0000 (15:59 -0700)]
[lld-macho] Emit the right header flags for weak bindings/symbols

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86574

4 years ago[lld-macho] Implement weak binding for branch relocations
Jez Ng [Thu, 27 Aug 2020 22:54:42 +0000 (15:54 -0700)]
[lld-macho] Implement weak binding for branch relocations

Since there is no "weak lazy" lookup, function calls to weak symbols are
always non-lazily bound. We emit both regular non-lazy bindings as well
as weak bindings, in order that the weak bindings may overwrite the
non-lazy bindings if an appropriate symbol is found at runtime. However,
the bound addresses will still be written (non-lazily) into the
LazyPointerSection.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86573

4 years ago[lldb] Fix "no matching std::pair constructor" on Ubuntu 16.04 (NFC)
Jonas Devlieghere [Fri, 28 Aug 2020 00:20:50 +0000 (17:20 -0700)]
[lldb] Fix "no matching std::pair constructor" on Ubuntu 16.04 (NFC)

Fixes error: no matching constructor for initialization of
'std::pair<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >'
with older toolchain (clang/libcxx) on Ubuntu 16.04. The issue is the
StringRef-to-std::string conversion.

4 years ago[clang-query][NFC] Silence a few lint warnings
Nathan James [Fri, 28 Aug 2020 00:06:46 +0000 (01:06 +0100)]
[clang-query][NFC] Silence a few lint warnings

4 years agoGlobalISel: Implement computeNumSignBits for G_SEXT_INREG
Matt Arsenault [Thu, 27 Aug 2020 18:58:39 +0000 (14:58 -0400)]
GlobalISel: Implement computeNumSignBits for G_SEXT_INREG

4 years agoAMDGPU/GlobalISel: Implement computeKnownBits for groupstaticsize
Matt Arsenault [Thu, 27 Aug 2020 21:21:41 +0000 (17:21 -0400)]
AMDGPU/GlobalISel: Implement computeKnownBits for groupstaticsize

4 years agoAMDGPU: Fix broken switch braces
Matt Arsenault [Thu, 27 Aug 2020 21:17:50 +0000 (17:17 -0400)]
AMDGPU: Fix broken switch braces

4 years agoCorrectly revert "GlobalISel: Use & operator on KnownBits"
Matt Arsenault [Thu, 27 Aug 2020 23:02:34 +0000 (19:02 -0400)]
Correctly revert "GlobalISel: Use & operator on KnownBits"

I mis-resolved the revert through moving the code to another function.

4 years agoRevert "GlobalISel: Use & operator on KnownBits"
Matt Arsenault [Thu, 27 Aug 2020 22:41:02 +0000 (18:41 -0400)]
Revert "GlobalISel: Use & operator on KnownBits"

This reverts commit e53b799779b079a70f600e5cad2ab7267d66b1b7.

Confusingly, the & operator does not simply AND the two sets of known bits;
it implements known bits for the and operator.
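
The distinction can be sketched with a simplified model. `KnownBitsSketch` and
both functions below are hypothetical illustrations using plain 32-bit masks;
they are not the LLVM `KnownBits` class.

```cpp
#include <cstdint>

// Simplified model: a set bit in Zero means "this bit is known to be 0",
// a set bit in One means "this bit is known to be 1".
struct KnownBitsSketch {
  std::uint32_t Zero;
  std::uint32_t One;
};

// Known bits of the *result* of (a & b): a result bit is known zero if it is
// known zero in either input, and known one only if known one in both.
constexpr KnownBitsSketch knownBitsOfAnd(KnownBitsSketch A, KnownBitsSketch B) {
  return {A.Zero | B.Zero, A.One & B.One};
}

// Plain intersection of the two sets of facts: keep only what is known
// about both inputs.
constexpr KnownBitsSketch commonKnownBits(KnownBitsSketch A, KnownBitsSketch B) {
  return {A.Zero & B.Zero, A.One & B.One};
}
```

The two functions differ in how the Zero masks are combined, which is why
swapping one for the other changes behavior.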

4 years agoRecommit "[libFuzzer] Fix arguments of InsertPartOf/CopyPartOf calls in CrossOver...
Dokyung Song [Wed, 5 Aug 2020 23:12:19 +0000 (23:12 +0000)]
Recommit "[libFuzzer] Fix arguments of InsertPartOf/CopyPartOf calls in CrossOver mutator."

The CrossOver mutator is meant to cross over two given buffers (referred to as
the first/second buffer henceforth). Previously InsertPartOf/CopyPartOf calls
used in the CrossOver mutator incorrectly inserted/copied part of the second
buffer into a "scratch buffer" (MutateInPlaceHere of the size
CurrentMaxMutationLen), rather than the first buffer. This is not intended
behavior, because the scratch buffer does not always (i) contain the content of
the first buffer, and (ii) have the same size as the first buffer;
CurrentMaxMutationLen is typically a lot larger than the size of the first
buffer. This patch fixes the issue by using the first buffer instead of the
scratch buffer in InsertPartOf/CopyPartOf calls.

A FuzzBench experiment was run to make sure that this change does not
inadvertently degrade the performance. The performance is largely the same; more
details can be found at:
https://storage.googleapis.com/fuzzer-test-suite-public/fixcrossover-report/index.html

This patch also adds two new tests, namely "cross_over_insert" and
"cross_over_copy", which specifically target InsertPartOf and CopyPartOf,
respectively.

- cross_over_insert.test checks if the fuzzer can use InsertPartOf to trigger
  the crash.

- cross_over_copy.test checks if the fuzzer can use CopyPartOf to trigger the
  crash.

These newly added tests were designed to pass with the current patch, but not
without it (with 790878f291fa5dc58a1c560cb6cc76fd1bfd1c5a these tests do not
pass). To achieve this, -max_len was intentionally given a high value. Without
this patch, InsertPartOf/CopyPartOf will generate larger inputs, possibly with
unpredictable data in them, thereby failing to trigger the crash.

The test pass condition for these new tests is narrowed down by (i) limiting
mutation depth to 1 (i.e., a single CrossOver mutation should be able to trigger
the crash) and (ii) checking whether the mutation sequence of "CrossOver-" leads
to the crash.

Also note that these newly added tests and an existing test (cross_over.test)
all use "-reduce_inputs=0" flags to prevent reducing inputs; it's easier to
force the fuzzer to keep original input string this way than tweaking
cov-instrumented basic blocks in the source code of the fuzzer executable.

Differential Revision: https://reviews.llvm.org/D85554

4 years ago[ValueTracking] Replace recursion with Worklist
Vitaly Buka [Thu, 27 Aug 2020 20:38:29 +0000 (13:38 -0700)]
[ValueTracking] Replace recursion with Worklist

Now findAllocaForValue can handle nontrivial phi cycles.
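
As a generic illustration of the technique (not the actual findAllocaForValue
code), a worklist plus a visited set lets a traversal terminate on cyclic
graphs. `GraphSketch` and `reachableFrom` are hypothetical names.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical graph: Operands[N] lists the nodes that node N refers to.
using GraphSketch = std::vector<std::vector<int>>;

// Iterative traversal: the worklist replaces the call stack, and the visited
// set ensures that each node is expanded once even when the graph has cycles.
inline std::unordered_set<int> reachableFrom(const GraphSketch &Operands,
                                             int Start) {
  std::unordered_set<int> Visited;
  std::vector<int> Worklist{Start};
  while (!Worklist.empty()) {
    int Node = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(Node).second)
      continue; // already expanded; this is what breaks cycles
    for (int Op : Operands[Node])
      Worklist.push_back(Op);
  }
  return Visited;
}
```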

4 years agoRevert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"
Cullen Rhodes [Thu, 27 Aug 2020 21:13:23 +0000 (21:13 +0000)]
Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"

Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders
[1][2]. Reverting whilst I investigate.

[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
[2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112

This reverts commit 42587345a3afc52c03c6e6095db773358a1b03e9.

4 years ago[SSP] Restore setting the visibility of __guard_local to hidden for better code gener...
Brad Smith [Thu, 27 Aug 2020 21:17:38 +0000 (17:17 -0400)]
[SSP] Restore setting the visibility of __guard_local to hidden for better code generation.

Patch by: Philip Guenther

4 years ago[OPENMP]Do not crash for globals in inner regions with outer target
Alexey Bataev [Thu, 27 Aug 2020 20:06:28 +0000 (16:06 -0400)]
[OPENMP]Do not crash for globals in inner regions with outer target
region.

If the global variable is used in the target region, it is always
captured if not marked as declare target.

4 years ago[Driver][XRay][test] Update the macOS support check
Azharuddin Mohammed [Thu, 27 Aug 2020 20:57:07 +0000 (13:57 -0700)]
[Driver][XRay][test] Update the macOS support check

According to the code, the XRay flag is only supported on x86_64 on macOS.
Update the test to make that check explicit.

Differential Revision: https://reviews.llvm.org/D85773

4 years ago[Attributor] Do not manifest noundef for dead positions
Shinji Okumura [Thu, 27 Aug 2020 20:54:01 +0000 (05:54 +0900)]
[Attributor] Do not manifest noundef for dead positions

Even if noundef is deduced for a position, we should not manifest it when the position is dead.
This is because the values associated with dead positions are replaced with undef values by AAIsDead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86565

4 years ago[OpenMP] Fix a failing test after D85214
Saiyedul Islam [Thu, 27 Aug 2020 20:54:42 +0000 (20:54 +0000)]
[OpenMP] Fix a failing test after D85214

Removed version 45 testing from a failing test for now.

4 years agoGlobalISel: Implement known bits for min/max
Matt Arsenault [Thu, 27 Aug 2020 16:40:03 +0000 (12:40 -0400)]
GlobalISel: Implement known bits for min/max

4 years agoAArch64/GlobalISel: Fix missing function begin marker in test
Matt Arsenault [Thu, 27 Aug 2020 17:25:12 +0000 (13:25 -0400)]
AArch64/GlobalISel: Fix missing function begin marker in test

4 years agoMIR: Infer not-SSA for subregister defs
Matt Arsenault [Sun, 28 Jun 2020 15:34:42 +0000 (11:34 -0400)]
MIR: Infer not-SSA for subregister defs

It's possible to have a single virtual register def with a subreg
index that would pass the previous check, but it's not possible to
have a subregister def in SSA.

This is in preparation for adding stricter checks for SSA MIR.

4 years ago[StackSafety] Ignore allocas with partial lifetime markers
Vitaly Buka [Thu, 27 Aug 2020 20:53:28 +0000 (13:53 -0700)]
[StackSafety] Ignore allocas with partial lifetime markers

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86672

4 years ago[NFC][ValueTracking] Add OffsetZero into findAllocaForValue
Vitaly Buka [Thu, 27 Aug 2020 20:45:39 +0000 (13:45 -0700)]
[NFC][ValueTracking] Add OffsetZero into findAllocaForValue

For StackLifetime, after finding the alloca we need to check that
values point to the beginning of the alloca.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86692

4 years agoAMDGPU: Use caller subtarget, not intrinsic declaration
Matt Arsenault [Thu, 27 Aug 2020 15:14:59 +0000 (11:14 -0400)]
AMDGPU: Use caller subtarget, not intrinsic declaration

Intrinsic declarations use the default subtarget, but this should be
using the subtarget for the calling function. I haven't been able to
come up with a case where it matters though.

4 years agoGlobalISel: Add and_trivial_mask to all_combines
Matt Arsenault [Thu, 27 Aug 2020 17:15:46 +0000 (13:15 -0400)]
GlobalISel: Add and_trivial_mask to all_combines

Also make up a new category of combines.

4 years ago[Hexagon] Emit better 32-bit multiplication sequence for HVXv62+
Krzysztof Parzyszek [Thu, 27 Aug 2020 20:16:39 +0000 (15:16 -0500)]
[Hexagon] Emit better 32-bit multiplication sequence for HVXv62+

4 years ago[RegisterScavenging] Delete dead function unprocess().
Eli Friedman [Thu, 27 Aug 2020 20:17:47 +0000 (13:17 -0700)]
[RegisterScavenging] Delete dead function unprocess().

4 years ago[Attributor] Do not add AA to dependency graph after the update stage
Shinji Okumura [Thu, 27 Aug 2020 20:16:18 +0000 (05:16 +0900)]
[Attributor] Do not add AA to dependency graph after the update stage

If an AA is registered to the dependency graph in the manifest stage, Attributor aborts in `::manifestAttributes()`.
This patch prevents such termination.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86734

4 years ago[CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and...
Craig Topper [Thu, 27 Aug 2020 19:32:17 +0000 (12:32 -0700)]
[CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and 'tune-cpu' before re-adding in CodeGenModule::setNonAliasAttributes.

I think the removeAttributes interface should be faster than
calling removeAttribute 3 times.

4 years ago[OpenMP] Ensure testing for versions 4.5 and default - Part 3
Saiyedul Islam [Thu, 27 Aug 2020 19:35:36 +0000 (19:35 +0000)]
[OpenMP] Ensure testing for versions 4.5 and default - Part 3

This third patch in the series removes the version 5.0 string from
test cases, making them check for the default version. It also adds test
cases for version 4.5.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D85214

4 years ago[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
Roman Lebedev [Thu, 27 Aug 2020 19:31:40 +0000 (22:31 +0300)]
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first

As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is a basic block.

4 years ago[SVE] Remove bad call to VectorType::getNumElements() from HeapProfiler
Christopher Tetreault [Thu, 27 Aug 2020 19:04:39 +0000 (12:04 -0700)]
[SVE] Remove bad call to VectorType::getNumElements() from HeapProfiler

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D86727

4 years ago[analyzer] NFC: Fix wrong parameter name in printFormattedEntry.
Yang Fan [Thu, 27 Aug 2020 19:09:17 +0000 (12:09 -0700)]
[analyzer] NFC: Fix wrong parameter name in printFormattedEntry.

Parameters were in a different order in the header and in the implementation.

Fix surrounding comments a bit.

Differential Revision: https://reviews.llvm.org/D86691

4 years ago[analyzer] Fix the debug print about debug egraph dumps requiring asserts.
Yang Fan [Thu, 27 Aug 2020 18:45:12 +0000 (11:45 -0700)]
[analyzer] Fix the debug print about debug egraph dumps requiring asserts.

There's no need to remind people about that when clang *is* built with asserts.

Differential Revision: https://reviews.llvm.org/D86334

4 years ago[analyzer] pr47037: CastValueChecker: Support for the new variadic isa<>.
Adam Balogh [Thu, 27 Aug 2020 15:06:10 +0000 (08:06 -0700)]
[analyzer] pr47037: CastValueChecker: Support for the new variadic isa<>.

llvm::isa<>() and llvm::isa_and_not_null<>() template functions recently became
variadic. Unfortunately this causes crashes in case of isa_and_not_null<>()
and incorrect behavior in isa<>(). This patch fixes this issue.

Differential Revision: https://reviews.llvm.org/D85728

4 years ago[analyzer] NFC: Store the pointee/referenced type for dynamic type tracking.
Adam Balogh [Thu, 27 Aug 2020 15:01:43 +0000 (08:01 -0700)]
[analyzer] NFC: Store the pointee/referenced type for dynamic type tracking.

The success of a dynamic cast depends only on the C++ class, not the pointer or reference. Thus if *A is a *B, then &A is a &B,
const *A is a const *B etc. This patch changes DynamicCastInfo to store
and check the cast between the unqualified pointed/referenced types.
It also removes e.g. SubstTemplateTypeParmType from both the pointer
and the pointed type.

Differential Revision: https://reviews.llvm.org/D85752

4 years agoRecommit "[libFuzzer] Fix value-profile-load test."
Dokyung Song [Wed, 19 Aug 2020 20:21:05 +0000 (20:21 +0000)]
Recommit "[libFuzzer] Fix value-profile-load test."

value-profile-load.test needs adjustment with a mutator change in
bb54bcf84970c04c9748004f3a4cf59b0c1832a7, which is reverted as of now, but will
be recommitted after landing this patch.

This patch makes value-profile-load.test more friendly to (and aware of) the
current value profiling strategy, which is based on the Hamming distance as well
as the absolute distance. To this end, this patch adjusts the set of input
values that trigger an expected crash. More specifically, this patch now uses a
single value 0x01effffe as a crashing input, because this value is close to
values like {0x1ffffff, 0xffffff, ...}, which are very likely to be added to the
corpus per the current Hamming- and absolute-distance-based value profiling
strategy. Note that previously the crashing input values were
{1234567 * {1, 2, ...}, s.t. < INT_MAX}.
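
To make "close" concrete, the two distances can be computed as in the sketch
below. The helper names are hypothetical, and this only illustrates the
metrics themselves; it is not how libFuzzer's value profiling is implemented.

```cpp
#include <bitset>
#include <cstdint>

// Number of bit positions in which A and B differ.
inline unsigned hammingDistance(std::uint32_t A, std::uint32_t B) {
  return static_cast<unsigned>(std::bitset<32>(A ^ B).count());
}

// Magnitude of the numeric difference between A and B.
inline std::uint32_t absoluteDistance(std::uint32_t A, std::uint32_t B) {
  return A > B ? A - B : B - A;
}
```

With these definitions, 0x01effffe differs from 0x1ffffff in 2 bits and from
0xffffff in 3 bits, so the crashing input is only a few bit flips away from
values that are likely to already be in the corpus.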

Every byte in the chosen value 0x01effffe is intentionally different; this was
to make it harder to find the value without the intermediate inputs added to the
corpus by the value profiling strategy.

Also note that LoadTest.cpp now uses a narrower condition (Size != 8) for
initial pruning of inputs, effectively preventing libFuzzer from generating
inputs longer than necessary and spending time on mutating such long inputs in
the corpus - a functionality not meant to be tested by this specific test.

Differential Revision: https://reviews.llvm.org/D86247

4 years ago[libcxx] Fix the broken test after D82657.
Haojian Wu [Thu, 27 Aug 2020 18:43:20 +0000 (20:43 +0200)]
[libcxx] Fix the broken test after D82657.

Differential Revision: https://reviews.llvm.org/D86685

4 years ago[Attributor] Guarantee getAAFor not to update AA in the manifestation stage
Shinji Okumura [Thu, 27 Aug 2020 18:29:39 +0000 (03:29 +0900)]
[Attributor] Guarantee getAAFor not to update AA in the manifestation stage

If we query an AA with `Attributor::getAAFor` in `AbstractAttribute::manifest`, the AA may be updated.
This patch makes use of the phase flag in Attributor, and handles `getAAFor` behavior according to the flag.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86635

4 years ago[MLIR] Fixed missing constraint append when adding an AffineIfOp domain
Vincent Zhao [Thu, 27 Aug 2020 18:57:36 +0000 (00:27 +0530)]
[MLIR] Fixed missing constraint append when adding an AffineIfOp domain

The prior diff that introduced `addAffineIfOpDomain` missed appending
constraints from the ifOp domain. This revision fixes this problem.

Differential Revision: https://reviews.llvm.org/D86421

4 years ago[SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize
Christopher Tetreault [Thu, 27 Aug 2020 18:19:46 +0000 (11:19 -0700)]
[SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D82056

4 years ago[OpenMP] Ensure testing for versions 4.5 and default - Part 2
Saiyedul Islam [Thu, 27 Aug 2020 18:50:34 +0000 (18:50 +0000)]
[OpenMP] Ensure testing for versions 4.5 and default - Part 2

Many OpenMP Clang tests do not RUN for version 4.5 and the default
version. This second patch in the series handles test cases whose
CHECK lines need updating along with adding RUN lines for
the default version. It involves updating the line numbers of pragmas.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D85150

4 years ago[OpenMP][MLIR] Conversion pattern for OpenMP to LLVM
Kiran Chandramohan [Thu, 13 Aug 2020 08:03:04 +0000 (09:03 +0100)]
[OpenMP][MLIR] Conversion pattern for OpenMP to LLVM

Add a conversion pattern for the parallel Operation. This helps convert
a parallel operation with the standard dialect to a parallel operation
with the LLVM dialect. The type conversion of the block arguments in a
parallel region is controlled by the pattern for the parallel Operation.
Without this pattern, a parallel Operation with block arguments cannot be
converted from the standard to the LLVM dialect.
Other OpenMP operations without regions are marked as legal. When
translation of OpenMP operations with regions is added, patterns
for these operations can also be added.
This also uses all the standard-to-LLVM patterns. Patterns of other dialects
can be added later if needed.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D86273

4 years ago[lld-macho] Disable invalid/stub-link.s test for Mac
Jez Ng [Thu, 27 Aug 2020 18:10:59 +0000 (11:10 -0700)]
[lld-macho] Disable invalid/stub-link.s test for Mac

It seems to be failing on some Google Buildbots.

This diff also includes a minor fix for the install name of one of
libSystem's re-exports. I don't think it's the cause of the test
failure, though. The wrong install name just meant that the symbol
lookup failure would still happen, but it would have been caused by the
re-export not being found, instead of the arch failing to match.

Differential Revision: https://reviews.llvm.org/D86728

4 years ago[libc++][NFC] Define functor's call operator inline
Louis Dionne [Thu, 27 Aug 2020 17:09:23 +0000 (13:09 -0400)]
[libc++][NFC] Define functor's call operator inline

This fixes a mismatched visibility attribute on the call operator in
addition to making the code clearer. Given this is a simple lambda
in essence, the intent has always been to give it inline visibility.

4 years ago[SVE] Remove calls to VectorType::getNumElements from IR
Christopher Tetreault [Thu, 27 Aug 2020 17:39:18 +0000 (10:39 -0700)]
[SVE] Remove calls to VectorType::getNumElements from IR

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D81500

4 years agoGlobalISel: Use & operator on KnownBits
Matt Arsenault [Thu, 27 Aug 2020 16:32:49 +0000 (12:32 -0400)]
GlobalISel: Use & operator on KnownBits

Avoid repeating for zero and one

4 years agoGlobalISel: Implement known bits for G_MERGE_VALUES
Matt Arsenault [Thu, 27 Aug 2020 16:15:16 +0000 (12:15 -0400)]
GlobalISel: Implement known bits for G_MERGE_VALUES

4 years ago[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
Mikhail Maltsev [Thu, 27 Aug 2020 17:52:59 +0000 (18:52 +0100)]
[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics

Add bitcode files which got truncated to 0 length in phabricator.

Differential Revision: https://reviews.llvm.org/D86146

4 years agoGlobalISel: Remove leftover lit.local.cfg
Matt Arsenault [Thu, 27 Aug 2020 17:47:38 +0000 (13:47 -0400)]
GlobalISel: Remove leftover lit.local.cfg

The global-isel feature has been required for a long time and was
removed in c9455d3c579292e7ae5b7559ad0302d459e69a95, so this was
causing all tests to be skipped.

4 years ago[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
Mikhail Maltsev [Thu, 27 Aug 2020 17:43:16 +0000 (18:43 +0100)]
[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics

This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a holdover from the
implementation that lacked a bf16 IR type).

The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.

This patch makes the generated IR cleaner (fewer useless bitcasts are
produced), but it does not affect the final assembly.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86146

4 years ago[X86] Don't call hasFnAttribute and getFnAttribute for 'prefer-vector-width' and...
Craig Topper [Thu, 27 Aug 2020 17:30:47 +0000 (10:30 -0700)]
[X86] Don't call hasFnAttribute and getFnAttribute for 'prefer-vector-width' and 'min-legal-vector-width' in getSubtargetImpl

We only need to call getFnAttribute and then check if the Attribute
is None or not.

4 years agoReapply D70800: Fix AArch64 AAPCS frame record chain
Owen Anderson [Wed, 26 Aug 2020 19:36:13 +0000 (19:36 +0000)]
Reapply D70800: Fix AArch64 AAPCS frame record chain

Original Commit Message:
After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register)
may be placed in the middle of a stack frame if a function has both callee-saved
general-purpose registers and floating point registers. This will break the stack unwinders
that simply walk through the frame records (based on the guarantee from AAPCS64
"The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.

Patch By: logan
Differential Revision: D70800
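
To illustrate why the position of the frame record matters, here is a
minimal sketch of an unwinder that simply walks the frame record chain.
Everything in it is an assumption made for illustration: the stack is
modelled as a vector of 64-bit slots, a frame record is two adjacent
slots (previous frame record, then return address), slot index 0 plays
the role of a null frame pointer, and `walkFrameRecords` is a
hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Walk a chain of frame records in a simulated stack.
// stack[fp]     holds the slot index of the caller's frame record (0 = end)
// stack[fp + 1] holds the saved return address (LR)
// Returns the return addresses found, innermost frame first.
std::vector<uint64_t> walkFrameRecords(const std::vector<uint64_t> &stack,
                                       uint64_t fp) {
  std::vector<uint64_t> returnAddresses;
  while (fp != 0 && fp + 1 < stack.size()) {
    returnAddresses.push_back(stack[fp + 1]);
    uint64_t next = stack[fp];
    // Callers live at higher addresses; stop on a malformed chain
    // instead of looping forever.
    if (next != 0 && next <= fp)
      break;
    fp = next;
  }
  return returnAddresses;
}
```

A walker like this only looks at the two slots FP points to, so it
depends entirely on FP pointing at a well-formed record in every frame.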

4 years ago[HeapProf] Fix bot failures from instrumentation pass
Teresa Johnson [Thu, 27 Aug 2020 16:38:45 +0000 (09:38 -0700)]
[HeapProf] Fix bot failures from instrumentation pass

Fix bot failure from 7ed8124d46f94601d5f1364becee9cee8538265e:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/8533

Since we are always using dynamic shadow,
insertDynamicShadowAtFunctionEntry should always return true for
modifying the function.

4 years ago[gn build] Port 7ed8124d46f
LLVM GN Syncbot [Thu, 27 Aug 2020 17:08:02 +0000 (17:08 +0000)]
[gn build] Port 7ed8124d46f

4 years ago[gn build] Manually port c9455d3
Arthur Eubanks [Thu, 27 Aug 2020 17:05:34 +0000 (10:05 -0700)]
[gn build] Manually port c9455d3

4 years ago[test][Inliner] Make always-inline.ll work with NPM
Arthur Eubanks [Wed, 26 Aug 2020 23:55:46 +0000 (16:55 -0700)]
[test][Inliner] Make always-inline.ll work with NPM

The NPM doesn't support call-site alwaysinline as described in the comments.

Also make NPM runs more similar to legacy PM runs.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D86663

4 years ago[GISel] Add new GISel combiners for G_SELECT
Aditya Nandakumar [Thu, 27 Aug 2020 16:38:48 +0000 (09:38 -0700)]
[GISel] Add new GISel combiners for G_SELECT

https://reviews.llvm.org/D83833

Patch adds two new GICombinerRules for G_SELECT. The rules combine
selects with undef comparisons into their first selectee value, and
combine away selects with constant comparisons. The patch additionally
adds a new combiner test for the AArch64 target to test these new G_SELECT
combiner rules and the existing select_same_val combiner rule.
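
The three folds mentioned above can be sketched on a toy model. This is
only an illustration under stated assumptions: values are plain integer
ids, and `Cond`, `Fold`, and `foldSelect` are hypothetical names that do
not come from the GlobalISel combiner code.

```cpp
// What is known about the select condition in this toy model.
enum class CondKind { Undef, Constant, Unknown };

struct Cond {
  CondKind kind;
  bool value; // only meaningful when kind == CondKind::Constant
};

// Result of trying to fold select(cond, trueVal, falseVal).
struct Fold {
  bool folded; // true if the select can be replaced
  int value;   // id of the replacement value when folded
};

inline Fold foldSelect(const Cond &cond, int trueVal, int falseVal) {
  // select_same_val: both arms are the same value.
  if (trueVal == falseVal)
    return {true, trueVal};
  switch (cond.kind) {
  case CondKind::Undef:
    // Undef condition: pick the first selectee value.
    return {true, trueVal};
  case CondKind::Constant:
    // Constant condition: the select disappears.
    return {true, cond.value ? trueVal : falseVal};
  case CondKind::Unknown:
    break;
  }
  return {false, 0};
}
```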

Patch by mkitzan

4 years ago[lldb] Make lldb-argdumper a dependency of liblldb
Jonas Devlieghere [Thu, 27 Aug 2020 16:29:47 +0000 (09:29 -0700)]
[lldb] Make lldb-argdumper a dependency of liblldb

Always make lldb-argdumper a dependency of liblldb. Currently it is only
a dependency of the python swig target because of the relative symlink
in the python resource directory. That means that the dependency won't
be there when LLDB_ENABLE_PYTHON is disabled.

Differential revision: https://reviews.llvm.org/D86722

4 years ago[lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC)
Jonas Devlieghere [Wed, 26 Aug 2020 18:56:30 +0000 (11:56 -0700)]
[lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC)

Move the construction of the triple out of getArchCFlags in the
DarwinBuilder.

4 years ago[OCaml] Remove add_constant_propagation
Arthur Eubanks [Thu, 27 Aug 2020 16:29:17 +0000 (09:29 -0700)]
[OCaml] Remove add_constant_propagation

After https://reviews.llvm.org/D85159.

4 years ago[sda][nfc] clang-formatting
Simon Moll [Thu, 27 Aug 2020 16:22:47 +0000 (18:22 +0200)]
[sda][nfc] clang-formatting

4 years ago[Attributor] Add a phase flag to Attributor
Shinji Okumura [Thu, 27 Aug 2020 16:16:38 +0000 (01:16 +0900)]
[Attributor] Add a phase flag to Attributor

Add a new flag that indicates which stage in the process we are in.
This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86678

4 years ago[GISel]: Fix one more CSE Non determinism
Aditya Nandakumar [Thu, 27 Aug 2020 15:54:33 +0000 (08:54 -0700)]
[GISel]: Fix one more CSE Non determinism

https://reviews.llvm.org/D86676

Sometimes we can have the following code

 x:gpr(s32) = G_OP

Say we build G_OP2 to the same x and then delete the previous instruction, using something like

 Register X = ...;
 auto NewMIB = CSEBuilder.buildOp2(X, ... args);

Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (i.e., it doesn't consider register bank/register class along with type). Unify the profiling by refactoring and calling the common method.

This was found by turning on CSEInfo::verify at the end of each of our GISel passes, which turns inconsistent state/non-determinism in CSEing into crashes. Such crashes usually indicate missing calls to Observer on mutations (the most common case). Here non-determinism usually means sometimes not CSEing, and almost never means producing incorrect code.
This patch also adds this verification at the end of the combiners.

4 years ago[CodeGen] Properly propagating Calling Convention information when lowering vector...
Lucas Prates [Thu, 27 Aug 2020 14:31:40 +0000 (15:31 +0100)]
[CodeGen] Properly propagating Calling Convention information when lowering vector arguments

When joining the legal parts of vector arguments into their original value
during the lowering of Formal Arguments in SelectionDAGBuilder, the Calling
Convention information was not being propagated for the handling of each
individual part. The same did not happen when lowering calls, causing a
mismatch.

This patch fixes the issue by properly propagating the Calling
Convention details.

This fixes Bugzilla #47001.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86715

4 years ago[MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.
Benjamin Kramer [Thu, 27 Aug 2020 15:57:11 +0000 (17:57 +0200)]
[MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.

4 years ago[HeapProf] Clang and LLVM support for heap profiling instrumentation
Teresa Johnson [Thu, 13 Aug 2020 23:29:38 +0000 (16:29 -0700)]
[HeapProf] Clang and LLVM support for heap profiling instrumentation

See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.

Differential Revision: https://reviews.llvm.org/D85948

4 years agoRevert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"
Mikhail Maltsev [Thu, 27 Aug 2020 15:47:18 +0000 (16:47 +0100)]
Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"

This reverts commit 3b71f91558ff8b569199547efe800cb501c3cf94.

The commit is breaking some build bots.

4 years ago[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Roman Lebedev [Wed, 26 Aug 2020 08:11:04 +0000 (11:11 +0300)]
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block

Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So i would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.
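
The Δ and % columns follow from simple arithmetic on the baseline and
proposed columns. A minimal sketch, assuming a hypothetical helper
`percentChange` and assuming that a zero baseline is reported as 0.00%
(as the first row of the table does):

```cpp
// Relative change of `proposed` against `baseline`, in percent.
// A zero baseline has no meaningful relative change; 0 is returned
// here only to mirror the convention assumed from the table.
inline double percentChange(double baseline, double proposed) {
  if (baseline == 0.0)
    return 0.0;
  return (proposed - baseline) / baseline * 100.0;
}
```

For simplifycfg.NumInvokes, `percentChange(4300, 4410)` is 110 / 4300,
which rounds to the 2.56% shown in the table.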

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.
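
A minimal sketch of an order-insensitive comparison for two PHI nodes
in the same block. This is not the code from the patch: the model below
is an assumption in which each incoming edge is a (predecessor block id,
value id) pair, and `samePhi` is a hypothetical helper.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// One incoming edge of a PHI node: (predecessor block id, value id).
using Incoming = std::pair<int, int>;

// Two same-block PHI nodes are treated as identical when they carry the
// same value from each predecessor, regardless of operand order.
// The vectors are taken by value so they can be sorted locally.
inline bool samePhi(std::vector<Incoming> a, std::vector<Incoming> b) {
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());
  return a == b;
}
```

A naive element-by-element comparison would report the two operand lists
`(1, 10), (2, 20)` and `(2, 20), (1, 10)` as different, while `samePhi`
treats them as equal.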

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).

4 years ago[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes
Roman Lebedev [Thu, 27 Aug 2020 12:03:53 +0000 (15:03 +0300)]
[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes

PHI nodes depend on the block they're in,
so we can only deal with the most basic case of same-BB PHI's.

4 years ago[Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL
Russell Gallop [Thu, 27 Aug 2020 14:50:25 +0000 (15:50 +0100)]
[Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL

This hasn't been allowed as a build option since r309990

Remove leftover REQUIRES: global-isel

Differential Revision: https://reviews.llvm.org/D86714

4 years ago[libc++] Install a more recent CMake on libc++ builders
Louis Dionne [Thu, 27 Aug 2020 15:21:37 +0000 (11:21 -0400)]
[libc++] Install a more recent CMake on libc++ builders

4 years ago[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY
David Nicuesa [Thu, 27 Aug 2020 15:21:35 +0000 (16:21 +0100)]
[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY

Fix compilation with -DLIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY when using clang. Target 'cxx_external_threads' is now linked with 'cxx-headers'. Fix mismatching visibility for `libcpp_timed_backoff_policy` function in file <__threading_support>.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D86598

4 years ago[CodeGen][AArch64] Support arm_sve_vector_bits attribute
Cullen Rhodes [Tue, 11 Aug 2020 14:30:02 +0000 (14:30 +0000)]
[CodeGen][AArch64] Support arm_sve_vector_bits attribute

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif
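
The two mangled names above can be reproduced with a small sketch. The
helper names `sourceName` and `mangleSveVls` are hypothetical, and the
code only assembles the string pieces visible in the example (a
length-prefixed name, `I`...`E` around the template arguments, a `u`
prefix on the vendor type, and `Lj<bits>E` for the unsigned literal);
it is not clang's mangler.

```cpp
#include <string>

// Length-prefixed identifier, e.g. "__SVE_VLS" -> "9__SVE_VLS".
inline std::string sourceName(const std::string &id) {
  return std::to_string(id.size()) + id;
}

// Build the mangled form of __SVE_VLS<typename, unsigned>.
// builtinType is the underlying type name (e.g. "__SVInt32_t"),
// bits is the SVE vector length in bits.
inline std::string mangleSveVls(const std::string &builtinType,
                                unsigned bits) {
  return sourceName("__SVE_VLS") + "I" +        // template name, args begin
         "u" + sourceName(builtinType) +        // vendor extended type
         "Lj" + std::to_string(bits) + "E" +    // unsigned literal argument
         "E";                                   // args end
}
```

For example, `mangleSveVls("__SVInt32_t", 512)` yields
`9__SVE_VLSIu11__SVInt32_tLj512EE`, matching the first comment above.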

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme; it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743

4 years ago[Support] On Windows, add optional support for {rpmalloc|snmalloc|mimalloc}
Alexandre Ganea [Thu, 27 Aug 2020 15:09:20 +0000 (11:09 -0400)]
[Support] On Windows, add optional support for {rpmalloc|snmalloc|mimalloc}

This patch optionally replaces the CRT allocator (i.e., malloc and free) with rpmalloc (mixed public domain licence/MIT licence) or snmalloc (MIT licence) or mimalloc (MIT licence). Please note that the source code for these allocators must be available outside of LLVM's tree.

To enable, use `cmake ... -DLLVM_INTEGRATED_CRT_ALLOC=D:/git/rpmalloc -DLLVM_USE_CRT_RELEASE=MT` where `D:/git/rpmalloc` has already been git clone'd from `https://github.com/mjansson/rpmalloc`. The same applies to snmalloc and mimalloc.

When enabled, the allocator will be embedded (statically linked) into the LLVM tools & libraries. This currently only works with the static CRT (/MT), although using the dynamic CRT (/MD) could potentially work as well in the future.

When enabled, this changes the memory stack from:
  new/delete -> MS VC++ CRT malloc/free -> HeapAlloc -> VirtualAlloc
to:
  new/delete -> {rpmalloc|snmalloc|mimalloc} -> VirtualAlloc

The goal of this patch is to bypass the application's global heap - which is thread-safe thus inducing locking - and instead take advantage of a modern lock-free, thread cache, allocator. On a 6-core Xeon Skylake we observe a 2.5x decrease in execution time when linking a large scale application with LLD and ThinLTO (12 min 20 sec -> 5 min 34 sec), when all hardware threads are being used (using LLD's flag /opt:lldltojobs=all). On a dual 36-core Xeon Skylake with all hardware threads used, we observe a 24x decrease in execution time (1 h 2 min -> 2 min 38 sec) when linking a large application with LLD and ThinLTO. Clang build times also see a decrease in the range 5-10% depending on the configuration.

Differential Revision: https://reviews.llvm.org/D71786

4 years agoRevert "[AIX][XCOFF] emit symbol visibility for xcoff object file."
diggerlin [Thu, 27 Aug 2020 15:07:58 +0000 (11:07 -0400)]
Revert "[AIX][XCOFF] emit symbol visibility for xcoff object file."

This reverts commit a0818689213234d5a078641432d10eccccf61a13.

Based on Hubert Tong's comment: https://reviews.llvm.org/D84265#inline-799085

4 years ago[MLIR] MemRef Normalization for Dialects
Alexandre E. Eichenberger [Thu, 27 Aug 2020 05:17:33 +0000 (10:47 +0530)]
[MLIR] MemRef Normalization for Dialects

When dealing with dialects that will result in function calls to
external libraries, it is important to be able to handle maps as some
dialects may require mapped data.  Before this patch, to detect
whether normalization can apply, operations were compared to an
explicit list of operations (`alloc`, `dealloc`, `return`) or to the
presence of specific operation interfaces (`AffineReadOpInterface`,
`AffineWriteOpInterface`, `AffineDMAStartOp`, or `AffineDMAWaitOp`).

This patch adds a trait, `MemRefsNormalizable`, to determine if an
operation can have its `memrefs` normalized.

This trait can be used in turn by dialects to assert that such
operations are compatible with normalization of `memrefs` with
nontrivial memory layout specification. An example is given in the
literal tests.
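
The difference between an explicit operation list and a trait check can
be sketched as follows. This is a toy model, not MLIR code: `Op`,
its `memRefsNormalizable` flag, and `canNormalize` are hypothetical
names standing in for an operation, the `MemRefsNormalizable` trait, and
the pass's legality check.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy operation: a name plus a flag that plays the role of the trait.
struct Op {
  std::string name;
  bool memRefsNormalizable;
};

// A memref can be normalized only if every operation using it opts in
// through the trait; no hard-coded list of operation names is needed.
inline bool canNormalize(const std::vector<Op> &users) {
  return std::all_of(users.begin(), users.end(),
                     [](const Op &op) { return op.memRefsNormalizable; });
}
```

Under this scheme a dialect opts an operation in by attaching the trait,
rather than by extending a list inside the normalization pass.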

Differential Revision: https://reviews.llvm.org/D86236