platform/upstream/llvm.git
3 years ago[InstCombine] Fixed non-deterministic order of new instructions
Stanislav Mekhanoshin [Thu, 28 Oct 2021 19:14:02 +0000 (12:14 -0700)]
[InstCombine] Fixed non-deterministic order of new instructions

Fixes non-deterministic order of XOR instructions created after
5a7a458306cd. The order of call argument evaluation is not
defined, so create one Value before the call.
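
The C++ rule behind this fix can be illustrated with a minimal sketch. `Builder`, `createXor`, `combine`, `unorderedBuild`, and `orderedBuild` are hypothetical stand-ins invented for illustration; they are not the InstCombine API, and the actual patch may differ in detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR builder: each create call appends to a log,
// so the log order mirrors the order in which "instructions" are created.
struct Builder {
  std::vector<std::string> log;
  int createXor(const std::string &name) {
    log.push_back(name);
    return static_cast<int>(log.size());
  }
};

// Hypothetical consumer taking two already-created values.
int combine(int lhs, int rhs) { return lhs * 10 + rhs; }

// The two createXor calls are function arguments, so a compiler may evaluate
// them left-to-right or right-to-left; the log order is not portable.
int unorderedBuild(Builder &b) {
  return combine(b.createXor("xor.a"), b.createXor("xor.b"));
}

// Hoisting one call into a local sequences it before the other call,
// which makes the creation order deterministic.
int orderedBuild(Builder &b) {
  int first = b.createXor("xor.a");
  return combine(first, b.createXor("xor.b"));
}
```

With `orderedBuild`, the log is always `xor.a` followed by `xor.b`; with `unorderedBuild`, only the log length is guaranteed.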

3 years ago[InstCombine] Fold `(c & ~(a | b)) | (b & ~(a | c))` to `~a & (b ^ c)`
Stanislav Mekhanoshin [Thu, 21 Oct 2021 22:12:12 +0000 (15:12 -0700)]
[InstCombine] Fold `(c & ~(a | b)) | (b & ~(a | c))` to `~a & (b ^ c)`

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %a, %b
  %not1 = xor i4 %or1, 15
  %and1 = and i4 %not1, %c
  %or2 = or i4 %a, %c
  %not2 = xor i4 %or2, 15
  %and2 = and i4 %not2, %b
  %or3 = or i4 %and1, %and2
  ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %not = xor i4 %a, 15
  %or3 = and i4 %xor, %not
  ret i4 %or3
}
Transformation seems to be correct!
```
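
The same identity can be checked by brute force outside of Alive2. The sketch below, which assumes 4-bit values stored in `std::uint8_t` and uses the illustrative names `src`, `tgt`, and `countMismatches`, enumerates all 16 x 16 x 16 inputs:

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 4 bits, mirroring the i4 type in the IR above.
constexpr std::uint8_t kMask = 0xF;

// (c & ~(a | b)) | (b & ~(a | c))
std::uint8_t src(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
  return static_cast<std::uint8_t>(((c & ~(a | b)) | (b & ~(a | c))) & kMask);
}

// ~a & (b ^ c)
std::uint8_t tgt(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
  return static_cast<std::uint8_t>((~a & (b ^ c)) & kMask);
}

// Counts the inputs on which the two forms disagree.
int countMismatches() {
  int mismatches = 0;
  for (int a = 0; a < 16; ++a)
    for (int b = 0; b < 16; ++b)
      for (int c = 0; c < 16; ++c)
        if (src(a, b, c) != tgt(a, b, c))
          ++mismatches;
  return mismatches;
}
```

The fold follows from factoring out `~a`: both terms of the original contain `~a`, and the remainder `(c & ~b) | (b & ~c)` is `b ^ c`.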

Differential Revision: https://reviews.llvm.org/D112276

3 years ago[lld][WebAssembly] Handle duplicate archive member names in ThinLTO
Sam Clegg [Thu, 28 Oct 2021 15:15:20 +0000 (08:15 -0700)]
[lld][WebAssembly] Handle duplicate archive member names in ThinLTO

This entire change, including the test case, comes almost verbatim
from the ELF driver.

Fixes: https://github.com/emscripten-core/emscripten/issues/12763

Differential Revision: https://reviews.llvm.org/D112723

3 years ago[lld] Rename addCombinedLTOObjects to match ELF driver. NFC
Sam Clegg [Thu, 28 Oct 2021 14:33:30 +0000 (07:33 -0700)]
[lld] Rename addCombinedLTOObjects to match ELF driver. NFC

This function was renamed in https://reviews.llvm.org/D62291.
The new name seems more accurate, and it's also good to maintain
some consistency between these methods in the different drivers.

Differential Revision: https://reviews.llvm.org/D112719

3 years ago[AArch64] Rename some timm predicates for consistency. NFC.
Ahmed Bougacha [Thu, 28 Oct 2021 17:20:11 +0000 (10:20 -0700)]
[AArch64] Rename some timm predicates for consistency. NFC.

timm isn't the common case, and TImmLeafs should make it clear what
they are.  We're adding a plain ImmLeaf for 0_65535, so rename
i64_imm0_65535 to timm64_0_65535, and imm32_0_7 to timm32_0_7.

3 years ago[IRSymTab] Mark __stack_chk_guard used
Yuanfang Chen [Thu, 28 Oct 2021 18:19:37 +0000 (11:19 -0700)]
[IRSymTab] Mark __stack_chk_guard used

`__stack_chk_guard` is a global variable that has no uses before the LLVM code generation phase (how it is defined is platform-dependent), so LTO needs to preserve this symbol. Currently, the legacy LTO API preserves it by hardcoding the logic in Internalizer, but the regular LTO API does not preserve it in the thinlink phase. This patch marks `__stack_chk_guard` as used during IR symbol table writing, which is how thinlink already preserves builtin functions via `RuntimeLibcalls.def`.

Reviewed By: MaskRay, tejohnson

Differential Revision: https://reviews.llvm.org/D112595

3 years ago[Internalize] Preserve __stack_chk_fail in Internalizer correctly
Yuanfang Chen [Thu, 28 Oct 2021 18:19:24 +0000 (11:19 -0700)]
[Internalize] Preserve __stack_chk_fail in Internalizer correctly

Move the section collecting `AlwaysPreserved` up before any
`maybeInternalize` is called. Otherwise, functions in `AlwaysPreserved` (in this case, `__stack_chk_fail`)
are not preserved.

Reviewed By: MaskRay, tejohnson

Differential Revision: https://reviews.llvm.org/D112684

3 years ago[mlir][linalg] Remove unused method (NFC).
Tobias Gysi [Thu, 28 Oct 2021 17:46:35 +0000 (17:46 +0000)]
[mlir][linalg] Remove unused method (NFC).

Remove an unused method in hoist padding and format comment.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112714

3 years ago[TwoAddressInstructionPass] Put all new instructions into DistanceMap
Guozhi Wei [Thu, 28 Oct 2021 18:08:36 +0000 (11:08 -0700)]
[TwoAddressInstructionPass] Put all new instructions into DistanceMap

In function convertInstTo3Addr, after converting a two-address instruction into
a three-address instruction, only the last new instruction is inserted into
DistanceMap. This is wrong: DistanceMap should track all instructions from the
beginning of the current MBB to the working instruction. When a two-address
instruction is converted to a three-address instruction, multiple instructions
may be generated (usually an extra COPY is generated), and all of them should be
inserted into DistanceMap.

Similarly, when unfolding a memory operand in function tryInstructionTransform,
DistanceMap is not maintained correctly.

Differential Revision: https://reviews.llvm.org/D111857

3 years ago[libc++] P0433R2: add the remaining deduction guides.
Konstantin Varlamov [Thu, 28 Oct 2021 07:36:19 +0000 (00:36 -0700)]
[libc++] P0433R2: add the remaining deduction guides.

Add deduction guides to `valarray` and `scoped_allocator_adaptor`. This largely
finishes implementation of the paper:

* deduction guides for other classes mentioned in the paper were
  implemented previously (see the list below);
* deduction guides for several classes contained in the proposal
  (`reference_wrapper`, `lock_guard`, `scoped_lock`, `unique_lock`,
  `shared_lock`) were removed by [LWG2981](https://wg21.link/LWG2981).

Also add deduction guides to the synopsis for the few classes (e.g. `pair`)
where they were missing.

The only part of the paper that isn't fully implemented after this patch is
making sure certain deduction guides don't participate in overload resolution
when given incorrect template parameters.

List of significant commits implementing the other parts of P0433 (omitting some
minor fixes):

* [pair](https://github.com/llvm/llvm-project/commit/af65856eec160d163c764faad250d93357be7c83)
* [basic_string](https://github.com/llvm/llvm-project/commit/6d9f750dec29e8ae5366092e64cd343dae2c7464)
* [array](https://github.com/llvm/llvm-project/commit/0ca8c0895c6034615593c295dd955f29b25bf3d4)
* [deque](https://github.com/llvm/llvm-project/commit/dbb6f8a8179b0604e25707b5c1b72be6164f62d9)
* [forward_list](https://github.com/llvm/llvm-project/commit/e076700b7786959206acef136ecf05d54078e4e1)
* [list](https://github.com/llvm/llvm-project/commit/4a227e582b2f13880ea049b29988a37a0f7c0742)
* [vector](https://github.com/llvm/llvm-project/commit/df8f75479278d5ce16eede342ceb5ba2fd71460b)
* [queue/stack/priority_queue](https://github.com/llvm/llvm-project/commit/5b8b8b5dce587f1e5a4a31cc24f09b18bd53ff9a)
* [basic_regex](https://github.com/llvm/llvm-project/commit/edd5e29cfe9f67ec8e7e0eda12eb05e616fdeebc)
* [optional](https://github.com/llvm/llvm-project/commit/f35b4bc3954f3b01051fc0848535ff784809e9e2)
* [map/multimap](https://github.com/llvm/llvm-project/commit/edfe8525de1f7278f4754f2bffd47b13ec291a17)
* [set/multiset](https://github.com/llvm/llvm-project/commit/e20865c387e09ea0ebd5add15c762cd5271ff65f)
* [unordered_set/unordered_multiset](https://github.com/llvm/llvm-project/commit/296a80102a9b72c3eda80558fb78a3ed8849b341)
* [unordered_map/unordered_multimap](https://github.com/llvm/llvm-project/commit/dfcd4384cbcac0eeb7e5cbce350f875ba4da79d5)
* [function](https://github.com/llvm/llvm-project/commit/e1eabcdfad89f67ae575b0c86aa4a72d277378b4)
* [tuple](https://github.com/llvm/llvm-project/commit/1308011e1b5c5382281a63dd4191a1784f8d2295)
* [shared_ptr/weak_ptr](https://github.com/llvm/llvm-project/commit/83564056d4b186c9fcf016cdbb388755009f7b5a)

Additional notes:
* It was revision 2 of the paper that was voted into the Standard.
  P0433R3 is a separate paper that is not part of the Standard.
* The paper also mandates removing several `make_*_searcher` functions
  (e.g. `make_boyer_moore_searcher`) which are currently not implemented
  (except in `experimental/`).
* The `__cpp_lib_deduction_guides` feature test macro from the paper was
  accidentally omitted from the Standard.

Differential Revision: https://reviews.llvm.org/D112510

3 years ago[compiler-rt] fix asan buildbot failure on unit test for darwin
David CARLIER [Thu, 28 Oct 2021 17:48:54 +0000 (18:48 +0100)]
[compiler-rt] fix asan buildbot failure on unit test for darwin

3 years agoX86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr
Matthias Braun [Tue, 28 Sep 2021 00:57:22 +0000 (17:57 -0700)]
X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr

This extends `optimizeCompareInstr` to re-use previous comparison
results if the previous comparison was with an immediate that was 1
bigger or smaller. Example:

    CMP x, 13
    ...
    CMP x, 12   ; can be removed if we change the SETg
    SETg ...    ; x > 12  changed to `SETge` (x >= 13) removing CMP

Motivation: This often happens because SelectionDAG canonicalization
tends to add/subtract 1 often when optimizing for fallthrough blocks.
Example for `x > C` the fallthrough optimization switches true/false
blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
`x < C + 1`.
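
The equivalence the optimization depends on, `x > C` if and only if `x >= C + 1`, can be checked exhaustively for a small width. This is only a sketch of the arithmetic, assuming signed 8-bit operands and skipping the one constant where `C + 1` would overflow; `setG`, `setGE`, and `countMismatches` are illustrative names and not part of `X86InstrInfo`.

```cpp
#include <cassert>
#include <cstdint>

// Models the condition read by SETg after `CMP x, imm` (signed greater-than).
bool setG(std::int8_t x, std::int8_t imm) { return x > imm; }

// Models the condition read by SETge after `CMP x, imm`.
bool setGE(std::int8_t x, std::int8_t imm) { return x >= imm; }

// Counts (x, c) pairs where `x > c` and `x >= c + 1` disagree.
// c == 127 is excluded because c + 1 does not fit in 8 bits.
int countMismatches() {
  int mismatches = 0;
  for (int x = -128; x <= 127; ++x)
    for (int c = -128; c < 127; ++c)
      if (setG(static_cast<std::int8_t>(x), static_cast<std::int8_t>(c)) !=
          setGE(static_cast<std::int8_t>(x), static_cast<std::int8_t>(c + 1)))
        ++mismatches;
  return mismatches;
}
```

In the commit's example, `setG(x, 12)` and `setGE(x, 13)` agree for every `x`, so the flags from `CMP x, 13` can be reused once the condition code is adjusted.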

Differential Revision: https://reviews.llvm.org/D110867

3 years agoX86InstrInfo: Optimize more combinations of SUB+CMP
Matthias Braun [Mon, 27 Sep 2021 22:21:15 +0000 (15:21 -0700)]
X86InstrInfo: Optimize more combinations of SUB+CMP

`X86InstrInfo::optimizeCompareInstr` would only optimize a `SUB`
followed by a `CMP` in `isRedundantFlagInstr`. This extends the code to
also look for other combinations like `CMP`+`CMP`, `TEST`+`TEST`, `SUB
x,0`+`TEST`.

- Change `isRedundantFlagInstr` to run `analyzeCompareInstr` on the
  candidate instruction and compare the results. This normalizes things
  and gives consistent results for various comparisons (`CMP x, y`,
  `SUB x, y`) and immediate cases (`TEST x, x`, `SUB x, 0`,
  `CMP x, 0`...).
- Turn `isRedundantFlagInstr` into a member function so it can call
  `analyzeCompare`.
- We now also check `isRedundantFlagInstr` even if `IsCmpZero` is true,
  since we now have cases like `TEST`+`TEST`.

Differential Revision: https://reviews.llvm.org/D110865

3 years ago[lldb] [Host/ConnectionFileDescriptor] Refactor to improve code reuse
Michał Górny [Mon, 25 Oct 2021 20:55:48 +0000 (22:55 +0200)]
[lldb] [Host/ConnectionFileDescriptor] Refactor to improve code reuse

Refactor ConnectionFileDescriptor to improve code reuse for different
types of sockets.  Unify method naming.

While at it, remove some (now-)dead code from Socket.

Differential Revision: https://reviews.llvm.org/D112495

3 years ago[VPlan] Keep induction recipes in header.
Florian Hahn [Thu, 28 Oct 2021 17:01:53 +0000 (18:01 +0100)]
[VPlan] Keep induction recipes in header.

This patch updates recipe creation to ensure all
VPWidenIntOrFpInductionRecipes are in the header block. At the moment,
new induction recipes can be created in different blocks when trying to
optimize casts and induction variables.

Having all induction recipes in the header makes it easier to
analyze/transform them in VPlan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D111300

3 years ago[libc++] Update the CI Docker image to Focal
Louis Dionne [Thu, 28 Oct 2021 15:35:18 +0000 (11:35 -0400)]
[libc++] Update the CI Docker image to Focal

Differential Revision: https://reviews.llvm.org/D112726

3 years ago[mlir] Implement replacement of SymbolRefAttrs in Dialect attributes using SubElement...
Markus Böck [Thu, 28 Oct 2021 17:08:10 +0000 (19:08 +0200)]
[mlir] Implement replacement of SymbolRefAttrs in Dialect attributes using SubElementAttr interface

This patch extends the SubElementAttr interface to allow replacing a contained sub attribute. The attribute that should be replaced is identified by an index which denotes the n-th element returned by the accompanying walkImmediateSubElements method.

Using this addition the patch implements replacing SymbolRefAttrs contained within any dialect attributes.

Differential Revision: https://reviews.llvm.org/D111357

3 years agoMachineDominators: Define MachineDomTree type alias
Nicolai Hähnle [Thu, 28 Oct 2021 05:19:34 +0000 (10:49 +0530)]
MachineDominators: Define MachineDomTree type alias

This is a (very) small move towards making the machine dominators more
aligned with the IR dominators:

* DominatorTree / MachineDomTree is the class holding the dominator tree
* DominatorTreeWrapperPass / MachineDominatorTree is the corresponding
  (machine) function pass

This alignment will be used by analyses that are designed as templates
that work with LLVM IR as well as Machine IR.

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D112690

3 years ago[PowerPC][NFC] Update builtins-ppc-xlcompat-trap-64bit-only.ll and builtins-ppc-xlcom...
Victor Huang [Thu, 28 Oct 2021 16:22:15 +0000 (11:22 -0500)]
[PowerPC][NFC] Update builtins-ppc-xlcompat-trap-64bit-only.ll and builtins-ppc-xlcompat-trap.ll to show full reg names

3 years ago[ELF] Change common diagnostics to report both object file location and source file...
Fangrui Song [Thu, 28 Oct 2021 16:38:45 +0000 (09:38 -0700)]
[ELF] Change common diagnostics to report both object file location and source file location

Many diagnostics use `getErrorPlace` or `getErrorLocation` to report a location.
In the presence of line table debug information, `getErrorPlace` uses a source
file location and ignores the object file location. However, the object file
location is sometimes more useful.

This patch changes "undefined symbol" and "out of range" diagnostics to report
both object/source file locations. Other diagnostics can use similar format if
needed.

The key idea is to let `InputSectionBase::getLocation` report the object file
location and use `getSrcMsg` for source file/line information. `getSrcMsg`
doesn't leverage `STT_FILE` information yet, but I think the temporary lack of
the functionality is ok.

For the ARM "branch and link relocation" diagnostic, I arbitrarily place the
source file location at the end of the line. The diagnostic is not very common
so its formatting doesn't need to be pretty.

Differential Revision: https://reviews.llvm.org/D112518

3 years ago[IR] Fix a warning
Kazu Hirata [Thu, 28 Oct 2021 16:38:25 +0000 (09:38 -0700)]
[IR] Fix a warning

This patch fixes:

  mlir/lib/IR/BuiltinAttributes.cpp:876:39: error: unused function
  'isComplexOfIntType' [-Werror,-Wunused-function]

in a release build.

3 years agoRegen some autogen tests to account for format change
Philip Reames [Thu, 28 Oct 2021 16:19:26 +0000 (09:19 -0700)]
Regen some autogen tests to account for format change

3 years ago[compiler-rt] update detect_write_exec option for apple devices.
David CARLIER [Thu, 28 Oct 2021 16:08:23 +0000 (17:08 +0100)]
[compiler-rt] update detect_write_exec option for apple devices.

Reviewed By: yln, vitalybuka

Differential Revision: https://reviews.llvm.org/D111390

3 years agoAutogen a test for ease of update
Philip Reames [Thu, 28 Oct 2021 15:54:27 +0000 (08:54 -0700)]
Autogen a test for ease of update

3 years agoAdd support for Bazel builds on Windows with `clang-cl`.
Chandler Carruth [Sat, 16 Oct 2021 07:46:19 +0000 (00:46 -0700)]
Add support for Bazel builds on Windows with `clang-cl`.

Adds a basic `--config=clang-cl` to set up the options needed, and
then fixes a number of issues that surface in Windows builds for me.

With these fixes, `//llvm/...` builds cleanly. One unittest still fails,
but it's just due to running out of stack space due to creating a large
number of short-lived stack variables. The test should probably be
decomposed into a set of tests (`LegalizerInfoTest::RuleSets`), but that
seemed like too invasive of a change here and with everything building
cleanly this isn't disrupting me experimenting with Windows builds.

Some parts of `//clang/...` builds, but that will require more work.

3 years ago[mlir][sparse] move conversion test back to original CHECK testing
Aart Bik [Thu, 28 Oct 2021 03:35:52 +0000 (20:35 -0700)]
[mlir][sparse] move conversion test back to original CHECK testing

Rationale:
The silent exit(1) gives few clues about where the error occurs on failure
and may even be confusing at first. The CHECK testing of all computed values
and indices may be a little bit more elaborate, but it directly pinpoints
where errors happen if they occur. This style is also consistent with
the other tests, which I actually prefer.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D112688

3 years ago[libc][NFC] Move utils/CPP to src/__support/CPP.
Siva Chandra Reddy [Wed, 27 Oct 2021 18:49:00 +0000 (18:49 +0000)]
[libc][NFC] Move utils/CPP to src/__support/CPP.

The idea is to move all pieces related to the actual libc sources to the
"src" directory. This allows downstream users to ship and build just the
"src" directory.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D112653

3 years ago[clangd] SelectionTree should prefer lexical declcontext
Kadir Cetinkaya [Thu, 28 Oct 2021 12:56:53 +0000 (14:56 +0200)]
[clangd] SelectionTree should prefer lexical declcontext

This is important especially for code that tries to traverse scopes as
written in code, which is the contract SelectionTree tries to satisfy.

Differential Revision: https://reviews.llvm.org/D112712

3 years ago[libc++][ci] Update to Clang 13.
Mark de Wever [Sat, 23 Oct 2021 11:08:01 +0000 (13:08 +0200)]
[libc++][ci] Update to Clang 13.

Per our support plan we should now support Clang 12 and 13. Adjust the
documentation and the CI runners. The change indirectly moves the main
CI runners to use the Clang 14 nightly builds.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D112360

3 years ago[CGProfile] Don't emit call graph profile edges with zero weight
Leonard Grey [Thu, 28 Oct 2021 15:27:01 +0000 (11:27 -0400)]
[CGProfile] Don't emit call graph profile edges with zero weight

With D112160 and D112164, on a Chrome Mac build this reduces the total
size of CGProfile sections by 78% (around 25% eliminated entirely) and
total size of object files by 0.14%.

Differential Revision: https://reviews.llvm.org/D112655

3 years ago[OpenMP] Initial parsing/sema for the 'omp loop' construct
Mike Rice [Thu, 28 Oct 2021 15:10:40 +0000 (08:10 -0700)]
[OpenMP] Initial parsing/sema for the 'omp loop' construct

Adds basic parsing/sema/serialization support for the #pragma omp loop
directive.

Differential Revision: https://reviews.llvm.org/D112499

3 years agoRevert "Reland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume.""
Daniel Kiss [Thu, 28 Oct 2021 15:24:53 +0000 (17:24 +0200)]
Revert "Reland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume.""

This reverts commit b6420e575f3bbb6b6df848c0284d6b60eeb07350.

3 years agoReland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume."
Daniel Kiss [Wed, 27 Oct 2021 08:32:11 +0000 (10:32 +0200)]
Reland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume."

This is relanding commit da1d1a08694bbfe0ea7a23ea094612436e8a2dd0 .
This patch additionally addresses failures found in buildbots & post review comments.

ARM EHABI[1] specifies the __cxa_end_cleanup to be called after cleanup.
It will call the UnwindResume.
__cxa_begin_cleanup will be called from libcxxabi while __cxa_end_cleanup is never called.
This will trigger a termination when a foreign exception is processed while UnwindResume is called
because the global state will be wrong due to the missing __cxa_end_cleanup call.

Additional test here: D109856
[1] https://github.com/ARM-software/abi-aa/blob/main/ehabi32/ehabi32.rst#941compiler-helper-functions

Reviewed By: logan

Differential Revision: https://reviews.llvm.org/D111703

3 years ago[InstCombine] Extend canonicalizeClampLike to handle truncated inputs
David Green [Thu, 28 Oct 2021 14:46:58 +0000 (15:46 +0100)]
[InstCombine] Extend canonicalizeClampLike to handle truncated inputs

This extends the canonicalizeClampLike function to allow cases where the
input is truncated, but still matching on the types of the ICmps. For
example
  %t = trunc i32 %X to i8
  %a = add i32 %X, 128
  %cmp = icmp ult i32 %a, 256
  %c = icmp sgt i32 %X, -1
  %f = select i1 %c, i8 High, i8 Low
  %r = select i1 %cmp, i8 %t, i8 %f
becomes
  %c1 = icmp slt i32 %X, -128
  %c2 = icmp sge i32 %X, 128
  %s1 = select i1 %c1, i32 sext(Low), i32 %X
  %s2 = select i1 %c2, i32 sext(High), i32 %s1
  %t = trunc i32 %s2 to i8
https://alive2.llvm.org/ce/z/vPzfxH

We limit the transform to constant High and Low values, where we know
the sext are free.
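
To make the before/after forms concrete, here is a C++ model of both IR sequences with a spot check over a range of inputs. It is a sketch only: `clampOriginal`, `clampCanonical`, and `countMismatches` are illustrative names, the `high`/`low` constants are arbitrary, and the narrowing casts assume the usual modular behaviour of `trunc`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Models the original sequence: select the truncated input when X is in
// [-128, 127], otherwise pick High or Low based on the sign of X.
std::int8_t clampOriginal(std::int32_t x, std::int8_t high, std::int8_t low) {
  std::int8_t t = static_cast<std::int8_t>(x);             // trunc i32 -> i8
  std::uint32_t a = static_cast<std::uint32_t>(x) + 128u;  // add i32 %X, 128 (wraps)
  bool cmp = a < 256u;                                     // icmp ult
  bool c = x > -1;                                         // icmp sgt
  std::int8_t f = c ? high : low;
  return cmp ? t : f;
}

// Models the canonicalized sequence: clamp in i32, then truncate once.
std::int8_t clampCanonical(std::int32_t x, std::int8_t high, std::int8_t low) {
  bool c1 = x < -128;
  bool c2 = x >= 128;
  std::int32_t s1 = c1 ? std::int32_t{low} : x;   // sext(Low)
  std::int32_t s2 = c2 ? std::int32_t{high} : s1; // sext(High)
  return static_cast<std::int8_t>(s2);
}

// Compares both forms on [-1000, 1000] plus the i32 extremes.
int countMismatches(std::int8_t high, std::int8_t low) {
  int mismatches = 0;
  for (std::int32_t x = -1000; x <= 1000; ++x)
    if (clampOriginal(x, high, low) != clampCanonical(x, high, low))
      ++mismatches;
  const std::int32_t extremes[] = {std::numeric_limits<std::int32_t>::min(),
                                   std::numeric_limits<std::int32_t>::max()};
  for (std::int32_t x : extremes)
    if (clampOriginal(x, high, low) != clampCanonical(x, high, low))
      ++mismatches;
  return mismatches;
}
```

This is a sampled check, not a proof; the Alive2 link above remains the authoritative verification.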

Differential Revision: https://reviews.llvm.org/D108049

3 years ago[docs][NFC] Strip trailing whitespace from GettingStarted.rst
Louis Dionne [Thu, 28 Oct 2021 14:44:48 +0000 (10:44 -0400)]
[docs][NFC] Strip trailing whitespace from GettingStarted.rst

3 years agoRevert "[clang] Fortify warning for scanf calls with field width too big."
Nico Weber [Thu, 28 Oct 2021 14:41:18 +0000 (10:41 -0400)]
Revert "[clang] Fortify warning for scanf calls with field width too big."

This reverts commit 15e3d39110fa4449be4f56196af3bc81b623f3ab.
See https://reviews.llvm.org/D111833#3093629

3 years ago[lldb][NFC] Improve CppModuleConfiguration documentation a bit
Raphael Isemann [Thu, 28 Oct 2021 14:39:06 +0000 (16:39 +0200)]
[lldb][NFC] Improve CppModuleConfiguration documentation a bit

3 years ago[lld][ELF] Update name of function in comment. NFC
Sam Clegg [Thu, 28 Oct 2021 14:29:43 +0000 (07:29 -0700)]
[lld][ELF] Update name of function in comment. NFC

This function was renamed in https://reviews.llvm.org/D62291.

3 years ago[DSE] Eliminates redundant store of an existing value (PR16520)
Dawid Jurczak [Fri, 22 Oct 2021 12:11:12 +0000 (14:11 +0200)]
[DSE] Eliminates redundant store of an existing value (PR16520)

This is a follow-up to https://reviews.llvm.org/D90328.

This change eliminates writes to variables where the value being written is already stored in the variable.
It does so by looping through all memory definitions in the current state and getting the defining access of each one.
When the defining access is a write instruction identical to the original instruction, the redundant write is removed.

For example:

    void f() {
      x = 1;
      if (foo()) {
        x = 1;
        g();
      } else {
        h();
      }
    }
    void g();
    void h();

The second `x = 1` will be eliminated since it rewrites 1 to x. This pass will produce this:

    void f() {
      x = 1;
      if (foo()) {
        g();
      } else {
        h();
      }
    }
    void g();
    void h();

Differential Revision: https://reviews.llvm.org/D111727

3 years ago[InstCombine] Fix rare condition violation in canonicalizeClampLike
David Green [Thu, 28 Oct 2021 14:03:07 +0000 (15:03 +0100)]
[InstCombine] Fix rare condition violation in canonicalizeClampLike

With a "ult x, 0", the fold in canonicalizeClampLike does not validate
with undef inputs. This condition will usually have been simplified
away, but we should ensure the code is correct in case it has not been.
https://alive2.llvm.org/ce/z/S8HQ6H vs https://alive2.llvm.org/ce/z/h2XBJ_

See: https://reviews.llvm.org/D108049

3 years ago[mlir][linalg] Fix FoldConstantTranspose execution inefficiency
Lei Zhang [Thu, 28 Oct 2021 13:45:07 +0000 (09:45 -0400)]
[mlir][linalg] Fix FoldConstantTranspose execution inefficiency

* Move SmallVectors outside of inner loops to avoid frequent
  allocations and deallocations
* Calculate linearized index and call flat range getters to
  avoid internal shape querying behind `getValue`.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D112099

3 years ago[X86][AVX] Attempt to fold a scaled index into a gather/scatter scale immediate ...
Simon Pilgrim [Thu, 28 Oct 2021 13:07:17 +0000 (14:07 +0100)]
[X86][AVX] Attempt to fold a scaled index into a gather/scatter scale immediate (PR13310)

If the index operand for a gather/scatter intrinsic is being scaled (self-addition or a shl-by-immediate) then we may be able to fold that scaling into the intrinsic scale immediate value instead.
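
The arithmetic behind this fold can be sketched as follows. This is not the actual DAG combine: `foldShiftIntoScale` and `effectiveAddress` are hypothetical helpers, and the sketch assumes only that the x86 scale immediate must be 1, 2, 4, or 8 and that the incoming scale is already one of those values.

```cpp
#include <cassert>
#include <cstdint>

// Address computed for one gather/scatter lane: base + index * scale.
std::uint64_t effectiveAddress(std::uint64_t base, std::uint64_t index,
                               unsigned scale) {
  return base + index * scale;
}

// Hypothetical helper: tries to absorb `index << shiftAmt` into the scale.
// Succeeds only if the resulting scale is still encodable (at most 8).
// A self-addition (index + index) corresponds to shiftAmt == 1.
bool foldShiftIntoScale(unsigned shiftAmt, unsigned &scale) {
  if (shiftAmt >= 4)
    return false;
  unsigned newScale = scale << shiftAmt;
  if (newScale > 8)
    return false;
  scale = newScale;
  return true;
}
```

For example, an index shifted left by 1 with scale 2 addresses the same memory as the unshifted index with scale 4, whereas a scale of 8 leaves no room to absorb any further shift.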

Fixes PR13310.

Differential Revision: https://reviews.llvm.org/D108539

3 years ago[clangd] Escape error message in AddUsing
Kadir Cetinkaya [Thu, 28 Oct 2021 12:47:57 +0000 (14:47 +0200)]
[clangd] Escape error message in AddUsing

Fixes https://github.com/clangd/clangd/issues/900

3 years ago[gn build] (manually) port d736002e90b5
Nico Weber [Thu, 28 Oct 2021 12:48:54 +0000 (08:48 -0400)]
[gn build] (manually) port d736002e90b5

3 years ago[SLP]Improve/fix reordering of the gathered graph nodes.
Alexey Bataev [Mon, 25 Oct 2021 14:32:35 +0000 (07:32 -0700)]
[SLP]Improve/fix reordering of the gathered graph nodes.

Gathered loads/extractelements/extractvalue instructions should be
checked if they can represent a vector reordering node too and their
order should be taken into account for better graph reordering analysis.
Also, if the gather node has reused scalars, they must be reordered
instead of the scalars themselves.

Differential Revision: https://reviews.llvm.org/D112454

3 years agoRe-instate -Wweak-template-vtables as a no-op flag
Hans Wennborg [Thu, 28 Oct 2021 11:37:27 +0000 (13:37 +0200)]
Re-instate -Wweak-template-vtables as a no-op flag

Follow-up to 8c136805242014b6ad9ff1afcac9d7f4a18bec3f to allow a less
abrupt migration for users.

Differential revision: https://reviews.llvm.org/D112704

3 years ago[MLIR][LLVM] Add llvm.mlir.global_ctors/dtors and translation support
Uday Bondhugula [Wed, 20 Oct 2021 09:44:54 +0000 (15:14 +0530)]
[MLIR][LLVM] Add llvm.mlir.global_ctors/dtors and translation support

Add llvm.mlir.global_ctors and global_dtors ops and their translation
support to LLVM global_ctors/global_dtors global variables.

Differential Revision: https://reviews.llvm.org/D112524

3 years ago[lldb/test] Allow indentation in inline tests
Pavel Labath [Thu, 28 Oct 2021 09:23:24 +0000 (11:23 +0200)]
[lldb/test] Allow indentation in inline tests

This makes it possible to use for loops (and other language constructs)
in inline tests.

Differential Revision: https://reviews.llvm.org/D112706

3 years ago[InstCombine] allow Negator to fold multi-use select with constant arms
Sanjay Patel [Thu, 28 Oct 2021 12:11:59 +0000 (08:11 -0400)]
[InstCombine] allow Negator to fold multi-use select with constant arms

The motivating test is reduced from:
https://llvm.org/PR52261

Note that the more general problem of folding any binop into a multi-use
select of constants is still there. We need to ease the restriction in
InstCombinerImpl::FoldOpIntoSelect() to catch those. But these examples
never reach that code because Negator exclusively handles negation
patterns within visitSub().

Differential Revision: https://reviews.llvm.org/D112657

3 years ago[InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware
Peter Waller [Thu, 28 Oct 2021 12:14:52 +0000 (12:14 +0000)]
[InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware

The newly added test previously caused the compiler to fail an
assertion. It looks like a straightforward TypeSize upgrade.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D112142

3 years ago[InstSimplify] Add tests for the range of a half float. NFC
David Green [Thu, 28 Oct 2021 11:58:13 +0000 (12:58 +0100)]
[InstSimplify] Add tests for the range of a half float. NFC

3 years ago[GlobalISel][Tablegen] Fix SameOperandMatcher's isIdentical check
Konstantin Schwarz [Sun, 10 Oct 2021 09:17:07 +0000 (11:17 +0200)]
[GlobalISel][Tablegen] Fix SameOperandMatcher's isIdentical check

During rule optimization, identical SameOperandMatchers are hoisted into a common group,
however previously only one operand index was considered.
Commutable patterns can introduce SameOperandMatcher checks where the second index is commuted,
resulting in a different check that cannot be hoisted.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D111506

3 years ago[libomptarget] Build DeviceRTL for amdgpu
Jon Chesterfield [Thu, 28 Oct 2021 11:33:25 +0000 (12:33 +0100)]
[libomptarget] Build DeviceRTL for amdgpu

Passes same tests as the current deviceRTL. Includes cmake change from D111987.
CI is showing a different set of pass/fails to local, committing this
without the tests enabled by default while debugging that difference.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227

3 years agotsan: move memory access functions to a separate file
Dmitry Vyukov [Wed, 27 Oct 2021 14:00:23 +0000 (16:00 +0200)]
tsan: move memory access functions to a separate file

tsan_rtl.cpp is huge and does lots of things.
Move everything related to memory access and tracing
to a separate tsan_rtl_access.cpp file.
No functional changes, only code movement.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D112625

3 years ago[AMDGPU] Add 24-bit mulhi intrinsics in INTRINSIC_WO_CHAIN combine.
Abinav Puthan Purayil [Thu, 28 Oct 2021 10:38:59 +0000 (16:08 +0530)]
[AMDGPU] Add 24-bit mulhi intrinsics in INTRINSIC_WO_CHAIN combine.

mul24 intrinsic's operands are simplified by
AMDGPUTargetLowering::performIntrinsicWOChainCombine(). This change adds
the mul24hi intrinsics in the combine since its operands can be
simplified like that of the mul24 intrinsics.

Differential Revision: https://reviews.llvm.org/D112702

3 years ago[AMDGPU] Fix rhs of the tests in amdgpu-codegenprepare-mul24.ll.
Abinav Puthan Purayil [Thu, 28 Oct 2021 01:33:48 +0000 (07:03 +0530)]
[AMDGPU] Fix rhs of the tests in amdgpu-codegenprepare-mul24.ll.

Differential Revision: https://reviews.llvm.org/D112685

3 years ago[libc] automemcpy
Guillaume Chatelet [Mon, 11 Oct 2021 15:26:43 +0000 (15:26 +0000)]
[libc] automemcpy

3 years ago[flang] Checks for pointers to intrinsic functions
Emil Kieri [Mon, 25 Oct 2021 19:43:17 +0000 (21:43 +0200)]
[flang] Checks for pointers to intrinsic functions

Check that when a procedure pointer is initialised or assigned with an intrinsic
function, or when its interface is being defined by one, that intrinsic function
is unrestricted specific (listed in Table 16.2 of F'2018).

Mark intrinsics LGE, LGT, LLE, and LLT as restricted specific. Getting their
classifications right helps in designing the tests.

Differential Revision: https://reviews.llvm.org/D112381

3 years ago[clangd] NFC: Use more idiomatic way of checking for definition
Kirill Bobyrev [Thu, 28 Oct 2021 10:25:12 +0000 (12:25 +0200)]
[clangd] NFC: Use more idiomatic way of checking for definition

3 years ago[clangd] NFC: Match function signature in the header and source file
Kirill Bobyrev [Thu, 28 Oct 2021 10:11:31 +0000 (12:11 +0200)]
[clangd] NFC: Match function signature in the header and source file

3 years ago[dexter] XFAIL feature_test source-root-dir.cpp
OCHyams [Thu, 28 Oct 2021 09:17:26 +0000 (10:17 +0100)]
[dexter] XFAIL feature_test source-root-dir.cpp

Test is failing for unknown reasons and needs investigating.

3 years ago[AMDGPU] Add gfx10 uaddsat test coverage. NFC.
Jay Foad [Thu, 28 Oct 2021 08:39:19 +0000 (09:39 +0100)]
[AMDGPU] Add gfx10 uaddsat test coverage. NFC.

3 years ago[Test] Regenerate some of llc test checks using auto updater
Max Kazantsev [Thu, 28 Oct 2021 09:18:30 +0000 (16:18 +0700)]
[Test] Regenerate some of llc test checks using auto updater

3 years ago[analyzer] sprintf is a taint propagator not a source
Balazs Benics [Thu, 28 Oct 2021 09:03:02 +0000 (11:03 +0200)]
[analyzer] sprintf is a taint propagator not a source

Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.

This patch fixes that typo and demonstrates the correct behavior with
tests.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D112558

3 years ago[MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fix order clause
Shraiysh Vaishay [Thu, 28 Oct 2021 05:34:40 +0000 (11:04 +0530)]
[MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fix order clause

This patch adds the inclusive clause (which was missed in previous
reorganization - https://reviews.llvm.org/D110903) in omp.wsloop operation.
Added a test for validating it.

Also fixes the order clause, which was not accepting any values. It now accepts
"concurrent" as a value, as specified in the standard.

Reviewed By: kiranchandramohan, peixin, clementval

Differential Revision: https://reviews.llvm.org/D112198

3 years ago[AMDGPU][GlobalISel] Fix waterfall loops
Sebastian Neubauer [Thu, 28 Oct 2021 08:29:06 +0000 (10:29 +0200)]
[AMDGPU][GlobalISel] Fix waterfall loops

- Move the `s_and exec` to its correct position before the content of
  the waterfall loop
- Use the SI_WATERFALL pseudo instruction, like for sdag, to benefit
  from optimizations
- Add support for indirect function calls

To support indirect calls, add a G_SI_CALL instruction without register
class restrictions and insert a waterfall loop when applying register
banks.

Differential Revision: https://reviews.llvm.org/D109052

3 years ago[GlobalISel] Simplify RegBankSelect
Neubauer, Sebastian [Mon, 25 Oct 2021 14:11:42 +0000 (16:11 +0200)]
[GlobalISel] Simplify RegBankSelect

Save the instruction list of a block before selecting banks.
This makes it possible to cope with moved instructions, even if they are
reordered or split across multiple basic blocks.
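
The general idea of snapshotting a list before mutating it can be sketched
as follows. This is not the RegBankSelect code: `forEachOriginalInstr` is a
hypothetical helper, and a `std::list<int>` stands in for a basic block's
instruction list.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Visit every element that was in Block when the call started, even if the
// visitor inserts new elements around the one it is looking at.
// Assumption of this sketch: the visitor never erases a saved element.
template <typename Fn>
unsigned forEachOriginalInstr(std::list<int> &Block, Fn Visit) {
  // Save the original elements first; std::list iterators stay valid when
  // other elements are inserted.
  std::vector<std::list<int>::iterator> Saved;
  for (auto It = Block.begin(); It != Block.end(); ++It)
    Saved.push_back(It);

  unsigned Visited = 0;
  for (auto It : Saved) {
    Visit(Block, It); // may insert before or after It
    ++Visited;
  }
  return Visited; // newly inserted elements are not visited
}
```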

Differential Revision: https://reviews.llvm.org/D111223

3 years ago[lldb] Remove ConstString from Process, ScriptInterpreter and StructuredData plugin...
Pavel Labath [Fri, 22 Oct 2021 17:53:43 +0000 (19:53 +0200)]
[lldb] Remove ConstString from Process, ScriptInterpreter and StructuredData plugin names

3 years ago[Test] Regenerate checks using auto-update script
Max Kazantsev [Thu, 28 Oct 2021 08:13:09 +0000 (15:13 +0700)]
[Test] Regenerate checks using auto-update script

3 years ago[Driver][AArch64]Add driver support for neoverse-512tvb target
Caroline Concatto [Fri, 22 Oct 2021 08:22:14 +0000 (09:22 +0100)]
[Driver][AArch64]Add driver support for neoverse-512tvb target

The support for neoverse-512tvb mirrors the same option available in GCC[1].
There is no functional effect for this option yet.
This patch ensures the driver accepts "-mcpu=neoverse-512tvb", and enough
plumbing is in place to allow the new option to be used in the future.

[1]https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

Differential Revision: https://reviews.llvm.org/D112406

3 years ago[lldb] [Host/Socket] Make DecodeHostAndPort() return a dedicated struct
Michał Górny [Wed, 27 Oct 2021 15:52:45 +0000 (17:52 +0200)]
[lldb] [Host/Socket] Make DecodeHostAndPort() return a dedicated struct

Differential Revision: https://reviews.llvm.org/D112629

3 years ago[flang] runtime: Read environment variables directly
Diana Picus [Tue, 12 Oct 2021 12:40:48 +0000 (12:40 +0000)]
[flang] runtime: Read environment variables directly

Add support for reading environment variables directly, via std::getenv.
This needs to allocate a C-style string to pass into std::getenv. If the
memory allocation for that fails, we terminate.

This also changes the interface for EnvVariableLength to receive the
source file and line so we can crash gracefully.

Note that we are now completely ignoring the envp pointer passed into
ProgramStart, since that could go stale if the environment is modified
during execution.
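
A minimal sketch of the approach, under stated assumptions: the function name
matches the one mentioned above, but the signature, the error message and the
use of `std::malloc` are illustrative and not taken from the flang runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the length of the value of the environment
// variable whose name is given as (pointer, length), or 0 if it is not set.
std::size_t envVariableLength(const char *name, std::size_t nameLength,
                              const char *sourceFile, int line) {
  // The name is not necessarily null-terminated, so copy it into a C-style
  // string before passing it to std::getenv.
  char *cName = static_cast<char *>(std::malloc(nameLength + 1));
  if (!cName) {
    // Report where the failure happened, then terminate.
    std::fprintf(stderr, "%s:%d: out of memory\n", sourceFile, line);
    std::abort();
  }
  std::memcpy(cName, name, nameLength);
  cName[nameLength] = '\0';

  // Look the variable up at call time instead of consulting a saved envp.
  const char *value = std::getenv(cName);
  std::free(cName);
  return value ? std::strlen(value) : 0;
}
```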

Differential Revision: https://reviews.llvm.org/D111785

3 years ago[Support] [Windows] Manually clean up temp files if not setting delete disposition
Martin Storsjö [Mon, 4 Oct 2021 12:10:52 +0000 (15:10 +0300)]
[Support] [Windows] Manually clean up temp files if not setting delete disposition

Since D81803 / 79657e2339b58bc01fe1b85a448bb073d57d90bb, temp files
created on network shares don't set "Disposition.DeleteFile = true".
This flag normally takes care of removing the temp file both if the
process exits abnormally (either crashing or killed externally), and
when the file is closed cleanly.

For network shares, we voluntarily choose to not set the flag, and
if the operation to inspect the file handle (as a prerequisite to
setting the flag since 79657e2339b58bc01fe1b85a448bb073d57d90bb)
fails we also error out. In both of these cases, we can at least make
sure to remove the temp files when they are closed cleanly.

Adjust the semantics of "OF_Delete" to not set the delete
disposition, but only set the access mode for allowing deletion.
Move the call to setDeleteDisposition into TempFile::create,
where we can check if it failed, and if it did, set a flag noting
that the file should be removed manually at the end.

This does leak files on crash, but at least doesn't leak files
in regular successful runs. (Technically, the alternative codepath
could use the RemoveFileOnSignal function, but that might complicate
the TempFile implementation further.)

This fixes https://github.com/mstorsjo/llvm-mingw/issues/233 and
https://bugs.llvm.org/show_bug.cgi?id=52080.

Differential Revision: https://reviews.llvm.org/D111875

3 years ago[clang] [MinGW] Rename the 'Arch' member to 'SubdirName'. NFC.
Martin Storsjö [Sat, 16 Oct 2021 14:20:16 +0000 (17:20 +0300)]
[clang] [MinGW] Rename the 'Arch' member to 'SubdirName'. NFC.

This string isn't a plain architecture name, but contains the whole
subdir name used for the sysroot, which often is equal to the target
triple.

Differential Revision: https://reviews.llvm.org/D112387

3 years ago[clang][MIPS] Fix search path for Debian multilib O32
YunQiang Su [Wed, 27 Oct 2021 13:16:42 +0000 (16:16 +0300)]
[clang][MIPS] Fix search path for Debian multilib O32

In a multilib setup, the gcc objects are in a /32 directory. On
Debian, the libraries are under /libo32 to avoid conflicts. This patch
enables clang to find gcc in /32 and the C library in /libo32.

Differential Revision: https://reviews.llvm.org/D112158

3 years ago[clangd] Avoid expensive checks of buffer names in IncludeCleaner
Sam McCall [Wed, 27 Oct 2021 19:13:32 +0000 (21:13 +0200)]
[clangd] Avoid expensive checks of buffer names in IncludeCleaner

This changes the handling of special buffers (<command-line> etc) that
SourceManager treats as files but FileManager does not.

We now include them in findReferencedFiles() and drop them as part of
translateToHeaderIDs(). This pairs more naturally with the data representations
we're using, and so avoids a bunch of converting between representations for
filtering.

Differential Revision: https://reviews.llvm.org/D112652

3 years ago[CSSPGO] Trim cold base profiles for the CS preinliner.
Hongtao Yu [Wed, 27 Oct 2021 23:56:06 +0000 (16:56 -0700)]
[CSSPGO] Trim cold base profiles for the CS preinliner.

Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D112489

3 years ago[RISCV] Sync Zvlsseg register order as the same as vector registers.
Hsiangkai Wang [Wed, 22 Sep 2021 23:48:46 +0000 (07:48 +0800)]
[RISCV] Sync Zvlsseg register order as the same as vector registers.

Sync the order of Zvlsseg registers with vector registers to avoid
unnecessary register copies between vector instructions and zvlsseg
instructions.

Differential Revision: https://reviews.llvm.org/D110250

3 years agoAdd unix signal hit counts to the target statistics.
Greg Clayton [Thu, 28 Oct 2021 01:33:17 +0000 (18:33 -0700)]
Add unix signal hit counts to the target statistics.

Android and other platforms make wide use of signals when running applications and this can slow down debug sessions. Tracking this statistic can help us to determine why a debug session is slow.

The new data appears inside each target object and reports the signal hit counts:

      "signals": [
        {
          "SIGSTOP": 1
        },
        {
          "SIGUSR1": 1
        }
      ],

Differential Revision: https://reviews.llvm.org/D112683

3 years ago[mlir][GPUtoNVVM] Relax restriction on wmma op lowering
thomasraoux [Mon, 25 Oct 2021 19:42:36 +0000 (12:42 -0700)]
[mlir][GPUtoNVVM] Relax restriction on wmma op lowering

Allow lowering of wmma ops with 64-bit indexes. Change the default
version of the test to use the default layout.

Differential Revision: https://reviews.llvm.org/D112479

3 years ago[AMDGPU] Remove unused declaration findNumUsedRegistersSI (NFC)
Kazu Hirata [Thu, 28 Oct 2021 04:24:02 +0000 (21:24 -0700)]
[AMDGPU] Remove unused declaration findNumUsedRegistersSI (NFC)

3 years ago[Test] Add test showing missing simplifycfg opportunity for Phi with undef inputs
Max Kazantsev [Thu, 28 Oct 2021 04:22:34 +0000 (11:22 +0700)]
[Test] Add test showing missing simplifycfg opportunity for Phi with undef inputs

3 years ago[X86] Add a dependency breaking xor before any gathers with an undef passthru value.
Phoebe Wang [Thu, 28 Oct 2021 02:56:04 +0000 (10:56 +0800)]
[X86] Add a dependency breaking xor before any gathers with an undef passthru value.

In the instruction encoding, the passthru register is always
tied to the destination register. The CPU scheduler has to wait
for the last writer of this register to finish executing before
the gather can start. This is true even if the initial mask is
all ones so that the passthru will never be used.

By explicitly zeroing the register we can break the false
dependency. The zero idiom is executed completely by the
register renamer and so is immediately considered ready.

Authored by Craig.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D112505

3 years ago[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.
Hsiangkai Wang [Fri, 2 Jul 2021 01:25:50 +0000 (09:25 +0800)]
[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.

If we know the source operand of COPY is defined by a tail-agnostic vector
instruction with the same LMUL, and no vsetvli between the defining
instruction and the COPY changes vl or vtype, we can use vmv.v.v or
vmv.v.i to copy vector registers and get better performance than with
the whole vector register move instructions.

If the source of COPY is from vmv.v.i, we could use vmv.v.i for the
COPY.

This patch only considers all these instructions within one basic block.

Case 1:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before COPY and VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli between VOP and COPY.
  vx = COPY vy
```

Case 2:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli to change vl between VOP and COPY.
  ...
  VSETVLI          # The first VSETVLI before COPY.
  ...              # This VSETVLI does not change vl and vtype.
  ...
  vx = COPY vy
```

Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>
Differential Revision: https://reviews.llvm.org/D103510

3 years ago[clang] Fortify warning for scanf calls with field width too big.
Michael Benfield [Thu, 14 Oct 2021 20:02:28 +0000 (20:02 +0000)]
[clang] Fortify warning for scanf calls with field width too big.

Differential Revision: https://reviews.llvm.org/D111833

3 years ago[AMDGPU] Add more llc tests for 48-bit mul generation.
Abinav Puthan Purayil [Tue, 26 Oct 2021 16:08:21 +0000 (21:38 +0530)]
[AMDGPU] Add more llc tests for 48-bit mul generation.

Differential Revision: https://reviews.llvm.org/D112554

3 years ago[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption
Max Kazantsev [Thu, 28 Oct 2021 02:08:48 +0000 (09:08 +0700)]
[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption

Following discussion in D110390, it seems that we are suffering from an inability
to traverse users of a SCEV being invalidated. As a result, ScalarEvolution's
inner caches may store obsolete data about SCEVs even if their operands are
forgotten. This creates problems when we try to verify the contents of those caches.

It is also common for cache mishandling to cause very sneaky and
hard-to-analyze bugs related to memory corruption when dealing with cached
data. They are lurking there because ScalarEvolution's verification is not powerful
enough and misses many problematic cases. I plan to make SCEV's verification
much stricter in follow-ups, and this requires dangling-pointer-free caches.

This patch makes sure that, whenever we forget cached information for a SCEV,
we also forget it for all SCEVs that (transitively) use it.
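
The transitive invalidation can be sketched with a worklist over a user
graph. This is an illustration only: `ExprCache`, its members and the integer
expression ids are hypothetical and do not reflect ScalarEvolution's actual
data structures.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a memoization cache keyed by expression id.
struct ExprCache {
  std::unordered_map<int, std::vector<int>> Users; // expr -> exprs that use it
  std::unordered_map<int, long> Memoized;          // expr -> cached result

  // Forget the cached result for Root and for every expression that uses it,
  // directly or transitively.
  void forget(int Root) {
    std::vector<int> Worklist{Root};
    std::unordered_set<int> Visited{Root};
    while (!Worklist.empty()) {
      int Cur = Worklist.back();
      Worklist.pop_back();
      Memoized.erase(Cur);
      auto It = Users.find(Cur);
      if (It == Users.end())
        continue;
      for (int U : It->second)
        if (Visited.insert(U).second) // enqueue each user only once
          Worklist.push_back(U);
    }
  }
};
```

Each forget now walks the reachable part of the user graph, which is the
source of the potential compile-time cost.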

This may have negative compile time impact. It's a sacrifice we are more
than willing to make to enforce correctness. We can also save some time by
reworking invokers of forgetMemoizedResults (maybe we can forget multiple
SCEVs with single query).

Differential Revision: https://reviews.llvm.org/D111533
Reviewed By: reames

3 years ago[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI
Craig Topper [Thu, 28 Oct 2021 02:19:03 +0000 (19:19 -0700)]
[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI

Add new hasVInstructions() which is currently equivalent.

Replace vector uses of hasStdExtZfh/F/D with new vector specific
versions. The vector spec no longer requires that the vectors implement the
same types as scalar. It only requires that the scalar type is
the maximum size the vectors can support. This is currently
implemented using the scalar rule we were using before.

Add new hasVInstructionsI64() and begin using it to qualify code that
requires i64 vector elements.

This is all NFC for now, but we can start using this to better
implement D112408 which introduces the Zve extensions.

Reviewed By: frasercrmck, eopXD

Differential Revision: https://reviews.llvm.org/D112496

3 years ago[hwasan] print exact mismatch offset for short granules.
Florian Mayer [Fri, 22 Oct 2021 00:23:45 +0000 (01:23 +0100)]
[hwasan] print exact mismatch offset for short granules.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D104463

3 years ago[clang][compiler-rt][atomics] Add `__c11_atomic_fetch_nand` builtin and support ...
Kai Luo [Thu, 28 Oct 2021 02:18:16 +0000 (02:18 +0000)]
[clang][compiler-rt][atomics] Add `__c11_atomic_fetch_nand` builtin and support `__atomic_fetch_nand` libcall

Add `__c11_atomic_fetch_nand` builtin to language extensions and support `__atomic_fetch_nand` libcall in compiler-rt.
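
For reference, fetch-nand stores `~(old & val)` and returns the old value.
The sketch below shows those semantics with a portable compare-exchange loop
on `std::atomic`; `fetchNand` is a hypothetical helper and is not how the
builtin or the compiler-rt libcall is implemented.

```cpp
#include <atomic>
#include <cassert>

// Atomically replace Obj with ~(Obj & Val) and return the previous value.
unsigned fetchNand(std::atomic<unsigned> &Obj, unsigned Val) {
  unsigned Old = Obj.load();
  // On failure compare_exchange_weak reloads Old, so the new value is
  // recomputed from the latest contents on every retry.
  while (!Obj.compare_exchange_weak(Old, ~(Old & Val))) {
  }
  return Old;
}
```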

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D112400

3 years ago[OpenMP] Declare variants for templates need to match # template args
Johannes Doerfert [Thu, 28 Oct 2021 00:39:28 +0000 (19:39 -0500)]
[OpenMP] Declare variants for templates need to match # template args

A declare variant template is only compatible with a base when the
number of template arguments is equal, otherwise our instantiations will
produce nonsensical results.

Exposed as part of D109344.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D109770

3 years ago[Attributor][FIX] Do not ignore memory writes in AAMemoryBehavior
Johannes Doerfert [Mon, 13 Sep 2021 12:40:41 +0000 (07:40 -0500)]
[Attributor][FIX] Do not ignore memory writes in AAMemoryBehavior

Even if we look for `nocapture` we need to bail on escaping pointers.
The crucial thing is that we might not look at a big enough scope when
we derive the memory behavior. Thus, it might be `nocapture` in a larger
context while it is "captured" in a smaller context.

3 years ago[Attributor][NFX] Pre-commit test case exposing a problem
Johannes Doerfert [Mon, 13 Sep 2021 12:34:51 +0000 (07:34 -0500)]
[Attributor][NFX] Pre-commit test case exposing a problem

The test case is the IR of:

```
  void func(float * restrict a, float *b, int N) {
    N = 199;
    #pragma omp parallel for
    for (int i = 1; i < N; i++)
      a[i] = b[i] + 1.0;
  }
```

3 years ago[Attributor][NFC] Improve debug messages
Johannes Doerfert [Wed, 8 Sep 2021 20:57:18 +0000 (15:57 -0500)]
[Attributor][NFC] Improve debug messages

3 years ago[CMake] Cache the compiler-rt library search results
Petr Hosek [Tue, 29 Sep 2020 00:37:20 +0000 (17:37 -0700)]
[CMake] Cache the compiler-rt library search results

There are a lot of duplicated calls to find various compiler-rt libraries
from the builds of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implemented caching
of results to avoid repeated Clang invocations.

This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.

Differential Revision: https://reviews.llvm.org/D88458

3 years ago[openmp] Fix a git misfire in cf37a94c1e42ce
Jon Chesterfield [Thu, 28 Oct 2021 00:35:16 +0000 (01:35 +0100)]
[openmp] Fix a git misfire in cf37a94c1e42ce

3 years ago[lld-macho] Implement -S
Vincent Lee [Wed, 27 Oct 2021 04:42:25 +0000 (21:42 -0700)]
[lld-macho] Implement -S

There are a couple of internal builds that require the use of this flag.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D112594

3 years agoRevert "[libomptarget] Build DeviceRTL for amdgpu"
Jon Chesterfield [Thu, 28 Oct 2021 00:01:53 +0000 (01:01 +0100)]
Revert "[libomptarget] Build DeviceRTL for amdgpu"
 - more tests failing on CI than failed locally when writing this patch

This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.

3 years ago[openmp] Add amdgpu impl missed from D112153
Jon Chesterfield [Wed, 27 Oct 2021 23:54:29 +0000 (00:54 +0100)]
[openmp] Add amdgpu impl missed from D112153

3 years agoAdd breakpoint resolving stats to each target.
Greg Clayton [Wed, 27 Oct 2021 00:48:42 +0000 (17:48 -0700)]
Add breakpoint resolving stats to each target.

This patch adds breakpoints to each target's statistics so we can track how long it takes to resolve each breakpoint. It also includes the structured data for each breakpoint so the exact breakpoint details are logged to allow for reproduction of slow resolving breakpoints. Each target gets a new "breakpoints" array that contains breakpoint details. Each breakpoint has "details" which is the JSON representation of a serialized breakpoint resolver and filter, "id" which is the breakpoint ID, and "resolveTime" which is the time in seconds it took to resolve the breakpoint. A snippet of the new data is shown here:

  "targets": [
    {
      "breakpoints": [
        {
          "details": {...},
          "id": 1,
          "resolveTime": 0.00039291599999999999
        },
        {
          "details": {...},
          "id": 2,
          "resolveTime": 0.00022679199999999999
        }
      ],
      "totalBreakpointResolveTime": 0.00061970799999999996
    }
  ]

This provides full details on exactly how breakpoints were set and how long it took to resolve them.
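
In the snippet above, "totalBreakpointResolveTime" equals the sum of the
per-breakpoint "resolveTime" values. A minimal sketch of that aggregation,
assuming the total is a plain sum; `BreakpointStat` and `totalResolveTime`
are hypothetical names, not lldb's API:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record mirroring two of the JSON fields shown above.
struct BreakpointStat {
  int Id;
  double ResolveTime; // seconds
};

// Sum the per-breakpoint resolve times for one target.
double totalResolveTime(const std::vector<BreakpointStat> &Stats) {
  double Total = 0.0;
  for (const BreakpointStat &S : Stats)
    Total += S.ResolveTime;
  return Total;
}
```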

Differential Revision: https://reviews.llvm.org/D112587