Stanislav Mekhanoshin [Thu, 28 Oct 2021 19:14:02 +0000 (12:14 -0700)]
[InstCombine] Fixed non-deterministic order of new instructions
Fixes non-deterministic order of XOR instructions created after
5a7a458306cd. The order of call argument evaluation is not
defined, so create one Value before the call.
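The underlying C++ rule can be shown outside of LLVM. The following is a minimal sketch, not the InstCombine code: `Recorder`, `combine`, and both `build*` functions are hypothetical names that only model "creating an instruction" by logging a string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR builder: it only records the order in
// which "instructions" are created.
struct Recorder {
  std::vector<std::string> Created;
  int create(const std::string &Name) {
    Created.push_back(Name);
    return static_cast<int>(Created.size()) - 1;
  }
};

// Stand-in for a call that consumes two freshly created values.
inline int combine(int LHS, int RHS) { return LHS + RHS; }

// Problematic shape: both create() calls are arguments of one call, so the
// compiler may run them in either order and Created may differ by compiler.
inline std::vector<std::string> buildUnspecifiedOrder() {
  Recorder R;
  (void)combine(R.create("not"), R.create("xor"));
  return R.Created;
}

// Fixed shape: one value is created in its own statement before the call,
// so "not" is always recorded before "xor".
inline std::vector<std::string> buildFixedOrder() {
  Recorder R;
  int Not = R.create("not"); // sequenced before the call below
  (void)combine(Not, R.create("xor"));
  assert(R.Created.size() == 2 && "both values were created exactly once");
  return R.Created;
}
```

Only `buildFixedOrder` has a guaranteed result; the content of `buildUnspecifiedOrder` is the same two names in a compiler-dependent order.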
Stanislav Mekhanoshin [Thu, 21 Oct 2021 22:12:12 +0000 (15:12 -0700)]
[InstCombine] Fold `(c & ~(a | b)) | (b & ~(a | c))` to `~a & (b ^ c)`
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
%or1 = or i4 %a, %b
%not1 = xor i4 %or1, 15
%and1 = and i4 %not1, %c
%or2 = or i4 %a, %c
%not2 = xor i4 %or2, 15
%and2 = and i4 %not2, %b
%or3 = or i4 %and1, %and2
ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
%xor = xor i4 %b, %c
%not = xor i4 %a, 15
%or3 = and i4 %xor, %not
ret i4 %or3
}
Transformation seems to be correct!
```
Differential Revision: https://reviews.llvm.org/D112276
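For concrete values the fold can also be checked by brute force over all i4 inputs. This is a small sketch that complements the Alive2 proof above; it does not model undef or poison, and the function names are chosen here for illustration.

```cpp
#include <cassert>

// Left-hand side of the fold, (c & ~(a | b)) | (b & ~(a | c)), on 4-bit
// values (results are masked to 0xF).
inline unsigned src(unsigned A, unsigned B, unsigned C) {
  assert(A < 16 && B < 16 && C < 16 && "operands must fit in i4");
  return ((C & ~(A | B)) | (B & ~(A | C))) & 0xFu;
}

// Right-hand side of the fold: ~a & (b ^ c).
inline unsigned tgt(unsigned A, unsigned B, unsigned C) {
  assert(A < 16 && B < 16 && C < 16 && "operands must fit in i4");
  return (~A & (B ^ C)) & 0xFu;
}

// Counts the i4 inputs on which the two forms disagree.
inline unsigned countMismatches() {
  unsigned Mismatches = 0;
  for (unsigned A = 0; A < 16; ++A)
    for (unsigned B = 0; B < 16; ++B)
      for (unsigned C = 0; C < 16; ++C)
        if (src(A, B, C) != tgt(A, B, C))
          ++Mismatches;
  return Mismatches;
}
```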
Sam Clegg [Thu, 28 Oct 2021 15:15:20 +0000 (08:15 -0700)]
[lld][WebAssembly] Handle duplicate archive member names in ThinLTO
This entire change, including the test case, comes almost verbatim
from the ELF driver.
Fixes: https://github.com/emscripten-core/emscripten/issues/12763
Differential Revision: https://reviews.llvm.org/D112723
Sam Clegg [Thu, 28 Oct 2021 14:33:30 +0000 (07:33 -0700)]
[lld] Rename addCombinedLTOObjects to match ELF driver. NFC
This function was renamed in https://reviews.llvm.org/D62291.
The new name seems more accurate, and it's also good to maintain
some consistency between these methods in the different drivers.
Differential Revision: https://reviews.llvm.org/D112719
Ahmed Bougacha [Thu, 28 Oct 2021 17:20:11 +0000 (10:20 -0700)]
[AArch64] Rename some timm predicates for consistency. NFC.
timm isn't the common case, and TImmLeafs should make it clear what
they are. We're adding a plain ImmLeaf for 0_65535, so rename
i64_imm0_65535 to timm64_0_65535, and imm32_0_7 to timm32_0_7.
Yuanfang Chen [Thu, 28 Oct 2021 18:19:37 +0000 (11:19 -0700)]
[IRSymTab] Mark __stack_chk_guard used
`__stack_chk_guard` is a global variable that has no uses before the LLVM code generation phase (how it is defined is platform-dependent). LTO needs to preserve this symbol for that reason. Currently, legacy LTO API preserves it by hardcoding the logic in Internalizer, but this symbol is not preserved by regular LTO API in thinlink phase. This patch marks `__stack_chk_guard` used during IR symbol table writing since this is how builtin functions are preserved by thinlink by using `RuntimeLibcalls.def`.
Reviewed By: MaskRay, tejohnson
Differential Revision: https://reviews.llvm.org/D112595
Yuanfang Chen [Thu, 28 Oct 2021 18:19:24 +0000 (11:19 -0700)]
[Internalize] Preserve __stack_chk_fail in Internalizer correctly
Move the section collecting `AlwaysPreserved` up before any
`maybeInternalize` is called. Otherwise, functions in `AlwaysPreserved` (in this case, `__stack_chk_fail`)
are not preserved.
Reviewed By: MaskRay, tejohnson
Differential Revision: https://reviews.llvm.org/D112684
Tobias Gysi [Thu, 28 Oct 2021 17:46:35 +0000 (17:46 +0000)]
[mlir][linalg] Remove unused method (NFC).
Remove an unused method in hoist padding and format comment.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112714
Guozhi Wei [Thu, 28 Oct 2021 18:08:36 +0000 (11:08 -0700)]
[TwoAddressInstructionPass] Put all new instructions into DistanceMap
In function convertInstTo3Addr, after converting a two-address instruction into
a three-address instruction, only the last new instruction is inserted into
DistanceMap. This is wrong: DistanceMap should track all instructions from the
beginning of the current MBB to the working instruction. When a two-address
instruction is converted to a three-address instruction, multiple instructions may
be generated (usually an extra COPY), and all of them should be
inserted into DistanceMap.
Similarly, when unfolding a memory operand in function tryInstructionTransform,
DistanceMap is not maintained correctly.
Differential Revision: https://reviews.llvm.org/D111857
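The bookkeeping rule can be sketched with a toy model. Everything below is hypothetical (string-named instructions, a `DistanceTracker` struct, consecutive numbering); how the real pass numbers the new instructions is not stated in the commit message.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model: instructions are named by strings and "distance" is a
// running counter from the start of the block.
struct DistanceTracker {
  std::unordered_map<std::string, unsigned> DistanceMap;
  unsigned Dist = 0;

  // Record an instruction that is visited unchanged.
  void visit(const std::string &MI) { DistanceMap[MI] = ++Dist; }

  // Record the result of expanding one instruction into several: every new
  // instruction gets an entry, not just the last one.
  void visitExpansion(const std::vector<std::string> &NewMIs) {
    assert(!NewMIs.empty() && "an expansion produces at least one instruction");
    for (const std::string &MI : NewMIs)
      DistanceMap[MI] = ++Dist;
  }
};

// Example run: one plain instruction, then a two-address instruction that
// was expanded into an extra COPY plus the three-address form.
inline DistanceTracker exampleRun() {
  DistanceTracker T;
  T.visit("load");
  T.visitExpansion({"copy", "add3"});
  return T;
}
```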
Konstantin Varlamov [Thu, 28 Oct 2021 07:36:19 +0000 (00:36 -0700)]
[libc++] P0433R2: add the remaining deduction guides.
Add deduction guides to `valarray` and `scoped_allocator_adaptor`. This largely
finishes implementation of the paper:
* deduction guides for other classes mentioned in the paper were
implemented previously (see the list below);
* deduction guides for several classes contained in the proposal
(`reference_wrapper`, `lock_guard`, `scoped_lock`, `unique_lock`,
`shared_lock`) were removed by [LWG2981](https://wg21.link/LWG2981).
Also add deduction guides to the synopsis for the few classes (e.g. `pair`)
where they were missing.
The only part of the paper that isn't fully implemented after this patch is
making sure certain deduction guides don't participate in overload resolution
when given incorrect template parameters.
List of significant commits implementing the other parts of P0433 (omitting some
minor fixes):
* [pair](https://github.com/llvm/llvm-project/commit/af65856eec160d163c764faad250d93357be7c83)
* [basic_string](https://github.com/llvm/llvm-project/commit/6d9f750dec29e8ae5366092e64cd343dae2c7464)
* [array](https://github.com/llvm/llvm-project/commit/0ca8c0895c6034615593c295dd955f29b25bf3d4)
* [deque](https://github.com/llvm/llvm-project/commit/dbb6f8a8179b0604e25707b5c1b72be6164f62d9)
* [forward_list](https://github.com/llvm/llvm-project/commit/e076700b7786959206acef136ecf05d54078e4e1)
* [list](https://github.com/llvm/llvm-project/commit/4a227e582b2f13880ea049b29988a37a0f7c0742)
* [vector](https://github.com/llvm/llvm-project/commit/df8f75479278d5ce16eede342ceb5ba2fd71460b)
* [queue/stack/priority_queue](https://github.com/llvm/llvm-project/commit/5b8b8b5dce587f1e5a4a31cc24f09b18bd53ff9a)
* [basic_regex](https://github.com/llvm/llvm-project/commit/edd5e29cfe9f67ec8e7e0eda12eb05e616fdeebc)
* [optional](https://github.com/llvm/llvm-project/commit/f35b4bc3954f3b01051fc0848535ff784809e9e2)
* [map/multimap](https://github.com/llvm/llvm-project/commit/edfe8525de1f7278f4754f2bffd47b13ec291a17)
* [set/multiset](https://github.com/llvm/llvm-project/commit/e20865c387e09ea0ebd5add15c762cd5271ff65f)
* [unordered_set/unordered_multiset](https://github.com/llvm/llvm-project/commit/296a80102a9b72c3eda80558fb78a3ed8849b341)
* [unordered_map/unordered_multimap](https://github.com/llvm/llvm-project/commit/dfcd4384cbcac0eeb7e5cbce350f875ba4da79d5)
* [function](https://github.com/llvm/llvm-project/commit/e1eabcdfad89f67ae575b0c86aa4a72d277378b4)
* [tuple](https://github.com/llvm/llvm-project/commit/1308011e1b5c5382281a63dd4191a1784f8d2295)
* [shared_ptr/weak_ptr](https://github.com/llvm/llvm-project/commit/83564056d4b186c9fcf016cdbb388755009f7b5a)
Additional notes:
* It was revision 2 of the paper that was voted into the Standard.
P0433R3 is a separate paper that is not part of the Standard.
* The paper also mandates removing several `make_*_searcher` functions
(e.g. `make_boyer_moore_searcher`) which are currently not implemented
(except in `experimental/`).
* The `__cpp_lib_deduction_guides` feature test macro from the paper was
accidentally omitted from the Standard.
Differential Revision: https://reviews.llvm.org/D112510
David CARLIER [Thu, 28 Oct 2021 17:48:54 +0000 (18:48 +0100)]
[compiler-rt] fix asan buildbot failure on unit test for darwin
Matthias Braun [Tue, 28 Sep 2021 00:57:22 +0000 (17:57 -0700)]
X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr
This extends `optimizeCompareInstr` to re-use previous comparison
results if the previous comparison was with an immediate that was 1
bigger or smaller. Example:
CMP x, 13
...
CMP x, 12 ; can be removed if we change the SETg
SETg ... ; x > 12 changed to `SETge` (x >= 13) removing CMP
Motivation: This often happens because SelectionDAG canonicalization
tends to add or subtract 1 when optimizing for fallthrough blocks.
Example: for `x > C`, the fallthrough optimization switches the true/false
blocks with `!(x > C)` --> `x <= C`, and canonicalization turns this into
`x < C + 1`.
Differential Revision: https://reviews.llvm.org/D110867
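The arithmetic behind the `SETg` to `SETge` rewrite can be checked in isolation. This is a sketch with made-up helper names, not code from `optimizeCompareInstr`; the overflow precondition is an observation about signed integers added here, not something stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// "x > C" rewritten as "x >= C + 1", which is the condition the flags of
// "CMP x, C + 1" can answer. Only valid when C + 1 does not overflow.
inline bool greaterViaNextImm(int32_t X, int32_t C) {
  assert(C < std::numeric_limits<int32_t>::max() && "C + 1 would overflow");
  return X >= C + 1;
}

// Checks the rewrite against the direct comparison for Lo <= x <= Hi.
inline bool rewriteAgrees(int32_t Lo, int32_t Hi, int32_t C) {
  for (int64_t X = Lo; X <= Hi; ++X)
    if ((X > C) != greaterViaNextImm(static_cast<int32_t>(X), C))
      return false;
  return true;
}
```

With `C = 12` this is exactly the example above: the result of `CMP x, 13` answers `x > 12` when read as `x >= 13`.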
Matthias Braun [Mon, 27 Sep 2021 22:21:15 +0000 (15:21 -0700)]
X86InstrInfo: Optimize more combinations of SUB+CMP
`X86InstrInfo::optimizeCompareInstr` would only optimize a `SUB`
followed by a `CMP` in `isRedundantFlagInstr`. This extends the code to
also look for other combinations like `CMP`+`CMP`, `TEST`+`TEST`, `SUB
x,0`+`TEST`.
- Change `isRedundantFlagInstr` to run `analyzeCompareInstr` on the
candidate instruction and compare the results. This normalizes things
and gives consistent results for various comparisons (`CMP x, y`,
`SUB x, y`) and immediate cases (`TEST x, x`, `SUB x, 0`,
`CMP x, 0`...).
- Turn `isRedundantFlagInstr` into a member function so it can call
`analyzeCompare`.
- We now also check `isRedundantFlagInstr` even if
`IsCmpZero` is true, since we now have cases like `TEST`+`TEST`.
Differential Revision: https://reviews.llvm.org/D110865
Michał Górny [Mon, 25 Oct 2021 20:55:48 +0000 (22:55 +0200)]
[lldb] [Host/ConnectionFileDescriptor] Refactor to improve code reuse
Refactor ConnectionFileDescriptor to improve code reuse for different
types of sockets. Unify method naming.
While at it, remove some (now-)dead code from Socket.
Differential Revision: https://reviews.llvm.org/D112495
Florian Hahn [Thu, 28 Oct 2021 17:01:53 +0000 (18:01 +0100)]
[VPlan] Keep induction recipes in header.
This patch updates recipe creation to ensure all
VPWidenIntOrFpInductionRecipes are in the header block. At the moment,
new induction recipes can be created in different blocks when trying to
optimize casts and induction variables.
Having all induction recipes in the header makes it easier to
analyze/transform them in VPlan.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D111300
Louis Dionne [Thu, 28 Oct 2021 15:35:18 +0000 (11:35 -0400)]
[libc++] Update the CI Docker image to Focal
Differential Revision: https://reviews.llvm.org/D112726
Markus Böck [Thu, 28 Oct 2021 17:08:10 +0000 (19:08 +0200)]
[mlir] Implement replacement of SymbolRefAttrs in Dialect attributes using SubElementAttr interface
This patch extends the SubElementAttr interface to allow replacing a contained sub attribute. The attribute that should be replaced is identified by an index which denotes the n-th element returned by the accompanying walkImmediateSubElements method.
Using this addition the patch implements replacing SymbolRefAttrs contained within any dialect attributes.
Differential Revision: https://reviews.llvm.org/D111357
Nicolai Hähnle [Thu, 28 Oct 2021 05:19:34 +0000 (10:49 +0530)]
MachineDominators: Define MachineDomTree type alias
This is a (very) small move towards making the machine dominators more
aligned with the IR dominators:
* DominatorTree / MachineDomTree is the class holding the dominator tree
* DominatorTreeWrapperPass / MachineDominatorTree is the corresponding
(machine) function pass
This alignment will be used by analyses that are designed as templates
that work with LLVM IR as well as Machine IR.
Reviewed By: critson
Differential Revision: https://reviews.llvm.org/D112690
Victor Huang [Thu, 28 Oct 2021 16:22:15 +0000 (11:22 -0500)]
[PowerPC][NFC] Update builtins-ppc-xlcompat-trap-64bit-only.ll and builtins-ppc-xlcompat-trap.ll to show full reg names
Fangrui Song [Thu, 28 Oct 2021 16:38:45 +0000 (09:38 -0700)]
[ELF] Change common diagnostics to report both object file location and source file location
Many diagnostics use `getErrorPlace` or `getErrorLocation` to report a location.
In the presence of line table debug information, `getErrorPlace` uses a source
file location and ignores the object file location. However, the object file
location is sometimes more useful.
This patch changes "undefined symbol" and "out of range" diagnostics to report
both object/source file locations. Other diagnostics can use similar format if
needed.
The key idea is to let `InputSectionBase::getLocation` report the object file
location and use `getSrcMsg` for source file/line information. `getSrcMsg`
doesn't leverage `STT_FILE` information yet, but I think the temporary lack of
the functionality is ok.
For the ARM "branch and link relocation" diagnostic, I arbitrarily place the
source file location at the end of the line. The diagnostic is not very common
so its formatting doesn't need to be pretty.
Differential Revision: https://reviews.llvm.org/D112518
Kazu Hirata [Thu, 28 Oct 2021 16:38:25 +0000 (09:38 -0700)]
[IR] Fix a warning
This patch fixes:
mlir/lib/IR/BuiltinAttributes.cpp:876:39: error: unused function
'isComplexOfIntType' [-Werror,-Wunused-function]
in a release build.
Philip Reames [Thu, 28 Oct 2021 16:19:26 +0000 (09:19 -0700)]
Regen some autogen tests to account for format change
David CARLIER [Thu, 28 Oct 2021 16:08:23 +0000 (17:08 +0100)]
[compiler-rt] update detect_write_exec option for apple devices.
Reviewed By: yln, vitalybuka
Differential Revision: https://reviews.llvm.org/D111390
Philip Reames [Thu, 28 Oct 2021 15:54:27 +0000 (08:54 -0700)]
Autogen a test for ease of update
Chandler Carruth [Sat, 16 Oct 2021 07:46:19 +0000 (00:46 -0700)]
Add support for Bazel builds on Windows with `clang-cl`.
Adds a basic `--config=clang-cl` to set up the options needed, and
then fixes a number of issues that surface in Windows builds for me.
With these fixes, `//llvm/...` builds cleanly. One unittest still fails,
but it's just due to running out of stack space from creating a large
number of short-lived stack variables. The test should probably be
decomposed into a set of tests (`LegalizerInfoTest::RuleSets`), but that
seemed like too invasive a change here, and with everything building
cleanly this isn't disrupting my experiments with Windows builds.
Some parts of `//clang/...` builds, but that will require more work.
Aart Bik [Thu, 28 Oct 2021 03:35:52 +0000 (20:35 -0700)]
[mlir][sparse] move conversion test back to original CHECK testing
Rationale:
The silent exit(1) gives few clues about where the error occurs on failure
and may even be confusing at first. The CHECK testing of all computed values
and indices may be a little bit more elaborate, but it directly pinpoints
where errors happen if they occur. This style is also consistent with
the other tests, which I actually prefer.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D112688
Siva Chandra Reddy [Wed, 27 Oct 2021 18:49:00 +0000 (18:49 +0000)]
[libc][NFC] Move utils/CPP to src/__support/CPP.
The idea is to move all pieces related to the actual libc sources to the
"src" directory. This allows downstream users to ship and build just the
"src" directory.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D112653
Kadir Cetinkaya [Thu, 28 Oct 2021 12:56:53 +0000 (14:56 +0200)]
[clangd] SelectionTree should prefer lexical declcontext
This is important especially for code that tries to traverse scopes as
written in code, which is the contract SelectionTree tries to satisfy.
Differential Revision: https://reviews.llvm.org/D112712
Mark de Wever [Sat, 23 Oct 2021 11:08:01 +0000 (13:08 +0200)]
[libc++][ci] Update to Clang 13.
Per our support plan we should now support Clang 12 and 13. Adjust the
documentation and the CI runners. The change indirectly moves the main
CI runners to use the Clang 14 nightly builds.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D112360
Leonard Grey [Thu, 28 Oct 2021 15:27:01 +0000 (11:27 -0400)]
[CGProfile] Don't emit call graph profile edges with zero weight
With D112160 and D112164, on a Chrome Mac build this reduces the total
size of CGProfile sections by 78% (around 25% eliminated entirely) and
total size of object files by 0.14%.
Differential Revision: https://reviews.llvm.org/D112655
Mike Rice [Thu, 28 Oct 2021 15:10:40 +0000 (08:10 -0700)]
[OpenMP] Initial parsing/sema for the 'omp loop' construct
Adds basic parsing/sema/serialization support for the #pragma omp loop
directive.
Differential Revision: https://reviews.llvm.org/D112499
Daniel Kiss [Thu, 28 Oct 2021 15:24:53 +0000 (17:24 +0200)]
Revert "Reland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume.""
This reverts commit b6420e575f3bbb6b6df848c0284d6b60eeb07350.
Daniel Kiss [Wed, 27 Oct 2021 08:32:11 +0000 (10:32 +0200)]
Reland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume."
This relands commit da1d1a08694bbfe0ea7a23ea094612436e8a2dd0.
This patch additionally addresses failures found in buildbots & post review comments.
ARM EHABI[1] specifies that __cxa_end_cleanup is to be called after cleanup;
it in turn calls UnwindResume.
__cxa_begin_cleanup is called from libcxxabi while __cxa_end_cleanup is never called.
This triggers a termination when a foreign exception is processed while UnwindResume is called,
because the global state is wrong due to the missing __cxa_end_cleanup call.
Additional test here: D109856
[1] https://github.com/ARM-software/abi-aa/blob/main/ehabi32/ehabi32.rst#941compiler-helper-functions
Reviewed By: logan
Differential Revision: https://reviews.llvm.org/D111703
David Green [Thu, 28 Oct 2021 14:46:58 +0000 (15:46 +0100)]
[InstCombine] Extend canonicalizeClampLike to handle truncated inputs
This extends the canonicalizeClampLike function to allow cases where the
input is truncated, but still matching on the types of the ICmps. For
example
%t = trunc i32 %X to i8
%a = add i32 %X, 128
%cmp = icmp ult i32 %a, 256
%c = icmp sgt i32 %X, -1
%f = select i1 %c, i8 High, i8 Low
%r = select i1 %cmp, i8 %t, i8 %f
becomes
%c1 = icmp slt i32 %X, -128
%c2 = icmp sge i32 %X, 128
%s1 = select i1 %c1, i32 sext(Low), i32 %X
%s2 = select i1 %c2, i32 sext(High), i32 %s1
%t = trunc i32 %s2 to i8
https://alive2.llvm.org/ce/z/vPzfxH
We limit the transform to constant High and Low values, where we know
the sext are free.
Differential Revision: https://reviews.llvm.org/D108049
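The before and after forms from the example can be compared on concrete values. This is a sketch with hypothetical function names; it mirrors the IR line by line, does not model undef or poison, and is no substitute for the Alive2 link above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Original form: the select chain works on i8 and the input is truncated.
inline int8_t clampSrc(int32_t X, int8_t High, int8_t Low) {
  // trunc i32 -> i8 (modular on mainstream compilers, guaranteed in C++20).
  int8_t T = static_cast<int8_t>(X);
  uint32_t A = static_cast<uint32_t>(X) + 128u; // wrapping add, as in IR
  bool Cmp = A < 256u;                          // icmp ult
  bool C = X > -1;                              // icmp sgt
  int8_t F = C ? High : Low;
  return Cmp ? T : F;
}

// Canonical form: clamp on i32 with sign-extended constants, then truncate.
inline int8_t clampTgt(int32_t X, int8_t High, int8_t Low) {
  bool C1 = X < -128;
  bool C2 = X >= 128;
  int32_t S1 = C1 ? static_cast<int32_t>(Low) : X;
  int32_t S2 = C2 ? static_cast<int32_t>(High) : S1;
  return static_cast<int8_t>(S2); // S2 always fits in i8 here
}

// Compares both forms for every X with Lo <= X <= Hi.
inline bool clampFormsAgree(int32_t Lo, int32_t Hi, int8_t High, int8_t Low) {
  assert(Lo <= Hi && "empty range");
  for (int64_t X = Lo; X <= Hi; ++X) {
    int32_t V = static_cast<int32_t>(X);
    if (clampSrc(V, High, Low) != clampTgt(V, High, Low))
      return false;
  }
  return true;
}
```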
Louis Dionne [Thu, 28 Oct 2021 14:44:48 +0000 (10:44 -0400)]
[docs][NFC] Strip trailing whitespace from GettingStarted.rst
Nico Weber [Thu, 28 Oct 2021 14:41:18 +0000 (10:41 -0400)]
Revert "[clang] Fortify warning for scanf calls with field width too big."
This reverts commit 15e3d39110fa4449be4f56196af3bc81b623f3ab.
See https://reviews.llvm.org/D111833#3093629
Raphael Isemann [Thu, 28 Oct 2021 14:39:06 +0000 (16:39 +0200)]
[lldb][NFC] Improve CppModuleConfiguration documentation a bit
Sam Clegg [Thu, 28 Oct 2021 14:29:43 +0000 (07:29 -0700)]
[lld][ELF] Update name of function in comment. NFC
This function was renamed in https://reviews.llvm.org/D62291.
Dawid Jurczak [Fri, 22 Oct 2021 12:11:12 +0000 (14:11 +0200)]
[DSE] Eliminates redundant store of an existing value (PR16520)
This is a follow-up to https://reviews.llvm.org/D90328.
This change eliminates writes to variables where the value being written is already stored in the variable.
It does so by looping through all memory definitions in the current state and getting the defining access of each.
When the defining access is a write identical to the original instruction, the redundant write is removed.
For example:
void f() {
x = 1;
if (foo()) {
x = 1;
g();
} else {
h();
}
}
void g();
void h();
The second `x = 1` will be eliminated since it rewrites 1 to x. The pass will produce this:
void f() {
x = 1;
if (foo()) {
g();
} else {
h();
}
}
void g();
void h();
Differential Revision: https://reviews.llvm.org/D111727
David Green [Thu, 28 Oct 2021 14:03:07 +0000 (15:03 +0100)]
[InstCombine] Fix rare condition violation in canonicalizeClampLike
With a "ult x, 0", the fold in canonicalizeClampLike does not validate
with undef inputs. This condition will usually have been simplified
away, but we should ensure the code is correct just in case.
https://alive2.llvm.org/ce/z/S8HQ6H vs https://alive2.llvm.org/ce/z/h2XBJ_
See: https://reviews.llvm.org/D108049
Lei Zhang [Thu, 28 Oct 2021 13:45:07 +0000 (09:45 -0400)]
[mlir][linalg] Fix FoldConstantTranspose execution inefficiency
* Move SmallVectors outside of inner loops to avoid frequent
allocations and deallocations
* Calculate linearized index and call flat range getters to
avoid internal shape querying behind `getValue`.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D112099
Simon Pilgrim [Thu, 28 Oct 2021 13:07:17 +0000 (14:07 +0100)]
[X86][AVX] Attempt to fold a scaled index into a gather/scatter scale immediate (PR13310)
If the index operand for a gather/scatter intrinsic is being scaled (self-addition or a shl-by-immediate) then we may be able to fold that scaling into the intrinsic scale immediate value instead.
Fixes PR13310.
Differential Revision: https://reviews.llvm.org/D108539
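The idea rests on the identity (index << k) * scale == index * (scale << k), ignoring wrap-around in the index element type. The sketch below uses hypothetical helper names and only encodes the x86 restriction that the scale must be 1, 2, 4 or 8; the exact conditions used by the patch are not given in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Effective byte offset of one gather/scatter lane: index * scale.
inline uint64_t laneOffset(uint64_t Index, unsigned Scale) {
  return Index * Scale;
}

// Tries to move a left shift of the index by ShAmt into the scale.
// Returns the new scale, or 0 if the result would not be an encodable
// x86 scale (1, 2, 4 or 8).
inline unsigned foldShiftIntoScale(unsigned Scale, unsigned ShAmt) {
  assert((Scale == 1 || Scale == 2 || Scale == 4 || Scale == 8) &&
         "x86 scale must be 1, 2, 4 or 8");
  if (ShAmt > 3)
    return 0;
  unsigned NewScale = Scale << ShAmt;
  return NewScale <= 8 ? NewScale : 0;
}
```

A self-addition of the index (`x + x`) corresponds to `ShAmt = 1` in this model.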
Kadir Cetinkaya [Thu, 28 Oct 2021 12:47:57 +0000 (14:47 +0200)]
[clangd] Escape error message in AddUsing
Fixes https://github.com/clangd/clangd/issues/900
Nico Weber [Thu, 28 Oct 2021 12:48:54 +0000 (08:48 -0400)]
[gn build] (manually) port d736002e90b5
Alexey Bataev [Mon, 25 Oct 2021 14:32:35 +0000 (07:32 -0700)]
[SLP]Improve/fix reordering of the gathered graph nodes.
Gathered loads/extractelements/extractvalue instructions should be
checked if they can represent a vector reordering node too and their
order should be taken into account for better graph reordering analysis.
Also, if the gather node has reused scalars, they must be reordered
instead of the scalars themselves.
Differential Revision: https://reviews.llvm.org/D112454
Hans Wennborg [Thu, 28 Oct 2021 11:37:27 +0000 (13:37 +0200)]
Re-instate -Wweak-template-vtables as a no-op flag
Follow-up to 8c136805242014b6ad9ff1afcac9d7f4a18bec3f to allow a less
abrupt migration for users.
Differential revision: https://reviews.llvm.org/D112704
Uday Bondhugula [Wed, 20 Oct 2021 09:44:54 +0000 (15:14 +0530)]
[MLIR][LLVM] Add llvm.mlir.global_ctors/dtors and translation support
Add llvm.mlir.global_ctors and global_dtors ops and their translation
support to LLVM global_ctors/global_dtors global variables.
Differential Revision: https://reviews.llvm.org/D112524
Pavel Labath [Thu, 28 Oct 2021 09:23:24 +0000 (11:23 +0200)]
[lldb/test] Allow indentation in inline tests
This makes it possible to use for loops (and other language constructs)
in inline tests.
Differential Revision: https://reviews.llvm.org/D112706
Sanjay Patel [Thu, 28 Oct 2021 12:11:59 +0000 (08:11 -0400)]
[InstCombine] allow Negator to fold multi-use select with constant arms
The motivating test is reduced from:
https://llvm.org/PR52261
Note that the more general problem of folding any binop into a multi-use
select of constants is still there. We need to ease the restriction in
InstCombinerImpl::FoldOpIntoSelect() to catch those. But these examples
never reach that code because Negator exclusively handles negation
patterns within visitSub().
Differential Revision: https://reviews.llvm.org/D112657
Peter Waller [Thu, 28 Oct 2021 12:14:52 +0000 (12:14 +0000)]
[InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware
The newly added test previously caused the compiler to fail an
assertion. It looks like a straightforward TypeSize upgrade.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D112142
David Green [Thu, 28 Oct 2021 11:58:13 +0000 (12:58 +0100)]
[InstSimplify] Add tests for the range of a half float. NFC
Konstantin Schwarz [Sun, 10 Oct 2021 09:17:07 +0000 (11:17 +0200)]
[GlobalISel][Tablegen] Fix SameOperandMatcher's isIdentical check
During rule optimization, identical SameOperandMatchers are hoisted into a common group,
however previously only one operand index was considered.
Commutable patterns can introduce SameOperandMatcher checks where the second index is commuted,
resulting in a different check that cannot be hoisted.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D111506
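The shape of the fix can be sketched as an equality check that looks at both operand indices. The struct and field names below are hypothetical and do not reflect the actual TableGen matcher classes.

```cpp
#include <cassert>

// Hypothetical model of a "these two operands must be the same" check.
struct SameOperandCheck {
  unsigned OpIdx;      // operand being checked
  unsigned OtherOpIdx; // operand it must be equal to
};

// Two checks may be hoisted into a common group only if both indices match.
// Comparing OpIdx alone would wrongly merge a commuted variant.
inline bool isIdentical(const SameOperandCheck &A, const SameOperandCheck &B) {
  return A.OpIdx == B.OpIdx && A.OtherOpIdx == B.OtherOpIdx;
}
```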
Jon Chesterfield [Thu, 28 Oct 2021 11:33:25 +0000 (12:33 +0100)]
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
CI shows a different set of passes/failures than local runs, so this is committed
without the tests enabled by default while that difference is debugged.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112227
Dmitry Vyukov [Wed, 27 Oct 2021 14:00:23 +0000 (16:00 +0200)]
tsan: move memory access functions to a separate file
tsan_rtl.cpp is huge and does lots of things.
Move everything related to memory access and tracing
to a separate tsan_rtl_access.cpp file.
No functional changes, only code movement.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D112625
Abinav Puthan Purayil [Thu, 28 Oct 2021 10:38:59 +0000 (16:08 +0530)]
[AMDGPU] Add 24-bit mulhi intrinsics in INTRINSIC_WO_CHAIN combine.
mul24 intrinsic's operands are simplified by
AMDGPUTargetLowering::performIntrinsicWOChainCombine(). This change adds
the mul24hi intrinsics in the combine since its operands can be
simplified like that of the mul24 intrinsics.
Differential Revision: https://reviews.llvm.org/D112702
Abinav Puthan Purayil [Thu, 28 Oct 2021 01:33:48 +0000 (07:03 +0530)]
[AMDGPU] Fix rhs of the tests in amdgpu-codegenprepare-mul24.ll.
Differential Revision: https://reviews.llvm.org/D112685
Guillaume Chatelet [Mon, 11 Oct 2021 15:26:43 +0000 (15:26 +0000)]
[libc] automemcpy
Emil Kieri [Mon, 25 Oct 2021 19:43:17 +0000 (21:43 +0200)]
[flang] Checks for pointers to intrinsic functions
Check that when a procedure pointer is initialised or assigned with an intrinsic
function, or when its interface is being defined by one, that intrinsic function
is unrestricted specific (listed in Table 16.2 of F'2018).
Mark intrinsics LGE, LGT, LLE, and LLT as restricted specific. Getting their
classifications right helps in designing the tests.
Differential Revision: https://reviews.llvm.org/D112381
Kirill Bobyrev [Thu, 28 Oct 2021 10:25:12 +0000 (12:25 +0200)]
[clangd] NFC: Use more idiomatic way of checking for definition
Kirill Bobyrev [Thu, 28 Oct 2021 10:11:31 +0000 (12:11 +0200)]
[clangd] NFC: Match function signature in the header and source file
OCHyams [Thu, 28 Oct 2021 09:17:26 +0000 (10:17 +0100)]
[dexter] XFAIL feature_test source-root-dir.cpp
Test is failing for unknown reasons and needs investigating.
Jay Foad [Thu, 28 Oct 2021 08:39:19 +0000 (09:39 +0100)]
[AMDGPU] Add gfx10 uaddsat test coverage. NFC.
Max Kazantsev [Thu, 28 Oct 2021 09:18:30 +0000 (16:18 +0700)]
[Test] Regenerate some of llc test checks using auto updater
Balazs Benics [Thu, 28 Oct 2021 09:03:02 +0000 (11:03 +0200)]
[analyzer] sprintf is a taint propagator not a source
Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.
This patch fixes that typo and demonstrates the correct behavior with
tests.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D112558
Shraiysh Vaishay [Thu, 28 Oct 2021 05:34:40 +0000 (11:04 +0530)]
[MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fix order clause
This patch adds the inclusive clause (which was missed in previous
reorganization - https://reviews.llvm.org/D110903) in omp.wsloop operation.
Added a test for validating it.
Also fixes the order clause, which was not accepting any values. It now accepts
"concurrent" as a value, as specified in the standard.
Reviewed By: kiranchandramohan, peixin, clementval
Differential Revision: https://reviews.llvm.org/D112198
Sebastian Neubauer [Thu, 28 Oct 2021 08:29:06 +0000 (10:29 +0200)]
[AMDGPU][GlobalISel] Fix waterfall loops
- Move the `s_and exec` to its correct position before the content of
the waterfall loop
- Use the SI_WATERFALL pseudo instruction, like for sdag, to benefit
from optimizations
- Add support for indirect function calls
To support indirect calls, add a G_SI_CALL instruction without register
class restrictions and insert a waterfall loop when applying register
banks.
Differential Revision: https://reviews.llvm.org/D109052
Neubauer, Sebastian [Mon, 25 Oct 2021 14:11:42 +0000 (16:11 +0200)]
[GlobalISel] Simplify RegBankSelect
Save the instruction list of a block before selecting banks.
This makes it possible to cope with moved instructions, even if they are reordered
or split into multiple basic blocks.
Differential Revision: https://reviews.llvm.org/D111223
Pavel Labath [Fri, 22 Oct 2021 17:53:43 +0000 (19:53 +0200)]
[lldb] Remove ConstString from Process, ScriptInterpreter and StructuredData plugin names
Max Kazantsev [Thu, 28 Oct 2021 08:13:09 +0000 (15:13 +0700)]
[Test] Regenerate checks using auto-update script
Caroline Concatto [Fri, 22 Oct 2021 08:22:14 +0000 (09:22 +0100)]
[Driver][AArch64]Add driver support for neoverse-512tvb target
The support for neoverse-512tvb mirrors the same option available in GCC[1].
There is no functional effect for this option yet.
This patch ensures the driver accepts "-mcpu=neoverse-512tvb", and enough
plumbing is in place to allow the new option to be used in the future.
[1]https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
Differential Revision: https://reviews.llvm.org/D112406
Michał Górny [Wed, 27 Oct 2021 15:52:45 +0000 (17:52 +0200)]
[lldb] [Host/Socket] Make DecodeHostAndPort() return a dedicated struct
Differential Revision: https://reviews.llvm.org/D112629
Diana Picus [Tue, 12 Oct 2021 12:40:48 +0000 (12:40 +0000)]
[flang] runtime: Read environment variables directly
Add support for reading environment variables directly, via std::getenv.
This needs to allocate a C-style string to pass into std::getenv. If the
memory allocation for that fails, we terminate.
This also changes the interface for EnvVariableLength to receive the
source file and line so we can crash gracefully.
Note that we are now completely ignoring the envp pointer passed into
ProgramStart, since that could go stale if the environment is modified
during execution.
Differential Revision: https://reviews.llvm.org/D111785
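The need for a C-style copy comes from std::getenv requiring a NUL-terminated name, while a name passed as pointer plus length need not be terminated. The following is a sketch only: `envVariableLength` is a hypothetical helper, not the runtime's EnvVariableLength interface, and it uses std::string for the copy instead of the runtime's own allocation and termination logic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Returns the length of the value of the environment variable whose name is
// given as (pointer, length), or 0 if the variable is not set.
inline std::size_t envVariableLength(const char *Name, std::size_t NameLength) {
  assert(Name != nullptr && "name must point to NameLength characters");
  // Copy into a NUL-terminated buffer, since std::getenv expects a C string.
  std::string Terminated(Name, NameLength);
  // Look the variable up at call time rather than in a saved envp array.
  const char *Value = std::getenv(Terminated.c_str());
  return Value ? std::strlen(Value) : 0;
}
```

Because the lookup goes through std::getenv on every call, it sees changes made to the environment during execution, which is the reason the commit gives for ignoring envp.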
Martin Storsjö [Mon, 4 Oct 2021 12:10:52 +0000 (15:10 +0300)]
[Support] [Windows] Manually clean up temp files if not setting delete disposition
Since D81803 / 79657e2339b58bc01fe1b85a448bb073d57d90bb, temp files
created on network shares don't set "Disposition.DeleteFile = true".
This flag normally takes care of removing the temp file both if the
process exits abnormally (either crashing or killed externally), and
when the file is closed cleanly.
For network shares, we voluntarily choose to not set the flag, and
if the operation to inspect the file handle (as a prerequisite to
setting the flag since 79657e2339b58bc01fe1b85a448bb073d57d90bb)
fails we also error out. In both of these cases, we can at least make
sure to remove the temp files when they are closed cleanly.
Adjust the semantics of "OF_Delete" to not set the delete
disposition, but only set the access mode for allowing deletion.
Move the call to setDeleteDisposition into TempFile::create,
where we can check if it failed, and if it did, set a flag noting
that the file should be removed manually at the end.
This does leak files on crash, but at least doesn't leak files
in regular successful runs. (Technically, the alternative codepath
could use the RemoveFileOnSignal function, but that might complicate
the TempFile implementation further.)
This fixes https://github.com/mstorsjo/llvm-mingw/issues/233 and
https://bugs.llvm.org/show_bug.cgi?id=52080.
Differential Revision: https://reviews.llvm.org/D111875
Martin Storsjö [Sat, 16 Oct 2021 14:20:16 +0000 (17:20 +0300)]
[clang] [MinGW] Rename the 'Arch' member to 'SubdirName'. NFC.
This string isn't a plain architecture name, but contains the whole
subdir name used for the sysroot, which often is equal to the target
triple.
Differential Revision: https://reviews.llvm.org/D112387
YunQiang Su [Wed, 27 Oct 2021 13:16:42 +0000 (16:16 +0300)]
[clang][MIPS] Fix search path for Debian multilib O32
In a multilib setup, the gcc objects are in a /32 directory. On
Debian, the libraries are under /libo32 to avoid conflicts. This patch
enables clang to find gcc in /32 and the C library in /libo32.
Differential Revision: https://reviews.llvm.org/D112158
Sam McCall [Wed, 27 Oct 2021 19:13:32 +0000 (21:13 +0200)]
[clangd] Avoid expensive checks of buffer names in IncludeCleaner
This changes the handling of special buffers (<command-line> etc) that
SourceManager treats as files but FileManager does not.
We now include them in findReferencedFiles() and drop them as part of
translateToHeaderIDs(). This pairs more naturally with the data representations
we're using, and so avoids a bunch of converting between representations for
filtering.
Differential Revision: https://reviews.llvm.org/D112652
Hongtao Yu [Wed, 27 Oct 2021 23:56:06 +0000 (16:56 -0700)]
[CSSPGO] Trim cold base profiles for the CS preinliner.
Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified.
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D112489
Hsiangkai Wang [Wed, 22 Sep 2021 23:48:46 +0000 (07:48 +0800)]
[RISCV] Sync Zvlsseg register order as the same as vector registers.
Sync the order of Zvlsseg registers with vector registers to avoid
unnecessary register copies between vector instructions and zvlsseg
instructions.
Differential Revision: https://reviews.llvm.org/D110250
Greg Clayton [Thu, 28 Oct 2021 01:33:17 +0000 (18:33 -0700)]
Add unix signal hit counts to the target statistics.
Android and other platforms make wide use of signals when running applications and this can slow down debug sessions. Tracking this statistic can help us to determine why a debug session is slow.
The new data appears inside each target object and reports the signal hit counts:
  "signals": [
    {
      "SIGSTOP": 1
    },
    {
      "SIGUSR1": 1
    }
  ],
Differential Revision: https://reviews.llvm.org/D112683
thomasraoux [Mon, 25 Oct 2021 19:42:36 +0000 (12:42 -0700)]
[mlir][GPUtoNVVM] Relax restriction on wmma op lowering
Allow lowering of wmma ops with 64-bit indexes. Change the default
version of the test to use default layout.
Differential Revision: https://reviews.llvm.org/D112479
Kazu Hirata [Thu, 28 Oct 2021 04:24:02 +0000 (21:24 -0700)]
[AMDGPU] Remove unused declaration findNumUsedRegistersSI (NFC)
Max Kazantsev [Thu, 28 Oct 2021 04:22:34 +0000 (11:22 +0700)]
[Test] Add test showing missing simplifycfg opportunity for Phi with undef inputs
Phoebe Wang [Thu, 28 Oct 2021 02:56:04 +0000 (10:56 +0800)]
[X86] Add a dependency breaking xor before any gathers with an undef passthru value.
In the instruction encoding, the passthru register is always
tied to the destination register. The CPU scheduler has to wait
for the last writer of this register to finish executing before
the gather can start. This is true even if the initial mask is
all ones so that the passthru will never be used.
By explicitly zeroing the register we can break the false
dependency. The zero idiom is executed completely by the
register renamer and so is immediately considered ready.
Authored by Craig.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D112505
Hsiangkai Wang [Fri, 2 Jul 2021 01:25:50 +0000 (09:25 +0800)]
[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.
If we know the source operand of COPY is defined by a tail-agnostic
vector instruction with the same LMUL, and there is no vsetvli between
the COPY and the defining instruction that changes vl and vtype, we can use
vmv.v.v or vmv.v.i to copy vector registers and get better performance than
with the whole vector register move instructions.
If the source of COPY is from vmv.v.i, we could use vmv.v.i for the
COPY.
This patch only considers all these instructions within one basic block.
Case 1:
```
bb.0:
...
VSETVLI # The first VSETVLI before COPY and VOP.
... # Use this VSETVLI to check LMUL and tail agnostic.
...
vy = VOP va, vb # Define vy.
... # There is no vsetvli between VOP and COPY.
vx = COPY vy
```
Case 2:
```
bb.0:
...
VSETVLI # The first VSETVLI before VOP.
... # Use this VSETVLI to check LMUL and tail agnostic.
...
vy = VOP va, vb # Define vy.
... # There is no vsetvli to change vl between VOP and COPY.
...
VSETVLI # The first VSETVLI before COPY.
... # This VSETVLI does not change vl and vtype.
...
vx = COPY vy
```
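The two cases above can be modelled with a toy check like the one below. This is only a sketch under simplifying assumptions, not the logic of the patch: `ToyInst`, `Kind`, `makeVSetVLI`, `makeInst` and `canUseVmvVV` are hypothetical names, and the vl/vtype state is reduced to three fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; this is not LLVM's MachineInstr.
enum class Kind { VSetVLI, VectorOp, Copy, Other };

struct ToyInst {
  Kind K = Kind::Other;
  // The fields below are only meaningful for VSetVLI.
  int VL = 0;
  int LMul = 1;
  bool TailAgnostic = false;
};

inline ToyInst makeInst(Kind K) {
  ToyInst I;
  I.K = K;
  return I;
}

inline ToyInst makeVSetVLI(int VL, int LMul, bool TailAgnostic) {
  ToyInst I = makeInst(Kind::VSetVLI);
  I.VL = VL;
  I.LMul = LMul;
  I.TailAgnostic = TailAgnostic;
  return I;
}

inline bool sameConfig(const ToyInst &A, const ToyInst &B) {
  return A.VL == B.VL && A.LMul == B.LMul && A.TailAgnostic == B.TailAgnostic;
}

// Returns true if the COPY at CopyIdx may become a vmv.v.v, given that its
// source is defined by the vector op at DefIdx in the same block.
inline bool canUseVmvVV(const std::vector<ToyInst> &Block, std::size_t DefIdx,
                        std::size_t CopyIdx) {
  assert(DefIdx < CopyIdx && CopyIdx < Block.size());
  assert(Block[DefIdx].K == Kind::VectorOp && Block[CopyIdx].K == Kind::Copy);

  // Find the nearest VSETVLI before the defining instruction.
  const ToyInst *Governing = nullptr;
  for (std::size_t I = DefIdx; I-- > 0;) {
    if (Block[I].K == Kind::VSetVLI) {
      Governing = &Block[I];
      break;
    }
  }
  if (!Governing || !Governing->TailAgnostic)
    return false;

  // Any VSETVLI between the definition and the COPY must leave the
  // configuration unchanged (Case 2); having none at all is Case 1.
  for (std::size_t I = DefIdx + 1; I < CopyIdx; ++I)
    if (Block[I].K == Kind::VSetVLI && !sameConfig(Block[I], *Governing))
      return false;
  return true;
}
```

In this model, a block with an intervening VSETVLI that changes VL is rejected, while one that repeats the same configuration is accepted.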
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>
Differential Revision: https://reviews.llvm.org/D103510
Michael Benfield [Thu, 14 Oct 2021 20:02:28 +0000 (20:02 +0000)]
[clang] Fortify warning for scanf calls with field width too big.
Differential Revision: https://reviews.llvm.org/D111833
Abinav Puthan Purayil [Tue, 26 Oct 2021 16:08:21 +0000 (21:38 +0530)]
[AMDGPU] Add more llc tests for 48-bit mul generation.
Differential Revision: https://reviews.llvm.org/D112554
Max Kazantsev [Thu, 28 Oct 2021 02:08:48 +0000 (09:08 +0700)]
[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption
Following the discussion in D110390, it seems that we are suffering from an inability
to traverse users of a SCEV being invalidated. The result of that is that ScalarEvolution's
inner caches may store obsolete data about SCEVs even if their operands are
forgotten. It creates problems when we try to verify the contents of those caches.
Messing with the cache also frequently causes very sneaky and
hard-to-analyze memory corruption bugs when dealing with cached
data. They are lurking there because ScalarEvolution's verification is not powerful
enough and misses many problematic cases. I plan to make SCEV's verification
much stricter in follow-ups, and this requires dangling-pointers-free caches.
This patch makes sure that, whenever we forget cached information for a SCEV,
we also forget it for all SCEVs that (transitively) use it.
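As a rough illustration of the transitive invalidation described here, the following toy cache forgets an expression and everything that uses it via a worklist. It is a sketch only: `ToyCache`, `ExprId` and `forgetTransitively` are hypothetical names and do not reflect ScalarEvolution's actual data structures.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Toy stand-in for a SCEV expression: just an integer id.
using ExprId = int;

struct ToyCache {
  // Users[E] lists the expressions that have E as a direct operand.
  std::unordered_map<ExprId, std::vector<ExprId>> Users;
  // Memoized results keyed by expression.
  std::unordered_map<ExprId, int> Memo;

  // Forget the cached entry for Root and for every expression that
  // (transitively) uses it, walking the user edges with a worklist.
  void forgetTransitively(ExprId Root) {
    std::vector<ExprId> Worklist{Root};
    std::unordered_set<ExprId> Visited{Root};
    while (!Worklist.empty()) {
      ExprId Cur = Worklist.back();
      Worklist.pop_back();
      Memo.erase(Cur);
      auto It = Users.find(Cur);
      if (It == Users.end())
        continue;
      for (ExprId U : It->second)
        if (Visited.insert(U).second) // enqueue each user only once
          Worklist.push_back(U);
    }
    assert(Memo.count(Root) == 0 && "root entry must be gone");
  }
};
```

The visited set keeps the walk linear in the number of reachable users, which is where the extra compile time mentioned below would come from.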
This may have negative compile time impact. It's a sacrifice we are more
than willing to make to enforce correctness. We can also save some time by
reworking invokers of forgetMemoizedResults (maybe we can forget multiple
SCEVs with single query).
Differential Revision: https://reviews.llvm.org/D111533
Reviewed By: reames
Craig Topper [Thu, 28 Oct 2021 02:19:03 +0000 (19:19 -0700)]
[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI
Add new hasVInstructions() which is currently equivalent.
Replace vector uses of hasStdExtZfh/F/D with new vector specific
versions. The vector spec no longer requires that the vectors implement the
same types as scalar. It only requires that the scalar type is
the maximum size the vectors can support. This is currently
implemented using the scalar rule we were using before.
Add new hasVInstructionsI64() and begin using it to qualify code that
requires i64 vector elements.
This is all NFC for now, but we can start using this to better
implement D112408 which introduces the Zve extensions.
Reviewed By: frasercrmck, eopXD
Differential Revision: https://reviews.llvm.org/D112496
Florian Mayer [Fri, 22 Oct 2021 00:23:45 +0000 (01:23 +0100)]
[hwasan] print exact mismatch offset for short granules.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104463
Kai Luo [Thu, 28 Oct 2021 02:18:16 +0000 (02:18 +0000)]
[clang][compiler-rt][atomics] Add `__c11_atomic_fetch_nand` builtin and support `__atomic_fetch_nand` libcall
Add `__c11_atomic_fetch_nand` builtin to language extensions and support `__atomic_fetch_nand` libcall in compiler-rt.
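For readers unfamiliar with the operation, a fetch-nand atomically replaces the stored value with `~(old & val)` and returns the old value (this is how GCC documents `__atomic_fetch_nand`). The sketch below expresses that with a standard compare-exchange loop. `fetchNand` is a hypothetical helper written for illustration, not the compiler-rt libcall or the Clang builtin.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically stores ~(old & Val) and returns old.
inline std::uint32_t
fetchNand(std::atomic<std::uint32_t> &Obj, std::uint32_t Val,
          std::memory_order Order = std::memory_order_seq_cst) {
  std::uint32_t Old = Obj.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak reloads Old, so each retry
  // recomputes the new value from the freshly observed contents.
  while (!Obj.compare_exchange_weak(Old, ~(Old & Val), Order,
                                    std::memory_order_relaxed)) {
  }
  return Old;
}

// Usage check: 0xF0 & 0x3C is 0x30, so the object ends up holding ~0x30.
inline void fetchNandExample() {
  std::atomic<std::uint32_t> X{0xF0u};
  std::uint32_t Old = fetchNand(X, 0x3Cu);
  assert(Old == 0xF0u);
  assert(X.load() == 0xFFFFFFCFu);
}
```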
Reviewed By: theraven
Differential Revision: https://reviews.llvm.org/D112400
Johannes Doerfert [Thu, 28 Oct 2021 00:39:28 +0000 (19:39 -0500)]
[OpenMP] Declare variants for templates need to match # template args
A declare variant template is only compatible with a base when the
number of template arguments is equal, otherwise our instantiations will
produce nonsensical results.
Exposed as part of D109344.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D109770
Johannes Doerfert [Mon, 13 Sep 2021 12:40:41 +0000 (07:40 -0500)]
[Attributor][FIX] Do not ignore memory writes in AAMemoryBehavior
Even if we look for `nocapture` we need to bail on escaping pointers.
The crucial thing is that we might not look at a big enough scope when
we derive the memory behavior. Thus, it might be `nocapture` in a larger
context while it is "captured" in a smaller context.
Johannes Doerfert [Mon, 13 Sep 2021 12:34:51 +0000 (07:34 -0500)]
[Attributor][NFX] Pre-commit test case exposing a problem
The test case is the IR of:
```
void func(float * restrict a, float *b, int N) {
N = 199;
#pragma omp parallel for
for (int i = 1; i < N; i++)
a[i] = b[i] + 1.0;
}
```
Johannes Doerfert [Wed, 8 Sep 2021 20:57:18 +0000 (15:57 -0500)]
[Attributor][NFC] Improve debug messages
Petr Hosek [Tue, 29 Sep 2020 00:37:20 +0000 (17:37 -0700)]
[CMake] Cache the compiler-rt library search results
There are a lot of duplicated calls to find various compiler-rt libraries
from the builds of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implements caching
of results to avoid repeated Clang invocations.
This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.
Differential Revision: https://reviews.llvm.org/D88458
Jon Chesterfield [Thu, 28 Oct 2021 00:35:16 +0000 (01:35 +0100)]
[openmp] Fix a git misfire in cf37a94c1e42ce
Vincent Lee [Wed, 27 Oct 2021 04:42:25 +0000 (21:42 -0700)]
[lld-macho] Implement -S
There are a couple internal builds that require the use of this flag.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D112594
Jon Chesterfield [Thu, 28 Oct 2021 00:01:53 +0000 (01:01 +0100)]
Revert "[libomptarget] Build DeviceRTL for amdgpu"
- more tests failing on CI than failed locally when writing this patch
This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.
Jon Chesterfield [Wed, 27 Oct 2021 23:54:29 +0000 (00:54 +0100)]
[openmp] Add amdgpu impl missed from D112153
Greg Clayton [Wed, 27 Oct 2021 00:48:42 +0000 (17:48 -0700)]
Add breakpoint resolving stats to each target.
This patch adds breakpoints to each target's statistics so we can track how long it takes to resolve each breakpoint. It also includes the structured data for each breakpoint so the exact breakpoint details are logged to allow for reproduction of slow resolving breakpoints. Each target gets a new "breakpoints" array that contains breakpoint details. Each breakpoint has "details" which is the JSON representation of a serialized breakpoint resolver and filter, "id" which is the breakpoint ID, and "resolveTime" which is the time in seconds it took to resolve the breakpoint. A snippet of the new data is shown here:
  "targets": [
    {
      "breakpoints": [
        {
          "details": {...},
          "id": 1,
          "resolveTime": 0.00039291599999999999
        },
        {
          "details": {...},
          "id": 2,
          "resolveTime": 0.00022679199999999999
        }
      ],
      "totalBreakpointResolveTime": 0.00061970799999999996
    }
  ]
This provides full details on exactly how breakpoints were set and how long it took to resolve them.
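In the snippet, "totalBreakpointResolveTime" appears to be the sum of the per-breakpoint "resolveTime" values. The sketch below reproduces that arithmetic, assuming the total really is a plain sum. `ToyBreakpointStat` and `totalResolveTime` are hypothetical names, not LLDB APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record mirroring the "id" and "resolveTime" fields.
struct ToyBreakpointStat {
  int Id;
  double ResolveTime; // seconds
};

// Sums the per-breakpoint resolve times for one target.
inline double totalResolveTime(const std::vector<ToyBreakpointStat> &Stats) {
  double Total = 0.0;
  for (const ToyBreakpointStat &S : Stats) {
    assert(S.ResolveTime >= 0.0 && "resolve times are durations");
    Total += S.ResolveTime;
  }
  return Total;
}
```

Feeding in the two values from the snippet (0.000392916 and 0.000226792) gives about 0.000619708 seconds, in line with the reported total.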
Differential Revision: https://reviews.llvm.org/D112587