Adrian Prantl [Wed, 24 Feb 2021 01:52:21 +0000 (17:52 -0800)]
Add more historic DWARF vendor extensions
The maintainer of libdwarf kindly provided this patch with a bunch of
historic DWARF extensions that are missing from Dwarf.def. This list
is helpful to avoid potential conflicts in the user-defined vendor
extension space in the future.
Patch by David Anderson!
Differential Revision: https://reviews.llvm.org/D97242
Ta-Wei Tu [Wed, 24 Feb 2021 01:52:46 +0000 (09:52 +0800)]
[LoopNest] Use `getUniqueSuccessor()` instead when checking empty blocks
Blocks that contain only a single branch instruction to the next block can be skipped in analyzing the loop-nest structure.
This is currently done by `getSingleSuccessor()`.
However, the branch instruction might have multiple targets which happen to all be the same.
In this case, the block should still be considered as empty and skipped.
An example is `test/Transforms/LoopInterchange/update-condbranch-duplicate-successors.ll` (the LIT test for this patch is modified from it as well).
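The difference between the two queries can be sketched with a small standalone model. This is not the LLVM implementation: `Block`, `singleSuccessor`, and `uniqueSuccessor` are hypothetical stand-ins that only mirror the idea behind `getSingleSuccessor()` and `getUniqueSuccessor()`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a basic block: only the successor edges matter here.
struct Block {
  std::vector<const Block *> Succs; // one entry per branch target (edge)
};

// Idea behind getSingleSuccessor(): non-null only if there is exactly one edge.
const Block *singleSuccessor(const Block &B) {
  return B.Succs.size() == 1 ? B.Succs.front() : nullptr;
}

// Idea behind getUniqueSuccessor(): non-null if every edge targets the same block.
const Block *uniqueSuccessor(const Block &B) {
  if (B.Succs.empty())
    return nullptr;
  const Block *First = B.Succs.front();
  for (const Block *S : B.Succs)
    if (S != First)
      return nullptr;
  return First;
}
```

A conditional branch whose two targets are the same block has two edges, so the first query rejects it while the second still reports the shared target.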
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D97286
Juneyoung Lee [Tue, 23 Feb 2021 02:46:59 +0000 (11:46 +0900)]
[SimplifyCFG] Update passingValueIsAlwaysUndefined to check more attributes
This is a simple patch to update SimplifyCFG's passingValueIsAlwaysUndefined to inspect more attributes.
A new function `CallBase::isPassingUndefUB` checks attributes that imply noundef.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D97244
Nico Weber [Wed, 24 Feb 2021 01:38:16 +0000 (20:38 -0500)]
Revert "[Driver][Windows] Support per-target runtimes dir layout for profile instr generate"
This reverts commit 7f9d5d6e444c91ce6f2e377b312ac573dfc6779a.
Breaks check-clang everywhere, see https://reviews.llvm.org/D96638#2583608
Kern Handa [Sun, 21 Feb 2021 09:20:37 +0000 (01:20 -0800)]
[mlir] ExecutionEngine needs special handling for COFF binaries
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97141
River Riddle [Wed, 24 Feb 2021 00:40:09 +0000 (16:40 -0800)]
[mlir][Inliner] Keep the number of async pass managers constant
This prevents a bug in the pass instrumentation implementation where the main thread would end up with a different pass manager in different runs of the pass.
Erich Keane [Tue, 23 Feb 2021 19:58:46 +0000 (11:58 -0800)]
[NFC] Make TrailingObjects non-copyable/non-movable
This got me pretty recently... TrailingObjects cannot be copied or
moved, since they need to be pre-allocated. This patch deletes the copy
and move operations (plus re-adds the default ctor).
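A minimal sketch of the pattern, assuming a hypothetical `TrailingLike` base rather than the real `llvm::TrailingObjects` template. Declaring the deleted copy and move operations suppresses the implicit default constructor, which is why it has to be re-added.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in; not the real llvm::TrailingObjects template.
class TrailingLike {
public:
  // Re-added: any user-declared constructor (even a deleted one) suppresses
  // the implicitly declared default constructor.
  TrailingLike() = default;
  TrailingLike(const TrailingLike &) = delete;
  TrailingLike(TrailingLike &&) = delete;
  TrailingLike &operator=(const TrailingLike &) = delete;
  TrailingLike &operator=(TrailingLike &&) = delete;
};

// A derived class inherits the restriction: its implicit copy/move are deleted too.
struct Node : TrailingLike {
  int NumTrailing = 0;
};

static_assert(!std::is_copy_constructible<Node>::value, "copy must be rejected");
static_assert(!std::is_move_constructible<Node>::value, "move must be rejected");
static_assert(std::is_default_constructible<Node>::value, "default ctor still usable");
```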
Differential Revision: https://reviews.llvm.org/D97324
Jessica Paquette [Wed, 24 Feb 2021 00:12:56 +0000 (16:12 -0800)]
[AArch64][GlobalISel] Correct function evaluation order in applyINS
The order in which the nested calls to Builder.buildWhatever are
evaluated differs between GCC and Clang.
This caused a bot failure because the MIR in the testcase was
coming out in a different order than expected.
Rather than using nested calls, pull them out in order to fix the
order of evaluation.
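The effect can be sketched outside of GlobalISel. `RecordingBuilder` and the instruction names below are hypothetical; the point is only that C++ leaves the evaluation order of function arguments unspecified, while separate statements are sequenced.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical builder that records the order in which "instructions" are created.
struct RecordingBuilder {
  std::vector<std::string> Log;
  int build(const std::string &Name) {
    Log.push_back(Name);
    return static_cast<int>(Log.size());
  }
  int buildWithOperands(const std::string &Name, int, int) {
    Log.push_back(Name);
    return static_cast<int>(Log.size());
  }
};

// The two inner build() calls are function arguments, so their relative
// order is unspecified and may differ between compilers.
int nested(RecordingBuilder &B) {
  return B.buildWithOperands("ins", B.build("extract"), B.build("constant"));
}

// Hoisting the calls into named locals sequences them statement by statement.
int hoisted(RecordingBuilder &B) {
  int Extract = B.build("extract");
  int Constant = B.build("constant");
  return B.buildWithOperands("ins", Extract, Constant);
}
```

With `hoisted`, the log is always `extract`, `constant`, `ins`; with `nested`, the first two entries may swap depending on the compiler.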
Fangrui Song [Wed, 24 Feb 2021 00:09:05 +0000 (16:09 -0800)]
collectUsedGlobalVariables: migrate SmallPtrSetImpl overload to SmallVecImpl overload after D97128
And delete the SmallPtrSetImpl overload.
While here, decrease inline element counts from 8 to 4. See D97128 for the choice.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D97257
Fangrui Song [Tue, 23 Feb 2021 20:32:08 +0000 (12:32 -0800)]
Fix unstable SmallPtrSet iteration issues due to collectUsedGlobalVariables
While here, decrease inline element counts from 8 to 4. See D97128 for the choice.
Depends on D97128 (which added a new SmallVecImpl overload for collectUsedGlobalVariables).
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D97139
Fangrui Song [Tue, 23 Feb 2021 23:50:45 +0000 (15:50 -0800)]
[ThinLTO] Make cloneUsedGlobalVariables deterministic
Iterating on `SmallPtrSet<GlobalValue *, 8>` with more than 8 elements
is not deterministic. Use a SmallVector instead because `Used` is guaranteed to contain unique elements.
While here, decrease inline element counts from 8 to 4. The number of
`llvm.used`/`llvm.compiler.used` elements is usually 0 or 1. For full
LTO/hybrid LTO, the number may be large, so we need to be careful.
According to tejohnson's analysis https://reviews.llvm.org/D97128#2582399 , 4 is
good for a large project with WholeProgramDevirt, when available_externally
vtables are placed in the llvm.compiler.used set.
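A rough illustration of why a pointer-keyed set is a poor source of iteration order. It uses `std::set` as a stand-in, which is an assumption made for the sketch: `SmallPtrSet` is hashed rather than sorted, but in both cases the order depends on addresses instead of insertion order. `Global` and both helper functions are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Global {
  std::string Name;
};

// Address-dependent order: the result depends on where the objects live in memory.
std::vector<std::string> namesViaPointerSet(const std::vector<const Global *> &Used) {
  std::set<const Global *> S(Used.begin(), Used.end());
  std::vector<std::string> Out;
  for (const Global *G : S)
    Out.push_back(G->Name);
  return Out;
}

// Insertion order: the result depends only on the order of appearance.
std::vector<std::string> namesViaVector(const std::vector<const Global *> &Used) {
  std::vector<std::string> Out;
  for (const Global *G : Used)
    Out.push_back(G->Name);
  return Out;
}
```

Because the input is already known to hold unique elements, the vector loses nothing compared to the set and keeps the output stable across runs.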
Differential Revision: https://reviews.llvm.org/D97128
Teresa Johnson [Sat, 20 Feb 2021 22:09:10 +0000 (14:09 -0800)]
[WPD] Fix handling of pure virtual base class
The fix in 3c4c205060c9398da705eb71b63ddd8a04999de9 caused an assert in
the case of a pure virtual base class. In that case, the vTableFuncs
list on the summary will be empty, so we were hitting the new assert
that the linkage type was not available_externally.
In the case of pure virtual, we do not want to assert, and additionally
need to set VS so that we don't treat it conservatively and quit the
analysis of the type id early.
This exposed a pre-existing issue where we were not updating the vcall
visibility on pure virtual functions when whole program visibility was
specified. We were skipping updating the visibility on any global vars
that didn't have any vTableFuncs, which meant all pure virtual were not
updated, and the later analysis would block any devirtualization of
calls that had a type id used on those pure virtual vtables (see the
handling in the other code modified in this patch). Simply remove that
check. It will mean that we may update the vcall visibility on global
vars that aren't vtables, but that setting is ignored for any global
vars that didn't have type metadata anyway.
Added a new test case that asserted without removing the assert, and
that requires the other fixes in this patch (updateVCallVisibilityInIndex
and not skipping all vtables without virtual funcs) to get a successful
devirtualization with index-only WPD. I added cases to test hybrid and
regular LTO for completeness, although those already worked without the
fixes here.
With this final fix, a clang multistage bootstrap with WPD builds and
runs all tests successfully.
Differential Revision: https://reviews.llvm.org/D97126
Hsiangkai Wang [Thu, 4 Feb 2021 04:57:44 +0000 (12:57 +0800)]
[RISCV] Add vadd with mask and without mask builtin.
Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93446
Siva Chandra Reddy [Tue, 23 Feb 2021 08:13:42 +0000 (00:13 -0800)]
[libc] Add a standalone flavor of an equivalent of std::string_view.
This class is to serve as a replacement for llvm::StringRef as part of
the plans to limit dependency on other parts of LLVM. One use of
llvm::StringRef in MPFRWrapper has been replaced with the new class.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D97330
Tue Ly [Thu, 18 Feb 2021 20:04:50 +0000 (15:04 -0500)]
[libc] Add exhaustive test for sqrtf.
Differential Revision: https://reviews.llvm.org/D96985
Jianzhou Zhao [Tue, 23 Feb 2021 04:24:42 +0000 (04:24 +0000)]
[dfsan] Update memset and dfsan_(set|add)_label with origin tracking
This is a part of https://reviews.llvm.org/D95835.
Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D97302
Matthew Voss [Tue, 23 Feb 2021 22:55:53 +0000 (14:55 -0800)]
[LTO] Fix test failures caused by 6da7d3141651
Adds "REQUIRES: asserts", since the test uses debug messages
Heejin Ahn [Wed, 17 Feb 2021 13:08:52 +0000 (05:08 -0800)]
[WebAssembly] Fix incorrect grouping and sorting of exceptions
This CL is not big but contains changes that span multiple analyses and
passes. This description is very long because it tries to explain basics
on what each pass/analysis does and why we need this change on top of
that. Please feel free to skip parts that are not necessary for your
understanding.
---
`WasmEHFuncInfo` contains the mapping of <EH pad, the EH pad's next
unwind destination>. The value (unwind dest) here is where an exception
should end up when it is not caught by the key (EH pad). We record this
info in WasmEHPrepare to fix catch mismatches, because the CFG itself
does not have this info. A CFG only contains BBs and
predecessor-successor relationship between them, but in `WasmEHFuncInfo`
the unwind destination BB is not necessarily a successor of the key EH
pad BB. Their relationship can be intuitively explained by this C++ code
snippet:
```
try {
  try {
    foo();
  } catch (int) { // EH pad
    ...
  }
} catch (...) { // unwind destination
}
```
So when `foo()` throws, it goes to `catch (int)` first. But if it is not
caught by it, it ends up in the next unwind destination `catch (...)`.
This unwind destination is what you see in `catchswitch`'s
`unwind label %bb` part.
---
`WebAssemblyExceptionInfo` groups exceptions so that they can be sorted
continuously together in CFGSort, as we do for loops. What this analysis
does is very simple: it creates a single `WebAssemblyException` per EH
pad, and all BBs that are dominated by that EH pad are included in this
exception. We also identify subexception relationship in this way: if
EHPad A dominates EHPad B, EHPad B's exception is a subexception of
EHPad A's exception.
This simple rule turns out to be incorrect in some cases. In
`WasmEHFuncInfo`, if EHPad A's unwind destination is EHPad B, it means
semantically EHPad B should not be included in EHPad A's exception,
because it does not make sense to rethrow/delegate to an inner scope.
This is what happened in CFGStackify as a result of this:
```
try
try
catch
... <- %dest_bb is among here!
end
delegate %dest_bb
```
So this patch adds a phase in `WebAssemblyExceptionInfo::recalculate` to
make sure exceptions' unwind destinations are not subexceptions of
their unwind sources in `WasmEHFuncInfo`.
But this alone does not prevent `dest_bb` in the example above from
being sorted within the inner `catch`'s exception, even if its exception
is not a subexception of that `catch`'s exception anymore, because of
how CFGSort works, which will be explained below.
---
CFGSort places BBs within the same `SortRegion` (loop or exception)
continuously together so they can be demarcated with `loop`-`end_loop`
or `catch`-`end_try` in CFGStackify.
`SortRegion` is a wrapper for one of `MachineLoop` or
`WebAssemblyException`. `SortRegionInfo` already does some complicated
things because there are discrepancies between those two data structures.
`WebAssemblyException` is what we control, and it is defined as an EH
pad as its header and BBs dominated by the header as its BBs (with a
newly added exception of unwind destinations explained in the previous
paragraph). But `MachineLoop` is an LLVM data structure and uses the
standard loop detection algorithm. By that algorithm, a loop consists of BBs
that 1. are dominated by the loop header and 2. have a path back to its header.
Because of the second condition, many BBs that are dominated by the loop
header are not included in the loop. So BBs that contain `return` or
branches to outside of the loop are not technically included in
`MachineLoop`, but they can be sorted together with the loop with no
problem.
Maybe to relax the condition, in CFGSort, when we are in a `SortRegion`
we allow sorting of not only BBs that belong to the current innermost
region but also BBs that are dominated by the current region header.
(This was written this way from the first version written by Dan, when
only loops existed.) But now, we have cases in exceptions when EHPad B
is the unwind destination for EHPad A, even if EHPad B is dominated by
EHPad A it should not be included in EHPad A's exception, and should not
be sorted within EHPad A.
One way to make things work, at least correctly, is to change `dominates`
condition to `contains` condition for `SortRegion` when sorting BBs, but
this will change compilation results for existing non-EH code and I
can't be sure it will not degrade performance or code size. I think it
will degrade performance because it will force many BBs dominated by a
loop, which don't have the path back to the header, to be placed after
the loop and it will likely create more branches and blocks.
So this does a little hacky check when adding BBs to `Preferred` list:
(`Preferred` list is a ready list. CFGSort maintains ready list in two
priority queues: `Preferred` and `Ready`. I'm not very sure why, but it
was written that way from the beginning. BBs are first added to
`Preferred` list and then some of them are pushed to `Ready` list, so
here we only need to guard condition for `Preferred` list.)
When adding a BB to `Preferred` list, we check if that BB is an unwind
destination of another BB. To do this, this adds the reverse mapping,
`UnwindDestToSrc`, and getter methods to `WasmEHFuncInfo`. And if the BB
is an unwind destination, it checks if the current stack of regions
(`Entries`) contains its source BB by traversing the stack backwards. If
we find its unwind source in there, we add the BB to its `Deferred`
list, to make sure that unwind destination BB is added to `Preferred`
list only after that region with the unwind source BB is sorted and
popped from the stack.
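The deferral check described above can be sketched as follows. This is a simplified model under stated assumptions, not the CFGSort code: `BlockId`, `EHInfo`, `Region`, and `regionToDeferTo` are hypothetical names, blocks are plain integers, and picking the innermost matching region is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using BlockId = int; // hypothetical stand-in for a machine basic block

// Simplified model of the forward map plus the new reverse map.
struct EHInfo {
  std::map<BlockId, BlockId> SrcToUnwindDest;
  std::map<BlockId, std::set<BlockId>> UnwindDestToSrc;
  void setUnwindDest(BlockId Src, BlockId Dest) {
    SrcToUnwindDest[Src] = Dest;
    UnwindDestToSrc[Dest].insert(Src);
  }
};

// A region on the sort stack, reduced to the set of blocks it contains.
struct Region {
  std::set<BlockId> Blocks;
};

// Walk the stack backwards (innermost first). Return the index of the first
// region that contains an unwind source of BB, meaning BB should be deferred
// until that region is popped; return -1 if BB can be added right away.
int regionToDeferTo(const EHInfo &Info, const std::vector<Region> &Entries,
                    BlockId BB) {
  auto It = Info.UnwindDestToSrc.find(BB);
  if (It == Info.UnwindDestToSrc.end())
    return -1; // BB is not an unwind destination
  for (int I = static_cast<int>(Entries.size()) - 1; I >= 0; --I)
    for (BlockId Src : It->second)
      if (Entries[I].Blocks.count(Src))
        return I;
  return -1;
}
```

The reverse map is what makes the question "is this BB somebody's unwind destination?" a single lookup instead of a scan over all EH pads.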
---
This does not contain a new test that crashes because of this bug, but
this fix changes the result for one of the existing test cases. This test
case didn't crash because it fortunately didn't contain `delegate` to
the incorrectly placed unwind destination BB.
Fixes https://github.com/emscripten-core/emscripten/issues/13514.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D97247
Daniel Hwang [Tue, 23 Feb 2021 22:36:49 +0000 (14:36 -0800)]
[scan-build-py] Add sarif-html support in scan-build-py
Update scan-build-py to be able to trigger sarif-html output format in clang static analyzer.
NOTE: testcase `test_sarif_and_html_creates_sarif_and_html_reports` will fail if the default clang does not have change https://reviews.llvm.org/D96389 . This can be remediated by pointing the default clang in arguments.py to a locally built clang. I was unable to figure out where these particular tests for scan-build-py are being invoked (aside from manually), so any help there would be greatly appreciated.
Reviewed By: aabbaabb, xazax.hun
Differential Revision: https://reviews.llvm.org/D96570
Amara Emerson [Tue, 23 Feb 2021 22:34:29 +0000 (14:34 -0800)]
Fix a range-loop-analysis warning.
Heejin Ahn [Tue, 23 Feb 2021 19:00:11 +0000 (11:00 -0800)]
[WebAssembly] Disable wasm.lsda() optimization in WasmEHPrepare
In every catchpad except `catch (...)`, we add a call to
`_Unwind_CallPersonality`, which is a wrapper to call the personality
function. (In most other Itanium-based architectures the call is done
from libunwind, but in wasm we don't have control over the VM.)
Because the personality function is called to figure out whether the
current exception is a type we should catch, such as `int` or
`SomeClass&`, `catch (...)` does not need the personality function call.
For the same reason, all cleanuppads don't need it.
When we call `_Unwind_CallPersonality`, we store some necessary info in
a data structure called `__wasm_lpad_context` of type
`_Unwind_LandingPadContext`, which is defined in the wasm's port of
libunwind in Emscripten. Also the personality wrapper function returns
some info (selector and the caught pointer) in that data structure, so
it is used as a medium for communication.
One piece of info we need to store is the address for LSDA info for the
current function. `wasm.lsda()` intrinsic returns that address. (This
intrinsic will be lowered to a symbol that points to the LSDA address.)
The simplest thing is to call `wasm.lsda()` every time we need to call
`_Unwind_CallPersonality` and store that info in `__wasm_lpad_context`
data structure. But we tried to be better than that (D77423 and some
more previous CLs), so if catchpad A dominates catchpad B and catchpad A
is not `catch (...)`, we didn't insert `wasm.lsda()` call in catchpad B,
thinking that the LSDA address is the same for a single function and we
already visited catchpad A and `__wasm_lpad_context.lsda` field would
already have that value.
But this can be incorrect if there is a call to another function, which
also can have the personality function and LSDA, between catchpad A and
catchpad B, because `__wasm_lpad_context` is a globally defined
structure and the callee function will overwrite its `lsda` field.
So in this CL we don't try to do any optimizations on adding
`wasm.lsda()` call; we store the result of `wasm.lsda()` every time we
call `_Unwind_CallPersonality`. We can do some complicated analysis,
like checking if there is a function call between the dominating
catchpad and the current catchpad, but at this time it seems overkill.
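The stale-field problem can be reproduced with a toy model. Everything here is hypothetical: `LandingPadContext` only models the `lsda` field, the LSDA "addresses" are small integers, and `callPersonality` just reports which LSDA the personality wrapper would consult.

```cpp
#include <cassert>

// Hypothetical model of the globally shared context; only lsda is modelled.
struct LandingPadContext {
  int Lsda = 0;
};
LandingPadContext LpadContext; // stand-in for __wasm_lpad_context

constexpr int CallerLsda = 1; // made-up LSDA "addresses"
constexpr int CalleeLsda = 2;

// Reports the LSDA the personality wrapper would read from the shared context.
int callPersonality() { return LpadContext.Lsda; }

// Another function with its own catchpad: it stores its own LSDA.
void callee() {
  LpadContext.Lsda = CalleeLsda;
  callPersonality();
}

// Old scheme: store in dominating catchpad A, skip the store in catchpad B.
int callerOptimized() {
  LpadContext.Lsda = CallerLsda; // catchpad A
  callPersonality();
  callee();                      // overwrites the shared field
  return callPersonality();      // catchpad B reads the callee's LSDA
}

// New scheme: store before every personality call.
int callerFixed() {
  LpadContext.Lsda = CallerLsda; // catchpad A
  callPersonality();
  callee();
  LpadContext.Lsda = CallerLsda; // catchpad B stores again
  return callPersonality();
}
```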
This deletes three tests because they all tested `wasm.lsda()` call
optimization.
Fixes https://github.com/emscripten-core/emscripten/issues/13548.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D97309
River Riddle [Tue, 23 Feb 2021 22:22:30 +0000 (14:22 -0800)]
[mlir][Inliner] Use llvm::parallelForEach instead of llvm::parallelTransformReduce
llvm::parallelTransformReduce does not schedule work on the caller thread, which becomes very costly for
the inliner where a majority of SCCs are small, often ~1 element. The switch to llvm::parallelForEach solves this,
and also aligns the implementation with the PassManager (which realistically should share the same implementation).
This change dropped compile time on an internal benchmark by ~1 second (25%).
Differential Revision: https://reviews.llvm.org/D96086
River Riddle [Tue, 23 Feb 2021 22:22:23 +0000 (14:22 -0800)]
[mlir] Refactor InterfaceMap to use a sorted vector of interfaces, as opposed to a DenseMap
A majority of operations have a very small number of interfaces, which means that the cost of using a hash map is generally larger for interface lookups than just a binary search. In the future when there are a number of operations with large amounts of interfaces, we can switch to a hybrid approach that optimizes lookups based on the number of interfaces. For now, however, a binary search is the best approach.
This dropped compile time on a largish TF MLIR module by 20% (half a second).
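A minimal sketch of the sorted-vector idea, assuming hypothetical `InterfaceId` and `Concept` types in place of MLIR's `TypeID` and interface concept pointers. It is not the MLIR `InterfaceMap`, only an illustration of lookup by binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using InterfaceId = int; // hypothetical stand-in for a TypeID
using Concept = int;     // hypothetical stand-in for the interface implementation

class SortedInterfaceMap {
  using Entry = std::pair<InterfaceId, Concept>;
  std::vector<Entry> Entries; // kept sorted by InterfaceId

public:
  explicit SortedInterfaceMap(std::vector<Entry> E) : Entries(std::move(E)) {
    std::sort(Entries.begin(), Entries.end(),
              [](const Entry &A, const Entry &B) { return A.first < B.first; });
  }

  // Binary search over the sorted ids; returns nullptr when Id is absent.
  const Concept *lookup(InterfaceId Id) const {
    auto It = std::lower_bound(
        Entries.begin(), Entries.end(), Id,
        [](const Entry &E, InterfaceId Key) { return E.first < Key; });
    return (It != Entries.end() && It->first == Id) ? &It->second : nullptr;
  }
};
```

For a handful of entries the whole vector tends to sit in one or two cache lines, which is the usual argument for preferring it over hashing at this size.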
Differential Revision: https://reviews.llvm.org/D96085
David Green [Tue, 23 Feb 2021 22:27:06 +0000 (22:27 +0000)]
[ARM] Mir test for pre/postinc ldstopt combines. NFC
Matt Arsenault [Sun, 21 Feb 2021 21:14:40 +0000 (16:14 -0500)]
AMDGPU: Use aligned vgprs/agprs in gfx90a mir tests
These would fail a verifier check in a future change.
David Crook [Tue, 23 Feb 2021 20:48:15 +0000 (12:48 -0800)]
[SEMA] Added warn_decl_shadow support for structured bindings
https://bugs.llvm.org/show_bug.cgi?id=40858
CheckShadow is now called for each binding in the structured binding to make sure it does not shadow any other variable in scope. This does use a custom implementation of getShadowedDeclaration though because a BindingDecl is not a VarDecl
Added a few unit tests for this. In theory all the other shadow unit tests should be duplicated for the structured binding variables too, but it is probably not worth it as they use common code. The MyTuple and std interface code has been copied from live-bindings-test.cpp
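For illustration, the kind of code this targets might look like the following. The function name is hypothetical, and it is an assumption here that the diagnostic is surfaced through the usual shadow warning flags; the snippet itself is valid C++17 either way.

```cpp
#include <cassert>
#include <utility>

int sumWithShadowedBinding() {
  int x = 1;
  int outer = x;
  {
    // The binding named x hides the outer variable x for the rest of this
    // scope; this is the situation the new check is meant to report.
    auto [x, y] = std::make_pair(10, 20);
    return outer + x + y;
  }
}
```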
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D96147
zero9178 [Tue, 23 Feb 2021 21:34:27 +0000 (22:34 +0100)]
[Driver][Windows] Support per-target runtimes dir layout for profile instr generate
When targeting an MSVC triple, --dependant-libs with the name of the clang runtime library for profiling is added to the command line args. In its current implementation, clang_rt.profile-<ARCH> is chosen as the name. When building a distribution using LLVM_ENABLE_PER_TARGET_RUNTIME_DIR this fails, because the runtime file names do not have an architecture suffix.
This patch refactors getCompilerRT and getCompilerRTBasename to always consider per-target runtime directories. getCompilerRTBasename now simply returns the filename component of the path found by getCompilerRT
Differential Revision: https://reviews.llvm.org/D96638
Jorge Gorbe Moya [Sat, 6 Feb 2021 02:00:26 +0000 (18:00 -0800)]
Defer the decision whether to use the CU or TU index until after reading the unit header.
In DWARF v4 compile units go in .debug_info and type units go in
.debug_types. However, in v5 both kinds of units are in .debug_info.
Therefore we can't decide whether to use the CU or TU index just by
looking at which section we're reading from. We have to wait until we
have read the unit type from the header.
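The decision can be sketched as a small function. `Section`, `Index`, and `chooseIndex` are hypothetical names for this illustration; the two unit type codes are the values DWARF v5 assigns to `DW_UT_type` and `DW_UT_split_type`.

```cpp
#include <cassert>
#include <cstdint>

enum class Section { DebugInfo, DebugTypes };
enum class Index { CU, TU };

// Unit type codes from the DWARF v5 unit header.
constexpr std::uint8_t DW_UT_type = 0x02;
constexpr std::uint8_t DW_UT_split_type = 0x06;

// Hypothetical helper: before v5 there is no unit type in the header, so the
// section decides; from v5 on, the unit type read from the header decides.
Index chooseIndex(std::uint16_t Version, Section Sec, std::uint8_t UnitType) {
  if (Version < 5)
    return Sec == Section::DebugTypes ? Index::TU : Index::CU;
  return (UnitType == DW_UT_type || UnitType == DW_UT_split_type) ? Index::TU
                                                                  : Index::CU;
}
```

The third parameter is only meaningful once the header has been parsed, which is why the choice has to be deferred.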
Differential Revision: https://reviews.llvm.org/D96194
Aart Bik [Tue, 23 Feb 2021 19:43:03 +0000 (11:43 -0800)]
[mlir][sparse] incorporate vector index into address computation
When computing dense address, a vectorized index must be accounted
for properly. This bug was formerly undetected because we get 0 * prev + i
in most cases, which folds away the scalar part. Now it works for all cases.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D97317
Eric Schweitz [Tue, 23 Feb 2021 20:28:20 +0000 (12:28 -0800)]
[flang][fir][NFC] remove dead code
Removes unused function from FatalError.h.
Differential revision: https://reviews.llvm.org/D97328
Matthew Voss [Thu, 14 Jan 2021 23:31:32 +0000 (15:31 -0800)]
[llvm-profdata] Emit Error when Invalid MemOpSize Section is Created by llvm-profdata
Under certain (currently unknown) conditions, llvm-profdata is outputting
profiles that have two consecutive entries in the MemOPSize section for the
value 0. This causes the PGOMemOPSizeOpt pass to output an invalid switch
instruction with two cases for 0. As mentioned, we’re not quite sure what’s
causing this to happen, but this patch prevents llvm-profdata from outputting a
profile that has this problem and gives an error with a request for a
reproducible.
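A minimal sketch of such a validity check, assuming a hypothetical `ValueCount` record and helper name; the actual check in llvm-profdata may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical value-profile record: a profiled value and how often it was seen.
struct ValueCount {
  std::uint64_t Value;
  std::uint64_t Count;
};

// True if two consecutive records carry the same value, which would later
// turn into two switch cases with the same constant.
bool hasConsecutiveDuplicate(const std::vector<ValueCount> &Records) {
  for (std::size_t I = 1; I < Records.size(); ++I)
    if (Records[I - 1].Value == Records[I].Value)
      return true;
  return false;
}
```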
Differential Revision: https://reviews.llvm.org/D92074
David Green [Tue, 23 Feb 2021 20:31:01 +0000 (20:31 +0000)]
[AArch64] Introduce UDOT/SDOT DAG nodes
This is used to lower UDOT/SDOT instructions, as opposed to relying on
the intrinsic. Subsequent optimizations will be able to optimize them
more cleanly based on these nodes.
Lang Hames [Tue, 23 Feb 2021 20:27:39 +0000 (07:27 +1100)]
Revert "[docs][ORC] Fix section title and reference."
This reverts commit 6e1affe71c79a1cb5ea9d805ff7baae5cba59c0e, which caused an
error on the Sphinx doc bot.
Craig Topper [Tue, 23 Feb 2021 20:17:43 +0000 (12:17 -0800)]
[RISCV] Use a different constant in one of the smulo test cases to avoid converting the mul to an add.
Jessica Paquette [Tue, 23 Feb 2021 19:27:11 +0000 (11:27 -0800)]
Recommit "[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt"
Attempted fix for the added test failing.
https://lab.llvm.org/buildbot/#/builders/104/builds/2355/steps/5/logs/stdio
I can't reproduce the failure anywhere, so I'm going to guess that passing a
std::function as MatchInfo is sketchy in this context.
Switch it to a std::tuple and hope for the best.
Amara Emerson [Tue, 23 Feb 2021 19:23:04 +0000 (11:23 -0800)]
[AArch64][GlobalISel] Lower G_USUBSAT and G_UADDSAT for scalars.
We have some missing optimization counterparts to LowerXALUO, but it's a start.
Florian Hahn [Tue, 23 Feb 2021 15:52:46 +0000 (15:52 +0000)]
[AArch64] Regenerate check lines for neon-compare-instructions.ll.
Auto-generate tests so they can be updated more easily, e.g. for D97303.
Andrei Elovikov [Tue, 23 Feb 2021 18:54:22 +0000 (10:54 -0800)]
[NFC][VPlan] Use VPUser to store block's predicate
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96529
Florian Hahn [Tue, 23 Feb 2021 17:46:22 +0000 (17:46 +0000)]
[LV] Ensure fixNonInductionPHIs uses a valid insertion point.
In some cases, Builder's insertion point may be invalidated before using
it in VPTransformState::get. Make sure the insertion point is
up-to-date.
This should fix various sanitizer errors, like
https://lab.llvm.org/buildbot/#/builders/5/builds/4933/steps/9/logs/stdio
Nathan James [Tue, 23 Feb 2021 18:29:22 +0000 (18:29 +0000)]
[clang-tidy] Add cppcoreguidelines-prefer-member-initializer to ReleaseNotes
Following a discussion about the current state of this check on the 12.X branch, it was decided to purge the check as it wasn't in a fit-to-release state, see https://llvm.org/PR49318.
This check has since had some of those issues addressed and should be good for the next release cycle now, pending any more bug reports about it.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97275
Simon Pilgrim [Tue, 23 Feb 2021 18:26:41 +0000 (18:26 +0000)]
[InstSimplify] Handle nsw shl -> poison patterns
Pulled out from D90479 - this recognises invalid nsw shl patterns with signbit changes that result in poison.
Differential Revision: https://reviews.llvm.org/D97305
Stanislav Mekhanoshin [Mon, 22 Feb 2021 20:25:30 +0000 (12:25 -0800)]
[AMDGPU] Set threshold for regbanks reassign pass
This is to limit compile time. I did experiments with some
inputs and found that compile time stays reasonable for this
pass if we have less than 100000 virtual registers and then
starts to explode somewhere between 100000 and 150000.
Differential Revision: https://reviews.llvm.org/D97218
Andrzej Warzynski [Mon, 22 Feb 2021 18:05:18 +0000 (18:05 +0000)]
[flang][test] Share all driver test dirs between `f18` and `flang-new`
Originally, when we added the new driver, we created dedicated test
directories for `flang-new`. This way we separated the tests for the
`throwaway` and the new driver.
As we are increasing test coverage and starting to share tests between
the two drivers, it makes sense to share all directories and instead
rely on:
```
! REQUIRES: new-flang-driver
```
to mark tests as exclusively for the new driver.
Differential Revision: https://reviews.llvm.org/D97207
Shilei Tian [Tue, 23 Feb 2021 18:20:13 +0000 (13:20 -0500)]
[OpenMP][NVPTX] Fixed a compilation error in deviceRTLs caused by unsupported feature in release version of LLVM
`ptx71` is not supported in release version of LLVM yet. As a result,
the support of CUDA 11.2 and CUDA 11.1 caused a compilation error as mentioned
in D97004. Since the support in D97004 is just a workaround for release, and we'll not
use it in the near future, using `ptx70` for CUDA 11 is feasible.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97195
Adam Straw [Tue, 23 Feb 2021 18:16:36 +0000 (10:16 -0800)]
make Affine parallel and yield ops MemRefsNormalizable
Affine parallel ops may contain and yield results from MemRefsNormalizable ops in the loop body. Thus, both affine.parallel and affine.yield should have the MemRefsNormalizable trait.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D96821
Simon Pilgrim [Tue, 23 Feb 2021 18:08:32 +0000 (18:08 +0000)]
[InstructionSimplify] SimplifyShift - rename shift amount KnownBits. NFCI.
As suggested on D97305.
Duncan P. N. Exon Smith [Tue, 23 Feb 2021 16:38:47 +0000 (08:38 -0800)]
Revert "Module: Use FileEntryRef and DirectoryEntryRef in Umbrella, Header, and DirectoryName, NFC"
This (mostly) reverts 32c501dd88b62787d3a5ffda7aabcf4650dbe3cd. Hit a
case where this causes a behaviour change, perhaps the same root cause
that triggered the revert of a40db5502b2515a6f2f1676b5d7a655ae0f41179 in
7799ef7121aa7d59f4bd95cdf70035de724ead6f.
(The API changes in DirectoryEntry.h have NOT been reverted as a number
of subsequent commits depend on those.)
https://reviews.llvm.org/D90497#2582166
Craig Topper [Tue, 23 Feb 2021 17:40:30 +0000 (09:40 -0800)]
[LegalizeIntegerTypes] Improve ExpandIntRes_SADDSUBO codegen on targets without SADDO/SSUBO.
This code creates 3 setccs that need to be expanded. It was
creating a sign bit test as setge X, 0 which is non-canonical.
Canonical would be setgt X, -1. This misses the special case in
IntegerExpandSetCCOperands for sign bit tests that assumes
canonical form. If we don't hit this special case we end up
with a multipart setcc instead of just checking the sign of
the high part.
To fix this I've reversed the polarity of all of the setccs to
setlt X, 0 which is canonical. The rest of the logic should
still work. This seems to produce better code on RISCV which
lacks a setgt instruction.
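The sign-test logic can be sketched in plain C++. This is only an illustration of the arithmetic, with hypothetical function names: a signed-add overflow check built from three "is negative" tests (the `setlt X, 0` polarity), and a sign test on a value split into two 32-bit halves that inspects only the high half.

```cpp
#include <cassert>
#include <cstdint>

// Sign test on a 64-bit value held as two 32-bit halves: only the top bit of
// the high half matters, so no multipart comparison is needed.
bool isNegativeSplit(std::uint32_t Hi, std::uint32_t /*Lo*/) {
  return (Hi >> 31) != 0;
}

// Signed add overflows iff both operands have the same sign and the wrapped
// sum has a different sign. All three tests are "X < 0".
bool saddOverflows(std::int64_t A, std::int64_t B) {
  // Add as unsigned so the wraparound is well defined.
  std::uint64_t Sum =
      static_cast<std::uint64_t>(A) + static_cast<std::uint64_t>(B);
  bool ANeg = A < 0;
  bool BNeg = B < 0;
  bool SumNeg = (Sum >> 63) != 0;
  return (ANeg == BNeg) && (SumNeg != ANeg);
}
```

Flipping every test from "non-negative" to "negative" leaves the equality and inequality comparisons between the flags unchanged, which is why the rest of the logic keeps working after the polarity switch.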
This probably still isn't the best code sequence we could use here.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D97181
Nick Desaulniers [Tue, 23 Feb 2021 17:11:23 +0000 (09:11 -0800)]
[THUMB2] add .w suffixes for ldr/str (immediate) T4
The Linux kernel when built with CONFIG_THUMB2_KERNEL makes use of these
instructions with immediate operands and wide encodings.
These are the T4 variants of the following sections from the Arm ARM.
F5.1.72 LDR (immediate)
F5.1.229 STR (immediate)
I wasn't able to represent these simple aliases using t2InstAlias due to
the Constraints on the non-suffixed existing instructions, which results
in some manual parsing logic needing to be added.
F1.2 Standard assembler syntax fields
describes the use of the .w (wide) vs .n (narrow) encoding suffix.
Link: https://bugs.llvm.org/show_bug.cgi?id=49118
Link: https://github.com/ClangBuiltLinux/linux/issues/1296
Reported-by: Stefan Agner <stefan@agner.ch>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D96632
Emily Shi [Tue, 23 Feb 2021 17:23:02 +0000 (09:23 -0800)]
[darwin] use new crash reporter api
Add support for the new crash reporter api if the headers are available. Falls back to the old API if they are not available. This change was based on [[ https://github.com/llvm/llvm-project/blob/0164d546d2691c439fc06c8fff126224276c2d02/llvm/lib/Support/PrettyStackTrace.cpp#L111 | /llvm/lib/Support/PrettyStackTrace.cpp ]]
There is a lit test for this behavior here: https://reviews.llvm.org/D96737 but it is not included in this diff because it is potentially flaky.
rdar://69767688
Reviewed By: delcypher, yln
Committed by Dan Liew on behalf of Emily Shi.
Differential Revision: https://reviews.llvm.org/D96830
Emily Shi [Tue, 23 Feb 2021 17:22:01 +0000 (09:22 -0800)]
[darwin][asan] add test for application specific information in crash logs
Added a lit test that finds its corresponding crash log and checks to make sure it has asan output under `Application Specific Information`.
This required adding two python commands:
- `get_pid_from_output`: takes the output from the asan instrumentation and parses out the process ID
- `print_crashreport_for_pid`: takes in the pid of the process and the file name of the binary that was run and prints the contents of the corresponding crash log.
This test was added in preparation for changing the integration with crash reporter from the old api to the new api, which is implemented in a subsequent commit.
rdar://69767688
Reviewed By: delcypher
Committed by Dan Liew on behalf of Emily Shi.
Differential Revision: https://reviews.llvm.org/D96737
Jay Foad [Tue, 23 Feb 2021 16:10:19 +0000 (16:10 +0000)]
[GlobalISel] Make more use of replaceSingleDefInstWithReg. NFC.
Dave Lee [Sun, 21 Feb 2021 22:38:43 +0000 (14:38 -0800)]
[lldb] Add deref support and tests to shared_ptr synthetic
Add `frame variable` dereference support to libc++ `std::shared_ptr`.
This change allows for commands like `v *thing_sp` and `v thing_sp->m_id`. These
commands now work the same way they do with raw pointers. This is done by adding an
unaccounted for child member named `$$dereference$$`.
Also, add API tests for `std::shared_ptr`; previously there were none.
Differential Revision: https://reviews.llvm.org/D97165
Florian Hahn [Tue, 23 Feb 2021 16:57:21 +0000 (16:57 +0000)]
Revert "[LV] Allow tryToCreateWidenRecipe to return a VPValue, use for blends."
This reverts commit 4efa097eb4c87d7ffe09a95a5b4ff372bdddda85, because
some of the compilers used on some bots do not support automatic
conversions to PointerUnion.
Florian Hahn [Mon, 22 Feb 2021 19:44:47 +0000 (19:44 +0000)]
[LV] Allow tryToCreateWidenRecipe to return a VPValue, use for blends.
Generalize the return value of tryToCreateWidenRecipe to return either a
newly created recipe or an existing VPValue. Use this to avoid creating
unnecessary VPBlendRecipes.
Fixes PR44800.
Nicolai Hähnle [Mon, 3 Aug 2020 13:03:18 +0000 (15:03 +0200)]
[AMDGPU][SelectionDAG] Don't combine uniform multiplies to MUL_[UI]24
Prefer to keep uniform (non-divergent) multiplies on the scalar ALU when
possible. This significantly improves some game cases by eliminating
v_readfirstlane instructions when the result feeds into a scalar
operation, like the address calculation for a scalar load or store.
Since isDivergent is only an approximation of whether a value is in
SGPRs, it can potentially regress some situations where a uniform value
ends up in a VGPR. These should be rare in real code, although the test
changes do contain a number of examples.
Most of the test changes are just using s_mul instead of v_mul/mad which
is generally better for both register pressure and latency (at least on
GFX10 where sgpr pressure doesn't affect occupancy and vector ALU
instructions have significantly longer latency than scalar ALU). Some
R600 tests now use MULLO_INT instead of MUL_UINT24.
GlobalISel appears to handle more scenarios in the desirable way,
although it can also be thrown off and fail to select the 24-bit
multiplies in some cases.
An alternative solution, considered and rejected, was to allow selecting
MUL_[UI]24 to S_MUL_I32. I've rejected this because the definition of
those SD operations is don't-care on the most significant 8 bits,
and this fact is used in some combines via SimplifyDemandedBits.
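The don't-care behaviour on the most significant 8 bits can be illustrated with a small standalone model. This is only a sketch: `mul_u24` and `mul_i32` are hypothetical helper names that model the semantics described above; they are not LLVM or hardware APIs.

```cpp
#include <cstdint>

// Hypothetical model of an unsigned 24-bit multiply: only the low 24 bits of
// each operand are read, so the top 8 bits of the inputs are "don't care".
constexpr std::uint32_t mul_u24(std::uint32_t a, std::uint32_t b) {
  return (a & 0xFFFFFFu) * (b & 0xFFFFFFu);
}

// Hypothetical model of a full 32-bit multiply, which reads every operand bit.
constexpr std::uint32_t mul_i32(std::uint32_t a, std::uint32_t b) {
  return a * b;
}
```

If a combine has left arbitrary bits in the top byte of an operand, the two models disagree, which suggests why the 24-bit node cannot simply be selected to a full 32-bit multiply.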
Based on a patch by Nicolai Hähnle.
Differential Revision: https://reviews.llvm.org/D97063
Juneyoung Lee [Tue, 23 Feb 2021 14:35:12 +0000 (23:35 +0900)]
[JumpThreading] Update computeValueKnownInPredecessors to recognize logical and/or patterns
This allows JumpThreading's computeValueKnownInPredecessors to
recognize select form of and/or patterns as well.
Jay Foad [Tue, 23 Feb 2021 14:42:50 +0000 (14:42 +0000)]
[AMDGPU] Rename a prefix for sanity. NFC.
Nate Chandler [Mon, 22 Feb 2021 23:04:51 +0000 (15:04 -0800)]
Add @llvm.coro.async.size.replace intrinsic.
The new intrinsic replaces the size in one specified AsyncFunctionPointer with
the size in another. This ability is necessary for functions which merely
forward to async functions such as those defined for partial applications.
Reviewed By: aschwaighofer
Differential Revision: https://reviews.llvm.org/D97229
Jessica Clarke [Tue, 23 Feb 2021 14:17:15 +0000 (14:17 +0000)]
[Driver][NFC] Add explicit break to final case
Martin Storsjö [Mon, 2 Nov 2020 06:13:26 +0000 (08:13 +0200)]
[libcxx] [test] Define _CRT_STDIO_ISO_WIDE_SPECIFIERS while building tests
This matches how libc++ itself is built. This avoids errors due to
mismatch if linking libc++ statically.
Differential Revision: https://reviews.llvm.org/D97169
Nathan James [Tue, 23 Feb 2021 13:48:06 +0000 (13:48 +0000)]
[clang-tidy] Remove IncludeInserter from MoveConstructorInit check.
This check registers an IncludeInserter; however, the check itself doesn't actually emit any fixes or includes, so the inserter is redundant.
From what I can tell the fixes were removed in D26453 (rL290051) but the inserter was left in, probably an oversight.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97243
Joe Ellis [Fri, 19 Feb 2021 17:09:50 +0000 (17:09 +0000)]
[clang][SVE] Don't warn on vector to sizeless builtin implicit conversion
This commit prevents warnings from -Wconversion when a clang vector type
is implicitly converted to a sizeless builtin type -- for example, when
implicitly converting a fixed-predicate to a scalable predicate.
The code below:
#include <arm_sve.h>

#define N __ARM_FEATURE_SVE_BITS
#define FIXED_ATTR __attribute__((arm_sve_vector_bits(N)))
typedef svbool_t fixed_svbool_t FIXED_ATTR;

inline fixed_svbool_t foo(fixed_svbool_t p) {
  return svnot_z(svptrue_b64(), p);
}
would previously raise this warning:
warning: implicit conversion turns vector to scalar: \
'fixed_svbool_t' (vector of 8 'unsigned char' values) to 'svbool_t' \
(aka '__SVBool_t') [-Wconversion]
Note that many cases of these implicit conversions were already
permitted because many functions inside arm_sve.h are spawned via
preprocessor macros, and the call to isInSystemMacro would cover us in
this case. This commit fixes the remaining cases.
Differential Revision: https://reviews.llvm.org/D97053
Balázs Kéri [Mon, 22 Feb 2021 16:16:51 +0000 (17:16 +0100)]
[clang-tidy] Extending bugprone-signal-handler with POSIX functions.
An option is added to the check to select which set of functions is
defined as asynchronous-safe functions.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D90851
Michał Górny [Tue, 23 Feb 2021 12:43:40 +0000 (13:43 +0100)]
[lldb] [test] Un-XFAIL TestBuiltinTrap on FreeBSD/aarch64
Michał Górny [Tue, 23 Feb 2021 12:34:04 +0000 (13:34 +0100)]
[lldb] [test] Un-XFAIL a test that no longer fails on FreeBSD
Simon Pilgrim [Tue, 23 Feb 2021 13:31:26 +0000 (13:31 +0000)]
[X86] Cleanup overflow test check prefixes. NFCI.
Tidy up the check prefixes to improve reuse.
Jay Foad [Fri, 19 Feb 2021 15:04:03 +0000 (15:04 +0000)]
[AMDGPU] Use divergent addresses for vector loads
Change some test cases to use divergent addresses for vector loads,
which should be the common case in real world code. Using uniform
addresses causes poor instruction selection for the surrounding
code which has to be fixed up post-register-allocation, and this causes
a lot of testsuite churn for a forthcoming patch to stop selecting
24-bit vector multiply instructions for uniform multiplies.
This shows up some problems in the idot tests where we fail to select
v_dot instructions because the patterns only match MUL_[UI]24 ISD nodes,
but the DAG contains i16 mul nodes instead.
Differential Revision: https://reviews.llvm.org/D97062
Sjoerd Meijer [Tue, 23 Feb 2021 12:58:03 +0000 (12:58 +0000)]
[ARM] do not consider sp as deprecated for ldm/stm
Early versions of the ARMv7 reference manuals considered the sp register
as a deprecated register for ldm/stm family of instructions. However,
later versions such as ARM DDI 0406C.d added a note to the Appendix:
D9.3 Use of the SP as a general-purpose register
Most ARM instructions, unlike Thumb instructions, provide exactly the
same access to the SP as to R0-R12. This means that it is possible to
use the SP as a general-purpose register. Earlier issues of this manual
deprecated the use of SP in an ARM instruction, in any way that is
deprecated, not permitted, or not possible in the corresponding
Thumb instruction. However, user feedback indicates a number of cases
where these instructions are useful. Therefore, ARM no longer deprecates
these instruction uses.
Also, Armv8 manuals no longer consider SP a deprecated register for
ldm/stm A32 instructions.
Furthermore, GNU as also does not print a deprecated warning when using
SP with those instructions.
Drop deprecation warning for pop/ldm/push/stm instructions.
Patch by: Stefan Agner.
Differential Revision: https://reviews.llvm.org/D82692
David Green [Tue, 23 Feb 2021 13:04:59 +0000 (13:04 +0000)]
[TTI] Change getOperandsScalarizationOverhead to take Type args
As a followup to D95291, getOperandsScalarizationOverhead was still
using a VF as a vector factor if the arguments were scalar, and would
assert on certain matrix intrinsics with differently sized vector
arguments. This patch removes the VF arg, instead passing the Types
through directly. This should allow it to more accurately compute the
cost without having to guess at which operands will be vectorized,
something difficult with more complex intrinsics.
This adjusts one SVE test as it is now calling the wrong intrinsic vs
veccall. Without invalid InstructCosts the cost of the scalarized
intrinsic is too low. This should get fixed when the cost of
scalarization is accounted for with scalable types.
Differential Revision: https://reviews.llvm.org/D96287
David Green [Tue, 23 Feb 2021 13:03:26 +0000 (13:03 +0000)]
[CostModel] Remove VF from IntrinsicCostAttributes
getIntrinsicInstrCost takes an IntrinsicCostAttributes holding various
parameters of the intrinsic being costed. It can either be called with a
scalar intrinsic (RetTy==Scalar, VF==1), with a vector instruction
(RetTy==Vector, VF==1) or from the vectorizer with a scalar type and
vector width (RetTy==Scalar, VF>1). A RetTy==Vector, VF>1 is considered
an error. Both of the vector modes are expected to be treated the same,
but because this is confusing many backends end up getting it wrong.
Instead of trying to work with those two values separately, this removes the
VF parameter, widening the RetTy/ArgTys by VF when called from the
vectorizer. This keeps things simpler, but does require some other
modifications to keep things consistent.
Most backends look like this will be an improvement (or were not using
getIntrinsicInstrCost). AMDGPU needed the most changes to keep the code
from c230965ccf36af5c88c working. ARM removed the fix in
dfac521da1b90db683, WebAssembly happens to get a fixup for an SLP cost
issue and both X86 and AArch64 seem to now be using better costs from
the vectorizer.
Differential Revision: https://reviews.llvm.org/D95291
Nathan James [Tue, 23 Feb 2021 13:01:16 +0000 (13:01 +0000)]
[clang-tidy] Update checks list.
Timm Bäder [Tue, 23 Feb 2021 12:20:28 +0000 (13:20 +0100)]
[clang][parse][NFC] Remove dead ProhibitAttributes() call
GNU-style attributes in enum bodies are allowed (and used by several
tests), and this call to ProhibitAttributes() was dead code.
Differential Revision: https://reviews.llvm.org/D97271
Florian Schmaus [Tue, 23 Feb 2021 12:38:11 +0000 (12:38 +0000)]
[clang-tidy] Install run-clang-tidy.py in bin/ as run-clang-tidy
The run-clang-tidy.py helper script is supposed to be used by the
user, hence it should be placed in the user's PATH. Some
distributions, like Gentoo [1], won't have it in PATH unless it is
installed in bin/.
Furthermore, installed scripts in PATH usually do not carry a filename
extension, since there is no need to know that this is a Python
script. For example Debian and Ubuntu already install this script as
'run-clang-tidy' [2] and hence build systems like Meson also look for
this name first [3]. Hence we install run-clang-tidy.py as
run-clang-tidy, as suggested by Sylvestre Ledru [4].
1: https://bugs.gentoo.org/753380
2: https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/60aefb14171ab5c3867a0081844b507fc9f6e015/debian/clang-tidy-X.Y.links.in#L2
3: https://github.com/mesonbuild/meson/blob/b6dc4d5e5c6e838de0b52e62d982ba2547eb366d/mesonbuild/scripts/clangtidy.py#L44
4: https://reviews.llvm.org/D90972#2380640
Reviewed By: sylvestre.ledru, JonasToth
Differential Revision: https://reviews.llvm.org/D90972
Matteo Favaro [Tue, 23 Feb 2021 10:22:53 +0000 (10:22 +0000)]
[DSE] Allow ptrs defined in the entry block in IsGuaranteedLoopInvariant.
The **IsGuaranteedLoopInvariant** function is making sure to check if the
incoming pointer is guaranteed to be loop invariant, therefore I think
the case where the pointer is defined in the entry block of a function
automatically guarantees the pointer to be loop invariant, as the entry
block of a function cannot have predecessors or be part of a loop.
I implemented this small patch and tested it using
**ninja check-llvm-unit** and **ninja check-llvm**. I added a contained test
file that shows the problem and used **opt -O3 -debug** on it to make sure
the case is not currently handled (in fact the debug log is showing that
the DSE pass is bailing out when testing if the killer store is able to
clobber the dead store).
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96979
Hsiangkai Wang [Tue, 23 Feb 2021 05:49:18 +0000 (13:49 +0800)]
[RISCV] vle1.v/vse1.v should be unmasked instructions.
vle1.v/vse1.v should be unmasked instructions. The vm encoding is 1 for
unmasked instructions.
Differential Revision: https://reviews.llvm.org/D97237
Anastasia Stulova [Tue, 23 Feb 2021 11:44:13 +0000 (11:44 +0000)]
[OpenCL][Docs] Change description for the OpenCL standard headers.
After updating the user interface in D96515, update the docs
to reflect the new approach.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D96616
Nicolas Vasilache [Tue, 23 Feb 2021 08:52:55 +0000 (08:52 +0000)]
[mlir][Linalg] Retire hoistViewAllocOps.
This transformation was only used for quick experimentation and is not general enough.
Retire it.
Differential Revision: https://reviews.llvm.org/D97266
Simon Pilgrim [Tue, 23 Feb 2021 11:41:51 +0000 (11:41 +0000)]
Fix Wdocumentation parameter warning. NFCI.
Nicolas Vasilache [Tue, 23 Feb 2021 11:01:05 +0000 (11:01 +0000)]
[mlir] NFC - Use declarative assembly for scf::YieldOp
Raphael Isemann [Tue, 23 Feb 2021 11:10:39 +0000 (12:10 +0100)]
[lldb][NFC] Remove unused ValueObject::LogValueObject functions
Those functions aren't called anywhere. For debugging purposes we usually
have Dump() methods (which already exist in some semi-functional form in
ValueObject).
Alexey Lapshin [Mon, 8 Feb 2021 15:11:39 +0000 (18:11 +0300)]
[Support] Add reserve() method to the raw_ostream.
If the resulting size of the output stream is already known,
the space for the stream data can be allocated in advance
in some cases. For example, raw_string_ostream can
preallocate space in the target string, which avoids
reallocations while writing into the stream.
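The effect of reserving up front can be sketched with the standard library alone. `StringSink` below is a hypothetical stand-in for a string-backed stream, not llvm::raw_string_ostream, and its `reserve` signature is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical string-backed sink used only to illustrate preallocation.
class StringSink {
public:
  explicit StringSink(std::string &target) : out_(target) {}

  // Make room for `extra` more bytes beyond what has already been written.
  void reserve(std::size_t extra) { out_.reserve(out_.size() + extra); }

  // Append a piece of text to the target string.
  StringSink &operator<<(const std::string &piece) {
    out_.append(piece);
    return *this;
  }

private:
  std::string &out_;
};
```

After a single `reserve` call covering the final size, appends that stay within that size do not change the string's capacity or move its buffer.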
Differential Revision: https://reviews.llvm.org/D91693
Raphael Isemann [Tue, 23 Feb 2021 11:01:29 +0000 (12:01 +0100)]
[lldb][NFC] Clean up ValueObject comments
* Remove commented out code.
* Doxygenify comments that serve as documentation.
* Use the LLVM comment style where possible.
David Green [Tue, 23 Feb 2021 10:53:22 +0000 (10:53 +0000)]
[ARM] Add pre/post inc tests of various sizes. NFC
Andy Wingo [Tue, 23 Feb 2021 10:23:31 +0000 (11:23 +0100)]
Revert "[WebAssembly] call_indirect issues table number relocs"
This reverts commit
861dbe1a021e6439af837b72b219fb9c449a57ae. It broke
emscripten -- see https://reviews.llvm.org/D90948#2578843.
Fraser Cormack [Thu, 18 Feb 2021 16:48:49 +0000 (16:48 +0000)]
[RISCV] Support insertion of misaligned subvectors
This patch extends the support for RVV INSERT_SUBVECTOR to cover those
which don't align to a vector register boundary. Like the support for
EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the
nearest register-sized subvector (a subregister operation), then sliding
the vector down with VSLIDEDOWN, inserting the subvector to the first
position, and sliding the vector back up again afterwards.
Unlike subvector extraction, for vectors that occupy less than a full
vector register we must preserve the untouched elements. We do this by
lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and
lowering that to a VSLIDEUP with a zero offset. This uses a
tail-undisturbed policy and so has the effect of "sliding in" the
subvector elements while preserving the surrounding ones.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D96972
Frederik Gossen [Tue, 23 Feb 2021 10:17:40 +0000 (11:17 +0100)]
Fix unused variable
Sven van Haastregt [Tue, 23 Feb 2021 10:18:14 +0000 (10:18 +0000)]
[OpenCL] Move remaining defines to opencl-c-base.h
Move any remaining preprocessor defines from `opencl-c.h` to
`opencl-c-base.h`, such that they are shared with
`-fdeclare-opencl-builtins` too.
In particular, move:
- the `as_type` and `as_typen` definitions, and
- the `kernel_exec` and `__kernel_exec` definitions.
Also clang-format the changes.
Differential Revision: https://reviews.llvm.org/D96948
Raphael Isemann [Tue, 23 Feb 2021 09:38:48 +0000 (10:38 +0100)]
[lldb][NFC] Give CompilerType's IsArrayType/IsVectorType/IsBlockPointerType out-parameters default values
We already do this for most functions that have out-parameters, so let's do
the same here and avoid all the `nullptr, nullptr, nullptr` in every call.
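A minimal sketch of the pattern, using a hypothetical `FakeType` rather than lldb's CompilerType (the member names and parameter list are illustrative assumptions):

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a type query with out-parameters.
struct FakeType {
  bool is_array;
  std::uint64_t length;
  bool incomplete;

  // Out-parameters default to nullptr, so callers that only need the boolean
  // answer do not have to spell out the unused arguments.
  bool IsArrayType(std::uint64_t *size = nullptr,
                   bool *is_incomplete = nullptr) const {
    if (!is_array)
      return false;
    if (size)
      *size = length;
    if (is_incomplete)
      *is_incomplete = incomplete;
    return true;
  }
};
```

A caller can then write `t.IsArrayType()` for a plain check, or pass only the leading out-parameters it cares about.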
Martin Liska [Tue, 23 Feb 2021 09:11:07 +0000 (10:11 +0100)]
Fix UBSAN in __ubsan::Value::getSIntValue
/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128'
#0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77
#1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190
#2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338
#3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370
#4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f)
#5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24)
#6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd)
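One common way to avoid this class of error when sign-extending is to perform the left shift on an unsigned type and only then convert to the signed type. The sketch below uses 64-bit integers instead of `__int128`, and `sign_extend` is a hypothetical helper, not the function touched by this patch.

```cpp
#include <cstdint>

// Sign-extend the low `bits` bits of `raw` (1 <= bits <= 64) to a 64-bit
// signed value. The left shift happens on an unsigned type, where discarding
// high bits is well defined. The conversion to signed and the right shift of
// a negative value are implementation-defined before C++20, but behave as
// two's complement / arithmetic shift on mainstream compilers.
inline std::int64_t sign_extend(std::uint64_t raw, unsigned bits) {
  const unsigned extra = 64 - bits;
  const std::uint64_t shifted = raw << extra;          // unsigned shift, no UB
  return static_cast<std::int64_t>(shifted) >> extra;  // arithmetic shift back
}
```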
Differential Revision: https://reviews.llvm.org/D97263
Luís Marques [Tue, 23 Feb 2021 09:23:37 +0000 (09:23 +0000)]
[Sanitizer][NFC] Fix typo
Raphael Isemann [Tue, 23 Feb 2021 09:14:43 +0000 (10:14 +0100)]
[lldb][NFC] Don't inherit from UserID in ValueObject
ValueObject inherits from UserID which is just a bad idea:
* The inheritance gives ValueObject some member functions that are at best
misleading (such as `Clear()` which doesn't clear any value besides `id`).
* It allows passing ValueObject to the overloaded operators for UserID (such as
`==` or `<<` which won't actually compare or print anything in the ValueObject).
* It exposes `SetID` and `Clear`, which both allow users to change the
internal id value.
Similar to D91699, which did the same for Process.
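The difference between inheriting from an id class and holding one as a member can be sketched as follows. Both classes are hypothetical, simplified stand-ins and do not reproduce the real lldb declarations.

```cpp
#include <cstdint>

// Hypothetical, simplified id wrapper.
class UserID {
public:
  explicit UserID(std::uint64_t id = 0) : m_id(id) {}
  std::uint64_t GetID() const { return m_id; }
  void SetID(std::uint64_t id) { m_id = id; }
  void Clear() { m_id = 0; }

private:
  std::uint64_t m_id;
};

// Operators like this one accept anything that derives from UserID.
inline bool operator==(const UserID &a, const UserID &b) {
  return a.GetID() == b.GetID();
}

// Holding the id as a private member exposes only a read accessor: there is
// no inherited SetID()/Clear(), and the object no longer converts to UserID.
class ValueObjectSketch {
public:
  explicit ValueObjectSketch(std::uint64_t id) : m_id(id) {}
  std::uint64_t GetID() const { return m_id.GetID(); }

private:
  UserID m_id;
};
```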
Reviewed By: #lldb, JDevlieghere
Differential Revision: https://reviews.llvm.org/D97205
Liu, Chen3 [Tue, 23 Feb 2021 05:53:47 +0000 (13:53 +0800)]
[X86] Support amx-int8 intrinsic.
Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.
Differential Revision: https://reviews.llvm.org/D97259
River Riddle [Tue, 23 Feb 2021 08:51:57 +0000 (00:51 -0800)]
[mlir] Add support for DebugCounters using the new DebugAction infrastructure
DebugCounters allow for selectively enabling the execution of a debug action based upon a "counter". This counter is comprised of two components that are used in the control of execution of an action, a "skip" value and a "count" value. The "skip" value is used to skip a certain number of initial executions of a debug action. The "count" value is used to prevent a debug action from executing after it has executed for a set number of times (not including any executions that have been skipped). For example, a counter for a debug action with `skip=47` and `count=2`, would skip the first 47 executions, then execute twice, and finally prevent any further executions.
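The skip/count behaviour described above can be modelled in a few lines. `SkipCountCounter` is a hypothetical class written for illustration; it is not the MLIR DebugCounter implementation.

```cpp
#include <cstdint>

// Hypothetical model of the "skip" and "count" semantics described above.
class SkipCountCounter {
public:
  SkipCountCounter(std::int64_t skip, std::int64_t count)
      : skip_(skip), count_(count) {}

  // Returns true if the debug action should execute on this attempt.
  bool shouldExecute() {
    if (skip_ > 0) {   // still skipping the initial executions
      --skip_;
      return false;
    }
    if (count_ > 0) {  // executions remaining before the action is disabled
      --count_;
      return true;
    }
    return false;      // count exhausted: prevent any further executions
  }

private:
  std::int64_t skip_;
  std::int64_t count_;
};
```

With `skip=47` and `count=2`, the first 47 calls return false, the next two return true, and every later call returns false.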
This is effectively the same as the DebugCounter infrastructure in LLVM, but using the DebugAction infrastructure in MLIR. We can't simply reuse the DebugCounter support already present in LLVM due to its heavy reliance on global constructors (which are not allowed in MLIR). The DebugAction infrastructure already nicely supports the debug counter use case, and promotes the separation of policy and mechanism design philosophy.
Differential Revision: https://reviews.llvm.org/D96395
River Riddle [Tue, 23 Feb 2021 08:51:49 +0000 (00:51 -0800)]
[mlir] Add a new debug action framework.
This revision adds the infrastructure for `Debug Actions`. This is a DEBUG only
API that allows for external entities to control various aspects of compiler
execution. This is conceptually similar to something like DebugCounters in LLVM, but at a lower level. This framework doesn't make any assumptions about how the higher level driver is controlling the execution, it merely provides a framework for connecting the two together. This means that on top of DebugCounter functionality, we could also provide more interesting drivers such as interactive execution. A high level overview of the workflow surrounding debug actions is
shown below:
* Compiler developer defines an `action` that is taken by a pass,
transformation, or utility that they are developing.
* Depending on the needs, the developer dispatches various queries, pertaining
to this action, to an `action manager` that will provide an answer as to
what behavior the action should take.
* An external entity registers an `action handler` with the action manager,
and provides the logic to resolve queries on actions.
The exact definition of an `external entity` is left opaque, to allow for more
interesting handlers.
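The workflow above can be sketched as a tiny manager that forwards queries to registered handlers. All names here are hypothetical and do not correspond to MLIR's API; the policy that every handler must agree is also an assumption made for this sketch.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified model of an action manager.
class ActionManager {
public:
  // A handler answers the query "should the action with this tag execute?".
  using Handler = std::function<bool(const std::string &action_tag)>;

  // An external entity registers logic that resolves queries on actions.
  void registerHandler(Handler handler) {
    handlers_.push_back(std::move(handler));
  }

  // The pass or utility asks whether the action should run. In this sketch
  // every handler must agree; with no handlers registered, actions run.
  bool shouldExecute(const std::string &action_tag) const {
    for (const Handler &handler : handlers_)
      if (!handler(action_tag))
        return false;
    return true;
  }

private:
  std::vector<Handler> handlers_;
};
```

Because the manager only sees a callable, the handler could be a counter, a command-line filter, or an interactive prompt without the pass needing to know which.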
This framework was proposed here: https://llvm.discourse.group/t/rfc-debug-actions-in-mlir-debug-counters-for-the-modern-world
Differential Revision: https://reviews.llvm.org/D84986
Kadir Cetinkaya [Fri, 19 Feb 2021 12:14:55 +0000 (13:14 +0100)]
[clang][DeclPrinter] Pass Context into StmtPrinter whenever possible
ASTContext was only passed to the StmtPrinter in some places, while it
is always available in DeclPrinter. The context is used by StmtPrinter to better
print statements in some cases, like printing constants as written.
Differential Revision: https://reviews.llvm.org/D97043
Raphael Isemann [Tue, 23 Feb 2021 08:38:34 +0000 (09:38 +0100)]
[lldb][NFC] Cleanup ValueObject construction code
Just code cleanup for ValueObject constructors:
* Use default member initializers where possible.
* Doxygenify the comments for members and constructors where needed.
* Delete the default constructor which isn't defined.
* Initialize the bitfields via a utility struct instead of doing this in the
different constructors.
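A rough sketch of the two initialization techniques, assuming a hypothetical `NodeSketch` class whose members are illustrative and not taken from ValueObject. Bit-fields cannot take default member initializers before C++20, so a small struct with its own constructor is one way to initialize them in a single place.

```cpp
#include <string>
#include <utility>

// Hypothetical class with two constructors that share their initial state.
class NodeSketch {
public:
  explicit NodeSketch(std::string name) : m_name(std::move(name)) {}
  NodeSketch(std::string name, unsigned depth)
      : m_name(std::move(name)), m_depth(depth) {}

  const std::string &name() const { return m_name; }
  unsigned depth() const { return m_depth; }
  bool valueIsValid() const { return m_flags.value_is_valid; }
  bool isSynthetic() const { return m_flags.is_synthetic; }

private:
  std::string m_name;
  unsigned m_depth = 0;  // default member initializer

  // Utility struct: its constructor initializes all bit-fields once, so the
  // outer constructors do not have to repeat the list.
  struct Bitflags {
    bool value_is_valid : 1;
    bool is_synthetic : 1;
    Bitflags() : value_is_valid(false), is_synthetic(false) {}
  } m_flags;
};
```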
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D97199
Craig Topper [Tue, 23 Feb 2021 08:26:56 +0000 (00:26 -0800)]
[RISCV] Add test case for missed opportunity to use bgez for the canonical form X > -1. NFC
Juneyoung Lee [Tue, 23 Feb 2021 08:32:28 +0000 (17:32 +0900)]
[SimplifyCFG] Minor tweaks to the added tests (NFC)
Juneyoung Lee [Tue, 23 Feb 2021 08:15:17 +0000 (17:15 +0900)]
[SimplifyCFG] Add tests for D97244 (NFC)