Sanjay Patel [Mon, 11 Jan 2021 20:36:22 +0000 (15:36 -0500)]
[InstCombine] reduce icmp(ashr X, C1), C2 to sign-bit test
This is a more basic pattern that we should handle before trying to solve:
https://llvm.org/PR48640
There might be a better way to think about this because the pre-condition
that I came up with (number of sign bits in the compare constant) misses a
potential transform for each of ugt and ult as commented on in the test file.
Tried to model this in Alive:
https://rise4fun.com/Alive/juX1
...but I couldn't get the ComputeNumSignBits() pre-condition to work as
expected, so replaced with leading 0/1 preconditions instead.
Name: ugt
Pre: countLeadingZeros(C2) <= C1 && countLeadingOnes(C2) <= C1
%a = ashr %x, C1
%r = icmp ugt i8 %a, C2
=>
%r = icmp slt i8 %x, 0
Name: ult
Pre: countLeadingZeros(C2) <= C1 && countLeadingOnes(C2) <= C1
%a = ashr %x, C1
%r = icmp ult i4 %a, C2
=>
%r = icmp sgt i4 %x, -1
Also approximated in Alive2:
https://alive2.llvm.org/ce/z/u5hCcz
https://alive2.llvm.org/ce/z/__szVL
Differential Revision: https://reviews.llvm.org/D94014
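The two Alive preconditions above can also be checked by brute force for a small bit width. The following is a minimal Python sketch and not part of the patch: the helper names (`ashr`, `clz`, `clo`, `check_all`) are invented for illustration, the width is fixed at 8 bits, and the shift amount is assumed to be smaller than the bit width.

```python
BITS = 8                      # illustrative width; the Alive proofs use i8 and i4
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v

def ashr(x, c):
    """Arithmetic shift right on a BITS-wide value, returned as unsigned."""
    return (to_signed(x) >> c) & MASK

def clz(v):
    """Count leading zeros of a BITS-wide unsigned value."""
    return BITS - v.bit_length()

def clo(v):
    """Count leading ones of a BITS-wide unsigned value."""
    return clz(~v & MASK)

def precondition(c1, c2):
    return clz(c2) <= c1 and clo(c2) <= c1

def check_all():
    """Check both folds for every (C1, C2, x) that meets the precondition."""
    checked = 0
    for c1 in range(BITS):
        for c2 in range(1 << BITS):
            if not precondition(c1, c2):
                continue
            for x in range(1 << BITS):
                a = ashr(x, c1)
                # icmp ugt (ashr x, C1), C2  ==>  icmp slt x, 0
                assert (a > c2) == (to_signed(x) < 0)
                # icmp ult (ashr x, C1), C2  ==>  icmp sgt x, -1
                assert (a < c2) == (to_signed(x) > -1)
                checked += 1
    return checked
```

Calling `check_all()` raises an `AssertionError` if any combination that satisfies the precondition breaks either fold. It says nothing about the constants that the precondition rejects, which is where the missed ugt/ult cases mentioned above would be.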
Mehdi Amini [Mon, 11 Jan 2021 20:42:10 +0000 (20:42 +0000)]
Revert "[mlir][linalg] Support parsing attributes in named op spec"
This reverts commit df86f15f0c53c395dac5a14aba08745bc12b9b9b.
The gcc-5 build was broken by this change:
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp:1275:77: required from here
/usr/include/c++/5/ext/new_allocator.h:120:4: error: no matching function for call to 'std::pair<const std::__cxx11::basic_string<char>, {anonymous}::TCParser::RegisteredAttr>::pair(llvm::StringRef&, {anonymous}::TCParser::RegisteredAttr'
Stella Laurenzo [Sun, 10 Jan 2021 03:01:39 +0000 (19:01 -0800)]
[mlir][CAPI] Introduce standard source layout for mlir-c dialect registration.
* Registers a small set of sample dialects.
* NFC with respect to existing C-API symbols but some headers have been moved down a level to the Dialect/ sub-directory.
* Adds an additional entry point per dialect that is needed for dynamic discovery/loading.
* See discussion: https://llvm.discourse.group/t/dialects-and-the-c-api/2306/16
Differential Revision: https://reviews.llvm.org/D94370
Stella Laurenzo [Sun, 10 Jan 2021 01:14:47 +0000 (17:14 -0800)]
Enable python bindings for tensor, shape and linalg dialects.
* We've got significant missing features in order to use most of these effectively (i.e. custom builders, region-based builders).
* We presently also lack a mechanism for actually registering these dialects but they can be used with contexts that allow unregistered dialects for further prototyping.
Differential Revision: https://reviews.llvm.org/D94368
Thomas Raoux [Mon, 11 Jan 2021 18:35:30 +0000 (10:35 -0800)]
[mlir][vector] Add side-effect information to different load/store ops
Differential Revision: https://reviews.llvm.org/D94434
Mircea Trofin [Mon, 11 Jan 2021 17:34:45 +0000 (09:34 -0800)]
[NFC] Disallow unused prefixes under llvm/test/CodeGen
This patch finishes addressing unused prefixes under CodeGen: 2
remaining tests fixed, and then undo-ing the lit.local.cfg changes under
various subdirs and moving the policy under CodeGen.
Differential Revision: https://reviews.llvm.org/D94430
Abhina Sreeskantharajan [Mon, 11 Jan 2021 20:13:40 +0000 (15:13 -0500)]
[tools] Mark output of tools as text if it is really text
This is a continuation of https://reviews.llvm.org/D67696. The following tools also need to set the OF_Text flag correctly.
- llvm-profdata
- llvm-link
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D94313
Nathan James [Mon, 11 Jan 2021 20:12:53 +0000 (20:12 +0000)]
[ADT] Add makeIntrusiveRefCnt helper function
Works like std::make_unique but for IntrusiveRefCntPtr objects.
See https://lists.llvm.org/pipermail/llvm-dev/2021-January/147729.html
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D94440
River Riddle [Mon, 11 Jan 2021 19:55:09 +0000 (11:55 -0800)]
[mlir][IR][NFC] Move the definitions of Complex/Function/Integer/Opaque/TupleType to ODS
The type tablegen backend now has enough support to represent these types well enough, so we can now move them to be declaratively defined.
Differential Revision: https://reviews.llvm.org/D94275
River Riddle [Mon, 11 Jan 2021 19:55:00 +0000 (11:55 -0800)]
[mlir][TypeDefGen] Add support for adding builders when generating a TypeDef
This allows for specifying additional get/getChecked methods that should be generated on the type, and acts similarly to how OpBuilders work. TypeBuilders have two additional components though:
* InferredContextParam
- Bit indicating that the context parameter of a get method is inferred from one of the builder parameters
* checkedBody
- A code block representing the body of the equivalent getChecked method.
Differential Revision: https://reviews.llvm.org/D94274
River Riddle [Mon, 11 Jan 2021 19:54:51 +0000 (11:54 -0800)]
[mlir][ODS] Add a C++ abstraction for OpBuilders
This removes the need for OpDefinitionsGen to use raw tablegen API, and will also
simplify adding builders to TypeDefs.
Differential Revision: https://reviews.llvm.org/D94273
Tony Tye [Sat, 9 Jan 2021 10:20:42 +0000 (10:20 +0000)]
[NFC][AMDGPU] Clarify memory model support for volatile
Reorder the AMDGPUUsage description of the memory model code sequences
for volatile so it is clear that it applies independent of the nontemporal
setting.
Differential Revision: https://reviews.llvm.org/D94358
Marek Kurdej [Mon, 11 Jan 2021 19:48:15 +0000 (20:48 +0100)]
[libc++] Turn off auto-formatting of generated files. NFC.
This adds `// clang-format off` in the auto-generated file to avoid lint warnings.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D94410
Fraser Cormack [Wed, 6 Jan 2021 11:25:25 +0000 (11:25 +0000)]
[RISCV] Add scalable vector fcmp ISel patterns
Original patch by @rogfer01.
All ordered comparisons except ONE are supported natively, and all
unordered comparisons except UNE are expanded into sequences involving
explicit NaN checks and mask arithmetic.
Additionally, we expand GT,OGT,GE,OGE to their swapped-operand versions, and
pattern-match those back to the "original", swapping operands once more. This
way we catch both operations and both "vf" and "fv" forms with fewer patterns.
Also add support for floating-point splat_vector, with an optimization for
splatting fpimm0.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94242
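The two ideas in this message, swapping operands to reuse one comparison and building unordered predicates from an ordered compare plus NaN checks, can be illustrated element-wise. This is a hedged Python sketch of the semantics only: the function names are invented, lists stand in for vectors and masks, and it does not reproduce the actual ISel patterns or instruction sequences from the patch.

```python
import math

def unordered(a, b):
    """Mask that is True where either lane is NaN."""
    return [math.isnan(x) or math.isnan(y) for x, y in zip(a, b)]

def olt(a, b):
    """Ordered less-than: False whenever a lane holds a NaN."""
    return [x < y for x, y in zip(a, b)]

def ogt(a, b):
    """Ordered greater-than expressed as less-than with swapped operands."""
    return olt(b, a)

def ult(a, b):
    """Unordered less-than: ordered result OR-ed with the NaN mask."""
    return [m or u for m, u in zip(olt(a, b), unordered(a, b))]
```

For example, `ult([1.0, float("nan")], [2.0, 0.0])` yields `[True, True]`, because the second lane is unordered.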
Christian Sigg [Mon, 11 Jan 2021 19:34:44 +0000 (20:34 +0100)]
[mlir] Add structural conversion to async dialect lowering.
Lowering of async dialect uses a fixed type converter and therefore does not support lowering non-standard types.
This revision adds a structural conversion so that non-standard types in `!async.value`s can be lowered to LLVM before lowering the async dialect itself.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D94404
Abhina Sreeskantharajan [Mon, 11 Jan 2021 19:27:10 +0000 (14:27 -0500)]
[SystemZ][z/OS] Fix Permission denied pattern matching
On z/OS, the error message "EDC5111I Permission denied." is not matched correctly in lit tests. This patch updates the check expression to match successfully.
Reviewed By: fanbo-meng
Differential Revision: https://reviews.llvm.org/D94432
David Stuttard [Mon, 11 Jan 2021 19:20:47 +0000 (11:20 -0800)]
Fix minor build issue (NFC)
The change "[X86] Fix tile register spill issue." was causing problems for our build
using gcc-5.4.1.
The problem was caused by this line:
for (const MachineInstr &MI : make_range(MIS.begin(), MI))
where MI was previously defined as a MachineBasicBlock iterator.
Differential Revision: https://reviews.llvm.org/D94415
Jamie Schmeiser [Mon, 11 Jan 2021 19:11:00 +0000 (14:11 -0500)]
Introduce new quiet mode and new option handling for -print-changed.
Summary:
Introduce a new mode of operation for -print-changed that only reports
after a pass changes the IR with all of the other messages suppressed (ie,
no initial IR and no messages about ignored, filtered or non-modifying
passes).
The option processing for -print-changed is changed to take an optional
string indicating options for print-changed. Initially, the only option
supported is quiet (as described above). This new quiet mode is specified
with -print-changed=quiet while -print-changed will continue to function
in the same way. It is intended that there will be more options in the
future.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D92589
Sriraman Tallam [Mon, 11 Jan 2021 19:08:36 +0000 (11:08 -0800)]
-funique-internal-linkage-names appends a hex md5hash suffix to the symbol name which is not demangler friendly, convert it to decimal.
Please see D93747 for more context which tries to make linkage names of internal
linkage functions to be the uniqueified names. This causes a problem with gdb
because breaking using the demangled function name will not work if the new
uniqueified name cannot be demangled. The problem is the generated suffix which
is a mix of integers and letters which do not demangle. The demangler accepts
either all numbers or all letters. This patch simply converts the hash to decimal.
There is no loss of uniqueness by doing this as the precision is maintained.
The symbol names get longer by a few characters though.
Differential Revision: https://reviews.llvm.org/D94154
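The claim that converting the hash to decimal loses no uniqueness can be illustrated with a small round trip. This is a sketch in Python rather than the Clang implementation; the function names are invented, and hashing a bare file name is an assumption made only for the example.

```python
import hashlib

def decimal_suffix(module_name):
    """Return the MD5 of module_name rendered as a decimal string."""
    hex_digest = hashlib.md5(module_name.encode("utf-8")).hexdigest()
    # The 128-bit value is unchanged; only the base of the spelling differs.
    return str(int(hex_digest, 16))

def hex_from_decimal(decimal):
    """Recover the 32-character hex digest from the decimal spelling."""
    return format(int(decimal), "032x")
```

A 128-bit value needs 32 hex characters but up to 39 decimal digits, which accounts for the slightly longer symbol names. The decimal form contains digits only, which is the property the demangler needs.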
Valentin Clement [Mon, 11 Jan 2021 19:08:35 +0000 (14:08 -0500)]
[flang][openxx][NFC] Remove duplicated function to check required clauses
Remove duplicated function to check for required clauses on a directive. This was
still there from the merging of OpenACC and OpenMP common semantic checks and it can now be
removed so we use only one function.
Reviewed By: sameeranjoshi
Differential Revision: https://reviews.llvm.org/D93575
Alex Zinenko [Fri, 8 Jan 2021 14:08:44 +0000 (15:08 +0100)]
[mlir] Expose MemRef layout in Python bindings
This wasn't possible before because there was no support for affine expressions
as maps. Now that this support is available, provide the mechanism for
constructing maps with a layout and inspecting it.
Rework the `get` method on MemRefType in Python to avoid needing an explicit
memory space or layout map. Remove `get_num_maps`: it is too low-level, and
using the length of the now-available pseudo-list of layout maps is more
Pythonic.
Depends On D94297
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94302
Alex Zinenko [Fri, 8 Jan 2021 12:36:27 +0000 (13:36 +0100)]
[mlir] More Python bindings for AffineMap
Now that the bindings for AffineExpr have been added, add more bindings for
constructing and inspecting AffineMap that consists of AffineExprs.
Depends On D94225
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94297
Alex Zinenko [Thu, 7 Jan 2021 10:09:09 +0000 (11:09 +0100)]
[mlir] Add Python bindings for AffineExpr
This adds the Python bindings for AffineExpr and a couple of utility functions
to the C API. AffineExpr is a top-level context-owned object and is modeled
similarly to attributes and types. It is required, e.g., to build layout maps
of the built-in memref type.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94225
Joe Nash [Tue, 5 Jan 2021 00:47:55 +0000 (19:47 -0500)]
[AMDGPU] Deduplicate VOP tablegen asm & ins
VOP3 and VOP DPP subroutines to generate input
operands and asm strings were essentially copy
pasted several times. They are deduplicated to
reduce the maintenance burden and allow faster
development.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D94102
Change-Id: I76225eed3c33239d9573351e0c8a0abfad0146ea
Krzysztof Parzyszek [Thu, 7 Jan 2021 15:30:56 +0000 (09:30 -0600)]
[Hexagon] Custom-widen SETCC's operands
The result cannot be widened, unfortunately, because widening vNi1
would depend on the context in which it appears (i.e. the type alone
is not sufficient to tell if it needs to be widened).
Sean Dooher [Mon, 11 Jan 2021 14:57:08 +0000 (06:57 -0800)]
[attributes] Add a facility for enforcing a Trusted Computing Base.
Introduce a function attribute 'enforce_tcb' that prevents the function
from calling other functions without the same attribute. This allows
isolating code that's considered to be somehow privileged so that it could not
use its privileges to exhibit arbitrary behavior.
Introduce an on-by-default warning '-Wtcb-enforcement' that warns
about violations of the above rule.
Introduce a function attribute 'enforce_tcb_leaf' that suppresses
the new warning within the function it is attached to. Such leaf functions
may implement common functionality between the trusted and the untrusted code
but they require extra careful audit with respect to their capabilities.
Fixes after a revert in 419ef38a50293c58078f830517f5e305068dbee6:
Fix a test.
Add workaround for GCC bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67274).
Attribute the patch appropriately!
Differential Revision: https://reviews.llvm.org/D91898
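The calling rule described above can be modeled on a toy call graph. The sketch below is Python, not Clang code, and rests on assumptions: each attribute is taken to carry a TCB name, `tcb_violations` and the dictionary layout are invented for illustration, and the real warning is issued by the compiler at each call site.

```python
def tcb_violations(attrs, calls):
    """List (caller, callee, tcb) triples that would break the rule.

    attrs maps a function name to {"tcb": set_of_names, "leaf": set_of_names};
    calls is a list of (caller, callee) edges. Both shapes are hypothetical.
    """
    violations = []
    for caller, callee in calls:
        caller_tcbs = attrs.get(caller, {}).get("tcb", set())
        callee_attrs = attrs.get(callee, {})
        for tcb in sorted(caller_tcbs):
            in_same_tcb = tcb in callee_attrs.get("tcb", set())
            is_leaf = tcb in callee_attrs.get("leaf", set())
            if not (in_same_tcb or is_leaf):
                violations.append((caller, callee, tcb))
    return violations
```

In this model an untrusted function may call anything, while a function inside a TCB may only call functions in the same TCB or functions marked as leaves for it.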
Simon Pilgrim [Mon, 11 Jan 2021 18:12:25 +0000 (18:12 +0000)]
[X86] Regenerate vector-constrained-fp-intrinsics.ll tests
Adding missing libcall PLT qualifier
Paul Robinson [Mon, 11 Jan 2021 17:33:55 +0000 (09:33 -0800)]
[FastISel] NFC: Clean up unnecessary bookkeeping
Now that we flush the local value map for every instruction, we don't
need any extra flushes for specific cases. Also, LastFlushPoint is
not used for anything. Follow-ups to #c161665 (D91734).
This reapplies #3fd39d3.
Differential Revision: https://reviews.llvm.org/D92338
Jonas Paulsson [Mon, 11 Jan 2021 16:21:44 +0000 (10:21 -0600)]
[SystemZ] Minor NFC fix in SchedModels.
The unused LRMux opcode was removed by 8f8c381, but a regexp still matched
for it in the scheduler files which is now removed.
Review: Ulrich Weigand
Fangrui Song [Mon, 11 Jan 2021 17:33:21 +0000 (09:33 -0800)]
[ELF] --exclude-libs: localize defined libcall symbols referenced by lto.tmp
Fixes PR48681: after LTO, lto.tmp may reference a libcall symbol not in an IR
symbol table of any bitcode file. If such a symbol is defined in an archive
matched by a --exclude-libs, we don't correctly localize the symbol.
Add another `excludeLibs` after `compileBitcodeFiles` to localize such libcall
symbols. Unfortunately we have to keep the existing one for D43126.
Using VER_NDX_LOCAL is an implementation detail of `--exclude-libs`, it does not
necessarily tie to the "localize" behavior. `local:` patterns in a version
script can be omitted.
The `symbol ... has undefined version ...` error should not be exempted.
Ideally we should error as GNU ld does. https://issuetracker.google.com/issues/73020933
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D94280
Paul Robinson [Mon, 11 Jan 2021 17:20:45 +0000 (09:20 -0800)]
[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option
This option is not used for anything after #c161665 (D91737).
This commit reapplies #a474657.
MaheshRavishankar [Mon, 11 Jan 2021 17:24:07 +0000 (09:24 -0800)]
[mlir][Linalg] Fix reshape fusion to reshape the outs instead of creating new tensors.
When fusing tensor_reshape ops with generic/indexed_Generic op, new
linalg.init_tensor operations were created for the `outs` of the fused
op. While correct (technically) it is better to just reshape the
original `outs` operands and rely on canonicalization of init_tensor
-> tensor_reshape to achieve the same effect.
Differential Revision: https://reviews.llvm.org/D93774
Thomas Raoux [Fri, 8 Jan 2021 17:46:41 +0000 (09:46 -0800)]
[mlir][vector] Add memory effects to transfer_read transfer_write ops
This allows more accurate modeling of the side effects and allows dead code
elimination to remove dead transfer ops.
Differential Revision: https://reviews.llvm.org/D94318
Mircea Trofin [Mon, 11 Jan 2021 04:15:05 +0000 (20:15 -0800)]
[NFC] Disallow unused prefixes in CodeGen/PowerPC tests.
Also removed where applicable.
Differential Revision: https://reviews.llvm.org/D94385
Scott Linder [Fri, 8 Jan 2021 21:55:57 +0000 (21:55 +0000)]
[Clang][Docs] Fix ambiguity in clang-offload-bundler docs
Differential Revision: https://reviews.llvm.org/D94338
MaheshRavishankar [Mon, 11 Jan 2021 17:21:39 +0000 (09:21 -0800)]
[mlir][Linalg] Fold init_tensor -> linalg.tensor_reshape.
Reshaping an init_tensor can be folded to an init_tensor op of the
final type.
Differential Revision: https://reviews.llvm.org/D93773
Simon Pilgrim [Mon, 11 Jan 2021 16:59:07 +0000 (16:59 +0000)]
[X86][AVX] Attempt to fold vpermf128(op(x,i),op(y,i)) -> op(vpermf128(x,y),i)
If vpermf128/vpermi128 is acting on 2 similar 'inlane' ops, then try to perform the vpermf128 first which will allow us to merge the ops.
This will help us fix one of the regressions in D56387
Paul Robinson [Mon, 11 Jan 2021 16:32:36 +0000 (08:32 -0800)]
[FastISel] Flush local value map on every instruction
Local values are constants or addresses that can't be folded into
the instruction that uses them. FastISel materializes these in a
"local value" area that always dominates the current insertion
point, to try to avoid materializing these values more than once
(per block).
https://reviews.llvm.org/D43093 added code to sink these local
value instructions to their first use, which has two beneficial
effects. One, it is likely to avoid some unnecessary spills and
reloads; two, it allows us to attach the debug location of the
user to the local value instruction. The latter effect can
improve the debugging experience for debuggers with a "set next
statement" feature, such as the Visual Studio debugger and PS4
debugger, because instructions to set up constants for a given
statement will be associated with the appropriate source line.
There are also some constants (primarily addresses) that could be
produced by no-op casts or GEP instructions; the main difference
from "local value" instructions is that these are values from
separate IR instructions, and therefore could have multiple users
across multiple basic blocks. D43093 avoided sinking these, even
though they were emitted to the same "local value" area as the
other instructions. The patch comment for D43093 states:
Local values may also be used by no-op casts, which adds the
register to the RegFixups table. Without reversing the RegFixups
map direction, we don't have enough information to sink these
instructions.
This patch undoes most of D43093, and instead flushes the local
value map after(*) every IR instruction, using that instruction's
debug location. This avoids sometimes incorrect locations used
previously, and emits instructions in a more natural order.
In addition, constants materialized due to PHI instructions are
not assigned a debug location immediately; instead, when the
local value map is flushed, if the first local value instruction
has no debug location, it is given the same location as the
first non-local-value-map instruction. This prevents PHIs
from introducing unattributed instructions, which would either
be implicitly attributed to the location for the preceding IR
instruction, or given line 0 if they are at the beginning of
a machine basic block. Neither of those consequences is good
for debugging.
This does mean materialized values are not re-used across IR
instruction boundaries; however, only about 5% of those values
were reused in an experimental self-build of clang.
(*) Actually, just prior to the next instruction. It seems like
it would be cleaner the other way, but I was having trouble
getting that to work.
This reapplies commits cf1c774d and dc35368c, and adds the
modification to PHI handling, which should avoid problems
with debugging under gdb.
Differential Revision: https://reviews.llvm.org/D91734
Paul Robinson [Mon, 11 Jan 2021 15:03:08 +0000 (07:03 -0800)]
NFC: Use -LABEL more
There were a number of tests needing updates for D91734, and I added a
bunch of LABEL directives to help track down where those had to go.
These directives are an improvement independent of the functional
patch, so I'm committing them as their own separate patch.
Nathan James [Mon, 11 Jan 2021 16:14:26 +0000 (16:14 +0000)]
[clangd] Remove ScratchFS from tests
This can lead to issues if files that we don't care about or control are found in the tmp directory.
This was partially addressed in D94321, but this is a more permanent fix.
Fixes https://github.com/clangd/clangd/issues/354
Reviewed By: adamcz, sammccall
Differential Revision: https://reviews.llvm.org/D94359
Giorgis Georgakoudis [Mon, 11 Jan 2021 16:03:08 +0000 (08:03 -0800)]
[OpenMPOpt][WIP] Expand parallel region merging
The existing implementation of parallel region merging applies only to
consecutive parallel regions that have speculatable sequential
instructions in-between. This patch lifts this limitation to expand
merging with any sequential instructions in-between, except calls to
unmergeable OpenMP runtime functions. In-between sequential instructions
in the merged region are sequentialized in a "master" region and any
output values are broadcast to the following parallel regions and the
sequential region continuation of the merged region.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90909
Haojian Wu [Mon, 11 Jan 2021 14:48:27 +0000 (15:48 +0100)]
[clangd] Fix -check mode doesn't respect any tidy configs.
Differential Revision: https://reviews.llvm.org/D94411
Ranjeet Singh [Mon, 11 Jan 2021 15:37:51 +0000 (15:37 +0000)]
[ARM] Update existing test case with +pauth targets
Differential Revision: https://reviews.llvm.org/D94414
Simon Pilgrim [Mon, 11 Jan 2021 15:27:08 +0000 (15:27 +0000)]
[X86] Extend lzcnt-cmp tests to test on non-lzcnt targets
Simon Pilgrim [Mon, 11 Jan 2021 15:02:43 +0000 (15:02 +0000)]
[X86] Add nounwind to lzcnt-cmp tests
Remove unnecessary cfi markup
Christian Sigg [Mon, 11 Jan 2021 13:46:26 +0000 (14:46 +0100)]
[mlir] Fix gpu-to-llvm lowering for gpu.alloc with dynamic sizes.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D94402
Nico Weber [Mon, 11 Jan 2021 14:51:06 +0000 (09:51 -0500)]
Revert "[attributes] Add a facility for enforcing a Trusted Computing Base."
This reverts commit c163aae45ef6b7f3bd99601195d3ce4aad5850c6.
Doesn't compile on some bots
(http://lab.llvm.org:8011/#/builders/98/builds/3387/steps/9/logs/stdio),
breaks tests on bots where it does compile
(http://45.33.8.238/linux/36843/step_7.txt).
Florian Hahn [Mon, 23 Nov 2020 15:44:50 +0000 (15:44 +0000)]
[VPlan] Unify value/recipe printing after VPDef transition.
This patch unifies the way recipes and VPValues are printed after the
transition to VPDef.
VPSlotTracker has been updated to iterate over all recipes and all
their defined values to number those. There is no need to number
values in Value2VPValue.
It also updates a few places that only used slot numbers for
VPInstruction. All recipes now can produce numbered VPValues.
Artem Dergachev [Mon, 11 Jan 2021 12:52:04 +0000 (04:52 -0800)]
[attributes] Add a facility for enforcing a Trusted Computing Base.
Introduce a function attribute 'enforce_tcb' that prevents the function
from calling other functions without the same attribute. This allows
isolating code that's considered to be somehow privileged so that it could not
use its privileges to exhibit arbitrary behavior.
Introduce an on-by-default warning '-Wtcb-enforcement' that warns
about violations of the above rule.
Introduce a function attribute 'enforce_tcb_leaf' that suppresses
the new warning within the function it is attached to. Such leaf functions
may implement common functionality between the trusted and the untrusted code
but they require extra careful audit with respect to their capabilities.
Differential Revision: https://reviews.llvm.org/D91898
Joe Ellis [Tue, 15 Dec 2020 17:20:11 +0000 (17:20 +0000)]
[DAGCombiner] Use getVectorElementCount inside visitINSERT_SUBVECTOR
This avoids TypeSize-/ElementCount-related warnings.
Differential Revision: https://reviews.llvm.org/D92747
Lei Zhang [Mon, 11 Jan 2021 14:08:21 +0000 (09:08 -0500)]
[mlir][linalg] Support permutation when lowering to loop nests
Linalg ops are perfect loop nests. When materializing the concrete
loop nest, the default order specified by the Linalg op's iterators
may not be the best for further CodeGen: targets frequently need
to plan the loop order in order to gain better data access. And
different targets can have different preferences. So there should
exist a way to control the order.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D91795
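To make the effect of a loop permutation concrete, the sketch below enumerates the iteration space of a loop nest in a chosen order. It is written in Python, not against the Linalg API: `iterate`, `extents` and `order` are hypothetical names, and `order[0]` is assumed to name the dimension that becomes the outermost loop.

```python
from itertools import product

def iterate(extents, order):
    """Yield index tuples of a loop nest whose loops follow `order`.

    extents[d] is the trip count of dimension d; order lists the dimensions
    from outermost to innermost loop.
    """
    assert sorted(order) == list(range(len(extents))), "order must be a permutation"
    for point in product(*(range(extents[d]) for d in order)):
        index = [0] * len(extents)
        for position, dim in enumerate(order):
            index[dim] = point[position]
        yield tuple(index)
```

With `extents=(2, 2)`, the default order `(0, 1)` varies the second index fastest, while `(1, 0)` varies the first index fastest. Both visit the same points, so only the data access pattern changes.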
Lei Zhang [Mon, 11 Jan 2021 13:50:00 +0000 (08:50 -0500)]
[mlir][linalg] Support parsing attributes in named op spec
With this, now we can specify a list of attributes on named ops
generated from the spec. The format is defined as
```
attr-id ::= bare-id (`?`)?
attr-typedef ::= type (`[` `]`)?
attr-def ::= attr-id `:` attr-typedef
tc-attr-def ::= `attr` `(` attr-def-list `)`
tc-def ::= `def` bare-id
`(`tensor-def-list`)` `->` `(` tensor-def-list`)`
(tc-attr-def)?
```
For example,
```
ods_def<SomeCppOp>
def some_op(...) -> (...)
attr(
f32_attr: f32,
i32_attr: i32,
array_attr : f32[],
optional_attr? : f32
)
```
where `?` means optional attribute and `[]` means array type.
Reviewed By: hanchung, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94240
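As a rough illustration of the `attr-def` rule in the grammar above, here is a small Python sketch that parses one attribute definition. It is not the parser in mlir-linalg-ods-gen: the function names are invented, identifiers are simplified to letters, digits and underscores, and whitespace is accepted more leniently than the grammar strictly says.

```python
import re

# attr-def ::= attr-id `:` attr-typedef, with an optional `?` and optional `[]`
ATTR_DEF = re.compile(
    r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(\?)?\s*:\s*"
    r"([A-Za-z_][A-Za-z0-9_]*)\s*(\[\s*\])?\s*$"
)

def parse_attr_def(text):
    """Parse one definition such as 'array_attr : f32[]' into a dict."""
    match = ATTR_DEF.match(text)
    if match is None:
        raise ValueError(f"not an attr-def: {text!r}")
    name, optional, type_name, array = match.groups()
    return {
        "name": name,
        "optional": optional is not None,
        "type": type_name,
        "is_array": array is not None,
    }

def parse_attr_def_list(text):
    """Parse a comma-separated attr-def-list."""
    return [parse_attr_def(part) for part in text.split(",")]
```

Applied to the example in the message, `optional_attr? : f32` comes out as optional and `array_attr : f32[]` as an array type.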
Andrzej Warzynski [Thu, 7 Jan 2021 17:49:38 +0000 (17:49 +0000)]
[flang][driver] Copy input files into a temp dir when testing
The following frontend driver invocation will generate 2 output files
in the same directory as the input files:
```
flang-new -fc1 input-1.f input-2.f
```
This is the desired behaviour. However, when testing we need to make
sure that we don't pollute the source directory. To this end, copy test
input files into a temporary directory.
Differential Revision: https://reviews.llvm.org/D94243
Christian Sigg [Mon, 11 Jan 2021 12:56:35 +0000 (13:56 +0100)]
[mlir] Make GpuAsyncRegion pass depend on async dialect.
Do not cache gpu.async.token type so that the pass can be created before the GPU dialect is registered.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D94397
Christian Sigg [Mon, 11 Jan 2021 12:25:23 +0000 (13:25 +0100)]
[mlir] Remove unnecessary llvm.mlir.cast in AsyncToLLVM lowering.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D94400
Jay Foad [Mon, 11 Jan 2021 13:28:34 +0000 (13:28 +0000)]
[AMDGPU] Fix a urem combine test to test what it was supposed to
Stephan Herhut [Mon, 11 Jan 2021 11:32:25 +0000 (12:32 +0100)]
[ARM] Add uses for locals introduced for debug messages. NFC.
This adds uses for locals introduced for new debug messages for the load store optimizer. Those locals are only used on debug statements and otherwise create unused variable warnings.
Differential Revision: https://reviews.llvm.org/D94398
Simon Pilgrim [Mon, 11 Jan 2021 12:51:03 +0000 (12:51 +0000)]
[X86][SSE] Add 'vectorized sum' test patterns
These are often generated when building a vector from the reduction sums of independent vectors.
I've implemented some typical patterns from various v4f32/v4i32 based off current codegen emitted from the vectorizers, although these tests are more about tweaking some hadd style backend folds to handle whatever the vectorizers/vectorcombine throws at us...
Pavel Labath [Mon, 11 Jan 2021 12:15:01 +0000 (13:15 +0100)]
[lldb] Disable PipeTest.OpenAsReader on windows
This test seems to be broken there (which is not totally surprising as
this functionality was never used on windows). Disable the test while I
investigate.
Georgii Rymar [Wed, 23 Dec 2020 10:40:58 +0000 (13:40 +0300)]
[obj2yaml][test] - Improve and fix section-group.yaml test.
It has multiple issues fixed by this patch:
1) It shouldn't test how llvm-readelf/yaml2obj works.
2) It should use "-NEXT" prefix for check lines.
3) It can use YAML macros, which allows using a single YAML.
4) It should probably test the case when a group member is a null section.
Differential revision: https://reviews.llvm.org/D93753
Florian Hahn [Mon, 11 Jan 2021 12:20:04 +0000 (12:20 +0000)]
[VPlan] Move initial quote emission from ::print to ::dumpBasicBlock.
This means there will be no stray " when printing individual recipes
using print()/dump() in a debugger, for example.
Georgii Rymar [Thu, 24 Dec 2020 13:20:07 +0000 (16:20 +0300)]
[llvm-readelf/obj] - Index phdrs and relocations from 0 when reporting warnings.
As was mentioned in comments here:
https://reviews.llvm.org/D92636#inline-864967
we are not consistent and sometimes index things from 0, but sometimes
from 1 in warnings.
This patch fixes 2 places: messages reported for
program headers and messages reported for relocations.
Differential revision: https://reviews.llvm.org/D93805
Joe Ellis [Fri, 8 Jan 2021 11:44:15 +0000 (11:44 +0000)]
[clang][AArch64][SVE] Avoid going through memory for coerced VLST return values
VLST return values are coerced to VLATs in the function epilog for
consistency with the VLAT ABI. Previously, this coercion was done
through memory. It is preferable to use the
llvm.experimental.vector.insert intrinsic to avoid going through memory
here.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D94290
Georgii Rymar [Wed, 23 Dec 2020 10:29:01 +0000 (13:29 +0300)]
[obj2yaml] - Fix the crash in getUniquedSectionName().
`getUniquedSectionName(const Elf_Shdr *Sec)` assumes that
`Sec` is not `nullptr`.
I've found one place in `getUniquedSymbolName` where it is
not true (because of that we crash when trying to dump
unnamed null section symbols).
Patch fixes the crash and changes the signature of
`getUniquedSectionName` to accept a reference.
Differential revision: https://reviews.llvm.org/D93754
Kazushi (Jam) Marukawa [Fri, 8 Jan 2021 11:41:37 +0000 (20:41 +0900)]
[VE] Support additional VMRGW and VMV intrinsic instructions
Support missing VMRGW and VMV intrinsic instructions and add regression
tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D94300
Kazushi (Jam) Marukawa [Fri, 8 Jan 2021 11:29:42 +0000 (20:29 +0900)]
[VE] Support intrinsic to insert/extract_subreg of v512i1
Support insert/extract_subreg intrinsic instructions for v512i1
registers and add regression tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D94298
Simon Pilgrim [Mon, 11 Jan 2021 11:24:20 +0000 (11:24 +0000)]
[X86][SSE] Add missing SSE test coverage for permute(hop,hop) folds
Should help avoid bugs like reported in rG80dee7965dff
Simon Pilgrim [Mon, 11 Jan 2021 11:15:28 +0000 (11:15 +0000)]
Revert rGd43a264a5dd3 "Revert "[X86][SSE] Fold unpack(hop(),hop()) -> permute(hop())""
This reapplies commit rG80dee7965dffdfb866afa9d74f3a4a97453708b2.
[X86][SSE] Fold unpack(hop(),hop()) -> permute(hop())
UNPCKL/UNPCKH only uses one op from each hop, so we can merge the hops and then permute the result.
REAPPLIED with a fix for unary unpacks of HOP.
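The fold can be illustrated on plain 4-lane lists. This Python sketch models only the lane arithmetic of a 128-bit horizontal add and of unpack low/high; the function names and the permute mask `[0, 2, 1, 3]` are worked out for this toy model and are not taken from the patch.

```python
def hadd(a, b):
    """Horizontal add: pairwise sums of a in the low half, of b in the high half."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

def unpckl(x, y):
    """Interleave the low halves of x and y."""
    return [x[0], y[0], x[1], y[1]]

def unpckh(x, y):
    """Interleave the high halves of x and y."""
    return [x[2], y[2], x[3], y[3]]

def permute(v, mask):
    return [v[i] for i in mask]

def folded_unpckl(a, b, c, d):
    # unpckl reads only the low half of each hop, which depends on a and c only.
    return permute(hadd(a, c), [0, 2, 1, 3])

def folded_unpckh(a, b, c, d):
    # unpckh reads only the high half of each hop, which depends on b and d only.
    return permute(hadd(b, d), [0, 2, 1, 3])
```

In this model `unpckl(hadd(a, b), hadd(c, d))` equals `folded_unpckl(a, b, c, d)`, so two horizontal ops collapse into one plus a permute.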
Kerry McLaughlin [Mon, 11 Jan 2021 10:57:46 +0000 (10:57 +0000)]
[SVE][CodeGen] Fix legalisation of floating-point masked gathers
Changes in this patch:
- When lowering floating-point masked gathers, cast the result of the
gather back to the original type with reinterpret_cast before returning.
- Added patterns for reinterpret_casts from integer to floating point, and
concat_vector patterns for bfloat16.
- Tests for various legalisation scenarios with floating point types.
Reviewed By: sdesmalen, david-arm
Differential Revision: https://reviews.llvm.org/D94171
Bjorn Pettersson [Fri, 1 Jan 2021 23:05:53 +0000 (00:05 +0100)]
Require chained analyses in BasicAA and AAResults to be transitive
This patch fixes a bug that could result in miscompiles (at least
in an OOT target). The problem could be seen by adding checks that
the DominatorTree used in BasicAliasAnalysis and ValueTracking was
valid (e.g. by adding DT->verify() call before every DT dereference
and then running all tests in test/CodeGen).
The problem was that the LegacyPassManager calculated the "last user"
incorrectly for passes such as the DominatorTree when the pass manager
was not told that there was a transitive dependency between the
different analyses. As a result, an incorrect dominator tree could be
used when doing alias analysis (a pretty serious bug, as the alias
analysis result could be invalid).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48709
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D94138
Luo, Yuanke [Thu, 31 Dec 2020 05:47:42 +0000 (13:47 +0800)]
[X86] Fix tile register spill issue.
The tile register spill needs 2 instructions.
%46:gr64_nosp = MOV64ri 64
TILESTORED %stack.2, 1, killed %46:gr64_nosp, 0, $noreg, %43:tile
The first instruction loads the stride into a GPR, and the second
instruction stores the tile register to the stack slot. The optimization
that merges spill instructions runs after register allocation, and
spilling a tile register needs to create a new virtual register for the
stride, so we can't hoist the tile spill instruction in
postOptimization() of register allocation. We can't hoist TILESTORED
alone, and we can't hoist the 2 instructions together because MOV64ri
would clobber some GPR. This patch disables the spill merge for any
spill which needs 2 instructions.
Differential Revision: https://reviews.llvm.org/D93898
Haojian Wu [Mon, 11 Jan 2021 08:55:24 +0000 (09:55 +0100)]
[clangd] Add metrics for go-to-implementation.
Differential Revision: https://reviews.llvm.org/D94393
David Green [Mon, 11 Jan 2021 09:24:28 +0000 (09:24 +0000)]
[ARM] Add debug messages for the load store optimizer. NFC
David Sherwood [Wed, 25 Nov 2020 15:55:43 +0000 (15:55 +0000)]
[NFC][InstructionCost] Change LoopVectorizationCostModel::getInstructionCost to return InstructionCost
This patch is part of a series of patches that migrate integer
instruction costs to use InstructionCost. In the function
selectVectorizationFactor I have simply asserted that the cost
is valid and extracted the value as is. In future we expect
to encounter invalid costs, but we should filter out those
vectorization factors that lead to such invalid costs.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D92178
Jan Svoboda [Fri, 8 Jan 2021 10:05:16 +0000 (11:05 +0100)]
Reapply "[clang][cli] Port DiagnosticOpts to new option parsing system"
This reverts commit 8e3e148c
This commit fixes two issues with the original patch:
* The sanitizer build bot reported an uninitialized value. This was caused by normalizeStringIntegral not returning None on failure.
* Some build bots complained about inaccessible keypaths. To mitigate that, "this->" was added back to the keypath to restore the previous behavior.
David Sherwood [Thu, 7 Jan 2021 15:02:50 +0000 (15:02 +0000)]
[NFC] Remove min/max functions from InstructionCost
Removed the InstructionCost::min/max functions because it's
fine to use std::min/max instead.
Differential Revision: https://reviews.llvm.org/D94301
David Green [Mon, 11 Jan 2021 08:59:28 +0000 (08:59 +0000)]
[ARM] Update trunc costs
We did not have specific costs for larger than legal truncates that were
not otherwise cheap (where they were next to stores, for example). As
MVE does not have a dedicated instruction for them (and we do not use
loads/stores yet), they should be expensive as they get expanded to a
series of lane moves.
Differential Revision: https://reviews.llvm.org/D94260
Rafał Jelonek [Mon, 11 Jan 2021 08:43:30 +0000 (09:43 +0100)]
[clang-format] Find main include after block ended with #pragma hdrstop
Find main include in first include block not ended with #pragma hdrstop
Reviewed By: curdeius
Differential Revision: https://reviews.llvm.org/D94217
Rafał Jelonek [Mon, 11 Jan 2021 08:34:01 +0000 (09:34 +0100)]
[clang-format] turn on formatting after "clang-format on" while sorting includes
Formatting is not active after "clang-format on" due to merging lines while formatting is off. Also, use the trimmed line. Behaviour with LF differs from behaviour with CRLF.
Reviewed By: curdeius, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D94206
David Green [Mon, 11 Jan 2021 08:35:16 +0000 (08:35 +0000)]
[ARM] Additional trunc cost tests. NFC
Rafał Jelonek [Mon, 11 Jan 2021 08:28:41 +0000 (09:28 +0100)]
[clang-format] Skip UTF8 Byte Order Mark while sorting includes
If a file contains a BOM, the first instruction (include or clang-format off) is ignored
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D94201
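The BOM handling described in the entry above can be illustrated with a small sketch. This is not clang-format's code: the helper name skipUtf8Bom is hypothetical, and the sketch only assumes that the UTF-8 byte order mark is the three bytes EF BB BF at the very start of the buffer.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper (not part of clang-format): returns the text with a
// leading UTF-8 byte order mark (bytes EF BB BF) removed, if one is present.
std::string_view skipUtf8Bom(std::string_view Text) {
  constexpr std::string_view Bom = "\xEF\xBB\xBF";
  // substr clamps to the text length, so short inputs are safe here.
  if (Text.substr(0, Bom.size()) == Bom)
    Text.remove_prefix(Bom.size());
  return Text;
}

// Usage sketch: once the BOM is skipped, the first directive is visible.
void demoSkipUtf8Bom() {
  assert(skipUtf8Bom("\xEF\xBB\xBF#include <x>") == "#include <x>");
  assert(skipUtf8Bom("#include <x>") == "#include <x>");
}
```

A partial BOM (for example only EF BB) is left untouched, since it does not match the full three-byte prefix.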
Zi Xuan Wu [Mon, 11 Jan 2021 08:18:01 +0000 (16:18 +0800)]
[CSKY] Add visibility macro to fix link error
Add LLVM_EXTERNAL_VISIBILITY macro to fix link error of
https://reviews.llvm.org/D88466#2476378
Adrian Kuegel [Fri, 8 Jan 2021 14:37:06 +0000 (15:37 +0100)]
Remove redundant casts.
Differential Revision: https://reviews.llvm.org/D94305
Craig Topper [Mon, 11 Jan 2021 06:35:51 +0000 (22:35 -0800)]
[RISCV] Clear isCodeGenOnly flag on VMSGE(U) pseudo instructions. Remove InstAliases that duplicate the asm strings in the pseudos.
The Pseudo class sets isCodeGenOnly=1 which causes the asm strings
in the pseudos to be ignored. I think this is why the aliases are
needed at all.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94024
Lang Hames [Mon, 11 Jan 2021 07:30:09 +0000 (18:30 +1100)]
[JITLink] Rename PostAllocationPasses to PreFixupPasses.
PreFixupPasses better reflects when these passes will run.
A future patch will (re)introduce a PostAllocationPasses list that will run
after allocation, but before JITLinkContext::notifyResolved is called to notify
the rest of the JIT about the resolved symbol addresses.
Hsiangkai Wang [Fri, 8 Jan 2021 13:07:05 +0000 (21:07 +0800)]
[NFC][AsmPrinter] Make comments for spill/reload more precise.
The size of spill/reload may be unknown for scalable vector types.
When the size is unknown, print it as "Unknown-size" instead of a very
large number.
Differential Revision: https://reviews.llvm.org/D94299
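A minimal sketch of the idea in the entry above, assuming the size is modelled as an optional value. The function name spillSizeComment and the exact comment wording other than "Unknown-size" are assumptions made for illustration; the real AsmPrinter code is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical formatter, not the AsmPrinter code itself. An empty optional
// models a spill whose size is not known at compile time.
std::string spillSizeComment(std::optional<std::uint64_t> SizeInBytes) {
  if (!SizeInBytes)
    return "Unknown-size Spill";
  return std::to_string(*SizeInBytes) + "-byte Spill";
}

// Usage sketch: a fixed-size slot and a slot of unknown size.
void demoSpillSizeComment() {
  assert(spillSizeComment(16) == "16-byte Spill");
  assert(spillSizeComment(std::nullopt) == "Unknown-size Spill");
}
```

Using an optional rather than a sentinel value keeps an unknown size from being formatted as a very large number by accident.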
ergawy [Mon, 11 Jan 2021 06:37:34 +0000 (07:37 +0100)]
[MLIR][SPIRV] Add (de-)serialization support for SpecConstantOperation.
This commit adds support for (de-)serializing SpecConstantOperation. One
thing worth noting is that during deserialization, we assign a fake ID to
enclosed ops inside SpecConstantOperation. We need to do this in order
for deserialization logic to properly update ID to value map and to
later reference the created value from the sibling 'spv::YieldOp'.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D93591
Fangrui Song [Mon, 11 Jan 2021 06:22:07 +0000 (22:22 -0800)]
CGDebugInfo: Delete unneeded UnwrapTypeForDebugInfo
Tested with stage 2 -DCMAKE_BUILD_TYPE=Debug clang, byte identical.
Fangrui Song [Mon, 11 Jan 2021 06:09:47 +0000 (22:09 -0800)]
CGDebugInfo: Delete redundant test
Chris Lattner [Mon, 11 Jan 2021 02:32:57 +0000 (18:32 -0800)]
[IR Parser] Fix a crash handling zero width integer attributes.
llvm::APInt cannot hold zero bit values, therefore we shouldn't try
to form them.
Differential Revision: https://reviews.llvm.org/D94384
Esme-Yi [Mon, 11 Jan 2021 03:52:16 +0000 (03:52 +0000)]
[PowerPC] Add variants of 64-bit vector types for vec_sel.
Summary: This patch added variants of vec_sel and fixed bugzilla 46770.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D94162
Serguei Katkov [Tue, 22 Dec 2020 09:40:50 +0000 (16:40 +0700)]
[LoopUnroll] Fix a crash
Loop peeling as a last step triggers loop simplification and this
can change the loop structure. As a result all cached values like
the latch branch become invalid.
This patch restructures the code to take into account the possible
changes caused by peeling.
Reviewers: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour
Reviewed By: Meinersbur, fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D93686
Craig Topper [Mon, 11 Jan 2021 02:01:23 +0000 (18:01 -0800)]
[RISCV] Convert most of the information about RVV Pseudos into bits in TSFlags.
This patch moves all but the BaseInstr to bits in TSFlags.
For the index fields, we can just use a bit to indicate their presence.
The locations of the operands are well defined.
This reduces the llc binary by about 32K on my build. It also
removes the binary search of the table from the custom inserter.
Instead we just check that the SEW op is present.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D94375
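The presence-bit idea in the entry above can be sketched as follows. The flag names, bit positions, and the rule that SEW is the last operand are all assumptions chosen for illustration; they are not the actual RISC-V TSFlags encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag layout for illustration only; these names and bit
// positions are NOT the real RISC-V TSFlags encoding.
enum : std::uint64_t {
  HasSEWOpBit = 1ull << 0,   // pseudo carries a SEW operand
  HasVLOpBit = 1ull << 1,    // pseudo carries a VL operand
  HasMergeOpBit = 1ull << 2, // pseudo carries a merge operand
};

// A presence test is a single mask-and-compare, with no table search.
constexpr bool hasSEWOp(std::uint64_t TSFlags) {
  return (TSFlags & HasSEWOpBit) != 0;
}

// Assumed convention for this sketch: when present, SEW is the last operand.
// Returns -1 when the instruction has no SEW operand.
int sewOperandIndex(std::uint64_t TSFlags, int NumOperands) {
  assert(NumOperands >= 0 && "operand count cannot be negative");
  if (!hasSEWOp(TSFlags) || NumOperands == 0)
    return -1;
  return NumOperands - 1;
}
```

Because the operand position follows from a fixed convention, only one bit per field needs to be stored instead of an index.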
QingShan Zhang [Mon, 11 Jan 2021 02:25:53 +0000 (02:25 +0000)]
[DAGCombine] Remove the check for unsafe-fp-math when we are checking the AFN
We check unsafe-fp-math for sqrt but not for fpow, which is inconsistent.
As the direction is to remove this global option, we need to remove the unsafe-fp-math
check for sqrt and update the test with afn fast-math flags.
Reviewed By: Spatel
Differential Revision: https://reviews.llvm.org/D93891
Nico Weber [Mon, 11 Jan 2021 01:22:53 +0000 (20:22 -0500)]
Revert "[X86][SSE] Fold unpack(hop(),hop()) -> permute(hop())"
This reverts commit 80dee7965dffdfb866afa9d74f3a4a97453708b2.
Makes clang sometimes hang forever. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1164786#c6 for a
stand-alone repro.
Philip Reames [Sun, 10 Jan 2021 23:57:25 +0000 (15:57 -0800)]
[LoopDeletion] Break backedge of outermost loops when known not taken
This is a resubmit of dd6bb367 (which was reverted due to stage2 build failures in 7c63aac), with an additional restriction added to the transform so that it only considers outermost loops.
As shown in the added test case, ensuring LCSSA is up to date when deleting an inner loop is tricky as we may actually need to remove blocks from any outer loops, thus changing the exit block set. For the moment, just avoid transforming this case. I plan to return to this case in a follow up patch and see if we can do better.
Original commit message follows...
The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.
Differential Revision: https://reviews.llvm.org/D93906
Fangrui Song [Sun, 10 Jan 2021 23:03:40 +0000 (15:03 -0800)]
CGDebugInfo: Delete unused DIFile* parameter
Kazu Hirata [Sun, 10 Jan 2021 22:32:02 +0000 (14:32 -0800)]
[StringExtras] Add a helper class for comma-separated lists
This patch introduces a helper class SubsequentDelim to simplify loops
that generate comma-separated lists.
For example, consider the following loop, taken from
llvm/lib/CodeGen/MachineBasicBlock.cpp:
  for (auto I = pred_begin(), E = pred_end(); I != E; ++I) {
    if (I != pred_begin())
      OS << ", ";
    OS << printMBBReference(**I);
  }
The new class allows us to rewrite the loop as:
  SubsequentDelim SD;
  for (auto I = pred_begin(), E = pred_end(); I != E; ++I)
    OS << SD << printMBBReference(**I);
where SD evaluates to the empty string the first time it is streamed and
to ", " on subsequent iterations.
Unlike interleaveComma, defined in llvm/include/llvm/ADT/STLExtras.h,
SubsequentDelim can accommodate a wider variety of loops, including:
- those that conditionally skip certain items,
- those that need iterators to call getSuccProbability(I), and
- those that iterate over integer ranges.
As an example, this patch cleans up MachineBasicBlock::print.
Differential Revision: https://reviews.llvm.org/D94377
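For readers without the LLVM tree at hand, the behaviour described in the entry above can be approximated with a standalone sketch. The class DelimSketch and the function joinEvens are hypothetical names, and std::ostream stands in for raw_ostream; this is not the SubsequentDelim implementation from the patch.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the helper described above; not the LLVM class.
class DelimSketch {
  bool First = true;
  const char *Delim;

public:
  explicit DelimSketch(const char *D = ", ") : Delim(D) {
    assert(D && "delimiter must not be null");
  }

  // Returns "" on the first call and the delimiter on every later call.
  const char *next() {
    if (First) {
      First = false;
      return "";
    }
    return Delim;
  }
};

// Streaming the helper emits the pending delimiter, so each item is written
// as OS << SD << Item.
inline std::ostream &operator<<(std::ostream &OS, DelimSketch &SD) {
  return OS << SD.next();
}

// Joins only the even values, showing that skipped items leave no stray
// delimiters behind.
std::string joinEvens(const std::vector<int> &Values) {
  std::ostringstream OS;
  DelimSketch SD;
  for (int V : Values) {
    if (V % 2 != 0)
      continue; // conditionally skipped item
    OS << SD << V;
  }
  return OS.str();
}
```

The delimiter state advances only when an item is actually written, which is what lets loops skip items conditionally without tracking "is this the first element" by hand.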
Shilei Tian [Sun, 10 Jan 2021 21:46:09 +0000 (16:46 -0500)]
[OpenMP] Do not set OPENMP_STANDALONE_BUILD=ON when building OpenMP along with LLVM
For now, `*_STANDALONE_BUILD` is set to ON even if they're built along
with LLVM because of issues mentioned in the comments. This can cause some issues.
For example, if we build OpenMP along with LLVM, we'd like to copy those OpenMP
headers to `<prefix>/lib/clang/<version>/include` such that `clang` can find
those headers without using `-I <prefix>/include` because those headers will be
copied to `<prefix>/include` if it is built standalone.
In this patch, we fixed the dependence issue in OpenMP such that it can be built
correctly even with `OPENMP_STANDALONE_BUILD=OFF`. The issue is in the call to
`add_lit_testsuite`, where `clang` and `clang-resource-headers` are passed as
`DEPENDS`. Since we're building OpenMP along with LLVM, `clang` is set by CMake
to be the C/C++ compiler, so these two dependences are no longer needed;
keeping them is what caused the dependence issue.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D93738
Shilei Tian [Sun, 10 Jan 2021 21:45:39 +0000 (16:45 -0500)]
[LLVM] Added OpenMP to `LLVM_ALL_RUNTIMES`
This patch added `openmp` to `LLVM_ALL_RUNTIMES` so that OpenMP is also built when the CMake argument `LLVM_ENABLE_RUNTIMES=all` is given.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D94369