Roman Lebedev [Wed, 19 May 2021 08:49:16 +0000 (11:49 +0300)]
[NFCI][SimplifyCFG] simplifySingleResume(): use DeleteDeadBlock()
Roman Lebedev [Wed, 19 May 2021 08:44:43 +0000 (11:44 +0300)]
[NFCI][SimplifyCFG] simplifyCommonResume(): use DeleteDeadBlock()
Sergey Dmitriev [Wed, 19 May 2021 08:11:53 +0000 (01:11 -0700)]
[llvm-objcopy] Add support for '--' for delimiting options from input/output files
This allows llvm-objcopy to be used with file names that begin with dashes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102665
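As a rough illustration of the '--' convention (this is a sketch, not llvm-objcopy's actual parser; the helper name split_options is hypothetical), everything after the first '--' is treated as a file name even if it starts with a dash:

```python
def split_options(argv):
    """Split argv into (options, positionals), honoring a '--' delimiter."""
    options, positionals = [], []
    only_positionals = False
    for arg in argv:
        if only_positionals:
            # After '--', nothing is interpreted as an option.
            positionals.append(arg)
        elif arg == "--":
            only_positionals = True
        elif arg.startswith("-") and arg != "-":
            options.append(arg)
        else:
            positionals.append(arg)
    return options, positionals


opts, files = split_options(["--strip-all", "--", "-in.o", "-out.o"])
print(opts, files)
```

Without the delimiter, "-in.o" and "-out.o" would be classified as options in this sketch.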
Fraser Cormack [Tue, 18 May 2021 16:17:21 +0000 (17:17 +0100)]
[RISCV] Support INSERT_VECTOR_ELT into i1 vectors
Like the element extraction of these vectors, we choose to promote up to
an i8 vector type and perform the insertion there.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102697
Roman Lebedev [Wed, 19 May 2021 08:31:53 +0000 (11:31 +0300)]
[NFCI] SimplifyCFGPass: mergeEmptyReturnBlocks(): use DeleteDeadBlocks()
In this case, it does the same thing as the original pattern does.
SimplifyCFG has a few lurking miscompilations about deleting blocks that
have their address taken, and consistently using DeleteDeadBlocks() instead
of a hand-rolled pattern will make it easier to weed those cases out.
Haojian Wu [Tue, 18 May 2021 19:53:32 +0000 (21:53 +0200)]
[clang-tidy] Fix a crash on invalid code for memset-usage check.
Differential Revision: https://reviews.llvm.org/D102714
Rong Xu [Wed, 19 May 2021 05:40:30 +0000 (22:40 -0700)]
Fix sanitizer test errors from commit 886629a8
Explicitly handle the empty string in the Hash calculation.
Matthias Springer [Mon, 17 May 2021 05:37:32 +0000 (14:37 +0900)]
[mlir] Use VectorTransferPermutationMapLoweringPatterns in VectorToSCF
VectorTransferPermutationMapLoweringPatterns can be enabled via a pass option. These additional patterns lower permutation maps to minor identity maps with broadcasting, if possible, allowing for more efficient vector load/stores. The option is deactivated by default.
Differential Revision: https://reviews.llvm.org/D102593
Vitaly Buka [Wed, 19 May 2021 05:39:36 +0000 (22:39 -0700)]
[libfuzzer] Update doc mentioning removed flags.
MaheshRavishankar [Wed, 19 May 2021 05:08:12 +0000 (22:08 -0700)]
[mlir][Linalg] Break unnecessary dependency through unused `outs` tensor.
LinalgOps that are all parallel do not use the value of `outs`
tensor. The semantics is that the `outs` tensor is fully
overwritten. Using anything other than `init_tensor` can add false
dependencies between operations, when the use is just for the shape of
the tensor. Adding a canonicalization to always use `init_tensor` in
such cases, breaks this dependence.
Differential Revision: https://reviews.llvm.org/D102561
Arthur Eubanks [Fri, 7 May 2021 21:32:20 +0000 (14:32 -0700)]
[NewPM] Add options to PrintPassInstrumentation
To bring D99599's implementation in line with the existing
PrintPassInstrumentation, and to fix a FIXME, add more customizability
to PrintPassInstrumentation.
Introduce three new options. The first takes over the existing
"-debug-pass-manager-verbose" cl::opt.
The second and third options are specific to -fdebug-pass-structure. They
allow indentation, and also don't print analysis queries.
To avoid more golden file tests than necessary, prune down the
-fdebug-pass-structure tests.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102196
Senran Zhang [Wed, 19 May 2021 03:40:59 +0000 (23:40 -0400)]
[Utils][vim] Highlight CHECK-EMPTY: & CHECK-COUNT: directives
Reviewed By: porglezomp
Differential Revision: https://reviews.llvm.org/D101135
Vladimir Vereschaka [Wed, 19 May 2021 02:04:49 +0000 (19:04 -0700)]
[CMake] Update Cmake cache file for Win to ARM Linux cross builds. NFC
Parametrize the cache file with TARGET_TRIPLE parameter. Normalize
the target triple to follow the runtime library installation directory.
Explicitly enable the LLVM_ENABLE_PER_TARGET_RUNTIME_DIR option.
Wenyi Zhao [Wed, 19 May 2021 02:11:33 +0000 (02:11 +0000)]
Enhance InferShapedTypeOpInterface to make it accessible during dialect conversion
Original interfaces are not safe to be called during dialect conversion.
This is because some ops (e.g. `dynamic_reshape(input, target_shape)`)
depend on the values of their operands to calculate the output shape.
However the operands may be out of reach during dialect conversion (e.g.
converting from tensor world to buffer world). This patch provides a new
kind of interface which accepts user-provided operands to solve this
problem.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D102317
Richard Smith [Wed, 19 May 2021 00:43:06 +0000 (17:43 -0700)]
Revert "[IR] Add a Location to BlockArgument." and follow-on commit
"[mlir] Speed up Lexer::getEncodedSourceLocation"
This reverts commit 3043be9d2db4d0cdf079adb5e1bdff032405e941 and commit
861d69a5259653f60d59795597493a7939b794fe.
This change resulted in printing textual MLIR that can't be parsed; see
review thread https://reviews.llvm.org/D102567 for details.
Joseph Huber [Wed, 19 May 2021 00:10:05 +0000 (20:10 -0400)]
[Attributor] Change AAExecutionDomain to only accept intrinsics
Summary:
The OpenMP runtime functions don't always provide unique thread IDs to
determine if a basic block is truly single-threaded. Change the implementation
to only check NVPTX intrinsics for now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102700
Guozhi Wei [Wed, 19 May 2021 01:02:36 +0000 (18:02 -0700)]
[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
A similar optimization can also be applied to the LEA/ADD sequence.
The modifications to TwoAddressInstructionPass ensure that the operands of the ADD
instruction have the expected order (the dest register of LEA should be the src register
of ADD).
Differential Revision: https://reviews.llvm.org/D101970
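The arithmetic behind the LEA/SUB rewrite can be checked with a small model. This is only a sketch of the register values under 64-bit wraparound (the function names are hypothetical, and condition flags are not modeled):

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers with wraparound


def lea_sub(reg1, reg2, reg4):
    """lea (reg1, reg2), reg3 ; sub reg3, reg4"""
    reg3 = (reg1 + reg2) & MASK64
    return (reg4 - reg3) & MASK64


def sub_sub(reg1, reg2, reg4):
    """sub reg1, reg4 ; sub reg2, reg4"""
    reg4 = (reg4 - reg1) & MASK64
    return (reg4 - reg2) & MASK64


# Both sequences leave the same value in reg4, including on overflow.
for r1, r2, r4 in [(1, 2, 10), (MASK64, 5, 0), (7, MASK64, 3)]:
    assert lea_sub(r1, r2, r4) == sub_sub(r1, r2, r4)
```

The equality holds because reg4 - (reg1 + reg2) and (reg4 - reg1) - reg2 agree modulo 2^64.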
Thomas Köppe [Tue, 18 May 2021 23:44:25 +0000 (23:44 +0000)]
Add a helper function to convert LogicalResult to int for return from main
At present, a lot of code contains main function bodies like "return failed(mlir::MlirOptMain(...));". This is unfortunate for two reasons: a) it uses ADL, which is maybe not what the free "failed" function was designed for; and b) it is a bit awkward to read, requiring the reader to understand both the boolean nature of the value and the semantics of main's return value. (And it's also not portable, since 1 is not a portable failure value.)
The replacement code, `return mlir::AsMainReturnCode(mlir::MlirOptMain(...))` is a bit more self-explanatory.
The change applies the new function to a few internal uses of MlirOptMain, too.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102641
River Riddle [Tue, 18 May 2021 23:36:19 +0000 (16:36 -0700)]
[mlir] Speed up Lexer::getEncodedSourceLocation
We currently use SourceMgr::getLineAndColumn to get the line and column for an SMLoc, but this includes a call to StringRef::find_last_of that ends up dominating compile time. In D102567, we start creating locations from the input file for block arguments, which resulted in an extreme performance regression for modules with very large numbers of block arguments. This revision switches to just using a pointer offset from the beginning of the line to calculate the column (all MLIR files are simple ASCII), resulting in a compile time reduction from 4700 seconds (1 hour and 18 minutes) to 8 seconds.
Differential Revision: https://reviews.llvm.org/D102734
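A minimal sketch of the offset-based column computation, assuming a precomputed table of line-start offsets (the table, the helper names, and the use of bisect are illustrative assumptions, not the actual Lexer code):

```python
import bisect


def line_starts(text):
    """Return the offset at which each line of text begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def line_and_column(starts, offset):
    """Map a byte offset to a 1-based (line, column) pair."""
    line_index = bisect.bisect_right(starts, offset) - 1
    # For ASCII input the column is just the distance from the line start.
    return line_index + 1, offset - starts[line_index] + 1


text = "func @f() {\n  return\n}\n"
starts = line_starts(text)
print(line_and_column(starts, text.index("return")))
```

The column is a single subtraction, so no backwards scan over the buffer is needed per location.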
Amy Huang [Mon, 15 Mar 2021 21:20:49 +0000 (14:20 -0700)]
Apply [[standalone_debug]] to some types in the STL.
Add this attribute to some types to ensure that they have
debug info.
The debug info for these classes is required for debuggers to display
some STL types. With constructor homing (a new debug info optimization)
their debug info isn't emitted because their constructors are never
called.
The types with the attribute added are __hash_value_type,
__value_type, __tree_node_base, __tree_node, __hash_node, __list_node,
and __forward_list_node.
Differential Revision: https://reviews.llvm.org/D98750
Arthur O'Dwyer [Thu, 13 May 2021 03:04:03 +0000 (23:04 -0400)]
[libc++] Alphabetize header inclusions and include-what-you-use <__debug>. NFCI.
Arthur O'Dwyer [Wed, 12 May 2021 17:09:26 +0000 (13:09 -0400)]
[libc++] Some fixes to the <bit> utilities.
Fix __bitop_unsigned_integer and rename to __libcpp_is_unsigned_integer.
There are only five unsigned integer types, so we should just list them out.
Also provide `__libcpp_is_signed_integer`, even though the Standard doesn't
consume that trait anywhere yet.
Notice that `concept uniform_random_bit_generator` is specifically specified
to rely on `concept unsigned_integral` and *not* `__is_unsigned_integer`.
Instantiating `std::ranges::sample` with a type `U` satisfying
`uniform_random_bit_generator` where `unsigned_integral<U::result_type>`
and not `__is_unsigned_integer<U::result_type>` is simply IFNDR.
Orthogonally, fix an undefined behavior in std::countr_zero(__uint128_t).
Orthogonally, improve tests for the <bit> manipulation functions.
It was these new tests that detected the bug in countr_zero.
Differential Revision: https://reviews.llvm.org/D102328
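For reference, one way to compute countr_zero for a 128-bit value from two 64-bit halves is sketched below. This is not taken from the patch, which does not describe the undefined behavior it fixed; the function names are hypothetical, and the zero cases are handled explicitly so that no half is ever passed to a primitive that is undefined for zero:

```python
MASK64 = (1 << 64) - 1


def countr_zero_64(x):
    """Trailing zero count of a 64-bit value; 64 when x is zero."""
    if x == 0:
        return 64
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def countr_zero_128(x):
    """Trailing zero count of a 128-bit value; 128 when x is zero."""
    low = x & MASK64
    if low != 0:
        return countr_zero_64(low)
    return 64 + countr_zero_64((x >> 64) & MASK64)


print(countr_zero_128(1 << 100))
```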
Rong Xu [Tue, 18 May 2021 23:52:07 +0000 (16:52 -0700)]
Fix a buildbot failure from commit 886629a8
LLVM GN Syncbot [Tue, 18 May 2021 23:27:42 +0000 (23:27 +0000)]
[gn build] Port 886629a8c9e5
Rong Xu [Tue, 18 May 2021 23:08:38 +0000 (16:08 -0700)]
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This patch implements the first part of Flow Sensitive SampleFDO (FSAFDO).
It has the following changes:
(1) disable current discriminator encoding scheme,
(2) new hierarchical discriminator for FSAFDO.
For this patch, option "-enable-fs-discriminator=true" turns on the new
functionality. Option "-enable-fs-discriminator=false" (the default)
keeps the current SampleFDO behavior. When the fs-discriminator is
enabled, we insert a flag variable, namely, llvm_fs_discriminator, to
the object. This symbol will be checked by the create_llvm_prof tool, and used
to generate a profile with FS-AFDO discriminators enabled. If this
happens, for an extbinary format profile, create_llvm_prof tool
will add a flag to profile summary section.
Differential Revision: https://reviews.llvm.org/D102246
Mike Rice [Tue, 18 May 2021 16:18:17 +0000 (09:18 -0700)]
[OpenMP] Stabilize OpenMP/parallel_for_codegen.cpp test (NFC)
Revert recent commit to require x86-registered-target
(e4b790c5e3653053819182a67c593bc65de860ac).
Remove -O1 from the run lines so they are less dependent on backend passes.
Update the CHECK6 and CHECK10 lines with script.
Differential Revision: https://reviews.llvm.org/D102720
Tomasz Miąsko [Wed, 19 May 2021 00:00:00 +0000 (00:00 +0000)]
[Demangle][Rust] Speculative fix for bot build failure
> error: ‘InType’ is not a class, namespace, or enumeration
Alex Orlov [Tue, 18 May 2021 22:38:13 +0000 (02:38 +0400)]
[symbolizer] Added StartAddress for the resolved function.
In many cases it is helpful to know at what address the resolved function starts.
This patch adds a new StartAddress member to the DILineInfo structure.
Reviewed By: jhenderson, dblaikie
Differential Revision: https://reviews.llvm.org/D102316
Fabian Sommer [Tue, 18 May 2021 22:06:08 +0000 (15:06 -0700)]
Default stack alignment of x86 NaCl to 16 bytes
X86 NaCl generally requires the stack to be aligned to 16 bytes.
This change was already implemented in two downstream NaCl compilers
based on llvm.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D102610
Tomasz Miąsko [Tue, 18 May 2021 16:15:00 +0000 (18:15 +0200)]
[Demangle][Rust] Parse tuples
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102579
Tomasz Miąsko [Tue, 18 May 2021 16:14:43 +0000 (18:14 +0200)]
[Demangle][Rust] Parse slice type
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102578
Tomasz Miąsko [Tue, 18 May 2021 16:14:02 +0000 (18:14 +0200)]
[Demangle][Rust] Parse array type
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102573
Tomasz Miąsko [Tue, 18 May 2021 16:13:21 +0000 (18:13 +0200)]
[Demangle][Rust] Parse named types
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102571
Peter Collingbourne [Tue, 18 May 2021 19:57:19 +0000 (12:57 -0700)]
scudo: Test realloc on increasing size buffers.
While developing a change to the allocator I ended up breaking
realloc on secondary allocations with increasing sizes. That didn't
cause any of the unit tests to fail, which indicated that we're
missing some test coverage here. Add a unit test for that case.
Differential Revision: https://reviews.llvm.org/D102716
Sanjay Patel [Tue, 18 May 2021 20:08:28 +0000 (16:08 -0400)]
[x86] add FMF propagation test for target-specific intrinsic; NFC
Sanjay Patel [Tue, 18 May 2021 18:02:11 +0000 (14:02 -0400)]
[x86] trim zeros from constants for readability; NFC
River Riddle [Tue, 18 May 2021 21:31:33 +0000 (14:31 -0700)]
[mlir] Allow derived rewrite patterns to define a non-virtual `initialize` hook
This is a hook that allows for providing custom initialization of the pattern, e.g. if it has bounded recursion, setting the debug name, etc., without needing to define a custom constructor. A non-virtual hook was chosen to avoid polluting the vtable with code that we really just want to be inlined when constructing the pattern. The alternative would be to just define a constructor for each pattern, but this unfortunately creates a lot of otherwise unnecessary boilerplate for a lot of patterns; a hook provides a much simpler/cleaner interface for the very common case.
Differential Revision: https://reviews.llvm.org/D102440
River Riddle [Tue, 18 May 2021 21:31:22 +0000 (14:31 -0700)]
[mlir-docs] Add a blurb on recursion during pattern application
We currently do not document how the pattern rewriter infra treats recursion when it gets detected. This revision adds a blurb on recursion in patterns, and how patterns can signal that they are equipped to handle it.
Differential Revision: https://reviews.llvm.org/D102439
Arthur Eubanks [Tue, 18 May 2021 21:38:12 +0000 (14:38 -0700)]
[docs] Fix broken docs after 1c7f32334
Arthur Eubanks [Sun, 2 May 2021 04:27:47 +0000 (21:27 -0700)]
[NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().
A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.
The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.
This is a reland after an MSan fix in D102667.
Differential Revision: https://reviews.llvm.org/D101713
Arthur Eubanks [Tue, 4 May 2021 01:00:50 +0000 (18:00 -0700)]
[TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.
This is a reland after fixing MSan issues in D102667.
[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D101806
Arthur Eubanks [Tue, 18 May 2021 05:11:06 +0000 (22:11 -0700)]
[MSan] Set zeroext on call arguments to msan functions with zeroext parameter attribute
ABI attributes need to match between the caller and callee.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D102667
Konstantin Zhuravlyov [Tue, 18 May 2021 20:56:23 +0000 (16:56 -0400)]
AMDGPU/Docs: Remove reserved MACH 0x3E (it is no longer reserved), sort MACHs by value
Neumann Hon [Tue, 18 May 2021 19:02:11 +0000 (15:02 -0400)]
[SystemZ] [z/OS] Add XPLINK64 Calling Convention to SystemZ
This patch adds the XPLINK64 calling convention to the SystemZ
backend. It specifies and implements the argument passing and
return value conventions.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D101010
Martin Storsjö [Fri, 14 May 2021 20:34:51 +0000 (23:34 +0300)]
[compiler-rt] [builtins] Provide a SEH specific __gcc_personality_seh0
This matches how __gxx_personality_seh0 is hooked up in libcxxabi.
Differential Revision: https://reviews.llvm.org/D102530
Arthur Eubanks [Thu, 6 May 2021 23:30:39 +0000 (16:30 -0700)]
[NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.
SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102032
Mateusz Mikuła [Tue, 18 May 2021 20:36:50 +0000 (23:36 +0300)]
[MinGW] Fix the cmake condition for -mbig-obj
This is a correction to D102419, fixing the condition to the
form that actually works as intended.
Arthur Eubanks [Thu, 13 May 2021 22:44:21 +0000 (15:44 -0700)]
[OpaquePtr] Make loads and stores work with opaque pointers
Don't check that types match when the pointer operand is an opaque
pointer.
I would separate the Assembler and Verifier changes, but
verify-uselistorder in the Assembler test ends up running the verifier.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102450
Petr Hosek [Tue, 18 May 2021 19:59:57 +0000 (12:59 -0700)]
[CMake] Use -O0 for unittests under full LTO as well
We already use -O0 for unittests under ThinLTO; do the same for full LTO,
where the tradeoff between compile-time cost and runtime benefit is even worse.
Differential Revision: https://reviews.llvm.org/D102718
Reid Kleckner [Tue, 18 May 2021 19:34:02 +0000 (12:34 -0700)]
[PDB] Improve error handling when writes fail
Handle PDB writing errors like any other error in LLD: emit an error and
continue. This allows the linker to print timing data and summary data
after linking, which can be helpful for finding PDB size problems. Also
report how large the file would have been.
Example output:
lld-link: error: Output data is larger than 4 GiB. File size would have been 6,937,108,480
lld-link: error: failed to write PDB file ./chrome.dll.pdb
Summary
--------------------------------------------------------------------------------
33282 Input OBJ files (expanded from all cmd-line inputs)
4 PDB type server dependencies
0 Precomp OBJ dependencies
33396931 Input type records
... snip ...
Input File Reading: 59756 ms ( 45.5%)
GC: 7500 ms ( 5.7%)
ICF: 3336 ms ( 2.5%)
Code Layout: 6329 ms ( 4.8%)
PDB Emission (Cumulative): 46192 ms ( 35.2%)
Add Objects: 27609 ms ( 21.0%)
Type Merging: 16740 ms ( 12.8%)
Symbol Merging: 10761 ms ( 8.2%)
Publics Stream Layout: 9383 ms ( 7.1%)
TPI Stream Layout: 1678 ms ( 1.3%)
Commit to Disk: 3461 ms ( 2.6%)
--------------------------------------------------
Total Link Time: 131244 ms (100.0%)
Differential Revision: https://reviews.llvm.org/D102713
River Riddle [Tue, 18 May 2021 19:57:36 +0000 (12:57 -0700)]
[mlir-lsp-server] Add support for recording text document versions
The version is used by LSP clients to ignore stale diagnostics, and can be used in a followup to help verify incremental changes.
Differential Revision: https://reviews.llvm.org/D102644
Sam Clegg [Tue, 18 May 2021 18:08:32 +0000 (11:08 -0700)]
[lld][WebAssembly] Convert test to assembly. NFC.
Differential Revision: https://reviews.llvm.org/D102704
Simon Pilgrim [Tue, 18 May 2021 19:25:42 +0000 (20:25 +0100)]
[X86][AVX] createVariablePermute - correctly extend same-sized-vector indices (PR50356)
D101838 incorrectly handled index vectors of the same size but with higher element counts by just bitcasting to the target indices type instead of performing a ZERO_EXTEND_VECTOR_INREG.
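The difference between the two conversions can be seen in a small lane-level model. This is a sketch assuming little-endian lane packing as on x86; the Python helpers are hypothetical stand-ins for the DAG nodes, not LLVM APIs:

```python
def bitcast_to_wider(elems, src_bits, dst_bits):
    """Reinterpret packed little-endian lanes as fewer, wider lanes."""
    ratio = dst_bits // src_bits
    lane_mask = (1 << src_bits) - 1
    out = []
    for i in range(0, len(elems), ratio):
        value = 0
        for j, e in enumerate(elems[i:i + ratio]):
            value |= (e & lane_mask) << (j * src_bits)
        out.append(value)
    return out


def zero_extend_vector_inreg(elems, src_bits, dst_bits):
    """Zero-extend the low lanes so the result has the same total width."""
    ratio = dst_bits // src_bits
    count = len(elems) // ratio
    lane_mask = (1 << src_bits) - 1
    return [e & lane_mask for e in elems[:count]]


# Eight i16 indices converted to four i32 indices (both 128 bits wide).
indices = [3, 0, 2, 1, 7, 7, 7, 7]
print(bitcast_to_wider(indices, 16, 32))
print(zero_extend_vector_inreg(indices, 16, 32))
```

The bitcast fuses neighbouring indices into one wide lane, so the resulting values are no longer valid permute indices, while the in-register zero extension preserves the low indices one per lane.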
Sam Clegg [Wed, 12 May 2021 23:48:34 +0000 (16:48 -0700)]
[lld][WebAssembly] Enable string tail merging in debug sections
This is a followup to https://reviews.llvm.org/D97657 which
applied string tail merging to data segments.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48828
Differential Revision: https://reviews.llvm.org/D102436
Vassil Vassilev [Tue, 18 May 2021 17:53:54 +0000 (17:53 +0000)]
[clang-repl] Better match the underlying architecture.
In cases where -fno-integrated-as is specified we should overwrite the
EmitAssembly action as well.
We also should rely on the target triple from the process at least until we
implement out-of-process execution.
This patch should improve clang-repl on AIX.
Discussion available at: https://reviews.llvm.org/D96033
Differential revision: https://reviews.llvm.org/D102688
Konstantin Zhuravlyov [Tue, 18 May 2021 19:10:53 +0000 (15:10 -0400)]
AMDGPU/NFC: Replace EF_AMDGPU_MACH_AMDGCN_RESERVED_0X3E with EF_AMDGPU_MACH_AMDGCN_GFX1034
Differential Revision: https://reviews.llvm.org/D102708
Simon Pilgrim [Tue, 18 May 2021 18:30:49 +0000 (19:30 +0100)]
[X86][AVX] Add variable-permute test case from PR50356
Rafael Auler [Tue, 18 May 2021 00:18:15 +0000 (17:18 -0700)]
[RuntimeDyld] Add allowStubs/allowZeroSyms
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds to MemoryManager a new member
function allowStubAllocation to control whether this MemoryManager
supports increasing code size with stubs or not. Since BOLT can
rewrite some files in-place, it needs to avoid stub insertion done
by the linker. This patch also introduces allowsZeroSymbols to the
JITSymbolResolver class, enabling us to finish a link successfully
even when some symbols resolve to the value zero. When rewriting a
binary, sometimes we do need to resolve a target to zero in case
the input binary calls address zero and we want to be bug
compatible. We also expose reassignSectionAddress as it is used by
BOLT.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97898
peter klausler [Mon, 17 May 2021 21:10:02 +0000 (14:10 -0700)]
[flang] Accept OPEN(ACCESS='APPEND') legacy extension even without warnings enabled
My earlier patch to accept ACCESS='APPEND' only worked when warnings
were enabled; fix it.
Differential Revision: https://reviews.llvm.org/D102653
Nikita Popov [Mon, 17 May 2021 21:37:14 +0000 (23:37 +0200)]
[LICM] Remove MaybePromotable set (PR50367)
The MaybePromotable set keeps track of loads/stores for which
promotion was not attempted yet. Normally, any load/stores that
are promoted in the current iteration will be removed from this
set, because they naturally MustAlias with the promoted value.
However, if the source program has UB with metadata claiming that
a store is NoAlias, while it is actually MustAlias, and multiple
different pointers are promoted in the same iteration, it can
happen that a store is removed that is still in the MaybePromotable
set, causing a use-after-free.
While this could be fixed by explicitly invalidating values in
MaybePromotable in the LoopPromoter, I'm going with the more
radical option of dropping the set entirely here and check all
load/stores on each promotion iteration. As promotion, and especially
repeated promotion, are quite rare, this doesn't seem to have any
impact on compile-time.
Fixes https://bugs.llvm.org/show_bug.cgi?id=50367.
peter klausler [Mon, 17 May 2021 21:06:44 +0000 (14:06 -0700)]
[flang] Implement MATMUL in the runtime
Define an API for the transformational intrinsic function MATMUL,
implement it, and add some basic unit tests. The large number of
possible argument type combinations are covered by a set of
generalized templates that are instantiated for each valid
pair of possible argument types.
Places where BLAS-2/3 routines could be called for acceleration
are marked with TODOs. Handling for other special cases (e.g.,
known-shape 3x3 matrices and vectors) are deferred.
Some minor tweaks were made to the recent related implementation
of DOT_PRODUCT to reflect lessons learned.
Differential Revision: https://reviews.llvm.org/D102652
Fangrui Song [Tue, 18 May 2021 17:57:24 +0000 (10:57 -0700)]
[Driver] Delete -mimplicit-it=
This is a GNU as and Clang cc1as option, not a GCC option.
Users should specify `-Wa,-mimplicit-it=` instead.
Note: mixing the -m option and the -Wa, option doesn't work:
`-Wa,-mimplicit-it=never -mimplicit-it=always` =>
`clang (LLVM option parsing): for the --arm-implicit-it option: may only occur zero or one times!`
Reviewed By: nickdesaulniers, raj.khem
Differential Revision: https://reviews.llvm.org/D102568
Nico Weber [Mon, 17 May 2021 17:49:17 +0000 (13:49 -0400)]
[lld/mac] Correctly set nextdefsym
In LC_DYSYMTAB, private externs were still emitted as exported symbols instead
of as locals.
Fixes PR50373. See bug for details.
Differential Revision: https://reviews.llvm.org/D102662
Chris Lattner [Tue, 18 May 2021 17:23:01 +0000 (10:23 -0700)]
[IntegerAttr] Add helpers for working with LLVM's APSInt type.
The FIRRTL dialect in CIRCT uses inherently signful types, and APSInt
is the best way to model that. Add a couple of helpers that make it
easier to work with an IntegerAttr that carries a sign.
This follows the example of getZExt() and getSExt() which assert when
the underlying type of the attribute is unexpected. In this case
we assert fail when the underlying type of the attribute is signless.
This is strictly additive, so it is NFC. It is tested in the CIRCT
repo.
Differential Revision: https://reviews.llvm.org/D102701
Arthur Eubanks [Tue, 18 May 2021 17:39:12 +0000 (10:39 -0700)]
[NFC] Format PassesBindingsTests CMake like other unittests
Sanjay Patel [Tue, 18 May 2021 17:28:31 +0000 (13:28 -0400)]
[InstCombine] restrict funnel shift match to avoid miscompile
As noted in the post-commit discussion for:
https://reviews.llvm.org/rGabd7529625a73f405e40a63dcc446c41d51a219e
...that change exposed a logic hole that allows a miscompile
if the shift amount could exceed the narrow width:
https://alive2.llvm.org/ce/z/-i_CiM
https://alive2.llvm.org/ce/z/NaYz28
The restriction isn't necessary for a rotate (same operand for
both shifts), so we should adjust the matching for the shift
value as a follow-up enhancement:
https://alive2.llvm.org/ce/z/ahuuQb
Arthur Eubanks [Tue, 18 May 2021 16:33:50 +0000 (09:33 -0700)]
[test] Speculative fix for bots (round 2)
Bot has error "Failed to create target from default triple: Unable to
find target for this triple (no targets are registered)", likely because
we only initialized the native target, not the registered target if it's
different.
https://lab.llvm.org/buildbot/#/builders/86/builds/13664
Arthur Eubanks [Tue, 18 May 2021 17:25:35 +0000 (10:25 -0700)]
[gn build] Rename PassesBindingsTests and add it to unittests
Sanjay Patel [Tue, 18 May 2021 12:01:10 +0000 (08:01 -0400)]
[InstCombine] add tests for funnel shift miscompile; NFC
Chris Lattner [Sun, 16 May 2021 05:18:16 +0000 (22:18 -0700)]
[IR] Add a Location to BlockArgument.
This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).
This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).
The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode). This is complete for generic operations, but requires manual
adoption for custom ops.
I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor. I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.
Differential Revision: https://reviews.llvm.org/D102567
Arthur Eubanks [Tue, 18 May 2021 17:12:51 +0000 (10:12 -0700)]
Revert "[test] Speculative fix for bots"
This reverts commit 5c291482ec8bcd686044ebc0d4cffe7bf769521c.
unittests/Passes/CMakeFiles/PassesBindingsTests.dir/PassBuilderBindingsTest.cpp.o: In function `PassBuilderCTest::SetUp()':
PassBuilderBindingsTest.cpp:(.text._ZN16PassBuilderCTest5SetUpEv[_ZN16PassBuilderCTest5SetUpEv]+0x28): undefined reference to `LLVMInitializeARMTargetInfo'
Simon Pilgrim [Tue, 18 May 2021 17:06:14 +0000 (18:06 +0100)]
[X86] Use Skylake Server model for x86-64-v4 so we have full instruction coverage
The x86-64-v4 generic cpu arch supports AVX512BW/DQ/CD/VLX which isn't covered by the Haswell model, use the SkylakeServer model instead which is a lot closer to what the arch represents.
Differential Revision: https://reviews.llvm.org/D102553
Arthur Eubanks [Tue, 18 May 2021 16:33:50 +0000 (09:33 -0700)]
[test] Speculative fix for bots
Bot has error "Failed to create target from default triple: Unable to
find target for this triple (no targets are registered)", likely because
we only initialized the native target, not the registered target if it's
different.
https://lab.llvm.org/buildbot/#/builders/86/builds/13664
Arthur Eubanks [Tue, 18 May 2021 17:00:54 +0000 (10:00 -0700)]
[gn build] Add target for PassesBindingsTest
Jessica Paquette [Mon, 17 May 2021 23:00:53 +0000 (16:00 -0700)]
[AArch64][GlobalISel] Prefer mov for s32->s64 G_ZEXT
We can use an ORRWrs (mov) + SUBREG_TO_REG rather than a UBFX for G_ZEXT on
s32->s64.
This more closely matches what SDAG does, and is likely more power efficient etc.
(Also fixed up arm64-rev.ll which had a fallback check line which was entirely
useless.)
Simple example: https://godbolt.org/z/h1jKKdx5c
Differential Revision: https://reviews.llvm.org/D102656
Roman Lebedev [Tue, 18 May 2021 16:53:02 +0000 (19:53 +0300)]
[X86] AMD Zen 3: fix MULX modelling - don't forget about WriteIMulH (PR50387)
Otherwise lack thereof will be caught by a defensive check during
scheduling, and we'll crash.
I've literally never seen this syntax before..
Vinayaka Bandishti [Tue, 18 May 2021 12:28:27 +0000 (17:58 +0530)]
[MLIR][Affine] Privatize certain escaping memrefs
During affine loop fusion, create private memrefs for escaping memrefs
too under the conditions that:
-- the source is not removed after fusion, and
-- the destination does not write to the memref.
This creates more fusion opportunities as illustrated in the test case.
Reviewed By: bondhugula, ayzhuang
Differential Revision: https://reviews.llvm.org/D102604
Aaron Ballman [Tue, 18 May 2021 16:42:52 +0000 (12:42 -0400)]
Speculatively fix failing tests from 6381664580080f015bc0c2ec647853f697cf744a
This was causing some Mac-specific build failures:
http://45.33.8.238/macm1/9739/step_7.txt
http://45.33.8.238/mac/31615/step_7.txt
As best I can tell with psychic debugging, the /Users/blah path to the
source file is being treated as a macro undef with the clang-cl driver.
This splits the filename off explicitly so hopefully the rest of the
command line arguments will be read properly.
Jessica Paquette [Fri, 14 May 2021 23:53:52 +0000 (16:53 -0700)]
[GlobalISel] Simplify G_ICMP to true/false when the result is known
Use existing KnownBits helpers from KnownBits.h to simplify G_ICMPs.
E.g.
x == x -> true
x != x -> false
load(x) > 1 -> true (when the load is known to be greater than 1)
And so on.
Differential Revision: https://reviews.llvm.org/D102542
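A sketch of how known-bits information can decide an unsigned greater-than compare. The Python class below only mirrors the idea of tracking known-zero and known-one masks; its names are illustrative and it is not the KnownBits.h API:

```python
from dataclasses import dataclass


@dataclass
class KnownBits:
    zero: int   # mask of bits known to be 0
    one: int    # mask of bits known to be 1
    width: int

    def min_value(self):
        # Smallest possible value: every unknown bit is 0.
        return self.one

    def max_value(self):
        # Largest possible value: every unknown bit is 1.
        return ~self.zero & ((1 << self.width) - 1)


def fold_ugt(lhs, rhs):
    """Return True/False if lhs >u rhs is decided, else None."""
    if lhs.min_value() > rhs.max_value():
        return True
    if lhs.max_value() <= rhs.min_value():
        return False
    return None


one_const = KnownBits(zero=0xFE, one=0x01, width=8)    # the constant 1
bit2_set = KnownBits(zero=0x00, one=0x04, width=8)     # value is at least 4
print(fold_ugt(bit2_set, one_const))
```

When neither range check succeeds, the compare has to be left in place.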
Sergey Dmitriev [Tue, 18 May 2021 15:43:51 +0000 (08:43 -0700)]
[clang-offload-bundler] Add sections and set section flags using one llvm-objcopy invocation
llvm-objcopy has been changed to support adding a section and updating section flags
in one run (D90438), so we can now change clang-offload-bundler to run the llvm-objcopy
tool only once when creating a fat object.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D102670
Lang Hames [Tue, 18 May 2021 04:38:17 +0000 (21:38 -0700)]
[ORC-RT] Add apply_tuple utility.
This is a substitute for std::apply, which we can't use until we move to C++17.
apply_tuple will be used in the upcoming wrapper-function utils code.
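For readers unfamiliar with the technique, a C++14 replacement for std::apply
is usually built on std::index_sequence. The following is only a sketch of
that general idiom, not the code added by this commit; the sketch namespace,
apply_tuple_impl and add3 are names assumed for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {
namespace detail {

// Expand the tuple into individual call arguments using a compile-time
// pack of indices 0..N-1.
template <typename F, typename Tuple, std::size_t... Is>
decltype(auto) apply_tuple_impl(F &&f, Tuple &&t,
                                std::index_sequence<Is...>) {
  return std::forward<F>(f)(std::get<Is>(std::forward<Tuple>(t))...);
}

} // namespace detail

// Call f with the elements of t as arguments (C++14 only).
template <typename F, typename Tuple>
decltype(auto) apply_tuple(F &&f, Tuple &&t) {
  using Indices =
      std::make_index_sequence<std::tuple_size<std::decay_t<Tuple>>::value>;
  return detail::apply_tuple_impl(std::forward<F>(f), std::forward<Tuple>(t),
                                  Indices{});
}

// Small helper used to exercise the sketch.
inline int add3(int a, int b, int c) { return a + b + c; }

} // namespace sketch
```

With this in place, sketch::apply_tuple(sketch::add3, std::make_tuple(1, 2, 3))
calls add3(1, 2, 3).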
Lang Hames [Tue, 18 May 2021 02:51:42 +0000 (19:51 -0700)]
[ORC-RT] Add compiler abstraction header for the ORC runtime.
This header provides helper macros to insulate the rest of the ORC runtime from
compiler specifics.
Lang Hames [Mon, 17 May 2021 19:28:46 +0000 (12:28 -0700)]
[ORC] Don't try to obtain a ref to a non-existent buffer.
Aaron Ballman [Tue, 18 May 2021 14:32:22 +0000 (10:32 -0400)]
Introduce SYCL 2020 mode
Currently, we have support for SYCL 1.2.1 (also known as SYCL 2017).
This patch introduces the start of support for SYCL 2020 mode, which is
the latest SYCL standard available at (https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html).
This sets the default SYCL to be 2020 in the driver, and introduces the
notion of a "default" version (set to 2020) when cc1 is in SYCL mode
but there was no explicit -sycl-std= specified on the command line.
Tim Northover [Tue, 12 Jan 2021 13:12:40 +0000 (13:12 +0000)]
Recommit X86: support Swift Async context
This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented frame record (populated in IR, but generally containing
enhanced backtrace for coroutines using lots of tail calls back and forth).
The context frame is identical to AArch64 (primarily so that unwinders etc
don't get extra complexity). Specifically, the new frame record is [AsyncCtx,
%rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp
being set to 1. %rbp still points to the frame pointer in memory for backwards
compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest
of the record is because of x86).
Recommitted with a fix for unwind info when i386 pc-rel thunks are
adjacent to a prologue.
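To make the bit-60 marker in the frame record described above concrete, here
is a small sketch of how a frame walker might test for it. This is an
illustration only: AsyncContextBit, hasAsyncContext and stripAsyncBit are
hypothetical names, and the idea that a consumer clears the bit to recover the
plain frame pointer is an assumption, not something stated in this commit.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Bit 60 of the stored %rbp signals the extended frame record
// [AsyncCtx, %rbp, ReturnAddr]. (Constant name is hypothetical.)
constexpr std::uint64_t AsyncContextBit = std::uint64_t{1} << 60;

// True if the stored frame pointer value carries the async marker.
constexpr bool hasAsyncContext(std::uint64_t StoredRbp) {
  return (StoredRbp & AsyncContextBit) != 0;
}

// Assumption: clearing the marker bit yields the ordinary frame pointer.
constexpr std::uint64_t stripAsyncBit(std::uint64_t StoredRbp) {
  return StoredRbp & ~AsyncContextBit;
}

} // namespace sketch
```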
Jinsong Ji [Tue, 18 May 2021 14:03:26 +0000 (14:03 +0000)]
[DebugInfo][test] Check specific func name to ignore codegen differences
We use `CHECK-LABEL: define` to divide the input stream into functions;
this works well on most platforms.
But there are cases where some platforms (e.g. AIX) may have different
codegen, especially for global constructors and destructors.
On AIX, the codegen will have two more functions, __dtor_b and
__finalize_b, which will fail the test.
The fix is to use specific function names so that we can safely ignore
those unrelated codegen differences.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102654
Raphael Isemann [Tue, 18 May 2021 08:30:54 +0000 (10:30 +0200)]
[ADT] Remove StringRef::withNullAsEmpty
A long time ago LLDB wanted to start using StringRef instead of
C-Strings/ConstString but was blocked by the StringRef(const char *) ctor
asserting that the C-string isn't a nullptr. To work around this, D24697
introduced a special function called withNullAsEmpty and that's what LLDB (and
only LLDB) started to use to build StringRefs from C-strings.
A bit later it seems that withNullAsEmpty was declared too awkward to use and
instead the assert in the StringRef constructor got removed (see D24904). The
rest of LLDB was then converted to StringRef by just calling the now perfectly
usable implicit constructor.
However, it seems that the original approach with withNullAsEmpty was never
touched again since then and now just exists as a function in StringRef that
is only used in a few places in LLDB.
I removed the few uses of withNullAsEmpty in D102597 and this patch removes
the function itself. Calling the implicit StringRef(const char *) constructor
is the preferred way of doing this today.
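The behaviour relied on here (an implicit constructor from const char * that
treats a null pointer as the empty string) can be sketched with a toy class.
MiniStringRef and lengthOf below are hypothetical stand-ins written for this
example; they are not the llvm::StringRef implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

// Toy non-owning string reference. (Hypothetical; for illustration only.)
class MiniStringRef {
  const char *Data = nullptr;
  std::size_t Length = 0;

public:
  MiniStringRef() = default;

  // Implicit on purpose: a null C-string becomes an empty reference
  // instead of tripping an assertion.
  MiniStringRef(const char *Str)
      : Data(Str), Length(Str ? std::strlen(Str) : 0) {}

  bool empty() const { return Length == 0; }
  std::size_t size() const { return Length; }
  const char *data() const { return Data; }
};

// Callers can pass a possibly-null C-string directly.
inline std::size_t lengthOf(MiniStringRef S) { return S.size(); }

} // namespace sketch
```

With a constructor like this, a separate helper in the style of
withNullAsEmpty has nothing left to do, which is the situation this patch
cleans up.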
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D102599
Roman Lebedev [Tue, 18 May 2021 12:55:02 +0000 (15:55 +0300)]
[X86] AMD Zen 3: cap LoopMicroOpBufferSize to workaround PR50384 (quadratic IndVars runtime)
While I would like to keep the right value here,
I would also like to be able to actually compile
e.g. vanilla test-suite.
256 is a fairly arbitrary guess; it should be large enough
for serious loops, but small enough to result in tolerable
compile times for certain edge cases.
https://bugs.llvm.org/show_bug.cgi?id=50384
Kristina Bessonova [Sun, 18 Apr 2021 16:46:46 +0000 (18:46 +0200)]
[libcxx][test] Attempt to make debug mode tests more bulletproof
The problem with debug mode tests is that it isn't known which particular
_LIBCPP_ASSERT causes the test to exit, and as shown by
https://reviews.llvm.org/D100029 and 2908eb20ba7 it might not be the
expected one.
The patch adds a TEST_LIBCPP_ASSERT_FAILURE macro that allows checking the
_LIBCPP_ASSERT message to ensure we caught the expected failure.
Reviewed By: Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D100595
David Sherwood [Tue, 18 May 2021 10:42:48 +0000 (11:42 +0100)]
[NFC] Removed unused VFInfo comparison operator
Martin Storsjö [Tue, 18 May 2021 12:05:53 +0000 (15:05 +0300)]
[LLD] [MinGW] Pass the canExitEarly parameter through properly
The MinGW driver passed a hardcoded true to this parameter
since 6f4e255219f2a7878d3, but when the MinGW driver got the
canExitEarly parameter for consistency in b11386f9be9b2dc7276, this
call was missed so it wasn't passed on properly.
Simon Pilgrim [Tue, 18 May 2021 12:06:57 +0000 (13:06 +0100)]
[X86][AVX] Cleanup AVX2 vector integer truncation costs
Noticed while investigating PR50364, the truncation costs for v4i64->v4i16/v4i8 and v8i32->v8i8 were way too optimistic for a shuffle sequence that usually matches the AVX1 codegen (they matched AVX512 numbers which have actual truncation instructions!).
Nico Weber [Tue, 18 May 2021 00:53:55 +0000 (20:53 -0400)]
[lld/mac] Propagate -(un)exported_symbol(s_list) to privateExtern in Driver
That way, it's done only once instead of every time shouldExportSymbol() is
called.
Possibly a bit faster:
    % ministat at_main at_symtodo
    x at_main
    + at_symtodo
        N           Min           Max        Median           Avg        Stddev
    x  30     3.9732189      4.114846      4.024621     4.0304692   0.037022865
    +  30       3.93766     4.0510042     3.9973931      3.991469   0.028472565
    Difference at 95.0% confidence
            -0.0390002 +/- 0.0170714
            -0.967635% +/- 0.423559%
            (Student's t, pooled s = 0.0330256)
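For reference, the figures in the ministat summary above can be rechecked by
hand with the usual pooled two-sample formulas. The sketch below is only an
illustration of that arithmetic; Sample, pooledStddev, meanDifference and
halfWidth are names made up for the example, and the critical value 2.002
(roughly the two-sided 95% Student's t value for 58 degrees of freedom) is an
assumption rather than something printed by ministat.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Summary statistics for one ministat data set.
struct Sample {
  int N;
  double Mean;
  double Stddev;
};

// Values copied from the table above.
constexpr Sample AtMain{30, 4.0304692, 0.037022865};
constexpr Sample AtSymtodo{30, 3.991469, 0.028472565};

// Pooled standard deviation of two samples.
inline double pooledStddev(const Sample &A, const Sample &B) {
  double Num = (A.N - 1) * A.Stddev * A.Stddev +
               (B.N - 1) * B.Stddev * B.Stddev;
  return std::sqrt(Num / (A.N + B.N - 2));
}

// Difference of means, second sample minus first.
inline double meanDifference(const Sample &A, const Sample &B) {
  return B.Mean - A.Mean;
}

// Half-width of the confidence interval for a given critical t value.
inline double halfWidth(const Sample &A, const Sample &B, double TCrit) {
  return TCrit * pooledStddev(A, B) * std::sqrt(1.0 / A.N + 1.0 / B.N);
}

} // namespace sketch
```

Under those assumptions the pooled s, the difference of about -0.039 s and the
+/- 0.017 s interval come out very close to the reported values.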
In other runs with n=30 it makes no perf difference, so maybe it's just noise.
But being able to quickly and conveniently answer "is this symbol exported?"
is useful for fixing PR50373 and for implementing -dead_strip, so this seems
like a good change regardless.
No behavior change.
Differential Revision: https://reviews.llvm.org/D102661
Florian Hahn [Fri, 14 May 2021 22:33:27 +0000 (23:33 +0100)]
[LV] Add test which sinks a load across an aliasing store.
Simon Pilgrim [Tue, 18 May 2021 11:24:59 +0000 (12:24 +0100)]
[CostModel][X86] Add scalar truncation cost checks
Ensure these are all zero
Simon Pilgrim [Tue, 18 May 2021 11:20:19 +0000 (12:20 +0100)]
[CostModel][X86] Add missing check prefixes from cast.ll
We have checks for these but no actual RUNs were using them
Simon Pilgrim [Tue, 18 May 2021 11:15:38 +0000 (12:15 +0100)]
Revert rGd70cbd1ce9b426f2c7e0e0f900769bbcbb300a95 "[AMDGPU] Regenerate wave32.ll tests"
This is failing on buildbots but not locally - not sure why
Simon Pilgrim [Tue, 18 May 2021 11:03:13 +0000 (12:03 +0100)]
[AMDGPU] Regenerate wave32.ll tests
Keep the manual GFX10DEFWAVE checks for VGPRBlocks
Alexey Bader [Tue, 13 Apr 2021 14:10:15 +0000 (17:10 +0300)]
[SYCL] Enable `opencl_global_[host,device]` attributes for SYCL
Differential Revision: https://reviews.llvm.org/D100396
Nicolas Vasilache [Tue, 18 May 2021 10:07:03 +0000 (10:07 +0000)]
[mlir][Linalg] Drop spuriously long matmul_column_major benchmark