Oleg Shyshkov [Wed, 19 Oct 2022 09:42:25 +0000 (11:42 +0200)]
[mlir] Add TransposeOp to Linalg structured ops.
RFC: https://discourse.llvm.org/t/rfc-primitive-ops-add-mapop-reductionop-transposeop-broadcastop-to-linalg/64184
Differential Revision: https://reviews.llvm.org/D135854
Florian Hahn [Wed, 19 Oct 2022 10:24:10 +0000 (11:24 +0100)]
[SCEV] Replace assert with returning CouldNotComp in computeMaxBECountForLT.
This patch removes the bail out for signed predicates and non-positive
strides in howManyLessThans and updates computeMaxBECountForLT to return
SCEVCouldNotCompute for signed predicates with negative strides.
AFAICT the bail-out was only added because computeMaxBECountForLT may not
handle negative signed strides correctly. Instead of not calling
computeMaxBECountForLT at all because we bail out earlier, we can
instead return SCEVCouldNotCompute in computeMaxBECountForLT.
The max backedge taken count will be computed as the max value of the
symbolic backedge taken count.
This improves precision in cases where we can compute symbolic backedge
taken counts and also fixes a crash.
Fixes #57818.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D135667
bipmis [Wed, 19 Oct 2022 10:22:58 +0000 (11:22 +0100)]
[AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.
Differential Revision: https://reviews.llvm.org/D135137
Serge Pavlov [Wed, 19 Oct 2022 10:19:04 +0000 (17:19 +0700)]
Keep configuration file search directories in ExpansionContext. NFC
Class ExpansionContext encapsulates options for search and expansion of
response files, including configuration files. With this change the
directories which are searched for configuration files are also stored
in ExpansionContext.
Differential Revision: https://reviews.llvm.org/D135439
Simon Pilgrim [Wed, 19 Oct 2022 10:18:39 +0000 (11:18 +0100)]
[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1
Helps with some of the AMDGPU regressions identified in D136042, where we were losing signed BFE patterns after sinking shifts behind logic ops.
Differential Revision: https://reviews.llvm.org/D136081
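The fold in the title can be sanity-checked with plain integer arithmetic. The following is only a minimal sketch on a small, assumed bit width (`WIDTH = 6`); the helper names are hypothetical and none of this is the DAGCombiner code itself.

```python
WIDTH = 6  # small bit width so the exhaustive check stays fast (an assumption)

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def before_fold(x, y, c1, c2, width=WIDTH):
    """Models (sra (or (shl x, c1), (shl y, c2)), c1)."""
    mask = (1 << width) - 1
    ored = ((x << c1) | (y << c2)) & mask
    # >> on a negative Python int is an arithmetic shift.
    return to_signed(ored, width) >> c1

def after_fold(x, y, c1, c2, width=WIDTH):
    """Models (sext_inreg (or x, (shl y, c2 - c1))) on the low width - c1 bits."""
    return to_signed(x | (y << (c2 - c1)), width - c1)

def fold_holds(width=WIDTH):
    """Exhaustively compare both forms for every x, y and every c2 >= c1."""
    values = range(1 << width)
    return all(
        before_fold(x, y, c1, c2, width) == after_fold(x, y, c1, c2, width)
        for c1 in range(1, width)
        for c2 in range(c1, width)
        for x in values
        for y in values
    )
```

The `c2 >= c1` condition shows up directly in the sketch: `c2 - c1` is used as a shift amount, so it must not be negative.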
Jay Foad [Wed, 19 Oct 2022 09:32:08 +0000 (10:32 +0100)]
[AMDGPU] Assume getDefIgnoringCopies will succeed. NFC.
getDefIgnoringCopies and getSrcRegIgnoringCopies should not fail on
valid MIR, so don't bother to check for failure.
Differential Revision: https://reviews.llvm.org/D136238
Tobias Gysi [Wed, 19 Oct 2022 09:48:45 +0000 (12:48 +0300)]
[mlir][llvm] Ordered traversal in LLVM IR import.
The revision performs a topological sort of the blocks to
ensure the operations are processed in dominance order.
After the change, we do not need to introduce dummy
instructions if an operand has not yet been processed.
Additionally, the revision also moves and simplifies the
control-flow related tests to a separate test file.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D136230
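One common way to obtain a block order in which definitions are visited before their uses is a reverse post-order walk of the control-flow graph. The sketch below assumes that approach; the commit only says "topological sort", and the function name and the dictionary-based CFG are hypothetical.

```python
def reverse_post_order(entry, successors):
    """Return blocks in reverse post-order of a depth-first walk from entry.

    successors maps a block name to the list of its successor block names.
    """
    visited = set()
    post_order = []

    def visit(block):
        visited.add(block)
        for succ in successors.get(block, []):
            if succ not in visited:
                visit(succ)
        # A block is recorded only after everything reachable from it.
        post_order.append(block)

    visit(entry)
    return post_order[::-1]

# A diamond-shaped CFG: entry branches to then/else, which both reach merge.
cfg = {
    "entry": ["then", "else"],
    "then": ["merge"],
    "else": ["merge"],
    "merge": [],
}
order = reverse_post_order("entry", cfg)
```

In this order `entry` comes first and `merge` comes after both of its predecessors, so no placeholder is needed for a value defined in `then` or `else` and used in `merge`.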
Caroline Concatto [Wed, 19 Oct 2022 09:43:37 +0000 (10:43 +0100)]
[AArch64] Replace sme-i64 by sme-i16i64 and sme-f64 by sme-f64f64
The names in developer.arm for these SME features are:
HaveSMEI16I64 and HaveSMEF64F64
so the new flag names are consistent with the documentation page
Reviewed By: sdesmalen, c-rhodes
Differential Revision: https://reviews.llvm.org/D135974
Jay Foad [Wed, 19 Oct 2022 09:52:12 +0000 (10:52 +0100)]
[AMDGPU] Add test case for a VOPD s_delay_alu insertion bug
Juan Manuel MARTINEZ CAAMAÑO [Wed, 19 Oct 2022 07:40:22 +0000 (02:40 -0500)]
[AMDGPU][Backend] Fix use-after-free in AMDGPUReleaseVGPRs::isLastVGPRUseVMEMStore
Reviewed By: jpages, arsenm
Differential Revision: https://reviews.llvm.org/D134641
Nikolas Klauser [Wed, 19 Oct 2022 09:07:34 +0000 (11:07 +0200)]
[libc++] Remove std::function in C++03
We've said that we'll remove `std::function` from C++03 in LLVM 16, so we might as well do it now before we forget.
Reviewed By: ldionne, #libc, Mordante
Spies: jloser, Mordante, libcxx-commits
Differential Revision: https://reviews.llvm.org/D135868
Jean Perier [Wed, 19 Oct 2022 09:06:27 +0000 (11:06 +0200)]
[flang] Add fir.declare operation
Add fir.declare operation whose purpose was described in https://reviews.llvm.org/D134285.
It uses the FortranVariableInterfaceOp for most of its logic (including the verifier).
The rationale is that all these aspects/logic will be shared by hlfir.designate and
hlfir.associate.
Its codegen and lowering will be added in later patches.
Differential Revision: https://reviews.llvm.org/D136181
Nikita Popov [Wed, 19 Oct 2022 09:03:54 +0000 (11:03 +0200)]
[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)
Follow up on D135962, renaming the method name to match the new
type name.
Nikita Popov [Wed, 19 Oct 2022 08:42:09 +0000 (10:42 +0200)]
[AA] Rename uses of FunctionModRefBehavior (NFC)
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variable names FMRB/MRB to ME, to match the
new type name.
Nikita Popov [Fri, 14 Oct 2022 14:57:07 +0000 (16:57 +0200)]
[AA] Rename FunctionModRefBehavior to MemoryEffects (NFC)
As part of https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579,
the FunctionModRefBehavior class sees a good bit of additional use,
and I've found the name to be something of a mouthful. This patch
renames it to MemoryEffects, which has a couple of advantages over
the old name:
* It is more concise.
* It decouples it from modelling only functions.
* It matches the terminology of the aforementioned RFC.
* The meaning should be more obvious to people not familiar with
our particular AA lingo.
This patch just updates the class definition. Other uses of the
name will be updated separately.
Differential Revision: https://reviews.llvm.org/D135962
luxufan [Wed, 19 Oct 2022 06:34:05 +0000 (14:34 +0800)]
[RISCV] Enable the LocalStackSlotAllocation pass support
For RISC-V, load/store instructions (excluding vector loads/stores) only
have a 12-bit immediate operand. If an offset is out of range, a
temporary register is needed to materialize it. If several such
offsets are within a small (IsInt<12>) distance of each other, the
LocalStackSlotAllocation pass can materialize one value in a frame base
register and replace each original offset with that register's value
plus the relative offset.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98101
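The idea of sharing one base register among nearby out-of-range offsets can be sketched as follows. This is a deliberately simplified model with hypothetical names (`rebase_offsets`, the greedy choice of base), not the algorithm used by the LocalStackSlotAllocation pass.

```python
def is_int12(value):
    """True if value fits a signed 12-bit immediate."""
    return -2048 <= value <= 2047

def rebase_offsets(offsets):
    """Return (base, relative) pairs for each frame offset, in sorted order.

    base is None when the offset already fits the immediate field.
    Otherwise base is a value assumed to live in a frame base register,
    and relative is the small offset added to it.
    """
    result = []
    base = None
    for off in sorted(offsets):
        if is_int12(off):
            result.append((None, off))
            continue
        if base is None or not is_int12(off - base):
            # Hypothetical greedy choice: start a new base at this offset.
            base = off
        result.append((base, off - base))
    return result

rebased = rebase_offsets([8, 4096, 4100, 6000, 9000])
```

Here the three offsets 4096, 4100 and 6000 share one base value, while 9000 is too far away and gets its own.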
Valentin Clement [Wed, 19 Oct 2022 07:51:33 +0000 (09:51 +0200)]
[flang][NFC] Fix printed name from proc_nopass_p2
Walter Erquinigo [Wed, 19 Oct 2022 07:18:01 +0000 (00:18 -0700)]
[lldb][trace] Fix some minor bugs in the call tree
- We weren't truncating the output files
- We weren't considering the case in which we couldn't disassemble an
instruction.
Valentin Clement [Wed, 19 Oct 2022 07:41:23 +0000 (09:41 +0200)]
[flang] Add fir.dispatch code generation
fir.dispatch code generation uses the binding table stored in the
type descriptor. There is no runtime call involved. The binding table
is always built from the parent type so the index of a specific binding
is the same in the parent derived-type or in the extended type.
Follow-up patches will deal with cases not present here such as allocatable
polymorphic entities or pointers.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D136189
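The property that a binding keeps the same index in the parent and in the extended type can be illustrated with a small model. All names below (`build_binding_table`, the `shape`/`circle` procedures) are hypothetical; the real binding table lives in the Fortran type descriptor and is not a Python list.

```python
def build_binding_table(parent_table, own_bindings):
    """Extend a parent's binding table with a derived type's own bindings.

    Each entry is a (binding_name, procedure_name) pair. Overrides reuse the
    parent's slot; new bindings are appended after the inherited ones.
    """
    table = list(parent_table)
    index_of = {name: i for i, (name, _) in enumerate(table)}
    for name, proc in own_bindings:
        if name in index_of:
            table[index_of[name]] = (name, proc)  # override keeps the slot
        else:
            index_of[name] = len(table)
            table.append((name, proc))
    return table

def dispatch(table, index):
    """Pick the procedure to call from a table slot, with no lookup by name."""
    return table[index][1]

shape_table = build_binding_table([], [("area", "shape_area"), ("name", "shape_name")])
circle_table = build_binding_table(
    shape_table, [("area", "circle_area"), ("radius", "circle_radius")]
)
```

Because slot 0 means "area" in both tables, a call site compiled against the parent type can index either table without a runtime call.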
Kai Luo [Wed, 19 Oct 2022 07:25:44 +0000 (07:25 +0000)]
[include-cleaner] Fix link errors when -DBUILD_SHARED_LIBS=ON
Fixed ppc buildbot https://lab.llvm.org/buildbot/#/builders/121/builds/24273 which is using `-DBUILD_SHARED_LIBS=ON`.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D136229
Jean Perier [Wed, 19 Oct 2022 06:55:02 +0000 (08:55 +0200)]
[flang] Introduce FortranVariableOpInterface for ops creating variable
HLFIR will rely on certain operations to create SSA memory values
that correspond to a Fortran variable. They will hold bounds and type
parameters information as well as metadata (like Fortran attributes).
This patch adds an interface for such operations so that Fortran
variables can be stored, manipulated, and queried regardless of what
created them. This is so far intended for fir.declare, hlfir.designate
and hlfir.associate operations.
It is added to FIR and not HLFIR because fir.declare needs it and it
does not itself need any HLFIR concepts.
Unit tests for the interface methods will be added alongside
fir.declare in the next patch.
Differential Revision: https://reviews.llvm.org/D136151
Lei Zhang [Wed, 19 Oct 2022 05:49:08 +0000 (05:49 +0000)]
[mlir][spirv] Consider target when converting one-element vector
Vectors with just one element will be converted into scalars.
However, we cannot just return the element type and assume it
is supported in the target environment; we need to convert the
element type again factoring in those considerations.
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D136226
Freddy Ye [Wed, 19 Oct 2022 03:21:46 +0000 (11:21 +0800)]
[X86] Add WRMSRNS instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D135935
Craig Topper [Wed, 19 Oct 2022 04:11:42 +0000 (21:11 -0700)]
[RISCV] Add an early out to lowerVECTOR_SHUFFLEAsVSlidedown. NFC
If Mask[0] is 0, then we're never going to match a slidedown. If
we get through the for loop, then it's an identity mask which should
have already been optimized out. Otherwise it's some non-contiguous
mask that will fail out of the loop. Might as well not bother entering
the loop.
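The reasoning above can be modelled on a plain list of mask indices. This is a simplified sketch under the assumption that a slidedown mask has the form `[k, k+1, k+2, ...]` with `k > 0`; `match_slidedown` is a hypothetical helper, and undef lanes and the actual SelectionDAG types are ignored.

```python
def match_slidedown(mask):
    """Return the slide amount if mask is [k, k+1, ...] with k > 0, else None."""
    if not mask or mask[0] <= 0:
        # Early out: starting at 0, a contiguous mask could only be the
        # identity, and any other mask would fail the loop below anyway.
        return None
    for i, index in enumerate(mask):
        if index != mask[0] + i:
            return None  # non-contiguous mask
    return mask[0]
```

For example, `match_slidedown([2, 3, 4, 5])` yields a slide amount of 2, while both the identity mask `[0, 1, 2, 3]` and the non-contiguous mask `[1, 3, 2, 4]` are rejected.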
Maksim Panchenko [Thu, 22 Sep 2022 20:08:05 +0000 (13:08 -0700)]
[BOLT][NFC] Refactor EFMM initialization
Move EFMM initialization code to emitAndLink(), where EFMM is used.
Reviewed By: yavtuk
Differential Revision: https://reviews.llvm.org/D136205
Freddy Ye [Wed, 19 Oct 2022 01:49:35 +0000 (09:49 +0800)]
[X86] Add MSRLIST instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: skan, RKSimon
Differential Revision: https://reviews.llvm.org/D135934
chenglin.bi [Wed, 19 Oct 2022 02:32:32 +0000 (10:32 +0800)]
[MC][COFF] Add COFF section flag "Info"
Currently, we do not parse the section flag `Info` in asm files. When we emit a section with the Info flag to asm and then compile the asm to obj, we lose the Info flag for the section.
The motivation of this change is ARM64EC's hybmp$x section. If we lose the Info flag MSVC link will report a warning:
`warning LNK4078: multiple '.hybmp' sections found with different attributes`
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D136125
Weining Lu [Tue, 18 Oct 2022 23:58:35 +0000 (07:58 +0800)]
Reland "[LoongArch] Fix codegen of atomicrmw nand"
Fix invalid RISCV-like MI being emitted for performing the `not`
operation: the LoongArch `xori` zero-extends the immediate, hence is
not equivalent to RISCV `xori`. The LoongArch `not` is a `nor` with
zero.
Patch by lrzlin (Lin Runze).
Differential Revision: https://reviews.llvm.org/D136021
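The difference between the two `xori` flavours can be shown on plain integers. The sketch below models a 64-bit register and a 12-bit immediate; the function names are hypothetical and only mimic the immediate extension rules described above.

```python
XLEN = 64
REG_MASK = (1 << XLEN) - 1

def sign_extend_12(imm):
    """Sign-extend a 12-bit immediate, as RISC-V does for xori."""
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm

def riscv_xori(x, imm12):
    return (x ^ sign_extend_12(imm12)) & REG_MASK

def loongarch_xori(x, imm12):
    """LoongArch xori zero-extends the 12-bit immediate."""
    return (x ^ (imm12 & 0xFFF)) & REG_MASK

def nor(a, b):
    return ~(a | b) & REG_MASK

x = 0x1234
riscv_not = riscv_xori(x, 0xFFF)       # immediate is -1 after sign extension
loongarch_try = loongarch_xori(x, 0xFFF)  # flips only the low 12 bits
loongarch_not = nor(x, 0)              # the `not` idiom: nor with zero
```

With the sign-extended immediate, `xori x, -1` flips every bit, which is why it works as `not` on RISC-V. With zero extension only the low 12 bits flip, so the sketch uses `nor` with zero instead.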
Chen Zheng [Thu, 13 Oct 2022 01:49:02 +0000 (01:49 +0000)]
[PowerPC] handle more than two predecessors loop header in ctrloop pass
After ISEL, the "valid" loop header which has two predecessors
(one is preheader and the other one is latch) may be transformed
to have more than two predecessors by some optimizations, like tail
duplicator, if the old header's successor(will be changed to new
header) is a sub loop.
The predecessors of the new loop header are preheader, loop latch
and the loop latch(es) of the sub loop(old header's successor).
Before the patch, ctrloop pass assumes two predecessors for candidate
loop header. This patch fixes this case.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D135846
Yuanfang Chen [Wed, 19 Oct 2022 00:19:58 +0000 (17:19 -0700)]
[Clang] constraints partial ordering should work with deduction guide
D128750 incorrectly skips constraints partial ordering for deduction guide.
This patch reverts that part.
Fixes https://github.com/llvm/llvm-project/issues/58456.
Sam Clegg [Mon, 17 Oct 2022 23:26:54 +0000 (16:26 -0700)]
[lld][WebAssembly] Don't allow `--global-base` to be specified in -shared/-pie or --relocatable modes
Add some checks around this combination of flags
Also, honor `--global-base` when specified in `--stack-first` mode
rather than ignoring it. But error out if the specified base precedes
the end of the stack.
Differential Revision: https://reviews.llvm.org/D136117
Weining Lu [Tue, 18 Oct 2022 12:59:59 +0000 (20:59 +0800)]
Revert "[LoongArch] Fix codegen of atomicrmw nand"
This reverts commit 9572406bbcb497f8c23c28daa762b55ee3219f41.
The author name is wrong.
Jan Svoboda [Tue, 18 Oct 2022 03:04:33 +0000 (20:04 -0700)]
[clang][deps] Remove unintentional `move`
This is a fix related to D135414. The original intention was to keep `BaseFS` as a member of the worker and conditionally overlay it with local in-memory FS. The `move` of ref-counted `BaseFS` was not intended, and it's a bug.
Disabling parallelism in the "by-module-name" test reliably reproduces this, and the test itself doesn't *need* parallelism. (I think `-j 4` was cargo culted from another test.) Reusing that test to check for correct behavior...
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D136124
Lang Hames [Tue, 18 Oct 2022 17:53:25 +0000 (10:53 -0700)]
[JITLink] Add convenience methods for creating block readers / writers.
This saves clients some boilerplate compared to setting up the readers and
writers manually.
To obtain a BinaryStreamWriter / BinaryStreamReader for a given block, B,
clients can now write:
auto Reader = G.getBlockContentReader(B);
and
auto Writer = G.getBlockContentWriter(B);
The latter will trigger a copy to mutable memory allocated on the graph's
allocator if the block is currently marked as backed by read-only memory.
This commit also introduces a new createMutableContentBlock overload that
creates a block with a given size and zero-filled content (by default --
passing false for the ZeroInitialize bypasses initialization entirely).
This overload is intended to be used with getBlockContentWriter above when
creating new content for the graph.
Florian Mayer [Tue, 18 Oct 2022 23:19:11 +0000 (16:19 -0700)]
[sanitizer] Let internal symbolizer use toupper and tolower
Xiang Li [Tue, 18 Oct 2022 20:09:01 +0000 (13:09 -0700)]
[HLSL] Add SV_DispatchThreadID
Support SV_DispatchThreadID attribute.
Translate it into dx.thread.id in clang codeGen.
Reviewed By: beanz, aaron.ballman
Differential Revision: https://reviews.llvm.org/D133983
wren romano [Tue, 18 Oct 2022 02:11:20 +0000 (19:11 -0700)]
[mlir][sparse] Removing the DimLvlType and DimLevelFormat types
This removes another massive source of redundancy, and instead has the Merger.{h,cpp} reuse the SparseTensorEnums library.
Depends On D136005
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136123
Quentin Colombet [Wed, 12 Oct 2022 00:53:52 +0000 (00:53 +0000)]
[mlir][MemRef] Move the forwarding patterns for `extract_strided_metadata`
The `SimplifyExtractStridedMetadata` pass features a pattern that forwards
statically known information (offset, sizes, strides) to their respective
users.
This patch moves this pattern from this pass to the
`extract_strided_metadata` folding patterns.
Differential Revision: https://reviews.llvm.org/D135797
Sam McCall [Tue, 18 Oct 2022 17:12:47 +0000 (19:12 +0200)]
[include-cleaner] Add line numbers to HTML output
wren romano [Tue, 18 Oct 2022 01:33:40 +0000 (18:33 -0700)]
[mlir][sparse] Moving Enums.h into Dialect/SparseTensor/IR
Move the SparseTensorEnums library out of the ExecutionEngine directory and into Dialect/SparseTensor/IR.
Depends On D136002
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136005
Siva Chandra Reddy [Thu, 13 Oct 2022 22:18:52 +0000 (22:18 +0000)]
[libc] Add implementation of sigaltstack for linux.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D135949
jinge90 [Tue, 18 Oct 2022 22:00:09 +0000 (15:00 -0700)]
[CMake] Fix LIBUNWIND_ENABLE_CET build after D110005
D110005 renamed LIBUNWIND_SUPPORTS_* to CXX_SUPPORTS_*.
Reviewed By: MaskRay, #libunwind, mstorsjo
Differential Revision: https://reviews.llvm.org/D136131
Joseph Huber [Fri, 14 Oct 2022 20:49:26 +0000 (15:49 -0500)]
[clang-format] Do not parse certain characters in pragma directives
Currently, we parse lines inside of a compiler `#pragma` the same way we
parse any other line. This is fine for some cases, like separating
expressions and adding proper spacing, but in others it causes some poor
results from miscategorizing some tokens.
For example, the OpenMP offloading uses certain clauses that contain
special characters like `map(tofrom : A[0:N])`. This will be formatted
poorly as it will be split between lines on the first colon.
Additionally the subscript notation will lead to poor spacing. This can
be seen in the OpenMP tests as the automatic clang formatting will
inevitably ruin the formatting.
For example, the following contrived example will be formatted poorly.
```
#pragma omp target teams distribute collapse(2) map(to: A[0 : M * K]) \
map(to: B[0:K * N]) map(tofrom:C[0:M*N]) firstprivate(Alpha) \
firstprivate(Beta) firstprivate(X) firstprivate(D) firstprivate(Y) \
firstprivate(E) firstprivate(Z) firstprivate(F)
```
This results in this when formatted, which is far from ideal.
```
#pragma omp target teams distribute collapse(2) map(to \
: A [0:M * K]) \
map(to \
: B [0:K * N]) map(tofrom \
: C [0:M * N]) firstprivate(Alpha) \
firstprivate(Beta) firstprivate(X) firstprivate(D) firstprivate(Y) \
firstprivate(E) firstprivate(Z) firstprivate(F)
```
This patch seeks to improve this by adding extra logic where the parsing goes
awry. This is primarily caused by the colon being parsed as an inline-asm
directive and the brackets as Objective-C expressions. Also the line gets
indented every single time the line is dropped.
This doesn't implement true parsing handling for OpenMP statements.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D136100
Joseph Huber [Tue, 18 Oct 2022 20:03:28 +0000 (15:03 -0500)]
[OpenMP] Make kernels have protected visibility
This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136198
Jez Ng [Tue, 18 Oct 2022 21:21:43 +0000 (17:21 -0400)]
[lld-macho] Folded symbols should have size zero in linker map
This matches ld64's behavior.
I also extended the icf-stabs.s test to demonstrate that even though
folded symbols have size zero, we cannot use the size-zero property in
lieu of `wasIdenticalCodeFolded`, because size zero symbols should still
get STABS entries.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D136001
Jez Ng [Tue, 18 Oct 2022 21:21:39 +0000 (17:21 -0400)]
[lld-macho] Don't fold subsections with symbols at nonzero offsets
Symbols occur at non-zero offsets in a subsection if they are
`.alt_entry` symbols, or if `.subsections_via_symbols` is omitted.
It doesn't seem like ld64 supports folding those subsections either.
Moreover, supporting this makes `foldIdentical` a lot more
complicated to implement. The existing implementation has some
questionable behavior around STABS omission -- if a section with a
non-zero offset symbol was folded into one without, we would omit the
STABS entry for the non-zero offset symbol.
I will be following up with a diff that makes `foldIdentical` zero out
the symbol sizes for folded symbols. Again, this is much easier to
implement if we don't have to worry about non-zero offsets.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D136000
wren romano [Tue, 18 Oct 2022 01:13:05 +0000 (18:13 -0700)]
[mlir][sparse] Factoring out SparseTensorEnums library
This differential splits the SparseTensorEnums library out from the SparseTensorRuntime library. The actual moving of files will be handled in the next differential.
Depends On D135996
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136002
Aart Bik [Tue, 18 Oct 2022 17:35:00 +0000 (10:35 -0700)]
[mlir][sparse] refine insertion code
builds SSA cycle for compress insertion loop
adds casting on index mismatch during push_back
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136186
Arthur Eubanks [Sun, 2 Oct 2022 21:07:51 +0000 (14:07 -0700)]
[opt] Don't initialize legacy instrumentation passes
So that we require `opt -passes=` syntax for instrumentation passes.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D135042
Siva Chandra Reddy [Tue, 18 Oct 2022 20:58:08 +0000 (20:58 +0000)]
[libc][Obvious] Skip some termios tests when there is no /dev/tty.
Walter Erquinigo [Sun, 16 Oct 2022 01:52:22 +0000 (18:52 -0700)]
[lldb][trace] Add a basic function call dump [3] - Add a JSON dumper
The JSON dumper is very minimalistic. It pretty much only shows the
delimiting instruction IDs of every segment, so that further queries to
the SBCursor can be used to make sense of the data. Its main purpose is
to be serialized somewhat cheaply.
I also renamed untracedSegment to untracedPrefixSegment, in case in the
future we add an untracedSuffixSegment. In any case, this new name is
more explicit, which I like.
Differential Revision: https://reviews.llvm.org/D136034
Walter Erquinigo [Mon, 10 Oct 2022 19:57:13 +0000 (12:57 -0700)]
[lldb][trace] Add a basic function call dump [2] - Implement the reconstruction algorithm
This diff implements the reconstruction algorithm for the call tree and
adds tests.
See TraceDumper.h for documentation and explanations.
One important detail is that the tree objects are in TraceDumper, even
though Trace.h is a better home. I'm leaving that as future work.
Another detail is that this code is as slow as dumping the entire
symbolicated trace, which is not that bad tbh. The reason is that we use
symbols throughout the algorithm and we are not being careful about
memory and speed. This is also another area for future improvement.
Lastly, I made sure that incomplete traces work, i.e. you start tracing
very deep in the stack or failures randomly appear in the trace.
Differential Revision: https://reviews.llvm.org/D135917
Walter Erquinigo [Sat, 8 Oct 2022 21:06:44 +0000 (14:06 -0700)]
[lldb][trace] Add a basic function call dump [1] - Add the command scaffolding
The command is `thread trace dump function-calls` and as a minimum will
require printing to a file in JSON and non-JSON format.
I added a test.
Differential Revision: https://reviews.llvm.org/D135521
Siva Chandra Reddy [Mon, 17 Oct 2022 16:27:45 +0000 (16:27 +0000)]
[libc] Add termios.h and the implementation of functions declared in it.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D136143
Maksim Panchenko [Mon, 17 Oct 2022 23:15:59 +0000 (16:15 -0700)]
[BOLT] Fix instruction encoding validation
Always use non-symbolizing disassembler for instruction encoding
validation as symbols will be treated as undefined/zeros by the encoder,
causing byte sequence mismatches.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D136118
Paul Pluzhnikov [Tue, 18 Oct 2022 20:47:55 +0000 (20:47 +0000)]
Fix incorrect check for running out of source locations.
When CurrentLoadedOffset is less than TotalSize, current code will
trigger unsigned overflow and will not return an "allocation failed"
indicator.
Google ref: b/248613299
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D135192
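The failure mode can be reproduced by modelling 32-bit unsigned arithmetic. The sketch below reuses the variable names from the description, but the exact shape of both checks is an assumption made for illustration, not a copy of the Clang source.

```python
U32 = 1 << 32  # model 32-bit unsigned wraparound

def buggy_out_of_space(current_loaded_offset, total_size, next_local_offset):
    """Assumed shape of the old check: the subtraction may wrap around."""
    remaining = (current_loaded_offset - total_size) % U32
    return remaining < next_local_offset

def fixed_out_of_space(current_loaded_offset, total_size, next_local_offset):
    """Assumed shape of a safe check: test for underflow before subtracting."""
    if current_loaded_offset < total_size:
        return True
    return current_loaded_offset - total_size < next_local_offset
```

With `current_loaded_offset = 100` and `total_size = 200`, the wrapped difference is a huge positive number, so the first check reports that space is available while the second one correctly reports failure.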
wren romano [Fri, 14 Oct 2022 23:40:28 +0000 (16:40 -0700)]
[mlir][sparse] Use the runtime DimLevelType instead of a separate tablegen enum
This differential replaces all uses of SparseTensorEncodingAttr::DimLevelType with DimLevelType. The next differential will break out a separate library for the DimLevelType enum, so that the Dialect code doesn't need to depend on the rest of the runtime
Depends On D135995
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D135996
Nico Weber [Tue, 18 Oct 2022 20:39:55 +0000 (16:39 -0400)]
[clang] Move variable declaration closer to use
...and add some whitespace to delimit the three logical steps done in this
function.
No behavior change.
Nancy Wang [Tue, 18 Oct 2022 19:53:03 +0000 (15:53 -0400)]
[SystemZ][z/OS][libcxx]: fix the mask in stage2_float_loop function
This patch fixes an issue in the __stage2_float_loop function: floating-point value comparison does not work in EBCDIC mode because the mask is hard-coded and assumes the character is ASCII. The fix is to use the toupper function when doing the comparison.
Differential Revision: https://reviews.llvm.org/D118930
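Why a hard-coded mask breaks under EBCDIC can be seen with Python's `cp037` codec. The sketch assumes the mask is `0x5F`, the usual ASCII case-folding mask; the commit text does not state the value, and the helper names are hypothetical.

```python
ASCII_UPPER_MASK = 0x5F  # assumed value of the hard-coded mask

def masked_matches(ch, expected_upper, encoding):
    """Compare using the bit mask, which only folds case in ASCII."""
    byte = ch.encode(encoding)[0]
    return (byte & ASCII_UPPER_MASK) == expected_upper.encode(encoding)[0]

def toupper_matches(ch, expected_upper, encoding):
    """Compare after an encoding-independent upper-case conversion."""
    return ch.upper().encode(encoding)[0] == expected_upper.encode(encoding)[0]

ascii_ok = masked_matches("e", "E", "ascii")
ebcdic_masked = masked_matches("e", "E", "cp037")
ebcdic_toupper = toupper_matches("e", "E", "cp037")
```

In ASCII the two cases of a letter differ by a single bit that the mask clears. In EBCDIC code page 037 they differ by other bits, so only the toupper-style comparison recognises a lower-case exponent character.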
Quentin Colombet [Mon, 17 Oct 2022 19:40:19 +0000 (19:40 +0000)]
[mlir][MemRef] Fix the simplification of extract_strided_metadata(subview)
Prior to this patch we were wrongly applying the sub-strides to the
computation of the final offset of the subview.
Put differently, we were computing the offset as:
```
offset = baseOffset + sum(subOffset#i * baseStrides#i * subStrides#i)
```
Whereas we should be doing:
```
offset = baseOffset + sum(subOffset#i * baseStrides#i)
```
I.e., drop the subStrides#i term from the sum.
Differential Revision: https://reviews.llvm.org/D136107
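A small worked example makes the effect of the extra factor visible. The sketch assumes a hypothetical 4x8 row-major buffer and a subview starting at row 1, column 2 with sub-strides `[2, 1]`; the function names are illustrative only.

```python
def subview_offset(base_offset, base_strides, sub_offsets):
    """Corrected form: offset = baseOffset + sum(subOffset#i * baseStrides#i)."""
    return base_offset + sum(o * s for o, s in zip(sub_offsets, base_strides))

def subview_offset_with_extra_factor(base_offset, base_strides, sub_offsets, extra):
    """Same sum with one more per-dimension factor multiplied into each term."""
    return base_offset + sum(
        o * s * e for o, s, e in zip(sub_offsets, base_strides, extra)
    )

# Hypothetical 4x8 row-major buffer whose elements equal their linear index.
flat = list(range(4 * 8))
base_strides = [8, 1]
sub_offsets = [1, 2]   # subview starts at row 1, column 2
sub_strides = [2, 1]   # every second row, every column

good = subview_offset(0, base_strides, sub_offsets)
bad = subview_offset_with_extra_factor(0, base_strides, sub_offsets, sub_strides)
```

The first element of the subview is row 1, column 2 of the base buffer, which is linear index 10. Multiplying the sub-strides into the sum lands on index 18 instead, which is row 2, column 2.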
LLVM GN Syncbot [Tue, 18 Oct 2022 19:15:31 +0000 (19:15 +0000)]
[gn build] Port 594fa1474f0c
wren romano [Fri, 14 Oct 2022 23:36:14 +0000 (16:36 -0700)]
[mlir][sparse] rename the values of the runtime DimLevelType
This change is to make way for reusing the DimLevelType enum in lieu of the SparseTensorEncodingAttr::DimLevelType enum, but broken out to make it quick and easy to review
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D135995
Yuanfang Chen [Tue, 18 Oct 2022 18:51:02 +0000 (11:51 -0700)]
[C++20] Implement P2113R0: Changes to the Partial Ordering of Constrained Functions
This implementation matches GCC behavior in that [[ https://eel.is/c++draft/temp.func.order#6.2.1 | temp.func.order p6.2.1 ]] is not implemented [1]. I reached out to the GCC author to confirm that some changes elsewhere to overload resolution are probably needed, but no solution has been developed sufficiently [3].
Most of the wordings are implemented straightforwardly. However,
for [[ https://eel.is/c++draft/temp.func.order#6.2.2 | temp.func.order p6.2.2 ]] "... or if the function parameters that positionally correspond between the two templates are not of the same type", the "same type" is not very clear ([2] is a bug related to this). Here is a quick example
```
template <C T, C U> int f(T, U);
template <typename T, C U> int f(U, T);
int x = f(0, 0);
```
Is the `U` and `T` from different `f`s the "same type"? The answer is NO even though both `U` and `T` are deduced to be `int` in this case. The reason is that `U` and `T` are dependent types, according to [[ https://eel.is/c++draft/temp.over.link#3 | temp.over.link p3 ]], they can not be the "same type".
To check if two function parameters are the "same type":
* For //function template//: compare the function parameter canonical types and return type between two function templates.
* For //class template/partial specialization//: by [[ https://eel.is/c++draft/temp.spec.partial.order#1.2 | temp.spec.partial.order p1.2 ]], compare the injected template arguments between two templates using hashing(TemplateArgument::Profile) is enough.
[1] https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=57b4daf8dc4ed7b669cc70638866ddb00f5b7746
[2] https://github.com/llvm/llvm-project/issues/49308
[3] https://lists.isocpp.org/core/2020/06/index.php#msg9392
Fixes https://github.com/llvm/llvm-project/issues/54039
Fixes https://github.com/llvm/llvm-project/issues/49308 (PR49964)
Reviewed By: royjacobson, #clang-language-wg, mizvekov
Differential Revision: https://reviews.llvm.org/D128750
Mark de Wever [Tue, 18 Oct 2022 18:57:54 +0000 (20:57 +0200)]
[libc++][chrono] Fixes build.
Changes in D134742 were not properly propagated to D136037 before
landing.
Alexey Bataev [Thu, 18 Nov 2021 23:59:30 +0000 (15:59 -0800)]
[SLP]Generalize cost model.
Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
cost model for extractelement instructions.
cpu2017
511.povray_r 0.57
520.omnetpp_r -0.98
521.wrf_r -0.01
525.x264_r 3.59 <+
526.blender_r -0.12
531.deepsjeng_r -0.07
538.imagick_r -1.42
Geometric mean: 0.21
Differential Revision: https://reviews.llvm.org/D115757
Yuanfang Chen [Tue, 18 Oct 2022 18:24:38 +0000 (11:24 -0700)]
[Clang] update cxx_dr_status.html by running make_cxx_dr_status
For https://github.com/llvm/llvm-project/issues/58382
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D136133
Eli Friedman [Tue, 18 Oct 2022 18:44:01 +0000 (11:44 -0700)]
[AArch64][Windows] Add MC support for save_any_reg.
Representing this as 12 separate operations is a bit ugly, but
trying to represent the different modes using a bitfield seemed worse.
Differential Revision: https://reviews.llvm.org/D135417
Mark de Wever [Sun, 20 Mar 2022 12:40:02 +0000 (13:40 +0100)]
[libc++][chrono] Implements formatter weekday.
Partially implements:
- P1361 Integration of chrono with text formatting
- P2372 Fixing locale handling in chrono formatters
Depends on D134742
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D136037
Mark de Wever [Sun, 20 Mar 2022 12:40:02 +0000 (13:40 +0100)]
[libc++][chrono] Implements formatter duration.
Partially implements:
- P1361 Integration of chrono with text formatting
- P2372 Fixing locale handling in chrono formatters
- LWG3270 Parsing and formatting %j with durations
Completes:
- P1650R0 std::chrono::days with 'd' suffix
- LWG3262 Formatting of negative durations is not specified
- LWG3314 Is stream insertion behavior locale dependent when Period::type is micro?
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D134742
Hui Xie [Fri, 7 Oct 2022 17:07:54 +0000 (18:07 +0100)]
[libc++][ranges] implement `std::ranges::drop_while_view`
Differential Revision: https://reviews.llvm.org/D135460
Alexey Bataev [Tue, 18 Oct 2022 18:23:43 +0000 (11:23 -0700)]
Revert "[SLP]Generalize cost model."
This reverts commit f12fb91188b836e1bddb36bacbbdb8e4ab70b9b6 and
f5c747bfbe36b8f53e6fe2d85ffcaecba6d7153c to fix detected non-initialized
var use.
Sjoerd Meijer [Tue, 18 Oct 2022 17:54:04 +0000 (23:24 +0530)]
Revert "Recommit "[LoopFlatten] Enable it by default""
This reverts commit 5b9597f59a445523bd59b5251ab1c2865e74919f.
A miscompilation was reported:
https://github.com/llvm/llvm-project/issues/58441
Reverting this while I look at that.
Felipe de Azevedo Piovezan [Tue, 18 Oct 2022 17:59:29 +0000 (13:59 -0400)]
Revert "[lldb-tests] Remove dubious standard library flag"
This reverts commit f477412685fe6bac49d3d080ba91896c28e62116.
Valentin Clement [Tue, 18 Oct 2022 17:52:20 +0000 (19:52 +0200)]
[flang] Add getTypeDescriptorBindingTableName function
Type descriptor and its binding table are defined as fir.global in FIR.
Their names are derived from the derived-type name. This patch adds a new
function `getTypeDescriptorBindingTableName` in the NameUniquer and
refactors the `GetTypeDescriptorName` function to reuse the same code.
This will be used in the fir.dispatch code generation.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/
D136167
Felipe de Azevedo Piovezan [Tue, 18 Oct 2022 13:22:20 +0000 (09:22 -0400)]
[lldb-tests] Remove dubious standard library flag
The test currently sets `USE_LIBSTDCPP = 0`, which is curious given the
behavior of `and` and `or` in Makefiles (the contents of the variables
are not important). In particular, this causes the tests to not use the
standard libraries appropriately.
To capture the actual intent of the test, we're changing this to
`USE_LIBCXX=1`.
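A minimal sketch of why the value `0` does not disable anything. It models the documented GNU make semantics of `$(or ...)` and `$(and ...)`, where any non-empty expansion counts as true. The helper names `make_or` and `make_and` are hypothetical, and the exact condition used by the test Makefiles is not reproduced here.

```python
def make_or(*args):
    """Model of GNU make $(or a,b,...): the first non-empty argument, else ""."""
    for arg in args:
        if arg != "":
            return arg
    return ""


def make_and(*args):
    """Model of GNU make $(and a,b,...): "" if any argument is empty, else the last one."""
    for arg in args:
        if arg == "":
            return ""
    return args[-1] if args else ""


# "0" is a non-empty string, so make treats it as true.
use_libstdcpp = "0"
print(repr(make_or(use_libstdcpp, "")))    # non-empty, so the condition holds
print(repr(make_and(use_libstdcpp, "1")))  # non-empty, so the condition holds
print(repr(make_or("", "")))               # only an empty value counts as false
```

Under this model, only leaving the variable unset or empty turns the flag off, which is why setting `USE_LIBSTDCPP = 0` still selects that code path.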
Differential Revision: https://reviews.llvm.org/
D136171
Felipe de Azevedo Piovezan [Tue, 18 Oct 2022 13:13:45 +0000 (09:13 -0400)]
[lldb-tests] Add libcxx version check for regex tests
Regex requires the c++20 flag, which was not available prior
to Clang 11.
Differential Revision: https://reviews.llvm.org/
D136165
Felipe de Azevedo Piovezan [Tue, 18 Oct 2022 15:01:38 +0000 (11:01 -0400)]
[lldb-tests] Add compiler version check in TestFunctionStarts
This test requires compiling its input program without debug
information. To do so, it uses certain Makefile variables that are never
populated with custom libcxx paths (if present). Doing so would not
necessarily be correct: we cannot guarantee that said standard library
has no debug symbols.
As such, we keep using the system libraries but disable the tests in
clang versions that are too old to work with more modern system
libraries, as in the case of the lldb-matrix bot.
Differential Revision: https://reviews.llvm.org/
D136178
Fangrui Song [Tue, 18 Oct 2022 17:28:11 +0000 (10:28 -0700)]
[ELF] Restore AArch64Relaxer after
685b21255315e699aa839d93fe71b37d806c90c2
relocateAlloc may be parallel so we should avoid sharing AArch64 states.
Katherine Rasmussen [Thu, 13 Oct 2022 00:32:14 +0000 (17:32 -0700)]
[flang] Add atomic_cas to the list of intrinsics
Add the atomic subroutine, atomic_cas, to the list of intrinsic
subroutines and check one of its arguments for a coindexed object.
Create a new function, CheckAtomicKind, that will be used for the
atomic subroutines that have arguments that can be either of type
int and of kind atomic_int_kind or of type logical and of kind
atomic_logical_kind.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/
D135835
Alexey Bataev [Tue, 18 Oct 2022 17:07:25 +0000 (10:07 -0700)]
[SLP][NFC]Fix a warning for ?: with enum/unsigned, NFC.
Slava Zakharin [Tue, 18 Oct 2022 00:48:05 +0000 (17:48 -0700)]
[flang] Restrict __float128 support for some build configurations.
This change is intended to resolve build issues reported in
D134503.
A compiler supporting __float128 must define either __FLOAT128__ or
__SIZEOF_FLOAT128__ (or both). Additional check for _LIBCPP_VERSION
was added to disable __float128 for builds with libc++, because
__float128 support is incomplete there.
Differential Revision: https://reviews.llvm.org/
D136121
Arthur Eubanks [Tue, 18 Oct 2022 16:56:42 +0000 (09:56 -0700)]
[test] Remove redundant -passes flags
Aart Bik [Tue, 18 Oct 2022 06:08:09 +0000 (23:08 -0700)]
[mlir][sparse] improve push_back type checking, printing, parsing
Rationale:
Enforces type consistency on parsed and generated IR.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/
D136132
Arthur Eubanks [Sun, 2 Oct 2022 20:20:21 +0000 (13:20 -0700)]
[ObjCARC][test] Use `opt -passes=` syntax
Krzysztof Parzyszek [Fri, 14 Oct 2022 23:07:50 +0000 (16:07 -0700)]
[Hexagon] Use shifts by scalar for funnel shifts by scalar
HVX has vector shifts by a scalar register. Use those in the expansions
of funnel shifts where profitable.
Chris Bieneman [Tue, 18 Oct 2022 16:42:09 +0000 (11:42 -0500)]
[DX] Create globals for DXContainer parts
DXContainer files have a handful of sections that need to be written.
This adds a pass to write the section data into IR globals, and writes
the shader flag data into a global.
The test cases here verify that the shader flags are correctly written
from the IR into the global and emitted to the DXContainer.
This change also fixes a bug in the MCDXContainerWriter, where the size
of the dxbc::ProgramHeader was not being included in the part offset
calcuations. This is verified to be working by the new testcases where
obj2yaml can properly dump part data for parts after the DXIL part.
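A minimal sketch of how omitting a header size shifts every later part, assuming a simplified layout in which each part is a fixed-size part header followed by its payload, and the DXIL payload is preceded by a program header. The sizes, the part name `NEXT`, and the function `part_offsets` are placeholders for illustration, not values or APIs taken from the DXContainer writer.

```python
PART_HEADER_SIZE = 8      # placeholder size, for illustration only
PROGRAM_HEADER_SIZE = 24  # placeholder size, for illustration only


def part_offsets(parts, start, count_program_header):
    """Return (name, offset) pairs for parts laid out back to back."""
    offsets = []
    cursor = start
    for name, payload_size in parts:
        offsets.append((name, cursor))
        size = PART_HEADER_SIZE + payload_size
        if name == "DXIL" and count_program_header:
            # The program header sits in front of the DXIL payload.
            size += PROGRAM_HEADER_SIZE
        cursor += size
    return offsets


parts = [("DXIL", 100), ("NEXT", 16)]  # hypothetical parts and payload sizes
buggy = dict(part_offsets(parts, start=32, count_program_header=False))
fixed = dict(part_offsets(parts, start=32, count_program_header=True))
print(buggy["NEXT"], fixed["NEXT"])
```

In this model the part after DXIL lands too early by exactly the program header size when that size is left out, which is the kind of error a dump of the later parts would expose.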
Resolves issue #57742 (https://github.com/llvm/llvm-project/issues/57742)
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/
D135793
David Zarzycki [Tue, 18 Oct 2022 16:32:12 +0000 (12:32 -0400)]
[clang testing] Unbreak read-only source builds
Mark de Wever [Sun, 16 Oct 2022 18:48:12 +0000 (20:48 +0200)]
[libc++] Improves modular build.
Makes sure headers having a std::ranges::less as default argument export
the proper header. Without exporting, these modularized headers are not
self-contained.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/
D136045
Mark de Wever [Sat, 23 Oct 2021 17:11:02 +0000 (19:11 +0200)]
[libc++][format] Move iterators when needed.
LWG-3539 was already implemented but not marked as done.
LWG-3567 is implemented in this commit.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/
D112368
Florian Hahn [Tue, 18 Oct 2022 16:38:14 +0000 (17:38 +0100)]
[IndVars] Forget SCEV for instruction and users before replacing it.
Extra invalidation is needed here to clear stale values to fix a
verification failure.
Fixes #58440.
bixia1 [Mon, 17 Oct 2022 17:02:17 +0000 (10:02 -0700)]
[mlir][sparse] Add options to sparse-tensor-rewrite to disable rewriting rules for operators foreach and convert.
This is to help simplify FileCheck tests for sparse-tensor-rewrite.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/
D136093
Sam McCall [Fri, 14 Oct 2022 10:56:41 +0000 (12:56 +0200)]
[include-cleaner] Add include-cleaner tool, with initial HTML report
The immediate goal is to start producing an HTML report to debug and explain
include-cleaner recommendations.
For now, this includes only the lowest-level piece: a list of the references
found in the source code.
How this fits into future ideas:
- under refs we can also show the headers providing the symbol, which includes
match those headers etc
- we can also annotate the #include lines with which symbols they cover, and
add whichever includes we're suggesting too
- the include-cleaner tool will likely have modes where it emits diagnostics
and/or applies edits, so the HTML report is behind a flag
Differential Revision: https://reviews.llvm.org/
D135956
Mingming Liu [Sat, 15 Oct 2022 18:50:56 +0000 (11:50 -0700)]
[AArch64] Enhance bit-field-positioning op matcher to see through 'any_extend' for pattern 'and(any_extend(shl(val, N)), shifted-mask)'
Before this patch (and refactor patch
D135843), isBitfieldPositioningOp won't handle "and(any_extend(shl(val, N)), shifted-mask)" (it bails out if the AND operand is not SHL).
After this patch, isBitfieldPositioningOp will see through "any_extend" to find "shl" to find possible bit-field-positioning nodes.
https://gcc.godbolt.org/z/3ncGKbGW6 is a four-liner LLVM IR that could be optimized to UBFIZ (see added test case test_and_extended_shift_with_imm in llvm/test/CodeGen/AArch64/bitfield-insert.ll). One existing test case also improves.
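A minimal sketch of the equivalence behind this pattern, assuming a 32-bit shift that is any-extended to 64 bits and a shifted mask that lies entirely within the shifted field. The functions `ubfiz` and `and_any_extend_shl`, the shift amount, and the mask are illustrative choices, not code or constants from the patch.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def ubfiz(value, lsb, width):
    """Model UBFIZ: put the low `width` bits of value at bit `lsb`, zero elsewhere."""
    field = value & ((1 << width) - 1)
    return (field << lsb) & MASK64


def and_any_extend_shl(value, shift, shifted_mask, junk_high_bits):
    """Model and(any_extend(shl(value, shift)), shifted_mask)."""
    narrow = (value << shift) & MASK32                      # 32-bit shl
    extended = narrow | ((junk_high_bits & MASK32) << 32)   # high bits are unspecified
    return extended & shifted_mask


value = 0x1234ABCD
for junk in (0x0, 0xFFFFFFFF, 0xDEADBEEF):
    # Mask 0xFF0 covers bits 4..11, so the result is an 8-bit field at bit 4.
    assert and_any_extend_shl(value, 4, 0xFF0, junk) == ubfiz(value, 4, 8)
print(hex(ubfiz(value, 4, 8)))
```

Because the mask clears every bit that any_extend leaves unspecified, the result does not depend on those bits, which is what makes it safe to look through the extension in this model.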
Differential Revision: https://reviews.llvm.org/
D135852
Han-Kuan Chen [Tue, 18 Oct 2022 06:32:12 +0000 (23:32 -0700)]
[RISCV] Lower VECTOR_SHUFFLE to VSLIDEDOWN_VL.
Differential Revision: https://reviews.llvm.org/
D136136
Han-Kuan Chen [Tue, 18 Oct 2022 04:39:20 +0000 (21:39 -0700)]
[RISCV] Pre-commit tests for lowering VECTOR_SHUFFLE to VSLIDEDOWN_VL.
Differential Revision: https://reviews.llvm.org/
D136135
Anton Sidorenko [Mon, 17 Oct 2022 09:06:08 +0000 (12:06 +0300)]
[MachineCombiner][RISCV] Enable MachineCombiner for RISCV
Initial implementation to match basic FP reassociation patterns.
Differential Revision: https://reviews.llvm.org/
D135264
Alexey Bataev [Thu, 18 Nov 2021 23:59:30 +0000 (15:59 -0800)]
[SLP]Generalize cost model.
Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
cost model for extractelement instructions.
cpu2017
511.povray_r 0.57
520.omnetpp_r -0.98
521.wrf_r -0.01
525.x264_r 3.59 <+
526.blender_r -0.12
531.deepsjeng_r -0.07
538.imagick_r -1.42
Geometric mean: 0.21
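A minimal sketch of how a geometric mean can be computed from the figures above, assuming each number is a percentage change and that the mean is taken over the corresponding ratios (1 + change/100) for only the seven benchmarks listed. The function name `geometric_mean_percent` is hypothetical, and this is not necessarily how the reported figure was produced.

```python
import math

# Per-benchmark changes in percent, copied from the list above.
changes_percent = {
    "511.povray_r": 0.57,
    "520.omnetpp_r": -0.98,
    "521.wrf_r": -0.01,
    "525.x264_r": 3.59,
    "526.blender_r": -0.12,
    "531.deepsjeng_r": -0.07,
    "538.imagick_r": -1.42,
}


def geometric_mean_percent(changes):
    """Geometric mean of the ratios (1 + c/100), expressed again as a percent change."""
    ratios = [1.0 + c / 100.0 for c in changes]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1.0) * 100.0


result = geometric_mean_percent(changes_percent.values())
print(f"{result:.2f}")  # prints 0.21
```

Under these assumptions the result rounds to the 0.21 reported above, while the plain arithmetic mean of the same values would round to 0.22.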
Differential Revision: https://reviews.llvm.org/
D115757
Arthur Eubanks [Tue, 11 Oct 2022 22:07:13 +0000 (15:07 -0700)]
Port print-cfg-sccs to new pass manager
This is actually used, see https://discourse.llvm.org/t/use-print-callgrapg-sccs-from-opt/65782.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/
D135718
Arthur Eubanks [Tue, 18 Oct 2022 15:46:14 +0000 (08:46 -0700)]
[win][compiler-rt] Make tests use lld-link instead of link
Git bash ships with a link.exe. We try to add git bash to the beginning
of PATH (see D84380). These tests end up executing the wrong link.exe.
As a workaround, use lld-link. Note that `REQUIRES: lld-available` tests currently aren't running, see
D128567. I did manually verify that these tests pass with lld-link.
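A minimal sketch of the lookup problem, assuming executables are resolved by scanning the PATH entries in order and taking the first match. The directory names, their contents, and the function `resolve` are hypothetical.

```python
def resolve(tool, path_dirs, files_by_dir):
    """Return the first "dir/tool" found by scanning path_dirs in order, else None."""
    for directory in path_dirs:
        if tool in files_by_dir.get(directory, ()):
            return f"{directory}/{tool}"
    return None


# Hypothetical directory contents.
files_by_dir = {
    "git-bash/usr/bin": {"link.exe", "sh.exe"},
    "msvc/bin": {"link.exe", "cl.exe"},
    "llvm/bin": {"lld-link.exe", "clang.exe"},
}

# Git bash is placed first on PATH, so its link.exe shadows the MSVC linker.
path_dirs = ["git-bash/usr/bin", "msvc/bin", "llvm/bin"]
print(resolve("link.exe", path_dirs, files_by_dir))
print(resolve("lld-link.exe", path_dirs, files_by_dir))
```

Since only one directory in this model provides `lld-link.exe`, asking for it by name is unaffected by the order of the PATH entries.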
Reviewed By: rnk, hans
Differential Revision: https://reviews.llvm.org/
D136108
Arthur Eubanks [Tue, 18 Oct 2022 15:45:15 +0000 (08:45 -0700)]
[ubsan][test] Make some tests have shorter file names
We're hitting path size limits on Windows on these tests. As a
workaround, make the file names shorter.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/
D136113