Haojian Wu [Tue, 27 Jun 2023 13:15:02 +0000 (15:15 +0200)]
Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map"
This reverts commit
12e9c7aaa66b7624b5d7666ce2794d912bf9e4b7.
The commit has broken the buildbot, see comment https://reviews.llvm.org/D147740#4451540
Louis Dionne [Mon, 5 Jun 2023 19:53:42 +0000 (12:53 -0700)]
[libc++] Expand the contents of LIBCXX_ENABLE_FILESYSTEM
Since LIBCXX_ENABLE_FILESYSTEM now truly represents whether the
platform supports a filesystem (as opposed to whether the <filesystem>
library is provided), we can provide a few additional classes from
the <filesystem> library even when the platform does not have support
for a filesystem. For example, this allows performing path manipulations
using std::filesystem::path even on platforms where there is no actual
filesystem.
rdar://
107061236
Differential Revision: https://reviews.llvm.org/D152382
Matthias Springer [Tue, 27 Jun 2023 12:29:34 +0000 (14:29 +0200)]
[mlir][linalg] Padding transformation: Write back result to original destination
Copy back the padded result to the original destination of the computation. This is important for bufferization, to ensure that the result of the computation does not suddenly materialize in a different buffer due to padding.
A `bufferization.copy_tensor` is inserted for every (unpadded) result. Such ops bufferize to memcpys, but they fold away, should the padding fold away.
Differential Revision: https://reviews.llvm.org/D153554
Matthias Springer [Tue, 27 Jun 2023 12:29:18 +0000 (14:29 +0200)]
[mlir][bufferization] Add bufferization.copy_tensor op
This operation is a "copy" operation on tensors. It is guaranteed to bufferize to a memcpy. This is different from "tensor.insert_slice", which may fold away.
Note: There is a symmetry between certain tensor, bufferization and memref ops:
* `tensor.empty`, `bufferization.alloc_tensor`, `memref.alloc`
* (none), `bufferization.dealloc_tensor`, `memref.dealloc`
* `tensor.insert_slice`, `bufferization.copy_tensor`, `memref.copy`
Tensor ops can generally canonicalize/fold away, while bufferization dialect ops can be used when a certain side effect is expected to materialize; so they do not fold away.
Differential Revision: https://reviews.llvm.org/D153552
Matthias Springer [Tue, 27 Jun 2023 12:29:04 +0000 (14:29 +0200)]
[mlir][linalg] BufferizeToAllocationOp: Return handle to buffer
Add an additional result handle to the op. This new handle is mapped to the newly allocated buffer.
Differential Revision: https://reviews.llvm.org/D153514
Matthias Springer [Tue, 27 Jun 2023 12:28:50 +0000 (14:28 +0200)]
[mlir][linalg][NFC] Cleanup padding transform
* Use LinalgPaddingOptions instead of passing many parameters.
* Split function into two parts.
Differential Revision: https://reviews.llvm.org/D153853
Matthias Springer [Tue, 27 Jun 2023 12:28:38 +0000 (14:28 +0200)]
[mlir][linalg][NFC] Move padding transformation to separate file
Also remove `LinalgPaddingPattern`, which has no uses. (There is a transform dialect op that is used for testing instead.)
Differential Revision: https://reviews.llvm.org/D153512
Felipe de Azevedo Piovezan [Mon, 26 Jun 2023 20:10:30 +0000 (16:10 -0400)]
[DebugInfo][AsmPrinter] Don't emit accelerator entries with empty names
The DWARF 5 specification says that:
> All other debugging information entries without a DW_AT_name attribute are
> excluded.
Clang started generating these variables for string literals, see D123534.
Differential Revision: https://reviews.llvm.org/D153809
Haojian Wu [Tue, 27 Jun 2023 12:38:59 +0000 (14:38 +0200)]
[ELFAttributeParser] Update the ELFAttributeParserTest for 8f208ed
The 8f208ed has removed the "unrecognized vendor-name" error, updated
the unittest accordingly.
Alex Bradbury [Tue, 27 Jun 2023 12:35:41 +0000 (13:35 +0100)]
[RISCV][NFC] Adjust RISCVMoveMerge.cpp header to match standard style
* 80 columns
* Fix name of file (RISCVMoveMerger.cpp vs RISCVMoveMerge.cpp)
* `//===--` prefix rather than center-aligned text.
Alex Bradbury [Tue, 27 Jun 2023 12:28:28 +0000 (13:28 +0100)]
[RISCV] Relax rules for ordering s/z/x prefixed extensions in ISA naming strings
This was discussed somewhat in D148315. As it stands, we require in
RISCVISAInfo::parseArchString (used for e.g. -march parsing in Clang)
that extensions are given in the order of z, then s, then x prefixed
extensions (after the standard single-letter extensions). However, we
recently (in D148315) moved to that order from z/x/s as the canonical
ordering was changed in the spec. In addition, recent GCC seems to
require z* extensions before s*.
My recollection of the history here is that we thought keeping -march as
close to the rules for ISA naming strings as possible would simplify
things, as there's an existing spec to point to. My feeling is that now
we've had incompatible changes, and an incompatibility with GCC there's
no real benefit to sticking to this restriction, and it risks making it
much more painful than it needs to be to copy a -march= string between
GCC and Clang.
This patch removes all ordering restrictions so you can freely mix x/s/z
extensions.
To be very explicit, this doesn't change our behaviour when emitting a
canonically ordered extension string (e.g. in build attributes). We of
course sort according to the canonical order (as we understand it) in
that case.
Differential Revision: https://reviews.llvm.org/D149246
Nicolas Vasilache [Tue, 27 Jun 2023 12:29:16 +0000 (12:29 +0000)]
[mlir][Linalg]Add support for inferring batch dimensions
pvanhout [Tue, 27 Jun 2023 08:34:06 +0000 (10:34 +0200)]
[AMDGPU] Add more Common Feature Sets
A small refactor to add more `_Common` feature sets for GFX8+.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D153843
Simon Tatham [Tue, 20 Jun 2023 12:44:38 +0000 (13:44 +0100)]
[ELFAttributeParser] Skip unknown vendor subsections.
An .ARM.attributes section is divided into subsections, each labelled
with a vendor name. There is one standardised vendor name, which must
be used for all attributes that affect compatibility. Subsections
labelled with other vendor names can be used for optimisation
purposes, but it has to be safe for an object file consumer to ignore
them if it doesn't recognise the vendor name.
LLD currently terminates parsing of the whole attributes section as
soon as it encounters a subsection with a vendor name it doesn't
recognise (which is anything other than the standard one). This can
prevent it from detecting compatibility issues, if a standard
subsection followed the vendor-specific one.
This patch modifies the attribute parser so that unrecognised vendor
subsections are silently skipped, and the subsections beyond them are
still processed.
Differential Revision: https://reviews.llvm.org/D153335
Timm Bäder [Tue, 27 Jun 2023 09:42:38 +0000 (11:42 +0200)]
[clang] Add myself as code owner for the new constant interpreter
As discussed in the RFC at
https://discourse.llvm.org/t/rfc-proposing-a-code-owner-for-the-experimental-constexpr-interpreter/71514
Jean Perier [Tue, 27 Jun 2023 11:30:15 +0000 (13:30 +0200)]
[flang][hlfir] Add codegen for vector subscripted LHS
This patch adds support for vector subscripted assignment left-hand
side. It does not yet add support for the cases where the LHS must be
saved because its evaluation could be impacted by the assignment.
The implementation adds an hlfir::ElementalOpInterface to share the
elemental inlining utility and some other tools between
hlfir::ElementalOp and hlfir::ElelemntalAddrOp.
It adds generateYieldedLHS() to allow retrieving the LHS value
in lowering, whether or not it is vector subscripted. If it is vector
subscripted, this utility creates a loop nest iterating over the
elements and returns the address of an element.
Differential Revision: https://reviews.llvm.org/D153759
Florian Hahn [Tue, 27 Jun 2023 11:22:10 +0000 (12:22 +0100)]
[ConstraintElim] Add additional induction phi tests.
Extra tests with different GEP step sizes for D152730.
Timm Bäder [Tue, 27 Jun 2023 11:06:01 +0000 (13:06 +0200)]
Revert "[clang][CFG][NFC] A few smaller cleanups"
This reverts commit
173df3dd5f9a812b07f9866965f4e92a982a3fca.
Looks like this wasn't as innocent as it seemed:
https://lab.llvm.org/buildbot#builders/38/builds/12982
Simon Pilgrim [Tue, 27 Jun 2023 10:46:02 +0000 (11:46 +0100)]
Fix "this this" duplicate typo in comment. NFC.
Simon Pilgrim [Tue, 27 Jun 2023 10:39:31 +0000 (11:39 +0100)]
Fix "for for" duplicate typo in comment. NFC.
Simon Pilgrim [Tue, 27 Jun 2023 10:37:05 +0000 (11:37 +0100)]
Fix "the the" duplicate typo in comment. NFC.
Timm Bäder [Wed, 21 Jun 2023 09:15:55 +0000 (11:15 +0200)]
[clang][CFG][NFC] A few smaller cleanups
Use dyn_cast_if_present instead of _or_null, use decomposition decls,
and a few other minor things.
Timm Bäder [Sun, 18 Jun 2023 07:07:31 +0000 (09:07 +0200)]
[clang][Interp][NFC] Compare std::optional variables directly
We can do that and we already checked that they aren't nullopt before.
Guillaume Chatelet [Mon, 26 Jun 2023 13:58:30 +0000 (13:58 +0000)]
[Align] Add isAligned taking an APInt
This showed up in https://reviews.llvm.org/D153308
Reviewed By: courbet, nikic
Differential Revision: https://reviews.llvm.org/D153356
David Spickett [Thu, 22 Jun 2023 14:24:40 +0000 (14:24 +0000)]
[lldb] Use SmallVector for handling register data
Previously lldb was using arrays of size kMaxRegisterByteSize to handle
registers. This was set to 256 because the largest possible register
we support is Arm's scalable vectors (SVE) which can be up to 256 bytes long.
This means for most operations aside from SVE, we're wasting 192 bytes
of it. Which is ok given that we don't have to pay the cost of a heap
alocation and 256 bytes isn't all that much overall.
With the introduction of the Arm Scalable Matrix extension there is a new
array storage register, ZA. This register is essentially a square made up of
SVE vectors. Therefore ZA could be up to 64kb in size.
https://developer.arm.com/documentation/ddi0616/latest/
"The Effective Streaming SVE vector length, SVL, is a power of two in the range 128 to 2048 bits inclusive."
"The ZA storage is architectural register state consisting of a two-dimensional ZA array of [SVLB × SVLB] bytes."
99% of operations will never touch ZA and making every stack frame 64kb+ just
for that slim chance is a bad idea.
Instead I'm switching register handling to use SmallVector with a stack allocation
size of kTypicalRegisterByteSize. kMaxRegisterByteSize will be used in places
where we can't predict the size of register we're reading (in the GDB remote client).
The result is that the 99% of small register operations can use the stack
as before and the actual ZA operations will move to the heap as needed.
I tested this by first working out -wframe-larger-than values for all the
libraries using the arrays previously. With this change I was able to increase
kMaxRegisterByteSize to 256*256 without hitting those limits. With the
exception of the GDB server which needs to use a max size buffer.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D153626
Christian Ulmann [Tue, 27 Jun 2023 08:34:39 +0000 (08:34 +0000)]
[mlir][LLVM] Improve error handling in switch parsing
This commit changes the 'llvm.switch' parsing to not silently fail when
it encounters superfluous commas in the case list.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D153841
Martin Braenne [Tue, 27 Jun 2023 04:41:17 +0000 (04:41 +0000)]
[clang][dataflow] Use namespace qualifiers when defining functions.
See
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions
Thank you to MaskRay for pointing this out on
https://reviews.llvm.org/D153006
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D153833
serge-sans-paille [Tue, 30 May 2023 21:15:11 +0000 (23:15 +0200)]
[Clang][Sema] Do not try to analyze dependent alignment during -Wcast-align
Fix #63007
Differential Revision: https://reviews.llvm.org/D151753
Diana Picus [Tue, 20 Jun 2023 07:23:50 +0000 (09:23 +0200)]
[AMDGPU][DAGISel] Be more flexible about what calls are allowed
Remove DAGISel checks on calling conventions. GlobalISel doesn't have
these checks either and we prefer it that way (see D152794).
Add a simple test like the one introduced in D117479 for GlobalISel.
Differential Revision: https://reviews.llvm.org/D153535
Haojian Wu [Tue, 27 Jun 2023 07:37:12 +0000 (09:37 +0200)]
[clangd] Fix the flaky FindTarget unittest after 1b66840
after 1b66840, FindTarget will report multiple refs with the same
location, make the sort order of the refs deterministic in
FindTargetTests.
David Spickett [Fri, 23 Jun 2023 14:31:14 +0000 (14:31 +0000)]
[LLDB] Fix the use of "platform process launch" with no extra arguments
This fixes #62068.
After
8d1de7b34af46a089eb5433c700419ad9b2923ee the following issue appeared:
```
$ ./bin/lldb /tmp/test.o
(lldb) target create "/tmp/test.o"
Current executable set to '/tmp/test.o' (aarch64).
(lldb) platform process launch -s
error: Cannot launch '': Nothing to launch
```
Previously would call target->GetRunArguments when there were no extra
arguments, so we could find out what target.run-args might be.
Once that change started relying on the first arg being the exe,
the fact that that call clears the existing argument list caused the bug.
Instead, have it set a local arg list and append that to the existing
one. Which in this case will just contain the exe name.
Since there's no existing tests for this command I've added a new file
that covers enough to check this issue.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D153636
Nikita Popov [Mon, 19 Jun 2023 12:21:40 +0000 (14:21 +0200)]
[BasicAA] Fix nsw handling for negated scales (PR63266)
We currently preserve the nsw flag when negating scales, which is
incorrect for INT_MIN.
However, just dropping the NSW flag in this case makes BasicAA
behavior unreliable and asymmetric, because we may or may not
drop the NSW flag depending on which side gets subtracted.
Instead, leave the Scale alone and add an additional IsNegated flag,
which indicates that the whole VarIndex should be interpreted as a
subtraction. This allows us to retain the NSW flag.
When accumulating the offset range, we need to use subtraction
instead of adding for IsNegated indices. Everything else works on
the absolute value of the scale, so the negation does not matter
there.
Fixes https://github.com/llvm/llvm-project/issues/63266.
Differential Revision: https://reviews.llvm.org/D153270
Cullen Rhodes [Tue, 27 Jun 2023 07:29:11 +0000 (07:29 +0000)]
[mlir][ArmSME] Fix crash on func decls in 'arm_za' legality checks
Reviewed By: dcaballe, Dinistro
Differential Revision: https://reviews.llvm.org/D153750
Aiden Grossman [Tue, 27 Jun 2023 07:36:25 +0000 (07:36 +0000)]
[llvm-exegesis] Fix requires flags on memory annotation tests
Job Noorman [Tue, 27 Jun 2023 07:18:39 +0000 (09:18 +0200)]
[JITLink] Use SubtargetFeatures to store features in LinkGraph
D149522 introduced target features to LinkGraph. However, to avoid a
public dependency on MC, the features were stored in a std::vector
instead of using SubtargetFeatures directly.
Since SubtargetFeatures was moved from MC to TargetParser (D150549), we
can now use it directly to store the features. This patch implements
that and removes the (private) dependency on MC.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D153749
Tobias Gysi [Tue, 27 Jun 2023 06:56:01 +0000 (06:56 +0000)]
[mlir][llvm] Add comdat attribute to functions
This revision adds comdat support to functions. Additionally,
it ensures only comdats that have uses are imported/exported and
only non-empty global comdat operations are created.
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D153739
Aiden Grossman [Tue, 27 Jun 2023 07:01:20 +0000 (07:01 +0000)]
[llvm-exegesis] Fix warning and hoist statement of arch-specific section
My last patch broke most of the builders that aren't currently running
at least Kernel 5.6 as there was a variable used later on inside a
region that required that kernel version. Also fixes a minor warning
left over from a bad merge.
Aiden Grossman [Fri, 26 May 2023 09:13:34 +0000 (09:13 +0000)]
[llvm-exegesis] Add support for using memory annotations
This patch adds in support for using memory annotations in the
subprocess execution mode.
Mehdi Amini [Mon, 26 Jun 2023 20:51:02 +0000 (22:51 +0200)]
Revert "[mlir][Transform] Add support for mma.sync m16n8k16 f16 rewrite." and "[mlir][Transform] Introduce nvgpu transform extensions"
This reverts commit
40deed40ae77ba22f7c72693903752ab6bfeb4e7.
and commit
1660f2174d59bc2fd04131dab9ab0b43178bf665.
The buildbot is broken, the two tests aren't passing.
Aiden Grossman [Tue, 27 Jun 2023 06:09:01 +0000 (06:09 +0000)]
[llvm-exegesis] Remove extraneous print debug statement
This was leftover when I was working on the lit config while recomitting
a patch and never should've made it in tree.
Aiden Grossman [Tue, 27 Jun 2023 06:05:27 +0000 (06:05 +0000)]
Revert "[NFC] Refactor MBB hotness/coldness into templated PSI functions."
This reverts commit
c3e33720403c010e140c17313eeefd9a0f25887a.
This has broken quite a few buildbots.
Jim Lin [Tue, 27 Jun 2023 02:29:45 +0000 (10:29 +0800)]
[LibCallsShrinkWrap][NFC] Reuse createCond and createOrCond
Add two new functions `createCond` and `createOrCond` that accept extra
arguments Arg and Arg/Arg2 respectively.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D153253
Craig Topper [Tue, 27 Jun 2023 06:02:14 +0000 (23:02 -0700)]
[RISCV] Add bypass for ReadJalr in SiFive7 scheduler model.
Aiden Grossman [Fri, 26 May 2023 09:06:13 +0000 (09:06 +0000)]
[llvm-exegesis] Add memory annotation parsing
This patch adds memory annotation parsing to llvm-exegesis. The memory
annotations cannot be used currently, but this allows for using parsed
memory annotations within a FunctionExecutorImpl to set up a specified
execution environment.
Fangrui Song [Tue, 27 Jun 2023 05:20:02 +0000 (22:20 -0700)]
[test] Improve symbol visibility test
Aiden Grossman [Tue, 27 Jun 2023 04:59:09 +0000 (04:59 +0000)]
[llvm-exegesis] Fix test failure caused by assymetric values
I fixed compilation on 32-bit ARM earlier in the -Werror case using
preprocessor directives but forgot to update the unit tests that also
depend upon that value. This patch updates the unit tests as well as a
quick fix for the builders that were broken by the earlier patch.
Han Shen [Sat, 24 Jun 2023 17:19:26 +0000 (10:19 -0700)]
[NFC] Refactor MBB hotness/coldness into templated PSI functions.
In D152399, we calculate BPI->BFI in MachineFunctionSplit pass just to
use PSI->isFunctionHotInCallGraph, which is expensive. Instead, we can
implement this directly with MBFI.
Reviewer mentioned in the comment, that machine_size_opts already has
isFunctionColdInCallGraph, isFunctionHotInCallGraphNthPercentile, etc
implemented. These can be refactored and reused across MFS and machine
size opts.
This CL does this - it refactors out those internal static functions
into PSI as templated functions, so they can be accessed easily.
Differential Revision: https://reviews.llvm.org/D152758
Advenam Tacet [Thu, 1 Jun 2023 05:13:24 +0000 (05:13 +0000)]
[2a/3][ASan][libcxx] std::deque annotations
This revision is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector`, to `std::string` and `std::deque` collections. These changes allow ASan to detect cases when the instrumented program accesses memory which is internally allocated by the collection but is still not in-use (accesses before or after the stored elements for `std::deque`, or between the size and capacity bounds for `std::string`).
The motivation for the research and those changes was a bug, found by Trail of Bits, in a real code where an out-of-bounds read could happen as two strings were compared via a std::equals function that took `iter1_begin`, `iter1_end`, `iter2_begin` iterators (with a custom comparison function). When object `iter1` was longer than `iter2`, read out-of-bounds on `iter2` could happen. Container sanitization would detect it.
This revision introduces annotations for `std::deque`. Each chunk of the container can now be annotated using the `__sanitizer_annotate_double_ended_contiguous_container` function, which was added in the rG1c5ad6d2c01294a0decde43a88e9c27d7437d157. Any attempt to access poisoned memory will trigger an ASan error. Although false negatives are rare, they are possible due to limitations in the ASan API, where a few (usually up to 7) bytes before the container may remain unpoisoned. There are no false positives in the same way as with `std::vector` annotations.
This patch only supports objects (deques) that use the standard allocator. However, it can be easily extended to support all allocators, as suggested in the D146815 revision.
Furthermore, the patch includes the addition of the `is_double_ended_contiguous_container_asan_correct` function to `libcxx/test/support/asan_testing.h`. This function can be used to verify whether a `std::deque` object has been correctly annotated.
Finally, the patch extends the unit tests to verify ASan annotations (added LIBCPP_ASSERTs).
If a program is compiled without ASan, all helper functions will be no-ops. In binaries with ASan, there is a negligible performance impact since the code from the change is only executed when the deque container changes in size and it’s proportional to the change. It is important to note that regardless of whether or not these changes are in use, every access to the container's memory is instrumented.
If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
Reviewed By: #libc, philnik
Differential Revision: https://reviews.llvm.org/D132092
Advenam Tacet [Tue, 27 Jun 2023 01:30:03 +0000 (03:30 +0200)]
[ASan] Remove sanity checks during annotation of contiguous container
This revision removes sanity checks in
`__sanitizer_annotate_contiguous_container`.
(Changed them to `DCHECK_EQ`.)
Those checks may be problematic, if someone manually unpoisoned memory block.
Manual unpoisoning may be used if part of the program is not
instrumented.
Those checks are helpful while confirming correctness of ASan annotations
implementation.
Originally suggested here: https://reviews.llvm.org/D136765#4174546
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D145482
Slava Zakharin [Mon, 26 Jun 2023 20:20:04 +0000 (13:20 -0700)]
[flang][hlfir] Special handling for temporary LHS in AssignOp.
When `AssignOp` is used with LHS that is a compiler generated temporary
special care must be taken to initialize the temporary and avoid
finalizations of its components. This change-set adds optional
`temporary_lhs` attribute for `AssignOp` to convey this information
to HLFIR-to-FIR conversion pass. Currently, this results in
calling `AssignTemporary` runtime for doing the assignment.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D152482
Jie Fu [Tue, 27 Jun 2023 01:14:07 +0000 (09:14 +0800)]
[llvm-exegesis] Fix unused variable 'VAddressSpaceCeiling' warning (NFC)
/Users/jiefu/llvm-project/llvm/tools/llvm-exegesis/lib/X86/Target.cpp:52:33: error: unused variable 'VAddressSpaceCeiling' [-Werror,-Wunused-const-variable]
static constexpr const intptr_t VAddressSpaceCeiling = 0x0000800000000000;
^
1 error generated.
Fangrui Song [Tue, 27 Jun 2023 01:05:00 +0000 (18:05 -0700)]
[llvm-profdata][unittest] Fix -Wsign-compare after D147740
Fangrui Song [Tue, 27 Jun 2023 01:02:22 +0000 (18:02 -0700)]
[MC] Report location information for MCDwarfCallFrameFragment diagnostics
Fangrui Song [Tue, 27 Jun 2023 00:58:29 +0000 (17:58 -0700)]
[MC] Add SMLoc to MCCFIInstruction
to help debug and report better diagnostics for functions like
relaxDwarfCallFrameFragment (D153167).
In MCStreamer, some emitCFI* functions already take a SMLoc argument. Add a
SMLoc argument to the remaining functions that generate a MCCFIInstruction.
Jason Molenda [Tue, 27 Jun 2023 00:45:16 +0000 (17:45 -0700)]
FileSystem::EnumerateDirectory should skip entries w/o Status, not halt
EnumerateDirectory gets the vfs::Status of each directory entry to
decide how to process it. If it is unable to get the Status for
a directory entry, it will currently halt the directory iteration
entirely. It should only skip this entry.
Differential Revision: https://reviews.llvm.org/D153822
rdar://
110861210
Hongtao Yu [Mon, 26 Jun 2023 18:50:30 +0000 (11:50 -0700)]
[CSSPGO][Preinliner] Bump up the threshold to favor previous compiler inline decision.
The compiler has more insight and knowledge about functions based on their IR and attribures and should make a better inline decision than the offline preinliner does which is purely based on callsites hotness and code size. Therefore I'm making changes to favor previous compiler inline decision by bumping up the callsite allowance.
This should improve the performance by more than 1% according to testing on Meta services.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D153797
LLVM GN Syncbot [Tue, 27 Jun 2023 00:06:15 +0000 (00:06 +0000)]
[gn build] Port
12e9c7aaa66b
William Huang [Sat, 24 Jun 2023 07:43:05 +0000 (07:43 +0000)]
[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map
This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map.
The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow.
Several changes to note:
(1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names.
(2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) )
(3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing.
(4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable.
(5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive)
Performance impact:
When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%.
Reviewed By: davidxl, snehasish
Differential Revision: https://reviews.llvm.org/D147740
River Riddle [Mon, 26 Jun 2023 18:36:06 +0000 (11:36 -0700)]
[mlir][bytecode] Fix lazy loading of non-isolated regions
The bytecode reader currently assumes all regions can be lazy
loaded, which breaks reading any non-isolated region. This patch
fixes that by properly handling nested non-lazy regions, and only
considers isolated regions as lazy.
Differential Revision: https://reviews.llvm.org/D153795
Benjamin Kramer [Mon, 26 Jun 2023 23:24:15 +0000 (01:24 +0200)]
[bazel][mlir] Add missing dependencies for
5a1cdcbd8698cd263696b38e2672fccac9ec793c
Petr Hosek [Fri, 21 Apr 2023 02:50:46 +0000 (02:50 +0000)]
[Driver][BareMetal] Support --emit-static-lib in BareMetal driver
This allows building static libraries with Clang driver.
Differential Revision: https://reviews.llvm.org/D148869
Alex MacLean [Mon, 26 Jun 2023 23:12:43 +0000 (16:12 -0700)]
[SelectionDAG] Add memory size for CSEMap ID calculation
In NVPTX `ReplaceVectorLoad()`, i1 and i8 types are promoted to i16,
followed by a truncate operation. Thus, v2i8 (or v2i1) and v2i16 will
have the same VTList, which causes a collision in CSEMap.
To differentiate the original VTList, let's add the size in generating
an ID. Otherwise the compiler crashes in refineAlignment:
`MMO->getSize() == getSize() && "Size mismatch!"`
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153712
Jerry Wu [Sun, 25 Jun 2023 08:21:09 +0000 (08:21 +0000)]
[mlir] Add pattern to handle trivial shape_cast in SPIR-V
Handle the trivial case of size-1 vector.shape_cast.
Differential Revision: https://reviews.llvm.org/D153719
Jim Ingham [Mon, 26 Jun 2023 23:01:18 +0000 (16:01 -0700)]
Don't allow SBValue::Cast to cast from a smaller type to a larger,
as we don't in general know where the extra data should come from.
Differential Revision: https://reviews.llvm.org/D153657
fernandosalas [Mon, 26 Jun 2023 22:41:18 +0000 (22:41 +0000)]
[scudo] Secondary Cache Dump
Dumped some basic info about what is being cached and about the cache
itself. Output of test below:
...
Stats: MapAllocatorCache: EntriesCount: 33, MaxEntriesCount: 64, MaxEntrySize: 1048576
StartBlockAddress: 0x6d342c1000, EndBlockAddress: 0x6d342d2000, BlockSize: 69632
StartBlockAddress: 0x6fc45ff000, EndBlockAddress: 0x6fc462f000, BlockSize: 196608
...
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D153483
Riley [Mon, 26 Jun 2023 22:30:27 +0000 (22:30 +0000)]
[scudo] Fix data leak in wrappers_c_test.cpp
In SmallAlign implemented deallocation for the pointers
Reviewed By: cferris, Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D153480
Chia-hung Duan [Mon, 26 Jun 2023 22:28:38 +0000 (22:28 +0000)]
[scudo] Fix insufficient blocks when pushing BatchClass blocks
`populateFreeListAndPopBatch` may return batch with single block. Merge
it with the reported free blocks and enqueue them together.
Reviewed By: cferris, fabio-d
Differential Revision: https://reviews.llvm.org/D153787
Eduard Zingerman [Thu, 15 Jun 2023 00:58:21 +0000 (03:58 +0300)]
[BPF] Propagate NoMerge attribute when lowering function calls
`NoMerge` attribute on machine instructions prevents certain
transformations from merging these instructions.
One of such transformations is 'llvm/lib/CodeGen/BranchFolding.cpp'.
This attribute should be copied from IR `call` instructions to machine
level instructions. See `X86TargetLowering::LowerCall` as another
example.
Differential Revision: https://reviews.llvm.org/D152987
Eduard Zingerman [Thu, 15 Jun 2023 00:35:07 +0000 (03:35 +0300)]
[clang] Allow 'nomerge' attribute for function pointers
Allow specifying 'nomerge' attribute for function pointers,
e.g. like in the following C code:
extern void (*foo)(void) __attribute__((nomerge));
void bar(long i) {
if (i)
foo();
else
foo();
}
With the goal to attach 'nomerge' to both calls done through 'foo':
@foo = external local_unnamed_addr global ptr, align 8
define dso_local void @bar(i64 noundef %i) local_unnamed_addr #0 {
; ...
%0 = load ptr, ptr @foo, align 8, !tbaa !5
; ...
if.then:
tail call void %0() #1
br label %if.end
if.else:
tail call void %0() #1
br label %if.end
if.end:
ret void
}
; ...
attributes #1 = { nomerge ... }
Report a warning in case if 'nomerge' is specified for a variable that
is not a function pointer, e.g.:
t.c:2:22: warning: 'nomerge' attribute is ignored because 'j' is not a function pointer [-Wignored-attributes]
2 | int j __attribute__((nomerge));
| ^
The intended use-case is for BPF backend.
BPF provides a sort of "standard library" functions that are called
helpers. BPF also verifies usage of these helpers before program
execution. Because of limitations of verification / runtime model it
is important to keep calls to some of such helpers from merging.
An example could be found by the link [1], there input C code:
if (data_end - data > 1024) {
bpf_for_each_map_elem(&map1, cb, &cb_data, 0);
} else {
bpf_for_each_map_elem(&map2, cb, &cb_data, 0);
}
Is converted to bytecode equivalent to:
if (data_end - data > 1024)
tmp = &map1;
else
tmp = &map2;
bpf_for_each_map_elem(tmp, cb, &cb_data, 0);
However, BPF verification/runtime requires to use the same map address
for each particular `bpf_for_each_map_elem()` call.
The 'nomerge' attribute is a perfect match for this situation, but
unfortunately BPF helpers are declared as pointers to functions:
static long (*bpf_for_each_map_elem)(void *map, ...) = (void *) 164;
Hence, this commit, allowing to use 'nomerge' for function pointers.
[1] https://lore.kernel.org/bpf/
03bdf90f-f374-1e67-69d6-
76dd9c8318a4@meta.com/
Differential Revision: https://reviews.llvm.org/D152986
Kelvin Li [Mon, 26 Jun 2023 18:30:10 +0000 (14:30 -0400)]
[flang] Fix flang-aarch64-latest-gcc build failure due to f295c88
Insert a return statement
Alex Langford [Mon, 26 Jun 2023 21:21:48 +0000 (14:21 -0700)]
[lldb] Increase the maximum number of classes we can read from shared cache
The shared cache has grown past the previously hardcoded limit. As a
temporary measure, I am increasing the hardcoded number of classes we
can expect to read from the shared cache. This is not a good long-term
solution.
Differential Revision: https://reviews.llvm.org/D153817
David Green [Mon, 26 Jun 2023 21:41:18 +0000 (22:41 +0100)]
[AArch64] Don't recreate nodes in tryCombineLongOpWithDup
If we don't find a node with either operand through
isEssentiallyExtractHighSubvector, there is little point
recreating the node with the same operands. Returning
SDValue better communicates that no changes were made.
This fixes #63491 by not recreating uabd nodes with swapped
operands. As noted in the ticket there are other fixes that
might be useful to make too, but this should prevent the
infinite combine.
Andres Villegas [Mon, 26 Jun 2023 21:35:39 +0000 (14:35 -0700)]
[llvm-libtool-darwin] Switch to OptTableSummary
Switch the parse of command line options fromllvm::cl to OptTable.
The motivation for this change is to continue adding llvm based tools
to the llvm driver multicall.
Differential Revision: https://reviews.llvm.org/D153665
Douglas Yung [Mon, 26 Jun 2023 21:33:09 +0000 (14:33 -0700)]
[NFC] Bump up DIAG_SIZE_AST as we have hit the limit as of f27afed.
Fangrui Song [Mon, 26 Jun 2023 21:26:06 +0000 (14:26 -0700)]
[MC] Reject CFI advance_loc separated by a non-private label for Mach-O
Due to Mach-O's .subsections_via_symbols mechanism, non-private labels cannot
appear between .cfi_startproc/.cfi_endproc. Compilers do not produce such
labels, but hand-written assembly may. Give an error. Unfortunately,
emitDwarfAdvanceFrameAddr generated MCExpr doesn't have location
informatin.
Note: evaluateKnownAbsolute is to force folding A-B to a constant even if A and
B are separate by a non-private label. The function is a workaround for some
Mach-O assembler issues and should generally be avoided.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D153167
yzhang93 [Mon, 26 Jun 2023 21:18:15 +0000 (14:18 -0700)]
[mlir] Narrow bitwidth emulation for MemRef load
This patch adds support for narrow bitwidth storage emulation. The goal is to support sub-byte type
codegen for LLVM CPU. Specifically, a type converter is added to convert memref of narrow bitwidth
(e.g., i4) into supported wider bitwidth (e.g., i8). Another focus of this patch is to populate the
pattern for int4 memref.load. memref.store pattern should be added in a seperate patch.
Reviewed By: hanchung, mravishankar
Differential Revision: https://reviews.llvm.org/D151519
cabreraam [Mon, 26 Jun 2023 21:03:07 +0000 (17:03 -0400)]
[flang][hlfir] `hlfir.char_extremum` lowering
This patch implements the lowering for the `hlfir.char_extremum` operation.
Discussion for this patch can be found in the draft patch [here](https://reviews.llvm.org/D143326). The reason
for not promoting this draft to a true patch for review was because I needed to separate the op
definition/codegen and lowering into two separate patches, as preferred by @jeanPerier.
Depends on D152474
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D152475
Fangrui Song [Mon, 26 Jun 2023 21:00:58 +0000 (14:00 -0700)]
[dataflow] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D153006
Aiden Grossman [Mon, 26 Jun 2023 20:52:55 +0000 (20:52 +0000)]
[llvm-exegesis] Define SYS_mmap to be SYS_mmap2 on 32-bit ARM
llvm-exegesis currently isn't compiling on 32-bit ARM due to SYS_mmap
not being defined as 32-bit ARM instead uses SYS_mmap2. This patch
defines SYS_mmap to be SYS_mmap2 in the relevant places to allow for
compilation to succeed.
Matthias Braun [Mon, 5 Jun 2023 22:15:01 +0000 (15:15 -0700)]
Ignore load/store until stack address computation
No longer conservatively assume a load/store accesses the stack when we
can prove that we did not compute any stack-relative address up to this
point in the program.
We do this in a cheap not-quite-a-dataflow-analysis: Assume
`NoStackAddressUsed` when all predecessors of a block already guarantee
it. Process blocks in reverse post order to guarantee that except for
loop headers we have processed all predecessors of a block before
processing the block itself. For loops we accept the conservative answer
as they are unlikely to be shrink-wrappable anyway.
Differential Revision: https://reviews.llvm.org/D152213
Matthias Braun [Thu, 25 May 2023 01:08:59 +0000 (18:08 -0700)]
Switch tests to use update_llc_test_checks
Switch and update some tests to use `update_llc_test_checks` to reduce
clutter in upcoming change.
Differential Revision: https://reviews.llvm.org/D152215
Aiden Grossman [Mon, 26 Jun 2023 20:41:54 +0000 (20:41 +0000)]
[llvm-exegesis] Define MAP_FIXED_NOREPLACE if not available
Some builders are currently failing as they don't have
MAP_FIXED_NOREPLACE available. This patch checks if MAP_FIXED_NOREPLACE
is available and if it isn't, it is simply defined as MAP_FIXED.
Shubham Sandeep Rastogi [Wed, 21 Jun 2023 23:47:28 +0000 (16:47 -0700)]
Do not emit a named symbol to denote the start of the debug_frame section
When emitting a debug_frame section, it contains a named symbol.
> echo "void foo(void) {}" | clang -arch arm64 -ffreestanding -g -c -o \
/tmp/test.o -x c -
> nm /tmp/test.o -s __DWARF __debug_frame
0000000000000200 s ltmp1
There are no such symbols emitted in any of the other DWARF sections,
this is because when the __debug_frame section is created, it doesn't
get a `BeginSymName` and so it creates a named symbol, such as `ltmp1`
and emits it when we switch to the section in MCDwarf.cpp.
This patch fixes the above issue.
Differential Revision: https://reviews.llvm.org/D153484
Aiden Grossman [Mon, 26 Jun 2023 20:26:30 +0000 (20:26 +0000)]
[llvm-exegesis] Explicitly link llvm-exegesis unit tests against librt
On some platforms such as PPC shm_open is in librt and it isn't
automatically linked against. This patch explicitly links against librt
in the unittests which should hopefully fix the symbol resolution
errors.
Diego Caballero [Mon, 26 Jun 2023 20:24:29 +0000 (20:24 +0000)]
[mlir][Vector] Fix vectorization of generic ops with transposed outputs
This patch fixes a bug in the way we compute the vector type for vector
transfer writes when the value to store needs to be transposed.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D153687
Nicolas Vasilache [Mon, 26 Jun 2023 20:07:51 +0000 (20:07 +0000)]
[mlir][linalg] Add named op for matmul_transpose_a
matmul with transposed LHS operand allows better memory access
patterns on several architectures including common GPUs. Having a named
op for it allows to handle this kind of matmul in a more explicit way.
LLVM GN Syncbot [Mon, 26 Jun 2023 20:01:56 +0000 (20:01 +0000)]
[gn build] Port
f8927838fa85
LLVM GN Syncbot [Mon, 26 Jun 2023 20:01:55 +0000 (20:01 +0000)]
[gn build] Port
83f875dc94d7
Nico Weber [Mon, 26 Jun 2023 20:00:53 +0000 (16:00 -0400)]
[gn] prepare for porting
f8927838fa8558702794
Aart Bik [Mon, 26 Jun 2023 19:09:33 +0000 (12:09 -0700)]
[mlir][sparse] minor code changes
Submitting for Wren
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D153804
Fangrui Song [Mon, 26 Jun 2023 19:55:48 +0000 (12:55 -0700)]
Nicolas Vasilache [Mon, 26 Jun 2023 19:43:58 +0000 (19:43 +0000)]
[mlir][linalg] Add missing op to match the generated file
D141430 added the generated yaml file for (batch_)?matmul_transpose_b ops, but the source of truth core_named_ops.py was not updated.
This change fixes .py file to generate the same result as the yaml file.
Differential revision: https://reviews.llvm.org/D150059
Authored-by: kon72 <kinsei0916@gmail.com>
Philip Reames [Mon, 26 Jun 2023 19:46:16 +0000 (12:46 -0700)]
[RISCV] Regen rvv/fixed-vectors-fmf.ll to avoid spurious test deltas
Fangrui Song [Mon, 26 Jun 2023 19:48:20 +0000 (12:48 -0700)]
[dataflow] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D153006
Anthony Cabrera [Thu, 8 Jun 2023 21:53:50 +0000 (17:53 -0400)]
[flang][hlfir] `hlfir.char_extremum` op definition and codegen
This patch adds an hlfir operation called `char_extremum`, which takes the
lexicographic comparison between a variadic number (minimum of 2 arguments) of
characters.
Discussion for this work can be found in the draft revision found
[here](https://reviews.llvm.org/D143326). The reason I'm not promoting that draft to
a true patch for review was because I needed to separate out the op
definition/codegen and lowering as two separate patches, as preferred by
@jeanPerier.
Differential Revision: https://reviews.llvm.org/D152474
Fangrui Song [Mon, 26 Jun 2023 19:28:02 +0000 (12:28 -0700)]
[Driver][ARM] Warn about -mabi= for assembler input
Previously, Clang Driver reported a warning when assembler input was assembled
with the -mabi= option. D152856 added TargetSpecific to -mabi= option and
reported an error for such a case. This change restores the previous behavior by
reporting a warning.
GCC translates -mabi={apcs-gnu,atpcs} to gas -meabi=gnu and other -mabi= values
to -meabi=5. We don't support setting e_flags to any value other than
EF_ARM_EABI_VER5.
Close https://github.com/ClangBuiltLinux/linux/issues/1878
Reviewed By: michaelplatings
Differential Revision: https://reviews.llvm.org/D153691
Aiden Grossman [Sat, 20 May 2023 09:46:50 +0000 (09:46 +0000)]
[llvm-exegesis] Add Target Memory Utility Functions
This patch adds in several functions to ExegesisTarget that will assist
in setting up memory for the planned memory annotations.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D151023
Sam McCall [Thu, 22 Jun 2023 19:54:52 +0000 (21:54 +0200)]
[dataflow] Make SAT solver deterministic
The SAT solver imported its constraints by iterating over an unordered DenseSet.
The path taken, and ultimately the runtime, the specific solution found, and
whether it would time out or complete could depend on the iteration order.
Instead, have the caller specify an ordered collection of constraints.
If this is built in a deterministic way, the system can have deterministic
behavior.
(The main alternative is to sort the constraints by value, but this option
is simpler today).
A lot of nondeterminism still appears to be remain in the framework, so today
the solver's inputs themselves are not deterministic yet.
Differential Revision: https://reviews.llvm.org/D153584
Craig Topper [Mon, 26 Jun 2023 19:14:34 +0000 (12:14 -0700)]
[RISCV] Add i32 as a legal type for GPR register class.
I'm investigating if it is feasible to have i32 as a legal type for RV64.
The first thing we need to do is make i32 a valid type for the GPR
register class.
We already added f32/f64 as valid types which required adding explicit
types to tablegen patterns. Adding additional types to GPR is free now.
Reviewed By: sunshaoce
Differential Revision: https://reviews.llvm.org/D151177
Aiden Grossman [Mon, 26 Jun 2023 18:57:51 +0000 (18:57 +0000)]
[NFC][llvm-exegesis] Disable tests using preprocessor directives
This patch changes to disabling tests in SubprocessMemoryTest.cpp using
preprocessor directives rather than pulling the file out of the build
using CMake. This is the de facto canonical way to do it in the rest of
the tree as seen in other unittest files such as DwarfDebugInfoTest.cpp.