Jessica Paquette [Sat, 10 Sep 2022 02:58:02 +0000 (19:58 -0700)]
Add a utility for converting between different types of remarks
This adds llvm-remarkutil. This is intended to be a general tool for doing stuff
with/to remark files.
This patch gives it the following powers:
* `bitstream2yaml` - To convert bitstream remarks to YAML
* `yaml2bitstream` - To convert YAML remarks to bitstream remarks
These are both implemented as subcommands, like
`llvm-remarkutil bitstream2yaml <input_file> -o -`
I ran into an issue where I had some bitstream remarks coming from CI, and I
wanted to be able to do stuff with them (e.g. visualize them). But then I
noticed we didn't have any tooling for doing that, so I decided to write this
thing.
Being able to output YAML as a start seemed like a good idea, since it
would allow people to reuse any tooling they may have written based around YAML
remarks.
Hopefully it can grow into a more featureful remark utility. :)
Currently there is an outstanding performance issue (see the TODO) with
the bitstream2yaml case. I decided that I'd keep the tool small to start with
and have the yaml2bitstream and bitstream2yaml cases be symmetric.
Also I moved the remarks documentation to its own header because it seems
a little out of place with "basic commands" and "developer tools"; it's
really kind of its own thing.
Differential Revision: https://reviews.llvm.org/D133646
Matt Arsenault [Mon, 12 Sep 2022 19:31:03 +0000 (15:31 -0400)]
LiveRegUnits: Do not use phys_regs_and_masks
Somehow DeadMachineInstructionElim is about 3x slower when using it.
Hopefully this reverses the compile time regression reported for
b5041527c75de2f409aa9e2e6deba12b17834c59.
David Majnemer [Wed, 31 Aug 2022 21:55:12 +0000 (21:55 +0000)]
[clang, llvm] Add __declspec(safebuffers), support it in CodeView
__declspec(safebuffers) is equivalent to
__attribute__((no_stack_protector)). This information is recorded in
CodeView.
While we are here, add support for strict_gs_check.
Aart Bik [Mon, 12 Sep 2022 19:53:07 +0000 (12:53 -0700)]
[mlir][sparse] add memSizes array to sparse storage format
Rationale:
For every dynamic memref (memref<?xtype>), the stored size really
indicates the capacity and the entry in the memSizes indicates
the actual size. This allows us to use memref's as "vectors".
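As a rough illustration of the capacity-versus-size split described above (a Python sketch, not the MLIR lowering; `GrowableBuffer`, its fields, and the doubling growth policy are all hypothetical), the length of the backing storage plays the role of the memref's stored size and a separate counter plays the role of the memSizes entry:

```python
class GrowableBuffer:
    """Hypothetical vector-like wrapper: the storage length is the capacity,
    while `size` tracks how many entries are actually in use."""

    def __init__(self, capacity):
        self.storage = [0] * capacity  # analogous to the dynamic memref
        self.size = 0                  # analogous to the memSizes entry

    def capacity(self):
        return len(self.storage)

    def push_back(self, value):
        # Grow the storage when full (doubling is an assumption of this sketch).
        if self.size == len(self.storage):
            self.storage.extend([0] * max(1, len(self.storage)))
        self.storage[self.size] = value
        self.size += 1

    def values(self):
        # Only the first `size` entries are meaningful.
        return self.storage[:self.size]


buf = GrowableBuffer(capacity=2)
for v in (10, 20, 30):
    buf.push_back(v)
```

Here `buf.capacity()` and `buf.size` diverge after the third push, which is the distinction the memSizes array is meant to record.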
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D133724
Greg Clayton [Tue, 30 Aug 2022 22:46:57 +0000 (15:46 -0700)]
Add the ability to show when variables fails to be available when debug info is valid.
Summary:
Many times when debugging, variables might not be available even though a user can successfully set breakpoints and stop somewhere. Letting the user know will help users fix these kinds of issues and have a better debugging experience.
Examples of this include:
- enabling -gline-tables-only and being able to set file and line breakpoints and yet see no variables
- unable to open object file for DWARF in .o file debugging for darwin targets due to modification time mismatch or not being able to locate the N_OSO file.
This patch adds a new API to the SBValueList object:
lldb::SBError lldb::SBValueList::GetError();
so that if you request a stack frame's variables using SBValueList SBFrame::GetVariables(...), you can get an error that describes why the variables were not available.
This patch adds the ability to get an error back when requesting variables from a lldb_private::StackFrame when calling GetVariableList.
It also now shows an error in response to "frame variable" if we have debug info and are unable to get variables due to an error as mentioned above:
(lldb) frame variable
error: "a.o" object from the "/tmp/libfoo.a" archive: either the .o file doesn't exist in the archive or the modification time (0x63111541) of the .o file doesn't match
Reviewers: labath JDevlieghere aadsm yinghuitan jdoerfert sscalpone
Subscribers:
Differential Revision: https://reviews.llvm.org/D133164
Kazu Hirata [Mon, 12 Sep 2022 20:34:35 +0000 (13:34 -0700)]
[llvm] Use x.empty() instead of llvm::empty(x) (NFC)
I'm planning to deprecate and eventually remove llvm::empty.
I thought about replacing llvm::empty(x) with std::empty(x), but it
turns out that all uses can be converted to x.empty(). That is, no
use requires the ability of std::empty to accept C arrays and
std::initializer_list.
Differential Revision: https://reviews.llvm.org/D133677
YongKang Zhu [Mon, 12 Sep 2022 20:24:47 +0000 (13:24 -0700)]
Bug fix on stable hash calculation for machine operands RegisterMask and RegisterLiveOut
MachineOperand::getRegMask() returns a pointer to register mask. We should hash the raw content of register mask instead of its pointer.
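The difference between hashing a pointer and hashing the content it points to can be sketched as follows. This is a Python illustration only: `stable_mask_hash` is a hypothetical helper, and SHA-256 is an arbitrary stand-in for whatever hash the real code uses.

```python
import hashlib
import struct


def stable_mask_hash(mask_words):
    """Hash the 32-bit words of a register mask by content, not by identity."""
    payload = struct.pack(f"<{len(mask_words)}I", *mask_words)
    return hashlib.sha256(payload).hexdigest()


mask_a = [0xFFFF0000, 0x0000000F]
mask_b = list(mask_a)  # same content, but a distinct object (a different "pointer")

# id() differs between the two objects, much like two pointers to equal masks,
# so a hash based on identity would not be stable across runs or copies.
identity_differs = id(mask_a) != id(mask_b)
content_hash_matches = stable_mask_hash(mask_a) == stable_mask_hash(mask_b)
```

Hashing the raw words gives equal masks the same hash regardless of where they happen to live in memory.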
Reviewed By: kyulee
Differential Revision: https://reviews.llvm.org/D133637
Ben Langmuir [Mon, 12 Sep 2022 20:10:22 +0000 (13:10 -0700)]
Revert "[clang][test] Disallow using the default module cache path in lit tests"
This reverts commit
d96f526196ac4cebfdd318473816f6d4b9d76707.
Some systems do not support `env -u`.
Fangrui Song [Mon, 12 Sep 2022 19:56:35 +0000 (12:56 -0700)]
[ELF] Parallelize relocation scanning
* Change `Symbol::flags` to a `std::atomic<uint16_t>`
* Add `llvm::parallel::threadIndex` as a thread-local non-negative integer
* Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
* Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.
MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.
Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:
* clang (Release): 1.27x as fast
* clang (Debug): 1.06x as fast
* chrome (default): 1.05x as fast
* scylladb (default): 1.04x as fast
Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):
* clang (Release): 1.31x as fast
* scylladb (default): 1.06x as fast
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D133003
Jez Ng [Mon, 12 Sep 2022 19:51:46 +0000 (15:51 -0400)]
[lld-macho][docs] Cosmetic changes
1. Fixed rST hyperlink syntax
2. Renamed LD64 -> ld64
3. Moved up the `-no_deduplicate` section so it is right under the
section talking about how our default dedup behavior differs; IMO
it makes more sense to read them in that order
4. De-bullet-listed some other sections so we have less whitespace in
the rendered page
5. Since the Mach-O LLD Port page has only one sub-page, don't render an
entire toctree with just one item. Use a "See also" box instead.
6. Wrap lines at 80 chars.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D133717
LLVM GN Syncbot [Mon, 12 Sep 2022 19:46:19 +0000 (19:46 +0000)]
[gn build] Port
cf72dddaefe9
Nico Weber [Mon, 12 Sep 2022 19:45:34 +0000 (15:45 -0400)]
[gn build] port
346856dc6c208 (or port
4d50a392401c0 more?)
Felipe de Azevedo Piovezan [Tue, 30 Aug 2022 13:28:14 +0000 (09:28 -0400)]
Reland "[lldb] Use just-built libcxx for tests when available"
This commit improves upon
cc0b5ebf7fc8, which added support for
specifying which libcxx to use when testing LLDB. That patch honored
requests by tests that had `USE_LIBCPP=1` defined in their makefiles.
Now, we also use a non-default libcxx if all conditions below are true:
1. The test is not explicitly requesting the use of libstdcpp
(USE_LIBSTDCPP=1).
2. The test is not explicitly requesting the use of the system's
library (USE_SYSTEM_STDLIB=1).
3. A path to libcxx was either provided by the user through CMake flags
or libcxx was built together with LLDB.
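The three conditions above amount to a simple predicate. A minimal Python sketch, assuming the makefile variables are available as a dictionary of strings and that a missing libcxx is represented by `None` (both are assumptions of this sketch, as is the name `use_just_built_libcxx`):

```python
def use_just_built_libcxx(makefile_vars, libcxx_path):
    """Return True only if all three conditions from the list above hold."""
    wants_libstdcpp = makefile_vars.get("USE_LIBSTDCPP") == "1"          # condition 1
    wants_system_stdlib = makefile_vars.get("USE_SYSTEM_STDLIB") == "1"  # condition 2
    libcxx_available = libcxx_path is not None                           # condition 3
    return not wants_libstdcpp and not wants_system_stdlib and libcxx_available


# A test with no explicit request picks up the non-default libcxx ...
default_test = use_just_built_libcxx({}, "/hypothetical/build/lib")
# ... while a test that asks for the system library does not.
system_test = use_just_built_libcxx({"USE_SYSTEM_STDLIB": "1"}, "/hypothetical/build/lib")
```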
Condition (2) is new and introduced in this patch in order to support
tests that are either:
* Cross-platform (such as API/macosx/macCatalyst and
API/tools/lldb-server). The just-built libcxx is usually not built for
platforms other than the host's.
* Cross-language (such as API/lang/objc/exceptions). In this case, the
Objective C runtime throws an exception that always goes through the
system's libcxx, instead of the just built libcxx. Fixing this would
require either changing the install-name of the just built libcxx in Mac
systems, or tuning the DYLD_LIBRARY_PATH variable at runtime.
Some other tests expose limitations of LLDB when running with a debug
standard library. TestDbgInfoContentForwardLists had an assertion
removed, as it was checking for buggy LLDB behavior (which now
crashes). TestFixIts had a variable renamed, as the old name clashes
with a standard library name when debug info is present. This is a known
issue: https://github.com/llvm/llvm-project/issues/34391.
For `TestSBModule`, the way the "main" module is found was changed to
look for the "a.out" module, instead of relying on the index being 0. In
some systems, the index 0 is dyld when a custom standard library is
used.
Differential Revision: https://reviews.llvm.org/D132940
Sanjay Patel [Mon, 12 Sep 2022 19:00:08 +0000 (15:00 -0400)]
[InstCombine] look through 'not' of ctlz/cttz op with 0-is-undef
https://alive2.llvm.org/ce/z/MNsC1S
This pattern was flagged at:
https://discourse.llvm.org/t/instcombines-select-optimizations-dont-trigger-reliably/64927
Sanjay Patel [Mon, 12 Sep 2022 18:34:57 +0000 (14:34 -0400)]
[InstCombine] add tests for select of ctlz/cttz with 'not' value; NFC
Zequan Wu [Sat, 10 Sep 2022 00:47:25 +0000 (17:47 -0700)]
[LLDB][NativePDB] Add local variables with no location info.
If we don't add local variables with no location info, then when trying to print
one, lldb won't find it in its parent DeclContext, which makes lldb spend more
time searching all the way up the DeclContext hierarchy until it finds a variable
with the same name or fails. The DWARF plugin also adds local vars even if they
don't have location info.
Differential Revision: https://reviews.llvm.org/D133626
Richard Howell [Fri, 9 Sep 2022 17:49:44 +0000 (10:49 -0700)]
[clang] sort additional module maps when serializing
Sort additional module maps when serializing pcm files. This ensures
the `MODULE_MAP_FILE` record is deterministic across repeated builds.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D133611
Matthias Braun [Tue, 16 Aug 2022 21:11:28 +0000 (14:11 -0700)]
Use update_mir_test_checks for some more tests.
Stella Stamenova [Mon, 12 Sep 2022 18:31:17 +0000 (11:31 -0700)]
Revert "Add the ability to show when variables fails to be available when debug info is valid."
This reverts commit
9af089f5179d52c6561ec27532880edcfb6253af.
This broke the windows lldb bot: https://lab.llvm.org/buildbot/#/builders/83/builds/23528
Laura Chaparro-Gutierrez [Mon, 12 Sep 2022 15:39:57 +0000 (15:39 +0000)]
[lldb] Add SBBreakpointLocation::SetCallback
* Include SetCallback in SBBreakpointLocation, similar to SBBreakpoint.
* Add test_breakpoint_location_callback test as part of TestMultithreaded.
Reviewed By: werat, JDevlieghere
Differential Revision: https://reviews.llvm.org/D133689
Co-authored-by: Andy Yankovsky <weratt@gmail.com>
Craig Topper [Mon, 12 Sep 2022 17:34:51 +0000 (10:34 -0700)]
[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.
For remainder:
If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves
together and use a (BitWidth / 2) urem. If (BitWidth / 2) is a legal integer
type, this urem will be expanded by DAGCombiner using multiply by magic
constant. We do have to take into account that adding high and low
together can produce a carry, making it a (BitWidth / 2)+1 bit number.
So we need to also add back in the carry from the first addition.
For division:
We can use the above trick to compute the remainder, subtract that
remainder from the dividend, then multiply by the multiplicative
inverse of the Divisor modulo (1 << BitWidth).
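The arithmetic behind both steps can be checked with a short Python sketch. This only models the math, not the SelectionDAG code; the 64-bit width and the function names are assumptions chosen for illustration.

```python
BITWIDTH = 64
HALF = BITWIDTH // 2
HALF_MASK = (1 << HALF) - 1
FULL_MASK = (1 << BITWIDTH) - 1


def urem_by_halves(x, divisor):
    """x % divisor for divisors where (1 << HALF) % divisor == 1."""
    assert 0 <= x <= FULL_MASK
    assert (1 << HALF) % divisor == 1
    lo = x & HALF_MASK
    hi = x >> HALF
    total = lo + hi                         # may need HALF + 1 bits
    carry = total >> HALF                   # carry out of the half-width add
    folded = (total & HALF_MASK) + carry    # add the carry back in; fits in HALF bits
    return folded % divisor                 # a half-width urem


def udiv_by_halves(x, divisor):
    """x // divisor via the remainder and a multiplicative inverse."""
    rem = urem_by_halves(x, divisor)
    # Such divisors are odd, so an inverse modulo 2**BITWIDTH exists.
    inverse = pow(divisor, -1, 1 << BITWIDTH)
    # x - rem is an exact multiple of divisor, so the wrapped product is the quotient.
    return ((x - rem) * inverse) & FULL_MASK


samples = [0, 1, 2**32 - 1, 2**32, 2**63 + 12345, FULL_MASK]
results = [(urem_by_halves(x, 5), udiv_by_halves(x, 5)) for x in samples]
```

With a 64-bit width, divisors such as 3, 5, 17, 257, and 65537 satisfy the precondition because they divide 2^32 - 1.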
This is based on the section "Remainder by Summing Digits" in
Hacker's Delight.
The remainder trick is similar to a trick you may have learned for
determining if a decimal number is divisible by 3. You can add all the
digits together and see if the sum is divisible by 3. If you're not sure
if the sum is divisible by 3, you can add its digits together. This
can be repeated until you have a single decimal digit. If that digit
is 3, 6, or 9, then the original number is divisible by 3. This works
because 10 % 3 == 1.
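The decimal version of the trick is easy to try out. A small Python sketch follows; the function name is made up for illustration, and 0 is included in the final check so that zero itself is handled, which the prose above does not cover.

```python
def divisible_by_3(n):
    """Repeatedly sum decimal digits until a single digit remains."""
    assert n >= 0
    while n >= 10:
        n = sum(int(digit) for digit in str(n))
    # A positive number never reduces to 0, so 0 only matters for n == 0.
    return n in (0, 3, 6, 9)


checks = [divisible_by_3(n) == (n % 3 == 0) for n in range(1000)]
```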
gcc already does this same trick. There are additional tricks gcc
does for urem as well as srem, udiv, and sdiv that I plan to add in
future patches.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130862
Corentin Jabot [Mon, 12 Sep 2022 15:32:47 +0000 (17:32 +0200)]
[Clang] NFC: Make UnqualifiedId::Kind private for consistency.
Differential Revision: https://reviews.llvm.org/D133703
Benjamin Kramer [Mon, 12 Sep 2022 10:55:58 +0000 (12:55 +0200)]
[DFSan] Don't crash with the legacy pass manager
TargetLibraryInfo isn't optional, so we have to provide it even with the
legacy stuff. Ideally we wouldn't need it anymore but there are still
users out there that are stuck on the legacy PM.
Differential Revision: https://reviews.llvm.org/D133685
Aart Bik [Fri, 9 Sep 2022 22:42:46 +0000 (15:42 -0700)]
[mlir][sparse] properly record dimension level type and properties
A next step towards supporting the new dimension level types and
properties. This change properly records the properties in the
Merger, so that subsequent computations (lattice optimizations)
and code generation (during sparsification) can do the right thing.
https://github.com/llvm/llvm-project/issues/51658
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D133620
A-Wadhwani [Mon, 12 Sep 2022 16:29:34 +0000 (09:29 -0700)]
[SROA] Create additional vector type candidates based on store and load slices
This patch adds additional vector types to be considered when doing
promotion in SROA, based on the types of the store and load slices. This
provides more promotion opportunities, by potentially using an optimal
"intermediate" vector type.
For example, the following code would currently not be promoted to a
vector, since `__m128i` is a `<2 x i64>` vector.
```
__m128i packfoo0(int a, int b, int c, int d) {
int r[4] = {a, b, c, d};
__m128i rm;
std::memcpy(&rm, r, sizeof(rm));
return rm;
}
```
```
packfoo0(int, int, int, int):
mov dword ptr [rsp - 24], edi
mov dword ptr [rsp - 20], esi
mov dword ptr [rsp - 16], edx
mov dword ptr [rsp - 12], ecx
movaps xmm0, xmmword ptr [rsp - 24]
ret
```
By also considering the types of the elements, we could find that the
`<4 x i32>` type would be valid for promotion, hence removing the memory
accesses for this function. In other words, we can explore other new
vector types, with the same size but different element types based on
the load and store instructions from the Slices, which can provide us
more promotion opportunities.
Additionally, the step for removing duplicate elements from the
`CandidateTys` vector was not using an equality comparator, which has
been fixed.
Differential Revision: https://reviews.llvm.org/D132096
Ben Langmuir [Fri, 9 Sep 2022 22:56:08 +0000 (15:56 -0700)]
[clang][test] Disallow using the default module cache path in lit tests
Make the default module cache path invalid when running lit tests so
that tests are forced to provide a cache path. This avoids accidentally
escaping to the system default location, and would have caught the
failure recently found in ClangScanDeps/multiple-commands.c.
Differential Revision: https://reviews.llvm.org/D133622
Benjamin Kramer [Mon, 12 Sep 2022 16:46:35 +0000 (18:46 +0200)]
[mlir][linalg] Explicitly instantiate DownscaleSizeOneWindowed2DConvolution
It's not possible to use a template with no definition from another
translation unit. Fixes the shared library build.
Adrian Prantl [Mon, 12 Sep 2022 16:48:31 +0000 (09:48 -0700)]
Skip crashing test
Craig Topper [Mon, 12 Sep 2022 16:32:15 +0000 (09:32 -0700)]
[RISCV] Rename WriteFALU* and ReadFALU* to WriteFAdd*/ReadFAdd*.
ALU seems a little vague. FAdd felt more precise even though it
also includes FSUB instructions.
Reviewed By: monkchiang
Differential Revision: https://reviews.llvm.org/D133632
Felipe de Azevedo Piovezan [Sat, 10 Sep 2022 11:21:27 +0000 (07:21 -0400)]
[lldb] Fix detection of existing libcxx
The CMake variable LLDB_HAS_LIBCXX is passed to
`llvm_canonicalize_cmake_booleans`, which transforms TRUE/FALSE into
'1'/'0'. It also transforms undefined variables to '0'.
In particular, this means that the configuration script for LLDB API's
test always has _some_ value for the `has_libcxx` configuration:
```
config.has_libcxx = '@LLDB_HAS_LIBCXX@'
```
When deciding whether a libcxx exists, the testing scripts would only
check for the existence of `has_libcxx`, but not for its value. In other
words, because `if ('0')` is true in Python, we always think there is a
libcxx.
This was caught once D132940 was merged and most tests started to use
libcxx by default if `has_libcxx` is true. Prior to that, no failures
were seen because only tests marked with
`@add_test_categories(["libc++"])` would require a libcxx, and these
would be filtered out on builds without the libcxx target. Furthermore,
the MacOS bots always build libcxx.
We fix this by making `has_libcxx` a boolean (instead of a string) and
by checking its value in the test configuration.
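The truthiness pitfall is easy to reproduce in isolation. The following Python sketch uses `types.SimpleNamespace` as a stand-in for the real configuration object, and `thinks_libcxx_exists` is a hypothetical helper, not the actual test-suite code:

```python
from types import SimpleNamespace

# What the substituted config line produced: a non-empty string.
config_string = SimpleNamespace(has_libcxx='0')
# What the fix aims for: a real boolean.
config_bool = SimpleNamespace(has_libcxx=False)


def thinks_libcxx_exists(config):
    # Any non-empty string is truthy in Python, including '0'.
    return bool(getattr(config, "has_libcxx", None))


before_fix = thinks_libcxx_exists(config_string)
after_fix = thinks_libcxx_exists(config_bool)
```

With the string value the check reports a libcxx even though the canonicalized CMake value was '0'; with the boolean it does not.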
Differential Revision: https://reviews.llvm.org/D133639
Sanjay Patel [Mon, 12 Sep 2022 16:03:21 +0000 (12:03 -0400)]
[Reassociate] prevent partial undef negation replacement
As shown in the examples in issue #57683, we allow matching
vectors with poison (undef) in this transform (and possibly more),
but we can't then use the partially defined value as a replacement
value in other expressions blindly.
This seems to be avoided in simpler examples of reassociation,
and other passes should be able to clean up the redundant op
seen in these tests.
Sanjay Patel [Mon, 12 Sep 2022 15:36:11 +0000 (11:36 -0400)]
[Reassociate] add tests for vector negate with undef elements; NFC
Reduced/expanded from issue #57683.
Craig Topper [Mon, 12 Sep 2022 16:12:56 +0000 (09:12 -0700)]
[RISCV] Custom type legalize i32 loads by sign extending.
The default is to use extload which can become a zextload or
sextload if it is followed by an 'and' or sext_inreg.
Sometimes type legalization will introduce an 'and' from promoting
something like 'srl X, C' and a sext_inreg from a setcc. The
'and' could be freely folded with the promoted 'srl' by using srliw,
but the sext_inreg can't be folded into a compare. DAG combiner
will see both of these choices and may decide to fold the 'and'
instead of the 'sext_inreg'. This forces the sext_inreg to become
a sext.w.
By picking sextload in the type legalizer we take this choice away.
Looking at spec2006 compiled with Zba and Zbb this appeared to be
a net reduction in lines of code in the objdump disassembly output.
This is similar to what we do with i32 add/sub/mul/shl in
type legalization where we always emit a sext_inreg.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D130397
Matthias Gehre [Mon, 12 Sep 2022 13:27:04 +0000 (14:27 +0100)]
Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth
Also remove new-pass-manager version of ExpandLargeDivRem because there is no way
yet to access TargetLowering in the new pass manager.
Differential Revision: https://reviews.llvm.org/D133691
Jay Foad [Mon, 12 Sep 2022 15:32:25 +0000 (16:32 +0100)]
[GlobalISel] Simplify extended add/sub to add/sub with carry
Simplify extended add/sub (with carry-in and carry-out) to add/sub with
carry (with carry-out only) if carry-in is known to be zero.
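Why this simplification is safe can be seen from a small model of the two operations. This is a Python sketch of the arithmetic only; the 32-bit width and the function names are assumptions, not GlobalISel APIs.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def add_extended(a, b, carry_in):
    """Add with carry-in and carry-out."""
    total = a + b + carry_in
    return total & MASK, total >> WIDTH


def add_with_carry_out(a, b):
    """Add with carry-out only."""
    total = a + b
    return total & MASK, total >> WIDTH


def sub_extended(a, b, borrow_in):
    """Subtract with borrow-in and borrow-out."""
    diff = a - b - borrow_in
    return diff & MASK, int(diff < 0)


def sub_with_borrow_out(a, b):
    """Subtract with borrow-out only."""
    diff = a - b
    return diff & MASK, int(diff < 0)


pairs = [(0, 0), (1, MASK), (MASK, MASK), (12345, 67890), (0, 1)]
adds_agree = all(add_extended(a, b, 0) == add_with_carry_out(a, b) for a, b in pairs)
subs_agree = all(sub_extended(a, b, 0) == sub_with_borrow_out(a, b) for a, b in pairs)
```

When the carry-in is known to be zero, both forms produce the same result and the same carry-out, so the extra operand can be dropped.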
Differential Revision: https://reviews.llvm.org/D133702
Kazu Hirata [Mon, 12 Sep 2022 15:52:51 +0000 (08:52 -0700)]
[mlir] Fix deprecation warnings (NFC)
This patch fixes a couple of warnings by switching to has_value/value:
mlir/lib/Dialect/Vector/IR/VectorOps.cpp:529:28: error: 'hasValue'
is deprecated: Use has_value
instead. [-Werror,-Wdeprecated-declarations]
mlir/lib/Dialect/Vector/IR/VectorOps.cpp:533:48: error: 'getValue'
is deprecated: Use value
instead. [-Werror,-Wdeprecated-declarations]
Katherine Rasmussen [Thu, 8 Sep 2022 17:02:43 +0000 (10:02 -0700)]
[flang] Write semantics test for atomic_fetch_and
Write a semantics test for the atomic intrinsic subroutine,
atomic_fetch_and.
Reviewed By: rouson
Differential Revision: https://reviews.llvm.org/D133506
Simon Pilgrim [Mon, 12 Sep 2022 15:34:29 +0000 (16:34 +0100)]
[CostModel][X86] Add CostKinds handling for abs ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695
Oleg Shyshkov [Mon, 12 Sep 2022 14:53:36 +0000 (16:53 +0200)]
[mlir] Change IteratorType in ContractionOp in Vector dialect from string to enum.
This is the first step in replacing iterator_type strings with enums in the Vector and Linalg dialects. This change adds IteratorTypeAttr and uses it in ContractionOp.
To avoid breaking all the tests, print/parse code has conversion between string and enum for now.
There is shared code in StructuredOpsUtils.h that expects iterator types to be strings. To break this dependency, this change forks the helper functions `isParallelIterator` and `isReductionIterator` to utils in both dialects and adds `getIteratorTypeNames()` to support backward compatibility with StructuredGenerator.
In the later changes, I plan to add a similar enum attribute to Linalg.
Differential Revision: https://reviews.llvm.org/D133696
Louis Dionne [Mon, 12 Sep 2022 14:51:07 +0000 (10:51 -0400)]
[libc++] Add LLDB data formatters dependencies to the CI image
This will be required in order to add a CI job running the LLDB
data formatters.
Florian Hahn [Mon, 12 Sep 2022 14:53:30 +0000 (15:53 +0100)]
[SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.
Adding the pre-header to CSEBlocks ensures instructions are CSE'd even
after hoisting.
This was originally discovered by @atrick a while ago.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D133649
Jun Zhang [Mon, 12 Sep 2022 14:21:17 +0000 (22:21 +0800)]
[Clang] Reword diagnostic for scope identifier with linkage
If the declaration of an identifier has block scope, and the identifier has
external or internal linkage, the declaration shall have no initializer for
the identifier.
Clang now gives a more suitable diagnostic for this case.
Fixes https://github.com/llvm/llvm-project/issues/57478
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D133088
Simon Pilgrim [Mon, 12 Sep 2022 14:36:41 +0000 (15:36 +0100)]
[CostModel][X86] Add CostKinds test coverage for abs intrinsics
Simon Pilgrim [Mon, 12 Sep 2022 14:06:18 +0000 (15:06 +0100)]
[SimplifyCFG][X86] Regenerate speculate-cttz-ctlz.ll
There's no difference between generic/bmi/lzcnt targets atm
Joe Nash [Fri, 9 Sep 2022 19:05:32 +0000 (15:05 -0400)]
[AMDGPU] Separate check lines for some GFX11 16-bit codegen tests
NFC. Pre-commits test changes to have a separate CHECK line where GFX11 behavior will diverge from
previous subtargets in a future patch.
Joe Nash [Fri, 9 Sep 2022 20:20:21 +0000 (16:20 -0400)]
[AMDGPU] Change test check name. NFC
Change the check name from GFX10 to GFX10Plus to reflect its actual usage
Alexey Bataev [Thu, 8 Sep 2022 14:41:24 +0000 (07:41 -0700)]
[SLP]Improve reordering of clustered reused scalars.
If the reused scalars are clustered, i.e. each part of the reused mask
contains all elements of the original scalars exactly once, we can
reorder those clusters to improve the whole ordering of the clustered
vectors.
Differential Revision: https://reviews.llvm.org/D133524
Joe Nash [Fri, 9 Sep 2022 20:24:18 +0000 (16:24 -0400)]
[AMDGPU] Autogenerate test with regclass numbers
NFC. This test contains a good amount of verbose compiler output as well
as numbers which depend on the number of registers defined and are
difficult to update. So autogenerate the test.
Matt Arsenault [Mon, 12 Sep 2022 13:33:22 +0000 (09:33 -0400)]
AMDGPU: Fix test failure
Forgot to commit regenerated test
Matt Arsenault [Sun, 24 Jul 2022 22:26:22 +0000 (18:26 -0400)]
RegAllocGreedy: Try local instruction splitting with subranges
This was only trying this to relax register class constraints, but
this can also help if there are subranges involved.
This solves a compilation failure for AMDGPU when there is high
pressure created by large register tuples. If one virtual register is
using most of the available budget, we need to be able to evict
subranges.
This solves the immediate failure, but this solution leaves a lot to
be desired. In the relevant testcases, we have 32-element tuples but
most of the uses are operations on 1 element subranges of it. What
we're now getting is a spill and restore of the full 1024 bits and an
extract of the used 32-bits. It would be far better if we introduced a
copy to a new virtual register with a smaller register class and used
narrower spills.
Furthermore, we could probably do a better job if the allocator were
to introduce new subranges where none previously existed in the
highest pressure scenarios. The block and region splits should also
try to split specific subranges out.
The mve-vst3.ll test changes look like noise to me, but instruction
count increased by one. mve-vst4.ll looks like a solid improvement
with several 16-byte spills eliminated. splitkit-copy-live-lanes.mir
also shows a solid reduction in total spill count.
This could use more tests but it's pretty tiring to come up with cases
that fail on this.
Matt Arsenault [Mon, 1 Aug 2022 13:13:29 +0000 (09:13 -0400)]
TableGen: Introduce generated getSubRegisterClass function
Currently there isn't a generic way to get a smaller register class
that can be produced from a subregister of a larger class. Replaces a
manually implemented version for AMDGPU. This will be used to improve
subregister support in the allocator.
Pavel Samolysov [Mon, 12 Sep 2022 12:39:57 +0000 (15:39 +0300)]
[NFC][ScheduleDAG] Use a reference to iterate over NodeSuccs/ChainSuccs
Pavel Samolysov [Mon, 12 Sep 2022 12:24:09 +0000 (15:24 +0300)]
[NFC][ScheduleDAG] Use structured bindings and emplace_back
Some uses of std::make_pair and the std::pair's first/second members
in the ScheduleDAGRRList.cpp file were replaced with the vector's
emplace_back along with structured bindings from C++17.
Sander de Smalen [Mon, 12 Sep 2022 11:05:55 +0000 (11:05 +0000)]
[AArch64][SME] Add utility class for handling SME attributes.
This patch adds a utility class that will be used in subsequent patches
for parsing the function/callsite attributes and determining whether
changes to PSTATE.SM are needed, or whether a lazy-save mechanism is
required.
It also implements some of the restrictions on the SME attributes
in the IR Verifier pass.
More details about the SME attributes and design can be found
in D131562.
Reviewed By: david-arm, aemerson
Differential Revision: https://reviews.llvm.org/D131570
Matt Arsenault [Fri, 24 Jun 2022 18:07:51 +0000 (14:07 -0400)]
DAG: Sink some getter code closer to uses
Matt Arsenault [Mon, 18 Jul 2022 19:52:47 +0000 (15:52 -0400)]
CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer
Previously this was assuming pointsToConstantMemory implies
dereferenceable.
Muhammad Omair Javaid [Mon, 12 Sep 2022 12:21:55 +0000 (17:21 +0500)]
[LLD] Fix pdb-natvis.test for Windows on Arm
pdb-natvis test fails on Arm Windows as generated byte sizes and crc
differ for Windows on Arm. This patch fixes the test to make it pass on
Arm/Windows.
Clement Courbet [Mon, 12 Sep 2022 09:37:11 +0000 (11:37 +0200)]
[llvm-exegesis][NFC] Use factory function for LlvmState.
This allows failing more gracefully.
Pavel Samolysov [Mon, 12 Sep 2022 10:54:44 +0000 (13:54 +0300)]
[NFC][ScheduleDAG] Use Register and MCPhysReg instead of unsigned
Matt Arsenault [Thu, 21 Apr 2022 13:11:40 +0000 (09:11 -0400)]
DeadMachineInstructionElim: Switch to using LiveRegUnits
Theoretically improves compile time for targets with many overlapping
registers
Matt Arsenault [Sat, 23 Jul 2022 23:58:24 +0000 (19:58 -0400)]
RegAlloc: Use SmallSet instead of std::set
There shouldn't be more than a small handful of hints at most.
Jonas Hahnfeld [Mon, 12 Sep 2022 11:45:17 +0000 (13:45 +0200)]
Fix build of Lex unit test with CLANG_DYLIB
If CLANG_LINK_CLANG_DYLIB, clang_target_link_libraries ignores all
individual libraries and only links clang-cpp. As LLVMTestingSupport
is separate, pass it via target_link_libraries directly.
J. Ryan Stinnett [Mon, 12 Sep 2022 11:43:17 +0000 (12:43 +0100)]
[DebugInfo][Docs] Fix RST syntax for DW_OP_LLVM_arg in LangRef
The inline code in the description of `DW_OP_LLVM_arg` wasn't terminating
correctly, leading to more text displayed as code than intended. This fixes that
up and adds a superscript as a tiny embellishment.
Muhammad Usman Shahid [Mon, 12 Sep 2022 11:47:18 +0000 (07:47 -0400)]
Rewording note note_constexpr_invalid_cast
The diagnostics here are correct, but the note is really silly. It
talks about reinterpret_cast in C code. So reword it for C mode by
using another %select{}.
```
int array[(long)(char *)0];
```
previous note:
```
cast that performs the conversions of a reinterpret_cast is not allowed in a constant expression
```
reworded note:
```
this conversion is not allowed in a constant expression
```
Differential Revision: https://reviews.llvm.org/D133194
zhongyunde [Mon, 12 Sep 2022 09:37:36 +0000 (17:37 +0800)]
[AA] Improve the BasicAA analysis capability
According to https://discourse.llvm.org/t/memoryssa-does-the-accessedbetween-support-scalable-vector-pointer/65052,
scalable vector support in BasicAA is currently quite limited,
and should be improved for a constant offset GEP if the scalable index is zero, e.g.:
getelementptr <vscale x 4 x i32>, ptr %p, i64 0, i64 %i
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D133567
Simon Pilgrim [Mon, 12 Sep 2022 11:08:42 +0000 (12:08 +0100)]
[CostModel][X86] Move AVX512/AVX2 uniform shift costs into the generic uniform cost tables
They shouldn't be happening after XOP shift costs - AVX2 shift support takes precedence over XOP for everything but vXi8 shifts - the improvement is pretty limited as it only affects bdver4 targets but it does help clean up a fraction of the messy shift cost logic....
Mehdi Amini [Mon, 29 Aug 2022 11:57:50 +0000 (11:57 +0000)]
Apply clang-tidy fixes for modernize-use-equals-default in TestPatterns.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 11:52:06 +0000 (11:52 +0000)]
Apply clang-tidy fixes for modernize-use-equals-default in TestLinalgDecomposeOps.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 11:26:39 +0000 (11:26 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in JitRunner.cpp (NFC)
David Green [Mon, 12 Sep 2022 10:13:23 +0000 (11:13 +0100)]
[AArch64] Add some extra typepromotion cost tests. NFC
David Spickett [Mon, 12 Sep 2022 09:56:52 +0000 (09:56 +0000)]
[llvm][AArch64] Test warning for clobbering w19 with base frame pointer
The test added in
739b69e655fe66674982cffc8b8166306355e7d3 only checked
that X19 triggers the explanation; also check W19.
lorenzo chelini [Mon, 12 Sep 2022 09:37:31 +0000 (11:37 +0200)]
[MLIR][Linalg] Fix typos in 'DropUnitDims.cpp'
David Spickett [Fri, 2 Sep 2022 15:19:02 +0000 (15:19 +0000)]
[LLVM][AArch64] Explain that X19 is used as the frame base pointer register
Fixes #50098
LLVM uses X19 as the frame base pointer, if it needs to. Meaning you
can get warnings if you clobber that with inline asm.
However, it doesn't explain why. The frame base register is not part
of the ABI so it's pretty confusing why you get that warning out of the blue.
This adds a method to explain a reserved register with X19 as the first one.
The logic is the same as getReservedRegs.
I could have added a return parameter to isASMClobberable and friends
but found that there's a lot of things that call isReservedReg in various
ways.
So while one more method on the pile isn't great design, it is simpler
right now to do it this way and only pay the cost if you are actually using
a reserved register.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D133213
Max Kazantsev [Mon, 12 Sep 2022 08:50:28 +0000 (15:50 +0700)]
[IRCE] Bail in case of pointer types. PR40539
We should not unconditionally expect that SCEVable types are all integers
because SCEV can also be computed for pointers. Bail in this case.
owenca [Sat, 10 Sep 2022 06:27:22 +0000 (23:27 -0700)]
[clang-format] Don't insert braces for loops with a null statement
This is a workaround for #57539.
Fixes #57509.
Differential Revision: https://reviews.llvm.org/D133635
Djordje Todorovic [Mon, 12 Sep 2022 06:23:07 +0000 (08:23 +0200)]
Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit df868edee561eb973edd85ec9df41c67aa0bff6b, as it
introduces a bug found by Alive2 (see rGdf868edee561 for details).
Johannes Doerfert [Mon, 12 Sep 2022 04:35:50 +0000 (21:35 -0700)]
Revert "[Attributor] AAPointerInfo should allow "harmless" uses"
Revert "[Attributor] Teach AAPointerInfo to look into aggregates"
This reverts commits 844f6c5d03d58e7ac0c6b838e4a7834ac575ab9b and
4ed0a88cd8a77370073feb270d77a9e8b27bd68c, as they broke the buildbots
that run openmp/libomptarget/test/offloading/bug49021.cpp.
Jeffrey Tan [Thu, 8 Sep 2022 18:00:22 +0000 (11:00 -0700)]
Add SBDebugger::GetSetting() public APIs
This patch adds a new SBDebugger::GetSetting() API, which
enables clients to access settings as SBStructuredData.
Implementation-wise, a new ToJSON() virtual function is added to the
OptionValue class so that each concrete child class can override it and
provide its own JSON representation. This patch aims to define the APIs and
implement a common set of OptionValue child classes, leaving the rest for
future patches.
This patch is used later by the feature that auto-deduces the source map from
source line breakpoints, to test the generated source map entries.
Differential Revision: https://reviews.llvm.org/D133038
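The ToJSON() override pattern described above can be sketched roughly as follows. This is only an illustration: `OptionValueSketch`, `BooleanValueSketch` and `UInt64ValueSketch` are hypothetical stand-ins, and returning a `std::string` is an assumption made to keep the example self-contained. LLDB's real OptionValue classes and the real ToJSON() signature are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical base class standing in for an option value hierarchy.
// Not LLDB's real OptionValue interface.
class OptionValueSketch {
public:
  virtual ~OptionValueSketch() = default;
  // Each concrete child class overrides this to provide its own JSON form.
  virtual std::string ToJSON() const = 0;
};

// Hypothetical boolean setting: serialized as a JSON literal.
class BooleanValueSketch : public OptionValueSketch {
public:
  explicit BooleanValueSketch(bool value) : value_(value) {}
  std::string ToJSON() const override { return value_ ? "true" : "false"; }

private:
  bool value_;
};

// Hypothetical unsigned integer setting: serialized as a JSON number.
class UInt64ValueSketch : public OptionValueSketch {
public:
  explicit UInt64ValueSketch(std::uint64_t value) : value_(value) {}
  std::string ToJSON() const override { return std::to_string(value_); }

private:
  std::uint64_t value_;
};
```

A caller holding only a reference to the base class gets the right representation through virtual dispatch, which is what lets remaining child classes be filled in by later patches.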
Johannes Doerfert [Mon, 12 Sep 2022 01:43:20 +0000 (18:43 -0700)]
[Attributor] AAPointerInfo should allow "harmless" uses
If a call base use will not capture a pointer we can approximate the
effects. This is important especially for readnone/readonly uses.
Johannes Doerfert [Wed, 31 Aug 2022 19:13:42 +0000 (12:13 -0700)]
[Attributor] Teach AAPointerInfo to look into aggregates
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
Johannes Doerfert [Tue, 30 Aug 2022 21:04:48 +0000 (14:04 -0700)]
[Attributor][FIX] Conservatively handle ptr2int, don't crash
If a pointer-2-int cast is found we give up on AAPointerInfo for now.
This caused a crash before.
Reported by John Tramm (@jtramm).
Johannes Doerfert [Fri, 12 Aug 2022 23:33:29 +0000 (18:33 -0500)]
[OpenMP] Allow the Attributor to look at functions we also internalized
This is important as we have accesses to globals in those which we need to
categorize.
V Donaldson [Sat, 10 Sep 2022 04:23:50 +0000 (21:23 -0700)]
[flang] Fix invalid branch optimization
Branch optimization in function rewriteIfGotos attempts to rewrite code
such as
  <<IfConstruct>>
    1 If[Then]Stmt: if(cond) goto L
    2 GotoStmt: goto L
    3 EndIfStmt
  <<End IfConstruct>>
  4 Statement: ...
  5 Statement: ...
  6 Statement: L ...
to eliminate a branch and a trivial basic block to get:
  <<IfConstruct>>
    1 If[Then]Stmt [negate]: if(cond) goto L
    4 Statement: ...
    5 Statement: ...
    3 EndIfStmt
  <<End IfConstruct>>
  6 Statement: L ...
Among other requirements, this is invalid if any statement between the
GOTO and its target is an intermediate construct statement such as a
CASE or ELSE IF statement, like the CASE DEFAULT statement in:
  select case(i)
  case (:2)
    n = i * 10
  case (5:)
    n = i * 1000
    if (i <= 6) goto 9 ! exit over 'case default'; may not be rewritten
    n = i * 10000
  case default
    n = i * 100
9 end select
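The requirement above can be sketched as a simple scan over the statements between the GOTO and its target. The `StmtKind` enum and both helper functions are hypothetical and exist only for illustration; flang's real parse-tree types are not shown here, and the list of intermediate construct statements is limited to the two kinds named above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical statement kinds, not flang's real representation.
enum class StmtKind { Action, Case, ElseIf };

// Statements that start a new branch inside an enclosing construct.
// Only the two kinds named in the text are listed; a real check
// would likely cover more.
constexpr bool isIntermediateConstructStmt(StmtKind kind) {
  return kind == StmtKind::Case || kind == StmtKind::ElseIf;
}

// One necessary condition for the rewrite: no statement between the
// GOTO and its target may be an intermediate construct statement.
inline bool spanAllowsRewrite(const std::vector<StmtKind> &between) {
  for (StmtKind kind : between) {
    if (isIntermediateConstructStmt(kind))
      return false;
  }
  return true;
}
```

In the select case example, the statements between `goto 9` and label 9 are an assignment, `case default`, and another assignment, so the scan rejects the rewrite.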
Kazu Hirata [Sun, 11 Sep 2022 23:11:41 +0000 (16:11 -0700)]
[XRay] Remove XRayRecordStorage
AFAICT, this type hasn't been used for at least 4 years.
Kazu Hirata [Sun, 11 Sep 2022 23:11:39 +0000 (16:11 -0700)]
[llvm] Use std::aligned_storage_t (NFC)
Tom Honermann [Sun, 11 Sep 2022 19:58:09 +0000 (15:58 -0400)]
[libc++] Workaround the absence of the __GLIBC_USE macro in glibc versions prior to 2.25.
This change corrects a configuration check that relies on the glibc __GLIBC_USE
macro being defined. Previously, the function-like macro was expanded without
ensuring it was actually defined. This resulted in compilation failures for
glibc versions prior to 2.25 (the glibc version in which the macro was added).
Differential Revision: https://reviews.llvm.org/D130946
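A minimal sketch of the guarding pattern, using made-up macros rather than the real glibc ones: `DEMO_USE` is defined and `MISSING_USE` is deliberately never defined. All names below are hypothetical, and this is not the exact libc++ change. The point is that the function-like macro is only expanded inside a group that is skipped when the macro is absent. Combining both tests in one `#if defined(M) && M(X)` line would not help, since the whole expression still has to parse after an undefined `M` is replaced by 0.

```cpp
#include <cassert>

// DEMO_USE mimics the shape of a function-like feature-test macro.
#define DEMO_USE_FEATURE_A 1
#define DEMO_USE(F) DEMO_USE_##F

// Macro present: the inner #if expands and evaluates it normally.
#if defined(DEMO_USE)
#  if DEMO_USE(FEATURE_A)
#    define HAVE_FEATURE_A 1
#  endif
#endif
#ifndef HAVE_FEATURE_A
#  define HAVE_FEATURE_A 0
#endif

// MISSING_USE is never defined. The inner #if sits in a skipped
// group, so its expression is never evaluated and cannot fail.
#if defined(MISSING_USE)
#  if MISSING_USE(FEATURE_B)
#    define HAVE_FEATURE_B 1
#  endif
#endif
#ifndef HAVE_FEATURE_B
#  define HAVE_FEATURE_B 0
#endif

constexpr bool haveFeatureA() { return HAVE_FEATURE_A != 0; }
constexpr bool haveFeatureB() { return HAVE_FEATURE_B != 0; }
```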
Kazu Hirata [Sun, 11 Sep 2022 19:19:37 +0000 (12:19 -0700)]
[GlobalISel] Use std::initializer_list::size (NFC)
Vitaly Buka [Sun, 11 Sep 2022 18:46:49 +0000 (11:46 -0700)]
[test][clangd] Fix use-after-return after 72142fbac4
Vitaly Buka [Sun, 11 Sep 2022 18:44:38 +0000 (11:44 -0700)]
Revert "[test][clangd] Fix use-after-return after 72142fbac4"
Will try another fix.
This reverts commit c3c930d573656a825523b7112891bd97eec7b64f.
Aaron Puchert [Sun, 11 Sep 2022 18:44:51 +0000 (20:44 +0200)]
Make sure libLLVM users link with libatomic if needed
64-bit atomics are used in llvm/ADT/Statistic.h, which means that users
of libLLVM.so might also have to link with libatomic. To avoid having
to special-case the library here, we simply add all `LLVM_SYSTEM_LIBS`
as public link dependencies to libLLVM.
This fixes a build failure on PowerPC 32-bit.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D132799
Aaron Puchert [Sun, 11 Sep 2022 18:44:36 +0000 (20:44 +0200)]
[libcxxabi] Fix forced_unwind3.pass.cpp compilation error
Under some circumstances there is no struct _Unwind_Exception, it's just
an alias to another struct. This would result in an error like this:
libcxxabi/test/forced_unwind3.pass.cpp:50:77: error: typedef '_Unwind_Exception' cannot be referenced with a struct specifier
static _Unwind_Reason_Code stop(int, _Unwind_Action actions, type, struct _Unwind_Exception*, struct _Unwind_Context*,
^
<...>/lib/clang/15.0.0/include/unwind.h:68:38: note: declared here
typedef struct _Unwind_Control_Block _Unwind_Exception; /* Alias */
^
This seems to have been an issue since the test was first added in
D109856, except that it didn't surface with Clang 14 because the code
is filtered out by the preprocessor if `__clang_major__ < 15`.
Reviewed By: danielkiss, mstorsjo, #libc_abi, ldionne
Differential Revision: https://reviews.llvm.org/D132873
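The error can be reproduced in miniature with stand-in names. `ControlBlock`, `ExceptionAlias` and `readCode` are hypothetical, and this sketch only illustrates the language rule; it is not the test's actual code. In C++, a name that is only a typedef cannot be used after the `struct` keyword, while the bare name works for both a struct and a typedef.

```cpp
#include <cassert>

// Hypothetical stand-ins for the unwinder types.
struct ControlBlock {
  int code;
};
typedef struct ControlBlock ExceptionAlias; /* Alias */

// Ill-formed, because ExceptionAlias is a typedef and not a struct tag:
//   int readCode(struct ExceptionAlias *exception);

// Dropping the struct keyword compiles whether the name is a real
// struct or only an alias for another one.
inline int readCode(ExceptionAlias *exception) { return exception->code; }
```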
Aaron Puchert [Sun, 11 Sep 2022 18:43:55 +0000 (20:43 +0200)]
[docs] Use relative URLs for man pages
Should have no effect on the online documentation, but it makes offline
builds more self-contained. With relative links however we have to
abstain from using `:manpage:` outside of man page cross-references.
Reviewed By: mysterymath
Differential Revision: https://reviews.llvm.org/D132794
Vitaly Buka [Sun, 11 Sep 2022 17:15:13 +0000 (10:15 -0700)]
[test][clangd] Fix use-after-return after 72142fbac4
Vitaly Buka [Sun, 11 Sep 2022 17:06:55 +0000 (10:06 -0700)]
Revert "[test][clangd] Try to unbrake bots after 72142fbac4"
It does not help.
This reverts commit 355dbd3b2aa28d479170c4e43265de186317dd86.
Mark de Wever [Sat, 3 Sep 2022 11:37:09 +0000 (13:37 +0200)]
[libc++][random] Removes transitive includes.
It seems these includes are still provided by the sub headers, so it only
removes the duplicates.
There is no change in the list of includes, but the change affects the
modular build. By not having the includes in the top-level header the
module map has changed. This uncovers missing includes in the tests
and missing exports in the module map, which accounts for the large number
of changes in the patch.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D133252
Junduo Dong [Tue, 16 Aug 2022 12:45:09 +0000 (05:45 -0700)]
[Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework
The previous implementation of time tracing in NewPassManager is direct but messy.
The key code looks like the demo below:
```
/// Runs the function pass across every function in the module.
PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                      LazyCallGraph &CG, CGSCCUpdateResult &UR) {
  /// ...
  PreservedAnalyses PassPA;
  {
    TimeTraceScope TimeScope(Pass.name());
    PassPA = Pass.run(F, FAM);
  }
  /// ...
}
```
It is tedious to work out by hand where the tracing code should be added.
With the PassInstrumentation framework, we can easily register `Before/After`
callback functions that do the time tracing.
Differential Revision: https://reviews.llvm.org/D131960
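The callback approach can be sketched with a toy registry. `MiniInstrumentation`, `runPass` and `installTimeTrace` are hypothetical and are not LLVM's PassInstrumentation API, and the "trace" here is just a log of begin/end events. The idea is that tracing is installed once, and every pass run through the registry is bracketed without touching each `run` method.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of an instrumentation registry.
struct MiniInstrumentation {
  using Callback = std::function<void(const std::string &)>;
  std::vector<Callback> before;
  std::vector<Callback> after;

  void registerBefore(Callback cb) { before.push_back(std::move(cb)); }
  void registerAfter(Callback cb) { after.push_back(std::move(cb)); }

  // Runs a pass body with every registered callback around it.
  template <typename Fn> void runPass(const std::string &name, Fn &&body) {
    for (auto &cb : before)
      cb(name);
    body();
    for (auto &cb : after)
      cb(name);
  }
};

// Installs tracing in one place. A real tracer would open and close
// a timing scope here instead of appending to a log.
inline void installTimeTrace(MiniInstrumentation &PI,
                             std::vector<std::string> &log) {
  PI.registerBefore(
      [&log](const std::string &name) { log.push_back("begin " + name); });
  PI.registerAfter(
      [&log](const std::string &name) { log.push_back("end " + name); });
}
```

Note that the log is captured by reference, so it has to outlive the registry in this sketch.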
Florian Hahn [Sun, 11 Sep 2022 11:24:43 +0000 (12:24 +0100)]
[VPlan] Check recipe uses instead of type of underlying instr (NFC).
Suggested by @Ayal post-commit, to reduce the dependence on the
underlying instruction in favor of information available directly for
the recipe.
Marc Auberer [Sun, 11 Sep 2022 10:13:25 +0000 (06:13 -0400)]
[InstCombine] Fold x + (x | -x) to x & (x - 1)
Fixes #57531
This transformation may be particularly useful on x86-64,
because x & (x - 1) can be performed by a single blsr instruction.
Differential Revision: https://reviews.llvm.org/D133362
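The identity behind the fold can be checked exhaustively for 8-bit values. This is only a sanity check of the arithmetic in wrapping unsigned math, not the InstCombine implementation; the function names are made up for the example. Intuitively, `x | -x` sets every bit from the lowest set bit of `x` upwards, which equals the negated lowest set bit, so adding it to `x` clears that bit.

```cpp
#include <cassert>
#include <cstdint>

// Pattern before the fold: x + (x | -x), truncated to 8 bits.
constexpr std::uint8_t beforeFold(std::uint8_t x) {
  return static_cast<std::uint8_t>(x + (x | (0u - x)));
}

// Pattern after the fold: x & (x - 1), which clears the lowest set bit.
constexpr std::uint8_t afterFold(std::uint8_t x) {
  return static_cast<std::uint8_t>(x & (x - 1u));
}

// Compares both forms over every 8-bit value, including zero.
inline bool foldHoldsForAllI8() {
  for (unsigned v = 0; v <= 0xFFu; ++v) {
    const std::uint8_t x = static_cast<std::uint8_t>(v);
    if (beforeFold(x) != afterFold(x))
      return false;
  }
  return true;
}
```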
Sanjay Patel [Sat, 10 Sep 2022 15:38:32 +0000 (11:38 -0400)]
[InstCombine] add tests for mul-by-neg-pow2; NFC
Sanjay Patel [Fri, 9 Sep 2022 20:19:03 +0000 (16:19 -0400)]
[InstCombine] add tests for demanded bits of add with multi-use; NFC