A-Wadhwani [Mon, 12 Sep 2022 16:29:34 +0000 (09:29 -0700)]
[SROA] Create additional vector type candidates based on store and load slices
This patch adds additional vector types to be considered when doing
promotion in SROA, based on the types of the store and load slices. This
provides more promotion opportunities, by potentially using an optimal
"intermediate" vector type.
For example, the following code would currently not be promoted to a
vector, since `__m128i` is a `<2 x i64>` vector.
```
__m128i packfoo0(int a, int b, int c, int d) {
int r[4] = {a, b, c, d};
__m128i rm;
std::memcpy(&rm, r, sizeof(rm));
return rm;
}
```
```
packfoo0(int, int, int, int):
mov dword ptr [rsp - 24], edi
mov dword ptr [rsp - 20], esi
mov dword ptr [rsp - 16], edx
mov dword ptr [rsp - 12], ecx
movaps xmm0, xmmword ptr [rsp - 24]
ret
```
By also considering the types of the elements, we could find that the
`<4 x i32>` type would be valid for promotion, hence removing the memory
accesses for this function. In other words, we can explore other new
vector types, with the same size but different element types based on
the load and store instructions from the Slices, which can provide us
more promotion opportunities.
Additionally, the step for removing duplicate elements from the
`CandidateTys` vector was not using an equality comparator, which has
been fixed.
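As a rough illustration of the duplicate-removal point (not the actual SROA code), the sketch below sorts with a less-than predicate and then passes a separate equality predicate to `std::unique`. The `Candidate` struct and the helper names are hypothetical stand-ins for the real candidate types.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a vector type candidate; only NumElements takes
// part in ranking and in the duplicate check.
struct Candidate {
  unsigned NumElements;
  unsigned ElementBits;
};

// Strict weak ordering, suitable for std::sort.
inline bool rankLess(const Candidate &A, const Candidate &B) {
  return A.NumElements < B.NumElements;
}

// Equality predicate, which is what std::unique expects.
inline bool rankEqual(const Candidate &A, const Candidate &B) {
  return A.NumElements == B.NumElements;
}

inline std::vector<Candidate> sortAndDedup(std::vector<Candidate> Tys) {
  std::sort(Tys.begin(), Tys.end(), rankLess);
  // Passing rankLess here instead would not remove duplicates correctly,
  // because std::unique requires an equivalence relation.
  Tys.erase(std::unique(Tys.begin(), Tys.end(), rankEqual), Tys.end());
  return Tys;
}
```

Calling `sortAndDedup` on a list with repeated element counts leaves one entry per distinct count.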
Differential Revision: https://reviews.llvm.org/D132096
Ben Langmuir [Fri, 9 Sep 2022 22:56:08 +0000 (15:56 -0700)]
[clang][test] Disallow using the default module cache path in lit tests
Make the default module cache path invalid when running lit tests so
that tests are forced to provide a cache path. This avoids accidentally
escaping to the system default location, and would have caught the
failure recently found in ClangScanDeps/multiple-commands.c.
Differential Revision: https://reviews.llvm.org/D133622
Benjamin Kramer [Mon, 12 Sep 2022 16:46:35 +0000 (18:46 +0200)]
[mlir][linalg] Explicitly instantiate DownscaleSizeOneWindowed2DConvolution
It's not possible to use a template with no definition from another
translation unit. Fixes the shared library build.
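A minimal single-file sketch of explicit instantiation, assuming a hypothetical `Scaler` template rather than the actual MLIR pattern. In a real project the declaration would live in a header and the definition plus the explicit instantiations in one .cpp file.

```cpp
#include <cassert>

// What the header would contain: a declaration only, no member definition.
template <typename T>
struct Scaler {
  T scale(T Value) const;
};

// What the single .cpp file would contain: the member definition.
template <typename T>
T Scaler<T>::scale(T Value) const {
  return Value * 2;
}

// Explicit instantiation definitions. They make the compiler emit the member
// functions for these types in this translation unit, so other translation
// units that only see the declaration can still link against them.
template struct Scaler<int>;
template struct Scaler<double>;
```

Without the two `template struct` lines, a translation unit that only sees the declaration has nothing to link against, which is the kind of failure a shared library build tends to expose.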
Adrian Prantl [Mon, 12 Sep 2022 16:48:31 +0000 (09:48 -0700)]
Skip crashing test
Craig Topper [Mon, 12 Sep 2022 16:32:15 +0000 (09:32 -0700)]
[RISCV] Rename WriteFALU* and ReadFALU* to WriteFAdd*/ReadFAdd*.
ALU seems a little vague. FAdd felt more precise even though it
also includes FSUB instructions.
Reviewed By: monkchiang
Differential Revision: https://reviews.llvm.org/D133632
Felipe de Azevedo Piovezan [Sat, 10 Sep 2022 11:21:27 +0000 (07:21 -0400)]
[lldb] Fix detection of existing libcxx
The CMake variable LLDB_HAS_LIBCXX is passed to
`llvm_canonicalize_cmake_booleans`, which transforms TRUE/FALSE into
'1'/'0'. It also transforms undefined variables to '0'.
In particular, this means that the configuration script for LLDB API's
test always has _some_ value for the `has_libcxx` configuration:
```
config.has_libcxx = '@LLDB_HAS_LIBCXX@'
```
When deciding whether a libcxx exists, the testing scripts would only
check for the existence of `has_libcxx`, but not for its value. In other
words, because `if ('0')` is true in Python, we always think there is a
libcxx.
This was caught once D132940 was merged and most tests started to use
libcxx by default if `has_libcxx` is true. Prior to that, no failures
were seen because only tests marked with
`@add_test_categories(["libc++"])` would require a libcxx, and these
would be filtered out on builds without the libcxx target. Furthermore,
the MacOS bots always build libcxx.
We fix this by making `has_libcxx` a boolean (instead of a string) and
by checking its value in the test configuration.
Differential Revision: https://reviews.llvm.org/D133639
Sanjay Patel [Mon, 12 Sep 2022 16:03:21 +0000 (12:03 -0400)]
[Reassociate] prevent partial undef negation replacement
As shown in the examples in issue #57683, we allow matching
vectors with poison (undef) in this transform (and possibly more),
but we can't then use the partially defined value as a replacement
value in other expressions blindly.
This seems to be avoided in simpler examples of reassociation,
and other passes should be able to clean up the redundant op
seen in these tests.
Sanjay Patel [Mon, 12 Sep 2022 15:36:11 +0000 (11:36 -0400)]
[Reassociate] add tests for vector negate with undef elements; NFC
Reduced/expanded from issue #57683.
Craig Topper [Mon, 12 Sep 2022 16:12:56 +0000 (09:12 -0700)]
[RISCV] Custom type legalize i32 loads by sign extending.
The default is to use extload which can become a zextload or
sextload if it is followed by an 'and' or sext_inreg.
Sometimes type legalization will introduce an 'and' from promoting
something like 'srl X, C' and a sext_inreg from a setcc. The
'and' could be freely folded with the promoted 'srl' by using srliw,
but the sext_inreg can't be folded into a compare. DAG combiner
will see both of these choices and may decide to fold the 'and'
instead of the 'sext_inreg'. This forces the sext_inreg to become
a sext.w.
By picking sextload in the type legalizer we take this choice away.
Looking at spec2006 compiled with Zba and Zbb, this appeared to be a
net reduction in lines of code in the objdump disassembly output.
This is similar to what we do with i32 add/sub/mul/shl in
type legalization where we always emit a sext_inreg.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D130397
Matthias Gehre [Mon, 12 Sep 2022 13:27:04 +0000 (14:27 +0100)]
Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth
Also remove new-pass-manager version of ExpandLargeDivRem because there is no way
yet to access TargetLowering in the new pass manager.
Differential Revision: https://reviews.llvm.org/D133691
Jay Foad [Mon, 12 Sep 2022 15:32:25 +0000 (16:32 +0100)]
[GlobalISel] Simplify extended add/sub to add/sub with carry
Simplify extended add/sub (with carry-in and carry-out) to add/sub with
carry (with carry-out only) if carry-in is known to be zero.
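The equivalence can be checked on a small model. The sketch below is not GlobalISel code: it models 8-bit unsigned addition with hypothetical helper names and verifies exhaustively that an add with carry-in and carry-out matches an add with carry-out only whenever the carry-in is zero.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Models an 8-bit add that consumes a carry-in and produces a carry-out.
inline std::pair<uint8_t, bool> addWithCarryInOut(uint8_t A, uint8_t B,
                                                  bool CarryIn) {
  unsigned Wide = unsigned(A) + unsigned(B) + (CarryIn ? 1u : 0u);
  return {static_cast<uint8_t>(Wide), Wide > 0xFFu};
}

// Models an 8-bit add that only produces a carry-out.
inline std::pair<uint8_t, bool> addWithCarryOut(uint8_t A, uint8_t B) {
  unsigned Wide = unsigned(A) + unsigned(B);
  return {static_cast<uint8_t>(Wide), Wide > 0xFFu};
}

// Exhaustively compares both forms for a carry-in of zero.
inline bool simplificationHolds() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      if (addWithCarryInOut(static_cast<uint8_t>(A), static_cast<uint8_t>(B),
                            false) !=
          addWithCarryOut(static_cast<uint8_t>(A), static_cast<uint8_t>(B)))
        return false;
  return true;
}
```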
Differential Revision: https://reviews.llvm.org/D133702
Kazu Hirata [Mon, 12 Sep 2022 15:52:51 +0000 (08:52 -0700)]
[mlir] Fix deprecation warnings (NFC)
This patch fixes a couple of warnings by switching to has_value/value:
mlir/lib/Dialect/Vector/IR/VectorOps.cpp:529:28: error: 'hasValue'
is deprecated: Use has_value
instead. [-Werror,-Wdeprecated-declarations]
mlir/lib/Dialect/Vector/IR/VectorOps.cpp:533:48: error: 'getValue'
is deprecated: Use value
instead. [-Werror,-Wdeprecated-declarations]
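For illustration, the replacement spelling matches that of `std::optional`. The sketch below uses `std::optional` as a stand-in for the optional type in the affected file, and `valueOr` is a hypothetical helper.

```cpp
#include <cassert>
#include <optional>

// Returns the wrapped value or a fallback, using has_value()/value() rather
// than the deprecated hasValue()/getValue() spelling.
inline int valueOr(const std::optional<int> &MaybeValue, int Fallback) {
  if (MaybeValue.has_value())
    return MaybeValue.value();
  return Fallback;
}
```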
Katherine Rasmussen [Thu, 8 Sep 2022 17:02:43 +0000 (10:02 -0700)]
[flang] Write semantics test for atomic_fetch_and
Write a semantics test for the atomic intrinsic subroutine,
atomic_fetch_and.
Reviewed By: rouson
Differential Revision: https://reviews.llvm.org/D133506
Simon Pilgrim [Mon, 12 Sep 2022 15:34:29 +0000 (16:34 +0100)]
[CostModel][X86] Add CostKinds handling for abs ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695
Oleg Shyshkov [Mon, 12 Sep 2022 14:53:36 +0000 (16:53 +0200)]
[mlir] Change IteratorType in ContractionOp in Vector dialect from string to enum.
This is the first step in replacing iterator_type strings with enums in the Vector and Linalg dialects. This change adds IteratorTypeAttr and uses it in ContractionOp.
To avoid breaking all the tests, the print/parse code converts between string and enum for now.
There is shared code in StructuredOpsUtils.h that expects iterator types to be strings. To break this dependency, this change forks the helper functions `isParallelIterator` and `isReductionIterator` into utils in both dialects and adds `getIteratorTypeNames()` to support backward compatibility with StructuredGenerator.
In later changes, I plan to add a similar enum attribute to Linalg.
Differential Revision: https://reviews.llvm.org/D133696
Louis Dionne [Mon, 12 Sep 2022 14:51:07 +0000 (10:51 -0400)]
[libc++] Add LLDB data formatters dependencies to the CI image
This will be required in order to add a CI job running the LLDB
data formatters.
Florian Hahn [Mon, 12 Sep 2022 14:53:30 +0000 (15:53 +0100)]
[SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.
Adding the pre-header to CSEBlocks ensures instructions are CSE'd even
after hoisting.
This was originally discovered by @atrick a while ago.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D133649
Jun Zhang [Mon, 12 Sep 2022 14:21:17 +0000 (22:21 +0800)]
[Clang] Reword diagnostic for scope identifier with linkage
If the declaration of an identifier has block scope, and the identifier has
external or internal linkage, the declaration shall have no initializer for
the identifier.
Clang now gives a more suitable diagnostic for this case.
Fixes https://github.com/llvm/llvm-project/issues/57478
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D133088
Simon Pilgrim [Mon, 12 Sep 2022 14:36:41 +0000 (15:36 +0100)]
[CostModel][X86] Add CostKinds test coverage for abs intrinsics
Simon Pilgrim [Mon, 12 Sep 2022 14:06:18 +0000 (15:06 +0100)]
[SimplifyCFG][X86] Regenerate speculate-cttz-ctlz.ll
There's no difference between generic/bmi/lzcnt targets atm
Joe Nash [Fri, 9 Sep 2022 19:05:32 +0000 (15:05 -0400)]
[AMDGPU] Separate check lines for some GFX11 16-bit codegen tests
NFC. Pre-commits test changes to have a separate CHECK line where GFX11 behavior will diverge from
previous subtargets in a future patch.
Joe Nash [Fri, 9 Sep 2022 20:20:21 +0000 (16:20 -0400)]
[AMDGPU] Change test check name. NFC
Change the check name from GFX10 to GFX10Plus to reflect its actual usage.
Alexey Bataev [Thu, 8 Sep 2022 14:41:24 +0000 (07:41 -0700)]
[SLP]Improve reordering of clustered reused scalars.
If the reused scalars are clustered, i.e. each part of the reused mask
contains all elements of the original scalars exactly once, we can
reorder those clusters to improve the whole ordering of the clustered
vectors.
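One reading of "clustered" can be sketched as a predicate: the mask is split into parts whose length equals the number of original scalars, and each part must be a permutation of their indices. This is an illustrative check under that assumption, not the SLP vectorizer's code, and `isClusteredReuseMask` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if every consecutive part of NumScalars mask entries contains
// each index in [0, NumScalars) exactly once.
inline bool isClusteredReuseMask(const std::vector<int> &Mask,
                                 std::size_t NumScalars) {
  if (NumScalars == 0 || Mask.empty() || Mask.size() % NumScalars != 0)
    return false;
  for (std::size_t Begin = 0; Begin < Mask.size(); Begin += NumScalars) {
    std::vector<bool> Seen(NumScalars, false);
    for (std::size_t I = Begin; I < Begin + NumScalars; ++I) {
      int Idx = Mask[I];
      if (Idx < 0 || static_cast<std::size_t>(Idx) >= NumScalars || Seen[Idx])
        return false;
      Seen[Idx] = true;
    }
  }
  return true;
}
```

Under this reading, the mask `0 1 2 3 1 0 3 2` over four scalars is clustered, while `0 0 1 1 2 2 3 3` is not.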
Differential Revision: https://reviews.llvm.org/D133524
Joe Nash [Fri, 9 Sep 2022 20:24:18 +0000 (16:24 -0400)]
[AMDGPU] Autogenerate test with regclass numbers
NFC. This test contains a good amount of verbose compiler output as well
as numbers which depend on the number of registers defined and are
difficult to update. So autogenerate the test.
Matt Arsenault [Mon, 12 Sep 2022 13:33:22 +0000 (09:33 -0400)]
AMDGPU: Fix test failure
Forgot to commit regenerated test
Matt Arsenault [Sun, 24 Jul 2022 22:26:22 +0000 (18:26 -0400)]
RegAllocGreedy: Try local instruction splitting with subranges
Previously, this was only tried in order to relax register class
constraints, but it can also help if there are subranges involved.
This solves a compilation failure for AMDGPU when there is high
pressure created by large register tuples. If one virtual register is
using most of the available budget, we need to be able to evict
subranges.
This solves the immediate failure, but this solution leaves a lot to
be desired. In the relevant testcases, we have 32-element tuples but
most of the uses are operations on 1 element subranges of it. What
we're now getting is a spill and restore of the full 1024 bits and an
extract of the used 32-bits. It would be far better if we introduced a
copy to a new virtual register with a smaller register class and used
narrower spills.
Furthermore, we could probably do a better job if the allocator were
to introduce new subranges where none previously existed in the
highest pressure scenarios. The block and region splits should also
try to split specific subranges out.
The mve-vst3.ll test changes look like noise to me, but the instruction
count increased by one. mve-vst4.ll looks like a solid improvement
with several 16-byte spills eliminated. splitkit-copy-live-lanes.mir
also shows a solid reduction in total spill count.
This could use more tests but it's pretty tiring to come up with cases
that fail on this.
Matt Arsenault [Mon, 1 Aug 2022 13:13:29 +0000 (09:13 -0400)]
TableGen: Introduce generated getSubRegisterClass function
Currently there isn't a generic way to get a smaller register class
that can be produced from a subregister of a larger class. Replaces a
manually implemented version for AMDGPU. This will be used to improve
subregister support in the allocator.
Pavel Samolysov [Mon, 12 Sep 2022 12:39:57 +0000 (15:39 +0300)]
[NFC][ScheduleDAG] Use a reference to iterate over NodeSuccs/ChainSuccs
Pavel Samolysov [Mon, 12 Sep 2022 12:24:09 +0000 (15:24 +0300)]
[NFC][ScheduleDAG] Use structure bindings and emplace_back
Some uses of std::make_pair and of std::pair's first/second members
in the ScheduleDAGRRList.cpp file were replaced with the vector's
emplace_back and with structured bindings from C++17.
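The general shape of such a change can be sketched as follows. The container contents and function names are made up for illustration and do not come from ScheduleDAGRRList.cpp.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using CostList = std::vector<std::pair<std::string, unsigned>>;

inline CostList buildCosts() {
  CostList Costs;
  // Before: Costs.push_back(std::make_pair("load", 3u));
  Costs.emplace_back("load", 3u); // constructs the pair in place
  Costs.emplace_back("store", 1u);
  return Costs;
}

inline unsigned totalCost(const CostList &Costs) {
  unsigned Total = 0;
  // Before: for (const auto &P : Costs) { ... P.first ... P.second ... }
  for (const auto &[Name, Cost] : Costs) {
    if (!Name.empty())
      Total += Cost;
  }
  return Total;
}
```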
Sander de Smalen [Mon, 12 Sep 2022 11:05:55 +0000 (11:05 +0000)]
[AArch64][SME] Add utility class for handling SME attributes.
This patch adds a utility class that will be used in subsequent patches
for parsing the function/callsite attributes and determining whether
changes to PSTATE.SM are needed, or whether a lazy-save mechanism is
required.
It also implements some of the restrictions on the SME attributes
in the IR Verifier pass.
More details about the SME attributes and design can be found
in D131562.
Reviewed By: david-arm, aemerson
Differential Revision: https://reviews.llvm.org/D131570
Matt Arsenault [Fri, 24 Jun 2022 18:07:51 +0000 (14:07 -0400)]
DAG: Sink some getter code closer to uses
Matt Arsenault [Mon, 18 Jul 2022 19:52:47 +0000 (15:52 -0400)]
CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer
Previously this was assuming pointsToConstantMemory implies
dereferenceable.
Muhammad Omair Javaid [Mon, 12 Sep 2022 12:21:55 +0000 (17:21 +0500)]
[LLD] Fix pdb-natvis.test for Windows on Arm
pdb-natvis test fails on Arm Windows as generated byte sizes and crc
differ for Windows on Arm. This patch fixes the test to make it pass on
Arm/Windows.
Clement Courbet [Mon, 12 Sep 2022 09:37:11 +0000 (11:37 +0200)]
[llvm-exegesis][NFC] Use factory function for LlvmState.
This allows failing more gracefully.
Pavel Samolysov [Mon, 12 Sep 2022 10:54:44 +0000 (13:54 +0300)]
[NFC][ScheduleDAG] Use Register and MCPhysReg instead of unsigned
Matt Arsenault [Thu, 21 Apr 2022 13:11:40 +0000 (09:11 -0400)]
DeadMachineInstructionElim: Switch to using LiveRegUnits
Theoretically improves compile time for targets with many overlapping
registers
Matt Arsenault [Sat, 23 Jul 2022 23:58:24 +0000 (19:58 -0400)]
RegAlloc: Use SmallSet instead of std::set
There shouldn't be more than a small handful of hints at most.
Jonas Hahnfeld [Mon, 12 Sep 2022 11:45:17 +0000 (13:45 +0200)]
Fix build of Lex unit test with CLANG_DYLIB
If CLANG_LINK_CLANG_DYLIB, clang_target_link_libraries ignores all
individual libraries and only links clang-cpp. As LLVMTestingSupport
is separate, pass it via target_link_libraries directly.
J. Ryan Stinnett [Mon, 12 Sep 2022 11:43:17 +0000 (12:43 +0100)]
[DebugInfo][Docs] Fix RST syntax for DW_OP_LLVM_arg in LangRef
The inline code in the description of `DW_OP_LLVM_arg` wasn't terminating
correctly, leading to more text displayed as code than intended. This fixes that
up and adds a superscript as a tiny embellishment.
Muhammad Usman Shahid [Mon, 12 Sep 2022 11:47:18 +0000 (07:47 -0400)]
Rewording note note_constexpr_invalid_cast
The diagnostics here are correct, but the note is really silly. It
talks about reinterpret_cast in C code. So reword it for C mode by
using another %select{}.
```
int array[(long)(char *)0];
```
previous note:
```
cast that performs the conversions of a reinterpret_cast is not allowed in a constant expression
```
reworded note:
```
this conversion is not allowed in a constant expression
```
Differential Revision: https://reviews.llvm.org/D133194
zhongyunde [Mon, 12 Sep 2022 09:37:36 +0000 (17:37 +0800)]
[AA] Improve the BasicAA analysis capability
According to https://discourse.llvm.org/t/memoryssa-does-the-accessedbetween-support-scalable-vector-pointer/65052,
scalable vector support in BasicAA is currently very limited.
It can be improved for a constant-offset GEP if the scalable index is zero, e.g.:
getelementptr <vscale x 4 x i32>, ptr %p, i64 0, i64 %i
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D133567
Simon Pilgrim [Mon, 12 Sep 2022 11:08:42 +0000 (12:08 +0100)]
[CostModel][X86] Move AVX512/AVX2 uniform shift costs into the generic uniform cost tables
They shouldn't be happening after XOP shift costs - AVX2 shift support takes precedence over XOP for everything but vXi8 shifts - the improvement is pretty limited as it only affects bdver4 targets, but it does help clean up a fraction of the messy shift cost logic.
Mehdi Amini [Mon, 29 Aug 2022 11:57:50 +0000 (11:57 +0000)]
Apply clang-tidy fixes for modernize-use-equals-default in TestPatterns.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 11:52:06 +0000 (11:52 +0000)]
Apply clang-tidy fixes for modernize-use-equals-default in TestLinalgDecomposeOps.cpp (NFC)
Mehdi Amini [Mon, 29 Aug 2022 11:26:39 +0000 (11:26 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in JitRunner.cpp (NFC)
David Green [Mon, 12 Sep 2022 10:13:23 +0000 (11:13 +0100)]
[AArch64] Add some extra typepromotion cost tests. NFC
David Spickett [Mon, 12 Sep 2022 09:56:52 +0000 (09:56 +0000)]
[llvm][AArch64] Test warning for clobbering w19 with base frame pointer
The test added in
739b69e655fe66674982cffc8b8166306355e7d3 only checked
that X19 triggers the explanation, also check W19.
lorenzo chelini [Mon, 12 Sep 2022 09:37:31 +0000 (11:37 +0200)]
[MLIR][Linalg] Fix typos in 'DropUnitDims.cpp'
David Spickett [Fri, 2 Sep 2022 15:19:02 +0000 (15:19 +0000)]
[LLVM][AArch64] Explain that X19 is used as the frame base pointer register
Fixes #50098
LLVM uses X19 as the frame base pointer, if it needs to. Meaning you
can get warnings if you clobber that with inline asm.
However, it doesn't explain why. The frame base register is not part
of the ABI so it's pretty confusing why you get that warning out of the blue.
This adds a method to explain a reserved register with X19 as the first one.
The logic is the same as getReservedRegs.
I could have added a return parameter to isASMClobberable and friends
but found that there's a lot of things that call isReservedReg in various
ways.
So while one more method on the pile isn't great design, it is simpler
right now to do it this way and only pay the cost if you are actually using
a reserved register.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D133213
Max Kazantsev [Mon, 12 Sep 2022 08:50:28 +0000 (15:50 +0700)]
[IRCE] Bail in case of pointer types. PR40539
We should not unconditionally expect that SCEVable types are all integers
because SCEV can also be computed for pointers. Bail in this case.
owenca [Sat, 10 Sep 2022 06:27:22 +0000 (23:27 -0700)]
[clang-format] Don't insert braces for loops with a null statement
This is a workaround for #57539.
Fixes #57509.
Differential Revision: https://reviews.llvm.org/D133635
Djordje Todorovic [Mon, 12 Sep 2022 06:23:07 +0000 (08:23 +0200)]
Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit
df868edee561eb973edd85ec9df41c67aa0bff6b, as it
introduces a bug found by Alive2 (more on the rGdf868edee561).
Johannes Doerfert [Mon, 12 Sep 2022 04:35:50 +0000 (21:35 -0700)]
Revert "[Attributor] AAPointerInfo should allow "harmless" uses"
Revert "[Attributor] Teach AAPointerInfo to look into aggregates"
This reverts commit
844f6c5d03d58e7ac0c6b838e4a7834ac575ab9b and
4ed0a88cd8a77370073feb270d77a9e8b27bd68c as they broke the buildbots
that run openmp/libomptarget/test/offloading/bug49021.cpp.
Jeffrey Tan [Thu, 8 Sep 2022 18:00:22 +0000 (11:00 -0700)]
Add SBDebugger::GetSetting() public APIs
This patch adds a new SBDebugger::GetSetting() API which
enables clients to access settings as SBStructuredData.
Implementation-wise, a new ToJSON() virtual function is added to the OptionValue
class so that each concrete child class can override it and provide its
own JSON representation. This patch aims to define the APIs and implement
a common set of OptionValue child classes, leaving the rest for
future patches.
This patch is used later by the feature that auto-deduces source maps from
source line breakpoints, to test the generated source map entries.
Differential Revision: https://reviews.llvm.org/D133038
Johannes Doerfert [Mon, 12 Sep 2022 01:43:20 +0000 (18:43 -0700)]
[Attributor] AAPointerInfo should allow "harmless" uses
If a call base use will not capture a pointer we can approximate the
effects. This is important especially for readnone/only uses.
Johannes Doerfert [Wed, 31 Aug 2022 19:13:42 +0000 (12:13 -0700)]
[Attributor] Teach AAPointerInfo to look into aggregates
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
Johannes Doerfert [Tue, 30 Aug 2022 21:04:48 +0000 (14:04 -0700)]
[Attributor][FIX] Conservatively handle ptr2int, don't crash
If a pointer-2-int cast is found we give up on AAPointerInfo for now.
This caused a crash before.
Reported by John Tramm (@jtramm).
Johannes Doerfert [Fri, 12 Aug 2022 23:33:29 +0000 (18:33 -0500)]
[OpenMP] Allow the Attributor to look at functions we also internalized
This is important as we have accesses to globals in those which we need to
categorize.
V Donaldson [Sat, 10 Sep 2022 04:23:50 +0000 (21:23 -0700)]
[flang] Fix invalid branch optimization
Branch optimization in function rewriteIfGotos attempts to rewrite code
such as
<<IfConstruct>>
1 If[Then]Stmt: if(cond) goto L
2 GotoStmt: goto L
3 EndIfStmt
<<End IfConstruct>>
4 Statement: ...
5 Statement: ...
6 Statement: L ...
to eliminate a branch and a trivial basic block to get:
<<IfConstruct>>
1 If[Then]Stmt [negate]: if(cond) goto L
4 Statement: ...
5 Statement: ...
3 EndIfStmt
<<End IfConstruct>>
6 Statement: L ...
Among other requirements, this is invalid if any statement between the
GOTO and its target is an intermediate construct statement such as a
CASE or ELSE IF statement, like the CASE DEFAULT statement in:
select case(i)
case (:2)
n = i * 10
case (5:)
n = i * 1000
if (i <= 6) goto 9 ! exit over 'case default'; may not be rewritten
n = i * 10000
case default
n = i * 100
9 end select
Kazu Hirata [Sun, 11 Sep 2022 23:11:41 +0000 (16:11 -0700)]
[XRay] Remove XRayRecordStorage
AFAICT, this type hasn't been used for at least 4 years.
Kazu Hirata [Sun, 11 Sep 2022 23:11:39 +0000 (16:11 -0700)]
[llvm] Use std::aligned_storage_t (NFC)
Tom Honermann [Sun, 11 Sep 2022 19:58:09 +0000 (15:58 -0400)]
[libc++] Workaround the absence of the __GLIBC_USE macro in glibc versions prior to 2.25.
This change corrects a configuration check that relies on the glibc __GLIBC_USE
macro being defined. Previously, the function-like macro was expanded without
ensuring it was actually defined. This resulted in compilation failures for
glibc versions prior to 2.25 (the glibc version in which the macro was added).
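One common way to guard a function-like macro is to nest the conditionals, sketched below with a hypothetical `HYPOTHETICAL_USE` macro in place of the glibc one. The names `FEATURE_X`, `HAVE_FEATURE_X` and `haveFeatureX` are likewise illustrative, and this is not necessarily the exact form used in the patch.

```cpp
#include <cassert>

// HYPOTHETICAL_USE stands in for a function-like feature-test macro that only
// newer library versions define. Uncomment to simulate a newer library:
// #define HYPOTHETICAL_USE(Feature) 1

// Writing `#if HYPOTHETICAL_USE(FEATURE_X)` directly fails to preprocess when
// the macro is undefined, and so does
// `#if defined(HYPOTHETICAL_USE) && HYPOTHETICAL_USE(FEATURE_X)`, because the
// whole expression still has to parse. Nesting avoids evaluating the call.
#if defined(HYPOTHETICAL_USE)
#  if HYPOTHETICAL_USE(FEATURE_X)
#    define HAVE_FEATURE_X 1
#  endif
#endif

#ifndef HAVE_FEATURE_X
#  define HAVE_FEATURE_X 0
#endif

inline bool haveFeatureX() { return HAVE_FEATURE_X != 0; }
```

With the macro left undefined, as above, the inner `#if` is skipped without being evaluated and `haveFeatureX()` reports false.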
Differential Revision: https://reviews.llvm.org/D130946
Kazu Hirata [Sun, 11 Sep 2022 19:19:37 +0000 (12:19 -0700)]
[GlobalISel] Use std::initializer_list::size (NFC)
Vitaly Buka [Sun, 11 Sep 2022 18:46:49 +0000 (11:46 -0700)]
[test][clangd] Fix use-after-return after
72142fbac4
Vitaly Buka [Sun, 11 Sep 2022 18:44:38 +0000 (11:44 -0700)]
Revert "[test][clangd] Fix use-after-return after
72142fbac4"
Will try another fix.
This reverts commit
c3c930d573656a825523b7112891bd97eec7b64f.
Aaron Puchert [Sun, 11 Sep 2022 18:44:51 +0000 (20:44 +0200)]
Make sure libLLVM users link with libatomic if needed
64-bit atomics are used in llvm/ADT/Statistic.h, which means that users
of libLLVM.so might also have to link with libatomic. To avoid having
to special-case the library here, we simply add all `LLVM_SYSTEM_LIBS`
as public link dependencies to libLLVM.
This fixes a build failure on PowerPC 32-bit.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D132799
Aaron Puchert [Sun, 11 Sep 2022 18:44:36 +0000 (20:44 +0200)]
[libcxxabi] Fix forced_unwind3.pass.cpp compilation error
Under some circumstances there is no struct _Unwind_Exception, it's just
an alias to another struct. This would result in an error like this:
libcxxabi/test/forced_unwind3.pass.cpp:50:77: error: typedef '_Unwind_Exception' cannot be referenced with a struct specifier
static _Unwind_Reason_Code stop(int, _Unwind_Action actions, type, struct _Unwind_Exception*, struct _Unwind_Context*,
^
<...>/lib/clang/15.0.0/include/unwind.h:68:38: note: declared here
typedef struct _Unwind_Control_Block _Unwind_Exception; /* Alias */
^
This seems to have been an issue since the test was first added in
D109856, except that it didn't surface with Clang 14 because the code
is filtered out by the preprocessor if `__clang_major__ < 15`.
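The underlying language rule can be reproduced in isolation. The sketch below uses hypothetical names (`ControlBlock`, `ExceptionAlias`, `readCode`) rather than the real unwinder types.

```cpp
#include <cassert>

struct ControlBlock {
  int Code;
};

// An alias only: there is no struct that is itself named ExceptionAlias.
typedef struct ControlBlock ExceptionAlias;

// Would not compile in C++, because a typedef name cannot be referenced with
// a struct specifier:
//   int readCode(struct ExceptionAlias *E);

// Dropping the `struct` keyword works whether the name is a struct or an alias.
inline int readCode(ExceptionAlias *E) { return E->Code; }
```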
Reviewed By: danielkiss, mstorsjo, #libc_abi, ldionne
Differential Revision: https://reviews.llvm.org/D132873
Aaron Puchert [Sun, 11 Sep 2022 18:43:55 +0000 (20:43 +0200)]
[docs] Use relative URLs for man pages
Should have no effect on the online documentation, but it makes offline
builds more self-contained. With relative links however we have to
abstain from using `:manpage:` outside of man page cross-references.
Reviewed By: mysterymath
Differential Revision: https://reviews.llvm.org/D132794
Vitaly Buka [Sun, 11 Sep 2022 17:15:13 +0000 (10:15 -0700)]
[test][clangd] Fix use-after-return after
72142fbac4
Vitaly Buka [Sun, 11 Sep 2022 17:06:55 +0000 (10:06 -0700)]
Revert "[test][clangd] Try to unbrake bots after
72142fbac4"
It does not help.
This reverts commit
355dbd3b2aa28d479170c4e43265de186317dd86.
Mark de Wever [Sat, 3 Sep 2022 11:37:09 +0000 (13:37 +0200)]
[libc++][random] Removes transitive includes.
It seems these includes are still provided by the sub headers, so it only
removes the duplicates.
There is no change in the list of includes, but the change affects the
modular build. By not having the includes in the top-level header the
module map has changed. This uncovers missing includes in the tests
and missing exports in the module map. This causes the huge amount of
changes in the patch.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D133252
Junduo Dong [Tue, 16 Aug 2022 12:45:09 +0000 (05:45 -0700)]
[Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework
The previous implementation of time tracing in NewPassManager is direct but messy.
The key code looks like the demo below:
```
/// Runs the function pass across every function in the module.
PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
LazyCallGraph &CG, CGSCCUpdateResult &UR) {
/// ...
PreservedAnalyses PassPA;
{
TimeTraceScope TimeScope(Pass.name());
PassPA = Pass.run(F, FAM);
}
/// ...
}
```
It is tedious to decide by hand where the tracing code should be added.
With the PassInstrumentation framework, we can easily add `Before/After` callback
functions that insert the time tracing code.
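The callback idea can be sketched without LLVM. The `Instrumentation` struct below is a hypothetical, minimal stand-in and not the PassInstrumentation API. It only shows how registering before/after callbacks once replaces a hand-written scope around every `run` call.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback registry; not the LLVM PassInstrumentation API.
struct Instrumentation {
  using Callback = std::function<void(const std::string &)>;
  std::vector<Callback> BeforeCallbacks;
  std::vector<Callback> AfterCallbacks;

  void runBefore(const std::string &PassName) const {
    for (const auto &CB : BeforeCallbacks)
      CB(PassName);
  }
  void runAfter(const std::string &PassName) const {
    for (const auto &CB : AfterCallbacks)
      CB(PassName);
  }
};

// Registers the tracing callbacks once; Events records what a tracer would see.
inline void registerTracing(Instrumentation &PI,
                            std::vector<std::string> &Events) {
  PI.BeforeCallbacks.push_back(
      [&Events](const std::string &Name) { Events.push_back("begin " + Name); });
  PI.AfterCallbacks.push_back(
      [&Events](const std::string &Name) { Events.push_back("end " + Name); });
}

// A pass manager only has to notify the instrumentation around each pass.
inline void runPass(const Instrumentation &PI, const std::string &Name,
                    const std::function<void()> &Body) {
  PI.runBefore(Name);
  Body();
  PI.runAfter(Name);
}
```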
Differential Revision: https://reviews.llvm.org/D131960
Florian Hahn [Sun, 11 Sep 2022 11:24:43 +0000 (12:24 +0100)]
[VPlan] Check recipe uses instead of type of underlying instr (NFC).
Suggested by @Ayal post-commit, to reduce the dependence on the
underlying instruction in favor of information available directly for
the recipe.
Marc Auberer [Sun, 11 Sep 2022 10:13:25 +0000 (06:13 -0400)]
[InstCombine] Fold x + (x | -x) to x & (x - 1)
Fixes #57531
This transformation may be particularly useful on x86-64,
because x & (x - 1) can be performed by a single blsr instruction.
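The identity can be checked exhaustively on a narrow type. The sketch below does so for all 16-bit values; the function names are illustrative and unrelated to the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>

// x + (x | -x), computed with 16-bit wrap-around.
inline uint16_t original(uint16_t X) {
  uint16_t Neg = static_cast<uint16_t>(0u - X);
  return static_cast<uint16_t>(X + (X | Neg));
}

// x & (x - 1), which clears the lowest set bit.
inline uint16_t folded(uint16_t X) {
  return static_cast<uint16_t>(X & (X - 1u));
}

// Compares both forms for every 16-bit input.
inline bool foldHolds() {
  for (uint32_t V = 0; V <= 0xFFFFu; ++V) {
    uint16_t X = static_cast<uint16_t>(V);
    if (original(X) != folded(X))
      return false;
  }
  return true;
}
```

The reason it holds: `x | -x` sets every bit from the lowest set bit of `x` upward, which equals the negated lowest set bit, so adding it to `x` subtracts that bit.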
Differential Revision: https://reviews.llvm.org/D133362
Sanjay Patel [Sat, 10 Sep 2022 15:38:32 +0000 (11:38 -0400)]
[InstCombine] add tests for mul-by-neg-pow2; NFC
Sanjay Patel [Fri, 9 Sep 2022 20:19:03 +0000 (16:19 -0400)]
[InstCombine] add tests for demanded bits of add with multi-use; NFC
Mark de Wever [Sun, 11 Sep 2022 10:12:22 +0000 (12:12 +0200)]
[libc++][doc] Updates format status page.
Rainer Orth [Sun, 11 Sep 2022 09:25:53 +0000 (11:25 +0200)]
[compiler-rt] Handle non-canonical triples with new runtime lib layout
As described in Issue #54196
<https://github.com/llvm/llvm-project/issues/54196>, the ideas of `clang`
and `compiler-rt` about where runtime libs are located with
`-DLLVM_ENABLE_RUNTIMES` can differ. This is the `compiler-rt` side of the
patch I've used to get them in sync for the `amd64-pc-solaris2.11` and
`sparc64-unknown-linux-gnu` release builds.
Tested on `amd64-pc-solaris2.11` and `sparc64-unknown-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D133406
Alexey Bader [Wed, 3 Aug 2022 13:06:50 +0000 (06:06 -0700)]
[StripDeadDebugInfo] Drop dead CUs
In situations when a submodule is extracted from a big module (i.e. using
CloneModule), a lot of debug info is copied via metadata nodes. Despite
the fact that part of that info is not linked to any instruction in the extracted
IR file, the StripDeadDebugInfo pass doesn't drop it.
Strengthen the criteria for debug info that should be kept in a module:
- Only those compile units are kept that are referenced by a subprogram debug info
node that is attached to a function definition in the module or to an instruction
in the module that belongs to an inlined function.
Signed-off-by: Mikhail Lychkov <mikhail.lychkov@intel.com>
Differential Revision: https://reviews.llvm.org/D122163
Vitaly Buka [Sun, 11 Sep 2022 08:19:22 +0000 (01:19 -0700)]
[test][clangd] Try to unbrake bots after
72142fbac4
Tom Honermann [Sat, 10 Sep 2022 14:16:20 +0000 (10:16 -0400)]
[libc++][cuchar] Declare std::c8rtomb and std::mbrtoc8 in <cuchar> if available.
This change implements the C library dependent portions of P0482R6
(char8_t: A type for UTF-8 characters and strings (Revision 6)) by
declaring std::c8rtomb() and std::mbrtoc8() in the <cuchar> header
when implementations are provided by the C library as specified by
WG14 N2653 (char8_t: A type for UTF-8 characters and strings
(Revision 1)) as adopted for C23.
A _LIBCPP_HAS_NO_C8RTOMB_MBRTOC8 macro is defined by the libc++ __config
header unless it is known that the C library provides these functions
in the current compilation mode. This macro is used for testing purposes
and may be of use to libc++ users. At present, the only C library known
to implement these functions is GNU libc as of its 2.36 release.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D130946
Groverkss [Sun, 11 Sep 2022 00:02:52 +0000 (01:02 +0100)]
[MLIR][Presburger] Refactor MultiAffineFunction to be defined over universe
This patch refactors MAF to be defined over the universe in a given space
instead of being defined over a restricted domain.
The reasoning for this refactor is to store division representation for local
variables explicitly for the function outputs. This change is required for
unionLexMax/Min to support local variables which will be upstreamed after this
patch. Another reason for this refactor is to have a flattened form of
AffineMap as MultiAffineFunction.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D131864
Vitaly Buka [Sat, 10 Sep 2022 23:00:17 +0000 (16:00 -0700)]
[test][msan] Add tests for @llvm.masked.*
Aiden Grossman [Wed, 7 Sep 2022 20:36:09 +0000 (20:36 +0000)]
[MLGO] Make TFLiteUtils throw an error if some features haven't been passed to the model
In the Tensorflow C lib utilities, an error gets thrown if some features
haven't gotten passed into the model (due to differences in ordering
which now don't exist with the transition to TFLite). However, this is
not currently the case when using TFLiteUtils. This patch makes some
minor changes to throw an error when not all inputs of the model have
been passed, which when not handled will result in a seg fault within
TFLite.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D133451
Vitaly Buka [Sat, 10 Sep 2022 22:19:00 +0000 (15:19 -0700)]
[test][msan] Remove -DAG after fixing indeterminism
Vitaly Buka [Sat, 10 Sep 2022 21:15:23 +0000 (14:15 -0700)]
[msan] Don't depend on arguments evaluation order
Vitaly Buka [Sat, 10 Sep 2022 20:19:17 +0000 (13:19 -0700)]
[test][msan] Autogenerate the test
Vitaly Buka [Sat, 10 Sep 2022 20:47:10 +0000 (13:47 -0700)]
[msan] Do not depend on arguments evaluation order
Clang and GCC do this differently making IR inconsistent.
https://lab.llvm.org/buildbot#builders/6/builds/13120
Vitaly Buka [Sat, 10 Sep 2022 20:31:56 +0000 (13:31 -0700)]
Revert "[test][msan] Convert test into autogenerated"
Fails https://lab.llvm.org/buildbot#builders/6/builds/13120
This reverts commit
affc90ed8d30badf585a93d1b6997e400099075c.
Vitaly Buka [Sat, 10 Sep 2022 20:19:17 +0000 (13:19 -0700)]
[test][msan] Convert test into autogenerated
Florian Hahn [Sat, 10 Sep 2022 19:55:03 +0000 (20:55 +0100)]
[SLP] Add test case showing missing CSE in hoisted instructions.
Vitaly Buka [Sat, 10 Sep 2022 19:20:54 +0000 (12:20 -0700)]
[NFC][msan] Remove unused return type
Vitaly Buka [Sat, 10 Sep 2022 19:09:51 +0000 (12:09 -0700)]
[msan] Relax handling of llvm.masked.expandload and llvm.masked.gather
This is a workaround for new false positives. A real implementation will
follow.
Simon Pilgrim [Sat, 10 Sep 2022 17:18:36 +0000 (18:18 +0100)]
[CostModel][X86] Merge AVX512BW vXi8/vXi16 shifts into default AVX512BW cost table
We only need to handle the uniform cases early
Corentin Jabot [Sat, 10 Sep 2022 17:07:56 +0000 (19:07 +0200)]
[Clang] NFC: Remove duplicated variable def in CheckLValueConstantExpression
Simon Pilgrim [Sat, 10 Sep 2022 16:57:20 +0000 (17:57 +0100)]
[CostModel][X86] Update CTPOP costs
With the bdver2 model updates, many of the AVX1 costs were far too high - it also helped expose some cost mismatches for Atom/Silvermont
Simon Pilgrim [Sat, 10 Sep 2022 16:34:40 +0000 (17:34 +0100)]
[X86] Fix bdver2 128-bit shuffles throughputs
Noticed while trying to get vector ctpop/ctlz/cttz costs fixed using the script from D103695 - all of these are full-rate but the throughput costs were weirdly high for bdver2
Matches AMD 15h SoG, Agner and instlatx64
Simon Pilgrim [Sat, 10 Sep 2022 15:21:50 +0000 (16:21 +0100)]
[X86] Fix bdver2 128-bit ALU/logic/shift throughputs
Noticed while trying to get vector shift costs fixed using the script from D103695 - all of these are full-rate but the throughput costs were weirdly high for bdver2
Matches AMD 15h SoG, Agner and instlatx64
Manuel Brito [Sat, 10 Sep 2022 13:22:38 +0000 (14:22 +0100)]
Use PoisonValue instead of UndefValue when RAUWing unreachable code [NFC]
Replacing the following instances of UndefValue with PoisonValue, where the UndefValue is used as an arbitrary value:
- llvm/lib/CodeGen/WinEHPrepare.cpp
`demotePHIsOnFunclets`: RAUW arbitrary value for lingering uses of removed PHI nodes
- llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
`FoldSingleEntryPHINodes`: Removes a self-referential single entry phi node.
- llvm/lib/Transforms/Utils/CallGraphUpdater.cpp
`finalize`: Remove all references to removed functions.
- llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
`cleanup`: the result is not used, so the inserted instructions are removed.
- llvm/tools/bugpoint/CrashDebugger.cpp
`TestInts`: the program is cloned and instructions are removed to narrow down source of crash.
Differential Revision: https://reviews.llvm.org/D133640
Yi Kong [Fri, 9 Sep 2022 19:42:15 +0000 (03:42 +0800)]
[Object] Improve ArchiveWriter diagnostics
Print out the archive member that failed, to make debugging easier.
Before:
error: failed to build archive: Not an int attribute (Producer: 'LLVM15.0.1git' Reader: 'LLVM 14.0.5-rust-dev')
After:
error: failed to build archive: 'fake_bt_keystore.o': Not an int attribute (Producer: 'LLVM15.0.1git' Reader: 'LLVM 14.0.5-rust-dev')
Differential Revision: https://reviews.llvm.org/D133607