Ben Shi [Thu, 5 May 2022 02:18:09 +0000 (02:18 +0000)]
[Disassembler][AVR] Remove unused static functions
The unused static functions cause failures on some build machines.
Craig Topper [Thu, 5 May 2022 01:51:25 +0000 (18:51 -0700)]
[X86] Call initializeX86PreTileConfigPass from LLVMInitializeX86Target.
Without this, the pass doesn't show up in print-before/after-all.
Differential Revision: https://reviews.llvm.org/D124973
Craig Topper [Thu, 5 May 2022 01:29:15 +0000 (18:29 -0700)]
[SelectionDAG] Use llvm::any_of to simplify a loop. NFC
Ben Shi [Mon, 11 Apr 2022 01:44:49 +0000 (01:44 +0000)]
[MC][AVR] Implement decoding ST/LD
Reviewed By: aykevl, dylanmckay
Differential Revision: https://reviews.llvm.org/D123476
Ben Shi [Sat, 9 Apr 2022 01:45:22 +0000 (01:45 +0000)]
[MC][AVR] Implement decoding STD/LDD
Reviewed By: aykevl, dylanmckay
Differential Revision: https://reviews.llvm.org/D123442
Alexander Shaposhnikov [Thu, 5 May 2022 00:50:33 +0000 (00:50 +0000)]
[InstCombine] Fold ((A&B)^C)|B
Fold ((A&B)^C)|B into C|B.
https://alive2.llvm.org/ce/z/zSGSor
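As a quick sanity check alongside the Alive2 proof, the identity can be
brute-forced in Python. This is only an illustrative sketch: the 4-bit width
and the helper names `unfolded`/`folded` are assumptions made here, not part
of the patch.

```python
from itertools import product

# Assumed width for the exhaustive check; the identity is purely bitwise,
# so every bit position behaves independently of the others.
WIDTH = 4
VALUES = range(1 << WIDTH)

def unfolded(a, b, c):
    """The original pattern: ((A & B) ^ C) | B."""
    return ((a & b) ^ c) | b

def folded(b, c):
    """The simplified pattern: C | B."""
    return c | b

# Collect every (a, b, c) triple where the two forms disagree.
counterexamples = [
    (a, b, c)
    for a, b, c in product(VALUES, repeat=3)
    if unfolded(a, b, c) != folded(b, c)
]
print(len(counterexamples))  # prints 0
```

The reasoning behind the result: where a bit of B is 1 the final `| B` forces
the result bit to 1 in both forms, and where it is 0 the `A & B` term vanishes
and only C remains.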
This addresses the issue https://github.com/llvm/llvm-project/issues/55169
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D124710
Ayke van Laethem [Wed, 4 May 2022 22:46:27 +0000 (00:46 +0200)]
[compiler-rt][AVR] Fix avr_SOURCES CMake variable
D123200 did not include the generic sources, which means that only the
AVR-specific sources were compiled. With this change, generic sources
are included as expected.
Tested with the following commands:
cmake -G Ninja -DCOMPILER_RT_DEFAULT_TARGET_TRIPLE=avr -DCOMPILER_RT_BAREMETAL_BUILD=1 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_C_FLAGS="--target=avr -mmcu=avr5 -nostdlibinc -mdouble=64" ../path/to/builtins
ninja
Differential Revision: https://reviews.llvm.org/D124969
Craig Topper [Thu, 5 May 2022 00:19:43 +0000 (17:19 -0700)]
[RISCV] Use movImm when multiplying by simm12 in getVLENFactoredAmount.
No reason to special case simm12, movImm handles all immediates.
This also fixes a bug where we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.
Alexander Shaposhnikov [Thu, 5 May 2022 00:07:49 +0000 (00:07 +0000)]
[InstCombine][NFC] Update comment in and-xor-or.ll
Alexander Shaposhnikov [Thu, 5 May 2022 00:04:33 +0000 (00:04 +0000)]
[InstCombine][NFC] Add baseline tests for folds of ((A&B)^C)|B
Differential revision: https://reviews.llvm.org/D124709
Test plan: make check-all
Nico Weber [Fri, 22 Apr 2022 15:55:50 +0000 (11:55 -0400)]
[lld/mac] Support writing zippered dylibs and bundles
With -platform_version flags for two distinct platforms,
this writes a LC_BUILD_VERSION header for each.
The motivation is that this is needed for self-hosting with lld as linker
after D124059.
To create a zippered output at the clang driver level, pass
-target arm64-apple-macos -darwin-target-variant arm64-apple-ios-macabi
to create a zippered dylib.
(In Xcode's clang, `-darwin-target-variant` is spelled just `-target-variant`.)
(If you pass `-target arm64-apple-ios-macabi -target-variant arm64-apple-macos`
instead, ld64 crashes!)
This results in two -platform_version flags being passed to the linker.
ld64 also verifies that the iOS SDK version is at least 13.1. We don't do that
yet. But ld64 also does that for other platforms and we don't. So we need to
do that at some point, but not in this patch.
Only dylib and bundle outputs can be zippered.
I verified that a Catalyst app linked against a dylib created with
clang -shared foo.cc -o libfoo.dylib \
-target arm64-apple-macos \
-target-variant arm64-apple-ios-macabi \
-Wl,-install_name,@rpath/libfoo.dylib \
-fuse-ld=$PWD/out/gn/bin/ld64.lld
runs successfully. (The app calls a function `f()` in libfoo.dylib
that returns a const char* "foo", and NSLog(@"%s")s it.)
ld64 is a bit more permissive when writing zippered outputs,
see references to "unzippered twins". That's not implemented yet.
(If anybody wants to implement that, D124275 is a good start.)
Differential Revision: https://reviews.llvm.org/D124887
Nico Weber [Wed, 4 May 2022 13:08:58 +0000 (09:08 -0400)]
[llvm-otool] Make `llvm-otool -l` output compatible with otool for LC_BUILD_VERSION
Namely, only "symbolize" platform and tool names if `-v` is passed.
(`llvm-otool -lv` output still isn't quite the same as `otool -lv` output, but
`-v` output is arguably for consumption by humans, so I'm not changing that
at this point. Someone else could change it if it was important to them.)
Differential Revision: https://reviews.llvm.org/D124920
Craig Topper [Wed, 4 May 2022 23:13:06 +0000 (16:13 -0700)]
[PowerPC] Re-run update_mir_test_checks.py on nofpexcept.ll. NFC
This test was previously generated by the script, but the script
now uses CHECK-NEXT instead of CHECK.
This is preparation for a strictfp related patch I'm working on.
H.J. Lu [Wed, 4 May 2022 21:53:05 +0000 (14:53 -0700)]
[sanitizer] Use newfstatat for x32
Since newfstatat is supported on x32, use it for x32.
Differential Revision: https://reviews.llvm.org/D124968
Jason Molenda [Wed, 4 May 2022 22:27:09 +0000 (15:27 -0700)]
Remove expected fail for TestStepNoDebug on AArch64
My fix in https://reviews.llvm.org/D124492 should fix
this - I got an "unexpected pass" failure from an
AArch64 Ubuntu bot when I landed my fix.
Junfeng Dong [Wed, 4 May 2022 22:21:30 +0000 (15:21 -0700)]
[DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section.
llvm-profgen reports an error when the input binary contains a premature terminator in the .debug_aranges section. These zero-length entries point to rodata of a zero-sized type in an embedded Rust library. Zero-sized types are a valid feature in Rust, so these entries are not real errors. This change turns the "error:" message into a warning to avoid misleading users.
Why do we still want a warning in such a case? Because the entry doesn't follow the DWARF standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains earlier discussion.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D124121
Jez Ng [Wed, 4 May 2022 22:01:34 +0000 (18:01 -0400)]
[lld-macho][nfc] Set test min version to 11.0
The arm64-apple-macos triple is only valid for versions >= 11.0. (If
one passes arm64-apple-macos10.15 to llvm-mc, the output's min version is still
11.0). In order to write tests easily for both target archs, let's up the
default min version in our tests.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D124562
Jason Molenda [Wed, 4 May 2022 21:43:42 +0000 (14:43 -0700)]
Update the CFA to use $sp when $fp is restored on arm64
In UnwindAssemblyInstEmulation we correctly recognize when a LDP
restores the fp & lr in an epilogue, and mark them as having the
caller's contents now, but we don't update the CFA register rule
at that point to indicate that the CFA is now calculated in terms
of $sp. This doesn't impact the backtrace because the register
contents are all <same> now, but it can confuse the stepper when
the StackID changes mid-epilogue.
Differential Revision: https://reviews.llvm.org/D124492
rdar://92064415
Zixu Wang [Wed, 4 May 2022 19:29:45 +0000 (12:29 -0700)]
Revert "Revert "[clang][extract-api] Use relative includes""
Reapply the change after fixing sanitizer errors.
The original problem was that `StringRef`s in `Matches` are pointing to
temporary local `std::string`s created by `path::convert_to_slash` in
the regex match call. This patch does the conversion up front in
container `FilePath`.
This reverts commit 2966f0fa505266735dbc8324b8821b7f0aa901ff.
Differential Revision: https://reviews.llvm.org/D124964
Philip Reames [Tue, 3 May 2022 21:00:51 +0000 (14:00 -0700)]
[RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]
This is to simplify the final version of D124869.
Stanislav Mekhanoshin [Wed, 27 Apr 2022 19:10:16 +0000 (12:10 -0700)]
[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards
There shall be 1 wait state between M0 write and LDS DMA/LDS_DIRECT use.
Differential Revision: https://reviews.llvm.org/D124550
Jon Chesterfield [Wed, 4 May 2022 21:42:05 +0000 (22:42 +0100)]
[amdgpu] Elide module lds allocation in kernels with no callees
Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.
Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.
Tests updated to match; coverage was already good. The interesting case is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122091
Craig Topper [Wed, 4 May 2022 21:26:44 +0000 (14:26 -0700)]
[RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl*b value.
riscv-v-vector-bits-min is primarily used to opt-in to the
autovectorizer. The vector width can be determined from Zvl*b.
This patch adds support treating -1 as meaning use Zvl*b so we can
still opt-in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124960
Congzhe Cao [Wed, 4 May 2022 21:09:13 +0000 (17:09 -0400)]
[LoopCacheAnalysis][NFC] Add a test case for improved loop cache analysis cost calculation
Added a motivating test case for D123400 where the loopnest has a
suboptimal loop order j-i-k. After D123400 we ensure that the order
of loop cache analysis output is loop i-j-k, despite the suboptimal
order in the original loopnest.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D122776
David Green [Wed, 4 May 2022 21:12:09 +0000 (22:12 +0100)]
[ARM] Delay creation of MVE Imm shifts to legalization
The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine
- because matching i64 constants is difficult - does not apply for MVE,
as there are no v2i64 shifts. Delaying the creation of the nodes can
allow extra transforms on target-independent shl/shr.
Amir Ayupov [Wed, 4 May 2022 21:07:42 +0000 (14:07 -0700)]
[BOLT][NFC] Move getInliningInfo out of Inliner class
`getInliningInfo` is useful in other passes that need to check inlining
eligibility for some function. Move the declaration and InliningInfo definition
out of Inliner class. Prepare for subsequent use in ICP.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124899
Amir Ayupov [Wed, 4 May 2022 21:03:24 +0000 (14:03 -0700)]
[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite
Minor refactoring. NFC.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124898
Yitzhak Mandelbaum [Wed, 4 May 2022 18:47:08 +0000 (18:47 +0000)]
[clang-tidy] Escape diagnostic messages before passing to `diag` in Transformer.
Messages generated by Transformer rules may have `%` in them, which
needs to be escaped before being passed to `diag`, which interprets them
specially (and crashes if they are misused).
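A minimal sketch of the escaping idea, assuming the usual Clang diagnostic
convention that `%%` renders a literal `%`. The helper name `escape_for_diag`
is hypothetical and is not the function used in the patch.

```python
def escape_for_diag(message: str) -> str:
    """Double every '%' so a format-string consumer treats it as a literal."""
    return message.replace("%", "%%")

print(escape_for_diag("use 100% of the buffer"))  # prints: use 100%% of the buffer
```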
Differential Revision: https://reviews.llvm.org/D124952
Ayke van Laethem [Wed, 4 May 2022 16:37:28 +0000 (18:37 +0200)]
[compiler-rt][AVR] Use correct return value for __ledf2 etc
Previously the default was long, which is 32-bit on AVR. But avr-gcc
expects a smaller value: it reads the return value from r24.
This is actually a regression from https://reviews.llvm.org/D98205.
Before D98205, the return value was an enum (which was 2 bytes in size)
which was compatible with the 1-byte return value that avr-gcc was
expecting. But long is 4 bytes and thus places the significant return
value in a different register.
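The register mismatch can be illustrated with a simplified model of the
avr-gcc return convention (values are returned in registers ending at r25,
least significant byte first, with the size rounded up to an even number of
bytes). The function names below are made up for this sketch, and the model
only covers small integer returns.

```python
def return_registers(size_bytes):
    """Registers holding an integer return value, least significant byte first.

    Simplified model: the size is rounded up to an even byte count and the
    value is placed so that it ends at r25.
    """
    rounded = size_bytes + (size_bytes % 2)
    first = 26 - rounded
    return [f"r{first + i}" for i in range(size_bytes)]

def byte_seen_in_r24(value, size_bytes):
    """Byte observed by a caller that only reads r24."""
    raw = value.to_bytes(size_bytes, "little", signed=True)
    return raw[return_registers(size_bytes).index("r24")]

print(return_registers(2))     # ['r24', 'r25']
print(return_registers(4))     # ['r22', 'r23', 'r24', 'r25']
print(byte_seen_in_r24(1, 2))  # 1: low byte of a 2-byte result lands in r24
print(byte_seen_in_r24(1, 4))  # 0: low byte of a 4-byte result lands in r22
```

Under this model a comparison result of 1 returned as a 4-byte long looks
like 0 to a caller that only inspects r24, which matches the regression
described above.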
Differential Revision: https://reviews.llvm.org/D124939
Aaron Ballman [Wed, 4 May 2022 20:45:42 +0000 (16:45 -0400)]
Fix a crash on targets where __bf16 isn't supported
We'd nondeterministically assert (and later crash) when calculating the size or
alignment of a __bf16 type when the type isn't supported on a target because of
reading uninitialized values. Now we check whether the type is supported first.
Fixes #50171
Min-Yih Hsu [Wed, 20 Apr 2022 17:13:59 +0000 (10:13 -0700)]
[mlir][LLVMIR] Do not update instMap via assignments to entry references
Inside processInstruction, we assign the translated mlir::Value to a
reference previously taken from the corresponding entry in instMap.
However, instMap (a DenseMap) might resize after the entry reference was
taken, rendering the assignment useless since it's assigning to a
dangling reference. Here is a (pseudo) snippet that shows the concept:
```
// inst has type llvm::Instruction *
Value &v = instMap[inst];
...
// op is one of the operands of inst, has type llvm::Value *
processValue(op);
// instMap resizes inside processValue
...
translatedValue = b.createOp<Foo>(...);
// v is already a dangling reference at this point!
// The following assignment is bogus.
v = translatedValue;
```
Nevertheless, after we stop caching llvm::Constant into instMap, there
is only one case that can cause processValue to resize instMap: when the
operand is an llvm::ConstantExpr, in which case we insert the
derived llvm::Instruction into instMap.
Whether instMap (a DenseMap) resizes depends on the ratio between the
# of map entries and the # of (hash) buckets. More specifically,
it resizes if (# of map entries / # of buckets) >= 0.75.
In this case # of map entries is equal to # of LLVM instructions, and # of
buckets is the power-of-two upper bound of # of map entries. Thus,
in the attached test case (test/Target/LLVMIR/Import/incorrect-instmap-assignment.ll),
we picked 96 and 128 for the # of map entries and # of buckets, respectively.
(We can't pick numbers that are too small since DenseMap uses inline
storage for a small number of entries). Therefore, the ConstantExpr in the
said test case (i.e. a GEP) is the 96-th llvm::Value cached into the
instMap, triggering the issue we're discussing here on its enclosing
instruction (i.e. a load).
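The numbers above can be reproduced with a small model of the growth rule as
described in this message (grow once entries / buckets >= 0.75). This is a
simplified sketch rather than DenseMap's actual code, and the function names
are made up for illustration.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

def triggers_resize(num_entries, num_buckets):
    """True once entries / buckets reaches 0.75 (integer form avoids floats)."""
    return num_entries * 4 >= num_buckets * 3

buckets = next_power_of_two(96)
# Smallest entry count at which a table with that many buckets would grow.
first_trigger = min(n for n in range(1, buckets + 1) if triggers_resize(n, buckets))
print(buckets, first_trigger)  # prints: 128 96
```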
This patch fixes this issue by calling `operator[]` every time we need to
update an entry.
Differential Revision: https://reviews.llvm.org/D124627
Teresa Johnson [Wed, 4 May 2022 18:52:47 +0000 (11:52 -0700)]
[memprof] Use unknown_function error type for missing functions
Switch the error type when a function is not found in the memprof
profile to unknown_function. This gives compatibility with normal PGO
function matching, and also prevents issuing large numbers of additional
matching errors since pgo-warn-missing-function is off by default.
Differential Revision: https://reviews.llvm.org/D124953
Martin Storsjö [Wed, 4 May 2022 09:53:58 +0000 (12:53 +0300)]
[libunwind] Silence warnings about unused variables. NFC.
This variable was considered unused when NDEBUG was defined.
Differential Revision: https://reviews.llvm.org/D124911
Martin Storsjö [Wed, 4 May 2022 09:52:20 +0000 (12:52 +0300)]
[libunwind] [CMake] Handle the RelWithDebInfo configuration similarly to Release
This makes sure to include libunwind log messages in the build if
LIBUNWIND_ENABLE_ASSERTIONS is set (which it is by default), when
building in RelWithDebInfo configurations.
Differential Revision: https://reviews.llvm.org/D124912
Amir Ayupov [Wed, 4 May 2022 18:42:14 +0000 (11:42 -0700)]
[BOLT][NFC] Fix MCPlusBuilder::getAliases caching behavior
Caching behavior of `getAliases` causes a failure in unit tests where two
MCPlusBuilder objects are created corresponding to AArch64 and X86:
the alias cache is created for AArch64 but then used for X86.
https://lab.llvm.org/staging/#/builders/211/builds/126
The issue only affects unit tests as we only construct one MCPlusBuilder
for ELF binary.
Resolve the issue by moving alias bitvectors to MCPlusBuilder object.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D124942
Zixu Wang [Wed, 4 May 2022 19:26:18 +0000 (12:26 -0700)]
Revert "[clang][extract-api] Use relative includes"
This reverts commit 4c262fee08b5383c96857d77eefe80d61c41d2b0.
Revert to fix Msan and Asan errors.
owenca [Tue, 3 May 2022 19:04:50 +0000 (12:04 -0700)]
[clang-format] Fix a bug in AlignConsecutiveAssignments
Fixes #55113.
Differential Revision: https://reviews.llvm.org/D124868
LLVM GN Syncbot [Wed, 4 May 2022 18:28:43 +0000 (18:28 +0000)]
[gn build] Port 80045e9afa2f
Nikolas Klauser [Wed, 4 May 2022 18:27:07 +0000 (20:27 +0200)]
[libc++] Implement ranges::for_each{, _n}
Reviewed By: var-const, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D124332
Florian Mayer [Wed, 4 May 2022 18:21:23 +0000 (11:21 -0700)]
[HWASan] cleanup imports in hwasan_symbolize.
Louis Dionne [Mon, 25 Apr 2022 16:49:47 +0000 (10:49 -0600)]
[libc++] Refactor max_size.pass.cpp
Reorganize the test and simplify the #ifdefs. Fix a typo in __powerpc64__
as a fly-by, and also add a test for the unstable ABI.
Differential Revision: https://reviews.llvm.org/D124403
Luboš Luňák [Tue, 5 Apr 2022 13:40:21 +0000 (15:40 +0200)]
[lldb] parallelize calling of Module::PreloadSymbols()
If LLDB index cache is enabled and everything is cached, then loading of debug
info is essentially single-threaded, because it's done from PreloadSymbols()
called from GetOrCreateModule(), which is called from a loop calling
LoadModuleAtAddress() in DynamicLoaderPOSIXDYLD. Parallelizing the entire
loop could be unsafe because of GetOrCreateModule() operating on a module
list, so instead move only the PreloadSymbols() call to Target::ModulesDidLoad()
and parallelize there, which should be safe.
This may greatly reduce the load time if the debugged program uses a large
number of binaries (as opposed to monolithic programs where this presumably
doesn't make a difference). In my specific case of LibreOffice Calc this reduces
startup time from 6s to 2s.
Differential Revision: https://reviews.llvm.org/D122975
Zixu Wang [Wed, 4 May 2022 17:40:25 +0000 (10:40 -0700)]
[NFC] Remove unfinished test case
4c262fee08b5383c96857d77eefe80d61c41d2b0 accidentally added a local,
unfinished test case, clang/test/Index/annotate-comments-enum-constant.c.
This patch removes it.
Zixu Wang [Fri, 15 Apr 2022 02:04:30 +0000 (19:04 -0700)]
[clang][extract-api] Use relative includes
This patch transforms the given input headers to relative include names
using header search entries and some heuristics.
For example: `/Path/To/Header.h` will be included as `<Header.h>` with a
search path of `-I /Path/To/`; and
`/Path/To/Framework.framework/Headers/Header.h` will be included as
`<Framework/Header.h>`, given a search path of `-F /Path/To`.
Headermaps will also be queried in reverse to find a spelled name to
include headers.
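The two examples above can be sketched as a small path-mapping function. This
is an illustration only: the name `relative_include` and its parameters are
hypothetical, headermaps are not modeled, and the real implementation works on
clang's header search entries rather than plain strings.

```python
from pathlib import PurePosixPath

def relative_include(header, include_dirs=(), framework_dirs=()):
    """Return a spelled include name for `header`, or None if nothing matches."""
    path = PurePosixPath(header)

    # -I style entries: the include name is the path relative to the directory.
    for directory in include_dirs:
        try:
            return f"<{path.relative_to(directory).as_posix()}>"
        except ValueError:
            continue

    # -F style entries: Foo.framework/Headers/Bar.h becomes <Foo/Bar.h>.
    suffix = ".framework"
    for directory in framework_dirs:
        try:
            parts = path.relative_to(directory).parts
        except ValueError:
            continue
        if len(parts) >= 3 and parts[0].endswith(suffix) and parts[1] == "Headers":
            name = parts[0][: -len(suffix)]
            return "<" + "/".join((name,) + parts[2:]) + ">"

    return None

print(relative_include("/Path/To/Header.h", include_dirs=["/Path/To/"]))
print(relative_include("/Path/To/Framework.framework/Headers/Header.h",
                       framework_dirs=["/Path/To"]))
```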
Differential Revision: https://reviews.llvm.org/D123831
Aaron Ballman [Wed, 4 May 2022 17:22:30 +0000 (13:22 -0400)]
Fix a failing assertion with vector type initialization
When constant evaluating the initializer for an object of vector type,
we would call APInt::trunc() but truncate to the same bit-width the
object already had, which would cause an assertion. Instead, use
APInt::truncOrSelf() so that we no longer assert in this situation.
Fix #50216
Sanjay Patel [Wed, 4 May 2022 16:57:34 +0000 (12:57 -0400)]
[InstCombine] add type constraint to intrinsic+shuffle fold
This check is in the related fold for binops,
but it was missed when the code was adapted
for intrinsics in 432c199e8473. The new test
would crash when trying to create a new
intrinsic with mismatched types.
Sanjay Patel [Wed, 4 May 2022 16:44:47 +0000 (12:44 -0400)]
[InstCombine] move shuffle after funnel shift with same-shuffled operands
This extends 432c199e8473 and 9c4770eaab9d9 with an intrinsic
cited directly in issue #46238.
Eventually, we will want to use llvm::isTriviallyVectorizable()
or create some new API for this list, but for now, I am intentionally
making a minimum change to reduce risk and only affect an intrinsic
with regression tests in place.
Sanjay Patel [Wed, 4 May 2022 16:39:34 +0000 (12:39 -0400)]
[InstCombine] add tests for funnel-shift with shuffled operands; NFC
Yaxun (Sam) Liu [Tue, 3 May 2022 11:48:37 +0000 (07:48 -0400)]
[NFC][CUDA][HIP] rework mangling number for aux target
CUDA/HIP needs to mangle for aux target. When mangling for aux target,
the mangler should use mangling number for aux target. Previously
in https://reviews.llvm.org/D122734 a state was introduced in
ASTContext to let the mangler get mangling number for aux target
from ASTContext. This patch removes that state from ASTContext
and adds an IsAux member to MangleContext to indicate that
the mangle context is for aux target. This reflects the reality that
the mangle context is created for mangling aux target and makes
ASTContext cleaner.
Reviewed by: Artem Belevich, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D124842
Bixia Zheng [Wed, 4 May 2022 14:36:03 +0000 (07:36 -0700)]
[mlir][sparse][taco] Support more data types.
Support int8, int16, int32 and int64. Also fix source code format in mlir_pytaco_utils.py.
Add tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D124925
Cyndy Ishida [Wed, 4 May 2022 14:38:20 +0000 (07:38 -0700)]
[clang] Track how headers get included generally during lookup time
tapi & clang-extractapi both attempt to construct then check against
how a header was included to determine api information when working
against multiple search paths, headermap, and vfsoverlay mechanisms.
Validating this against what the preprocessor sees during lookup time
makes this check more reliable.
Reviewed By: zixuw, jansvoboda11
Differential Revision: https://reviews.llvm.org/D124638
Aaron Ballman [Wed, 4 May 2022 16:39:18 +0000 (12:39 -0400)]
Fix a crash on invalid with _Generic expressions
We were failing to check if the controlling expression is dependent or
not when testing whether it has side effects. This would trigger an
assertion. Instead, if the controlling expression is dependent, we
suppress the check and diagnostic.
This fixes Issue 50227.
Florian Hahn [Wed, 4 May 2022 16:19:02 +0000 (17:19 +0100)]
[VPlan] Add test for printing plan with an exit value.
Test for printing plan with additions from D123537.
Sanjay Patel [Wed, 4 May 2022 16:01:53 +0000 (12:01 -0400)]
[InstCombine] propagate FMF when reordering intrinsics and shuffles
This was missed when extending the fold to allow fma with 9c4770eaab9d95c.
Sanjay Patel [Wed, 4 May 2022 15:58:01 +0000 (11:58 -0400)]
[InstCombine] add FMF to tests for better coverage; NFC
The fold added with 9c4770eaab9d95c neglected to propagate FMF.
Ilya Biryukov [Wed, 4 May 2022 15:31:59 +0000 (15:31 +0000)]
[Sema] Simplify CheckConstraintSatisfaction. NFC
- Exit early when constraint caching is disabled.
- Use unique_ptr to manage temporary lifetime.
- Fix a typo in a comment (InsertPos instead of InsertNode).
The new code duplicates the forwarding call to CheckConstraintSatisfaction,
but reduces the number of interconnected if statements and simplifies lifetime
management.
This increases the overall readability.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D124923
Sanjay Patel [Wed, 4 May 2022 15:17:25 +0000 (11:17 -0400)]
[InstCombine] move shuffle after fma with same-shuffled operands
https://alive2.llvm.org/ce/z/sD-JVv
This extends 432c199e8473 with a 3 arg intrinsic to demonstrate
that the code works with the extra operand.
Eventually, we will want to use llvm::isTriviallyVectorizable()
or create some new API for this list, but for now, I am intentionally
making a minimum change to reduce risk and only affect an intrinsic
with regression tests in place.
Sanjay Patel [Wed, 4 May 2022 15:11:47 +0000 (11:11 -0400)]
[InstCombine] add tests for fma with shuffled operands; NFC
Alexander Belyaev [Wed, 4 May 2022 15:46:17 +0000 (17:46 +0200)]
[mlir] Add a flag to allow equivalent results.
Differential Revision: https://reviews.llvm.org/D124931
Eric Li [Mon, 2 May 2022 21:36:04 +0000 (21:36 +0000)]
[clang][dataflow] Only skip ExprWithCleanups when visiting terminators
`IgnoreParenImpCasts` will remove implicit casts to bool
(e.g. `PointerToBoolean`), such that the resulting expression may not
be of the `bool` type. The `cast_or_null<BoolValue>` in
`extendFlowCondition` will then trigger an assert, as the pointer
expression will not have a `BoolValue`.
Instead, we only skip `ExprWithCleanups` and `ParenExpr` nodes, as the
CFG does not emit them.
Differential Revision: https://reviews.llvm.org/D124807
David Green [Wed, 4 May 2022 14:07:47 +0000 (15:07 +0100)]
[VectorCombine] Add tests for shuffle binops patterns. NFC
Fraser Cormack [Wed, 20 Apr 2022 13:12:23 +0000 (14:12 +0100)]
[RISCV] Add a test showing incorrect VSETVLI insertion
This test shows incorrect cross-bb insertion. We'd expect to see
a SEW=8 vsetvli, something like:
vsetvli zero, zero, e8, mf8, ta, mu
vluxei64.v v1, (a2), v8, v0.t
But instead the vsetvli is omitted and instead an inherited SEW=64
vsetvli is used:
vmv1r.v v9, v1
vsetvli a3, zero, e64, m1, ta, mu
vmseq.vi v9, v1, 0
vmv1r.v v8, v0
vmandn.mm v0, v9, v2
beqz a0, .LBB0_2
# %bb.1:
vluxei64.v v1, (a2), v8, v0.t
vmv1r.v v3, v1
The "mask reg op" vmandn.mm in bb.1 appears to be confusing the insertion
process, as it is able to elide its own vsetvli as its VLMAX (SEW=8,
LMUL=MF8) is identical to the previous one (SEW=64, LMUL=1).
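The VLMAX coincidence can be checked with the formula VLMAX = VLEN * LMUL / SEW
from the RISC-V vector specification. A small sketch follows; VLEN=128 is an
arbitrary assumption and the function name is illustrative.

```python
from fractions import Fraction

def vlmax(vlen, sew, lmul):
    """VLMAX = VLEN * LMUL / SEW, kept exact with fractions."""
    return Fraction(vlen) * Fraction(lmul) / sew

VLEN = 128  # assumed vector register width in bits
mask_op = vlmax(VLEN, sew=8, lmul=Fraction(1, 8))  # SEW=8, LMUL=MF8
inherited = vlmax(VLEN, sew=64, lmul=1)            # SEW=64, LMUL=1
print(mask_op, inherited)  # prints: 2 2
```

Both settings share the same SEW/LMUL ratio (64), so they give the same VLMAX
for any VLEN, which is why the mask op can legally reuse the earlier vsetvli
even though the SEW differs.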
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124089
Nikita Popov [Tue, 3 May 2022 15:06:46 +0000 (17:06 +0200)]
[SDAG] Handle truncated not in haveNoCommonBitsSet()
Demanded bits analysis may replace a full-width not with an
any_extend (not (truncate X)) pattern. This patch looks through
this kind of pattern in haveNoCommonBitsSet(). Of course, we can
only do this if we only need negated bits in the non-extended part,
as the other bits may now be arbitrary. For example, if we have
haveNoCommonBitsSet(~X & Y, X) then ~X only needs to actually
negate bits set in Y.
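A brute-force sketch of why this is sound, under assumed widths (8-bit wide
type, 4-bit truncated type) and with illustrative helper names. The extended
high bits are modeled as arbitrary, and Y is restricted to the low (truncated)
bits.

```python
from itertools import product

WIDE, NARROW = 8, 4  # assumed bit widths for the exhaustive check
WIDE_MASK = (1 << WIDE) - 1
LOW_MASK = (1 << NARROW) - 1
HIGH_MASK = WIDE_MASK & ~LOW_MASK

def truncated_not(x, high_bits):
    """Model any_extend(not(truncate x)): low bits negated, high bits arbitrary."""
    return (~x & LOW_MASK) | (high_bits & HIGH_MASK)

def no_common_bits(a, b):
    return (a & b) == 0

def holds_when_y_is_narrow():
    """Check (~X & Y) against X for every X, every low-only Y, every high pattern."""
    high_patterns = range(0, WIDE_MASK + 1, 1 << NARROW)
    for x, y, high in product(range(WIDE_MASK + 1), range(LOW_MASK + 1), high_patterns):
        lhs = truncated_not(x, high) & y
        # Disjoint operands are exactly the case where add can become or.
        if not no_common_bits(lhs, x) or (lhs + x) != (lhs | x):
            return False
    return True

print(holds_when_y_is_narrow())  # prints: True
```

If Y may have bits set in the extended part, the property can fail, e.g.
X = Y = 0x10 with the arbitrary high bits equal to 0x10.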
This is only a partial solution to the problem in that it allows
add -> or conversion, but the resulting or doesn't get folded yet.
(I guess that will involve exposing getBitwiseNotOperand() as a
more general helper and using that in the relevant transform.)
Differential Revision: https://reviews.llvm.org/D124856
Nikita Popov [Wed, 4 May 2022 13:23:19 +0000 (15:23 +0200)]
[SCEV] Add additional poison implication tests (NFC)
Phoebe Wang [Wed, 4 May 2022 11:21:13 +0000 (19:21 +0800)]
[X86] Fix uninitialized variable warnings in cetintrin.h reported by #55224
Fix uninitialized variables introduced by D116325.
Differential Revision: https://reviews.llvm.org/D124916
Aaron Ballman [Wed, 4 May 2022 13:06:16 +0000 (09:06 -0400)]
Do not rely on implicit int for this test
This should address failing test bots:
https://lab.llvm.org/buildbot/#/builders/68/builds/31828
Aaron Ballman [Wed, 4 May 2022 13:05:07 +0000 (09:05 -0400)]
Bump the serialization major version number
This is a speculative fix for a build bot which does not put the LLVM
revision information into the PCH hash.
http://45.33.8.238/linux/75290/step_7.txt
Bradley Smith [Thu, 28 Apr 2022 11:11:11 +0000 (11:11 +0000)]
[AArch64][SVE] Restore SP from FP when SVE CSRs and variable sized objects are present
Without SVE, after a dynamic stack allocation has modified the SP, it is
presumed that a frame pointer restoration will revert the SP back to
its correct value prior to any caller stack being restored. However the
SVE frame is restored using the stack pointer directly, as it is located
after the frame pointer. This means that in the presence of a dynamic
stack allocation, any SVE callee state gets corrupted as SP has the
incorrect value when the SVE state is restored.
To address this issue, when variable sized objects and SVE CSRs are
present, treat the stack as having been realigned, hence restoring the
stack pointer from the frame pointer prior to restoring the SVE state.
Differential Revision: https://reviews.llvm.org/D124615
Nikita Popov [Wed, 4 May 2022 12:52:31 +0000 (14:52 +0200)]
[InstCombine] Fix commuted tests (NFC)
As pointed out on D124710, these need more thwarting.
Nikita Popov [Wed, 4 May 2022 12:47:26 +0000 (14:47 +0200)]
[SCEV] Add poison implication tests for umin_seq (NFC)
Aaron Ballman [Wed, 4 May 2022 12:42:52 +0000 (08:42 -0400)]
Fix failing buildbot for lldb
This should address the issue found by:
https://lab.llvm.org/buildbot/#/builders/68/builds/31827
Aaron Ballman [Wed, 4 May 2022 12:34:26 +0000 (08:34 -0400)]
Change the behavior of implicit int diagnostics
C89 allowed a type specifier to be elided with the resulting type being
int, aka implicit int behavior. This feature was subsequently removed
in C99 without a deprecation period, so implementations continued to
support the feature. Now, as with implicit function declarations, is a
good time to reevaluate the need for this support.
This patch allows -Wimplicit-int to issue warnings in C89 mode (off by
default), defaults the warning to an error in C99 through C17, and
disables support for the feature entirely in C2x. It also removes a
warning about missing declaration specifiers that really was just an
implicit int warning in disguise and other minor related cleanups.
Phoebe Wang [Wed, 4 May 2022 12:28:12 +0000 (20:28 +0800)]
[X86] Fix redundant `%s` in RUN command. NFC
Andrzej Warzynski [Thu, 28 Apr 2022 14:12:32 +0000 (14:12 +0000)]
[flang][driver] Define the default frontend driver triple
*SUMMARY*
Currently, the frontend driver assumes that a target triple is either:
* provided by the frontend itself (e.g. when lowering and generating
code),
* specified through the `-triple/-target` command line flags.
If `-triple/-target` is not used, the frontend will simply use the host
triple.
This is going to be insufficient when e.g. consuming an LLVM IR file
that has no triple specified (reading LLVM files is WIP, see D124667).
We shouldn't require the triple to be specified via the command line in
such a situation. Instead, the frontend driver should contain a good
default, e.g. the host triple.
This patch updates Flang's `CompilerInvocation` to do just that, i.e.
defines its default target triple. Similarly to Clang:
* the default `CompilerInvocation` triple is set as the host triple,
* the value specified with `-triple` takes precedence over the frontend
driver default and the current module triple,
* the frontend driver default takes precedence over the module triple.
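The precedence rules in the list above can be sketched as a small lookup. The
function name and the use of `None` for "not specified" are assumptions made
for illustration; this is not Flang's implementation.

```python
def effective_triple(cli_triple, frontend_default, module_triple):
    """Pick the first triple that is set, in decreasing order of precedence."""
    for candidate in (cli_triple, frontend_default, module_triple):
        if candidate:
            return candidate
    return None

host = "x86_64-unknown-linux-gnu"  # stands in for the host triple
print(effective_triple(None, host, "aarch64-unknown-linux-gnu"))
print(effective_triple("riscv64-unknown-elf", host, None))
```

Since the frontend default is always set to the host triple, the module triple
only matters in this model if that default is absent.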
*TESTS*
This change requires 2 unit tests to be updated. That's because relevant
frontend actions are updated to assume that there's always a valid
triple available in the current `CompilerInvocation`. This update is
required because the unit tests bypass the regular `CompilerInvocation`
set-up (in particular, they don't call
`CompilerInvocation::CreateFromArgs`). I've also taken the liberty to
disable the pre-processor formatting in the affected unit tests as well
(it is not required).
No new tests are added. As `flang-new -fc1` does not support consuming
LLVM IR files just yet, it is not possible to compile an LLVM IR file
without a triple. More specifically, atm all LLVM IR files are generated
and stored internally and the driver makes sure that these contain a
valid target triple. This is about to change in D124667 (which adds
support for reading LLVM IR/BC files) and that's where tests for
exercising the default frontend driver triple will be added.
*WHAT DOES CLANG DO?*
For reference, the default target triple for Clang's
`CompilerInvocation` is set through option marshalling infra [1] in
Options.td. Please check the definition of the `-triple` flag:
```
def triple : Separate<["-"], "triple">,
HelpText<"Specify target triple (e.g. i686-apple-darwin9)">,
MarshallingInfoString<TargetOpts<"Triple">, "llvm::Triple::normalize(llvm::sys::getDefaultTargetTriple())">,
AlwaysEmit, Normalizer<"normalizeTriple">;
```
Ideally, we should re-use the marshalling infra in Flang.
[1] https://clang.llvm.org/docs/InternalsManual.html#option-marshalling-infrastructure
Differential Revision: https://reviews.llvm.org/D124664
Tobias Hieta [Fri, 29 Apr 2022 09:42:06 +0000 (11:42 +0200)]
[CMake] Make omitting CMAKE_BUILD_TYPE an error
After a lot of discussion in this diff, the consensus was that it is really hard to guess the user's intention with their LLVM build. Instead of trying to guess whether Debug or Release is the correct default, we opted to make not specifying CMAKE_BUILD_TYPE an error.
Discussion on discourse here:
https://discourse.llvm.org/t/rfc-select-a-better-linker-by-default-or-warn-about-using-bfd
Reviewed By: hans, mehdi_amini, aaron.ballman, jhenderson, MaskRay, awarzynski
Differential Revision: https://reviews.llvm.org/D124153
Simon Pilgrim [Wed, 4 May 2022 11:21:07 +0000 (12:21 +0100)]
[X86] load-local illegal types tests - expose the load/store stack offsets
Make it easier to track what's going on when accessing parts of the custom-sized types.
Florian Hahn [Wed, 4 May 2022 09:53:42 +0000 (10:53 +0100)]
Recommit "[VPlan] Remove uneeded needsVectorIV check."
This reverts commit
f4e1eaa3755a13f85696be3b74b387122b74a558.
The patch was originally reverted because it uncovered an issue that has
now been fixed in
0ef8ca6d88aa7e4abc.
Jonas Paulsson [Tue, 3 May 2022 17:58:56 +0000 (19:58 +0200)]
[SystemZ] Avoid crashing in tryRISBGZero().
Bail out from cases where the result is a ConstantSDNode as it cannot be
selected and should typically not end up here.
Fixes: #55204
Reviewed By: Ulrich Weigand
Daniil Dudkin [Wed, 4 May 2022 09:29:46 +0000 (12:29 +0300)]
[flang] Fix ICE for passing a label for non alternate return arguments
When we pass an alternate return specifier to a regular (not an asterisk)
dummy argument, flang would hit an internal compiler error caused by
dereferencing a null pointer.
To avoid the ICE, a check was added.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D123947
Tobias Hieta [Mon, 25 Apr 2022 08:28:59 +0000 (10:28 +0200)]
[docs] Improve documentation around CMAKE_BUILD_TYPE
See discussion in: https://reviews.llvm.org/D124153
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D124367
Ulrich Weigand [Wed, 4 May 2022 08:43:11 +0000 (10:43 +0200)]
[libunwind][SystemZ] Unwind out of signal handlers
Unwinding out of signal handlers currently does not work since
the sigreturn trampoline is not annotated with CFI data.
Fix this by detecting the sigreturn trampoline during unwinding
and providing appropriate unwind data manually. This follows
closely the approach used by existing code for the AArch64 target.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D124765
Martin Liska [Tue, 3 May 2022 08:09:07 +0000 (10:09 +0200)]
tsan: fix GCC warnings
Fixes:
tsan/tsan_shadow.h:93:32: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
tsan/tsan_shadow.h:94:44: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
Differential Revision: https://reviews.llvm.org/D124828
Matthias Springer [Tue, 3 May 2022 14:40:13 +0000 (23:40 +0900)]
[mlir][linalg][bufferize][NFC] Remove remaining Comprehensive Bufferize code
This commit removes the Linalg Comprehensive Bufferize pass.
Differential Revision: https://reviews.llvm.org/D124854
Matthias Springer [Tue, 3 May 2022 14:39:31 +0000 (23:39 +0900)]
[mlir][linalg][bufferize][NFC] Make init_tensor elimination a separate pre-processing pass
This commit decouples init_tensor elimination from the rest of the bufferization.
Differential Revision: https://reviews.llvm.org/D124853
Fangrui Song [Wed, 4 May 2022 08:10:45 +0000 (01:10 -0700)]
[ELF] Support custom sections between DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END
We currently hard code RELRO sections. When a custom section is between
DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END, we may report a spurious
`error: section: ... is not contiguous with other relro sections`. GNU ld
makes such sections RELRO.
glibc recently switched to default --with-default-link=no. This configuration
places `__libc_atexit` and others between DATA_SEGMENT_ALIGN and
DATA_SEGMENT_RELRO_END. This patch allows such a ld.bfd --verbose
linker script to be fed into lld.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D124656
Matthias Springer [Tue, 3 May 2022 14:39:07 +0000 (23:39 +0900)]
[mlir][bufferize] Allow in-place bufferization for writes to init_tensors in loops
This commit relaxes the rules around ops that define a value but do not specify the tensor's contents. (The only such op at the moment is init_tensor.)
When such a tensor is written in a loop, it should not cause out-of-place bufferization.
Differential Revision: https://reviews.llvm.org/D124849
serge-sans-paille [Tue, 3 May 2022 12:15:24 +0000 (14:15 +0200)]
[iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on the LLVM codebase since
fa5a4e1b95c8f37796 detected a few
regressions; this commit fixes them.
Differential Revision: https://reviews.llvm.org/D124847
Marius Brehler [Tue, 3 May 2022 12:56:51 +0000 (12:56 +0000)]
[mlir] Add missing CMake deps to mlir-pdll
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124851
Luboš Luňák [Wed, 4 May 2022 06:11:49 +0000 (08:11 +0200)]
[ThreadPool] delete debug global variable if not needed
https://lab.llvm.org/buildbot/#/builders/5/builds/23099
Douglas Yung [Wed, 4 May 2022 05:45:43 +0000 (22:45 -0700)]
Mark test icf-safe.s as requiring aarch64 to fix buildbots which don't build that target.
Luboš Luňák [Wed, 6 Apr 2022 13:48:22 +0000 (15:48 +0200)]
[lldb] use one shared ThreadPool and task groups
As a preparation for parallelizing loading of symbols (D122975),
it is necessary to use just one thread pool to avoid using
a thread pool from inside a task of another thread pool.
Differential Revision: https://reviews.llvm.org/D123226
Luboš Luňák [Tue, 5 Apr 2022 19:27:14 +0000 (21:27 +0200)]
[ThreadPool] add ability to group tasks into separate groups
This is needed for parallelizing the loading of module symbols in LLDB
(D122975). Currently LLDB can parallelize indexing symbols
when loading a module, but modules are loaded sequentially. If the LLDB
index cache is enabled, this means that the cache loading is not
parallelized, even though it could be. However, doing that creates
a threadpool-within-threadpool situation, so the number of threads
would not be properly limited.
This change adds ThreadPoolTaskGroup as a simple type that can be
used with ThreadPool calls to put tasks into groups that can be
independently waited for (even recursively from within a task)
but still run in the same thread pool.
Differential Revision: https://reviews.llvm.org/D123225
Luo, Yuanke [Wed, 4 May 2022 03:24:48 +0000 (11:24 +0800)]
[fastregalloc] Fix bug when undef value is tied to def.
If the tied use is an undef value, fastregalloc should free the def
register. No reload is needed for the undef value.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D124834
Craig Topper [Wed, 4 May 2022 02:42:42 +0000 (19:42 -0700)]
[RISCV] Update isLegalAddressingMode for RVV.
RVV instructions only support base register addressing.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124820
Craig Topper [Wed, 4 May 2022 02:29:33 +0000 (19:29 -0700)]
[RISCV] Make use of SHXADD instructions in RVV spill/reload code.
We can use SH1ADD, SH2ADD, SH3ADD to multiply by 3, 5, and 9 respectively.
We could extend this to 3, 5, or 9 multiplied by a power of 2 by also
emitting a SLLI.
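The arithmetic behind this can be checked with a short sketch. It models the Zba shift-and-add semantics, rd = (rs1 << N) + rs2, in plain C++; the function names are hypothetical and this is not the code emitted by the patch. Using the same register for both operands gives x * (2^N + 1), and an extra left shift (the SLLI mentioned above) scales that by a power of 2.

```cpp
#include <cstdint>

// Models SHnADD from the RISC-V Zba extension: rd = (rs1 << n) + rs2.
constexpr std::uint64_t shNadd(unsigned n, std::uint64_t rs1,
                               std::uint64_t rs2) {
  return (rs1 << n) + rs2;
}

// With rs1 == rs2 == x the result is x * (2^n + 1).
constexpr std::uint64_t mulBy3(std::uint64_t x) { return shNadd(1, x, x); }
constexpr std::uint64_t mulBy5(std::uint64_t x) { return shNadd(2, x, x); }
constexpr std::uint64_t mulBy9(std::uint64_t x) { return shNadd(3, x, x); }

// Possible extension: SHnADD followed by a shift (modelling SLLI) multiplies
// by (2^n + 1) * 2^shift, e.g. n = 1, shift = 2 multiplies by 12.
constexpr std::uint64_t mulScaled(std::uint64_t x, unsigned n,
                                  unsigned shift) {
  return shNadd(n, x, x) << shift;
}

static_assert(mulBy3(7) == 21, "SH1ADD x, x multiplies by 3");
static_assert(mulBy5(7) == 35, "SH2ADD x, x multiplies by 5");
static_assert(mulBy9(7) == 63, "SH3ADD x, x multiplies by 9");
static_assert(mulScaled(5, 1, 2) == 60, "SH1ADD then shift by 2 multiplies by 12");
```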
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124824
Craig Topper [Wed, 4 May 2022 02:28:26 +0000 (19:28 -0700)]
[RISCV] Don't lookup TII in RISCVInstrInfo::getVLENFactoredAmount. NFCI
We're already inside of our implementation of TII.
Amir Ayupov [Wed, 4 May 2022 02:33:43 +0000 (19:33 -0700)]
[BOLT] Fix ICPJumpTablesTopN option use
Fix nonsensical `opts::ICPJumpTablesTopN != 0 ? opts::ICPTopN : opts::ICPTopN`.
Refactor/simplify another similar assignment.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124880
Amir Ayupov [Wed, 4 May 2022 02:32:19 +0000 (19:32 -0700)]
[BOLT][NFC] Make ICP options naming uniform
Rename `opts::IndirectCallPromotion*` to `opts::ICP*`, making option naming
uniform and easier to follow.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124879
Amir Ayupov [Wed, 4 May 2022 02:30:41 +0000 (19:30 -0700)]
[BOLT][NFC] ICP: simplify findTargetsIndex
Unnest lambda and use `llvm::is_contained`.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124877
Amir Ayupov [Wed, 4 May 2022 02:28:24 +0000 (19:28 -0700)]
[BOLT][NFC] Refactor ICP::findCallTargetSymbols
Reduce nesting, making it easier to read.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124876