Roman Lebedev [Mon, 22 Mar 2021 09:26:07 +0000 (12:26 +0300)]
[NFC][lit] discovery: find_tests_for_inputs: avoid py warning when no suites found
If lit was run on a directory that contained no suites,
then naturally suite[0] will not be there,
and that line would cause python warnings.
So just predicate it with a check that it is there in the first place.
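A minimal sketch of the kind of guard described above. This is not lit's actual code: `first_suite_name` and the shape of `suites` (a list of `(name, tests)` pairs) are hypothetical and only illustrate checking before indexing.

```python
def first_suite_name(suites):
    """Return the name of the first discovered suite, or None if there is none.

    `suites` is a hypothetical list of (name, tests) pairs; it is empty when
    the input directory contained no suites.
    """
    # Only index into the list once we know it has at least one entry.
    if suites:
        return suites[0][0]
    return None
```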
Florian Hahn [Mon, 15 Mar 2021 11:22:50 +0000 (11:22 +0000)]
[ConstraintElimination] Add gep tests without inbounds.
Add a set of interesting test cases for GEPs without inbounds for
upcoming patches.
Muhammad Omair Javaid [Mon, 22 Mar 2021 12:03:48 +0000 (17:03 +0500)]
[LLDB] XFAIL dwarf5-debug_line-file-index.s on arm-linux
The test dwarf5-debug_line-file-index.s fails on arm-linux-gnueabihf.
Bug # 49678 has been filed against it.
Bradley Smith [Wed, 3 Mar 2021 13:53:30 +0000 (13:53 +0000)]
[IR] Add vscale_range IR function attribute
This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen; this will be added in the future when such codegen is
considered stable.
Additionally hook up the -msve-vector-bits=<x> clang option to emit this
attribute.
Differential Revision: https://reviews.llvm.org/D98030
Sven van Haastregt [Mon, 22 Mar 2021 11:59:05 +0000 (11:59 +0000)]
[OpenCL] Support template parameters for as_type
Implement the TreeTransform for AsTypeExpr. Split `BuildAsTypeExpr`
out of `ActOnAsTypeExpr`, such that we can call the Build method from
the TreeTransform.
Fixes PR47979.
Differential Revision: https://reviews.llvm.org/D98855
Kadir Cetinkaya [Mon, 22 Mar 2021 10:18:18 +0000 (11:18 +0100)]
[clangd] Replace usages of dummy with more descriptive words
Dummy is a word with inappropriate associations. This patch updates the
references to it in clangd code base with more precise ones.
The only user-visible change is the default variable name used when extracting a
variable. It will be named as `placeholder` from now on.
Differential Revision: https://reviews.llvm.org/D99065
Andrzej Warzynski [Sat, 20 Mar 2021 15:26:46 +0000 (15:26 +0000)]
[clang][flang] Make the definition of `-module-dir` restricted to Flang
`-module-dir` is a Flang specific option and should not be visible in
Clang. This patch adds `FlangOnlyOption` flag to its definition. This
way Clang will know that it should reject it and skip it when generating
output for `clang -help`.
The definition of `-module-dir` is moved next to other Flang options.
As `-J` is an alias for `-module-dir`, it has to be moved as well (the
alias cannot be defined before the original option). As `gfortran` mode
is effectively no longer supported (*), `-J` is claimed as a Flang-only
option.
This is a follow-up of a post-commit review for
https://reviews.llvm.org/D95448.
* https://reviews.llvm.org/rG6a75496836ea14bcfd2f4b59d35a1cad4ac58cee
Differential Revision: https://reviews.llvm.org/D99018
Sjoerd Meijer [Fri, 19 Mar 2021 14:16:17 +0000 (14:16 +0000)]
[AArch64] Add some float -> int -> float conversion patterns
This adds some conversion match patterns for which we want to keep the int
values in FP registers using the corresponding NEON instructions (not the FP
instructions) to avoid more costly int <-> fp register transfers.
Differential Revision: https://reviews.llvm.org/D98956
Valeriy Savchenko [Fri, 24 Jul 2020 11:13:31 +0000 (14:13 +0300)]
[analyzer][solver] Redesign constraint ranges data structure
ImmutableSet doesn't seem like the perfect fit for the RangeSet
data structure. It is good for saving memory in a persistent
setting, but not for the case when the population of the container
is tiny. This commit replaces RangeSet implementation and
redesigns the most common operations to be more efficient.
Differential Revision: https://reviews.llvm.org/D86465
Stefan Gränitz [Mon, 22 Mar 2021 10:41:59 +0000 (11:41 +0100)]
[llvm-jitlink] Fix Windows build after
4a8161fe40cc
Florian Hahn [Mon, 22 Mar 2021 10:09:19 +0000 (10:09 +0000)]
[ConstraintElimination] Add multi-dimension GEP tests.
Add a set of interesting test cases with multi-dimensional GEPs for
upcoming patches.
Stefan Gränitz [Mon, 22 Mar 2021 10:17:11 +0000 (11:17 +0100)]
[llvm-jitlink] Add diagnostic output and port executor to getaddrinfo(3) as well
Add diagnostic output for TCP connections on both sides, llvm-jitlink and llvm-jitlink-executor.
Port the executor to use getaddrinfo(3) as well. This makes the code more symmetric and seems to be the recommended way for implementing the server side.
Reviewed By: rzurob
Differential Revision: https://reviews.llvm.org/D98581
Stefan Gränitz [Mon, 22 Mar 2021 10:18:49 +0000 (11:18 +0100)]
[llvm-jitlink] Fix use of getaddrinfo(3) when connecting remote executor via TCP socket
Since llvm-jitlink moved from gethostbyname to getaddrinfo in D95477, it seems to no longer connect to llvm-jitlink-executor via TCP. I can reproduce this behavior on both Debian 10 and macOS 10.15.7:
```
> llvm-jitlink-executor listen=localhost:10819
--
> llvm-jitlink --oop-executor-connect=localhost:10819 /path/to/obj.o
Failed to resolve localhost:10819
```
Reviewed By: rzurob
Differential Revision: https://reviews.llvm.org/D98579
Sven van Haastregt [Mon, 22 Mar 2021 09:46:28 +0000 (09:46 +0000)]
[OpenCL] Use -fdeclare-opencl-builtins for some tests
This speeds up the test running times, as the large `opencl-c.h`
header no longer needs to be parsed.
serge-sans-paille [Mon, 22 Mar 2021 09:05:25 +0000 (10:05 +0100)]
Make clangd CompletionModel usable even with non-standard (but supported) layout
llvm supports specifying a non-standard layout where each project lies in its
own place. Do not assume a fixed layout and use the appropriate cmake variable
instead.
Differential Revision: https://reviews.llvm.org/D96787
serge-sans-paille [Mon, 22 Mar 2021 08:52:39 +0000 (09:52 +0100)]
[NFC] Simpler and faster key computation for getSubtargetImpl memoization
There's no use in computing a large key that's only used for a memoization
optimization.
Adrian Kuegel [Mon, 22 Mar 2021 08:42:57 +0000 (09:42 +0100)]
[mlir] Add an option to still use bottom-up traversal
GreedyPatternRewriteDriver was changed from bottom-up traversal to top-down traversal. Not all passes work with the new traversal order yet. To allow time for fixing them, add an option to switch back to bottom-up traversal. Use this option in FusionOfTensorOpsPass, which fails otherwise.
Differential Revision: https://reviews.llvm.org/D99059
Fangrui Song [Mon, 22 Mar 2021 08:27:06 +0000 (01:27 -0700)]
[Driver] -m32: Add /usr/include/i386-linux-gnu for Debian
Kristof Beyls [Sat, 20 Mar 2021 08:09:39 +0000 (09:09 +0100)]
[docs] GettingInvolved: split out flang and openmp meeting series
Split out the flang and openmp meeting series, as each has a separate
canonical page where the information is maintained.
As part of that, also call out the alias analysis series separately as
it doesn't seem to be relevant for just flang.
Differential Revision: https://reviews.llvm.org/D99012
Yang Fan [Mon, 22 Mar 2021 08:08:47 +0000 (16:08 +0800)]
[ELF][docs] Add line breaks
Valeriy Savchenko [Fri, 19 Mar 2021 14:00:00 +0000 (17:00 +0300)]
[analyzer][solver] Fix infeasible constraints (PR49642)
Additionally, this patch puts an assertion checking for feasible
constraints in every place where constraints are assigned to states.
Differential Revision: https://reviews.llvm.org/D98948
Kim-Anh Tran [Thu, 18 Mar 2021 20:32:39 +0000 (21:32 +0100)]
[lldb] Use CompileUnit::ResolveSymbolContext in SymbolFileDWARF
SymbolFileDWARF::ResolveSymbolContext is currently unaware that in DWARF5 the primary file is specified at file index 0. As a result it fails to correctly resolve the symbol context for the primary file when DWARF5 debug data is used and the primary file is only specified at index 0.
This change makes use of CompileUnit::ResolveSymbolContext to resolve the symbol context. The ResolveSymbolContext in CompileUnit has already been updated to reflect the changes in DWARF5
and is more readable. It can resolve more, but will also do a bit more work than
SymbolFileDWARF::ResolveSymbolContext (getting the Module, and going through SymbolFileDWARF::ResolveSymbolContextForAddress). However, what gets resolved is mostly directed by $resolve_scope,
and having only one path makes the code easier to maintain.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D98619
Fangrui Song [Mon, 22 Mar 2021 07:23:54 +0000 (00:23 -0700)]
[Driver] Gnu.cpp: remove obsoleted i386 triple detection from end-of-life distribution versions
This saves 16 openat syscalls for `clang a.cc` on x86_64.
Nathan Ridge [Mon, 22 Mar 2021 05:30:18 +0000 (01:30 -0400)]
[clangd] Fix linker error when linking clang-index-server with shared libraries
Fixes https://github.com/clangd/clangd/issues/723
Differential Revision: https://reviews.llvm.org/D99049
Qiu Chaofan [Mon, 22 Mar 2021 06:29:22 +0000 (14:29 +0800)]
[PowerPC] Enable redundant TOC save removal on AIX
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D97039
Fangrui Song [Mon, 22 Mar 2021 05:40:38 +0000 (22:40 -0700)]
[Driver] Clean up Debian multiarch /usr/include/<triplet> madness
Debian multiarch additionally adds /usr/include/<triplet> and somehow
Android borrowed the idea. (Note /usr/<triplet>/include is already an
include dir...). On Debian, we should just assume a GCC installation is
available and use its triple.
Stella Laurenzo [Mon, 22 Mar 2021 04:58:17 +0000 (04:58 +0000)]
Fix extraneous context parameter in templated helper function.
(missed in lattner's overall updates related to D99028)
Bing1 Yu [Mon, 22 Mar 2021 01:48:59 +0000 (09:48 +0800)]
[X86] Pass to transform tdpbf16ps intrinsics to scalar operation.
In the previous patch https://reviews.llvm.org/D93594, we only scalarized tilezero, tileload, tilestore and tiledpbssd. In this patch we scalarize the tdpbf16ps intrinsic.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D96110
Max Kazantsev [Mon, 22 Mar 2021 04:07:32 +0000 (11:07 +0700)]
[IndVars] Sharpen context in eliminateIVComparison
When eliminating comparisons, we can use common dominator of
all its users as context. This gives better results when ICMP is not
computed right before the branch that uses it.
Differential Revision: https://reviews.llvm.org/D98924
Reviewed By: lebedev.ri
Lang Hames [Mon, 22 Mar 2021 02:58:08 +0000 (19:58 -0700)]
[JITLink][ELF/x86-64] Add support for R_X86_64_GOTPC64 and R_X86_64_GOT64.
Start adding support for ELF x86-64 large code model, PIC relocations.
Siva Chandra [Sat, 20 Mar 2021 04:50:48 +0000 (04:50 +0000)]
[libc] Add a target "install-llvmlibc" to install LLVM libc static archive.
Lang Hames [Sun, 21 Mar 2021 03:22:40 +0000 (20:22 -0700)]
[JITLink] Start laying the groundwork for ELF x86-64 large code model support.
Introduces DefineExternalSectionStartAndEndSymbols.h, which defines a template
for a JITLink pass that transforms external symbols meeting a user-supplied
predicate into defined symbols pointing at the start and end of a Section
identified by the predicate. JITLink.h is updated with a new makeAbsolute
function to support this pass.
Also renames BasicGOTAndStubsBuilder to PerGraphGOTAndPLTStubsBuilder -- the new
name better describes the intent of this GOT and PLT stubs builder, and will
help to distinguish it from future GOT and PLT stub builders that build entries
that may be shared between multiple graphs.
Lang Hames [Mon, 22 Mar 2021 00:20:07 +0000 (17:20 -0700)]
[JITLink][ELF/x86-64] Add Delta32, NegDelta32, NegDelta64 support.
These were missing, but are used in eh-frame section support.
Chuanqi Xu [Mon, 22 Mar 2021 02:25:32 +0000 (10:25 +0800)]
[ASTMatcher] Add AST Matcher support for C++20 coroutine keywords
Summary: Try to enable the support for C++20 coroutine keywords for AST
Matchers.
Reviewers: sammccall, njames93, aaron.ballman
Differential Revision: https://reviews.llvm.org/D96316
Luo, Yuanke [Sun, 21 Mar 2021 02:58:57 +0000 (10:58 +0800)]
[X86][AMX] Add test cases for AMX load/store lowering.
Differential Revision: https://reviews.llvm.org/D99030
Fangrui Song [Mon, 22 Mar 2021 00:33:30 +0000 (17:33 -0700)]
[Driver] Detect Debian hack g++-multiarch-incdir.diff to simplify addLibStdCXXIncludePaths call sites
Fangrui Song [Sun, 21 Mar 2021 22:37:35 +0000 (15:37 -0700)]
[test] Add test for cross compiling on Linux
Fangrui Song [Sun, 21 Mar 2021 22:23:49 +0000 (15:23 -0700)]
[test] Delete obsoleted debian_multiarch_tree and ubuntu_13.04_multiarch_tree
They are quite outdated. Delete them to avoid unnecessary test churn.
Jacques Pienaar [Sun, 21 Mar 2021 22:15:34 +0000 (15:15 -0700)]
Update examples post OwningRewritePatternList change
Nico Weber [Sun, 21 Mar 2021 20:35:38 +0000 (16:35 -0400)]
Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"
This reverts commit
c53a1322f329e29446c7625da423f58f09ec1a55.
Test only passes depending on build dir having a lexicographically later name
than the source dir, and doesn't link on mac/win. See
https://reviews.llvm.org/D98559#2640265 onward.
Roman Lebedev [Sun, 21 Mar 2021 20:22:41 +0000 (23:22 +0300)]
[clang][Codegen] EmitBranchOnBoolExpr(): emit prof branch counts even at -O0
This restores the original behaviour before I inadvertently broke it in
e3a470162738871bba982416748ae5f5e3572947 and clang/test/Profile/ caught it.
Roman Lebedev [Sun, 21 Mar 2021 19:13:47 +0000 (22:13 +0300)]
[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights
08196e0b2e1f8aaa8a854585335c17ba479114df exposed LowerExpectIntrinsic's
internal implementation detail in the form of
LikelyBranchWeight/UnlikelyBranchWeight options to the outside.
While this isn't incorrect from the results viewpoint,
this is suboptimal from the layering viewpoint,
and causes confusion - should transforms also use those weights,
or should they use something else, D98898?
So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight
internal again, and fixing all the code that used it directly,
which currently is only clang codegen, thankfully,
to emit proper @llvm.expect intrinsics instead.
Roman Lebedev [Sun, 21 Mar 2021 16:55:21 +0000 (19:55 +0300)]
Revert "[BranchProbability] move options for 'likely' and 'unlikely'"
Upon reviewing D98898 I've come to the realization that these are an
implementation detail of LowerExpectIntrinsicPass,
and they should not be exposed outside of it.
This reverts commit
ee8b53815ddf6f6f94ade0068903cd5ae843fafa.
Fangrui Song [Sun, 21 Mar 2021 19:01:44 +0000 (12:01 -0700)]
[Driver] Gnu.cpp: fix libstdc++ search path for multilib
With this change, on Debian x86-64 (with a MULTILIB_OSDIRNAMES local patch
../lib64 -> ../lib; this does not matter because /usr/lib64/crt{1,i,n}.o do not exist),
`clang++ --target=aarch64-linux-gnu a.cc -Wl,--dynamic-linker=/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 -Wl,-rpath,/usr/aarch64-linux-gnu/lib`
built executable can run under qemu-user. Previously this failed with
`/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/c++/10/iostream:38:10: fatal error: 'bits/c++config.h' file not found`
On Arch Linux, due to the MULTILIB_OSDIRNAMES patch and the existence of
/usr/lib64/crt{1,i,n}.o, clang driver may pick
/usr/lib64/crt{1,i,n}.o and cause a linker error. -B can work around the problem.
`clang++ --target=aarch64-linux-gnu -B /usr/aarch64-linux-gnu/lib a.cc -Wl,--dynamic-linker=/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 -Wl,-rpath,/usr/aarch64-linux-gnu/lib64:/usr/aarch64-linux-gnu/lib`
Vy Nguyen [Fri, 12 Mar 2021 22:40:37 +0000 (17:40 -0500)]
[lld-macho] Implement -dependency_info (partially - more opcodes needed)
Bug: https://bugs.llvm.org/show_bug.cgi?id=49278
The flag is not well documented, so this implementation is based on observed behaviour.
When specified, `-dependency_info <path>` produces a text file containing information pertaining to the current linkage, such as input files, output file, linker version, etc.
This file's layout is also not documented, but it seems to be a series of null ('\0') terminated strings in the form `<op code><path>`
`<op code>` could be:
`0x00` : linker version
`0x10` : input
`0x11` : files not found(??)
`0x40` : output
`<path>` : is the file path, except for the linker-version case.
(??) This part is a bit unclear. I think it means all the files the linker attempted to look at, but could not find.
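The observed layout above can be sketched as a small reader. This is an illustration only, assuming each record is a single opcode byte followed by a null-terminated string; `parse_dependency_info`, `OPCODE_NAMES` and the sample bytes are hypothetical and not part of lld.

```python
# Opcode meanings as listed above; anything else is reported by its hex value.
OPCODE_NAMES = {
    0x00: "linker version",
    0x10: "input",
    0x11: "file not found",
    0x40: "output",
}


def parse_dependency_info(data: bytes):
    """Split the raw file contents into (kind, text) records."""
    records = []
    pos = 0
    while pos < len(data):
        opcode = data[pos]
        # The payload runs up to the next null byte. Splitting the whole
        # buffer on b"\0" would not work, because opcode 0x00 is itself null.
        end = data.index(b"\0", pos + 1)  # ValueError if a record is unterminated
        text = data[pos + 1:end].decode("utf-8", errors="replace")
        records.append((OPCODE_NAMES.get(opcode, f"0x{opcode:02x}"), text))
        pos = end + 1
    return records
```

For example, `parse_dependency_info(b"\x10/tmp/a.o\x00\x40/tmp/a.out\x00")` yields one `input` record followed by one `output` record.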
Differential Revision: https://reviews.llvm.org/D98559
Craig Topper [Sun, 21 Mar 2021 17:44:31 +0000 (10:44 -0700)]
[DAGCombiner] Minor compile time improvement to (sext_in_reg (sign_extend_vector_inreg x)) optimization.
Don't bother calling ComputeNumSignBits if N00Bits < ExtVTBits. No
matter what answer we get back this will be true:
(N00Bits - DAG.ComputeNumSignBits(N00, DemandedSrcElts)) < ExtVTBits)
So we might as well save the computation. This makes the code more
consistent with the similar (sext_in_reg (sext x)) handling above.
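The reasoning above can be checked with a few lines of arithmetic. This is a sketch, assuming only that the sign-bit count is at least 1 and at most the bit width; `comparison_always_true` and its parameter names are hypothetical stand-ins for the values named in the message.

```python
def comparison_always_true(max_bits=16):
    """Check that N00Bits < ExtVTBits makes the sign-bit comparison redundant."""
    for n00_bits in range(1, max_bits + 1):
        for ext_vt_bits in range(n00_bits + 1, max_bits + 1):  # N00Bits < ExtVTBits
            for num_sign_bits in range(1, n00_bits + 1):       # any possible answer
                if not (n00_bits - num_sign_bits) < ext_vt_bits:
                    return False
    return True
```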
Nikita Popov [Sun, 21 Mar 2021 17:36:20 +0000 (18:36 +0100)]
[ValueTracking] Improve mul handling in isKnownNonEqual()
X != X * C is true if:
* C is not 0 or 1
* X is not 0
* mul is nsw or nuw
Proof: https://alive2.llvm.org/ce/z/uwF29z
This is motivated by one of the cases in D98422.
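As a sanity check alongside the Alive2 proof, the rule can be brute-forced at a small bit width. This is a sketch only; the 8-bit width and the names `find_counterexample` and `to_signed` are choices made here for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGNED_MIN = -(1 << (BITS - 1))
SIGNED_MAX = (1 << (BITS - 1)) - 1


def to_signed(value):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return value - (1 << BITS) if value > SIGNED_MAX else value


def find_counterexample():
    """Return (x, c) with x == x * c despite nuw or nsw, or None if none exists."""
    for x in range(1, MASK + 1):        # X is not 0
        for c in range(2, MASK + 1):    # C is not 0 or 1
            wrapped = (x * c) & MASK
            nuw = x * c <= MASK         # no unsigned wrap
            nsw = SIGNED_MIN <= to_signed(x) * to_signed(c) <= SIGNED_MAX
            if (nuw or nsw) and wrapped == x:
                return (x, c)
    return None
```

The wrap flags matter: without nuw or nsw, 128 * 3 wraps back to 128 at 8 bits, so X == X * C is possible.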
Nikita Popov [Sun, 21 Mar 2021 17:14:43 +0000 (18:14 +0100)]
[ValueTracking] Add more tests for isKnownNonEqual() of mul (NFC)
This is for the case of (x * C) == x, rather than the
(x * C1) == (x * C2) variant that we already cover.
Chris Lattner [Sun, 21 Mar 2021 17:38:35 +0000 (10:38 -0700)]
Remove the extraneous MLIRContext argument from populateWithGenerated. NFC.
Matt Arsenault [Sun, 21 Mar 2021 16:00:55 +0000 (12:00 -0400)]
MIR: Fix missing serialization for HasTailCall
Matt Arsenault [Sun, 14 Mar 2021 20:59:34 +0000 (16:59 -0400)]
AMDGPU: Fix allowing immediates for tail call pseudo.
The pseudo was using SSrc_b64, so it allowed folding immediates into
the destination operand for a tail call to null. However, this is not
a valid operand for the s_setpc_b64 this will be lowered to. Avoids
printing the operand as an invalid immediate.
Avoids a regression when tail calls are enabled in GlobalISel (somehow
tail calls to null get deleted in the DAG).
Chris Lattner [Sun, 21 Mar 2021 17:10:38 +0000 (10:10 -0700)]
[ShapeDialect] Silence a build warning, NFC
mlir/lib/Dialect/Shape/IR/Shape.cpp:573:26: warning: loop variable 'shape' is always a copy because the range of type '::mlir::Operation::operand_range' (aka 'mlir::OperandRange') does not return a reference [-Wrange-loop-analysis]
for (const auto &shape : shapes()) {
^
Chris Lattner [Sat, 20 Mar 2021 23:29:41 +0000 (16:29 -0700)]
Change OwningRewritePatternList to carry an MLIRContext with it.
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many many more to be removed.
Differential Revision: https://reviews.llvm.org/D99028
Nikita Popov [Sat, 6 Mar 2021 11:14:48 +0000 (12:14 +0100)]
Reapply [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()
There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).
Adjust the type check to consider vectors as well. The previous
version of the patch dropped the type check entirely, but it
turns out that getAggregateElement() does require the constant
to be an aggregate in some edge cases: For Poison/Undef the
getNumElements() API is called, without checking in advance that
we're dealing with an aggregate. Possibly the implementation should
avoid doing that, but for now I'm adding an assert so the next
person doesn't fall into this trap.
Nikita Popov [Sun, 21 Mar 2021 16:40:17 +0000 (17:40 +0100)]
[InstSimplify] Add load of undef aggregate test (NFC)
To make sure this doesn't crash the following commit.
Nikita Popov [Sun, 21 Mar 2021 16:32:14 +0000 (17:32 +0100)]
[InstSimplify] Regenerate test checks (NFC)
Nikita Popov [Sun, 21 Mar 2021 12:45:23 +0000 (13:45 +0100)]
[InstSimplify] Add additional select operand replacement tests (NFC)
This tests for binops with identity elements.
Nikita Popov [Sun, 21 Mar 2021 12:32:24 +0000 (13:32 +0100)]
[InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI)
Replace Op with RepOp up-front, and then always work with the new
operands, rather than checking for replacement in various places.
Matt Arsenault [Sat, 20 Mar 2021 17:42:17 +0000 (13:42 -0400)]
GlobalISel: Avoid unnecessary truncation to i64
We can just directly pass through the APInt to create a new constant.
Matt Arsenault [Sat, 20 Mar 2021 16:53:58 +0000 (12:53 -0400)]
AMDGPU/GlobalISel: Enable CSE in pre-legalizer combiner
Simon Pilgrim [Sun, 21 Mar 2021 14:00:59 +0000 (14:00 +0000)]
[DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension
As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.
Jez Ng [Sun, 21 Mar 2021 05:10:04 +0000 (01:10 -0400)]
[lld-macho][nfc] Format Options.td
Summary: A good chunk of it was mis-indented. Fixed by using the
formatting settings from llvm/utils/vim.
Simon Pilgrim [Sun, 21 Mar 2021 12:22:51 +0000 (12:22 +0000)]
[X86][AVX] ComputeNumSignBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources
The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.
Added as an extension to PR49658
Simon Pilgrim [Sun, 21 Mar 2021 12:08:53 +0000 (12:08 +0000)]
[X86] Add 'mulhs' variant of PR49658 test case
David Green [Sun, 21 Mar 2021 12:00:06 +0000 (12:00 +0000)]
[ARM] VINS f16 pattern
This adds an extra pattern for inserting an f16 into an odd vector lane
via a VINS. If the dual-insert-lane pattern does not happen to apply,
this can help with some simple cases.
Differential Revision: https://reviews.llvm.org/D95471
luxufan [Fri, 19 Mar 2021 09:02:28 +0000 (17:02 +0800)]
[RISCV] remove redundant instruction when eliminate frame index
The redundant mv a0, a0 instruction is generated when the stack object offset is too large for int<12>. To handle this situation, the eliminateFrameIndex function
creates a virtual register, which the register scavenger then has to scavenge. If the machine instruction that contains the stack object has the opcode ADDI (the addi
was generated by a frameindex node), and its destination register is the same as the register chosen by the register scavenger, then
mv a0, a0 is generated. To eliminate this instruction, eliminateFrameIndex no longer creates the virtual register when the instruction opcode is ADDI.
Differential Revision: https://reviews.llvm.org/D92479
Simon Pilgrim [Sun, 21 Mar 2021 10:40:57 +0000 (10:40 +0000)]
[X86][AVX] computeKnownBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources
The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.
Suggested by @craig.topper on PR49658
Simon Pilgrim [Sun, 21 Mar 2021 10:16:55 +0000 (10:16 +0000)]
[X86] Add PR49658 test case
Simon Pilgrim [Sun, 21 Mar 2021 09:57:20 +0000 (09:57 +0000)]
[X86] computeKnownBitsForTargetNode - add X86ISD::PMULUDQ handling
Reuse the existing KnownBits multiplication code to handle what is effectively an ISD::UMUL_LOHI variant
Fangrui Song [Sun, 21 Mar 2021 07:56:03 +0000 (00:56 -0700)]
[Driver] Linux.cpp: add -internal-isystem lib/../$triple/include
With this change, for `#include <ar.h>`, `clang --target=aarch64-linux-gnu`
will read `/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/ar.h`
(on Debian gcc->gcc-cross)
instead of `/usr/include/ar.h`. Some glibc headers (e.g. gnu/stubs.h) are different across architectures.
Fangrui Song [Sun, 21 Mar 2021 04:37:49 +0000 (21:37 -0700)]
[Driver] Gnu.cpp: drop an unneeded special rule related to sysroot
Fangrui Song [Sun, 21 Mar 2021 04:32:55 +0000 (21:32 -0700)]
[Driver] Gnu.cpp: drop an unneeded special rule related to sysroot
Seems unnecessary to diverge from GCC here.
Besides, lib/../$OSLibDir can be considered closer to the GCC
installation than the system root. The comment should not apply.
Fangrui Song [Sun, 21 Mar 2021 03:12:45 +0000 (20:12 -0700)]
[Driver] Gnu.cpp: remove unneeded -L detection hack for -mx32
Removing the hack actually improves our compatibility with gcc -mx32.
Fangrui Song [Sun, 21 Mar 2021 01:56:40 +0000 (18:56 -0700)]
[Driver] Gnu.cpp: remove unneeded -L detection for libc++
If clang is installed in the system, the other -L suffice;
otherwise $ccc_install_dir/../lib below suffices.
Fangrui Song [Sun, 21 Mar 2021 01:50:14 +0000 (18:50 -0700)]
[Driver] Gnu.cpp: remove unneeded -L lib/gcc/$triple/$version/../../../$triple
After path resolution, it duplicates a subsequent -L entry. The entry below
(lib/gcc/$triple/$version/../../../../$OSLibDir) usually does not exist (e.g.
Arch Linux; Debian cross gcc). When it exists, it typically just has ld.so (e.g.
Debian native gcc) which cannot cause collision. Removing the -L (similar to
reordering it) is therefore justified.
Craig Topper [Sun, 21 Mar 2021 00:43:30 +0000 (17:43 -0700)]
[RISCV] Add test case to show a case where (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization does not improve code.
If the mul had two users, one of which was a sext.w, the mul
would also be selected to a MULW before our pattern runs. This
causes the ANDs to now be used by the already selected MULW and
the mul we still need to select. They are unneeded on the MULW
since MULW only reads the lower bits. So they get selected to
SLLI+SRLI for the MULW use. The use for the
(mul (and X, 0xffffffff), (and Y, 0xffffffff)) manages to reuse
the SLLI.
The end result is increased register pressure and no improvement
to how soon we can start the MULW.
Chris Lattner [Sat, 20 Mar 2021 04:22:15 +0000 (21:22 -0700)]
[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants.
This reapplies b5d9a3c / https://reviews.llvm.org/D98609 with a one line fix in
processExistingConstants to skip() when erasing a constant we've already seen.
Original commit message:
1) Change the canonicalizer to walk the function in top-down order instead of
bottom-up order. This composes well with the "top down" nature of constant
folding and simplification, reducing iterations and re-evaluation of ops in
simple cases.
2) Explicitly enter existing constants into the OperationFolder table before
canonicalizing. Previously we would "constant fold" them and rematerialize
them, wastefully recreating a bunch of constants, which led to pointless
memory traffic.
Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.
One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aesthetic
change to the IR but does permute some testcases.
Differential Revision: https://reviews.llvm.org/D99006
Andrew Litteken [Sat, 20 Mar 2021 23:03:02 +0000 (18:03 -0500)]
Revert "[IRSim] Adding basic implementation of llvm-sim."
Causing build errors on the Windows Buildbots.
This reverts commit
5155dff2784a47583d432d796b7cf47a0bed9f20.
Jessica Clarke [Sat, 20 Mar 2021 22:35:40 +0000 (22:35 +0000)]
[RISCV] Update comment in RISCVInstrInfoM.td
Missed in
07ed62b7d551.
Craig Topper [Sat, 20 Mar 2021 22:14:46 +0000 (15:14 -0700)]
[RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled.
This optimization is trying to save SRLI instructions needed to
implement the ANDs. If we have zext.w we won't save anything.
Because we don't check that the multiply is the only user of the
AND we might even increase instruction count.
Craig Topper [Sat, 20 Mar 2021 22:09:15 +0000 (15:09 -0700)]
[RISCV] Add Zba command lines to xaluo.ll. NFC
Some of the patterns end up with 32 to 64 bit zero extends on RV64
which can be handled by zext.w.
Fangrui Song [Sat, 20 Mar 2021 22:24:02 +0000 (15:24 -0700)]
[test] Delete "-internal-isystem" "/usr/local/include"
Craig Topper [Sat, 20 Mar 2021 19:34:06 +0000 (12:34 -0700)]
[RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64
This pattern computes the full 64 bit product of a 32x32 unsigned
multiply. This requires two pairs of SLLI+SRLI to zero the
upper 32 bits of the inputs.
We can do better than this by using two SLLI to move the lower
bits to the upper bits then use MULHU to compute the product. This
is the high half of a full 64x64 product. Since we put 32 0s in the lower
bits of the inputs we know the 128-bit product will have zeros in the
lower 64 bits. So the upper 64 bits, which MULHU computes, will contain
the original 64 bit product we were after.
The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32))
using MULHS, but sext_inreg is sext.w which is already one instruction so we
wouldn't save anything.
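The MULHU trick above can be checked with plain integer arithmetic. This is a sketch modelling 64-bit registers with Python integers; `mulhu`, `mul_via_mulhu` and `mul_reference` are illustrative names, not code from the patch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def mulhu(a, b):
    """High 64 bits of the unsigned 128-bit product, modelling MULHU on RV64."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def mul_via_mulhu(x, y):
    """Two SLLI by 32 followed by MULHU."""
    xs = (x << 32) & MASK64  # SLLI: low 32 bits of x moved to the upper half
    ys = (y << 32) & MASK64  # SLLI: low 32 bits of y moved to the upper half
    return mulhu(xs, ys)


def mul_reference(x, y):
    """(mul (and X, 0xffffffff), (and Y, 0xffffffff)) on 64-bit values."""
    return ((x & MASK32) * (y & MASK32)) & MASK64
```

Both shifted inputs have 32 zero bits at the bottom, so the 128-bit product has zeros in its lower 64 bits and the upper half equals the 32x32 product, regardless of what the upper 32 bits of `x` and `y` held.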
Differential Revision: https://reviews.llvm.org/D99026
Andrew Litteken [Thu, 17 Sep 2020 20:43:40 +0000 (15:43 -0500)]
[IRSim] Adding basic implementation of llvm-sim.
This is a similarity visualization tool that accepts a Module and
passes it to the IRSimilarityIdentifier. The resulting SimilarityGroups
are output in a JSON file.
Tests are found in test/tools/llvm-sim and check for the file not found,
a bad module, and that the JSON is created correctly.
Reviewers: paquette, jroelofs, MaskRay
Recommit of:
15645d044bcfe2a0f63156048b302f997a717688 to fix linking
errors.
Differential Revision: https://reviews.llvm.org/D86974
Jinsong Ji [Sat, 20 Mar 2021 03:48:48 +0000 (03:48 +0000)]
[AIX] Update rpath for BUILD_SHARED_LIBS
BUILD_SHARED_LIBS builds LLVM components as shared libraries,
which can reduce the size a lot.
Normally, the binaries use ORIGIN../lib to load component libraries;
unfortunately, ORIGIN is not supported by AIX ld.
We hardcode the build lib and install lib paths in rpath for now
to enable the BUILD_SHARED_LIBS build.
We understand that this is not a perfect solution;
we can update this when we find a better one.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D98901
Fangrui Song [Sat, 20 Mar 2021 20:24:49 +0000 (13:24 -0700)]
[test] Fix Driver/gcc-toolchain.cpp if CLANG_DEFAULT_RTLIB is compiler-rt
Sanjay Patel [Sat, 20 Mar 2021 18:45:56 +0000 (14:45 -0400)]
[BranchProbability] move options for 'likely' and 'unlikely'
This makes the settings available for use in other passes by housing
them within the Support lib, but NFC otherwise.
See D98898 for the proposed usage in SimplifyCFG
(where this change was originally included).
Differential Revision: https://reviews.llvm.org/D98945
Jez Ng [Sat, 20 Mar 2021 05:03:50 +0000 (01:03 -0400)]
[lld-macho] Minor touch-up to objc.s
Stephen Kelly [Wed, 17 Mar 2021 23:22:31 +0000 (23:22 +0000)]
[AST] Ensure that an empty json file is generated if compile errors
Differential Revision: https://reviews.llvm.org/D98827
Fangrui Song [Sat, 20 Mar 2021 18:06:44 +0000 (11:06 -0700)]
[test] Fix Driver/gcc-toolchain.cpp if CLANG_DEFAULT_CXX_STDLIB is libc++
Fangrui Song [Sat, 20 Mar 2021 17:36:51 +0000 (10:36 -0700)]
[VE] Fix types of multiclass template arguments in TableGen files
These were not properly checked before `[TableGen] Improve handling of template arguments`.
Fangrui Song [Sat, 20 Mar 2021 16:57:05 +0000 (09:57 -0700)]
Revert "Revert "[Driver] Drop obsoleted Ubuntu 11.04 gcc detection""
This reverts commit
243333ef3ec6c1e3910eb442177c2e2e927e6a87.
Vaivaswatha Nagaraj [Fri, 19 Mar 2021 14:05:13 +0000 (19:35 +0530)]
[OCaml] Add (get/set)_module_identifer functions
Also:
- Fix a bug that crept in when fixing a buildbot failure in
https://github.com/llvm/llvm-project/commit/
f7be9db6220cb39f0eaa12d2af3abedf0d86c303
- Use mlsize_t for cstr_to_string as that is what
caml_alloc_string specifies.
Differential Revision: https://reviews.llvm.org/D98851
David Zarzycki [Sat, 20 Mar 2021 11:52:08 +0000 (07:52 -0400)]
[lit] Sort testing summary output
As fallout from the record-and-reorder work, people asked that the
summary output be sorted to aid diffing.
David Zarzycki [Sat, 20 Mar 2021 11:29:01 +0000 (07:29 -0400)]
Revert "[Driver] Drop obsoleted Ubuntu 11.04 gcc detection"
This reverts commit
bdf39e6b0ed4b41a1842ac0193f30a726f8d9f63.
The change is failing on Fedora 33 (x86-64).
Nathan James [Sat, 20 Mar 2021 10:59:36 +0000 (10:59 +0000)]
[clang-tidy] Fix bugprone-terminating-continue when continue appears inside a switch
Don't emit a warning if the `continue` appears in a switch context as changing it to `break` will break out of the switch rather than a do loop containing the switch.
Fixes https://llvm.org/PR49492.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98338
Butygin [Fri, 12 Mar 2021 14:39:43 +0000 (17:39 +0300)]
[mlir] Additional folding for SelectOp
* Fold SelectOp when both true and false args are same SSA value
* Fold some cmp + select patterns
Differential Revision: https://reviews.llvm.org/D98576
Jeroen Dobbelaere [Sat, 20 Mar 2021 10:37:09 +0000 (11:37 +0100)]
Revert of D49126 [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types.
Now that intrinsic name mangling can cope with unnamed types, the custom name mangling in PredicateInfo (introduced by D49126) can be removed.
(See D91250, D48541)
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D91661
Wang, Pengfei [Sat, 20 Mar 2021 04:55:46 +0000 (12:55 +0800)]
[X86] Fix a bug when calculating the ldtilecfg insertion points.
The BB in which we initialize ldtilecfg is special. We don't need to check
if its predecessor BBs need to insert ldtilecfg for calls.
We reused the flag HasCallBeforeAMX, so that the predecessors won't be
added to CfgNeedInsert.
This case happens only when the entry BB is in a loop. We need to hoist
the first tile config point out of the loop in future.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D98845
Butygin [Fri, 12 Mar 2021 14:39:43 +0000 (17:39 +0300)]
[mlir] Canonicalize IfOp with trivial `then` and `else` bodies to list of SelectOp's
* Do we need a threshold on maximum number of Yield arguments processed (maximum number of SelectOp's to be generated)?
* Had to modify some old IfOp tests to not get optimized by this pattern
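The shape of the rewrite named in the title can be sketched on a toy representation. This is not MLIR code: `if_to_selects` and the tuple encoding of a select are hypothetical, and the sketch assumes both bodies contain nothing but their yield.

```python
def if_to_selects(cond, then_yields, else_yields):
    """Turn an if whose bodies only yield values into one select per result.

    `then_yields` and `else_yields` are the values yielded by the two
    branches, in order. Each result i becomes
    ("select", cond, then_yields[i], else_yields[i]).
    """
    if len(then_yields) != len(else_yields):
        raise ValueError("both branches must yield the same number of values")
    return [("select", cond, t, e) for t, e in zip(then_yields, else_yields)]
```

One select is produced per yielded value, which is why the question above about a threshold on the number of Yield arguments arises.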
Differential Revision: https://reviews.llvm.org/D98592