Sjoerd Meijer [Thu, 4 Nov 2021 10:36:19 +0000 (10:36 +0000)]
[FuncSpec] Enable it only with -O3
Function specialisation was running at all optimisation levels (if enabled on
the command line, it is not on by default). That was an oversight and not
something we want to do. Function specialisation duplicates functions when it
triggers, so the backend is processing more functions/instructions resulting in
compile-time increases, which seems more appropriate with -O3 and in line with
GCC. Please note that since function specialisation is not enabled by default,
this didn't require updating any pass manager tests.
Differential Revision: https://reviews.llvm.org/D112129
Raphael Isemann [Thu, 4 Nov 2021 10:20:29 +0000 (11:20 +0100)]
[lldb][NFC] StringRef-ify the name parameter in CreateEnumerationType
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D113176
Guillaume Chatelet [Thu, 4 Nov 2021 12:14:31 +0000 (12:14 +0000)]
[libc][NFC] Allow memset (and bzero) to be inlined
This allows shipping individual functions without also having to provide
memset or bzero at the expense of bigger functions.
Similar to D113097.
Differential Revision: https://reviews.llvm.org/D113108
Guillaume Chatelet [Thu, 4 Nov 2021 12:13:29 +0000 (12:13 +0000)]
[libc][NFC] Allow memcmp to be inlined
Similar to D113097, although not strictly necessary for now. It helps
keep the same structure for all memory functions.
Differential Revision: https://reviews.llvm.org/D113103
Guillaume Chatelet [Thu, 4 Nov 2021 12:15:17 +0000 (12:15 +0000)]
[libc][NFC] Allow memcpy to be inlined
This allows shipping individual functions without also having to provide
`memcpy` at the expense of bigger functions.
Next is to use this `inlined_memcpy` in:
- loader/linux/x86_64/start.cpp
- src/string/memmove.cpp
- src/string/mempcpy.cpp
- src/string/strcpy.cpp
- src/string/strdup.cpp
- src/string/strndup.cpp
Differential Revision: https://reviews.llvm.org/D113097
Josh Mottley [Fri, 29 Oct 2021 13:45:28 +0000 (13:45 +0000)]
[flang][flang-omp-report] Removed unnecessary comments in flang-omp-report plugin tests
This patch removes unnecessary comments in the flang-omp-report
plugin tests which can be implied from the file name and run command.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D112817
Chen Zheng [Thu, 4 Nov 2021 13:32:31 +0000 (13:32 +0000)]
[PowerPC][NFC] Allow option ppc-formprep-max-vars to be set more than once.
Aaron Ballman [Thu, 4 Nov 2021 13:40:22 +0000 (09:40 -0400)]
No longer crash when a consteval function returns a structure
Ensure that the destination slot exists in this case. This addresses PR51484.
Simon Pilgrim [Thu, 4 Nov 2021 13:00:42 +0000 (13:00 +0000)]
[InstCombine] Add reference to PR52397 to help with triage
rG1e5f814302f8 added the test case; I've added PR52397 to the comment to help keep track of the source of the bug.
Simon Pilgrim [Thu, 4 Nov 2021 12:11:42 +0000 (12:11 +0000)]
[X86][SSE] Improve PMADDWD SimplifyDemandedVectorElts handling
Check both operands for zero elements to remove unnecessary demanded elts.
Try to help reduce some minor regressions noticed in D110995
Muhammad Omair Javaid [Thu, 4 Nov 2021 12:38:24 +0000 (17:38 +0500)]
[LLDB] Fix Cpsr size for WoA64 target
CPSR on Arm64 is 4 bytes in size, but the Windows on Arm implementation tries to read/write 8 bytes against a 4-byte register, causing LLDB unit test failures.
Ref: https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-arm64_nt_context
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D112471
Muhammad Omair Javaid [Thu, 4 Nov 2021 12:37:42 +0000 (17:37 +0500)]
[LIT] Add win32 PLATFORM env var to test config
LIT skips various system environment variables while building test
config. It turns out that we require PLATFORM environment variable for
detection of x86 vs Arm windows platform.
This patch adds system environment variable PLATFORM into LIT test
config for detection of win32 Arm platform.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D113165
Florian Hahn [Thu, 4 Nov 2021 12:11:17 +0000 (13:11 +0100)]
[LV] Clarify uniform worklist contains instrs demanding lane 0.
gbreynoo [Thu, 4 Nov 2021 11:01:32 +0000 (11:01 +0000)]
[llvm-objdump] Fix the Assertion failure when providing invalid --debug-vars or --dwarf values
As seen in https://bugs.llvm.org/show_bug.cgi?id=52213 llvm-objdump
asserts if either the --debug-vars or the --dwarf options are provided
with invalid values. As suggested, this fix adds use of a default value
to these options and errors when given bad input.
Differential Revision: https://reviews.llvm.org/D112183
Tim Northover [Fri, 22 Oct 2021 08:13:02 +0000 (09:13 +0100)]
Coroutines: don't infer function attrs before lowering
Coroutines have weird semantics that don't quite match normal LLVM functions,
so trying to infer even simple attributes based on their contents can go wrong.
Markus Böck [Thu, 4 Nov 2021 09:59:35 +0000 (10:59 +0100)]
[mlir] Fix typos in comments
Valentin Clement [Thu, 4 Nov 2021 09:36:00 +0000 (10:36 +0100)]
[fir] Add fir.insert_on_range conversion
Convert fir.insert_on_range operation to corresponding
llvm.insertvalue operations.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112896
David Green [Thu, 4 Nov 2021 09:24:27 +0000 (09:24 +0000)]
[InstCombine] Fix infinite recursion in ashr/xor vector fold.
The added test has poison lanes due to the vector shuffle. This can
cause an infinite loop of combines in instcombine where it folds
xor(ashr, -1) -> select (icmp slt 0), -1, 0 -> sext (icmp slt 0) -> xor(ashr, -1).
We usually prevent this by checking that the xor constant is not -1,
but with vectors some of the lanes may be -1, some may be poison. So
this changes the way we detect that from "!C1->isAllOnesValue()" to
"!match(C1, m_AllOnes())", which is more able to detect that some of the
lanes are poison.
Fixes PR52397
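To illustrate the difference between the two checks, here is a minimal
Python sketch. It is not LLVM code: a constant vector is modelled as a list,
`None` stands in for a poison lane, and the helper names
`is_all_ones_value` and `match_all_ones` are hypothetical stand-ins for the
strict check and the poison-tolerant matcher described above.

```python
POISON = None  # stand-in for a poison lane (an assumption of this sketch)


def is_all_ones_value(lanes, bits=8):
    """Strict check: every lane must literally be the all-ones value."""
    all_ones = (1 << bits) - 1
    return all(lane == all_ones for lane in lanes)


def match_all_ones(lanes, bits=8):
    """Tolerant check: poison lanes are skipped, but at least one defined
    lane must exist and every defined lane must be all-ones."""
    all_ones = (1 << bits) - 1
    defined = [lane for lane in lanes if lane is not POISON]
    return bool(defined) and all(lane == all_ones for lane in defined)


# A vector where some lanes are -1 (0xFF in 8 bits) and some are poison.
mixed = [0xFF, POISON, 0xFF, POISON]
strict_result = is_all_ones_value(mixed)
tolerant_result = match_all_ones(mixed)
```

With the strict check the mixed vector is not recognised as all-ones, so a
guard written as "not all-ones" would let the fold run again; the tolerant
check recognises it and blocks the fold.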
Martin Storsjö [Tue, 2 Nov 2021 16:32:09 +0000 (16:32 +0000)]
[libcxx] Remove nonstandard _FilesystemClock::{to,from}_time_t
These are not standard methods, neither libstdc++ nor MSVC STL provide
them.
In practice, one of them was untested and the other one was only used in
one single test.
Differential Revision: https://reviews.llvm.org/D113027
Lawrence D'Anna [Thu, 4 Nov 2021 07:58:31 +0000 (00:58 -0700)]
[lldb] Fix TestEchoCommands.test again
In 7f01f78593d6 [lldb] update TestEchoCommands -- I fixed this test,
but not on Windows, because I used some unix shell syntax that
doesn't work with cmd.exe. Fixed it so it will work in both.
Test logic is the same.
This is a trivial fix, so bypassing review to get the build clean again
ASAP.
Valentin Clement [Thu, 4 Nov 2021 07:53:49 +0000 (08:53 +0100)]
[fir] Restrict array type on fir.insert_on_range
Sequence type had no restriction on the insert_on_range operation.
This patch adds a restriction for the type to have constant shape
and size.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113092
Martin Liska [Wed, 3 Nov 2021 12:49:21 +0000 (13:49 +0100)]
Fix -Wformat warnings reported by GCC.
Differential Revision: https://reviews.llvm.org/D113099
Keith Smiley [Thu, 4 Nov 2021 04:47:49 +0000 (21:47 -0700)]
[lld-macho] Silently ignore the -objc_abi_version
This undocumented ld64 flag, based on the most recent ld64 source dump
from Xcode 12, only applies to i386. It seems like on all newer
architectures this behavior is the default.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113070
Keith Smiley [Thu, 4 Nov 2021 04:42:04 +0000 (21:42 -0700)]
[lld-macho] Cache readFile results
In one of our links lld was reading 760k files, but the unique number of
files was only 1500. This takes that link from 30 seconds to 8.
This seems like a heavy hammer, especially since some things don't need
to be cached, like the filelist arguments and the passed static
archives (the latter is already cached as a one off), but it seems ld64
does something similar here to short circuit these duplicate reads:
https://github.com/keith/ld64/blob/82e429e186488529111b0ef86af33a3b1b9438c7/src/ld/InputFiles.cpp#L644-L665
Of the types of files being read for our iOS app, the biggest problem
was constantly re-reading small tbd files:
```
% wc -l /tmp/read.txt
761414 /tmp/read.txt
% cat /tmp/read.txt | sort -u | wc -l
1503
% cat /tmp/read.txt | grep "\.a$" | wc -l
43721
% cat /tmp/read.txt | grep "\.tbd$" | wc -l
717656
```
We could likely hoist this logic up to not cache at this level, but it
would be a more invasive change to make sure all callers that needed it
cached the results.
I could see this being an issue with OOMs, and I'm not a linker expert so
maybe there's another way we should solve this problem? Feedback welcome!
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D113153
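As a rough illustration of the caching idea (not the lld implementation,
which is C++), the Python sketch below keeps file contents in a dictionary
keyed by path. The names `read_file_cached`, `_cache` and `disk_reads` are
hypothetical and exist only for this example.

```python
import os
import tempfile

_cache = {}      # path -> file contents
disk_reads = 0   # counts real reads, for illustration only


def read_file_cached(path):
    """Return the contents of `path`, reading from disk at most once."""
    global disk_reads
    if path not in _cache:
        with open(path, "rb") as f:
            _cache[path] = f.read()
        disk_reads += 1
    return _cache[path]


# Demo: request the same small file many times, as happened with tbd files.
with tempfile.TemporaryDirectory() as tmp:
    tbd = os.path.join(tmp, "libSystem.tbd")
    with open(tbd, "wb") as f:
        f.write(b"--- !tapi-tbd\n")
    results = [read_file_cached(tbd) for _ in range(1000)]
```

The trade-off is the one raised above: every cached buffer stays alive for
the duration of the link, so memory use grows with the number of unique
files.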
Keith Smiley [Thu, 4 Nov 2021 04:23:04 +0000 (21:23 -0700)]
[lld-macho] Implement -arch_errors_fatal
By default with ld64, architecture mismatches are just warnings; this
flag can be passed to make them fatal. This matches that behavior.
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D113082
Matthias Springer [Thu, 4 Nov 2021 04:51:30 +0000 (13:51 +0900)]
[mlir][linalg][bufferize] Generalize InitTensorOp elimination
This allows for external users of Comprehensive Bufferize to specify their own InitTensorOp elimination procedures.
Differential Revision: https://reviews.llvm.org/D112686
Jez Ng [Thu, 4 Nov 2021 04:01:49 +0000 (00:01 -0400)]
[lld-macho][nfc] Remove unnecessary -pie flags in tests
D101513 means that we no longer need to specify `-pie` in most of our
test RUN commands. Let's clean up the unused flags so as not to confuse
future test writers.
Reviewed By: #lld-macho, oontvoo, MaskRay
Differential Revision: https://reviews.llvm.org/D113114
Chuanqi Xu [Thu, 4 Nov 2021 03:50:30 +0000 (11:50 +0800)]
[Coroutines] [Frontend] Lookup in std namespace first
Currently in libcxx and clang, all the coroutine components are defined in
the std::experimental namespace.
The coroutine TS has since been merged into C++20, so in working drafts
such as N4892 the coroutine components are defined in the std
namespace instead of the std::experimental namespace.
Coroutine support in clang also seems to be relatively stable, so I
think it is now suitable to move the coroutine components out of the
experimental namespace.
This patch makes clang look up coroutine_traits in the std namespace
first. For compatibility, clang falls back to the
std::experimental namespace if it can't find definitions in the std
namespace, so existing code won't break after a compiler update.
If the compiler finds both std::coroutine_traits and
std::experimental::coroutine_traits at the same time, it emits an
error.
Support for looking up std::experimental::coroutine_traits will be
removed in Clang 16.
Reviewed By: lxfind, Quuxplusone
Differential Revision: https://reviews.llvm.org/D108696
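The lookup order described above can be sketched as follows. This is a
Python model of the decision logic only, not clang's Sema code; the function
name `lookup_coroutine_traits` and the set-of-namespaces input are
assumptions made for the example.

```python
def lookup_coroutine_traits(declared_namespaces):
    """Pick the namespace to use for coroutine_traits.

    `declared_namespaces` is the set of namespaces in which a
    coroutine_traits declaration is visible (hypothetical input).
    """
    in_std = "std" in declared_namespaces
    in_exp = "std::experimental" in declared_namespaces
    if in_std and in_exp:
        # Both present at the same time is reported as an error.
        raise LookupError("coroutine_traits found in both namespaces")
    if in_std:
        return "std"
    if in_exp:
        return "std::experimental"  # compatibility fallback
    raise LookupError("coroutine_traits not found")


preferred = lookup_coroutine_traits({"std"})
fallback = lookup_coroutine_traits({"std::experimental"})
```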
Muhammad Omair Javaid [Thu, 4 Nov 2021 03:44:29 +0000 (08:44 +0500)]
[LLDB] Adjust DumpDataExtractorTest.Formats for Windows
Floating point results mismatch between Visual Studio 2019 and previous
versions. This adjusts macro accordingly.
Qiu Chaofan [Thu, 4 Nov 2021 03:44:04 +0000 (11:44 +0800)]
[PowerPC] Enforce side effects to FPSCR read/set intrinsics
Currently, FPSCR is not modeled, so in some early passes (such as
early-cse), the read/set intrinsics to FPSCR may get incorrect
simplification.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D112380
Ben Vanik [Thu, 4 Nov 2021 03:30:04 +0000 (20:30 -0700)]
[ADT] Simplifying hex string parsing so it runs faster in debug modes.
This expands the lookup table statically and avoids routing through methods that
contain asserts (like StringRef/std::string element accessors and drop_front)
such that performance is more predictable across compilation environments. This
was primarily driven by slow debug mode performance but has a large benefit in
release builds as well.
```
ssd_mobilenet_v2_face_float (42MB .mlir)
Debug/MSVC (old): 5.22s
Debug/MSVC (new): 0.16s
Release/MSVC (old): 0.81s
Release/MSVC (new): 0.02s
huggingface_minilm (536MB .mlir)
Debug/MSVC (old): 65.31s
Debug/MSVC (new): 2.03s
Release/MSVC (old): 9.93s
Release/MSVC (new): 0.27s
```
Now in debug the time is split evenly between lexString, tryGetFromHex, and
element attrs hashing, with the next step to making it faster being to combine
the work (incremental hashing during conversion, etc) - but this is at least in
the right order of magnitude and retains the original API surface.
I have not profiled a build with clang but this is strictly less code and simpler
data structures so I'd expect improvements there as well.
This also fixes a bug where 0xFF bytes in the input would read out of bounds.
Reviewed By: dblaikie, stellaraccident
Differential Revision: https://reviews.llvm.org/D112105
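For readers unfamiliar with the technique, a table-driven hex decoder looks
roughly like the Python sketch below. It is only an illustration of the
approach: `HEX_TABLE` and `decode_hex` are hypothetical names, and for
simplicity the sketch rejects odd-length input, which is not necessarily how
the LLVM helper behaves.

```python
# 256-entry table: the value of each byte as a hex digit, or -1 if the
# byte is not a hex digit. Covering all 256 byte values means a byte
# such as 0xFF can never index past the end of the table.
HEX_TABLE = [-1] * 256
for i, ch in enumerate(b"0123456789"):
    HEX_TABLE[ch] = i
for i, ch in enumerate(b"abcdef"):
    HEX_TABLE[ch] = 10 + i
for i, ch in enumerate(b"ABCDEF"):
    HEX_TABLE[ch] = 10 + i


def decode_hex(data):
    """Decode a bytes object of hex digits; return None on bad input."""
    if len(data) % 2 != 0:
        return None  # simplification of this sketch
    out = bytearray()
    for i in range(0, len(data), 2):
        hi = HEX_TABLE[data[i]]
        lo = HEX_TABLE[data[i + 1]]
        if hi < 0 or lo < 0:
            return None
        out.append((hi << 4) | lo)
    return bytes(out)


decoded = decode_hex(b"48656c6c6f")
```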
RamNalamothu [Wed, 3 Nov 2021 16:53:39 +0000 (22:23 +0530)]
[AMDGPU] Do not add debug locations to the code inside prologue
There is no real source location for code inside the prologue, as it is
generated by the compiler, but source locations are being added to code
inside the prologue as a side effect of https://reviews.llvm.org/D99269
because buildSpillLoadStore() uses the source location of the real
instruction in the basic block, if any.
Fixes: SWDEV-307590
Reviewed By: scott.linder, sebastian-ne
Differential Revision: https://reviews.llvm.org/D113100
Julian Lettner [Thu, 4 Nov 2021 01:21:52 +0000 (18:21 -0700)]
Revert "Mark tsan cxa_guard_acquire test as unsupported on Darwin"
This reverts commit 593275c93c5cd3e02819f012f812eee19081911b.
This test now passes again.
Matthias Springer [Thu, 4 Nov 2021 01:47:51 +0000 (10:47 +0900)]
[mlir][linalg][bufferize] Fix typo in function name
Differential Revision: https://reviews.llvm.org/D113162
Jakub Kuderski [Thu, 4 Nov 2021 00:47:57 +0000 (20:47 -0400)]
Make enum iteration with seq safe by default
By default `llvm::seq` would happily iterate over enums, which may be unsafe if the enum values are not contiguous. This patch disables enum iteration with `llvm::seq` and `llvm::seq_inclusive` and adds two new functions: `enum_seq` and `enum_seq_inclusive`.
To make sure enum iteration is safe, we require users to declare their enum types as iterable by specializing `enum_iteration_traits<SomeEnum>`. Because it's not always possible to add these traits next to enum definition (e.g., for enums defined in external libraries), we provide an escape hatch to allow iteration on a per-callsite basis by passing `force_iteration_on_noniterable_enum`.
The main benefit of this approach is that these global declarations via traits can appear right next to enum definitions, making it easy to spot when enums are mislabeled, e.g., after introducing new enum values, whereas `force_iteration_on_noniterable_enum` should stand out and be easy to grep for.
This emerged from a discussion with gchatelet@ about reusing llvm's `Sequence.h` in lieu of https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/interface/lgc/EnumIterator.h.
Reviewed By: dblaikie, gchatelet, aaron.ballman
Differential Revision: https://reviews.llvm.org/D107378
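Why iterating over the raw integer range of an enum can be unsafe is easy to
see with a small Python analogy. This does not use the LLVM API; the `Color`
enum and the helpers `naive_seq` and `is_contiguous` are hypothetical and
only model the problem.

```python
from enum import IntEnum


class Color(IntEnum):  # hypothetical enum with a gap between 1 and 4
    RED = 0
    GREEN = 1
    BLUE = 4


def naive_seq(first, last):
    """Iterate over every integer in [first, last), like a raw range."""
    return list(range(int(first), int(last)))


def is_contiguous(enum_cls):
    """True if the enum's values form an unbroken integer range."""
    values = sorted(int(m) for m in enum_cls)
    return values == list(range(values[0], values[-1] + 1))


valid = {int(m) for m in Color}
visited = naive_seq(Color.RED, Color.BLUE)
invalid = [v for v in visited if v not in valid]
```

Here the naive range visits 2 and 3, which are not enumerators at all. An
explicit opt-in per enum type is one way to make the author confirm that
such gaps do not exist.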
Mehdi Amini [Wed, 3 Nov 2021 23:59:05 +0000 (23:59 +0000)]
Revert "Fix iterator_adaptor_base/enumerator_iter to allow composition of llvm::enumerate with llvm::make_filter_range"
This reverts commit ba7a6b314fd14bb2c9ff5d3f4fe2b6525514cada.
Post-commit review showed that the fix implemented wasn't correct, and a
more principled fix is possible.
Volodymyr Sapsai [Tue, 21 Sep 2021 01:59:19 +0000 (18:59 -0700)]
[clang][objc] Speed up populating the global method pool from modules.
For each selector encountered in the source code, we need to load
selectors from the imported modules and check that we are calling a
selector with compatible types.
At the moment, for each module we are storing methods declared in the
headers belonging to this module and methods from the transitive closure
of imported modules. When a module is imported by a few other modules,
methods from the shared module are duplicated in each importer. As the
result, we can end up with lots of identical methods that we try to add
to the global method pool. Doing this duplicate work is useless and
relatively expensive.
Avoid processing duplicate methods by storing in each module only its
own methods and not storing methods from dependencies. Collect methods
from dependencies by walking the graph of module dependencies.
The issue was discovered and reported by Richard Howell. He has done the
hard work for this fix as he has investigated and provided a detailed
explanation of the performance problem.
Differential Revision: https://reviews.llvm.org/D110123
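The idea of storing only each module's own methods and collecting the rest
by walking the dependency graph can be sketched in Python as below. The
module names, the dictionaries and `collect_methods` are all hypothetical;
the real change lives in clang's module serialization code.

```python
def collect_methods(root, imports, own_methods):
    """Gather methods visible from `root`, visiting each module once."""
    seen = set()
    stack = [root]
    result = []
    while stack:
        mod = stack.pop()
        if mod in seen:
            continue  # shared dependency already processed
        seen.add(mod)
        result.extend(own_methods.get(mod, []))
        stack.extend(imports.get(mod, []))
    return result


# "Shared" is imported by both A and B, but stores its methods only once.
imports = {"App": ["A", "B"], "A": ["Shared"], "B": ["Shared"], "Shared": []}
own_methods = {
    "App": ["-[App run]"],
    "A": ["-[A foo]"],
    "B": ["-[B bar]"],
    "Shared": ["-[Shared ping]"],
}
methods = collect_methods("App", imports, own_methods)
```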
Michael Jones [Tue, 2 Nov 2021 21:49:38 +0000 (14:49 -0700)]
[libc][NFC] rename str_conv_utils to str_to_integer
rename str_conv_utils to str_to_integer to be more
in line with str_to_float.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D113061
Jacques Pienaar [Wed, 3 Nov 2021 22:34:13 +0000 (15:34 -0700)]
[mlir] Use _odsPrinter for printer name in generated code
The generated name should not be load bearing, so this should be an NFC change.
Differential Revision: https://reviews.llvm.org/D113149
Philip Reames [Wed, 3 Nov 2021 22:13:31 +0000 (15:13 -0700)]
Backout must-exit based parts of 3fc9882e, and 412eb0
Not sure these are correct. I think I missed a case when porting this from the original SCEV change to the IndVar changes. I may end up reapplying this later with a comment about how this is correct, but in case the current bad feeling turns out to be true, I'm removing from tree while investigating further.
Jonas Devlieghere [Wed, 3 Nov 2021 21:55:28 +0000 (14:55 -0700)]
[lldb] Update tagged pointer command output and test.
- Use formatv to print the addresses.
- Add check for 0x0 which is treated as an invalid address.
- Use an address that's less likely to be interpreted as a real
tagged pointer.
Arthur Eubanks [Wed, 3 Nov 2021 22:00:28 +0000 (15:00 -0700)]
[NFC] Clarify why LinkAll*.h are actually necessary
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D113074
Arthur Eubanks [Tue, 2 Nov 2021 02:49:05 +0000 (19:49 -0700)]
[ArgPromo] Preserve FunctionAnalysisManagerCGSCCProxy
We already make sure to properly clear analyses for deleted functions.
This makes investigating some future potential compile time improvements easier.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D113032
Mogball [Wed, 3 Nov 2021 21:12:41 +0000 (21:12 +0000)]
[mlir] fix Debug unittests
Flag NDEBUG needed to be changed to LLVM_ENABLE_ABI_BREAKING_CHECKS
Philip Reames [Wed, 3 Nov 2021 21:33:08 +0000 (14:33 -0700)]
[tests] Precommit for generalization of D112262
Craig Topper [Wed, 3 Nov 2021 21:11:18 +0000 (14:11 -0700)]
[RISCV] Use HasVInstructions and HasVInstructionsAnyF in more places in TableGen. NFC
Change RISCVSubtarget.hasVInstructionAnyF() to call hasVInstructionsF32
so that any changes to hasVInstructionsF32 are reflected.
The files were missed in D112496.
Matthias Braun [Tue, 28 Sep 2021 00:57:22 +0000 (17:57 -0700)]
X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr
This is a re-commit of e2c7ee0743592e39274e28dbe0d0c213ba342317, which
was reverted in a2a58d91e82db38fbdf88cc317dcb3753d79d492. This includes
a fix to consistently check for EFLAGS being live-out. See phabricator
review.
Original Summary:
This extends `optimizeCompareInstr` to re-use previous comparison
results if the previous comparison was with an immediate that was 1
bigger or smaller. Example:
CMP x, 13
...
CMP x, 12 ; can be removed if we change the SETg
SETg ... ; x > 12 changed to `SETge` (x >= 13) removing CMP
Motivation: This often happens because SelectionDAG canonicalization
tends to add/subtract 1 often when optimizing for fallthrough blocks.
Example for `x > C` the fallthrough optimization switches true/false
blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
`x < C + 1`.
Differential Revision: https://reviews.llvm.org/D110867
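The arithmetic behind the example can be checked with a short Python sketch.
It models only the integer comparisons; it ignores overflow at the type
boundaries and all EFLAGS details, and the helper names `setg` and `setge`
are just labels for this illustration.

```python
def setg(x, imm):
    """Models `CMP x, imm` followed by SETg (signed x > imm)."""
    return x > imm


def setge(x, imm):
    """Models `CMP x, imm` followed by SETge (signed x >= imm)."""
    return x >= imm


# Example from the text: SETg after `CMP x, 12` vs SETge after `CMP x, 13`.
mismatches = [x for x in range(-128, 128) if setg(x, 12) != setge(x, 13)]

# Canonicalization mentioned above: `x <= C` becomes `x < C + 1`.
C = 41
canon_ok = all((x <= C) == (x < C + 1) for x in range(-128, 127))
```

Since the two conditions agree for every value, the flags produced by the
first compare are enough to answer the second question, provided the
condition code is adjusted.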
Lang Hames [Wed, 3 Nov 2021 20:42:05 +0000 (13:42 -0700)]
[ORC-RT] Add SPS serialization for span<const char> / SPSSequence<char>.
Philip Reames [Wed, 3 Nov 2021 20:38:09 +0000 (13:38 -0700)]
Revert "[indvars] Move a check slightly earlier [NFC]"
This reverts commit 7ff943a9ed878e3b8ffe162b2af41a81da1a11a2.
This wasn't NFC. isSigned != !isUnsigned as there are also relational operators.
River Riddle [Wed, 3 Nov 2021 19:57:36 +0000 (19:57 +0000)]
[mlir] Avoid folding in OpBuilder::tryFold when types change
This was missed when tightening fold restrictions in https://reviews.llvm.org/D95991.
Differential Revision: https://reviews.llvm.org/D113138
Kirill Stoimenov [Wed, 3 Nov 2021 18:39:38 +0000 (18:39 +0000)]
[ASan] Process functions in Asan module pass
This came up as recommendation while reviewing D112098.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D112732
alex-t [Sun, 31 Oct 2021 20:34:03 +0000 (23:34 +0300)]
[AMDGPU] Enable divergence-driven BFE selection
Detailed description: This change enables the bit field extract patterns
selection to s_bfe_u32 or v_bfe_u32 dependent on the pattern root node
divergence.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D110950
Vitaly Buka [Wed, 3 Nov 2021 20:11:05 +0000 (13:11 -0700)]
[asan] Disable test on Android Arm 32bit
Caused by D111703.
Valentin Clement [Wed, 3 Nov 2021 19:44:51 +0000 (20:44 +0100)]
[fir] Use notifyMatchFailure in fir.zero_bits conversion
Change emitOpError to notifyMatchFailure in conversion pattern.
Post-commit change after D113014
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113091
Martin Storsjö [Thu, 28 Oct 2021 07:57:27 +0000 (10:57 +0300)]
[Support] [Windows] Use RemoveFileOnSignal if unable to use the delete-on-close flag
This takes care of cleaning up the temp files on crashes. It doesn't
handle cleanup when explicitly killed though.
Differential Revision: https://reviews.llvm.org/D112710
Philip Reames [Wed, 3 Nov 2021 19:24:10 +0000 (12:24 -0700)]
[indvars] Move a check slightly earlier [NFC]
Philip Reames [Wed, 3 Nov 2021 19:08:16 +0000 (12:08 -0700)]
[indvars] Rotate zext through icmp to reduce loop varying computation
This change looks for cases where we can prove that an exit test of a loop can be performed in a narrower bitwidth, and that by doing so we can replace a loop-varying extend with a loop-invariant truncate.
The motivation here is that doing this unblocks the trip count analysis for narrow IVs involved in extended compare exit tests. It also has the nice side effect of simply making the code faster, even if we gain no other benefit from the improved analysis ability.
I've noted a few places this could be extended, but I think this stands reasonable on its own as well.
Differential Revision: https://reviews.llvm.org/D112262
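A small Python model may help show when the narrower compare is equivalent.
It assumes an 8-bit unsigned induction variable and an equality exit test,
and it takes "the invariant bound fits in the narrow type" as the provable
precondition; that precondition is an assumption of this sketch, not a
statement of the exact conditions the patch checks.

```python
NARROW_BITS = 8
MASK = (1 << NARROW_BITS) - 1


def zext(v):
    """Narrow -> wide zero extension; the unsigned value is unchanged."""
    return v & MASK


def trunc(v):
    """Wide -> narrow truncation; keeps the low NARROW_BITS bits."""
    return v & MASK


def wide_compare(iv, n):
    """Original form: icmp eq (zext iv), n, extending on every iteration."""
    return zext(iv) == n


def narrow_compare(iv, n):
    """Rotated form: icmp eq iv, (trunc n), truncating once outside the loop."""
    return iv == trunc(n)


# Bound fits in 8 bits: both forms agree for every value of the IV.
agree_in_range = all(
    wide_compare(iv, 200) == narrow_compare(iv, 200) for iv in range(256)
)
# Bound does not fit (300 truncates to 44): the forms disagree at iv == 44.
agree_out_of_range = all(
    wide_compare(iv, 300) == narrow_compare(iv, 300) for iv in range(256)
)
```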
Vitaly Buka [Wed, 3 Nov 2021 19:02:09 +0000 (12:02 -0700)]
[PassBuilder] Remove unused function after D113072
Keith Smiley [Wed, 3 Nov 2021 18:28:45 +0000 (11:28 -0700)]
[lld-macho] Enable search-paths tests on macOS
I'm not sure what the history is here but this test passes on macOS
today. It seems like we should unify these tests if they need to run
cross platform.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113085
Vitaly Buka [Wed, 3 Nov 2021 18:56:26 +0000 (11:56 -0700)]
[sanitizer] Disable new test on Android
Test added with D113055
River Riddle [Wed, 3 Nov 2021 18:22:49 +0000 (18:22 +0000)]
[mlir] Move the Operation OperandStorage to the first trailing object
The main benefits of this change are faster access to operands
(no need to compute the offset, as it is now right after the
operation) and simpler code (no need to manage a lot of the "is the
operand storage trailing" logic we had before). The major
downside, though, is that operand-holding operations now
grow in size by 1 word (as no matter how we do this change, there
will need to be some additional bookkeeping).
Differential Revision: https://reviews.llvm.org/D111695
Vitaly Buka [Wed, 3 Nov 2021 00:06:28 +0000 (17:06 -0700)]
[NFC][asan] Use AddressSanitizerOptions in ModuleAddressSanitizerPass
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D113072
Keith Smiley [Wed, 3 Nov 2021 18:08:57 +0000 (11:08 -0700)]
[lld-macho] Cache discovered framework paths
On our large iOS project this took a link from 1 minute 45 seconds to 45
seconds. For reference ld64 does the same link in ~20 seconds.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113063
Markus Böck [Wed, 3 Nov 2021 18:02:50 +0000 (19:02 +0100)]
[mlir] Change ABI breaking use of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS in DebugActions.h
A quick grep for NDEBUG in MLIR revealed a use in DebugActions.h that breaks ABI. This patch changes the use of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS, which has the advantage of being independent of whether clients build their own app in debug or release, as it is purely dependent on how MLIR itself was built.
Differential Revision: https://reviews.llvm.org/D113088
Kirill Stoimenov [Wed, 3 Nov 2021 17:59:29 +0000 (17:59 +0000)]
Revert "[ASan] Process functions in Asan module pass"
This reverts commit 76ea87b94e5cba335d691e4e18e3464ad45c8b52.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D113129
Kirill Stoimenov [Wed, 3 Nov 2021 16:32:44 +0000 (16:32 +0000)]
[ASan] Process functions in Asan module pass
This came up as recommendation while reviewing D112098.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D112732
Sanjay Patel [Wed, 3 Nov 2021 16:53:23 +0000 (12:53 -0400)]
[InstCombine] adjust test for icmp fold; NFC
I missed that the bitwidth changed from the previous test in the sequence.
Tamir Duberstein [Wed, 3 Nov 2021 17:21:43 +0000 (10:21 -0700)]
[sanitizer] Allow getsockname with NULL addrlen
This is already permitted in getpeername, and returns EFAULT
on Linux (does not crash the program).
Fixes https://github.com/google/sanitizers/issues/1451.
Differential Revision: https://reviews.llvm.org/D113055
Fangrui Song [Wed, 3 Nov 2021 17:21:13 +0000 (10:21 -0700)]
[docs] Mention --leading-lines instead of --no-leading-lines
Tamir Duberstein [Wed, 3 Nov 2021 17:16:20 +0000 (10:16 -0700)]
[sanitizer] Mark before deref in PosixSpawnImpl
Read each pointer in the argv and envp arrays before dereferencing
it; this correctly marks an error when these pointers point into
memory that has been freed.
Differential Revision: https://reviews.llvm.org/D113046
Keith Smiley [Wed, 3 Nov 2021 16:49:13 +0000 (09:49 -0700)]
[lld-macho] Cache library paths from findLibrary
On top of https://reviews.llvm.org/D113063 this took another 10 seconds
off our overall link time.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113073
Louis Dionne [Wed, 3 Nov 2021 15:30:12 +0000 (11:30 -0400)]
[libc++] Fix GDB pretty printer tests for older Clangs and GCC
This was missed by https://llvm.org/D111477, which broke the CI.
Differential Revision: https://reviews.llvm.org/D113112
Shivam Gupta [Wed, 3 Nov 2021 16:44:42 +0000 (22:14 +0530)]
[Docs] Document scripts that are used to generate assertions in test cases
This patch documents the llvm/utils/update_* python scripts that are used to generate
assertions in many of the LLVM regression test cases.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D112936
Harald van Dijk [Wed, 3 Nov 2021 16:43:44 +0000 (16:43 +0000)]
[X86] Fix X32 indirect call generation
The check for whether a zero extension was needed was subtly wrong and
saw a value that was already 64 bits, so did not extend.
Fixes PR52357.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112860
Sanjay Patel [Wed, 3 Nov 2021 16:13:53 +0000 (12:13 -0400)]
[InstCombine] refactor fold for icmp with trunc op; NFC
There are at least 3 related folds we can add here - see D112634.
Sanjay Patel [Wed, 3 Nov 2021 16:06:32 +0000 (12:06 -0400)]
[InstCombine] add tests for icmp with trunc op; NFC
Roman Lebedev [Wed, 3 Nov 2021 16:40:23 +0000 (19:40 +0300)]
[NFC] Add forgotten `REQUIRES: asserts` into the new costmodel test
Roman Lebedev [Wed, 3 Nov 2021 16:23:25 +0000 (19:23 +0300)]
[PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes
Test thanks to Michael Kuklinski from `#llvm`: https://godbolt.org/z/bdrah5Goo
originally inspired by Daniel Lemire's https://lemire.me/blog/2021/10/26/in-c-is-empty-faster-than-comparing-the-size-with-zero/
We manage to deduce that the answer does not require looping,
but we do that after the last `LoopDeletion` pass run,
so we end up being stuck with a dead loop.
Now, as with all things SCEV, this has
a very expected ~`+0.12%` compile time performance regression:
https://llvm-compile-time-tracker.com/compare.php?from=0ae7bf124a9bca76dd9a91b2f7379168ff13f562&to=c2ae57c9b961aeb4a28c747266949340613a6d84&stat=instructions
(for comparison, doing that in function simplification pipeline
would have been ~`+0.5` compile time performance regression, D112840)
Looking at the transformation stats over vanilla test-suite, i think it's rather expected:
```
| statistic name | baseline | proposed | Δ | % | |%| |
|--------------------------------------------------|----------:|----------:|------:|-------:|-------:|
| scalar-evolution.NumBruteForceTripCountsComputed | 789 | 888 | 99 | 12.55% | 12.55% |
| scalar-evolution.NumTripCountsNotComputed | 105592 | 117900 | 12308 | 11.66% | 11.66% |
| loop-delete.NumBackedgesBroken | 542 | 559 | 17 | 3.14% | 3.14% |
| regalloc.numExtends | 81 | 79 | -2 | -2.47% | 2.47% |
| indvars.NumFoldedUser | 408 | 400 | -8 | -1.96% | 1.96% |
| indvars.NumElimCmp | 3831 | 3758 | -73 | -1.91% | 1.91% |
| scalar-evolution.NumTripCountsComputed | 299759 | 304278 | 4519 | 1.51% | 1.51% |
| loop-delete.NumDeleted | 8055 | 8128 | 73 | 0.91% | 0.91% |
| machine-cse.NumCommutes | 111 | 110 | -1 | -0.90% | 0.90% |
| globaldce.NumFunctions | 1187 | 1192 | 5 | 0.42% | 0.42% |
| codegenprepare.NumSelectsExpanded | 277 | 278 | 1 | 0.36% | 0.36% |
| loop-unroll.NumRuntimeUnrolled | 13841 | 13791 | -50 | -0.36% | 0.36% |
| machinelicm.NumPostRAHoisted | 1168 | 1172 | 4 | 0.34% | 0.34% |
| phi-node-elimination.NumCriticalEdgesSplit | 83054 | 82879 | -175 | -0.21% | 0.21% |
| machine-cse.NumPREs | 3085 | 3079 | -6 | -0.19% | 0.19% |
| branch-folder.NumBranchOpts | 108122 | 107942 | -180 | -0.17% | 0.17% |
| loop-unroll.NumUnrolled | 40136 | 40067 | -69 | -0.17% | 0.17% |
| branch-folder.NumDeadBlocks | 130818 | 130607 | -211 | -0.16% | 0.16% |
| codegenprepare.NumBlocksElim | 92856 | 92714 | -142 | -0.15% | 0.15% |
| instsimplify.NumSimplified | 103263 | 103129 | -134 | -0.13% | 0.13% |
| instcombine.NumConstProp | 26070 | 26102 | 32 | 0.12% | 0.12% |
| instsimplify.NumExpand | 1716 | 1718 | 2 | 0.12% | 0.12% |
| loop-unroll.NumCompletelyUnrolled | 9236 | 9225 | -11 | -0.12% | 0.12% |
| branch-folder.NumHoist | 2773 | 2770 | -3 | -0.11% | 0.11% |
| regalloc.NumReloadsRemoved | 10822 | 10834 | 12 | 0.11% | 0.11% |
| regalloc.NumSnippets | 11394 | 11406 | 12 | 0.11% | 0.11% |
| machine-cse.NumCrossBBCSEs | 1052 | 1053 | 1 | 0.10% | 0.10% |
| machinelicm.NumCSEed | 99887 | 99784 | -103 | -0.10% | 0.10% |
| branch-folder.NumTailMerge | 72501 | 72435 | -66 | -0.09% | 0.09% |
| codegenprepare.NumExtUses | 22007 | 21987 | -20 | -0.09% | 0.09% |
| local.NumRemoved | 68232 | 68294 | 62 | 0.09% | 0.09% |
| loop-vectorize.LoopsAnalyzed | 75483 | 75413 | -70 | -0.09% | 0.09% |
```
Note that i'm only changing current PM, and not touching obsolete PM.
This is an alternative to the function simplification pipeline variant
of the same change, D112840. It has both less compile time impact
(since the additional number of SCEV trip count calculations
is way less than with D112840), and it is
much more powerful/impactful (almost 2x more loops deleted).
I have checked, and doing this after loop rotation
is favorable (more loops deleted).
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D112851
Kazu Hirata [Wed, 3 Nov 2021 16:22:50 +0000 (09:22 -0700)]
[AArch64, AMDGPU] Use make_early_inc_range (NFC)
Roman Lebedev [Wed, 3 Nov 2021 16:15:05 +0000 (19:15 +0300)]
[NFC] Rewrite runlines in interleaved-store-accesses-with-gaps.ll once again
https://lab.llvm.org/buildbot/#/builders/98/builds/8198 is still failing,
and i really don't understand how runlines in this test differ
from the ones in other nearby tests...
Hans Wennborg [Wed, 3 Nov 2021 15:54:28 +0000 (16:54 +0100)]
Revert "X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr"
This caused miscompiles of switches, see comments on the code review.
> This extends `optimizeCompareInstr` to re-use previous comparison
> results if the previous comparison was with an immediate that was 1
> bigger or smaller. Example:
>
> CMP x, 13
> ...
> CMP x, 12 ; can be removed if we change the SETg
> SETg ... ; x > 12 changed to `SETge` (x >= 13) removing CMP
>
> Motivation: This often happens because SelectionDAG canonicalization
> tends to add/subtract 1 often when optimizing for fallthrough blocks.
> Example for `x > C` the fallthrough optimization switches true/false
> blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
> `x < C + 1`.
>
> Differential Revision: https://reviews.llvm.org/D110867
This reverts commit
e2c7ee0743592e39274e28dbe0d0c213ba342317.
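The quoted description relies on the integer identity `x > C` <=> `x >= C + 1`. A minimal sketch of that identity in plain C++ follows; the helper names are hypothetical and this is not the reverted X86 code, nor does it reproduce the reported miscompile. It only illustrates the arithmetic, including the one immediate for which the rewrite is not valid because `C + 1` would overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: `x > c` may be rewritten as `x >= c + 1` only when
// c + 1 does not overflow the signed 32-bit range.
bool canRewriteGtAsGe(int32_t c) {
  return c != std::numeric_limits<int32_t>::max();
}

// Hypothetical helper: evaluates `x > c` through the rewritten form
// `x >= c + 1`, i.e. the comparison against the immediate that is 1 bigger.
bool greaterViaGe(int32_t x, int32_t c) {
  assert(canRewriteGtAsGe(c) && "c + 1 would overflow");
  return x >= c + 1;
}
```

With `C = 12`, `greaterViaGe(x, 12)` compares against 13, which is why a preceding `CMP x, 13` could in principle serve both conditions.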
Roman Lebedev [Wed, 3 Nov 2021 15:14:35 +0000 (18:14 +0300)]
[X86] `X86TTIImpl::getInterleavedMemoryOpCostAVX512()`: fallback to scalarization cost computation for mask
I don't really buy that masked interleaved memory loads/stores are supported on X86.
There is zero costmodel test coverage, no actual cost modelling for the generation
of the mask repetition, and basically only two LV tests.
Additionally, i'm not very interested in AVX512.
I don't know if this really helps "soft" block over at
https://reviews.llvm.org/D111460#inline-1075467,
but i think it can't make things worse at least.
When we are being told that there is a masking, instead of
completely giving up and falling back to
fully scalarizing `BasicTTIImplBase::getInterleavedMemoryOpCost()`,
let's correctly query the cost of masked memory ops,
keep all the pretty shuffle cost modelling,
but scalarize the cost computation for the mask replication.
I think, not scalarizing the shuffles themselves
may adjust the computed costs a bit,
and maybe hopefully just enough to hide the "regressions"
at https://reviews.llvm.org/D111460#inline-1075467
I do mean hide, because the test coverage is non-existent.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112873
Roman Lebedev [Wed, 3 Nov 2021 15:12:12 +0000 (18:12 +0300)]
[NFC] Use single-dash-prefixed options in newly-added test
https://lab.llvm.org/buildbot/#/builders/98/builds/8195 complains,
and this is the only guess i have.
Clement Courbet [Wed, 3 Nov 2021 14:43:04 +0000 (15:43 +0100)]
[Sema][NFC] Improve test coverage for builtin binary operators.
In preparation for D112453.
Erich Keane [Wed, 3 Nov 2021 14:42:00 +0000 (07:42 -0700)]
Update ast-dump-decl.mm test to work on 32 bit windows
Windows member functions have __attribute__((thiscall)) on their type,
so this test fails on any 32-bit Windows machine. Add
a wildcard, plus an additional run line to explain why.
Roman Lebedev [Wed, 3 Nov 2021 14:33:28 +0000 (17:33 +0300)]
[BasicTTI] getInterleavedMemoryOpCost(): discount unused members of mask if mask for gap will be used
As it can be seen in `InnerLoopVectorizer::vectorizeInterleaveGroup()`,
in some cases (reported by `UseMaskForGaps`), the gaps in the interleaved load/store group
will be masked away by another constant mask, so there is no need to
account for the cost of replication of the mask for these.
Differential Revision: https://reviews.llvm.org/D112877
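A rough scalar model of the masking described above, assuming a per-iteration mask that is replicated once per group member and then combined with a constant gap mask. All function names here are hypothetical and do not correspond to LLVM APIs; the sketch only shows why lanes that belong to gaps do not depend on the replicated mask bits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Repeat each lane of the per-iteration mask `factor` times, so every
// member of the interleave group sees its iteration's mask bit.
std::vector<bool> replicateMask(const std::vector<bool> &mask,
                                std::size_t factor) {
  std::vector<bool> wide;
  wide.reserve(mask.size() * factor);
  for (bool bit : mask)
    for (std::size_t j = 0; j < factor; ++j)
      wide.push_back(bit);
  return wide;
}

// Constant mask that is false on lanes of absent members (gaps).
// memberPresent[m] says whether member m of the group is actually accessed.
std::vector<bool> gapMask(std::size_t vf,
                          const std::vector<bool> &memberPresent) {
  std::vector<bool> wide;
  wide.reserve(vf * memberPresent.size());
  for (std::size_t i = 0; i < vf; ++i)
    for (bool present : memberPresent)
      wide.push_back(present);
  return wide;
}

// Lane-wise AND of two masks of equal length.
std::vector<bool> andMasks(const std::vector<bool> &a,
                           const std::vector<bool> &b) {
  assert(a.size() == b.size());
  std::vector<bool> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = a[i] && b[i];
  return out;
}
```

For a group of three members where the middle one is a gap, the middle lane of every iteration is false in the combined mask whatever the replicated bit was, which is the intuition behind not charging for replicating the mask into those lanes.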
Roman Lebedev [Wed, 3 Nov 2021 14:13:17 +0000 (17:13 +0300)]
[NFC][X86] Duplicate LV test into a costmodel test
Copied from llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
As discussed in D111460 / D112877 / D112873 we have basically no test coverage
for this part of cost model.
Erich Keane [Wed, 3 Nov 2021 14:13:02 +0000 (07:13 -0700)]
Revert part of D112349 to allow ifunc resolvers to be declarations.
The patch in D112349 added a previously nonexistent restriction on ifunc
resolvers that they MUST be definitions. However, the function
multiversioning depends on being able to resolve these resolvers at
link-time, so this additional restriction was breaking.
David Sherwood [Wed, 3 Nov 2021 13:37:30 +0000 (13:37 +0000)]
[NFC][LoopVectorize] Simple tidy-up in InnerLoopVectorizer::createVectorIntOrFpInductionPHI
Use getSignedIntOrFpConstant instead of creating int or FP constants
manually.
David Spickett [Wed, 3 Nov 2021 13:32:34 +0000 (13:32 +0000)]
Reland "[lldb] Remove non address bits when looking up memory regions"
This reverts commit
5fbcf677347e38718461496d9e9e184a7a30c3fb.
ProcessDebugger is used in ProcessWindows and NativeProcessWindows.
I thought I was simplifying things by renaming to DoGetMemoryRegionInfo
in ProcessDebugger but the Native process side expects "GetMemoryRegionInfo".
Follow the pattern that WriteMemory uses. So:
* ProcessWindows::DoGetMemoryRegionInfo calls ProcessDebugger::GetMemoryRegionInfo
* NativeProcessWindows::GetMemoryRegionInfo does the same
Peter Waller [Wed, 3 Nov 2021 13:40:22 +0000 (13:40 +0000)]
Reland "[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store"
This reverts commit
753eba64213ef20195644994df53d564f30eb65f.
Contiguous gather => masked load:
(sve.ld1.gather.index Mask BasePtr (sve.index IndexBase 1))
=> (masked.load (gep BasePtr IndexBase) Align Mask undef)
Contiguous scatter => masked store:
(sve.st1.scatter.index Value Mask BasePtr (sve.index IndexBase 1))
=> (masked.store Value (gep BasePtr IndexBase) Align Mask)
Tests with <vscale x 2 x double>:
[Gather, Scatter] for each [Positive test (index=1), Negative test
(index=2), Alignment propagation].
Differential Revision: https://reviews.llvm.org/D112076
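The gather case of the combine can be illustrated with a scalar model: when the index vector is `IndexBase + i * 1`, lane `i` of the gather reads `BasePtr[IndexBase + i]`, which is the same element a masked load from `BasePtr + IndexBase` reads. The sketch below is plain C++ with hypothetical function names, not the InstCombine code, and it assumes inactive lanes produce 0.0 in both models.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of sve.index(base, step): lane i holds base + i * step.
std::vector<long> indexVector(long base, long step, std::size_t lanes) {
  std::vector<long> idx(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    idx[i] = base + static_cast<long>(i) * step;
  return idx;
}

// Scalar model of a masked gather: active lane i reads basePtr[idx[i]].
// Inactive lanes are set to 0.0 (an assumption of this model).
std::vector<double> gatherIndex(const std::vector<bool> &mask,
                                const double *basePtr,
                                const std::vector<long> &idx) {
  assert(mask.size() == idx.size());
  std::vector<double> out(mask.size(), 0.0);
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      out[i] = basePtr[idx[i]];
  return out;
}

// Scalar model of a masked load: active lane i reads ptr[i].
std::vector<double> maskedLoad(const double *ptr,
                               const std::vector<bool> &mask) {
  std::vector<double> out(mask.size(), 0.0);
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      out[i] = ptr[i];
  return out;
}
```

With a step of 1 the two models agree lane for lane; with a step of 2 (the negative test mentioned above) the gather reads every other element and the results diverge, so the combine must not fire.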
Peter Waller [Wed, 3 Nov 2021 13:39:38 +0000 (13:39 +0000)]
Revert "[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store"
This reverts commit
1febf42f03f664ec84aedf0ece3b29f92b10dce9, which has
a use-of-uninitialized-memory bug.
See: https://reviews.llvm.org/D112076
David Spickett [Wed, 3 Nov 2021 13:27:41 +0000 (13:27 +0000)]
Revert "[lldb] Remove non address bits when looking up memory regions"
This reverts commit
6f5ce43b433706c3ae5c37022d6c0964b6bfadf8 due to
build failure on Windows.
Florian Hahn [Wed, 3 Nov 2021 13:26:15 +0000 (14:26 +0100)]
[LV] Drop unneeded use of getVPSingleValue (NFC).
VPReductionPHIRecipe inherits from VPValue, so there's no need to call
getVPSingleValue.
Konstantin Boyarinov [Wed, 3 Nov 2021 13:08:27 +0000 (16:08 +0300)]
[libcxx][test][NFC] More tests for containers comparisons
Add more missing tests for comparisons to improve code coverage (follow-up for D111738)
Reviewed By: ldionne, rarutyun, #libc
Differential Revision: https://reviews.llvm.org/D112424
Sanjay Patel [Wed, 3 Nov 2021 12:55:50 +0000 (08:55 -0400)]
[PhaseOrdering] add tests for x86 abs/max using SSE intrinsics (PR34047); NFC
D113035
Florian Hahn [Wed, 3 Nov 2021 13:11:01 +0000 (14:11 +0100)]
[VPlan] Make VPWidenCanonicalIVRecipe a VPValue (NFC).
The recipe produces exactly one VPValue and can inherit directly from
it. This is in line with other recipes and avoids having to use
getVPSingleValue.
Andrew Savonichev [Wed, 3 Nov 2021 12:48:04 +0000 (15:48 +0300)]
[NVPTX] Mark special registers as reserved
A reserved register:
- is not allocatable
- is considered always live
- is ignored by liveness tracking
NVPTX special registers match the criteria, and marking them as
reserved helps to avoid machine verifier error:
*** Bad machine code: Using an undefined physical register ***
- function: foo
- basic block: %bb.0 (0x557bb178b708)
- instruction: %0:int32regs = MOV_SPECIAL $envreg0
- operand 1: $envreg0
Differential Revision: https://reviews.llvm.org/D113008
Clement Courbet [Wed, 3 Nov 2021 09:44:21 +0000 (10:44 +0100)]
[Sema][NFC] Improve test coverage for builtin operators.
In preparation for D112453.
Pavel Labath [Wed, 3 Nov 2021 11:59:51 +0000 (12:59 +0100)]
[lldb] Remove ConstString from plugin names in PluginManager innards
This completes de-constification of plugin names.