usama hameed [Mon, 23 May 2022 21:52:14 +0000 (14:52 -0700)]
bugfix in InfiniteLoopCheck to not print warnings for unevaluated loops
Added a separate check for unevaluated statements. Updated InfiniteLoopCheck to use new check
Differential Revision: https://reviews.llvm.org/D126246
usama hameed [Thu, 19 May 2022 23:51:34 +0000 (16:51 -0700)]
bugfix in InfiniteLoopCheck to not print warnings for unevaluated loops
Differential Revision: https://reviews.llvm.org/D126034
Wolfgang Pieb [Tue, 24 May 2022 00:08:01 +0000 (17:08 -0700)]
[NFC][Metadata] Define move constructor and move assignment operator for MDOperand.
This is a preparatory patch for the MDNode resize functionality.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D125994
Sotiris Apostolakis [Tue, 24 May 2022 02:05:41 +0000 (22:05 -0400)]
[SelectOpti][4/5] Loop Heuristics
This patch adds the loop-level heuristics for determining whether branches are more profitable than conditional moves.
These heuristics apply to only inner-most loops.
Depends on D120231
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D120232
Sotiris Apostolakis [Mon, 23 May 2022 20:26:09 +0000 (16:26 -0400)]
[SelectOpti][3/5] Base Heuristics
This patch adds the base heuristics for determining whether branches are more profitable than conditional moves.
Base heuristics apply to all code apart from inner-most loops.
Depends on D122259
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D120231
Vy Nguyen [Tue, 24 May 2022 00:59:18 +0000 (07:59 +0700)]
[lld-macho][nfc] Run clang-format on lld/MachO/*.{h,cpp}
- fixed inconsistent indents and spaces
- prevent extraneous formatting changes in other patches
Differential Revision: https://reviews.llvm.org/D126262
Peter Klausler [Wed, 11 May 2022 21:32:59 +0000 (14:32 -0700)]
[flang] Ignore BIND(C) binding name conflicts of inner procedures
The binding names of inner procedures with BIND(C) are not exposed
to the loader and should be ignored for potential conflict errors.
Differential Revision: https://reviews.llvm.org/D126141
Peter Klausler [Wed, 11 May 2022 21:13:50 +0000 (14:13 -0700)]
[flang] Allow global scope names that clash with intrinsic modules
Intrinsic module names are not in the user's namespace, so they
are free to declare global names that conflict with intrinsic
modules.
Differential Revision: https://reviews.llvm.org/D126140
Xeonacid [Tue, 24 May 2022 00:58:23 +0000 (02:58 +0200)]
[RISCV] Make old JIT ExecutionEngine tests unsupported
Mark the old JIT ExecutionEngine tests as unsupported for RISCV, as is already done for many other architectures.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126188
Peter Klausler [Wed, 11 May 2022 20:15:59 +0000 (13:15 -0700)]
[flang] Fix character length calculation for Unicode component
The character length value in the derived type component information table
entry is already in units of characters, not bytes, so don't divide by the
per-character byte size.
Differential Revision: https://reviews.llvm.org/D126139
Sam Clegg [Fri, 20 May 2022 21:39:33 +0000 (14:39 -0700)]
[lld][WebAssembly] Allow use of statically allocated TLS region.
It turns out we were already allocating static address space for TLS
data along with the non-TLS static data, but this space was going
unused/ignored.
With this change, we include the TLS segment in `__wasm_init_memory`
(which does the work of loading the passive segments into memory when a
module is first loaded). We also set the `__tls_base` global to point
to the start of this segment.
This means that the runtime can use this static copy of the TLS data for
the first/primary thread if it chooses, rather than doing a runtime
allocation prior to calling `__wasm_init_tls`.
Practically speaking, this will allow emscripten to avoid dynamic
allocation of TLS region on the main thread.
Differential Revision: https://reviews.llvm.org/D126107
Hendrik Greving [Fri, 13 May 2022 17:53:13 +0000 (10:53 -0700)]
[BasicBlockUtils] Do not move loop metadata if outer loop header.
Fixes a bug by no longer moving the loop's metadata to an outer loop's header,
which happened if the loop's exit is also the header of an outer loop.
Adjusts test for above.
Fixes #55416.
Differential Revision: https://reviews.llvm.org/D125574
Hendrik Greving [Mon, 16 May 2022 14:34:04 +0000 (07:34 -0700)]
[BasicBlockUtils] Add corner case test for loop metadata.
Adds a test to expose #55416.
Differential Revision: https://reviews.llvm.org/D125696
Mehdi Amini [Mon, 16 May 2022 10:33:00 +0000 (10:33 +0000)]
Apply clang-tidy fixes for modernize-use-bool-literals in Parser.cpp (NFC)
Mehdi Amini [Mon, 16 May 2022 10:24:43 +0000 (10:24 +0000)]
Apply clang-tidy fixes for modernize-use-override in SparseTensorUtils.cpp (NFC)
Mehdi Amini [Mon, 16 May 2022 10:09:28 +0000 (10:09 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in Utils.cpp (NFC)
Vitaly Buka [Mon, 23 May 2022 22:56:35 +0000 (15:56 -0700)]
[test][clang] Move -O3 in command line
Jamie Schmeiser [Thu, 7 Oct 2021 19:02:19 +0000 (15:02 -0400)]
Add new hidden option -print-on-crash that prints out IR that caused opt pipeline to crash
A new hidden option -print-on-crash that prints the IR as it was upon entering
the last pass when there is a crash.
The IR is saved in its print form before each pass is started and a
signal handler is registered. If the compilation crashes, the signal
handler will print the saved IR to dbgs(). This option
can be modified using -print-module-scope to get the IR for the complete
module. Note that this option only works with the new pass manager.
Reviewed By: yrouban
Differential Revision: https://reviews.llvm.org/D86657
Tom Stellard [Mon, 23 May 2022 22:09:26 +0000 (15:09 -0700)]
github: Switch release PR repository to llvm/llvm-project-release-prs
As discussed in https://discourse.llvm.org/t/creating-a-new-repository-for-release-branch-pull-requests/61339
Reviewed By: asl
Differential Revision: https://reviews.llvm.org/D125851
Alex Brachet [Mon, 23 May 2022 21:47:22 +0000 (21:47 +0000)]
[libc][docs] Use same formatting for headers in source_layout
utils looks different from the other directory names
in the docs, see
https://libc.llvm.org/source_layout.html#the-utils-directory
Differential revision: https://reviews.llvm.org/D126211
NAKAMURA Takumi [Sat, 21 May 2022 23:52:03 +0000 (08:52 +0900)]
[TableGen] emitStringLiteralDef: Pad trailing '\0' at the end of char array.
Fixup for https://reviews.llvm.org/D73044
String literal has an implicit terminator '\0'. This commit adjusts char array
to long literal.
This causes difference of artifacts between -long-string-literals=true
and false.
Differential Revision: https://reviews.llvm.org/D126136
Jeffrey Tan [Mon, 23 May 2022 17:17:44 +0000 (10:17 -0700)]
Fix lldb-vscode frame test failure
The previous patch (https://reviews.llvm.org/D126013) added a new "optimized"
attribute to the DAP stack frame. This caused some tests, like
lldb-vscode/coreFile/TestVSCode_coreFile.py,
to fail because the tests explicitly check for all attributes.
To fix the test failure I decided to remove this attribute.
Differential Revision: https://reviews.llvm.org/D126225
NAKAMURA Takumi [Sat, 21 May 2022 23:56:30 +0000 (08:56 +0900)]
emitStringLiteralDef: Return earlier here. NFC.
Differential Revision: https://reviews.llvm.org/D126135
Mitch Phillips [Mon, 23 May 2022 20:11:01 +0000 (13:11 -0700)]
[symbolizer] Parse DW_TAG_variable DIs to show line info for globals
Currently, llvm-symbolizer doesn't like to parse .debug_info in order to
show the line info for global variables. addr2line does this. In the
future, I'm looking to migrate AddressSanitizer off of internal metadata
over to using debuginfo, and this is predicated on being able to get the
line info for global variables.
This patch adds the requisite support for getting the line info from the
.debug_info section for symbolizing global variables. This only happens
when you ask for a global variable to be symbolized as data.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D123538
Sotiris Apostolakis [Mon, 23 May 2022 14:47:32 +0000 (10:47 -0400)]
[SelectOpti][2/5] Select-to-branch base transformation
This patch implements the actual transformation of selects to branches.
It includes only the base transformation without any sinking.
Depends on D120230
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D122259
Qunyan Mangus [Mon, 23 May 2022 19:48:06 +0000 (12:48 -0700)]
Remove duplicate fields in RAGreedy
RAGreedy has two RegisterClassInfo fields: its own RCI, and RegClassInfo from its base class.
RCI is initialized without calling freezeReservedRegs first, whereas RegClassInfo is initialized with it. Therefore, if the reserved register
information changes between the last call to freezeReservedRegs and RAGreedy, RCI does not pick it up.
Instead of having both fields in RAGreedy, remove RCI and use RegClassInfo instead. Also removed is the TRI field
which is present in its base class.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D125926
Paul Robinson [Mon, 23 May 2022 19:49:20 +0000 (12:49 -0700)]
[PS5] Make driver's PIC behavior match PS4
The new test is a copy of the corresponding PS4 test, with the triple
etc updated, because there's currently no good way to make one lit test
"iterate" with multiple targets.
Louis Dionne [Mon, 23 May 2022 19:36:35 +0000 (15:36 -0400)]
[libc++] Remove duplicate tests for callable concepts
This is essentially a revert of c7ad02009. Indeed, it seems that both
96dbdd75 and c7ad02009 were committed, but c7ad02009 seems to be only
an older version of 96dbdd75's tests.
Stella Stamenova [Mon, 23 May 2022 19:38:02 +0000 (12:38 -0700)]
[mlir] Use 'native' instead of 'llvm_has_native_target' in the mlir tests
The tests actually require the target triple to match the host, rather than just having the host in the list of available targets. This change removes `llvm_has_native_target` and instead uses the `native` feature from the lit configuration.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D126011
Florian Hahn [Mon, 23 May 2022 19:27:42 +0000 (20:27 +0100)]
[AArch64] Add tests with free shuffles for indexed fma variants.
The new tests contain examples where shuffles are free, because indexed
fma instructions can be used.
Paul Walker [Mon, 23 May 2022 18:07:10 +0000 (19:07 +0100)]
[SVEInstrFormats] Ensure scatter instructions are named consistently.
Alexey Bataev [Mon, 23 May 2022 15:09:55 +0000 (08:09 -0700)]
[SLP][NFC]Improve compile time, NFC.
Builds the UserIgnore list only once as a SmallDenseSet without rebuilding
it between the runs, iterates over gathers instead of the list of reduction ops,
and does some checks in buildTree_rec only if the corresponding containers
are not empty.
Julian Lettner [Mon, 23 May 2022 18:32:38 +0000 (11:32 -0700)]
[Sanitizer][Darwin] Add explanation for Apple platform macros
Differential Revision: https://reviews.llvm.org/D126229
LLVM GN Syncbot [Mon, 23 May 2022 18:52:16 +0000 (18:52 +0000)]
[gn build] Port eebc1fb772c5
Nikolas Klauser [Sun, 22 May 2022 11:43:37 +0000 (13:43 +0200)]
[libc++] Add ranges::max_element to the synopsis and ADL-proof the __min_element_impl calls
Reviewed By: ldionne, #libc
Spies: sstefan1, libcxx-commits
Differential Revision: https://reviews.llvm.org/D126167
Nikolas Klauser [Sun, 22 May 2022 11:34:22 +0000 (13:34 +0200)]
[libc++] Add auto to the list of required extensions in C++03
We use `auto` in C++03, so we shouldn't say that we aren't.
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D126165
Nikolas Klauser [Fri, 20 May 2022 15:11:58 +0000 (17:11 +0200)]
[libc++] Assume that push_macro and pop_macro are available
All compilers that libc++ supports support `push_macro` and `pop_macro`. So let's remove it.
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D126073
Nikolas Klauser [Thu, 12 May 2022 13:46:18 +0000 (15:46 +0200)]
[libc++] Always enable the ranges concepts
The ranges concepts were already available in libc++13, so we shouldn't guard them with `_LIBCPP_HAS_NO_INCOMPLETE_RANGES`.
Fixes https://github.com/llvm/llvm-project/issues/54765
Reviewed By: #libc, ldionne
Spies: ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D124011
Nikolas Klauser [Fri, 20 May 2022 21:31:13 +0000 (23:31 +0200)]
[libc++] Granularize parts of <type_traits>
`<type_traits>` is quite a large header, so I'll granularize it in a few steps.
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D124755
Jorge Gorbe Moya [Mon, 23 May 2022 18:29:10 +0000 (11:29 -0700)]
Remove `friend` classes from TypeCategoryMap
As far as I can tell, the only thing those friend classes access is the
`ValueSP` typedef.
Given that this is a map-ish class, with "Map" in its name, it doesn't
seem like a stretch to make `KeyType`, `ValueType` and `ValueSP` public.
More so when the public methods of the class have `KeyType` and
`ValueSP` arguments and clearly `ValueSP` needs to be accessed from the
outside.
`friend` complicates local reasoning about who can access private
members, which is valuable in a class like this that has every method
locking a mutex to prevent concurrent access.
Differential Revision: https://reviews.llvm.org/D126103
natashaknk [Mon, 23 May 2022 17:58:33 +0000 (10:58 -0700)]
[mlir][tosa] Change tosa.depthwise_conv2d's ending reshape to a collapse.
TOSA's depthwise_conv2d operation includes a reshape to include the implicit x1 dimension.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D126212
Julian Lettner [Mon, 23 May 2022 18:18:15 +0000 (11:18 -0700)]
[Sanitizer][Darwin] Add SANITIZER_DRIVERKIT platform macro
Sanjay Patel [Mon, 23 May 2022 17:31:00 +0000 (13:31 -0400)]
[IR] add and use pattern match specialization for sqrt intrinsic; NFC
This was included in D126190 originally, but it's
independent and a useful change for readability.
Craig Topper [Mon, 23 May 2022 05:38:04 +0000 (22:38 -0700)]
[DAGCombiner][AArch64] Don't fold (smulo x, 2) -> (saddo x, x) if VT is i2.
If the VT is i2, then 2 is really -2.
Test has not been committed yet, but diff shows the change.
Fixes PR55644.
Differential Revision: https://reviews.llvm.org/D126213
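The reasoning above ("if the VT is i2, then 2 is really -2") can be checked with a small model of signed overflow arithmetic. This is only an illustrative sketch in Python, not the DAGCombiner code; the helper names `to_signed`, `fits`, `smulo`, `saddo`, and `fold_is_sound` are made up for this example.

```python
def to_signed(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def fits(value, bits):
    """True if value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def smulo(x, y, bits):
    """Model of signed multiply with overflow: (wrapped result, overflow flag)."""
    exact = x * y
    return to_signed(exact, bits), not fits(exact, bits)


def saddo(x, y, bits):
    """Model of signed add with overflow: (wrapped result, overflow flag)."""
    exact = x + y
    return to_signed(exact, bits), not fits(exact, bits)


def fold_is_sound(bits):
    """Check (smulo x, 2) == (saddo x, x) for every x of the given width.

    The constant 2 is first wrapped to the width, which is where i2 differs:
    the bit pattern 0b10 is -2 in a 2-bit signed type.
    """
    two = to_signed(2, bits)
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    return all(smulo(x, two, bits) == saddo(x, x, bits) for x in range(lo, hi))
```

Under this model the fold holds for widths of 3 bits and more, while for i2 the overflow flags disagree, for example at x = 1, where 1 * -2 fits but 1 + 1 does not.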
Craig Topper [Mon, 23 May 2022 05:32:16 +0000 (22:32 -0700)]
[AArch64] Add test case for pr55644. NFC
Dave Lee [Mon, 23 May 2022 18:00:22 +0000 (11:00 -0700)]
[lldb] Specify arguments of `image list`
Register positional argument details in `CommandObjectTargetModulesList`.
I recently learned that `image list` takes a module name, but the help info
does not indicate this. With this change, `help image list` will show that it
accepts zero or more module names.
This makes it easier to get info about specific modules, without having to
find/grep through the full image list.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D125154
Stephen Long [Mon, 23 May 2022 14:01:55 +0000 (07:01 -0700)]
[MSVC, ARM64] Add __readx18 intrinsics
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
unsigned char __readx18byte(unsigned long)
unsigned short __readx18word(unsigned long)
unsigned long __readx18dword(unsigned long)
unsigned __int64 __readx18qword(unsigned long)
Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedLoad()`
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126024
Dave Lee [Wed, 18 May 2022 23:31:49 +0000 (16:31 -0700)]
[lldb] Improve formatting of dlopen error messages (NFC)
Ensures there's a space between "utility" and "function", and also makes
it easier to grep/search for "utility function".
While making this change, I also re-formatted the other dlopen error messages
(with clang-format). This fixes other instances of spaces missing between words,
and makes each of these strings fit a single line, making them greppable.
Differential Revision: https://reviews.llvm.org/D126078
Benjamin Kramer [Mon, 23 May 2022 17:53:40 +0000 (19:53 +0200)]
Fix an unused variable warning in no-asserts build mode
Paul Robinson [Mon, 23 May 2022 17:43:12 +0000 (10:43 -0700)]
[PS5] Disable a test, same as PS4
Philip Reames [Mon, 23 May 2022 17:10:08 +0000 (10:10 -0700)]
[RISCV] Add basic fault-first load coverage for VSETVLI insertion
Simplified version of a test taken from D123581.
Jeffrey Tan [Tue, 17 May 2022 16:21:10 +0000 (09:21 -0700)]
Show error message for optimized variables
This fixes an issue where the error message for optimized variables is not shown
to end users in lldb-vscode.
Differential Revision: https://reviews.llvm.org/D126014
Jeffrey Tan [Tue, 17 May 2022 16:17:26 +0000 (09:17 -0700)]
Add [opt] suffix to optimized stack frame in lldb-vscode
To help users identify optimized code, this diff adds a "[opt]" suffix to
optimized stack frames in lldb-vscode. This provides an experience consistent
with command line lldb.
It also adds a new "optimized" attribute to the DAP stack frame object so that
it is easier to identify from telemetry than parsing the trailing "[opt]".
Differential Revision: https://reviews.llvm.org/D126013
Fangrui Song [Mon, 23 May 2022 16:58:54 +0000 (09:58 -0700)]
[llvm-nm][docs] Document -W and -U
Latest GNU nm (milestone: 2.39) has added -W/--no-weak and changed -U to mean
--defined-only (instead of --unicode=). The changes match our semantics.
Close #55297
Reviewed by: jhenderson, keith
Differential Revision: https://reviews.llvm.org/D126133
Christopher Bate [Tue, 17 May 2022 21:42:47 +0000 (15:42 -0600)]
[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering
AsyncCopyOp lowering converted "size in elements" to "size in bytes"
assuming the element type size is at least one byte. This removes
that restriction, allowing for types such as i4 and b1 to be handled
correctly.
Differential Revision: https://reviews.llvm.org/D125838
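The size calculation described above can be illustrated as follows. This is a sketch of the arithmetic only, not the actual lowering code; both function names and the choice to reject sizes that are not a whole number of bytes are assumptions made for the example.

```python
def copy_size_bytes_whole_byte_elements(num_elements, bit_width):
    """Size computed under the old assumption that an element is >= 1 byte.

    For sub-byte types, bit_width // 8 is 0, so the result is wrong.
    """
    return num_elements * (bit_width // 8)


def copy_size_bytes(num_elements, bit_width):
    """Size computed from the total bit count, which also covers sub-byte types.

    Hypothetical policy: reject copies that are not a whole number of bytes.
    """
    total_bits = num_elements * bit_width
    if total_bits % 8 != 0:
        raise ValueError("copy size is not a whole number of bytes")
    return total_bits // 8
```

For eight i4 elements the first function yields 0 bytes while the second yields 4; for element types of one byte or more the two agree.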
Matthias Springer [Mon, 23 May 2022 16:49:45 +0000 (18:49 +0200)]
[mlir][bufferize][NFC] Update One-Shot Bufferize pass documentation
Differential Revision: https://reviews.llvm.org/D125637
Christopher Bate [Fri, 20 May 2022 20:41:55 +0000 (14:41 -0600)]
[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync
This change adds missing support for the i4 data type. Tests are added
to ensure proper lowering of an nvgpu.mma.sync operation targeting the
16x8x64xi4 and 16x8x32xi4 MMA variants in the NVVM dialect.
Differential Revision: https://reviews.llvm.org/D126092
Matthias Springer [Mon, 23 May 2022 16:37:26 +0000 (18:37 +0200)]
[mlir][bufferize] Support fully dynamic layout maps in BufferResultsToOutParams
Also fixes integration of the pass into One-Shot Bufferize and adds additional test cases.
BufferResultsToOutParams can be used with "identity-layout-map" and "fully-dynamic-layout-map". "infer-layout-map" is not supported.
Differential Revision: https://reviews.llvm.org/D125636
Jonas Devlieghere [Mon, 23 May 2022 16:07:54 +0000 (09:07 -0700)]
[lldb] Fix should_skip_simulator_test decorator
Currently simulator tests get skipped when the reported platform is
macosx rather than darwin. Update the decorator to match both.
Matthias Springer [Mon, 23 May 2022 16:10:12 +0000 (18:10 +0200)]
[mlir][bufferization] Fix Python bindings
Differential Revision: https://reviews.llvm.org/D126179
Nathan Sidwell [Tue, 22 Mar 2022 17:49:08 +0000 (10:49 -0700)]
[clang] Module global init mangling
C++20 modules require emission of an initializer function, which is
called by importers of the module. This implements the mangling for
that function. It is the one place the ABI exposes partition names in
symbols -- but fortunately only needed by other TUs of that same module.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D122741
Stella Laurenzo [Sun, 22 May 2022 04:30:01 +0000 (21:30 -0700)]
NFC: Silence two warnings for unused bufferization symbols in release mode.
Differential Revision: https://reviews.llvm.org/D126182
Richard [Fri, 20 May 2022 23:19:11 +0000 (17:19 -0600)]
[clang-tidy] Improve add_new_check.py to recognize more checks
When looking for whether or not a check provides fixits, the script
examines the implementation of the check. Some checks are not
implemented in source files that correspond one-to-one with the check
name, e.g. cert-dcl21-cpp. So if we can't find the check implementation
directly from the check name, open up the corresponding module file and
look for the class name that is registered with the check. Then consult
the file corresponding to the class name.
Some checks are derived from a base class that implements fixits. So if
we can't find fixits in the implementation file for a check, scrape out
the name of its base class. If it's not ClangTidyCheck, then consult
the base class implementation to look for fixit support.
Differential Revision: https://reviews.llvm.org/D126134
Fixes #55630
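The base-class fallback described above could look roughly like the sketch below. It is not the code from add_new_check.py; the regular expression, the function names `base_class_name` and `needs_base_lookup`, and the sample header text in the usage note are all assumptions for illustration.

```python
import re

# Hypothetical pattern: matches "class Name : public Base" in a check header.
_CLASS_DECL = re.compile(r"class\s+(\w+)\s*:\s*public\s+([\w:]+)")


def base_class_name(header_text, class_name):
    """Return the unqualified base class of class_name, or None if not found."""
    for match in _CLASS_DECL.finditer(header_text):
        if match.group(1) == class_name:
            # Drop any namespace qualification such as "utils::".
            return match.group(2).split("::")[-1]
    return None


def needs_base_lookup(header_text, class_name):
    """True if fixit support should also be searched for in the base class."""
    base = base_class_name(header_text, class_name)
    return base is not None and base != "ClangTidyCheck"
```

Given a header containing `class FooCheck : public utils::BarCheck {`, `base_class_name` returns `"BarCheck"`, so the base class implementation would be consulted next; a check deriving directly from `ClangTidyCheck` would not trigger the extra lookup.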
Nikita Popov [Mon, 23 May 2022 15:29:33 +0000 (17:29 +0200)]
[InstCombine] Change operand order in recursive and/or of icmps fold
The order obviously doesn't matter for bitwise and/or, but would
matter for logical and/or, so change it to preserve the original
order.
Nikita Popov [Mon, 23 May 2022 15:24:19 +0000 (17:24 +0200)]
[InstCombine] Add tests for recursive and/or of icmp folds (NFC)
Add variations with bitwise and logical and/or, as well as
commuted operands.
Jingu Kang [Mon, 23 May 2022 11:33:48 +0000 (12:33 +0100)]
Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth""
This reverts commit 42ebfa8269470e6b1fe2de996d3f1db6d142e16a.
The commit from https://reviews.llvm.org/D125918 has fixed the stage 2 build
failure.
Differential Revision: https://reviews.llvm.org/D118979
Matthias Springer [Mon, 23 May 2022 14:53:17 +0000 (16:53 +0200)]
[mlir][bufferization][NFC] Improve assembly format of AllocTensorOp
No longer pass static dim sizes as an attribute. This was redundant and required extra checks in the verifier. This change also makes the op symmetrical to memref::AllocOp.
Differential Revision: https://reviews.llvm.org/D126178
PeixinQiao [Mon, 23 May 2022 14:50:06 +0000 (22:50 +0800)]
[NFC][flang] Change the OpenMP atomic read/write test cases
Remove the integration tests and rename the file.
Reviewed By: shraiysh, NimishMishra
Differential Revision: https://reviews.llvm.org/D126169
Alexander Belyaev [Mon, 23 May 2022 14:29:02 +0000 (16:29 +0200)]
[mlir] Add Expm1 to ComplexOps.td.
Differential Revision: https://reviews.llvm.org/D126206
Jay Foad [Mon, 23 May 2022 14:18:34 +0000 (15:18 +0100)]
[TableGen] Remove an untrue statement from the docs
You can't use foreach in a record body. This was a mistake in the
documentation dating from when it was first written in D85838.
Alexander Belyaev [Mon, 23 May 2022 14:10:20 +0000 (16:10 +0200)]
[mlir] Add RSqrt to ComplexOps.td.
Differential Revision: https://reviews.llvm.org/D126202
Alexey Bataev [Wed, 4 Aug 2021 17:58:37 +0000 (10:58 -0700)]
[SLP]Do not emit extract elements for insertelements users, replace with shuffles directly.
SLP vectorizer emits extracts for externally used vectorized scalars and
estimates the cost for each such extract. But in many cases these
scalars are input for insertelement instructions, forming buildvector,
and instead of an extractelement/insertelement pair we can estimate the
cost of shuffle(s) and generate a series of shuffles, which can be
further optimized.
Tested using test-suite (+SPEC2017), the tests passed, SLP was able to
generate/vectorize more instructions in many cases and it reduced the
number of re-vectorization attempts (where we could try to vectorize
buildvector insertelements again and again).
Differential Revision: https://reviews.llvm.org/D107966
Stephen Long [Mon, 23 May 2022 14:00:54 +0000 (07:00 -0700)]
[MSVC, ARM64] Add __writex18 intrinsics
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
void __writex18byte(unsigned long, unsigned char)
void __writex18word(unsigned long, unsigned short)
void __writex18dword(unsigned long, unsigned long)
void __writex18qword(unsigned long, unsigned __int64)
Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedStore()`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126023
Sanjay Patel [Mon, 23 May 2022 13:26:43 +0000 (09:26 -0400)]
[InstCombine] fold icmp of zext bool based on limited range
X <u (zext i1 Y) --> (X == 0) && Y
https://alive2.llvm.org/ce/z/avQDRY
This is a generalization of 4069cccf3b4ff4a based on the post-commit suggestion.
This also adds the i1 type check and tests that were missing from the earlier
attempt; that commit caused several bot fails and was reverted.
Differential Revision: https://reviews.llvm.org/D126171
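The fold above rests on the fact that a zero-extended i1 can only be 0 or 1, so an unsigned X is less than it only when X is 0 and Y is true. A brute-force check over a small width makes this concrete. This is a Python model for illustration, not the InstCombine implementation, and the function names are made up here.

```python
def icmp_ult_zext(x, y):
    """Model of `X <u (zext i1 Y)` for a non-negative (unsigned) x and bool y."""
    return x < int(y)


def folded(x, y):
    """Model of the replacement `(X == 0) && Y`."""
    return x == 0 and y


def fold_holds(bits=8):
    """Exhaustively compare both forms for every unsigned x of the given width."""
    return all(
        icmp_ult_zext(x, y) == folded(x, y)
        for x in range(1 << bits)
        for y in (False, True)
    )
```

Calling `fold_holds()` compares all 512 combinations of an 8-bit X and a boolean Y.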
Sanjay Patel [Sun, 22 May 2022 17:16:19 +0000 (13:16 -0400)]
[InstCombine] add tests for icmp of zext i1; NFC
Alexey Bataev [Mon, 23 May 2022 13:43:02 +0000 (06:43 -0700)]
[SLP][NFC]Add a test for extracting scalar from undef result vector, NFC.
Nikita Popov [Mon, 23 May 2022 13:12:15 +0000 (15:12 +0200)]
[InstCombine] Reuse icmp of and/or folds for logical and/or
Similarly to a change recently done for fcmps, add a flag that
indicates whether the and/or is logical to foldAndOrOfICmps, and
reuse the function when folding logical and/or.
We were already calling some parts of it, but this gives us a
clearer indication of which parts may need poison-safe variants,
and would also allow to fold combinations of bitwise and logical
and/or.
This change should be close to NFC, because all folds this enables
were either already called previously, or can make use of implied
poison reasoning.
Anastasia Stulova [Mon, 23 May 2022 13:03:54 +0000 (14:03 +0100)]
[SPIR-V] Allow setting SPIR-V version via target triple.
Currently added versions are from v1.0 to v1.5, other versions
can be added as needed.
This change also adds documentation about SPIR-V target support
in LLVM.
Differential Revision: https://reviews.llvm.org/D124776
Timm Bäder [Mon, 23 May 2022 13:22:27 +0000 (15:22 +0200)]
Revert "[clang][driver] Dynamically select gcc-toolset/devtoolset version"
This reverts commit 8717b492dfcd12d6387543a2f8322e0cf9059982.
The new unittest fails on Windows buildbots, e.g.
https://lab.llvm.org/buildbot/#/builders/119/builds/8647
Dmitry Preobrazhensky [Mon, 23 May 2022 12:48:47 +0000 (15:48 +0300)]
[AMDGPU][MC][GFX940] Disable v_mac_f32_dpp
Differential Revision: https://reviews.llvm.org/D126070
Sylvestre Ledru [Mon, 23 May 2022 11:49:34 +0000 (13:49 +0200)]
Add support of the next Ubuntu (Ubuntu 22.10 - Kinetic Kudu)
Sylvestre Ledru [Mon, 23 May 2022 11:47:13 +0000 (13:47 +0200)]
Add support of the next Debian (Debian 13 - Trixie)
Jay Foad [Mon, 23 May 2022 11:04:43 +0000 (12:04 +0100)]
[AMDGPU] Remove unneeded regex escaping in FileCheck patterns
These must have crept in since D117298 was landed.
Edd Barrett [Mon, 23 May 2022 10:18:30 +0000 (11:18 +0100)]
Test stackmap support for i128
This diff adds tests that check the currently-working stackmap cases for i128.
This will help ensure no regressions are later introduced by D125680 (when
ready).
Note that i128 stackmap support is currently incomplete, so we can't test all
i128 functionality:
- i128 constants >= 2^{63} crash LLVM
- non-constant i128s crash LLVM
So this change tests only constant i128 operands of value < 2^{63}.
A couple of incorrect comments are also fixed.
Simon Pilgrim [Mon, 23 May 2022 10:48:24 +0000 (11:48 +0100)]
[AArch64] Regenerate andandshift.ll test checks
Timm Bäder [Wed, 18 May 2022 08:31:41 +0000 (10:31 +0200)]
[clang][driver] Dynamically select gcc-toolset/devtoolset version
And pick the highest one, instead of adding all possibilities to the
prefixes.
Differential Revision: https://reviews.llvm.org/D125862
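Picking the highest toolset version, rather than listing every candidate prefix, can be sketched like this. The sketch is in Python and is not the driver's C++ code; the directory naming scheme (`gcc-toolset-N` and `devtoolset-N`) and the function name `pick_highest_toolset` are assumptions for the example.

```python
import re

# Assumed naming scheme for toolset directories, e.g. "gcc-toolset-12".
_TOOLSET_DIR = re.compile(r"^(gcc-toolset|devtoolset)-(\d+)$")


def pick_highest_toolset(dir_names):
    """Return the toolset directory name with the highest numeric version.

    Names that do not match the assumed scheme are ignored. Returns None
    when no candidate matches.
    """
    best = None  # (version, name) of the best candidate seen so far
    for name in dir_names:
        match = _TOOLSET_DIR.match(name)
        if match is None:
            continue
        version = int(match.group(2))  # compare numerically, so 10 > 9
        if best is None or version > best[0]:
            best = (version, name)
    return best[1] if best else None
```

Comparing versions as integers rather than strings matters here, since a string comparison would rank `devtoolset-9` above `gcc-toolset-10`.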
LiaoChunyu [Mon, 23 May 2022 00:59:41 +0000 (08:59 +0800)]
[RISCV][NFC] Test cases for fmuladd intrinsic
These test cases are copied from the fma tests.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126049
Nikita Popov [Wed, 18 May 2022 13:34:12 +0000 (15:34 +0200)]
[CGP] Freeze condition when despeculating ctlz/cttz
Freeze the condition of the newly introduced conditional branch,
to avoid immediate undefined behavior if the input to ctlz/cttz
was originally poison.
Differential Revision: https://reviews.llvm.org/D125887
Andre Vieira [Mon, 23 May 2022 08:43:39 +0000 (09:43 +0100)]
[AArch64] Order STP Q's by ascending address
This patch adds an AArch64 specific PostRA MachineScheduler to try to schedule
STP Q's to the same base-address in ascending order of offsets. We have found
this to improve performance on Neoverse N1 and should not hurt other AArch64
cores.
Differential Revision: https://reviews.llvm.org/D125377
Florian Hahn [Mon, 23 May 2022 08:39:00 +0000 (09:39 +0100)]
[AArch64] implement isReassocProfitable, disable for (u|s)mlal.
Currently reassociating add expressions can lead to failing to select
(u|s)mlal. Implement isReassocProfitable to skip reassociating
expressions that can be lowered to (u|s)mlal.
The same issue exists for the *mlsl variants as well, but the DAG
combiner doesn't use the isReassocProfitable hook before reassociating.
To be fixed in a follow-up commit as this requires DAGCombiner changes
as well.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D125895
Chuanqi Xu [Mon, 23 May 2022 08:21:42 +0000 (16:21 +0800)]
Revert "[C++20] [Coroutines] Conform the updates for CWG issue 2585"
This reverts commit 1b89a25a9b960886e486eb20b755634613c088f8.
The test would fail in windows versions.
Peter Waller [Mon, 16 May 2022 20:59:17 +0000 (20:59 +0000)]
[LV] Improve register pressure estimate at high VFs
Previously, `getRegUsageForType` was implemented using
`getTypeLegalizationCost`. `getRegUsageForType` is used by the loop
vectorizer to estimate the register pressure caused by using a vector
type. However, `getTypeLegalizationCost` currently only appears to
understand splitting and not scalarization, so significantly
underestimates the register requirements.
Instead, use `getNumRegisters`, which understands when scalarization
can occur (via computeRegisterProperties).
This was discovered while investigating D118979 (Set maximum VF with
shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the
loop vectorizer previously ends up costing a v128i1 as 2 v64i*
registers where it actually occupies 128 i32 registers.
I'm sending this patch early for comment; I'm still doing some sanity checking
with LNT. I note that getRegisterClassForType appears to return VectorRC even
though the type in question (large vNi1 types) ends up occupying scalar
registers. That might be worth fixing too.
Differential Revision: https://reviews.llvm.org/D125918
David Green [Mon, 23 May 2022 07:55:54 +0000 (08:55 +0100)]
[AArch64] Fix assumptions on input type of tryCombineFixedPointConvert
It is possible for the input type to not be v2i64 or v4i32, so weaken
the assertion to a return, fixing the crash in the new test.
Fixes #55606
Chuanqi Xu [Mon, 23 May 2022 07:23:00 +0000 (15:23 +0800)]
[C++20] [Coroutines] Conform the updates for CWG issue 2585
According to the updates in CWG issue 2585
https://cplusplus.github.io/CWG/issues/2585.html, we shouldn't find an
allocation function with (size, p0, …, pn) in global scope.
Sergei Trofimovich [Mon, 23 May 2022 07:39:48 +0000 (08:39 +0100)]
[Support] Add missing <cstdint> header to Base64.h
Without the change llvm build fails on this week's gcc-13 snapshot as:
[ 91%] Building CXX object unittests/Support/CMakeFiles/SupportTests.dir/Base64Test.cpp.o
In file included from llvm/unittests/Support/Base64Test.cpp:14:
llvm/include/llvm/Support/Base64.h: In function 'std::string llvm::encodeBase64(const InputBytes&)':
llvm/include/llvm/Support/Base64.h:29:5: error: 'uint32_t' was not declared in this scope
29 | uint32_t x = ((unsigned char)Bytes[i] << 16) |
| ^~~~~~~~
Sergei Trofimovich [Mon, 23 May 2022 07:03:23 +0000 (08:03 +0100)]
[Support] Add missing <cstdint> header to Signals.h
Without the change llvm build fails on this week's gcc-13 snapshot as:
[ 0%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.o
In file included from llvm/lib/Support/Signals.cpp:14:
llvm/include/llvm/Support/Signals.h:119:8: error: variable or field 'CleanupOnSignal' declared void
119 | void CleanupOnSignal(uintptr_t Context);
| ^~~~~~~~~~~~~~~
Gabor Marton [Thu, 19 May 2022 09:14:56 +0000 (11:14 +0200)]
[analyzer][NFC] Factor out the copy-paste code repetition of assumeDual and assumeInclusiveRangeDual
Depends on D125892. There might be efficiency and performance
implications by using a lambda. Thus, I am going to conduct measurements
to see if there is any noticeable impact.
I've been thinking about two more alternatives:
1) Make `assumeDualImpl` a variadic template and (perfect) forward the
arguments for the used `assume` function.
2) Use a macro.
I have concerns though, whether these alternatives would deteriorate the
readability of the code.
Differential Revision: https://reviews.llvm.org/D125954
Gabor Marton [Thu, 19 May 2022 09:05:24 +0000 (11:05 +0200)]
[analyzer] Implement assumeInclusiveRange in terms of assumeInclusiveRangeDual
Depends on D124758. This is the very same thing we have done for
assumeDual, but this time we do it for assumeInclusiveRange. This patch
is basically a no-brainer copy of that previous patch.
Differential Revision: https://reviews.llvm.org/D125892
Muhammad Omair Javaid [Mon, 23 May 2022 06:17:24 +0000 (11:17 +0500)]
Revert "[lldb] Consider binary as module of last resort"
This reverts commit a3c3482ceb529206b0ae4e7782e5496da5e0879d.
It broke LLDB API test TestBadAddressBreakpoints.py
Differential revision: https://reviews.llvm.org/D124731
Chenbing Zheng [Mon, 23 May 2022 06:17:01 +0000 (14:17 +0800)]
[InstCombine] add tests for bitcast; NFC