Krzysztof Parzyszek [Tue, 22 Nov 2022 17:11:17 +0000 (09:11 -0800)]
[Hexagon] Make local array static in getIntrinsicForHexagonNonClangBuiltin
It should not be created on every call; the omission of `static` was a bug
in the patch that introduced it.
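As a rough illustration of why the missing `static` matters, the sketch below counts how often a local table's elements are constructed. The `Entry` type and both lookup functions are hypothetical and are not the Hexagon code itself.

```cpp
#include <cassert>

// Hypothetical element type that counts how often it is constructed.
struct Entry {
  static int Constructions;
  int Key;
  int Value;
  Entry(int K, int V) : Key(K), Value(V) { ++Constructions; }
};
int Entry::Constructions = 0;

// Without `static`, the table is rebuilt on every call.
int lookupRebuilt(int Key) {
  const Entry Table[] = {{1, 10}, {2, 20}, {3, 30}};
  for (const Entry &E : Table)
    if (E.Key == Key)
      return E.Value;
  return -1;
}

// With `static`, the table is built once, on the first call.
int lookupStatic(int Key) {
  static const Entry Table[] = {{1, 10}, {2, 20}, {3, 30}};
  for (const Entry &E : Table)
    if (E.Key == Key)
      return E.Value;
  return -1;
}
```

Calling `lookupStatic` repeatedly constructs the three entries only once, while each call to `lookupRebuilt` constructs them again.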
Kirill Stoimenov [Fri, 18 Nov 2022 23:46:54 +0000 (23:46 +0000)]
[Sanitizer][NFC] Rearranged prototype definitions in lsan_common.h to group them by implementation file.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D138335
Guozhi Wei [Tue, 22 Nov 2022 17:18:29 +0000 (17:18 +0000)]
[AArch64] Correctly recognize -reserve-regs-for-regalloc=X30,X29
In the AArch64 backend, X30 is named LR and X29 is named FP, so the following code in AArch64Subtarget::AArch64Subtarget can't recognize these 2 registers:
  for (unsigned i = 0; i < 31; ++i) {
    if (ReservedRegNames.count(TRI->getName(AArch64::X0 + i)))
      ReserveXRegisterForRA.set(i);
  }
This patch adds code to explicitly handle these 2 registers.
Differential Revision: https://reviews.llvm.org/D137810
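The name mismatch can be reproduced in a small standalone sketch. Here `regName` is a hypothetical stand-in for `TRI->getName(AArch64::X0 + i)`, and `reserveWithAliases` is only one possible way to handle the two aliased registers; the actual patch may differ.

```cpp
#include <bitset>
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the register-name lookup: the last two
// registers are reported under their alias names.
std::string regName(unsigned I) {
  if (I == 29)
    return "FP";
  if (I == 30)
    return "LR";
  return "X" + std::to_string(I);
}

// Name-only matching: a request for "X29" or "X30" is silently missed.
std::bitset<31> reserveByNameOnly(const std::set<std::string> &Names) {
  std::bitset<31> Reserved;
  for (unsigned I = 0; I < 31; ++I)
    if (Names.count(regName(I)))
      Reserved.set(I);
  return Reserved;
}

// Same loop, plus explicit handling of the two aliased registers.
std::bitset<31> reserveWithAliases(const std::set<std::string> &Names) {
  std::bitset<31> Reserved = reserveByNameOnly(Names);
  if (Names.count("X29"))
    Reserved.set(29);
  if (Names.count("X30"))
    Reserved.set(30);
  return Reserved;
}
```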
Jay Foad [Tue, 22 Nov 2022 17:06:53 +0000 (17:06 +0000)]
[AMDGPU] Remove RegStrictDom variable. NFC.
D117544 removed the only substantive use of RegStrictDom. Now we can
simplify by using StrictDom for everything.
Jay Foad [Tue, 22 Nov 2022 16:59:53 +0000 (16:59 +0000)]
[AMDGPU] Define and use new allZeroWaitcnt helper. NFC.
Florian Hahn [Tue, 22 Nov 2022 16:55:08 +0000 (16:55 +0000)]
[AArch64] Add zext test with scalable vectors.
Mark de Wever [Tue, 15 Nov 2022 18:53:30 +0000 (19:53 +0100)]
[libc++][format] Fixes visit_format_arg.
The Standard specifies which types are stored in the basic_format_arg
"variant" and which types are stored as a handle. Libc++ stores
additional types in the "variant". During a reflector discussion
@jwakely mentioned this is user observable; visit_format_arg uses the type
instead of a handle as argument.
This optimization is useful and will probably be used for other small
types in the future. To be conforming the visitor creates a handle and
uses that as argument. There is a second visitor so the formatter can
still directly access the 128-bit integrals.
The test for the visitor and get has been made public too; there is no
reason not to. The 128-bit integral types are required by the Standard,
when they are available.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D138052
Matthias Springer [Tue, 22 Nov 2022 16:26:19 +0000 (17:26 +0100)]
[mlir][tensor] Add dim(expand_shape/collapse_shape) folding
Differential Revision: https://reviews.llvm.org/D138487
Timm Bäder [Fri, 18 Nov 2022 13:13:18 +0000 (14:13 +0100)]
[clang][Parse] Remove constant expression from if condition
MaybeTypeCast here is not a variable, it's an enum member with value 1.
Differential Revision: https://reviews.llvm.org/D138289
bixia1 [Mon, 21 Nov 2022 05:57:30 +0000 (21:57 -0800)]
[mlir][sparse] Fix a bug in concatenate operator rewriting.
When calculating the dynamic dimensions for the concatenate result, we
shouldn't accumulate the sizes for the non-concatenating dimensions.
Reviewed By: aartbik, Peiming
Differential Revision: https://reviews.llvm.org/D138436
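A minimal sketch of the intended size computation, assuming all inputs agree on every non-concatenating dimension. `Shape` and `concatShape` are illustrative names, not the MLIR rewriting code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<std::int64_t>;

// Computes the result shape of concatenating Inputs along ConcatDim.
// Assumes at least one input and matching sizes in all other dimensions.
Shape concatShape(const std::vector<Shape> &Inputs, std::size_t ConcatDim) {
  assert(!Inputs.empty() && ConcatDim < Inputs.front().size());
  Shape Result = Inputs.front();
  for (std::size_t I = 1; I < Inputs.size(); ++I) {
    // Only the concatenating dimension accumulates; the other
    // dimensions keep the size taken from the first input.
    Result[ConcatDim] += Inputs[I][ConcatDim];
  }
  return Result;
}
```

Concatenating a 2x3 and a 4x3 input along dimension 0 gives 6x3 here; accumulating dimension 1 as well would wrongly give 6x6.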
Matt Arsenault [Sun, 20 Nov 2022 17:08:34 +0000 (09:08 -0800)]
clang/HIP: Add new header test for math IR gen
The current header testing is pretty thin. This is in
preparation for a series of patches to replace many
builtin implementations.
I did try to stress everything in this header, but skipped
a few things. Mostly I didn't understand why we have
various language version checks which skip defining some
things. It doesn't seem right to have any of these if guards
on __cplusplus, __HIPCC_RTC__, and __OPENMP_AMDGCN__.
Matt Arsenault [Mon, 21 Nov 2022 03:07:01 +0000 (19:07 -0800)]
LoopDeletion: Fix missing newlines in debug printing
Yitzhak Mandelbaum [Thu, 3 Nov 2022 00:36:58 +0000 (00:36 +0000)]
[clang][dataflow] Add widening API and implement it for built-in boolean model.
* Adds API support for widening of lattice elements and environments,
* Updates the algorithm to apply widening where appropriate,
* Implements widening for boolean values. In the process, moves the unsoundness
of comparison from the default implementation of
`Environment::ValueModel::compare` to model-specific handling inside
`DataflowEnvironment::equivalentTo`. This change is intended to clarify
the source and location of unsoundness.
This patch is a replacement for, and was based substantially on, https://reviews.llvm.org/D131645.
Differential Revision: https://reviews.llvm.org/D137948
Matthias Springer [Tue, 22 Nov 2022 16:03:32 +0000 (17:03 +0100)]
[mlir][bufferize][NFC] Minor code and comment cleanups
Differential Revision: https://reviews.llvm.org/D135056
Kelvin Li [Tue, 22 Nov 2022 15:09:47 +0000 (10:09 -0500)]
[Flang] Removing Float Bessel functions for AIX
AIX libc only provides bessel functions j0,j1,jn and y0,y1,yn but
does not have their float equivalents j0f,j1f,jnf and y0f,y1f,ynf.
Committed on behalf of madanial
Differential Revision: https://reviews.llvm.org/D136128
David Spickett [Mon, 21 Nov 2022 17:01:11 +0000 (17:01 +0000)]
[libcxx] Add BOT_OWNERS.txt
Buildkite doesn't provide a way to list bot owners so currently
we are pinging people on Discord and Phabricator,
which works OK until that person is on vacation. This file gives us
a place to list multiple people, or group contacts for each bot.
I've stuck to the CODE_OWNERS.txt format because there's no great
reason to change it.
Reviewed By: #libc, EricWF, ldionne
Differential Revision: https://reviews.llvm.org/D138445
Vy Nguyen [Fri, 18 Nov 2022 20:21:23 +0000 (15:21 -0500)]
[lld-macho] Fix bug in CUE folding that resulted in wrong unwind table.
PR/59070
Differential Revision: https://reviews.llvm.org/D138320
Yaxun (Sam) Liu [Thu, 17 Nov 2022 17:04:57 +0000 (12:04 -0500)]
[HIP] Fix lld failure when device object is empty
When -fgpu-rdc is used for linking relocatable objects, clang driver launches
clang-offload-bundler to extract a device relocatable object from each input
relocatable object file and passes the extracted files to lld. The input relocatable
object file could either come from HIP program or C++ program. The relocatable
object file from C++ program does not contain device relocatable objects, therefore
clang-offload-bundler extracts an empty file and passes it to lld. lld treats
an empty file as a linker script. When there is no object input file to lld, lld
will emit the error:
target emulation unknown: -m or at least one .o file required
This patch adds "elf64_amdgpu" to lld so that lld always knows the target
no matter whether there are object input files or not.
Reviewed by: Artem Belevich, Fangrui Song
Differential Revision: https://reviews.llvm.org/D138221
Stefan Pintilie [Tue, 22 Nov 2022 14:41:14 +0000 (08:41 -0600)]
[PowerPC] Add handling for WACC register spilling.
This patch adds spilling for the new WACC registers.
In order to get the spilling test to work the MMA instructions from Power 10 are
now supported for Future CPU except that they are all using the new WACC
registers instead of the ACC registers from Power 10.
Reviewed By: amyk, saghir
Differential Revision: https://reviews.llvm.org/D136728
Sanjay Patel [Tue, 22 Nov 2022 14:28:55 +0000 (09:28 -0500)]
[VectorCombine] switch on opcode to compile faster
This follows 87debdadaf18 to further eliminate wasting time
calling helper functions only to early return to the main
run loop.
Once again, this results in significant savings based on
experimental data:
https://llvm-compile-time-tracker.com/compare.php?from=01023bfcd33f922ed8c934ce563e54abe8bfe246&to=3dce4f70b73e48ccb045decb634c185e6b4c67d5&stat=instructions:u
This is NFCI other than making the pass faster. The total
cost of VectorCombine runs in an -O3 build appears to be
well under 0.1% of compile-time now, so there's not much
left to do AFAICT.
There's a TODO about making the code cleaner, but it
probably doesn't change timing much. I didn't include those
changes here because it requires updating much more code.
Pavel Labath [Sun, 6 Nov 2022 08:43:27 +0000 (09:43 +0100)]
[lldb] rm include/lldb/Host/posix/Fcntl.h
File is unused.
Pavel Labath [Sun, 6 Nov 2022 08:45:17 +0000 (09:45 +0100)]
Add include guards for PlatformQemuUser.h
Oleksandr "Alex" Zinenko [Tue, 22 Nov 2022 15:08:47 +0000 (16:08 +0100)]
[mlir] fix incorrect summary/description in doc
Summary is the short one.
Paul Walker [Fri, 11 Nov 2022 21:01:59 +0000 (21:01 +0000)]
[SVE] Fix incorrect predicate for fixed length int/fp conversion.
When performing shrinking int/fp conversions the predicate should
be created to match the original fixed length vector type so the
unused lanes don't trigger side effects.
This patch also includes related refactoring to better detect such
issues and streamline the code a little.
Differential Revision: https://reviews.llvm.org/D138351
Philip Pfaffe [Tue, 22 Nov 2022 14:29:19 +0000 (14:29 +0000)]
[lldb] Allow plugins to extend DWARF expression parsing for vendor extensions
Parsing DWARF expressions currently does not support DW_OPs that are vendor
extensions. With this change expression parsing calls into SymbolFileDWARF for
unknown opcodes, which is the semantically "closest" plugin that we have right
now. Plugins can then extend SymbolFileDWARF to add support for vendor
extensions.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D137247
Pierre van Houtryve [Tue, 22 Nov 2022 14:14:46 +0000 (14:14 +0000)]
[AMDGPU][NFC] Remove isLegalVOP3PShuffleMask
Unused function since D134967
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138493
Jan Sjodin [Wed, 16 Nov 2022 13:54:05 +0000 (08:54 -0500)]
[OpenMP][OMPIRBuilder] Add a configuration class that captures flags that affect codegen
This patch introduces the OpenMPIRBuilderConfig class which contains various
flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to
keep the flags in one place so they do not have to be passed in every time.
The flags can be set optionally since some use cases don't rely on functions
that depend on these flags.
Reviewed By: jdoerfert, tschuett
Differential Revision: https://reviews.llvm.org/D138220
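The idea of a configuration object whose flags are optional can be sketched as follows. `BuilderConfig` and its member names are hypothetical and do not reproduce the actual OpenMPIRBuilderConfig interface.

```cpp
#include <cassert>
#include <optional>

// Hypothetical configuration class; all names are illustrative only.
class BuilderConfig {
  std::optional<bool> IsDevice;
  std::optional<bool> UnifiedMemory;

public:
  void setIsDevice(bool V) { IsDevice = V; }
  void setUnifiedMemory(bool V) { UnifiedMemory = V; }

  bool hasIsDevice() const { return IsDevice.has_value(); }
  bool hasUnifiedMemory() const { return UnifiedMemory.has_value(); }

  // Getters assert that the flag was set, so a code path that depends
  // on an unset flag fails loudly instead of using a silent default.
  bool isDevice() const {
    assert(IsDevice.has_value() && "IsDevice was never set");
    return *IsDevice;
  }
  bool unifiedMemory() const {
    assert(UnifiedMemory.has_value() && "UnifiedMemory was never set");
    return *UnifiedMemory;
  }
};
```

A client that never calls a flag-dependent function can leave the corresponding flag unset; one that does sets it once on the config object rather than passing it to every call.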
Ties Stuij [Tue, 22 Nov 2022 12:38:47 +0000 (12:38 +0000)]
[AArch64][clang] implement 2022 General Data-Processing instructions
This patch implements the 2022 Architecture General Data-Processing Instructions
They include:
Common Short Sequence Compression (CSSC) instructions
- scalar comparison instructions
SMAX, SMIN, UMAX, UMIN (32/64 bits) with or without immediate
- ABS (absolute), CNT (count non-zero bits), CTZ (count trailing zeroes)
- command-line options for CSSC
Associated with these instructions in the documentation is the Range Prefetch
Memory (RPRFM) instruction, which signals to the memory system that data memory
accesses from a specified range of addresses are likely to occur in the near
future. The instruction lies in hint space, and is made unconditional.
Specs for the individual instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09/Base-Instructions/
contributors to this patch:
- Cullen Rhodes
- Son Tuan Vu
- Mark Murray
- Tomas Matheson
- Sam Elliott
- Ties Stuij
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D138488
Pierre van Houtryve [Tue, 22 Nov 2022 08:35:02 +0000 (08:35 +0000)]
[AMDGPU][GISel] Select llvm.amdgcn.fcmp intrinsics
Adds FP CCs opcodes/selection logic, including src mods selection
Depends on D136591, D136448
Resolves #58326 (https://github.com/llvm/llvm-project/issues/58326)
Reviewed By: arsenm, foad
Differential Revision: https://reviews.llvm.org/D136592
Youling Tang [Tue, 22 Nov 2022 14:08:47 +0000 (22:08 +0800)]
[Sanitizer] Fix the implementation of internal_fstat on LoongArch
If `pathname` is an empty string and the AT_EMPTY_PATH flag is specified in `flags`,
statx `pathname` argument is of type `const char *restrict`, so it should be `""`
instead of `0`.
Reviewed By: SixWeining, xen0n, xry111, lixing-star
Differential Revision: https://reviews.llvm.org/D138414
Valentin Clement [Tue, 22 Nov 2022 14:13:18 +0000 (15:13 +0100)]
[flang][NFC] Switch CollectBindings return to SymbolVector
As suggested on D138129, switching return of CollectBindings
function to SymbolVector.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D138419
Youling Tang [Tue, 22 Nov 2022 14:02:30 +0000 (22:02 +0800)]
[scudo] Add loongarch64 support for scudo
Enable scudo on LoongArch64 on both clang side and compiler-rt side.
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D138350
Valentin Clement [Tue, 22 Nov 2022 13:47:05 +0000 (14:47 +0100)]
[flang] Set initial size and type code for unlimited polymorphic descriptor
Initialization of unlimited polymorphic descriptor was raising an error.
This patch sets a default size and type code for unlimited polymorphic descriptor
that will be updated once allocated/assigned.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138479
David Green [Tue, 22 Nov 2022 13:47:56 +0000 (13:47 +0000)]
[SelectOptimize] Add some debug logging. NFC
This adds some quick debug messages for the SelectOptimize pass, with
some information on the costs that are measured from getInstructionCost
calls, and re-uses the existing optimization remarks to print some
information about whether transforms were performed or not.
Differential Revision: https://reviews.llvm.org/D138108
Dmitry Bushev [Tue, 22 Nov 2022 11:52:10 +0000 (14:52 +0300)]
[RISCV][NFC] Mark rs1 in most memory instructions as memory operand.
Marking rs1 (memory offset base) as memory operand provides additional
semantic value to this operand that can be used by different tools
(e.g. llvm-exegesis).
This change affects neither ISel nor the assembler. However, it
required some tweaks in the tablegen compressed inst emitter.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D136847
Matthias Springer [Tue, 22 Nov 2022 13:29:47 +0000 (14:29 +0100)]
[mlir][bufferize][NFC] Rename DialectAnalysisState and move to OneShotAnalysis
`DialectAnalysisState` is now `OneShotAnalysisState::Extension`.
This state extension mechanism is needed only for One-Shot Analysis, so it is moved from `BufferizableOpInterface.h` to `OneShotAnalysis.h`.
Extensions are now identified via TypeIDs instead of StringRefs. The API of state extensions is cleaned up and follows the same pattern as other extension mechanisms in MLIR (e.g., `transform::TransformState::Extension`).
Also delete some dead code.
Differential Revision: https://reviews.llvm.org/D135051
Fahad Nayyar [Tue, 22 Nov 2022 13:26:21 +0000 (13:26 +0000)]
[Clang][Sema] Added space after ',' in a warning
This change fixes a typo in a warning message.
rdar://79707705
Thomas Symalla [Mon, 14 Nov 2022 08:45:58 +0000 (09:45 +0100)]
[InstCombine] Fold extractelt with select of constants
An extractelt with a constant index which extracts an element from the
two vector operands of a select can be directly folded into a select.
  extractelt (select %x, %vec1, %vec2), %const ->
  select %x, %vec1[%const], %vec2[%const]
Note: the implementation currently only works for constant vector operands.
Reviewed By: foad, spatel
Differential Revision: https://reviews.llvm.org/D137934
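The equivalence behind this fold can be checked with plain arrays. This is only a model of the semantics, assuming a scalar select condition; `Vec4` and both function names are illustrative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;

// Models: extractelt (select Cond, V1, V2), Idx
int extractOfSelect(bool Cond, const Vec4 &V1, const Vec4 &V2,
                    std::size_t Idx) {
  const Vec4 &Selected = Cond ? V1 : V2;
  return Selected[Idx];
}

// Models: select Cond, V1[Idx], V2[Idx]
int selectOfExtracts(bool Cond, const Vec4 &V1, const Vec4 &V2,
                     std::size_t Idx) {
  return Cond ? V1[Idx] : V2[Idx];
}
```

When both vectors are constants, `V1[Idx]` and `V2[Idx]` are known at compile time, so the second form needs no vector operation at all.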
Stefan Gränitz [Mon, 7 Nov 2022 11:23:58 +0000 (12:23 +0100)]
[CGObjC] Add run line for release mode in test arc-exceptions-seh.mm (NFC)
In release mode `arc-exceptions-seh.mm` fails. It needs `-enable-objc-arc-opts=false` to skip ObjC ARC optimizations.
Reviewed By: triplef
Differential Revision: https://reviews.llvm.org/D137942
Nuno Lopes [Tue, 22 Nov 2022 12:41:22 +0000 (12:41 +0000)]
Revert "[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]"
This reverts commit f50423c1a4422900aa1240fed643f5920451a88d.
chenglin.bi [Tue, 22 Nov 2022 12:39:25 +0000 (20:39 +0800)]
[AMDGPU] precommit test for D138401; NFC
esmeyi [Tue, 22 Nov 2022 12:17:44 +0000 (07:17 -0500)]
[XCOFF] set fragment for XMC_PR csects.
Summary: -xcoff-traceback-table is a default option on AIX regardless of optimization and debug levels. The error "relocation for paired relocatable term is not yet supported" occurred in XCOFFObjectWriter::recordRelocation when both -xcoff-traceback-table and -function-sections were enabled.
The root cause is that we failed to compute the symbol difference as an absolute value before adding fixups when symbol_A, without the fragment set, is the csect itself and symbol_B is in it.
This patch only sets the fragment for XMC_PR csects because we don't have other cases that hit this problem yet.
Reviewed By: DiggerLin, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D137230
Yi Kong [Tue, 22 Nov 2022 11:11:42 +0000 (20:11 +0900)]
Revert "[libc++] Remove workarounds for systems that used to require __need_XXX macros"
This reverts commit 119cef40d18c48240854edc553dca61c4e9fdf27.
The change broke multiple builders.
Manuel Brito [Tue, 22 Nov 2022 11:40:06 +0000 (11:40 +0000)]
[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]
Differential Revision: https://reviews.llvm.org/D138483
Stefan Gränitz [Tue, 22 Nov 2022 09:13:33 +0000 (10:13 +0100)]
[CGObjC] Open cleanup scope before SaveAndRestore CurrentFuncletPad and push CatchRetScope early
Pushing the `CatchRetScope` early causes cleanups for catch parameters to be emitted in the basic block of the catch handler instead of the `catchret.dest` block. This is important because the latter is not part of the catchpad and this caused code truncations due to ARC PreISel intrinsics in WinEHPrepare.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D137939
Piotr Sobczak [Tue, 22 Nov 2022 09:35:07 +0000 (10:35 +0100)]
[AMDGPU] Add encoding tests for SALU_CYCLE_2/3
Add missing assembler/disassembler tests for INSTID_SALU_CYCLE_2
and INSTID_SALU_CYCLE_3 which are possible arguments in S_DELAY_ALU.
Differential Revision: https://reviews.llvm.org/D138482
Lorenzo Chelini [Tue, 22 Nov 2022 09:19:34 +0000 (10:19 +0100)]
[MLIR][Tensor] Use the existing helper function `applyPermutationToVector` (NFC)
Avoid duplicate code by using an existing helper function to interchange
a vector based on a permutation. Address comments that emerged after landing
D138119.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D138480
Matthias Springer [Tue, 22 Nov 2022 10:20:41 +0000 (11:20 +0100)]
[mlir][SCF] Add tensor.dim(scf.foreach_thread) folding
Dim sizes of `scf.foreach_thread` op results match the dim sizes of their respective tied shared_outs operands.
Differential Revision: https://reviews.llvm.org/D138484
Alexander Belyaev [Tue, 22 Nov 2022 09:58:23 +0000 (10:58 +0100)]
[mlir] Update custom<DynamicIndexList> for Pack/Unpack.
Alexander Belyaev [Tue, 22 Nov 2022 07:55:59 +0000 (08:55 +0100)]
[mlir] Clean-up ViewLikeOpInterface w.r.t. kDynamic change.
Differential Revision: https://reviews.llvm.org/D138478
Benjamin Kramer [Tue, 22 Nov 2022 09:46:52 +0000 (10:46 +0100)]
[bazel] Add missing dependency after 9aa505a28d
zhanghb97 [Fri, 11 Nov 2022 08:01:05 +0000 (16:01 +0800)]
[mlir] Initial MLIR VP intrinsic integration test on host and RVV emulator.
This patch adds the initial VP intrinsic integration test on the host backend and RVV emulator. Please see more detailed [discussion on the discourse](https://discourse.llvm.org/t/mlir-vp-ops-on-rvv-backend-integration-test-and-issues-report/66343).
- Run the test cases on the host by configuring the CMake option: `-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`
- Build the RVV environment and run the test cases on RVV QEMU by [this doc](https://gist.github.com/zhanghb97/ad44407e169de298911b8a4235e68497).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137816
Han-Kuan Chen [Tue, 8 Nov 2022 09:43:11 +0000 (01:43 -0800)]
[CodeGen] Refactor visitSCALAR_TO_VECTOR. NFC.
Differential Revision: https://reviews.llvm.org/D137688
WuXinlong [Mon, 21 Nov 2022 03:20:41 +0000 (11:20 +0800)]
[RISCV] Add CodeGen support and MC testcase of RISCV Zca Extension
This patch adds support for the RISCV Zca extension.
`Zca` is a subset of C extension instructions that are compatible with the Zc extension.
So this patch implements Zca code generation with reference to the C extension and sets the 2-byte alignment for the Zca extension, just like the C extension does.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D130483
Pierre van Houtryve [Tue, 22 Nov 2022 09:13:57 +0000 (09:13 +0000)]
[AMDGPU] Make aperture registers 64 bit
Makes the SRC_(SHARED|PRIVATE)_(BASE|LIMIT) registers 64 bit instead of 32.
They're still usable as 32 bit operands by using the _LO suffix.
Preparation for D137542
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D137767
Valentin Clement [Tue, 22 Nov 2022 09:11:50 +0000 (10:11 +0100)]
Revert "[flang][NFC] Switch CollectBindings return to SymbolVector"
This reverts commit 97e8eeb758fcae4f2afd9ac516ffc9509b4daaf0.
Max Kazantsev [Tue, 22 Nov 2022 08:52:49 +0000 (15:52 +0700)]
[SCEV][NFC] Introduce API for getting basic block's symbolic max exit count
Currently, it just returns the exact exit count. This is a refactoring step
before it is actually implemented.
gonglingqin [Tue, 22 Nov 2022 08:16:09 +0000 (16:16 +0800)]
[LoongArch] Fix issue on CMake Xcode build configuration
Add missing dependency for loongarch-resource-headers. This patch follows D126892 to repair the same error.
Differential Revision: https://reviews.llvm.org/D138403
Valentin Clement [Tue, 22 Nov 2022 08:42:32 +0000 (09:42 +0100)]
[flang][NFC] Switch CollectBindings return to SymbolVector
As suggested on D138129, switching return of CollectBindings
function to SymbolVector.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D138419
Valentin Clement [Tue, 22 Nov 2022 08:41:09 +0000 (09:41 +0100)]
[flang] Do not propagate type desc when box type is not polymorphic
When the rhs is non-polymorphic the type descriptor should not
be propagated. An error in the EmboxOp verifier was raised in that case.
This patch propagates the type descriptor only if the result type of the
EmboxOp operation is polymorphic.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D138442
Alvin Wong [Tue, 22 Nov 2022 08:33:35 +0000 (16:33 +0800)]
[libcxx] Fix std::equal not accepting volatile types by refactoring __equal_to
Fixes https://github.com/llvm/llvm-project/issues/59021
Reviewed By: #libc, philnik
Differential Revision: https://reviews.llvm.org/D138268
Max Kazantsev [Tue, 22 Nov 2022 08:18:05 +0000 (15:18 +0700)]
[SCEV][NFC] Call getExitCount with SymbolicMaximum when computing loop symbolic max
Currently this is NFC, because SymbolicMaximum for BB is not implemented and just
reuses exact result. However, from code purity perspective, it's a necessary step
to do. Plans to implement symbolic max for blocks are underway.
Pierre van Houtryve [Tue, 22 Nov 2022 08:23:29 +0000 (08:23 +0000)]
[AMDGPU][GISel] Add llvm.amdgcn.icmp selection
Add missing logic to select i16 variants and enable GISel testing.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D136448
Lorenzo Chelini [Tue, 15 Nov 2022 09:30:50 +0000 (10:30 +0100)]
Introduce `tensor.pack` and `tensor.unpack` operations
Pack and Unpack return new tensors within which the individual elements
are reshuffled according to the packing specification. This has the
consequence of modifying the canonical order in which a given operator
(i.e., Matmul) accesses the individual elements. After bufferization,
this typically translates to increased access locality and cache
behavior improvement, e.g., eliminating cache line splitting.
Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com>
Co-authored-by: Han-Chung Wang <hanchung@google.com>
RFC: https://discourse.llvm.org/t/rfc-tensor-pack-and-tensor-unpack/66408/1
Reviewed By: nicolasvasilache, rengolin, hanchung
Differential Revision: https://reviews.llvm.org/D138119
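To make the reshuffling concrete, here is a simplified model of packing a 2-D row-major matrix into tiles. It assumes the tile sizes divide the matrix sizes evenly and uses a fixed `[row tile][column tile][row in tile][column in tile]` layout; `packTiles` is an illustrative helper and does not cover the full `tensor.pack` specification.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Packs a row-major Rows x Cols matrix into TileR x TileC tiles.
// The result is laid out as [Rows/TileR][Cols/TileC][TileR][TileC],
// so the elements of one tile become contiguous in memory.
std::vector<int> packTiles(const std::vector<int> &Src, std::size_t Rows,
                           std::size_t Cols, std::size_t TileR,
                           std::size_t TileC) {
  assert(TileR != 0 && TileC != 0);
  assert(Src.size() == Rows * Cols);
  assert(Rows % TileR == 0 && Cols % TileC == 0);
  std::vector<int> Dst(Src.size());
  std::size_t Out = 0;
  for (std::size_t TR = 0; TR < Rows / TileR; ++TR)
    for (std::size_t TC = 0; TC < Cols / TileC; ++TC)
      for (std::size_t R = 0; R < TileR; ++R)
        for (std::size_t C = 0; C < TileC; ++C)
          Dst[Out++] = Src[(TR * TileR + R) * Cols + (TC * TileC + C)];
  return Dst;
}
```

For a 2x4 matrix holding 0..7 and 2x2 tiles, the packed order is 0, 1, 4, 5, 2, 3, 6, 7: each tile's elements sit next to each other, which is where the locality benefit comes from.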
Chen Zheng [Tue, 22 Nov 2022 07:40:30 +0000 (02:40 -0500)]
[PowerPC][GISel]add support for float point arithmetic operations
Add global isel support for G_FADD, G_FSUB, G_FMUL, G_FDIV.
Reviewed By: Kai, nemanjai, arsenm, amyk
Differential Revision: https://reviews.llvm.org/D132942
gonglingqin [Tue, 22 Nov 2022 07:23:49 +0000 (15:23 +0800)]
[LoongArch] Support when the depth of __builtin_frame_address is greater than zero
As discussed in D137541, this patch handles the case where the depth of
__builtin_frame_address is greater than 0 instead of reporting an error.
Unsafe calls rely on the '-Wframe-address' option for diagnosis.
Differential Revision: https://reviews.llvm.org/D138084
Chen Zheng [Tue, 22 Nov 2022 07:23:10 +0000 (07:23 +0000)]
[PowerPC] store the LR before stack update for big offsets.
In the case where LROffset + FrameSize cannot be encoded in the LR
store instruction, we have to store the LR before the stack update.
Chen Zheng [Tue, 22 Nov 2022 07:17:20 +0000 (07:17 +0000)]
[PowerPC][NFC] add test case for mflr store fix
David Green [Tue, 22 Nov 2022 07:23:56 +0000 (07:23 +0000)]
[LoopFlatten] Fix IV increment use count
The add from the IV in the inner loop was always checking for 2 uses,
the phi and the compare. The compare could be based on the phi though,
leaving one valid use of the compare. In the testcase we could be left
with the phi and a lcssa phi as the two users, invalidly allowing
flattening where we shouldn't.
Fixes 58441
Differential Revision: https://reviews.llvm.org/D138404
Mahesh Ravishankar [Wed, 16 Nov 2022 07:52:34 +0000 (07:52 +0000)]
[mlir][Linalg] Avoid unnecessary propagating producer result to fused op result.
Elementwise op fusion conserves the result of the producer in the
fused op, relying on later clean up patterns to drop unused results of
the fused op. Instead, if the producer result has no other use apart
from the consumer op, avoid making the producer result available in
the fused node. This saves some unnecessary IR manipulations.
Differential Revision: https://reviews.llvm.org/D138096
Max Kazantsev [Tue, 22 Nov 2022 05:53:37 +0000 (12:53 +0700)]
[IndVarSimplify] Lift limitations on IV being a Phi for turn-to-invariant
These limitations are too strict, and their only purpose is to avoid code
size explosion. These restrictions seem obsolete, and the size problem
is solved in other places through cheap expansion limits.
The motivation is that the old code cannot deal with comparisons against
induction variable's increment.
Differential Revision: https://reviews.llvm.org/D138412
Reviewed By: lebedev.ri, reames
KAWASHIMA Takahiro [Mon, 14 Nov 2022 05:09:27 +0000 (14:09 +0900)]
[clang] Fix -fp-model={strict|precise} to disable -fapprox-func
`-fapprox-func` should be disabled by `-fp-model={strict|precise}`,
as well as other fast-math flags. See the last changes in
`clang/test/Driver/fp-model.c`.
Probably this route (`case options::OPT_ffp_model_EQ`) was overlooked
in D106191 and D114564. There is no good reason not
to disable the flag.
This commit also updates other regression tests, which are not directly
related to this bug, for consistency with other fast-math flags.
Differential Revision: https://reviews.llvm.org/D138109
Craig Topper [Tue, 22 Nov 2022 03:23:08 +0000 (19:23 -0800)]
[RISCV] Remove SExtWRemovalCands set from RISCVSExtWRemoval.
After D137970, we do the fixable instruction conversion in place
so we don't need to worry about iterator invalidation. This lets
us do the conversion and updates in a single loop.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D138043
Craig Topper [Tue, 22 Nov 2022 03:22:51 +0000 (19:22 -0800)]
[RISCV] Transform fixable instruction in place in RISCVSExtWRemoval. NFC
Instead of creating a new instruction and copying operands, we can
use setDesc to convert in place.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D137970
Craig Topper [Tue, 22 Nov 2022 03:16:40 +0000 (19:16 -0800)]
[RISCV] Prevent constant hoisting for (and (shl X, C), mask<<C)
If the immediate is a shifted mask, we will use a pair of shifts
and never materialize the immediate. Consider the immediate free.
Reviewed By: reames, luismarques
Differential Revision: https://reviews.llvm.org/D138260
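The claim that the immediate never needs to be materialized rests on a bit identity, sketched below for 64-bit values. It assumes the mask is N contiguous low bits shifted left by C; the function names are illustrative and this is not the backend's selection code.

```cpp
#include <cassert>
#include <cstdint>

// Reference form: (X << C) & (Mask << C), where Mask has N low bits set.
std::uint64_t andWithShiftedMask(std::uint64_t X, unsigned C, unsigned N) {
  assert(N >= 1 && N + C <= 64);
  std::uint64_t Mask = (N == 64) ? ~0ULL : ((1ULL << N) - 1);
  return (X << C) & (Mask << C);
}

// Shift-pair form: shift left to drop the high bits, then shift right
// (logically) to place the surviving N bits at position C. No mask
// constant is needed.
std::uint64_t shiftPair(std::uint64_t X, unsigned C, unsigned N) {
  assert(N >= 1 && N + C <= 64);
  return (X << (64 - N)) >> (64 - N - C);
}
```

Both functions keep the low N bits of X and place them at bit position C, which is why the shifted mask can be treated as free when costing the constant.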
Kazu Hirata [Tue, 22 Nov 2022 03:06:42 +0000 (19:06 -0800)]
Return None instead of Optional<T>() (NFC)
This patch replaces:
  return Optional<T>();
with:
  return None;
to make the migration from llvm::Optional to std::optional easier.
Specifically, I can deprecate None (in my source tree, that is) to
identify all the instances of None that should be replaced with
std::nullopt.
Note that "return None" far outnumbers "return Optional<T>();". There
are more than 2000 instances of "return None" in our source tree.
All of the instances in this patch come from functions that return
Optional<T> except Archive::findSym and ASTNodeImporter::import, where
we return Expected<Optional<T>>. Note that we can construct
Expected<Optional<T>> from any parameter convertible to Optional<T>,
which None certainly is.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Differential Revision: https://reviews.llvm.org/D138464
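The end state of the migration can be illustrated with `std::optional`, where `std::nullopt` plays the role that `None` plays for `llvm::Optional`. The two parsing functions below are made-up examples, not code from the patch.

```cpp
#include <cassert>
#include <optional>

// Verbose form: spells out the optional type just to say "no value".
std::optional<int> parseDigitVerbose(char C) {
  if (C < '0' || C > '9')
    return std::optional<int>();
  return C - '0';
}

// Concise form: the empty-state tag converts to any optional type.
std::optional<int> parseDigit(char C) {
  if (C < '0' || C > '9')
    return std::nullopt;
  return C - '0';
}
```

The two functions behave identically; the tag form is simply easier to find and rewrite mechanically.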
Kazu Hirata [Tue, 22 Nov 2022 03:03:40 +0000 (19:03 -0800)]
Don't use Optional::getPointer (NFC)
Since std::optional does not offer getPointer(), this patch replaces
X.getPointer() with &*X to make the migration from llvm::Optional to
std::optional easier.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Differential Revision: https://reviews.llvm.org/D138466
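A small sketch of the `&*X` idiom using `std::optional`. `pointerTo` is a hypothetical helper; unlike a bare `&*X`, it checks for the empty state first, because dereferencing an empty optional is undefined behavior.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Returns a pointer to the contained value, or nullptr when X is empty.
const std::string *pointerTo(const std::optional<std::string> &X) {
  return X ? &*X : nullptr;
}
```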
Phoebe Wang [Tue, 22 Nov 2022 01:48:43 +0000 (09:48 +0800)]
[X86] Allow no X87 on 32-bit
This patch is an alternative to D100091. It solves the problems in `f80` type lowering.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D137946
Vitaly Buka [Tue, 22 Nov 2022 02:44:43 +0000 (18:44 -0800)]
[test][asan] Another try to fix Windows bot
Update pattern on Linux and Darwin for consistency.
Stephen Neuendorffer [Tue, 22 Nov 2022 02:03:30 +0000 (18:03 -0800)]
[llvm-link] Fix options of llvm-link
This tool only parsed options after creating the LLVMContext.
Unfortunately, this means that some options, such as --opaque-pointers,
which are read when the LLVMContext is created are impossible to
set from the command line. This patch moves the LLVMContext creation
after the option parsing.
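The ordering problem can be modelled in a few lines. Everything here is hypothetical: `Context` stands in for an object that reads a global option once at construction, and `parseOptions` stands in for command-line parsing; neither is the LLVM API.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical global flag standing in for a command-line option.
static bool OpaquePointersFlag = false;

// Hypothetical context that reads the flag once, at construction time.
struct Context {
  bool OpaquePointers;
  Context() : OpaquePointers(OpaquePointersFlag) {}
};

// Hypothetical parser: sets the flag when the option is present.
void parseOptions(int Argc, const char *const *Argv) {
  for (int I = 1; I < Argc; ++I)
    if (std::string_view(Argv[I]) == "--opaque-pointers")
      OpaquePointersFlag = true;
}

// Old order: the context is created before the options are parsed,
// so it never sees the option.
bool contextThenParse(int Argc, const char *const *Argv) {
  OpaquePointersFlag = false;
  Context Ctx;
  parseOptions(Argc, Argv);
  return Ctx.OpaquePointers;
}

// New order: the options are parsed first, so the context sees them.
bool parseThenContext(int Argc, const char *const *Argv) {
  OpaquePointersFlag = false;
  parseOptions(Argc, Argv);
  Context Ctx;
  return Ctx.OpaquePointers;
}
```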
Vitaly Buka [Tue, 22 Nov 2022 01:45:51 +0000 (17:45 -0800)]
[test][asan] Replace tr with sed
tr is not available on Windows bot.
Nico Weber [Tue, 22 Nov 2022 01:48:11 +0000 (20:48 -0500)]
[gn build] Add missing dep from check-bolt on llvm-bat-dump
KAWASHIMA Takahiro [Wed, 16 Nov 2022 04:57:52 +0000 (13:57 +0900)]
[clang][docs] Correct indent of option explanation
Indentation is significant for Sphinx. Lines with indentation after
a `.. option::` line are treated as explanations of the option.
KAWASHIMA Takahiro [Wed, 16 Nov 2022 04:56:27 +0000 (13:56 +0900)]
[clang][docs] Remove an unnecessary space
KAWASHIMA Takahiro [Wed, 16 Nov 2022 04:23:51 +0000 (13:23 +0900)]
[clang][docs] Use `option` directive in User's Manual
Sphinx has the `option` directive. Most option descriptions
in `clang/docs/UsersManual.rst` used it but some didn't.
This commit changes the remaining option descriptions to use
the `option` directive. This makes a consistent view in HTML.
The `option` directive automatically creates a cross-reference target.
So labeling by `.. _opt_XXX:` is almost unnecessary. However, options
with and without `no-` (e.g. `-fno-show-column`/`-fshow-column`)
cannot be distinguished for the cross-reference. So some required
`.. _opt_XXX:` directives are kept.
Differential Revision: https://reviews.llvm.org/D138088
Vitaly Buka [Tue, 22 Nov 2022 01:36:47 +0000 (17:36 -0800)]
[test][asan] Try to fix Windows bot
Peiming Liu [Wed, 16 Nov 2022 23:18:16 +0000 (23:18 +0000)]
[mlir][sparse] support affine expression on dense dimensions (except constant affine)
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D138169
Christopher Di Bella [Thu, 17 Nov 2022 07:36:37 +0000 (07:36 +0000)]
[libcxx] adds an include-what-you-use (IWYU) mapping file
This makes it possible for programmers to run IWYU and get more accurate
standard library inclusions. Prior to this commit, the following program
would be transformed thusly:
```cpp
// Before
#include <algorithm>
#include <vector>
void f() {
  auto v = std::vector{0, 1};
  std::find(std::ranges::begin(v), std::ranges::end(v), 0);
}
```
```cpp
// After
#include <__algorithm/find.h>
#include <__ranges/access.h>
#include <vector>
...
```
There are two ways to fix this issue: to use [comment pragmas](https://github.com/include-what-you-use/include-what-you-use/blob/master/docs/IWYUPragmas.md)
on every private include, or to write a canonical [mapping file](https://github.com/include-what-you-use/include-what-you-use/blob/master/docs/IWYUMappings.md)
that provides the tool with a manual on how libc++ is laid out. Due to
the complexity of libc++, this commit opts for the latter, to maximise
correctness and minimise developer burden.
To minimise developer updates to the file, it makes use of wildcards
that match everything within listed subdirectories. A script has also
been added that keeps the mapping fresh in CI and reduces updating it
to a single step.
Finally, documentation has been added to inform users that IWYU is
supported, and what they need to do in order to leverage the mapping
file.
Closes #56937.
Differential Revision: https://reviews.llvm.org/D138189
Kazu Hirata [Tue, 22 Nov 2022 01:00:50 +0000 (17:00 -0800)]
[AArch64] Remove emitCalleeSavedFrameMoves
The last use of emitCalleeSavedFrameMoves was removed on March 24,
2022 in commit
50a97aacacf689f838451439d913421d608e1bed.
Differential Revision: https://reviews.llvm.org/D138388
Evgenii Stepanov [Tue, 22 Nov 2022 00:56:53 +0000 (16:56 -0800)]
Revert "[scudo] Detect double free when running with MTE."
Mysterious failures on the x86_64-linux-qemu, to be debugged.
This reverts commit
1dd54691b20d8bf65156cdf35d241cfcd684cb54.
Aart Bik [Sat, 19 Nov 2022 00:57:02 +0000 (16:57 -0800)]
[mlir][sparse][vector] ensure loop peeling to remove vector masks works
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D138343
Nico Weber [Fri, 18 Nov 2022 13:48:30 +0000 (08:48 -0500)]
[gn build] Add build files for //bolt
Adds build files for libraries, tools, and tests -- everything except
the runtime.
Doesn't hook up bolt in the main BUILD.gn file yet -- I want to verify
that it builds on Linux, macOS, Windows before doing that. (I've only
checked on macOS so far.)
`ninja check-bolt` passes on macOS with this.
(I locally bumped the deployment target to macOS 10.12 for that. bolt/ uses
std::mutex quite a bit, which requires 10.12.)
Differential Revision: https://reviews.llvm.org/D138355
Advenam Tacet [Sun, 20 Nov 2022 01:34:46 +0000 (17:34 -0800)]
[1a/3][ASan][compiler-rt] API for double ended containers
This revision is a part of a series of patches extending
AddressSanitizer C++ container overflow detection capabilities by adding
annotations, similar to those existing in std::vector, to std::string
and std::deque collections. These changes allow ASan to detect cases
when the instrumented program accesses memory which is internally
allocated by the collection but is not yet in use (accesses before or
after the stored elements for std::deque, or between the size and
capacity bounds for std::string).
The motivation for the research and those changes was a bug, found by
Trail of Bits, in real code where an out-of-bounds read could happen
when two strings were compared via a std::equal call that took
iter1_begin, iter1_end, iter2_begin iterators (with a custom comparison
function). When the iter1 range was longer than the iter2 range, an
out-of-bounds read on iter2 could happen. Container sanitization would
detect it.
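A minimal sketch of the pattern described above, assuming a case-insensitive comparator as the "custom comparison function" (`sameLetter` and `equalIgnoringCase` are hypothetical names, not taken from the affected code). The three-iterator call is shown only as a comment; the function uses the four-iterator overload, which is given both range ends.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical custom comparison: ASCII case-insensitive equality.
static bool sameLetter(char A, char B) {
  auto Lower = [](char C) {
    return (C >= 'A' && C <= 'Z') ? static_cast<char>(C - 'A' + 'a') : C;
  };
  return Lower(A) == Lower(B);
}

// Risky pattern: the three-iterator overload assumes the second range is
// at least as long as the first, so a shorter Rhs is read past its end.
//   std::equal(Lhs.begin(), Lhs.end(), Rhs.begin(), sameLetter);

// Safer pattern: the four-iterator overload (C++14) knows where both
// ranges end and returns false when their lengths differ.
static bool equalIgnoringCase(const std::string &Lhs, const std::string &Rhs) {
  return std::equal(Lhs.begin(), Lhs.end(), Rhs.begin(), Rhs.end(),
                    sameLetter);
}
```

Container annotations are still useful with the safer overload, since they catch the cases where the risky form slips through review.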
This revision adds a new compiler-rt ASan sanitization API function
__sanitizer_annotate_double_ended_contiguous_container necessary to
sanitize/annotate double ended contiguous containers. Note that this
function annotates a single contiguous memory buffer (for example the
std::deque's internal chunk). Such containers have the beginning of the
allocated memory block, the beginning of the container's in-use data,
the end of the container's in-use data and the end of the allocated
memory block.
This also adds a new API function to verify if a double ended contiguous
container is correctly annotated
(__sanitizer_verify_double_ended_contiguous_container).
Since we do not modify ASan's shadow memory encoding values, the
capability of sanitizing/annotating a prefix of the internal contiguous
memory buffer is limited: up to SHADOW_GRANULARITY-1 bytes may not be
poisoned before the container's in-use data. This can cause false
negatives (situations when ASan will not detect memory corruption in
those areas).
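The caveat can be made concrete with a small arithmetic model. This is a sketch of one reading of the limitation, not the compiler-rt implementation: it assumes offsets are measured from a granule-aligned base, that the container is non-empty, that whole granules before the in-use data can be poisoned, and that the unused bytes sharing a granule with the first in-use byte cannot. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Byte offsets into one contiguous buffer, measured from a base address
// that is assumed to be aligned to the shadow granularity.
struct DoubleEndedLayout {
  std::size_t StorageBeg;   // beginning of the allocated memory block
  std::size_t ContainerBeg; // beginning of the in-use data
  std::size_t ContainerEnd; // end of the in-use data
  std::size_t StorageEnd;   // end of the allocated memory block
};

struct PoisonSummary {
  std::size_t PoisonedPrefix;   // unused bytes before the data, poisoned
  std::size_t UnpoisonedPrefix; // unused bytes before the data, left accessible
  std::size_t PoisonedSuffix;   // unused bytes after the data, poisoned
};

// Models which unused bytes can be poisoned for a given granularity.
static PoisonSummary summarize(const DoubleEndedLayout &L,
                               std::size_t Granularity) {
  assert(Granularity > 0);
  assert(L.StorageBeg <= L.ContainerBeg && L.ContainerBeg < L.ContainerEnd &&
         L.ContainerEnd <= L.StorageEnd);
  // Start of the granule that holds the first in-use byte, clamped so it
  // never falls before the allocation itself.
  std::size_t GranuleStart = L.ContainerBeg - (L.ContainerBeg % Granularity);
  if (GranuleStart < L.StorageBeg)
    GranuleStart = L.StorageBeg;
  PoisonSummary S;
  S.PoisonedPrefix = GranuleStart - L.StorageBeg;
  S.UnpoisonedPrefix = L.ContainerBeg - GranuleStart; // at most Granularity-1
  S.PoisonedSuffix = L.StorageEnd - L.ContainerEnd;
  return S;
}
```

For a 64-byte block with in-use data in bytes 13 to 40 and a granularity of 8, the model leaves 5 bytes (offsets 8 to 12) unpoisoned; accesses to those bytes are the false negatives described above.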
On the other hand, the API function interfaces are designed to work even
if this caveat did not exist. Implementations using those functions will
therefore poison every byte correctly once ASan (and compiler-rt) is
extended to support it. In other words, if ASan were modified to support
annotating/poisoning of objects lying on addresses unaligned to
SHADOW_GRANULARITY (so e.g. prefixes of those blocks), which would
require changing its shadow memory encoding, this would not require any
changes in the libcxx std::string/deque code which is added in further
commits of this patch series.
If you have any questions, please email:
advenam.tacet@trailofbits.com
disconnect3d@trailofbits.com
Differential Revision: https://reviews.llvm.org/D132090
Vitaly Buka [Tue, 22 Nov 2022 00:22:38 +0000 (16:22 -0800)]
[test][asan] Ignore new lines in header
Vitaly Buka [Sun, 20 Nov 2022 06:24:46 +0000 (22:24 -0800)]
[test][asan] Limit scope of the var
Nico Weber [Mon, 21 Nov 2022 13:45:45 +0000 (08:45 -0500)]
[bolt] Use llvm::sys::RWMutex instead of std::shared_timed_mutex
This has the following advantages:
- std::shared_timed_mutex is macOS 10.12+ only. llvm::sys::RWMutex
automatically switches to a different implementation internally
when targeting older macOS versions.
- bolt only needs std::shared_mutex, not std::shared_timed_mutex.
llvm::sys::RWMutex automatically uses std::shared_mutex internally
where available.
std::shared_mutex and RWMutex have the same API, so no code changes
other than types and includes are needed.
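The "same API" point can be sketched with a class that is generic over the mutex type. The example below is hypothetical (`NameRegistry` is not bolt code) and is instantiated only with std::shared_mutex; based on the statement above, substituting llvm::sys::RWMutex as the template argument should be the only change needed, but that is an assumption and is not exercised here.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical registry guarded by a reader/writer mutex. MutexT only
// needs lock(), unlock(), lock_shared() and unlock_shared().
template <typename MutexT> class NameRegistry {
  mutable MutexT Mutex;
  std::map<std::string, int> Ids;

public:
  // Writers take the exclusive lock.
  void insert(const std::string &Name, int Id) {
    std::unique_lock<MutexT> Lock(Mutex);
    Ids[Name] = Id;
  }

  // Readers take the shared lock, so lookups may run concurrently.
  // Returns -1 when the name is unknown.
  int lookup(const std::string &Name) const {
    std::shared_lock<MutexT> Lock(Mutex);
    auto It = Ids.find(Name);
    return It == Ids.end() ? -1 : It->second;
  }
};

// Instantiation with the standard type (requires C++17).
using StdRegistry = NameRegistry<std::shared_mutex>;
```

Keeping the mutex type as a template parameter is only a device for this sketch; the commit itself changes the types and includes directly.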
Differential Revision: https://reviews.llvm.org/D138423
Aart Bik [Fri, 18 Nov 2022 20:18:00 +0000 (12:18 -0800)]
[mlir][sparse] introduce vectorization pass for sparse loops
This brings back previous SIMD functionality, but in a separate pass.
The idea is to improve this new pass incrementally, going beyond for-loops
to while-loops for co-iteration as well (masking), while introducing new
abstractions to make the lowering more progressive. The separation of
sparsification and vectorization is a very good first step on this journey.
Also brings back ArmSVE support.
Still to be fine-tuned:
+ use of "index" in SIMD loop (viz. a[i] = i)
+ check that all ops really have SIMD support
+ check all forms of reductions
+ chain reduction SIMD values
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D138236
David Blaikie [Mon, 21 Nov 2022 23:59:21 +0000 (23:59 +0000)]
pr59000: Clarify packed-non-pod warning that it's pod-for-the-purposes-of-layout
Kai Nacke [Mon, 21 Nov 2022 20:47:52 +0000 (20:47 +0000)]
[PowerPC] Add support for G_ADD and G_SUB.
Extends the global isel implementation to support G_ADD and G_SUB.
Reviewed By: arsenm, amyk
Differential Revision: https://reviews.llvm.org/D128106
Louis Dionne [Mon, 21 Nov 2022 14:50:31 +0000 (09:50 -0500)]
[libc++][NFC] Add missing conditionals for the existence of wide characters
Differential Revision: https://reviews.llvm.org/D138435